Review & Edit: Substitution Structure
Synthesis · SKILL 2 · multi-variant

Context

9 participants (3 per variant) · 3 structures tested · 3 hypotheses · Pass 1 · Partial 1 · Fail 1

Three structures for the Review & Edit "Substitutions" screen — a Tab split, an Accordion of collapsible groups, and a Filtered list — tested for which best helps a customer manage which grocery items can be replaced if out of stock. Same 9-task scenario across all participants: read the screen, state the count, locate an item, set a "don't replace," find items that can't be substituted, and rate confidence & clarity.

⚠ Small sample (n=3 per variant) — treat magnitudes directionally; the patterns are consistent enough to act on.
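
To make the small-sample caveat concrete, the sketch below computes an approximate 95% Wilson score interval for a proportion observed in a group of three, such as 3 of 3 or 0 of 3 participants answering correctly. The Wilson interval is one common choice for small samples; it is an illustrative assumption here, not the method used in this study, and the function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


for successes in (3, 0):
    low, high = wilson_interval(successes, 3)
    print(f"{successes}/3 correct: plausible rate {low:.2f} to {high:.2f}")
```

With n=3 the intervals are very wide (roughly 0.44 to 1.00 for 3 of 3, and 0.00 to 0.56 for 0 of 3), which is why the numbers are best read as direction rather than magnitude.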

Hypothesis results

| Hypothesis | Result | Evidence |
| --- | --- | --- |
| H1. Tabs with counts let users state the count at a glance | PASS | 3/3 answered correctly & instantly, citing the tab number. Highest confidence (4.3) & clarity (4.7). |
| H2. An accordion lets users locate & manage items on one scroll | PARTIAL | Items were findable, but 3/3 said the layout jumped on expand/collapse; one mis-added group counts. Confidence & clarity dropped (3.3 / 3.3). |
| H3. A filtered list is the most familiar, so users navigate comfortably | FAIL | Familiar, but 0/3 stated the count correctly. No persistent total + items hidden by the active filter → under-count, no answer, or guess. Lowest clarity (2.7). |

Variant comparison

| Variant | Count correct (Q2) | Confidence | Clarity | Core issue |
| --- | --- | --- | --- | --- |
| Tab | 3 / 3 | 4.3 | 4.7 | "Ineligible" wording read as corporate |
| Accordion | 3 / 3 (1 self-corrected) | 3.3 | 3.3 | Layout shifts on expand/collapse |
| Filter | 0 / 3 | 3.3 | 2.7 | No persistent count; filter hides items |

Findings

F1 — Tabs make the count legible; that drove comprehension, confidence, and clarity.

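The per-tab counts answered "how many will be replaced?" without anyone counting rows. Tab had the highest confidence and clarity ratings, and it was the only variant where all three participants stated the count correctly without self-correcting.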

F2 — The accordion's strength (one scroll) was undercut by motion.

Items were findable, but every expand/collapse moved the page. All three accordion users named the shifting layout as disorienting — it cost confidence even when the task succeeded.

F3 — The filter's familiarity hid a comprehension failure.

It looked like a pattern people knew, so they navigated confidently — but the filtered view removed the very thing the task needed: a running total. None of the three could state the count correctly.

⚠ Unexpected · cross-variant

F4 — The "eligibility" framing confused people regardless of structure.

The "Ineligible" label (the system's eligible-vs-ineligible model) was flagged across variants — corporate jargon, unclear until tapped. The question users actually asked wasn't "is this eligible?" — it was "which of my items will get replaced, and which won't?" The categories were named for the backend's data model, not the customer's decision.

Recommendations

HIGH

Ship the Tab structure.

Best count comprehension (3/3), highest confidence & clarity, lowest friction. Few categories, counts always visible.

HIGH

Re-label around the customer's decision, not the system's model.

Replace the eligibility framing with the question users actually ask: "Replace" / "Don't replace." Surface can't-be-substituted items in plain language ("Can't be swapped — alcohol & medication") instead of "Ineligible."
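
As a rough sketch of this re-labeling, the mapping below turns a backend-style eligibility flag into the customer-facing wording suggested above. The `Item` fields (`eligible`, `allow_replace`, `restriction`) and the function name `customer_label` are hypothetical; the real data model is not described in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    eligible: bool                     # hypothetical backend flag
    allow_replace: bool = True         # the customer's own choice
    restriction: Optional[str] = None  # e.g. "alcohol" or "medication"


def customer_label(item: Item) -> str:
    """Translate the system's model into the customer's decision language."""
    if not item.eligible:
        # Plain-language reason instead of the "Ineligible" label.
        if item.restriction:
            return f"Can't be swapped: {item.restriction}"
        return "Can't be swapped"
    return "Replace" if item.allow_replace else "Don't replace"


order = [
    Item("Oat milk", eligible=True),
    Item("Sourdough", eligible=True, allow_replace=False),
    Item("Red wine", eligible=False, restriction="alcohol"),
]
for item in order:
    print(f"{item.name}: {customer_label(item)}")
```

Keeping the translation in one function means the backend can retain its eligible/ineligible model while every screen shows the same customer-facing wording.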

MED

Keep a persistent count visible.

The filter failed because totals vanished under the active filter. The at-a-glance count is what carries comprehension — never hide it.
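
One way to read this recommendation is to derive the per-category counts from the full order rather than from whatever the active filter currently shows. The sketch below illustrates that separation; the item names, status strings, and function names are hypothetical and are not taken from the tested prototypes.

```python
from collections import Counter

# Hypothetical order: (item name, status) pairs.
items = [
    ("Milk", "replace"),
    ("Bread", "replace"),
    ("Eggs", "dont_replace"),
    ("Wine", "cant_swap"),
]


def category_counts(all_items):
    """Counts come from the full list, so they never change with the filter."""
    return Counter(status for _, status in all_items)


def visible_items(all_items, active_filter=None):
    """Only the rendered rows depend on the active filter."""
    if active_filter is None:
        return list(all_items)
    return [item for item in all_items if item[1] == active_filter]


counts = category_counts(items)
shown = visible_items(items, active_filter="dont_replace")

print("Always-visible totals:", dict(counts))
print("Rows shown under the filter:", [name for name, _ in shown])
```

Because `category_counts` ignores the filter, the totals stay on screen even when most rows are hidden, which is the behavior the Tab variant provided through its per-tab counts.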

LOW

Avoid layout shift.

The accordion's expand/collapse motion cost orientation and confidence. Minimize reflow when users act.

Final takeaway

The eligibility split read fine on paper. It was only when people moved through a real order that the mismatch between how the system thinks (eligible / ineligible) and how the customer thinks (will this get replaced or not?) became obvious enough to fix. The Tab model won — but the bigger win is reorganizing it around the customer's decision, not the backend's data.