This study tested three structures for the Review & Edit "Substitutions" screen (a Tab split, an Accordion of collapsible groups, and a Filtered list) to see which best helps a customer manage which grocery items can be replaced if out of stock. All participants ran the same 9-task scenario: read the screen, state the count, locate an item, set a "don't replace," find items that can't be substituted, and rate confidence and clarity.
| Hypothesis | Result | Evidence |
|---|---|---|
| H1. Tabs with counts let users state the count at a glance | PASS | 3/3 answered correctly & instantly, citing the tab number. Highest confidence (4.3) & clarity (4.7). |
| H2. An accordion lets users locate & manage items on one scroll | PARTIAL | Items were findable, but 3/3 said the layout jumped on expand/collapse, and one participant mis-added the group counts before self-correcting. Confidence & clarity dropped (3.3 / 3.3). |
| H3. A filtered list is the most familiar, so users navigate comfortably | FAIL | Familiar, but 0/3 stated the count correctly. With no persistent total and items hidden by the active filter, participants under-counted, gave no answer, or guessed. Lowest clarity (2.7). |
| Variant | Count correct (Q2) | Confidence | Clarity | Core issue |
|---|---|---|---|---|
| Tab | 3 / 3 | 4.3 | 4.7 | "Ineligible" wording read as corporate |
| Accordion | 3 / 3 (1 self-corrected) | 3.3 | 3.3 | Layout shifts on expand/collapse |
| Filter | 0 / 3 | 3.3 | 2.7 | No persistent count; filter hides items |
The per-tab counts answered "how many will be replaced?" without anyone counting rows. Tab had the highest confidence and clarity ratings, and all three participants stated the count correctly and instantly.
In the Accordion, items were findable, but every expand/collapse moved the page. All three accordion users named the shifting layout as disorienting, and it cost confidence even when the task succeeded.
The Filtered list looked like a pattern people knew, so they navigated confidently, but the filtered view removed the very thing the task needed: a running total. None of the three could state the count correctly.
The "Ineligible" label (the system's eligible-vs-ineligible model) was flagged across variants — corporate jargon, unclear until tapped. The question users actually asked wasn't "is this eligible?" — it was "which of my items will get replaced, and which won't?" The categories were named for the backend's data model, not the customer's decision.
The Tab variant had the best count comprehension (3/3), the highest confidence and clarity, and the lowest friction. It has few categories, and its counts are always visible.
Replace the eligibility framing with the question users actually ask: "Replace" / "Don't replace." Surface can't-be-substituted items in plain language ("Can't be swapped — alcohol & medication") instead of "Ineligible."
The filter failed because totals vanished under the active filter. The at-a-glance count is what carries comprehension — never hide it.
The accordion's expand/collapse motion cost orientation and confidence. Minimize reflow when users act.
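The "never hide the count" principle can be sketched in code. This is a minimal illustration under stated assumptions, not the product's implementation: the `Item` type, the status values, and all function names are hypothetical. The key choice is that totals are computed from the full item list, so they stay the same whichever filter is active.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    # Hypothetical status values: "replace", "dont_replace", "cant_swap"
    status: str


def summarize(items):
    """Count items per status over the FULL list, never the filtered view."""
    counts = Counter(item.status for item in items)
    return {
        "replace": counts["replace"],
        "dont_replace": counts["dont_replace"],
        "cant_swap": counts["cant_swap"],
    }


def visible_items(items, active_filter=None):
    """Return only the rows the active filter lets through (all rows if None)."""
    if active_filter is None:
        return list(items)
    return [item for item in items if item.status == active_filter]


def render_header(items, active_filter=None):
    """Build a header whose totals ignore the filter; only 'showing' changes."""
    totals = summarize(items)
    shown = len(visible_items(items, active_filter))
    return (
        f"Replace {totals['replace']} | "
        f"Don't replace {totals['dont_replace']} | "
        f"Can't be swapped {totals['cant_swap']} "
        f"(showing {shown} of {len(items)})"
    )


# Hypothetical basket, for illustration only.
basket = [
    Item("Oat milk", "replace"),
    Item("Bananas", "replace"),
    Item("Sourdough", "dont_replace"),
    Item("Red wine", "cant_swap"),
]

print(render_header(basket, active_filter="dont_replace"))
# prints: Replace 2 | Don't replace 1 | Can't be swapped 1 (showing 1 of 4)
```

Because the header reads from `summarize(items)` rather than from the filtered rows, applying a filter changes only the "showing" figure. The per-category totals that participants relied on in the Tab variant remain on screen.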