Normalize hyphens to spaces in frequency list aggregation #429

Merged
geoffrey-wu merged 6 commits into main from
copilot/fix-frequency-list-equality
Feb 7, 2026

Copilot AI commented Feb 7, 2026

Frequency lists were treating "Jean-Paul Sartre" and "Jean Paul Sartre" as separate answers, fragmenting counts.

Changes

  • Add $replaceAll stage in MongoDB aggregation pipeline to normalize hyphens to spaces before grouping
  • Add test coverage verifying hyphen normalization

Implementation

The normalization happens after trimming but before grouping:

{ $addFields: { answer_normalized: { $trim: { input: '$regex.match' } } } },
{
  $addFields: {
    // Replace hyphens with spaces to treat them as equivalent
    answer_normalized: { $replaceAll: { input: '$answer_normalized', find: '-', replacement: ' ' } }
  }
},
{ $group: { _id: '$answer_normalized', count: { $sum: 1 } } },

This ensures both "New-York" and "New York" normalize to "New York" and aggregate correctly.
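
As an illustration only, the same trim, replace, and group order can be mirrored in plain JavaScript. The names `normalizeAnswer` and `countAnswers` are hypothetical and do not come from this PR, and `String.prototype.trim` is only an approximation of MongoDB's default `$trim` whitespace handling:

```javascript
// Hypothetical in-memory mirror of the pipeline stages above; not code from this PR.

// Mirrors the two $addFields stages: trim first, then turn hyphens into spaces.
function normalizeAnswer(raw) {
  // trim() approximates $trim with no `chars` option (strips surrounding whitespace).
  // replaceAll with a string pattern replaces every literal '-' like $replaceAll does.
  return raw.trim().replaceAll('-', ' ');
}

// Mirrors the $group stage: count occurrences per normalized answer.
function countAnswers(answers) {
  const counts = new Map();
  for (const answer of answers) {
    const key = normalizeAnswer(answer);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const counts = countAnswers([
  'Jean-Paul Sartre',
  ' Jean Paul Sartre ',
  'New-York',
  'New York'
]);

console.log(counts.get('Jean Paul Sartre')); // 2
console.log(counts.get('New York')); // 2
```

Because the trim runs before the replacement, a leading or trailing hyphen would become a leading or trailing space in the grouping key, in this sketch and presumably in the pipeline as well.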

Original prompt

This section details the original issue you should resolve

<issue_title>in frequency-list answerlines, hyphens and spaces should be equivalent</issue_title>
<issue_description></issue_description>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: geoffrey-wu <42471355+geoffrey-wu@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix frequency-list to treat hyphens and spaces as equivalent" to "Normalize hyphens to spaces in frequency list aggregation" Feb 7, 2026
Copilot AI requested a review from geoffrey-wu February 7, 2026 01:24
@geoffrey-wu geoffrey-wu marked this pull request as ready for review February 7, 2026 03:21
@geoffrey-wu geoffrey-wu merged commit 0eaeda1 into main Feb 7, 2026
2 checks passed