EVA-3933 Add script to read variant ids from a file and do the annotation remediation #206

Merged
nitin-ebi merged 3 commits into EBIvariation:master from nitin-ebi:annotation-remediation
Mar 31, 2026

Conversation

@nitin-ebi
Contributor

No description provided.

@nitin-ebi nitin-ebi self-assigned this Mar 25, 2026
@nitin-ebi nitin-ebi force-pushed the annotation-remediation branch from 896a461 to 311f131 Compare March 26, 2026 09:15
Comment on lines +176 to +179
for (Map.Entry<String, String> entry : orgIdNewIdMap.entrySet()) {
    regexCriteria.add(Criteria.where("_id").regex("^" + Pattern.quote(entry.getKey()) + "_\\d"))
    regexCriteria.add(Criteria.where("_id").regex("^" + Pattern.quote(entry.getValue()) + "_\\d"))
}
Member

Why not make two queries per batch, so that you know which results belong to the old ids and which to the new ones?

Contributor Author

updated
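
For readers following the thread, a rough sketch of the two-queries-per-batch idea, written in plain Java with only `java.util.regex` so it runs without Spring or MongoDB. The class name `BatchIdPatterns`, its method names, and the sample ids are hypothetical and not taken from the PR; the anchored prefix followed by `_\d` mirrors the regex in the hunk above. One pattern covers the old ids and one covers the new ids, so each could back its own query and every hit is known to be old or new.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class BatchIdPatterns {
    // Builds one anchored regex matching any annotation id that starts with
    // one of the given variant ids followed by "_<digit>".
    static Pattern prefixPattern(Collection<String> variantIds) {
        if (variantIds.isEmpty()) {
            throw new IllegalArgumentException("batch must contain at least one variant id");
        }
        List<String> quoted = new ArrayList<>();
        for (String id : variantIds) {
            quoted.add(Pattern.quote(id)); // ids are matched literally
        }
        return Pattern.compile("^(?:" + String.join("|", quoted) + ")_\\d");
    }

    // Pattern for the first query: annotations still stored under the old ids (map keys).
    static Pattern oldIdPattern(Map<String, String> orgIdNewIdMap) {
        return prefixPattern(orgIdNewIdMap.keySet());
    }

    // Pattern for the second query: annotations already stored under the new ids (map values).
    static Pattern newIdPattern(Map<String, String> orgIdNewIdMap) {
        return prefixPattern(orgIdNewIdMap.values());
    }
}
```

Keeping the two patterns apart costs one extra round trip per batch, but removes the need to work out afterwards which side of the mapping a returned document belongs to.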


// Group fetched annotation documents by variant id prefix (both old and new ids)
Map<String, Set<Document>> variantIdToDocuments = new HashMap<>()
for (Document doc : annotationsList) {
Member

I might be misunderstanding this, but it seems that for each annotation you go through all the documents, which would be 1000 * 1000 operations.
Wouldn't it be simpler to derive the oldVariantId or newVariantId from the annotation id by stripping its suffix, and match on that?

Contributor Author

updated
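
A minimal sketch of the suggestion above: strip the suffix from each annotation id once and look the result up in a hash set, which is one pass over the batch instead of a nested loop. It assumes the annotation id is the variant id followed by two numeric parts (`_<digits>_<digits>`); that layout, the class name `AnnotationGrouping`, and the use of plain id strings instead of `Document` objects are assumptions made for illustration, not details confirmed by the PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

class AnnotationGrouping {
    // Assumed suffix layout: "_<digits>_<digits>" at the end of the annotation id.
    private static final Pattern SUFFIX = Pattern.compile("_\\d+_\\d+$");

    // Derives the variant id by removing the assumed suffix.
    static String variantIdOf(String annotationId) {
        return SUFFIX.matcher(annotationId).replaceFirst("");
    }

    // Groups annotation ids by variant id in a single pass; ids whose derived
    // variant id is not in the batch are skipped.
    static Map<String, List<String>> groupByVariantId(Collection<String> annotationIds,
                                                      Set<String> batchVariantIds) {
        Map<String, List<String>> grouped = new HashMap<>();
        for (String annotationId : annotationIds) {
            String variantId = variantIdOf(annotationId);
            if (batchVariantIds.contains(variantId)) {
                grouped.computeIfAbsent(variantId, k -> new ArrayList<>()).add(annotationId);
            }
        }
        return grouped;
    }
}
```

With a batch of 1000 ids this does on the order of 1000 set lookups rather than 1000 * 1000 prefix comparisons.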

}

void storeNotRemediatedVariant(String oldVariantId, String newVariantId, String reason) {
try (BufferedWriter writer = new BufferedWriter(new FileWriter(notRemediatedVariantsFilePath, true))) {
Member

Do we need to open the file every time? If we have a lot of failures, it could take a while.

Contributor Author

updated
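
One possible shape for the change discussed above: open the writer once in append mode, reuse it for every failure, and close it when the run ends. The class name `NotRemediatedLog` and the comma-separated line layout are assumptions for illustration; the PR does not show the final implementation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class NotRemediatedLog implements AutoCloseable {
    private final BufferedWriter writer;

    // Opens the file a single time, creating it if needed and appending otherwise.
    NotRemediatedLog(Path path) throws IOException {
        this.writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Appends one record; the "old,new,reason" layout is a hypothetical format.
    void store(String oldVariantId, String newVariantId, String reason) throws IOException {
        writer.write(oldVariantId + "," + newVariantId + "," + reason);
        writer.newLine();
    }

    // Flushes buffered records and releases the file handle.
    @Override
    public void close() throws IOException {
        writer.close();
    }
}
```

Used in a try-with-resources block around the whole remediation loop, this pays the open/close cost once instead of once per failed variant. The trade-off is that records sit in the buffer until it fills or the writer is closed, so a hard crash can lose the most recent lines unless `flush()` is called periodically.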

@nitin-ebi nitin-ebi force-pushed the annotation-remediation branch from c927150 to 1aada1f Compare March 30, 2026 12:25
@nitin-ebi nitin-ebi requested a review from tcezard March 30, 2026 12:32
@nitin-ebi nitin-ebi merged commit b1d542b into EBIvariation:master Mar 31, 2026
3 participants