How to deal with the same proteins with slightly different names from the RefSeq_bac database? #52

Hi Sam,

In my statistical analysis, some functions from the RefSeq_bac database are being treated as different proteins only because of small differences in their names, such as a dash (e.g. "(3R)-hydroxymyristoyl ACP dehydratase" and "(3R)-hydroxymyristoyl-ACP dehydratase"), a comma, or lower/uppercase letters (e.g. "(2fe-2S)-binding domain-containing protein" and "(2Fe-2S)-binding domain-containing protein").
Others are listed separately as partial and complete sequences of the same protein (e.g. "(2Fe-2S) ferredoxin" and "(2Fe-2S) ferredoxin, partial").

Do you correct those names in the database itself, or after annotation-aggregation? If so, could you please guide me on how to do it?
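
To make the question concrete, here is a rough sketch of the kind of cleanup I have in mind, applied after aggregation. It is only an illustration in Python: the function names `normalize_name` and `merge_counts`, the normalization rules, and the counts in the usage note are all my own assumptions, not anything taken from the pipeline or the database.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Collapse trivial naming differences into one key (hypothetical rules)."""
    key = name.strip().lower()               # ignore upper/lowercase differences
    key = re.sub(r",\s*partial$", "", key)   # drop a trailing ", partial"
    key = re.sub(r"[-,]", " ", key)          # treat dashes and commas as spaces
    key = re.sub(r"\s+", " ", key).strip()   # collapse repeated whitespace
    return key


def merge_counts(counts: dict) -> dict:
    """Sum the counts of all names that share the same normalized key."""
    merged = defaultdict(int)
    for name, n in counts.items():
        merged[normalize_name(name)] += n
    return dict(merged)
```

With made-up counts, `merge_counts({"(2Fe-2S) ferredoxin": 3, "(2Fe-2S) ferredoxin, partial": 2})` would give a single entry with a total of 5. I am aware that rules this aggressive (especially replacing every dash) could also merge names that are genuinely different, which is part of why I am asking how you handle it.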

-Mona

