Fix12 Add protein names to output CSV and print MAVISp-ready protein list #53
base: main
parser.add_argument("--protein-col", default="Protein", help="Column name for protein names in input CSV (default: Protein)")
parser.add_argument("--uniprot-col", default="Uniprot AC", help="Column name for UniProt IDs in input CSV (default: Uniprot AC)")
can we also add single-letter options for these, as there are for the other options?
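A minimal sketch of what this could look like with `argparse`. The short flags `-P` and `-U` are placeholders picked for illustration only (assumption: they do not clash with the script's existing single-letter options); the long names, defaults and help strings are the ones from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with short and long forms for the two column options."""
    parser = argparse.ArgumentParser()
    # -P and -U are hypothetical short flags, not taken from the PR.
    # argparse derives the attribute name from the long option,
    # so the values are still read as args.protein_col / args.uniprot_col.
    parser.add_argument(
        "-P", "--protein-col", default="Protein",
        help="Column name for protein names in input CSV (default: Protein)",
    )
    parser.add_argument(
        "-U", "--uniprot-col", default="Uniprot AC",
        help="Column name for UniProt IDs in input CSV (default: Uniprot AC)",
    )
    return parser


# Short and long forms can be mixed on the same command line.
args = build_parser().parse_args(["-P", "Gene", "--uniprot-col", "Accession"])
print(args.protein_col, args.uniprot_col)
```

Because the attribute name comes from the long option, code that already reads `args.protein_col` and `args.uniprot_col` would not need to change.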
upids =[args.u] if args.u else pd.read_csv(args.i)['Uniprot AC'].dropna().astype(str).tolist()
if args.u:
    protein_data = [{args.protein_col: args.u, args.uniprot_col: args.u}]
can we add an option that allows us to input protein name as well?
so we don't need to use args.u for protein col name
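One possible shape for this, as a sketch only: the protein name is passed separately and the UniProt ID is used as a fallback when no name is given. The helper `single_protein_row` and its `protein_name` parameter are hypothetical; the column names are the defaults from the diff.

```python
from typing import Dict, Optional


def single_protein_row(
    uniprot_id: str,
    protein_name: Optional[str] = None,
    protein_col: str = "Protein",
    uniprot_col: str = "Uniprot AC",
) -> Dict[str, str]:
    """Build the one-row record used when a single UniProt ID is given."""
    # Fall back to the UniProt ID only when no protein name was supplied.
    name = protein_name if protein_name else uniprot_id
    return {protein_col: name, uniprot_col: uniprot_id}


# With a name, the two columns carry different values.
print(single_protein_row("P40692", "MLH1"))
# Without a name, the UniProt ID fills both columns, as in the current diff.
print(single_protein_row("P40692"))
```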
if proteins_without_cofactors:
    print(f"Proteins without cofactors: {len(proteins_without_cofactors)}")
    print("-"*60)
    print(", ".join(proteins_without_cofactors))
can we have this without space?
",".join(...)
print("\n" + "="*60)
if proteins_without_cofactors:
    print(f"Proteins without cofactors: {len(proteins_without_cofactors)}")
    print("-"*60)
can we remove this line?
print(f"Proteins without cofactors: {len(proteins_without_cofactors)}")
print("-"*60)
print(", ".join(proteins_without_cofactors))
print("-"*60)
can we remove this line?
print("-"*60)
print(", ".join(proteins_without_cofactors))
print("-"*60)
print("\nThese proteins can be used to run MAVISp.")
can we remove this line?
    print("\nThese proteins can be used to run MAVISp.")
else:
    print("All proteins have cofactors.")
print("="*60)
can we remove this line?
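Taken together, the comments on this block point towards a much terser summary. A sketch of how it might read afterwards, assuming each "remove this line" comment targets the last line of the snippet quoted above it (the two `"-"*60` separators, the MAVISp note and the closing `"="*60`) and that the list is joined without a space. The function name `format_summary` is hypothetical, and the lines are returned rather than printed so they are easy to check; `PROT_B` is a placeholder name.

```python
from typing import List


def format_summary(proteins_without_cofactors: List[str]) -> List[str]:
    """Return the summary lines left after the requested removals."""
    # Equivalent of print("\n" + "="*60): a blank line, then the rule.
    lines = ["", "=" * 60]
    if proteins_without_cofactors:
        lines.append(
            f"Proteins without cofactors: {len(proteins_without_cofactors)}"
        )
        # Comma-separated with no space, as requested in the review.
        lines.append(",".join(proteins_without_cofactors))
    else:
        lines.append("All proteins have cofactors.")
    return lines


print("\n".join(format_summary(["MLH1", "PROT_B"])))
```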
mtiberti left a comment
can you please
- re-run the examples with the latest version and update them accordingly
- double check that option `-a` is actually used when it is supposed to be; in my test I obtained a `summary_output.csv` with the UniProt ID in both the column name and the protein name while using `-a`
  (`python ../../../find_cofactor.py -u P40692 -a MLH1 -p AF_MLH1a_1-341.pdb -s 1 -e 341 -c ../../../cofactors_dict.json -l ions.txt -n A`)
#12