In the final protein.fasta file, many amino acid sequences contain multiple stop signals (*), and some gene positions in the GFF3 annotation file are negative #34

@wlhCNU

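Hello Prof. Chen,
First, thank you very much for this excellent gene annotation software. After running the whole pipeline and checking the final results, I found two problems:
1. In the protein sequence file, many amino acid sequences contain two or more * characters produced by stop codons;
2. In the GFF3 annotation file, some gene positions are annotated with negative values.
What might be causing this? Or is it a problem in my own analysis workflow? Thank you.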

The analysis workflow and parameters are as follows:
geta.pl --RM_species Embryophyta --pe1 355_1.fq.gz,292.1.clean_R1.fq.gz --pe2 355_2.fq.gz,292.1.clean_R2.fq.gz --protein homo.protein.fa --augustus_species 20240311 --out_prefix test --config conf_all_defaults.txt --cpu 40 --gene_prefix Ah01Ggene --HMM_db Pfam-AB.hmm reference_genome.fa

Protein sequence information:

Ah01Ggene000854.t01 [Parent=Ah01Ggene000854] [Transcript_Ratio=100.00%] [Integrity=complete] [Source=homolog0776.t01]
MSKNRDKEPPLNFDPNIKKTVRRCQQQARAFRSAESLRDNSKEEAEVITMEPNNNNNQPKRTLDSYTAPNPTFYGSSIIVHPMNANNFELKPQLITLVQQDCQFYGLPRENPNLFISNFLQICDTVKTNRVYPDVYWLLLFSFTVRDQEKQWLDTQPQESLDTWDNVVSRFLNKFSPPQRVTNLTTDV* TFRQQEGAFLYETWERYKVMLKSVLPTCFQT* YSCRSFIMGLLRPPGPLWIILQEDPFIRKALKRL S* LR* LLTTTIYTLL* KSP* GKESWN* MLWIPLLLRIKPCLNR* MPLLNTWLDYKSQLLITKMLLMT* VVNFLKVRVMIMVNFPLNSLIT* AISPDLPIMIYFPRFIIRGGGITQILDGKINHRGNHTSTTTTTVLWVILIRIILTVTTDIFNPLNHIMYPPLLRNLLT* NL* LQNLPRILII* CRKPKYQLETWWFRWVN*
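
To get a sense of how many records are affected, something like the following minimal Python sketch could be used to count internal stop symbols per sequence. It is not part of GETA; the function names (`read_fasta`, `internal_stops`, `report_internal_stops`) are hypothetical helpers, and it assumes a standard FASTA layout in which header lines start with `>` and a single trailing `*` marks the normal end of the protein.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            # Remove any whitespace embedded in the sequence line.
            chunks.append("".join(line.split()))
    if header is not None:
        yield header, "".join(chunks)


def internal_stops(seq: str) -> int:
    """Count '*' symbols other than a single terminal one."""
    body = seq[:-1] if seq.endswith("*") else seq
    return body.count("*")


def report_internal_stops(lines: Iterable[str]) -> Dict[str, int]:
    """Map sequence ID (first word of the header) to its internal stop count.

    Only sequences with at least one internal stop are reported.
    """
    result = {}
    for header, seq in read_fasta(lines):
        n = internal_stops(seq)
        if n > 0:
            seq_id = header.split()[0] if header.strip() else ""
            result[seq_id] = n
    return result
```

Passing an open file handle for the protein FASTA to `report_internal_stops` returns a dictionary of the affected IDs, which can then be compared against the `Source` and `Integrity` fields in the headers.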

GFF3 gene annotation information:
jcf7180006177890 GETA gene -2 1041 . - . ID=Ah01Ggene000129;Name=gene1203;Type=fair_gene_models_predicted_by_homolog;>
jcf7180006177890 GETA mRNA -2 1041 . - . ID=Ah01Ggene000129.t01;Parent=Ah01Ggene000129;Name=gene1203.mRNA;Type=fair_g>
jcf7180006177890 GETA five_prime_UTR 1030 1041 . - . ID=Ah01Ggene000129.t01.utr5p1;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA exon 929 1041 . - . ID=Ah01Ggene000129.t01.exon1;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA CDS 929 1029 . - 0 ID=Ah01Ggene000129.t01.CDS1;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA intron 855 928 0 - . ID=Ah01Ggene000129.t01.intron1;Parent=Ah01Ggene000129.t01;Supported_times=20>
jcf7180006177890 GETA exon 764 854 . - . ID=Ah01Ggene000129.t01.exon2;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA CDS 764 854 . - 1 ID=Ah01Ggene000129.t01.CDS2;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA intron 643 763 0 - . ID=Ah01Ggene000129.t01.intron2;Parent=Ah01Ggene000129.t01;Supported_times=20>
jcf7180006177890 GETA exon -2 642 . - . ID=Ah01Ggene000129.t01.exon3;Parent=Ah01Ggene000129.t01;
jcf7180006177890 GETA CDS -2 642 . - 0 ID=Ah01Ggene000129.t01.CDS3;Parent=Ah01Ggene000129.t01;
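
GFF3 coordinates are 1-based with start <= end, so a start of -2 lies outside the contig. The sketch below is one way to list every feature with such coordinates. It is not part of GETA; `invalid_gff3_features` is a hypothetical helper, and it assumes a tab-separated GFF3 file with the usual nine columns.

```python
from typing import Iterable, List, Tuple


def invalid_gff3_features(lines: Iterable[str]) -> List[Tuple[int, str, str, int, int]]:
    """Return (line_number, seqid, type, start, end) for features whose
    coordinates are out of range (start < 1) or inverted (start > end)."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete feature line
        try:
            start, end = int(fields[3]), int(fields[4])
        except ValueError:
            continue  # non-numeric coordinates are ignored in this sketch
        if start < 1 or start > end:
            bad.append((lineno, fields[0], fields[2], start, end))
    return bad
```

Calling it on an open handle for the GFF3 file returns one tuple per offending feature, which makes it easy to see whether the negative coordinates are limited to models at contig starts and to particular `Type` values.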
