Hello, I am trying to extract coding sequences from a GFF file and the genome FASTA file using gffread. It works well with "gffread my.gff -g genome.assembly.fasta -x cds.fa". But after I add the "-J" parameter, it reports an error: "Error (GFaSeqGet): end coordinate (76134) cannot be larger than sequence length 76132". I manually checked the annotation of the contig with length 76132 and found no feature with the end coordinate 76134. The maximum coordinate is 76132.
About the "-J" parameter
-J discard any mRNAs that either lack initial START codon or the terminal STOP codon, or have an in-frame stop codon (only print mRNAs with a full, valid CDS)
Has anybody met the same problem?
Thanks
Hi. I have the same issue. Did you find a solution?
Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.
I had the same problem. For me, the "easiest" way is to run gffread without the -J parameter, then use a custom script to check that each sequence starts with "ATG" and ends with a stop codon. Additionally, you have to look for in-frame stop codons inside the sequence.
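A minimal sketch of such a check in plain Python is below, in case it helps. It is not what gffread does internally, only an approximation of the -J criteria. It assumes the standard genetic code (stop codons TAA, TAG, TGA) and that the CDS sequences written by "-x cds.fa" include the stop codon, which depends on how your GFF defines the CDS features. The function names (read_fasta, cds_problems, filter_cds) are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the record name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def cds_problems(seq):
    """Return a list of reasons why seq is not a full, valid CDS."""
    seq = seq.upper()
    problems = []
    in_frame = len(seq) % 3 == 0
    if not in_frame:
        problems.append("length not a multiple of 3")
    if not seq.startswith("ATG"):
        problems.append("no ATG start codon")
    if seq[-3:] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    # Complete codons read in frame from the first base.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # Exclude the last codon only when it is the in-frame terminal codon.
    internal = codons[:-1] if in_frame else codons
    if any(codon in STOP_CODONS for codon in internal):
        problems.append("in-frame internal stop codon")
    return problems


def filter_cds(lines):
    """Split FASTA records into kept (valid CDS) and rejected (with reasons)."""
    kept, rejected = [], {}
    for name, seq in read_fasta(lines):
        problems = cds_problems(seq)
        if problems:
            rejected[name] = problems
        else:
            kept.append((name, seq))
    return kept, rejected
```

To use it, pass an open file handle, e.g. `kept, rejected = filter_cds(open("cds.fa"))`, then write the `kept` records back out and inspect `rejected` to see why each sequence was dropped.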