8.1 years ago
wu.zhiqiang.1020
Dear all,
I have a small question:
How can I split a GFF file into two separate files based on the strand information (+/-)? The strand is in the 7th column. I would like to get two independent files, one for the forward strand and one for the reverse strand.
Is there a script that could help?
Thanks, ZQ
How can I extract records from the GFF file based only on the 7th column?
For example: if the 7th column is "+", the record goes into the forward GFF;
if the 7th column is "-", the record goes into the reverse GFF.
Any idea how to do this?
Thanks
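The rule described above (test only the 7th column, then route the record to one of two files) could be sketched with awk roughly as follows. This is a minimal sketch, not a tested pipeline: it assumes the GFF is tab-delimited, and the function name split_gff_by_strand and the output file names are hypothetical. Records whose strand is neither "+" nor "-" (for example ".") and comment lines starting with "#" are written to neither file.

```shell
# Hypothetical helper: split a tab-delimited GFF by the strand column.
#   $1 = input GFF, $2 = output file for "+" records, $3 = output file for "-" records
split_gff_by_strand() {
    awk -F'\t' -v fwd="$2" -v rev="$3" '
        /^#/      { next }          # skip header and comment lines
        $7 == "+" { print > fwd }   # exact match on column 7 only
        $7 == "-" { print > rev }
    ' "$1"
}
```

It would be called as: split_gff_by_strand annotation.gff forward.gff reverse.gff (all three file names are placeholders). Because the comparison is an exact match on the 7th field, a "+" or "-" elsewhere in the line has no effect on where the record goes.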
Have you tried a simple grep + and grep -v + filename?
Thanks for your point, but it does not give a correct result. For example, see the second line below:
NC_024218.1 Gnomon CDS 8938650 8939467 . + 2 ID=cds98;Parent=rna107;Dbxref=GeneID:101952514,Genbank:XP_008166494.1;Name=XP_008166494.1;gbkey=CDS;gene=TMTC3;product=transmembrane and TPR repeat-containing protein 3 isoform X2;protein_id=XP_008166494.1
NC_024218.1 Gnomon mRNA 9963121 10072488 . - . ID=rna123;Parent=gene61;Dbxref=GeneID:101932175,Genbank:XM_005279137.2;Name=XM_005279137.2;gbkey=mRNA;gene=ATP2B1;model_evidence=Supporting evidence includes similarity to: 22 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 8 samples with support for all annotated introns;product=ATPase%2C Ca++ transporting%2C plasma membrane 1%2C transcript variant X1;transcript_id=XM_005279137.2
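In some gene annotations, a "+" also appears in the attributes column (for example "Ca++" in the product description of the minus-strand record above), so grep + reports those lines as well.
Thanks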
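The mismatch described above can be checked by counting matches both ways. The sketch below assumes a tab-delimited GFF; the two function names are hypothetical and only meant for illustration. The first counts lines containing "+" anywhere, which is what a plain grep does, and the second counts lines whose 7th column is exactly "+".

```shell
# Hypothetical helpers for comparing the two approaches on one file ($1).

# Counts every line that contains a literal "+" in any column.
count_plus_anywhere() {
    grep -c -F '+' "$1"
}

# Counts only lines whose 7th tab-delimited column is exactly "+".
count_plus_strand() {
    awk -F'\t' '$7 == "+" { n++ } END { print n + 0 }' "$1"
}
```

If the two counts differ for a file, some records carry a "+" outside the strand column, and a grep-based split would put them in the wrong file.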