Split a GFF file based on strand information
8.1 years ago

Dear all,

I have a small question to ask:

How can I split a GFF file into two separate files based on the strand (+/-)? The strand is given in the 7th column. I would like to get two independent files, one for the forward strand and one for the reverse strand.

Is there any script that could help?

Thanks, ZQ

How can I extract lines from the GFF file based only on the 7th column?

For example: if the 7th column is "+", the line goes into the forward GFF;

if the 7th column is "-", the line goes into the reverse GFF.

Any idea how to do this?

Thanks


Have you tried a simple grep '+' filename and grep -v '+' filename?


Thanks for the suggestion, but it does not give the correct result. For example, look at the second of these two lines:

NC_024218.1 Gnomon CDS 8938650 8939467 . + 2 ID=cds98;Parent=rna107;Dbxref=GeneID:101952514,Genbank:XP_008166494.1;Name=XP_008166494.1;gbkey=CDS;gene=TMTC3;product=transmembrane and TPR repeat-containing protein 3 isoform X2;protein_id=XP_008166494.1

NC_024218.1 Gnomon mRNA 9963121 10072488 . - . ID=rna123;Parent=gene61;Dbxref=GeneID:101932175,Genbank:XM_005279137.2;Name=XM_005279137.2;gbkey=mRNA;gene=ATP2B1;model_evidence=Supporting evidence includes similarity to: 22 Proteins%2C and 100%25 coverage of the annotated genomic feature by RNAseq alignments%2C including 8 samples with support for all annotated introns;product=ATPase%2C Ca++ transporting%2C plasma membrane 1%2C transcript variant X1;transcript_id=XM_005279137.2

In some annotations there is a "+" in the attributes column (here "Ca++" in the product name), so grep also reports that line even though its strand is "-".
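
To make the problem concrete, here is a minimal sketch that compares the two approaches on a tiny sample. The two records are abbreviated, made-up stand-ins for the lines above, and the sketch assumes the real file is tab-delimited, as GFF normally is. The variable names (tmp, grep_hits, awk_hits) are only for illustration.

```shell
# Build a two-line, tab-delimited sample (abbreviated, hypothetical records).
tmp=$(mktemp -d)
printf 'chr1\tGnomon\tCDS\t100\t200\t.\t+\t2\tID=cds98;gene=TMTC3\n' > "$tmp/sample.gff"
printf 'chr1\tGnomon\tmRNA\t300\t400\t.\t-\t.\tID=rna123;product=ATPase, Ca++ transporting\n' >> "$tmp/sample.gff"

# grep searches the whole line, so the "-" strand record is counted as well
# because of the "Ca++" in its attributes.
grep_hits=$(grep -c -F '+' "$tmp/sample.gff")

# awk compares only the 7th tab-separated field, so only true "+" records count.
awk_hits=$(awk -F'\t' '$7 == "+" { n++ } END { print n + 0 }' "$tmp/sample.gff")

echo "grep matched $grep_hits line(s), awk matched $awk_hits line(s)"
```

Testing the strand field directly avoids depending on what happens to appear in the attributes.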

Thanks

8.1 years ago
$ gff2bed < input.gff | awk '$6=="+"' > output.forward.bed
$ gff2bed < input.gff | awk '$6=="-"' > output.reverse.bed
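
If the output should stay in GFF format rather than BED, one possible alternative is to filter on the 7th column with awk alone. This is only a sketch under some assumptions: the input is tab-delimited, lines starting with "#" are header or comment lines that should be copied to both outputs, and features with strand "." or "?" are simply left out. The file names and the small generated sample are hypothetical, so replace them with your own.

```shell
# Hypothetical file names inside a scratch directory; point "in" at your own GFF.
workdir=$(mktemp -d)
in="$workdir/input.gff"
fwd="$workdir/output.forward.gff"
rev="$workdir/output.reverse.gff"

# Tiny made-up sample so the sketch runs on its own; skip this for real data.
printf '##gff-version 3\n' > "$in"
printf 'chr1\tsrc\tgene\t1\t9\t.\t+\t.\tID=g1\n' >> "$in"
printf 'chr1\tsrc\tgene\t20\t30\t.\t-\t.\tID=g2\n' >> "$in"
printf 'chr1\tsrc\tregion\t1\t99\t.\t.\t.\tID=r1\n' >> "$in"

# Single pass over the file: route each record by its strand column.
awk -F'\t' -v fwd="$fwd" -v rev="$rev" '
    /^#/      { print > fwd; print > rev; next }  # copy header/comment lines to both
    $7 == "+" { print > fwd }                     # forward-strand features
    $7 == "-" { print > rev }                     # reverse-strand features
' "$in"
```

Reading the file once and writing both outputs from the same awk call keeps the two files consistent with each other, and the records are passed through unchanged.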

Sorry for the late reply, and thanks. It works well. ZQ
