Strand in gff file
2.5 years ago
blz ▴ 30

Hello,

It's a basic question, but what does the strand column in a GFF file mean? When a gene is annotated on the + strand, is the sequence I see on the + strand the reverse complement of the mRNA, or is it identical to the mRNA sequence? I'm asking because I need mRNA sequences and I don't know how to get them.

Thanks,

strand mRNA-sequence gff transcript-strand • 1.6k views

If a gene is on the positive strand, the mRNA has the same sequence as the reference sequence (with U in place of T, and without the introns once it is spliced). You don't need to code this from scratch: the sequences can be retrieved with command-line tools or from programming languages like R or Python.

2.5 years ago

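The coordinates are always given relative to the forward strand; the strand column only indicates the direction of the feature. So a feature on the + strand reads the same as the reference, and for a feature on the reverse strand you need to reverse complement the extracted sequence.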
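To make that rule concrete, here is a minimal Python sketch. It assumes the chromosome sequence is already loaded as a string and that start/end are GFF coordinates (1-based, inclusive). The function names `reverse_complement`, `feature_sequence` and `to_mrna` are made up for this illustration and do not come from any library.

```python
# Illustration only: helper names are hypothetical, not from a library.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def feature_sequence(chrom_seq, start, end, strand):
    """Return the feature sequence in its own 5'->3' orientation.

    start/end are GFF coordinates: 1-based, inclusive, and always
    counted along the forward strand, whatever the strand column says.
    """
    sub = chrom_seq[start - 1:end]  # convert to Python's 0-based slice
    if strand == "-":
        sub = reverse_complement(sub)
    return sub


def to_mrna(dna):
    """Write the same sequence in the RNA alphabet (T -> U)."""
    return dna.replace("T", "U").replace("t", "u")


chrom = "ATGGCCTTAGAC"  # toy reference, assumed for the example
print(feature_sequence(chrom, 1, 6, "+"))           # ATGGCC
print(feature_sequence(chrom, 1, 6, "-"))           # GGCCAT
print(to_mrna(feature_sequence(chrom, 1, 6, "+")))  # AUGGCC
```

The same coordinates give two different sequences depending on the strand column, which is all the strand field changes.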

To obtain sequences use a tool like bedtools getfasta:

bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>

Options: 
    ...
    -s      Force strandedness. If the feature occupies the antisense,
            strand, the sequence will be reverse complemented.
    ...

If the transcripts span multiple exons, you could use gffread:

Filter, convert or cluster GFF/GTF/BED records, extract the sequence of transcripts (exon or CDS) and more.
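For intuition about what extracting a multi-exon transcript involves, here is a rough Python sketch, not gffread's actual implementation. It assumes the exons of one transcript are given as (start, end) pairs in GFF coordinates (1-based, inclusive) and that the chromosome is a plain string; `spliced_transcript` is a hypothetical name used only here.

```python
# Illustration only: not how gffread is implemented.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def spliced_transcript(chrom_seq, exons, strand):
    """Join the exons of one transcript and orient the result.

    exons: list of (start, end) GFF coordinates, 1-based inclusive.
    Exons are concatenated in forward-strand order first; for a
    '-' strand transcript the joined sequence is then reverse
    complemented as a whole.
    """
    ordered = sorted(exons)
    joined = "".join(chrom_seq[s - 1:e] for s, e in ordered)
    if strand == "-":
        joined = joined.translate(_COMPLEMENT)[::-1]
    return joined


chrom = "ATGGCCTTAGAC"     # toy reference, assumed for the example
exons = [(9, 12), (1, 3)]  # deliberately unsorted
print(spliced_transcript(chrom, exons, "+"))  # ATGAGAC
print(spliced_transcript(chrom, exons, "-"))  # GTCTCAT
```

Reverse complementing after joining is equivalent to reverse complementing each exon and joining them in reverse order, and it is harder to get wrong.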
