How to extract fasta sequences from assembled transcripts generated by Stringtie
7.6 years ago
seta ★ 1.9k

Hi all,

I used STAR to map reads to the reference genome and StringTie for assembly. As you know, StringTie writes the assembled transcripts in GTF format. Now I want the FASTA sequences of the assembled transcripts. I tried gffread, but all the sequences had the same header! Maybe it's not compatible with StringTie.

Could you please help me convert the transcripts assembled by StringTie from GTF format to FASTA format?

Thanks

fasta gtf stringtie • 11k views

Use gffread; you can find it in the Cufflinks package.

7.3 years ago
zzqr ▴ 50

The stringtie_merged.gtf file has seqname, start, end, and strand info. So you can use an R GRanges object and the getSeq function from the GenomicRanges and BSgenome packages to retrieve the sequences.
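
The same idea (look up each exon by seqname, start, end, and strand, then join the exons of a transcript) can be sketched without R. This is a minimal Python illustration, not the GenomicRanges/BSgenome workflow itself; the function names `read_fasta`, `revcomp`, and `transcripts_from_gtf` are made up for this example. It assumes the whole genome fits in memory, that exon records carry a `transcript_id "..."` attribute, and it treats a strand of `.` as plus.

```python
from collections import defaultdict
import re

def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}; the name is the first word of the header."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line)
    return {k: "".join(v) for k, v in seqs.items()}

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def revcomp(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def transcripts_from_gtf(gtf_lines, genome):
    """Return {transcript_id: spliced sequence} built from the exon records of a GTF."""
    exons = defaultdict(list)
    strand_of = {}
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        f = line.rstrip("\n").split("\t")
        if len(f) < 9 or f[2] != "exon":
            continue  # only exon records contribute sequence
        m = re.search(r'transcript_id "([^"]+)"', f[8])
        if not m:
            continue
        tid = m.group(1)
        exons[tid].append((f[0], int(f[3]), int(f[4])))
        strand_of[tid] = f[6]
    out = {}
    for tid, parts in exons.items():
        parts.sort(key=lambda p: p[1])  # genomic order by start
        # GTF coordinates are 1-based and inclusive
        seq = "".join(genome[chrom][start - 1:end] for chrom, start, end in parts)
        out[tid] = revcomp(seq) if strand_of[tid] == "-" else seq
    return out
```

To use it, pass open file handles, for example `transcripts_from_gtf(open("stringtie_merged.gtf"), read_fasta(open("genome.fa")))`, and write each entry out as `>transcript_id` followed by its sequence. Because the headers come from `transcript_id`, each record gets a distinct name.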

5.3 years ago
Juke34 9.0k

I use agat_sp_extract_sequences.pl from AGAT.

agat_sp_extract_sequences.pl --cdna --gff input.gtf --fasta genome.fa -o output.fa


thanks for your response

7.3 years ago

You can also use bedtools getfasta to fetch sequences from GTF or BED files.


UPDATE

Here is the perfect solution


I used this, but I ran into the following error:

"Error (GFaSeqGet): subsequence cannot be larger than 465 Error getting subseq for gene1 (465..1503)!"

Did you have any issues using gffread?

Thanks


There is a Python script that fixes this error; see A: gffread error when extracting transcript sequences from gtf, coordinates exceed
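
Before applying any fix, it can help to see which GTF records point past the end of their reference sequence. The error above reads as if positions 465..1503 were requested on a sequence named gene1 that is only 465 bases long, which would be consistent with a FASTA that does not match the GTF, though that is only one possible cause. The following is a separate diagnostic sketch, not the linked script; `fasta_lengths` and `out_of_range_records` are hypothetical names chosen for this example.

```python
def fasta_lengths(lines):
    """Return {sequence name: length}; the name is the first word of each header."""
    lengths, name = {}, None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths

def out_of_range_records(gtf_lines, lengths):
    """Yield (line_number, seqname, end, seq_length) for records that cannot be extracted.

    seq_length is None when the seqname does not appear in the FASTA at all.
    """
    for n, line in enumerate(gtf_lines, start=1):
        if line.startswith("#") or not line.strip():
            continue
        f = line.split("\t")
        if len(f) < 5:
            continue  # not a feature line
        seqname, end = f[0], int(f[4])  # column 5 is the 1-based inclusive end
        length = lengths.get(seqname)
        if length is None or end > length:
            yield n, seqname, end, length
```

Running `out_of_range_records(open("stringtie_output.gtf"), fasta_lengths(open("ref_genome.fa")))` lists the offending lines. If many records are reported, checking that the FASTA is the same genome build the reads were mapped to is a reasonable first step.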

3.9 years ago
DNAvinci • 0

Per http://ccb.jhu.edu/software/stringtie/gff.shtml#gffread_ex:

gffread \
-w assembled_transcripts.fa \
-g ref_genome.fa \
-E cov_refs.gtf \
./stringtie_output.gtf