Question

How do I quantify expression with StringTie using the merged GTF generated by stringtie merge for all my samples?

0

7 months ago

Varsha • 0

stringtie -e -B -G /mnt/Data/VARSHA/COMBINED/ANNOTATION_FILES/ANNOTATION_FILE_MERGED/lncipediahg38_renamed.gtf /mnt/Data/VARSHA/COMBINED/SAMPLES/MERGED_STRINGTIE/samples_merged.gtf -p 32 -A gene_abundance.tsv -o sample_quantified.gtf

I ran the above command, but I get the following warning. How do I fix it?

WARNING: no reference transcripts were found for the genomic sequences where reads were mapped!

Please make sure the -G annotation file uses the same naming convention for the genome sequences.

Also, I got TPM values of 0.0 for all the samples. What does that signify?

stringtie quantification • 696 views

1

I'll ask the same thing I always ask when people use StringTie for human data: do you really need a transcript assembly, or do you just want gene counts? If the latter, just use something like Salmon to quantify directly from the FASTQ files and skip StringTie, which is both unnecessary and overly complicated for this purpose.

ADD REPLY • link 7 months ago by ATpoint 86k

score 1 · Answer 1 · 2024-05-23

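If you are asking StringTie to estimate expression, then you need to provide it with BAM or CRAM files of the original aligned reads as the input, not a StringTie-assembled GTF. Provide the GTF output from stringtie merge to -G, and pass the BAM files from which that GTF was assembled as the input, one sample per run. The warning and the 0.0 TPM values both follow from this: with a GTF in place of alignments, StringTie has no reads to count.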
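As a rough sketch of what that looks like per sample, assuming your BAM files sit in a directory called `bams/` and the merged GTF is `samples_merged.gtf` (both names, and the `quant/` output layout, are placeholders I made up; substitute your own paths). It only prints the commands unless you set `RUN=1`:

```shell
#!/usr/bin/env bash
# Sketch only: all paths below are hypothetical placeholders.
MERGED_GTF="${MERGED_GTF:-samples_merged.gtf}"   # output of `stringtie --merge`
THREADS="${THREADS:-8}"

# Build the StringTie quantification command for one BAM file.
stringtie_cmd() {
    local bam="$1"
    local sample
    sample="$(basename "$bam" .bam)"
    # -e: only estimate abundance of the transcripts given with -G
    # -B: also write Ballgown table files next to the -o output
    # The BAM file (not a GTF) is the positional input.
    echo "stringtie -e -B -p $THREADS -G $MERGED_GTF" \
         "-A quant/$sample/gene_abundance.tsv" \
         "-o quant/$sample/$sample.gtf $bam"
}

# Dry run by default: print one command per sample. Set RUN=1 to execute.
for bam in bams/*.bam; do
    [ -e "$bam" ] || continue          # skip if the glob matched nothing
    cmd="$(stringtie_cmd "$bam")"
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
        mkdir -p "quant/$(basename "$bam" .bam)"
        eval "$cmd"                    # assumes paths without spaces
    fi
done
```

Each sample gets its own output directory, so the per-sample GTF, gene abundance table and Ballgown tables do not overwrite each other.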

However, StringTie is not regarded as a particularly good way to quantify expression. It does a pretty good job of its primary purpose - assembling novel isoform structures from reads - but not a particularly good job of quantifying the expression of those isoforms. I would only quantify with StringTie if you have a really good reason to be using it for quantification.

If you are working with human data, then the reference transcript annotation is usually good enough for most purposes. In that case just use the standard reference annotation, and quantify with something like Salmon or RSEM.

If you have a reason to want to assemble novel isoform structures (and such reasons do exist), then I would create the transcript assembly using StringTie/StringTie merge, and then quantify that novel assembly using something like Salmon or RSEM.
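A minimal sketch of that second route with Salmon, assuming paired-end reads and that gffread is used to pull transcript sequences out of the genome (gffread is my choice here, not something required). The file names `genome.fa`, `samples_merged.gtf`, the index and output directories, and the example sample are all hypothetical. The script only prints the commands so you can review them first:

```shell
#!/usr/bin/env bash
# Sketch only: all file and directory names are hypothetical placeholders.
GENOME_FA="${GENOME_FA:-genome.fa}"              # genome the reads were aligned to
MERGED_GTF="${MERGED_GTF:-samples_merged.gtf}"   # output of `stringtie --merge`
THREADS="${THREADS:-8}"

# One-off preparation: extract transcript sequences, then build a Salmon index.
prep_cmds() {
    # gffread -w writes spliced transcript sequences for every transcript in the GTF
    echo "gffread -w merged_transcripts.fa -g $GENOME_FA $MERGED_GTF"
    echo "salmon index -t merged_transcripts.fa -i merged_index"
}

# Per-sample quantification from paired FASTQ files.
quant_cmd() {
    local sample="$1" r1="$2" r2="$3"
    # -l A lets Salmon infer the library type automatically
    echo "salmon quant -i merged_index -l A -1 $r1 -2 $r2 -p $THREADS -o salmon/$sample"
}

prep_cmds
quant_cmd sampleA sampleA_R1.fastq.gz sampleA_R2.fastq.gz
```

Salmon writes transcript-level estimates to `quant.sf` in each output directory; for gene-level summaries these are commonly aggregated afterwards, for example with tximport in R.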