Hi friends,
I need TPM values for a gene correlation analysis across multiple experiments. How can I get TPM values from the STAR output?
Please help me with the following questions.
According to the manual, I should use the second column for downstream analysis, right?
Is this the TPM value or just the raw counts?
If these are not TPM values, what method do you recommend to get them?
This is the head of SRR5570693_ReadsPerGene.out.tab:
==> SRR5570693_ReadsPerGene.out.tab <==
N_unmapped 2047114 2047114 2047114
N_multimapping 1022855 1022855 1022855
N_noFeature 545783 5073330 4998597
N_ambiguous 72548 10439 14263
LOC119628875 0 0 0
LOC101741181 471 233 238
LOC101739479 855 421 434
LOC105842509 14 5 9
LOC101741407 354 190 164
LOC101739615 67 35 32
Many thanks!
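For reference, the columns in ReadsPerGene.out.tab are raw read counts (column 2 is the unstranded count, columns 3 and 4 are the two stranded variants), not TPM. A minimal Python sketch of turning such counts into TPM is below. It assumes you have a length in base pairs for every gene, which STAR does not provide; the lengths used here are made up for illustration, and the function names are hypothetical.

```python
import io

def read_star_counts(handle, column=1):
    """Parse a STAR ReadsPerGene.out.tab stream into {gene_id: count}.

    column=1 is the unstranded count (2nd column of the file);
    2 and 3 select the two stranded variants.
    """
    counts = {}
    for line in handle:
        fields = line.split()
        if not fields or fields[0].startswith("N_"):
            continue  # skip blank lines and the N_* summary rows
        counts[fields[0]] = int(fields[column])
    return counts

def counts_to_tpm(counts, lengths_bp):
    """Convert raw counts to TPM using per-gene lengths in base pairs."""
    # Step 1: reads per kilobase (count divided by gene length in kb).
    rpk = {gene: c / (lengths_bp[gene] / 1000.0) for gene, c in counts.items()}
    # Step 2: rescale so that the values sum to one million.
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {gene: 0.0 for gene in rpk}
    return {gene: value / scale for gene, value in rpk.items()}

# Two rows taken from the table above, plus one summary row to show filtering.
example = """N_unmapped 2047114 2047114 2047114
LOC101741181 471 233 238
LOC101739479 855 421 434
"""
# ASSUMPTION: hypothetical gene lengths, not taken from any annotation.
lengths = {"LOC101741181": 1500, "LOC101739479": 3000}

counts = read_star_counts(io.StringIO(example))
tpm = counts_to_tpm(counts, lengths)
for gene, value in tpm.items():
    print(gene, round(value, 2))
```

In practice the lengths would come from the same annotation that was used for counting, and TPM values are only comparable across samples when the same gene set is used for every sample.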
You can also directly use salmon on the transcriptome BAM output of STAR (though I believe OP currently has a genome-centric BAM).
You can tell STAR to output a transcriptome-centered BAM, even if the alignment was to the genome.
Yes, but you have to provide that argument during alignment (i.e. if OP already has just a genome-centric BAM they will have to realign).
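As a rough sketch of what that realignment plus salmon step could look like, the Python snippet below only builds and prints the two command lines; it does not run them. It assumes paired-end reads, a STAR index that was generated with an annotation, and placeholder paths throughout. The helper names are hypothetical. `--quantMode TranscriptomeSAM` is the STAR option that writes `Aligned.toTranscriptome.out.bam`, and `salmon quant -a` is salmon's alignment-based mode.

```python
import shlex

def star_transcriptome_cmd(genome_dir, fastq_1, fastq_2, prefix, threads=8):
    """Build a STAR command that also writes a transcriptome-centric BAM."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", fastq_1, fastq_2,
        # TranscriptomeSAM -> <prefix>Aligned.toTranscriptome.out.bam
        # GeneCounts       -> <prefix>ReadsPerGene.out.tab
        "--quantMode", "TranscriptomeSAM", "GeneCounts",
        "--outFileNamePrefix", prefix,
    ]

def transcriptome_bam_path(prefix):
    """Name of the transcriptome BAM that STAR writes for a given prefix."""
    return prefix + "Aligned.toTranscriptome.out.bam"

def salmon_alignment_cmd(transcripts_fasta, bam, out_dir, lib_type):
    """Build a salmon command for alignment-based quantification."""
    return [
        "salmon", "quant",
        "-t", transcripts_fasta,  # same transcripts the BAM refers to
        "-l", lib_type,
        "-a", bam,
        "-o", out_dir,
    ]

# ASSUMPTION: every path below is a placeholder, not taken from the thread.
prefix = "SRR5570693_"
star_cmd = star_transcriptome_cmd("star_index", "reads_1.fastq", "reads_2.fastq", prefix)
# "A" asks salmon to infer the library type; pass an explicit type
# (for example "IU") if your salmon version does not accept it here.
salmon_cmd = salmon_alignment_cmd(
    "transcripts.fa", transcriptome_bam_path(prefix), "salmon_out", "A"
)

print(shlex.join(star_cmd))
print(shlex.join(salmon_cmd))
```

Salmon reports TPM per transcript in its quant.sf output, so a transcript-to-gene summation is still needed for gene-level values.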
Hi friends, many thanks for your reply. Now I am using kallisto:
But the results are not by gene (LOCxxxxxx) as I need. Do you have any idea what is wrong?
These are the results:
Those are chromosomes, not genes or transcripts.
Hi swbarnes2, many thanks for your reply. Yes, I think that was a problem when I made the index. Do you think that is correct? I don't know why the result comes out for chromosomes and not for genes.
Many thanks again.
Have you looked up what types of input files Kallisto wants for making the index?
Do you understand how Kallisto is different from STAR?
Many thanks for your reply swbarnes2! You are right. Now I understand the difference. I needed to use rna.fasta as input to build the kallisto transcriptome index. I was working with the genomic.fasta file. Many thanks again!
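With an index built from rna.fasta, kallisto reports one row per transcript in abundance.tsv (columns target_id, length, eff_length, est_counts, tpm), so getting values per gene (LOCxxxxxx) still needs one more step. A minimal Python sketch of that step is below: it sums transcript TPM per gene. The transcript IDs, TPM values, and the transcript-to-gene map are all hypothetical here; a real map would have to be derived from the annotation that matches rna.fasta.

```python
import csv
import io
from collections import defaultdict

def gene_tpm_from_kallisto(abundance_handle, tx2gene):
    """Sum transcript-level TPM from a kallisto abundance.tsv per gene."""
    gene_tpm = defaultdict(float)
    reader = csv.DictReader(abundance_handle, delimiter="\t")
    for row in reader:
        gene = tx2gene.get(row["target_id"])
        if gene is None:
            continue  # transcript missing from the map; skipped in this sketch
        gene_tpm[gene] += float(row["tpm"])
    return dict(gene_tpm)

# ASSUMPTION: made-up transcript IDs and numbers, for illustration only.
abundance = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "tx_A\t1500\t1350.0\t120\t10.0\n"
    "tx_B\t900\t750.0\t40\t5.5\n"
    "tx_C\t2100\t1950.0\t30\t2.0\n"
)
# ASSUMPTION: hypothetical transcript-to-gene map.
tx2gene = {
    "tx_A": "LOC101741181",
    "tx_B": "LOC101741181",
    "tx_C": "LOC101739479",
}

gene_level = gene_tpm_from_kallisto(io.StringIO(abundance), tx2gene)
for gene, value in gene_level.items():
    print(gene, value)
```

Summing TPM across the transcripts of a gene keeps the values on the TPM scale, which is why this works directly, whereas raw counts would first need length normalization.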