5.3 years ago
ali.hakimzadeh
Hi all,
I am trying to run Cufflinks on BAM files generated by TopHat, but I get a sort order error. I looked through other posts on this topic, but they were not much help. I tried to fix it with samtools sort, both with and without the -n option, but that did not work. I also checked the headers of the BAM files, and they are the same:
@HD VN:1.0 SO:coordinate
The command that I use is:
./cuffdiff -o cuffdiff_output \
-b /home/alihkz/References/Trasncriptome/gencode.v19.annotation.fa \
-p 8 \
-L pre,pro \
-u ./Homo_sapiens.GRCh37.87.chr.gtf \
/home/alihkz/bam/high/sample25.bam \
/home/alihkz/bam/high/sample26.bam
Can someone clarify what the source of my error is?
Thanks in advance for any comments.
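Only the @HD line is compared above, so it may be worth comparing the rest of the headers as well. Below is a minimal sketch, not a confirmed fix: it assumes the headers have been dumped with `samtools view -H`, and the function name `sam_header_summary` is made up for illustration. It prints the declared sort order and the reference sequence names in the order of the @SQ lines, so the output for two BAM files can be diffed. Whether a difference in reference order or naming is what triggers the error here is not established in this thread.

```shell
# Hypothetical helper: read a SAM header on stdin and print the declared
# sort order followed by the reference names in @SQ order.
sam_header_summary() {
  awk '
    $1 == "@HD" { for (i = 2; i <= NF; i++) if ($i ~ /^SO:/) print "sort_order=" substr($i, 4) }
    $1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print "ref=" substr($i, 4) }
  '
}

# Possible usage, assuming samtools is installed and the paths exist:
#   samtools view -H sample25.bam | sam_header_summary > s25.txt
#   samtools view -H sample26.bam | sam_header_summary > s26.txt
#   diff s25.txt s26.txt
```

An empty diff would mean both files declare the same sort order and list the same references in the same order.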
Not answering your question, but do you really need to use tophat and cufflinks? Both tools are quite old and considered deprecated (at least tophat is by its senior author Lior Pachter, not sure about cufflinks). There are more recent alternatives such as star or hisat2 for alignment, or salmon and kallisto for transcript quantification, and differential analysis tools such as edgeR, sleuth and DESeq2. If you do not have a legacy burden that forces you to use these old tools, consider switching. The so-called pseudoaligners (salmon/kallisto) are also much faster than traditional alignments; check the publications to learn about other advantages.
I know, my friend, but this is what my supervisor asked me to do and unfortunately I have no choice! I did it before with "edgeR" and got results, but with cufflinks I have this problem. I want to get the 'FPKM' values, so is there another tool that I can get 'FPKM' values from?
It's been a very long time since I used cuffdiff because, as ATpoint mentioned, the method is deprecated. It must have been 2013 when I used it for the last time... However, I remember that for the -u argument I used the gtf file generated with cuffmerge, not an annotation file downloaded from UCSC or Ensembl (which yours likely is). So my guess is that this is the problem here. Furthermore, you also ask for FPKM values, which are also deprecated and highly NOT recommended (are you sure that your supervisor is up to date with the latest developments? Do you want to follow blindly?). Most people here on Biostars agree with that. If you still want to use these values, for example for clustering or visualization, you can compute them in e.g. edgeR.
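For reference, the arithmetic behind an FPKM value is fragments × 10^9 / (feature length in bp × total mapped fragments). Here is a rough sketch of that formula only, not of what edgeR or cuffdiff do internally. The function name `fpkm_from_counts` and the three-column tab-separated input (gene, length in bp, fragment count) are assumptions for illustration, and the library size is taken as the sum of the count column, which may differ from the library size a dedicated tool would use.

```shell
# Hypothetical helper: read "gene<TAB>length_bp<TAB>count" lines on stdin
# and print "gene<TAB>FPKM", using the column sum as the library size.
fpkm_from_counts() {
  awk -F'\t' '
    { gene[NR] = $1; len[NR] = $2; cnt[NR] = $3; total += $3 }
    END {
      for (i = 1; i <= NR; i++) {
        if (len[i] > 0 && total > 0)
          printf "%s\t%.2f\n", gene[i], (cnt[i] * 1e9) / (len[i] * total)
        else
          printf "%s\tNA\n", gene[i]   # undefined for zero length or empty library
      }
    }'
}

# Possible usage with a counts table of your own (file name is a placeholder):
#   fpkm_from_counts < counts.tsv > fpkm.tsv
```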
@benn thanks, I will try this too. Indeed, you are absolutely right, but I have no choice as it's my thesis. I told him in the beginning, when he suggested tophat, that this stuff is old and untrustworthy, but he refused to accept it as he is also not really into RNA-seq stuff...