Question

Too much time for using cufflinks


7.7 years ago

hasani.iut6 ▴ 60

Hi all,

I want to run the Cufflinks pipeline as explained on its site. I have used the paired-end sample from the Illumina Body Map project with assay name B20G06AAX1_s7. After unzipping, the data are 13 GB for each of the paired files.

For the reference files I've used the genome and annotation files from the GENCODE site.

The first step of the pipeline is transcript assembly, which generates a .gtf file. For this purpose, I've used the following command:

cufflinks -b genome_38.fa -g gencode.v21.chr_patch_hapl_scaff.annotation.gtf ERR030885.sam

ERR030885.sam is the output of the TopHat alignment.

This step takes a lot of time. After 2 weeks, only half of the processing is done on a computer with a Core i7 processor and 32 GB of RAM. I don't think it was supposed to take this long. Please tell me where I am going wrong.

Thanks.

Mansoor.

cufflinks RNA-Seq transcripts • 2.3k views

ADD COMMENT • link updated 7.7 years ago by Jeffin Rockey ★ 1.3k • written 7.7 years ago by hasani.iut6 ▴ 60


You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or bbmap, or pseudo-alignment using kallisto or salmon.

ADD REPLY • link 7.7 years ago by WouterDeCoster 47k


Thanks a lot. Yes, you are right: the most recent papers report that Cufflinks output is not reliable, but I just want to run its pipeline completely for a comparison.

ADD REPLY • link 7.7 years ago by hasani.iut6 ▴ 60

score 2 · Accepted Answer · 2017-05-24

It is the "-g" option that is taking the time. In my experience, whenever -g is used, Cufflinks keeps running for a very long time.

However, the steps in the protocol paper will not take that long. They should finish within a day or two.

(In that method, -g is used at the cuffmerge step, which finishes in a few minutes.)

Please see the protocol paper and check whether doing it that way would suffice for your study.
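
The order of steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the file names are taken from the question, while the thread count, the output directory name `cufflinks_out` and the `run` helper are hypothetical. The `-b` bias-correction option from the original command is left out to keep the assembly step minimal; add it back if you need it. By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch: assemble WITHOUT -g, then bring the annotation in at cuffmerge.
# File names come from the question; THREADS and OUT are assumptions.
GENOME=genome_38.fa
ANNOTATION=gencode.v21.chr_patch_hapl_scaff.annotation.gtf
ALIGNMENTS=ERR030885.sam
THREADS=8
OUT=cufflinks_out

# Hypothetical helper: print the command unless RUN=1, then execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Step 1: transcript assembly with no reference annotation (no -g).
run cufflinks -p "$THREADS" -o "$OUT" "$ALIGNMENTS"

# Step 2: list the assembled GTF files, one per line, for cuffmerge.
echo "$OUT/transcripts.gtf" > assemblies.txt

# Step 3: merge, supplying the reference annotation here via -g.
run cuffmerge -g "$ANNOTATION" -s "$GENOME" -p "$THREADS" assemblies.txt
```

With more than one sample, each sample would get its own cufflinks run and its own line in assemblies.txt before the single cuffmerge call.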