Mapping rate of circulating tumor cell RNA-Seq data is too low!
7.5 years ago

Dear friends of Biostar, I am a newcomer to the circulating tumor cell field. Recently I have been practising on other people's data (GSE51827). I tried to map the FASTQ files to GRCh37.primary_assembly.genome.fa with TopHat and Bowtie. However, the mapping rate of every run is too low.

tophat -p 10 --bowtie1 --no-novel-juncs -G $gtf2 -o $workshop/raw_data/${file%%.*}    ${ref2%.*}     $workshop/raw_data/$file
For example, for the sample SRR1020057.fastq the alignment summary is:
Reads:
          Input     :   1273615
           Mapped   :       777 ( 0.1% of input)
            of these:        82 (10.6%) have multiple alignments (0 have >20)
 0.1% overall read mapping rate.
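
To compare runs, the summary above can be reduced to one line per sample. This is only a minimal sketch: it assumes the single-end summary layout shown here, and the function name `mapping_rate` is made up for illustration.

```shell
# Print "<input reads> <mapped reads> <percent mapped>" (tab-separated)
# from a TopHat alignment summary in the single-end layout shown above.
mapping_rate() {
    awk '
        $1 == "Input"  && $2 == ":" { input  = $3 }
        $1 == "Mapped" && $2 == ":" { mapped = $3 }
        END {
            # Fail if no "Input" line was found, to avoid dividing by zero.
            if (input > 0)
                printf "%d\t%d\t%.3f\n", input, mapped, 100 * mapped / input
            else
                exit 1
        }
    ' "$1"
}
```

For the numbers above this gives 777 / 1273615, i.e. about 0.061%, which TopHat rounds to 0.1%.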

I also ran QC on the FASTQ files, and their quality looks fine. Can anyone help me resolve this problem? Thanks!

alignment • 2.3k views

export ref2="/home/zhanghl/supporting_files/Homo_sapiens_glk_v37/GENE_CODE/bowtie/GRCh37.primary_assembly.genome.fa"
export gtf2="/home/zhanghl/supporting_files/Homo_sapiens_glk_v37/GENE_CODE/gencode.v25lift37.annotation.gtf"


First, I suggest using TopHat2 and Bowtie2 instead.

Also, are these single-end reads? And have you tried removing --no-novel-juncs?


Dear Kennethcondon2007, thanks for your reply. I tried tophat+bowtie2 and tophat+bowtie, and I also removed --no-novel-juncs. None of these made much difference. The datasets are single-end reads. tophat2+bowtie2 may not help.


Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.


Have you scanned and trimmed this data?


Yes, I trimmed the data and mapped it again, but it did not make much difference.


Why are you not using GRCh38, and did you create indexes?
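
One quick way to answer the index question is to look for the index files next to the basename that the command passes to TopHat (`${ref2%.*}`, which strips the trailing `.fa`). The sketch below relies only on the usual file suffixes, `.1.ebwt`/`.1.ebwtl` for Bowtie1 and `.1.bt2`/`.1.bt2l` for Bowtie2. The function name `index_kinds` is hypothetical.

```shell
# Report which Bowtie index flavours exist for an index basename.
# Prints "bowtie1", "bowtie2", "bowtie1 bowtie2", or "none" (exit status 1).
index_kinds() {
    local base="$1" found=""
    # Bowtie1 indexes end in .ebwt (.ebwtl for large genomes).
    if [ -e "$base.1.ebwt" ] || [ -e "$base.1.ebwtl" ]; then
        found="bowtie1"
    fi
    # Bowtie2 indexes end in .bt2 (.bt2l for large genomes).
    if [ -e "$base.1.bt2" ] || [ -e "$base.1.bt2l" ]; then
        found="${found:+$found }bowtie2"
    fi
    if [ -z "$found" ]; then
        echo "none"
        return 1
    fi
    echo "$found"
}

# Usage, with the variable from the thread: index_kinds "${ref2%.*}"
```

Note that the suffix alone does not say whether a Bowtie1 index was built in nucleotide space or colorspace, since both use `.ebwt`.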


Yes, I used GRCh38 and also hg19 as the reference. I think I may have found the problem and am testing it now. The downloaded data is colorspace FASTQ, so the colorspace reads should be aligned to a colorspace index. When it works I will post the result. Thanks.
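
A quick check for colorspace input is to look at the sequence lines: SOLiD colorspace reads consist of a primer base followed by colour calls (`0` to `3`, with `.` for a missing call) instead of nucleotides. A minimal sketch, assuming an uncompressed four-line-per-record FASTQ. The function name `looks_colorspace` and the default of 1000 records are arbitrary choices for illustration.

```shell
# Exit status 0 if every sequence line in the first N records looks like
# SOLiD colorspace (primer base + colour calls 0-3 or '.'), 1 otherwise.
looks_colorspace() {
    local fastq="$1" n="${2:-1000}"
    head -n "$((n * 4))" "$fastq" | awk '
        NR % 4 == 2 {                      # line 2 of each record is the sequence
            total++
            if ($0 ~ /^[ACGT][0-3.]+$/) cs++
        }
        END { exit !(total > 0 && cs == total) }
    '
}

# Usage: looks_colorspace SRR1020057.fastq && echo "colorspace reads"
```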


I don't see the --color option in your command line, which you need if these are colorspace reads. You may have to go back to an older version of TopHat if the one you are using does not support that option.

That said, a recent thread on SA had this to say about colorspace data. Consider your options.
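
For reference, a colorspace run would roughly need the two commands printed by the sketch below: a Bowtie1 index built with `bowtie-build -C`, and the original TopHat call with `--color` added (Bowtie2 does not align colorspace reads, so `--bowtie1` stays). The function only prints the commands and does not run them. The function name and the example file names are placeholders, and whether a given TopHat release accepts colorspace FASTQ directly or needs separate csfasta/qual files with `--quals` should be checked against its manual.

```shell
# Print (do not run) the commands for a colorspace alignment.
# Arguments: genome FASTA, colorspace index basename, GTF, reads file, output dir.
colorspace_cmds() {
    local fasta="$1" index_base="$2" gtf="$3" reads="$4" outdir="$5"
    # -C builds a colorspace Bowtie1 index from the nucleotide FASTA.
    printf 'bowtie-build -C %s %s\n' "$fasta" "$index_base"
    # Same options as the original command, plus --color.
    printf 'tophat -p 10 --bowtie1 --color --no-novel-juncs -G %s -o %s %s %s\n' \
        "$gtf" "$outdir" "$index_base" "$reads"
}

# Usage (placeholder names):
# colorspace_cmds genome.fa genome_cs annotation.gtf SRR1020057.fastq out_dir
```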
