Question

Mapping 454 reads to reference genome

10.8 years ago

Ginsea Chen ▴ 140

I obtained some 454 reads from NCBI and tried to map them to a reference genome and estimate abundance by calculating FPKM values. The TopHat website says that TopHat version 2 can map 454 reads, so I ran the same commands I normally use for Illumina HiSeq 2000 data. The mapping results were poor: accepted_hits.bam was small (about 500 KB) and all FPKM values were zero. I don't know how to solve this, so I am asking for your help. Here is what I did:

$ fastq-dump -I --split-files SRR123456.sra # convert the SRA file to FASTQ files
$ tophat -p 8 -G genes.gtf -o SRR123456 genome SRR123456_1.fastq SRR123456_2.fastq # TopHat version 2

After this command, I checked accepted_hits.bam in the SRR123456 folder and found that the file was very small.

Note: the SRA file contains 454 reads. I had no problems running the same workflow on Illumina reads (such as HiSeq 2000).

454 RNA-Seq Reads • 3.7k views

updated 3.5 years ago by Ram 45k • written 10.8 years ago by Ginsea Chen ▴ 140

TopHat won't give you FPKM values; it only gives you mapped reads. To get FPKM, you need to use Cufflinks or another available tool, such as bedtools (using coverageBed to count reads per feature).
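To make the distinction concrete, here is a minimal sketch of the basic FPKM definition, for the case where you already have a per-feature read count (for example from coverageBed) and the total number of mapped fragments. The function name `fpkm` and the example numbers are hypothetical; Cufflinks derives its estimates from a statistical model, so its values will generally differ from this simple calculation.

```shell
# Hypothetical helper, not part of TopHat, Cufflinks, or bedtools.
# Basic definition:
#   FPKM = fragments * 1e9 / (feature_length_bp * total_mapped_fragments)
# Arguments: <fragments on feature> <feature length in bp> <total mapped fragments>
fpkm() {
    awk -v n="$1" -v len="$2" -v total="$3" 'BEGIN {
        # Guard against division by zero (e.g. nothing mapped at all).
        if (len <= 0 || total <= 0) { print "NA"; exit }
        printf "%.2f\n", n * 1e9 / (len * total)
    }'
}

# Example with made-up numbers: 50 fragments on a 2,000 bp transcript,
# 1,000,000 mapped fragments in total.
fpkm 50 2000 1000000   # prints 25.00
```

If almost nothing mapped, the numerator is zero for nearly every gene, which would be consistent with the all-zero FPKM values described in the question.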

updated 3.5 years ago by Ram 45k • written 10.8 years ago by Chirag Nepal ★ 2.4k

Sounds like you're not mapping for some reason: What errors are you receiving, if any? Sounds like you don't have any "hits" in your BAM/SAM.

Can you be a little more explicit in your workflow from your SRA file?
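One way to make the workflow more explicit is to check what fastq-dump actually wrote before passing the files to TopHat. The sketch below counts the records in each FASTQ file, assuming the standard four-lines-per-record layout; `fastq_count` is a hypothetical helper and the file names are taken from the command in the question. TopHat treats two read files as paired mates, so a missing file or very different counts would be worth reporting.

```shell
# Hypothetical helper: count records in a FASTQ file,
# assuming exactly four lines per record.
fastq_count() {
    awk 'END { print int(NR / 4) }' "$1"
}

# File names as used in the question; adjust to your own run.
for fq in SRR123456_1.fastq SRR123456_2.fastq; do
    if [ -f "$fq" ]; then
        printf '%s\t%s\n' "$fq" "$(fastq_count "$fq")"
    else
        printf '%s\tmissing\n' "$fq"
    fi
done
```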

updated 3.5 years ago by Ram 45k • written 10.8 years ago by Josh Herr 5.8k

Ram · Answer 1 · 2014-07-24

10.8 years ago

Chirag Nepal ★ 2.4k

I think the problem may be due to the varying length of 454 reads.

TopHat2 may map longer reads efficiently, but the number of mismatches it allows is fixed. 454 reads vary in length (50-500 nt in the dataset I worked with previously, for example), so allowing a constant 2 nt of mismatches may not be ideal, which might be one reason why so few of your reads mapped. Try other tools; I have used BLAT with good success.
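To see how much this matters for a given dataset, the sketch below summarises read lengths in a FASTQ file and shows what a fixed mismatch budget means at the shortest and longest read. For instance, 2 mismatches is 4% of a 50 nt read but only 0.4% of a 500 nt read. `read_length_summary` is a hypothetical helper, the default budget of 2 simply mirrors the figure above, and the code assumes four lines per FASTQ record.

```shell
# Hypothetical helper: summarise read lengths in a FASTQ file
# (four lines per record assumed) and express a fixed mismatch
# budget as a percentage of the shortest and longest read.
# Arguments: <fastq file> [mismatch budget, default 2]
read_length_summary() {
    awk -v budget="${2:-2}" '
        NR % 4 == 2 {                 # sequence line of each record
            len = length($0)
            if (len == 0) next        # skip empty reads
            if (n == 0 || len < min) min = len
            if (len > max) max = len
            sum += len
            n++
        }
        END {
            if (n == 0) { print "no reads"; exit }
            printf "reads=%d min=%d max=%d mean=%.1f\n", n, min, max, sum / n
            printf "%d mismatches = %.1f%% of shortest read, %.1f%% of longest read\n", budget, 100 * budget / min, 100 * budget / max
        }
    ' "$1"
}

# File name as used in the question; adjust to your own run.
if [ -f SRR123456_1.fastq ]; then
    read_length_summary SRR123456_1.fastq
fi
```

A wide gap between the two percentages suggests that a single fixed mismatch setting treats short and long reads very differently.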

written 10.8 years ago by Chirag Nepal ★ 2.4k

Dear Nepal,

Thanks for your suggestion. I have never used BLAT, and I would like to know whether it can be used to estimate FPKM values for my genes of interest. I have already estimated FPKM values from some Illumina RNA-Seq files, and I want to use the same measure of expression level throughout my manuscript.

updated 3.5 years ago by Ram 45k • written 10.8 years ago by Ginsea Chen ▴ 140