Question

Bismark duplicate ID error

9.8 years ago

bharata1803 ▴ 560

Hello,

I have paired-end data from an SRA file. I extracted it to a single FASTQ file using the SRA Toolkit and trimmed it with trim_galore. After that I ran the alignment with a simple bismark command:

bismark <ref genome> <file name>

I used Bowtie 1 to build the index for the human reference genome.

The problem is that after the run, bismark fails with an error stating that there are duplicate IDs. I understand why: my data is paired-end and I didn't split it into 2 files with sra-toolkit. Is there anything I can do, or do I need to redo all of my work? Thank you.
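
A quick way to confirm that diagnosis is to list the read IDs that occur more than once in the FASTQ file. This is only a minimal sketch: `dup_ids` is a hypothetical helper name, and it assumes plain 4-line FASTQ records with the read ID as the first whitespace-separated field of each header line.

```shell
# List read IDs that appear more than once in a FASTQ file.
# Assumes 4 lines per record (header, sequence, "+", qualities).
dup_ids() {
  # Line 1 of every 4-line record is the header; keep only the ID field.
  awk 'NR % 4 == 1 { print $1 }' "$1" | sort | uniq -d
}
```

If `dup_ids reads.fq` prints anything, both mates are most likely sitting in the same file under the same ID.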

RRBS Bismark • 2.3k views

ADD COMMENT • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by bharata1803 ▴ 560

Ram · Answer 1 · 2015-03-16

9.8 years ago

Devon Ryan 105k

You really need to split the reads into two files. Even the trimming with trim_galore is going to be problematic if you don't (the program will run, but you won't be left with a properly interleaved fastq file). Always use the --split-3 option with fastq-dump. Even if you think a dataset is paired-end, it won't hurt to do so.
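
After splitting (and again after trimming), it can be worth checking that the two files still pair up read for read. The following is a rough sketch rather than part of any of the tools mentioned: `fastq_ids` and `check_paired` are hypothetical helper names, and the code assumes 4-line FASTQ records whose mates share the same ID, optionally followed by /1 or /2.

```shell
# Print the read ID of every record, dropping a trailing /1 or /2.
fastq_ids() {
  awk 'NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id }' "$1"
}

# Return 0 only if both files hold the same read IDs in the same order.
check_paired() {
  a=$(mktemp) && b=$(mktemp) || return 2
  fastq_ids "$1" > "$a"
  fastq_ids "$2" > "$b"
  if cmp -s "$a" "$b"; then status=0; else status=1; fi
  rm -f "$a" "$b"
  return "$status"
}
```

For example, `check_paired <filename>_1.fq <filename>_2.fq && echo "mates in sync"` should print the message for a properly split pair and stay silent if reads were dropped from only one of the files.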

ADD COMMENT • link 9.8 years ago by Devon Ryan 105k

I have now tried it using only --split-files (before, I used --split-spot). The result is 2 FASTQ files, <filename>_1.fq and <filename>_2.fq. I trimmed them with trim_galore and ran this bismark command:

bismark <options> -1 <filename>_1.fq -2 <filename>_2.fq

But there is only 1 BAM file as the result, and it is named after the first fastq file. Is that correct?

Thank you

ADD REPLY • link updated 2.6 years ago by Ram 44k • written 9.8 years ago by bharata1803 ▴ 560

That's how it should be. Paired end reads originate from 2 fastq files and are written together to a single BAM file.

ADD REPLY • link 9.8 years ago by Devon Ryan 105k

OK, thank you. It seems to work now. I found that the mapping efficiency is only 13.9%, which is really low. Do you think that's normal, or is something else wrong too? [[Sorry, I just read your answer on seqanswers, thank you once again]]

ADD REPLY • link 9.8 years ago by bharata1803 ▴ 560

I replied on seqanswers. Let's just keep the conversation there, since it's too confusing to go back and forth.

ADD REPLY • link 9.8 years ago by Devon Ryan 105k