Question

BAM file with unmapped reads from a genome other than the reference

0

7.3 years ago

Vca80553 • 0

Hello everyone,

I started with bioinformatics two weeks ago, so my question may be too basic, but I could not find an answer in other threads. I would really appreciate your help.

This is the case:

I mapped my paired-end reads to my viral reference genome with NextGenMap and kept only the pairs in which both mates mapped. Now I want to filter out the reads that also map to human DNA (my contaminant). To do that, I took "viral_mapped_paired_end_reads.bam", converted it to FASTQ (R1 and R2), and mapped it to hg19. From the resulting BAM file, I extracted the unmapped pairs. So I assume I now have the paired-end reads that mapped to the virus before but do not map to human. This BAM file contains the reads I want, but they are recorded as unmapped against the human reference genome, with no information about the viral reference genome.

Now, how do I continue? For example, I want to do coverage analysis for the viral genome. Can I use the unmapped BAM file, or do I need to use viral_mapped_paired_end_reads.bam and filter out the reads that mapped to human? If so, is that done by extracting read IDs, or do the IDs change depending on the reference genome?

Thanks a lot

bam unmapped no reference genome • 2.1k views

score 2 · Answer 1 · 2017-09-16

7.3 years ago

mastal511 ★ 2.1k

Map your reads to hg19 first, remove the reads that map, then align the unmapped reads to the viral genome.
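To make the "remove the reads that map" step concrete, here is a minimal sketch in plain Python, assuming the hg19 alignment is available as SAM text. It keeps only pairs in which both mates are unmapped (flag bits 0x4 and 0x8 both set, the same selection that `samtools view -f 12` makes) and writes them back out as FASTQ records ready to align to the viral genome. The function name `unmapped_pairs_to_fastq` and the toy records are hypothetical, made up for illustration. Read names (QNAME) are copied from the FASTQ headers, so they should stay the same whichever reference you map against, although some aligners trim a trailing /1 or /2.

```python
# Sketch only: assumes uncompressed SAM text lines; a real BAM would first
# need converting (for example with samtools) or reading with a BAM library.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Reverse-complement a nucleotide sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def unmapped_pairs_to_fastq(sam_lines):
    """Return (R1, R2) FASTQ text for pairs where both mates are unmapped."""
    r1, r2 = {}, {}
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & (0x100 | 0x800):        # skip secondary/supplementary records
            continue
        if flag & 0xC != 0xC:             # need read unmapped AND mate unmapped
            continue
        if flag & 0x10:                   # restore original read orientation
            seq, qual = revcomp(seq), qual[::-1]
        record = f"@{name}\n{seq}\n+\n{qual}\n"
        if flag & 0x40:                   # first in pair
            r1[name] = record
        elif flag & 0x80:                 # second in pair
            r2[name] = record
    # Keep only names seen for both mates, in the same order in both files.
    paired = [n for n in r1 if n in r2]
    return "".join(r1[n] for n in paired), "".join(r2[n] for n in paired)


# Toy alignment against the human reference (made-up records).
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "human1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",    # properly mapped pair
    "human1\t147\tchr1\t200\t60\t4M\t=\t100\t-104\tTTGA\tIIII",
    "viral1\t77\t*\t0\t0\t*\t*\t0\t0\tGGCA\tIIII",               # both mates unmapped
    "viral1\t141\t*\t0\t0\t*\t*\t0\t0\tTTAC\tFFFF",
    "half1\t73\tchr2\t50\t30\t4M\t=\t50\t0\tCCCC\tIIII",         # only one mate mapped
    "half1\t133\tchr2\t50\t0\t*\t=\t50\t0\tGGGG\tIIII",
]

fq1, fq2 = unmapped_pairs_to_fastq(toy_sam)
```

Only the `viral1` pair survives here. The `half1` pair is dropped because one mate hit the human reference. Whether to discard such half-mapped pairs is a design choice, and this sketch takes the stricter option.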

0

Thanks a lot! I thought it would be faster the other way around, as the viral genome is less than 10,000 bp. I will do it as you say.

7.3 years ago by Vca80553 • 0