How to do a local de novo assembly, including unmapped paired reads, for many samples to genotype a large insertion?
7.3 years ago
William ★ 5.3k

I have a large set of BAM files, and for each one I would like to know whether an insertion of a few hundred bp is present at a certain locus.

The insertion is not picked up by small-variant callers such as freebayes or GATK, nor by structural-variant callers such as lumpy or manta, during whole-genome variant calling.

It should be possible to use the aligned reads and their unmapped mates to do a full local assembly of the region that includes the insertion.

What tool or pipeline can I best use for this?

I guess I need to use both the bam files and the original fastq files, since I need the unclipped, unsplit reads for the local assembly?
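To make the read selection concrete, here is a minimal sketch of picking the read pairs that would go into a local assembly. It works on SAM text lines (for example the output of `samtools view`) and relies on the SAM convention that an unmapped read with a mapped mate is given the mate's RNAME and POS. The function name `select_reads_for_assembly` and the coordinate window are assumptions made for illustration, not part of any existing tool.

```python
def select_reads_for_assembly(sam_lines, chrom, start, end):
    """Return (names, names_with_unmapped_end) for pairs placed in chrom:start-end.

    Coordinates are 1-based and inclusive. Only the leftmost position (POS) is
    tested, so pad the window if reads starting upstream should be included.
    """
    names = set()
    with_unmapped_end = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        rname, pos = fields[2], int(fields[3])
        if rname != chrom or not (start <= pos <= end):
            continue
        names.add(name)
        # 0x4: this read is unmapped; 0x8: its mate is unmapped.
        if flag & 0x4 or flag & 0x8:
            with_unmapped_end.add(name)
    return names, with_unmapped_end
```

The returned names can then be used to pull the full, unclipped read pairs from the original FASTQ files before handing them to an assembler.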

local assembly • 1.5k views

Take a look at ABRA/ABRA2.

7.3 years ago

If you know the exact sequence and location of the insertion, and you just need to confirm presence/absence, why not 'grep' the FASTQs for the genome/insert junction sequences? Or am I missing something?
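A plain `grep` only matches one strand, so a small script may be more convenient. The following is a rough sketch that counts reads containing each junction sequence on either strand. It assumes standard 4-line FASTQ records (no wrapped sequences), and the names `reverse_complement` and `count_junction_reads` are made up for this example.

```python
def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_junction_reads(fastq_lines, junctions):
    """Count reads that contain each junction sequence on either strand."""
    counts = {j: 0 for j in junctions}
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the 2nd line of each 4-line record
            continue
        read = line.strip().upper()
        for j in junctions:
            if j in read or reverse_complement(j) in read:
                counts[j] += 1
    return counts
```

Counting the intact reference junction (the sequence spanning the site without the insert) alongside the two insert junctions gives a crude idea of whether a sample carries the insertion on one or both alleles. Only exact matches are counted, so sequencing errors inside the junction will cause some reads to be missed.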

7.3 years ago
h.mon 35k

If all you are interested in is one position at one locus, viewing the BAMs in IGV or another genome browser should settle the issue. If your reference genome does not contain the insertion, look for reads that are soft-clipped at the expected position of the insert.
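With many samples, eyeballing every BAM gets tedious, so the soft-clip check can be roughly automated. The sketch below tallies the reference positions at which soft clips begin or end, taking SAM text lines as input (for example from `samtools view` restricted to the region). The function name `soft_clip_breakpoints` is hypothetical, and the approach is only an approximation of what one would judge visually in a browser.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_breakpoints(sam_lines, chrom):
    """Count 1-based reference positions adjacent to soft-clipped read ends."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or rname != chrom or cigar == "*":
            continue
        ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
        inner = [o for o in ops if o[1] != "H"]  # ignore hard clips
        if not inner:
            continue
        if inner[0][1] == "S":
            counts[pos] += 1  # clip on the left: first aligned base is POS
        if inner[-1][1] == "S":
            # Reference bases consumed by M, D, N, = and X operations.
            ref_len = sum(n for n, op in ops if op in "MDN=X")
            counts[pos + ref_len - 1] += 1  # clip on the right: last aligned base
    return counts
```

A pile-up of right-clipped reads ending at one position and left-clipped reads starting at the next position is the pattern expected around an insertion that is absent from the reference.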

What kind of sequencing did you perform (RNA-seq, DNA-seq, etc.)? Which mapper did you use? Subread claims to be able to identify short indels (up to 200 bp) at the alignment step: use -I with subread-align.

