Question

How to search MiSeq data for large mutations

9.8 years ago

Rößti ▴ 40

Hi,

I was wondering if anyone could help me. I am currently working on C. difficile and have obtained a luxS mutant that does not always behave as one would expect. I have therefore sequenced it together with the wild-type strain to see what is going on, and this is about as far as I have got. I used breseq, bowtie2, and samtools to check my sequences against the published sequence data on NCBI. However, since the luxS gene was knocked out through the insertion of an ~600 bp insert, I have been told that this method of analysis may not be able to determine whether the insert is present or where it is.

If anyone has any ideas on how I would go about determining whether this insert is present within the luxS gene, I would be most grateful.

next-gen-sequencing genome • 1.5k views

Ram · Accepted Answer · 2015-02-24

I would do this:

If you know the sequence of the insert, concatenate it to the reference.
Map the reads to the modified reference, with an aligner that produces global alignments with high sensitivity (like BBMap), or that produces chimeric alignments (like bwa-mem).
Look at the gene in IGV. If there is a long insertion event, you will see some place in it where the reads hit a junction. With global alignments, they will look error-free up to some point then suddenly all the bases will not match the reference. That point is where the insertion occurred, and the part of the read that does not match the reference should match the insert.
If your reads are paired, you should see improper pairs such that one read mapped to the gene of interest, and the other read mapped to the insert (which is why it was added to the reference).
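
To make the steps above more concrete, here is a minimal Python sketch, not a definitive pipeline. It assumes the alignments are available as plain SAM text (the standard tab-separated columns QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, ...), that the insert was added to the reference as a contig named "insert", and that a clip of at least 10 bases is worth counting. The function names, the contig name, and the threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Splits a CIGAR string such as "50M50S" into (length, operation) pairs.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def append_insert(reference_fasta, insert_seq, insert_name="insert"):
    """Return FASTA text with the insert added as an extra contig.

    `insert_name` is an assumed label, not something the aligner requires.
    """
    if not reference_fasta.endswith("\n"):
        reference_fasta += "\n"
    return f"{reference_fasta}>{insert_name}\n{insert_seq}\n"


def clip_junctions(sam_lines, min_clip=10):
    """Count reference positions where read alignments end in a clip.

    A pile-up of clips at one position suggests a junction. `min_clip`
    is an arbitrary threshold to ignore short clipped ends.
    """
    junctions = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        rname, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":  # unmapped read
            continue
        ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
        # Clip at the start: the first aligned base sits at POS.
        if ops[0][1] in "SH" and ops[0][0] >= min_clip:
            junctions[(rname, pos)] += 1
        # Clip at the end: report the last aligned reference base.
        if ops[-1][1] in "SH" and ops[-1][0] >= min_clip:
            ref_len = sum(n for n, op in ops if op in "MDN=X")
            junctions[(rname, pos + ref_len - 1)] += 1
    return junctions


def pairs_spanning_insert(sam_lines, insert_name="insert"):
    """Names of reads mapped outside the insert whose mate maps to it."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        rname, rnext = fields[2], fields[6]
        if rname not in ("*", insert_name) and rnext == insert_name:
            names.add(fields[0])
    return names


if __name__ == "__main__":
    # Made-up SAM records, for demonstration only.
    demo = [
        "@SQ\tSN:chr\tLN:1000",
        "r1\t99\tchr\t100\t60\t50M50S\tinsert\t1\t0\t*\t*",
        "r2\t99\tchr\t120\t60\t30M70S\t=\t300\t0\t*\t*",
        "r3\t0\tchr\t150\t60\t40S60M\t=\t400\t0\t*\t*",
    ]
    for (contig, position), count in sorted(clip_junctions(demo).items()):
        print(contig, position, count)
    print(sorted(pairs_spanning_insert(demo)))
```

The SAM lines could come from the text output of `samtools view` on the BAM file produced by the mapping step. Positions with many clipped reads, together with read pairs returned by `pairs_spanning_insert`, would be the places to inspect in IGV.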