large insertion/deletion detection from NGS data
9.3 years ago
illinois.ks ▴ 210

Dear all,

I am struggling to identify large insertions/deletions in addition to ordinary variant calls. I used GATK for SNP calling and noticed that, besides SNPs, it reports insertions/deletions of up to about 50 bp(?).

However, I am wondering whether I can also detect large insertions/deletions from paired-end NGS data.

I know that insertions are much more difficult to detect, and in particular that insertions longer than the distance between the paired-end reads cannot be detected at all. Can I still detect large deletions, then?

Is there a size limit for the indels we can detect? In my case, the fragment size ranges from 52 bp to 546 bp, with a median of 216 bp. How long an indel can I detect with this library, and which tool should I use?

KS

indel deletion gatk insertion • 8.2k views
9.3 years ago
Dan D 7.4k

BreakDancer is the first tool that comes to mind, though its documentation is shoddy.

I recommend running multiple tools in parallel. A good second choice in my opinion is Scalpel, which uses a microassembly approach.
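For intuition about the read-pair signal that paired-end callers look for, here is a minimal Python sketch. It is not BreakDancer's or Scalpel's actual algorithm; the function names, the made-up spans, and the cutoff `k` are all assumptions for illustration. The idea is that a pair spanning a deletion maps farther apart on the reference than the library's typical fragment size, while a pair spanning an insertion maps closer together.

```python
from statistics import median


def mad(values):
    """Median absolute deviation, a robust estimate of spread."""
    values = list(values)
    m = median(values)
    return median(abs(v - m) for v in values)


def classify_pair(span, lib_median, lib_mad, k=5.0):
    """Label one read pair by its mapped outer span on the reference.

    k is a hypothetical cutoff in MAD units, not a default of any tool.
    """
    if span > lib_median + k * lib_mad:
        return "deletion-like"   # pair maps too far apart
    if span < lib_median - k * lib_mad:
        return "insertion-like"  # pair maps too close together
    return "concordant"


# Made-up mapped spans (bp) for illustration only.
spans = [180, 200, 210, 216, 220, 230, 250, 1216]
lib_median = median(spans)
lib_mad = mad(spans)
labels = [classify_pair(s, lib_median, lib_mad) for s in spans]
print(list(zip(spans, labels)))
```

Only the last pair is flagged here, and its excess span over the library median hints at the deletion size. Because the span of a pair covering an insertion shrinks by the insertion length, an insertion longer than the fragment cannot be bridged by a single pair, which matches the limit mentioned in the question.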

9.3 years ago

There are many good options for structural variation detection using paired end data. Some tools that are maintained, easy to use, and have relatively good accuracy:

9.3 years ago

BBMap, with ~216 bp reads, can detect insertions up to ~60 bp and deletions up to 32,000 bp at default settings. With non-default settings (increasing the maxindel flag and reducing the minidentity flag) it can detect much longer deletions, of hundreds of thousands of bases, and longer insertions as well, though always shorter than read length: perhaps up to 60% of read length if you set minidentity to its minimum. This is based purely on mapping, without realignment or reassembly. I do not recommend realigning BBMap's output during variant calling; calling directly from consensus should be more accurate. If you want insertions longer than a fraction of read length, though, some kind of assembly is required.
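As a rough illustration of the non-default settings, the Python sketch below only assembles a BBMap command line; it does not run anything. The file names and the two values are placeholders, not recommendations, and `minid` is assumed here to be the short form of the minidentity flag, so check the usage text of `bbmap.sh` before relying on it.

```python
import shlex


def bbmap_command(reads, ref, out, maxindel=None, minid=None):
    """Build an argument list for bbmap.sh (hypothetical helper)."""
    args = ["bbmap.sh", f"in={reads}", f"ref={ref}", f"out={out}"]
    if maxindel is not None:
        # Longest indel the mapper should search for, in bases.
        args.append(f"maxindel={maxindel}")
    if minid is not None:
        # Lower identity tolerates reads that contain longer insertions.
        args.append(f"minid={minid}")
    return args


# Placeholder inputs and illustrative values only.
cmd = bbmap_command("reads.fq", "ref.fa", "mapped.sam",
                    maxindel=200000, minid=0.6)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.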
