1000 Genomes Project and ESP for indel or frameshift
10.0 years ago
897598644 ▴ 100

Excuse me:

After calling indel and frameshift variants, I wanted to annotate them with the 1000 Genomes Project and ESP. Could I get the frequency of indels and frameshifts from these resources? It seemed that the answer was no, and I was confused.

Any help in understanding this issue would be much appreciated.

Many thanks in advance!

next-gen-sequencing • 2.6k views
10.0 years ago
Ram 44k

This should theoretically be possible. Indels are variants where len(REF) != len(ALT). If your position, REF and ALT match the position, REF and ALT from 1000g/ESP, I don't see why you can't use the AF from them.

I'm not 100% sure, but aren't frameshift variants indels too?
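To make the matching idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the function names, the `reference_afs` dictionary and the frequency in it are made up for the example and do not come from 1000g or ESP. It treats a variant as an indel when len(REF) != len(ALT), and as a frameshift when the length difference is not a multiple of 3 (which only makes sense for variants inside coding sequence).

```python
def is_indel(ref, alt):
    """An indel is any variant whose REF and ALT alleles differ in length."""
    return len(ref) != len(alt)


def is_frameshift(ref, alt):
    """Within coding sequence, an indel shifts the reading frame when the
    length difference is not a multiple of 3. This sketch does not check
    whether the variant is actually in a coding region."""
    return is_indel(ref, alt) and (len(ref) - len(alt)) % 3 != 0


def lookup_af(variant, reference_afs):
    """Return the allele frequency for an exact (chrom, pos, ref, alt) match,
    or None when the reference set has no record of this variant."""
    return reference_afs.get(variant)


# Hypothetical reference table; the coordinates and frequency are made up.
reference_afs = {("1", 1000, "A", "AT"): 0.002}

my_variant = ("1", 1000, "A", "AT")
print(is_indel("A", "AT"), is_frameshift("A", "AT"))
print(lookup_af(my_variant, reference_afs))
print(lookup_af(("1", 2000, "GCC", "G"), reference_afs))
```

An exact match like this only works when both files write the indel the same way (same anchor base, same left-alignment), so a missing frequency can mean "represented differently" as well as "not observed".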

No, but most of the indels and frameshifts had no annotation information from 1000g/ESP (just "."). Is that normal?

Could you give me an example?

The first round of 1000g didn't attempt to call indels. It was very low-coverage WGS, intended for medium-frequency SNPs. They simply didn't call indels on most of the subjects, because 2-5x coverage can't do it accurately. I don't know about later data releases; maybe they do have indel data now. Same for ESP: as indels are harder to call, they avoided them to keep the dataset clean.

I don't know a good dataset for indels, and will monitor this thread closely :).

I know the GATK best practices for DNA-seq include a variant recalibration step that uses some kind of indel reference set, so that's worth a look. It's called the Mills set, but I don't have a reference.

Yes! I have recalibrated using Mills. But I want to filter the variants by the frequency of indels and frameshifts in 1000g/ESP, and that now seems unreasonable. So how should I deal with this kind of variant?

10.0 years ago
Katie D'Aco ★ 1.1k

What about the ExAC browser? It seems to include indels (e.g., http://exac.broadinstitute.org/variant/17-41197801-T-TC), and the complete dataset is available for download as a VCF.
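If you download a VCF like this, pulling out per-allele frequencies could look roughly like the sketch below. It assumes the file carries an `AF` key in the INFO column with one value per ALT allele, as the VCF specification describes; check the header of the actual file for the real field names. The function name `parse_vcf_afs` and the example line are made up for illustration.

```python
def parse_vcf_afs(line):
    """Parse one tab-separated VCF data line and return a dict mapping
    (chrom, pos, ref, alt) to the allele frequency of each ALT allele.

    Assumes the INFO column has an AF key with one value per ALT allele.
    Alleles without a usable AF value are skipped.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]

    # INFO is a semicolon-separated list of KEY=VALUE pairs (or bare flags).
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value

    af_values = info["AF"].split(",") if "AF" in info else []

    result = {}
    for alt, af in zip(alts.split(","), af_values):
        if af not in ("", "."):
            result[(chrom, pos, ref, alt)] = float(af)
    return result


# Made-up example line with two ALT alleles (both insertions).
example = "17\t100\t.\tT\tTC,TCC\t.\tPASS\tAC=3,1;AN=1000;AF=0.003,0.001"
for variant, af in parse_vcf_afs(example).items():
    print(variant, af)
```

Keying on (chrom, pos, ref, alt) keeps each ALT allele of a multi-allelic site separate, which matters for indels because several different insertions or deletions often share one position.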
