Question

Genome-Wide Sequencing Depth of Coverage Relationship to Genomic Regional Variation

8.5 years ago

Youzhi • 0

I have data from several soybean genomes that were sequenced at the same depth of coverage (15x) using Illumina 150PE reads. We see regions of interest between closely related lines with unexpected variation in SNP density, heterozygosity, or homozygosity that is not captured using a 50K SNP-Chip. How can I ensure the variation in the region is not a sequencing coverage/depth artifact for that particular locus?

sequencing next-gen • 1.8k views

updated 8.5 years ago by Vitis ★ 2.6k • written 8.5 years ago by Youzhi • 0

score 0 · Answer 1 · 2016-12-08

If you have your SNP calls in a VCF file, look at AO and DP. AO is the number of reads supporting the alternate allele (a tag written by callers such as FreeBayes; other callers may report this differently, for example as AD), and DP is the total number of reads covering that locus. Comparing the two tells you whether you had enough depth in the regions of variability.
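As a rough illustration of that check, here is a minimal Python sketch that pulls DP and AO out of a single VCF record and reports the alternate-allele fraction. It assumes both tags are present in the INFO column (FreeBayes-style output); the function names, the `min_depth` cutoff of 8, and the example record are all made up for illustration and should be adapted to your own data.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=15;AO=7' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value
        elif entry:
            info[entry] = True  # flag-style entries carry no value
    return info


def site_support(vcf_line, min_depth=8):
    """Summarise read support for one VCF record.

    min_depth is an arbitrary illustrative cutoff, not a recommended value.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    info = parse_info(fields[7])  # INFO is the 8th VCF column
    depth = int(info["DP"])
    # AO holds one count per alternate allele, comma-separated
    alt_counts = [int(x) for x in info["AO"].split(",")]
    alt_fractions = [c / depth if depth else 0.0 for c in alt_counts]
    return {
        "chrom": chrom,
        "pos": pos,
        "depth": depth,
        "alt_counts": alt_counts,
        "alt_fractions": alt_fractions,
        "low_depth": depth < min_depth,
    }


# Made-up record for illustration only
example = "Chr01\t12345\t.\tA\tG\t50\t.\tDP=14;AO=6;RO=8"
print(site_support(example))
```

Sites flagged `low_depth`, or heterozygous calls whose alternate fraction is far from 0.5, are the ones worth a closer look before trusting the variation in that region.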

If you only have FASTQ files and want a quick GUI way of checking some of this, you could try IGV. It displays aligned reads (for example a BAM file), so you would need to map the reads to the reference first; it will then show you which regions have been sequenced and at what depth.

score 0 · Answer 2 · 2016-12-09

People familiar with plant genomes would probably check the region for repeats. Repetitive genome sequences such as transposons can cause all the variations you see, due to ambiguous read mappings. If this is not because of proliferated repetitive sequences, you have probably hit some gene families with very similar homologs. This is very normal for certain genes such as R-genes. Copy number variations in R-genes would generate the variations you see in the WGS results.
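One hedged way to screen for this is to compare local depth with the genome-wide typical depth: reads from collapsed repeats or extra gene copies tend to pile up, while deleted copies leave a gap. The Python sketch below assumes per-base depth as (chromosome, position, depth) rows, which is the three-column layout `samtools depth` writes. The function names, window size, and the 0.5x / 2x ratio cutoffs are illustrative assumptions rather than established thresholds.

```python
from collections import defaultdict
from statistics import median


def window_means(depth_rows, window=1000):
    """Average per-base depth in fixed-size windows.

    depth_rows: iterable of (chrom, 1-based position, depth) tuples.
    """
    sums = defaultdict(lambda: [0, 0])  # (chrom, window index) -> [total, count]
    for chrom, pos, depth in depth_rows:
        key = (chrom, (pos - 1) // window)
        sums[key][0] += depth
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


def flag_windows(means, low=0.5, high=2.0):
    """Return windows whose depth ratio to the median window is outside [low, high]."""
    typical = median(means.values())
    flagged = {}
    for key, mean_depth in means.items():
        ratio = mean_depth / typical if typical else 0.0
        if ratio < low or ratio > high:
            flagged[key] = round(ratio, 2)
    return flagged


# Toy data for illustration: ten positions, window of 2 bases
depths = [15, 15, 15, 15, 45, 45, 14, 14, 3, 3]
rows = [("Chr01", pos, d) for pos, d in zip(range(1, 11), depths)]
flagged = flag_windows(window_means(rows, window=2))
print(flagged)
```

Variant calls that fall in flagged windows deserve extra suspicion: unusual SNP density or heterozygosity there may reflect mis-mapped paralogous reads rather than real differences between lines.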