Allele frequency in gene sequence
4 months ago
Elizabeth ▴ 30

I have a gene sequence of about 500 bp, which I used as a reference to map a set of Illumina paired-end reads and extract the reads specific to this gene. My objective is to determine the allele frequency of a particular SNP (located at codon 87) among the extracted reads. In the first step, BWA identified 300 reads that mapped to my gene. I then trimmed the gene sequence to a 100 bp window containing the SNP and remapped the 300 reads against it to narrow down the candidates further. As a result, I now have 35 read pairs that map to the 100 bp sequence.

When I align these reads with a multiple sequence aligner such as MUSCLE, the output contains more gaps than aligned sequence, which is not the result I expected. Going through every read by eye doesn't make much sense. I'm sure there is a better way to do this.

Has anyone done something similar? Any insights would be helpful. Thanks.

Allele-frequency • 332 views
4 months ago

Hi, I assume you are calling variants against a 500 bp reference fragment. I would recommend:

  1. Map your reads to the reference with bwa mem and produce a coordinate-sorted BAM file.
  2. Pile up the mapped reads and get per-position stats with samtools mpileup; see the instructions at https://github.com/samtools/bcftools/wiki/HOWTOs#mpileup-calling. For each reference position you will see how many reads cover it and which alleles they carry (in the text pileup, . and , denote a match to the reference base on the forward and reverse strand, respectively).
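
To turn the pileup from step 2 into an allele frequency, something like the following Python sketch may help. It assumes the plain-text output of samtools mpileup (tab-separated columns: sequence name, 1-based position, reference base, depth, read bases, base qualities) and that your 500 bp reference starts at the first base of codon 1; if it does not, adjust the hypothetical cds_start argument. The function names are made up for illustration, and the parser ignores base qualities and simply skips indel annotations, so treat the numbers as a rough first look rather than a proper variant call.

```python
from collections import Counter


def codon_to_positions(codon_number, cds_start=1):
    """1-based reference positions covered by a codon.

    Assumes the coding sequence starts at `cds_start` and has no introns.
    """
    first = cds_start + 3 * (codon_number - 1)
    return (first, first + 1, first + 2)


def count_alleles(ref_base, bases):
    """Count alleles in the read-bases column of one text pileup line."""
    counts = Counter()
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Start of a read: '^' is followed by one mapping-quality character.
            i += 2
            continue
        if c == "$":
            # End of a read: carries no base information.
            i += 1
            continue
        if c in "+-":
            # Indel annotation: sign, length digits, then that many bases.
            j = i + 1
            while j < len(bases) and bases[j].isdigit():
                j += 1
            i = j + int(bases[i + 1:j])
            continue
        if c in ".,":
            # Match to the reference on the forward (.) or reverse (,) strand.
            counts[ref_base.upper()] += 1
        elif c.upper() in "ACGTN":
            # Mismatch; lowercase means reverse strand.
            counts[c.upper()] += 1
        elif c in "*#":
            # Placeholder for a base deleted in this read.
            counts["del"] += 1
        # '>' and '<' (reference skips) are ignored.
        i += 1
    return counts


def allele_frequencies(counts):
    """Convert allele counts to fractions of all counted reads."""
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


def parse_pileup_line(line):
    """Return (position, allele counts) for one tab-separated pileup line."""
    fields = line.rstrip("\n").split("\t")
    return int(fields[1]), count_alleles(fields[2], fields[4])
```

With this, codon_to_positions(87) gives the three reference positions to look up in the pileup, and allele_frequencies(counts) at the SNP position gives the fraction of reads supporting each base. With only 35 read pairs the estimate will be coarse, so it is worth checking the raw counts as well.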

Hope this helps
