Extract read position where a base is covered, and the allele at that base
1
0
Entering edit mode
2.1 years ago
Jautis ▴ 580

Hello, I have a mapped bam files (from bwa) and I want to extract which base on the read overlaps a specific base pair in the genome (ie base 4 of 55). I also want to extract the base is at that position.

I've made some progress on this in terms of obtaining reads which overlap the site, getting the read lengths, and getting the starting positions of the read. However, I'm having a hard time figuring out how to extract the specific base, and whether that is affected by whether the read is in the forward or reverse direction.

For an example, I'd like to extract position 96244549 from the following 2 reads. Any help would be appreciated!

MYRUN 16 chr5    96244500        37      100M    *       0       0       CAAGATTACTCAGCTAGTCAGTGGCAGAGCTAAGGTTTAAACTGAATGGTCCAGCTATAATGACCTACTTCTGTTATTGTTCTCTCTTTCATATTTGTTT    DDDDDIIIIIIIIIJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIIIIIIIII

MYRUN 16 chr5 96244531        37      62M     *       0       0       AAGATTTAAACTGAATGGCCCAGCTATAATGACCTACTTCTGTTATTGTTCTCTATTTCATA  JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ
samtools bam bwa • 494 views
ADD COMMENT
2
Entering edit mode
2.1 years ago

use sam2tsv : http://lindenb.github.io/jvarkit/Sam2Tsv.html

samtools view -O BAM chr5:96244549-96244549  in.bam |  java -jar dist/sam2tsv.jar -R ref.fa | grep -w 96244549
ADD COMMENT

Login before adding your answer.

Traffic: 1951 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6