How can I locate a specific nucleotide based on a bam file?
1
0
Entering edit mode
3.8 years ago
2689563392 • 0

Hello, now I need to get the nucleotide (from the reads, but not the reference genome) at a specific position, for example, chr4:33567, from a corresponding .bam file, and I hope the output will be a specific base, such as "A" or "T". Somebody said that samtools can do that, but I don't know how to.
So, I wanna know if it's possible and the command line to do that. Thank you!

alignment genome sequence • 777 views
ADD COMMENT
0
Entering edit mode

Try IGVtools count function with window size 1 and bases option. It would give counts per each base at that position.

ADD REPLY
0
Entering edit mode
3.8 years ago
tothepoint ▴ 940

This may work for you.

samtools view input.bam "chr4:33567" > output.bam

ADD COMMENT
1
Entering edit mode

I am pretty suer this will resut in a SAM file without header so for bam add -b option or for sam add -h, but the idea is correct I think.

ADD REPLY
0
Entering edit mode

Thank you, but the output is not what I want. Maybe I didn't clearly articulate my purpose, and the output I want is a specific base, such as "A" or "T", but not the reads that overlap the given position.

ADD REPLY
0
Entering edit mode

You can pipe the resulting bam from above command into: A: BAM/SAM to FASTA conversion

ADD REPLY
0
Entering edit mode

Or alternatively run the relevant intervals through a variant caller and then filter the VCF file for REF and ALT alleles at that specific spot of the genome. That has the advantage that you get information on how reliable observed changes are. Not all positions in a read are of equal quality, plus variant callers often have heuristics to determine variant quality based on a variety of metrics.

ADD REPLY

Login before adding your answer.

Traffic: 1468 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6