Hello all,
I want to extract sequence from a bam file,
suppose that I have a bam file "normal.bam",
I learned that I can use
samtools view -b normal.bam 2:129208172-129208196 > extract.bam
and then
samtools bam2fq extract.bam | seqtk seq -A > extract.fa
However, extract.fa contain the reads, not the sequence at the specific position. like in the following format
>ERR424937.5101065/1
CTCTGTCTCTGTCTCTCTCCCTCTCCCTTCTCCCCTCACCCTCTCCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGC
I want to index the extract.fa file and use the faxid command like the following
samtools faidx extract.fa
samtools faidx extract chr2:129208172-129208196
But as you can see, the extract.fa does not contain the position information, so I can't really do that.
Can anybody help?
Thanks, Gene
it's not clear to me.
Thanks Pierre, the position 129208172-129208196 has 25 bases, the reads fetched by samtools bam2fq is longer than that. So I think the sequence that I am interested should be a sub sequence of that?
I am a novice please correct me if I am wrong.
Thanks