Dear All,
I am working with BAM files for ChIP-seq analysis. For each read in the BAM file, I want to extract the chromosome name, the start position and the stop position, and write them to a new file. Is that possible?
Many thanks!
samtools view bamfile.bam|awk '{print $3 "\t" $4 "\t" $4+length($10)-1}' > newfile.tab
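will do the job. The stop position here is the start position plus the read length minus 1, which is the last aligned position only if the read aligns without indels or clipping.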
You could also try BEDTools, which can convert your .bam to a .bed; a BED record is essentially a chromosome, a start and a stop position, optionally followed by a name.
bedtools' bamToBed inspects the CIGAR string when computing the end coordinate, so deletions are properly handled. The awk example assumes that only substitutions can occur.
This approach has a minor issue: the length of the read sequence does not necessarily agree with the span of the alignment on the reference, e.g. when the read contains indels.
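To illustrate the point about the CIGAR string, here is a minimal sketch of a CIGAR-aware variant of the awk approach. The function name sam_to_intervals is made up for this example; it is not part of samtools or BEDTools. It assumes SAM text on stdin (as produced by samtools view), skips unmapped reads, and counts only the operations that consume reference bases (M, D, N, = and X) when computing the stop position. Coordinates stay 1-based and inclusive, like the one-liner above, whereas BED output from bamToBed is 0-based and half-open.

```shell
#!/bin/sh
# Hypothetical helper (not part of samtools or BEDTools): reads SAM records
# on stdin and prints chromosome, start and stop (1-based, inclusive),
# deriving the stop position from the CIGAR string in column 6.
sam_to_intervals() {
    awk 'BEGIN { OFS = "\t" }
    # Skip header lines.
    /^@/ { next }
    # Skip unmapped reads (FLAG bit 0x4 set) and records without a CIGAR.
    int($2 / 4) % 2 == 1 || $6 == "*" { next }
    {
        cigar = $6
        span = 0
        # Consume one length+operation token at a time, e.g. 5M or 2D.
        while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
            len = substr(cigar, 1, RLENGTH - 1) + 0
            op = substr(cigar, RLENGTH, 1)
            # Only M, D, N, = and X advance along the reference.
            if (op ~ /[MDN=X]/) span += len
            cigar = substr(cigar, RLENGTH + 1)
        }
        print $3, $4, $4 + span - 1
    }'
}
```

It would be used the same way as the one-liner, e.g. samtools view bamfile.bam | sam_to_intervals > newfile.tab. In practice bedtools bamtobed -i bamfile.bam is the better-tested route; the sketch only shows what "inspecting the CIGAR string" amounts to.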
Thanks to Farhat, that works fine for me! :)
True. bamToBed would be the proper tool to handle something like this.