Question

Samtools: Cutting Out a Region / Getting Information About a Region

11.9 years ago

diltsjeri ▴ 470

We have a reference sequence with 40 Ns located in the middle of it. After aligning the reads to this reference, what is the best way to view the contigs that overlap the 40-N region? How can I find out from a BAM file how many reads and contigs cover that region? I have sorted and indexed the BAM file, but I'm not sure where to go from here. If anybody has done something like this, your help would be appreciated.

Thanks.

samtools reference • 4.1k views

ADD COMMENT • link updated 11.9 years ago by Matt Shirley 10k • written 11.9 years ago by diltsjeri ▴ 470

score 2 · Answer 1 · 2012-12-17

11.9 years ago

swbarnes2 14k

I'm not sure what you mean by "contig" in this context.

You can always use samtools view to filter the .bam to just the desired region of the desired chromosome. But looking at that region with IGV is probably the simplest thing to do.

ADD COMMENT • link 11.9 years ago by swbarnes2 14k

If you have access to samtools I would also suggest samtools tview as a very simple, fast viewer.

ADD REPLY • link 11.9 years ago by Matt Shirley 10k

Thanks for your responses.

Maybe I'm confused. I thought a BAM file has contigs and reads? I want to know which unique ones cover the region of interest. I've used IGV and tview to see the alignment, but I'm interested in pipelining the data, so looking at it manually isn't the solution we are after. I also have access to SFF, FASTA, and FASTQ files.

ADD REPLY • link 11.9 years ago by diltsjeri ▴ 470

The SAM format (BAM is its binary form, compressed in gzip-style blocks that can be indexed by genomic coordinate) is really just a container for reads, quality scores, alignment information, and other optional flags and tags. No assembled contigs are stored in this format. The header of a SAM file does contain a sequence dictionary (the @SQ lines), which defines the contigs (reference sequences) that you have aligned your reads to.

ADD REPLY • link 11.9 years ago by Matt Shirley 10k
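
To make the point about the @SQ lines concrete, here is a minimal Python sketch that pulls the reference names and lengths out of a SAM header. The header text is a made-up example (the name ref_with_gap and the length 5000 are assumptions, not taken from the question); in practice it would be whatever samtools view -H prints for your file.

```python
def parse_sq_lines(header_text):
    """Return {reference_name: length} from the @SQ lines of a SAM header."""
    references = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields after the record type are TAG:VALUE pairs separated by tabs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        references[fields["SN"]] = int(fields["LN"])
    return references


# Hypothetical header for illustration only.
example_header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:ref_with_gap\tLN:5000\n"
)
print(parse_sq_lines(example_header))
```

These are the sequences the reads were aligned to, not contigs assembled from the reads.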

samtools view can whittle down your .bam file to just the reads that overlap a particular region, or it can take a list of regions in BED format. Reads with high mapping quality (MAPQ) should be uniquely placed, and samtools view can also filter by MAPQ.

ADD REPLY • link 11.9 years ago by swbarnes2 14k
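
For a pipeline, the same region and MAPQ filtering can be sketched in plain Python on SAM text (for example, the output of samtools view). This is only an illustration of the logic, not a replacement for samtools: the reference name, the gap coordinates 2481-2520, and the four reads are invented, and the function names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) a read occupies on the reference."""
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + length - 1


def reads_overlapping(sam_lines, ref_name, start, end, min_mapq=0):
    """Names of mapped reads overlapping ref_name:start-end with MAPQ >= min_mapq."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        f = line.rstrip("\n").split("\t")
        qname, flag, rname, pos, mapq, cigar = f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5]
        if flag & 0x4 or rname != ref_name or cigar == "*" or mapq < min_mapq:
            continue
        r_start, r_end = reference_span(pos, cigar)
        if r_start <= end and r_end >= start:
            names.append(qname)
    return names


# Invented records: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
example_sam = [
    "read1\t0\tref_with_gap\t2400\t60\t150M\t*\t0\t0\t*\t*",
    "read2\t0\tref_with_gap\t2300\t60\t100M\t*\t0\t0\t*\t*",
    "read3\t16\tref_with_gap\t2500\t3\t50M\t*\t0\t0\t*\t*",
    "read4\t0\tref_with_gap\t2450\t40\t20M30D50M\t*\t0\t0\t*\t*",
]
hits = reads_overlapping(example_sam, "ref_with_gap", 2481, 2520, min_mapq=20)
print(len(hits), hits)
```

Here read2 ends before the gap and read3 falls below the MAPQ cutoff, so only read1 and read4 are reported. Note that "overlapping" is weaker than "spanning": to require reads that cross the whole run of Ns, test r_start <= start and r_end >= end instead.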

score 1 · Answer 2 · 2012-12-18

If you want to determine the coverage for a specific region of your aligned reads, take a look at the coverageBed tool in bedtools.
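
As a rough picture of what a coverage calculation over a region involves, here is a small Python sketch that computes per-base depth from read intervals. It is not how coverageBed is implemented; the intervals and the region 2481-2520 are invented, and a deleted base inside a read is counted as covered here, which real tools may treat differently.

```python
def region_depth(read_spans, start, end):
    """Per-base depth over start..end (1-based, inclusive) from (start, end) read spans."""
    depth = [0] * (end - start + 1)
    for r_start, r_end in read_spans:
        # Clip each read to the region before counting.
        lo = max(r_start, start)
        hi = min(r_end, end)
        for p in range(lo, hi + 1):
            depth[p - start] += 1
    return depth


# Hypothetical reference intervals of three aligned reads.
spans = [(2400, 2549), (2450, 2549), (2500, 2549)]
depth = region_depth(spans, 2481, 2520)
spanning = sum(1 for s, e in spans if s <= 2481 and e >= 2520)
print(min(depth), max(depth), sum(depth) / len(depth), spanning)
```

The minimum depth tells you whether every base of the region is covered, and the count of spanning reads tells you how many reads cross it end to end.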

If you are interested in more specific information about the reads in your region of interest, take a look at the thread "Extract Reads From A Bam File That Fall Within A Given Region", which has many relevant answers.