Trying to create short test bam not working.
2
0
Entering edit mode
8.0 years ago
ariel.balter ▴ 260

I have a bam file sample.bam. I would like to create a smaller version for testing my pipeline. I tried this:

samtools view sample.bam > temp.sam
head -1000000 temp.sam > temp1e6.sam
samtools -bT temp1e6.sam > short.bam

Now short.bam looks like:

$ cat short.bam

Usage: samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...]

Options:
  -b       output BAM
  -C       output CRAM (requires -T)
  -1       use fast BAM compression (implies -b)
  -u       uncompressed BAM output (implies -b)
  -h       include header in SAM output
  -H       print SAM header only (no alignments)
  -c       print only the count of matching records
  -o FILE  output file name [stdout]
  -U FILE  output reads not selected by filters to FILE [null]
  -t FILE  FILE listing reference names and lengths (see long help) [null]
  -L FILE  only include reads overlapping this BED FILE [null]
  -r STR   only include reads in read group STR [null]
...
...
...
samtools bam sam • 2.0k views
ADD COMMENT
0
Entering edit mode

Didn't you miss a samtools view in your last command? Can you check if temp.sam and temp1e6.sam look okay? Why the -T flag?

ADD REPLY
0
Entering edit mode

It looks like you are getting the help function from the first samtools view command. Does the first file look OK?

cat temp.sam

You probably want to add the -h option on to keep the header as well.

ADD REPLY
0
Entering edit mode

Like WouterDeCoster says - looks like you are missing the "view" after samtools on the last command - although would of thought that would give you the samtools help, or an unrecognised -bT, rather than the samtools view help

ADD REPLY
0
Entering edit mode

I think the command he showed us here is different from the one he actually executed, hence the samtools view help info. (Although I'm a bit surprised it went to stdout)

ADD REPLY
2
Entering edit mode
8.0 years ago

in your example, last samtools command is wrong. it should be:

samtools view -bS temp1e6.sam > short.bam

you could also concatenate all commands and save time and disk space:

samtools view -h sample.bam | head -1000000 | samtools view -bS - > short.bam

I would use samtools native slicing rather than selecting first x lines:

samtools view -b sample.bam chr21:1000000-2000000 > short.bam

although you can also use the subsampling option -s, for instance, to get 10% of the reads:

samtools view -s 0.10 sample.bam > short.bam
ADD COMMENT
1
Entering edit mode
8.0 years ago
Ron ★ 1.2k

You can probably subset by :

samtools view -b reads.bam chr1:10420000-10421000 > subset.bam

ADD COMMENT

Login before adding your answer.

Traffic: 1920 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6