How To Sort Sam File
10.8 years ago
Chen Sun ★ 1.1k

I want to process the SAM file created by BWA. It would be easier if the file were sorted by coordinate on the reference.

How can I sort my sam file?

Do I need to first convert the SAM file to BAM, sort the BAM file with samtools sort, and then convert it back to SAM?

Are there any tools that can do this, or should I write one myself?

samtools sam • 46k views
10.8 years ago

Use a Unix pipeline, and run your downstream analysis on the convenient BAM format.

$ bwa mem  ref.fasta f1.fq.gz f2.fq.gz | samtools view  -Sb - | samtools sort - sorted && samtools index sorted.bam

2018: with a recent samtools, the syntax has changed and you no longer need 'samtools view'.
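A sketch of what the 2018 note describes, assuming samtools 1.3 or later, where samtools sort reads SAM directly from the pipe and takes the output name via -o. The function name align_and_sort and its argument order are made up for illustration, and bwa and samtools are assumed to be on PATH.

```shell
# Hypothetical wrapper: align paired reads, coordinate-sort, and index.
# Assumes samtools >= 1.3 (sort accepts SAM on stdin and the -o option).
align_and_sort() {
    ref=$1
    fq1=$2
    fq2=$3
    out_bam=$4
    # bwa mem writes SAM to stdout; samtools sort reads it from "-"
    # and writes a coordinate-sorted BAM to $out_bam.
    bwa mem "$ref" "$fq1" "$fq2" \
        | samtools sort -o "$out_bam" - \
        && samtools index "$out_bam"
}
```

With the file names from the answer above, the call would be: align_and_sort ref.fasta f1.fq.gz f2.fq.gz sorted.bam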


Hi,

I tried the pipeline below, based on the suggestion above.

$BWAPATH/bwa mem $REFPATH/hg19 $DATPATH/R1.fastq.gz $DATPATH/R2.fastq.gz | $SAMPATH/samtools view -Sb - | $SAMPATH/samtools sort - sorted && $SAMPATH/samtools index - > $DATPATH/sorted.bam

The subsequent step fails with the error message "sorted.bam is not coordinate sorted". To my surprise, although the pipeline completed successfully, it created an empty file.

Could you please help me understand what might be the reason?

Edit:

This has been posted as a new thread, so that it does not hijack this one or go unnoticed:

samtools error when using pipeline view/sort/index
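
A likely explanation for the empty file, offered as a reading of the command rather than a confirmed diagnosis: the shell sets up the `> $DATPATH/sorted.bam` redirect before samtools index starts, so that file is created (or emptied) whatever the command then does, while the old-style `samtools sort - sorted` writes its sorted.bam into the current directory instead. The effect of the redirect can be seen with a stand-in command; `true` and the temporary directory here are placeholders, not part of the original pipeline.

```shell
workdir=$(mktemp -d)
printf 'pretend BAM data\n' > "$workdir/sorted.bam"

# The shell opens (and empties) the redirect target before the command runs.
# "true" stands in for any command that prints nothing to stdout.
true > "$workdir/sorted.bam"

# Print the size of the file in bytes after the redirect.
wc -c < "$workdir/sorted.bam"
```

If this reading is right, dropping the redirect and passing the BAM file name to samtools index, as in the original answer, should avoid the empty file.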

8.8 years ago
DVA ▴ 630

If I understand your question correctly, you can do it using samtools sort with the -O option:

samtools sort -O sam -T sample.sort -o sample.sort.sam sample.sam

where: -O specifies the output format; -T specifies a prefix, which I think is needed because samtools sort creates a few temporary files with this prefix; -o is your output file name; and sample.sam is the original input.

This way, you sort the SAM file and write SAM output directly, in one command.

Hope it helps.
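
One way to check the result, as a sketch: samtools sort is expected to record the sort order in the SO field of the @HD header line, so reading that field back shows whether the file is marked as coordinate sorted. The helper name sam_sort_order is made up for illustration.

```shell
# Hypothetical helper: print the SO (sort order) value from the @HD header
# line of a SAM file, or "unknown" if no such field is present.
sam_sort_order() {
    so=$(awk -F '\t' '
        /^@HD/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); exit }
        }' "$1")
    echo "${so:-unknown}"
}
```

For example, sam_sort_order sample.sort.sam should print coordinate after the command above. For a BAM file, the header would first have to be extracted with samtools view -H.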


sort: invalid option -- 'O'
sort: invalid option -- 'T'

This is the error I get if I try this command. I don't see the -O and -T flags in the SamTools man pages. Where did you get these flags from?


Did you by chance run sort -O sam -T sample.sort -o sample.sort.sam sample.sam instead of samtools sort -O sam -T sample.sort -o sample.sort.sam sample.sam? The error you get doesn't look like a samtools error; it looks like it comes from the Unix sort command.


I don't see -O and -T flags in the man pages

http://www.htslib.org/doc/samtools.html

-O FORMAT

    Write the final output as sam, bam, or cram.

    By default, samtools tries to select a format based on the -o filename extension; if output is to standard output or no format can be deduced, bam is selected. 
-T PREFIX

    Write temporary files to PREFIX.nnnn.bam, or if the specified PREFIX is an existing directory, to PREFIX/samtools.mmm.mmm.tmp.nnnn.bam, where mmm is unique to this invocation of the sort command. 
10.8 years ago

You can use Picard.

Picard's SortSam tool takes a SAM-formatted file as input and writes a coordinate-sorted BAM file as output. See this: http://picard.sourceforge.net/command-line-overview.shtml#SortSam

Usage looks like this:

java -Xmx40g -jar /home/xyz/picard-tools-1.78/SortSam.jar VALIDATION_STRINGENCY=LENIENT MAX_RECORDS_IN_RAM=7500000 TMP_DIR=/scratch/ INPUT=/home/xyz/Some.sam OUTPUT=/home/xyz/some.bam SORT_ORDER=coordinate
10.8 years ago

Samtools has a built-in sort command. If you want to know more about a command or program, running it with --help or -h usually works (for example, x.sh --help or x.sh -h). For many shell commands, typing 'man' before the command name opens its manual.

9.6 years ago
gganebnyi ▴ 10

Try this visual tool to explore the SamSort options and get the command you need: http://bash.works/#tools/555c90e849e7235135cbb314/show

9.6 years ago
Samad ▴ 90

Hi,

I had the same question about sorting SAM files. Here is another solution, using Unix commands:

grep -v '^[[:space:]]*@' test.sam | sort -k3,3 -k4,4n > test.sorted.sam

(source: methylKit R package)
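
Note that the command above drops the header lines (those starting with '@'), which many downstream tools expect. A sketch of a variant that keeps them, assuming a tab-separated SAM file; the function name sort_sam_keep_header is made up for illustration. Reference names are sorted alphabetically here, which may not match the @SQ order in the header, so tools that check for coordinate order may still reject the result.

```shell
# Hypothetical helper: sort SAM records by reference name (column 3), then by
# position (column 4, numeric), keeping the '@' header lines at the top.
# The output file must be different from the input file.
sort_sam_keep_header() {
    in_sam=$1
    out_sam=$2
    tab=$(printf '\t')
    {
        # Header lines first, in their original order.
        grep '^@' "$in_sam"
        # Then the alignment records, sorted on the tab-separated columns.
        grep -v '^@' "$in_sam" | LC_ALL=C sort -t "$tab" -k3,3 -k4,4n
    } > "$out_sam"
}
```

Usage would be: sort_sam_keep_header test.sam test.sorted.sam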
