Coordinate-sorted versus name-sorted BAM file
5.8 years ago
bnayer26 • 0

I am new to RNA-seq analysis, and I am at the stage of sorting my BAM file after alignment. I need to use the sorted BAM file as input for HTSeq counting. According to the samtools documentation, I can sort either by name or by coordinate. Could someone please explain how to decide between the two? I read that the default is by coordinate, and that a coordinate-sorted BAM file works as input for the "samtools index" command as well as for HTSeq counting. So it seems that a coordinate-sorted file is the best choice. However, is there any case where a name-sorted file is better? What does one need to consider? Thank you for your help.

samtools • 3.6k views
5.8 years ago

However, is there any case where a name sorted file is better?

When one needs to quickly retrieve a set of paired reads by their name.

e.g.: Extract PE Reads (with their mates) supporting variants in vcf file

Hi, Suppose I have a bam file and a vcf file containing variant calling result. I want to extract only reads with their mate that support variant allele in the vcf. It would be nice to get those reads in bam format.

e.g.: How to retain SEQ and QUAL fields in BAM file after filtering?

I would like to retain the actual SEQ and QUAL fields in the SAM/BAM file after it has been filtered.

(...)
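To make the difference concrete, here is a small toy sketch in plain Python (no BAM library involved) of what the two sort orders do to paired reads. The `Read` tuple, the read names, and the helper functions are all made up for illustration, and the name sort here is plain lexicographic, which is a simplification of what `samtools sort -n` does. The point is only that a name sort puts mates next to each other, while a coordinate sort can separate them.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, stripped-down stand-in for one alignment record."""
    name: str   # read (fragment) name, shared by both mates
    chrom: str  # reference sequence name
    pos: int    # leftmost mapping position


# Order of the reference sequences, as it would appear in the header (assumed).
CHROM_ORDER = {"chr1": 0, "chr2": 1}

# Two made-up read pairs in arbitrary input order.
reads = [
    Read("frag2", "chr1", 500),
    Read("frag1", "chr1", 100),
    Read("frag2", "chr1", 150),
    Read("frag1", "chr2", 40),
]


def coordinate_sorted(records):
    """Sort by reference order, then by position (coordinate-style order)."""
    return sorted(records, key=lambda r: (CHROM_ORDER[r.chrom], r.pos))


def name_sorted(records):
    """Sort by read name only (simplified name-style order)."""
    return sorted(records, key=lambda r: r.name)


def mates_adjacent(records):
    """Return True if all records sharing a name form one contiguous run."""
    seen = set()
    previous = None
    for r in records:
        if r.name != previous:
            if r.name in seen:
                return False  # this name appeared earlier, then was interrupted
            seen.add(r.name)
        previous = r.name
    return True


by_coord = coordinate_sorted(reads)
by_name = name_sorted(reads)

print([r.name for r in by_coord], mates_adjacent(by_coord))
print([r.name for r in by_name], mates_adjacent(by_name))
```

With real files, the corresponding commands are `samtools sort` (coordinate order, which `samtools index` requires) and `samtools sort -n` (name order). For counting, `htseq-count` has an `-r`/`--order` option (`name` or `pos`), so it should be told which order the input file is in.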


Oh okay, I see. Thanks so much for the examples!


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.


Oh okay, thanks for letting me know!
