Do the Sequences in the SAM/BAM Header Have to Match the Sequences in the Reference FASTA File?
11.7 years ago

Do the sequences in the SAM/BAM header have to match the number and the order of the sequences in the reference FASTA file?

For example, if I have the following in my SAM header (the file only contains reads for chr11):

@HD    VN:1.4    SO:coordinate
@SQ    SN:chr11    LN:135006516
@RG    ID:mySample    SM:mySample
@PG    ID:bozo    PN:bozo    CL:commandLine

And my reference is the standard hg19 reference with all the sequences.

Can I expect my SAM file to work with all the standard tools?

sam • 4.8k views

It's important that the chromosome names in the BAM and the reference genome match, as ashutoshmits said. That's why it's usually recommended to use the same reference that was used to generate your BAM file, whenever possible.

11.7 years ago

Well, it depends on the tool you are using, but I think your BAM file is NOT going to work with most of the standard tools. Tools like GATK are pretty particular about the BAM header. For example, the header should use the same chromosome names as the reference file (chr16 vs 16). It should also have entries for all the chromosomes in the reference genome. Some tools even require the chromosomes in the BAM and the reference file to be in the same order. I think any aligner will automatically include all the chromosomes from the reference genome. In your case it seems you aligned your sequences to chromosome 11 only. I would either edit the header to include @SQ lines (SN and LN tags) for all the chromosomes in hg19, or use a reference FASTA file that only contains chr11.
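To see which of these conditions a given file fails, something like the following may help. It is only a rough sketch: it assumes you already have the SAM header text (for example from `samtools view -H`) and the contents of the reference's `.fai` index as strings. The function names are made up for illustration, and this is not what GATK or any other tool actually runs internally.

```python
def parse_sq(header_text):
    """Return [(name, length), ...] from the @SQ lines of a SAM header."""
    seqs = []
    for line in header_text.splitlines():
        fields = line.split()  # tolerant of tabs or spaces between fields
        if not fields or fields[0] != "@SQ":
            continue
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        seqs.append((tags["SN"], int(tags["LN"])))
    return seqs


def parse_fai(fai_text):
    """Return [(name, length), ...] from the first two columns of a .fai index."""
    seqs = []
    for line in fai_text.splitlines():
        cols = line.split("\t")
        if len(cols) >= 2:
            seqs.append((cols[0], int(cols[1])))
    return seqs


def compare(header_seqs, ref_seqs):
    """Report the kinds of mismatch mentioned above: names, completeness, order."""
    header = dict(header_seqs)
    ref = dict(ref_seqs)
    return {
        # reference sequences with no @SQ line in the header
        "missing_from_header": [n for n, _ in ref_seqs if n not in header],
        # header sequences the reference does not know (e.g. chr16 vs 16)
        "missing_from_reference": [n for n, _ in header_seqs if n not in ref],
        # same name but different length, usually a different genome build
        "length_mismatch": [n for n, ln in header_seqs if n in ref and ref[n] != ln],
        # True only if names appear in exactly the same order in both
        "same_order": [n for n, _ in header_seqs] == [n for n, _ in ref_seqs],
    }


if __name__ == "__main__":
    header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr11\tLN:135006516\n"
    # Two-line stand-in for an hg19 .fai; offset columns are placeholders.
    fai = "chr1\t249250621\t0\t60\t61\nchr11\t135006516\t0\t60\t61\n"
    print(compare(parse_sq(header), parse_fai(fai)))
```

For the header in the question, a strict tool would trip on a non-empty `missing_from_header` list, while a lenient one would only care that `missing_from_reference` and `length_mismatch` are empty.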


Yes, GATK requires this. Samtools works as long as every sequence you want to process in the BAM is present in the FASTA. MAQ has a similar requirement to GATK, which I find annoying. That is why samtools comes with FASTA indexing (faidx).
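The more lenient condition (every sequence in the BAM header is present in the FASTA) can be sketched like this. It is an illustration only, not how samtools implements it: it reads the FASTA text directly instead of using a `.fai` index, and the function names are hypothetical.

```python
def fasta_lengths(fasta_text):
    """Return {sequence name: number of bases} for FASTA-formatted text."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The name is the first word after '>', the rest is description.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def header_covered_by_fasta(header_seqs, fasta_text):
    """Return a list of problems; empty means every header sequence is in the FASTA.

    header_seqs is a list of (SN, LN) pairs taken from the @SQ header lines.
    Extra sequences in the FASTA are not treated as a problem here.
    """
    ref = fasta_lengths(fasta_text)
    problems = []
    for name, length in header_seqs:
        if name not in ref:
            problems.append(f"{name}: not in FASTA")
        elif ref[name] != length:
            problems.append(f"{name}: header LN={length}, FASTA length={ref[name]}")
    return problems
```

Under this check, a chr11-only header against a full reference passes, since the extra reference sequences are simply ignored.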
