what should the STR in the bwa mem -R option contain?
1
0
Entering edit mode
4.6 years ago
wrab425 ▴ 50

I am aligning fastq files R1_1.fastq and R1_2.fastq to a small genome and using '@RG\tID:R1\tSM:R1' as the string in the -R option but I think this is causing the command to give a wrong answer. Perhaps you can suggest a reason.

Thanks

ChIP-Seq alignment • 2.3k views
ADD COMMENT
1
Entering edit mode

Read groups are not required unless you plan to use GATK to do any analysis. You can find more info about read groups here.

ADD REPLY
0
Entering edit mode

Hi, Following your advice I took out the -R field and removed the STR but then when I ran the command

bwa mem -B 40 -O 60 -E 10 -L 50 -M -t 10  Cen_22_before_combined.fasta R1.unmapped_ecoli_1.fastq R1.unmapped_ecoli_2.fastq | samtools view -b  > R1.bam

and then

 samtools view -c -F 260 R1.bam

to get the number of mapped reads I get:

[W::bam_hdr_read] EOF marker is absent. The input is probably truncated.

Any advice?

Thanks

ADD REPLY
0
Entering edit mode

Add -S to your samtools view command since your input is in SAM format.

ADD REPLY
0
Entering edit mode

What is you samtools version? samtools --version?

ADD REPLY
0
Entering edit mode

samtools version 1.3.1

ADD REPLY
0
Entering edit mode

Ok, this is ancient. I recommend upgrading. With the current version it would simple be:

(...) | samtools view -o R1.bam

There have been many bug fixes and performance optimizations since 1.3.1, so it is worth upgrading.

ADD REPLY
0
Entering edit mode

thanks but I do not have the authority to do that so I will add the -S option; I am confused though; my command is

samtools view -c -F 260 R1.bam

so I m asking it to count the number of mapped reads and am giving it a bam file so why does it need a -S option. I think the real problem is coming from bwa mem as I getting

W::bam_merge_core2] No @HD tag found.

at the end of the run.

My command here is

bwa mem -B 40 -O 60 -E 10 -L 50 -M -R '@RG\tID:R1\tSM:R1' -t 10  Cen_22_before_combined.fasta R1.unmapped_ecoli_1.fastq R1.unmapped_ecoli_2.fastq | samtools view -bS - > R1.bam

I am not at all sure why this is not running properly.

Thanks as always

ADD REPLY
0
Entering edit mode

so I m asking it to count the number of mapped reads and am giving it a bam file so why does it need a -S option.

That command does not need the -S option but the first time you make the BAM file does since for older version of samtools you tell it that the input is in SAM format.

ADD REPLY
0
Entering edit mode

thanks;

samtools view -bS - > R1.bam

I assumed that this was catching it.

The real problem seems to be the No @HD tag found; I have been trying a lot of different things but cannot see why this is happening.

Thanks again for your time.

ADD REPLY
0
Entering edit mode

It is puzzling why your file does not have @HD tag. What do you see by this:

samtools view -H R1.bam | head -3

Can you do the two steps separately?

bwa mem -B 40 -O 60 -E 10 -L 50 -M -R '@RG\tID:R1\tSM:R1' -t 10  Cen_22_before_combined.fasta R1.unmapped_ecoli_1.fastq R1.unmapped_ecoli_2.fastq > file.sam

samtools view -bS file.sam -o R1.bam

They try the command above on this file.

ADD REPLY
0
Entering edit mode

We have installed a more recent version of samtools (1.10) and I have run the pipeline in several folders and am getting completely consistent results which is good but when I look at the head of the bam file I cannot see the @HD tag although everything else is there;

$ samtools view -H R1.bam | head -10
@SQ     SN:Not_76_Chr_1_ref     LN:4655817
@SQ     SN:Not_76_Chr_3_ref     LN:2447592
@SQ     SN:Not_76_Chr_4_ref     LN:1713734
@SQ     SN:Not_76_Chr_2_ref_22_before_50p213    LN:3954797
@RG     ID:R1   SM:R1
@PG     ID:bwa  PN:bwa  VN:0.7.15-r1142-dirty   CL:bwa mem -B 40 -O 60 -E 10 -L 50 -M -R @RG\tID:R1\tSM:R1 -t 10 Cen_22_before_combined.fasta R1.unmapped_ecoli_1.fastq R1.unmapped_ecoli_2.fastq
@PG     ID:samtools     PN:samtools     PP:bwa  VN:1.10 CL:samtools view -bS -o R1.bam R1.sam
@PG     ID:samtools.1   PN:samtools     PP:samtools     VN:1.10 CL:samtools view -H R1.bam
[

My main worry is that the missing @HD head suggests that something is wrong and the output is unreliable but am unsure.

I think we are nearly there.

Thanks,

William

ADD REPLY
0
Entering edit mode

The HD header is optional metadata.

As the SAM spec states

@HD File-level metadata. Optional. 
If present, there must be only one @HD line and it must be the first line of the file

https://samtools.github.io/hts-specs/SAMv1.pdf

Thus it is quite conceivable that some aligners do not include the HD tag and that does not mean that the output is incomplete.

ADD REPLY
0
Entering edit mode

Thanks that was helpful

ADD REPLY
0
Entering edit mode
4.6 years ago

Stop chaining commands. When one of them is incorrect you can't quite tell which one causes the problem. Especially if you are new at this each error is more puzzling than the next.

Instead, break all steps into individual stages:

bwa mem this-that > file1.sam

samtools view -b file1.sam > file1.bam

samtools sort ....

and so on. At each step investigate the output that gets created.

Once it all works you can chain them back.

ADD COMMENT
0
Entering edit mode

samtools 1.3 already was able to read SAM files for sorting, therefore no need for the second command given you eventually want a sorted bam file.

http://www.htslib.org/doc/1.3/samtools.html

 samtools sort [-l level] [-m maxMem] [-o out.bam] [-O format] [-n] [-T tmpprefix] [-@ threads] [in.sam|in.bam|in.cram]
ADD REPLY
0
Entering edit mode

thanks; thats helpful advice. I am trying it and will let you know how it goes

ADD REPLY

Login before adding your answer.

Traffic: 1598 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6