I am aligning fastq files R1_1.fastq and R1_2.fastq to a small genome and using '@RG\tID:R1\tSM:R1' as the string in the -R option but I think this is causing the command to give a wrong answer. Perhaps you can suggest a reason.
Thanks
I am aligning fastq files R1_1.fastq and R1_2.fastq to a small genome and using '@RG\tID:R1\tSM:R1' as the string in the -R option but I think this is causing the command to give a wrong answer. Perhaps you can suggest a reason.
Thanks
Stop chaining commands. When one of them is incorrect you can't quite tell which one causes the problem. Especially if you are new at this each error is more puzzling than the next.
Instead, break all steps into individual stages:
bwa mem this-that > file1.sam
samtools view -b file1.sam > file1.bam
samtools sort ....
and so on. At each step investigate the output that gets created.
Once it all works you can chain them back.
samtools 1.3 already was able to read SAM files for sorting, therefore no need for the second command given you eventually want a sorted bam file.
http://www.htslib.org/doc/1.3/samtools.html
samtools sort [-l level] [-m maxMem] [-o out.bam] [-O format] [-n] [-T tmpprefix] [-@ threads] [in.sam|in.bam|in.cram]
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Read groups are not required unless you plan to use GATK to do any analysis. You can find more info about read groups here.
Hi, Following your advice I took out the -R field and removed the STR but then when I ran the command
and then
to get the number of mapped reads I get:
Any advice?
Thanks
Add
-S
to yoursamtools view
command since your input is in SAM format.What is you samtools version?
samtools --version
?samtools version 1.3.1
Ok, this is ancient. I recommend upgrading. With the current version it would simple be:
(...) | samtools view -o R1.bam
There have been many bug fixes and performance optimizations since 1.3.1, so it is worth upgrading.
thanks but I do not have the authority to do that so I will add the -S option; I am confused though; my command is
so I m asking it to count the number of mapped reads and am giving it a bam file so why does it need a -S option. I think the real problem is coming from bwa mem as I getting
at the end of the run.
My command here is
I am not at all sure why this is not running properly.
Thanks as always
That command does not need the
-S
option but the first time you make the BAM file does since for older version of samtools you tell it that the input is in SAM format.thanks;
samtools view -bS - > R1.bam
I assumed that this was catching it.
The real problem seems to be the No @HD tag found; I have been trying a lot of different things but cannot see why this is happening.
Thanks again for your time.
It is puzzling why your file does not have
@HD
tag. What do you see by this:Can you do the two steps separately?
They try the command above on this file.
We have installed a more recent version of samtools (1.10) and I have run the pipeline in several folders and am getting completely consistent results which is good but when I look at the head of the bam file I cannot see the @HD tag although everything else is there;
My main worry is that the missing @HD head suggests that something is wrong and the output is unreliable but am unsure.
I think we are nearly there.
Thanks,
William
The
HD
header is optional metadata.As the SAM spec states
https://samtools.github.io/hts-specs/SAMv1.pdf
Thus it is quite conceivable that some aligners do not include the
HD
tag and that does not mean that the output is incomplete.Thanks that was helpful