Hello,
I used the following command to extract the properly paired reads from a BAM file:
samtools view -F 0x2 file.bam > proper_paired.bam
But when I validated the resulting file with Picard's ValidateSamFile, using the following command:
java -jar /apps/picard-tools/2.1.0/picard.jar ValidateSamFile I=proper_paired.bam MODE=SUMMARY
I got the following errors:
ERROR:MISSING_READ_GROUP 1
ERROR:MISSING_SEQUENCE_DICTIONARY 9671811
ERROR:READ_GROUP_NOT_FOUND 9878009
WARNING:RECORD_MISSING_READ_GROUP 9878009
The dictionary for the FASTA file exists in my folder. It seems that samtools view -F removes the SAM header.
I tried to add the header from file.bam with
samtools reheader header.sam proper_paired.bam
but I get the following error:
Segmentation fault (core dumped) samtools reheader header.sam proper_paired.bam
The initial file.bam is fine; I double-checked. I only need the properly paired reads in the BAM file.
Thank you.
Thanks both. It now has the correct format. However, I was wondering whether you could help me clarify something. The result of the samtools flagstat command on the original BAM file is:
and when I run
samtools view -h -b -F 0x2 original_file.bam > proper_paired.bam
the result of samtools flagstat on the proper-paired BAM file is:
How is it possible to have 0 properly paired reads after filtering with 0x2? I thought 0x2 was meant to keep only the properly paired reads in the BAM file. Am I missing something?
little f is to keep, big F is to remove.
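To make the keep/remove distinction concrete, here is a rough shell sketch of how the two options treat a read's FLAG value. The helper name `passes` and the example FLAG values 99 and 65 are only illustrations, not part of samtools. The rule it encodes is the one the samtools manual describes: -f needs all of the given bits set, -F needs none of them set.

```shell
# Hypothetical helper: would a read with FLAG $1 survive "-f $2 -F $3"?
# -f keeps reads that have ALL required bits set;
# -F drops reads that have ANY excluded bit set.
passes() {
    flag=$1; require=$2; exclude=$3
    [ $(( flag & require )) -eq $(( require )) ] && [ $(( flag & exclude )) -eq 0 ]
}

# FLAG 99 = paired + proper pair + mate reverse + first in pair
# FLAG 65 = paired + first in pair (not a proper pair)
for flag in 99 65; do
    if passes "$flag" 0x2 0; then
        echo "-f 0x2 keeps FLAG $flag"
    else
        echo "-f 0x2 drops FLAG $flag"
    fi
    if passes "$flag" 0 0x2; then
        echo "-F 0x2 keeps FLAG $flag"
    else
        echo "-F 0x2 drops FLAG $flag"
    fi
done
```

So -F 0x2 throws away exactly the reads that have the proper-pair bit set, which is why flagstat then reports 0 properly paired reads.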
I also tried
So either -f or multiple -F options corrupt the file. If you know how I can get only the properly paired reads from a BAM file, please let me know. Thank you.
Use the ADD COMMENT/ADD REPLY buttons on previous posts to add additional information like this. Don't add "New answers".
Hehehe, corrupts the file. I think "it corrupts the file!" will be my new go-to phrase for when people ask me a difficult question which has a long, complicated and boring answer.
You can't chain -F flags up like that. You need to add the 0x4 and 0x8 and 0x400 and 0x200 up, which as we all know is 0x60c. Obviously. You could also do -F 1548, but you can't use 11000001100, because that would be too easy.
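If you'd rather not do the hex arithmetic in your head, the shell can do it for you. This is just a small sketch; the variable name `mask` is made up, and the bit meanings in the comments are the standard SAM FLAG definitions.

```shell
# Combine the individual SAM FLAG bits into a single value for -F.
# 0x4   = read unmapped
# 0x8   = mate unmapped
# 0x200 = read fails platform/vendor quality checks
# 0x400 = PCR or optical duplicate
mask=$(( 0x4 | 0x8 | 0x200 | 0x400 ))

printf 'decimal: %d\n' "$mask"    # prints "decimal: 1548"
printf 'hex:     0x%x\n' "$mask"  # prints "hex:     0x60c"
```

Either form (1548 or 0x60c) can then be passed as the single -F value.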
You can have an -f and an -F at the same time though.
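As a rough sketch of what that could look like for this thread: keep only properly paired reads with -f 0x2, and in the same call drop unmapped reads, reads with unmapped mates, QC failures and duplicates with one combined -F value. The file names are the ones used earlier in the thread. The variable names and the guard around samtools are my own additions, so the snippet only prints the command when samtools or the input file is not available.

```shell
in_bam="original_file.bam"    # input name taken from the thread
out_bam="proper_paired.bam"   # output name taken from the thread

# One combined exclusion mask instead of several -F options.
exclude=$(( 0x4 | 0x8 | 0x200 | 0x400 ))   # 1548, i.e. 0x60c

if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    # -h keeps the header, -b writes BAM,
    # -f 0x2 keeps proper pairs, -F drops anything matching the mask.
    samtools view -h -b -f 0x2 -F "$exclude" -o "$out_bam" "$in_bam"
    samtools flagstat "$out_bam"
else
    echo "would run: samtools view -h -b -f 0x2 -F $exclude -o $out_bam $in_bam"
fi
```

Whether you want all four exclusion bits depends on your analysis; -f 0x2 on its own is enough if you only care about proper pairing.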
I can't think/calculate in hexadecimal, so no, I did not know it would be 0x60c. I am sure a lot of people on this forum may not either :-)
<sarcasm />
:P hehehe, sorry. I think making people use hex or even base 10 to talk/think about flags was a huge user interface mistake for samtools. Actually I've gone on record saying it a lot more strongly than that, and in all of my tools I use letters instead of flags because honestly, summing numbers is what computers should do, not humans.
Many (including me) would totally agree with you.
Is that an irreversible decision? It would be a big help to be able to specify letter flags.
Is the samtools flags command only meant for translating flags? If it does that, then surely letter/word options could easily be enabled for -f and -F.