What exactly do Picard tools cleansam and fixmate do, and what are the equivalents in samtools?
8.7 years ago
BioinfoNovice ▴ 110

Hi,

I'm wondering what Picard tools "cleansam" and "fixmate" exactly do to the data, and what the equivalents in samtools are.

I'm processing my sorted BAM files (produced from the FASTQ mapping with samtools) so that I can proceed to SNP calling with GATK. The GATK best practices guideline recommends treating unmapped SAM/BAM files with Picard tools "cleansam" and "fixmate" before removing duplicates. However, I have already sorted my BAM files with samtools and do not want to start over from the unsorted files. What can I do with samtools on my sorted BAM files to get the same result as running Picard tools "cleansam" and "fixmate"?

Thanks.

picard tools samtools GATK • 6.1k views

As far as I know the fix mate command will check that two mates of a pair are actually in the file. Sometimes prior filtering will remove one mate of a pair, but it won't update the SAM flag which says both mates are present. Fix mate will run through the file and update the flag if it can't find the mate anymore.
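To make the "SAM flag" part concrete, here is a minimal Python sketch that decodes a FLAG value into the pairing-related bits. The bit values come from the SAM specification; the helper name `decode_flag` is made up for this illustration and is not part of samtools or Picard.

```python
# SAM FLAG bits as defined in the SAM specification (subset relevant to pairing).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
}


def decode_flag(flag):
    """Return the names of the bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# 99 = 0x1 + 0x2 + 0x20 + 0x40
print(decode_flag(99))  # ['paired', 'proper_pair', 'mate_reverse', 'read1']
```

A read keeps these bits even if its mate is later filtered out of the file, which is the stale-flag situation described above.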


Samtools fixmate:

Fill in mate coordinates, ISIZE and mate related flags from a name-sorted alignment.

Fixmate checks the two mates of each pair in a name-sorted paired-end BAM and then updates the flags and insert sizes. IMHO that only makes sense if you did any filtering on your BAM. For example, I typically filter my BAM (for ChIP-seq, ATAC-seq and these kinds of assays) for properly-paired reads and MAPQ. If filtering for MAPQ>30 removes the forward but not the reverse mate, the remaining read no longer has a mate in the file, even though its bitwise flag still says it does. Running fixmate will then flag this singleton as unpaired and remove the insert size field, which allows subsequent removal by e.g. samtools view -f 2. In your case, as you aligned directly from FASTQ, I do not think it is necessary. Just be sure in your subsequent SNV calling that you exclude reads with MAPQ=0, as these multimappers are unreliable.
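The filter-then-repair scenario above can be sketched in plain Python. This is only a conceptual illustration on toy dictionaries, not how samtools fixmate is implemented: the record layout, the function names `filter_by_mapq` and `repair_singletons`, and the choice to clear just the proper-pair bit and zero the template length are all assumptions made for the example, and the real tool may update other fields too.

```python
from itertools import groupby

# SAM FLAG bits (from the SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2


def filter_by_mapq(records, min_mapq):
    """Keep reads whose MAPQ is at least min_mapq (like a per-read MAPQ filter)."""
    return [r for r in records if r["mapq"] >= min_mapq]


def repair_singletons(records):
    """Toy fixmate-like pass over name-grouped records.

    A paired read whose mate is no longer present loses its proper-pair
    bit and its template length. Assumption: simplified behaviour.
    """
    fixed = []
    for _, group in groupby(records, key=lambda r: r["name"]):
        group = list(group)
        for rec in group:
            rec = dict(rec)  # copy so the input is left untouched
            if len(group) == 1 and rec["flag"] & PAIRED:
                rec["flag"] &= ~PROPER_PAIR
                rec["tlen"] = 0
            fixed.append(rec)
    return fixed


# Two pairs, name-sorted; the forward mate of r2 has a low MAPQ.
records = [
    {"name": "r1", "flag": 99, "mapq": 42, "tlen": 200},
    {"name": "r1", "flag": 147, "mapq": 42, "tlen": -200},
    {"name": "r2", "flag": 99, "mapq": 12, "tlen": 180},
    {"name": "r2", "flag": 147, "mapq": 40, "tlen": -180},
]

kept = filter_by_mapq(records, 30)      # drops only the forward mate of r2
repaired = repair_singletons(kept)      # r2's remaining mate loses bit 0x2
proper = [r for r in repaired if r["flag"] & PROPER_PAIR]  # like `view -f 2`
print([r["name"] for r in proper])      # ['r1', 'r1']
```

Without the repair step, the leftover r2 read would still carry flag 147 and would pass a proper-pair filter despite having no mate in the file.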

7.2 years ago
lamteva.vera ▴ 220

I'm highly interested in the matter as well.
