When you look at the GATK Best Practices workflow, there are three pre-processing steps:
1) Duplicate marking
2) Local realignment
3) Base quality recalibration
I am getting contradictory information from different people on whether to use these pre-processing steps for sequencing data produced on the SOLiD platform. What do other SOLiD and GATK users do?
Of course, it already makes a difference whether BWA or LifeScope was used as the mapper: LifeScope clips low-quality tails, whereas BWA tries to map whole reads, including the low-quality tails.
Are there SOLiD-specific characteristics that warrant skipping certain pre-processing steps, or using different ones?
Do SOLiD systems produce FASTQ files for forward and reverse reads? I have only worked with Illumina data in the GATK SNP-calling pipeline. I believe pre-processing is also required for SOLiD data.
Native SOLiD output is XSQ, which LifeScope can use directly for mapping. There are tools to convert an XSQ file to CSFasta and FASTQ so that you can use other mappers.