3.7 years ago
loy_loy
Hi everyone,
I am following the DNA Seq Analysis Pipeline by GDC (https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/DNA_Seq_Variant_Calling_Pipeline/).
They give the following hint for bwa aln: 'If the quality scores are encoded as Illumina 1.3 or 1.5, use BWA aln with the "-l" flag.'
How do I get this information from my bam files?
Thank you!
Best
Lynn
If you have the FASTQ files, you can determine it from the quality strings (more info here): Illumina 1.5 doesn't use "@" and "A" for the Phred scores.
Right. Basically, either you learn the encoding from whoever produced the data, or you have to guess it from the range of Phred score characters. Btw, you are not expected to encounter these 1.3 or 1.5 encodings unless you are working with some very old data (like a small fraction of TCGA).
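To make the "guess from the characters" idea concrete, here is a rough Python sketch. It is only a heuristic, not an official tool: the function name, the labels it returns, and the cut-offs are my own assumptions, based on the commonly cited ASCII ranges (Phred+33 starts at "!", Solexa+64 at ";", Illumina 1.3 at "@", Illumina 1.5 at "B"). A result of "illumina-1.5" can also mean Illumina 1.3 data that simply contains no very low scores, so check a large sample of reads.

```python
def guess_quality_encoding(quality_strings):
    """Guess the FASTQ quality encoding from the range of ASCII characters.

    Heuristic only: labels and thresholds are assumptions based on the
    commonly cited character ranges of each encoding.
    """
    lo, hi = None, None
    for qual in quality_strings:
        for ch in qual.strip():
            code = ord(ch)
            lo = code if lo is None or code < lo else lo
            hi = code if hi is None or code > hi else hi
    if lo is None:
        raise ValueError("no quality characters given")

    if lo < 59:        # characters below ';' only occur in Phred+33
        return "phred33"
    if hi <= 74:       # nothing above 'J': range fits both offsets
        return "ambiguous"
    if lo < 64:        # ';' to '?' are negative Solexa scores
        return "solexa64"
    if lo < 66:        # '@' or 'A' present: scores 0 or 1 are used
        return "illumina-1.3"
    return "illumina-1.5"  # lowest character is 'B' or higher


def fastq_qualities(path, max_reads=100000):
    """Yield the quality line (every 4th line) of an uncompressed FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 3:
                yield line.rstrip("\n")
```

With a FASTQ file you would call something like `guess_quality_encoding(fastq_qualities("reads.fastq"))` (the file name is a placeholder). If you only have BAM files, you could feed the function the QUAL column (column 11) of the `samtools view` output instead, assuming the qualities were stored without conversion when the BAM was created.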