Quality scores of base recalibrated Cram file
1
0
Entering edit mode
2.1 years ago
asalimih ▴ 60

Hello,
I tried both samtools fastq and also picard SamToFastq commands to convert a cram file to it's original fastq files. However when I compare the fastqs to the original one the quality scores of the reads are changed.


Converted from cram
@A00718:331:HCCFYDSX2:1:1502:12292:2769/1
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAACCTAACCCAAGACCTAACACAAAACCAACACCTAACACTAAACCTAACGCTAA
+
??????????????????????????????????????????????????????????????????????????????+?????????????++?'++?+5?+?+?'?5+?+5??+'????++?++++++?'??++5'++++5?++'55'+


Original
@A00718:331:HCCFYDSX2:1:1502:12292:2769 1:N:0:GTTAGAAC+ATCACGTG
ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAACCTAACCCAAGACCTAACACAAAACCAACACCTAACACTAAACCTAACGCTAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,:FFFF::FFFF::,,:,,,:,,F,:,F,:,,F,,F:,,F:FF,,F,,,,,,:,::,,,,,,,,,:,,,,,,,

Is anything wrong with the fastqs? (Note that my case is I accidently have deleted the original fastqs and only have the cram file. this here Is just another example that I have both the fastqs and cram file)

Update 25 December 2022 (changed the question title accordingly) It seems that the cram file has been base recalibrated before (using GATK BaseRecalibrator and ApplyBQSR). but I'm not sure if this is an expected output of base recalibrator.

fastq cram • 1.4k views
ADD COMMENT
0
Entering edit mode

Note that my case is I accidently have deleted the original fastqs and only have the cram file. t

so , how did you get that Original view ?

ADD REPLY
1
Entering edit mode

As noted in original post:

this here Is just another example that I have both the fastqs and cram file)

ADD REPLY
0
Entering edit mode

(post edited, I had put the reads in reverse)

ADD REPLY
0
Entering edit mode

Can you show how you did the conversion from fastq to CRAM originally? Have you considered that possibility that there was some error in that step itself?

ADD REPLY
0
Entering edit mode

Sure,
fastq to bam:

bwa mem -K 100000000 -p -v 3 -t $Threads -Y $ref_fasta

bam to cram:

samtools view -C -T ${ref_fasta} ${input_bam} > output.cram

(with indexing)

ADD REPLY
0
Entering edit mode

I don't think there was any error. because these cram files are viewable inside IGV.

ADD REPLY
0
Entering edit mode
2.1 years ago
ATpoint 85k

I cannot reproduce this:

$ cat foo.fq
@read1
GACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCAGACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCA
+
GACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCAGACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCA

$ bwa mem ref.fa foo.fq | samtools view -o foo.bam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1 sequences (88 bp)...
[M::mem_process_seqs] Processed 1 reads in 0.001 CPU sec, 0.001 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem ref.fa foo.fq
[main] Real time: 0.002 sec; CPU: 0.003 sec

$ samtools view -C -T ref.fa -o foo.cram foo.bam
$ samtools fastq foo.cram
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 1 reads
@read1
GACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCAGACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCA
+
GACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCAGACGTTACGTATGCTACGATCGATCGATCGATCAGTCGACTGCA

It's the same. bwa=0.7.17-r1188 && samtools:1.15.1

ADD COMMENT
0
Entering edit mode

I tired your code on one of the reads too and yes there were no problems. those bwa and samtools that I mentioned in the comments are just the main command for the conversion. the whole workflow is based on GATK best practices and includes multiple steps. I'll dig more into it and see if there is anything relevant.

ADD REPLY
0
Entering edit mode

I updated the question.

ADD REPLY

Login before adding your answer.

Traffic: 2646 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6