from CRAM to fastq
6 months ago

Hello, I have some WGS data and was given the CRAM and cram.crai files. I want to convert them to FASTQ, and I am using samtools:

samtools fastq -O -c 7 C-7507T.cram > C-7507T.fastq.gz

The CRAM file is 32G. The FASTQ file is 583G.

Is that difference in size normal?


Possibly. You appear to have written all the data to a single file, which matters if it was paired-end to begin with. Also check that secondary alignments did not result in duplicate read entries.
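Both checks can be sketched with samtools view -c, which counts records matching a flag filter. This assumes samtools is on the PATH and that the CRAM reference can be resolved; the helper name count_reads is made up for illustration. As far as I know, samtools fastq already skips secondary and supplementary records by default (-F 0x900), so it is worth confirming against the manual for your version.

```shell
#!/bin/sh
# Hypothetical helper: count records in an alignment file that match
# extra "samtools view" filters. Assumes samtools is installed.
count_reads() {
    file=$1
    shift
    # -c prints only the number of matching records
    samtools view -c "$@" "$file"
}

# Example usage (file name taken from the question):
#   count_reads C-7507T.cram            # all records
#   count_reads C-7507T.cram -f 0x1     # records flagged as paired
#   count_reads C-7507T.cram -f 0x100   # secondary alignments
#   count_reads C-7507T.cram -f 0x800   # supplementary alignments
```

If the count with -f 0x1 is zero, the data were most likely sequenced single-end.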


One of the issues is that I am not sure how the sequencing was done, but I think it was single-end. Any suggestions on how to do that?


I am 99.99% sure that your fastq file is not compressed. What does file C-7507T.fastq.gz report? I do not think samtools autodetects the suffix if you send the output to stdout like that.


The -c option is there to compress the file.


Yes, but I still think that with this syntax you get an uncompressed file, because samtools never sees the gz suffix when the shell does the redirection. Just run head on the fastq file and see whether it is text or binary.
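A slightly more robust check than eyeballing the output is to look at the first two bytes, since gzip (and bgzip) files start with the magic bytes 1f 8b. A minimal sketch, with the function name is_gzip made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the gzip magic
# bytes 1f 8b, fail otherwise (e.g. for plain-text FASTQ).
is_gzip() {
    # od prints the two bytes as hex; tr strips the padding and newline
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example usage (file name taken from the question):
#   if is_gzip C-7507T.fastq.gz; then echo compressed; else echo "plain text"; fi
#
# If it turns out to be plain text, piping through gzip is one way to get
# compressed output regardless of how samtools treats stdout:
#   samtools fastq -O C-7507T.cram | gzip -7 > C-7507T.fastq.gz
```

An uncompressed 583G FASTQ from a 32G CRAM would not be surprising, since CRAM is reference-compressed while plain FASTQ is not compressed at all.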

6 months ago

If the CRAM/BAM is paired-end, its reads must first be grouped by query name using samtools collate, so that the two mates of each pair end up next to each other.

https://x.com/lh3lh3/status/1132756684789768202

Command line to remap a position sorted bam:

samtools collate -uOn128 old-pos-srt.bam tmpxyz | samtools fastq - | bwa mem -pt16 ref.fa - | samtools sort --threads=4 -m4G -o new-pos-srt.bam
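If the goal is only FASTQ files rather than remapping, the same collate step can feed samtools fastq with separate outputs per mate. This is a sketch modeled on the example in the samtools fastq manual, not a tested pipeline for this dataset: the function name, output naming scheme, and temporary prefix are assumptions, and a CRAM may additionally need --reference if the reference cannot be found automatically. Recent samtools versions compress outputs whose names end in .gz.

```shell
#!/bin/sh
# Hypothetical helper: convert a position-sorted paired-end BAM/CRAM into
# two FASTQ files, one per mate. Assumes samtools is installed.
cram_to_paired_fastq() {
    input=$1
    out=$2
    # collate groups mates together; -u keeps the stream uncompressed,
    # -O writes it to stdout, the last argument is a temp-file prefix
    samtools collate -u -O "$input" "${out}.collate_tmp" |
        samtools fastq -1 "${out}_R1.fastq.gz" -2 "${out}_R2.fastq.gz" \
            -0 /dev/null -s /dev/null -n -
}

# Example usage (file name taken from the question):
#   cram_to_paired_fastq C-7507T.cram C-7507T
```

Here -0 and -s discard reads that are not flagged as a proper READ1/READ2 pair member and singletons, and -n leaves read names without a /1 or /2 suffix. For single-end data these options would throw the reads away, so check the pairing flags first.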