Question

Split fastq.gz file

0

2.1 years ago

sandy • 0

Hello everyone,

I downloaded a single fastq.gz file from a study posted on GEO; the data are paired-end single-cell sequencing. The study appears to have combined the R1 and R2 files into one fastq.gz file: when I checked the reads with zcat, they were really long, about 98 characters. I want to split them into R1_fastq.gz and R2_fastq.gz files. Please advise on how I should do this.

fastq • 1.3k views

ADD COMMENT • link updated 20 months ago by Ram 44k • written 2.1 years ago by sandy • 0

0

Can you post the accession number so we can examine the record? From where and how did you download the file?

ADD REPLY • link 2.1 years ago by GenoMax 147k

0

Hi, thanks for your reply. I downloaded this study, GSE138669, from GEO. I used the download link generated by SRA Explorer and fetched the fastq.gz with ascp.

ADD REPLY • link 2.1 years ago by sandy • 0

0

Please show the output of head on the file.

ADD REPLY • link 2.1 years ago by ATpoint 85k

0

sure!

This is:

zcat SRR10254552_GSM4115872_SC18_Homo_sapiens_RNA-Seq.fastq.gz|head -10 
@SRR10254552.1 NS500211:194:HNFHHBGXY:4:21402:13612:10917
CACCGCGAGGGCGGAGCTGCGTTGTCCTCTGCACAGATTTCGGTGGTACTCTGAAGGCGGAGCACAGTTCTCCTCAGGTCAGACCCGGGCGGGCGGGC

ADD REPLY • link updated 2.1 years ago by GenoMax 147k • written 2.1 years ago by sandy • 0
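
Looking at only the first record says little about the rest of the file. A minimal sketch for tallying read lengths across the whole file follows; the helper name `read_length_counts` is made up for illustration, and the file is assumed to use the standard four-line FASTQ record layout.

```shell
#!/usr/bin/env bash
# Tally read lengths in a gzipped FASTQ file.
# Assumption: standard 4-line records, so the sequence is line 2 of each record.
read_length_counts() {
    local fastq_gz="$1"
    # gzip -dc does the same job as zcat and behaves the same on Linux and macOS
    gzip -dc "$fastq_gz" \
        | awk 'NR % 4 == 2 { counts[length($0)]++ }
               END { for (len in counts) print len "\t" counts[len] }' \
        | sort -n
}

# Example call:
# read_length_counts SRR10254552_GSM4115872_SC18_Homo_sapiens_RNA-Seq.fastq.gz
```

Each output line is a read length followed by the number of reads of that length, separated by a tab.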

score 1 · Answer 1 · 2022-10-14

1

2.1 years ago

GenoMax 147k

This looks like a 10x dataset, which can be hit-or-miss to download intact from SRA. Your best bet is either to use the count file provided in the GEO record or to use the Data Access tab to download the BAM file that holds the original data: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR10254549&display=data-access . Then use the bamtofastq utility provided by 10x to reconstruct the original fastq data.

ADD COMMENT • link 2.1 years ago by GenoMax 147k
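
The reconstruction step could look roughly like the sketch below. It assumes the 10x Genomics tool, which is distributed under the name `bamtofastq`, is on the PATH; the wrapper name `reconstruct_fastq`, the thread count, and the file names in the example call are placeholders rather than anything taken from the GEO record.

```shell
#!/usr/bin/env bash
# Sketch: rebuild FASTQ files from a 10x BAM with 10x Genomics' bamtofastq.
# Assumptions: bamtofastq is on PATH; the thread count of 8 is arbitrary.
reconstruct_fastq() {
    local bam="$1" outdir="$2"
    [ -f "$bam" ] || { echo "BAM not found: $bam" >&2; return 1; }
    # bamtofastq is expected to create the output directory itself,
    # so this wrapper refuses to reuse a path that already exists.
    [ ! -e "$outdir" ] || { echo "Output path already exists: $outdir" >&2; return 1; }
    bamtofastq --nthreads=8 "$bam" "$outdir"
}

# Example call (file and directory names are placeholders):
# reconstruct_fastq SC2possorted_genome_bam.bam SRR10254549_fastq
```

The checks up front are only there so that a missing input or a leftover output directory is reported before a long run starts.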

0

Thank you!

ADD REPLY • link 2.1 years ago by sandy • 0

0

Sorry to bother you again. Could you suggest how to download the BAM file? Should I use wget, prefetch, or ascp?

Thank you

ADD REPLY • link 2.1 years ago by sandy • 0

1

You can use wget with the link https://sra-pub-src-2.s3.amazonaws.com/SRR10254549/SC2possorted_genome_bam.bam.1
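
As a rough sketch, the download could be wrapped like this. The function name `download_bam` is hypothetical, and dropping the trailing `.1` from the saved file name is an optional tidy-up, not something the downstream step is known to require.

```shell
#!/usr/bin/env bash
# Sketch: fetch the BAM with wget and drop the trailing ".1" from the file name.
# The helper name download_bam is made up; the URL in the example is the one quoted above.
download_bam() {
    local url="$1"
    local remote_name="${url##*/}"        # e.g. SC2possorted_genome_bam.bam.1
    local local_name="${remote_name%.1}"  # e.g. SC2possorted_genome_bam.bam
    # -c resumes a partial download if the transfer was interrupted
    wget -c "$url" || return 1
    [ "$remote_name" = "$local_name" ] || mv "$remote_name" "$local_name"
    echo "$local_name"
}

# Example call:
# download_bam https://sra-pub-src-2.s3.amazonaws.com/SRR10254549/SC2possorted_genome_bam.bam.1
```

Because of `-c`, re-running the same call after a dropped connection should pick up where the transfer stopped instead of starting over.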