Hi all,
I am trying to align some scRNA-seq data and produce counts using CellRanger 7.1.0. The fastq files for input are named as follows (for illustration):
SI-GA-A1_1_S1_L001_I1_001.fastq.gz
SI-GA-A1_1_S1_L001_R1_001.fastq.gz
SI-GA-A1_1_S1_L001_R2_001.fastq.gz
SI-GA-A1_2_S2_L001_I1_001.fastq.gz
SI-GA-A1_2_S2_L001_R1_001.fastq.gz
SI-GA-A1_2_S2_L001_R2_001.fastq.gz
SI-GA-A1_3_S3_L001_I1_001.fastq.gz
SI-GA-A1_3_S3_L001_R1_001.fastq.gz
SI-GA-A1_3_S3_L001_R2_001.fastq.gz
SI-GA-A1_4_S4_L001_I1_001.fastq.gz
SI-GA-A1_4_S4_L001_R1_001.fastq.gz
SI-GA-A1_4_S4_L001_R2_001.fastq.gz
These fastq files are all from the same sample. I am using the following cellranger count options:
cellranger count --id=EXP20_CELLRAN7_1_0 \
--fastqs=/folder1/fastq_folder \
--sample=SI-GA-A1_1,SI-GA-A1_2,SI-GA-A1_3,SI-GA-A1_4 \
--transcriptome=/refdata-gex-GRCh38-2020-A
However, I am encountering the following issue:
14: start_thread
15: clone
Input FASTQ file ended prematurely: file: "/SI-GA-A1_4_S4_L001_R1_001.fastq.gz", line: 153386036
I have aligned the same set of fastq files using STARsolo and it ran successfully, so I cannot work out what the issue might be here. Any help is highly appreciated.
Thank you.
Best, BP
It would be prudent to validate that the fastq file is not corrupt: https://github.com/biopet/validatefastq
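As a complementary check, a truncated or corrupt gzip stream can be detected by decompressing the whole file and counting lines. The following is only a minimal sketch, not a replacement for a dedicated validator: the function name `check_fastq_gz` is hypothetical, and it only tests that the gzip stream decompresses to the end and that the line count is a multiple of 4 (it does not inspect record contents or compare R1 against R2).

```python
import gzip
import zlib


def check_fastq_gz(path):
    """Return (ok, n_lines, message) after decompressing the whole file.

    Hypothetical helper: it checks gzip integrity and that the number of
    lines is a multiple of 4, nothing more.
    """
    n_lines = 0
    try:
        with gzip.open(path, "rb") as handle:
            for _ in handle:
                n_lines += 1
    except (EOFError, OSError, zlib.error) as exc:
        # EOFError: stream ended before the end-of-stream marker (truncation).
        # OSError covers gzip.BadGzipFile; zlib.error covers corrupt data.
        return False, n_lines, f"decompression failed after {n_lines} lines: {exc}"
    if n_lines % 4 != 0:
        return False, n_lines, f"{n_lines} lines is not a multiple of 4"
    return True, n_lines, "ok"
```

Running this on each R1/R2 file and comparing the reported line counts between mates would show whether one file of a pair is shorter than the other.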
Dear GenoMax
Thanks for your suggestion.
Indeed! I have validated the fastq files using the tool that you have mentioned and there doesn't seem to be anything wrong with the fastq files:
Thank you.
Best, BP
Have you tried re-running cellranger? Perhaps that was a temp glitch of some sort.
Not directly related, but it is also a bit odd to name sample fastq files with 10x library codes. Were these files made using indexes listed on separate lines in the samplesheet during demultiplexing (e.g. the 4 that go with SI-GA-A1, one on each line)?
Thanks for the suggestion! Yes, I have tried re-running cellranger, and I have tried different versions of cellranger as well. I ran into the same issue.
I am not aware of the demultiplexing procedure, as the data was sent to us by a sequencing facility a long time ago. However, I do know that the fastq files are from the same sample, and my case matches one of the scenarios described here by 10X: Specifying input 10X
under the section:
Do you think it would be appropriate to concatenate the fastq files into one and then run the cellranger pipeline, given that the fastq files are from the same sample and the same lane?
If _1 and _2 really are the same library, you should rename the fastqs, or make symlinks, such that they have the same sample name, but different lanes. Cellranger will naturally take them all in together.
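The symlink approach could be sketched roughly as below. This is an illustration under assumptions, not a tested recipe for this dataset: the helper names `plan_links` and `make_links` are hypothetical, the new sample name is whatever you choose, and the lane numbers in the new names are artificial labels assigned one per original sample/lane combination, not the real flowcell lanes. It assumes the input names follow the Illumina pattern `<sample>_S<n>_L<lane>_<read>_001.fastq.gz` shown in the question.

```python
import os
import re
from pathlib import Path

# Illumina-style name: <sample>_S<n>_L<lane>_<read>_001.fastq.gz
FASTQ_RE = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>[IR]\d)_001\.fastq\.gz$"
)


def plan_links(filenames, new_sample):
    """Map each original file name to a new name under one shared sample name.

    Every distinct (original sample, original lane) pair gets its own
    artificial lane number, so I1/R1/R2 of one set stay together.
    """
    lanes = {}
    plan = {}
    for name in sorted(filenames):
        m = FASTQ_RE.match(name)
        if m is None:
            continue  # not an Illumina-style fastq name; leave it alone
        key = (m["sample"], m["lane"])
        lane = lanes.setdefault(key, len(lanes) + 1)
        plan[name] = f"{new_sample}_S1_L{lane:03d}_{m['read']}_001.fastq.gz"
    return plan


def make_links(src_dir, dst_dir, new_sample):
    """Create the planned symlinks in dst_dir, pointing at files in src_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    plan = plan_links([p.name for p in src.iterdir()], new_sample)
    for old, new in plan.items():
        os.symlink((src / old).resolve(), dst / new)
    return plan
```

With the files from the question and a new sample name of `SI-GA-A1`, the `SI-GA-A1_1` set would be linked as `SI-GA-A1_S1_L001_*` and the `SI-GA-A1_2` set as `SI-GA-A1_S1_L002_*`, and so on. cellranger count could then be pointed at the link directory with `--fastqs` and a single `--sample=SI-GA-A1`.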