Hello,
I have a question regarding merging FASTQ files from the same sample but different sequencing runs.
We generated cDNA libraries using the 10X v3 protocol but weren’t sure they had worked, so we initially sequenced only a small fraction of the library (10,000 reads/cell) on a NovaSeq.
Once this data came back, we generated a count matrix and confirmed in Seurat that the data looked usable. The same library was then sequenced to the full depth that we require for proper analysis (with the same company, also on a NovaSeq).
I now have two FASTQ files: one from the original shallow run and another from the new run.
I am now wondering how to combine the two FASTQs into one?
Would cellranger aggr be best, or could I use the --lanes option of cellranger count?
Many thanks!
You wouldn't want to use aggr because it would treat the same barcode from each of your two runs as two unique molecules for the purposes of determining cells and UMIs. --lanes wouldn't work either, because the two datasets aren't separated by lane but by sequencing run. Your best bet would be to just concatenate the FASTQ files before running them through cellranger.
Sorry, why wouldn't lanes work? Different lanes presumably mean the same library (which would account for combining reads under the same barcode), so it shouldn't be a problem if I just want to merge data from two runs? At the data level, is there any difference between two sequencing runs and two lanes?
I have a similar issue, but my two sequencing runs each account for a similar sequencing depth.
Unless you have a specific need, you can simply use the full-depth data. The test run is unlikely to add much to the overall result.
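For readers who do want to keep both runs, the concatenation suggested earlier in the thread can be sketched as below. This is a minimal sketch, not a tested pipeline: the helper name merge_fastq and every path are hypothetical, and it assumes the files are gzipped FASTQs named in the Sample_S1_L001_R1_001.fastq.gz style that cellranger expects. Gzip streams can be joined byte-for-byte, so plain cat is enough. The one thing to watch is that R1 and R2 are concatenated in the same run order, otherwise the read pairs fall out of sync.

```shell
# merge_fastq OUT IN1 [IN2 ...]
# Hypothetical helper: writes the inputs, in the order given, to OUT.
# Concatenated gzip members form a valid gzip file, so no decompression is needed.
merge_fastq() {
    out="$1"
    shift
    cat "$@" > "$out"
}

# Example usage (all paths are placeholders, shown commented out):
# mkdir -p merged
# merge_fastq merged/sample_S1_L001_R1_001.fastq.gz \
#     run1/sample_S1_L001_R1_001.fastq.gz run2/sample_S1_L001_R1_001.fastq.gz
# merge_fastq merged/sample_S1_L001_R2_001.fastq.gz \
#     run1/sample_S1_L001_R2_001.fastq.gz run2/sample_S1_L001_R2_001.fastq.gz
```

The merged directory would then be the single path given to --fastqs in cellranger count.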
Hi, I am also trying to merge FASTQ files from the same sample but different sequencing runs. I now have two FASTQ files: one from the original shallow run and another from the new run. When I ran cellranger count, I got weird results:
1st run, Estimated Number of Cells: 8,748
2nd run, Estimated Number of Cells: 3,340
merged run, Estimated Number of Cells: 12,260
I am getting more cells in the merged run, but there should be overlap because both runs are from the same GEM well.
cellranger count --id=sample_483_merged \
    --fastqs=old_run/fastq_path/AAC7GKFHV,new_run/fastq_path/AAC2WMNHV/ \
    --transcriptome=/scratch/tmp/vermaa/hg38/ \
    --sample=sample_483,sample_483 \
    --include-introns=false
Is my code correct? How can I resolve this problem?
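One way to check the expected overlap directly is to compare the cell barcodes called in the two single-run analyses. The following is only a diagnostic sketch: the function name count_shared_barcodes and the run directory names are hypothetical, and it assumes each run was processed separately with cellranger count, which writes the called barcodes to outs/filtered_feature_bc_matrix/barcodes.tsv.gz.

```shell
# count_shared_barcodes A.tsv.gz B.tsv.gz
# Hypothetical helper: prints how many barcodes appear in both gzipped lists.
count_shared_barcodes() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)
    # comm needs both inputs sorted under the same collation, so fix the locale.
    gzip -dc "$1" | LC_ALL=C sort > "$a_sorted"
    gzip -dc "$2" | LC_ALL=C sort > "$b_sorted"
    # -12 suppresses lines unique to either file, leaving only shared barcodes.
    LC_ALL=C comm -12 "$a_sorted" "$b_sorted" | wc -l | tr -d ' '
    rm -f "$a_sorted" "$b_sorted"
}

# Example usage (placeholder run directories, shown commented out):
# count_shared_barcodes \
#     run1_only/outs/filtered_feature_bc_matrix/barcodes.tsv.gz \
#     run2_only/outs/filtered_feature_bc_matrix/barcodes.tsv.gz
```

For reference, 8,748 + 3,340 = 12,088, which is close to the merged estimate of 12,260. If the shared count printed here is near zero, that might indicate the two FASTQ sets do not come from the same GEM well, which would be worth confirming with the sequencing provider.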