Samtools flagstat reports zero paired reads in an scRNA-seq BAM file
6 weeks ago
juyeonlee • 0

Hi, I'm new to scRNA-seq analysis and want to ask a few things about processing an scRNA-seq BAM file. While looking for paired reads in my scRNA-seq data, I ran samtools flagstat to get the statistics of the BAM file. Here is the command:

samtools flagstat ./02_cellranger/count_luc_bsd/Vehicle/outs/possorted_genome_bam.bam

Output:

1013_20240619/02_cellranger/count_luc_bsd/Vehicle/outs/possorted_genome_bam.bam
359526689 + 0 in total (QC-passed reads + QC-failed reads)
359526689 + 0 primary
0 + 0 secondary
0 + 0 supplementary
93469059 + 0 duplicates
93469059 + 0 primary duplicates
339612936 + 0 mapped (94.46% : N/A)
339612936 + 0 primary mapped (94.46% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

The output says there are zero reads paired in sequencing and no read1 or read2, which is strange because the scRNA-seq was done with 150-bp paired-end sequencing and I have both R1 and R2 FASTQs. Also, when I ran flagstat on a BAM file obtained from bulk RNA-seq, the output said '142669467 + 0 paired in sequencing', which means there are paired reads in that file. I don't think the differences in methods between bulk and single-cell RNA-seq would cause this difference in the flagstat output, but I'm not sure why I got this result.

The company that performed the scRNA-seq ran cellranger mkfastq, and I started processing and alignment from cellranger count. Here is the command I used to produce the position-sorted BAM file that I ran flagstat on. I have already been doing downstream analysis with Seurat without any problems.

cellranger count --sample V-1_F4 --id Vehicle --fastqs /mnt/bigHDD/kjh_mouse_scrna_240620/01_rawreads/V-1/ --transcriptome /mnt/bigHDD/cellranger_customized_reference/

I would really appreciate any help or advice :)

6 weeks ago

Short answer: this is normal, stop worrying about it.

Longer answer: the reads are not paired in the way they would be in, say, a paired-end DNA-seq project, where R1 and R2 represent the two ends of a piece of DNA and both R1 and R2 are aligned to the genome.
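
To make the "paired in sequencing" line concrete: flagstat derives it from the FLAG field of each BAM record, counting records that have the 0x1 (paired) bit set. Below is a rough Python sketch of that bookkeeping, not samtools' actual implementation. The bit values come from the SAM specification; the function name `tally_flags` and the two lists of FLAG values are made up for illustration.

```python
# SAM FLAG bits, as defined in the SAM specification.
PAIRED = 0x1     # template has multiple segments (paired in sequencing)
UNMAPPED = 0x4   # this read is unmapped
READ1 = 0x40     # first segment in the template
READ2 = 0x80     # last segment in the template


def tally_flags(flags):
    """Simplified, flagstat-like counters computed from FLAG values only."""
    counts = {"total": 0, "mapped": 0, "paired": 0, "read1": 0, "read2": 0}
    for flag in flags:
        counts["total"] += 1
        if not flag & UNMAPPED:
            counts["mapped"] += 1
        # read1/read2 are only counted for records flagged as paired.
        if flag & PAIRED:
            counts["paired"] += 1
            if flag & READ1:
                counts["read1"] += 1
            if flag & READ2:
                counts["read2"] += 1
    return counts


# Hypothetical FLAG values: records written as single-end never set 0x1.
single_end_flags = [0, 16, 1024, 4]
# Hypothetical FLAG values typical of a properly paired alignment.
paired_end_flags = [99, 147, 83, 163]

single_end_counts = tally_flags(single_end_flags)
paired_end_counts = tally_flags(paired_end_flags)
print(single_end_counts)
print(paired_end_counts)
```

If every record in the BAM is written as a single-end read, the paired, read1 and read2 counters all stay at zero no matter how the library was sequenced, which matches the output in the question.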

In single-cell sequencing of this kind, R1 has no RNA information at all. It only has cell barcode and UMI information. It doesn't get aligned, so it's not recorded in the BAM file as its own read. Instead, the cell barcode and UMI are recorded in the tags of the BAM entry for R2, which is the RNA sequence aligned to the genome.
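
If you want to see this for yourself, here is a small Python sketch that pulls the barcode and UMI tags out of one SAM-formatted line (the kind of text `samtools view` prints). I'm assuming the Cell Ranger tag names `CB` (corrected cell barcode) and `UB` (corrected UMI). The record itself, including the read name, barcode and UMI, is invented for illustration, and `parse_sam_record` is just a helper name I made up.

```python
def parse_sam_record(line):
    """Parse one tab-separated SAM line into a small summary dict."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    # Optional fields start at column 12 and look like TAG:TYPE:VALUE.
    tags = {}
    for item in fields[11:]:
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return {
        "qname": fields[0],
        "flag": flag,
        "paired": bool(flag & 0x1),      # 0x1 = paired in sequencing
        "cell_barcode": tags.get("CB"),  # assumed Cell Ranger tag
        "umi": tags.get("UB"),           # assumed Cell Ranger tag
    }


# Hypothetical record: one aligned cDNA read with barcode/UMI tags attached.
example_line = "\t".join([
    "read_0001", "0", "chr1", "1000", "255", "10M", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "CB:Z:AAACCCAAGAAACACT-1", "UB:Z:ACGTACGTACGT",
])

record = parse_sam_record(example_line)
print(record)
```

The point is that there is one record per read pair: the aligned sequence comes from R2, and what R1 contributed survives only as tags on that record, with the paired bit unset.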


Also, a question for the OP: why did you run 2x150? 10X and other single-cell technologies have very clear sequencing guidelines, and none of the ones I'm aware of require 2x150 bp. You can save a ton of money and time by using the 100-cycle kits.


People don't always have control over the runs their samples go on. If you have to share a run with other paired samples that need 2x150, then that's what you get. Cellranger doesn't care.
