10x Cellranger SN RNA-seq BAM file output looks off
1
0
Entering edit mode
20 months ago
rebeliscu ▴ 60

I'm working with SN RNA-seq data that I processed using Cellranger v. 7.1.0 (which uses STAR for alignment with default settings).

Using the samtools flagstat command to look at a BAM file output:

284890368 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
141030430 + 0 duplicates
277071629 + 0 mapped (97.26% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

Which looks off... shouldn't there be reads annotated as read1 or read1, properly paired or not, etc.?

Looking at the unique samtools flags in this particular file, I find:

flag  occurrences
0   62497243
16  59602309
1024    73268831
1040    67761599

... which correspond to 'unpaired+forward', 'reverse', 'duplicate', and 'reverse+duplicate', respectively.

My goal is to filter for uniquely mapping, properly paired reads, but it's not clear to me how this can be done given these outputs.

Any advice or insight would be much appreciated.

cellranger samtools star bam rna-seq • 1.2k views
ADD COMMENT
0
Entering edit mode

My goal is to filter for uniquely mapping, properly paired reads

So I'm guessing you'll need a flag to pick for samtools view -f/-F, right?

it's not clear to me how this can be done given these outputs

How does the flagstat output affect the flag you pick for samtools view?

ADD REPLY
0
Entering edit mode

The flagstat output doesn't necessarily impact those flags, I just wanted to know what I was working with in terms of properly paired reads, and was initially confused by the finding that there were none. And, according to my search of the BAM file, none of the flags indicated any of the reads were properly paired, hence my confusion about how to extract these data.

However, per Genomax's comment, this does not apply to 10x SC/SN RNA-seq data, so my question was misguided from the start.

ADD REPLY
0
Entering edit mode

Your question was not misguided, we all miss "obvious" differences at times owing to the fact that our expectations are set based on the majority of our experience. BTW, I've moved GenoMax's comment to an answer. Please accept it to mark the post as solved.

ADD REPLY
3
Entering edit mode
20 months ago
GenoMax 147k

My goal is to filter for uniquely mapping, properly paired reads

There are no paired reads in 10x scRNAseq data. Read 1 only contains UMI+Cell Barcodes.Read 2 is single read for RNA.

ADD COMMENT
1
Entering edit mode

Minor correction: There are no paired end reads in 10X scRNA-seq data. 10X scATAC-seq has PE data.

ADD REPLY
0
Entering edit mode

Thank you GenoMax, I realized this shortly after posting. Makes sense now!

ADD REPLY

Login before adding your answer.

Traffic: 1643 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6