Unexpected High T-Base Content in scRNA-seq Fastq Files and GC Bias Correction
9 weeks ago
Cooper • 0

Hello,

While performing quality control on a public scRNA-seq raw dataset, I noticed that some of the fastq files show an unusually high T-base content in the middle portion of the reads (approximately positions 30-70). Other fastq files from the same dataset do not show this. In addition, almost every fastq file fails the per-sequence GC content check in the FastQC reports.

I would like to ask:

  1. What could be the cause of the sudden increase in T-base frequency, and how should I address it?
  2. Should I perform GC bias correction for single-cell RNA-seq data?

Additionally, can scRNA-seq fastq files be processed with fastp using default parameters for quality control? I have mostly worked with bulk RNA-seq data, so I'm quite new to scRNA-seq.

If there are any papers that provide a thorough yet accessible discussion of practical challenges in scRNA-seq data processing and analysis, I would greatly appreciate the recommendation!

Thank you!

Here are the details:

[Figure: FastQC report for SRR18015167_1]

[Figure: FastQC report for SRR18015167_2]

The FastQC HTML reports are available here: https://github.com/coopertdx/Biostars

9 weeks ago
ATpoint 86k

What you see is normal and expected. In 10x scRNA-seq version 3 (which this is), R1 contains the cell barcode (CB) and UMI in the first 28 bp, followed by a poly(T) stretch, which is the strong T signal in your plots. That stretch is the oligo(dT) sequence that originally bound the poly(A) tail of the transcripts for cDNA synthesis. Nothing needs to be done here. Just run the files through CellRanger or an alternative scRNA-seq pipeline.
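
The R1 layout described above can be checked by hand on a few reads. Below is a minimal Python sketch, assuming the 10x v3 layout of a 16 bp cell barcode followed by a 12 bp UMI (together the 28 bp mentioned above). The function names and the toy reads are hypothetical and only for illustration; they are not taken from the dataset in question.

```python
# Assumed 10x v3 R1 layout: 16 bp cell barcode + 12 bp UMI = 28 bp.
CB_LEN = 16
UMI_LEN = 12


def split_r1(seq):
    """Split an R1 read into (cell barcode, UMI, remainder)."""
    cb = seq[:CB_LEN]
    umi = seq[CB_LEN:CB_LEN + UMI_LEN]
    rest = seq[CB_LEN + UMI_LEN:]
    return cb, umi, rest


def per_position_t_fraction(reads):
    """Fraction of reads with a T at each position (up to the shortest read)."""
    length = min(len(r) for r in reads)
    return [sum(r[i] == "T" for r in reads) / len(reads) for i in range(length)]


# Toy R1 reads (made up): barcode + UMI followed by a poly(T) stretch.
reads = [
    "ACGTACGTACGTACGT" + "GGCCAATTGGCC" + "T" * 30,
    "TTGCATGCATGCATGC" + "AACCGGTTAACC" + "T" * 30,
]

cb, umi, rest = split_r1(reads[0])
fracs = per_position_t_fraction(reads)
```

On real R1 reads, the T fraction would be expected to rise sharply right after position 28 in the same way, which is what the per-base sequence content plot shows.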


Got it! Thank you!
