Question

Is there any tutorial page that tells you how to extract read count from illumina single-cell RNAseq fastq file?

3.0 years ago

Simon Ahn ▴ 10

Hi. I'm new to analyzing Illumina single-cell RNA-seq FASTQ files.

I downloaded SRR5494706, which is an scRNA-seq FASTQ file from an Illumina HiSeq 4000.

I tried to align the reads to the human GENCODE hg38 reference with BWA:

bwa mem -t 10 /data/genome_reference/hg38/GENCODE/GRCh38.p13.genome.fa \
/data/fastq/SRR5494706.fastq > /data/sam/SRR5494706.sam

I converted the SAM file to BAM using samtools:

samtools view -S -b SRR5494706.sam > SRR5494706.bam

I did not know that I should add barcode information to the BAM file, so I just used featureCounts with the GENCODE annotation to extract read counts:

featureCounts -T 10 -s 2 -a /data/msahn/genome_reference/hg38/GENCODE/gencode.v40.annotation.gtf \
-o /data/raw_read/test \
/data/bam/SRR5494706.bam

The output has only one count column, which means I have only one sample (or one cell). It seems that this is because I did not add UMI or barcode information.

I tried to find a complete workflow for extracting read counts from Illumina single-cell RNA-seq FASTQ files, but all I could find was the commands I used above, or the Cell Ranger tutorial for 10X Genomics. Could someone please tell me what I should do, or point me to a website that explains Illumina scRNA-seq data processing?

illumina scRNA-seq • 959 views

score 2 · Accepted Answer · 2022-05-19

Based on the entry at NCBI (https://www.ncbi.nlm.nih.gov/sra/?term=SRR5494706), this is a standard RNA-seq sample, not single-cell, so one count column is expected. Even if it were single-cell, featureCounts would not be suitable for single-cell data in most cases. The entry describes the library as follows:

Construction protocol: Total/Factory RNA was isolated using Tripure reagent according to the manufacturer’s instructions (Roche). Libraries were prepared with the RNA sample preparation kit (TruSeq v2; Ilumina) following ribodepletion (Ribozero Gold Kit). RNA libraries were prepared for sequencing using standard Illumina protocols

As you can see, this is a standard (bulk) RNA-seq sample. If you are referring to the "Layout: Single" field, that means single-end, not single-cell. By the way, there is no such thing as "Illumina single-cell": Illumina is just the sequencer, and it is the library preparation (such as MARS-seq, 10X Chromium, or Fluidigm) that determines what kind of sample it is in terms of method.
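
If you want a quick sanity check on the FASTQ itself, here is a rough sketch (not part of the SRA record, and `fastq_summary` is just a helper name made up for this post). It counts the reads and tabulates read lengths with awk. It assumes an uncompressed FASTQ with exactly four lines per record. Droplet-based single-cell libraries such as 10X Chromium are normally sequenced paired-end, with one short read carrying the cell barcode and UMI, so a single file of uniformly long reads is at least consistent with a bulk, single-end run. Treat the result as a hint only; the library preparation described in the metadata is what actually decides it.

```shell
# fastq_summary FILE
# Prints the number of reads, then one line per observed read length
# with how many reads have that length (tab-separated).
# Assumption: FILE is an uncompressed FASTQ with four lines per record.
fastq_summary() {
    awk 'NR % 4 == 2 {            # every 2nd line of a record is the sequence
             n++
             len[length($0)]++
         }
         END {
             printf "reads\t%d\n", n
             for (l in len) printf "length_%d\t%d\n", l, len[l]
         }' "$1"
}

# Example call, using the file path from the question:
# fastq_summary /data/fastq/SRR5494706.fastq
```

The order of the `length_*` lines is not guaranteed, since awk does not define the iteration order of `for (l in len)`; pipe the output through `sort` if you need it stable.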