4.0 years ago
shenwei1376
Hi everyone,
I downloaded some single-cell RNA-seq data from SRA. Unfortunately, only the R2 reads (~100 bp) were uploaded for all of the SRR files (SRR11573310...).
I am wondering whether it would be OK to align these reads to the genome just as in a regular RNA-seq pipeline, for example using HISAT2 followed by htseq-count to get the gene counts?
Thank you very much for any help!
If this is 10X and R2 contains only the cDNA, then that would give you a kind of "bulk" sample with excessive PCR duplicates, since both the cell barcodes (the per-cell identifiers) and the UMIs (used for deduplication) are in R1. I would contact the authors. scRNA-seq is complex enough; there is no need to add further uncertainty with custom (and unreliable) pipelines.
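To make the role of R1 concrete, here is a minimal Python sketch of how per-cell counts are normally derived by collapsing reads that share a cell barcode, UMI and gene. It is an illustration only, not what Cell Ranger or any other tool actually does. The R1 layout (16 bp barcode followed by a 12 bp UMI) is an assumption based on 10x v3 chemistry, and the function names, sequences and gene label are made up.

```python
from collections import defaultdict

# Assumed R1 layout (10x v3 chemistry): 16 bp cell barcode, then 12 bp UMI.
CB_LEN, UMI_LEN = 16, 12


def split_r1(r1_seq):
    """Split an R1 sequence into (cell_barcode, umi)."""
    return r1_seq[:CB_LEN], r1_seq[CB_LEN:CB_LEN + UMI_LEN]


def count_molecules(records):
    """Collapse reads into unique molecules.

    records: iterable of (r1_seq, gene), where gene is the feature
    the matching R2 read aligned to.
    Returns {cell_barcode: {gene: number_of_unique_umis}}.
    """
    umis = defaultdict(set)
    for r1_seq, gene in records:
        cb, umi = split_r1(r1_seq)
        umis[(cb, gene)].add(umi)  # PCR copies share a UMI and collapse here

    counts = defaultdict(dict)
    for (cb, gene), umi_set in umis.items():
        counts[cb][gene] = len(umi_set)
    return dict(counts)


# Toy data: hypothetical barcodes, UMIs and gene name.
cell_a, cell_b = "A" * CB_LEN, "C" * CB_LEN
umi_1, umi_2 = "G" * UMI_LEN, "T" * UMI_LEN
records = [
    (cell_a + umi_1, "GeneX"),
    (cell_a + umi_1, "GeneX"),  # PCR duplicate
    (cell_a + umi_1, "GeneX"),  # PCR duplicate
    (cell_a + umi_2, "GeneX"),
    (cell_b + umi_1, "GeneX"),
]

counts = count_molecules(records)
raw_read_count = len(records)  # all that R2 alone can give you

print(counts)
print(raw_read_count)
```

With R2 only, the five reads above can only be summed into one count for the gene: there is no way to assign them to two cells, and no way to see that three of them are copies of the same molecule.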
Thank you, I get the point. Could you help me understand why duplicates are a bigger problem for single-cell RNA-seq? I remember that we generally don't remove duplicates for regular RNA-seq. Many thanks!
Read 1 may be available from the "Data Access" tab for these accessions (an example here), if the name of the file is correct. Look under "Original Format".
Thanks. I think they just named it R1 but it is actually R2, as I could only find one file. It is a big headache to download single-cell RNA-seq data.
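One quick way to check whether a file labelled R1 really is the barcode read is to look at the read lengths. The Python sketch below assumes 10x-style data, where the barcode/UMI read is typically under about 30 bp while the cDNA read is much longer (~100 bp here). The 30 bp cutoff, the function names and the toy FASTQ records are assumptions for illustration.

```python
import io
from itertools import islice


def read_lengths(handle, max_reads=1000):
    """Return sequence lengths of the first max_reads FASTQ records."""
    lengths = []
    for i, line in enumerate(islice(handle, max_reads * 4)):
        if i % 4 == 1:  # the 2nd line of each 4-line record is the sequence
            lengths.append(len(line.strip()))
    return lengths


def looks_like_barcode_read(lengths, max_len=30):
    """Heuristic (assumed cutoff): barcode/UMI reads are short."""
    return bool(lengths) and max(lengths) <= max_len


# Toy FASTQ text standing in for real files.
short_fastq = "@r1\n" + "A" * 28 + "\n+\n" + "I" * 28 + "\n"
long_fastq = "@r1\n" + "A" * 100 + "\n+\n" + "I" * 100 + "\n"

short_is_barcode = looks_like_barcode_read(read_lengths(io.StringIO(short_fastq)))
long_is_barcode = looks_like_barcode_read(read_lengths(io.StringIO(long_fastq)))

print(short_is_barcode, long_is_barcode)
```

For a real gzipped download, the handle can be opened with `gzip.open(path, "rt")` instead of `io.StringIO`.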