Hello,

-rw-r--r-- 1 5062 5000  753851810 Nov 2 2018 xxxx_L001_R1_001.fastq.gz
-rw-r--r-- 1 5062 5000 1772725195 Nov 2 2018 xxxx_L001_R2_001.fastq.gz
-rw-r--r-- 1 5062 5000  748651163 Nov 2 2018 xxxxx_S1_L002_R1_001.fastq.gz
-rw-r--r-- 1 5062 5000 1763628623 Nov 2 2018 xxxxx_L002_R2_001.fastq.gz
Question: shouldn't xxxx_L001_R1_001.fastq.gz and xxxxx_L001_R2_001.fastq.gz have the same size?
Thank you in advance.
Get some background first and follow guided tutorials. There is no need to use Trimmomatic on regular single-cell/10x data. Also make yourself familiar with how these libraries look, what R1 and R2 are (CB/UMI, cDNA), etc. There is lots of online material on that.
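On the file-size question itself: the two files of a pair should contain the same number of reads, not the same number of bytes. R1 typically holds only the short CB/UMI read while R2 holds the longer cDNA read, so R2 is usually much larger. A minimal sketch for checking this, assuming standard four-line FASTQ records; the function names are hypothetical and not part of any tool mentioned here:

```python
import gzip


def count_fastq_records(path):
    """Count reads in a gzipped FASTQ file, assuming 4 lines per record."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def same_read_count(r1_path, r2_path):
    """Return True if both mates contain the same number of records."""
    return count_fastq_records(r1_path) == count_fastq_records(r2_path)
```

If `same_read_count("xxxx_L001_R1_001.fastq.gz", "xxxx_L001_R2_001.fastq.gz")` returns True, the pair is consistent even though the byte sizes differ.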
Hi ATpoint, would you mind explaining why we don't have to trim 10x scRNA-seq data? I noticed that the Cell Ranger workflow takes care of the TSO and poly-A, but I'm not sure it trims general Illumina adapter sequences, although STAR will do soft-clipping. So I was wondering why there's no need to use trimming tools for 10x single-cell data. I've seen the Illumina universal adapter sequence reported by FastQC in 10x scRNA-seq 3' data.
For R1, CellRanger uses only the CB and UMI positions (for example, the first 28 bp in a v3 chemistry dataset), so it ignores everything beyond that. For R2, the STAR aligner (which CellRanger uses) can soft-clip parts of the read that do not align properly. That's why trimming is not mandatory, though you can do it if you feel safer. As far as I know, people generally don't for 10x data.
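To illustrate the R1 point, here is a minimal sketch of taking only the leading CB/UMI positions and discarding the rest of the read. It assumes the 28 bp of a v3 read split into a 16 bp cell barcode followed by a 12 bp UMI; that split, the constants, and the function name are assumptions for illustration, not CellRanger's actual code:

```python
# Assumed v3 layout: 16 bp cell barcode + 12 bp UMI = 28 bp.
CB_LEN = 16
UMI_LEN = 12


def split_r1(seq, cb_len=CB_LEN, umi_len=UMI_LEN):
    """Return (cell_barcode, umi) from the start of an R1 read.

    Bases beyond cb_len + umi_len are ignored, so adapter sequence
    further into R1 never affects the result.
    """
    if len(seq) < cb_len + umi_len:
        raise ValueError("R1 read is shorter than CB + UMI length")
    cell_barcode = seq[:cb_len]
    umi = seq[cb_len:cb_len + umi_len]
    return cell_barcode, umi
```

Anything sequenced past position 28 in R1, adapter included, simply never enters the barcode or UMI, which is why trimming R1 changes nothing under this assumption.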