Question

RNASeq data has 3 files R1, R2 and R3, which one is the barcode file?

2.5 years ago

leranwangcs ▴ 150

Hi,

I received some RNASeq files to analyze, but as a newbie to RNASeq I got confused when I checked them. There are R1, R2 and R3 files for each sample:

lane1_NoIndex_L001_R1_001.fastq.gz lane1_NoIndex_L001_R2_001.fastq.gz lane1_NoIndex_L001_R3_001.fastq.gz

lane2_NoIndex_L002_R1_001.fastq.gz lane2_NoIndex_L002_R2_001.fastq.gz lane2_NoIndex_L002_R3_001.fastq.gz

And here is what the R1 file looks like:

[Image: preview of the R1 file, not preserved]

Here is what R2 looks like:

[Image: preview of the R2 file, not preserved]

Here is what R3 looks like:

[Image: preview of the R3 file, not preserved]

I also found a Sample_Key.txt file for the data, which looks like this:

[Image: contents of Sample_Key.txt, not preserved]

It looks like the 3 .fastq.gz files per sample all have different read lengths, so I'm not sure which one is the barcode file, which one is R1 and which one is R2. And how do I use the Sample_Key.txt file to identify which is which?

Thanks so much!!

RNASeq • 1.4k views

updated 2.5 years ago by GenoMax 148k • written 2.5 years ago by leranwangcs ▴ 150

R2 is probably the index read, which you do not need if the data are already demultiplexed. What kind of barcodes are we talking about? Is this single-cell, or some custom thing?
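A quick way to check which file holds the index read is to compare read lengths, since an index read is normally much shorter than the biological reads. The following is a minimal sketch, not a definitive tool: the helper name `read_length_counts` and the `max_reads` cutoff are made up for illustration, and it assumes standard 4-line FASTQ records in gzipped files.

```python
import gzip
from collections import Counter
from itertools import islice


def read_length_counts(path, max_reads=10000):
    """Tally sequence lengths over the first `max_reads` records of a gzipped FASTQ.

    Assumes plain 4-line FASTQ records (header, sequence, '+', qualities).
    """
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        # Only look at the first max_reads records (4 lines each).
        for i, line in enumerate(islice(handle, max_reads * 4)):
            if i % 4 == 1:  # the sequence is the 2nd line of every record
                counts[len(line.rstrip("\r\n"))] += 1
    return counts


# Example usage with the file names from the question:
# for read in ("R1", "R2", "R3"):
#     path = f"lane1_NoIndex_L001_{read}_001.fastq.gz"
#     print(read, read_length_counts(path).most_common(3))
```

If one of the three files consists of very short reads of a single fixed length, that file is the most likely candidate for the index/barcode read.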

2.5 years ago by ATpoint 86k

If this is single-cell RNAseq data (which you don't say), and specifically 10x data, then the R3 read is likely the one carrying the cell barcode + UMI. Not sure why someone would flip the file names, though: in 10x data the barcode + UMI read is generally R1.
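To make "UMI + cell barcode" concrete, here is a small sketch of how such a read is split. It assumes the 10x 3' v3 layout (a 16 bp cell barcode followed by a 12 bp UMI; v2 chemistry uses a 10 bp UMI instead), which may not apply to this dataset. The names `BarcodeRead` and `split_barcode_read` are hypothetical and only for illustration.

```python
from typing import NamedTuple


class BarcodeRead(NamedTuple):
    cell_barcode: str
    umi: str


def split_barcode_read(seq, cb_len=16, umi_len=12):
    """Split a barcode read into (cell barcode, UMI).

    Defaults assume a 10x 3' v3-style layout: 16 bp barcode, then 12 bp UMI.
    Pass umi_len=10 for a v2-style layout. Extra trailing bases are ignored.
    """
    if len(seq) < cb_len + umi_len:
        raise ValueError(
            f"read is {len(seq)} bp, shorter than the expected {cb_len + umi_len} bp"
        )
    return BarcodeRead(seq[:cb_len], seq[cb_len:cb_len + umi_len])
```

If the reads in the suspected barcode file are around 26 or 28 bp long, that would be consistent with this layout; a very different length suggests a different protocol.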