Hi,
I received some RNA-seq files to analyze, but as a newcomer to RNA-seq I got very confused when I looked at them. There are R1, R2 and R3 files for each sample:
lane1_NoIndex_L001_R1_001.fastq.gz lane1_NoIndex_L001_R2_001.fastq.gz lane1_NoIndex_L001_R3_001.fastq.gz
lane2_NoIndex_L002_R1_001.fastq.gz lane2_NoIndex_L002_R2_001.fastq.gz lane2_NoIndex_L002_R3_001.fastq.gz
Here is what the R1 file looks like:
Here is what R2 looks like:
Here is what R3 looks like:
I also found a Sample_Key.txt file for the data which looks like this:
It looks like the 3 .fastq.gz files per sample all have different lengths, so I'm not sure which one is the barcode file, which one is R1 and which one is R2. How can I use the Sample_Key.txt file to tell which is which?
Thanks so much!!
R2 is probably the index read which you do not need if data are already demultiplexed. What kind of barcodes are we talking about? Is this single-cell, or some custom thing?
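If it helps, one quick way to check which file is the index read is to tabulate the read lengths in each file, since an index read is usually much shorter than the cDNA reads. Below is a rough Python sketch, not a definitive tool. The function name `read_length_counts` and the `max_reads` cutoff are my own choices, and it assumes ordinary 4-line FASTQ records (no wrapped sequence lines).

```python
import gzip
import sys
from collections import Counter
from itertools import islice


def read_length_counts(path, max_reads=10000):
    """Count sequence lengths among the first `max_reads` records of a FASTQ file.

    Assumes each record is exactly 4 lines (header, sequence, '+', qualities).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        # The sequence is the 2nd line of every 4-line record.
        for seq in islice(handle, 1, max_reads * 4, 4):
            counts[len(seq.rstrip("\r\n"))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python read_lengths.py file1.fastq.gz file2.fastq.gz ...
    for fastq_path in sys.argv[1:]:
        counts = read_length_counts(fastq_path)
        summary = ", ".join(f"{length} bp x {n}" for length, n in sorted(counts.items()))
        print(f"{fastq_path}: {summary}")
```

Run it on the R1, R2 and R3 files of one lane and compare. The file whose reads are all the same short length is the most likely candidate for the index or barcode read, but that is only a heuristic.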
If this is single-cell RNA-seq data (which you don't say), then R3 is likely the read carrying the UMI + cell barcodes, assuming this is 10x data. I'm not sure why someone would flip the file names, though, since that read is generally R1.
Did you try looking for your given barcodes in your files?
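To make that concrete, here is a small Python sketch that counts how many of the first reads in a file contain each barcode, checking the reverse complement as well because the orientation is not known. The layout of Sample_Key.txt is not something I can see, so the barcodes are passed in by hand. The names `count_barcode_hits` and `reverse_complement` are hypothetical helpers of mine, not part of any existing package.

```python
import gzip
import sys
from itertools import islice

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_barcode_hits(path, barcodes, max_reads=100000):
    """Count reads (among the first `max_reads`) containing each barcode.

    A read counts as a hit if it contains the barcode or its reverse
    complement anywhere in the sequence. Assumes 4-line FASTQ records.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    # Precompute both orientations once per barcode.
    patterns = {bc: (bc.upper(), reverse_complement(bc.upper())) for bc in barcodes}
    hits = {bc: 0 for bc in barcodes}
    with opener(path, "rt") as handle:
        for line in islice(handle, 1, max_reads * 4, 4):
            seq = line.strip().upper()
            for bc, (fwd, rev) in patterns.items():
                if fwd in seq or rev in seq:
                    hits[bc] += 1
    return hits


if __name__ == "__main__":
    # Usage: python find_barcodes.py reads.fastq.gz BARCODE1 BARCODE2 ...
    if len(sys.argv) > 2:
        for barcode, n in count_barcode_hits(sys.argv[1], sys.argv[2:]).items():
            print(f"{barcode}\t{n}")
```

If one of the three files shows hits for most of the barcodes in the sample key and the other two show almost none, that file is most likely the barcode read. Short barcodes will also match by chance inside long reads, so treat low counts with caution.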