Bcl2fastq conversion according to samples (demultiplexing)
5.6 years ago

Hello,

I am running bcl2fastq to convert and demultiplex my RNA-Seq data, but the FASTQ files I get are split by lane rather than by sample.

My command:

sudo bcl2fastq --input-dir ./Data/Intensities/BaseCalls -R ./ --no-lane-splitting --sample-sheet ./SampleSheet.csv

My sample sheet:

[Header],,,,
Date,2019-04-12,,,
Workflow,GenerateFASTQ,,,
Application,FASTQOnly,,,
Assay,TruSeq,,,
Description,,,,
Chemistry,,,,
,,,,
,,,,
[Reads],,,,
72,,,,
72,,,,
,,,,
,,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0,0_DPI_MOCK,AGTCAA,Anjali
1,F2,2_DPI_MOCK,AGTTCC,Anjali
1,F5,5_DPI_MOCK,ATGTCA,Anjali
1,F7,7_DPI_MOCK,CCGTCC,Anjali
2,VF0,0_DPI_INFECTED,CGATGT,Anjali
2,VF2,2_DPI_INFECTED,TGACCA,Anjali
2,VF5,5_DPI_INFECTED,ACAGTG,Anjali
2,VF7,7_DPI_INFECTED,GCCAAT,Anjali
2,VF9,9_DPI_INFECTED,CAGATC,Anjali
2,VF10,10_DPI_INFECTED,CTTGTA,Anjali
3,F9,9_DPI_MOCK,GTCCGC,Anjali
3,F10,10_DPI_MOCK,GTGAAA,Anjali
3,F0,0_DPI_MOCK,AGTCAA,Anjali
3,F2,2_DPI_MOCK,AGTTCC,Anjali
4,CphiX,CONTROL,,Anjali
5,F5,5_DPI_MOCK,ATGTCA,Anjali
5,F7,7_DPI_MOCK,CCGTCC,Anjali
5,F9,9_DPI_MOCK,GTCCGC,Anjali
5,F10,10_DPI_MOCK,GTGAAA,Anjali
6,VF0,0_DPI_INFECTED,CGATGT,Anjali
6,VF2,2_DPI_INFECTED,TGACCA,Anjali
6,VF5,5_DPI_INFECTED,ACAGTG,Anjali
6,VF7,7_DPI_INFECTED,GCCAAT,Anjali
7,VF9,9_DPI_INFECTED,CAGATC,Anjali
7,VF10,10_DPI_INFECTED,CTTGTA,Anjali
7,VF0,0_DPI_INFECTED,CGATGT,Anjali
7,VF2,2_DPI_INFECTED,TGACCA,Anjali
8,VF5,5_DPI_INFECTED,ACAGTG,Anjali
8,VF7,7_DPI_INFECTED,GCCAAT,Anjali
8,VF9,9_DPI_INFECTED,CAGATC,Anjali
8,VF10,10_DPI_INFECTED,CTTGTA,Anjali

Thank you.

RNA-Seq next-gen bcl2fastq demultiplexing • 4.4k views

It doesn't help that your columns are in a non-standard order. It's quite likely that that broke things. They should be:

Lane,Sample_ID,Sample_Name,index,Sample_project

Thank you, Devon.

I have updated my SampleSheet.csv.

What did you get? What did you expect? There is a [data] section header missing. Did you try to find out what's going on from the stdout output?

Dear sklages, I am trying to convert the raw data (BCL files) from Illumina GAIIx into fastq files. I want FASTQ files per sample, but I am getting FASTQ files per lane (i.e. 16 FASTQ files, R1 and R2 for each lane). I have also updated my SampleSheet.csv and rerun the program.

Thank you.

5.6 years ago
GenoMax 147k

I am trying to convert the raw data (BCL files) from Illumina GAIIx into fastq files.

That does not sound right. GAIIx was one of the older Illumina sequencers and it never had BCL files. You must surely be using data from a new machine.

If you want sample-level files across all lanes, use identical names in both Sample_ID and Sample_Name, e.g. 0_DPI_INFECTED if that sample is part of a pool run on all lanes.

6,0_DPI_INFECTED,0_DPI_INFECTED,CGATGT,Anjali
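
For a long sample sheet this renaming can be scripted. Here is a minimal Python sketch, assuming the rows of the [Data] section have been copied into a string and use the column names shown in this thread. The helper name unify_sample_names is made up for illustration and is not part of bcl2fastq.

```python
import csv
import io


def unify_sample_names(data_csv: str) -> str:
    """Copy Sample_Name into Sample_ID for every row of a [Data] table."""
    reader = csv.DictReader(io.StringIO(data_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Identical Sample_ID and Sample_Name, as suggested above.
        row["Sample_ID"] = row["Sample_Name"]
        writer.writerow(row)
    return out.getvalue()


# Two rows taken from the sample sheet in the question.
data = """Lane,Sample_ID,Sample_Name,index,Sample_project
2,VF0,0_DPI_INFECTED,CGATGT,Anjali
6,VF0,0_DPI_INFECTED,CGATGT,Anjali
"""
print(unify_sample_names(data))
```

The sketch only rewrites the two name columns; the [Header] and [Reads] sections would have to be kept as they are when the table is pasted back into SampleSheet.csv.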

Sorry if it doesn't sound right.

Thank you for the advice, I will try this.

But our samples were not pooled; we used biological triplicates for each sample, e.g. 0_DPI_INFECTED has 3 biological replicates in 3 different lanes (lanes 2, 6 and 7).

Sorry, I am not good with technical terms.

If your samples weren't pooled you couldn't have more than one on a lane.

We have our samples in 3 different lanes, e.g. the 0_DPI_INFECTED samples are in 3 different lanes (lanes 2, 6 and 7).

If those are biological replicates then you would want individual files for them from the three lanes.

Think of it this way. If Sample_1 ran in three lanes as part of a pool (technical replicates), then you can generate a single sample-level file for that sample by giving it the identical name Sample_1 in all three lanes. If Sample_1_Rep1, Sample_1_Rep2 and Sample_1_Rep3 ran in three separate lanes, you would want separate files for those biological replicates.

Yes, you are correct. This is what I want to do for each sample. Sorry, I was not very clear in my question.

Then edit your sample sheet so it looks something like this (just showing one sample below):

2,0_DPI_INFECTED_Rep1,0_DPI_INFECTED_Rep1,CGATGT,Anjali
6,0_DPI_INFECTED_Rep2,0_DPI_INFECTED_Rep2,CGATGT,Anjali
7,0_DPI_INFECTED_Rep3,0_DPI_INFECTED_Rep3,CGATGT,Anjali

--no-lane-splitting is not useful if you don't have technical reps or if your sample did not run in more than one lane.
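
The replicate naming above can also be generated rather than typed by hand. This is a rough Python sketch, assuming every repeat of a Sample_Name further down the table is the next biological replicate (which would be wrong for technical replicates). The function name label_replicates is hypothetical.

```python
import csv
import io
from collections import defaultdict


def label_replicates(data_csv: str) -> str:
    """Append _Rep1, _Rep2, ... to each repeated Sample_Name, in row order."""
    reader = csv.DictReader(io.StringIO(data_csv))
    seen = defaultdict(int)  # how many times each base name has appeared so far
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        base = row["Sample_Name"]
        seen[base] += 1
        name = f"{base}_Rep{seen[base]}"
        # Use the same replicate name in both columns.
        row["Sample_ID"] = name
        row["Sample_Name"] = name
        writer.writerow(row)
    return out.getvalue()


# The three lanes of one sample from the original sample sheet.
data = """Lane,Sample_ID,Sample_Name,index,Sample_project
2,VF0,0_DPI_INFECTED,CGATGT,Anjali
6,VF0,0_DPI_INFECTED,CGATGT,Anjali
7,VF0,0_DPI_INFECTED,CGATGT,Anjali
"""
print(label_replicates(data))
```

Numbering follows the order of the rows, so the table should be sorted by lane before it is passed in.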

I have updated my SampleSheet.csv file:

[Header],,,,
Date,2019-04-12,,,
Workflow,GenerateFASTQ,,,
Application,FASTQOnly,,,
Assay,TruSeq,,,
Description,,,,
Chemistry,,,,
,,,,
,,,,
[Reads],,,,
72,,,,
72,,,,
,,,,
,,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0_rep_1,F0_rep_1,AGTCAA,Anjali
1,F2_rep_1,F2_rep_1,AGTTCC,Anjali
1,F5_rep_1,F5_rep_1,ATGTCA,Anjali
1,F7_rep_1,F7_rep_1,CCGTCC,Anjali
2,VF0_rep_1,VF0_rep_1,CGATGT,Anjali
2,VF2_rep_1,VF2_rep_1,TGACCA,Anjali
2,VF5_rep_1,VF5_rep_1,ACAGTG,Anjali
2,VF7_rep_1,VF7_rep_1,GCCAAT,Anjali
2,VF9_rep_1,VF9_rep_1,CAGATC,Anjali
2,VF10_rep_1,VF10_rep_1,CTTGTA,Anjali
3,F9_rep_1,F9_rep_1,GTCCGC,Anjali
3,F10_rep_1,F10_rep_1,GTGAAA,Anjali
3,F0_rep_2,F0_rep_2,AGTCAA,Anjali
3,F2_rep_2,F2_rep_2,AGTTCC,Anjali
4,CphiX,CphiX,,Anjali
5,F5_rep_2,F5_rep_2,ATGTCA,Anjali
5,F7_rep_2,F7_rep_2,CCGTCC,Anjali
5,F9_rep_2,F9_rep_2,GTCCGC,Anjali
5,F10_rep_2,F10_rep_2,GTGAAA,Anjali
6,VF0_rep_2,VF0_rep_2,CGATGT,Anjali
6,VF2_rep_2,VF2_rep_2,TGACCA,Anjali
6,VF5_rep_2,VF5_rep_2,ACAGTG,Anjali
6,VF7_rep_2,VF7_rep_2,GCCAAT,Anjali
7,VF9_rep_2,VF9_rep_2,CAGATC,Anjali
7,VF10_rep_2,VF10_rep_2,CTTGTA,Anjali
7,VF0_rep_3,VF0_rep_3,CGATGT,Anjali
7,VF2_rep_3,VF2_rep_3,TGACCA,Anjali
8,VF5_rep_3,VF5_rep_3,ACAGTG,Anjali
8,VF7_rep_3,VF7_rep_3,GCCAAT,Anjali
8,VF9_rep_3,VF9_rep_3,CAGATC,Anjali
8,VF10_rep_3,VF10_rep_3,CTTGTA,Anjali

But still not getting the desired result.

My code:

bcl2fastq --input-dir ./Data/Intensities/BaseCalls -R ./ --sample-sheet ./SampleSheet.csv --create-fastq-for-index-reads

But still not getting the desired result.

We can't read your mind or see what files you are getting. Can you provide a listing of the FASTQ files (ls -1 *.fastq.gz) and explain why that is not the result you want?

output of

ls -1 *.fastq.gz

Undetermined_S0_L001_I1_001.fastq.gz
Undetermined_S0_L001_R1_001.fastq.gz
Undetermined_S0_L001_R2_001.fastq.gz
Undetermined_S0_L002_I1_001.fastq.gz
Undetermined_S0_L002_R1_001.fastq.gz
Undetermined_S0_L002_R2_001.fastq.gz
Undetermined_S0_L003_I1_001.fastq.gz
Undetermined_S0_L003_R1_001.fastq.gz
Undetermined_S0_L003_R2_001.fastq.gz
Undetermined_S0_L005_I1_001.fastq.gz
Undetermined_S0_L005_R1_001.fastq.gz
Undetermined_S0_L005_R2_001.fastq.gz
Undetermined_S0_L006_I1_001.fastq.gz
Undetermined_S0_L006_R1_001.fastq.gz
Undetermined_S0_L006_R2_001.fastq.gz
Undetermined_S0_L007_I1_001.fastq.gz
Undetermined_S0_L007_R1_001.fastq.gz
Undetermined_S0_L007_R2_001.fastq.gz
Undetermined_S0_L008_I1_001.fastq.gz
Undetermined_S0_L008_R1_001.fastq.gz
Undetermined_S0_L008_R2_001.fastq.gz

These are split by lane, but I want separate files for my samples.

Are you using Illumina Experiment Manager to make the sample sheet? If not, you may want to give that a try (note: it is a Windows-only application).

A properly formatted SampleSheet.csv has more fields than the ones you have specified. It should have the following in the Data section (this is a random example from Illumina Experiment Manager):

[Data],,,,,,,
Lane,Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,Sample_Project
1,Sample_1,,,,AD002,CGATGT,
3,Sample_2,,,,AD006,GCCAAT,
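
To see which of those columns a given sheet lacks, something like the following Python sketch could be used. It only compares the header row of the [Data] section against the column list from the example above; whether bcl2fastq actually needs every one of these columns is not something this check can tell you. The names IEM_COLUMNS, data_header and missing_columns are invented for this example.

```python
import csv

# Column names copied from the Illumina Experiment Manager example above.
IEM_COLUMNS = [
    "Lane", "Sample_ID", "Sample_Name", "Sample_Plate",
    "Sample_Well", "I7_Index_ID", "index", "Sample_Project",
]


def data_header(sheet_text):
    """Return the column names on the line that follows the [Data] marker."""
    lines = sheet_text.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.strip().lower().startswith("[data]"):
            return next(csv.reader([lines[i + 1]]))
    raise ValueError("no [Data] section with a header row found")


def missing_columns(sheet_text):
    """List the example columns that are absent (case-insensitive match)."""
    present = {col.strip().lower() for col in data_header(sheet_text)}
    return [col for col in IEM_COLUMNS if col.lower() not in present]


# Shortened version of the sample sheet posted in this thread.
sheet = """[Header],,,,
Workflow,GenerateFASTQ,,,
[Data],,,,
Lane,Sample_ID,Sample_Name,index,Sample_project
1,F0_rep_1,F0_rep_1,AGTCAA,Anjali
"""
print(missing_columns(sheet))  # ['Sample_Plate', 'Sample_Well', 'I7_Index_ID']
```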

Sorry, but I have confirmed that my data is from an Illumina GAIIx system. IEM is not compatible with the GAIIx. Thank you.

It's highly likely that whoever told you that was wrong. But in the unlikely event that they're correct you'll have to use an old version of bcl2fastq, since no recent versions are compatible with a GAIIx.

Are you getting the project directories as well? There should be an Anjali directory with a subdirectory for each library.
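
Since ls -1 *.fastq.gz only looks at the top-level directory, a recursive listing makes it easier to check the project directory. Below is a small Python sketch, assuming the per-sample files sit somewhere under the output directory; the exact layout can depend on the bcl2fastq version and the sample sheet. The function name and the demo file names are made up, and the demo builds a fake directory tree only to show the result.

```python
import tempfile
from pathlib import Path


def fastqs_by_directory(output_dir):
    """Group every *.fastq.gz under output_dir by its containing directory."""
    root = Path(output_dir)
    grouped = {}
    for path in sorted(root.rglob("*.fastq.gz")):
        key = path.parent.relative_to(root).as_posix()  # "." means top level
        grouped.setdefault(key, []).append(path.name)
    return grouped


# Demo on a fake layout (empty files with invented names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample_dir = root / "Anjali" / "VF0_rep_1"
    sample_dir.mkdir(parents=True)
    (sample_dir / "VF0_rep_1_S5_L002_R1_001.fastq.gz").touch()
    (root / "Undetermined_S0_L002_R1_001.fastq.gz").touch()
    listing = fastqs_by_directory(root)

for directory, names in listing.items():
    print(directory, names)
```

On a real run you would call fastqs_by_directory with the bcl2fastq output directory instead of the temporary one.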

Yes, you are right, I will look into them. Thank you.
