10X scRNA-seq sample sheet generation issue
6.7 years ago

The answer can be found here: https://bioinformatics.stackexchange.com/questions/4024/10x-illumina-demultiplexing-sample-sheet-issue/4046#4046

I am trying to generate a sample sheet for my 10X single-cell data, using the following tool:

https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/bcl2fastq-direct

When I enter a sample ID and a sample index and click Add, nothing happens, either because I need to enter something else or because the tool is not functional. My sample data is shown below:

SampleID  Name  Species       Project  NucleicAcid  Well  Index1Name  Index1Sequence
U382_01   17R   Homo sapiens  DSC      DNA          A01   SIGAB401    ACTTCATA
U382_02   17R   Homo sapiens  DSC      DNA          A02   SIGAB402    GAGATGAC
U382_03   17R   Homo sapiens  DSC      DNA          A03   SIGAB403    TGCCGTGG
U382_04   17R   Homo sapiens  DSC      DNA          A04   SIGAB404    CTAGACCT
U382_05   19RL  Homo sapiens  DSC      DNA          A05   SIGAB501    AATAATGG
U382_06   19RL  Homo sapiens  DSC      DNA          A06   SIGAB502    CCAGGGCA
U382_07   19RL  Homo sapiens  DSC      DNA          A07   SIGAB503    TGCCTCAT
U382_08   19RL  Homo sapiens  DSC      DNA          A08   SIGAB504    GTGTCATC

So, for the first sample, I enter U382_01 in the first field of the tool and SIGAB401 in the second, then click Add, but nothing happens. I am also not sure which lane I should select, or which column in the sample data indicates the lane. Any suggestions would be greatly appreciated.

sequencing bcl fastq 10X • 5.5k views

That does not look right. Remember that each sample ID you have will get four 10x indexes.

SampleID  Name  Species       Project  NucleicAcid  Well  Index1Name  Index1Sequence
U382      17R   Homo sapiens  DSC      DNA          A01   SIGAB401    ACTTCATA
U382      17R   Homo sapiens  DSC      DNA          A02   SIGAB402    GAGATGAC
U382      17R   Homo sapiens  DSC      DNA          A03   SIGAB403    TGCCGTGG
U382      17R   Homo sapiens  DSC      DNA          A04   SIGAB404    CTAGACCT

The tool probably requires JavaScript to be enabled. If you have it turned off in your browser, the Add button may not work.

My example from yesterday:

Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-A1_1,Sample_1,GGTTTACT,Chromium_20180405
1,SI-GA-A1_2,Sample_1,CTAAACGG,Chromium_20180405
1,SI-GA-A1_3,Sample_1,TCGGCGTC,Chromium_20180405
1,SI-GA-A1_4,Sample_1,AACCGTAA,Chromium_20180405
1,SI-GA-B1_1,Sample_2,GTAATCTT,Chromium_20180405
1,SI-GA-B1_2,Sample_2,TCCGGAAG,Chromium_20180405
1,SI-GA-B1_3,Sample_2,AGTTCGGC,Chromium_20180405
1,SI-GA-B1_4,Sample_2,CAGCATCA,Chromium_20180405

What sequencer was this run on? Was it a common pool that ran on multiple lanes? In that case, the same pool listing would be repeated for the second lane, like so:

Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-A1_1,Sample_1,GGTTTACT,Chromium_20180405
1,SI-GA-A1_2,Sample_1,CTAAACGG,Chromium_20180405
1,SI-GA-A1_3,Sample_1,TCGGCGTC,Chromium_20180405
1,SI-GA-A1_4,Sample_1,AACCGTAA,Chromium_20180405
1,SI-GA-B1_1,Sample_2,GTAATCTT,Chromium_20180405
1,SI-GA-B1_2,Sample_2,TCCGGAAG,Chromium_20180405
1,SI-GA-B1_3,Sample_2,AGTTCGGC,Chromium_20180405
1,SI-GA-B1_4,Sample_2,CAGCATCA,Chromium_20180405
2,SI-GA-A1_1,Sample_1,GGTTTACT,Chromium_20180405
2,SI-GA-A1_2,Sample_1,CTAAACGG,Chromium_20180405
2,SI-GA-A1_3,Sample_1,TCGGCGTC,Chromium_20180405
2,SI-GA-A1_4,Sample_1,AACCGTAA,Chromium_20180405
2,SI-GA-B1_1,Sample_2,GTAATCTT,Chromium_20180405
2,SI-GA-B1_2,Sample_2,TCCGGAAG,Chromium_20180405
2,SI-GA-B1_3,Sample_2,AGTTCGGC,Chromium_20180405
2,SI-GA-B1_4,Sample_2,CAGCATCA,Chromium_20180405
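
The repetition above can also be scripted rather than copy-pasted. This is only a minimal sketch, assuming the same pool really was loaded on every lane you list; `expand_to_lanes` is a hypothetical helper name, not part of any 10x or Illumina tool.

```python
# Sketch: repeat one pool's sample sheet rows once per lane.
# The rows are copied from the example above; the helper name is made up.

POOL_ROWS = [
    # (Sample_ID, Sample_Name, index, Sample_Project)
    ("SI-GA-A1_1", "Sample_1", "GGTTTACT", "Chromium_20180405"),
    ("SI-GA-A1_2", "Sample_1", "CTAAACGG", "Chromium_20180405"),
    ("SI-GA-A1_3", "Sample_1", "TCGGCGTC", "Chromium_20180405"),
    ("SI-GA-A1_4", "Sample_1", "AACCGTAA", "Chromium_20180405"),
    ("SI-GA-B1_1", "Sample_2", "GTAATCTT", "Chromium_20180405"),
    ("SI-GA-B1_2", "Sample_2", "TCCGGAAG", "Chromium_20180405"),
    ("SI-GA-B1_3", "Sample_2", "AGTTCGGC", "Chromium_20180405"),
    ("SI-GA-B1_4", "Sample_2", "CAGCATCA", "Chromium_20180405"),
]


def expand_to_lanes(rows, lanes):
    """Return CSV lines with every row repeated for each lane, lane by lane."""
    lines = ["Lane,Sample_ID,Sample_Name,index,Sample_Project"]
    for lane in lanes:
        for row in rows:
            lines.append(",".join([str(lane), *row]))
    return lines


if __name__ == "__main__":
    print("\n".join(expand_to_lanes(POOL_ROWS, lanes=[1, 2])))
```

Passing `lanes=[1, 2, 3, 4]` would give the four-lane version of the same listing.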

In RunInfo.xml I have NB501830 under the Instrument tag. It must be a common pool. So, how can I determine the lane? Do you mean that I actually have only two samples, with each sample's information spread over 4 lines of the table that I posted? JavaScript is enabled in my browser. I tried the tool on two operating systems, Ubuntu and macOS, and it does not work on either. It does not even show what is wrong with my input, so I suppose the tool is either broken or implemented sloppily, without proper error handling. So, I guess I need to generate the sample sheet file myself. I still do not understand the lanes: do I have two lanes if I have 2 samples (each spread over 4 rows)? Should I ask the people who ran the experiment about the lanes?


No, you can't tell whether this is a common pool from the RunInfo.xml file. I am not 100% sure, but an NB* serial number may indicate that this is a NextSeq. NextSeq flowcells have 4 lanes, but all lanes carry the same pool, since the lanes in a NextSeq flowcell are not fluidically independent.
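
What RunInfo.xml can tell you is the instrument ID and how many lanes the flowcell has, though not how the samples were pooled. A rough sketch of reading those two values follows. It assumes the file has an `Instrument` element and a `FlowcellLayout` element with a `LaneCount` attribute, as Illumina RunInfo.xml files typically do; the XML below is a trimmed, made-up example in which only the instrument ID comes from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed RunInfo.xml content. Only NB501830 is taken
# from the thread; the run ID and layout values are placeholders.
EXAMPLE_RUNINFO = """<RunInfo>
  <Run Id="example_run" Number="1">
    <Instrument>NB501830</Instrument>
    <FlowcellLayout LaneCount="4" SurfaceCount="2" SwathCount="3" TileCount="12" />
  </Run>
</RunInfo>
"""


def instrument_and_lanes(xml_text):
    """Return (instrument ID, lane count) parsed from RunInfo.xml text."""
    root = ET.fromstring(xml_text)
    instrument = root.findtext(".//Instrument")
    layout = root.find(".//FlowcellLayout")
    # LaneCount is missing only if the layout element itself is absent
    lane_count = int(layout.get("LaneCount")) if layout is not None else None
    return instrument, lane_count


if __name__ == "__main__":
    print(instrument_and_lanes(EXAMPLE_RUNINFO))
```

For a real run you would read the text from the RunInfo.xml in the run folder instead of the embedded string.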

Edit: Based on your last post, where you said you had L001 through L004 folders, this must be a NextSeq run. The same pool of samples would go on one flowcell. The sample nomenclature still looks odd, but if there are only 8 rows then you likely have two samples in the pool.
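
One way to see the "two samples" reading is to group the eight rows by index set. The sketch below assumes that a name like SIGAB401 means oligo 1 of index set SI-GA-B4; that naming scheme is a guess, not something confirmed by the people who made the libraries.

```python
import re
from collections import defaultdict

# (Index1Name, Index1Sequence) pairs copied from the table in the question.
ROWS = [
    ("SIGAB401", "ACTTCATA"),
    ("SIGAB402", "GAGATGAC"),
    ("SIGAB403", "TGCCGTGG"),
    ("SIGAB404", "CTAGACCT"),
    ("SIGAB501", "AATAATGG"),
    ("SIGAB502", "CCAGGGCA"),
    ("SIGAB503", "TGCCTCAT"),
    ("SIGAB504", "GTGTCATC"),
]


def split_index_name(name):
    """Split e.g. 'SIGAB401' into ('SI-GA-B4', 1). The scheme is an assumption."""
    match = re.fullmatch(r"SIGA([A-H])(\d{1,2})0([1-4])", name)
    if match is None:
        raise ValueError(f"unrecognised index name: {name}")
    plate_row, column, oligo = match.groups()
    return f"SI-GA-{plate_row}{column}", int(oligo)


def group_by_index_set(rows):
    """Map each index set name to its sequences, in the order they appear."""
    groups = defaultdict(list)
    for name, sequence in rows:
        index_set, _oligo = split_index_name(name)
        groups[index_set].append(sequence)
    return dict(groups)


if __name__ == "__main__":
    for index_set, sequences in group_by_index_set(ROWS).items():
        print(index_set, sequences)
```

Under that assumption the eight rows collapse into two index sets of four sequences each, which would fit two samples in the pool.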

You should really track down the people who made these libraries and get all the relevant information. Otherwise you are going to keep spinning your wheels for no good reason.

Note: I just tried the 10x tool again on macOS using Firefox and had no problem generating the sample sheet.


Could you tell me, just for testing purposes, what you entered in the two fields? I am trying Firefox and nothing happens.


I enter Sample_1 in the sample field. As I start typing GA- in the index set field, the tool automatically shows possible options, and I complete or choose the right one, in this case SI-GA-A1. Once I click the Add button, the sample sheet contents appear in the black box below. I am copying them here:

[Header]
EMFileVersion,4

[Reads]
26
98

[Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-A1_1,Sample_1,GGTTTACT,Chromium_20180407
1,SI-GA-A1_2,Sample_1,CTAAACGG,Chromium_20180407
1,SI-GA-A1_3,Sample_1,TCGGCGTC,Chromium_20180407
1,SI-GA-A1_4,Sample_1,AACCGTAA,Chromium_20180407

The full set of 10x index sequences is available at this LINK. If you can get the people who made the libraries to tell you which index set they used with which sample, you can make the file above trivially.
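
Once the sample to index set association is known, producing the [Data] rows is mechanical. The following is a minimal sketch, not the 10x tool's own logic: `INDEX_SETS` holds only the two sets whose sequences appear in this thread, and `data_section` is a hypothetical helper name.

```python
# Only the index sets quoted in this thread; a real table would hold the
# full list of 10x index sets.
INDEX_SETS = {
    "SI-GA-A1": ["GGTTTACT", "CTAAACGG", "TCGGCGTC", "AACCGTAA"],
    "SI-GA-B1": ["GTAATCTT", "TCCGGAAG", "AGTTCGGC", "CAGCATCA"],
}


def data_section(samples, project, lane=1):
    """Build the [Data] lines from a {sample name: index set name} mapping."""
    lines = ["[Data]", "Lane,Sample_ID,Sample_Name,index,Sample_Project"]
    for sample_name, index_set in samples.items():
        # One row per oligo in the set, numbered _1 to _4 as in the tool's output
        for number, sequence in enumerate(INDEX_SETS[index_set], start=1):
            lines.append(
                f"{lane},{index_set}_{number},{sample_name},{sequence},{project}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(data_section({"Sample_1": "SI-GA-A1"}, "Chromium_20180407")))
```

An unknown index set name raises a `KeyError`, which is deliberate here: it is better to stop than to write a sheet with a guessed index.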


OK, I was copy-pasting; that is why it was not working. When I started typing, it worked. So, I generated the following file:

[Header]
EMFileVersion,4

[Reads]
26
98

[Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
1,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
1,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
1,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
1,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
1,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
1,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
1,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406

So, if I have L001 through L004 folders, does that mean I should now repeat the same rows, changing only the lane number? That would give:

[Header]
EMFileVersion,4

[Reads]
26
98

[Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
1,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
1,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
1,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
1,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
1,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
1,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
1,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
2,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
2,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
2,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
2,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
2,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
2,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
2,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
2,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
3,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
3,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
3,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
3,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
3,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
3,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
3,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
3,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
4,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
4,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
4,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
4,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
4,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
4,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
4,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
4,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406

Or can we not be sure, so that I need to ask the people who produced the data?


The format of the file looks fine now. What you need to know for sure is the sample <-> 10x index pool association. That can only be confirmed by the people who made the libraries. But at this point you should be able to execute a test run.
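
Before a test run, a quick consistency check of the [Data] section can catch copy-paste slips. This is only a sketch: the rule of four indexes per sample per lane comes from this thread, `check_data_section` is a hypothetical helper, and it says nothing about whether the sample to index association itself is right.

```python
import csv
import io
from collections import Counter

# Single-lane sheet copied from the post above.
SHEET = """[Header]
EMFileVersion,4

[Reads]
26
98

[Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
1,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
1,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
1,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
1,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
1,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
1,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
1,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
"""


def check_data_section(sheet_text):
    """Return a list of problems found in the [Data] section (empty if none)."""
    lines = [line.strip() for line in sheet_text.splitlines()]
    start = lines.index("[Data]") + 1
    data_lines = [line for line in lines[start:] if line]
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))

    problems = []
    # The same index twice in one lane cannot be demultiplexed unambiguously
    for (lane, index), count in Counter((r["Lane"], r["index"]) for r in rows).items():
        if count > 1:
            problems.append(f"lane {lane}: index {index} used {count} times")
    # Each sample is expected to carry four indexes per lane (per this thread)
    for (lane, name), count in Counter((r["Lane"], r["Sample_Name"]) for r in rows).items():
        if count != 4:
            problems.append(f"lane {lane}: sample {name} has {count} indexes, expected 4")
    return problems


if __name__ == "__main__":
    print(check_data_section(SHEET) or "no problems found")
```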


Do you have multiplexed samples, more than one to a lane? If not, you or whoever gave you these files could make the FASTQs with a stripped-down sample sheet that omits the indices, giving 2 FASTQs per lane. Older versions of cellranger required the FASTQs to be interleaved, but now you can use them just as bcl2fastq makes them.

6.6 years ago
GenoMax 147k

This question has been answered by the OP here (posting the link separately to provide closure to this thread): https://bioinformatics.stackexchange.com/questions/4024/10x-illumina-demultiplexing-sample-sheet-issue/4046#4046


An update: 10x software will accept a simple sample sheet that looks like this:

Lane,Sample,Index
*,Sample_1,SI-GA-A3
*,Sample_2,SI-GA-A4
*,Sample_3,SI-GA-A5

and so on. Save as a .csv file.

Add lane numbers if you have distinct sample pools in different lanes.
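
If you have many samples, the simple layout can be written with a few lines of Python. This is a minimal sketch using the standard `csv` module; `simple_samplesheet` and the output file name are hypothetical, and the sample to index set pairs are the ones from the example above.

```python
import csv
import io


def simple_samplesheet(samples, lane="*"):
    """Return the simple Lane,Sample,Index layout as CSV text.

    samples maps sample name -> 10x index set name; lane="*" means all lanes.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Lane", "Sample", "Index"])
    for sample, index_set in samples.items():
        writer.writerow([lane, sample, index_set])
    return buffer.getvalue()


if __name__ == "__main__":
    text = simple_samplesheet(
        {"Sample_1": "SI-GA-A3", "Sample_2": "SI-GA-A4", "Sample_3": "SI-GA-A5"}
    )
    # Hypothetical output file name
    with open("simple_samplesheet.csv", "w", newline="") as handle:
        handle.write(text)
```

For distinct pools in different lanes, call it once per pool with `lane=1`, `lane=2`, and so on, keeping only the first header line when you join the results.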
