Also posted on bioinformatics stackexchange.
I am trying to run bcl2fastq to generate fastq files from the bcl files that I got from a 10X single-cell experiment run. I get the following exception when I run bcl2fastq:
I am running it through the following bash script, generate_fastq.sh, which I wrote myself:
#!/bin/bash
FLOWCELL_DIR="/scratch/nv4e/kipnis/180403_NB501830_0158_AHN3LLBGX5"
OUTPUT_DIR="/scratch/nv4e/kipnis/fastq"
INTEROP_DIR="/scratch/nv4e/kipnis/180403_NB501830_0158_AHN3LLBGX5/InterOp"
SAMPLE_SHEET_PATH="/scratch/nv4e/kipnis/sample_sheet.csv"
bcl2fastq --use-bases-mask=Y26,I8,Y98 --create-fastq-for-index-reads --minimum-trimmed-read-length=8 --mask-short-adapter-reads=8 --ignore-missing-positions --ignore-missing-controls --ignore-missing-filter --ignore-missing-bcls -r 6 -w 6 -R ${FLOWCELL_DIR} --output-dir=${OUTPUT_DIR} --interop-dir=${INTEROP_DIR} --sample-sheet=${SAMPLE_SHEET_PATH}
So, apparently something is wrong with my sample sheet. I looked into RunInfo.xml and there I see 3 reads:
I used the sample sheet generator: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/bcl2fastq-direct
and got the following file, sample_sheet.csv:
[Header]
EMFileVersion,4
[Reads]
26
8
98
[Data]
Lane,Sample_ID,Sample_Name,index,Sample_Project
1,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
1,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
1,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
1,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
1,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
1,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
1,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
1,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
2,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
2,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
2,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
2,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
2,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
2,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
2,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
2,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
3,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
3,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
3,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
3,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
3,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
3,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
3,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
3,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
4,SI-GA-B4_1,17R,ACTTCATA,Chromium_20180406
4,SI-GA-B4_2,17R,GAGATGAC,Chromium_20180406
4,SI-GA-B4_3,17R,TGCCGTGG,Chromium_20180406
4,SI-GA-B4_4,17R,CTAGACCT,Chromium_20180406
4,SI-GA-B5_1,19RL,AATAATGG,Chromium_20180406
4,SI-GA-B5_2,19RL,CCAGGGCA,Chromium_20180406
4,SI-GA-B5_3,19RL,TGCCTCAT,Chromium_20180406
5,SI-GA-B5_4,19RL,GTGTCATC,Chromium_20180406
What is wrong with my .csv? What am I doing wrong?
cellranger generated the same error: Could not parse the CSV stream text:
Here is a more detailed error from the generated _stderr file: https://ibb.co/bxiJ4x
Looks like it is the carriage return/line feed difference. You can use dos2unix file.csv to convert CRLF to LF. If dos2unix is not on your system then you would know what to do.
I just got the same error with bcl2fastq on my own project... someone thought it was clever to spell 'naive' with a diaeresis. Since the visible characters look okay in what you posted, it must be a white space character, as Genomax suggested.
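One way to look for the invisible characters mentioned above is to print every line that contains a byte outside printable ASCII. This is only a sketch: check_hidden_chars is a made-up helper name, and it assumes a grep that supports -a (GNU and BSD grep both do). A carriage return shows up as ^M, and a UTF-8 character such as a letter with a diaeresis shows up as an M- sequence.

```shell
#!/bin/bash
# check_hidden_chars is a hypothetical helper, not part of bcl2fastq or cellranger.
# It prints every line of a file that contains a byte outside printable ASCII
# (carriage returns, tabs, UTF-8 multi-byte characters) and returns 1 if any exist.
check_hidden_chars() {
    local file="$1" hits
    # LC_ALL=C makes grep work on raw bytes; -a forces text mode; -n adds line numbers.
    # The bracket expression matches any byte that is not between space and tilde.
    hits=$(LC_ALL=C grep -a -n '[^ -~]' "$file")
    if [ -n "$hits" ]; then
        # cat -v makes the hidden bytes visible (for example ^M for a carriage return).
        printf '%s\n' "$hits" | cat -v
        return 1
    fi
    echo "no hidden characters found in $file"
}
```

For example, check_hidden_chars /scratch/nv4e/kipnis/sample_sheet.csv would list the offending line numbers, or print a single "no hidden characters" message if the file is clean. Tabs are flagged as well, which seems reasonable for a comma-separated sample sheet.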
There is no dos2unix installed, so I tried to use tr -d '\r' < input > output and perl -pi -e 's/\r\n/\n/g' input from the following thread: https://unix.stackexchange.com/questions/277217/how-to-install-dos2unix-on-linux-without-root-access
But the error stays the same.
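If it helps, here is a minimal sketch that does the same tr conversion and then checks whether any carriage returns are left, so the conversion step can be ruled in or out. strip_cr is a hypothetical helper name. It only handles carriage returns, so any other stray byte (for example a non-ASCII character) would still be in the output file.

```shell
#!/bin/bash
# strip_cr is a hypothetical helper: it writes a copy of a file with every
# carriage return removed, then reports how many lines still contain one.
strip_cr() {
    local src="$1" dst="$2" cr remaining
    tr -d '\r' < "$src" > "$dst"
    # Build a literal carriage return in a portable way.
    cr=$(printf '\r')
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    remaining=$(grep -c "$cr" "$dst" || true)
    echo "lines with CR remaining: $remaining"
    [ "$remaining" -eq 0 ]
}
```

Writing to a separate output file and then passing that file to --sample-sheet also makes it easier to be sure bcl2fastq is reading the converted copy and not the original.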
Why does your error show the reads settings as follows (your screen cap included below)?
That should be
correct?
No, it should be
I have 3 reads. I do not understand why the _stderr file is showing that, because I am feeding the correct file into it. That seems very weird to me.
Why are you modifying the output from the official sample sheet generator?
It should be:
You don't have 3 reads. You have 2 reads and an index read.
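To see how the run itself classifies each read, the Read entries in RunInfo.xml can be listed together with their index flag. This sketch assumes the usual Illumina layout, where each entry looks like <Read Number="..." NumCycles="..." IsIndexedRead="Y or N" />. The names list_reads and xml_attr are made up for illustration, and the attribute names should be checked against your own file.

```shell
#!/bin/bash
# xml_attr and list_reads are hypothetical helpers for illustration only.
# xml_attr prints the value of one attribute from a single XML tag string.
xml_attr() {
    printf '%s\n' "$1" | sed -n "s/.* $2=\"\([^\"]*\)\".*/\1/p"
}

# list_reads prints one line per <Read .../> entry, marking index reads.
# Assumes attributes named Number, NumCycles and IsIndexedRead.
list_reads() {
    local runinfo="$1" tag kind
    # "<Read " with a trailing space skips the enclosing <Reads> element.
    grep -o '<Read [^>]*>' "$runinfo" | while IFS= read -r tag; do
        if [ "$(xml_attr "$tag" IsIndexedRead)" = "Y" ]; then
            kind="index read"
        else
            kind="sequencing read"
        fi
        echo "read $(xml_attr "$tag" Number): $(xml_attr "$tag" NumCycles) cycles, $kind"
    done
}
```

Run as list_reads "${FLOWCELL_DIR}/RunInfo.xml". An entry flagged as an index read corresponds to the I8 part of --use-bases-mask=Y26,I8,Y98 and is not one of the two sequencing reads.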
I changed it to two reads, and it is still not working. Removing everything above [Data] gives a sample sheet formatting error. Everything looks good in the sample sheet, so either I need to find out what is actually wrong with it, using some kind of parsing or verifying program that would tell me which line is wrong, or something else is going wrong.
Are you able to run the test included in the software (look for the tinyBCL dataset)? Let's make sure your installation works properly.
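On the request above for a program that would tell which line is wrong: I am not aware of an official validator, but a rough check can be sketched with awk. The function below (check_data_section is a made-up name) only looks at the rows after [Data], and it only tests two things: that every row has as many comma-separated fields as the column header row, and that no row contains a byte outside printable ASCII. It is not a full implementation of the sample sheet format.

```shell
#!/bin/bash
# check_data_section is a hypothetical helper, not an official validator.
# It reports [Data] rows with hidden bytes or an unexpected column count
# and returns 1 if it found any problem.
check_data_section() {
    LC_ALL=C awk -F',' '
        /^\[Data\]/ { in_data = 1; next }   # rows after this marker are checked
        /^\[/       { in_data = 0; next }   # any other section header ends the check
        !in_data    { next }
        /^$/        { next }                # ignore completely empty lines
        /[^ -~]/    { printf "line %d: non-printable or non-ASCII byte\n", NR; bad = 1 }
        ncols == 0  { ncols = NF; next }    # first row after [Data] is the column header
        NF != ncols { printf "line %d: expected %d columns, found %d\n", NR, ncols, NF; bad = 1 }
        END         { exit bad + 0 }
    ' "$1"
}
```

Calling check_data_section sample_sheet.csv prints nothing for a clean file. A file with Windows line endings would produce one "non-printable" message per row, which is a quick way to confirm the CRLF theory.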
Sure, their sample test works perfectly.
Now I am close to getting stumped. So the problem is clearly your samplesheet file. Is there a SampleSheet.csv file in the raw data folder you have? Can you rename it to something else to ensure that cellranger is reading the file you made using their tool?
Can you also verify what @swbarnes2 commented on: C: bcl2fastq: Could not parse the CSV stream text
You can also contact 10x tech support to see if they have a solution.
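For the renaming step suggested above, a small sketch that moves an existing SampleSheet.csv out of the way without deleting it. set_aside_sample_sheet and the .bak suffix are arbitrary choices made for this example.

```shell
#!/bin/bash
# set_aside_sample_sheet is a hypothetical helper: it renames SampleSheet.csv
# inside a run folder so that only an explicitly passed sample sheet can be used.
set_aside_sample_sheet() {
    local run_dir="$1"
    local sheet="$run_dir/SampleSheet.csv"
    if [ ! -f "$sheet" ]; then
        echo "no SampleSheet.csv in $run_dir"
        return 0
    fi
    # Refuse to overwrite an earlier backup.
    if [ -e "$sheet.bak" ]; then
        echo "backup already exists: $sheet.bak" >&2
        return 1
    fi
    mv "$sheet" "$sheet.bak"
    echo "renamed $sheet to $sheet.bak"
}
```

With the script from the question this would be set_aside_sample_sheet "${FLOWCELL_DIR}", run once before calling bcl2fastq.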
Oh my god, thank you so much! I cannot tell you how much time this saved me. I actually made this account just in case you see this at some point. This was a spreadsheet that never touched a Windows machine, save for a Microsoft file sharing server, which apparently was enough to corrupt it. None of the Unix-based software I used saw an issue until bcl2fastq.