Dear all,
I would like to align some reads to a reference using STAR. The following command works perfectly:
STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq --outFileNamePrefix outputFolder --runThreadN 8 > message.txt
However, the following command does not work:
STAR --genomeDir output/spiking/index/star --readFilesIn reads.fastq.gz --outFileNamePrefix outputFolder --runThreadN 8 > message.txt
Error: the read ID should start with @ or >
Presumably, STAR expects a FASTQ file and not a fastq.gz file. Does anybody know how to get round this in an efficient way?
Thanks.
C.
Thanks. That didn't work for me, probably because I am on Mac OS X. But I found the thread below, and using 'gunzip -c' instead of 'zcat' worked. https://groups.google.com/forum/#!topic/rna-star/NY56kU3mC64
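For anyone landing here: the 'zcat' versus 'gunzip -c' choice above presumably refers to STAR's --readFilesCommand option, which names a command that STAR runs on each input file to get plain FASTQ text on stdout. A minimal sketch, assuming the same paths as in the question (the guard around the STAR call is only there so the snippet does nothing harmful when STAR or the reads are missing):

```bash
#!/usr/bin/env bash
# Sketch: let STAR decompress the reads itself via --readFilesCommand.
# Paths are taken from the question; adjust them to your own layout.
reads=reads.fastq.gz

# On macOS, plain zcat expects a .Z suffix, so 'gunzip -c' is the safer choice.
decompress=(gunzip -c)

if command -v STAR >/dev/null 2>&1 && [ -f "$reads" ]; then
    STAR --genomeDir output/spiking/index/star \
         --readFilesIn "$reads" \
         --readFilesCommand "${decompress[@]}" \
         --outFileNamePrefix outputFolder \
         --runThreadN 8 > message.txt
else
    echo "STAR or $reads not found; command not run" >&2
fi
```

On Linux, 'zcat' in place of 'gunzip -c' should behave the same way.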
Actually, you can use the bash shell hack
<(gunzip -c filename.gz)
to pass a gzipped file (or, similarly, any other kind of compressed file) to a program that doesn't have a built-in mechanism for reading compressed files directly (STAR is awesome in providing the built-in mechanism :). It uses a shell trick called Process Substitution. In essence, the command inside <( ) is run in a subshell, and its output is handed to the main command (which is your main program) as if it were an input file.
Hi Santosh, Thanks for the reply. I am afraid this sounds a bit too technical for me. Do you know what the whole command would look like using this bash shell hack? C.
Suppose STAR did not support gzipped files such as reads.fastq.gz; you would then run STAR like this:
STAR --genomeDir output/spiking/index/star --readFilesIn <(gunzip -c reads.fastq.gz) --outFileNamePrefix outputFolder --runThreadN 8 > message.txt
Essentially, you put the unzipping command, together with the zipped file, inside <( ).
The same mechanism works for any other program that doesn't support zipped files natively.
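If you want to see what the shell is doing without running STAR, here is a small sketch you can paste into bash. The toy read and the file name toy.fastq.gz are made up for illustration; head and awk simply stand in for "any program that takes a file name".

```bash
#!/usr/bin/env bash
# Build a tiny gzipped FASTQ file (made-up toy data) in a temporary directory.
tmpdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$tmpdir/toy.fastq.gz"

# The shell replaces <( ... ) with a file name connected to the command's output.
echo <(gunzip -c "$tmpdir/toy.fastq.gz")    # prints something like /dev/fd/63

# Any program that accepts a file name can read the decompressed data from it.
first_line=$(head -n 1 <(gunzip -c "$tmpdir/toy.fastq.gz"))
n_reads=$(awk 'END { print NR / 4 }' <(gunzip -c "$tmpdir/toy.fastq.gz"))
echo "first line: $first_line, reads: $n_reads"

# Clean up the temporary files.
rm -r "$tmpdir"
```

Note that this needs bash (or zsh); a plain POSIX sh does not understand <( ), and the data can only be read once from start to end, so programs that need to seek within the file will not work this way.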
Thanks, I didn't know about this!