Hi fellow gurus,
I am trying to feed multiple gzipped, paired-end fastq files into hisat2 for alignment. One method of input is to generate a comma-separated list of the files within a directory.
I have generated this list as a text file; however, hisat2 is not recognising the inputs as fastq files. They are still gzipped, and it is not clear whether hisat2 is aware of this and would pipe them through zcat. Decompressing first and then running hisat2 works.
Is there an option to make hisat2 aware of gzipped files in "paired-end, multi-file" mode?
${HISAT2}/hisat2 -x $REFO -S hisat2/${k}_o.sam --dta-cufflinks -p $MAXCPU -1 R1.txt -2 R2.txt
Please excuse the bad coding, this is a snippet.
Cheers
Nick
You need to specify the files (for one sample) on the command line in identical order for the -1 and -2 options. From the HISAT2 manual:
-1 flyA_1.fq,flyB_1.fq -2 flyA_2.fq,flyB_2.fq
Most aligners nowadays accept compressed files, so there is no need to uncompress them first. Note: You can't align multiple samples together in one command line.
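For anyone landing here later, a minimal sketch of building the two lists in matching order. The helper names join_by_comma and mates_for are made up for illustration, and it assumes (hypothetically) that mates differ only by _R1/_R2 in the file name and that the names contain no spaces or commas.

```shell
# Join all arguments with commas and no spaces, the form shown in the
# manual excerpt above for -1 and -2.
join_by_comma() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Derive the mate list from the R1 list so both lists share the same order.
# Assumption: mates differ only by "_R1." versus "_R2." in the file name.
mates_for() {
    printf '%s\n' "$1" | sed 's/_R1\./_R2./g'
}

# Hypothetical usage (the lane*/ layout and the sample name are assumptions):
#   r1=$(join_by_comma lane*/sample_R1.fastq.gz)
#   r2=$(mates_for "$r1")
#   ${HISAT2}/hisat2 -x "$REFO" -p "$MAXCPU" --dta-cufflinks \
#       -1 "$r1" -2 "$r2" -S hisat2/sample.sam
```

Deriving the -2 list from the -1 list, rather than globbing twice, is just one way to guarantee that the two lists stay in the same order.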
Thanks @genomx2,
Can the fastq files be gzipped? Or do they have to be uncompressed fastq?
Cheers
Nick
gzipped files should be fine.
Hmmmm, I am still getting errors,
terminate called after throwing an instance of 'int' (ERR): hisat2-align died with signal 6 (ABRT)
I am wondering if the fact that they are symbolic links is an issue?
I ran them through zcat | less and they look like fastq files to me! #stumped
Test them with gunzip -t filename.gz
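A rough sketch of running that check over many files at once, which also catches dangling symbolic links. The function name check_fastq_gz is hypothetical; it reads each file through stdin so that symlinks are followed by the shell rather than by gzip.

```shell
# Report every input that is missing, is a dangling symlink, or fails the
# gzip integrity test. Returns non-zero if any file has a problem.
check_fastq_gz() {
    status=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            # -e follows symlinks, so a link to a missing target fails here
            echo "missing or dangling link: $f" >&2
            status=1
        elif ! gzip -t < "$f" 2>/dev/null; then
            echo "failed gzip test: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Hypothetical usage:
#   check_fastq_gz lane*/*_R1.fastq.gz lane*/*_R2.fastq.gz && echo "all OK"
```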
Unzipping them is fine and I am sure hisat2 is gz aware.
On the off chance that you're actually naming things R1.txt and R2.txt, then rename them to R1.fq.gz and R2.fq.gz.
I was writing the R1.txt and R2.txt files from ls -m <sample>_R1.fastq.gz and ls -m <sample>_R2.fastq.gz respectively.
So they were technically a file of comma separated lists. I have resorted to decompressing and processing accordingly (this is working).
I have folders by lane and was hoping to combine them all, as the library was sequenced over many lanes. The other way is to cat them into one file. I was hoping this would work, but it hasn't so far.
A file of comma separated lists? That won't ever work. Just use a comma separated list.
Hmmm. Thanks @Devon Ryan, I will need to come back to this at some stage; setting the ls -m output to a variable and using that is still throwing the same error.
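One possible reason, offered as a guess rather than a diagnosis: ls -m separates names with a comma followed by a space and can wrap long output over several lines, so the variable is not a plain a,b,c list. A minimal sketch of stripping that whitespace follows; normalise_list is a made-up name, and it assumes the file names contain no spaces or commas.

```shell
# Turn ls -m style text ("a, b,<newline>c") into "a,b,c" by deleting
# every space and newline. Assumption: no spaces inside file names.
normalise_list() {
    tr -d ' \n'
}

# Hypothetical usage (quote the variables so the shell passes each list
# as a single argument):
#   r1=$(ls -m lane*/sample_R1.fastq.gz | normalise_list)
#   r2=$(ls -m lane*/sample_R2.fastq.gz | normalise_list)
#   ${HISAT2}/hisat2 -x "$REFO" -p "$MAXCPU" -1 "$r1" -2 "$r2" -S hisat2/sample.sam
```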
Resorted to alignment of samples by lane and will merge the BAM files in the end. Another way to skin the cat I suppose.
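For the merge step, a small dry-run sketch that only prints a samtools merge command instead of running it. The function name and the file paths are hypothetical, and it assumes the per-lane BAMs have already been sorted the same way (for example with samtools sort).

```shell
# Print (do not run) a samtools merge command: output BAM first, then the
# per-lane input BAMs. Assumption: inputs are already sorted consistently.
print_merge_cmd() {
    out=$1
    shift
    printf 'samtools merge %s' "$out"
    for bam in "$@"; do
        printf ' %s' "$bam"
    done
    printf '\n'
}

# Hypothetical usage; inspect the printed line before running it yourself:
#   print_merge_cmd sample_merged.bam lane1/sample.bam lane2/sample.bam
```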