Hello everyone,
I am running GSNAP on the human NA12878 genome (NIST HG001_HiSeq_300x), but the alignment is taking far too long. I have downloaded the NIST HG38 reference file. My sequencing files are roughly 12 GB each (paired-end), and the reference is around 3 GB. I submit a shell script with qsub to a cluster with a light workload (qsub -lnodes=1:ppn=4).
I first built the index with this command:
gmap_build -d /gmap_index/index_name /human_reference_NIST_HG38.fa
The running time of this step is acceptable: a few hours on a cluster node.
I then ran GSNAP to align the reads to my reference genome. The command used is:
gsnap -D /gmap_index_1/index_name -d index_name /workdir/NA12878_R1.fastq /workdir/NA12878_R2.fastq -t 2 -A sam
The shell script has been running for close to a week now. What am I doing wrong? Is there a mistake in the indexing or mapping command line?
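To help narrow this down, here is a minimal sketch of how the throughput could be measured on a small subsample before committing to the full run. The helper name subsample_fastq, the 100000-read sample size, and the output file names are assumptions for illustration only; the gsnap options are the same ones used in the command above.

```shell
#!/bin/bash
# Hypothetical helper: copy the first N reads of a FASTQ file.
# A FASTQ record is 4 lines, so N reads = N * 4 lines.
subsample_fastq() {
    # $1 = input FASTQ, $2 = number of reads, $3 = output FASTQ
    head -n "$(( $2 * 4 ))" "$1" > "$3"
}

# Time GSNAP on the subsample (only if gsnap and the input files exist).
# The sample size and output paths below are assumptions, not recommendations.
if command -v gsnap >/dev/null 2>&1 && [ -f /workdir/NA12878_R1.fastq ]; then
    subsample_fastq /workdir/NA12878_R1.fastq 100000 /workdir/sub_R1.fastq
    subsample_fastq /workdir/NA12878_R2.fastq 100000 /workdir/sub_R2.fastq
    time gsnap -D /gmap_index_1/index_name -d index_name -t 2 -A sam \
        /workdir/sub_R1.fastq /workdir/sub_R2.fastq > /workdir/sub.sam
fi
```

Taking the same number of reads from the start of both files keeps the pairs in sync. Scaling the measured time by the ratio of total reads to sampled reads gives a rough estimate of the full run time, assuming throughput stays roughly constant across the file.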
Thanks a lot for any help!