Entering edit mode
7.3 years ago
KVC_bioinfo
▴
600
Hello All,
I am using GMAP aligner to map the Oxford nanopore reads to hg19 genome. I obtained the pre-built genome. I have fasta file. (convterted from fastq) I am using the following command::
path/to/bin/gmap -D /path/to/default/genome/dir/ -d /path/to/hg19/dir -g path/to/fasta/small.fa -t 3 -f samse
When I use this command I get following:
Checking compiler assumptions for SSE2: 6B8B4567 327B23C6 xor=59F066A1
Checking compiler assumptions for SSE4.1: -103 -58 max=-58 => compiler sign extends
Checking compiler options for SSE4.2: 6B8B4567 __builtin_clz=1 __builtin_ctz=0 _mm_popcnt_u32=17 __builtin_popcount=17
Finished checking compiler assumptions
Reading from stdin
And this never proceeds. I had a huge fasta file initially I also tried to subsample it. and tried the same command with a small file. It never proceeded. Could someone help to figure out what I am doing wrong?
Thank you very much
Are you using the right options for GMAP? PacBio tutorial seems to show different ones that you have.
README
I was using the readme file provided by the authors.
The readme doesn't say to use
-g
to specify your query fasta. Your command doesn't specify a query fasta, that's why it's waiting for input on stdin.Yes got it. Thank you very much for pointing that out. I have changed my command now to
However, I am not sure how do I save the alignment to a sam file. I tried "-f samse" or "-f sampe but when I used that it says no paths found for ..... and when I don't use it prints all the alignments on the terminal.
You will have to redirect the output to a file
/path/to/gmapl -D /path/to/genome/dir -d database_name -S /path/to/query -t 4 > file.sam
Or pipe to
| samtools sort -o alignment.bam
got it thank you very much.
One last question:
How do we get the mapping statistics in gmap?
Like % uniquely mapped reads
non uniquely mapped reads etc
/path/to/gmapl -D /path/to/genome/dir -d database_name -S /path/to/query -t 4 > file.sam 2> gmap_align.log
See if the log file captures stats. Otherwise usepileup.sh
from BBMap or Qualimap or samtools to get the stats.Thank you very much. The alignment is running now. However, it's been more than 5 hours it has not finished yet. my query sequence has on an average 2.5 million reads which I am trying to align with human genome.
Is the time required normal?