hisat2 location does not exist
9 months ago
Eric • 0

I am new to upstream RNA sequencing analysis, and I am trying to align FASTQ reads to the human genome index downloaded from https://genome-idx.s3.amazonaws.com/hisat/hg19_genome.tar.gz. The reference was extracted and stored at /media/extdata/e4/genomes/hg19/genome/, and these are the files listed when I run ls:

genome.1.ht2  genome.2.ht2  genome.3.ht2  genome.4.ht2  genome.5.ht2  genome.6.ht2  genome.7.ht2  genome.8.ht2  make_hg19.sh

But when I try to run the command:

hisat2 -p 32 -x /media/extdata/e4/genomes/hg19/genome -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam

The error (ERR): "/media/extdata/e4/genomes/hg19/genome" does not exist is returned. When I add / to the end of the -x parameter, like this: /media/extdata/e4/genomes/hg19/genome/, I get this error instead: Could not locate a HISAT2 index corresponding to basename "/media/extdata/e4/genomes/hg19/genome/" Error: Encountered internal HISAT2 exception (#1).

The interesting part is that even when I run the command like this: hisat2 --dta -p 16 -x ~/ -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam, it also returns the (ERR): "/home/user/" does not exist error. Can anyone help me out? Many thanks.


What do these commands tell you:

file /media/extdata/e4/genomes/hg19/genome*
df -h /media/extdata/e4/genomes/hg19/
whoami
groups
stat /media/extdata/e4/genomes/hg19/

Thanks for the responses, here's what I got:

file /media/extdata/e4/genomes/hg19/genome 

/media/extdata/e4/genomes/hg19/genome: directory, 

df -h /media/extdata/e4/genomes/hg19/ 

/dev/sde         15T  232G   14T    2% /media/extdata/e4, 

whoami 

myusername

groups 

myusername adm cdrom sudo dip plugdev lpadmin lxd sambashare docker, 

 stat /media/extdata/e4/genomes/hg19/ 

 /media/extdata/e4/genomes/hg19/: File type: Directory Size: 4096 bytes Blocks: 8 IO block size: 4096 bytes Device: 840h/2112d Inode: 100401176 Hard links: 3 Permissions: 0775/drwxrwxr-x Uid: 1000/myusername Gid: ( 1000/myusername ). 

Any suggestions?


Is your username myusername? Show the actual output instead of replacing it with a different word. We are trying to figure out the problem here, and replacing words just sows more confusion.

Also, show just the output from all the commands in one block, instead of writing "this returns that". I had to remove a lot of content and reformat the post to make it reasonably readable, and even after that it is quite hard to read, especially since you seem to have altered the output.

Don't worry, people can tell which output belongs to which command. Next time, write it like this: I've entered

file /media/extdata/e4/genomes/hg19/genome*
df -h /media/extdata/e4/genomes/hg19/
whoami
groups
stat /media/extdata/e4/genomes/hg19/

and it produced

output here

Thanks, here's what I got:

/media/extdata/e4/genomes/hg19/genome: directory

Filesystem       Size  Used Avail Use% Mounted on
/dev/sde       15T  232G  14T   2% /media/extdata/e4

lq

lq adm cdrom sudo dip plugdev lpadmin lxd sambashare docker

Size: 4096           Blocks: 8          IO Block Size: 4096
Device: 840h/2112d     Inode: 100401176   Links: 3
Permissions: (0775/drwxrwxr-x)   Uid: ( 1000/   lq)   Gid: ( 1000/   lq)
Access: 2024-02-20 10:33:29.500987218 +0800
Modify: 2024-02-20 10:33:27.712979656 +0800
Change: 2024-02-20 10:33:27.712979656 +0800
Birth: 2024-02-19 14:05:48.602623515 +0800

The error

/home/user/ does not exist

indicates that the $HOME variable is not set quite right. According to your settings, that path should be /home/lq.

In addition, the fact that

file /media/extdata/e4/genomes/hg19/genome*

returns only a single line indicates that the index is not present in that location. Or perhaps you did not type the command as instructed and left the * off the end.

Going back to the original post: show us the exact commands and their outputs, without adding any other qualifiers or explanations.

Show the actual ls command and its output with no other words:

ls -l /media/extdata/e4/genomes/hg19/genome*

... output of ls goes here ...

# show the precise command you run with hisat2
....

#now show the exact output of hisat2 without any other formatting
...

Always paste in both the command you ran and its output, rather than just saying what you typed, because small mistakes can always occur.
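
A minimal sketch of why a glob such as genome* can print a single line even though the index files exist. It assumes the layout described in the original post (a directory named genome that holds the genome.N.ht2 files) and rebuilds it with empty stand-in files under a temporary directory; the demo paths are hypothetical, not the poster's real ones.

```shell
# Hypothetical demo layout under a temporary directory.
demo=$(mktemp -d)
mkdir -p "$demo/hg19/genome"
for i in 1 2 3 4 5 6 7 8; do
  : > "$demo/hg19/genome/genome.$i.ht2"   # empty stand-ins for the index files
done

# The pattern is expanded inside hg19/, where the only match is the directory.
set -- "$demo"/hg19/genome*
outer=$#

# One level down, the same pattern matches the eight index files.
set -- "$demo"/hg19/genome/genome*
inner=$#

echo "matches for hg19/genome*: $outer"
echo "matches for hg19/genome/genome*: $inner"

rm -rf "$demo"
```

If the first count is 1 and the second is 8, the pattern is matching the directory itself and the files live one level below it.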


The hisat2 command is a Perl wrapper script that in turn launches another program, hisat2-align, via the default shell.

To me, your error (if the filesystems do indeed exist) appears to indicate that the subshells that get launched do not inherit the proper values.

It kind of sounds like these are some sort of emulated, mounted filesystems that are not visible.


I am accessing the server via SSH. Is there anything that I can do about it?


Thank you for your patience, here are the commands and results returned.

file /media/extdata/e4/genomes/hg19/genome*
/media/extdata/e4/genomes/hg19/genome: directory

df -h /media/extdata/e4/genomes/hg19/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde       15T   232G   14T   2% /media/extdata/e4

whoami
lq

groups
lq adm cdrom sudo dip plugdev lpadmin lxd sambashare docker

stat /media/extdata/e4/genomes/hg19/
File: /media/extdata/e4/genomes/hg19/
Size: 4096
Blocks: 8
IO Block size: 4096
Directory
Device: 840h/2112d
Inodes: 100401176
Hard links: 3
Permissions: (0775/drwxrwxr-x)
Uid: ( 1000/      lq)
Gid: ( 1000/      lq)
Access time: 2024-02-22 10:09:05.759151619 +0800
Modify time: 2024-02-20 10:33:27.712979656 +0800
Change time: 2024-02-20 10:33:27.712979656 +0800
Create time: 2024-02-19 14:05:48.602623515 +0800

ls -l /media/extdata/e4/genomes/hg19/genome*

Total blocks 4326516

-rw-rw-r-- 1 lq lq  969972432 Nov 20 2015 genome.1.ht2
-rw-rw-r-- 1 lq lq  724327620 Nov 20 2015 genome.2.ht2
-rw-rw-r-- 1 lq lq    4814 Nov 20 2015 genome.3.ht2
-rw-rw-r-- 1 lq lq  724327616 Nov 20 2015 genome.4.ht2
-rw-rw-r-- 1 lq lq 1274093467 Nov 20 2015 genome.5.ht2
-rw-rw-r-- 1 lq lq  737586876 Nov 20 2015 genome.6.ht2
-rw-rw-r-- 1 lq lq      8 Nov 20 2015 genome.7.ht2
-rw-rw-r-- 1 lq lq      8 Nov 20 2015 genome.8.ht2
-rwxrwxr-x 1 lq lq    1287 Nov 20 2015 make_hg19.sh
-rw-rw-r-- 1 lq lq       0 Feb 20 2024 SRR10502962.hisat.sam

Here is the hisat2 command I ran:

hisat2 -p 32 -x /media/extdata/e4/genomes/hg19/genome -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam 

 (ERR): "/media/extdata/e4/genomes/hg19/genome" does not exist
 Exiting now ...

Adding / at the end of the path returns this error:

hisat2 -p 32 -x /media/extdata/e4/genomes/hg19/genome/ -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam 
Could not locate a HISAT2 index corresponding to basename "/media/extdata/e4/genomes/hg19/genome/"
Error: Encountered internal HISAT2 exception (#1)
Command: /home/lq/miniconda3/envs/RNA_seq/bin/hisat2-align-s --wrapper basic-0 -p 32 -x /media/extdata/e4/genomes/hg19/genome/ -S SRR10502962.hisat.sam --read-lengths 150,149,147,146,148,142,141,145,136,132,131,144,140,130,139,143,134,123,137,119,116,124,133,126,129,125,121,115,127,120,105,135,122,113,112,97,138,114,94,128,117,111,109,103,36,118,110,95,92,77,69,57,108,107,101,99,78,104,102,98,96,93,90,87,83,75,74,71,68,65,62,60,59,58,55,54,44,38,37,106,91,86,85,82,80,76,73,72,67,66,64,56,53,52,51,49,46,45,42 -1 /tmp/168214.inpipe1 -2 /tmp/168214.inpipe2 
(ERR): hisat2-align exited with value 1

I hope this makes the situation clear.


Remember that the index has to be given as a prefix of the file names, not as a directory, so don't add / at the end of it. When you type

hisat2 -p 32 -x /media/extdata/e4/genomes/hg19/genome

and then press TAB twice, does it show you the files?

genome.1.ht2
genome.2.ht2
...

These are the files that will be read.

My guess is that this problem is extraordinarily confusing precisely because the explanation might be ludicrously simple.

I have found that trivial problems cause the most confusing errors (for example, you are running ls and hisat2 on different systems or different filesystems, or something like that).

There is nothing magical about the index.

The program will read eight files; in the -x parameter, you have to give their common prefix. That's all.

There is nothing special to it, but the files must be accessible when the program runs.
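
To make "common prefix" concrete, here is a small sketch that derives it from the name of the first index file with shell parameter expansion. The path is a hypothetical placeholder, not taken from this thread; substitute the full path of your own .1.ht2 file.

```shell
# Hypothetical path to the first index file; substitute your own.
first_file="/path/to/index_dir/genome.1.ht2"

# Strip the ".1.ht2" suffix; what remains is the basename to pass to -x.
prefix="${first_file%.1.ht2}"

echo "$prefix"
```

Note that the result ends in the shared file-name stem (genome here), not in the directory that contains the files.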


I can't seem to make sense of how your file command says genome is a directory while the ls does not list any directory with that name, just files starting with genome. What is the output of:

echo -e "$(date): $(uname -a)\n$(ls -lR /media/extdata/e4/genomes/hg19/)"

Here's what I got:

echo -e "$(date): $(uname -a)\n$(ls -lR /media/extdata/e4/genomes/hg19/)"
2024-02-23 Friday 10:36:45 CST: Linux lq-Precision-7920-Tower 6.2.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Oct 6 10:23:26 UTC x86_64 x86_64 x86_64 GNU/Linux
/media/extdata/e4/genomes/hg19/:

Total 4
drwxrwxrwx 2 lq lq 4096 Feb 20 11:31 genome

/media/extdata/e4/genomes/hg19/genome:

Total 4326516
-rw-rw-r-- 1 lq lq 969972432 Nov 20 2015 genome.1.ht2
-rw-rw-r-- 1 lq lq 724327620 Nov 20 2015 genome.2.ht2
-rw-rw-r-- 1 lq lq  4814 Nov 20 2015 genome.3.ht2
-rw-rw-r-- 1 lq lq 724327616 Nov 20 2015 genome.4.ht2
-rw-rw-r-- 1 lq lq 1274093467 Nov 20 2015 genome.5.ht2
-rw-rw-r-- 1 lq lq 737586876 Nov 20 2015 genome.6.ht2
-rw-rw-r-- 1 lq lq   8 Nov 20 2015 genome.7.ht2
-rw-rw-r-- 1 lq lq   8 Nov 20 2015 genome.8.ht2
-rwxrwxr-x 1 lq lq  1287 Nov 20 2015 make_hg19.sh
-rw-rw-r-- 1 lq lq   0 Feb 20 12:01 SRR10502962.hisat.sam

OK, I think I understand the problem.

The naming of your files is super confusing. The word genome is used and reused several times in different contexts: there are directories named genomes and genome, and then the files are also prefixed with genome. No wonder it is confusing.

I think your index file is called:

/media/extdata/e4/genomes/hg19/genome/genome.1.ht2

In that case, the path to the index for hisat2 should be specified as:

-x /media/extdata/e4/genomes/hg19/genome/genome

Perhaps next time use different words :-)
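
A quick way to sanity-check a candidate -x value before launching the aligner is sketched below. check_ht2_prefix is a hypothetical helper written for this illustration, not part of HISAT2; it only assumes the eight-file naming shown in the ls output above, and it does not cover large indexes, which as far as I know use a .ht2l suffix instead.

```shell
# Hypothetical helper (not part of HISAT2): succeeds only if all eight
# <prefix>.N.ht2 files exist as regular files.
check_ht2_prefix() {
  prefix=$1
  rc=0
  for i in 1 2 3 4 5 6 7 8; do
    if [ ! -f "$prefix.$i.ht2" ]; then
      echo "missing: $prefix.$i.ht2" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage with the index path from this thread.
idx=/media/extdata/e4/genomes/hg19/genome/genome
if check_ht2_prefix "$idx"; then
  echo "index prefix looks complete: $idx"
else
  echo "index prefix is incomplete or wrong: $idx"
fi
```

Passing the directory itself (with or without a trailing /) makes the helper look for names such as genome.1.ht2 next to the directory or /.1.ht2 inside it, so it reports every file as missing.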


Many thanks, now I know that the last "genome" is the file prefix and not a folder!


Istvan caught your problem. Quick question: How were you at 10:36 CST Friday 12 hours ago when it's still 9:23 CST Friday right now? Are you a time traveler lol?


Hahaha, maybe because we are in different time zones :)


That is not possible - your timestamp was from my time zone and yet ahead of me. I'm just curious lol, maybe your server has some timezone setting inaccuracy?
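
For what it's worth, the abbreviation CST is ambiguous: it is used for both China Standard Time (UTC+8) and US Central Standard Time (UTC-6), and the +0800 offsets in the stat output earlier in the thread suggest the server is on the former. A small sketch, assuming a date implementation that honors POSIX-style TZ strings, that prints the numeric offset next to the abbreviation:

```shell
# Same abbreviation, two different offsets. In a POSIX TZ string a positive
# number means west of UTC and a negative number means east of UTC.
china=$(TZ=CST-8 date '+%Z %z')    # China Standard Time, UTC+8
central=$(TZ=CST6 date '+%Z %z')   # US Central Standard Time, UTC-6

echo "$china"
echo "$central"
```

Including %z (or running date -u) when pasting timestamps avoids this kind of confusion.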
