I am new to upstream RNA sequencing analysis, and I am trying to align FASTQ reads to the human genome using the index downloaded from https://genome-idx.s3.amazonaws.com/hisat/hg19_genome.tar.gz. The reference was unpacked and stored at /media/extdata/e4/genomes/hg19/genome/, and these are the files listed when I run ls there:
genome.1.ht2 genome.2.ht2 genome.3.ht2 genome.4.ht2 genome.5.ht2 genome.6.ht2 genome.7.ht2 genome.8.ht2 make_hg19.sh
But when I run the command:
hisat2 -p 32 -x /media/extdata/e4/genomes/hg19/genome -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam
I get the error:
(ERR): "/media/extdata/e4/genomes/hg19/genome" does not exist
When I add a trailing / to the -x parameter, like this: /media/extdata/e4/genomes/hg19/genome/, I get this error instead:
Could not locate a HISAT2 index corresponding to basename "/media/extdata/e4/genomes/hg19/genome/"
Error: Encountered internal HISAT2 exception (#1)
The interesting part is that even when I run the command like this:
hisat2 --dta -p 16 -x ~/ -1 /media/extdata/e4/Rseq/test2/clean/SRR10502962_1_val_1.fq.gz -2 /media/extdata/e4/Rseq/test2/clean/SRR10502962_2_val_2.fq.gz -S SRR10502962.hisat.sam
it also returns the error (ERR): "/home/user/" does not exist. Can anyone help me out? Many thanks.
What do these commands tell you:
Thanks for the responses, here's what I got:
Any suggestions?
Is your username really myusername? Show the actual output instead of replacing it with a different word. We are trying to figure out the problem here, and replacing words just sows more confusion.
Also, show the output from all the commands in one block instead of writing "this returns that". I had to remove a lot of content and reformat the post to make it reasonably readable, and even after that it is quite hard to read, especially since you seem to have altered the output.
Don't worry, people can tell which part is the output of the commands. Next time write it like this: I've entered:
and it produced
Thanks, here's what I got:
That error indicates that the $HOME variable is not set quite right. According to your settings, that path should be /home/lq.
In addition, the fact that the command returns only a single line of output indicates that the index is not present in that location. Or perhaps you did not type the command as instructed and forgot the * at the end of it.
Going back to the original post: show us the exact commands and their outputs, without adding any other qualifiers or explanations. Show the actual ls command and its output with no other words.
Always paste in both the command you ran and its output, rather than just saying "I typed that in", because small mistakes can always occur.
The hisat2 program launches a Perl script that in turn launches another program, hisat2-align, via the default shell.
To me, your error (if the filesystems do indeed exist) appears to indicate that the subshells that get launched do not inherit the proper values.
It kind of sounds like these are some sort of emulated, mounted filesystems that are not visible.
I am accessing the server over SSH. Is there anything I can do about it?
Thank you for your patience, here are the commands and the results they returned.
Here is the hisat2 command I ran:
Adding / at the end of the index location returns this error:
I hope this makes the situation clear.
Remember, the index has to be a prefix to a file and not a directory, so don't add / at the end of it. When you type that path and then press TAB twice, does it show you the files? These are the files that will be read.
I will say my guess is that this problem is extraordinarily confusing because the explanation might be ludicrously simple. I have found that trivial problems cause the most confusing errors (for example, you are running ls and hisat2 on different systems or different filesystems, or something like that).
There is nothing magical about the index: the program will read eight files, and in the -x parameter you have to list their common prefix. That's all. There is nothing special to it, but the files must be accessible when the program runs.
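To make the "common prefix" idea concrete, here is a minimal shell sketch that counts how many .ht2 files share a given prefix. The function name count_index_files is made up for illustration and is not part of HISAT2; the example paths are the ones from the original post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HISAT2): count the .ht2 files that
# share the given prefix, i.e. files named <prefix>.<number>.ht2
count_index_files() {
    local prefix="$1"
    local n=0
    local f
    for f in "${prefix}".[0-9]*.ht2; do
        # If nothing matches, the pattern is left as a literal string,
        # so only count entries that really exist.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Examples with the paths from this thread (uncomment to try):
# count_index_files /media/extdata/e4/genomes/hg19/genome          # the directory
# count_index_files /media/extdata/e4/genomes/hg19/genome/genome   # directory + file prefix
```

A value passed to -x that yields a count of zero here is probably a directory or a mistyped path rather than an index prefix.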
I can't seem to make sense of how your file command says genome is a directory while ls does not list any directory with that name, just files starting with genome. What is the output of:
Here's what I got:
OK, I think I understand the problem.
The naming of your files is super confusing: the word genome is used and reused several times in different contexts. It seems there are directories named genomes and genome, but then the files are also prefixed with genome. No wonder it is confusing.
I think your index file is called:
In that case the path to the index file for hisat2 should be specified as:
Perhaps next time use different words :-)
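For anyone hitting the same error, here is a rough sketch of deriving the prefix from the directory contents instead of typing it by hand. It assumes the index directory contains a file ending in .1.ht2, as in the listing from the original post. The function name index_prefix_in_dir and the read and output file names in the usage comment are placeholders, not anything defined by HISAT2.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HISAT2): print the index prefix found in
# a directory by stripping ".1.ht2" from the first matching file name.
index_prefix_in_dir() {
    local dir="${1%/}"   # drop one trailing slash, if present
    local first
    for first in "$dir"/*.1.ht2; do
        if [ -e "$first" ]; then
            echo "${first%.1.ht2}"
            return 0
        fi
    done
    echo "no .1.ht2 file found in $dir" >&2
    return 1
}

# Usage sketch, assuming the directory from the original post;
# reads_1.fq.gz, reads_2.fq.gz and out.sam are placeholder names:
# idx=$(index_prefix_in_dir /media/extdata/e4/genomes/hg19/genome) &&
#     hisat2 -p 32 -x "$idx" -1 reads_1.fq.gz -2 reads_2.fq.gz -S out.sam
```

With the files listed in the original post, the helper would print the directory path followed by /genome, which is the form -x expects.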
Many thanks, now I know that the last "genome" is the file prefix and not the folder location!
Istvan caught your problem. Quick question: how did you post at 10:36 CST Friday, 12 hours ago, when it's still 9:23 CST Friday right now? Are you a time traveler, lol?
Hahaha, maybe because we are in different time zones :)
That is not possible - your timestamp was from my time zone and yet ahead of me. I'm just curious lol, maybe your server has some timezone setting inaccuracy?