Good Afternoon,
I am using the HMMER tools to analyze a DNA sequence dataset, referred to as the H3 dataset, which contains DNA sequences from two classes: class 0 and class 1.
The following is what I did:
I acquired 7000 class 0 DNA sequences from the H3 dataset and built an HMM profile from 80% of them (5600 sequences).
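For reference, a minimal sketch of how such an 80/20 split could be made reproducible. The function name `split_records`, the fixed seed, and the use of a shuffle are assumptions for illustration; the post does not say how the split was actually done.

```python
import random

def split_records(records, train_fraction=0.8, seed=0):
    """Shuffle records reproducibly and split them into (train, held_out)."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed: same split on every run
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Stand-in for 7000 sequence records (hypothetical placeholder data)
train, held_out = split_records(list(range(7000)))
print(len(train), len(held_out))
```

With 7000 records and a fraction of 0.8 this gives 5600 records for building the profile and 1400 held out.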
Next, I took 20% of the DNA sequences from each of class 0 and class 1.
I then ran hmmsearch with the previously built HMM profile against each of the two 20% sets. I expected the search on the class 0 sequences to return many sequences above the inclusion threshold, with E-values close to 0.
This resulted in the output file below.
Why were no targets detected? I thought I had done something wrong up to this point, so I performed one last experiment.
- I ran hmmsearch with the HMM profile against the same sequences that were used to build it. There should be matches, since these are exactly the same sequences. The resulting output file is below:
Once again, no hits were detected.
So my final question is: am I using hmmbuild and hmmsearch correctly, and how can I improve the results? It is extremely strange that I am comparing the exact same sequences and getting no hits. Any help would be appreciated.
Thanks.
In the output of hmmbuild, "eff_nseq" is too high and "re/pos" is too low. I think that is because your input multi-FASTA file is not aligned, so the resulting HMM is nonsense and does not hit any sequence.
I agree with that. It looks like the input is a random alignment. The title of the question is misleading.
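A rough way to check the "random alignment" diagnosis before running hmmbuild is to measure how conserved the columns of the input are. The sketch below is only an illustration: it computes the mean per-column information (2 bits minus the Shannon entropy over A/C/G/T), which is a crude analogue of, and not the same calculation as, the "re/pos" value hmmbuild reports. The function names `read_fasta` and `mean_column_information` are hypothetical helpers, not part of HMMER.

```python
import math
from collections import Counter

def read_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def mean_column_information(seqs):
    """Mean information per column in bits: 2 minus the entropy over A/C/G/T."""
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length, so this is not an alignment")
    ncols = lengths.pop()
    if ncols == 0:
        raise ValueError("sequences are empty")
    total = 0.0
    for column in zip(*seqs):
        counts = Counter(c for c in column if c in "ACGT")  # ignore gaps and Ns
        n = sum(counts.values())
        if n == 0:
            continue  # column with no nucleotides contributes nothing
        entropy = -sum((k / n) * math.log2(k / n) for k in counts.values())
        total += 2.0 - entropy
    return total / ncols

records = read_fasta(">s1\nACGT\n>s2\nACGT\n>s3\nACGA\n")
print(mean_column_information([seq for _, seq in records]))
```

Values near 2 bits mean the columns are strongly conserved; values near 0 mean each column looks like a random mix of nucleotides, which is what you would expect when equal-length but unaligned sequences are passed off as an alignment. Note that equal sequence length alone does not show that the input is aligned.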