Does nhmmer look at reverse complements?
1
1
Entering edit mode
8.6 years ago
Steven Lakin ★ 1.8k

nhmmer is the DNA pipeline version of hmmsearch within the HMMER software. Recently, I've been comparing two paired-end Illumina libraries with the following:

  1. Trimmomatic
  2. bwa aln, forward/reverse trimmed reads to database of known reference genes
  3. Parsing the SAM file for mapped reads (samtools -F 4) and analyzing those

I ran the same raw read files through nhmmer and analyzed the intersection of the alignment and HMMER pipelines. When I look at reads that aligned but were not called by HMMER, bit 16 (reverse strand) is set on virtually all of them. Additionally, when the read information is pulled from the SAM file (where it has been appropriately reverse complemented) and re-run through HMMER, nhmmer does find ~70% of them.

So it appears that HMMER doesn't consider the reverse complement of the database being queried, though it does consider the reverse orientation when the query is of the same strandedness with respect to the database. This is in contrast to the manual, which claims that it does search both strands of the database, but this doesn't explain the findings here.

Has anyone had similar experiences, or does someone have inside knowledge on how or if nhmmer considers reverse complements?

alignment hmmer sam ngs • 2.4k views
ADD COMMENT
1
Entering edit mode
8.6 years ago
Steven Lakin ★ 1.8k

HMMER does in fact do this correctly; it was an issue with an intermediate step written by myself. Lesson learned; do more internal testing on code.

ADD COMMENT

Login before adding your answer.

Traffic: 1588 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6