How can I access the consensus sequences of all human ERVs from Dfam?
6.7 years ago
mbk0asis ▴ 700

Hi, everyone!

I'm trying to estimate the expression levels of human ERVs by mapping reads to their consensus sequences.

I found a paper that used consensus sequences from Dfam, but I couldn't find the exact file.

I found the files 'hg38_dfam.hits.gz' and 'hg38_dfam.nrph.hits.gz', but they contain only location information, not the actual sequences.

Should I extract the sequences using the location info, or is there another way?

Can someone give me some hints?

Thank you!

dfam erv • 2.4k views
5.7 years ago
mlbendall ▴ 240

You can get the Dfam consensus sequences by downloading the HMM for each family (e.g. DF0000558.hmm) and running hmmemit from the HMMER package:

hmmemit -c DF0000558.hmm > DF0000558.consensus.fasta
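
If you have downloaded several family HMMs, the same command can be looped over all of them. This is only a rough sketch: the function name `emit_all_consensus`, the directory layout (one `.hmm` file per family in a single folder) and the output file name are my own assumptions, not anything Dfam prescribes.

```shell
# Sketch: run `hmmemit -c` on every .hmm file in a directory and
# collect the consensus sequences in one FASTA file.
# Assumes HMMER is installed and each family HMM was saved as <accession>.hmm.
emit_all_consensus() {
    hmm_dir="$1"      # directory holding the downloaded .hmm files (hypothetical layout)
    out_fasta="$2"    # combined output FASTA

    : > "$out_fasta"  # start from an empty output file
    for hmm in "$hmm_dir"/*.hmm; do
        [ -e "$hmm" ] || continue          # skip if the glob matched nothing
        hmmemit -c "$hmm" >> "$out_fasta"  # -c emits the consensus sequence
    done
}

# Example call (paths are placeholders):
# emit_all_consensus ./dfam_hmms human_erv.consensus.fasta
```

The combined FASTA can then be used to build an index for your read mapper.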

With the latest version of Dfam there is an API you can use to access the consensus sequences without downloading the HMMs. You would use the /families/{id}/sequence endpoint.

See http://dfam.org/help/api and https://dfam.org/releases/Dfam_3.0/apidocs/
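
A request against that endpoint could look roughly like the sketch below. The base URL `https://dfam.org/api` and the `format=fasta` query parameter are assumptions on my part, so please check them against the API docs linked above; the function names are just for illustration.

```shell
# Sketch: fetch one family's consensus sequence via /families/{id}/sequence.
# ASSUMPTIONS: the base URL and the format=fasta parameter should be
# verified against the Dfam API documentation before relying on them.
DFAM_API="https://dfam.org/api"

# Build the request URL for a given Dfam accession.
dfam_sequence_url() {
    printf '%s/families/%s/sequence?format=fasta\n' "$DFAM_API" "$1"
}

# Download the sequence and print it to stdout (needs curl and network access).
fetch_dfam_consensus() {
    curl -fsSL "$(dfam_sequence_url "$1")"
}

# Example call:
# fetch_dfam_consensus DF0000558 > DF0000558.consensus.fasta
```

Looping `fetch_dfam_consensus` over a list of ERV accessions would give you all the consensus sequences without downloading any HMMs.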

6.7 years ago
h.mon 35k

There are (at least) two pipelines for HERV quantification, TEToolkit and SalmonTE; you could look into them. Which paper used the Dfam consensus sequences?
