start OMA run - file.log
6.7 years ago

Hi,

I have a question about the log file that OMA writes when it starts converting the input files.

I downloaded the coding sequences of 30 different species from Ensembl or NCBI. To remove redundant transcripts and isoforms, I first used cd-hit-est and then ran the files through TransDecoder.

When I start running OMA, the .log file shows many messages like these:

WARNING: IUPAC ambiguity characters for DNA/RNA not supported. Will replace them with 'X'

Pre-processing input (DNA)
19099 sequences within 19099 entries considered
Creating file Cache/DB/Balaena_mysticetus.db.map for mapping
Building new Pat index in file Cache/DB/Balaena_mysticetus.db.tree with 27254391 entries
Pat index with 27254391 entries
 sorted, from "A</SEQ></E>\n" to "XXXXXXXXXXXXXXXXXXX"
Reading 44567976 characters from file Cache/DB/Balaenoptera_acutorostrata.db
Pre-processing input (DNA)
20993 sequences within 20993 entries considered
Creating file Cache/DB/Balaenoptera_acutorostrata.db.map for mapping
Building new Pat index in file Cache/DB/Balaenoptera_acutorostrata.db.tree with 37893972 entries
Pat index with 37893972 entries
 sorted, from "A</SEQ></E>\n" to "XXXXXXXXXXXXXXXXXXX"

I would like to know whether these errors can cause problems for a normal OMA run.

Thanks,

omabrowser OMA • 1.8k views
Tagging: adrian.altenhoff

6.7 years ago

Hi,

It seems to me that there is an inconsistency between your input data and the parameters. From this output, it looks like you specified InputDataType := 'DNA'; in the parameters.drw file but provided protein sequences (which would be the sensible input to use). In that case, OMA would convert all amino acids other than A, T, C and G to unknown nucleotides and treat the remaining amino acids as nucleotides. As far as I can tell, the proper setting should be InputDataType := 'AA'; instead.
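
To check which setting matches a given input file, one option is to look at the fraction of residues that are plausible nucleotides. The following is only a rough sketch in Python, not part of OMA: the function names and the 0.9 threshold are assumptions chosen for illustration, and a short or unusual protein sequence could still be misclassified.

```python
# Hypothetical helper, not part of OMA: guess whether a sequence is DNA or protein.
NUCLEOTIDES = set("ACGTUN")  # unambiguous bases plus U (RNA) and N (unknown)


def nucleotide_fraction(seq: str) -> float:
    """Return the fraction of alphabetic characters that are A, C, G, T, U or N."""
    residues = [c for c in seq.upper() if c.isalpha()]
    if not residues:
        return 0.0
    return sum(c in NUCLEOTIDES for c in residues) / len(residues)


def guess_input_data_type(seq: str, threshold: float = 0.9) -> str:
    """Return 'DNA' if the sequence is mostly nucleotides, otherwise 'AA'.

    The threshold is an arbitrary assumption, not a value used by OMA.
    """
    return "DNA" if nucleotide_fraction(seq) >= threshold else "AA"
```

For example, guess_input_data_type("ATGGCCAAGTGA") gives 'DNA', while a peptide such as "MKVLAAGIVGLLLAQ" gives 'AA' because only a third of its letters happen to coincide with nucleotide letters.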

Cheers Adrian

6.7 years ago

Hi, thanks for the answer.

All my sequences are nucleotide sequences, and in the parameters file I have InputDataType := 'DNA'; which is why I find this strange.

It only happens with the final files produced by TransDecoder.
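
One way to narrow this down is to count which non-ACGT characters those files actually contain. Letters such as M, K, L or a trailing * would suggest peptide sequences (TransDecoder typically writes a peptide file alongside the CDS file), whereas N, R or Y would point to ordinary nucleotide ambiguity codes. A minimal Python sketch, assuming plain FASTA text as input; the function name is hypothetical:

```python
from collections import Counter


def count_non_acgt(fasta_text: str) -> Counter:
    """Count every sequence character that is not A, C, G or T.

    Header lines (starting with '>') and blank lines are skipped.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        for ch in line.upper():
            if ch not in "ACGT":
                counts[ch] += 1
    return counts


# Tiny made-up example: one nucleotide record with ambiguity codes, one peptide.
example = ">s1\nATGNNRYA\n>s2\nMKV*\n"
print(count_non_acgt(example).most_common())
```

To run it on a real file, read the file contents first, for instance with open(path).read(), and pass the resulting text to count_non_acgt.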

Cheers, Daniela
