Error in running miRDeep2
3.6 years ago
DEEPESH • 0

Hi, I am trying to run miRDeep2 to identify miRNAs.

But it stops with this error: miRNA reference this species file mature_osa.fa has not allowed whitespaces in its first identifier.

I checked the file carefully and compared it with the one given in the miRDeep2.pl tutorial; my file follows the same pattern. I also tried a sed command, but the issue is not resolved.

Can anyone please help me?

Thank you in advance.

Whitespace miRDeep2 error • 2.9k views
3.5 years ago
Paolo ▴ 40

I ran into the same problem. The input file can be tested for errors with:

sanity_check_mature_ref.pl bta_mature.fa

This is the error I got:

Error in line 1: The identifier

bta-miR-26a MIMAT0003516 Bos taurus miR-26a

contains white spaces


Please check your file for the following issues:

I.  Sequences are allowed only to comprise characters [ACGTNacgtn].
II. Identifiers are not allowed to have withespaces.


You could run remove_white_space_in_id.pl inputfile > newfile
This will remove everything from the id line after the first whitespace

So by calling

remove_white_space_in_id.pl bta_mature.fa > bta_mature.fa.fix

I was able to run miRDeep2 without problems.
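
For anyone who wants to see what that fix amounts to, here is a minimal sketch in Python. It only illustrates the behaviour described in the message above (drop everything on the id line after the first whitespace); it is not the code of remove_white_space_in_id.pl. The function name strip_fasta_ids and the placeholder sequence are made up for this example.

```python
def strip_fasta_ids(lines):
    """Yield FASTA lines with each header cut at the first whitespace."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # split() with no argument splits on any run of spaces or tabs
            parts = line.split()
            line = parts[0] if parts else line
        yield line


example = [
    ">bta-miR-26a MIMAT0003516 Bos taurus miR-26a",
    "ACGTACGTACGT",  # placeholder sequence, not the real miRNA
]
fixed = list(strip_fasta_ids(example))
print("\n".join(fixed))
```

The header becomes >bta-miR-26a and the sequence line is passed through unchanged. Because the split is on any whitespace, tabs in the id line are handled the same way as spaces.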


Thank you.


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work. If an answer was not really helpful or did not work, provide detailed feedback so others know not to use that answer.

Respected all,

I am working with miRNA data and am currently getting the following error for my genome.fa file. Please take a look and suggest how to solve it. What should I do next: should I replace R with N, or should I remove R?

Thanks

Error: problem with genome.fa
Error in line 6.618: The sequence
TCAAATACTGAAAAATATTTCACAGCATTCTCATATTTGTGGTGAATTTTCAGAAGCTTR
contains characters others than [acgtnACGTN]
Please check your file for the following issues:

I. Sequences are allowed only to comprise characters [ACGTNacgtn].
II. Identifiers are not allowed to have withespaces.

I think you'd be better off replacing R with N so you lose the least amount of information.
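
A minimal sketch of that replacement in Python, assuming the offending characters are IUPAC ambiguity codes such as R, Y, S and M. The function name mask_ambiguous is made up for illustration. Header lines are left untouched and only sequence lines are changed.

```python
import re

# IUPAC ambiguity codes other than N, in both cases (assumed to be the culprits)
AMBIGUOUS = re.compile(r"[RYSWKMBDHVryswkmbdhv]")


def mask_ambiguous(lines):
    """Yield FASTA lines with ambiguity codes in sequence lines replaced by N."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith(">"):
            # Replacement is always an uppercase N, even for lowercase codes
            line = AMBIGUOUS.sub("N", line)
        yield line


example = [
    ">chr1",  # hypothetical header
    "TCAAATACTGAAAAATATTTCACAGCATTCTCATATTTGTGGTGAATTTTCAGAAGCTTR",
]
masked = list(mask_ambiguous(example))
print(masked[1])
```

Replacing rather than deleting keeps every sequence the same length, so coordinates in the genome do not shift.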


Thanks, Ram. It's working after replacing R, Y, S, M, etc. with N.


Please understand that while this step makes the software work, you're compromising somewhere: a code like R says the base is A or G, and N throws that away. Understand the implications of changing the real-world input for an algorithm.
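
One rough way to gauge how much is being changed is to count the affected characters before replacing them. This is a sketch only; the function name count_non_acgtn and the example lines are made up, and the allowed set is taken from the error message quoted earlier in the thread.

```python
from collections import Counter


def count_non_acgtn(lines):
    """Count sequence characters outside [ACGTNacgtn], ignoring header lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            continue
        counts.update(ch for ch in line if ch not in "ACGTNacgtn")
    return counts


example = [">chr1", "GAAGCTTR", "ACGTYYSM"]  # made-up sequence lines
print(count_non_acgtn(example))
```

On a real file you could pass an open file handle instead of the list. A handful of ambiguous bases in a whole genome is a very different situation from thousands of them.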
