Question

Error in running miRDeep2

3.6 years ago

DEEPESH • 0

Hi, I am trying to run miRDeep2 to identify miRNAs.

But it shows the error "miRNA reference this species file mature_osa.fa has not allowed whitespaces in its first identifier".

I checked the file carefully and compared it with the one given in the miRDeep2.pl tutorial; my file follows the same pattern. I also tried a sed command, but the issue was not resolved.

Can anyone please help me?

Thank you in advance.

Whitespace miRDeep2 error • 2.9k views

ADD COMMENT • link updated 23 months ago by Ram 44k • written 3.6 years ago by DEEPESH • 0

Ram · Accepted Answer · 2021-05-14

3.5 years ago

Paolo ▴ 40

I ran into the same problem. You can test your input for errors with:

sanity_check_mature_ref.pl bta_mature.fa

This is the error I got:

Error in line 1: The identifier

bta-miR-26a MIMAT0003516 Bos taurus miR-26a

contains white spaces


Please check your file for the following issues:

I.  Sequences are allowed only to comprise characters [ACGTNacgtn].
II. Identifiers are not allowed to have withespaces.


You could run remove_white_space_in_id.pl inputfile > newfile
This will remove everything from the id line after the first whitespace

So by calling

remove_white_space_in_id.pl bta_mature.fa > bta_mature.fa.fix

I was able to run miRDeep2 without problems.
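
If you want to see what that fix amounts to, here is a minimal sketch with sed. It is not the code of remove_white_space_in_id.pl, only an equivalent of the behaviour described above (drop everything on the ID line after the first whitespace). The function name trim_fasta_ids, the demo file names and the demo sequence are made up for illustration.

```shell
# Hypothetical helper (not part of miRDeep2): on FASTA header lines,
# delete everything from the first whitespace character to the end of the line.
# Sequence lines are left untouched.
trim_fasta_ids() {
    sed '/^>/ s/[[:space:]].*$//' "$1"
}

# Demo on a tiny made-up file; the sequence is a placeholder, not real miRBase data.
trim_dir=$(mktemp -d)
printf '>bta-miR-26a MIMAT0003516 Bos taurus miR-26a\nACGTACGTACGTACGTACGTAC\n' \
    > "$trim_dir/demo_mature.fa"

# Write to a new file instead of editing in place, so the original stays available.
trim_fasta_ids "$trim_dir/demo_mature.fa" > "$trim_dir/demo_mature.fixed.fa"
cat "$trim_dir/demo_mature.fixed.fa"
```

On a real file you would call it the same way as the Perl script, for example trim_fasta_ids bta_mature.fa > bta_mature.fa.fix, and then rerun sanity_check_mature_ref.pl on the new file.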

ADD COMMENT • link 3.5 years ago by Paolo ▴ 40

Thank you.

ADD REPLY • link 3.5 years ago by DEEPESH • 0

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one answer if they all work. If an answer was not really helpful or did not work, provide detailed feedback so others know not to use that answer.

ADD REPLY • link 3.5 years ago by Ram 44k

Dear all,

I am working with miRNA data and am getting the following error with my genome.fa file. Could you please take a look and suggest how to fix it? Should I replace R with N, or should I remove R?

Thanks

Error: problem with genome.fa
Error in line 6.618: The sequence
TCAAATACTGAAAAATATTTCACAGCATTCTCATATTTGTGGTGAATTTTCAGAAGCTTR
contains characters others than [acgtnACGTN]
Please check your file for the following issues:

I. Sequences are allowed only to comprise characters [ACGTNacgtn].
II. Identifiers are not allowed to have withespaces.

ADD REPLY • link updated 23 months ago by Ram 44k • written 23 months ago by kuttibiotech2009 ▴ 30

I think you'd be better off replacing R with N so you lose the least amount of information.
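
A minimal sketch of that replacement, assuming the only offending characters are the IUPAC ambiguity codes (R, Y, S, W, K, M, B, D, H, V). The function name mask_ambiguity_codes and the demo file names are made up; this is not a miRDeep2 tool.

```shell
# Hypothetical helper (not part of miRDeep2): replace IUPAC ambiguity codes
# with N (or n for lowercase) on sequence lines only.
# Lines starting with ">" are skipped, so letters in the IDs are not altered.
mask_ambiguity_codes() {
    sed -e '/^>/! s/[RYSWKMBDHV]/N/g' \
        -e '/^>/! s/[ryswkmbdhv]/n/g' "$1"
}

# Demo on a tiny made-up genome file.
mask_dir=$(mktemp -d)
printf '>chr_demo\nACGTTR\nacgtyS\n' > "$mask_dir/genome_demo.fa"

# Write to a new file so the unmodified genome is kept.
mask_ambiguity_codes "$mask_dir/genome_demo.fa" > "$mask_dir/genome_demo.masked.fa"
cat "$mask_dir/genome_demo.masked.fa"
```

Replacing rather than deleting keeps every sequence the same length, so coordinates in the genome do not shift.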

ADD REPLY • link 23 months ago by Ram 44k

Thanks, Ram. It's working after replacing R, Y, S, M, etc. with N.

ADD REPLY • link 23 months ago by kuttibiotech2009 ▴ 30

Please understand that while this step helps make the software work, you're compromising somewhere. Understand the implications of changing the real-world input for an algorithm.
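
One way to gauge how much you are changing is to count the non-ACGTN characters before replacing them. A rough sketch, with a made-up function name (count_non_acgtn) and made-up demo data:

```shell
# Hypothetical helper: tally every character on sequence lines that is not
# one of ACGTNacgtn, and print one "character count" pair per line.
count_non_acgtn() {
    awk '
        /^>/ { next }                      # skip FASTA header lines
        {
            sub(/\r$/, "")                 # ignore a trailing Windows carriage return
            for (i = 1; i <= length($0); i++) {
                c = substr($0, i, 1)
                if (c !~ /[ACGTNacgtn]/) counts[c]++
            }
        }
        END { for (c in counts) print c, counts[c] }
    ' "$1" | sort
}

# Demo on a tiny made-up genome file.
count_dir=$(mktemp -d)
printf '>chr_demo\nACGTRACGTY\nACGTRNNN\n' > "$count_dir/genome_demo.fa"
count_non_acgtn "$count_dir/genome_demo.fa"
```

If the counts are tiny relative to the genome size, masking them with N probably costs little; if they are large, it is worth asking where the ambiguity codes came from first.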

ADD REPLY • link 23 months ago by Ram 44k