Reading a BLAST result file
8.1 years ago

Hi everyone,

When I search these miRNAs against an mRNA database with blastn, I get the result below. What I would like to do is find new miRNAs that are similar to the query sequences.

I ran it with this command line:

blastn -db esthuman.fasta -word_size 7 -query allprecursor.txt -out nasr -perc_identity 100 -outfmt 6 -max_target_seqs 2
hsa-let-7b  gi|262205597|ref|NR_029479.1|   100.00  83  0   0   1   83  1   83  3e-35     154
hsa-let-7b  gi|4826511|emb|AL049853.1|  100.00  83  0   0   1   83  17069   16987   3e-35     154
hsa-let-7c gi|262205602|ref|NR_029480.1|    100.00  84  0   0   1   84  1   84  8e-36     156
hsa-let-7c gi|7768689|dbj|AP001667.1|   100.00  84  0   0   1   84  186545  186628  8e-36     156
hsa-let-7d gi|262205605|ref|NR_029481.1|    100.00  87  0   0   1   87  1   87  2e-37     161
hsa-let-7d gi|71516309|gb|CH471089.1|   100.00  87  0   0   1   87  25762708    25762794    2e-37     161
hsa-let-7f-1 gi|262205612|ref|NR_029483.1|  100.00  87  0   0   1   87  1   87  2e-37     161
hsa-let-7f-1 gi|71516309|gb|CH471089.1| 100.00  87  0   0   1   87  25760221    25760307    2e-37     161
hsa-let-7g gi|262205206|ref|NR_029660.1|    100.00  84  0   0   1   84  1   84  8e-36     156    
hsa-let-7g gi|71518807|gb|CH471055.1|   100.00  84  0   0   1   84  52281689    52281606    8e-36     156

Could anyone who has used BLAST please help?
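For readers who want to process such a result file programmatically, here is a minimal Python sketch. It assumes the default 12-column layout of `-outfmt 6` (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The names `Hit` and `parse_line` are made up for this illustration and are not part of BLAST.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of default BLAST+ tabular output (-outfmt 6)."""
    qseqid: str
    sseqid: str
    pident: float
    length: int
    mismatch: int
    gapopen: int
    qstart: int
    qend: int
    sstart: int
    send: int
    evalue: float
    bitscore: float

    @property
    def strand(self) -> str:
        # In tabular output, a subject start larger than the subject end
        # means the hit lies on the minus strand of the subject.
        return "plus" if self.sstart <= self.send else "minus"


def parse_line(line: str) -> Hit:
    """Parse one whitespace-separated result line into a Hit."""
    f = line.split()
    return Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
               int(f[6]), int(f[7]), int(f[8]), int(f[9]),
               float(f[10]), float(f[11]))


# Second result line from the post above.
example = ("hsa-let-7b  gi|4826511|emb|AL049853.1|  100.00  83  0   0   "
           "1   83  17069   16987   3e-35     154")
hit = parse_line(example)
print(hit.qseqid, hit.sseqid, hit.length, hit.mismatch, hit.strand)
```

Read this way, each query in the output has one perfect hit to its own RefSeq precursor record (NR_...) and one perfect hit to a larger genomic record, which is what `-perc_identity 100 -max_target_seqs 2` would be expected to return for known human precursors.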


Dear nasromer2191989, hi.

1- Why have you used "precursors"? As shown in Figure 4 of the paper that @Prasad kindly suggested below, they used mature miRNAs first.

2- Did you "remove redundant miRNAs" from your database first? (This is also shown in Fig. 4.)


Dear Farbod

I used the precursors because the mature sequences do not give any hits or scores. Do you know how to turn the subject sequence into a miRNA? Thanks


By default, blastn runs megablast (word size 28). This may be why you do not find any hits, since mature miRNAs are only about 16-30 nt long. When running blastn, enable -task blastn-short or else reduce the word size. All options can be found here
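To make the word-size point concrete, here is a small Python sketch. It assumes the documented default word sizes of the three common blastn tasks (megablast 28, blastn 11, blastn-short 7) and the BLAST+ description of blastn-short as optimized for queries shorter than 50 bases. The helper names `can_seed`, `choose_task` and `build_command` are hypothetical.

```python
from typing import List

# Default word sizes per blastn task, as documented for BLAST+.
DEFAULT_WORD_SIZE = {"megablast": 28, "blastn": 11, "blastn-short": 7}


def can_seed(query_length: int, task: str) -> bool:
    """A hit needs at least one exact match of word_size bases, so a
    query shorter than the word size can never produce a hit."""
    return query_length >= DEFAULT_WORD_SIZE[task]


def choose_task(query_length: int) -> str:
    """Pick blastn-short for short queries, else keep the default task."""
    return "blastn-short" if query_length < 50 else "megablast"


def build_command(db: str, query: str, out: str, query_length: int) -> List[str]:
    """Assemble the blastn argument list (suitable for subprocess.run)."""
    return ["blastn", "-task", choose_task(query_length),
            "-db", db, "-query", query, "-out", out]


# A 21-nt mature miRNA cannot contain a 28-nt exact word.
print(can_seed(21, "megablast"))     # False
print(can_seed(21, "blastn-short"))  # True
print(" ".join(build_command("mrna.fasta", "matureall.txt",
                             "resultmature_new.out", 21)))
```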


Dear Prasad

I could not get blastn to work for the mature miRNAs. Which command line should I use for short sequences?

Thanks


Could you share your BLAST command?


Dear Prasad, this is the command line I used:

blastn -db mrna.fasta -query matureall.txt -out resultmature.out

The result is "No hits found":

Database: mrna.fasta 508,597 sequences; 15,096,209,361 total letters

Query= hsa-let-7a-3p

Length=21

No hits found

Lambda K H 1.33 0.621 1.12

Gapped Lambda K H 1.28 0.460 0.850

Effective search space used: 45261163845

Query= hsa-let-7a-5p

Length=22

No hits found

Thanks


Try this version.

blastn -db mrna.fasta -task blastn-short -query matureall.txt -out resultmature_new.out

Dear Genomax2

Please, can I have your email to discuss the BLAST work further?

Thanks, Nasr


Dear Genomax2

I used your suggestion for the mature sequences, but I do not understand which sequence I should use in Mfold, since the subject sequence is too short for folding. For instance, this is the first hit I get. Could you please advise me which sequence to take to Mfold?

emb|LQ069844.1| Sequence 168 from Patent EP2964234 42.1 0.009

emb|LQ069844.1| Sequence 168 from Patent EP2964234 Length=21

Score = 42.1 bits (21), Expect = 0.009 Identities = 21/21 (100%), Gaps = 0/21 (0%) Strand=Plus/Plus

Thanks


What data source have you used for the mature miRNAs? The example ID is an unassigned RNA from human; the same sequence is identified as hsa-let-7a-3p in miRBase.

For the Mfold part, usually the miRNA's position on the genome is identified, and the upstream and downstream sequences (~50 to 100 bp) around that position (including the miRNA itself) are folded to check for a stem-loop structure. You could use tools like miRDeep2, MIREAP, etc.
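As an illustration of the flanking step, here is a minimal Python sketch that cuts the hit plus flanks out of a subject sequence, given the 1-based `sstart`/`send` coordinates from a BLAST hit. The function names are made up for this example, and the default flank of 70 nt is just an assumed value inside the ~50 to 100 bp range mentioned above. The returned string is what would be passed to a folding program.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def flanked_region(subject_seq: str, sstart: int, send: int,
                   flank: int = 70) -> str:
    """Return the hit region plus `flank` bases on each side.

    sstart/send are 1-based inclusive BLAST subject coordinates.
    If sstart > send the hit is on the minus strand, so the region is
    reverse-complemented to match the orientation of the query.
    Flanks are clipped at the ends of the subject sequence.
    """
    lo, hi = sorted((sstart, send))
    begin = max(0, lo - 1 - flank)          # convert to 0-based slice start
    end = min(len(subject_seq), hi + flank)
    region = subject_seq[begin:end]
    if sstart > send:
        region = reverse_complement(region)
    return region


# Toy subject sequence, purely illustrative.
subject = "AAAAACCCCCGGGGGTTTTT"
print(flanked_region(subject, 3, 6, flank=2))  # plus-strand hit
print(flanked_region(subject, 6, 3, flank=2))  # same hit on the minus strand
```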


Dear Prasad

The data source is miRBase. I have made a file of miRNAs in FASTA format that I want to use as the query against human EST and GSS sequences. With this work I need to identify miRNAs similar to the ones in the file, so that I can compare their targets. From what I see in this paper (Identification of miRNA encoded by Jatropha curcas from EST and GSS), they searched the known miRNAs against EST and GSS and identified new miRNAs and their targets.

I am confused about: 1) which sequence they used in Mfold; 2) whether they used only 100% identity hits or also hits with a specific number of mismatches.

Thanks


Kindly refer to the methodology section of the paper you mentioned for all the parameters.

If you are working on human data, you may use the human genome instead of GSS or EST. See miRDeep or MIREAP; they are pretty easy to use for novel miRNA prediction.


Dear Prasad

I also searched the human miRNAs against the human EST and GSS sequences. I got hits with fewer than 2 mismatches, so could I use them as similar miRNAs?


Dear Prasad

Please, can I have your email address for more guidance?

Thanks


Please do not ask for personal email addresses. They are not shared publicly on Biostars. We encourage users to keep all discussions open.


I agree with @genomax2, as the discussion here would help many others.


Thanks for replying.

So, could we go through the steps together? I have done blastn, but I am confused about which sequence I should use for blastx.

Thanks


If you have run BLAST of the human miRNAs against the human ESTs, filter the miRNA-EST pairs with the cutoff mentioned in the paper. Then predict the secondary structure of the entire EST. If a proper secondary structure (stem-loop) forms, take those ESTs for blastx to remove protein-coding sequences. The detailed steps and parameters are given in the paper.
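The first step, filtering miRNA-EST pairs, could be sketched in Python as below. It assumes tabular (`-outfmt 6`) input and a dictionary of query lengths. The mismatch cutoff is deliberately a required argument, because the actual value has to come from the paper and is not stated in this thread. Counting unaligned query ends and gap openings as mismatches is an assumption of this sketch, made because BLAST's mismatch column only covers the aligned part of a short query.

```python
from typing import Dict, Iterable, List, Tuple


def filter_pairs(lines: Iterable[str], query_lengths: Dict[str, int],
                 max_mismatches: int) -> List[Tuple[str, str, int]]:
    """Keep (miRNA, EST, differences) pairs within the mismatch cutoff.

    differences = mismatches + gap openings + query bases left unaligned.
    """
    kept = []
    for line in lines:
        f = line.split()
        if len(f) < 12:          # skip blank or malformed lines
            continue
        qseqid, sseqid = f[0], f[1]
        mismatch, gapopen, qstart, qend = (int(x) for x in f[4:8])
        unaligned = query_lengths[qseqid] - (qend - qstart + 1)
        differences = mismatch + gapopen + unaligned
        if differences <= max_mismatches:
            kept.append((qseqid, sseqid, differences))
    return kept


# Made-up rows for illustration only (not real BLAST results).
rows = [
    "miR-x\tEST1\t100.00\t22\t0\t0\t1\t22\t100\t121\t1e-5\t44.1",
    "miR-x\tEST2\t100.00\t18\t0\t0\t3\t20\t5\t22\t0.5\t36.2",
    "miR-x\tEST3\t95.45\t22\t1\t0\t1\t22\t40\t19\t1e-3\t40.1",
]
pairs = filter_pairs(rows, {"miR-x": 22}, max_mismatches=2)
for pair in pairs:
    print(pair)
```

In this toy input, EST2 is dropped even though its aligned part is 100% identical, because four bases of the 22-nt query were not aligned at all. The ESTs that pass would then go on to structure prediction and blastx as described above.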


Dear Prasad

Please, how did you find the name of this miRNA (hsa-let-7a-3p) from the information in the result file?

Thanks


I took the EMBL ID from here, extracted the sequence and BLASTed it in miRBase. Are you trying to do a study similar to the one in the paper, or do you have sequenced miRNA data?


I have miRNA data that I collected from miRBase. I need to find miRNAs similar to the ones I have and then check their targets. Please advise me if there is a tool that can perform this, i.e. one that lets me search my file against any other database and returns the similar miRNAs.


As I mentioned earlier, you could refer to miRDeep2 (a pipeline tool). If you want web tools: miRNAkey, miRanalyzer, sRNAtoolbox.

Here you can find a list of other tools: 1, 2


Sorry,

how did you BLAST (emb|LQ069844.1| Sequence 168 from Patent EP2964234 Length=21) against miRBase? I want to check whether the mismatched one is already in miRBase.

Thanks


Use the search-by-sequence option in miRBase.


Dear Prasad

I want to run blastx on all the sequences in my result file. I want to check whether they are the same miRNAs that I searched with blastn.

Thanks

8.1 years ago
Prasad ★ 1.6k

Go through this; I hope it answers your question.


Dear Prasad, hi.

Is the BLAST approach in the paper you suggested also suitable for finding previously known miRNAs in newly sequenced miRNA FASTQ reads from a non-model animal, or is it just for EST and GSS?


You can use the same approach for your study.


Dear Prasad,

In your paper the authors mention: "BLASTN parameter settings as: expect values at 1e-3; low complexity was chosen as the sequence filter; the number of descriptions and alignments was raised to 1,000. The default word-match size between the query and database sequences was 7"

Would you please help me with that BLAST command? For example, how did they specify "low complexity" and "number of descriptions and alignments was raised to 1,000" in their command?

Is it as follows (in ncbi-blast-2.4.0+)?

blastn  -query my-miRNA-trimmed.fasta  -db  mirbase-MATURE.fa  -evalue 1e-3  -word_size 7  *-perc_identity 100*  *max_target_seqs 1000*  -outfmt 6 -num_threads 22 -out blastn_my-miRNA_mature.outfmt6

Thank you in advance

 -num_descriptions <Integer, >=0>
   Number of database sequences to show one-line descriptions for
   Not applicable for outfmt > 4
   Default = `500'
    Incompatible with:  max_target_seqs
 -num_alignments <Integer, >=0>
   Number of database sequences to show alignments for
   Default = `250'
    Incompatible with:  max_target_seqs
-dust <String>
   Filter query sequence with DUST (Format: 'yes', 'level window linker', or
   'no' to disable)

Dear genomax, hi and thank you.

I did not receive your answer in my "Messages", which is why my thanks are late.

1- For "low complexity was chosen as the sequence filter", what should I use as the string for "-dust <String>"?

2- Would you please check this whole command for errors?

(I have removed "-perc_identity 100" and "max_target_seqs 1000" from it. Is that OK?)

blastn  -query my-miRNA-trimmed.fasta  -db  mirbase-MATURE.fa  -evalue 1e-3  -word_size 7  -num_descriptions 1000  -num_alignments 1000  **-dust ?**   -outfmt 6 -num_threads 22 -out blastn_my-miRNA_mature.outfmt6

I would interpret that as -dust yes.

Since the mature miRNAs are probably short, you may want to use them as the query instead of the target. Your BLAST options may need to change in that case.
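One way to put the quoted parameters together is sketched below in Python. It relies on the help text above, which says `-num_descriptions` is not applicable for outfmt > 4 and that both `-num_descriptions` and `-num_alignments` are incompatible with `-max_target_seqs`. Switching to `-max_target_seqs` for tabular output, and treating `-num_alignments` the same way as `-num_descriptions`, are assumptions of this sketch, and `build_blastn_args` is a hypothetical helper name.

```python
from typing import List


def build_blastn_args(query: str, db: str, out: str, outfmt: int = 6,
                      limit: int = 1000, evalue: str = "1e-3",
                      word_size: int = 7, dust: str = "yes") -> List[str]:
    """Assemble a blastn argument list following the paper's settings."""
    args = ["blastn", "-query", query, "-db", db,
            "-evalue", evalue, "-word_size", str(word_size),
            "-dust", dust, "-outfmt", str(outfmt)]
    if outfmt > 4:
        # Tabular and similar formats: limit hits with -max_target_seqs.
        args += ["-max_target_seqs", str(limit)]
    else:
        # Pairwise-style reports: raise descriptions and alignments.
        args += ["-num_descriptions", str(limit),
                 "-num_alignments", str(limit)]
    args += ["-out", out]
    return args


tabular = build_blastn_args("my-miRNA-trimmed.fasta", "mirbase-MATURE.fa",
                            "blastn_my-miRNA_mature.outfmt6")
print(" ".join(tabular))
```

The list can be passed to `subprocess.run(tabular, check=True)` on a machine where BLAST+ is installed. Calling the helper with `outfmt=0` gives the `-num_descriptions`/`-num_alignments` form instead.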


Dear genomax2,

It seems that you have experience in the field of miRNA, too.

If so, please let me know, as I have some questions about the standard miRNA analysis pipeline and the use of biological replicates.

Thank you

8.0 years ago

Hi guys,

I have a question. Does local BLAST give good results when only one species is involved? The query is known human miRNAs and the database is human as well. I checked all the subject sequences from the result file in miRBase, and they turned out to be the same miRNAs, or miRNAs with a different suffix.

I expected, and am looking for, different miRNAs that differ slightly from the ones in my query file, to use for validation.

Any suggestions in this case?

Thanks all
