Question

Annotating RNA-Seq Data Using A Reference Genome

3

13.2 years ago

Linda ▴ 160

I have RNA-seq reads from a non-model organism. I used Cufflinks to identify transcripts. Is there an existing pipeline to BLAST these transcripts against a model organism's proteins to identify orthologs?

rna blast • 4.8k views

ADD COMMENT • link updated 11.7 years ago by Zhidkov ▴ 600 • written 13.2 years ago by Linda ▴ 160

score 2 · Answer 1 · 2011-09-04

2

13.2 years ago

Zhidkov ▴ 600

Hi Linda, regarding BLAST usage: you can download a local copy of BLAST from here: http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download

The available databases can be found here: ftp://ftp.ncbi.nlm.nih.gov/blast/db/

Most probably you will want to run blastx against nr, or you can download uniref90. I suggest using the tabular output format (it saves some time and space). You can then filter the results by alignment length and/or E-value (for example, keep only hits with an E-value below 1e-5).

Ilia

ADD COMMENT • link 13.2 years ago by Zhidkov ▴ 600

0

Hi @Zhidkov, I have a similar question. To assess the sequence conservation of our de novo assembled transcriptome of a non-model plant, we ran blastx (NCBI BLAST+ 2.2.26) with our transcriptome against the proteome databases of several plants, using tabular output. As you mentioned, we can filter our results by alignment length and/or E-value. Since I already set the E-value to 1e-5, how should I set the alignment length filter? What value is generally used for the alignment length? Thank you. Regards

ADD REPLY • link 11.8 years ago by lzsph ▴ 70

0

Hi, I don't really understand where the problem is... if you used the default tabular output, column number 4 corresponds to the alignment length. Do you run BLAST from the command line (terminal) or through the web interface? In any case, I'm not sure you can set a "minimum alignment length" filter as a parameter of the BLAST search itself. If you run BLAST from the command line, you can use something like the following, replacing yourlength with a number: blastx -query <File_In> -db <your_database> -evalue <your favorite> -outfmt 6 | awk -F "\t" '{if ($4>=yourlength) print}' > Tabularblastx.txt
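If the search has already finished, the same column-4 filter can be applied to the saved tabular file, so thresholds can be changed without rerunning BLAST. Below is a minimal Python sketch, assuming the default 12-column tabular layout (column 4 = alignment length, column 11 = E-value). The function name, the thresholds, and the example rows are made up for illustration. Note that for blastx the alignment length is counted in aligned amino-acid positions, not nucleotides.

```python
import csv
import io

# Default -outfmt 6 columns (1-based): 4 = alignment length, 11 = E-value.
LENGTH_COL = 3   # 0-based index of column 4
EVALUE_COL = 10  # 0-based index of column 11


def filter_hits(handle, min_length, max_evalue):
    """Yield rows whose alignment length and E-value pass both thresholds."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if int(row[LENGTH_COL]) >= min_length and float(row[EVALUE_COL]) <= max_evalue:
            yield row


# Made-up example rows (hypothetical IDs and values), for illustration only.
example = io.StringIO(
    "tx1\tprotA\t85.0\t250\t30\t1\t1\t750\t1\t250\t1e-60\t300\n"
    "tx2\tprotB\t40.0\t45\t27\t0\t10\t144\t5\t49\t1e-6\t50\n"
    "tx3\tprotC\t35.0\t120\t78\t2\t1\t360\t1\t120\t0.01\t40\n"
)
kept = list(filter_hits(example, min_length=100, max_evalue=1e-5))
print([row[0] for row in kept])  # prints ['tx1']
```

For a real file, replace the io.StringIO example with open("Tabularblastx.txt", newline="").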

ADD REPLY • link 11.8 years ago by Zhidkov ▴ 600

0

Hi @Zhidkov, sorry for the late reply. I did run blastx from the command line. I just set the E-value to 1e-5 and left the alignment length undefined. Is that OK?

ADD REPLY • link 11.8 years ago by lzsph ▴ 70

score 0 · Answer 2 · 2013-02-28

0

11.7 years ago

Zhidkov ▴ 600

Hi,
Alignment length is an additional filter. If you get too many results using the E-value cutoff alone, you can make your filter more stringent by also filtering out alignments that are too short, hits with low query coverage, etc.
Your data, your goals, your filters.
Just as an example: you have transcript-Y, 2 kb long, and after BLASTX you get a hit to X-protein with an E-value of 1e-7, where the alignment length is 200 bp with several indels. Can you conclude that transcript-Y is X-protein?
What would your filters be for a reliable annotation in that case?
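To put a number on the transcript-Y example: if the 200 bp are measured on the 2 kb query, the hit covers only 200 / 2000 = 10% of the transcript. A rough Python sketch of a query-coverage filter is shown here, assuming the default tabular columns (7 = qstart, 8 = qend, in query nucleotide coordinates for blastx) and query lengths read from the query FASTA file. All function names, IDs, and values are hypothetical. Coverage is computed per alignment; several alignments to the same protein are not merged.

```python
import io


def fasta_lengths(handle):
    """Return {sequence_id: length} from a FASTA file handle."""
    lengths = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            seq_id = line[1:].split()[0]  # ID is the first word of the header
            lengths[seq_id] = 0
        elif seq_id is not None:
            lengths[seq_id] += len(line)
    return lengths


def query_coverage(qstart, qend, query_length):
    """Fraction of the query spanned by one alignment.

    For blastx hits on the reverse strand qstart > qend, hence abs().
    """
    return (abs(qend - qstart) + 1) / query_length


def hits_with_coverage(handle, lengths, min_coverage):
    """Yield (query, subject, coverage) for hits at or above min_coverage."""
    for line in handle:
        row = line.rstrip("\n").split("\t")
        if len(row) < 12:
            continue  # not a default 12-column tabular line
        cov = query_coverage(int(row[6]), int(row[7]), lengths[row[0]])
        if cov >= min_coverage:
            yield row[0], row[1], cov


# Made-up data: a 2 kb transcript with a 200 nt hit, and a 900 nt transcript
# with a reverse-strand hit spanning 750 nt.
fasta = io.StringIO(
    ">transcript_Y hypothetical\n" + "A" * 2000 + "\n"
    ">transcript_Z hypothetical\n" + "ACGT" * 225 + "\n"
)
hits = io.StringIO(
    "transcript_Y\tX_protein\t45.0\t68\t30\t3\t101\t300\t10\t75\t1e-7\t55\n"
    "transcript_Z\tZ_protein\t80.0\t250\t50\t0\t750\t1\t1\t250\t1e-90\t400\n"
)
lengths = fasta_lengths(fasta)
kept = list(hits_with_coverage(hits, lengths, min_coverage=0.5))
print([(q, s) for q, s, _ in kept])  # prints [('transcript_Z', 'Z_protein')]
```

Here the transcript-Y hit is dropped because it covers 10% of the query, well below the 50% cutoff chosen for the example.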

Ilia

ADD COMMENT • link 11.7 years ago by Zhidkov ▴ 600

0

Hi Ilia,

Thank you very much.

I have a query file containing ~200,000 sequences of various lengths, and the tabular blastx output also contains many sequences of different lengths. Is it possible to set only one alignment length value in the command "...($4>=yourlength)"? If I am wrong, please point it out.

Regards, lzsph

ADD REPLY • link 11.7 years ago by lzsph ▴ 70

0

Yes, it is possible (you set a minimum length, >= some value). If that doesn't feel right to you, you can filter on coverage instead; for example, you can require that at least 50% of your query sequence is covered. I suggest you perform a small test (you'll feel much more confident afterwards): take several known transcripts, run BLASTX against all plant proteins, and check which alignments give you unreliable results (e.g. you used SOD1 as the query but got p53 as a hit). Then set your filters accordingly.
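One way to run such a small test is to tabulate, for a few candidate length cutoffs, how many retained hits match the protein you expected and how many do not. The sketch below is only an illustration: the function name, the query and protein IDs, and the alignment lengths are all invented, and the (query, subject, alignment length) tuples stand in for columns 1, 2 and 4 of a real tabular file.

```python
def score_thresholds(hits, expected, min_lengths):
    """For each candidate minimum alignment length, count retained hits that
    match (right) or do not match (wrong) the expected protein of the query."""
    results = {}
    for min_len in min_lengths:
        right = wrong = 0
        for query, subject, length in hits:
            if length < min_len:
                continue  # hit removed by this cutoff
            if expected.get(query) == subject:
                right += 1
            else:
                wrong += 1
        results[min_len] = (right, wrong)
    return results


# Hypothetical test set of known transcripts and their BLASTX hits.
hits = [
    ("known_SOD1", "SOD1", 150),
    ("known_SOD1", "p53", 40),
    ("known_RBCL", "RBCL", 470),
    ("known_RBCL", "SOD1", 35),
]
expected = {"known_SOD1": "SOD1", "known_RBCL": "RBCL"}

for min_len, (right, wrong) in score_thresholds(hits, expected, [0, 50, 200]).items():
    print(min_len, right, wrong)
# prints:
# 0 2 2
# 50 2 0
# 200 1 0
```

In this made-up case a cutoff of 50 removes both misleading hits while keeping both expected ones, whereas 200 starts to discard a correct hit. With real data, a "wrong" hit may simply be a related family member, so the counts are a guide rather than a verdict.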

Ilia

ADD REPLY • link 11.7 years ago by Zhidkov ▴ 600

0

OK, Ilia, I'll give it a try.

Thank you!

Regards, lzsph

ADD REPLY • link 11.7 years ago by lzsph ▴ 70