6.7 years ago
donger1103
I have a list of tomato genomic loci, such as "SL2.50ch01:89425896-89429084". Most of the loci are inside the exons of genes, but some may also fall within introns or intergenic spacers. I want to perform a miRNA target search, so I first need to download all the corresponding partial RNA or cDNA sequences. Are there any tools or ideas? Many thanks!
Do you want to extract the exact ranges, as in the example you give, or do you want to first look up those locations, determine which gene they reside in, and then extract the (whole) corresponding gene?
To lieven.sterck: thanks for your reply. I want to extract the sequences (RNA or cDNA, not genomic DNA) of the exact ranges in my loci list; they are partial sequences of genes. Thanks a lot. I am replying to your question here because nothing happens when I click "ADD REPLY".
OK, but the coordinates you are talking about are genomic, right? So do you just want to extract that range, or do you want to link each range to the closest genic feature?
To lieven.sterck: thanks again. You are right, the coordinates in my list are genomic, and I just want to extract that range (the corresponding RNA or cDNA sequence). It would be even better if gene annotation were added. Actually, the coordinates I mentioned are the loci of some circular RNAs. I want to study circular RNA-miRNA-mRNA interactions, and my first step is to extract the specific sequences of the circular RNAs and then predict the tomato miRNA targets with tools like psRNATarget. Do you have any ideas? Thanks :)
OK, I would download the genomic FASTA files locally, build a BLAST database from them, and then extract all the ranges you want with the blastdbcmd command-line tool. It is not that difficult to loop over your list and select each region. That will already give you all the genomic sequences. Getting the corresponding gene annotation might be a bit more cumbersome, but you might be able to do that with a BED intersect or similar.
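The loop over the loci list could be sketched roughly as below. This is only an illustration under some assumptions: the database name `tomato_genome` is a placeholder, the database is assumed to have been built with something like `makeblastdb -in genome.fa -dbtype nucl -parse_seqids` so that sequences can be looked up by ID, and the helper names `parse_locus` and `blastdbcmd_args` are made up for this example. The script only prints the commands; it does not run BLAST+ itself.

```python
import re
import shlex

# Matches loci written as "<sequence id>:<start>-<end>", e.g. the example in the question.
LOCUS_RE = re.compile(r"^(?P<seqid>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_locus(locus):
    """Split a locus string into (seqid, start, end)."""
    match = LOCUS_RE.match(locus.strip())
    if match is None:
        raise ValueError(f"Unrecognised locus format: {locus!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"Start is greater than end in locus: {locus!r}")
    return match["seqid"], start, end


def blastdbcmd_args(locus, db="tomato_genome"):
    """Build the blastdbcmd argument list for one locus.

    'tomato_genome' is a placeholder database name (an assumption).
    """
    seqid, start, end = parse_locus(locus)
    return ["blastdbcmd", "-db", db, "-entry", seqid, "-range", f"{start}-{end}"]


if __name__ == "__main__":
    loci = ["SL2.50ch01:89425896-89429084"]
    for locus in loci:
        # Print each command; redirect its output to a FASTA file when running it.
        print(shlex.join(blastdbcmd_args(locus)))
```

Each printed command can be pasted into a shell, or the argument list can be passed to `subprocess.run` if BLAST+ is installed.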
Thank you so much. However, I would still get the genomic sequences, not the RNA sequences, right?
You will get the genomic sequence corresponding to the coordinates of the loci you are interested in. You said above that they are not necessarily all in coding regions.
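Since not every locus falls in an exon, one way to check where each one sits is the BED intersect mentioned earlier. As a rough sketch, the loci can first be rewritten as BED lines. This assumes the coordinates in the list are 1-based and inclusive (BED uses 0-based, half-open intervals, so only the start is shifted by one); the function name `locus_to_bed` is hypothetical.

```python
def locus_to_bed(locus):
    """Convert 'seqid:start-end' (assumed 1-based, inclusive) to a BED line.

    BED intervals are 0-based and half-open, so start is decreased by 1
    and end is left unchanged. The original locus string is kept as the name.
    """
    locus = locus.strip()
    seqid, coords = locus.rsplit(":", 1)
    start, end = (int(value) for value in coords.split("-"))
    return f"{seqid}\t{start - 1}\t{end}\t{locus}"


if __name__ == "__main__":
    loci = ["SL2.50ch01:89425896-89429084"]
    for locus in loci:
        print(locus_to_bed(locus))
```

The resulting file could then be compared against the tomato gene annotation, for example with `bedtools intersect -a loci.bed -b annotation.gff3 -wa -wb`, where both file names are placeholders.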