Question

miRNA binding position in gene transcript (alignment of cDNA FASTA and miRNA FASTA file)

9.2 years ago

bharata1803 ▴ 580

Hello,

So, I need to predict which miRNAs bind to which genes. My supervisor asked me to check every possibility in every part of the mRNA, not only the 3' UTR. So far, the tools I have found predict miRNA binding sites in mRNAs only from the 3' UTR.

What I have right now is RNA-seq and exome-seq data. I imagine I only need to do some alignment (by complementarity) between the miRNA sequences, which I downloaded from miRBase, and the RNA-seq reads. Are there any tools out there to do this? I think this function is similar to NCBI BLAST.

Details on my data: I'm working with human data. I used the Ensembl human reference Hg 38.80. My data are from a cancer-normal pair experiment from NCBI GEO, consisting of 7 pairs of RNA-seq data and 8 pairs of exome-seq data.

After doing some research, I think I can probably use the cDNA FASTA from Ensembl. In that case, I will have the full sequence of each gene transcript. I also have the miRNA FASTA file. I would only need to search for the miRNA pattern in the cDNA sequences. Is this possible and meaningful? What tools are available to align one FASTA file against another?
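
To make the "search the miRNA pattern in the cDNA" idea concrete, here is a minimal Python sketch. It assumes plain FASTA input, converts each miRNA (RNA alphabet) to the cDNA-sense sequence that would pair with it perfectly (the reverse complement), and reports 1-based positions of exact matches. The function names (`read_fasta`, `target_site_dna`, `find_all`) and the toy records are hypothetical, made up for illustration, and full-length exact complementarity is only the literal version of the question, not a target prediction method.

```python
from io import StringIO

def read_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-formatted file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record id.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def target_site_dna(mirna_rna):
    """Return the cDNA-sense sequence perfectly complementary to a miRNA."""
    dna = mirna_rna.upper().replace("U", "T")
    return dna.translate(COMPLEMENT)[::-1]

def find_all(text, pattern):
    """Return 1-based start positions of every (possibly overlapping) match."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start + 1)
        start = text.find(pattern, start + 1)
    return hits

# Toy records (hypothetical); real input would be open("mirna.fa") etc.
mirna_fasta = StringIO(">toy-miR\nUGAGGUAG\n")
cdna_fasta = StringIO(">toy_transcript\nAAACTACCTCAGG\nGCTACCTCA\n")

transcripts = dict(read_fasta(cdna_fasta))
hits = {}
for mir_id, mir_seq in read_fasta(mirna_fasta):
    site = target_site_dna(mir_seq)
    for tx_id, tx_seq in transcripts.items():
        hits[(mir_id, tx_id)] = find_all(tx_seq, site)

print(hits)
```

In the sketch the whole transcript is searched, so matches in the 5' UTR, CDS and 3' UTR are all reported; mapping a position back to a region would need the transcript annotation, which is not shown here.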

mirna sequence • 3.4k views

ADD COMMENT • link updated 9.1 years ago by dario.garvan ▴ 520 • written 9.2 years ago by bharata1803 ▴ 580

What tools did you use when you were just working with the 3'UTR?

Why are you trying to align to the RNA-Seq reads rather than the assembled transcript?

Are you working with plants or animals? Vertebrates or invertebrates? etc.

ADD REPLY • link 9.2 years ago by Thomas ▴ 180

I used MixMir. Well, I realized that the assembled transcript will probably give a similar result, unless a SNP or indel changes the sequence and thereby makes it possible for a miRNA to bind.

I'm working with human data. I used the Ensembl human reference Hg 38.80. My data are from a cancer-normal pair experiment from NCBI GEO, consisting of 7 pairs of RNA-seq data and 8 pairs of exome-seq data.

ADD REPLY • link 9.2 years ago by bharata1803 ▴ 580

score 0 · Answer 1 · 2016-03-23

9.1 years ago

dario.garvan ▴ 520

It's a common research question which many projects have addressed before. Some previously published prediction tools would be suitable. Note that this task is not the same as mapping a miRNA sequence to a gene database, because the binding between miRNA and genes can have mismatches.
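
As a rough illustration of why an exact-match mapper is not enough, the sketch below scans a transcript for a site while tolerating a limited number of mismatches (plain Hamming distance). This is only a toy under stated assumptions: the function names and sequences are hypothetical, and published target prediction programs use richer models than a mismatch count.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_with_mismatches(transcript, site, max_mismatches=1):
    """Return (1-based position, mismatch count) for every window of the
    transcript that differs from `site` at no more than `max_mismatches`
    positions."""
    hits = []
    width = len(site)
    for i in range(len(transcript) - width + 1):
        distance = hamming(transcript[i:i + width], site)
        if distance <= max_mismatches:
            hits.append((i + 1, distance))
    return hits

# Toy data (hypothetical): the second site carries one substitution (C -> G).
site = "CTACCTCA"
transcript = "AAACTACCTCAGGGCTAGCTCA"

exact_only = find_with_mismatches(transcript, site, max_mismatches=0)
tolerant = find_with_mismatches(transcript, site, max_mismatches=1)
print(exact_only)
print(tolerant)
```

With zero mismatches allowed, only the first site is found; allowing one mismatch recovers the second site as well, which an exact string search would miss.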

ADD COMMENT • link 9.1 years ago by dario.garvan ▴ 520

What I'm thinking right now is to extract the gene sequence from the alignment result and then map the miRNAs to that gene sequence per sample. I realize that if I map the miRNA sequences to the gene database, the result will apply to all samples. But in a given sample there may be some difference that affects miRNA binding. What do you think about that?

ADD REPLY • link 9.1 years ago by bharata1803 ▴ 580

Yes, you could use your assembled gene sequences as the reference sequences. But, you should not use an ordinary mapping program such as Bowtie, but an miRNA target prediction program.

ADD REPLY • link 9.1 years ago by dario.garvan ▴ 520

After checking TargetScan, it seems their algorithm takes the possible seed from the miRNA and then checks for the seed sequence in the target gene/transcript sequence. It seems that the miRNA seed needs an exact match to bind the mRNA target. Is it this simple? I know that they calculate some other values, but I think that is because they want to measure the presence of the seed sequence across different species. I just looked at the code for the sequence mapping, and it is like searching for a string in another string. TargetScan also deals with some gaps. My aim is just to know at which position the seed is found in the gene/transcript sequence. So, I think it will be simple then, like finding "def" in the string "abcdefghijk". The output will be 4 because "def" starts at the 4th character. TargetScan does give this kind of result, though. What do you think?
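
The string-search idea can be sketched in a few lines of Python. This is a minimal sketch, not TargetScan's code: it assumes the seed is miRNA nucleotides 2 to 8 (a common convention, set here through the hypothetical `first` and `last` parameters), that the transcript is given as cDNA (T instead of U), and that only exact seed matches are wanted. The miRNA and transcript below are toy sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def find_all(text, pattern):
    """Return 1-based start positions of every (possibly overlapping) match."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start + 1)
        start = text.find(pattern, start + 1)
    return hits

def seed_site(mirna_rna, first=2, last=8):
    """Return the cDNA-sense site that pairs with miRNA positions first..last.

    Positions are 1-based and inclusive; 2..8 is an assumed default.
    """
    seed = mirna_rna.upper().replace("U", "T")[first - 1:last]
    # The site on the transcript is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]

# The example from the post: "def" starts at the 4th character.
print(find_all("abcdefghijk", "def"))  # [4]

# Toy miRNA and transcript (hypothetical sequences for illustration).
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
transcript = "GGCTACCTCAAAGCTACCTCTT"

site = seed_site(mirna)
positions = find_all(transcript, site)
print(site, positions)
```

Note that the search uses the reverse complement of the seed, not the seed itself, because the miRNA pairs with the transcript in antiparallel orientation. Running this per sample on sample-specific transcript sequences would show where a SNP or indel creates or removes a seed match.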

ADD REPLY • link 9.1 years ago by bharata1803 ▴ 580