I have a problem with miRNA-seq and miRNA target gene prediction.
I know the basic workflow, and this is the first method I tried:
I downloaded the 3'UTR FASTA files (version: rat 6.0) from Ensembl BioMart and from UCSC.
I ran miranda on Linux against each of these files, and each one gave a different number of predicted target genes.
However, I was not satisfied with the number of target genes in any of the results (the parameters I used are shown below):
miranda rno_DEGs.fasta "GCF_000001895.5_Rnor_6.0_3'UTR.fa" -sc 150 -en -30 -strict | grep ">>" > rno_VS_NCBI_END.txt
Can I relax the two parameters -sc 150 -en -30 to less strict values? The defaults are -sc 140 -en 1.
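One way to explore this without re-running miranda for every setting is to run it once with lenient thresholds and then filter the ">>" summary lines afterwards. The sketch below is only an illustration: it assumes the summary lines are tab-separated in the order miRNA, target, total score, total energy, max score, max energy (followed by further columns), which should be checked against the actual output. The function names and the example lines are made up. Note also that filtering on the best score and the best energy per miRNA-target pair only approximates miranda's own per-site filtering.

```python
def parse_hit(line):
    """Parse one miranda '>>' summary line into a dict (column order is assumed)."""
    fields = line.rstrip("\n").lstrip(">").split("\t")
    return {
        "mirna": fields[0],
        "target": fields[1],
        "total_score": float(fields[2]),
        "total_energy": float(fields[3]),
        "max_score": float(fields[4]),
        "max_energy": float(fields[5]),
    }


def filter_hits(lines, min_score=150.0, max_energy=-30.0):
    """Keep summary lines whose best score and best energy pass both thresholds."""
    kept = []
    for line in lines:
        if not line.startswith(">>"):
            continue  # skip anything that is not a summary line
        hit = parse_hit(line)
        if hit["max_score"] >= min_score and hit["max_energy"] <= max_energy:
            kept.append(hit)
    return kept


def count_targets(hits):
    """Number of distinct target sequences among the kept hits."""
    return len({hit["target"] for hit in hits})


# Made-up summary lines, for illustration only.
example = [
    ">>rno-miR-1\tGeneA\t310.00\t-62.10\t158.00\t-31.50\t12\t22\t900\t120 455",
    ">>rno-miR-1\tGeneB\t145.00\t-18.20\t145.00\t-18.20\t12\t22\t700\t88",
]
strict = filter_hits(example, min_score=150.0, max_energy=-30.0)
relaxed = filter_hits(example, min_score=140.0, max_energy=-15.0)
print(count_targets(strict), count_targets(relaxed))
```

Comparing the counts at several thresholds in this way shows how sensitive the target list is to -sc and -en before a final setting is chosen.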
Here are the links to the FASTA and GTF files: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/895/GCF_000001895.5_Rnor_6.0/GCF_000001895.5_Rnor_6.0_genomic.fna.gz
annotation files:
So I tried the second method:
I used the rat 7.0 3'UTR FASTA file from Ensembl BioMart and ran the same workflow. To my surprise, this gave the largest number of target genes, which I thought was a good result.
My question is: my mRNA counts were generated with the rat 6.0 FASTA described above, so I don't know whether it is appropriate to use a different genome version for the target gene analysis.
If not, I have one more method:
I think I could extract the 3'UTR sequences from the rat 6.0 FASTA and GTF files. However, I couldn't find the 3'UTR coordinates, or any other 3'UTR information, in the 6.0 GTF file, and I don't know why. Because of this, I have no idea how to extract the 3'UTR sequences from the FASTA and GTF files.
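A possible explanation is that some GTF files list only exon, CDS, start_codon and stop_codon features and leave the UTRs implicit. In that case the 3'UTR of a transcript can be derived as the exonic sequence downstream of the stop codon. The following is a minimal sketch of that idea, not a tested pipeline: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive as in GTF, and the caller is assumed to have already grouped the exon intervals and the CDS plus stop_codon intervals per transcript.

```python
def three_prime_utr(exons, coding, strand):
    """Return the 3'UTR intervals of one transcript.

    exons:  list of (start, end) exon intervals, 1-based inclusive.
    coding: list of (start, end) CDS and stop_codon intervals.
    strand: "+" or "-".
    """
    if not coding:
        return []  # non-coding transcript: no UTR to report
    utr = []
    if strand == "+":
        boundary = max(end for _, end in coding)  # last coding base
        for start, end in sorted(exons):
            if end > boundary:
                utr.append((max(start, boundary + 1), end))
    else:
        boundary = min(start for start, _ in coding)  # last coding base on minus strand
        for start, end in sorted(exons):
            if start < boundary:
                utr.append((start, min(end, boundary - 1)))
    return utr


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def utr_sequence(chrom_seq, utr, strand):
    """Join the UTR pieces and return them in transcript (5'->3') orientation."""
    seq = "".join(chrom_seq[start - 1:end] for start, end in sorted(utr))
    if strand == "-":
        seq = seq.translate(_COMPLEMENT)[::-1]  # reverse complement
    return seq


# Toy chromosome and transcripts, for illustration only.
chrom = "AAAACCCCGGGGTTTTACGT"
exons = [(1, 8), (13, 20)]
plus_utr = three_prime_utr(exons, [(3, 8), (13, 15)], "+")
minus_utr = three_prime_utr(exons, [(6, 8), (13, 18)], "-")
print(plus_utr, utr_sequence(chrom, plus_utr, "+"))
print(minus_utr, utr_sequence(chrom, minus_utr, "-"))
```

For a real genome the chromosome sequences would be read from the FASTA file and the intervals from the GTF file, using the same assembly version for both so that the coordinates match.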
And here is the final method:
I contacted the sequencing company. They told me they use the whole genome FASTA file in place of the 3'UTR sequences for the target gene prediction. I still don't understand why they do it this way. Can I do the same?
I looked into many approaches, including biomaRt in R, but most of them did not work for my case.
So I really hope somebody can give me some advice or suggest a method. Thank you very much.
Can anyone help me?