Create annotation file for mature miRNA sequences from miRBase
8.3 years ago
polag03 • 0

Please help, I am new to sequence analysis. I have been trying to create a GFF/GTF annotation file for mature miRNA sequences obtained from miRBase so that I can analyze some sequence data, but I have not found any directions on how to do this. The miRNA sequences are in FASTA format, and I also have my reference genome sequence. Please guide me. Thank you.

gtf miRNA gff mirBase • 3.6k views

Thank you, Prasad. I will try SAM2GFF and report back.

8.3 years ago
Prasad ★ 1.6k

GFF files for some organisms are already available in miRBase.

Alternatively, you can align all the mature miRNAs to the genome and convert the SAM output to GFF using SAM2GFF. Hope this helps.
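
To illustrate what the SAM-to-GFF step involves, here is a minimal Python sketch. It is not the SAM2GFF script itself, only an approximation of the idea: skip header lines and unmapped records, derive the end coordinate from the CIGAR string, and take the strand from the FLAG field. The function name `sam_line_to_gff3` and the `source` and `feature` defaults are assumptions chosen for illustration.

```python
import re

# One CIGAR element is a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def sam_line_to_gff3(line, source="miRBase", feature="miRNA"):
    """Return one GFF3 line for a mapped SAM record, or None otherwise."""
    if line.startswith("@"):
        return None  # header line, not an alignment
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return None  # incomplete record
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    start, cigar = int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read: no coordinates to report
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar)
                  if op in REF_CONSUMING)
    end = start + ref_len - 1  # SAM and GFF3 are both 1-based, inclusive
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([chrom, source, feature, str(start), str(end),
                      ".", strand, ".", f"Name={name}"])


def sam_to_gff3(lines):
    """Yield a GFF3 header followed by one line per mapped record."""
    yield "##gff-version 3"
    for line in lines:
        gff = sam_line_to_gff3(line)
        if gff is not None:
            yield gff
```

With a real file this could be used as `print("\n".join(sam_to_gff3(open("inputfile.sam"))))`. The sketch writes only a `Name` attribute because a miRNA that maps to several loci would otherwise produce duplicate `ID` values.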

Hi Prasad, thanks for the reply. On a second look, the miRBase sequences are in FASTA format, while Bowtie takes FASTQ reads for alignment. Is there a way around this?
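
One possible workaround, sketched in Python: convert the FASTA records to FASTQ with a constant placeholder quality. Bowtie also has a `-f` option for FASTA input, which may make the conversion unnecessary. The sketch assumes the miRBase mature sequences are written as RNA (with U) and rewrites them as DNA (with T) so they can match a DNA reference; the function names and the placeholder quality character `I` are illustrative choices.

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the read name.
            words = line[1:].split()
            name, chunks = (words[0] if words else "unnamed"), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_fastq(text, qual_char="I"):
    """Return FASTQ text with a placeholder quality for every base."""
    out = []
    for name, seq in parse_fasta(text):
        seq = seq.upper().replace("U", "T")  # RNA alphabet to DNA alphabet
        out.append(f"@{name}\n{seq}\n+\n{qual_char * len(seq)}\n")
    return "".join(out)
```

The placeholder qualities carry no information, so any aligner setting that weighs mismatches by base quality should be treated with care.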

I got that resolved, Prasad, and I now have the SAM file from the alignment. But when I tried to run the Perl script to convert the SAM file to GFF, I got a fatal error: "Unable to open input file <filename.sam>".

My command:

perl scampi_sam_to_gffv1.pl -i inputfile.sam -o outputfile.gff

Is the SAM file not in the current directory? Have you tried ./inputfile.sam to make the location more explicit?
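
A quick way to rule out path problems is a small check like the Python sketch below. It only tests the usual causes of an "unable to open" message (missing file, not a regular file, no read permission, empty file) and says nothing about what the Perl script does internally. The function name `diagnose_input` is hypothetical.

```python
import os


def diagnose_input(path):
    """Return a short description of whether `path` can be opened for reading."""
    full = os.path.abspath(path)  # resolves the path against the current directory
    if not os.path.exists(full):
        return f"missing: {full}"
    if not os.path.isfile(full):
        return f"not a regular file: {full}"
    if not os.access(full, os.R_OK):
        return f"not readable: {full}"
    if os.path.getsize(full) == 0:
        return f"empty: {full}"
    return f"ok: {full}"


if __name__ == "__main__":
    print(diagnose_input("inputfile.sam"))
```

If this reports `ok`, the file itself is accessible and the problem more likely lies in how the script receives or interprets its arguments.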

Yes, it is in the current directory. I also specified the path explicitly after getting the error, but the error persisted.

Can you post a few lines of your SAM file?

head inputfile.sam

Sure. Here is the output:

@HD     VN:1.0  SO:unsorted
@SQ     SN:Chromosome01 LN:34959721
@SQ     SN:Chromosome02 LN:32431396
@SQ     SN:Chromosome03 LN:29412403
@SQ     SN:Chromosome04 LN:28749345
@SQ     SN:Chromosome05 LN:28438989
@SQ     SN:Chromosome06 LN:27939960
@SQ     SN:Chromosome07 LN:27069033
@SQ     SN:Chromosome08 LN:34011518
@SQ     SN:Chromosome09 LN:29417918

That looks like a proper SAM file, assuming you see alignments further down in the file. Do you?

Yes, it is. Here is more of the file:

cel-lin-4-3p    4       *       0       0       *       *       0       0       ACACCTGGGCTCTCCGGGTACC  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-lin-4-5p    4       *       0       0       *       *       0       0       TCCCTGAGACCTCAAGTGTGA   IIIIIIIIIIIIIIIIIIIII   XM:i:0
cel-miR-1-5p    4       *       0       0       *       *       0       0       CATACTTCCTTACATGCCCATA  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-let-7-5p    4       *       0       0       *       *       0       0       TGAGGTAGTAGGTTGTATAGTT  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
cel-let-7-3p    4       *       0       0       *       *       0       0       CTATGCAATTTTCTACCTTACC  IIIIIIIIIIIIIIIIIIIIII  XM:i:0
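
In the five records shown above, the FLAG column is 4 and the reference name and position are `*` and 0. In the SAM format, FLAG bit 0x4 marks a read as unmapped, so these records carry no genomic coordinates that could be written to a GFF file. A minimal Python sketch for counting how many records in a SAM file are mapped (the function name `count_mapped` is an illustrative choice):

```python
def count_mapped(sam_lines):
    """Return (mapped, unmapped) counts for an iterable of SAM lines."""
    mapped = unmapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        if int(fields[1]) & 0x4:  # FLAG bit 0x4: read is unmapped
            unmapped += 1
        else:
            mapped += 1
    return mapped, unmapped


if __name__ == "__main__":
    with open("inputfile.sam") as handle:
        print(count_mapped(handle))
```

If the mapped count is zero for the whole file, the alignment step needs attention before any conversion to GFF can produce features.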