Question

Where can I download piRNAs (Piwi-interacting RNAs) of mouse (mm9)?

11.1 years ago

biorepine ★ 1.5k

Dear Biostars

Does anyone know where I can download mm9 piRNAs in BED or GTF format?

Thanks in advance.

• 5.2k views

ADD COMMENT • link updated 2.7 years ago by Ram 44k • written 11.1 years ago by biorepine ★ 1.5k

Can you give more detail? I now happen to need these coordinates in NCBI37 (mm9).

ADD REPLY • link updated 2.7 years ago by Ram 44k • written 9.7 years ago by bio_zhangxl ▴ 10

You can just use liftOver from UCSC to convert mm8 to mm9 coordinates.

ADD REPLY • link 9.7 years ago by Devon Ryan 104k

http://pirnabank.ibab.ac.in/ seems to be down. Is there any other database link?

Thanks

ADD REPLY • link 7.9 years ago by aky3100 • 0

I figured out how to download it: go to NCBI Nucleotide and search for piRNA plus your species.

ADD REPLY • link 7.8 years ago by 18745684945 • 0

As most of the answers say, piRNAs are usually downloaded from piRNABank. All the coordinates are quite old: hg18 for human and mm8 for mouse.

http://pirnabank.ibab.ac.in/request.html

However, if you BLAT these sequences, you will find that the coordinates are not always a perfect match.

Take a human piRNA as an example:

>hsa_piR_000011|gb|DQ569929|Homo sapiens:16:34608961:34608986:Plus
AAACUGACCAGAUGAAUGAGAAACCC

If you run BLAT at UCSC you will find:

   ACTIONS      QUERY           SCORE START  END QSIZE IDENTITY CHRO STRAND  START    END      SPAN
---------------------------------------------------------------------------------------------------
browser details YourSeq           26     1    26    26 100.0%    16   +   34608961  34608986     26
browser details YourSeq           25     1    26    26 100.0%     9   +   31898448  31898965    518
browser details YourSeq           24     1    26    26  96.2%    19   +   62716276  62716301     26

Everyone should think about how to deal with this problem.

ADD REPLY • link 6.2 years ago by Shicheng Guo ★ 9.5k
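One way to at least see how widespread the multi-hit problem is would be to count, per piRNA, how many places it aligns to over its full length. The following is a minimal sketch, not a method proposed in this thread. It assumes you have BLAT results as a header-less PSL file (for example produced with blat's -noHead option), and `count_perfect_hits` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads a header-less PSL file on stdin and lists
# queries with more than one full-length hit.
count_perfect_hits() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        # PSL column 1 = matches, column 10 = query name, column 11 = query size.
        # A hit counts when every query base matches (gapped hits count too).
        $1 == $11 { hits[$10]++ }
        END { for (q in hits) if (hits[q] > 1) print q, hits[q] }' | sort
}
```

Usage would look like `count_perfect_hits < piRNA_hits.psl > multi_hit_piRNAs.txt`, where both file names are placeholders. The output has two tab-separated columns: the query name and its number of full-length hits.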

score 4 · Answer 1 · 2013-12-12

1) Download the compressed file from http://pirnabank.ibab.ac.in/Mouse.tar.gz . It is mm9 based.

2) Uncompress the file using tar -xzvf Mouse.tar.gz. It will take a few minutes and a new directory with name Mouse will be created. Each piRNA has its own fasta file with a header storing the alignment information. For example, >mmu_piR_037869|gb|DQ726753|Mus_musculus:1:93235274:93235303:Minus. You now need to parse this information.

3) Create a shell script Piwi_BED.sh outside the Mouse directory and paste the below code in it.


for file in Mouse/mmu_piR_*
do
    # Print chr, start, end, name and strand as tab-separated columns
    grep ">" "$file" | awk -F: 'BEGIN{OFS="\t"} {print $2,$3,$4,$1,$5}'
done

4) Run the code sh Piwi_BED.sh > mm9_piwi.bed

5) The columns are chr, start, end, name and strand. You need to sort the file and also convert the strand values from "Plus" to "+" and from "Minus" to "-".
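Steps 3 to 5 can also be combined into one pass. The following is a minimal sketch, assuming every header has the form shown in step 2 (name, then colon-separated chromosome, start, end and strand). `headers_to_bed6` is a hypothetical helper name, and the score column is filled with 0 because standard BED expects a score in column 5 before the strand in column 6.

```shell
#!/bin/sh
# Hypothetical helper: reads FASTA text on stdin and writes sorted BED6.
# Assumed header form:
#   >mmu_piR_037869|gb|DQ726753|Mus_musculus:1:93235274:93235303:Minus
headers_to_bed6() {
    awk -F: 'BEGIN { OFS = "\t" }
        /^>/ {
            sub(/\r$/, "", $NF)              # tolerate Windows line endings
            split(substr($1, 2), id, "|")    # drop ">" and keep the piRNA name
            strand = "."
            if ($NF == "Plus")  strand = "+"
            if ($NF == "Minus") strand = "-"
            # Coordinates are copied unchanged; column order is
            # chrom, start, end, name, score, strand.
            print $(NF-3), $(NF-2), $(NF-1), id[1], 0, strand
        }' | sort -k1,1 -k2,2n
}
```

A possible invocation is `find Mouse -name 'mmu_piR_*' -exec cat {} + | headers_to_bed6 > mm9_piwi.bed`. Counting fields from the end of the line means a space in the species name (as in "Homo sapiens") does not shift the columns. BED starts are 0-based, so if the piRNABank coordinates turn out to be 1-based, the start column would need to be reduced by one.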

score 2 · Answer 2 · 2014-09-17

10.2 years ago

Devon Ryan 104k

I happened to need these coordinates in GRCm38 (aka mm10) today. As karlos.klammer mentioned, the coordinates in the txt files are mm8, so I just converted them to the current coordinate system with Ensembl chromosome names and made a GTF out of it. Should someone need something like this in the future (I have no idea why piRNAbank makes it such a pain to just get coordinates; they obviously have them in a database), you can just download the GTF here and save yourself the hassle.

ADD COMMENT • link 10.2 years ago by Devon Ryan 104k

I'm trying to create the same piRNA gtf file BUT for Human piRNAs (from piRNAbank).

I used the same command that karlos.klammer suggested here and did the liftOver at UCSC from hg18 to GRCh38. So I currently have a BED file that looks like this:

$ head hglft_piRNA.bed
chr1    14630   14657   >hsa_piR_013426|gb|DQ588205|Homo        sapiens 1       +
chr1    18536   18563   >hsa_piR_005239|gb|DQ577218|Homo        sapiens 1       +
chr1    26806   26836   >hsa_piR_016792|gb|DQ593109|Homo        sapiens 1       -
chr1    32134   32160   >hsa_piR_019669|gb|DQ596983|Homo        sapiens 1       -

...and now I'm stuck...

I need to end up with a file that looks exactly like the GTF you made (below), but how do I do this? I'm unfortunately not a top-notch bioinformatician, so help is most appreciated :)

$ head piRNAs.GRCm38.gtf

        2       piRNAbank       exon    92542278        92542304        .       +       .       gene_id "mmu_piR_000001"; transcript_id "AB250975";
        2       piRNAbank       exon    92543694        92543719        .       +       .       gene_id "mmu_piR_000002"; transcript_id "AB250977";
        2       piRNAbank       exon    92546494        92546519        .       +       .       gene_id "mmu_piR_000003"; transcript_id "AB250979";

ADD REPLY • link 8.0 years ago by heso ▴ 40

cat hglft_piRNA.bed | tr "\>\|" "\t" | awk '{printf("%s\tpiRNAbank\texon\t%s\t%s\t.\t%s\t.\tgene_id \"%s\"; transcript_id \"%s\";\n", $1, $2, $3, $9, $4, $6)}' > hglft_piRNA.gtf

ADD REPLY • link 8.0 years ago by Devon Ryan 104k

Worked perfectly, thanks a lot :)

ADD REPLY • link 8.0 years ago by heso ▴ 40

Hi, I have a similar problem and am missing one small step.

I needed exactly the same as heso and basically did the same.

I converted to a BED file with karlos.klammer's script and then got this:

$ head hsa_piRNA-GRCh38.bed

1       14630   14657   >hsa_piR_013426|gb|DQ588205|Homo        sapiens 1       +
1       18536   18563   >hsa_piR_005239|gb|DQ577218|Homo        sapiens 1       +
1       26806   26836   >hsa_piR_016792|gb|DQ593109|Homo        sapiens 1       -
1       32134   32160   >hsa_piR_019669|gb|DQ596983|Homo        sapiens 1       -
1       39680   39710   >hsa_piR_014636|gb|DQ590030|Homo        sapiens 1       -

I guess that because there was a space in the FASTA header, there are now two columns instead of one, with Homo and sapiens.

So when I now try to make a GTF using Devon Ryan's script I get the following:

$ head hsa_piRNA-try.gtf

1       piRNAbank       exon    14630   14657   .       1       .       gene_id "hsa_piR_013426"; transcript_id "DQ588205";
1       piRNAbank       exon    18536   18563   .       1       .       gene_id "hsa_piR_005239"; transcript_id "DQ577218";
1       piRNAbank       exon    26806   26836   .       1       .       gene_id "hsa_piR_016792"; transcript_id "DQ593109";
1       piRNAbank       exon    32134   32160   .       1       .       gene_id "hsa_piR_019669"; transcript_id "DQ596983";

..........

So instead of the +/- I get the 1. I thought I could correct this by running the command below instead, replacing 9 with 10 to get the next 'column', but this doesn't work and instead gives the weird output below.

$ cat hsa_piRNA-GRCh38.bed | tr ">\|" "\t" | awk '{printf("%s\tpiRNAbank\texon\t%s\t%s\t.\t%cas\t.\tgene_id \"%s\"; transcript_id \"%s\";\n", $1, $2, $3, $10, $4, $6)}' > hsa_piRNA-GRCh38.gtf

$ head hsa_piRNA-GRCh38-v3.gtf

1       .iRNAbangene_id "hsa_piR_013426"; transcript_id "DQ588205";
1       .iRNAbangene_id "hsa_piR_005239"; transcript_id "DQ577218";
1       .iRNAbangene_id "hsa_piR_016792"; transcript_id "DQ593109";
1       .iRNAbangene_id "hsa_piR_019669"; transcript_id "DQ596983";
1       .iRNAbangene_id "hsa_piR_014636"; transcript_id "DQ590030";

Maybe someone with more bioinfo knowledge has a quick idea how to fix this? Thank you so much!

ADD REPLY • link 7.4 years ago by Anna • 0
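Both symptoms above have plausible explanations, offered here as assumptions rather than a confirmed diagnosis. The `1` in the strand column is what you would expect when `Homo` and `sapiens` are split into separate fields, which pushes the strand one field further right. The overwritten-looking lines (`.iRNAban...`) would be consistent with a carriage return at the end of each input line (Windows line endings). A minimal sketch that sidesteps both by stripping carriage returns and reading the strand from the last field follows; `bed_to_gtf` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads the BED-like lines shown above on stdin
# and writes GTF lines in the same layout as piRNAs.GRCm38.gtf.
bed_to_gtf() {
    tr -d '\r' | awk 'BEGIN { OFS = "\t" }
        {
            # Field 4 looks like ">hsa_piR_013426|gb|DQ588205|Homo"
            split($4, id, "|")
            sub(/^>/, "", id[1])
            # The strand is taken from the last field, whatever its index.
            print $1, "piRNAbank", "exon", $2, $3, ".", $NF, ".",
                  "gene_id \"" id[1] "\"; transcript_id \"" id[3] "\";"
        }'
}
```

It could be run as `bed_to_gtf < hsa_piRNA-GRCh38.bed > hsa_piRNA-GRCh38.gtf`. Coordinates are copied unchanged, as in the original one-liner.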