Question

Getting the targets from a dat file

6.1 years ago

ailtonpcf • 0

I used miranda and RNAhybrid to find miRNAs in a miRNA-seq dataset. My database was the Homo sapiens 3'UTR set hosted in UTRdb. In the end I have a CSV file with the IDs of thousands of targets. How can I get the target genes from a dat file, using the CSV file as input? In short, I want to provide the 'ID' and get back the 'DE'. A sample record from the dat file:

ID   3HSAA000001; SV 1; linear; mRNA; STD; HUM; 216 BP.
XX
AC   CA000001;
XX
DT   01-JUL-2009 (Rel. 1, Created)
DT   01-JUL-2009 (Rel. 1, Last updated, Version 1)
XX
DE   3'UTR in Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA.
XX
DR   ASPicDB; b7e045ed97;
DR   UTRaspic; BA000001;
DR   GeneID; 1;
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Homo.
XX
UT   3'UTR; Complete; 1 exon(s)
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..216
FT                   /organism="Homo sapiens"
FT                   /mol_type="mRNA"
FT                   /db_xref="taxon:9606"
FT                   /db_xref="RefSeq:NM_130786"
FT   3'UTR           1..216
FT                   /source="ASPicDB::b7e045ed97:1551..1766"
FT                   /gene="A1BG"

dat utrdb perl python R • 1.9k views

updated 6.1 years ago by finswimmer • written 6.1 years ago by ailtonpcf

score 1 · Answer 1 · 2018-10-11

6.1 years ago

finswimmer 16k

Hello ailtonpcf,

What should your output look like? Here is a way using grep. It prints each line starting with ID, followed by the corresponding line starting with DE.

ids.txt looks like this:

3HSAA000001
3HSAA000002
3HSAA000003

The command to use:

$ grep -E "^ID|^DE" 3UTRaspic.Hum.dat|grep -A1 -f ids.txt
ID   3HSAA000001; SV 1; linear; mRNA; STD; HUM; 216 BP.
DE   3'UTR in Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA.
ID   3HSAA000002; SV 1; linear; mRNA; STD; HUM; 1844 BP.
DE   3'UTR in Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA.
ID   3HSAA000003; SV 1; linear; mRNA; STD; HUM; 172 BP.
DE   3'UTR in Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA

fin swimmer
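
For anyone who would rather have the result as an ID-to-DE table than as grep output, the same lookup can be sketched in Python. This is only a sketch under some assumptions that are not stated in the thread: the IDs sit in the first column of the CSV with no header row, records follow the EMBL-style layout shown in the question (a two-letter line code followed by three spaces), and a DE entry may continue over several DE lines. The function names `parse_id_to_de`, `read_ids`, and `lookup`, and the `SAMPLE_DAT` and `SAMPLE_CSV` strings, are made up for illustration.

```python
import csv

# Two lines taken from the record in the question, used as a tiny test input.
SAMPLE_DAT = """\
ID   3HSAA000001; SV 1; linear; mRNA; STD; HUM; 216 BP.
XX
DE   3'UTR in Homo sapiens alpha-1-B glycoprotein (A1BG), mRNA.
XX
"""

# Assumed CSV layout: ID in the first column, no header row.
SAMPLE_CSV = "3HSAA000001\n3HSAA999999\n"


def parse_id_to_de(dat_lines):
    """Build a dict mapping each record ID to its DE text."""
    id_to_de = {}
    current_id = None
    for line in dat_lines:
        if line.startswith("ID   "):
            # "ID   3HSAA000001; SV 1; ..." -> "3HSAA000001"
            current_id = line[5:].split(";", 1)[0].strip()
            id_to_de[current_id] = ""
        elif line.startswith("DE   ") and current_id is not None:
            # Join continuation DE lines with a single space.
            text = line[5:].strip()
            id_to_de[current_id] = (id_to_de[current_id] + " " + text).strip()
    return id_to_de


def read_ids(csv_lines):
    """Return the non-empty values of the first CSV column."""
    return [row[0].strip() for row in csv.reader(csv_lines)
            if row and row[0].strip()]


def lookup(ids, id_to_de, missing="NOT FOUND"):
    """Pair every requested ID with its DE text, or a placeholder."""
    return [(i, id_to_de.get(i, missing)) for i in ids]


if __name__ == "__main__":
    table = parse_id_to_de(SAMPLE_DAT.splitlines())
    for rec_id, de in lookup(read_ids(SAMPLE_CSV.splitlines()), table):
        print(rec_id, de, sep="\t")
```

For real data, the two sample strings would be replaced by open file handles, for example `open("3UTRaspic.Hum.dat")` and the CSV of target IDs. Unlike the grep pipeline, this matches IDs exactly rather than as substrings, and it reports IDs that are absent from the dat file.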

Thank you finswimmer for the help.

written 6.1 years ago by ailtonpcf

Finswimmer, would you know how to get only specific "DE" lines, based on a file.txt containing a list of "ID"s?

written 6.1 years ago by ailtonpcf

ailtonpcf, please specify what your input file(s) and output file should look like (show examples).

written 6.1 years ago by finswimmer