Question

Converting A Microarray Dataset Into Fasta Files

13.2 years ago

Stephanhart ▴ 100

I have GenBank accession numbers and gene symbols from my Agilent microarray.

A_55_P2051983    FALSE    NM_001001803    Spink7    serine peptidase inhibitor, Kazal type 7 (putative)    Mus musculus serine peptidase inhibitor, Kazal type 7 (putative) (Spink7), mRNA [NM_001001803]    GO:0005576|GO:0004867|GO:0030414    chr18:62753954-62753895        CAGTTTGTGGATCTGACTATATCACTTACGGGAATAAATGCAAGCTGTGTACAGAGATCT

I would like to convert all of these into one fasta file. Any advice would be appreciated.

Thanks,
Stephen

microarray fasta • 2.6k views

So you actually want a FASTA file with the full DNA sequence of the transcript or gene that each reporter sequence matches? (Pierre's answer appears correct for the question as you originally posed it.) If so, you need to specify whether you want gene or transcript sequences, and which transcript to take when a gene has several. You can get this information from, e.g., BioMart using a list of accession numbers.

13.2 years ago by Michael 55k

score 2 · Answer 1 · 2011-10-05

13.2 years ago

Pierre Lindenbaum 164k

awk -F '\t' '{printf(">"); for(i=1;i<NF;i++) printf("%s ", $i); printf("\n%s\n", $NF);}' yourlist.txt > result.fa
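For reference, here is a minimal sketch of this kind of row-to-FASTA conversion. It assumes the list is tab-delimited, with the probe ID in column 1, the accession in column 3, the gene symbol in column 4 and the probe sequence in the last column, as in the example row from the question. The function name `rows_to_fasta` is made up for illustration.

```shell
# Hypothetical helper: read tab-delimited rows on stdin, write FASTA on stdout.
# Header = probe ID, accession and gene symbol; sequence = last column.
# Rows whose last column is not a plain nucleotide string (for example a
# header row or a control probe without a sequence) are skipped.
rows_to_fasta() {
  awk -F '\t' '$NF ~ /^[ACGTNacgtn]+$/ {
    printf(">%s %s %s\n%s\n", $1, $3, $4, $NF)
  }'
}

# Usage (file names are placeholders):
#   rows_to_fasta < yourlist.txt > probes.fa
```

Keeping the header short avoids carrying the long description and GO columns into every FASTA record; adjust the column numbers if your export is laid out differently.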

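Edit: to fetch the full sequence for each accession from NCBI instead:

# Column 3 of the tab-delimited list holds the GenBank accession.
# NCBI E-utilities now require https.
cut -f 3 yourlist.txt |\
while read -r A
   do
     # skip rows without an accession
     [ -n "$A" ] || continue
     curl -s "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=${A}&retmode=text&rettype=fasta"
   done > result.fa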
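With a full array's worth of accessions, one request per row gets slow, and efetch accepts a comma-separated list of IDs. The following is only a sketch of batching the lookups, assuming the accession is in column 3 of a tab-delimited file. The helper names `batch_accessions` and `efetch_url`, the batch size and the one-second pause are illustrative choices, not anything NCBI prescribes.

```shell
# Hypothetical helper: read tab-delimited rows on stdin and print the unique,
# non-empty accessions from column 3, joined by commas, at most $1 per line
# (default 100, an arbitrary choice).
batch_accessions() {
  cut -s -f 3 | awk -v n="${1:-100}" 'NF && !seen[$0]++ {
    printf("%s%s", (c % n ? "," : (c ? "\n" : "")), $0); c++
  } END { if (c) print "" }'
}

# Hypothetical helper: build the efetch URL for one comma-separated ID list.
efetch_url() {
  printf 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=%s&retmode=text&rettype=fasta\n' "$1"
}

# Usage (file names are placeholders; the sleep keeps the request rate low):
#   batch_accessions 100 < yourlist.txt |
#   while read -r ids; do curl -s "$(efetch_url "$ids")"; sleep 1; done > transcripts.fa
```

Deduplicating first matters because several probes often map to the same transcript, so the same accession can appear on many rows.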
