Convert MAF to FASTA
5.9 years ago

I've obtained multiple MAF files from UCSC, and I would like to convert them into FASTA files. Is there a tool for this? I found mafTools on GitHub (https://github.com/dentearl/mafTools/), which includes a mafToFastaStitcher tool, but its description says it requires complete FASTA records to work. That is an extra step which, if I understand correctly, could in theory be avoided.

But if it cannot be avoided, how can I get complete FASTA records for my MAFs from UCSC?

For example:

##maf version=1 scoring=blastz
a score=1027970.000000
s dm6.chr2R              21257296 172 + 25286936 GAC--TGGACTGC---ATCAGATAGC---ATTAAATTGCTGGTCACTCG--------------CAATCAGCGAAAACAAGC--GAAAC-TGAATGGAGCAAA----CAAAGAGCAGTTATTGCGGGCAATC-----ATTAGTGATACAAATCGCCG--------AAACAATTCCCC---GGAGAT--------CTGGAACCTAATCAG-----------------GACA---------------------------------------------------------GTCCATG------A-----------------------------CCACA
s droSim1.chr2R          15795742 174 + 19596830 GACTGTGGCTTGG---ATCAGATAGT---ATTAAATTGCTGGTCACTCG--------------CAATCAGCGAAAACAAGC--GAAAC-TGAATGGAGCAAA----CAAAGAGCAGTTATTGCGGGCAATC-----ATTAGTGATACAAATCGCCG--------CAGCAACTCCCC---GGAGAT--------CTGGAACCTAATCAG-----------------GACA---------------------------------------------------------GTCCATG------A-----------------------------CCACA
i droSim1.chr2R          C 0 C 0
s droSec1.super_9          474992 172 +  3197100 GAC--TGGACTGG---ATCAGATAGT---ATTAAATTGCTGGTCACTCG--------------CAATCAGCGAAAACAAGC--GAAAC-TGAATGGAGCAAA----CAAAGAGCAGTTATTGCGGGCAATC-----ATTAGTGATACAAATCGCGG--------CAGCAATTCCCC---GGAGAT--------CTGGAACCTAATCAG-----------------GACA---------------------------------------------------------GTCCATG------A-----------------------------CCACA
i droSec1.super_9        C 0 C 0
s droYak3.chr2R           9094987 172 - 21139217 GAC---CGACTGG---ATCAGATAGT---ATTAAATTGCTGGTCACTCG--------------CAATCAGCGAAAACAAGC--GAAAC-TGAATGGAGCAAA----CAAAGGGCAGTTATTGCGGGCAATC-----ATTAGTGATACAAATCGCCG--------CAGCAATTCCCC---GGAGAT--------CTGAAACCTAATCAG-----------------GACA--------------------------------------------------------CAACCATG------A-----------------------------CCACA

How can I automatically download the complete record for droSec1.super_9? I think there should be a way to do it.


Thank you for the second link, I'll look into it.

Do you happen to know how to get complete UCSC FASTA records for a sequence such as droSec1.super_9? I think I will need them later anyway.

5.9 years ago
kloetzl ★ 1.1k

I am not sure what you want to achieve, but this extracts the aligned sequences and produces FASTA-like output (one record per "s" line, with gap characters retained): awk '/^s/{print ">" $2 "\n" $7}' foo.maf


Does it take into account the separate blocks within an alignment? That is the whole problem: different blocks may include different organisms.
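
One possible way to handle blocks with differing organisms is sketched below in Python. It is only an illustration, not what mafToFastaStitcher does: it reads the "s" lines of each block, concatenates blocks in file order, and pads any organism missing from a block with gap characters so that all output sequences stay the same length. The function names (parse_maf_blocks, species_of, stitch, to_fasta) are made up for this example. It assumes that the part of the source name before the first dot identifies the organism, that each organism appears at most once per block (a later duplicate would overwrite an earlier one), and it ignores any unaligned reference sequence between blocks.

```python
def parse_maf_blocks(lines):
    """Yield one dict per alignment block: {source_name: aligned_text}."""
    block = {}
    for line in lines:
        fields = line.split()
        # A blank line or a new 'a' line closes the current block.
        if not fields or fields[0] == "a":
            if block:
                yield block
            block = {}
        elif fields[0] == "s":
            # MAF 's' line: s src start size strand srcSize text
            block[fields[1]] = fields[6]
    if block:
        yield block


def species_of(src):
    """Assumption: 'dm6.chr2R' -> 'dm6' (text before the first dot)."""
    return src.split(".", 1)[0]


def stitch(blocks, key=species_of):
    """Concatenate blocks per organism, gap-padding absent organisms."""
    blocks = list(blocks)
    names = []  # organisms in order of first appearance
    for block in blocks:
        for src in block:
            name = key(src)
            if name not in names:
                names.append(name)
    parts = {name: [] for name in names}
    for block in blocks:
        # All rows in one block have the same aligned width.
        width = len(next(iter(block.values())))
        by_name = {key(src): text for src, text in block.items()}
        for name in names:
            parts[name].append(by_name.get(name, "-" * width))
    return {name: "".join(chunks) for name, chunks in parts.items()}


def to_fasta(seqs):
    """Format {name: sequence} as FASTA text, one line per sequence."""
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())
```

With a file on disk, usage would look like: with open("foo.maf") as fh: print(to_fasta(stitch(parse_maf_blocks(fh))), end=""). If the blocks are not adjacent on the reference, the concatenated alignment will silently skip the uncovered positions, so check the start coordinates if that matters for your analysis.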
