Hello,
I am attempting to use the 27-primate UCSC multiz alignment data to find orthologous sequences (across different mammals) for a human reference sequence (whether an ORF, a transcript, etc.). I downloaded all the data in the maf folder here https://hgdownload.soe.ucsc.edu/goldenPath/hg38/multiz30way/maf/, but I am very lost on how to extract anything meaningful from it. Specifically, even with its description, I am unclear on what the alignments and maf folders are, and how to read the files within them. If anyone could send resources or explanations, it would be greatly appreciated. Thanks!
There are several maf-related utility programs included in the Kent utilities at UCSC. You can find their descriptions in this file (there are other programs in there as well, so look for the maf programs). I linked the Linux version, but macOS binaries are available as well. Here is an example of how to extract regions: https://groups.google.com/a/soe.ucsc.edu/g/genome/c/87HvXW137SI
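On how to read the files themselves: MAF is a plain-text format in which each alignment block starts with an "a" line, followed by one "s" line per aligned species with the fields src, start, size, strand, srcSize and the gapped sequence text. Below is a minimal Python sketch of a reader, not a replacement for the Kent tools. The names `SeqLine`, `parse_maf`, `ungapped_by_species` and the toy data in `EXAMPLE_MAF` (including `speciesB` and `speciesC`) are made up for illustration; only the line layout follows the MAF format description.

```python
import io
from collections import namedtuple

# One "s" line of a MAF block. start is zero-based; on the "-" strand it is
# relative to the reverse-complemented source sequence.
SeqLine = namedtuple("SeqLine", "src start size strand src_size text")

# Toy data, invented for this example (not taken from the real alignment).
EXAMPLE_MAF = """\
##maf version=1
a score=100.0
s hg38.chr1     100 10 + 1000 ACGTACGTAC
s speciesB.chr1 200  9 + 2000 ACGTAC-TAC

a score=50.0
s hg38.chr1     300 5 + 1000 GGCCA
s speciesC.chr2  40 5 - 3000 GGTCA
"""


def parse_maf(handle):
    """Yield each alignment block as a list of SeqLine records."""
    block = []
    for line in handle:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank separator or header/comment line
        if fields[0] == "a":
            # An "a" line opens a new block; emit the previous one first.
            if block:
                yield block
            block = []
        elif fields[0] == "s":
            block.append(
                SeqLine(fields[1], int(fields[2]), int(fields[3]),
                        fields[4], int(fields[5]), fields[6])
            )
        # Other line types (e.g. "i", "e", "q") are ignored in this sketch.
    if block:
        yield block


def ungapped_by_species(block):
    """Map the assembly name (the part of src before the first '.') to its
    aligned sequence with gap characters removed."""
    return {s.src.split(".")[0]: s.text.replace("-", "") for s in block}


for blk in parse_maf(io.StringIO(EXAMPLE_MAF)):
    human = blk[0]
    print(human.src, human.start, human.start + human.size,
          ungapped_by_species(blk))
```

For a real file you would pass an open file handle instead of the `io.StringIO` wrapper; if the download is gzip-compressed, `gzip.open(path, "rt")` gives a text handle that works the same way. The sketch assumes the reference species is the first "s" line in each block, which you should check against your files.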
Thank you so much. The documentation seems a bit ambiguous, but I'll try my best and will post some questions here if I have them!