Hello,
I am writing concerning the following UCSC utility (binary file): http://hgdownload.cse.ucsc.edu/admin/exe/
================================================================
======== mafsInRegion ====================================
================================================================
mafsInRegion - Extract MAFS in a genomic region
usage:
mafsInRegion regions.bed out.maf|outDir in.maf(s)
options:
-outDir - output separate files named by bed name field to outDir
-keepInitialGaps - keep alignment columns at the beginning and of a block that are gapped in all species
I have been experimenting with this executable and have come across some odd behaviour:
If I pass mafsInRegion a BED file containing a single record, it succeeds and behaves as expected: the region given by the genomic co-ordinates is extracted from the input MAF file and written to the output file. For example:
chr1 1233 1244 Transcript_1 (mafsInRegion processing succeeds)
However, if I pass a different BED file that contains the same record plus an additional record sharing either the start or the end co-ordinate of the preceding record, mafsInRegion fails and, apart from header information, the output file is empty:
chr1 1233 1244 Transcript_1 (mafsInRegion processing succeeds)
chr1 1233 1245 Transcript_2 (mafsInRegion processing fails)
chr1 1244 1245 Transcript_3 (mafsInRegion processing fails)
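To make the pattern concrete, here is a minimal Python sketch (not part of the UCSC tools) that flags BED records sharing a start or end co-ordinate with the record immediately before them. The function name `records_sharing_coordinate` is made up for illustration, and the sketch assumes whitespace-separated BED lines with at least three columns; it only identifies the records that match the pattern described above and says nothing about why mafsInRegion behaves this way.

```python
def records_sharing_coordinate(bed_lines):
    """Return names of BED records that share a start or end co-ordinate
    with the preceding record on the same chromosome.

    Hypothetical helper for illustration only; assumes whitespace-separated
    BED lines with at least three columns (chrom, start, end).
    """
    flagged = []
    prev = None  # (chrom, start, end) of the previous valid record
    for line in bed_lines:
        # Skip header-style lines and anything too short to be a BED record.
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        # Fall back to a co-ordinate label if the name column is absent.
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        if prev is not None and prev[0] == chrom and (prev[1] == start or prev[2] == end):
            flagged.append(name)
        prev = (chrom, start, end)
    return flagged


example = [
    "chr1 1233 1244 Transcript_1",
    "chr1 1233 1245 Transcript_2",
    "chr1 1244 1245 Transcript_3",
]
print(records_sharing_coordinate(example))
```

Screening a BED file this way before running mafsInRegion would show which records match the failing pattern in the example above.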
So my questions are these:
- Is this a recognised problem in mafsInRegion?
- Is there a recognised solution to this problem?
- If not, what do you reckon will be the easiest way to fix this problem?
Thank you
Can you send this question as an email to genome@soe.ucsc.edu? That way our whole team can see the question and has the opportunity to answer.
Thanks, ChrisL from the UCSC Genome Browser
Hi Chris,
The question has been sent to genome@soe.ucsc.edu as well.
Thank you, Nikos