While validating some VCFs that we're creating, we hit a snag at positions where the reference base is non-ACGT. (For example, it could be an "M", the IUPAC code for A or C.) The VCF spec states that only ACGT bases are allowed in the REF column, and that column doesn't allow a comma-separated list.
What's the proper way to encode such a position? At present, I'm leaning towards replacing the ref base with an N. Is this reasonable and/or correct?
People on the VCF-spec mailing list seem to think that we're right - it should be an N, at least under the current spec. Pierre gets best-answer for being faster, but upvotes all around!
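For anyone who wants to apply the same fix, here is a minimal sketch of the N-replacement in Python. It assumes plain tab-separated VCF text with REF in the fourth column; the function names `mask_ref` and `mask_vcf_line` are made up for illustration and don't come from any VCF library. It only touches REF, so check separately whether ALT needs attention in your files.

```python
import re

# Anything other than A, C, G, T, or N (either case) is treated as ambiguous.
_NON_ACGTN = re.compile(r"[^ACGTNacgtn]")


def mask_ref(ref: str) -> str:
    """Replace every non-ACGTN base in a REF allele with N."""
    return _NON_ACGTN.sub("N", ref)


def mask_vcf_line(line: str) -> str:
    """Return a VCF line with ambiguous REF bases masked.

    Header lines (starting with '#') and blank lines pass through unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    fields[3] = mask_ref(fields[3])  # REF is the 4th tab-separated column
    return "\t".join(fields) + newline
```

Usage would be along the lines of reading the input VCF line by line and writing `mask_vcf_line(line)` to the output file, then re-running the validator on the result.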