(1) Reference/remap files
We use joinx (http://gmt.genome.wustl.edu/joinx/current/) to create these. The usage is as follows:
joinx create-contigs -v my_variants.vcf -r my_refseq.fa -o my_new_contigs.fa -R my_new_contigs.fa.remap --flank=99
This creates a new reference/remap pair with one sequence per variant* in my_variants.vcf (whose variants are relative to my_refseq.fa), with 99 bp of flanking sequence on either side of each variant.
*The command is currently set up to only create sequences for variants that have an identifier (e.g., an rsID). This was fine for making the dbSNP reference but is probably not ideal for general use. I will make the identifier requirement optional and maybe add a few more options (such as skipping sites that fail filters, plus some of the things described in the next point) shortly.
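Until the identifier requirement is made optional, one workaround (a sketch, not a joinx feature — the `varN` placeholder scheme and filenames are my own invention) is to fill in the empty ID column (`.`) before running create-contigs, so every site gets a sequence:

```shell
#!/bin/sh
# Tiny example VCF; in practice this would be your real my_variants.vcf.
{
  printf '##fileformat=VCFv4.1\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '1\t100\trs1\tA\tG\t.\tPASS\t.\n'
  printf '1\t200\t.\tC\tT\t.\tPASS\t.\n'
} > my_variants.vcf

# Replace empty IDs ('.') with placeholder IDs (var1, var2, ...) so that
# joinx create-contigs, which currently skips sites without an identifier,
# emits a contig for every site. Existing IDs (rs1 here) are left alone.
awk 'BEGIN { OFS = "\t" }
     /^#/ { print; next }
     { n++; if ($3 == ".") $3 = "var" n; print }' \
    my_variants.vcf > my_variants.with_ids.vcf

cat my_variants.with_ids.vcf
```

The output file can then be passed to `joinx create-contigs -v my_variants.with_ids.vcf ...` as usual.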
(2) Refs based on sample genotypes
What you said (hs37lite.fa as the primary, the output of joinx create-contigs as the alternate) is how we do it. Right now, joinx doesn't look at the genotype data; contigs are created for every alternate allele in the ALT field for each site in the vcf file, whether or not those alleles appear in a GT call. I will add options to do things like only process alleles that are present in GT calls, and maybe allow some basic filtering on INFO/FORMAT fields (e.g., DP > 20). In any case, I don't think you need to worry about phasing in the vcf sample data (GT=1/2 vs. GT=1|2); both sequences will be created either way.
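In the meantime, a rough site-level approximation of the planned "only process alleles present in GT calls" behaviour can be done outside joinx. This sketch (filenames and the example VCF are mine, and it filters whole sites rather than individual alleles) drops records where no sample GT calls a non-reference allele:

```shell
#!/bin/sh
# Example VCF with two samples; the ALT allele at site 2 is never called.
{
  printf '##fileformat=VCFv4.1\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
  printf '1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\t0|0\n'
  printf '1\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/0\n'
} > my_variants.vcf

# Keep only sites where some sample GT contains a non-zero allele index,
# so contigs are not built for ALT alleles that no sample carries.
awk '/^#/ { print; next }
     { keep = 0
       for (i = 10; i <= NF; i++) {   # sample columns start at field 10
         split($i, f, ":")            # GT is the first FORMAT subfield
         if (f[1] ~ /[1-9]/) keep = 1 # non-zero allele index => called
       }
       if (keep) print }' my_variants.vcf > called_sites.vcf

cat called_sites.vcf
```

Note that this treats phased (`|`) and unphased (`/`) genotypes identically, which matches the point above: phasing does not matter for contig creation.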
(3) Variant calling pipeline
I wouldn't change what you're doing right away. I would suggest running things through your existing pipeline to see how the results vary. Most of the testing I have done personally has just used samtools for variant calling after aligning with ibwa (not necessarily because I feel that is the best thing to do). If you want to generate sequences from existing sample data in a vcf, then I would definitely not simplify your existing calling strategy for generating the initial set of variants, since you will want your alternate hypotheses to be as accurate as possible.
(4) Optimizations
Any pre-processing that works for bwa (trimming, filtering) should work the same for ibwa. The only differences between ibwa and stock bwa 0.5.9 are in sampe, so any method that speeds up "bwa aln" while yielding equivalent .sai files is applicable (pbwa might be an option here). GobyBWA looks like it has its own file formats, so it will not work well. Lastly, ibwa sampe does have a -t option to support multi-threading. There have been other sampe threading patches for bwa that work a bit better (at the expense of using more memory) than what I did in ibwa, but the -t option is worth trying if you become angry about the wall clock time used by ibwa sampe.
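Since the aln stage is unchanged, it also parallelizes trivially: the .sai output for one chunk of reads is independent of the rest. A minimal sketch of record-safe fastq splitting (the aligner invocations are left as comments because they need a real index; `ref.fa` and the chunk sizes are placeholders):

```shell
#!/bin/sh
# Make a small fastq (3 reads) standing in for a real lane of data.
for i in 1 2 3; do
  printf '@read%s\nACGTACGT\n+\nIIIIIIII\n' "$i"
done > reads.fq

# Split into chunks of whole records: fastq records are exactly 4 lines,
# so any line count that is a multiple of 4 keeps reads intact
# (use something like -l 4000000 for real data, not -l 8).
split -l 8 reads.fq chunk_

ls chunk_*
# Each chunk can then be aligned independently, e.g.:
#   for c in chunk_*; do ibwa aln ref.fa "$c" > "$c.sai" & done; wait
# and the per-chunk results merged afterwards; ibwa sampe -t covers
# threading for the sampe stage itself.
```

The only invariant that matters is that every chunk boundary falls between fastq records; the chunk size is otherwise a throughput/memory trade-off.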
For those of us who didn't know, iBWA is "a fork of Heng Li’s BWA aligner with support for iteratively adding alternate haplotypes, reference patches, and variant hypotheses."
Thanks for the clarification.