Dear all,
I have a BED file like:
chr \t start \t stop
I would like to add the reference allele from hg19 as a fourth column, so the output should look like this, for example:
chr \t start \t stop \t A/A
chr \t start \t stop \t C/C
chr \t start \t stop \t T/T
chr \t start \t stop \t A/A
So I want to create something very similar to a gVCF. The reason is that I need to annotate this output file with VEP.
Do you know a bit of Python? I have a script which almost does what you want.
Thank you for the reply. I mostly script in bash and awk, but maybe I would understand it.
How big is your file? Wondering whether to do everything in memory or stream.
It could be about 500000 rows and about 20 MB.
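At that size, streaming the BED file row by row should be fine. Here is a minimal sketch of one way to do it with only the Python standard library. It is not the script mentioned above, and the function names (build_index, fetch, annotate) are made up for illustration. It assumes an uncompressed hg19 FASTA in which every sequence line of a chromosome has the same width (except the last), chromosome names in the BED that match the FASTA headers, and standard BED coordinates (0-based start, end not included). It does no bounds checking, so an interval running past the end of a chromosome would give garbage.

```python
import sys


def build_index(fasta_path):
    """Scan the FASTA once and record, per sequence name, the byte offset of
    its first base, the bases per line, and the bytes per line (with newline).
    This is the same layout information a .fai index stores."""
    index = {}
    with open(fasta_path, "rb") as fh:
        line = fh.readline()
        while line:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                offset = fh.tell()          # first byte after the header line
                first = fh.readline()       # first sequence line (or next header)
                index[name] = (offset, len(first.rstrip(b"\r\n")), len(first))
                line = first
                continue
            line = fh.readline()
    return index


def fetch(fh, entry, start, end):
    """Return the bases in the 0-based, half-open interval [start, end)."""
    offset, linebases, linewidth = entry
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    fh.seek(first)
    raw = fh.read(last - first)
    # Drop line breaks inside the span; hg19 soft-masks repeats in lowercase.
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode().upper()


def annotate(bed_path, fasta_path, out=sys.stdout):
    """Stream the BED file and write chr, start, stop, REF/REF per row."""
    index = build_index(fasta_path)
    with open(bed_path) as bed, open(fasta_path, "rb") as fa:
        for line in bed:
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom, start, stop = line.rstrip("\n").split("\t")[:3]
            ref = fetch(fa, index[chrom], int(start), int(stop))
            out.write(f"{chrom}\t{start}\t{stop}\t{ref}/{ref}\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python add_ref.py input.bed hg19.fa > output.bed
    annotate(sys.argv[1], sys.argv[2])
```

Scanning the whole genome to build the index takes a while in pure Python, but it happens only once per run, and each of the 500000 lookups afterwards is just a seek and a short read. If your intervals are one base wide, the fourth column comes out as A/A, C/C and so on. For wider intervals this sketch writes the whole reference sequence on both sides of the slash, which may or may not be what VEP expects.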