Change vcf header in windows
8.2 years ago
petersam • 0

I have run into a problem that seems simple but has been surprisingly hard. I have a group of .g.vcf.gz files that I want to joint genotype with GATK. They were generated with incorrect and sloppy read set names, and I would like to change the name in each header before grouping the files together. Unfortunately, I do not have access to a Unix-based machine. I have tried to install Cygwin so I could use samtools or tabix, but my lack of programming experience has prevented me from getting those programs to work. The other option I explored was decompressing the files, editing the VCF file, and then converting it back to .vcf.gz. My computer is not very powerful, and the large file size has stumped me.

Can anyone recommend an easy way to change or alter a header?

vcf windows header gatk samtools • 2.4k views

Hello there.

Well, it seems you have a set of problems rather than just one. I am in the same situation as you, but I have solved a few of them, so here you go:

If you have no access to a Unix or Linux platform and only have Windows, the way to go is to install a "virtual machine", which lets you run Unix/Linux as desired on your own Windows machine. Here is a way to start: https://www.storagecraft.com/blog/the-dead-simple-guide-to-installing-a-linux-virtual-machine-on-windows/

It took me a day or two to install it and understand the basics, but once it works, it is like having a whole new computer inside the only one you had before.

Once your Unix/Linux machine is running, you will need to install samtools and bcftools through the terminal. That is not a big deal; you should be able to do it by following the tutorials on the source page here: http://www.htslib.org/

You know the path: download the packages, unpack them, install them, go through the tutorial, etc.

When you have these programs installed: I have a script that solved the same problem for me on a .bam file.

I am having the same problem as you, trying to change a .vcf.gz header at the moment, and that is how I found your post. I hope my comment is of some help, and I hope we will both be able to solve the tip of the iceberg of our bioinformatics problem, because I think correcting the headers manually would be dead boring, and exaggeratedly time-consuming too.

Cheers!

8.2 years ago
Zaag ▴ 870

Maybe Notepad++ can handle it, so you can change the header manually (unpack the file first):

https://notepad-plus-plus.org/
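
If the unpacked files turn out to be too large for an editor, the sample name can also be changed while streaming, so no uncompressed copy is ever written to disk. Here is a minimal sketch in Python, which runs natively on Windows, using only the standard library. The function name `rename_sample` is made up for illustration. It assumes the only place that needs changing is the sample column of the `#CHROM` header line; any mention of the old name in other `##` metadata lines is left alone. Note also that it writes ordinary gzip rather than the block-compressed BGZF that bgzip produces, so tabix and possibly GATK may refuse to index the result until it is recompressed with bgzip.

```python
import gzip
import shutil


def rename_sample(src_path, dst_path, old_name, new_name):
    """Copy a gzipped VCF, renaming one sample in the #CHROM header line.

    Returns True if old_name was found among the sample columns.
    """
    old, new = old_name.encode(), new_name.encode()
    found = False
    # Binary mode avoids any newline translation on Windows.
    with gzip.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        for line in src:
            if line.startswith(b"#CHROM"):
                body = line.rstrip(b"\r\n")
                ending = line[len(body):]  # keep the original line ending
                fields = body.split(b"\t")
                # Columns 0-8 are the fixed VCF columns; samples start at 9.
                found = old in fields[9:]
                fields[9:] = [new if f == old else f for f in fields[9:]]
                dst.write(b"\t".join(fields) + ending)
                break
            dst.write(line)  # ## metadata lines are copied unchanged
        # Everything after the #CHROM line is record data: copy it as-is.
        shutil.copyfileobj(src, dst)
    return found
```

It could be called as `rename_sample("in.g.vcf.gz", "out.g.vcf.gz", "old_name", "new_name")`, where the file and sample names are placeholders. Memory use stays small because the file is processed in chunks, although recompressing a large file will still take a while on a slow machine.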
