9.3 years ago
krp0001
Dear Users,
I couldn't find a script that can split my VCF file, which contains scaffolds (10322 of them) rather than numbered chromosomes. I would like to split it so that each scaffold goes into its own VCF file or text file.
Thank you
Hi Goutham,
Thank you for your quick response. I tried your script, but I don't know what parallel -a is. I am running it on a grid, not on a local machine, and I get the error "-bash: parallel: command not found".
Thank you
For the time being, you could use the 'for' loop option. To use the parallel version you need to install GNU Parallel. It's a wonderful tool.
Hi again,
Here are a few lines of the output; the headers are missing from the VCF file.
I need all the headers to be present for each scaffold, something like this (please see below):
The file has to be split into individual VCF files, each containing the variant information for only one scaffold. Sorry if I haven't made it clear.
Run it on this file and let me know what problem you face. I could not understand the issue from your data.
Hi Goutham,
I am looking for your help again. I finally managed to split my large VCF file with the script below (from the forums):
The problem now is that the header is missing from each scaffold file. I managed to extract the header with:
Could you please help me add the header information to each scaffold file? Please let me know if I am not clear.
Thank you
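For the situation described here (per-scaffold files already written without headers, and the header saved to a separate file), one possible way to put the two together is sketched below in Python. This is not the script from the thread. The function name add_header and all file and directory names are hypothetical, and the sketch assumes uncompressed text files and an output directory different from the input directory.

```python
import glob
import os


def add_header(header_path, body_glob, out_dir):
    """Copy each header-less file matched by body_glob into out_dir,
    with the contents of header_path written first."""
    os.makedirs(out_dir, exist_ok=True)
    with open(header_path) as fh:
        header = fh.read()
    # Make sure the header ends with a newline so the first record
    # does not get glued onto the #CHROM line.
    if header and not header.endswith("\n"):
        header += "\n"
    written = []
    for body_path in sorted(glob.glob(body_glob)):
        out_path = os.path.join(out_dir, os.path.basename(body_path))
        # Refuse to overwrite an input file while reading it.
        if os.path.abspath(out_path) == os.path.abspath(body_path):
            raise ValueError("out_dir must differ from the input directory")
        with open(body_path) as body, open(out_path, "w") as out:
            out.write(header)
            for line in body:
                out.write(line)
        written.append(out_path)
    return written
```

A call such as add_header("header.txt", "split/*.vcf", "with_header") would then write one headed copy per scaffold file, where all three names are placeholders.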
If you are looking for a very simple approach, I would do something like this:
This should also split the VCF by chromosome/scaffold with the header intact. Hope this helps.
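As an illustration of the idea (one output file per scaffold, each starting with the full header), here is a minimal single-pass sketch in Python. It is not the command originally posted in this answer. The function name split_vcf_by_scaffold and the output naming scheme are hypothetical, and the sketch assumes an uncompressed VCF whose scaffold names are safe to use as file names.

```python
import os


def split_vcf_by_scaffold(vcf_path, out_dir):
    """Write out_dir/<scaffold>.vcf for every scaffold in vcf_path,
    each file beginning with the complete header of the input."""
    os.makedirs(out_dir, exist_ok=True)
    header = []      # every line starting with '#'
    seen = set()     # scaffolds that already have an output file
    current = None   # scaffold of the currently open output file
    out = None
    with open(vcf_path) as vcf:
        for line in vcf:
            if line.startswith("#"):
                header.append(line)
                continue
            if not line.strip():
                continue  # skip blank lines
            scaffold = line.split("\t", 1)[0]  # CHROM is the first column
            if scaffold != current:
                # Only one output file is open at a time, so thousands
                # of scaffolds do not exhaust the open-file limit.
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, scaffold + ".vcf")
                if scaffold in seen:
                    # Unsorted input: append to the existing file.
                    out = open(path, "a")
                else:
                    out = open(path, "w")
                    out.writelines(header)
                    seen.add(scaffold)
                current = scaffold
            out.write(line)
    if out is not None:
        out.close()
    return sorted(seen)
```

Calling split_vcf_by_scaffold("input.vcf", "by_scaffold") with placeholder names returns the sorted list of scaffold names it wrote. Keeping a single file handle open relies on records for a scaffold usually being contiguous; for unsorted input it still works, only with more open and close operations.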