Hi again,
It would actually be very helpful to be able to edit the #CHROM column as well. What I am after is to change this:
#CHROM POS ID
scaffold00002 764729 scaffold00002_764729_T_C
scaffold00002 764955 scaffold00002_764955_C_G
scaffold00002 765132 scaffold00002_765132_G_A
scaffold00002 766694 scaffold00002_766694_C_G
scaffold00002 766775 scaffold00002_766775_G_A
scaffold00002 766966 scaffold00002_766966_G_A
scaffold00002 771319 scaffold00002_771319_C_G
scaffold00002 773905 scaffold00002_773905_A_G
scaffold00002 775644 scaffold00002_775644_T_C
scaffold00002 776411 scaffold00002_776411_A_T
scaffold00007 1178023 scaffold00007_1178023_C_T
scaffold00007 1178440 scaffold00007_1178440_A_G
scaffold00007 1180956 scaffold00007_1180956_C_T
to this:
#CHROM POS ID
1 764729 scaffold00002_764729_T_C
1 764955 scaffold00002_764955_C_G
1 765132 scaffold00002_765132_G_A
1 766694 scaffold00002_766694_C_G
1 766775 scaffold00002_766775_G_A
1 766966 scaffold00002_766966_G_A
1 771319 scaffold00002_771319_C_G
1 773905 scaffold00002_773905_A_G
1 775644 scaffold00002_775644_T_C
1 776411 scaffold00002_776411_A_T
2 1178023 scaffold00007_1178023_C_T
2 1178440 scaffold00007_1178440_A_G
2 1180956 scaffold00007_1180956_C_T
Specifically, each scaffold name would be replaced by a number. I have 126 scaffolds in total, so the numbers in the #CHROM column would run from 1 to 126. I am not sure that simply replacing the column would work, because the file is a .vcf and has a header (I am pretty new to all this).
I tried this:
## Remove header from txt file
tail -n+2 sub.txt > newpos.txt
## Get header from vcf
grep -P '^#' test.vcf > new.vcf
grep -v -P '^#' test.vcf \
| cut -f3- \
| paste newpos.txt - >> new.vcf
This was posted in "Replace fields CHROM and POS in a vcf file", but it didn't work for me: the new.vcf file produced by the "## Get header from vcf" step is empty. I suspect something is wrong in the code, but as I don't fully understand this language I don't know what it could be. This is what my terminal shows:
MacBook-Pro-de-Angela:EditingCHROMfieldVCF angelaparodymerino$ grep -P '^#' mac3_minDP3_maxmeanDP289_maf005_minQ40_minGQ30_hwe005_265ind_IDs2.vcf > new.vcf
usage: grep [-abcDEFGHhIiJLlmnOoPqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
[-e pattern] [-f file] [--binary-files=value] [--color=when]
[--context[=num]] [--directories=action] [--label] [--line-buffered]
[--null] [pattern] [file ...]
In short, does anyone know how I can edit the #CHROM column of a .vcf file using a .txt file?
Thanks in advance,
Regards,
Angela Parody-Merino
Please show us a few lines of your input, a few lines of your desired output, and the solutions you have tried.
Please be as specific as possible, e.g. include error message or explain how the output you got differs from what you aim to obtain.
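The usage message in the terminal output suggests that the grep on that Mac does not accept the -P option, which would explain why new.vcf ends up empty. The pattern '^#' needs no Perl syntax, so a plain grep '^#' should match the header lines. Going one step further, here is a minimal sketch that skips the grep/cut/paste steps and the .txt file altogether. It assumes an uncompressed, tab-delimited VCF and that the scaffolds should be numbered in the order they first appear in the file, as in the desired output above. The function name renumber_chrom is made up for this example.

```shell
#!/bin/sh
# renumber_chrom: hypothetical helper name, not part of any VCF toolkit.
# Reads a VCF on stdin and writes it to stdout with column 1 (#CHROM)
# replaced by 1, 2, 3, ... in order of first appearance of each scaffold.
# Assumes tab-delimited, uncompressed input.
renumber_chrom() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }            # header lines pass through unchanged
         !($1 in id) { id[$1] = ++n }    # first time this scaffold name is seen
         { $1 = id[$1]; print }          # swap the name for its number
    '
}
```

It could then be run as: renumber_chrom < test.vcf > new.vcf (test.vcf being the file name used in the question). One caveat: any ##contig lines in the header are passed through untouched, so they would still carry the old scaffold names, and some downstream tools may complain about that mismatch.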