3.8 years ago
evafinegan
Hello,
I have a VCF file with no IDs in the ID column for any of the SNPs. So I manually added unique IDs to the SNPs using:
awk '{OFS="\t"} NR<67 {print $0;next} {{$3=$1"_"$2} print}' sample.vcf > out.vcf
but it also changed the column name from ID to #CHROM_POS. Now I am getting the error
Error in x@fix[, "ID"] : subscript out of bounds
in the downstream analysis. I think it's the replaced column name that's causing the error. Is there a way to keep the column name as ID in the awk command line? Thank you!
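The value #CHROM_POS is what $1"_"$2 produces on the #CHROM header line, so that line was evidently not covered by the NR<67 test. A minimal sketch of one way around this, assuming a tab-delimited VCF in which every header line starts with "#": skip header lines by pattern rather than by line number. The function name add_vcf_ids is hypothetical and used only for illustration.

```shell
# Hypothetical helper: fill the ID column (3rd field) of a VCF with CHROM_POS,
# leaving every header line (anything starting with "#") untouched.
add_vcf_ids() {
    awk 'BEGIN { FS = OFS = "\t" }      # read and write tab-separated fields
         /^#/ { print; next }           # meta-information lines and the #CHROM line pass through
         { $3 = $1 "_" $2; print }' "$1"
}

# Usage: add_vcf_ids sample.vcf > out.vcf
```

Setting FS as well as OFS to a tab keeps fields that contain spaces (for example in INFO) from being split, and matching on "#" avoids having to know how many header lines the file has.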
Thank you! I used awk and now it gives this error: ID column contains non-unique names
That is because the CHROM and POS columns alone are not enough to make the ID unique (duplicates...). Try
$3=sprintf("%s_%s_%d",$1,$2,NR)
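Put into a full command, that suggestion might look like the sketch below. It assumes a tab-delimited VCF whose header lines all start with "#". The function names add_unique_vcf_ids and list_duplicate_ids are hypothetical. Note that NR is the line number in the whole file, header lines included.

```shell
# Hypothetical helper: build IDs as CHROM_POS_<line number> so that records
# sharing the same CHROM and POS still receive distinct IDs.
add_unique_vcf_ids() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }           # leave all header lines unchanged
         { $3 = sprintf("%s_%s_%d", $1, $2, NR); print }' "$1"
}

# Hypothetical check: print every ID that occurs more than once.
# No output means all IDs in the file are unique.
list_duplicate_ids() {
    awk -F '\t' '!/^#/ { print $3 }' "$1" | sort | uniq -d
}

# Usage:
#   add_unique_vcf_ids sample.vcf > out.vcf
#   list_duplicate_ids out.vcf
```

Running the duplicate check on the output before the downstream analysis is a quick way to confirm that the "non-unique names" error should no longer be triggered by the ID column.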