Hi everyone,
I am trying to annotate identified SNPs and INDELs (VCF format) with software such as snpEff. I downloaded the GFF3 file corresponding to my reference genome. I have to build a database in snpEff myself because the one I need is not available in snpEff.
Problem: the scaffold names in the GFF3 file and the VCF file are different, so I must convert the scaffold names in the GFF3 file to the scaffold names used in the VCF file and then build the new snpEff database from the name-converted GFF3 file. Does anybody here know how I can do this?
Example of the differences in scaffold names:
name in gff3 * name in VCF
NW_011590949.1 * KN271049.1
NW_011590950.1 * KN271050.1
...
Are there alternative ways to annotate new SNPs? Is it possible to merge the GFF and VCF files to annotate the VCF file?
Thanks in advance.
It's weird that you have different scaffold names, taking into account that it is the same genome. Are you sure that the genome you used for variant calling is the same genome that is annotated in the gff3 file? Same genome and same version? If the answer is yes, and if you have a list with the gff3 - vcf scaffold name correspondence (let's say the NW_011590949.1 gff3 scaffold corresponds to the KN271049.1 vcf scaffold), it shouldn't be so difficult to change the names using a little script or an awk command.
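As a rough illustration of the "little script" idea, here is a minimal Python sketch. It assumes you already have a two-column, tab-separated table (gff3 name, then vcf name), one scaffold per line. The function names and the command-line file arguments are made up for this example. If the genome came from NCBI, the assembly report that accompanies it lists GenBank and RefSeq accessions side by side and may be a source for that table, but that is a guess about your data. The script only rewrites column 1 of feature lines and the `##sequence-region` directives; it does not touch FASTA headers embedded after a `##FASTA` line.

```python
import sys


def load_mapping(handle):
    """Read 'gff3_name<TAB>vcf_name' pairs into a dict (hypothetical table format)."""
    mapping = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments in the table
        old, new = line.split("\t")[:2]
        mapping[old] = new
    return mapping


def rename_gff3(in_handle, out_handle, mapping):
    """Copy a GFF3 stream, replacing scaffold names found in `mapping`."""
    for line in in_handle:
        if line.startswith("##sequence-region"):
            # Directive layout: ##sequence-region <seqid> <start> <end>
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = mapping.get(parts[1], parts[1])
            out_handle.write(" ".join(parts) + "\n")
        elif line.startswith("#") or not line.strip():
            out_handle.write(line)  # other directives, comments, blank lines
        else:
            # Feature line: the scaffold name is the first tab-separated column.
            cols = line.split("\t", 1)
            cols[0] = mapping.get(cols[0], cols[0])  # unknown names are left as-is
            out_handle.write("\t".join(cols))


# Only runs when called with exactly three arguments (all file names are placeholders):
#   python rename_gff3.py name_table.tsv genes_in.gff3 genes_out.gff3
if __name__ == "__main__" and len(sys.argv) == 4:
    table_path, gff_in, gff_out = sys.argv[1:4]
    with open(table_path) as fh:
        name_map = load_mapping(fh)
    with open(gff_in) as src, open(gff_out, "w") as dst:
        rename_gff3(src, dst, name_map)
```

Scaffolds that are missing from the table are written out unchanged, so it is worth checking afterwards that every scaffold name in the converted gff3 file also appears in the VCF header before building the snpEff database.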