Hi, I'm creating an exon-only spliced graph (NOT haplotype-specific) with vg rna, and I'd like to know whether the contig IDs can be the chromosome names instead of the transcript IDs. I understand that vg rna has an option to supply a transcript file:
-n, --transcripts FILE transcript file(s) in gtf/gff format
and an option to select which attribute tag to use as the ID:
-s, --transcript-tag NAME use this attribute tag in the gtf/gff file(s) as id [transcript_id]
but I want to use the chromosome as the ID, and that is the first column of the GTF rather than an attribute tag.
Currently, when I run vg call on my graph, the VCF looks like this:
##contig=<ID=ENST00000584536_R1,length=804>
##contig=<ID=ENST00000550993_R1,length=438>
##contig=<ID=ENST00000490417_R1,length=952>
ENST00000196061_R1 2489 >376099>117009861 T C 15.5976 PASS ...
ENST00000196061_R1 2640 >376103>118173663 T C 12.7808 PASS ...
ENST00000196061_R1 2662 >376104>105813788 A T 12.5346 PASS ...
But I want this format:
##contig=<ID=chr8,length=146364022>
##contig=<ID=chr9,length=141213431>
##contig=<ID=chr10,length=135534747>
1 10560 . C G 21.77 . ...
1 13813 . T G 30.78 . ...
1 14464 . A T 36.31 . ...
Is there a way around this, or is this specific to vg rna? My current vg rna command is:
vg rna -p -d -o -r -n chr${i}.gtf chr${i}.vg > ${GRAPH_PREFIX}_${i}.vg
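To make the goal concrete, here is a minimal Python sketch of the coordinate conversion I'm after. It is not something vg provides as far as I know. The function name `transcript_to_genome` and the exon coordinates are hypothetical, and it assumes the VCF position is a 1-based offset along the spliced transcript, read 5' to 3' in the transcript's own orientation (I have not confirmed that this is how vg rna lays out its transcript paths).

```python
def transcript_to_genome(pos, exons, strand="+"):
    """Map a 1-based spliced-transcript position to a 1-based chromosome position.

    exons: list of (start, end) 1-based inclusive intervals, as in GTF exon rows.
    strand: "+" or "-", as in the GTF strand column.
    ASSUMPTION: pos counts along the transcript 5' to 3'.
    """
    if pos < 1:
        raise ValueError("position must be 1-based")
    # Walk exons in transcript order: ascending for '+', descending for '-'.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


# Hypothetical two-exon transcript, 10 bp per exon.
exons = [(100, 109), (200, 209)]
print(transcript_to_genome(11, exons, "+"))  # first base of the second exon
print(transcript_to_genome(11, exons, "-"))  # last base of the upstream exon
```

For reverse-strand transcripts the REF and ALT alleles would also need to be reverse-complemented, which this sketch does not do.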
Any advice is welcome, and thanks in advance.