2.5 years ago
mera El
Hello, I have a TSV file like this:
CHRM POS REF ALT CLINSIG TYPE GENE MC
1 861332 G A Uncertain_significance single_nucleotide_variant SAMD11:148398 SO:0001583|missense_variant
1 861336 C T Likely_benign single_nucleotide_variant SAMD11:148398 SO:0001819|synonymous_variant
X 41524689 A G Likely_benign single_nucleotide_variant CASK:8573 SO:0001819|synonymous_variant
X 41524690 GGTGTT CACCTACGTCATTTATGTAGGA Pathogenic Indel CASK:8573 SO:0001587|nonsense
Starting from a chromosome and position, I am trying to get chromStart and chromEnd values. I have single-nucleotide changes, and I also have insertions and deletions that can be multiple bases long.
For SNPs, I think I can write chromStart = POS and chromEnd = POS, because another file I have does exactly that (start = end):
CHRM Type start end CLINSIG
13 single_nucleotide_variant 32972745 32972745 Benign
17 single_nucleotide_variant 61565892 61565892 Benign
But for indels, I don't know how to compute them.
That's not a VCF.
Oh yes, I'm sorry. I extracted the features from a VCF file into a TSV file. I will edit the question.
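For the indel case in the question, one common convention (an assumption here, not something the thread settles) is to report the 1-based, inclusive span of the reference bases that the variant replaces: start = POS and end = POS + len(REF) - 1. For a single-base REF this gives start = end, which matches the SNV rows in the second file. The sketch below assumes POS and REF follow VCF conventions, since the TSV was extracted from a VCF; the function names `ref_span` and `add_coordinates` and the inline `SAMPLE` data are made up for illustration.

```python
import csv
import io


def ref_span(pos, ref):
    """Return (start, end), 1-based inclusive, covering the REF allele.

    Assumption: POS is the 1-based position of the first REF base,
    as in VCF, so the span depends only on the length of REF.
    """
    if not ref or ref in {".", "-"}:
        raise ValueError("REF allele is empty or a placeholder")
    return pos, pos + len(ref) - 1


def add_coordinates(handle):
    """Read a tab-separated table with POS and REF columns and
    return the rows with added 'start' and 'end' fields."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        start, end = ref_span(int(row["POS"]), row["REF"])
        rows.append({**row, "start": start, "end": end})
    return rows


# Two rows taken from the question, reduced to the columns used here.
SAMPLE = (
    "CHRM\tPOS\tREF\tALT\n"
    "1\t861332\tG\tA\n"
    "X\t41524690\tGGTGTT\tCACCTACGTCATTTATGTAGGA\n"
)

for r in add_coordinates(io.StringIO(SAMPLE)):
    print(r["CHRM"], r["start"], r["end"])
```

With this convention the indel at X:41524690, whose REF is 6 bases long, gets start 41524690 and end 41524695. If the target format is BED instead, which is 0-based and half-open, the same span would be start = POS - 1 and end = POS - 1 + len(REF). Other resources may place insertions differently, so it is worth checking a few indels against the file you are trying to match.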