3.4 years ago
evafinegan
Hi,
I have a vcf file with multiple samples.
REF ALT
TTTTAA TTTTAT,TTTATAA,TTAAAAAA
This is one variant line from the VCF file, and it has multiple alternate alleles at that position. I want to split the comma-separated alternate alleles into separate tab-delimited columns, and then report the length of the REF allele next to the length of the longest ALT allele, like this:
REF ALT
TTTTAA TTTTAT TTTATAA TTAAAAAA
6 8
Thank you for any help!
Please look into bcftools query -f to format VCF information in custom formats. You will need to use some Python/R/awk to get to the length-of-longest-ALT part.
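For the length-of-longest-ALT part, here is a minimal Python sketch. It assumes the input is two tab-separated columns, REF and ALT, such as you would get from something like bcftools query -f '%REF\t%ALT\n' input.vcf (input.vcf is a placeholder name). The function names are made up for illustration, and the example line is the one from the question.

```python
def allele_lengths(ref, alt_field):
    """Split a comma-separated ALT field; return (alts, REF length, longest ALT length)."""
    alts = alt_field.split(",")
    return alts, len(ref), max(len(a) for a in alts)


def format_line(line):
    """Turn 'REF<TAB>ALT1,ALT2,...' into 'REF<TAB>ALT1<TAB>ALT2...<TAB>len(REF)<TAB>max len(ALT)'."""
    # Assumption: the first two tab-separated fields are REF and ALT.
    ref, alt_field = line.rstrip("\n").split("\t")[:2]
    alts, ref_len, max_alt_len = allele_lengths(ref, alt_field)
    return "\t".join([ref, *alts, str(ref_len), str(max_alt_len)])


# Example line taken from the question.
example = "TTTTAA\tTTTTAT,TTTATAA,TTAAAAAA\n"
print(format_line(example))
```

To run this on a whole file, you could loop over sys.stdin and call format_line on each non-empty line, piping the bcftools query output into the script. Note that this counts plain string length, so symbolic alleles such as <DEL> or * would need separate handling.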