Hello,
I have answered your question via the IGSR Helpdesk, so please respond there if you have any further questions. I have added this response to help others looking at this question.
The IDs refer to the different structural variant classes and source call sets, which can also be identified using the SVTYPE and CS tags in the INFO column.
Below is a list of possible SVTYPEs:
ALU: Alu element insertion
LINE1: LINE-1 transposable element insertion
SVA: SVA element insertion (SVA stands for SINE-VNTR-Alu, a composite retrotransposon)
INS: Nuclear mitochondrial insertion
DEL: bi-allelic deletion
DUP: bi-allelic duplication
INV: bi-allelic inversion
CNV: multi-allelic copy-number variant
The DEL class has been further re-classified into DEL_ALU, DEL_LINE1 and DEL_SVA if the identified deletion appeared to correspond to a reference mobile element insertion.
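As a rough illustration of reading these tags from a record, here is a minimal Python sketch. It assumes you already have the INFO column of one VCF line as a string; the helper name parse_info and the example INFO string are made up for illustration and are not taken from the call set itself.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; flag entries (no '=') map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


# Made-up INFO string for illustration only, not copied from the real VCF
example_info = "SVTYPE=ALU;CS=ALU_umary;IMPRECISE"
info = parse_info(example_info)
print(info["SVTYPE"], info["CS"])
```

The same dictionary lookup works for any other INFO tag present in a record.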
The source call set can be identified using the CS tag in the INFO column. Below is a list of possible CSs:
ALU_umary: Alu element insertion call set from the University of Maryland (MELT algorithm)
L1_umary: LINE-1 transposable element insertion from the University of Maryland (MELT algorithm)
SVA_umary: SVA element insertion from the University of Maryland (MELT algorithm)
NUMT_umich: Nuclear mitochondrial insertion from the University of Michigan (NumtS algorithm)
DEL_union: Union deletions, genotyped by GenomeSTRiP, with variant sites identified by GenomeSTRiP, Breakdancer, CNVnator, Delly and Variation Hunter
DEL_pindel: Small deletions (<1kbp) from Washington University (Pindel algorithm)
INV_delly: Bi-allelic simple inversions from EMBL (Delly algorithm)
CINV_delly: Bi-allelic complex inversions from EMBL (Delly algorithm)
DUP_gs: Bi-allelic duplications and copy-number variants from Broad Institute (GenomeSTRiP algorithm)
DUP_delly: Bi-allelic tandem duplications from EMBL (Delly algorithm)
DUP_uwash: Bi-allelic deletions, duplications and copy-number variants from University of Washington (SSF algorithm)
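To see which of these call sets actually occur in a file, one option is to tally the CS tag over all records. The sketch below is only an assumption about how you might do this in plain Python: count_call_sets is a hypothetical helper, and the demo lines are invented records that mimic the tab-separated VCF layout, where INFO is the eighth column.

```python
from collections import Counter


def count_call_sets(lines):
    """Count records per CS tag, reading INFO from the 8th tab-separated column."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines
        if line.startswith("#") or not line.strip():
            continue
        columns = line.rstrip("\n").split("\t")
        for entry in columns[7].split(";"):
            if entry.startswith("CS="):
                counts[entry[3:]] += 1
                break
    return counts


# Invented records for illustration only, not real call set data
demo_lines = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\tid_a\tA\t<INS>\t.\tPASS\tSVTYPE=ALU;CS=ALU_umary\n",
    "1\t200\tid_b\tC\t<DEL>\t.\tPASS\tCS=DEL_union;SVTYPE=DEL\n",
    "2\t300\tid_c\tG\t<INS>\t.\tPASS\tSVTYPE=ALU;CS=ALU_umary\n",
]
counts = count_call_sets(demo_lines)
for call_set, n in counts.most_common():
    print(call_set, n)
```

For a real compressed VCF you could pass an open file handle instead of the list, for example one returned by gzip.open(path, "rt"), where path is whatever file you downloaded.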
Further information can be found in the README and the supplementary materials from the phase 3 publication:
[1] http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/integrated_sv_map/README_phase3_sv_callset_20150224
[2] https://static-content.springer.com/esm/art%3A10.1038%2Fnature15394/MediaObjects/41586_2015_BFnature15394_MOESM91_ESM.pdf
I hope this helps, but please do get back in touch if you have any further questions.
Best wishes
Ben
IGSR Helpdesk
Hello, I tried to read all the documentation, but I'm still not able to extract the ALU sequence. I only found their coordinates, and I cannot work out what the sequence of the ALU is.