8 months ago
Palgrave
I have 5000 somatic VCF files, converted to PLINK format, that I am trying to merge with this command:
plink --bfile file1 --merge-list files.bed.txt --make-bed --out Merged.Plink
However, the output looks strange and I get a large number of warnings (1107578 to be exact) followed by an error, like the ones below. Any ideas?
...
Warning: Multiple chromosomes seen for variant 'gridss335fb_1314h'.
Warning: Multiple chromosomes seen for variant 'unbalanced_22'.
Warning: Multiple chromosomes seen for variant 'gridss1070bb_15h'.
Warning: Multiple chromosomes seen for variant 'gridss584ff_92h'.
Error: 797428 variants with 3+ alleles present.
* If you believe this is due to strand inconsistency, try --flip with
Merged.Plink-merge.missnp.
(Warning: if this seems to work, strand errors involving SNPs with A/T or C/G
alleles probably remain in your data. If LD between nearby SNPs is high,
--flip-scan should detect them.)
* If you are dealing with genuine multiallelic variants, we recommend exporting
that subset of the data to VCF (via e.g. '--recode vcf'), merging with
another tool/script, and then importing the result; PLINK is not yet suited
to handling them.
See https://www.cog-genomics.org/plink/1.9/data#merge3 for more discussion.
End time: Sat Mar 9 08:19:15 2024
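The "Multiple chromosomes seen" warnings indicate that the same variant ID occurs with different chromosome codes across the input filesets. As a rough way to see which IDs are affected, the sketch below scans .bim files directly. It assumes the standard .bim layout (chromosome in column 1, variant ID in column 2); the function name `ids_on_multiple_chroms` is made up for illustration and is not part of PLINK.

```shell
# List variant IDs that occur with more than one chromosome code
# across the given .bim files (column 1 = chromosome, column 2 = variant ID).
# Hypothetical helper, not a PLINK command.
ids_on_multiple_chroms() {
    awk '
        {
            key = $2 SUBSEP $1
            if (!(key in seen)) {   # first time this (ID, chromosome) pair appears
                seen[key] = 1
                nchr[$2]++          # count distinct chromosomes per ID
            }
        }
        END {
            for (id in nchr)
                if (nchr[id] > 1) print id
        }
    ' "$@" | sort
}
```

It could be called as `ids_on_multiple_chroms *.bim | head` in the directory holding the per-sample filesets.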
The output:
head Merged.Plink-merge.fam
ACTN01020001R ACTN01020001R 0 0 0 -9
ACTN01020001T ACTN01020001T 0 0 0 -9
ACTN01020002R ACTN01020002R 0 0 0 -9
ACTN01020002T ACTN01020002T 0 0 0 -9
ACTN01020003R ACTN01020003R 0 0 0 -9
ACTN01020003T ACTN01020003T 0 0 0 -9
ACTN01020005R ACTN01020005R 0 0 0 -9
ACTN01020005T ACTN01020005T 0 0 0 -9
ACTN01020006R ACTN01020006R 0 0 0 -9
ACTN01020006T ACTN01020006T 0 0 0 -9
wc Merged.Plink-merge.missnp
797428 797428 12522503 Merged.Plink-merge.missnp
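If the underlying cause is that caller-assigned IDs (such as the `gridss...` names in the warnings) are reused for different variants in different samples, one possible workaround is to give every variant a position-based ID before merging. This is only a sketch under that assumption: it rewrites column 2 of a .bim file as `chrom:pos:a1:a2` and prints the result to stdout, and `bim_with_positional_ids` is a hypothetical name. Note that the two allele columns may be ordered differently in different filesets, so the same variant could still end up with two IDs.

```shell
# Print a copy of a .bim file with column 2 (variant ID) replaced by
# chrom:pos:a1:a2, built from columns 1, 4, 5 and 6 of the same line.
# Hypothetical helper, not a PLINK command.
bim_with_positional_ids() {
    awk 'BEGIN { OFS = "\t" } { $2 = $1 ":" $4 ":" $5 ":" $6; print }' "$1"
}
```

For example, `bim_with_positional_ids file1.bim > file1.newids.bim` writes the rewritten table to a new file and leaves the original untouched.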
Sorry, fixed now.