5.6 years ago
truebeliever24
Hi all,
I am converting my vcf file to various formats, and have noticed a problem: the output shows only "Pseudochromosome_1" and "Pseudochromosome_2", even though my dataset has 33 chromosomes.
Also, when I run my analyses, programs (e.g., SNeP, plink) read my data as residing on these two chromosomes rather than on the 33 separate chromosomes to which they belong.
How can I view/manipulate/analyze all 33 chromosomes instead of Pseudo 1 and 2?
Thanks!
How can your dataset have 33 chromosomes but only two pseudo-chromosomes? Please explain your situation more carefully: what are your datasets? How did you obtain them? What commands did you use? Where did the "pseudo-chromosomes" come from?
Hi,
Thanks for the response. I use the command "plink --vcf filename --allow-extra-chr --recode --out filename" to convert my vcf to a ped for analysis. Then, when I run the analysis, I am told "number of chromosomes detected: 2". If I leave out the "--allow-extra-chr" flag in plink, my analyses do not detect any chromosomes at all.
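One way to see which chromosome codes ended up in the converted data is to list the distinct values in the first column of the .map file that plink writes next to the .ped. This is a minimal sketch, assuming a whitespace-delimited .map file whose first column is the chromosome code; the toy file and the helper name map_chroms are made up for illustration and are not part of plink:

```shell
# Build a small, hypothetical .map file (chromosome, variant ID, cM, bp position).
toy_map=$(mktemp)
printf 'Pseudochromosome_1\tsnp1\t0\t100\n'  > "$toy_map"
printf 'Pseudochromosome_1\tsnp2\t0\t250\n' >> "$toy_map"
printf 'Pseudochromosome_2\tsnp3\t0\t75\n'  >> "$toy_map"

# Hypothetical helper: print each distinct chromosome code once.
map_chroms() {
    awk '{print $1}' "$1" | sort -u
}

map_chroms "$toy_map"
```

If only two codes come back here, the chromosome names were likely already collapsed before the conversion rather than by it.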
UPDATE: running this command on my updated vcf (I removed some individuals from the original vcf): "grep -v "^#" test.vcf | cut -f 1 | sort | uniq -c" showed that it had only 2 chromosomes. Running the same command on my original vcf listed all 33 chromosomes. So the problem arose when I ran this command to make my vcf with fewer individuals: "vcftools --vcf file --out file --recode --remove-indv xxx". Do you have any suggestions on how to remove individuals from my vcf without lumping all of my data into 2 chromosomes? Thanks.
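The check described above can be wrapped so that the original and the filtered vcf are compared side by side. This is only a sketch on two made-up toy files (the contig names chr1 to chr3 and the helper name n_chroms are assumptions for illustration); it uses the same grep/cut/sort pipeline and does not call vcftools:

```shell
# Hypothetical helper: write a tiny VCF to file $1 with one record per contig name given.
make_toy_vcf() {
    out=$1; shift
    printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' > "$out"
    for chrom in "$@"; do
        printf '%s\t100\t.\tA\tG\t.\t.\t.\n' "$chrom" >> "$out"
    done
}

# Hypothetical helper: count the distinct values in the CHROM column, skipping header lines.
n_chroms() {
    grep -v '^#' "$1" | cut -f 1 | sort -u | wc -l | tr -d ' '
}

original=$(mktemp)
filtered=$(mktemp)
make_toy_vcf "$original" chr1 chr2 chr3
make_toy_vcf "$filtered" chr1 chr2

echo "original: $(n_chroms "$original") chromosomes"
echo "filtered: $(n_chroms "$filtered") chromosomes"
```

Running the same two calls on the real files before and after the filtering step shows at which step the chromosome names change.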