Problem with chromosomes in Michigan Imputation Server
7.4 years ago
jfertaj ▴ 110

Hi,

I am trying to impute a dataset using the Michigan imputation server but I got this error:

No valid chromosomes found!

I have VCF files (with tabix indexes) for chromosomes 1-23. Should I add chr at the beginning of the chromosome names? The files look like this:

##fileDate=20170710
##source=PLINKv1.90
##contig=<ID=9,length=141077353>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  1A073_C08_100811_1A073  1A223_A05_100811_1A223  1A224_1A224     1A226_A11_100811_1A226  1A242_H09_100811_1A24
9       205764  rs10811213      C       T       .       .       PR      GT      0/0     0/1     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/1
9       212189  rs9406775       C       T       .       .       PR      GT      0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0     0/0
9       213149  rs680654        G       A       .       .       PR      GT      0/0     0/1     0/0     0/1     0/1     0/0     0/1     0/1     0/0     1/1     0/0     ./.     0/0
9       214864  rs2236547       C       T       .       .       PR      GT      0/0     0/1     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/1
9       215269  rs636922        A       C       .       .       PR      GT      0/0     0/1     0/0     0/1     0/1     0/0     0/1     0/1     0/0     1/1     0/0     1/1     0/0
9       215494  rs7869327       C       T       .       .       PR      GT      0/0     0/1     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/0
9       215511  rs2023402       C       T       .       .       PR      GT      0/0     0/1     0/0     0/0     0/1     0/1     0/0     0/0     1/1     0/0     0/0     ./.     0/0
9       215534  rs635615        C       T       .       .       PR      GT      0/0     0/1     0/0     0/1     0/1     0/0     0/1     0/1     0/0     1/1     0/0     1/1     0/0
9       216124  rs561921        A       G       .       .       PR      GT      0/1     0/0     1/1     0/1     0/0     0/1     0/0     0/0     0/0     0/0     0/1     0/0     0/0
9       217269  rs598791        A       G       .       .       PR      GT      0/0     0/1     0/0     0/1     1/1     0/0     0/1     0/1     0/0     1/1     0/0     1/1     0/0
9       217397  rs529045        G       A       .       .       PR      GT      0/0     1/1     0/0     0/1     1/1     0/0     0/1     0/1     0/1     1/1     0/0     1/1     0/1
9       224693  rs601023        C       T       .       .       PR      GT      1/1     0/0     0/1     0/1     0/0     0/1     0/1     0/1     0/1     0/0     1/1     0/0     0/1
9       227621  rs4740661       C       T       .       .       PR      GT      0/0     0/1     0/0     0/0     0/1     0/0     0/0     0/0     0/1     0/0     0/0     0/0     0/1
9
imputation michigan server • 6.0k views

Should I add chr at the beginning?

Mismatched chromosome naming (chr1 versus 1) is one of the most common issues in bioinformatics, so it's always worth trying to see whether it solves your problem.


I was encountering the same problem. Adding 'chr' to the chromosome names (i.e. '--output-chr chr26' in plink) didn't seem to resolve this. Instead, '--output-chr M' fixed the issue for me, which is a bit silly, since chromosomes are listed as numbers in both the 'M' and '26' formats, but I guess there's a difference.


I split the file by chromosome, changed the chromosome name to chr1, updated the contig line in the header, and bgzipped the file. It passed all the checks from the vcf debugulator: https://github.com/EBIvariation/vcf-validator

The server still throws an uninformative error: unfortunately, your job failed.

Weirdest of all, the spinner next to the section that reads Input Validation -> Analyze file blah-01.vcf.gz... is still spinning away... I'm not sure if the job has /actually/ failed or not :(
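
When a renamed file still fails, one thing worth ruling out is a mismatch between the ##contig header lines and the CHROM column. The following is a minimal sketch of such a check, not anything the server itself runs; the function names chromosome_names and undeclared_chromosomes are hypothetical, and the code assumes a tab-delimited VCF read as text lines.

```python
import re


def chromosome_names(lines):
    """Return (contig IDs declared in the header, CHROM values used in data rows)."""
    header_ids, data_chroms = set(), set()
    for line in lines:
        if line.startswith("##contig=<ID="):
            body = line[len("##contig=<ID="):]
            # The ID runs up to the first ',' or the closing '>'.
            header_ids.add(re.split(r"[,>]", body, maxsplit=1)[0])
        elif line.strip() and not line.startswith("#"):
            # CHROM is the first tab-separated column of a data row.
            data_chroms.add(line.split("\t", 1)[0])
    return header_ids, data_chroms


def undeclared_chromosomes(lines):
    """CHROM values that appear in data rows but have no ##contig header line."""
    header_ids, data_chroms = chromosome_names(lines)
    return data_chroms - header_ids
```

For a bgzipped file, the lines can be read with the standard library, e.g. `with gzip.open(path, "rt") as fh: print(chromosome_names(fh))`, where path is your own file.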

6.6 years ago
mgru ▴ 20

It's because of your chromosome X: the server only accepts X coded as X rather than 23. Recode it via plink and then resubmit.
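
If re-exporting from plink is not convenient, the same recoding can be done directly on the VCF text. This is a minimal sketch under the assumption that only the code 23 needs to become X, as described in this answer; recode_line and NUMERIC_TO_LETTER are hypothetical names, and the mapping can be extended if other numeric codes are present in your data.

```python
import re

# Assumption: only chromosome 23 needs recoding to X, as described above.
NUMERIC_TO_LETTER = {"23": "X"}

# Captures the ID inside a header line such as ##contig=<ID=23,length=...>
_CONTIG_ID = re.compile(r"^(##contig=<ID=)([^,>]+)")


def recode_line(line, mapping=NUMERIC_TO_LETTER):
    """Return one VCF line with its chromosome name recoded via `mapping`."""
    if line.startswith("##contig="):
        # Keep the header consistent with the data rows.
        return _CONTIG_ID.sub(
            lambda m: m.group(1) + mapping.get(m.group(2), m.group(2)), line
        )
    if line.startswith("#"):
        # Other header lines are passed through untouched.
        return line
    # Data row: CHROM is the first tab-separated column.
    chrom, tab, rest = line.partition("\t")
    return mapping.get(chrom, chrom) + tab + rest
```

Applied line by line to an uncompressed VCF, this leaves autosomes such as 9 unchanged and rewrites only the names listed in the mapping. The output needs to be bgzipped and re-indexed afterwards.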

2.8 years ago
Dan ▴ 20

4.6 years ago and no answer?


Using 1 works, using chr1 doesn't work.


New user here. I'm encountering the same problem. Where do you add the 'chr1' in the vcf file?

7.4 years ago
Samuel Brady ▴ 330

Yes, adding "chr" may help. I'm not sure what program you are running, but if one of your program's inputs is a reference FASTA file and it has ">chr1"-style chromosome names instead of ">1", then adding "chr" will help.

3.7 years ago
binodregmi30 ▴ 10

The server takes chr1 or 1 depending on the genome build.
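
As a rough illustration of switching between the two naming styles, here is a minimal sketch. It assumes the convention that GRCh37/hg19 uploads use bare names (1) and GRCh38/hg38 uploads use the chr prefix (chr1); please check the server's current documentation before relying on that. The names set_chr_prefix and rename_line, and the build labels "hg19" and "hg38", are hypothetical.

```python
import re

# Captures the ID inside a header line such as ##contig=<ID=9,length=...>
_CONTIG_ID = re.compile(r"^(##contig=<ID=)([^,>]+)")


def set_chr_prefix(name, build):
    """Return `name` in the style assumed for `build` ('hg19' or 'hg38')."""
    bare = name[3:] if name.startswith("chr") else name
    if build == "hg19":
        return bare            # assumed style for GRCh37/hg19: 1, 2, ..., X
    if build == "hg38":
        return "chr" + bare    # assumed style for GRCh38/hg38: chr1, ..., chrX
    raise ValueError(f"unknown build: {build!r}")


def rename_line(line, build):
    """Rename the chromosome in one VCF line (contig header or data row)."""
    if line.startswith("##contig="):
        return _CONTIG_ID.sub(
            lambda m: m.group(1) + set_chr_prefix(m.group(2), build), line
        )
    if line.startswith("#") or not line.strip():
        # Other header lines and blank lines are passed through untouched.
        return line
    chrom, tab, rest = line.partition("\t")
    return set_chr_prefix(chrom, build) + tab + rest
```

This also shows where the prefix goes: in the first column of every data row and in the ID of each ##contig header line. The function is idempotent, so a name that is already in the target style is left as it is.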
