PLINK --allow-extra-chr error
4.8 years ago
dumeir ▴ 20

Hi,

I've got .bed/.bim/.fam SNP files and am using physical bp positions to extract SNP subsets in PLINK (version 1.9). I got my coordinates from UCSC Table Browser so some of the chromosome IDs are unusual, like "6_ssto_hap7" or "Un_gl000211".

I tried using --allow-extra-chr 0 to include these, but it doesn't seem to work. I still get the same error:

Error: Invalid chromosome code on line 938486 of --extract range file.

(Line 938486 is where the unusual chromosome IDs start)

I've been using this command:

plink --bfile <filename> --extract range <coordinates.txt> --make-bed --out <newfile> --allow-extra-chr 0

If I delete all the coordinates with the unusual chromosome IDs, I can successfully extract SNPs.

Any suggestions to fix this?

Thanks!


Have you tried using --allow-extra-chr without the 0?

Have you looked at lines 938485, 938486 and 938487 to see whether they are just broken somehow?


Yes, I've tried both of your suggestions. Those lines have exactly the same formatting as the others, but with IDs like "6_ssto_hap7". I tried changing one of those to just "6" and there was no error for that line.


Can you post the full .log file from your failed run? (Please make sure the version date/number is included.)

PLINK v1.90b6.14 64-bit (7 Jan 2020)

Options in effect:
  --allow-extra-chr 0
  --bfile <SNPfile>
  --extract <coordinates.txt>
  --make-bed
  --out <newfile>

Start time: Wed Jan 22 12:06:24 2020

Random number seed: 1579655184
8192 MB RAM detected; reserving 4096 MB for main workspace.
8146 variants loaded from .bim file.
571 people (0 males, 571 females) loaded from .fam.
571 phenotype values loaded from .fam.
Error: Invalid chromosome code on line 938486 of --extract range file.

End time: Wed Jan 22 12:06:24 2020

Ok, it's erroring out because the chromosome code isn't in your dataset. This is a bug: "--extract range" should just ignore that line. I'll post a fix tonight.
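
To see which chromosome codes in a range file are absent from the dataset, a quick comparison along these lines may help. This is only a minimal Python sketch, not part of PLINK: the function names and file paths are hypothetical, and it assumes both files are whitespace-delimited with the chromosome code in the first column.

```python
def first_column_values(path):
    """Return the set of first-column tokens in a whitespace-delimited file."""
    values = set()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:  # skip blank lines
                values.add(fields[0])
    return values


def missing_chrom_codes(bim_path, range_path):
    """Return chromosome codes used in the range file but absent from the .bim.

    Assumption: codes are compared as plain text, so different spellings of
    the same chromosome (e.g. "chr6" vs "6") are reported as different.
    """
    return sorted(first_column_values(range_path) - first_column_values(bim_path))
```

For example, `missing_chrom_codes("data.bim", "coordinates.txt")` (hypothetical file names) would list codes such as "6_ssto_hap7" if no variant in the .bim carries them.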


Sorry, I just realised I accidentally deleted "range" from "--extract range" in the log file when I was changing the txt file name.


Bugfix is now posted.


Thanks for that. I no longer receive the error, but only SNPs on chromosomes 1-23 are being extracted. I know there are SNPs within some of the bp ranges with unusual chromosome IDs, and I would still like to extract them.

I also manually manipulated the bp position files to increase the search window but now I'm getting this error:

Error: Invalid range start position on line 830868 of --extract range file.

Is there a way to ignore that these positions are invalid and continue searching anyway?
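
One way to locate lines like this before running PLINK is to scan the range file for suspicious positions. The following is a rough Python sketch under stated assumptions: the function name is hypothetical, the file is assumed to be whitespace-delimited with start and end in columns 2 and 3, and the checks (negative, non-integer, start after end) are this sketch's own guesses at what is worth flagging, not PLINK's exact validation rules.

```python
def invalid_range_lines(path):
    """Return (line_number, reason) pairs for lines with unusable positions."""
    problems = []
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:  # ignore blank lines
                continue
            if len(fields) < 3:
                problems.append((line_number, "fewer than 3 columns"))
                continue
            try:
                start, end = int(fields[1]), int(fields[2])
            except ValueError:
                problems.append((line_number, "non-integer position"))
                continue
            if start < 0 or end < 0:
                problems.append((line_number, "negative position"))
            elif start > end:
                problems.append((line_number, "start after end"))
    return problems
```

Line numbers are 1-based and count blank lines, so they should line up with the line number quoted in the PLINK error message.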


At this point, I’ll need you to send me a set of files to reproduce what you’re seeing.


For the record, the problem was a negative position value, resulting from subtracting 5000 from the original interval-start and adding 5000 to the original interval-end coordinates.

This will remain an error in plink 1.9 and 2.0. However, today's plink 2.0 build adds --bed-border-bp/--bed-border-kb flags which perform this interval-extension for you.
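
For anyone still padding intervals by hand, the negative values can be avoided by clamping the new start. This is a small Python sketch rather than anything PLINK provides: the function names are hypothetical, the 5000 bp default mirrors the window used above, and the lower bound of 1 is an assumption about the smallest position worth keeping.

```python
def pad_interval(start, end, border=5000, min_pos=1):
    """Widen [start, end] by `border` bp per side, never going below `min_pos`."""
    return max(min_pos, start - border), end + border


def pad_range_line(line, border=5000, min_pos=1):
    """Pad the start/end columns (2 and 3) of one whitespace-delimited range line."""
    fields = line.split()
    start, end = pad_interval(int(fields[1]), int(fields[2]), border, min_pos)
    # Keep the chromosome code and any trailing columns (e.g. set ID) unchanged.
    return "\t".join([fields[0], str(start), str(end)] + fields[3:])
```

With these defaults, an interval starting at 1000 is padded to start at 1 instead of -4000. The end coordinate is not clamped, since the sketch has no knowledge of chromosome lengths.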


Thanks for your help!
