ERROR running STRUCTURE in command line
5.1 years ago
chparada ▴ 70

Hi!

I am trying to run STRUCTURE from the command line using Structure-threader. This is the error that I am getting:

----------------------------------------------------
STRUCTURE by Pritchard, Stephens and Donnelly (2000)
     and Falush, Stephens and Pritchard (2003)
       Code by Pritchard, Falush and Hubisz
             Version 2.3.4 (Jul 2012)
----------------------------------------------------


Reading file "/gpfs_common/share03/lmquesad/chparada/structure/mainparams".
datafile is
infile
Reading file "/gpfs_common/share03/lmquesad/chparada/structure/extraparams".
Reading file "/gpfs_common/share03/lmquesad/chparada/structure/input_file_NOvarroa.txt".


Data file "/gpfs_common/share03/lmquesad/chparada/structure/input_file_NOvarroa.txt" (truncated) --

Ind:   Label Genotype_data . . . .
  1: KA133 168  -9 108 162  -9 102 136 213  . . . .  92
  1: KA133  -9  -9  -9 182  -9 108 140 222  . . . .  98
  2: KA143  -9  -9 108 162 176 102 136 213  . . . .  92
  2: KA143  -9  -9  -9  -9  -9 108 140 222  . . . .  98
  3: KA144 162 152 108 162 176 102 136 213  . . . .  92
  3: KA144 174  -9  -9 176  -9 108 140 222  . . . .  -9
  4: KA145 162 144 106 162 176  99 136 216  . . . .  92
  4: KA145 174  -9 108 176  -9 102 140 222  . . . .  98

      *******   

140: BI424 168 146 108 162 176  99 140 213  . . . .  86
140: BI424  -9 152  -9  -9 178 102 149 220  . . . .  92
141: BI425 174 132 106 162 176  91 140 213  . . . .  90
141: BI425 186 162  -9 170  -9 102 149 220  . . . .  92

Number of alleles per locus: min= 4; ave=7.1; max=15
individual KA133 has negative location!  locations should be >= 0

Exiting the program due to error(s) listed above.

These are a few data lines from the input file:

        A107   A29    AP273  AC306  AP55   A24    A88    B124   AP43   AP81   A113   AP66
KA133   168    -9     108    162    -9     102    136    213    131    126    211    092
KA133   -9     -9     -9     182    -9     108    140    222    140    -9     217    098
KA143   -9     -9     108    162    176    102    136    213    131    126    211    092
KA143   -9     -9     -9     -9     -9     108    140    222    140    134    217    098
KA144   162    152    108    162    176    102    136    213    131    126    205    092
KA144   174    -9     -9     176    -9     108    140    222    140    134    217    -9
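
One way to rule out a formatting problem in the data file itself is to check every row against the values set in mainparams. The sketch below is only an illustration and is not part of STRUCTURE or Structure-threader. The function name `check_structure_input` is hypothetical, and it assumes whitespace-separated columns, a first row of marker names, and data rows made of a label followed by one allele per locus (LABEL 1, MARKERNAMES 1, ONEROWPERIND 0, no other leading columns).

```python
from collections import Counter

def check_structure_input(lines, num_loci, ploidy=2, missing=-9):
    """Return a list of problems found in STRUCTURE-style input lines.

    Assumed layout (hypothetical helper, not STRUCTURE's own parser):
    first non-blank row = marker names, later rows = label + num_loci alleles.
    Row numbers in messages count non-blank rows, starting at 1.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if not rows:
        return ["file is empty"]
    problems = []
    header, data = rows[0], rows[1:]
    if len(header) != num_loci:
        problems.append(f"header has {len(header)} marker names, expected {num_loci}")
    rows_per_label = Counter()
    for n, fields in enumerate(data, start=2):
        label, alleles = fields[0], fields[1:]
        rows_per_label[label] += 1
        if len(alleles) != num_loci:
            problems.append(f"row {n} ({label}): {len(alleles)} allele columns, expected {num_loci}")
        for allele in alleles:
            try:
                value = int(allele)
            except ValueError:
                problems.append(f"row {n} ({label}): non-integer allele {allele!r}")
                continue
            # Any negative value other than the missing code is suspicious.
            if value < 0 and value != missing:
                problems.append(f"row {n} ({label}): negative allele {value}")
    for label, count in rows_per_label.items():
        if count != ploidy:
            problems.append(f"{label}: {count} rows, expected {ploidy}")
    return problems

# The first three lines of the excerpt above, as whitespace-separated text.
sample = [
    "A107 A29 AP273 AC306 AP55 A24 A88 B124 AP43 AP81 A113 AP66",
    "KA133 168 -9 108 162 -9 102 136 213 131 126 211 092",
    "KA133 -9 -9 -9 182 -9 108 140 222 140 -9 217 098",
]
print(check_structure_input(sample, num_loci=12))  # prints []
```

On the full file, the same call could be made with `open(path).read().splitlines()`. An empty list would suggest the genotype columns are laid out as mainparams describes, which would point away from the data rows as the source of the error.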

This is my mainparams file:

KEY PARAMETERS FOR THE PROGRAM structure.  YOU WILL NEED TO SET THESE
IN ORDER TO RUN THE PROGRAM.  VARIOUS OPTIONS CAN BE ADJUSTED IN THE
FILE extraparams.


"(int)" means that this takes an integer value.
"(B)"   means that this variable is Boolean 
        (ie insert 1 for True, and 0 for False)
"(str)" means that this is a string (but not enclosed in quotes!) 


Basic Program Parameters

#define MAXPOPS    10      // (int) number of populations assumed
#define BURNIN    100000   // (int) length of burnin period
#define NUMREPS   1000000   // (int) number of MCMC reps after burnin

Input/Output files

#define INFILE   infile   // (str) name of input data file
#define OUTFILE  outfile  //(str) name of output data file

Data file format

#define NUMINDS    141    // (int) number of diploid individuals in data file
#define NUMLOCI    12    // (int) number of loci in data file
#define PLOIDY       2    // (int) ploidy of data
#define MISSING     -9    // (int) value given to missing genotype data
#define ONEROWPERIND 0    // (B) store data for individuals in a single line


#define LABEL     1     // (B) Input file contains individual labels
#define POPDATA   0     // (B) Input file contains a population identifier
#define POPFLAG   0     // (B) Input file contains a flag which says 
                              whether to use popinfo when USEPOPINFO==1
#define LOCDATA   0     // (B) Input file contains a location identifier

#define PHENOTYPE 0     // (B) Input file contains phenotype information
#define EXTRACOLS 0     // (int) Number of additional columns of data 
                             before the genotype data start.

#define MARKERNAMES      1  // (B) data file contains row of marker names
#define RECESSIVEALLELES 0  // (B) data file contains dominant markers (eg AFLPs)
                            // and a row to indicate which alleles are recessive
#define MAPDISTANCES     0  // (B) data file contains row of map distances 
                            // between loci


Advanced data file options

#define PHASED           0 // (B) Data are in correct phase (relevant for linkage model only)
#define PHASEINFO        0 // (B) the data for each individual contains a line
                                  indicating phase (linkage model)
#define MARKOVPHASE      0 // (B) the phase info follows a Markov model.
#define NOTAMBIGUOUS  -999 // (int) for use in some analyses of polyploid data



Command line options:

-m mainparams
-e extraparams
-s stratparams
-K MAXPOPS 
-L NUMLOCI
-N NUMINDS
-i input file
-o output file
-D SEED
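
To see at a glance what these settings imply for the data file, the `#define` lines can be read into a dictionary and used to compute how many columns each data row should have. This is a rough sketch, not STRUCTURE's own parser. The names `parse_params` and `expected_columns` are hypothetical, and it assumes that each of LABEL, POPDATA, POPFLAG, LOCDATA and PHENOTYPE adds exactly one leading column when set to 1.

```python
import re

# Matches lines such as "#define NUMLOCI    12    // (int) ..."
DEFINE_RE = re.compile(r"^\s*#define\s+(\w+)\s+(\S+)")

# Assumption: each of these flags adds one leading column per data row.
ONE_COLUMN_FLAGS = ("LABEL", "POPDATA", "POPFLAG", "LOCDATA", "PHENOTYPE")

def parse_params(text):
    """Collect NAME -> value (as strings) from '#define NAME value' lines."""
    params = {}
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params

def expected_columns(params):
    """Columns per data row: leading columns plus genotype columns."""
    leading = sum(int(params.get(flag, 0)) for flag in ONE_COLUMN_FLAGS)
    leading += int(params.get("EXTRACOLS", 0))
    genotype = int(params["NUMLOCI"])
    if int(params.get("ONEROWPERIND", 0)):
        # With one row per individual, every locus takes PLOIDY columns.
        genotype *= int(params["PLOIDY"])
    return leading + genotype

# Excerpt of the mainparams file shown above.
mainparams_excerpt = """
#define NUMINDS    141    // (int) number of diploid individuals in data file
#define NUMLOCI    12    // (int) number of loci in data file
#define PLOIDY       2    // (int) ploidy of data
#define ONEROWPERIND 0    // (B) store data for individuals in a single line
#define LABEL     1     // (B) Input file contains individual labels
#define POPDATA   0     // (B) Input file contains a population identifier
#define POPFLAG   0     // (B) Input file contains a flag which says
#define LOCDATA   0     // (B) Input file contains a location identifier
#define PHENOTYPE 0     // (B) Input file contains phenotype information
#define EXTRACOLS 0     // (int) Number of additional columns of data
"""

params = parse_params(mainparams_excerpt)
print(params["LOCDATA"])         # prints 0
print(expected_columns(params))  # prints 13
```

With these settings each row should hold one label plus 12 alleles. The same `parse_params` function could be pointed at the extraparams file, which is not shown here, to list any settings there that relate to location information.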

This is the command I use to run:

structure_threader run -K 10 -R 3 -i /gpfs_common/share03/lmquesad/chparada/structure/input_file_NOvarroa.txt -o /gpfs_common/share03/lmquesad/chparada/structure/ -t 16 -st /usr/local/usrapps/lmquesad/env_structure_threader/bin/structure

Any suggestions would be appreciated.

Thanks!!!

Camilo

popgen structure software error hpc • 2.7k views