Hi everyone,
I am trying to use the CNV caller.
a) GATK version used: gatk-4.1.4.0
I used the following command in this step.
../gatk-4.1.4.0/gatk -L Filtered_annotated_preprocessed_intervals_Twist.interval_list --interval-merging-rule OVERLAPPING_ONLY -I S1071Nr10.counts.hdf5 -I S1071Nr11.counts.hdf5 (200 samples were given as input; the remaining -I arguments are omitted here to save space) --contig-ploidy-priors ../contig_ploidy_priors.tsv --output . --output-prefix ploidy --verbosity DEBUG --mapping-error-rate 0.01 --global-psi-scale 0.001 --sample-psi-scale 1.0E-4 --mean-bias-standard-deviation 0.01
I installed the conda environment following https://gatk.broadinstitute.org/hc/en-us/articles/360035889851?flash_digest=f2aaedc26749c67b8005def080fde44460155fb6#
Everything was working until I got the following error, which I do not understand and do not know how to solve.
16:54:47.473 DEBUG ScriptExecutor -
--output_model_path=/data/NGS/Reanalysis-Package/CNV/ploidy-model
/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/tmp/cohort_determine_ploidy_and_depth.1941148667013278511.py", line 79, in <module>
    args.contig_ploidy_prior_table)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_ploidy.py", line 182, in get_contig_ploidy_prior_map_from_tsv_file
    delimiter=delimiter)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/gcnvkernel/io/io_commons.py", line 50, in read_csv
    input_pd = pd.read_csv(fh, delimiter=delimiter, dtype=dtypes_dict)  # dtypes_dict keys may not be present
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 705, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 451, in _read
    data = parser.read(nrows)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1065, in read
    ret = self._engine.read(nrows)
  File "/homefolder/zfatahi/miniconda3/envs/gatk/lib/python3.6/site-packages/pandas/io/parsers.py", line 1828, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 894, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 916, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 970, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 957, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 2200, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7
16:54:55.812 DEBUG ScriptExecutor - Result: 1
16:54:55.813 INFO DetermineGermlineContigPloidy - Shutting down engine
[February 3, 2020 4:54:55 PM IRST] org.broadinstitute.hellbender.tools.copynumber.DetermineGermlineContigPloidy done. Elapsed time: 0.78 minutes.
Runtime.totalMemory()=3370123264
org.broadinstitute.hellbender.utils.python.PythonScriptExecutorException:
python exited with 1
Command Line: python
So it seems that the error is:
pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 58, saw 7
I googled a lot but could not figure out what the problem is (I have no experience working with Python; I am just following the steps here: https://gatkforums.broadinstitute.org/gatk/discussion/11684).
Can anyone help me to solve the issue?
Thanks in advance,
Zohreh
The most likely explanation is that the file you're using to define contig ploidy priors has 7 fields instead of 5 in one or more of its rows (the error message points to line 58). Even if it does not look that way, double-check that you don't have any extra tabs, for example at the end of a line (you can check with vim on the command line or with Notepad++). Hope this helps.
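
If checking by eye is awkward, the same check can be scripted. Below is a minimal sketch, not part of GATK: the helper name find_bad_rows and the sample rows are made up for illustration, and it assumes the file is tab-delimited with the first line as the header. It takes the number of fields on the first line as the reference and reports every line whose field count differs.

```python
def find_bad_rows(lines, delimiter="\t"):
    """Return (line_number, field_count) for every line whose number of
    fields differs from the first (header) line.

    `lines` can be any iterable of strings, e.g. an open file handle.
    Hypothetical helper for illustration; not part of GATK or gcnvkernel.
    """
    expected = None
    bad = []
    for line_number, line in enumerate(lines, start=1):
        # Strip only the line ending, so trailing tabs are still counted.
        fields = line.rstrip("\r\n").split(delimiter)
        if expected is None:
            expected = len(fields)  # the header defines the expected count
            continue
        if len(fields) != expected:
            bad.append((line_number, len(fields)))
    return bad


# Made-up in-memory example: the third line has two stray trailing tabs.
sample = ["A\tB\tC\n", "1\t2\t3\n", "4\t5\t6\t\t\n"]
print(find_bad_rows(sample))

# To check a real file, pass the open handle instead, e.g.:
# with open("../contig_ploidy_priors.tsv") as fh:
#     print(find_bad_rows(fh))
```

The line numbers reported here count every physical line in the file, so they may not match the number in the pandas message exactly if the parser skips some lines, but they should point at the same neighborhood. Trailing tabs show up as empty extra fields, which is exactly the kind of thing that is invisible in most editors.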
Thanks! That solved the issue.