GATK SelectVariants and Separation of Variants
9.8 years ago
basalganglia ▴ 40

Hello cheerful bioinformaticians :)

I have a problem with the GATK SelectVariants tool. I have a VCF file that includes many samples, and I want to separate certain samples into different files.

I downloaded GATK (GenomeAnalysisTK.jar) and used the following command. My sample names look like 14-254, 14-345, ..., so I wrote 14-282.141202 instead of SAMPLE_A_PARC.

Select two samples out of a VCF with many samples:
 java -Xmx2g -jar GenomeAnalysisTK.jar \
   -R ref.fasta \
   -T SelectVariants \
   --variant input.vcf \
   -o output.vcf \
   -sn SAMPLE_A_PARC \
   -sn SAMPLE_B_ACTG

But I received this error message:

##### ERROR A USER ERROR has occurred (version 3.3-0-g37228af):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: Invalid argument value 'R' at position 0.
##### ERROR Invalid argument value 'ref.fasta' at position 1.
##### ERROR Invalid argument value 'T' at position 2.
##### ERROR Invalid argument value 'SelectVariants' at position 3.
##### ERROR Invalid argument value 'TR-ETM-59.vcf' at position 6.
##### ERROR Invalid argument value 'sn' at position 7.
##### ERROR Invalid argument value '14-282.141202' at position 8.
##### ERROR --------------------------------------------------------------------

Thanks and have a nice bioinformatics !!!! :)

selectvariant gatk split-variant • 5.6k views
9.8 years ago
Ram 44k

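Check your command. The error lists 'R', 'T' and 'sn' as argument values, so it seems you might be using just R instead of -R (and likewise T and sn without the leading dash).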
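
To make the dashes easy to check, here is a minimal sketch that only prints one SelectVariants command per sample, so each sample ends up in its own output file. The function name `print_select_cmds` is made up for illustration, the file names are the ones from the error output, and the second sample name is hypothetical. The arguments are the GATK 3.x ones already used in the question.

```shell
# Hypothetical helper: prints (does not run) one SelectVariants command
# per sample, so the dashes and file names can be inspected first.
# Arguments are the GATK 3.x ones from the question: -T, -R, --variant, -o, -sn.
print_select_cmds() {
    ref=$1
    vcf=$2
    shift 2
    for sample in "$@"; do
        printf 'java -Xmx2g -jar GenomeAnalysisTK.jar -T SelectVariants -R %s --variant %s -o %s.vcf -sn %s\n' \
            "$ref" "$vcf" "$sample" "$sample"
    done
}

# ref.fasta, TR-ETM-59.vcf and 14-282.141202 come from the error output;
# 14-254.141202 is a made-up second sample name.
print_select_cmds ref.fasta TR-ETM-59.vcf 14-282.141202 14-254.141202
```

Every option in the printed lines starts with a plain ASCII hyphen. Once the lines look right, they can be run by piping the output to `sh`.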

Thank you so much, but now I get this error message:

ERROR MESSAGE: The fasta file you specified (/home/bio/IGBAM/ref.fasta) does not exist

So do I need to download a ref.fasta file?

Yes, you do. Download the appropriate reference file (ucsc.hg19/GRCh37), i.e. the same reference build that your VCF was called against.
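
Besides the FASTA itself, GATK also expects a `.fai` index and a `.dict` sequence dictionary next to it. Here is a rough sketch of a check you could run before calling GATK. The function name `check_reference` is made up, and the path is the one from the error message.

```shell
# Hypothetical helper: reports which reference files are missing.
# For ref.fasta, GATK looks for ref.fasta.fai and ref.dict in the same directory.
check_reference() {
    ref=$1
    missing=0
    for f in "$ref" "$ref.fai" "${ref%.*}.dict"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return $missing
}

# The index and dictionary can be created with the commands below
# (assuming samtools and Picard are installed):
#   samtools faidx ref.fasta
#   java -jar picard.jar CreateSequenceDictionary R=ref.fasta O=ref.dict
check_reference /home/bio/IGBAM/ref.fasta || echo "fix the files listed above before running GATK"
```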

Thank you for your kind reply.

Can I use my samples' BAM or FASTQ files instead of the ref.fasta file?

No, ref.fasta is the reference sequence. The samples' BAM files are the alignments of your reads to the reference, and the FASTQ files are the reads themselves.

The FASTQ files and ref.fa are the two basic pieces of information on which the entire process is built, so you cannot do without either of them.
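
Since the VCF was produced from alignments to one particular reference, one quick hint about which download to pick is the contig naming: UCSC hg19 uses names like `chr1`, while GRCh37 files typically use `1`. The sketch below compares the naming style of a VCF and a FASTA. All three function names are made up for illustration, and `input.vcf` and `ref.fasta` are the placeholder names from the question. A matching style is necessary but does not prove that the builds are identical.

```shell
# Hypothetical helper: first contig name in a VCF's records
# (column 1 of the first non-header line).
first_vcf_contig() {
    awk '!/^#/ { print $1; exit }' "$1"
}

# Hypothetical helper: first sequence name in a FASTA file
# (the header text up to the first whitespace, without the leading '>').
first_fasta_contig() {
    awk '/^>/ { sub(/^>/, "", $1); print $1; exit }' "$1"
}

# Hypothetical helper: classifies a contig name by its prefix.
naming_style() {
    case $1 in
        chr*) echo "ucsc" ;;
        *)    echo "plain" ;;
    esac
}

# Only runs when both placeholder files are present in the current directory.
if [ -f input.vcf ] && [ -f ref.fasta ]; then
    echo "VCF:       $(naming_style "$(first_vcf_contig input.vcf)")"
    echo "reference: $(naming_style "$(first_fasta_contig ref.fasta)")"
fi
```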
