6.4 years ago
Omics data mining
Dear all
I urgently need to download the SNP information for TCGA-PAAD and run it in PLINK for further analysis. I already have the expression data for 183 samples. Is it possible to get the SNPs for the same 183 samples in VCF format?
Waiting for a reply.
All suggestions will be appreciated
Thank you in advance
Archana
Hi
I did the same. While running maf2vcf.pl I get this warning:
WARNING: Reference allele mismatches found.
Many of the SNPs are being filtered out. What should I do?
I checked the respective positions on the given genome build, and the bases were exactly the same, yet I still get this warning. Can anybody suggest how to fix it? Otherwise I will lose most of the SNP information for downstream analysis, which I do not want. Alternatively, is it possible to create the VCF manually? If so, how?
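One way to narrow down which records trigger the warning is to compare each MAF row's Reference_Allele against the reference sequence directly. The following is only a minimal sketch under stated assumptions: the function name find_ref_mismatches is hypothetical, the genome is passed as a plain dict of contig name to sequence (a real run would load the FASTA instead), and the MAF is assumed to use the standard Chromosome, Start_Position and Reference_Allele columns with 1-based coordinates. It is not part of maf2vcf.pl. A contig that is missing from the dict (for example chr1 versus 1) is reported with None, and a MAF produced on a different genome build would show up as many base mismatches.

```python
import csv

def find_ref_mismatches(maf_text, genome):
    """Return (chrom, start, maf_ref, genome_ref) for rows that disagree.

    genome is a dict mapping contig name -> sequence string (an assumption
    made for this sketch; load your FASTA however you prefer).
    """
    mismatches = []
    # MAF files may start with '#' comment lines; skip them before parsing.
    lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    for row in csv.DictReader(lines, delimiter="\t"):
        chrom = row["Chromosome"]
        start = int(row["Start_Position"])  # 1-based in MAF
        ref = row["Reference_Allele"]
        if ref == "-":
            # Insertions carry no reference bases to compare.
            continue
        seq = genome.get(chrom)
        if seq is None:
            # Contig name not found in the reference (e.g. 'chr1' vs '1').
            mismatches.append((chrom, start, ref, None))
            continue
        observed = seq[start - 1:start - 1 + len(ref)]
        if observed.upper() != ref.upper():
            mismatches.append((chrom, start, ref, observed))
    return mismatches

# Toy data for illustration only; not real TCGA records.
demo_genome = {"1": "ACGTACGTAC"}
demo_maf = "\n".join([
    "#version 2.4",
    "Hugo_Symbol\tChromosome\tStart_Position\tReference_Allele",
    "GENE_A\t1\t3\tG",
    "GENE_B\t1\t5\tT",
    "GENE_C\tchr1\t1\tA",
])
print(find_ref_mismatches(demo_maf, demo_genome))
```

If nearly every row is reported with None, the contig naming differs between the MAF and the FASTA; if rows are reported with a different observed base, the MAF and the FASTA are probably on different builds.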
Can you post the command you typed? Also, are you getting many VCF files in the output folder? The script generates one VCF file per tumor-normal sample pair. Are you getting only the warning you mentioned earlier, or other errors as well? Detailed information will help us debug. Otherwise, please provide the input file.
Hi
First of all, I defined the reference FASTA in the script maf2vcf.pl as my $ref_fasta = "/path/Homo_sapiens.GRCh37.75.dna.toplevel.fa" and saved the change. Here is the command I used to convert the MAF to VCF:
perl maf2vcf.pl --input-maf TCGA-IB-8127-01.maf.txt --output-dir TCGA-IB-8127-01.vcf
Thank you in advance
Archana