Hi,
We want to check the concordance of the NGS data obtained from targeted SOLiD sequencing with the Affymetrix 6.0 data generated on the same sample. Both are genomic germline DNA, not tumor.
Is there a simple tool that will take the SAM file or variant file and look for concordance with the individual's genotype file (say, if extracted from PLINK)?
Also hoping to hear how to do this otherwise, such as matching up the positions and alleles and then comparing them, tabulating the number of concordant vs. discordant calls, etc.
The Affy 6.0 chip has a translation file that lists all the typed genome positions, with reference and alternative allele. Considering that NGS will only detect how your sample differs from the current reference, taking the Affy 6.0 alleles and positions there are two obvious checks you can perform:
Check that all the SNPs Affy 6.0 typed as alternative allele WERE FOUND by NGS (hence called as variants).
Check that all the SNPs Affy 6.0 typed as reference allele WERE NOT FOUND by NGS (hence considered reference).
Keep in mind that genotyping is limited to the fixed positions on the chip, so you won't be able to do anything in the reverse direction, i.e. check things that arise from NGS but are not on the Affy 6.0 chip. Still, this check will be powerful, since the typing quality of Affy 6.0 is quite high: you'll have ~1M direct checks (SNP positions) and another ~1M less direct checks (CNVs), depending on your NGS experiment type.
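The two checks above can be sketched roughly as follows in Python. This is only an illustration: the input structures (`array_calls`, `ngs_variants`, `covered`) and the function name are hypothetical, and how you fill them from the Affy translation file and your variant file is up to you. The `covered` set is an added assumption: with targeted sequencing, array sites outside the target (or with too little depth) should probably be excluded rather than counted as "not found".

```python
def check_array_against_ngs(array_calls, ngs_variants, covered):
    """Tabulate the two checks for every SNP typed on the array.

    array_calls  : dict mapping (chrom, pos) -> "alt" or "ref" (hypothetical layout)
    ngs_variants : set of (chrom, pos) where NGS called a variant
    covered      : set of (chrom, pos) with usable NGS coverage (assumption)
    """
    counts = {
        "alt_found": 0,           # array says alt, NGS called a variant
        "alt_missed": 0,          # array says alt, NGS called nothing
        "ref_clean": 0,           # array says ref, NGS called nothing
        "ref_called_variant": 0,  # array says ref, NGS called a variant
        "not_covered": 0,         # site cannot be checked at all
    }
    for site, status in array_calls.items():
        if site not in covered:
            counts["not_covered"] += 1
        elif status == "alt":
            key = "alt_found" if site in ngs_variants else "alt_missed"
            counts[key] += 1
        else:
            key = "ref_called_variant" if site in ngs_variants else "ref_clean"
            counts[key] += 1
    return counts
```

The `alt_missed` and `ref_called_variant` counts are the discordant sites worth inspecting by hand.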
Excel, really? There are Excel templates on the Microsoft website that work as analysis tools for bioinformatics. Not sure if they have something like this. But conceptually, I think a Linux 'join' command is what you need to work fast.
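For what it's worth, the same join idea can be sketched in Python instead of with the 'join' command. This assumes you have already parsed both files into dictionaries keyed by (chrom, pos) with a two-letter genotype string as the value, and that both sets of alleles are reported on the same strand; the names below are made up for illustration.

```python
from collections import Counter

def normalize(genotype):
    # Treat "GA" and "AG" as the same unordered genotype.
    return "".join(sorted(genotype.upper()))

def tabulate_concordance(array_gt, ngs_gt):
    """Inner-join two genotype tables on (chrom, pos) and count agreement.

    array_gt, ngs_gt : dicts mapping (chrom, pos) -> genotype such as "AG"
    (hypothetical layout; same-strand alleles are assumed).
    """
    tally = Counter()
    # Only positions present in both tables are compared, like 'join' does.
    for site in array_gt.keys() & ngs_gt.keys():
        same = normalize(array_gt[site]) == normalize(ngs_gt[site])
        tally["concordant" if same else "discordant"] += 1
    return tally
```

Positions present in only one table are dropped by the join, so it is worth reporting how many array sites had no NGS genotype at all alongside the concordance counts.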
If in VCF format, you can use the vcftools --diff options to do this. Alternatively, you could convert the format to PLINK and use the diff mode of that program.
Just have to say wow...
Wow... that's a tough job.