Hi,
If I have targeted gene resequencing data (1000 genes) for 75 individuals in group A and 75 in group B, is there an established method of looking for SNPs or Indels associated with the two groups?
For example, I know I could align to a reference genome with Bowtie and then call SNPs for each individual with SAMtools, but how would I then go about considering the 75 Group A vs Group B?
Is there anything that allows this kind of analysis or would I have to design scripts myself?
Thanks in advance.
Actually, it is not always necessary to draw a line between variant detection and association. In addition to SAMtools, there is a recently published paper that performs association testing directly from the sequencing data.
Although we are not fans of PLINK, +1 for a clear and concise drawing of the line from variant detection to the subsequent association study. Very nice!
Can you elaborate there? Aren't you still asking whether there is a distribution of alleles that suggests that the two sample sets are not drawn from the same underlying population, and that the samples segregate according to some a priori condition (e.g. cases vs. controls)? That's an association study.
An association test can be done without genotypes. In principle, you do not need individual genotype calls to estimate the allele frequency in each group.
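To make that concrete, here is a minimal sketch of a per-site allelic test that works on allele counts per group rather than on genotype calls. It is only an illustration, not the method from the paper mentioned above: the function name and the example counts are made up, and pooling reads as if each one were an independent allele observation is a simplifying assumption (it ignores uneven depth between individuals and sequencing error).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups, columns are reference / alternate allele counts.
    (Hypothetical helper written for this example.)
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one (small tolerance for floating-point ties).
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Made-up allele counts at a single site (assumption, not real data):
# group A: 900 reference, 100 alternate; group B: 800 reference, 200 alternate.
p_value = fisher_exact_two_sided(900, 100, 800, 200)
print(f"p = {p_value:.3g}")
```

With 1000 genes you would run this at every variable site and then correct for multiple testing. The same function also works if the counts come from called genotypes (two alleles per individual) instead of reads.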