SNP Cluster Analysis
12.2 years ago
jackuser1979 ▴ 890

I have Illumina paired-end reads mapped to a reference genome with bowtie. I created an mpileup with samtools and called SNPs from it with a variant caller (VarScan), which gave me output in VCF format. I now need to do SNP cluster analysis. Is there any software, or are there any R packages, for SNP cluster analysis?

r • 11k views

What do you mean by clustering? Many different types of analysis use clustering for NGS data...


One could even ask WHY you need to do cluster analysis. Just because somebody told you to?


Just out of curiosity. I want to try clustering analysis. Any info or URLs to get started are welcome.

11.9 years ago
Houkto ▴ 220

Like the other commenters, I do not understand what you mean by cluster analysis using SNPs. However, you can look at the distribution and frequency of SNPs in windows of a fixed size across the genome. That shows whether SNPs cluster in a particular region, for example on one chromosome. A tool called Circos can plot this (it has a tutorial). Another way of grouping SNPs is by their predicted effect on a gene, such as synonymous, non-synonymous, or stop-codon variants, using the Ensembl tool Variant Effect Predictor. These are the two kinds of clustering I can think of. A third applies when you have sequenced more than one genome of the same species and want to see how closely related they are.
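To make the window idea concrete, here is a minimal sketch in plain Python that counts variant records per fixed-size window on each chromosome. It assumes an uncompressed, tab-separated VCF; the function name `snp_window_counts`, the window size, and the example records are made up for illustration and are not part of any tool mentioned in this thread.

```python
from collections import Counter

def snp_window_counts(vcf_lines, window=100_000):
    """Count variant records per (chromosome, window index).

    Window 0 covers positions 1..window, window 1 the next `window`
    bases, and so on (VCF positions are 1-based).
    """
    counts = Counter()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos = line.split("\t", 2)[:2]
        counts[(chrom, (int(pos) - 1) // window)] += 1
    return counts

# Hypothetical records, only to show the input shape.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t150\t.\tA\tG\t.\tPASS\t.",
    "chr1\t900\t.\tC\tT\t.\tPASS\t.",
    "chr1\t1200\t.\tG\tA\t.\tPASS\t.",
    "chr2\t50\t.\tT\tC\t.\tPASS\t.",
]

counts = snp_window_counts(example, window=1000)
for (chrom, idx), n in sorted(counts.items()):
    # Print chromosome, window start, window end, number of records.
    print(chrom, idx * 1000 + 1, (idx + 1) * 1000, n)
```

For a real file you could pass an open file handle instead of the list, since the function only iterates over lines. Windows with an unusually high count are candidate SNP clusters; a table like this is also the kind of per-window data a plotting tool would need.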

Let me know which one you mean, or if it is none of these examples, so that I can be more helpful.

11.9 years ago
brentp 24k

If you are asking about burden testing, have a look at AssoTesteR.

You simply code your phenotype as 1 / 0 for case / control and build a genotype matrix with 1 for the alternate allele and 0 for the reference, with SNPs in columns and samples in rows.

Once your data is in that format, you can perform a variety of multi-locus tests, including, for example, C-alpha.
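As a rough sketch of getting from a multi-sample VCF to the layout described above (samples in rows, SNPs in columns, 1 for alternate and 0 for reference), something like the following plain-Python parser could work. It is not the input API of any particular package. The function name `genotype_matrix`, the example records, and the phenotype vector are hypothetical, and two choices here are my own assumptions: a sample is coded 1 if it carries any non-reference allele, and missing genotypes (`./.`) are coded 0.

```python
import re

def genotype_matrix(vcf_lines):
    """Return (sample_names, matrix) with samples in rows, SNPs in columns."""
    samples, columns = [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        gt_index = fields[8].split(":").index("GT")
        column = []
        for sample_field in fields[9:]:
            alleles = re.split(r"[/|]", sample_field.split(":")[gt_index])
            # Assumption: 1 if any allele is non-reference, missing counts as 0.
            column.append(int(any(a not in ("0", ".") for a in alleles)))
        columns.append(column)
    # Transpose so that each row is one sample.
    matrix = [list(row) for row in zip(*columns)]
    return samples, matrix

# Hypothetical records, only to show the input shape.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\ts3",
    "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0/0:10\t0/1:12\t1/1:8",
    "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT:DP\t0/1:9\t0/0:11\t./.:0",
]

samples, matrix = genotype_matrix(example)
phenotype = [1, 0, 0]  # made-up case (1) / control (0) labels, one per sample
for name, row, status in zip(samples, matrix, phenotype):
    print(name, status, row)
```

Whether missing calls should be 0, dropped, or imputed depends on the test you run afterwards, so treat that line as a placeholder.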


Hmm, maybe a tool that reads the locations in a VCF file and shows a chart of their density?

f(x) = number of locations between x·L/640 and (x+1)·L/640, where L = total length

Looks useful, I want it too. Trying to write it...
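A minimal sketch of that f(x) in plain Python, assuming you already have the variant positions as 1-based integers and know the total length L. The names `density_bins` and `text_chart` and the example numbers are made up, and the text bar chart is only a stand-in for a real plot.

```python
def density_bins(positions, total_length, n_bins=640):
    """f(x) for x in 0..n_bins-1: number of positions falling in bin x.

    Bin x covers the stretch from x*L/n_bins to (x+1)*L/n_bins.
    """
    counts = [0] * n_bins
    for pos in positions:
        x = (pos - 1) * n_bins // total_length
        # Guard against positions beyond the stated total length.
        x = min(x, n_bins - 1)
        counts[x] += 1
    return counts

def text_chart(counts):
    """Render one line per bin, with one '#' per location in that bin."""
    return "\n".join(f"{x:4d} {'#' * n}" for x, n in enumerate(counts))

# Hypothetical positions on a 100-base sequence, split into 4 bins.
bins = density_bins([5, 10, 20, 60, 99], total_length=100, n_bins=4)
print(text_chart(bins))
```

With the default `n_bins=640` each bin maps naturally to one pixel column of a 640-pixel-wide chart. For several chromosomes you would either run this per chromosome or add each chromosome's offset to its positions first.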
