I am working on a project to compare tools for detecting SNPs, indels, and CNVs. I have tried many tools and managed to run a few of them. The problem now is that I need datasets that are known to contain SNPs, indels, or CNVs, so that I can be sure the tools have something to find. My hardware is limited (an i5 processor with 2 GB of RAM), so large files may take a long time to process.
Can you suggest some small datasets that are known to contain indels, SNPs, or CNVs, so that I can run the analysis and compare the tools on them?
You could use some of the ICR142 NGS validation series: "The dataset includes high-quality exome sequence data from 142 samples together with Sanger sequence data at 730 sites; 409 sites with variants and 321 sites at which variants were called by an NGS analysis tool, but no variant is present in the corresponding Sanger sequence. The dataset includes 286 indel variants and 275 negative indel sites, and thus the ICR142 validation dataset is of particular utility in evaluating indel calling performance.
The FASTQ files and Sanger sequence results can be accessed in the European Genome-phenome Archive under the accession number EGAS00001001332. (Requires submission of an application form)"
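Once you have a validated set like this, comparing tools mostly comes down to counting how many confirmed variants each tool recovers and how many confirmed-negative sites it wrongly calls. Below is a minimal Python sketch of that scoring step. It is only an illustration: the `score_calls` function, the `(chrom, pos, ref, alt)` key, and the toy sites are all made up for this example and are not part of ICR142 or of any variant caller. Calls at sites with no Sanger result are ignored because they cannot be judged either way.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical site key: (chromosome, 1-based position, ref allele, alt allele).
Site = Tuple[str, int, str, str]


def score_calls(calls: Iterable[Site],
                positives: Set[Site],
                negatives: Set[Site]) -> Dict[str, float]:
    """Score one tool's calls against Sanger-confirmed positive and negative sites."""
    called = set(calls)
    tp = len(called & positives)   # confirmed variants the tool found
    fn = len(positives - called)   # confirmed variants the tool missed
    fp = len(called & negatives)   # confirmed-absent sites the tool still called
    tn = len(negatives - called)   # confirmed-absent sites the tool left alone
    return {
        "tp": tp, "fn": fn, "fp": fp, "tn": tn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy data, invented purely to show the function running.
positives = {("1", 100, "A", "AT"), ("2", 200, "GC", "G"), ("3", 300, "C", "T")}
negatives = {("4", 400, "T", "TA"), ("5", 500, "AG", "A")}
tool_calls = [
    ("1", 100, "A", "AT"),   # true positive
    ("2", 200, "GC", "G"),   # true positive
    ("4", 400, "T", "TA"),   # false positive (Sanger shows no variant)
    ("9", 900, "A", "G"),    # not validated, so ignored
]

result = score_calls(tool_calls, positives, negatives)
print(result)
```

Run the same function once per tool and compare the resulting sensitivity and specificity. How you load the sites from the Sanger results and from each tool's output is left out here, since it depends on the file layouts you end up with. Also note that different tools can represent the same indel differently, so exact matching on position and alleles may undercount true positives unless the calls are normalized first.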