Hi all,
The annotation file provided by Illumina for the RatRef expression chip is poor. Since Illumina stopped selling these chips, they have also stopped updating the annotation, so I am using the ReMOAT and Ensembl annotations to get better annotation when analysing my data. Recently we obtained NGS data for our rat strains, and we wanted to find out which SNPs from that analysis fall in the regions targeted by the probes. I pulled the probe coordinates from the ReMOAT annotation, checked whether the genomic locations of our SNPs fall within each probe's range, and got my results. However, I then realized that for many Illumina probes the distance between the annotated start and end is not 50 bp, in both ReMOAT and Ensembl. What I am trying to do now is BLAST the probe sequences against the rat reference genome, so that the output is the Illumina probe ID, the probe sequence, and its genomic coordinates. Is there an automated way to do this, whether with a web-based program or Perl software? Any suggestions?
Thanks for your input, Sean. I did realize this after I BLASTed the sequences. However, BLASTing the sequences at least lets me look for a SNP within the probe sequence itself instead of across the large annotated probe range. So I am still looking for a systematic way of handling big BLAST jobs, whether via NGS tools or via NCBI and/or Ensembl. The alternative would be to add 50 bp to the probe start coordinate and subtract 50 bp from the probe end coordinate, and then look for SNPs within these two separate datasets. Either way, the aim is to eliminate some of the probes that have been implicated in our analysis.
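For the second option, a minimal Python sketch of the overlap check might look like the following. It assumes 1-based inclusive coordinates, that probes arrive as `(probe_id, chrom, start, end)` tuples and SNPs as `(chrom, pos)` tuples, and that chromosome names match between the two files. The function names and the probe ID used below are hypothetical, not taken from ReMOAT or Ensembl.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

PROBE_LEN = 50  # assumption: every probe is a 50-mer


def candidate_windows(start, end, probe_len=PROBE_LEN):
    """Return the two candidate windows (1-based, inclusive):
    one anchored at the annotated start, one at the annotated end."""
    return [(start, start + probe_len - 1), (end - probe_len + 1, end)]


def index_snps(snps):
    """Group SNP positions by chromosome and sort them for binary search."""
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()
    return by_chrom


def snps_in_window(snp_index, chrom, lo, hi):
    """Return all SNP positions on chrom with lo <= pos <= hi."""
    positions = snp_index.get(chrom, [])
    return positions[bisect_left(positions, lo):bisect_right(positions, hi)]


def probes_with_snps(probes, snps):
    """Map each probe ID to the sorted SNP positions found in either window.
    Probes with no SNP in either window are left out of the result."""
    snp_index = index_snps(snps)
    hits = {}
    for probe_id, chrom, start, end in probes:
        found = set()
        for lo, hi in candidate_windows(start, end):
            found.update(snps_in_window(snp_index, chrom, lo, hi))
        if found:
            hits[probe_id] = sorted(found)
    return hits


if __name__ == "__main__":
    # Hypothetical probe whose annotated range is longer than 50 bp.
    probes = [("ILMN_EXAMPLE", "chr1", 1000, 1200)]
    snps = [("chr1", 1010), ("chr1", 1100), ("chr1", 1151)]
    print(probes_with_snps(probes, snps))
```

Because the true probe position inside a long annotated range is unknown, a hit in either window only flags the probe as a candidate; aligning the probe sequence itself is still needed to confirm it.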
I'd suggest blat or gmap for alignment. Each will align your 50bp probes to the genome in minutes.
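If you go the BLAT route, the alignments come back in PSL format (21 tab-separated columns, with 0-based, half-open target coordinates), and turning them into a probe ID / chromosome / start / end table takes only a short script. The sketch below is one possible way to do that in Python, assuming the probe FASTA headers are the Illumina IDs and that you only want ungapped, full-length, mismatch-free hits. The function names and the probe ID in the example are made up for illustration.

```python
def parse_psl(lines):
    """Yield one dict per PSL alignment line, skipping header lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 21 or not fields[0].isdigit():
            continue  # header, separator, or blank line
        yield {
            "matches": int(fields[0]),
            "mismatches": int(fields[1]),
            "rep_matches": int(fields[2]),
            "q_num_insert": int(fields[4]),
            "t_num_insert": int(fields[6]),
            "strand": fields[8],
            "q_name": fields[9],
            "q_size": int(fields[10]),
            "t_name": fields[13],
            "t_start": int(fields[15]),  # 0-based
            "t_end": int(fields[16]),    # exclusive
        }


def full_length_hits(lines):
    """Return (probe_id, chrom, start, end, strand) for perfect hits.

    Start and end are converted to 1-based inclusive coordinates so they
    can be compared directly with 1-based SNP positions.
    """
    hits = []
    for aln in parse_psl(lines):
        perfect = (
            aln["matches"] + aln["rep_matches"] == aln["q_size"]
            and aln["mismatches"] == 0
            and aln["q_num_insert"] == 0
            and aln["t_num_insert"] == 0
        )
        if perfect:
            hits.append((aln["q_name"], aln["t_name"],
                         aln["t_start"] + 1, aln["t_end"], aln["strand"]))
    return hits


if __name__ == "__main__":
    # One made-up PSL record: a 50 bp probe matching chr1 perfectly.
    example = "\t".join([
        "50", "0", "0", "0", "0", "0", "0", "0", "+",
        "ILMN_EXAMPLE", "50", "0", "50",
        "chr1", "1000000", "999", "1049",
        "1", "50,", "0,", "999,",
    ])
    for hit in full_length_hits([example]):
        print("\t".join(str(x) for x in hit))
```

Probes that return more than one perfect hit, or none at all, are worth listing separately, since those are the ones whose annotation is most likely to be unreliable.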