I'm following this tutorial to create indices of the genome and transcripts at http://icb.med.cornell.edu/wiki/index.php/Elementolab/BWA_tutorial, but I could not find the RefSeq file at http://physiology.med.cornell.edu/faculty/elemento/lab/files/refGene.txt.07Jun2010.fa (the file is no longer there; the link returns a 403 error).
I tried googling it and found this NCBI FTP directory (ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene/), but I'm not sure which file I should use for that tutorial. Can someone help me out? Thanks.
PS: I am trying to set everything up for exome-sequencing analysis (variant calling).
Oh, I just re-read your title.
To clarify, you probably don't need the RefSeq information for DNA-Seq variant calling (using the pipeline that I mentioned above).
For RNA-Seq you probably will need a table of mRNA locations. However, I probably wouldn't recommend using BWA for RNA-Seq. You should use a splice-aware aligner such as TopHat or STAR to handle the exon junctions.
Either way, you probably really want the coordinates rather than the mRNA sequences themselves.
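If it helps, here is a rough sketch of pulling exon coordinates out of a refGene-style table in Python. It assumes the file follows the UCSC refGene layout (tab-separated, with a leading bin column, then name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...); check your copy before relying on the column positions. The function name and the example row are made up for illustration.

```python
def refgene_line_to_exons(line):
    """Turn one refGene-style row into BED-like exon tuples.

    Assumed column order (UCSC refGene table, with leading bin column):
    bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd,
    exonCount, exonStarts, exonEnds, ...
    """
    fields = line.rstrip("\n").split("\t")
    name, chrom, strand = fields[1], fields[2], fields[3]
    # exonStarts / exonEnds are comma-separated lists with a trailing comma,
    # so empty pieces are skipped.
    starts = [int(s) for s in fields[9].split(",") if s]
    ends = [int(e) for e in fields[10].split(",") if e]
    # UCSC tables use 0-based, half-open coordinates, the same as BED,
    # so no offset is applied here.
    return [(chrom, s, e, name, 0, strand) for s, e in zip(starts, ends)]


if __name__ == "__main__":
    # Made-up row for illustration only, not a real transcript.
    row = "0\tNM_EXAMPLE\tchr1\t+\t100\t500\t150\t450\t2\t100,300,\t200,500,\t0\tGENE1\tcmpl\tcmpl\t0,1,"
    for exon in refgene_line_to_exons(row):
        print("\t".join(str(x) for x in exon))
```

Each tuple is (chrom, start, end, name, score, strand), so writing them out tab-separated gives something close to a 6-column BED file of exons that can be passed to tools expecting interval files.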