Hi, I have downloaded a FASTA file from dbSNP containing all SNPs for a certain gene. Unfortunately the sequences in that file are nucleotide, whereas I need the peptide sequences (I would have thought there would be an option to specify this when downloading). I see there are programs such as transeq that can translate them, but they need the reading frame for each SNP. I assume that the sequences in my FASTA file will be spread across different reading frames. How can I obtain this information? Ideally I'd like to pass the entire file to transeq, but I guess I will have to split the file into individual SNPs and feed each one into transeq once I know its reading frame, unless anyone has any other suggestions for getting the protein sequence. Unfortunately I will not be able to use UCSC.
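
To make the idea concrete, here is a minimal sketch of one possible workaround: rather than working out the frame of each record first, translate every record in all three forward frames and inspect which frame (if any) reads through without stop codons. This is plain Python using the standard genetic code; the function names are made up for illustration and are not part of transeq or any library. It assumes the sequences are on the coding strand and treats any codon containing a non-ACGT character (for example an IUPAC ambiguity code at the SNP position) as `X`. If the flanking sequence is genomic and includes introns, none of the frames may give a meaningful peptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def translate(seq, offset=0):
    """Translate seq starting at offset (0, 1 or 2); unknown codons become 'X'."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(offset, len(seq) - 2, 3)
    )


def three_frames(seq):
    """Return {frame_number: peptide} for forward frames 1, 2 and 3."""
    return {offset + 1: translate(seq, offset) for offset in range(3)}


def frames_for_file(path):
    """Yield (header, {frame: peptide}) for every record in a FASTA file."""
    with open(path) as handle:
        for header, seq in read_fasta(handle):
            yield header, three_frames(seq)
```

Calling `frames_for_file("snps.fasta")` (a placeholder file name) and printing each result would give three candidate peptides per SNP record; a frame with no `*` in it is the most plausible coding frame. As far as I know, transeq also has a `-frame` option that accepts values such as `F` or `6` to translate several frames in one run, which may avoid splitting the file at all, but check its documentation before relying on that.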