Hello,
I was wondering if there is a convenient way to calculate the sequence coverage of a given protein from a list of peptides.
For example, I have this protein:
>sp|O00330|ODPX_HUMAN Pyruvate dehydrogenase protein X component, mitochondrial OS=Homo sapiens GN=PDHX PE=1 SV=3
MAASWRLGCDPRLLRYLVGFPGRRSVGLVKGALGWSVSRGANWRWFHSTQWLRGDPIKIL
MPSLSPTMEEGNIVKWLKKEGEAVSAGDALCEIETDKAVVTLDASDDGILAKIVVEEGSK
NIRLGSLIGLIVEEGEDWKHVEIPKDVGPPPPVSKPSEPRPSPEPQISIPVKKEHIPGTL
RFRLSPAARNILEKHSLDASQGTATGPRGIFTKEDALKLVQLKQTGKITESRPTPAPTAT
PTAPSPLQATAGPSYPRPVIPPVSTPGQPNAVGTFTEIPASNIRRVIAKRLTESKSTVPH
AYATADCDLGAVLKVRQDLVKDDIKVSVNDFIIKAAAVTLKQMPDVNVSWDGEGPKQLPF
IDISVAVATDKGLLTPIIKDAAAKGIQEIADSVKALSKKARDGKLLPEEYQGGSFSISNL
GMFGIDEFTAVINPPQACILAVGRFRPVLKLTEDEEGNAKLQQRQLITVTMSSDSRVVDD
ELATRFLKSFKANLENPIRLA
And the following peptides:
HSLDASQGTATGPR
STVPHAYATADCDLGAVLK
VVDDELATR
Can R report the start and end position of each peptide within the protein of interest and write them to a new file?
Is it also possible to get the FASTA sequence directly from UniProt?
I need to do this for many proteins and peptides, so I can't do it manually.
Thanks a lot!
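One possible approach in base R, as a rough sketch rather than a definitive solution: exact string matching with gregexpr() gives the start of each peptide, and the end follows from the peptide length. The helper names (locate_peptides, sequence_coverage, fetch_uniprot_sequence) and the output file name are made up for illustration, and the UniProt REST URL pattern is assumed to remain available in this form.

```r
# Protein sequence from the post (O00330, ODPX_HUMAN), FASTA lines joined.
protein <- paste0(
  "MAASWRLGCDPRLLRYLVGFPGRRSVGLVKGALGWSVSRGANWRWFHSTQWLRGDPIKIL",
  "MPSLSPTMEEGNIVKWLKKEGEAVSAGDALCEIETDKAVVTLDASDDGILAKIVVEEGSK",
  "NIRLGSLIGLIVEEGEDWKHVEIPKDVGPPPPVSKPSEPRPSPEPQISIPVKKEHIPGTL",
  "RFRLSPAARNILEKHSLDASQGTATGPRGIFTKEDALKLVQLKQTGKITESRPTPAPTAT",
  "PTAPSPLQATAGPSYPRPVIPPVSTPGQPNAVGTFTEIPASNIRRVIAKRLTESKSTVPH",
  "AYATADCDLGAVLKVRQDLVKDDIKVSVNDFIIKAAAVTLKQMPDVNVSWDGEGPKQLPF",
  "IDISVAVATDKGLLTPIIKDAAAKGIQEIADSVKALSKKARDGKLLPEEYQGGSFSISNL",
  "GMFGIDEFTAVINPPQACILAVGRFRPVLKLTEDEEGNAKLQQRQLITVTMSSDSRVVDD",
  "ELATRFLKSFKANLENPIRLA"
)
peptides <- c("HSLDASQGTATGPR", "STVPHAYATADCDLGAVLK", "VVDDELATR")

# Hypothetical helper: one row per exact match, NA if a peptide is not found.
# gregexpr() reports non-overlapping matches only.
locate_peptides <- function(protein, peptides) {
  hits <- lapply(peptides, function(pep) {
    m <- gregexpr(pep, protein, fixed = TRUE)[[1]]
    if (m[1] == -1) {
      return(data.frame(peptide = pep, start = NA_integer_,
                        end = NA_integer_, stringsAsFactors = FALSE))
    }
    starts <- as.integer(m)
    data.frame(peptide = pep, start = starts,
               end = starts + nchar(pep) - 1L, stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)
}

# Hypothetical helper: percentage of residues covered by at least one peptide.
sequence_coverage <- function(protein, positions) {
  covered <- logical(nchar(protein))
  for (i in which(!is.na(positions$start))) {
    covered[positions$start[i]:positions$end[i]] <- TRUE
  }
  100 * mean(covered)
}

# Hypothetical helper: download a FASTA entry from UniProt and drop the header.
# Assumes the REST URL pattern below; needs network access, so it is not called here.
fetch_uniprot_sequence <- function(accession) {
  url <- sprintf("https://rest.uniprot.org/uniprotkb/%s.fasta", accession)
  lines <- readLines(url)
  paste(lines[!startsWith(lines, ">")], collapse = "")
}
# protein <- fetch_uniprot_sequence("O00330")

positions <- locate_peptides(protein, peptides)
print(positions)
cat(sprintf("Coverage: %.1f%%\n", sequence_coverage(protein, positions)))

# Write the start/end table to a tab-separated file (file name is arbitrary).
write.table(positions, "peptide_positions.tsv", sep = "\t",
            quote = FALSE, row.names = FALSE)
```

For many proteins, the same two functions could be applied in a loop or with lapply() over a list of accessions and their peptide sets. Positions are 1-based and counted on the full sequence, including any signal or transit peptide.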
Hello,
I have 30,000 peptide sequences from a human brain sample. I want to compare them with the peptide sequences in the PRIDE database to see whether any of my peptides are novel. I am facing two problems.
Firstly, I have to download the PRIDE dataset to a Linux server because my computer cannot handle downloads of this size. Secondly, how can I compare the sequences in R? Please give me some suggestions.
Shanzida.
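For the comparison step, a minimal sketch in base R, assuming both peptide sets can be reduced to plain character vectors (for example one sequence per line in a text file). The file names in the comments and the toy reference vector are placeholders, not real PRIDE data, and the actual PRIDE export format would need its own parsing step.

```r
# Normalise sequences so that case and stray whitespace do not cause false "novel" calls.
normalise <- function(x) unique(toupper(trimws(x)))

# In practice these would be read from files, e.g. (hypothetical file names):
# my_peptides        <- normalise(readLines("my_peptides.txt"))
# reference_peptides <- normalise(readLines("pride_peptides.txt"))

# Toy stand-ins so the example runs on its own.
my_peptides <- normalise(c("HSLDASQGTATGPR", "STVPHAYATADCDLGAVLK", "VVDDELATR"))
reference_peptides <- normalise(c("hsldasqgtatgpr ", "VVDDELATR"))

# A peptide is treated as novel if it has no exact match in the reference set.
is_novel <- !(my_peptides %in% reference_peptides)
novel <- my_peptides[is_novel]

print(data.frame(peptide = my_peptides, novel = is_novel))
```

With the toy vectors above, only STVPHAYATADCDLGAVLK is flagged as novel. Exact matching is the simplest definition of "novel"; peptides that differ from a reference entry only by a missed cleavage or a modification would also be flagged, so the result is best read as a candidate list.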