Best practices for analysing protein (amino acid) sequences?
5.3 years ago
vmax • 0

My background is in RNA-seq, but I’m now starting to work with protein (amino acid) sequences. There is a lot of literature on best practices and quality checks for every stage of RNA-seq analysis (from raw reads through to, say, differential expression analysis or variant calling).

I’m having trouble finding similar literature for working with protein sequences. Things I’m thinking about are:
1) If I need to filter redundant sequences, how do different sequence-similarity thresholds affect my results?
2) How do I check the accuracy of a multiple sequence alignment?
3) What about accuracy when I align to a structure?
4) Do gaps (insertions/deletions) need any special consideration?
5) Are there other things I’m not aware of?

Does anyone have resources that would help answer these questions?

In case it matters, I’m aligning multiple sequences to a structure, clustering, and evaluating point mutations.

alignment sequence • 918 views

Can you clarify what you want to achieve? What do you call a structure? What do you mean by evaluating point mutations?
1- Whether to filter out proteins with redundant sequences depends on what you want to do. For evaluating amino-acid variability, you may want to keep all sequences, but it really depends on what your data set is and what your goal is.
2- For accuracy of MSAs, check papers that compare MSA methods/tools, like those mentioned in this post.
3- What's a structure in this case?
4- Are there reasons in the context you're working in to consider insertions/deletions? Where do your sequences come from?
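
To make points 1 and 4 a bit more concrete, here is a minimal Python sketch, assuming the sequences are already aligned to equal length with `-` as the gap character. It is not taken from any particular tool: the function names (`pairwise_identity`, `filter_redundant`, `column_stats`) and the toy alignment are hypothetical and only for illustration. It shows how the number of sequences that survive a greedy redundancy filter changes with the identity threshold, and how to get a per-column gap fraction and Shannon entropy as a rough measure of amino-acid variability.

```python
from collections import Counter
from math import log2

GAP = "-"  # assumed gap character

def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == GAP or y == GAP:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0

def filter_redundant(aligned, threshold):
    """Greedy filter: keep a sequence only if its identity to every
    already-kept sequence is below `threshold`. Order-dependent."""
    kept = []
    for name, seq in aligned:
        if all(pairwise_identity(seq, k) < threshold for _, k in kept):
            kept.append((name, seq))
    return kept

def column_stats(aligned):
    """Per column: (gap fraction, Shannon entropy in bits of the non-gap residues)."""
    stats = []
    for col in zip(*(seq for _, seq in aligned)):
        gap_fraction = col.count(GAP) / len(col)
        counts = Counter(c for c in col if c != GAP)
        total = sum(counts.values())
        entropy = 0.0
        for n in counts.values():
            p = n / total
            entropy -= p * log2(p)
        stats.append((gap_fraction, entropy))
    return stats

# Toy alignment, made up for illustration only.
toy = [
    ("seq1", "MKT-AYIA"),
    ("seq2", "MKT-AYIA"),
    ("seq3", "MKTGAYLA"),
    ("seq4", "MRS-PYLG"),
]

if __name__ == "__main__":
    for threshold in (0.8, 0.9):
        kept = filter_redundant(toy, threshold)
        print(threshold, [name for name, _ in kept])
    for i, (gap_fraction, entropy) in enumerate(column_stats(toy)):
        print(i, round(gap_fraction, 2), round(entropy, 2))
```

Rerunning the analysis at a few thresholds and checking whether the conclusions change is one simple way to see how sensitive the results are to the redundancy cutoff. The greedy filter depends on input order and compares all pairs, so on a real data set a dedicated clustering tool would be the more usual choice.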


Jean-Karim Heriche has pointed you in the right direction, but you're asking a lot in a single question. Each of your points could be a question in its own right. You may want to narrow things down to the one(s) giving you the most trouble, so that you get more attention from high-throughput sequencing folks, people with experience in alignment, and structural biologists where applicable.
