Redundancy Removal
1
0
Entering edit mode
13.4 years ago

I have a matrix of the pairwise percent identity between my sequences. Is there a simple way, using linear algebra, to apply a redundancy threshold (say, 95%) to it?

similarity • 2.4k views

What do you mean by "set a redundancy threshold"? It seems like the matrix you are describing is probably populated with a percentage, so I'm not sure what linear algebra has to do with things. Do you mean that you want a vectorized way of removing redundant sequences from some list?

Have you really got a matrix (AA) and not a vector (A1)?

This question makes no sense, try again!

13.3 years ago
scapella ▴ 390

Hi, I don't know how to do it with "linear algebra", but you can do it another way. For instance, you can use the CD-HIT algorithm, the one used for reducing redundancy in the NCBI Non-Redundant database, or, if your sequences are aligned, you could try trimAl [http://trimal.cgenomics.org], where you set the redundancy threshold and the program applies a clustering algorithm to remove redundant sequences.
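
If the percent-identity matrix from the question is already in hand, a greedy filter can also be applied to it directly. The following is only a minimal sketch in Python with NumPy, not what CD-HIT or trimAl do internally. The function name `remove_redundant`, the toy matrix, and the choice to keep whichever sequence comes first are all assumptions made for illustration.

```python
import numpy as np

def remove_redundant(pid, threshold=95.0):
    """Greedy filter on a square percent-identity matrix.

    Hypothetical helper: walks the sequences in their given order and keeps
    a sequence only if its identity to every already-kept sequence is
    below `threshold`. Returns the indices of the kept sequences.
    """
    pid = np.asarray(pid, dtype=float)
    keep = []
    for i in range(pid.shape[0]):
        # Compare sequence i against the representatives kept so far.
        if not keep or np.all(pid[i, keep] < threshold):
            keep.append(i)
    return keep

# Toy 4x4 matrix (made-up values): sequences 0/1 and 2/3 are near-duplicates.
pid = np.array([
    [100.0,  97.0,  60.0,  58.0],
    [ 97.0, 100.0,  62.0,  59.0],
    [ 60.0,  62.0, 100.0,  96.0],
    [ 58.0,  59.0,  96.0, 100.0],
])

print(remove_redundant(pid, threshold=95.0))  # [0, 2]
```

The result depends on the input order, so it may be worth sorting the sequences first (for example, longest first) so that the preferred member of each group is the one retained.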

Fishing in a soup of technical terms?
