Dear all,
I'm a master's candidate interested in machine learning for gene prediction. I noticed that most papers use dimers (2 amino acids) as a key feature when training on positive and negative data sets for gene prediction. However, I don't understand why dimers are the only or the best option. Could anyone help?
Thanks in advance!