2.1 years ago
Filago
Hello everybody!
I am interested in training a gkm-SVM model to predict the impact of regulatory sequences. Across publications, the choice I see most often for building the positive training set is ±150 bp around a DNase-seq peak.
However, I have the following question:
What if two peaks are so close to each other that extending each by 150 bp "fuses" them into a single sequence longer than 300 bp? Can I keep those rare cases, or does gkm-SVM require a fixed sequence length to be reliable?
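To make the case concrete, here is a minimal Python sketch of how such fused regions arise. The function name `extend_and_merge`, the peak positions, and the half-open coordinate convention are my own assumptions for illustration and are not part of any gkm-SVM tooling.

```python
def extend_and_merge(peaks, flank=150):
    """Extend each peak position by +/- flank bp and merge touching windows.

    peaks: list of (chrom, position) tuples (hypothetical input format).
    Returns a list of (chrom, start, end) tuples, half-open coordinates.
    """
    # Build the +/- flank windows, clipping at 0, and sort by chrom and start.
    windows = sorted((chrom, max(0, pos - flank), pos + flank)
                     for chrom, pos in peaks)
    merged = []
    for chrom, start, end in windows:
        prev = merged[-1] if merged else None
        # Overlapping or directly adjacent windows on the same chromosome fuse.
        if prev and prev[0] == chrom and start <= prev[2]:
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up positions: the first two peaks are only 200 bp apart.
peaks = [("chr1", 1000), ("chr1", 1200), ("chr1", 5000)]
regions = extend_and_merge(peaks)
lengths = [end - start for _, start, end in regions]
print(regions)
print(lengths)
```

In this toy example the first two peaks fuse into one 500 bp region, while the isolated peak keeps the usual 300 bp.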
Best
Andreas