Welcome to SpotONE: hot SPOTs ON protein complexes with Extremely randomized trees via sequence-only features. This webserver requires only a protein sequence (in FASTA format) as input and returns an in-silico prediction that classifies every amino-acid residue as a hot-spot (HS) or a non-hot-spot (NS). Please cite A.J. Preto and Irina S. Moreira 2020.
The above plot displays the number of examples available for each class (HS and NS) in the dataset.
The above plot displays the number of amino acids in each of four quartiles of the sequence length. The amino acids are labelled according to their class (HS or NS), so the abundance of each class can be analysed by relative position in the sequence.
The above plot displays the proportion of hot-spots (HS) and null-spots (NS, i.e. non-hot-spots) per amino-acid residue type.
The above plots display several amino-acid characteristics split by class (HS vs NS). The values for the amino-acid characteristics were retrieved from the Biological Magnetic Resonance Data Bank (BMRB), DOI: 10.1093/nar/gkm957.
SpotONE is a new Machine-Learning (ML) predictor that classifies protein Hot-Spots (HS) using sequence-only features. The algorithm achieves an accuracy, AUROC, precision, recall and F1-score of 0.82, 0.83, 0.91, 0.82 and 0.85, respectively, on an independent test set. It is deployed as a free-to-use webserver at http://moreiralab.com/resources/spotone and only requires the user to submit a FASTA file with one or more protein sequences.
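For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, precision, recall and F1-score can be computed from per-residue HS/NS labels. It is not the SpotONE evaluation code: the function name `classification_metrics` and the toy labels are hypothetical, and HS is assumed to be the positive class (the text does not state how the reported values were averaged). AUROC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive="HS"):
    """Return accuracy, precision, recall and F1 for binary HS/NS labels.

    Assumption: `positive` is the class treated as the positive label.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy, made-up labels for eight residues (illustration only, not real data).
y_true = ["HS", "HS", "NS", "NS", "HS", "NS", "NS", "NS"]
y_pred = ["HS", "NS", "NS", "NS", "HS", "HS", "NS", "NS"]
metrics = classification_metrics(y_true, y_pred)
```

On these toy labels there are 2 true positives, 1 false positive, 1 false negative and 4 true negatives, so accuracy is 6/8 and precision, recall and F1 are each 2/3.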