A numeric value is assigned to each amino acid, using the amino acid scale outlined in the paper:
Prediction of protein antigenic determinants from amino acid sequences, T Hopp, K Woods, PNAS 1981
The user provides a protein sequence in single-letter amino acid code and specifies a window size (the length of the peptide) and an edge weight (default ). For each amino acid in the window, the program computes a weight using the linear variation model and applies it to that amino acid's original score. The final hydrophilicity score for the peptide is the sum of the weighted amino acid scores divided by the sum of the weights. The program repeats this process for each window along the protein sequence.
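The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the names `HOPP_WOODS`, `window_scores`, `rank_peptides`, and the `weights` parameter are hypothetical, and the sketch assumes each window is identified by its start index. The scale values are those published by Hopp and Woods (1981).

```python
# Hopp-Woods hydrophilicity scale (Hopp & Woods, 1981), single-letter codes.
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def window_scores(sequence, window, weights=None):
    """Return (start_index, peptide, score) for every window along the sequence.

    `weights` (hypothetical parameter) holds one weight per window position.
    When omitted, every position gets weight 1.0, i.e. a plain average.
    """
    if weights is None:
        weights = [1.0] * window
    if len(weights) != window:
        raise ValueError("need exactly one weight per window position")
    total_weight = sum(weights)
    results = []
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        # Sum of weighted residue scores, divided by the sum of the weights.
        weighted_sum = sum(w * HOPP_WOODS[aa] for w, aa in zip(weights, peptide))
        results.append((start, peptide, weighted_sum / total_weight))
    return results


def rank_peptides(results):
    """Sort windows from most to least hydrophilic."""
    return sorted(results, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    for start, peptide, score in rank_peptides(window_scores("KDEAL", 3)):
        print(start, peptide, round(score, 3))
```

Passing the weights in as a plain list keeps the scoring step independent of how the weights are generated, so the same function covers both the weighted and the non-weighted case.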
Given:
Rank items in set S from high to low
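where:
$S$: set of weighted or non-weighted hydrophilicity scores $S_i$, one per peptide window
$N$: number of amino acids in the protein
$i$: residue index position on the protein (starting from 0)
$k$: size of the peptide "window"
$h(X_i)$: Hopp-Woods hydrophilicity value of amino acid $X$ at index position $i$
$w_j$: weight used at each position $j$ of the window. Weights are calculated using the linear variation model (see below)
- When no weights are used:
$$S_i = \frac{1}{k} \sum_{j=0}^{k-1} h(X_{i+j})$$
- When using weights from the linear variation model (specify the edge weight):
$$S_i = \frac{\sum_{j=0}^{k-1} w_j \, h(X_{i+j})}{\sum_{j=0}^{k-1} w_j}$$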
1) When the peptide window is an odd number:
For example, if window=7 (7-mer peptide) and the edge weight is 0.1, then the first and the last weights will be 0.1. The weights for the amino acids in the 7-mer are linearly spaced from the edge weight up to 1.0 at the center and back down:
[0.1, 0.4, 0.7, 1.0, 0.7, 0.4, 0.1]
2) When the peptide window is an even number (new feature not available in Expasy):
For example, if window=10 and the edge weight is 0.1, then the first and the last weights will be 0.1. The two central positions both receive a weight of 1.0, and the weights for the 10-mer (rounded to two decimals) are linearly spaced as:
[0.1, 0.33, 0.55, 0.78, 1.0, 1.0, 0.78, 0.55, 0.33, 0.1]
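The two weight examples above can be reproduced with a short sketch like the one below. The function name `linear_variation_weights` is hypothetical, and the treatment of windows of size 1 or 2 (all weights 1.0) is an assumption, since the examples only cover larger windows. The edge weight is a required argument here because the sketch makes no assumption about the program's default.

```python
def linear_variation_weights(window, edge):
    """Weights rising linearly from `edge` at both ends to 1.0 at the center.

    Odd windows have a single central weight of 1.0; even windows have two.
    Assumption: windows of size 1 or 2 receive weights of 1.0 only.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = (window + 1) // 2  # number of positions from one edge to the center
    if half == 1:
        rising = [1.0]
    else:
        # Linear interpolation between `edge` (j = 0) and 1.0 (j = half - 1).
        rising = [edge + (1.0 - edge) * j / (half - 1) for j in range(half)]
    # Odd window: mirror without repeating the center. Even window: mirror all.
    falling = rising[-2::-1] if window % 2 else rising[::-1]
    return rising + falling


if __name__ == "__main__":
    print([round(w, 3) for w in linear_variation_weights(7, 0.1)])
    print([round(w, 3) for w in linear_variation_weights(10, 0.1)])
```

Before rounding, the second and fourth weights of the 10-mer are 0.325 and 0.775, which is why two-decimal output can differ in the last digit depending on the rounding mode.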
See the Jupyter notebook
Expasy results were obtained from ProtScale