Type Parameters: T - type of the token
public class SimonWhite<T> extends Object implements ListMetric<T>
similarity(a,b) = 2 * |(a ∩ b)| / (|a| + |b|)
The ∩ operation takes the list intersection of a and b. This is a list c such that each element in c has a 1-to-1 relation to an element in both a and b. E.g. the list intersection of [ab,ab,ab,ac] and [ab,ab,ad] is [ab,ab].
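The list intersection and the formula above can be sketched as follows. This is a minimal illustration and not the source of this class: the class name `ListIntersectionSketch`, its method names, and the choice to score two empty lists as 1.0 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and the empty-list convention are assumptions.
final class ListIntersectionSketch {

    // Returns the elements of a that can be paired 1-to-1 with an equal element of b.
    static <T> List<T> listIntersection(List<T> a, List<T> b) {
        // Count how often each element occurs in b.
        Map<T, Integer> remaining = new HashMap<>();
        for (T token : b) {
            remaining.merge(token, 1, Integer::sum);
        }
        // Each element of a consumes at most one unused occurrence from b.
        List<T> result = new ArrayList<>();
        for (T token : a) {
            Integer count = remaining.get(token);
            if (count != null && count > 0) {
                result.add(token);
                remaining.put(token, count - 1);
            }
        }
        return result;
    }

    // similarity(a,b) = 2 * |(a ∩ b)| / (|a| + |b|)
    static <T> float similarity(List<T> a, List<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0f; // assumption: two empty lists count as identical
        }
        return 2.0f * listIntersection(a, b).size() / (a.size() + b.size());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("ab", "ab", "ab", "ac");
        List<String> b = Arrays.asList("ab", "ab", "ad");
        System.out.println(listIntersection(a, b)); // [ab, ab]
        System.out.println(similarity(a, b));       // 2 * 2 / (4 + 3)
    }
}
```

Counting occurrences in a map keeps the pairing 1-to-1: the third `ab` in the first list finds no unused partner in the second list and is left out.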
This metric is very similar to Dice's coefficient. However, Simon White used the list intersection rather than the set intersection to prevent a list of duplicates from scoring a perfect match against a list with single elements. E.g. 'GGGGG' should not be identical to 'GG'.
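The difference between the two intersections can be checked with a short sketch. It assumes the strings are tokenized into character bigrams, so 'GGGGG' becomes [GG,GG,GG,GG] and 'GG' becomes [GG]; the tokenization, the class name `DiceVersusListSketch`, and both method names are assumptions for illustration, and neither method is the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; both inputs are assumed to be non-empty.
final class DiceVersusListSketch {

    // Dice's coefficient on sets: duplicates collapse before comparison.
    static <T> float setBased(List<T> a, List<T> b) {
        Set<T> setA = new HashSet<>(a);
        Set<T> setB = new HashSet<>(b);
        Set<T> common = new HashSet<>(setA);
        common.retainAll(setB);
        return 2.0f * common.size() / (setA.size() + setB.size());
    }

    // Same formula with the list intersection: each element of b is matched at most once.
    static <T> float listBased(List<T> a, List<T> b) {
        List<T> unmatched = new ArrayList<>(b);
        int matches = 0;
        for (T token : a) {
            if (unmatched.remove(token)) {
                matches++;
            }
        }
        return 2.0f * matches / (a.size() + b.size());
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("GG", "GG", "GG", "GG"); // bigrams of 'GGGGG'
        List<String> b = Arrays.asList("GG");                   // bigrams of 'GG'
        System.out.println(setBased(a, b));  // 1.0
        System.out.println(listBased(a, b)); // 0.4
    }
}
```

With sets, both inputs collapse to {GG} and score a perfect 1.0. With lists, only one pair can be matched, giving 2 * 1 / (4 + 1) = 0.4.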
This class is immutable and thread-safe.
See Also: DiceSimilarity

| Constructor and Description |
|---|
| SimonWhite() |

| Modifier and Type | Method and Description |
|---|---|
| float | compare(List<T> a, List<T> b) Measures the similarity between lists a and b. |
| String | toString() |

public float compare(List<T> a, List<T> b)
Specified by: compare in interface ListMetric<T>
Copyright © 2014–2018. All rights reserved.