텍스트 싱크로 율 계산-Levenshtein
3931 단어 텍스트 싱크로 율
import java.util.BitSet;
public class Distance {
public static void main(String[] args) {
Distance distance = new Distance() ;
int i = distance.LD("gttttl", "gambol") ;
System.out.println(i);
}
// ****************************
// Get minimum of three values
// ****************************
private int Minimum(int a, int b, int c) {
int mi;
mi = a;
if (b < mi) {
mi = b;
}
if (c < mi) {
mi = c;
}
return mi;
}
// *****************************
// Compute Levenshtein distance
// *****************************
public int LD(String s, String t) {
//
int d[][]; // matrix
//s
int n; // length of s
//t
int m; // length of t
//s
int i; // iterates through s
//t
int j; // iterates through t
//s char
char s_i; // ith character of s
//t char
char t_j; // jth character of t
//
int cost; // cost
// Step 1
n = s.length();
m = t.length();
// n 0 . m
if (n == 0) {
return m;
}
//
if (m == 0) {
return n;
}
d = new int[n + 1][m + 1];
// Step 2 .
for (i = 0; i <= n; i++) {
d[i][0] = i;
}
for (j = 0; j <= m; j++) {
d[0][j] = j;
}
// Step 3
for (i = 1; i <= n; i++) {
s_i = s.charAt(i - 1);
// Step 4
// i t
for (j = 1; j <= m; j++) {
t_j = t.charAt(j - 1);
// Step 5
if (s_i == t_j) {
cost = 0;
} else {
cost = 1;
}
// Step 6
//
d[i][j] = Minimum(d[i - 1][j] + 1, d[i][j - 1] + 1,
d[i - 1][j - 1] + cost);
}
}
// Step 7
//
return d[n][m];
}
}
주석 을 달 았 는데.........................................................
This section shows how the Levenshtein distance is computed when the source string is "GUMBO" and the target string is "GAMBOL".
Steps 1 and 2
G U M B O
0 1 2 3 4 5
G 1
A 2
M 3
B 4
O 5
L 6
Steps 3 to 6 When i = 1
G U M B O
0 1 2 3 4 5
G 1 0
A 2 1
M 3 2
B 4 3
O 5 4
L 6 5
Steps 3 to 6 When i = 2
G U M B O
0 1 2 3 4 5
G 1 0 1
A 2 1 1
M 3 2 2
B 4 3 3
O 5 4 4
L 6 5 5
Steps 3 to 6 When i = 3
G U M B O
0 1 2 3 4 5
G 1 0 1 2
A 2 1 1 2
M 3 2 2 1
B 4 3 3 2
O 5 4 4 3
L 6 5 5 4
Steps 3 to 6 When i = 4
G U M B O
0 1 2 3 4 5
G 1 0 1 2 3
A 2 1 1 2 3
M 3 2 2 1 2
B 4 3 3 2 1
O 5 4 4 3 2
L 6 5 5 4 3
Steps 3 to 6 When i = 5
G U M B O
0 1 2 3 4 5
G 1 0 1 2 3 4
A 2 1 1 2 3 4
M 3 2 2 1 2 3
B 4 3 3 2 1 2
O 5 4 4 3 2 1
L 6 5 5 4 3 2
Step 7
The distance is in the lower right hand corner of the matrix, i.e. 2. This corresponds to our intuitive realization that "GUMBO" can be transformed into "GAMBOL" by substituting "A" for "U" and adding "L" (one substitution and 1 insertion = 2 changes).