I just implemented a best match file search algorithm to find the closest match to a string in a dictionary. After profiling my code, I found out that the overwhelming majority of time is spent calculating the distance between the query and the possible results. I am currently implementing the algorithm to calculate the Levenshtein Distance using a 2-D array, which makes the implementation an O(n^2) operation. I was hoping someone could suggest a faster way of doing the same.
Here's my implementation:
public int calculate(String root, String query)
{
int arr[][] = new int[root.length() + 2][query.length() + 2];
for (int i = 2; i < root.length() + 2; i++)
{
arr[i][0] = (int) root.charAt(i - 2);
arr[i][1] = (i - 1);
}
for (int i = 2; i < query.length() + 2; i++)
{
arr[0][i] = (int) query.charAt(i - 2);
arr[1][i] = (i - 1);
}
for (int i = 2; i < root.length() + 2; i++)
{
for (int j = 2; j < query.length() + 2; j++)
{
int diff = 0;
if (arr[0][j] != arr[i][0])
{
diff = 1;
}
arr[i][j] = min((arr[i - 1][j] + 1), (arr[i][j - 1] + 1), (arr[i - 1][j - 1] + diff));
}
}
return arr[root.length() + 1][query.length() + 1];
}
public int min(int n1, int n2, int n3)
{
return (int) Math.min(n1, Math.min(n2, n3));
}
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…