Approximate string search, also called fuzzy string search, can be a difficult task if you want a very sophisticated matching algorithm. So today I will show you a quite simple fuzzy string search and use it to create a small, efficient search application.
The CompareStrings method is what we use to check a word against our search term. It gives us back a value for the difference (the threshold) between the search term and the word we are checking. It does this by first adding one for each character of difference in length between the two strings. It then compares the strings character by character, position by position, optionally ignoring case, and adds one to the threshold for every character that differs. After all that we return the threshold, and from there we can check it to see if the word is something we want to keep and do something with. So before we do anything else, let's take a look at the CompareStrings function / method.
function CompareStrings($str1, $str2, $caseInsensitive = false) {
    // Start with the difference in length: one point per extra character.
    $threshold = abs(strlen($str1) - strlen($str2));

    // Only the positions that exist in both strings can be compared.
    $shared = min(strlen($str1), strlen($str2));

    for($i = 0; $i < $shared; $i++) {
        // Square brackets are used for character access; the older
        // $str{$i} syntax no longer works in PHP 8.
        $char1 = $caseInsensitive ? strtolower($str1[$i]) : $str1[$i];
        $char2 = $caseInsensitive ? strtolower($str2[$i]) : $str2[$i];

        // One point for every position where the characters differ.
        if($char1 !== $char2) {
            $threshold++;
        }
    }

    return $threshold;
}
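To get a feel for the numbers this produces, here is a small sketch that scores a few word pairs. It carries a condensed copy of CompareStrings so it runs on its own, and the sample words are just illustrations, not part of the original code.

```php
<?php
// Condensed copy of CompareStrings so this snippet is self-contained.
function CompareStrings($str1, $str2, $caseInsensitive = false) {
    if ($caseInsensitive) {
        $str1 = strtolower($str1);
        $str2 = strtolower($str2);
    }
    // Length difference plus one point per differing position.
    $threshold = abs(strlen($str1) - strlen($str2));
    $shared = min(strlen($str1), strlen($str2));
    for ($i = 0; $i < $shared; $i++) {
        if ($str1[$i] !== $str2[$i]) {
            $threshold++;
        }
    }
    return $threshold;
}

echo CompareStrings("search", "search"), "\n";       // 0: identical
echo CompareStrings("search", "saerch"), "\n";       // 2: two swapped letters
echo CompareStrings("search", "searching"), "\n";    // 3: three extra characters
echo CompareStrings("Search", "search"), "\n";       // 1: case counts by default
echo CompareStrings("Search", "search", true), "\n"; // 0: case ignored
echo CompareStrings("hello", "ello"), "\n";          // 4: one dropped letter shifts the rest
```

The last line shows the main weakness of a position-by-position comparison: a single missing letter at the start of a word is scored almost as badly as a completely different word.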
So you can now see how it checks words. It's not really difficult, but it could also be much more intelligent: because characters are compared by position, a single missing letter at the start of a word shifts every later character and makes them all count as differences. So now let's start with our search method, “PreformSearch”. This will be a simple method that checks the threshold of each word and, if it is within our acceptable limit, stores the word in an array. After we have gone through the entire string we return the array.
First let's set up the method and its arguments, as well as our array to store our matches.
function PreformSearch($search, $string, $threshold = 1, $caseInsensitive = false) {
    $matches = array();
}
OK, we have that. Now let's start adding some searching abilities to our new function / method.
We are going to check that the string to search is set and is longer than 0 characters.
    if(isset($string) && strlen($string) > 0) {

    }

    return $matches;
Now that that is out of the way, we can break up the string into words, loop through them and check for our matches.
        $words = explode(" ", $string);

        foreach($words as $word) {
            if(CompareStrings($search, $word, $caseInsensitive) <= $threshold) {
                $matches[] = $word;
            }
        }
There, that is all that is needed, so let's see the whole thing all together.
function PreformSearch($search, $string, $threshold = 1, $caseInsensitive = false) {
    $matches = array();

    // Only search when there is actually some text to look through.
    if(isset($string) && strlen($string) > 0) {
        // Break the text into words on single spaces.
        $words = explode(" ", $string);

        foreach($words as $word) {
            // Keep every word whose difference is within the limit.
            if(CompareStrings($search, $word, $caseInsensitive) <= $threshold) {
                $matches[] = $word;
            }
        }
    }

    return $matches;
}
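As a rough usage sketch, the snippet below runs a search over a short sentence. It assumes the standalone form of PreformSearch from the step-by-step version, with the text passed in as the `$string` argument, and it repeats condensed copies of both functions so it runs on its own. The sample sentence is made up for illustration.

```php
<?php
// Condensed copy of CompareStrings: length difference plus
// one point per differing position.
function CompareStrings($str1, $str2, $caseInsensitive = false) {
    if ($caseInsensitive) {
        $str1 = strtolower($str1);
        $str2 = strtolower($str2);
    }
    $threshold = abs(strlen($str1) - strlen($str2));
    $shared = min(strlen($str1), strlen($str2));
    for ($i = 0; $i < $shared; $i++) {
        if ($str1[$i] !== $str2[$i]) {
            $threshold++;
        }
    }
    return $threshold;
}

// Assumed standalone signature: the text to search is the second argument.
function PreformSearch($search, $string, $threshold = 1, $caseInsensitive = false) {
    $matches = array();
    if (isset($string) && strlen($string) > 0) {
        foreach (explode(" ", $string) as $word) {
            if (CompareStrings($search, $word, $caseInsensitive) <= $threshold) {
                $matches[] = $word;
            }
        }
    }
    return $matches;
}

$text = "I recieve and receive mail";

// "recieve" differs from "receive" in two positions, so a limit of 2 keeps both.
print_r(PreformSearch("receive", $text, 2)); // recieve, receive

// With the default limit of 1 only the exact spelling is kept.
print_r(PreformSearch("receive", $text));    // receive
```

Raising the threshold lets more misspellings through, but it also lets in more unrelated words, so it is worth trying a few values against your own data.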
And there it is. That is all that is needed for a very simple fuzzy string search using PHP.
So as you can see, the above functions give you a base from which to start a much more sophisticated search than what is seen here.
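One possible direction, sketched here as an assumption rather than as part of the original code, is to swap the position-by-position comparison for PHP's built-in levenshtein() function, which counts the insertions, deletions and substitutions needed to turn one string into another. The LevenshteinSearch name below is hypothetical.

```php
<?php
// Hypothetical helper, not part of the original code: same idea as
// PreformSearch, but scored with PHP's built-in levenshtein().
function LevenshteinSearch($search, $string, $threshold = 1, $caseInsensitive = false) {
    $matches = array();
    if ($caseInsensitive) {
        $search = strtolower($search);
    }
    foreach (explode(" ", $string) as $word) {
        // levenshtein() is case sensitive, so lower-case a copy when asked to.
        $candidate = $caseInsensitive ? strtolower($word) : $word;
        if (levenshtein($search, $candidate) <= $threshold) {
            $matches[] = $word; // keep the word as it appeared in the text
        }
    }
    return $matches;
}

// A dropped first letter costs a single edit here.
echo levenshtein("hello", "ello"), "\n"; // 1

print_r(LevenshteinSearch("hello", "ello hallo help world", 1)); // ello, hallo
```

The trade-off is speed: edit distance does more work per word than a simple positional check, which may matter on large texts.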
Below is a download of a FuzzySearch class, and a couple of tests using it.