Java – Optimizing a simple search algorithm

java, optimization, search

I have been playing around a bit with a fairly simple, home-made search engine, and I'm now twiddling with some relevancy sorting code.

It's not very pretty, but I'm not very good when it comes to clever algorithms, so I was hoping I could get some advice 🙂

Basically, I want each search result to get a score based on how many of its words match the search criteria: 3 points per exact word match and 1 point per partial match.

For example, if I search for "winter snow", these would be the results:

  • winter snow => 6 points
  • winter snowing => 4 points
  • winterland snow => 4 points
  • winter sun => 3 points
  • winterland snowing => 2 points

Here's the code:

    String[] resultWords = result.split(" ");
    String[] searchWords = searchStr.split(" ");
    int score = 0;
    for (String resultWord : resultWords) {
        for (String searchWord : searchWords) {
            if (resultWord.equalsIgnoreCase(searchWord))
                score += 3;
            else if (resultWord.toLowerCase().contains(searchWord.toLowerCase()))
                score++;
        }
    }

Best Solution

Your code seems OK to me. I suggest a few small changes:

Since you are going through all possible combinations, you can move the toLowerCase() calls to the start, so that each string is lowercased once instead of on every comparison.

Also, if an exact match has already occurred for a result word, you don't need to perform another equals() check for it.

    // Lowercase once up front instead of on every comparison.
    result = result.toLowerCase();
    searchStr = searchStr.toLowerCase();

    String[] resultWords = result.split(" ");
    String[] searchWords = searchStr.split(" ");
    int score = 0;
    for (String resultWord : resultWords) {
        // Set once this result word has matched a search word exactly.
        boolean exactMatch = false;
        for (String searchWord : searchWords) {
            if (!exactMatch && resultWord.equals(searchWord)) {
                exactMatch = true;
                score += 3; // exact match
            } else if (resultWord.contains(searchWord)) {
                score++; // partial match
            }
        }
    }
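To try the scoring rule against the examples from the question, it can be wrapped in a small class. This is only a minimal sketch: the class name RelevanceScorer and the methods score and rank are hypothetical, and splitting on runs of whitespace, using Locale.ROOT, and skipping empty search words are assumptions that go beyond the original code. It uses the plain "equals, else contains" rule from the question with the lowercasing hoisted out of the loops.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical wrapper class; names and signatures are illustrative only.
class RelevanceScorer {

    // 3 points per exact word match, 1 point per partial (substring) match.
    static int score(String result, String searchStr) {
        // Lowercase each string once, then split on runs of whitespace (assumption).
        String[] resultWords = result.toLowerCase(Locale.ROOT).trim().split("\\s+");
        String[] searchWords = searchStr.toLowerCase(Locale.ROOT).trim().split("\\s+");
        int score = 0;
        for (String resultWord : resultWords) {
            for (String searchWord : searchWords) {
                if (searchWord.isEmpty()) {
                    continue; // an empty search string would otherwise match every word
                }
                if (resultWord.equals(searchWord)) {
                    score += 3;
                } else if (resultWord.contains(searchWord)) {
                    score++;
                }
            }
        }
        return score;
    }

    // Returns a new list sorted by descending score. List.sort is stable,
    // so results with equal scores keep their input order.
    static List<String> rank(List<String> results, String searchStr) {
        List<String> sorted = new ArrayList<>(results);
        sorted.sort((a, b) -> Integer.compare(score(b, searchStr), score(a, searchStr)));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList(
                "winterland snowing", "winter sun", "winter snow",
                "winterland snow", "winter snowing");
        for (String r : rank(results, "winter snow")) {
            System.out.println(r + " => " + score(r, "winter snow") + " points");
        }
    }
}
```

Note that rank recomputes the score on every comparison, which is fine for a handful of results; for larger result sets it would probably be better to compute each score once and sort on the stored values.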

Of course, this is a very basic approach. If you are really interested in this area of computer science and want to learn more about implementing search engines, start with these terms: