About the Author

Nick Crum

Software Engineer

Conquering the Fuzziness of Search Relevancy

While working on the Broadleaf Search project, I found myself very intrigued by the idea of relevancy. I really wanted to understand how I could manipulate my site to give the most awesome and relevant search results possible. I kept finding myself in the same place: frustrated. I really wanted there to be a right answer, or a formula, or some sort of combination of boosts to predict exactly how search results would display. Turns out, there wasn't. I inevitably realized that it was my mindset that needed to change and that through patience, the right tools and being okay with a couple of weird queries, I could do some pretty cool and powerful stuff. Here’s the newly-created process and how I use Broadleaf Search to provide relevant search results.

If you want to follow along and do this yourself, sign up for your own demo site at http://www.broadleafcommerce.com/demo.

The Process

  1. Define a Goal
  2. Make a Guess
  3. Analyze
  4. Repeat

Define a Goal

The first part of my process is to define a goal. Why define a goal, you may ask? Well, if you go into the process of creating relevant search results without one, things could get messy. Your goal has to be simple and attainable too. Don't expect perfection. If you expect too much, you are bound to disappoint yourself, and you will end up wasting hours accomplishing nothing. So let's create a smart goal right now.

For our demo purposes, let’s say our website is called The Heat Clinic and we sell hot sauces. However, we are really not happy with our profits right now since we aren't selling enough of our higher margin sauces. In fact, we did some analysis and we have found that the products customers purchase the most show up first in their search results. This gives us the idea that we should boost our higher margin products to the top of our search results. So our goal is to rank products higher in search results based on their margin.

Before we set up boost rules for this search, let's see what type of results we get now. Let's go to the admin and click "Preview on Site". Once on the site, go ahead and search for "pepper."

[Screenshot: searching for "pepper" before any boost rules]

Broadleaf's preview environment also lets us view the score, or ranking, of the search results. If I click on "Show Search Score," I am able to see the search scores for each product overlaid on each image.

[Screenshot: searching for "pepper" with search scores shown]

We have a good idea of the current ordering of the search results for our search before adding any boost rules.

Make a Guess

Now that we have a goal, we need to go to the admin and set up a boost rule.

In the Broadleaf Admin, go to the left-hand nav and choose Relevancy Rules located under the Search module. Once you are there, go ahead and click Add Boost Rule.

[Screenshot: choosing Relevancy Rules in the Admin]

That was the easy part. Now we need to fill out this form. Remember our goal: we want to boost products by their margin, so the boost rule should be set up to fulfill that goal. Feel free to tinker and experiment with some of the other fields on the form; they are fairly self-explanatory and have help text to guide you. For this example, my form looks like the one below. Go ahead and fill out all the fields except Boost Amount.

[Screenshot: mostly filled out Boost Rule form]

Lastly, we need to fill out the hardest part of this form. What in the world do we put for Boost Amount? This is a hard question. I can give you a few tips, but this is where the tinkering will come in. Generally, you will want this number to be between 1.0 and 10.0. If you have multiple boost rules configured, setting the boost amount higher than 10 runs the risk of a single rule overwhelming the effects of other rules, especially if there is a high variance in values (e.g. one boost value at 2 and one boost value at 10). In general, the boost amounts for multiple rules should be relatively close to each other. For this example, we are going to make a guess, and set this value to 5.0.
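The guidelines above can be turned into a quick sanity check. This is a minimal sketch in Python, not part of Broadleaf: the function name `check_boost_amounts`, the rule names, and especially the `max_ratio` cutoff for "relatively close" are assumptions made for illustration.

```python
def check_boost_amounts(boosts, low=1.0, high=10.0, max_ratio=3.0):
    """Return a list of warnings for a {rule name: boost amount} mapping.

    low/high mirror the 1.0-10.0 guideline from the text. max_ratio is an
    assumed cutoff for "relatively close", not a Broadleaf setting.
    """
    warnings = []
    for name, amount in boosts.items():
        if not low <= amount <= high:
            warnings.append(f"{name}: {amount} is outside {low}-{high}")
    if len(boosts) > 1:
        smallest, largest = min(boosts.values()), max(boosts.values())
        # Only compare the extremes; skip the ratio if a boost is zero or negative.
        if smallest > 0 and largest / smallest > max_ratio:
            warnings.append(
                f"boosts range from {smallest} to {largest}; "
                "the largest rule may overwhelm the others"
            )
    return warnings


# One rule at our guessed value, then the uneven pair from the example (2 and 10).
single_rule = check_boost_amounts({"margin": 5.0})
uneven_rules = check_boost_amounts({"rule_a": 2.0, "rule_b": 10.0})
print(single_rule)
print(uneven_rules)
```

A single rule at 5.0 passes cleanly, while the 2-and-10 pair is flagged as too far apart under the assumed ratio.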


Analyze

Now that we have set a goal and made a guess for the best boost value, it is time to analyze our results. Let's go back to the preview and see how our search results have changed.

[Screenshot: searching after the first boost rule is added]

Let's look at the difference between our first search and our new one. For convenience, I've included the original order, new score, and the margin for each product.

If you want to learn more about how Lucene/Solr score documents, check out their docs at http://lucene.apache.org/core/3_6_0/scoring

Order  Product Name                                  Margin  Original Order  New Score
1      Cool Cayenne Pepper Hot Sauce                 0.33    1               1.826
2      Roasted Red Pepper & Chipotle Hot Sauce       0.37    4               1.638
3      Blazin' Saddle XXX Hot Habanero Pepper Sauce  0.25    2               1.107
4      Bull Snort Cowboy Cayenne Pepper Hot Sauce    0.23    3               1.018
5      Green Ghost                                   0.19    5               0.526
6      Day of the Dead Chipotle Hot Sauce            0.36    6               0.398
7      Day of the Dead Scotch Bonnet Hot Sauce       0.23    7               0.255

I've personally found there are two factors to consider when gauging whether we have the correct boost values set up:

  1. Correct Ordering - Are the results in the order we wanted them to be?
  2. Score Spread - Do the scores make sense, or is the gap between adjacent documents very high? Does this rule do too much?

Looking at the results, we see something interesting. Only one product has moved up: Roasted Red Pepper & Chipotle Hot Sauce. It moved from position #4 to position #2, which makes sense, since it has a higher margin than both Blazin' Saddle and Bull Snort. We also notice that even though it has the highest margin, it is still behind Cool Cayenne. The reason is that Cool Cayenne better matches the search for "pepper," since it likely has matches in both its name and its description.

How come Day of the Dead Chipotle Hot Sauce is still in position #6 even though it has the second highest margin? This is pretty easily explained by the fact that it is a poor match for the search for "pepper." The only place that "pepper" exists for Day of the Dead Chipotle Hot Sauce is deep within its long description.

Overall, from these results I can say that we are doing pretty well on the Correct Ordering factor. However, I think our Score Spread could be better. Right now the spread is really high (between Roasted Red Pepper and Blazin' Saddle it's ~0.5), so we should work to bring those scores closer together. We want the difference in scores of adjacent documents to be somewhere between 0.1 and 0.2 to avoid doing too much with this boost rule.
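To make the Score Spread check concrete, here is a small Python sketch that computes the gap between each pair of adjacent results using the New Score column from the table above. The helper names and the 0.2 limit as a hard cutoff are my own choices for illustration.

```python
# New scores from the table above, in ranked order.
scores = [
    ("Cool Cayenne Pepper Hot Sauce", 1.826),
    ("Roasted Red Pepper & Chipotle Hot Sauce", 1.638),
    ("Blazin' Saddle XXX Hot Habanero Pepper Sauce", 1.107),
    ("Bull Snort Cowboy Cayenne Pepper Hot Sauce", 1.018),
    ("Green Ghost", 0.526),
    ("Day of the Dead Chipotle Hot Sauce", 0.398),
    ("Day of the Dead Scotch Bonnet Hot Sauce", 0.255),
]


def adjacent_gaps(ranked):
    """Return (higher product, lower product, score gap) for each adjacent pair."""
    return [
        (upper[0], lower[0], round(upper[1] - lower[1], 3))
        for upper, lower in zip(ranked, ranked[1:])
    ]


def wide_gaps(ranked, limit=0.2):
    """Return only the adjacent pairs whose gap exceeds the limit."""
    return [gap for gap in adjacent_gaps(ranked) if gap[2] > limit]


gaps = adjacent_gaps(scores)
too_wide = wide_gaps(scores)
for upper, lower, gap in too_wide:
    print(f"{gap:.3f} between {upper} and {lower}")
```

On these scores, two pairs exceed 0.2: Roasted Red Pepper over Blazin' Saddle (0.531) and Bull Snort over Green Ghost (0.492).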


Repeat

Now that we know what we need to fix, we have a new goal: decrease the score spread caused by our margin boost. Let's go ahead and make a new guess for our boost value. In this case, I'm going to go into the admin and lower the boost value from 5.0 to 1.5. Let's check out what happens to the score spread when we do our search again.

[Screenshot: search results after adjusting the score spread down]

After we have adjusted our boost values, our score spread appears to have improved to an acceptable level between 0.1 and 0.2.
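The guess, analyze, repeat loop can be sketched in a few lines of Python. Everything here is illustrative: `toy_scores` is a made-up stand-in that multiplies invented text-match scores by `1 + boost * margin`, which is not how Solr or Broadleaf compute scores, and `tune_boost`, its step size, and its floor are hypothetical. In practice the "scores" come from looking at the preview after each change in the admin.

```python
def max_adjacent_gap(scores):
    """Largest score difference between neighbors once results are ranked."""
    ranked = sorted(scores, reverse=True)
    return max((a - b for a, b in zip(ranked, ranked[1:])), default=0.0)


def toy_scores(boost):
    """Stand-in for 'run the search and read the scores' (NOT real Solr scoring)."""
    # (invented text-match score, margin) pairs; the formula is an assumption.
    products = [(1.0, 0.33), (0.9, 0.37), (0.85, 0.25), (0.8, 0.23)]
    return [match * (1 + boost * margin) for match, margin in products]


def tune_boost(get_scores, start=5.0, step=0.5, floor=1.0, target_gap=0.2):
    """Lower the boost until the widest adjacent gap is acceptable or we hit the floor."""
    boost = start
    while boost > floor and max_adjacent_gap(get_scores(boost)) > target_gap:
        boost = max(floor, boost - step)  # make a new, smaller guess
    return boost


tuned = tune_boost(toy_scores)
print(tuned, round(max_adjacent_gap(toy_scores(tuned)), 3))
```

The point is the shape of the loop rather than the numbers: each pass makes a new guess, measures the spread, and stops once the spread is acceptable.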

Adding More Rules

Our first example was a relatively easy scenario, since we had only one rule in play. What if we decided to add another rule? Does this get more difficult? The answer is that it depends. As you add more and more rules, tuning can get more complex, but if you follow the process I've laid out, it should be relatively easy. Let's say that we want to promote a product line by boosting those products in our search results. For this example, we will be promoting Spice Exchange products, and we will keep our search keyword as "pepper," as well as our margin boost. Our goal is to promote our Spice Exchange products higher in search results, but not to overwhelm the margin boost.

So let's go to the admin and add a new boost rule. For this one, we are going to boost when the manufacturer matches "Spice Exchange", so we will choose to Boost by Matching Value. I am going to choose a boost value of 2.0 as my guess.

[Screenshot: Spice Exchange boost rule in the Admin]

Let's save this rule and then preview on site. Again, we will search for "pepper."

[Screenshot: Spice Exchange preview 1]

Let's take a look at these scores and the product information. For convenience, I've organized it in a table here:

Order  Product Name                                  Margin  Original Order  New Score
1      *Day of the Dead Chipotle Hot Sauce           0.36    6               0.311
2      Cool Cayenne Pepper Hot Sauce                 0.33    1               0.209
3      *Day of the Dead Scotch Bonnet Hot Sauce      0.23    7               0.199
4      Roasted Red Pepper & Chipotle Hot Sauce       0.37    2               0.187
5      Blazin’ Saddle XXX Hot Habanero Pepper Sauce  0.25    3               0.127
6      Bull Snort Cowboy Cayenne Pepper Hot Sauce    0.23    4               0.116
7      Green Ghost                                   0.19    5               0.060

* indicates products made by Spice Exchange

Let's analyze these results and see what's going on here. Now that we have added this boost rule on Spice Exchange products, the two Day of the Dead products sit far higher in the search results, at positions #1 and #3, where before they were at #6 and #7. Looking at our two factors above, I think we have a pretty good score spread this time, but our ordering is off. It doesn't make sense that a customer would search for "pepper" and get a first result without "pepper" in its name. Also, Roasted Red Pepper is a much better match than Day of the Dead Scotch Bonnet. From what I can see, our guess of 2.0 was a little high and we should adjust the Spice Exchange product boost down.
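The same two factors can be checked against the table for this search. The Python sketch below uses the table's values as-is; the tuple layout and variable names are just one way to organize them, and testing for "pepper" in the product name is a rough proxy for match quality rather than anything Solr does.

```python
# (product name, order before the rule, new score, made by Spice Exchange),
# listed in the new ranked order from the table above.
results = [
    ("Day of the Dead Chipotle Hot Sauce", 6, 0.311, True),
    ("Cool Cayenne Pepper Hot Sauce", 1, 0.209, False),
    ("Day of the Dead Scotch Bonnet Hot Sauce", 7, 0.199, True),
    ("Roasted Red Pepper & Chipotle Hot Sauce", 2, 0.187, False),
    ("Blazin' Saddle XXX Hot Habanero Pepper Sauce", 3, 0.127, False),
    ("Bull Snort Cowboy Cayenne Pepper Hot Sauce", 4, 0.116, False),
    ("Green Ghost", 5, 0.060, False),
]

# Score Spread: widest gap between adjacent results.
largest_gap = max(
    round(upper[2] - lower[2], 3) for upper, lower in zip(results, results[1:])
)

# Correct Ordering: how far did the promoted products move?
promoted_moves = [
    (name, before, after)
    for after, (name, before, _score, promoted) in enumerate(results, start=1)
    if promoted
]

# Rough proxy for a poor top match: the search keyword is missing from its name.
keyword_missing_at_top = "pepper" not in results[0][0].lower()

print(largest_gap)
print(promoted_moves)
print(keyword_missing_at_top)
```

The widest gap is 0.102, which fits the target spread, while the top result moved from #6 to #1 without "pepper" in its name, which is the ordering problem described above.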

Let's go to the admin and adjust that boost value from 2.0 down to 1.0, and then do our search again:

[Screenshot: Spice Exchange preview 2]

With our new boost value of 1.0, our Spice Exchange products are in positions #3 and #6, so this looks a lot better. I'm sure we could spend all day tweaking these boost values, but for this example, I think we have met our goals. We are now successfully promoting a product line and increasing profitability using only a couple of boost rules.

Closing Thoughts

Using these boost rules within Broadleaf Search, we were able to do some pretty powerful stuff. We were able to better promote a product line and increase profitability by boosting higher margin products. The best part is that, using the process I have laid out above, we were able to do this relatively quickly and without any lower-level customizations to Solr. All of this was done within the Broadleaf Admin, which makes it a very useful tool for merchandisers and marketers looking to influence search results. No more having to communicate with engineers and wait for a new code deployment. That said, if you really need more complex search behavior (e.g. completely customized boost functions or changing the query that Broadleaf sends to Solr), Broadleaf's Solr components are easy to extend and customize as you see fit.