Coveo for Sitecore: How to prioritize newer over stale migrated content
The Use Case
In a typical content migration you may carry over certain data to Sitecore that drives UI behavior, such as a Publication Date for Articles and an End Date for Webinars. These values may be combined into a single computed field, such as contentdate, and used as sort criteria for Coveo. Once all of this content has been imported to Sitecore, the Updated field in the Statistics field section may not reflect the true date on which a piece of content was last modified, because the items are created during the import and their statistics are automatically updated on item::save events. The import of items can be customized to set the original value in the Updated field and then skip updating statistics, but that runs the risk of modifying a system field that is used to append the item to the publishing queue based on timestamps. I would recommend importing items in ascending order at this point.
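As a rough illustration of the ascending-order recommendation, the sketch below picks a single content date per record and sorts the records oldest first before they are handed to an importer. The record shape (type, publicationDate, endDate) and both function names are hypothetical, and it assumes "ascending" means ascending by that content date. A real Sitecore import would more likely be written in C# or PowerShell, so treat this only as the ordering logic.

```javascript
// Hypothetical record shape: { name, type, publicationDate, endDate }.
// Mirrors the idea of a single "contentdate" value per item.
function contentDate(record) {
  // Assumption: Webinars use their End Date, everything else its Publication Date.
  const raw = record.type === 'Webinar' ? record.endDate : record.publicationDate;
  return new Date(raw);
}

// Returns a new array sorted oldest first, leaving the input untouched.
function sortForImport(records) {
  return records
    .slice()
    .sort((a, b) => contentDate(a).getTime() - contentDate(b).getTime());
}

const records = [
  { name: 'Recent article', type: 'Article', publicationDate: '2020-01-10' },
  { name: 'Old article', type: 'Article', publicationDate: '2012-03-01' },
  { name: 'Past webinar', type: 'Webinar', endDate: '2016-06-15' }
];

const ordered = sortForImport(records);
console.log(ordered.map((r) => r.name));
```

Importing in this order means the newest content is saved last, so the automatically generated Updated values at least preserve the relative age of the items.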
With the content imported, you’ll notice that an Article from 2012 has a similar last updated value to an Article that is only a few weeks old. This may be reflected in what you see in the UI, especially when a Coveo Search Interface is loaded sorted by relevancy with no query. This is due to Coveo’s ranking of documents.
Coveo Ranking in our Use Case
Relevancy relies heavily on two Coveo Ranking Phases called Term Weighting and Term Frequency & Adjacency. The important factors in these phases are only used when a search query is applied. Without term ranking boosting relevancy, we are left with the Document Weighting phase, where one of its ranking factors, Date (Item last modification), plays a crucial role by ranking Items with the most recent modification date higher. This is not to say that a recently modified content item will always be ranked higher, but it can achieve a higher total score. Due to the out-of-the-box Document Weighting, we find ourselves viewing too many irrelevant Articles and Past Webinars.
To further understand all of the ranking factors, please refer to Coveo’s Guidelines for Understanding Search Result Ranking.
Step 1: Fine Tuning Item last modification
Item last modification can be carefully fine tuned for a specific query pipeline. The emphasis on carefully is to say that if you don’t include a condition with this tuning, then any query that goes through the same pipeline will be affected. Let’s go ahead and navigate to platform.cloud.coveo.com and fine tune the Item last modification with a new condition below:
- Select Query Pipelines > [Query Pipeline] > Ranking Weights Tab. In the screenshot below, I have identified a new rule that sets Item last modification to a value of 2 (out of 5) when the result is either an Article or Webinar and the Query is empty.
- Select the “Add Rule” button on the top right to open a new Ranking Weight Rule, which lets us adjust Item last modification. In the screenshot below, I have tuned Item last modification on the left-hand side and have selected a pre-existing condition on the right-hand side:
- Create a new Condition by going to Search > Conditions > Add Condition (conditions are re-usable across many rules for QPL and ML). In the screenshot below, I have identified a Coveo Custom Context Key, resulttype, that my team uses to keep track of the corresponding result’s Sitecore Template Name. I won’t get into the details of how to track and send custom context keys, as this step is relatively lightweight and is covered in Coveo’s documentation.
- Select “Add Rule” to save this Rule and its condition to the Ranking Weights of your Query Pipeline. With this tuning in place, Coveo states that a value of 0-4 will progressively reduce the weight of a ranking factor relative to its pre-tuned value. On its own, this did not drastically push a 2012 Article out of sight, because the quality of the document alone gave it one of the highest document weight scores.
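For readers who have not sent a custom context key before, here is a minimal sketch of how the resulttype key used by the condition above could be attached to each query with the Coveo JavaScript Search Framework. The helper name addResultTypeContext, the '#search' selector, and the 'Article' value are all placeholders of mine, not taken from the author's implementation. How your team derives the template name is outside the scope of this sketch.

```javascript
// Hypothetical helper: attaches the custom context key to the outgoing query.
// queryBuilder.addContextValue(key, value) is part of the Coveo JavaScript
// Search Framework QueryBuilder.
function addResultTypeContext(args, resultType) {
  args.queryBuilder.addContextValue('resulttype', resultType);
}

// Only wire up the listener when running in a page where Coveo is loaded.
if (typeof Coveo !== 'undefined' && typeof document !== 'undefined') {
  // Assumption: the search interface root element has the id "search".
  const root = document.querySelector('#search');
  Coveo.$$(root).on('buildingQuery', function (e, args) {
    // Assumption: 'Article' stands in for the real Sitecore Template Name.
    addResultTypeContext(args, 'Article');
  });
}
```

The condition created in the Coveo platform can then test this context key together with an empty query.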
Step 2: Query Ranking Function to the rescue
A Query Ranking Function helps immensely with the ranking score of an item because the boost it provides is relative to the function and score limit provided. When a result item is passed through, the ranking function’s algorithm makes a range of scores available. You may notice a “Ranking Expression” tab within a Query Pipeline, but do not confuse this feature with a Ranking Function. Ranking Expressions within the query pipeline can only apply static ranking adjustments (a fixed reduction or boost). We could add a Ranking Expression on the computed field @contentdate to reduce results that are more than N years old, but this still doesn’t provide percentage-based boosting or a sliding scale of ranking.
For our ranking function, we decided that we want a sliding scale of boosting that starts 8 years prior to now for results that have the @contentdate field, with a max boosting limit of 500. I have found that an item from Jan 2012 will have 0 boosting, whereas an item from Jan 2013 will have around 1-3% of the 500 modifier total, and an item close to NOW will have around a 90-100% boost.
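To make the 8-year window concrete, the sketch below reproduces only the clamping step, taking the larger of the content date and "now minus 8 years", on plain JavaScript timestamps. It does not attempt to model how Coveo normalizes the result into a boost, which I am not certain of. The helper name, the millisecond units, and the 365.25-day year are my assumptions, and the fixed "now" of January 2020 is chosen only so the window starts in January 2012.

```javascript
// Assumption: an average year of 365.25 days, expressed in milliseconds.
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;

// Hypothetical helper mirroring max(contentdate, now - windowYears years).
// Anything older than the window is floored to the start of the window.
function clampContentDate(contentDate, now, windowYears) {
  const windowStart = now.getTime() - windowYears * MS_PER_YEAR;
  return Math.max(contentDate.getTime(), windowStart);
}

// Fixed "now" so the example is deterministic.
const now = new Date(Date.UTC(2020, 0, 15));
const windowStart = now.getTime() - 8 * MS_PER_YEAR;

const from2009 = clampContentDate(new Date(Date.UTC(2009, 5, 1)), now, 8);
const from2011 = clampContentDate(new Date(Date.UTC(2011, 5, 1)), now, 8);
const from2019 = clampContentDate(new Date(Date.UTC(2019, 11, 1)), now, 8);

// Items older than the window collapse to the same value.
console.log(from2009 === from2011);
// A recent item keeps its own date, which is later than the window start.
console.log(from2019 > windowStart);
```

Because every item older than the window ends up with the same clamped value, none of them can be ranked above another by this function, while newer items spread out across the remaining range.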
The ranking function should be added directly with JavaScript within the buildingQuery event listener of a Coveo Search Interface:
args.queryBuilder.advancedExpression
    .add("$qrf(expression: 'max(@contentdate, (NOW - (YEAR * 8)))', normalizeWeight: 'true', modifier: '500')");
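For context, here is one way the snippet above might sit inside a complete buildingQuery listener. The helper buildDateRankingFunction, the handler name onBuildingQuery, and the '#search' selector are hypothetical. The $qrf string it produces is the same one shown above. As far as I know, the listener needs to be registered before the search interface is initialized, so treat the wiring as a sketch to adapt to your own page.

```javascript
// Hypothetical helper: builds the $qrf string from its three moving parts.
function buildDateRankingFunction(field, windowYears, modifier) {
  return "$qrf(expression: 'max(" + field + ", (NOW - (YEAR * " + windowYears + ")))', " +
    "normalizeWeight: 'true', modifier: '" + modifier + "')";
}

// Handler for the buildingQuery event: appends the ranking function
// to the advanced expression of the outgoing query.
function onBuildingQuery(e, args) {
  args.queryBuilder.advancedExpression.add(
    buildDateRankingFunction('@contentdate', 8, 500)
  );
}

// Only wire up the listener when running in a page where Coveo is loaded.
if (typeof Coveo !== 'undefined' && typeof document !== 'undefined') {
  // Assumption: the search interface root element has the id "search".
  const root = document.querySelector('#search');
  Coveo.$$(root).on('buildingQuery', onBuildingQuery);
}
```

Keeping the field, window, and modifier as parameters makes it easier to experiment with a different window or boost limit without editing the expression string by hand.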
The Result
Reducing the Item last modification weight for my Query Pipeline and adding a custom Query Ranking Function visibly shifted the results to a more date-centric ranking while keeping the relevancy contributed by the other ranking factors. In the screenshot below, the first result has been boosted by an additional 466 points solely from the Ranking Function. We don’t see a true 500 point boost to an Upcoming Webinar because, according to Coveo, the ranking function’s algorithm will not let the boosting value reach the modifier limit if the results around it don’t reach a point where a larger boost is necessary.