Adding basic search to an e-commerce store is easy. Building great, measurable search experiences, and iterating on high-performance product discovery, is very difficult and often requires an entire team of search engineers.

We set out to change that. Our goal was to build a service that streamlines not only the construction of great search but also the analytics and data workflows that create an ongoing flywheel of continual improvement. We wanted to put that power into the hands of any developer.

...

There are a few initial steps involved in getting search.io search off the ground:

  1. Import some data to help define a schema

  2. Configure pipelines 

  3. Sync data

  4. Generate a user interface (UI)

  5. Connect data feedback to improve ranking and performance automatically

...

Schemas are all about performance. If you aren’t convinced, we wrote an article on schema vs. schemaless just to convince you 😉

You can do this a) manually in our admin console, b) via our API, c) by uploading a sample record during your collection setup, or d) using a connector. This example used option b.

Note: If you’re using our website crawler, Shopify, or another connector this will happen automatically.
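
To make the idea of defining a schema from a sample record more concrete, here is a minimal sketch of how field types could be inferred from one record. It is not the search.io API: the `infer_schema` helper, the type names, and the sample product are assumptions for illustration only.

```python
from typing import Any


def infer_type(value: Any) -> str:
    """Map a Python value to a hypothetical schema type name."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    if isinstance(value, list):
        return "STRING_LIST"  # simplification: treat every list as a list of strings
    return "STRING"


def infer_schema(record: dict) -> dict:
    """Return a field-name -> type mapping for one sample record."""
    return {name: infer_type(value) for name, value in record.items()}


# Hypothetical sample product record.
sample = {
    "id": "sku-123",
    "name": "Wireless headphones",
    "price": 99.95,
    "stock": 12,
    "on_sale": True,
    "tags": ["audio", "bluetooth"],
}
schema = infer_schema(sample)
```

Fixing types up front like this is what lets an engine choose the right index structure per field, which is the performance argument for schemas made above.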

...

The configuration of an intelligent search algorithm can be extremely complicated. So, we re-imagined how engineers can build search by creating pipelines. Pipelines break down search configuration into smaller pieces that can be easily mixed, matched, and combined to create an incredibly powerful search experience. Pipelines are highly composable and extendable. You can read a quick pipeline overview here and more specific details related to this example below.

...

  • Record pipelines define how your product data is processed on ingest 

  • Query pipelines define how your queries are constructed 
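
One way to picture this composability is as a list of small functions applied in order. The sketch below is purely illustrative: the step names and the `run_pipeline` helper are hypothetical and do not reflect how search.io implements or configures its pipelines.

```python
from typing import Callable, List

Step = Callable[[dict], dict]


def run_pipeline(steps: List[Step], params: dict) -> dict:
    """Apply each step in order, passing the output of one into the next."""
    for step in steps:
        params = step(params)
    return params


# Hypothetical record pipeline steps: clean product data on ingest.
def trim_name(record: dict) -> dict:
    return {**record, "name": record["name"].strip()}


def add_price_bucket(record: dict) -> dict:
    bucket = "budget" if record["price"] < 50 else "premium"
    return {**record, "price_bucket": bucket}


# Hypothetical query pipeline steps: shape the query before it is executed.
def lowercase_query(params: dict) -> dict:
    return {**params, "q": params["q"].lower()}


def default_page_size(params: dict) -> dict:
    # Caller-supplied values win over the default.
    return {"results_per_page": 20, **params}


record_pipeline = [trim_name, add_price_bucket]
query_pipeline = [lowercase_query, default_page_size]

record = run_pipeline(record_pipeline, {"name": "  USB-C cable ", "price": 9.5})
query = run_pipeline(query_pipeline, {"q": "USB Cable"})
```

Because every step has the same shape (parameters in, parameters out), steps can be reordered, removed, or reused across pipelines without touching the others.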

...

search.io automatically generates initial pipelines that you can modify or extend later via our built-in pipeline editor. When you first set up search.io, the auto-detecting onboarding flow allows you to:

...

Steps in a query pipeline can be used for:

  • Query understanding - query rewrites, spelling, NLP, etc.

  • Filtering results - based on any attribute in the index. For example, location or customer-specific results.

  • Changing the relevance logic - dynamically boost different aspects based on the search query, parameters or data models

  • Constructing the engine query - as opposed to the input query, the engine query is what is actually executed. It can be extremely complex, but you don’t need to worry about that!

...

Spelling is hard! If you’ve tried fuzzy matching in other search engines, you know it’s underwhelming. It also typically slows things down a lot!

The nice thing about search.io spelling is that it’s not only very fast, it’s also much smarter than standard fuzzy matching. You can add spell correction via pipelines using a few lines of YAML.

...

Below are sample queries showing how missing characters are handled seamlessly.

...

Another nice thing about search.io spelling is that it learns over time. As people run more queries, suggestions and autocomplete get better at predicting what your customers most likely want.
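
The idea of spelling that learns from usage can be sketched as follows: generate candidate corrections for a term, then prefer the candidate that appears most often in past queries. This is a toy under stated assumptions, not search.io's algorithm; the query log, the `correct` function, and the single-edit candidate generation are all illustrative.

```python
from collections import Counter
import string


def edits1(word: str) -> set:
    """All strings one delete, transpose, replace, or insert away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word: str, query_counts: Counter) -> str:
    """Return the most frequently searched known term within one edit of word."""
    if word in query_counts:
        return word
    candidates = [w for w in edits1(word) if w in query_counts]
    if not candidates:
        return word  # nothing known nearby; leave the term untouched
    return max(candidates, key=lambda w: query_counts[w])


# Hypothetical query log: counts grow as customers search.
query_counts = Counter({"phone": 120, "phones": 40, "headphones": 75, "prone": 2})

fixed = correct("phne", query_counts)           # missing character
ambiguous = correct("pone", query_counts)       # "phone" and "prone" both fit
unchanged = correct("xylophone", query_counts)  # no known term within one edit
```

In the ambiguous case both "phone" and "prone" are one edit away, and the more frequently searched term wins. As the counts change with real traffic, so do the corrections.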

...

The transform looks like this:

Before regex step

{
    "q": "phone in:cheap"
}

After regex step

{
    "q": "phone",
    "in": "cheap"
}

Note: 

  • in is just an example. You could look for has, from or anything you like.

  • Many additional filters can be added for all the other potential values here. It could be in:sale or anything else.
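
The same before/after transform can be reproduced with a few lines of Python. This is only a sketch of the technique, not the search.io pipeline step itself; the `extract_filters` function and the list of recognised keys (`in`, `has`, `from`) are assumptions.

```python
import re

# Matches a "key:value" token such as "in:cheap"; the key list is an assumption.
FILTER_TOKEN = re.compile(r"\b(?P<key>in|has|from):(?P<value>\S+)")


def extract_filters(params: dict) -> dict:
    """Move key:value tokens out of "q" and into their own parameters."""
    out = dict(params)
    for match in FILTER_TOKEN.finditer(params["q"]):
        out[match.group("key")] = match.group("value")
    # Remove the matched tokens and collapse any leftover whitespace.
    out["q"] = " ".join(FILTER_TOKEN.sub("", params["q"]).split())
    return out


result = extract_filters({"q": "phone in:cheap"})
multi = extract_filters({"q": "red shoes from:nike in:sale"})
```

Once the value is lifted into its own parameter, a later step can translate it into a filter or boost without having to parse the query text again.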

...

That’s interesting and useful, but obviously difficult to do for many variations, right? Actually, search.io supports uploading these in bulk in several ways:

...

The API is useful if this information already exists in the business. For example, if you already have a solid understanding of which queries generate purchases, we can import this data for you so you can avoid a “cold start” when migrating to search.io. Contact us for more details.

The analytically generated conditional boosts are far more interesting though, as they can be regenerated automatically using machine learning to dynamically improve your business performance with no effort required. 
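
If query-to-purchase data already exists, turning it into boost rules could look roughly like the sketch below. The data shape, the `build_boosts` helper, the `min_share` threshold, and the rule format are all hypothetical; they are not the search.io API or its machine learning process.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def build_boosts(purchases: List[Tuple[str, str]], min_share: float = 0.6) -> Dict[str, dict]:
    """Derive at most one boost rule per query from (query, purchased_category) pairs.

    A rule is emitted only when a single category accounts for at least
    min_share of the purchases that followed that query.
    """
    by_query: Dict[str, Counter] = defaultdict(Counter)
    for query, category in purchases:
        by_query[query.lower()][category] += 1

    rules = {}
    for query, counts in by_query.items():
        category, hits = counts.most_common(1)[0]
        share = hits / sum(counts.values())
        if share >= min_share:
            rules[query] = {"boost_category": category, "weight": round(share, 2)}
    return rules


# Hypothetical historical data: which category was bought after each query.
history = [
    ("apple", "phones"), ("apple", "phones"), ("apple", "laptops"),
    ("Apple", "phones"), ("cable", "accessories"), ("cable", "audio"),
]
rules = build_boosts(history)
```

Here "apple" gets a rule because three of its four purchases were phones, while "cable" gets none because no category dominates. Re-running the function on fresh data is the manual analogue of regenerating boosts automatically.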

...

There are a couple of different ways to sync your data with search.io:

  • During your initial onboarding, you can submit a JSON file and search.io walks you through a data indexing wizard

  • You can use our API to load and update data

For this demo, we opted for the latter, using our Golang API, which neatly uses gRPC.

...

Note a few things here:

  1. The search facets include several different types (numeric, category, tags and buckets). Buckets allow any filter expression to define a facet grouping so these are highly flexible.

  2. The facets support filtered and non-filtered options. This is important because it lets you retain the higher-level numbers in a single query even when a facet is selected, unlike many other search providers, where this requires two requests!

  3. This supports desktop and mobile layouts responsively

  4. The result layout can be switched from a list to tiled

  5. Multiple different pipelines can be supported (popular, best, price-based). In these options, price is a “hard sort”, whereas popularity is a score-based “boosted sort” pipeline
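
The filtered versus non-filtered facet counts from point 2 can be illustrated with a small sketch that computes both in one pass over the matching records. The record fields, the `facet_counts` function, and the data are hypothetical, and a real engine would compute this inside the index rather than in application code.

```python
from collections import Counter
from typing import List, Optional


def facet_counts(records: List[dict], field: str, selected: Optional[str] = None):
    """Return (results, unfiltered_counts, filtered_counts) for one facet field."""
    unfiltered = Counter()
    filtered = Counter()
    results = []
    for record in records:
        value = record[field]
        unfiltered[value] += 1      # counts that ignore the facet selection
        if selected is None or value == selected:
            filtered[value] += 1    # counts after applying the selection
            results.append(record)
    return results, unfiltered, filtered


# Hypothetical records that already matched the text query.
matches = [
    {"name": "A", "brand": "acme"},
    {"name": "B", "brand": "acme"},
    {"name": "C", "brand": "zen"},
]
results, unfiltered, filtered = facet_counts(matches, "brand", selected="zen")
```

With "zen" selected, the result list shrinks to one record, but the unfiltered counts still show that two "acme" products match the query, so the UI can keep displaying the other options without a second request.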

...

In less time than it took to write this article, we’ve shown how to build a blazing-fast e-commerce search and discovery interface with autosuggest, sorting, some basic NLP, machine-learning-powered automatic result improvement, automated categorical segment alignment, AI-generated visual image search, spell correction, Gmail-style filters, conditional business logic, and much more!

...

Documentation

Pipelines Overview
