How to set up multilingual site search
If your site is available in multiple languages, you might find yourself wondering how to show users search content based on their language.
In this guide we will walk you through how to do just that. While the methods in this guide may differ depending on what language your website is written in, the principles remain the same.
Crawling multilingual content
By default, if you add your apex domain (http://acme.com, for example) to the domains area of your console, our crawler will crawl all the content it can find on your website, including content in different languages, whether that content is stored in a subdirectory such as www.acme.com/fr/ or on a subdomain such as fr.acme.com.
The directory and domain information is stored in the schema fields dir1, dir2 and domain. For example, the dir1 for www.acme.com/fr/ would be 'fr' (for French) and the domain for fr.acme.com would be 'fr.acme.com'. We can then use this information to create search filters, settings and UIs based on different languages.
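To make the mapping concrete, here is a rough sketch of how a URL could be broken down into these three fields. The function name url_to_schema_fields is hypothetical and this is not the crawler's actual implementation; it only assumes, as described above, that dir1 and dir2 are the first two path segments and domain is the host name.

```php
<?php
// Illustrative only: mimics how dir1, dir2 and domain might be derived from a URL.
// url_to_schema_fields() is a hypothetical helper, not part of any crawler API.
function url_to_schema_fields(string $url): array
{
    // parse_url() needs a scheme (http://...) to recognise the host.
    $host = parse_url($url, PHP_URL_HOST) ?: '';
    $path = parse_url($url, PHP_URL_PATH) ?: '';

    // Split the path into non-empty segments, e.g. "/fr/blog/" becomes ["fr", "blog"].
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    return [
        'domain' => $host,
        'dir1'   => $segments[0] ?? '',
        'dir2'   => $segments[1] ?? '',
    ];
}

// Subdirectory site: dir1 is "fr", domain is "www.acme.com".
print_r(url_to_schema_fields('http://www.acme.com/fr/'));

// Subdomain site: dir1 is empty, domain is "fr.acme.com".
print_r(url_to_schema_fields('http://fr.acme.com/'));
```

Which field you filter on therefore depends on how the site separates its languages: dir1 for subdirectories, domain for subdomains.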
Displaying multilingual content based on user language
In order to show content to users based on their language, we are going to conditionally render the Search UI widgets on the page.
This will be a three-step process:
Identify the user’s language by grabbing the page’s language attribute
Update the Search UI widgets to filter out all content that isn’t in the current language
Render the Search UI widget to the page
For this example, we are going to assume that PHP is the backend language for the site and that it is running WordPress, but the same logic can be applied to any site using the Search UI widgets.
1. Identify the user’s language
The first thing we need to do is identify the language of the page the current user is viewing. For this we are going to use a built-in WordPress function called get_language_attributes(), which returns the language attributes of the <html> tag.
<?php
$currentLang = get_language_attributes();
?>
In our example the user is viewing the French version of the site, so if we echo the result we get 'lang="fr-FR"'.
<?php
echo $currentLang;
// Output: lang="fr-FR"
?>
In our example, the dir1 schema field holds the language information, which for French is 'fr'. Therefore, we just need to get the 'fr' portion of the language attribute string and store it in a variable (alternatively, you could use a switch statement to set the variable if you don't have many languages). To do this we will use a regular expression:
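<?php
$currentLang = get_language_attributes(); // Get the language attributes, e.g. lang="fr-FR"
// Capture the letters that follow lang=" so that values without a region
// (lang="fr") or with a leading dir="rtl" attribute are handled as well
$regex = '/lang="([a-z]+)/i';
preg_match($regex, $currentLang, $output); // Store 'fr' in $output[1]
echo $output[1] ?? ''; // Empty string if no language was found
?>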
This code sets a regex pattern and then uses the PHP preg_match function to run the regex over $currentLang, storing the matches in an array, $output. The element $output[1] contains the captured language code (see the preg_match page in the PHP manual for more details).
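The switch-statement alternative mentioned earlier could look something like the sketch below. The function name lang_attribute_to_dir1, the list of supported locales, and the choice of 'en' as a fallback are all assumptions for illustration; the input is expected to be the string returned by get_language_attributes().

```php
<?php
// Hypothetical helper: maps a language attribute string such as lang="fr-FR"
// to the dir1 value used on the site. The locales listed are examples only.
function lang_attribute_to_dir1(string $langAttributes): string
{
    switch ($langAttributes) {
        case 'lang="fr-FR"':
        case 'lang="fr-CA"':
            return 'fr';
        case 'lang="de-DE"':
            return 'de';
        default:
            // Assumed fallback when the locale is not one we recognise.
            return 'en';
    }
}

echo lang_attribute_to_dir1('lang="fr-FR"'); // prints fr
```

A fixed mapping like this avoids the regex entirely, at the cost of having to list every locale the site supports.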
2. Update the Search UI Widget
Now we are going to use the language code in $output[1] as the basis for filtering French content in the Search UI widgets. To do this we will simply echo out the widget JSON with the language code as a default filter on our records' URLs.
We will do this in the search results widget, which accepts a default filter on the variables key.
Without the variable in place the default filter looks like this:
Adding the variable in we get the following:
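As a rough sketch of both forms, the snippet below builds the widget configuration as a PHP array and encodes it as JSON, first with a hard-coded language and then with the detected one. Only the variables key comes from the description above; the filter key, the filter expression syntax (dir1='fr') and the helper name build_widget_config are assumptions for illustration and should be checked against the widget configuration generated in your console.

```php
<?php
// Hypothetical helper: returns widget JSON with a default filter on dir1.
// The "filter" key and the expression syntax are assumptions, not a documented format.
function build_widget_config(string $lang): string
{
    // Only allow short alphabetic codes so the value is safe to place in the filter.
    if (!preg_match('/^[a-z]{2,3}$/i', $lang)) {
        throw new InvalidArgumentException('Unexpected language code: ' . $lang);
    }

    $config = [
        'variables' => [
            'filter' => "dir1='" . strtolower($lang) . "'",
        ],
    ];

    return json_encode($config, JSON_THROW_ON_ERROR);
}

// Hard-coded language, i.e. without the variable in place.
echo build_widget_config('fr'), "\n";

// With the language extracted from the page, as in step 1.
$output = [1 => 'fr']; // stands in for the preg_match result
echo build_widget_config($output[1]), "\n";
```

Validating the language code before it is placed in the filter keeps unexpected values out of the configuration that is sent to the browser.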
Now every time a search is run, a default filter will be added that only includes results where dir1 equals 'fr'.
3. Render the Search UI Widget
Now that we have the language code stored in our variable, and the variable in place in the JSON, we can render the Search UI widget to the page.
To do this we simply echo the HTML onto the page:
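A minimal sketch of that echo step, assuming the widget is mounted with a container element that holds its JSON configuration in a script tag. The data-widget attribute, its search-results value, the filter key and the helper name render_search_widget are placeholders; copy the actual markup from the widget code your console generates.

```php
<?php
// Hypothetical helper: returns the HTML for a search widget with its JSON config inline.
// The container markup is a placeholder, not the official embed code.
function render_search_widget(array $config): string
{
    // JSON_HEX_TAG escapes "<" and ">" so the JSON cannot close the script tag early.
    $json = json_encode($config, JSON_HEX_TAG | JSON_THROW_ON_ERROR);

    return '<div data-widget="search-results">'
        . '<script type="application/json">' . $json . '</script>'
        . '</div>';
}

$lang = 'fr'; // stands in for $output[1] from step 1
echo render_search_widget(['variables' => ['filter' => "dir1='" . $lang . "'"]]);
```

Building the configuration as an array and encoding it with json_encode, rather than concatenating a JSON string by hand, keeps the output valid even if a value contains quotes or angle brackets.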
Because we are rendering the JSON server side, it is written into the page with the correct language filter already in place before the page reaches the browser.
If we look at our source code we can see that the correct filter is in place:
That's it! Now every time a user visits our site, we first detect which language they are viewing the site in and then render the search widget to match.