

Most websites include metadata that records when a publication (such as a blog post, article, or report) was published or updated. You can index this metadata in your collection to display it in search results, or to use it for sorting or filtering.

This how-to guide describes the standard metadata fields that the Sajari crawler detects and explains how to index date metadata in your collection.

How the crawler indexes date metadata

A web page might carry several date metadata fields, for example datePublished, modifiedDate, and lastupdated. The crawler indexes date metadata automatically when the page follows best practices and uses the standard meta fields.

Below are a few examples of the standard meta fields that our crawler detects and indexes automatically:

Open Graph Protocol

| Property | Expected Type | Description |
| --- | --- | --- |
| article:published_time | DateTime | When the article was first published. |
| article:modified_time | DateTime | When the article was last changed. |

Note: If you provide a datetime without a timezone, we will parse it as UTC.
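The UTC rule above can be illustrated with a minimal Python sketch. It is not the crawler's implementation: the class name, helper name, and sample page are hypothetical, and it assumes the dates are written as ISO 8601 strings in the content attribute of the meta tags.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser

# Open Graph date properties listed in the table above.
OG_DATE_PROPERTIES = {"article:published_time", "article:modified_time"}


def parse_as_utc_if_naive(value):
    """Parse an ISO 8601 string; assume UTC when no offset is given."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


class OpenGraphDateParser(HTMLParser):
    """Collects Open Graph date properties from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property")
        content = attributes.get("content")
        if prop in OG_DATE_PROPERTIES and content:
            self.dates[prop] = parse_as_utc_if_naive(content)


# Illustrative page: one date with an explicit offset, one without.
SAMPLE_PAGE = """
<head>
  <meta property="article:published_time" content="2020-03-01T09:30:00+11:00">
  <meta property="article:modified_time" content="2020-03-05T14:00:00">
</head>
"""

og_parser = OpenGraphDateParser()
og_parser.feed(SAMPLE_PAGE)
for name, value in og_parser.dates.items():
    print(name, value.isoformat())
```

In this sketch the published time keeps its +11:00 offset, while the modified time, which has no offset, is treated as UTC. Including an explicit offset in your markup avoids relying on that default.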

Schema.org

| Property | Expected Type | Description |
| --- | --- | --- |
| dateCreated | Date or DateTime | The date on which the CreativeWork was created or the item was added to a DataFeed. |
| dateModified | Date or DateTime | The date on which the CreativeWork was most recently modified or when the item's entry was modified within a DataFeed. |
| datePublished | Date | Date of first broadcast/publication. |
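As a rough illustration of the properties above, the following Python sketch reads them from a page and accepts both Date and DateTime values. It assumes the properties are marked up as microdata (itemprop attributes on meta or time elements); this page does not state which markup the crawler expects, and the class, helper, and sample markup are hypothetical.

```python
from datetime import date, datetime
from html.parser import HTMLParser

# Schema.org date properties listed in the table above.
SCHEMA_DATE_PROPERTIES = {"dateCreated", "dateModified", "datePublished"}


def parse_date_or_datetime(value):
    """Return a date for 'YYYY-MM-DD' values, otherwise a datetime."""
    if len(value) == 10:
        return date.fromisoformat(value)
    return datetime.fromisoformat(value)


class SchemaDateParser(HTMLParser):
    """Collects Schema.org date properties from microdata (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.dates = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("itemprop")
        if prop not in SCHEMA_DATE_PROPERTIES:
            return
        # <meta> carries the value in content, <time> in datetime.
        raw = attributes.get("content") or attributes.get("datetime")
        if raw:
            self.dates[prop] = parse_date_or_datetime(raw)


# Illustrative markup only.
SAMPLE_ARTICLE = """
<article itemscope itemtype="https://schema.org/Article">
  <meta itemprop="datePublished" content="2020-03-01">
  <time itemprop="dateModified" datetime="2020-03-05T14:00:00+00:00">5 March 2020</time>
</article>
"""

schema_parser = SchemaDateParser()
schema_parser.feed(SAMPLE_ARTICLE)
for name, value in schema_parser.dates.items():
    print(name, value.isoformat())
```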

We also support schema.org entities in JSON-LD format for a few of our enterprise clients. We are working on supporting JSON-LD format by default. If you are a current customer looking to use JSON-LD format, get in touch via our Service Desk.

Custom Metadata

You can also add a custom metadata date field to your website and use that. Read more on how you can index custom fields in your collection.


How to index date metadata

Instructions

  1. Identify the metadata field that you want to index in your collection.

  2. If the field is using Open Graph Protocol or Schema.org entities, then skip to step 4.

  3. Add a schema field for the date metadata field via the Schema section, and add the data-sj-field="fieldname" attribute to the metadata element (see detailed instructions).

  4. Re-index a sample page via the "Diagnose" tool in the Domains section. Once indexed, the record should be updated and the metadata should be added to the field.

  5. Check that the record has been indexed and has the correct field value. You can check this via the Preview section. Use "Expand all" to display all fields, and use a filter (for example, "filter":"url='http://www.url.com'") to check a specific page.

  6. Once you have verified that the metadata is being indexed correctly, re-index all domains in the Domains section.

You can also use the Page Debug tool to check if the crawler detects the date metadata that you want to add.
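Steps 3 and 5 can be sketched locally in Python before you re-index. The sketch below is illustrative only: the field name last_reviewed and the sample meta tag are hypothetical, it assumes the value of a tagged meta element is taken from its content attribute, and it builds only the filter fragment shown in step 5, not a full request.

```python
import json
from html.parser import HTMLParser


class CustomFieldParser(HTMLParser):
    """Maps data-sj-field names to the content of tagged <meta> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        field_name = attributes.get("data-sj-field")
        # Assumption: for <meta> elements the value comes from content.
        if tag == "meta" and field_name and attributes.get("content"):
            self.fields[field_name] = attributes["content"]


def preview_filter(url):
    """Build the filter fragment from step 5 for a single page URL."""
    return json.dumps({"filter": f"url='{url}'"})


# Hypothetical custom date field tagged as described in step 3.
SAMPLE_TAG = '<meta name="last-reviewed" content="2020-03-01" data-sj-field="last_reviewed">'

custom_parser = CustomFieldParser()
custom_parser.feed(SAMPLE_TAG)
print(custom_parser.fields)
print(preview_filter("http://www.url.com"))
```

The name given in data-sj-field needs to match the schema field you created in step 3, so checking the pair locally can catch a typo before a full re-index.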


After successfully indexing your date metadata, you can use the indexed date field to display dates in search results, to sort results, or to filter them.


Documentation

Sorting

Filter records based on a timestamp field

Filtering content
