Version: legacy

How do we build a DocSearch index?

In this section you will learn how we build a DocSearch index from your page.

Everything starts from your page

1st step

We extract the payload according to your set of selectors

2nd step

We focus only on the information highlighted by your selectors.

We iterate through the HTML flow and build the payload

3rd step

This payload will be the only data extracted from your page.
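The extraction above can be sketched roughly as follows. This is a minimal illustration, not the actual scraper: it matches plain tag names where the real selectors are richer, and the `SELECTORS` mapping, `PayloadBuilder` and `build_payload` are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser

# Hypothetical mapping from tag name to level (an assumption for this sketch;
# real selectors are more expressive than bare tag names).
SELECTORS = {"h1": "lvl0", "h2": "lvl1", "h3": "lvl2", "p": "text"}


class PayloadBuilder(HTMLParser):
    """Walk the HTML in document order and keep only the selected elements."""

    def __init__(self):
        super().__init__()
        self.payload = []    # ordered list of (level, text) pairs
        self._level = None   # level of the element currently being captured
        self._tag = None     # tag that opened the current capture
        self._chunks = []    # text fragments collected for that element

    def handle_starttag(self, tag, attrs):
        # Start capturing only when we are not already inside a match.
        if self._level is None and tag in SELECTORS:
            self._level = SELECTORS[tag]
            self._tag = tag
            self._chunks = []

    def handle_data(self, data):
        if self._level is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == self._tag:
            # Collapse whitespace and drop empty elements.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.payload.append((self._level, text))
            self._level = None
            self._tag = None


def build_payload(html):
    parser = PayloadBuilder()
    parser.feed(html)
    parser.close()
    return parser.payload
```

Anything that no selector matches (the `<div>` in a page such as `<h1>Guide</h1><div>menu</div><p>Hello</p>`) never reaches the payload, which is why the payload is the only data extracted from the page.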

We iterate through the payload and start pushing records

4th step

We index the temporary record each time we add an element to it (if min_indexed_level equals 0).
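One way to picture the role of min_indexed_level is as a filter applied before a temporary record is pushed. The sketch below assumes that a record qualifies only when every level from lvl0 up to lvl<min_indexed_level> is filled in; that reading of the option, and the `should_index` helper itself, are assumptions made for illustration.

```python
def should_index(record, min_indexed_level=0):
    """Return True when lvl0 .. lvl<min_indexed_level> are all set.

    Assumed semantics for this sketch: with the default of 0, a record
    only needs lvl0, so it is pushed as soon as the first element is added.
    """
    return all(record.get(f"lvl{i}") for i in range(min_indexed_level + 1))
```

Under this assumption, `should_index({"lvl0": "Guide"})` is true, while the same record is skipped when min_indexed_level is raised to 1 because lvl1 is still empty.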

We pile up the elements based on the current temporary record

5th step

Based on their position within the flow, we nest elements as deeply as possible to keep the context and improve relevance.

We iterate until we match a text element

6th step

We override the text element when we find a newer one

7th step

We remove the stashed, deeper elements when we add a higher level

8th step

Contextual information and hierarchy must be updated once we encounter a new level, because a new level marks a new sub-section that is unrelated to the previous one.
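The piling, overriding and resetting described in the last steps can be sketched as a single loop over the payload. This is a simplified illustration under stated assumptions: the payload is a list of `(level, text)` pairs, a record is pushed every time an element is added (the min_indexed_level equals 0 case), and `LEVELS` and `build_records` are hypothetical names, not the scraper's own.

```python
LEVELS = ["lvl0", "lvl1", "lvl2", "lvl3", "lvl4", "lvl5", "lvl6"]


def build_records(payload):
    # The temporary record: one slot per level plus the current text.
    temp = dict.fromkeys(LEVELS)
    temp["content"] = None
    records = []

    for level, text in payload:
        if level == "text":
            # A newer text element overrides the previous one.
            temp["content"] = text
        else:
            depth = LEVELS.index(level)
            temp[level] = text
            # A new level starts a new sub-section: drop the stashed,
            # deeper levels and the text that belonged to the old one.
            for deeper in LEVELS[depth + 1:]:
                temp[deeper] = None
            temp["content"] = None
        # Push a snapshot of the temporary record.
        records.append(dict(temp))

    return records
```

For example, given the payload `[("lvl0", "Guide"), ("lvl1", "Install"), ("lvl2", "Linux"), ("text", "Run it."), ("lvl1", "Usage")]`, the record pushed for "Run it." carries Guide, Install and Linux as its context, while the record for "Usage" keeps Guide but no longer carries Linux or the earlier text.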

If you need any further information, please contact us.