Excluding content from search

The latest Insiders release brings three simple new ways to exclude specific parts of a document from the search index, allowing for more fine-grained control.

Two weeks ago, Material for MkDocs Insiders shipped a brand new search plugin, yielding massive improvements in usability as well as in the speed and size of the search index. As discussed in the previous blog article, we've only scratched the surface of what's now possible. This release brings some useful features that enhance the writing experience, allowing for more fine-grained control over which pages, sections and blocks of a Markdown file are indexed by the built-in search functionality.

The following section discusses existing solutions for excluding pages and sections from the search index. If you immediately want to learn what's new, skip ahead to "What's new?".

Prior art

MkDocs has a rich and thriving ecosystem of plugins, and it comes as no surprise that there's already a fantastic plugin by @chrieke to exclude specific sections of a Markdown file – the mkdocs-exclude-search plugin. It can be installed with:

pip install mkdocs-exclude-search

How it works: the plugin post-processes the search_index.json file that is generated by the built-in search plugin, giving the author the ability to exclude certain pages and sections by adding a few lines of configuration to mkdocs.yml. An example:

plugins:
  - search
  - exclude-search:
      exclude:
        - page.md
        - page.md#section
        - directory/*
        - /*/page.md

It's easy to see that the plugin follows a configuration-centric approach, which adds support for advanced filtering techniques like infix- and suffix-filtering using wildcards. While this is a very powerful idea, it comes with some downsides:

Exclusion patterns and content are not co-located: exclusion patterns need to be defined in mkdocs.yml, and not as part of the respective document or section to be excluded. This might result in stale exclusion patterns, leading to unintended behavior:
- When a headline is changed, its slug (permalink) also changes, which might suddenly match (or unmatch) a pattern, e.g., when an author fixes a typo in a headline.
- As exclusion patterns support the use of wildcards, different authors might overwrite each other's patterns without any immediate feedback, since the plugin only reports the number of excluded documents – not what has been excluded.¹

Exclusion control might be too coarse: The mkdocs-exclude-search plugin only allows for the exclusion of pages and sections. It's not possible to exclude parts of a section, e.g., content that is irrelevant to search but must be included as part of the documentation.
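
To make the first downside more tangible, here's a minimal sketch of wildcard matching built on Python's `fnmatch` module. It is not the plugin's actual implementation: the `excluded` helper and the identifier strings are hypothetical, and we only assume that patterns are compared against strings of the form `file.md#slug`.

```python
from fnmatch import fnmatchcase


def excluded(identifiers, patterns):
    """Return the identifiers that match at least one exclusion pattern."""
    return [
        ident
        for ident in identifiers
        if any(fnmatchcase(ident, pattern) for pattern in patterns)
    ]


# Hypothetical patterns, written while the headline still contained a typo
patterns = ["page.md#instalation", "directory/*"]

# Hypothetical sections before and after an author fixes the typo
before = ["page.md#instalation", "page.md#usage", "directory/index.md"]
after = ["page.md#installation", "page.md#usage", "directory/index.md"]

print(excluded(before, patterns))
print(excluded(after, patterns))
```

Once the headline is corrected, the first pattern silently stops matching, and the section ends up in the search index again without anyone touching mkdocs.yml.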

What's new?

The latest Insiders release brings fine-grained control for excluding pages, sections, and blocks from the search index, implemented through front matter as well as the Attribute Lists extension. Note that it doesn't replace the mkdocs-exclude-search plugin but complements it.

Excluding pages

An entire page can be excluded from the search index by adding a simple directive to the front matter of the respective Markdown file. The good thing is that the author now only has to check the top of the document to learn whether it is excluded or not:

---
search:
  exclude: true
---

# Page title
...
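
Conceptually, the check behind this directive is tiny. The following sketch assumes the front matter has already been parsed into a dictionary (MkDocs exposes it as `page.meta`); the `is_page_excluded` helper is hypothetical and not the search plugin's actual code.

```python
def is_page_excluded(meta):
    """Return True if the front matter asks for the page to be excluded."""
    search = meta.get("search")
    # Ignore a missing or malformed `search` key instead of failing
    if not isinstance(search, dict):
        return False
    return bool(search.get("exclude", False))


# Front matter from the example above, as a parsed dictionary
print(is_page_excluded({"search": {"exclude": True}}))
# A page without any search-related front matter stays in the index
print(is_page_excluded({"title": "Page title"}))
```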

Excluding sections

If a section should be excluded, the author can use the Attribute Lists extension to add a pragma called data-search-exclude at the end of a heading. The pragma is not included in the final HTML, as search pragmas are filtered by the search plugin before the page is rendered:

=== ":octicons-file-code-16: docs/page.md"

``` markdown
# Page title

## Section 1

The content of this section is included

## Section 2 { data-search-exclude }

The content of this section is excluded
```

=== ":octicons-codescan-16: search_index.json"

``` json
{
  ...
  "docs": [
    {
      "location":"page/",
      "text":"",
      "title":"Page title"
    },
    {
      "location":"page/#section-1",
      "text":"<p>The content of this section is included</p>",
      "title":"Section 1"
    }
  ]
}
```
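
One way to picture section-level exclusion is a small indexer that splits rendered HTML at headings and drops every section whose heading carries the pragma. The sketch below uses Python's built-in `html.parser`; the `SectionIndexer` class is hypothetical, the sample HTML is simplified, and the text is stored as plain text rather than HTML. It is an illustration under these assumptions, not the search plugin's implementation.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionIndexer(HTMLParser):
    """Split HTML into sections at headings, skipping excluded ones."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.docs = []
        self.current = None  # section being filled, None while excluded
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag not in HEADINGS:
            return
        attrs = dict(attrs)
        self.in_heading = True
        if "data-search-exclude" in attrs:
            # Drop the heading and everything up to the next heading
            self.current = None
            return
        anchor = attrs.get("id")
        # Assumption: the top-level heading maps to the page itself
        if tag == "h1" or not anchor:
            location = self.base
        else:
            location = f"{self.base}#{anchor}"
        self.current = {"location": location, "text": "", "title": ""}
        self.docs.append(self.current)

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self.in_heading = False

    def handle_data(self, data):
        if self.current is None:
            return
        key = "title" if self.in_heading else "text"
        self.current[key] += data


html = (
    "<h1>Page title</h1>"
    '<h2 id="section-1">Section 1</h2>'
    "<p>The content of this section is included</p>"
    '<h2 id="section-2" data-search-exclude>Section 2</h2>'
    "<p>The content of this section is excluded</p>"
)

indexer = SectionIndexer("page/")
indexer.feed(html)
indexer.close()
print(indexer.docs)
```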

Excluding blocks

If even more fine-grained control is desired, the pragma can be added to any block-level or inline-level element that is supported by the Attribute Lists extension:

=== ":octicons-file-code-16: docs/page.md"

``` markdown
# Page title

The content of this block is included

The content of this block is excluded
{ data-search-exclude }
```

=== ":octicons-codescan-16: search_index.json"

``` json
{
  ...
  "docs": [
    {
      "location":"page/",
      "text":"<p>The content of this block is included</p>",
      "title":"Page title"
    }
  ]
}
```
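
Block-level exclusion can be thought of as skipping whole subtrees while collecting text. Here's a rough sketch, again based on Python's built-in `html.parser`: the `BlockFilter` class is hypothetical, the sample HTML is simplified, and the real search plugin may well work differently.

```python
from html.parser import HTMLParser

# Void elements have no closing tag and must not affect the depth counter
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class BlockFilter(HTMLParser):
    """Collect text, skipping elements marked with data-search-exclude."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0  # > 0 while inside an excluded element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Descendants of an excluded element are excluded as well
        if self.skip_depth or "data-search-exclude" in dict(attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


html = (
    "<p>The content of this block is included</p>"
    "<p data-search-exclude>The content of this <em>block</em> is excluded</p>"
)

block_filter = BlockFilter()
block_filter.feed(html)
block_filter.close()
text = "".join(block_filter.parts)
print(text)
```

Tracking the nesting depth, rather than a simple on/off flag, ensures that nested markup inside an excluded block doesn't end the exclusion early.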

Conclusion

The latest release brings three simple ways to control more precisely what goes into the search index and what doesn't. It complements the already very powerful mkdocs-exclude-search plugin, allowing for new methods of shaping the structure, size and content of the search index.

¹ When the log level is set to DEBUG, the plugin will report exactly which pages and sections have been excluded from the search index, but MkDocs will then also flood the terminal with debug output from its core and other plugins.
