---
date: 2021-09-26
authors: [squidfunk]
description: >
  Three new simple ways to exclude dedicated parts of a document from the search
  index, allowing for more fine-grained control
categories:
  - Search
links:
  - blog/posts/search-better-faster-smaller.md
  - setup/setting-up-site-search.md#search-exclusion
  - insiders/how-to-sponsor.md
---

# Excluding content from search

__The latest Insiders release brings three new simple ways to exclude
dedicated parts of a document from the search index, allowing for more
fine-grained control.__

Two weeks ago, Material for MkDocs Insiders shipped a [brand new search
plugin], yielding [massive improvements in usability], but also in [speed
and size] of the search index. Interestingly, as discussed in the previous
blog article, we only scratched the surface of what's now possible. This
release brings some useful features that enhance the writing experience,
allowing for more fine-grained control of which pages, sections and blocks of a
Markdown file should be indexed by the built-in search functionality.

<!-- more -->

_The following section discusses existing solutions for excluding pages and
sections from the search index. If you immediately want to learn what's new,
skip to the [section just after that][what's new]._

  [brand new search plugin]: search-better-faster-smaller.md
  [massive improvements in usability]: search-better-faster-smaller.md#whats-new
  [speed and size]: search-better-faster-smaller.md#benchmarks
  [what's new]: #whats-new

## Prior art

MkDocs has a rich and thriving ecosystem of [plugins], and it comes as no
surprise that there's already a fantastic plugin by @chrieke to exclude specific
sections of a Markdown file – the [mkdocs-exclude-search] plugin. It can be
installed with:

```
pip install mkdocs-exclude-search
```

__How it works__: the plugin post-processes the `search_index.json` file that
is generated by the built-in search plugin, giving the author the ability to
exclude certain pages and sections by adding a few lines of configuration to
`mkdocs.yml`. An example:

``` yaml
plugins:
  - search
  - exclude-search:
      exclude:
        - page.md
        - page.md#section
        - directory/*
        - /*/page.md
```

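To illustrate the idea of post-processing, here is a minimal Python sketch that removes entries from an index by matching their `location` against wildcard patterns. It is not the plugin's actual implementation: the function name `filter_index`, the use of `fnmatch`, the sample index, and matching patterns directly against index locations (rather than against source file names such as `page.md`) are all assumptions made for illustration.

```python
import json
from fnmatch import fnmatch


def filter_index(index: dict, patterns: list[str]) -> dict:
    """Return a copy of the index without entries matching any pattern."""
    kept = [
        doc for doc in index["docs"]
        if not any(fnmatch(doc["location"], pattern) for pattern in patterns)
    ]
    return {**index, "docs": kept}


# Hypothetical index content, shaped like the "docs" array of search_index.json
index = {
    "docs": [
        {"location": "page/", "title": "Page", "text": ""},
        {"location": "page/#section", "title": "Section", "text": ""},
        {"location": "directory/other/", "title": "Other", "text": ""},
    ]
}

# Assumed patterns, written against index locations for simplicity
filtered = filter_index(index, ["page/#section", "directory/*"])
print(json.dumps([doc["location"] for doc in filtered["docs"]]))
```

Only the `page/` entry survives in this sketch, since the other two locations each match one of the patterns.
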
It's easy to see that the plugin follows a configuration-centric approach, which
adds support for advanced filtering techniques like infix- and suffix-filtering
using wildcards. While this is a very powerful idea, it comes with some
downsides:

1.  __Exclusion patterns and content are not co-located__: exclusion patterns
    need to be defined in `mkdocs.yml`, and not as part of the respective
    document or section to be excluded. This might result in stale exclusion
    patterns, leading to unintended behavior:

    - When a headline is changed, its slug (permalink) also changes, which might
      suddenly match (or unmatch) a pattern, e.g., when an author fixes a typo
      in a headline.

    - As exclusion patterns support the use of wildcards, different authors
      might override each other's patterns without any immediate feedback since
      the plugin only reports the number of excluded documents – not _what_
      has been excluded.[^1]

      [^1]:
        When the log level is set to `DEBUG`, the plugin will report exactly which
        pages and sections have been excluded from the search index, but MkDocs will
        then flood the terminal with debug output from its core and other plugins.

2.  __Exclusion control might be too coarse__: the [mkdocs-exclude-search]
    plugin only allows for the exclusion of pages and sections. It's not
    possible to exclude parts of a section, e.g., content that is irrelevant
    to search but must be included as part of the documentation.

  [plugins]: https://github.com/mkdocs/mkdocs/wiki/MkDocs-Plugins
  [mkdocs-exclude-search]: https://github.com/chrieke/mkdocs-exclude-search

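The stale-pattern problem from the first downside can be made concrete with a small Python sketch. The `slugify` function here is a deliberately simplified stand-in, not the slug function Python Markdown actually uses, and the heading text and pattern are hypothetical.

```python
import re
from fnmatch import fnmatch


def slugify(heading: str) -> str:
    """Simplified slug: lowercase, drop punctuation, join words with hyphens."""
    words = re.sub(r"[^\w\s-]", "", heading.lower()).split()
    return "-".join(words)


# Hypothetical pattern, written while the heading still contained a typo
pattern = "page/#instalation-guide"

before = "page/#" + slugify("Instalation guide")
after = "page/#" + slugify("Installation guide")

print(fnmatch(before, pattern))  # pattern matches, the section is excluded
print(fnmatch(after, pattern))   # pattern no longer matches, the section is indexed
```

After the typo fix, the pattern silently stops matching, and nothing in the Markdown file hints that an exclusion rule in the configuration was affected.
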
## What's new?

The latest Insiders release brings fine-grained control for [__excluding pages,
sections, and blocks__][search exclusion] from the search index, implemented
through front matter and the [Attribute Lists] extension. Note that it doesn't
replace the [mkdocs-exclude-search] plugin but __complements__ it.

  [search exclusion]: ../../setup/setting-up-site-search.md#search-exclusion
  [Attribute Lists]: ../../setup/extensions/python-markdown.md#attribute-lists

### Excluding pages

An entire page can be excluded from the search index by adding a simple
directive to the front matter of the respective Markdown file. The good thing
is that the author now only has to check the top of the document to learn
whether it is excluded or not:

``` yaml
---
search:
  exclude: true
---

# Page title
...
```

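As a rough sketch of how such a directive could be evaluated, the following Python snippet checks front matter that has already been parsed into a dictionary. The function name `is_excluded` and the sample pages are hypothetical, and the snippet makes no claim about how the search plugin reads the flag internally.

```python
def is_excluded(meta: dict) -> bool:
    """Check whether parsed front matter asks for search exclusion."""
    search = meta.get("search")
    return isinstance(search, dict) and search.get("exclude") is True


# Hypothetical pages, keyed by path, with front matter already parsed into dicts
pages = {
    "index.md": {},
    "internal.md": {"search": {"exclude": True}},
    "guide.md": {"search": {"exclude": False}},
}

indexed = [path for path, meta in pages.items() if not is_excluded(meta)]
print(indexed)
```

Pages without the directive, or with `exclude: false`, remain in the list of indexed pages in this sketch.
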
### Excluding sections

If a section should be excluded, the author can use the [Attribute Lists]
extension to add a __pragma__ called `data-search-exclude` at the end of a
heading. The pragma is not included in the final HTML, as search pragmas are
filtered by the search plugin before the page is rendered:

=== ":octicons-file-code-16: `docs/page.md`"

    ``` markdown
    # Page title

    ## Section 1

    The content of this section is included

    ## Section 2 { data-search-exclude }

    The content of this section is excluded
    ```

=== ":octicons-codescan-16: `search_index.json`"

    ``` json
    {
      ...
      "docs": [
        {
          "location":"page/",
          "text":"",
          "title":"Page title"
        },
        {
          "location":"page/#section-1",
          "text":"<p>The content of this section is included</p>",
          "title":"Section 1"
        }
      ]
    }
    ```

### Excluding blocks

If even more fine-grained control is desired, the __pragma__ can be added to
any [block-level element] or [inline-level element] that is officially
supported by the [Attribute Lists] extension:

=== ":octicons-file-code-16: `docs/page.md`"

    ``` markdown
    # Page title

    The content of this block is included

    The content of this block is excluded
    { data-search-exclude }
    ```

=== ":octicons-codescan-16: `search_index.json`"

    ``` json
    {
      ...
      "docs": [
        {
          "location":"page/",
          "text":"<p>The content of this block is included</p>",
          "title":"Page title"
        }
      ]
    }
    ```

  [block-level element]: https://python-markdown.github.io/extensions/attr_list/#block-level
  [inline-level element]: https://python-markdown.github.io/extensions/attr_list/#inline

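To give an impression of what block-level filtering involves, here is a minimal Python sketch that drops every element carrying a `data-search-exclude` attribute from a piece of HTML. It assumes the pragma surfaces as an attribute on the rendered element. The class name `ExcludeFilter` and the `VOID` set are made up for this example, comments and other special nodes are simply dropped, and none of this should be read as the search plugin's actual implementation.

```python
from html import escape
from html.parser import HTMLParser

# Elements without a closing tag, which must not change the nesting depth
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class ExcludeFilter(HTMLParser):
    """Collect HTML while skipping elements marked with data-search-exclude."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self.skip_depth = 0  # greater than 0 while inside an excluded element

    def handle_starttag(self, tag, attrs):
        excluded = any(name == "data-search-exclude" for name, _ in attrs)
        if tag in VOID:
            if not self.skip_depth and not excluded:
                self.parts.append(self.get_starttag_text())
            return
        if self.skip_depth or excluded:
            self.skip_depth += 1
        else:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1
        else:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text arrives unescaped, so escape it again on the way out
            self.parts.append(escape(data, quote=False))


parser = ExcludeFilter()
parser.feed(
    "<p>The content of this block is included</p>"
    "<p data-search-exclude>The content of this block is excluded</p>"
)
parser.close()
print("".join(parser.parts))
```

Tracking a nesting depth rather than a simple flag means that elements nested inside an excluded block are skipped as well.
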
## Conclusion

The latest release brings three simple ways to control more precisely what goes
into the search index and what doesn't. It complements the already very powerful
[mkdocs-exclude-search] plugin, allowing for new methods of shaping the
structure, size and content of the search index.