diff --git a/mintlify-docs/LICENSE b/mintlify-docs/LICENSE
new file mode 100644
index 0000000000..5411374274
--- /dev/null
+++ b/mintlify-docs/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 Mintlify
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
\ No newline at end of file
diff --git a/mintlify-docs/README.md b/mintlify-docs/README.md
new file mode 100644
index 0000000000..055c983adb
--- /dev/null
+++ b/mintlify-docs/README.md
@@ -0,0 +1,43 @@
+# Mintlify Starter Kit
+
+Use the starter kit to get your docs deployed and ready to customize.
+
+Click the green **Use this template** button at the top of this repo to copy the Mintlify starter kit. The starter kit contains examples with:
+
+- Guide pages
+- Navigation
+- Customizations
+- API reference pages
+- Use of popular components
+
+**[Follow the full quickstart guide](https://starter.mintlify.com/quickstart)**
+
+## Development
+
+Install the [Mintlify CLI](https://www.npmjs.com/package/mint) to preview your documentation changes locally. To install, use the following command:
+
+```
+npm i -g mint
+```
+
+Run the following command at the root of your documentation, where your `docs.json` is located:
+
+```
+mint dev
+```
+
+View your local preview at `http://localhost:3000`.
+
+## Publishing changes
+
+Install our GitHub app from your [dashboard](https://dashboard.mintlify.com/settings/organization/github-app) to propagate changes from your repo to your deployment. Changes are deployed to production automatically after pushing to the default branch.
+
+## Need help?
+
+### Troubleshooting
+
+- If your dev environment isn't running: Run `mint update` to ensure you have the most recent version of the CLI.
+- If a page loads as a 404: Make sure you are running in a folder with a valid `docs.json`.
+
+### Resources
+- [Mintlify documentation](https://mintlify.com/docs)
diff --git a/mintlify-docs/assets/a_tensor_formalism_for_computer_science.pdf b/mintlify-docs/assets/a_tensor_formalism_for_computer_science.pdf
new file mode 100644
index 0000000000..b103256731
Binary files /dev/null and b/mintlify-docs/assets/a_tensor_formalism_for_computer_science.pdf differ
diff --git a/mintlify-docs/assets/attribute-memory-Vespa.xls b/mintlify-docs/assets/attribute-memory-Vespa.xls
new file mode 100644
index 0000000000..41958c3de0
Binary files /dev/null and b/mintlify-docs/assets/attribute-memory-Vespa.xls differ
diff --git a/mintlify-docs/assets/commits-release.png b/mintlify-docs/assets/commits-release.png
new file mode 100644
index 0000000000..13039c61e2
Binary files /dev/null and b/mintlify-docs/assets/commits-release.png differ
diff --git a/mintlify-docs/assets/cover-image.png b/mintlify-docs/assets/cover-image.png
new file mode 100644
index 0000000000..dff3529287
Binary files /dev/null and b/mintlify-docs/assets/cover-image.png differ
diff --git a/mintlify-docs/assets/fonts/Roobert-Medium.woff b/mintlify-docs/assets/fonts/Roobert-Medium.woff
new file mode 100644
index 0000000000..d8a95adf5c
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-Medium.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-Medium.woff2 b/mintlify-docs/assets/fonts/Roobert-Medium.woff2
new file mode 100644
index 0000000000..b5d52020ab
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-Medium.woff2 differ
diff --git a/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff b/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff
new file mode 100644
index 0000000000..a0f62a56e6
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff2 b/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff2
new file mode 100644
index 0000000000..39c52008a3
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-MediumItalic.woff2 differ
diff --git a/mintlify-docs/assets/fonts/Roobert-Regular.woff b/mintlify-docs/assets/fonts/Roobert-Regular.woff
new file mode 100644
index 0000000000..5985b8b1c4
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-Regular.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-Regular.woff2 b/mintlify-docs/assets/fonts/Roobert-Regular.woff2
new file mode 100644
index 0000000000..78791dbf44
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-Regular.woff2 differ
diff --git a/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff b/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff
new file mode 100644
index 0000000000..3d7bdf6eca
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff2 b/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff2
new file mode 100644
index 0000000000..65b66ee797
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-RegularItalic.woff2 differ
diff --git a/mintlify-docs/assets/fonts/Roobert-SemiBold.woff b/mintlify-docs/assets/fonts/Roobert-SemiBold.woff
new file mode 100644
index 0000000000..f252deb4ea
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-SemiBold.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-SemiBold.woff2 b/mintlify-docs/assets/fonts/Roobert-SemiBold.woff2
new file mode 100644
index 0000000000..7720398e20
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-SemiBold.woff2 differ
diff --git a/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff b/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff
new file mode 100644
index 0000000000..b39e55aa6e
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff differ
diff --git a/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff2 b/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff2
new file mode 100644
index 0000000000..51d21e3e68
Binary files /dev/null and b/mintlify-docs/assets/fonts/Roobert-SemiBoldItalic.woff2 differ
diff --git a/mintlify-docs/assets/graph-image.png b/mintlify-docs/assets/graph-image.png
new file mode 100644
index 0000000000..0d09f733c6
Binary files /dev/null and b/mintlify-docs/assets/graph-image.png differ
diff --git a/mintlify-docs/assets/icons/arrow-down.svg b/mintlify-docs/assets/icons/arrow-down.svg
new file mode 100644
index 0000000000..594625557d
--- /dev/null
+++ b/mintlify-docs/assets/icons/arrow-down.svg
@@ -0,0 +1,3 @@
+
diff --git a/mintlify-docs/assets/icons/arrow-up.svg b/mintlify-docs/assets/icons/arrow-up.svg
new file mode 100644
index 0000000000..b7a49ac5ba
--- /dev/null
+++ b/mintlify-docs/assets/icons/arrow-up.svg
@@ -0,0 +1,3 @@
+
diff --git a/mintlify-docs/assets/img/1x6.svg b/mintlify-docs/assets/img/1x6.svg
new file mode 100644
index 0000000000..39c2e129be
--- /dev/null
+++ b/mintlify-docs/assets/img/1x6.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/2x3.svg b/mintlify-docs/assets/img/2x3.svg
new file mode 100644
index 0000000000..9e760aa8b2
--- /dev/null
+++ b/mintlify-docs/assets/img/2x3.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/3Dplot.png b/mintlify-docs/assets/img/3Dplot.png
new file mode 100644
index 0000000000..509a4e05a9
Binary files /dev/null and b/mintlify-docs/assets/img/3Dplot.png differ
diff --git a/mintlify-docs/assets/img/3x2.svg b/mintlify-docs/assets/img/3x2.svg
new file mode 100644
index 0000000000..d2bc89766f
--- /dev/null
+++ b/mintlify-docs/assets/img/3x2.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/6x1.svg b/mintlify-docs/assets/img/6x1.svg
new file mode 100644
index 0000000000..a52b7754aa
--- /dev/null
+++ b/mintlify-docs/assets/img/6x1.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/CI-integration.png b/mintlify-docs/assets/img/CI-integration.png
new file mode 100644
index 0000000000..4c7af9d8f6
Binary files /dev/null and b/mintlify-docs/assets/img/CI-integration.png differ
diff --git a/mintlify-docs/assets/img/QPS-scaling.svg b/mintlify-docs/assets/img/QPS-scaling.svg
new file mode 100644
index 0000000000..a7835640ae
--- /dev/null
+++ b/mintlify-docs/assets/img/QPS-scaling.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/ScalingLatencyFactor0.005.svg b/mintlify-docs/assets/img/ScalingLatencyFactor0.005.svg
new file mode 100644
index 0000000000..2fb9a5c77b
--- /dev/null
+++ b/mintlify-docs/assets/img/ScalingLatencyFactor0.005.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/ScalingLatencyFactor0.5.svg b/mintlify-docs/assets/img/ScalingLatencyFactor0.5.svg
new file mode 100644
index 0000000000..50fda3fed4
--- /dev/null
+++ b/mintlify-docs/assets/img/ScalingLatencyFactor0.5.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/Threads-per-search.svg b/mintlify-docs/assets/img/Threads-per-search.svg
new file mode 100644
index 0000000000..e0d2b80af0
--- /dev/null
+++ b/mintlify-docs/assets/img/Threads-per-search.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/add-node-move-buckets.svg b/mintlify-docs/assets/img/add-node-move-buckets.svg
new file mode 100644
index 0000000000..1956b8f101
--- /dev/null
+++ b/mintlify-docs/assets/img/add-node-move-buckets.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/app-download-dev.png b/mintlify-docs/assets/img/app-download-dev.png
new file mode 100644
index 0000000000..291c7e2c77
Binary files /dev/null and b/mintlify-docs/assets/img/app-download-dev.png differ
diff --git a/mintlify-docs/assets/img/app-download-prod.png b/mintlify-docs/assets/img/app-download-prod.png
new file mode 100644
index 0000000000..3c8ad0fdd0
Binary files /dev/null and b/mintlify-docs/assets/img/app-download-prod.png differ
diff --git a/mintlify-docs/assets/img/application-key.png b/mintlify-docs/assets/img/application-key.png
new file mode 100644
index 0000000000..ab04a20651
Binary files /dev/null and b/mintlify-docs/assets/img/application-key.png differ
diff --git a/mintlify-docs/assets/img/archive-aws-access-logs.png b/mintlify-docs/assets/img/archive-aws-access-logs.png
new file mode 100644
index 0000000000..cf7288ec64
Binary files /dev/null and b/mintlify-docs/assets/img/archive-aws-access-logs.png differ
diff --git a/mintlify-docs/assets/img/archive-aws-configure-access.png b/mintlify-docs/assets/img/archive-aws-configure-access.png
new file mode 100644
index 0000000000..d3945436b6
Binary files /dev/null and b/mintlify-docs/assets/img/archive-aws-configure-access.png differ
diff --git a/mintlify-docs/assets/img/archive-aws-enclave.png b/mintlify-docs/assets/img/archive-aws-enclave.png
new file mode 100644
index 0000000000..bdc6be12f2
Binary files /dev/null and b/mintlify-docs/assets/img/archive-aws-enclave.png differ
diff --git a/mintlify-docs/assets/img/archive-aws-expanded-dropdown.png b/mintlify-docs/assets/img/archive-aws-expanded-dropdown.png
new file mode 100644
index 0000000000..9fc3369320
Binary files /dev/null and b/mintlify-docs/assets/img/archive-aws-expanded-dropdown.png differ
diff --git a/mintlify-docs/assets/img/archive-azure-access-logs.png b/mintlify-docs/assets/img/archive-azure-access-logs.png
new file mode 100644
index 0000000000..9cb171d905
Binary files /dev/null and b/mintlify-docs/assets/img/archive-azure-access-logs.png differ
diff --git a/mintlify-docs/assets/img/archive-azure-configure-access.png b/mintlify-docs/assets/img/archive-azure-configure-access.png
new file mode 100644
index 0000000000..74afe31d15
Binary files /dev/null and b/mintlify-docs/assets/img/archive-azure-configure-access.png differ
diff --git a/mintlify-docs/assets/img/archive-azure-expanded-dropdown.png b/mintlify-docs/assets/img/archive-azure-expanded-dropdown.png
new file mode 100644
index 0000000000..fb73624fe3
Binary files /dev/null and b/mintlify-docs/assets/img/archive-azure-expanded-dropdown.png differ
diff --git a/mintlify-docs/assets/img/archive-gcp-access-logs.png b/mintlify-docs/assets/img/archive-gcp-access-logs.png
new file mode 100644
index 0000000000..83650969ce
Binary files /dev/null and b/mintlify-docs/assets/img/archive-gcp-access-logs.png differ
diff --git a/mintlify-docs/assets/img/archive-gcp-configure-access.png b/mintlify-docs/assets/img/archive-gcp-configure-access.png
new file mode 100644
index 0000000000..19d27ca1ed
Binary files /dev/null and b/mintlify-docs/assets/img/archive-gcp-configure-access.png differ
diff --git a/mintlify-docs/assets/img/archive-gcp-expanded-dropdown.png b/mintlify-docs/assets/img/archive-gcp-expanded-dropdown.png
new file mode 100644
index 0000000000..6ceb9d8a3f
Binary files /dev/null and b/mintlify-docs/assets/img/archive-gcp-expanded-dropdown.png differ
diff --git a/mintlify-docs/assets/img/attributes-indexes.svg b/mintlify-docs/assets/img/attributes-indexes.svg
new file mode 100644
index 0000000000..4f0285e839
--- /dev/null
+++ b/mintlify-docs/assets/img/attributes-indexes.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/attributes-update.svg b/mintlify-docs/assets/img/attributes-update.svg
new file mode 100644
index 0000000000..b9889dd707
--- /dev/null
+++ b/mintlify-docs/assets/img/attributes-update.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/attributes.svg b/mintlify-docs/assets/img/attributes.svg
new file mode 100644
index 0000000000..d91da73aa5
--- /dev/null
+++ b/mintlify-docs/assets/img/attributes.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/automated-deployment-pin.png b/mintlify-docs/assets/img/automated-deployment-pin.png
new file mode 100644
index 0000000000..410ef27ac9
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployment-pin.png differ
diff --git a/mintlify-docs/assets/img/automated-deployment-production-test.png b/mintlify-docs/assets/img/automated-deployment-production-test.png
new file mode 100644
index 0000000000..23297a0b78
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployment-production-test.png differ
diff --git a/mintlify-docs/assets/img/automated-deployment-restart.png b/mintlify-docs/assets/img/automated-deployment-restart.png
new file mode 100644
index 0000000000..efd0b998e8
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployment-restart.png differ
diff --git a/mintlify-docs/assets/img/automated-deployment-supersede.png b/mintlify-docs/assets/img/automated-deployment-supersede.png
new file mode 100644
index 0000000000..f96a0925ac
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployment-supersede.png differ
diff --git a/mintlify-docs/assets/img/automated-deployments-complex.png b/mintlify-docs/assets/img/automated-deployments-complex.png
new file mode 100644
index 0000000000..c918afb784
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployments-complex.png differ
diff --git a/mintlify-docs/assets/img/automated-deployments-overview.png b/mintlify-docs/assets/img/automated-deployments-overview.png
new file mode 100644
index 0000000000..6e84b83dc2
Binary files /dev/null and b/mintlify-docs/assets/img/automated-deployments-overview.png differ
diff --git a/mintlify-docs/assets/img/block-window.png b/mintlify-docs/assets/img/block-window.png
new file mode 100644
index 0000000000..86407ba457
Binary files /dev/null and b/mintlify-docs/assets/img/block-window.png differ
diff --git a/mintlify-docs/assets/img/bucket-node-sequence.svg b/mintlify-docs/assets/img/bucket-node-sequence.svg
new file mode 100644
index 0000000000..e4e9220241
--- /dev/null
+++ b/mintlify-docs/assets/img/bucket-node-sequence.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/canary-instance-one-app.png b/mintlify-docs/assets/img/canary-instance-one-app.png
new file mode 100644
index 0000000000..80846ae08f
Binary files /dev/null and b/mintlify-docs/assets/img/canary-instance-one-app.png differ
diff --git a/mintlify-docs/assets/img/canaryapp.png b/mintlify-docs/assets/img/canaryapp.png
new file mode 100644
index 0000000000..9460ac3e98
Binary files /dev/null and b/mintlify-docs/assets/img/canaryapp.png differ
diff --git a/mintlify-docs/assets/img/cloud-benchmarks.svg b/mintlify-docs/assets/img/cloud-benchmarks.svg
new file mode 100644
index 0000000000..4b09b59229
--- /dev/null
+++ b/mintlify-docs/assets/img/cloud-benchmarks.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/config-assembly.svg b/mintlify-docs/assets/img/config-assembly.svg
new file mode 100644
index 0000000000..2d31025174
--- /dev/null
+++ b/mintlify-docs/assets/img/config-assembly.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/config-delivery.svg b/mintlify-docs/assets/img/config-delivery.svg
new file mode 100644
index 0000000000..6ff70b08f8
--- /dev/null
+++ b/mintlify-docs/assets/img/config-delivery.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/config-sentinel.svg b/mintlify-docs/assets/img/config-sentinel.svg
new file mode 100644
index 0000000000..4578c8337f
--- /dev/null
+++ b/mintlify-docs/assets/img/config-sentinel.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/console-notifications.png b/mintlify-docs/assets/img/console-notifications.png
new file mode 100644
index 0000000000..5e9733f818
Binary files /dev/null and b/mintlify-docs/assets/img/console-notifications.png differ
diff --git a/mintlify-docs/assets/img/console/autoscale.png b/mintlify-docs/assets/img/console/autoscale.png
new file mode 100644
index 0000000000..59746c1301
Binary files /dev/null and b/mintlify-docs/assets/img/console/autoscale.png differ
diff --git a/mintlify-docs/assets/img/console/delete-production-deployment.png b/mintlify-docs/assets/img/console/delete-production-deployment.png
new file mode 100644
index 0000000000..d828829e2f
Binary files /dev/null and b/mintlify-docs/assets/img/console/delete-production-deployment.png differ
diff --git a/mintlify-docs/assets/img/console/security.png b/mintlify-docs/assets/img/console/security.png
new file mode 100644
index 0000000000..5730d77d46
Binary files /dev/null and b/mintlify-docs/assets/img/console/security.png differ
diff --git a/mintlify-docs/assets/img/console/tuning.png b/mintlify-docs/assets/img/console/tuning.png
new file mode 100644
index 0000000000..54cb529cdd
Binary files /dev/null and b/mintlify-docs/assets/img/console/tuning.png differ
diff --git a/mintlify-docs/assets/img/console/upgrade.png b/mintlify-docs/assets/img/console/upgrade.png
new file mode 100644
index 0000000000..93c2bec0f7
Binary files /dev/null and b/mintlify-docs/assets/img/console/upgrade.png differ
diff --git a/mintlify-docs/assets/img/console/zone-overview.png b/mintlify-docs/assets/img/console/zone-overview.png
new file mode 100644
index 0000000000..fd9d9dd5e3
Binary files /dev/null and b/mintlify-docs/assets/img/console/zone-overview.png differ
diff --git a/mintlify-docs/assets/img/container-components.svg b/mintlify-docs/assets/img/container-components.svg
new file mode 100644
index 0000000000..fc7724d763
--- /dev/null
+++ b/mintlify-docs/assets/img/container-components.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/dandelion-song.png b/mintlify-docs/assets/img/dandelion-song.png
new file mode 100644
index 0000000000..1b77f46a1c
Binary files /dev/null and b/mintlify-docs/assets/img/dandelion-song.png differ
diff --git a/mintlify-docs/assets/img/dashboard.png b/mintlify-docs/assets/img/dashboard.png
new file mode 100644
index 0000000000..80713a1535
Binary files /dev/null and b/mintlify-docs/assets/img/dashboard.png differ
diff --git a/mintlify-docs/assets/img/deployment-with-system-test.png b/mintlify-docs/assets/img/deployment-with-system-test.png
new file mode 100644
index 0000000000..c43f8cbed9
Binary files /dev/null and b/mintlify-docs/assets/img/deployment-with-system-test.png differ
diff --git a/mintlify-docs/assets/img/diversity-1.png b/mintlify-docs/assets/img/diversity-1.png
new file mode 100644
index 0000000000..497349440d
Binary files /dev/null and b/mintlify-docs/assets/img/diversity-1.png differ
diff --git a/mintlify-docs/assets/img/document-processing-class-diagram.svg b/mintlify-docs/assets/img/document-processing-class-diagram.svg
new file mode 100644
index 0000000000..dcf3a63921
--- /dev/null
+++ b/mintlify-docs/assets/img/document-processing-class-diagram.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/ecommerce-facets.png b/mintlify-docs/assets/img/ecommerce-facets.png
new file mode 100644
index 0000000000..c1db5c2c06
Binary files /dev/null and b/mintlify-docs/assets/img/ecommerce-facets.png differ
diff --git a/mintlify-docs/assets/img/elastic-fail.svg b/mintlify-docs/assets/img/elastic-fail.svg
new file mode 100644
index 0000000000..bc66a7f0c8
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-fail.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/elastic-feed-container.svg b/mintlify-docs/assets/img/elastic-feed-container.svg
new file mode 100644
index 0000000000..6f4e60b3bc
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-feed-container.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/elastic-feed-vespafeeder.svg b/mintlify-docs/assets/img/elastic-feed-vespafeeder.svg
new file mode 100644
index 0000000000..7e31c97cdb
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-feed-vespafeeder.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/elastic-feed.svg b/mintlify-docs/assets/img/elastic-feed.svg
new file mode 100644
index 0000000000..ff159a93af
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-feed.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/elastic-grow.svg b/mintlify-docs/assets/img/elastic-grow.svg
new file mode 100644
index 0000000000..77b1df9646
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-grow.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/elastic-visit-get.svg b/mintlify-docs/assets/img/elastic-visit-get.svg
new file mode 100644
index 0000000000..ef7b5c24db
--- /dev/null
+++ b/mintlify-docs/assets/img/elastic-visit-get.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/enclave-architecture.png b/mintlify-docs/assets/img/enclave-architecture.png
new file mode 100644
index 0000000000..3452c556ce
Binary files /dev/null and b/mintlify-docs/assets/img/enclave-architecture.png differ
diff --git a/mintlify-docs/assets/img/federation-simple.svg b/mintlify-docs/assets/img/federation-simple.svg
new file mode 100644
index 0000000000..5ee055f66c
--- /dev/null
+++ b/mintlify-docs/assets/img/federation-simple.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/federation.svg b/mintlify-docs/assets/img/federation.svg
new file mode 100644
index 0000000000..0125c63a49
--- /dev/null
+++ b/mintlify-docs/assets/img/federation.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/flat-content-distribution.svg b/mintlify-docs/assets/img/flat-content-distribution.svg
new file mode 100644
index 0000000000..1cfa710fcc
--- /dev/null
+++ b/mintlify-docs/assets/img/flat-content-distribution.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/free-trial.png b/mintlify-docs/assets/img/free-trial.png
new file mode 100644
index 0000000000..509085a108
Binary files /dev/null and b/mintlify-docs/assets/img/free-trial.png differ
diff --git a/mintlify-docs/assets/img/geo/path1.png b/mintlify-docs/assets/img/geo/path1.png
new file mode 100644
index 0000000000..14adfc650c
Binary files /dev/null and b/mintlify-docs/assets/img/geo/path1.png differ
diff --git a/mintlify-docs/assets/img/geo/path2.png b/mintlify-docs/assets/img/geo/path2.png
new file mode 100644
index 0000000000..0b11df1e36
Binary files /dev/null and b/mintlify-docs/assets/img/geo/path2.png differ
diff --git a/mintlify-docs/assets/img/geo/path3.png b/mintlify-docs/assets/img/geo/path3.png
new file mode 100644
index 0000000000..70fbdf7ffc
Binary files /dev/null and b/mintlify-docs/assets/img/geo/path3.png differ
diff --git a/mintlify-docs/assets/img/geo/path4.png b/mintlify-docs/assets/img/geo/path4.png
new file mode 100644
index 0000000000..b1fdfdd8d6
Binary files /dev/null and b/mintlify-docs/assets/img/geo/path4.png differ
diff --git a/mintlify-docs/assets/img/geo/path5.png b/mintlify-docs/assets/img/geo/path5.png
new file mode 100644
index 0000000000..8cbca7c2b4
Binary files /dev/null and b/mintlify-docs/assets/img/geo/path5.png differ
diff --git a/mintlify-docs/assets/img/grafana-metrics.png b/mintlify-docs/assets/img/grafana-metrics.png
new file mode 100644
index 0000000000..fe13ee07ac
Binary files /dev/null and b/mintlify-docs/assets/img/grafana-metrics.png differ
diff --git a/mintlify-docs/assets/img/grouped-content-distribution.svg b/mintlify-docs/assets/img/grouped-content-distribution.svg
new file mode 100644
index 0000000000..23fbdc8e59
--- /dev/null
+++ b/mintlify-docs/assets/img/grouped-content-distribution.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/grouped-topology.svg b/mintlify-docs/assets/img/grouped-topology.svg
new file mode 100644
index 0000000000..209d5d10b8
--- /dev/null
+++ b/mintlify-docs/assets/img/grouped-topology.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/health-api.svg b/mintlify-docs/assets/img/health-api.svg
new file mode 100644
index 0000000000..cd9a8d498b
--- /dev/null
+++ b/mintlify-docs/assets/img/health-api.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/ide.gif b/mintlify-docs/assets/img/ide.gif
new file mode 100644
index 0000000000..ca3520635e
Binary files /dev/null and b/mintlify-docs/assets/img/ide.gif differ
diff --git a/mintlify-docs/assets/img/index-bootstrap.svg b/mintlify-docs/assets/img/index-bootstrap.svg
new file mode 100644
index 0000000000..0f0c2198fa
--- /dev/null
+++ b/mintlify-docs/assets/img/index-bootstrap.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/inheritance-overview.svg b/mintlify-docs/assets/img/inheritance-overview.svg
new file mode 100644
index 0000000000..a3bebf544b
--- /dev/null
+++ b/mintlify-docs/assets/img/inheritance-overview.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/instances-zones.svg b/mintlify-docs/assets/img/instances-zones.svg
new file mode 100644
index 0000000000..279e637590
--- /dev/null
+++ b/mintlify-docs/assets/img/instances-zones.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/jvm-dump.png b/mintlify-docs/assets/img/jvm-dump.png
new file mode 100644
index 0000000000..0bc54ee579
Binary files /dev/null and b/mintlify-docs/assets/img/jvm-dump.png differ
diff --git a/mintlify-docs/assets/img/latency-documents.svg b/mintlify-docs/assets/img/latency-documents.svg
new file mode 100644
index 0000000000..d05201c06a
--- /dev/null
+++ b/mintlify-docs/assets/img/latency-documents.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/latency-rank-profile.png b/mintlify-docs/assets/img/latency-rank-profile.png
new file mode 100644
index 0000000000..7c96af4077
Binary files /dev/null and b/mintlify-docs/assets/img/latency-rank-profile.png differ
diff --git a/mintlify-docs/assets/img/latency-total.png b/mintlify-docs/assets/img/latency-total.png
new file mode 100644
index 0000000000..d55b0dfb7a
Binary files /dev/null and b/mintlify-docs/assets/img/latency-total.png differ
diff --git a/mintlify-docs/assets/img/llm-rag-searcher.svg b/mintlify-docs/assets/img/llm-rag-searcher.svg
new file mode 100644
index 0000000000..2d978b9a3c
--- /dev/null
+++ b/mintlify-docs/assets/img/llm-rag-searcher.svg
@@ -0,0 +1,21 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/load.png b/mintlify-docs/assets/img/load.png
new file mode 100644
index 0000000000..88299ca525
Binary files /dev/null and b/mintlify-docs/assets/img/load.png differ
diff --git a/mintlify-docs/assets/img/lose-node-move-buckets.svg b/mintlify-docs/assets/img/lose-node-move-buckets.svg
new file mode 100644
index 0000000000..6f643e6231
--- /dev/null
+++ b/mintlify-docs/assets/img/lose-node-move-buckets.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/manage-users.png b/mintlify-docs/assets/img/manage-users.png
new file mode 100644
index 0000000000..c1ed9d0c60
Binary files /dev/null and b/mintlify-docs/assets/img/manage-users.png differ
diff --git a/mintlify-docs/assets/img/memory-visualizer-1.png b/mintlify-docs/assets/img/memory-visualizer-1.png
new file mode 100644
index 0000000000..0ddd21f65d
Binary files /dev/null and b/mintlify-docs/assets/img/memory-visualizer-1.png differ
diff --git a/mintlify-docs/assets/img/metrics-api.svg b/mintlify-docs/assets/img/metrics-api.svg
new file mode 100644
index 0000000000..cbf09fbc36
--- /dev/null
+++ b/mintlify-docs/assets/img/metrics-api.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/monitoring-annotations-example.png b/mintlify-docs/assets/img/monitoring-annotations-example.png
new file mode 100644
index 0000000000..5fbb473965
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-annotations-example.png differ
diff --git a/mintlify-docs/assets/img/monitoring-container-thread-pools.png b/mintlify-docs/assets/img/monitoring-container-thread-pools.png
new file mode 100644
index 0000000000..6066cc495c
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-container-thread-pools.png differ
diff --git a/mintlify-docs/assets/img/monitoring-dashboard-tabs.png b/mintlify-docs/assets/img/monitoring-dashboard-tabs.png
new file mode 100644
index 0000000000..2bcd50674c
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-dashboard-tabs.png differ
diff --git a/mintlify-docs/assets/img/monitoring-getting-started.svg b/mintlify-docs/assets/img/monitoring-getting-started.svg
new file mode 100644
index 0000000000..ec65fb28b9
--- /dev/null
+++ b/mintlify-docs/assets/img/monitoring-getting-started.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/monitoring-health-indicators.png b/mintlify-docs/assets/img/monitoring-health-indicators.png
new file mode 100644
index 0000000000..5186c9f67e
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-health-indicators.png differ
diff --git a/mintlify-docs/assets/img/monitoring-jvm-memory.png b/mintlify-docs/assets/img/monitoring-jvm-memory.png
new file mode 100644
index 0000000000..1390207d0e
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-jvm-memory.png differ
diff --git a/mintlify-docs/assets/img/monitoring-rank-profile-rows.png b/mintlify-docs/assets/img/monitoring-rank-profile-rows.png
new file mode 100644
index 0000000000..93bd1e0cdf
Binary files /dev/null and b/mintlify-docs/assets/img/monitoring-rank-profile-rows.png differ
diff --git a/mintlify-docs/assets/img/nodes.svg b/mintlify-docs/assets/img/nodes.svg
new file mode 100644
index 0000000000..c0febd35da
--- /dev/null
+++ b/mintlify-docs/assets/img/nodes.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/overall-architecture.png b/mintlify-docs/assets/img/overall-architecture.png
new file mode 100644
index 0000000000..0bae54dc67
Binary files /dev/null and b/mintlify-docs/assets/img/overall-architecture.png differ
diff --git a/mintlify-docs/assets/img/parent-child.svg b/mintlify-docs/assets/img/parent-child.svg
new file mode 100644
index 0000000000..4d379dde2d
--- /dev/null
+++ b/mintlify-docs/assets/img/parent-child.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/phased-ranking-rag.png b/mintlify-docs/assets/img/phased-ranking-rag.png
new file mode 100644
index 0000000000..cf72a5ccd2
Binary files /dev/null and b/mintlify-docs/assets/img/phased-ranking-rag.png differ
diff --git a/mintlify-docs/assets/img/phased-ranking.png b/mintlify-docs/assets/img/phased-ranking.png
new file mode 100644
index 0000000000..72bee90c1e
Binary files /dev/null and b/mintlify-docs/assets/img/phased-ranking.png differ
diff --git a/mintlify-docs/assets/img/pin-version.png b/mintlify-docs/assets/img/pin-version.png
new file mode 100644
index 0000000000..5267da3226
Binary files /dev/null and b/mintlify-docs/assets/img/pin-version.png differ
diff --git a/mintlify-docs/assets/img/pipeline-1.png b/mintlify-docs/assets/img/pipeline-1.png
new file mode 100644
index 0000000000..286c143075
Binary files /dev/null and b/mintlify-docs/assets/img/pipeline-1.png differ
diff --git a/mintlify-docs/assets/img/pipeline-2.png b/mintlify-docs/assets/img/pipeline-2.png
new file mode 100644
index 0000000000..15d8458b7c
Binary files /dev/null and b/mintlify-docs/assets/img/pipeline-2.png differ
diff --git a/mintlify-docs/assets/img/pipeline-3.png b/mintlify-docs/assets/img/pipeline-3.png
new file mode 100644
index 0000000000..a202c29720
Binary files /dev/null and b/mintlify-docs/assets/img/pipeline-3.png differ
diff --git a/mintlify-docs/assets/img/prodapp.png b/mintlify-docs/assets/img/prodapp.png
new file mode 100644
index 0000000000..19a28fcf85
Binary files /dev/null and b/mintlify-docs/assets/img/prodapp.png differ
diff --git a/mintlify-docs/assets/img/proton-databases.svg b/mintlify-docs/assets/img/proton-databases.svg
new file mode 100644
index 0000000000..a7398a6862
--- /dev/null
+++ b/mintlify-docs/assets/img/proton-databases.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/proton-feed.svg b/mintlify-docs/assets/img/proton-feed.svg
new file mode 100644
index 0000000000..4c48980b7e
--- /dev/null
+++ b/mintlify-docs/assets/img/proton-feed.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/proton-query.svg b/mintlify-docs/assets/img/proton-query.svg
new file mode 100644
index 0000000000..ec3819789b
--- /dev/null
+++ b/mintlify-docs/assets/img/proton-query.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/query-dispatch.svg b/mintlify-docs/assets/img/query-dispatch.svg
new file mode 100644
index 0000000000..645603f83a
--- /dev/null
+++ b/mintlify-docs/assets/img/query-dispatch.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/query-groups.svg b/mintlify-docs/assets/img/query-groups.svg
new file mode 100644
index 0000000000..1406c36fa3
--- /dev/null
+++ b/mintlify-docs/assets/img/query-groups.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/query-to-response.svg b/mintlify-docs/assets/img/query-to-response.svg
new file mode 100644
index 0000000000..390675f334
--- /dev/null
+++ b/mintlify-docs/assets/img/query-to-response.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/querytree.svg b/mintlify-docs/assets/img/querytree.svg
new file mode 100644
index 0000000000..7737d09bde
--- /dev/null
+++ b/mintlify-docs/assets/img/querytree.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/reindex-progress.png b/mintlify-docs/assets/img/reindex-progress.png
new file mode 100644
index 0000000000..190689e3fe
Binary files /dev/null and b/mintlify-docs/assets/img/reindex-progress.png differ
diff --git a/mintlify-docs/assets/img/relevance/blog-freshness.png b/mintlify-docs/assets/img/relevance/blog-freshness.png
new file mode 100644
index 0000000000..03f8f567a1
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/blog-freshness.png differ
diff --git a/mintlify-docs/assets/img/relevance/closeness-logscale.png b/mintlify-docs/assets/img/relevance/closeness-logscale.png
new file mode 100644
index 0000000000..77048438a3
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/closeness-logscale.png differ
diff --git a/mintlify-docs/assets/img/relevance/freshness-logscale.png b/mintlify-docs/assets/img/relevance/freshness-logscale.png
new file mode 100644
index 0000000000..7cd8e46c19
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/freshness-logscale.png differ
diff --git a/mintlify-docs/assets/img/relevance/match-phase-max-hits.png b/mintlify-docs/assets/img/relevance/match-phase-max-hits.png
new file mode 100644
index 0000000000..018a10db6f
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/match-phase-max-hits.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-firstocc-tune.png b/mintlify-docs/assets/img/relevance/plot-firstocc-tune.png
new file mode 100644
index 0000000000..63feaf5a11
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-firstocc-tune.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-firstocc-weight.png b/mintlify-docs/assets/img/relevance/plot-firstocc-weight.png
new file mode 100644
index 0000000000..fd9ddf3cc7
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-firstocc-weight.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-numocc-tune.png b/mintlify-docs/assets/img/relevance/plot-numocc-tune.png
new file mode 100644
index 0000000000..4f862ed70e
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-numocc-tune.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-numocc-weight.png b/mintlify-docs/assets/img/relevance/plot-numocc-weight.png
new file mode 100644
index 0000000000..417355b56d
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-numocc-weight.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-proximity-tune.png b/mintlify-docs/assets/img/relevance/plot-proximity-tune.png
new file mode 100644
index 0000000000..22c1567b5d
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-proximity-tune.png differ
diff --git a/mintlify-docs/assets/img/relevance/plot-proximity-weight.png b/mintlify-docs/assets/img/relevance/plot-proximity-weight.png
new file mode 100644
index 0000000000..ef3e966c32
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/plot-proximity-weight.png differ
diff --git a/mintlify-docs/assets/img/relevance/ranktype-about.png b/mintlify-docs/assets/img/relevance/ranktype-about.png
new file mode 100644
index 0000000000..59dc03ea52
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/ranktype-about.png differ
diff --git a/mintlify-docs/assets/img/relevance/ranktype-tags.png b/mintlify-docs/assets/img/relevance/ranktype-tags.png
new file mode 100644
index 0000000000..a5ce1e2e6b
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/ranktype-tags.png differ
diff --git a/mintlify-docs/assets/img/relevance/segment-example.png b/mintlify-docs/assets/img/relevance/segment-example.png
new file mode 100644
index 0000000000..2cff8d687b
Binary files /dev/null and b/mintlify-docs/assets/img/relevance/segment-example.png differ
diff --git a/mintlify-docs/assets/img/resource-suggestions-1.png b/mintlify-docs/assets/img/resource-suggestions-1.png
new file mode 100644
index 0000000000..d17ba7abe1
Binary files /dev/null and b/mintlify-docs/assets/img/resource-suggestions-1.png differ
diff --git a/mintlify-docs/assets/img/retrieval-ranking.svg b/mintlify-docs/assets/img/retrieval-ranking.svg
new file mode 100644
index 0000000000..4dea18de15
--- /dev/null
+++ b/mintlify-docs/assets/img/retrieval-ranking.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/routing.svg b/mintlify-docs/assets/img/routing.svg
new file mode 100644
index 0000000000..6c595cfd6c
--- /dev/null
+++ b/mintlify-docs/assets/img/routing.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/rpms.svg b/mintlify-docs/assets/img/rpms.svg
new file mode 100644
index 0000000000..6cc042c9e0
--- /dev/null
+++ b/mintlify-docs/assets/img/rpms.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/schemas-and-content-clusters-multiple-proton.svg b/mintlify-docs/assets/img/schemas-and-content-clusters-multiple-proton.svg
new file mode 100644
index 0000000000..123c991d77
--- /dev/null
+++ b/mintlify-docs/assets/img/schemas-and-content-clusters-multiple-proton.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/schemas-and-content-clusters.svg b/mintlify-docs/assets/img/schemas-and-content-clusters.svg
new file mode 100644
index 0000000000..633356233b
--- /dev/null
+++ b/mintlify-docs/assets/img/schemas-and-content-clusters.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/secret-store-secret.png b/mintlify-docs/assets/img/secret-store-secret.png
new file mode 100644
index 0000000000..79f17de442
Binary files /dev/null and b/mintlify-docs/assets/img/secret-store-secret.png differ
diff --git a/mintlify-docs/assets/img/secret-store.png b/mintlify-docs/assets/img/secret-store.png
new file mode 100644
index 0000000000..5fc3c2ed3f
Binary files /dev/null and b/mintlify-docs/assets/img/secret-store.png differ
diff --git a/mintlify-docs/assets/img/service-isolation.png b/mintlify-docs/assets/img/service-isolation.png
new file mode 100644
index 0000000000..9d91956f23
Binary files /dev/null and b/mintlify-docs/assets/img/service-isolation.png differ
diff --git a/mintlify-docs/assets/img/shopping-1.png b/mintlify-docs/assets/img/shopping-1.png
new file mode 100644
index 0000000000..d28e02c885
Binary files /dev/null and b/mintlify-docs/assets/img/shopping-1.png differ
diff --git a/mintlify-docs/assets/img/skip-tests.png b/mintlify-docs/assets/img/skip-tests.png
new file mode 100644
index 0000000000..3fae9b622a
Binary files /dev/null and b/mintlify-docs/assets/img/skip-tests.png differ
diff --git a/mintlify-docs/assets/img/support-dev.png b/mintlify-docs/assets/img/support-dev.png
new file mode 100644
index 0000000000..39d0564941
Binary files /dev/null and b/mintlify-docs/assets/img/support-dev.png differ
diff --git a/mintlify-docs/assets/img/support-prod.png b/mintlify-docs/assets/img/support-prod.png
new file mode 100644
index 0000000000..85d79ede6d
Binary files /dev/null and b/mintlify-docs/assets/img/support-prod.png differ
diff --git a/mintlify-docs/assets/img/tenants-apps-instances.svg b/mintlify-docs/assets/img/tenants-apps-instances.svg
new file mode 100644
index 0000000000..3803ba59c5
--- /dev/null
+++ b/mintlify-docs/assets/img/tenants-apps-instances.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/tensor-guide.png b/mintlify-docs/assets/img/tensor-guide.png
new file mode 100644
index 0000000000..6f69361507
Binary files /dev/null and b/mintlify-docs/assets/img/tensor-guide.png differ
diff --git a/mintlify-docs/assets/img/tensor-mapped.png b/mintlify-docs/assets/img/tensor-mapped.png
new file mode 100644
index 0000000000..9d31d12aa6
Binary files /dev/null and b/mintlify-docs/assets/img/tensor-mapped.png differ
diff --git a/mintlify-docs/assets/img/tutorials/bm25_dotP_scatter.png b/mintlify-docs/assets/img/tutorials/bm25_dotP_scatter.png
new file mode 100644
index 0000000000..da9d5db585
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/bm25_dotP_scatter.png differ
diff --git a/mintlify-docs/assets/img/tutorials/bm25_hist.png b/mintlify-docs/assets/img/tutorials/bm25_hist.png
new file mode 100644
index 0000000000..0a370bfde6
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/bm25_hist.png differ
diff --git a/mintlify-docs/assets/img/tutorials/dotP_hist.png b/mintlify-docs/assets/img/tutorials/dotP_hist.png
new file mode 100644
index 0000000000..8160eacb12
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/dotP_hist.png differ
diff --git a/mintlify-docs/assets/img/tutorials/embeddings.png b/mintlify-docs/assets/img/tutorials/embeddings.png
new file mode 100644
index 0000000000..f882541f35
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/embeddings.png differ
diff --git a/mintlify-docs/assets/img/tutorials/mf.png b/mintlify-docs/assets/img/tutorials/mf.png
new file mode 100644
index 0000000000..10fe24c646
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/mf.png differ
diff --git a/mintlify-docs/assets/img/tutorials/mrr_boxplot.png b/mintlify-docs/assets/img/tutorials/mrr_boxplot.png
new file mode 100644
index 0000000000..3041741114
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/mrr_boxplot.png differ
diff --git a/mintlify-docs/assets/img/tutorials/rag-blueprint-overview.svg b/mintlify-docs/assets/img/tutorials/rag-blueprint-overview.svg
new file mode 100644
index 0000000000..f6a9d0c7b8
--- /dev/null
+++ b/mintlify-docs/assets/img/tutorials/rag-blueprint-overview.svg
@@ -0,0 +1,103 @@
+
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/tutorials/text_search_baseline_pointwise_listwise_rr.png b/mintlify-docs/assets/img/tutorials/text_search_baseline_pointwise_listwise_rr.png
new file mode 100644
index 0000000000..c8adbb19c5
Binary files /dev/null and b/mintlify-docs/assets/img/tutorials/text_search_baseline_pointwise_listwise_rr.png differ
diff --git a/mintlify-docs/assets/img/vespa-cloud-enclave-aws.png b/mintlify-docs/assets/img/vespa-cloud-enclave-aws.png
new file mode 100644
index 0000000000..0f7f87278c
Binary files /dev/null and b/mintlify-docs/assets/img/vespa-cloud-enclave-aws.png differ
diff --git a/mintlify-docs/assets/img/vespa-cloud-enclave-azure.png b/mintlify-docs/assets/img/vespa-cloud-enclave-azure.png
new file mode 100644
index 0000000000..dce93b5c4a
Binary files /dev/null and b/mintlify-docs/assets/img/vespa-cloud-enclave-azure.png differ
diff --git a/mintlify-docs/assets/img/vespa-cloud-enclave-gcp.png b/mintlify-docs/assets/img/vespa-cloud-enclave-gcp.png
new file mode 100644
index 0000000000..18c4d14f39
Binary files /dev/null and b/mintlify-docs/assets/img/vespa-cloud-enclave-gcp.png differ
diff --git a/mintlify-docs/assets/img/vespa-operator-architecture.png b/mintlify-docs/assets/img/vespa-operator-architecture.png
new file mode 100644
index 0000000000..d1602f5862
Binary files /dev/null and b/mintlify-docs/assets/img/vespa-operator-architecture.png differ
diff --git a/mintlify-docs/assets/img/vespa-overview-docproc.svg b/mintlify-docs/assets/img/vespa-overview-docproc.svg
new file mode 100644
index 0000000000..4578e7780b
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview-docproc.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/vespa-overview-embeddings-1.svg b/mintlify-docs/assets/img/vespa-overview-embeddings-1.svg
new file mode 100644
index 0000000000..a41d3e6b0a
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview-embeddings-1.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/vespa-overview-embeddings-2.svg b/mintlify-docs/assets/img/vespa-overview-embeddings-2.svg
new file mode 100644
index 0000000000..41f027ae2c
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview-embeddings-2.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/vespa-overview-linguistics.svg b/mintlify-docs/assets/img/vespa-overview-linguistics.svg
new file mode 100644
index 0000000000..69638142ea
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview-linguistics.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/vespa-overview-searcher.svg b/mintlify-docs/assets/img/vespa-overview-searcher.svg
new file mode 100644
index 0000000000..5cdbf73cc7
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview-searcher.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/vespa-overview.svg b/mintlify-docs/assets/img/vespa-overview.svg
new file mode 100644
index 0000000000..e70332d35c
--- /dev/null
+++ b/mintlify-docs/assets/img/vespa-overview.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/img/video-thumbs/deploying-a-vespa-searcher.png b/mintlify-docs/assets/img/video-thumbs/deploying-a-vespa-searcher.png
new file mode 100644
index 0000000000..f89d3ac230
Binary files /dev/null and b/mintlify-docs/assets/img/video-thumbs/deploying-a-vespa-searcher.png differ
diff --git a/mintlify-docs/assets/img/vpc-1.png b/mintlify-docs/assets/img/vpc-1.png
new file mode 100644
index 0000000000..796cb73afc
Binary files /dev/null and b/mintlify-docs/assets/img/vpc-1.png differ
diff --git a/mintlify-docs/assets/img/vpc-2.png b/mintlify-docs/assets/img/vpc-2.png
new file mode 100644
index 0000000000..76cd84a932
Binary files /dev/null and b/mintlify-docs/assets/img/vpc-2.png differ
diff --git a/mintlify-docs/assets/logos/Vespa-logo-dark-rgb.svg b/mintlify-docs/assets/logos/Vespa-logo-dark-rgb.svg
new file mode 100644
index 0000000000..6f3c141bb7
--- /dev/null
+++ b/mintlify-docs/assets/logos/Vespa-logo-dark-rgb.svg
@@ -0,0 +1,26 @@
+
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/logos/Vespa-logo-white-rgb.svg b/mintlify-docs/assets/logos/Vespa-logo-white-rgb.svg
new file mode 100644
index 0000000000..fdb0431d01
--- /dev/null
+++ b/mintlify-docs/assets/logos/Vespa-logo-white-rgb.svg
@@ -0,0 +1,26 @@
+
+
\ No newline at end of file
diff --git a/mintlify-docs/assets/logos/logo.svg b/mintlify-docs/assets/logos/logo.svg
new file mode 100644
index 0000000000..fed667cadf
--- /dev/null
+++ b/mintlify-docs/assets/logos/logo.svg
@@ -0,0 +1,16 @@
+
diff --git a/mintlify-docs/assets/logos/vespa-logo-green-rgb.svg b/mintlify-docs/assets/logos/vespa-logo-green-rgb.svg
new file mode 100644
index 0000000000..7ba3e2dd20
--- /dev/null
+++ b/mintlify-docs/assets/logos/vespa-logo-green-rgb.svg
@@ -0,0 +1,26 @@
+
+
\ No newline at end of file
diff --git a/mintlify-docs/docs.json b/mintlify-docs/docs.json
new file mode 100644
index 0000000000..0714262745
--- /dev/null
+++ b/mintlify-docs/docs.json
@@ -0,0 +1,623 @@
+{
+ "$schema": "https://mintlify.com/docs.json",
+ "theme": "linden",
+ "name": "Vespa Documentation",
+ "colors": {
+ "primary": "#61D790",
+ "light": "#61D790",
+ "dark": "#61D790"
+ },
+ "favicon": "/favicon.png",
+ "appearance": {
+ "default": "dark"
+ },
+ "fonts": {
+ "heading": {
+ "family": "Roobert",
+ "source": "https://vespa.ai/vespa-content/themes/website-wp-theme/fonts/Roobert-Medium.woff",
+ "format": "woff"
+ },
+ "body": {
+ "family": "Roobert",
+ "source": "https://vespa.ai/vespa-content/themes/website-wp-theme/fonts/Roobert-Regular.woff",
+ "format": "woff"
+ }
+ },
+ "navigation": {
+ "tabs": [
+ {
+ "tab": "Home",
+ "icon": "house",
+ "pages": ["index"]
+ },
+ {
+ "tab": "Guides",
+ "icon": "book-open",
+ "pages": [
+ {
+ "group": "Vespa Basics",
+ "pages": [
+ "en/basics/deploy-an-application",
+ "en/basics/applications",
+ "en/basics/schemas",
+ "en/basics/writing",
+ "en/basics/querying",
+ "en/basics/ranking",
+ "en/basics/operations",
+ "en/basics/whats-more"
+ ]
+ },
+ {
+ "group": "Learn More",
+ "pages": [
+ "en/learn/overview",
+ "en/learn/llm-help",
+ "en/learn/features",
+ "en/learn/tutorials",
+ "en/learn/glossary",
+ "en/learn/releases",
+ "en/learn/tenant-apps-instances",
+ "en/learn/migrating-to-cloud",
+ "en/learn/migrating-from-elastic-search",
+ "en/learn/about-documentation",
+ "en/learn/contributing"
+ ]
+ },
+ {
+ "group": "Applications & Components",
+ "pages": [
+ "en/applications/developer-guide",
+ "en/applications/ide-support",
+ "en/applications/deployment",
+ "en/applications/vespaignore",
+ "en/applications/containers",
+ "en/applications/components",
+ "en/applications/searchers",
+ "en/applications/document-processors",
+ "en/applications/request-handlers",
+ "en/applications/result-renderers",
+ "en/applications/dependency-injection",
+ "en/applications/configuring-components",
+ "en/applications/chaining",
+ "en/applications/inspecting-structured-data",
+ "en/applications/web-services",
+ "en/applications/unit-testing",
+ "en/applications/testing",
+ "en/applications/config-system",
+ "en/applications/processing",
+ "en/applications/bundles",
+ "en/applications/using-zookeeper",
+ "en/applications/http-servers-and-filters",
+ "en/applications/pluggable-frameworks",
+ "en/applications/configapi-dev"
+ ]
+ },
+ {
+ "group": "Schemas and documents",
+ "pages": [
+ "en/schemas/documents",
+ "en/schemas/inheritance-in-schemas",
+ "en/schemas/concrete-documents",
+ "en/schemas/parent-child",
+ "en/schemas/structs",
+ "en/schemas/predicate-fields",
+ "en/schemas/exposing-schema-information"
+ ]
+ },
+ {
+ "group": "Reading and writing",
+ "pages": [
+ "en/writing/reads-and-writes",
+ "en/writing/document-v1-api-guide",
+ "en/writing/indexing",
+ "en/writing/initial-batch-feed",
+ "en/writing/visiting",
+ "en/writing/document-api-guide",
+ "en/writing/partial-updates",
+ "en/writing/batch-delete",
+ "en/writing/feed-block",
+ "en/writing/document-routing",
+ "en/writing/indexing-paged-vectors"
+ ]
+ },
+ {
+ "group": "Querying",
+ "pages": [
+ "en/querying/query-api",
+ "en/querying/query-language",
+ "en/querying/grouping",
+ "en/querying/federation",
+ "en/querying/query-profiles",
+ "en/querying/vector-search-intro",
+ "en/querying/nearest-neighbor-search",
+ "en/querying/approximate-nn-hnsw",
+ "en/querying/nearest-neighbor-search-guide",
+ "en/querying/text-matching",
+ "en/querying/searching-multivalue-fields",
+ "en/querying/geo-search",
+ "en/querying/document-summaries",
+ "en/querying/result-diversity",
+ "en/querying/page-templates"
+ ]
+ },
+ {
+ "group": "Ranking and inference",
+ "pages": [
+ "en/ranking/ranking-intro",
+ "en/ranking/ranking-expressions-features",
+ "en/ranking/multivalue-query-operators",
+ "en/ranking/tensor-user-guide",
+ "en/ranking/tensor-examples",
+ "en/ranking/phased-ranking",
+ "en/ranking/tensorflow",
+ "en/ranking/onnx",
+ "en/ranking/xgboost",
+ "en/ranking/lightgbm",
+ "en/ranking/wand",
+ "en/ranking/bm25",
+ "en/ranking/nativerank",
+ "en/ranking/cross-encoders",
+ "en/ranking/reranking-in-searcher",
+ "en/ranking/significance",
+ "en/ranking/stateless-model-evaluation"
+ ]
+ },
+ {
+ "group": "RAG and embedding",
+ "pages": [
+ "en/rag/rag",
+ "en/rag/working-with-chunks",
+ "en/rag/embedding",
+ "en/rag/binarizing-vectors",
+ "en/rag/llms-in-vespa",
+ "en/rag/local-llms",
+ "en/rag/external-llms",
+ "en/rag/document-enrichment",
+ "en/rag/model-hub"
+ ]
+ },
+ {
+ "group": "Linguistics and text processing",
+ "pages": [
+ {
+ "group": "Linguistics",
+ "pages": [
+ "en/linguistics/linguistics",
+ "en/linguistics/linguistics-opennlp",
+ "en/linguistics/lucene-linguistics",
+ "en/linguistics/linguistics-custom"
+ ]
+ },
+ "en/linguistics/query-rewriting",
+ "en/linguistics/troubleshooting-encoding"
+ ]
+ },
+ {
+ "group": "Content and elasticity",
+ "pages": [
+ "en/content/proton",
+ "en/content/content-nodes",
+ "en/content/elasticity",
+ "en/content/attributes",
+ "en/content/consistency",
+ "en/content/idealstate",
+ "en/content/buckets"
+
+ ]
+ },
+ {
+ "group": "Performance",
+ "pages": [
+ "en/performance",
+ "en/performance/practical-search-performance-guide",
+ "en/performance/sizing-search",
+ "en/performance/sizing-feeding",
+ "en/performance/node-resources",
+ {
+ "group": "Instance types",
+ "pages": [
+ "en/performance/instance-types/aws-instance-types",
+ "en/performance/instance-types/gcp-instance-types",
+ "en/performance/instance-types/azure-instance-types"
+ ]
+ },
+ "en/performance/topology-and-resizing",
+ "en/performance/streaming-search",
+ "en/performance/benchmarking",
+ "en/performance/benchmarking-cloud",
+ "en/performance/memory-visualizer",
+ "en/performance/profiling",
+ "en/performance/container-tuning",
+ "en/performance/rate-limiting-searcher",
+ "en/performance/graceful-degradation",
+ "en/performance/caches-in-vespa",
+ "en/performance/container-http",
+ "en/performance/http2",
+ "en/performance/feature-tuning",
+ "en/performance/valgrind"
+ ]
+ },
+ {
+ "group": "Operations",
+ "pages": [
+ "en/cloud/quota",
+ "en/operations/environments",
+ "en/operations/zones",
+ "en/operations/production-deployment",
+ "en/operations/deployment-variants",
+ "en/operations/automated-deployments",
+ "en/operations/autoscaling",
+ {
+ "group": "Enclave: Bring your own cloud",
+ "pages": [
+ "en/operations/enclave/enclave",
+ "en/operations/enclave/aws-getting-started",
+ "en/operations/enclave/aws-architecture",
+ "en/operations/enclave/azure-getting-started",
+ "en/operations/enclave/azure-architecture",
+ "en/operations/enclave/gcp-getting-started",
+ "en/operations/enclave/gcp-architecture",
+ "en/operations/enclave/archive",
+ "en/operations/enclave/operations"
+ ]
+ },
+ "en/operations/reindexing",
+ "en/operations/data-management",
+ "en/operations/cloning",
+ "en/operations/monitoring",
+ "en/operations/metrics",
+ "en/operations/notifications",
+ "en/cloud/support",
+ "en/operations/deployment-patterns",
+ "en/operations/private-endpoints",
+ "en/operations/endpoint-routing",
+ "en/operations/access-logging",
+ {
+ "group": "Artifact archive",
+ "pages": [
+ "en/operations/archive/archive-guide",
+ "en/operations/archive/archive-guide-aws",
+ "en/operations/archive/archive-guide-gcp"
+ ]
+ },
+ "en/operations/deleting-applications",
+ {
+ "group": "Self-managed",
+ "pages": [
+ "en/operations/self-managed/admin-procedures",
+ "en/operations/self-managed/multinode-systems",
+ "en/operations/self-managed/files-processes-and-ports",
+ "en/operations/self-managed/node-setup",
+ "en/operations/self-managed/using-kubernetes-with-vespa",
+ "en/operations/self-managed/build-install",
+ "en/operations/self-managed/monitoring",
+ "en/operations/self-managed/content-node-recovery",
+ "en/operations/self-managed/configuration-server",
+ "en/operations/self-managed/live-upgrade",
+ "en/operations/self-managed/config-sentinel",
+ "en/operations/self-managed/config-proxy",
+ "en/operations/self-managed/docker-containers",
+ "en/operations/self-managed/vespa-gpu-container",
+ "en/operations/self-managed/cpu-support",
+ "en/operations/self-managed/slobrok",
+ "en/operations/self-managed/procedure-change-attribute-index",
+ "en/operations/self-managed/container",
+ "en/operations/self-managed/sizing-examples",
+ "en/operations/self-managed/vespa-support"
+ ]
+ },
+ {
+ "group": "Kubernetes",
+ "pages": [
+ "en/operations/kubernetes/vespa-on-kubernetes",
+ "en/operations/kubernetes/architecture",
+ {
+ "group": "Deployment",
+ "pages": [
+ "en/operations/kubernetes/deployment/installation",
+ "en/operations/kubernetes/deployment/local-deployment",
+ "en/operations/kubernetes/deployment/ecr-pull-through-cache",
+ "en/operations/kubernetes/deployment/dev-mode",
+ "en/operations/kubernetes/deployment/permissions"
+ ]
+ },
+ {
+ "group": "Operations",
+ "pages": [
+ "en/operations/kubernetes/operations/operations",
+ "en/operations/kubernetes/operations/upgrades",
+ "en/operations/kubernetes/operations/delete-vespaset",
+ "en/operations/kubernetes/operations/monitoring"
+ ]
+ },
+ {
+ "group": "Configuration",
+ "pages": [
+ "en/operations/kubernetes/configuration/configure-local-storage-type",
+ "en/operations/kubernetes/logging",
+ "en/operations/kubernetes/ingress",
+ "en/operations/kubernetes/custom-overrides-podtemplate",
+ "en/operations/kubernetes/tls"
+ ]
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "group": "Security",
+ "pages": [
+ "en/security/security",
+ "en/security/guide",
+ "en/security/secret-store",
+ "en/security/cloudflare-workers",
+ "en/security/whitepaper",
+ "en/security/securing-your-vespa-installation",
+ "en/security/mtls"
+ ]
+ },
+ {
+ "group": "Clients",
+ "pages": [
+ "en/clients/vespa-cli",
+ "en/clients/python-client",
+ "en/clients/vespa-feed-client",
+ "en/clients/http-best-practices"
+ ]
+ },
+ {
+ "group": "Modules",
+ "pages": [
+ {
+ "group": "E-commerce",
+ "pages": [
+ "en/modules/e-commerce/multi-currency-filtering",
+ "en/modules/e-commerce/saved-search",
+ "en/modules/e-commerce/using-features-together"
+ ]
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "tab": "FAQ",
+ "icon": "circle-question",
+ "pages": [
+ {
+ "group": "FAQ",
+ "pages": ["en/learn/faq"]
+ }
+ ]
+ },
+ {
+ "tab": "Reference",
+ "icon": "code",
+ "groups": [
+ {
+ "group": "APIs",
+ "pages": [
+ "en/reference/api/api",
+ "en/reference/api/query",
+ "en/reference/api/document-v1",
+ "en/reference/api/state-v1",
+ "en/reference/api/deploy-v2",
+ "en/reference/api/application-v2",
+ "en/reference/api/config-v2",
+ "en/reference/api/cluster-v2",
+ "en/reference/api/metrics-v1",
+ "en/reference/api/metrics-v2",
+ "en/reference/api/prometheus-v1"
+ ]
+ },
+ {
+ "group": "Applications and components",
+ "pages": [
+ "en/reference/applications/application-packages",
+ {
+ "group": "services.xml",
+ "pages": [
+ "en/reference/applications/services/services",
+ "en/reference/applications/services/admin",
+ "en/reference/applications/services/container",
+ "en/reference/applications/services/content",
+ "en/reference/applications/services/docproc",
+ "en/reference/applications/services/http",
+ "en/reference/applications/services/processing",
+ "en/reference/applications/services/search"
+ ]
+ },
+ "en/reference/applications/deployment",
+ "en/reference/applications/hosts",
+ "en/reference/applications/validation-overrides",
+ "en/reference/applications/components",
+ "en/reference/applications/config-files",
+ "en/reference/applications/testing",
+ "en/reference/applications/testing-java"
+ ]
+ },
+ {
+ "group": "Schemas and documents",
+ "pages": [
+ "en/reference/schemas/schemas",
+ "en/reference/schemas/document-json-format",
+ "en/reference/schemas/document-field-path"
+ ]
+ },
+ {
+ "group": "Reading and writing",
+ "pages": [
+ "en/reference/writing/indexing-language",
+ "en/reference/writing/document-selector-language"
+ ]
+ },
+ {
+ "group": "Querying",
+ "pages": [
+ "en/reference/querying/yql",
+ "en/reference/querying/simple-query-language",
+ "en/reference/querying/json-query-language",
+ "en/reference/querying/grouping-language",
+ "en/reference/querying/sorting-language",
+ "en/reference/querying/query-profiles",
+ "en/reference/querying/semantic-rules",
+ "en/reference/querying/default-result-format",
+ "en/reference/querying/page-result-format",
+ "en/reference/querying/page-templates"
+ ]
+ },
+ {
+ "group": "Ranking and inference",
+ "pages": [
+ "en/reference/ranking/ranking-expressions",
+ "en/reference/ranking/tensor",
+ "en/reference/ranking/rank-features",
+ "en/reference/ranking/nativerank",
+ "en/reference/ranking/string-segment-match",
+ "en/reference/ranking/rank-feature-configuration",
+ "en/reference/ranking/rank-types",
+ "en/reference/ranking/model-files",
+ "en/reference/ranking/constant-tensor-json-format"
+ ]
+ },
+ {
+ "group": "RAG and embedding",
+ "pages": ["en/reference/rag/chunking", "en/reference/rag/embedding"]
+ },
+ {
+ "group": "Operations",
+ "pages": [
+ "en/reference/operations/health-checks",
+ "en/reference/operations/log-files",
+ "en/reference/operations/tools",
+ {
+ "group": "Metrics",
+ "pages": [
+ "en/reference/operations/metrics/metrics",
+ "en/reference/operations/metrics/default-metric-set",
+ "en/reference/operations/metrics/vespa-metric-set",
+ "en/reference/operations/metrics/metric-units",
+ "en/reference/operations/metrics/container",
+ "en/reference/operations/metrics/distributor",
+ "en/reference/operations/metrics/searchnode",
+ "en/reference/operations/metrics/storage",
+ "en/reference/operations/metrics/configserver",
+ "en/reference/operations/metrics/logd",
+ "en/reference/operations/metrics/nodeadmin",
+ "en/reference/operations/metrics/slobrok",
+ "en/reference/operations/metrics/clustercontroller",
+ "en/reference/operations/metrics/sentinel"
+ ]
+ },
+ {
+ "group": "Self-managed",
+ "pages": ["en/reference/operations/self-managed/tools"]
+ }
+ ]
+ },
+ {
+ "group": "Security",
+ "pages": ["en/reference/security/mtls"]
+ },
+ {
+ "group": "Clients",
+ "pages": [
+ {
+ "group": "Vespa CLI",
+ "pages": [
+ "en/reference/clients/vespa-cli/vespa",
+ "en/reference/clients/vespa-cli/vespa_activate",
+ "en/reference/clients/vespa-cli/vespa_auth",
+ "en/reference/clients/vespa-cli/vespa_clone",
+ "en/reference/clients/vespa-cli/vespa_config",
+ "en/reference/clients/vespa-cli/vespa_curl",
+ "en/reference/clients/vespa-cli/vespa_deploy",
+ "en/reference/clients/vespa-cli/vespa_destroy",
+ "en/reference/clients/vespa-cli/vespa_document",
+ "en/reference/clients/vespa-cli/vespa_feed",
+ "en/reference/clients/vespa-cli/vespa_fetch",
+ "en/reference/clients/vespa-cli/vespa_inspect",
+ "en/reference/clients/vespa-cli/vespa_log",
+ "en/reference/clients/vespa-cli/vespa_prepare",
+ "en/reference/clients/vespa-cli/vespa_prod",
+ "en/reference/clients/vespa-cli/vespa_query",
+ "en/reference/clients/vespa-cli/vespa_status",
+ "en/reference/clients/vespa-cli/vespa_test",
+ "en/reference/clients/vespa-cli/vespa_version",
+ "en/reference/clients/vespa-cli/vespa_visit"
+ ]
+ }
+ ]
+ }
+ ]
+ },
+ {
+ "tab": "Changelog",
+ "icon": "clock-rotate-left",
+ "pages": [
+ "en/reference/release-notes/vespa7",
+ "en/reference/release-notes/vespa8",
+ "en/reference/release-notes/vespa9"
+ ]
+ }
+ ],
+ "global": {
+ "anchors": [
+ {
+ "anchor": "About",
+ "href": "https://vespa.ai/company/",
+ "icon": "users"
+ },
+ {
+ "anchor": "Blog",
+ "href": "https://blog.vespa.ai/",
+ "icon": "newspaper"
+ }
+ ]
+ }
+ },
+ "logo": {
+ "light": "/logo/light.svg",
+ "dark": "/logo/dark.svg",
+ "href": "https://vespa.ai"
+ },
+ "navbar": {
+ "links": [
+ {
+ "label": "Console login",
+ "href": "https://console.vespa-cloud.com/",
+ "icon": "user"
+ },
+ {
+ "label": " ",
+ "icon": "github",
+ "href": "https://github.com/vespa-engine/vespa/"
+ }
+ ],
+ "primary": {
+ "type": "button",
+ "label": "Free trial",
+ "href": "https://vespa.ai/free-trial/"
+ }
+ },
+ "contextual": {
+ "options": [
+ "copy",
+ "view",
+ "chatgpt",
+ "claude",
+ "perplexity",
+ "mcp",
+ "cursor",
+ "vscode"
+ ]
+ },
+ "footer": {
+ "socials": {
+ "github": "https://github.com/vespa-engine",
+ "linkedin": "https://www.linkedin.com/company/vespa-ai/posts/?feedView=all",
+ "x": "https://x.com/vespaengine",
+ "youtube": "https://www.youtube.com/channel/UCVXw_f6UHff8-V9FA1LMIiw"
+ }
+ }
+}
diff --git a/mintlify-docs/en/applications/bundles.mdx b/mintlify-docs/en/applications/bundles.mdx
new file mode 100644
index 0000000000..96de1eece9
--- /dev/null
+++ b/mintlify-docs/en/applications/bundles.mdx
@@ -0,0 +1,689 @@
+---
+title: "Bundles"
+description: "The Container uses [OSGi](https://osgi.org) to provide a modular platform for developing applications that can be composed of many reusable components. The user can deploy, upgrade and remove these components at runtime."
+---
+
+## OSGi
+
+OSGi is a framework for modular development of Java applications, where a set of resources called *bundles* can be installed. OSGi allows the developer to control which resources (Java packages) in a bundle are available to other bundles. Hence, you can explicitly declare a bundle's public API, and also ensure that internal implementation details remain hidden.
+
+Unless you're already familiar with OSGi, we recommend reading Richard S. Hall's presentation [Learning to ignore OSGi](https://cwiki.apache.org/confluence/download/attachments/7956/Learning_to_ignore_OSGi.pdf), which explains the most important aspects that you must relate to as a bundle developer. There are other good OSGi tutorials available:
+
+- [OSGi for Dummies](https://thiloshon.wordpress.com/2020/03/04/osgi-for-dummies/)
+- [OSGi Modularity and Services - Tutorial](https://www.vogella.com/tutorials/OSGi/article.html) (You can ignore the part about OSGi services.)
+
+JDisc uses OSGi's *module* and *lifecycle* layers, and does not provide any functionality from the *service* layer.
+
+## OSGi bundles
+
+An OSGi bundle is a regular JAR file with a MANIFEST.MF file that describes its content, what the bundle requires (imports) from other bundles, and what it provides (exports) to other bundles. Below is an example of a typical bundle manifest with the most important headers:
+
+```java
+Bundle-SymbolicName: com.yahoo.helloworld
+Bundle-Description: A Hello World bundle
+Bundle-Version: 1.0.0
+Export-Package: com.yahoo.helloworld;version="1.0.0"
+Import-Package: org.osgi.framework;version="1.3.0"
+```
+
+The meaning of the headers in this bundle manifest is as follows:
+
+- `Bundle-SymbolicName` - The unique identifier of the bundle.
+- `Bundle-Description` - A human-readable description of the bundle's functionality.
+- `Bundle-Version` - Assigns a version number to the bundle.
+- `Export-Package` - Expresses which Java packages contained in a bundle will be made available to the outside world.
+- `Import-Package` - Indicates which Java packages will be required from the outside world to fulfill the dependencies needed in a bundle.
+
+Note that OSGi has a strict definition of version numbers that must be followed for bundles to work correctly. See the [OSGi javadoc](https://docs.osgi.org/javadoc/r4v42/org/osgi/framework/Version.html#Version(java.lang.String)) for details. As general advice, never use more than three numbers in the version (major, minor, micro).
+
+## Building an OSGi bundle
+
+As long as the project was created by following steps in the [Developer Guide](/en/applications/developer-guide), the code is already being packaged into an OSGi bundle by the [Maven bundle plugin](#maven-bundle-plugin). However, if migrating an existing Maven project, change the packaging statement to:
+
+```xml
+<packaging>container-plugin</packaging>
+```
+
+and add the plugin to the build instructions:
+
+```xml
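+<plugin>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>bundle-plugin</artifactId>
+    <!-- Find latest version at search.maven.org/search?q=g:com.yahoo.vespa%20a:bundle-plugin -->
+    <version>{{site.variables.vespa_version}}</version>
+    <extensions>true</extensions>
+    <configuration>
+        <failOnWarnings>true</failOnWarnings>
+    </configuration>
+</plugin>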
+```
+
+Because OSGi introduces a different runtime environment from what Maven provides when running unit tests, one will not observe any loading and linking errors until trying to deploy the application onto a running Container. Errors triggered at this stage will be the likes of `ClassNotFoundException` and `NoClassDefFoundError`. To debug these types of errors, inspect the stack traces in the [error log](/en/reference/operations/log-files), and refer to [troubleshooting](#troubleshooting).
+
+[vespa-logfmt](/en/reference/operations/self-managed/tools#vespa-logfmt) with its *--nldequote* option is useful when reading logs.
+
+The test suite needs to cover deployment of the application bundle to ensure that its dynamic loading and linking issues are covered.
+
+
+
+## Depending on non-OSGi ready libraries
+
+Unfortunately, many popular Java libraries have yet to be bundled with the appropriate manifest that makes them OSGi-compatible. The simplest solution to this is to set the scope of the problematic dependency to **compile** in your pom.xml file. This will cause the bundle plugin to package the whole library into your bundle's JAR file. Until the offending library becomes available as an OSGi bundle, it means that your bundle will be bigger (in number of bytes), and that classes of that library cannot be shared across application bundles.
+
+The practical implication of this feature is that the bundle plugin copies the compile-scoped dependency, and its transitive dependencies, into the final JAR file, and adds a `Bundle-ClassPath` instruction to its manifest that references those dependencies.
+
+Although this approach works for most non-OSGi libraries, it only works for libraries where the jar file is *self-contained*. If, on the other hand, the library depends on other installed files, it must be treated as if it were a [JNI library](#depending-on-jni-libraries).
+
+## Depending on JNI Libraries
+
+This section details alternatives for using native code in the container.
+
+### OSGi bundles containing native code
+
+OSGi jars may contain .so files, which can be loaded in the standard way from Java code in the bundle. Note that since only one instance of an .so can be loaded at any time, it is not possible to hot swap a jar containing .so files - when such jars are changed the [new configuration will not take effect until the container is restarted](/en/applications/components#JNI-requires-restart). Therefore, it is often a good idea to package a .so file and its Java API into a separate bundle from the rest of your code to avoid having to restart the container on all code changes.
+
+### Add JNI code to the global classpath
+
+When the JNI dependency cannot be packaged in a bundle, and you run on an environment where you can install files locally on the container nodes, you can add the dependency to the container's classpath and explicitly export the packages to make them visible to OSGi bundles.
+
+Add the following configuration in the top level *services* element in [services.xml](/en/reference/applications/services/container):
+
+```xml
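+<services version="1.0">
+    <config name="search.config.qr-start">
+        <jdisc>
+            <classpath_extra>/lib/jars/foo.jar:/path/bar.jar</classpath_extra>
+            <export_packages>com.foo,com.bar</export_packages>
+        </jdisc>
+    </config>
+    ...
+</services>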
+```
+
+Adding the config at the top level ensures that it's applied to all jdisc clusters.
+
+The packages are now available and visible, but they must still be imported by the application bundle that uses the library. Here is how to configure the bundle plugin to enforce an import of the packages to the bundle:
+
+```xml highlight={5-7}
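+<plugin>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>bundle-plugin</artifactId>
+    <extensions>true</extensions>
+    <configuration>
+        <importPackage>com.foo,com.bar</importPackage>
+    </configuration>
+</plugin>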
+```
+
+When adding a library to the classpath it becomes globally visible, and exempt from the package visibility management of OSGi. If another bundle contains the same library, there will be class loading issues.
+
+## Maven bundle plugin
+
+The *bundle-plugin* is used to build and package components for the [Vespa Container](/en/applications/components) with Maven. Refer to the [multiple-bundles sample app](https://github.com/vespa-engine/sample-apps/tree/master/examples/multiple-bundles) for a practical example.
+
+The minimal Maven *pom.xml* configuration is:
+
+```xml highlight={8, 18, 27, 43}
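+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+    <groupId>com.yahoo.example</groupId>
+    <artifactId>basic-application</artifactId>
+    <packaging>container-plugin</packaging> <!-- Use Vespa packaging -->
+    <version>1.0.1</version>
+
+    <properties>
+        <!-- Find latest version at search.maven.org/search?q=g:com.yahoo.vespa -->
+        <vespa.version>{{site.variables.vespa_version}}</vespa.version>
+    </properties>
+
+    <build>
+        <plugins>
+            <plugin> <!-- Build the bundles -->
+                <groupId>com.yahoo.vespa</groupId>
+                <artifactId>bundle-plugin</artifactId>
+                <version>${vespa.version}</version>
+                <extensions>true</extensions>
+                <configuration>
+                    <failOnWarnings>true</failOnWarnings>
+                </configuration>
+            </plugin>
+            <plugin> <!-- Zip the application package -->
+                <groupId>com.yahoo.vespa</groupId>
+                <artifactId>vespa-application-maven-plugin</artifactId>
+                <version>${vespa.version}</version>
+                <executions>
+                    <execution>
+                        <goals>
+                            <goal>packageApplication</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+
+    <dependencies>
+        <dependency> <!-- Vespa dependencies -->
+            <groupId>com.yahoo.vespa</groupId>
+            <artifactId>container</artifactId>
+            <version>${vespa.version}</version>
+            <scope>provided</scope>
+        </dependency>
+    </dependencies>
+</project>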
+```
+
+To create a deployable [application package](/en/basics/applications), run:
+
+```bash
+$ mvn install package
+```
+
+The bundle plugin automates generation of configuration classes by invoking the Maven step *generate-resources*. Read more in [Configuring components](/en/applications/configuring-components).
+
+### Including Your Own Maven Submodules
+
+You can include your own Maven submodules as dependencies within your Vespa component bundle. This allows you to share code and functionality between different components within your project.
+
+To include a submodule as a dependency, add it to your bundle's pom.xml in scope *compile*:
+
+```xml
+<dependency>
+    <groupId>your.project.groupId</groupId>
+    <artifactId>your-submodule-artifactId</artifactId>
+    <scope>compile</scope>
+</dependency>
+```
+
+Replace `your.project.groupId` with the actual groupId of your project and `your-submodule-artifactId` with the artifactId of your submodule.
+
+### Including third-party libraries
+
+Include external dependencies into the bundle by specifying them as dependencies:
+
+```xml
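+<dependency>
+    <groupId>org.apache.httpcomponents.client5</groupId>
+    <artifactId>httpclient5</artifactId>
+    <version>5.0.3</version>
+    <scope>compile</scope>
+</dependency>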
+```
+
+All packages in the library will then be available for use.
+
+If the external dependency is packaged as an OSGi bundle, it can be deployed as-is by setting the scope to *provided*:
+
+```xml highlight={5}
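+<dependency>
+    <groupId>org.apache.httpcomponents.client5</groupId>
+    <artifactId>httpclient5-osgi</artifactId>
+    <version>5.0-beta2</version>
+    <scope>provided</scope>
+</dependency>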
+```
+
+Then, add the jar to the *components* folder of your application package, along with your own bundles. In this case, only packages exported by the author of the library will be available for use by your bundle (see the section below).
+
+### Exporting, Importing and Including Packages from Bundles
+
+OSGi features information hiding: by default, all the classes used inside a bundle are invisible from the outside. Also, the bundle will by default only see (all) the packages in the Java and Container + Vespa APIs. Any other package that the bundle needs must be made available in one of three ways:
+
+- Some additional packages are exported by the container and may be *imported* explicitly by a bundle
+- In addition, any deployed bundle may export packages on its own, which may then be imported by another bundle
+- Finally, the bundle may include its own JAR libraries
+
+One can export packages from a bundle by annotating the package. E.g. to export *com.mydomain.mypackage*, create *package-info.java* in the package directory with:
+
+```java
+@ExportPackage(version = @Version(major=1, minor=0, micro=0))
+package com.mydomain.mypackage;
+import com.yahoo.osgi.annotation.ExportPackage;
+import com.yahoo.osgi.annotation.Version;
+```
+
+The Maven plugin will place such information in the manifest of the plugin JAR built to be picked up by the Container.
+
+Note that this may also be used with bundles that do not contain any searchers but libraries used by other searchers - a bundle may just exist to export some libraries and never have any searchers instantiated.
+
+Bundles may *import* packages (exported by some other bundle or by the container). The Maven plugin will automatically import any package used from bundles it compiles against (i.e. Maven dependencies with scope provided).
+
+As mentioned above, each exported package has a version associated with it. Similarly, an import of a package has a version range associated with it. The version range determines which exported packages can be used. The range used by the Maven plugin is from the current version (i.e. the version of the package available at compile time) up to, but not including, the next major version.
+
+To learn more about OSGi manifests and bundle packaging (e.g. how to include Java libraries and native code), please refer to the OSGi spec at [the OSGi home page](https://osgi.org).
+
+More details in [troubleshooting](#troubleshooting).
+
+### Bundle Plugin Warnings
+
+The bundle plugin will emit warnings for the following common issues that may cause problems at runtime:
+
+- `[WARNING] This project uses packages that are not part of Vespa's public api` - Only Vespa types that are in Java packages annotated with @PublicApi should be used in application code, as other types are not guaranteed to be stable across Vespa releases.
+- `[WARNING] This project does not have 'container' as provided dependency` - All application bundles must have the com.yahoo.vespa:container artifact as a provided scoped dependency, to ensure that the generated 'Import-Package' OSGi header contains the Java packages provided by the Vespa runtime.
+- `[WARNING] Artifacts provided from Vespa runtime are included in compile scope` - This makes the bundle unnecessarily large and may cause problems at runtime, as these artifacts will be embedded in the bundle. Run `mvn dependency:tree` to identify the source of transitive dependencies, and add the necessary exclusions in pom.xml. If an artifact must be included, e.g. because a specific version is needed, an exception can be added with the configuration parameter `allowEmbeddedArtifacts`.
+- `[WARNING] This project defines packages that are also defined in provided scoped dependencies` - Overlapping Java packages between bundles will usually cause problems at runtime, because the OSGi framework will only be able to resolve classes from one of the bundles.
+
+### Configuring the Bundle-Plugin
+
+The bundle plugin can be configured to tailor the resulting bundle to specific needs.
+
+```xml
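+<plugin>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>bundle-plugin</artifactId>
+    <version>${vespa.version}</version>
+    <extensions>true</extensions>
+    <configuration>
+        <failOnWarnings>true/false</failOnWarnings>
+        <allowEmbeddedArtifacts>…</allowEmbeddedArtifacts>
+        <attachBundleArtifact>true/false</attachBundleArtifact>
+        <bundleClassifierName>…</bundleClassifierName>
+        <discApplicationClass>…</discApplicationClass>
+        <discPreInstallBundle>…</discPreInstallBundle>
+        <bundleVersion>…</bundleVersion>
+        <bundleSymbolicName>…</bundleSymbolicName>
+        <bundleActivator>…</bundleActivator>
+        <configGenVersion>…</configGenVersion>
+        <configModels>…</configModels>
+    </configuration>
+</plugin>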
+```
+
+| Element | Description |
+| :--- | :--- |
+| failOnWarnings | If true, the maven build will fail upon warnings for e.g. using Vespa types that are not annotated with [@PublicApi](https://javadoc.io/doc/com.yahoo.vespa/annotations/latest/com/yahoo/api/annotations/PublicApi.html). This should always be set to *true* to ensure that your project will compile successfully on future Vespa releases. Default is *false* |
+| allowEmbeddedArtifacts | A comma-separated list of maven artifacts to allow embedding in the bundle, on the format *groupId:artifactId* |
+| attachBundleArtifact | Whether to attach the bundle jar artifact to the build. Use this if you want to install and deploy the bundle jar along with the default jar. Default is *false* |
+| bundleClassifierName | If *attachBundleArtifact* is true, this will be used as classifier for the bundle jar artifact. Default is *bundle* |
+| discApplicationClass | The fully qualified class name of the Application to be started by JDisc |
+| discPreInstallBundle | The names of the bundles that JDisc must pre-install |
+| bundleVersion | The version of this bundle. Defaults to the Maven project version |
+| bundleSymbolicName | The symbolic name of this bundle. Defaults to the Maven artifact ID |
+| bundleActivator | The fully qualified class name of the bundle activator |
+| configGenVersion | The version of *com.yahoo.vespa.configlib.config-class-plugin* that will be used to generate config classes |
+| configModels | List of config models |
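+
+As a sketch of how these elements combine, the fragment below fails the build on warnings while still allowing one artifact to be embedded. The coordinates `com.acme:acme-utils` are hypothetical placeholders, used only to illustrate the *groupId:artifactId* format described in the table:
+
+```xml
+<plugin>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>bundle-plugin</artifactId>
+    <version>${vespa.version}</version>
+    <extensions>true</extensions>
+    <configuration>
+        <failOnWarnings>true</failOnWarnings>
+        <!-- Hypothetical coordinates, on the format groupId:artifactId -->
+        <allowEmbeddedArtifacts>com.acme:acme-utils</allowEmbeddedArtifacts>
+    </configuration>
+</plugin>
+```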
+
+### Bundle Plugin Troubleshooting
+
+A package *p* is imported if all of the following hold:
+
+- A class in *p* is used directly (i.e. not with reflection) in the bundle
+- No class in the bundle is in *p*
+- A bundle that exports *p* exists, and the project compiles against this bundle
+
+To debug, run
+
+```bash
+$ mvn -X package
+```
+
+and look at Defined packages (= packages in the bundle), Exported packages of dependencies, and Referenced packages (= packages used). A package is imported if it is in Exported packages and Referenced packages but not in Defined packages.
+
+## Troubleshooting
+
+This section describes how to troubleshoot the most common errors when working with bundles:
+
+- [Bundle reload](#bundle-reload)
+- [Could not create component](#could-not-create-component)
+- [Could not load class](#could-not-load-class)
+- [Class not found](#class-not-found)
+- [Slow Container start](#slow-container-start)
+- [Unresolved constraint](#unresolved-constraint)
+- [Multiple implementations of the same class](#multiple-implementations-of-the-same-class)
+
+### Bundle reload
+
+Bundles that are uninstalled between re-configs are logged like this:
+
+```java
+INFO : qrserver Container.com.yahoo.container.core.config.ApplicationBundleLoader
+Bundles to schedule for uninstall: [com.yahoo.vespatest.ExtraHitSearcher [67]]
+```
+
+And in case there are none, it shows the empty set:
+
+```java
+INFO : qrserver Container.com.yahoo.container.core.config.ApplicationBundleLoader
+Bundles to schedule for uninstall: []
+```
+
+### Could not create component
+
+The Container fails to start if it cannot load bundles. For example, using the wrong bundle name in the [multiple-bundles](https://github.com/vespa-engine/sample-apps/tree/master/examples/multiple-bundles) sample app:
+
+```xml
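+<component id="com.mydomain.lib.FibonacciProducer"
+-          bundle="multiple-bundles-lib" />
++          bundle="multiple-bundles-typo" />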
+```
+
+Looking at what is actually deployed in *multiple-bundles*:
+
+```bash
+$ ls -1 target/*.jar
+target/multiple-bundles-1.0.0-deploy.jar
+target/multiple-bundles-1.0.0-without-dependencies.jar
+target/multiple-bundles-lib-1.0.1-deploy.jar
+```
+
+Error in log:
+
+```java expandable
+[2020-01-23 14:28:01.367] WARNING : qrserver Container.com.yahoo.container.di.Container
+ Failed to set up new component graph. Retaining previous component generation.
+ exception=
+java.lang.IllegalArgumentException: Could not create a component with id 'com.mydomain.lib.FibonacciProducer'.
+Tried to load class directly, since no bundle was found for spec: multiple-bundles-typo.
+If a bundle with the same name is installed, there is a either a version mismatch or the installed bundle's version contains a qualifier string.
+ at com.yahoo.osgi.OsgiImpl.resolveFromClassPath(OsgiImpl.java:74)
+ at com.yahoo.osgi.OsgiImpl.resolveClass(OsgiImpl.java:65)
+ at com.yahoo.container.di.Container.addNodes(Container.java:228)
+ at com.yahoo.container.di.Container.createComponentsGraph(Container.java:217)
+ at com.yahoo.container.di.Container.getConfigAndCreateGraph(Container.java:160)
+ at com.yahoo.container.di.Container.getNewComponentGraph(Container.java:84)
+ at com.yahoo.container.core.config.HandlersConfigurerDi.getNewComponentGraph(HandlersConfigurerDi.java:145)
+ at com.yahoo.container.jdisc.ConfiguredApplication.lambda$startReconfigurerThread$1(ConfiguredApplication.java:275)
+ at java.base/java.lang.Thread.run(Thread.java:834)
+
+[2020-01-23 14:28:01.367] ERROR : qrserver Container.com.yahoo.container.jdisc.ConfiguredApplication
+ Reconfiguration failed, your application package must be fixed, unless this is a JNI reload issue: Could not create a component with id 'com.mydomain.lib.FibonacciProducer'. Tried to load class directly, since no bundle was found for spec: multiple-bundles-typo. If a bundle with the same name is installed, there is a either a version mismatch or the installed bundle's version contains a qualifier string.
+ exception=
+ java.lang.IllegalArgumentException: Could not create a component with id 'com.mydomain.lib.FibonacciProducer'. Tried to load class directly, since no bundle was found for spec: multiple-bundles-typo. If a bundle with the same name is installed, there is a either a version mismatch or the installed bundle's version contains a qualifier string.
+ at com.yahoo.osgi.OsgiImpl.resolveFromClassPath(OsgiImpl.java:74)
+ at com.yahoo.osgi.OsgiImpl.resolveClass(OsgiImpl.java:65)
+ at com.yahoo.container.di.Container.addNodes(Container.java:228)
+ at com.yahoo.container.di.Container.createComponentsGraph(Container.java:217)
+ at com.yahoo.container.di.Container.getConfigAndCreateGraph(Container.java:160)
+ at com.yahoo.container.di.Container.getNewComponentGraph(Container.java:84)
+ at com.yahoo.container.core.config.HandlersConfigurerDi.getNewComponentGraph(HandlersConfigurerDi.java:145)
+ at com.yahoo.container.jdisc.ConfiguredApplication.lambda$startReconfigurerThread$1(ConfiguredApplication.java:275)
+ at java.base/java.lang.Thread.run(Thread.java:834)
+```
+
+Make sure that the jar files (i.e. bundles) are actually deployed with correct names per *services.xml*.
+
+### Could not load class
+
+If a component is added to services.xml, and its class cannot be found in the declared bundle, the container will fail to start. For example:
+
+```xml
+<component id="com.example.MissingClass" bundle="my-bundle" />
+```
+
+The log will contain an error like this:
+
+```java
+java.lang.IllegalArgumentException: Could not load class 'com.example.MissingClass' from bundle my-bundle
+```
+
+If you see this error, make sure that the class actually exists in the given bundle. Also, verify that the `id` (or `class`) attribute refers to the component class, and not e.g. a Java package or the bundle name.
+
+### Class not found
+
+All classes that are referred to in a user bundle must either be embedded in the bundle, or imported from another bundle by an `Import-Package` statement in the bundle manifest. When this rule has been breached, we get one of the most commonly seen exceptions when working with OSGi bundles:
+
+```java
+...
+exception=
+java.lang.NoClassDefFoundError: com/acme/utils/Helper
+...
+java.lang.ClassNotFoundException: com.acme.utils.Helper not found by my_bundle [29]
+```
+
+For the [bundle-plugin](#maven-bundle-plugin) to automatically add an Import-Package statement to the bundle's manifest, that package must be exported from another bundle that is declared as a 'provided' scope dependency in *pom.xml*. If the dependency that contains the missing class is under your own control, make sure it's packaged as an OSGi bundle, and [export the package](#exporting-importing-and-including-packages-from-bundles) from that bundle. If not, the simplest way to resolve the issue is to embed the dependency in your own bundle, by setting its scope to 'compile' instead of 'provided'.
+
+If the strategy above does not resolve the case, it's most likely because the class in question is loaded by reflection, e.g. `Class.forName("com.acme.utils.Helper")`. This is quite common when working with libraries for pluggable frameworks, for which there is a separate [troubleshooting doc](/en/applications/pluggable-frameworks).
+
+### Slow Container start
+
+In the vespa log, a container startup looks like:
+
+```java
+[2021-01-07 10:13:35.325] INFO : container Container.com.yahoo.container.core.config.ApplicationBundleLoader
+Installed bundles: {[0]org.apache.felix.framework:6.0.3, [1]container-disc:7.335.22 ...
+...
+[2021-01-07 10:26:57.291] INFO : container Container.com.yahoo.container.jdisc.ConfiguredApplication
+Switching to the latest deployed set of configurations and components. Application config generation: 1
+```
+
+The container is ready at the last log line - note the long startup time. To get more details on what the container is doing at startup, inspect the ComponentGraph debug log. Find the container service name (here: "container"), set debug logging and restart the container:
+
+```bash
+$ vespa-sentinel-cmd list
+vespa-sentinel-cmd 'sentinel.ls' OK.
+container state=RUNNING mode=AUTO pid=246585 exitstatus=0 id="default/container.0"
+
+$ vespa-logctl container:com.yahoo.container.di.componentgraph.core debug=on
+
+$ vespa-stop-services && vespa-start-services
+
+# Find DEBUG log messages for component creation, like:
+
+[2021-01-07 10:13:37.006] DEBUG : container Container.com.yahoo.container.di.componentgraph.core.ComponentGraph
+Trying the fallback injector to create component of class com.yahoo.container.jdisc.messagebus.SessionCache to inject
+into component 'chain.mychain in MbusServer' of type 'com.yahoo.container.jdisc.messagebus.MbusServerProvider'.
+[2021-01-07 10:14:14.082] DEBUG : container Container.com.yahoo.container.di.componentgraph.core.ComponentNode
+Constructing 'com.yahoo.search.query.profile.compiled.CompiledQueryProfileRegistry'
+[2021-01-07 10:26:54.669] DEBUG : container Container.com.yahoo.container.di.componentgraph.core.ComponentNode
+Finished constructing 'com.yahoo.search.query.profile.compiled.CompiledQueryProfileRegistry'
+```
+
+In this particular example, query profile compilation takes a long time.
+
+### Unresolved constraint
+
+If the bundle has an Import-Package for a package that is not available at runtime, the OSGi framework will report an unresolved constraint error. The symptom as seen in the log is:
+
+```java
+org.osgi.framework.BundleException: Unresolved constraint in bundle my_bundle [29]:
+Unable to resolve 29.0:
+missing requirement [29.0] osgi.wiring.package; (osgi.wiring.package=com.acme.utils)
+at org.apache.felix.framework.Felix.resolveBundleRevision(Felix.java:3974)
+```
+
+This means that the missing class resides in a 'provided' dependency referred to from the bundle's *pom.xml*, either directly or transitively. In order to make the dependency available at runtime, there are two options:
+
+- The easiest is to set the dependency as 'compile' scope (instead of 'provided') to embed it in your own bundle. This works fine in most cases, unless two of the dependencies need two different versions of the same library.
+- Add the missing jar file to the `components/` folder of the application package, along with your own bundles. The maven-dependency-plugin has a goal called 'copy-dependencies' to help with this.
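+
+A minimal sketch of the second option, assuming the missing jar is an OSGi bundle with the hypothetical artifact ID `acme-utils`, and assuming the application package is assembled under `target/application` (adjust `outputDirectory` to your build layout). The `copy-dependencies` goal and its `includeArtifactIds` and `outputDirectory` parameters belong to the standard maven-dependency-plugin:
+
+```xml
+<plugin>
+    <groupId>org.apache.maven.plugins</groupId>
+    <artifactId>maven-dependency-plugin</artifactId>
+    <executions>
+        <execution>
+            <id>copy-bundles-to-components</id>
+            <phase>prepare-package</phase>
+            <goals>
+                <goal>copy-dependencies</goal>
+            </goals>
+            <configuration>
+                <!-- Hypothetical artifact ID of the missing OSGi bundle -->
+                <includeArtifactIds>acme-utils</includeArtifactIds>
+                <!-- Assumed location of the components folder in the build output -->
+                <outputDirectory>${project.build.directory}/application/components</outputDirectory>
+            </configuration>
+        </execution>
+    </executions>
+</plugin>
+```
+
+Restricting the copy with `includeArtifactIds` avoids copying every dependency, including those already provided by the Vespa runtime.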
+
+If the missing jar is a transitive dependency, maven can help visualize the dependency graph of the project:
+
+```bash
+$ mvn dependency:tree
+```
+
+### Multiple implementations of the same class
+
+When two bundles interact via their public APIs, it is crucial that both bundles resolve each and every participating class to the same `Class` object. If not, we will get error messages like:
+
+```java
+java.lang.LinkageError: loader constraint violation: when resolving field
+"DATETIME" the class loader (instance of
+org/apache/felix/framework/BundleWiringImpl$BundleClassLoaderJava5) of the referring
+class, javax/xml/datatype/DatatypeConstants, and the class loader (instance of
+) for the field's resolved type, pe/DatatypeConstants, have different Class
+objects for that type
+```
+
+or:
+
+```java
+java.lang.LinkageError: loader constraint violation: loader (instance of )
+previously initiated loading for a different type with name "javax/xml/namespace/QName"
+```
+
+or (less frequently):
+
+```java
+java.lang.ClassCastException: com.acme.utils.Helper cannot be cast to com.acme.utils.Helper
+```
+
+All these error messages indicate that multiple implementations of one or more classes are used at runtime - possible root causes:
+
+- Two interacting user bundles embed the same Java package.
+- A user bundle embeds a Java package that is exported from one of the JDisc bundles.
+
+Usually, the "duplicate" package is pulled in by the user bundle transitively from a library dependency.
+
+#### Multiple implementations example
+
+Let's take a look at an example resolving the duplicate *javax.xml.namespace.QName* class from the error message above.
+
+All 'javax.xml' packages in the JDK are exported by the JDisc core bundle. This means that they should be imported by user bundles, instead of embedded inside them. Hence, ensure that there are no classes from packages prefixed by 'javax.xml' in the bundle. Find out which library pulls in the package:
+
+
+
+ Extract the full component jar, including any embedded jars. One tool that does the job is [rjar](https://github.com/pojosontheweb/rjar/).
+
+
+
+ Search the folder where the jar was extracted for 'javax.xml' classes:
+
+ ```bash highlight={4}
+ $ find . | grep "javax/xml/.*\.class"
+
+ ...
+ ./my_bundle-deploy.jar/dependencies/stax-api-1.0.1.jar/javax/xml/namespace/QName.class
+ ...
+ ```
+
+
+
+
+ Find out which libraries pulled in the offending classes - here it was `stax-api-1.0.1`. Usually, these libraries are not pulled in by the pom as direct dependencies, but rather transitively via another library being used. Use Maven's dependency plugin from the application directory to find the direct dependency:
+
+ ```bash highlight={4}
+ $ mvn dependency:tree -Dverbose
+ [INFO] +- com.acme.utils:jersey-utils:1.0.0:compile
+ [INFO] | +- com.sun.jersey:jersey-json:jar:1.13:compile
+ [INFO] | | +- org.codehaus.jettison:jettison:jar:1.1:compile
+ [INFO] | | \- stax:stax-api:jar:1.0.1:compile
+ ```
+
+ Observe that `stax:stax-api:1.0.1` is pulled in transitively from the direct dependency `com.acme.utils:jersey-utils`.
+
+
+
+ To exclude `stax:stax-api`, add the appropriate `exclusion` from the direct dependency `com.acme.utils:jersey-utils` in *pom.xml*:
+
+
+ ```xml highlight={5-10}
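+ <dependency>
+     <groupId>com.acme.utils</groupId>
+     <artifactId>jersey-utils</artifactId>
+     <version>1.0.0</version>
+     <exclusions>
+         <exclusion>
+             <groupId>stax</groupId>
+             <artifactId>stax-api</artifactId>
+         </exclusion>
+     </exclusions>
+ </dependency>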
+ ```
+
+
+
+
+#### Multiple implementations example slf4j-api
+
+This is similar to the previous example, but concerns logging libraries, which are perhaps the most common source of this problem. Here we will see the symptom, use `dependency:tree` and add an exclusion. The symptom:
+
+```java
+java.lang.RuntimeException: An exception occurred while
+constructing 'com.acme.utils.Helper in acme-utils'
+Caused by: java.lang.LinkageError: loader constraint violation: when resolving method
+"org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()Lorg/slf4j/ILoggerFactory;"
+the class loader (instance of org/apache/felix/framework/BundleWiringImpl$BundleClassLoaderJava5) of the
+current class, org/slf4j/LoggerFactory, and the class loader (instance of
+sun/misc/Launcher$AppClassLoader) for the method's defining class,
+org/slf4j/impl/StaticLoggerBinder, have different Class objects for the type
+org/slf4j/ILoggerFactory used in the signature
+at
+org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:299)
+at
+org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:269)
+```
+
+Running *mvn dependency:tree* as in the previous example gives:
+
+```txt
+[INFO] +- com.yahoo.vespa:container-dev:jar:5.28.29:provided
+[INFO] | +- com.yahoo.vespa:jdisc_core:jar:5.28.29:provided
+[INFO] | | +- (org.slf4j:slf4j-api:jar:1.7.5:compile - scope updated from provided; omitted for duplicate)
+...
+[INFO] +- com.acme.utils:smartlib:jar:1.0.0:compile
+[INFO] | +- org.slf4j:slf4j-api:jar:1.6.6:compile
+```
+
+Observe that slf4j-api is no longer provided from container-dev, which it should be. To fix this, add an exclusion on the offender:
+
+```xml
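+<dependency>
+    <groupId>com.acme.utils</groupId>
+    <artifactId>smartlib</artifactId>
+    <version>1.0.0</version>
+    <scope>compile</scope>
+    <exclusions>
+        <exclusion>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+        </exclusion>
+    </exclusions>
+</dependency>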
+```
+
+But it still does not work! And we can see why:
+
+```txt
+$ jar -tf mailsearch-docprocs-deploy.jar | grep slf
+
+dependencies/slf4j-api-1.7.5.jar
+```
+
+Something still pulls in slf4j... Other candidates:
+
+```txt
+[INFO] \- com.yahoo.vespa:application:jar:5.28.29:test
+...
+[INFO] +- com.yahoo.vespa:zkfacade:jar:5.28.29:test
+[INFO] | +- org.apache.curator:curator-recipes:jar:2.4.1:test
+[INFO] | | +- org.apache.curator:curator-framework:jar:2.4.1:test
+[INFO] | | | +- org.apache.curator:curator-client:jar:2.4.1:test
+[INFO] | | | | +- (org.slf4j:slf4j-api:jar:1.6.4:test - omitted for conflict with 1.7.5)
+...
+[INFO] | +- (org.slf4j:slf4j-jdk14:jar:1.7.5:test - omitted for duplicate)
+```
+
+After adding the right exclusions for `application`, *mvn dependency:tree* verified that all references were gone, except the ones from container-dev. Still, the jar contained:
+
+```txt
+$ jar -tf mailsearch-docprocs-deploy.jar | grep slf
+dependencies/slf4j-api-1.7.5.jar
+```
+
+One can make it work by managing this dependency explicitly - add this at POM top-level:
+
+```xml
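+<dependencyManagement>
+    <dependencies>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+            <version>1.7.5</version>
+            <scope>provided</scope>
+        </dependency>
+    </dependencies>
+</dependencyManagement>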
+```
diff --git a/mintlify-docs/en/applications/chaining.mdx b/mintlify-docs/en/applications/chaining.mdx
new file mode 100644
index 0000000000..6ec00dc00a
--- /dev/null
+++ b/mintlify-docs/en/applications/chaining.mdx
@@ -0,0 +1,209 @@
+---
+title: "Chained Components"
+sidebarTitle: "Chaining"
+description: "[Processors](/en/applications/processing), [searcher plug-ins](/en/applications/searchers) and [document processors](/en/applications/document-processors) are chained components. They are executed serially, with each providing some service or transform, and others optionally depending on these. In other words, a chain is a set of components with dependencies. Javadoc: [com.yahoo.component.chain.Chain](https://javadoc.io/doc/com.yahoo.vespa/chain/latest/com/yahoo/component/chain/Chain)"
+---
+
+It is useful to read the [federation guide](/en/querying/federation) before this document.
+
+A chained component has three basic differences from a component in general:
+
+- The named services it *provides* to other components in the chain.
+- The list of services or checkpoints which the component itself should be *before* in a chain, in other words, its dependents.
+- The list of services or checkpoints which the component itself should be *after* in a chain, in other words, its dependencies.
+
+What a component should be placed before, what it should be placed after and what itself provides, may be either defined using Java annotations directly on the component class, or it may be added specifically to the component declarations in [services.xml](/en/reference/applications/services/container). In general, the implementation should have as many of the necessary annotations as practical, leaving the application specific configuration clean and simple to work with.
+
+## Ordering Components
+
+The execution order of the components in a chain is not defined by the order of the components in the configuration. Instead, the order is defined by adding the *ordering constraints* to the components:
+
+- Any component may declare that it `@Provides` some named functionality (the names are just labels that have no meaning to the container).
+- Any component may declare that it must be placed `@Before` some named functionality,
+- or that it must be placed `@After` some functionality.
+
+The container will pick any ordering of a chain consistent with the constraints of the components in the chain.
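+
+As a rough illustration of how such constraints determine an order, the sketch below resolves provides/before/after labels with a topological sort (Kahn's algorithm). It is not the container's actual implementation: `ChainOrderSketch`, `Component` and `order` are hypothetical names, only the JDK is assumed, and ignoring labels that nobody provides is an assumption of this sketch.
+
+```java
+import java.util.*;
+
+final class ChainOrderSketch {
+
+    /** Hypothetical stand-in for a chained component and its ordering constraints. */
+    record Component(String name, Set<String> provides, Set<String> before, Set<String> after) {}
+
+    /** Returns one ordering consistent with the constraints, or throws on a cycle. */
+    static List<String> order(List<Component> components) {
+        // Map each label (own name or provided name) to the components offering it.
+        Map<String, List<Component>> providers = new HashMap<>();
+        for (Component c : components) {
+            providers.computeIfAbsent(c.name(), k -> new ArrayList<>()).add(c);
+            for (String p : c.provides())
+                providers.computeIfAbsent(p, k -> new ArrayList<>()).add(c);
+        }
+        // An edge a -> b means "a must run before b".
+        Map<String, Set<String>> successors = new LinkedHashMap<>();
+        Map<String, Integer> inDegree = new LinkedHashMap<>();
+        for (Component c : components) {
+            successors.put(c.name(), new LinkedHashSet<>());
+            inDegree.put(c.name(), 0);
+        }
+        for (Component c : components) {
+            for (String label : c.before())
+                for (Component target : providers.getOrDefault(label, List.of()))
+                    addEdge(c.name(), target.name(), successors, inDegree);
+            for (String label : c.after())
+                for (Component source : providers.getOrDefault(label, List.of()))
+                    addEdge(source.name(), c.name(), successors, inDegree);
+        }
+        // Kahn's algorithm: repeatedly emit a component with no unplaced predecessors.
+        Deque<String> ready = new ArrayDeque<>();
+        inDegree.forEach((name, degree) -> { if (degree == 0) ready.add(name); });
+        List<String> result = new ArrayList<>();
+        while (!ready.isEmpty()) {
+            String next = ready.poll();
+            result.add(next);
+            for (String s : successors.get(next))
+                if (inDegree.merge(s, -1, Integer::sum) == 0) ready.add(s);
+        }
+        if (result.size() != components.size())
+            throw new IllegalStateException("Cyclic ordering constraints");
+        return result;
+    }
+
+    private static void addEdge(String from, String to,
+                                Map<String, Set<String>> successors, Map<String, Integer> inDegree) {
+        if (from.equals(to)) return; // a constraint against a label the component itself provides is ignored here
+        if (successors.get(from).add(to)) inDegree.merge(to, 1, Integer::sum);
+    }
+}
+```
+
+Given a component that provides `SourceSelection`, is before `Federation` and after `IntentModel`, the sketch places it between the providers of those two labels, regardless of the order in which the components were listed.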
+
+Dependencies can be added in two ways. Dependencies which are due to the code should be added as annotations in the code:
+
+```java highlight={5-7}
+import com.yahoo.processing.*;
+import com.yahoo.processing.execution.Execution;
+import com.yahoo.component.chain.dependencies.*;
+
+@Provides("SourceSelection")
+@Before("Federation")
+@After("IntentModel")
+public class SimpleProcessor extends Processor {
+
+    @Override
+    public Response process(Request request, Execution execution) {
+        // TODO: Implement this. For now, pass the request on to the next processor in the chain.
+        return execution.process(request);
+    }
+}
+```
+
+Multiple functionality names may be specified by using the syntax `@Provides/Before/After({"A", "B"})`.
+
+Annotations which do not belong in the code may be added in the [configuration](/en/reference/applications/services/container):
+
+```xml highlight={8}
+
+
+
+
+
+
+
+ ai.vespa.examples.Processor1
+
+
+
+
+
+
+
+
+```
+
+For convenience, components always `Provides` their own fully qualified class name (the package and simple class name concatenated, e.g. `ai.vespa.examples.SimpleProcessor`) and their simple name (that is, only the class name, like `SimpleProcessor` in the example above), so it is always possible to declare that one must execute before or after some particular component. This applies to general processors, searchers and document processors alike.
+
+Finally, note that ordering constraints are just that; in particular they are not used to determine if a given search chain, or set of search chains, is “complete”.
+
+## Chain Inheritance
+
+As implied by examples above, chains may inherit other chains in *services.xml*.
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+A chain will include all components from the chains named in the optional `inherits` attribute, exclude from that set all components named in the also optional `excludes` attribute and add all the components listed inside the defining tag. Both `inherits` and `excludes` are space delimited lists of reference names.
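+
+The rule above amounts to simple set arithmetic. A minimal sketch, assuming components are identified by plain strings (`ChainInheritanceSketch` and `resolve` are hypothetical names, not part of the Vespa API):
+
+```java
+import java.util.*;
+
+final class ChainInheritanceSketch {
+
+    /** Effective components = (all inherited components - excluded ones) + the chain's own components. */
+    static Set<String> resolve(List<Set<String>> inherited, Set<String> excludes, Set<String> own) {
+        Set<String> result = new LinkedHashSet<>();
+        for (Set<String> parent : inherited) result.addAll(parent); // union of all inherited chains
+        result.removeAll(excludes);                                 // drop the excluded references
+        result.addAll(own);                                         // add components listed inside the tag
+        return result;
+    }
+}
+```
+
+Note that exclusion is applied before the chain's own components are added, so a component listed inside the defining tag is kept even if its name also appears in `excludes`; that ordering follows the sentence above.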
+
+For search chains, there are two built-in search chains which are especially useful to inherit from: `native` and `vespa`. `native` is a basic search chain containing the functionality most systems will need anyway, while `vespa` inherits from `native` and adds a few extra searchers which most installations containing Vespa backends will need.
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+## Unit Tests
+
+A component should be unit tested in a chain containing the components it depends on. It is not necessary to run the dependency handling framework to achieve that, as the `com.yahoo.component.chain.Chain` class has several constructors which are easy to use while testing.
+
+```java
+Chain<Searcher> c = new Chain<>(new UselessSearcher("first"),
+                                new UselessSearcher("second"),
+                                new UselessSearcher("third"));
+Execution e = new Execution(c, Execution.Context.createContextStub(null));
+Result r = e.search(new Query());
+```
+
+The above is a rather useless test, but it illustrates how the basic workflow can be simulated. The constructor will create a chain with supplied searchers in the given order (it will not analyze any annotations).
+
+## Passing Information Between Components
+
+When different searchers or document processors depend on shared classes or field names, it is good practice to define the name in a single place only. An [example](/en/applications/searchers#passing-information-between-searchers) in the searcher development introduction illustrates an easy way to do that.
+
+## Invoking a Specific Search Chain
+
+The search chain to use can be selected in the request, by adding the request parameter: `searchChain=myChain`
+
+If no chain is selected in the query, the chain called `default` will be used. If no chain called `default` has been configured, the chain called `native` will be used. The *native* chain is always present and contains a basic set of searchers needed in most applications. Custom chains will usually inherit the native chain to include those searchers.
+
+The search chain can also be set in a [query profile](/en/querying/query-profiles).
+
+## Example: Configuration
+
+Annotations which do not belong in the code may be added in the configuration, here a simple example with [search chains](/en/reference/applications/services/search#chain):
+
+```xml highlight={9-12}
+
+
+
+
+
+
+ Cache
+ Statistics
+ Logging
+ SimpleTest
+
+
+
+
+
+
+```
+
+And for [document processor chains](/en/reference/applications/services/docproc), it becomes:
+
+```xml highlight={5}
+
+
+
+
+ TextMetrics
+
+
+
+
+
+
+
+```
+
+For searcher plugins the class [com.yahoo.search.searchchain.PhaseNames](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/search/searchchain/PhaseNames) defines a set of checkpoints third party searchers may use to help order themselves when extending the Vespa search chains.
+
+Note that ordering constraints are just that; in particular they are not used to determine if a given search chain, or set of search chains, is “complete”.
+
+## Example: Cache with async write
+
+Use case: In a search chain, do early return and do further search asynchronously using [ExecutorService](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ExecutorService).
+
+Pseudocode: On a cache hit (e.g. using Redis), just return the cached data. On a cache miss, return empty (null) data and let the following searcher complete the query and write the result back to the cache:
+
+```java
+public Result search(Query query, Execution execution) {
+ // cache lookup
+ if (cache_hit) {
+ return result;
+ }
+ else {
+ execution.search(query); // invoke async cache update searcher next in chain
+ return result;
+ }
+}
+```
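+
+The same pattern can be sketched in plain Java, outside the searcher API. This is only an illustration under stated assumptions: `AsyncCacheSketch`, `lookup` and the `backend` function are hypothetical, an in-memory map stands in for an external cache such as Redis, and the thread pool size is arbitrary.
+
+```java
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.function.Function;
+
+final class AsyncCacheSketch implements AutoCloseable {
+
+    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for an external cache
+    private final ExecutorService executor = Executors.newFixedThreadPool(2);
+    private final Function<String, String> backend;                      // the slow "further search"
+
+    AsyncCacheSketch(Function<String, String> backend) {
+        this.backend = backend;
+    }
+
+    /** Returns the cached value, or null on a miss; a miss schedules a background cache fill. */
+    String lookup(String query) {
+        String cached = cache.get(query);
+        if (cached != null) return cached;                            // cache hit: early return
+        executor.execute(() -> cache.put(query, backend.apply(query))); // cache miss: fill asynchronously
+        return null;
+    }
+
+    /** Releases the thread pool, waiting briefly for pending cache fills. */
+    @Override
+    public void close() {
+        executor.shutdown();
+        try {
+            executor.awaitTermination(10, TimeUnit.SECONDS);
+        } catch (InterruptedException e) {
+            Thread.currentThread().interrupt();
+        }
+    }
+}
+```
+
+In a real searcher, the executor would be shut down in `deconstruct`, and the query would typically need to be cloned before being handed to another thread.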
diff --git a/mintlify-docs/en/applications/components.mdx b/mintlify-docs/en/applications/components.mdx
new file mode 100644
index 0000000000..1e17926658
--- /dev/null
+++ b/mintlify-docs/en/applications/components.mdx
@@ -0,0 +1,259 @@
+---
+title: "Container Components"
+sidebarTitle: "Components"
+description: "This document explains the common concepts necessary to develop all types of Container components."
+---
+
+All components must extend a base class from the Container code module. For example, searchers must extend the class `com.yahoo.search.Searcher`. The main available component types are:
+
+- [processors](/en/applications/processing)
+- [searchers](/en/applications/searchers)
+- [document processors](/en/applications/document-processors)
+- [search result renderers](/en/applications/result-renderers)
+- [provider components](/en/applications/dependency-injection#special-components).
+
+Searchers and document processors belong to a subclass of components called [chained components](/en/applications/chaining). For an introduction to how the different component types interact, refer to the [overview of component types](/en/reference/applications/components#component-types).
+
+The components of the search container are usually deployed as part of an [OSGi bundle](/en/applications/bundles). Build the bundles using maven and the [bundle plugin](/en/applications/bundles#maven-bundle-plugin). Refer to the [multiple-bundles sample app](https://github.com/vespa-engine/sample-apps/tree/master/examples/multiple-bundles) for a multi-bundle example.
+
+## Concurrency
+
+Components will be executed concurrently by multiple threads. This places an important constraint on all component classes: *non-final instance variables are not safe.* They must be eliminated, or made thread-safe somehow.
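+
+A minimal sketch of what this means in practice, using a plain class rather than a real component base class (`ThreadSafeComponentSketch` and its methods are hypothetical names): immutable state is kept in `final` fields set in the constructor, and the one piece of mutable state is held in a thread-safe type.
+
+```java
+import java.util.concurrent.atomic.AtomicLong;
+
+final class ThreadSafeComponentSketch {
+
+    private final String prefix;                             // immutable: assigned once in the constructor
+    private final AtomicLong invocations = new AtomicLong(); // mutable, but safe to update from many threads
+
+    ThreadSafeComponentSketch(String prefix) {
+        this.prefix = prefix;
+    }
+
+    /** May be called concurrently; all per-call state lives in local variables. */
+    String handle(String input) {
+        long n = invocations.incrementAndGet();
+        return prefix + input + "#" + n;
+    }
+
+    long invocations() {
+        return invocations.get();
+    }
+}
+```
+
+A plain `long` counter incremented with `++` in the same place could lose updates under concurrent execution, which is the kind of problem the rule above guards against.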
+
+## Resource management
+
+Components that use threads, file handles or other native resources that need to be released when the component falls out of scope must override a method called `deconstruct`. Here is an example implementation from a component that uses a thread pool named 'executor':
+
+```java
+@Override
+public void deconstruct() {
+ super.deconstruct();
+ try {
+ executor.shutdown();
+ executor.awaitTermination(10, TimeUnit.SECONDS);
+ } catch (InterruptedException e) {
+ Thread.currentThread().interrupt();
+ }
+}
+```
+
+Note that it is always advisable to call the super-method first. Also see [SharedResource.java](https://github.com/vespa-engine/vespa/blob/master/jdisc_core/src/main/java/com/yahoo/jdisc/SharedResource.java) for how to configure [debug options](/en/reference/applications/services/container#jvm) for use in tools like YourKit. This can be used to track component lifetime / (de)construction issues, e.g.:
+
+```xml
+
+
+
+```
+
+Read more in [container profiling](/en/performance/profiling#profiling-the-query-container).
+
+## Dependency injection
+
+The components might need to access resources, such as other components or config. These are injected directly into the constructor. The following types of constructor dependencies are allowed:
+
+ - [Config objects](/en/applications/configuring-components)
+ - [Other components](/en/applications/dependency-injection)
+ - [The Linguistics library](/en/linguistics/linguistics)
+ - [System info](#the-systeminfo-injectable-component)
+
+The [Component Reference](/en/reference/applications/components#injectable-components) contains a complete list of built-in injectable components.
+
+If your component class needs more than one public constructor, the one to be used by the container must be annotated with `@com.yahoo.component.annotation.Inject` from [annotations](https://search.maven.org/artifact/com.yahoo.vespa/annotations).
+
+### The SystemInfo Injectable Component
+
+This component provides information about the environment that the component is running in, for example
+
+- The zone in the Vespa Cloud, if applicable.
+- The number of nodes in the container cluster, and their indices.
+- The index of the node this is running on.
+
+The latter two can be used e.g. for [bucket testing](/en/applications/testing#feature-switches-and-bucket-tests) new features on a subset of nodes. Please note that the node indices are not necessarily contiguous and do not necessarily start from zero.
+
+## Deploying a Component
+
+The container will create one or more instances of the component, as specified in [the application package](#adding-component-to-application-package). The container will create a new instance of this component only when it is reconfigured, so any data needed by the component can be read and prepared from a constructor in the component.
+
+See the full API available to components at the [Container Javadoc](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/container/package-summary).
+
+Once the component passes unit tests, it can be deployed. The steps involved are building the component jar file, adding it to the Vespa application package and deploying the application package. These steps are described in the following sections, using a searcher as example.
+
+### Building the Plugin .jar
+
+To build the plugin jar, call `mvn install` in the project directory. It can then be found in the target directory, and will have the suffix *-deploy.jar*.
+
+Assume for the rest of the document that the artifactId is `com.yahoo.search.example.SimpleSearcher` and the version is `1.0`. The plugin built will then have the name *com.yahoo.search.example.SimpleSearcher-1.0-deploy.jar*.
+
+### Adding the Plugin to the Vespa Application Package
+
+The previous step should produce a plugin jar file, which may now be deployed to Vespa by adding it to an [application package](/en/basics/applications): A directory containing at minimum *hosts.xml* and *services.xml*.
+
+- put `com.yahoo.search.example.SimpleSearcher-1.0-deploy.jar` in the `components/` directory under the application package root
+- modify [services.xml](/en/reference/applications/services/services) to include the Searcher
+
+To include the searcher, define a search chain and add the searcher to it. Example:
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+The searcher id above is resolved to the plugin jar we added by the `Bundle-SymbolicName` ([a field in the manifest of the jar file](/en/applications/bundles)), which is determined by the `artifactId`, and to the right class within the bundle by the class name. By keeping the `searcher id`, `class name` and `artifactId` the same, we keep things simple, but more advanced use where these differ is also supported. This will be explained in later sections.
+
+For a reference to these tags, see [the search chains reference](/en/reference/applications/services/search#chain).
+
+Example `hosts.xml`:
+
+```xml
+
+
+
+ node1
+
+
+```
+
+By creating a directory containing this `services.xml`, `hosts.xml` and `components/com.yahoo.search.example.SimpleSearcher-1.0-deploy.jar`, that directory becomes a complete application package containing a bundle, which can now be deployed.
+
+### Deploying the Application Package
+
+Set up a Vespa instance using the [quick start](/en/basics/deploy-an-application-local). Once the component and the config are added to the application package, it can be [deployed](/en/basics/applications#deploying-applications) by running `vespa deploy`. These steps will copy any changed bundles to the nodes in the cluster that need them and switch queries over to running the new component versions.
+
+This works safely without requiring any processes to be restarted, even if the application package contains changes to classes which are already running queries. The switch is atomic from the point of view of the query - all queries will execute to completion, either using only the components of the last version of the application package or only the new ones, so interdependent changes in multiple searcher components can be deployed without problems.
+
+#### JNI requires restart
+
+The exception to the above is bundles containing JNI packages. There can only be one instance of the native library, so such bundles cannot reload. Best practice is to load the JNI library in the constructor, as this will cause the new bundle *not* to load, but continue on the current version. A subsequent restart will load the new bundle. This will hence not cause failures. Alternatively, if the JNI library is initialized lazily (e.g. on first invocation), bundle reloads will succeed, but subsequent invocations of code using the JNI library will fail. Hence, the new version will run, but fail.
+
+A warning is issued in the log when deploying rather than the normal *Switching to the latest deployed set of handlers* - example:
+
+```txt
+[2016-09-21 14:22:05.387] WARNING : container stderr Cannot load mylib native library
+```
+
+To minimize restarts, it is recommended to put JNI components in minimal, separate bundles. This will prevent reload of the JNI-bundles, unless the JNI-bundle itself is changed.
+
+#### Monitoring the active Application
+
+All containers also provide a built-in handler that outputs JSON formatted information about the active application, including its components and chains (it can also be configured to show [a user-defined version](/en/reference/applications/application-packages#versioning-application-packages)). The handler answers to requests with the path `/ApplicationStatus`. For example, if 'localhost' runs a container with HTTP configured on port 8080:
+
+```txt
+http://localhost:8080/ApplicationStatus
+```
+
+### Including third-party libraries
+
+External dependencies [can be included into the bundle](/en/applications/bundles#maven-bundle-plugin).
+
+### Exporting, importing and including packages in bundles
+
+[OSGi features information hiding - by default all the classes used inside a bundle are invisible from the outside.](/en/applications/bundles)
+
+### Global and exported packages
+
+The JDisc Container has one set of *global* packages. These are packages that are available with no import, and constitute the supported API of the JDisc Container. Backwards incompatible changes are not made to these packages.
+
+There is also a set of *exported* packages. These are available for import, and include all legacy packages, plus extension packages which are not part of the core API. Note that these are not considered to be "public" APIs, as global packages are, and backwards incompatible changes *can* be made to these packages, or they may be removed.
+
+The list of exported and global packages is available in the [container-disc pom.xml](https://github.com/vespa-engine/vespa/blob/master/container-disc/pom.xml), in `project/properties/exportedPackages` and `project/properties/globalPackages`.
+
+### Versions
+
+All the elements of the search container which may be referenced by an id may be *versioned*, that includes chains, components and query profiles. This allows multiple versions of these elements to be used at the same time, including multiple versions of the same classes, which is handy for [bucket testing](/en/applications/testing#feature-switches-and-bucket-tests) new versions.
+
+An id or id reference may include a version by using the following syntax: `name:version`. This works with ids in search requests, services.xml, code and query profiles.
+
+A version has the format:
+
+```txt
+major.minor.micro.qualifier
+```
+
+where major, minor and micro are integers and qualifier is a string. Any right-hand portion of the version string may be skipped. In *versions*, skipped values mean "0" (and *empty* for the qualifier). In *version references*, skipped values mean "unspecified". Any unspecified number will be matched to the highest number available, while a qualifier, if specified, *must* be matched exactly (qualifiers are rarely used).
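+
+The matching rule can be illustrated with a small sketch. It is not Vespa's implementation: `VersionMatchSketch`, `Version` and `resolve` are hypothetical names, and two points the text leaves open are filled in as assumptions here, namely that a reference without a qualifier accepts any qualifier, and that qualifiers are compared lexicographically as a final tie-breaker.
+
+```java
+import java.util.Collection;
+import java.util.Comparator;
+import java.util.Optional;
+
+final class VersionMatchSketch {
+
+    /** A concrete version; skipped numbers are 0 and a skipped qualifier is empty. */
+    record Version(int major, int minor, int micro, String qualifier) {
+        static Version parse(String s) {
+            String[] p = s.split("\\.", 4);
+            return new Version(num(p, 0), num(p, 1), num(p, 2), p.length > 3 ? p[3] : "");
+        }
+        private static int num(String[] p, int i) {
+            return p.length > i && !p[i].isEmpty() ? Integer.parseInt(p[i]) : 0;
+        }
+    }
+
+    // Highest version wins; the qualifier tie-breaker is an assumption of this sketch.
+    static final Comparator<Version> ORDER = Comparator.comparingInt(Version::major)
+            .thenComparingInt(Version::minor)
+            .thenComparingInt(Version::micro)
+            .thenComparing(Version::qualifier);
+
+    /** Picks the highest available version matching the (possibly partial) reference. */
+    static Optional<String> resolve(String reference, Collection<String> available) {
+        String[] ref = reference.isEmpty() ? new String[0] : reference.split("\\.", 4);
+        return available.stream()
+                .filter(v -> matches(ref, Version.parse(v)))
+                .max(Comparator.comparing(Version::parse, ORDER));
+    }
+
+    /** Specified numbers must match exactly; unspecified (skipped) parts match anything. */
+    private static boolean matches(String[] ref, Version v) {
+        int[] parts = {v.major(), v.minor(), v.micro()};
+        for (int i = 0; i < Math.min(ref.length, 3); i++)
+            if (Integer.parseInt(ref[i]) != parts[i]) return false;
+        return ref.length < 4 || ref[3].equals(v.qualifier());
+    }
+}
+```
+
+With versions `1.0.0`, `1.2.0`, `1.2.3` and `2.0.1` available, the reference `1` resolves to `1.2.3`, `1.0` resolves to `1.0.0`, and an empty reference resolves to `2.0.1`.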
+
+To specify the version of a bundle, specify version in pom.xml (we recommend not using *qualifier*):
+
+```xml
+<groupId>com.yahoo.example</groupId>
+<artifactId>MyPlugin</artifactId>
+<version>major.minor.micro</version>
+```
+
+This will automatically be used to set the `Bundle-Version` in the bundle manifest.
+
+For more details, see [component versioning](/en/reference/applications/components#component-versioning).
+
+## Troubleshooting
+
+### Container start
+
+If there is an error in the application package, it will usually be detected during the `vespa prepare` step and cause an error message. However, some classes of errors are only detected once the application is deployed. When redeploying an application, it is therefore recommended to watch the Vespa log by running:
+
+```txt
+vespa-logfmt -N
+```
+
+The new application is active after the INFO message:
+
+```txt
+Switched to the latest deployed set of handlers...;
+```
+
+If this message does not appear after a reasonable amount of time after completion of `vespa activate`, one will see some errors or warnings instead, that will help debug the application.
+
+### Component load
+
+At deployment or container start, components are constructed. Construction can fail - to debug, enable more logging (replace "container" as needed with container id):
+
+```txt
+$ vespa-logctl container:com.yahoo.container.di.componentgraph.core.ComponentNode debug=on
+.com.yahoo.container.di.componentgraph.core.ComponentNode ON ON ON ON ON ON ON OFF
+```
+
+Look for "Constructing" and "Finished constructing" in *vespa.log* - this identifies components that did not construct.
+
+Model download failures look like the below, and are caused by a failure to download the model to the container:
+
+```json
+ERROR container Container.com.yahoo.jdisc.core.StandaloneMain JDisc exiting: Throwable caught:
+exception=
+java.lang.RuntimeException: Not able to create config builder for payload '{
+"tokenizerPath": "\\"\\" https://huggingface.co/Snowflake/snowflake-arctic-embed-l/raw/main/tokenizer.json \\"\\"",
+"transformerModel": "\\"\\" https://huggingface.co/Snowflake/snowflake-arctic-embed-l/resolve/main/onnx/model_int8.onnx \\"\\"",
+"transformerMaxTokens": 512,
+"transformerInputIds": "input_ids",
+"transformerAttentionMask": "attention_mask",
+"transformerTokenTypeIds": "token_type_ids",
+"transformerOutput": "last_hidden_state",
+"normalize": true,
+"poolingStrategy": "cls",
+"transformerExecutionMode": "sequential",
+"transformerInterOpThreads": 1,
+"transformerIntraOpThreads": -4,
+"transformerGpuDevice": 0
+}
+```
+
+Check URLs and names, and that the model can be downloaded from the network where the Vespa Container is running.
diff --git a/mintlify-docs/en/applications/config-system.mdx b/mintlify-docs/en/applications/config-system.mdx
new file mode 100644
index 0000000000..244172ea43
--- /dev/null
+++ b/mintlify-docs/en/applications/config-system.mdx
@@ -0,0 +1,170 @@
+---
+title: "The Config System"
+description: "The config system in Vespa is responsible for turning the application package into live configuration of all the nodes, processes and components that realizes the running system. Here we deep dive into various aspects of how this works."
+---
+
+## Node configuration
+
+The problem of configuring nodes can be divided into three parts, each addressed by different solutions:
+
+- **Node system level configuration:** Configure OS level settings such as time zone as well as user privileges on the node.
+- **Package management**: Ensure that the correct set of software packages is installed on the nodes. This functionality is provided by three tools working together.
+- **Vespa configuration:** Starts the configured set of processes on each node with their configured startup parameters and provides dynamic configuration to the modules run by these services. *Configuration* here is any data which:
+
+ - can not be fixed at compile time
+ - is static most of the time
+
+Note that these definitions allow all the nodes to have the same software packages (disregarding version differences, discussed later), as variations in what services are run on each node and in their behavior are achieved entirely by using Vespa Configuration. This allows managing the complexity of node variations completely within the configuration system, rather than across multiple systems.
+
+Configuring a system can be divided into:
+
+- **Configuration assembly:** Assembly of a complete set of configurations for delivery from the inputs provided by the parties involved in configuring the system
+- **Configuration delivery:** Definition of individual configurations, APIs for requesting and accessing configuration, and the mechanism for delivering configurations from their source to the receiving components
+
+This division allows the problem of reliable configuration delivery in large distributed systems to be addressed in configuration delivery, while the complexities of assembling complete configurations can be treated as a vm-local design problem.
+
+An important feature of Vespa Configuration is the nature of the interface between the delivery and assembly subsystems. The assembly subsystem creates as output a (Java) object model of the distributed system. The delivery subsystem queries this model to obtain concrete configurations of all the components of the system. This allows the assembly subsystem to accept higher level, and simpler to use, abstractions as input and automatically derive detailed configurations with the correct interdependencies. This division insulates the external interface and the components being configured from changes in each other. In addition, the system model provides the home for logic implementing node/component instance variations of configuration.
+
+## Configuration assembly
+
+Config assembly is the process of turning the configuration input sources into an object model of the desired system, which can respond to queries for configs given a name and config id. Config assembly for Vespa systems can become complex, because it involves merging information owned by multiple parties:
+
+- **Vespa operations** own the nodes and control assignment of nodes to services/applications
+- **Vespa service providers** own services which host multiple applications running on Vespa
+- **Vespa applications** define the final applications running on nodes and shared services
+
+The current config model assembly procedure uses a single source - the *application package*. The application package is a directory structure containing defined files and subdirectories which together completely define the system - including which nodes belong in the system, which services they should run and the configuration of these services and their components. When the application deployer wants to change the application, [vespa prepare](/en/reference/clients/vespa-cli/vespa_prepare) is issued to a config server, with the application package as argument.
+
+At this point the system model is assembled and validated and any feedback is issued to the deployer. If the deployer decides to make the new configuration active, a [vespa activate](/en/reference/clients/vespa-cli/vespa_activate) is then issued, causing the config server cluster to switch to the new system model and respond with new configs on any active subscriptions where the new system model caused the config to change. This ensures that subscribers get new configs promptly on changes, and that the changes propagated are the minimal set, such that small changes to an application package cause correspondingly small changes to the system.
+
+
+
+
+
+The config model itself is pluggable, so that service providers may write plugins for assembling a particular service. The plugins are written in Java, and are installed together with the Vespa Configuration. Service plugins define their own syntax for specifying services that may be configured by Vespa applications. This allows the applications to be specified in an abstract manner, decoupled from the configuration that is delivered to the components.
+
+## Configuration delivery
+
+Configuration delivery encompasses the following aspects:
+
+- Definition of configurations
+- The component view (API) of configuration
+- Configuration delivery mechanism
+
+These aspects work together to realize the following goals:
+
+- Eliminate inconsistency between code and configuration.
+- Eliminate inconsistency between the desired configuration and the state on each node.
+- Limit temporary inconsistencies after reconfiguration.
+
+The next three subsections discuss the three aspects above, followed by subsections on two special concerns - bootstrapping and system upgrades.
+
+### Configuration definitions
+
+A *configuration* is a set of simple or array key-values with a name and a type, which can possibly be nested - example:
+
+```txt
+myProperty "myvalue"
+myArray[1]
+myArray[0].key1 "someValue"
+myArray[0].key2 1337
+```
+
+The *type definition* (or class) of a configuration object defines and documents the set of fields a configuration may contain with their types and default values. It has a name as well as a namespace. For example, the above config instance may have this definition:
+
+```txt
+namespace=foo.bar
+
+# Documentation of this key
+myProperty string default="foo"
+
+# etc.
+myArray[].key1 string
+myArray[].key2 int default=0
+```
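+
+How an instance relates to its definition can be sketched in a few lines of Java. This is a toy model, not Vespa code: the class `ConfigDefaults`, its `resolve` method and the flattening of nested keys into plain strings are assumptions made for illustration. It shows that a value set in the instance replaces the default from the definition, while unset keys keep their defaults:
+
+```java
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+/** Hypothetical helper, for illustration only; not part of the Vespa config library. */
+final class ConfigDefaults {
+
+    /** Overlays the values set in an instance on top of the defaults from the definition. */
+    static Map<String, String> resolve(Map<String, String> defaults,
+                                       Map<String, String> instanceValues) {
+        Map<String, String> resolved = new LinkedHashMap<>(defaults);
+        resolved.putAll(instanceValues); // values set in the instance win over defaults
+        return resolved;
+    }
+}
+```
+
+With the definition and instance above, `myProperty` resolves to `myvalue` and `myArray[0].key2` to `1337`. With an empty instance, `myProperty` falls back to `foo`.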
+
+An individual config typically contains a coherent set of settings regarding some topic, such as *logging* or *indexing*. A complete system consists of many instances of many config types.
+
+### Component view
+
+Individual components of a system consume one or more such configs and use their values to influence their behavior. APIs are needed for *requesting* configs and for *accessing* the values of those configs as they are provided.
+
+*Access* to configs happens through a (Java or C++) class generated from the config definition file. This ensures that any inconsistency between the fields declared in a config type and the expectations of the code accessing it is caught at compile time. The config definition is best viewed as another class, with an alternative form of source syntax, belonging to the components consuming it. A Maven target is provided for generating such classes from config definition types.
+
+Components may use two different methods for *requesting* configurations: subscription and dependency injection.
+
+**Subscription:** The component sets up a *ConfigSubscriber*, then subscribes to one or more configs. This is the simple approach; there are [other ways of](/en/applications/configapi-dev) getting configs too:
+
+```java
+ConfigSubscriber subscriber = new ConfigSubscriber();
+ConfigHandle<MyConfig> handle = subscriber.subscribe(MyConfig.class, "myId");
+if (!subscriber.nextConfig()) throw new RuntimeException("Config timed out.");
+if (handle.isChanged()) {
+    String message = handle.getConfig().myKey();
+    // ... consume the rest of this config
+}
+```
+
+**Dependency injection:** The component declares its config dependencies in the constructor and subscriptions are set up on its behalf. When changed configs are available a new instance of the component is created. The advantage of this method is that configs are immutable throughout the lifetime of the component such that no thread coordination is required. This method is currently only available in Java using the [Container](/en/applications/containers).
+
+```java
+public MyComponent(MyConfig config) {
+    String myKey = config.myKey();
+    // ... consume the rest of this config
+}
+```
+
+For unit testing, [configs can be created with Builders](/en/applications/configapi-dev#unit-testing) and passed directly to components.
+
+### Delivery mechanism
+
+The config delivery mechanism is responsible for ensuring that a new config instance is delivered to subscribing components each time a change to the system model causes that config instance to change. A config subscription is identified by two parameters: the *config definition name and namespace*, and the [config id](/en/applications/configapi-dev#config-id) used to identify the particular component instance making the subscription.
+
+The in-process config library will forward these subscription requests to a node local [config proxy](/en/operations/self-managed/config-proxy), which provides caching and fan-in from processes to node. The proxy in turn issues these subscriptions to a node in the configuration server cluster, each of which hosts a copy of the system model and resolves config requests by querying the system model.
+
+To provide config server failover, the config subscriptions are implemented as long-timeout gets, which are immediately resent when they time out, but conceptually this is best understood as push subscriptions.
+
+
+
+
+
+As configs are not stored as files locally on the nodes, there is no possibility of inconsistencies due to local edits, or of nodes coming out of maintenance with a stale configuration. As configuration changes are pushed as soon as the config server cluster allows, temporary inconsistencies during reconfiguration are minimized, although not avoided, as there is no global transaction.
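+
+The long-timeout get described above can be sketched as follows. This is a toy model under stated assumptions, not Vespa's protocol or implementation: the class `LongPollConfigSource` and its methods `activate` and `get` are hypothetical names. It shows why the pattern behaves like a push subscription: a request blocks until the generation changes or the timeout expires, and the caller simply resends it.
+
+```java
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+
+/** Hypothetical stand-in for a config server; for illustration only. */
+final class LongPollConfigSource {
+    private final ReentrantLock lock = new ReentrantLock();
+    private final Condition changed = lock.newCondition();
+    private long generation = 0;
+
+    /** Simulates activating a new application package. */
+    void activate() {
+        lock.lock();
+        try {
+            generation++;
+            changed.signalAll(); // wake up every pending long-timeout get
+        } finally {
+            lock.unlock();
+        }
+    }
+
+    /**
+     * Blocks until the generation differs from knownGeneration or the timeout expires.
+     * Returns the current generation in both cases; the caller compares and resends.
+     */
+    long get(long knownGeneration, long timeoutMillis) {
+        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
+        lock.lock();
+        try {
+            try {
+                while (generation == knownGeneration && remainingNanos > 0) {
+                    remainingNanos = changed.awaitNanos(remainingNanos);
+                }
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt(); // stop waiting and report what we have
+            }
+            return generation;
+        } finally {
+            lock.unlock();
+        }
+    }
+}
+```
+
+A subscriber would call `get` in a loop with the last generation it saw. An unchanged return value means the request timed out and is sent again, possibly to another server, which is what gives failover.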
+
+Application code and config are generally pulled from the config server. It is, however, possible to use the [url](/en/reference/applications/config-files#url) config type to refer to any resource to download to nodes.
+
+### Bootstrapping
+
+Each Vespa node runs a [config-sentinel](/en/operations/self-managed/config-sentinel) process, which starts and maintains the services running on the node.
+
+### System upgrades
+
+On minor upgrades, which cause discrepancies between the config definitions requested and those produced by the configuration model, the configuration server will up/downgrade between config versions on the fly. Major upgrades, which involve incompatible changes to the configuration protocol or the system model, require a [procedure](/en/operations/self-managed/config-proxy).
+
+## Notes
+
+Find more information for using the Vespa config API in the [reference doc](/en/applications/configapi-dev).
+
+Vespa Configuration makes the following assumptions about the nodes using it:
+
+- All nodes have the software packages needed to run the configuration system and any services which will be configured to run on the node. This usually means that all nodes have the same software, although this is not a requirement
+- All nodes have [VESPA_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) set
+- All nodes know their fully qualified domain name
+
+Reading this document is not necessary in order to use Vespa or to develop Java components for the Vespa container - for this purpose, refer to [Configuring components](/en/applications/configuring-components).
+
+## Further reads
+
+- [Configuration server operations](/en/operations/self-managed/configuration-server) is a good resource for troubleshooting.
+- Refer to the [bundle plugin](/en/applications/bundles#maven-bundle-plugin) for how to build an application package with Java components.
+- During development on a local instance it can be handy to just wipe the state completely and start over:
+  1. [Delete all config server state](/en/operations/self-managed/configuration-server#zookeeper-recovery) on all config servers
+  2. Run [vespa-remove-index](/en/reference/operations/self-managed/tools#vespa-remove-index) to wipe content nodes
+
diff --git a/mintlify-docs/en/applications/configapi-dev.mdx b/mintlify-docs/en/applications/configapi-dev.mdx
new file mode 100644
index 0000000000..eff0900877
--- /dev/null
+++ b/mintlify-docs/en/applications/configapi-dev.mdx
@@ -0,0 +1,329 @@
+---
+title: "Cloud Config API"
+sidebarTitle: "Java Config API"
+description: "This document describes how to use the C++ and Java versions of the Cloud config API (the 'config API'). This API is used internally in Vespa, and reading this document is not necessary in order to use Vespa or to develop Java components for the Vespa container. For this purpose, please refer to [Configuring components](/en/applications/configuring-components) instead."
+---
+
+Throughout this document, an application serving a configurable message is used as the example.
+
+## Creating a Config Definition
+
+The first thing to do when deciding to use the config API is to define the config you want to use in your application. This is described in the [configuration file reference](/en/reference/applications/config-files). Here we will use the definition `motd.def` from the complete example at the end of the document:
+
+```text
+namespace=myproject
+
+message string default="NO MESSAGE"
+port int default=1337
+```
+
+## Generating Source Code and Accessing Config in Code
+
+Before you can access config in your program you will need to generate source code for the config definition. Simple steps for how you can generate API code and use the API are provided for [Java](/en/applications/configapi-dev#the-java-config-api). See also the [javadoc](https://javadoc.io/doc/com.yahoo.vespa/config-lib).
+
+We also recommend that you read the [general guidelines](#guidelines) for examples of advanced usage and recommendations for how to use the API.
+
+## Config ID
+
+The config id specified when requesting config is essentially an identifier of the component requesting config. The config server contains a config object model, which maps a request for a given config name and config id to the correct configproducer instance, which will merge default values from the config definition with config from the object model and config set in `services.xml` to produce the final config instance.
+
+The config id is given to a service via the VESPA\_CONFIG\_ID environment variable. The [config sentinel](/en/operations/self-managed/config-sentinel) sets the environment variable to the id given by the config model. This id should then be used by the service to subscribe for config. If you are running multiple services, each of them will be assigned a **unique config id** for that service, and a service should not subscribe using any config id other than its own.
+
+If you need to get config for a service that is not part of the model (i.e. it is not specified in services.xml), but that you want to specify values for in services.xml, use the config id `client`.
+
+## Schema Compatibility Rules
+
+A schema incompatibility occurs if the config class (for example `MotdConfig` in the Java section below) was built from a different def-file than the one the server is seeing and using to serve config. Some such incompatibilities are automatically handled by the config system, others lead to errors. This is useful to know during development/testing of a config schema.
+
+Let *S* denote a config definition called *motd* which the server is using, and *C* denote a config definition also called *motd* which the client is using, i.e. the one that created `MotdConfig` used when subscribing. The following is the system's behavior:
+
+| Change type | Behavior |
+| :--- | :--- |
+| Compatible changes | These schema mismatches are handled automatically by the config server:<br/>- C is missing a config value that S has: the server omits that value from the response.<br/>- C has an additional config value with a default value: the server includes that value in the response.<br/>- C and S both have a config value, but the default values differ: the server uses C's default value. |
+| Incompatible changes | These schema mismatches are not handled by the config server, and will typically lead to an error in the subscription API because of missing values (though in principle some consumers of config may tolerate them):<br/>- C has an additional config value without a default value: the server does not include anything for that value.<br/>- C has the type of a config value changed, for example from string to int: the server prints an error message and does not include anything for that value. The user must use an entirely new name for the config if such a change must be made. |
+
+As with any data schema, it is wise to be conservative about changing it if the system will have new versions in the future. For a `def` schema, removing a config value constitutes a semantic change that may lead to problems when an older version of some config subscriber asks for config. In large deployments, the risk associated with this increases, because of the higher cost of a full restart of everything.
+
+Consequently, prefer creating a new config name over removing a config value from a schema.
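+
+The rules in the table above can be modelled in a few lines of Java. This is a toy model for reasoning about schema changes, not the config server's code: the class `SchemaCompat`, its nested `Field` type and the `resolve` method are hypothetical, and values are simplified to strings. A field missing from the result models a value the client never receives.
+
+```java
+import java.util.LinkedHashMap;
+import java.util.Map;
+
+/** Hypothetical model of the compatibility rules; for illustration only. */
+final class SchemaCompat {
+
+    /** A field in a def file: its type and default value (null when no default is declared). */
+    static final class Field {
+        final String type;
+        final String defaultValue;
+
+        Field(String type, String defaultValue) {
+            this.type = type;
+            this.defaultValue = defaultValue;
+        }
+    }
+
+    /**
+     * Returns the values a client built from definition c ends up with when the
+     * server uses definition s and holds the explicitly set values in serverValues.
+     */
+    static Map<String, String> resolve(Map<String, Field> c, Map<String, Field> s,
+                                       Map<String, String> serverValues) {
+        Map<String, String> result = new LinkedHashMap<>();
+        for (Map.Entry<String, Field> entry : c.entrySet()) {
+            String name = entry.getKey();
+            Field clientField = entry.getValue();
+            Field serverField = s.get(name);
+            if (serverField != null && !serverField.type.equals(clientField.type)) {
+                continue; // type changed: incompatible, nothing is included
+            }
+            // Explicit values only exist for fields the server knows about.
+            String value = (serverField != null) ? serverValues.get(name) : null;
+            if (value == null) {
+                value = clientField.defaultValue; // C's default wins over S's default
+            }
+            if (value != null) {
+                result.put(name, value);
+            }
+        }
+        return result; // fields that only S has are never visited, so they are omitted
+    }
+}
+```
+
+A field that C declares without a default and that S does not know ends up absent from the result, which corresponds to the "missing values" error case in the subscription API.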
+
+## Creating a Deployable Application Package
+
+The application package consists of the following files:
+
+```
+app/services.xml
+app/hosts.xml
+```
+
+The services file contains the services that are handled by the config model plugin. The hosts file contains:
+```xml
+<hosts>
+    <host name="localhost">
+        <alias>node0</alias>
+    </host>
+</hosts>
+```
+
+## Setting Up a Running System
+
+To get a running system, first install the cloudconfig package, start the config server, then deploy the application.
+
+Prepare the application:
+
+`$ vespa prepare /path/to/app/folder`
+
+Activate the application:
+
+`$ vespa activate /path/to/app/folder`
+
+Then, start vespa. This will start the application and pass it its config id via the VESPA\_CONFIG\_ID environment variable.
+
+## Advanced Usage of the Config API
+
+For a simple application, having only one config may suffice. In a typical server application, however, the number of config settings can become large. Therefore, we **encourage** you to split the config settings into multiple logical classes. This section covers how you can use a ConfigSubscriber to subscribe to multiple configs and how you should group configs based on their dependencies. Configs can either be:
+
+- Independent static configs
+- Dependent static configs
+- Dependent dynamic configs
+
+We will give a few examples of how you can cope with these different scenarios. The code examples are given in a pseudo format common to C++ and Java, but they should be easy to convert to their language specific equivalents.
+
+### Independent Static Configs
+
+Independent configs means that it does not matter if one of them is updated independently of the other. In this case, you might as well use one ConfigSubscriber for each of the configs, but it might become tedious to check all of them. Therefore, the recommended way is to manage all of these configs using one ConfigSubscriber. In this setup, it is also typical to split the subscription phase from the config check/retrieval part. The subscribing part:
+
+```c++ C++
+ConfigSubscriber subscriber;
+ConfigHandle<FooConfig>::UP fooHandle = subscriber.subscribe<FooConfig>(...);
+ConfigHandle<BarConfig>::UP barHandle = subscriber.subscribe<BarConfig>(...);
+ConfigHandle<BazConfig>::UP bazHandle = subscriber.subscribe<BazConfig>(...);
+```
+```java Java
+ConfigSubscriber subscriber;
+ConfigHandle<FooConfig> fooHandle = subscriber.subscribe(FooConfig.class, ...);
+ConfigHandle<BarConfig> barHandle = subscriber.subscribe(BarConfig.class, ...);
+ConfigHandle<BazConfig> bazHandle = subscriber.subscribe(BazConfig.class, ...);
+```
+
+And the retrieval part:
+
+```
+if (subscriber.nextConfig()) {
+    if (fooHandle->isChanged()) {
+        // Reconfigure foo
+    }
+    if (barHandle->isChanged()) {
+        // Reconfigure bar
+    }
+    if (bazHandle->isChanged()) {
+        // Reconfigure baz
+    }
+}
+```
+
+This allows you to perform the config fetch part either in its own thread or as part of some other event thread in your application.
+
+### Dependent Static Configs
+
+Dependent configs means that one of your configs depends on the value in another config. The most common case is that one config contains the config id to use when subscribing to the second config. In addition, your system may require that the configs are updated to the same **generation**.
+
+
+**Note:**
+
+A generation is a monotonically increasing number which is increased each time an application is deployed with `vespa deploy`. Certain applications may require that all configs are of the same generation to ensure consistency, especially container-like applications. All configs subscribed to by a ConfigSubscriber are guaranteed to be of the same generation.
+
+
+The configs are static in the sense that the config id used does not change. The recommended way to approach this is to use a two phase setup, where you fetch the initial configs in the first phase, and then subscribe to both the initial and derived configs in order to ensure that they are of the same generation. Assume that the InitialConfig config contains two fields named *derived1* and *derived2*:
+
+```c++ C++
+ConfigSubscriber initialSubscriber;
+ConfigHandle<InitialConfig>::UP initialHandle = initialSubscriber.subscribe<InitialConfig>(...);
+while (!initialSubscriber.nextConfig()); // Ensure that we actually get initial config.
+std::unique_ptr<InitialConfig> initialConfig = initialHandle->getConfig();
+
+ConfigSubscriber subscriber;
+... = subscriber.subscribe<InitialConfig>(...);
+... = subscriber.subscribe<DerivedConfig>(initialConfig->derived1);
+... = subscriber.subscribe<DerivedConfig>(initialConfig->derived2);
+```
+
+```java Java
+ConfigSubscriber initialSubscriber;
+ConfigHandle<InitialConfig> initialHandle = initialSubscriber.subscribe(InitialConfig.class, ...);
+while (!initialSubscriber.nextConfig()); // Ensure that we actually get initial config.
+InitialConfig initialConfig = initialHandle.getConfig();
+
+ConfigSubscriber subscriber;
+... = subscriber.subscribe(InitialConfig.class, ...);
+... = subscriber.subscribe(DerivedConfig.class, initialConfig.derived1);
+... = subscriber.subscribe(DerivedConfig.class, initialConfig.derived2);
+```
+
+You can then check the configs in the same way as for independent static configs, and be sure that all your configs are of the same generation. The reason you need to create a new ConfigSubscriber is that **once you have called nextConfig(), you cannot add or remove subscriptions**.
+
+### Dependent Dynamic Configs
+
+Dynamic configs mean that the set of configs that you subscribe for may change between each deployment. This is the hardest case to solve, and how hard it is depends on how many levels of configs you have. The most common one is to have a set of bootstrap configs, and another set of configs that may change depending on the bootstrap configs (typically in an application that has plugins). To cover this case, you can use a class named `ConfigRetriever`. Currently, it is **only available in the C++ API**.
+
+The ConfigRetriever uses the same mechanisms as the ConfigSubscriber to ensure that you get a consistent set of configs. In addition, two more classes called `ConfigKeySet` and `ConfigSnapshot` are added. The ConfigRetriever takes in a set of configs used to bootstrap the system in its constructor. This set does not change. It then provides one method, `getConfigs(ConfigKeySet)`. The method returns a ConfigSnapshot of the next generation of bootstrap configs or derived configs.
+
+To create the ConfigRetriever, you must first populate a set of bootstrap configs:
+
+```c++
+ConfigKeySet bootstrapKeys;
+bootstrapKeys.add<BootstrapFooConfig>(configId);
+bootstrapKeys.add<BootstrapBarConfig>(configId);
+```
+
+The bootstrap configs are typically configs that will always be needed by your application. Once you have defined your set, you can create the retriever and fetch a ConfigSnapshot of the bootstrap configs:
+
+```c++
+ConfigRetriever retriever(bootstrapKeys);
+ConfigSnapshot bootstrapConfigs = retriever.getConfigs();
+```
+
+The ConfigSnapshot contains the bootstrap config, and you may use that to fetch the individual configs. You need to provide the config id and the type in order for the snapshot to know which config to look for:
+
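+```c++
+if (!bootstrapConfigs.empty()) {
+    std::unique_ptr<BootstrapFooConfig> bootstrapFoo = bootstrapConfigs.getConfig<BootstrapFooConfig>(configId);
+    std::unique_ptr<BootstrapBarConfig> bootstrapBar = bootstrapConfigs.getConfig<BootstrapBarConfig>(configId);
+    // ... use the bootstrap configs
+}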
+```
+
+The snapshot returned is empty if the retriever was unable to get the configs. In that case, you can try calling the same method again.
+
+Once you have the bootstrap configs, you know the config ids for the other components that you should subscribe for, and you can define a new key set. Let's assume that bootstrapFoo contains an array of config ids we should subscribe for.
+
+```c++
+ConfigKeySet pluginKeySet;
+for (size_t i = 0; i < (*bootstrapFoo).pluginConfigId.size(); i++) {
+    pluginKeySet.add<PluginConfig>((*bootstrapFoo).pluginConfigId[i]);
+}
+```
+
+In this example we know the type of config requested, but this could be done in another way letting the plugin add keys to the set.
+
+Now that the derived configs have been added to the pluginKeySet, we can request a snapshot of them:
+
+```c++
+ConfigSnapshot pluginConfigs = retriever.getConfigs(pluginKeySet);
+if (!pluginConfigs.empty()) {
+    // Configure each plugin with a config picked from the snapshot.
+}
+```
+And that's it. When calling the method without any key parameters, the snapshot returned by this method may be empty if **the config could not be fetched within the timeout**, or **the generation of configs has changed**. To check if you should call getBootstrapConfigs() again, you can use the `bootstrapRequired()` method. If it returns true, you will have to call getBootstrapConfigs() again, because the plugin configs have been updated, and you need a new bootstrap generation to match it. If it returns false, you may call getConfigs() again to try and get a new generation of plugin configs.
+
+We recommend that you use the retriever API if you have a use case like this. The alternative is to create your own mechanism using two ConfigSubscriber classes, but this is **not** recommended.
+
+### Advice on Config Modelling
+
+Regardless of which of these types of configs you have, it is recommended that you always fetch all the configs you need **before** you start configuring your system. This is because the user may deploy multiple different versions of the config, which may cause your components to get conflicting config values. A common pitfall is to treat dependent configs as independent, thereby causing inconsistency in your application when a config update for config A arrives before config B. The ConfigSubscriber was created to minimize the possibility of making this mistake, by ensuring that all of the configs come from the same config reload.
+
+**Tip:** Set up your entire *tree* of configs in one thread to ensure consistency, and configure your system once all of the configs have arrived. This also maps best to the ConfigSubscriber, since it is not thread safe.
+
+## The Java config API
+
+Assumption: a [def file](/en/applications/configapi-dev), which is the schema for one of your configs, is created and put in `src/main/resources/configdefinitions/`.
+
+To generate source code for the def-file, invoke the `config-class-plugin` from *pom.xml*, in the `<build>`, `<plugins>` section:
+
+```xml
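+<plugin>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>config-class-plugin</artifactId>
+    <version>${vespa.version}</version>
+    <executions>
+        <execution>
+            <id>config-gen</id>
+            <goals>
+                <goal>config-gen</goal>
+            </goals>
+        </execution>
+    </executions>
+</plugin>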
+```
+
+The generated classes will be saved to `target/generated-sources/vespa-configgen-plugin`, when the `generate-sources` phase of the build is executed. The def-file [`motd.def`](/en/applications/configapi-dev) is used in this tutorial, and a class called `MotdConfig` was generated (in the package `myproject`). It is a subtype of `ConfigInstance`.
+
+When using only the config system (and not other parts of Vespa or the JDisc container), add this dependency to pom.xml:
+
+```xml
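+<dependency>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>config</artifactId>
+    <version>${vespa.version}</version>
+    <scope>provided</scope>
+</dependency>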
+```
+
+## Subscribing and getting config
+
+To retrieve the config in the application, create a `ConfigSubscriber`. A `ConfigSubscriber` is capable of subscribing to one or more configs. The example shown here uses simplified error handling:
+
+```java
+ConfigSubscriber subscriber = new ConfigSubscriber();
+ConfigHandle<MotdConfig> handle = subscriber.subscribe(MotdConfig.class, "motdserver2/0");
+if (!subscriber.nextConfig()) throw new RuntimeException("Config timed out.");
+if (handle.isChanged()) {
+    String message = handle.getConfig().message();
+    int port = handle.getConfig().port();
+}
+```
+
+Note that `isChanged()` is always true after the first call to `nextConfig()`; it is included here to illustrate the API.
+
+In many cases this is done from a thread which loops on the `nextConfig()` call and reconfigures the application if `isChanged()` is true.
+
+The second parameter to `subscribe()`, *"motdserver2/0"*, is the [config id](/en/applications/configapi-dev#config-id).
+
+If one `ConfigSubscriber` subscribes to multiple configs, `nextConfig()` will only return true if the configs are of the same generation, i.e. they are "in sync".
+
+See the [com.yahoo.config](https://javadoc.io/doc/com.yahoo.vespa/config-lib) javadoc for details. Example:
+
+```java
+ConfigSubscriber subscriber = new ConfigSubscriber();
+ConfigHandle<MotdConfig> motdHandle = subscriber.subscribe(MotdConfig.class, "motdserver2/0");
+ConfigHandle<AnotherConfig> anotherHandle = subscriber.subscribe(AnotherConfig.class, "motdserver2/0");
+if (!subscriber.nextConfig()) throw new RuntimeException("Config timed out.");
+// We now have a synchronized new generation for these two configs.
+if (motdHandle.isChanged()) {
+    String message = motdHandle.getConfig().message();
+    int port = motdHandle.getConfig().port();
+}
+if (anotherHandle.isChanged()) {
+    String myfield = anotherHandle.getConfig().myField();
+}
+```
+
+## Simplified subscription
+
+In cases like the first example above, where you only subscribe to one config, you may also subscribe using the `ConfigSubscriber.SingleSubscriber` interface. In this case, you define a `configure()` method from the interface, and call a special `subscribe()`. The method will start a dedicated config fetcher thread for you. The method will throw an exception in the user thread if initial configuration fails, and print a warning in the config thread if it fails afterwards. Example:
+
+```java
+public class MyConfigSubscriber implements ConfigSubscriber.SingleSubscriber<MotdConfig> {
+
+    public MyConfigSubscriber(String configId) {
+        new ConfigSubscriber().subscribe(this, MotdConfig.class, configId);
+    }
+
+    @Override
+    public void configure(MotdConfig config) {
+        // configuration logic here
+    }
+}
+```
+
+The disadvantage to using this is that one cannot implement custom error handling or otherwise track config changes. If needed, use the generic method above.
+
+## Unit testing config
+
+When instantiating a [ConfigSubscriber](https://javadoc.io/doc/com.yahoo.vespa/config/latest/com/yahoo/config/subscription/ConfigSubscriber.html), one can give it a [ConfigSource](https://javadoc.io/doc/com.yahoo.vespa/config/latest/com/yahoo/config/subscription/ConfigSource.html). One such source is a `ConfigSet`. It consists of a set of `Builder`s. This is an example of instantiating a subscriber using this - it uses 2 types of config, that were generated from files `app.def` and `string.def`:
+
+```java
+ConfigSet myConfigs = new ConfigSet();
+AppConfig.Builder a0builder = new AppConfig.Builder().message("A message, 0").times(88);
+AppConfig.Builder a1builder = new AppConfig.Builder().message("A message, 1").times(89);
+myConfigs.add("app/0", a0builder);
+myConfigs.add("app/1", a1builder);
+myConfigs.add("bar", new StringConfig.Builder().stringVal("StringVal"));
+ConfigSubscriber subscriber = new ConfigSubscriber(myConfigs);
+```
+
+To help with unit testing, each config type has a corresponding builder type. The `Builder` is mutable whereas the `ConfigInstance` is not. Use this to set up config fixtures for unit tests. The `ConfigSubscriber` has a `reload()` method which is used in tests to force the subscriptions into a new generation. It emulates a `vespa activate` operation after you have updated the `ConfigSet`.
+
+A full example can be found in [ConfigSetSubscriptionTest.java](https://github.com/vespa-engine/vespa/blob/master/config/src/test/java/com/yahoo/config/subscription/ConfigSetSubscriptionTest.java).
diff --git a/mintlify-docs/en/applications/configuring-components.mdx b/mintlify-docs/en/applications/configuring-components.mdx
new file mode 100644
index 0000000000..11f6cf4534
--- /dev/null
+++ b/mintlify-docs/en/applications/configuring-components.mdx
@@ -0,0 +1,138 @@
+---
+title: "Configuring Java components"
+description: "Any Java component might require some sort of configuration, be it simple strings or integers, or more complex structures. Because of all the boilerplate code that commonly goes into classes to hold such configuration, this often degenerates into a collection of key-value string pairs (e.g. [javax.servlet.FilterConfig](https://docs.oracle.com/javaee/6/api/javax/servlet/FilterConfig)). To avoid this, Vespa has custom, type-safe configuration to all [Container](/en/applications/containers) components. Get started with the [Developer Guide](/en/applications/developer-guide), try the [album-recommendation-java](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java) sample application."
+---
+
+Configurable components in short:
+
+- Create a [config definition](/en/reference/applications/config-files#config-definition-files) file
+- Use the Vespa [bundle plugin](/en/applications/bundles#maven-bundle-plugin) to generate a config class from the definition
+- Inject config objects in the application code
+
+The application code interfaces with config through the generated code, so code and config are always in sync. This configuration should be used for all state which is assumed to stay constant for the *lifetime of the component instance*. Use [deploy](/en/basics/applications) to push and activate code and config changes.
+
+## Config definition
+
+Write a [config definition](/en/reference/applications/config-files#config-definition-files) file and place it in the application's `src/main/resources/configdefinitions/` directory, e.g. `src/main/resources/configdefinitions/my-component.def`:
+
+```text
+package=com.mydomain.mypackage
+
+myCode int default=42
+myMessage string default=""
+```
+
+## Generating config classes
+
+Generating config classes is done by the *bundle plugin*:
+
+```bash
+$ mvn generate-resources
+```
+
+The generated config classes are written to `target/generated-sources/vespa-configgen-plugin/`. In the above example, the config definition file was named *my-component.def* and its package declaration is *com.mydomain.mypackage*. The full name of the generated Java class will be *com.mydomain.mypackage.MyComponentConfig*.
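+
+The naming rule can be illustrated with a small helper. This is a sketch of the rule as it appears in the examples in this documentation, not the plugin's own code: the class `ConfigClassNames` and its method are hypothetical, and treating underscores like hyphens is an assumption.
+
+```java
+/** Hypothetical helper, for illustration only; not part of the bundle plugin. */
+final class ConfigClassNames {
+
+    /** Derives the fully qualified config class name from a package and a def file name. */
+    static String generatedClassName(String packageName, String defFileName) {
+        String base = defFileName.endsWith(".def")
+                ? defFileName.substring(0, defFileName.length() - ".def".length())
+                : defFileName;
+        StringBuilder simpleName = new StringBuilder();
+        for (String part : base.split("[-_]")) {
+            if (part.isEmpty()) continue; // skip empty parts from repeated separators
+            simpleName.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
+        }
+        return packageName + "." + simpleName + "Config";
+    }
+}
+```
+
+Calling `generatedClassName("com.mydomain.mypackage", "my-component.def")` gives `com.mydomain.mypackage.MyComponentConfig`, matching the example above.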
+
+It is a good idea to generate the config classes first, *then* resolve dependencies and compile in the IDE.
+
+## Using config in code
+
+The generated config class is now available for the component through [constructor injection](/en/applications/dependency-injection), which means that the component can declare the generated class as one of its constructor arguments:
+
+```java
+package com.mydomain.mypackage;
+public class MyComponent {
+    private final int code;
+    private final String message;
+    @Inject
+    public MyComponent(MyComponentConfig config) {
+        code = config.myCode();
+        message = config.myMessage();
+    }
+}
+```
+
+The Container will create and inject the config instance. To override the default values of the config, [specify](/en/reference/applications/config-files#generic-configuration-in-services-xml) values in `src/main/application/services.xml`, like:
+
+```xml
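+<container version="1.0">
+    <component id="com.mydomain.mypackage.MyComponent">
+        <config name="com.mydomain.mypackage.my-component">
+            <myCode>132</myCode>
+            <myMessage>Hello, World!</myMessage>
+        </config>
+    </component>
+</container>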
+```
+
+and the deployed instance of `MyComponent` is constructed using a corresponding instance of `MyComponentConfig`.
+
+## Unit testing configurable components
+
+The generated config class provides a builder API that makes it easy to create config objects for unit testing. Example that sets up a unit test for the `MyComponent` class from the example above:
+
+```java
+import static com.mydomain.mypackage.MyComponentConfig.*;
+public class MyComponentTest {
+    @Test
+    public void requireThatMyComponentGetsConfig() {
+        MyComponentConfig config = new MyComponentConfig.Builder()
+                .myCode(668)
+                .myMessage("Neighbour of the beast")
+                .build();
+        MyComponent component = new MyComponent(config);
+        …
+    }
+}
+```
+
+The config class used here is simple — see a separate example of [building a complex configuration object](/en/applications/unit-testing#unit-testing-configurable-components).
+
+## Adding files to the component configuration
+
+This section describes what to do if the component needs larger configuration objects that are stored in files, e.g. machine-learned models, [automata](/en/reference/operations/tools#vespa-makefsa) or large tables. Before proceeding, take a look at how to create [provider components](/en/applications/dependency-injection#special-components) — instead of integrating large objects into e.g. a searcher or processor, it might be better to split the resource-demanding part of the component's configuration into a separate provider component. The procedure described below can be applied to any component type.
+
+Files can be transferred using either [file distribution](/en/applications/deployment#file-distribution) or URL download. File distribution is used when the files are added to the application package. If for some reason this is not convenient, e.g. due to size, origin of file or update frequency, Vespa can download the file and make it available for the component. Both types are set up in the config definition file. File distribution uses the `path` config type, and URL downloading the `url` type. You can also use the `model` type for machine-learned models that can be referenced by both model-id, used on Vespa Cloud, and url/path, used on self-hosted deployments. See [the config file reference](/en/reference/applications/config-files) for details.
+
+In the following example we will show the usage of all three types. Assume this config definition, named `my-component.def`:
+
+```txt
+package=com.mydomain.mypackage
+
+myFile path
+myUrl url
+myModel model
+```
+
+The file must reside in the application package, and the path (relative to the application package root) must be given in the component's configuration in `services.xml`:
+
+```xml
+
+
+
+ my-files/my-file.txt
+ /en/reference/query-api-reference.html
+
+
+
+
+```
+
+An example component that uses these files:
+
+```java
+package com.mydomain.mypackage;
+import java.io.File;
+import java.nio.file.Path;
+public class MyComponent {
+    // Files of type path and model are exposed as Path, url downloads as File
+    private final Path pathFromFileDistribution;
+    private final File fileFromUrlDownload;
+    private final Path modelFilePath;
+    public MyComponent(MyComponentConfig config) {
+        pathFromFileDistribution = config.myFile();
+        fileFromUrlDownload = config.myUrl();
+        modelFilePath = config.myModel();
+    }
+}
+```
+
+The `myFile()` and `myModel()` getters return a `java.nio.file.Path` object, while the `myUrl()` getter returns a `java.io.File` object. The container framework guarantees that these files are fully present at the given location before the component constructor is invoked, so they can always be accessed right away.
+
+When the client asks for config that uses the `url` or `model` config type with a URL, the content will be downloaded and cached on the nodes that need it. If you want to change the content, the application package needs to be updated with a new URL for the changed content and the application [deployed](/en/basics/applications), otherwise the cached content will still be used. This avoids unintended changes to the application if the content of a URL changes.
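+
+The caching rule above can be illustrated with a minimal plain-Java sketch. It is not Vespa code: the class `UrlCacheSketch`, the downloader function and the `example.com` URLs are all hypothetical, and the sketch only assumes that cached content is keyed by URL, as the paragraph describes.
+
+```java
+import java.util.HashMap;
+import java.util.Map;
+import java.util.function.Function;
+
+// Illustration only: a URL-keyed cache, not the actual Vespa implementation.
+public class UrlCacheSketch {
+    private final Map<String, String> cache = new HashMap<>();
+    private final Function<String, String> downloader; // hypothetical download step
+    int downloads = 0;
+
+    UrlCacheSketch(Function<String, String> downloader) {
+        this.downloader = downloader;
+    }
+
+    // Returns cached content for a known URL, downloads only for a new URL.
+    String resolve(String url) {
+        return cache.computeIfAbsent(url, u -> {
+            downloads++;
+            return downloader.apply(u);
+        });
+    }
+
+    public static void main(String[] args) {
+        UrlCacheSketch nodeCache = new UrlCacheSketch(url -> "content of " + url);
+        nodeCache.resolve("https://example.com/model-v1.bin");
+        nodeCache.resolve("https://example.com/model-v1.bin"); // same URL: cache hit
+        nodeCache.resolve("https://example.com/model-v2.bin"); // new URL: downloaded
+        System.out.println(nodeCache.downloads); // 2
+    }
+}
+```
+
+In this sketch, changing what the first URL serves would go unnoticed; only a new URL triggers a download, which mirrors why the application package must be updated and redeployed.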
diff --git a/mintlify-docs/en/applications/containers.mdx b/mintlify-docs/en/applications/containers.mdx
new file mode 100644
index 0000000000..1ecad7ef1c
--- /dev/null
+++ b/mintlify-docs/en/applications/containers.mdx
@@ -0,0 +1,50 @@
+---
+title: "Container clusters"
+description: "Vespa's Java container - JDisc, hosts all application components as well as the stateless logic of Vespa itself."
+---
+
+Which particular components are hosted by a container cluster is configured in services.xml. The main features of JDisc are:
+
+- HTTP serving out of the box from an embedded Jetty server, and support for plugging in other transport mechanisms.
+- Integration with the config system of Vespa which allows components to [receive up-to-date config](/en/applications/configuring-components) (by constructor injection) resulting from application deployment.
+- [Dependency injection based on Guice](/en/applications/dependency-injection) (Felix), but extended for configs and component collections.
+- A component model based on [OSGi](/en/applications/bundles) which allows components to be (re)deployed to running servers, and to control which APIs they expose to others.
+- The features above combine to allow application package changes (changes to components, configuration or data) to be applied by Vespa without disrupting request serving or requiring restarts.
+- Standard component types exist for:
+ - [general request handling](/en/applications/request-handlers)
+ - [chained request-response processing](/en/applications/processing)
+ - [processing document writes](/en/applications/document-processors)
+ - [intercepting queries and results](/en/applications/searchers)
+ - [rendering responses](/en/applications/result-renderers)
+
+ Application components can be of any other type as well and do not need to reference any Vespa API to be loaded and managed by the container.
+- A general [chain composition](/en/applications/chaining) mechanism for components.
+
+## Developing Components
+
+- The JDisc container provides a framework for processing requests and responses, named *Processing* - its building blocks are:
+ - [Chains](/en/applications/chaining) of other components that are to be executed serially, with each providing some service or transform
+ - [Processors](/en/applications/processing) that change the request and / or the response. They may also make multiple forward requests, in series or parallel, or manufacture the response content themselves
+ - [Renderers](/en/applications/processing#response-rendering) that are used to serialize a Processor's response before returning it to a client
+- Application Lifecycle and unit testing:
+ - [Configuring components](/en/applications/configuring-components) with custom configuration
+ - [Component injection](/en/applications/dependency-injection) allows components to access other application components
+ - Learn how to [build OSGi bundles](/en/applications/bundles) and how to [troubleshoot](/en/applications/bundles#troubleshooting) classloading issues
+ - Using [Libraries for Pluggable Frameworks](/en/applications/pluggable-frameworks) from a component may result in class loading issues that require extra setup in the application
+ - [Unit testing configurable components](/en/applications/unit-testing#unit-testing-configurable-components)
+- Handlers and filters:
+ - [Http servers and security filters](/en/applications/http-servers-and-filters) for incoming connections on HTTP and HTTPS
+ - [Request handlers](/en/applications/request-handlers) to process incoming requests and generate responses
+- Searchers and Document Processors:
+ - [Searcher](/en/applications/searchers) and [search result renderer](/en/applications/result-renderers) development
+ - [Document processing](/en/applications/document-processors)
+
+## Reference documentation
+
+- [services.xml](/en/reference/applications/services/container)
+
+## Other related documents
+
+- [Designing RESTful web services](/en/applications/web-services) as Vespa Components
+- [Health checks](/en/reference/operations/health-checks) - using the Container with a VIP
+- [Vespa Component Reference](/en/reference/applications/components): The Container's request processing lifecycle
diff --git a/mintlify-docs/en/applications/dependency-injection.mdx b/mintlify-docs/en/applications/dependency-injection.mdx
new file mode 100644
index 0000000000..9107dd1b76
--- /dev/null
+++ b/mintlify-docs/en/applications/dependency-injection.mdx
@@ -0,0 +1,140 @@
+---
+title: "Dependency injection"
+description: "The Container (a.k.a. JDisc container) implements a dependency injection framework that allows components to declare arbitrary dependencies on configuration and other components in the application. This document explains how to write a container component that depends on another component. See the [reference](/en/reference/applications/components#injectable-components) for a list of injectable components."
+---
+
+The container relies on auto-injection instead of Guice modules. All components declared in the container cluster are available for injection, and the dependent component only needs to declare the dependency as a constructor parameter. In general, dependency injection involves at least three elements:
+
+- a dependent consumer,
+- a declaration of a component's dependencies,
+- an injector that creates instances of classes that implement a given dependency on request.
+
+Notes:
+
+- The dependent object describes what software component it depends on to do its work. The injector decides what concrete classes satisfy the requirements of the dependent object, and provides them to the dependent.
+- The Container encapsulates the injector, and the consumer and all its dependencies are considered to be components.
+- The Container only supports constructor injection (i.e. all dependencies must be declared in a component's constructor).
+- Circular dependencies are not supported.
+
+Refer to the [multiple-bundles sample app](https://github.com/vespa-engine/sample-apps/tree/master/examples/multiple-bundles) for a practical example.
+
+## Depending on another component
+
+A component that depends on another is considered to be a *consumer*. A component's dependencies are whatever its `@Inject`-annotated constructor declares as arguments. E.g. the component:
+
+```java
+package com.yahoo.example;
+import com.yahoo.component.annotation.Inject;
+public class MyComponent {
+ private final MyDependency dependency;
+ @Inject
+ public MyComponent(MyDependency dependency) {
+ this.dependency = dependency;
+ }
+}
+```
+
+has a dependency on the class `com.yahoo.example.MyDependency`. To deploy `MyComponent`, register `MyDependency` in `services.xml`:
+
+```xml
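+<container version="1.0">
+    <component id="com.yahoo.example.MyComponent"/>
+    <component id="com.yahoo.example.MyDependency"/>
+</container>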
+```
+
+Upon deployment, the Container will first instantiate `MyDependency`, and then pass that instance to the constructor of `MyComponent`. Multiple consumers can take the same dependency. One can also [inject configuration](/en/applications/configuring-components) to components.
+
+
+**Note:**
+
+A component will be reconstructed only when one of its dependencies, its configuration, or its class changes, all of which only occur when you re-deploy your application package. Reconstruction is transitive; if component A depends on component B, and component B depends on component C, then a reconfiguration of component B causes a reconstruction of A, but not of C. Reconfiguration of C causes a reconstruction of both A and B.
+
+
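+The transitive reconstruction rule from the note can be sketched in plain Java. This is only an illustration under stated assumptions: it uses ordinary collections rather than any Vespa API, and the class `ReconstructionSketch` and the A/B/C dependency map are hypothetical names taken from the note's example.
+
+```java
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.TreeSet;
+
+// Illustration only: models the rule with plain collections, not Vespa classes.
+public class ReconstructionSketch {
+
+    // Maps each component to the components it depends on (A -> B -> C).
+    static final Map<String, List<String>> DEPENDS_ON = Map.of(
+            "A", List.of("B"),
+            "B", List.of("C"),
+            "C", List.of());
+
+    // Returns the changed component plus everything that transitively depends on it.
+    static Set<String> reconstructed(String changed) {
+        Set<String> result = new TreeSet<>();
+        Deque<String> pending = new ArrayDeque<>();
+        pending.push(changed);
+        while (!pending.isEmpty()) {
+            String current = pending.pop();
+            if (!result.add(current)) continue; // already handled
+            for (Map.Entry<String, List<String>> entry : DEPENDS_ON.entrySet()) {
+                if (entry.getValue().contains(current)) pending.push(entry.getKey());
+            }
+        }
+        return result;
+    }
+
+    public static void main(String[] args) {
+        System.out.println(reconstructed("B")); // [A, B]
+        System.out.println(reconstructed("C")); // [A, B, C]
+    }
+}
+```
+
+The walk goes from the changed component towards its consumers only, which is why a change to B never reaches C.
+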
+### Extending components
+
+When injecting two components where one extends the other, the dependency injection code does not know which of the two to use as the argument for the parent class. To resolve this, inject a `ComponentRegistry` (see below), and look up its entries, like `getComponent(XXX.class.getName())`.
+
+### Specify the bundle
+
+The example above assumes the bundle name can be deduced from the class name. This is not always the case, and you will get class loading problems like:
+
+```txt
+Caused by: java.lang.IllegalArgumentException: Could not create a component with id
+'com.yahoo.example.My'.
+Tried to load class directly, since no bundle was found for spec:
+com.yahoo.example.Dependency
+```
+
+To remedy this, specify the jar file (i.e. bundle) with the component:
+
+```txt
+
+
+
+```
+
+## Depending on all components of a specific type
+
+Consider the use-case where a component chooses between various strategies, and each strategy is implemented as a separate component. Since the number and type of strategies is unknown when implementing the consumer, it is impossible to make a constructor that lists all of them. This is where the `ComponentRegistry` comes into play. E.g. the following component:
+
+```java
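+package com.yahoo.example;
+import com.yahoo.component.annotation.Inject;
+import com.yahoo.component.provider.ComponentRegistry;
+public class MyComponent {
+    // Holds every registered component that is an instance of Strategy
+    private final ComponentRegistry<Strategy> strategies;
+    @Inject
+    public MyComponent(ComponentRegistry<Strategy> strategies) {
+        this.strategies = strategies;
+    }
+}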
+```
+
+declares a dependency on the set of all components registered in `services.xml` that are instances of the class `Strategy` (including subclasses). The `ComponentRegistry` class provides accessors for components based on their [component id](/en/reference/applications/services/container#component).
+
+## Special Components
+
+There are cases where a component cannot be directly injected to its consumers, for example:
+
+- The component must be instantiated via a factory method instead of its constructor
+- Each consumer must have a unique instance of the dependency class
+- The component uses native resources that must be cleaned up when the component goes out of scope
+
+For these situations, JDisc supports injection, and optional deconstruction, via its `Provider` interface:
+
+```java
+public interface Provider<T> {
+    T get();
+    void deconstruct();
+}
+```
+
+`get()` is called by JDisc each time it needs to instantiate the specific component type. `deconstruct()` is only called after reconfiguring the system with a new application, where the current provider instance is either removed or replaced due to modified dependencies.
+
+Following the earlier example, declare a provider for the `MyDependency` class, that returns a new instance for each consumer:
+
+```java
+package com.yahoo.example;
+import com.yahoo.container.di.componentgraph.Provider;
+public class MyDependencyProvider implements Provider<MyDependency> {
+ @Override
+ public MyDependency get() {
+ return new MyDependency();
+ }
+ @Override
+ public void deconstruct() { }
+}
+```
+
+Using this provider, `services.xml` has two instances of `MyComponent`, each getting a unique instance of `MyDependency`:
+
+```xml
+
+
+
+
+
+```
+
+Upon deployment, the Container will first instantiate `MyDependencyProvider`, and then invoke `MyDependencyProvider.get()` for each instantiation of `MyComponent`.
+
+A provider can declare constructor dependencies, just like any other component.
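+
+The per-consumer behaviour described above can be sketched without any Vespa classes. In this hedged illustration, `java.util.function.Supplier` stands in for the `Provider` interface and the hypothetical `construct` method stands in for the Container's injector; it is not how JDisc is implemented.
+
+```java
+import java.util.function.Supplier;
+
+// Illustration only: plain Java, with Supplier standing in for Provider.
+public class ProviderSketch {
+
+    static class MyDependency { }
+
+    static class MyComponent {
+        final MyDependency dependency;
+        MyComponent(MyDependency dependency) { this.dependency = dependency; }
+    }
+
+    // Hypothetical stand-in for the injector: asks the provider once per consumer.
+    static MyComponent construct(Supplier<MyDependency> provider) {
+        return new MyComponent(provider.get());
+    }
+
+    public static void main(String[] args) {
+        Supplier<MyDependency> provider = MyDependency::new; // a single provider
+        MyComponent first = construct(provider);
+        MyComponent second = construct(provider);
+        // Each consumer received its own MyDependency instance.
+        System.out.println(first.dependency != second.dependency); // true
+    }
+}
+```
+
+One provider instance serves both consumers, yet neither shares its dependency with the other, which is the property a directly injected component cannot offer.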
diff --git a/mintlify-docs/en/applications/deployment.mdx b/mintlify-docs/en/applications/deployment.mdx
new file mode 100644
index 0000000000..4d80ca5f7a
--- /dev/null
+++ b/mintlify-docs/en/applications/deployment.mdx
@@ -0,0 +1,116 @@
+---
+title: "Deployment"
+description: "In this document we explain various aspects of application deployment in detail. Refer to [application deployment](/en/basics/applications#deploying-applications) for an introduction."
+---
+
+## Convergence
+
+After the deployment command has succeeded, the application package will take effect, but this does not complete immediately in the distributed system that is your running application; it happens through a distributed *convergence* process that you can track from the command line or console. Refer to the [deploy reference](/en/reference/applications/application-packages#deploy) for detailed steps run when deploying an application.
+
+You can get the status of the last deployment by using the status command:
+
+```shell
+$ vespa status deployment
+```
+
+## Rollback
+
+Hover over the instance square to the left, click pin, give a reason - this will start the downgrade process:
+
+
+
+
+
+The pinning to a new version starts a new deployment, and can be rolled out as a normal rollout. To speed it up, cancel system and staging test jobs by clicking abort.
+
+
+
+
+
+Skipping tests is at the application owner's own discretion and risk:
+
+- A system test on this version has already been run on an earlier deployment. Skipping this can be considered safe, for that reason.
+- When rolling back, a staging test for this path has never been run, so the path is untested.
+
+Of the two, the staging test takes longer to run. The user decides whether to skip testing phases or not. With this, a user can control whether to immediately roll back a version including test phases or not, as well as rolling out to production zones in parallel or not.
+
+After the pin to rollback, make sure to update the code repository so the next deployments are in sync, and remove the pin for later deployments.
+
+### Follow-up steps
+
+Generally, to roll back an application package change, deploy again with the previous version to roll back to. The above section describes the fast-track rollback. The alternatives are:
+
+1. With automation: Revert the code in the source code repository, and let the automation roll out the new version. You can speed up the deployment by skipping tests and clicking "deploy now" in the deployment graph in the console.
+2. If you have trouble rebuilding a good package, you can download a previous package from Vespa Cloud: Use the [console](/en/operations/automated-deployments#source-code-repository-integration) to pick the good version, download it and deploy again. Hover over the [instance](/en/operations/automated-deployments#block-windows) (normally called "default") to skip the system and staging test to speed up the deployment, if needed.
+3. On self-managed instances, regenerate the good version from source for new deployment, see also the [deploy API](/en/reference/api/deploy-v2#rollback).
+
+## File distribution
+
+The application package can have components and other large files. When an app is deployed, these files are distributed to the nodes:
+
+- Components (i.e. bundles)
+- Files with type *path* and *url* in config, see [Adding files to the component configuration](/en/applications/configuring-components#adding-files-to-the-component-configuration)
+- Machine learned models
+- [Constant tensors](/en/reference/schemas/schemas#constant)
+
+When new components or files specified in config are distributed, the container gets a new file reference, waits for it to be available and switches to new config when all files are available.
+
+
+
+
+
+## Deploying remote models
+
+Most application packages are stored as source code in a code repository. However, some resources are generated or too large to store in a code repository, like models or an [FSA](/en/reference/operations/tools#vespa-makefsa).
+
+Machine learned models in Vespa are stored in the application package under the *models* directory. This might be inconvenient for some applications, for instance for models that are frequently retrained on some remote system. Also, models might be too large to fit within the constraints of the version control system.
+
+The solution is to download the models from the remote location during the application package build. This is simply implemented by adding a step in *pom.xml* (see [example](https://github.com/vespa-cloud/cord-19-search/blob/main/pom.xml)):
+
+```xml expandable
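+<build>
+    <plugins>
+        <plugin>
+            <groupId>org.codehaus.mojo</groupId>
+            <artifactId>exec-maven-plugin</artifactId>
+            <version>1.4.0</version>
+            <executions>
+                <execution>
+                    <id>download-model</id>
+                    <phase>generate-resources</phase>
+                    <goals>
+                        <goal>exec</goal>
+                    </goals>
+                    <configuration>
+                        <executable>bin/download_models.sh</executable>
+                        <arguments>
+                            <argument>target/application/models</argument>
+                            <argument>MODEL-URL</argument>
+                        </arguments>
+                    </configuration>
+                </execution>
+            </executions>
+        </plugin>
+    </plugins>
+</build>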
+```
+
+*bin/download_models.sh* example:
+
+```txt
+#!/bin/bash
+
+DIR="$1"
+URL="$2"
+
+echo "[INFO] Downloading $URL into $DIR"
+
+mkdir -p "$DIR"
+pushd "$DIR" || exit 1
+curl -O "$URL"
+popd
+```
+
+Any necessary credentials for authentication and authorization should be added to this script, as well as any unpacking of archives (for TensorFlow models for instance).
+
+Also see the [model](/en/reference/applications/config-files#model) config type to specify resources that should be downloaded by container nodes during convergence.
diff --git a/mintlify-docs/en/applications/developer-guide.mdx b/mintlify-docs/en/applications/developer-guide.mdx
new file mode 100644
index 0000000000..c9f2bd6ba8
--- /dev/null
+++ b/mintlify-docs/en/applications/developer-guide.mdx
@@ -0,0 +1,182 @@
+---
+title: "Developer Guide"
+description: "This document explains how to develop applications, including basic terminology, tips on using the Vespa Cloud Console, and how to benchmark and size your application. See [deploy a sample application](/en/basics/deploy-an-application) to deploy a basic sample application, and [automated deployments](/en/operations/automated-deployments) on making production deployments safe routine occurrences."
+---
+
+
+## Manual deployments
+
+Developers will typically deploy their application to the `dev` [zone](/en/operations/zones) during development. Each deployment is owned by a *tenant*, and each specified *instance* is a separate copy of the application; this lets developers work on independent copies of the same application, or collaborate on a shared one, as they prefer—more details [here](/en/learn/tenant-apps-instances). These values can be set in the Vespa Cloud UI when deploying, or with each of the build and deploy tools, as shown in the respective getting-started guides.
+
+Additionally, a deployment may specify a different [zone](/en/operations/zones) to deploy to, instead of the default `dev` zone.
+
+### Auto downsizing
+
+Deployments to `dev` are downscaled to one small node by default, so that applications can be deployed there without changing `services.xml`. See [performance testing](#performance-testing) for how to disable auto downsizing using `deploy:environment="dev"`.
+
+### Availability
+
+The `dev` zone is a sandbox and not for production serving; it has no uptime guarantees.
+
+An automated Vespa software upgrade can be triggered at any time, and this may lead to some downtime if you have only one node per cluster (as with the default [auto downsizing](#auto-downsizing)).
+
+## Performance testing
+
+For performance testing, to avoid auto downsizing, lock the [resources](/en/reference/applications/services/services) using `deploy:environment="dev"`:
+
+```xml
+
+
+
+```
+
+Read more in [benchmarking](/en/performance/benchmarking-cloud) and [variants in services.xml](/en/operations/deployment-variants).
+
+## Component overview
+
+
+
+
+
+Application packages can contain Java components to be run in container clusters. The most common component types are:
+
+- [Searchers](/en/applications/searchers), which can modify or build the query, modify the result, implement workflows issuing multiple queries etc.
+- [Document processors](/en/applications/document-processors) that can modify incoming write operations.
+- [Handlers](/en/applications/request-handlers) that can implement custom web service APIs.
+- [Renderers](/en/applications/result-renderers) that are used to define custom result formats.
+
+Components are constructed by dependency injection and are reloaded safely on deployment without restarts. See the [container documentation](/en/applications/containers) for more details.
+
+See [deploy an application having Java components](/en/basics/deploy-an-application-java), and [troubleshooting](/en/operations/self-managed/admin-procedures#troubleshooting).
+
+## Developing Components
+
+The development cycle consists of creating the component, deploying the application package to Vespa, writing tests, and iterating. These steps refer to files in [album-recommendation-java](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java):
+
+| | |
+| :--- | :--- |
+| **Build** | All the Vespa sample applications use the [bundle plugin](/en/applications/bundles#maven-bundle-plugin) to build the components. |
+| **Configure** | A key Vespa feature is code and configuration consistency, deployed using an [application package](/en/basics/applications). This ensures that code and configuration are in sync, and loaded atomically when deployed. This is done by generating config classes from config definition files. In Vespa and application code, configuration is therefore accessed through generated config classes. The Maven target `generate-sources` (invoked by `mvn install`) uses [metal-names.def](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation-java/app/src/main/resources/configdefinitions/metal-names.def) to generate `target/generated-sources/vespa-configgen-plugin/com/mydomain/example/MetalNamesConfig.java`. After generating config classes, they will resolve in tools like [IntelliJ IDEA](https://www.jetbrains.com/idea/download/). |
+| **Tests** | Example unit tests are found in [MetalSearcherTest.java](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation-java/app/src/test/java/ai/vespa/example/album/MetalSearcherTest.java). `testAddedOrTerm1` and `testAddedOrTerm2` illustrate two ways of doing the same test: The first sets up the minimal search chain for [YQL](/en/querying/query-language) programmatically. The second uses [`com.yahoo.application.Application`](https://javadoc.io/doc/com.yahoo.vespa/application/latest/com/yahoo/application/Application), which sets up the application package and simplifies testing. Read more in [unit testing](/en/applications/unit-testing). |
+
+## Debugging Components
+
+
+**Important:**
+
+The debugging procedure only works for endpoints with an open debug port - most managed services don't do this for security reasons.
+
+
+Vespa Cloud does not allow debugging over the *Java Debug Wire Protocol (JDWP)* due to the protocol's inherent lack of security measures. If you need interactive debugging, deploy your application to a self-hosted Vespa installation (below) and manually [add the *JDWP* agent to JVM options](#debugging-components).
+
+You may debug your Java code by requesting either a JVM heap dump or a Java Flight Recorder recording through the [Vespa Cloud Console](https://console.vespa-cloud.com/). Go to your application's cluster overview and select *export JVM artifact* on any *container* node. The process will take up to a few minutes. You'll find the steps to download the dump on the Console once it's completed. Extract the files from the downloaded Zstandard-compressed archive, and use the free [JDK Mission Control](https://www.oracle.com/java/technologies/jdk-mission-control) utility to inspect the dump/recording.
+
+
+
+
+
+To debug a [Searcher](/en/applications/searchers) / [Document Processor](/en/applications/document-processors) / [Component](/en/applications/components) running in a self-hosted container, set up a remote debugging configuration in the IDE. IntelliJ IDEA example:
+
+
+
+ Run -> Edit Configurations...
+
+
+
+ Click `+` to add a new configuration.
+
+
+
+ Select the "Remote JVM Debug" option in the left-most pane.
+
+
+
+ Set hostname to the host running the container, change the port if needed.
+
+
+
+ Set the container's [jvm options](/en/reference/applications/services/container#jvm) to the value in "Command line arguments for remote JVM":
+
+ ```xml
+
+
+
+
+
+ ```
+
+
+
+ Re-deploy the application, then restart Vespa on the node that runs the container. Make sure the port is published if using a Docker/Podman container, e.g.:
+
+ ```bash
+ $ docker run --detach --name vespa --hostname vespa-container \
+ --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 --publish 127.0.0.1:5005:5005 \
+ vespaengine/vespa
+ ```
+
+
+ Start debugging! Check *vespa.log* for errors.
+
+
+
+
+
+
+Find *Debugging a Vespa Searcher* in the vespaengine [YouTube channel](https://www.youtube.com/@vespaai)!
+
+
+
+
+## Developing system and staging tests
+
+When using Vespa Cloud, system and staging tests are most easily developed using a test deployment in a `dev` zone to run the tests against. Refer to the [general testing guide](/en/applications/testing) for a discussion of the different test types, and the [basic HTTP tests](/en/reference/applications/testing) or [Java JUnit tests](/en/reference/applications/testing-java) reference for how to write the relevant tests.
+
+If using the [Vespa CLI](/en/clients/vespa-cli) to deploy and run [basic HTTP tests](/en/reference/applications/testing), the same commands as in the test reference will just work, provided the CLI is configured to use the `cloud` target.
+
+### Running Java tests
+
+With Maven and [Java JUnit tests](/en/reference/applications/testing-java), some additional configuration is required to supply the test runtime on the local machine with API and data plane credentials:
+
+```bash
+$ mvn test \
+-D test.categories=system \
+-D dataPlaneKeyFile=data-plane-private-key.pem -D dataPlaneCertificateFile=data-plane-public-cert.pem \
+-D apiKey="$API_KEY"
+```
+
+The `apiKey` is used to fetch the *dev* instance's endpoints. The data plane key and certificate pair is used by [ai.vespa.hosted.cd.Endpoint](https://github.com/vespa-engine/vespa/blob/master/cloud/tenant-cd-api/src/main/java/ai/vespa/hosted/cd/Endpoint.java) to access the application endpoint. Note that the `-D vespa.test.config` argument is gone; this configuration is automatically fetched from the Vespa Cloud API—hence the need for the API key.
+
+When running Vespa self-hosted like in the [sample application](/en/basics/deploy-an-application-local), no authentication is required by default, to either API or container, and specifying a data plane key and certificate will instead cause the test to fail, since the correct SSL context is the Java default in this case.
+
+Make sure the TestRuntime is able to start. As it will init an SSL context, make sure to remove config when running locally, in order to use a default context. Remove properties from *pom.xml* and IDE debug configuration.
+
+Developers can also set these parameters in the IDE run configuration to debug system tests:
+
+```txt
+-D test.categories=system
+-D tenant=my_tenant
+-D application=my_app
+-D instance=my_instance
+-D apiKeyFile=/path/to/myname.mytenant.pem
+-D dataPlaneCertificateFile=data-plane-public-cert.pem
+-D dataPlaneKeyFile=data-plane-private-key.pem
+```
+
+## Tips and troubleshooting
+
+- Vespa Cloud upgrades daily, and applications in `dev` also have their Vespa platform upgraded. This usually happens at the opposite time of day of when deployments are made to each instance, and takes some minutes. Deployments without redundancy will be unavailable during the upgrade.
+- Failure to deploy, due to authentication (HTTP code 401) or authorization (HTTP code 403), is most often due to wrong configuration of `tenant` and/or `application`, when using command line tools to deploy. Ensure the values set with Vespa CLI or in `pom.xml` match what is configured in the UI.
+- In case of data plane failure, remember to copy the public certificate to `src/main/application/security/clients.pem` before building and deploying. This is handled by the Vespa CLI `vespa auth cert` command.
+- To run Java [system and staging tests](/en/reference/applications/testing-java) in an IDE, ensure all API and data plane keys and certificates are configured in the IDE as well; not all IDEs pick up all settings from `pom.xml` correctly:
+
+ ```txt
+ -Dtest.categories=system
+ -DapiKeyFile=/path-to/tname.pem
+ -DdataPlaneCertificateFile=/path-to/data-plane-public-cert.pem
+ -DdataPlaneKeyFile=/path-to/data-plane-private-key.pem
+ ```
diff --git a/mintlify-docs/en/applications/document-processors.mdx b/mintlify-docs/en/applications/document-processors.mdx
new file mode 100644
index 0000000000..786c48e606
--- /dev/null
+++ b/mintlify-docs/en/applications/document-processors.mdx
@@ -0,0 +1,244 @@
+---
+title: "Document processors"
+description: "This document describes how to develop and deploy *Document Processors*, often called *docproc* in this documentation. Document processing is a framework to create [chains](/en/applications/chaining) of configurable [components](/en/applications/components), that read and modify document operations."
+---
+
+The input source splits the input data into logical units called [documents](/en/schemas/documents). A [feeder application](/en/writing/reads-and-writes) sends the documents into a document processing chain. This chain is an ordered list of document processors. Document processing examples range from language detection, HTML removal and natural language processing to mail attachment processing, character set transcoding and image thumbnailing. At the end of the processing chain, extracted data will typically be set in some fields in the document.
+
+The motivation for document processing is that code and configuration are atomically deployed, like all Vespa components. It is also easy to build components that access data in Vespa as part of processing.
+
+To get started, see the [sample application](https://github.com/vespa-engine/sample-apps/tree/master/examples/document-processing). Read [indexing](/en/writing/indexing) to understand deployment and routing. As document processors are chained components just like Searchers, read [Searcher Development](/en/applications/searchers). For reference, see the [Javadoc](https://javadoc.io/doc/com.yahoo.vespa/docproc), and [services.xml](/en/reference/applications/services/docproc).
+
+
+
+
+
+## Deploying a Document Processor
+
+Refer to [album-recommendation-docproc](https://github.com/vespa-engine/sample-apps/tree/master/examples/document-processing) to get started; [LyricsDocumentProcessor.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/document-processing/src/main/java/ai/vespa/example/album/LyricsDocumentProcessor.java) is a document processor example. Add the document processor in [services.xml](/en/reference/applications/services/docproc), and then add it to a [chain](#chains). The type of processing done by the processor dictates what chain it should be part of:
+
+- If it does general data-processing, such as populating some document fields from others, looking up data in external services, etc., it should be added to a general docproc chain.
+- If, and only if, it does processing required for *indexing*, or requires indexing to have already been run, it should be added to a chain which inherits the *indexing* chain, and which is used for indexing by a content cluster.
+
+An example that adds a general document processor to the "default" chain, and an indexing related processor to the chain for a particular content cluster:
+
+```xml highlight={4, 8, 18}
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ...
+
+
+
+
+```
+
+The "default" chain, if it exists, is run by default, before the chain used for indexing. The default indexing chain is called "indexing", and *must* be inherited by any chain that is to replace it.
+
+To run through any chain, specify a [route](/en/writing/document-routing) which includes the chain. For example, the route `default/chain.my-chain indexing` would route feed operations through the chain "my-chain" in the "default" container cluster, and then to the "indexing" hop, which resolves to the specified indexing chain for each content cluster the document should be sent to. More details can be found in [indexing](/en/writing/document-routing#document-processing).
+
+## Document Processors
+
+A document processor is a component extending `com.yahoo.docproc.DocumentProcessor`. All document processors must implement `process()`:
+
+```txt
+public Progress process(Processing processing);
+```
+
+When the container receives a document operation, it will create a new `Processing`, and add the `DocumentPut`s, `DocumentUpdate`s or `DocumentRemove`s to the `List` accessible through `Processing.getDocumentOperations()`. This list can also be used to stop a processing, say for blocklist use: call `Processing.getDocumentOperations().clear()` before returning `Progress.DONE` to stop a `DocumentPut` or `DocumentUpdate`.
+
+Furthermore, the call stack of the document processing chain in question will be *copied* to `Processing.callStack()`, so that document processors may freely modify the flow of control for this processing without affecting all other processings going on. After creation, the `Processing` is added to an internal queue.
+
+A worker thread will retrieve a `Processing` from the input queue, and run its document operations through its call stack. A minimal, no-op document processor implementation is thus:
+
+```java
+import com.yahoo.docproc.*;
+public class SimpleDocumentProcessor extends DocumentProcessor {
+ public Progress process(Processing processing) {
+ return Progress.DONE;
+ }
+}
+```
+
+The `process()` method should loop through all document operations in `Processing.getDocumentOperations()`, do whatever it sees fit to them, and return a `Progress`:
+
+```java
+public Progress process(Processing processing) {
+    for (DocumentOperation op : processing.getDocumentOperations()) {
+        if (op instanceof DocumentPut) {
+            DocumentPut put = (DocumentPut) op;
+            // TODO do something to 'put' here
+        } else if (op instanceof DocumentUpdate) {
+            DocumentUpdate update = (DocumentUpdate) op;
+            // TODO do something to 'update' here
+        } else if (op instanceof DocumentRemove) {
+            DocumentRemove remove = (DocumentRemove) op;
+            // TODO do something to 'remove' here
+        }
+    }
+    return Progress.DONE;
+}
+```
+
+| Return code | Description |
+| :--- | :--- |
+| `Progress.DONE` | Returned if a document processor has successfully processed a `Processing`. |
+| `Progress.FAILED` | Processing failed and the input message should return a *fatal* failure back to the feeding application, meaning that this application will not try to re-feed this document operation. Return an error message/reason by calling `withReason()`. This result is represented as a `500 Internal Server Error` response in [Document v1](/en/writing/document-v1-api-guide). Example: `if (op instanceof DocumentPut) { return Progress.FAILED.withReason("PUT is not supported"); }` |
+| `Progress.INVALID_INPUT` | Available since 8.584. Processing failed due to invalid input, like a malformed document operation. This result is represented as a `400 Bad Request` response in [Document v1](/en/writing/document-v1-api-guide). |
+| `Progress.LATER` | See [execution model](#execution-model). The document processor wants to release the calling thread and be called again later. This is useful if e.g. calling an external service with high latency. The document processor may then save its state in the `Processing` and resume when called again later. There are no guarantees as to *when* the processor is called again with this `Processing`; it is simply appended to the back of the input queue. By the use of `Progress.LATER`, this is an asynchronous model, where the processing of a document operation does not need to consume one thread for its entire lifespan. Note, however, that the document processors themselves are shared between all processing operations in a chain, and must thus be implemented [thread-safe](#state). |
+
+| Exception | Description |
+| :--- | :--- |
+| `com.yahoo.docproc.TransientFailureException` | Processing failed and the input message should return a *transient* failure back to the feeding application, meaning that this application *may* try to re-feed this document operation. |
+| `RuntimeException` | Throwing any other `RuntimeException` results in the same behavior as `Progress.FAILED`. |
+
+## Chains
+
+The call stack mentioned above is another name for a *document processor chain*. Document processor chains are a special case of the general [component chains](/en/applications/chaining); to avoid confusion, some concepts are explained here as well. A document processor chain is nothing more than a list of document processor instances, having an id, and represented as a stack. The document processor chains are typically not created for every processing, but are part of the configuration. Multiple chains may exist at the same time; the chain to execute is specified by the message bus destination of the incoming message. The same document processor instance may exist in multiple document processor chains, which is why the `CallStack` of the `Processing` is responsible for knowing the next document processor to invoke for a particular message.
+
+The execution order of the document processors in a chain is not specified explicitly, but is determined by [ordering constraints](/en/applications/chaining#ordering-components) declared in the document processors or their configuration.
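+
+To illustrate why each `Processing` gets its own copy of the call stack, here is a minimal sketch in plain Java. It is not Vespa's `CallStack` implementation: `CallStackSketch`, `CHAIN` and the processor names are hypothetical, and a `Deque` of names stands in for the stack of calls.
+
+```java
+import java.util.ArrayDeque;
+import java.util.Deque;
+import java.util.List;
+
+public class CallStackSketch {
+
+    // Hypothetical configured chain: processor names in execution order
+    static final List<String> CHAIN = List.of("LanguageDetector", "HtmlRemover", "Thumbnailer");
+
+    /** Gives one processing its own copy of the chain, so it can track its own position. */
+    public static Deque<String> newCallStack() {
+        return new ArrayDeque<>(CHAIN);
+    }
+}
+```
+
+Popping from one copy advances only that processing; other processings running the same chain still start at the first processor.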
+
+## Execution model
+
+The Document Processing Framework works like this:
+
+
+
+1. A thread from the message bus layer appends an incoming message to an internal priority queue, shared between all document processing chains configured on a node. The priority is set based on the message bus priority of the message. Messages of the same priority are ordered FIFO.
+2. One worker thread from the docproc thread pool picks one message from the head of the queue, deserializes it, copies the call stack (chain) in question, and runs it through the document processors.
+3. Processing finishes if **(a)** the document(s) have passed successfully through the whole chain, or **(b)** a document processor in the chain has returned `Progress.FAILED` or thrown an exception.
+4. The same thread passes the message on to the message bus layer for further transport on to its destination.
+
+
+
+
+There is a single instance of each document processor chain. In every chain, there is a single instance of each document processor, unless the chain is configured with multiple, identical document processors, which is rare.
+
+As is evident from the model above, multiple worker threads execute the document processors in a chain concurrently. Thus, many threads of execution can be going through `process()` in a document processor, at the same time.
+
+This model places an important constraint on document processor classes: *instance variables are not safe.* They must be eliminated, or made thread-safe somehow.
+
+Also see [Resource management](/en/applications/components#resource-management); use `deconstruct()` to avoid leaking resources.
+
+### Asynchronous execution
+
+The execution model outlined above also shows one important restriction: If a document processor performs any high-latency operation in its process() method, a docproc worker thread will be occupied. With all *n* worker threads blocking on an external resource, throughput will be limited. This can be fixed by saving the state in the Processing object, and returning `Progress.LATER`. A document processor doing a high-latency operation should use a pattern like this:
+
+
+
+ Check a self-defined context variable in Processing for status. Basically, *have we seen this Processing before?*
+
+
+
+ If no:
+
+ 1. We have been given a Processing object fresh off the network, we have not seen this before. Process it up until the high-latency operation.
+ 2. Start the high-latency operation (possibly in a separate thread).
+ 3. Save the state of the operation in a self-defined context variable in the Processing.
+ 4. Return `Progress.LATER`. This Processing is then appended to the back of the input queue, and we will be called again later.
+
+
+
+ If yes:
+
+ 1. Retrieve the reference that we set in our self-defined context variable in Processing.
+ 2. Is the high-latency operation done? If so, return `Progress.DONE`.
+ 3. Is it not yet done? Return `Progress.LATER` again.
+
+
+
+As is evident, this lets the finite set of document processing threads do more work at the same time.
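+
+The pattern above can be sketched in plain Java. This is a minimal, self-contained illustration under stated assumptions, not Vespa code: the `Map` stands in for the context variables of a `Processing`, the nested `Progress` enum mirrors the `Progress.DONE` and `Progress.LATER` return codes, and `AsyncLookupSketch`, `LOOKUP_KEY` and `startLookup` are hypothetical names.
+
+```java
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.CompletableFuture;
+import java.util.function.Supplier;
+
+public class AsyncLookupSketch {
+
+    // Stand-in for the return codes of a real document processor
+    public enum Progress { DONE, LATER }
+
+    // Hypothetical name of the self-defined context variable
+    static final String LOOKUP_KEY = "AsyncLookupSketch.lookup";
+
+    /** 'context' stands in for the context variables of one Processing. */
+    public static Progress process(Map<String, Object> context,
+                                   Supplier<CompletableFuture<String>> startLookup) {
+        Object seen = context.get(LOOKUP_KEY);
+        if (seen == null) {
+            // First visit: start the high-latency operation and save its handle
+            context.put(LOOKUP_KEY, startLookup.get());
+            return Progress.LATER;
+        }
+        // Later visit: retrieve the handle saved on the first visit
+        CompletableFuture<?> lookup = (CompletableFuture<?>) seen;
+        // Finish if the operation is done, otherwise ask to be called again
+        return lookup.isDone() ? Progress.DONE : Progress.LATER;
+    }
+}
+```
+
+In a real document processor, the same steps would read and write the context variables of the `Processing` and return the framework's own `Progress` values. Note that no worker thread blocks while the operation is pending.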
+
+## State
+
+Any state in the document processor for the particular Processing should be kept as local variables in the process method, while state which should be shared by all Processings should be kept as member variables. As the latter kind will be accessed by multiple threads at any one time, the state of such member variables must be *thread-safe*. This critical restriction is similar to those of e.g. the Servlet API. Options for implementing a multithread-safe document processor with instance variables:
+
+1. Use immutable (and preferably final) objects: they never change after they are constructed; no modifications to their state occur after the DocumentProcessor constructor returns.
+2. Use a single instance of a thread-safe class.
+3. Create a single instance and synchronize access to it across all threads (but this will severely limit scalability).
+4. Arrange for each thread to have its own instance, e.g. with a `ThreadLocal`.
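+
+Options 1, 2 and 4 can be sketched as follows. This is a plain Java illustration of state that a document processor could hold in member variables; `SharedProcessorState` and its fields are hypothetical names, not part of the Vespa API.
+
+```java
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.Set;
+import java.util.concurrent.atomic.AtomicLong;
+
+public class SharedProcessorState {
+
+    // Option 1: immutable, final state, fixed once the constructor returns
+    private final Set<String> blockedIds;
+
+    // Option 2: a single instance of a thread-safe class
+    private final AtomicLong processedCount = new AtomicLong();
+
+    // Option 4: one instance per thread of a class that is not thread-safe
+    private final ThreadLocal<SimpleDateFormat> dateFormat =
+            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd"));
+
+    public SharedProcessorState(Set<String> blockedIds) {
+        this.blockedIds = Set.copyOf(blockedIds); // defensive, unmodifiable copy
+    }
+
+    public boolean isBlocked(String id) { return blockedIds.contains(id); }
+
+    public long countOne() { return processedCount.incrementAndGet(); }
+
+    public String formatDate(Date date) { return dateFormat.get().format(date); }
+}
+```
+
+Option 3, synchronizing access to a single instance, is left out since it limits scalability as noted above.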
+
+### Processing Context Variables
+
+`Processing` has a map `String -> Object` that can be used to pass information between document processors. It is also useful when using `Progress.LATER` to save the state of a processing - see [Processing.java](https://github.com/vespa-engine/vespa/blob/master/docproc/src/main/java/com/yahoo/docproc/Processing.java) for `get/setVariable` and more.
+
+The [sample application](https://github.com/vespa-engine/sample-apps/tree/master/examples/document-processing) uses such context variables, too.
+
+## Operation ordering
+
+### Feed ordering
+
+Ordering of feed operations is not guaranteed. Operations on different documents will be done concurrently and are therefore not ordered. However, Vespa guarantees that operations on the same document are processed in the order they were fed if they enter Vespa at the *same* feed endpoint.
+
+### Document processing ordering
+
+Document operations that are produced inside a document processor obey the same rules as at feed time. If you either split the input into other documents or into multiple operations to the same document, Vespa will ensure that operations to the same document ID are sequenced and are delivered in the order they enter.
+
+## (Re)configuring Document Processing
+
+Consider the following configuration:
+
+```xml highlight={7-9}
+
+
+
+
+
+
+
+ value
+
+
+
+
+
+
+```
+
+Changing chain ids, components in a chain, component configuration and schema mapping all take effect after deployment - no restart required. Changing a *cluster name* (i.e. the container id) requires a restart of docproc services after *vespa activate*.
+
+Note: when adding or modifying a processing chain in a running cluster while also deploying a *new* document processor (i.e. a document processor that was unknown to Vespa at the time the cluster was started), the container must be restarted:
+
+```txt
+$ vespa-sentinel-cmd restart container
+```
+
+## Class diagram
+
+
+
+
+
+The framework core supports asynchronous processing, processing one or multiple documents or document updates at the same time, document processors that make dynamic decisions about the processing flow, and passing of information between processors outside the document or document update:
+
+- One or more named `Docproc Services` may be created. One of the services is the *default*.
+- A service accepts subclasses of `DocumentOperation` for processing, meaning `DocumentPuts`, `DocumentUpdates` and `DocumentRemoves`. It has a `Call Stack` which lists the calls to make to various `DocumentProcessors` to process each DocumentOperation handed to the service.
+- Call Stacks consist of `Calls`, which refer to the Document Processor instance to call.
+- Document puts and document updates are processed asynchronously; the state is kept in a `Processing` for its duration (instead of in a thread or process). A Document Processor may make some asynchronous calls (typically to remote services) and return to the framework that it should be called again later for the same Processing to handle the outcome of the calls.
+- A processing contains its own copy of the Call Stack of the Docproc Service to keep track of what to call next. Document Processors may modify this Call Stack to dynamically decide the processing steps required to process a DocumentOperation.
+- A Processing may contain one or more DocumentOperations to be processed as a unit.
+- A Processing has a `context`, which is a Map of named values which can be used to pass arguments between processors.
+- Processings are prepared to be stored to disk, to allow a high number of ongoing long-term processings per node.
diff --git a/mintlify-docs/en/applications/http-servers-and-filters.mdx b/mintlify-docs/en/applications/http-servers-and-filters.mdx
new file mode 100644
index 0000000000..9960cca6e9
--- /dev/null
+++ b/mintlify-docs/en/applications/http-servers-and-filters.mdx
@@ -0,0 +1,199 @@
+---
+title: "Http servers and filters"
+description: "This document explains how to set up http servers and filters in the Container. Before proceeding, familiarize yourself with the [Developer Guide](/en/applications/developer-guide)."
+---
+
+## Set up Http servers
+
+To accept http requests on e.g. port 8090, add an `http` section with a server to *services.xml*:
+
+```xml
+
+
+
+
+
+
+```
+
+To verify that the new server is running, check the default handler on the root path, which will return a list of all http servers:
+
+```txt
+$ curl http://localhost:8090/
+```
+
+Adding an `http` section to *services.xml* **disables the default http server** at port 8080.
+
+Binding to privileged ports (< 1024) is supported. Note that this **only** works when running as a standalone container, and **not** when running as a Vespa cluster.
+
+### Configure the HTTP Server
+
+Configuration settings for the server can be modified by setting values for the `jdisc.http.connector` config inside the `server` element:
+
+```xml
+
+
+
+
+
+ false
+
+
+
+
+```
+
+Note that it is not allowed to set the `listenPort` in the http-server config, as it conflicts with the port that is set in the *port* attribute in the *server* element. For a complete list of configuration fields that can be set, refer to the config definition schema in [jdisc.http.connector.def](https://github.com/vespa-engine/vespa/blob/master/container-disc/src/main/resources/configdefinitions/jdisc.http.jdisc.http.connector.def).
+
+### TLS
+
+TLS can be configured using either the [ssl](/en/reference/applications/services/http#ssl) or the [ssl-provider](/en/reference/applications/services/http#ssl-provider) element.
+
+```xml
+
+
+
+
+ /path/to/private-key.pem
+ /path/to/certificate.pem
+ /path/to/ca-certificates.pem
+ want
+
+ TLS_AES_128_GCM_SHA256,
+ TLS_AES_256_GCM_SHA384,
+ TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
+ TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
+
+ TLSv1.2,TLSv1.3
+
+
+
+
+
+
+
+```
+
+Refer to the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) sample application for an example.
+
+## Set up Filter Chains
+
+There are two main types of filters:
+
+- request filters
+- response filters
+
+Request filters run before the handler that processes the request, and response filters run after. They are used for tasks such as authentication, error checking and modifying headers.
+
+### Using Filter Chains
+
+Filter chains are set up by using the `request-chain` and `response-chain` elements inside the [filtering](/en/reference/applications/services/http#filtering) element. Example setting up two request filter chains, and one response filter chain:
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+Filters that should be used in more than one chain, must be defined directly in the `filtering` element, as shown with `request-filter1` in the example above.
+
+To actually use a filter chain, add one or more URI [bindings](/en/reference/applications/services/http#binding):
+
+```xml
+
+
+
+
+ http://*/*
+
+
+
+
+ http://*/*
+
+
+
+
+```
+
+These bindings say that both the request chain and the response chain should be used when the request URI matches `http://*/*`. So both a request filter chain and a response filter chain can be used on a single request. However, only one request chain will be used if multiple request chains have a binding that matches a request. The same applies to response chains. Refer to the [javadoc](https://javadoc.io/doc/com.yahoo.vespa/jdisc_core/latest/com/yahoo/jdisc/application/UriPattern.html) for information about which chain will be used in such cases.
+
+In order to bind a filter chain to a specific *server*, add the server port to the binding:
+
+```xml
+
+
+ http://*:8080/*
+ http://*:9000/*
+
+```
+
+A request must match a filter chain if any filter is configured. A 403 response is returned for non-matching requests. This behavior can be disabled - see [strict-mode](/en/reference/applications/services/http#filtering).
+
+#### Excluding Filters from an Inherited Chain
+
+Say you have a request filter chain that you are binding to most of your URIs. Now, you want to run almost the same chain on another URI, but you need to exclude one of the filters. This is done by adding `excludes`, which takes a space separated list of filter ids, to the [chain element](/en/reference/applications/services/http#chain). Example where a security filter is excluded from an inherited chain for *status.html*:
+
+```xml
+
+ http://*/status.html
+
+```
+
+### Creating a custom Filter
+
+Create an [application package](/en/applications/developer-guide) with artifactId `filter-bundle`. Create a new file `filter-bundle/components/src/main/java/com/yahoo/demo/TestRequestFilter.java`:
+
+```java expandable
+package com.yahoo.demo;
+
+import com.yahoo.jdisc.*;
+import com.yahoo.jdisc.handler.*;
+import com.yahoo.jdisc.http.*;
+import com.yahoo.jdisc.http.filter.RequestFilter;
+import java.net.*;
+import java.nio.ByteBuffer;
+
+public class TestRequestFilter extends AbstractResource implements RequestFilter {
+
+    @Override
+    public void filter(HttpRequest httpRequest, ResponseHandler responseHandler) {
+        if (isLocalAddress(httpRequest.getRemoteAddress())) {
+            rejectRequest(httpRequest, responseHandler);
+        } else {
+            httpRequest.context().put("X-NOT-LOCALHOST", "true");
+        }
+    }
+
+    private boolean isLocalAddress(SocketAddress socketAddress) {
+        if (socketAddress instanceof InetSocketAddress) {
+            InetAddress address = ((InetSocketAddress)socketAddress).getAddress();
+            return address.isAnyLocalAddress() || address.isLoopbackAddress();
+        } else {
+            return false;
+        }
+    }
+
+    private void rejectRequest(HttpRequest request, ResponseHandler responseHandler) {
+        HttpResponse response = HttpResponse.newInstance(request, Response.Status.FORBIDDEN);
+        ContentChannel channel = responseHandler.handleResponse(response);
+        channel.write(ByteBuffer.wrap("Not accessible by localhost.".getBytes()), null);
+        channel.close(null);
+    }
+}
+```
+
+Build a bundle, and place it in the [application package](/en/basics/applications)'s *components* directory.
diff --git a/mintlify-docs/en/applications/ide-support.mdx b/mintlify-docs/en/applications/ide-support.mdx
new file mode 100644
index 0000000000..28399ef86d
--- /dev/null
+++ b/mintlify-docs/en/applications/ide-support.mdx
@@ -0,0 +1,15 @@
+---
+title: "IDE support"
+description: "Vespa provides plugins for working with schemas and rank profiles in IDEs:"
+---
+
+- VSCode: [VS Code extension](https://marketplace.visualstudio.com/items?itemName=vespaai.vespa-language-support)
+- Cursor, code-server and other VS Code compatible IDEs: [VS Code extension in Open VSX registry](https://open-vsx.org/extension/vespaai/vespa-language-support)
+- IntelliJ, PyCharm or WebStorm: [Jetbrains plugin](https://plugins.jetbrains.com/plugin/18074-vespa-schema-language-support)
+- Vim: [neovim](https://blog.vespa.ai/interns-languageserver/#neovim-plugin)
+
+If you are working with non-trivial Vespa applications, installing a plugin is highly recommended!
+
+
+
+
diff --git a/mintlify-docs/en/applications/inspecting-structured-data.mdx b/mintlify-docs/en/applications/inspecting-structured-data.mdx
new file mode 100644
index 0000000000..4fbcc40ea7
--- /dev/null
+++ b/mintlify-docs/en/applications/inspecting-structured-data.mdx
@@ -0,0 +1,156 @@
+---
+title: "Inspecting structured data in a Searcher"
+description: "The [Data Access API](https://javadoc.io/doc/com.yahoo.vespa/vespajlib/latest/com/yahoo/data/access/package-summary) is used to access structured data such as arrays and weighted sets."
+---
+
+## Use Case: accessing array attributes
+
+The following illustrates accessing some field that is of array type:
+
+```java expandable
+import com.yahoo.search.*;
+import com.yahoo.search.result.*;
+import com.yahoo.search.searchchain.*;
+import com.yahoo.data.access.*;
+@After(PhaseNames.TRANSFORMED_QUERY)
+@Before(PhaseNames.BLENDED_RESULT)
+public class SimpleTestSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ Result r = execution.search(query);
+ execution.fill(r);
+ for (Hit hit : r.hits().asList()) {
+ if (hit.isMeta()) continue;
+ Object o = hit.getField("titles");
+ if (o instanceof Inspectable) {
+ StringBuilder pasteBuf = new StringBuilder();
+ Inspectable field = (Inspectable) o;
+ Inspector arr = field.inspect();
+ for (int i = 0; i < arr.entryCount(); i++) {
+ pasteBuf.append(arr.entry(i).asString(""));
+ if (i+1 < arr.entryCount()) {
+ pasteBuf.append(", ");
+ }
+ }
+ hit.setField("titles", pasteBuf.toString());
+ }
+ }
+ return r;
+ }
+}
+```
+
+Here we assume there is a field in our schema like this:
+
+```text
+field titles type array<string> {
+    indexing: attribute | summary
+}
+```
+
+We process each hit, traversing the array and building a string which contains all the titles, transforming a field looking like this:
+
+```json
+"titles": [
+"Bond",
+"James Bond"
+]
+```
+
+into this output:
+
+```json
+"titles": "Bond, James Bond"
+```
+
+## Use Case: accessing weighted set attributes
+
+The following example illustrates accessing data held in a weighted set. Note that the Data Access API doesn't have a "set" or "weighted set" concept; the weighted set is represented as an unordered array of objects where each object has an "item" and a "weight" field. The weight is a long integer value, while the item type will vary according to the field type as declared in the schema.
+
+```java expandable
+import com.yahoo.search.*;
+import com.yahoo.search.result.*;
+import com.yahoo.search.searchchain.*;
+import com.yahoo.data.access.*;
+@After(PhaseNames.TRANSFORMED_QUERY)
+@Before(PhaseNames.BLENDED_RESULT)
+public class SimpleTestSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ Result r = execution.search(query);
+ execution.fill(r);
+ for (Hit hit : r.hits().asList()) {
+ processHit(hit);
+ }
+ return r;
+ }
+ void processHit(Hit hit) {
+ if (hit.isMeta()) return;
+ Object o = hit.getField("titles");
+ if (o instanceof Inspectable) {
+ StringBuilder pasteBuf = new StringBuilder();
+ Inspectable field = (Inspectable) o;
+ Inspector arr = field.inspect();
+ for (int i = 0; i < arr.entryCount(); i++) {
+ String sval = arr.entry(i).field("item").asString("");
+ long weight = arr.entry(i).field("weight").asLong(0);
+ pasteBuf.append("title: ");
+ pasteBuf.append(sval);
+ pasteBuf.append("[");
+ pasteBuf.append(weight);
+ pasteBuf.append("]");
+ if (i+1 < arr.entryCount()) {
+ pasteBuf.append(", ");
+ }
+ }
+ hit.setField("alternates", pasteBuf.toString());
+ }
+ }
+}
+```
+
+Here we assume there is a field in the schema like:
+
+```txt
+field titles type weightedset<string> {
+    indexing: attribute | summary
+}
+```
+
+Again we process each hit, and format each element of the weighted set, transforming this input:
+
+```json
+"titles": {
+"Bond": 15,
+"James Bond": 89
+}
+```
+
+into this output:
+
+```json
+"alternates": "title: Bond[15], title: James Bond[89]"
+```
+
+## Unit testing with structured data
+
+For unit testing it is useful to be able to create structured data fields programmatically. This can be done using `Slime`:
+
+```java
+import com.yahoo.slime.*;
+import com.yahoo.data.access.slime.SlimeAdapter;
+import java.nio.charset.StandardCharsets;
+
+// Struct example:
+Slime structSlime = new Slime();
+Cursor struct = structSlime.setObject();
+struct.setString("foo", "bar");
+struct.setDouble("number", 1.0);
+myHit.setField("mystruct", new SlimeAdapter(struct));
+
+// Array example (a separate Slime instance, with its own variable name):
+Slime arraySlime = new Slime();
+Cursor array = arraySlime.setArray();
+array.addString("foo");
+array.addString("bar");
+myHit.setField("myarray", new SlimeAdapter(array));
+
+// Arrays and objects can be arbitrarily nested
+// Alternatively, create the slime structure from a JSON string:
+Slime jsonSlime = SlimeUtils.jsonToSlime(myJsonString.getBytes(StandardCharsets.UTF_8));
+myHit.setField("myfield", new SlimeAdapter(jsonSlime.get()));
diff --git a/mintlify-docs/en/applications/pluggable-frameworks.mdx b/mintlify-docs/en/applications/pluggable-frameworks.mdx
new file mode 100644
index 0000000000..47221b61ee
--- /dev/null
+++ b/mintlify-docs/en/applications/pluggable-frameworks.mdx
@@ -0,0 +1,53 @@
+---
+title: "Using pluggable frameworks"
+description: "Many libraries provide pluggable architectures via Service Provider Interfaces (SPI)."
+---
+
+## Troubleshooting and Configuring the Application
+
+Libraries for pluggable frameworks rely on loading classes dynamically at runtime, usually via `Class.forName("…")`. If the package of the class that is loaded is not imported by our user bundle, this will result in the following error:
+
+```java
+java.lang.ClassNotFoundException: com.sun.imageio.plugins.jpeg.JPEGImageReaderSpi not found by my-bundle [29]
+at
+org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1532)
+```
+
+The example above is from using the [Image I/O framework](https://docs.oracle.com/javase/6/docs/technotes/guides/imageio/). In this case, notice that the missing class is from a `com.sun` package, which is available in the SDK.
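+
+The following minimal sketch shows why such dependencies are invisible at build time. `DynamicLoadingSketch` and `canLoad` are hypothetical names used only for illustration; `Class.forName` is the standard Java call mentioned above.
+
+```java
+public class DynamicLoadingSketch {
+
+    /** Returns true if the named class can be loaded by this class' class loader. */
+    public static boolean canLoad(String className) {
+        try {
+            // The class name is only a string here, so analysis of the
+            // compiled code cannot tell which package it refers to.
+            Class.forName(className);
+            return true;
+        } catch (ClassNotFoundException e) {
+            // Thrown when the class is not visible to this class loader,
+            // e.g. when its package is not imported by the bundle
+            return false;
+        }
+    }
+}
+```
+
+Inside an OSGi bundle, a lookup like this only succeeds for packages that the bundle imports or contains.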
+
+### Importing the Missing Package
+
+The `ClassNotFoundException` means that the bundle is not importing the package. The [bundle-plugin](/en/applications/bundles#maven-bundle-plugin) will usually not have added an import since the class is only referred to from a string in a `Class.forName("…")` statement. Hence, add an explicit `importPackage` in the bundle's pom.xml:
+
+```xml highlight={8}
+
+
+
+ com.yahoo.vespa
+ bundle-plugin
+ ...
+
+ com.sun.imageio.plugins.jpeg
+ ...
+
+
+
+
+```
+
+The `importPackage` configuration option takes a comma-separated list of packages. Adding multiple `importPackage` elements in pom.xml means that only one of them will take effect.
+
+### Exporting the Missing Package from the Container
+
+As mentioned, the missing package in this example is part of the SDK. In these cases, we must tell the Container to export the missing package. When running in [cluster mode](/en/operations/self-managed/multinode-systems#aws-ecs), this is done in `services.xml`:
+
+```xml
+
+
+
+ com.sun.imageio.plugins.jpeg
+
+
+ ...
+
+```
diff --git a/mintlify-docs/en/applications/processing.mdx b/mintlify-docs/en/applications/processing.mdx
new file mode 100644
index 0000000000..a298618554
--- /dev/null
+++ b/mintlify-docs/en/applications/processing.mdx
@@ -0,0 +1,325 @@
+---
+title: "Request-Response Processing"
+description: "*Processing* makes it easy to create low-latency request/response processing applications. It is the recommended way of creating such applications on top of JDisc, but can also be used independently of JDisc. Processing lets you define application behavior by combining Processors performing simple tasks. Processors use a synchronous call model, but the underlying IO may be asynchronous."
+---
+
+
+Javadoc:
+[com.yahoo.processing.Processor](https://javadoc.io/doc/com.yahoo.vespa/processing/latest/com/yahoo/processing/Processor.html)
+[com.yahoo.processing.rendering.Renderer](https://javadoc.io/doc/com.yahoo.vespa/processing/latest/com/yahoo/processing/rendering/Renderer.html)
+
+## Using processing
+
+To use processing, add this dependency to *pom.xml*:
+
+```xml
+
+ com.yahoo.vespa
+ container
+ 8.689.26 {/* Find latest version at [search.maven.org/search?q=g:com.yahoo.vespa%20a:container](https://search.maven.org/search?q=g:com.yahoo.vespa%20a:container) */}
+ provided
+
+```
+
+Or read [how to start a deployable project from scratch](/en/applications/developer-guide).
+
+## Processors
+
+A *processor* subclasses Processor and implements a single method:
+
+```java
+package com.mydomain.example;
+
+import com.yahoo.processing.*;
+import com.yahoo.processing.execution.Execution;
+import com.yahoo.processing.test.ProcessorLibrary.StringData;
+
+public class ExampleProcessor extends Processor {
+
+ @Override
+ public Response process(Request request, Execution execution) {
+ // Process the Request:
+ request.properties().set("foo","bar");
+
+ // Pass on to the next processor in the chain
+ Response response=execution.process(request);
+
+ // process the response
+ response.data().add(new StringData(request,"Hello, world!"));
+
+ return response;
+ }
+
+}
+```
+
+Processors may work on both the request and response, pass on the request one or more times to further processors or create the result data internally or by contacting a remote service. The result data may be a nested composite structure where content is contributed by multiple processors.
+
+## Chaining Processors
+
+Processors should carry out a single task and are combined into complete applications. This is achieved using Chains:
+
+```java
+Chain<Processor> myChain = new Chain<>(new ExampleProcessor(),
+ new FooProcessor(),
+ new BarProcessor());
+Response response=new Execution(myChain).process(request); // execute this chain
+```
+
+This executes the three processors in order. The Execution keeps track of the execution state so the same processor instances may be used in many chains at the same time. When the execution reaches the end of the chain, the execution returns an empty Response to the processor calling it. An AsyncExecution class is provided as a convenience to perform an execution in a separate thread instead.
+
+In most cases it is more convenient to configure chains and processor instances using external configuration. Chains of processors may be specified in a [processing](/en/reference/applications/services/processing) element in the *[services.xml](/en/reference/applications/services/services)* file in the application package. The compiled processors are added to the application package as [OSGi components](/en/applications/components). Chain configuration allows chains to be defined as *sets* of processors with ordering constraints, such that the global ordering of processors can be figured out by the framework, and set operations on chains can be used to define extensions and variants of chains.
+
+## Asynchronous Results
+
+In some cases it is useful to return a Response before all the data in it is available. This allows returning a partial response to clients with low latency even though the complete response contains some data arriving more slowly. The slow data can be added to the Response as a placeholder where actual data will arrive later. The processing framework allows waiting or listening for such completion events as [Guava ListenableFutures](https://guava.dev/releases/snapshot/api/docs/com/google/common/util/concurrent/ListenableFuture.html).
+
+If *all* data is added to the Response as future placeholders the processing framework becomes completely non-blocking.
+
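+The placeholder idea can be illustrated with the JDK's `CompletableFuture`, used here in place of the framework's data lists and Guava futures. This is a sketch of the concept only: `SketchResponse` and its methods are hypothetical and not part of the Processing API, and the class is not thread safe.
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+
+/** Hypothetical response holding data available now and data arriving later. */
+class SketchResponse {
+    private final List<String> available = new ArrayList<>();
+    private final List<CompletableFuture<String>> pending = new ArrayList<>();
+
+    void add(String data) {
+        available.add(data);
+    }
+
+    /** Adds a placeholder and returns the handle the producer uses to fill it. */
+    CompletableFuture<String> addPlaceholder() {
+        CompletableFuture<String> placeholder = new CompletableFuture<>();
+        pending.add(placeholder);
+        return placeholder;
+    }
+
+    /** Data that can be returned to the client right away. */
+    List<String> availableNow() {
+        return new ArrayList<>(available);
+    }
+
+    /** Completes once every placeholder is filled: available data first, then late data. */
+    CompletableFuture<List<String>> complete() {
+        CompletableFuture<Void> all =
+                CompletableFuture.allOf(pending.toArray(new CompletableFuture[0]));
+        return all.thenApply(ignored -> {
+            List<String> everything = new ArrayList<>(available);
+            for (CompletableFuture<String> placeholder : pending)
+                everything.add(placeholder.join()); // already done, so this does not block
+            return everything;
+        });
+    }
+}
+```
+
+A caller can return `availableNow()` immediately and attach a callback to `complete()` instead of blocking a thread while the slow data arrives.
+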
+## Dependency Injection
+
+Processors in real applications will typically depend on some configuration and/or other components to run. Such dependencies should be declared as straightforward constructor arguments to allow them to be injected at construction time.
+
+The container runtime used to host the processing framework uses a dependency injection framework based on Guice, see [container components](/en/applications/components).
+
+As a processor may participate in many processing executions at one time, field values in a processing class should usually be immutable after construction is completed.
+
+## Response Rendering
+
+A *Renderer* is used to serialize the Response for return to a client. Renderers are subclasses of `com.yahoo.processing.rendering.Renderer`. A convenience superclass which handles waiting for future data in the asynchronous case is provided as `com.yahoo.processing.rendering.AsynchronousSectionedRenderer`. The default renderer, which renders in a simple JSON format, is [com.yahoo.processing.rendering.ProcessingRenderer](https://github.com/vespa-engine/vespa/blob/master/container-disc/src/main/java/com/yahoo/processing/rendering/ProcessingRenderer.java) and can be subclassed to customize rendering of each kind of Data item.
+
+Renderers are regular [components](/en/applications/components) which are added to the application package in the [renderer section](/en/reference/applications/services/processing#renderer) of the *services.xml* file. A renderer is selected by setting the `format` request parameter to the renderer id.
+
+## Subclassing of Processing
+
+The Processing framework is meant to be generic and minimal. In some domains it is useful to employ a richer model of Processors, Requests, Responses and Executions targeted to that domain. An example is the [Search domain](/en/applications/searchers), where Searchers, Queries and Results subclass Processors, Requests and Responses. The Processing framework is designed to allow such subclassing to build richer frameworks on top.
+
+## Testing Processors with an Application
+
+A processor can be tested running inside a container. We create a JDisc from *services.xml*:
+
+```java expandable
+import com.yahoo.application.container.JDisc;
+import com.yahoo.application.Networking;
+
+import com.yahoo.processing.Request;
+import com.yahoo.processing.Response;
+
+import com.yahoo.component.ComponentSpecification;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertThat;
+import static org.junit.matchers.JUnitMatchers.containsString;
+
+public class ContainerTest {
+ @Test
+ public void testSearch() {
+ String servicesXml =
+ "" +
+ " " +
+ " " +
+ " " +
+ " " +
+ " " +
+ "";
+ try (JDisc container = JDisc.fromServicesXml(servicesXml, Networking.disable)) {
+ Response response = container.processing().process(ComponentSpecification.fromString("default"), new Request());
+ assertThat(response.data().get(0).toString(), containsString("Hello, world!"));
+ }
+
+ }
+}
+```
+
+We can also examine which processors are in a chain and their ordering:
+
+```java
+ChainRegistry chains = container.processing().getChains();
+Chain defaultChain = chains.getComponent("default");
+boolean foundExampleProcessor = false;
+for (Processor processor: defaultChain.components()) {
+ if ("ExampleProcessor".equals(processor.getClassName()))
+ foundExampleProcessor = true;
+}
+
+
+assertTrue("No instance of ExampleProcessor found in the default chain", foundExampleProcessor);
+```
+
+## Selecting a Non-default Processor Chain
+
+A complete application will usually be composed of several processor chains, which may or may not invoke each other. To select a chain configured with another `id` than "default", add the chain ID as a GET parameter named `chain`.
+
+In other words, given a chain named "testbed", as in:
+
+```xml
+
+
+
+
+
+
+
+```
+
+
+The chain testbed could be tested from the command line by doing:
+
+```bash
+$ curl http://*hostname*:*port*/processing/?chain=testbed
+```
+
+## References
+
+- [Developing web services](/en/applications/web-services).
+- [com.yahoo.processing](https://javadoc.io/doc/com.yahoo.vespa/processing/latest/com/yahoo/processing/package-summary.html) javadoc
+- [Guava Javadoc](https://guava.dev/releases/snapshot/api/docs/).
+
+## Common tasks with processing
+
+This section contains a collection of "how do I" explanations for processing. Most of these pertain to the jDisc binding of Processing, but note that Processing is independent of jDisc and may be invoked programmatically in any environment.
+
+### Accessing the HTTP request from Processors
+
+Processors which interface with the network layer may need to access the network level request to access headers or request data, or to make outgoing calls through jDisc. The jDisc request is available through request properties:
+
+`httpRequest = (com.yahoo.container.jdisc.HttpRequest) processingRequest.properties().get("jdisc.request");`
+
+### Setting response headers from Processors
+
+Response headers may be added to any Response by adding instances of `com.yahoo.processing.handler.ResponseHeaders` to the Response (ResponseHeaders is a kind of response Data). Multiple instances of this may be added to the Response, and the complete set of headers returned is the union of the headers in all such objects. Example Processor:
+
+```java expandable
+import com.yahoo.processing.Processor;
+import com.yahoo.processing.Request;
+import com.yahoo.processing.Response;
+import com.yahoo.processing.handler.ResponseHeaders;
+import com.yahoo.processing.execution.Execution;
+
+import java.util.Collections;
+import java.util.Map;
+import java.util.List;
+
+public class ResponseHeaderSetter extends Processor {
+
+ private final Map<String, List<String>> responseHeaders;
+
+ public ResponseHeaderSetter(Map<String, List<String>> responseHeaders) {
+ this.responseHeaders = Collections.unmodifiableMap(responseHeaders);
+ }
+
+ @Override
+ public Response process(Request request, Execution execution) {
+ Response response = execution.process(request);
+ response.data().add(new ResponseHeaders(responseHeaders, request));
+ return response;
+ }
+
+}
+```
+
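+How headers from several such objects can combine into one set may be sketched as a plain map merge. This is an illustration only, not the framework's code: `HeaderUnion.merge` is a hypothetical helper, and appending values for a repeated header name is an assumption made for the sketch.
+
+```java
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+
+/** Hypothetical helper which combines header maps from several header-carrying objects. */
+class HeaderUnion {
+
+    /** Returns a new map holding every header name, with values appended in the order seen. */
+    static Map<String, List<String>> merge(List<Map<String, List<String>>> headerSets) {
+        Map<String, List<String>> merged = new LinkedHashMap<>();
+        for (Map<String, List<String>> headers : headerSets) {
+            for (Map.Entry<String, List<String>> header : headers.entrySet()) {
+                merged.computeIfAbsent(header.getKey(), name -> new ArrayList<>())
+                      .addAll(header.getValue());
+            }
+        }
+        return merged;
+    }
+}
+```
+
+Because the merge builds a new map, the immutable maps held by each processor are left untouched.
+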
+## Example Processors
+
+This section lists a few example processors which show some use cases for the asynchronous aspects of the API.
+
+```java expandable
+import com.yahoo.component.chain.Chain;
+import com.yahoo.processing.Processor;
+import com.yahoo.processing.Request;
+import com.yahoo.processing.Response;
+import com.yahoo.processing.execution.AsyncExecution;
+import com.yahoo.processing.execution.Execution;
+import com.yahoo.processing.response.FutureResponse;
+
+import java.util.*;
+
+/**
+ * Call a number of chains in parallel
+ */
+public class Federator extends Processor {
+
+ private final List<Chain<? extends Processor>> chains;
+
+ public Federator(Chain<? extends Processor>... chains) {
+ this.chains = Arrays.asList(chains);
+ }
+
+ @Override
+ public Response process(Request request, Execution execution) {
+ List<FutureResponse> futureResponses = new ArrayList<>(chains.size());
+ for (Chain<? extends Processor> chain : chains) {
+ futureResponses.add(new AsyncExecution(chain,execution).process(request));
+ }
+ Response response=execution.process(request);
+ AsyncExecution.waitForAll(futureResponses,1000);
+ for (FutureResponse futureResponse : futureResponses) {
+ Response federatedResponse=futureResponse.get();
+ response.data().add(federatedResponse.data());
+ response.mergeWith(federatedResponse);
+ }
+ return response;
+ }
+}
+```
+
+```java expandable
+import com.yahoo.processing.*;
+import com.yahoo.processing.execution.Execution;
+import com.yahoo.processing.response.*;
+import com.yahoo.processing.test.ProcessorLibrary.StringData;
+
+/**
+ * A data producer which produces data that arrives asynchronously.
+ * This is not a realistic, thread-safe implementation, as only the
+ * most recently created incoming data can be completed.
+ */
+public class AsyncDataProducer extends Processor {
+
+ private IncomingData incomingData;
+
+ @Override
+ public Response process(Request request, Execution execution) {
+ DataList dataList = ArrayDataList.createAsync(request); // Default implementation
+ incomingData=dataList.incoming();
+ return new Response(dataList);
+ }
+
+ /** Called by some other data producing thread, later */
+ public void completeLateData() {
+ incomingData.addLast(new StringData(incomingData.getOwner().request(),
+ "A late hello, world!"));
+ }
+
+}
+```
+
+```java expandable
+import com.google.common.util.concurrent.MoreExecutors;
+import com.yahoo.component.chain.Chain;
+import com.yahoo.processing.*;
+import com.yahoo.processing.execution.*;
+
+/**
+ * A processor which registers a listener on the future completion of
+ * asynchronously arriving data to perform another chain at that point.
+ */
+public class AsyncDataProcessingInitiator extends Processor {
+
+ private final Chain asyncChain;
+
+ public AsyncDataProcessingInitiator(Chain asyncChain) {
+ this.asyncChain=asyncChain;
+ }
+
+ @Override
+ public Response process(Request request, Execution execution) {
+ Response response=execution.process(request);
+ response.data().complete().addListener(new RunnableExecution(request,
+ new ExecutionWithResponse(asyncChain, response, execution)),
+ MoreExecutors.directExecutor()); // sameThreadExecutor() was removed in Guava 21
+ return response;
+ }
+
+}
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/applications/request-handlers.mdx b/mintlify-docs/en/applications/request-handlers.mdx
new file mode 100644
index 0000000000..ca6bbe6381
--- /dev/null
+++ b/mintlify-docs/en/applications/request-handlers.mdx
@@ -0,0 +1,42 @@
+---
+title: "Request handlers"
+description: "This document explains how to implement and deploy a custom request handler."
+---
+
+In most cases, implementing your own request handlers is unnecessary, as both searchers and processors can access the request data directly. However, there are a few cases where custom request handlers are useful:
+
+
+1. You need to implement a custom REST API.
+2. Your application needs to control which parameters are used to route requests to a particular search or processing chain.
+
+## Implementing a request handler
+
+Upon receiving a request, the request handler must consume its content, process it, and then return a response. The most convenient way to implement a request handler is by subclassing the [ThreadedHttpRequestHandler](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/container/jdisc/ThreadedHttpRequestHandler).
+
+This utility base class uses a synchronous API and a multithreaded execution model. It also implements a lot of functionality that is needed by most request handlers:
+
+- queries are automatically written to the access log
+- an HTTP date header is added to the response (if your own code adds a date header, it will not be overwritten, though)
+- logging of exceptions and queries that time out
+- automatic shutdown when an Error is thrown
+
+### Example request handler implementations
+
+The [Vespa sample apps](https://github.com/vespa-engine/sample-apps) on GitHub contain a few example request handler implementations:
+
+| Handler | Description |
+| :--- | :--- |
+| [DemoHandler](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DemoHandler.java) | A handler that modifies a request before dispatching it to the `ProcessingHandler`. This handler is also used in the [HTTP API tutorial](/en/learn/tutorials/http-api). Note that since this depends on ProcessingHandler you must add `processing` to your `container` tag to use it. If you want to issue Queries instead, have com.yahoo.search.searchchain.ExecutionFactory injected instead and use it to create executions and call search/fill on them. |
+
+## Deploying a request handler
+
+To deploy a request handler in an application, use the [handler](/en/reference/applications/services/container#handler) element in *services.xml*:
+
+```xml highlight={2-4}
+
+
+ http://*/*
+
+```
+
+A request handler may be bound to zero or more URI patterns by adding a [binding](/en/reference/applications/services/container#binding) element for each pattern.
diff --git a/mintlify-docs/en/applications/result-renderers.mdx b/mintlify-docs/en/applications/result-renderers.mdx
new file mode 100644
index 0000000000..1e8c68d288
--- /dev/null
+++ b/mintlify-docs/en/applications/result-renderers.mdx
@@ -0,0 +1,273 @@
+---
+title: "Result renderers"
+description: "Vespa provides a default JSON format for query results. *Renderers* can be configured to implement custom formats, like binary and text format. Renderers should not be used to implement business logic - that should go in [Searchers](/en/applications/searchers), [Handlers](/en/applications/request-handlers) or [Processors](/en/applications/processing). This guide assumes familiarity with the [Developer Guide](/en/applications/developer-guide)."
+---
+
+Renderers are implemented by subclassing one of:
+
+- [com.yahoo.search.rendering.Renderer](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/search/rendering/Renderer)
+- [com.yahoo.search.rendering.SectionedRenderer](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/search/rendering/SectionedRenderer)
+- [com.yahoo.processing.rendering.AsynchronousSectionedRenderer<Result>](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/processing/rendering/AsynchronousSectionedRenderer)
+
+SectionedRenderer differs from Renderer by providing each part to be rendered in separate steps. It is therefore easier to implement a SectionedRenderer than a regular Renderer. AsynchronousSectionedRenderer has a similar API to SectionedRenderer, but supports asynchronously fetched hit contents, so if supporting slow clients or backends is a priority, this offers some advantages. AsynchronousSectionedRenderer also exposes an OutputStream instead of a Writer, so if the backend data contains data encoded the same way as the output from the container (often UTF-8), performance gains are possible.
+
+All renderers are [components](/en/applications/components). They are built and deployed like all other container components, and support [custom config](/en/applications/configuring-components).
+
+Renderers do *not* need to be thread safe - they can safely use and store state during rendering in member variables. The container supports this by cloning the renderers just before rendering the search result. To support cloning correctly, the renderers are required to obey the following contract:
+
+1. At construction time, only final members shall be initialized, and these must refer to immutable data only.
+2. State mutated during rendering shall be initialized in the init method.
+
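+The contract can be pictured with a framework-free model of clone-then-init. The sketch below assumes the general pattern only and is not the container's implementation; `SketchRenderer` and `CountingRenderer` are hypothetical classes, and hits are reduced to strings.
+
+```java
+/** Hypothetical base class: a prototype is cloned for each rendering, then init() is called. */
+abstract class SketchRenderer implements Cloneable {
+
+    /** Resets per-rendering state; called on the fresh clone. */
+    abstract void init();
+
+    abstract String render(String[] hits);
+
+    /** What a container-like host would do for every result. */
+    final String renderWithFreshCopy(String[] hits) {
+        try {
+            SketchRenderer copy = (SketchRenderer) clone();
+            copy.init();
+            return copy.render(hits);
+        } catch (CloneNotSupportedException e) {
+            throw new IllegalStateException(e);
+        }
+    }
+}
+
+class CountingRenderer extends SketchRenderer {
+    private final String prefix; // set at construction: final and immutable
+    private int rendered;        // mutated during rendering: reset in init()
+
+    CountingRenderer(String prefix) {
+        this.prefix = prefix;
+    }
+
+    @Override
+    void init() {
+        rendered = 0;
+    }
+
+    @Override
+    String render(String[] hits) {
+        StringBuilder out = new StringBuilder();
+        for (String hit : hits)
+            out.append(prefix).append(++rendered).append(": ").append(hit).append('\n');
+        return out.toString();
+    }
+}
+```
+
+Each call to `renderWithFreshCopy` starts counting from 1 again, since the counter lives in a clone that is initialized just before rendering, while the shared `prefix` is safe to copy by reference because it never changes.
+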
+To enable a renderer, add to [services.xml](/en/reference/applications/services/container):
+
+```xml highlight={6-8}
+
+
+ …
+
+
+
+
+ …
+
+ …
+
+```
+
+To use the renderer, add [&presentation.format=\[id\]](/en/reference/api/query#presentation.format) to queries - in this case `&presentation.format=MyRenderer`.
+
+## Renderer
+
+The simplest form of a renderer is extending `Renderer`. The `render` method does all the work - the derived class is expected to extract all the entities of interest itself and render them. Simple example:
+
+```java
+public class SimpleRenderer extends Renderer {
+ @Override
+ public void render(Writer writer, Result result) throws IOException {
+ writer.write("The result contains " + result.getHitCount() + " hits.");
+ }
+
+ @Override
+ public String getEncoding() {
+ return "utf-8";
+ }
+
+ @Override
+ public String getMimeType() {
+ return "text/plain";
+ }
+}
+```
+
+More complex example:
+
+```java expandable
+/**
+ * Render result sets as plain text. First line is whether an error occurred,
+ * second rendering initialization time stamp, then each line is the ID of each
+ * document returned, and the last line is time stamp for when the renderer was finished.
+ */
+public class DemoRenderer extends Renderer {
+ private String heading;
+
+ /**
+ * No global, shared state to set.
+ */
+ public DemoRenderer() {
+ }
+
+ @Override
+ protected void render(Writer writer, Result result) throws IOException {
+ if (result.hits().getErrorHit() == null) {
+ writer.write("OK\n");
+ } else {
+ writer.write("Oops!\n");
+ }
+ writer.write(heading + "\n");
+ renderHits(writer, result.hits());
+ writer.write("Rendering finished work: " + System.currentTimeMillis() + "\n");
+ }
+
+ private void renderHits(Writer writer, HitGroup hits) throws IOException {
+ for (Iterator<Hit> i = hits.deepIterator(); i.hasNext();) {
+ Hit h = i.next();
+ if (h.types().contains("summary")) {
+ String id = h.getDisplayId();
+ if (id != null) {
+ writer.write(id + "\n");
+ }
+ }
+ }
+ }
+
+ @Override
+ public String getEncoding() {
+ return "utf-8";
+ }
+
+ @Override
+ public String getMimeType() {
+ return "text/plain";
+ }
+
+ /**
+ * Initialize mutable, per-result set state here.
+ */
+ @Override
+ public void init() {
+ long time = System.currentTimeMillis();
+ heading = "Renderer initialized: " + time;
+ }
+
+}
+```
+
+## SectionedRenderer
+
+To create a SectionedRenderer, subclass it and implement all its abstract methods. For each non-compound entity, such as regular hits and query contexts, there is an associated method with the same name:
+
+```java
+public class DemoRenderer extends SectionedRenderer {
+
+ @Override
+ public void hit(Writer writer, Hit hit) throws IOException {
+ writer.write("Hit: " + hit.getField("documentid") + "\n");
+ }
+}
+```
+
+For each compound entity, such as hit groups and the result itself, there are pairs of methods, named `begin` and `end`:
+
+```java
+public class DemoRenderer extends SectionedRenderer {
+
+ private int indentation;
+
+ @Override
+ public void beginHitGroup(PrintWriter writer, HitGroup hitGroup) throws IOException {
+ writer.write("Begin hit group:" + hitGroup.getId() + "\n");
+ ++indentation;
+ }
+
+ @Override
+ public void endHitGroup(PrintWriter writer, HitGroup hitGroup) throws IOException {
+ --indentation;
+ writer.write("End hit group:" + hitGroup.getId() + "\n");
+ }
+}
+```
+
+For a compound entity, a method will be called for each of its members after its `begin`-method and before its `end`-method has been called:
+
+```text
+                      Call sequence
+                      -------------------
+Result {              1. beginResult()
+    HitGroup {        2. beginHitGroup()
+        Hit           3. hit()
+        Hit           4. hit()
+        Hit           5. hit()
+    }                 6. endHitGroup()
+}                     7. endResult()
+```
+
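+The call sequence amounts to a depth-first walk of the result tree. The following self-contained sketch records the callbacks in the order such a walk would issue them; `SketchNode` and `CallSequence` are hypothetical classes for illustration and do not use the Vespa rendering API.
+
+```java
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+/** Hypothetical result tree: a node with children is a group, one without is a hit. */
+class SketchNode {
+    final String name;
+    final List<SketchNode> children;
+
+    SketchNode(String name, SketchNode... children) {
+        this.name = name;
+        this.children = Arrays.asList(children);
+    }
+}
+
+/** Hypothetical driver which records the callbacks a sectioned renderer would receive. */
+class CallSequence {
+    final List<String> calls = new ArrayList<>();
+
+    List<String> renderResult(SketchNode rootGroup) {
+        calls.add("beginResult()");
+        visit(rootGroup);
+        calls.add("endResult()");
+        return calls;
+    }
+
+    private void visit(SketchNode node) {
+        // Simplification: a node without members is always treated as a hit
+        if (node.children.isEmpty()) {
+            calls.add("hit(" + node.name + ")");
+            return;
+        }
+        calls.add("beginHitGroup(" + node.name + ")");
+        for (SketchNode child : node.children) // members come between begin and end
+            visit(child);
+        calls.add("endHitGroup(" + node.name + ")");
+    }
+}
+```
+
+For a root group holding three hits, the recorded list has the same seven steps as the table above; nested groups produce nested begin/end pairs.
+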
+For [grouping results](/en/querying/grouping), there is a dedicated set of callbacks available:
+
+- `beginGroup()` / `endGroup()`
+- `beginGroupList()` / `endGroupList()`
+- `beginHitList()` / `endHitList()`
+
+`Group`, `GroupList` and `HitList` are all subclasses of `HitGroup`, and the default implementations of the above methods call `beginHitGroup()` and `endHitGroup()`, respectively. Furthermore, since all the attributes of those classes are regular fields as defined by the root `Hit` class, output can be produced by simply implementing `beginHitGroup()`, `endHitGroup()`, and `hit()`.
+
+### JSON example
+
+Read the [default JSON result format](/en/reference/querying/default-result-format) before implementing custom JSON renderers. Example: Render a set of fields containing JSON data as a JSON array. In other words, dump a variable length array containing all available data, ignore everything else and silently ignore error states (i.e. good for prototyping):
+
+```java expandable
+package com.yahoo.mysearcher;
+
+import com.yahoo.search.Result;
+import com.yahoo.search.query.context.QueryContext;
+import com.yahoo.search.rendering.SectionedRenderer;
+import com.yahoo.search.result.ErrorMessage;
+import com.yahoo.search.result.Hit;
+import com.yahoo.search.result.HitGroup;
+
+import java.io.IOException;
+import java.io.Writer;
+import java.util.Collection;
+
+public class MyRenderer extends SectionedRenderer {
+ /**
+ * A marker variable for the hit rendering to know whether
+ * the hit being rendered is the first one that is rendered.
+ */
+ boolean firstHit;
+
+ public void init() {
+ firstHit = true;
+ }
+
+ @Override
+ public String getEncoding() {
+ return "utf-8";
+ }
+
+ @Override
+ public String getMimeType() {
+ return "application/json";
+ }
+
+ @Override
+ public void beginResult(Writer writer, Result result) throws IOException {
+ writer.write("[");
+ }
+
+ @Override
+ public void endResult(Writer writer, Result result) throws IOException {
+ writer.write("]");
+ }
+
+ @Override
+ public void error(Writer writer, Collection<ErrorMessage> errorMessages) throws IOException {
+ // swallows errors silently
+ }
+
+ @Override
+ public void emptyResult(Writer writer, Result result) throws IOException {
+ //write nothing.
+ }
+
+ @Override
+ public void queryContext(Writer writer, QueryContext queryContext) throws IOException {
+ //write nothing.
+ }
+
+ @Override
+ public void beginHitGroup(Writer writer, HitGroup hitGroup) throws IOException {
+ //write nothing.
+ }
+
+ @Override
+ public void endHitGroup(Writer writer, HitGroup hitGroup) throws IOException {
+ //write nothing.
+ }
+
+ @Override
+ public void hit(Writer writer, Hit hit) throws IOException {
+ if (!firstHit) {
+ writer.write(",\n");
+ }
+ writer.write(hit.toString());
+ firstHit = false;
+ }
+}
+```
+
+## AsynchronousSectionedRenderer<Result>
+
+This is the same as for the [processing framework](/en/applications/processing#response-rendering). It is conceptually similar to SectionedRenderer, but has no special cases for search results as such. The utility method getResponse() has a parametrized return type, though, so templating the renderer on `Result` takes away some of the hassle.
+
+Find an example in [DemoRenderer.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DemoRenderer.java).
diff --git a/mintlify-docs/en/applications/searchers.mdx b/mintlify-docs/en/applications/searchers.mdx
new file mode 100644
index 0000000000..c2d3bd4951
--- /dev/null
+++ b/mintlify-docs/en/applications/searchers.mdx
@@ -0,0 +1,427 @@
+---
+title: "Searchers"
+description: "The *Container* is the home for all global processing of user actions (represented as queries) and their results. It provides a development and hosting environment for processing [components](/en/applications/components), and a model for composing such components developed by multiple development teams into a functional whole."
+---
+
+This document describes how to develop and deploy Searcher components. To get started with development, see the [Developer Guide](/en/applications/developer-guide). For reference, see the [Container javadoc](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/search/package-summary), and the [services.xml reference](/en/reference/applications/services/processing#chain).
+
+Best practice for queries is to submit the user-generated query as-is to Vespa, then use Searcher components to implement additional logic. Refer to the [Query HTTP API](/en/querying/query-api#http).
+
+
+
+
+
+## Searchers
+
+The components of the search container are called *Searchers*. A searcher is a component - usually deployed as part of an OSGi bundle - which extends the class `com.yahoo.search.Searcher`. All Searchers must implement a single method:
+
+```java
+public Result search(Query query, Execution execution);
+```
+
+When the container receives a request, it will create a Query representing it and execute a configured list of such Searcher components. This is done by calling the `search()` method on the first searcher in the list. That searcher is responsible for passing the call to the next searcher in the list (or not, as it sees fit). This is done by calling `search()` on the Execution given, which keeps track of where we are in the list of Searchers. Hence, this is a no-op searcher implementation:
+
+```java
+public Result search(Query query, Execution execution) {
+ return execution.search(query);
+}
+```
+
+Eventually the search call will reach the end of the list of searchers. The last searcher in the list may create a Result (somehow), which is now passed back up the call chain until it reaches the top. The container will then translate that Result back to a response to the incoming request.
+
+As is evident from this description, this is a synchronous model, where each request is processed in a dedicated worker thread until the result is returned. This synchronous model is implemented with [multi-threading of individual searchers](#keeping-state-in-searchers).
+
+The single searcher method is sufficient to express all kinds of functionality, e.g.:
+
+- A *query processor* will modify the query, then pass it on to the next searcher.
+- A *result processor* will pass the query on to get the result, then modify the result before returning it.
+- A *result producer* which produces a result by some internal lookup or (more typically) by sending a network request to a backend will translate the query to the desired execution and instantiate and return a Result holding the outcome.
+- A *workflow* might pass the query on multiple times in a loop and gradually build up a Result for return from the Results received from each Query execution, or choose which Query to pass on in an if-else branch, etc.
+
+## Queries and Results
+
+The **Query** in the search container is the container of all the information needed to create a result to the request, including:
+
+- The parameters received in the request, including the user's query string, or chosen action.
+- The parameters in the chosen [query profile](/en/querying/query-profiles), if any.
+- The desired execution, including the boolean query tree. This information is gradually created from the request and query profile by Searcher components.
+- Any objects of any type containing information created by Searchers along the way.
+
+The **Result** encapsulates all the data generated from a Query. The Result contains a composite tree of Hit objects organized in lists called HitGroups (the Result points to the topmost group). Each Hit contains some particular data item which is deemed relevant to the Query. Hit objects have a general key-value storage, but are also polymorphic to support representing more structured information. See the [inspecting structured data](/en/applications/inspecting-structured-data) documentation for details about handling structured information in a Searcher.
+
+As Hits may be hierarchically organized into hit lists, the Result object is capable of representing any organization of the results. For example, in a federated system the hits are initially organized in one hit group per source. Upstream searchers may reorganize this into something that fits the user's need better, e.g. a single blended group, or one group per likely interpretation of the query.
+
+## Search Chains
+
+The lists of searchers mentioned above are called *search chains*. Search chains are a special case of the [general component chains](/en/applications/chaining). A search chain is nothing more than a list of searcher instances having an id. The search chains are typically not created for every query but are part of the configuration. Multiple chains may exist at the same time, and the chain to execute may be specified in the request. If nothing is specified, a default one is used. The same Searcher instance may exist in multiple search chains, which is why the Execution object is responsible for knowing the next Searcher to invoke in a particular request.
+
+Search chains may also be executed programmatically (typically from a Searcher), synchronously or asynchronously:
+
+```java
+// Get a chain by id
+SearchChain myChain = execution.searchChainRegistry().getComponent("myChain");
+// Execute it in the same thread
+Result result = new Execution(myChain, execution.context()).search(query);
+// ... or in another thread
+Execution settings = new Execution(myChain, execution.context());
+FutureResult futureResult = new AsyncExecution(settings).search(query);
+FutureResult otherFutureResult = new AsyncExecution(settings).search(otherQuery);
+```
+
+Asynchronous execution is useful in cases like [federation](/en/querying/federation), where a searcher forks a Query to multiple search chains in parallel, each getting results from a particular source. Also, as in the example, it is allowed to use the same Execution instance to construct multiple AsyncExecution instances, as the state is only copied from the constructor argument.
+
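+The fork-and-join pattern behind federation can be sketched with JDK futures alone. This is a simplified model, not the Vespa API: `SketchFederator` is hypothetical, each source is modelled as a function from a query string to a list of hits, and the choice to drop slow or failing sources is an assumption of the sketch.
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.function.Function;
+
+/** Hypothetical federator: each source is modelled as a function from query to hits. */
+class SketchFederator {
+    private final List<Function<String, List<String>>> sources;
+
+    SketchFederator(List<Function<String, List<String>>> sources) {
+        this.sources = sources;
+    }
+
+    /** Queries all sources in parallel and blends what arrives within the timeout. */
+    List<String> search(String query, long timeoutMillis) {
+        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
+        for (Function<String, List<String>> source : sources)
+            futures.add(CompletableFuture.supplyAsync(() -> source.apply(query)));
+
+        // One shared deadline, so the total wait is bounded regardless of source count
+        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
+        List<String> blended = new ArrayList<>();
+        for (CompletableFuture<List<String>> future : futures) {
+            long remaining = Math.max(0, deadline - System.nanoTime());
+            try {
+                blended.addAll(future.get(remaining, TimeUnit.NANOSECONDS));
+            } catch (TimeoutException | ExecutionException e) {
+                // A slow or failing source contributes no hits in this sketch
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
+                break;
+            }
+        }
+        return blended;
+    }
+}
+```
+
+Hits are blended in source order rather than arrival order, which keeps the output deterministic when all sources answer in time.
+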
+The searchers in a chain are not ordered explicitly, but by [ordering constraints](/en/applications/chaining) declared in the searchers or their configuration. Also read the [search reference](/en/reference/applications/services/search).
+
+### Writing a Searcher
+
+Example of a complete searcher:
+
+```java
+package com.yahoo.search.example;
+import com.yahoo.search.*;
+import com.yahoo.search.result.Hit;
+import com.yahoo.search.searchchain.Execution;
+/**
+* A searcher adding a new hit.
+*/
+public class SimpleSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ Result result = execution.search(query); // Pass on to the next searcher to get results
+ Hit hit = new Hit("test");
+ hit.setField("message", "Hello world");
+ result.hits().add(hit);
+ return result;
+ }
+}
+```
+
+The container will create one or more instances of this class and place it in the desired search chain(s) to serve queries, as specified in [the configuration](/en/applications/components#adding-component-to-application-package). The first line in this searcher forwards the query to whatever is the next searcher in the chain this is a part of. This will eventually produce a Result, which is modified and then passed back to the previous searcher in this chain. The container will create a new instance of this searcher only when it is reconfigured, so any data needed by the searcher can be read and prepared from a constructor in the searcher. Constructors may also accept [configuration](/en/applications/components#dependency-injection), as any other pluggable component.
+
+Find the full API available to searchers in the [Search Container Javadoc](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/search/package-summary).
+
+### Testing a Searcher
+
+Before there is any point in testing a searcher in a real system, it should pass a set of unit tests which test it in isolation or together with the few searchers it interacts with. To do this, we can write unit tests which programmatically set up a search chain containing the searcher to be tested, the searchers it interacts with (if any) and a searcher which produces mock results appropriate for the tests. Here is a simple example testing the Searcher above:
+
+```java expandable
+package com.yahoo.search.example.test;
+
+import com.yahoo.component.chain.Chain;
+import com.yahoo.search.*;
+import com.yahoo.search.searchchain.*;
+import com.yahoo.search.example.SimpleSearcher;
+
+public class SimpleSearcherTestCase extends junit.framework.TestCase {
+
+public void testBasics() {
+ // Create chain
+ Chain<Searcher> searchChain = new Chain<>(new SimpleSearcher());
+
+ // Create an empty context. In a running container this would be
+ // populated with settings used by different searchers. Tests must
+ // set this according to their own requirements.
+ Execution.Context context = Execution.Context.createContextStub(null);
+ Execution execution = new Execution(searchChain, context);
+
+ // Execute it
+ Result result = execution.search(new Query("search/?query=some test query"));
+
+ // Assert the result has the expected hit by scanning for the ID
+ assertNotNull(result.hits().get("test"));
+ }
+}
+```
+
+In this case, no searcher producing mock results is needed because the searcher we are testing does not care what the Result contains. If the search chain ends with a searcher which produces no result, the framework simply returns an empty result, which is what happens here. A test that adds a mock searcher producing results is shown in [federation](/en/querying/federation#unit-testing-the-result-processor).
+
+To write unit tests of the whole application package, see the [Developer Guide](/en/applications/developer-guide).
+
+### Deploying a Searcher
+
+Once the searcher passes unit tests, it can be deployed to the Vespa system hosting it. The procedure is the same as described in [deploying a component](/en/applications/components#deploying-a-component). First [build the component jar](/en/applications/components#building-the-plugin-jar). To include the searcher in *services.xml*, define a search chain and add the searcher to it - example:
+
+```xml
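+<?xml version="1.0" encoding="utf-8" ?>
+<services version="1.0">
+    <admin version="2.0">
+        <adminserver hostalias="node1" />
+        <logserver hostalias="node1" />
+    </admin>
+    <container version="1.0">
+        <search>
+            <chain id="default" inherits="vespa">
+                <searcher id="com.yahoo.search.example.SimpleSearcher" />
+            </chain>
+        </search>
+        <nodes>
+            <node hostalias="node1" />
+        </nodes>
+    </container>
+</services>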
+```
+
+This defines the search chain `default`, which is used in queries when no other chain is explicitly specified. The searcher id above is resolved to the component bundle jar we added by the symbolic name in the manifest, and to the right class within the bundle by the class name. Keeping all three the same keeps things simple, but more advanced setups where they differ are also supported, see later sections.
+
+See the [search chains reference](/en/reference/applications/services/search#chain).
+
+Example *hosts.xml*:
+
+```xml
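+<?xml version="1.0" encoding="utf-8" ?>
+<hosts>
+    <host name="localhost">
+        <alias>node1</alias>
+    </host>
+</hosts>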
+```
+
+By creating a directory containing *services.xml*, *hosts.xml* and *components/Simplesearcher.jar*, that directory becomes a complete application package containing a bundle, which can now be deployed to a Vespa instance.
+
+After deployment, query the application: [http://localhost:8080/search/?query=best](http://localhost:8080/search/?query=best).
+
+### Testing a Searcher with an Application
+
+A searcher can also be tested running inside a container. Create an instance from the *container* part of the *services.xml* file above:
+
+```java expandable
+import com.yahoo.component.ComponentSpecification;
+import com.yahoo.application.container.JDisc;
+import com.yahoo.application.Networking;
+
+import com.yahoo.search.Query;
+import com.yahoo.search.Result;
+
+import org.junit.jupiter.api.Test;
+
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+
+public class ContainerTest {
+ @Test
+ public void testSearch() {
+ String servicesXml =
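+ "<container version='1.0'>" +
+ "  <search>" +
+ "    <chain id='default'>" +
+ "      <searcher id='com.yahoo.search.example.SimpleSearcher'/>" +
+ "    </chain>" +
+ "  </search>" +
+ "</container>";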
+ try (JDisc container = JDisc.fromServicesXml(servicesXml, Networking.disable)) {
+ Result result = container.search().process(ComponentSpecification.fromString("default"),
+ new Query("search/?query=test+query"));
+ assertNotNull(result.hits().get("test"));
+ }
+ }
+}
+```
+
+Examine which searchers are in a chain and their ordering:
+
+```java
+ChainRegistry<Searcher> chains = container.searching().getChains();
+Chain<Searcher> defaultChain = chains.getComponent("default");
+
+boolean foundSimpleSearcher = false;
+for (Searcher searcher: defaultChain.components()) {
+ if ("com.yahoo.search.example.SimpleSearcher".equals(searcher.getClassName()))
+ foundSimpleSearcher = true;
+}
+
+assertTrue("No instance of SimpleSearcher found in the default chain", foundSimpleSearcher);
+```
+
+## Passing information between Searchers
+
+The query object is used to pass information between searchers. Part of the query is a general property store which may hold any object. Any values set in the request or in the query profile are available through these properties, and searchers may also add any objects they create. This is useful when one searcher produces information that another searcher consumes later. Example:
+
+```java
+import com.yahoo.search.*;
+import com.yahoo.search.searchchain.*;
+@Provides(SomeObject.NAME)
+public class ProducerSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ SomeObject.setTo(query.properties(), new SomeObject(query));
+ return execution.search(query); // Pass to next in chain
+ }
+}
+```
+
+```java
+import com.yahoo.search.*;
+import com.yahoo.search.searchchain.*;
+@After(SomeObject.NAME)
+public class ConsumerSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ SomeObject someObject = SomeObject.getFrom(query.properties());
+ ...
+ return execution.search(query); // Pass to next in chain
+ }
+}
+```
+
+```java
+import com.yahoo.search.query.Properties;
+public final class SomeObject {
+ public static final String NAME = "SomeObject";
+ public static void setTo(Properties properties, SomeObject value) {
+ properties.set(NAME, value);
+ }
+ @SuppressWarnings("unchecked")
+ public static SomeObject getFrom(Properties properties) {
+ return (SomeObject) properties.get(NAME);
+ }
+}
+```
+
+This code illustrates two idioms such searchers should follow when exchanging data:
+
+- The key to an object should be exactly the same as the short name of the object stored.
+- The searcher should declare that it *provides* exactly the same name (and of course the consumers must declare that they need to be *after* the object is provided).
+
+When it does not cause unwanted dependencies, it is recommended to wrap the property get and put in a (static) `getFrom` and `setTo` method in the stored object, to allow storage and lookup without having to mention the key unnecessarily outside the object.
+
+Note that the objects are passed as regular in-memory references, so there is no noticeable overhead in this. However, in some situations (like when federating to multiple sources) the query must be cloned. The query will then attempt to clone the added properties: those that implement Cloneable will have clone called, while the rest will be copied by reference.
+
+
+**Important:**
+
+Objects added to the query which contain mutable state must be **deep cloned** to avoid bugs.
+
+
+On the other hand, cloning objects which should not change is wasteful; they should be copied by reference. Hence, the guidelines are:
+
+- Objects which should not be modified downstream should enforce immutability when added to the Query, either by not offering any mutator methods, or by being *frozen* (in a state where any mutator call causes an exception). Objects which *enforce* immutability should either not implement Cloneable or should implement a shallow clone.
+- Objects which should support downstream modifications **must** implement Cloneable and offer a clone method which performs deep copying.
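+
+The two guidelines above can be sketched in plain Java. This is a minimal illustration, not part of the Vespa API: `MutableTerms` and `FrozenSettings` are hypothetical property classes invented for this example.
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+
+// Hypothetical mutable property: downstream searchers may add terms,
+// so clone() must copy the list instead of sharing it.
+final class MutableTerms implements Cloneable {
+    private List<String> terms = new ArrayList<>();
+
+    void add(String term) { terms.add(term); }
+    List<String> terms() { return terms; }
+
+    @Override
+    public MutableTerms clone() {
+        try {
+            MutableTerms copy = (MutableTerms) super.clone();
+            copy.terms = new ArrayList<>(terms); // deep copy of the mutable state
+            return copy;
+        } catch (CloneNotSupportedException e) {
+            throw new AssertionError("Cloneable is implemented", e);
+        }
+    }
+}
+
+// Hypothetical immutable property: no mutators and no Cloneable,
+// so it is safe and cheap to copy by reference.
+final class FrozenSettings {
+    private final String market;
+
+    FrozenSettings(String market) { this.market = market; }
+    String market() { return market; }
+}
+```
+
+With this shape, a change made to a cloned `MutableTerms` does not leak back into the original, while a `FrozenSettings` instance can be shared by every clone of the query.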
+
+### Query Context
+
+In some cases there is a need to pass information between searchers beyond those that see the same Query object. For this purpose, the Query provides a QueryContext object, which gives all Searchers working on the same request a shared view of the data. The context provides (among other things) a facility for setting properties (named objects). The context can be accessed safely from all the threads working on a request without incurring synchronization overhead (with some caveats), but lookup time is linear, not constant. To set and retrieve such properties, use:
+
+```java
+{query|result}.getContext(true).setProperty(name, value)
+{query|result}.getContext(true).getProperty(name)
+```
+
+## Parametrizing Searchers
+
+Passing arguments to searchers is easy: any key the searcher looks up in the query properties can be set as is in the request, or in a query profile. Example:
+
+```java
+String myParameter = query.properties().getString("my.parameter", "defaultValue");
+```
+
+This value can be set by adding `&my.parameter=myValue` to the request. Guidelines:
+
+- Names should use camelCase, starting with a lowercase letter.
+- Dots are used for nesting and have a special meaning in [query profiles](/en/querying/query-profiles). They exist to aid organization of the space of parameters, which easily grows quite large. Usually, the right thing to do is to create a separate name space for each searcher - i.e. use the same dotted prefix for all parameters, as in `myfeature.a`, `myfeature.b` etc. In addition to helping keep the search API clean, this allows various query profiles containing settings for all values in `myfeature` to be defined and selected at run time, which is often useful.
+- To make such parameter APIs easier to use, one should also consider creating a [query profile type](/en/querying/query-profiles#query-profile-types) defining the valid parameters. This will be in the form of an XML file accompanying the bundle. This allows checking and optionally enforcement of validity of request parameter and query profile settings of the parameters.
+
+Use parameters for all query state that may reasonably change with every query and is just as cheap to handle that way. Good candidates are numerical inputs to algorithms and switches for business logic.
+
+## Execution model
+
+In broad strokes, the Container works like this:
+
+1. The main thread picks up one of the requests waiting in the queue of the input socket.
+2. This thread selects the search chain to be used to answer this request, and hands off the actual execution of the chain to a worker thread (there are many such worker threads).
+3. The worker thread calls all searchers in the search chain in turn, starting with the first one.
+4. Each searcher returns results.
+5. Results are eventually rendered (maybe using a template) into the buffer of the output socket.
+
+There is a single instance of each search chain. In every chain, there is a single instance of each searcher. (Unless a chain is configured with multiple, identical searchers - this is a rare case.)
+
+When simultaneous requests arrive for the same search chain, multiple worker threads execute the searchers in that chain. A searcher can therefore be executed concurrently by multiple threads; many threads of execution can be going through the `search()` method, concurrently.
+
+This model places an important constraint on searcher classes: *instance variables are not safe.* They must be eliminated, or made thread-safe somehow.
+
+## Keeping state in Searchers
+
+Because queries and results are passed on the call stack, the container allocates many worker threads to execute queries, using one thread per query until the result is returned.
+
+This means that any state a searcher needs to keep for a particular query until the result is returned should be kept in local variables in the search method, while state which should be shared by all queries should be kept in member variables. As the latter kind is accessed by multiple threads at any one time, such member variables must be *thread-safe*.
+
+This critical restriction is similar to those of e.g. the Servlet API. A quick example should drive the point home:
+
+```java
+public class SafeSearcher extends Searcher {
+ public Result search(Query query, Execution execution) {
+ long count = (Long) query.properties().get("Count");
+ count++;
+ return execution.search(query);
+ }
+}
+public class UnsafeSearcher extends Searcher {
+ private long count;
+ public Result search(Query query, Execution execution) {
+ count = (Long) query.properties().get("Count");
+ count++; // unsafe
+ return execution.search(query);
+ }
+}
+```
+
+The second example uses an instance variable, which will be accessed concurrently by multiple threads. Without proper concurrency controls (such as synchronization), such access is inherently unsafe and may yield inconsistent results, and/or data corruption.
+
+Options for implementing a multithread-safe searcher with instance variables:
+
+1. Use immutable objects: they never change after they are constructed; no modifications to their state occur after the Searcher constructor returns.
+2. Use a single instance of a thread-safe class.
+3. Create a single instance and synchronize access to it across all threads (but this will severely limit your scalability).
+4. Arrange for each thread to have its own instance, e.g. with a `ThreadLocal`.
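+
+Options 2 and 4 can be sketched in plain Java as follows. This is an illustrative stand-in, not the Vespa `Searcher` API: `CountingComponent` and its `handle` method are hypothetical names, and only JDK classes are used.
+
+```java
+import java.util.concurrent.atomic.AtomicLong;
+
+// Plain-Java stand-in for a component whose method is called by many threads.
+final class CountingComponent {
+    // Option 2: a single shared instance of a thread-safe class.
+    private final AtomicLong handled = new AtomicLong();
+
+    // Option 4: one instance per thread of a class that is not thread-safe.
+    private final ThreadLocal<StringBuilder> buffer =
+            ThreadLocal.withInitial(StringBuilder::new);
+
+    String handle(String query) {
+        long n = handled.incrementAndGet(); // atomic, no lost updates
+        StringBuilder sb = buffer.get();    // private to the calling thread
+        sb.setLength(0);                    // reset the reused buffer
+        return sb.append(n).append(':').append(query).toString();
+    }
+
+    long handled() { return handled.get(); }
+}
+```
+
+The counter stays correct under concurrent calls because every increment is atomic, and the buffer needs no locking because no two threads ever see the same `StringBuilder`.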
+
+## Multiphase searching
+
+The model of a single pass fetching results from a Query described in this document is sometimes too simplistic to produce good performance. The search container supports *multiphase searching* to address such cases. With multiphase searching, the hits of the result are first filled with some minimal information. This minimally filled result is sent up the search chain, where some of the hits are hopefully removed. When more information is needed, a second fill request is sent down the search chain to fetch more data for just those hits remaining in the result. This can happen in repeated stages, working on progressively smaller sets of hits containing progressively more expensive information.
+
+The container supports this by offering `fill` methods on execution, which may be called to request more information added to the hits of the result from a searcher. In addition, the backends and backend providers must support multiphase searching (this is currently only the case for internal Vespa clusters).
+
+All searchers should assume they operate in a multiphase setup, meaning:
+
+- Searchers which change the query or contain workflows do not need to do anything.
+- Searchers which access field information (not just id and relevance) from hits should **always** call either `fill()` to get the default set of fields for each hit type or `fill(summaryClassName)` to get a particular collection of fields known to exist in the backend(s) in question. Calling fill on a result which contains already-filled hits is cheap.
+- Federating searchers should implement both the regular `search` method and the `fill` method. The fill method must request filling down the source branches which have remaining hits in the result.
+- Backend searchers, which wish to support multiphase searching, should initially deliver unfilled hits and implement a `fill` method which fills the hits in the given result belonging to that backend with information from the backend.
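+
+The idea behind the list above can be illustrated with a small plain-Java simulation. It is a sketch under stated assumptions, not Vespa code: `SketchHit`, `TwoPhaseSketch` and the `title` field are hypothetical, and the "backend" is just a loop.
+
+```java
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+// Hypothetical hit: id and relevance are always present, fields only after fill.
+final class SketchHit {
+    final String id;
+    final double relevance;
+    final Map<String, String> fields = new HashMap<>();
+
+    SketchHit(String id, double relevance) {
+        this.id = id;
+        this.relevance = relevance;
+    }
+
+    boolean isFilled() { return !fields.isEmpty(); }
+}
+
+// Hypothetical backend supporting two phases.
+final class TwoPhaseSketch {
+    int fillCalls = 0; // counts simulated expensive fetches
+
+    // Phase 1: return cheap, unfilled hits ordered by descending relevance.
+    List<SketchHit> search() {
+        List<SketchHit> hits = new ArrayList<>();
+        for (int i = 0; i < 100; i++) hits.add(new SketchHit("doc" + i, 1.0 / (i + 1)));
+        return hits;
+    }
+
+    // Phase 2: fetch fields only for hits that are not filled yet.
+    void fill(List<SketchHit> hits) {
+        for (SketchHit hit : hits) {
+            if (hit.isFilled()) continue; // filling again is cheap
+            hit.fields.put("title", "Title of " + hit.id);
+            fillCalls++;
+        }
+    }
+}
+```
+
+Keeping only the ten best hits before calling `fill` means ten expensive fetches instead of one hundred, and a second `fill` on the same hits fetches nothing.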
+
+
+**Note:**
+
+[vespa-match-features](https://vinted.engineering/2025/11/06/vespa-match-features/) is a good article on multiphase searching, result fill and match-features.
+
+
+## Error handling
+
+If your searcher encounters a problem and wants to signal an error, set an error hit in the result object by calling `result.hits().addError(errorMsg)`.
+
+See the FAQ for [timeouts](/en/learn/faq#how-is-the-query-timeout-computed).
+
+## Timeouts
+
+How can a Searcher handle a timeout gracefully? A call to `Result result = execution.search(query)` can end in a timeout, which looks like this when the result is printed:
+
+```txt
+Container.com...vespa.Searcher result: Result: Source 'top-chain': 12: Timed out: Error in execution of chain 'top-chain': Chain timed out.
+```
+
+This can happen with a tree of chains (see [federation](/en/querying/federation#timeout-behavior)), where the main chain calls one chain per source and one of the source chains times out, i.e. does not return a Result within its deadline.
+
+It is not generally possible to prevent this from ever happening, but searchers can check `query.getTimeLeft()` before doing time-consuming work, and pass `query.getTimeLeft() - a_little` as the timeout to processes they initiate (such as network calls) that accept a deadline themselves.
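+
+A minimal plain-Java sketch of this pattern follows. It uses only JDK classes; `DeadlineSketch`, `budgetMillis` and `callWithin` are hypothetical helpers, and the time left is passed in as a number (in a searcher it would come from `query.getTimeLeft()`).
+
+```java
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.function.Supplier;
+
+final class DeadlineSketch {
+    // Budget for a sub-task: the time left minus a safety margin, never negative.
+    static long budgetMillis(long timeLeftMillis, long marginMillis) {
+        return Math.max(0, timeLeftMillis - marginMillis);
+    }
+
+    // Runs the task with the budget as its deadline and returns the fallback
+    // if the task is too slow or fails, instead of exhausting the whole budget.
+    static <T> T callWithin(long budgetMillis, Supplier<T> task, T fallback) {
+        if (budgetMillis == 0) return fallback; // not worth starting
+        CompletableFuture<T> future = CompletableFuture.supplyAsync(task);
+        try {
+            return future.get(budgetMillis, TimeUnit.MILLISECONDS);
+        } catch (TimeoutException e) {
+            future.cancel(true);
+            return fallback;
+        } catch (InterruptedException e) {
+            Thread.currentThread().interrupt(); // preserve the interrupt flag
+            return fallback;
+        } catch (ExecutionException e) {
+            return fallback;
+        }
+    }
+}
+```
+
+How large the margin should be depends on how much work remains after the call returns; the sketch leaves that choice to the caller.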
+
+## WordItem
+
+In a Searcher, one often will use [WordItem](https://javadoc.io/doc/com.yahoo.vespa/container-search/latest/com/yahoo/prelude/query/WordItem) to modify the current query, or create a new query based on input query terms, or results from the current query. To keep linguistic settings (e.g. stemming) from the parent query, set `isFromQuery` to true - [example](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/src/main/java/ai/vespa/cloud/docsearch/DocumentationSearcher.java).
diff --git a/mintlify-docs/en/applications/testing.mdx b/mintlify-docs/en/applications/testing.mdx
new file mode 100644
index 0000000000..dd6274e91b
--- /dev/null
+++ b/mintlify-docs/en/applications/testing.mdx
@@ -0,0 +1,154 @@
+---
+title: "System testing"
+description: "A system test suite is an invaluable tool both when developing and maintaining a complex Vespa application. These are functional tests which are run against a deployment of the application package under verification, and which use its HTTP APIs to execute feed and query operations that are compared to expected outcomes. Vespa provides two formalizations of this:"
+---
+
+- [Basic HTTP tests](/en/reference/applications/testing), expressing requests and expected responses as JSON, and run with the [Vespa CLI](/en/clients/vespa-cli).
+- [Java JUnit tests](/en/reference/applications/testing-java), for more advanced tests, run as regular Java tests, with some extra configuration.
+
+These two frameworks also include an upgrade (or staging) test construct for scenarios where the application is upgraded and state in the backend depends on the old application configuration, as well as a production verification test, which is basically a health check for production deployments. For system and staging tests, the frameworks provide an easy way to perform HTTP requests against a designated test deployment, separating the tests from the deployment and configuration of the test clusters.
+
+This document describes how each of these test categories can be run as part of an imagined CI/CD system for safely deploying changes to a Vespa application in a continuous manner.
+
+Finally, find a section on [A/B-testing / bucket tests](/en/applications/testing#feature-switches-and-bucket-tests) using feature switches.
+
+## System tests
+
+System tests are just functional tests that verify a deployed Vespa application behaves as expected when fed and queried. Running a system test is as simple as making a separate deployment with the application package to test, and then running the system test suite, or one or a few of those tests.
+
+
+**Note:**
+
+Each system test should be self-contained, i.e., it should be possible to run each test in isolation, or all tests in any order. To achieve this, **system tests should generally start by clearing all documents from the cluster to test.** This is the case with our sample system tests, so take care not to run them against a production cluster.
+
+
+For the most part, system tests must be updated due to changes in the application package. Rarely, an upgrade of the Vespa version may also lead to changed functionality, but within major versions, this should only be new features and bug fixes. In any case, it is a good idea to always run system tests against a dedicated test deployment before deploying a change to production, whether the change is an upgrade of the Vespa platform or of the application package.
+
+### Running system tests
+
+The [Vespa CLI](/en/clients/vespa-cli) makes it easy to set up a test deployment, and run system and staging tests. To run a system test, first set up a test deployment:
+
+
+
+```sh
+$ vespa deploy --wait 600
+```
+
+Run the basic HTTP tests (prefer using this test suite for regular tests) - also see the [example](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/CI-CD/production-deployment-with-tests) application:
+
+
+
+```sh
+$ vespa test tests/system-test/feed-and-search-test.json
+```
+
+Example Java API tests (use for complex test cases) - also see the [example](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/CI-CD/production-deployment-with-tests-java) application:
+
+
+
+```sh
+$ mvn test -D test.categories=system -D vespa.test.config=/path-to/test-config.json
+```
+
+The test config file used by the test runner in the maven-plugin defines the endpoints for each of the clusters in [services.xml](/en/reference/applications/services/services) as fields under a `localEndpoints` JSON object:
+
+
+
+```json
+{
+ "localEndpoints": {
+ "query-service": "http://localhost:8080/",
+ "feed-service" : "http://localhost:8081/"
+ }
+}
+```
+
+`feed-service` is the endpoint of the container cluster with `<document-api>` in [services.xml](/en/reference/applications/services/services). `query-service` is the endpoint of the container cluster with `<search>` in [services.xml](/en/reference/applications/services/services).
+
+## Staging tests
+
+The goal of staging (upgrade) tests is *not* to ensure the new deployment satisfies its functional specifications, as that should be covered by system tests; rather, it is to ensure the upgrade of the application package and/or Vespa platform does not break the application, and is compatible with the behavior expected by existing clients.
+
+As an example, consider a change in how documents are indexed, e.g., adding a new document processor. A system test would verify this new behavior by feeding a document, and then verifying that the document processor modified the document, or perhaps did something else. A staging test, on the other hand, would feed the document *before* the document processor was added, and querying for the document after the upgrade could give different results from what the system test would expect.
+
+Many such changes, which require additional action post-deployment, are also guarded by [validation overrides](/en/reference/applications/validation-overrides), but the staging test is then a great way of figuring out what the exact consequences of the change are, and how to deal with it.
+
+As opposed to system tests, staging tests are not self-contained, as the state change during upgrade is precisely what is tested. Instead, execution order of any staging tests that modify state, particularly after upgrade, must be controlled. Indeed, some changes will require re-feeding data, and this should then be part of the *staging test* code. Finally, it is also good to verify the expected state prior to upgrade.
+
+The clients of a Vespa application should be compatible with both the system and staging test expectations, and this dictates the workflow when deploying a breaking change - steps:
+
+
+1. The application code and system and staging tests are updated, so tests pass; and clients are updated to reflect the updated test code.
+2. The application is upgraded.
+3. The *staging setup* code is updated to match the new application code.
+
+Again, it is a good idea to always run staging tests before deploying any change, be it a change in the application package or an upgrade of the Vespa platform.
+
+
+
+
+### Running staging tests
+
+See [system tests](#system-tests) above for links to example applications. Steps:
+
+
+1. A dedicated deployment is made with the *current* setup (package and Vespa version).
+2. *staging setup* code is run to put the test cluster in a particular state, typically one that mimics the state in production clusters.
+3. The deployment is then upgraded to the *new* setup (package and/or Vespa version).
+4. *staging test* code is run to verify the cluster behaves as expected post-upgrade.
+
+ Example using JSON-tests:
+
+ ```sh
+ # load old application code, deploy it, run setup
+ $ vespa deploy --wait 600
+ $ vespa test tests/staging-setup
+
+ # make changes to the application, deploy it, run tests
+ $ vespa deploy --wait 120
+ $ vespa test tests/staging-test
+ ```
+
+ Example using Java tests (see [system tests](#running-system-tests) for *test-config.json*):
+
+ ```sh
+ # load old application code, deploy it, run setup
+ $ vespa deploy --wait 600
+ $ mvn test -D test.categories=staging-setup -D vespa.test.config=/path-to/test-config.json
+
+ # make changes to the application, deploy it, run tests
+ $ vespa deploy --wait 120
+ $ mvn test -D test.categories=staging -D vespa.test.config=/path-to/test-config.json
+ ```
+
+
+
+
+## Feature switches and bucket tests
+
+With continuous deployment, it is not practical to hold off releasing a feature until it is done, test it manually until convinced it works and then release it to production. What to do instead? The answer is *feature switches*: release new features to production as they are developed, but include logic which keeps them deactivated until they are ready, or until they have been verified in production with a subset of users.
+
+*Bucket testing* is the practice of systematically testing new features or behavior for a controlled subset of users. This is common practice when releasing new science models, as they are difficult to verify in test, but can also be used for other features.
+
+To test new behavior in Vespa, use a combination of [search chains](/en/applications/chaining) and [rank profiles](/en/reference/schemas/schemas#rank-profile), controlled by [query profiles](/en/querying/query-profiles), where one query profile corresponds to one bucket. These features support inheritance to make it easy to express variation without repetition.
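+
+How requests are assigned to buckets is up to the application. One common approach, sketched here in plain Java, is to hash a stable user id so that each user always lands in the same bucket. `BucketSketch` and the query profile names `experiment` and `default` are hypothetical, chosen only for this illustration.
+
+```java
+final class BucketSketch {
+    // Maps a user id to a stable bucket in the range [0, buckets).
+    static int bucketOf(String userId, int buckets) {
+        return Math.floorMod(userId.hashCode(), buckets);
+    }
+
+    // Picks a query profile name: bucket 0 gets the experiment, all others the default.
+    static String queryProfileFor(String userId, int buckets) {
+        return bucketOf(userId, buckets) == 0 ? "experiment" : "default";
+    }
+}
+```
+
+With ten buckets, roughly one tenth of the users would be routed to the experimental profile, assuming the ids hash evenly.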
+
+Sometimes a new feature requires [incompatible changes to a data field](/en/reference/schemas/schemas#modifying-schemas). To deploy such changes continuously, it is necessary to create a new field containing the new version of the data. This costs extra resources, but less than the alternative: standing up a new system copy with the new data. New fields can be added and populated while the system is live.
+
+The need for incompatible changes can be reduced by making the semantics of the fields more precise. E.g., if a field is defined as the "quality" of a document, where a higher number means higher quality, a new algorithm which produces a different range and distribution will typically be an incompatible change. However, if the field is defined more precisely as the average time spent on the document once it is clicked, then a new algorithm which produces better estimates of this value will not be an incompatible change. Using precise semantics also has the advantage of making it easier to understand whether the use of the data and its statistical properties are reasonable.
diff --git a/mintlify-docs/en/applications/unit-testing.mdx b/mintlify-docs/en/applications/unit-testing.mdx
new file mode 100644
index 0000000000..c01abb6752
--- /dev/null
+++ b/mintlify-docs/en/applications/unit-testing.mdx
@@ -0,0 +1,151 @@
+---
+title: "Unit testing"
+description: "This document describes how to test application functionality in a local JVM. See [automated deployments](/en/operations/automated-deployments) for how to create system, staging and production verification tests."
+---
+
+## Unit testing using Application
+
+The [Application](https://javadoc.io/doc/com.yahoo.vespa/application/latest/com/yahoo/application/Application.html) class is useful when writing unit tests. Application uses the application package configuration and sets up a container instance for testing. The test accesses the [JDisc](https://javadoc.io/doc/com.yahoo.vespa/application/latest/com/yahoo/application/container/JDisc.html) class through [app.getJDisc(clusterName)](https://javadoc.io/page/com.yahoo.vespa/application/latest/com/yahoo/application/Application#getJDisc-java.lang.String-); this class has methods for using all common [component types](/en/reference/applications/components).
+
+Refer to [MetalSearcherTest.java](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation-java/app/src/test/java/ai/vespa/example/album/MetalSearcherTest.java) for example use. Notice how the test disables the network layer in order to run tests in parallel.
+
+
+**Note:**
+
+`Application` does not set up *content* nodes, only *container*. It is hence fully stateless, and intended for unit testing the functionality of application components. The *ClusterSearcher* will not find any content nodes and log errors if invoked. Write a *System Test* to test end-to-end features like search.
+
+
+For prototyping, enable the network interface, instantiate the Container and run requests using a browser:
+
+```java
+import java.nio.file.FileSystems;
+
+import com.yahoo.application.Networking;
+
+public class ApplicationMain {
+ public static void main(String[] args) throws Exception {
+ try (com.yahoo.application.Application app = com.yahoo.application.Application.fromApplicationPackage(
+ FileSystems.getDefault().getPath("src/main/application"),
+ Networking.enable)) {
+ app.getClass();
+ Thread.sleep(Long.MAX_VALUE);
+ }
+ }
+}
+```
+
+## Unit Testing Configurable Components
+
+This section describes how to programmatically build configuration instances for unit testing. Read the [Developer Guide](/en/applications/developer-guide) first.
+
+To write self-contained unit tests using configuration classes generated from a schema, the configuration must be instantiated without relying on, for instance, an external services file. Configuration classes contain their own builders, which solve exactly this problem. By using builders, the configuration is created as an immutable, type-safe object, exactly the same as the one used during deployment.
+
+### Configuration schema
+
+Assume the config definition file `demo.def` with the following schema:
+
+```txt
+package=com.mydomain.demo
+
+toplevel[].term string
+toplevel[].number int
+toplevel[].largenumber long
+toplevel[].secondlevel[].name string
+toplevel[].secondlevel[].magnitude double
+
+simplename string
+simplenumber int
+simplevaluearray[] string
+
+coordinate.x double
+coordinate.y double
+coordinate.name string
+```
+
+In other words, the configuration class will be `com.mydomain.demo.DemoConfig`, and it will contain an array of structures, a couple of top-level primitives (*simplename* and *simplenumber*), an array of primitive values (*simplevaluearray*) and a structure (*coordinate*).
+
+### Using configuration builders
+
+All structured objects in the cloud configuration system have their own Builder as a nested class. So, in the above example, one would get `DemoConfig.Builder` for the complete configuration class, `DemoConfig.Toplevel.Builder` for the top-level array, `DemoConfig.Toplevel.Secondlevel.Builder` for the inner array, and `DemoConfig.Coordinate.Builder` for the structure.
+
+A configuration object, or substructure, is most easily instantiated using a constructor accepting the corresponding *Builder* class. An array of structures should use the constructor accepting an array of *Builder* instances, and an array of primitive values simply accepts a java.util.Collection of the corresponding primitive value class:
+
+```java expandable
+package com.mydomain.demo;
+import static org.junit.Assert.*;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import org.junit.Test;
+import static com.mydomain.demo.DemoConfig.Toplevel;
+import static com.mydomain.demo.DemoConfig.Toplevel.Secondlevel;
+import static com.mydomain.demo.DemoConfig.Coordinate;
+public class DemoTest {
+ /**
+ * An example showing how to build a relatively complex, mixed type
+ * configuration including arrays of primitive elements, nested arrays and
+ * arrays of structures and so on.
+ */
+ @Test
+ public final void test() {
+ // We need to use builders to safely create the graph of immutable
+ // configuration objects. Each generated configuration class contains
+ // the builder for creating an instance of itself. This pattern is
+ // repeated for structures. So, in our case, we have four structured
+ // levels. The complete configuration class, DemoConfig, the top-level,
+ // nested array, Toplevel, the contained array, Secondlevel and the
+ // structure Coordinate. This leaves us with using four distinct
+ // builder classes: DemoConfig.Builder, Toplevel.Builder,
+ // Secondlevel.Builder and Coordinate.Builder.
+ // Chained setters are the most used pattern for the builders:
+ DemoConfig forTesting = new DemoConfig(new DemoConfig.Builder()
+ .simplename("basic chained setter for the string simplename")
+ .simplenumber(42)
+ .toplevel(buildTopLevelArray())
+ .simplevaluearray(
+ Arrays.asList(new String[] { "primitive", "arrays",
+ "are", "easier", "to", "build", "than",
+ "arrays", "of", "structures" }))
+ .coordinate(
+ new Coordinate.Builder()
+ .name("have no idea what to call this one")
+ .x(1e300d).y(1e-300d)));
+ assertTrue(forTesting != null); // ;)
+ }
+ /**
+ * It is often the more readable solution to use helper methods to build
+ * configuration arrays.
+ *
+ * @return a list of Toplevel.Builder instances
+ */
+ private List<Toplevel.Builder> buildTopLevelArray() {
+ // Note how the Builder classes tend to work on Collection classes and
+ // mutable objects, while the config ready for use is bolted down and
+ // immutable:
+ List<Toplevel.Builder> configArray = new ArrayList<Toplevel.Builder>(3);
+ String[] configStrings = new String[] { "a", "b", "c" };
+ int[] configNumbers = new int[] { 1, 2, 3 };
+ long[] configLargeNumbers = new long[] { 1L + (long) Integer.MAX_VALUE,
+ 2L + (long) Integer.MAX_VALUE, 3L + (long) Integer.MAX_VALUE };
+ for (int i = 0; i < configStrings.length; ++i) {
+ configArray.add(new Toplevel.Builder().number(configNumbers[i])
+ .largenumber(configLargeNumbers[i]).term(configStrings[i])
+ .secondlevel(buildSecondLevelArray(2)));
+ }
+ return configArray;
+ }
+ /**
+ * Once again, the building of an array is delegated to a helper method
+ *
+ * @param subelements
+ * the length of the returned list
+ * @return a list of Secondlevel.Builder
+ */
+ private List<Secondlevel.Builder> buildSecondLevelArray(int subelements) {
+ List<Secondlevel.Builder> builders = new ArrayList<Secondlevel.Builder>(
+ subelements);
+ for (int i = 0; i < subelements; ++i) {
+ builders.add(new Secondlevel.Builder().name(String.valueOf(i))
+ .magnitude((double) i));
+ }
+ return builders;
+ }
+}
+```
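+
+To make the nested-builder pattern concrete, here is a hand-written and heavily simplified sketch of the shape such classes take. It is not the code the config generator emits: `SketchConfig` and everything inside it are hypothetical stand-ins that mirror only *simplename*, *simplevaluearray* and *coordinate* from the example definition.
+
+```java
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+
+/** Hypothetical stand-in for a generated config class; not actual generated code. */
+public final class SketchConfig {
+
+    private final String simplename;
+    private final List<String> simplevaluearray;
+    private final Coordinate coordinate;
+
+    /** The immutable config object is created from a mutable builder. */
+    public SketchConfig(Builder builder) {
+        this.simplename = builder.simplename;
+        this.simplevaluearray = List.copyOf(builder.simplevaluearray);
+        this.coordinate = new Coordinate(builder.coordinate);
+    }
+
+    public String simplename() { return simplename; }
+    public List<String> simplevaluearray() { return simplevaluearray; }
+    public Coordinate coordinate() { return coordinate; }
+
+    /** Mutable builder with chained setters. */
+    public static final class Builder {
+        private String simplename = "";
+        private final List<String> simplevaluearray = new ArrayList<>();
+        private Coordinate.Builder coordinate = new Coordinate.Builder();
+
+        public Builder simplename(String value) { simplename = value; return this; }
+        public Builder simplevaluearray(Collection<String> values) { simplevaluearray.addAll(values); return this; }
+        public Builder coordinate(Coordinate.Builder value) { coordinate = value; return this; }
+    }
+
+    /** A nested structure that carries its own nested Builder. */
+    public static final class Coordinate {
+        private final double x;
+        private final double y;
+
+        public Coordinate(Builder builder) { x = builder.x; y = builder.y; }
+
+        public double x() { return x; }
+        public double y() { return y; }
+
+        public static final class Builder {
+            private double x;
+            private double y;
+
+            public Builder x(double value) { x = value; return this; }
+            public Builder y(double value) { y = value; return this; }
+        }
+    }
+}
+```
+
+The builders hold mutable state and collections, while the finished object copies that state into final fields, which is why the resulting configuration can be shared safely between threads.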
diff --git a/mintlify-docs/en/applications/using-zookeeper.mdx b/mintlify-docs/en/applications/using-zookeeper.mdx
new file mode 100644
index 0000000000..e5f4b910ac
--- /dev/null
+++ b/mintlify-docs/en/applications/using-zookeeper.mdx
@@ -0,0 +1,59 @@
+---
+title: "Using ZooKeeper"
+description: "The Vespa container supports [ZooKeeper](https://zookeeper.apache.org/), which allows distributed synchronization across nodes in a container cluster."
+---
+
+Once enabled, all nodes in a container cluster will automatically form a ZooKeeper ensemble and participate as servers. Vespa takes care of reconfiguring ZooKeeper members when nodes are added to or removed from the container cluster.
+
+
+**Note:**
+
+Vespa enforces an optimal node limit for clusters with ZooKeeper. Application packages that violate this node count will be rejected. The valid number of nodes is 3, 5 or 7. See [#15762](https://github.com/vespa-engine/vespa/issues/15762) for other node counts.
+
+
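+The preference for odd cluster sizes follows from majority quorums. As a small illustration, assuming the standard ZooKeeper rule that an ensemble stays available while a strict majority of its servers is up, this sketch computes how many server failures each size tolerates. The class and method names are made up for the example.
+
+```java
+public class QuorumSketch {
+
+    /** Smallest strict majority of an ensemble of the given size. */
+    public static int quorumSize(int ensembleSize) {
+        return ensembleSize / 2 + 1;
+    }
+
+    /** Number of servers that can fail while a quorum still remains. */
+    public static int toleratedFailures(int ensembleSize) {
+        return ensembleSize - quorumSize(ensembleSize);
+    }
+
+    public static void main(String[] args) {
+        for (int n = 3; n <= 7; n++) {
+            System.out.println(n + " nodes: quorum " + quorumSize(n)
+                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
+        }
+    }
+}
+```
+
+An even size adds a server without adding fault tolerance: 4 nodes tolerate one failure, the same as 3, and 6 nodes tolerate two, the same as 5.
+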
+## Configuration
+
+
+
+ ZooKeeper must be explicitly enabled in the [container cluster configuration](/en/reference/applications/services/container#zookeeper).
+
+
+
+ The application must specify a dependency on `zkfacade`. Example for `pom.xml`:
+
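+ ```xml
+ <dependency>
+   <groupId>com.yahoo.vespa</groupId>
+   <artifactId>zkfacade</artifactId>
+   <version>[vespa-version]</version>
+   <scope>provided</scope>
+ </dependency>
+ ```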
+
+
+
+
+## Code example
+
+ZooKeeper features are exposed through [VespaCurator](https://github.com/vespa-engine/vespa/blob/master/zkfacade/src/main/java/com/yahoo/vespa/curator/api/VespaCurator.java). [Inject](/en/applications/dependency-injection) `VespaCurator` to use it. [Handler](/en/applications/request-handlers) example:
+
+```java
+public class MyRequestHandler extends ThreadedHttpRequestHandler {
+ private final VespaCurator curator;
+ @Inject
+ public MyRequestHandler(Executor executor, VespaCurator curator) {
+ super(executor);
+ this.curator = curator;
+ }
+ @Override
+ public HttpResponse handle(HttpRequest httpRequest) {
+ Path lockPath = Path.fromString("/locks/mylock");
+ Duration timeout = Duration.ofSeconds(1);
+ try (var lock = curator.lock(lockPath, timeout)) {
+ // Do something while holding lock
+ } catch (Exception e) {
+ throw new RuntimeException("Failed to acquire lock " + lockPath, e);
+ }
+ // Build and return an HttpResponse here (omitted from this example)
+}
+}
+```
diff --git a/mintlify-docs/en/applications/vespaignore.mdx b/mintlify-docs/en/applications/vespaignore.mdx
new file mode 100644
index 0000000000..20108f8dc7
--- /dev/null
+++ b/mintlify-docs/en/applications/vespaignore.mdx
@@ -0,0 +1,42 @@
+---
+title: ".vespaignore"
+sidebarTitle: ".vespaignore file"
+description: "When deploying an [application package](/en/reference/applications/application-packages) with [Vespa CLI](/en/clients/vespa-cli), a `.vespaignore` file (similar to `.gitignore`) can be added to the package to prevent specific files or path patterns from being included in the deployed package."
+---
+
+
+Ignoring files is useful when the Vespa application directory contains files that are only used for development purposes, and are not directly referenced by the application.
+
+## Location
+
+The `.vespaignore` file must be placed at the same level as [services.xml](/en/reference/applications/services/services). Having multiple `.vespaignore` at different path levels is not supported.
+
+## Example
+
+This is an example of a `.vespaignore` file that excludes files and directories rarely needed in an application package.
+
+```txt
+# exclude hidden files and readme
+.DS_Store
+.gitignore
+README.md
+
+# exclude feed input
+ext/
+
+# exclude auxiliary scripts
+*.py
+*.sh
+```
+
+## Format
+
+The `.vespaignore` format is a subset of the `.gitignore` format, where:
+
+- Lines starting with `#` are ignored and can be used for comments
+- Each non-empty line specifies a path pattern to ignore
+- Patterns are relative to `services.xml`
+- A pattern can be either a literal string, or a pattern string as consumed by [filepath.Match](https://pkg.go.dev/path/filepath#Match)
+- Lines ending with `/` always denote a directory, e.g. the pattern `foo/` will match the directory `foo` (and any files below), but not the file `foo`
+
+Complex rules, such as negated patterns and recursive globbing (`**`) are not supported.
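+
+The rules above can be approximated in a few lines of code. The following is an illustrative sketch only, not how Vespa CLI implements `.vespaignore`: the class `VespaIgnoreSketch` is hypothetical, and it uses Java's glob matcher as a stand-in for `filepath.Match`. The two differ in some syntax (Java globs also understand `{a,b}` groups and `**`), so treat the results as indicative for simple patterns only. Paths are given relative to the directory holding `services.xml`, with `/` as separator.
+
+```java
+import java.nio.file.FileSystems;
+import java.nio.file.Path;
+import java.nio.file.PathMatcher;
+import java.util.ArrayList;
+import java.util.List;
+
+/** Hypothetical approximation of the .vespaignore rules; not Vespa CLI code. */
+public class VespaIgnoreSketch {
+
+    private final List<String> patterns = new ArrayList<>();
+
+    /** Takes the lines of a .vespaignore file, skipping comments and empty lines. */
+    public VespaIgnoreSketch(List<String> lines) {
+        for (String line : lines) {
+            String trimmed = line.trim();
+            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue;
+            patterns.add(trimmed);
+        }
+    }
+
+    /** Returns true if the given relative path matches any pattern. */
+    public boolean isIgnored(String relativePath, boolean isDirectory) {
+        for (String pattern : patterns) {
+            if (pattern.endsWith("/")) {
+                // Directory pattern: matches the directory itself and anything below it
+                String dir = pattern.substring(0, pattern.length() - 1);
+                if (isDirectory && relativePath.equals(dir)) return true;
+                if (relativePath.startsWith(dir + "/")) return true;
+            } else {
+                // Literal or wildcard pattern, matched against the whole relative path
+                PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
+                if (matcher.matches(Path.of(relativePath))) return true;
+            }
+        }
+        return false;
+    }
+}
+```
+
+With the patterns `ext/` and `*.py`, this sketch ignores `ext/feed.json` and `feed.py`, but keeps `services.xml` and a plain file named `ext`.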
diff --git a/mintlify-docs/en/applications/web-services.mdx b/mintlify-docs/en/applications/web-services.mdx
new file mode 100644
index 0000000000..174dcf6fa9
--- /dev/null
+++ b/mintlify-docs/en/applications/web-services.mdx
@@ -0,0 +1,130 @@
+---
+title: "Developing Web Service Applications"
+sidebarTitle: "Developing Web Services"
+description: "This document explains how to develop (REST) web service type applications on the container - design options, accessing the request path, returning a status code etc. There are two types of web service APIs:"
+---
+
+- Fine-grained APIs with closed semantics – for example *return the number of stars of an article*
+- Coarse-grained APIs with open semantics – for example *return a page containing the most relevant mixture of stuff for this user and action*
+
+Coarse-grained APIs are typically complex to implement. The container helps by providing a way to compose and federate the components that contribute to processing the request and that provide and modify the returned data, and a way to let responses start returning before they are complete, which reduces latency for large responses. This is the [processing](/en/applications/processing) framework (or, for search-like applications, the [searcher](/en/applications/searchers) specialization).
+
+In addition, the [container](/en/reference/applications/components#component-types) features a generic mechanism allowing a [request handler](/en/applications/request-handlers) to be [bound](/en/reference/applications/components#binding) to a URI pattern and invoked to handle all requests matching that pattern. This is useful where there is no need to handle complexity and/or federation of various kinds of data in the response. Both the approaches above are actually implemented as built-in request handlers.
+
+A custom request handler may be written to parse the URL path and method and dispatch to an appropriate chain of processing components. A "main" processing chain may be written to do the same by dispatching to other chains. The simplest way to invoke a specific chain of processors is to forward a query to the `ProcessingHandler` with the request property `chain` set to the name of the chain to invoke:
+
+```java
+import com.yahoo.component.annotation.Inject;
+public class DemoHandler extends com.yahoo.container.jdisc.ThreadedHttpRequestHandler {
+ ...
+ @Inject
+ public DemoHandler(Executor executor, ProcessingHandler processingHandler) {
+ super(executor);
+ this.processingHandler = processingHandler;
+ }
+ ...
+ @Override
+ public HttpResponse handle(HttpRequest request) {
+ HttpRequest processingRequest = new HttpRequest.Builder(request)
+ .put(com.yahoo.processing.Request.CHAIN, "theProcessingChainIWant")
+ .createDirectRequest();
+ HttpResponse r = processingHandler.handle(processingRequest);
+ return r;
+ }
+ ...
+}
+```
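+
+The dispatch decision itself can be kept separate from the handler. As a minimal sketch, assuming a made-up URL layout and made-up chain names (none of them come from Vespa), a helper like the one below could map the HTTP method and path to the chain name that is then set as the `chain` request property:
+
+```java
+import java.util.Optional;
+
+/** Hypothetical helper; the paths and chain names are examples only. */
+public class ChainDispatchSketch {
+
+    /** Returns the name of the processing chain for a request, if any. */
+    public static Optional<String> chainFor(String method, String path) {
+        // Turn "/articles/42/stars" into ["articles", "42", "stars"]
+        String[] segments = path.replaceAll("^/+|/+$", "").split("/");
+        if (segments.length == 0 || segments[0].isEmpty()) return Optional.empty();
+
+        switch (segments[0]) {
+            case "articles":
+                return Optional.of("GET".equals(method) ? "articleReadChain" : "articleWriteChain");
+            case "recommendations":
+                return Optional.of("recommendationChain");
+            default:
+                return Optional.empty();
+        }
+    }
+}
+```
+
+Returning an empty `Optional` for unknown paths lets the handler decide how to respond, for example with a 404 status.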
+
+## Accessing the HTTP request
+
+Custom [request handlers](/en/applications/request-handlers) are given a [com.yahoo.container.jdisc.HttpRequest](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/container/jdisc/HttpRequest), with direct access to associated properties and request data.
+
+In [Processing](/en/applications/processing), the Processors are given a [com.yahoo.processing.Request](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/processing/Request) containing the HTTP URL parameters:
+
+```java
+// url parameters are added to properties
+String urlParameter = request.properties().get("urlParameterName");
+
+// jdisc request context is added with prefix context
+Object contextValue = request.properties().get("context.contextKey");
+```
+
+If needed, a Processor can retrieve the entire HTTP request via a utility function:
+
+```java
+import com.yahoo.container.jdisc.HttpRequest;
+...
+
+// Retrieve the underlying HTTP request:
+Optional<HttpRequest> httpRequest = HttpRequest.getHttpRequest(request);
+
+if (httpRequest.isPresent()) {
+ // The POST data input stream:
+ InputStream in = httpRequest.get().getData();
+ // The HTTP method:
+ Method method = httpRequest.get().getMethod();
+}
+```
+
+### Setting the HTTP status and HTTP headers
+
+In Processing, the return status can be set by adding a special Data item to the Response:
+
+```java
+response.data().add(new com.yahoo.processing.handler.ResponseStatus(404, request));
+```
+
+If no such data element is present, the status will be determined by the container. If it contains data able to render, it will be 200, otherwise it will be determined by any ErrorMessage present in the response.
+
+### Setting response headers from Processors
+
+Response headers may be added to any Response by adding instances of `com.yahoo.processing.handler.ResponseHeaders` to the Response (ResponseHeaders is a kind of response Data). Multiple instances may be added to the Response, and the complete set of headers returned is the union of all such objects. Example Processor:
+
+```java
+processingResponse.data().add(new com.yahoo.processing.handler.ResponseHeaders(myHeaders, request));
+```
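+
+To illustrate what combining several header sets means, the sketch below merges plain maps. It is an assumption-laden stand-in: `HeaderMergeSketch` is hypothetical, plain `Map` objects take the place of `ResponseHeaders` instances, and keeping every value for a repeated header name is a choice made for the example rather than documented container behavior.
+
+```java
+import java.util.ArrayList;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+
+/** Hypothetical illustration of combining several sets of response headers. */
+public class HeaderMergeSketch {
+
+    /** Combines the header maps; values for the same header name are all kept, in order. */
+    public static Map<String, List<String>> merge(List<Map<String, List<String>>> headerSets) {
+        Map<String, List<String>> merged = new LinkedHashMap<>();
+        for (Map<String, List<String>> headers : headerSets) {
+            for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
+                merged.computeIfAbsent(entry.getKey(), name -> new ArrayList<>())
+                      .addAll(entry.getValue());
+            }
+        }
+        return merged;
+    }
+}
+```
+
+In this sketch, each Processor in a chain can contribute its own headers without knowing what the others have added.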
+
+Request handlers may in general set their return status and manipulate headers directly on the HttpResponse.
+
+## Queries
+
+Sometimes all that is needed is to let the standard query framework reply on more paths than the default. This is possible by adding extra [binding](/en/reference/applications/services/search#binding)s inside the `<search>` element in `services.xml`. Writing a custom [request handler](/en/applications/request-handlers) is recommended if the application is a standalone HTTP API, and especially if there are properties used with the same name as those in the [Query API](/en/reference/api/query). A request handler may query the search components running in the same container without any appreciable overhead:
+
+### Invoking Vespa queries from a component
+
+To invoke Vespa queries from a component, have an instance of [ExecutionFactory](https://github.com/vespa-engine/vespa/blob/master/container-search/src/main/java/com/yahoo/search/searchchain/ExecutionFactory.java) injected in the constructor and use its API to construct and issue the query. The container this runs in must include the `<search>` tag for the ExecutionFactory to be available. Example:
+
+```java expandable
+import com.yahoo.component.annotation.Inject;
+import com.yahoo.component.ComponentId;
+import com.yahoo.search.Query;
+import com.yahoo.search.Result;
+import com.yahoo.component.chain.Chain;
+import com.yahoo.search.Searcher;
+import com.yahoo.search.searchchain.Execution;
+import com.yahoo.search.searchchain.ExecutionFactory;
+
+public class MyComponent {
+
+ private final ExecutionFactory executionFactory;
+
+ @Inject
+ public MyComponent(ExecutionFactory executionFactory) {
+ this.executionFactory = executionFactory;
+ }
+
+ Result executeQuery(Query query, String chainId) {
+ Chain<Searcher> searchChain = executionFactory.searchChainRegistry().getChain(new ComponentId(chainId));
+ Execution execution = executionFactory.newExecution(searchChain);
+ query.getModel().setExecution(execution);
+ return execution.search(query);
+ }
+
+}
+```
+
+ExecutionFactory depends on the search chains, so it cannot be injected into any component which is part of the search chains. But from within a Searcher it is not needed as the Execution passed gives what is needed:
+
+- Access the search chains: `execution.context().searchChainRegistry()`
+- Create a new Execution: `new Execution(mySearchChain, execution.context())`
+
+This is the right way since it ties that execution to the one you're in.
+
+One hence cannot execute a search chain from the search chain component constructor to e.g. refresh a cache. It is impossible since the search chains can't be constructed until this constructor returns. An alternative is to extract the refreshing into a separate component which has both the client and execution factory injected into it.
\ No newline at end of file
diff --git a/mintlify-docs/en/basics/applications.mdx b/mintlify-docs/en/basics/applications.mdx
new file mode 100644
index 0000000000..0044a5bcdf
--- /dev/null
+++ b/mintlify-docs/en/basics/applications.mdx
@@ -0,0 +1,106 @@
+---
+title: Vespa applications
+---
+
+You use Vespa by deploying an *application* to it. Why applications? Because Vespa handles both data and the computations you do over them — together an application.
+
+An application is specified by an *application package* — a directory with some files. The application package contains *everything* that is needed to run your application: config, schemas, components, ML models, and so on.
+
+The *only* way to change an application is to make the change in the application package and then deploy it again. Vespa will then safely change the running system to match the new application package revision, without impacting queries, writes, or data.
+
+## A minimal application package
+
+You can create a complete application package with just a single file: `services.xml`. This file specifies the clusters that your application should run. It could just be a single stateless cluster — what's called *container* — like this:
+
+```xml
+
+
+
+
+
+
+
+```
+
+Put this in a file called `services.xml`, and you have created the world's smallest application package. However, this won't do much; usually you want to have a `content` cluster which can store data, maintain indexes, and run the distributed part of queries. You'll also want your container cluster to load the necessary middleware for this. With that we get a services file like this:
+
+```xml
+
+
+
+
+
+
+
+
+
+ 2
+
+
+
+
+
+
+
+
+```
+
+This specifies a pretty normal simple Vespa application, but now we need another file: the schema of the document type we'll use. This goes into the directory `schemas/`, so our application package now looks like this:
+
+```txt
+services.xml
+schemas/myschema.sd
+```
+
+The schema file describes a kind of data and the computations (such as ranking/scoring) you want to do over it. At minimum it just lists the fields of that data type and whether each field should be indexed:
+
+```txt
+schema myschema {
+
+ document myschema {
+
+ field text type string {
+ indexing: summary | index
+ }
+
+ field embedding type tensor(x[384]) {
+ indexing: attribute | index
+ }
+
+ field popularity type double {
+ indexing: summary | attribute
+ }
+
+ }
+
+}
+```
+
+With these two files we have specified a fully functional application that can do text, vector and hybrid search with filtering.
+
+Rather than creating applications from scratch like this, you can also clone one of our sample applications as a starting point like we did in [getting started](/en/basics/deploy-an-application).
+
+To read more on schemas, see the [schemas](/en/basics/schemas) guide. To see everything an application package can contain, see the [application package reference](/en/reference/applications/application-packages).
+
+## Deploying applications
+
+To create running instances of an application, or make changes to one take effect, you *deploy* it. Deployments to the dev zone and to self-managed clusters set up a single instance, while deployments to production can set up multiple instances in one or more regions.
+
+To deploy an application package you use the [deploy command](/en/clients/vespa-cli#deployment) in Vespa CLI:
+
+```bash
+vespa deploy .
+```
+
+This will deploy the application package at the current directory to the current target and the default dev zone (use `vespa deploy -h` to see other options).
+
+Deployments to production zones use a separate command:
+
+```bash
+vespa prod deploy .
+```
+
+Production deployments also require an additional file in the application package to specify where it should be deployed: `deployment.xml`. See [production deployment](/en/operations/production-deployment). The recommended way to deploy to production is by setting up a continuous deployment job — see [automated deployments](/en/operations/automated-deployments).
+
+Deploying a change to an application package is generally safe to do at any time. It does not disrupt queries and writes, and invalid or destructive changes are rejected before taking effect. You can also add tests that verify the application before deployment to production zones.
+
diff --git a/mintlify-docs/en/basics/deploy-an-application-java.mdx b/mintlify-docs/en/basics/deploy-an-application-java.mdx
new file mode 100644
index 0000000000..e5f3e6c492
--- /dev/null
+++ b/mintlify-docs/en/basics/deploy-an-application-java.mdx
@@ -0,0 +1,110 @@
+---
+title: Deploy an application having Java components
+---
+
+Follow these steps to deploy a Vespa application which includes Java components to the [dev zone](/en/operations/environments#dev) on Vespa Cloud (for free).
+
+Alternative versions of this guide:
+
+- [Deploy an application using pyvespa](https://vespa-engine.github.io/pyvespa/getting-started-pyvespa-cloud.html) - for Python developers
+- [Deploy an application without Java components](/en/basics/deploy-an-application)
+- [Deploy an application without Vespa CLI](/en/basics/deploy-an-application-shell)
+- [Deploy an application locally](/en/basics/deploy-an-application-local)
+- [Deploy an application having Java components locally](/en/basics/deploy-an-application-local-java)
+
+
+
+**Prerequisites:**
+
+- [Java 17](https://openjdk.org/projects/jdk/17/).
+- [Apache Maven](https://maven.apache.org/install.html) to build the application.
+
+
+Setup:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+
+```bash
+$ brew install vespa-cli
+```
+
+
+**Configure the Vespa client:**
+
+```bash
+$ vespa config set target cloud
+$ vespa config set application vespa-team.autotest
+```
+
+
+**Get Vespa Cloud control plane access:**
+
+```bash
+$ vespa auth login
+```
+
+
+
+**Clone a sample [application](/en/basics/applications):**
+
+```bash
+$ vespa clone album-recommendation-java myapp && cd myapp
+```
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+
+```bash
+$ vespa auth cert app
+```
+
+
+
+
+Steps:
+
+
+
+**Build the application:**
+
+```bash
+$ mvn install -f app
+```
+
+
+**[Deploy](/en/basics/applications#deploying-applications) the application:**
+
+```bash
+$ vespa deploy --wait 600 ./app
+```
+
+
+**[Feed](/en/writing/reads-and-writes) [documents](/en/schemas/documents):**
+
+```bash
+$ vespa feed app/src/test/resources/*.json
+```
+
+
+**Run [queries](/en/querying/query-api):**
+
+```bash
+vespa query "select * from music where album contains 'head'"
+```
+```bash
+vespa query \
+ "select * from music where true" \
+ "ranking=rank_albums" \
+ "ranking.features.query(user_profile)={{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}"
+```
+
+
+
+Congratulations, you have deployed your first Vespa application! Application instances in the [dev zone](/en/operations/environments#dev) will by default keep running for 14 days after the last deployment. You can control this in the [console](https://console.vespa-cloud.com/).
+
+You can inspect the source code for this app at [sample-apps](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java).
\ No newline at end of file
diff --git a/mintlify-docs/en/basics/deploy-an-application-local-java.mdx b/mintlify-docs/en/basics/deploy-an-application-local-java.mdx
new file mode 100644
index 0000000000..93bc2dbe8e
--- /dev/null
+++ b/mintlify-docs/en/basics/deploy-an-application-local-java.mdx
@@ -0,0 +1,106 @@
+---
+title: Deploy an application having Java components locally
+---
+
+Follow these steps to deploy a Vespa application which includes Java components to the [dev zone](/en/operations/environments#dev) on Vespa Cloud (for free).
+
+Alternative versions of this guide:
+
+- [Deploy an application using pyvespa](https://vespa-engine.github.io/pyvespa/getting-started-pyvespa-cloud.html) - for Python developers
+- [Deploy an application without Java components](/en/basics/deploy-an-application)
+- [Deploy an application without Vespa CLI](/en/basics/deploy-an-application-shell)
+- [Deploy an application locally](/en/basics/deploy-an-application-local)
+- [Deploy an application having Java components locally](/en/basics/deploy-an-application-local-java)
+
+
+
+**Prerequisites:**
+
+- [Java 17](https://openjdk.org/projects/jdk/17/).
+- [Apache Maven](https://maven.apache.org/install.html) to build the application.
+
+Setup:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+
+```bash
+$ brew install vespa-cli
+```
+
+
+**Configure the Vespa client:**
+
+```bash
+$ vespa config set target cloud
+$ vespa config set application vespa-team.autotest
+```
+
+
+**Get Vespa Cloud control plane access:**
+
+```bash
+$ vespa auth login
+```
+
+
+**Clone a sample [application](/en/basics/applications):**
+
+```bash
+$ vespa clone album-recommendation-java myapp && cd myapp
+```
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+
+```bash
+$ vespa auth cert app
+```
+
+
+Steps:
+
+
+
+**Build the application:**
+
+```bash
+$ mvn -U -f app package
+```
+
+
+**[Deploy](/en/basics/applications#deploying-applications) the application:**
+
+```bash
+$ vespa deploy --wait 600 ./app
+```
+The first deployment may take a few minutes while nodes are provisioned.
+
+
+**[Feed](/en/writing/reads-and-writes) [documents](/en/schemas/documents):**
+
+```bash
+$ vespa feed app/src/test/resources/*.json
+```
+
+
+**Run [queries](/en/querying/query-api):**
+```bash
+vespa query "select * from music where album contains 'head'"
+```
+```bash
+vespa query \
+ "select * from music where true" \
+ "ranking=rank_albums" \
+ "ranking.features.query(user_profile)={{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}"
+```
+
+
+
+Congratulations, you have deployed your first Vespa application! Application instances in the [dev zone](/en/operations/environments#dev) will by default keep running for 14 days after the last deployment. You can control this in the [console](https://console.vespa-cloud.com/).
+
+You can inspect the source code for this app at [sample-apps](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java).
diff --git a/mintlify-docs/en/basics/deploy-an-application-local.mdx b/mintlify-docs/en/basics/deploy-an-application-local.mdx
new file mode 100644
index 0000000000..d6ec166d5c
--- /dev/null
+++ b/mintlify-docs/en/basics/deploy-an-application-local.mdx
@@ -0,0 +1,119 @@
+---
+title: Deploy an application locally
+---
+
+Follow these steps to deploy a Vespa application on your own machine.
+
+Alternative versions of this guide:
+
+- [Deploy an application using pyvespa](https://vespa-engine.github.io/pyvespa/getting-started-pyvespa-cloud.html) - for Python developers
+- [Deploy an application](/en/basics/deploy-an-application)
+- [Deploy an application having Java components](/en/basics/deploy-an-application-java)
+- [Deploy an application without Vespa CLI](/en/basics/deploy-an-application-shell)
+- [Deploy an application having Java components locally](/en/basics/deploy-an-application-local-java)
+
+This is tested with _vespaengine/vespa:8.692.16_ container image.
+
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+
+
+
+
+
+**Validate the environment:**
+
+```bash
+$ docker info | grep "Total Memory"
+
+# or, with Podman:
+
+$ podman info | grep "memTotal"
+```
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+
+```bash
+$ brew install vespa-cli
+```
+Windows/No Homebrew? See the [Vespa CLI page](/en/clients/vespa-cli) to download directly.
+
+
+**Set local target:**
+```bash
+$ vespa config set target local
+```
+
+
+**Start a Vespa Docker container:**
+```bash
+docker run --detach --name vespa --hostname vespa-container \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa
+```
+Alternatively, use `podman` in the command above.
+
+The port `8080` is published to make the search and feed interfaces accessible from outside the container, `19071` is the deploy-endpoint. Only one container named `vespa` can run at a time, so change the name if needed. See [Docker containers](/en/operations/self-managed/docker-containers) for more insights.
+
+
+
+**Clone a sample [application](/en/basics/applications):**
+```bash
+$ vespa clone album-recommendation myapp && cd myapp
+```
+
+
+
+**[Deploy](/en/basics/applications#deploying-applications) the application:**
+```bash
+$ vespa deploy --wait 300 ./app
+```
+
+
+
+**[Feed](/en/writing/reads-and-writes) [documents](/en/schemas/documents):**
+```bash
+$ vespa feed dataset/documents.jsonl
+```
+
+
+**Run [queries](/en/querying/query-api):**
+```bash
+vespa query "select * from music where album contains 'head'"
+```
+```bash
+vespa query \
+ "select * from music where true" \
+ "ranking=rank_albums" \
+ "ranking.features.query(user_profile)={{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}"
+```
+
+
+**Get documents:**
+```bash
+vespa document get id:mynamespace:music::a-head-full-of-dreams
+```
+```bash
+vespa visit
+```
+
+Get a document by ID, or export all documents - see [/document/v1](/en/writing/document-v1-api-guide) and [vespa visit](/en/writing/visiting).
+
+
+
+Congratulations, you have deployed your first Vespa application!
\ No newline at end of file
diff --git a/mintlify-docs/en/basics/deploy-an-application-shell.mdx b/mintlify-docs/en/basics/deploy-an-application-shell.mdx
new file mode 100644
index 0000000000..e020606335
--- /dev/null
+++ b/mintlify-docs/en/basics/deploy-an-application-shell.mdx
@@ -0,0 +1,129 @@
+---
+title: Deploy an application without Vespa CLI
+---
+
+This lets you deploy an application to the [dev zone](/en/operations/environments#dev) on Vespa Cloud (for free).
+
+Alternative versions of this guide:
+
+- [Deploy an application using pyvespa](https://vespa-engine.github.io/pyvespa/getting-started-pyvespa-cloud.html) - for Python developers
+- [Deploy an application](/en/basics/deploy-an-application)
+- [Deploy an application having Java components](/en/basics/deploy-an-application-java)
+- [Deploy an application locally](/en/basics/deploy-an-application-local)
+- [Deploy an application with Java components locally](/en/basics/deploy-an-application-local-java)
+
+
+
+
+**Prerequisites:**
+
+- git - or download the files from [album-recommendation](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation)
+- zip - or other tool to create a .zip file
+- curl - or other tool to send HTTP requests with security credentials
+- OpenSSL
+
+
+Steps:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+
+Go to [console.vespa-cloud.com](https://console.vespa-cloud.com/) and create your tenant (unless you already have one).
+
+
+**Clone a sample [application](/en/basics/applications):**
+```bash
+git clone --depth 1 https://github.com/vespa-engine/sample-apps.git && \
+ cd sample-apps/album-recommendation
+```
+See [sample-apps](https://github.com/vespa-engine/sample-apps) for other sample apps you can clone.
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+
+On Unix or Mac, use `openssl`:
+
+```bash
+openssl req -x509 -nodes -days 14 -newkey rsa:4096 \
+ -subj "/CN=cloud.vespa.example" \
+ -keyout data-plane-private-key.pem -out data-plane-public-cert.pem
+```
+On Windows, the certificate has to be created with [New-SelfSignedCertificate](https://learn.microsoft.com/en-us/powershell/module/pki/new-selfsignedcertificate) in PowerShell, and then exported to PEM format using [certutil](https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/certutil).
+
+Once the certificate has been created, add it to the application package.
+
+```bash
+mkdir -p app/security && \
+ cp data-plane-public-cert.pem app/security/clients.pem
+```
+
+
+
+**Create a deployable application package zip:**
+```bash
+( cd app && zip -r ../application.zip . )
+```
+
+
+
+**Deploy the application:**
+
+In the [console](https://console.vespa-cloud.com/), click *Deploy Application*. Use "myapp" as the application name and leave the other settings at their defaults. Make sure *DEV* is selected, and upload the `application.zip`. Click *Create and deploy*.
+
+The first deployment may take a few minutes while nodes are provisioned.
+
+
+
+**Verify the application endpoint:**
+
+```bash
+ENDPOINT=https://name.myapp.tenant-name.aws-us-east-1c.dev.z.vespa-app.cloud
+```
+```bash
+curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem $ENDPOINT
+```
+Find the endpoint in the console's deployment output, set it as `ENDPOINT` for later use, and test it with the `curl` command above. You can also [do this in a browser](/en/security/guide#using-a-browser).
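+
+The endpoint is written with a trailing slash above, while the later commands append paths such as `/search/` to `$ENDPOINT`, which yields a double slash in the URL. A minimal sketch of normalizing the variable first, using the placeholder hostname from above (substitute your own endpoint):
+
+```bash
+# Placeholder endpoint copied from the example above; replace with your own.
+ENDPOINT="https://name.myapp.tenant-name.aws-us-east-1c.dev.z.vespa-app.cloud/"
+
+# Strip a single trailing slash, if present, so that paths can be appended cleanly.
+ENDPOINT="${ENDPOINT%/}"
+
+SEARCH_URL="$ENDPOINT/search/"
+echo "$SEARCH_URL"
+```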
+
+
+**[Feed](/en/writing/reads-and-writes) [documents](/en/schemas/documents):**
+```bash
+curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ -H "Content-Type:application/json" \
+ --data-binary @ext/A-Head-Full-of-Dreams.json \
+ $ENDPOINT/document/v1/mynamespace/music/docid/a-head-full-of-dreams
+```
+```bash
+curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ -H "Content-Type:application/json" \
+ --data-binary @ext/Love-Is-Here-To-Stay.json \
+ $ENDPOINT/document/v1/mynamespace/music/docid/love-is-here-to-stay
+```
+```bash
+curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ -H "Content-Type:application/json" \
+ --data-binary @ext/Hardwired...To-Self-Destruct.json \
+ $ENDPOINT/document/v1/mynamespace/music/docid/hardwired-to-self-destruct
+```
+
+
+**Run [queries](/en/querying/query-api):**
+```bash
+curl --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ -X POST -H "Content-Type: application/json" --data '
+ {
+ "yql": "select * from music where true",
+ "ranking": {
+ "profile": "rank_albums",
+ "features": {
+ "query(user_profile)": "{{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}"
+ }
+ }
+ }' \
+ $ENDPOINT/search/
+```
+
+
+
+
+Congratulations, you have deployed your first Vespa application! Application instances in the [dev zone](/en/operations/environments#dev) will by default keep running for 14 days after the last deployment. You can control this in the [console](https://console.vespa-cloud.com/).
diff --git a/mintlify-docs/en/basics/deploy-an-application.mdx b/mintlify-docs/en/basics/deploy-an-application.mdx
new file mode 100644
index 0000000000..3891c93a62
--- /dev/null
+++ b/mintlify-docs/en/basics/deploy-an-application.mdx
@@ -0,0 +1,111 @@
+---
+title: Deploy an application
+description: "Follow these steps to deploy a Vespa application to the [dev zone](/en/operations/environments#dev) on Vespa Cloud (for free)."
+---
+
+{/* If you change this also make the same change in deploy-an-application-{shell,java} */}
+
+**Alternative versions of this guide:**
+
+- [Deploy an application using pyvespa](https://vespa-engine.github.io/pyvespa/getting-started-pyvespa-cloud.html) — for Python developers
+- [Deploy an application having Java components](/en/basics/deploy-an-application-java)
+- [Deploy an application without Vespa CLI](/en/basics/deploy-an-application-shell)
+- [Deploy an application locally](/en/basics/deploy-an-application-local)
+- [Deploy an application having Java components locally](/en/basics/deploy-an-application-local-java)
+
+## Setup
+
+
+
+ Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:
+
+ Go to [console.vespa-cloud.com](https://console.vespa-cloud.com/) and create your tenant (unless you already have one).
+
+
+
+ Install the [Vespa CLI](/en/clients/vespa-cli) using [Homebrew](https://brew.sh/):
+
+ ```bash
+ brew install vespa-cli
+ ```
+
+ Windows or no Homebrew? See the [Vespa CLI](/en/clients/vespa-cli) page to download directly.
+
+
+
+ ```bash
+ export VESPA_CLI_HOME=$PWD/.vespa
+ vespa config set target cloud
+ vespa config set application vespa-team.autotest
+ ```
+
+ Use the tenant name from step 1 instead of `vespa-team`, and replace it in other steps in this guide too.
+
+
+
+ ```bash
+ vespa auth login
+ ```
+
+ Follow the instructions from the command to authenticate.
+
+
+
+ **Clone a sample [application](/en/basics/applications):**
+
+ ```bash
+ vespa clone album-recommendation myapp && cd myapp
+ ```
+
+ See [sample-apps](https://github.com/vespa-engine/sample-apps) for other sample apps you can clone.
+
+
+
+ **Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+
+ ```bash
+ vespa auth cert app
+ ```
+
+ It is a good idea to take note of the path to the `.pem` files written here.
+
+
+
+Steps:
+
+
+
+ [Deploy](/en/basics/applications#deploying-applications) the application:
+
+ ```bash
+ vespa deploy --wait 600 ./app
+ ```
+
+ The first deployment may take a few minutes while nodes are provisioned.
+
+
+
+ [Feed](/en/writing/reads-and-writes) [documents](/en/schemas/documents):
+
+ ```bash
+ vespa feed dataset/documents.jsonl
+ ```
+
+
+
+ Run [queries](/en/querying/query-api):
+
+ ```bash
+ vespa query "select * from music where album contains 'head'"
+ ```
+
+ ```bash
+ vespa query \
+ "select * from music where true" \
+ "ranking=rank_albums" \
+ "ranking.features.query(user_profile)={{cat:pop}:0.8,{cat:rock}:0.2,{cat:jazz}:0.1}"
+ ```
+
+
+
+Congratulations, you have deployed your first Vespa application! Application instances in the [dev zone](/en/operations/environments#dev) will by default keep running for 14 days after the last deployment. You can control this in the [console](https://console.vespa-cloud.com/).
\ No newline at end of file
diff --git a/mintlify-docs/en/basics/operations.mdx b/mintlify-docs/en/basics/operations.mdx
new file mode 100644
index 0000000000..4cb1fadb4d
--- /dev/null
+++ b/mintlify-docs/en/basics/operations.mdx
@@ -0,0 +1,25 @@
+---
+title: Operations
+description: "A deployed Vespa application is a self-contained, highly available, distributed stateful system. Operating these at scale is difficult, so Vespa automates this to the extent possible in the deployment environment it is running in."
+---
+
+| Deployment environment | Automated operations | Suitable for |
+| :--- | :--- | :--- |
+| Vespa self-managed/open source | Application deployment (single application, single instance), application change (except rolling restarts), data redistribution, failover | Development |
+| Vespa Kubernetes Operator | Application deployment (single application, single instance), application change, data redistribution, failover, node provisioning, failed node replacement, node type change, [autoscaling](/en/operations/autoscaling), [endpoint routing](/en/operations/endpoint-routing), encryption | Production in environments outside hyperscalers |
+| Vespa Cloud | Application deployment (multiple applications, instances, [regions](/en/operations/zones), clouds), application change, data redistribution, failover, node provisioning, failed node replacement, node type change, [autoscaling](/en/operations/autoscaling), [endpoint routing](/en/operations/endpoint-routing), encryption, Vespa platform and OS upgrades, continuous deployment pipeline with verification, metrics and management console | Development, production on hyperscalers (including in [customer accounts and VPCs](/en/operations/enclave/enclave)) |
+
+Vespa is designed to enable applications to evolve in production. This includes these aspects:
+
+- Application package changes are managed by Vespa's built-in control plane to be carried out without impacting queries or writes. If a change cannot be made without impacting queries or writes, it is rejected on deployment (and will require a [validation](/en/reference/applications/validation-overrides) override to be allowed).
+- The operations supported by Vespa are those that can be scaled to hundreds of nodes, billions of documents and hundreds of thousands of queries per second. If you can run it on a single machine, you can scale it.
+- The hardware resources available in a cluster can be changed both up and down. Redistribution will happen automatically in the background, with limited resource usage to avoid impacting queries and writes.
+- When possible (on Vespa Cloud), new revisions of applications are deployed in test zones where they can be verified by application-supplied functional tests before being allowed to progress to production.
+
+## Performance and scaling
+
+Content clusters in Vespa can be scaled to any amount of content by adding more nodes (horizontal scaling). Data will redistribute automatically, and there's no need for manual tuning of the process. To scale to large amounts of queries, content clusters can also be scaled by adding multiple *groups* of nodes (vertical scaling). Each group contains a single copy of the corpus and container clusters will automatically load balance over groups.
+
+A Vespa application can consist of any number of stateless and stateful clusters. On larger applications it can be beneficial to split different functions into separate clusters that can be optimized separately. For example, having one stateless container cluster for feeding and another for querying, or using different content clusters for different data schemas.
+
+Read more in [elasticity](/en/content/elasticity) and the [performance guide](/en/performance/).
diff --git a/mintlify-docs/en/basics/querying.mdx b/mintlify-docs/en/basics/querying.mdx
new file mode 100644
index 0000000000..edac9a9a53
--- /dev/null
+++ b/mintlify-docs/en/basics/querying.mdx
@@ -0,0 +1,91 @@
+---
+title: Querying
+description: "An introduction to querying with Vespa."
+---
+
+## Queries
+
+Queries in Vespa are expressed as a YQL string: a query language similar to SQL for structured data, with additions for vector and full-text search, for example:
+
+```sql
+select * from mySchema where myTextField contains 'someWord' and myNumber > 10.0
+```
+
+You can also search multiple fields with one query item (like "contains"), by defining a [fieldset](/en/reference/schemas/schemas#fieldset) in the schema.
+
+Any nested combination of and/or and so on is supported; see the full syntax in the [query language reference](/en/reference/querying/yql).
+
+## Query requests
+
+Queries are sent as HTTP requests, to the endpoint of a container cluster having `` in `services.xml`. The YQL query is sent as the `yql` parameter (HTTP encoded):
+
+```text
+endpoint-url/search/?yql=select+%2A+from+sources+%2A+where+true
+```
+
+The Vespa CLI can do this for you:
+
+```bash
+vespa query "select * from sources * where true"
+```
+
+You can add the `-v` option to see the HTTP request that this becomes.
+
+On Vespa Cloud your application will by default get an mTLS certificate that you use to make requests. If you want to use an access token, you can [add one in the console](/en/security/guide#configuring-tokens).
+
+## Query request parameters
+
+In addition to the `yql` parameter, you can send other query request parameters to supply data such as user/LLM query text, vectors, and parameters controlling the query execution. In HTTP requests these are added as further URL parameters; with the Vespa CLI they are passed as additional arguments:
+
+```bash
+vespa query -v "select * from sources * where true" "timeout=100ms"
+```
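+
+For HTTP requests, each parameter becomes one `name=value` pair in the URL, with the value URL-encoded. The following is a minimal sketch in bash: `urlencode` is a hypothetical helper written for this illustration (ASCII input assumed), and `endpoint-url` is the same placeholder as in the example above. In practice, `curl -G --data-urlencode` can do the encoding for you.
+
+```bash
+# Hypothetical helper: percent-encode a string for use as a URL parameter value.
+# Assumes ASCII input; spaces are encoded as '+', as in the example URL above.
+urlencode() {
+    local LC_ALL=C s="$1" out="" c hex i
+    for (( i = 0; i < ${#s}; i++ )); do
+        c="${s:i:1}"
+        case "$c" in
+            [a-zA-Z0-9.~_-]) out+="$c" ;;                          # unreserved characters pass through
+            ' ') out+='+' ;;                                       # space
+            *) printf -v hex '%%%02X' "'$c"; out+="$hex" ;;        # everything else as %XX
+        esac
+    done
+    printf '%s\n' "$out"
+}
+
+YQL="select * from sources * where true"
+QUERY_URL="endpoint-url/search/?yql=$(urlencode "$YQL")&timeout=$(urlencode "100ms")"
+echo "$QUERY_URL"
+```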
+
+You can also send the query parameters [as a JSON payload](/en/querying/query-api#http) instead of as request parameters:
+
+```bash
+curl -H "Content-Type: application/json" \
+ --data '{"yql" : "select * from sources * where true"}' \
+ endpoint-url/search/
+```
+
+To see all the parameters accepted, see the [query API reference](/en/reference/api/query).
+
+You may end up wanting to set many query parameters in your queries. Instead of passing them in the request, you can create a query profile in the application package containing all the parameters and just specify the profile in the request — see [query profiles](/en/querying/query-profiles).
+
+## Querying with text
+
+You can use the `text` YQL operator to retrieve or rank using raw text. This will process the text and (by default) search it with the [WeakAnd](/en/ranking/wand#weakand) text search operator.
+
+You can pass the text directly, or refer to a separate request parameter (using `@parameter`):
+
+```bash
+vespa query "select * from sources * where title contains text(@query)" \
+ "query=Any text, from a human/llm"
+```
+
+You can set options controlling how the text is to be parsed and matched; see the [text() reference documentation](/en/reference/querying/yql#text).
+
+## Querying with vectors
+
+Querying by vectors is done using the `nearestNeighbor` YQL operator, which takes the name of a document vector field and the name of a query vector:
+
+```bash
+vespa query 'select * from sources * where {targetHits: 100}nearestNeighbor(my_vector_field, my_query_vector)' \
+ ranking=my_rank_profile \
+ 'input.query(my_query_vector)'='[1,2,3]'
+```
+
+Read more in [nearest neighbor search](/en/querying/nearest-neighbor-search).
+
+You can combine multiple `nearestNeighbor`, `text` and other operators in any way:
+
+```bash
+vespa query "select * from sources * where (({targetHits: 300}nearestNeighbor(my_title_embedding, my_query_vector)) \
+ or ({targetHits: 150}nearestNeighbor(my_body_embeddings, my_query_vector)) \
+ or title contains text(@query) or body contains text(@query)) \
+ and range(title, 0.0, 500.0) and category in ('c1', 'c2') \
+ and !(blacklisted=true)" \
+ "input.query(my_query_vector)=embed(@query)" \
+ "query=Hello, world! "
+```
diff --git a/mintlify-docs/en/basics/ranking.mdx b/mintlify-docs/en/basics/ranking.mdx
new file mode 100644
index 0000000000..eb73078ab5
--- /dev/null
+++ b/mintlify-docs/en/basics/ranking.mdx
@@ -0,0 +1,134 @@
+---
+title: Ranking
+---
+
+*Ranking* in Vespa is the computation that is done on matching documents during query execution. These computations are specified as [ranking functions](/en/ranking/ranking-expressions-features) in *rank profiles* in the schema.
+
+The special function named `first-phase` will determine the initial *rank* of the matches, such that the top k can be selected as response to a query:
+
+```js
+rank-profile my-rank-profile {
+ first-phase {
+ expression: 0.7 * bm25(text) + 0.3 * attribute(popularity)
+ }
+}
+```
+
+## Ranking functions and features
+
+The ranking functions can be any mathematical function combining rank features, including [tensor math](/en/ranking/tensor-user-guide#ranking-with-tensors) and [machine-learned models](#machine-learned-model-inference).
+
+The rank features these functions can use are of three categories:
+
+- **Document features**, using `attribute(fieldName)`: any document field which has `attribute` in the indexing statement.
+- **Query features**, aka inputs, using `query(name)`: any value sent with the query as an input. When these are tensors (not scalars) they must be declared as an input in the rank profile.
+- **Match features**: a built-in feature which says something about how well a query and document match, e.g. bm25 or closeness.
+
+Refer to the [full list of rank features](/en/reference/ranking/rank-features).
+
+Query features (inputs) that are tensors must be declared in the rank profile:
+
+```js
+rank-profile my-rank-profile {
+ inputs {
+ query(user_context) tensor(x[3])
+ }
+ first-phase {
+ expression: bm25(text) + sum(query(user_context) * attribute(document_context))
+ }
+}
+```
+
+This is also how the type of query vectors in vector search is declared.
+
+## Rank profiles
+
+A schema can have any number of rank profiles specifying computations and ranking for different use cases, experiments, and so on. Queries select one using the [ranking.profile](/en/reference/api/query#ranking.profile) parameter in requests or a [query profile](/en/querying/query-profiles). If no profile is specified in the request, the one called `default` is used, and if that isn't specified in the schema, a default one ranking by the [nativeRank](/en/ranking/nativerank) feature is used. Another built-in rank profile `unranked` is also always available. Specifying this boosts serving performance in queries which do not need ranking because ordering is not important or [explicit field sorting](/en/reference/querying/sorting-language) is used.
+
+To avoid very long schema files, rank profiles can also be specified in their own files in the application package, named `schemas/[schema-name]/[profile-name].profile`. See the [schema reference](/en/reference/schemas/schemas#rank-profile) for documentation of all the content of rank profiles.
+
+Rank profiles can inherit other profiles to avoid duplication, as in `rank-profile myProfile inherits default, another`.
+
+## Phased ranking
+
+In addition to first-phase which specifies the initial ranking that will be applied on all matching documents during matching, rank profiles can also specify functions that will be applied to *rerank* the top k documents before returning the final result. This is useful to direct more computation towards the most promising candidate documents:
+
+```js
+schema myapp {
+
+ rank-profile my-rank-profile {
+
+ first-phase {
+ expression {
+ attribute(quality) * freshness(timestamp) + bm25(title)
+ }
+ }
+
+ second-phase {
+ expression: xgboost(my_xgboost_reranker)
+ total-rerank-count: 1000 # Over all nodes
+ }
+
+ global-phase {
+ expression: sum(onnx(my_large_onnx_model))
+ rerank-count: 20
+ }
+
+ }
+
+}
+```
+
+The `second-phase` expression is executed locally on the content node, using local data. This is efficient on thousands of candidates. The `global-phase` expression is executed on the global result set after merging, in the container node and is best used for any very expensive and high quality final reranking. See [phased ranking](/en/ranking/phased-ranking) for details.
+
+## Ranking functions
+
+A rank profile can define any number of functions which can be used in other ranking expressions or (when taking no arguments) be returned with results.
+
+```js expandable
+schema myapp {
+
+ rank-profile my-rank-profile {
+
+ function clickProbability() {
+ expression: xgboost('myClickModel')
+ }
+
+ function textRanking(field) {
+ expression: 0.7 * bm25(field) + 0.3 * nativeProximity(field)
+ }
+
+ first-phase {
+ expression {
+ 0.1 * clickProbability() +
+ 0.2 * closeness(embeddingsField) +
+ 0.3 * textRanking(titleField) +
+ 0.4 * textRanking(bodyField)
+ }
+ }
+
+ summary-features {
+ clickProbability() # Returned with every matched document
+ }
+
+ }
+
+}
+```
+
+Read more in [ranking expressions and functions](/en/ranking/ranking-expressions-features).
+
+## Layered ranking
+
+In addition to ranking *documents*, a rank profile can also rank and select array elements within documents. This is most commonly used to select individual chunks within documents in RAG applications — see [working with chunks](/en/rag/working-with-chunks#layered-ranking-selecting-chunks-to-return).
+
+## Machine-Learned model inference
+
+The best quality is achieved by learning relevance functions using machine learning from a training set. Vespa lets you use machine-learned models in these formats in distributed ranking (first- and second phase):
+
+- [ONNX](/en/ranking/onnx), allowing importing models from ML frameworks like TensorFlow, PyTorch and scikit-learn.
+- [XGBoost](/en/ranking/xgboost)
+- [LightGBM](/en/ranking/lightgbm)
+
+As these are exposed as rank features, they can be used in ranking expressions exactly like any other rank feature.
+
diff --git a/mintlify-docs/en/basics/schemas.mdx b/mintlify-docs/en/basics/schemas.mdx
new file mode 100644
index 0000000000..1577338dff
--- /dev/null
+++ b/mintlify-docs/en/basics/schemas.mdx
@@ -0,0 +1,103 @@
+---
+title: Schemas
+description: "This is an introduction to schemas in Vespa. You can find all the details in the [schema reference](/en/reference/schemas/schemas)."
+---
+
+A schema defines a type of data and what we want to compute over it. An application package can contain multiple schemas for different kinds of data. Each content cluster specified in `services.xml` refers to the schemas that should be stored and indexed in that cluster. Schemas can inherit other schemas to avoid repeating common content.
+
+Schemas are placed in files named the same as the schema, with the ending `.sd` (for schema definition), in the `schemas/` directory of the application package.
+
+## Document fields
+
+A schema contains a document type, which is a named collection of fields:
+
+```js
+schema mySchema {
+
+ document mySchema {
+
+ field myField type string {
+ indexing: summary | index
+ }
+
+ ... more fields
+
+ }
+
+}
+```
+
+Each field has a type, a way it should be processed and indexed, and optionally other settings. The main decision you make is how the field should be used in queries, determined by the `indexing` statement:
+
+- `indexing: summary`: The field should be available in query responses ([document summaries](/en/querying/document-summaries)).
+- `indexing: index`: If a string: create a full-text on-disk index. If a tensor: create an HNSW vector index (requires `attribute` in addition).
+- `indexing: attribute`: For any field type: make the field value available for structured search (exact, range, regexp etc.), ranking, sorting, grouping, and aggregation in the [in-memory column store](/en/content/attributes). Suitable for structured data.
+- `indexing: attribute` and `attribute: fast-search`: As above, but in addition, create an index over this data to make it an efficient filter. Suitable for structured fields that are used as strong filters in queries.
+
+The indexing statement can contain multiple expressions separated by a pipe character, and these can also preprocess the value, so the pipe should be read as passing to the next expression, as on Unix. See the [reference](/en/reference/schemas/schemas#field) for all the types and content of fields.
+
+When a schema is defined and added to a content cluster, you can [write data](/en/basics/writing) according to it, and [query](/en/basics/querying) using the attributes and indexed fields in it. Indexing always happens automatically in real time.
+
+## Synthetic fields
+
+The document type in the schema defines the fields that you can put and get (write and read) for that document type. However, sometimes you want to take an input field and process it in some way before it is stored/indexed. To do that, you can create additional synthetic fields outside the document in the schema (for example using the [embed](/en/rag/embedding) function):
+
+```js
+schema mySchema {
+
+ document mySchema {
+
+ field myField type string {
+ indexing: summary | index
+ }
+
+ ...
+
+ }
+
+ field mySyntheticField type tensor(x[386]) {
+ indexing: input myField | embed | attribute | index
+ }
+
+}
+```
+
+## Rank profiles
+
+A *rank profile* specifies what should be computed over the data described by the schema, and how the documents of it should be ranked to select the ones to return in a query response:
+
+```js
+schema mySchema {
+
+ ...
+
+ rank-profile hybrid {
+
+ first-phase {
+ expression {
+ 0.3 * bm25(myText) +
+ 0.5 * closeness(myEmbedding) +
+ 0.2 * attribute(popularity)
+ }
+ }
+
+ }
+
+}
+```
+
+A schema can have any number of rank profiles for different use cases, experiments and so on, and each can have multiple functions that compute some value to be returned or used in ranking. In addition to simple math functions like the above these can also be machine-learned models. See [ranking](/en/basics/ranking) for more.
+
+## Working with schemas
+
+Schemas may become thousands of lines, with inheritance, multiple rank functions calling each other and so on. The most efficient way of working with them is to use an IDE and install the Vespa plugin to get syntax highlighting, completions and navigation — see [IDE support](/en/applications/ide-support).
+
+What happens if you change the schema of a running application?
+
+- **Adding new fields**: No problem; the new field will be added and have no value until data is written to it.
+- **Changing how a field is indexed**: This will automatically cause a background reindexing on Vespa Cloud, but in the meantime there may be inconsistency in how the field is used in queries and writes, so in production it is sometimes preferable to create a new field instead.
+- **Removing a field**: Data and indexes are removed for the field.
+- **Changing the type of a field**: Existing data and indexes are removed for the field. For this reason, it is often preferable to add a new field instead, populate it, switch usages to the new field, then remove the old.
+
+You can find the details in [modifying schemas](/en/reference/schemas/schemas#modifying-schemas).
+
diff --git a/mintlify-docs/en/basics/whats-more.mdx b/mintlify-docs/en/basics/whats-more.mdx
new file mode 100644
index 0000000000..089a4ce5ca
--- /dev/null
+++ b/mintlify-docs/en/basics/whats-more.mdx
@@ -0,0 +1,17 @@
+---
+title: What's more
+description: "The Vespa basics articles introduce the central concepts in Vespa, but can't cover everything needed to build complete applications."
+---
+
+Some additional important features are:
+
+- **[Grouping and aggregation](/en/querying/grouping)** (faceting): Grouping in the query language lets you specify hierarchical groupings and aggregations that will be performed over all the matches to a query distributed over all participating nodes.
+- **Streaming search**: In applications where queries search fixed small subsets of all data (such as a user or tenant) it is not cost-effective to build indexes. For these use cases Vespa supports a [streaming mode](/en/performance/streaming-search) which delivers low-latency search without the cost of maintaining indexes or even keeping data in memory.
+- **Application components**: Applications can include Java components that implement application logic, such as intercepting queries and results to implement custom workflows ([Searchers](/en/applications/searchers)), modifying write operations ([document processors](/en/applications/document-processors)), and implementing custom APIs ([handlers](/en/applications/request-handlers)).
+- **Parent-child relations**: Joins are not supported in Vespa because they wouldn't scale, but the special case where one side of the join is much smaller than the other is supported. This is called a [parent-child relation](/en/schemas/parent-child).
+- **Federation**: Most applications federate over multiple types of content. Vespa will federate over all schemas and clusters by default, and includes a [federation framework](/en/querying/federation) which lets you define application-specific schemes to formulate queries to each content type, include content from other services, combine content in application-specific ways and so on.
+- **Predicate fields**: Sometimes it is useful to allow documents to specify when they should be matched, as conditions on properties sent with the query, for example to let content target specific kinds of users. This can be done using [predicate fields](/en/schemas/predicate-fields).
+- **Geo search**: By using [geo fields](/en/querying/geo-search), you can find documents within a given area, use distance to the query in ranking, or even retrieve by distance to a path for route planning.
+- **Mutable attributes**: By defining [mutable attributes](/en/reference/schemas/schemas#mutate) on the documents, applications can collect statistics in real time on each document to track how often they are matched, ranked, and returned in results.
+
+Read more in the full [features](/en/learn/features) list.
diff --git a/mintlify-docs/en/basics/writing.mdx b/mintlify-docs/en/basics/writing.mdx
new file mode 100644
index 0000000000..45cce4a74c
--- /dev/null
+++ b/mintlify-docs/en/basics/writing.mdx
@@ -0,0 +1,70 @@
+---
+title: Writing
+description: "This is an introduction to writing data into Vespa."
+---
+
+## Documents
+
+Once you have added one or more schemas to an application, and have added `` in `services.xml` to the container cluster you want to handle writes, you can send writes following those schemas. A document is written as a JSON map containing a value for each field:
+
+```json
+{
+ "put": "id:my-namespace:my-documenttype::my-id-string",
+ "fields": {
+ "myTextField": "Hello world!",
+ "myNumericAttribute": 13.8,
+ "myEmbedding": [0.3, 1.45, 1.03]
+ }
+}
+```
+
+Each document has an id, which has two parts which can be decided freely:
+
+ - The **namespace**, which is just a string used to avoid name collisions if you have multiple kinds of clients deciding ids and not used for any other purpose
+ - The **id string**, which can be any string you want, for example a product id or a url
+
+Fields can remain empty; you do not need to set a value for every field defined in the document type.
+
+You can find complete information on the document format in the [document JSON format reference](/en/reference/schemas/document-json-format).
+
+## Writing documents
+
+Documents are written to your application instance's *write endpoint*, using the [document/v1](/en/writing/document-v1-api-guide) HTTP API. You can use the API directly, or use one of the clients provided by Vespa:
+
+- **Command line, with [Vespa CLI](/en/clients/vespa-cli)**: [`vespa feed`](/en/clients/vespa-cli#documents) to feed one or many documents to Vespa.
+- **Python, with [PyVespa](https://vespa-engine.github.io/pyvespa/)**: [`application.feed_iterable(...)`](https://vespa-engine.github.io/pyvespa/reads-writes.html#feeding-operations-from-a-file)
+- **Java, with the [Java Feed Client](/en/clients/vespa-feed-client)**: [`myFeedClient.put(id, json, params)`](/en/clients/vespa-feed-client#example-java)
+
+Documents can also be removed, retrieved, and updated using the same API and clients.
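+
+As an illustration of how a document id relates to a document/v1 request path, the sketch below maps an id of the form `id:<namespace>:<doctype>::<id-string>` to `/document/v1/<namespace>/<doctype>/docid/<id-string>`. `doc_path` is a hypothetical helper, not part of any Vespa client; it assumes the part between the document type and the id string is empty and that the id string needs no URL-encoding.
+
+```bash
+# Hypothetical helper: map "id:<namespace>:<doctype>::<id-string>" to a document/v1 path.
+doc_path() {
+    dp_rest="${1#id:}"              # drop the leading "id:"
+    dp_namespace="${dp_rest%%:*}"   # text up to the first ':'
+    dp_rest="${dp_rest#*:}"         # drop "<namespace>:"
+    dp_doctype="${dp_rest%%:*}"     # text up to the next ':'
+    dp_docid="${dp_rest#*::}"       # everything after the "::" separator
+    printf '/document/v1/%s/%s/docid/%s\n' "$dp_namespace" "$dp_doctype" "$dp_docid"
+}
+
+doc_path "id:my-namespace:my-documenttype::my-id-string"
+```
+
+The resulting path is appended to your application's endpoint to form the URL for a put, get, update or remove of that document.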
+
+## Updating documents
+
+Documents can be fully replaced by a new version by writing them again, but you can also update any individual fields of existing documents. This is especially useful for updating attribute fields such as behavior signals or prices at high throughput, without impacting other fields and indexes.
+
+Updates are sent in the same ways as document puts; it's just the format that's different:
+
+```json
+{
+ "update": "id:my-namespace:my-documenttype::my-id-string",
+ "fields": {
+ "myTextField": {
+ "assign": "Some new value"
+ }
+ }
+}
+```
+
+Updates can also increment numerical values, add to arrays and tensors, etc. Read more in the [partial update guide](/en/writing/partial-updates).
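+
+To show puts and partial updates side by side, the sketch below writes a small feed file with one operation per line (JSONL), the kind of file passed to `vespa feed`. The field names are the illustrative ones from the examples above, and the file location is an arbitrary temporary path. The `vespa feed` call is left as a comment since it requires a deployed application.
+
+```bash
+# Write the feed file to a scratch directory.
+FEED_FILE="$(mktemp -d)/feed.jsonl"
+
+# One operation per line: a put, an assign update, and an increment update.
+cat > "$FEED_FILE" <<'EOF'
+{"put": "id:my-namespace:my-documenttype::my-id-string", "fields": {"myTextField": "Hello world!", "myNumericAttribute": 13.8}}
+{"update": "id:my-namespace:my-documenttype::my-id-string", "fields": {"myTextField": {"assign": "Some new value"}}}
+{"update": "id:my-namespace:my-documenttype::my-id-string", "fields": {"myNumericAttribute": {"increment": 1}}}
+EOF
+
+OPERATION_COUNT="$(wc -l < "$FEED_FILE" | tr -d ' ')"
+echo "Wrote $OPERATION_COUNT operations to $FEED_FILE"
+
+# With a deployed application and a configured Vespa CLI:
+# vespa feed "$FEED_FILE"
+```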
+
+## Writes are streamed and realtime
+
+Write operations to Vespa are streamed (using HTTP/2), and processed asynchronously. There is no need for a separate batch API to feed with the maximal throughput a system can handle; servers will push back by responding more slowly when they are close to saturation, and clients use this signal to back off, allowing them to dynamically converge at the maximal throughput a system can handle.
+
+The write operations to Vespa are always applied in real time: when a write operation is asynchronously acknowledged, the write operation is persisted, fully processed and the result is visible in all subsequent queries. Vespa achieves this by a unique index design, combining in-memory mutable structures with (for full-text) disk-backed posting lists.
+
+Read more in the [feed sizing doc](/en/performance/sizing-feeding).
+
+## The document API can also return documents
+
+In addition to supporting writes, the document/v1 HTTP API can also return single documents by id (get), and stream any selection of a document corpus (visit). Visiting is used for background and one-time jobs such as backup and scraping content for offline machine learning. It is designed to have minimal impact on the running system rather than returning with low latency. Read more in [the document/v1 guide](/en/writing/document-v1-api-guide#data-dump).
+
diff --git a/mintlify-docs/en/clients/http-best-practices.mdx b/mintlify-docs/en/clients/http-best-practices.mdx
new file mode 100644
index 0000000000..c0d7d7c236
--- /dev/null
+++ b/mintlify-docs/en/clients/http-best-practices.mdx
@@ -0,0 +1,43 @@
+---
+title: HTTP Best Practices
+sidebarTitle: "HTTP best practices"
+---
+
+## Always re-use connections
+As connections to a JDisc container cluster are terminated at the individual container nodes, the cost of connection overhead will impact their serving capability. This is especially important for HTTPS/TLS as full TLS handshakes are expensive in terms of CPU cycles. A handshake also entails multiple network round-trips, which increases request latency for new connections. A client instance should therefore re-use HTTPS connections if possible for subsequent requests.
+
+Note that some client implementations may not re-use connections by default. For instance, *Apache HttpClient (Java)* [will by default not re-use connections when configured with a client X.509 certificate](https://stackoverflow.com/a/13049131/1615280). Most programmatic clients require the response content to be fully consumed/read for a connection to be reused.
+
+## Use multiple connections
+Clients performing feed/query must use a sufficient number of connections to spread the load evenly among all containers in a cluster. This is due to container clusters being served through a layer 4 load balancer (*Network Load Balancer*).
+Too few connections overall may result in an unbalanced workload, and some containers may not receive any traffic at all. This aspect is particularly relevant for applications with large container clusters and/or few client instances.
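+
+The effect of too few connections can be estimated with a small sketch. It assumes, purely for illustration, that the load balancer assigns each new connection to a container independently and uniformly at random. Real load balancers may behave differently, and the function name is hypothetical.
+
+```python
+def expected_idle_fraction(containers: int, connections: int) -> float:
+    """Expected fraction of containers that receive no connection at all.
+
+    Assumption (not from the Vespa docs): every connection lands on a
+    container chosen independently and uniformly at random.
+    """
+    if containers < 1 or connections < 0:
+        raise ValueError("containers must be >= 1 and connections >= 0")
+    # Probability that one specific container is missed by every connection.
+    return (1.0 - 1.0 / containers) ** connections
+
+
+if __name__ == "__main__":
+    # Compare a few total connection counts for a 10-container cluster.
+    for connections in (10, 50, 100):
+        fraction = expected_idle_fraction(10, connections)
+        print(f"{connections} connections: {fraction:.4f} of containers idle")
+```
+
+Under this assumption, the idle fraction drops quickly as the total number of connections grows well beyond the number of containers.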
+
+## Be aware of server-initiated connection termination
+Vespa Cloud will terminate idle connections after a timeout and active connections after a max age threshold is exceeded.
+The latter is performed gracefully through mechanisms in the HTTP protocol.
+* *HTTP/1.1*: A `Connection: close` header is added to the response for the subsequent request received after timeout.
+* *HTTP/2*: A `GOAWAY` frame with error code `NO_ERROR (0x0)` is returned for the subsequent request received after timeout. Be aware that some client implementations may not handle this scenario gracefully.
+
+Both the idle timeout and max age threshold are aggressive in order to regularly rebalance traffic. This ensures that new container nodes quickly receive traffic from existing client instances, for example when new resources are introduced by the [autoscaler](/en/operations/autoscaling).
+
+To avoid connection termination issues, clients should configure client-side idle timeouts to **less than 30 seconds** and connection TTL (max age) to **less than 45 seconds**. Proactively closing connections before the server does helps prevent errors caused by server-initiated terminations. Connections should still be reused for subsequent requests: these timeouts control when idle or long-lived connections are recycled; they do not disable reuse. Disabling connection reuse entirely would incur the cost of a new TCP connection with TLS handshake for every request.
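+
+As an illustration of the recycling rule above, here is a minimal, library-independent sketch of the bookkeeping a client-side pool could keep per connection. The class and the concrete limits (25 s idle, 40 s max age) are assumptions chosen to stay below the recommended thresholds. Most HTTP client libraries expose equivalent settings, which should be preferred over hand-written logic.
+
+```python
+import time
+from dataclasses import dataclass, field
+
+IDLE_TIMEOUT_S = 25.0  # assumed value, below the 30 second guidance
+MAX_AGE_S = 40.0       # assumed value, below the 45 second guidance
+
+
+@dataclass
+class PooledConnection:
+    """Hypothetical bookkeeping for one pooled connection."""
+    created_at: float = field(default_factory=time.monotonic)
+    last_used_at: float = field(default_factory=time.monotonic)
+
+    def mark_used(self, now=None):
+        # Record that a request was just sent on this connection.
+        self.last_used_at = time.monotonic() if now is None else now
+
+    def should_recycle(self, now=None):
+        # Close and replace the connection before the server would.
+        now = time.monotonic() if now is None else now
+        idle_too_long = now - self.last_used_at >= IDLE_TIMEOUT_S
+        too_old = now - self.created_at >= MAX_AGE_S
+        return idle_too_long or too_old
+```
+
+A pool would call `should_recycle()` before handing out a connection, and otherwise keep reusing it.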
+
+## Prefer HTTP/2
+We recommend *HTTP/2* over *HTTP/1.1*. *HTTP/2* multiplexes multiple concurrent requests over a single connection,
+and its binary protocol is more compact and efficient.
+See Vespa's documentation on [HTTP/2](/en/performance/http2) for more details.
+
+## Be deliberate with timeouts and retries
+Make sure to configure your clients with sensible timeouts and retry policies.
+Timeouts that are too low, combined with aggressive retries, may wreak havoc on your Vespa application if latency increases due to overload.
+
+Handle *transient failures* and *partial failures* through a retry strategy with backoff, for instance *capped exponential backoff* with a random *jitter*.
+Consider implementing a [*circuit-breaker*](https://martinfowler.com/bliki/CircuitBreaker.html) for failures persisting over a longer time-span.
+
+Only retry requests on *server errors* - not on *client errors*. A client should typically not retry requests after receiving a `400 Bad Request` response, or retry a TLS connection after the handshake fails because the client's X.509 certificate has expired.
+
+Be careful when handling 5xx responses, especially `503 Service Unavailable` and `504 Gateway Timeout`. These responses typically indicate an overloaded system, and blindly retrying without backoff will only worsen the situation. Clients should reduce overall throughput when receiving such responses.
+
+The same principle applies to `429 Too Many Requests` responses from the [Document v1 API](/en/writing/document-v1-api-guide), which indicate that the client is exceeding the system's feed capacity. Clients should implement strategies such as reducing the request rate by a specific percentage, introducing exponential backoff, or pausing requests for a short duration before retrying. These adjustments help prevent further overload and allow the system to recover.
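+
+The retry guidance in this section can be sketched as follows. This is a minimal illustration rather than a recommended implementation: the function names, the base delay of 0.1 s, the 10 s cap and the limit of 5 attempts are all assumptions, and `send` stands in for whatever callable issues one request and returns its HTTP status code.
+
+```python
+import random
+import time
+
+
+def is_retryable(status: int) -> bool:
+    # Retry on server errors (5xx) and 429, never on other client errors.
+    return status == 429 or 500 <= status <= 599
+
+
+def backoff_delay(attempt: int, base: float = 0.1, cap: float = 10.0) -> float:
+    # Capped exponential backoff with random jitter over the whole interval.
+    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
+
+
+def send_with_retries(send, max_attempts: int = 5, sleep=time.sleep) -> int:
+    """Call send() until it returns a non-retryable status or attempts run out."""
+    status = send()
+    for attempt in range(max_attempts - 1):
+        if not is_retryable(status):
+            break
+        sleep(backoff_delay(attempt))  # back off before trying again
+        status = send()
+    return status
+```
+
+The `sleep` parameter is injectable so the policy can be unit tested without waiting. A circuit breaker, as mentioned above, would wrap this function and stop calling `send` entirely while failures persist.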
+
+For more general advice on retries and timeouts, see *Amazon Builder's Library*'s [excellent article](https://aws.amazon.com/builders-library/timeouts-retries-and-backoff-with-jitter/) on the subject.
diff --git a/mintlify-docs/en/clients/python-client.mdx b/mintlify-docs/en/clients/python-client.mdx
new file mode 100644
index 0000000000..66d0f12153
--- /dev/null
+++ b/mintlify-docs/en/clients/python-client.mdx
@@ -0,0 +1,4 @@
+---
+title: "Python Client (pyvespa)"
+url: "https://vespa-engine.github.io/pyvespa/"
+---
\ No newline at end of file
diff --git a/mintlify-docs/en/clients/vespa-cli.mdx b/mintlify-docs/en/clients/vespa-cli.mdx
new file mode 100644
index 0000000000..8848d85a27
--- /dev/null
+++ b/mintlify-docs/en/clients/vespa-cli.mdx
@@ -0,0 +1,218 @@
+---
+title: "Vespa CLI"
+---
+
+
+Vespa CLI is the command-line client for Vespa. It is a single binary without any runtime dependencies and is available for Linux, macOS and Windows. With Vespa CLI you can:
+
+- Clone the [sample applications](https://github.com/vespa-engine/sample-apps/) repository
+- Deploy your application to a Vespa installation running locally or remotely
+- Deploy your application on [Vespa Cloud](/)
+- Feed and [query](/en/querying/query-language#query-examples) documents
+- Send custom requests with automatic authentication
+- Automate deployment operations with [vespa auth api-key](/en/reference/clients/vespa-cli/vespa_auth_api-key)
+
+Install Vespa CLI:
+
+- Homebrew: `brew install vespa-cli`
+- Mise: `mise use vespa-cli`
+- [Download from GitHub](https://github.com/vespa-engine/vespa/releases)
+
+To learn the basics on how to use Vespa CLI, see the [quick start guide](/en/basics/deploy-an-application-local) or the [cheat sheet below](#cheat-sheet).
+
+See the [reference documentation](/en/reference/clients/vespa-cli/vespa) for documentation of individual Vespa CLI commands and their options. This documentation is also bundled with the CLI and accessible through `vespa help <command>` or `man vespa-<command>`.
+
+MTLS keypair location:
+
+```bash
+$ ls -l .vespa/mytenant.myapp.default/
+total 16
+-rw-r--r-- 1 name staff 3273 Nov 7 08:02 data-plane-private-key.pem
+-rw-r--r-- 1 name staff 1697 Nov 7 08:02 data-plane-public-cert.pem
+```
+
+The `.vespa` directory should be in the home directory or cwd. Remember to run `vespa config set target cloud` for Vespa Cloud endpoints.
+
+## Cheat sheet
+
+### Install, configure and run
+
+```bash
+# Install - make sure to upgrade frequently for new features
+$ brew install vespa-cli
+$ brew upgrade vespa-cli
+
+# Set home dir to a writeable directory - useful in some container contexts
+$ export VESPA_CLI_HOME=/tmp
+
+# export a token value for dataplane access
+$ export VESPA_CLI_DATA_PLANE_TOKEN='value-of-token'
+
+# Get help
+$ vespa document put --help
+```
+
+### Login and init
+
+```bash
+# Use endpoints on localhost
+$ vespa config set target local
+
+# Use Vespa Cloud
+$ vespa config set target cloud
+
+# Use a browser to log into Vespa Cloud
+$ vespa auth login
+
+# Configure application instance
+$ vespa config set application vespa-team.vespacloud-docsearch.default
+
+# Configure application instance, override global configuration (write to local .vespa)
+$ vespa config set --local application vespa-team.vespacloud-docsearch.other
+```
+
+### Deployment
+
+```bash expandable
+# Deploy an application package from cwd
+$ vespa deploy
+
+# Deploy to a specific zone
+$ vespa deploy -z dev.aws-us-east-1c
+
+# Get the deployed application package as a .zip-file
+$ vespa fetch
+
+# Deploy an application package from cwd to a prod zone with CD pipeline in Vespa Cloud using deployment.xml
+$ vespa prod deploy
+
+# Track deployment to Vespa Cloud status
+$ vespa status
+
+# Validate endpoint status, get endpoint only
+$ vespa status --format=plain
+
+# Remove a deployment from Vespa Cloud
+$ vespa destroy -a vespa-team.vespacloud-docsearch.other
+```
+
+### Documents
+
+```bash expandable
+# Put a document from file
+$ vespa document put file-with-one-doc.json
+
+# Put a document
+$ vespa document put id:mynamespace:music::a-head-full-of-dreams --data '
+{
+ "fields": {
+ "album": "A Head Full of Dreams",
+ "artist": "Coldplay"
+ }
+}'
+
+# Put a document, ID in JSON
+$ vespa document put --data '
+{
+ "put": "id:mynamespace:music::a-head-full-of-dreams",
+ "fields": {
+ "album": "A Head Full of Dreams",
+ "artist": "Coldplay"
+ }
+}'
+
+# Update a document
+$ vespa document update id:mynamespace:music::a-head-full-of-dreams --data '
+{
+ "fields": {
+ "album": {
+ "assign": "A Head Full of Thoughts"
+ }
+ }
+}'
+
+# Get one or more documents
+$ vespa document get id:mynamespace:music::a-head-full-of-dreams
+$ vespa document get id:mynamespace:music::a-head-full-of-dreams id:mynamespace:music::when-we-all-fall-asleep-where-do-we-go
+
+# Delete a document
+$ vespa document remove id:mynamespace:music::a-head-full-of-dreams
+
+# Feed multiple documents or feed from stdin
+$ vespa feed *.jsonl
+$ cat docs.json | vespa feed -
+
+# Feed to Vespa Cloud
+$ vespa feed --application mytenant.myapp --target https://b123e1db.b68a1234.z.vespa-app.cloud feedfile.json
+
+# Print successful and failed operations:
+$ vespa feed --verbose docs.json
+
+# Display a periodic summary every 3 seconds while feeding:
+$ vespa feed --progress=3 docs.json
+
+# Export all documents in "doc" schema, using "default" container cluster
+$ vespa visit --zone prod.aws-us-east-1c --cluster default --selection doc
+
+# Export slice 0 of 10 - approximately 10% of the documents
+$ vespa visit --slices 10 --slice-id 0
+
+# List IDs - great for counting total number of documents
+$ vespa visit --field-set "[id]"
+
+# Export fields "title" and "term_count" from "doc" schema
+$ vespa visit --field-set "doc:title,term_count"
+
+# Export documents using a selection string
+$ vespa visit --selection 'doc.last_updated > now() - 86400'
+
+# Export all documents in "doc" schema, in "open" namespace
+$ vespa visit --selection 'doc AND id.namespace == "open"'
+
+# Export a specific document, including synthetic (generated) fields
+$ vespa visit --selection 'id == "id:en:doc::doc-en-7764"' --field-set '[all]'
+
+# Copy documents from one cluster to another:
+$ vespa visit --target http://localhost:8080 | vespa feed --target http://localhost:9090 -
+```
+
+
+Notes:
+
+- The input files for `vespa feed` contain either a JSON array of feed operations, or one JSON operation per line ([JSONL](https://jsonlines.org/)).
+- The [`<document-api>`](/en/reference/applications/services/container#document-api) element must be enabled in the container before documents can be fed or accessed - see [example](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation/app/services.xml).
+- For automation, see example usage in a [GitHub Action](https://github.com/vespa-engine/documentation/blob/master/.github/workflows/feed.yml). This action uses security credentials in `VESPA_CLI_DATA_PLANE_CERT` and `VESPA_CLI_DATA_PLANE_KEY` for easy security management in GitHub.
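+
+To make the two accepted input formats concrete, the sketch below writes the same operations as a JSON array and as JSONL. The put and update operations reuse the examples from the cheat sheet above, while the helper names are illustrative only.
+
+```python
+import json
+
+# Feed operations in the same shape as the cheat sheet examples above.
+operations = [
+    {
+        "put": "id:mynamespace:music::a-head-full-of-dreams",
+        "fields": {"album": "A Head Full of Dreams", "artist": "Coldplay"},
+    },
+    {
+        "update": "id:mynamespace:music::a-head-full-of-dreams",
+        "fields": {"album": {"assign": "A Head Full of Thoughts"}},
+    },
+]
+
+
+def to_json_array(ops) -> str:
+    # Format 1: a single JSON array holding all operations.
+    return json.dumps(ops, indent=2)
+
+
+def to_jsonl(ops) -> str:
+    # Format 2: JSONL, one complete JSON operation per line.
+    return "\n".join(json.dumps(op) for op in ops) + "\n"
+
+
+if __name__ == "__main__":
+    print(to_json_array(operations))
+    print(to_jsonl(operations), end="")
+```
+
+Either output can be saved to a file and passed to `vespa feed`, or piped to `vespa feed -`.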
+
+
+### Queries
+
+```bash expandable
+# Query for all documents in all schemas / sources
+$ vespa query 'yql=select * from sources * where true'
+
+# YQL parameter is assumed if missing - this is equivalent to the above
+$ vespa query 'select * from sources * where true'
+
+# Query with an extra query API parameter
+$ vespa query 'select * from music where album contains "head"' \
+ hits=5
+
+# Use verbose to print a curl equivalent, too
+$ vespa query -v 'select * from music where album contains "head"' hits=5
+
+# Query a different port (after modifying http server port)
+$ vespa query 'select * from sources * where true' -t 'http://127.0.0.1:9080'
+
+# Use a query file - useful for large queries, e.g., when using query vectors
+$ vespa query --file queries-vector.json
+```
+Example query file:
+
+```json
+{
+ "yql": "select product_id, title from products where {totalTargetHits: 200}nearestNeighbor(dense_embedding, q_vector)",
+ "input.query(q_vector)": [-0.050548091530799866, ... ,0.028366032987833023],
+ "ranking": "vector_distance"
+}
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/clients/vespa-feed-client.mdx b/mintlify-docs/en/clients/vespa-feed-client.mdx
new file mode 100644
index 0000000000..47b8f2a2ff
--- /dev/null
+++ b/mintlify-docs/en/clients/vespa-feed-client.mdx
@@ -0,0 +1,145 @@
+---
+title: "vespa-feed-client"
+sidebarTitle: "Java feed client"
+---
+
+- Java library and command line client for feeding document operations using [Document v1](/en/writing/document-v1-api-guide) over [HTTP/2](/en/performance/http2).
+- Asynchronous, high-performance Java implementation, with retries and dynamic throttling.
+- Supports a JSON array of feed operations, as well as [JSONL](https://jsonlines.org): one operation JSON per line.
+
+## Installing
+
+### Java library
+
+The Java library is available as a [Maven JAR artifact](https://search.maven.org/search?q=g:com.yahoo.vespa%20a:vespa-feed-client) at Maven Central. It requires JDK 17 or later.
+
+Find an example application using this client at [client-java](https://github.com/vespa-engine/sample-apps/blob/master/examples/clients/client-java/README.md).
+
+### Command line client
+
+Two alternatives:
+
+- Install [_vespa-clients_/_vespa_](/en/operations/self-managed/build-install) RPM package.
+- Download [vespa-feed-client **zip** artifact](https://search.maven.org/artifact/com.yahoo.vespa/vespa-feed-client-cli) from Maven Central.
+
+Download example:
+
+```bash
+$ F_REPO="https://repo1.maven.org/maven2/com/yahoo/vespa/vespa-feed-client-cli" && \
+ F_VER=$(curl -Ss "${F_REPO}/maven-metadata.xml" | sed -n 's/.*<release>\(.*\)<.*>/\1/p') && \
+ curl -SsLo vespa-feed-client-cli.zip ${F_REPO}/${F_VER}/vespa-feed-client-cli-${F_VER}-zip.zip && \
+ unzip -o vespa-feed-client-cli.zip
+```
+
+## Enable feed endpoint in Vespa
+
+Requirements:
+
+- [Document API must be enabled on container](/en/reference/applications/services/container#document-api).
+
+HTTP/2 over [TLS](/en/reference/applications/services/http#ssl) is optional but recommended from a security perspective.
+
+Example _services.xml_ with TLS:
+
+```xml highlight= {4-13}
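+<?xml version="1.0" encoding="utf-8" ?>
+<services version="1.0">
+    <container version="1.0" id="default">
+        <http>
+            <server id="default" port="443">
+                <ssl>
+                    <private-key-file>/path/to/private-key.pem</private-key-file>
+                    <certificate-file>/path/to/certificate.pem</certificate-file>
+                    <ca-certificates-file>/path/ca-certificates.pem</ca-certificates-file>
+                </ssl>
+            </server>
+        </http>
+        <document-api/>
+    </container>
+</services>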
+
+```
+
+Example _services.xml_ without TLS:
+
+```xml highlight= {4}
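+<?xml version="1.0" encoding="utf-8" ?>
+<services version="1.0">
+    <container version="1.0" id="default">
+        <document-api/>
+    </container>
+</services>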
+```
+
+## Using the client
+
+The Javadoc for the programmatic API is available at [javadoc.io](https://javadoc.io/doc/com.yahoo.vespa/vespa-feed-client-api). See output of `$ vespa-feed-client --help` for usage.
+
+Use `--speed-test` for bandwidth testing.
+
+### Example Java
+
+Add _vespa-feed-client_ as a dependency to your Maven project (or other build system using Maven for dependency management):
+
+```
+<dependency>
+    <groupId>com.yahoo.vespa</groupId>
+    <artifactId>vespa-feed-client</artifactId>
+    <version>8.689.26</version>
+</dependency>
+```
+
+Code examples are listed in the [vespa-feed-client source code](https://github.com/vespa-engine/vespa/tree/master/vespa-feed-client-api/src/test/java/ai/vespa/feed/client/examples) on GitHub.
+
+- [JsonFileFeederExample.java](https://github.com/vespa-engine/vespa/blob/master/vespa-feed-client-api/src/test/java/ai/vespa/feed/client/examples/JsonFileFeederExample.java)
+- [JsonStreamFeederExample.java](https://github.com/vespa-engine/vespa/blob/master/vespa-feed-client-api/src/test/java/ai/vespa/feed/client/examples/JsonStreamFeederExample.java)
+- [SimpleExample.java](https://github.com/vespa-engine/vespa/blob/master/vespa-feed-client-api/src/test/java/ai/vespa/feed/client/examples/SimpleExample.java)
+
+### Example command line
+
+HTTP/2 over TLS:
+
+```bash
+$ vespa-feed-client \
+ --connections 4 \
+ --certificate cert.pem --private-key key.pem --ca-certificates ca.pem \
+ --file /path/to/json/file \
+ --endpoint https://container-endpoint:443/
+```
+
+The input must be either a proper JSON array or a series of JSON feed operations ([JSONL](https://jsonlines.org)), in the format described for the Vespa feed client [here](../reference/schemas/document-json-format#document-operations).
+
+HTTP/2 without TLS:
+
+```bash
+$ vespa-feed-client \
+ --connections 4 \
+ --file /path/to/json/file \
+ --endpoint http://container-endpoint:8080/
+```
+
+## Tuning for multi-worker pipelines
+
+A common pattern is feeding from an [Apache Beam](https://beam.apache.org/) topology (e.g., [Google Cloud Dataflow](https://docs.cloud.google.com/dataflow/docs/overview)). It is important to balance the number of workers and the connection settings.
+
+As each of the workers initializes its own `FeedClient` instance, the default settings can create too many connections. In this example we assume 128 workers and 10 Vespa Container nodes. With defaults (8 connections per endpoint, 128 max streams per connection), 128 workers open 1,024 connections - each requiring a TLS handshake to the endpoint - which is a major source of container CPU overhead.
+
+Recommended configuration per worker ([Javadoc](https://javadoc.io/doc/com.yahoo.vespa/vespa-feed-client-api/latest/ai/vespa/feed/client/package-summary.html)):
+
+```java
+FeedClient client = FeedClientBuilder.create(endpoint)
+ .setConnectionsPerEndpoint(1)
+ .setMaxStreamPerConnection(maxStreams)
+ .setInitialInflightFactor(factor)
+ .build();
+```
+
+- `setConnectionsPerEndpoint(1)`: One connection per worker gives 128 total, which is more than sufficient for 10 container nodes.
+- `setMaxStreamPerConnection(maxStreams)`: Calculate based on the target feed rate and total number of workers. For example, if the target is 50k docs/sec across 128 workers, each worker needs ~390 docs/sec. With typical per-document latency of 5-10 ms, each worker needs ~2-4 concurrent streams.
+- `setInitialInflightFactor(factor)`: The dynamic throttler starts at a low inflight count and slowly ramps up via random walk. If you observe slow ramp-up at the start of a feed job, set this to a higher value (e.g., 4-8) to start closer to the optimal inflight level. The factor multiplies the minimum inflight (2 x connectionsPerEndpoint x endpoints), so with 1 connection and factor 8, you'd start at 16 inflight instead of 2.
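+
+The arithmetic behind these recommendations can be reproduced with a short sketch. The helper names are hypothetical. The stream estimate multiplies the per-worker rate by the per-document latency, and the figures are the example numbers from this section.
+
+```python
+def total_connections(workers: int, connections_per_endpoint: int, endpoints: int = 1) -> int:
+    # Every worker opens its own connections to every endpoint.
+    return workers * connections_per_endpoint * endpoints
+
+
+def streams_per_worker(target_docs_per_sec: float, workers: int, latency_s: float) -> float:
+    # Concurrent streams needed = per-worker rate x per-document latency.
+    return target_docs_per_sec / workers * latency_s
+
+
+def initial_inflight(factor: int, connections_per_endpoint: int, endpoints: int = 1) -> int:
+    # The factor multiplies the minimum inflight: 2 x connections x endpoints.
+    return factor * 2 * connections_per_endpoint * endpoints
+
+
+if __name__ == "__main__":
+    print("default connections:", total_connections(128, 8))
+    print("tuned connections:", total_connections(128, 1))
+    for latency_s in (0.005, 0.010):
+        streams = streams_per_worker(50_000, 128, latency_s)
+        print(f"streams at {latency_s * 1000:.0f} ms latency: {streams:.2f}")
+    print("initial inflight with factor 8:", initial_inflight(8, 1))
+```
+
+Rounding the stream estimate up and leaving some headroom is a reasonable starting point. The dynamic throttler still adjusts the actual inflight count at run time.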
+
+
+ **Important:** Each worker should create a single `FeedClient` instance and reuse it for the lifetime of the worker. Creating new instances per batch or per document group defeats connection reuse and prevents the throttler from converging.
+
+
+Also, use vespa-feed-client 8.657 or later, for the latest improvements to connection handling and stability.
\ No newline at end of file
diff --git a/mintlify-docs/en/cloud/image.png b/mintlify-docs/en/cloud/image.png
new file mode 100644
index 0000000000..497d81f650
Binary files /dev/null and b/mintlify-docs/en/cloud/image.png differ
diff --git a/mintlify-docs/en/cloud/quota.mdx b/mintlify-docs/en/cloud/quota.mdx
new file mode 100644
index 0000000000..33d1e7bae6
--- /dev/null
+++ b/mintlify-docs/en/cloud/quota.mdx
@@ -0,0 +1,15 @@
+---
+title: Quota
+description: "Tenants in Vespa Cloud have a quota that limits the amount of resources a tenant can use. The quota is expressed as *$/hour*, and is based on the maximum possible cost for a Vespa application."
+---
+
+That means, if you are using [autoscaling](/en/operations/autoscaling), the quota it will use is based on the maximum configured size of the application.
+
+You can see how much quota your applications are using in the Vespa Cloud console. The quota a tenant has depends on the [plan](https://vespa.ai/pricing/) the tenant is on:
+
+| Plan | Quota |
+|:-----|:-----|
+| Trial | \$2/hour |
+| All other plans | \$10/hour |
+
+Contact [Support](https://vespa.ai/support) to change the quota.
\ No newline at end of file
diff --git a/mintlify-docs/en/cloud/support.mdx b/mintlify-docs/en/cloud/support.mdx
new file mode 100644
index 0000000000..112e476f55
--- /dev/null
+++ b/mintlify-docs/en/cloud/support.mdx
@@ -0,0 +1,60 @@
+---
+title: "Cloud and Enterprise support"
+sidebarTitle: "Support"
+---
+
+Support options and other resources like status tracking are found at [Vespa Support](https://vespa.ai/support/).
+
+## Create a support case
+
+Open a support case using the support portal at [support.vespa.ai](https://support.vespa.ai/).
+
+Use this for:
+
+- Production support (reads and writes)
+- Deployment support (making changes to applications)
+- Technical support (general, including user access)
+- Feature requests
+
+Use the support portal to track your ongoing cases.
+
+In case of any problems with the support portal itself, mail [support@vespa.ai](mailto:support@vespa.ai).
+
+You must be a [registered user](https://console.vespa-cloud.com/link/tenant/account/users) in your organization's [tenant](/en/learn/tenant-apps-instances.html) in the Vespa Console to create a support case.
+
+## Escalate a support case
+
+Support response times are defined for the different [support levels](https://cloud.vespa.ai/price-calculator).
+
+To escalate a support case for response within defined SLA, first create the case, then use "Escalate to oncall" to page the Vespa Team:
+
+
+
+
+
+An escalation will be acknowledged in the support case ticket within support SLA. Note that non-escalated cases will be handled during regular business hours.
+
+Depending on support level, your organization might have a shared Slack channel with the Vespa Team. Such a channel does not have an SLA, and does not replace the need to create a support ticket. The Slack channel is used on a best-effort basis and can be a useful tool in the support case process.
+
+## Incident management
+
+Depending on the severity of a support case, the Vespa Team might create an incident.
+
+A customer can request the incident process to be initiated in a support case.
+
+### Incident process
+
+An incident creation triggers the incident process. When the incident is resolved, a root cause analysis (RCA) is performed.
+
+During an incident, the support case is updated with relevant status at regular intervals:
+
+- The teams can mutually agree to use a shared Slack-channel for status and coordinated work - in this case, the support case will have a link to the Slack channel
+- The teams can mutually agree on next steps and timing, in the support case ticket or Slack channel.
+
+### Post-mortem
+
+The incident process includes a post-mortem event. Post-mortems are held weekly, on cases that are closed at least two days before the post-mortem, and all relevant information for the post-mortem is made available.
+
+Post-mortems are internal to Vespa.ai. A customer can request a joint post-mortem meeting after the Vespa.ai-internal post-mortem is completed.
+
+A Post-mortem report is shared with the customer within 7 days of the post-mortem event.
\ No newline at end of file
diff --git a/mintlify-docs/en/content/attributes.mdx b/mintlify-docs/en/content/attributes.mdx
new file mode 100644
index 0000000000..dd4e15ddf7
--- /dev/null
+++ b/mintlify-docs/en/content/attributes.mdx
@@ -0,0 +1,292 @@
+---
+title: "Document attributes"
+description: "An *attribute* is a [schema](/en/reference/schemas/schemas#attribute) keyword, specifying the indexing for a field:"
+---
+
+```txt
+field price type int {
+    indexing: attribute
+}
+```
+
+Attribute properties and use cases:
+
+- Flexible [match modes](/en/reference/schemas/schemas#match) including exact match, prefix match, and case-sensitive matching, but not text matching (tokenization and linguistic processing).
+- High sustained update rates (avoiding read-apply-write patterns). Any mutating operation against an attribute field is written to Vespa's [transaction log](/en/content/proton#transaction-log) and persisted, but appending to the log is sequential access, not random. Read more in [partial updates](/en/writing/partial-updates).
+- Instant query updates - values are immediately searchable.
+- [Document Summaries](/en/querying/document-summaries) are memory-only operations if all fields are attributes.
+- [Numerical range queries](/en/reference/querying/yql#numeric).
+
+ ```txt
+ where price > 100
+ ```
+
+- [Grouping](/en/querying/grouping) - aggregate results into groups - it is also great for generating diversity in results.
+
+ ```txt
+ all(group(customer) each(max(3) each(output(summary()))))
+ ```
+
+- [Ranking](/en/basics/ranking) - use attribute values directly in rank functions.
+
+ ```txt
+ rank-profile rank_fresh {
+     first-phase {
+         expression { freshness(timestamp) }
+     }
+ }
+ ```
+
+- [Sorting](/en/reference/querying/sorting-language) - order results by attribute value.
+
+ ```txt
+ order by price asc, release_date desc
+ ```
+
+- [Parent/child](/en/schemas/parent-child) - import attribute values from global parent documents.
+
+ ```txt
+ import field advertiser_ref.name as advertiser_name {}
+ ```
+
+The other field option is *index* - use [index](/en/content/proton#index) for fields used for [text search](/en/querying/text-matching), with [stemming](/en/linguistics/linguistics-opennlp#stemming) and [normalization](/en/linguistics/linguistics-opennlp#normalization).
+
+An attribute is an in-memory data structure. Attributes speed up query execution and [document updates](/en/writing/partial-updates), trading off memory. As data structures are regularly optimized, consider both static and temporary resource usage - see [attribute memory usage](#attribute-memory-usage) below. Use attributes in document summaries to limit access to storage to generate result sets.
+
+
+
+
+
+Configuration overview:
+
+| Setting | Description |
+| --- | --- |
+| **fast-search** | Also see the [reference](/en/reference/schemas/schemas#attribute). Add an [index structure](#index-structures) to improve query performance: ``` field titles type array<string> { indexing: summary \| attribute attribute: fast-search } ``` |
+| **fast-access** | For high-throughput updates, all nodes with a replica should have the attribute loaded in memory. Depending on replication factor and other configuration, this is not always the case. Use [fast-access](/en/reference/schemas/schemas#attribute) to increase feed rates by having replicas on all nodes in memory - see the [reference](/en/reference/schemas/schemas#attribute) and [sizing feeding](/en/performance/sizing-feeding). ``` field titles type array<string> { indexing: summary \| attribute attribute: fast-access } ``` |
+| **distance-metric** | Features like [nearest neighbor search](/en/querying/nearest-neighbor-search) require a [distance-metric](/en/reference/schemas/schemas#distance-metric), and can also have an `hnsw` index to speed up queries. Read more in [approximate nearest neighbor](/en/querying/approximate-nn-hnsw). Pay attention to the field's `index` setting to enable the index: ``` field image_sift_encoding type tensor(x[128]) { indexing: summary \| attribute \| index attribute { distance-metric: euclidean } index { hnsw { max-links-per-node: 16 neighbors-to-explore-at-insert: 500 } } } ``` |
+
+The attribute field's data type decides which data structures are used by the attribute to store values for that field across all documents on a content node. For some data types, a combination of data structures is used:
+
+- *Attribute Multivalue Mapping* stores arrays of values for array and weighted set types.
+- *Attribute Enum Store* stores unique strings for all string attributes and unique values for attributes with [fast-search](/en/content/attributes#fast-search).
+- *Attribute Tensor Store* stores tensor values for all tensor attributes.
+
+In the following illustration, a row represents a document, while a named column represents an attribute.
+
+
+
+
+
+Attributes can be:
+
+| Type | Size | Description |
+| :--- | :--- | :--- |
+| Single-valued | Fixed | Like the "A" attribute, example `int`. The element size is the size of the type, like 4 bytes for an integer. A memory buffer (indexed by Local ID) holds all values directly. |
+| Multi-valued | Fixed | Like the "B" attribute, example `array<int>`. A memory buffer (indexed by Local ID) is holding references (32 bit) to where in the *Multivalue Mapping* the arrays are stored. The *Multivalue Mapping* consists of multiple memory buffers, where arrays of the same size are co-located in the same buffer. |
+| Multi-valued | Variable | Like the "B" attribute, example `array<string>`. A memory buffer (indexed by Local ID) is holding references (32 bit) to where in the *Multivalue Mapping* the arrays are stored. The unique strings are stored in the *Enum Store*, and the arrays in the *Multivalue Mapping* store the references (32 bit) to the strings in the *Enum Store*. The *Enum Store* consists of multiple memory buffers. |
+| Single-valued | Variable | Like the "C" attribute, example `string`. A memory buffer (indexed by Local ID) is holding references (32 bit) to where in the *Enum Store* the strings are stored. |
+| Tensor | Fixed / Variable | Like the "D" attribute, example `tensor(x{},y[64])`. A memory buffer (indexed by Local ID) is holding references (32 bit) to where in the *Tensor Store* the tensor values are stored. The memory layout in the *Tensor Store* depends on the tensor type. |
+
+The "A", "B", "C" and "D" attribute memory buffers have attribute values or references in Local ID (LID) order - see [document meta store](#document-meta-store).
+
+When updating an attribute, the full value is written. This also applies to [multivalue](/en/basics/schemas#document-fields) fields - for example, adding an item to an array:
+
+1. Space for the new array is reserved in a memory buffer
+2. The current value is copied
+3. The new element is written
+
+This means that larger fields will copy more data at updates. It also implies that updates to [weighted sets](/en/reference/schemas/schemas#weightedset) are faster when using numeric keys (less memory and easier comparisons).
+
+Data stored in the *Multivalue Mapping*, *Enum Store* and *Tensor Store* is referenced using 32 bit references. This address space can fill up, at which point feeding is blocked - [learn more](/en/writing/feed-block). For array or weighted set attributes, the max limit on the number of documents that can have the same number of values is approx 2 billion per node. For string attributes or attributes with [fast-search](/en/content/attributes#fast-search), the max limit on the number of unique values is approx 2 billion per node.
+
+## Index structures
+
+Without `fast-search`, attribute access is a memory lookup, being one value or all values, depending on query execution. An attribute is a linear array-like data structure - matching documents potentially means scanning *all* attribute values.
+
+Setting [fast-search](/en/reference/schemas/schemas#attribute) creates an index structure for quicker lookup and search. This consists of a [dictionary](/en/reference/schemas/schemas#dictionary) pointing to posting lists. This uses more memory, and also more CPU when updating documents. It increases steady state memory usage for all attribute types and also adds initialization overhead for numeric types.
+
+The default dictionary is a b-tree of attribute *values*, pointing to an *occurrence* b-tree (posting list) of local doc IDs for each value, exemplified in the A-attribute below. Using `dictionary: hash` on the attribute generates a hash table of attribute values pointing to the posting lists, as in the C-attribute (short posting lists are represented as arrays instead of b-trees):
+
+
+
+
+
+Notes:
+
+- If a value occurs in many documents, the *occurrence* b-tree grows large. For such values, a boolean-occurrence list (i.e. bitvector) is generated in addition to the b-tree.
+- Setting `fast-search` is not observable in the files on disk, other than size.
+- `fast-search` causes a memory increase even for empty fields, due to the extra index structures created. E.g. single value fields will have the "undefined value" when empty, and there is a posting list for this value.
+- The *value* b-tree enables fast range-searches in numerical attributes. This is also available for `hash`-based dictionaries, but slower as a full scan is needed.
+
+Using `fast-search` has many implications, read more in [when to use fast-search](/en/performance/feature-tuning#when-to-use-fast-search-for-attribute-fields).
+
+## Attribute memory usage
+
+Attribute structures are regularly optimized, and this causes temporary resource usage - read more in [maintenance jobs](/en/content/proton#proton-maintenance-jobs). The memory footprint of an attribute depends on a few factors, data type being the most important:
+
+- Numeric (int, long, byte, and double) and Boolean (bit) types - fixed length and fixed cost per document.
+- String type - the footprint depends on the length of the strings and how many unique strings need to be stored.
+
+Collection types like arrays and weighted sets increase memory usage somewhat, but the main factor is the average number of values per document. String attributes are typically the largest attributes, and require the most memory during initialization - use boolean/numeric types where possible. Example (refer to the formulas below):
+
+```js
+schema foo {
+    document bar {
+        field titles type array<string> {
+            indexing: summary | attribute
+        }
+    }
+}
+```
+
+- Assume average 10 values per document, average string length 15, 100k unique strings and 20M documents.
+- Steady state memory usage is approx 1 GB \(20M\*4\*(6/5) + 20M\*10\*4\*(6/5) + 100k\*(15+1+4+4)\*(6/5)\).
+- During initialization (loading attribute from disk) an additional 2.4 GB is allocated \(20M\*10\*(4+4+4)\), for each value:
+  - local document ID
+  - enum value
+  - weight
+- Increasing the average number of values per document to 20 (double) will also double the memory footprint during initialization (4.8 GB).
+
+When doing the capacity planning, keep in mind the maximum footprint, which occurs during initialization. For the steady state footprint, the number of unique values is important for string attributes.
+
+Check the [Example attribute sizing spreadsheet](/assets/attribute-memory-Vespa.xls), with various data types and collection types. It also contains estimates for how many documents a 48 GB RAM node can hold, taking initialization into account.
+
+[Multivalue](/en/basics/schemas#document-fields) attributes use an adaptive approach in how data is stored in memory, and up to 2 billion documents per node are supported.
+
+
+**Pro-tip:**
+
+The proton */state/v1/* interface can be explored for attribute memory usage. This is an undocumented debug-interface, subject to change at any moment - example: *http://localhost:19110/state/v1/custom/component/documentdb/music/subdb/ready/attribute/artist*
+
+
+## Attribute file usage
+
+Attribute data is stored in two locations on disk:
+
+- The attribute store in memory, which is regularly flushed to disk. At startup, the flushed files are used to quickly populate the memory structures, resulting in a much quicker startup compared to generating the attribute store from the source in the document store. The attribute store will temporarily double its disk usage when generating a new flush file, see [attribute flush](/en/content/proton#attribute-flush).
+- The document store on disk. Documents here are used to (re)generate index structures, as well as being the source for replica generation across nodes. Note that the attribute data is stored in the document store regardless of the [summary](/en/querying/document-summaries) configuration.
+
+The different field types use various data types for storage, see below. A conservative rule of thumb for steady-state disk usage is hence twice the data size.
+
+## Sizing
+
+Attribute sizing is not an exact science but rather an approximation, because attributes vary in size: the number of documents, the number of values per document, and the uniqueness of the values are all variable. The components of the attributes that occupy memory are:
+
+| Abbreviation | Concept | Comment |
+| :--- | :--- | :--- |
+| D | Number of documents | Number of documents on the node, or rather the maximum number of local document IDs allocated |
+| V | Average number of values per document | Only applicable for arrays and weighted sets |
+| U | Number of unique values | Only applies for strings or if [fast-search](/en/reference/schemas/schemas#attribute) is set |
+| FW | Fixed data width | sizeof(T) for numerics, 1 byte for strings, 1 bit for boolean |
+| WW | Weight width | Width of the weight in a weighted set, 4 bytes. 0 bytes for arrays. |
+| EIW | Enum index width | Width of the index into the enum store, 4 bytes. Used by all strings and other attributes if [fast-search](/en/reference/schemas/schemas#attribute) is set |
+| VW | Variable data width | strlen(s) for strings, 0 bytes for the rest |
+| PW | Posting entry width | Width of a posting list entry, 4 bytes for singlevalue, 8 bytes for array and weighted sets. Only applies if [fast-search](/en/reference/schemas/schemas#attribute) is set. |
+| PIW | Posting index width | Width of the index into the store of posting lists; 4 bytes |
+| MIW | Multivalue index width | Width of the index into the multivalue mapping; 4 bytes |
+| ROF | Resize overhead factor | Default is 6/5. This is the average overhead in any dynamic vector due to resizing strategy. Resize strategy is 50% indicating that structure is 5/6 full on average. |
+
+### Components
+
+| Component | Formula | Approx Factor | Applies to |
+| :--- | :--- | :--- | :--- |
+| Document vector | D * ((FW or EIW) or MIW) | ROF | FW for singlevalue numeric attributes and MIW for multivalue attributes. EIW for singlevalue string or if the attribute is singlevalue fast-search |
+| Multivalue mapping | D * V * ((FW or EIW) + WW) | ROF | Applicable only for array or weighted sets. EIW if string or fast-search |
+| Enum store | U * ((FW + VW) + 4 + ((EIW + PIW) or EIW)) | ROF | Applicable for strings or if fast-search is set. (EIW + PIW) if fast-search is set, EIW otherwise. |
+| Posting list | D * V * PW | ROF | Applicable if fast-search is set |
+
+### Variants
+
+| Type | Components | Formula |
+| :--- | :--- | :--- |
+| Numeric singlevalue plain | Document vector | D * FW * ROF |
+| Numeric multivalue value plain | Document vector, Multivalue mapping | D * MIW * ROF + D * V * (FW+WW) * ROF |
+| Numeric singlevalue fast-search | Document vector, Enum store, Posting List | D * EIW * ROF + U * (FW+4+EIW+PIW) * ROF + D * PW * ROF |
+| Numeric multivalue value fast-search | Document vector, Multivalue mapping, Enum store, Posting List | D * MIW * ROF + D * V * (EIW+WW) * ROF + U * (FW+4+EIW+PIW) * ROF + D * V * PW * ROF |
+| Singlevalue string plain | Document vector, Enum store | D * EIW * ROF + U * (FW+VW+4+EIW) * ROF |
+| Singlevalue string fast-search | Document vector, Enum store, Posting List | D * EIW * ROF + U * (FW+VW+4+EIW+PIW) * ROF + D * PW * ROF |
+| Multivalue string plain | Document vector, Multivalue mapping, Enum store | D * MIW * ROF + D * V * (EIW+WW) * ROF + U * (FW+VW+4+EIW) * ROF |
+| Multivalue string fast-search | Document vector, Multivalue mapping, Enum store, Posting list | D * MIW * ROF + D * V * (EIW+WW) * ROF + U * (FW+VW+4+EIW+PIW) * ROF + D * V * PW * ROF |
+| Boolean singlevalue | Document vector | D * FW * ROF |
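+
+As a rough illustration, the *Multivalue string plain* formula can be evaluated for the array-of-strings example from the memory usage section (20M documents, 10 values per document, 100k unique strings of average length 15). This is a minimal sketch, not part of Vespa; the function names are made up for this example, and the widths are the defaults from the tables above:
+
+```python
+ROF = 6 / 5  # resize overhead factor from the abbreviation table
+
+
+def multivalue_string_plain(d, v, u, vw, fw=1, ww=0, eiw=4, miw=4):
+    """Steady-state bytes for a multivalue string attribute without fast-search."""
+    document_vector = d * miw * ROF
+    multivalue_mapping = d * v * (eiw + ww) * ROF
+    enum_store = u * (fw + vw + 4 + eiw) * ROF
+    return document_vector + multivalue_mapping + enum_store
+
+
+def initialization_extra(d, v):
+    """Extra bytes while loading: local document ID, enum value and weight per value."""
+    return d * v * (4 + 4 + 4)
+
+
+steady = multivalue_string_plain(d=20_000_000, v=10, u=100_000, vw=15)
+extra = initialization_extra(d=20_000_000, v=10)
+print(f"steady state: {steady / 1e9:.2f} GB, extra during load: {extra / 1e9:.1f} GB")
+```
+
+With these inputs the steady-state estimate is about 1.06 GB and the additional load-time allocation is 2.4 GB, matching the figures quoted in the example. Use `ww=4` for a weighted set.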
+
+## Paged attributes
+
+Regular attribute fields are guaranteed to be in-memory, while the [paged](/en/reference/schemas/schemas#attribute) attribute setting allows paging the attribute data out of memory to disk. The `paged` setting is *not* supported for the following types:
+
+- [tensor](/en/reference/schemas/schemas#tensor) with [fast-rank](/en/reference/schemas/schemas#attribute).
+- [predicate](/en/reference/schemas/schemas#predicate).
+
+For attribute fields using [fast-search](/en/reference/schemas/schemas#attribute), the memory needed for dictionary and index structures is never paged out to disk.
+
+Using the `paged` setting for attributes is an alternative when there are memory resource constraints and the attribute data is only accessed by a limited number of hits per query during ranking. E.g. a dense tensor attribute which is only used during a [re-ranking phase](/en/ranking/phased-ranking), where the number of attribute accesses is limited by the re-ranking phase count.
+
+For example, using a second phase [total-rerank-count](/en/reference/schemas/schemas#secondphase-total-rerank-count) of 100 will limit the maximum number of page-ins/disk accesses per query to 100. Running at 100 QPS would then need up to 10K disk accesses per second. This is the worst case, where none of the accessed attribute data is already paged into memory. The actual number depends on access locality and memory pressure (size of the attribute data versus available memory).
+
+In this example, we have a dense tensor with 1024 [int8](/en/reference/ranking/tensor#tensor-type-spec) values. The tensor attribute is only accessed during re-ranking (second-phase ranking expression):
+
+```txt
+schema foo {
+    document foo {
+        field tensordata type tensor<int8>(x[1024]) {
+            indexing: attribute
+            attribute: paged
+        }
+    }
+    rank-profile foo {
+        first-phase {}
+        second-phase {
+            total-rerank-count: 100
+            expression: sum(attribute(tensordata))
+        }
+    }
+}
+```
+
+For some use cases where serving latency SLA is not strict and query throughput is low, the `paged` attribute setting might be a tuning alternative, as it allows storing more data per node.
+
+### Paged attributes disadvantages
+
+The disadvantages of using *paged* attributes are many:
+
+- Unpredictable query latency as attribute access might touch disk. Limited queries per second throughput per node (depends on the locality of document re-ranking requests).
+- Paged attributes are implemented by file-backed memory mappings. The performance depends on the [Linux virtual memory management](https://tldp.org/LDP/tlk/mm/memory.html) ability to page data in and out. Using many threads per search or a high query throughput might cause high system (kernel) CPU usage and system unresponsiveness.
+- The content node's total memory utilization will be close to 100% when using paged attributes. It's up to the Linux kernel to determine what part of the attribute data is paged into memory based on access patterns. A good understanding of how the Linux virtual memory management system works is recommended before enabling paged attributes.
+- The [memory usage metrics](/en/performance/sizing-search#metrics-for-vespa-sizing) from content nodes do not reflect reality when using paged attributes. They can indicate a usage that is much higher than the available memory on the node. This is because attribute memory usage is reported as the amount of data contained in the attribute, and whether this data is paged out to disk is controlled by the Linux kernel.
+- Using paged attributes doubles the disk usage of attribute data. For example, if the original attribute size is 92 GB (100M documents of the above 1024 int8 per document schema), using the `paged` setting will double the attribute disk usage to close to 200 GB.
+- Changing the `paged` setting (e.g. removing the option) on a running system might cause hard out-of-memory situations as without `paged`, the content nodes will attempt loading the attribute into memory without the option for page outs.
+- Using a paged attribute in [first-phase](/en/ranking/phased-ranking) ranking can result in extremely high query latency if a large amount of the corpus is retrieved by the query. The number of disk accesses will, in the worst case, be equal to the number of hits the query produces. A similar problem can occur if running a query that searches a paged attribute.
+- Using `paged` in combination with [HNSW indexing](/en/querying/approximate-nn-hnsw) is *strongly* discouraged. *HNSW* indexing also searches and reads tensors during indexing, causing random access during feeding. Once the system memory usage reaches 100%, the Linux kernel will start paging pages in and out of memory. This can cause high system (kernel) CPU usage and slow down HNSW indexing throughput significantly.
+
+## Mutable attributes
+
+[Mutable attributes](/en/reference/schemas/schemas#mutate) are document metadata for matching and ranking performance per document.
+
+The attribute values are mutated as part of query execution, as defined in rank profiles - see [rank phase statistics](/en/ranking/phased-ranking#rank-phase-statistics) for details.
+
+## Document meta store
+
+The document meta store is an in-memory data structure for all documents on a node. It is an *implicit attribute*, and is [compacted](/en/content/proton#lid-space-compaction) and [flushed](/en/content/proton#attribute-flush). Memory usage for applications with small documents / no other attributes can be dominated by this attribute.
+
+The document meta store scales linearly with number of documents - using approximately 30 bytes per document. The metric *content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes* for `"field": "[documentmetastore]"` is the size of the document meta store in memory - use the [metric API](/en/reference/api/state-v1#state-v1-metrics) to find the size - in this example, the node has 9M ready documents with 52 bytes in memory per document:
+
+```json highlight= {10,14}
+{
+ "name": "content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes",
+ "description": "The number of allocated bytes",
+ "values": {
+ "average": 4.69736008E8,
+ "count": 12,
+ "rate": 0.2,
+ "min": 469736008,
+ "max": 469736008,
+ "last": 469736008
+ },
+ "dimensions": {
+ "documenttype": "doctype",
+ "field": "[documentmetastore]"
+ }
+},
+```
+
+The above is for the *ready* documents, also check *removed* and *notready* - refer to [sub-databases](/en/content/proton#sub-databases).
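+
+The per-document figure quoted above can be derived from the metric value. A minimal sketch, assuming the metric entry has already been parsed into a dict shaped like the JSON above, and that the ready document count (9M here) is known from elsewhere; the function name is hypothetical:
+
+```python
+def metastore_bytes_per_document(metric: dict, ready_documents: int) -> float:
+    """Divide the last reported allocated bytes by the number of ready documents."""
+    if metric["dimensions"]["field"] != "[documentmetastore]":
+        raise ValueError("not a document meta store metric")
+    return metric["values"]["last"] / ready_documents
+
+
+metric = {
+    "name": "content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes",
+    "values": {"last": 469736008},
+    "dimensions": {"documenttype": "doctype", "field": "[documentmetastore]"},
+}
+print(round(metastore_bytes_per_document(metric, 9_000_000), 1))
+```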
diff --git a/mintlify-docs/en/content/buckets.mdx b/mintlify-docs/en/content/buckets.mdx
new file mode 100644
index 0000000000..beb55d1346
--- /dev/null
+++ b/mintlify-docs/en/content/buckets.mdx
@@ -0,0 +1,98 @@
+---
+title: "Buckets"
+description: "The content layer splits the document space into chunks called *buckets*, and algorithmically maps documents to buckets by their id. The cluster automatically splits and joins buckets to maintain a uniform distribution across all nodes and to keep bucket sizes within configurable limits."
+---
+
+Documents have string identifiers that map to a 58-bit numeric location. A bucket is defined as all the documents that share a given number of the least significant bits within the location. The number of bits used controls how many buckets will exist. For instance, if a bucket contains all documents whose 8 least significant bits are 0x01, the bucket can be split in two by using the 9th bit in the location to split them. Similarly, buckets can be joined by requiring one less bit in common.
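+
+The split and join rule can be sketched with plain bit masks. This is an illustration only: the function names are made up, and a bucket is represented here as just the shared low bits plus the number of bits used, which is a simplification of how buckets are identified internally.
+
+```python
+def bucket_bits(location: int, used_bits: int) -> int:
+    """Return the low `used_bits` bits of a location, which select its bucket."""
+    return location & ((1 << used_bits) - 1)
+
+
+def split(bucket: int, used_bits: int) -> tuple[int, int]:
+    """Split a bucket in two by also using the next location bit."""
+    return bucket, bucket | (1 << used_bits)
+
+
+def join(bucket: int, used_bits: int) -> int:
+    """Join with the sibling bucket by requiring one less bit in common."""
+    return bucket & ((1 << (used_bits - 1)) - 1)
+
+
+low, high = split(0x01, 8)
+print(hex(low), hex(high))  # the two 9-bit buckets that replace the 8-bit bucket 0x01
+```
+
+A location whose 9th bit is set lands in the second bucket after the split, and joining either 9-bit bucket gives back the original 8-bit bucket.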
+
+## Distribution
+
+Distribution happens in several layers.
+
+ - Documents map to 58 bit numeric locations.
+ - Locations map to buckets.
+ - Buckets map to distributors responsible for handling requests related to those buckets.
+ - Buckets map to content nodes responsible for storing replicas of buckets.
+
+### Document to location distribution
+
+Document identifiers use [document identifier schemes](/en/schemas/documents) to map documents to locations. This makes it possible to co-locate data within buckets by forcing some documents to share least significant bits. Specifying a numeric value or group with the `n` and `g` options overrides the 32 least significant bits of the location. Only use this when required, e.g. when using streaming search for personal search.
+
+### Location to bucket distribution
+
+The cluster state contains a distribution bit count, which is the number of location bits used to generate the buckets that are mapped to distributors.
+
+The cluster state may change the number of distribution bits to adjust the number of buckets distributed at this level. When adding more nodes to the cluster, the number of buckets increases in order for the distribution to remain uniform.
+
+Altering the distribution bit count causes a redistribution of all buckets.
+
+If locations have been overridden to co-localize documents into few units, the distribution of documents into these buckets may be skewed.
+
+### Bucket to distributor distribution
+
+Buckets are mapped to distributors using the ideal state algorithm.
+
+### Bucket to content node distribution
+
+Buckets are mapped to content nodes using the ideal state algorithm. As the content nodes persist data, changing bucket ownership takes more time/resources than on the distributors.
+
+There is usually a replica of a bucket on the same content node as the distributor owning the bucket, as the same algorithm is used.
+
+The distributors may split the buckets further than the distribution bit count indicates, allowing more units to be distributed among the content nodes to create a more even distribution, while not affecting routing from client to distributors.
+
+## Maintenance operations
+
+The content layer defines a set of maintenance operations to keep the cluster balanced. Distributors schedule maintenance operations and issue them to content nodes. Maintenance operations are typically not high priority requests. Scheduling a maintenance operation does not block any external operations.
+
+| | |
+| :--- | :--- |
+| **Split bucket** | Split a bucket in two, by enforcing the documents within the new buckets to have more location bits in common. Buckets are split either because they have grown too big, or because the cluster wants to use more distribution bits. |
+| **Join bucket** | Join two buckets into one. If a bucket has been previously split due to being large, but documents have now been deleted, the bucket can be joined again. |
+| **Merge bucket** | If there are multiple replicas of a bucket, but they do not store the same set of versioned documents, _merge_ is used to synchronize the replicas. A special case of a merge is a one-way merge, which may be done if some of the replicas are to be deleted right after the merge. Merging is used not only to fix inconsistent bucket replicas, but also to move buckets between nodes. To move a bucket, an empty replica is created on the target node, a merge is executed, and the source bucket is deleted. |
+| **Create bucket** | This operation exists merely for the distributor to notify a content node that it is now to store documents for this bucket too. This allows content nodes to refuse operations towards buckets they do not own. The ability to refuse traffic is a safeguard to avoid inconsistencies. If a client talks to a distributor that is no longer working correctly, we would rather have its requests fail than alter the content cluster in strange ways. |
+| **Delete bucket** | Drop stored state for a bucket and reject further requests for it |
+| **(De)activate bucket** | Activate bucket for search results - refer to [bucket management](/en/content/proton#bucket-management) |
+| **Garbage collections** | If configured, documents are periodically garbage collected through background maintenance operations. |
+
+### Bucket split size
+
+The distributors may split existing buckets further to keep bucket sizes at manageable levels, or to ensure more units to split among the backends and their partitions.
+
+Using small buckets, the distribution will be more uniform and bucket operations will be smaller. Using large buckets, less memory is needed for metadata operations and bucket splitting and joining is less frequent.
+
+The size limits may be altered by configuring [bucket splitting](/en/reference/applications/services/content#bucket-splitting).
+
+## Document to bucket distribution
+
+Each document has a document identifier following a document identifier [uri scheme](/en/schemas/documents). From this scheme a 58 bit numeric _location_ is generated. Typically, all the bits are created from an MD5 checksum of the whole identifier.
+
+Schemes specifying a _groupname_ will have the least significant bits of the location set to a hash of the _groupname_. Thus, all documents belonging to that group will have locations with similar least significant bits, which will put them in the same bucket. If buckets end up split far enough to use more bits than the hash bits overridden by the group, the data will be split into many buckets, but each will typically only contain data for that group.
+
+MD5 checksums map document identifiers to random locations. This creates a uniform bucket distribution, and is the default. For some use cases, it is better to co-locate documents, optimizing grouped access - an example is personal documents. By forcing some documents to map to similar locations, these documents are likely to end up in the same actual buckets. There are several use cases where this may be useful:
+
+- When migrating documents for some entity between clusters, this may be implemented more efficiently if the entity is contained in just a few buckets rather than having documents scattered around all the existing buckets.
+- If operations to the cluster are clustered somehow, clustering the documents equally in the backend may make better use of caches. For instance, if a service stores data for users, and traffic is typically created for users at short intervals while the users are actively using the service, clustering user data may allow a lot of the user traffic to be easily cached by generic bucket caches.
+
+If the `n=` option is specified, the 32 least significant bits of the given number override the 32 least significant bits of the location. If the `g=` option is specified, a hash is created of the group name, and the hash value is then used as if it were specified with `n=`. Once the location is calculated, it is mapped to a bucket. Clients map locations to buckets using [distribution bits](#location-to-bucket-distribution).
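+
+The override can be sketched as a bit operation. This is a hedged illustration, not Vespa's implementation: `hash_bits` stands in for whatever 58-bit value the identifier hashes to (the exact derivation from the MD5 checksum is not shown here), and the function name is hypothetical.
+
+```python
+MASK_32 = (1 << 32) - 1
+MASK_58 = (1 << 58) - 1
+
+
+def location_with_n(hash_bits: int, n: int) -> int:
+    """Keep the upper 26 bits of the hash and take the lower 32 bits from n."""
+    upper = hash_bits & MASK_58 & ~MASK_32
+    return upper | (n & MASK_32)
+
+
+# Two documents with different hashes but the same n share their 32 low bits,
+# so they fall in the same bucket as long as at most 32 bits are in use.
+a = location_with_n(0x123456789ABCDEF, n=1234)
+b = location_with_n(0x0FEDCBA98765432, n=1234)
+print(a & MASK_32 == b & MASK_32)
+```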
+
+Distributors map locations to buckets by searching their bucket database, which is sorted in inverse location order. The common case is that there is exactly one matching bucket. If there are several, there is currently inconsistent bucket splitting. If there are none, the distributor will create a new bucket for the request if it is a request that may create new data. Typically, new buckets are generated split according to the distribution bit count.
+
+Content nodes should rarely need to map documents to buckets, as distributors specify bucket targets for all requests. However, as external operations are not queued during bucket splits and joins, the content nodes remap operations to avoid having to fail them due to a bucket having recently been split or joined.
+
+### Limitations
+
+One basic limitation to the document to location mapping is that it may never change. If it changes, then documents will suddenly be in the wrong buckets in the cluster. This would violate a core invariant in the system, and is not supported.
+
+To allow new functionality, document identifier schemes that map to locations in new ways may be created, or existing schemes extended, but the already existing ones must map the same way as they have always done.
+
+Current document identifier schemes typically allow the 32 least significant bits to be overridden for co-localization, while the remaining 26 bits are reserved for bits created from the MD5 checksum.
+
+### Splitting
+
+When there are enough documents co-localized to the same bucket, causing the bucket to be split, it will typically need to split past the 32 LSB. At this split-level and beyond, there is no longer a 1-1 relationship between the node owning the bucket and the nodes its replica data will be stored on.
+
+The effect of this is that documents sharing a location will be spread across nodes in the entire cluster once they reach a certain size. This enables efficient parallel processing.
+
+## Bucket space
+
+Buckets exist in the _default_ or _global_ bucket space.
\ No newline at end of file
diff --git a/mintlify-docs/en/content/consistency.mdx b/mintlify-docs/en/content/consistency.mdx
new file mode 100644
index 0000000000..b6af387670
--- /dev/null
+++ b/mintlify-docs/en/content/consistency.mdx
@@ -0,0 +1,113 @@
+---
+title: "Vespa Consistency Model"
+sidebarTitle: "Consistency Model"
+description: "Vespa offers configurable data redundancy with eventual consistency across replicas. It's designed for high efficiency under workloads where eventual consistency is an acceptable tradeoff. This document aims to go into some detail on what these tradeoffs are, and what you, as a user, can expect."
+---
+
+## Vespa and CAP
+
+Vespa may be considered a limited subset of AP under the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem).
+
+Under CAP, there is a fundamental limitation of whether any distributed system can offer guarantees on consistency (C) or availability (A) in scenarios where nodes are partitioned (P) from each other. Since there is no escaping that partitions can and will happen, we talk of systems that are _either_ CP or AP.
+
+Consistency (C) in CAP implies that reads and writes are strongly consistent, i.e. the system offers _linearizability_. Weaker forms such as causal consistency or "read your writes" consistency are _not_ sufficient. As mentioned initially, Vespa is an eventually consistent data store and therefore does not offer this property. In practice, Consistency requires the use of a majority consensus algorithm, which Vespa does not currently use.
+
+Availability (A) in CAP implies that _all requests_ receive a non-error response regardless of how the network may be partitioned. Vespa is dependent on a centralized (but fault tolerant) node health checker and coordinator. A network partition may take place between the coordinator and a subset of nodes. Operations to nodes in this subset aren't guaranteed to succeed until the partition heals. As a consequence, Vespa is not _guaranteed_ to be strongly available, so we treat this as a "limited subset" of AP (though this is not technically part of the CAP definition).
+
+In _practice_, the best-effort semantics of Vespa have proven to be both robust and highly available in common datacenter networks.
+
+## Write durability and consistency
+
+When a client receives a successful [write](/en/writing/reads-and-writes) response, the operation has been written and synced to disk. The replication level is configurable. Operations are by default written on _all_ available replica nodes before sending a response. "Available" here means being Up in the [cluster state](/en/content/content-nodes#cluster-state), which is determined by the fault-tolerant, centralized Cluster Controller service. If a cluster has a total of 3 nodes, 2 of these are available and the replication factor is 3, writes will be ACKed to the client if both the available nodes ACK the operation.
+
+On each replica node, operations are persisted to a write-ahead log before being applied. The system will automatically recover after a crash by replaying logged operations. Writes are guaranteed to be synced to durable storage prior to sending a successful response to the client, so acknowledged writes are retained even in the face of sudden power loss.
+
+If a client receives a failure response for a write operation, the operation may or may not have taken place on a subset of the replicas. If not all replicas could be written to, they are considered divergent (out of sync). The system detects and reconciles divergent replicas. This happens without any required user intervention.
+
+Each document write assigns a new wall-clock timestamp to the resulting document version. As a consequence, configure servers with NTP to keep clock drift as small as possible. Large clock drifts may result in timestamp collisions or unexpected operation orderings.
+
+Vespa has support for conditional writes for individual documents through test-and-set operations. Multi-document transactions are not supported.
+
+After a successful response, changes to the search indexes are immediately visible by default.
+
+## Read consistency
+
+Reads are consistent on a best-effort basis and are not guaranteed to be linearizable.
+
+When using a [Get](/en/reference/api/document-v1#get) or [Visit](/en/writing/visiting) operation, the client will never observe a partially updated document. For these read operations, writes behave as if they are atomic.
+
+Searches may observe partial updates, as updates are not atomic across index structures. This can only happen _after_ a write has started, but _before_ it's complete. Once a write is complete, all index updates are visible.
+
+Searches may observe transient loss of coverage when nodes go down. Vespa will restore coverage automatically when this happens. How fast this happens depends on the configured [searchable-copies](/en/reference/applications/services/content#searchable-copies) value.
+
+If replicas diverge during a Get, Vespa performs a read-repair. This fetches the requested document from all divergent replicas. The client then receives the version with the newest timestamp.
+
+If replicas diverge during a Visit, the behavior is slightly different between the Document V1 API and [vespa-visit](/en/reference/operations/self-managed/tools#vespa-visit):
+
+- Document V1 will prefer immediately visiting the replica that contains the most documents. This means it's possible for a subset of documents in a bucket to not be returned.
+- `vespa-visit` will by default retry visiting the bucket until it is in sync. This may take a long time if large parts of the system are out of sync.
+
+The rationale for this difference in behavior is that Document V1 is usually called in a real-time request context, whereas `vespa-visit` is usually called in a background/batch processing context.
+
+Visitor operations iterate over the document corpus in an implementation-specific order. Any given document is returned in the state it was in at the time the visitor iterated over the data bucket containing the document. This means there is no snapshot isolation—a document mutation happening concurrently with a visitor may or may not be reflected in the returned document set, depending on whether the mutation happened before or after iteration of the bucket containing the document.
+
+## Replica reconciliation
+
+Reconciliation is the act of bringing divergent replicas back into sync. This usually happens after a node restarts or fails. It will also happen after network partitions.
+
+Unlike several other eventually consistent databases, Vespa doesn't use distributed replica operation logs. Instead, reconciling replicas involves exchanging sets of timestamped documents. Reconciliation is complete once the union set of documents is present on all replicas. Metadata is checksummed to determine whether replicas are in sync with each other.
+
+When reconciling replicas, the newest available version of a document will "win" and become visible. This version may be a remove (tombstone). Tombstones are replicated in the same way as regular documents.
+
+Reconciliation happens at the document level, not at the field level. I.e. there is no merging of individual fields across different versions.
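+
+The "newest version wins" rule can be sketched as a union over replicas keyed by document ID. This is a simplified model, not the actual merge protocol: the `Version` type and `reconcile` function are invented for illustration, and a tombstone is modeled as a version whose body is `None`.
+
+```python
+from typing import NamedTuple, Optional
+
+
+class Version(NamedTuple):
+    timestamp: int
+    body: Optional[dict]  # None models a remove (tombstone)
+
+
+def reconcile(*replicas: dict) -> dict:
+    """Union of all replicas, keeping the newest whole-document version per ID."""
+    merged = {}
+    for replica in replicas:
+        for doc_id, version in replica.items():
+            current = merged.get(doc_id)
+            if current is None or version.timestamp > current.timestamp:
+                merged[doc_id] = version
+    return merged
+
+
+replica_a = {"doc1": Version(10, {"title": "old"}), "doc2": Version(5, {"title": "x"})}
+replica_b = {"doc1": Version(12, None)}  # doc1 was removed later on this replica
+merged = reconcile(replica_a, replica_b)
+print(merged["doc1"].body is None, "doc2" in merged)
+```
+
+Note that whole versions are compared, never individual fields, and that the newer tombstone for `doc1` replaces the older document.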
+
+If a test-and-set operation updates at least one replica, it will eventually become visible on the other replicas.
+
+The reconciliation operation is referred to as a "merge" in the rest of the Vespa documentation.
+
+Tombstone entries have a configurable time-to-live before they are compacted away. Nodes that have been partitioned away from the network for a longer period of time than this TTL should ideally have their indexes removed before being allowed back into the cluster. If not, there is a risk of resurrecting previously removed documents. Vespa does not currently detect or handle this scenario automatically.
+
+See the documentation on [data-retention-vs-size](/en/operations/self-managed/admin-procedures#data-retention-vs-size).
+
+## Q/A
+
+
+
+When the distributor process that is responsible for a particular data bucket receives a Get operation, it checks its locally cached replica metadata state for inconsistencies.
+
+If all replicas have consistent metadata, the operation is routed to a single replica—preferably located on the same host as the distributor, if present. This is the normal case when the bucket replicas are in sync.
+
+If there is at least one replica metadata mismatch, the distributor automatically initiates a read-repair process:
+
+1. The distributor splits the bucket replicas into subsets based on their metadata, where all replicas in each subset have the same metadata. It then sends a lightweight metadata-only Get to one replica in each subset. The core assumption is that all these replicas have the same set of document versions, and that it suffices to consult one replica in the set. If a metadata read fails, the distributor will automatically fail over to another replica in the subset.
+2. It then sends one full Get to a node in the replica set that returned the _highest_ timestamp.
+
+This means that if you have 100 replicas and 1 has different metadata from the remaining 99, only 2 nodes in total will be initially queried, and only 1 will receive the actual (full) Get read.
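+
+The two steps above can be sketched as follows. This is a simplified model, not the distributor implementation: `plan_read_repair`, the checksum strings and the `metadata_get` callback are hypothetical, and failover to another replica in a subset is left out.
+
+```python
+from collections import defaultdict
+from typing import Callable, Dict, List, Tuple
+
+
+def plan_read_repair(
+    replica_metadata: Dict[int, str],
+    metadata_get: Callable[[int], int],
+) -> Tuple[List[int], int]:
+    """Return (nodes sent a metadata-only Get, node chosen for the full Get).
+
+    replica_metadata maps node index -> metadata checksum (assumed shape).
+    metadata_get(node) returns the newest timestamp the node has for the document.
+    """
+    # Step 1: split replicas into subsets with identical metadata.
+    subsets: Dict[str, List[int]] = defaultdict(list)
+    for node, checksum in sorted(replica_metadata.items()):
+        subsets[checksum].append(node)
+
+    # Query one representative per subset.
+    probed = [nodes[0] for nodes in subsets.values()]
+    timestamps = {node: metadata_get(node) for node in probed}
+
+    # Step 2: the full Get goes to the representative with the highest timestamp.
+    winner = max(probed, key=lambda node: timestamps[node])
+    return probed, winner
+
+
+# 100 replicas: node 42 has diverging metadata and a newer document version.
+metadata = {node: "aaaa" for node in range(100)}
+metadata[42] = "bbbb"
+probed, winner = plan_read_repair(metadata, lambda node: 200 if node == 42 else 100)
+```
+
+With this input only two nodes receive the metadata-only Get, and node 42 receives the single full Get, matching the 100-replica example above.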
+
+Similar algorithms are used by other operations that may trigger read/write-repair.
+
+
+
+Unfortunately not. Vespa does not offer any cross-document transactions, so in this case strong consistency implies single-object _linearizability_ (as opposed to _strict serializability_ across multiple objects). Linearizability requires the ability to reach a majority consensus amongst a particular known and stable configuration of replicas (side note: replica sets can be reconfigured in strongly consistent algorithms like Raft and Paxos, but such a reconfiguration must also be threaded through the consensus machinery).
+
+The active replica set for a given data bucket (and thus the documents it logically contains) is ephemeral and dynamic based on the nodes that are currently available in the cluster (as seen from the cluster controller). This precludes having a stable set of replicas that can be used for reaching majority decisions.
+
+See also [Vespa and CAP](#vespa-and-cap).
+
+
+
+Stale document versions may be returned when all replicas containing the most recent document version have become unavailable.
+
+Example scenario (for simplicity—but without loss of generality—assuming redundancy 1) in a cluster with two nodes `{A, B}`:
+
+1. Document X is stored in a replica on node A with timestamp 100.
+2. Node A goes down; node B takes over ownership.
+3. A write request is received for document X; it is stored on node B with timestamp 200 and ACKed to the client.
+4. Node B goes down.
+5. Node A comes back up.
+6. A read request arrives for document X. The only visible replica is on node A, which ends up serving the request.
+7. The document version at timestamp 100 is returned to the client.
+
+Since the write at `t=200` _happens-after_ the write at `t=100`, returning the version at `t=100` violates linearizability.
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/content/content-nodes.mdx b/mintlify-docs/en/content/content-nodes.mdx
new file mode 100644
index 0000000000..eb6f19741a
--- /dev/null
+++ b/mintlify-docs/en/content/content-nodes.mdx
@@ -0,0 +1,330 @@
+---
+title: "Content nodes and states"
+---
+
+
+
+
+
+Content cluster processes are *distributor*, *proton* and *cluster controller*.
+
+The distributor calculates the correct content node using the distribution algorithm and the [cluster state](#cluster-state). With no known cluster state, the client library will send requests to a random node, which replies with the updated cluster state if the node was incorrect. Cluster states are versioned, such that clients hitting outdated distributors do not override updated states with old states.
+
+The [distributor](#distributor) keeps track of which content nodes store replicas of each bucket (maximum one replica each), based on [redundancy](/en/reference/applications/services/content#redundancy) and information from the *cluster controller*. A bucket maps to one distributor only. A distributor keeps a bucket database with bucket metadata. The metadata holds which content nodes store replicas of the buckets, the checksum of the bucket content and the number of documents and meta entries within the bucket. Each document is algorithmically mapped to a bucket and forwarded to the correct content nodes. The distributors detect whether there are enough bucket replicas on the content nodes and add/remove as needed. Write operations wait for replies from every replica and fail if fewer than *redundancy* replicas have persisted the operation within the timeout.
+
+The [cluster controller](#cluster-controller) manages the state of the distributor and content nodes. This *cluster state* is used by the document processing chains to know which distributor to send documents to, as well as by the distributor to know which content nodes should have which bucket.
+
+## Cluster state
+
+There are three kinds of state: [unit state](/en/reference/api/cluster-v2#state-unit), [user state](/en/reference/api/cluster-v2#state-user) and [generated state](/en/reference/api/cluster-v2#state-generated) (a.k.a. *cluster state*).
+
+For new cluster states, the cluster state version is incremented, and the new cluster state is broadcast to all nodes. There is a minimum time between each cluster state change.
+
+It is possible to set a minimum capacity for the cluster state to be `up`.
+
+If a cluster has so many nodes unavailable that it is considered down, the state of each node is irrelevant, and thus new cluster states will not be created and broadcast before enough nodes are back for the cluster to come back up. A cluster state indicating the entire cluster is down, may thus have outdated data on the node level.
+
+## Cluster controller
+
+The main task of the cluster controller is to maintain the [cluster state](#cluster-state). This is done by *polling* nodes for state, *generating* a cluster state, which is then *broadcast* to all the content nodes in the cluster. Note that clients do not interface with the cluster controller - they get the cluster state from the distributors - [details](#distributor).
+
+
+| Task | Description |
+| :--- | :--- |
+| Node state polling | The cluster controller polls nodes, sending the current cluster state. If the cluster state is no longer correct, the node returns correct information immediately. If the state is correct, the request lingers on the node, such that the node can reply to it immediately if its state changes. After a while, the cluster controller will send a new state request to the node, even with one pending. This triggers a reply to the lingering request and makes the new one linger instead. Hence, nodes always have a pending state request. During a controlled shutdown, a node starts the shutdown process by responding to the pending state request that it is now stopping. **Note:** As controlled restarts or shutdowns are implemented as TERM signals from the config-sentinel, the cluster controller is not able to distinguish between controlled and other shutdowns. |
+| Cluster state generation | The cluster controller translates unit and user states into the generated cluster state |
+| Cluster state broadcast | When node unit states are received, a cluster controller internal cluster state is updated. New cluster states are distributed with a minimum interval between them. There is also a grace period per unit state - e.g., distributors and content nodes that are on the same node often stop at the same time. The version number is incremented, and the new cluster state is broadcast. If the cluster state version is reset, distributor and content node processes may have to be restarted in order for the system to converge to the new state. Nodes will reject lower cluster state versions to prevent race conditions caused by overlapping cluster controller leadership periods. |
+
+See [cluster controller configuration](/en/operations/self-managed/admin-procedures#cluster-controller-configuration).
+
+### Master election
+
+Vespa can be configured with one cluster controller. Reads and writes will work well in case of cluster controller down, but other changes to the cluster (like a content node going down) will not be handled. It is hence recommended to configure a set of cluster controllers.
+
+The cluster controller nodes elect a master, which does the node polling and cluster state broadcast. The other cluster controller nodes only exist to do master election and potentially take over if the master dies.
+
+All cluster controllers will vote for the cluster controller with the lowest index that says it is ready. If a cluster controller has more than half of the votes, it will be elected master. As a majority vote is required, the number of cluster controllers should be an odd number of 3 or greater. A fresh master will not broadcast states before a transition time is passed, allowing an old master to have some time to realize it is no longer the master.
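+
+The voting rule can be illustrated with a small sketch. The functions `vote` and `elect_master` are hypothetical and only model the rule described above (lowest ready index, strict majority of all configured controllers); the transition time and the underlying coordination mechanism are not modeled.
+
+```python
+from typing import Dict, Optional
+
+
+def vote(ready: Dict[int, bool]) -> Optional[int]:
+    """Each controller votes for the lowest-index controller reporting ready."""
+    candidates = [index for index, is_ready in ready.items() if is_ready]
+    return min(candidates) if candidates else None
+
+
+def elect_master(total_controllers: int, votes: Dict[int, Optional[int]]) -> Optional[int]:
+    """Return the master index if one candidate holds more than half of all votes."""
+    tally: Dict[int, int] = {}
+    for candidate in votes.values():
+        if candidate is not None:
+            tally[candidate] = tally.get(candidate, 0) + 1
+    for candidate, count in tally.items():
+        if 2 * count > total_controllers:
+            return candidate
+    return None
+
+
+# Three configured controllers; controller 0 is down and casts no vote.
+ready = {0: False, 1: True, 2: True}
+votes = {1: vote(ready), 2: vote(ready)}
+master = elect_master(3, votes)
+
+# A single reachable controller out of three cannot reach a majority.
+lone = elect_master(3, {2: vote({2: True})})
+```
+
+Here controller 1 becomes master with two of three votes, while the lone controller in the second case gets no master, which is why an odd count of at least 3 is recommended.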
+
+## Distributor
+
+Buckets are mapped to distributors using the [ideal state algorithm](/en/content/idealstate). As the cluster state changes, buckets are re-mapped immediately. The mapping does not overlap - a bucket is owned by one distributor.
+
+Distributors do not persist the bucket database; the bucket-to-content-node mapping is kept in memory in the distributor. Document count, persisted size and a metadata checksum per bucket are stored as well. At distributor (re)start, content nodes are polled for bucket information, and return which buckets are owned by this distributor (using the ideal state algorithm). There is no centralized bucket directory node. Likewise, at any distributor cluster state change, content nodes are polled for bucket handover - a distributor will then handle a new set of buckets.
+
+Document operations are mapped to content nodes based on bucket locations - each put/update/get/remove is mapped to a [bucket](/en/content/buckets) and sent to the right content nodes. To manage the document set as it grows and nodes change, buckets move between content nodes.
+
+Document API clients (i.e. container nodes with [document-api](/en/reference/applications/services/container#document-api)) do not communicate directly with the cluster controller, and do not know the cluster state at startup. Clients therefore start out by sending requests to a random distributor. If the document operation hits the wrong distributor, `WRONG_DISTRIBUTION` is returned, with the current cluster state in the response. `WRONG_DISTRIBUTION` is hence expected and normal at cold start / state change events.
+
+### Timestamps
+
+[Write operations](/en/writing/reads-and-writes) have a *last modified time* timestamp assigned when passing through the distributor. The timestamp is guaranteed to be unique within the [bucket](/en/content/buckets) where it is stored. The timestamp is used by the content layer to decide which operation is newest. These timestamps can be used when [visiting](/en/writing/visiting), to process/retrieve documents within a given time range. To guarantee unique timestamps, they are in microseconds - the microsecond part is generated to avoid conflicts with other documents.
+
+If documents are migrated *between* clusters, the target cluster will have new timestamps for their entries. Also, when [reprocessing documents](/en/applications/document-processors) *within* a cluster, documents will have new timestamps, even if not modified.
+
+### Ordering
+
+The Document API uses the [document ID](/en/schemas/documents#document-ids) to order operations. A Document API client ensures that only one operation is pending at the same time. This ensures that if a client sends multiple operations for the same document, they will be processed in a defined order. This is done by queueing pending operations *locally* at the client.
+
+
+**Note:**
+
+If sending two write operations to the same document, and the first operation fails, the enqueued operation is sent. In other words, the client does not assume there exists any kind of dependency between separate operations to the same document. If you need to enforce this, use [test-and-set conditions](/en/writing/document-v1-api-guide#conditional-writes) for writes.
+
+
+If *different* clients have pending operations on the same document, the order is unspecified.
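+
+The client-side queueing described in this section can be sketched as below. This is a minimal model, assuming a hypothetical `Sequencer` class and `send` callback; it is not the Document API client implementation.
+
+```python
+from collections import defaultdict, deque
+from typing import Callable, Deque, Dict, Set
+
+
+class Sequencer:
+    """Allows at most one in-flight operation per document ID (hypothetical class)."""
+
+    def __init__(self, send: Callable[[str, str], None]) -> None:
+        self._send = send
+        self._in_flight: Set[str] = set()
+        self._queued: Dict[str, Deque[str]] = defaultdict(deque)
+
+    def submit(self, doc_id: str, operation: str) -> None:
+        if doc_id in self._in_flight:
+            self._queued[doc_id].append(operation)  # wait behind the pending operation
+        else:
+            self._in_flight.add(doc_id)
+            self._send(doc_id, operation)
+
+    def on_reply(self, doc_id: str, success: bool) -> None:
+        # The next operation is sent whether or not the previous one succeeded.
+        queue = self._queued[doc_id]
+        if queue:
+            self._send(doc_id, queue.popleft())
+        else:
+            self._in_flight.discard(doc_id)
+
+
+sent = []
+sequencer = Sequencer(lambda doc_id, op: sent.append((doc_id, op)))
+sequencer.submit("id:ns:music::1", "put-a")
+sequencer.submit("id:ns:music::1", "put-b")  # queued behind put-a
+sequencer.submit("id:ns:music::2", "put-c")  # different document, sent at once
+sequencer.on_reply("id:ns:music::1", success=False)  # put-b is sent anyway
+```
+
+The `success` argument is deliberately ignored in `on_reply`: the sketch mirrors the note above, where a failed operation does not block the next one for the same document.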
+
+### Maintenance operations
+
+Distributors track which content nodes have which buckets in their bucket database. Distributors then use the [ideal state algorithm](/en/content/idealstate) to generate bucket *maintenance operations*. A stable system has all buckets located per the ideal state:
+
+- If buckets have too few replicas, new replicas are generated on other content nodes.
+- If the replicas differ, a bucket merge is issued to get replicas consistent.
+- If a bucket has too many replicas, the superfluous replicas are deleted. Buckets are merged, if inconsistent, before deletion.
+- If two buckets exist, such that both may contain the same document, the buckets are split or joined to remove such overlapping buckets. Read more on [inconsistent buckets](/en/content/buckets).
+- If buckets are too small/large, they will be joined or split.
+
+The maintenance operations have different priorities. If no maintenance operations are needed, the cluster is said to be in the *ideal state*. The distributors synchronize maintenance load with user load, e.g. to remap requests to other buckets after bucket splitting and joining.
+
+### Restart
+
+When a distributor stops, it will try to respond to any pending cluster state request first. New incoming requests after shutdown is commenced will fail immediately, as the socket is no longer accepting requests. Cluster controllers will thus detect processes stopping almost immediately.
+
+The cluster state will be updated with the new state internally in the cluster controller. Then the cluster controller will wait for at most [min_time_between_new_systemstates](https://github.com/vespa-engine/vespa/blob/master/configdefinitions/src/vespa/fleetcontroller.def) before publishing the new cluster state, in order to reduce short-term state fluctuations.
+
+The cluster controller has the option of setting states to make other distributors take over ownership of buckets, or mask the change, making the buckets owned by the distributor restarting unavailable for the time being.
+
+If the distributor transitions from `up` to `down`, other distributors will request metadata from the content nodes to take over ownership of buckets previously owned by the restarting distributor. Until the distributors have gathered this new metadata from all the content nodes, requests for these buckets can not be served, and will fail back to client. When the restarting node comes back up and is marked `up` in the cluster state again, the additional nodes will discard knowledge of the extra buckets they previously acquired.
+
+For requests with timeouts of several seconds, the transition should be invisible due to automatic client resending. Requests with a lower timeout might fail, and it is up to the application whether to resend or handle failed requests.
+
+Requests to buckets not owned by the restarting distributor will not be affected.
+
+## Content node
+
+The content node runs *proton*, which is the query backend.
+
+### Restart
+
+When a content node does a controlled restart, it marks itself in the `stopping` state and rejects new requests. It will process its pending request queue before shutting down. Consequently, client requests are typically unaffected by content node restarts. The currently pending requests will typically be completed. New copies of buckets will be created on other nodes, to store new requests in appropriate redundancy. This happens whether the node transitions through the `down` or the `maintenance` state. The difference is that when transitioning through `maintenance`, the distributor will not start any effort of synchronizing new copies with existing copies. They will just store the new requests until the maintenance node comes back up.
+
+When starting, a content node begins by gathering information on which buckets it has data stored for. While this is happening, the service layer will expose that it is `down`.
+
+## Metrics
+
+| Metric | Description |
+| :--- | :--- |
+| .idealstate.idealstate_diff | This metric tries to create a single value indicating distance to the ideal state. A value of zero indicates that the cluster is in the ideal state. Graphed values of this metric give a good indication of how fast the cluster gets back to the ideal state after changes. Note that some issues may hide other issues, so sometimes the graph may appear to stand still or even go a bit up again, as resolving one issue may have uncovered one or several others. |
+| .idealstate.buckets_toofewcopies | Specifically lists how many buckets have too few copies. Compare to the *buckets* metric to see how big a portion of the cluster this is. |
+| .idealstate.buckets_toomanycopies | Specifically lists how many buckets have too many copies. Compare to the *buckets* metric to see how big a portion of the cluster this is. |
+| .idealstate.buckets | The total number of buckets managed. Used by other metrics reporting bucket counts to know how big a part of the cluster they relate to. |
+| .idealstate.buckets_notrusted | Lists how many buckets have no trusted copies. Without trusted copies, operations against the bucket may have poor performance, having to send requests to many copies to try and create consistent replies. |
+| .idealstate.delete_bucket.pending | Lists how many buckets need to be deleted. |
+| .idealstate.merge_bucket.pending | Lists how many buckets there are, where we suspect not all copies store identical document sets. |
+| .idealstate.split_bucket.pending | Lists how many buckets are currently being split. |
+| .idealstate.join_bucket.pending | Lists how many buckets are currently being joined. |
+| .idealstate.set_bucket_state.pending | Lists how many buckets are currently altered for active state. These are high priority requests which should finish fast, so these requests should seldom be seen as pending. |
+
+Example, using the [quickstart](/en/basics/deploy-an-application-local) - find the distributor port (look for HTTP):
+
+```txt
+$ docker exec vespa vespa-model-inspect service distributor
+
+distributor @ vespa-container : content
+music/distributor/0
+ tcp/vespa-container:19112 (MESSAGING)
+ tcp/vespa-container:19113 (STATUS RPC)
+ tcp/vespa-container:19114 (STATE STATUS HTTP)
+```
+
+Get the metric value:
+
+```txt
+$ docker exec vespa curl -s http://localhost:19114/state/v1/metrics | jq . | \
+ grep -A 10 idealstate.merge_bucket.pending
+
+ "name": "vds.idealstate.merge_bucket.pending",
+ "description": "The number of operations pending",
+ "values": {
+ "average": 0,
+ "sum": 0,
+ "count": 1,
+ "rate": 0.016666,
+ "min": 0,
+ "max": 0,
+ "last": 0
+ },
+```
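+
+For scripted monitoring, the same lookup can be done without `grep`. The sketch below searches a metrics JSON payload for a metric by name and returns its `last` value. The helper names are hypothetical, and the search is deliberately structure-agnostic because the exact layout around the metric entries is an assumption here.
+
+```python
+import json
+from typing import Any, Iterator, Optional
+
+
+def find_metrics(node: Any, name: str) -> Iterator[dict]:
+    """Yield every JSON object whose "name" equals the given metric name."""
+    if isinstance(node, dict):
+        if node.get("name") == name:
+            yield node
+        for value in node.values():
+            yield from find_metrics(value, name)
+    elif isinstance(node, list):
+        for item in node:
+            yield from find_metrics(item, name)
+
+
+def last_value(payload: str, name: str) -> Optional[float]:
+    """Return the "last" value of the first matching metric, or None if absent."""
+    for metric in find_metrics(json.loads(payload), name):
+        return metric.get("values", {}).get("last")
+    return None
+
+
+# Sample shaped like the output above; the surrounding "metrics" wrapper is an assumption.
+sample = json.dumps({"metrics": {"values": [{
+    "name": "vds.idealstate.merge_bucket.pending",
+    "values": {"average": 0, "sum": 0, "count": 1, "last": 0},
+}]}})
+pending = last_value(sample, "vds.idealstate.merge_bucket.pending")
+```
+
+A value of 0 for `pending` indicates that no bucket merges are waiting. In practice the payload would come from the `/state/v1/metrics` endpoint shown above.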
+
+## /cluster/v2 API examples
+
+Examples of state manipulation using the [/cluster/v2 API](/en/reference/api/cluster-v2).
+
+List content clusters:
+
+```txt
+$ curl http://localhost:19050/cluster/v2/
+```
+
+```json
+{
+ "cluster": {
+ "music": {
+ "link": "/cluster/v2/music"
+ },
+ "books": {
+ "link": "/cluster/v2/books"
+ }
+ }
+}
+```
+
+Get cluster state and list service types within cluster:
+
+```txt
+$ curl http://localhost:19050/cluster/v2/music
+```
+
+```json
+{
+ "state": {
+ "generated": {
+ "state": "state-generated",
+ "reason": "description"
+ }
+ },
+ "service": {
+ "distributor": {
+ "link": "/cluster/v2/music/distributor"
+ },
+ "storage": {
+ "link": "/cluster/v2/music/storage"
+ }
+ }
+}
+```
+
+List nodes per service type for cluster:
+
+```txt
+$ curl http://localhost:19050/cluster/v2/music/storage
+```
+
+```json
+{
+ "node": {
+ "0": {
+ "link": "/cluster/v2/music/storage/0"
+ },
+ "1": {
+ "link": "/cluster/v2/music/storage/1"
+ }
+ }
+}
+```
+
+Get node state:
+
+```txt
+$ curl http://localhost:19050/cluster/v2/music/storage/0
+```
+
+```json
+{
+ "attributes": {
+ "hierarchical-group": "group0"
+ },
+ "state": {
+ "generated": {
+ "state": "up",
+ "reason": ""
+ },
+ "unit": {
+ "state": "up",
+ "reason": ""
+ },
+ "user": {
+ "state": "up",
+ "reason": ""
+ }
+ },
+ "metrics": {
+ "bucket-count": 0,
+ "unique-document-count": 0,
+ "unique-document-total-size": 0
+ }
+}
+```
+
+Get all nodes, including topology information (see `hierarchical-group`):
+
+```txt
+$ curl http://localhost:19050/cluster/v2/music/?recursive=true
+```
+
+```json expandable
+{
+ "state": {
+ "generated": {
+ "state": "up",
+ "reason": ""
+ }
+ },
+ "service": {
+ "storage": {
+ "node": {
+ "0": {
+ "attributes": {
+ "hierarchical-group": "group0"
+ },
+ "state": {
+ "generated": {
+ "state": "up",
+ "reason": ""
+ },
+ "unit": {
+ "state": "up",
+ "reason": ""
+ },
+ "user": {
+ "state": "up",
+ "reason": ""
+ }
+ },
+ "metrics": {
+ "bucket-count": 0,
+ "unique-document-count": 0,
+ "unique-document-total-size": 0
+ }
+ }
+ }
+ }
+ }
+}
+```
+
+Set node user state:
+
+```txt
+curl -X PUT -H "Content-Type: application/json" --data '
+ {
+ "state": {
+ "user": {
+ "state": "retired",
+ "reason": "This node will be removed soon"
+ }
+ }
+ }' \
+ http://localhost:19050/cluster/v2/music/storage/0
+```
+
+```json
+{
+ "wasModified": true,
+ "reason": "ok"
+}
+```
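+
+The same state change can be prepared from Python using only the standard library. This sketch builds the PUT request shown above without sending it; `build_set_state_request` is a hypothetical helper, and the host and port are taken from the examples in this section.
+
+```python
+import json
+import urllib.request
+
+
+def build_set_state_request(base_url: str, cluster: str, service: str,
+                            node: int, state: str, reason: str) -> urllib.request.Request:
+    """Build (but do not send) the PUT request that sets a node's user state."""
+    body = {"state": {"user": {"state": state, "reason": reason}}}
+    return urllib.request.Request(
+        url=f"{base_url}/cluster/v2/{cluster}/{service}/{node}",
+        data=json.dumps(body).encode("utf-8"),
+        headers={"Content-Type": "application/json"},
+        method="PUT",
+    )
+
+
+request = build_set_state_request(
+    "http://localhost:19050", "music", "storage", 0,
+    "retired", "This node will be removed soon",
+)
+# To send it: urllib.request.urlopen(request), then parse the JSON reply.
+```
+
+Separating request construction from sending makes it easy to log or review the change before applying it to a live cluster.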
+
+## Further reading
+
+- Refer to [administrative procedures](/en/operations/self-managed/admin-procedures) for configuration and state monitoring / management.
+- Try the [Multinode testing and observability](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode) sample app to get familiar with interfaces and behavior.
diff --git a/mintlify-docs/en/content/elasticity.mdx b/mintlify-docs/en/content/elasticity.mdx
new file mode 100644
index 0000000000..e6297c22d8
--- /dev/null
+++ b/mintlify-docs/en/content/elasticity.mdx
@@ -0,0 +1,141 @@
+---
+title: "Elasticity"
+description: "Vespa clusters can be grown and shrunk while serving queries and writes. Documents in content clusters are automatically redistributed on changes to maintain an even distribution with minimal data movement. To resize, just change the [nodes](/en/reference/applications/services/services#nodes) and redeploy the application - no restarts needed."
+---
+
+
+
+
+
+Documents are managed by Vespa in chunks called [buckets](#buckets). The size and number of buckets are completely managed by Vespa and there is never any need to manually control sharding.
+
+The elasticity mechanism is also used to recover from a node loss: New replicas of documents are created automatically on other nodes to maintain the configured redundancy. Failed nodes are therefore not a problem that requires immediate attention - clusters will self-heal from node failures as long as there are sufficient resources.
+
+
+
+
+
+When you want to remove nodes from a content cluster, you can have the system migrate data off them in an orderly fashion prior to removal. This is done by marking nodes as *retired*. This is useful to remove nodes that should be retired, but also to migrate a cluster to entirely new nodes while online: Add the new nodes, mark the old nodes retired, wait for the data to be redistributed and remove the old nodes.
+
+The auto-elasticity is configured for a normal fail-safe operation, but there are tradeoffs like recovery speed and resource usage. Learn more in [procedures](/en/operations/self-managed/admin-procedures#content-cluster-configuration).
+
+## Adding nodes
+
+To add or remove nodes from a content cluster, just change the `nodes` tag of the [content](/en/reference/applications/services/content) cluster in [services.xml](/en/reference/applications/services/services) and [redeploy](/en/basics/applications#deploying-applications). Read more in [procedures](/en/operations/self-managed/admin-procedures).
+
+When adding a new node, a new *ideal state* is calculated for all buckets. The buckets mapped to the new node are moved, and the superfluous replicas are removed. See redistribution example - add a new node to the system, with redundancy n=2:
+
+
+
+
+
+The distribution algorithm generates a random node sequence for each bucket. In this example with n=2, replicas map to the two nodes sorted first. The illustration shows how placement onto two nodes changes as a third node is added. The new node takes over as primary for the buckets where it got sorted first, and as secondary for the buckets where it got sorted second. This ensures minimal data movement when nodes come and go, and allows capacity to be changed easily.
+
+No buckets are moved between the existing nodes when a new node is added. Based on the pseudo-random sequences, some buckets change from primary to secondary, or are removed. Multiple nodes can be added in the same deployment.
+
+## Removing nodes
+
+Whether a node fails or is *retired*, the same redistribution happens. If the node is retired, replicas are generated on the other nodes and the node stays up, but with no active replicas. Example of redistribution after node failure, n=2:
+
+
+
+
+
+Here, node 2 fails. This node held the active replicas of bucket 2 and 6. Once the node fails the secondary replicas are set active. If they were already in a *ready* state, they start serving queries immediately, otherwise they will index replicas, see [searchable-copies](/en/reference/applications/services/content#searchable-copies). All buckets that no longer have secondary replicas are merged to the remaining nodes according to the ideal state.
+
+## Grouped distribution
+
+Nodes in content clusters can be placed in [groups](/en/reference/applications/services/content#group). A group of nodes in a content cluster will have one or more complete replicas of the entire document corpus.
+
+
+
+
+
+This is useful in the cases listed below:
+
+| | |
+| :--- | :--- |
+| **Cluster upgrade** | With multiple groups it becomes safe to take out a full group for upgrade instead of just one node at a time. [Read more](/en/operations/self-managed/live-upgrade). |
+| **Query throughput** | Applications with high query rates and/or high static query cost can use groups to scale to higher query rates since Vespa will automatically send a query to just a single group. [Read more](/en/performance/sizing-search). |
+| **Topology** | By using groups you can control replica placement over network switches or racks to ensure there is redundancy at the switch and rack level. |
+
+Tuning group sizes and node resources enables applications to easily find the latency/cost sweet spot. The elasticity operations are automatic, and queries and writes work as usual with no downtime.
+
+
+### Pinning groups
+
+While each group contains the same data, they may not return exactly the same search results since each node in each group sees a unique subset of the group's data written in a unique order. This can lead to some inconsistency when users are paging over multiple result pages. To avoid this, Vespa supports *pinning* to a particular search group: Results contain a `searchGroup` field at the top level (root.fields.searchGroup) containing the integer index of the group that produced the result. This can be passed back in the query for the next page as the `model.searchGroup` query parameter.
+
+Note that:
+
+- The searchGroup field is only present in results where it is informative and unique: It will not be present when searching content clusters without multiple groups, or when composed of hits from multiple grouped queries.
+- The model.searchGroup parameter is a soft preference and always safe to pass: If the group is unavailable or non-existent another group will be used.
+
+## Changing topology
+
+A Vespa elasticity feature is the ability to change topology (i.e. grouped distribution) without service disruption. This is a live change, and will auto-redistribute documents to the new topology.
+
+Also read [topology change](/en/operations/self-managed/admin-procedures#topology-change) if running Vespa self-hosted - the below steps are general for all hosting options.
+
+### Replicas
+
+When changing topology, pay attention to the [min-redundancy](/en/reference/applications/services/content#min-redundancy) setting - this setting configures a *minimum* number of replicas in a cluster, the *actual* number is topology dependent - example:
+
+A flat cluster with min-redundancy n=2 and 15 nodes is changed into a grouped cluster with 3 groups with 5 nodes each (total node count and n is kept unchanged). In this case, the actual redundancy will be 3 after the change, as each of the 3 groups will have at least 1 replica for full query coverage. The practical consequence is that disk and memory requirements per node *increase* due to the change to topology. It is therefore important to calculate the actual replica count before reconfiguring topology.
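+
+The replica count in this example can be worked out with a short calculation. The sketch assumes that every group holds the same number of replicas, so the per-group count is min-redundancy divided by the number of groups, rounded up; this rule is an assumption chosen to match the example, so verify it against the [min-redundancy](/en/reference/applications/services/content#min-redundancy) reference before relying on it.
+
+```python
+import math
+
+
+def actual_redundancy(min_redundancy: int, groups: int) -> int:
+    """Replica count when every group holds the same number of replicas.
+
+    Assumption: each group stores ceil(min_redundancy / groups) replicas.
+    """
+    per_group = math.ceil(min_redundancy / groups)
+    return per_group * groups
+
+
+flat = actual_redundancy(min_redundancy=2, groups=1)     # flat cluster
+grouped = actual_redundancy(min_redundancy=2, groups=3)  # 3 groups of 5 nodes
+growth = grouped / flat  # relative data volume per node, node count unchanged
+```
+
+With the total node count fixed at 15, going from 2 to 3 replicas means each node stores about 1.5 times as much data as before.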
+
+### Query coverage
+
+Changing topology might cause query coverage loss in the transition, unless steps are taken in the right order. If full coverage is not important, just make the change and wait for document redistribution to complete.
+
+To keep full query coverage, make sure not to change both group size and number of groups at the same time:
+
+1. To add nodes for more data, or to have less data per node, increase group size. E.g., in a 2-group cluster with 8 nodes per group, add 4 nodes for a 25% capacity increase with 10 nodes per group.
+2. If the goal is to add query capacity, add one or more groups, with the same node count as existing group(s). A flat cluster is the same as one group - if the flat cluster has 8 nodes, change to a grouped cluster with 2 groups of 8 nodes per group. This will add an empty group, which is put in query serving once populated.
+
+In short, if the end-state means both changing number of groups and node count per group, do this as separate steps, as a combination of the above. Between each step, wait for document redistribution to complete using the `merge_bucket.pending` metric - see [example](/en/writing/initial-batch-feed).
+
+## Buckets
+
+To manage documents, Vespa groups them in *buckets*, using hashing or hints in the [document ID](/en/schemas/documents).
+
+A document Put or Update is sent to all replicas of the bucket with the document. If bucket replicas are out of sync, a bucket merge operation is run to re-sync the bucket. A bucket contains [tombstones](/en/operations/self-managed/admin-procedures#data-retention-vs-size) of recently removed documents.
+
+Buckets are split when they grow too large, and joined when they shrink. This is a key feature for high performance in small to large instances, and eliminates the need for downtime or manual operations when scaling. Buckets are purely a content management concept, and data is not stored or indexed in separate buckets, nor do queries relate to buckets in any way. Read more in [buckets](/en/content/buckets).
+
+## Ideal state distribution algorithm
+
+The [ideal state distribution algorithm](/en/content/idealstate) uses a variant of the [CRUSH algorithm](https://ceph.io/assets/pdfs/weil-crush-sc06.pdf) to decide bucket placement. It makes a minimal number of documents move when nodes are added or removed. Central to the algorithm is the assignment of a node sequence to each bucket:
+
+
+
+
+
+Steps to assign a bucket to a set of nodes:
+
+
+
+Seed a random generator with the bucket ID to generate a pseudo-random sequence of numbers. Using the bucket ID as seed will then always generate the same sequence for the bucket.
+
+
+Order the nodes by [distribution-key](/en/reference/applications/services/content#node) and assign the random numbers in that order. E.g. a node with distribution-key 0 will get the first random number, node 1 the second.
+
+
+Sort the node list by the random number.
+
+
+Select nodes in descending random number order - above, node 1, 3 and 0 will store bucket 0x3c000000000000a0 with n=3 (redundancy). For n=2, node 1 and 3 will store the bucket. This specification of where to place a bucket is called the bucket's *ideal state*.
+
+
+
+Repeat this for all buckets in the system.
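+
+The steps above can be sketched as follows. This is a minimal illustration, not Vespa's implementation: the function name `ideal_nodes` is hypothetical, and Python's `random.Random` stands in for the pseudo-random generator Vespa actually uses, so the resulting node orders will differ from those of a real cluster.
+
+```python
+import random
+
+
+def ideal_nodes(bucket_id: int, distribution_keys: list[int], redundancy: int) -> list[int]:
+    """Return the distribution keys preferred for storing the bucket, best first."""
+    rng = random.Random(bucket_id)  # step 1: seed with the bucket ID
+    available = set(distribution_keys)
+    scores = {}
+    # Step 2: draw one number per distribution key, in key order. Keys that are
+    # not in use still consume a number, so the other nodes keep their numbers.
+    for key in range(max(distribution_keys) + 1):
+        r = rng.random()
+        if key in available:
+            scores[key] = r
+    # Steps 3 and 4: sort by drawn number, descending, and keep `redundancy` nodes.
+    ordered = sorted(scores, key=scores.get, reverse=True)
+    return ordered[:redundancy]
+
+
+if __name__ == "__main__":
+    bucket = 0x3C000000000000A0
+    print(ideal_nodes(bucket, [0, 1, 2, 3], redundancy=3))
+    # Removing a node leaves the relative order of the remaining nodes unchanged.
+    print(ideal_nodes(bucket, [0, 2, 3], redundancy=3))
+```
+
+Because every key draws the same number regardless of which other nodes exist, removing a node only affects the buckets that node was selected for.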
+
+## Consistency
+
+Consistency is maintained at bucket level. Content nodes calculate local checksums based on the bucket contents, and the distributors compare checksums across the bucket replicas. A *bucket merge* is issued to resolve inconsistency, when detected. While there are inconsistent bucket replicas, operations are routed to the "best" replica.
+
+As buckets are split and joined, it is possible for replicas of a bucket to be split at different levels. A node may have been down while its buckets have been split or joined. This is called *inconsistent bucket splitting*. Bucket checksums can not be compared across buckets with different split levels. Consequently, content nodes do not know whether all documents exist in enough replicas in this state. Due to this, inconsistent splitting is one of the highest maintenance priorities. After all buckets are split or joined back to the same level, the content nodes can verify that all the replicas are consistent and fix any detected issues with a merge. [Read more](/en/content/consistency).
+
+## Further reading
+
+- [content nodes](/en/content/content-nodes)
+- [proton](/en/content/proton) - see *ready* state
diff --git a/mintlify-docs/en/content/idealstate.mdx b/mintlify-docs/en/content/idealstate.mdx
new file mode 100644
index 0000000000..8a03c1020c
--- /dev/null
+++ b/mintlify-docs/en/content/idealstate.mdx
@@ -0,0 +1,246 @@
+---
+title: "Distribution algorithm"
+---
+
+The distribution algorithm decides which nodes should be responsible for a given bucket. Clients use it directly to calculate which distributor to talk to. Content nodes need time to move buckets when the distribution is changing, so routing to content nodes is done using tracked current state. The distribution algorithm nevertheless decides which content nodes should store the bucket copies, and for this reason the algorithm is also referred to as the ideal state algorithm.
+
+The input to the distribution algorithm is a bucket identifier, together with knowledge about what nodes are available, and what their capacities are.
+
+The output of the distribution algorithm is a sorted list of the available nodes. The first node in the order is the node most preferred to handle a given bucket. Currently, the highest order distributor node will be the owning distributor, and the redundancy factor decides how many of the highest order content nodes are preferred to store copies for a bucket.
+
+To enable minimal transfer of buckets when the list of available nodes changes, the removal or addition of nodes should not alter the sort order of the remaining nodes.
+
+Desired qualities for the ideal state algorithm:
+
+| | |
+| :--- | :--- |
+| **Minimal reassignment on cluster state change** | - If a node goes down, only buckets that resided on that node should be reassigned. - If a node comes up, only buckets that are moved to the new node should relocate. - Increasing the capacity of a single node should only move buckets to that node. - Reducing the capacity of a single node should only move buckets away from that node. |
+| **No skew in distribution** | - Nodes should get an amount of data relative to their capacity. |
+| **Lightweight** | - A simple algorithm that is easy to understand is a plus. Being lightweight to calculate is also a plus, giving more options of how to use it, without needing to cache results. |
+
+## Computational cost
+
+When considering how efficient the algorithm has to be, it is important to consider how often we need to calculate the ideal locations. Calculations are needed for the following tasks:
+
+- A client needs to map buckets to the distributors. If few buckets exist, all the results can be cached in clients, but for larger clusters, a lot of buckets may need to exist to create an even distribution, and caching becomes more memory intensive. Preferably, the computation is cheap enough that no caching is needed. Currently, no caching is done by clients, but there are typically fewer than a million buckets, so caching all results would still have been viable.
+- Distributors need to calculate ideal state for a single bucket to verify that incoming operations are mapped to the correct distributor (clients have cluster state matching the distributor). This could be eliminated for buckets pre-existing in the bucket database, which would be true in almost all cases. Currently, the calculation is done for all requests.
+- Distributors need to calculate correct content nodes to create bucket copies on when operations to currently non-existing buckets come in. This is typically only something happening at the start of the cluster lifetime though. Normally buckets are created through splitting or joining existing buckets.
+- Distributors need to calculate ideal state to check if any maintenance operations need to be done for a bucket.
+- Content nodes need to calculate ideal state for a single bucket to verify that the correct distributor sent the request. This could be cached or served through bucket database but currently there is no need.
+
+As long as the algorithm is cheap, we can avoid caching the result. A cache will then not limit scalability, and we have fewer dependencies and less complexity within the content layer. The current algorithm has proved cheap enough that little caching has been needed.
+
+## A simple example: Modulo
+
+A simple approach would be to use a modulo operation to find the most preferred node, and then just order the nodes in configured order from there, skipping nodes that are currently not available:
+
+
+$$
+most\ preferred\ node = bucket\ \%\ nodecount
+$$
+
+Properties:
+
+- Computationally lightweight and easy to understand.
+- Perfect distribution among nodes.
+- Total redistribution on state change.
+
+By just skipping currently unavailable nodes, nodes can go down and up with minimal movement. However, if the number of configured nodes changes, practically all buckets will be redistributed. As the content layer is intended to be scalable, this conflicts with one of its core goals, and this algorithm has thus not been considered.
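+
+To make the redistribution cost concrete, the following sketch counts how many buckets change their most preferred node when a node is added under modulo placement. The helper names are hypothetical and buckets are plain integers here, which is a simplification of real bucket identifiers.
+
+```python
+def modulo_node(bucket: int, node_count: int) -> int:
+    """Most preferred node under plain modulo placement."""
+    return bucket % node_count
+
+
+def fraction_moved(bucket_count: int, old_nodes: int, new_nodes: int) -> float:
+    """Fraction of buckets whose most preferred node changes with the node count."""
+    moved = sum(
+        1
+        for bucket in range(bucket_count)
+        if modulo_node(bucket, old_nodes) != modulo_node(bucket, new_nodes)
+    )
+    return moved / bucket_count
+
+
+if __name__ == "__main__":
+    # Growing from 10 to 11 nodes: ideally about 1/11 of the buckets would move.
+    print(fraction_moved(1000, 10, 11))
+```
+
+With 1000 buckets, going from 10 to 11 nodes moves 90% of the buckets, whereas an ideal algorithm would move roughly 9%.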
+
+## Weighted random election
+
+This is the algorithm that is currently used for distribution in the content layer, as it fits our use case well.
+
+To avoid a total redistribution on state change, the mapping can not be heavily dependent on the number of nodes in the cluster. By using random numbers, we can distribute the buckets randomly between the nodes, in such a fashion that altering the cluster state has a small impact. As we need the result to be reproducible, we obviously need to use a pseudo-random number generator and not real random numbers.
+
+The idea is as follows. To find the location of a given bucket, seed a random number generator with the bucket identifier, then draw one number for each node. The drawn numbers then decide the preferred node order for that specific bucket.
+
+For this to be reproducible, every calculation needs to draw the same numbers for the same nodes each time. Each node is assigned a distribution key in the configuration. This key decides which random number the node is assigned. For instance, a node with distribution key 13 will be assigned the 14th random number generated (as the first goes to the node with key 0). The existence of this node then also requires us to always generate at least 14 random numbers to do the calculation.
+
+Thus, one may end up calculating random numbers for nodes that are currently not available, either because they are temporarily down, or because the configuration has left holes in the distribution key space. It is recommended not to leave large holes in the distribution key space, to avoid wasting computation.
+
+Using this approach, if you add another node to the cluster, it will roll for each bucket. It should thus steal ownership of some of the buckets. As all the numbers are random, it will steal buckets from all the other nodes, thus, given that the bucket count is large compared to the number of nodes, it will steal on average 1/n of the buckets from each pre-existing node, where n is the number of nodes in the current cluster. Likewise, if a node is removed from the cluster, the remaining nodes will divide the extra load between them.
+
+### Weighting nodes
+
+By enforcing all the numbers drawn to be floating point numbers between 0 and 1, we can introduce node weights using the following formula:
+
+$$
+r^{1/c}
+$$
+
+Where r is the floating point number between 0 and 1 that was drawn for a given node, and c is the node capacity, which is the weight of the node. A full proof is not included here, but this gives each node, on average, an amount of data proportional to its capacity. That is, for any two nodes X and Y, the number of buckets given to X should equal the number of buckets given to Y multiplied by capacity(X)/capacity(Y), given a perfect random distribution.
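+
+As a sketch of why this holds in the simplest case (two nodes X and Y, assuming the drawn numbers are independent and uniform on the interval from 0 to 1; this derivation is an illustration and not part of the original text), let $s = r^{1/c}$ be a node's weighted score. Its distribution function is
+
+$$
+P(s \le x) = P(r \le x^{c}) = x^{c}, \qquad 0 \le x \le 1
+$$
+
+so the probability that X outscores Y is
+
+$$
+P(s_X > s_Y) = \int_0^1 c_X x^{c_X - 1} \cdot x^{c_Y} \, dx = \frac{c_X}{c_X + c_Y}
+$$
+
+The ratio of buckets won by X to buckets won by Y is then $c_X / c_Y$, matching the capacity ratio stated above.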
+
+Altering the weight in a running system also causes minimal redistribution of data. If we reduce the capacity of a node, all of that node's numbers are reduced, and some of its buckets are taken over by the other nodes, and vice versa if the capacity is increased.
+
+Properties:
+
+- Minimum data movement on state changes.
+- Some skew, depending on how good the random number generator is, the amount of nodes we have to divide buckets between, and the number of buckets we have to divide between them.
+- Fairly cheap to compute given a reasonable amount of nodes, and an inexpensive pseudo-random number generator.
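+
+A minimal sketch of weighted random election, under stated assumptions: `weighted_order` and `share_per_node` are hypothetical names, capacities are given as a plain dict from distribution key to capacity, and Python's `random.Random` stands in for Vespa's actual generator.
+
+```python
+import random
+
+
+def weighted_order(bucket_id: int, capacities: dict[int, float]) -> list[int]:
+    """Order distribution keys by weighted score r ** (1 / c), best first."""
+    rng = random.Random(bucket_id)
+    scores = {}
+    for key in range(max(capacities) + 1):
+        r = rng.random()  # drawn for every key, so holes do not shift numbers
+        if key in capacities:
+            scores[key] = r ** (1.0 / capacities[key])
+    return sorted(scores, key=scores.get, reverse=True)
+
+
+def share_per_node(capacities: dict[int, float], bucket_count: int = 20000) -> dict[int, float]:
+    """Fraction of buckets for which each node is the most preferred node."""
+    counts = {key: 0 for key in capacities}
+    for bucket_id in range(bucket_count):
+        counts[weighted_order(bucket_id, capacities)[0]] += 1
+    return {key: count / bucket_count for key, count in counts.items()}
+
+
+if __name__ == "__main__":
+    # Node 2 has twice the capacity of nodes 0 and 1, so it should win
+    # roughly half of the buckets, with some skew from randomness.
+    print(share_per_node({0: 1.0, 1: 1.0, 2: 2.0}))
+```
+
+Lowering the capacity of one node only lowers that node's scores, so buckets can only move away from it, which is the minimal-movement property listed above.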
+
+### Distribution skew
+
+The algorithm does generate a bit of skew in the distribution, as it is essentially random. The following attributes decrease the skew:
+
+- Having more buckets to distribute.
+- Having fewer targets (nodes and partitions) to distribute buckets to.
+- Having a more uniform pseudo-random function.
+
+The more buckets exist, the more metadata needs to be tracked in the distributors, though, and operations that want to scan all the buckets will take longer. Additionally, the backend may want buckets above a given size to improve performance, storage efficiency or similar. Consequently, we typically want to enforce enough buckets for a decent distribution, but not more.
+
+When the number of nodes increases, more buckets need to exist to keep the distribution even. If the number of nodes is doubled, the number of buckets must typically more than double to keep the distribution equally even. Thus, this scales worse than linearly. It does not scale much worse, though, and this has not proved to be a practical problem for the cluster sizes we have used so far. (A cluster size of a thousand nodes does not seem to be an issue here.)
+
+Having a good and uniform pseudo-random function makes the distribution more even. However, this may require more computationally heavy generators. Currently, we are using a simple and fast algorithm, and it has proved more than sufficient for our needs.
+
+The distribution to distributors is done to create an even distribution between the nodes. The distributors are free to split the buckets further if the backend wants buckets to contain less data. They cannot use fewer buckets than are needed for distribution, though. By using a minimum number of buckets for distribution, the distributors have more freedom to control sizes of buckets.
+
+### Distribution waste
+
+To measure how many buckets are needed to create a decent distribution, a metric is needed. We have defined a waste metric for this purpose as follows:
+
+Distribute the units (buckets) to all the nodes. Assume all nodes have identical capacity. Assume the node with the most units assigned to it is at 100% capacity. The waste is the unused capacity as a percentage of the total capacity.
+
+This definition seems useful as a cluster is considered at full capacity once one of its partitions is at full capacity. Having one node with more buckets than the rest is thus damaging, while having one node with fewer buckets than the rest is just fine.
+
+Example: There are 4 nodes distributing 18 units. The node with the most units has 6. Distribution waste is `100% * (4 * 6 - 18) / (4 * 6) = 25%`.
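+
+The example can be reproduced with a small helper. The function name `distribution_waste` is hypothetical, and the per-node split of the 18 units is an assumed one, chosen so that the fullest node holds 6.
+
+```python
+def distribution_waste(units_per_node: list[int]) -> float:
+    """Waste as a fraction: unused capacity divided by total capacity.
+
+    Every node is assumed to have the capacity of the fullest node.
+    """
+    capacity = max(units_per_node) * len(units_per_node)
+    if capacity == 0:
+        return 0.0  # nothing stored, so nothing is wasted
+    used = sum(units_per_node)
+    return (capacity - used) / capacity
+
+
+if __name__ == "__main__":
+    # 4 nodes, 18 units in total, the fullest node holds 6.
+    print(distribution_waste([6, 4, 4, 4]))
+```
+
+A perfectly even split such as `[5, 5, 5, 5]` gives a waste of 0, while a node holding fewer units than the rest adds waste only in proportion to its unused share.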
+
+Below we have calculated waste based on the number of nodes and the number of buckets to distribute between them. Bits refer to distribution bits used. A distribution bit count of 16 indicates that there will be $2^{16}$ buckets.
+
+The calculations assume all buckets have the same size. This is normally close to true as documents are randomly assigned to buckets. There will be lots of buckets per node too, so a little variance typically evens out fairly well.
+
+The tables below assume only one partition exists on each node. If you have 4 partitions on 16 nodes, you should instead use the values for `4 * 16 = 64` nodes.
+
+A higher redundancy factor indicates more buckets to distribute between the same amount of nodes, resulting in a more even distribution. Doubling the redundancy has the same effect as adding one to the distribution bit count. To get values for redundancy 4, the redundancy 2 values can be used, and then the waste will be equal to the value with one less distribution bit used.
+
+### Calculated waste from various cluster sizes
+
+A value of 1 indicates 100% waste. A value of 0.1 indicates 10% waste. A waste below 1 % is shown green, below 10% as yellow and below 30% as orange. Red indicates more than 30% waste.
+
+#### Distribution with redundancy 1:
+
+| Bits \ Nodes | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| 1 | 0.0000 | 0.0000 | 0.3333 | 0.5000 | 0.6000 | 0.6667 | 0.7143 | 0.7500 | 0.7778 | 0.8000 | 0.8182 | 0.8333 | 0.8462 | 0.8571 | 0.8667 |
+| 2 | 0.0000 | 0.3333 | 0.3333 | 0.5000 | 0.2000 | 0.3333 | 0.4286 | 0.5000 | 0.5556 | 0.6000 | 0.6364 | 0.6667 | 0.6923 | 0.7143 | 0.7333 |
+| 3 | 0.0000 | 0.2000 | 0.1111 | 0.3333 | 0.2000 | 0.3333 | 0.6190 | 0.6667 | 0.8222 | 0.8400 | 0.8545 | 0.8333 | 0.6923 | 0.7143 | 0.7333 |
+| 4 | 0.0000 | 0.1111 | 0.1111 | 0.3333 | 0.3600 | 0.3333 | 0.4286 | 0.5000 | 0.7778 | 0.8000 | 0.8182 | 0.8095 | 0.6923 | 0.7143 | 0.6444 |
+| 5 | - | 0.0588 | 0.1111 | 0.2727 | 0.2889 | 0.4074 | 0.2381 | 0.3333 | 0.8129 | 0.8316 | 0.8469 | 0.8519 | 0.8359 | 0.8367 | 0.8359 |
+| 6 | - | 0.0000 | 0.0725 | 0.1579 | 0.1467 | 0.1111 | 0.1688 | 0.3846 | 0.7037 | 0.7217 | 0.7470 | 0.7460 | 0.7265 | 0.6952 | 0.6718 |
+| 7 | - | 0.0725 | 0.0519 | 0.0857 | 0.0857 | 0.1111 | 0.2050 | 0.2000 | 0.4530 | 0.4667 | 0.5152 | 0.5152 | 0.4530 | 0.3905 | 0.3436 |
+| 8 | - | 0.0000 | 0.0078 | 0.0725 | 0.0857 | 0.0922 | 0.1293 | 0.1351 | 0.1634 | 0.1742 | 0.1688 | 0.2381 | 0.2426 | 0.2967 | 0.3173 |
+| 9 | - | 0.0039 | 0.0192 | 0.1467 | 0.1607 | 0.1203 | 0.1080 | 0.1111 | 0.1380 | 0.1322 | 0.1218 | 0.1795 | 0.1962 | 0.2381 | 0.2580 |
+| 10 | - | 0.0019 | 0.0275 | 0.0922 | 0.0898 | 0.0623 | 0.0741 | 0.0922 | 0.1111 | 0.1018 | 0.1218 | 0.1203 | 0.1438 | 0.1688 | 0.1675 |
+| 11 | - | 0.0019 | 0.0234 | 0.0430 | 0.0385 | 0.0248 | 0.0248 | 0.0483 | 0.0636 | 0.0648 | 0.0737 | 0.0725 | 0.0894 | 0.0800 | 0.0958 |
+| 12 | - | - | 0.0121 | 0.0285 | 0.0282 | 0.0121 | 0.0149 | 0.0571 | 0.0577 | 0.0562 | 0.0549 | 0.0412 | 0.0510 | 0.0439 | 0.0616 |
+| 13 | - | - | 0.0074 | 0.0019 | 0.0070 | 0.0177 | 0.0304 | 0.0303 | 0.0337 | 0.0189 | 0.0252 | 0.0358 | 0.0409 | 0.0501 | 0.0385 |
+| 14 | - | - | 0.0041 | 0.0024 | 0.0037 | 0.0027 | 0.0145 | 0.0073 | 0.0101 | 0.0130 | 0.0220 | 0.0234 | 0.0290 | 0.0248 | 0.0195 |
+| 15 | - | - | 0.0019 | 0.0021 | 0.0036 | 0.0083 | 0.0059 | 0.0056 | 0.0101 | 0.0097 | 0.0123 | 0.0163 | 0.0150 | 0.0186 | 0.0173 |
+| 16 | - | - | 0.0010 | 0.0007 | 0.0010 | 0.0030 | 0.0049 | 0.0039 | 0.0085 | 0.0072 | 0.0097 | 0.0108 | 0.0135 | 0.0141 | 0.0115 |
+| 17 | - | - | - | - | - | 0.0030 | 0.0033 | 0.0024 | 0.0036 | 0.0030 | 0.0055 | 0.0091 | 0.0135 | 0.0156 | 0.0143 |
+| 18 | - | - | - | - | - | - | 0.0019 | - | 0.0029 | 0.0027 | 0.0043 | 0.0040 | 0.0066 | 0.0061 | 0.0060 |
+| 19 | - | - | - | - | - | - | - | - | 0.0019 | - | 0.0021 | 0.0030 | 0.0023 | 0.0031 | 0.0042 |
+| 20 | - | - | - | - | - | - | - | - | - | - | - | 0.0029 | 0.0025 | 0.0037 | 0.0044 |
+| 21 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0026 | 0.0035 | 0.0040 |
+
+#### Distribution with redundancy 2:
+
+| Bits \ Nodes | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| 1 | 0.0000 | 0.0000 | 0.3333 | 0.5000 | 0.6000 | 0.6667 | 0.4286 | 0.5000 | 0.5556 | 0.6000 | 0.6364 | 0.6667 | 0.6923 | 0.7143 | 0.7333 |
+| 2 | 0.0000 | 0.0000 | 0.3333 | 0.3333 | 0.2000 | 0.3333 | 0.4286 | 0.5000 | 0.5556 | 0.6000 | 0.6364 | 0.6667 | 0.6923 | 0.4286 | 0.4667 |
+| 3 | 0.0000 | 0.0000 | 0.1111 | 0.2000 | 0.2000 | 0.3333 | 0.4286 | 0.5000 | 0.7037 | 0.7333 | 0.7576 | 0.7778 | 0.7949 | 0.7714 | 0.7333 |
+| 4 | 0.0000 | 0.0000 | 0.1111 | 0.2000 | 0.2000 | 0.3333 | 0.3469 | 0.2000 | 0.7460 | 0.7714 | 0.7762 | 0.7778 | 0.7949 | 0.7714 | 0.7630 |
+| 5 | - | - | 0.0725 | 0.1579 | 0.2471 | 0.2381 | 0.2967 | 0.2727 | 0.7265 | 0.7538 | 0.7673 | 0.7778 | 0.7949 | 0.7922 | 0.7968 |
+| 6 | - | - | 0.0519 | 0.1111 | 0.1742 | 0.1467 | 0.2050 | 0.2381 | 0.6908 | 0.7023 | 0.7016 | 0.7117 | 0.7265 | 0.7229 | 0.7247 |
+| 7 | - | - | 0.0303 | 0.0154 | 0.0340 | 0.0303 | 0.0857 | 0.1111 | 0.4921 | 0.4880 | 0.4828 | 0.4797 | 0.5077 | 0.4622 | 0.4667 |
+| 8 | - | - | 0.0078 | 0.0303 | 0.0248 | 0.0623 | 0.0857 | 0.0725 | 0.0970 | 0.1322 | 0.1049 | 0.1293 | 0.1620 | 0.1873 | 0.2242 |
+| 9 | - | - | 0.0019 | 0.0266 | 0.0519 | 0.0466 | 0.0682 | 0.0791 | 0.0824 | 0.0519 | 0.0691 | 0.0519 | 0.0623 | 0.0741 | 0.0898 |
+| 10 | - | - | 0.0063 | 0.0173 | 0.0154 | 0.0275 | 0.0116 | 0.0340 | 0.0558 | 0.0294 | 0.0452 | 0.0466 | 0.0567 | 0.0501 | 0.0584 |
+| 11 | - | - | 0.0078 | 0.0049 | 0.0154 | 0.0177 | 0.0149 | 0.0210 | 0.0275 | 0.0177 | 0.0252 | 0.0303 | 0.0305 | 0.0344 | 0.0317 |
+| 12 | - | - | - | 0.0073 | 0.0112 | 0.0192 | 0.0231 | 0.0312 | 0.0296 | 0.0177 | 0.0278 | 0.0358 | 0.0245 | 0.0312 | 0.0385 |
+| 13 | - | - | - | 0.0061 | 0.0049 | 0.0096 | 0.0112 | 0.0201 | 0.0218 | 0.0088 | 0.0077 | 0.0199 | 0.0138 | 0.0304 | 0.0317 |
+| 14 | - | - | - | 0.0059 | 0.0058 | 0.0058 | 0.0057 | 0.0092 | 0.0128 | 0.0082 | 0.0139 | 0.0081 | 0.0096 | 0.0199 | 0.0213 |
+| 15 | - | - | - | - | 0.0014 | 0.0039 | 0.0052 | 0.0034 | 0.0051 | 0.0085 | 0.0044 | 0.0072 | 0.0107 | 0.0101 | 0.0082 |
+| 16 | - | - | - | - | 0.0016 | 0.0030 | 0.0026 | 0.0036 | 0.0065 | 0.0051 | 0.0061 | 0.0084 | 0.0065 | 0.0083 | 0.0100 |
+| 17 | - | - | - | - | - | - | 0.0010 | 0.0020 | 0.0028 | - | 0.0040 | 0.0049 | 0.0067 | 0.0071 | 0.0062 |
+| 18 | - | - | - | - | - | - | - | - | 0.0032 | - | 0.0024 | - | 0.0034 | 0.0056 | 0.0041 |
+| 19 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0025 | 0.0018 | - |
+
+#### Distribution with redundancy 2:
+
+| Bits \ Nodes | 16 | 20 | 32 | 48 | 64 | 100 | 128 | 160 | 200 | 256 | 350 | 500 | 800 | 1000 | 5000 |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| 8 | 0.2000 | 0.3081 | 0.2727 | 0.5152 | 0.5294 | 0.5733 | 0.6364 | 0.7091 | 0.7673 | 0.8000 | 0.8537 | 0.8862 | 0.8933 | 0.8976 | 0.9659 |
+| 9 | 0.0725 | 0.2242 | 0.1795 | 0.1795 | 0.3043 | 0.3173 | 0.3846 | 0.5077 | 0.5345 | 0.6364 | 0.7340 | 0.7952 | 0.8400 | 0.8720 | 0.9317 |
+| 10 | 0.0725 | 0.1322 | 0.1233 | 0.2099 | 0.1579 | 0.2415 | 0.3333 | 0.5733 | 0.4611 | 0.5789 | 0.6558 | 0.7269 | 0.8293 | 0.8425 | 0.8976 |
+| 11 | 0.0340 | 0.0857 | 0.0922 | 0.1111 | 0.1233 | 0.1969 | 0.2558 | 0.5937 | 0.5643 | 0.5897 | 0.5965 | 0.6099 | 0.6587 | 0.7591 | 0.8830 |
+| 12 | 0.0448 | 0.0385 | 0.0623 | 0.1065 | 0.0986 | 0.1285 | 0.3725 | 0.3831 | 0.4064 | 0.4074 | 0.4799 | 0.4880 | 0.5124 | 0.8328 | 0.8976 |
+| 13 | 0.0340 | 0.0328 | 0.0554 | 0.0699 | 0.0623 | 0.0948 | 0.1049 | 0.2183 | 0.2344 | 0.3191 | 0.3498 | 0.4539 | 0.5733 | 0.6656 | 0.8870 |
+| 14 | 0.0140 | 0.0189 | 0.0376 | 0.0452 | 0.0466 | 0.0717 | 0.0986 | 0.1057 | 0.1047 | 0.2242 | 0.2853 | 0.2798 | 0.4064 | 0.4959 | 0.8830 |
+| 15 | 0.0094 | 0.0118 | 0.0385 | 0.0268 | 0.0331 | 0.0638 | 0.0708 | 0.0775 | 0.0898 | 0.1322 | 0.2133 | 0.2104 | 0.3550 | 0.4446 | 0.8752 |
+| 16 | 0.0097 | 0.0081 | 0.0380 | 0.0303 | 0.0362 | 0.0577 | 0.0501 | 0.0627 | 0.0717 | 0.1033 | 0.1733 | 0.1678 | 0.2586 | 0.3101 | 0.8511 |
+| 17 | 0.0075 | 0.0066 | 0.0346 | 0.0293 | 0.0154 | 0.0258 | 0.0466 | 0.0546 | 0.0704 | 0.1041 | 0.1469 | 0.1983 | 0.2702 | 0.2972 | 0.7740 |
+| 18 | 0.0053 | 0.0057 | 0.0098 | 0.0098 | 0.0122 | 0.0149 | 0.0238 | 0.0300 | 0.0394 | 0.0353 | 0.0434 | 0.0553 | 0.0611 | 0.1782 | 0.6334 |
+| 19 | - | 0.0022 | 0.0050 | 0.0162 | 0.0098 | 0.0133 | 0.0149 | 0.0220 | 0.0242 | 0.0252 | 0.0333 | 0.0398 | 0.0495 | 0.0999 | 0.5145 |
+| 20 | - | - | 0.0030 | 0.0107 | 0.0088 | 0.0098 | 0.0144 | 0.0140 | 0.0148 | 0.0203 | 0.0195 | 0.0255 | 0.0348 | 0.1133 | 0.4481 |
+| 21 | - | - | 0.0043 | 0.0063 | 0.0051 | 0.0074 | 0.0079 | 0.0085 | 0.0086 | 0.0113 | 0.0147 | 0.0170 | 0.0237 | 0.1068 | 0.4422 |
+| 22 | - | - | - | 0.0026 | 0.0035 | 0.0037 | 0.0082 | 0.0061 | 0.0077 | 0.0087 | 0.0101 | 0.0134 | 0.0193 | 0.1140 | 0.4635 |
+| 23 | - | - | - | 0.0019 | - | 0.0026 | 0.0080 | 0.0055 | 0.0056 | 0.0057 | 0.0063 | 0.0096 | 0.0155 | 0.1294 | 0.4982 |
+| 24 | - | - | - | 0.0013 | - | - | 0.0074 | 0.0060 | 0.0058 | 0.0053 | 0.0049 | 0.0068 | 0.0112 | 0.0471 | 0.3219 |
+| 25 | - | - | - | - | - | - | - | - | - | 0.0043 | 0.0043 | 0.0058 | 0.0067 | 0.0512 | 0.2543 |
+| 26 | - | - | - | - | - | - | - | - | - | - | 0.0040 | 0.0042 | 0.0043 | 0.0051 | 0.0210 |
+| 27 | - | - | - | - | - | - | - | - | - | - | - | - | 0.0028 | 0.0157 | 0.0814 |
+
+### Default number of distribution bits used
+
+Note that changing the amount of distribution bits used will change what buckets exist, which will change the distribution considerably. We thus do not want to alter the distribution bit count too often.
+
+Ideally, users would be allowed to configure the minimal and maximal acceptable waste, and the current number of distribution bits could then just be calculated on the fly. But as computing the waste values above is computationally heavy, especially with many nodes and many distribution bits, currently only a couple of profiles are available to configure.
+
+**Vespa Cloud note:** Vespa Cloud locks distribution bit count to 16. This is because Vespa Cloud offers auto-scaling of nodes, and such a scaling decision should not implicitly lead to a full redistribution of data by crossing a distribution bit node count boundary. 16 bits strikes a good balance of low skew and high performance for most production deployments.
+
+#### Loose mode (default)
+
+The loose mode allows for more waste, allowing the amount of nodes to change considerably without altering the distribution bit counts.
+
+| Node count | 1-4 | 5-199 | 200-> |
+| :--- | :--- | :--- | :--- |
+| Distribution bit count | 8 | 16 | 24 |
+| Max calculated waste *) | 3.03 % | 7.17 % | ? |
+| Minimum buckets/node **) | 256 - 64 | 13108 - 329 | 83886 - |
+
+#### Strict mode (not default)
+
+The strict mode attempts to keep the waste below 1.0 %. When it needs to increase the bit count, it increases it significantly to allow considerably more growth before having to adjust the count again.
+
+| Node count | 1-4 | 5-14 | 15-199 | 200-799 | 800-1499 | 1500-4999 | 5000-> |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| Distribution bit count | 8 | 16 | 21 | 25 | 28 | 30 | 32 |
+| Max calculated waste *) | 3 % | 0.83 % | 0.86 % | 0.67 % | ? | ? | ? |
+| Minimum buckets/node **) | 256 - 64 | 13107 - 4681 | 139810 - 10538 | 167772 - 41995 | 335544 - 179076 | 715827 - 214791 | 858993 - |
+
+*) Max calculated waste, given redundancy 2 and the max node count in the given range, as shown in the table above. (Note that this assumes equal-sized buckets, and that every possible bucket exists. In a real system there will be random variation.)
+
+**) Given a node count and distribution bits, there is a minimum number of buckets enforced to exist. However, splitting due to bucket size may increase the count beyond this number. This value shows the maximum value of the minimum, that is, the number of buckets per node enforced for the lowest node count in the range. Ideally, one wants to have few buckets enforced by distribution and rather let bucket size split buckets, as that leaves more freedom to users.
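+
+The two profiles can be read as simple lookup tables. The sketch below transcribes the node count thresholds from the tables above; the names `LOOSE`, `STRICT`, `distribution_bits` and `min_buckets_per_node` are hypothetical and do not correspond to Vespa configuration or API names. The buckets-per-node helper assumes the table values are 2 to the power of the bit count divided by the node count, which matches the listed numbers up to rounding.
+
+```python
+# (minimum node count, distribution bits) pairs, transcribed from the tables above
+LOOSE = [(1, 8), (5, 16), (200, 24)]
+STRICT = [(1, 8), (5, 16), (15, 21), (200, 25), (800, 28), (1500, 30), (5000, 32)]
+
+
+def distribution_bits(node_count: int, profile: list[tuple[int, int]] = LOOSE) -> int:
+    """Look up the distribution bit count for a node count in a profile."""
+    if node_count < 1:
+        raise ValueError("node_count must be at least 1")
+    bits = profile[0][1]
+    for min_nodes, profile_bits in profile:
+        if node_count >= min_nodes:
+            bits = profile_bits
+    return bits
+
+
+def min_buckets_per_node(node_count: int, bits: int) -> int:
+    """Buckets per node enforced by distribution alone, rounded down."""
+    return 2 ** bits // node_count
+
+
+if __name__ == "__main__":
+    for nodes in (4, 5, 199, 200):
+        bits = distribution_bits(nodes)
+        print(nodes, bits, min_buckets_per_node(nodes, bits))
+```
+
+Crossing a threshold, for instance going from 199 to 200 nodes in loose mode, changes the bit count and with it the set of buckets that exist.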
+
+## Q/A
+
+
+A: This is both expected and intentional—to see why we must look at how the ideal state algorithm works.
+
+As previously outlined, the ideal state algorithm requires 3 distinct inputs:
+
+1. The ID of the bucket to be replicated across content nodes.
+2. The set of all nodes (i.e. unique distribution keys) in the cluster *across* all groups, and their current availability state (Down, Up, Maintenance etc.).
+3. The cluster topology and replication configuration. The topology includes knowledge of all groups.
+
+From this the algorithm returns a deterministic, ordered sequence of nodes (i.e. distribution keys) across all configured groups. The ordering of nodes is given by their individual pseudo-random node *score*, where higher scoring nodes are considered more *ideal* for storing replicas for a given bucket. The set of nodes in this sequence respects the constraints given by the configured group topology and replication level.
+
+When computing node scores within a group, the *absolute* distribution keys are used rather than a node's *relative* ordering within the group. This means the individual node scores—and consequently the distribution of bucket replicas—within one group is different (with a very high probability) from all other groups.
+
+What the ideal state algorithm ensures is that there exists a deterministic, configurable number of replicas per bucket within each group and that they are evenly distributed across each group's nodes—the exact mapping can be considered an unspecified "implementation detail".
+
+The rationale for using absolute distribution keys rather than relative ordering is closely related to the earlier discussion about why [modulo distribution](/en/content/idealstate#a-simple-example-modulo) is a poor choice. Let $N_g \gt 1$ be the number of nodes in a given group:
+
+- A relative ordering means that removing—or just reordering—a single node from the configuration can potentially lead to a full redistribution of all data within that group, not just $\frac{1}{N_g}$ of the data. Imagine for instance moving a node from being first in the group to being the last.
+- If we require nodes with the same relative index in each group to store the same data set (i.e. a row-column strategy), this immediately suffers in failure scenarios even when just a single node becomes unavailable. Data coverage in the group remains reduced until the node is replaced, as no other nodes can take over responsibility for the data. This is because removing the node leads to the problem in the previous point, where a disproportionally large amount of data must be moved due to the relative ordering changing. With the ideal state algorithm, the remaining nodes in the group will transparently assume ownership of the data, with each node receiving an expected $( \frac{1}{N_g - 1} )$ of the unavailable node's buckets.
+
\ No newline at end of file
diff --git a/mintlify-docs/en/content/images/elastic-fail.png b/mintlify-docs/en/content/images/elastic-fail.png
new file mode 100644
index 0000000000..d87fa40cdf
Binary files /dev/null and b/mintlify-docs/en/content/images/elastic-fail.png differ
diff --git a/mintlify-docs/en/content/images/elastic-grow.png b/mintlify-docs/en/content/images/elastic-grow.png
new file mode 100644
index 0000000000..fd2e14789f
Binary files /dev/null and b/mintlify-docs/en/content/images/elastic-grow.png differ
diff --git a/mintlify-docs/en/content/images/query-groups.png b/mintlify-docs/en/content/images/query-groups.png
new file mode 100644
index 0000000000..b0f5332f3a
Binary files /dev/null and b/mintlify-docs/en/content/images/query-groups.png differ
diff --git a/mintlify-docs/en/content/proton.mdx b/mintlify-docs/en/content/proton.mdx
new file mode 100644
index 0000000000..120c80d6cf
--- /dev/null
+++ b/mintlify-docs/en/content/proton.mdx
@@ -0,0 +1,464 @@
+---
+title: "Proton"
+sidebarTitle: "Content clusters"
+description: "Proton is Vespa's search core and runs on each content node as the *vespa-proton-bin* process. Proton maintains disk and memory structures for documents (organized per document type), handles [read and write operations](/en/writing/reads-and-writes#operations), and execution of [queries](#queries). As the document data is dynamic, disk and memory structures are periodically optimized by [maintenance jobs](#proton-maintenance-jobs)."
+---
+
+The content node has a *bucket management system* which sends requests to a set of *document databases*, each of which consists of three *sub-databases* (`ready`, `not ready` and `removed`):
+
+
+
+
+
+
+### Bucket management
+
+When the node starts up, it first needs to get an overview of what documents and buckets it has. Once metadata for all buckets is known, the content node transitions from down to up state. As the distributors want quick access to bucket metadata, the content node maintains an in-memory bucket database to efficiently serve these requests. The state of the bucket database can always be reconstructed from the durably persisted search node state, but this is expensive and therefore only happens at process startup time.
+
+This database is considered the source of truth for the state of the node's bucket metadata for the duration of the process's lifetime. As incoming operations mutate the state of the documents on the node, it is critical that the database is always kept in sync with these changes.
+
+### Persistence threads and operation dispatching
+
+A content node has a pool of *persistence threads* that is created at startup and remains fixed in size for the lifetime of the process. It is the responsibility of the persistence threads to schedule incoming write and read operations received by the content node, dispatch these to the search core, and ensure the bucket database remains in sync with changes caused by write operations.
+
+Unless explicitly configured, the size of the thread pool is automatically set based on the number of CPU cores available.
+
+Persistence threads are backed by a *persistence queue*. Read/write-operations received by the RPC subsystem are pushed onto this queue. The queue is operation deadline-aware; if an operation has exceeded its deadline while enqueued, it is immediately failed back to the sender without being executed. This avoids a particular failure scenario where a heavily loaded node spends more and more time processing already doomed operations, due to not being able to drain its queue quickly enough.
+
+All operations bound for a particular data bucket (such as Puts, Gets, etc.) execute in the context of a *bucket lock*. Locks are *shared* for reads and *exclusive* for writes. This means that multiple read operations can execute in parallel for the same bucket, but only one write operation can execute for a bucket at any given time (and no reads can be started concurrently with existing writes for a bucket, and vice versa). Note that some of these locking restrictions can be relaxed when it's safe to do so—see [performance optimizations](#performance-optimizations) for details.
+
+If a persistence thread tries to pop an operation from the queue and sees that the bucket it's bound for is already locked, it will leave the operation in place in the queue and try the next operation(s) instead. This means that although the queue acts as a FIFO for client operations towards a *single* bucket, this is not the case across *multiple* buckets.
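+
+The queueing behavior described above can be illustrated with a single-threaded sketch. Everything here is hypothetical: the class and method names are invented for illustration, real persistence threads run concurrently, and the actual implementation is not shown in this document. The sketch models deadline-aware failing, shared and exclusive bucket locks, and skipping past operations whose bucket is locked while keeping FIFO order per bucket.
+
+```python
+from collections import deque
+from dataclasses import dataclass
+
+
+@dataclass
+class Operation:
+    bucket: int
+    is_write: bool
+    deadline: float  # absolute time on the same clock as `now`
+
+
+class PersistenceQueue:
+    def __init__(self) -> None:
+        self._queue: deque = deque()
+        self._readers: dict = {}    # bucket -> number of shared locks held
+        self._writers: set = set()  # buckets held under an exclusive lock
+
+    def push(self, op: Operation) -> None:
+        self._queue.append(op)
+
+    def _lockable(self, op: Operation) -> bool:
+        if op.bucket in self._writers:
+            return False  # an exclusive lock blocks both reads and writes
+        return not (op.is_write and self._readers.get(op.bucket, 0) > 0)
+
+    def pop(self, now: float):
+        """Return (op, "run"), (op, "expired"), or None if nothing can proceed."""
+        skipped = set()  # buckets with an earlier operation still waiting
+        for index, op in enumerate(self._queue):
+            if now > op.deadline:
+                del self._queue[index]
+                return op, "expired"  # failed back without being executed
+            if op.bucket in skipped or not self._lockable(op):
+                skipped.add(op.bucket)  # leave in place, try the next one
+                continue
+            del self._queue[index]
+            if op.is_write:
+                self._writers.add(op.bucket)
+            else:
+                self._readers[op.bucket] = self._readers.get(op.bucket, 0) + 1
+            return op, "run"
+        return None
+
+    def release(self, op: Operation) -> None:
+        """Release the bucket lock taken when `op` was popped for execution."""
+        if op.is_write:
+            self._writers.discard(op.bucket)
+        else:
+            self._readers[op.bucket] -= 1
+```
+
+Tracking skipped buckets during a scan is one way to keep a later read from overtaking an earlier write to the same bucket; whether the real queue does it this way is an assumption.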
+
+#### Write operations
+
+Write operations are dispatched as *asynchronous*—i.e., non-blocking—tasks to the search core. This increases parallelism by freeing up persistence threads to handle other operations, and a deeper pipeline enables the search core to optimize transaction log synchronization and batching of data structure updates.
+
+Since a deeper pipeline comes at the potential cost of increased latency when many operations are in flight, the maximum number of concurrent asynchronous operations is bounded by an adaptive persistence throttling mechanism. The throttler will dynamically scale the window of concurrency until it reaches a saturation point where further increasing the window size also results in increased operation latencies. When the number of in-flight operations hits the current maximum, persistence threads will not dispatch any more writes until the number goes down. Reads can still be processed during this time.
+
+An asynchronous write-task holds on to the exclusive bucket lock for the duration of its lifetime. Once the search core completes the write operation, the bucket database is updated with the new metadata state of the bucket (which reflects the side effects of the write) prior to releasing the lock. An operation reply is then generated and sent back via the RPC subsystem.
+
+#### Read operations
+
+Read operations are always evaluated *synchronously*—i.e. blocking—by persistence threads. To avoid having potentially expensive maintenance read operations (such as those used for [replica reconciliation](/en/content/consistency#replica-reconciliation)) block client operations for prolonged amounts of time, a subset of the persistence threads are *not* allowed to process such maintenance operations.
+
+Note that the condition evaluation step of a test-and-set write is considered a *read* sub-operation and is therefore done synchronously. Since it's part of a write operation, it happens atomically in the context of the exclusive lock of the higher-level operation.
+
+#### Performance optimizations
+
+To reduce thread context switches, some write operations may bypass the persistence thread queues and be directly asynchronously dispatched to the search core from the RPC thread the operation was received at. Such operations must still successfully acquire the appropriate exclusive bucket lock—if the lock cannot be immediately acquired the operation is pushed onto the persistence queue instead.
+
+To reduce lock contention and thread wakeups, smaller numbers of persistence threads are grouped together in *stripes* that share a dedicated per-stripe persistence queue. Operations are routed deterministically to a particular stripe based on their bucket ID, meaning that stripes operate on non-overlapping parts of the bucket space. Together, the many stripes and queues form one higher-level *logical* queue that covers the entire bucket space.
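
A minimal sketch of deterministic stripe routing follows. The text does not say how a bucket ID is mapped to a stripe, so the modulo mapping and the `stripe_for_bucket` and `route` names are assumptions chosen for illustration.

```python
def stripe_for_bucket(bucket_id: int, num_stripes: int) -> int:
    """Map a bucket ID to a stripe; the same bucket always maps to the same stripe."""
    return bucket_id % num_stripes  # assumed mapping, not the actual one


def route(operations, num_stripes):
    """Distribute (bucket_id, op) pairs onto per-stripe queues."""
    queues = [[] for _ in range(num_stripes)]
    for bucket_id, op in operations:
        queues[stripe_for_bucket(bucket_id, num_stripes)].append((bucket_id, op))
    return queues


queues = route([(10, "put"), (11, "get"), (14, "put"), (10, "remove")], num_stripes=4)
```

Because every operation for a given bucket lands in the same stripe queue, threads in different stripes never contend for the same bucket lock.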
+
+If the queue contains multiple *non-conflicting* write operations to the same bucket, these may be dispatched in parallel in the context of the *same* write lock. This avoids having to wait for an entire lock-execute-unlock roundtrip prior to dispatching the next write for the same bucket. An example of conflicting writes is multiple Puts to the same document ID. The maximum number of operations dispatched in parallel is implementation-defined.
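
The batching idea can be sketched as follows. The only conflict rule modeled here is "same document ID", and `next_batch` and `max_batch` are hypothetical names; the real conflict rules and batch limit are implementation-defined.

```python
def next_batch(writes, max_batch):
    """Take the longest prefix of queued writes with distinct document IDs."""
    batch, seen = [], set()
    for doc_id, op in writes:
        if doc_id in seen or len(batch) == max_batch:
            break  # stop at the first conflict to preserve per-document ordering
        seen.add(doc_id)
        batch.append((doc_id, op))
    return batch


pending = [("id:a", "put"), ("id:b", "put"), ("id:a", "put"), ("id:c", "remove")]
batch = next_batch(pending, max_batch=8)
```

Here the second Put to `id:a` ends the batch, so it is dispatched only after the first one has completed.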
+
+### Document database
+
+Each document database is responsible for a single document type. It has a component called FeedHandler which takes care of incoming documents, updates, and remove requests. All requests are first written to a [transaction log](#transaction-log), then handed to the appropriate sub-database, based on the request type.
+
+### Sub-databases
+
+There are three types of sub-databases, each with its own [document meta store](/en/content/attributes#document-meta-store) and [document store](#document-store). The document meta store holds a map from the document ID to a local ID. This local ID is used to address the document in the document store. The document meta store also maintains information on the state of the buckets that are present in the sub-database.
+
+The sub-databases are maintained by the *Maintenance Controller*. The document distribution changes as the system is resized. When the number of nodes in the system changes, the Maintenance Controller will move documents between the Ready and Not Ready sub-databases to reflect the new distribution. When an entry in the Removed sub-database gets old, it is purged. The sub-databases are:
+
+
+|||
+| :--- | :--- |
+| **Not Ready** | Holds the redundant documents that are not searchable, i.e. the not ready documents. Documents that are not ready are only stored, not indexed. It takes some processing to move from this state to the ready state. |
+| **Ready** | Maintains attributes and indexes of all ready documents and keeps them searchable. One of the ready copies is active while the rest are not active. **Active:** There should always be exactly one active copy of each document in the system, though intermittently there may be more. These documents produce results when queries are evaluated. **Not Active:** The ready copies that are not active are indexed but will not produce results. By being indexed, they are ready to take over immediately if the node holding the active copy becomes unavailable. Read more in searchable-copies. |
+|**Removed**|Keeps track of documents that have been removed. The id and timestamp for each document are kept. This information is used when buckets from two nodes are merged. If the removed document exists on another node but with a different timestamp, the most recent entry prevails.|
+
+## Transaction log
+
+Content nodes have a transaction log to persist mutating operations. The transaction log persists operations by file append. Having a transaction log simplifies proton's in-memory index structures and enables steady-state high performance, as described below.
+
+All operations are written and synced to the [transaction log](/en/content/proton#transaction-log). This is sequential (not random) IO, but it might impact overall feed performance when running on network-attached storage (NAS), where the sync operation has a much higher cost than on locally attached storage (e.g., SSD). See [sync-transactionlog](/en/reference/applications/services/content#sync-transactionlog).
+
+By default, proton will [flush components](/en/reference/applications/services/content#flush-on-shutdown) like attribute vectors and memory index on shutdown, for quicker startup after scheduled restarts.
+
+## Document store
+
+Documents are stored as compressed serialized blobs in the *document store*. Put, update and remove operations are persisted in the [transaction log](#transaction-log) before updating the document in the document store. The operation is acked to the client and the result of the operation is immediately seen in search results.
+
+Files in the document store are written sequentially and occur in pairs, for example:
+
+```bash
+-rw-r--r-- 1 owner users 4133380096 Aug 10 13:36 1467957947689211000.dat
+-rw-r--r-- 1 owner users 71192112 Aug 10 13:36 1467957947689211000.idx
+```
+
+The [maximum size](/en/reference/applications/services/content#summary-store-logstore-maxfilesize) (in bytes) per .dat file on disk can be set using the following:
+
+```xml highlight= {9}
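+<content id="mycluster" version="1.0">
+  <engine>
+    <proton>
+      <tuning>
+        <searchnode>
+          <summary>
+            <store>
+              <logstore>
+                <maxfilesize>8000000000</maxfilesize>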
+```
+
+Notes:
+
+- The files are written in sequence. *proton* starts with one pair and grows it until *maxfilesize*. Once full, a new pair is started.
+- This means every pair is immutable except the last one, which is the pair being written to.
+- Documents exist in multiple versions in multiple files. Older versions are compacted away when a pair is scheduled to be the new active pair: obsolete versions are removed, leaving only the active document version in a new file pair, which becomes the new active pair.
+- Read more on implications of setting *maxfilesize* in [proton maintenance jobs](/en/content/proton#document-store-compaction).
+- Files are written in [chunks](/en/reference/applications/services/content#summary-store-logstore-chunk), using compression settings.
+
+## Defragmentation
+
+[Document store compaction](#document-store-compaction) defragments and sorts document store files. It removes stale versions of documents (i.e., old versions of updated documents). It is triggered when the disk bloat of the document store is larger than the total disk usage of the document store times [diskbloatfactor](/en/reference/applications/services/content#flushstrategy-native-total-diskbloatfactor). Refer to [summary tuning](/en/reference/applications/services/content#summary) for details.
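
The trigger condition above can be written as a one-line check. This is only a sketch of the stated rule; the function name is made up, and the factor used in the example is an illustrative value rather than a documented default.

```python
def should_compact(disk_usage_bytes: int, disk_bloat_bytes: int, disk_bloat_factor: float) -> bool:
    """True when bloat exceeds total document store disk usage times the factor."""
    return disk_bloat_bytes > disk_usage_bytes * disk_bloat_factor


# 100 GB document store, illustrative factor of 0.25: the threshold is 25 GB of bloat.
compact_at_30gb = should_compact(100_000_000_000, 30_000_000_000, 0.25)
compact_at_20gb = should_compact(100_000_000_000, 20_000_000_000, 0.25)
```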
+
+Defragmentation status is best observed by tracking the [max_bucket_spread](/en/reference/operations/metrics/searchnode#content_proton_documentdb_ready_document_store_max_bucket_spread) metric over time. A sawtooth pattern is normal for corpora that change over time. The [document_store_compact](/en/reference/operations/metrics/searchnode#content_proton_documentdb_job_document_store_compact) metric tracks when proton is running the document store compaction job. Compaction settings can be set too tight; in that case, the metric is always at, or close to, 1.
+
+When benchmarking, it is important to use the correct compaction settings, to make sure that proton has had time to compact files (this can take hours), and to check that it is not actively compacting (*document_store_compact* should be 0 most of the time).
+
+**Note:** There is no bucket-compaction across files - documents will not move between files.
+
+### Optimized reads using chunks
+
+As documents are clustered within the .dat file, proton optimizes reads by reading larger chunks when accessing documents. When visiting, documents are read in *bucket* order. This is the same order as the defragmentation jobs use.
+
+The first document read in a visit operation for a bucket will read a chunk from the .dat file into memory. Subsequent document accesses are served by a memory lookup only. The chunk size is configured by [maxsize](/en/reference/applications/services/content#summary-store-logstore-chunk-maxsize):
+
+```xml highlight= {9}
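+<engine>
+  <proton>
+    <tuning>
+      <searchnode>
+        <summary>
+          <store>
+            <logstore>
+              <chunk>
+                <maxsize>16384</maxsize>
+              </chunk>
+            </logstore>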
+```
+
+There can be 2^22=4M chunks. This sets a minimum chunk size based on *maxfilesize* - e.g. an 8G file can have minimum 2k chunk size. Finally, the bucket size is configured by setting [bucket-splitting](/en/reference/applications/services/content#bucket-splitting):
+
+```xml highlight= {3}
+
+
+
+```
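
The minimum chunk size that follows from the 2^22 chunk limit can be checked with a short calculation. The helper name is hypothetical; the arithmetic simply divides the file size by the maximum number of chunks and rounds up.

```python
MAX_CHUNKS = 2 ** 22  # 4,194,304 chunks per file, as stated above


def min_chunk_size(max_file_size_bytes: int) -> int:
    """Smallest chunk size (bytes) that lets MAX_CHUNKS chunks cover the whole file."""
    return -(-max_file_size_bytes // MAX_CHUNKS)  # ceiling division


decimal_8g = min_chunk_size(8_000_000_000)  # maxfilesize from the example above
binary_8g = min_chunk_size(8 * 2 ** 30)     # 8 GiB
```

Both results are close to 2k bytes (1908 and 2048 respectively), which matches the "8G file, 2k minimum chunk size" figure.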
+
+The following are the relevant sizing units:
+
+- .dat file size - *maxfilesize*. Larger files give fewer files and so better locality, but compaction requires more memory and more time to complete.
+- chunk size - *maxsize*. Smaller chunks give less wasted IO bytes but more IO operations.
+- bucket size - *bucket-splitting*. Larger buckets give fewer buckets and better locality to nodes and files, but incur more overhead during content layer bucket maintenance operations. Overhead can be treated as linear in CPU, memory, and network usage with the bucket size.
+
+### Memory usage
+
+The document store has a mapping in memory from local ID (LID) to position in a document store file (.dat). Part of this mapping is persisted in the .idx-file paired to the .dat file. The memory used by the document store is linear with the number of documents and updates to these.
+
+The metric [content.proton.documentdb.ready.document_store.memory_usage.allocated_bytes](/en/reference/operations/metrics/searchnode#content_proton_documentdb_ready_document_store_memory_usage_allocated_bytes) gives the size in memory - use the [metric API](/en/reference/api/state-v1#state-v1-metrics) to find it. A rule of thumb is 12 bytes per document.
+
+## Attributes
+
+[Attribute](/en/content/attributes) fields are in-memory fields used for matching, ranking, sorting and grouping. Each attribute is a separate component that consists of a set of [data structures](/en/content/attributes#data-structures) to store values for that field across all documents in a sub-database. Attributes are managed by the Ready sub-database. Some attributes can also be managed by the Not Ready sub-database, see [high-throughput updates](/en/content/attributes#fast-access) for details.
+
+## Index
+
+Index fields are string fields, used for text search. Other field types are [attributes](/en/content/attributes) and [summary fields](/en/querying/document-summaries).
+
+The Index in the Ready sub-database consists of a memory index and one or more disk indexes. Mutating document operations are applied to the memory index, which is [flushed](#memory-index-flush) regularly. Flushed memory indexes are [merged](#disk-index-fusion) with the primary disk index.
+
+Proton stores position information in text indices by default, for proximity relevance - `posocc` (below). All the occurrences of a term are stored in the posting list, with its position. This provides superior ranking features, but is somewhat more expensive than just storing a single occurrence per document. For most applications, it is the correct tradeoff, since most of the cost is usually elsewhere and relevance is valuable.
+
+Applications that only need occurrence information for filtering can use [rank: filter](/en/reference/schemas/schemas#rank) to optimize query performance, using only `boolocc`\-files (below).
+
+The memory index has a dictionary per index field. This contains all unique words in that field, with mapping to posting lists with position information. The position information is used during ranking, see [nativeRank](/en/ranking/nativerank) for details on how a text match score is calculated.
+
+The disk index stores the content of each index field in separate folders. Each folder contains:
+
+- Dictionary. Files: `dictionary.pdat`, `dictionary.spdat`, `dictionary.ssdat`.
+- Compressed posting lists with position information. File: `posocc.dat.compressed`.
+- Posting lists with only occurrence information (bitvector). These are generated for common words. Files: `boolocc.bdat`, `boolocc.idx`.
+
+Example:
+
+```bash
+$ pwd
+/opt/vespa/var/db/vespa/search/cluster.mycluster/n1/documents/myschema/0.ready/index/index.flush.1/myfield
+$ ls -la
+total 7632
+drwxr-xr-x 2 org users 145 Oct 29 06:09 .
+drwxr-xr-x 74 org users 4096 Oct 29 06:11 ..
+-rw-r--r-- 1 org users 4096 Oct 29 06:11 boolocc.bdat
+-rw-r--r-- 1 org users 4096 Oct 29 06:11 boolocc.idx
+-rw-r--r-- 1 org users 8192 Oct 29 06:11 dictionary.pdat
+-rw-r--r-- 1 org users 8192 Oct 29 06:11 dictionary.spdat
+-rw-r--r-- 1 org users 4120 Oct 29 06:11 dictionary.ssdat
+-rw-r--r-- 1 org users 7778304 Oct 29 06:11 posocc.dat.compressed
+```
+
+Note that `boolocc`\-files are empty if the number of occurrences is small, like in the example above.
+
+## Proton maintenance jobs
+
+The memory and disk data structures used in Proton are periodically optimized by a set of maintenance jobs. These jobs are automatically executed, and some can be tuned in [flush strategy tuning](/en/reference/applications/services/content#flushstrategy). All jobs are described in the table below.
+
+There is only one instance of each job at a time - e.g., attributes are flushed in sequence. When a job is running, its metric is set to 1 - otherwise 0. Use this to correlate observed performance or resource usage with job runs - see *Run metric* below.
+
+The *temporary* resources used when jobs are executed are described in *CPU*, *Memory* and *Disk*. The memory and disk usage metrics of components that are optimized by the jobs are described in *Metrics* (with *Metric prefix*). For a list of all available Proton metrics, refer to the searchnode metrics in the [Vespa Metric Set](/en/reference/operations/metrics/vespa-metric-set#searchnode-metrics). Metrics are available at the [Metrics API](/en/operations/metrics).
+
+
+| Job | Property | Value |
+| :--- | :--- | :--- |
+| Attribute flush | CPU | Little - one thread flushes to disk |
+| | Memory | Little - some temporary use |
+| | Disk | A new file is written, so disk usage is 2x the size of the attribute until the old flush file is deleted. |
+| | Run metric | content.proton.documentdb.job.attribute_flush |
+| | Metric prefix | content.proton.documentdb.[ready\|notready].attribute.memory_usage. |
+| | Metrics | allocated_bytes.average, used_bytes.average, dead_bytes.average, onhold_bytes.average |
+| Memory index flush | CPU | Little - one thread flushes to disk |
+| | Memory | Little |
+| | Disk | Creates a new disk index, size of the memory index. |
+| | Run metric | content.proton.documentdb.job.memory_index_flush |
+| | Metric prefix | content.proton.documentdb.index.memory_usage. |
+| | Metrics | allocated_bytes.average, used_bytes.average, dead_bytes.average, onhold_bytes.average |
+| Disk index fusion | CPU | Multiple threads merge indices, configured as a function of feeding concurrency (refer to that setting for details) |
+| | Memory | Little |
+| | Disk | Creates a new index while serving from the current: 2x temporary disk usage for the given index. |
+| | Run metric | content.proton.documentdb.job.disk_index_fusion |
+| Document store flush | CPU | Little |
+| | Memory | Little |
+| | Disk | Little |
+| | Run metric | content.proton.documentdb.job.document_store_flush |
+| Document store compaction | CPU | Little - one thread reads one file, sorts and writes a new file |
+| | Memory | Holds a document store file in memory plus memory for sorting the file. Note: This is important on hosts with little memory! Reduce maxfilesize to increase the number of files and use less temporary memory for compaction. |
+| | Disk | A new file is written while the current is serving, max temporary usage is 2x. |
+| | Run metric | content.proton.documentdb.job.document_store_compact |
+| | Metric prefix | content.proton.documentdb.[ready\|notready\|removed].document_store. |
+| | Metrics | disk_usage.average, disk_bloat.average, max_bucket_spread.average, memory_usage.allocated_bytes.average, memory_usage.used_bytes.average, memory_usage.dead_bytes.average, memory_usage.onhold_bytes.average |
+| Bucket move | CPU | CPU similar to feeding. Consumes capacity from the write threads, so has feeding impact |
+| | Memory | As feeding - e.g., the attribute memory usage and memory index in the ready sub-database will grow |
+| | Disk | As feeding |
+| | Run metric | content.proton.documentdb.job.bucket_move |
+| Lid space compaction | CPU | Like feeding - add and remove documents |
+| | Memory | Little |
+| | Disk | 0 |
+| | Run metric | content.proton.documentdb.job.lid_space_compact |
+| | Metric prefix | content.proton.documentdb.[ready\|notready\|removed].lid_space. |
+| | Metrics | lid_limit.last, lid_bloat_factor.average, lid_fragmentation_factor.average |
+| Removed documents pruning | CPU | Little |
+| | Memory | Little |
+| | Disk | Little |
+| | Run metric | content.proton.documentdb.job.removed_documents_prune |
+
+
+## Retrieving documents
+
+Retrieving documents is done by specifying an ID to *get*, or by using a [selection expression](/en/reference/writing/document-selector-language) to *visit* a range of documents - refer to the [Document API](/en/reference/api/api). Overview:
+
+
+
+
+
+| | |
+| :--- | :--- |
+| **Get** | When the content node receives a get request, it scans through all the document databases, and for each one, it checks all three sub-databases. Once the document is found, the scan is stopped and the document returned. If the document is found in a Ready sub-database, the document retriever will apply any changes that are stored in the [attributes](/en/content/attributes) before returning the document. |
+| **Visit** | A visit request creates an iterator over each candidate bucket. This iterator will retrieve matching documents from all sub-databases of all document databases. As for get, attribute values are applied to document fields in the Ready sub-database. |
+
+## Queries
+
+Queries have a separate pathway through the system. They do not use the distributor, nor do they go through the content node persistence threads. They are orthogonal to the elasticity set up by the storage and retrieval described above. How queries move through the system:
+
+
+
+
+
+A query enters the system through the *QR-server (query rewrite server)* in the [Vespa Container](/en/applications/containers). The QR-server issues one query per document type to the search nodes:
+
+
+| | |
+| :--- | :--- |
+| **Container** | The Container knows all the document types and rewrites queries as a collection of queries, one for each type. Queries may have a [restrict](/en/reference/api/query#model.restrict) parameter, in which case the container will send the query only to the specified document types. It sends the query to content nodes and collects partial results. It pings all content nodes every second to know whether they are alive, and keeps open TCP connections to each one. If a node goes down, the elastic system will make the documents available on other nodes. |
+| **Content node matching** | The *match engine* receives queries and routes them to the right document database based on the document type. The query is passed to the *Ready* sub-database, where the searchable documents are. Based on information stored in the document meta store, the query is augmented with a blocklist that ensures only *active* documents are matched. |
+
+## /state/v1 API
+
+Besides the common endpoints documented in the [/state/v1 API reference](/en/reference/api/state-v1), Proton has additional endpoints as part of the /state/v1 API that expose information about the internal state of a search node. This API is available at `http://host:stateport/state/v1/`.
+
+Run [vespa-model-inspect](/en/reference/operations/self-managed/tools#vespa-model-inspect) to find the JSON HTTP stateport:
+
+```bash
+vespa-model-inspect service searchnode
+```
+
+### Initialization Progress API
+
+The initialization progress can be found by HTTP GET at `http://host:stateport/state/v1/initialization`. This endpoint becomes available early during initialization of Proton when other endpoints are not yet available. It gives a human-readable overview of the document databases and their attributes being loaded. Note that this is **not** a stable API, and it will expand and change between releases.
+
+Example `state/v1/initialization`:
+
+```json expandable
+{
+ "state": "initializing",
+ "current_time": "1758873251.933488",
+ "start_time": "1758873249.715624",
+ "load": 1,
+ "replay_transaction_log": 0,
+ "online": 0,
+ "dbs": [
+ {
+ "state": "load",
+ "start_time": "1758873249.936939",
+ "name": "dbname",
+ "ready_subdb": {
+ "loaded_attributes": [
+ {
+ "state": "loaded",
+ "start_time": "1758873249.941415",
+ "name": "int_field",
+ "end_time": "1758873249.942051"
+ },
+ {
+ "state": "loaded",
+ "start_time": "1758873249.941498",
+ "name": "string_field",
+ "end_time": "1758873249.944647"
+ }
+ ],
+ "loading_attributes": [
+ {
+ "state": "reprocessing",
+ "start_time": "1758873249.941555",
+ "name": "tensor_field",
+ "reprocess_progress": "6.061879",
+ "reprocess_start_time": "1758873249.993847"
+ }
+ ],
+ "queued_attributes": [
+
+ ]
+ }
+ }
+ ]
+}
+```
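
A response like the one above can be summarized with a few lines of Python. This is a sketch that assumes the field names shown in the example; since the API is not stable, they may differ between releases. The function name is made up, and fetching the body over HTTP is left out.

```python
import json


def attributes_not_loaded(status: dict) -> list:
    """List (database, attribute, state) for attributes that are not yet loaded."""
    pending = []
    for db in status.get("dbs", []):
        ready = db.get("ready_subdb", {})
        for key in ("loading_attributes", "queued_attributes"):
            for attr in ready.get(key, []):
                pending.append((db.get("name"), attr.get("name"), attr.get("state")))
    return pending


# Trimmed copy of the example response above.
body = """
{"state": "initializing",
 "dbs": [{"name": "dbname",
          "ready_subdb": {
            "loaded_attributes": [{"name": "int_field", "state": "loaded"}],
            "loading_attributes": [{"name": "tensor_field", "state": "reprocessing"}],
            "queued_attributes": []}}]}
"""
pending = attributes_not_loaded(json.loads(body))
```

Using `.get()` throughout keeps the helper tolerant of fields that are missing or renamed in other releases.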
+
+### Custom Component State API
+
+The custom component status can be found by HTTP GET at `http://host:stateport/state/v1/custom/component`. It gives an overview of the relevant search node components and their internal state. Note that this is **not** a stable API, and it will expand and change between releases.
+
+Example `state/v1/custom/component`:
+
+```json expandable
+{
+ "documentdb": {
+ "mydoctype": {
+ "documentType": "mydoctype",
+ "status": {
+ "state": "ONLINE",
+ "configState": "OK"
+ },
+ "documents": {
+ "active": 10,
+ "ready": 10,
+ "total": 10,
+ "removed": 0
+ },
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype"
+ }
+ },
+ "threadpools": {
+ "url": "http://host:stateport/state/v1/custom/component/threadpools"
+ },
+ "matchengine": {
+ "status": {
+ "state": "ONLINE"
+ },
+ "url": "http://host:stateport/state/v1/custom/component/matchengine"
+ },
+ "flushengine": {
+ "url": "http://host:stateport/state/v1/custom/component/flushengine"
+ },
+ "tls": {
+ "url": "http://host:stateport/state/v1/custom/component/tls"
+ },
+ "hwinfo": {
+ "url": "http://host:stateport/state/v1/custom/component/hwinfo"
+ },
+ "resourceusage": {
+ "url": "http://host:stateport/state/v1/custom/component/resourceusage",
+ "disk": 0.25,
+ "memory": 0.35,
+ "attribute_address_space": 0
+ },
+ "session": {
+ "search": {
+ "url": "http://host:stateport/state/v1/custom/component/session/search",
+ "numSessions": 0
+ }
+ }
+}
+```
+
+Example `state/v1/custom/component/documentdb/mydoctype`:
+
+```json expandable
+{
+ "documentType": "mydoctype",
+ "status": {
+ "state": "ONLINE",
+ "configState": "OK"
+ },
+ "documents": {
+ "active": 10,
+ "ready": 10,
+ "total": 10,
+ "removed": 0
+ },
+ "subdb": {
+ "removed": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/subdb/removed"
+ },
+ "ready": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/subdb/ready"
+ },
+ "notready": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/subdb/notready"
+ }
+ },
+ "threadingservice": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/threadingservice"
+ },
+ "bucketdb": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/bucketdb",
+ "numBuckets": 1
+ },
+ "maintenancecontroller": {
+ "url": "http://host:stateport/state/v1/custom/component/documentdb/mydoctype/maintenancecontroller"
+ }
+}
+```
diff --git a/mintlify-docs/en/examples/assets/billion-vector-2vcpu.png b/mintlify-docs/en/examples/assets/billion-vector-2vcpu.png
new file mode 100644
index 0000000000..98172aac9e
Binary files /dev/null and b/mintlify-docs/en/examples/assets/billion-vector-2vcpu.png differ
diff --git a/mintlify-docs/en/examples/assets/billion-vector-8vcpu.png b/mintlify-docs/en/examples/assets/billion-vector-8vcpu.png
new file mode 100644
index 0000000000..442eaee987
Binary files /dev/null and b/mintlify-docs/en/examples/assets/billion-vector-8vcpu.png differ
diff --git a/mintlify-docs/en/examples/assets/billion-vector-feed-queries.png b/mintlify-docs/en/examples/assets/billion-vector-feed-queries.png
new file mode 100644
index 0000000000..14a5ad3914
Binary files /dev/null and b/mintlify-docs/en/examples/assets/billion-vector-feed-queries.png differ
diff --git a/mintlify-docs/en/examples/billion-scale-image-search.mdx b/mintlify-docs/en/examples/billion-scale-image-search.mdx
new file mode 100644
index 0000000000..d025cbf648
--- /dev/null
+++ b/mintlify-docs/en/examples/billion-scale-image-search.mdx
@@ -0,0 +1,450 @@
+---
+title: "Billion Scale Image Search"
+---
+
+This sample application combines two sample applications to implement
+cost-efficient large scale image search over multimodal AI powered vector representations;
+[text-image-search](https://github.com/vespa-engine/sample-apps/tree/master/text-image-search) and
+[billion-scale-vector-search](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-vector-search).
+
+## The Vector Dataset
+This sample app uses the [LAION-5B](https://laion.ai/blog/laion-5b/) dataset,
+ the largest openly accessible image-text dataset in the world.
+
+> Large image-text models like ALIGN, BASIC, Turing Bletchly, FLORENCE & GLIDE have
+> shown better and better performance compared to previous flagship models like CLIP and DALL-E.
+> Most of them had been trained on billions of image-text pairs and unfortunately, no datasets of this size had been openly available until now.
+> To address this problem we present LAION 5B, a large-scale dataset for research purposes
+> consisting of 5,85B CLIP-filtered image-text pairs. 2,3B contain English language,
+> 2,2B samples from 100+ other languages and 1B samples have texts that do not allow a certain language assignment (e.g. names ).
+
+The LAION-5B dataset was used to train the popular text-to-image generative StableDiffusion model.
+
+
+Note the following about the LAION 5B dataset:
+
+> Be aware that this large-scale dataset is un-curated.
+> Keep in mind that the un-curated nature of the dataset means that collected
+> links may lead to strongly discomforting and disturbing content for a human viewer.
+
+
+
+The released dataset does not contain image data itself,
+but [CLIP](https://openai.com/research/clip) encoded vector representations of the images,
+and metadata like `url` and `caption`.
+
+## Use cases
+
+The app can be used to implement several use cases over the LAION dataset, or adapted to your large-scale vector dataset:
+
+- Search with a free text prompt over the `caption` or `url` fields in the LAION dataset using Vespa's standard text-matching functionality.
+- CLIP retrieval using vector search: given a text prompt (for example 'french cat'), search the image vector representations (CLIP ViT-L/14).
+- Given an image vector representation, search for similar images in the dataset. This can for example
+be used to take the output image of StableDiffusion to find similar images in the training dataset.
+
+All of this can be combined using [Vespa's query language](/en/querying/query-language),
+ and also with filters.
+
+## Vespa Primitives Demonstrated
+
+The sample application demonstrates many Vespa primitives:
+
+- Importing an [ONNX](https://onnx.ai/)-exported version of [CLIP ViT-L/14](https://github.com/openai/CLIP)
+for [accelerated inference](https://blog.vespa.ai/stateful-model-serving-how-we-accelerate-inference-using-onnx-runtime/)
+in [Vespa stateless](/en/learn/overview) containers.
+The exported CLIP model encodes a free-text prompt to a joint image-text embedding space with 768 dimensions.
+- [HNSW](/en/querying/approximate-nn-hnsw) indexing of vector centroids drawn
+from the dataset, and combination with classic Inverted File as described in
+[Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/).
+- Decoupling of vector storage and vector similarity computations. The stateless layer performs vector
+similarity computation over the full precision vectors.
+By using Vespa's support for accelerated inference with [onnxruntime](https://onnxruntime.ai/),
+moving the majority of the vector compute to the stateless layer
+allows for faster auto-scaling with daily query volume changes.
+The full precision vectors are stored in Vespa's summary log store, using lossless compression (zstd).
+- Dimension reduction with PCA - The centroid vectors are compressed from 768 dimensions to 128 dimensions. This allows indexing 6x more
+centroids on the same instance type due to the reduced memory footprint. With Vespa's support for distributed search, coupled with powerful
+high memory instances, this allows Vespa to scale cost efficiently to trillion-sized vector datasets.
+- The trained PCA matrix and matrix multiplication which projects the 768-dim vectors to 128-dimensions is
+evaluated in Vespa using accelerated inference, both at indexing time and at query time. The PCA weights are represented also using ONNX.
+- Phased ranking.
+The image embedding vectors are also projected to 128 dimensions, stored using
+memory mapped [paged attribute tensors](/en/content/attributes#paged-attributes).
+Full precision vectors are stored on disk in the Vespa summary store.
+The first-phase coarse search ranks vectors in the reduced vector space, per node, and results are merged from all nodes before
+the final ranking phase in the stateless layer.
+The final ranking phase is implemented in the stateless container layer using [accelerated inference](https://blog.vespa.ai/stateful-model-serving-how-we-accelerate-inference-using-onnx-runtime/).
+- Combining approximate nearest neighbor search with [filters](https://blog.vespa.ai/constrained-approximate-nearest-neighbor-search/), filtering
+can be on url, caption, image height, width, safety probability, NSFW label, and more.
+- Hybrid ranking, both textual sparse matching features and the CLIP similarity, can be used when ranking images.
+- Reduced tensor cell precision. The original LAION-5B dataset uses `float16`. The app uses Vespa's support for `bfloat16` tensors,
+ saving 50% of storage compared to full `float` representation.
+- Caching. The reduced vectors (128 dimensions) are cached by the OS buffer cache, and the full 768-dimensional vectors are cached using the Vespa summary cache.
+- Query-time vector de-duping and diversification of the search engine result page using document to document similarity instead of query to document similarity. Also
+accelerated by stateless model inference.
+- Scale, from a single node deployment to multi-node deployment using managed [Vespa Cloud](/),
+or self-hosted on-premise.
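+
+The phased ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's implementation: the function names and the toy 4-dimensional vectors are hypothetical, and keeping the leading components is only a stand-in for the real PCA projection.
+
+```python
+def dot(a, b):
+    """Inner product of two equal-length vectors."""
+    return sum(x * y for x, y in zip(a, b))
+
+
+def reduce_dims(vector, dims):
+    """Stand-in for the PCA projection (assumption): keep the leading components."""
+    return vector[:dims]
+
+
+def phased_rank(query, docs, reduced_dims, rank_count, hits):
+    """Rank docs (a dict of document id -> full-precision vector) in two phases."""
+    reduced_query = reduce_dims(query, reduced_dims)
+    # First phase: cheap scoring in the reduced vector space.
+    coarse = sorted(
+        docs,
+        key=lambda doc_id: dot(reduced_query, reduce_dims(docs[doc_id], reduced_dims)),
+        reverse=True,
+    )[:rank_count]
+    # Final phase: re-rank only the survivors using the full-precision vectors.
+    final = sorted(coarse, key=lambda doc_id: dot(query, docs[doc_id]), reverse=True)
+    return final[:hits]
+
+
+docs = {
+    "a": [0.9, 0.1, 0.0, 0.4],
+    "b": [0.8, 0.2, 0.5, 0.0],
+    "c": [0.1, 0.9, 0.0, 0.0],
+}
+query = [1.0, 0.0, 1.0, 0.0]
+top = phased_rank(query, docs, reduced_dims=2, rank_count=2, hits=1)
+```
+
+In this toy data, document `a` wins the coarse phase, but `b` wins once the full vectors are used. That is the reason for re-ranking a larger candidate set than the number of hits returned.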
+
+
+## Stateless Components
+The app contains several [container components](/en/applications/components):
+
+- [RankingSearcher](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/searcher/RankingSearcher.java) implements the last-stage ranking over
+full-precision vectors, using an ONNX model for accelerated inference.
+- [DedupingSearcher](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/searcher/DeDupingSearcher.java) implements run-time de-duping after ranking, based on a
+document-to-document similarity matrix computed with an ONNX model for accelerated inference.
+- [DimensionReducer](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/DimensionReducer.java) reduces vectors from 768 to 128 dimensions using PCA.
+- [AssignCentroidsDocProc](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/docproc/AssignCentroidsDocProc.java) searches the HNSW graph content cluster
+during ingestion to find the nearest centroids of the incoming vector.
+- [SPANNSearcher](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/searcher/SPANNSearcher.java)
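+
+To make the de-duping step concrete, here is a hedged sketch of greedy de-duplication based on document-to-document similarity. It is not the code of the DedupingSearcher. The function names are hypothetical, cosine similarity is an assumption, and the `0.95` threshold mirrors the default listed for `collapse.similarity.threshold` later in this guide.
+
+```python
+import math
+
+
+def cosine(a, b):
+    """Cosine similarity between two non-zero vectors."""
+    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
+    return sum(x * y for x, y in zip(a, b)) / norm
+
+
+def dedupe(ranked_hits, threshold=0.95):
+    """Keep a hit only if it is not too similar to any better-ranked hit already kept.
+
+    ranked_hits is a list of (doc_id, vector) tuples, best hit first.
+    """
+    kept = []
+    for doc_id, vector in ranked_hits:
+        if all(cosine(vector, kept_vector) < threshold for _, kept_vector in kept):
+            kept.append((doc_id, vector))
+    return [doc_id for doc_id, _ in kept]
+
+
+hits = [
+    ("img1", [1.0, 0.0]),
+    ("img2", [0.99, 0.05]),  # near-duplicate of img1
+    ("img3", [0.0, 1.0]),
+]
+unique = dedupe(hits)
+```
+
+Because the hits are processed in rank order, the best-ranked image of each near-duplicate group is the one that survives.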
+
+## Deploying this app
+These steps demonstrate the app using a smaller subset of the LAION-5B vector dataset, suitable
+for playing around with the app on a laptop.
+
+**Requirements:**
+
+- [Docker](https://www.docker.com/) Desktop installed and running. 6GB available memory for Docker is recommended.
+ Refer to [Docker memory](/en/operations/self-managed/docker-containers#memory)
+ for details and troubleshooting
+- Alternatively, deploy using [Vespa Cloud](#deployment-note)
+- Operating system: Linux, macOS or Windows 10 Pro (Docker requirement)
+- Architecture: x86_64 or arm64
+- [Homebrew](https://brew.sh/) to install [Vespa CLI](/en/clients/vespa-cli), or download
+ a Vespa CLI release from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- Java 17 installed.
+- Python3 and numpy to process the vector dataset
+- [Apache Maven](https://maven.apache.org/install.html) - this sample app uses custom Java components and Maven is used to build the application.
+
+Verify Docker Memory Limits:
+
+
+```bash
+$ docker info | grep "Total Memory"
+# or
+$ podman info | grep "memTotal"
+```
+
+
+Install [Vespa CLI](/en/clients/vespa-cli):
+
+```bash
+$ brew install vespa-cli
+```
+
+
+For local deployment using docker image:
+
+```bash
+$ vespa config set target local
+```
+
+
+Use the [multi-node high availability](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA)
+template for inspiration for multi-node, on-premise deployments.
+
+Pull and start the vespa docker container image:
+
+```bash
+$ docker pull vespaengine/vespa
+$ docker run --detach --name vespa --hostname vespa-container \
+ --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
+ vespaengine/vespa
+```
+
+
+Verify that the configuration service (deploy api) is ready:
+
+```bash
+$ vespa status deploy --wait 300 ./app
+```
+
+
+Download this sample application:
+
+```bash
+$ vespa clone billion-scale-image-search myapp && cd myapp
+```
+
+Setup:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+
+ Go to [console.vespa-cloud.com](https://console.vespa-cloud.com/) and create your tenant (unless you already have one).
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+```bash
+$ brew install vespa-cli
+```
+ Windows/No Homebrew? See the [Vespa CLI page](/en/clients/vespa-cli) to download directly.
+
+
+**Configure the Vespa client:**
+```bash
+$ vespa config set target cloud
+$ vespa config set application vespa-team.autotest
+```
+ Use the tenant name from step 1 instead of "vespa-team", and replace in other steps in this example guide, too.
+
+
+**Get Vespa Cloud control plane access:**
+```bash
+$ vespa auth login
+```
+ Follow the instructions from the command to authenticate.
+
+
+**Clone a sample [application](/en/basics/applications):**
+```bash
+$ vespa clone billion-scale-image-search myapp && cd myapp
+```
+ See [sample-apps](https://github.com/vespa-engine/sample-apps) for other sample apps you can clone.
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+```bash
+$ vespa auth cert app
+```
+ It is a good idea to take note of the path to the `.pem` files written here.
+
+
+
+## Download Vector + Metadata
+
+These instructions use the first split file (0000) of a total of 2314 files in the LAION2B-en split.
+Download the vector data file:
+
+
+```bash
+$ curl --http1.1 -L -o img_emb_0000.npy \
+ https://the-eye.eu/public/AI/cah/laion5b/embeddings/laion2B-en/img_emb/img_emb_0000.npy
+```
+
+
+Download the metadata file:
+
+
+```bash
+$ curl -L -o metadata_0000.parquet \
+ https://the-eye.eu/public/AI/cah/laion5b/embeddings/laion2B-en/laion2B-en-metadata/metadata_0000.parquet
+```
+
+
+Install python dependencies to process the files:
+
+
+```bash
+$ python3 -m pip install pandas numpy requests mmh3 pyarrow
+```
+
+
+Generate centroids. This process randomly selects vectors from the dataset to represent
+centroids. Performing an incremental clustering can improve vector search recall and allow
+indexing fewer centroids. For simplicity, this tutorial uses random sampling.
+
+
+```bash
+$ python3 app/src/main/python/create-centroid-feed.py img_emb_0000.npy > centroids.jsonl
+```
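+
+The random sampling idea can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the logic of `create-centroid-feed.py`: the function name, the `fraction` parameter and the fixed seed are hypothetical.
+
+```python
+import random
+
+
+def sample_centroid_indices(num_vectors, fraction, seed=0):
+    """Pick a random subset of vector indices to act as centroids.
+
+    fraction is a hypothetical tuning knob: the share of vectors promoted to centroids.
+    """
+    rng = random.Random(seed)  # fixed seed so the selection is reproducible
+    count = max(1, int(num_vectors * fraction))
+    return sorted(rng.sample(range(num_vectors), count))
+
+
+centroid_indices = sample_centroid_indices(num_vectors=1000, fraction=0.1)
+```
+
+Random sampling needs no training pass over the data, which keeps the tutorial simple. A clustering step would pick more representative centroids at the cost of extra processing.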
+
+
+Generate the image feed. This merges the embedding data with the metadata and creates a Vespa
+JSONL feed file, with one JSON operation per line.
+
+
+```bash
+$ python3 app/src/main/python/create-joined-feed.py metadata_0000.parquet img_emb_0000.npy > feed.jsonl
+```
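+
+For orientation, one line of such a feed file could be produced as sketched below. The document id format follows the `id:laion:image::...` ids used in this guide. The helper name and the `vector` field name are assumptions, and the real script writes more fields than shown here.
+
+```python
+import json
+
+
+def to_feed_line(image_id, caption, url, vector):
+    """Serialize one Vespa put operation as a single JSON line."""
+    operation = {
+        "put": f"id:laion:image::{image_id}",
+        "fields": {
+            "caption": caption,
+            "url": url,
+            # "vector" is a hypothetical field name; see the schema for the real ones.
+            "vector": {"values": vector},
+        },
+    }
+    return json.dumps(operation)
+
+
+line = to_feed_line(
+    5775990047751962856,
+    "two dogs running on a sandy beach",
+    "https://example.com/dogs.jpg",
+    [0.1, 0.2, 0.3],
+)
+```
+
+Each call returns exactly one line, so a full feed file is the result of writing these lines one after another.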
+
+
+To process the entire dataset, we recommend starting several processes, each operating on separate split files
+as the processing implementation is single-threaded.
+
+
+## Build and deploy Vespa app
+
+`src/main/application/models` has three small ONNX models:
+
+- `vespa_innerproduct_ranker.onnx` for vector similarity (inner dot product) between the query and the vectors
+in the stateless container.
+- `vespa_pairwise_similarity.onnx` for matrix multiplication between the top retrieved vectors.
+- `pca_transformer.onnx` for dimension reduction, projecting the 768-dim vector space to a 128-dimensional space.
+
+These `ONNX` model files are generated by specifying the compute operation using [pytorch](https://pytorch.org/) and using `torch`'s
+ability to export the model to [ONNX](https://onnx.ai/) format:
+
+- [ranker_export.py](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/python/ranker_export.py)
+- [similarity_export.py](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/python/similarity_export.py)
+- [pca_transformer_export.py](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/python/pca_transformer_export.py)
+
+Build the sample app (make sure you have JDK 17, verify with `mvn -v`). This step
+also downloads a pre-exported ONNX model for mapping the prompt text to the CLIP vector embedding space:
+
+
+```bash
+$ mvn clean package -U -f app
+```
+
+
+Deploy the application. This step deploys the application package built in the previous step:
+
+
+```bash
+$ vespa deploy --wait 300 ./app
+```
+
+
+#### Deployment note
+It is possible to deploy this app to
+[Vespa Cloud](/en/basics/deploy-an-application-java).
+For Vespa cloud deployments to the [dev env](/en/operations/zones)
+replace the [src/main/application/services.xml](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/application/services.xml) with
+[src/main/application/services-cloud.xml](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/application/services-cloud.xml) -
+the cloud deployment uses dedicated clusters for `feed` and `query`.
+
+Wait for the application endpoint to become available:
+
+
+```bash
+$ vespa status --wait 300
+```
+
+
+Run [Vespa System Tests](/en/reference/applications/testing),
+which runs a set of basic tests to verify that the application is working as expected:
+
+```bash
+$ vespa test app/src/test/application/tests/system-test/feed-and-search-test.json
+```
+
+
+The _centroid_ vectors **must** be indexed first:
+
+
+```bash
+$ vespa feed centroids.jsonl
+$ vespa feed feed.jsonl
+```
+
+
+Track number of documents while feeding:
+
+
+```bash
+$ vespa query 'yql=select * from image where true' \
+ hits=0 \
+ ranking=unranked
+```
+
+
+
+## Fetching data
+
+Fetch a single document using [document api](/en/reference/api/document-v1):
+
+
+```bash
+$ vespa document get \
+ id:laion:image::5775990047751962856
+```
+
+
+The response contains all fields, including the full vector representation and the
+reduced vector, plus all the metadata. Everything represented in the same
+[schema](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/application/schemas/image.sd).
+
+
+## Query the data
+The following are a few query examples.
+`prompt` is a run-time query parameter used by the
+[CLIPEmbeddingSearcher](https://github.com/vespa-engine/sample-apps/tree/master/billion-scale-image-search/app/src/main/java/ai/vespa/examples/searcher/CLIPEmbeddingSearcher.java),
+which encodes the prompt text into a CLIP vector representation using the embedded CLIP model:
+
+
+```bash
+$ vespa query \
+ 'yql=select documentid, caption, url, height, width from image where nsfw contains "unlikely"'\
+ 'hits=10' \
+ 'prompt=two dogs running on a sandy beach'
+```
+
+
+Results are filtered by a constraint on the `nsfw` field. Note that even if the image is classified
+as `unlikely` the image content might still be explicit as the NSFW classifier is not 100% accurate.
+
+The returned images are ranked by CLIP similarity (the score is found in each hit's `relevance` field).
+
+The following query adds another filter, restricting the search so that only images crawled from URLs containing `shutterstock.com`
+are retrieved:
+
+
+```bash
+$ vespa query \
+ 'yql=select documentid, caption, url, height, width from image where nsfw contains "unlikely" and url contains "shutterstock.com"'\
+ 'hits=10' \
+ 'prompt=two dogs running on a sandy beach'
+```
+
+
+Another query restricts the search further by adding a phrase constraint, `caption contains phrase("sandy", "beach")`:
+
+
+```bash
+$ vespa query \
+ 'yql=select documentid, caption, url, height, width from image where nsfw contains "unlikely" and url contains "shutterstock.com" and caption contains phrase("sandy", "beach")'\
+ 'hits=10' \
+ 'prompt=two dogs running on a sandy beach'
+```
+
+
+Regular query, matching over the `default` fieldset, searching the `caption` and the `url` field, ranked by
+the `text` ranking profile:
+
+
+```bash
+$ vespa query \
+ 'yql=select documentid, caption, url from image where nsfw contains "unlikely" and userQuery()'\
+ 'hits=10' \
+ 'query=two dogs running on a sandy beach' \
+ 'ranking=text'
+```
+
+
+The `text` [rank](/en/basics/ranking) profile uses
+[nativeRank](/en/ranking/nativerank), one of Vespa's many
+text matching rank features.
+
+## Non-native hyperparameters
+There are several non-native query request
+parameters that control the tradeoff between vector search accuracy and performance. These
+can be set with the request, e.g., `/search/&spann.clusters=12`.
+
+- `spann.clusters`, default `64`, the number of centroids in the reduced vector space used to restrict the image search.
+A higher number improves recall, but increases computational complexity and disk reads.
+- `rank-count`, default `1000`, the number of vectors that are fully re-ranked in the container using the full vector representation.
+A higher number improves recall, but increases the computational complexity and network usage.
+- `collapse.enable`, default `true`, controls de-duping of the top ranked results using image to image similarity.
+- `collapse.similarity.max-hits`, default `1000`, the number of top-ranked hits to perform de-duping of. Must be less than `rank-count`.
+- `collapse.similarity.threshold`, default `0.95`, how similar two images must be before one is considered a duplicate.
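+
+As a sketch of how these parameters travel with a request, the snippet below builds a query URL using only the Python standard library. The helper name is hypothetical, and `http://localhost:8080` is an assumption that matches the port published by the local Docker setup above. The parameter names are the ones listed in this section.
+
+```python
+from urllib.parse import urlencode
+
+
+def build_search_url(endpoint, prompt, tuning=None):
+    """Return a /search/ URL for a prompt plus optional tuning parameters."""
+    params = {
+        "yql": 'select documentid, caption, url from image where nsfw contains "unlikely"',
+        "prompt": prompt,
+        "hits": 10,
+    }
+    params.update(tuning or {})  # request-level overrides of the defaults
+    return endpoint.rstrip("/") + "/search/?" + urlencode(params)
+
+
+url = build_search_url(
+    "http://localhost:8080",
+    "two dogs running on a sandy beach",
+    {"spann.clusters": 12, "rank-count": 500, "collapse.similarity.max-hits": 100},
+)
+```
+
+Lowering `spann.clusters` and `rank-count` as in this example trades recall for lower latency, which makes the effect of each parameter easy to observe on a small dataset.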
+
+## Areas of improvement
+There are several areas that could be improved.
+
+- CLIP model. The exported text transformer model uses a fixed sequence length (77). This wastes computation and makes
+the model a lot slower than it has to be for shorter sequence lengths. A dynamic sequence length would
+make encoding short queries a lot faster than the current model.
+It would also be interesting to use the text encoder as a teacher and train a smaller distilled model using a different architecture (for example based on smaller MiniLM models).
+- CLIP query embedding caching. The CLIP model is fixed and only uses the text input. Caching the map from text to
+embedding would save resources.
+
+## Shutdown and remove the container:
+
+
+```bash
+$ vespa destroy --force
+```
+
+
diff --git a/mintlify-docs/en/examples/billion-scale-vector-search.mdx b/mintlify-docs/en/examples/billion-scale-vector-search.mdx
new file mode 100644
index 0000000000..7439018cbc
--- /dev/null
+++ b/mintlify-docs/en/examples/billion-scale-vector-search.mdx
@@ -0,0 +1,290 @@
+---
+title: "SPANN Billion Scale Vector Search"
+---
+
+
+The SPANN (Space Partitioned ANN) approach for approximate nearest neighbor search is described in
+[SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search](https://arxiv.org/abs/2111.08566).
+SPANN uses a hybrid combination of graph and inverted index methods for approximate nearest neighbor search.
+
+We recommend you read [Billion-scale vector search using hybrid HNSW-IF](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/)
+for details on how SPANN is implemented using Vespa, before running this example application.
+Excerpt:
+
+> SPANN searches for the k closest centroid vectors of the query vector in the in-memory ANN search data structure.
+> Then, it reads the k associated posting lists for the retrieved centroids
+> and computes the distance between the query vector and the vector data read from the posting list:
+
+
+
+
+
+This sample application demonstrates how to represent SPANN using Vespa.
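+
+The search procedure from the excerpt can be sketched as follows. This is a toy illustration under stated assumptions, not the Vespa implementation: all names are hypothetical, squared Euclidean distance is assumed, and the centroid lookup is brute force where SPANN uses an in-memory ANN structure.
+
+```python
+def squared_distance(a, b):
+    """Squared Euclidean distance between two equal-length vectors."""
+    return sum((x - y) ** 2 for x, y in zip(a, b))
+
+
+def spann_search(query, centroids, posting_lists, vectors, k_centroids, hits):
+    """Toy SPANN search over in-memory dicts."""
+    # Step 1: find the k closest centroids (brute force in this sketch).
+    closest = sorted(
+        centroids, key=lambda cid: squared_distance(query, centroids[cid])
+    )[:k_centroids]
+    # Step 2: read the posting lists associated with those centroids.
+    candidates = set()
+    for cid in closest:
+        candidates.update(posting_lists[cid])
+    # Step 3: compute the distance between the query and each candidate vector.
+    ranked = sorted(candidates, key=lambda vid: squared_distance(query, vectors[vid]))
+    return ranked[:hits]
+
+
+centroids = {"c0": [0.0, 0.0], "c1": [10.0, 10.0]}
+posting_lists = {"c0": ["v0", "v1"], "c1": ["v2"]}
+vectors = {"v0": [1.0, 1.0], "v1": [0.0, 2.0], "v2": [9.0, 9.0]}
+result = spann_search([0.5, 0.5], centroids, posting_lists, vectors, k_centroids=1, hits=2)
+```
+
+Only the vectors in the selected posting lists are compared with the query, so the number of centroids searched controls the tradeoff between recall and work per query.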
+
+Setup:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+
+ Go to [console.vespa-cloud.com](https://console.vespa-cloud.com/) and create your tenant (unless you already have one).
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+```bash
+$ brew install vespa-cli
+```
+ Windows/No Homebrew? See the [Vespa CLI page](/en/clients/vespa-cli) to download directly.
+
+
+**Configure the Vespa client:**
+```bash
+$ vespa config set target cloud
+$ vespa config set application vespa-team.autotest
+```
+ Use the tenant name from step 1 instead of "vespa-team", and replace in other steps in this example guide, too.
+
+
+**Get Vespa Cloud control plane access:**
+```bash
+$ vespa auth login
+```
+ Follow the instructions from the command to authenticate.
+
+
+**Clone a sample [application](/en/basics/applications):**
+```bash
+$ vespa clone billion-scale-vector-search myapp && cd myapp
+```
+ See [sample-apps](https://github.com/vespa-engine/sample-apps) for other sample apps you can clone.
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+```bash
+$ vespa auth cert app
+```
+ It is a good idea to take note of the path to the `.pem` files written here.
+
+
+
+## Download Vector Data
+This sample app uses the Microsoft SPACEV vector dataset from [big-ann-benchmarks.com](https://big-ann-benchmarks.com/).
+It uses the first 10M vectors of the 100M slice sample.
+This sample file is about 1GB (10M vectors):
+
+```bash
+$ curl -L -o spacev10m_base.i8bin \
+ https://data.vespa-cloud.com/sample-apps-data/spacev10m_base.i8bin
+```
+
+
+Install dependencies and create the feed files for the first 10M vectors from the 100M sample:
+
+```bash
+$ pip3 install numpy requests tqdm
+```
+
+
+```bash
+$ python3 app/src/main/python/create-vespa-feed.py spacev10m_base.i8bin
+```
+
+Output:
+* `graph-vectors.jsonl`
+* `if-vectors.jsonl`
+
+
+## Build and deploy Vespa app
+Build the application:
+
+```bash
+$ mvn clean package -U -f app
+```
+
+
+Deploy the application:
+
+```bash
+$ vespa deploy --wait 900 ./app
+```
+
+
+Wait for the application endpoint to become available:
+
+```bash
+$ vespa status --wait 300
+```
+
+
+Test [basic functionality](https://github.com/vespa-engine/sample-apps/blob/master/billion-scale-vector-search/app/src/test/application/tests/system-test/feed-and-search-test.json):
+
+```bash
+$ vespa test app/src/test/application/tests/system-test/feed-and-search-test.json
+```
+
+See [CD tests](/en/operations/automated-deployments#cd-tests) for details.
+
+## Feed data
+The _graph_ vectors must be fed before the _if_ vectors:
+
+```bash
+$ vespa feed graph-vectors.jsonl
+```
+
+
+```bash
+$ vespa feed if-vectors.jsonl
+```
+
+
+Now is a good time to open the
+Vespa Cloud Dashboard
+to track progress.
+
+Refer to [<resources>](https://github.com/vespa-engine/sample-apps/blob/master/billion-scale-vector-search/app/src/main/application/services.xml)
+configuration to manage the feeding speed - more CPU is better, e.g.:
+```
+
+```
+Use the [instance type reference](/en/performance/instance-types/aws-instance-types) to find good combinations.
+Run time for a 2 VCPU deployment vs. 8 VCPU:
+
+
+
+
+
+
+
+
+
+Observe the feed and query phases (below) of this guide:
+
+
+
+
+
+## Recall Evaluation
+Download the query vectors and the ground truth for the first 10M vectors:
+
+```bash
+$ curl -L -o query.i8bin \
+ https://github.com/microsoft/SPTAG/raw/main/datasets/SPACEV1B/query.bin
+$ curl -L -o spacev10m_gt100.i8bin \
+ https://data.vespa-cloud.com/sample-apps-data/spacev10m_gt100.i8bin
+```
+
+Find the path to the credentials from the `vespa auth cert` step above, for example:
+
+```txt
+/Users/username/.vespa/tenant_name.autotest.default/data-plane-public-cert.pem
+```
+
+Replace the two filenames in the command below.
+(This is not needed when running a [local test](#local-test-with-oci-image))
+
+Run the first 1K queries and evaluate recall@10. A higher number of clusters gives higher recall:
+
+```bash
+$ ENDPOINT=$(vespa status --format=plain)
+$ python3 app/src/main/python/recall.py \
+ --endpoint ${ENDPOINT}/search/ \
+ --query_file query.i8bin \
+ --query_gt_file spacev10m_gt100.i8bin \
+ --certificate $PWD/../.vespa/vespa-team.autotest.default/data-plane-public-cert.pem \
+ --key $PWD/../.vespa/vespa-team.autotest.default/data-plane-private-key.pem
+```
+
+
+See the [blog post](https://blog.vespa.ai/vespa-hybrid-billion-scale-vector-search/#hnsw-if-accuracy)
+for details about this script.
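+
+For reference, recall@k is the share of the true k nearest neighbors that appear among the top k returned hits. The snippet below is a minimal sketch of that definition with a hypothetical function name. It is not the code of `recall.py`.
+
+```python
+def recall_at_k(retrieved, ground_truth, k=10):
+    """Fraction of the true top-k neighbors found among the top-k retrieved ids."""
+    expected = set(ground_truth[:k])
+    found = expected.intersection(retrieved[:k])
+    return len(found) / len(expected)
+
+
+# Two of the four true neighbors (ids 1 and 2) were retrieved.
+score = recall_at_k(retrieved=[1, 2, 3, 4], ground_truth=[1, 2, 9, 8], k=4)
+```
+
+Averaging this value over all evaluated queries gives the recall figure for a given configuration.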
+
+
+```bash
+$ vespa destroy --force
+```
+
+
+
+## Local test with OCI image
+
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers.html) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- [Java 17](https://openjdk.org/projects/jdk/17/).
+- [Apache Maven](https://maven.apache.org/install.html) is used to build the application.
+
+
+
+
+Verify memory limits:
+
+```bash
+$ docker info | grep "Total Memory"
+```
+or
+
+```bash
+$ podman info | grep "memTotal"
+```
+
+
+Install [Vespa CLI](/en/clients/vespa-cli):
+
+```bash
+$ brew install vespa-cli
+```
+
+
+For local deployment:
+
+```bash
+$ vespa config set target local
+```
+
+
+Download this sample application:
+
+```bash
+$ vespa clone billion-scale-vector-search myapp && cd myapp
+```
+
+
+Pull and start the Vespa image:
+
+```bash
+$ docker pull vespaengine/vespa
+$ docker run --detach --name vespa --hostname vespa-container \
+ --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
+ vespaengine/vespa
+```
+
+
+Verify that the configuration service (deploy api) is ready:
+
+```bash
+$ vespa status deploy --wait 300
+```
+
+
+At this point, you can continue the guide from [download vector data](#download-vector-data).
+
+### Cleanup
+When done, remove the container:
+
+```bash
+$ docker rm -f vespa
+```
+
diff --git a/mintlify-docs/en/examples/rag-blueprint.mdx b/mintlify-docs/en/examples/rag-blueprint.mdx
new file mode 100644
index 0000000000..5a9ae41949
--- /dev/null
+++ b/mintlify-docs/en/examples/rag-blueprint.mdx
@@ -0,0 +1,110 @@
+---
+title: "The RAG Blueprint"
+---
+
+Vespa is the [platform of choice](https://blog.vespa.ai/perplexity-builds-ai-search-at-scale-on-vespa-ai/)
+for large scale RAG applications like Perplexity.
+It gives you all the features you need but putting them all together can be a challenge.
+
+This open source sample application contains all the elements you need to create a RAG application that
+
+* delivers state-of-the-art quality, and
+* scales to any amount of data, query load, and complexity.
+
+This README provides the steps to create and run your own application based on the blueprint.
+Refer to the [RAG Blueprint tutorial](/en/learn/tutorials/rag-blueprint.html) for more in-depth explanations,
+or try out the [Python notebook](https://vespa-engine.github.io/pyvespa/examples/rag-blueprint-vespa-cloud.html).
+
+Setup:
+
+
+
+**Create a [tenant](/en/learn/tenant-apps-instances) on Vespa Cloud:**
+ Go to [console.vespa-cloud.com](https://console.vespa-cloud.com/) and create your tenant (unless you already have one).
+
+
+**Install the [Vespa CLI](/en/clients/vespa-cli)** using [Homebrew](https://brew.sh/):
+```bash
+$ brew install vespa-cli
+```
+ Windows/No Homebrew? See the [Vespa CLI page](/en/clients/vespa-cli) to download directly.
+
+
+**Configure the Vespa client:**
+```bash
+$ vespa config set target cloud
+$ vespa config set application vespa-team.autotest
+```
+ Use the tenant name from step 1 instead of "vespa-team", and replace in other steps in this example guide, too.
+
+
+**Get Vespa Cloud control plane access:**
+```bash
+$ vespa auth login
+```
+ Follow the instructions from the command to authenticate.
+
+
+**Clone a sample [application](/en/basics/applications):**
+```bash
+$ vespa clone rag-blueprint myapp && cd myapp
+```
+See [sample-apps](https://github.com/vespa-engine/sample-apps) for other sample apps you can clone.
+
+
+**Add a certificate for [data plane access](/en/security/guide#data-plane) to the application:**
+```bash
+$ vespa auth cert app
+```
+It is a good idea to take note of the path to the `.pem` files written here.
+
+
+
+
+## Test the application
+
+
+```bash
+$ vespa deploy --wait 900 ./app
+```
+
+
+Feed some documents. This also chunks and embeds them, so it takes about 3 minutes:
+
+
+```bash
+$ vespa feed dataset/docs.jsonl
+```
+
+
+Now you can issue queries:
+
+
+```bash
+$ vespa query 'query=yc b2b sales'
+```
+
+
+
+```bash
+$ vespa destroy --force
+```
+
+
+**TIP:**
+
+Add "-v" to see the HTTP request this becomes.
+
+
+Congratulations! You have now created a RAG application that can scale to billions of documents and thousands
+of queries per second, while delivering state-of-the-art quality.
+
+## Explore more
+
+What do you want to do next?
+
+- To learn what this application can do, look at the files in your app/ dir.
+- [Run your application locally using Docker](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/deploy-locally.md)
+- [Using query profiles to define behavior for different use cases](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/query-profiles.md)
+- [Evaluate and improve relevance of the data returned](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/relevance.md)
+- [Do LLM generation inside the application](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/relevance.md)
diff --git a/mintlify-docs/en/learn/about-documentation.mdx b/mintlify-docs/en/learn/about-documentation.mdx
new file mode 100644
index 0000000000..5fb452635e
--- /dev/null
+++ b/mintlify-docs/en/learn/about-documentation.mdx
@@ -0,0 +1,68 @@
+---
+title: About this documentation
+description: "The Vespa documentation [https://docs.vespa.ai](https://docs.vespa.ai) provides all the information required to use all Vespa features and deploy them in any supported environment."
+---
+
+It is split into guides and tutorials, which explain features and how to use them to solve problems, and reference documentation, which lists complete information about all features and APIs.
+
+## Applicability
+
+The Vespa platform is open source, and can be deployed in self-managed systems and on the Vespa Cloud service. Some add-ons (but no core functionality) are only available under a commercial license.
+
+Documents that describe functionality with such limited applicability are clearly marked by one or more of the following chips:
+
+| | |
+| :--- | :--- |
+| **Vespa Cloud** | Only applicable to Vespa Cloud deployments. |
+| **Self-managed** | Only applicable to self-managed deployments. |
+| **Enterprise** | Not open source: Available commercially only (both self-managed and on cloud unless also marked by one of the other chips above). |
+
+For clarity, any document *not* marked with any of these chips describes functionality that is open source and available both on Vespa Cloud and self-managed deployments.
+
+## Contributing
+
+If you find errors or want to improve the documentation, [create an issue](https://github.com/vespa-engine/vespa/issues) or [contribute a fix](/en/learn/contributing). See the [README](https://docs.vespa.ai/README.md) before contributing.
+
+## Notation
+
+*Italic* is used for:
+
+- Pathnames, filenames, program names, hostnames, and URLs
+- New terms where they are defined
+
+`Constant Width` is used for:
+
+- Programming language elements, code examples, keywords, functions, classes, interfaces, methods, etc.
+- Commands and command-line output
+
+Commands meant to be run on the command line are shown like this, prepended by a $ for the prompt:
+
+```bash
+$ export PATH=$VESPA_HOME/bin:$PATH # how to highlight text in pre
+```
+
+Notes and other important pieces of information are shown like:
+
+
+**Note:**
+
+Some info here
+
+
+
+**Important:**
+
+Important info here
+
+
+
+**Warning:**
+
+Warning here
+
+
+
+**Deprecation:**
+
+Deprecation warning here
+
diff --git a/mintlify-docs/en/learn/contributing.mdx b/mintlify-docs/en/learn/contributing.mdx
new file mode 100644
index 0000000000..476b071522
--- /dev/null
+++ b/mintlify-docs/en/learn/contributing.mdx
@@ -0,0 +1,28 @@
+---
+title: Contributing to Vespa
+description: "Contributions to [Vespa](https://github.com/vespa-engine/vespa) and the [Vespa documentation](https://github.com/vespa-engine/documentation) are welcome. This document tells you what you need to know to contribute."
+---
+
+## Open development
+
+All work on Vespa happens directly on GitHub, using the [GitHub flow model](https://docs.github.com/en/get-started/quickstart/github-flow). We release the master branch a few times a week, and you should expect it to almost always work. In addition to the [builds seen on factory.vespa.ai](https://factory.vespa.ai) we have a large acceptance and performance test suite which is also run continuously.
+
+### Pull requests
+
+All pull requests are reviewed by a member of the Vespa Committers team. You can find a suitable reviewer in the OWNERS file upward in the source tree from where you are making the change (the OWNERS have a special responsibility for ensuring the long-term integrity of a portion of the code). If you want to become a committer/OWNER, making some quality contributions is the way to start.
+
+We require all pull request checks to pass.
+
+## Versioning
+
+Vespa uses semantic versioning - see [vespa versions](/en/learn/releases). Notice in particular that any Java API in a package having a @PublicAPI annotation in the package-info file cannot be changed in an incompatible way between major versions: Existing types and method signatures must be preserved (but can be marked deprecated).
+
+## Issues
+
+We track issues in [GitHub issues](https://github.com/vespa-engine/vespa/issues). It is fine to submit issues also for feature requests and ideas, whether you intend to work on them or not.
+
+There is also a [ToDo list](https://github.com/vespa-engine/vespa/blob/master/TODO.md) for larger things which no one is working on yet.
+
+## Community
+
+If you have questions, want to share your experience or help others, please join our community on the [Vespa Slack](https://slack.vespa.ai), or see Vespa on [Stack Overflow](http://stackoverflow.com/questions/tagged/vespa).
diff --git a/mintlify-docs/en/learn/faq.mdx b/mintlify-docs/en/learn/faq.mdx
new file mode 100644
index 0000000000..6a53f10e4f
--- /dev/null
+++ b/mintlify-docs/en/learn/faq.mdx
@@ -0,0 +1,749 @@
+---
+title: "FAQ - frequently asked questions"
+sidebarTitle: "Frequently asked questions"
+description: "Refer to Vespa Support for more support options."
+---
+
+## Ranking
+
+
+
+[Ranking](/en/basics/ranking) is maybe the primary Vespa feature - we like to think of it as scalable, online computation. A rank profile is where the application's logic is implemented, supporting simple types like `double` and complex types like `tensor`. Supply ranking data in queries in query features (e.g. different weights per customer), or look up in a [Searcher](/en/applications/searchers). Typically, a document (e.g. product) "feature vector"/"weights" will be compared to a user-specific vector (tensor).
+
+
+
+Vespa doesn't have specific support for storing customer data as such. You can store this data as a separate document type in Vespa and look it up before passing the query, or store this customer meta-data as part of the other meta-data for the customer (i.e. login information) and pass it along the query when you send it to the backend. Find an example on how to look up data in [album-recommendation-docproc](https://github.com/vespa-engine/sample-apps/tree/master/examples/document-processing).
+
+
+
+Create a tensor in the ranking function from arrays or weighted sets using `tensorFrom...` functions - see [document features](/en/reference/ranking/rank-features#document-features).
+
+
+
+Pass a ranking feature like `query(threshold)` and use an `if` statement in the ranking expression - see [retrieval and ranking](/en/ranking/ranking-intro#retrieval-and-ranking). Example:
+
+```txt
+rank-profile drop-low-score {
+    function my_score() {
+        expression: ..... #custom first phase score
+    }
+    rank-score-drop-limit: 0.0
+    first-phase {
+        expression: if (my_score() < query(threshold), -1, my_score())
+    }
+}
+```
+
+
+
+No, rank expressions are not evaluated lazily, as this would require lambda arguments. Only doubles and tensors are passed between functions.
+
+Example:
+
+```txt
+function inline foo(tensor, defaultVal) {
+    expression: if (count(tensor) == 0, defaultVal, sum(tensor))
+}
+
+function bar() {
+    # sum(tensor1 * tensor2) is always evaluated, even when foo does not use defaultVal
+    expression: foo(tensor, sum(tensor1 * tensor2))
+}
+```
+
+
+
+Yes, this can be accomplished by configuring [match-phase](/en/reference/schemas/schemas#match-phase) in the rank profile, or by adding a range query item using *hitLimit* to the query tree, see [capped numeric range search](/en/reference/querying/yql#numeric). Both methods require an *attribute* field with *fast-search*. The capped range query is faster, but beware that if there are other restrictive filters in the query, one might end up with 0 hits. The additional filters are applied as a post filtering step over the hits from the capped range query. *match-phase* on the other hand, is safe to use with filters or other query terms, and also supports diversification which the capped range query term does not support.
+
+
+
+If a rank profile produces NaNs or Infinities - which cannot be represented as numbers in JSON - the strings "Infinity" or "-Infinity" are returned in result sets (NaN becomes "-Infinity"), and client libraries might not handle that by default (e.g., Golang).
+
+The returned [relevance](/en/reference/querying/default-result-format#relevance) for a hit can become "-Infinity" instead of a double. Possible causes:
+
+- The [ranking](/en/basics/ranking) expression used a feature which became `NaN` (Not a Number). For example, `log(0)` would produce `-Infinity`. Use [isNan](/en/reference/ranking/ranking-expressions#isnan-x) to guard against this.
+- Surfacing low scoring hits using [grouping](/en/querying/grouping), that is, rendering low ranking hits with `each(output(summary()))` that are outside what Vespa computed and caches on a heap. This is controlled by the [total-keep-rank-count](/en/reference/schemas/schemas#total-keep-rank-count) parameter.
+- Using unset fields in the ranking function
+
+Resolve this by one or more of:
+
+- Extend the client code to specifically handle these strings
+- Make sure the field is set to some value for all documents
+- Add a default value for the field when accessing it in your rank profile: `if (isNan(attribute(last_update)), 0, attribute(last_update))`
+- Add a final guard in the ranking expressions coercing to a large negative number, making non-finite scores sink to the bottom while remaining a valid number:
+
+```txt
+function finite_or_sentinel(x) {
+ expression: if (isNan(x - x), -1e9, x)
+}
+```
+
+- Use CBOR instead of JSON - using a binary format can represent NaNs and Infinities without issues, and it can also be faster/more efficient
+
+
+
+To hard-code documents to positions in the result set, see the [pin results example](/en/ranking/multivalue-query-operators#pin-results-example).
+
+
+
+## Documents
+
+
+
+There is a [maximum document size](/en/reference/applications/services/container#document-api) of 100 MiB, which is configurable per content cluster in services.xml.
+
+
+
+No enforced limit, except resource usage (memory).
+
+
+
+E.g. a product is offered in a list of stores with a quantity per store. Use [multivalue fields](/en/querying/searching-multivalue-fields) (array of struct) or [parent child](/en/schemas/parent-child). Which one to choose depends on the use case, see the discussion in the latter link.
+
+
+
+E.g. price and quantity available per store may often change vs the actual product attributes. Vespa supports [partial updates](/en/writing/reads-and-writes) of documents. Also, the parent/child feature is implemented to support use-cases where child elements are updated frequently, while a more limited set of parent elements are updated less frequently.
+
+
+
+See the [Vespa Consistency Model](/en/content/consistency). Vespa is not transactional in the traditional sense and doesn't have strict ACID guarantees. Vespa is designed for high performance use-cases with eventual consistency as an acceptable (and to some extent configurable) trade-off.
+
+
+
+Wildcard fields are not supported in Vespa. A workaround is to use maps to store the wildcard fields. The map needs to be defined with `indexing: attribute` and will hence be stored in memory. Refer to [map](/en/reference/schemas/schemas#map).
+
+
+
+Implement a [document processor](/en/applications/document-processors) for this.
+
+
+
+Set a selection criterion on the `document` element in `services.xml`. The criterion selects documents to keep. I.e. to purge documents "older than two weeks", the expression should be "newer than two weeks". Read more about [document expiry](/en/schemas/documents#document-expiry).
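+
+A minimal sketch of such a selection, assuming a hypothetical `music` document type with a `created_timestamp` field holding seconds since epoch (both names are illustrative). Two weeks is 1209600 seconds:
+
+```xml
+<documents garbage-collection="true">
+    <!-- Keep only documents newer than two weeks (14 * 24 * 3600 = 1209600 seconds) -->
+    <document type="music" mode="index"
+              selection="music.created_timestamp &gt; now() - 1209600" />
+</documents>
+```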
+
+
+
+Changing redundancy is a live and safe change (assuming there is headroom on disk / memory - e.g. from 2 to 3 is 50% more). The time to migrate will be quite similar to what it took to feed initially - a bit hard to say generally, and depends on IO and index settings, like if building an HNSW index. To monitor progress, take a look at the [multinode](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode) sample application for the _clustercontroller_ status page - this shows buckets pending, live. Finally, use the `.idealstate.merge_bucket.pending` metric to track progress - when 0, there are no more data syncing operations - see [monitor distance to ideal state](/en/operations/self-managed/admin-procedures#monitor-distance-to-ideal-state). Nodes will work as normal during data sync, and query coverage will be the same.
+
+
+
+It does not. A _namespace_ is a mechanism to split the document space into parts that can be used for document selection - see [documentation](/en/schemas/documents#namespace). The namespace is not indexed and cannot be searched using the query API, but can be used by [visiting](/en/writing/visiting).
+
+
+
+There are multiple things that can cause this, see [visiting troubleshooting](/en/writing/visiting#troubleshooting).
+
+
+
+Run a query like `vespa query "select * from sources * where true"` and see the `totalCount` field. Alternatively, use metrics or `vespa visit` - see [examples](/en/writing/batch-delete#example).
+
+
+
+Not in the field definition, but it's possible to do this with the [choice](/en/writing/indexing#choice-example) expression in an indexing statement.
+
+
+
+## Query
+
+
+
+Faceting is called grouping in Vespa. Groups can be multi-level.
+
+
+
+Add filters to the query with [YQL](/en/querying/query-language), using boolean, numeric and [text matching](/en/querying/text-matching) conditions. Query terms can be annotated as filters, which means that they are not highlighted when bolding results.
+
+
+
+One way is to describe items using tensors and query for the [nearest neighbor](/en/reference/querying/yql#nearestneighbor) - using full precision or approximate (ANN) - the latter is used when the set is too large for an exact calculation. Apply filters to the query to limit the neighbor candidate set. Using [dot products](/en/ranking/multivalue-query-operators) or [weak and](/en/ranking/wand) are alternatives.
+
+
+
+Vespa does not have a stop-word concept inherently. See the [sample app](https://github.com/vespa-engine/sample-apps/pull/335/files) for how to use [filter terms](/en/reference/querying/yql#annotations). [Tripling the query performance of lexical search](https://blog.vespa.ai/tripling-the-query-performance-of-lexical-search/) is a good blog post on this subject.
+
+
+
+Requesting more than 400 hits in a query returns this error: `{'code': 3, 'summary': 'Illegal query', 'message': '401 hits requested, configured limit: 400.'}`.
+
+- To increase max result set size (i.e. allow a higher [hits](/en/reference/api/query#hits)), configure `maxHits` in a [query profile](/en/reference/api/query#queryprofile), e.g. `500` in `search/query-profiles/default.xml` (create as needed). The [query timeout](/en/reference/api/query#timeout) can be increased, but it will still be costly and likely impact other queries - large limit more so than a large offset. It can be made cheaper by using a smaller [document summary](/en/querying/document-summaries), and avoiding fields on disk if possible.
+- Using _visit_ in the [document/v1/ API](/en/writing/document-v1-api-guide) is usually a better option for dumping all the data.
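+
+As a sketch, a complete `search/query-profiles/default.xml` raising the limit to 500 could look like this (the value is only an example):
+
+```xml
+<query-profile id="default">
+    <field name="maxHits">500</field>
+</query-profile>
+```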
+
+
+
+See the [UserProfileSearcher](https://github.com/vespa-engine/sample-apps/blob/master/news/app-6-recommendation-with-searchers/src/main/java/ai/vespa/example/UserProfileSearcher.java) for how to create a new query to fetch data - this creates a new Query, sets a new root and parameters - then `fill`s the Hits.
+
+
+
+See the sub-query question above, in addition add something like:
+```java expandable
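+import com.yahoo.component.AbstractComponent;
+import com.yahoo.component.ComponentId;
+import com.yahoo.component.chain.Chain;
+import com.yahoo.search.Query;
+import com.yahoo.search.Searcher;
+import com.yahoo.search.searchchain.Execution;
+import com.yahoo.search.searchchain.ExecutionFactory;
+import java.util.concurrent.*;
+
+public class ConfigCacheRefresher extends AbstractComponent {
+
+    private final ScheduledExecutorService configFetchService = Executors.newSingleThreadScheduledExecutor();
+    private final ExecutionFactory executionFactory; // injected by the container
+    private Chain<Searcher> searcherChain;
+
+    public ConfigCacheRefresher(ExecutionFactory executionFactory) {
+        this.executionFactory = executionFactory;
+    }
+
+    void initialize() {
+        searcherChain = executionFactory.searchChainRegistry().getChain(new ComponentId("configDefaultProvider"));
+        configFetchService.scheduleWithFixedDelay(this::refreshCache, 1, 1, TimeUnit.MINUTES);
+    }
+
+    public void refreshCache() {
+        Execution execution = executionFactory.newExecution(searcherChain);
+        Query query = createQuery(execution); // builds the sub-query as described above (not shown)
+        // ... run the query and update the cache
+    }
+
+    @Override
+    public void deconstruct() {
+        super.deconstruct();
+        configFetchService.shutdown();
+        try {
+            configFetchService.awaitTermination(1, TimeUnit.MINUTES);
+        } catch (InterruptedException e) {
+            Thread.currentThread().interrupt(); // preserve the interrupt status
+        }
+    }
+}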
+```
+
+
+
+Yes, using the [in query operator](/en/reference/querying/yql#in). Example:
+```sql
+select * from data where user_id in (10, 20, 30)
+```
+The best article on the subject is [multi-lookup set filtering](/en/performance/feature-tuning#multi-lookup-set-filtering). Refer to the [in operator example](/en/ranking/multivalue-query-operators#in-example) on how to use it programmatically in a [Java Searcher](/en/applications/searchers).
+
+
+
+Use the [in query operator](/en/reference/querying/yql#in). Example:
+```sql
+select * from data where category in ('cat1', 'cat2', 'cat3')
+```
+See [multi-lookup set filtering](#is-it-possible-to-query-vespa-using-a-list-of-document-ids) above for more details.
+
+
+
+Count all documents using a query like [select * from doc where true](/en/querying/query-language) - this counts all documents from the "doc" source. Using `select * from doc where true limit 0` will return the count and no hits, alternatively add [hits=0](/en/reference/api/query#hits). Pass [ranking.profile=unranked](/en/reference/api/query#ranking.profile) to make the query less expensive to run. If an _estimate_ is good enough, use [hitcountestimate=true](/en/reference/api/query#hitcountestimate).
+
+
+
+Yes - a deployment warning with _This may lead to recall and ranking issues_ is emitted when fields with conflicting tokenization are put in the same [fieldset](/en/reference/schemas/schemas#fieldset). This is because a given query item searching one fieldset is tokenized just once, so there's no right choice of tokenization in this case. If you have text that you want to apply to multiple fields with different tokenization, include the text multiple times in the query:
+```sql
+select * from sources * where fieldsetOrField1 contains text(@query) or fieldsetOrField2 contains text(@query)
+```
+More details on [stack overflow](https://stackoverflow.com/questions/72784136/why-vepsa-easily-warning-me-this-may-lead-to-recall-and-ranking-issues).
+
+
+
+Symptoms — can appear when a term's DF differs substantially between member fields:
+- Poor recall for queries mixing a common term with a rare one (e.g. `"the cure"`, `"the X"`). [weakAnd](/en/ranking/wand) may drop the common term, so good matches never surface.
+- `term(n).significance` and `fieldMatch(field).significance` read identical across member fields in rank-feature dumps — even in fields where the term is actually rare.
+
+Cause: when a term matches a [fieldset](/en/reference/schemas/schemas#fieldset) (including the implicit `default` used by [userQuery()](/en/reference/querying/yql#userquery)), Vespa aggregates the document frequency across all member fields.
+
+If the DF differs substantially between members, the high-DF field dominates and pulls the term's significance down for the whole fieldset.
+
+**Example:**
+
+With `fieldset default { fields: title, artist }`, `"the"` is common in `title` (countless _"The Watcher"_, _"The Best Of the ..."_) but rare in `artist`.
+
+Its aggregated significance is pulled down toward the `title` DF, so searching for the artist `"The Cure"` loses the signal from `"the"`.
+
+The same aggregated DF drives every DF/IDF feature: [bm25](/en/ranking/bm25), [nativeRank](/en/ranking/nativerank), `term(n).significance`, `fieldMatch.significance`.
+
+Matches that survive retrieval are scored using the aggregated DF rather than per-field statistics.
+
+Fix: rewrite as OR'd [userInput](/en/reference/querying/yql#userinput) clauses with a [defaultIndex](/en/reference/querying/yql#defaultindex) annotation per field. Each field then uses its own DF:
+
+*Combined-fieldset DF:*
+
+```bash
+vespa query 'select * from sources * where userQuery()' \
+ query='the cure'
+```
+
+*Per-field DF:*
+
+```bash
+vespa query 'select * from sources * where ({defaultIndex:"title"}userInput(@q)) or ({defaultIndex:"artist"}userInput(@q))' \
+ q='the cure'
+```
+
+
+**Important**
+
+BM25 and significance feature values shift scale when switching to per-field DF. Retrain any learned ranker on features collected with the new query formulation.
+
+
+
+
+Find query timeout details in the [Query API Guide](/en/querying/query-api#timeout) and the [Query API Reference](/en/reference/api/query#timeout).
+
+
+
+Backslash is used to escape special characters in YQL. For example, to query with a literal backslash, which is useful in regular expressions, you need to escape it with another backslash: \\. Unescaped backslashes in YQL will lead to "token recognition error at: '\'".
+
+In addition, Vespa CLI unescapes double backslashes to single (while single backslashes are left alone), so if you query with Vespa CLI you need to escape with another backslash: \\\\. The same applies to strings in Java.
+
+Also note that both log messages and JSON results escape backslashes, so any \ becomes \\.
+
+
+
+E.g. two select queries with slightly different filtering conditions, each with its own limit operator. This makes it impossible to use OR conditions to select both collections of documents - something equivalent to:
+
+`SELECT 1 AS x UNION ALL SELECT 2 AS y;`
+
+This isn't possible in a single query, so two queries must be run. Alternatively, split a single incoming query into two running in parallel in a [Searcher](/en/applications/searchers) - example:
+
+```java
+FutureResult futureResult = new AsyncExecution(settings).search(query);
+FutureResult otherFutureResult = new AsyncExecution(settings).search(otherQuery);
+```
+
+
+
+There is no index or attribute data structure that allows efficient _searching_ for documents where an array field has a certain number of elements or items. The _grouping language_ has a [size()](/en/reference/querying/grouping-language#list-expressions) operator that can be used in queries.
+
+
+
+The [visiting](/en/writing/visiting#analyzing-field-values) API using document selections supports it, with a linear scan over all documents. If the field is an _attribute_, one can query using grouping to identify NaN values, see count and list [fields with NaN](/en/querying/grouping#count-fields-with-nan).
+
+
+
+See the [random.match](/en/reference/ranking/rank-features#random.match) rank feature - example:
+
+```txt
+rank-profile random {
+ first-phase {
+ expression: random.match
+ }
+}
+```
+
+Run queries, seeding the random generator:
+
+```bash
+$ vespa query 'select * from music where true' \
+ ranking=random \
+ rankproperty.random.match.seed=2
+```
+
+
+
+See [result diversity](/en/querying/result-diversity) for strategies on how to create result sets from different sources.
+
+
+
+If you want to search for the most dissimilar items when using angular distance, multiply your `clip_query_embedding` by the scalar -1. Then you are searching for the points that are closest to the point which is the farthest away from your `clip_query_embedding`.
+
+Also see a [pyvespa example](https://vespa-engine.github.io/pyvespa/examples/pyvespa-examples#Neighbors).
+
+
+
+## Feeding
+
+
+
+The best option is to use the `--verbose` option, like `vespa feed --verbose myfile.jsonl` - see [documentation](/en/clients/vespa-cli#documents). A common problem is a mismatch in schema names and [document IDs](/en/schemas/documents#document-ids) - a schema like:
+```txt
+schema article {
+    document article {
+        ...
+    }
+}
+```
+
+will have a document feed like:
+
+```json
+{"put": "id:mynamespace:article::1234", "fields": { ... }}
+```
+
+Note that the [namespace](/en/learn/glossary#namespace) is not mentioned in the schema, and the schema name is the same as the document name.
+
+
+
+This configuration is a combination of content and container cluster configuration, see [indexing](/en/writing/indexing) and [feed troubleshooting](/en/operations/self-managed/admin-procedures#troubleshooting).
+
+
+
+This is often a problem if using [document expiry](/en/schemas/documents#document-expiry), as documents that have already expired are not persisted - they are silently dropped and ignored. Feeding stale test data with old timestamps in combination with document-expiry can cause this behavior.
+
+
+
+Using too many HTTP clients can generate a 429 response code. The Vespa sample apps use [vespa feed](/en/clients/vespa-cli#documents) which uses HTTP/2 for high throughput - it is better to stream the feed files through this client.
+
+
+
+Vespa does not have a Kafka connector. Refer to third-party connectors like [kafka-connect-vespa](https://github.com/vinted/kafka-connect-vespa).
+
+
+
+## Text Search
+
+
+
+E.g. integrating NER, word sense disambiguation, specific intent detection. Vespa supports these things well:
+- [Query (and result) processing](/en/applications/searchers)
+- [Document processing](/en/applications/document-processors) and document processors working on semantic annotations of text
+
+
+
+E.g. instead of using terms or n-grams as the unit, we might use terms with specific word senses - e.g. bark (dog bark) vs. bark (tree bark), or BCG (company) vs. BCG (vaccine name). Creating a new index *format* means changing the core. However, for the examples above, one just needs control over the tokens which are indexed (and queried). That is easily done in some Java code. The simplest way to do this is to plug in a [custom tokenizer](/en/linguistics/linguistics). That gets called from the query parser and bundled linguistics processing [Searchers](/en/applications/searchers) as well as the [Document Processor](/en/applications/document-processors) creating the annotations that are consumed by the indexing operation. Since these are all Searchers and Docprocs, which you can replace or surround with custom components, you can also take full control over these things without modifying the platform itself.
+
+
+
+It provides the building blocks but not an out-of-the-box solution. We can write a [Searcher](/en/applications/searchers) to detect query-side entities and rewrite the query, and a [DocProc](/en/applications/document-processors) if we want to handle them in some special way on the indexing side.
+
+
+
+Vespa does not provide text extraction out of the box, but you can write a document processor for it.
+
+
+
+[Imported fields](/en/schemas/parent-child) from parent documents are defined as [attributes](/en/content/attributes), and have limited text match modes (i.e. `indexing: index` cannot be used). [Details](https://stackoverflow.com/questions/71936330/parent-child-mode-cannot-be-searched-by-parent-column).
+
+
+
+## Semantic search
+
+
+
+If you have added vectors to your documents and queries, and see that the rank feature `closeness(field, yourEmbeddingField)` produces 1.0 for all documents, you are likely using [distance-metric](/en/reference/schemas/schemas#distance-metric): innerproduct/prenormalized-angular with vectors that are not normalized. The solution is normally to switch to [distance-metric: angular](/en/reference/schemas/schemas#angular) or use [distance-metric: dotproduct](/en/reference/schemas/schemas#dotproduct) (available from Vespa 8.170.18).
+
+With non-normalized vectors, you often get negative distances, and those are capped to 0, leading to closeness 1.0. Some embedding models, such as models from sbert.net, claim to output normalized vectors but might not.
+
+
+
+## Programming Vespa
+
+
+
+Plugins have to run in the JVM - [jython](https://www.jython.org/) might be an alternative, but the Vespa Team has no experience with it. Vespa does not have a language like [painless](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-painless) - it is more flexible to write application logic in a JVM-supported language, using [Searchers](/en/applications/searchers) and [Document Processors](/en/applications/document-processors).
+
+
+
+A [Searcher](/en/applications/searchers) intercepts a query and/or result. To get a number of documents by id in a Searcher or other component like a [Document processor](/en/applications/document-processors), you can have an instance of [com.yahoo.documentapi.DocumentAccess](/en/reference/applications/components#injectable-components) injected and use that to get documents by id instead of the HTTP API.
+
+
+
+Vespa uses Java 17 - it will support 20 some time in the future.
+
+
+
+Use `System.out.println` to write text to the [vespa.log](/en/reference/operations/log-files).
+
+
+
+## Performance
+
+
+
+Vespa has a near real-time indexing core with typically sub-second latencies from document ingestion to being indexed. This depends on the use-case, available resources and how the system is tuned. Some more examples and thoughts can be found in the [scaling guide](/en/performance/sizing-search).
+
+
+
+Vespa does not have a concept of "batch ingestion" as it contradicts many of the core features that are the strengths of Vespa, including [serving elasticity](/en/content/elasticity) and sub-second indexing latency. That said, we have numerous use-cases in production that do high throughput updates to large parts of the (sometimes entire) document set. In cases where feed throughput is more important than indexing latency, you can tune this to meet your requirements. Some of this is detailed in the [feed sizing guide](/en/performance/sizing-feeding).
+
+
+
+Yes. The [content node](/en/content/proton) is implemented in C++ and not memory constrained other than what the operating system does.
+
+
+
+If the replicas are in sync the request is only sent to the primary content node. Otherwise, it's sent to several nodes, depending on replica metadata. Example: if a bucket has 3 replicas A, B, C and A & B both have metadata state X and C has metadata state Y, a request will be sent to A and C (but not B since it has the same state as A and would therefore not return a potentially different document).
+
+
+
+[Attribute](/en/content/attributes) (with or without `fast-search`) is always in memory, but does not support tokenized matching. It is for structured data. [Index](/en/basics/schemas#document-fields) (where there’s no such thing as fast-search since it is always fast) is in memory to the extent there is available memory and supports tokenized matching. It is for unstructured text.
+
+It is possible to guarantee that fields that are defined with `index` have both the dictionary and the postings in memory by changing from `mmap` to `populate`, see [index > io > search](/en/reference/applications/services/content#index-io-search). Make sure that the content nodes run on nodes with plenty of memory available, as the memory footprint doubles during an index switch. Familiarity with Linux tools like `pmap` can help diagnose what is mapped and if it’s resident or not.
+
+Fields that are defined with `attribute` are in memory. Fields that have both `index` and `attribute` have separate data structures: queries use the default memory-mapped, on-disk data structures that support `text` matching, while grouping, summary and ranking can access the field from the `attribute` store.
+
+A Vespa query is executed in two phases as described in [sizing search](/en/performance/sizing-search), and summary requests can touch disk (and also use `mmap` by default). Due to their potential size there is no populate option here, but one can define a [dedicated document summary](/en/querying/document-summaries#performance) containing only fields that are defined with `attribute`.
+
+The [practical performance guide](/en/performance/practical-search-performance-guide) can be a good starting point as well to understand Vespa query execution, difference between `index` and `attribute` and summary fetching performance.
+
+
+
+Deleting documents, by using the [document API](/en/writing/reads-and-writes) or [garbage collection](/en/schemas/documents#document-expiry), will increase the capacity on the content nodes. However, this is not necessarily observable in system metrics - this depends on many factors, like what kind of memory is released, when [flush](/en/content/proton#proton-maintenance-jobs) jobs are run, and the document [schema](/en/basics/schemas).
+
+In short, Vespa is not designed to release memory once used. It is designed for sustained high throughput, low latency, keeping maximum memory used under control using features like [feed block](/en/writing/feed-block).
+
+When deleting documents, one can observe a slight increase in memory. A deleted document is represented using a [tombstone](/en/operations/self-managed/admin-procedures#content-cluster-configuration), that will later be removed, see [removed-db-prune-age](/en/reference/applications/services/content#removed-db-prune-age). When running garbage collection, the summary store is scanned using mmap and both VIRT and page cache memory usage increases.
+
+Read up on [attributes](/en/content/attributes) to understand more of how such fields are stored and managed. [Paged attributes](/en/content/attributes#paged-attributes) trade off query latency for a lower max memory usage.
+
+
+
+A field is of type _index_ or _attribute_ - [details](/en/querying/text-matching#index-and-attribute).
+
+Fields with _index_ use no incremental memory at deployment, if the field has no value.
+
+Fields with _attribute_ use memory, even if the field value is not set.
+
+Attributes are optimized for random access: To be able to jump to the value of any document in O(1) time. That requires allocating a constant amount of memory (the value, or a pointer) per document, regardless of whether there is a value. In short, knowing that a value is unset is a value in itself for attributes, so deploying new fields or new schemas with attributes will cause an incremental increase in memory. Applications with many unused schemas and fields can factor this in when sizing for memory. Refer to [attributes](/en/content/attributes#attribute-memory-usage) for details.
+
+
+
+[Autoscaling](/en/operations/autoscaling) is the best guide to understand how to size and autoscale the system. Container clusters are stateless and can be autoscaled more quickly than content clusters.
+
+
+
+It is not possible to autoscale content clusters for 8x load increase in 5 minutes, as this requires both provisioning and data migration. Such use cases are best discussed with the Vespa Team to understand the resource bottlenecks, tradeoffs and mitigations. Also read [Graceful Degradation](/en/performance/graceful-degradation).
+
+
+
+It depends. Vespa aims to adapt to resources (like auto thread config based on virtual node thread count) and actual use (when to run maintenance jobs like compaction), but there are tradeoffs that application owners can/should make. Start off by reading the [Vespa Serving Scaling Guide](/en/performance/sizing-search), then run [benchmarks](/en/performance/benchmarking-cloud) and use the [dashboards](/en/operations/monitoring).
+
+
+
+## Administration
+
+
+
+Yes, deployment is using this web service API, which allows you to create an edit session from the currently deployed package, make modifications, and deploy (prepare+activate) it: [deploy-rest-api-v2](/en/reference/api/deploy-v2). However, this is only useful in cases where you want to avoid transferring data to the config server unnecessarily. When you resend everything, the config server will notice that you did not actually change e.g. the node configs and avoid unnecessary noop changes.
+
+
+
+[Elasticity](/en/content/elasticity) is a core Vespa strength - easily add and remove nodes with minimal (if any) serving impact. The exact time needed depends on how much data will need to be migrated in the background for the system to converge to [ideal data distribution](/en/content/idealstate).
+
+
+
+You will need to load balance incoming requests between the nodes running the [stateless Java container cluster(s)](/en/learn/overview). This can typically be done using a simple network load balancer available in most cloud services. This is included when using [Vespa Cloud](/), with an HTTPS endpoint that is already load balanced - both locally within the region and globally across regions.
+
+
+
+[Search sizing](/en/performance/sizing-search) is the intro to this. Topology matters, and this is much used in the high-volume Vespa applications to optimise latency vs. cost.
+
+
+
+With [Vespa Cloud](/), we do automated background upgrades daily without noticeable serving impact. If you host Vespa yourself, you can do this, but need to implement the orchestration logic necessary to handle this. The high level procedure is found in [live-upgrade](/en/operations/self-managed/live-upgrade).
+
+
+
+[Vespa Cloud](/en/operations/zones) has integrated support - query a global endpoint. Writes will have to go to each zone. There is no auto-sync between zones.
+
+
+
+Building indexes offline requires the partition layout to be known in the offline system, which is in conflict with elasticity and auto-recovery (where nodes can come and go without service impact). It is also at odds with realtime writes. For these reasons, it is not recommended, and not supported.
+
+
+
+Use [visiting](/en/writing/visiting) to dump all or a subset of the documents. See [data-management-and-backup](/en/operations/data-management) for more information.
+
+
+
+A failure response is returned if the document is not written to some of the replica nodes.
+
+
+
+Yes, it will be available, eventually. Also try [Multinode testing and observability](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode).
+
+
+
+Yes, just add a `deleted` attribute, add [fast-search](/en/content/attributes#fast-search) on it and create a searcher which adds an `andnot deleted` item to queries.
+
+
+
+You can set a [transition-time](/en/reference/applications/services/content#transition-time) in services.xml to configure how long the cluster controller keeps a node in maintenance mode before automatically marking it down.
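+
+A sketch of where this setting lives, assuming a content cluster with the illustrative id `music`. The value is in seconds, and the other elements the cluster needs are left out:
+
+```xml
+<content id="music" version="1.0">
+    <tuning>
+        <cluster-controller>
+            <!-- Seconds a node is kept in maintenance mode before it is marked down -->
+            <transition-time>120</transition-time>
+        </cluster-controller>
+    </tuning>
+</content>
+```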
+
+
+
+Grouped distribution is used to reduce search latency. Content is distributed to a configured set of groups, such that the entire document collection is contained in each group. Setting the redundancy and searchable-copies equal to the number of groups ensures that data can be queried from all groups.
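+
+A minimal sketch of such a setup in services.xml, assuming two groups of two nodes each: the cluster id, group names and host aliases are hypothetical, and the element layout should be verified against the content cluster reference before use.
+
+```xml
+<!-- services.xml fragment; ids, group names and hostaliases are illustrative assumptions -->
+<content version="1.0" id="mycluster">
+    <!-- redundancy and searchable-copies both equal the number of groups (2) -->
+    <redundancy>2</redundancy>
+    <engine>
+        <proton>
+            <searchable-copies>2</searchable-copies>
+        </proton>
+    </engine>
+    <group>
+        <!-- one replica in each of the two groups -->
+        <distribution partitions="1|*"/>
+        <group name="group0" distribution-key="0">
+            <node hostalias="node0" distribution-key="0"/>
+            <node hostalias="node1" distribution-key="1"/>
+        </group>
+        <group name="group1" distribution-key="1">
+            <node hostalias="node2" distribution-key="2"/>
+            <node hostalias="node3" distribution-key="3"/>
+        </group>
+    </group>
+</content>
+```
+
+With this layout each group holds the entire document collection, so a query can be answered by the nodes of a single group.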
+
+
+
+Refer to [#17898](https://github.com/vespa-engine/vespa/issues/17898) for a discussion of options.
+
+
+
+Use [/state/v1/version](/en/reference/api/state-v1#state-v1-version) to find the Vespa version.
+
+
+
+See [rollback](/en/applications/deployment#rollback) for options.
+
+
+
+## Troubleshooting
+
+
+
+If deployment fails with the error message "Deployment failed, code: 413 ("Payload Too Large.")", you might need to increase the config server's JVM heap size. The config server has a default JVM heap size of 2 GB. When deploying an app with e.g. large models, this might not be enough. Try increasing the heap to e.g. 4 GB when executing 'docker run ...' by adding an environment variable to the command line:
+
+```bash
+docker run --env VESPA_CONFIGSERVER_JVMARGS=-Xmx4g
+```
+
+
+
+When an application package is deployed with some kind of error, the endpoints might fail to come up, for example:
+```bash
+$ vespa deploy --wait 300
+
+Uploading application package ... done
+
+Success: Deployed target/application.zip
+
+Waiting up to 5m0s for query service to become available ...
+Error: service 'query' is unavailable: services have not converged
+```
+Another example:
+
+```text
+[INFO] [03:33:48] Failed to get 100 consecutive OKs from endpoint ...
+```
+
+This can fail in many ways. The first step is to check the Vespa Container log:
+
+```bash
+$ docker exec vespa vespa-logfmt -l error
+
+[2022-10-21 10:55:09.744] ERROR container
+Container.com.yahoo.container.jdisc.ConfiguredApplication
+Reconfiguration failed, your application package must be fixed, unless this is a JNI reload issue:
+Could not create a component with id 'ai.vespa.example.album.MetalSearcher'.
+Tried to load class directly, since no bundle was found for spec: album-recommendation-java.
+If a bundle with the same name is installed,
+there is a either a version mismatch or the installed bundle's version contains a qualifier string.
+...
+```
+
+[Bundle plugin troubleshooting](/en/applications/bundles#bundle-plugin-troubleshooting) is a good resource to analyze Vespa container startup / bundle load problems.
+
+
+
+Using an M1 MacBook Pro / AArch64 makes the Docker run fail:
+
+```txt
+WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8)
+and no specific platform was requested
+```
+
+Make sure you are running a recent version of the Docker image: run `docker pull vespaengine/vespa`.
+
+
+
+Make sure all [Config servers](/en/operations/self-managed/configuration-server#troubleshooting) are started, and are able to establish ZooKeeper quorum (if more than one) - see the [multinode](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode) sample application. Validate that the container has [enough memory](/en/operations/self-managed/docker-containers).
+
+
+
+The Config Server cluster with 3 nodes fails to start. The ZooKeeper cluster used by the Config Servers waits for hosts on the network, while the hosts wait for ZooKeeper, a catch-22 - see [sampleapp troubleshooting](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations#troubleshooting).
+
+
+
+Use [vespa-logfmt](/en/reference/operations/self-managed/tools#vespa-logfmt) to dump logs. If Vespa is running in a local container (named "vespa"), run `docker exec vespa vespa-logfmt`.
+
+
+
+See [encoding troubleshooting](/en/linguistics/troubleshooting-encoding) for how to handle and remove control characters from the document feed.
+
+
+
+## Login, Tenants and Plans
+
+
+
+[Deploy an application](/en/basics/deploy-an-application) to create a tenant and start your [free trial](https://vespa.ai/free-trial/). This tenant can be your personal tenant, or shared with others. It cannot be renamed.
+
+
+
+If the tenant is already created, add more users to it. In the Vespa Cloud Console, open [**Account > Users**](https://console.vespa-cloud.com/link/tenant/account/users). From this view you can manage users in the tenant, and their roles - from here, you can add/set tenant admins.
+
+
+
+When starting the free trial, you are asked to accept Terms of Service. For paid plans, this is covered by the contract.
+
+
+
+In the console, open [**Account > Billing**](https://console.vespa-cloud.com/link/tenant/account/billing) to enter information required for billing. Use [Vespa Support](https://vespa.ai/support/) if you need to provide this information without console login.
+
+
+
+Yes, contact [Vespa Support](https://vespa.ai/support/) to set it up.
+
+
+
+## Vespa Cloud Operations
+
+
+
+See [node resources](/en/performance/node-resources) to assess current and auto-suggested resources and [autoscaling](/en/operations/autoscaling) for how to automate.
+
+
+
+Managing resources is easy, as most changes are automated. Adding / removing / changing nodes starts automated data migration, see [elasticity](/en/content/elasticity).
+
+
+
+Schema changes might require data reindexing, which is automated, but takes some time. Other schema changes require a data refeed - see [details](/en/reference/schemas/schemas#modifying-schemas).
+
+
+
+Use the [Memory Visualizer](/en/performance/memory-visualizer) to evaluate how memory is allocated to the fields. Fields can be `index`, `attribute` and `summary`, and combinations of these, with settings like `fast-search` that affect memory usage. [Attributes](/en/content/attributes) is a great read for understanding Vespa memory usage.
+
+
+
+Listing archived objects can fail, e.g. `gsutil -u my_project ls gs://vespa-cloud-data-prod-gcp-us-central1-f-12345f/my_tenant` can fail with `AccessDeniedException: 403 me@mymail.com does not have serviceusage.services.use access to the Google Cloud project. Permission 'serviceusage.services.use' denied on resource (or it may not exist).` This can be due to missing rights on your Google project (my_project in the example above) - from the Google documentation: _"The user account accessing the Cloud Storage Bucket must be granted the Service Usage Consumer role (see [https://cloud.google.com/service-usage/docs/access-control](https://cloud.google.com/service-usage/docs/access-control)) in order to charge the specified user project for the bucket usage cost"_
+
+
+
+Vespa Cloud applications have a Prometheus endpoint. Find guides for how to integrate with Grafana and AWS Cloudwatch at [monitoring](/en/operations/monitoring).
+
+
+
+Vespa Cloud has detailed dashboards linked from the _monitoring_ tab in the Console, one for each zone the instance is deployed to.
+
+
+
+Vespa is normally upgraded daily, with exceptions such as holidays and weekends. During upgrades, nodes are stopped one-by-one per cluster. As all clusters have one redundant node, serving and write traffic is not impacted by upgrades. Before the upgrade, the application's [system and staging tests](/en/operations/automated-deployments) are run, halting the upgrade if they fail. Documents are re-migrated to the upgraded node before the next node is upgraded, see [Elastic Vespa](/en/content/elasticity) for details.
+
+
+
+Issues like Feed Blocked, Deployment and Deprecation warnings show up in the console. There are no warnings on redundancy level / searchable copies, as redundant document buckets are activated for queries automatically, and auto data-migration kicks in for node failures / replacements.
+
+
+
+- Schema changes that [require service restart](/en/reference/schemas/schemas#changes-that-require-restart-but-not-re-feed) are handled automatically by Vespa Cloud. A deployment job involves waiting for these to complete.
+- Schema changes that [require reindexing](/en/reference/schemas/schemas#changes-that-require-reindexing) of data require a validation override, and will trigger automatic reindexing. Status can be tracked in the console application view. Vespa Cloud also periodically re-indexes all data, with minimal resource usage, to account for changes in linguistics libraries.
+- Schema changes that [require refeeding](/en/reference/schemas/schemas#changes-that-require-re-feed) data require a validation override, and the user must refeed the data after deployment.
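+
+As a hedged sketch, a validation override is a small `validation-overrides.xml` file in the application package. The override ids and the date below are assumptions for illustration; the deployment error message names the exact id to allow, and the `until` date must be close to the deployment date.
+
+```xml
+<!-- validation-overrides.xml sketch; ids and date are illustrative assumptions -->
+<validation-overrides>
+    <!-- allow a change that triggers automatic reindexing -->
+    <allow until="2030-01-15">indexing-change</allow>
+    <!-- allow a change that requires refeeding the data after deployment -->
+    <allow until="2030-01-15">field-type-change</allow>
+</validation-overrides>
+```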
+
+
+
+The management of data stored in an application running on Vespa Cloud is the responsibility of the application owner and, as such, Vespa Cloud does not have any retention policy for this data as long as it is stored by the application.
+
+The following data retention policies apply to Vespa Cloud:
+- After a node previously allocated to an application has been deallocated (e.g. due to application being deleted by application owner), all application data will be deleted within _four hours_.
+- All application log data will be deleted from Vespa servers after no more than _30 days_ (most often sooner) dependent on log volume, allocated disk resources, etc. *PLEASE NOTE:* This is the theoretical maximum retention time - see [archive guide](/en/operations/archive/archive-guide) for how to ensure access to your application logs.
+
+
+
+Yes, Vespa.ai has a SOC 2 attestation: [Trust Center](https://trust.vespa.ai).
+
+
+
+Read more in [GDPR](https://cloud.vespa.ai/en/gdpr).
+
+
+
+Vespa is most often used for queries over data written from the information sources, although it can also be used without data, e.g. for model serving. The application owner writes the integration that feeds data to Vespa Cloud.
+
+
+
+Vespa Cloud uses the following Cloud providers:
+- AWS EC2 instances, with local or remote storage
+- GCP Compute instances, with local or remote storage
+- Azure Compute instances, with local or remote storage
+
+The storage devices are encrypted per Cloud provider, at rest.
+
+
+
+See the [security guide](/en/security/guide) for roles and permissions. The Vespa Cloud Console has a log view tool, and logs / access logs can be exported to the customer's AWS account easily. Deployment operations are tracked in the deployment view, with a history. Vespa Cloud Operators do not have node access unless specifically granted by the customer, and such access is audit logged.
+
+
+
+At termination, all application instances are removed, with data, before the tenant can be deactivated.
+
+
+
+In `dev` zones we use shared resources and hence have more than one node on each host/instance. To provide the best possible overall responsiveness, we do not restrict CPU resources for the individual application nodes.
+
+
+
diff --git a/mintlify-docs/en/learn/features.mdx b/mintlify-docs/en/learn/features.mdx
new file mode 100644
index 0000000000..11bd68f111
--- /dev/null
+++ b/mintlify-docs/en/learn/features.mdx
@@ -0,0 +1,85 @@
+---
+title: Features
+---
+
+## What is Vespa?
+
+Vespa is a platform for applications which need low-latency computation over large data sets. It allows you to write and persist any amount of data, and execute high volumes of queries over the data which typically complete in tens of milliseconds.
+
+Queries can use structured filter conditions, text and nearest neighbor vector search to select data. All the matching data is then ranked according to ranking functions - typically machine learned - to implement such use cases as search relevance, recommendation, targeting and personalization.
+
+All the matching data can also be grouped into groups and subgroups where data is aggregated for each group to implement features like graphs, tag clouds, navigational tools, result diversity and so on.
+
+Application specific behavior can be included by adding Java components for processing queries, results and writes to the application package.
+
+Vespa is real time. It is architected to maintain constant response times with any data volume by executing queries in parallel over many data shards and cores, and with added query volume by executing queries in parallel over many copies of the same data (groups). It is optimized to return responses in tens of milliseconds. Writes to data become visible in a few milliseconds and can be handled at a rate of thousands to tens of thousands per node per second.
+
+A lot of work has gone into making Vespa easy to set up and operate. Any Vespa application - from single node systems to systems running on hundreds of nodes in data centers - is fully configured by a single artifact called an *application package*. Low level configuration of nodes, processes and components is done by the system itself based on the desired traits specified in the application package.
+
+Vespa is scalable. System sizes up to hundreds of nodes handling tens of billions of documents, and tens of thousands of queries per second are not uncommon, and no harder to set up and modify than single node systems. Since all system components, as well as stored data, are redundant and self-correcting, hardware failures are not operational emergencies and can be handled by re-adding capacity when convenient.
+
+Vespa is self-repairing and dynamic. When machines are lost or new ones added, data is automatically redistributed over the machines, while the system continues to serve queries and accept writes to the data. Changes to configuration and Java components can be made while serving by deploying a changed application package - no downtime or restarts required.
+
+## Features
+
+This section provides an overview of the main features of Vespa. The remainder of the documentation goes into full detail.
+
+### Data and writes
+
+- Documents in Vespa may be added, replaced, modified (single fields or any subset) and removed.
+- Writes are acknowledged back to the client issuing them when they are durable and visible in queries, in a few milliseconds.
+- Writes can be issued at a sustained volume of thousands to tens of thousands per node per second while serving queries.
+- Data is replicated with a configurable redundancy.
+- An even data distribution with the desired redundancy is automatically maintained when nodes are added, removed or lost unexpectedly.
+- Data corruption is automatically repaired from an uncorrupted replica of the data.
+- Data is written over a simple HTTP/2 API, or (for high volume) using a small, standalone client.
+- Document data schemas allow fields of any of the usual primitive types as well as collections, structs and tensors.
+- Any number of data schemas can be used at the same time.
+- Documents may reference each other, and fields from referenced documents may be used in queries without performance penalty.
+- Write operations can be processed by adding custom Java components.
+- Data can be streamed out of the system for batch reprocessing.
+
+### Queries
+
+- Queries may contain any combination of structured filters, free text and vector search operators.
+- Queries may contain large tensors and vectors (to represent e.g. a user).
+- Queries choose how results should be ranked and specify how they should be organized (see sections below).
+- Queries and results may be processed by adding custom Java components - or any HTTP request may be turned into a query by custom request handlers.
+- Query response times are typically in tens of milliseconds and can be maintained given any load and data size by adding more hardware.
+- A *streaming search* mode is available where search/selection is only supported on predefined groups of documents (e.g. a user's documents). In this mode each node can store and serve billions of documents while maintaining low response times.
+
+### Ranking and inference
+
+- All results are ranked using a configured ranking function, selected in the query.
+- A ranking function may be any mathematical function over scalars or tensors (multidimensional arrays).
+- Scalar functions include an "if" function to express business logic and decision trees.
+- Tensor functions include a powerful set of primitives and composite functions which allow expression of advanced machine-learned ranking functions such as deep neural nets.
+- Functions can also refer to ONNX models invoked locally on the content nodes.
+- Multiple ranking phases are supported to allocate more CPU to ranking promising candidates.
+- A powerful set of text ranking features using positional information from the documents is provided out of the box.
+- Other ranking features include 2D distance and freshness.
+
+### Organizing data and presenting results
+
+- Matches to a query can be grouped and aggregated according to a specification in the query.
+- All the matches are included, even though they reside on multiple machines executing in parallel.
+- Matches can be grouped by a unique value or by a numerical bucket.
+- Any level of groups and subgroups is supported, and multiple parallel groupings can be specified in one query.
+- Data can be aggregated (counted, averaged etc.) and selected within each group and subgroup.
+- Any selection of data from documents can be included with the final result returned to the client.
+- Search engine style keyword highlighting in matching fields is supported.
+
+### Configuration and operations
+
+- Vespa can be installed using rpm files or a Docker image - on personal laptops, owned datacenters or in AWS.
+- An application of Vespa is fully specified as a separate buildable artifact: An *application package* - individual machines or processes need never be configured individually.
+- Systems may contain multiple clusters of each type (stateless and stateful), each containing any number of nodes.
+- Systems of any size may be specified by two short configuration files in the application package.
+- Document schemas, Java components and ranking functions/models are also configured in the application package.
+- An application package is deployed as a single unit to Vespa to realize the system desired by the application.
+- Most application changes (including Java component changes) can be performed by deploying a changed application package - the system will manage its own change process while serving and handling writes.
+- Most document schema changes (excluding field type changes) can be made while the system is live.
+- Application package changes are validated on deployment to prevent destructive changes to live systems.
+- Vespa has no single points of failure and automatically routes around failing nodes.
+- System logs are collected to a central server in real time.
+- Selected metrics may be emitted to a third-party metrics/alerting system from all the nodes.
diff --git a/mintlify-docs/en/learn/glossary.mdx b/mintlify-docs/en/learn/glossary.mdx
new file mode 100644
index 0000000000..23767a5b48
--- /dev/null
+++ b/mintlify-docs/en/learn/glossary.mdx
@@ -0,0 +1,222 @@
+---
+title: Glossary
+description: "This is a glossary of both Vespa-specific terminology, and general terms useful in this context."
+---
+
+- **Application**
+
+ The unit of deployment and management. It can contain any number of clusters, schemas etc., all deployed together. The files defining the application are called the [Application Package](/en/basics/applications).
+
+- **Attribute**
+
+ An attribute is a field with properties other than an indexed field. Attribute fields have flexible match modes, including exact match, prefix match, and case-sensitive matching. Attributes enable high sustained update rates by writing directly to memory without disk access. Features like Grouping, Sorting, and [Parent/Child](/en/learn/glossary#parent-child) use attributes.
+
+- **Boolean Search**
+
+ Use [Predicate fields](/en/schemas/predicate-fields) to match queries to a set of boolean constraints in documents. The typical use case is to have a set of boolean constraints representing advertisements, specifying their target groups. Example: `hobby in [Music, Hiking] and age in [20..30]`.
+
+- **Cluster**
+
+ A set of homogeneous nodes which all perform the same task. Vespa has two types: Container clusters are stateless, and content clusters store and process the data.
+
+- **Component**
+
+ Components extend a base class from the Container code module; some are [Chained](/en/applications/chaining) for execution. The component types are:
+
+ - [Processors](/en/applications/processing#processors)
+ - [Searchers](/en/learn/glossary#searcher)
+ - [Document Processors](/en/learn/glossary#document-processor)
+ - [Search Result Renderers](/en/applications/result-renderers)
+ - [Provider Components](/en/applications/dependency-injection#special-components)
+
+- **Configuration Server**
+
+ The configuration server hosts most of the control plane of Vespa, where application packages are deployed to - often shortened to "config server". Config servers are deployed as one or in a cluster - see [overview](/en/learn/overview). The config server serves configuration for all Vespa processes, and is normally the first cluster started.
+
+- **Container**
+
+ Vespa's Java container, hosting all application components as well as the stateless logic of Vespa itself. Read more in [Container](/en/applications/containers). Not to be confused with [Docker Containers](/en/learn/glossary#docker).
+
+- **Content Node**
+
+ Content nodes are stateful and hold the document and index data - see [content nodes](/en/content/content-nodes). These nodes implement Vespa's [elasticity](/en/content/elasticity) for seamless data migration and scaling.
+
+- **Control Plane**
+
+ The deploy-commands are Vespa's control plane. The control plane is often secured with other credentials than the [data plane](/en/learn/glossary#data-plane). Often low throughput and used by automation like GitHub Actions to deploy new versions of application packages.
+
+- **Data Plane**
+
+ Document and Query APIs make up the Vespa data plane. Also see [control plane](/en/learn/glossary#control-plane). Often high throughput / low latency, as this is user-serving.
+
+- **Deploy**
+
+ `deploy` is a control-plane command to upload and activate a new version of an [application package](/en/learn/glossary#application).
+
+- **Deployment**
+
+ A deployment is a running Vespa application, created by using [deploy](/en/learn/glossary#deploy).
+
+- **Diversity**
+
+ Result diversity means having diverse results in the result set. For example, instead of returning the n highest-ranking results, eliminate similar hits, e.g. hits from the same domain. Refer to [diversity](/en/reference/schemas/schemas#diversity) and [grouping](/en/querying/grouping) for features to eliminate similar hits or group them together.
+
+- **Docker**
+
+ Vespa is available as a container image from [hub.docker.com](https://hub.docker.com/r/vespaengine/vespa). Products to run this image include Docker, Podman and runC, and it enables users to run Vespa in a well-defined environment on multiple platforms. Read more in [Docker Containers](/en/operations/self-managed/docker-containers).
+
+- **Document**
+
+ Vespa models data as documents. A document has a string identifier, set by the application, unique across all documents. A document is a set of key-value pairs. A document has a [Schema](/en/learn/glossary#schema). Read more in [Documents](/en/schemas/documents).
+
+- **Document frequency (normalized)**
+
+ The *document frequency* of a term captures how often the term occurs in the document corpus relative to the total number of documents. For ranking purposes this value is always normalized so that it is in the range [0, 1]. For example, if a term occurs in 600 out of 1000 documents, its normalized document frequency will be \(600/1000 = 0.6\).
+
+ From an information retrieval perspective, the normalized document frequency gives a measure of how common (or rare) a term is. Query terms that occur rarely (thus having a low document frequency) are usually expected to be more *relevant* to the query, since they are more specific. On the other end, very common terms (with high document frequency) are often considered to be "stopwords" (such as "the", "an" etc.), and are expected to have a low contribution to query relevance. This is directly related to [inverse document frequency](https://en.wikipedia.org/wiki/Tf%E2%80%93idf#Inverse_document_frequency), which is used by classic text ranking algorithms such as [tf-idf](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) and [BM25](/en/ranking/bm25).
+
+- **Document summary**
+
+ A [document summary](/en/querying/document-summaries) is the information that is shown for each document in a query result. What information to include is determined by a document summary class: A named set of fields with config on which information they should contain. When Vespa stores a document, it is written to the [document store](/en/content/proton#document-store) and used to generate summaries. The document store is scanned when using [streaming search](/en/performance/streaming-search).
+
+- **Document Processor**
+
+ Document processing is a framework to create chains of configurable [Components](/en/learn/glossary#component) that read and modify document operations. A Document Processor uses `getFieldValue()` and `setFieldValue()` to process fields, alternatively using generated code from [Concrete Documents](/en/schemas/concrete-documents).
+
+- **Document Type**
+
+ The data type part of a [Schema](/en/learn/glossary#schema) - a collection of fields.
+
+- **Elasticity**
+
+ Vespa's clusters are elastic - a user can add or remove nodes on running applications without service disruption. For the stateful content nodes, this causes data sync between nodes for uniform distribution, with minimal data re-distribution. Read more in [Elasticity](/en/content/elasticity).
+
+- **Enclave**
+
+ Vespa Cloud Enclave is a feature to run your Vespa application in Vespa Cloud in your own AWS or GCP account, see the [Enclave documentation](/en/operations/enclave/enclave).
+
+- **Embedding**
+
+ A common technique in modern big data serving applications is to map the subject data - say, text or images - to points in an abstract vector space and then do computation in that vector space. For example, retrieve similar data by finding nearby points in the vector space, or using the vectors as input to a neural net. This mapping is usually referred to as *embedding*, and Vespa provides [built-in support](/en/rag/embedding) for this.
+
+- **Estimated hit ratio**
+
+ When Vespa plans how a query should be evaluated in the most efficient way possible, one of the most important pieces of information is how many *hits* different parts of the query will produce. The estimated hit ratio is a normalized number in the range [0, 1] that states the proportion of documents that is expected to match a given part of the query.
+
+ For example, a query with an `AND` operator over multiple terms will benefit by having the query planner place the term with the *lowest* estimated hit ratio *first* in the AND's evaluation order. This is because that term will be the cheapest to evaluate (least number of candidate documents to iterate over), and all other terms can be excluded as a possible match if it doesn't match.
+
+- **Federation**
+
+ The [Container](/en/learn/glossary#container) allows multiple sources of data to be [federated](/en/querying/federation) to a common search service. The sources of data may be both search clusters, or external services, backed by Vespa or any other kind of service. The container may be used as a pure federation platform by setting up a system consisting solely of container nodes federating to external services.
+
+- **Field**
+
+ Documents have [Fields](/en/basics/schemas#document-fields). A field has a type, and a field contained in a document can be written to, read from and queried. A field can also be generated (i.e. a synthetic field) - in this case, the field definition is outside the document - [example](/en/writing/indexing#date-indexing). A field can be singlevalue, like a string, or multivalue, like an array of strings.
+
+- **Fieldset**
+
+ The term *fieldset* has two meanings in Vespa:
+
+ - A collection of fields that are queried together - configured in the [schema](/en/reference/schemas/schemas#fieldset):
+
+   ```sql
+   fieldset myset {
+       fields: artist, title, album
+   }
+   ```
+
+ - A collection of fields to return for a GET or VISIT operation, see the [guide](/en/schemas/documents#fieldsets):
+
+   ```text
+   $ vespa visit --field-set restaurant:name,rating
+   ```
+
+- **Garbage Collection**
+
+ Use a [Document Selection](/en/reference/applications/services/content#document) to [auto-expire](/en/schemas/documents#document-expiry) documents by time or any other criterion.
+
+- **Grouping**
+
+ Vespa Grouping is a list processing language which describes how the query hits should be grouped, aggregated and presented in result sets. A grouping statement takes the list of all matches to a query as input and groups/aggregates it, possibly in multiple nested and parallel ways to produce the output. [Read more](/en/querying/grouping).
+
+- **Handler**
+
+ Also called *Request Handler*. A handler is a [Component](/en/learn/glossary#component) used to build API endpoints on the [Container](/en/learn/glossary#container). Find documentation at [developing request handlers](/en/applications/request-handlers), and [example use](https://github.com/vespa-engine/sample-apps/tree/master/model-inference/src/main/java/ai/vespa/example).
+
+- **Indexing**
+
+ The process of creating index structures. This includes routing document writes to indexing processors, [processing](/en/writing/indexing) documents and writing the documents to content clusters. Settings like [streaming search](/en/learn/glossary#streaming-search) skip index creation to optimize resource usage.
+
+- **Instance**
+
+ *Instance* is always "default" in Vespa.ai (i.e. there is only one) - managed services like [Vespa Cloud](/) support multiple, [read more](/en/learn/tenant-apps-instances). An instance is a deployment of an application for a given purpose, like production serving - multiple instances of an application can be used to support more use cases like integration testing.
+
+- **Namespace**
+
+ A segment of [document IDs](/en/learn/glossary#document) which helps you generate unique ids even if you have multiple sources of unique values. A namespace can be used to [Visit](/en/learn/glossary#visit) a subspace of the corpus.
+
+- **Nearest neighbor search**
+
+ [Nearest neighbor search](/en/querying/nearest-neighbor-search), or [vector search](/en/querying/vector-search-intro), is a technique used to find the closest data points to a given query point in a high-dimensional vector space - see [distance metric](/en/querying/nearest-neighbor-search#distance-metrics-for-nearest-neighbor-search). It can be exact or approximate.
+
+ This is supported in Vespa using the [nearestNeighbor](/en/reference/querying/yql) query operator.
+
+- **Node**
+
+ A Node is a host / container instance running one or more [Services](/en/learn/glossary#service). The mapping from logical to actual name is configured in [hosts.xml](/en/reference/applications/hosts).
+
+- **Parent / Child**
+
+ Using document references, documents can have [parent/child](/en/schemas/parent-child) relationships. Use this to join data by importing fields from parent documents. Parent documents are replicated to all nodes in the cluster.
+
+- **Partial Update**
+
+ A partial update is an update to one or more fields in a document. It also includes updating all index structures, so the effect of the partial update is immediately observable in queries. Partial updates do not require the full document, and enable a high write throughput with memory-only operations. [Read more](/en/writing/partial-updates).
+
+- **Posting List**
+
+ A posting list is a fundamental data structure in information retrieval and search engines. It is used in inverted indexes to store the occurrences of a term in a collection of documents. [Read more](/en/performance/feature-tuning#posting-lists).
+
+- **Quantization**
+
+ Quantization is the process of constraining an input from a continuous or otherwise large set of values (such as the real numbers) to a discrete set (such as the integers). It is a way to reduce memory and CPU usage for [tensor operations](/en/learn/glossary#tensor) in [nearest neighbor search](/en/learn/glossary#nearest-neighbor-search), to improve throughput or latencies.
+
+- **Query**
+
+ Use the [Query API](/en/querying/query-api) to query the corpus. Queries are written in [YQL](/en/reference/querying/yql), or can be created programmatically in a [Searcher](/en/learn/glossary#searcher). Configure query execution in a [Query Profile](/en/querying/query-profiles).
+
+- **Ranking**
+
+ Ranking is where Vespa does computing, or inference over documents. The computations to be done are expressed in functions called [Ranking Expressions](/en/ranking/ranking-expressions-features#ranking-expressions), bundled into [Rank Profiles](/en/basics/ranking#rank-profiles) defined in a [Schema](/en/learn/glossary#schema). These can range from simple math expressions combining some rank features, to tensor expressions or large machine-learned models. Ranking can be single- or [multiphased](/en/ranking/phased-ranking).
+
+- **Schema**
+
+ A description of a particular type of data and how to process/rank it. See the [Schema guide](/en/basics/schemas).
+
+- **Searcher**
+
+ A searcher is a [Component](/en/learn/glossary#component) - usually deployed as part of an OSGi bundle. All Searchers must implement a single method `search(query)`. Developers implement application query logic in Searchers - [read more](/en/applications/searchers).
+
+- **Semantic search**
+
+ Semantic search denotes search with meaning, as distinguished from lexical search where the search engine looks for literal matches of the query words. Read [Revolutionizing Semantic Search with Multi-Vector HNSW Indexing](https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/) for more details on semantic search, pointers to resources, and how to implement it.
+
+- **Service**
+
+ A Service runs in a [Cluster](/en/learn/glossary#cluster) of container or content nodes, configured in [services.xml](/en/reference/applications/services/services).
+
+- **Streaming search**
+
+ [Streaming search](/en/performance/streaming-search) is querying fields that do not have an index structure. The indexing cost is minimal as no index is generated. A query is hence a scan over all data, and normally slower than using index structures. Streaming search is used for applications like personal search, where the searched data volume is small. It can be a powerful option to drastically limit memory use in nearest-neighbor applications where the possible neighbor set is orders of magnitude smaller than the total.
+
+- **Tenant**
+
+ An organizational unit that owns [applications](/en/learn/glossary#application). In Vespa.ai APIs, *tenant* and *application* are always "default", and a Vespa system has exactly one tenant and one application. On [Vespa Cloud](/), multiple tenants and applications are supported - [read more](/en/learn/tenant-apps-instances).
+
+- **Tensor**
+
+ A [Tensor](/en/ranking/tensor-user-guide) is a data structure which generalizes scalars, vectors and matrices to any number of dimensions: A scalar is a tensor of rank 0, a vector is a tensor of rank 1, a matrix is a tensor of rank 2. Tensors consist of a set of scalar valued cells, with each cell having a unique address. A cell's address is specified by its index or label in all the dimensions of that tensor. The number of dimensions in a tensor is the rank of the tensor, and each dimension can be either mapped or indexed.
+
+- **Visit**
+
+ [Visit](/en/writing/visiting) is a feature to efficiently get or process a set of / all documents, identified by a [Document Selection Expression](/en/reference/writing/document-selector-language). Visit iterates over all, or a set of, buckets and sends documents to a (set of) targets.
diff --git a/mintlify-docs/en/learn/llm-help.mdx b/mintlify-docs/en/learn/llm-help.mdx
new file mode 100644
index 0000000000..0b239f2d21
--- /dev/null
+++ b/mintlify-docs/en/learn/llm-help.mdx
@@ -0,0 +1,55 @@
+---
+title: Getting help from LLMs
+description: "This page describes some of the ways that you can get help from large language models (LLMs) when developing a Vespa application."
+---
+
+From our experience, providing the right context to the LLM is essential to get good results when asking questions about Vespa.
+
+## Markdown version of documentation pages
+
+Every page of the documentation is available in Markdown format, by changing the URL from `.html` to `.html.md`. There is also a link to the markdown version in the top right corner of each page.
+
+This can for example be used to copy/paste relevant markdown documentation page(s) into your AI tool of choice when working with LLMs on particular topics.
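+
+When collecting several pages for an LLM prompt, the URL rewrite described above is easy to script. The following is a minimal sketch in Python; the function name `markdown_url` and the example page URL are illustrative assumptions, not part of any Vespa tooling.
+
+```python
+from urllib.parse import urlsplit, urlunsplit
+
+
+def markdown_url(page_url: str) -> str:
+    """Return the Markdown variant of a documentation page URL.
+
+    Only the path is rewritten, so query strings and fragments survive.
+    URLs whose path does not end in .html are returned unchanged.
+    """
+    parts = urlsplit(page_url)
+    if not parts.path.endswith(".html"):
+        return page_url
+    return urlunsplit(parts._replace(path=parts.path + ".md"))
+
+
+# Hypothetical page URL, used only to illustrate the rewrite.
+print(markdown_url("https://docs.vespa.ai/en/example-page.html"))
+```
+
+Rewriting only the path component keeps anchors such as `#section` intact, which a plain string replacement would not guarantee.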
+
+## llms.txt
+
+We provide an [llms.txt](https://docs.vespa.ai/llms.txt) file that can serve as a top-level entry point for an LLM. It includes a top-level overview and an architecture description, as well as the title of, and a link to the Markdown version of, every documentation page.
+
+See [llmstxt.org](https://llmstxt.org/) for more information about the format.
+
+### Example usage
+
+The [llms.txt](https://docs.vespa.ai/llms.txt) file can be downloaded with:
+
+```bash
+curl -O https://docs.vespa.ai/llms.txt
+```
+
+This file can then be used as an entrypoint when working with LLMs, either through an IDE, CLI or a chat interface. If the LLM has a tool available that allows it to fetch the referenced URLs, it can fetch the content of the desired pages as needed.
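+
+If your tool cannot follow links on its own, you can select the relevant pages yourself before handing them to the LLM. The sketch below assumes the file lists pages as standard Markdown links (`[title](url)`), as the llms.txt format describes; the helper names and the sample excerpt with `example.com` URLs are hypothetical.
+
+```python
+import re
+
+# Matches Markdown links of the form [title](url).
+LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")
+
+
+def extract_links(llms_txt: str) -> list[tuple[str, str]]:
+    """Return (title, url) pairs for every Markdown link in the text."""
+    return LINK_PATTERN.findall(llms_txt)
+
+
+def find_pages(llms_txt: str, keyword: str) -> list[str]:
+    """Return URLs whose link title contains the keyword, ignoring case."""
+    needle = keyword.lower()
+    return [url for title, url in extract_links(llms_txt) if needle in title.lower()]
+
+
+# Hypothetical excerpt; the real file is much longer.
+SAMPLE = """# Docs
+- [Query API](https://example.com/docs/query-api.html.md)
+- [Partial updates](https://example.com/docs/partial-updates.html.md)
+"""
+
+print(find_pages(SAMPLE, "query"))
+```
+
+In practice you would read the downloaded `llms.txt` from disk and pass its contents to `find_pages` instead of `SAMPLE`.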
+
+We also provide [llms-full.txt](https://docs.vespa.ai/llms-full.txt) which contains the _full_ content of all documentation pages in markdown format. This file is relatively large (almost 0.5M words as of Oct 2025), so use accordingly.
+
+## MCP Server
+
+### Public Vespa MCP server
+
+We don't provide any official [MCP](https://modelcontextprotocol.io/) server at this time, but will update this page as soon as we do.
+
+### Personal MCP server
+
+Users can enable MCP server capabilities in their own Vespa apps. This can be done by adding `McpRequestHandler` to `services.xml` with one or more `McpSpecProvider` components.
+
+A pre-built `McpSearchSpecProvider` already exists, and a usage example can be found in [this sample app](https://github.com/vespa-engine/sample-apps/tree/master/examples/mcp-server-app). This exposes Vespa search to LLMs via the `/mcp/` endpoint.
+
+Users can add more tools by implementing `McpSpecProvider` and adding the components in `services.xml`.
+
+#### Example MCP config
+
+Add this to `services.xml`
+
+```xml
+
+
+ http://*/mcp/*
+
+```
diff --git a/mintlify-docs/en/learn/migrating-from-elastic-search.mdx b/mintlify-docs/en/learn/migrating-from-elastic-search.mdx
new file mode 100644
index 0000000000..a68cb9ca1f
--- /dev/null
+++ b/mintlify-docs/en/learn/migrating-from-elastic-search.mdx
@@ -0,0 +1,239 @@
+---
+title: Migrating from Elasticsearch
+description: "This is a guide for how to move data from Elasticsearch to Vespa. By the end of this guide you will have exported documents from Elasticsearch, generated a deployable Vespa application package and tested this with documents and queries."
+---
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+
+
+To get started, [sign up](https://vespa.ai/free-trial/) to get an endpoint to deploy to. Set the *tenant name* from the signup:
+
+```bash
+$ export TENANT_NAME=vespa-team # Replace with your tenant name
+```
+
+Alternatively, [test with local deployment](#test-with-local-deployment).
+
+## Feed a sample Elasticsearch index
+
+This section sets up an index with 1000 sample documents using [getting-started-index](https://www.elastic.co/guide/en/elasticsearch/reference/7.9/getting-started-index). Skip this part if you already have an index. Wait for Elasticsearch to start:
+
+```bash
+$ docker network create --driver bridge esnet
+
+$ docker run -d --rm --name esnode --network esnet -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" \
+ docker.elastic.co/elasticsearch/elasticsearch:7.10.2
+
+$ while [[ "$(curl -s -o /dev/null -w ''%{http_code}'' localhost:9200)" != "200" ]]; do sleep 1; echo 'waiting ...'; done
+```
+
+Download test data, and feed it to the Elasticsearch instance:
+
+```bash
+$ curl 'https://raw.githubusercontent.com/elastic/elasticsearch/7.10/docs/src/test/resources/accounts.json' \
+ > accounts.json
+
+$ curl -H "Content-Type:application/json" --data-binary @accounts.json 'localhost:9200/bank/_bulk?pretty&refresh'
+```
+
+Verify that the index has 1000 documents:
+
+```bash
+$ curl 'localhost:9200/_cat/indices?v'
+```
+
+## Export documents from Elasticsearch
+
+This guide uses [ElasticDump](https://github.com/elasticsearch-dump/elasticsearch-dump) to export the index contents and the index mapping. Export the documents and mappings, then delete the Docker network and the Elasticsearch container:
+
+```bash
+$ docker run --rm --name esdump --network esnet -v "$PWD":/dump -w /dump elasticdump/elasticsearch-dump \
+ --input=http://esnode:9200/bank --output=bank_data.json --type=data
+
+$ docker run --rm --name esdump --network esnet -v "$PWD":/dump -w /dump elasticdump/elasticsearch-dump \
+ --input=http://esnode:9200/bank --output=bank_mapping.json --type=mapping
+
+$ docker rm -f esnode && docker network remove esnet
+```
+
+## Generate Vespa documents and Application Package
+
+[ES_Vespa_parser.py](https://github.com/vespa-engine/vespa/tree/master/config-model/src/main/python/ES_Vespa_parser.py) is provided for conversion of Elasticsearch data and index mappings to Vespa data and configuration. It is a basic script with minimal error checking, designed for a simple export. Modify it as needed for your application. Generate Vespa documents and configuration:
+
+```bash
+$ curl 'https://raw.githubusercontent.com/vespa-engine/vespa/master/config-model/src/main/python/ES_Vespa_parser.py' \
+ > ES_Vespa_parser.py
+
+$ python3 ./ES_Vespa_parser.py --application_name bank bank_data.json bank_mapping.json
+```
+
+This generates documents in *documents.json* (see [JSON format](/en/reference/schemas/document-json-format)) where each document has an ID like `id:bank:_doc::1`. It also generates a *bank* folder with an [application package](/en/basics/applications):
+
+```txt
+/bank
+ │
+ ├── documents.json
+ ├── hosts.xml
+ ├── services.xml
+ └── /schemas
+ └── _doc.sd
+```
+
+Enter the application package directory:
+
+```bash
+$ cd bank
+```
+
+## Deploy
+
+Install [Vespa CLI](/en/clients/vespa-cli). This example uses [Homebrew](https://brew.sh/); you can also download it from [GitHub](https://github.com/vespa-engine/vespa/releases):
+
+```bash
+$ brew install vespa-cli
+```
+
+Configure for Vespa Cloud deployment, log in and add credentials:
+
+```bash
+$ vespa config set target cloud
+$ vespa config set application $TENANT_NAME.myapp.default
+```
+
+```bash
+$ vespa auth login
+```
+
+```bash
+$ vespa auth cert
+```
+
+Also see [getting started](/en/basics/deploy-an-application) guide. Deploy the application package:
+
+```bash
+$ vespa deploy --wait 300
+```
+
+Index the documents exported from Elasticsearch:
+
+```bash
+$ vespa feed documents.json
+```
+
+## Interfacing with Vespa
+
+Export all documents:
+
+```bash
+$ vespa visit
+```
+
+Get a document:
+
+```bash
+$ vespa document get id:bank:_doc::1
+```
+
+Count documents, find `"totalCount":1000` in the output:
+
+```bash
+$ vespa query 'select * from _doc where true'
+```
+
+Run a simple query against the *firstname* field:
+
+```bash
+$ vespa query 'select firstname,lastname from _doc where firstname contains "amber"'
+```
+
+## Next steps
+
+Review the differences in document records, Elasticsearch first, then Vespa:
+
+Elasticsearch:
+
+```json
+{
+ "_index": "bank",
+ "_type": "_doc",
+ "_id": "1",
+ "_score": 1,
+ "_source": {
+ "account_number": 1,
+ "balance": 39225,
+ "firstname": "Amber",
+ "lastname": "Duke",
+ "age": 32,
+ "gender": "M",
+ "address": "880 Holmes Lane",
+ "employer": "Pyrami",
+ "email": "amberduke@pyrami.com",
+ "city": "Brogan",
+ "state": "IL"
+ }
+}
+```
+
+Vespa:
+
+```json
+{
+ "put": "id:bank:_doc::1",
+ "fields": {
+ "account_number": 1,
+ "balance": 39225,
+ "firstname": "Amber",
+ "lastname": "Duke",
+ "age": 32,
+ "gender": "M",
+ "address": "880 Holmes Lane",
+ "employer": "Pyrami",
+ "email": "amberduke@pyrami.com",
+ "city": "Brogan",
+ "state": "IL"
+ }
+}
+```
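+
+The mapping between the two records can be sketched in a few lines of Python. This is an illustration of the transformation only, not the code of `ES_Vespa_parser.py`; the function name `to_vespa_put` and its `namespace` parameter are assumptions made for this example.
+
+```python
+import json
+
+
+def to_vespa_put(hit: dict, namespace: str) -> dict:
+    """Convert one exported Elasticsearch hit into a Vespa put operation.
+
+    The schema name is taken from the hit's _type and the user-specified
+    part of the ID from its _id, mirroring the two records shown above.
+    """
+    doc_id = f"id:{namespace}:{hit['_type']}::{hit['_id']}"
+    return {"put": doc_id, "fields": dict(hit["_source"])}
+
+
+# Shortened version of the Elasticsearch record shown above.
+hit = {
+    "_index": "bank",
+    "_type": "_doc",
+    "_id": "1",
+    "_score": 1,
+    "_source": {"account_number": 1, "firstname": "Amber", "lastname": "Duke"},
+}
+
+print(json.dumps(to_vespa_put(hit, namespace="bank"), indent=2))
+```
+
+Elasticsearch metadata such as `_index` and `_score` is dropped, and the `_source` object becomes the Vespa `fields` object unchanged.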
+
+The [id](/en/schemas/documents#document-ids) field `id:bank:_doc::1` is composed of:
+
+- namespace: `bank`
+- schema: `_doc`
+- id: `1`
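+
+As a sketch of how such an ID decomposes, the Python snippet below splits it into its parts. The empty segment between the two colons is the optional key/value part of the ID, which this example does not use. The `DocumentId` type and `parse_document_id` function are hypothetical helpers, not a Vespa API.
+
+```python
+from typing import NamedTuple
+
+
+class DocumentId(NamedTuple):
+    namespace: str
+    schema: str
+    key_values: str
+    user_id: str
+
+
+def parse_document_id(doc_id: str) -> DocumentId:
+    """Split id:<namespace>:<schema>:<key/values>:<user-specified id>.
+
+    The user-specified part may itself contain colons, so the string is
+    split at most four times.
+    """
+    parts = doc_id.split(":", 4)
+    if len(parts) != 5 or parts[0] != "id":
+        raise ValueError(f"not a valid document ID: {doc_id!r}")
+    _, namespace, schema, key_values, user_id = parts
+    return DocumentId(namespace, schema, key_values, user_id)
+
+
+print(parse_document_id("id:bank:_doc::1"))
+```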
+
+Read more in [Documents](/en/schemas/documents) and [Schemas](/en/basics/schemas). The schema is the key Vespa configuration file where field types and [ranking](/en/basics/ranking) are configured. The schema (found in `schemas/_doc.sd`) also has [indexing](/en/basics/schemas#document-fields) settings, for example:
+
+```txt
+search _doc {
+ document _doc {
+ field account_number type long {
+ indexing: summary | attribute
+ }
+ field address type string {
+ indexing: summary | index
+ }
+ ...
+ }
+}
+```
+
+These settings impact both performance and how fields are matched. For example, the *account_number* above is using the *attribute* keyword, which makes the field available for [sorting](/en/reference/querying/sorting-language), [ranking](/en/basics/ranking), [grouping](/en/querying/grouping), but which by default does not have data structures for fast search. Read more in [attributes](/en/content/attributes) and [practical search performance guide](/en/performance/practical-search-performance-guide).
+
+## Test with local deployment
+
+To run the steps above, using a local deployment, follow the steps in the [quickstart](/en/basics/deploy-an-application-local) to start a local container running Vespa. Then, deploy the application package from the *bank* folder.
diff --git a/mintlify-docs/en/learn/migrating-to-cloud.mdx b/mintlify-docs/en/learn/migrating-to-cloud.mdx
new file mode 100644
index 0000000000..8ec6fd9b77
--- /dev/null
+++ b/mintlify-docs/en/learn/migrating-to-cloud.mdx
@@ -0,0 +1,261 @@
+---
+title: Migrating to Vespa Cloud
+description: "Migrating a Vespa application to Vespa Cloud is straightforward, as applications on Vespa Cloud support all the same features as your self-hosted Vespa instances. You gain some new capabilities and avoid the operational work."
+---
+
+The high-level process is as follows:
+
+
+
+ Functional validation using the [dev](/en/operations/environments#dev) environment (this guide).
+
+
+ Deployment to a [prod](/en/operations/environments#prod) zone.
+
+
+
+The rest of this guide assumes you have a [tenant](/en/learn/tenant-apps-instances) ready for deployment:
+
+```bash
+$ export VESPA_TENANT_NAME=mytenant
+```
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+- Alternatively, start the Podman daemon:
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+- See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+
+
+
+**Note:**
+
+[Vespa Cloud Enclave](/en/operations/enclave/enclave) users: Run the enclave setup steps first.
+
+
+
+
+ An [application package](/en/basics/applications) from a self-hosted system can be deployed with minor modifications to the Vespa Cloud `dev` environment.
+
+ The root of an application package might look like this:
+
+ ```txt
+ ├── schemas
+ │ └── doc.sd
+ └── services.xml
+ ```
+
+ There are often more files, the above is a minimum. This is the root of the application package - make this the current working directory:
+
+ ```bash
+ $ cd /location/of/app/package
+ ```
+
+
+
+ Make sure the [Vespa CLI](/en/clients/vespa-cli) is installed:
+
+ ```bash
+ $ vespa
+ Usage:
+ vespa [flags]
+ vespa [command]
+ ```
+
+
+
+ Configure the local environment and log in to Vespa Cloud:
+
+ ```bash
+ $ vespa config set target cloud && \
+ vespa config set application $VESPA_TENANT_NAME.myapp && \
+ vespa auth login
+ ```
+
+
+
+ Create and get security credentials:
+
+ ```bash
+ $ vespa auth cert
+ ```
+
+ This will add the `security` directory to the application package, and add a public certificate to it:
+
+ ```txt
+ ├── schemas
+ │ └── doc.sd
+ ├── security
+ │ └── clients.pem
+ └── services.xml
+ ```
+
+ The command also installs a key/certificate pair in the Vespa CLI home directory, see [vespa auth cert](/en/reference/clients/vespa-cli/vespa_auth_cert). This pair is used in subsequent accesses to the data plane for document and query operations.
+
+
+
+
+ **Note:**
+
+ Skip this step unless you are using [Vespa Cloud Enclave](/en/operations/enclave/enclave).
+
+
+ Add [deployment.xml](/en/reference/applications/deployment#deployment) with your cloud provider account. This ensures the deployment uses resources from the correct account. Examples:
+
+ ```xml
+
+
+
+ ```
+
+ ```xml
+
+
+
+ ```
+
+ The application package should look like:
+
+ ```txt
+ ├── deployment.xml
+ ├── schemas
+ │ └── doc.sd
+ ├── security
+ │ └── clients.pem
+ └── services.xml
+ ```
+
+
+
+ `hosts.xml` is not used in Vespa Cloud, remove it.
+
+
+
+ Edit the `` configuration in `services.xml` - from:
+
+ ```xml
+
+
+
+
+
+
+
+
+
+ ```
+
+ to:
+
+ ```xml
+
+
+
+
+
+
+
+
+ ```
+
+ In short, this is the Vespa Cloud syntax for resource specifications.
+
+ Example, migrating from a cluster using `c7i.2xlarge` instance type, with a 200G disk - from the AWS specifications:
+
+ ```txt
+ c7i.2xlarge 8 16 EBS-Only
+ ```
+
+ Equivalent Vespa Cloud configuration:
+
+ ```xml
+
+ ```
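+
+ When several clusters need converting, the resource line can be generated from the instance specifications. The sketch below assumes the Vespa Cloud `resources` element with `vcpu`, `memory` and `disk` attributes, as documented in the services reference; the helper name `resources_element` is hypothetical, so verify the output against that reference before deploying.
+
+ ```python
+ from xml.sax.saxutils import quoteattr
+
+
+ def resources_element(vcpu: int, memory_gb: int, disk_gb: int) -> str:
+     """Render a resources element from instance specifications.
+
+     Assumes the attribute names vcpu, memory and disk, with sizes
+     written using a Gb suffix.
+     """
+     attrs = {"vcpu": str(vcpu), "memory": f"{memory_gb}Gb", "disk": f"{disk_gb}Gb"}
+     rendered = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs.items())
+     return f"<resources {rendered}/>"
+
+
+ # c7i.2xlarge: 8 vCPU and 16 GiB memory, combined with the 200G disk from the example.
+ print(resources_element(vcpu=8, memory_gb=16, disk_gb=200))
+ ```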
+
+ Repeat this for all clusters in `services.xml`. Notes:
+
+ 1. As you are now migrating to the `dev` environment, what is _actually_ deployed is a minimized version. The configuration changes above are easily tested in this environment.
+ 2. Using `count=2` is best practice at this point.
+ 3. Resources must match a node instance type at the cloud provider(s) you are deploying to, see [AWS flavors](/en/performance/instance-types/aws-instance-types), [GCP flavors](/en/performance/instance-types/gcp-instance-types), and [Azure flavors](/en/performance/instance-types/azure-instance-types).
+
+
+
+ At this point, the local environment and the application package are ready for deployment:
+
+ ```bash
+ $ vespa deploy --wait 600
+ ```
+
+ Please note that a first-time deployment normally takes a few minutes, as resources are provisioned.
+
+ At this point, we recommend opening the console to observe the deployed application. The link will be `https://console.vespa-cloud.com/tenant/mytenant/application/myapp/dev/instance/default` (replace with your own names) - this is also easily found in the console main page:
+
+
+ 
+
+
+ Refer to the [Vespa 8 release notes](/en/reference/release-notes/vespa8) for troubleshooting in case the deployment fails for an application based on Vespa 7 (or earlier).
+
+
+
+ The endpoints are shown in the console, one can also list them like:
+
+ ```bash
+ $ vespa status query
+ Container default at https://aa1c1234.b225678e.z.vespa-app.cloud/ is ready
+ ```
+
+ Test the query endpoint, expect `totalCount: 0`:
+
+ ```bash
+ $ vespa query 'select * from sources * where true'
+ ```
+
+ ```json
+ {
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 0
+ },
+ ```
+
+ In the `services.xml` examples at the start of this guide, both `` and `` are configured in the same cluster, named `default`. In case of multiple container clusters, select the one configured with ``:
+
+ ```bash
+ vespa query 'select * from sources * where true' --cluster myquerycluster
+ ```
+
+ Finally, feed a document to the cluster (this is the cluster configured with ``):
+
+ ```bash
+ vespa feed mydoc.jsonl --cluster myfeedcluster
+ ```
+
+ Redo the query and observe nonzero `totalCount`.
+
+
+
+## Next steps
+
+This is the final step in the functional validation. Please note:
+
+
+**Note:**
+
+Deployments to `dev` expire after 7 days of inactivity, i.e., 7 days after the last deployment. **This applies to all plans**. Use the Vespa Console to extend the expiry period, or redeploy the application to add 7 more days.
+
+
+- Read more about the [dev](/en/operations/environments#dev) environment
+- Feed (a subset of) the data and validate that queries and other API accesses work as expected.
+- At the end of the validation process, continue to [production deployment](/en/operations/production-deployment) to set up in production zones.
diff --git a/mintlify-docs/en/learn/overview.mdx b/mintlify-docs/en/learn/overview.mdx
new file mode 100644
index 0000000000..164b96bada
--- /dev/null
+++ b/mintlify-docs/en/learn/overview.mdx
@@ -0,0 +1,70 @@
+---
+title: Vespa Overview
+description: "Vespa is a platform for applications which need low-latency computation over large data sets. It stores and indexes your structured, text and vector data so that queries, selection and processing and machine-learned model inference over the data can be performed quickly at serving time at any scale. Functionality can be customized and extended with application components hosted within Vespa. This document is an overview of the features and main components of Vespa."
+---
+
+## Introduction
+
+Vespa allows application developers to create applications that scale to large amounts of data and high loads without sacrificing latency or reliability. A Vespa application consists of a number of *stateless Java container clusters* and zero or more *content* clusters storing data.
+
+
+
+
+
+The [stateless **container** clusters](/en/applications/containers) host components which process incoming data and/or queries and their responses. These components provide functionality belonging to the platform like indexing transformations and the global stages of query execution, but can also include the middleware logic of the application. Application developers can configure their Vespa system with a single stateless cluster which performs all such functions, or create different clusters for each kind of task. The container clusters then pass queries and data operations on to the appropriate nodes in the content clusters. If the application uses data it does not own, you can add components to access data from external services as well.
+
+[**Content** clusters](/en/content/elasticity) in Vespa are responsible for storing data and executing queries and inferences over the data. Queries can range from simple data lookups for content serving to complex conditions for selecting the relevant data, ranking it using machine-learned models, and grouping and aggregating the data across all nodes participating in the query. All the operations provided by Vespa scale to more content, more expensive inference, and higher query volume simply by adding more nodes to the content clusters.
+
+When changing the nodes of a content cluster for scaling or on node failure, content clusters automatically re-balance data in the background to maintain a balanced distribution at the configured redundancy level. Faulty nodes are also automatically removed from the serving path to avoid any impact to queries and writes (failover).
+
+After intermediate processing in a container cluster, data is written to content clusters. Writes are persistent and visible in all queries after receiving an ack on the write message, after a few milliseconds. Each write is guaranteed to either succeed or provide a failure information response within a given time limit, and writes scale linearly with the available resources, indefinitely. In addition to rewriting and removing entire documents, writes may selectively modify only individual document fields. Writes can be sent directly over HTTP/2, or by using a Java client - refer to the [API documentation](/en/reference/api/api).
+
+Each document instance stored in Vespa is of a type defined in a configured [schema](/en/basics/schemas), which defines the document fields and how to store and index them, as well as the ranking and inference profiles that belong to the document type. Applications can contain any number of schemas for different data types, and configure them to be stored either in the same or multiple content clusters.
+
+Container and content clusters handle all the end user traffic of a Vespa application, but there's also a third type of cluster, the *admin and config clusters*. These set up and manage the other clusters in the application according to configuration, and manage the process of changing the clusters safely without disruption to traffic when the configuration changes.
+
+A Vespa application is completely specified by an [*application package*](/en/basics/applications), which is a directory structure containing a declaration of the clusters to run as part of the application, the content schemas, any machine-learned models and Java components, and other configuration or data files needed by various features. Application developers create a running application from their application package by *deploying* it to any node in the config cluster. Changes to a running application are made in the same way: by changing the application package and deploying again. Once Vespa is installed and started on a node, it is managed by the config system such that the entire system can be treated as a single unit, and application owners do not need to perform any administration tasks locally on the nodes running the application. It is also possible to configure nodes as *log servers* on Vespa. These will collect logs in real time from all the nodes of the application. By default, the first node in the config server cluster performs this role.
+
+The rest of this document provides some more detail on the functions Vespa performs.
+
+## Vespa operations
+
+Vespa accepts the following operations:
+
+- Writes: Put (add and replace) and remove documents, and update fields in these.
+- Lookup of a document (or some subset of it) by id.
+- [Queries](/en/querying/query-api): [*Select*](/en/querying/query-language) documents whose fields match conditions, which search free-text fields, structured data or [vector spaces (ANN)](/en/querying/nearest-neighbor-search). Any number of such conditions can be combined freely in boolean trees to define the full query to be executed. Vespa will compute a query plan over the conditions which executes them efficiently with any number of conditions, e.g., filters combined with ANN conditions. Matches to a query can be passed through an inference step which can compute any business logic or machine-learned model expressed as a [ranking expression](/en/reference/ranking/ranking-expressions) or [ONNX model](/en/ranking/onnx). Optionally, the highest scoring matches can also run through a second stage of this, to spend more computational resources on promising candidates. The final documents are ordered according to their score from these inferences ([*ranking*](/en/basics/ranking)), or by explicit [*sorting*](/en/reference/querying/sorting-language). Matches to queries can be [*grouped*](/en/querying/grouping) hierarchically by field values, where each group can contain aggregated values over the data in the group. This can be used to calculate values for, e.g., navigation aids, tag clouds, graphs or for clustering in a distributed fashion without having to transfer the distributed data to a single container node.
+- Data dumps: Content matching some criterion can be streamed out for background reprocessing, backup, etc., by using the [*visit*](/en/writing/visiting) operation.
+- [Any other custom network request](/en/reference/applications/components) which can be handled by application components deployed on a container cluster.
+
+## The stateless container
+
+[Container clusters](/en/applications/containers) host the application components which employ the operations listed above and process their return data. Vespa provides a set of components out of the box, together with component infrastructure: dependency injection, with added support for injection of config from the admin server or the application package; a component model based on OSGi; a shared mechanism to chain components into handler chains for modularity as well as metrics and logging. The container also provides the network layer for handling and issuing remote requests - HTTP is provided out of the box, and other protocols/transports can be transparently plugged in as components.
+
+Developers can make changes to components (and of course their configuration) simply by redeploying their application package - the system takes care of copying the components to the nodes of the cluster and loading/unloading components impacting request serving or restarting nodes.
+
+## Content clusters
+
+[Content clusters](/en/content/elasticity) store data and maintain distributed indices of data for searches and selects. Data is replicated over multiple nodes, with a number of copies specified by the application, such that the cluster can automatically repair itself on loss of a node or a disk. Using the same mechanism, clusters can also be grown or shrunk while online, simply by changing the set of available nodes declared in the application package.
+
+Lookup of an individual document is routed directly to a node storing that document, while queries are spread over a subset of nodes which contain the queried documents. Complex queries are performed as distributed algorithms with multiple steps back and forth between the container and the content nodes; this is to achieve the low latency which is one of the main design goals of Vespa.
+
+## Administration clusters and developer support
+
+The [single configuration cluster](/en/basics/applications) controls all the other clusters of the system.
+
+A config server derives the low level configuration of each individual cluster, node and process, such that the application developer can specify the desired system on a higher level without worrying about its detailed realization. Whenever the application package is redeployed, the system will compute the necessary changes in configuration and manage the process of moving safely from the current to the new configuration without disrupting queries or writes.
+
+Other admin clusters in Vespa are the cluster controller cluster (controls one or more content clusters), logserver cluster (logserver holds log archive for logs from all nodes in the application) and service location brokers (slobroks, which are a name service used by some services in Vespa).
+
+### Application packages
+
+Application packages may be [changed, redeployed](/en/reference/api/deploy-v2) and [inspected](/en/reference/api/config-v2) over an HTTP REST API, or through a [command line interface](/en/clients/vespa-cli#deployment). The administration cluster runs over [ZooKeeper](https://zookeeper.apache.org) to make changes to configuration singular and consistent, and to avoid having a single point of failure.
+
+An application package looks the same, and is deployed the same way, whether it specifies a large system with hundreds of nodes or a single node running all services. The only change needed is to the lists of nodes making up the cluster. The container clusters may also be started within a single Java VM by "deploying" the application package from a method call. This is useful for testing applications in an IDE and in unit tests. Application packages with components can be [developed](/en/applications/developer-guide) in an IDE using Maven starting from sample applications.
+
+## Summary
+
+Vespa allows functionally rich and highly available applications to be developed to scale and perform to high standards without burdening developers with the considerable low-level complexity this requires. It allows developers to evolve and grow their applications over time without taking the system offline. It also lets them avoid complex data and page precomputation schemes, which lead to stale data that cannot be personalized. Avoiding such schemes often requires complex queries to complete in real user time over data which is constantly changing at the same time.
+
+For more details, read [Vespa Features](/en/learn/features), or try to [deploy an application](/en/basics/deploy-an-application).
diff --git a/mintlify-docs/en/learn/releases.mdx b/mintlify-docs/en/learn/releases.mdx
new file mode 100644
index 0000000000..08cb079a2d
--- /dev/null
+++ b/mintlify-docs/en/learn/releases.mdx
@@ -0,0 +1,40 @@
+---
+title: Releases
+description: "Vespa is released every Monday through Thursday. Each public release has passed all functional and performance tests, and all cloud applications are automatically upgraded to it."
+---
+
+For each Vespa release, the following artifacts are provided:
+
+- [Java artifacts for building Vespa applications on Maven Central](https://search.maven.org/artifact/com.yahoo.vespa/parent)
+- [Vespa RPMs on Fedora Copr](https://copr.fedorainfracloud.org/coprs/g/vespa/vespa/)
+- [Container images on Docker Hub](https://hub.docker.com/repository/docker/vespaengine/vespa)
+
+Releases:
+
+- [Vespa 7](/en/reference/release-notes/vespa7)
+- [Vespa 8](/en/reference/release-notes/vespa8)
+
+Use the [Vespa Factory](https://factory.vespa.ai/releases) to inspect the commits in each release:
+
+
+
+
+
+## Versions
+
+Vespa uses [semantic versioning](https://semver.org/). Each release is backwards compatible and supports live migration on running systems, provided they are running a version which is less than 2 months old. It is therefore a minor version number change. All new features are released on such minor versions. Every second year or so we make a major version change which removes previously deprecated functionality.
+
+Java APIs, web service APIs and all application package constructs are supported through a major release and only removed on a new release if they are already marked deprecated.
+
+Use of deprecated Java APIs will cause a warning on compilation, and use of deprecated application package constructs will cause a deprecation warning on deployment. Note that Java APIs come in two categories:
+
+- *Public APIs* carry the compatibility guarantee and are visible from your code as well as in the Javadoc
+- *Exported APIs* are also visible from your code, but are not in the public Javadoc and carry no compatibility guarantee
+
+Check the Javadoc list to verify that you are using public packages.
+
+In addition, some public Java classes and methods are marked with the `com.yahoo.api.annotations.Beta` annotation. These are under development and may still change before they stabilize.
+
+## Stored Data
+
+Data written to Vespa is compatible between adjacent releases. For self-hosted systems, it may be necessary to upgrade through each minor release rather than in larger leaps to ensure Vespa can read existing data. This is a good practice in any case.
diff --git a/mintlify-docs/en/learn/tenant-apps-instances.mdx b/mintlify-docs/en/learn/tenant-apps-instances.mdx
new file mode 100644
index 0000000000..c458b86474
--- /dev/null
+++ b/mintlify-docs/en/learn/tenant-apps-instances.mdx
@@ -0,0 +1,30 @@
+---
+title: Tenants, Applications and Instances
+description: "When registering for Vespa Cloud, a tenant is created. Tenant is the billable unit, and most often represents an organization or a project. A tenant has one or more applications with one or more instances."
+---
+
+
+
+
+
+Instances are used for different use cases, and are deployed to a set of [zones](/en/operations/zones) - example:
+
+
+
+
+
+The *Application* has a "default" instance serving queries from two *production* zones. It has an "integration" instance with another dataset, which other applications use to integrate against a production-like, stable interface. Finally, a developer has deployed the "bob" instance to a *dev* zone to further develop plugin code.
+
+Deployments to production zones are specified in [deployment.xml](/en/reference/applications/deployment). Deployments to the manual *dev* zones are normally done directly from a developer computer for rapid code and config development. Read more in [Automated deployments](/en/operations/automated-deployments).
+
+The service configuration is specified in [services.xml](/en/reference/applications/services/services) and is composed of individually sized *clusters*. A cluster is deployed to a set of *nodes* with *resources* specified.
+
+One or more users may be members of the tenant. A user is given roles in the tenant based on their access level: *Administrator* for tenant-level management like adding new members and updating billing data, *Developer* for managing applications, and one for *read-only* access.
+
+## Lifecycle
+
+The tenant name cannot be changed - create a new tenant, or contact Vespa Support.
+
+Tenants in trial are auto-expired once trial is completed. Move to a paid plan to keep applications and data.
+
+It is not possible to auto-migrate applications and data between tenants. To move an application to a new tenant, re-deploy the application with the new tenant name, see [cloning applications and data](/en/operations/cloning).
diff --git a/mintlify-docs/en/learn/tutorials.mdx b/mintlify-docs/en/learn/tutorials.mdx
new file mode 100644
index 0000000000..2a7188ff76
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials.mdx
@@ -0,0 +1,64 @@
+---
+title: Tutorials and use cases
+---
+
+### Text search
+
+- [Tutorial: Text Search](/en/learn/tutorials/text-search) A text search tutorial and introduction to text ranking with Vespa using traditional information retrieval techniques like BM25.
+- [Tutorial: Improving Text Search with Machine Learning](/en/learn/tutorials/text-search-ml) This tutorial builds on the text search tutorial but introduces Learning to Rank to improve relevance.
+
+
+### Vector Search
+
+Learn how to use Vespa Vector Search in the [practical nearest neighbor search guide](/en/querying/nearest-neighbor-search-guide). It uses Vespa's support for [nearest neighbor search](/en/querying/nearest-neighbor-search); Vespa also supports fast [approximate nearest neighbor search](/en/querying/approximate-nn-hnsw). The guide covers combining vector search with filters and how to perform hybrid search, combining retrieval over inverted index structures with vector search.
+
+### Hybrid Search
+
+[Tutorial: Hybrid Text Search](/en/learn/tutorials/hybrid-search) A search tutorial and introduction to hybrid text ranking with Vespa, combining BM25 with text embedding models.
+
+### RAG (Retrieval-Augmented Generation)
+
+- [Tutorial: The RAG Blueprint](/en/learn/tutorials/rag-blueprint) A tutorial that provides a blueprint for building high-quality RAG applications with Vespa. Includes evaluation and learning-to-rank (LTR).
+- [Retrieval-augmented generation (RAG) in Vespa](/en/rag/rag)
+
+### Combining search and recommendation: The News tutorial
+
+Follow this series to learn how to build a complete application supporting both content recommendation/personalization, navigation, and search.
+
+- [News 1: Getting Started](/en/learn/tutorials/news-1-deploy-an-application)
+- [News 2: Application Packages, Feeding, Query](/en/learn/tutorials/news-2-basic-feeding-and-query)
+- [News 3: Sorting, Grouping and Ranking](/en/learn/tutorials/news-3-searching)
+- [News 4: Embeddings](/en/learn/tutorials/news-4-embeddings)
+- [News 5: Partial Updates, ANNs, Filtering](/en/learn/tutorials/news-5-recommendation)
+- [News 6: Custom Searchers, Document Processors](/en/learn/tutorials/news-6-recommendation-with-searchers)
+- [News 7: Parent-Child, Tensor Ranking](/en/learn/tutorials/news-7-recommendation-with-parent-child)
+
+### ML Model Serving
+
+Learn how to use Vespa for ML model serving in [Stateless Model Evaluation](/en/ranking/stateless-model-evaluation). Vespa supports running inference with models from many popular ML frameworks, which can be used for ranking, query classification, question answering, multi-modal retrieval, and more.
+
+- [Ranking with ONNX models](/en/ranking/onnx) Export models from popular deep learning frameworks such as PyTorch to ONNX format for serving in Vespa. Vespa integrates with ONNX-Runtime for accelerated inference.
+- [Ranking with LightGBM models](/en/ranking/lightgbm)
+- [Ranking with XGBoost models](/en/ranking/xgboost)
+- [Ranking with TensorFlow models](/en/ranking/tensorflow)
+
+### Embedding Model Inference
+
+Vespa supports integrating [embedding](/en/rag/embedding) models, which avoids transferring large amounts of embedding vector data over the network and allows for efficient serving of embedding models.
+
+- [Huggingface Embedder](/en/rag/embedding) Use single-vector embedding models from Hugging Face.
+- [ColBERT Embedder](/en/rag/embedding) Use multi-vector embedding models.
+- [Splade Embedder](/en/rag/embedding) Use sparse learned single vector embedding models.
+
+
+### E-Commerce
+
+The [e-commerce shopping sample application](/en/learn/tutorials/e-commerce) demonstrates Vespa grouping, true in-place partial updates, custom ranking, and more.
+
+### Building a custom HTTP API
+
+The [HTTP API tutorial](/en/learn/tutorials/http-api) shows how to build a custom HTTP API in an application.
+
+### More examples and sample applications
+
+There are many examples and starting applications on [GitHub](https://github.com/vespa-engine/sample-apps/) and [Pyvespa examples](https://vespa-engine.github.io/pyvespa/index.html).
diff --git a/mintlify-docs/en/learn/tutorials/e-commerce.mdx b/mintlify-docs/en/learn/tutorials/e-commerce.mdx
new file mode 100644
index 0000000000..7bc18f1219
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/e-commerce.mdx
@@ -0,0 +1,69 @@
+---
+title: "Use Case - shopping"
+---
+
+The [e-commerce, or shopping, use case](https://github.com/vespa-engine/sample-apps/tree/master/use-case-shopping) is an example of an e-commerce site complete with sample data and a web front end to browse product data and reviews. To quick start the application, follow the instructions in the [README](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/README.md) in the sample app.
+
+
+
+
+
+
+
+To browse the application, navigate to [localhost:8080/site](http://localhost:8080/site). This site is implemented through a custom [request handler](/en/applications/request-handlers) and is meant to be a simple example of creating a front end / middleware that sits in front of the Vespa back end. As such it is fairly independent of Vespa features, and the code is designed to be fairly easy to follow and as non-magical as possible. All the queries against Vespa are sent as HTTP requests, and the JSON results from Vespa are parsed and rendered.
+
+This sample application is built around the Amazon product data set found at [https://cseweb.ucsd.edu/~jmcauley/datasets.html](https://cseweb.ucsd.edu/~jmcauley/datasets.html). A small sample of this data is included in the sample application, and full data sets are available from the above site. This sample application contains scripts to convert from the data set format to Vespa format: [convert_meta.py](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/convert_meta.py) and [convert_reviews.py](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/convert_reviews.py). See [README](https://github.com/vespa-engine/sample-apps/tree/master/use-case-shopping#readme) for example use.
+
+When feeding reviews, there is a custom [document processor](/en/applications/document-processors) that intercepts document writes and updates the parent item with the review rating, so the aggregated review rating is kept stored with the item - see [ReviewProcessor](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/src/main/java/ai/vespa/example/shopping/ReviewProcessor.java). This is more an example of a custom document processor than a recommended way to do this, as feeding the reviews more than once will result in inflated values. To do this correctly, one should probably calculate this offline so a re-feed does not cause unexpected results.
+
+
+
+### Highlighted features
+
+* [Multiple document types](/en/basics/schemas)
+
+Vespa models data as documents, which are configured in schemas that define how documents should be stored, indexed, ranked, and searched. In Vespa, you can have multiple document types, and `services.xml` defines how these are distributed across the content clusters. This application uses three document types that are stored in the same content cluster: item, review and query. Search is done on items, but reviews refer to a single parent item and are rendered on the item page. The query document type is used to power auto-suggest functionality.
+
+* [Custom document processor](/en/applications/document-processors)
+
+In Vespa, you can set up custom document processors to perform any type of extra processing during document feeding. One example is to enrich the document with extra information, and another is to precalculate values of fields to avoid unnecessary computation during ranking. This application uses a document processor to intercept reviews and update the parent item's review rating.
+
+* [Custom searcher processor](/en/applications/searchers)
+
+In Vespa, you can set up custom searchers to perform any type of extra processing during querying. In the sample app there is a single custom searcher which builds the query for auto-suggestions, using a combination of [fuzzy matching](/en/reference/querying/yql#fuzzy) and [prefix search](/en/querying/text-matching#prefix-match).
+
+* [Custom handlers](/en/applications/request-handlers)
+
+With Vespa, you can set up general request handlers to handle any type of request. This example site is implemented with a single such request handler, [SiteHandler](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/src/main/java/ai/vespa/example/shopping/site/SiteHandler.java) which is set up in [services.xml](https://github.com/vespa-engine/sample-apps/blob/master/use-case-shopping/src/main/application/services.xml) to be bound to `/site`. Note that this handler is for example purposes and is designed to be independent of Vespa. Most applications would serve this through a dedicated setup.
+
+* [Custom configuration](/en/applications/configuring-components)
+
+When creating custom components in Vespa, for instance document processors, searchers or handlers, one can use custom configuration to inject config parameters into the components. This involves defining a config definition (a `.def` file), which creates a config class. You can instantiate this class with data in `services.xml` and the resulting object is dependency injected to the component during construction. This application uses custom config to set up the Vespa host details for the handler.
+
+* [Partial update](/en/reference/schemas/document-json-format#update)
+
+With Vespa, you can make changes to an existing document without submitting the full document. Examples are setting the value of a single field, adding elements to an array, or incrementing the value of a field without knowing the field value beforehand. This application contains an example of a partial update, in the voting of whether a review is helpful or not. The `SiteHandler` receives the request and the `ReviewVote` class sends a partial update to increment the `up`- or `downvotes` field.
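+
+As a rough illustration of the increment case described above, the following Python sketch builds a partial update operation in Vespa's document JSON format. The document ID is a made-up placeholder and the helper name `increment_update` is hypothetical; the sample app itself sends this kind of update from Java in the `ReviewVote` class.
+
+```python
+import json
+
+
+def increment_update(doc_id: str, field: str, amount: int = 1) -> dict:
+    """Return a partial update operation that increments one numeric field."""
+    return {
+        "update": doc_id,
+        "fields": {field: {"increment": amount}},
+    }
+
+
+# Hypothetical review document ID; "upvotes" follows the field named in the text.
+op = increment_update("id:review:review::example-review-1", "upvotes")
+print(json.dumps(op))
+```
+
+Only the named field is sent, so the current field value does not need to be known by the client.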
+
+* [Search using YQL](/en/querying/query-language)
+
+In Vespa, you search for documents using YQL. In this application, the classes responsible for retrieving data from Vespa (in the `data` package beneath the `SiteHandler`) set up the YQL queries which are used to query Vespa over HTTP.
+
+* [Grouping](/en/querying/grouping)
+
+Grouping is used to group various fields of query results together. For this application, many of the queries to Vespa include grouping requests. The home page uses grouping to dynamically extract the first 3 levels of categories from the stored items. The search page groups results matching the query into categories, brands, item rating and price ranges. The order in which the groups are rendered is determined by both counting and the relevance of the hits. This enables query-contextualized navigation.
+
+* [Rank profiles](/en/basics/ranking)
+
+Rank profiles are profiles containing instructions on how to score documents for a given query. The most important part of rank profiles are the ranking expressions. The schemas for the item and review document types contain different rank profiles to sort or score the data. The item ranking is using a hybrid combination of keyword and vector matching.
+
+* [Native embedders](/en/rag/embedding)
+
+Native embedders are used to map the textual query and document representations into dense high dimensional vectors which are used for semantic search. The application uses an open-source embedding model and inference is performed using [stateless model evaluation](/en/ranking/stateless-model-evaluation), both during document and query processing.
+
+* [Vector search](/en/querying/nearest-neighbor-search)
+
+The default retrieval uses approximate nearest neighbor search in combination with traditional lexical matching. Both the keyword and vector matching are constrained by filters such as brand, price or category.
+
+* [Ranking functions](/en/reference/schemas/schemas#function-rank)
+
+Ranking functions are contained in rank profiles and can be referenced as part of any ranking expression from either first-phase, second-phase, global-phase or other functions.
diff --git a/mintlify-docs/en/learn/tutorials/http-api.mdx b/mintlify-docs/en/learn/tutorials/http-api.mdx
new file mode 100644
index 0000000000..0cfe5890ed
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/http-api.mdx
@@ -0,0 +1,91 @@
+---
+title: "Building an HTTP API using request handlers and processors"
+---
+
+This tutorial builds a simple application consisting of these pieces:
+
+- A custom REST API - implemented in a _request handler_.
+- Two pieces of request/response processing logic - implemented as two chained _processors_.
+- A _component_ shared by the above processors.
+- A custom output format - a _renderer_.
+
+The end result is to process incoming requests of the form:
+
+```bash
+http://hostname:port/demo?terms=something%20completely%20different
+```
+
+into a nested structure response produced by the processors and serialized by the renderer. Use the sample application found at [http-api-using-request-handlers-and-processors](https://github.com/vespa-engine/sample-apps/tree/master/examples/http-api-using-request-handlers-and-processors).
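+
+A request URL of this form can be built programmatically. The sketch below is a minimal Python example that percent-encodes the `terms` parameter; the host, port and the helper name `build_demo_url` are assumptions for illustration only.
+
+```python
+from urllib.parse import quote, urlencode
+
+
+def build_demo_url(host: str, port: int, terms: str) -> str:
+    """Build a /demo request URL with the terms parameter percent-encoded."""
+    # quote_via=quote encodes spaces as %20 (the default, quote_plus, would use '+').
+    query = urlencode({"terms": terms}, quote_via=quote)
+    return f"http://{host}:{port}/demo?{query}"
+
+
+url = build_demo_url("localhost", 8080, "something completely different")
+print(url)
+```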
+
+## Request handler
+
+The custom request handler is required to implement a custom API. In many cases it is not necessary to add a custom handler as the Processors can access the request data directly. However, it is needed if e.g. your application wants more control over exactly which parameters are used to route to a particular processing chain.
+
+In this case, the request handler will simply add the request URI as a property and then forward to the built-in processing handler for processing.
+
+Review the code in [DemoHandler.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DemoHandler.java)
+
+## Processors
+
+This application contains two processors, one for annotating the incoming request (using default values from config) and checking the result, and one for creating the result using the shared component.
+
+### AnnotatingProcessor
+
+Review the code in [AnnotatingProcessor.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/AnnotatingProcessor.java)
+
+### DataProcessor
+
+The other processor creates some structured Response Data from data handed to it in the request. This is done in cases where the web service is a processing service. In cases where the service is implementing some middleware on top of other services, similar processors will instead make outgoing requests to downstream web services to produce Response Data.
+
+Review the code in [DataProcessor.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DataProcessor.java)
+
+Notice how the task of the server is decomposed into separate Processing steps which can be composed by chaining at configuration time and which communicate through the Request and Response only. This structure enhances sharing, reuse and modularity and makes it easy to create variations where some logic encapsulated in a Processor is added, removed or modified.
+
+The order of the processors is decided by the @Before and @After annotations - refer to [chained components](../../applications/chaining.html).
+
+### Custom configuration
+
+The default terms used by the AnnotatingProcessor are placed in user configuration, where the definition is in [demo.def](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/resources/configdefinitions/demo.def):
+
+```bash
+package=com.mydomain.demo
+
+demo[].term string
+```
+
+In other words, this defines a configuration class containing a single array named _demo_, whose elements are of a class Demo that contains only a single string named _term_.
+
+## Renderer
+
+The responsibility of the renderer is to serialize the structured result into bytes for transport back to the client.
+
+Rendering works by first creating a single instance of the renderer, invoking the constructor, then cloning a new renderer for each result set to be rendered. `init()` will be invoked once on each new clone before `render()` is invoked.
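+
+The lifecycle described above (one prototype instance, one clone per result set, `init()` before `render()`) can be sketched in Python. This is only an illustration of the pattern: the class `DemoRendererSketch` and the function `handle` are hypothetical and are not Vespa's Java renderer API.
+
+```python
+import copy
+
+
+class DemoRendererSketch:
+    """Hypothetical Python stand-in for a renderer; not Vespa's Java API."""
+
+    def __init__(self):
+        # Runs once, on the prototype instance only.
+        self.log = ["constructed"]
+
+    def clone(self):
+        # Each result set gets its own copy, so per-request state is not shared.
+        duplicate = copy.copy(self)
+        duplicate.log = []
+        return duplicate
+
+    def init(self):
+        # Per-clone setup, invoked once before render().
+        self.log.append("init")
+
+    def render(self, data):
+        self.log.append(f"render:{data}")
+        return "\n".join(str(item) for item in data)
+
+
+prototype = DemoRendererSketch()
+
+
+def handle(result):
+    """Clone the prototype, initialize the clone, then render one result set."""
+    renderer = prototype.clone()
+    renderer.init()
+    return renderer, renderer.render(result)
+
+
+r1, out1 = handle([1, 2])
+r2, out2 = handle([3])
+```
+
+Because every request works on its own clone, expensive setup can live in the constructor while request-specific state is reset in `init()`.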
+
+Review the code in [DemoRenderer.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DemoRenderer.java)
+
+## Shared component
+
+The responsibility of this custom component is to decouple some parts of the application from the Searcher. This makes it possible to reconfigure the Searcher without rebuilding the potentially costly custom component.
+
+In this case, what the component does is more than a little silly. More typical use would be an [FSA](/en/reference/operations/tools#vespa-makefsa) or complex, shared helper functionality.
+
+Review the code in [DemoComponent.java](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/java/ai/vespa/examples/DemoComponent.java)
+
+## Application
+
+Review the application's configuration in [services.xml](https://github.com/vespa-engine/sample-apps/blob/master/examples/http-api-using-request-handlers-and-processors/src/main/application/services.xml)
+
+## Try it!
+
+Build the project, then [run a test](../../applications/developer-guide.html). Querying [http://localhost:8080/demo?terms=1%202%203%204](http://localhost:8080/demo?terms=1%202%203%204) gives:
+
+```bash
+OK
+Renderer initialized: 1369733374898
+http://localhost:8080/demo?terms=1%202%203%204
+1
+ 2
+ 3
+ 4
+Rendering finished work: 1369733374902
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/learn/tutorials/hybrid-search.mdx b/mintlify-docs/en/learn/tutorials/hybrid-search.mdx
new file mode 100644
index 0000000000..53f5a18899
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/hybrid-search.mdx
@@ -0,0 +1,1066 @@
+---
+title: "Hybrid Text Search Tutorial"
+---
+
+
+Hybrid search combines different retrieval methods to improve search quality. This tutorial distinguishes between two core components of search:
+
+- **Retrieval**: Identifying a subset of potentially relevant documents from a large corpus. Traditional lexical methods like [BM25](/en/ranking/bm25) excel at this, as do modern, embedding-based [vector search](/en/querying/vector-search-intro) approaches.
+- **Ranking**: Ordering retrieved documents by relevance to refine the results. Vespa's flexible [ranking framework](/en/basics/ranking) enables complex scoring mechanisms.
+
+This tutorial demonstrates building a hybrid search application with Vespa that leverages the strengths of both lexical and embedding-based approaches. We'll use the [NFCorpus](https://ir-datasets.com/nfcorpus.html) dataset from the [BEIR](https://github.com/beir-cellar/beir) benchmark and explore various hybrid search techniques using Vespa's query language and ranking features.
+
+The main goal is to set up a text search app that combines simple text scoring features such as [BM25](/en/ranking/bm25) [^1] with vector search in combination with text-embedding models. We demonstrate how to obtain text embeddings within Vespa using Vespa's [embedder](/en/rag/embedding#huggingface-embedder) functionality. In this guide, we use [snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs) as the text embedding model. It is a small model that is fast to run and has a small memory footprint.
+
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- Python3
+- `curl`
+
+
+
+## Installing vespa-cli and ir_datasets
+
+This tutorial uses [Vespa-CLI](/en/clients/vespa-cli) to deploy, feed, and query Vespa. We also use [ir-datasets](https://ir-datasets.com/) to obtain the NFCorpus relevance dataset.
+```bash
+$ pip3 install --ignore-installed vespacli ir_datasets ir_measures requests
+```
+
+
+We can quickly look at a document from [nfcorpus](https://ir-datasets.com/beir.html#beir/nfcorpus):
+
+
+```bash
+$ ir_datasets export beir/nfcorpus docs --format jsonl | head -1
+```
+
+
+Which outputs:
+
+
+```json expandable
+{"doc_id": "MED-10", "text": "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear. We evaluated risk of breast cancer death among statin users in a population-based cohort of breast cancer patients. The study cohort included all newly diagnosed breast cancer patients in Finland during 1995\u20132003 (31,236 cases), identified from the Finnish Cancer Registry. Information on statin use before and after the diagnosis was obtained from a national prescription database. We used the Cox proportional hazards regression method to estimate mortality among statin users with statin use as time-dependent variable. A total of 4,151 participants had used statins. During the median follow-up of 3.25 years after the diagnosis (range 0.08\u20139.0 years) 6,011 participants died, of which 3,619 (60.2%) was due to breast cancer. After adjustment for age, tumor characteristics, and treatment selection, both post-diagnostic and pre-diagnostic statin use were associated with lowered risk of breast cancer death (HR 0.46, 95% CI 0.38\u20130.55 and HR 0.54, 95% CI 0.44\u20130.67, respectively). The risk decrease by post-diagnostic statin use was likely affected by healthy adherer bias; that is, the greater likelihood of dying cancer patients to discontinue statin use as the association was not clearly dose-dependent and observed already at low-dose/short-term use. The dose- and time-dependence of the survival benefit among pre-diagnostic statin users suggests a possible causal effect that should be evaluated further in a clinical trial testing statins\u2019 effect on survival in breast cancer patients.", "title": "Statin Use and Breast Cancer Survival: A Nationwide Cohort Study from Finland", "url": "http://www.ncbi.nlm.nih.gov/pubmed/25329299"}
+```
+
+
+The NFCorpus documents have four fields:
+
+- The `doc_id` and `url`
+- The `text` and the `title`
+
+We are interested in the title and the text, and we want to be able to search across these two fields. We also need to store the `doc_id` to evaluate [ranking](/en/basics/ranking) accuracy. We will create a small script that converts the above output to [Vespa JSON document](/en/reference/schemas/document-json-format) format. Create a `convert.py` file:
+
+
+```python
+import sys
+import json
+
+for line in sys.stdin:
+    doc = json.loads(line)
+    del doc['url']
+    vespa_doc = {
+        "put": "id:hybrid-search:doc::%s" % doc['doc_id'],
+        "fields": {
+            **doc
+        }
+    }
+    print(json.dumps(vespa_doc))
+```
+
+
+With this script, we convert the document dump to Vespa JSON format. Use the following command to convert the entire dataset to Vespa JSON format:
+
+
+```bash
+$ ir_datasets export beir/nfcorpus docs --format jsonl | python3 convert.py > vespa-docs.jsonl
+```
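+
+To check the conversion logic without downloading the dataset, the same transformation can be written as a function and run on an in-memory record. This is a minimal sketch: the function name `to_vespa_put` is hypothetical and the sample record is a shortened stand-in, not real dataset content.
+
+```python
+import json
+
+
+def to_vespa_put(doc: dict) -> dict:
+    """Convert one NFCorpus record to a Vespa put operation without mutating the input."""
+    # Drop the url field and keep doc_id, title and text, as convert.py does.
+    fields = {key: value for key, value in doc.items() if key != "url"}
+    return {"put": f"id:hybrid-search:doc::{doc['doc_id']}", "fields": fields}
+
+
+# Shortened stand-in for a real record.
+sample = {
+    "doc_id": "MED-10",
+    "text": "Recent studies ...",
+    "title": "Statin Use ...",
+    "url": "http://example.com",
+}
+op = to_vespa_put(sample)
+print(json.dumps(op))
+```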
+
+
+Now, we will create the Vespa application package and schema to index the documents.
+
+## Create a Vespa Application Package
+
+A [Vespa application package](/en/basics/applications) is a set of configuration files and optional Java components that together define the behavior of a Vespa system. Let us define the minimum set of required files to create our hybrid text search application: `doc.sd` and `services.xml`.
+
+
+```bash
+$ mkdir -p app/schemas
+```
+
+
+
+### Schema
+A [schema](/en/basics/schemas) is a document-type configuration; a single Vespa application can have multiple schemas with document types. For this application, we define a schema `doc`, which must be saved in a file named `schemas/doc.sd` in the application package directory.
+
+Write the following to `app/schemas/doc.sd`:
+
+
+```js expandable
+schema doc {
+ document doc {
+ field language type string {
+ indexing: "en" | set_language
+ }
+ field doc_id type string {
+ indexing: attribute | summary
+ match: word
+ }
+ field title type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field text type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ }
+ fieldset default {
+ fields: title, text
+ }
+
+ field embedding type tensor(v[384]) {
+ indexing: input title." ".input text | embed | attribute
+ attribute {
+ distance-metric: angular
+ }
+ }
+
+ rank-profile bm25 {
+ first-phase {
+ expression: bm25(title) + bm25(text)
+ }
+ }
+
+ rank-profile semantic {
+ inputs {
+ query(e) tensor(v[384])
+ }
+ first-phase {
+ expression: closeness(field, embedding)
+ }
+ }
+}
+```
+
+ A lot is happening here; let us go through it in detail.
+
+#### Document type and fields
+The `document` section contains the fields of the document, their types, and how Vespa should index and [match](/en/reference/schemas/schemas#match) them.
+
+The field property `indexing` configures the _indexing pipeline_ for a field. For more information, see [schemas - indexing](/en/basics/schemas#document-fields). The [string](/en/reference/schemas/schemas#string) data type represents both unstructured and structured texts, and there are significant differences between [index and attribute](/en/querying/text-matching#index-and-attribute). For visibility, the above schema spells out the default `match` modes for `attribute` and `index` fields.
+
+Note that we are enabling [BM25](/en/ranking/bm25) for `title` and `text` by including `index: enable-bm25`. The language field is the only field that is not in the NFCorpus dataset. We hardcode its value to "en" since the dataset is English. Using `set_language` avoids automatic language detection and uses the value when processing the other text fields. Read more in [linguistics](/en/linguistics/linguistics).
+
+#### Fieldset for matching across multiple fields
+
+[Fieldset](/en/reference/schemas/schemas#fieldset) allows searching across multiple fields. Defining `fieldset` does not add indexing/storage overhead. String fields grouped using fieldsets must share the same [match](/en/reference/schemas/schemas#match) and [linguistic processing](/en/linguistics/linguistics) settings because the query processing that searches a field or fieldset uses *one* type of transformation.
+
+#### Embedding inference
+Our `embedding` vector field is of [tensor](/en/ranking/tensor-user-guide) type with a single named dimension (`v`) of 384 values.
+
+```js
+field embedding type tensor(v[384]) {
+ indexing: input title." ".input text | embed arctic | attribute
+ attribute {
+ distance-metric: angular
+ }
+}
+```
+The `indexing` expression creates the input to the `embed` inference call (in our example the concatenation of the title and the text field). Since the dataset is small, we do not specify `index` which would build [HNSW](/en/querying/approximate-nn-hnsw) data structures for faster (but approximate) vector search. This guide uses [snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs) as the text embedding model. The model is trained with cosine similarity, which maps to Vespa's `angular` [distance-metric](/en/reference/schemas/schemas#distance-metric) for nearestNeighbor search.
+
+#### Ranking to determine matched documents ordering
+You can define many [rank profiles](/en/basics/ranking), named collections of score calculations, and ranking phases.
+
+In this starting point, we have two simple rank profiles:
+- a `bm25` rank-profile that uses [BM25](/en/ranking/bm25). We sum the two field-level BM25 scores using a Vespa [ranking expression](/en/ranking/ranking-expressions-features).
+- a `semantic` rank-profile, which is used in combination with Vespa's nearestNeighbor query operator (vector search).
+
+Both profiles specify a single [ranking phase](/en/ranking/phased-ranking).
+
+### Services Specification
+
+The [services.xml](/en/reference/applications/services/services.html) defines the services that make up the Vespa application — which services to run and how many nodes per service. Write the following to `app/services.xml`:
+
+
+```xml expandable
+
+
+
+
+
+
+
+
+
+ cls
+
+ Represent this sentence for searching relevant passages:
+
+
+
+
+
+ 1
+
+
+
+
+
+```
+
+Some notes about the elements above:
+
+- `<container>` defines the [container cluster](/en/applications/containers) for document, query and result processing.
+- `<search>` sets up the [query endpoint](/en/querying/query-api). The default port is 8080.
+- `<document-api>` sets up the [document endpoint](/en/reference/api/document-v1) for feeding.
+- `<component>` with type `hugging-face-embedder` configures the embedder in the application package. This includes where to fetch the model files from, the prepend instructions, and the pooling strategy. See [huggingface-embedder](/en/rag/embedding#huggingface-embedder) for details and other embedders supported.
+- `<content>` defines how documents are stored and searched.
+- The redundancy setting inside the content cluster denotes how many copies to keep of each document.
+- `<documents>` assigns the document types in the _schema_ to content clusters.
+
+
+## Deploy the application package
+
+Once we have finished writing our application package, we can deploy it. We use settings similar to those in the [Vespa quick start guide](/en/basics/deploy-an-application-local).
+
+Start the Vespa container:
+
+
+```bash
+$ docker run --detach --name vespa-hybrid --hostname vespa-container \
+  --publish 8080:8080 --publish 19071:19071 \
+  vespaengine/vespa
+```
+
+
+Notice that we publish two ports: 8080 is the data-plane where we write and query documents, and 19071 is the control-plane where we can deploy the application. Note that the data-plane port is inactive before deploying the application.
+
+Configure the Vespa CLI to use the local container:
+```bash
+$ vespa config set target local
+```
+
+
+Starting the container can take a short while. Make sure that the configuration service is running by using `vespa status`.
+
+
+```bash
+$ vespa status deploy --wait 300
+```
+
+
+Now, deploy the Vespa application from the `app` directory:
+
+
+```bash
+$ vespa deploy --wait 300 app
+```
+
+
+
+## Feed the data
+
+The data fed to Vespa must match the document type in the schema. This step performs embed inference inside Vespa using the snowflake arctic embedding model. Remember the `component` definition in `services.xml` and the `embed` call in the schema.
+
+
+```bash
+$ vespa feed -t http://localhost:8080 vespa-docs.jsonl
+```
+
+
+The output should look like this (rates will vary depending on your hardware):
+
+
+```json expandable
+{
+ "feeder.operation.count": 3633,
+ "feeder.seconds": 148.515,
+ "feeder.ok.count": 3633,
+ "feeder.ok.rate": 24.462,
+ "feeder.error.count": 0,
+ "feeder.inflight.count": 0,
+ "http.request.count": 3633,
+ "http.request.bytes": 2985517,
+ "http.request.MBps": 0.020,
+ "http.exception.count": 0,
+ "http.response.count": 3633,
+ "http.response.bytes": 348320,
+ "http.response.MBps": 0.002,
+ "http.response.error.count": 0,
+ "http.response.latency.millis.min": 316,
+ "http.response.latency.millis.avg": 787,
+ "http.response.latency.millis.max": 1704,
+ "http.response.code.counts": {
+ "200": 3633
+ }
+}
+```
+
+
+Notice:
+
+- `feeder.ok.rate`, which is the throughput. Note that this step includes embedding inference, which in this case is the bottleneck for overall indexing throughput. See [embedder-performance](/en/rag/embedding#embedder-performance) for details on embedding inference performance.
+- `http.response.code.counts` matches `feeder.ok.count`. The dataset has 3633 documents. If you observe any `429` responses, these are harmless: Vespa is asking the client to slow down the feed because of resource contention.
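+
+The two checks above can also be scripted. The following is a minimal sketch, not part of the Vespa CLI: the helper name `check_feed` is hypothetical, and the summary is an abbreviated copy of the output shown above rather than something read from a live feed.
+
+```python
+import json
+
+# Abbreviated copy of the feed summary printed above; a real run would
+# parse the JSON that `vespa feed` prints instead of this constant.
+FEED_SUMMARY = """
+{
+  "feeder.operation.count": 3633,
+  "feeder.seconds": 148.515,
+  "feeder.ok.count": 3633,
+  "feeder.error.count": 0,
+  "http.response.code.counts": {"200": 3633}
+}
+"""
+
+
+def check_feed(summary: dict, expected_docs: int) -> float:
+    """Return documents per second, raising if any operation failed."""
+    ok = summary["feeder.ok.count"]
+    codes = summary["http.response.code.counts"]
+    if summary["feeder.error.count"] != 0:
+        raise ValueError("feed reported errors")
+    if ok != expected_docs or codes.get("200", 0) != ok:
+        raise ValueError("not every document was acknowledged with HTTP 200")
+    return ok / summary["feeder.seconds"]
+
+
+if __name__ == "__main__":
+    rate = check_feed(json.loads(FEED_SUMMARY), expected_docs=3633)
+    print(f"{rate:.3f} documents/second")
+```
+
+Dividing `feeder.ok.count` by `feeder.seconds` reproduces the reported `feeder.ok.rate`, which is a quick way to confirm that the rate covers the whole feed, embedding inference included.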
+
+
+## Sample queries
+We can now run a few sample queries to demonstrate various ways to perform searches over this data using the [Vespa query language](/en/querying/query-language).
+
+
+```bash
+$ ir_datasets export beir/nfcorpus/test queries --fields query_id text | head -1
+```
+
+```bash
+PLAIN-2 Do Cholesterol Statin Drugs Cause Breast Cancer?
+```
+
+
+If you see a pipe-related error from the above command, you can safely ignore it.
+
+Here, `PLAIN-2` is the query id of the first test query. We'll use this test query to demonstrate querying Vespa.
+
+### Lexical search with BM25 scoring
+
+The following query uses [weakAnd](/en/ranking/wand), where `targetHits` is a hint of how many documents we want to expose to configurable [ranking phases](/en/ranking/phased-ranking). Refer to the [text search tutorial](/en/learn/tutorials/text-search#querying-the-data) for more on querying with `text`.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where default contains ({targetHits:10}text(@user-query))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=bm25'
+```
+
+
+Notice that we use the `ranking` parameter to specify which rank profile ranks the documents retrieved by the query. This query returns the following [JSON result response](/en/reference/querying/default-result-format):
+
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 46
+ },
+ "coverage": {
+ "coverage": 100,
+ "documents": 3633,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ },
+ "children": [
+ {
+ "id": "id:doc:doc::MED-10",
+ "relevance": 25.521817426330887,
+ "source": "content",
+ "fields": {
+ "sddocname": "doc",
+ "documentid": "id:doc:doc::MED-10",
+ "doc_id": "MED-10",
+ "title": "Statin Use and Breast Cancer Survival: A Nationwide Cohort Study from Finland",
+ "text": "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear. We evaluated risk of breast cancer death among statin users in a population-based cohort of breast cancer patients. The study cohort included all newly diagnosed breast cancer patients in Finland during 1995–2003 (31,236 cases), identified from the Finnish Cancer Registry. Information on statin use before and after the diagnosis was obtained from a national prescription database. We used the Cox proportional hazards regression method to estimate mortality among statin users with statin use as time-dependent variable. A total of 4,151 participants had used statins. During the median follow-up of 3.25 years after the diagnosis (range 0.08–9.0 years) 6,011 participants died, of which 3,619 (60.2%) was due to breast cancer. After adjustment for age, tumor characteristics, and treatment selection, both post-diagnostic and pre-diagnostic statin use were associated with lowered risk of breast cancer death (HR 0.46, 95% CI 0.38–0.55 and HR 0.54, 95% CI 0.44–0.67, respectively). The risk decrease by post-diagnostic statin use was likely affected by healthy adherer bias; that is, the greater likelihood of dying cancer patients to discontinue statin use as the association was not clearly dose-dependent and observed already at low-dose/short-term use. The dose- and time-dependence of the survival benefit among pre-diagnostic statin users suggests a possible causal effect that should be evaluated further in a clinical trial testing statins’ effect on survival in breast cancer patients."
+ }
+ }
+ ]
+ }
+}
+```
+
+
+The query retrieves and ranks `MED-10` as the most relevant document. Notice the `totalCount`, which is the number of documents that were retrieved for the ranking phases. In this case, we exposed 46 documents to first-phase ranking, which is more than our target of 10 but far fewer than the total number of documents that match any query term.
+
+In the example below, we change the grammar from the default `weakAnd` to `any`, and the query matches 1780, or almost 50% of the indexed documents.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where default contains ({targetHits:100, grammar:"any"}text(@user-query))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=bm25'
+```
+
+
+The bm25 rank profile calculates the relevance score (\~25.521), which is configured in the schema as:
+
+
+```txt
+rank-profile bm25 {
+ first-phase {
+ expression: bm25(title) + bm25(text)
+ }
+}
+```
+
+
+So, in this case, `relevance` is the sum of the two BM25 scores. The retrieved document looks relevant; we can look at the graded judgment for this query `PLAIN-2`. The following exports the query relevance judgments (we grep for the query id that we are interested in):
+
+
+```bash
+$ ir_datasets export beir/nfcorpus/test qrels | grep "PLAIN-2 "
+```
+
+
+The following is the output from the above command. Notice that line two, the `MED-10` document retrieved above, is judged as highly relevant (grade 2) for the query id `PLAIN-2`. This dataset has graded relevance judgments where a grade of 1 is less relevant than 2. Documents retrieved by the system without a relevance judgment are assumed to be irrelevant (grade 0).
+
+
+```bash expandable
+PLAIN-2 0 MED-2427 2
+PLAIN-2 0 MED-10 2
+PLAIN-2 0 MED-2429 2
+PLAIN-2 0 MED-2430 2
+PLAIN-2 0 MED-2431 2
+PLAIN-2 0 MED-14 2
+PLAIN-2 0 MED-2432 2
+PLAIN-2 0 MED-2428 1
+PLAIN-2 0 MED-2440 1
+PLAIN-2 0 MED-2434 1
+PLAIN-2 0 MED-2435 1
+PLAIN-2 0 MED-2436 1
+PLAIN-2 0 MED-2437 1
+PLAIN-2 0 MED-2438 1
+PLAIN-2 0 MED-2439 1
+PLAIN-2 0 MED-3597 1
+PLAIN-2 0 MED-3598 1
+PLAIN-2 0 MED-3599 1
+PLAIN-2 0 MED-4556 1
+PLAIN-2 0 MED-4559 1
+PLAIN-2 0 MED-4560 1
+PLAIN-2 0 MED-4828 1
+PLAIN-2 0 MED-4829 1
+PLAIN-2 0 MED-4830 1
+```
+
+
+### Dense search using text embedding
+
+Now, we turn to embedding-based retrieval, where we embed the query text using the configured text-embedding model and perform an exact `nearestNeighbor` search. We use [embed query](/en/rag/embedding#embedding-a-query-text) to produce the input tensor `query(e)`, defined in the `semantic` rank-profile in the schema.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where {targetHits:10}nearestNeighbor(embedding,e)' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'input.query(e)=embed(@user-query)' \
+ 'hits=1' \
+ 'ranking=semantic'
+```
+
+
+This query returns the following [JSON result response](/en/reference/querying/default-result-format):
+
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 64
+ },
+ "coverage": {
+ "coverage": 100,
+ "documents": 3633,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ },
+ "children": [
+ {
+ "id": "id:doc:doc::MED-2429",
+ "relevance": 0.6061378635706601,
+ "source": "content",
+ "fields": {
+ "sddocname": "doc",
+ "documentid": "id:doc:doc::MED-2429",
+ "doc_id": "MED-2429",
+ "title": "Statin use and risk of breast cancer: a meta-analysis of observational studies.",
+ "text": "Emerging evidence suggests that statins' may decrease the risk of cancers. However, available evidence on breast cancer is conflicting. We, therefore, examined the association between statin use and risk of breast cancer by conducting a detailed meta-analysis of all observational studies published regarding this subject. PubMed database and bibliographies of retrieved articles were searched for epidemiological studies published up to January 2012, investigating the relationship between statin use and breast cancer. Before meta-analysis, the studies were evaluated for publication bias and heterogeneity. Combined relative risk (RR) and 95 % confidence interval (CI) were calculated using a random-effects model (DerSimonian and Laird method). Subgroup analyses, sensitivity analysis, and cumulative meta-analysis were also performed. A total of 24 (13 cohort and 11 case-control) studies involving more than 2.4 million participants, including 76,759 breast cancer cases contributed to this analysis. We found no evidence of publication bias and evidence of heterogeneity among the studies. Statin use and long-term statin use did not significantly affect breast cancer risk (RR = 0.99, 95 % CI = 0.94, 1.04 and RR = 1.03, 95 % CI = 0.96, 1.11, respectively). When the analysis was stratified into subgroups, there was no evidence that study design substantially influenced the effect estimate. Sensitivity analysis confirmed the stability of our results. Cumulative meta-analysis showed a change in trend of reporting risk of breast cancer from positive to negative in statin users between 1993 and 2011. Our meta-analysis findings do not support the hypothesis that statins' have a protective effect against breast cancer. More randomized clinical trials and observational studies are needed to confirm this association with underlying biological mechanisms in the future."
+ }
+ }
+ ]
+ }
+}
+```
+
+
+The result of this vector-based search differs from the previous sparse keyword search, with a different relevant document at position 1. In this case, the relevance score is 0.606, calculated by the `closeness` function in the `semantic` rank-profile. Note that more documents were retrieved than the `targetHits`.
+
+```bash
+rank-profile semantic {
+ inputs {
+ query(e) tensor(v[384])
+ }
+ first-phase {
+ expression: closeness(field, embedding)
+ }
+ }
+```
+
+Here, [closeness(field, embedding)](/en/reference/ranking/rank-features#attribute-match-features-normalized) is a ranking feature derived from the distance between the query embedding and the document embedding, measured with the `angular` distance metric configured for the field. It returns the inverse of the distance between the two vectors, so a smaller distance gives a higher closeness. This matters because Vespa sorts results in descending order of relevance, so the largest score appears at the top of the ranked list.
+
+Note that similarity scores of embedding vectors are often optimized via contrastive or ranking losses, which make them difficult to interpret.
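+
+To make the relationship between closeness and cosine similarity concrete, here is a minimal sketch. It assumes that closeness is computed as `1 / (1 + distance)` and that the `angular` distance is the angle between the vectors in radians; both assumptions should be checked against the rank feature and distance-metric references linked above. The function names are hypothetical.
+
+```python
+import math
+
+
+def closeness_from_cosine(cosine: float) -> float:
+    """Assumed mapping: closeness = 1 / (1 + angle), angle in radians."""
+    # Clamp to guard against rounding slightly outside [-1, 1].
+    angle = math.acos(max(-1.0, min(1.0, cosine)))
+    return 1.0 / (1.0 + angle)
+
+
+def cosine_from_closeness(closeness: float) -> float:
+    """Invert the mapping above to recover the cosine similarity."""
+    angle = 1.0 / closeness - 1.0
+    return math.cos(angle)
+
+
+if __name__ == "__main__":
+    for cosine in (1.0, 0.8, 0.0):
+        print(cosine, round(closeness_from_cosine(cosine), 3))
+    # Closeness of the top hit in the dense query above.
+    print(round(cosine_from_closeness(0.606), 3))
+```
+
+Under these assumptions, identical directions give a closeness of 1, and the score falls as the angle grows, so the ordering by closeness is the same as the ordering by cosine similarity.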
+
+## Evaluate ranking accuracy
+
+The previous section demonstrated how to combine the Vespa query language with rank profiles to implement two different retrieval and ranking strategies.
+
+In the following section we evaluate all 323 test queries with both models to compare their overall effectiveness, measured using [nDCG@10](https://en.wikipedia.org/wiki/Discounted_cumulative_gain). `nDCG@10` is the official evaluation metric of the BEIR benchmark and is an appropriate metric for test sets with graded relevance judgments.
+
+For this evaluation task, we need to write a small script. The following script iterates over the queries in the test set, executes the query against the Vespa instance, and reads the response from Vespa. It then evaluates and prints the metric. The overall effectiveness is measured using the average of the per-query `nDCG@10` scores.
+
+
+```python expandable
+import requests
+import ir_datasets
+from ir_measures import calc_aggregate, nDCG, ScoredDoc
+from enum import Enum
+from typing import List
+
+class RModel(Enum):
+    SPARSE = 1
+    DENSE = 2
+    HYBRID = 3
+
+def parse_vespa_response(response:dict, qid:str) -> List[ScoredDoc]:
+    result = []
+    hits = response['root'].get('children',[])
+    for hit in hits:
+        doc_id = hit['fields']['doc_id']
+        relevance = hit['relevance']
+        result.append(ScoredDoc(qid, doc_id, relevance))
+    return result
+
+def search(query:str, qid:str, ranking:str,
+           hits=10, language="en", mode=RModel.SPARSE) -> List[ScoredDoc]:
+    yql = "select doc_id from doc where default contains ({targetHits:100}text(@user-query))"
+    if mode == RModel.DENSE:
+        yql = "select doc_id from doc where ({targetHits:10}nearestNeighbor(embedding, e))"
+    elif mode == RModel.HYBRID:
+        yql = "select doc_id from doc where default contains ({targetHits:100}text(@user-query)) OR ({targetHits:10}nearestNeighbor(embedding, e))"
+    query_request = {
+        'yql': yql,
+        'user-query': query,
+        'ranking.profile': ranking,
+        'hits' : hits,
+        'language': language
+    }
+    if mode == RModel.DENSE or mode == RModel.HYBRID:
+        query_request['input.query(e)'] = "embed(@user-query)"
+
+    response = requests.post("http://localhost:8080/search/", json=query_request)
+    if response.ok:
+        return parse_vespa_response(response.json(), qid)
+    else:
+        print("Search request failed with response " + str(response.json()))
+        return []
+
+def main():
+    import argparse
+    parser = argparse.ArgumentParser(description='Evaluate ranking models')
+    parser.add_argument('--ranking', type=str, required=True, help='Vespa ranking profile')
+    parser.add_argument('--mode', type=str, default="sparse", help='retrieval mode, valid values are sparse, dense, hybrid')
+    args = parser.parse_args()
+    mode = RModel.HYBRID
+    if args.mode == "sparse":
+        mode = RModel.SPARSE
+    elif args.mode == "dense":
+        mode = RModel.DENSE
+
+    dataset = ir_datasets.load("beir/nfcorpus/test")
+    results = []
+    metrics = [nDCG@10]
+    for query in dataset.queries_iter():
+        qid = query.query_id
+        query_text = query.text
+        results.extend(search(query_text, qid, args.ranking, mode=mode))
+
+    metrics = calc_aggregate(metrics, dataset.qrels, results)
+    print("Ranking metric NDCG@10 for rank profile {}: {:.4f}".format(args.ranking, metrics[nDCG@10]))
+
+if __name__ == "__main__":
+    main()
+```
+
+
+Then execute the script:
+```bash
+$ python3 evaluate_ranking.py --ranking bm25 --mode sparse
+```
+
+
+The script will produce the following output:
+
+
+```txt
+Ranking metric NDCG@10 for rank profile bm25: 0.3210
+```
+
+
+Now, we can evaluate the dense model using the same script:
+
+
+```bash
+$ python3 evaluate_ranking.py --ranking semantic --mode dense
+```
+
+```txt
+Ranking metric NDCG@10 for rank profile semantic: 0.3077
+```
+
+
+Note that the _average_ `nDCG@10` score is computed across all 323 test queries. You can also experiment beyond a single metric and modify the script to calculate more [measures](https://ir-measur.es/en/latest/measures.html), for example, including precision with a relevance label cutoff of 2:
+
+
+```txt
+metrics = [nDCG@10, P(rel=2)@10]
+```
+
+
+Also note that the exact nDCG@10 values may vary slightly between runs.
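+
+For intuition about what the metric rewards, the following is a minimal sketch of one common formulation of `nDCG@k`, using linear gain and a log2 rank discount. It is an illustration only and is not the implementation used by `ir_measures`, which may differ in details such as tie handling; the function names and the toy grades are hypothetical.
+
+```python
+import math
+from typing import Sequence
+
+
+def dcg(grades: Sequence[int], k: int) -> float:
+    """Discounted cumulative gain over the first k grades (linear gain)."""
+    return sum(g / math.log2(rank + 2) for rank, g in enumerate(grades[:k]))
+
+
+def ndcg_at_k(ranked_grades: Sequence[int], judged_grades: Sequence[int], k: int = 10) -> float:
+    """DCG of the returned ranking divided by the DCG of the ideal ranking."""
+    ideal = dcg(sorted(judged_grades, reverse=True), k)
+    return dcg(ranked_grades, k) / ideal if ideal > 0 else 0.0
+
+
+if __name__ == "__main__":
+    # Hypothetical query with three judged documents (grades 2, 2 and 1).
+    # The system returns a grade-2 document, an unjudged one (grade 0),
+    # and then the grade-1 document.
+    print(round(ndcg_at_k([2, 0, 1], [2, 2, 1]), 4))
+```
+
+The score is 1.0 only when the returned order matches the ideal order of the judged documents, and an unjudged document placed high in the list lowers the score because it contributes no gain.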
+
+## Hybrid Search & Ranking
+
+We demonstrated and evaluated two independent retrieval and ranking strategies in the previous sections. Now, we want to explore hybrid search techniques where we combine:
+
+- traditional lexical keyword matching with a text scoring method (BM25)
+- embedding-based search using a text embedding model
+
+With Vespa, there is a distinction between retrieval (matching) and configurable [ranking](/en/basics/ranking).
+
+In the Vespa ranking phases, we can express arbitrary scoring complexity with the full power of the Vespa [ranking](/en/basics/ranking) framework. Meanwhile, top-k retrieval relies on simple built-in functions associated with Vespa's top-k query operators. These top-k operators aim to avoid scoring all documents in the collection for a query by using a simplistic scoring function to identify the top-k documents.
+
+These top-k query operators use `index` structures to accelerate the query evaluation, avoiding scoring all documents using heuristics. In the context of hybrid text search, the following Vespa top-k query operators are relevant:
+
+- YQL `{targetHits:k}nearestNeighbor()` for dense representations (text embeddings) using a configured [distance-metric](/en/reference/schemas/schemas#distance-metric) as the scoring function.
+- YQL `myField contains ({targetHits:k}text(@user-query))` which by default uses [weakAnd](/en/ranking/wand) for sparse representations.
+
+
+We can combine these operators using boolean query operators like AND/OR/RANK to express a hybrid search query. Then, there is a wide variety of ways to combine the various signals in [ranking](/en/basics/ranking).
+
+
+### Define our first simple hybrid rank profile
+
+First, we add a simple hybrid rank profile that multiplies the dense and sparse components into a single score.
+
+
+```txt
+closeness(field, embedding) * (1 + bm25(title) + bm25(text))
+```
+
+
+- the [closeness(field, embedding)](/en/reference/ranking/rank-features#attribute-match-features-normalized) rank-feature returns a normalized score in the range 0 to 1 inclusive
+- each per-field BM25 score is in the range 0 to infinity
+
+We add a bias constant (1) to avoid the overall score becoming 0 if the document does not match any query terms, as the BM25 scores would be 0. We also add `match-features` to be able to debug each of the scores.
+
+
+
+```js expandable
+schema doc {
+ document doc {
+ field language type string {
+ indexing: "en" | set_language
+ }
+ field doc_id type string {
+ indexing: attribute | summary
+ match: word
+ }
+ field title type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field text type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ }
+ fieldset default {
+ fields: title, text
+ }
+
+ field embedding type tensor(v[384]) {
+ indexing: input title." ".input text | embed | attribute
+ attribute {
+ distance-metric: angular
+ }
+ }
+
+ rank-profile hybrid {
+ inputs {
+ query(e) tensor(v[384])
+ }
+ first-phase {
+ expression: closeness(field, embedding) * (1 + (bm25(title) + bm25(text)))
+ }
+ match-features: bm25(title) bm25(text) closeness(field, embedding)
+ }
+}
+```
+
+
+Now, re-deploy the Vespa application from the `app` directory:
+
+
+```bash
+$ vespa deploy --wait 300 app
+```
+
+
+After that, we can start experimenting with how to express hybrid queries using the Vespa query language.
+
+### Hybrid query examples
+
+The following demonstrates combining the two top-k query operators using the Vespa query language. In a later section, we will show how to combine the two retrieval strategies using the Vespa ranking framework. This section focuses on the top-k retrieval part that exposes matched documents to the Vespa [ranking](/en/basics/ranking) phase(s).
+
+#### Hybrid query using the OR operator
+The following query exposes documents to ranking that match the query using *either (OR)* the sparse or dense representation.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where default contains ({targetHits:10}text(@user-query)) or ({targetHits:10}nearestNeighbor(embedding,e))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'input.query(e)=embed(@user-query)' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=hybrid'
+```
+The documents retrieved into ranking are scored by the `hybrid` rank-profile. Note that both top-k query operators might expose more than the `targetHits` setting.
+
+The above query returns the following [JSON result response](/en/reference/querying/default-result-format):
+
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 87
+ },
+ "coverage": {
+ "coverage": 100,
+ "documents": 3633,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ },
+ "children": [
+ {
+ "id": "id:doc:doc::MED-10",
+ "relevance": 15.898915593367988,
+ "source": "content",
+ "fields": {
+ "matchfeatures": {
+ "bm25(text)": 17.35556767018612,
+ "bm25(title)": 8.166249756144769,
+ "closeness(field,embedding)": 0.5994655395517325
+ },
+ "sddocname": "doc",
+ "documentid": "id:doc:doc::MED-10",
+ "doc_id": "MED-10",
+ "title": "Statin Use and Breast Cancer Survival: A Nationwide Cohort Study from Finland",
+ "text": "Recent studies have suggested that statins, an established drug group in the prevention of cardiovascular mortality, could delay or prevent breast cancer recurrence but the effect on disease-specific mortality remains unclear. We evaluated risk of breast cancer death among statin users in a population-based cohort of breast cancer patients. The study cohort included all newly diagnosed breast cancer patients in Finland during 1995–2003 (31,236 cases), identified from the Finnish Cancer Registry. Information on statin use before and after the diagnosis was obtained from a national prescription database. We used the Cox proportional hazards regression method to estimate mortality among statin users with statin use as time-dependent variable. A total of 4,151 participants had used statins. During the median follow-up of 3.25 years after the diagnosis (range 0.08–9.0 years) 6,011 participants died, of which 3,619 (60.2%) was due to breast cancer. After adjustment for age, tumor characteristics, and treatment selection, both post-diagnostic and pre-diagnostic statin use were associated with lowered risk of breast cancer death (HR 0.46, 95% CI 0.38–0.55 and HR 0.54, 95% CI 0.44–0.67, respectively). The risk decrease by post-diagnostic statin use was likely affected by healthy adherer bias; that is, the greater likelihood of dying cancer patients to discontinue statin use as the association was not clearly dose-dependent and observed already at low-dose/short-term use. The dose- and time-dependence of the survival benefit among pre-diagnostic statin users suggests a possible causal effect that should be evaluated further in a clinical trial testing statins’ effect on survival in breast cancer patients."
+ }
+ }
+ ]
+ }
+}
+```
+
+
+Here we combine the two top-k query operators using a boolean OR (disjunction). The `totalCount` is the number of documents retrieved into ranking (87, which is higher than 10 + 10). The `relevance` is the score assigned by the `hybrid` rank-profile. Notice that the `matchfeatures` field shows all the feature scores, which is useful for debugging and understanding the ranking behavior, and also for feature logging.
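+
+The match features make it possible to recompute the score by hand. The sketch below mirrors the first-phase expression of the `hybrid` rank-profile in plain Python, using the feature values copied from the response above; the function name `hybrid_score` is hypothetical and the code is only a check of the arithmetic, not how Vespa evaluates the expression.
+
+```python
+def hybrid_score(closeness: float, bm25_title: float, bm25_text: float) -> float:
+    """Mirror of the first-phase expression in the hybrid rank profile."""
+    return closeness * (1 + (bm25_title + bm25_text))
+
+
+# Values copied from the matchfeatures of the MED-10 hit above.
+MATCH_FEATURES = {
+    "bm25(text)": 17.35556767018612,
+    "bm25(title)": 8.166249756144769,
+    "closeness(field,embedding)": 0.5994655395517325,
+}
+
+if __name__ == "__main__":
+    score = hybrid_score(
+        MATCH_FEATURES["closeness(field,embedding)"],
+        MATCH_FEATURES["bm25(title)"],
+        MATCH_FEATURES["bm25(text)"],
+    )
+    print(score)
+    # With no lexical match both BM25 scores are 0, and the bias constant
+    # keeps the closeness as the score instead of collapsing it to 0.
+    print(hybrid_score(0.6, 0.0, 0.0))
+```
+
+The recomputed value agrees with the `relevance` of the hit (about 15.899), and the second call shows why the bias constant is needed for documents retrieved only by the dense operator.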
+
+#### Hybrid query with AND operator
+The following combines the two top-k operators using AND, meaning that the retrieved documents must match both the sparse and dense top-k operators:
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where default contains ({targetHits:10}text(@user-query)) and ({targetHits:10}nearestNeighbor(embedding,e))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'input.query(e)=embed(@user-query)' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=hybrid'
+```
+ For the sparse keyword query matching, the `weakAnd` operator is used by default and it requires that at least one term in the query matches the document (fieldset searched).
+
+#### Hybrid query with rank query operator
+The following combines the two top-k operators using the [rank](/en/reference/querying/yql#rank) query operator, which allows us to retrieve using only the first operand of the rank operator, but where the remaining operands allow computing (match) features that can be used in ranking phases.
+
+This query is meaningful because we can use the computed features in the ranking expressions but retrieve only by the dense representation. This is usually the most resource-effective way to combine the two representations.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where rank(({targetHits:10}nearestNeighbor(embedding,e)), default contains ({targetHits:10}text(@user-query)))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'input.query(e)=embed(@user-query)' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=hybrid'
+```
+We can also invert the order of the operands to the `rank` query operator, so that it retrieves by the sparse representation but uses the dense representation to compute features for ranking. This is very useful when we do not want to build HNSW indexes (which add memory and slow down indexing) but still want to use semantic signals in ranking phases.
+
+
+```bash
+vespa query \
+ 'yql=select * from doc where rank(default contains ({targetHits:10}text(@user-query)), ({targetHits:10}nearestNeighbor(embedding,e)))' \
+ 'user-query=Do Cholesterol Statin Drugs Cause Breast Cancer?' \
+ 'input.query(e)=embed(@user-query)' \
+ 'hits=1' \
+ 'language=en' \
+ 'ranking=hybrid'
+```
+
+
+This way of performing hybrid retrieval allows retrieving only by the sparse representation and uses the dense vector representation to compute features for ranking.
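+
+The query variants in this section differ only in how the two top-k operators are combined. As a minimal sketch, the YQL strings can be generated from the two operators. The helper `hybrid_yql` and its `mode` names are hypothetical, introduced here for illustration only, and the `or` variant is assumed to mirror the `and` query shown above:
+
+```python
+# The two top-k operators, copied from the queries in this section.
+SPARSE = "default contains ({targetHits:10}text(@user-query))"
+DENSE = "({targetHits:10}nearestNeighbor(embedding,e))"
+
+
+def hybrid_yql(mode: str) -> str:
+    """Return the YQL for one way of combining the sparse and dense operators."""
+    where = {
+        "or": f"{SPARSE} or {DENSE}",               # retrieve by either operator
+        "and": f"{SPARSE} and {DENSE}",             # must match both operators
+        "rank-dense": f"rank({DENSE}, {SPARSE})",   # retrieve dense, features from sparse
+        "rank-sparse": f"rank({SPARSE}, {DENSE})",  # retrieve sparse, features from dense
+    }[mode]
+    return f"select * from doc where {where}"
+
+
+for mode in ("or", "and", "rank-dense", "rank-sparse"):
+    print(mode, "->", hybrid_yql(mode))
+```
+
+Only the first operand of `rank` decides which documents are retrieved, so the last two variants differ in recall even though they compute the same features.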
+
+## Hybrid ranking
+
+In the previous section, we demonstrated combining the two top-k query operators using boolean query operators.
+
+This section shows how to combine the two retrieval strategies using the Vespa ranking framework. We start by evaluating the effectiveness of the `hybrid` rank profile that combines the two retrieval strategies.
+
+
+
+```bash
+$ python3 evaluate_ranking.py --ranking hybrid --mode hybrid
+```
+
+This outputs:
+
+
+```txt
+Ranking metric NDCG@10 for rank profile hybrid: 0.3330
+```
+
+
+The `nDCG@10` score is slightly higher than the profiles that only use one of the ranking strategies.
+
+Now, we can experiment with more complex ranking expressions that combine the two retrieval strategies. We add a few more rank profiles to the schema that combine the two retrieval strategies in different ways.
+
+
+```js expandable
+schema doc {
+ document doc {
+ field language type string {
+ indexing: "en" | set_language
+ }
+ field doc_id type string {
+ indexing: attribute | summary
+ match: word
+ }
+ field title type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field text type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ }
+ fieldset default {
+ fields: title, text
+ }
+
+ field embedding type tensor(v[384]) {
+ indexing: input title." ".input text | embed | attribute
+ attribute {
+ distance-metric: angular
+ }
+ }
+
+ rank-profile hybrid {
+ inputs {
+ query(e) tensor(v[384])
+ }
+ first-phase {
+ expression: closeness(field, embedding) * (1 + (bm25(title) + bm25(text)))
+ }
+ match-features: bm25(title) bm25(text) closeness(field, embedding)
+ }
+
+ rank-profile hybrid-sum inherits hybrid {
+ first-phase {
+ expression: closeness(field, embedding) + ((bm25(title) + bm25(text)))
+ }
+ }
+
+ rank-profile hybrid-normalize-bm25-with-atan inherits hybrid {
+
+ function scale(val) {
+ expression: 2*atan(val/8)/(3.14159)
+ }
+ function normalized_bm25() {
+ expression: scale(bm25(title) + bm25(text))
+ }
+ function cosine() {
+ expression: cos(distance(field, embedding))
+ }
+ first-phase {
+ expression: normalized_bm25 + cosine
+ }
+ match-features {
+ normalized_bm25
+ cosine
+ bm25(title)
+ bm25(text)
+ }
+ }
+
+ rank-profile hybrid-rrf inherits hybrid-normalize-bm25-with-atan{
+
+ function bm25_score() {
+ expression: bm25(title) + bm25(text)
+ }
+ global-phase {
+ rerank-count: 100
+ expression: reciprocal_rank(bm25_score) + reciprocal_rank(cosine)
+ }
+ match-features: bm25(title) bm25(text) bm25_score cosine
+ }
+
+ rank-profile hybrid-linear-normalize inherits hybrid-normalize-bm25-with-atan{
+
+ function bm25_score() {
+ expression: bm25(title) + bm25(text)
+ }
+ global-phase {
+ rerank-count: 100
+ expression: normalize_linear(bm25_score) + normalize_linear(cosine)
+ }
+ match-features: bm25(title) bm25(text) bm25_score cosine
+ }
+}
+```
+
+Paste the above into the file `app/schemas/doc.sd`.
+
+Now, re-deploy the Vespa application from the `app` directory:
+
+
+```bash
+vespa deploy --wait 300 app
+```
+
+Let us break down the new rank profiles:
+
+- `hybrid-sum` combines the two retrieval strategies using addition. This is a simple way to combine the two strategies, but since the BM25 scores are not normalized (unbounded) and the closeness score is normalized (0-1), the BM25 scores will dominate the closeness score.
+- `hybrid-normalize-bm25-with-atan` combines the two strategies using a normalized BM25 score and the cosine similarity. The BM25 scores are normalized using the `atan` function.
+- `hybrid-rrf` combines the two strategies using the `reciprocal_rank` feature in the global phase. Each score is replaced by a value derived from the document's rank position for that score, so only the ordering matters and not the score magnitudes.
+- `hybrid-linear-normalize` combines the two strategies using the `normalize_linear` feature in the global phase, which rescales each score linearly across the reranked documents before the two are summed.
+
+The last two profiles use `global-phase` to rerank the top 100 documents with the reciprocal rank and linear normalization functions. This can only be done in the global phase because these functions need access to all the documents that are retrieved into ranking. In a multi-node setup, that requires communication between the nodes and knowledge of the score distribution across all of them. In addition, each ranking phase can only order the documents by a single score.
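+
+The normalization functions above can be illustrated outside Vespa. The following is a minimal sketch in plain Python, not Vespa's implementation. `scale` mirrors the schema function (using `math.pi` instead of the `3.14159` constant). The `reciprocal_rank` formula `1 / (k + rank)` with `k = 60`, and the min-max form of `normalize_linear`, are assumptions about how the Vespa features of the same names behave. The sample scores are made up.
+
+```python
+import math
+
+
+def scale(val: float) -> float:
+    """Map an unbounded, non-negative score into [0, 1) with atan."""
+    return 2 * math.atan(val / 8) / math.pi
+
+
+def reciprocal_rank(scores, k=60.0):
+    """Replace each score with 1 / (k + rank); rank 1 is the highest score.
+
+    k = 60 is an assumed default, not taken from the tutorial.
+    """
+    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
+    result = [0.0] * len(scores)
+    for rank, index in enumerate(order, start=1):
+        result[index] = 1.0 / (k + rank)
+    return result
+
+
+def normalize_linear(scores):
+    """Min-max normalize scores into [0, 1] over the reranked set."""
+    low, high = min(scores), max(scores)
+    if high == low:
+        return [0.0] * len(scores)
+    return [(s - low) / (high - low) for s in scores]
+
+
+# Made-up scores for three documents.
+bm25_scores = [12.0, 3.0, 7.5]
+cosine_scores = [0.42, 0.81, 0.55]
+
+rrf = [a + b for a, b in zip(reciprocal_rank(bm25_scores), reciprocal_rank(cosine_scores))]
+linear = [a + b for a, b in zip(normalize_linear(bm25_scores), normalize_linear(cosine_scores))]
+print("rrf:", rrf)
+print("linear:", linear)
+```
+
+Both global-phase functions depend on the whole set of reranked scores, which is why they cannot be evaluated per document on a single content node.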
+
+### Evaluate the new rank profiles
+
+Adding new rank-profiles is a hot change. Once we have deployed the application, we can evaluate the new hybrid profiles using the script:
+
+
+```bash
+$ python3 evaluate_ranking.py --ranking hybrid-sum --mode hybrid
+```
+
+```txt
+Ranking metric NDCG@10 for rank profile hybrid-sum: 0.3244
+```
+
+```bash
+$ python3 evaluate_ranking.py --ranking hybrid-normalize-bm25-with-atan --mode hybrid
+```
+
+```txt
+Ranking metric NDCG@10 for rank profile hybrid-normalize-bm25-with-atan: 0.3410
+```
+
+
+
+```bash
+$ python3 evaluate_ranking.py --ranking hybrid-rrf --mode hybrid
+```
+
+
+
+```txt
+Ranking metric NDCG@10 for rank profile hybrid-rrf: 0.3233
+```
+
+
+
+```bash
+$ python3 evaluate_ranking.py --ranking hybrid-linear-normalize --mode hybrid
+```
+
+
+
+```txt
+Ranking metric NDCG@10 for rank profile hybrid-linear-normalize: 0.3423
+```
+
+
+On this particular dataset, the `hybrid-linear-normalize` rank profile performs the best (0.3423), narrowly ahead of `hybrid-normalize-bm25-with-atan` (0.3410), but the differences are small. This also demonstrates that hybrid search and ranking is a complex problem and that the effectiveness of the hybrid model depends on the dataset and the retrieval strategies.
+
+Which profile ranks best might not transfer to your specific retrieval use case and dataset, so it is important to evaluate the effectiveness of a hybrid model on your own data.
+
+See [Improving retrieval with LLM-as-a-judge](https://blog.vespa.ai/improving-retrieval-with-llm-as-a-judge/) for more information on how to collect relevance judgments for your dataset.
+
+### Summary
+
+We showed how to express hybrid queries using the Vespa query language and how to combine the two retrieval strategies using the Vespa ranking framework. We also showed how to evaluate the effectiveness of the hybrid ranking model using one of the datasets that are a part of the BEIR benchmark. We hope this tutorial has given you a good understanding of how to combine different retrieval strategies using Vespa, and that there is not a single silver bullet for all retrieval problems.
+
+## Cleanup
+
+
+```bash
+$ docker rm -f vespa-hybrid
+```
+
+
+1. Robertson, Stephen and Zaragoza, Hugo and others, 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval.
diff --git a/mintlify-docs/en/learn/tutorials/news-1-deploy-an-application.mdx b/mintlify-docs/en/learn/tutorials/news-1-deploy-an-application.mdx
new file mode 100644
index 0000000000..d114168a53
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-1-deploy-an-application.mdx
@@ -0,0 +1,222 @@
+---
+title: "News search and recommendation tutorial - getting started on Docker"
+---
+
+Our goal with this series is to set up a Vespa application for personalized news recommendations. We will do this in stages, starting with a simple news search system and gradually adding functionality as we go through the tutorial parts.
+
+The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application) - this part
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+
+There are different entry points to this tutorial. This one describes how to get started using Docker on your local machine. You can also deploy the application we are creating on [Vespa Cloud](https://cloud.vespa.ai).
+
+In this part, we will start with a minimal Vespa application to get used to some basic operations for running the application on Docker. In the next part of the tutorial, we'll start developing our application.
+
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- Python3 for converting the dataset to Vespa JSON.
+- `curl` to download the dataset and run the Vespa health-checks.
+- [Java 17](https://openjdk.org/projects/jdk/17/) in part 6.
+- [Apache Maven](https://maven.apache.org/install.html) in part 6.
+
+
+
+
+**Note:**
+
+4 GB Docker memory is sufficient for the demo dataset in part 2. The full MIND dataset requires more; use 10 GB.
+
+
+
+In upcoming parts of this series, we will have some additional Python dependencies - we use [PyTorch](https://pytorch.org/) to train vector representations for news and users and train machine learning models for use in ranking.
+
+
+## Installing vespa-cli
+
+This tutorial uses the [Vespa CLI](/en/clients/vespa-cli), the official command-line client for Vespa.ai. It is a single binary without any runtime dependencies and is available for Linux, macOS, and Windows.
+
+```bash
+$ brew install vespa-cli
+```
+
+For the rest of this tutorial, you will be using localhost, so you need to configure your Vespa CLI to connect to localhost. Run the following to use endpoints on localhost:
+
+```bash
+$ vespa config set target local
+```
+
+
+## A minimal Vespa application
+
+This tutorial has a [companion sample application](https://github.com/vespa-engine/sample-apps/tree/master/news). Throughout the tutorial, we will be using support code from this application. Also, the final state of each tutorial can be found in the various `app-...` subdirectories.
+
+Let's start by cloning the sample application:
+
+```bash
+$ vespa clone -f news news && cd news
+```
+
+The above downloads the `news` directory from the Vespa [sample apps repository](https://github.com/vespa-engine/sample-apps/) and places the contents in a folder called `news`. Use `--help` to see documentation for the vespa-cli utility:
+
+```bash
+$ vespa clone --help
+```
+
+In the `news` directory, several pre-configured application packages are available. The `app-1-getting-started` directory contains a minimal Vespa application. There are two files there:
+
+- `services.xml` - defines the services that the application consists of
+- `schemas/news.sd` - defines the schema for searchable content.
+
+We will revisit these files in the next part of the tutorial.
+
+
+## Starting Vespa
+
+This application doesn't contain much at the moment, but let's start it up anyway by starting a Docker container to run it:
+
+```bash
+$ docker pull vespaengine/vespa
+$ docker run --detach --name vespa --hostname vespa-tutorial \
+ --publish 8080:8080 --publish 19071:19071 --publish 19092:19092 \
+ vespaengine/vespa
+```
+
+First, we pull the latest [vespa-image](https://hub.docker.com/r/vespaengine/vespa/) from Docker Hub, then we start it with the name `vespa`. This starts the Docker container and the initial Vespa services needed to deploy an application.
+
+Starting the container can take a short while. Before continuing, make sure that the configuration service is running by using `vespa status`.
+
+```bash
+$ vespa status deploy --wait 300
+```
+
+With the config server up and running, deploy the application using vespa-cli:
+
+```bash
+$ vespa deploy --wait 300 app-1-getting-started
+```
+
+The command uploads the application and verifies the content. If anything is wrong with the application, this step will fail with a failure description; otherwise, it switches the application to live status.
+
+Whenever you have a new version of your application, run the same command to deploy the application. In most cases, there is no need to restart services. Vespa takes care of reconfiguring the system. If a restart of services is required in some rare case, however, the output will notify which services need restart to make the change effective.
+
+In the upcoming parts of the tutorials, we'll frequently deploy the application changes in this manner.
+
+
+## Feeding to Vespa
+
+We must index data before we can search for it. This is called "feeding", and we'll get back to that in more detail in the next part of the tutorial. For now, to test that everything is up and running, we'll feed in a single test document:
+
+```bash
+$ vespa feed -t http://localhost:8080 doc.json
+```
+
+The `-v` option will make vespa-cli print the http request:
+
+```bash
+$ vespa document -v doc.json
+```
+
+We can also feed using [Vespa document api](/en/writing/document-v1-api-guide) directly.
+
+Once the feed operation is acknowledged by Vespa, the operation is visible in search.
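+
+When using the document API directly, a document ID such as `id:news:news::1` maps to a `/document/v1` path. A minimal sketch of that mapping follows. The helper `document_v1_path` is a hypothetical name introduced for illustration, and it assumes IDs without key/value modifiers (the empty part between the two colons):
+
+```python
+from urllib.parse import quote
+
+
+def document_v1_path(doc_id: str) -> str:
+    """Map id:<namespace>:<document-type>::<user-id> to a /document/v1 path."""
+    scheme, namespace, doc_type, modifiers, user_id = doc_id.split(":", 4)
+    if scheme != "id":
+        raise ValueError(f"not a document id: {doc_id!r}")
+    if modifiers:
+        # Key/value modifiers are out of scope for this sketch.
+        raise ValueError("document ids with key/value modifiers are not handled")
+    parts = [quote(p, safe="") for p in (namespace, doc_type, user_id)]
+    return f"/document/v1/{parts[0]}/{parts[1]}/docid/{parts[2]}"
+
+
+print(document_v1_path("id:news:news::1"))
+```
+
+The resulting path is relative to the container endpoint, `http://localhost:8080` in this tutorial.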
+
+
+## Querying Vespa
+
+We can query the endpoint using vespa-cli's support for performing queries. It uses the [Vespa query api](/en/querying/query-api) to query Vespa. By including `-v` in the command, we can see the exact endpoint and URL request parameters used.
+
+```bash
+$ vespa query -v 'yql=select * from news where true'
+```
+
+This example uses [YQL (Vespa Query Language)](/en/querying/query-language) to search for all documents of type `news`. This query request will return `1` result, which is the document we fed above.
+
+```bash
+$ vespa query \
+ 'yql=select * from news where userQuery()' \
+ 'query=hello world' \
+ 'default-index=title'
+```
+
+The query above searches for hello or world in the title.
+
+```bash
+$ vespa query \
+ 'yql=select * from news where title contains phrase("hello","world")'
+```
+
+The query above searches for the phrase "hello world" in the title. In the [next part of the tutorial](/en/learn/tutorials/news-2-basic-feeding-and-query) we'll demonstrate more query examples, and also ranking and grouping of results.
+
+
+## Remove documents
+
+Run the following to remove the document from the index:
+
+```bash
+$ vespa document -v remove id:news:news::1
+```
+
+Well done!
+
+
+## Stopping and starting Vespa
+
+Keep Vespa running to continue with the next steps in this tutorial set (skip the below).
+
+To stop Vespa, we can run the following commands:
+
+```bash
+$ docker exec vespa vespa-stop-services
+$ docker exec vespa vespa-stop-configserver
+```
+
+Likewise, to start the Vespa services:
+
+```bash
+$ docker exec vespa vespa-start-configserver
+$ docker exec vespa vespa-start-services
+```
+
+If a [restart is required](/en/reference/schemas/schemas#changes-that-require-restart-but-not-re-feed) due to changes in the application package, these two steps are what you need to do.
+
+To wipe the index and restart:
+
+```bash
+$ docker exec vespa sh -c ' \
+ vespa-stop-services && \
+ vespa-remove-index -force && \
+ vespa-start-services'
+```
+
+You can stop and kill the Vespa container application like this:
+
+```bash
+$ docker stop vespa; docker rm -f vespa
+```
+
+This will delete the Vespa application, including all data and configuration. See [container tuning for production](/en/operations/self-managed/docker-containers).
+
+
+## Conclusion
+
+Our simple application should now be up and running. In the [next part of the tutorial](/en/learn/tutorials/news-2-basic-feeding-and-query), we'll start building from this foundation.
diff --git a/mintlify-docs/en/learn/tutorials/news-2-basic-feeding-and-query.mdx b/mintlify-docs/en/learn/tutorials/news-2-basic-feeding-and-query.mdx
new file mode 100644
index 0000000000..32a2b70b19
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-2-basic-feeding-and-query.mdx
@@ -0,0 +1,322 @@
+---
+title: "News search and recommendation tutorial - applications, feeding and querying"
+---
+
+This is the second part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In this part, we will build upon the minimal Vespa application in the previous part. First, we'll take a look at the [Microsoft News Dataset](https://msnews.github.io/) (MIND), which we'll be using throughout the tutorial. We'll use this to set up the search schema, deploy the application and feed some data. We'll round off with some basic querying before moving on to the next part of the tutorial: searching for content.
+
+For reference, the final state of this tutorial can be found in the [app-2-feed-and-query](https://github.com/vespa-engine/sample-apps/tree/master/news/app-2-feed-and-query) subdirectory of the `news` sample application.
+
+
+## The Microsoft News Dataset
+
+During these tutorials, we will use the [Microsoft News Dataset](https://msnews.github.io/) (MIND). This is a large-scale dataset for news recommendation research. It contains over 160,000 articles, 15 million impression logs, and 1 million users. We will not use the full dataset in this tutorial. To make the tutorial easier to follow along, we will use the much smaller DEMO part containing only 5000 users. However, readers are free to use the entire dataset at their own discretion.
+
+The [MIND dataset description](https://github.com/msnews/msnews.github.io/blob/master/assets/doc/introduction.md) contains an introduction to the contents of this dataset. For this tutorial, there are particularly two pieces of data that we will use:
+
+- News article content which contains data such as title, abstract, news category, and entities extracted from the title and abstract.
+- Impressions which contain a list of news articles that were shown to a user, labeled with whether the user clicked on them or not.
+
+We'll start with developing a search application, so we'll focus on the news content at first. We'll use the impression data as we begin building the recommendation system later in this series.
+
+Let's start by downloading the data. The `news` sample app directory will be our starting point. We've included a script to download the data for us:
+
+```bash
+$ ./bin/download-mind.sh small
+```
+
+The argument defines which dataset to download. Here, we download the `small` dataset; `large` is also a valid option. Both the training (`train`) and validation (`dev`) parts are downloaded to a directory called `mind`.
+
+Taking a look at the data, in `mind/train/news.tsv`, we see tab-separated lines like the following:
+
+```
+N16680 travel traveltripideas The Most Beautiful Natural Wonder in Every State While humans have built some impressive, gravity-defying, and awe-inspiring marvels here are the most photographed structures in the world the natural world may have us beat. https://www.msn.com/en-us/travel/traveltripideas/the-most-beautiful-natural-wonder-in-every-state/ss-AAF8Brj?ocid=chopendata [] []
+```
+
+Here we see the news article id, a category, a subcategory, the title, an abstract, and a URL to the article's content. The last two fields contain the identified entities in the title and abstract. This particular news item has no such entities.
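+
+The field layout described above can be sketched as a small parser. This is an illustration only, not the conversion script shipped with the sample application. The names `FIELDS` and `parse_news_line` are hypothetical, and the entity columns are assumed to hold JSON arrays, as the `[]` values in the example line suggest:
+
+```python
+import json
+
+# Column order as described in the text above.
+FIELDS = [
+    "news_id", "category", "subcategory", "title",
+    "abstract", "url", "title_entities", "abstract_entities",
+]
+
+
+def parse_news_line(line: str) -> dict:
+    """Split one tab-separated line of news.tsv into a dict keyed by field name."""
+    values = line.rstrip("\n").split("\t")
+    if len(values) != len(FIELDS):
+        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
+    record = dict(zip(FIELDS, values))
+    # Entity columns are assumed to be JSON arrays; empty cells become [].
+    for key in ("title_entities", "abstract_entities"):
+        record[key] = json.loads(record[key]) if record[key] else []
+    return record
+
+
+sample = "N16680\ttravel\ttraveltripideas\tA title\tAn abstract\thttps://www.msn.com/\t[]\t[]\n"
+print(parse_news_line(sample)["category"])
+```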
+
+Note that the body content of the news article is retrievable by the URL. The dataset repository contains tools to download this. For the purposes of this tutorial, we won't be using this data, but feel free to download it yourself.
+
+Let's start building a Vespa application to make this data searchable. We'll create the directory `my-app` under the `news` sample app directory to contain your Vespa application:
+
+```bash
+$ mkdir -p my-app/schemas
+```
+
+
+## Application Packages
+
+
+
+A Vespa [application package](/en/basics/applications) is the set of configuration files and Java plugins that together define the behavior of a Vespa system: what functionality to use, the available document types, how ranking will be done and how data will be processed during feeding and indexing. The schema, e.g., `news.sd`, is a required part of an application package — the other file needed is `services.xml`.
+
+For self-hosted multi-node deployments, a `hosts.xml` file is also needed; see the [multinode high availability](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) sample application.
+
+We mentioned these files in the previous part but didn't really explain them at the time. We'll go through them here, starting with the specification of services.
+
+
+### Services Specification
+
+The [services.xml](/en/reference/applications/services/services) file defines the services that make up the Vespa application — which services to run and how many nodes per service. Write the following to `news/my-app/services.xml`:
+
+```xml expandable
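+<?xml version="1.0" encoding="UTF-8"?>
+<services version="1.0">
+
+    <container id="default" version="1.0">
+        <search />
+        <document-api />
+        <nodes>
+            <node hostalias="node1" />
+        </nodes>
+    </container>
+
+    <content id="mind" version="1.0">
+        <min-redundancy>1</min-redundancy>
+        <documents>
+            <document type="news" mode="index" />
+        </documents>
+        <nodes>
+            <node hostalias="node1" distribution-key="0" />
+        </nodes>
+    </content>
+
+</services>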
+```
+
+Quite a lot is set up here:
+
+- `<container>` defines the stateless [container cluster](/en/applications/containers) for document, query and result processing.
+- `<search>` sets up the [query endpoint](/en/querying/query-api). The default port is 8080.
+- `<document-api>` sets up the [document endpoint](/en/reference/api/document-v1) for feeding and visiting.
+- `<nodes>` defines the nodes required per service. (See the [reference](/en/reference/applications/services/container) for more on container cluster setup.)
+- `<content>` defines the stateful content cluster.
+- `<min-redundancy>` denotes how many copies to store of each document.
+- `<documents>` assigns the document types in the *schema*. The content cluster capacity can be increased by adding node elements; see [elasticity](/en/content/elasticity). (See also the [reference](/en/reference/applications/services/content) for more on content cluster setup.)
+
+
+### Schema
+
+In terms of data, Vespa operates with the notion of [documents](/en/schemas/documents). A document represents a single, searchable item in your system, e.g., a news article, a photo, or a user. Each document type must be defined in the Vespa configuration through a [schema](/en/basics/schemas). Think of the document type in a schema as similar to a table definition in a relational database: it consists of a set of fields, each with a given name, a specific type, and some optional properties. The data fed into Vespa must match the structure of the schema, and the results returned when searching will be in this format as well. There is no dynamic field creation support in Vespa; one can say that Vespa document schemas are strongly typed.
+
+The `news` document type mentioned in the `services.xml` file above is defined in a schema. Schemas are found under the `schemas` directory in the application package, and **must** have the same name as the document type mentioned in `services.xml`.
+
+Given the MIND dataset described above, we'll set up the schema as follows. Write the following to `news/my-app/schemas/news.sd`:
+
+```sd expandable
+schema news {
+ document news {
+ field news_id type string {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field category type string {
+ indexing: summary | attribute
+ }
+ field subcategory type string {
+ indexing: summary | attribute
+ }
+ field title type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field abstract type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field body type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ }
+ field date type int {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field clicks type int {
+ indexing: summary | attribute
+ }
+ field impressions type int {
+ indexing: summary | attribute
+ }
+ }
+
+ fieldset default {
+ fields: title, abstract, body
+ }
+
+}
+```
+
+The `document` is wrapped inside another element called `schema`. The name following these elements, here `news`, must be exactly the same for both.
+
+This document contains several fields. Each field has a [type](/en/reference/schemas/schemas#field), such as `string`, `int`, or `tensor`. Fields also have properties. For instance, property `indexing` configures the *indexing pipeline* for a field, which defines how Vespa will treat input during indexing — see [indexing language](/en/reference/writing/indexing-language). Each part of the indexing pipeline is separated by the pipe character '|':
+
+- `index:` Create a search index for this field.
+- `attribute:` Store this field in memory as an [attribute](/en/content/attributes) — for [sorting](/en/reference/querying/sorting-language), [querying](/en/querying/query-api), [ranking](/en/basics/ranking) and [grouping](/en/querying/grouping).
+- `summary:` Lets this field be part of the [document summary](/en/querying/document-summaries) in the result set.
+
+Here, we also use the [index](/en/reference/schemas/schemas#index) property, which sets up parameters for how Vespa should index the field. For the `title`, `abstract`, and `body` fields, we configure Vespa to set up an index compatible with [bm25 ranking](/en/reference/ranking/rank-features#bm25) for text search.
+
+
+## Deploy the Application Package
+
+With the two necessary files above, we are ready to deploy the application package. Make sure it looks like this (use `ls` if `tree` is not installed):
+
+```
+my-app/
+├── schemas
+│ └── news.sd
+└── services.xml
+```
+
+```bash
+$ vespa deploy --wait 300 my-app
+```
+
+
+## Feeding data
+
+The data fed to Vespa must match the schema for the document type. The downloaded MIND data must be converted to a valid Vespa JSON [document format](/en/reference/schemas/document-json-format) before it can be fed to Vespa:
+
+```bash
+$ python3 src/python/convert_to_vespa_format.py mind
+```
+
+The argument is where to find the data downloaded above, which is the `mind` directory. This script creates a new file in that directory called `vespa.json`, containing all the news articles in the downloaded dataset. This file can now be fed to Vespa. Use the method described in the previous part:
+
+```bash
+$ vespa feed mind/vespa.json --target http://localhost:8080
+```
+
+`vespa feed` reads a JSON array of document operations, or JSONL with one Vespa document JSON formatted operation per line. Once the feed job finishes, all our 65 238 documents are searchable. Let us do a quick query to verify:
+
+```bash
+$ vespa query -v 'yql=select * from news where true' 'hits=0'
+```
+
+You can verify that specific documents are indexed by fetching documents by document ID using the [Document V1 API](/en/writing/document-v1-api-guide):
+
+```bash
+$ vespa document -v get id:news:news::N10864
+```
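+
+The conversion step above produces one document operation per article. A minimal sketch of such an operation in the JSONL form that `vespa feed` accepts is shown below. This is not the sample application's `convert_to_vespa_format.py`. The helpers `to_put_operation` and `to_jsonl` are hypothetical names, and the field values are placeholders:
+
+```python
+import json
+
+
+def to_put_operation(news_id: str, fields: dict) -> dict:
+    """Wrap schema fields in a Vespa put operation for the news document type."""
+    return {"put": f"id:news:news::{news_id}", "fields": fields}
+
+
+def to_jsonl(operations) -> str:
+    """Serialize operations as JSONL: one JSON document operation per line."""
+    return "".join(json.dumps(op) + "\n" for op in operations)
+
+
+# Placeholder values; the field names come from the news schema above.
+operation = to_put_operation("N16680", {
+    "news_id": "N16680",
+    "category": "travel",
+    "subcategory": "traveltripideas",
+    "title": "A title",
+    "abstract": "An abstract",
+    "url": "https://www.msn.com/",
+})
+print(to_jsonl([operation]), end="")
+```
+
+Every key under `fields` must match a field declared in `news.sd`, since the data fed to Vespa has to match the schema.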
+
+
+## The first query
+
+Searching with Vespa is done using HTTP(S) GET or HTTP(S) POST requests, like:
+
+```
+/search?yql=select..&hits=1...
+```
+
+or with a JSON POST:
+
+```json
+{
+ "yql" : "select ..",
+ "hits" : 2
+}
+```
+
+The only mandatory parameter is the query, using either `yql=` or `query=`. More details in the [Query API](/en/querying/query-api).
+
+Consider the query: `select * from news where default contains "music"`
+
+Given the above schema, where the fields `title`, `abstract` and `body` are part of the `fieldset default`, any document containing the word "music" in one or more of these fields matches that query. Let's try that with either a GET query:
+
+```bash
+$ vespa query -v 'yql=select * from news where default contains "music"'
+```
+
+or a POST JSON query (Notice the *Content-Type* header specification):
+
+```bash
+$ curl -s -H "Content-Type: application/json" \
+ --data '{"yql" : "select * from sources * where default contains \"music\""}' \
+ http://localhost:8080/search/ | python3 -m json.tool
+```
+
+
+Try the [Query Builder](https://github.com/vespa-engine/vespa/tree/master/client/js/app#query-builder) application!
+
+
+Looking at the output, please note:
+
+- The field `documentid` in the output and how it matches the value we assigned to each put operation when feeding data to Vespa.
+- Each hit has a property named relevance, which indicates how well the given document matches our query, using a pre-defined default ranking function. You have full control over ranking — more about ranking and ordering later. The hits are sorted by this value (descending).
+- When multiple hits have the same relevance score, their internal ordering is undefined. However, their internal ordering will not change unless the documents are re-indexed.
+- You can add `&trace.level=3` to dump query parsing details and execution plan, see [query tracing](/en/querying/query-api#query-tracing).
+- The `totalCount` field at the top level contains the number of documents that *matched* the query.
+- Also note the `coverage` element, this tells us how many documents and nodes we searched over. Coverage might be degraded, see [graceful degradation](/en/performance/graceful-degradation).
+
+Prefer HTTP POST over GET in production due to limitations on URI length (64 KB).
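+
+The JSON POST form can also be issued from Python's standard library. The sketch below only builds the request, so it runs without a Vespa instance. The helper name `build_search_request` is hypothetical, and the endpoint is the local one used in this tutorial:
+
+```python
+import json
+import urllib.request
+
+
+def build_search_request(yql: str, hits: int = 10,
+                         endpoint: str = "http://localhost:8080/search/"):
+    """Build a JSON POST request for the Vespa query API without sending it."""
+    body = json.dumps({"yql": yql, "hits": hits}).encode("utf-8")
+    return urllib.request.Request(
+        endpoint,
+        data=body,
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+
+
+request = build_search_request(
+    'select * from news where default contains "music"', hits=2)
+print(request.get_method(), request.full_url)
+# To send it against a running Vespa instance:
+#   with urllib.request.urlopen(request) as response:
+#       result = json.load(response)
+```
+
+Because the query travels in the request body, its length is not constrained by the URI limit mentioned above.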
+
+
+### Query examples
+
+```bash
+$ vespa query -v 'yql=select title from news where title contains "music"'
+```
+
+Again, this is a search for the single term "music", but this time explicitly in the `title` field. This means that we only want to match documents that contain the word "music" in the field `title`. As expected, you will see fewer hits for this query than for the previous one searching the `fieldset default`. Also note that we scope the select to only return the title.
+
+```bash
+$ vespa query -v 'yql=select title from news where default contains "music" and default contains "festival"'
+```
+
+This is a query for the two terms "music" and "festival", combined with an `AND` operation; it finds documents that match both terms, not just one of them.
+
+```bash
+$ vespa query -v \
+ 'yql=select title from news where userQuery()' \
+ 'query=music festival' \
+ 'type=all'
+```
+
+This combines YQL [userQuery()](/en/reference/querying/yql#userquery) with Vespa's [simple query language](/en/reference/querying/simple-query-language). In this case, documents need to match both "music" and "festival".
+
+```bash
+$ vespa query -v \
+ 'yql=select title from news where userQuery()' \
+ 'query=music festival -beer' \
+ 'type=any'
+```
+
+The query above changes the query type from `all` to `any`, so that all documents that match either (or both) of the terms are returned, excluding documents with the term "beer". Note that the number of hits that are matched and ranked increases the computational complexity of the query execution. See [using WAND with Vespa](/en/ranking/wand) for a way to speed up evaluation of type any/or-like queries.
+
+```bash
+$ vespa query -v \
+ 'yql=select title from news where userQuery()' \
+ 'query=music festival' \
+ 'type=phrase' \
+ 'default-index=title'
+```
+
+Above searches using `type=phrase` which requires the exact phrase "music festival" to match in the title.
+
+```bash
+$ vespa query -v \
+ 'yql=select title from news where rank(userQuery(), title contains "festival")' \
+ 'query=music'
+```
+
+This searches for "music" in the default fieldset and boosts documents with "festival" in the title. The [rank()](/en/reference/querying/yql#rank) query operator allows us to retrieve on the first operand, and have match ranking features calculated for the second operand. The second and further operands do not impact recall (which documents match the query), but can be used to tune precision (ordering of the results). More on ranking in the next part of the tutorial.
+
+
+## Conclusion
+
+We now have a Vespa application running with searchable data. In the [next part of the tutorial](/en/learn/tutorials/news-3-searching), we'll explore searching with sorting, grouping, and ranking results.
diff --git a/mintlify-docs/en/learn/tutorials/news-3-searching.mdx b/mintlify-docs/en/learn/tutorials/news-3-searching.mdx
new file mode 100644
index 0000000000..b87b204802
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-3-searching.mdx
@@ -0,0 +1,340 @@
+---
+# Copyright Vespa.ai. All rights reserved.
+title: "News search and recommendation tutorial - searching"
+---
+
+
+This is the third part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In the previous part, we converted the [Microsoft News Dataset](https://msnews.github.io/) (MIND) to Vespa, and fed it to our application. In this part, we'll issue searches in this content and look at sorting, grouping, and ranking the results.
+
+For reference, the final state of this tutorial can be found in the [app-3-searching](https://github.com/vespa-engine/sample-apps/tree/master/news/app-3-searching) sub-directory of the `news` sample application.
+
+Conceptually, Vespa has two stages when determining the exact result to return. The first is "matching", where all the documents that match the query are found. This is a binary decision; either the document matches or it doesn't. For instance, when searching for a word, all documents that contain it are selected as candidates in this stage.
+
+The next stage determines the ordering of the results. We can think of the results being ordered either by:
+
+- a fixed value, or attribute, in the document
+- a function calculating a score
+
+Ordering by an attribute is called [sorting](/en/reference/querying/sorting-language). For instance, we can sort by decreasing `date`. [Grouping](/en/reference/querying/grouping-language) also works on attributes. An example is to group the results by a `category` attribute.
+
+Calculating a score to order by is generally called "ranking". As these scores are usually dependent upon both query and document, they can also be called *relevance*. Such expressions can be arbitrarily complex, but in general, require some form of computation to find this score. Ranking can be divided into [multiple rank phases](/en/ranking/phased-ranking) as well.
+
+We'll start by looking at attribute-based sorting and grouping before moving on to ranking.
+
+
+## What is an attribute?
+
+We saw multiple examples of attributes in the `news.sd` schema, for instance:
+
+`field date type int { indexing: summary | attribute attribute: fast-search }`
+
+Note that this `date` field has been defined as an `int` here, and when feeding documents, we convert the date to the format `YYYYMMDD`.
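+
+To illustrate the `YYYYMMDD` encoding, here is a small sketch. The helper `to_vespa_date` is hypothetical and is not taken from the tutorial's conversion script; it only shows how a calendar date maps to the integer stored in the `date` field.
+
```python
from datetime import date


def to_vespa_date(day):
    """Encode a calendar date as a YYYYMMDD integer (hypothetical helper)."""
    return day.year * 10000 + day.month * 100 + day.day


encoded = to_vespa_date(date(2019, 11, 10))  # 20191110
```
+
+A useful property of this encoding is that integer order follows chronological order, so numeric comparisons on the field behave like date comparisons.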
+
+An [attribute](/en/content/attributes) is an in-memory field - this is different from _index_ fields, which may be moved to a disk-based index as more documents are added and the index grows. Since attributes are kept in memory, they are excellent for fields that require fast access for many documents, e.g. fields used for sorting, ranking or grouping query results. The downside is higher memory usage.
+
+In the above field definition we have included an additional property `attribute: fast-search` which will inform Vespa that we want to build inverted index structures (dictionary and posting lists) for *fast* *matching* in the field. See more about [when to use fast-search](/en/performance/feature-tuning#when-to-use-fast-search-for-attribute-fields) in the performance feature tuning section.
+
+
+### Example queries using attribute field
+
+```bash
+$ vespa query -v 'yql=select * from news where default contains "20191110"'
+```
+
+
+This is a single-term query for the term `20191110` in the `default` fieldset. In the schema, the field `date` is not included in the `default` fieldset, so no results are found. Instead, we search using `=` which can be used for numeric and bool fields:
+
+
+```bash
+$ vespa query -v 'yql=select * from news where date=20191110'
+```
+
+
+To get documents that were created on 10 November 2019, i.e. whose `date` field is `20191110`, the query above matches on the `date` field instead of the `default` fieldset.
+
+
+```bash
+$ vespa query -v 'yql=select * from news where date=20191110 and default contains "weather"'
+```
+
+
+This is a query with two terms; a search in the `default` field set for the term "weather" combined with a search in the `date` field for the value `20191110`.
+
+
+### Range searches
+
+The examples above searched over `date` just as any other field, and requested documents where the value was exactly `20191110`. Since the field is of type _int_, however, we can use it for _range searches_ as well, using the "less than" and "greater than" operators (`<` and `>`). The query:
+
+
+```bash
+$ vespa query -v 'yql=select * from news where date < 20191110'
+```
+
+
+finds all documents where the value of `date` is less than `20191110`, i.e. all documents from before 10 November 2019, while
+
+
+```bash
+$ vespa query -v 'yql=select * from news where date >= 20191108 and date <= 20191110'
+```
+
+
+finds all news articles from 8 November 2019 to 10 November 2019, inclusive.
+
+
+### Sorting on attribute fields
+
+The first feature we will look at is how an attribute can be used to change the hit order. By now, you have probably noticed that hits are returned in order of descending relevance, i.e. how well the document matches the query — if not, take a moment to verify this. You might ask how Vespa does this since we haven't even touched upon ranking yet. The answer is that Vespa uses its [nativeRank](/en/ranking/nativerank) score unless anything else is defined in the schema. We'll get back to defining custom ranking later on.
+
+Now send the following query to Vespa, and look at the order of the hits:
+
+
+```bash
+$ vespa query -v 'yql=select date from news where default contains phrase("music","festival") order by date'
+```
+
+
+By default, sorting is done in ascending order. This can also be specified by appending `asc` after the sort attribute name. Use `desc` to sort the results in descending order:
+
+
+```bash
+$ vespa query -v 'yql=select date from news where default contains phrase("music","festival") order by date desc'
+```
+
+
+Attempting to sort on a field that is not defined as an attribute in the schema will cause an error.
+
+
+### Query time result grouping
+
+[Grouping](/en/querying/grouping) is the concept of looking through all matching documents at query-time and then performing operations with specified fields across all the documents — some common use cases include:
+
+- Find all the unique values for a given field, make **one group per unique value**, and return the count of documents per group.
+- **Group documents by time and date** in fixed-width or custom-width buckets. An example of fixed-width buckets could be to group all documents by year, while an example of custom buckets could be to sort bug tickets by date of creation into the buckets _Today_, _Past Week_, _Past Month_, _Past Year_, and _Everything else_.
+- Calculate the **minimum/maximum/average value** for a given field.
+- [Result diversification](https://blog.vespa.ai/result-diversification-with-vespa/), e.g. to only display 3 best ranking results per category for up to 5 categories.
+
+Displaying such groups and their sizes (in terms of matching documents per group) on a search result page, with a link to each such group, is a common way to let users refine searches. For now, we will only do a simple grouping query to get a list of unique values for `category`, ordered by the number of documents they occur in, showing only the top 3:
+
+
+```bash
+$ vespa query -v 'yql=select * from news where true limit 0 | all(group(category) max(3) order(-count())each(output(count())))'
+```
+
+
+Note the expression after the pipe (`|`): this is the grouping expression that determines how grouping will be performed. You can read more about the grouping syntax in the [grouping reference documentation](/en/reference/querying/grouping-language). `limit 0` is an alternative syntax for the native `hits` parameter. In this case we are only interested in the group counts, so we set the limit to 0.
+
+For this query, you will get something like the following:
+```json expandable
+{
+ "root": {
+ "children": [
+ {
+ "children": [
+ {
+ "children": [
+ {
+ "fields": {
+ "count()": 9115
+ },
+ "id": "group:string:news",
+ "relevance": 1.0,
+ "value": "news"
+ },
+ {
+ "fields": {
+ "count()": 6765
+ },
+ "id": "group:string:sports",
+ "relevance": 0.6666666666666666,
+ "value": "sports"
+ },
+ {
+ "fields": {
+ "count()": 1886
+ },
+ "id": "group:string:finance",
+ "relevance": 0.3333333333333333,
+ "value": "finance"
+ }
+ ],
+ "continuation": {
+ "next": "BGAAABEBGBC"
+ },
+ "id": "grouplist:category",
+ "label": "category",
+ "relevance": 1.0
+ }
+ ],
+ "continuation": {
+ "this": ""
+ },
+ "id": "group:root:0",
+ "relevance": 1.0
+ }
+ ],
+ "coverage": {
+ "coverage": 100,
+ "documents": 28603,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ },
+ "fields": {
+ "totalCount": 28603
+ },
+ "id": "toplevel",
+ "relevance": 1.0
+ }
+}
+```
+
+So, the three most common unique values of `category` among the indexed documents (for the demo data set) are:
+
+- `news` with 9115 articles
+- `sports` with 6765 articles
+- `finance` with 1886 articles
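+
+As a sketch of how a client could pull these counts out of the response, the hypothetical helper below walks the nesting shown in the JSON above. It assumes a single-level grouping with no regular hits, so the grouping root is the first child of `root`.
+
```python
def group_counts(response):
    """Return (value, count) pairs from a single-level grouping response.

    Hypothetical helper; assumes the structure of the example response:
    root -> group:root:0 -> grouplist:category -> one entry per group.
    """
    root_group = response["root"]["children"][0]
    group_list = root_group["children"][0]
    return [(g["value"], g["fields"]["count()"]) for g in group_list["children"]]


# Trimmed copy of the response above, keeping only the fields that are read.
sample = {
    "root": {
        "children": [{
            "id": "group:root:0",
            "children": [{
                "id": "grouplist:category",
                "children": [
                    {"value": "news", "fields": {"count()": 9115}},
                    {"value": "sports", "fields": {"count()": 6765}},
                    {"value": "finance", "fields": {"count()": 1886}},
                ],
            }],
        }]
    }
}

counts = group_counts(sample)
```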
+
+Try to change the filter part of the YQL+ expression — the `where` clause — to a text match of "weather", or restrict `date` to be less than 20191110, and see how the list of unique values changes as the set of matching documents for your query changes. If you try to search for a single term that is *not* present in the document set, you will see that the list of groups is empty as no documents have been matched. Vespa grouping is only applied over the documents which matched the query.
+
+In the following example we use the [select](/en/reference/api/query#select) parameter to pass the grouping specification:
+
+
+```bash
+$ vespa query -v \
+  'yql=select * from news where userQuery() limit 0' \
+  'select=all(group(category) max(2) each(max(2)each(output(summary()))))' \
+  'query=drinks'
+```
+
+
+This request searches for drinks, groups by category, and for each unique category outputs the 2 top ranking hits (according to the rank profile used). Groups are sorted by default by maximum relevance in the group. Notice that we also set an upper limit on the number of unique groups with the outermost `max`. This is important in cases with many unique values. See also [Result diversification using Vespa result grouping](https://blog.vespa.ai/result-diversification-with-vespa/).
+
+Please refer to the [grouping guide](/en/querying/grouping) for more information and examples using Vespa grouping. As with sorting, attempting to group on a field that is not defined as an attribute in the schema will cause an error.
+
+
+### Matching - index versus attribute
+
+Before we move on to ranking, it's important to know some of the differences between `index` and `attribute`.
+
+#### Matching
+
+Consider the `title` field from our schema, and the document for the article with title "A little snow causes a big mess, more than 100 crashes on Minnesota roads". In the original input, the value for `title` is a string built up of 14 words, with a single white space character between them. How should we be able to search this field?
+
+For string fields with `index` which defaults to `match:text`, Vespa performs linguistic processing of the string. This includes [tokenization](/en/linguistics/linguistics-opennlp#tokenization), [normalization](/en/linguistics/linguistics-opennlp#normalization) and language dependent [stemming](/en/linguistics/linguistics-opennlp#stemming) of the string.
+
+In our example, this means that the string above is split into 14 tokens, enabling Vespa to match this document for:
+
+- single-term queries such as "Minnesota", "snow" and "roads",
+- the exact phrase query "A little snow causes a big mess, more than 100 crashes on Minnesota roads",
+- a query with two or more tokens in either order (e.g. "minnesota crashes").
+
+This is how we all have come to expect normal free text search to work.
+
+However, string fields with `indexing: attribute` do not support `match:text`, only *exact matching* or *prefix matching*. Exact matching is the default, and, as the name implies, it requires you to search for the exact contents of the field in order to get a match. See supported [match](/en/reference/schemas/schemas#match) modes and the differences in support between `attribute` and `index`.
+
+#### Memory usage
+
+Attributes are stored in memory, as opposed to fields with `index`, where the data is mostly kept on disk but paged in on-demand and cached by the OS buffer cache. Even with large flavor types, one will notice that it is not practical to define all the document type fields as attributes, as it will heavily restrict the number of documents per search node. Some Vespa applications have more than 1 billion documents per node; keeping megabytes of text per document in memory might not be cost-effective.
+
+#### When to use attributes
+
+There are both advantages and drawbacks of using attributes: they enable sorting, ranking and grouping, but require more memory and do not support `match:text` capabilities. Attribute fields also support at least an order of magnitude higher update throughput than regular `index` fields, see [partial updates with Vespa](/en/writing/partial-updates).
+
+When to use attributes depends on the application; in general, use attributes for:
+
+- fields used for sorting, e.g. a last-update timestamp,
+- fields used for grouping, e.g. category, and
+- fields accessed in ranking expressions
+
+Finally, all numeric and [tensor](/en/ranking/tensor-user-guide) fields used in ranking must be defined with `attribute`.
+
+#### Combining index and attribute
+
+`field category type string { indexing: summary | attribute | index }`
+
+Combining both index and attribute for the same field is supported. In this case, we can sort and group on the category, while search or matching will be using index matching with `match:text`, which will tokenize and stem the contents of the field.
+
+
+## Relevance and Ranking
+
+[Ranking](/en/basics/ranking) and relevance were briefly mentioned above; what is really the relevance of a hit? How can one change the relevance calculations? It is time to introduce _rank profiles_ and _ranking expressions_ — simple, yet powerful methods for tuning the relevance.
+
+Relevance is a measure of how well a given document matches a query. The default relevance is calculated by a formula that takes several *matching* factors into consideration. It computes, in essence, how well the document matches the terms in the query. The default Vespa ranking function and its limitations are described in [ranking with nativeRank](/en/ranking/nativerank).
+
+Ranking signals that might be useful, like freshness (the age of the document compared to the time of the query) or any other document or query features, are not a part of the nativeRank calculation. These need to be added to the ranking function depending on application specifics.
+
+Some use cases for tweaking the relevance calculations:
+
+- Personalize search results based on some property; age, nationality, language, friends and friends of friends.
+- Rank fresh (age) documents higher, while still considering other relevance measures.
+- Rank documents by geographical location, searching for relevant resources nearby.
+- Rank documents by machine learned ranking functions - Learning to Rank (LTR).
+- Rank documents by business constraints - For example by product availability.
+
+Vespa allows creating any number of _rank profiles_: named collections of ranking and relevance calculations that one can choose from at query time. A number of built-in functions and expressions are available to create highly specialized ranking expressions and users can define their own functions in the schema.
+
+
+### News article popularity signal
+
+During the conversion of the news dataset, the conversion script counted both the number of times a news article was shown (impressions) and how many clicks it received. A high number of clicks relative to impressions indicates that the news article was generally popular. We can use this signal in our ranking. Since both clicks and impressions are attribute fields, these fields can be [updated](/en/writing/partial-updates) at scale with very high throughput.
+
+To do so, we include a `popularity` rank profile, defined at the bottom of `schemas/news.sd` as shown below. Note that rank profiles are defined outside the `document` block:
+
+
+```txt
+schema news {
+    document news {
+        field news_id type string {
+            indexing: summary | attribute
+            attribute: fast-search
+        }
+        field category type string {
+            indexing: summary | attribute
+        }
+        field subcategory type string {
+            indexing: summary | attribute
+        }
+        field title type string {
+            indexing: index | summary
+            index: enable-bm25
+        }
+        field abstract type string {
+            indexing: index | summary
+            index: enable-bm25
+        }
+        field body type string {
+            indexing: index | summary
+            index: enable-bm25
+        }
+        field url type string {
+            indexing: index | summary
+        }
+        field date type int {
+            indexing: summary | attribute
+            attribute: fast-search
+        }
+        field clicks type int {
+            indexing: summary | attribute
+        }
+        field impressions type int {
+            indexing: summary | attribute
+        }
+    }
+
+    fieldset default {
+        fields: title, abstract, body
+    }
+
+    rank-profile popularity inherits default {
+        function popularity() {
+            expression: if (attribute(impressions) > 0, attribute(clicks) / attribute(impressions), 0)
+        }
+        first-phase {
+            expression: nativeRank(title, abstract) + 10 * popularity
+        }
+    }
+}
+```
+
+
+- `rank-profile popularity inherits default`
+
+This configures Vespa to create a new rank profile named `popularity`, which inherits all the properties of the default rank-profile; only properties that are explicitly defined, or overridden, will differ from those of the default rank-profile.
+
+- `first-phase`
+
+Relevance calculations in Vespa are two-phased. The calculations done in the first phase are performed on every single document matching your query, while the second phase calculations are only done on the top `n` documents as determined by the calculations done in the first phase. See [phased ranking](/en/ranking/phased-ranking).
+
+- `function popularity()`
+
+This sets up a function that can be called from other expressions. This function calculates the number of clicks divided by impressions for indicating popularity. However, this isn't really the best way of calculating this as an article with a low number of impressions can score high on such a value, even though uncertainty is high.
+
+- `expression: nativeRank(title, abstract) + 10 * popularity`
+
+This expression is used to rank documents. Here, the `nativeRank` of the `title` and `abstract` fields is included to make the query relevant, while the second term calls the `popularity` function. The weighted sum of these two terms is the final relevance for each document. Note that the weight here, `10`, is set by observation. A better approach would be to learn such values using machine learning.
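+
+To make the arithmetic concrete, the following sketch mirrors the rank profile in plain Python. It is an illustration only, not how Vespa evaluates ranking expressions; the `native_rank` argument stands in for whatever value `nativeRank` produces for a hit, and floating-point division is assumed.
+
```python
def popularity(clicks, impressions):
    """Click-through ratio, guarded against division by zero."""
    return clicks / impressions if impressions > 0 else 0.0


def first_phase(native_rank, clicks, impressions, weight=10):
    """Weighted sum of a text-match score and the popularity signal."""
    return native_rank + weight * popularity(clicks, impressions)


well_tested = popularity(400, 1000)    # 0.4, backed by many impressions
barely_shown = popularity(1, 1)        # 1.0, backed by a single impression
score = first_phase(0.2, 400, 1000)    # 0.2 + 10 * 0.4
```
+
+The article with a single impression scores higher on popularity than the one with a thousand, which is the uncertainty issue noted for the `popularity` function.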
+
+More information can be found in the [schema reference](/en/reference/schemas/schemas#rank-profile).
+
+Deploy the _popularity_ rank profile:
+
+
+```bash
+$ vespa deploy --wait 300 my-app
+```
+
+Run a query:
+
+```bash
+$ vespa query -v \
+  'yql=select * from news where default contains "music"' \
+  'ranking=popularity'
+```
+
+and find documents with high `popularity` values at the top. Note that we must specify the rank profile to use with the run time `ranking` parameter.
+
+## Conclusion
+
+After completing this part of the tutorial, you should now have a basic understanding of how you can load data into Vespa and effectively search for content. In the [next part of the tutorial](/en/learn/tutorials/news-4-embeddings), we'll start with the basics for transforming this search app into a recommendation system.
+
diff --git a/mintlify-docs/en/learn/tutorials/news-4-embeddings.mdx b/mintlify-docs/en/learn/tutorials/news-4-embeddings.mdx
new file mode 100644
index 0000000000..bb3d1e9309
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-4-embeddings.mdx
@@ -0,0 +1,291 @@
+---
+title: "News search and recommendation tutorial - embeddings"
+---
+
+This is the fourth part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In this part, we'll start transforming our application from news search to recommendation. We won't be using Vespa at all in this part. Our focus is to generate news and user embeddings. We'll start using these embeddings in the next part - you can skip this part if you wish.
+
+The primary function of a recommendation system is to provide items of interest to any given user. The more we know about the user, the better recommendations we can provide. We can view recommendation as search where the query is the user profile. So, in this tutorial we will build upon the previous news search tutorial by creating user profiles and use them to search for relevant news articles.
+
+We start by generating embeddings using a collaborative filtering method. We'll then improve upon that using a content-based approach, which generates embeddings based on BERT models. Since we'll use this in a nearest neighbors algorithm, we'll touch upon how the maximum inner product search is transformed to a distance search form.
+
+Let's start with taking a look again at what data the MIND dataset provides for us.
+
+
+### Requirements
+
+We start using some machine learning tools in this tutorial. Specifically, we need Numpy, Scikit-learn, PyTorch, and the HuggingFace Transformers library. Make sure you have all the necessary dependencies by running the following in the sample application directory:
+
+```bash
+$ python3 -m pip install --ignore-installed -r requirements.txt
+```
+
+
+## The MIND dataset
+
+The MIND dataset, for our purposes in this series of tutorials, consists of two main files: `news.tsv` and `behaviors.tsv`. We used the former in the previous tutorial, as that contains all news article content.
+
+The `behaviors.tsv` file contains a set of impressions. An impression is an ordered list of news articles that was generated for a user. It includes which of those articles the user clicked, and conversely, which ones were not. We designate articles not clicked as "skips". Also included in the impression is a list of articles the user has previously clicked. An example is:
+
+```
+3 U11552 11/11/2019 1:03:52 PM N2139 N18390-0 N10537-0 N23967-1
+```
+
+Here, user `U11552` was shown three articles: `N18390`, `N10537`, and `N23967`, of which the user skipped two and clicked the last one. At that time, the user had previously clicked on article `N2139`. We can cross-reference with the `news.tsv` and extract the content of these articles.
+
+We interpret a click as a positive signal for interest and a skip as possibly a negative signal for interest. This is called implicit feedback, as the users haven't explicitly expressed their interests. However, using clicks and skips, we can still start to infer the users' interests.
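+
+A small sketch of how one impression line can be turned into labeled examples follows. The function name is hypothetical, and it assumes the five tab-separated columns of `behaviors.tsv` (impression id, user id, time, click history, impressions).
+
```python
def parse_impression(line):
    """Split one behaviors.tsv line into user, click history and labeled pairs.

    Hypothetical helper; label 1 means click, label 0 means skip.
    """
    _, user_id, _, history, impressions = line.rstrip("\n").split("\t")
    examples = []
    for item in impressions.split():
        news_id, label = item.rsplit("-", 1)
        examples.append((user_id, news_id, int(label)))
    return user_id, history.split(), examples


line = "3\tU11552\t11/11/2019 1:03:52 PM\tN2139\tN18390-0 N10537-0 N23967-1"
user, clicked_before, examples = parse_impression(line)
```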
+
+
+## Collaborative filtering in recommendation systems
+
+A simple approach to provide recommendations to the above would be to extract the categories, subcategories, and/or entities the users have implicitly interacted with, and store these for each user. We can call this a sparse user profile because we store the exact terms of entities or categories. We could then use traditional information retrieval techniques to search for more articles with similar content.
+
+However, by doing this we miss out on a lot of information. For instance, some categories or entities are similar, which could be of interest to the user. Also, users with similar interests tend to click on similar articles. If some type of content was interesting to one user, it would likely be interesting to similar users.
+
+Exploiting this information is called collaborative filtering and the classical approach to this is matrix factorization. In this approach, we create a large matrix with users along one axis and news articles along the other. We'll call this the interaction matrix. Then we factorize this matrix into two smaller matrices, where the product of these two smaller matrices approximates the original.
+
+
+
+
+
+In the image above, you can see a user matrix with as many rows as there are users and a news matrix with as many columns as there are news articles. Each user row, or news column, has the same length, signified by the `k` dimension. The intuition is that the dot product of the `k` length vector for a user and news pair approximates the user's interest in the news article. Since the information is compressed into the `k` length vector, this works across users as well. Thus, the "collaborative" filtering.
+
+These `k` length vectors can be extracted from the matrices and associated with the user or news article. So, when we want to recommend news articles to a user, we simply find the user's vector and find the articles with the highest dot products. In the following, we will use this approach to generate such embeddings for users and news articles.
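+
+The lookup described here can be sketched with toy vectors. The values and the `recommend` helper are made up for illustration, with `k = 3`.
+
```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recommend(user_vector, news_vectors, top_n=2):
    """Return the ids of the top_n articles with the highest dot product."""
    ranked = sorted(
        news_vectors.items(),
        key=lambda pair: dot(user_vector, pair[1]),
        reverse=True,
    )
    return [news_id for news_id, _ in ranked[:top_n]]


# Toy embeddings, not taken from the MIND dataset.
user_vector = [0.9, 0.1, 0.0]
news_vectors = {
    "N1": [1.0, 0.0, 0.0],
    "N2": [0.0, 1.0, 0.0],
    "N3": [0.5, 0.5, 0.0],
}
top_articles = recommend(user_vector, news_vectors)
```
+
+This brute-force scan scores every article; it is only meant to show what "highest dot product" means for a handful of vectors.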
+
+Please note, however, this approach would not work well in practice for news recommendation. The reason is that a large part of news recommendation is to recommend **new** news articles, which might not have received any implicit feedback yet. This is called the "cold start" problem. For such problems, we need to use additional content (often called "side information") of news articles to provide recommendations. We'll tackle this "cold start" a bit later.
+
+
+## Generating embeddings
+
+A standard method for factorizing the interaction matrix is to use Alternating Least Squares. The idea is to randomly fill the user and news matrices and freeze one of the matrices' parameters while solving for the other. By alternating between which matrix is fixed, this can be solved with a traditional least-squares problem. We can iterate the process until convergence.
+
+This tutorial aims to generate embeddings so that the dot product between a user and news vector signifies the probability of a click. Using this signal we can rank news articles by click probability. To train the embedding vectors, we will use a stochastic gradient descent approach to modify the embeddings so that their dot product followed by the logistic function predicts a user click. We use a binary cross-entropy as loss function.
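+
+As a plain-Python sketch of this objective for a single user and news pair (function names are illustrative, and no gradient step is shown):
+
```python
import math


def click_probability(user_vector, news_vector):
    """Logistic function applied to the dot product of the two embeddings."""
    dot_product = sum(u * n for u, n in zip(user_vector, news_vector))
    return 1.0 / (1.0 + math.exp(-dot_product))


def binary_cross_entropy(prediction, label):
    """Loss for one example; label is 1 for a click and 0 for a skip."""
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))


neutral = click_probability([0.0, 0.0], [1.0, 1.0])   # dot product 0
confident = click_probability([2.0, 1.0], [1.0, 1.0])  # dot product 3
```
+
+For a clicked article, the loss shrinks as the predicted probability approaches 1, which is what pushes the two embeddings toward a larger dot product during training.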
+
+We'll use PyTorch for this. The main PyTorch model class is as follows:
+
+```python
+import torch
+
+
+class MF(torch.nn.Module):
+    def __init__(self, num_users, num_items, embedding_size):
+        super(MF, self).__init__()
+        self.user_embeddings = torch.nn.Embedding(num_embeddings=num_users,
+                                                  embedding_dim=embedding_size)
+        self.news_embeddings = torch.nn.Embedding(num_embeddings=num_items,
+                                                  embedding_dim=embedding_size)
+
+    def forward(self, users, items):
+        user_embeddings = self.user_embeddings(users)
+        news_embeddings = self.news_embeddings(items)
+        # Row-wise dot product between each user and news embedding
+        dot_prod = torch.sum(torch.mul(user_embeddings, news_embeddings), 1)
+        return torch.sigmoid(dot_prod)
+```
+
+We use PyTorch's `Embedding` class to hold the user and news embeddings. The `forward` function implements the forward pass of the model. First, the embeddings for the users and items selected for a mini-batch update are looked up in their embedding tables. Then we take the dot product, apply the logistic function and return the value. This prediction for user and news pairs is then evaluated against the click or skip labels:
+
+```python
+# forward + backward + optimize
+user_ids, news_ids, labels = batch
+prediction = model(user_ids, news_ids)
+loss = loss_function(prediction.view(-1), labels)
+loss.backward()
+optimizer.step()
+```
+
+This is done across several epochs. The `batch` here contains a batch of `user_id`s, `news_id`s, and `label`s used for training a mini-batch. For instance, from the example impression above, a training example would be `U11552, N23967, 1`. The code responsible for generating the training data samples 4 negative examples (skips) for each positive example (click).
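+
+The sampling of negatives can be sketched as below. This is not the code in `train_mf.py`; the function name is hypothetical, and sampling with replacement is an assumption made so that impressions with fewer than 4 skips still yield 4 negatives.
+
```python
import random


def sample_training_examples(user_id, clicks, skips, negatives_per_positive=4, rng=None):
    """Pair every click with a fixed number of sampled skips (illustrative only)."""
    rng = rng or random.Random(0)
    examples = []
    for news_id in clicks:
        examples.append((user_id, news_id, 1))
        if skips:
            # Assumption: draw with replacement from this impression's skips.
            for skipped in rng.choices(skips, k=negatives_per_positive):
                examples.append((user_id, skipped, 0))
    return examples


examples = sample_training_examples("U11552", ["N23967"], ["N18390", "N10537"])
```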
+
+The full code can be seen in the sample application, in [train_mf.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/train_mf.py). Let's go ahead and generate the embeddings:
+
+```bash
+$ ./src/python/train_mf.py mind 10
+```
+
+This runs the training code for 10 epochs, and deposits the resulting user and news vectors in the `mind` directory, where the rest of the data is:
+
+```
+Total loss after epoch 1: 573.5299682617188 (3.49713397026062 avg)
+Total loss after epoch 2: 551.6585083007812 (3.363771438598633 avg)
+...
+{'auc': 0.5776, 'mrr': 0.248, 'ndcg@5': 0.2573, 'ndcg@10': 0.317}
+{'auc': 0.4988, 'mrr': 0.2154, 'ndcg@5': 0.2182, 'ndcg@10': 0.2824}
+```
+
+We can see that the loss decreases as the epochs progress. The two final lines here are ranking metrics computed on the training set and the validation set. Here, the `AUC` metric - Area Under the (ROC) Curve - is at `0.5776` for the training set and `0.4988` for the validation set. If you run for more epochs, you would see the `AUC` for the training set become much larger than that of the validation set, around `0.974` and `0.51` respectively if run for 100 epochs.
+
+In this case, the `AUC` metric measures the probability of ranking relevant news higher than non-relevant news. A score of around `0.5` means that the ranking is no better than random. Thus, we haven't learned anything of use for the validation set.
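+
+This pairwise reading of `AUC` can be computed directly. The sketch below is a plain-Python illustration with made-up scores, not the evaluation code used by the training script:
+
+```python
+def auc(scores, labels):
+    """Fraction of (clicked, skipped) pairs where the clicked item scores higher.
+
+    Ties count as half a win.
+    """
+    positives = [s for s, l in zip(scores, labels) if l == 1]
+    negatives = [s for s, l in zip(scores, labels) if l == 0]
+    if not positives or not negatives:
+        raise ValueError("need at least one click and one skip")
+    wins = 0.0
+    for p in positives:
+        for n in negatives:
+            if p > n:
+                wins += 1.0
+            elif p == n:
+                wins += 0.5
+    return wins / (len(positives) * len(negatives))
+
+
+labels = [1, 0, 0, 1, 0]
+perfect = auc([0.9, 0.2, 0.4, 0.7, 0.1], labels)      # every click outranks every skip
+uninformed = auc([0.5, 0.5, 0.5, 0.5, 0.5], labels)   # identical scores carry no signal
+print(perfect, uninformed)
+```
+
+The first call gives `1.0` and the second `0.5`, which is the level we see on the validation set above.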
+
+This is not overfitting but rather an instance of the problem mentioned earlier. The validation set contains news articles shown to users a time period after the data in the training set. Thus, most news articles are new, and their embedding vectors are effectively random.
+
+We'll address this next.
+
+
+## Addressing the cold start problem
+
+The approach above relied on the news articles that users interacted with during the training set period. Only the user ids and news article ids were used. To overcome the problem that new articles haven't been seen in the training set, we need to use the article's content features. So, the predictions will be based on the similarity to content a user has previously interacted with, rather than on the actual news article id.
+
+This is, naturally enough, called content-based recommendation.
+
+The general approach we'll take here is to still rely on a dot product between a user embedding and a news embedding. However, the news embedding will be constructed from various content features.
+
+The MIND dataset has a few such features we can use. Each news article has a `category`, a `subcategory` and zero or more `entities` extracted from the text. These features are categorical, meaning that they have a finite set of values they can take. To handle these, we'll generate an embedding for each possible value, similar to how we generated embeddings for the user ids and news ids above. These ids are also categorical, after all.
+
+
+### Creating BERT embeddings
+
+However, there are other content features as well such as the `title` and `abstract`. To create embeddings from these, we'll employ a [BERT-based sentence classifier](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForSequenceClassification) from the [HuggingFace transformers](https://huggingface.co/docs/transformers/index) library:
+
+```python
+from transformers import BertTokenizer, BertModel
+
+tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
+model = BertModel.from_pretrained('google/bert_uncased_L-8_H-512_A-8')
+# `title` and `abstract` are the strings for a single news article
+tokens = tokenizer(title, abstract, return_tensors="pt")
+outputs = model(**tokens)
+# last hidden state -> first (only) sequence in the batch -> first token (CLS)
+embedding = outputs[0][0][0]
+```
+
+Here, we use a medium-sized BERT model with 8 layers and a hidden dimension size of 512. This means that the embedding will be a vector of size 512. We use the vector from the first `CLS` token to represent the combined title and abstract.
+
+To generate these embeddings for all news content, run one of the following:
+
+1. Generate embeddings. This might take a while, around an hour for all news articles in the `train` and `dev` demo dataset.
+
+```bash
+$ python3 src/python/create_bert_embeddings.py mind
+```
+
+2. Download pre-processed embeddings:
+
+```bash
+$ curl -L -o mind/train/news_embeddings.tsv \
+ https://data.vespa-cloud.com/sample-apps-data/mind_news_embedding.tsv
+$ curl -L -o mind/dev/news_embeddings.tsv \
+ https://data.vespa-cloud.com/sample-apps-data/mind_news_embedding_dev.tsv
+```
+
+This creates a `news_embeddings.tsv` file under the `mind/train` and `mind/dev` subdirectories.
+
+
+## Training the model
+
+Now that we have content-based embeddings for each news article, we can train the model to use them. The following figure illustrates the model we are training:
+
+
+
+
+
+So, we'll pass the 512-dimensional embeddings from the BERT model through a linear layer followed by a sigmoid to reduce the dimensions to 50. We then concatenate this representation with the 50-dimensional embeddings for the news id, `category`, `subcategory` and `entity`. We only use one entity for now. This representation is then sent through another neural network layer to form the final representation for a news article. Finally, the dot product is taken with the user embedding.
+
+In PyTorch code, this looks like:
+
+```python expandable
+import torch
+
+
+class ContentBasedModel(torch.nn.Module):
+    def __init__(self,
+                 num_users,
+                 num_news,
+                 num_categories,
+                 num_subcategories,
+                 num_entities,
+                 embedding_size,
+                 bert_embeddings):
+        super(ContentBasedModel, self).__init__()
+
+        # Trainable embedding tables for the categorical features
+        self.user_embeddings = torch.nn.Embedding(num_embeddings=num_users, embedding_dim=embedding_size)
+        self.news_embeddings = torch.nn.Embedding(num_embeddings=num_news, embedding_dim=embedding_size)
+        self.cat_embeddings = torch.nn.Embedding(num_embeddings=num_categories, embedding_dim=embedding_size)
+        self.sub_cat_embeddings = torch.nn.Embedding(num_embeddings=num_subcategories, embedding_dim=embedding_size)
+        self.entity_embeddings = torch.nn.Embedding(num_embeddings=num_entities, embedding_dim=embedding_size)
+
+        # Pre-computed BERT vectors are kept frozen; only the transforms are trained
+        self.news_bert_embeddings = torch.nn.Embedding.from_pretrained(bert_embeddings, freeze=True)
+        self.news_bert_transform = torch.nn.Linear(bert_embeddings.shape[1], embedding_size)
+        self.news_content_transform = torch.nn.Linear(in_features=embedding_size*5, out_features=embedding_size)
+
+    def get_user_embeddings(self, users):
+        return self.user_embeddings(users)
+
+    def get_news_embeddings(self, items, categories, subcategories, entities):
+        # Reduce the BERT vector to embedding_size dimensions
+        bert_embeddings = self.news_bert_embeddings(items)
+        bert_embeddings = self.news_bert_transform(bert_embeddings)
+        bert_embeddings = torch.sigmoid(bert_embeddings)
+
+        cat_embeddings = self.cat_embeddings(categories)
+        news_embeddings = self.news_embeddings(items)
+        sub_cat_embeddings = self.sub_cat_embeddings(subcategories)
+        entity_embeddings_1 = self.entity_embeddings(entities[:,0])  # first entity only
+        # Concatenate the five parts, then project back to embedding_size
+        news_embedding = torch.cat((news_embeddings, bert_embeddings, cat_embeddings,
+                                    sub_cat_embeddings, entity_embeddings_1), 1)
+        news_embedding = self.news_content_transform(news_embedding)
+        news_embedding = torch.sigmoid(news_embedding)
+
+        return news_embedding
+
+    def forward(self, users, items, categories, subcategories, entities):
+        user_embeddings = self.get_user_embeddings(users)
+        news_embeddings = self.get_news_embeddings(items, categories, subcategories, entities)
+        dot_prod = torch.sum(torch.mul(user_embeddings, news_embeddings), 1)
+        return torch.sigmoid(dot_prod)
+```
+
+The forward pass function is pretty much the same as before. You can see the entire training script in [train_cold_start.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/train_cold_start.py). Running it for 10 epochs results in:
+
+```bash
+$ python3 src/python/train_cold_start.py mind 10
+```
+
+```
+Total loss after epoch 1: 920.5855102539062 (0.703811526298523 avg)
+{'auc': 0.5391, 'mrr': 0.2367, 'ndcg@5': 0.2464, 'ndcg@10': 0.3059}
+...
+Total loss after epoch 10: 517.16748046875 (0.3953879773616791 avg)
+{'auc': 0.8758, 'mrr': 0.5074, 'ndcg@5': 0.5818, 'ndcg@10': 0.6316}
+{'auc': 0.6249, 'mrr': 0.2842, 'ndcg@5': 0.3114, 'ndcg@10': 0.3733}
+```
+
+This is much better. The validation `AUC` at epoch 9 (elided from the output above) is a respectable `0.6266`. Note that as we train further, the `AUC` for the dev set starts dropping. This is a sign of overfitting, so we should stop training.
+
+For reference, the baseline model for the MIND competition, [Neural News Recommendation with Multi-Head Self-Attention](https://aclanthology.org/D19-1671/), achieves an `AUC` of `0.6362`. This model additionally uses the user history in each impression to create a better model for the user embedding. For the moment, however, we are satisfied with our results, and we'll use these embeddings going forward. Feel free to experiment and see if you can achieve better results!
+
+
+These numbers are for the demo dataset, which is much smaller than the full dataset. For reference, in [the MIND paper](https://msnews.github.io/assets/doc/ACL2020_MIND.pdf) the baseline here achieves `0.6776` on the full dataset.
+
+
+The training script writes these embeddings to the files `mind/user_embeddings.tsv` and `mind/news_embeddings.tsv`.
+
+
+## Mapping from inner-product search to Euclidean search
+
+These vectors have been trained to maximize the inner product. Finding the best news articles given a user vector is called Maximum Inner Product Search - or MIPS. This form isn't really suitable for efficient retrieval as-is, but it can be mapped to a nearest neighbor search problem, so we can use an efficient approximate nearest neighbors index.
+
+When specifying `distance-metric: dotproduct`, Vespa uses the technique discussed in [Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/XboxInnerProduct.pdf) to solve the MIPS case. See the [blog post announcing MIPS support in Vespa](https://blog.vespa.ai/announcing-maximum-inner-product-search/).
+
+```sd
+field embedding type tensor(d0[50]) {
+ indexing: attribute | index
+ attribute {
+ distance-metric: dotproduct
+ }
+}
+```
+
+See [Nearest Neighbor Search](/en/querying/nearest-neighbor-search) for more information on nearest neighbor search and supported distance metrics in Vespa.
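+
+The idea behind this mapping can be sketched in a few lines of plain Python. Each document vector gets one extra component that pads it to the largest norm in the collection, and the query gets an extra zero. The squared Euclidean distance in the augmented space then equals a constant minus twice the inner product, so the nearest neighbor is also the maximum inner product. This is an illustration of the technique with toy vectors and hypothetical helper names, not Vespa's implementation:
+
+```python
+import math
+
+
+def dot(a, b):
+    return sum(x * y for x, y in zip(a, b))
+
+
+def euclidean(a, b):
+    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
+
+
+def augment_items(items):
+    """Append sqrt(M^2 - |x|^2) to each item, where M is the largest item norm."""
+    max_norm_sq = max(dot(x, x) for x in items)
+    return [x + [math.sqrt(max(0.0, max_norm_sq - dot(x, x)))] for x in items]
+
+
+def augment_query(query):
+    """The query gets a zero in the extra dimension."""
+    return query + [0.0]
+
+
+# Toy 2-dimensional vectors, not real embeddings
+items = [[1.0, 2.0], [3.0, 0.5], [-1.0, 4.0], [0.2, 0.1]]
+query = [0.5, 1.0]
+
+by_dot = sorted(range(len(items)), key=lambda i: -dot(query, items[i]))
+
+augmented = augment_items(items)
+augmented_query = augment_query(query)
+by_distance = sorted(range(len(items)), key=lambda i: euclidean(augmented_query, augmented[i]))
+
+print(by_dot, by_distance)
+```
+
+Both orderings are `[2, 0, 1, 3]`: ranking by smallest distance in the augmented space matches ranking by largest inner product in the original space.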
+
+We've included a script to create a feed suitable for Vespa:
+
+```bash
+$ python3 src/python/convert_embeddings_to_vespa_format.py mind
+```
+
+We are now ready to feed these embedding vectors to Vespa.
+
+
+## Conclusion
+
+Now that we've generated user and document embeddings, we can start using these to recommend news items to users. We'll start feeding these in the [next part of the tutorial](/en/learn/tutorials/news-5-recommendation).
diff --git a/mintlify-docs/en/learn/tutorials/news-5-recommendation.mdx b/mintlify-docs/en/learn/tutorials/news-5-recommendation.mdx
new file mode 100644
index 0000000000..3d790db95e
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-5-recommendation.mdx
@@ -0,0 +1,467 @@
+---
+title: "News search and recommendation tutorial - recommendation"
+---
+
+This is the fifth part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In this part, we'll start transforming our application from news search to recommendation using the embeddings we created in the previous part. So, we'll start by modifying our application, so we can feed the embeddings and start using them for searching.
+
+For reference, the final state of this tutorial can be found in the [app-5-recommendation](https://github.com/vespa-engine/sample-apps/tree/master/news/app-5-recommendation) sub-directory of the `news` sample application.
+
+
+## Indexing embeddings
+
+First, we need to modify the `news.sd` search definition to include a field to hold the embedding and a recommendation rank profile:
+
+```sd expandable
+schema news {
+ document news {
+ field news_id type string {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field category type string {
+ indexing: summary | attribute
+ }
+ field subcategory type string {
+ indexing: summary | attribute
+ }
+ field title type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field abstract type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field body type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ }
+ field date type int {
+ indexing: summary | attribute
+ }
+ field clicks type int {
+ indexing: summary | attribute
+ }
+ field impressions type int {
+ indexing: summary | attribute
+ }
+ field embedding type tensor(d0[50]) {
+ indexing: attribute
+ attribute {
+ distance-metric: dotproduct
+ }
+ }
+ }
+
+ fieldset default {
+ fields: title, abstract, body
+ }
+
+ rank-profile popularity inherits default {
+ function popularity() {
+ expression: if (attribute(impressions) > 0, attribute(clicks) / attribute(impressions), 0)
+ }
+ first-phase {
+ expression: nativeRank(title, abstract) + 10 * popularity
+ }
+ }
+
+ rank-profile recommendation inherits default {
+ first-phase {
+ expression: closeness(field, embedding)
+ }
+ }
+}
+```
+
+The `embedding` field is a tensor field. Tensors in Vespa are flexible multi-dimensional data structures, and, as first-class citizens, can be used in queries, document fields, and constants in ranking. Tensors can be either dense or sparse or both, and can contain any number of dimensions. See [the tensor user guide](/en/ranking/tensor-user-guide) for more information.
+
+Here we have defined a dense tensor with a single dimension (`d0` - dimension 0), which represents a vector. The distance metric is "dotproduct" as we would like to use this field for nearest-neighbor search where we search for the maximal dotproduct.
+
+This is seen in the `recommendation` rank profile. Here, we've added a ranking expression using the [closeness](/en/reference/ranking/rank-features#closeness(dimension,name)) ranking feature, which calculates the dot product and uses that to rank the news articles. This depends on using the `nearestNeighbor` search operator, which we'll get back to below when searching. But for now, this expects a tensor in the query to be used as the initial search point.
+
+If you take a look at the file generated for the news embeddings, `mind/vespa_news_embeddings.json`, you'll see several lines with something like this:
+
+```json
+{
+ "update": "id:news:news::N13390",
+ "fields": {
+ "embedding": {
+ "assign": {
+ "values": [9.871717,-0.403103,...]
+ }
+ }
+ }
+}
+```
+
+This is a [partial update](/en/writing/partial-updates). So, assuming you already have a system up and running from the previous search tutorial, you don't need to feed the entire corpus. With a partial update, you only need to update the necessary fields. So, after training another set of embeddings you can partially feed them again. Please refer to [Vespa reads and writes](/en/writing/reads-and-writes) for more information on feeding formats.
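+
+As a rough sketch, an update operation of this shape can be generated with a few lines of Python. The helper name is hypothetical and the two-value vector is shortened for readability; the `convert_embeddings_to_vespa_format.py` script in the sample application is the one that actually produces the feed files:
+
+```python
+import json
+
+
+def embedding_update(news_id, embedding):
+    """Build a partial update that assigns a new embedding to one news document."""
+    return {
+        "update": f"id:news:news::{news_id}",
+        "fields": {
+            "embedding": {
+                "assign": {"values": list(embedding)}
+            }
+        },
+    }
+
+
+# A real embedding has 50 values; two are enough to show the structure
+operation = embedding_update("N13390", [9.871717, -0.403103])
+print(json.dumps(operation))
+```
+
+Because only the `embedding` field is named in the operation, all other fields of the document are left untouched.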
+
+We need to add another document type to represent a user. Add this schema in `schemas/user.sd`:
+
+```sd
+schema user {
+ document user {
+ field user_id type string {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field embedding type tensor(d0[50]) {
+ indexing: summary | attribute
+ }
+ }
+}
+```
+
+This schema is set up so that we can search for a `user_id` and retrieve the user's embedding vector.
+
+We also need to let Vespa know we want to use this document type, so we modify `services.xml` and add it under `documents` in the `content` section:
+
+```xml
+<documents>
+    <document type="news" mode="index" />
+    <document type="user" mode="index" />
+</documents>
+```
+
+```bash
+$ vespa deploy --wait 300 my-app
+```
+
+```bash
+$ sleep 20
+```
+
+After redeploying with the updated schemas and `services.xml`, feed `mind/vespa_user_embeddings.json` and `mind/vespa_news_embeddings.json`:
+
+```bash
+$ vespa feed mind/vespa_user_embeddings.json --target http://localhost:8080
+$ vespa feed mind/vespa_news_embeddings.json --target http://localhost:8080
+```
+
+Once the feeding jobs finish, the index is ready to be used. We can verify that we have 65238 news documents and 94057 user documents:
+
+```bash
+$ sleep 20
+```
+
+```bash
+$ vespa query -v \
+ 'yql=select * from news where true' \
+ 'hits=0'
+```
+
+```bash
+$ vespa query -v \
+ 'yql=select * from user where true' \
+ 'hits=0'
+```
+
+
+## Query profiles and query profile types
+
+Before we can test the application, we need to add a query profile type. The `recommendation` rank profile above requires a tensor to be sent along with the query. For Vespa to bind the correct types, it needs to know the expected type of this query parameter. That is called a query profile type.
+
+[Query profiles](/en/reference/querying/query-profiles) are named sets of search request parameters that can be set as defaults, so you don't have to pass them along with the query. We don't use this in this sample application. Still, we need a default query profile in order to declare the types of the query parameters we expect to pass.
+
+So, write the following to `news/my-app/search/query-profiles/default.xml`:
+
+```xml
+<query-profile id="default" type="root" />
+```
+
+To set up the query profile types, write them to the file `search/query-profiles/types/root.xml`:
+
+```xml
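+<query-profile-type id="root" inherits="native">
+    <field name="ranking.features.query(user_embedding)" type="tensor&lt;float&gt;(d0[50])" />
+</query-profile-type>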
+```
+
+This configures Vespa to expect a float tensor with dimension `d0[50]` when the query parameter `ranking.features.query(user_embedding)` is passed. We'll see how this works together with the `nearestNeighbor` search operator below.
+
+
+Setting up this query profile type is required when sending a tensor as a query parameter. A common pitfall is to forget the default query profile, but that is required to successfully set up the query profile type.
+
+
+Deploy the updates to query profiles:
+
+```bash
+$ vespa deploy --wait 300 my-app
+```
+
+
+## Testing the application
+
+We can now query Vespa using embeddings. First, let's find the user `U33527`:
+
+```bash
+$ vespa query -v \
+ 'yql=select user_id, embedding from user where user_id contains "U33527"' \
+ 'hits=1'
+```
+
+This returns the document containing the user's embedding:
+
+```json expandable
+{
+ "root": {
+ ...
+ "children": [
+ {
+ "id": "index:mind/0/ce7cc40b398f32626fcff97a",
+ "relevance": 0.0017429193899782135,
+ "source": "mind",
+ "fields": {
+ "user_id": "U33527",
+ "embedding": {
+ "type": "tensor(d0[50])",
+ "values": [
+ 0.0,
+ 0.06090399995446205,
+ 0.15839800238609314,
+ ...
+ ]
+ }
+ }
+ }
+ ]
+ }
+}
+```
+
+Now we can use this vector to query the news articles. You can write this query by hand, but we've added a convenience script, [user_search.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/user_search.py), which queries Vespa:
+
+```bash
+$ ./src/python/user_search.py U33527 10
+```
+
+This script first retrieves the user embedding using an HTTP `GET` query to Vespa. It then parses the tensor containing the embedding vector. Finally, it issues a `nearestNeighbor` search using a `POST` (however a `GET` would work just as well). Please see the [nearest-neighbor operator](/en/reference/querying/yql#nearestneighbor) for more on the syntax for nearest-neighbor searches. The `nearestNeighbor` search looks like:
+
+```json
+{
+ "hits": 10,
+ "yql": "select * from sources news where (nearestNeighbor(embedding, user_embedding))",
+ "ranking.features.query(user_embedding)": "{ ... }",
+ "ranking.profile": "recommendation"
+}
+```
+
+Here, you can see the `nearestNeighbor` search operator being set up so that the query parameter `user_embedding` will be searched against the `embedding` document field. The tensor for the `user_embedding` is in the `ranking.features.query(user_embedding)` parameter. Recall from above that we set a query profile type for this exact query parameter, so Vespa knows what to expect here.
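+
+The parsing step of the script can be sketched as below. This is not the code in `user_search.py`; it is a minimal illustration that assumes the response has the shape shown earlier in this section, and the function name and the three-value sample tensor are made up:
+
+```python
+import json
+
+
+def parse_user_embedding(response_text):
+    """Extract the embedding values from a user query response."""
+    root = json.loads(response_text)["root"]
+    children = root.get("children", [])
+    if not children:
+        raise LookupError("no user document in response")
+    return children[0]["fields"]["embedding"]["values"]
+
+
+# Shortened stand-in for a real response (a real embedding has 50 values)
+sample_response = json.dumps({
+    "root": {
+        "children": [
+            {
+                "fields": {
+                    "user_id": "U33527",
+                    "embedding": {"type": "tensor(d0[3])", "values": [0.0, 0.0609, 0.1584]},
+                }
+            }
+        ]
+    }
+})
+
+values = parse_user_embedding(sample_response)
+print(values)
+```
+
+The resulting list of floats is what gets serialized into the `ranking.features.query(user_embedding)` parameter of the second request.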
+
+When Vespa receives this query, it scans linearly through all documents, and scores them using the `recommendation` rank profile we set up above. Recall that we ask Vespa to convert the problem from maximum inner product to a nearest distance problem by using the `dotproduct` distance metric; in this case, the `distance` ranking feature just outputs the negative dotproduct.
+
+With a distance search, we want to find the smallest distances. However, Vespa sorts the final results by decreasing rank score. To get the expected rank order, Vespa provides the `closeness` feature which in this case is just the dotproduct directly.
+
+Let's test that this works as intended, using [evaluate.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/evaluate.py):
+
+```bash
+$ ./src/python/evaluate.py mind 1000
+```
+
+This reads both the training and validation set impressions, queries Vespa for 1000 randomly drawn impressions, and calculates the same metrics we saw during training. The result is something like:
+
+```
+Train: {'auc': 0.8774, 'mrr': 0.5115, 'ndcg@5': 0.5842, 'ndcg@10': 0.6345}
+Valid: {'auc': 0.6308, 'mrr': 0.2935, 'ndcg@5': 0.3203, 'ndcg@10': 0.3789}
+```
+
+This is in line with the results from the training. So, the conversion from inner product space to Euclidean space works as intended. The resulting rank scores are different, but the transformation evidently retains the same ordering.
+
+
+## Approximate Nearest Neighbor Search
+
+So far, we've been using exact nearest-neighbor search. This is a linear scan through all documents. For the MIND demo dataset we've been using, this isn't a problem as it only contains roughly 65000 documents, and Vespa only uses a few milliseconds to scan through these. However, as the index grows, the time (and computational cost) becomes significant.
+
+There are no exact methods for finding the nearest-neighbors efficiently. So we trade accuracy for efficiency in what is called approximate nearest-neighbors (ANN). Vespa provides a unique implementation of ANNs that uses the HNSW (hierarchical navigable small world) algorithm, while still being compatible with other facets of the search such as filtering. We'll get back to this in the next section.
+
+If you recall, Vespa returned something like the following when searching for single users above (with `targetHits` equal to 10):
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 95
+ },
+ "coverage": {
+ "coverage": 100,
+ "documents": 65238,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ }
+ }
+}
+```
+
+Here, `coverage` shows that Vespa did scan through all 65238 documents. The interesting piece here is the `totalCount`. This number is the number of times a document has been put in the top 10 results during this linear scan.
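+
+To see why `totalCount` exceeds 10 during a linear scan, consider this simplified model in plain Python. It keeps the current top 10 in a heap and counts every document that enters it at some point. The scores are random stand-ins, and this is a sketch of the explanation above rather than Vespa's actual implementation:
+
+```python
+import heapq
+import random
+
+
+def scan_top_k(scores, k):
+    """Linear scan that keeps the k best scores and counts admissions to the top k."""
+    heap = []      # min-heap of (score, doc_id); heap[0] is the weakest of the top k
+    admitted = 0   # how many times a document entered the top k
+    for doc_id, score in enumerate(scores):
+        if len(heap) < k:
+            heapq.heappush(heap, (score, doc_id))
+            admitted += 1
+        elif score > heap[0][0]:
+            heapq.heapreplace(heap, (score, doc_id))
+            admitted += 1
+    return sorted(heap, reverse=True), admitted
+
+
+rng = random.Random(42)
+scores = [rng.random() for _ in range(65238)]  # one random score per document
+top, admitted = scan_top_k(scores, 10)
+print(len(top), admitted)
+```
+
+The scan always ends with exactly 10 documents in the heap, while the admission count is larger because early candidates are pushed out as better ones are found.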
+
+Let's switch to using approximate nearest-neighbors by adding `index` to the embedding field in `news.sd`:
+
+```sd expandable
+schema news {
+ document news {
+ field news_id type string {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field category type string {
+ indexing: summary | attribute
+ }
+ field subcategory type string {
+ indexing: summary | attribute
+ }
+ field title type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field abstract type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field body type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ }
+ field date type int {
+ indexing: summary | attribute
+ attribute: fast-search
+ }
+ field clicks type int {
+ indexing: summary | attribute
+ }
+ field impressions type int {
+ indexing: summary | attribute
+ }
+ field embedding type tensor(d0[50]) {
+ indexing: attribute | index
+ attribute {
+ distance-metric: dotproduct
+ }
+ }
+ }
+
+ fieldset default {
+ fields: title, abstract, body
+ }
+
+ rank-profile popularity inherits default {
+ function popularity() {
+ expression: if (attribute(impressions) > 0, attribute(clicks) / attribute(impressions), 0)
+ }
+ first-phase {
+ expression: nativeRank(title, abstract) + 10 * popularity
+ }
+ }
+
+ rank-profile recommendation inherits default {
+ first-phase {
+ expression: closeness(field, embedding)
+ }
+ }
+}
+```
+
+If you make this change and deploy it, you will get prompted by Vespa that a restart is required so that the index can be built:
+
+```bash
+$ vespa deploy --wait 300 my-app
+```
+
+Introducing the HNSW `index` requires a content node restart; in this case, we restart all services:
+
+```bash
+$ docker exec vespa /usr/bin/sh -c \
+ '/opt/vespa/bin/vespa-stop-services && /opt/vespa/bin/vespa-start-services'
+```
+
+```bash
+$ vespa status --wait 300
+```
+
+After doing this and waiting a bit for Vespa to start, we can query Vespa again:
+
+```bash
+$ ./src/python/user_search.py U33527 10
+```
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1.0,
+ "fields": {
+ "totalCount": 10
+ },
+ "coverage": {
+ "coverage": 100,
+ "documents": 65238,
+ "full": true,
+ "nodes": 1,
+ "results": 1,
+ "resultsFull": 1
+ }
+ }
+}
+```
+
+Here, `coverage` is still 100%, but the `totalCount` has been reduced to 10 - the same number of hits we requested. By adding the index to this field, Vespa built an HNSW graph structure for the values in this field. When used in an approximate nearest-neighbor search, this graph is queried and only the closest points as determined by this graph are added to the list.
+
+The particularly observant might have noticed that the result set has changed. Indeed, the third result when using exact nearest neighbor search was news article `N438`. This was omitted from the approximate search. As mentioned, we trade accuracy for efficiency when using approximate nearest-neighbor search.
+
+It should also be mentioned that searching through this graph comes with a cost. In our case, since we only have a relatively small number of documents, there isn't that much gain in efficiency. However, as the number of documents grows, this starts to pay off. See [Approximate nearest neighbor search in Vespa](https://blog.vespa.ai/approximate-nearest-neighbor-search-in-vespa-part-1/) for more of a discussion around this. See also [Billion-scale vector search with Vespa - part one](https://blog.vespa.ai/billion-scale-knn/) and [Billion-scale vector search with Vespa - part two](https://blog.vespa.ai/billion-scale-knn-part-two/) which cover the many trade-offs related to approximate nearest neighbor search.
+
+The implementation of ANN using HNSW in Vespa has some nice features. Notice that we did not have to re-feed the corpus to enable ANN. Many other approaches for ANNs require building an index offline in a batch job. HNSW allows for incrementally building this index, which is fully exploited in Vespa.
+
+A unique feature of Vespa is that the implementation allows for filtering during graph traversal, which we'll look at next.
+
+
+## Filtering
+
+A common case when using approximate nearest-neighbors is to combine with some additional query filters. For instance, for retail search, one can imagine finding relevant products for a user. In this case, we should not recommend products that are out of stock. So an additional query filter would be to ensure that `in_stock` is true.
+
+Now, most implementations of ANNs come in the form of a library, so they are not integrated with the search at large. The natural approach is to first perform the ANN search, then *post-filter* the results. Unfortunately, this often leads to sub-optimal results as relevant documents might not have been recalled. See [Using approximate nearest-neighbor search in real world applications](https://blog.vespa.ai/using-approximate-nearest-neighbor-search-in-real-world-applications/) for more of a discussion around this, and [Query Time Constrained Approximate Nearest Neighbor Search](https://blog.vespa.ai/constrained-approximate-nearest-neighbor-search/) for a better understanding of pre- and post-filtering tradeoffs.
+
+In our case, let's assume we want to retrieve 10 `sports` articles for a user. It turns out we need to retrieve at least 278 news articles from the search to get to 10 `sports` articles for this user:
+
+```bash
+$ ./src/python/user_search.py U63195 10 | grep "category\": \"sports\"" | wc -l
+$ ./src/python/user_search.py U63195 278 | grep "category\": \"sports\"" | wc -l
+```
+
+On the other hand, if we add a filter specifically:
+
+```bash
+$ ./src/python/user_search.py U63195 10 "AND category contains 'sports'" | \
+ grep "category\": \"sports" | wc -l
+```
+
+Here, we only specify 10 hits and exactly 10 hits of `sports` category are returned. Vespa still searches through the graph starting from the query point, however the search does not stop when we have 10 hits. In effect, the graph search widens until 10 results fulfilling the filters are found.
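+
+The difference between post-filtering and filtering during the search can be illustrated with a toy model in plain Python. An already ranked list stands in for the order in which the graph search would visit candidates, and the document ids and categories are invented. This is a sketch of the concept, not of how Vespa traverses the HNSW graph:
+
+```python
+def post_filter(ranked, k, predicate):
+    """Take the k best candidates first, then drop those failing the filter."""
+    return [doc for doc in ranked[:k] if predicate(doc)]
+
+
+def filtered_search(ranked, k, predicate):
+    """Keep walking the candidates until k of them pass the filter."""
+    hits = []
+    for doc in ranked:
+        if predicate(doc):
+            hits.append(doc)
+            if len(hits) == k:
+                break
+    return hits
+
+
+# Hypothetical ranked candidates; every fourth one is a sports article
+categories = ["news", "sports", "finance", "lifestyle"]
+ranked = [{"id": f"N{i}", "category": categories[i % 4]} for i in range(100)]
+
+
+def is_sports(doc):
+    return doc["category"] == "sports"
+
+
+after = post_filter(ranked, 10, is_sports)
+during = filtered_search(ranked, 10, is_sports)
+print(len(after), len(during))
+```
+
+In this toy data, post-filtering the top 10 leaves only 3 sports articles, while filtering during the search returns the requested 10.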
+
+As a note, strict filters that filter away a large part of the corpus would entail that many candidates in the graph are skipped while searching for the results that fulfill the filters. This can take an exponential amount of time. For this case, Vespa falls back to a linear, brute-force search over the few documents that match the filter for efficiency.
+
+
+## Conclusion
+
+We now have a basic recommendation system up and running. We can query for a user, retrieve the embedding vector and use that for querying the news articles. Right now, this means two calls to Vespa. In the [next part of the tutorial](/en/learn/tutorials/news-6-recommendation-with-searchers), we will introduce `searchers`, which allow for custom logic during query processing inside the Vespa cluster, requiring only one pass from the client to Vespa.
diff --git a/mintlify-docs/en/learn/tutorials/news-6-recommendation-with-searchers.mdx b/mintlify-docs/en/learn/tutorials/news-6-recommendation-with-searchers.mdx
new file mode 100644
index 0000000000..92d87c8fd9
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-6-recommendation-with-searchers.mdx
@@ -0,0 +1,278 @@
+---
+title: "News search and recommendation tutorial - Searchers"
+---
+
+This is the sixth part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In the previous part of this series, we set up a recommendation system that, given a user id, needed two requests to generate a recommendation: the first to retrieve the user embedding, and the second to find the nearest-neighbor news articles. In this part, we'll introduce `Searchers`, which are processors that can modify queries before passing them along to search. These allow us to pull the logic from the Python scripts into Vespa.
+
+For reference, the final state of this tutorial can be found in the [app-6-recommendation-with-searchers](https://github.com/vespa-engine/sample-apps/tree/master/news/app-6-recommendation-with-searchers) directory of the `news` sample application.
+
+
+## Searchers and document processors
+
+First, let's revisit Vespa's overall architecture:
+
+
+
+
+
+Recall that the application package contains everything necessary to run the application. When this is deployed, the config cluster takes care of distributing the services to the various nodes. In particular, the two main types of nodes are the stateless `container` nodes and the stateful `content` nodes.
+
+All requests pass through the `container` cluster before being passed along to the `content` cluster, where the actual retrieval and ranking occurs. Queries pass through a chain of Searchers, each of which may do a small amount of processing. This can be seen by adding `&trace.level=5` to a query:
+
+```json expandable
+{ "message": "Invoke searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
+{ "message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
+{ "message": "Invoke searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'" },
+{ "message": "Invoke searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'" },
+{ "message": "Invoke searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'" },
+
+{ "message": "Federating to [mind]" },
+
+{ "message": "Got 10 hits from source:mind" },
+{ "message": "Return searcher 'federation in native'" },
+
+{ "message": "Return searcher 'com.yahoo.search.yql.MinimalQueryInserter in vespa'" },
+{ "message": "Return searcher 'com.yahoo.prelude.searcher.FieldCollapsingSearcher in vespa'" },
+{ "message": "Return searcher 'com.yahoo.prelude.querytransform.PhrasingSearcher in vespa'" },
+{ "message": "Return searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
+{ "message": "Return searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
+```
+
+This shows a small sample of the additional output when using `trace.level`. Note the invocations of the Searchers. Each Searcher gets invoked along a chain, and the last Searcher in the chain sends the post-processed query to the search backend. When the results come back, the processing passes back up the chain. The Searchers can then process the results before passing them to the previous Searcher, and ultimately back as a response to the query.
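+
+The invoke/return ordering in the trace can be illustrated with a small toy model. This is only a sketch in plain Python, not the Vespa API: the `Searcher`, `Execution` and `run_chain` names below are hypothetical stand-ins that mimic how each component hands the query to the next one and sees the result on the way back.
+
+```python
+class Searcher:
+    """Toy stand-in for a Searcher: logs on the way down and on the way up."""
+
+    def __init__(self, name, log):
+        self.name = name
+        self.log = log
+
+    def search(self, query, execution):
+        self.log.append(f"Invoke {self.name}")
+        result = execution.search(query)  # hand over to the next searcher in the chain
+        self.log.append(f"Return {self.name}")
+        return result
+
+
+class Execution:
+    """Walks the chain; once the chain is exhausted, the 'backend' answers."""
+
+    def __init__(self, chain, backend, index=0):
+        self.chain = chain
+        self.backend = backend
+        self.index = index
+
+    def search(self, query):
+        if self.index == len(self.chain):
+            return self.backend(query)
+        searcher = self.chain[self.index]
+        next_execution = Execution(self.chain, self.backend, self.index + 1)
+        return searcher.search(query, next_execution)
+
+
+def run_chain(names, query):
+    """Build a chain from the given names and run one query through it."""
+    log = []
+    chain = [Searcher(name, log) for name in names]
+    backend = lambda q: [f"hit for {q}"]  # hypothetical search backend
+    result = Execution(chain, backend).search(query)
+    return result, log
+```
+
+Calling `run_chain(["A", "B"], "q")` logs the invocations in chain order and the returns in reverse order, which is the same nesting seen in the trace output above.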
+
+
+Adding a [trace.level](/en/reference/api/query#trace.level) is generally helpful when debugging Vespa queries.
+
+
+So, [Searchers](/en/applications/searchers) are Java components that do some kind of processing along the query chain; either modifying the query before the actual search, modifying the results after the search, or some combination of both.
+
+Developers can provide their own Searchers and inject them into the query chain. We'll capitalize on this and create a Searcher that performs essentially the same task that [user_search.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/user_search.py) does: retrieve a user embedding and do a news article search based on that. In the process, we'll only pass a `user_id` to Vespa instead of a full YQL query: `/search/?user_id=U33527&searchchain=user`
+
+The Searcher will take care of creating the actual query for us. Let's get started.
+
+
+## Adding a user profile Searcher
+
+While the `content` layer in Vespa is written in C++ for maximum performance, the `container` layer is in Java for flexibility. So, all Searchers and thus custom Searchers are written in Java. Refer to [the guide on Searcher development](/en/applications/searchers) for more information.
+
+We want to create a Searcher that takes a `user_id`, issues a query to find the corresponding embedding, then issues a second query to retrieve the news articles.
+
+To do this, we create a `UserProfileSearcher` that extends the base Searcher class `com.yahoo.search.Searcher`. This Searcher must implement a single method, `search`, and is responsible for passing the query to the next Searcher in the chain. A minimal example:
+
+```java
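+import com.yahoo.search.Query;
+import com.yahoo.search.Result;
+import com.yahoo.search.Searcher;
+import com.yahoo.search.searchchain.Execution;
+
+public class UserProfileSearcher extends Searcher {
+    @Override
+    public Result search(Query query, Execution execution) {
+        // ... process query
+        Result results = execution.search(query);
+        // ... process results
+        return results;
+    }
+}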
+```
+
+What we do before passing the query along (in `execution.search(query)`) and before returning the results is completely up to us. We implement our `UserProfileSearcher` like this:
+
+```java expandable
+public class UserProfileSearcher extends Searcher {
+
+ public Result search(Query query, Execution execution) {
+
+ // Get tensor and read items from user profile
+ Object userIdProperty = query.properties().get("user_id");
+ if (userIdProperty != null) {
+
+ // Retrieve user embedding by doing a search for the user_id and extract the tensor
+ Tensor userEmbedding = retrieveUserEmbedding(userIdProperty.toString(), execution);
+
+ // Create a new search using the user's embedding tensor
+ NearestNeighborItem nn = new NearestNeighborItem("embedding", "user_embedding");
+ nn.setTargetNumHits(query.getHits());
+ nn.setAllowApproximate(true);
+
+ query.getModel().getQueryTree().setRoot(nn);
+ query.getRanking().getFeatures().put("query(user_embedding)", userEmbedding);
+ query.getModel().setRestrict("news");
+
+ // Override default rank profile
+ if (query.getRanking().getProfile().equals("default")) {
+ query.getRanking().setProfile("recommendation");
+ }
+ }
+
+ return execution.search(query);
+ }
+
+ private Tensor retrieveUserEmbedding(String userId, Execution execution) {
+ Query query = new Query();
+ query.getModel().setRestrict("user");
+ query.getModel().getQueryTree().setRoot(new WordItem(userId, "user_id"));
+ query.setHits(1);
+
+ Result result = execution.search(query);
+ execution.fill(result); // This is needed to get the actual summary data
+
+ if (result.getTotalHitCount() == 0)
+ throw new RuntimeException("User id " + userId + " not found...");
+ return (Tensor) result.hits().get(0).getField("embedding");
+ }
+
+}
+```
+
+First, we retrieve the `user_id` from the query. If it is present, we call the `retrieveUserEmbedding` method, which creates a new `Query` to find the user's embedding. This is a straightforward search restricted to the `user` document type. Since the `user_id` is unique, we only expect a single hit. We then extract the `embedding` tensor from the user document.
+
+
+We explicitly call *fill* on the result before reading its fields. A query is usually passed to the search backend at least twice: once to retrieve the top-ranked results, and once more to retrieve the summary data of the final result set. This avoids sending excess data between services. For instance, when searching for the top 10 results with two search backends, each backend retrieves the top 10 results from the local content on that node. A Searcher then determines the global top 10 ranked results (potentially including diversification) and only issues a *fill* to retrieve the summary data for those top 10.
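+
+The two-pass idea can be sketched in a few lines of plain Python. This is a simplified illustration, not Vespa's implementation: `first_phase`, `merge_and_fill` and the `fetch_summary` callback are hypothetical names, and each node's hits are modelled as `(score, doc_id)` pairs.
+
+```python
+import heapq
+
+
+def first_phase(node_hits, k):
+    """Each content node returns only (score, doc_id) for its local top k."""
+    return heapq.nlargest(k, node_hits)
+
+
+def merge_and_fill(per_node_hits, k, fetch_summary):
+    """Merge the local top-k lists, then fetch summary data for the global top k only."""
+    partial = [first_phase(hits, k) for hits in per_node_hits]
+    merged = heapq.nlargest(k, (hit for hits in partial for hit in hits))
+    # The 'fill' step: summary data is requested only for the hits that survived the merge.
+    return [
+        {"id": doc_id, "relevance": score, **fetch_summary(doc_id)}
+        for score, doc_id in merged
+    ]
+```
+
+With two nodes and `k=2`, `fetch_summary` is called for just two documents, no matter how many candidates each node ranked locally.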
+
+
+Now that we've retrieved the user embedding, we programmatically set up a nearest-neighbor search, and add the user embedding to the query as the ranking feature `query(user_embedding)`. The search is then passed along to the next Searcher in the chain. We do not need to explicitly fill the result here, as that is guaranteed to happen before ultimately rendering the results.
+
+Again, note that all this is pretty much the same as what we did in [user_search.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/user_search.py) - just in Java.
+
+
+## Adding a search chain
+
+To add this Searcher to Vespa, we need to modify `services.xml`:
+
+```xml
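+<container id="default" version="1.0">
+    <search>
+        <chain id="user" inherits="vespa">
+            <searcher bundle="news-recommendation-searcher" id="ai.vespa.example.UserProfileSearcher" />
+        </chain>
+    </search>
+    ...
+</container>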
+```
+
+Here, we instruct Vespa to add a new search chain called `user`, which inherits the default `vespa` search chain and includes our `UserProfileSearcher`. Note that Vespa expects this Searcher to be in a bundle called `news-recommendation-searcher`, so we need to compile and package this code. In Vespa, we use [Apache Maven](https://maven.apache.org/) for this, which requires a project object model, or `pom.xml`, to specify how to build this artifact.
+
+We won't go through that here; please refer to [app-6-recommendation-with-searchers](https://github.com/vespa-engine/sample-apps/tree/master/news/app-6-recommendation-with-searchers) in the `news` sample application for details. Note that this application's directory structure has changed compared to the previous parts in the tutorial. The structure is now:
+
+```
+.
+├── pom.xml
+└── src
+    └── main
+        ├── application
+        │   ├── schemas
+        │   │   ├── news.sd
+        │   │   └── user.sd
+        │   ├── search
+        │   │   └── query-profiles
+        │   │       ├── default.xml
+        │   │       └── types
+        │   │           └── root.xml
+        │   └── services.xml
+        └── java
+            └── ai
+                └── vespa
+                    └── example
+                        └── UserProfileSearcher.java
+
+The Vespa application now lies under `src/main/application`, and all custom Java components are under `src/main/java` as is standard in a Java project. We can now compile and package this application:
+
+```bash
+$ (cd app-6-recommendation-with-searchers && mvn package)
+```
+
+[pom.xml](https://github.com/vespa-engine/sample-apps/blob/master/news/app-6-recommendation-with-searchers/pom.xml) is set up to create an artifact called `news-recommendation-searcher`, which is referred to in `services.xml`. When the command finishes, we can see this artifact in `target/application.zip`. This contains the full Vespa application, with Java components - deploy it:
+
+```bash
+$ vespa deploy --wait 300 app-6-recommendation-with-searchers
+```
+
+After the application has been deployed, we are ready to test. Refer to [the Searcher development guide](/en/applications/searchers) for much more on custom Searchers and the Java API.
+
+
+## Testing
+
+Now we can search for a user's recommended news articles directly from the `user_id`:
+
+```bash
+$ vespa query -v \
+ 'user_id=U33527' \
+ 'searchChain=user'
+```
+
+This should now return the top 10 recommended news articles for this user. Indeed, if we now add `trace.level=5`, we see the Searcher being invoked:
+
+```bash
+$ vespa query -v \
+ 'user_id=U33527' \
+ 'searchChain=user' \
+ 'trace.level=5'
+```
+
+```json
+{ "message": "Invoke searcher 'ai.vespa.example.UserProfileSearcher in user'" },
+{ "message": "Invoke searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
+{ "message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
+
+{ "message": "Return searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'" },
+{ "message": "Return searcher 'com.yahoo.search.querytransform.WeakAndReplacementSearcher in vespa'" },
+{ "message": "Return searcher 'ai.vespa.example.UserProfileSearcher in user'" },
+```
+
+Note that the `searchChain` query parameter can be set as default, so this does not have to be passed with the query request. This is done by adding it to the default query profile in [src/main/application/search/query-profiles/default.xml](https://github.com/vespa-engine/sample-apps/blob/master/news/app-6-recommendation-with-searchers/src/main/application/search/query-profiles/default.xml):
+
+```xml
+<query-profile id="default" type="root">
+    <field name="searchChain">user</field>
+</query-profile>
+```
+
+
+[src/python/evaluate.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/evaluate.py) can now be modified to use this Searcher. However, to properly calculate the metrics, the Searcher needs to be modified to accept a list of news article IDs and only recall those. We'll leave this as an exercise to the reader.
+
+
+
+## Document processors
+
+As can be seen in the architecture overview above, there are other component types as well. One is Document Processors, which are conceptually similar to Searchers. When a document is fed to Vespa, it goes through a chain of Document Processors before being passed to the content node for storage and indexing.
+
+Vespa also supports custom Document Processors; refer to [the guide for document processing](/en/applications/document-processors) for more information.
+
+
+## Improving recommendation diversity
+
+If we take a closer look at the query above, and search for the top 100 hits:
+
+```bash
+$ vespa query \
+ 'user_id=U33527' \
+ 'searchChain=user' \
+ 'hits=100' | \
+ grep "category\": \"sports" | wc -l
+```
+
+We see that all the hits are of category `sports` for this user. Actually, they are all from the `football_nfl` sub-category. Indeed, from inspection of the impressions file, this user has only clicked on `sports` articles. So, while this can seem a success, we generally would like to give users some form of diversity to keep them interested. This is also to combat the negative effects of filter bubbles.
+
+One way to do this is to create Searchers that perform multiple queries to the backend with various rank profiles. In the above, we were only retrieving results from the `recommendation` rank profile. Still, we can have any number of rank profiles. By searching in multiple rank profiles, we can blend the results from these sources before returning to the user, and thus introduce diversity.
+
+This is often called federation. Vespa supports federation both from internal and external sources, see [the guide on federation](/en/querying/federation) for more information.
+
+
+If the same document can be returned from multiple sources, it's important to perform some form of de-duplication before returning the final results!
+
+
+A common way of performing blending from multiple sources is to implement a specialized blending Searcher. This Searcher can, for instance, use an approach such as [reciprocal rank fusion](https://research.google/pubs/pub36196/), which gives decent results. However, when it comes to diversity, there are usually some goals or restrictions that need to be controlled. In this case, the business rules can be hand-written in the blending Searcher. Searchers are flexible enough to perform any type of processing.
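+
+As a rough illustration of the blending idea, reciprocal rank fusion can be sketched in plain Python as below. This is not part of the sample application: the function name is hypothetical, the inputs are assumed to be lists of document ids ordered best-first, and the constant `k=60` is the value commonly used with this method rather than something this tutorial prescribes.
+
+```python
+from collections import defaultdict
+
+
+def reciprocal_rank_fusion(ranked_lists, k=60):
+    """Blend several ranked lists of document ids into one list.
+
+    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1).
+    Documents returned by several sources are merged into a single entry,
+    so de-duplication happens as part of the blending.
+    """
+    scores = defaultdict(float)
+    for ranking in ranked_lists:
+        for rank, doc_id in enumerate(ranking, start=1):
+            scores[doc_id] += 1.0 / (k + rank)
+    # Highest fused score first; ties are broken by document id for determinism.
+    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
+```
+
+For example, blending `["a", "b", "c"]` with `["b", "d"]` puts `b` first, since it is the only document returned by both sources. Business rules for diversity, such as a cap on hits per category, would be applied on top of an ordering like this.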
+
+
+## Conclusion
+
+We now have a Vespa application up and running that takes a single `user_id` and returns recommendations for that user. In the [next part of the tutorial](/en/learn/tutorials/news-7-recommendation-with-parent-child), we'll address what to do when new users without any history visit our recommendation system.
diff --git a/mintlify-docs/en/learn/tutorials/news-7-recommendation-with-parent-child.mdx b/mintlify-docs/en/learn/tutorials/news-7-recommendation-with-parent-child.mdx
new file mode 100644
index 0000000000..ef188a0c59
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/news-7-recommendation-with-parent-child.mdx
@@ -0,0 +1,250 @@
+---
+title: "News Recommendation Tutorial - parent child and tensor ranking"
+---
+
+This is the seventh part of the tutorial series for setting up a Vespa application for personalized news recommendations. The parts are:
+
+1. [Getting started](/en/learn/tutorials/news-1-deploy-an-application)
+2. [A basic news search application](/en/learn/tutorials/news-2-basic-feeding-and-query) - application packages, feeding, query
+3. [News search](/en/learn/tutorials/news-3-searching) - sorting, grouping, and ranking
+4. [Generating embeddings for users and news articles](/en/learn/tutorials/news-4-embeddings)
+5. [News recommendation](/en/learn/tutorials/news-5-recommendation) - partial updates (news embeddings), ANNs, filtering
+6. [News recommendation with searchers](/en/learn/tutorials/news-6-recommendation-with-searchers) - custom searchers, doc processors
+7. [News recommendation with parent-child](/en/learn/tutorials/news-7-recommendation-with-parent-child) - parent-child, tensor ranking
+8. Advanced news recommendation - intermission - training a ranking model
+9. Advanced news recommendation - ML models
+
+In this part of the series, we'll introduce a new ranking signal: category click-through rate (CTR). The idea is that we can recommend popular content for users that don't have a click history yet. Rather than just recommending based on articles, we recommend based on categories. However, these global CTR values can often change continuously, so we need an efficient way to update this value for all documents. We'll do that by introducing parent-child relationships between documents in Vespa. We will also use sparse tensors directly in ranking.
+
+For reference, the final state of this tutorial can be found in the [app-7-parent-child](https://github.com/vespa-engine/sample-apps/tree/master/news/app-7-parent-child) directory of the `news` sample application.
+
+
+## Parent-child relationships in Vespa
+
+Recall that most features come from either attributes in the document or parameters passed with the query when ranking a document. Parent-child relationships introduce the option of using attributes found in other documents. Parent-child relationships work as a form of scalable document joins.
+
+For instance, assume we have a global CTR value for the sports category of `0.2`. If we want to use this value during ranking, we could have a field in each news article holding this value. However, when we need to update this value, we need to issue a partial update to all documents, which seems wasteful.
+
+Another way would be to take inspiration from the `UserProfileSearcher`, where we retrieved the tensor embedding for a user in a search before passing that with the news article query. We could have a single document holding all global values and retrieve that with each query. However, that isn't particularly efficient.
+
+For these cases, Vespa introduced the [parent-child relationship](/en/schemas/parent-child). Parents are global documents, which are automatically distributed to all content nodes. Other documents can reference these parents and "import" values for use in ranking. The benefit is that the global category CTR values only need to be written to one place: the global document.
+
+Please see the [guide on parent-child relationships](/en/schemas/parent-child) for more information and examples.
+
+
+## Setting up a global category CTR document
+
+So, let's set this up for our application. First we need to add a new document type to hold the CTR values. We introduce the `category_ctr` document type, which we add in `schemas/category_ctr.sd`:
+
+```sd
+schema category_ctr {
+    document category_ctr {
+        field ctrs type tensor(category{}) {
+            indexing: attribute
+            attribute: fast-search
+        }
+    }
+}
+```
+
+This document holds a single field: a [tensor](/en/ranking/tensor-user-guide) of type `tensor(category{})`. This is a tensor with a single sparse dimension, which is slightly different from the tensors we have seen so far. Sparse tensors have strings as dimension addresses rather than a numeric index. More concretely, an example of such a tensor is (using the [tensor literal form](/en/reference/ranking/tensor#tensor-literal-form)):
+
+```
+{
+ {category: entertainment}: 0.2,
+ {category: news}: 0.3,
+ {category: sports}: 0.5,
+ {category: travel}: 0.4,
+ {category: finance}: 0.1,
+ ...
+}
+```
+
+This tensor holds all the CTR scores for all the categories. When updating this tensor, we can update individual cells if we don't need to update the whole tensor. This is called [tensor modify](/en/reference/schemas/document-json-format#tensor-modify) and can be helpful when you have large tensors.
+
+To use this document, add it to `services.xml`:
+
+```xml
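+<content id="mind" version="1.0">
+    <documents>
+        <document type="news" mode="index" />
+        <document type="user" mode="index" />
+        <document type="category_ctr" mode="index" global="true" />
+    </documents>
+    ...
+</content>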
+```
+
+Notice that we've set `global="true"`, configuring Vespa to keep a copy of these documents on all content nodes. This is required for using it in a parent-child relationship. This also puts limits on how many parent documents a system can have, as all nodes need to index all parent documents.
+
+
+## Importing parent values in child documents
+
+To use the `category_ctr` tensor when ranking `news` documents, we need to "import" the tensor into the child document type. There are two things to set up:
+
+1. The reference to the parent document
+2. Which fields to import.
+
+Modify `schemas/news.sd`:
+
+```sd
+schema news {
+    document news {
+        ...
+        field category_ctr_ref type reference<category_ctr> {
+            indexing: attribute
+        }
+        ...
+    }
+    import field category_ctr_ref.ctrs as global_category_ctrs {}
+}
+```
+
+The field `category_ctr_ref` is a `reference` field pointing to a document of type `category_ctr`. When feeding this field, Vespa expects the fully qualified document ID. For instance, if our global CTR document has the id `id:category_ctr:category_ctr::global`, that is what this field must be set to. Usually, there are many parent documents that children can reference, but our application will only hold one.
+
+You can think of the reference field as holding a foreign key to the parent document, and the import as performing a real-time join between the child and parent document using this foreign key. The imported values are usable as if they were stored with the child.
+
+The `import` statement defines that we should import the `ctrs` field from the document referenced in the `category_ctr_ref` field. We name this as `global_category_ctrs`, and we can reference this as `attribute(global_category_ctrs)` during ranking.
+
+
+## Tensor expressions in ranking
+
+Up until this point, we've only used tensors as storage. We used tensors to hold news and user embeddings, and Vespa used these tensors to calculate the dot product in nearest-neighbor searches.
+
+However, Vespa has a [rich language](/en/ranking/tensor-user-guide#ranking-with-tensors) to perform calculations with tensors. We'll exploit that by looking up the `news` article's category in the global CTR tensor and using that as a feature in ranking.
+
+Our `news` document currently has a field that holds the `category` as a string. Unfortunately, tensor expressions only work on tensors, so we need to add a new field to hold the category tensor:
+
+```sd
+field category_tensor type tensor(category{}) {
+ indexing: attribute
+}
+```
+
+Using a tensor in this way also enables a document to have multiple categories, but our dataset only has a single category per article. For instance, we can represent the `finance` category of a `news` article like:
+
+```
+{category: finance}: 1.0
+```
+
+Since this is a sparse tensor, we don't need to mention the other categories. Now, we can use this tensor to calculate the global CTR score for an article's category:
+
+```
+attribute(category_tensor) * attribute(global_category_ctrs)
+```
+
+Given the global category CTR example above, this would result in the value `0.1`. How did we arrive at this? Recall that the value for the cell `finance` in the `category` dimension of the example above had a value of `0.1`. The multiplication of these two tensors is conceptually an "inner join", so you can take the matching cells and multiply them together. Due to the sparseness of the tensor, only the `finance` cell matches, and that value is multiplied by the `1.0` in this document. So in this case, this would effectively work as a lookup.
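+
+The "inner join" behaviour can be mimicked with ordinary dictionaries. The following is only a plain-Python analogy, not Vespa's tensor engine: each sparse tensor with a single `category` dimension is modelled as a dict from label to value, and `sparse_multiply` is a hypothetical helper name.
+
+```python
+def sparse_multiply(a, b):
+    """Multiply two one-dimensional sparse tensors, keeping only labels present in both."""
+    return {label: a[label] * b[label] for label in a.keys() & b.keys()}
+
+
+# The example CTR tensor from above, and the category tensor of a finance article.
+global_category_ctrs = {
+    "entertainment": 0.2,
+    "news": 0.3,
+    "sports": 0.5,
+    "travel": 0.4,
+    "finance": 0.1,
+}
+category_tensor = {"finance": 1.0}
+
+product = sparse_multiply(category_tensor, global_category_ctrs)
+
+# Summing the remaining cells collapses the one-cell tensor into a plain number.
+category_ctr = sum(product.values())
+```
+
+Only the `finance` cell survives the join, so `product` is `{"finance": 0.1}` and `category_ctr` is `0.1`.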
+
+
+Much more complex operations are available; refer to the [tensor user guide](/en/ranking/tensor-user-guide#ranking-with-tensors) for more information.
+
+
+Let's add a new rank profile to do this calculation:
+
+```sd expandable
+rank-profile recommendation_with_global_category_ctr inherits recommendation {
+    function category_ctr() {
+        expression: sum(attribute(category_tensor) * attribute(global_category_ctrs))
+    }
+    function nearest_neighbor() {
+        expression: closeness(field, embedding)
+    }
+    first-phase {
+        expression: nearest_neighbor * category_ctr
+    }
+    summary-features {
+        attribute(category_tensor)
+        attribute(global_category_ctrs)
+        category_ctr
+        nearest_neighbor
+    }
+}
+```
+
+Here, we've added a first phase ranking expression that multiplies the nearest-neighbor score with the category CTR score, implemented with the functions `nearest_neighbor` and `category_ctr`, respectively.
+
+We've added a `sum` function around the `category_ctr` expression - this is simply to unbox the single-value tensor to a double value suitable for use in the first phase expression.
+
+Note that, as a first attempt, we just multiply the nearest-neighbor with the category CTR score. This is not necessarily the correct way to combine these values, but we'll get back to that in a bit.
+
+We've added a section for [summary features](/en/reference/schemas/schemas#summary-features). This is simply a list of features that will be returned with the hit when using this rank profile. Recall that we can specify which fields should be returned in the summary with the `indexing: summary` statement on each field. The `summary-features` can also include the results of functions. This is a helpful debugging tool, and we'll see how this looks after feeding some data.
+
+
+## Feeding parent and child updates
+
+Deploy the application:
+
+```bash
+$ (cd app-7-parent-child && mvn package)
+```
+
+```bash
+$ vespa deploy --wait 300 app-7-parent-child
+```
+
+After deploying the application, we are ready to feed a global CTR document. For convenience, we've created [create_category_ctrs.py](https://github.com/vespa-engine/sample-apps/blob/master/news/src/python/create_category_ctrs.py) that reads the MIND content and impression data to calculate CTR scores for each category. This produces two files in the `mind` directory:
+
+1. `mind/global_category_ctr.json` - a feed file for the global CTR document containing CTR score for each category.
+2. `mind/news_category_ctr_update.json` - a feed file for partially updating the `news` articles with the reference to the global CTR document as well as the category tensor.
+
+These files can now be fed to Vespa, but note that `mind/global_category_ctr.json` needs to be fed first because the global document needs to exist before the child documents can reference it.
+
+Create feed files:
+
+```bash
+$ ./src/python/create_category_ctrs.py mind
+```
+
+Feed the created feed files:
+
+```bash
+$ vespa feed mind/global_category_ctr.json --target http://localhost:8080
+$ vespa feed mind/news_category_ctr_update.json --target http://localhost:8080
+```
+
+
+## Testing the application
+
+After feeding the above files, we can now test the application with a query:
+
+```bash
+$ vespa query \
+ 'user_id=U33527' \
+ 'ranking.profile=recommendation_with_global_category_ctr' \
+ 'hits=10'
+```
+
+Note that we specify the rank profile to use. The first result of this query is something like the following:
+
+```json expandable
+"fields": {
+ "title": "Matthew Stafford's status vs. Bears uncertain, Sam Martin will play",
+ "abstract": "Stafford's start streak could be in jeopardy, according to Ian Rapoport.",
+ "category": "sports",
+ ...
+ "summary-features": {
+ "attribute(category_tensor)": {"cells": [{"address": { "category": "sports"}, "value":1.0 }]},
+ "attribute(global_category_ctrs)": {"cells": [
+ ...
+ { "address": { "category": "sports" }, "value": 0.05611187964677811 },
+ ...
+ ]},
+ "rankingExpression(category_ctr)": 0.05611187964677811,
+ "rankingExpression(nearest_neighbor)": 0.14914761220236453,
+ }
+ ...
+ "relevance": 0.008368952865503413,
+}
+```
+
+This is clearly a sports article. The global CTR document is also listed here, and the CTR score for the `sports` category is `0.0561`. Thus, the result of the `category_ctr` function is `0.0561` as intended. The `nearest_neighbor` score is `0.149`, and the resulting relevance score is `0.00837`, the product of the two. So, this worked as expected.
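+
+As a quick sanity check, the first-phase expression `nearest_neighbor * category_ctr` can be recomputed by hand from the summary features in the response. The snippet below is plain Python using the values copied from the result above, nothing Vespa-specific.
+
+```python
+import math
+
+# Values copied from the "summary-features" section of the hit above.
+category_ctr = 0.05611187964677811
+nearest_neighbor = 0.14914761220236453
+
+# First-phase expression of the rank profile: nearest_neighbor * category_ctr
+relevance = nearest_neighbor * category_ctr
+
+# Compare against the "relevance" value reported for the hit.
+matches_reported = math.isclose(relevance, 0.008368952865503413, rel_tol=1e-6)
+```
+
+Here `matches_reported` is `True`, confirming that the reported relevance is the product of the two function outputs.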
+
+If we were to feed another value to the global CTR document, this updated value is immediately available. As such, the system responds quickly to changes in the global parameters.
+
+Now, a simple multiplication between these features might not give us what we want. For instance, these features have different average values and different standard deviations. Particularly, if we add multiple additional features, just multiplying them together will probably not give a great user experience. Instead of a hand-tuned final relevancy calculation as demonstrated above, we could use a machine learned function with these as feature inputs.
+
+Ultimately, these features are computed in real-time for every news article during ranking. These features can then be added to any machine-learned ranking model. Vespa supports gradient-boosted trees from [XGBoost](/en/ranking/xgboost) and [LightGBM](/en/ranking/lightgbm), and also neural networks in [ONNX](/en/ranking/onnx) format, exported from popular ML frameworks like [PyTorch](https://pytorch.org/) and [TensorFlow](https://www.tensorflow.org/).
+
+
+## Conclusion
+
+This tutorial introduced parent-child relationships and demonstrated them through a global CTR feature we used in ranking. As this feature was based on tensors, we also introduced ranking with tensor expressions. For a real-world use case using parent-child tensors, see this [blog post](https://blog.vespa.ai/parent-child-joins-tensors-content-recommendation/).
diff --git a/mintlify-docs/en/learn/tutorials/rag-blueprint.mdx b/mintlify-docs/en/learn/tutorials/rag-blueprint.mdx
new file mode 100644
index 0000000000..618df21bcf
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/rag-blueprint.mdx
@@ -0,0 +1,1381 @@
+---
+title: "RAG Blueprint"
+description: "Many of our users use Vespa to power large scale RAG Applications."
+---
+
+This blueprint aims to exemplify many of the best practices we have learned while supporting these users.
+
+While many RAG tutorials exist, this blueprint provides a customizable template that:
+
+- Can [(auto)scale](/en/operations/autoscaling) with your data size and/or query load.
+- Is fast and [production grade](/en/operations/production-deployment).
+- Enables you to build RAG applications with state-of-the-art quality.
+
+This tutorial will show how we can develop a _high-quality_ RAG application with an evaluation-driven mindset, while being a resource you can revisit for making informed choices for your own use case.
+
+We will guide you through the following steps:
+
+1. [Our use case](#our-use-case)
+2. [Data modeling](#data-modeling)
+3. [Structuring your Vespa application](#structuring-your-vespa-application)
+4. [Configuring match-phase (retrieval)](#configuring-match-phase-retrieval)
+5. [First-phase ranking](#first-phase-ranking)
+6. [Second-phase ranking](#second-phase-ranking)
+7. [(Optional) Global-phase ranking](#optional-global-phase-ranking)
+
+All the accompanying code can be found in our [sample app](https://github.com/vespa-engine/sample-apps/tree/master/rag-blueprint) repo.
+
+Each step will contain reasoning behind the choices and design of the blueprint, as well as pointers for customizing your own application.
+
+Below, you can see a diagram of the indexing (document side), retrieval and ranking of the sample application, which will be explained in more detail in the following sections.
+
+
+**Note:**
+
+The elements in the diagram are clickable, and will lead you to the relevant sections, either of this tutorial or in the Vespa documentation.
+
+
+
+
+
+
+[Click to open diagram in full size](/assets/img/tutorials/rag-blueprint-overview.svg)
+
+
+**Note:**
+
+This is not a **'Deploy RAG in 5 minutes'** tutorial (although you _can_ technically do that by following the README in our [sample app](https://github.com/vespa-engine/sample-apps/tree/master/rag-blueprint)). The focus is on providing you with the insights and tools to apply it to your own use case. Therefore, we suggest taking your time to look at the code in the sample app and run the described steps.
+
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+ - See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- [uv](https://docs.astral.sh/uv/) for Python dependency handling.
+
+
+
+## Our use case
+
+The sample use case is a document search application for a user who wants to get answers and insights quickly from a collection of company documents, notes, learning material, and training logs. To make the blueprint more realistic, we required a dataset with more structured fields than are commonly found in public datasets. Therefore, we used a Large Language Model (LLM) to generate a custom one.
+
+It is a toy example, with only 100 documents, but we think it will illustrate the necessary concepts. You can also feel confident that the blueprint will provide a starting point that can scale as you want, with minimal changes.
+
+Below you can see a sample document from the dataset:
+
+```json
+{
+ "put": "id:doc:doc::78",
+ "fields": {
+ "created_timestamp": 1717750000,
+ "modified_timestamp": 1717750000,
+ "text": "# Feature Brainstorm: SynapseFlow Model Monitoring Dashboard v1\n\n**Goal:** Provide users with basic insights into their deployed model's performance and health.\n\n**Key Metrics to Display:**\n- **Inference Latency:** Avg, p95, p99 (Histogram).\n- **Request Rate / Throughput:** Requests per second/minute.\n- **Error Rate:** Percentage of 5xx errors.\n- **CPU/Memory Usage:** Per deployment/instance.\n- **GPU Usage / Temp (if applicable).**\n\n**Visualizations:**\n- Time series graphs for all key metrics.\n- Ability to select time range (last hour, day, week).\n- Filter by deployment ID.\n\n**Data Sources:**\n- Prometheus metrics from model server (see `code_review_pr123_metrics.md`).\n- Kubernetes metrics (via Kube State Metrics or cAdvisor).\n\n**Future Ideas (v2+):**\n- Data drift detection.\n- Concept drift detection.\n- Alerting on anomalies or threshold breaches.\n- Custom metric ingestion.\n\n## (UI mock-up sketches, specific Prometheus queries)",
+ "favorite": true,
+ "last_opened_timestamp": 1717750000,
+ "open_count": 3,
+ "title": "feature_brainstorm_monitoring_dashboard.md",
+ "id": "78"
+ }
+}
+```
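
The feed operation above can be produced programmatically. The following is a minimal sketch, assuming the `doc` namespace and document type from the sample document; the helper name `make_put_operation` is hypothetical and not part of the sample app.

```python
import json


def make_put_operation(doc_id: str, fields: dict) -> dict:
    """Build a 'put' operation for document type 'doc' in namespace 'doc'."""
    return {
        "put": f"id:doc:doc::{doc_id}",
        # The sample document repeats the id as a regular field.
        "fields": {**fields, "id": doc_id},
    }


operation = make_put_operation(
    "78",
    {
        "title": "feature_brainstorm_monitoring_dashboard.md",
        "open_count": 3,
        "favorite": True,
    },
)
# Writing one JSON object per line gives a JSONL feed file.
feed_line = json.dumps(operation)
```

Keeping the document ID scheme in one helper makes it harder for the `put` ID and the `id` field to drift apart.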
+
+To evaluate the quality of the RAG application, we also need a set of representative queries with annotated relevant documents. Crucially, these queries should thoroughly cover your expected use case. More is better, but _some_ evaluation is always better than none.
+
+We used `gemini-2.5-pro` to create our queries and relevant document labels. Please check out our [blog post](https://blog.vespa.ai/improving-retrieval-with-llm-as-a-judge/) to learn more about using LLM-as-a-judge.
+
+We decided to generate some queries that need several documents to provide a good answer, and some that only need one document. If these queries are representative of the use case, we will show that they can be a great starting point for creating an (initial) ranking expression that can be used for retrieving and ranking candidate documents. It can (and should) also be improved later, for example by collecting user interaction data, human labeling, and/or using an LLM to generate relevance feedback on the results of the initial ranking expression.
+
+
+## Data modeling
+
+Here is the schema that we will use for our sample application.
+
+```txt expandable
+# Copyright Vespa.ai. Licensed under the terms of the Apache 2.0 license. See LICENSE in the project root.
+schema doc {
+
+ document doc {
+
+ field id type string {
+ indexing: summary | attribute
+ }
+
+ field title type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+
+ field text type string {
+
+ }
+
+ field created_timestamp type long {
+ indexing: attribute | summary
+ }
+ field modified_timestamp type long {
+ indexing: attribute | summary
+ }
+
+ field last_opened_timestamp type long {
+ indexing: attribute | summary
+ }
+ field open_count type int {
+ indexing: attribute | summary
+ }
+ field favorite type bool {
+ indexing: attribute | summary
+ }
+
+ }
+
+ field title_embedding type tensor<int8>(x[96]) {
+ indexing: input title | embed | pack_bits | attribute | index
+ attribute {
+ distance-metric: hamming
+ }
+ }
+
+ field chunks type array<string> {
+ indexing: input text | chunk fixed-length 1024 | summary | index
+ index: enable-bm25
+ }
+
+ field chunk_embeddings type tensor<int8>(chunk{}, x[96]) {
+ indexing: input text | chunk fixed-length 1024 | embed | pack_bits | attribute | index
+ attribute {
+ distance-metric: hamming
+ }
+ }
+
+ fieldset default {
+ fields: title, chunks
+ }
+
+ document-summary no-chunks {
+ summary id {}
+ summary title {}
+ summary created_timestamp {}
+ summary modified_timestamp {}
+ summary last_opened_timestamp {}
+ summary open_count {}
+ summary favorite {}
+ summary chunks {}
+ }
+
+ document-summary top_3_chunks {
+ from-disk
+ summary chunks_top3 {
+ source: chunks
+ select-elements-by: top_3_chunk_sim_scores # this must be added as a summary-feature in the rank-profile
+ }
+ }
+}
+```
+
+Keep reading for an explanation and reasoning behind the choices in the schema.
+
+
+### Picking your searchable unit
+
+When building a RAG application, your first key decision is choosing the "searchable unit." This is the basic block of information your system will search through and return as context to the LLM. For instance, if you have millions of documents, some hundreds of pages long, what should be your searchable unit?
+
+Consider these points when selecting your searchable unit:
+
+- **Too fine-grained (e.g., individual sentences or very small paragraphs):**
+  - Leads to duplication of context and metadata across many small units.
+  - May result in units lacking sufficient context for the LLM to make good selections or generate relevant responses.
+  - Increases overhead for managing many small document units.
+- **Too coarse-grained (e.g., very long chapters or entire large documents):**
+  - Can cause performance issues due to the size of the units being processed.
+  - May lead to some large documents appearing relevant to too many queries, reducing precision.
+  - If you embed the whole document, an overly large context will reduce retrieval quality.
+
+We recommend erring on the side of using slightly larger units.
+
+- LLMs are increasingly capable of handling larger contexts.
+- In Vespa, you can index larger units, while avoiding data duplication and performance issues, by returning only the most relevant parts.
+
+With Vespa, it is now possible to return only the top k most relevant chunks of a document, and include and combine both document-level and chunk-level features in ranking.
+
+
+### Chunk selection
+
+Assume you have chosen a document as your searchable unit. Your documents may then contain text index fields of highly variable lengths. Consider for example a corpus of web pages. Some might be very long, while the average is well within the recommended size. See [scaling retrieval size](/en/performance/sizing-search#scaling-retrieval-size) for more details.
+
+While we recommend implementing guards against too long documents in your feeding pipeline, you still probably do not want to return every chunk of the top k documents to an LLM for RAG.
+
+In Vespa, we now have a solution for this problem. Check out our [blog post on layered ranking](https://blog.vespa.ai/introducing-layered-ranking-for-rag-applications/) for an overview of the new features that allow you to do this.
+
+Below, we show how you can score both documents as well as individual chunks, and use that score to select the best chunks to be returned in a summary, instead of returning all chunks belonging to the top k ranked documents.
+
+Compute closeness per chunk in a ranking function, and use `elementwise(bm25(chunks), i, double)` for a per-chunk text signal. Elementwise rank functions and chunk filtering on the content nodes are now available; see the [rank feature reference](/en/reference/ranking/rank-features#elementwise-bm25).
+
+This allows you to pick a large document as the searchable unit, while still addressing the potential drawbacks many encounter as follows:
+
+- Pick your (larger) document as your searchable unit.
+- Chunk the text-fields automatically on indexing.
+- Embed each chunk (enabled through Vespa's multivector support)
+- Calculate chunk-level features (e.g. bm25 and embedding similarity) and document-level features. Combine as you want.
+- Limit the actual chunks that are returned to the ones that are actually relevant context for the LLM.
+
+This allows you to index larger units, while avoiding data duplication and performance issues, by returning only the most relevant parts.
+
+Vespa also supports automatic [chunking](/en/reference/writing/indexing-language#converters) in the [indexing language](/en/writing/indexing).
+
+Here are the parts of the schema that define the searchable unit as a document with a text field and automatically chunk it into parts of 1024 characters, each of which is embedded and indexed separately:
+
+```txt
+field chunks type array<string> {
+ indexing: input text | chunk fixed-length 1024 | summary | index
+ index: enable-bm25
+}
+
+field chunk_embeddings type tensor<int8>(chunk{}, x[96]) {
+ indexing: input text | chunk fixed-length 1024 | embed | pack_bits | attribute | index
+ attribute {
+ distance-metric: hamming
+ }
+}
+```
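
To build intuition for what `chunk fixed-length 1024` produces, here is a naive Python sketch that splits a string into consecutive pieces of at most 1024 characters. It is an illustration only: the function name is hypothetical, and Vespa's built-in chunker may place chunk boundaries differently.

```python
def chunk_fixed_length(text: str, length: int = 1024) -> list[str]:
    """Split text into consecutive chunks of at most `length` characters."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [text[start:start + length] for start in range(0, len(text), length)]


chunks = chunk_fixed_length("a" * 2500)
chunk_lengths = [len(chunk) for chunk in chunks]
```

A 2500-character text yields three chunks here, so the document would get three entries in `chunks` and three vectors in `chunk_embeddings`.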
+
+In Vespa, we can specify which chunks to return with a summary feature, see [docs](/en/reference/schemas/schemas#select-elements-by) for details. For this blueprint, we will return the top 3 chunks based on the similarity score of the chunk embeddings, which is calculated in the ranking phase. Note that this feature could be any chunk-level summary feature defined in your rank-profile.
+
+Here is how the summary feature is calculated in the rank-profile:
+
+```txt expandable
+# This function unpacks the bits of each dimension of the mapped chunk_embeddings attribute tensor
+function chunk_emb_vecs() {
+ expression: unpack_bits(attribute(chunk_embeddings))
+}
+
+# This function calculates the dot product between the query embedding vector and the chunk embeddings (both are now float) over the x dimension
+function chunk_dot_prod() {
+ expression: reduce(query(float_embedding) * chunk_emb_vecs(), sum, x)
+}
+
+# This function calculates the L2 norm (Euclidean length) of an input tensor along the x dimension
+function vector_norms(t) {
+ expression: sqrt(sum(pow(t, 2), x))
+}
+
+# Here we calculate cosine similarity by dividing the dot product by the product of the L2 norms of the query embedding and the chunk embeddings
+function chunk_sim_scores() {
+ expression: chunk_dot_prod() / (vector_norms(chunk_emb_vecs()) * vector_norms(query(float_embedding)))
+}
+
+function top_3_chunk_text_scores() {
+ expression: top(3, chunk_text_scores())
+}
+
+function top_3_chunk_sim_scores() {
+ expression: top(3, chunk_sim_scores())
+}
+
+summary-features {
+ top_3_chunk_sim_scores
+}
+```
+
+
+**Note:**
+
+The ranking expression may seem a bit complex, as we chose to embed each chunk independently, store the embeddings in a binarized format, and then unpack them to calculate similarity based on their float representations. For single dimension dense vector similarity between same-precision embeddings, this can be simplified significantly using the [closeness](/en/reference/ranking/rank-features#closeness(name)) convenience function.
+
+
+Note that we want to use the float-representation of the query-embedding, and thus also need to convert the binary embedding of the chunks to float. After that, we can calculate the similarity score between the query embedding and the chunk embeddings using cosine similarity (the dot product, and then normalize it by the norms of the embeddings).
+
+See [ranking expressions](/en/reference/ranking/ranking-expressions#non-primitive-functions) for more details on the `top`-function, and other functions available for ranking expressions.
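
The same computation can be traced outside Vespa. The sketch below is plain Python for illustration, not the Vespa implementation: it assumes bits are unpacked most significant bit first, uses toy 8-dimensional vectors instead of 768, and all function names are hypothetical.

```python
import math


def unpack_bits(packed: list[int]) -> list[float]:
    """Expand each int8 value into 8 floats (0.0 or 1.0), most significant bit first."""
    bits = []
    for value in packed:
        byte = value & 0xFF  # reinterpret the signed int8 as an unsigned byte
        bits.extend(float((byte >> shift) & 1) for shift in range(7, -1, -1))
    return bits


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the L2 norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def top_k_chunks(query: list[float], chunks: dict[str, list[int]], k: int = 3):
    """Score every chunk against the float query and keep the k best."""
    scores = {cid: cosine(query, unpack_bits(packed)) for cid, packed in chunks.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


query = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
packed_chunks = {"0": [-128], "1": [1], "2": [-1], "3": [64]}
ranked = top_k_chunks(query, packed_chunks)
```

Chunk `0` shares its only set bit with the query and ranks first. Chunk `2` has all bits set, so it matches too but is penalized by its larger norm.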
+
+Now, we can use this summary feature in our document summary to return the top 3 chunks of the document, which will be used as context for the LLM. Note that we can also define a document summary that returns all chunks, which might be useful for another use case, such as deep research.
+
+```txt
+document-summary top_3_chunks {
+ from-disk
+ summary chunks_top3 {
+ source: chunks
+ select-elements-by: top_3_chunk_sim_scores # this must be added as a summary-feature in the rank-profile
+ }
+ }
+```
+
+
+### Use multiple text fields, consider multiple embeddings
+
+We recommend indexing different textual content as separate indexes. These can be searched together, using [field-sets](/en/reference/schemas/schemas#fieldset).
+
+In our schema, this is exemplified by the sections below, which define the `title` and `chunks` fields as separate indexed text fields.
+
+```txt
+...
+field title type string {
+ indexing: index | summary
+ index: enable-bm25
+}
+field chunks type array<string> {
+ indexing: input text | chunk fixed-length 1024 | summary | index
+ index: enable-bm25
+}
+```
+
+Whether you should have separate embedding fields, depends on whether the added memory usage is justified by the quality improvement you could get from the additional embedding field.
+
+We choose to index both a `title_embedding` and a `chunk_embeddings` field for this blueprint, and keep the cost down by storing the embeddings as binary vectors.
+
+```txt
+field title_embedding type tensor<int8>(x[96]) {
+ indexing: input title | embed | pack_bits | attribute | index
+ attribute {
+ distance-metric: hamming
+ }
+}
+field chunk_embeddings type tensor<int8>(chunk{}, x[96]) {
+ indexing: input text | chunk fixed-length 1024 | embed | pack_bits | attribute | index
+ attribute {
+ distance-metric: hamming
+ }
+}
+```
+
+Indexing several embedding fields may not be worth the cost for you. Evaluate whether the cost-quality trade-off is worth it for your application.
+
+If you have different vector space representations of your document (e.g., images), indexing them separately is likely worth it, as they tend to provide signals that are complementary to the text-based embeddings.
+
+
+### Model Metadata and Signals Using Structured Fields
+
+We recommend modeling metadata and signals as structured fields in your schema. Below are some general recommendations, as well as the implementation in our blueprint schema.
+
+**Metadata** — knowledge about your data:
+
+- Authors, publish time, source, links, category, price, …
+- Usage: filters, ranking, grouping/aggregation
+- Index only metadata that are strong filters
+
+In our blueprint schema, we include these metadata fields to demonstrate these concepts:
+
+- `id` - document identifier
+- `title` - document name/filename for display and text matching
+- `created_timestamp`, `modified_timestamp` - temporal metadata for filtering and ranking by recency
+
+**Signals** — observations about your data:
+
+- Popularity, quality, spam probability, click_probability, …
+- Usage: ranking
+- Often updated separately via partial updates
+- Multiple teams can add their own signals independently
+
+In our blueprint schema, we include several of these signals:
+
+- `last_opened_timestamp` - user engagement signal for personalization
+- `open_count` - popularity signal indicating document importance
+- `favorite` - explicit user preference signal, can be used for boosting relevant content
+
+These fields are configured as `attribute | summary` to enable efficient filtering, sorting, and grouping operations while being returned in search results. The timestamp fields allow for temporal filtering (e.g., "recent documents") and recency-based ranking, while usage signals like `open_count` and `favorite` can boost frequently accessed or explicitly marked important documents.
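
Since signals are often updated separately via partial updates, it can help to see what such an update looks like. The sketch below builds update operations in Vespa's JSON document format for the `open_count` and `last_opened_timestamp` fields. The helper names are hypothetical, and the document ID scheme is assumed from the sample document.

```python
import json


def increment_update(doc_id: str, field: str, amount: int = 1) -> dict:
    """Partial update that increments a numeric field without resending the document."""
    return {
        "update": f"id:doc:doc::{doc_id}",
        "fields": {field: {"increment": amount}},
    }


def assign_update(doc_id: str, field: str, value) -> dict:
    """Partial update that overwrites a single field."""
    return {
        "update": f"id:doc:doc::{doc_id}",
        "fields": {field: {"assign": value}},
    }


# Record that document 78 was opened once more.
operations = [
    increment_update("78", "open_count"),
    assign_update("78", "last_opened_timestamp", 1717750000),
]
feed_lines = [json.dumps(operation) for operation in operations]
```

Because these fields are attributes, updates like these touch only the signal and leave the text and embeddings untouched.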
+
+Consider [parent-child](/en/schemas/parent-child) relationships for low-cardinality metadata. Most large scale RAG application schemas contain at least a hundred structured fields.
+
+## LLM-generation with OpenAI-client
+
+Vespa supports both local LLMs and any OpenAI-compatible API for LLM generation. For details, see [LLMs in Vespa](/en/rag/llms-in-vespa).
+
+The recommended way to provide an API key is by using the [secret store](/en/security/secret-store) in Vespa Cloud.
+
+To enable this, you need to create a vault (if you don't have one already) and a secret through the [Vespa Cloud console](/). If your vault is named `sample-apps` and contains a secret with the name `openai-api-key`, you would use the following configuration in your `services.xml` to set up the OpenAI client to use that secret:
+
+```xml
+
+
+
+
+
+
+ openai-api-key
+
+
+```
+
+Alternatively, for local deployments, you can set the `X-LLM-API-KEY` header in your query to use the OpenAI client for generation.
+
+To test generation using the OpenAI client, post a query that runs the `openai` search chain, with `format=sse`. (Use `format=json` for a streaming json response including both the search hits and the LLM-generated tokens.)
+
+```bash
+$ vespa query \
+ --timeout 60 \
+ --header="X-LLM-API-KEY:" \
+ yql='select *
+ from doc
+ where default contains text(@query) or
+ ({label:"title_label", targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+ ({label:"chunks_label", targetHits:100}nearestNeighbor(chunk_embeddings, embedding))' \
+ query="Summarize the key architectural decisions documented for SynapseFlow's v0.2 release." \
+ searchChain=openai \
+ format=sse \
+ hits=5
+```
+
+
+## Structuring your Vespa application
+
+This section provides recommendations for structuring your Vespa application package. See also the [application package docs](/en/basics/applications) for more details on the application package structure. Note that this is not mandatory, and it might be simpler to start without query profiles and rank profiles, but as you scale out your application, it will be beneficial to have a well-structured application package.
+
+Consider the following structure for our application package:
+
+```txt expandable
+app
+├── models
+│ └── lightgbm_model.json
+├── schemas
+│ └── doc
+│ │ ├── base-features.profile
+│ │ ├── collect-second-phase.profile
+│ │ ├── collect-training-data.profile
+│ │ ├── learned-linear.profile
+│ │ ├── match-only.profile
+│ │ └── second-with-gbdt.profile
+│ └── doc.sd
+├── search
+│ └── query-profiles
+│ ├── deepresearch-with-gbdt.xml
+│ ├── deepresearch.xml
+│ ├── hybrid-with-gbdt.xml
+│ ├── hybrid.xml
+│ ├── rag-with-gbdt.xml
+│ └── rag.xml
+├── security
+│ └── clients.pem
+└── services.xml
+```
+
+You can see that we have separated the [query profiles](/en/reference/querying/query-profiles), and [rank profiles](/en/basics/ranking#rank-profiles) into their own directories.
+
+
+### Manage queries in query profiles
+
+Query profiles let you maintain collections of query parameters in one file. Clients choose a query profile → the profile sets everything else. This lets us change behavior for a use case without involving clients.
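
From the client's perspective, a request then only needs the user query and the name of a profile. The sketch below builds such a request body as JSON; `queryProfile` is the query API parameter for selecting a profile, while the helper name is hypothetical and the body is meant for a POST to the `/search/` endpoint of your deployment.

```python
import json


def build_search_request(user_query: str, query_profile: str = "hybrid") -> str:
    """Serialize a minimal search request that delegates all other settings to a query profile."""
    return json.dumps({"query": user_query, "queryProfile": query_profile})


body = build_search_request(
    "Summarize the key architectural decisions documented for SynapseFlow's v0.2 release.",
    query_profile="rag",
)
```

Switching a client from traditional search to RAG is then a one-word change, and everything else stays server-side.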
+
+Let us take a closer look at 3 of the query profiles in our sample application.
+
+1. `hybrid`
+2. `rag`
+3. `deepresearch`
+
+
+### **_hybrid_** query profile
+
+This query profile will be the one used by clients for traditional search, where the user is presented a limited number of hits. Our other query profiles will inherit this one (but may override some fields).
+
+```xml
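+<query-profile id="hybrid">
+    <field name="schema">doc</field>
+    <field name="ranking.features.query(embedding)">embed(@query)</field>
+    <field name="ranking.features.query(float_embedding)">embed(@query)</field>
+    <field name="yql">
+        select *
+        from %{schema}
+        where default contains text(@query) or
+        ({label:"title_label", targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+        ({label:"chunks_label", targetHits:100}nearestNeighbor(chunk_embeddings, embedding))
+    </field>
+    <field name="hits">10</field>
+    <field name="ranking.profile">learned-linear</field>
+    <field name="presentation.summary">top_3_chunks</field>
+</query-profile>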
+```
+
+
+### **_rag_** query profile
+
+This is the query profile where the `openai` searchChain is added, to generate a response based on the retrieved context. Here, we set some configuration that is specific to this use case.
+
+```xml
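+<query-profile id="rag" inherits="hybrid">
+    <field name="hits">50</field>
+    <field name="searchChain">openai</field>
+    <field name="presentation.format">sse</field>
+</query-profile>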
+```
+
+
+### **_deepresearch_** query profile
+
+Again, we will inherit from the `hybrid` query profile, but override `targetHits` with a value of 10 000 (the original was 100), which prioritizes recall over latency. We will also increase the number of hits to be returned, and increase the timeout to 5 seconds.
+
+```xml
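+<query-profile id="deepresearch" inherits="hybrid">
+    <field name="yql">
+        select *
+        from %{schema}
+        where default contains text(@query) or
+        ({label:"title_label", targetHits:10000}nearestNeighbor(title_embedding, embedding)) or
+        ({label:"chunks_label", targetHits:10000}nearestNeighbor(chunk_embeddings, embedding))
+    </field>
+    <field name="hits">100</field>
+    <field name="timeout">5s</field>
+</query-profile>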
+```
+
+We will leave out the LLM-generation for this one, and let an LLM agent on the client side be responsible for using this API call as a tool, and for determining whether enough relevant context has been retrieved to answer. Note that the `targetHits` parameter set here does not really make sense until your dataset reaches a certain scale.
+
+As we add more rank-profiles, we can also inherit the existing query profiles, only to override the `ranking.profile` field to use a different rank profile. This is what we have done for the `rag-with-gbdt` and `deepresearch-with-gbdt` query profiles, which will use the `second-with-gbdt` rank profile instead of the `learned-linear` rank profile.
+
+```xml
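+<query-profile id="rag-with-gbdt" inherits="hybrid-with-gbdt">
+    <field name="hits">50</field>
+    <field name="searchChain">openai</field>
+    <field name="presentation.format">sse</field>
+</query-profile>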
+```
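
To make the inheritance semantics concrete, here is a small Python model of how a child profile's fields override those of its parent. It is an illustration only, since Vespa resolves query profiles on the server. The parent of `rag-with-gbdt` and the contents of `hybrid-with-gbdt` are assumptions based on the description above.

```python
# Simplified model: each profile names an optional parent and its own fields.
PROFILES = {
    "hybrid": {
        "inherits": None,
        "fields": {
            "hits": 10,
            "ranking.profile": "learned-linear",
            "presentation.summary": "top_3_chunks",
        },
    },
    # Assumed: only the rank profile is overridden.
    "hybrid-with-gbdt": {
        "inherits": "hybrid",
        "fields": {"ranking.profile": "second-with-gbdt"},
    },
    "rag": {
        "inherits": "hybrid",
        "fields": {"hits": 50, "searchChain": "openai", "presentation.format": "sse"},
    },
    # Assumed parent: hybrid-with-gbdt.
    "rag-with-gbdt": {
        "inherits": "hybrid-with-gbdt",
        "fields": {"hits": 50, "searchChain": "openai", "presentation.format": "sse"},
    },
}


def resolve(name: str) -> dict:
    """Return the effective fields: parent values first, then the profile's own overrides."""
    profile = PROFILES[name]
    resolved = resolve(profile["inherits"]) if profile["inherits"] else {}
    resolved.update(profile["fields"])
    return resolved


effective = resolve("rag-with-gbdt")
```

In this model, `rag-with-gbdt` gets the GBDT rank profile from its parent, keeps the `top_3_chunks` summary from `hybrid`, and sets its own `hits`.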
+
+
+### Separating out rank profiles
+
+To build a great RAG application, assume you'll need many ranking models. This will allow you to bucket-test alternatives continuously and to serve different use cases, including data collection for different phases, and the rank profiles to be used in production.
+
+Separate common functions/setup into parent rank profiles and use `.profile` files.
+
+
+## Phased ranking in Vespa
+
+Before we move on, it might be useful to recap Vespa's [phased ranking](/en/ranking/phased-ranking) approach.
+
+Below is a schematic overview of how to think about retrieval and ranking for this RAG blueprint. Since we are developing this as a tutorial using a small toy dataset, the application can be deployed on a single machine, using a single Docker container, where only one container node and one content node will run. This is obviously not the case for most real-world RAG applications, so it is crucial to keep in mind as you scale your application.
+
+
+
+
+
+The stateless container nodes can be [scaled independently](/en/performance/sizing-search) to handle increased query load.
+
+
+## Configuring match-phase (retrieval)
+
+This section will contain important considerations for the retrieval-phase of a RAG application in Vespa.
+
+The goal of the retrieval phase is to retrieve candidate documents efficiently, and maximize recall, without exposing too many documents to ranking.
+
+
+### Choosing a Retrieval Strategy: Vector, Text, or Hybrid?
+
+As you could see from the schema, we create and index both a text representation and a vector representation for each chunk of the document. This will allow us to use both text-based features and semantic features for both recall and ranking.
+
+The text and vector representation complement each other well:
+
+- **Text-only** → misses recall of semantically similar content
+- **Vector-only** → misses recall of specific content not well understood by the embedding models
+
+Our recommendation is to default to hybrid retrieval:
+
+```sql
+select *
+ from doc
+ where default contains text(@query) or
+ ({label:"title_label", targetHits:1000}nearestNeighbor(title_embedding, embedding)) or
+ ({label:"chunks_label", targetHits:1000}nearestNeighbor(chunk_embeddings, embedding))
+```
+
+In generic domains, or if you have fine-tuned an embedding model for your specific data, you might consider a vector-only approach:
+
+```sql
+select *
+ from doc
+ where rank({targetHits:10000}nearestNeighbor(embeddings_field, query_embedding), default contains text(@query))
+```
+
+Notice that only the first argument of the [rank](/en/reference/querying/yql#rank)-operator will be used to determine if a document is a match, while all arguments are used for calculating rank features. This means we can do vector-only matching, but still use text-based features such as `bm25` and `nativeRank` for ranking. Note that if you do this, it makes sense to increase the number of `targetHits` for the `nearestNeighbor`-operator.
+
+For our sample application, we add three different retrieval operators (that are combined with `OR`), one with `weakAnd` for text matching, and two `nearestNeighbor` operators for vector matching, one for the title and one for the chunks. This will allow us to retrieve both relevant documents based on text and vector similarity, while also allowing us to return the most relevant chunks of the documents.
+
+```sql
+select *
+ from doc
+ where default contains text(@query) or
+ ({targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+ ({targetHits:100}nearestNeighbor(chunk_embeddings, embedding))
+```
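
If you generate queries from client code or evaluation scripts, a small builder keeps the three operators in sync. The sketch below is a hypothetical helper, not part of the sample app. It assembles the hybrid YQL with a configurable `targetHits` and uses the labeled form of the operators, as in the query profiles.

```python
def nearest_neighbor(label: str, field: str, target_hits: int) -> str:
    """Render one labeled nearestNeighbor operator against the query(embedding) input."""
    return f'({{label:"{label}", targetHits:{target_hits}}}nearestNeighbor({field}, embedding))'


def build_hybrid_yql(schema: str = "doc", target_hits: int = 100) -> str:
    """Combine text matching and both vector operators with OR."""
    if target_hits < 1:
        raise ValueError("target_hits must be a positive integer")
    clauses = [
        "default contains text(@query)",
        nearest_neighbor("title_label", "title_embedding", target_hits),
        nearest_neighbor("chunks_label", "chunk_embeddings", target_hits),
    ]
    return f"select * from {schema} where " + " or ".join(clauses)


default_yql = build_hybrid_yql()
deep_yql = build_hybrid_yql(target_hits=10000)
```

The same function then covers both the default hybrid query and a high-recall variant that differs only in `targetHits`.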
+
+
+### Choosing your embedding model (and strategy)
+
+The choice of embedding model is a trade-off between inference time (both at indexing and query time), memory usage (embedding dimensions), and quality. There are many good open-source models available, and we recommend checking out the [MTEB leaderboard](https://huggingface.co/spaces/mteb/leaderboard) and looking at the `Retrieval`-column to gauge performance, while also considering the memory usage, vector dimensions, and context length of the model.
+
+See [model hub](/en/rag/model-hub) for a list of provided models ready to use with Vespa. See also [Huggingface Embedder](/en/rag/embedding#huggingface-embedder) for details on using other models (exported as ONNX) with Vespa.
+
+In addition to dense vector representation, Vespa supports sparse embeddings (token weights) and multi-vector (ColBERT-style) embeddings. See our [example notebook](https://vespa-engine.github.io/pyvespa//examples/mother-of-all-embedding-models-cloud#bge-m3-the-mother-of-all-embedding-models) of using the bge-m3 model, which supports both, with Vespa.
+
+Vespa also supports [Matryoshka embeddings](https://blog.vespa.ai/combining-matryoshka-with-binary-quantization-using-embedder/), which can be a great way of reducing inference cost for retrieval phases, by using a subset of the embedding dimensions, while using more dimensions for increased precision in the later ranking phases.
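
The core idea of using a subset of the dimensions can be sketched in a few lines. This is an illustration only, with a hypothetical function name and a toy 3-dimensional vector; it assumes the model was trained so that a prefix of the vector is itself a usable embedding.

```python
import math


def truncate_and_normalize(embedding: list[float], dims: int) -> list[float]:
    """Keep the first `dims` values and rescale them to unit length."""
    if not 0 < dims <= len(embedding):
        raise ValueError("dims must be between 1 and the embedding length")
    head = embedding[:dims]
    norm = math.sqrt(sum(value * value for value in head))
    return [value / norm for value in head] if norm else head


full = [3.0, 4.0, 12.0]
short = truncate_and_normalize(full, 2)
```

Re-normalizing after truncation keeps cosine and dot-product scores comparable across vectors of the shorter length.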
+
+For domain-specific applications or less popular languages, you may want to consider finetuning a model on your own data.
+
+
+### Consider binary vectors for recall
+
+Another decision to make is which precision you will use for your embeddings. See [binarization docs](/en/rag/binarizing-vectors) for an introduction to binarization in Vespa.
+
+For most cases, binary vectors (in Vespa, packed into `int8`-representation) will provide an attractive tradeoff, especially for recall during match-phase. Consider these factors to determine whether this holds true for your application:
+
+- Reduces vector memory cost by 5–30×
+- Reduces query and indexing cost by 30×
+- Often reduces quality by only a few percentage points
+
+```txt
+field binary_chunk_embeddings type tensor(chunk{}, x) {
+ indexing: input text | chunk fixed-length 1024 | embed | pack_bits | attribute | index
+ attribute { distance-metric: hamming }
+}
+```
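
A back-of-the-envelope calculation shows where savings of this order come from. The sketch below counts raw cell storage only, for a 768-dimensional embedding as used for `query(float_embedding)` in this blueprint. Index structures and other overhead are ignored, so real-world savings will differ.

```python
# Bytes used per tensor cell for each cell type.
BYTES_PER_CELL = {"float": 4, "bfloat16": 2, "int8": 1}


def vector_bytes(dimensions: int, cell_type: str) -> int:
    """Raw storage for one dense vector, ignoring index overhead."""
    return dimensions * BYTES_PER_CELL[cell_type]


float_dims = 768
packed_dims = float_dims // 8  # pack_bits stores 8 bits per int8 cell

float_bytes = vector_bytes(float_dims, "float")
bfloat16_bytes = vector_bytes(float_dims, "bfloat16")
binary_bytes = vector_bytes(packed_dims, "int8")

saving_vs_float = float_bytes // binary_bytes
saving_vs_bfloat16 = bfloat16_bytes // binary_bytes
```

With these assumptions, one packed vector takes 96 bytes instead of 3072 for `float` or 1536 for `bfloat16`. This is per chunk, so the saving applies once for each chunk of every document.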
+
+If you need higher-precision vector similarity, you should use bfloat16 precision, and consider paging these vectors to disk to avoid a large memory cost. Note that when this field is accessed in ranking, the vectors must also be read from disk, so you need to restrict the number of hits that access this field to avoid performance issues.
+
+```txt
+field chunk_embeddings type tensor(chunk{}, x) {
+ indexing: input text | chunk fixed-length 1024 | embed | attribute
+ attribute: paged
+}
+```
+
+For example, if you want to calculate `closeness` for a paged embedding vector in first-phase, consider configuring your retrieval operators (typically `weakAnd` and/or `nearestNeighbor`, optionally combined with filters) so that not too many hits are matched. Another option is to enable match-phase limiting, see [match-phase docs](/en/reference/schemas/schemas#match-phase). In essence, you restrict the number of matches by specifying an attribute field.
+
+
+### Consider float-binary for ranking
+
+In our blueprint, we choose to index binary vectors of the documents. This does not prevent us from using the float-representation of the query embedding though.
+
+By unpacking the binary document chunk embeddings to their float representations (using [`unpack_bits`](/en/reference/ranking/ranking-expressions#unpack-bits)), we can calculate the similarity between query and document with slightly higher precision using a `float-binary` dot product, instead of hamming distance (`binary-binary`).
+
+Below, you can see how we can do this:
+
+```txt expandable
+rank-profile base-features {
+
+ inputs {
+ query(embedding) tensor<int8>(x[96])
+ query(float_embedding) tensor<float>(x[768])
+ }
+
+ function chunk_emb_vecs() {
+ expression: unpack_bits(attribute(chunk_embeddings))
+ }
+
+ function chunk_dot_prod() {
+ expression: reduce(query(float_embedding) * chunk_emb_vecs(), sum, x)
+ }
+
+ function vector_norms(t) {
+ expression: sqrt(sum(pow(t, 2), x))
+ }
+ function chunk_sim_scores() {
+ expression: chunk_dot_prod() / (vector_norms(chunk_emb_vecs()) * vector_norms(query(float_embedding)))
+ }
+
+ function top_3_chunk_text_scores() {
+ expression: top(3, chunk_text_scores())
+ }
+
+ function top_3_chunk_sim_scores() {
+ expression: top(3, chunk_sim_scores())
+ }
+}
+```
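
For comparison, the `binary-binary` alternative reduces to counting differing bits between packed vectors. The Python sketch below illustrates this under two assumptions that are not taken from the Vespa source: values greater than zero become bit 1, and bits are packed most significant bit first. The function names are hypothetical.

```python
def pack_bits(values: list[float]) -> list[int]:
    """Binarize (value > 0 becomes bit 1) and pack 8 bits into each signed int8."""
    if len(values) % 8 != 0:
        raise ValueError("length must be a multiple of 8")
    packed = []
    for start in range(0, len(values), 8):
        byte = 0
        for value in values[start:start + 8]:
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte - 256 if byte > 127 else byte)  # map to the int8 range
    return packed


def hamming_distance(a: list[int], b: list[int]) -> int:
    """Count the bits that differ between two packed int8 vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(bin((x ^ y) & 0xFF).count("1") for x, y in zip(a, b))


doc_vector = pack_bits([0.5, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0])
query_vector = pack_bits([0.9, 0.1, -0.3, -0.2, -0.7, -0.1, -0.4, -0.6])
distance = hamming_distance(doc_vector, query_vector)
```

Hamming distance only sees the signs of the query values, whereas the float-binary dot product also uses their magnitudes. That is the extra precision the rank profile above recovers.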
+
+
+### Use complex linguistics/recall only for precision
+
+Vespa gives you extensive control over [linguistics](/en/linguistics/linguistics). You can decide [match mode](/en/reference/schemas/schemas#match), stemming, normalization, or control derived tokens.
+
+It is also possible to use more specific operators than [weakAnd](/en/reference/querying/yql#weakand) to match only close occurrences ([near](/en/reference/querying/yql#near)/ [onear](/en/reference/querying/yql#near)), multiple alternatives ([equiv](/en/linguistics/query-rewriting#equiv)), weight items, set connectivity, and apply [query-rewrite](/en/linguistics/query-rewriting) rules.
+
+**Don't use this to increase recall — improve your embedding model instead.**
+
+Consider using it to improve precision when needed.
+
+
+### Evaluating recall of the retrieval phase
+
+To know whether your retrieval phase is working well, you need to measure recall, number of total matches and the reported time spent.
+
+We can use [`VespaMatchEvaluator`](https://vespa-engine.github.io/pyvespa/api/vespa/evaluation.html#vespa.evaluation.VespaMatchEvaluator) from the pyvespa client library to do this.
+
+For this sample application, we set up an evaluation script that compares three different retrieval strategies (let us call them "retrieval arms"):
+
+1. **Semantic-only**: Uses only vector similarity through `nearestNeighbor` operators.
+2. **WeakAnd-only**: Uses only text-based matching with `userQuery()`.
+3. **Hybrid**: Combines both approaches with OR logic.
+
+
+**Note:**
+
+These are only generic suggestions. You are of course free to include [filter clauses](/en/reference/querying/yql#where), [grouping](/en/querying/grouping), [predicates](/en/schemas/predicate-fields), [geosearch](/en/querying/geo-search), etc. to support your specific use cases.
+
+
+It is recommended to use a ranking profile that does not use any first-phase ranking, to run the match-phase evaluation faster.
+
+The evaluation will output metrics like:
+
+- Recall (percentage of relevant documents matched)
+- Total number of matches per query
+- Query latency statistics
+- Per-query detailed results (when `write_verbose=True`) to identify "offending" queries in regard to recall or performance.
+
+This will be valuable input for tuning each of the retrieval arms.
+
+Run the complete evaluation script from the `eval/` directory to see detailed comparisons between all three retrieval strategies on your dataset.
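+
+To make the recall metrics concrete, here is a minimal sketch of how they can be computed once you know which documents each retrieval arm matched. The function name and the input dictionaries are hypothetical and not part of pyvespa; `VespaMatchEvaluator` reports these numbers for you.
+
+```python
+from typing import Dict, Set, Tuple
+
+
+def match_recall(
+    relevant: Dict[str, Set[str]],
+    matched: Dict[str, Set[str]],
+) -> Tuple[float, float]:
+    """Return (overall recall, average per-query recall).
+
+    relevant: query id -> ids of documents judged relevant
+    matched:  query id -> ids of documents matched by the retrieval arm
+    """
+    total_relevant = 0
+    total_matched_relevant = 0
+    per_query = []
+    for query_id, relevant_ids in relevant.items():
+        if not relevant_ids:
+            continue  # a query without judgments says nothing about recall
+        hits = relevant_ids & matched.get(query_id, set())
+        total_relevant += len(relevant_ids)
+        total_matched_relevant += len(hits)
+        per_query.append(len(hits) / len(relevant_ids))
+    if not per_query:
+        return 0.0, 0.0
+    return total_matched_relevant / total_relevant, sum(per_query) / len(per_query)
+
+
+# Hypothetical judgments and matches for two queries
+relevant = {"q1": {"1", "50", "82"}, "q2": {"7"}}
+matched = {"q1": {"1", "50", "23"}, "q2": {"7", "9"}}
+overall, average = match_recall(relevant, matched)
+```
+
+The two numbers differ when queries have different numbers of relevant documents: the overall figure weights every relevant document equally, while the per-query average weights every query equally.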
+
+
+#### Semantic Query Evaluation
+
+```sql
+select * from doc where
+({targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+({targetHits:100}nearestNeighbor(chunk_embeddings, embedding))
+```
+
+| Metric | Value |
+| :--- | :--- |
+| Match Recall | 1.0000 |
+| Average Recall per Query | 1.0000 |
+| Total Relevant Documents | 51 |
+| Total Matched Relevant | 51 |
+| Average Matched per Query | 100.0000 |
+| Total Queries | 20 |
+| Search Time Average (s) | 0.0090 |
+| Search Time Q50 (s) | 0.0060 |
+| Search Time Q90 (s) | 0.0193 |
+| Search Time Q95 (s) | 0.0220 |
+
+
+#### WeakAnd Query Evaluation
+
+`userQuery` is just a convenience wrapper for `weakAnd`; see the [YQL reference](/en/reference/querying/yql). The default `targetHits` for `weakAnd` is 100, but it is [overridable](/en/reference/querying/yql#targethits).
+
+```sql
+select * from doc where userQuery()
+```
+
+| Metric | Value |
+| :--- | :--- |
+| Match Recall | 1.0000 |
+| Average Recall per Query | 1.0000 |
+| Total Relevant Documents | 51 |
+| Total Matched Relevant | 51 |
+| Average Matched per Query | 88.7000 |
+| Total Queries | 20 |
+| Search Time Average (s) | 0.0071 |
+| Search Time Q50 (s) | 0.0060 |
+| Search Time Q90 (s) | 0.0132 |
+| Search Time Q95 (s) | 0.0171 |
+
+
+#### Hybrid Query Evaluation
+
+```sql
+select * from doc where
+({targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+({targetHits:100}nearestNeighbor(chunk_embeddings, embedding)) or
+userQuery()
+```
+
+| Metric | Value |
+| :--- | :--- |
+| Match Recall | 1.0000 |
+| Average Recall per Query | 1.0000 |
+| Total Relevant Documents | 51 |
+| Total Matched Relevant | 51 |
+| Average Matched per Query | 100.0000 |
+| Total Queries | 20 |
+| Search Time Average (s) | 0.0076 |
+| Search Time Q50 (s) | 0.0055 |
+| Search Time Q90 (s) | 0.0150 |
+| Search Time Q95 (s) | 0.0201 |
+
+
+### Tuning the retrieval phase
+
+We can see that all queries match all relevant documents. This is expected, since we use `targetHits:100` in the `nearestNeighbor` operator, which is also the default for `weakAnd` (and `userQuery`). With a lower `targetHits`, recall will drop.
+
+In general, you have these options if you want to increase recall:
+
+1. Increase `targetHits` in your retrieval operators (e.g., `nearestNeighbor`, `weakAnd`).
+2. Improve your embedding model (use a better model or finetune it on your data).
+3. You can also consider tuning HNSW parameters, see [docs on HNSW](/en/querying/approximate-nn-hnsw#using-vespas-approximate-nearest-neighbor-search).
+
+Conversely, if you want to reduce the latency of one of your retrieval 'arms' at the cost of a small trade-off in recall, you can:
+
+1. Tune `weakAnd` parameters. This has the potential to triple the performance of the `weakAnd` part of your query, see [blog post](https://blog.vespa.ai/tripling-the-query-performance-of-lexical-search/).
+
+Below are some empirically found default parameters that work well for most use cases:
+
+```txt
+rank-profile optimized inherits baseline {
+ filter-threshold: 0.05
+ weakand {
+ stopword-limit: 0.6
+ adjust-target: 0.01
+ }
+ }
+```
+
+See the [reference](/en/reference/schemas/schemas#weakand) for more details on the `weakAnd` parameters. These can also be set as query parameters.
+
+2. As already [mentioned](#consider-binary-vectors-for-recall), consider binary vectors for your embeddings.
+3. Consider using an embedding model with fewer dimensions, or using only a subset of the dimensions (e.g., using [Matryoshka embeddings](https://blog.vespa.ai/combining-matryoshka-with-binary-quantization-using-embedder/)).
+
+
+## First-phase ranking
+
+For the first-phase ranking, we must use a computationally cheap function, as it is applied to all documents matched in the retrieval phase. For many applications, this can amount to millions of candidate documents.
+
+Common options include a (learned) linear combination of features such as text similarity features, vector closeness, and metadata. It could also be a handwritten heuristic function.
+
+Text features should include [nativeRank](/en/reference/ranking/nativerank#nativeRank) or [bm25](/en/ranking/bm25#ranking-function) — not [fieldMatch](/en/reference/ranking/rank-features#field-match-features-normalized) (it is too expensive).
+
+Considerations for deciding whether to choose `bm25` or `nativeRank`:
+
+- **bm25**: cheapest, strong significance, no proximity, not normalized.
+- **nativeRank**: 2 – 3 × costlier, truncated significance, includes proximity, normalized.
+
+For this blueprint, we opted for using `bm25` for first phase, but you could evaluate and compare to see whether the additional cost of using `nativeRank` is justified by increased quality.
+
+
+### Collecting training data for first-phase ranking
+
+The features we will use for first-phase ranking are not normalized (i.e. they have values in different ranges). This means we can't just weight them equally and expect that to be a good proxy for relevance.
+
+Below we will show how we can find (learn) optimal weights (coefficients) for each feature, so that we can combine them into a ranking expression of the form:
+
+```python
+a * bm25(title) + b * bm25(chunks) + c * max_chunk_sim_scores() + d * max_chunk_text_scores() + e * avg_top_3_chunk_sim_scores() + f * avg_top_3_chunk_text_scores()
+```
+
+The first thing we need to do is collect training data. We do this using the [VespaFeatureCollector](https://vespa-engine.github.io/pyvespa/api/vespa/evaluation.html#vespa.evaluation.VespaFeatureCollector) from the pyvespa library.
+
+These are the features we will include:
+
+```txt expandable
+rank-profile base-features {
+ inputs {
+ query(embedding) tensor(x[96])
+ query(float_embedding) tensor(x[768])
+ }
+
+ rank chunks {
+ element-gap: 0 # Fixed length chunking should not cause any positional gap between elements
+ }
+ function chunk_text_scores() {
+ expression: elementwise(bm25(chunks),chunk,float)
+ }
+
+ function chunk_emb_vecs() {
+ expression: unpack_bits(attribute(chunk_embeddings))
+ }
+
+ function chunk_dot_prod() {
+ expression: reduce(query(float_embedding) * chunk_emb_vecs(), sum, x)
+ }
+
+ function vector_norms(t) {
+ expression: sqrt(sum(pow(t, 2), x))
+ }
+ function chunk_sim_scores() {
+ expression: chunk_dot_prod() / (vector_norms(chunk_emb_vecs()) * vector_norms(query(float_embedding)))
+ }
+
+ function top_3_chunk_text_scores() {
+ expression: top(3, chunk_text_scores())
+ }
+
+ function top_3_chunk_sim_scores() {
+ expression: top(3, chunk_sim_scores())
+ }
+
+ function avg_top_3_chunk_text_scores() {
+ expression: reduce(top_3_chunk_text_scores(), avg, chunk)
+ }
+ function avg_top_3_chunk_sim_scores() {
+ expression: reduce(top_3_chunk_sim_scores(), avg, chunk)
+ }
+
+ function max_chunk_text_scores() {
+ expression: reduce(chunk_text_scores(), max, chunk)
+ }
+
+ function max_chunk_sim_scores() {
+ expression: reduce(chunk_sim_scores(), max, chunk)
+ }
+}
+
+rank-profile collect-training-data inherits base-features {
+ match-features {
+ bm25(title)
+ bm25(chunks)
+ max_chunk_sim_scores
+ max_chunk_text_scores
+ avg_top_3_chunk_sim_scores
+ avg_top_3_chunk_text_scores
+
+ }
+
+
+ first-phase {
+ expression {
+ # Not used in this profile
+ bm25(title) +
+ bm25(chunks) +
+ max_chunk_sim_scores() +
+ max_chunk_text_scores()
+ }
+ }
+
+ second-phase {
+ expression: random
+ }
+ }
+```
+
+As you can see, we have defined a `collect-training-data` rank profile that inherits from the `base-features` rank profile. This rank profile will collect the match features we defined in the `match-features` section.
+
+The `random` expression in the `second-phase` allows us to collect random hits, and is necessary for our data collection script. See the [docstring of pyvespa class VespaFeatureCollector](https://vespa-engine.github.io/pyvespa/api/vespa/evaluation.html#vespa.evaluation.VespaFeatureCollector) that is used in the script for details.
+
+We rely on `bm25` and different vector similarity features (both document-level and chunk-level) for the first-phase ranking. These are relatively cheap to calculate, and will likely provide good enough ranking signals for the first-phase ranking.
+
+Running the command below will save a .csv-file with the collected features, which can be used to train a ranking model for the first-phase ranking.
+
+```bash
+$ python eval/collect_pyvespa.py --collect_matchfeatures
+```
+
+Our output file looks like this:
+
+| query_id | doc_id | relevance_label | relevance_score | match_avg_top_3_chunk_sim_scores | match_avg_top_3_chunk_text_scores | match_bm25(chunks) | match_bm25(title) | match_max_chunk_sim_scores | match_max_chunk_text_scores |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| alex_q_01 | 50 | 1 | 0.660597 | 0.248329 | 8.444725 | 7.717984 | 0. | 0.268457 | 8.444725 |
+| alex_q_01 | 82 | 1 | 0.649638 | 0.225300 | 12.327676 | 18.611592 | 2.453409 | 0.258905 | 15.644889 |
+| alex_q_01 | 1 | 1 | 0.245849 | 0.358027 | 15.100841 | 23.010389 | 4.333828 | 0.391143 | 20.582403 |
+| alex_q_01 | 28 | 0 | 0.988250 | 0.278074 | 0.179929 | 0.197420 | 0. | 0.278074 | 0.179929 |
+| alex_q_01 | 23 | 0 | 0.968268 | 0.203709 | 0.182603 | 0.196956 | 0. | 0.203709 | 0.182603 |
+
+Note that the `relevance_score` in this table is just the random expression we used in the `second-phase` of the `collect-training-data` rank profile, and will be dropped before training the model.
+
+
+### Training a first-phase ranking model
+
+As you recall, a first-phase ranking expression must be cheap to evaluate. This most often means a heuristic handwritten combination of match features, or a linear model trained on match features.
+
+We will demonstrate how to train a simple Logistic Regression model to predict relevance based on the collected match features. The full training script can be found in the [sample-apps repository](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/train_logistic_regression.py).
+
+Some "gotchas" to be aware of:
+
+- We sample an equal number of relevant and random documents for each query, to avoid class imbalance.
+- We make sure that we drop `query_id` and `doc_id` columns before training.
+- We apply standard scaling to the features before training the model. We apply the inverse transform to the model coefficients after training, so that we can use them in Vespa.
+- We do 5-fold stratified cross-validation to evaluate the model performance, ensuring that each fold has a balanced number of relevant and random documents.
+- We also make sure to have an unseen set of test queries to evaluate the model on, to avoid overfitting.
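+
+The inverse transform of the coefficients mentioned above can be sketched as follows. This is a minimal illustration of the algebra, assuming standard scaling of the form `z = (x - mean) / std`; the function names and numbers are hypothetical, and the actual training script may implement this differently.
+
+```python
+from typing import List, Sequence, Tuple
+
+
+def unscale_coefficients(
+    scaled_weights: Sequence[float],
+    scaled_intercept: float,
+    means: Sequence[float],
+    stds: Sequence[float],
+) -> Tuple[List[float], float]:
+    """Map weights learned on standardized features back to raw features."""
+    raw_weights = [w / s for w, s in zip(scaled_weights, stds)]
+    raw_intercept = scaled_intercept - sum(
+        w * m / s for w, m, s in zip(scaled_weights, means, stds)
+    )
+    return raw_weights, raw_intercept
+
+
+def linear_score(weights: Sequence[float], intercept: float, x: Sequence[float]) -> float:
+    """Intercept plus the weighted sum of the feature values."""
+    return intercept + sum(w * v for w, v in zip(weights, x))
+
+
+# Hypothetical numbers: two features on very different scales
+means, stds = [0.25, 9.0], [0.05, 6.0]
+scaled_weights, scaled_intercept = [0.7, 1.2], -0.3
+raw_weights, raw_intercept = unscale_coefficients(scaled_weights, scaled_intercept, means, stds)
+
+x = [0.31, 15.6]  # raw feature values for one document
+z = [(v - m) / s for v, m, s in zip(x, means, stds)]  # standardized values
+scaled_score = linear_score(scaled_weights, scaled_intercept, z)
+raw_score = linear_score(raw_weights, raw_intercept, x)
+```
+
+Both scores are identical, which is why the transformed coefficients can be applied directly to the unscaled features that Vespa computes at query time.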
+
+Run the training [script](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/train_logistic_regression.py):
+
+```bash
+$ python eval/train_logistic_regression.py
+```
+
+Expect output like this:
+
+```txt expandable
+------------------------------------------------------------
+ Cross-Validation Results (5-Fold, Standardized)
+------------------------------------------------------------
+Metric | Mean | Std Dev
+------------------------------------------------------------
+Accuracy | 0.9024 | 0.0294
+Precision | 0.9236 | 0.0384
+Recall | 0.8818 | 0.0984
+F1-Score | 0.8970 | 0.0415
+Log Loss | 0.2074 | 0.0353
+ROC AUC | 0.9749 | 0.0103
+Avg Precision | 0.9764 | 0.0117
+------------------------------------------------------------
+Transformed Coefficients (for original unscaled features):
+--------------------------------------------------
+avg_top_3_chunk_sim_scores : 13.383840
+avg_top_3_chunk_text_scores : 0.203145
+bm25(chunks) : 0.159914
+bm25(title) : 0.191867
+max_chunk_sim_scores : 10.067169
+max_chunk_text_scores : 0.153392
+Intercept : -7.798639
+--------------------------------------------------
+```
+
+These results look quite good. With such a small dataset, however, it is easy to overfit. Let us evaluate on the unseen test queries to see how well the model generalizes.
+
+First, we need to add the learned coefficients as inputs to a new rank profile in our schema, so that we can use them in Vespa.
+
+```txt expandable
+rank-profile learned-linear inherits collect-training-data {
+ match-features:
+ inputs {
+ query(embedding) tensor(x[96])
+ query(float_embedding) tensor(x[768])
+ query(intercept) double
+ query(avg_top_3_chunk_sim_scores_param) double
+ query(avg_top_3_chunk_text_scores_param) double
+ query(bm25_chunks_param) double
+ query(bm25_title_param) double
+ query(max_chunk_sim_scores_param) double
+ query(max_chunk_text_scores_param) double
+ }
+ first-phase {
+ expression {
+ query(intercept) +
+ query(avg_top_3_chunk_sim_scores_param) * avg_top_3_chunk_sim_scores() +
+ query(avg_top_3_chunk_text_scores_param) * avg_top_3_chunk_text_scores() +
+ query(bm25_title_param) * bm25(title) +
+ query(bm25_chunks_param) * bm25(chunks) +
+ query(max_chunk_sim_scores_param) * max_chunk_sim_scores() +
+ query(max_chunk_text_scores_param) * max_chunk_text_scores()
+ }
+ }
+ summary-features {
+ top_3_chunk_sim_scores
+ }
+
+ }
+```
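+
+As a sanity check, the linear expression in the first-phase can be reproduced offline. The sketch below applies the coefficients from the training output to two rows of the collected training data; the helper names are hypothetical and only illustrate the arithmetic.
+
+```python
+import math
+
+# Coefficients copied from the training output above
+INTERCEPT = -7.798639
+COEFFICIENTS = {
+    "avg_top_3_chunk_sim_scores": 13.383840,
+    "avg_top_3_chunk_text_scores": 0.203145,
+    "bm25(chunks)": 0.159914,
+    "bm25(title)": 0.191867,
+    "max_chunk_sim_scores": 10.067169,
+    "max_chunk_text_scores": 0.153392,
+}
+
+
+def first_phase_score(features):
+    """Linear score with the same form as the first-phase expression."""
+    return INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)
+
+
+def probability(score):
+    """Logistic link of the regression model; monotonic, so it does not change the ordering."""
+    return 1.0 / (1.0 + math.exp(-score))
+
+
+# Two rows from the collected training data: doc 50 is relevant, doc 28 is a random hit
+doc_50 = {
+    "avg_top_3_chunk_sim_scores": 0.248329,
+    "avg_top_3_chunk_text_scores": 8.444725,
+    "bm25(chunks)": 7.717984,
+    "bm25(title)": 0.0,
+    "max_chunk_sim_scores": 0.268457,
+    "max_chunk_text_scores": 8.444725,
+}
+doc_28 = {
+    "avg_top_3_chunk_sim_scores": 0.278074,
+    "avg_top_3_chunk_text_scores": 0.179929,
+    "bm25(chunks)": 0.197420,
+    "bm25(title)": 0.0,
+    "max_chunk_sim_scores": 0.278074,
+    "max_chunk_text_scores": 0.179929,
+}
+score_50 = first_phase_score(doc_50)
+score_28 = first_phase_score(doc_28)
+```
+
+The relevant document gets a positive score and the random one a negative score, even though their vector similarities are close. Since the logistic function is monotonic, the rank profile can use the linear score directly.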
+
+To allow for changing the parameters without redeploying the application, we will also add the values of the coefficients as query parameters to a new query profile.
+
+```xml expandable
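+<query-profile id="hybrid">
+    <field name="schema">doc</field>
+    <field name="ranking.features.query(embedding)">embed(@query)</field>
+    <field name="ranking.features.query(float_embedding)">embed(@query)</field>
+    <field name="ranking.features.query(intercept)">-7.798639</field>
+    <field name="ranking.features.query(avg_top_3_chunk_sim_scores_param)">13.383840</field>
+    <field name="ranking.features.query(avg_top_3_chunk_text_scores_param)">0.203145</field>
+    <field name="ranking.features.query(bm25_chunks_param)">0.159914</field>
+    <field name="ranking.features.query(bm25_title_param)">0.191867</field>
+    <field name="ranking.features.query(max_chunk_sim_scores_param)">10.067169</field>
+    <field name="ranking.features.query(max_chunk_text_scores_param)">0.153392</field>
+    <field name="yql">
+        select *
+        from %{schema}
+        where default contains text(@query) or
+        ({label:"title_label", targetHits:100}nearestNeighbor(title_embedding, embedding)) or
+        ({label:"chunks_label", targetHits:100}nearestNeighbor(chunk_embeddings, embedding))
+    </field>
+    <field name="hits">10</field>
+    <field name="ranking.profile">learned-linear</field>
+    <field name="presentation.summary">top_3_chunks</field>
+</query-profile>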
+```
+
+
+### Evaluating first-phase ranking
+
+Now we are ready to evaluate our first-phase ranking function. We can use the [VespaEvaluator](https://vespa-engine.github.io/pyvespa/evaluating-vespa-application-cloud.html#vespaevaluator) from the [pyvespa](https://vespa-engine.github.io/pyvespa/) library to do this.
+
+Run the following command to run the [evaluation script](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/evaluate_ranking.py):
+
+```bash
+$ python eval/evaluate_ranking.py
+```
+
+We run the evaluation script on a set of unseen test queries, and get the following output:
+
+```json expandable
+{
+ "accuracy@1": 1.0,
+ "accuracy@3": 1.0,
+ "accuracy@5": 1.0,
+ "accuracy@10": 1.0,
+ "precision@10": 0.235,
+ "recall@10": 0.9405,
+ "precision@20": 0.13,
+ "recall@20": 0.9955,
+ "mrr@10": 1.0,
+ "ndcg@10": 0.8902,
+ "map@100": 0.8197,
+ "searchtime_avg": 0.017,
+ "searchtime_q50": 0.0165,
+ "searchtime_q90": 0.0251,
+ "searchtime_q95": 0.0267
+}
+```
+
+For the first phase ranking, we care most about recall, as we just want to make sure that the candidate documents are ranked high enough to be included in the second-phase ranking. The number of documents to be reranked in second-phase in total over all content nodes is controlled by the `total-rerank-count` parameter.
+
+We can see that our results are already very good. This is of course due to the fact that we have a small, synthetic dataset. In reality, you should align the metric expectations with your dataset and test queries.
+
+We can also see that our search time is quite fast, with an average of 17ms. You should consider whether this is well within your latency budget, as you want some headroom for second-phase ranking.
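+
+For reference, here is a minimal sketch of how `recall@k` and the reciprocal rank behind `mrr@k` can be computed for a single query, assuming the standard definitions of these metrics. The function names and the example ranking are hypothetical; `VespaEvaluator` computes and averages the metrics over all test queries for you.
+
+```python
+from typing import List, Set
+
+
+def recall_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
+    """Fraction of the relevant documents found among the top k hits."""
+    if not relevant_ids:
+        return 0.0
+    return len(set(ranked_ids[:k]) & relevant_ids) / len(relevant_ids)
+
+
+def reciprocal_rank_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
+    """1 / rank of the first relevant hit within the top k, or 0 if there is none."""
+    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
+        if doc_id in relevant_ids:
+            return 1.0 / rank
+    return 0.0
+
+
+# Hypothetical ranking for one query with three relevant documents
+ranked = ["82", "23", "50", "28", "1"]
+relevant = {"1", "50", "82"}
+recall_3 = recall_at_k(ranked, relevant, 3)
+recall_5 = recall_at_k(ranked, relevant, 5)
+rr_10 = reciprocal_rank_at_k(ranked, relevant, 10)
+```
+
+Recall grows with `k`, which is why it is the metric to watch when deciding how many first-phase hits to pass on to the second phase.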
+
+
+## Second-phase ranking
+
+For the second-phase ranking, we can afford to use a more expensive ranking expression, since we will only run it on the top-k documents from the first-phase ranking (decided by the `total-rerank-count` parameter).
+
+This is where we can significantly improve ranking quality by using more sophisticated models and features that would be too expensive to compute for all matched documents.
+
+
+### Collecting features for second-phase ranking
+
+For second-phase ranking, we request Vespa's default set of rank features, which includes a comprehensive set of text features. See the [rank features documentation](/en/reference/ranking/rank-features) for complete details.
+
+We can collect both match features and rank features by running the same [script](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/collect_pyvespa.py) as we did for first-phase ranking, with some additional parameters to collect rank features as well:
+
+```bash
+$ python eval/collect_pyvespa.py --collect_rankfeatures --collect_matchfeatures --collector_name rankfeatures-secondphase
+```
+
+This collects approximately 194 features, providing a rich feature set for training more sophisticated ranking models.
+
+
+### Training a GBDT model for second-phase ranking
+
+With the expanded feature set, we can train a Gradient Boosted Decision Tree (GBDT) model to predict document relevance. We use [LightGBM](/en/ranking/lightgbm) for this purpose.
+
+Vespa also supports [XGBoost](/en/ranking/xgboost) and [ONNX](/en/ranking/onnx) models.
+
+To train the model, run the following command ([link to training script](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/train_lightgbm.py)):
+
+```bash
+$ python eval/train_lightgbm.py --input_file eval/output/Vespa-training-data_match_rank_second_phase_20250623_135819.csv
+```
+
+The training process includes several important considerations:
+
+- **Cross-validation**: We use 5-fold stratified cross-validation to evaluate model performance and prevent overfitting
+- **Hyperparameter tuning**: We set conservative hyperparameters to prevent growing overly large and deep trees, especially important for smaller datasets
+- **Feature selection**: Features with zero importance during cross-validation are excluded from the final model
+- **Early stopping**: Training stops when validation scores don't improve for 50 rounds
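+
+To illustrate what stratified cross-validation does, here is a minimal pure-Python sketch that splits rows into folds with equal label ratios. The function name is hypothetical, and this is not the code used in the training script; scikit-learn's `StratifiedKFold` is a ready-made utility for the same purpose.
+
+```python
+import random
+from collections import defaultdict
+from typing import Dict, List, Sequence
+
+
+def stratified_folds(labels: Sequence[int], n_folds: int = 5, seed: int = 42) -> List[List[int]]:
+    """Split row indices into n_folds folds with roughly equal label ratios."""
+    by_label: Dict[int, List[int]] = defaultdict(list)
+    for index, label in enumerate(labels):
+        by_label[label].append(index)
+
+    rng = random.Random(seed)
+    folds: List[List[int]] = [[] for _ in range(n_folds)]
+    for indices in by_label.values():
+        rng.shuffle(indices)
+        # Deal the shuffled rows of each class round-robin across the folds
+        for position, index in enumerate(indices):
+            folds[position % n_folds].append(index)
+    return folds
+
+
+# Hypothetical dataset: 10 relevant and 10 random documents
+labels = [1] * 10 + [0] * 10
+folds = stratified_folds(labels)
+```
+
+Each fold is used once as the validation set while the model is trained on the remaining folds, so every fold sees the same mix of relevant and random documents.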
+
+Example training output:
+
+```txt
+------------------------------------------------------------
+ Cross-Validation Results (5-Fold)
+------------------------------------------------------------
+Metric | Mean | Std Dev
+------------------------------------------------------------
+Accuracy | 0.9214 | 0.0664
+ROC AUC | 0.9863 | 0.0197
+------------------------------------------------------------
+Overall CV AUC: 0.9249 • ACC: 0.9216
+------------------------------------------------------------
+```
+
+
+### Feature importance analysis
+
+The trained model reveals which features are most important for ranking quality. For our sample application, the top features include:
+
+| Feature | Importance |
+| :--- | :--- |
+| nativeProximity | 168.8498 |
+| firstPhase | 151.7382 |
+| max_chunk_sim_scores | 69.4377 |
+| avg_top_3_chunk_text_scores | 56.5079 |
+| avg_top_3_chunk_sim_scores | 31.8700 |
+| nativeRank | 20.0716 |
+| nativeFieldMatch | 15.9914 |
+| elementSimilarity(chunks) | 9.7003 |
+
+Key observations:
+
+- **Text proximity features** ([nativeProximity](/en/reference/ranking/nativerank#nativeProximity)) are highly valuable for understanding query-document relevance
+- **First-phase score** (`firstPhase`) being important validates that our first-phase ranking provides a good foundation
+- **Chunk-level features** (both text and semantic) contribute significantly to ranking quality
+- **Traditional text features** like [nativeRank](/en/reference/ranking/nativerank#nativeRank) and [bm25](/en/ranking/bm25#ranking-function) remain important
+
+
+### Integrating the GBDT model into Vespa
+
+The trained LightGBM model is exported and added to your Vespa application package:
+
+```txt
+app/
+├── models/
+│ └── lightgbm_model.json
+```
+
+Create a new rank profile that uses this model:
+
+```txt
+rank-profile second-with-gbdt inherits collect-training-data {
+ ...
+
+ second-phase {
+ expression: lightgbm("lightgbm_model.json")
+ }
+
+ ...
+}
+```
+
+And redeploy your application:
+
+```bash
+$ vespa deploy
+```
+
+
+### Evaluating second-phase ranking performance
+
+Run the [evaluate_ranking.py](https://github.com/vespa-engine/sample-apps/blob/master/rag-blueprint/eval/evaluate_ranking.py) script to evaluate the GBDT-powered second-phase ranking on unseen test queries:
+
+```bash
+$ python eval/evaluate_ranking.py --second_phase
+```
+
+Expect results like this:
+
+```json expandable
+{
+ "accuracy@1": 0.9,
+ "accuracy@3": 1.0,
+ "accuracy@5": 1.0,
+ "accuracy@10": 1.0,
+ "precision@10": 0.235,
+ "recall@10": 0.9402,
+ "precision@20": 0.13,
+ "recall@20": 0.9955,
+ "mrr@10": 0.95,
+ "ndcg@10": 0.8782,
+ "map@100": 0.8091,
+ "searchtime_avg": 0.0204,
+ "searchtime_q50": 0.018,
+ "searchtime_q90": 0.0333,
+ "searchtime_q95": 0.0362
+}
+```
+
+For a larger dataset, we would expect to see significant improvement over first-phase ranking. Since our first-phase ranking is already quite good, we cannot see this here, but we will leave the comparison code for you to run on a real-world dataset.
+
+We also observe a slight increase in search time (from 17ms to 20ms average), which is expected due to the additional complexity of the GBDT model.
+
+
+### Query profiles with GBDT ranking
+
+Create new query profiles that leverage the improved ranking:
+
+```xml
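+<query-profile id="hybrid-with-gbdt" inherits="hybrid">
+    <field name="ranking.profile">second-with-gbdt</field>
+    <field name="hits">20</field>
+</query-profile>
+
+<query-profile id="rag-with-gbdt" inherits="hybrid-with-gbdt">
+    <field name="hits">50</field>
+    <field name="searchChain">openai</field>
+    <field name="presentation.format">sse</field>
+</query-profile>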
+```
+
+Test the improved ranking:
+
+```bash
+$ vespa query query="what are key points learned for finetuning llms?" queryProfile=hybrid-with-gbdt
+```
+
+For RAG applications with LLM generation:
+
+```bash
+$ vespa query \
+ --timeout 60 \
+ query="what are key points learned for finetuning llms?" \
+ queryProfile=rag-with-gbdt
+```
+
+
+### Best practices for second-phase ranking
+
+**Model complexity considerations:**
+
+- Use more sophisticated models (GBDT, neural networks) that would be too expensive for first-phase
+- Take advantage of the reduced candidate set (typically 100-10,000 documents)
+- Include expensive text features like `nativeProximity` and `fieldMatch`
+
+**Feature engineering:**
+
+- Combine first-phase scores with additional text and semantic features
+- Use chunk-level aggregations (max, average, top-k) to capture document structure
+- Include metadata signals
+
+**Training data quality:**
+
+- Use the first-phase ranking to generate better training data
+- Consider having LLMs generate relevance judgments for top-k results
+- Iteratively improve with user interaction data when available
+
+**Performance monitoring:**
+
+- Monitor latency impact of second-phase ranking
+- Adjust `total-rerank-count` based on quality vs. performance trade-offs
+- Consider using different models for different query types or use cases
+
+The second-phase ranking represents a crucial step in building high-quality RAG applications, providing the precision needed for effective LLM context while maintaining reasonable query latencies.
+
+
+## (Optional) Global-phase ranking
+
+We also have the option of configuring [global-phase](/en/reference/schemas/schemas#globalphase-rank) ranking, which can rerank the top k documents (as set by the `rerank-count` parameter) from the second-phase ranking.
+
+Common options for global-phase are [cross-encoders](/en/ranking/cross-encoders) or another GBDT model, trained for better separating top ranked documents on objectives such as [LambdaMART](https://xgboost.readthedocs.io/en/latest/tutorials/learning_to_rank.html). For RAG applications, we consider this less important than for search applications where the results are mainly consumed by a human, as LLMs don't care that much about the ordering of the results.
+
+
+## Further improvements
+
+Finally, we will sketch out some opportunities for further improvements. As you have seen, we started out with only binary relevance labels for a few queries, and trained a model based on the relevant docs and a set of random documents.
+
+This was useful initially, as we had no better way to retrieve the candidate documents. Now that we have a reasonably good second-phase ranking, we could generate a new set of relevance labels for queries that we did not have labels for by having an LLM do relevance judgments of the top k returned hits. This training dataset would likely be even better at separating the top documents.
+
+As you may have noted, we have not discussed what most people think about when discussing RAG evals: evaluating the "Generation" step. There are several tools available to do this, for example [ragas](https://docs.ragas.io/en/stable/) and [ARES](https://github.com/stanford-futuredata/ARES). We refer to other sources for details on this, as this tutorial is probably enough to digest as it is.
+
+
+## Summary
+
+In this tutorial, we have built a complete RAG application using Vespa, with our recommendations for each stage: a retrieval phase combining binary vectors and text matching, a first-phase ranking using a linear combination of relatively cheap features, and a more sophisticated second-phase ranking using more expensive features and a GBDT model.
+
+We hope that this tutorial, along with the provided code in our [sample-apps repository](https://github.com/vespa-engine/sample-apps/tree/master/rag-blueprint), will serve as a useful reference for building your own RAG applications, with an evaluation-driven approach.
+
+By using the principles demonstrated in this tutorial, you are empowered to build high-quality RAG applications that can scale to any dataset size, and any query load.
+
+
+## FAQ
+
+
+
+We love ColBERT, and it has shown great performance. We do support ColBERT-style models in Vespa. The challenge is the added cost in memory storage, especially for large-scale applications. If you use it, we recommend binarizing the vectors to reduce memory usage 32x compared to float. If you want to improve the ranking quality and accept the additional cost, we encourage you to evaluate and try. Here are some resources if you want to learn more about using ColBERT with Vespa:
+
+- [Announcing ColBERT embedder](https://blog.vespa.ai/announcing-colbert-embedder-in-vespa/#what-is-colbert?)
+- [Long context ColBERT](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/)
+- [Long context ColBERT sample app](https://github.com/vespa-engine/sample-apps/tree/master/colbert-long/#vespa-sample-applications---long-context-colbert)
+- [ColBERT sample app](https://github.com/vespa-engine/sample-apps/tree/master/colbert)
+- [ColBERT embedder reference](/en/rag/embedding#colbert-embedder)
+- [ColBERT standalone python example notebook](https://vespa-engine.github.io/pyvespa/examples/colbert_standalone_Vespa-cloud.html)
+- [ColBERT standalone long context example notebook](https://vespa-engine.github.io/pyvespa/examples/colbert_standalone_long_context_Vespa-cloud.html)
+
+
+
+Vespa supports a variety of embedding models. For a list of Vespa-provided models on Vespa Cloud, see [Model hub](/en/rag/model-hub). See also [embedding reference](/en/rag/embedding#provided-embedders) for how to use embedders. You can also use private models (gated by authentication with a Bearer token from the Vespa Cloud secret store).
+
+
+
+No, you are free to use Vespa as a search engine. We provide the option of calling out to LLMs from within a Vespa application, which reduces latency compared to sending large search result sets several times over the network, as well as the option to deploy local LLMs, optionally in your own infrastructure if you prefer. See [Vespa Cloud Enclave](/en/operations/enclave/enclave).
+
+
+
+Binary vectors take up a lot less memory and are faster to compute distances on, with only a slight reduction in quality. See blog [post](https://blog.vespa.ai/combining-matryoshka-with-binary-quantization-using-embedder/) for details.
+
+
+
+Vespa can scale both the stateless container nodes and content nodes of your application. See [overview](../overview) and [elasticity](/en/content/elasticity) for details.
+
+
diff --git a/mintlify-docs/en/learn/tutorials/text-search-ml.mdx b/mintlify-docs/en/learn/tutorials/text-search-ml.mdx
new file mode 100644
index 0000000000..fdb451b603
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/text-search-ml.mdx
@@ -0,0 +1,333 @@
+---
+title: "Improving Text Search through ML"
+---
+
+At this point, we assume you have read our [Text Search Tutorial](/en/learn/tutorials/text-search) and accomplished the following steps.
+
+- Created and deployed a basic text search app in Vespa.
+- Fed the app with the MS MARCO full document dataset.
+- Compared and evaluated two different ranking functions.
+
+We are now going to show you how to create a dataset that joins relevance information from the MS MARCO dataset with ranking features from Vespa to enable you to train ML models to improve your application. More specifically, you will accomplish the following steps in this tutorial.
+
+- Learn how to collect rank feature data from Vespa associated with a specific query.
+- Create a dataset that can be used to improve your app's ranking function.
+- Propose sanity-checks to help you detect bugs in your data collection logic and ensure you have a properly built dataset at the end of the process.
+- Illustrate the importance of going beyond pointwise loss functions when dealing with Learning To Rank (LTR) tasks.
+
+[Vespa Product Ranking](https://github.com/vespa-engine/sample-apps/tree/master/commerce-product-ranking) is a good resource for Learning To Rank using XGBoost and LightGBM, with linked blog posts.
+
+
+## Collect rank feature data from Vespa
+
+Vespa's [rank feature set](/en/reference/ranking/rank-features) contains a large set of low and high level features. Those features are useful to understand the behavior of your app and to improve your ranking function.
+
+
+### Default rank features
+
+To access the default set of ranking features, set the query parameter [`ranking.listFeatures`](/en/reference/api/query#ranking.listfeatures) to `true`. For example, the [query](/en/querying/query-language) below selects the `bm25` rank-profile developed in the previous tutorial and returns the rank features associated with each of the results returned.
+
+```bash
+$ vespa query \
+ 'yql=select id,rankfeatures from msmarco where userQuery()' \
+ 'query=what is dad bod' \
+ 'ranking=bm25' \
+ 'type=weakAnd' \
+ 'ranking.listFeatures=true'
+```
+
+The list of rank features that are returned by default can change in the future; the current list can be checked in the [system test](https://github.com/vespa-engine/system-test/blob/master/tests/search/rankfeatures/dump.txt). For the request above we get the following (edited) JSON back. Each result will contain a field called `rankfeatures` containing the set of default ranking features:
+
+```json expandable
+{
+ "root": {
+ "children": [
+ ...
+ {
+ "fields": {
+ "rankfeatures": {
+ ...
+ "attributeMatch(id).totalWeight": 0.0,
+ "attributeMatch(id).weight": 0.0,
+ "elementCompleteness(body).completeness": 0.5051413881748072,
+ "elementCompleteness(body).elementWeight": 1.0,
+ "elementCompleteness(body).fieldCompleteness": 0.010282776349614395,
+ "elementCompleteness(body).queryCompleteness": 1.0,
+ "elementCompleteness(title).completeness": 0.75,
+ "elementCompleteness(title).elementWeight": 1.0,
+ "elementCompleteness(title).fieldCompleteness": 1.0,
+ "elementCompleteness(title).queryCompleteness": 0.5,
+ "fieldMatch(body)": 0.7529285549778888,
+ "fieldMatch(body).absoluteOccurrence": 0.065,
+ ...
+ }
+ },
+ "id": "index:msmarco/0/811ccbaf9796f92bfa343045",
+ "relevance": 37.7705101001455,
+ "source": "msmarco"
+ },
+ ],
+ ...
+ }
+}
+```
+
+
+### Choose and process specific rank features
+
+If instead of returning the complete set of rank features you want to select [specific ones](/en/reference/ranking/rank-features), you can add a new rank-profile (let's call it `collect_rank_features`) to our *msmarco.sd* schema definition and disable the default ranking features by adding `ignore-default-rank-features` to the new rank-profile. In addition, we can specify the desired features within the `rank-features` element. In the example below we explicitly configured Vespa to only return `bm25(title)`, `bm25(body)`, `nativeRank(title)` and `nativeRank(body)`.
+
+Note that using *all* available rank features comes with computational cost, as Vespa needs to calculate all these features. Using many features is usually only advisable using second phase ranking, see [phased ranking with Vespa](/en/ranking/phased-ranking).
+
+```sd expandable
+schema msmarco {
+ document msmarco {
+ field id type string {
+ indexing: attribute | summary
+ }
+ field title type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ }
+ field body type string {
+ indexing: index
+ index: enable-bm25
+ }
+ }
+
+ document-summary minimal {
+ summary id { }
+ }
+
+ fieldset default {
+ fields: title, body, url
+ }
+
+ rank-profile default {
+ first-phase {
+ expression: nativeRank(title, body, url)
+ }
+ }
+
+ rank-profile bm25 inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body) + bm25(url)
+ }
+ }
+
+ rank-profile collect_rank_features inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body) + bm25(url)
+ }
+ second-phase {
+ expression: random
+ }
+
+ match-features {
+ bm25(title)
+ bm25(body)
+ bm25(url)
+ nativeRank(title)
+ nativeRank(body)
+ nativeRank(url)
+ }
+ }
+}
+```
+
+
+```
+Paste the above into file text-search/app/schemas/msmarco.sd
+```
+
+The [random](/en/reference/ranking/rank-features#random) global feature will be useful in the next section when we describe our data collection process.
+
+After adding the `collect_rank_features` rank-profile to *msmarco.sd*, redeploy the app:
+
+```bash
+$ vespa deploy --wait 300 app
+```
+
+
+## Create a training dataset
+
+The [MS MARCO](https://microsoft.github.io/msmarco/) dataset described in [the previous tutorial](/en/learn/tutorials/text-search) provides us with more than 300 000 training queries, each of which is associated with a specific document ID that is relevant to the query. In this section we want to combine the information contained in the pairs `(query, relevant_id)` with the information available in the Vespa ranking features to create a dataset that can be used to train ML models to improve the ranking function of our msmarco text app.
+
+Before we move on to describe the collection process in detail, we want to point out that the whole process can be replicated by the following call to the data collection script `collect_training_data.py` available in [this tutorial repository](https://github.com/vespa-engine/sample-apps/tree/master/text-search):
+
+The following routine requires that you have downloaded the full dataset.
+
+```bash
+$ ./src/python/collect_training_data.py msmarco collect_rank_features 99
+```
+
+The command above uses the data in the query (msmarco-doctrain-queries.tsv.gz) and relevance (msmarco-doctrain-qrels.tsv.gz) files that are part of the MS MARCO dataset, and sends queries to Vespa using the `collect_rank_features` rank-profile defined in the previous section, requesting `99` randomly selected documents for each query in addition to the relevant document associated with the query. All the data from the requests is then parsed and stored in the output folder, which is chosen to be `data` in this case.
+
+
+### Data collection logic
+
+Since we want to improve the first-phase ranking function of our application, our goal here is to create a dataset that will be used to train models that will generalize well when used in the first-phase ranking of an actual Vespa instance running against possibly unseen queries and documents. This might be obvious at first but turns out to be easy to neglect when making some data collection decisions.
+
+The logic behind the `collect_training_data.py` can be summarized by the pseudo-code below:
+
+```python
+hits = get_relevant_hit(query, rank_profile, relevant_id)
+if hits:  # only continue when the relevant document was matched by the query
+ hits.extend(get_random_hits(query, rank_profile, number_random_sample))
+ data = annotate_data(hits, query_id, relevant_id)
+ append_data(file, data)
+```
+
+For each query, we first send a request to Vespa to get the relevant document associated with the query. If the relevant document is matched by the query, Vespa will return it, and we will expand the number of documents associated with the query by sending a second request to Vespa. The second request asks Vespa to return a number of random documents sampled from the set of documents that were matched by the query. We then parse the hits returned by Vespa and organize the data into a tabular form containing the rank features and the binary variable indicating if the query-document pair is relevant or not.
+
+We are only interested in collecting documents that are matched by the query because those are the documents that would be presented to the first-phase model in a production environment. This means that we will likely leave some queries that contain information about relevant documents out of the collected dataset, but it will create a dataset that is closer to our stated goal. In other words, the dataset we collect is conditional on our match criteria.
+
+
+### Get relevant hit
+
+The first Vespa request is contained in the function call `get_relevant_hit(query, rank_profile, relevant_id)` where the `query` parameter contains the desired query string, `rank_profile` is set to the `collect_rank_features` defined earlier and `relevant_id` is the document ID that is said to be relevant to that specific query.
+
+The body of the request is given by:
+
+```python
+body = {
+ "yql": "select id, rankfeatures from sources * where userQuery()",
+ "query": query,
+ "hits": 1,
+ "recall": "+id:" + str(relevant_id),
+ "ranking": {"profile": rank_profile, "listFeatures": "true"},
+}
+```
+
+where the `yql` and `query` parameters instruct Vespa to return the *id* of the documents along with the selected rank-features defined in the `collect_rank_features` rank-profile. The `hits` parameter is set to 1 because we know there is only one relevant id for each query, so we ask Vespa to return only one document in the result set. The `recall` parameter allows us to specify the exact document *id* we want to retrieve.
+
+Note that the parameter `recall` only works if the document is matched by the query, which is exactly the behavior we want in this case.
+
+The `recall` syntax to retrieve one document with id equal to 1 is given by `"recall": "+id:1"` and the syntax to retrieve more than one document, say documents with ids 1 and 2 is given by `"recall": "+(id:1 id:2)"`.
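+
+The two forms above can be generated from a list of ids. The sketch below is only an illustration; `build_recall` is a hypothetical helper name and is not taken from `collect_training_data.py`:
+
+```python
+def build_recall(doc_ids):
+    """Build a `recall` string that restricts the result to the given document ids."""
+    terms = ["id:" + str(doc_id) for doc_id in doc_ids]
+    if not terms:
+        raise ValueError("at least one document id is required")
+    if len(terms) == 1:
+        # Single document: "+id:1"
+        return "+" + terms[0]
+    # Several documents: "+(id:1 id:2)"
+    return "+(" + " ".join(terms) + ")"
+
+
+print(build_recall([1]))
+print(build_recall([1, 2]))
+```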
+
+If we wanted to retrieve the document even if it did not match the query specification we could alter the query to use the following query specification:
+
+```python
+body = {
+ "yql": "select id, rankfeatures from sources * where true or userQuery()",
+ "query": query,
+ "hits": 1,
+ "recall": "+id:" + str(relevant_id),
+ "ranking": {"profile": rank_profile, "listFeatures": "true"},
+}
+```
+
+
+### Get random hits
+
+The second Vespa request happens when we want to extend the dataset by adding randomly selected documents from the matched set. The request is contained in the function call `get_random_hits(query, rank_profile, number_random_sample)` where the only new parameter is `number_random_sample`, which specifies how many documents we should sample from the matched set.
+
+The body of the request is given by:
+
+```python
+body = {
+ "yql": "select id, rankfeatures from sources * where default contains text(@userQuery)",
+ "userQuery": query,
+ "hits": number_random_sample,
+ "ranking": {"profile": rank_profile, "listFeatures": "true"},
+}
+```
+
+where the only changes with respect to `get_relevant_hit` are that we no longer need to use the `recall` parameter and that we set the number of hits returned by Vespa to be equal to `number_random_sample`.
+
+Remember we had configured the second phase to use random scoring:
+
+```sd
+second-phase {
+ expression: random
+}
+```
+
+Using `random` as our second-phase ranking function ensures that the top documents returned by Vespa are randomly selected from the set of documents that were matched by the query.
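+
+The idea can be illustrated outside Vespa: if every matched document gets an independent random score and we keep the top `k`, the result is a random sample of the matched set. The sketch below is plain Python under that assumption, not Vespa's implementation; `top_k_by_random_score` is a hypothetical name:
+
+```python
+import random
+
+
+def top_k_by_random_score(matched_ids, k, seed=None):
+    """Return k document ids, chosen by ranking on an independent random score."""
+    rng = random.Random(seed)
+    # Assign each matched document a random score, as a `random` ranking expression would.
+    scored = [(rng.random(), doc_id) for doc_id in matched_ids]
+    scored.sort(reverse=True)
+    return [doc_id for _, doc_id in scored[:k]]
+
+
+sample = top_k_by_random_score(["D1", "D2", "D3", "D4", "D5"], k=2, seed=0)
+print(sample)
+```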
+
+
+### Annotated data
+
+Once we have both the relevant and the random documents associated with a given query, we parse the Vespa result and store it in a file with the following format:
+
+| bm25(body) | bm25(title) | nativeRank(body) | nativeRank(title) | docid | qid | relevant |
+| --- | --- | --- | --- | --- | --- | --- |
+| 25.792076 | 12.117309 | 0.322567 | 0.084239 | D312959 | 3 | 1 |
+| 22.191228 | 0.043899 | 0.247145 | 0.017715 | D3162299 | 3 | 0 |
+| 13.880625 | 0.098052 | 0.219413 | 0.036826 | D2823827 | 3 | 0 |
+
+where the values in the `relevant` column are equal to 1 if document `docid` is relevant to the query `qid` and zero otherwise.
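+
+A minimal sketch of this annotation step is shown below. It assumes each hit carries `fields.id` and `fields.rankfeatures`, as in the JSON response shown earlier; the actual `annotate_data` function in the script may be implemented differently:
+
+```python
+def annotate_data(hits, query_id, relevant_id):
+    """Turn Vespa hits into flat rows with a binary `relevant` label."""
+    rows = []
+    for hit in hits:
+        fields = hit["fields"]
+        row = dict(fields["rankfeatures"])  # copy the feature name -> value map
+        row["docid"] = fields["id"]
+        row["qid"] = query_id
+        # 1 for the document judged relevant to the query, 0 for the sampled ones
+        row["relevant"] = 1 if fields["id"] == relevant_id else 0
+        rows.append(row)
+    return rows
+
+
+example_hits = [
+    {"fields": {"id": "D312959", "rankfeatures": {"bm25(body)": 25.792076, "bm25(title)": 12.117309}}},
+    {"fields": {"id": "D3162299", "rankfeatures": {"bm25(body)": 22.191228, "bm25(title)": 0.043899}}},
+]
+rows = annotate_data(example_hits, query_id=3, relevant_id="D312959")
+print(rows)
+```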
+
+
+## Data collection sanity check
+
+In the process of writing this tutorial and creating the data collection logic described above, we found it useful to develop a data collection sanity-check to help us catch bugs in our process. There is no unique right answer here, but our proposal is to use the dataset to train a model using the same features and functional form used by the baseline you want to improve upon. If the dataset is well-built and contains useful information about the task you are interested in, you should be able to get results at least as good as the one obtained by your baseline on a separate test set.
+
+In our case, the baseline is the ranking function used in [our previous tutorial](/en/learn/tutorials/text-search):
+
+```sd
+rank-profile bm25 inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body)
+ }
+}
+```
+
+Therefore, our sanity-check model will be a linear model containing only the two features above, i.e. `a + b * bm25(title) + c * bm25(body)`, where `a`, `b` and `c` should be learned by using our collected dataset.
+
+We split our dataset into training and validation sets, train the linear model and evaluate it on the validation dataset. We then expect the difference observed in the collected validation set between the model and the baseline to be similar to the difference observed on a running instance of Vespa when applied to an independent test set. In addition, we expect the trained model to do at least as well as the baseline on a test set, given that the baseline model is contained in the set of possible trained models and is recovered when `a=0`, `b=1` and `c=1`.
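+
+The claim that the baseline is a special case of the linear model can be checked directly. The sketch below uses hypothetical helper names and one feature row in the format of the collected dataset:
+
+```python
+def baseline_score(features):
+    """Score used by the bm25 baseline rank-profile."""
+    return features["bm25(title)"] + features["bm25(body)"]
+
+
+def linear_score(features, a, b, c):
+    """Sanity-check model: a + b * bm25(title) + c * bm25(body)."""
+    return a + b * features["bm25(title)"] + c * features["bm25(body)"]
+
+
+row = {"bm25(title)": 12.117309, "bm25(body)": 25.792076}
+
+# With a=0, b=1 and c=1 the linear model reproduces the baseline score.
+print(linear_score(row, a=0.0, b=1.0, c=1.0) == baseline_score(row))
+```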
+
+This is a simple procedure, but it did catch some bugs while we were writing this tutorial. For example, at one point we forgot to include
+
+```sd
+second-phase {
+ expression: random
+}
+```
+
+in the `collect_rank_features` rank-profile leading to a biased dataset where the negative examples were actually quite relevant to the query. The trained model did well on the validation set, but failed miserably on the test set when deployed to Vespa. This showed us that our dataset probably had a different distribution than what was observed on a running Vespa instance and led us to investigate and catch the bug.
+
+
+## Beyond pointwise loss functions
+
+The most straightforward way to train the linear model mentioned in the previous section would be to use a vanilla logistic regression, since our target variable `relevant` is binary. The most commonly used loss function in this case (binary cross-entropy) is referred to as a pointwise loss function in the LTR literature, as it does not take the relative order of documents into account. However, as we described in [the previous tutorial](/en/learn/tutorials/text-search), the metric that we want to optimize in this case is the Mean Reciprocal Rank (MRR). The MRR is affected by the relative order of the relevance we assign to the list of documents generated by a query and not by their absolute magnitudes. This disconnect between the characteristics of the loss function and the metric of interest might lead to suboptimal results.
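+
+To make the point about relative order concrete, the sketch below computes the reciprocal rank for one query and the MRR over several. The function names are ours, and ties are resolved in favor of the relevant document for simplicity:
+
+```python
+def reciprocal_rank(scores, relevant_index):
+    """Return 1 / rank of the relevant document when sorting by descending score."""
+    relevant_score = scores[relevant_index]
+    # Rank = 1 + number of documents scored strictly higher than the relevant one.
+    rank = 1 + sum(1 for s in scores if s > relevant_score)
+    return 1.0 / rank
+
+
+def mean_reciprocal_rank(queries):
+    """`queries` is a list of (scores, relevant_index) pairs, one per query."""
+    return sum(reciprocal_rank(s, i) for s, i in queries) / len(queries)
+
+
+scores = [0.2, 0.9, 0.5]             # the relevant document is at index 2
+rescaled = [10 * s for s in scores]  # same order, different magnitudes
+
+# Both calls give the same value: only the order of the scores matters.
+print(reciprocal_rank(scores, 2), reciprocal_rank(rescaled, 2))
+```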
+
+For ranking search results, it is preferable to use a listwise loss function when training our linear model, which takes the entire ranked list into consideration when updating the model parameters. To illustrate this, we trained linear models using the [TF-Ranking framework](https://github.com/tensorflow/ranking). The framework is built on top of TensorFlow and allows us to specify pointwise, pairwise and listwise loss functions, among other things.
+
+The two rank profiles below are obtained by training the linear model with pointwise (sigmoid cross-entropy) and listwise (softmax cross-entropy) loss functions, respectively:
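+
+The difference between the two kinds of loss can be seen in a small sketch. These are textbook definitions written in plain Python for illustration; they are not the TF-Ranking implementations, which may differ in details such as reduction and weighting:
+
+```python
+import math
+
+
+def sigmoid_cross_entropy(scores, labels):
+    """Pointwise loss: each document is scored independently of the others."""
+    total = 0.0
+    for s, y in zip(scores, labels):
+        p = 1.0 / (1.0 + math.exp(-s))
+        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
+    return total / len(scores)
+
+
+def softmax_cross_entropy(scores, labels):
+    """Listwise loss: scores are normalized over the whole list first."""
+    m = max(scores)  # subtract the max for numerical stability
+    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
+    return -sum(y * (s - log_z) for s, y in zip(scores, labels))
+
+
+labels = [1, 0, 0]
+scores = [2.0, 1.0, 0.5]
+shifted = [s + 3.0 for s in scores]  # same ranking, every score moved up
+
+# The listwise loss ignores the shift; the pointwise loss does not.
+print(softmax_cross_entropy(scores, labels), softmax_cross_entropy(shifted, labels))
+print(sigmoid_cross_entropy(scores, labels), sigmoid_cross_entropy(shifted, labels))
+```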
+
+```sd
+rank-profile pointwise_linear_bm25 inherits default {
+ first-phase {
+ expression: 0.22499913 * bm25(title) + 0.07596389 * bm25(body)
+ }
+}
+
+rank-profile listwise_linear_bm25 inherits default {
+ first-phase {
+ expression: 0.13446581 * bm25(title) + 0.5716889 * bm25(body)
+ }
+}
+```
+
+It is interesting to see that the pointwise loss function puts more weight on the title relative to the body, while the opposite happens when using the listwise loss function.
+
+The figure below shows how frequently (over more than 5,000 test queries) those two ranking functions place the relevant document between the 1st and 10th position of the list of documents returned by Vespa. Although there is not a huge difference between those models on average, we can clearly see in the figure below that the model based on a listwise loss function places more relevant documents in the first two positions of the ranked list than the pointwise model:
+
+
+
+
+
+Overall, there is not much difference between those models on average (with respect to MRR), which was expected given the simplicity of the models described here. The goal was simply to highlight the importance of choosing better loss functions when dealing with LTR tasks and to give a quick start for those who want to try this in their own applications. We expect the difference in MRR between pointwise and listwise loss functions to increase as we move on to more complex models.
+
+
+## Next steps
+
+In this tutorial we have looked at using a simple *linear* ranking function. Vespa integrates with several popular machine learning libraries which can be used for Machine Learned Ranking:
+
+- [Ranking with XGBoost Models](/en/ranking/xgboost)
+- [Ranking with LightGBM Models](/en/ranking/lightgbm)
+- [Ranking with ONNX Models](/en/ranking/onnx)
diff --git a/mintlify-docs/en/learn/tutorials/text-search.mdx b/mintlify-docs/en/learn/tutorials/text-search.mdx
new file mode 100644
index 0000000000..7d71b2a5b6
--- /dev/null
+++ b/mintlify-docs/en/learn/tutorials/text-search.mdx
@@ -0,0 +1,692 @@
+---
+title: "Text Search Tutorial"
+---
+
+This tutorial will guide you through setting up a simple text search application. At the end, you can index text documents in Vespa and search them via text queries. The application built here will be the foundation for other tutorials, such as creating ranking functions based on Machine Learning (ML) models.
+
+The main goal is to set up a text search app based on simple text scoring features such as [BM25](/en/ranking/bm25) [^1] and [nativeRank](/en/reference/ranking/nativerank).
+
+
+**Prerequisites:**
+
+- Linux, macOS or Windows 10 Pro on x86_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed, with an engine running.
+ - Alternatively, start the Podman daemon:
+
+ ```bash
+ $ podman machine init --memory 6000
+ $ podman machine start
+ ```
+
+- See [Docker Containers](/en/operations/self-managed/docker-containers) for system limits and other settings.
+- For CPUs older than Haswell (2013), see [CPU Support](/en/operations/self-managed/cpu-support).
+- Memory: Minimum 4 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup#memory-settings).
+- Disk: Avoid `NO_SPACE` - the vespaengine/vespa container image + headroom for data requires disk space. [Read more](/en/writing/feed-block).
+- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli), or download the Vespa CLI from [GitHub releases](https://github.com/vespa-engine/vespa/releases).
+- Python3
+- `curl`
+
+
+
+## Installing vespa-cli
+
+This tutorial uses [Vespa-CLI](/en/clients/vespa-cli) to deploy, feed and query Vespa. Below, we use [Homebrew](https://brew.sh/) to download and install `vespa-cli`; you can also download a binary from [GitHub](https://github.com/vespa-engine/vespa/releases) for your OS/CPU architecture.
+
+```bash
+$ brew install vespa-cli
+```
+
+We acquire the scripts to follow this tutorial from the [sample-apps repository](https://github.com/vespa-engine/sample-apps/tree/master/text-search) via `vespa clone`.
+
+```bash
+$ vespa clone text-search text-search && cd text-search
+```
+
+The repository contains a fully-fledged Vespa application, but below, we will build it all from scratch for educational purposes.
+
+
+## Dataset
+
+We use a dataset called [MS MARCO](https://microsoft.github.io/msmarco/) throughout this tutorial. MS MARCO is a collection of large-scale datasets released by Microsoft with the intent of helping the advance of deep learning research related to search. Many tasks are associated with MS MARCO datasets, but we want to build an end-to-end search application that returns relevant documents to a text query. We have included a small dataset sample for this tutorial under the `ext/sample` directory, which contains around 1000 documents.
+
+The sample data must be converted to Vespa [JSON feed format](/en/reference/schemas/document-json-format). The following step includes extracting documents, queries and relevance judgments from the sample files:
+
+```bash
+$ ./scripts/convert-msmarco.sh
+```
+
+After running the script, we end up with a file `dataset/documents.jsonl` containing lines such as the one below:
+
+```json
+{
+ "put": "id:msmarco:msmarco::D1555982",
+ "fields": {
+ "id": "D1555982",
+ "url": "https://answers.yahoo.com/question/index?qid=20071007114826AAwCFvR",
+ "title": "The hot glowing surfaces of stars emit energy in the form of electromagnetic radiation",
+ "body": "Science Mathematics Physics The hot glowing surfaces of stars emit energy in the form of electromagnetic radiation ... "
+ }
+}
+```
+
+In addition to `dataset/documents.jsonl`, we also have a `test-queries.tsv` file containing a list of the sampled queries along with the document ID relevant to each particular query.
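+
+To show how a feed line like the one above is put together, here is a minimal sketch. The helper name `to_put_operation` and the example values are hypothetical, and the conversion script may build its output differently:
+
+```python
+import json
+
+
+def to_put_operation(doc_id, url, title, body):
+    """Build one Vespa put operation for the msmarco document type."""
+    return {
+        "put": "id:msmarco:msmarco::" + doc_id,
+        "fields": {"id": doc_id, "url": url, "title": title, "body": body},
+    }
+
+
+# JSONL: one operation per line, so each document is serialized without newlines.
+line = json.dumps(to_put_operation("D1555982", "https://example.com", "A title", "A body"))
+print(line)
+```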
+
+
+## Create a Vespa Application Package
+
+A [Vespa application package](/en/basics/applications) is a set of configuration files and optional Java components that together define the behavior of a Vespa system. Let us define the minimum set of required files to create our basic text search application: `msmarco.sd` and `services.xml`.
+
+For this tutorial, we will create a new Vespa application rather than using the one in the repository, so we will create a directory for this application:
+
+```bash
+$ mkdir -p app/schemas
+```
+
+
+### Schema
+
+A [schema](/en/basics/schemas) is a document-type configuration; a single Vespa application can have multiple schemas with document types. For this application, we define a schema `msmarco` which must be saved in a file named `schemas/msmarco.sd`. Write the following to `text-search/app/schemas/msmarco.sd`:
+
+```sd expandable
+schema msmarco {
+ document msmarco {
+ field language type string {
+ indexing: "en" | set_language
+ }
+ field id type string {
+ indexing: attribute | summary
+ match: word
+ }
+ field title type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field body type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ }
+ fieldset default {
+ fields: title, body, url
+ }
+ document-summary minimal {
+ summary id { }
+ }
+ document-summary debug-tokens {
+ summary url {}
+ summary url-tokens {
+ source: url
+ tokens
+ }
+ from-disk
+ }
+ rank-profile default {
+ first-phase {
+ expression: nativeRank(title, body, url)
+ }
+ }
+ rank-profile bm25 inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body) + bm25(url)
+ }
+ }
+}
+```
+
+A lot is going on here; let us go through it in detail.
+
+
+#### Document type and fields
+
+The `document` section contains the fields of the document, their types, and how Vespa should index and [match](/en/reference/schemas/schemas#match) them.
+
+The field property `indexing` configures the *indexing pipeline* for a field. For more information, see [schemas - indexing](/en/basics/schemas#document-fields). The [string](/en/reference/schemas/schemas#string) data type is used to represent both unstructured and structured texts, and there are significant differences between [index and attribute](/en/querying/text-matching#index-and-attribute). For visibility, the schema above spells out the default `match` modes for `attribute` and `index` fields.
+
+Note that we are enabling the usage of [BM25](/en/ranking/bm25) for `title`, `body` and `url` by including `index: enable-bm25`. The language field is the only field not in the msmarco dataset. We hardcode its value to "en" since the dataset is English. Using `set_language` avoids automatic language detection and uses the value when processing the other text fields. Read more in [linguistics](/en/linguistics/linguistics).
+
+
+#### Fieldset for matching across multiple fields
+
+[Fieldset](/en/reference/schemas/schemas#fieldset) allows searching across multiple fields. Defining `fieldset` does not add indexing/storage overhead. String fields grouped using fieldsets must share the same [match](/en/reference/schemas/schemas#match) and [linguistic processing](/en/linguistics/linguistics) settings because the query processing that searches a field or fieldset uses *one* type of transformation.
+
+
+#### Document summaries to control search response contents
+
+Next, we define two [document summaries](/en/querying/document-summaries). Document summaries control what fields are available in the [response](/en/reference/querying/default-result-format); we include the `debug-tokens` document-summary to demonstrate later how we can get visibility into how text is converted into searchable tokens.
+
+
+#### Ranking to determine matched documents ordering
+
+You can define many [rank profiles](/en/basics/ranking), which are named collections of score calculations and ranking phases.
+
+In this tutorial, we define our `default` to be using [nativeRank](/en/reference/ranking/nativerank). In addition, we have a `bm25` rank-profile that uses [bm25](/en/ranking/bm25). Both are examples of text-scoring [rank-features](/en/reference/ranking/rank-features) in Vespa.
+
+
+### Services Specification
+
+The [services.xml](/en/reference/applications/services/services) defines the services that make up the Vespa application — which services to run and how many nodes per service. Write the following to `text-search/app/services.xml`:
+
+```xml expandable
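+<?xml version="1.0" encoding="UTF-8"?>
+<services version="1.0">
+
+    <container id="text_search" version="1.0">
+        <search />
+        <document-processing />
+        <document-api />
+    </container>
+
+    <content id="msmarco" version="1.0">
+        <min-redundancy>1</min-redundancy>
+        <documents>
+            <document type="msmarco" mode="index" />
+            <document-processing cluster="text_search" />
+        </documents>
+        <nodes>
+            <node distribution-key="0" hostalias="node1" />
+        </nodes>
+    </content>
+</services>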
+```
+
+```
+Paste the above into file text-search/app/services.xml
+```
+
+Some notes about the elements above:
+
+- `<container>` defines the [container cluster](/en/applications/containers) for document, query and result processing
+- `<search>` sets up the [query endpoint](/en/querying/query-api). The default port is 8080.
+- `<document-api>` sets up the [document endpoint](/en/reference/api/document-v1) for feeding.
+- `<content>` defines how documents are stored and searched
+- `<min-redundancy>` denotes how many copies to keep of each document.
+- `<documents>` assigns the document types in the *schema* to content clusters — the content cluster capacity can be increased by adding node elements — see [elasticity](/en/content/elasticity). (See also the [reference](/en/reference/applications/services/content) for more on content cluster setup.)
+- `<nodes>` defines the hosts for the content cluster.
+
+
+## Deploy the application package
+
+Once we have finished writing our application package, we can deploy it. We use settings similar to those in the [Vespa quick start guide](/en/basics/deploy-an-application-local).
+
+Start the Vespa container:
+
+```bash
+$ docker run --detach --name vespa-msmarco --hostname vespa-msmarco \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa
+```
+
+Notice that we publish two ports: 8080 is the data-plane port where we write and query documents, and 19071 is the control-plane port where we can deploy the application.
+
+Configure the Vespa CLI to use the local container:
+
+```bash
+$ vespa config set target local
+```
+
+Starting the container can take a short while. Make sure that the configuration service is running by using `vespa status`.
+
+```bash
+$ vespa status deploy --wait 300
+```
+
+Now, deploy the Vespa application from the `app` directory:
+
+```bash
+$ vespa deploy --wait 300 app
+```
+
+
+## Feed the data
+
+The data fed to Vespa must match the document type in the schema. The file `dataset/documents.jsonl` generated by the `convert-msmarco.sh` script described in the [dataset section](#dataset) already has data in the appropriate format expected by Vespa:
+
+```bash
+$ vespa feed -t http://localhost:8080 dataset/documents.jsonl
+```
+
+
+## Querying the data
+
+This section demonstrates various ways to search the data using the [Vespa query language](/en/querying/query-language). All the examples use the `vespa-cli` client. The tool uses the HTTP API, and if you pass `-v`, you will see the equivalent `curl` API request.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where default contains text(@user-query)' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+This query uses the YQL [text()](/en/reference/querying/yql#text) operator, a robust way to combine free text from users or models with application logic. Similar to `set_language` in indexing, we specify the language of the query using the [language](/en/linguistics/linguistics#querying-with-language) API parameter. This ensures symmetric linguistic processing of both the query and the document text. Automatic language detection is inaccurate for short query strings and might lead to asymmetric processing of queries and document texts.
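+
+The same request can be assembled for the HTTP query API. The sketch below only builds the URL and does not send it; it assumes the local container from this tutorial, reachable at `http://localhost:8080/search/`, and `build_query_url` is a hypothetical helper name:
+
+```python
+from urllib.parse import parse_qs, urlencode, urlsplit
+
+
+def build_query_url(user_query, hits=3, endpoint="http://localhost:8080/search/"):
+    """Return a GET URL carrying the same parameters as the CLI example above."""
+    params = {
+        "yql": "select * from msmarco where default contains text(@user-query)",
+        "user-query": user_query,
+        "hits": hits,
+        "language": "en",
+    }
+    # urlencode takes care of escaping spaces, '@' and parentheses.
+    return endpoint + "?" + urlencode(params)
+
+
+url = build_query_url("what is dad bod")
+print(url)
+```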
+
+Following is a partial output of the query above when using the small dataset sample:
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1,
+ "fields": {
+ "totalCount": 562
+ },
+ "children": [
+ {
+ "id": "id:msmarco:msmarco::D2977840",
+ "relevance": 0.20676669550322158,
+ "source": "msmarco",
+ "fields": {
+ "sddocname": "msmarco",
+ "body": "After The Cut released a piece explaining what the dadbodis last week the internet pretty much exploded into debate over the trend ",
+ "documentid": "id:msmarco:msmarco::D2977840",
+ "id": "D2977840",
+ "title": "What Is A Dad Bod An Insight Into The Latest Male Body Craze To Sweep The Internet",
+ "url": "http://www.huffingtonpost.co.uk/2015/05/05/what-is-a-dadbod-male-body_n_7212072.html"
+ }
+ }
+ ]
+ }
+}
+```
+
+As shown, 562 documents matched the query out of 996 in the corpus. The `first-phase` ranking expression scores all the matching documents.
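+
+When scripting against the query API, the match count and the ranked hits can be pulled out of the response. This is a small sketch, assuming the default JSON result format shown above and that `id` is part of the returned summary; `summarize_response` is a hypothetical helper name:
+
+```python
+import json
+
+
+def summarize_response(response_text):
+    """Return (totalCount, [(document id, relevance), ...]) from a query response."""
+    root = json.loads(response_text)["root"]
+    total = root.get("fields", {}).get("totalCount", 0)
+    hits = [(child["fields"]["id"], child["relevance"]) for child in root.get("children", [])]
+    return total, hits
+
+
+# Trimmed version of the response shown above.
+sample = '{"root": {"fields": {"totalCount": 562}, "children": [{"relevance": 0.20676669550322158, "fields": {"id": "D2977840"}}]}}'
+total, hits = summarize_response(sample)
+print(total, hits)
+```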
+
+A few important observations:
+
+- We did not specify which fields to search in the query. Vespa will, by default, use a field set or field named `default` when the query terms do not specify a field. In our case:
+
+```sd
+fieldset default {
+ fields: title, body, url
+}
+```
+
+- Our query for `what is dad bod` searches across all those three fields.
+- If we did not specify a `default` fieldset in the schema, the above query would return zero hits as the query did not specify a field.
+- The hit `relevance` holds the score computed by the rank profile. Vespa uses `default` by default. In our case:
+
+```sd
+rank-profile default {
+ first-phase {
+ expression: nativeRank(title, body, url)
+ }
+}
+```
+
+We can use query operator annotations for the [text](/en/reference/querying/yql#text) operator to control various matching aspects, for example to set the number of hits to produce in the text operator:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where title contains ({targetHits:100}text(@user-query))' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+Notice how the query above matches fewer documents (`totalCount: 116`) because we limited the free text query to the title field. We can change the [grammar](/en/reference/querying/yql#grammar) to specify how the user query text is parsed into a query execution plan. In the following example, we use `grammar:"all"` to specify that we only want to retrieve documents where *all* the query terms match the title field.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where title contains ({grammar:"all"}text(@user-query))' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+This query, using `all`, matches only one document. Notice how the relevance of the hit is the same as in the above example. The difference between the two types of queries is in the matching specification.
+
+We can use `text` to build a query that searches multiple fields (or fieldsets):
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where title contains ({grammar:"all"}text(@user-query)) or url contains ({grammar:"all"}text(@user-query))' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+
+### Boosting by query terms
+
+Sometimes, we want to add a query time boost if some field matches a query term; the following uses the [rank](/en/reference/querying/yql#rank) query operator. The rank query operator allows us to retrieve using the first operand, and the remaining operands can only impact ranking.
+
+It is important to note that the following approach for query time term boosting is in the context of using the `nativeRank` text scoring feature.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where rank(default contains text(@user-query), url contains ({weight:1000, significance:1.0}"www.answers.com"))' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+The above will match the user query against the default fieldset and produce match features for the second operand. It does not change the *retrieval* or *matching* as the number of documents exposed to ranking is the same as before. The `rank` operator can be used to implement a variety of use cases around boosting.
+
+
+#### Combine free text with filters
+
+We can combine the `text` operator with application logic. We add an application-specific query filter on the `url` field to demonstrate how to combine `text` with other query time constraints. We add `ranked:false` to tell Vespa that this specific term should not contribute to the relevance calculation and `filter:true` to ensure that the term is not used for [bolding/highlighting or dynamic snippeting](/en/querying/document-summaries#dynamic-snippets).
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where default contains text(@user-query) and url contains ({filter:true,ranked:false}"huffingtonpost.co.uk")' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en'
+```
+
+Notice that the `relevance` stays the same since we used `ranked:false` for the filter. Let us see what is going on by adding [query tracing](/en/querying/query-api#query-tracing):
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where default contains text(@user-query) and url contains ({filter:true,ranked:false}"huffingtonpost.co.uk")' \
+ 'user-query=what is dad bod' \
+ 'trace.level=3' \
+ 'language=en'
+```
+
+We can notice the following in the trace output:
+
+```
+query=[AND (WEAKAND(100) default:what default:is default:dad default:bod) |url:'huffingtonpost co uk']
+```
+
+Notice that the `text` part is converted to a [weakAnd](/en/ranking/wand) query operator and that this operator is AND'ed with a phrase search ('huffingtonpost co uk') in the `url` field. Notice also the field scoping, where the query terms are prefixed with `default`, and that punctuation characters (.) are removed as part of the tokenization. If filtering on specific strings like this is a common pattern, we should create a separate field to avoid phrase matching, because phrase matching is more expensive than a single-token search.
+
+
+### Supporting end user query syntax
+
+In some applications, you'd like end users or models to be able to search specific fields, match phrases etc. To do that, you can use the `userInput` YQL operator instead of `text`:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where userInput(@user-query)' \
+ 'user-query=title:"dad bod"' \
+ 'hits=3' \
+ 'language=en'
+```
+
+Notice that since the string given to userInput may specify the fields to search, there is no "field contains" part that specifies the field on the YQL side. See the [userInput() documentation](/en/reference/querying/yql#userinput) on the various end user query languages supported and other parameters that can be set.
+
+
+### Debugging token string matching
+
+Query tracing, combined with a summary using [tokens](/en/reference/schemas/schemas#tokens), can help debug matching.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where url contains ({filter:true,ranked:false}"huffingtonpost.co.uk")' \
+ 'trace.level=0' \
+ 'language=en' \
+ 'summary=debug-tokens'
+```
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1,
+ "fields": {
+ "totalCount": 562
+ },
+ "children": [
+ {
+ "id": "index:msmarco/0/59444ddd06537a24953b73e6",
+ "relevance": 0.0,
+ "source": "msmarco",
+ "fields": {
+ "sddocname": "msmarco",
+ "url": "http://www.huffingtonpost.co.uk/2015/05/05/what-is-a-dadbod-male-body_n_7212072.html",
+ "url-tokens": [
+ "http",
+ "www",
+ "huffingtonpost",
+ "co",
+ "uk",
+ "2015",
+ "05",
+ "05",
+ "what",
+ "is",
+ "a",
+ "dadbod",
+ "male",
+ "body",
+ "n",
+ "7212072",
+ "html"
+ ]
+ }
+ }
+ ]
+ }
+}
+```
+
+This gives us insight into how the input `url` field was tokenized and indexed. Those are the tokens that the query can match. Notice how punctuation characters like `:`, `,`, `.`, `/`, `_` and `-` are removed as part of the text tokenization.
+
+Observations:
+
+- Relevance is 0.0, because the term uses `ranked:false`.
+- We cannot match "://" because those are not searchable characters with `match:text`.
+- `dadbod` is a single token in the URL, so it cannot match `dad` or `bod`.
+
+Let us do a similar example to demonstrate the impact of linguistic stemming:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where url contains ({filter:true,ranked:false}"https")' \
+ 'summary=debug-tokens' \
+ 'language=en'
+```
+
+```json
+{
+ "url": "http://www.ourbabynamer.com/meaning-of-Anika.html",
+ "url-tokens": [
+ "http",
+ "www",
+ "ourbabynamer",
+ "com",
+ "meaning",
+ "of",
+ "anika",
+ "html"
+ ]
+}
+```
+
+Notice that a query for `https` matches `http`, because 'https' on the query is stemmed to `http`. If we turn off stemming on the query side, searching for `https` directly, we end up with 0 results.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where url contains ({filter:true,ranked:false,stem:false}"https")' \
+ 'summary=debug-tokens' \
+ 'language=en'
+```
+
+Similarly, if we pass a different language tag, which will not stem https to http, we also get 0 results:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where url contains ({filter:true,ranked:false}"https")' \
+ 'summary=debug-tokens' \
+ 'language=de'
+```
+
+
+## Ranking
+
+The previous section covered free-text search matching, linguistics, and how to combine business logic with free-text user queries. All the examples used a `default` rank-profile using Vespa's [nativeRank](/en/ranking/nativerank) text scoring feature.
+
+With free-text search, we can use other text scoring functions, like [BM25](/en/ranking/bm25). All the matching capabilities (and limitations) still apply, and we can search fieldsets or individual fields; the only difference is the text scoring function, where BM25 replaces nativeRank.
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where default contains text(@user-query)' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en' \
+ 'ranking=bm25'
+```
+
+While the `nativeRank` text score is normalized to the range 0 to 1, BM25 is unbounded, as demonstrated above. When querying (matching), we can ask Vespa to compute both features in the same query.
+
+Modify the schema and add a new rank-profile `combined`:
+
+```sd expandable
+schema msmarco {
+ document msmarco {
+ field language type string {
+ indexing: "en" | set_language
+ }
+ field id type string {
+ indexing: attribute | summary
+ match: word
+ }
+ field title type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field body type string {
+ indexing: index | summary
+ match: text
+ index: enable-bm25
+ }
+ field url type string {
+ indexing: index | summary
+ index: enable-bm25
+ }
+ }
+ fieldset default {
+ fields: title, body, url
+ }
+ document-summary minimal {
+ summary id { }
+ }
+ document-summary debug-tokens {
+ summary url {}
+ summary url-tokens {
+ source: url
+ tokens
+ }
+ from-disk
+ }
+ rank-profile default {
+ first-phase {
+ expression: nativeRank(title, body, url)
+ }
+ }
+ rank-profile bm25 inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body) + bm25(url)
+ }
+ }
+
+ rank-profile combined inherits default {
+ first-phase {
+ expression: bm25(title) + bm25(body) + bm25(url) + nativeRank(title) + nativeRank(body) + nativeRank(url)
+ }
+ match-features {
+ bm25(title)
+ bm25(body)
+ bm25(url)
+ nativeRank(title)
+ nativeRank(body)
+ nativeRank(url)
+ }
+ }
+}
+```
+
+Then, re-deploy the Vespa application from the `app` directory:
+
+```bash
+$ vespa deploy --wait 300 app
+```
+
+Adding or removing rank profiles is a live-change as it only impacts how we score documents, not how we index or match them.
+
+Run a query with the new rank-profile:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where default contains text(@user-query)' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en' \
+ 'ranking=combined'
+```
+
+Which will produce a result like this:
+
+```json expandable
+{
+ "root": {
+ "id": "toplevel",
+ "relevance": 1,
+ "fields": {
+ "totalCount": 562
+ },
+ "children": [
+ {
+ "id": "id:msmarco:msmarco::D2977840",
+ "relevance": 25.482783473796484,
+ "source": "msmarco",
+ "fields": {
+ "matchfeatures": {
+ "bm25(body)": 19.51565699523739,
+ "bm25(title)": 4.978933753876959,
+ "bm25(url)": 0.3678926381724701,
+ "nativeRank(body)": 0.3010929113058281,
+ "nativeRank(title)": 0.24814575272673867,
+ "nativeRank(url)": 0.07106142247709807
+ },
+ "sddocname": "msmarco",
+ "documentid": "id:msmarco:msmarco::D2977840",
+ "id": "D2977840",
+ "title": "What Is A Dad Bod An Insight Into The Latest Male Body Craze To Sweep The Internet",
+ "url": "http://www.huffingtonpost.co.uk/2015/05/05/what-is-a-dadbod-male-body_n_7212072.html"
+ }
+ }
+ ]
+ }
+}
+```
+
+Notice the `matchfeatures` field that is added to the hit when using `match-features` in the rank-profile. Here, we have all the computed features from the matched document, and the final `relevance` score is the sum of these features (in this case). This query and ranking example demonstrates that for a single query searching a set of fields via a fieldset, we can compute different types of text scoring features and combine them.
+
+Now consider the following where we limit matching to the title field:
+
+```bash
+$ vespa query \
+ 'yql=select * from msmarco where title contains text(@user-query)' \
+ 'user-query=what is dad bod' \
+ 'hits=3' \
+ 'language=en' \
+ 'ranking=combined'
+```
+
+Now, we do not get features for `body` or `url`, because they were not matched by the query.
+
+
+## Next steps
+
+Check out the [Improving Text Search through ML](/en/learn/tutorials/text-search-ml).
+
+
+## Cleanup
+
+If you do not want to proceed with the [Improving Text Search through ML](/en/learn/tutorials/text-search-ml) guide, you can stop and remove the container (and data):
+
+```bash
+$ docker rm -f vespa-msmarco
+```
+
+[^1]: Robertson, Stephen and Zaragoza, Hugo and others, 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends in Information Retrieval.
diff --git a/mintlify-docs/en/linguistics/linguistics-custom.mdx b/mintlify-docs/en/linguistics/linguistics-custom.mdx
new file mode 100644
index 0000000000..e8354751c9
--- /dev/null
+++ b/mintlify-docs/en/linguistics/linguistics-custom.mdx
@@ -0,0 +1,37 @@
+---
+title: "Custom Linguistics"
+---
+
+A linguistics component is an implementation of [com.yahoo.language.Linguistics](https://github.com/vespa-engine/vespa/blob/master/linguistics/src/main/java/com/yahoo/language/Linguistics.java). Refer to the [com.yahoo.language.simple.SimpleLinguistics](https://github.com/vespa-engine/vespa/blob/master/linguistics/src/main/java/com/yahoo/language/simple/SimpleLinguistics.java) implementation (which can be subclassed for convenience).
+
+SimpleLinguistics provides support for English stemming only. Try loading the `com.yahoo.language.simple.SimpleLinguistics` module, or providing another linguistics module.
+
+The linguistics implementation must be configured as a component in container clusters doing linguistics processing, see [injecting components](/en/applications/dependency-injection).
+
+As document processing for indexing is by default done by an autogenerated container cluster which cannot be configured, specify a container cluster for indexing explicitly.
+
+This example shows how to configure SimpleLinguistics for linguistics using the same cluster for both query and indexing processing (if using different clusters, add the same linguistics component to all of them):
+
+```xml highlight= {4,14}
+
+
+
+
+
+
+
+
+
+
+ 1
+
+
+
+
+
+
+
+
+```
+
+If changing the linguistics component of a live system, recall can be reduced until all documents are re-written. This is because documents will still be stored with tokens generated by the previous linguistics module.
\ No newline at end of file
diff --git a/mintlify-docs/en/linguistics/linguistics-opennlp.mdx b/mintlify-docs/en/linguistics/linguistics-opennlp.mdx
new file mode 100644
index 0000000000..f6eff7aa19
--- /dev/null
+++ b/mintlify-docs/en/linguistics/linguistics-opennlp.mdx
@@ -0,0 +1,123 @@
+---
+title: "OpenNLP Linguistics"
+sidebarTitle: "Default (OpenNLP) linguistics"
+---
+
+The default Vespa linguistics implementation uses [OpenNLP](https://opennlp.apache.org/). The Apache OpenNLP language detection is also used, by default, even if you're using a different implementation. See [Language handling](/en/linguistics/linguistics#language-handling) for more information. OpenNLP has support for 103 languages.
+
+## OpenNLP language detection
+
+The OpenNLP language detector gives a prediction with a confidence, which typically increases with more input. The threshold for using the prediction can be configured with a number typically from 1.0 (wild guess) to 6.0 (confident guess), with 2.0 as the default:
+
+```xml
+
+ ...
+
+ 4.2
+
+```
+
+## Default languages
+
+OpenNLP tokenization and stemming supports these languages:
+
+- Arabic (ar)
+- Catalan (ca)
+- Danish (da)
+- Dutch (nl)
+- English (en)
+- Finnish (fi)
+- French (fr)
+- German (de)
+- Greek (el)
+- Hungarian (hu)
+- Indonesian (id)
+- Irish (ga)
+- Italian (it)
+- Norwegian (no)
+- Portuguese (pt)
+- Romanian (ro)
+- Russian (ru)
+- Spanish (es)
+- Swedish (sv)
+- Turkish (tr)
+
+Other languages will use a fallback to English _en_.
+
+English uses a simpler stemmer (kStem) by default, which produces fewer stems and therefore lower recall. To use OpenNLP stemming (Snowball) for English as well, add this config to your \ element(s):
+
+```xml
+
+ ...
+
+ true
+
+```
+
+See _Tokens_ [OpenNLP models](https://opennlp.apache.org/models.html) and [text matching](/en/querying/text-matching) for examples and how to experiment with linguistics.
+
+If you need support for more languages, you can consider replacing the default OpenNLP based linguistic integration with the [Lucene Linguistics](/en/linguistics/lucene-linguistics) implementation which supports more languages.
+
+### Chinese
+
+The default linguistics implementation does not segment Chinese into tokens, but this can be turned on by config:
+
+```xml
+
+ ...
+
+ true
+ true
+
+```
+
+The `createCjkGrams` option adds substrings of segments longer than 2 characters, which may increase recall.
+
+## Tokenization
+
+Tokenization removes any non-word characters, and splits the string into _tokens_ on each word boundary. In addition, CJK tokens are split using a _segmentation_ algorithm. The resulting tokens are then searchable in the index.
+
+Also see [N-gram matching](/en/reference/schemas/schemas#gram).
+
+## Normalization
+
+An example normalization is à ⇒ a. Normalizing will cause accents and similar decorations which are often misspelled to be normalized the same way both in documents and queries.
+
+Vespa uses [java.text.Normalizer](https://docs.oracle.com/javase/7/docs/api/java/text/Normalizer.html) to normalize text, see [SimpleTransformer.java](https://github.com/vespa-engine/vespa/blob/master/linguistics/src/main/java/com/yahoo/language/simple/SimpleTransformer.java). Normalization preserves case.
+
+Refer to the [nfkc](/en/reference/querying/yql#nfkc) query term annotation. Also see the YQL [accentDrop](/en/reference/querying/yql#accentdrop) annotation.
+
+## Stemming
+
+Stemming means _translate a word to its base form_ (singular forms for nouns, infinitive for verbs), using a [stemmer](https://en.wikipedia.org/wiki/Stemming). Use of stemming increases search recall, because the searcher is usually interested in documents containing query words regardless of the word form used. Stemming in Vespa is symmetric, i.e. words are converted to stems both when indexing and searching.
+
+For example, when text is indexed, the stemmer converts the noun _reports_ (plural) to _report_, and the latter is stored in the index. Likewise, before searching, _reports_ is stemmed to _report_. Another example is that _am_, _are_ and _was_ are stemmed to _be_ both in queries and indexes.
+
+When [bolding](/en/reference/schemas/schemas#bolding) is enabled, all forms of the query term will be bolded. For example, when searching for _reports_, all of _report_, _reported_ and _reports_ will be bolded.
+
+See the [stem](/en/reference/querying/yql#stem) query term annotation.
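+
+As a minimal sketch of that annotation, the query below turns off stemming for a single term, so the term is looked up exactly as written. The `title` field is an assumption for illustration. Note that if the field was indexed with stemming, _reports_ was stored as _report_, so the unstemmed term will typically not match those documents:
+
+```bash
+$ vespa query \
+ 'yql=select * from sources * where title contains ({stem:false}"reports")'
+```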
+
+### Theory
+
+From a matching point of view, stemming takes all possible token strings and maps them into equivalence classes. So in the example above, the set of tokens \{ _report_, _reports_, _reported_ \} are in an equivalence class. To represent the class, the linguistics library should pick the best element in the class. At query time, the text typed by a user will be tokenized, and then each token should be mapped to the most likely equivalence class, again represented by the shortest element that belongs to the class.
+
+While the theory sounds pretty simple, in practice it is not always possible to figure out which equivalence class a token should belong to. A typical example is the string _number_. In most cases we would guess this to mean a numerical entity of some kind, and the equivalence class would be \{ _number_, _numbers_ \} - but it could also be a verb, with a different equivalence class \{ _number_, _numbered_, _numbering_ \}. These are of course closely related, and in practice they will be merged, so we'll have a slightly larger equivalence class \{ _number_, _numbers_, _numbered_, _numbering_ \} and be happy with that. However, in a sentence such as _my legs keep getting number every day_, the _number_ token clearly does not have the semantics of a numerical entity, but should be in the equivalence class \{ _numb_, _number_, _numbest_, _numbness_ \} instead. But blindly assigning _number_ to the equivalence class _numb_ is clearly not right, since the _more numb_ meaning is much less likely than the _numerical entity_ meaning.
+
+The approach currently taken by the low-level linguistics library will often lead to problems in the _number_-like cases as described above. To give better recall, Vespa has implemented a _multiple_ stemming option.
+
+### Configuration
+
+By default, all words are stemmed to their _best_ form. Refer to the [stemming reference](/en/reference/schemas/schemas#stemming) for other stemming types. To change type, add:
+
+```yaml
+stemming: [stemming-type]
+```
+
+Stemming can be set either for a field, a fieldset or as a default for all fields. Example: Disable stemming for the field _title_:
+
+```yaml
+field title type string {
+ indexing: summary | index
+ stemming: none
+}
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/linguistics/linguistics.mdx b/mintlify-docs/en/linguistics/linguistics.mdx
new file mode 100644
index 0000000000..1765aa1c8b
--- /dev/null
+++ b/mintlify-docs/en/linguistics/linguistics.mdx
@@ -0,0 +1,313 @@
+---
+title: "Linguistics in Vespa"
+sidebarTitle: "Linguistics Overview"
+---
+
+Vespa uses a _linguistics_ module to process text in queries and documents during indexing and searching. The goal of linguistic processing is to increase _recall_ (how many documents are matched) without hurting _precision_ (the relevance of the documents matched) too much. It consists of such operations as:
+
+- tokenizing text into chunks of known types such as words and punctuation.
+- normalizing accents.
+- finding the base form of words (stemming or lemmatization).
+
+Linguistic processing is run when writing documents, and when querying:
+
+
+ 
+
+
+The processing is run on [string](/en/reference/schemas/schemas#string) fields with `index` indexing mode. Overview:
+
+1. When writing documents, string fields with `indexing: index` are by default processed. A field's language will configure this processing. A document or field can have the language set explicitly; if not, it is [detected](/en/linguistics/linguistics#field-language-detection).
+2. The field's content is processed (e.g., tokenized, normalized, stemmed, etc.), and the resulting terms are added to the index.
+
+ **Note:** The language for the field is not persisted on the content node, just the processed terms themselves
+
+3. A query is also processed in a similar fashion, typically through the same [linguistics profile](/en/reference/schemas/schemas#linguistics) as the field content, producing the same terms from the same text. The language of query strings is [detected](/en/linguistics/linguistics#query-language-detection) unless specified using [model.locale](/en/reference/api/query#model.locale) or [annotations](/en/reference/querying/yql#annotations) like `language`.
+
+ **Note:** This is a very common query problem - it is hard to detect language precisely from short strings.
+
+4. The processed query is evaluated on the content nodes, and will only work as expected if both documents and queries produce the same terms.
+
+These operations can be turned on or off per field in the [schema](/en/basics/schemas). See [implicitTransforms](/en/reference/querying/yql#implicittransforms) for how to enable/disable transforms per query term.
+
+## Linguistics implementations
+
+Vespa comes with two linguistics variants out of the box: [OpenNLP](/en/linguistics/linguistics-opennlp) and [Lucene](/en/linguistics/lucene-linguistics). Check out the respective pages for more information on how to configure them.
+
+You can also implement a custom [Linguistics](/en/linguistics/linguistics-custom) component.
+
+The default linguistics variant is [OpenNLP](/en/linguistics/linguistics-opennlp), but for the rest of this page we'll go through common options, such as language handling, inherited by all implementations.
+
+
+ **Note:** Linguistics implementations only control how text is tokenized, including positional information. These tokens are stored in the same way in the underlying index. For example, if you use Lucene linguistics, Vespa does not store information such as positions in Lucene segment files. Storage is the same as with OpenNLP, only resulting tokens might differ.
+
+
+## Language handling
+
+Vespa does _not_ know the language of a document. Instead, the following applies:
+
+1. The indexing processor is instructed on a per-field level what language to use when calling the underlying linguistics library
+2. The query processor is instructed on a per-query level what language to use
+
+If no language is explicitly set in a document or a query, Vespa will run its configured language detector (by default, [OpenNLP language detection](/en/linguistics/linguistics-opennlp#language-detection)) on the available text (the full content of a document field, or the full `query=` parameter value).
+
+A document that contains the exact same word as a query might not be recall-able if the language of the document field is detected differently from the query. Unless the query has explicitly declared a [language](/en/reference/api/query#model.language), this can occur.
+
+### Indexing with language
+
+The indexing process run by Vespa is a sequential execution of the indexing scripts of each field in the schema, in the declared order. At any point, the script may set the language that will be used for indexing statements for subsequent fields, using [set\_language](/en/reference/writing/indexing-language#set_language). Example:
+
+```yaml
+schema doc {
+ document doc {
+ field language type string {
+ indexing: set_language
+ }
+ field title type string {
+ indexing: index
+ }
+ }
+}
+```
+
+If a language has not been set when tokenization of a field is run, the language is determined by [language detection](/en/linguistics/linguistics#field-language-detection).
+
+If all documents have the same language, the language can be hardcoded in the schema in this way:
+
+```yaml
+schema doc {
+
+ field language type string {
+ indexing: "en" | set_language
+ }
+
+ document doc {
+ ...
+```
+
+If the same document contains fields in multiple languages, set\_language can be invoked multiple times, e.g.:
+
+```yaml
+schema doc {
+ document doc {
+ field language_title1 type string {
+ indexing: set_language
+ }
+ field title1 type string {
+ indexing: index
+ }
+ field language_title2 type string {
+ indexing: set_language
+ }
+ field title2 type string {
+ indexing: index
+ }
+ }
+}
+```
+
+Or, if fixed per field, use multiple indexing statements in each field:
+
+```yaml
+schema doc {
+ document doc {
+ field my_english_field type string {
+ indexing {
+ "en" | set_language;
+ index;
+ }
+ }
+ field my_spanish_field type string {
+ indexing {
+ "es" | set_language;
+ index;
+ }
+ }
+ }
+}
+```
+
+### Field language detection
+
+When indexing a document, if a field has unknown language (i.e. not set using `set_language`), language detection is run on the field's content. This means, language detection is per field, not per document.
+
+See [query language detection](/en/linguistics/linguistics#query-language-detection) for detection confidence. Fields with little text will default to English.
+
+### Querying with language
+
+The content of an indexed string field is language-agnostic. One must therefore apply a compatible tokenization on the query terms (e.g., stemming for the same language) in order to match the content of that field.
+
+The query parser subscribes to configuration that tells it what fields are indexed strings, and every query term that targets such a field is run through appropriate tokenization. The [language](/en/reference/api/query#model.language) query parameter controls the language state of these calls.
+
+Because an index may simultaneously contain terms in any number of languages, one can have stemmed variants of one language match the stemmed variants of another. To work around this, store the language of a document in a separate attribute, and apply a filter against that attribute at query-time.
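+
+A minimal sketch of this workaround, assuming a hypothetical string attribute `lang` (not part of the schemas shown on this page) that holds each document's language code: the query sets `language=de` so that the terms are stemmed as German, and filters on `lang` so that only German documents can match.
+
+```bash
+$ vespa query \
+ 'yql=select * from sources * where default contains text(@user-query) and lang contains "de"' \
+ 'user-query=Eine kleine Nachtmusik' \
+ 'language=de'
+```
+
+The `default` fieldset and the query text are placeholders as well; adapt them to your own schema.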
+
+By default, there is no knowledge anywhere that captures what languages are used to generate the content of an index. The language parameter only affects the transformation of query terms that hit tokenized indexes.
+
+### Query language detection
+
+If no [language](/en/reference/api/query#model.language) parameter is used and the query terms are not [annotated](/en/reference/querying/yql#annotations) with a language, the language detector is called to process the query string.
+
+Queries are normally short, so the detection confidence is low. Example:
+
+```bash
+$ vespa query "select * from music where default contains text(@text)" \
+ tracelevel=3 text='Eine kleine Nachtmusik' | grep 'Stemming with language'
+ "message": "Stemming with language=ENGLISH"
+
+$ vespa query "select * from music where default contains text(@text)" \
+ tracelevel=3 text='Eine kleine Nachtmusik schnell' | grep 'Stemming with language'
+ "message": "Stemming with language=GERMAN"
+```
+
+See [#24265](https://github.com/vespa-engine/vespa/issues/24265) for details - in short, with the current 0.02 confidence cutoff, queries with 3 terms or fewer will default to English.
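+
+To sidestep detection for short queries, the language can be passed explicitly. The sketch below reuses the first query above and adds the `language` parameter; it assumes the same `music` application, and the trace output is not shown since it depends on your deployment:
+
+```bash
+$ vespa query "select * from music where default contains text(@text)" \
+ tracelevel=3 text='Eine kleine Nachtmusik' language=de | grep 'Stemming with language'
+```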
+
+### Multiple languages
+
+Vespa supports having documents in multiple languages in the same schema, but does not out-of-the-box support cross-lingual retrieval (e.g., search using English and retrieve relevant documents written in German). This is because the language of a query is determined by the language of the query string and only one transformation can take place.
+
+Approaches to overcome this limitation include:
+
+1. Use semantic retrieval using a multilingual text embedding model (see [blog post](https://blog.vespa.ai/simplify-search-with-multilingual-embeddings/)) which has been trained on a multilingual corpus and can be used to retrieve documents in multiple languages.
+2. Stem and tokenize the query using the relevant languages, build a query tree using [weakAnd](/en/reference/querying/yql#weakand) / [or](/en/reference/querying/yql#or) and using [equiv](/en/reference/querying/yql#equiv) per stem variant. This is easiest done in a custom [Searcher](/en/applications/searchers) as mentioned in [#12154](https://github.com/vespa-engine/vespa/issues/12154).
+
+Example:
+
+**language=fr:** machine learning =\> machin learn
+
+**language=en:** machine learning =\> machine learn
+
+Using _weakAnd_ here as example as that technique is already mentioned in #12154:
+
+```sql
+select * from sources * where rank(
+ default contains "machine",
+ default contains "learning",
+ weakAnd(
+ default contains equiv("machin", "machine"),
+ default contains "learn"
+ )
+)
+```
+
+We now retrieve using all possible stems/base forms with _weakAnd_, and use the [rank](/en/reference/querying/yql#rank) operator to pass in the original query form, so that ranking can rank literal matches (original) higher. The benefit of _equiv_ is that it allows multiple term variants to share the same position, so that proximity ranking is not broken by this approach.
+
+## Linguistics profiles
+
+Linguistics profiles are used to configure linguistics processing for a field in the schema. They are typically used with the [Lucene linguistics implementation](/en/linguistics/lucene-linguistics), but can be used in e.g., [custom linguistics implementations](/en/linguistics/linguistics-custom) as well.
+
+### Symmetrical processing
+
+For example, a definition like this:
+
+```yaml
+field title type string {
+ indexing: summary | index
+ linguistics {
+ profile: whitespaceLowercase
+ }
+}
+```
+
+Will look for a profile named `whitespaceLowercase`, which could be defined like this in `services.xml`:
+
+```xml
+
+
+ whitespace
+
+
+
+ lowercase
+
+
+
+```
+
+Note `language=en` there. It is optional: if it's not set, the profile will be used for all languages. But you can have different definitions for different languages on the same profile (e.g., different stemming).
+
+### Different processing for query strings
+
+For some use cases, you may want to process the query string differently than the document content. Synonyms are a good example. If you expand `dog` to `dog,puppy` at query time, it will match either term in the document anyway - no need to expand it at write-time.
+
+To do this, you'd define a different profile for the query string. Like:
+
+```xml
+
+
+ whitespace
+
+
+
+ lowercase
+
+
+ synonymGraph
+
+
+ en/synonyms.txt
+
+
+
+
+```
+
+Then, in the schema, expand `profile` to `profile.index` and `profile.search`:
+
+```yaml
+field title type string {
+ indexing: summary | index
+ linguistics {
+ profile {
+ index: whitespaceLowercase
+ search: whitespaceLowercaseSynonyms
+ }
+ }
+}
+```
+
+At this point, `where title contains 'dog'` will match a document containing `puppy`.
+
+### Overriding profile for query strings
+
+At query time, you can tell Vespa to use a specific profile to process the query string via [grammar.profile](/en/reference/querying/yql#grammar). This works with the [userInput()](/en/reference/querying/yql#userinput) and [text()](/en/reference/querying/yql#text) operators. For example, to use the `whitespaceLowercase` profile for the query string:
+
+```sql
+where title contains ({grammar.profile: 'whitespaceLowercase'}text('dog'))
+```
+
+Equivalent expression via `userInput()`:
+
+```sql
+where {defaultIndex:'title', grammar.profile: 'whitespaceLowercase', grammar: 'linguistics'}userInput('dog')
+```
+
+
+ **Note:** You should use grammar=linguistics (like in the example above) with grammar.profile to ensure that there is no additional processing (e.g., tokenization) besides what is already defined in the profile.
+
+
+## Troubleshooting linguistics processing
+
+If your documents don't match as expected, there are two ways to get more information. First, you can get the tokenized text for a field by using [tokens](/en/reference/schemas/schemas#tokens) in the [document summary](/en/querying/document-summaries). For example, to get the original text and tokens for the `title` field:
+
+```yaml
+document-summary debug-text-tokens {
+ summary title {}
+ summary title_tokens {
+ source: title
+ tokens
+ }
+ from-disk
+}
+```
+
+Then, at query time, you can also get the tokens of the query string by increasing the [trace level](/en/reference/api/query#trace.level):
+
+```json
+{
+ "yql": "select * from sources * where title contains \"dog\"",
+ "presentation.summary": "debug-text-tokens",
+ "model.locale": "en",
+ "trace.level": 2
+}
+```
diff --git a/mintlify-docs/en/linguistics/lucene-linguistics.mdx b/mintlify-docs/en/linguistics/lucene-linguistics.mdx
new file mode 100644
index 0000000000..22969ccab1
--- /dev/null
+++ b/mintlify-docs/en/linguistics/lucene-linguistics.mdx
@@ -0,0 +1,214 @@
+---
+title: "Lucene Linguistics"
+---
+
+Lucene Linguistics is a custom [linguistics](/en/linguistics/linguistics) implementation built on the [Apache Lucene](https://lucene.apache.org) library. It provides a Lucene analyzer to handle text processing for a language with an optional variation per [stemming mode](https://github.com/vespa-engine/vespa/blob/master/linguistics/src/main/java/com/yahoo/language/process/StemMode.java).
+
+Check [sample apps](https://github.com/vespa-engine/sample-apps/tree/master/examples/lucene-linguistics) to get started.
+
+## Crash course to Lucene text analysis
+
+Lucene [text analysis](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/package-summary.html) is a process of converting text into searchable tokens. This text analysis consists of a series of components applied to the text in order:
+
+- [CharFilters](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/CharFilter.html): transform the text before it is tokenized, while providing corrected character offsets to account for these modifications.
+- [Tokenizers](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/Tokenizer.html): responsible for breaking up incoming text into tokens.
+- [TokenFilters](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/TokenFilter.html): responsible for modifying tokens that have been created by the Tokenizer.
+
+A specific configuration of the above components is wrapped into an [Analyzer](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/Analyzer.html) object.
+
+The text analysis works as follows:
+1. All char filters are applied, in the specified order, to the entire text string.
+2. The tokenizer splits the filtered text into tokens.
+3. Token filters are applied, in the specified order, to each token.
+
+## Defaults language analysis
+
+Lucene Linguistics out-of-the-box exposes the analysis components provided by the [lucene-core](https://lucene.apache.org/core/9_8_0/core/index.html) and the [lucene-analysis-common](https://lucene.apache.org/core/9_8_0/analysis/common/index.html) libraries. Other libraries with Lucene text analysis components (e.g. [analysis-kuromoji](https://lucene.apache.org/core/9_8_0/analysis/kuromoji/index.html)) can be added to the application package as a Maven dependency.
+
+Lucene Linguistics out-of-the-box provides analyzers for 40 languages:
+
+- Arabic
+- Armenian
+- Basque
+- Bengali
+- Bulgarian
+- Catalan
+- Chinese
+- Czech
+- Danish
+- Dutch
+- English
+- Estonian
+- Finnish
+- French
+- Galician
+- German
+- Greek
+- Hindi
+- Hungarian
+- Indonesian
+- Irish
+- Italian
+- Japanese
+- Korean
+- Kurdish
+- Latvian
+- Lithuanian
+- Nepali
+- Norwegian
+- Persian
+- Portuguese
+- Romanian
+- Russian
+- Serbian
+- Spanish
+- Swedish
+- Tamil
+- Telugu
+- Thai
+- Turkish
+
+The Lucene [StandardAnalyzer](https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/analysis/standard/StandardAnalyzer.html) is used for languages that have neither a custom nor a default analyzer.
+
+## Linguistics key
+
+A linguistics key identifies a text analysis configuration. It can have two parts, separated by a semicolon, and either part can be omitted. The two parts are:
+
+- A [linguistics profile](/en/linguistics/linguistics#linguistics-profiles).
+- A language key.
+
+The language key, in turn, has 2 parts: a mandatory [language code](https://github.com/vespa-engine/vespa/blob/master/linguistics/src/main/java/com/yahoo/language/Language.java) and an optional stemming mode. The format is `LANGUAGE_CODE[/STEM_MODE]`. There are 5 stemming modes: `NONE, DEFAULT, ALL, SHORTEST, BEST` (they can be specified in the [field schema](/en/reference/schemas/schemas#stemming)).
+
+Examples of linguistics keys:
+
+- `profile=whitespaceLowercase`: a profile that applies to all languages. You can bind it to different fields by specifying their [linguistics profiles](/en/linguistics/linguistics#linguistics-profiles) in the schema.
+- `profile=whitespaceLowercase;language=en`: a profile that applies to the English language. You'd still bind it to fields via their [linguistics profiles](/en/linguistics/linguistics#linguistics-profiles) in the schema, but it will only be applied to the English texts (either at indexing or query time).
+- `en`: English language: applies to all English texts where no profile is specified (in the schema or [in the query](/en/linguistics/linguistics#overriding-profile-for-query-strings)).
+- `en/BEST`: English language with the `BEST` stemming mode. Like the previous example, but only applies when [stemming](/en/reference/schemas/schemas#stemming) is set to `BEST`.
+
+
+ **Note:** You can use different profiles for document fields and query strings. See [Different processing for query strings](/en/linguistics/linguistics#different-processing-for-query-strings) and the [multiple-profiles sample app](https://github.com/vespa-engine/sample-apps/tree/master/examples/lucene-linguistics/multiple-profiles) for more information.
+
+
+## Customizing text analysis
+
+Lucene linguistics provides multiple ways to customize text analysis per language:
+
+- `LuceneLinguistics` component configuration in the `services.xml`
+- `ComponentsRegistry`
+
+### LuceneLinguistics component configuration
+
+In `services.xml` it is possible to construct an analyzer by providing [configuration](https://github.com/vespa-engine/vespa/blob/master/lucene-linguistics/src/main/resources/configdefinitions/lucene-analysis.def) for the `LuceneLinguistics` component (from all text analysis components that are available on the classpath). Example for the English language:
+
+```xml
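+<component id="linguistics" class="com.yahoo.language.lucene.LuceneLinguistics" bundle="my-vespa-app">
+  <config name="com.yahoo.language.lucene.lucene-analysis">
+    <configDir>lucene-linguistics</configDir>
+    <analysis>
+      <item key="profile=standardStopStem;language=en">
+        <tokenizer>
+          <name>standard</name>
+        </tokenizer>
+        <tokenFilters>
+          <item>
+            <name>stop</name>
+            <conf>
+              <item key="words">en/stopwords.txt</item>
+              <item key="ignoreCase">true</item>
+            </conf>
+          </item>
+          <item>
+            <name>englishMinimalStem</name>
+          </item>
+        </tokenFilters>
+      </item>
+    </analysis>
+  </config>
+</component>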
+```
+
+Notes:
+
+- `item key="profile=standardStopStem;language=en"` value is a [linguistics key](#linguistics-key).
+- `name` values are the [SPI names](https://docs.oracle.com/en/java/javase/17/docs/api/java.naming/javax/naming/spi/package-summary.html) of the text analysis components. You'll typically find them in the [Lucene analysis JavaDocs](https://lucene.apache.org/core/9_11_1/analysis/common/allclasses-index.html). For example, the name `stop` along with other options can be found in the [StopFilterFactory JavaDoc](https://lucene.apache.org/core/9_11_1/analysis/common/org/apache/lucene/analysis/core/StopFilterFactory.html).
+- The `en/stopwords.txt` file must be placed in your application package under the `lucene-linguistics` directory, which is referenced by the `configDir` option.
+- If `configDir` is not provided the files must be on the classpath.
+
+### Components registry
+
+The [ComponentsRegistry](/en/applications/dependency-injection#depending-on-all-components-of-a-specific-type) mechanism can be used to set a Lucene Analyzer for a language.
+
+```xml
+<component id="en" class="org.apache.lucene.analysis.core.SimpleAnalyzer" bundle="my-vespa-app" />
+```
+
+Where:
+
+- `id` must be a [linguistics key](#linguistics-key);
+- `class` is the implementation class that extends the `Analyzer` class;
+- `bundle` is the name of the application package as specified in `pom.xml` (or any bundle in your `components` directory that contains the class).
+
+For this to work, the class must provide **only** a constructor without arguments.
+
+If your analyzer class needs some initialization, wrap the analyzer in a class that implements the `Provider` interface.
+
+### Custom text analysis components
+
+The text analysis components are loaded via the Java Service Provider Interface ([SPI](https://www.baeldung.com/java-spi)).
+
+To use an external library that is already packaged for SPI loading, it is enough to add the library to the application package as a Maven dependency.
+
+If you need to create a custom component, the steps are:
+
+1. Implement the component in a Java class.
+2. Register the component class in the matching SPI file on the classpath (e.g. `META-INF/services/org.apache.lucene.analysis.TokenFilterFactory` for a custom token filter).
+
+## Language Detection
+
+Lucene Linguistics doesn't provide language detection. This means that for both feeding and searching you should provide a [language parameter](/en/reference/api/query#model.language).
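+
+A minimal sketch of a query that sets the language explicitly, assuming a `title` field and German query text (both chosen only for illustration):
+
+```json
+{
+  "yql": "select * from sources * where title contains \"hund\"",
+  "model.language": "de"
+}
+```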
+
+## Indexing all stems
+
+Some analyzers expand the input text into multiple tokens on the same position. For example, those based on the [NGramTokenFilter](https://lucene.apache.org/core/9_11_1/analysis/common/org/apache/lucene/analysis/ngram/NGramTokenFilter.html). Here's a sample analyzer configuration:
+
+```xml
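+<item key="profile=ngram">
+  <tokenizer>
+    <name>whitespace</name>
+  </tokenizer>
+  <tokenFilters>
+    <item>
+      <name>nGram</name>
+      <conf>
+        <item key="minGramSize">2</item>
+        <item key="maxGramSize">2</item>
+      </conf>
+    </item>
+  </tokenFilters>
+</item>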
+```
+
+This will take a text like `dog` and produce `do` and `og` as tokens, plus (by default) the original `dog`. However, Vespa only takes the first token (`do`) and writes it to the index, ignoring the other "stems". As a result, a search for `og` will not match documents that contain `dog`, which is the whole point of using letter n-grams.
+
+To index all stems, you can use the [stemming](/en/reference/schemas/schemas#stemming) parameter in the schema definition of your field:
+
+```yaml
+field title_grams type string {
+ indexing: summary | index
+ linguistics {
+ profile: ngram
+ }
+ stemming: multiple
+}
+```
+
+Now, Vespa will index all stems, and a search for `og` will match documents that contain `dog`.
+
+
+ **Note:** Queries look for all stems by default (regardless of the schema configuration). For example, a search for `dog` would expand to `do` and `og` as well, looking for all three terms.
+
\ No newline at end of file
diff --git a/mintlify-docs/en/linguistics/query-rewriting.mdx b/mintlify-docs/en/linguistics/query-rewriting.mdx
new file mode 100644
index 0000000000..0b76fb900d
--- /dev/null
+++ b/mintlify-docs/en/linguistics/query-rewriting.mdx
@@ -0,0 +1,204 @@
+---
+title: "Query Rewriting"
+---
+
+
+A search application can improve result quality by interpreting the intended meaning of user queries. Once the meaning is guessed, the query can be rewritten to one that will satisfy the user better than the raw query. Vespa includes a query rewriting language which makes it easy to use query rewriting to understand and act upon the query semantics.
+
+These query rewriting techniques can be combined to improve the search experience:
+
+- Query focusing: Decide a field to search for a term
+- Query enhancing: Add terms that improve the query
+- Stopwords: Remove terms that hurt recall or precision - [example](https://github.com/vespa-cloud/cord-19-search/blob/main/src/main/java/ai/vespa/example/cord19/searcher/BoldingSearcher.java)
+- Synonyms: Replace terms or phrases by others
+
+Query rewriting is done by _semantic rules_ or _searchers_. Semantic rules are a simple production rule language that operates on queries. For more complex rewriting logic that cannot be handled by simple rules, you can create a rewriting searcher that uses the query rewriting framework.
+
+## EQUIV
+
+EQUIV is a query operator that can be used to add synonyms for words where the various synonyms should be equivalent - example:
+
+- The user query is `(used AND automobile)`
+- _automobile_ is a synonym for _car_ (from a dictionary)
+- Rewrite the query to `(used AND (automobile EQUIV car))`
+- _automobile_ or _car_ are here equivalent - the query shall behave as if all occurrences of _car_ in the document corpus had been replaced by _automobile_
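+
+Expressed in YQL, the rewritten query could look roughly like this, assuming the terms are searched in a field named `text` (a hypothetical field name used only for this sketch):
+
+```json
+{
+  "yql": "select * from sources * where text contains \"used\" and text contains equiv(\"automobile\", \"car\")"
+}
+```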
+
+See the [reference](/en/reference/querying/yql#equiv) for differences between OR and EQUIV. In many cases it might be better to use OR instead of EQUIV. For example, consider synonyms for _Snoop Dogg_:
+
+```sql
+"Snoop" EQUIV "Snoop Doggy Dogg" EQUIV "Snoop Lion" EQUIV "Calvin Broadus" EQUIV "Calvin Cordozar Broadus Junior"
+```
+
+However, _Snoop_ is used by other people - so matching that alone is not a sure hit for the correct entity, and finding more than one of the synonyms in the same text gives better confidence. This is exactly what OR does:
+
+```sql
+"Snoop"!20 OR "Snoop Doggy Dogg"!90 OR "Snoop Lion"!75 OR "Calvin Broadus"!60 OR "Calvin Cordozar Broadus Junior"!100
+```
+
+Use lower weights on the alternatives with less confidence. If the many words and phrases inside the OR overwhelm other words in the query, giving even lower weights may be useful, for example making the sum of weights 100, the default weight for just one alternative.
+
+The decision to use EQUIV must be based on an application-specific dictionary or linguistics. This can be done using [YQL](/en/reference/querying/yql#equiv) or from a container plugin (example [EquivSearcher.java](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation-java/app/src/main/java/ai/vespa/example/album/EquivSearcher.java)) where the query object is manipulated as follows:
+
+1. Find a word item in the query
+2. Check that an EQUIV can be used in that place (see [limitations](/en/reference/querying/yql#equiv))
+3. Find the synonyms in the dictionary
+4. Make an `EquivItem` with the synonyms (and the original word) as children
+5. Replace the original `WordItem` with the new `EquivItem`
+
+## Rules
+
+A simple semantic rule looks like:
+
+```sql
+lotr -> lord of the rings;
+```
+
+This means that whenever the term _lotr_ is encountered in a query, replace it by the terms _lord of the rings_. Rules can also refer to conditions, and the produced terms can be a modified version of whatever is matched instead of a concrete term:
+
+```sql
+[brand] -> company:[brand];
+[brand] :- sony, dell, ibm, hp;
+```
+
+This rule says that, whenever the condition named _brand_ is matched, replace the matched term(s) by _the same term(s)_ searching the _company_ field. In addition, the _brand_ condition is defined to match any of a list of brands. Note how `->` means a replacing production rule, `:-` means a condition and `,` separates alternatives.
+
+It is also possible to do grouping using parentheses, list multiple terms which must be matched in sequence, and to write _adding_ production rules using `+>` instead of `->`. Terms are by default added using the query default (as if they were written in the search box), but it is also possible to force them to be AND, OR, NOT or RANK using respectively `+`, `?`, `-` and `$`. Here is a more complex rule illustrating this:
+
+```sql
+[destination] (in, by, at, on) [place] +> $name:[destination]
+```
+
+This rule boosts documents whose _name_ field matches the destination, whenever the query contains a destination followed by a preposition and a place (the definitions of the _destination_ and _place_ conditions are not shown). This is achieved by adding a RANK term: a term that does not affect whether a document is matched, but adds a relevance boost if it is.
+
+The complete syntax of this language is found in the [semantic rules reference](/en/reference/querying/semantic-rules).
+
+## Rule bases
+
+Rules that are used together are collected in a _rule base_: a text file containing rules and conditions, with file suffix `.sr` (for semantic rules). Example:
+
+```sql
+# Replacements
+lotr -> lord of the rings;
+colour -> color;
+audi -> skoda;
+
+# Stopwords
+[stopword] -> ; # (Replace them by nothing)
+[stopword] :- and, or, the, be;
+
+# Focus brands to the brand field. If we think the brand field has high quality data, we can replace. We use the same name
+# for the condition and the field, but this is not necessary.
+[brand] -> brand:[brand];
+[brand] :- sony, dell, ibm, hp;
+
+# Boost recognized categories
+[category] +> $category:[category];
+[category] :- laptop, digital camera, camera;
+```
+
+The rules in a rule base are evaluated in order from the top down. A rule is matched as many times as possible before evaluation moves on to the next rule. So the query _colour colour_ will be rewritten to _color color_ before moving on to the next rule.
+
+## Configuration
+
+A rule base file is placed in the `rules/` directory under the [application package](/en/reference/applications/application-packages), and the rule base is named after the file, excluding the `.sr` suffix. E.g. if the rules above are saved to `[my-application]/rules/example.sr`, the resulting rule base is named `example`.
+
+To make a rule base be used by default in queries, add `@default` on a separate line to the rule base. To deactivate the default rules, add [rules.off](/en/reference/api/query#rules.off) to the query.
+
+The rules can safely be updated at any time by running `vespa prepare` again. If there are errors in the rule bases, they will not be updated, and the errors will be reported on the command line.
+
+To trace what the rules are doing, add [tracelevel.rules=[number]](/en/reference/api/query#tracelevel.rules) to the query.
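+
+For example, a request along these lines would include rule evaluation trace output. It assumes a default rule base is deployed; the trace level `2` and the query text are illustrative values only:
+
+```json
+{
+  "query": "lotr dvd",
+  "tracelevel.rules": 2
+}
+```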
+
+## Using multiple rule bases
+
+It is possible to place multiple rule bases in the `[my-application]/rules/` directory and choose between them in the query. Rule bases may also include each other. This is useful to organize larger sets of rules, to experiment with variants of the rule set in new bases which include the standard base, or to use different sets of rules for different use cases.
+
+To include one rule base in another, add `@include(rulebasename)` on a separate line, where _rulebasename_ is the file name (with or without the _.sr_ suffix). The result will be the same as if the included rule base were copied into the location of the `include` line. If a condition is defined in both bases, the one from the _including_ base will be used. It is also possible to refer to the same-named condition in an included rule base using the `@super` directive as a condition. For example, this rule base adds some more categories to the _category_ definition in the `example.sr` above:
+
+```sql
+@include(example)
+
+# Category becomes laptop, digital camera, camera, palmtop, phone
+[category] :- @super, palmtop, phone;
+```
+
+Multiple rule bases can be included, and included rule bases can themselves have included rule bases. All the rule bases included in the application package will be available when making queries. One of the rule bases can be made the default by adding `@default` on a separate line in the rule base. To use another rule base, add [rules.rulebase=[rulebasename]](/en/reference/api/query#rules.rulebase) to the query.
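+
+As an illustration, assuming a rule base file `rules/example.sr` as above, a query that selects it explicitly might be sent as follows (the query text is made up for this sketch):
+
+```json
+{
+  "query": "used sony laptop",
+  "rules.rulebase": "example"
+}
+```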
+
+## Using a finite state automaton
+
+_Finite state automata_ (FSA) are efficient in storing and making lookups in large string lists. A rule base can be compiled into an FSA to increase performance. An automaton is created from a text file which lists the condition terms to match and the condition names separated by a tab (by default). The name of the condition can be followed by a semicolon and additional data which will be ignored.
+
+This automaton source file defines the same as the _stopword_ and _brand_ conditions in the example rule base:
+
+```txt
+and stopword
+or stopword
+be stopword
+the stopword
+sony brand
+dell brand
+ibm brand; This text is ignored
+hp brand
+```
+
+Use [vespa-makefsa](/en/reference/operations/tools#vespa-makefsa) to compile the automaton file:
+
+```bash
+$ vespa-makefsa -t sourcefile.txt targetfile.fsa
+```
+
+The target file is used from a rule base by adding _@automata(automatonfile)_ on a separate line in the rule base file (the file path is relative to _$VESPA\_HOME_). Automata-files must be stored on all container nodes.
+
+Note that automata are not included in others, so a rule base including another which uses an automaton must also declare to use the same automaton (or an automaton containing any changes from the automaton of the included base).
+
+## Query phrasing
+
+Users search for phrases like _New York_, _Rolling Stones_, _The Who_, or _daily horoscopes_. Considering the latter, most of the time the query string will look like this:
+
+```bash
+/search/?query=daily horoscopes&…
+```
+
+This is actually a search for documents where both _daily_ and _horoscopes_ match, but not necessarily documents with the exact phrase _"daily horoscopes"_. PhrasingSearcher is a Searcher that compares queries with a list of common phrases, and replaces two search terms with a phrase. If _"daily horoscopes"_ is a common phrase, the above query becomes:
+
+```bash
+/search/?query="daily horoscopes"&…
+```
+
+The PhrasingSearcher must be configured with a list of common phrases, compiled into a _finite state automaton_ (FSA). The phrase list must be:
+
+- all lowercase
+- sorted alphabetically
+
+Example:
+
+```bash
+$ perl -ne 'print lc' listofphrasestextfile.unsorted.mixedcase | sort > listofphrasestextfile
+```
+
+Note that the Perl command to convert the text file to lowercase does not handle non-ASCII characters well (this is just an example). If the list of phrases is e.g. UTF-8 encoded and/or contains non-English characters, double-check that the resulting file is correct.
+
+Use [vespa-makefsa](/en/reference/operations/tools#vespa-makefsa) to compile the list into an FSA file:
+
+```bash
+$ vespa-makefsa listofphrasestextfile phrasefsa
+```
+
+Put the file on all container nodes, configure the location and [deploy](/en/basics/applications):
+
+```xml
+
+
+
+
+
+
+
+ /path/phrasefsa
+
+
+
+
+
+
+```
+
diff --git a/mintlify-docs/en/linguistics/troubleshooting-encoding.mdx b/mintlify-docs/en/linguistics/troubleshooting-encoding.mdx
new file mode 100644
index 0000000000..ab15b9df7d
--- /dev/null
+++ b/mintlify-docs/en/linguistics/troubleshooting-encoding.mdx
@@ -0,0 +1,49 @@
+---
+title: "Troubleshooting character encoding"
+---
+
+This document helps recognize the most common problems related to Unicode and I18N.
+
+UTF-8 is a Unicode-specific encoding where each code point (roughly, each letter) is encoded as one to four 8-bit bytes. The UTF-8 scheme can technically use more bytes, but Unicode is defined as having approximately 1 million code points (partly because of limitations of UTF-16), so more than four bytes are never necessary.
+
+A string in Java is stored as UTF-16, a series of 16-bit chars. All code points in the Unicode base plane, the first 64k code points, are represented as a single char, while higher code points are represented using surrogate pairs. A surrogate pair is a pair of chars from a reserved range.
+
+Accessing a code point in a Java string is done using e.g. `String.codePointAt()`, which returns a 32-bit integer representing the code point (basically UCS-4). When traversing a string in Java, use `codePointAt` with `offsetByCodePoints`, `String.codePoints()`, or similar methods. If your application conceptually handles letters, using `String.charAt()` will most of the time be wrong. To calculate buffer sizes for UTF-8 buffers with UTF-16 inputs without doing speculative encoding, Vespa has a toolbox, [com.yahoo.text.Utf8](https://github.com/vespa-engine/vespa/blob/master/vespajlib/src/main/java/com/yahoo/text/Utf8.java), with static helper methods.
+
+If you are using Python, use the following to remove control characters:
+
+```python
+import unicodedata
+
+def remove_control_characters(s):
+    return "".join(ch for ch in s if unicodedata.category(ch)[0] != "C")
+```
+
+## Visual pattern matching of encoding bugs
+
+| Transformation | Result |
+| :--- | :--- |
+| Input | hôtel |
+| Correctly URL quoted (Vespa always uses UTF-8 there) | h%C3%B4tel |
+| Encoded as ISO-8859-1 (ISO Latin-1), then URL quoted | h%F4tel |
+| Encoded as UTF-16 (as in Java strings), then URL quoted | %00h%00%F4%00t%00e%00l |
+| For completeness, little endian UTF-16, including byte order marker | %FF%FEh%00%F4%00t%00e%00l%00 |
+
+What we are looking for is single bytes outside ASCII, i.e. ordinal above 127. Given UTF-8, there should always be sequences of two or more of these when a code point is outside ASCII. The first byte for each code point will have the two most significant bits set, in other words hex C to hex F. The rest of the bytes for that code point will have the most significant bit set, and the second most unset, in other words hex 8 to hex B.
+
+From here, we move on to the two most common de-/encoding errors:
+
+| Error | Hex dump of code points | Rendered |
+| :--- | :--- | :--- |
+| UTF-8 input decoded as if it were ISO-8859-1 | h\xc3\xb4tel | hÃ´tel |
+| UTF-8 input re-encoded as UTF-8, then decoded as UTF-8 again | h\xc3\xb4tel | hÃ´tel |
+
+Note how these two bugs create exactly the same byte sequences. This is because the first 256 code points of Unicode are identical to ISO-8859-1. What we are looking for is line noise in-between normal ASCII, as both ISO-8859-1 and Unicode are ASCII compatible.
+
+Trying to decode valid ISO-8859-1 input with a UTF-8 decoder will usually make the decoder report the input as invalid if there are code points outside ASCII. Valid ISO-8859-1 rarely ends up conforming to the required bit patterns of valid UTF-8, though it sometimes happens.
+
+_Never_ try to debug encoding problems with a web browser. Always use a hexdump tool. `xxd` is a nice utility which is included with vim, which avoids several of the endianness headaches associated with some UNIX alternatives.
+
+Also, remember that Windows-1252 is _not_ the same as ISO-8859-1.
+
+## JSON
+
+Use proper JSON - a common error is not stripping ASCII control characters from feed data. See [stripInvalidCharacters](https://github.com/vespa-engine/vespa/blob/master/vespajlib/src/main/java/com/yahoo/text/Text.java) for a utility function.
diff --git a/mintlify-docs/en/modules/e-commerce/multi-currency-filtering.mdx b/mintlify-docs/en/modules/e-commerce/multi-currency-filtering.mdx
new file mode 100644
index 0000000000..f3ec481bd0
--- /dev/null
+++ b/mintlify-docs/en/modules/e-commerce/multi-currency-filtering.mdx
@@ -0,0 +1,500 @@
+---
+title: Multi-Currency Pricing
+---
+
+Vespa for e-commerce includes multi-currency pricing support for applications with global product catalogs, where products are priced in different currencies and sold across multiple markets. Applications can query, filter, and rank products using prices expressed in the buyer's preferred currency: price-range filters work in any currency, converted prices can be used in ranking, and currency conversion is applied automatically when market-specific pricing is not available.
+
+## Overview
+
+The multi-currency pricing feature supports:
+
+- **Per-market pricing** - Define different prices for different markets on each product.
+- **Keeping track of exchange rates** - An N×N tensor mapping of currency-to-currency exchange rates is stored in a "forex" document, and can be updated at any time.
+- **Automatic currency conversion** - Fallback to a default market when no other market-specific price exists for the buyer's market.
+- **Query-time filtering** - Filter products by price range in any currency.
+- **Ranking integration** - Optional exposure of currency rates for use in ranking expressions (ranking on the computed price).
+
+The implementation consists of two key components:
+
+- **MultiCurrencyFilterSearcher** - A custom searcher that intercepts queries and dynamically filters products based on effective prices.
+- **CachedForexRateService** - A background service that stores exchange rates from the forex document in-memory for faster look-ups.
+
+## Quick Start
+
+This quick start walks through an end-to-end example of enabling multi-currency pricing in a Vespa application.
+
+### Define Schemas
+
+Create two schemas: one to store the forex rates, and one for products. If you already have an existing product schema, you can reuse it as long as it contains the required fields described below.
+
+#### Forex Schema
+
+The forex schema stores currency exchange rates as a tensor. Add a `forex.sd` schema to your application defined as:
+
+```json
+schema forex {
+ document forex {
+ field timestamp type long {
+ indexing: attribute | summary
+ }
+
+ field rates type tensor(from{}, to{}) {
+ indexing: attribute | summary
+ }
+ }
+}
+```
+
+#### Product Schema
+
+The product schema stores products with their seller currency and per-market prices. The `per_market_price` array contains price overrides for specific markets, with a `DEFAULT` market used as fallback. Every product must include a `DEFAULT` entry, and all `per_market_price.price` values are expressed in the document's `seller_currency`.
+
+
+ **Note:** Every per-market override is stored in the seller's native currency, so the searcher can convert buyer price windows instead of rewriting stored prices.
+
+
+
+ **Important:** `fast-search` on the struct-fields is recommended. Without it, price filtering becomes significantly slower as the number of supported currencies grows. `rank: filter` can be added for further optimization, but this depends on the specific ranking setup; see [rank: filter](/en/reference/schemas/schemas#filter) for details.
+
+
+```js expandable
+schema product {
+ document product {
+
+ # Your existing fields above
+
+ field seller_currency type string {
+ indexing: summary | attribute
+ }
+
+ struct market_price {
+ field market type string {}
+ field price type double {}
+ }
+
+ field per_market_price type array<market_price> {
+ indexing: summary
+ summary: matched-elements-only
+ struct-field market {
+ indexing: attribute
+ attribute: fast-search
+ }
+ struct-field price {
+ indexing: attribute
+ attribute: fast-search
+ }
+ }
+
+ # Your existing fields below
+
+ }
+}
+```
+
+### Configure Services
+
+Vespa only applies multi-currency filtering when the searcher and forex cache are wired into the container cluster. Queries must pass through a chain that includes `MultiCurrencyFilterSearcher`, and the `ForexRateRetriever` must read the global forex document via its own search chain. Add both chains and the two components to your container definition in `services.xml`:
+
+Inside your existing `<container>` block, add the multi-currency chains and components:
+
+```xml
+
+
+
+
+
+
+
+
+
+
+```
+
+In the `<content>` cluster, ensure both document types are declared:
+
+```
+
+
+
+
+```
+
+The retriever issues background queries through the `forex-cache` chain. If that chain is missing or restricts the wrong document type, the cache never reaches `READY` and queries fail with "ensure exactly one forex document exists".
+
+Putting it all together, a minimal `services.xml` might look like this:
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+### Feed Data
+
+#### Feed Forex Rates
+
+Feed a single forex document with ID `id:forex:forex::forex` containing all currency-to-currency exchange rates. Include identity rates (e.g., USD→USD = 1.0) to avoid missing-cell lookups. The `timestamp` field is required and must be updated with each rate change to ensure the cache picks up new rates.
+
+ **Warning:** Exactly one global forex document must exist. If multiple documents are present, the retriever reports `INVALID_FOREX_DOCUMENTS` and the searcher returns error hits instructing you to keep a single forex document.
+
+```json
+{
+ "put": "id:forex:forex::forex",
+ "fields": {
+ "timestamp": 1757385600,
+ "rates": {
+ "cells": [
+ {"address": {"from": "USD", "to": "USD"}, "value": 1.0},
+ {"address": {"from": "USD", "to": "EUR"}, "value": 0.92},
+ {"address": {"from": "USD", "to": "GBP"}, "value": 0.78},
+ {"address": {"from": "USD", "to": "NOK"}, "value": 10.50},
+ {"address": {"from": "EUR", "to": "USD"}, "value": 1.09},
+ {"address": {"from": "EUR", "to": "EUR"}, "value": 1.0},
+ {"address": {"from": "EUR", "to": "GBP"}, "value": 0.85}
+ ]
+ }
+ }
+}
+```
+
+#### Feed Products
+
+Feed products with their seller currency and per-market prices. Always include a `DEFAULT` market entry as fallback.
+
+```json
+{
+ "put": "id:product:product::sku-100",
+ "fields": {
+ "seller_currency": "USD",
+ "per_market_price": [
+ {"market": "DEFAULT", "price": 199.0},
+ {"market": "EU", "price": 189.0},
+ {"market": "UK", "price": 209.0},
+ {"market": "NO", "price": 300.0}
+ ]
+ }
+}
+```
+
+
+ **Note:** If your product schema already includes identifiers or descriptive fields (such as `product_id` or `product_name`), include them in the feed as usual. The example keeps only the required currency fields so it works with the minimal schema shown above.
+
+
+### Query with Price Filtering
+
+Use the following query parameters to filter products by price range in a specific market and currency:
+
+| Parameter | Description | Example |
+| :--- | :--- | :--- |
+| `ecommerce.multicurrency.market` | Target market code | `NO`, `US`, `EU`, `NO-49`, `27` |
+| `ecommerce.multicurrency.currency` | Target currency code | `NOK`, `USD`, `EUR` |
+| `ecommerce.multicurrency.price-min` | Minimum price in target currency | `1000` |
+| `ecommerce.multicurrency.price-max` | Maximum price in target currency | `1500` |
+| `ecommerce.multicurrency.enrich` | Optional: expose forex rates as query tensor for ranking. Defaults to false | `true` or `false` |
+
+#### Example Query
+
+```bash
+$ vespa query \
+ 'yql=select * from product where true' \
+ 'searchChain=multi-currency-filter' \
+ 'ecommerce.multicurrency.market=NO' \
+ 'ecommerce.multicurrency.currency=NOK' \
+ 'ecommerce.multicurrency.price-min=1000' \
+ 'ecommerce.multicurrency.price-max=1500'
+```
+
+This query returns all products whose effective price in NOK (Norwegian Krone) for the Norwegian market is between 1000 and 1500 NOK. The searcher will:
+
+1. Check if the product has a market-specific price for `NO`
+2. If yes, use that price directly
+3. If no, convert the product's `DEFAULT` market price from the seller currency to NOK using forex rates
+4. Keep only products within the specified price range
+
+### Validation Rules
+
+The multi-currency searcher validates the query parameters against the following rules:
+
+- **Currency codes** must be exactly 3 letters (ISO-4217 format, e.g., `USD`, `EUR`, `NOK`)
+- **Market codes** must be alphanumeric (e.g., `US`, `NO`, `EU`, `NO-47`, `13`)
+- **Price values** must be valid numbers and non-negative
+- **Price range**: `price-max` must be greater than or equal to `price-min`
+- **Currency availability**: The requested currency must exist in the forex document
+
+If any parameter is missing or invalid, the searcher will either skip filtering (for format issues) or return an error result (for logical issues like invalid price ranges or unknown currencies).
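
A client can apply the same rules before sending a query. The Python sketch below is an approximation under stated assumptions, not the searcher's code: the function name `check_params` is hypothetical, the market pattern allows hyphens only because the examples include codes such as `NO-47`, and treating a negative price as a logical issue rather than a format issue is a guess.

```python
import re

CURRENCY_RE = re.compile(r"^[A-Za-z]{3}$")
# Assumption: hyphens are accepted, since the examples include codes like NO-47.
MARKET_RE = re.compile(r"^[A-Za-z0-9-]+$")


def check_params(market, currency, price_min, price_max, known_currencies):
    """Return 'ok', 'skip' (missing or malformed input) or 'error' (logical issue)."""
    if not market or not currency or price_min is None or price_max is None:
        return "skip"
    if not CURRENCY_RE.match(currency) or not MARKET_RE.match(market):
        return "skip"
    try:
        low, high = float(price_min), float(price_max)
    except ValueError:
        return "skip"
    # Assumption: negative prices and inverted ranges count as logical issues.
    if low < 0 or high < 0 or high < low:
        return "error"
    if currency not in known_currencies:
        return "error"
    return "ok"
```

For instance, `check_params("NO", "NOK", "1500", "1000", {"NOK"})` yields `"error"` because the range is inverted, while a four-letter currency code yields `"skip"`.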
+
+
+ **Note:** When filtering is skipped due to malformed inputs, the searcher acts as a no-op and the trace log records the reason (for example, "currency failed ISO-4217 validation; skipping filter"). Use [query tracing](/en/reference/api/query#tracing) to confirm whether the multi-currency filter actually ran.
+
+
+### Updating Forex Rates
+
+Forex rates can be updated at any time by feeding a new version of the forex document with an updated `timestamp` field. The cache picks up the new rates on its next refresh cycle (the retriever polls every 10 seconds).
+
+```bash
+$ vespa feed <(echo '{
+ "update": "id:forex:forex::forex",
+ "fields": {
+ "timestamp": {"assign": 1757472000},
+ "rates": {
+ "assign": {
+ "cells": [...]
+ }
+ }
+ }
+}')
+```
+
+## How It Works
+
+### Price Resolution Logic
+
+For each product, the effective price in the target currency is determined as follows:
+
+1. **Market-specific price:** If the product has a price entry for the requested market, use that price directly
+2. **Currency conversion:** Otherwise, use the `DEFAULT` market price and convert it from the seller currency to the target currency using forex rates
+3. **Price range filter:** Keep only products whose effective price falls within the specified min/max range
+
+### Forex Cache
+
+The `CachedForexRateService` component maintains an in-memory cache of exchange rates and refreshes them periodically from the forex document (`id:forex:forex::forex`). This ensures low-latency access to forex rates during query processing.
+
+#### Automatic Refresh
+
+The `ForexRateRetriever` component automatically refreshes forex rates every 10 seconds using a fixed schedule. This cadence (10s interval, 5s retry window, 1s per attempt) is hard-coded in the provided component and cannot be tuned at deployment time. Each refresh cycle:
+
+- Queries the forex document using the `forex-cache` search chain
+- Validates the document has both `rates` (tensor) and `timestamp` (long) fields
+- Only applies updates if the timestamp is newer than the cached version
+- Retries within a 5-second budget if the first attempt fails
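
The timestamp check is why a rate change must always come with a newer `timestamp`. A minimal Python sketch of that rule follows; the class `ForexCache` and its `apply` method are hypothetical names for illustration and do not mirror the actual component.

```python
class ForexCache:
    """Keeps the most recent rates; older or equal timestamps are ignored."""

    def __init__(self):
        self.timestamp = None
        self.rates = {}

    def apply(self, timestamp, rates):
        """Return True if the update was applied, False if it was skipped."""
        if self.timestamp is not None and timestamp <= self.timestamp:
            return False
        self.timestamp = timestamp
        self.rates = dict(rates)
        return True


cache = ForexCache()
cache.apply(1757385600, {("USD", "NOK"): 10.50})
cache.apply(1757385600, {("USD", "NOK"): 11.00})  # same timestamp, skipped
```

In this sketch the second call leaves the cached rate at 10.50, which is the behavior to expect when rates are re-fed without bumping the timestamp.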
+
+#### Health States
+
+The forex service tracks its operational status with the following health states:
+
+| State | Description | Query Behavior |
+| :--- | :--- | :--- |
+| `READY` | Forex rates loaded and service is operational | Queries with multi-currency filtering work normally |
+| `UNINITIALIZED` | No forex document has been loaded yet | Queries return error: "forex rate service not initialized" |
+| `OUTAGE` | Refresh failed but stale data exists (cache stays ready for re-use once the retriever succeeds again) | Queries return error: "forex rate service temporarily unavailable (last refresh failed)" |
+| `INVALID_FOREX_DOCUMENTS` | Multiple forex documents detected (expected exactly one) | Queries return error: "ensure exactly one forex document exists" |
+
+#### Error Handling
+
+When the service is not in `READY` state, queries with multi-currency filtering will:
+
+- Return an empty result with an appropriate error message
+- Log detailed diagnostic information at appropriate trace levels
+- Continue retrying background refresh attempts until successful
+
+### Performance
+
+Multi-currency price filtering is implemented as efficient query-time filter construction, not result-time evaluation. This means Vespa can use its indexes to find matching products without iterating through all documents.
+
+#### How Filtering Works
+
+When a query with multi-currency parameters is received, the searcher:
+
+1. **Pre-computes price ranges:** Converts the buyer's price range (e.g., 1000-1500 NOK) into equivalent ranges for every seller currency using cached forex rates. For example, if the forex cache has USD, EUR, and GBP, it computes what 1000-1500 NOK equals in each currency.
+2. **Builds structured query filters:** Creates a query tree using Vespa's efficient query primitives:
+ - `SameElementItem` - Matches documents where market and price appear in the same array element
+ - `RangeItem` - Efficiently filters on numeric price ranges using indexes
+ - `WordItem` - Matches exact seller currency and market values
+
+3. **Injects filter into query tree:** Combines the price filter with the user's query, allowing Vespa's query execution engine to evaluate it efficiently using indexes.
+
+This approach has several performance benefits:
+
+- **No document iteration:** Vespa uses attribute indexes to quickly identify matching documents without fetching and evaluating all products
+- **One-time conversion:** Currency conversion happens once during query construction, not for every product in the result set
+- **Index-backed filtering:** Price range and market matching leverage Vespa's fast attribute lookups
+- **Query optimization:** Vespa's query optimizer can reorder and optimize the combined query tree for efficient execution
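
The pre-computation of price ranges can be sketched in a few lines of Python. This is an illustration only: the function name `seller_ranges`, the dictionary layout of the rates, the sample rates, and the use of buyer-to-seller rates for the conversion are assumptions, not the searcher's implementation.

```python
def seller_ranges(price_min, price_max, buyer_currency, rates):
    """Convert a buyer price range into one range per seller currency.

    `rates` maps (from_currency, to_currency) to an exchange rate.
    """
    ranges = {}
    for (src, dst), rate in rates.items():
        if src == buyer_currency:
            ranges[dst] = (price_min * rate, price_max * rate)
    return ranges


rates = {
    ("NOK", "NOK"): 1.0,
    ("NOK", "USD"): 0.10,  # illustrative values only
    ("NOK", "EUR"): 0.09,
}
ranges = seller_ranges(1000, 1500, "NOK", rates)
```

Each entry in `ranges` corresponds to one branch of the filter: a seller currency paired with the numeric range that products priced in that currency must fall into.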
+
+## Advanced Usage
+
+### Custom Field Configuration
+
+By default, the multi-currency components expect specific field names in your product schema. You can customize these field names using the `ecommerce-schema-wiring` configuration. See [Configuration Reference](#configuration) for all available parameters and their defaults.
+
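+```xml
+<config name="ai.vespa.ecommerce.common.ecommerce-schema-wiring">
+    <productFields>
+        <sellerCurrency>seller_currency</sellerCurrency>
+        <perMarketPriceArrayStruct>per_market_price</perMarketPriceArrayStruct>
+        <marketStructField>market</marketStructField>
+        <priceStructField>price</priceStructField>
+    </productFields>
+    <defaults>
+        <market>DEFAULT</market>
+    </defaults>
+    <rankProfileInputs>
+        <forexRates>forexRates</forexRates>
+    </rankProfileInputs>
+</config>
+```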
+
+### Using Forex Rates in Ranking
+
+When `ecommerce.multicurrency.enrich=true` is set, the searcher exposes the forex rates as a query tensor `query(forexRates)` that can be used in ranking expressions. The ranking profile should implement the same fallback logic as the searcher: check for market-specific prices first, then fall back to the `DEFAULT` market price, and convert to the buyer's currency.
+
+```js expandable
+rank-profile price_ranking {
+ inputs {
+ query(forexRates) tensor(from{}, to{})
+ query(buyer_currency) tensor(to{})
+ query(buyer_market) tensor(market{})
+ }
+
+ function from_selector() {
+ expression: tensorFromLabels(attribute(seller_currency), from)
+ }
+
+ function buyer_rate() {
+ expression: sum(query(forexRates) * from_selector() * query(buyer_currency), from, to)
+ }
+
+ function price_tensor() {
+ expression: tensorFromStructs(attribute(per_market_price), market, price, double)
+ }
+
+ function market_specific_price() {
+ expression: sum(price_tensor() * query(buyer_market), market)
+ }
+
+ function default_price() {
+ expression: price_tensor(){market:'DEFAULT'}
+ }
+
+ function effective_price_in_seller_currency() {
+ expression: if(market_specific_price() > 0, market_specific_price(), default_price())
+ }
+
+ function effective_price_in_buyer_currency() {
+ expression: effective_price_in_seller_currency() * buyer_rate()
+ }
+
+ first-phase {
+ expression: -effective_price_in_buyer_currency()
+ }
+}
+```
+
+#### Filter Parameters vs Ranking Inputs
+
+There are two distinct types of query parameters used together, and they must not be confused:
+
+| Type | Prefix | Format | Purpose |
+| :--- | :--- | :--- | :--- |
+| Filter parameters | `ecommerce.multicurrency.*` | Plain string or number | Tells the searcher which market, currency, and price range to filter on. Consumed server-side; these never reach the rank profile. |
+| Ranking inputs | `ranking.features.query(...)` | One-hot tensor | Passed directly to the rank profile to drive scoring expressions. The searcher does not read or modify these. |
+
+
+ **Note:** The only exception is `enrich=true`, which causes the searcher to inject `query(forexRates)` from its in-memory cache, because the client cannot know the current rates. `buyer_currency` and `buyer_market` are already known to the client, so they are passed directly as ranking inputs, not via the searcher.
+
+
+#### One-Hot Tensor Format
+
+Ranking inputs use one-hot encoded tensors. The format is `{{dimension:value}:1}` where `dimension` is the named dimension defined in the rank profile `inputs` block, and `value` is the label to select:
+
+- `query(buyer_currency)` has dimension `to`: pass `{{to:NOK}:1}`
+- `query(buyer_market)` has dimension `market`: pass `{{market:NO}:1}`
+
+Both tensors are required by the example rank profile above. Omitting either causes its ranking expressions to produce zero for all documents.
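
Because the literal is easy to mistype, a client may prefer to build it programmatically. The Python sketch below shows one way; the helper names `one_hot` and `ranking_inputs` are hypothetical, while the parameter names and dimensions are taken from the example rank profile.

```python
def one_hot(dimension, label):
    """Format a one-hot tensor literal such as {{to:NOK}:1}."""
    return "{{" + f"{dimension}:{label}" + "}:1}"


def ranking_inputs(currency, market):
    """Query parameters for the two ranking inputs of the example profile."""
    return {
        "ranking.features.query(buyer_currency)": one_hot("to", currency),
        "ranking.features.query(buyer_market)": one_hot("market", market),
    }
```

The returned dictionary can be merged with the `ecommerce.multicurrency.*` filter parameters when assembling the query request.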
+
+#### Complete Query with Filtering and Ranking
+
+The following example uses the `price_ranking` profile defined above with price filtering and both ranking input tensors:
+
+```bash
+$ vespa query \
+ 'yql=select * from product where true' \
+ 'searchChain=multi-currency-filter' \
+ 'ecommerce.multicurrency.market=NO' \
+ 'ecommerce.multicurrency.currency=NOK' \
+ 'ecommerce.multicurrency.price-min=1000' \
+ 'ecommerce.multicurrency.price-max=1500' \
+ 'ecommerce.multicurrency.enrich=true' \
+ 'ranking.profile=price_ranking' \
+ 'ranking.features.query(buyer_currency)={{to:NOK}:1}' \
+ 'ranking.features.query(buyer_market)={{market:NO}:1}'
+```
+
+Key functions in the `price_ranking` profile:
+
+- `tensorFromStructs` - Converts the `per_market_price` array to a tensor at ranking time
+- `market_specific_price()` - Extracts price for the requested market if it exists
+- `default_price()` - Gets the `DEFAULT` market price as fallback
+- `effective_price_in_seller_currency()` - Selects market-specific price or falls back to `DEFAULT`
+- `effective_price_in_buyer_currency()` - Converts the effective price using forex rates
+
+## Configuration Reference
+
+This section describes the [configuration parameters](/en/applications/configuring-components) used by the multi-currency components. All parameters are part of the `ecommerce-schema-wiring` config (`ai.vespa.ecommerce.common.ecommerce-schema-wiring`).
+
+| Parameter | Description | Type | Default |
+| :--- | :--- | :--- | :--- |
+| `productFields.sellerCurrency` | Field name for the product's seller currency. | `string` | `seller_currency` |
+| `productFields.perMarketPriceArrayStruct` | Array field name containing per-market prices. | `string` | `per_market_price` |
+| `productFields.marketStructField` | Struct field name for market code. | `string` | `market` |
+| `productFields.priceStructField` | Struct field name for price value. | `string` | `price` |
+| `defaults.market` | Default market identifier used as fallback when no market-specific price exists. | `string` | `DEFAULT` |
+| `rankProfileInputs.forexRates` | Query tensor name for forex rates in ranking. Used when `enrich=true` to inject the forex tensor into the query. | `string` | `forexRates` |
+
+## Requirements
+
+- **Single global forex document:** Maintain one document with ID `id:forex:forex::forex` and mark it `global="true"`. Additional documents trigger the `INVALID_FOREX_DOCUMENTS` health state and queries fail.
+- **Forex payload completeness:** Every feed/update must include the `rates` tensor for all buyer/seller pairs you filter on, identity rates (USD→USD, etc.), and a monotonically increasing `timestamp` (epoch seconds).
+- **Product schema layout:** Products expose `seller_currency`, encode all `per_market_price.price` values in that seller currency, and include a `DEFAULT` market entry.
+- **Container wiring:** Deploy the `multi-currency-filter` search chain and the `forex-cache` chain in your container cluster, along with the `ForexRateService` and `ForexRateRetriever` components.
+- **Query parameters:** Multi-currency filtering only runs when the query supplies `market`, `currency`, `price-min`, and `price-max`. Missing or malformed parameters cause the searcher to skip filtering.
+
+## Recommended Practices
+
+- **Structure product ids/names as needed:** Keep your existing product fields (IDs, names, facets) and add the required currency fields alongside them.
+- **Model asymmetric rates:** Store both A→B and B→A conversions explicitly so buyer→seller lookups stay accurate even when FX rates are not perfect inverses.
+- **Plan update cadence:** Choose how often you feed forex data based on market volatility. The retriever polls every 10 seconds, so frequent feeds are reflected quickly.
+- **Default chain selection:** Either set `searchChain=multi-currency-filter` on relevant queries or make it the default chain so multi-currency filtering is always applied when parameters are present.
+
+## See Also
+
+- [Saved Search Notifications](/en/modules/e-commerce/saved-search)
+- [Using Features Together](/en/modules/e-commerce/using-features-together)
+- [E-commerce tutorial](/en/learn/tutorials/e-commerce)
+- [Searcher Development](/en/applications/searchers)
+- [Tensor Guide](/en/ranking/tensor-user-guide)
+- [tensorFromStructs - Convert struct arrays to tensors](/en/reference/ranking/rank-features#tensorFromStructs(attribute,key,value,type))
+- [Struct Fields in Schemas](/en/reference/schemas/schemas#struct-field)
+- [Search Chains](/en/applications/searchers#search-chains)
\ No newline at end of file
diff --git a/mintlify-docs/en/modules/e-commerce/saved-search.mdx b/mintlify-docs/en/modules/e-commerce/saved-search.mdx
new file mode 100644
index 0000000000..4de0f9e70c
--- /dev/null
+++ b/mintlify-docs/en/modules/e-commerce/saved-search.mdx
@@ -0,0 +1,350 @@
+---
+title: Saved Search Notifications
+---
+
+
+
+**Important:** **Experimental feature under active development.** Saved Search Notifications is in early access and is being actively developed. Schemas, configuration, the webhook payload format, and other public APIs may change in backwards-incompatible ways without notice. Do not rely on this feature for production-critical workloads, and expect to migrate as the feature evolves. Feedback is welcome - please reach out to [Vespa Support](https://vespa.ai/support/).
+
+
+Vespa for e-commerce includes a module for storing queries in Vespa ("searches") and issuing notifications when a new or updated document matches any saved searches. A typical use case in e-commerce is letting users store queries on products using filters on keywords, price, location etc. and sending them a notification when a new product matches their query.
+
+## Overview
+
+The saved search notifications feature supports:
+
+- **Storing predicate queries** - Saved searches contain arbitrary boolean expressions over a set of string attributes and numerical ranges. See [Predicate Fields](/en/schemas/predicate-fields).
+- **Schema wiring configuration** - Wirings between the saved search attributes and the document fields can be configured by the application.
+- **Webhook notifications** - A match between a new or updated document and a set of saved searches can be sent to an HTTP endpoint as JSON.
+- **Separate document processing** - Separate routing to the saved search component allows processing of saved searches without disrupting normal feed operations.
+
+## Quick Start
+
+A minimal setup for demonstrating saved search notification capabilities is given in this section. We will develop a small shopping use-case example with a few saved search attributes.
+
+### Define Schemas
+
+Create two schemas: one for products and one for storing the saved searches.
+
+#### Product Schema
+
+We create a minimal document type representing a product for sale. Each of the three fields will correspond to a searchable attribute in the saved searches.
+
+```js expandable
+schema product {
+ document product {
+
+ # Other fields
+
+ field price type int {
+ indexing: attribute
+ }
+
+ field category type string {
+ indexing: attribute
+ }
+
+ field condition type string {
+ indexing: attribute
+ }
+ }
+
+ # rank-profiles etc.
+}
+```
+
+#### Saved Search Schema
+
+The predicate field will contain the entire search expression used to match products.
+
+```js expandable
+schema saved_search {
+ document saved_search {
+ field filters type predicate {
+ indexing: attribute
+ index {
+ arity: 2 # Mandatory
+ # Range of values the expressions are expected to operate on.
+ # Better performance if these are smaller
+ lower-bound: 3
+ upper-bound: 500
+
+ dense-posting-list-threshold: 0.25
+ }
+ }
+ }
+}
+```
+
+### Configure Services
+
+A minimal `services.xml` configuring the saved search component can look like this:
+
+```xml expandable
+
+
+
+
+
+
+
+
+
+
+ WEBHOOK
+
+ http://localhost:8000/notification
+
+
+
+ product
+ saved_search
+ filters
+ 100
+
+
+
+ category
+ category
+ true
+
+
+ condition
+ condition
+ false
+
+
+
+
+ price
+ price
+ true
+
+
+
+
+
+
+
+
+ 2
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+### Feed Data
+
+#### Feed saved searches
+
+To test the functionality, feed two saved search documents:
+
+```json
+[
+ {"put": "id:saved_search:saved_search::search1", "fields": {
+ "filters": "price in [20..100] and category in [Sports, Books]"
+ }},
+ {"put": "id:saved_search:saved_search::search2", "fields": {
+ "filters": "price in [200..487] and category in [Electronics]"
+ }}
+]
+```
+
+#### Feed a product to the notification route
+
+Assume a new product is available, represented by the following document:
+
+```json
+{"put": "id:saved_search:product::product1", "fields":{"category": "Sports", "price": 50}}
+```
+
+To enable notifications when feeding this product, feed it to the `notification-route`:
+
+```bash
+$ vespa feed --route notification-route product.jsonl
+```
+
+Assuming everything is set up correctly, it should match `id:saved_search:saved_search::search1` but not `search2`. If a server is receiving requests at the endpoint specified in the `URL` parameter of the `webhook` configuration, you should see a request with a JSON body representing the matched pair.
+
+
+ **Warning:** The `SavedSearchDocumentProcessor` acts as a **sink** for incoming documents. That is, `Put` and `Update` operations sent to that document processor will not propagate down to the content nodes, effectively discarding the operations. This is why a `routingtable` is specified in the example - documents going to the route `notification-route` will "fork", with one path going to the content cluster and one path to the saved search component.
+
+
+## Notification kinds
+
+### Webhook
+
+The webhook notification kind will send a request to a specified URL for each document that matches a set of saved searches. It requires an external application to provide the handling of such requests.
+
+If the configuration parameter `notification.kind == WEBHOOK`, all configuration parameters prefixed with `notification.webhook` will take effect. The requests from the saved search application will be `POST`-requests with a JSON body:
+
+```json highlight= {7}
+{
+ "id": "",
+ "timestamp": "",
+ "matched_documents": [
+ "",
+ "",
+ ...
+ ]
+}
+```
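
A receiving application only needs to accept the `POST` request and decode the three fields. The following Python sketch uses the standard library's `http.server`; the names `parse_notification` and `NotificationHandler` are hypothetical, and it assumes that `id` identifies the fed document and `matched_documents` lists the matching saved searches. Since the payload format may change while the feature is experimental, treat the field handling as provisional.

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_notification(body):
    """Decode a notification body into (id, timestamp, matched_documents)."""
    payload = json.loads(body)
    return payload["id"], payload["timestamp"], list(payload["matched_documents"])


class NotificationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the client.
        length = int(self.headers.get("Content-Length", 0))
        doc_id, timestamp, matches = parse_notification(self.rfile.read(length))
        print(f"{doc_id} at {timestamp} matched {len(matches)} saved searches")
        self.send_response(200)
        self.end_headers()
```

To listen on the URL used in the Quick Start, the handler can be served with `HTTPServer(("localhost", 8000), NotificationHandler).serve_forever()`, where `HTTPServer` also comes from `http.server`.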
+
+#### Security
+
+In most cases, the webhook endpoint handling the notifications is (and should be) protected in some way. The currently supported way to send authorized requests to the webhook endpoint is by combining the [Vespa secret store](/en/security/secret-store) with the `notification.webhook.headers[].secret` config parameter. Assume we want to send notifications to `https://my.webhook.com/notification`, and that the API requires the following header:
+
+```text
+Authorization: Bearer TOKEN
+```
+
+to be present in all requests. To enable our application to use this, first create the secret in Vespa Cloud and let it contain the **full** value of the header: `Bearer TOKEN`, replacing `TOKEN` with the actual token.
+
+Next, add the secret to the application in `services.xml`:
+
+```xml expandable
+
+ ...
+
+
+
+```
+
+Finally, configure the `SavedSearchDocumentProcessor` to add a header with this secret value to all notification requests:
+
+```xml expandable
+...
+
+
+
+ WEBHOOK
+
+
+
+ Authorization
+
+ myApiToken
+
+
+
+
+
+
+```
+
+Webhook notifications are sent once without automatic retry. Delivery failures are recorded in the Vespa log.
+
+### Vespa Schema
+
+For a simpler way to test saved search notification, a method for storing the notifications within the Vespa application is provided. This method represents each notification between a pair of a product and a saved search using a dedicated Vespa document type. It is recommended for testing purposes only.
+
+If the configuration parameter `notification.kind == VESPA_SCHEMA`, all configuration parameters prefixed with `notification.vespaSchema` will take effect. A minimal working example of this notification kind is given below.
+
+#### Notification example
+
+Define a document type for storing the notifications, for example `notification.sd`:
+
+```js
+schema notification {
+ document notification {
+ field product_id type string {
+ indexing: attribute | summary
+ }
+
+ field saved_search_id type string {
+ indexing: attribute | summary
+ }
+
+ field timestamp type long {
+ indexing: attribute | summary
+ }
+ }
+}
+```
+
+Add the document type to the application:
+
+```xml
+
+
+ ...
+
+
+
+```
+
+Configure the schema wirings of the `SavedSearchDocumentProcessor`:
+
+```xml expandable
+
+
+
+
+
+ VESPA_SCHEMA
+
+ notification
+ saved_search
+ product_id
+ saved_search_id
+ timestamp
+
+
+
+
+```
+
+Now notifications can be inspected by using `vespa visit` or `vespa query` with the appropriate parameters.
+
+## Configuration reference
+
+This section describes the possible [configuration parameters](/en/applications/configuring-components) used by the document processor.
+
+| Parameter | Description | Type | Default value |
+| :--- | :--- | :--- | :--- |
+| `notification.kind` | Method to use for sending notifications. | `enum {WEBHOOK, DUMMY, VESPA_SCHEMA}` | `DUMMY` |
+| `notification.webhook.URL` | URL to send notification requests. | `string` | |
+| `notification.webhook.connectionPoolSize` | Number of HTTP client threads to use in the container cluster. | `int` | `20` |
+| `notification.webhook.headers[].key` | Key of a header to add to all webhook requests. | `string` | |
+| `notification.webhook.headers[].value` | Value of a header to add to all webhook requests. | `string` | |
+| `notification.webhook.headers[].secret` | Use a secret from Vespa secret store instead of the value provided in `.value`. The value provided here should match the name of a secret specified with a `secrets` tag in `services.xml`. | `string` | |
+| `notification.vespaSchema.documentType` | Name of the Vespa document type to use for storing notifications. This document type has to be defined in the application. | `string` | `saved_search_notification` |
+| `notification.vespaSchema.namespace` | Namespace to use for creating document IDs for the notification documents. | `string` | `saved_search` |
+| `notification.vespaSchema.fieldPathProductId` | Fieldpath for storing the product id in the notification documents. | `string` | `product_id` |
+| `notification.vespaSchema.fieldPathSavedSearchId` | Fieldpath for storing saved search id in the notification documents. | `string` | `saved_search_id` |
+| `notification.vespaSchema.fieldPathTimestamp` | Fieldpath for storing timestamps in the notification documents. | `string` | `timestamp` |
+| `productDocumentType` | The name of the document type that can trigger notifications, e.g. `product`. | `string` | `product` |
+| `savedSearchDocumentType` | The name of the document type storing saved searches, e.g. `saved_search`. | `string` | `saved_search` |
+| `predicateFieldName` | The name of the field in `savedSearchDocumentType` storing the predicate query. | `string` | `filters` |
+| `savedSearchNumHits` | Maximum number of saved searches that can match per product update. Matches beyond this limit are silently dropped. Higher values increase work per update. | `int` | `100` |
+| `regularAttributes[].predicateName` | The name of a regular (string) attribute to be used in the saved search predicate field. | `string` | |
+| `regularAttributes[].fieldPath` | The field in the `productDocumentType` to be matched with this attribute. This field should be of type `string`. | `string` | |
+| `regularAttributes[].required` | Whether documents are required to specify this attribute. | `bool` | `false` |
+| `rangeAttributes[].predicateName` | The name of a numerical range attribute to be used in the saved search predicate field. | `string` | |
+| `rangeAttributes[].fieldPath` | The field in the `productDocumentType` to be matched with this attribute. This field should be of a numeric type, e.g. `int`. | `string` | |
+| `rangeAttributes[].required` | Whether documents are required to specify this attribute. | `bool` | `false` |
+
+## See Also
+
+- [Predicate Fields](/en/schemas/predicate-fields)
+- [Document Processors](/en/applications/document-processors)
+- [Configuring Components](/en/applications/configuring-components)
+- [Secret Store](/en/security/secret-store)
+- [Multi-Currency Pricing](/en/modules/e-commerce/multi-currency-filtering)
+- [Using Features Together](/en/modules/e-commerce/using-features-together)
\ No newline at end of file
diff --git a/mintlify-docs/en/modules/e-commerce/using-features-together.mdx b/mintlify-docs/en/modules/e-commerce/using-features-together.mdx
new file mode 100644
index 0000000000..b8fe35b7b0
--- /dev/null
+++ b/mintlify-docs/en/modules/e-commerce/using-features-together.mdx
@@ -0,0 +1,234 @@
+---
+title: Using Features Together
+---
+
+
+**Important:** Some features described on this page (notably [Saved Search Notifications](/en/modules/e-commerce/saved-search)) are experimental and under active development. APIs and configuration may change in backwards-incompatible ways - see the individual feature pages for details.
+
+
+The e-commerce features are designed as standalone components that can be composed together. This page covers the configuration needed when features interact at feed time or query time. Read the individual feature pages before this one:
+
+- [Multi-Currency Pricing](/en/modules/e-commerce/multi-currency-filtering)
+- [Saved Search Notifications](/en/modules/e-commerce/saved-search)
+
+The schemas, services configuration, and other examples on this page are illustrative. Adapt field names, document types, and configuration values to match your application.
+
+## Saved Search with Multi-Currency
+
+When [saved search notifications](/en/modules/e-commerce/saved-search) and [multi-currency pricing](/en/modules/e-commerce/multi-currency-filtering) are used together, the saved search document processor generates per-currency price features at feed time. This enables saved searches with price filters to match products regardless of the seller's currency.
+
+### How It Works
+
+Without multi-currency, a saved search like `price in [100..200]` matches against the product's single `price` field. With multi-currency enabled, the document processor:
+
+1. Reads the product's `per_market_price` entries and `seller_currency`
+2. Converts each price to every known currency using the forex rates
+3. Scales the converted prices to integers by multiplying with `priceScaleFactor` (e.g., a factor of 100 preserves two decimal places). Predicate ranges require integer values, so this step is needed to retain precision.
+4. Generates predicate range features named `{featurePrefix}_{currency}_{market}` (e.g., `price_NOK_DEFAULT`, `price_EUR_NO`)
+5. Feeds these features alongside the regular attributes into the predicate query
+
+A saved search filtering on NOK prices can then use `price_NOK_DEFAULT in [1000..1500]` as its predicate expression, and it will match products originally priced in any currency.
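
The feature generation steps above can be sketched as follows. This Python example is a simplified illustration, not the document processor's code: the function name `price_features`, the dictionary layout of the rates, and rounding to the nearest integer are assumptions.

```python
def price_features(seller_currency, per_market_price, rates, prefix="price", scale=100):
    """Generate {prefix}_{currency}_{market} -> scaled integer price.

    `rates` maps (from_currency, to_currency) to an exchange rate.
    Rounding to the nearest integer is an assumption of this sketch.
    """
    features = {}
    for entry in per_market_price:
        for (src, dst), rate in rates.items():
            if src != seller_currency:
                continue
            name = f"{prefix}_{dst}_{entry['market']}"
            features[name] = round(entry["price"] * rate * scale)
    return features


features = price_features(
    "EUR",
    [{"market": "DEFAULT", "price": 200.0}, {"market": "NO", "price": 180.0}],
    {("EUR", "EUR"): 1.0, ("EUR", "SEK"): 11.76},
)
```

With two market entries and two target currencies, the sketch yields four features, for example `price_EUR_DEFAULT` and `price_SEK_NO`, each holding a scaled integer that a predicate range can match.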
+
+### Predicate Upper Bound
+
+The predicate field's `upper-bound` in the saved search schema must be large enough to cover the highest possible scaled converted price. Because prices are both currency-converted and scaled by `priceScaleFactor`, the resulting values can be significantly larger than the original prices.
+
+
+ **Warning:** If any single range feature in a predicate query exceeds the `upper-bound`, Vespa rejects the **entire** predicate query for that document, not just the out-of-range feature. This means the product will not match any saved searches at all.
+
+
+For example, with a scale factor of 100 and a EUR-to-SEK rate of \~11.76: a product priced at 200 EUR produces `price_SEK_DEFAULT = 200 × 11.76 × 100 = 235,200`. The `upper-bound` must be at least 235,200 for this to work.
+
+Choose an upper bound that covers your highest-priced products converted to the weakest target currency, multiplied by the scale factor. A generous margin is recommended to accommodate price and exchange rate fluctuations.
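
The sizing rule can be expressed as a small calculation. The Python sketch below is a hedged example: the function name `required_upper_bound` is hypothetical and the 25% default margin is an arbitrary choice for illustration, not a recommendation from the module. It uses `Decimal` to avoid floating-point drift in the product.

```python
from decimal import Decimal
import math


def required_upper_bound(max_price, max_rate, scale_factor=100, margin="1.25"):
    """Smallest integer upper-bound covering the scaled price plus a margin."""
    scaled = Decimal(str(max_price)) * Decimal(str(max_rate)) * scale_factor
    return math.ceil(scaled * Decimal(margin))
```

Calling `required_upper_bound(200, 11.76, margin="1")` gives the bare minimum for a 200 EUR product at a rate of 11.76, while the default margin leaves headroom for price and exchange rate movements.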
+
+### Schema Setup
+
+The following are minimal examples showing the fields required from each feature. Your schemas will likely have additional fields and configuration.
+
+#### Product Schema
+
+The product schema must include fields from both features: the saved search attributes (`price`, `category`, etc.) and the multi-currency fields (`seller_currency`, `per_market_price`).
+
+```js expandable
+schema product {
+ document product {
+
+ # Saved search attributes
+ field price type int {
+ indexing: attribute
+ }
+
+ field category type string {
+ indexing: attribute
+ }
+
+ # Multi-currency fields
+ field seller_currency type string {
+ indexing: summary | attribute
+ }
+
+ struct market_price {
+ field market type string {}
+ field price type double {}
+ }
+
+ field per_market_price type array<market_price> {
+ indexing: summary
+ summary: matched-elements-only
+ struct-field market {
+ indexing: attribute
+ }
+ struct-field price {
+ indexing: attribute
+ }
+ }
+ }
+}
+```
+
+#### Saved Search Schema
+
+Set the `upper-bound` high enough to cover scaled converted prices. See [Predicate Upper Bound](#predicate-upper-bound) above.
+
+```js expandable
+schema saved_search {
+ document saved_search {
+ field filters type predicate {
+ indexing: attribute
+ index {
+ arity: 2
+ lower-bound: 3
+ upper-bound: 250000
+
+ dense-posting-list-threshold: 0.25
+ }
+ }
+ }
+}
+```
+
+### Services Configuration
+
+Both features are configured in the same container cluster. The example below shows a combined setup - the key addition is the `multicurrency` block inside the `ecommerce-schema-wiring` config of the document processor:
+
+```xml expandable
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ WEBHOOK
+
+ http://my-webhook-endpoint:8000/notification
+
+
+
+
+ seller_currency
+ per_market_price
+ market
+ price
+
+
+ product
+ saved_search
+ filters
+
+
+
+ category
+ category
+ false
+
+
+
+
+ price
+ price
+ false
+
+
+
+
+
+ true
+ price
+ 100
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+### Multi-currency Configuration Reference
+
+These parameters are part of the `ecommerce-schema-wiring` config and only apply when the saved search document processor is used together with multi-currency:
+
+| Parameter | Description | Type | Default |
+| :--- | :--- | :--- | :--- |
+| `multicurrency.enabled` | Enable generation of per-currency price features at feed time. | `bool` | `false` |
+| `multicurrency.featurePrefix` | Prefix for the generated predicate range features. A feature is named `{prefix}_{currency}_{market}`. | `string` | `price` |
+| `multicurrency.priceScaleFactor` | Integer multiplier applied to converted prices before feeding as predicate range features. Predicate ranges require integer values, so this preserves decimal precision. A factor of 100 preserves two decimal places. | `int` | `100` |
+
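+To make the naming and scaling concrete, here is a small sketch of how one converted price could map to a feature name and an integer value. The helper name, the casing of the currency and market codes, and the rounding mode are assumptions made for illustration; the document processor's actual rounding behavior is not described here.
+
+```python
+def price_feature(prefix, currency, market, converted_price, scale_factor=100):
+    """Return (feature_name, integer_value) for one currency/market pair."""
+    name = f"{prefix}_{currency}_{market}"
+    # Rounding to nearest is an assumption made for this illustration.
+    value = round(converted_price * scale_factor)
+    return name, value
+
+print(price_feature("price", "EUR", "DE", 19.99))  # ('price_EUR_DE', 1999)
+```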
+
+ **Note:** The `productFields` config block is shared between both features. When combining, use the same field names in the searcher and document processor configurations.
+
+
+### Feeding Workflow
+
+When both features are active, the feeding workflow is:
+
+1. Feed the forex document (`id:forex:forex::forex`) and wait for the `CachedForexRateService` to reach `READY` state
+2. Feed saved search documents to the content cluster
+3. Feed products to the `notification-route` - the document processor will generate multi-currency predicate features and match against saved searches
+
+
+ **Warning:** The forex rates must be loaded before feeding products to the notification route. If the `ForexRateService` is not ready, the document processor cannot generate currency-converted price features.
+
+
+## See Also
+
+- [Saved Search Notifications](/en/modules/e-commerce/saved-search)
+- [Multi-Currency Pricing](/en/modules/e-commerce/multi-currency-filtering)
+- [Predicate Fields](/en/schemas/predicate-fields)
+- [Configuring Components](/en/applications/configuring-components)
diff --git a/mintlify-docs/en/operations/access-logging.mdx b/mintlify-docs/en/operations/access-logging.mdx
new file mode 100644
index 0000000000..123ef3b5b1
--- /dev/null
+++ b/mintlify-docs/en/operations/access-logging.mdx
@@ -0,0 +1,265 @@
+---
+title: "Access Logging"
+---
+
+The Vespa access log format allows the logs to be processed by any of the many tools that handle JSON-based log files. With the ability to add custom key/value pairs to the log from any Searcher, you can easily track the decisions made by container components for given requests.
+
+## Vespa Access Log Format
+
+In the Vespa access log, each log event is logged as a JSON object on a single line. The log format defines a list of fields that can be logged with every request. In addition to these fields, [custom key/value pairs](#logging-key-value-pairs-to-the-json-access-log-from-searchers) can be logged via Searcher code. Pre-defined fields:
+
+| Name | Type | Description | Always present |
+| --- | --- | --- | --- |
+| ip | string | The IP address the request came from | yes |
+| time | number | UNIX timestamp with millisecond decimal precision (e.g. 1477828938.123) when request is received | yes |
+| duration | number | The duration of the request in seconds with millisecond decimal precision (e.g. 0.123) | yes |
+| responsesize | number | The size of the response in bytes | yes |
+| code | number | The HTTP status code returned | yes |
+| method | string | The HTTP method used (e.g. 'GET') | yes |
+| uri | string | The request URI from path and beyond (e.g. '/search?query=test') | yes |
+| version | string | The HTTP version (e.g. 'HTTP/1.1') | yes |
+| agent | string | The user agent specified in the request | yes |
+| host | string | The host header provided in the request | yes |
+| scheme | string | The scheme of the request | yes |
+| port | number | The IP port number of the interface on which the request was received | yes |
+| remoteaddr | string | The IP address of the [remote client](#logging-remote-address-port) if specified in HTTP header | no |
+| remoteport | string | The port used from the [remote client](#logging-remote-address-port) if specified in HTTP header | no |
+| peeraddr | string | Address of immediate client making request if different from *remoteaddr* | no |
+| peerport | string | Port used by immediate client making request if different from *remoteport* | no |
+| user-principal | string | The name of the authenticated user (java.security.Principal.getName()) if principal is set | no |
+| ssl-principal | string | The name of the x500 principal if client is authenticated through SSL/TLS | no |
+| search | object | Object holding search specific fields | no |
+| search.totalhits | number | The total number of hits for the query | no |
+| search.hits | number | The hits returned in this specific response | no |
+| search.coverage | object | Object holding [query coverage information](/en/performance/graceful-degradation) similar to that returned in result set. | no |
+| connection | string | Reference to the connection log entry. See [Connection log](#connection-log) | no |
+| attributes | object | Object holding [custom key/value pairs](#logging-key-value-pairs-to-the-json-access-log-from-searchers) logged in searcher. | no |
+
+
+**Note:**
+
+IP addresses can be both IPv4 addresses in standard dotted format (e.g. 127.0.0.1) or IPv6 addresses in standard form with leading zeros omitted (e.g. 2222:1111:123:1234:0:0:0:4321).
+
+
+An example log line will look like this (here, pretty-printed):
+
+```json
+{
+ "ip": "152.200.54.243",
+ "time": 920880005.023,
+ "duration": 0.122,
+ "responsesize": 9875,
+ "code": 200,
+ "method": "GET",
+ "uri": "/search?query=test&param=value",
+ "version": "HTTP/1.1",
+ "agent": "Mozilla/4.05 [en] (Win95; I)",
+ "host": "localhost",
+ "search": {
+ "totalhits": 1234,
+ "hits": 0,
+ "coverage": {
+ "coverage": 98,
+ "documents": 100,
+ "degraded": {
+ "non-ideal-state": true
+ }
+ }
+ }
+}
+```
+
+
+**Note:**
+
+The log format is extendable by design such that the order of the fields can be changed and new fields can be added between minor versions. Make sure any programmatic log handling is using a proper JSON processor.
+
+
+Example: Decompress, pretty-print, with human-readable timestamps:
+
+```bash
+$ jq '. + {iso8601date:(.time | todateiso8601)}' \
+ <(unzstd -c /opt/vespa/logs/vespa/access/JsonAccessLog.default.20210601010000.zst)
+```
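+
+For programmatic handling, the same transformation can be sketched in Python with a proper JSON parser. This assumes the log file has already been decompressed and is read line by line; the function name is hypothetical, and the `iso8601date` key simply mirrors the jq example.
+
+```python
+import json
+from datetime import datetime, timezone
+
+def add_iso_date(line):
+    """Parse one access log line and add an ISO 8601 rendering of `time`."""
+    entry = json.loads(line)
+    stamp = datetime.fromtimestamp(entry["time"], tz=timezone.utc)
+    entry["iso8601date"] = stamp.isoformat()
+    return entry
+
+sample = '{"time": 920880005.023, "uri": "/search?query=test", "code": 200}'
+print(add_iso_date(sample)["iso8601date"])
+```
+
+Because fields are looked up by name, the code keeps working if the field order changes or new fields are added.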
+
+### Logging Remote Address/Port
+
+In some cases when a request passes through an intermediate service, this service may add HTTP headers indicating the IP address and port of the real origin client. These values are logged as *remoteaddr* and *remoteport* respectively. Vespa will log the contents in any of the following HTTP request headers as *remoteaddr*: *X-Forwarded-For*, *Y-RA*, *YahooRemoteIP* or *Client-IP*. If more than one of these headers are present, the precedence is in the order listed here, i.e. *X-Forwarded-For* takes precedence over *Y-RA*. The contents of the *Y-RP* HTTP request header will be logged as *remoteport*.
+
+If the remote address or port differs from that of the client initiating the HTTP request, the address and port of the immediate client making the request are logged as *peeraddr* and *peerport* respectively.
+
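+The header precedence described above can be sketched as follows. This is an illustration of the documented order only, not Vespa's implementation; the function name is hypothetical, and header values are passed through without parsing.
+
+```python
+# Headers that may carry the origin address, in the documented precedence order.
+REMOTE_ADDR_HEADERS = ["X-Forwarded-For", "Y-RA", "YahooRemoteIP", "Client-IP"]
+
+def resolve_remote(headers):
+    """Return (remoteaddr, remoteport) from a dict of request headers."""
+    # HTTP header names are case-insensitive, so normalize before lookup.
+    lowered = {name.lower(): value for name, value in headers.items()}
+    addr = next((lowered[h.lower()] for h in REMOTE_ADDR_HEADERS
+                 if h.lower() in lowered), None)
+    return addr, lowered.get("y-rp")
+
+print(resolve_remote({"Client-IP": "10.0.0.7", "Y-RA": "192.0.2.10", "Y-RP": "4711"}))
+# ('192.0.2.10', '4711')
+```
+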
+## Configuring Logging
+
+For details on the access logging configuration, see the [accesslog](/en/reference/applications/services/container#accesslog) element of the container in *services.xml*.
+
+Key configuration options include:
+
+- **fileNamePattern**: Pattern for log file names with time variable support
+- **rotationInterval**: Time-based rotation schedule (minutes since midnight)
+- **rotationSize**: Size-based rotation threshold in bytes (0 = disabled)
+- **rotationScheme**: Either 'sequence' or 'date'
+- **compressionFormat**: GZIP or ZSTD compression for rotated files
+
+### Logging Request Content
+
+Vespa supports logging of request content for specific URI paths. This is useful for inspecting query content of search POST requests or document operations of Document v1 POST/PUT requests. The request content is logged as a base64-encoded string in the JSON access log.
+
+To configure request content logging, use the [request-content](/en/reference/applications/services/container#request-content) element in the accesslog configuration in *services.xml*.
+
+Here is an example of how the request content appears in the JSON access log:
+
+```json
+{
+ ...
+ "method": "POST",
+ "uri": "/search",
+ ...,
+ "request-content": {
+ "type": "application/json; charset=utf-8",
+ "length": 12345,
+ "body": ""
+ }
+}
+```
+
+### File name pattern
+
+The file name pattern is expanded using the time when the file is created. The following parts in the file name are expanded:
+
+| Field | Format | Meaning | Example |
+| --- | --- | --- | --- |
+| `%Y` | YYYY | Year | 2003 |
+| `%m` | MM | Month, numeric | 08 |
+| `%x` | MMM | Month, textual | Aug |
+| `%d` | dd | Date | 25 |
+| `%H` | HH | Hour | 14 |
+| `%M` | mm | Minute | 30 |
+| `%S` | ss | Seconds | 35 |
+| `%s` | SSS | Milliseconds | 123 |
+| `%Z` | Z | Time zone | \-0400 |
+| `%T` | Long | System.currentTimeMillis | 1349333576093 |
+| `%%` | % | Escape percentage | % |
+
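+To show how these fields combine, the following sketch expands a pattern for a given point in time. It is an approximation written for this guide, not the container's own code: the function name is hypothetical, the textual month relies on Python's `%b` in the default locale, and leaving unknown fields untouched is an assumption.
+
+```python
+import re
+from datetime import datetime, timezone
+
+def expand_pattern(pattern, when):
+    """Expand the documented % fields in `pattern` for the aware datetime `when`."""
+    fields = {
+        "Y": when.strftime("%Y"), "m": when.strftime("%m"),
+        "x": when.strftime("%b"), "d": when.strftime("%d"),
+        "H": when.strftime("%H"), "M": when.strftime("%M"),
+        "S": when.strftime("%S"),
+        "s": f"{when.microsecond // 1000:03d}",
+        "Z": when.strftime("%z"),
+        "T": str(int(when.timestamp() * 1000)),
+        "%": "%",
+    }
+    # Unknown fields are left untouched; that choice is an assumption.
+    return re.sub(r"%(.)", lambda m: fields.get(m.group(1), m.group(0)), pattern)
+
+created = datetime(2003, 8, 25, 14, 30, 35, 123000, tzinfo=timezone.utc)
+print(expand_pattern("JsonAccessLog.default.%Y%m%d%H%M%S", created))
+# JsonAccessLog.default.20030825143035
+```
+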
+## Log rotation
+
+Apache httpd style log *rotation* can be configured by setting the *rotationScheme*. There are two alternatives for the rotationScheme: sequence and date. Rotation can be triggered by time intervals using *rotationInterval* and/or by file size using *rotationSize*.
+
+### Sequence rotation scheme
+
+The *fileNamePattern* is used for the active log file name (which in this case will often be a constant string). At rotation, this file is given the name fileNamePattern.N, where N is 1 + the largest integer found by extracting the integers from all files ending with `.<integer>` in the same directory.
+
+```xml
+
+```
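+
+The numbering rule can be sketched as below: collect the integer suffix of every file whose name ends in a dot followed by an integer, then add one to the largest. The function name is hypothetical, and starting at 1 when no rotated files exist is an assumption of this sketch.
+
+```python
+import re
+
+def next_rotated_name(active_name, existing_files):
+    """Return the name the active log file gets at rotation."""
+    suffix = re.compile(r"\.(\d+)$")
+    numbers = [int(m.group(1)) for name in existing_files
+               if (m := suffix.search(name))]
+    # With no rotated files yet, starting at 1 is an assumption of this sketch.
+    return f"{active_name}.{max(numbers, default=0) + 1}"
+
+files = ["JsonAccessLog.default", "JsonAccessLog.default.1", "JsonAccessLog.default.7"]
+print(next_rotated_name("JsonAccessLog.default", files))  # JsonAccessLog.default.8
+```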
+
+### Date rotation scheme
+
+The *fileNamePattern* is used for the active log file name here too, but the log files are not renamed at rotation. Instead, you must specify a time-dependent fileNamePattern so that each time a new log file is created, the name is unique. In addition, a symlink is created pointing to the active log file. The name of the symlink is specified using *symlinkName*.
+
+```xml
+
+```
+
+### Rotation interval
+
+The time of rotation is controlled by setting *rotationInterval*:
+
+```xml
+
+```
+
+The rotationInterval is a list of numbers specifying when to do rotation. Each element represents the number of minutes since midnight. Ending the list with '...' means continuing the [arithmetic progression](https://en.wikipedia.org/wiki/Arithmetic_progression) defined by the two last numbers for the rest of the day. E.g. "0 100 240 480 ..." is expanded to "0 100 240 480 720 960 1200"
+
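+The expansion of a trailing '...' can be sketched as follows. The function name and the error handling are assumptions made for this example; only the expansion rule itself comes from the description above.
+
+```python
+def expand_rotation_interval(spec):
+    """Expand a rotationInterval string into minutes since midnight."""
+    tokens = spec.split()
+    if not tokens or tokens[-1] != "...":
+        return [int(t) for t in tokens]
+    minutes = [int(t) for t in tokens[:-1]]
+    if len(minutes) < 2:
+        raise ValueError("'...' needs two preceding numbers")
+    step = minutes[-1] - minutes[-2]
+    if step <= 0:
+        raise ValueError("progression must be increasing")
+    nxt = minutes[-1] + step
+    while nxt < 24 * 60:  # stop before the next midnight
+        minutes.append(nxt)
+        nxt += step
+    return minutes
+
+print(expand_rotation_interval("0 100 240 480 ..."))
+# [0, 100, 240, 480, 720, 960, 1200]
+```
+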
+### Log retention
+
+Access logs are rotated, but not deleted by Vespa processes. It is up to the application owner to take care of archiving of access logs.
+
+## Logging Key/Value pairs to the JSON Access Log from Searchers
+
+To add a key/value pair to the access log from a searcher, use
+
+```bash
+query/result.getContext(true).logValue(key,value)
+```
+
+Such key/value pairs may be added from any thread participating in handling the query without incurring synchronization overhead.
+
+If the same key is logged multiple times, the values written will be included in the log as an array of strings rather than a single string value.
+
+The key/value pairs are added to the *attributes* object in the log.
+
+An example log line will then look something like this:
+
+```json
+{"ip":"152.200.54.243","time":920880005.023,"duration":0.122,"responsesize":9875,"code":200,"method":"GET","uri":"/search?query=test&param=value","version":"HTTP/1.1","agent":"Mozilla/4.05 [en] (Win95; I)","host":"localhost","search":{"totalhits":1234,"hits":0},"attributes":{"singlevalue":"value1","multivalue":["value2","value3"]}}
+```
+
+A pretty print version of the same example:
+
+```json
+{
+ "ip": "152.200.54.243",
+ "time": 920880005.023,
+ "duration": 0.122,
+ "responsesize": 9875,
+ "code": 200,
+ "method": "GET",
+ "uri": "/search?query=test&param=value",
+ "version": "HTTP/1.1",
+ "agent": "Mozilla/4.05 [en] (Win95; I)",
+ "host": "localhost",
+ "search": {
+ "totalhits": 1234,
+ "hits": 0
+ },
+ "attributes": {
+ "singlevalue": "value1",
+ "multivalue": [
+ "value2",
+ "value3"
+ ]
+ }
+}
+```
+
+## Connection log
+
+In addition to the access log, one entry per connection is written to the connection log. This entry is written on connection close. Available fields:
+
+| Name | Type | Description | Always present |
+| :--- | :--- | :--- | :--- |
+| id | string | Unique ID of the connection, referenced from access log. | yes |
+| timestamp | number | Timestamp (ISO8601 format) when the connection was opened | yes |
+| duration | number | The duration of the connection in seconds with millisecond decimal precision (e.g. 0.123) | yes |
+| peerAddress | string | IP address used by immediate client making request | yes |
+| peerPort | number | Port used by immediate client making request | yes |
+| localAddress | string | The local IP address the request was received on | yes |
+| localPort | number | The local port the request was received on | yes |
+| remoteAddress | string | Original client ip, if proxy protocol enabled | no |
+| remotePort | number | Original client port, if proxy protocol enabled | no |
+| httpBytesReceived | number | Number of HTTP bytes received over the connection | no |
+| httpBytesSent | number | Number of HTTP bytes sent over the connection | no |
+| requests | number | Number of requests sent by the client | no |
+| responses | number | Number of responses sent to the client | no |
+| ssl | object | Detailed information on ssl connection | no |
+
+## SSL information
+
+| Name | Type | Description | Always present |
+| :--- | :--- | :--- | :--- |
+| clientSubject | string | Client certificate subject | no |
+| clientNotBefore | string | Client certificate valid from | no |
+| clientNotAfter | string | Client certificate valid to | no |
+| sessionId | string | SSL session id | no |
+| protocol | string | SSL protocol | no |
+| cipherSuite | string | Name of session cipher suite | no |
+| sniServerName | string | SNI server name | no |
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/archive/archive-guide-aws.mdx b/mintlify-docs/en/operations/archive/archive-guide-aws.mdx
new file mode 100644
index 0000000000..11ab754b4f
--- /dev/null
+++ b/mintlify-docs/en/operations/archive/archive-guide-aws.mdx
@@ -0,0 +1,118 @@
+---
+title: "AWS Archive guide"
+---
+
+
+**Note:**
+
+This guide is for tenants using Vespa Cloud. If your tenant uses **Enclave**, the archive buckets are in your own cloud account and you can access them directly — see the [Enclave archive guide](/en/operations/enclave/archive) instead.
+
+
+Vespa Cloud exports log data, heap dumps, and Java Flight Recorder sessions to buckets in AWS S3. This guide explains how to access this data. Access to the data must happen through an AWS account controlled by the tenant. Data traffic to access this data is charged to this AWS account.
+
+These resources are needed to get started:
+
+- An AWS account
+- An IAM Role in that AWS account
+- The [AWS command line client](https://aws.amazon.com/cli/)
+
+Access is configured through the Vespa Cloud Console in the tenant account screen. Choose the "archive" tab, then expand the **AWS** section.
+
+## Register IAM Role
+
+
+
+
+
+Click **Configure access to your cloud archive** to open the configuration dialog.
+
+## Configure access
+
+
+
+
+
+In **Step 1**, enter the ARN of the IAM Role that should have access to the S3 buckets (e.g. `arn:aws:iam::123456789012:role/my-iam-role`) and click **Save**. Vespa Cloud will then grant access to that role on the S3 buckets.
+
+In **Step 2**, a policy is generated that must be attached to your IAM Role. Copy the policy and attach it to the IAM Role in your AWS account. AWS requires permissions to be registered in both Vespa Cloud's AWS account (step 1) and the tenant's AWS account (step 2). Make your own equivalent policy should you have other requirements. For more information, see the [AWS documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html).
+
+## Access files using AWS CLI
+
+
+
+
+
+Once permissions have been granted, the IAM Role can access the contents of the archive buckets. Any AWS S3 client will work, but the AWS command line client is an easy tool to use. The archive page lists all buckets where data is stored, typically one bucket per zone in which the tenant has applications.
+
+The `--request-payer=requester` parameter is mandatory to make sure network traffic is charged to the correct AWS account.
+
+Refer to [access-log-lambda](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/access-log-lambda/README.md) for how to install and use `aws cli`, which can be used to download logs as in the illustration, or e.g. list objects:
+
+```bash
+$ aws s3 ls --profile=archive --request-payer=requester \
+ s3://vespa-cloud-data-prod.aws-us-east-1c-9eb633/vespa-team/
+
+ PRE album-rec-searcher/
+ PRE cord-19/
+ PRE vespacloud-docsearch/
+```
+
+In the example above, the S3 bucket name is *vespa-cloud-data-prod.aws-us-east-1c-9eb633* and the tenant name is *vespa-team* (for that particular prod zone). Archiving is per tenant, and a log file is normally stored with a key like:
+
+```bash
+/vespa-team/vespacloud-docsearch/default/h2946a/logs/access/JsonAccessLog.default.20210629100001.zst
+```
+
+The URI to this object is hence:
+
+```bash
+s3://vespa-cloud-data-prod.aws-us-east-1c-9eb633/vespa-team/vespacloud-docsearch/default/h2946a/logs/access/JsonAccessLog.default.20210629100001.zst
+```
+
+Objects are exported once generated - access log files are compressed and exported at least once per hour.
+
+If you are having problems accessing the files, please run
+
+```bash
+aws sts get-caller-identity
+```
+
+to verify that you are correctly assuming the role which has been granted access.
+
+## Lambda processing
+
+When processing logs using a lambda function, write a minimal function to list objects, to sort out access / keys / roles:
+
+```js expandable
+const aws = require("aws-sdk");
+const s3 = new aws.S3({ apiVersion: "2006-03-01" });
+
+const findRelevantKeys = ({ Bucket, Prefix }) => {
+ console.log(`Finding relevant keys in bucket ${Bucket}`);
+ return s3
+ .listObjectsV2({ Bucket: Bucket, Prefix: Prefix, RequestPayer: "requester" })
+ .promise()
+ .then((res) =>
+ res.Contents.map((content) => ({ Bucket, Key: content.Key }))
+ )
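+ .catch((err) => {
+ // Log and rethrow so the handler reports a failure instead of a 200.
+ console.error(err);
+ throw err;
+ });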
+};
+
+exports.handler = async (event, context) => {
+ const options = { Bucket: "vespa-cloud-data-prod.aws-us-east-1c-9eb633", Prefix: "MY-TENANT-NAME/" };
+ return findRelevantKeys(options)
+ .then((res) => {
+ console.log("response: ", res);
+ return { statusCode: 200 };
+ })
+ .catch((err) => ({ statusCode: 500, message: err }));
+};
+```
+
+
+**Note:**
+
+Always set `RequestPayer: "requester"` to access the objects - transfer cost is assigned to the requester.
+
+
+Once the above lists the log files from S3, review [access-log-lambda](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/access-log-lambda/README.md) for how to write a function to decompress and handle the log data.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/archive/archive-guide-gcp.mdx b/mintlify-docs/en/operations/archive/archive-guide-gcp.mdx
new file mode 100644
index 0000000000..253966a5d1
--- /dev/null
+++ b/mintlify-docs/en/operations/archive/archive-guide-gcp.mdx
@@ -0,0 +1,81 @@
+---
+title: "GCP Archive guide"
+sidebarTitle: "Archive Guide GCP"
+---
+
+
+**Note:**
+
+This guide is for tenants using Vespa Cloud. If your tenant uses **Enclave**, the archive buckets are in your own cloud account and you can access them directly — see the [Enclave archive guide](/en/operations/enclave/archive) instead.
+
+
+Vespa Cloud exports log data, heap dumps, and Java Flight Recorder sessions to buckets in Google Cloud Storage. This guide explains how to access this data. Access to the data is through a GCP project controlled by the tenant. Data traffic to access this data is charged to this GCP project.
+
+These resources are needed to get started:
+
+- A GCP project
+- A Google user account
+- The [gcloud command line interface](https://cloud.google.com/sdk/docs/install)
+
+Access is configured through the Vespa Cloud Console in the tenant account screen. Choose the "archive" tab, then expand the **GCP** section.
+
+## Register IAM principal
+
+
+
+
+
+Click **Configure access to your cloud archive** to open the configuration dialog.
+
+## Grant access to Vespa Cloud resources
+
+
+
+
+
+Enter a [principal](https://cloud.google.com/iam/docs/overview) with a supported prefix and click **Save**. Vespa Cloud will then grant access to that principal on the Cloud Storage buckets.
+
+Supported principal prefixes:
+
+- `user:` — Google Account, e.g. `user:email@example.com`
+- `serviceAccount:` — Service account, e.g. `serviceAccount:my-app@project.iam.gserviceaccount.com`
+- `group:` — Google group, e.g. `group:admins@example.com`
+- `domain:` — Google Workspace or Cloud Identity domain, e.g. `domain:example.com`
+
+## Access files using Gcloud CLI
+
+
+
+
+
+Once permissions have been granted, the GCP member can access the contents of the archive buckets. Any Cloud Storage client will work, but the `gsutil` command line client is an easy tool to use. The archive page will list all buckets where data is stored, typically one bucket per zone the tenant has applications.
+
+The `-u user-project` parameter is mandatory to make sure network traffic is charged to the correct GCP project.
+
+```bash
+$ gsutil -u my-project ls \
+ gs://vespa-cloud-data-prod.gcp-us-central1-f-73770f/vespa-team/
+ gs://vespa-cloud-data-prod.gcp-us-central1-f-73770f/vespa-team/album-rec-searcher/
+ gs://vespa-cloud-data-prod.gcp-us-central1-f-73770f/vespa-team/cord-19/
+ gs://vespa-cloud-data-prod.gcp-us-central1-f-73770f/vespa-team/vespacloud-docsearch/
+```
+
+In the example above, the bucket name is *vespa-cloud-data-prod.gcp-us-central1-f-73770f* and the tenant name is *vespa-team* (for that particular prod zone). Archiving is per tenant, and a log file is normally stored with a key like:
+
+```bash
+/vespa-team/vespacloud-docsearch/default/h7644a/logs/access/JsonAccessLog.20221011080000.zst
+```
+
+The URI to this object is hence:
+
+```bash
+gs://vespa-cloud-data-prod.gcp-us-central1-f-73770f/vespa-team/vespacloud-docsearch/default/h7644a/logs/access/JsonAccessLog.20221011080000.zst
+```
+
+Objects are exported once generated - access log files are compressed and exported at least once per hour.
+
+
+**Note:**
+
+Always set a user project to access the objects - transfer cost is assigned to the requester.
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/archive/archive-guide.mdx b/mintlify-docs/en/operations/archive/archive-guide.mdx
new file mode 100644
index 0000000000..bd28724008
--- /dev/null
+++ b/mintlify-docs/en/operations/archive/archive-guide.mdx
@@ -0,0 +1,87 @@
+---
+title: "Archive guide"
+---
+
+Vespa Cloud exports log data, heap dumps, and Java Flight Recorder sessions to storage buckets. The bucket system used will depend on which cloud provider is backing the zone your application is running in. AWS S3 will be used in the AWS zones, and Cloud Storage will be used in the GCP zones.
+
+How to access and use the storage buckets is found in the documentation for the respective cloud providers:
+
+
+
+
+
+
+## Examples
+
+These examples use GCP as the source; replace with AWS commands as needed. Here, *resonant-triode-123456* is the Google project ID that owns the target bucket *my\_access\_logs* for data copy (and will get the data download cost, if any).
+
+Use the CLUSTERS view in the Vespa Cloud Console to find hostname(s) for the nodes to export logs from - then list contents:
+
+```bash
+$ gsutil -u resonant-triode-123456 ls \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/mytenant/myapp/
+
+$ gsutil -u resonant-triode-123456 ls \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/mytenant/myapp/myinstance
+
+$ gsutil -u resonant-triode-123456 ls \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/mytenant/myapp/myinstance/h404a/logs/access
+```
+
+Copy files for a host to the *my\_access\_logs* bucket:
+
+```bash
+$ gsutil -u resonant-triode-123456 \
+ -m -o "GSUtil:parallel_process_count=1" \
+ cp -r \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/vespa-team/vespacloud-docsearch/default/h404a \
+ gs://my_access_logs/vespa-files
+```
+
+`rsync` can be used to reduce the number of files copied, using `-x` to exclude paths:
+
+```bash
+$ gsutil -u resonant-triode-123456 \
+ -m -o "GSUtil:parallel_process_count=1" \
+ rsync -r \
+ -x '.*/connection/.*|.*/vespa/.*|.*/zookeeper/.*' \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/vespa-team/vespacloud-docsearch/default/h404a \
+ gs://my_access_logs/vespa-files
+```
+
+Refer to [cloud-functions](https://github.com/vespa-engine/sample-apps/tree/master/examples/google-cloud/cloud-functions) and [lambda](https://github.com/vespa-engine/sample-apps/tree/master/examples/aws/lambda) for how to write and deploy simple functions to process files in Google Cloud and AWS.
+
+For local processing, copy files for a host to local file system (or use `rsync`):
+
+```bash
+$ gsutil -u resonant-triode-123456 \
+ -m -o "GSUtil:parallel_process_count=1" \
+ cp -r \
+ gs://vespa-cloud-data-prod-gcp-us-central1-f-73770f/vespa-team/vespacloud-docsearch/default/h404a \
+ .
+```
+
+Use [zstd](https://facebook.github.io/zstd/) to decompress files:
+
+```bash
+$ zstd -d *
+```
+
+Example: Filter out healthchecks using [jq](https://stedolan.github.io/jq/):
+
+```bash
+$ cat JsonAccessLog.20230117* | jq '. |
+ select (.uri != "/status.html") |
+ select (.uri != "/state/v1/metrics") |
+ select (.uri != "/state/v1/health")'
+```
+
+Add a human-readable date field per access log entry:
+
+```bash
+$ cat JsonAccessLog.20230117* | jq '. |
+ select (.uri != "/status.html") |
+ select (.uri != "/state/v1/metrics") |
+ select (.uri != "/state/v1/health") |
+ . +{iso8601date:(.time|todateiso8601)}'
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/automated-deployments.mdx b/mintlify-docs/en/operations/automated-deployments.mdx
new file mode 100644
index 0000000000..30a2e65970
--- /dev/null
+++ b/mintlify-docs/en/operations/automated-deployments.mdx
@@ -0,0 +1,346 @@
+---
+title: "Automated Deployments"
+---
+
+
+
+
+
+See [pipeline graph](#pipeline-graph) for details on the visual elements.
+
+Vespa Cloud provides:
+
+- A [CD test framework](#cd-tests) for safe deployments to production zones.
+- [Multi-zone deployments](#deployment-orchestration) with orchestration and test steps.
+
+This guide goes through details of an orchestrated deployment. Read / try [production deployment](/en/reference/applications/deployment) first to have a baseline understanding. The [developer guide](/en/applications/developer-guide) is useful for writing tests. Use [example GitHub Actions](#automating-with-github-actions) for automation.
+
+## CD tests
+
+Before deployment in production zones, [system tests](#system-tests) and [staging tests](#staging-tests) are run. Tests are run in a dedicated and [downsized](/en/operations/environments) environment. These tests are optional, see details in the sections below. Status and logs of ongoing tests can be found in the *Deployment* view in the [Vespa Cloud Console](https://console.vespa-cloud.com/):
+
+
+
+
+
+These tests are also run during [Vespa Cloud upgrades](#vespa-cloud-upgrades).
+
+Find deployable example applications in [CI-CD](https://github.com/vespa-cloud/examples/tree/main/CI-CD).
+
+### System tests
+
+When a system test is run, the application is deployed in the [test environment](/en/operations/environments#test). The system test suite is then run against the endpoints of the test deployment. The test deployment is empty when the test execution begins. The application package and Vespa platform version are the same as those to be deployed to production.
+
+A test suite includes at least one [system test](/en/applications/testing#system-tests). An application can be deployed to a production zone without system tests - this step will then only test that the application starts successfully. See [production deployment](/en/reference/applications/deployment) for an example without tests.
+
+If the production zones span multiple cloud providers (e.g., both AWS and GCP), system tests are run separately for each cloud provider, using test nodes from that provider. This ensures the application starts and works correctly on each provider's infrastructure before production deployment.
+
+Read more about [system tests](/en/applications/testing#system-tests).
+
+### Staging tests
+
+A staging test verifies the transition of a deployment of a new application package - i.e., from application package `Appold` to `Appnew`. A test suite includes at least one [staging setup](/en/applications/testing#staging-tests), and [staging test](/en/applications/testing#staging-tests).
+
+
+
+All production zone deployments are polled for the current versions. As there can be multiple versions already being deployed (i.e. multiple `Appold`), there can be a series of staging test runs.
+
+
+The application at revision `Appold` is deployed in the [staging environment](/en/operations/environments#staging).
+
+
+The staging setup test code is run, typically making the cluster reasonably similar to a production cluster.
+
+
+The test deployment is then upgraded to application revision `Appnew`.
+
+
+Finally, the staging test code is run, to verify the deployment works as expected after the upgrade.
+
+
+
+An application can be deployed to a production zone without staging tests - this step will then only test that the application starts successfully before and after the change. See [production deployment](/en/reference/applications/deployment) for an example without tests.
+
+Like system tests, staging tests are run separately for each cloud provider when the production zones span multiple providers.
+
+Read more about [staging tests](/en/applications/testing#staging-tests).
+
+### Disabling tests
+
+To deploy without testing, remove the test files from the application package. Tests are always run, regardless of *deployment.xml*.
+
+To temporarily deploy without testing, run `deploy` and hit the "Abort" button (see illustration above, hover over the test step in the Console) - this skips the test step and makes the orchestration progress to the next step.
+
+### Running tests only
+
+To run a system test without deploying to any nodes afterwards, add a new test instance. In *deployment.xml*, add the instance without `dev` or `prod` elements, like:
+
+```xml
+
+
+
+
+ ...
+
+```
+
+Note that this leaves an empty instance in the console: the deployment is for testing only, so no resources are deployed after the test.
+
+Make sure to run `vespa prod deploy` to invoke the pipeline for testing, and use a separate application for this test.
+
+## Deployment orchestration
+
+The *deployment orchestration* is flexible. One can configure dependencies between deployments to production zones, production verification tests, and configured delays; by ordering these in parallel and serial blocks of steps:
+
+
+
+
+
+### Pipeline graph
+
+The deployment pipeline is visualized as a graph in the [Vespa Cloud Console](https://console.vespa-cloud.com/). Each node represents a step in the pipeline, and edges show dependencies between steps. Hover over any node to see details and available actions.
+
+#### Node shapes
+
+| Shape | Step type | Description |
+| --- | --- | --- |
+| | Instance | The application instance. Hover to see target versions, cancel/deploy/pin controls, and block windows. |
+| | Test | System test, staging test, or production test. Hover to see run status, versions, and abort/restart actions. |
+| | Production deployment | A deployment to a production zone. Hover to see run status, versions, and abort/restart/defer actions. |
+| | Delay | A configured delay between steps. |
+
+#### Visual indicators
+
+| Indicator | Meaning | Description |
+| --- | --- | --- |
+| | Completed | The step has completed successfully on the current version. The color corresponds to the deployed version. |
+| | Running | A deployment or test is currently in progress. Shown as an animated gradient between the source and target version colors. |
+| | Failed | The last run of this step failed. |
+| | Unknown / initial | No version has been deployed to this step yet. |
+| | Pending change | A newer version is queued and waiting to be deployed to this step. |
+| | Paused / deferred | Deployments to this step are temporarily postponed. |
+| | Application blocked | Application changes are blocked by a [block window](#block-windows). Shown as vertical bars. |
+| | Platform blocked | Platform upgrades are blocked by a [block window](#block-windows). Shown as horizontal bars. |
+
+Each version deployed through the pipeline is assigned a distinct color. This makes it easy to see at a glance which zones are on the same version and where a rollout is in progress. A thumbtack icon on a node indicates that the version is [pinned](#pinning-versions).
+
+On a higher level, instances can also depend on each other in the same way. This makes it easy to configure a deployment process which gradually rolls out changes to increasingly larger subsets of production nodes, as confidence grows with successful production verification tests. Refer to [deployment.xml](/en/reference/applications/deployment) for details.
+
+Deployments run sequentially by default, but can be configured to [run in parallel](/en/reference/applications/deployment). Inside each zone, Vespa Cloud orchestrates the deployment, such that the change is applied without disruption to read or write traffic against the application. A production deployment in a zone is complete when the new configuration is active on all nodes.
+
+Most changes are instant, making this a quick process. If node restarts are needed, e.g., during platform upgrades, these will happen automatically and safely as part of the deployment. When this is necessary, deployments will take longer to complete.
+
+System and staging tests, if present, must always be successfully run before the application package is deployed to production zones.
+
+### Version progression
+
+The deployment pipeline deploys one revision at a time through the production zones. When a revision is being deployed, it must complete deployment to *all* declared production zones before the next revision begins its production rollout. System and staging tests for newer revisions may run in parallel, but production deployment is serialized.
+
+For example, if build 90 is being deployed to the second of two production zones, build 91 will not start deploying to the first zone until build 90 has completed in all zones — even if build 91 has already passed system and staging tests.
+
+#### Superseding a version
+
+To override the currently deploying revision and force a newer build through the pipeline, hover over the instance node in the pipeline graph and use the *TARGET VERSIONS* controls. Select the desired build number from the revision dropdown and click **deploy**. This updates the instance's deployment target. Any running production job for the old revision will be aborted, and the pipeline will start deploying the new revision from the first production zone.
+
+
+
+
+
+To cancel the currently deploying revision without selecting a new one, click **cancel**. This lets the pipeline pick the next revision automatically.
+
+#### Pinning versions
+
+Pinning locks the pipeline to a specific platform version or application revision, preventing automatic upgrades. This is useful for forcing a downgrade, holding a known-good revision during an incident, or preventing the system from picking up a new platform version.
+
+To pin a version, hover over the instance node in the pipeline graph. Under *TARGET VERSIONS*, select the desired version from the dropdown and click **pin**. A reason is required — enter a description and click **submit pin**. Platform and revision can be pinned independently.
+
+
+
+
+
+While pinned, no newer platform versions or revisions will be deployed for the pinned dimension. The dropdown and deploy button are disabled to prevent accidental changes. To unpin, hover over the instance node and click **unpin**, which allows newer versions to move through the pipeline again.
+
+For example, to roll back to a previous revision:
+
+
+
+Select the older build number from the revision dropdown.
+
+
+Click **pin** and provide a reason (e.g., "rollback due to regression in build 91").
+
+
+The pipeline will deploy the pinned build to all production zones.
+
+
+Once the issue is resolved, click **unpin** to resume normal deployments.
+
+
+
+#### Cooldown after failures
+
+When a production deployment fails repeatedly, an exponential cooldown is applied before the job is automatically retried. The cooldown period grows with the time between the first failure and the last completed run. This prevents the system from continuously retrying a failing deployment.
+
+The cooldown applies only when the target versions match those of the failing runs. If the target changes (e.g., a new revision is set as the deployment target), the cooldown resets and the new revision can be deployed immediately.
+
+To manually re-trigger a failed deployment and bypass the cooldown, hover over the failed zone node in the pipeline graph and click **restart**.
+
+
+
+
+
+#### Pausing deployments to a zone
+
+To temporarily hold off deployments to a specific production zone, hover over the zone node in the pipeline graph and click **defer**. This postpones deployments for 72 hours. Click **enable** to resume scheduling before the deferral period expires.
+
+### Source code repository integration
+
+Each new *submission* is assigned an increasing build number, which can be used to track the roll-out of the new package to the instances and their zones. With the submission, add a source code repository reference for easy integration - this makes it easy to track changes:
+
+
+
+
+
+Add the source diff link to the pull request - see example [GitHub Action](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/.github/workflows/deploy-vespa-documentation-search.yaml):
+
+```bash
+$ vespa prod deploy \
+ --source-url "$(git config --get remote.origin.url | sed 's+git@\(.*\):\(.*\)\.git+https://\1/\2+')/commit/$(git rev-parse HEAD)"
+```
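+
+As a rough illustration of what the `sed` expression above does, the sketch below applies it to a sample SSH-style remote. The function name `to_https_url` and the remote `git@github.com:myorg/myrepo.git` are made up for this example:
+
+```bash
+# Hypothetical helper: convert an SSH-style git remote to an https URL,
+# using the same sed expression as in the deploy command above.
+to_https_url() {
+  printf '%s\n' "$1" | sed 's+git@\(.*\):\(.*\)\.git+https://\1/\2+'
+}
+
+to_https_url "git@github.com:myorg/myrepo.git"
+# prints https://github.com/myorg/myrepo
+```
+
+A remote that does not match the `git@host:path.git` pattern, such as one that is already an https URL, passes through unchanged.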
+
+### Block-windows
+
+Use block-windows to block deployments during certain windows throughout the week, e.g., avoid rolling out changes during peak hours / during vacations. Hover over the instance (here "default") to find block status - see [block-change](/en/reference/applications/deployment#block-change):
+
+
+
+
+
+### Validation overrides
+
+Some configuration changes are potentially destructive or change the application behavior - examples are removing fields and changing linguistic processing. These changes are disallowed by default, and the deploy command will fail. To override and force a deploy, use a [validation override](/en/reference/applications/validation-overrides):
+
+```xml
+
+ tensor-type-change
+
+```
+
+### Production tests
+
+Production tests are optional and configured in [deployment.xml](/en/reference/applications/deployment). A production test is placed after a deployment zone in the pipeline and acts as a gate: if it fails, the rollout stops and subsequent zones will not receive the new version. This is useful in multi-zone deployments where the first zone serves as a canary. Production tests run against the endpoints of the preceding production region in the pipeline.
+
+
+
+
+
+### Deploying Components
+
+Vespa is [backwards compatible](/en/learn/releases#versions) within major versions, and major versions rarely change. This means that [Components](/en/applications/components) compiled against an older version of Vespa APIs can always be run on the same major version. However, if the application package is compiled against a newer API version, and then deployed to an older runtime version in production, it might fail. See [vespa:compileVersion](/en/reference/applications/deployment#production-deployment-with-components) for how to solve this.
+
+## Automating with GitHub Actions
+
+Auto-deploy production applications using GitHub Actions - examples:
+
+- [deploy-vector-search.yaml](https://github.com/vespa-cloud/vector-search/blob/main/.github/workflows/deploy-vector-search.yaml) deploys an application to a production environment - a good example to start from!
+- [deploy.yaml](https://github.com/vespa-cloud/examples/blob/main/.github/workflows/deploy.yaml) deploys an application with basic HTTP tests.
+- [deploy-vespa-documentation-search.yaml](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/.github/workflows/deploy-vespa-documentation-search.yaml) deploys an application with Java-tests.
+
+The automation scripts use an API-KEY to deploy:
+
+```bash
+$ vespa auth api-key
+```
+
+This creates a key, or outputs:
+
+```bash
+Error: refusing to overwrite /Users/me/.vespa/mytenant.api-key.pem
+Hint: Use -f to overwrite it
+
+This is your public key:
+-----BEGIN PUBLIC KEY-----
+ABCDEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEB2UFsh8ZjoWNtkrDhyuMyaZQe1ze
+qLB9qquTKUDQTuM2LOr2dawUs02nfSc3UTfC08Lgr/dvnTnHpc0/fY+3Aw==
+-----END PUBLIC KEY-----
+
+Its fingerprint is:
+12:34:56:78:65:30:77:90:30:ab:83:ee:a9:67:68:2c
+
+To use this key in Vespa Cloud click 'Add custom key' at
+https://console.vespa-cloud.com/tenant/mytenant/account/keys
+and paste the entire public key including the BEGIN and END lines.
+```
+
+This means that an existing key is not overwritten, so the command is safe to run. Make sure to add the deploy-key to the tenant using the Vespa Cloud Console.
+
+After the deploy-key is added, everything is ready for deployment.
+
+You can upload or create new Application keys in the console, and store them as a secret in the repository like the GitHub Actions examples above.
+
+Some services like [Travis CI](https://travis-ci.com) do not accept multi-line values for Environment Variables in Settings. A workaround is to use the output of
+
+```bash
+$ openssl base64 -A -a < mykey.pem && echo
+```
+
+in a variable, say `VESPA_MYAPP_API_KEY`, in Travis Settings. `VESPA_MYAPP_API_KEY` is exported in the Travis environment, example output:
+
+```bash
+Setting environment variables from repository settings
+$ export VESPA_MYAPP_API_KEY=[secure]
+```
+
+Then, before deploying to Vespa Cloud, regenerate the key value:
+
+```bash
+$ MY_API_KEY=`echo $VESPA_MYAPP_API_KEY | openssl base64 -A -a -d`
+```
+
+and use `${MY_API_KEY}` in the deploy command.
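+
+To check that the encoding survives the round trip, one can compare the decoded value with the original file. This is only a sketch: the sample file content is a placeholder rather than a real key, and `key_file`, `encoded` and `decoded` are names chosen for this example:
+
+```bash
+# Write a multi-line placeholder (not a real key) to a temporary file.
+key_file="$(mktemp)"
+printf -- '-----BEGIN PRIVATE KEY-----\nline1\nline2\n-----END PRIVATE KEY-----\n' > "$key_file"
+
+# Encode to a single line, as when storing the value in Travis Settings.
+encoded="$(openssl base64 -A -a < "$key_file")"
+
+# Decode again, as done before deploying.
+decoded="$(echo "$encoded" | openssl base64 -A -a -d)"
+
+# The decoded value should be identical to the file content.
+[ "$decoded" = "$(cat "$key_file")" ] && echo "round trip OK"
+rm -f "$key_file"
+```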
+
+## Vespa Cloud upgrades
+
+Vespa upgrades follow the same pattern as for new application revisions in [CD tests](#cd-tests), and can be tracked via the version number in the Vespa Cloud Console.
+
+System tests are run the same way as for deploying a new application package.
+
+A staging test verifies the upgrade from application package `Appold` to `Appnew`, and from Vespa platform version `Vold` to `Vnew`. The staging test then consists of the following steps:
+
+
+
+All production zone deployments are polled for the current `Vold` / `Appold` versions. As there can be multiple versions already being deployed (i.e. multiple `Vold` / `Appold`), there can be a series of staging test runs.
+
+
+The application at revision `Appold` is deployed on platform version `Vold`, to a zone in the [staging environment](/en/operations/environments#staging).
+
+
+The *staging setup* test code is run, typically making the cluster reasonably similar to a production cluster.
+
+
+The test deployment is then upgraded to application revision `Appnew` and platform version `Vnew`.
+
+
+Finally, the *staging test* test code is run, to verify the deployment works as expected after the upgrade.
+
+
+
+Note that one or both of the application revision and platform may be upgraded during the staging test, depending on what upgrade scenario the test is run to verify.
+
+### Concurrent platform and revision changes
+
+When both a platform upgrade and a revision change are pending, the `rollout` setting in [deployment.xml](/en/reference/applications/deployment) controls how they interact in production zones:
+
+- `simultaneous` (default): Revision changes deploy independently of platform upgrades. A revision can catch up to and pass an ongoing platform upgrade.
+- `leading`: When a revision catches up to a platform upgrade, the two changes fuse and roll out together.
+- `separate`: The revision waits for the platform upgrade to complete, unless the upgrade is failing.
+
+With the default `simultaneous` strategy, a new revision will not be held back by an ongoing platform upgrade.
+
+## Next steps
+
+- Read more about [feature switches and bucket tests](/en/applications/testing#feature-switches-and-bucket-tests).
+- A challenge with continuous deployment can be integration testing across multiple services: Another service depends on this Vespa application for its own integration testing. Use a separate [application instance](/en/reference/applications/deployment#instance) for such integration testing.
+- Set up a deployment badge - available from the console's deployment view - example: 
+- Set up a [global query endpoint](/en/reference/applications/deployment#endpoints-global).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/autoscaling.mdx b/mintlify-docs/en/operations/autoscaling.mdx
new file mode 100644
index 0000000000..e06533e2ac
--- /dev/null
+++ b/mintlify-docs/en/operations/autoscaling.mdx
@@ -0,0 +1,96 @@
+---
+title: "Autoscaling"
+---
+
+Autoscaling lets you adjust the hardware resources allocated to application clusters automatically depending on actual usage. It will attempt to keep utilization of all allocated resources close to ideal, and will automatically reconfigure to the cheapest option allowed by the ranges when necessary.
+
+You can turn it on by specifying *ranges* in square brackets for the [nodes](/en/reference/applications/services/services#nodes) and/or [node resource](/en/reference/applications/services/services#resources) values in *services.xml*. Vespa Cloud will monitor the resource utilization of your clusters and automatically choose the cheapest resource allocation within ranges that produces close to optimal utilization.
+
+You can see the status and recent actions of the autoscaler in the *Resources* view under a deployment in the console.
+
+Autoscaling does not consider latency differences between configurations. If your application has certain configurations that produce good throughput but too high latency, you should not include these configurations in your autoscaling ranges.
+
+Adjusting the allocation of a cluster may happen quickly for stateless container clusters, and much more slowly for content clusters with a lot of data. Autoscaling will adjust each cluster on the timescale it typically takes to rescale it (including any data redistribution).
+
+The ideal utilization takes into account that a node may be down or failing, that another region may be down causing doubling of traffic, and that we need headroom for maintenance operations and handling requests with low latency. It acts on what it has observed on your system in the recent past. If you need much more capacity in the near future than you do currently, you may want to set the lower limit to take this into account. Upper limits should be set to the maximum size that makes business sense.
+
+## When to use autoscaling
+
+Autoscaling is useful in a number of scenarios. Some typical ones are:
+
+- You have a new application which you can't benchmark with realistic data and usage, making you unsure what resources to allocate: Set wide ranges for all resource parameters and let the system choose a configuration. Once you gain experience you can consider tightening the configuration space.
+- You have load that varies quickly during the day, or that may suddenly increase quickly due to some event, and want container cluster resources to quickly adjust to the load: Set a range for the number of nodes and/or vcpu on containers.
+- You expect your data volume to grow over time, but you don't want to allocate resources prematurely, nor constantly worry about whether it is time to increase: Configure ranges for content nodes and/or node resources such that the size of the system grows with the data.
+
+## Resource tradeoffs
+
+Some other considerations when deciding resources:
+
+- Making changes to resources/nodes is easy and safe, and one of Vespa Cloud's strengths. We advise you to make controlled changes and observe the effect on latencies, data migration and cost. Everything is automated; just deploy a new application package. The experience gained is useful later, during load peaks and changes in capacity requirements.
+- Node resources cannot be chosen freely in all zones; CPU and memory often come in increments of a factor of 2. Try to make sure that the resource configuration is a good fit.
+- CPU is the most expensive component, so optimize for it in most applications.
+- Having few nodes means more overcapacity as Vespa requires that the system will handle one node being down (or one group, in content clusters having multiple groups). 4-5 nodes minimum is a good rule of thumb. Whether 4-5 or 9-10 nodes of half the size is better depends on quicker upgrade cycles vs. smoother resource auto-scale curves. Latencies can be better or worse, depending on static vs dynamic query cost.
+- Changing a node resource may mean allocating a new node, so it may be faster to scale container nodes by changing the number of nodes.
+- As a consequence, during resource shortage (say almost full disk), add nodes and keep the rest unchanged.
+- It is easiest to reason over capacity when changing one thing at a time.
+
+It is often safe to follow the *suggested resources* advice when shown in the console. Feel free to contact us if you have questions.
+
+## Mixed load
+
+A Vespa application must handle a combination of reads and writes, from multiple sources. User load often resembles a sine-like curve. Machine-generated load, like a batch job, can be spiky and abrupt.
+
+In the default Vespa configuration, all kinds of load use *one* default container cluster. Example: An application where daily batch jobs update the corpus at a high rate:
+
+
+
+
+
+Autoscaling scales *up* much quicker than *down*, as the probability of a new spike is higher after one has been observed. In this example, see the rapid cluster growth for the daily load spike - followed by a slow decay.
+
+The best solution for this case is to slow down the batch job, as it is of short duration. It is not always doable to slow down jobs - in these cases, setting up multiple [container clusters](/en/applications/containers) can be a smart thing - optimize each cluster for its load characteristics. This could be a combination of clusters using autoscale and clusters with a fixed size. Autoscaling often works best for the user-generated load, whereas the machine-generated load could either be tuned or routed to a different cluster in the same Vespa application.
+
+## Examples
+
+Below is an example of node resources with autoscaling that would work well for a container cluster:
+
+```xml
+
+
+
+```
+
+The above would in general **not be recommended for a content cluster.** Changing cpu, memory or disk usually leads to allocating new nodes to fulfil the new node resources spec. When that happens, documents are redistributed between the old and new nodes, and this might impact service quality to some degree. For a content cluster it is usually better to stick to the same node resources and add or remove nodes, e.g., something like:
+
+```xml
+
+
+
+```
+
+If a content cluster is configured to autoscale based on node resources (not just number of nodes or groups) this will work fine, but note that using paged attributes or HNSW indexes will make it more expensive and time-consuming to redistribute documents when scaling up or down. When doing the initial feeding of a cluster it will be best to avoid auto-scaling, as changing the topology will require redistribution of documents, possibly several times.
+
+When using groups in a content cluster it's possible to scale the number of groups instead of the number of nodes, e.g. with a fixed group size and a range for the number of groups:
+
+```xml
+
+
+
+```
+
+Note that at the moment it is not possible to autoscale GPU resources per node, but you can scale the number of nodes with GPUs:
+
+```xml
+
+
+
+
+
+```
+
+## Related reading
+
+
+
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/cloning.mdx b/mintlify-docs/en/operations/cloning.mdx
new file mode 100644
index 0000000000..ab0cfd9a9d
--- /dev/null
+++ b/mintlify-docs/en/operations/cloning.mdx
@@ -0,0 +1,280 @@
+---
+title: "Cloning applications and data"
+---
+
+This is a guide on how to replicate a Vespa application in different environments, with or without data. Use cases for cloning include:
+
+- Get a copy of the application and (some) data on a laptop to work offline, or attach a debugger.
+- Deploy local experiments to the `dev` environment to easily cooperate and share.
+- Set up a copy of the application and (some) data to test a new major version of Vespa.
+- Replicate a bug report in a non-production environment.
+- Set up a copy of the application and (some) data in a `prod` environment to experiment with a CI/CD pipeline, without touching the current production serving.
+- Onboard a new team member by setting up a copy of the application and test data in a `dev` environment.
+- Clone to a `dev` environment for load testing.
+
+This guide uses *applications*. One can also use *instances*, but that will not work across Vespa major versions on Vespa Cloud - refer to [tenant, applications, instances](/en/learn/tenant-apps-instances) for details.
+
+Vespa Cloud has different environments `dev` and `prod`, with different characteristics - [details](/en/operations/environments). Clone to `dev` for short-lived experiments/development/benchmarking, use `prod` for serving applications with a [CI/CD pipeline](/en/operations/automated-deployments).
+
+As some steps are similar, it is a good idea to read through all, as details are added only the first time for brevity. Examples are based on the [album-recommendation](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation) sample application.
+
+
+**Note:**
+
+When done, it is easy to tear down resources in Vespa Cloud. E.g., *https://console.vespa-cloud.com/tenant/mytenant/application/myapp/prod/deploy* or *https://console.vespa-cloud.com/tenant/mytenant/application/myapp/dev/instance/default* to find a delete-link. Instances in `dev` environments are auto-expired ([details](/en/operations/environments)), so application cloning is a safe way to work with Vespa. Find more information in [deleting applications](/en/operations/deleting-applications).
+
+
+## Cloning - self-hosted to Vespa Cloud
+
+**Source setup:**
+
+```bash
+$ docker run --detach --name vespa1 --hostname vespa-container1 \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa
+
+$ vespa deploy -t http://localhost:19071
+```
+
+**Target setup:**
+
+[Create a tenant](/en/basics/deploy-an-application) in the Vespa Cloud console, in this guide using "mytenant".
+
+**Export source application package:**
+
+This gets the application package and copies it out of the container to the local file system:
+
+```bash
+$ vespa fetch -t http://localhost:19071 && \
+ unzip application.zip -x application.zip
+```
+
+**Deploy target application package**
+
+The procedure differs slightly depending on whether you deploy to the dev or prod [environment](/en/operations/environments). The `mvn -U clean package` step is only needed for applications with custom code. Configure application name and create data plane credentials:
+
+```bash
+$ vespa config set target cloud && \
+ vespa config set application mytenant.myapp
+
+$ vespa auth login
+
+$ vespa auth cert -f
+
+$ mvn -U clean package
+```
+
+
+**Note:**
+
+When deploying to a new app, one will often want to generate a new data plane cert/key pair. To do this, use `vespa auth cert -f`. If reusing a cert/key pair, drop `-f` and make sure to put the pair in *.vespa*, to avoid errors like `Error: open /Users/me/.vespa/mytenant.myapp.default/data-plane-public-cert.pem: no such file or directory` in the subsequent deploy step.
+
+
+Then deploy the application. Depending on the use case, deploy to `dev` or `prod`:
+
+- `dev`:
+
+ ```bash
+ $ vespa deploy
+ ```
+ Expect something like:
+
+ ```bash
+ Uploading application package ... done
+
+ Success: Triggered deployment of . with run ID 1
+
+ Use vespa status for deployment status, or follow this deployment at
+ https://console.vespa-cloud.com/tenant/mytenant/application/myapp/dev/instance/default/job/dev-aws-us-east-1c/run/1
+ ```
+- Deployments to the `prod` environment require [deployment.xml](/en/reference/applications/deployment) - select which [zone](/en/operations/zones) to deploy to:
+
+ ```bash
+ $ cat <<EOF > deployment.xml
+
+
+ aws-us-east-1c
+
+
+ EOF
+ ```
+
+ `prod` deployments also require `resources` specifications in [services.xml](/en/reference/applications/services/services) - use [vespa-documentation-search](https://github.com/vespa-cloud/vespa-documentation-search/blob/main/src/main/application/services.xml) as an example and add/replace `nodes` elements for `container` and `content` clusters. If in doubt, just add a small config to start with, and change later:
+
+ ```bash
+
+
+
+ ```
+ Deploy the application package:
+
+ ```bash
+ $ vespa prod deploy
+ ```
+
+ Expect something like:
+
+ `Hint: See` [`production deployment`](/en/reference/applications/deployment)
+ `Success: Deployed` .
+ `See https://console.vespa-cloud.com/tenant/mytenant/application/myapp/prod/deployment for deployment progress`
+
+ A proper deployment to a `prod` zone should have automated tests. Read more in [automated deployments](/en/operations/automated-deployments).
+
+**Data copy**
+
+Export documents from the local instance and feed to the Vespa Cloud instance:
+
+```bash
+$ vespa visit -t http://localhost:8080 | vespa feed -
+```
+
+Add more parameters as needed to `vespa feed` for other endpoints.
+
+**Get access log from source:**
+
+```bash
+$ docker exec vespa1 cat /opt/vespa/logs/vespa/access/JsonAccessLog.default
+```
+
+## Cloning - Vespa Cloud to self-hosted
+
+**Download application from Vespa Cloud**
+
+Validate the endpoint, and fetch the application package:
+
+```bash
+$ vespa config get application
+application = mytenant.myapp.default
+
+$ vespa fetch
+Downloading application package... done
+Success: Application package written to application.zip
+```
+
+The application package can also be downloaded from the Vespa Cloud Console:
+
+- dev: Navigate to *https://console.vespa-cloud.com/tenant/mytenant/application/myapp/dev/instance/default*, click *Application* to download:
+ 
+- prod: Navigate to *https://console.vespa-cloud.com/tenant/mytenant1/application/myapp/prod/deployment?tab=builds* and select the version of the application to download:
+ 
+
+**Target setup:**
+
+Note the name of the application package .zip-file just downloaded. If changes are needed, unzip it and use `vespa deploy -t http://localhost:19071` to deploy from the current directory:
+
+```bash
+$ docker run --detach --name vespa1 --hostname vespa-container1 \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa
+
+$ vespa config set target local
+
+$ vespa deploy -t http://localhost:19071 mytenant.myapp.default.dev.aws-us-east-1c.zip
+```
+
+**Data copy**
+
+Set config target cloud for `vespa visit` and pipe the jsonl output into `vespa feed` to the local instance:
+
+```bash
+$ vespa config set target cloud
+
+$ vespa visit | vespa feed - -t http://localhost:8080
+```
+
+**Data copy - minimal**
+
+For use cases requiring a few documents, visit just a few documents:
+
+```bash
+$ vespa visit --chunk-count 10
+```
+
+**Get access log from source:**
+
+Use the Vespa Cloud Console to get access logs.
+
+## Cloning - Vespa Cloud to Vespa Cloud
+
+This is a combination of the procedures above. Download the application package from dev or prod, make note of the source name, like mytenant.myapp.default. Then use `vespa deploy` or `vespa prod deploy` as above to deploy to dev or prod.
+
+If cloning from `dev` to `prod`, pay attention to changes in *deployment.xml* and *services.xml* as in [cloning to Vespa Cloud](#cloning---self-hosted-to-vespa-cloud).
+
+**Data copy**
+
+Set the feed endpoint name / paths, e.g. mytenant.myapp-new.default:
+
+```bash
+$ vespa config set target cloud
+
+$ vespa visit | vespa feed - -t https://default.myapp-new.mytenant.aws-us-east-1c.dev.z.vespa-app.cloud
+```
+
+**Data copy 5%**
+
+Set the `--selection` argument to `vespa visit` to select a subset of the documents.
+
+## Cloning - self-hosted to self-hosted
+
+Creating a copy from one self-hosted application to another. Self-hosted means running [Vespa](https://vespa.ai/) on a laptop or a [multinode system](/en/operations/self-managed/multinode-systems).
+
+This example sets up a source app and deploys the [application package](/en/basics/applications) - use [album-recommendation](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation) as an example. The application package is then exported from the source and deployed to a new target app. Steps:
+
+**Source setup:**
+
+```bash
+$ vespa config set target local
+
+$ docker run --detach --name vespa1 --hostname vespa-container1 \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa
+
+$ vespa deploy -t http://localhost:19071
+```
+
+**Target setup:**
+
+```bash
+$ docker run --detach --name vespa2 --hostname vespa-container2 \
+ --publish 8081:8080 --publish 19072:19071 \
+ vespaengine/vespa
+```
+
+**Export source application package**
+
+Export files:
+
+```bash
+$ vespa fetch -t http://localhost:19071
+```
+
+**Deploy application package to target**
+
+Before deploying, one can make changes to the application package files as needed. Deploy to target:
+
+```bash
+$ vespa deploy -t http://localhost:19072 application.zip
+```
+
+**Data copy from source to target**
+
+This pipes the source data directly into `vespa feed` - another option is to save the data to files temporarily and feed these individually:
+
+```bash
+$ vespa visit -t http://localhost:8080 | vespa feed - -t http://localhost:8081
+```
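+
+The file-based alternative mentioned above could look roughly like the sketch below. The function name `copy_via_file` and the file name `docs.jsonl` are chosen for illustration; the endpoints are the source and target from this guide:
+
+```bash
+# Hypothetical helper: export all documents from the source to a file,
+# then feed that file to the target.
+# Arguments: source endpoint, target endpoint, temporary file name.
+copy_via_file() {
+  vespa visit -t "$1" > "$3" &&
+    vespa feed "$3" -t "$2"
+}
+
+# Example use, assuming the source and target containers above are running:
+# copy_via_file http://localhost:8080 http://localhost:8081 docs.jsonl
+```
+
+Keeping the intermediate file makes it possible to inspect or filter the documents before feeding them.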
+
+**Data copy 5%**
+
+This is an example of how to use a [selection](/en/reference/writing/document-selector-language) to specify a subset of the documents - here a "random" 5% selection:
+
+```bash
+$ vespa visit -t http://localhost:8080 --selection 'id.hash().abs() % 20 = 0' | \
+ vespa feed - -t http://localhost:8081
+```
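+
+The expression keeps documents whose hashed ID is divisible by 20, which is roughly 1 in 20, or 5%. To sample a different fraction, change the modulus. A small sketch, where the helper name `selection_for_one_in` is made up for this example:
+
+```bash
+# Hypothetical helper: build a selection that keeps roughly 1 in N documents.
+selection_for_one_in() {
+  printf 'id.hash().abs() %% %d = 0\n' "$1"
+}
+
+selection_for_one_in 20    # prints: id.hash().abs() % 20 = 0  (about 5%)
+selection_for_one_in 100   # prints: id.hash().abs() % 100 = 0 (about 1%)
+```
+
+The output can be passed to `vespa visit` with `--selection "$(selection_for_one_in 100)"`.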
+
+**Get access log from source**
+
+Get the current query access log from the source application (there might be more files there):
+
+```bash
+$ docker exec vespa1 cat /opt/vespa/logs/vespa/access/JsonAccessLog.default
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/data-management.mdx b/mintlify-docs/en/operations/data-management.mdx
new file mode 100644
index 0000000000..5fd23b3bdf
--- /dev/null
+++ b/mintlify-docs/en/operations/data-management.mdx
@@ -0,0 +1,233 @@
+---
+title: "Data management and backup"
+---
+
+This guide covers data management operations for Vespa Cloud applications, including automated backups, document export, feed, and bulk updates and removals.
+
+## Automated Backups
+
+Depending on [plan](https://vespa.ai/pricing/), content clusters are automatically backed up when a [`<backup>`](/en/reference/applications/deployment#backup) element is specified in *deployment.xml*. Vespa Cloud manages the backup schedule, storage, and lifecycle with no external tooling required. Backups will run at the configured frequency while also respecting any [block windows](/en/reference/applications/deployment#block-change) defined for the instance.
+
+```xml
+
+
+
+ aws-us-east-1c
+
+
+```
+
+Backups are retained for three backup intervals (e.g. 21 days for a 7-day frequency). The most recent fully completed backup is always retained regardless of age. See [Restore from Backup](#restore) for how to request a restore.
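+
+As a worked example of the retention rule above, the retention period is three times the backup frequency. A minimal bash sketch, where `retention_days` is a hypothetical helper name and the frequency is assumed to be given in whole days:
+
+```bash
+# Hypothetical helper: retention period in days for a backup frequency in days.
+# Backups are retained for three backup intervals.
+retention_days() {
+  local frequency_days="$1"
+  echo $(( 3 * frequency_days ))
+}
+
+retention_days 7   # prints 21
+retention_days 1   # prints 3
+```
+
+This only illustrates the arithmetic; the most recent fully completed backup is kept even when it is older than this period.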
+
+If you prefer to manage backups yourself, documents can be exported manually using `vespa visit` as shown in the [Google Cloud Function example](https://github.com/vespa-engine/sample-apps/tree/master/examples/google-cloud/cloud-functions#backup---experimental).
+
+## Restore from Backup
+
+Restoring from a backup is handled by Vespa Cloud. To initiate a restore, contact [Vespa Support](https://vespa.ai/support/). Response time and priority handling are governed by your [support plan](https://vespa.ai/pricing/).
+
+Restore requires a deployed target cluster with:
+
+- The same number of content nodes as the backup.
+- At least as much disk capacity per node as at the time of the backup.
+
+Note that content redistribution is usually required after restoration. See [backup reference](/en/reference/applications/deployment#backup) for details.
+
+## Export documents
+
+
+**Note:**
+
+The examples below use the [Vespa CLI](/en/clients/vespa-cli). Ensure you have the latest version installed.
+
+
+To export documents, configure the application to export from, then select zone, container cluster and schema - example:
+
+```bash
+$ vespa config set application vespa-team.vespacloud-docsearch.default
+
+$ vespa visit --zone prod.aws-us-east-1c --cluster default --selection doc | head
+```
+
+Some of the parameters above can be omitted when they are unambiguous. Here, the application is set up using a template found in [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) with multiple container clusters. This example [visits](/en/writing/visiting) documents from the `doc` schema.
+
+Use a [fieldset](/en/schemas/documents#fieldsets) to export document IDs only:
+
+```bash
+$ vespa visit --zone prod.aws-us-east-1c --cluster default --selection doc --field-set '[id]' | head
+```
+
+
+**Note:**
+
+Configuring the [`documentid`](/en/reference/schemas/schemas#documentid) field as an attribute in the schema avoids disk access when exporting IDs and hence speeds up the export.
+
+
+As the name implies, fieldsets are useful to select a subset of fields to export. Note that, if disk access is required to fetch a field from the fieldset, selecting fewer fields does not speed up the exporting process as the same amount of data is read from the index. The data transfer out of the Vespa application is smaller with fewer fields.
+
+For copying documents between applications, see [cloning applications and data](/en/operations/cloning).
+
+## Feed
+
+If a document feed is generated with `vespa visit` (above), it is already in [JSON Lines](https://jsonlines.org/) feed-ready format by default:
+
+```bash
+$ vespa visit | vespa feed - -t $ENDPOINT
+```
+
+Find more examples in [cloning applications and data](/en/operations/cloning).
+
+A document export generated using [/document/v1](/en/writing/document-v1-api-guide) is slightly different from the .jsonl output from `vespa visit` (e.g., fields like a continuation token are added). Extract the `document` objects before feeding:
+
+```bash
+$ gunzip -c docs.gz | jq '.documents[]' | \
+ vespa feed - -t $ENDPOINT
+```
+
+## Delete
+
+To remove all documents in a Vespa deployment—or a selection of them—run a *deletion visit*. Use the `DELETE` HTTP method, and fetch only the continuation token from the response:
+
+```bash expandable
+#!/bin/bash
+
+set -x
+
+# The ENDPOINT must be a regional endpoint, do not use '*.g.vespa-app.cloud/'
+ENDPOINT="https://vespacloud-docsearch.vespa-team.aws-us-east-1c.z.vespa-app.cloud"
+NAMESPACE=open
+DOCTYPE=doc
+CLUSTER=documentation
+
+# doc.path =~ "^/old/" -- all documents under the /old/ directory:
+SELECTION='doc.path%3D~%22%5E%2Fold%2F%22'
+
+continuation=""
+
+while
+ token=$( curl -X DELETE -s \
+ --cert data-plane-public-cert.pem \
+ --key data-plane-private-key.pem \
+ "${ENDPOINT}/document/v1/${NAMESPACE}/${DOCTYPE}/docid?selection=${SELECTION}&cluster=${CLUSTER}&${continuation}" \
+ | tee >( jq . > /dev/tty ) | jq -re .continuation )
+do
+ continuation="continuation=${token}"
+done
+```
+
+Each request will return a response after roughly one minute—change this by specifying *timeChunk* (default 60).
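+
+The `SELECTION` value in the script above is a URL-encoded document selection. A minimal sketch of producing such a value in bash follows; `urlencode` is a hypothetical helper written for this illustration, it assumes ASCII input, and any other URL-encoding tool would do:
+
+```bash
+# Hypothetical helper: percent-encode every byte except unreserved characters.
+urlencode() {
+  local LC_ALL=C s="$1" out="" c hex i
+  for (( i = 0; i < ${#s}; i++ )); do
+    c="${s:i:1}"
+    case "$c" in
+      [a-zA-Z0-9.~_-]) out+="$c" ;;   # unreserved characters are kept as-is
+      *) printf -v hex '%%%02X' "'$c" # everything else becomes %XX
+         out+="$hex" ;;
+    esac
+  done
+  printf '%s\n' "$out"
+}
+
+urlencode 'doc.path=~"^/old/"'   # prints doc.path%3D~%22%5E%2Fold%2F%22
+```
+
+The result can be assigned directly, for example `SELECTION=$(urlencode 'doc.path=~"^/old/"')`.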
+
+To purge all documents in a document export (above), generate a feed with `remove` entries for each document ID, like:
+
+```bash
+$ gunzip -c docs.gz | jq '[ .documents[] | {remove: .id} ]' | head
+
+[
+ {
+ "remove": "id:open:doc::open/documentation/schemas.html"
+ },
+ {
+ "remove": "id:open:doc::open/documentation/securing-your-vespa-installation.html"
+ },
+```
+
+Complete example for a single chunk:
+
+```bash
+$ gunzip -c docs.gz | jq '[ .documents[] | {remove: .id} ]' | \
+ vespa feed - -t $ENDPOINT
+```
+
+## Update
+
+To update all documents in a Vespa deployment—or a selection of them—run an *update visit*. Use the `PUT` HTTP method, and specify a partial update in the request body:
+
+```bash expandable
+#!/bin/bash
+
+set -x
+
+# The ENDPOINT must be a regional endpoint, do not use '*.g.vespa-app.cloud/'
+ENDPOINT="https://vespacloud-docsearch.vespa-team.aws-us-east-1c.z.vespa-app.cloud"
+NAMESPACE=open
+DOCTYPE=doc
+CLUSTER=documentation
+
+# doc.inlinks == "some-url" -- the weightedset inlinks has the key "some-url"
+SELECTION='doc.inlinks%3D%3D%22some-url%22'
+
+continuation=""
+
+while
+ token=$( curl -X PUT -s \
+ --cert data-plane-public-cert.pem \
+ --key data-plane-private-key.pem \
+ --data '{ "fields": { "inlinks": { "remove": { "some-url": 0 } } } }' \
+ "${ENDPOINT}/document/v1/${NAMESPACE}/${DOCTYPE}/docid?selection=${SELECTION}&cluster=${CLUSTER}&${continuation}" \
+ | tee >( jq . > /dev/tty ) | jq -re .continuation )
+do
+ continuation="continuation=${token}"
+done
+```
+
+Each request will return a response after roughly one minute—change this by specifying *timeChunk* (default 60).
+
+## Using /document/v1/ API
+
+To get started with a document export, find the *namespace* and *document type* by listing a few IDs. Hit the [/document/v1/](/en/reference/api/document-v1) ENDPOINT. Restrict to one CLUSTER, see [content clusters](/en/reference/applications/services/content):
+
+```bash
+$ curl \
+ --cert data-plane-public-cert.pem \
+ --key data-plane-private-key.pem \
+ "$ENDPOINT/document/v1/?cluster=$CLUSTER"
+```
+
+For ID export only, use a [fieldset](/en/schemas/documents#fieldsets):
+
+```bash
+$ curl \
+ --cert data-plane-public-cert.pem \
+ --key data-plane-private-key.pem \
+ "$ENDPOINT/document/v1/?cluster=$CLUSTER&fieldSet=%5Bid%5D"
+```
+
+From an ID, like *id:open:doc::open/documentation/schemas.html*, extract
+
+- NAMESPACE: open
+- DOCTYPE: doc
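+
+Splitting an ID into these parts can be sketched in bash as follows. `parse_doc_id` is a hypothetical helper name, and the sketch assumes the `id:<namespace>:<document type>:...` layout of the example ID above:
+
+```bash
+# Hypothetical helper: print NAMESPACE and DOCTYPE for a document ID.
+parse_doc_id() {
+  local scheme namespace doctype rest
+  # Split on ':' into scheme, namespace and document type; the rest is ignored
+  IFS=: read -r scheme namespace doctype rest <<< "$1"
+  if [[ "$scheme" != "id" || -z "$namespace" || -z "$doctype" ]]; then
+    echo "not a document ID: $1" >&2
+    return 1
+  fi
+  printf 'NAMESPACE=%s\nDOCTYPE=%s\n' "$namespace" "$doctype"
+}
+
+parse_doc_id 'id:open:doc::open/documentation/schemas.html'
+# prints:
+# NAMESPACE=open
+# DOCTYPE=doc
+```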
+
+Example script:
+
+```bash expandable
+#!/bin/bash
+
+set -x
+
+# The ENDPOINT must be a regional endpoint, do not use '*.g.vespa-app.cloud/'
+ENDPOINT="https://vespacloud-docsearch.vespa-team.aws-us-east-1c.z.vespa-app.cloud"
+NAMESPACE=open
+DOCTYPE=doc
+CLUSTER=documentation
+
+continuation=""
+idx=0
+
+while
+ ((idx+=1))
+ echo "$continuation"
+ printf -v out "%05g" $idx
+ filename=${NAMESPACE}-${DOCTYPE}-${out}.data.gz
+ echo "Fetching data..."
+ token=$( curl -s \
+ --cert data-plane-public-cert.pem \
+ --key data-plane-private-key.pem \
+ "${ENDPOINT}/document/v1/${NAMESPACE}/${DOCTYPE}/docid?wantedDocumentCount=1000&concurrency=4&cluster=${CLUSTER}&${continuation}" \
+ | tee >( gzip > ${filename} ) | jq -re .continuation )
+do
+ continuation="continuation=${token}"
+done
+```
+
+If only a few documents are returned per response, *wantedDocumentCount* (default 1, max 1024) can be specified for a lower bound on the number of documents per response, if that many documents still remain.
+
+Specifying *concurrency* (default 1, max 100) increases throughput, at the cost of resource usage. This also increases the number of documents per response, and *could* lead to excessive memory usage in the HTTP container when many large documents are buffered to be returned in the same response.
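+
+To keep both parameters within the limits stated above, the request URL can be built by a small function. This is a minimal sketch: `visit_url` is a hypothetical helper name, and `https://example.invalid` is a placeholder endpoint:
+
+```bash
+# Hypothetical helper: build a /document/v1/ visit URL with bounded parameters.
+visit_url() {
+  local endpoint="$1" namespace="$2" doctype="$3" cluster="$4"
+  local count="${5:-1}" concurrency="${6:-1}"
+  # Limits as stated above: wantedDocumentCount max 1024, concurrency max 100
+  if (( count < 1 || count > 1024 )); then
+    echo "wantedDocumentCount must be in 1..1024" >&2
+    return 1
+  fi
+  if (( concurrency < 1 || concurrency > 100 )); then
+    echo "concurrency must be in 1..100" >&2
+    return 1
+  fi
+  printf '%s/document/v1/%s/%s/docid?wantedDocumentCount=%d&concurrency=%d&cluster=%s\n' \
+    "$endpoint" "$namespace" "$doctype" "$count" "$concurrency" "$cluster"
+}
+
+visit_url "https://example.invalid" open doc documentation 1000 4
+```
+
+The printed URL has the same shape as the one used in the example script, without the continuation token.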
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/deleting-applications.mdx b/mintlify-docs/en/operations/deleting-applications.mdx
new file mode 100644
index 0000000000..eaa5c13397
--- /dev/null
+++ b/mintlify-docs/en/operations/deleting-applications.mdx
@@ -0,0 +1,57 @@
+---
+title: "Deleting Applications"
+sidebarTitle: "Deleting Applications"
+---
+
+
+**Warning:**
+
+Following these steps will remove production instances or regions and all data within them. Data will be unrecoverable.
+
+
+
+## Deleting an application
+
+To delete an application, use the console:
+
+- Navigate to the *application* view at https://console.vespa-cloud.com/tenant/tenant-name/application, where the trash can icon is found to the far right, as an `ACTION`.
+- Navigate to the *deploy* view at *https://console.vespa-cloud.com/tenant/tenant-name/application/app-name/prod/deploy*.
+
+
+
+
+
+When the application deployments are deleted, delete the application in the [console](https://console.vespa-cloud.com). Remove the CI job that builds and deploys application packages, if any.
+
+## Deleting an instance / region
+
+To remove an instance or a deployment to a region from an application:
+
+
+
+Remove the `region` from `prod`, or the `instance` from `deployment` in [deployment.xml](/en/reference/applications/deployment#instance):
+```xml
+
+
+ aws-us-east-1c
+
+
+
+
+```
+
+
+Add or modify [validation-overrides.xml](/en/reference/applications/validation-overrides), allowing Vespa Cloud to remove production instances:
+
+```xml
+
+ deployment-removal
+
+ global-endpoint-change
+
+```
+
+
+Build and deploy the application package.
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/deployment-patterns.mdx b/mintlify-docs/en/operations/deployment-patterns.mdx
new file mode 100644
index 0000000000..08a6aa4595
--- /dev/null
+++ b/mintlify-docs/en/operations/deployment-patterns.mdx
@@ -0,0 +1,179 @@
+---
+title: "Deployment patterns"
+---
+
+Vespa Cloud's [automated deployments](/en/operations/automated-deployments) let you design CD pipelines for staged rollouts and multi-zone deployments. This guide documents some of these patterns.
+
+## Two regions, two AZs each, sequenced deployment
+
+This is the simplest pattern. Deploy to a set of zones/regions in sequence:
+
+
+
+
+
+```xml
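+<deployment version="1.0">
+    <prod>
+        <region>aws-us-east-1c</region>
+        <region>aws-use1-az4</region>
+        <region>aws-use2-az1</region>
+        <region>aws-use2-az3</region>
+    </prod>
+</deployment>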
+```
+
+## Two regions, two AZs each, parallel deployment
+
+Same as above, but deploying all zones in parallel:
+
+
+
+
+
+```xml
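+<deployment version="1.0">
+    <prod>
+        <parallel>
+            <region>aws-us-east-1c</region>
+            <region>aws-use1-az4</region>
+            <region>aws-use2-az1</region>
+            <region>aws-use2-az3</region>
+        </parallel>
+    </prod>
+</deployment>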
+```
+
+## Two regions, two AZs each, parallel deployment inside region
+
+Deploy to the use1 region first, both AZs in parallel, then the use2 region, both AZs in parallel:
+
+
+
+
+
+```xml
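+<deployment version="1.0">
+    <prod>
+        <parallel>
+            <region>aws-us-east-1c</region>
+            <region>aws-use1-az4</region>
+        </parallel>
+        <parallel>
+            <region>aws-use2-az1</region>
+            <region>aws-use2-az3</region>
+        </parallel>
+    </prod>
+</deployment>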
+```
+
+## Deploy to a test instance first
+
+Deploy to a (downscaled) instance first, and add a delay before propagating to later instances and zones.
+
+
+
+
+
+```xml
+
+
+
+ aws-use2-az1
+
+
+
+
+
+ aws-use2-az1
+
+
+
+```
+
+### Deployment variants
+
+[Deployment variants](/en/operations/deployment-variants) are useful to set up a downscaled instance. In [services.xml](/en/reference/applications/services/services), override settings per instance:
+
+```xml
+
+
+
+
+
+
+```
+
+## Test and prod instances as separate applications
+
+In the section before, we modeled the test and prod app as one pipeline. This lets users halt the pipeline (using the delay) before prod propagation.
+
+In some cases, this is better modeled as different applications:
+
+- The CI pipeline is multistep, with approvals and use of different branches
+
+The example below uses different *applications* to model the flow; these are completely separate application instances. The application owner models the flow in their own tooling and orchestrates deployments to Vespa Cloud as they see fit:
+
+
+
+
+
+
+
+
+The important point is, these are two *separate* deploy commands to Vespa Cloud:
+
+```bash
+$ vespa config set application tenant1.canaryapp
+$ vespa prod deploy app
+```
+
+```xml
+
+
+
+ aws-use2-az1
+
+
+
+```
+
+```bash
+$ vespa config set application tenant1.prodapp
+$ vespa prod deploy app
+```
+
+```xml
+
+
+
+ aws-use2-az1
+
+
+
+```
+
+## services.xml structure
+
+It is possible to split *services.xml* into multiple files using includes:
+
+```xml
+
+
+
+
+```
+
+
+**Note:**
+
+The include feature cannot be used in combination with [deployment variants](#deployment-variants).
+
+
+## Next reads
+
+
+
+
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/deployment-variants.mdx b/mintlify-docs/en/operations/deployment-variants.mdx
new file mode 100644
index 0000000000..23f3c55bf8
--- /dev/null
+++ b/mintlify-docs/en/operations/deployment-variants.mdx
@@ -0,0 +1,150 @@
+---
+title: "Application, instance, region, cloud and environment variants"
+sidebarTitle: "Deployment variants"
+---
+
+Sometimes it is useful to create configuration that varies depending on properties of the deployment, for example to set region-specific endpoints of services used by [Searchers](/en/applications/searchers), use smaller clusters for a "beta" instance, or vary configuration when the same application package is shared across multiple applications.
+
+This is supported both for [services.xml](#services.xml-variants) and [query profiles](#query-profile-variants).
+
+## services.xml variants
+
+[services.xml](/en/reference/applications/services/services) files support different configuration settings for different *tags*, *applications*, *instances*, *environments*, *clouds* and *regions*. To use this, import the *deploy* namespace:
+
+```xml
+
+```
+
+Deploy directives are used to specify with which tags, and in which application, instance, environment, cloud and/or [region](/en/operations/zones) an XML element should be included:
+
+```xml expandable
+
+ 2
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+```
+
+The example above configures different node counts/configurations depending on the deployment target. Deploying the application in the *dev* environment gives:
+
+```xml
+
+ 2
+
+
+
+
+
+```
+
+Whereas in `aws-us-west-2a` it is:
+
+```xml
+
+ 2
+
+
+
+
+
+
+
+```
+
+This can be used to modify any config by deployment target.
+
+The `deploy` directives have a set of override rules:
+
+- A directive specifying more conditions will override one specifying fewer.
+- Directives are inherited in child elements.
+- When multiple XML elements with the same name are specified (e.g. when specifying search or docproc chains), the *id* attribute or the *idref* attribute of the element is used together with the element name when applying directives.
+
+Some overrides are applied by default in some environments, see [environments](/en/operations/environments). Any override made explicitly for an environment will override the defaults for it.
+
+### Specifying multiple targets
+
+More than one tag, application, instance, region or environment can be specified in the attribute, separated by space.
+
+Note that `tags` by default only apply in production instances, and are matched whenever the tags of the element and the tags of the instance intersect. To match tags in other environments, an explicit `deploy:environment` directive for that environment must also match. Use tags if you have a complex instance structure which you want config to vary by.
+
+The namespace can be applied to any element. Example:
+
+```xml
+
+
+
+
+
+ Hello from application config
+ Hello from east colo!
+
+
+
+
+
+```
+
+Above, the `container` element is configured for the 3 environments only (it will not apply to `dev`) - and in region `aws-us-east-1c`, the config is different.
+
+## Query profile variants
+
+[Query profiles](/en/querying/query-profiles) support different configuration settings for different *applications*, *instances*, *environments* and *regions* through [query profile variants](/en/querying/query-profiles#query-profile-variants). This allows you to set different query parameters for a query type depending on these deployment attributes.
+
+To use this feature, create a regular query profile variant with any of `application`, `instance`, `environment` and `region` as dimension names and let your query profile vary by that. For example:
+
+```xml expandable
+
+
+ application, instance, environment, region
+
+ My default value
+
+
+
+ My beta value
+
+
+
+
+ My dev value
+
+
+
+
+ My main instance prod value
+
+
+
+```
+
+You can pick and combine these dimensions in any way you want with other dimensions sent as query parameters, e.g.:
+
+```xml
+device, application, instance, usecase
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/archive.mdx b/mintlify-docs/en/operations/enclave/archive.mdx
new file mode 100644
index 0000000000..70e23c008c
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/archive.mdx
@@ -0,0 +1,45 @@
+---
+title: "Log archive in Vespa Cloud Enclave"
+sidebarTitle: "Log archive"
+---
+
+
+**Warning:**
+
+The structure of log archive buckets may change without notice.
+
+
+After Vespa Cloud Enclave is established in your cloud provider account using Terraform, the module will have created a storage bucket per Vespa Cloud zone you configured in your enclave. These storage buckets are used to archive logs from the machines that run Vespa inside your account.
+
+Since the buckets are in your own cloud account, you do not need to register an IAM role or configure access through the Vespa Cloud Console — you can access the archive buckets directly using your existing cloud credentials.
+
+
+
+
+
+There will be one storage bucket per Vespa Cloud Zone that is configured in the enclave. The name of the bucket will depend on the cloud provider you are setting up the enclave in.
+
+Files are synchronized to the archive bucket when the file is rotated by the logging system, or when a virtual machine is deprovisioned from the application. Consequently, the upload frequency depends on the activity of the Vespa application.
+
+## Directory structure
+
+The directory structure in the bucket is as follows:
+
+```bash
+<tenant>/<application>/<instance>/<host>/logs/<logtype>/<logfile>
+```
+
+- `tenant` is the tenant ID.
+- `application` is the application ID that generated the log.
+- `instance` is the instance ID of the generated log, e.g. `default`.
+- `host` is the name prefix of the host that generated the log, e.g. `e103a`.
+- `logtype` is the type of log in the directory (see below).
+- `logfile` is the specific file of the log.
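+
+The components listed above can be recovered from an object key by splitting on `/`. A minimal bash sketch follows; `parse_log_key` is a hypothetical helper name, the example key is made up for illustration, and the layout may change as noted in the warning at the top of this page:
+
+```bash
+# Hypothetical helper: split an archive object key into its components.
+parse_log_key() {
+  local tenant application instance host logs logtype logfile
+  IFS=/ read -r tenant application instance host logs logtype logfile <<< "$1"
+  # The fifth component is the literal directory name "logs"
+  if [[ "$logs" != "logs" || -z "$logfile" ]]; then
+    echo "unexpected key layout: $1" >&2
+    return 1
+  fi
+  printf 'tenant=%s\napplication=%s\ninstance=%s\nhost=%s\nlogtype=%s\nlogfile=%s\n' \
+    "$tenant" "$application" "$instance" "$host" "$logtype" "$logfile"
+}
+
+# Made-up key, for illustration only
+parse_log_key 'mytenant/myapp/default/e103a/logs/access/JsonAccessLog.20240101'
+```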
+
+## Log types
+
+There are three log types that are synced to this bucket.
+
+- `vespa`: [Vespa logs](/en/reference/operations/log-files)
+- `access`: [Access logs](/en/operations/access-logging)
+- `connection`: [Connection logs](/en/operations/access-logging#connection-log)
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/aws-architecture.mdx b/mintlify-docs/en/operations/enclave/aws-architecture.mdx
new file mode 100644
index 0000000000..c373006425
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/aws-architecture.mdx
@@ -0,0 +1,38 @@
+---
+title: "Vespa Cloud Enclave AWS Architecture"
+sidebarTitle: "AWS architecture"
+---
+
+Each Vespa Cloud Enclave in the tenant AWS account corresponds to a Vespa Cloud [zone](/en/operations/zones). Inside the tenant AWS account, each enclave is contained within a single [VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html).
+
+
+
+
+
+#### EC2 Instances, Load Balancers, and S3 buckets
+
+Configuration Servers inside the Vespa Cloud zone make the decision to create or destroy EC2 instances ("Vespa Hosts" in diagram) based on the Vespa applications that are deployed. The Configuration Servers also set up the Network Load Balancers needed to communicate with the deployed Vespa application.
+
+Each Vespa Host will periodically sync its logs to an S3 bucket ("Log Archive"). This bucket is "local" to the enclave and provisioned by the Terraform module inside the tenant's AWS account.
+
+#### Networking
+
+The enclave VPC is very network restricted. Vespa Hosts do not have public IPv4 addresses and there is no [NAT gateway](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html) available in the VPC. Vespa Hosts have public IPv6 addresses and are able to make outbound connections. Inbound connections are not allowed. Outbound IPv6 connections are used to bootstrap communication with the Configuration Servers, and to report operational metrics back to Vespa Cloud.
+
+When a Vespa Host is booted it will set up an encrypted tunnel back to the Configuration Servers. All communication between Configuration Servers and the Vespa Hosts will be run over this tunnel after it is set up.
+
+### Security
+
+The Vespa Cloud operations team does *not* have any direct access to the resources that are part of the customer account. The only possible access is through the management APIs needed to run Vespa itself. If direct access is needed, e.g. for incident debugging, it can only be granted to the Vespa team by the tenant itself. For further details, see the documentation for the [`ssh` submodule](https://registry.terraform.io/modules/vespa-cloud/enclave/aws/latest/submodules/ssh).
+
+All communication between the enclave and the Vespa Cloud configuration servers is encrypted, authenticated and authorized using [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication#mTLS) with identities embedded in the certificate. mTLS communication is facilitated with the [Athenz](https://www.athenz.io/) service.
+
+All data stored is encrypted at rest using [KMS](https://docs.aws.amazon.com/kms/latest/developerguide/overview.html). All keys are managed by the tenant in the tenant's AWS account.
+
+The resources provisioned in the tenant AWS account are either provisioned by the Terraform module executed by the tenant, or by the orchestration services inside a Vespa Cloud Zone.
+
+Resources are provisioned by the Vespa Cloud configuration servers, using the [`provision_policy`](https://github.com/vespa-cloud/terraform-aws-enclave/blob/main/modules/provision/main.tf) AWS IAM policy document defined in the Terraform module.
+
+The tenant that registered the AWS account is the only tenant that can deploy applications targeting the enclave.
+
+For more general information about security in Vespa Cloud, see the [whitepaper](/en/security/whitepaper).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/aws-getting-started.mdx b/mintlify-docs/en/operations/enclave/aws-getting-started.mdx
new file mode 100644
index 0000000000..fa0a55778a
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/aws-getting-started.mdx
@@ -0,0 +1,85 @@
+---
+title: "Getting started with Vespa Cloud Enclave in AWS"
+sidebarTitle: "AWS getting started"
+---
+
+Setting up Vespa Cloud Enclave requires:
+
+
+
+Registration at [Vespa Cloud](https://console.vespa-cloud.com), or use a pre-existing tenant.
+
+
+Running a [Terraform](https://www.terraform.io/) configuration to provision AWS resources in the account. Go through the [AWS tutorial](https://developer.hashicorp.com/terraform/tutorials/aws-get-started) as needed.
+
+
+Registration of the AWS account ID in Vespa Cloud
+
+
+Deployment of a Vespa application.
+
+
+
+### 1. Vespa Cloud Tenant setup
+
+Register at [Vespa Cloud](https://console.vespa-cloud.com) or use an existing tenant. Note that the tenant must be on a [paid plan](https://vespa.ai/pricing/).
+
+### 2. Configure AWS Account
+
+
+**Note:**
+
+We recommend using a *dedicated* account for your Vespa Cloud Enclave. Vespa Cloud will manage resources in the Enclave VPCs created in the AWS resource provisioning step, primarily EC2 instances, load balancers and service endpoints.
+
+
+One account can host all your Vespa applications; there is no need for multiple tenants or accounts.
+
+The AWS account you intend to use for Vespa Cloud Enclave must be prepared for deploying Vespa applications using either *Terraform* or *CloudFormation*.
+
+#### Terraform
+
+Use [Terraform](https://www.terraform.io/) to set up the necessary resources using the [modules](https://registry.terraform.io/modules/vespa-cloud/enclave/aws/latest) published by the Vespa team.
+
+Modify the [multi-region Terraform files](https://github.com/vespa-cloud/terraform-aws-enclave/blob/main/examples/multi-region/main.tf) for your deployment.
+
+If you are unfamiliar with Terraform: It is a tool to manage resources and their configuration in various cloud providers, like AWS and GCP. Terraform has published an [AWS](https://developer.hashicorp.com/terraform/tutorials/aws-get-started) tutorial, and we strongly encourage enclave users to read and follow the Terraform recommendations for [CI/CD](https://developer.hashicorp.com/terraform/tutorials/automation/automate-terraform).
+
+The Terraform module we provide is regularly updated to add new required resources or extra permissions for Vespa Cloud to automate the operations of your applications. For your enclave applications to use the new features, you must re-apply your Terraform templates with the latest release. The [notification system](/en/operations/notifications) will let you know when a new release is available.
+
+### 3. Onboarding
+
+Once the AWS account is configured, contact [support@vespa.ai](mailto:support@vespa.ai) stating which tenant should be on-boarded to use Vespa Cloud Enclave. Also include the [AWS account ID](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-identifiers.html#FindAccountId) to associate with the tenant.
+
+
+**Note:**
+
+Wait for confirmation from the Vespa team that onboarding is complete before deploying an application in the next step.
+
+
+
+### 4. Deploy a Vespa application
+
+By default, all applications are deployed on resources in Vespa Cloud accounts. To deploy in your enclave account, update [deployment.xml](/en/reference/applications/deployment) to reference the AWS account you onboarded:
+
+```xml
+
+
+
+```
+
+Useful resources are [getting started](/en/basics/deploy-an-application-java) and [migrating to Vespa Cloud](/en/learn/migrating-to-cloud) - put *deployment.xml* next to *services.xml*.
+
+## Next steps
+
+After a successful deployment to the [dev](/en/operations/environments#dev) environment, iterate on the configuration to implement your application on Vespa. The *dev* environment is ideal for this, with rapid deployment cycles.
+
+For production serving, deploy to the [prod](/en/operations/environments#prod) environment - follow the steps in [production deployment](/en/reference/applications/deployment).
+
+## Enclave teardown
+
+To tear down a Vespa Cloud Enclave system, do the steps above in reverse order:
+
+1. [Undeploy the application(s)](/en/operations/deleting-applications)
+2. Undeploy the Terraform changes
+
+It is important to undeploy the Vespa application(s) first. After the Terraform changes are undeployed, Vespa Cloud cannot manage the allocated resources, so you must clean these up yourself.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/azure-architecture.mdx b/mintlify-docs/en/operations/enclave/azure-architecture.mdx
new file mode 100644
index 0000000000..9e24f07649
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/azure-architecture.mdx
@@ -0,0 +1,42 @@
+---
+title: "Architecture for Vespa Cloud Enclave in Azure"
+sidebarTitle: "Azure architecture"
+---
+
+### Architecture
+
+With Vespa Cloud Enclave, all Azure resources associated with your Vespa Cloud applications are in your enclave Azure subscription, as opposed to a shared Vespa Cloud subscription.
+
+Each Vespa Cloud [zone](/en/operations/zones) has an associated zone resource group (RG) in the enclave subscription, that contains all the resources for that zone. For instance, it has one Virtual Network (VNet aka [VPC](https://cloud.google.com/vpc/)).
+
+
+
+
+
+#### Virtual Machines, Load Balancers, and Blob Storage
+
+Configuration Servers inside the Vespa Cloud subscription make the decision to create or destroy virtual machines ("Vespa Hosts" in diagram) based on the Vespa applications that are deployed. The Configuration Servers also set up the Container Load Balancers needed to communicate with the deployed Vespa application.
+
+Each Vespa Host will periodically sync its logs to a Blob Storage container ("Log Archive") in a Storage Account in the zone RG. This storage account is "local" to the enclave and provisioned by the Terraform module inside your Azure subscription.
+
+#### Networking
+
+The Zone Virtual Network (VNet aka VPC) is very network restricted. The Vespa Hosts do not have public IPv4 addresses, but your application can connect to external IPv4 services using a [NAT gateway](https://learn.microsoft.com/en-us/azure/nat-gateway/nat-overview). Vespa Hosts have public IPv6 addresses and are able to make outbound connections. Inbound connections are not allowed. Outbound IPv6 connections are used to bootstrap communication with the Configuration Servers, and to report operational metrics back to Vespa Cloud.
+
+When a Vespa Host is booted, it will set up an encrypted tunnel back to the Configuration Servers. All communication between Configuration Servers and the Vespa Hosts will be run over this tunnel after it is set up.
+
+### Security
+
+The Vespa Cloud operations team does *not* have any direct access to the resources in your subscription. The only possible access is through the management APIs needed to run Vespa itself. In case it is needed for, e.g. incident debugging, direct access can only be granted to the Vespa team by you. Enable direct access by setting the `enable_ssh` input to true in the enclave module. For further details, see the documentation for the [enclave module inputs](https://registry.terraform.io/modules/vespa-cloud/enclave/azure/latest/?tab=inputs).
+
+All communication between the enclave and the Vespa Cloud Configuration servers is encrypted, authenticated, and authorized using [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication#mTLS) with identities embedded in the certificate. mTLS communication is facilitated with the [Athenz](https://www.athenz.io/) service.
+
+All data stored is encrypted at rest using [Encryption At Host](https://learn.microsoft.com/en-us/azure/virtual-machines/disk-encryption-overview). All keys are managed automatically by the Azure platform.
+
+The resources provisioned in your Azure subscription are either provisioned by the Vespa Cloud Enclave Terraform module you apply, or by the orchestration services inside a Vespa Cloud zone.
+
+Resources are provisioned by the Vespa Cloud Configuration servers, using the [`id-provisioner`](https://github.com/vespa-cloud/terraform-azure-enclave/blob/main/provisioner.tf) user-assigned managed identity defined in the Terraform module.
+
+Only your Vespa tenant (that registered this Azure subscription) can deploy applications targeting your enclave.
+
+For more general information about security in Vespa Cloud, see the [whitepaper](/en/security/whitepaper).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/azure-getting-started.mdx b/mintlify-docs/en/operations/enclave/azure-getting-started.mdx
new file mode 100644
index 0000000000..751856b38b
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/azure-getting-started.mdx
@@ -0,0 +1,76 @@
+---
+title: "Getting started with Vespa Cloud Enclave in Azure"
+sidebarTitle: "Azure getting started"
+---
+
+Setting up Vespa Cloud Enclave requires:
+
+
+
+Registration at [Vespa Cloud](https://console.vespa-cloud.com), or use a pre-existing Vespa tenant.
+
+
+Running a [Terraform](https://www.terraform.io/) configuration to provision necessary Azure resources in the subscription.
+
+
+Registration of the Azure subscription in Vespa Cloud.
+
+
+Deployment of a Vespa application.
+
+
+
+### 1. Vespa Cloud Tenant setup
+
+Register at [Vespa Cloud](https://console.vespa-cloud.com) or use an existing Vespa tenant. Note that the tenant must be on a [paid plan](https://vespa.ai/pricing/).
+
+### 2. Configure Azure subscription
+
+Choose an Azure subscription to use for Vespa Cloud Enclave.
+
+
+**Note:**
+
+We recommend using a *dedicated* subscription for your Vespa Cloud Enclave. Resources in this subscription will be fully managed by Vespa Cloud.
+
+
+One subscription can host all your Vespa applications; there is no need for multiple Vespa tenants or Azure subscriptions.
+
+The subscription must be prepared for deploying Vespa applications. Use [Terraform](https://www.terraform.io/) to set up the necessary resources using the [modules](https://registry.terraform.io/modules/vespa-cloud/enclave/azure/latest) published by the Vespa team.
+
+Feel free to use the [example](https://github.com/vespa-cloud/terraform-azure-enclave/blob/main/examples/basic/main.tf) to get started.
+
+If you are unfamiliar with Terraform: It is a tool to manage resources and their configuration in various cloud providers, like AWS, Azure, and GCP. Terraform has published a [Get Started - Azure](https://developer.hashicorp.com/terraform/tutorials/azure-get-started) tutorial, and we strongly encourage enclave users to read and follow the Terraform recommendations for [CI/CD](https://developer.hashicorp.com/terraform/tutorials/automation/automate-terraform).
+
+The Terraform module we provide is regularly updated to add new required resources or extra permissions for Vespa Cloud to automate the operations of your applications. For your enclave applications to use the new features, you must re-apply your Terraform templates with the latest release. The [notification system](/en/operations/notifications) will let you know when a new release is available.
+
+### 3. Onboarding
+
+Contact [support@vespa.ai](mailto:support@vespa.ai) and provide the `enclave_config` output after applying the Terraform, see [Outputs](https://github.com/vespa-cloud/terraform-azure-enclave?tab=readme-ov-file#outputs). The `enclave_config` output identifies the Vespa tenant to onboard to Vespa Cloud Enclave, along with the Azure tenant ID, the subscription ID, and the client ID of an Athenz identity that the Terraform module created.
+
+### 4. Deploy a Vespa application
+
+By default, all applications are deployed on resources in Vespa Cloud accounts. To deploy in your Azure enclave subscription instead, update [deployment.xml](/en/reference/applications/deployment) to reference the subscription ID from step 2:
+
+```xml
+
+
+
+```
+
+Useful resources are [getting started](/en/basics/deploy-an-application) and [migrating to Vespa Cloud](/en/learn/migrating-to-cloud) - put *deployment.xml* next to *services.xml*.
+
+## Next steps
+
+After a successful deployment to the [dev](/en/operations/environments#dev) environment, iterate on the configuration to implement your application on Vespa. The *dev* environment is ideal for this, with rapid deployment cycles.
+
+For production serving, deploy to the [prod](/en/operations/environments#prod) environment - follow the steps in [production deployment](/en/reference/applications/deployment).
+
+## Enclave teardown
+
+To tear down a Vespa Cloud Enclave system, do the steps above in reverse order:
+
+1. [Undeploy the application(s)](/en/operations/deleting-applications)
+2. Undeploy the Terraform changes
+
+It is important to undeploy the Vespa application(s) first. Once the Terraform changes are undeployed, Vespa Cloud can no longer manage the allocated resources, so you must clean up any that remain yourself.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/enclave.mdx b/mintlify-docs/en/operations/enclave/enclave.mdx
new file mode 100644
index 0000000000..063e0e65fe
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/enclave.mdx
@@ -0,0 +1,84 @@
+---
+title: "Vespa Cloud Enclave"
+sidebarTitle: "Enclave"
+---
+
+
+
+
+
+Vespa Cloud Enclave allows Vespa Cloud applications to run inside the tenant's own cloud accounts while everything is still fully managed by Vespa Cloud's automation, giving the tenant full access to Vespa Cloud features inside their own cloud account. This allows tenant data to always remain within the bounds of services controlled by the tenant, and also to build closer integrations with Vespa applications inside the cloud services.
+
+Vespa Cloud Enclave is available in AWS, Azure, and GCP.
+
+
+**Note:**
+
+As the Vespa Cloud Enclave resources run in *your* account, this incurs resource costs from your cloud provider in *addition* to the Vespa Cloud costs.
+
+
+
+## AWS
+
+
+
+
+
+
+## Azure
+
+
+
+
+
+
+## GCP
+
+
+
+
+
+
+## Guides
+
+
+
+
+
+
+## FAQ
+
+
+
+
+The permissions required are coded into the Terraform modules found at:
+
+
+
+
+
+
+Navigate to the *modules* directory for details.
+
+
+
+Use Terraform to grant Vespa hosts access to the necessary secrets, and create an RPM that retrieves them and configures your application. See [enclave-examples](https://github.com/vespa-cloud/enclave-examples/tree/main/systemd-secrets) for a complete example.
+
+
+
+This happens if you deploy to new zones *before* running the Terraform/CloudFormation templates:
+
+```bash
+Deployment failed: Invalid application: In container cluster 'mycluster': Could not provision load balancer mytenant:myapp:myinstance:mycluster: Expected to find exactly 1 resource, but got 0 for subnet with service 'tenantelb'
+```
+
+
+
+Vespa Cloud will take proactive actions on maintenance operations and replace instances that are scheduled for maintenance tasks ahead of time to reduce any impact the maintenance may incur.
+
+All EC2 instance failures are detected by our control plane, and the problematic instances are automatically replaced. The system will, as part of the replacement process, also ensure that the document distribution is kept in line with your application configuration.
+
+
+VPC peering is not supported; [AWS PrivateLink](/en/operations/private-endpoints#aws-private-link) and [Google Private Service Connect](/en/operations/private-endpoints#gcp-private-service-connect) are good alternatives that let you access the endpoints without going over the public internet. [Read more](/en/reference/applications/deployment#accessing-a-public-cloud-application-from-another-vpc-on-another-account).
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/gcp-architecture.mdx b/mintlify-docs/en/operations/enclave/gcp-architecture.mdx
new file mode 100644
index 0000000000..b059028a19
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/gcp-architecture.mdx
@@ -0,0 +1,40 @@
+---
+title: "Architecture for Vespa Cloud Enclave in GCP"
+sidebarTitle: "GCP architecture"
+---
+
+### Architecture
+
+Each Vespa Cloud Enclave in the tenant GCP project corresponds to a Vespa Cloud [zone](/en/operations/zones). Inside the tenant GCP project one enclave is contained within one single [VPC](https://cloud.google.com/vpc/).
+
+
+
+
+
+#### Compute Instances, Load Balancers, and Cloud Storage buckets
+
+Configuration Servers inside the Vespa Cloud zone decide when to create or destroy compute instances ("Vespa Hosts" in diagram) based on the Vespa applications that are deployed. The Configuration Servers also set up the Network Load Balancers needed to communicate with the deployed Vespa application.
+
+Each Vespa Host will periodically sync its logs to a Cloud Storage bucket ("Log Archive"). This bucket is "local" to the enclave and provisioned by the Terraform module inside the tenant's GCP project.
+
+#### Networking
+
+The enclave VPC is very network restricted. Vespa Hosts do not have public IPv4 addresses and there is no [NAT gateway](https://cloud.google.com/nat/docs/overview) available in the VPC. Vespa Hosts have public IPv6 addresses and are able to make outbound connections. Inbound connections are not allowed. Outbound IPv6 connections are used to bootstrap communication with the Configuration Servers, and to report operational metrics back to Vespa Cloud.
+
+When a Vespa Host is booted, it will set up an encrypted tunnel back to the Configuration Servers. All communication between Configuration Servers and the Vespa Hosts will be run over this tunnel after it is set up.
+
+### Security
+
+The Vespa Cloud operations team does *not* have any direct access to the resources that are part of the customer account. The only possible access is through the management APIs needed to run Vespa itself. If direct access is needed, e.g. for incident debugging, only the tenant can grant it to the Vespa team. Direct access is enabled by setting the `enable_ssh` input to true in the enclave module. For further details, see the documentation for the [enclave module inputs](https://registry.terraform.io/modules/vespa-cloud/enclave/google/latest/?tab=inputs).
+
+All communication between the enclave and the Vespa Cloud configuration servers is encrypted, authenticated and authorized using [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication#mTLS) with identities embedded in the certificate. mTLS communication is facilitated with the [Athenz](https://www.athenz.io/) service.
+
+All data stored is encrypted at rest using [Cloud Key Management](https://cloud.google.com/security-key-management). All keys are managed by the tenant in the tenant's GCP project.
+
+The resources provisioned in the tenant GCP project are either provisioned by the Terraform module executed by the tenant, or by the orchestration services inside a Vespa Cloud zone.
+
+Resources are provisioned by the Vespa Cloud configuration servers, using the [`vespa_cloud_provisioner_role`](https://github.com/vespa-cloud/terraform-google-enclave/blob/main/main.tf) IAM role defined in the Terraform module.
+
+The tenant that registered the GCP project is the only tenant that can deploy applications targeting the enclave.
+
+For more general information about security in Vespa Cloud, see the [whitepaper](/en/security/whitepaper).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/gcp-getting-started.mdx b/mintlify-docs/en/operations/enclave/gcp-getting-started.mdx
new file mode 100644
index 0000000000..5de7521cb5
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/gcp-getting-started.mdx
@@ -0,0 +1,84 @@
+---
+title: "Getting started with Vespa Cloud Enclave in GCP"
+sidebarTitle: "GCP getting started"
+---
+
+Setting up Vespa Cloud Enclave requires:
+
+
+
+Registration at [Vespa Cloud](https://console.vespa-cloud.com), or use a pre-existing tenant.
+
+
+Running a [Terraform](https://www.terraform.io/) configuration to provision necessary GCP resources in the project.
+
+
+Registration of the GCP project in Vespa Cloud.
+
+
+Deployment of a Vespa application.
+
+
+
+### 1. Vespa Cloud Tenant setup
+
+Register at [Vespa Cloud](https://console.vespa-cloud.com) or use an existing tenant. Note that the tenant must be on a [paid plan](https://vespa.ai/pricing/).
+
+### 2. Configure GCP Project
+
+
+**Note:**
+
+We recommend using a *dedicated* project for your Vespa Cloud Enclave. Resources in this project will be fully managed by Vespa Cloud.
+
+
+One project can host all your Vespa applications; there is no need for multiple tenants or projects.
+
+The project you intend to use for Vespa Cloud Enclave must be prepared for deploying Vespa applications. Use [Terraform](https://www.terraform.io/) to set up the necessary resources using the [modules](https://registry.terraform.io/modules/vespa-cloud/enclave/google/latest) published by the Vespa team.
+
+Modify the [multi-region example](https://github.com/vespa-cloud/terraform-google-enclave/blob/main/examples/multi-region/main.tf) for your deployment.
+
+If you are unfamiliar with Terraform: It is a tool to manage resources and their configuration in various cloud providers, like AWS and GCP. Terraform has published a [GCP](https://developer.hashicorp.com/terraform/tutorials/gcp-get-started) tutorial, and we strongly encourage enclave users to read and follow the Terraform recommendations for [CI/CD](https://developer.hashicorp.com/terraform/tutorials/automation/automate-terraform).
+
+The Terraform module we provide is regularly updated to add new required resources or extra permissions for Vespa Cloud to automate the operations of your applications. For your enclave applications to use the new features, you must re-apply your Terraform templates with the latest release. The [notification system](/en/operations/notifications) will let you know when a new release is available.
+
+### 3. Onboarding
+
+Once the GCP project is configured, contact [support@vespa.ai](mailto:support@vespa.ai) stating which tenant should be on-boarded to use Vespa Cloud Enclave. Also include the [GCP Project ID](https://cloud.google.com/resource-manager/docs/creating-managing-projects#identifying_projects) to associate with the tenant.
+
+
+**Note:**
+
+Wait for confirmation from the Vespa team that onboarding is complete before deploying an application in the next step.
+
+
+### 4. Deploy a Vespa application
+
+By default, all applications are deployed on resources in Vespa Cloud accounts. To deploy in your enclave account, update [deployment.xml](/en/reference/applications/deployment) to reference the GCP project you onboarded:
+
+```xml
+
+
+
+```
+
+Useful resources are [getting started](/en/basics/deploy-an-application) and [migrating to Vespa Cloud](/en/learn/migrating-to-cloud) - put *deployment.xml* next to *services.xml*.
+
+## Next steps
+
+After a successful deployment to the [dev](/en/operations/environments#dev) environment, iterate on the configuration to implement your application on Vespa. The *dev* environment is ideal for this, with rapid deployment cycles.
+
+For production serving, deploy to the [prod](/en/operations/environments#prod) environment - follow the steps in [production deployment](/en/reference/applications/deployment).
+
+## Enclave teardown
+
+To tear down a Vespa Cloud Enclave system, do the steps above in reverse order:
+
+1. [Undeploy the application(s)](/en/operations/deleting-applications)
+2. Undeploy the Terraform changes
+
+It is important to undeploy the Vespa application(s) first. Once the Terraform changes are undeployed, Vespa Cloud can no longer manage the allocated resources, so you must clean up any that remain yourself.
+
+## Troubleshooting
+
+**Identities restricted by domain**: If your GCP organization is using [domain restriction for identities](https://cloud.google.com/resource-manager/docs/organization-policy/restricting-domains) you will need to permit Vespa.ai GCP identities to be added to your project. For Vespa Cloud the organization ID to allow identities from is: *1056130768533*, and the Google Customer ID is *C00u32w3e*.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/enclave/operations.mdx b/mintlify-docs/en/operations/enclave/operations.mdx
new file mode 100644
index 0000000000..2c7202b03c
--- /dev/null
+++ b/mintlify-docs/en/operations/enclave/operations.mdx
@@ -0,0 +1,21 @@
+---
+title: "Operations and Support for Vespa Cloud Enclave"
+sidebarTitle: "Operations"
+---
+Vespa Cloud Enclave requires that resources provisioned within the VPC are wholly managed by the Vespa Cloud orchestration services, and they must not be manually managed by tenant operations. Changing or removing the resources created by the Configuration Servers will negatively impact your Vespa application, and may prevent Vespa Cloud from properly managing the application and Vespa engineers from supporting it.
+
+The Terraform modules might see occasional backwards compatible updates. It is recommended that the tenant applies updates to their system on a regular basis. For more information, see the Terraform documentation on [using Terraform in automation](https://developer.hashicorp.com/terraform/tutorials/automation/automate-terraform).
+
+The network access granted to Vespa Hosts must be in place for the Vespa application to operate properly. If network access is restricted the Vespa application might stop working.
+
+## Custom resource tags
+
+Custom tags can be applied to the cloud resources (virtual machines and disks) that Vespa Cloud provisions inside the tenant's cloud account. Tags are declared in *deployment.xml* via the [``](/en/reference/applications/deployment#resource-tags) element and are commonly used for cost tracking and resource management. This is supported on AWS, Azure, and GCP.
+
+## Quota
+
+Make sure your organization's AWS or GCP quotas are set high enough to support common Vespa Cloud use cases. A common use case is migrating to new instance types, which temporarily doubles (or more) resource usage while data is migrated. Node replacements also temporarily increase resource usage.
+
+Best practice is to keep the quota at 3x the current resource usage, to also cover capacity expansion.
+
+This is not to be confused with the [Vespa Cloud quota](/en/cloud/quota).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/endpoint-routing.mdx b/mintlify-docs/en/operations/endpoint-routing.mdx
new file mode 100644
index 0000000000..b3e3fbea2a
--- /dev/null
+++ b/mintlify-docs/en/operations/endpoint-routing.mdx
@@ -0,0 +1,64 @@
+---
+title: "Routing and endpoints"
+sidebarTitle: "Endpoint routing"
+---
+Vespa Cloud supports multiple methods of routing requests to an application. This guide describes how these routing methods work, how failover behaves, and how to configure them.
+
+By default, each deployment of a Vespa Cloud application will have a zone endpoint. In addition to the default zone endpoint, one can configure global endpoints.
+
+All endpoints for an application are available under the *endpoints* tab of each deployment in the console.
+
+## Endpoint format
+
+Vespa Cloud endpoints have the format `{random}.{random}.{scope}.vespa-app.cloud`.
+
+## Endpoint scopes
+
+### Zone endpoint
+
+This is the default endpoint for a deployment. Requests through a zone endpoint are sent directly to the zone.
+
+Zone endpoints are created implicitly, one per container cluster declared in [services.xml](/en/reference/applications/services/container). Zone endpoints are not configurable.
+
+Zone endpoints have the suffix `z.vespa-app.cloud`
+
+### Global endpoint
+
+A global endpoint is an endpoint that can route requests to multiple zones. It can be configured in [deployment.xml](/en/reference/applications/deployment#endpoints-global). Similar to how a [CDN](https://en.wikipedia.org/wiki/Content_delivery_network) works, requests through this endpoint will be routed to the nearest zone based on geo proximity, i.e. the zone that is nearest to the client.
+
+Global endpoints have the suffix `g.vespa-app.cloud`
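+
+As a rough illustration of the suffix convention, the sketch below classifies an endpoint hostname by scope. The function name `endpoint_scope` and the example hostnames are hypothetical; only the `z.vespa-app.cloud` and `g.vespa-app.cloud` suffixes come from the text above.
+
+```python
+def endpoint_scope(hostname: str) -> str:
+    """Return 'zone', 'global', or 'unknown' based on the hostname suffix."""
+    # Normalize case and drop a trailing dot from fully qualified names.
+    host = hostname.lower().rstrip(".")
+    if host.endswith(".z.vespa-app.cloud"):
+        return "zone"
+    if host.endswith(".g.vespa-app.cloud"):
+        return "global"
+    return "unknown"
+
+
+# Hypothetical hostnames, made up for illustration only.
+print(endpoint_scope("abc123.def456.z.vespa-app.cloud"))
+print(endpoint_scope("abc123.def456.g.vespa-app.cloud"))
+```
+
+A client could use a check like this to refuse to send feed traffic to anything other than a zone endpoint.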
+
+
+**Important:**
+
+Global endpoints do not support feeding. Feeding must be done through zone endpoints.
+
+
+
+## Routing control
+
+Vespa Cloud has two mechanisms for manually controlling routing of requests to a zone:
+
+- Removing the `` element from the relevant `` elements in [deployment.xml](/en/reference/applications/deployment) and deploying a new version of your application.
+- Changing the status through the console.
+
+This section describes the latter mechanism. Navigate to the relevant deployment of your application in the console. Hovering over the *GLOBAL ROUTING* badge will display the current status and when it was last changed.
+
+### Change status
+
+In case of a production emergency, a zone can be manually set out to prevent it from receiving requests:
+
+1. Hover over the *GLOBAL ROUTING* badge for the problematic deployment and click *Deactivate*.
+2. Inspection of the status will now show the status set to *OUT*. To set the zone back in and have it continue receiving requests: Hover over the *GLOBAL ROUTING* badge again and click *Activate*.
+
+### Behaviour
+
+Changing the routing status is independent of the endpoint scope used. You're technically overriding the routing status the deployment reports to the Vespa Cloud routing infrastructure. This means that a change to routing status affects both *zonal endpoints* and *global endpoints*.
+
+Deactivating a deployment disables routing of requests to that deployment through global endpoints until the deployment is activated again. As routing through these endpoints is DNS-based, it may take between 5 and 15 minutes for all traffic to shift to other deployments.
+
+If all deployments of an endpoint are deactivated, requests are distributed as if all deployments were active. This is because attempting to route traffic according to the original configuration is preferable to discarding all requests.
+
+## AWS clients
+
+While Vespa Cloud is hosted in AWS, clients that talk to Vespa Cloud from AWS nodes will be treated as any other client from the Internet. This means clients in AWS will generate regular Internet egress traffic even though they are talking to a service in AWS in the same zone.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/environments.mdx b/mintlify-docs/en/operations/environments.mdx
new file mode 100644
index 0000000000..aedc8e9423
--- /dev/null
+++ b/mintlify-docs/en/operations/environments.mdx
@@ -0,0 +1,85 @@
+---
+title: "Environments"
+---
+
+Vespa Cloud has two kinds of environments:
+
+- Manual environment for rapid development and test: `dev`
+- Automated environment with integrated CD pipeline: `prod`
+
+An application is deployed to one or more *zones* (see [zone list](/en/operations/zones)), which is a combination of an *environment* and a *region*, like `vespa deploy -z dev.aws-us-east-1c`.
+
+## Dev
+
+The dev environment is built for rapid development cycles, with auto-downscaling and auto-expiry for ease of use and cost control. The dev environment is the default; to deploy to it, use `vespa deploy`.
+
+### Auto downscaling
+
+One use case for the dev environment is to take an application package from a prod environment and deploy to the dev environment to debug. To minimize cost and make this speedy, Vespa Cloud will by default ignore [nodes](/en/reference/applications/services/services#nodes) and [resources](/en/reference/applications/services/services#resources) settings.
+
+With this, you can safely download an application package from prod (which is normally large) and deploy it to dev with no changes.
+
+To override this behavior and control the resources, specify them explicitly for the dev environment as described in [deployment variants](/en/operations/deployment-variants#services.xml-variants). Example:
+
+```xml
+
+
+
+
+
+>
+```
+
+
+**Important:**
+
+The `dev` environment has redundancy 1 by default, and there are no availability or data persistence guarantees. Do not use applications deployed to these zones for production serving use cases.
+
+
+### Auto expiry
+
+Deployments to `dev` expire after 14 days of inactivity, that is, 14 days after the last [deployment](/en/basics/applications#deploying-applications). **This applies to all plans**. To add 7 more days to the expiry period, redeploy the application or use the Vespa Cloud Console.
+
+### Vespa version
+
+The latest active Vespa version is used when deploying to the dev environment. To minimize downtime, the deployment is upgraded at a time that is most likely night for the developer (based on when the most recent deployments were made). An upgrade will be skipped if metrics indicate ongoing feed or query load, but will still be done if the current version is more than a week old.
+
+## Prod
+
+Applications are deployed to the `prod` environment for production serving. Deployments are passed through an integrated CD pipeline for system tests and staging tests. Read more in [automated deployments](/en/operations/automated-deployments).
+
+## Test
+
+The `test` environment is used by the integrated CD pipeline for prod deployments, to run [system tests](/en/operations/automated-deployments#system-tests). The test capacity is ephemeral and only used during test. Nodes in test and staging environments do not have access to data in prod environments.
+
+Note that one cannot deploy directly to test and staging environments. For long-lived test applications (e.g., a QA system that is integrated with other services) use the prod environment.
+
+System tests are always invoked, even if there are no tests defined. In this case, an instance is just started and then stopped. This has value in itself, as it ensures that the application is able to start.
+
+Test runs can be [aborted](/en/operations/automated-deployments#disabling-tests).
+
+## Staging
+
+The description of system tests above applies to staging, too. [Staging tests](/en/operations/automated-deployments#staging-tests) use a fraction of the configured prod capacity; this can be overridden to use 1 node regardless of prod cluster size:
+
+```xml
+
+
+
+
+
+```
+
+## Reference
+
+Environment settings:
+
+| Name | Description | Expiry | Cluster sizes |
+| :--- | :--- | :--- | :--- |
+| `dev` | Used for manual development testing. | 14 days | `1` |
+| `test` | Used for [automated system tests](/en/applications/testing#system-tests). | \- | `1` |
+| `staging` | Used for [automated staging tests](/en/applications/testing#staging-tests). | \- | `min(max(2, 0.05 * spec), spec)` |
+| `prod` | Hosts all production deployments. | No expiry | `max(2, spec)` |
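+
+The cluster-size expressions in the table can be sketched as a small function, shown here only to make the arithmetic concrete. The name `cluster_size` is hypothetical, and rounding a fractional staging size up to the next whole node is an assumption; the table does not state how fractions are handled.
+
+```python
+def cluster_size(environment: str, spec: int) -> int:
+    """Node count from the 'Cluster sizes' column for a cluster specifying `spec` nodes."""
+    if environment in ("dev", "test"):
+        return 1
+    if environment == "staging":
+        # ASSUMPTION: 0.05 * spec is rounded up; done with integers to avoid float error.
+        five_percent = (spec + 19) // 20
+        return min(max(2, five_percent), spec)
+    if environment == "prod":
+        return max(2, spec)
+    raise ValueError(f"unknown environment: {environment}")
+
+
+# A prod cluster specified with 100 nodes, and its staging counterpart.
+print(cluster_size("prod", 100))
+print(cluster_size("staging", 100))
+```
+
+Under these assumptions, a staging cluster never exceeds the specified size, which is why a single-node specification stays at one node in staging while prod is raised to two.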
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/architecture.mdx b/mintlify-docs/en/operations/kubernetes/architecture.mdx
new file mode 100644
index 0000000000..dedd147d08
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/architecture.mdx
@@ -0,0 +1,11 @@
+---
+title: "Architecture"
+---
+
+
+
+
+
+The Vespa Operator is an implementation of the [Operator Pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that extends Kubernetes with custom orchestration capabilities for Vespa. It relies on a [Custom Resource Definition](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) called a `VespaSet`, which represents a quorum of [ConfigServers](/en/operations/self-managed/configuration-server) in a Kubernetes namespace. The Vespa Operator is responsible for the deployment and lifecycle of the `VespaSet` resource and its ConfigServers, which together make up the infrastructure for Vespa on Kubernetes.
+
+[Application Packages](/en/basics/applications) are deployed to the [ConfigServers](/en/operations/self-managed/configuration-server) to create Vespa applications. The ConfigServers will dynamically instantiate the services as individual Pods based on the settings provided in the Application Package. After an Application Package is deployed, the ConfigServers will remain responsible for the management and lifecycle of the Vespa application.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/configuration/configure-local-storage-type.mdx b/mintlify-docs/en/operations/kubernetes/configuration/configure-local-storage-type.mdx
new file mode 100644
index 0000000000..9370780ba9
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/configuration/configure-local-storage-type.mdx
@@ -0,0 +1,183 @@
+---
+title: "Configure Local Storage Type"
+---
+We recommend configuring node-local storage for the [content cluster](/en/content/proton) (i.e. the search core) to maximize performance by avoiding network I/O on the data path. In a standard Vespa deployment, this is controlled through the `storage-type` attribute under the [resources](/en/reference/applications/services/services#resources) tag in the [application package](/en/basics/applications). However, that attribute has no effect when running Vespa on Kubernetes. Instead, local storage should be configured through the `spec.application.storageClass` field in the `VespaSet`. Vespa on Kubernetes abstracts away the concept of storage and will consume whatever is provided by the referenced storage class.
+
+For ConfigServer pods, storage performance is less critical; therefore, selecting a more cost-efficient network-attached storage class, such as `gp3` EBS volumes on Amazon EKS, is generally an appropriate tradeoff.
+
+To provision node-local storage, we recommend using Kubernetes [Local Persistent Volumes](https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/). These volumes expose `NodeAffinity` constraints to the Kubernetes scheduler, ensuring that Pods consuming them are scheduled onto nodes where the underlying storage is available. This avoids the need to manually manage NodeAffinity rules on a per-Pod basis.
+
+In addition, the Kubernetes Special Interest Groups (SIGs) provide an external [Local Persistent Volume](https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/) static provisioner. This provisioner automatically discovers local disks mounted on each node and creates corresponding `PersistentVolumes`, while managing their lifecycle, including cleanup and reuse as Pods are deleted. We recommend using this component in production deployments.
+
+This guide walks through setting up local NVMe instance storage on EKS nodes using the [Kubernetes Local Volume Static Provisioner](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner). This exposes the physical NVMe disks available on instances as a `local-nvme` StorageClass that Application Pods can claim. While this guide specifically targets an Amazon EKS setup, the concept is similar across different environments - refer to the [project](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/tree/master/helm/examples) for several other examples.
+
+## Setup Local Storage on Amazon EKS
+
+This guide assumes that your EKS cluster has a Node Group configured with an instance type that supports local NVMe instance storage, such as `m7gd.xlarge`. These instance types typically contain the `d` suffix to designate themselves as specialized for workloads that require local instance storage. Refer to the [AWS EKS Node Groups](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) documentation for further information on configuring Node Groups.
+
+This guide specifically targets Bottlerocket-based EKS Nodes. These Nodes do not execute the standard EKS bootstrap script responsible for preparing NVMe instance storage. Disk formatting and mounting are therefore handled by an init container, after which the static provisioner scans for available volumes and registers them as `PersistentVolumes`.
+
+Add the Helm repository for the Local Volume Static Provisioner.
+
+```text
+$ helm repo add sig-storage-local-static-provisioner https://kubernetes-sigs.github.io/sig-storage-local-static-provisioner
+$ helm repo update
+```
+
+Create an EKS NVMe instance storage configuration. The example below will run an [initContainer](https://kubernetes.io/docs/concepts/workloads/pods/init-containers/) that will scan for NVMe instance store disks and format them as `ext4` under `/mnt/disks`, which the static provisioner will detect.
+
+```bash expandable
+cat <<'EOF' > local-nvme-values.yaml
+# EKS Bottlerocket NVMe instance storage configuration.
+classes:
+  - name: local-nvme
+    hostDir: /mnt/disks
+    mountDir: /mnt/disks
+    volumeMode: Filesystem
+    fsType: ext4
+    accessMode: ReadWriteOnce
+    storageClass:
+      reclaimPolicy: Delete
+      isDefaultClass: false
+
+nodeSelector:
+  eks.amazonaws.com/nodegroup: test-node-group
+
+priorityClassName: system-node-critical
+mountDevVolume: true
+
+initContainers:
+  - name: nvme-disk-setup
+    image: registry.k8s.io/sig-storage/local-volume-provisioner:v2.8.0
+    securityContext:
+      privileged: true
+    command:
+      - sh
+      - -c
+      - |
+        set -eu
+
+        DISKS_PATH=/mnt/disks
+
+        disks=$(ls /dev/nvme*n1 2>/dev/null | grep -v '/dev/nvme0n1' || true)
+
+        if [ -z "${disks}" ]; then
+          echo "No NVMe instance-store disks found, nothing to do"
+          exit 0
+        fi
+
+        for disk in ${disks}; do
+          echo "Processing ${disk}..."
+
+          model=$(cat /sys/block/$(basename ${disk})/device/model 2>/dev/null || true)
+          if ! echo "${model}" | grep -q "Amazon EC2 NVMe Instance Storage"; then
+            echo "${disk} is not an instance store disk (model: ${model}), skipping"
+            continue
+          fi
+
+          if grep -q "^${disk} " /proc/mounts; then
+            echo "${disk} is already mounted, skipping"
+            continue
+          fi
+
+          if ! blkid "${disk}" >/dev/null 2>&1; then
+            echo "No filesystem on ${disk}, formatting as ext4..."
+            mkfs.ext4 -F "${disk}"
+          fi
+
+          uuid=$(blkid -s UUID -o value "${disk}")
+          if [ -z "${uuid}" ]; then
+            echo "Could not determine UUID for ${disk}, skipping"
+            continue
+          fi
+
+          mount_point="${DISKS_PATH}/${uuid}"
+          mkdir -p "${mount_point}"
+          echo "Mounting ${disk} (UUID=${uuid}) at ${mount_point}"
+          mount "${disk}" "${mount_point}"
+        done
+
+        echo "Setup complete. Disks mounted under ${DISKS_PATH}:"
+        grep "${DISKS_PATH}" /proc/mounts || echo "  (none found)"
+    volumeMounts:
+      - name: provisioner-dev
+        mountPath: /dev
+      - name: local-nvme
+        mountPath: /mnt/disks
+        mountPropagation: Bidirectional
+
+resources:
+  requests:
+    cpu: 10m
+    memory: 32Mi
+  limits:
+    cpu: 100m
+    memory: 128Mi
+EOF
+
+$ helm install local-volume-provisioner \
+ sig-storage-local-static-provisioner/local-static-provisioner \
+ --namespace kube-system \
+ --values local-nvme-values.yaml
+```
+
+`mountPropagation: Bidirectional` will ensure that the volume mount is propagated back to the host, and `priorityClassName: system-node-critical` ensures the provisioner Pod will not be evicted in the case of Node pressure.
+
+After the static provisioner is installed, a `StorageClass` named `local-nvme` is created. Use this name in the `spec.application.storageClass` attribute of the `VespaSet`.
+
+```text
+$ kubectl get storageclasses
+NAME         PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
+local-nvme   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  12h
+```
+
+Ensure that the `VolumeBindingMode` is `WaitForFirstConsumer` to delay `PersistentVolume` binding until a Pod is scheduled, allowing the scheduler to place the Pod on a Node where the storage physically resides.
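+
+The binding mode can also be checked from a script. The sketch below is a minimal example; the helper names are hypothetical and are not part of the provisioner or the Vespa Operator. It only assumes `kubectl` access to the cluster and the `local-nvme` StorageClass created above.
+
+```bash
+# Print the volumeBindingMode of a StorageClass.
+# $1: StorageClass name, for example local-nvme
+storageclass_binding_mode() {
+  kubectl get storageclass "$1" -o jsonpath='{.volumeBindingMode}'
+}
+
+# Succeed only if the StorageClass delays binding until a Pod is scheduled.
+require_wait_for_first_consumer() {
+  mode=$(storageclass_binding_mode "$1")
+  if [ "$mode" = "WaitForFirstConsumer" ]; then
+    echo "ok: $1 uses WaitForFirstConsumer"
+  else
+    echo "unexpected volumeBindingMode for $1: ${mode:-unknown}" >&2
+    return 1
+  fi
+}
+```
+
+Running `require_wait_for_first_consumer local-nvme` exits with a non-zero status when the StorageClass uses any other binding mode, which makes it usable as a guard before applying a `VespaSet`.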
+
+After the `initContainer` has completed, the static provisioner will provision `PersistentVolumes`.
+
+```text expandable
+$ kubectl get persistentvolumes
+NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
+local-pv-201c66f3   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-2942e993   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-2fea7934   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-335a2831   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-3499cebf   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-36dc72b5   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-37928b3d   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-5e09d438   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+local-pv-6e9849a9   216Gi      RWO            Delete           Available           local-nvme     <unset>                          12h
+```
+
+Configure the `VespaSet` to use the newly created `StorageClass`. For example:
+
+```bash expandable
+# vespaset sample for EKS with local storage configured
+$ cat > vespaset.yaml <• Add env vars/mounts to main container. | • Cannot change main container image, command, or args. • Cannot override main container CPU/Memory resources (these are locked to `services.xml`). |
+| **Volumes** | • Add new Volumes (ConfigMap, Secret, EmptyDir). | • Cannot modify operator-reserved volumes (e.g., `/data`). |
+| **Metadata** | • Add new Labels and Annotations. | • Cannot overwrite operator-created labels and annotations |
+
+## Examples
+
+### Example 1: Injecting a Logging Sidecar
+
+This example adds a Fluent Bit sidecar to ship logs to a central system. It defines the sidecar container and mounts a shared volume that the Vespa container also writes to.
+
+```yaml
+apiVersion: k8s.ai.vespa/v1
+kind: VespaSet
+metadata:
+  name: my-vespa-cluster
+spec:
+  application:
+    image: vespaengine/vespa:8.200.15
+    # Define the Custom Overlay
+    podTemplate:
+      spec:
+        containers:
+          # 1. Define the Sidecar
+          - name: fluent-bit
+            image: fluent/fluent-bit:1.9
+            volumeMounts:
+              - name: vespa-logs
+                mountPath: /opt/vespa/logs/vespa
+        # 2. Define the Shared Volume
+        volumes:
+          - name: vespa-logs
+            emptyDir: {}
+```
+
+### Example 2: Pinning Pods to Specific Nodes
+
+This example uses a `nodeSelector` to ensure Vespa Pods only run on Nodes labeled `workload=high-performance`, and adds a toleration so the Pods can be scheduled onto Nodes carrying the `dedicated` taint.
+
+```yaml
+apiVersion: k8s.ai.vespa/v1
+kind: VespaSet
+metadata:
+  name: prod-vespa
+spec:
+  application:
+    podTemplate:
+      spec:
+        # Schedule only on nodes with label 'workload: high-performance'
+        nodeSelector:
+          workload: high-performance
+        # Tolerate the 'dedicated' taint if those nodes are tainted
+        tolerations:
+          - key: "dedicated"
+            operator: "Equal"
+            value: "search-team"
+            effect: "NoSchedule"
+```
+
+### Example 3: Adding Cost Allocation Labels
+
+This example adds custom labels that will appear on every tenant pod, enabling cost tracking by team.
+
+```yaml
+apiVersion: k8s.ai.vespa/v1
+kind: VespaSet
+metadata:
+  name: shared-vespa
+spec:
+  application:
+    podTemplate:
+      metadata:
+        labels:
+          cost-center: "engineering-search"
+          owner: "team-alpha"
+        annotations:
+          # Example annotation for an external monitoring system
+          monitoring.datadoghq.com/enabled: "true"
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/deployment/dev-mode.mdx b/mintlify-docs/en/operations/kubernetes/deployment/dev-mode.mdx
new file mode 100644
index 0000000000..51258904fa
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/deployment/dev-mode.mdx
@@ -0,0 +1,84 @@
+---
+title: "Setup Dev Environment"
+---
+
+This guide describes the steps to enable the `dev` environment for Vespa on Kubernetes. This is a one-time, irreversible operation: once a `VespaSet` has been deployed in the `dev` environment configuration, it cannot be reverted.
+
+
+**Important:**
+
+The `dev` environment is intended for local development, integration testing, and experimentation — not for production serving.
+
+
+## Dev Environment
+
+Unlike Vespa Cloud, the `dev` environment must additionally be configured at the `VespaSet` resource level. Once this is enabled, any Vespa cluster that is reconciled through this `VespaSet` will have a `min-availability` in its `content` cluster and a `node` count of 1 for all cluster types.
+
+As such, HA (high-availability) of Vespa Pods is not guaranteed, and availability will be reduced during upgrades. The only exception is the ConfigServer Pods, which must always maintain a replica count of 3 to ensure a quorum.
+
+For more information on Environments, refer to the [Vespa Cloud](/en/operations/environments#dev) documentation.
+
+## Enable Dev Environment
+
+The `dev` environment is activated by adding the following annotation to the `VespaSet` resource:
+
+| Annotation | Value | Effect |
+| --- | --- | --- |
+| `internal.vespa.ai/environment` | `dev` | Signals to the ConfigServer that this is a `dev` environment. |
+
+```bash
+$ cat > vespaset-dev.yaml <
+
+
+
+
+
+
+
+
+
+
+
+
+ 1
+
+
+
+
+
+
+
+
+
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/deployment/ecr-pull-through-cache.mdx b/mintlify-docs/en/operations/kubernetes/deployment/ecr-pull-through-cache.mdx
new file mode 100644
index 0000000000..3078ce8253
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/deployment/ecr-pull-through-cache.mdx
@@ -0,0 +1,91 @@
+---
+title: "Setup Amazon ECR Pull-Through Cache"
+sidebarTitle: "Setup ECR Pull-through Cache"
+---
+
+For production, we recommend mirroring the upstream artifacts into your own registry. This section shows how to create an [Amazon ECR pull-through cache](https://docs.aws.amazon.com/AmazonECR/latest/userguide/pull-through-cache.html) for the images referenced in the [Installation](/en/operations/kubernetes/deployment/installation) guide.
+
+## AWS Console Steps
+
+
+
+Open AWS Console -> **Amazon ECR** -> **Private registry** -> **Pull through cache rules**.
+
+
+Choose **Create rule**.
+
+
+Set **ECR repository prefix** to `vespa-cache`.
+
+
+Set **Upstream registry URL** to `images.ves.pa`.
+
+
+Create or select a Secrets Manager credential with your support-provided upstream username/token.
+
+
+Create the rule, then optionally pull one tag of each artifact to warm the cache.
+
+
+
+## AWS CLI Steps
+
+Set the AWS account, region, and ECR registry variables, along with the upstream credentials provided by Vespa support.
+
+```bash
+export AWS_ACCOUNT_ID=123456789012
+export AWS_REGION=us-east-1
+export ECR_REGISTRY=${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
+export ECR_CACHE_PREFIX=vespa-cache
+
+export VESPAAI_REGISTRY_USER=
+export VESPAAI_REGISTRY_TOKEN=
+```
+
+Create a Secrets Manager secret to store the upstream registry credentials.
+
+```bash
+aws secretsmanager create-secret \
+ --name vespa-registry-creds \
+ --secret-string "{\"username\":\"${VESPAAI_REGISTRY_USER}\",\"password\":\"${VESPAAI_REGISTRY_TOKEN}\"}" \
+ --region ${AWS_REGION} || \
+aws secretsmanager put-secret-value \
+ --secret-id vespa-registry-creds \
+ --secret-string "{\"username\":\"${VESPAAI_REGISTRY_USER}\",\"password\":\"${VESPAAI_REGISTRY_TOKEN}\"}" \
+ --region ${AWS_REGION}
+```
+
+Create the pull-through cache rule. A single rule covers all repositories under the `images.ves.pa` host.
+
+```bash
+aws ecr create-pull-through-cache-rule \
+ --ecr-repository-prefix ${ECR_CACHE_PREFIX} \
+ --upstream-registry-url images.ves.pa \
+ --credential-arn arn:aws:secretsmanager:${AWS_REGION}:${AWS_ACCOUNT_ID}:secret:vespa-registry-creds \
+ --region ${AWS_REGION}
+```
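+
+To confirm that the rule was registered, it can be listed with `aws ecr describe-pull-through-cache-rules`. The wrapper below is a minimal sketch; the function name is hypothetical, and it assumes the AWS CLI is configured with credentials that may read ECR in the target region.
+
+```bash
+# List the pull-through cache rule registered for a repository prefix.
+# $1: ECR repository prefix, $2: AWS region
+show_cache_rule() {
+  aws ecr describe-pull-through-cache-rules \
+    --ecr-repository-prefixes "$1" \
+    --region "$2"
+}
+```
+
+For the variables defined earlier, this would be invoked as `show_cache_rule "${ECR_CACHE_PREFIX}" "${AWS_REGION}"`.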
+
+Authenticate your local tooling to the ECR registry.
+
+```bash
+aws ecr get-login-password --region ${AWS_REGION} | \
+ docker login --username AWS --password-stdin ${ECR_REGISTRY}
+aws ecr get-login-password --region ${AWS_REGION} | \
+ helm registry login --username AWS --password-stdin ${ECR_REGISTRY}
+```
+
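+Warm the cache by pulling the Vespa images and the Helm chart artifact. This assumes `VESPA_VERSION` is set to the Vespa release you intend to deploy, as described in the [Installation](/en/operations/kubernetes/deployment/installation) guide.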
+
+```bash
+docker pull ${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/kubernetes/vespa:${VESPA_VERSION}
+docker pull ${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/kubernetes/operator:${VESPA_VERSION}
+helm pull oci://${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/helm/vespa-operator --version ${VESPA_VERSION}
+```
+
+Point the installation variables to ECR.
+
+```bash
+export VESPA_IMAGE=${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/kubernetes/vespa
+export VESPA_OPERATOR_IMAGE=${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/kubernetes/operator
+export HELM_CHART_REF=oci://${ECR_REGISTRY}/${ECR_CACHE_PREFIX}/helm/vespa-operator
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/deployment/installation.mdx b/mintlify-docs/en/operations/kubernetes/deployment/installation.mdx
new file mode 100644
index 0000000000..7551a4d5c3
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/deployment/installation.mdx
@@ -0,0 +1,238 @@
+---
+title: "Install Vespa on Kubernetes"
+sidebarTitle: "Installation"
+description: "These steps walk through deploying Vespa using the official Helm chart."
+---
+
+## Requirements
+
+The following tools are required for a smooth deployment.
+
+
+
+
+
+
+
+These instructions assume that your `kubeconfig` is pointing to an active Kubernetes cluster. Refer to the [Getting Started](https://kubernetes.io/docs/setup/) guide to create a Kubernetes cluster. For instructions on deploying Vespa locally on MiniKube, refer to the [Deploy Vespa Locally](/en/reference/applications/deployment) guide.
+
+Vespa on Kubernetes uses a [Custom Resource Definition](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) (CRD) called a `VespaSet`. Users intending to manage the CRD definition by themselves should apply it to the cluster before installation.
+
+The permissions that are needed to run Vespa are listed on the [Permissions](/en/operations/kubernetes/deployment/permissions) page. The Helm Chart will automatically apply a default set of RBAC API resources onto the cluster.
+
+## Setup Registry Access
+
+
+**Note**:
+
+Vespa on Kubernetes is an enterprise feature. You will need access to the images below. Contact us through our support portal to receive an authentication ID and token. For production use, we recommend mirroring these images into your own registry or a well-known internal repository appropriate to your infrastructure.
+
+
+- `VESPA_IMAGE=images.ves.pa/kubernetes/vespa`
+- `VESPA_OPERATOR_IMAGE=images.ves.pa/kubernetes/operator`
+- `HELM_CHART_REF=oci://images.ves.pa/helm/vespa-operator`
+
+We will use this naming convention throughout this guide. The tags for all three images conform to the [Vespa Version](/en/learn/releases) release semantics. We recommend using the latest Vespa release as the default. We will refer to it as `VESPA_VERSION`.
+
+The Vespa Operator and all Vespa components are local to a namespace. We will refer to the namespace as `NAMESPACE` in this guide.
+
+## Deploy the Vespa Operator
+
+Authenticate to the Helm Chart OCI registry. The credentials will be provided by our support team.
+
+```bash
+$ helm registry login images.ves.pa -u $USER -p $TOKEN
+```
+
+Install the Helm Chart onto the namespace. This will deploy the Vespa Operator and apply the `VespaSet` resource definition. Set `image.repository` to `VESPA_OPERATOR_IMAGE` as provided by our support team. The `image.tag` refers to the `VESPA_VERSION`.
+
+```bash
+$ helm install vespa-operator $HELM_CHART_REF --namespace $NAMESPACE --create-namespace --set image.repository=$VESPA_OPERATOR_IMAGE --set image.tag=$VESPA_VERSION
+```
+
+The lifecycle of the CRD can also be managed separately. In that case, apply the CRD specification to the Kubernetes cluster manually before installing the Helm Chart, and pass the `--skip-crds` option to Helm. Our support team can provide the specification if necessary.
+
+```bash
+$ kubectl apply -f vespasets.k8s.ai.vespa-v1.yaml
+$ helm install vespa-operator $HELM_CHART_REF --namespace $NAMESPACE --create-namespace --skip-crds --set image.repository=$VESPA_OPERATOR_IMAGE --set image.tag=$VESPA_VERSION
+```
+
+Ensure that the `Deployment` resource was successfully created, and that the `Vespa Operator` Pod is running.
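+
+One way to check this is to list the Deployments and Pods in the namespace. The helper below is a minimal sketch with a hypothetical name; the exact names of the operator Deployment and Pod depend on the Helm release and are not assumed here.
+
+```bash
+# List Deployments and Pods in the namespace the Helm Chart was installed into.
+# $1: namespace
+show_operator_status() {
+  kubectl get deployments,pods --namespace "$1"
+}
+```
+
+Run it as `show_operator_status "$NAMESPACE"` and confirm that the operator Deployment reports its replicas as ready and that the Pod status is `Running`.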
+
+## Deploy a VespaSet
+
+To set up a `dev` environment in Vespa on Kubernetes, refer to the example on the [Setup Dev Environment](/en/operations/kubernetes/deployment/dev-mode) page.
+
+A `VespaSet` is a quorum of [ConfigServer](/en/operations/self-managed/configuration-server) Pods that manage the lifecycle of Vespa applications. Several examples of `VespaSet` resources are provided in the Helm Chart `samples` directory. An example of a `VespaSet` for an archetypical [Amazon Elastic Kubernetes Service](https://aws.amazon.com/eks/) (EKS) setup is shown below.
+
+```bash
+# vespaset sample for EKS
+$ cat > vespaset.yaml < vespaset.yaml <
+```
+
+## Deploy a Vespa Application
+
+A Vespa application can be deployed through the ConfigServers' ingress endpoint once a quorum has been met. Refer to the [Vespa Sample Applications](https://github.com/vespa-engine/sample-apps) to get started. In the following example, we will use the [Album Recommendation](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation) sample application.
+
+Set up the Vespa CLI to download the Album Recommendation sample to a directory.
+
+```bash
+$ vespa clone album-recommendation myapp && cd myapp
+```
+
+The `Node` resources must be specified for any application package that is deployed on Vespa on Kubernetes. These translate directly to Kubernetes container [resource requests and limits](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/). In a default deployment without any [PodTemplate](/en/operations/kubernetes/custom-overrides-podtemplate) overrides, the requests will equal the limits for a container.
+
+Modify the container and content cluster specifications in the application package, as shown below:
+
+```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+ 2
+
+
+
+
+
+
+
+
+
+```
+
+Enable port-forwarding from the ConfigServer's ingress port `19071` to your local port `19071`. Any ConfigServer Pod can be used.
+
+```bash
+$ vespa config set target local
+$ kubectl -n $NAMESPACE port-forward pod/cfg-1 19071:19071
+```
+
+Deploy and activate the application.
+
+```bash
+$ vespa prepare --target local
+$ while ! vespa --target local activate; do sleep 1; done
+```
+
+The ConfigServers will create the Container, Content, and Cluster-Controller Pods as specified in the application package. The deployment is considered complete once all Pods show the phase `RUNNING` in the `VespaSet` status.
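+
+The status can be inspected with `kubectl`. The sketch below uses a hypothetical helper name and assumes the resource is addressable as `vespasets`, following the CRD file name used earlier in this guide; the layout of the status section itself is not assumed.
+
+```bash
+# Print the VespaSet resources, including their status, followed by the Pods.
+# $1: namespace
+show_vespaset_status() {
+  kubectl get vespasets --namespace "$1" -o yaml
+  kubectl get pods --namespace "$1"
+}
+```
+
+Run it as `show_vespaset_status "$NAMESPACE"` and repeat until every Pod listed in the status has reached the `RUNNING` phase.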
+
+Port-forwarding provides a simple way to ingress to the ConfigServer locally. For other ingress options, refer to the [Configuring the External Access Layer](/en/operations/kubernetes/ingress) page.
+
+## Feed and Query Documents
+
+Feed and query documents by port-forwarding the ConfigServer ingress port and the Dataplane ingress port, then using the Vespa CLI.
+
+```bash
+$ kubectl -n $NAMESPACE port-forward pod/cfg-1 19071:19071
+$ kubectl -n $NAMESPACE port-forward pod/default-100 8080:8080
+$ vespa feed dataset/A-Head-Full-of-Dreams.json
+$ vespa query 'yql=select * from music where true limit 1'
+```
+
+Refer to the [Vespa CLI documentation](/en/reference/clients/vespa-cli/vespa) for the full list of available commands.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/deployment/local-deployment.mdx b/mintlify-docs/en/operations/kubernetes/deployment/local-deployment.mdx
new file mode 100644
index 0000000000..74b0af925d
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/deployment/local-deployment.mdx
@@ -0,0 +1,80 @@
+---
+title: "Deploy Vespa Locally"
+description: "Vespa on Kubernetes can be deployed locally using MiniKube for development and experimental use-cases."
+sidebarTitle: "Minikube Setup"
+---
+
+
+**Note:**
+
+This setup is not recommended for production.
+
+
+Initialize a Minikube cluster with 8 nodes, each with 4GiB of memory and 2 CPUs. Enable Minikube's image registry add-on to allow the Minikube nodes to access the Vespa images. In this example, we use `podman` as the driver.
+
+```bash
+minikube start --nodes 8 --cpus 2 --memory 4GiB --driver=podman --insecure-registry="192.168.49.0/24"
+minikube addons enable registry
+```
+
+Cache the images provided by our support team into the MiniKube registry.
+
+```bash
+echo $VESPAAI_REGISTRY_TOKEN | podman login images.ves.pa \
+ -u "$VESPAAI_REGISTRY_USER" \
+ --password-stdin
+
+podman pull images.ves.pa/kubernetes/vespa:$VESPA_VERSION
+podman pull images.ves.pa/kubernetes/operator:$VESPA_VERSION
+```
+
+Then, push the images to the MiniKube registry. The images will then be accessible from `$(minikube ip):5000`.
+
+```bash
+export MINIKUBE_REGISTRY=$(minikube ip)
+
+podman tag images.ves.pa/kubernetes/vespa:$VESPA_VERSION $MINIKUBE_REGISTRY:5000/localhost/kubernetes/vespa:$VESPA_VERSION
+podman push --tls-verify=false $MINIKUBE_REGISTRY:5000/localhost/kubernetes/vespa:$VESPA_VERSION
+
+podman tag images.ves.pa/kubernetes/operator:$VESPA_VERSION $MINIKUBE_REGISTRY:5000/localhost/kubernetes/operator:$VESPA_VERSION
+podman push --tls-verify=false $MINIKUBE_REGISTRY:5000/localhost/kubernetes/operator:$VESPA_VERSION
+```
+
+We will now use the following environment variables for the rest of the guide to refer to the images.
+
+```bash
+export VESPA_IMAGE=$MINIKUBE_REGISTRY:5000/localhost/kubernetes/vespa
+export VESPA_OPERATOR_IMAGE=$MINIKUBE_REGISTRY:5000/localhost/kubernetes/operator
+```
+
+Then, install the [Local Persistent Volume](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner) Helm Chart. This will allow provisioning Persistent Volumes locally, which is required to run Vespa on Kubernetes. Helm will automatically create a StorageClass called `local-storage`, which should be used as the `StorageClass` for subsequent steps.
+
+```bash
+$ git clone git@github.com:kubernetes-sigs/sig-storage-local-static-provisioner.git
+
+# Install the Helm Chart onto the cluster globally
+$ cd sig-storage-local-static-provisioner
+$ helm install -f helm/examples/baremetal-default-storage.yaml local-volume-provisioner --namespace kube-system ./helm/provisioner
+```
+
+Create several usable volumes on each MiniKube Node. We recommend at least 4 per node for a smooth deployment.
+
+```bash
+# Create several volumes on each Minikube node.
+$ for n in minikube minikube-m02 minikube-m03 minikube-m04 minikube-m05 minikube-m06 minikube-m07 minikube-m08; do
+ echo "==> $n"
+ minikube ssh -n "$n" -- '
+ set -e
+ for i in 1 2 3 4; do
+ sudo mkdir -p /mnt/disks/vol$i
+ if ! mountpoint -q /mnt/disks/vol$i; then
+ sudo mount --bind /mnt/disks/vol$i /mnt/disks/vol$i
+ fi
+ done
+ echo "Mounted:"
+ mount | grep -E "/mnt/disks/vol[1-4]" || true
+ '
+done
+```
+
+Once the images are available in the MiniKube registry, proceed to the [Installation](/en/operations/kubernetes/deployment/installation) guide, using `local-storage` as the `storageClass` and `NONE` as the `endpointType`.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/deployment/permissions.mdx b/mintlify-docs/en/operations/kubernetes/deployment/permissions.mdx
new file mode 100644
index 0000000000..9ef3c15183
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/deployment/permissions.mdx
@@ -0,0 +1,20 @@
+---
+title: "Permissions"
+---
+
+The Vespa Operator requires the following permissions within the namespace. These permissions are listed by Kubernetes [API verbs](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) per resource.
+
+| Kubernetes Resource | Required Permissions |
+| --- | --- |
+| CustomResourceDefinitions | create, get, list, watch |
+| VespaSet | get, list, watch, create, update, patch, delete |
+| VespaSet Subresources | `vespasets/status`: update, patch; `vespasets/finalizers`: update |
+| ConfigMaps | get, list, watch, create, update, patch, delete |
+| Services | get, list, watch, create, update, patch, delete |
+| Pods | get, list, watch, create, update, patch, delete |
+| Pod Execution | get, create |
+| Events | create, patch |
+| PersistentVolumeClaims | get, list, watch, create, update, patch, delete |
+| ServiceAccounts | get, list, watch, create, update, patch, delete |
+| Roles | get, list, watch, create, update, patch, delete |
+| RoleBindings | get, list, watch, create, update, patch, delete |
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/ingress.mdx b/mintlify-docs/en/operations/kubernetes/ingress.mdx
new file mode 100644
index 0000000000..50b58e5cc6
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/ingress.mdx
@@ -0,0 +1,136 @@
+---
+title: "Configure External Access Layer"
+---
+
+The Vespa Operator automatically provisions Kubernetes `Service` resources to enable external access for feeding and querying data. This behavior is controlled by the `VespaSet` Custom Resource configuration.
+
+Load balancers are provisioned exclusively for Container clusters. Content clusters communicate internally and do not require external load balancing services. The type of service provisioned is determined by the `spec.ingress.endpointType` field in the `VespaSet`.
+
+## Supported Endpoint Types
+
+The operator supports four endpoint types to cover different infrastructure requirements.
+
+| Endpoint Type | Kubernetes Service Type | Use Case |
+| --- | --- | --- |
+| `LOAD_BALANCER` | `LoadBalancer` | Provision the cloud-native (AWS, GCP, Azure) load-balancer. |
+| `NODE_PORT` | `NodePort` | Expose a static port across every worker node, allowing external traffic to access the cluster from any node's IP. |
+| `CLUSTER_IP` | `ClusterIP` | Each Container Pod will expose an internal IP address. Should not be used for production use-cases. |
+| `NONE` | N/A | An external access layer will not be provisioned. Intended for custom networking setups (Istio, Ingress Controllers) where no automatic service is desired. |
+
+## LOAD_BALANCER
+
+This is the recommended configuration for production deployments on cloud providers (EKS, GKE, AKS). The operator creates a standard Kubernetes `LoadBalancer` service, triggering the cloud provider to provision a managed load balancer (e.g., AWS NLB).
+
+**Configuration:**
+
+```yaml
+ingress:
+  endpointType: LOAD_BALANCER
+```
+
+On AWS, the ConfigServer automatically applies the annotation `service.beta.kubernetes.io/aws-load-balancer-internal: "true"` to all Container pods. This provisions an **internal** Network Load Balancer (NLB) accessible only within the VPC where the EKS cluster nodes reside.
+
+## NODE_PORT
+
+The `NODE_PORT` type exposes the Vespa container cluster on a specific port (range 30000-32767) across all Kubernetes worker nodes.
+
+**Configuration:**
+
+```yaml
+ingress:
+  endpointType: NODE_PORT
+```
+
+When this option is set, Kubernetes opens a static port on every worker node. External traffic can reach the application via `<NodeIP>:<NodePort>`. Note that unlike `LOAD_BALANCER`, this does not provide health checks at the entry point level: if the worker node serving a connection crashes, the connection will simply time out or fail. This option additionally requires all worker nodes to expose an External IP.
+
+To use the `NODE_PORT` service, find the assigned port.
+
+```bash
+$ kubectl get service lb-default -n $NAMESPACE
+
+NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
+lb-default   NodePort   10.100.150.25   <none>        80:31942/TCP   5m
+```
+
+Get the list of nodes and look for their External IP addresses.
+
+```bash
+$ kubectl get nodes -o wide
+
+NAME                                         STATUS   ROLES    AGE   VERSION               INTERNAL-IP    EXTERNAL-IP     OS-IMAGE         KERNEL-VERSION
+ip-192-168-3-50.us-east-2.compute.internal   Ready    <none>   10d   v1.27.3-eks-a5565ad   192.168.3.50   18.221.100.45   Amazon Linux 2   5.10.184-175.731.amzn2.x86_64
+ip-192-168-3-51.us-east-2.compute.internal   Ready    <none>   10d   v1.27.3-eks-a5565ad   192.168.3.51   3.142.200.10    Amazon Linux 2   5.10.184-175.731.amzn2.x86_64
+```
+
+Choose any External IP and combine the IP and port to access the service.
+
+```js expandable
+$ curl http://18.221.100.45:31942/state/v1/health
+
+{
+ "time" : 1769981985754,
+ "status" : {
+ "code" : "up"
+ },
+ "metrics" : {
+ "snapshot" : {
+ "from" : 1.769981924895E9,
+ "to" : 1.769981984895E9
+ },
+ "values" : [ {
+ "name" : "requestsPerSecond",
+ "values" : {
+ "count" : 19,
+ "rate" : 0.31666666666666665
+ }
+ }, {
+ "name" : "latencySeconds",
+ "values" : {
+ "average" : 0.009578947368421053,
+ "sum" : 0.182,
+ "count" : 19,
+ "last" : 0.003,
+ "max" : 0.057,
+ "min" : 0.0,
+ "rate" : 0.31666666666666665
+ }
+ } ]
+ }
+}
+```
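+
+The port and address lookups above can be combined into one step. The sketch below is a minimal example with a hypothetical helper name; it assumes the Service exposes its first port as the NodePort and simply uses the External IP of the first node in the list.
+
+```bash
+# Build the NodePort URL for a Service from the assigned port and a node's External IP.
+# $1: namespace, $2: Service name (lb-default in the example above)
+node_port_url() {
+  port=$(kubectl get service "$2" --namespace "$1" \
+    -o jsonpath='{.spec.ports[0].nodePort}')
+  ip=$(kubectl get nodes \
+    -o jsonpath='{.items[0].status.addresses[?(@.type=="ExternalIP")].address}')
+  echo "http://${ip}:${port}"
+}
+```
+
+The result can be passed straight to `curl`, for example `curl "$(node_port_url "$NAMESPACE" lb-default)/state/v1/health"`.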
+
+## CLUSTER_IP
+
+This type restricts access to within the Kubernetes cluster. It provides a stable internal IP and DNS name (e.g., `lb-default.vespa.svc.cluster.local`) but assigns no external IP.
+
+**Configuration:**
+
+```yaml
+ingress:
+  endpointType: CLUSTER_IP
+```
+
+The `CLUSTER_IP` service is ideal for architectures where the clients (e.g., front-end applications or ingestion services) run inside the same Kubernetes cluster as Vespa.
+
+## NONE
+
+This option disables automatic Service provisioning. Use this if you intend to manually define `Ingress` resources, use a Service Mesh (like Istio or Linkerd), or have complex networking requirements not covered by the standard types.
+
+**Configuration:**
+
+```yaml
+ingress:
+  endpointType: NONE
+```
+
+## Traffic Routing & Labeling
+
+To ensure zero-downtime deployments, the ConfigServer manages traffic routing dynamically using Kubernetes labels. The created Services use the selector `vespa.ai/tenant-lb: backend`. When a Pod is provisioned, this label is automatically attached to it.
+
+During a rolling upgrade, the label is removed from the terminating Pod(s) before they are shut down. This provides a window for the remaining traffic to drain before the Pod is upgraded.
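+
+To see which Pods currently receive traffic, list the Pods that carry the selector label. The helper below is a minimal sketch with a hypothetical name; the label key and value are taken from the selector described above.
+
+```bash
+# List the Pods that the provisioned Services currently route traffic to.
+# $1: namespace
+list_backend_pods() {
+  kubectl get pods --namespace "$1" -l vespa.ai/tenant-lb=backend
+}
+```
+
+Running `list_backend_pods "$NAMESPACE"` during a rolling upgrade should show terminating Pods dropping out of the list before they shut down.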
+
+
+**Note**:
+
+The Service exposes port **80** (plaintext) and **443** (TLS) externally, mapping them to the container's port 4443.
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/logging.mdx b/mintlify-docs/en/operations/kubernetes/logging.mdx
new file mode 100644
index 0000000000..ebcdf5b5d2
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/logging.mdx
@@ -0,0 +1,152 @@
+---
+title: "Configure Log Collection"
+description: "Use the Fluent Bit Operator to collect logs and forward them to Grafana Cloud Loki."
+---
+
+## 1. Install Fluent Bit Operator
+
+Install the Fluent Bit Operator in a dedicated `logging` namespace.
+
+```bash
+$ helm repo add fluent https://fluent.github.io/helm-charts
+$ helm repo update
+$ kubectl create namespace logging
+$ helm install fluent-operator fluent/fluent-operator --namespace logging --set operator.logLevel=debug
+```
+
+Verify the installation by ensuring the operator Pod is running:
+
+```bash
+$ kubectl get pods -n logging
+```
+
+## 2. Configure Loki Credentials
+
+To forward logs to Grafana Cloud Loki, you must create a Kubernetes Secret containing your credentials. Obtain your User ID and API Token from the Grafana Cloud Portal under *Connections → Loki*.
+
+```bash
+$ kubectl create secret generic grafana-cloud-loki -n logging --from-literal=username=$USER_ID --from-literal=password=$API_TOKEN
+```
+
+## 3. Deploy Fluent Bit Configuration
+
+The Fluent Operator uses Custom Resources to define the logging pipeline. The following configuration sets up a **ClusterInput** to tail container logs, a **ClusterFilter** to add Kubernetes metadata, and a **ClusterOutput** to ship logs to Loki.
+
+Save the following as `fluentbit-logging.yaml`. **Note:** Replace `logs-prod-006.grafana.net` with your specific Loki endpoint.
+
+```yaml expandable
+apiVersion: fluentbit.fluent.io/v1alpha2
+kind: ClusterFluentBitConfig
+metadata:
+  name: fluent-bit-config
+spec:
+  service:
+    httpServer: true
+    parsersFile: parsers.conf
+  inputSelector:
+    matchLabels:
+      fluentbit.fluent.io/enabled: "true"
+  filterSelector:
+    matchLabels:
+      fluentbit.fluent.io/enabled: "true"
+  outputSelector:
+    matchLabels:
+      fluentbit.fluent.io/enabled: "true"
+---
+apiVersion: fluentbit.fluent.io/v1alpha2
+kind: ClusterInput
+metadata:
+  name: kube-containers
+  labels:
+    fluentbit.fluent.io/enabled: "true"
+spec:
+  tail:
+    tag: kube.*
+    path: /var/log/containers/*.log
+    parser: cri
+    readFromHead: false
+    memBufLimit: 5MB
+---
+apiVersion: fluentbit.fluent.io/v1alpha2
+kind: ClusterFilter
+metadata:
+  name: k8s-meta
+  labels:
+    fluentbit.fluent.io/enabled: "true"
+spec:
+  match: "kube.*"
+  filters:
+    - kubernetes:
+        mergeLog: true
+        keepLog: false
+        labels: true
+        annotations: true
+---
+apiVersion: fluentbit.fluent.io/v1alpha2
+kind: ClusterOutput
+metadata:
+  name: loki
+  labels:
+    fluentbit.fluent.io/enabled: "true"
+spec:
+  match: "kube.*"
+  loki:
+    host: logs-prod-006.grafana.net
+    port: 443
+    tls:
+      verify: false
+    httpUser:
+      valueFrom:
+        secretKeyRef:
+          name: grafana-cloud-loki
+          key: username
+    httpPassword:
+      valueFrom:
+        secretKeyRef:
+          name: grafana-cloud-loki
+          key: password
+    labels:
+      - job=fluentbit
+      - cluster=minikube
+    autoKubernetesLabels: "off"
+---
+apiVersion: fluentbit.fluent.io/v1alpha2
+kind: FluentBit
+metadata:
+  name: fluent-bit
+  namespace: logging
+spec:
+  fluentBitConfigName: fluent-bit-config
+  image: ghcr.io/fluent/fluent-operator/fluent-bit:3.1.4
+  tolerations:
+    - operator: Exists
+```
+
+Apply the configuration to your cluster:
+
+```bash
+$ kubectl apply -f fluentbit-logging.yaml
+```
+
+## 4. Query Logs
+
+Once deployed, Fluent Bit will run as a `DaemonSet` on every node. You can query the logs using LogQL in Grafana Explore.
+
+Check the Fluent Bit logs to ensure there are no authentication errors (HTTP 401):
+
+```bash
+$ kubectl logs -n logging -l app.kubernetes.io/name=fluent-bit --tail=50
+```
+
+Use these LogQL queries to inspect Vespa components specifically:
+
+```bash
+# Filter logs for the Config Server
+{cluster="minikube", namespace_name="default", pod_name=~"cfg-.*"}
+
+# Filter logs for specific containers (e.g., query container)
+{cluster="minikube", container_name="vespa"}
+
+# Search for errors across all Vespa pods
+{cluster="minikube"} |= "error"
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/monitoring.mdx b/mintlify-docs/en/operations/kubernetes/monitoring.mdx
new file mode 100644
index 0000000000..26c1f7b079
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/monitoring.mdx
@@ -0,0 +1,194 @@
+---
+title: "Monitor a Vespa on Kubernetes Deployment"
+---
+
+Use the Prometheus Operator to collect metrics from a Vespa on Kubernetes deployment. This guide covers the installation of the monitoring stack, configuration of `PodMonitor` resources for Vespa components, and forwarding metrics to Grafana Cloud.
+
+## Prerequisites
+
+- A Kubernetes cluster (EKS, GKE, AKS, or Minikube).
+- [Helm CLI](https://helm.sh/docs/intro/install/)
+- Kubernetes Command Line Tool ([kubectl](https://kubernetes.io/docs/reference/kubectl/))
+- A Grafana Cloud account
+
+## 1. Install Prometheus Operator
+
+The recommended way to install Prometheus on Kubernetes is via the [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) Helm chart. Add the repository and create a monitoring namespace.
+
+```bash
+$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+$ helm repo update
+$ kubectl create namespace monitoring
+```
+
+### Configure Grafana Cloud Credentials
+
+If you intend to forward metrics to Grafana Cloud, create a Kubernetes Secret with your credentials. Retrieve your **Instance ID** (User) and **API Token** (Password) from the Grafana Cloud Portal under *Configure Prometheus*.
+
+```bash
+$ kubectl create secret generic grafana-cloud-prometheus -n monitoring --from-literal=username=$INSTANCE_ID --from-literal=password=$API_TOKEN
+```
+
+### Configure Helm Values
+
+Create a `prometheus-values.yaml` file. This configuration enables remote writing to Grafana Cloud, configures the Prometheus Operator to select all `PodMonitors`, and disables the local Grafana instance.
+
+```yaml expandable
+prometheus:
+  prometheusSpec:
+    # Select all PodMonitors and ServiceMonitors, regardless of Helm release labels
+    podMonitorSelectorNilUsesHelmValues: false
+    serviceMonitorSelectorNilUsesHelmValues: false
+
+    # Remote write configuration for Grafana Cloud
+    remoteWrite:
+      - url: https://prometheus-prod-XX-prod-XX.grafana.net/api/prom/push
+        basicAuth:
+          username:
+            name: grafana-cloud-prometheus
+            key: username
+          password:
+            name: grafana-cloud-prometheus
+            key: password
+        writeRelabelConfigs:
+          - sourceLabels: [__address__]
+            targetLabel: cluster
+            replacement: my-cluster-name
+
+# Disable local Grafana
+grafana:
+  enabled: false
+
+# Enable Alertmanager and Kube State Metrics
+alertmanager:
+  enabled: true
+kubeStateMetrics:
+  enabled: true
+```
+
+Install the stack using Helm:
+
+```bash
+$ helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --values prometheus-values.yaml
+```
+
+## 2. Configure PodMonitors
+
+Vespa exposes metrics on specific ports that differ from standard web traffic ports. We use the `PodMonitor` Custom Resource to define how Prometheus should scrape these endpoints.
+
+### Monitor ConfigServer Pods
+
+ConfigServers expose metrics on port **19071** at the path `/configserver-metrics`. Apply the following configuration to scrape these metrics.
+
+```yaml expandable
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: vespa-configserver
+  namespace: $NAMESPACE
+  labels:
+    release: prometheus # Required to be picked up by the operator
+spec:
+  selector:
+    matchLabels:
+      app: vespa-configserver
+  podMetricsEndpoints:
+    - targetPort: 19071
+      path: /configserver-metrics
+      interval: 30s
+      scheme: http
+      params:
+        format: ['prometheus']
+      relabelings:
+        # Map Kubernetes pod name to the 'pod' label
+        - sourceLabels: [__meta_kubernetes_pod_name]
+          targetLabel: pod
+        - targetLabel: vespa_role
+          replacement: configserver
+```
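+
+Note that `kubectl apply` does not expand shell variables such as `$NAMESPACE` inside a manifest. As a minimal sketch, assuming the manifest above was saved as `vespa-configserver-podmonitor.yaml` (a hypothetical filename) and that `envsubst` from GNU gettext is installed, the placeholder could be substituted at apply time:
+
+```bash
+# Assumption: NAMESPACE holds the namespace where the ConfigServer Pods run
+$ export NAMESPACE=default
+# Replace only $NAMESPACE in the manifest, then apply the result
+$ envsubst '$NAMESPACE' < vespa-configserver-podmonitor.yaml | kubectl apply -f -
+```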
+
+### Monitor Application Pods
+
+Container and Content Pods expose metrics on the metrics proxy port **19092** at `/prometheus/v1/values`. The following example defines a `PodMonitor` for Vespa application pods.
+
+```yaml expandable
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: vespa-application
+  namespace: default
+  labels:
+    release: prometheus
+spec:
+  selector:
+    matchExpressions:
+      # Selects pods that are part of a Vespa application (feed, query, content)
+      - key: vespa.ai/cluster-name
+        operator: Exists
+  podMetricsEndpoints:
+    - targetPort: 19092
+      path: /prometheus/v1/values
+      interval: 30s
+      scheme: http
+      relabelings:
+        - sourceLabels: [__meta_kubernetes_pod_name]
+          targetLabel: pod
+        - sourceLabels: [__meta_kubernetes_namespace]
+          targetLabel: namespace
+        # Extract the role from the pod name or labels if needed
+        - targetLabel: vespa_role
+          replacement: node
+```
+
+## 3. Verify Metrics
+
+Once the `PodMonitors` are applied, verify that Prometheus is successfully scraping the targets.
+
+### Check Targets Locally
+
+Port-forward the Prometheus UI to your local machine:
+
+```bash
+$ kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090
+```
+
+Navigate to [http://localhost:9090/targets](http://localhost:9090/targets). You should see targets named `default/vespa-configserver` and `default/vespa-application` in the **UP** state.
+
+### Query Metrics
+
+You can verify the data using PromQL queries in the Prometheus UI or Grafana Explore:
+
+```js
+# Check availability of Config Servers
+up{vespa_role="configserver"}
+
+# Retrieve average maintenance duration
+vespa_maintenance_duration_average
+
+# List all metrics coming from Vespa
+{job=~"default/vespa-.*"}
+```
+
+## Troubleshooting
+
+**Targets show `No active targets`**:
+
+This indicates the `PodMonitor` selector does not match any Pods. Verify the labels on your Vespa pods:
+
+```bash
+$ kubectl get pods -n $NAMESPACE --show-labels
+```
+
+Ensure the `selector.matchLabels` in your `PodMonitor` YAML matches the labels shown in the output above.
+
+**Targets are in `DOWN` state**:
+
+This usually means Prometheus cannot reach the metric endpoint. Verify that the metrics are exposed on the expected port by running a curl command from within the cluster:
+
+```bash
+$ kubectl run curl-test -n $NAMESPACE --image=curlimages/curl -it --rm -- curl "http://cfg-0.$NAMESPACE.svc.cluster.local:19071/configserver-metrics?format=prometheus"
+```
+
+**Network Policies**:
+
+If you use `NetworkPolicy` to restrict traffic, ensure you have a policy allowing ingress traffic from the `monitoring` namespace to the `$NAMESPACE` namespace on ports 19071 and 19092.
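+
+As a minimal sketch of such a policy, assuming the `monitoring` namespace carries the `kubernetes.io/metadata.name` label that recent Kubernetes versions set automatically, and using a hypothetical policy name:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: NetworkPolicy
+metadata:
+  name: allow-prometheus-scrape # Hypothetical name
+  namespace: $NAMESPACE
+spec:
+  # An empty selector applies the policy to all Pods in the namespace; narrow it as needed
+  podSelector: {}
+  policyTypes:
+    - Ingress
+  ingress:
+    - from:
+        - namespaceSelector:
+            matchLabels:
+              kubernetes.io/metadata.name: monitoring
+      ports:
+        - protocol: TCP
+          port: 19071
+        - protocol: TCP
+          port: 19092
+```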
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/operations/delete-vespaset.mdx b/mintlify-docs/en/operations/kubernetes/operations/delete-vespaset.mdx
new file mode 100644
index 0000000000..29ff4d17fb
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/operations/delete-vespaset.mdx
@@ -0,0 +1,92 @@
+---
+title: "Delete a VespaSet"
+sidebarTitle: "Delete a VespaSet"
+---
+
+This page provides instructions for deleting a VespaSet.
+
+The ConfigServer and Application Pods use [Kubernetes PreStop Hooks](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/) to prevent their immediate removal when evicted voluntarily or deleted involuntarily. In production cases, these finalizers are paramount for ensuring proper data redistribution between the Content Pods. However, they also have the adverse side effect of making Vespa difficult to fully uninstall.
+
+Follow the steps below to fully uninstall your setup. This example assumes the Pods were created for the [Album Recommendation](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation) sample application, the Pods are scheduled in the `$NAMESPACE` namespace, and a `VespaSet` called `vespaset-test` was deployed.
+
+
+ **Important:** These instructions should not be run on production serving environments.
+
+
+## Steps
+
+### Delete Pods
+
+Run `vespa-stop-configserver` on all ConfigServer Pods. This will ensure that any finalizers will exit immediately, since all finalizers ultimately route to a ConfigServer.
+
+```bash
+$ kubectl exec -n $NAMESPACE cfg-1 -- vespa-stop-configserver
+$ kubectl exec -n $NAMESPACE cfg-2 -- vespa-stop-configserver
+$ kubectl exec -n $NAMESPACE cfg-3 -- vespa-stop-configserver
+```
+
+Run `vespa-stop-services` on all Application Pods.
+
+```bash
+$ kubectl exec -n $NAMESPACE default-100 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE default-101 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE music-102 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE music-103 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE cluster-controllers-104 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE cluster-controllers-105 -- vespa-stop-services
+$ kubectl exec -n $NAMESPACE cluster-controllers-106 -- vespa-stop-services
+```
+
+Delete all the ConfigServer and Application Pods. The finalizers will exit immediately.
+
+```bash
+$ kubectl delete pod -n $NAMESPACE cfg-1 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE cfg-2 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE cfg-3 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE default-100 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE default-101 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE music-102 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE music-103 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE cluster-controllers-104 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE cluster-controllers-105 --grace-period=0 --force --ignore-not-found
+$ kubectl delete pod -n $NAMESPACE cluster-controllers-106 --grace-period=0 --force --ignore-not-found
+```
+
+### Delete Persistent Volume Claims
+
+Delete all PersistentVolumeClaims (PVCs) from the namespace. This should be performed after all Pods have been deleted, to ensure PVC deletion does not hang on a Pod binding.
+
+```bash
+$ kubectl delete pvc -n $NAMESPACE cfg-1-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE cfg-2-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE cfg-3-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE default-100-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE default-101-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE music-102-data --ignore-not-found
+$ kubectl delete pvc -n $NAMESPACE music-103-data --ignore-not-found
+```
+
+### Delete ConfigMaps
+
+Delete all remaining ConfigMaps in the namespace.
+
+```bash
+$ kubectl delete configmap -n $NAMESPACE vespa-config --ignore-not-found
+```
+
+### Delete Services
+
+Delete any Services and other networking components that may have been set up by the operator.
+
+```bash
+$ kubectl delete svc -n $NAMESPACE x --ignore-not-found
+$ kubectl delete svc -n $NAMESPACE cfg-internal --ignore-not-found
+```
+
+### Delete VespaSet
+
+Delete the `VespaSet` resource. With all Pods and services already removed, the operator's finalizer will exit immediately.
+
+```bash
+$ kubectl delete vespaset -n $NAMESPACE vespaset-test --ignore-not-found
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/operations/monitoring.mdx b/mintlify-docs/en/operations/kubernetes/operations/monitoring.mdx
new file mode 100644
index 0000000000..6a9df01a7b
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/operations/monitoring.mdx
@@ -0,0 +1,195 @@
+---
+title: "Monitor a Vespa on Kubernetes Deployment"
+sidebarTitle: "Monitor a Vespa on Kubernetes Deployment"
+---
+
+Use the Prometheus Operator to collect metrics from a Vespa on Kubernetes deployment. This guide covers the installation of the monitoring stack, configuration of `PodMonitor` resources for Vespa components, and forwarding metrics to Grafana Cloud.
+
+## Prerequisites
+
+- A Kubernetes cluster (EKS, GKE, AKS, or Minikube).
+- [Helm CLI](https://helm.sh/docs/intro/install/)
+- Kubernetes Command Line Tool ([kubectl](https://kubernetes.io/docs/reference/kubectl/))
+- A Grafana Cloud account
+
+## 1. Install Prometheus Operator
+
+The recommended way to install Prometheus on Kubernetes is via the [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) Helm chart. Add the repository and create a monitoring namespace.
+
+```bash
+$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
+$ helm repo update
+$ kubectl create namespace monitoring
+```
+
+### Configure Grafana Cloud Credentials
+
+If you intend to forward metrics to Grafana Cloud, create a Kubernetes Secret with your credentials. Retrieve your **Instance ID** (User) and **API Token** (Password) from the Grafana Cloud Portal under _Configure Prometheus_.
+
+```bash
+$ kubectl create secret generic grafana-cloud-prometheus -n monitoring --from-literal=username=$INSTANCE_ID --from-literal=password=$API_TOKEN
+```
+
+### Configure Helm Values
+
+Create a `prometheus-values.yaml` file. This configuration enables remote writing to Grafana Cloud, configures the Prometheus Operator to select all `PodMonitors`, and disables the local Grafana instance.
+
+```yaml expandable
+prometheus:
+  prometheusSpec:
+    # Select all PodMonitors and ServiceMonitors, regardless of Helm release labels
+    podMonitorSelectorNilUsesHelmValues: false
+    serviceMonitorSelectorNilUsesHelmValues: false
+
+    # Remote write configuration for Grafana Cloud
+    remoteWrite:
+      - url: https://prometheus-prod-XX-prod-XX.grafana.net/api/prom/push
+        basicAuth:
+          username:
+            name: grafana-cloud-prometheus
+            key: username
+          password:
+            name: grafana-cloud-prometheus
+            key: password
+        writeRelabelConfigs:
+          - sourceLabels: [__address__]
+            targetLabel: cluster
+            replacement: my-cluster-name
+
+# Disable local Grafana
+grafana:
+  enabled: false
+
+# Enable Alertmanager and Kube State Metrics
+alertmanager:
+  enabled: true
+kubeStateMetrics:
+  enabled: true
+```
+
+Install the stack using Helm:
+
+```bash
+$ helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring --values prometheus-values.yaml
+```
+
+## 2. Configure PodMonitors
+
+Vespa exposes metrics on specific ports that differ from standard web traffic ports. We use the `PodMonitor` Custom Resource to define how Prometheus should scrape these endpoints.
+
+### Monitor ConfigServer Pods
+
+ConfigServers expose metrics on port **19071** at the path `/configserver-metrics`. Apply the following configuration to scrape these metrics.
+
+```yaml expandable
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: vespa-configserver
+  namespace: $NAMESPACE
+  labels:
+    release: prometheus # Required to be picked up by the operator
+spec:
+  selector:
+    matchLabels:
+      app: vespa-configserver
+  podMetricsEndpoints:
+    - targetPort: 19071
+      path: /configserver-metrics
+      interval: 30s
+      scheme: http
+      params:
+        format: ['prometheus']
+      relabelings:
+        # Map Kubernetes pod name to the 'pod' label
+        - sourceLabels: [__meta_kubernetes_pod_name]
+          targetLabel: pod
+        - targetLabel: vespa_role
+          replacement: configserver
+```
+
+### Monitor Application Pods
+
+Container and Content Pods expose metrics on the metrics proxy port **19092** at `/prometheus/v1/values`. The following example defines a `PodMonitor` for Vespa application pods.
+
+```yaml expandable
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: vespa-application
+  namespace: default
+  labels:
+    release: prometheus
+spec:
+  selector:
+    matchExpressions:
+      # Selects pods that are part of a Vespa application (feed, query, content)
+      - key: vespa.ai/cluster-name
+        operator: Exists
+  podMetricsEndpoints:
+    - targetPort: 19092
+      path: /prometheus/v1/values
+      interval: 30s
+      scheme: http
+      relabelings:
+        - sourceLabels: [__meta_kubernetes_pod_name]
+          targetLabel: pod
+        - sourceLabels: [__meta_kubernetes_namespace]
+          targetLabel: namespace
+        # Extract the role from the pod name or labels if needed
+        - targetLabel: vespa_role
+          replacement: node
+```
+
+## 3. Verify Metrics
+
+Once the `PodMonitors` are applied, verify that Prometheus is successfully scraping the targets.
+
+### Check Targets Locally
+
+Port-forward the Prometheus UI to your local machine:
+
+```bash
+$ kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090
+```
+
+Navigate to [http://localhost:9090/targets](http://localhost:9090/targets). You should see targets named `default/vespa-configserver` and `default/vespa-application` in the **UP** state.
+
+### Query Metrics
+
+You can verify the data using PromQL queries in the Prometheus UI or Grafana Explore:
+
+```js
+# Check availability of Config Servers
+up{vespa_role="configserver"}
+
+# Retrieve average maintenance duration
+vespa_maintenance_duration_average
+
+# List all metrics coming from Vespa
+{job=~"default/vespa-.*"}
+```
+
+## Troubleshooting
+
+**Targets show `No active targets`**:
+
+This indicates the `PodMonitor` selector does not match any Pods. Verify the labels on your Vespa pods:
+
+```bash
+$ kubectl get pods -n $NAMESPACE --show-labels
+```
+
+Ensure the `selector.matchLabels` in your `PodMonitor` YAML matches the labels shown in the output above.
+
+**Targets are in `DOWN` state**:
+
+This usually means Prometheus cannot reach the metric endpoint. Verify that the metrics are exposed on the expected port by running a curl command from within the cluster:
+
+```bash
+$ kubectl run curl-test -n $NAMESPACE --image=curlimages/curl -it --rm -- curl "http://cfg-0.$NAMESPACE.svc.cluster.local:19071/configserver-metrics?format=prometheus"
+```
+
+**Network Policies**:
+
+If you use `NetworkPolicy` to restrict traffic, ensure you have a policy allowing ingress traffic from the `monitoring` namespace to the `$NAMESPACE` namespace on ports 19071 and 19092.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/operations/operations.mdx b/mintlify-docs/en/operations/kubernetes/operations/operations.mdx
new file mode 100644
index 0000000000..11605c42ec
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/operations/operations.mdx
@@ -0,0 +1,62 @@
+---
+title: "Lifecycle Operations for Vespa on Kubernetes"
+sidebarTitle: "Operations"
+---
+The ConfigServer and Vespa Application Pods have built-in resilience and recovery capabilities; they are automatically recovered during failures and gracefully shut down during maintenance or scaling operations to preserve data integrity.
+
+### Automatic Recovery
+
+Vespa relies on standard Kubernetes controllers to detect and restart crashed Pods. If a container exits unexpectedly (e.g., OOMKilled or application crash), the kubelet will automatically restart it.
+
+However, the ConfigServers track the health history of every Pod. To prevent a "crash loop" from causing cascading failures or constantly churning resources, the system implements a strict throttling mechanism. The ConfigServers allow a maximum of 2 involuntary Pod disruptions per 24-hour period for a given Vespa Application. If this limit is exceeded, the ConfigServer stops automatically failing these Pods and will require human intervention to investigate the root cause.
+
+### Graceful Shutdown
+
+To prevent query failures or data loss during termination, a [PreStop Hook](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/) is placed on every ConfigServer and Vespa Application Pod. During a voluntary disruption, this hook ensures that existing traffic is drained and that data is flushed before killing the Pod.
+
+Two types of disruptions exist in Kubernetes:
+
+| Type | Scenario | Behavior |
+| --- | --- | --- |
+| **Voluntary Disruption** | Scaling down, rolling upgrades, or node maintenance. | The preStop hook detects a voluntary disruption, stops the Vespa Container cluster from accepting new traffic, flushes in-memory data to disk for Content clusters, and ensures a clean exit before the Pod is deleted. |
+| **Involuntary Disruption** | Node hardware failure, kernel panic, or eviction. | Kubernetes initiates the termination. The preStop hook attempts to run to flush data and close connections. However, if the Pod is lost abruptly, the hook cannot run, and recovery relies on Vespa's data replication. |
+
+### Pod Disruption Budget
+
+Defining a `PodDisruptionBudget` (PDB) is not supported for Vespa on Kubernetes. The ConfigServers will override any PDB with their own orchestration policy.
+
+### Application Pod Resources
+
+For Vespa Application Pods, the resources for each Pod, the number of Pods in a Vespa cluster, and the group configuration can be updated through the `` element in the application package. Refer to the [specification](/en/reference/applications/services/services) for more details.
+
+### ConfigServer Pod Resources
+
+ConfigServer Pod resources can be configured by overriding the `vespa` container's resource specification via the PodTemplate in the `VespaSet`. The Config Server deduces its heap size from the Pod cgroup limits, which are derived from the `requests` and `limits` set on the Pod. Setting requests and limits to the same value is recommended to ensure the heap size is deduced correctly.
+
+Horizontally scaling the replica count for ConfigServer Pods is not supported.
+
+```yaml
+apiVersion: k8s.ai.vespa/v1
+kind: VespaSet
+metadata:
+  name: sample-vespaset
+spec:
+  configServer:
+    image: "$VESPA_IMAGE"
+    storageClass: "gp3"
+    podTemplate:
+      spec:
+        containers:
+          - name: vespa
+            resources:
+              requests:
+                cpu: "4"
+                memory: "8Gi"
+              limits:
+                cpu: "4"
+                memory: "8Gi"
+```
+
+### Autoscaling
+
+Vespa on Kubernetes provides autoscaling through ranges specified in the `resource` elements in the application package. Refer to the [Autoscaling](/en/operations/autoscaling) guide for more details.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/operations/upgrades.mdx b/mintlify-docs/en/operations/kubernetes/operations/upgrades.mdx
new file mode 100644
index 0000000000..c417904ea8
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/operations/upgrades.mdx
@@ -0,0 +1,196 @@
+---
+title: "Upgrades"
+sidebarTitle: "Upgrade Vespa on Kubernetes"
+---
+
+Vespa on Kubernetes supports zero-downtime rolling upgrades. An upgrade involves upgrading the `vespa-operator` via the Helm chart and the ConfigServer and Application (Container and Content) Pods through the `VespaSet` resource.
+
+We do not support version drift between the `vespa-operator` and the `VespaSet`. Accordingly, upgrades should be planned so that all components are updated together. To ensure availability, perform them in the order shown in this guide.
+
+## Update the CRD
+
+Some upgrades may introduce changes to the `VespaSet` CRD definition. These changes should be applied to the cluster before performing the upgrade. As a rule of thumb, we recommend executing this before every upgrade procedure.
+
+Helm does not manage the lifecycle of the CRD after it is installed (see [the official documentation](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/)). As a result, CRD updates must be handled manually. With the official Helm chart for Vespa on Kubernetes, this can be done by extracting the CRD definition from the OCI package and applying it directly using `kubectl`.
+
+```bash
+$ helm show crds $HELM_CHART_REF --version $VESPA_VERSION > vespaset-crd.yaml
+$ kubectl apply -f vespaset-crd.yaml
+```
+
+## Upgrade the Vespa Operator
+
+The operator can be upgraded through Helm by running `helm upgrade` with the new `VESPA_VERSION`. Replace `$NAMESPACE` with the namespace where Vespa is installed. Refer to [Factory](https://factory.vespa.ai/) for the latest `VESPA_VERSION`. Note that upgrading the operator does not affect the ConfigServer and Application Pods; they are upgraded in a subsequent step.
+
+```bash
+$ helm upgrade vespa-operator vespa/vespa-operator \
+  --version $VESPA_VERSION \
+  --namespace $NAMESPACE \
+  --reuse-values
+```
+
+Wait for the operator to finish rolling out before proceeding.
+
+```bash
+$ kubectl rollout status deployment/vespa-operator -n $NAMESPACE
+```
+
+## Upgrade the VespaSet
+
+To upgrade the ConfigServer and application Pods, patch the `spec.version` field in the `VespaSet` resource. Ensure that the target image is available and accessible on the Kubernetes Node at `VESPA_OPERATOR_IMAGE:VESPA_VERSION` and `VESPA_IMAGE:VESPA_VERSION` before proceeding. For example:
+
+```bash
+$ cat > vespaset.yaml <
+Annotations:
+API Version: k8s.ai.vespa/v1
+Kind: VespaSet
+Metadata:
+ Creation Timestamp: 2026-01-29T21:32:27Z
+ Finalizers:
+ vespasets.k8s.ai.vespa/finalizer
+ Generation: 1
+ Resource Version: 121822902
+ UID: a70f56e9-6625-4011-acd7-9f7cad29dbc2
+Spec:
+ Application:
+ Image: $VESPA_IMAGE
+ Storage Class: gp3
+ Config Server:
+ Generate Rbac: false
+ Image: $VESPA_IMAGE
+ Storage Class: gp3
+ Ingress:
+ Endpoint Type: LOAD_BALANCER
+ Version: 8.577
+Status:
+ Bootstrap Status:
+ Pods:
+ cfg-1:
+ Last Updated: 2026-01-29T21:38:45Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.577
+ cfg-2:
+ Last Updated: 2026-01-29T21:38:09Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.577
+ cfg-3:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.577
+ default-100:
+ Last Updated: 2026-01-29T21:38:45Z
+ Message: Pod is upgrading
+ Phase: UPGRADING
+ Converged Version: 8.576
+ default-101:
+ Last Updated: 2026-01-29T21:38:09Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ documentation-102:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ documentation-103:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ cluster-controller-104:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ cluster-controller-105:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ cluster-controller-106:
+ Last Updated: 2026-01-29T21:36:32Z
+ Message: Pod is running
+ Phase: RUNNING
+ Converged Version: 8.576
+ Last Transition Time: 2026-01-29T21:33:55Z
+ Message: All configservers running
+ Phase: RUNNING
+Events:
+```
+
+The upgrade is complete when every Pod's `Converged Version` matches the new version and all phases report `RUNNING`.
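+
+As a minimal sketch of the patch step, assuming a `VespaSet` named `vespaset-test` (a hypothetical name), a string-typed `spec.version` field, and the target version held in `$VESPA_VERSION`, the version could be updated and monitored with standard `kubectl` commands:
+
+```bash
+# Merge-patch only the version field; all other fields are left unchanged
+$ kubectl patch vespaset vespaset-test -n $NAMESPACE --type merge -p "{\"spec\":{\"version\":\"$VESPA_VERSION\"}}"
+# Inspect the per-Pod phase and converged version reported in the status
+$ kubectl describe vespaset vespaset-test -n $NAMESPACE
+```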
+
+## Debugging Upgrade Failures
+
+If a Pod fails to converge to the target version (for example, due to an image pull failure, a crash loop, or a failed health check), the ConfigServer will continuously retry the upgrade for that Pod until it either succeeds or an administrator intervenes.
+
+In this scenario, the administrator can diagnose the issue by inspecting the ConfigServer logs or the events of the failing Pod in the current upgrade phase. Once the issue is resolved, the ConfigServer will automatically retry the upgrade for that Pod and proceed with the remaining nodes.
+
+For example, suppose the Pod `search-106` is failing to upgrade.
+
+```bash
+$ kubectl logs cfg-1 -n $NAMESPACE
+$ kubectl logs cfg-2 -n $NAMESPACE
+$ kubectl logs cfg-3 -n $NAMESPACE
+$ kubectl describe pod search-106 -n $NAMESPACE
+```
+
+This design prevents a bad upgrade from cascading to the rest of the Pods. Since the ConfigServer refuses to advance past a Pod that has not converged, the remaining Pods stay on the previous known-good version while the administrator investigates.
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/kubernetes/tls.mdx b/mintlify-docs/en/operations/kubernetes/tls.mdx
new file mode 100644
index 0000000000..0549691e43
--- /dev/null
+++ b/mintlify-docs/en/operations/kubernetes/tls.mdx
@@ -0,0 +1,402 @@
+---
+title: "Enable TLS Encryption for Vespa on Kubernetes"
+sidebarTitle: "Enable TLS Encryption"
+---
+
+TLS encryption for Vespa on Kubernetes can be configured for internal pod-to-pod communication using mutual TLS (mTLS) and for external ingress traffic. This guide demonstrates using [cert-manager](https://cert-manager.io/), a Kubernetes-native certificate management solution that automates certificate issuance and renewal, to set up TLS for the Vespa on Kubernetes deployment.
+
+`cert-manager` integrates with multiple certificate authorities including self-signed CAs, Let's Encrypt, HashiCorp Vault, and commercial providers. It handles the certificate lifecycle by automatically issuing certificates and renewing them before expiration. The [cert-manager CSI driver](https://cert-manager.io/docs/usage/csi-driver/) provides secure certificate delivery to pods through runtime injection via a DaemonSet, ensuring certificates are available before containers start.
+
+## Prerequisites
+
+- Kubernetes cluster with Vespa Operator installed (see [Installation](/en/operations/kubernetes/deployment/installation))
+- [cert-manager](https://cert-manager.io/docs/installation/) v1.13 or later
+- [cert-manager CSI driver](https://cert-manager.io/docs/usage/csi-driver/) installed
+- Kubernetes Command Line Tool ([kubectl](https://kubernetes.io/docs/reference/kubectl/))
+- OpenSSL (for CA generation)
+
+## Enable mTLS for Internal Communication
+
+Mutual TLS (mTLS) secures communication between Vespa services within the Kubernetes cluster. Each pod authenticates using client certificates issued by a namespace-local root Certificate Authority. It is also possible to configure the Certificate Authority to be cluster-global.
+
+This method is ideal for those who prefer TLS to terminate at the service, or those who have already integrated with mTLS from Vespa Cloud. For more details on Vespa's mTLS implementation, see the [Vespa mTLS documentation](/en/security/mtls).
+
+### Step 1: Create Certificate Authority
+
+Create a self-signed issuer to bootstrap the certificate chain, then use it to create a namespace-local root CA certificate that acts as the trust anchor for all internal mTLS certificates.
+
+```yaml expandable
+$ cat <
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/login.mdx b/mintlify-docs/en/operations/login.mdx
new file mode 100644
index 0000000000..64b4bf5d1f
--- /dev/null
+++ b/mintlify-docs/en/operations/login.mdx
@@ -0,0 +1,9 @@
+---
+title: "Login"
+---
+
+## MFA reset
+
+If you have lost your MFA information, you can request a reset from Vespa.ai Support.
+
+To reset MFA for a user, contact [support@vespa.ai](mailto:support@vespa.ai) with the user's email address. Support will send a verification code to that email address as part of the reset process. Once received, the code must be returned to support — along with explicit confirmation that the MFA reset is authorized — before the reset can be completed.
diff --git a/mintlify-docs/en/operations/metrics.mdx b/mintlify-docs/en/operations/metrics.mdx
new file mode 100644
index 0000000000..d2a8b749fb
--- /dev/null
+++ b/mintlify-docs/en/operations/metrics.mdx
@@ -0,0 +1,200 @@
+---
+title: "Metrics"
+---
+
+Metrics for all nodes are aggregated using *[/metrics/v2/values](/en/reference/api/metrics-v2#metrics-v2-values)* or *[/prometheus/v1/values](/en/reference/api/prometheus-v1#prometheus-v1-values)*. Values from these endpoints reflect the one minute of activity immediately before the request.
+
+Example of getting a metric value using the Prometheus endpoint:
+
+```bash
+$ curl -s 'http://ENDPOINT/prometheus/v1/values/?consumer=vespa' | \
+ grep "vds.idealstate.merge_bucket.pending.average" | grep -Ev 'HELP|TYPE'
+```
+
+
+**Important:**
+
+Make sure to use [consumer=vespa](/en/reference/api/metrics-v1#consumer) to list all metrics.
+
+
+Example of getting a metric value using */metrics/v2/values*:
+
+```bash
+$ curl ENDPOINT/metrics/v2/values | \
+ jq -r -c '
+ .nodes[] |
+ .hostname as $h |
+ .services[].metrics[] |
+ select(.values."content.proton.documentdb.documents.total.last") |
+ [$h, .dimensions.documenttype, .values."content.proton.documentdb.documents.total.last"] |
+ @tsv'
+
+node9.vespanet music 0
+node8.vespanet music 0
+```
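+
+The same extraction can be sketched in Python for use in a script. This is a minimal sketch that assumes the response has the shape implied by the `jq` filter above (`nodes[].hostname` and `nodes[].services[].metrics[]` with `values` and `dimensions`); the sample payload is hand-written for illustration and is not real output.
+
```python
import json

METRIC = "content.proton.documentdb.documents.total.last"


def extract_metric(payload, metric=METRIC):
    """Yield (hostname, documenttype, value) for every recording of `metric`."""
    for node in payload.get("nodes", []):
        for service in node.get("services", []):
            for recording in service.get("metrics", []):
                values = recording.get("values", {})
                # Compare against None: a value of 0 is a valid recording.
                if values.get(metric) is None:
                    continue
                doctype = recording.get("dimensions", {}).get("documenttype")
                yield node.get("hostname"), doctype, values[metric]


# Hand-written sample in the assumed response shape (not real output).
sample = json.loads("""
{"nodes": [{"hostname": "node9.vespanet", "services": [{"metrics": [
  {"values": {"content.proton.documentdb.documents.total.last": 0},
   "dimensions": {"documenttype": "music"}},
  {"values": {"some.other.metric.count": 3}, "dimensions": {}}
]}]}]}
""")

rows = list(extract_metric(sample))
for row in rows:
    print("\t".join(str(field) for field in row))
```
+
+The explicit `is not None` check keeps recordings whose value is `0`, matching the `jq` output shown above.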
+
+## Aggregating metrics
+
+Metrics in Vespa are generated from services running on the individual nodes, and in many cases have many recordings per metric, from within each node, with unique tag / dimension combinations. These recordings need to be put together to contribute to the overall picture of how the system is behaving. If this is done the right way you will be able to “zoom out” to get the bigger picture, or to “zoom in” to see how things behave in more detail. This is very useful when looking into possible production issues. Unfortunately it is easy to combine metrics the wrong way, resulting in potentially significantly distorted graphs.
+
+For each of the values (suffixes) available for the different metrics here is how we recommend that you aggregate them to get the best use of them. The guidelines should be used both for aggregations over time (multiple snapshot intervals) and over tag combinations.
+
+| Suffix Name | Aggregation |
+| --- | --- |
+| `max` | Use the highest value available `MAX(max)`. |
+| `min` | Use the lowest value available `MIN(min)`. |
+| `sum` | Use the sum of all values `SUM(sum)`. |
+| `count` | Use the sum of all values `SUM(count)`. |
+| `average` | To generate an average value, compute `SUM(sum) / SUM(count)` where you generate the graph. Don’t use the `average` suffix itself if you have the `sum` and `count` suffixes available: aggregating it means computing averages of averages, which quickly become distorted and noisy. |
+| `last` | Avoid this except for metrics you expect to be stable, such as amount of memory available on a node, etc. This value is the last from a metrics snapshot period, hence basically a single value picked from all values during the snapshot period. Typically, very noisy for volatile metrics. It does not make sense to aggregate on this value at all, but if you must then choose a value with the same combination of tags over time. |
+| `95percentile` | This value cannot be aggregated in a way that gives a mathematically correct value. Where you have to aggregate, either compute the average for the most realistic value, `AVERAGE(95percentile)`, or use the max if the goal is to better identify outliers, `MAX(95percentile)`. Regardless, this value is best used at the most granular level, with all tag values specified. |
+| `99percentile` | Same as for the `95percentile` suffix. |
+
+## Metric-sets
+
+Node metrics in */metrics/v1/values* are listed per service, with a set of system metrics - example:
+
+```json expandable
+{
+ "services": [
+ {
+ "name": "vespa.container",
+ "timestamp": 1662120754,
+ "status": {
+ "code": "up",
+ "description": "Data collected successfully"
+ },
+ "metrics": [
+ {
+ "values": {
+ "memory_virt": 3683172352,
+ "memory_rss": 1425416192,
+ "cpu": 2.0234722784298,
+ "cpu_util": 0.202347227843
+ },
+ "dimensions": {
+ "metrictype": "system",
+ "instance": "container",
+ "clustername": "default",
+ "vespaVersion": "8.46.19"
+ }
+ },
+ {
+ "values": {},
+ "dimensions": {
+ "clustername": "default",
+ "instance": "container",
+ "vespaVersion": "8.46.19"
+ }
+ }
+ ]
+ },
+```
+
+The `default` metric-set is added to the system metric-set, unless a [consumer](/en/reference/api/metrics-v1#consumer) request parameter specifies a different built-in or custom metric set - see [metric list](/en/reference/operations/metrics/default-metric-set#metric-sets).
+
+The `Vespa` metric-set has a richer set of metrics, see [metric list](/en/reference/operations/metrics/vespa-metric-set#metric-sets).
+
+The *consumer* request parameter can also be used in [/metrics/v2/values](/en/reference/api/metrics-v2#metrics-v2-values) and [/prometheus/v1/values](/en/reference/api/prometheus-v1#prometheus-v1-values).
+
+Example minimal metric-set; system metric-set + a specific metric:
+
+```xml
+
+
+
+
+
+
+
+
+```
+
+Example default metric-set and more; system metric-set + default metric-set + a built-in metric:
+
+```xml
+
+
+
+
+
+
+
+
+
+```
+
+## Metrics names
+
+The names of metrics emitted by Vespa typically follow this naming scheme: `...`. The separator (`.` here) may differ for different metrics integrations. Similarly, the `` string may differ depending on your configuration. Further, some metrics have several levels of `component` names. Each metric has a number of values associated with it, one for each `suffix` provided by the metric. Typical suffixes include `sum`, `count` and `max`.
+
+## Container Metrics
+
+Metrics from the container with description and unit can be found in the [container metrics reference](/en/reference/operations/metrics/container#container-metrics). The most commonly used metrics are mentioned below.
+
+### Generic Container Metrics
+
+These metrics are output for the server as a whole, e.g. related to resources. Some metrics indicate memory usage, such as `mem.heap.*`, `mem.native.*`, `mem.direct.*`. Other metrics are related to the JVM garbage collection, `jdisc.gc.count` and `jdisc.gc.ms`.
+
+### Thread Pool Metrics
+
+Metrics for the container thread pools. The `jdisc.thread_pool.*` metrics have a dimension `threadpool` with thread pool name, e.g. *default-pool* for the container's default thread pool. See [Container Tuning](/en/performance/container-tuning#container-tuning) for details.
+
+### HTTP Specific Metrics
+
+These are metrics specific for HTTP. Those metrics that are specific to a connector will have a dimension containing the TCP listen port.
+
+Refer to [Container Metrics](/en/reference/operations/metrics/container) for metrics on HTTP response status codes, `http.status.*`, or for more detailed metrics on the handling of requests, `jdisc.http.*`. Other relevant metrics include `serverNumConnections`, `serverNumOpenConnections`, `serverBytesReceived` and `serverBytesSent`.
+
+### Query Specific Metrics
+
+For metrics related to queries, start with `queries` and `query_latency`, then look at `handled.requests` and `handled.latency`, or the `httpapi_*` metrics for more insight.
+
+### Feed Specific Metrics
+
+For metrics related to feeding into Vespa, we recommend using the `feed.operations` and `feed.latency` metrics.
+
+## Available metrics
+
+Each of the services running in a Vespa installation maintains and reports a number of metrics.
+
+Metrics from the container services are the most commonly used, and are listed in [Container Metrics](/en/reference/operations/metrics/container). You will find the metrics available there, with description and unit.
+
+## Metrics from custom components
+
+Add custom metrics from components like [Searchers](/en/applications/searchers) and [Document processors](/en/applications/document-processors):
+
+
+
+Add a [MetricReceiver](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/metrics/simple/MetricReceiver.html) instance to the constructor of the component - it is [injected](/en/applications/dependency-injection) by the Container.
+
+
+Declare [Gauge](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/metrics/simple/Gauge.html) and [Counter](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/metrics/simple/Counter.html) metrics using the *declare*\-methods on the *MetricReceiver*. Optionally set arbitrary metric dimensions to default values at declaration time - refer to the javadoc for details.
+
+
+Each time there is some data to measure, invoke the [sample](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/metrics/simple/Gauge.html#sample\(double\)) method on gauges or the [add](https://javadoc.io/doc/com.yahoo.vespa/container-disc/latest/com/yahoo/metrics/simple/Counter.html#add\(\)) method on counters. The gauges and counters declared are inherently thread-safe. When sampling data, any dimensions can optionally be set.
+
+
+Add a [consumer](/en/reference/applications/services/admin#consumer) in *services.xml* for the metrics to be emitted in the metric APIs, like in the previous section.
+
+
+Find a full example in the [album-recommendation-java](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java) sample application.
+
+
+**Note:**
+
+Metrics with no value do not show in the metric APIs - in the example above, make at least one query to set the metric value.
+
+
+### Example / QA
+
+I have two different libraries that are running as components with their own threads within the vespa container. We are injecting MetricReceiver to each library. After injecting the receiver we store the reference to this receiver in a container-wide object so that they can be used inside these libraries (the libraries each have several classes and such, so it is not possible to inject the receiver every time, and we need to use the stored reference). Questions:
+
+
+
+Yes, you get the same object.
+
+
+It remains valid for the lifetime of the component to which it got injected. Therefore, if you share component references through some other means than direct or indirect injection, you may end up with invalid references. A "container-wide object" sounds like trouble. You should have it injected into all the components that need it instead. Or, if you feel that will be too fine-grained, create one large object which gets these things injected, and then have that injected into all components that need the common stuff.
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/monitoring.mdx b/mintlify-docs/en/operations/monitoring.mdx
new file mode 100644
index 0000000000..1c4e08f6ab
--- /dev/null
+++ b/mintlify-docs/en/operations/monitoring.mdx
@@ -0,0 +1,509 @@
+---
+title: "Monitoring"
+---
+
+
+
+
+
+The Vespa Cloud Console has dashboards for insight into performance metrics; use the METRICS tab in the application zone view.
+
+These metrics can also be pulled into external monitoring tools using the Prometheus metrics API.
+
+## The Vespa Cloud metrics dashboard
+
+The Vespa Cloud metrics dashboard (the METRICS tab in the application zone view) is organized around a *symptom → layer → resource* workflow, so an investigation that starts from "latency is up" can land on "this specific layer is the bottleneck" without scanning every chart.
+
+### Tabs and filters
+
+
+
+
+
+The dashboard is organized into seven tabs:
+
+| Tab | What it shows | When to use it |
+| --- | --- | --- |
+| **Overview** | Health indicators, request rates, QoS, latency summary, HTTP status codes, resource utilization | Daily health check, first stop during incidents |
+| **Query** | Container- and content-node query latency, per-rank-profile breakdown, match/docsum executors | Investigating read latency, query quality issues |
+| **Feed** | Feed operation rates and latency at each layer, feed blocking | Investigating write latency or throughput issues |
+| **Nearest Neighbor Search** | NNS distance computations, visit efficiency | Tuning HNSW parameters (hidden when not in use) |
+| **Content Node** | Document counts, Proton resource usage, executor utilization, maintenance jobs | Deep investigation of search engine internals |
+| **Resources** | CPU, memory, disk, GPU, JVM, thread pools | Sizing and scaling decisions |
+| **Health** | Cluster state, data consistency, restarts, reindexing, resource limits | Stability monitoring, post-incident review |
+
+Filters at the top apply across all tabs:
+
+- **Cluster** — limit metrics to specific clusters
+- **Per host metrics** — toggle between aggregated cluster view and per-node breakdown
+- **Rank Profile** — filter per-rank-profile panels on the Query tab (defaults to "All")
+
+Query, Feed, Content Node, Resources, and Health tabs group metrics per cluster — you see all metrics for one cluster before scrolling to the next. Container metrics are grouped per container cluster, content metrics per content cluster.
+
+### Annotations
+
+
+
+
+
+Annotations are vertical lines drawn on every chart that mark operational events. When a latency or throughput anomaly lines up with an annotation, you get the context for the change without having to infer it from the graph alone.
+
+| Annotation | Triggered by | Why it matters |
+| :--- | :--- | :--- |
+| **Feed blocked in cluster** | A content node crosses its disk/memory feed-block limit | Writes are paused cluster-wide until remediated |
+| **Vespa upgrade** | A new Vespa version is rolled out | Brief rolling-restart latency spikes are expected around this marker |
+| **Data migration** | Bucket merges pending exceed a threshold | Explains elevated CPU/IO and latency during redistribution |
+| **Document re-indexing** | A reindexing job is running | Explains elevated CPU and search-side load |
+| **Auto-scaling** | The autoscaler changed the cluster shape | Brief capacity drop during reshuffle |
+| **Service restart** | `delta(sentinel_totalRestarts[10m]) > 0` — a Vespa service process restarted on one or more nodes | Unexpected restarts usually indicate a crash, OOM, or forced stop; outside of planned upgrades these are always worth investigating |
+| **Core dump** | `delta(coredumps_processed[1h]) > 0` — a process core-dumped | Signals a crash; cross-reference with Service restart. Should be extremely rare |
+
+### Overview tab
+
+The Overview tab is the fastest place to answer "is anything obviously broken?" and provides everything needed for daily monitoring at a glance.
+
+#### Health Indicators
+
+
+
+
+
+The Overview tab opens with a dedicated **Health Indicators** row — five stat panels designed to surface stability issues in a single glance. A row of green zeros is the signal to stop; a non-zero value tells you which tab to visit next.
+
+| Indicator | What it counts | Healthy value |
+| :--- | :--- | :--- |
+| **Core Dumps (1h)** | Core dumps processed across all clusters in the last hour | 0 — any non-zero value is a crash to investigate |
+| **Restarts (1h)** | Vespa service restarts across all clusters in the last hour | 0 during steady state; brief spikes are normal during upgrades |
+| **Feed Blocked** | Nodes currently above a feed-block resource limit | 0 — non-zero means writes are being rejected cluster-wide |
+| **Content: Groups/Nodes Down** | Content groups with at least one node down | 0 during steady state. 1 group down is normal during rolling restarts or maintenance; 2 or more should be investigated |
+| **Container: Services Down** | Active container nodes where some service isn't running | 0 during steady state; brief spikes during deployments are expected |
+
+#### QoS and latency overview
+
+**QoS (Quality of Service)** shows the percentage of successful requests. Read and write QoS are shown separately; a healthy application should be above 99.9%. If QoS drops, consult the HTTP Response Code Reference row (collapsed by default) for a table explaining every observed status code and its meaning in Vespa context. 4xx responses are client errors; 5xx responses are server errors and should be investigated immediately.
+
+**Latency summary** separates query and feed latency into read and write rows. Compare averages with p99 — a large gap indicates tail latency that won't show up in averages. As a rule of thumb, if p99 is more than 5× the average, investigate the tail.
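+
+The two rules of thumb above (QoS above 99.9% and p99 within 5× the average) can be expressed as a small check. This is only a sketch: the function names and the finding labels are made up for illustration, and the inputs are assumed to be numbers you have already read off the dashboard or an exported metric.
+
```python
def qos_percent(successful, total):
    """Share of successful requests in percent, or None without traffic."""
    return 100.0 * successful / total if total else None


def overview_findings(qos, avg_ms, p99_ms, qos_floor=99.9, tail_factor=5.0):
    """Return the rule-of-thumb checks that fail for one time window."""
    findings = []
    if qos is not None and qos < qos_floor:
        findings.append("qos_below_floor")
    if avg_ms > 0 and p99_ms > tail_factor * avg_ms:
        findings.append("tail_latency")
    return findings


# Hypothetical window: 9985 of 10000 reads succeeded, avg 12 ms, p99 90 ms.
qos = qos_percent(9_985, 10_000)
print(overview_findings(qos, avg_ms=12.0, p99_ms=90.0))
```
+
+Both thresholds are defaults taken from the text above and are exposed as parameters, since what counts as healthy depends on the application.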
+
+#### Resource utilization
+
+The bottom row gives a quick view of CPU, memory, and disk across all clusters. Any resource consistently above 80% warrants attention.
+
+### Query tab
+
+When query latency increases, the Query tab helps find the cause layer-by-layer. Metrics are grouped per container cluster (for container-level metrics) and per content cluster (for content-node metrics).
+
+A query flows through multiple layers, each with its own latency metric:
+
+```bash
+Client
+ → HTTP Read Latency (end-to-end including network I/O)
+ → Query Container Latency (time in the container itself)
+ → Query Latency (container-observed total, excluding HTTP overhead)
+ → Search Protocol Latency (time on each content node)
+ → Rank Profile Latency (per rank-profile breakdown)
+```
+
+#### Container-level metrics
+
+Start with the *Query Rate & Latency* row:
+
+- Did QPS increase? More queries means more load.
+- Which latency metric increased?
+ - **Query Latency** — container level, includes dispatch to content nodes
+ - **HTTP Read Latency** — includes HTTP I/O overhead
+ - **Search Protocol Latency** — content node execution only
+
+If HTTP latency is much higher than query latency, the bottleneck is network or payload size. If search protocol latency dominates, the bottleneck is on the content nodes.
+
+The *Query Quality* row shows:
+
+- **Failed queries** — actual errors. Should be near zero.
+- **Degraded queries** — queries that were [soft-doomed](/en/performance/graceful-degradation) (ran out of time during matching). These return partial results.
+- **Empty results** — queries returning zero hits. A sudden increase may indicate an indexing problem or a query change.
+
+#### Rank profile metrics
+
+
+
+
+
+The Query tab groups per-rank-profile metrics into four sub-rows, all filterable by the Rank Profile dropdown:
+
+- **Rank Profile — Latency & Volume** — query latency (avg and max), QPS per profile, and raw docs matched per profile
+- **Rank Profile — Time Breakdown** — setup time, rerank time, and grouping time, each shown as avg plus peak so you can tell whether a profile has steady-state cost or occasional cost spikes
+- **Rank Profile — Quality** — docs matched per query, soft-doom factor, and soft-doomed queries. These tell you when a profile is [overrunning its time budget](/en/performance/graceful-degradation).
+- **Rank Profile — Query Distribution** — QPS split by content group, which helps spot uneven routing
+
+Things to look for:
+
+- Which rank profile has the highest latency?
+- Are soft-doomed queries concentrated on a specific rank profile?
+- Is the peak for rerank or grouping time much higher than the average? That often points to a specific second-phase or grouping expression that's expensive only on some queries.
+- Did docs matched per query grow? More documents matched means more ranking work.
+
+See [Latency tracking](#latency-tracking) below for a worked example, and the [rank profiles](/en/basics/ranking#rank-profiles) documentation for background.
+
+#### Match and Docsum executor panels
+
+The Query tab also includes *Match Executor* and *Docsum Executor* sub-rows (queue size + accepted rate) so you can see whether the content-node thread pools feeding the query and summary paths are saturated. These are not attributable to a rank profile, but often explain tail-latency spikes that aren't visible in rank-profile metrics.
+
+### Feed tab
+
+When feed latency increases or throughput drops, the Feed tab shows where in the write path the slowdown occurs. A write operation flows through:
+
+```bash
+Client
+ → HTTP Write Latency (end-to-end)
+ → Container Feed Latency (document processing chains, embedders)
+ → Distributor Latency (routing based on bucket distribution)
+ → Content: Storage Latency (persistence, per document replica)
+ → Commit Latency (transaction log)
+```
+
+Start from the top and find where latency increases. If container feed latency is normal but HTTP write latency is high, the bottleneck is network/payload. If distributor latency is high, check for node state issues in the Health tab. If storage latency is high, check disk I/O in the Resources tab.
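+
+The top-down walk described above can be sketched as a comparison of per-layer latencies against a baseline. The layer keys, the 1.5× regression factor and the sample numbers are assumptions made for this illustration; they are not Vespa metric names or recommended thresholds.
+
```python
# Write-path layers from outermost to innermost (illustrative keys).
LAYERS = ["http_write", "container_feed", "distributor", "storage", "commit"]


def deepest_regressed_layer(baseline_ms, current_ms, factor=1.5):
    """Return the innermost layer whose latency rose by more than `factor`.

    Outer layers include the time spent in inner layers, so the innermost
    regressed layer is the most likely place where the slowdown originates.
    Returns None when no layer regressed.
    """
    regressed = None
    for layer in LAYERS:
        if current_ms[layer] > factor * baseline_ms[layer]:
            regressed = layer
    return regressed


baseline = {"http_write": 20.0, "container_feed": 15.0, "distributor": 10.0,
            "storage": 6.0, "commit": 2.0}

# Only the HTTP layer is slower: points at network or payload size.
network_case = dict(baseline, http_write=60.0)
# Everything down to storage is slower: points at disk I/O on content nodes.
storage_case = dict(baseline, http_write=70.0, container_feed=62.0,
                    distributor=55.0, storage=50.0)

print(deepest_regressed_layer(baseline, network_case))
print(deepest_regressed_layer(baseline, storage_case))
```
+
+A result of `http_write` corresponds to the network/payload case, and `storage` to the case where disk I/O in the Resources tab is the next thing to check.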
+
+#### Typical healthy values
+
+- Feed latency: 1–50 ms for puts/updates is typical; spikes during maintenance are normal
+- Distributor failures: zero — non-zero indicates node state issues
+- HTTP API failures: near zero
+- Feed blocked: always zero
+
+#### Feed blocked
+
+**Feed Blocked** is the most critical feed metric. When a content node exceeds its disk or memory [resource limit](/en/writing/feed-block), feeding is paused for the entire cluster. HTTP clients receive `507 Insufficient Storage`.
+
+If feed is being blocked:
+
+
+
+Check *Health > Feed Resource Limits* for which resource is near the limit.
+
+
+Check the Resources tab for the specific nodes causing pressure.
+
+
+Add nodes to the content cluster (always add, don't resize — data [auto-redistributes](/en/content/elasticity)).
+
+
+
+The Health tab includes a Resource Limits Reference panel explaining the default limits, the blocking mechanism, and how to remediate.
+
+### Nearest Neighbor Search tab
+
+This tab only appears when the application uses [approximate nearest neighbor search](/en/querying/approximate-nn-hnsw) — it is automatically hidden when no NNS distance computations are detected.
+
+Vespa supports two NNS modes:
+
+- **Approximate NNS** — uses an HNSW graph index to find neighbors efficiently without scanning every document. Fast, but may miss some true nearest neighbors.
+- **Exact NNS** — brute-force scan computing distance to every document. Accurate but expensive. Vespa falls back to this when the filter hit ratio is below the `approximate-threshold` (default 0.02).
+
+Key metrics:
+
+- **Exact NNS Ratio** — fraction of queries using brute-force search. Should be below 0.05 (5%). High values mean many queries fall back to exact search, significantly increasing cost.
+- **Approx NNS Visit Efficiency** — ratio of graph nodes visited to distances computed. Values of 1.0–3.0 are typical; much higher suggests the HNSW index could be tuned.
+- **Distances Computed / Nodes Visited** — rate metrics showing the raw NNS workload.
+
+Tuning parameters (set per [rank profile](/en/basics/ranking#rank-profiles)): `approximate-threshold`, `filter-first-threshold`, `target-hits-max-adjustment-factor`, `exploration-slack`. If the exact NNS ratio is high, consider increasing `approximate-threshold` or restructuring filters to be less restrictive.
+
+### Content Node tab
+
+The Content Node tab shows internals of the [Proton](/en/content/proton) search engine running on each content node. All metrics are grouped per content cluster.
+
+#### Documents
+
+- **Total** — all documents in the database (including removed)
+- **Ready** — documents available for search
+- **Active** — primary copies that should be searchable on this node
+- **Removed** — tombstones pending garbage collection
+
+#### Proton resource usage
+
+Disk and memory usage from Proton's internal accounting. This is distinct from node-level metrics in the Resources tab — these are the values Vespa uses for [feed-blocking](/en/writing/feed-block) decisions.
+
+#### Executor utilization
+
+Proton uses several thread pools (executors):
+
+- **Match** — executes queries. Directly impacts query latency.
+- **Shared** — handles background tasks like flush and compaction.
+- **Proton** — internal coordination tasks.
+- **Field writer** — writes attribute and index data during feeding. Saturation directly impacts feed throughput.
+
+Typical healthy values:
+
+- Utilization below 0.8 (80%) — sustained values above this are a bottleneck
+- Field writer saturation well below 1.0
+- Queue sizes near zero during steady state
+
+The dashboard renders avg as a solid green line and max as a dashed yellow line, making it easy to spot whether the maximum tracks the average or has concerning spikes.
+
+#### Maintenance jobs
+
+Proton runs background [maintenance jobs](/en/content/proton#proton-maintenance-jobs) that manage data structures. The dashboard includes a reference panel (collapsed) explaining each job and its resource impact:
+
+| Job | Resource impact |
+| --- | --- |
+| Attribute Flush | Low |
+| Memory Index Flush | Moderate |
+| Disk Index Fusion | High — temporary 2× disk usage |
+| Document Store Compaction | High — holds file in memory |
+| Bucket Move | High — competes with feeding |
+| LID-Space Compaction | Moderate |
+
+Latency spikes that correlate with active maintenance are expected but may indicate the cluster needs more headroom.
+
+### Resources tab
+
+The Resources tab is the primary tool for sizing decisions. Node-level resources (CPU, memory, disk) are grouped per cluster. Container-specific metrics (JVM, thread pools, GPU, network) are grouped per container cluster.
+
+#### Typical healthy values
+
+| Resource | Healthy | Concerning | Action needed |
+| --- | --- | --- | --- |
+| **CPU** | `< 70%` | `70-85%` | `> 85%` sustained |
+| **CPU IOWait** | `< 5%` | `5-10%` | `> 10%` (I/O bottleneck) |
+| **Memory** | `< 70%` | `70-80%` | Approaching feed-block limit |
+| **Disk** | `< 70%` | `70-80%` | Approaching feed-block limit |
+| **JVM GC Overhead** | `< 5%` | `5-15%` | `> 15%` (severe latency impact) |
+| **Threadpool utilization** | `< 70%` | `70-90%` | Rejected tasks = requests dropped |
+
+Content nodes need extra headroom because [maintenance jobs](/en/content/proton#proton-maintenance-jobs) (especially disk index fusion) temporarily increase resource usage.
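+
+For alerting outside the dashboard, the numeric rows of the table above can be turned into a simple classifier. This sketch covers only the rows with explicit numeric bands (CPU, CPU IOWait, JVM GC overhead); the resource keys are illustrative names, and the "sustained" qualifier for CPU is not modelled.
+
```python
# (healthy below, action needed above), in percent, from the table above.
THRESHOLDS = {
    "cpu": (70.0, 85.0),
    "cpu_iowait": (5.0, 10.0),
    "jvm_gc_overhead": (5.0, 15.0),
}


def classify(resource, percent):
    """Map a utilization percentage to one of the three table columns."""
    low, high = THRESHOLDS[resource]
    if percent < low:
        return "healthy"
    if percent <= high:
        return "concerning"
    return "action needed"


# Hypothetical readings for one node.
readings = {"cpu": 65.0, "cpu_iowait": 12.0, "jvm_gc_overhead": 7.5}
for resource, percent in readings.items():
    print(f"{resource}: {percent}% -> {classify(resource, percent)}")
```
+
+Memory and disk are left out on purpose: their "action needed" band depends on the configured feed-block limit rather than on a fixed percentage.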
+
+#### Container thread pools
+
+
+
+
+
+Which thread pools exist on a container depends on which elements are configured in `services.xml`:
+
+| Thread pool | Present when |
+| --- | --- |
+| `default-handler-common` | Always (handler executor used by anything without its own pool) |
+| `search-handler` | `<search>` element is present |
+| `feedapi-handler` | `<document-api>` element is present |
+
+To keep the dashboard free of empty panels, the Resources tab contains three threadpool rows — one per container configuration case — and each row repeats per container cluster that falls into that case:
+
+- **Container Thread Pools (search + document-api)** — clusters with both pools
+- **Container Thread Pools (search only)** — clusters with `<search>` but no feed API
+- **Container Thread Pools (document-api only)** — feed-only clusters
+
+Classification is automatic: hidden variables derive the cluster list per case, so only relevant rows render for a given deployment. Each pool gets three panels — **Utilization**, **Work Queue Size**, **Work Queue Utilization** — with avg as a solid green line and max as a dashed yellow line.
+
+- **Utilization** — active threads as percentage of pool size
+- **Work queue size** — tasks waiting for a thread. The default pool uses a synchronous queue (capacity 0), so there is no buffering — if no thread is available, the task is rejected.
+- **Queue utilization** — percentage of configured queue capacity used (only meaningful for thread pools with bounded queues)
+
+#### JVM memory breakdown
+
+
+
+
+
+The Resources tab's JVM row separates the three layers of container memory:
+
+- **JVM Heap Usage** — Java objects (searchers, document processors, caches)
+- **JVM Direct Memory** — NIO buffers, Netty pools
+- **JVM Native Memory** — JNI allocations, including ONNX embedder working memory and — if configured — a local LLM's KV cache and compute buffers
+
+When overall node memory is high but heap and direct look normal, the native layer is usually the answer. This is common on container nodes running embedder or local-LLM components: model weights are memory-mapped and only partially resident, but KV cache and compute buffers are allocated upfront as native memory.
+
+### Health tab
+
+The Health tab tracks cluster stability and data consistency, grouped per content cluster.
+
+#### Cluster state
+
+Nodes are distributed across states: **up** (serving), **down** (unreachable), **initializing** (starting up), **maintenance** (temporarily out), **retired** (being removed). During normal operation: all up, zero down. See [content node states](/en/content/content-nodes).
+
+#### Data consistency
+
+- **Buckets Out of Sync** — percentage of [data buckets](/en/content/buckets) not yet replicated/consistent. Should be 0% during steady state; non-zero during scaling, restarts, or failures.
+- **Merge Pending** — bucket merge operations queued. High during data redistribution.
+
+After scaling events, expect buckets out of sync and pending merges. These should converge back to zero. If they don't, investigate.
+
+#### Stability
+
+- **Service Restarts** — cumulative restarts per cluster. An increase indicates a process crash.
+- **Core Dumps** — should always be zero.
+
+Both signals surface in three complementary ways: as per-cluster time series on this tab (for historical context), as at-a-glance counters in the [Health Indicators row](#health-indicators) on the Overview tab, and as *Service restart*/*Core dump* [annotations](#dashboard-annotations) drawn as vertical lines on every chart.
+
+#### Feed Resource Limits
+
+Shows memory and disk utilization vs. configured limits. When utilization exceeds the limit, [feeding is blocked](/en/writing/feed-block). The dashboard includes a Resource Limits Reference panel (collapsed) explaining the default limits (disk 80%, memory 80%), the blocking mechanism, and what to do about it.
+
+### Common workflows
+
+#### "Our query latency increased"
+
+
+
+Overview: confirm the latency increase, check if QPS also changed.
+
+
+Query: which percentile increased most? (avg vs p95 vs p99)
+
+
+Query → Rank Profile: is it one query type or all?
+
+
+Resources: CPU, JVM GC overhead.
+
+
+Content Node: match executor utilization, queue sizes growing?
+
+
+
+#### "Feed is slow / feed is blocked"
+
+
+
+Overview: check feed latency and feed operation rate.
+
+
+Feed: which layer shows increased latency?
+
+
+Health: is feed blocked? Check resource limits.
+
+
+Resources: disk and memory utilization, CPU IOWait.
+
+
+Content Node: field writer saturation, filestor queue, active maintenance jobs.
+
+
+
+#### "Should we scale up?"
+
+
+
+Resources: identify the bottleneck resource.
+
+
+Content Node: check executor utilization (are we compute-bound?).
+
+
+Per host view: is load evenly distributed?
+
+
+Enable [autoscaling](/en/operations/autoscaling) or adjust resources in `services.xml`.
+
+
+See the [benchmarking guide](/en/performance/benchmarking) for systematic capacity testing.
+
+
+
+## Latency tracking
+
+When monitoring latency in clusters with mixed loads, it is useful to use [rank profiles](/en/basics/ranking#rank-profiles) to separate them. As an example, an application might have user queries mixed with agentic, batch-oriented queries. Tracking the Container-level query latencies might look like:
+
+
+
+
+
+Using Content node level metrics, separated by ranking profile, we see:
+
+
+
+
+
+From this, we see that query latency varies with the rank profile used. Relevant metrics to export to your monitoring system include:
+
+- [content.proton.documentdb.matching.rank\_profile.queries](/en/reference/operations/metrics/searchnode#content_proton_documentdb_matching_rank_profile_queries)
+- [content.proton.documentdb.matching.rank\_profile.docs\_matched](/en/reference/operations/metrics/searchnode#content_proton_documentdb_matching_rank_profile_docs_matched)
+- [content.proton.documentdb.matching.rank\_profile.query\_latency](/en/reference/operations/metrics/searchnode#content_proton_documentdb_matching_rank_profile_query_latency)
+- [content.proton.documentdb.matching.rank\_profile.rerank\_time](/en/reference/operations/metrics/searchnode#content_proton_documentdb_matching_rank_profile_rerank_time)
+
+In short, when debugging latency, look for changes, per rank profile:
+
+- Did the query rate increase?
+- Did the number of matched or ranked documents change?
+
+The metrics above are a subset of the available metrics. It is a good idea to set a [query profile](/en/reference/querying/query-profiles) per class of query, and in each query profile, select a distinct rank profile. With this, you can change the rank profile for a given query class by configuration only (no need to change the clients) - a good example is having a lightweight rank profile to use in overload situations. This makes it easier to track the individual query classes, per rank profile.
+
+## Prometheus metrics API
+
+Prometheus metrics are found at `$ENDPOINT/prometheus/v1/values`:
+
+```bash
+$ curl -s --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ 'https://b6718765.b68a1234.z.vespa-app.cloud/prometheus/v1/values'
+```
+
+The metrics can be fed into e.g. your Grafana Cloud or self-hosted Grafana instance. See the [Vespa metrics documentation](/en/reference/operations/metrics/vespa-metric-set) for more information.
+
+## Using Grafana
+
+This section explains how to set up Grafana to consume Vespa metrics using the Prometheus API.
+
+### 1. Prometheus configuration
+
+Prometheus is configured using `prometheus.yml`; a sample configuration is available in [prometheus](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/monitoring/album-recommendation-monitoring/prometheus). See `prometheus-cloud.yml`, which is designed to be easy to set up with any Vespa Cloud instance. Replace the endpoint and service name placeholders in that file with the endpoint for the application and the service name, respectively. In addition, provide the paths to the private key and public cert used for data plane access to the endpoint - refer to [security](/en/security/guide). Then, configure the Prometheus instance to use this configuration file. The Prometheus instance will now start retrieving the metrics from Vespa Cloud. If the Prometheus instance is used for multiple services, append the target configuration for Vespa to `scrape_configs`.
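+
+As a rough illustration of the kind of scrape job described above, the sketch below uses standard Prometheus `scrape_configs` fields. It is not a copy of `prometheus-cloud.yml`: the job name and file paths are hypothetical, and the target host is simply the example endpoint used elsewhere on this page.
+
+```yaml
+# Sketch only: job name and file paths are hypothetical placeholders.
+scrape_configs:
+  - job_name: vespa-cloud                 # hypothetical job name
+    scheme: https
+    metrics_path: /prometheus/v1/values   # the Prometheus metrics API path
+    tls_config:
+      # Data plane credentials used for mTLS against the endpoint.
+      cert_file: /path/to/data-plane-public-cert.pem
+      key_file: /path/to/data-plane-private-key.pem
+    static_configs:
+      - targets:
+          # Example endpoint host from this page; replace with your own.
+          - b6718765.b68a1234.z.vespa-app.cloud:443
+```
+
+When the Prometheus instance already scrapes other services, an entry like this would be appended to the existing `scrape_configs` list rather than replacing it.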
+
+### 2. Grafana configuration
+
+Use the [provisioning folder](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/monitoring/album-recommendation-monitoring/grafana/provisioning) as a baseline for further configuration.
+
+The provisioning folder contains a few different files that help configure Grafana locally. These work as good examples of default configurations, but the most important is the file named `Vespa-Engine-Advanced-Metrics-External.json`. This is a default dashboard, based upon the metrics the Vespa team uses to monitor performance.
+
+Click the + button on the side and go to import. Upload the file to the Grafana instance. This should automatically load in the dashboard for usage. For now, it will not display any data as no data sources are configured yet.
+
+### 3. Grafana Data Source
+
+The Prometheus data source has to be added to the Grafana instance for the visualisation. Click the cog on the left and then "Data Sources". Click "Add data source" and choose Prometheus from the list. Add the URL for the Prometheus instance with appropriate bindings for connecting. The configuration for the bindings will depend on how the Prometheus instance is hosted. Once the configuration details have been entered, click Save & Test at the bottom and ensure that Grafana says "Data source is working".
+
+To verify the data flow, navigate back to the Vespa Metrics dashboard by clicking the dashboard symbol on the left (4 blocks) and clicking manage and then click Vespa Metrics. Data should now appear in the Grafana dashboard. If no data shows up, edit one of the data sets and ensure that it has the right data source selected. The name of the data source the dashboard is expecting might be different from what your data source is named. If there is still no data appearing, it either means that the Vespa instance is not being used or that some part of the configuration is wrong.
+
+## Using AWS CloudWatch
+
+To pull metrics from your Vespa application into AWS CloudWatch, refer to the [metrics-emitter](https://github.com/vespa-engine/metrics-emitter/tree/master/cloudwatch) documentation for how to set up an AWS Lambda.
+
+## Alerting
+
+The [Vespa Grafana Terraform template](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/monitoring/vespa-grafana-terraform) provides a set of dashboards and alerts. If you are using a different monitoring service and want to set up an equivalent alert set, you can follow this table:
+
+| Metric name | Threshold | Dimension aggregation |
+| --- | --- | --- |
+| [content_proton_resource_usage_disk_average](/en/reference/operations/metrics/searchnode#content_proton_resource_usage_disk) | `>` 0.9 | max by(applicationId, clusterId, zone) |
+| [content_proton_resource_usage_memory_average](/en/reference/operations/metrics/searchnode#content_proton_resource_usage_memory) | `>` 0.8 | max by(applicationId, zone, clusterId) |
+| cpu_util | `>` 90 | max by(applicationId, zone, clusterId) |
+| [content_proton_resource_usage_feeding_blocked_last](/en/reference/operations/metrics/searchnode#content_proton_resource_usage_feeding_blocked) | `>=` 1 | N/A |
+
+All metrics are from the [default metric set](/en/reference/operations/metrics/default-metric-set#metric-sets). Metrics are using the naming scheme from the [Prometheus metrics](/en/reference/api/prometheus-v1#prometheus-v1-values) API. Dimension aggregation is optional, but reduces alerting noise - e.g. in the case where an entire cluster goes bad. It is recommended to filter all alerts on zones in the [prod environment](/en/operations/environments#prod).
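+
+For a Prometheus-based setup, the table above could be expressed as alerting rules along the following lines. This is a sketch, not the rule set shipped with the Terraform template: the group name, alert names and `for` durations are hypothetical, and the `zone=~"prod[.].*"` filter assumes that prod zones carry a `prod.` prefix in the `zone` label, by analogy with the `dev.` prefix seen in the metrics sample on this page.
+
+```yaml
+# Sketch only: names and durations are hypothetical; thresholds follow the table above.
+groups:
+  - name: vespa-resource-alerts
+    rules:
+      - alert: VespaDiskUsageHigh
+        # Assumes prod zones are labelled "prod.<region>".
+        expr: 'max by (applicationId, clusterId, zone) (content_proton_resource_usage_disk_average{zone=~"prod[.].*"}) > 0.9'
+        for: 5m
+      - alert: VespaMemoryUsageHigh
+        expr: 'max by (applicationId, zone, clusterId) (content_proton_resource_usage_memory_average{zone=~"prod[.].*"}) > 0.8'
+        for: 5m
+      - alert: VespaCpuUtilHigh
+        expr: 'max by (applicationId, zone, clusterId) (cpu_util{zone=~"prod[.].*"}) > 90'
+        for: 5m
+      - alert: VespaFeedBlocked
+        # No dimension aggregation, as in the table.
+        expr: 'content_proton_resource_usage_feeding_blocked_last{zone=~"prod[.].*"} >= 1'
+        for: 1m
+```
+
+Aggregating with `max by (...)` yields one alert per cluster instead of one per node, which is the noise reduction mentioned above.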
+
+## Prometheus Metrics Sample
+
+Below is a sample request and response for Prometheus metrics from a minimal application on Vespa Cloud:
+
+
+```bash
+$ curl -s --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
+ 'https://b6718765.b68a1234.z.vespa-app.cloud/prometheus/v1/values'
+
+...
+jdisc_thread_pool_work_queue_size_min{threadpool="default-pool",zone="dev.aws-us-east-1c",applicationId="mytenant.myapp.default",serviceId="logserver-container",clusterId="admin/logserver",hostname="h97490a.dev.us-east-1c.aws.vespa-cloud.net",vespa_service="vespa_logserver_container",} 0.0 1733139324000
+jdisc_thread_pool_work_queue_size_min{threadpool="default-handler-common",zone="dev.aws-us-east-1c",applicationId="mytenant.myapp.default",serviceId="logserver-container",clusterId="admin/logserver",hostname="h97490a.dev.us-east-1c.aws.vespa-cloud.net",vespa_service="vespa_logserver_container",} 0.0 1733139324000
+# HELP content_proton_documentdb_matching_rank_profile_rerank_time_average
+# TYPE content_proton_documentdb_matching_rank_profile_rerank_time_average untyped
+content_proton_documentdb_matching_rank_profile_rerank_time_average{rankProfile="rank_albums",documenttype="music",zone="dev.aws-us-east-1c",applicationId="mytenant.myapp.default",serviceId="searchnode",clusterId="content/music",hostname="h104562a.dev.us-east-1c.aws.vespa-cloud.net",vespa_service="vespa_searchnode",} 0.0 1733139324000
+content_proton_documentdb_matching_rank_profile_rerank_time_average{rankProfile="unranked",documenttype="music",zone="dev.aws-us-east-1c",applicationId="mytenant.myapp.default",serviceId="searchnode",clusterId="content/music",hostname="h104562a.dev.us-east-1c.aws.vespa-cloud.net",vespa_service="vespa_searchnode",} 0.0 1733139324000
+content_proton_documentdb_matching_rank_profile_rerank_time_average{rankProfile="default",documenttype="music",zone="dev.aws-us-east-1c",applicationId="mytenant.myapp.default",serviceId="searchnode",clusterId="content/music",hostname="h104562a.dev.us-east-1c.aws.vespa-cloud.net",vespa_service="vespa_searchnode",} 0.0 1733139324000
+...
+```
+
+Relevant labels include:
+
+- `chain` This is the name of the search chain in the container that is used for a set of query requests. This is typically used to get separate metrics, such as latency and the number of queries for each chain over time.
+- `documenttype` This is the name of the document type for which a set of queries are run in the content clusters. This is typically used to get separate content layer metrics, such as latency and the number of queries for each document type over time.
+- `groupId` This is the id of the cluster group for which the metric measurement is done. This is typically used to get separate metrics aggregates per group in a content cluster. The label is most relevant for metrics from the content clusters running multiple content groups, see [Content Cluster Elasticity](/en/content/elasticity). The value is in the format group 0, group 1, group 2, etc.
+- `rankProfile` This label is present for a subset of metrics from the content clusters, with names starting with `content_proton_documentdb_matching_rank_profile_`. The label is typically used in cases where you use multiple rank profiles and want to analyse performance differences between the different rank profiles, or to better understand certain types of performance issues and need to narrow down the candidate set.
+- `source` This is a label applied on container metrics for classifying query failures by the content cluster where the failure was triggered.
+
+How you will use labels to separate different kinds of queries depends on the observability backend you use, but you will typically compute weighted averages for query latency and query volume, and split graphs by the relevant labels to better understand system performance and bottlenecks.
+
+For the container-level metrics, use the `chain` label to differentiate between query streams; at the content level, use the `rankProfile` label to do the same.
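+
+As a small illustration of splitting by label in Prometheus, a recording rule can pre-aggregate a content-level metric per rank profile. The sketch below uses the `content_proton_documentdb_matching_rank_profile_rerank_time_average` metric from the sample above; the group and record names are hypothetical, and it takes the maximum across nodes rather than the weighted average mentioned above, since that depends on which metric suffixes you export.
+
+```yaml
+# Sketch only: group and record names are hypothetical.
+groups:
+  - name: vespa-rank-profile-latency
+    rules:
+      - record: vespa:rank_profile_rerank_time_average:max
+        # One series per cluster and rank profile, worst node wins.
+        expr: 'max by (clusterId, rankProfile) (content_proton_documentdb_matching_rank_profile_rerank_time_average)'
+```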
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/notifications.mdx b/mintlify-docs/en/operations/notifications.mdx
new file mode 100644
index 0000000000..6c71a9000d
--- /dev/null
+++ b/mintlify-docs/en/operations/notifications.mdx
@@ -0,0 +1,36 @@
+---
+title: "Notifications"
+---
+
+Vespa Cloud supports two different categories of notifications. Notifications can be sent by email if this has been configured in the Console.
+
+- **Tenant notifications** are administrative notifications about the tenant. Information about users, plan, etc. is sent to all contacts configured to get tenant notifications.
+- **Application notifications** are notifications about your running Vespa applications. If there are resource constraint issues, deployment errors, configuration errors or other issues with a Vespa application, they will be sent to all contacts configured to get application notifications.
+
+## Configuring Notifications
+
+Notifications are configured in the Console under [**Account > Notifications**](https://console.vespa-cloud.com/link/tenant/account/notifications). You can add contacts here that will start receiving emails for the categories enabled for that contact.
+
+
+
+
+
+To add a new address to get notifications:
+
+1. Click **+Add new contact**.
+2. Enter the email address that should receive notifications.
+3. Choose the types of notifications to receive.
+4. Click **Save**.
+5. Go to your email inbox and click the verification link you have received there.
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/private-endpoints.mdx b/mintlify-docs/en/operations/private-endpoints.mdx
new file mode 100644
index 0000000000..f80add455f
--- /dev/null
+++ b/mintlify-docs/en/operations/private-endpoints.mdx
@@ -0,0 +1,263 @@
+---
+title: "Private endpoints"
+---
+
+Vespa Cloud lets you set up private endpoint services on your application clusters, for exclusive access from your own, co-located VPCs with the same cloud provider. This is supported for AWS deployments through AWS's [PrivateLink](https://docs.aws.amazon.com/vpc/latest/privatelink/what-is-privatelink.html), and for GCP deployments through GCP's [Private Service Connect](https://cloud.google.com/vpc/docs/private-service-connect). This guide takes you through the necessary configuration steps for either [AWS PrivateLink](#aws-private-link) or for [GCP Private Service Connect](#gcp-private-service-connect).
+
+Private endpoints are only supported in zones in the [prod environment](/en/operations/environments#prod).
+
+
+**Note:**
+
+Private endpoints use mTLS authentication by default, and token-based authentication must be explicitly enabled. See [configuring private endpoint authentication method](#authentication-methods).
+
+
+## AWS PrivateLink
+
+Required information:
+
+| Item | Description |
+| --- | --- |
+| **Your IAM account number** | The numeric identifier for your AWS account. |
+| **VPC ID** | The identifier of your AWS VPC where you wish to connect to the service endpoints from. |
+| **AWS region name** | The name of the AWS region to connect from. Note that you can only connect to a service in the same region, or, if public endpoints are disabled, in the same AWS availability zone. |
+
+Procedure:
+
+
+
+Add a `private` type endpoint to [deployment.xml](/en/reference/applications/deployment#endpoint-private), allowing access to the container cluster using the designated ARN from your account. The example allows all roles and users under the `123123123123` account to connect to the endpoint service on the `my-container` cluster, in each region listed under the `<prod>` tag.
+
+See [endpoint service configuration](https://docs.aws.amazon.com/vpc/latest/privatelink/configure-endpoint-service.html) for details on valid ARNs, and more fine-grained access control.
+
+The example also shows how to disable the public zone endpoint by adding the [`"zone"` type endpoint](/en/reference/applications/deployment#endpoint-zone) declaration—this is an optional step, and not required to set up the private service:
+ ```xml
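+ <deployment version="1.0">
+   <prod>
+     <region>region-1</region>
+     <region>region-2</region>
+   </prod>
+   <endpoints>
+     <endpoint type="zone" container-id="my-container" enabled="false" />
+     <endpoint type="private" container-id="my-container">
+       <allow with="aws-private-link" arn="arn:aws:iam::123123123123:root" />
+     </endpoint>
+   </endpoints>
+ </deployment>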
+ ```
+
+Build and deploy the application package, and wait for it to deploy to the indicated regions.
+
+**Important:**
+
+In the above example, the public endpoint is set to disabled, and the private endpoint is added. Make this change in the *same* deployment. If you disable the public endpoint *after* the private endpoint is live, the private endpoint will be disabled and recreated as part of recreating the load balancer. If this happens, run through the procedure again for correct load balancer creation.
+
+
+
+Navigate to the endpoints tab for your application in the Console, and find the service ID for the deployment to which you wish to connect. While there, verify that access to connect to the endpoint was granted to the correct ARNs.
+
+
+
+
+
+
+[Create a VPC endpoint](https://docs.aws.amazon.com/cli/latest/reference/ec2/create-vpc-endpoint.html) in your VPC. This is your entry point, which forwards connections to your Vespa application through the private network of AWS. For this example, assume your VPC has id `vpc-123` and resides in the AWS region `us-east-1`, and that the service ID of your endpoint service, found in the Console, is `com.amazonaws.vpce.us-east-1.vpce-svc-321`:
+
+```bash
+$ aws ec2 create-vpc-endpoint \
+ --region us-east-1 \
+ --vpc-id vpc-123 \
+ --service-name com.amazonaws.vpce.us-east-1.vpce-svc-321 \
+ --vpc-endpoint-type Interface \
+ --private-dns-enabled | jq .
+```
+
+Note the value of the `VpcEndpointId` field, for verification in the next step. This is also where you specify optional security group and subnet IDs; these are omitted here for brevity. If creating the VPC endpoint through the AWS console instead, be sure to check "Enable DNS names"!
+
+
+Navigate back to the endpoints tab in the Console, and refresh the page. You should now see a new entry representing the connection between your newly created interface endpoint and the endpoint service on your container cluster. This is the "CONNECTED ENDPOINTS" in the image above. Verify the ID matches the value of the `VpcEndpointId` field above. The connection is ready when the state is `open`.
+
+
+The zone endpoint of the designated container cluster should now resolve through private DNS, for any AWS resource that is allowed to connect to your VPC endpoint. The easiest way to verify this is to run the following Python 3.9 lambda, using your own zone endpoint, from within your VPC:
+
+```python
+from socket import gethostbyname
+from urllib.request import urlopen
+
+def lambda_handler(event, context):
+    # Replace the host below with your own zone endpoint.
+    host = 'badc0ffee.deadbeef.z.vespa-app.cloud'
+    return {
+        'statusCode': 200,
+        # Decode the bytes so the returned dict is JSON-serializable.
+        'body': urlopen(f'https://{host}/status.html').read().decode(),
+        'ip': gethostbyname(host)
+    }
+```
+
+Alternatively, run a couple of commands from a host inside the VPC:
+
+```bash
+$ host my-container.my-app.my-tenant.region-1.z.vespa-app.cloud
+$ curl https://my-container.my-app.my-tenant.region-1.z.vespa-app.cloud/status.html
+```
+
+In both cases, the IP should be in one of the private IP ranges, and the HTTP response from the Vespa container endpoint should be `OK`.
+
+
+
+
+**Note:**
+
+Enclave users may set up high-availability PrivateLink endpoints connected across multiple AZs. Contact [Vespa support](https://vespa.ai/support/) for guidance.
+
+
+## GCP Private Service Connect
+
+Prerequisites:
+
+| Item | Description |
+| --- | --- |
+| **Enabled GCP APIs** | The *Compute Engine*, *Service Directory* and *Cloud DNS* APIs must all be enabled in your GCP account: `$ gcloud services enable compute.googleapis.com`, `$ gcloud services enable dns.googleapis.com`, `$ gcloud services enable servicedirectory.googleapis.com` |
+| **Your GCP project name** | The string identifier for your GCP account, like *resonant-diode-123456* |
+| **VPC network and subnetwork names** | The name of the network and subnetwork to create your consumer endpoint in. |
+
+Procedure:
+
+
+
+
+Add a `private` type endpoint to [deployment.xml](/en/reference/applications/deployment#endpoint-private), allowing access to the container cluster from the GCP account with the designated project ID. The example below allows consumer endpoints created under the `private-test` account to connect to the endpoint service on the `my-container` cluster, in each region listed under the `<prod>` tag.
+
+The example also shows how to disable the public zone endpoint by adding the [`"zone"` type endpoint](/en/reference/applications/deployment#endpoint-zone) declaration. This is an optional step, and not required to set up the private service:
+ ```xml
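+ <deployment version="1.0">
+   <prod>
+     <region>region-1</region>
+     <region>region-2</region>
+   </prod>
+   <endpoints>
+     <endpoint type="zone" container-id="my-container" enabled="false" />
+     <endpoint type="private" container-id="my-container">
+       <allow with="gcp-service-connect" project="private-test" />
+     </endpoint>
+   </endpoints>
+ </deployment>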
+ ```
+ Build and deploy the application package, and wait for it to deploy to the indicated regions.
+
+
+Navigate to the endpoints tab for your application in the Console, and find the service ID for the deployment to which you wish to connect. While there, verify that access to connect to the endpoint was granted to the correct projects.
+
+
+
+
+
+
+
+
+[Create a service consumer endpoint](https://cloud.google.com/vpc/docs/configure-private-service-connect-services) in your VPC. This is your entry point, which forwards connections to your Vespa application through the private GCP network. In this example, the project is named `test-project`, has a VPC network named `test-network` that resides in the GCP region `us-central1`, with a subnet `test-subnet` to hold the endpoint, behind an address to be named `test-address`, and the service ID of the endpoint service (found in the Console) is `projects/vespa-external/regions/us-central1/serviceAttachments/scsa-xxxxxx`. Finally, the endpoint is named `badc0ffee`, and the service directory namespace is `my-tenant-my-app`. See the discussion on generated endpoint names in the last item in this guide.
+
+Create network (if it does not already exist):
+
+```bash
+$ gcloud compute networks create test-network
+```
+
+Create subnet (if it does not already exist):
+
+```bash
+$ gcloud compute networks subnets create test-subnet \
+ --region=us-central1 \
+ --network=test-network \
+ --range=10.10.0.0/24
+```
+
+Create the IP address which will be used for the endpoint, for clients inside your VPC:
+
+```bash
+$ gcloud compute addresses create test-address \
+ --region=us-central1 \
+ --subnet=test-subnet
+```
+
+Create a forwarding rule for traffic to the above IP, to the service endpoint in Vespa Cloud:
+
+```bash
+$ gcloud compute forwarding-rules create badc0ffee \
+ --region=us-central1 \
+ --network=test-network \
+ --address=test-address \
+ --target-service-attachment=projects/vespa-external/regions/us-central1/serviceAttachments/scsa-xxxxxx \
+ --service-directory-registration=projects/test-project/locations/us-central1/namespaces/my-tenant-my-app
+```
+
+Note the ID of the created resource, for the verification step below.
+
+
+
+Navigate back to the endpoints tab in the Console, and refresh the page. You should now see a new entry representing the connection between your newly created interface endpoint, and the endpoint service on your container cluster. This is the "CONNECTED ENDPOINTS" in the image above. Verify the ID matches the resource ID of the forwarding rule created above. The connection is ready when the state is `open`.
+
+
+
+The generated endpoint name (see the last item in this guide) of the designated container cluster should now resolve through private DNS inside your VPC. The easiest way to verify this is to launch an instance in your VPC, inside the designated subnet, and run a couple of commands from it:
+
+```bash
+$ host badc0ffee.deadbeef.z.vespa-app.cloud
+$ curl https://badc0ffee.deadbeef.z.vespa-app.cloud/status.html
+```
+
+The resolved IP address should be that of the address created earlier, and the `curl` command should simply output `OK`.
+
+If the endpoint fails to resolve, refer to [GCP's troubleshooting documentation](https://cloud.google.com/vpc/docs/configure-private-service-connect-services#troubleshooting).
+
+
+When a consumer endpoint is created with a *Service Directory* namespace, GCP automatically creates a private DNS record for that endpoint, which must be used instead of the IP address (created above) of the endpoint, as Vespa application containers have web certificates matching specific domain names. Unfortunately, we are unable to set the final endpoint names for the consumer endpoint. For a private endpoint service, we can only set a domain name *suffix*, and GCP then generates private DNS records matching *your endpoint resource name prepended to this suffix*. The Service Directory namespace of these endpoints *must also be one-to-one* with their domain name suffixes, lest the automatic setup fail.
+
+The domain name suffix used by Vespa Cloud is `[.].z.vespa-app.cloud`. We therefore encourage using the tenant and application names joined by a hyphen (as in `my-tenant-my-app` in the example above) as the service directory namespace, as this ensures a one-to-one mapping between suffixes and namespaces, as required by GCP (see above).
+
+The Vespa Cloud web certificates (see above) match any direct descendant of the domain suffix we set for your services. Thus, any endpoint resource name yields a private DNS record that matches the web certificate, with Service Directory. Moreover, the zone endpoints generated by Vespa Cloud consist of a random, unique cluster-instance-region ID. Using this same ID as the GCP endpoint resource name (as in the example) results in identical domain names for the private DNS set up by GCP, and the endpoint names generated by Vespa Cloud, visible in our console.
+
+
+
+## Configuring Private Endpoint Authentication
+
+You can configure private endpoints to use either mTLS or token-based authentication with the optional `auth-method` attribute. If the attribute is not set, mTLS will be used by default. The attribute is only allowed with `private` type endpoints and must be either `mtls` or `token`.
+
+
+**Note:**
+
+Only one authentication method can be enabled at the same time. Enabling token authentication will disable mTLS authentication for the private endpoint, and vice versa.
+
+
+#### Example with token-based authentication
+
+```xml
+
+
+ region-1
+ region-2
+
+
+
+
+
+
+
+```
+
+#### Changing authentication method for an existing deployment
+
+If you have an existing deployment with a private endpoint, you must remove any connections and redeploy with a [validation override](/en/reference/applications/validation-overrides) to modify the authentication method:
+
+
+
+Remove the VPC interface endpoint (AWS) or service consumer endpoint (GCP) configured above
+
+
+Change the authentication method for the endpoint in `deployment.xml`
+
+
+Deploy with the `zone-endpoint-change` validation override:
+ ```xml
+
+
+ zone-endpoint-change
+
+
+ ```
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/production-deployment.mdx b/mintlify-docs/en/operations/production-deployment.mdx
new file mode 100644
index 0000000000..c8eefc12f3
--- /dev/null
+++ b/mintlify-docs/en/operations/production-deployment.mdx
@@ -0,0 +1,243 @@
+---
+title: "Production Deployment"
+---
+
+Production zones enable serving from various locations, with a [CI/CD pipeline](/en/operations/automated-deployments) for safe deployments. This guide goes through the minimal steps for a production deployment - in short:
+
+- Configure a production zone in [deployment.xml](/en/reference/applications/deployment).
+- Configure resources for clusters in [services.xml](/en/reference/applications/services/services).
+- Name the tenant and application, and log in.
+- Create or have access to the data-plane cert/key pair.
+- Deploy the application to Vespa Cloud.
+
+The sample application used in [getting started](/en/basics/deploy-an-application) is a good basis for these steps; see the [source files](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation).
+
+Read [migrating to Vespa Cloud](/en/learn/migrating-to-cloud) first, as a primer on deployment and endpoint usage.
+
+There are alternative ways of deploying at the end of this guide, too.
+
+## deployment.xml
+
+Add a `<prod>` element to *deployment.xml*:
+
+```xml
+<deployment version="1.0">
+  <prod>
+    <region>aws-us-east-1c</region>
+  </prod>
+</deployment>
+```
+
+If *deployment.xml* does not exist, add it to the application package root (next to *services.xml*).
+
+
+**Note:**
+
+If the application uses [private endpoints](/en/operations/private-endpoints), add this configuration here, too, and run the setup steps in the guide.
+
+
+## services.xml
+
+Modify *services.xml* - minimal example:
+
+```xml expandable
+
+
+
+
+
+
+
+
+
+
+
+ 2
+
+
+
+
+
+
+
+
+
+```
+
+For production deployments, at least 2 nodes are required for each cluster to ensure availability during maintenance tasks and upgrades. In some cases one might still want to use just 1 node per cluster, even though redundancy will be lost. This can be done by adding [a validation override](/en/reference/applications/validation-overrides) `minimum-node-count` (and additional validation override `redundancy-one` in case of a content cluster). The `nodes` section is also where you specify your required [resources](/en/reference/applications/services/services#resources):
+
+```xml
+
+
+
+```
+
+Also note the minimum redundancy requirement of 2:
+
+```xml
+2
+```
+
+## Minimum resources
+
+To help ensure a reliable service, there is a minimum resource requirement for nodes in the production environment. The minimum is currently 0.5 vCPU, 8 GB of memory, and for disk, 2 x memory for stateless nodes, or 3 x memory for content nodes. As disk is normally the least expensive resource, we recommend allocating it generously so that it does not limit the use of the more expensive CPU and memory resources.
+
+## Application name
+
+Give the deployment a name and log in:
+
+```bash
+vespa config set target cloud
+vespa config set application mytenant.myapp
+vespa auth login
+```
+
+The tenant name is found in the console. The application name must be unique within your organization - see [tenants, applications and instances](/en/learn/tenant-apps-instances).
+
+## Add public certificate
+
+Just as in the [getting started](/en/basics/deploy-an-application) guide, the application package needs the public key in the *security* directory. If you do not already have a key pair, generate one:
+
+```bash
+$ vespa auth cert -f
+Success: Certificate written to security/clients.pem
+Success: Certificate written to /Users/me/.vespa/mytenant.myapp.default/data-plane-public-cert.pem
+Success: Private key written to /Users/me/.vespa/mytenant.myapp.default/data-plane-private-key.pem
+```
+
+Observe that the files are put in *$HOME/.vespa*. The content from *data-plane-public-cert.pem* is copied to *security/clients.pem*. More details on [data-plane access control permissions](/en/security/guide#permissions).
+
+## Deploy the application
+
+Package the application and deploy it to a production zone:
+
+```bash
+vespa prod deploy
+```
+
+Find alternative deployment procedures in the next sections.
+
+
+**Note:**
+
+The `vespa prod deploy` command, which deploys to prod zones using [deployment.xml](/en/reference/applications/deployment), differs from the `vespa deploy` command used for dev zones - see [environments](/en/operations/environments).
+
+
+## Endpoints
+
+Find the 'zone' endpoint to use under Endpoints in the [console](https://console.vespa-cloud.com/). There is an mTLS endpoint for each zone by default. See [configuring mTLS](/en/security/guide#configuring-mtls) for how to use mTLS certificates.
+
+You can also add [access tokens](/en/security/guide#configuring-tokens) in the console as an alternative to mTLS, and specify [global](/en/reference/applications/deployment#endpoints-global) and [private](/en/reference/applications/deployment#endpoint-private) endpoints in *deployment.xml*.
+
+Write data efficiently with the [document/v1 API](/en/reference/api/document-v1) over HTTP/2, or with the [Vespa CLI](/en/clients/vespa-cli). There is also a [Java library](/en/clients/vespa-feed-client#java-library).
+
+To feed data from a self-hosted Vespa instance into a new cloud instance, see the [appendix](#feeding-data-from-an-existing-vespa-instance) or [cloning applications and data](/en/operations/cloning).
+
+Also see the [HTTP best practices documentation](/en/clients/http-best-practices).
+
+## Automate deployments
+
+Use [deploy-vector-search.yaml](https://github.com/vespa-cloud/vector-search/blob/main/.github/workflows/deploy-vector-search.yaml) as a starting point, and see [Automating with GitHub Actions](/en/operations/automated-deployments#automating-with-github-actions) for more information.
+
+## Production deployment using console
+
+Instead of using the [Vespa CLI](/en/clients/vespa-cli), one can build an application package for production deployment using zip only:
+
+- Create [deployment.xml](#deployment-xml) and modify [services.xml](#services-xml) as above.
+- Skip the [Application name](#application-name) step.
+- Add a public certificate to *security/clients.pem*. See [creating a self-signed certificate](/en/basics/deploy-an-application-shell#create-a-self-signed-certificate) for how to create the key/cert pair, then copy the cert file to *security/clients.pem*. At this point, the files are ready for deployment.
+- Create a deployable zip-file:
+
+ ```bash
+ zip -r application.zip . \
+ -x application.zip "ext/*" README.md .gitignore ".idea/*"
+ ```
+- Click *Create Application* in the [console](https://console.vespa-cloud.com/). Select the *PROD* tab. Enter a name for the application and drop the *application.zip* file in the upload section.
+- Click *Create and deploy* to deploy the application to the production environment.
+
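Before zipping, it can help to confirm that the files the steps above rely on are present. The following is a minimal sketch, not part of the Vespa tooling: `check_package` is a hypothetical helper name, and the file list is simply the three files named in the steps above.

```shell
# Hypothetical pre-flight check (not a Vespa tool): report which of the files
# mentioned in the steps above are missing from the application package root.
check_package() {
  pkg_dir="${1:-.}"
  pkg_missing=0
  for pkg_file in deployment.xml services.xml security/clients.pem; do
    if [ ! -f "$pkg_dir/$pkg_file" ]; then
      echo "missing: $pkg_file"
      pkg_missing=1
    fi
  done
  # Return 0 only when every expected file exists.
  return "$pkg_missing"
}
```

A possible use is `check_package . && zip -r application.zip . -x application.zip "ext/*" README.md .gitignore ".idea/*"`, so that the zip is only created when the check passes.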
+## Production deployment with components
+
+Deploying an application with [Components](/en/applications/components) is a little different from above:
+
+- The application package root is at *src/main/application*.
+- Find the Vespa API version to compile the component.
+- The application package is built into a zip artifact, before deploying it.
+
+See [Getting started java](/en/basics/deploy-an-application-java) for prerequisites. Procedure:
+
+
+
+Use the [album-recommendation-java](https://github.com/vespa-engine/sample-apps/tree/master/album-recommendation-java) sample application as a starting point.
+
+
+Make the same changes to *src/main/application/deployment.xml* and *src/main/application/services.xml*.
+
+
+Run the same steps for [Application name](#application-name) and [Add public certificate](#add-public-certificate).
+
+
+Find the lowest Vespa version of the current deployments (if any) - [details](/en/operations/automated-deployments#deploying-components):
+
+```bash
+mvn vespa:compileVersion \
+ -Dtenant=mytenant \
+ -Dapplication=myapp
+```
+
+
+Build *target/application.zip*:
+
+```bash
+mvn -U package -Dvespa.compile.version="$(cat target/vespa.compile.version)"
+```
+
+
+Run the [Deploy the application](#deploy-the-application) step. Here, the Vespa CLI command will deploy *target/application.zip* built in the step above.
+
+
+
+## Next steps
+
+- Vespa Cloud takes responsibility for rolling out application changes to all production zones as well as testing the changes first. You will usually want to set up a job which automatically builds your application package when changes to it are checked in, to get continuous deployment of your application. Read [automated deployments](/en/operations/automated-deployments) for automation, adding CD tests and multi-zone deployments.
+- Once you have experience with load patterns, consider [autoscaling](/en/operations/autoscaling).
+- Set up [monitoring](/en/operations/monitoring).
+
+## Feeding data from an existing Vespa instance
+
+To dump data from an existing Vespa instance, you can use this command with Vespa CLI:
+
+```bash
+slices=10
+for slice in $(seq 0 $((slices-1))); do
+ vespa visit \
+ --slices $slices --slice-id $slice \
+ --target [existing Vespa instance endpoint] \
+ | gzip > dump.$slice.gz &
+done
+```
+
+This dumps all the content to files, but you can also pipe the content directly into 'vespa feed'.
+
+To feed the data:
+
+```bash
+slices=10
+for slice in $(seq 0 $((slices-1))); do
+ zcat dump.$slice.gz | \
+ vespa feed \
+ --application .. \
+ --target [zone endpoint from the Vespa Console] -
+done
+```
+
+Note that the different slices in these commands can be done in parallel on different machines.
+
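The dump loop above starts each slice in the background, so a script that continues afterwards should wait for all of them and check for failures. A minimal sketch of that pattern follows; `run_slices` and the `SLICE` / `SLICES` environment variables are hypothetical names introduced for illustration, not part of the Vespa CLI.

```shell
# run_slices N CMD...: start CMD once per slice in the background, with
# SLICES and SLICE set in its environment, then wait for every job.
# Hypothetical helper for illustration only.
run_slices() {
  rs_slices="$1"; shift
  rs_pids=""
  rs_slice=0
  while [ "$rs_slice" -lt "$rs_slices" ]; do
    SLICES="$rs_slices" SLICE="$rs_slice" "$@" &
    rs_pids="$rs_pids $!"
    rs_slice=$((rs_slice + 1))
  done
  rs_failed=0
  for rs_pid in $rs_pids; do
    # wait returns the exit status of the given background job
    wait "$rs_pid" || rs_failed=$((rs_failed + 1))
  done
  echo "$rs_failed of $rs_slices slices failed"
  [ "$rs_failed" -eq 0 ]
}
```

With this, the dump could be run as `run_slices 10 bash -o pipefail -c 'vespa visit --slices "$SLICES" --slice-id "$SLICE" --target "$TARGET" | gzip > "dump.$SLICE.gz"'`, where `TARGET` is an assumed, exported variable holding the endpoint. The `pipefail` option is there so that a failing `vespa visit` is not masked by a successful `gzip`.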
+## Accessing a public cloud application from another VPC on another account
+
+A common challenge when deploying on the public cloud is network connectivity between workloads running in different accounts and VPCs. Within a team, this is often resolved by setting up VPC peering between VPCs, but this has its challenges when coordinating between many different teams and dynamic workloads. Vespa does not support direct VPC peering.
+
+There are three recommended options:
+
+1. **Use your public endpoints, but IPv6 if you can:** The default. There are many advantages to a Zero-Trust approach and accessing your application through the public endpoint. If you use IPv6, you will also avoid some of the network costs associated with IPv4 NATs, etc. For some applications, this option could be cost prohibitive, but one should not assume this is the case for all applications with a moderate amount of data being transferred over the endpoint.
+2. **Use private endpoints via AWS PrivateLink or GCP Private Service Connect:** Vespa allows you to set up private endpoints for exclusive access from your own, co-located VPCs. This requires less administrative overhead than general VPC peering and is also more secure. Refer to [private endpoints](/en/operations/private-endpoints).
+3. **Run Vespa workloads in your own account/project (Enclave):** The Vespa Cloud Enclave feature allows you to have all your Vespa workloads run in your own account. In this case, you can set up any required peering to open the connection into your application. While generally available, using Vespa Cloud Enclave requires significantly more effort from the application team in terms of operating the service, and is only recommended for larger applications that can justify the additional work from e.g., a security or interoperability perspective. Refer to [Vespa Cloud Enclave](/en/operations/enclave/enclave).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/reindexing.mdx b/mintlify-docs/en/operations/reindexing.mdx
new file mode 100644
index 0000000000..537bb18079
--- /dev/null
+++ b/mintlify-docs/en/operations/reindexing.mdx
@@ -0,0 +1,50 @@
+---
+title: "Reindexing"
+---
+
+When the indexing pipeline of a Vespa application changes, Vespa may automatically refeed stored data such that the index is updated according to the new specification. Changes in the indexing pipeline may be due to changes in external libraries, e.g. for linguistics, or due to changes in the configuration done by the user, such as the [indexing script](/en/reference/writing/indexing-language) in a document's schema, or the [indexing mode](/en/reference/applications/services/content#document.mode) of a document type in a content cluster. Reindexing can be done for an application's full corpus, for only certain content clusters, or for only certain document types in certain clusters, using the [reindex endpoint](/en/reference/api/deploy-v2#reindex), and inspected at the [reindexing endpoint](/en/reference/api/deploy-v2#reindexing). Details are described below.
+
+## Start reindexing
+
+When a change in the indexing pipeline of an application is deployed, this is discovered by the config server (see the [prepare endpoint](/en/reference/api/deploy-v2#prepare-session) for details). If the change is to be deployed, a [validation override](/en/reference/applications/validation-overrides) might have to be added to the application package (e.g. if changing match settings for a field). Deployment output will then list the *reindex actions* required to make the index reflect the new indexing pipeline. Use the [reindex endpoint](/en/reference/api/deploy-v2#reindex) to mark reindexing as ready for affected document types, **but only after the new indexing pipeline is successfully deployed**, i.e. when the application has converged on the config generation that introduced the change. Reindexing then commences with the next deployment of the application. Summary of steps needed to enable and start reindexing:
+
+
+
+Change indexing pipeline in application package, adding validation overrides if needed
+
+
+Wait until config has converged on new config generation
+
+
+Mark reindexing change as ready by POSTing to reindex endpoint
+
+
+Start reindexing job by deploying application package one more time
+
+
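Steps 3 and 4 above involve the reindex and reindexing endpoints on the config server. The sketch below only builds the URLs; it assumes a self-hosted config server on port 19071 and the `default` tenant, application, environment, region and instance names, and the function names are hypothetical. Check the linked deploy API reference for the exact path and parameters that apply to your setup.

```shell
# Assumptions (verify against the deploy API reference): config server on
# port 19071 and "default" tenant/application/environment/region/instance.
app_url() {
  echo "http://$1:19071/application/v2/tenant/default/application/default/environment/default/region/default/instance/default"
}

# URL for marking reindexing as ready for one document type in one cluster.
# Intended use: curl -X POST "$(reindex_url HOST CLUSTER DOCTYPE)"
reindex_url() {
  echo "$(app_url "$1")/reindex?clusterId=$2&documentType=$3"
}

# URL for inspecting reindexing status.
# Intended use: curl "$(reindexing_url HOST)"
reindexing_url() {
  echo "$(app_url "$1")/reindexing"
}
```

Keeping the URL construction in small functions makes it easier to issue the POST only after the config has converged, and to poll the status URL afterwards.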
+
+## Reindexing progress
+
+Reindexing is done by a component in each content cluster that [visits](/en/writing/visiting) all documents of the indicated types, and re-feeds these through the [indexing chain](/en/writing/indexing) of the cluster. (Note that only the [document fields](/en/reference/schemas/schemas#document) are re-fed — all derived fields, produced by the indexing pipeline, are recomputed.) The reindexing process avoids write races with concurrent feed by locking [small subsets](/en/content/buckets) of the corpus when reindexing them; this may cause elevated write latencies for a fraction of concurrent write operations, but does not impact general throughput. Moreover, since reindexing can be both lengthy and resource consuming, depending on the corpus, the process is tuned to yield resources to other tasks, such as external feed and serving, and is generally safe to run in the background.
+
+Reindexing is done for one document type at a time, in parallel across content clusters. Detailed progress can be found at the [reindexing endpoint](/en/reference/api/deploy-v2#reindexing). If state is *failed*, reindexing attempts to resume from the position where it failed after a grace period of some minutes. State *pending* indicates reindexing will start, or resume, when the cluster is ready, while *running* means it's currently progressing. Finally, *successful* means all documents of that type were successfully reindexed. Additionally, if the *speed* of a reindexing is `0.0`—set by users—that reindexing is halted until the speed is either set to a positive value again, or it is replaced by a new reindexing of that document type.
+
+## Procedure
+
+Refer to [schema changes](/en/reference/schemas/schemas#modifying-schemas) for a procedure / way to test the reindexing feature, and tools to validate the data.
+
+## Use cases
+
+Below are sample changes to the schema for different use cases, or examples of operational steps for data manipulation.
+
+| Use case | Description |
+| --- | --- |
+| **clear field** | To clear a field, do a partial update of all documents with the value, say an empty string.
It is also possible to use reindexing, but there is a twist - intuitively, this would work:
`field artist type string {` `indexing: "" \| summary \| index` `}`
However, the reset only works for [synthetic fields](/en/reference/schemas/schemas#schema).
A solution is to deploy a [document processor](/en/applications/document-processors) that empties the field, to the default indexing chain - then trigger a reprocessing. |
+| **change indexing settings** | As reindexing takes time, a field's data can be in one state or another, while the queries to it have the most current state. This is OK for many changes and applications.
If not, it is possible to reindex to a new field for a more atomic change. Add a *synthetic field* outside the *document definition* and pipe the content of the current field to it:
`search mydocs {` `field title_non_stemmed type string {` `indexing: input title \| index \| summary` `stemming: none` `}` `document mydocs {` `field title type string` `{` `indexing: index \| summary` `}`
Once reindexing is completed, switch queries to use the new field. This solution naturally increases memory and disk requirements in the transition.
Going back to using the original field with the new settings can be done by changing the index settings for the original field, waiting for reindexing to finish, switching queries back to the original field, and then removing the temporary synthetic field. |
+
+Relevant pointers:
+
+
+
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/admin-procedures.mdx b/mintlify-docs/en/operations/self-managed/admin-procedures.mdx
new file mode 100644
index 0000000000..d41c9ec158
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/admin-procedures.mdx
@@ -0,0 +1,249 @@
+---
+title: "Administrative Procedures"
+sidebarTitle: "Admin procedures"
+---
+
+## Install
+
+Refer to the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) sample application for a primer on how to set up a cluster - use this as a starting point. Try the [Multinode testing and observability](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode) sample app to get familiar with interfaces and behavior.
+
+## Vespa start / stop / restart
+
+Start and stop all services on a node:
+
+`$ $VESPA_HOME/bin/`[`vespa-start-services`](/en/reference/operations/self-managed/tools#vespa-start-services)
+`$ $VESPA_HOME/bin/`[`vespa-stop-services`](/en/reference/operations/self-managed/tools#vespa-stop-services)
+
+Likewise, for the config server:
+
+`$ $VESPA_HOME/bin/`[`vespa-start-configserver`](/en/reference/operations/self-managed/tools#vespa-start-configserver)
+`$ $VESPA_HOME/bin/`[`vespa-stop-configserver`](/en/reference/operations/self-managed/tools#vespa-stop-configserver)
+
+There is no *restart* command; do a *stop* followed by a *start* to restart. Learn more about which processes / services are started at [Vespa startup](/en/operations/self-managed/config-sentinel#start-sequence), read the [start sequence](/en/operations/self-managed/configuration-server#start-sequence) and find training videos in the vespaengine [YouTube channel](https://www.youtube.com/@vespaai).
+
+Use [vespa-sentinel-cmd](/en/reference/operations/self-managed/tools#vespa-sentinel-cmd) to stop/start individual services.
+
+
+**Important:**
+
+Running *vespa-stop-services* on a content node will call [prepareRestart](/en/reference/operations/self-managed/tools#vespa-proton-cmd) to optimize restart time, and is the recommended way to stop Vespa on a node.
+
+
+See [multinode](/en/operations/self-managed/multinode-systems#aws-ec2) for *systemd* / *systemctl* examples. [Docker containers](/en/operations/self-managed/docker-containers) has relevant start/stop information, too.
+
+### Content node maintenance mode
+
+When stopping a content node *temporarily* (e.g. for a software upgrade), consider manually setting the node into [maintenance mode](/en/reference/api/cluster-v2#maintenance) *before* stopping the node to prevent automatic redistribution of data while the node is down. Maintenance mode must be manually removed once the node has come back online. See also: [cluster state](#cluster-state).
+
+Example of setting a node with [distribution key](/en/reference/applications/services/content#node) 42 into `maintenance` mode using [vespa-set-node-state](/en/reference/operations/self-managed/tools#vespa-set-node-state), additionally supplying a reason that will be recorded by the cluster controller:
+
+```bash
+ $ vespa-set-node-state --type storage --index 42 maintenance "rebooting for software upgrade"
+```
+
+After the node has come back online, clear maintenance mode by marking the node as `up`:
+
+```bash
+ $ vespa-set-node-state --type storage --index 42 up
+```
+
+Note that if the above commands are executed *locally* on the host running the services for node 42, `--index 42` can be omitted; `vespa-set-node-state` will use the distribution key of the local node if no `--index` has been explicitly specified.
+
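The maintenance sequence above can be wrapped so that maintenance mode is only cleared when the work in between succeeds. This is a sketch under assumptions: `with_maintenance` and the `RUN` dry-run variable are hypothetical names, and the wrapped command is expected to return only once the node is back online.

```shell
# with_maintenance INDEX REASON CMD...: hypothetical wrapper around the
# vespa-set-node-state calls shown above. Set RUN=echo for a dry run that
# prints the commands instead of executing them.
with_maintenance() {
  wm_index="$1"; wm_reason="$2"; shift 2
  ${RUN:-} vespa-set-node-state --type storage --index "$wm_index" maintenance "$wm_reason" || return 1
  if ! ${RUN:-} "$@"; then
    # Leave the node in maintenance so an operator can investigate.
    echo "command failed; node $wm_index left in maintenance" >&2
    return 1
  fi
  ${RUN:-} vespa-set-node-state --type storage --index "$wm_index" up
}
```

Running it first with `RUN=echo` shows the three commands in order without touching the cluster. Leaving the node in maintenance on failure is a deliberate choice here, since clearing it while the node is still down would let data redistribution start.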
+## System status
+
+- Use [vespa-config-status](/en/reference/operations/self-managed/tools#vespa-config-status) on a node in [hosts.xml](/en/reference/applications/hosts) to verify all services run with updated config
+- Make sure [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) is set and identical on all nodes in hosts.xml
+- Use the *cluster controller* status page (below) to track the status of search/storage nodes.
+- Check [logs](/en/reference/operations/log-files)
+- Use performance graphs, System Activity Report (*sar*) or [status pages](#status-pages) to track load
+- Use [query tracing](/en/reference/api/query#trace.level)
+- Disk and/or memory might be exhausted and block feeding - recover from [feed block](/en/writing/feed-block)
+
+## Status pages
+
+All Vespa services have status pages, for showing health, Vespa version, config, and metrics. Status pages are subject to change at any time - take care when automating. Procedure:
+
+
+
+**Find the port:** The status pages run on ports assigned by Vespa. To find status page ports, use [vespa-model-inspect](/en/reference/operations/self-managed/tools#vespa-model-inspect) to list the services run in the application.
+
+```bash
+$ vespa-model-inspect services
+```
+
+To find the status page port for a specific node for a specific service, pick the correct service and run:
+
+```bash
+$ vespa-model-inspect service [Options]
+```
+
+
+**Get the status and metrics:** *distributor*, *storagenode*, *searchnode* and *container-clustercontroller* are content services with status pages. These ports are tagged HTTP. The cluster controller has multiple ports tagged HTTP, where the port tagged STATE is the one with the status page. Try connecting to the root at the port, or /state/v1/metrics. The *distributor* and *storagenode* status pages are available at `/`:
+
+```bash
+$ vespa-model-inspect service searchnode
+
+ searchnode @ myhost.mydomain.com : search
+ search/search/cluster.search/0
+ tcp/myhost.mydomain.com:19110 (STATUS ADMIN RTC RPC)
+ tcp/myhost.mydomain.com:19111 (FS4)
+ tcp/myhost.mydomain.com:19112 (TEST HACK SRMP)
+ tcp/myhost.mydomain.com:19113 (ENGINES-PROVIDER RPC)
+ tcp/myhost.mydomain.com:19114 (HEALTH JSON HTTP)
+ $ curl http://myhost.mydomain.com:19114/state/v1/metrics
+ ...
+ $ vespa-model-inspect service distributor
+ distributor @ myhost.mydomain.com : content
+ search/distributor/0
+ tcp/myhost.mydomain.com:19116 (MESSAGING)
+ tcp/myhost.mydomain.com:19117 (STATUS RPC)
+ tcp/myhost.mydomain.com:19118 (STATE STATUS HTTP)
+ $ curl http://myhost.mydomain.com:19118/state/v1/metrics
+ ...
+ $ curl http://myhost.mydomain.com:19118/
+ ...
+```
+
+
+**Use the cluster controller status page**: A status page for the cluster controller is available at the status port at `http://hostname:port/clustercontroller-status/v1/[clustername]`. If *clustername* is not specified, the available clusters will be listed. The cluster controller leader status page will show if any nodes are operating with differing cluster state versions. It will also show how many data buckets are pending merging (document set reconciliation) because replicas are either missing or out of sync.
+
+`$` [`vespa-model-inspect`](/en/reference/operations/self-managed/tools#vespa-model-inspect) `service container-clustercontroller | grep HTTP`
+
+With multiple cluster controllers, look at the one with a "/0" suffix in its config ID; it is the preferred leader.
+
+The cluster state version is listed under the *SSV* table column. Divergence here usually points to host or networking issues.
+
+
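The port lookup in the procedure above can be scripted by filtering the `vespa-model-inspect` output for ports tagged HTTP. The sketch below assumes the output format shown in the example above (`tcp/host:port (TAGS)`); `http_ports` is a hypothetical helper name, and since status pages are subject to change, the pattern may need adjusting.

```shell
# http_ports: read `vespa-model-inspect service <name>` output on stdin and
# print host:port for each port whose tag list contains HTTP.
# Assumes lines of the form "tcp/host:port (TAG TAG ...)" as in the example.
http_ports() {
  sed -n 's|^[[:space:]]*tcp/\([^ ]*\) (.*HTTP.*).*|\1|p'
}
```

For example, `vespa-model-inspect service distributor | http_ports` would print the address to use with `curl http://<address>/state/v1/metrics`.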
+
+## Cluster state
+
+Cluster and node state information is available through the [/cluster/v2 API](/en/reference/api/cluster-v2). This API can also be used to set a *user state* for a node - alternatively use:
+
+
+
+
+
+
+
+Also see the cluster controller [status page](#status-pages).
+
+State is persisted in a ZooKeeper cluster; restarting or changing a cluster controller preserves:
+
+- Last cluster state version number, for new cluster controller handover at restarts
+- User states, set by operators - i.e. nodes manually set to down / maintenance
+
+If state data is lost, the cluster state is reset - see [cluster controller](/en/content/content-nodes#cluster-controller) for implications.
+
+## Cluster controller configuration
+
+It is recommended to run cluster controllers on the same hosts as [config servers](/en/operations/self-managed/configuration-server), as they share a ZooKeeper cluster for state, and deploying three nodes is best practice for both. See the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) sample app for a working example.
+
+To configure the cluster controller, use [services.xml](/en/reference/applications/services/content#cluster-controller) and/or add [configuration](https://github.com/vespa-engine/vespa/blob/master/configdefinitions/src/vespa/fleetcontroller.def) under the *services* element - example:
+
+```bash
+
+
+ 5000
+
+```
+
+A broken content node may end up with processes constantly restarting. It may die during initialization due to accessing corrupt files, or it may die when it starts receiving requests of a given type triggering a node local bug. This is bad for distributor nodes, as these restarts create constant ownership transfer between distributors, causing windows where buckets are unavailable.
+
+The cluster controller has functionality for detecting such nodes. If a node restarts in a way that is not detected as a controlled shutdown more than [max\_premature\_crashes](https://github.com/vespa-engine/vespa/blob/master/configdefinitions/src/vespa/fleetcontroller.def) times, the cluster controller will set the wanted state of this node to down.
+
+Detecting a controlled restart is currently a bit tricky. A controlled restart is typically initiated by sending a TERM signal to the process. Having no other indication, the content layer has to assume that all TERM signals come from controlled shutdowns. Thus, if the process keeps being killed by the kernel for using too much memory, this will look like controlled shutdowns to the content layer.
+
+## Monitor distance to ideal state
+
+Refer to the [distribution algorithm](/en/content/idealstate). Use distributor [status pages](#status-pages) to inspect state metrics, see [metrics](/en/content/content-nodes#metrics). `idealstate.merge_bucket.pending` is the best metric to track; it is 0 when the cluster is balanced, and a non-zero value indicates buckets out of sync.
+
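Waiting for the pending-merge metric to reach zero is a simple polling loop. The sketch below is generic: `wait_until_zero` is a hypothetical helper that repeatedly runs a command expected to print the current value. How that value is extracted from `/state/v1/metrics` is left to the caller, because the exact JSON layout and metric name should be checked against the actual output of a distributor.

```shell
# wait_until_zero INTERVAL CMD...: run CMD until it prints 0, sleeping
# INTERVAL seconds between attempts. Hypothetical helper; CMD must print a
# single number, e.g. the current idealstate.merge_bucket.pending value.
wait_until_zero() {
  wz_interval="$1"; shift
  while :; do
    wz_value=$("$@") || return 1
    if [ "$wz_value" = "0" ]; then
      return 0
    fi
    echo "pending: $wz_value"
    sleep "$wz_interval"
  done
}
```

A caller would supply its own function that fetches the metrics from one distributor and prints the value, then run something like `wait_until_zero 30 my_pending_value`, where `my_pending_value` is a placeholder name. With several distributors, the loop has to be repeated for each of them.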
+## Cluster configuration
+
+- Running `vespa prepare` will not change served configuration until `vespa activate` is run. `vespa prepare` will warn about all config changes that require restart.
+- Refer to [schemas](/en/basics/schemas) for how to add/change/remove these.
+- Refer to [elasticity](/en/content/elasticity) for how to add/remove capacity from a Vespa cluster, procedure below.
+- See [chained components](/en/applications/chaining) for how to add or remove searchers and document processors.
+- Refer to the [sizing examples](/en/operations/self-managed/sizing-examples) for changing from a *flat* to *grouped* content cluster.
+
+## Add or remove a content node
+
+
+
+**Node setup:** Prepare the node by installing software, set up the file systems/directories and set [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables). [Start](#vespa-start-stop-restart) the node.
+
+
+**Modify configuration:** Add/remove a [node](/en/reference/applications/services/content#node)\-element in *services.xml* and [hosts.xml](/en/reference/applications/hosts). Refer to [multinode install](/en/operations/self-managed/multinode-systems). Make sure the *distribution-key* is unique.
+
+
+**Deploy**: [Observe metrics](#monitor-distance-to-ideal-state) to track progress as the cluster redistributes documents. Use the [cluster controller](/en/content/content-nodes#cluster-controller) to monitor the state of the cluster.
+
+
+**Tune performance (optional):** Use [maxpendingidealstateoperations](https://github.com/vespa-engine/vespa/blob/master/storage/src/vespa/storage/config/stor-distributormanager.def) to tune concurrency of bucket merge operations from distributor nodes. Likewise, tune [merges](/en/reference/applications/services/content#merges) - concurrent merge operations per content node. The tradeoff is speed of bucket replication vs use of resources, which impacts the applications' regular load.
+
+
+**Finish:** The cluster is done redistributing when `idealstate.merge_bucket.pending` is zero on all distributors.
+
+
+
+Do not remove more than *redundancy*\-1 nodes at a time, to avoid data loss. Observe `idealstate.merge_bucket.pending` to know bucket replica status; when it is zero on all distributor nodes, it is safe to remove more nodes. If [grouped distribution](/en/content/elasticity#grouped-distribution) is used to control bucket replicas, remove all nodes in a group if the redundancy settings ensure replicas in each group.
+
+To increase bucket redundancy level before taking nodes out, [retire](/en/content/content-nodes) nodes. Again, track `idealstate.merge_bucket.pending` to know when done. Use the [/cluster/v2 API](/en/reference/api/cluster-v2) or [vespa-set-node-state](/en/reference/operations/self-managed/tools#vespa-set-node-state) to set a node to the *retired* state. You can set any number of nodes retired at the same time. The [cluster controller's](/en/content/content-nodes#cluster-controller) status page lists node states.
+
+An alternative to increasing cluster size is building a new cluster, then migrating documents to it. This is supported using [visiting](/en/writing/visiting).
+
+To *merge* two content clusters, add nodes to the cluster like above, considering:
+- [distribution-keys](/en/reference/applications/services/content#node) must be unique. Modify paths like *$VESPA\_HOME/var/db/vespa/search/mycluster/n3* before adding the node.
+- Set [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables), then start the node.
+
+## Topology change
+
+Read [changing topology first](/en/content/elasticity#changing-topology), and plan the sequence of steps.
+
+Make sure to not change the `distribution-key` for nodes in *services.xml*.
+
+It is not required to restart nodes as part of this process.
+
+## Add or remove services on a node
+
+It is possible to run multiple Vespa services on the same host. If changing the services on a given host, stop Vespa on the given host before running `vespa activate`. This is because the services are dynamically allocated port numbers, depending on what is running on the host. Consider if some of the services changed are used by services on other hosts. In that case, restart services on those hosts too. Procedure:
+
+
+
+Edit *services.xml* and *hosts.xml*
+
+
+Stop Vespa on the nodes that have changes
+
+
+Run `vespa prepare` and `vespa activate`
+
+
+Start Vespa on the nodes that have changes
+
+
+
+## Troubleshooting
+
+Also see the [FAQ](/en/learn/faq).
+
+|||
+| --- | --- |
+| **No endpoint** | Most problems with the quick start guides are due to Docker out of memory. Make sure at least 6G memory is allocated to Docker:
`$ docker info \| grep "Total Memory"` `or` `$ podman info \| grep "memTotal"`
OOM symptoms include:
`INFO: Problem with Handshake localhost:8080 ssl=false: localhost:8080 failed to respond`
The container is named *vespa* in the guides, for a shell do:
`$ docker exec -it vespa bash` |
+| **Log viewing** | Use [vespa-logfmt](/en/reference/operations/self-managed/tools#vespa-logfmt) to view the vespa log - example:
to commands that output json - or use [jq](https://stedolan.github.io/jq/). |
+| **Routing** | Vespa lets applications set up custom document processing / indexing, with different feed endpoints. Refer to [indexing](/en/writing/indexing) for how to configure this in *services.xml*.
[#13193](https://github.com/vespa-engine/vespa/issues/13193) has a summary of problems and solutions. |
+| **Tracing** | Use [tracelevel](/en/reference/api/document-v1#request-parameters) to dump the routes and hops for a write operation - example:
`$ curl -H Content-Type:application/json --data-binary @docs.json \` `$ENDPOINT/document/v1/mynamespace/doc/docid/1?tracelevel=4 \| jq .` `{` `"pathId": "/document/v1/mynamespace/doc/docid/1",` `"id": "id:mynamespace:doc::1",` `"trace": [` `{ "message": "[1623413878.905] Sending message (version 7.418.23) from client to ..." },` `{ "message": "[1623413878.906] Message (type 100004) received at 'default/container.0' ..." },` `{ "message": "[1623413878.907] Sending message (version 7.418.23) from 'default/container.0' ..." },` `{ "message": "[1623413878.907] Message (type 100004) received at 'default/container.0' ..." },` `{ "message": "[1623413878.909] Selecting route" },` `{ "message": "[1623413878.909] No cluster state cached. Sending to random distributor." }` |
+
+## Clean start mode
+
+There have been rare occasions where Vespa stored data that was internally inconsistent. For those circumstances, it is possible to start the node in a [validate\_and\_sanitize\_docstore](https://github.com/vespa-engine/vespa/blob/master/configdefinitions/src/vespa/proton.def) mode. This will do its best to clean up inconsistent data. However, detecting that this is required is not easy, so consult the Vespa Team first. For this approach to work, all nodes must be stopped before enabling this feature, to make sure the data is not redistributed.
+
+## Content cluster configuration
+
+|||
+| --- | --- |
+| **Availability vs resources** | Keeping index structures costs resources. Not all replicas of buckets are necessarily searchable, unless configured using [searchable-copies](/en/reference/applications/services/content#searchable-copies). As Vespa indexes buckets on-demand, the most cost-efficient setting is 1, if one can tolerate temporary coverage loss during node failures. |
+| **Data retention vs size** | When a document is removed, the document data is not immediately purged. Instead, *remove-entries* (tombstones of removed documents) are kept for a configurable amount of time. The default is two weeks, refer to [removed-db prune age](/en/reference/applications/services/content#removed-db-prune-age). This ensures that removed documents stay removed in a distributed system where nodes change state. Entries are removed periodically after expiry. Hence, if a node comes back up after being down for more than two weeks, removed documents are available again, unless the data on the node is wiped first. A larger *prune age* will grow the storage size, as this keeps documents and tombstones longer.
**Note:**
The backend does not store remove-entries for nonexistent documents. This is to prevent clients sending wrong document identifiers from filling a cluster with invalid remove-entries. A side effect is that if a problem has caused all replicas of a bucket to be unavailable, documents in this bucket cannot be marked removed until at least one replica is available again. Documents are written in new bucket replicas while the others are down - if these are removed, then older versions of these will not re-emerge, as the most recent change wins. |
+| **Transition time** | See [transition-time](/en/reference/applications/services/content#transition-time) for tradeoffs for how quickly nodes are set down vs. system stability. |
+| **Removing unstable nodes** | One can configure how many times a node is allowed to crash before it will automatically be removed. The crash count is reset if the node has been up or down continuously for more than the [stable state period](/en/reference/applications/services/content#stable-state-period). If the crash count exceeds [max premature crashes](/en/reference/applications/services/content#max-premature-crashes), the node will be disabled. Refer to [troubleshooting](#troubleshooting). |
+| **Minimal amount of nodes required to be available** | A cluster is typically sized to handle a given load. A given percentage of the cluster resources are required for normal operations, and the remainder is the available resources that can be used if some of the nodes are no longer usable. If the cluster loses enough nodes, it will be overloaded:
• Remaining nodes may run out of disk. This will likely fail many write operations, and if the disk is shared with the OS, it may also stop the node from functioning. • Partition queues will grow to maximum size. As queues are processed in FIFO order, operations are likely to get long latencies. • Many operations may time out while being processed, causing the operation to be resent, adding more load to the cluster. • When new nodes are added, they cannot serve requests before data is moved to the new nodes from the already overloaded nodes. Moving data puts even more load on the existing nodes, and as moving data is typically not high priority this may never actually happen.
To configure what the minimal cluster size is, use [min-distributor-up-ratio](/en/reference/applications/services/content#min-distributor-up-ratio) and [min-storage-up-ratio](/en/reference/applications/services/content#min-storage-up-ratio). |
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/build-install.mdx b/mintlify-docs/en/operations/self-managed/build-install.mdx
new file mode 100644
index 0000000000..412246667e
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/build-install.mdx
@@ -0,0 +1,82 @@
+---
+title: "Build / install Vespa"
+sidebarTitle: "Build and install"
+---
+
+To develop with Vespa, follow the [guide](https://github.com/vespa-engine/vespa#building) to set up a development environment on AlmaLinux 8 using Docker.
+
+Build Vespa Java artifacts with Java >= 17 and Maven >= 3.6.3. Once built, Vespa Java artifacts are ready to be used and one can build a Vespa application using the [bundle plugin](/en/applications/bundles#maven-bundle-plugin).
+
+```bash
+$ export MAVEN_OPTS="-Xms128m -Xmx1024m"
+$ ./bootstrap.sh java && mvn install
+```
+
+See [vespa.ai releases](/en/learn/releases).
+
+## Container images
+
+| Image | Description |
+| --- | --- |
+| [docker.io/vespaengine/vespa](https://hub.docker.com/r/vespaengine/vespa) [ghcr.io/vespa-engine/vespa](https://github.com/orgs/vespa-engine/packages/container/package/vespa) | Container image for running Vespa. |
+| [docker.io/vespaengine/vespa-build-almalinux-8](https://hub.docker.com/r/vespaengine/vespa-build-almalinux-8) | Container image for building Vespa on AlmaLinux 8. |
+| [docker.io/vespaengine/vespa-dev-almalinux-8](https://hub.docker.com/r/vespaengine/vespa-dev-almalinux-8) | Container image for development of Vespa on AlmaLinux 8. Used for incremental building and system testing. |
+
+## RPMs
+
+Dependency graph:
+
+
+
+
+
+Installing Vespa on AlmaLinux 8:
+
+```bash
+$ dnf config-manager \
+ --add-repo https://raw.githubusercontent.com/vespa-engine/vespa/master/dist/vespa-engine.repo
+$ dnf config-manager --enable powertools
+$ dnf install -y epel-release
+$ dnf install -y vespa
+```
+
+Package repository hosting is graciously provided by [Cloudsmith](https://cloudsmith.com), a fully hosted, cloud-native and universal package management solution.
+
+
+**Important:**
+
+Please note that the retention of released RPMs in the repository is limited to the latest 50 releases. Use the Docker images (above) for installations of specific versions older than this. Any problems with released RPM packages will be fixed in subsequent releases. Please [report any issues](https://vespa.ai/support/), and troubleshoot using the [install example](/en/operations/self-managed/multinode-systems#aws-ec2-singlenode).
+
+
+Refer to [vespa.spec](https://github.com/vespa-engine/vespa/blob/master/dist/vespa.spec). Build RPMs for a given Vespa version X.Y.Z:
+
+```bash
+$ git clone https://github.com/vespa-engine/vespa
+$ cd vespa
+$ git checkout vX.Y.Z
+$ docker run --rm -ti -v $(pwd):/wd:Z -w /wd \
+ docker.io/vespaengine/vespa-build-almalinux-8:latest \
+ make -f .copr/Makefile rpms outdir=/wd
+$ ls *.rpm | grep -v debug
+vespa-8.691.19-1.el8.src.rpm
+vespa-8.691.19-1.el8.x86_64.rpm
+vespa-ann-benchmark-8.691.19-1.el8.x86_64.rpm
+vespa-base-8.691.19-1.el8.x86_64.rpm
+vespa-base-libs-8.691.19-1.el8.x86_64.rpm
+vespa-clients-8.691.19-1.el8.x86_64.rpm
+vespa-config-model-fat-8.691.19-1.el8.x86_64.rpm
+vespa-jars-8.691.19-1.el8.x86_64.rpm
+vespa-libs-8.691.19-1.el8.x86_64.rpm
+vespa-malloc-8.691.19-1.el8.x86_64.rpm
+vespa-node-admin-8.691.19-1.el8.x86_64.rpm
+vespa-tools-8.691.19-1.el8.x86_64.rpm
+```
+
+Find most utilities in the vespa-x.y.z\*.rpm - other RPMs:
+
+| RPM | Description |
+| --- | --- |
+| **vespa-tools** | Tools accessing Vespa endpoints for query or document operations:
• [vespa-destination](/en/reference/operations/self-managed/tools#vespa-destination) • [vespa-fbench](/en/reference/operations/tools#vespa-fbench) • [vespa-feeder](/en/reference/operations/self-managed/tools#vespa-feeder) • [vespa-get](/en/reference/operations/self-managed/tools#vespa-get) • [vespa-query-profile-dump-tool](/en/reference/operations/tools#vespa-query-profile-dump-tool) • [vespa-stat](/en/reference/operations/self-managed/tools#vespa-stat) • [vespa-summary-benchmark](/en/reference/operations/self-managed/tools#vespa-summary-benchmark) • [vespa-visit](/en/reference/operations/self-managed/tools#vespa-visit) • [vespa-visit-target](/en/reference/operations/self-managed/tools#vespa-visit-target) |
+| **vespa-malloc** | Vespa has its own memory allocator, *vespa-malloc* - refer to */opt/vespa/etc/vespamalloc.conf* |
+| **vespa-clients** | *vespa-feed-client.jar* - see [vespa-feed-client](/en/clients/vespa-feed-client) |
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/config-proxy.mdx b/mintlify-docs/en/operations/self-managed/config-proxy.mdx
new file mode 100644
index 0000000000..6bc533c108
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/config-proxy.mdx
@@ -0,0 +1,85 @@
+---
+title: "Configuration proxy"
+sidebarTitle: "Config Proxy"
+---
+
+Read [application packages](/en/basics/applications) for an overview of the cloud config system. The *config proxy* runs on every Vespa node. It has a set of config sources, defined in [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables).
+
+The config proxy will act as a proxy for config clients on the same machine, so that all clients can ask for config on *localhost:19090*. The *config source* that the config proxy uses is set in [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) and consists of one or more config sources (the addresses of [config servers](/en/operations/self-managed/configuration-server)).
+
+The proxy has a memory cache that is used to serve configs if it is possible. In default mode, the proxy will have an outstanding request to the config server that will return when the config has changed (a new generation of config). This means that every time config changes on the config server, the proxy will get a response, update its cache and respond to all its clients with the changed config.
+
+The config proxy has two modes:
+
+| Mode | Description |
+| --- | --- |
+| default | Gets config from server and stores in memory cache. The config proxy will always be started in *default* mode. Serves from cache if possible. Always uses a config source. If restarted, it will lose all configs that were cached in memory. |
+| memorycache | Serves config from memory cache only. Never uses a config source. A restart will lose all cached configs. Setting the mode to *memorycache* will make all applications on the node work as before (given that they have previously been running and requested config), since the config proxy will serve config from cache and work without connection to any config server. Applications on this node will not work if the config proxy stops, is restarted or crashes. |
+
+Use [vespa-configproxy-cmd](/en/reference/operations/self-managed/tools#vespa-configproxy-cmd) to inspect cached configs, mode, config sources etc. There are also commands to change some of the settings. Run the command as:
+
+```bash
+$ vespa-configproxy-cmd -m
+```
+
+to see all possible commands.
+
+## Detaching from config servers
+
+```bash
+$ vespa-configproxy-cmd -m setmode memorycache
+```
+
+## Inspecting config
+
+To inspect the configuration for a service, in this example a searchnode (proton) instance, do:
+
+
+
+Find the active config generation used by the service, using [/state/v1/config](/en/reference/api/state-v1#state-v1-config) - example for *http://localhost:19110/state/v1/config*, here the generation is 2:
+
+```json
+{
+ "config": {
+ "generation": 2,
+ "proton": {
+ "generation": 2
+ },
+ "proton.documentdb.music": {
+ "generation": 2
+ }
+ }
+}
+```
+
+
+Find the relevant *config definition name*, *config id* and *config generation* using [vespa-configproxy-cmd](/en/reference/operations/self-managed/tools#vespa-configproxy-cmd) - e.g.:
+
+```bash
+$ vespa-configproxy-cmd | grep proton
+
+vespa.config.search.core.proton,music/search/cluster.music/0,2,MD5:40087d6195cedb1840721b55eb333735,XXHASH64:43829e79cea8e714
+```
+
+`vespa.config.search.core.proton` is the *config definition name* for this particular config, `music/search/cluster.music/0` is the *config id* used by the proton service instance on this node and `2` is the active config generation. This means, the service is using the correct config generation as it is matching the /state/v1/config response (a restart can be required for some config changes).
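
To compare generations in a script, the comma-separated line can be split into its fields. This is a minimal sketch: `proxy_field` is a hypothetical helper, not a Vespa tool, and the field order (config definition name, config id, generation) is taken from the sample line above, assuming no field contains a comma.

```bash
# Sample line copied from the vespa-configproxy-cmd output above.
line='vespa.config.search.core.proton,music/search/cluster.music/0,2,MD5:40087d6195cedb1840721b55eb333735,XXHASH64:43829e79cea8e714'

# Hypothetical helper: print one comma-separated field (1-based) of a line.
# Usage: proxy_field <line> <field-number>
proxy_field() {
    printf '%s\n' "$1" | cut -d, -f"$2"
}

echo "config definition name: $(proxy_field "$line" 1)"
echo "config id:              $(proxy_field "$line" 2)"
echo "config generation:      $(proxy_field "$line" 3)"
```

The third field can then be compared with the `generation` value returned by */state/v1/config*.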
+
+
+Get the generated config using [vespa-get-config](/en/reference/operations/self-managed/tools#vespa-get-config) - e.g.:
+
+```bash
+$ vespa-get-config -n vespa.config.search.core.proton -i music/search/cluster.music/0
+
+basedir "/opt/vespa/var/db/vespa/search/cluster.music/n0"
+rpcport 19106
+httpport 19110
+...
+```
+
+
+**Important:**
+
+Omitting `-i` will return the default configuration, meaning not generated for the active service instance.
+
+
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/config-sentinel.mdx b/mintlify-docs/en/operations/self-managed/config-sentinel.mdx
new file mode 100644
index 0000000000..b6f8e3ce67
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/config-sentinel.mdx
@@ -0,0 +1,145 @@
+---
+title: "Config sentinel"
+---
+
+The config sentinel starts and stops services - and restarts failed services unless they are manually stopped. All nodes in a Vespa system have at least these running processes:
+
+| Process | Description |
+| --- | --- |
+| [config-proxy](/en/operations/self-managed/config-proxy) | Proxies config requests between Vespa applications and the configserver node. All configuration is cached locally so that this node can maintain its current configuration, even if the configserver shuts down. |
+| **config-sentinel** | Registers itself with the *config-proxy* and subscribes to and enforces node configuration, meaning the configuration of what services should be run locally, and with what parameters. |
+| [vespa-logd](/en/reference/operations/log-files#logd) | Monitors *$VESPA\_HOME/logs/vespa/vespa.log*, which is used by all other services, and relays everything to the [log-server](/en/reference/operations/log-files#log-server). |
+| [metrics-proxy](/en/operations/self-managed/monitoring#metrics-proxy) | Provides APIs for metrics access to all nodes and services. |
+
+
+
+
+
+Start sequence:
+
+
+
+*config server(s)* are started and application config is deployed to them - see [config server operations](/en/operations/self-managed/configuration-server).
+
+
+*config-proxy* is started. The environment variables [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) and [VESPA\_CONFIGSERVER\_RPC\_PORT](/en/operations/self-managed/files-processes-and-ports#environment-variables) are used to connect to the [config-server(s)](/en/operations/self-managed/configuration-server). It will retry all config servers in case some are down.
+
+
+*config-sentinel* is started, and subscribes to node configuration (i.e. a service list) from *config-proxy* using its hostname as the [config id](/en/applications/configapi-dev#config-id). See [Node and network setup](/en/operations/self-managed/node-setup) for details about how the hostname is detected and how to override it. The config for the config-sentinel (the service list) lists the processes to be started, along with the *config id* to assign to each, typically the logical name of that service instance.
+
+
+*config-proxy* subscribes to node configuration from *config-server*, caches it, and returns the result to *config-sentinel*
+
+
+*config-sentinel* starts the services given in the node configuration, with the config id as argument. See example output below, like *id="search/qrservers/qrserver.0"*. *logd* and *metrics-proxy* are always started, regardless of configuration. Each service:
+
+ a. Subscribes to configuration from *config-proxy*.
+
+ b. *config-proxy* subscribes to configuration from *config-server*, caches it and returns result to the service.
+
+ c. The service runs according to its configuration, logging to *`$VESPA_HOME/logs/vespa/vespa.log`*. The processes instantiate internal components, each assigned the same or another config id, and instantiating further components.
+
+Also see [cluster startup](#cluster-startup) for a minimum nodes-up start setting.
+
+
+
+When new config is deployed to *config-servers* they propagate the changed configuration to nodes subscribing to it. In turn, these nodes reconfigure themselves accordingly.
+
+## User interface
+
+The config sentinel runs an RPC service which can be used to list, start and stop the services supposed to run on that node. This can be useful for testing and debugging. Use [vespa-sentinel-cmd](/en/reference/operations/self-managed/tools#vespa-sentinel-cmd) to trigger these actions. Example output from `vespa-sentinel-cmd list`:
+
+```bash
+vespa-sentinel-cmd 'sentinel.ls' OK.
+container state=RUNNING mode=AUTO pid=27993 exitstatus=0 id="default/container.0"
+container-clustercontroller state=RUNNING mode=AUTO pid=27997 exitstatus=0 id="admin/cluster-controllers/0"
+distributor state=RUNNING mode=AUTO pid=27996 exitstatus=0 id="search/distributor/0"
+logd state=RUNNING mode=AUTO pid=5751 exitstatus=0 id="hosts/r6-3/logd"
+logserver state=RUNNING mode=AUTO pid=27994 exitstatus=0 id="admin/logserver"
+searchnode state=RUNNING mode=AUTO pid=27995 exitstatus=0 id="search/search/cluster.search/0"
+slobrok state=RUNNING mode=AUTO pid=28000 exitstatus=0 id="admin/slobrok.0"
+```
+
+To learn more about the processes and services, see [files and processes](/en/operations/self-managed/files-processes-and-ports). Use [vespa-model-inspect host *hostname*](/en/reference/operations/self-managed/tools#vespa-model-inspect) to list services running on a node.
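
Output in the list format above is easy to filter in scripts. The sketch below prints the services whose state is not `RUNNING`. `services_not_running` is a hypothetical helper, not a Vespa tool, and the `state=OTHER` value in the sample input is a placeholder made up for illustration, not a documented sentinel state.

```bash
# Hypothetical filter: reads `vespa-sentinel-cmd list` output on stdin and
# prints the name of every service whose state is not RUNNING.
# Assumes the "name state=... mode=... pid=..." layout from the sample above.
services_not_running() {
    awk '$2 ~ /^state=/ && $2 != "state=RUNNING" { print $1 }'
}

# Sample input; the searchnode line uses a made-up placeholder state.
services_not_running <<'EOF'
vespa-sentinel-cmd 'sentinel.ls' OK.
container state=RUNNING mode=AUTO pid=27993 exitstatus=0 id="default/container.0"
logd state=RUNNING mode=AUTO pid=5751 exitstatus=0 id="hosts/r6-3/logd"
searchnode state=OTHER mode=AUTO pid=27995 exitstatus=0 id="search/search/cluster.search/0"
EOF
```

On a node, the same filter could be fed from the real command, as in `vespa-sentinel-cmd list | services_not_running`.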
+
+## Cluster startup
+
+The config sentinel will not start services on a node unless it has connectivity to a minimum of other nodes, default 50%. Find an example of this feature in the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA#start-the-admin-server) example application. Example configuration:
+
+```xml
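+<config name="cloud.config.sentinel">
+    <connectivity>
+        <minOkPercent>20</minOkPercent>
+        <maxBadCount>1</maxBadCount>
+    </connectivity>
+</config>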
+```
+
+Example: `minOkPercent 10` means that services will be started only if at least 10% of nodes are up. If there are 11 nodes in the application, the first node started will not start its services (1 of 11 is about 9%) - when the second node is started, services will be started on both (2 of 11 is about 18%).
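
The threshold arithmetic for the 11-node example can be checked with a small shell sketch. `services_may_start` is a hypothetical helper, not part of Vespa, and it only mirrors the percentage rule as described here; how the sentinel counts nodes internally is not covered.

```bash
# Hypothetical helper: succeeds if the share of reachable nodes is at least
# min_ok_percent, per the rule described above.
# Usage: services_may_start <nodes_ok> <nodes_total> <min_ok_percent>
services_may_start() {
    # Compare nodes_ok/nodes_total >= min_ok_percent/100 using integers only.
    [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]
}

# 11 nodes with minOkPercent 10: one node is below the threshold, two are above.
services_may_start 1 11 10 && echo "1 of 11: start" || echo "1 of 11: wait"
services_may_start 2 11 10 && echo "2 of 11: start" || echo "2 of 11: wait"
```

Cross-multiplying avoids fractional percentages, which plain shell arithmetic cannot represent.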
+
+`maxBadCount` is for connectivity checks where the other node is up, but we still do not have proper two-way connectivity. Normally, one-way connectivity means network configuration is broken and needs looking into, so this may be set low (1 or even 0 are the recommended values). If there are temporary problems (in the example below, a non-responding DNS, which leads to various issues at startup), the config sentinel will loop and retry, so the service startup will just be slightly delayed.
+
+Example log:
+
+```bash
+[2021-06-15 14:33:25] EVENT : starting/1 name="sbin/vespa-config-sentinel -c hosts/le40808.ostk (pid 867)"
+[2021-06-15 14:33:25] EVENT : started/1 name="config-sentinel"
+[2021-06-15 14:33:25] CONFIG : Sentinel got 4 service elements [tenant(footest), application(bartest), instance(default)] for config generation 1001
+[2021-06-15 14:33:25] CONFIG : Booting sentinel 'hosts/le40808.ostk' with [stateserver port 19098] and [rpc port 19097]
+[2021-06-15 14:33:25] CONFIG : listening on port 19097
+[2021-06-15 14:33:25] CONFIG : Sentinel got model info [version 7.420.21] for 35 hosts [config generation 1001]
+[2021-06-15 14:33:25] CONFIG : connectivity.maxBadCount = 3
+[2021-06-15 14:33:25] CONFIG : connectivity.minOkPercent = 40
+[2021-06-15 14:33:28] INFO : Connectivity check details: 2086533.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le01287.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23256.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23267.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23297.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23312.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23317.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le23319.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:28] INFO : Connectivity check details: le30550.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le30553.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:28] INFO : Connectivity check details: le30556.ostk -> unreachable from me, but up
+[2021-06-15 14:33:28] INFO : Connectivity check details: le30560.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le30567.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40387.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40389.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40808.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40817.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40833.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40834.ostk -> unreachable from me, but up
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40841.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40858.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40860.ostk -> unreachable from me, but up
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40863.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40873.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40892.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40900.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40905.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: le40914.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: sm02318.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: sm02324.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: sm02340.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: zt40672.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: zt40712.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: zt40728.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] INFO : Connectivity check details: zt41329.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:28] WARNING : 8 of 35 nodes up but with network connectivity problems (max is 3)
+[2021-06-15 14:33:28] WARNING : Bad network connectivity (try 1)
+[2021-06-15 14:33:30] WARNING : slow resolve time: 'le30556.ostk' -> '1234:5678:90:123::abcd' (5.00528 s)
+[2021-06-15 14:33:30] WARNING : slow resolve time: 'le40834.ostk' -> '1234:5678:90:456::efab' (5.00527 s)
+[2021-06-15 14:33:30] WARNING : slow resolve time: 'le40860.ostk' -> '1234:5678:90:789::cdef' (5.00459 s)
+[2021-06-15 14:33:31] INFO : Connectivity check details: le23312.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le23319.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le30553.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le30556.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le40834.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le40841.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Connectivity check details: le40860.ostk -> connect OK, but reverse check FAILED
+[2021-06-15 14:33:31] INFO : Connectivity check details: le40863.ostk -> OK: both ways connectivity verified
+[2021-06-15 14:33:31] INFO : Enough connectivity checks OK, proceeding with service startup
+[2021-06-15 14:33:31] EVENT : starting/1 name="searchnode"
+...
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/configuration-server.mdx b/mintlify-docs/en/operations/self-managed/configuration-server.mdx
new file mode 100644
index 0000000000..6ffd84f6fc
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/configuration-server.mdx
@@ -0,0 +1,272 @@
+---
+title: "Configuration Servers"
+---
+
+Vespa Configuration Servers host the endpoint where application packages are deployed, and serve generated configuration to all services - see the [overview](/en/learn/overview) and [application packages](/en/basics/applications) for details. I.e., one cannot configure Vespa without config servers, and services cannot run without them.
+
+It is useful to understand the [Vespa start sequence](/en/operations/self-managed/config-sentinel). Refer to the sample applications [multinode](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode) and [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) for practical examples of multi-configserver configuration.
+
+Vespa configuration is set up using one or more configuration servers (config servers). A config server uses [Apache ZooKeeper](https://zookeeper.apache.org/) as a distributed data storage for the configuration system. In addition, each node runs a config proxy to cache configuration data - find an overview at [services start](/en/operations/self-managed/config-sentinel).
+
+## Status and config generation
+
+Check the health of a running config server using (replace localhost with hostname):
+
+```bash
+$ curl http://localhost:19071/state/v1/health
+```
+
+Note that the config server is a service itself, and runs with file-based configuration. The application packages deployed will not change the config server - the config server serves this configuration to all other Vespa nodes. This will hence always be config generation 0:
+
+```bash
+$ curl http://localhost:19071/state/v1/config
+```
+
+Details in [start-configserver](https://github.com/vespa-engine/vespa/blob/master/configserver/src/main/sh/start-configserver).
+
+## Redundancy
+
+The config servers are defined in [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables), [services.xml](/en/reference/applications/services/services) and [hosts.xml](/en/reference/applications/hosts):
+
+
+```bash
+$ VESPA_CONFIGSERVERS=myserver0.mydomain.com,myserver1.mydomain.com,myserver2.mydomain.com
+```
+
+```xml
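+<services>
+    <admin version="2.0">
+        <configservers>
+            <configserver hostalias="admin0" />
+            <configserver hostalias="admin1" />
+            <configserver hostalias="admin2" />
+        </configservers>
+    </admin>
+</services>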
+```
+```xml
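+<hosts>
+    <host name="myserver0.mydomain.com">
+        <alias>admin0</alias>
+    </host>
+    <host name="myserver1.mydomain.com">
+        <alias>admin1</alias>
+    </host>
+    <host name="myserver2.mydomain.com">
+        <alias>admin2</alias>
+    </host>
+</hosts>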
+```
+
+[VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) must be set on all nodes. This is a comma- or whitespace-separated list with the hostname of all config servers, like *myhost1.mydomain.com,myhost2.mydomain.com,myhost3.mydomain.com*.
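
As an illustration of the list format, the sketch below splits such a value into one hostname per line, accepting commas and whitespace as separators. `list_configservers` is a hypothetical helper written for this example; it is not how Vespa itself parses the variable.

```bash
# Hypothetical helper: split a VESPA_CONFIGSERVERS-style value (comma- or
# whitespace-separated) into one hostname per line, dropping empty entries.
list_configservers() {
    printf '%s\n' "$1" | tr ', \t' '\n\n\n' | sed '/^$/d'
}

# Mixed separators are accepted; the hostnames are the examples used above.
servers="myhost1.mydomain.com,myhost2.mydomain.com myhost3.mydomain.com"
list_configservers "$servers"
echo "count: $(list_configservers "$servers" | wc -l | tr -d ' ')"
```

Printing the count is a quick way to confirm that every node carries the same number of config servers in its setting.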
+
+When there are multiple config servers, the [config proxy](/en/operations/self-managed/config-proxy) will pick a config server randomly (to achieve load balancing between config servers). The config proxy is fault-tolerant and will switch to another config server (if there is more than one) if the one it is using becomes unavailable or there is an error in the configuration it receives.
+
+For the system to tolerate *n* failures, [ZooKeeper](#zookeeper) by design requires using *(2\*n)+1* nodes. Consequently, only an odd number of nodes is useful, so a minimum of 3 nodes is needed to have a fault-tolerant config system.
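
The majority arithmetic can be tabulated with a short sketch. The function names are hypothetical; the formulas are *(2\*n)+1* from above and its inverse, rounded down.

```bash
# Hypothetical helpers illustrating majority-quorum arithmetic.

# Servers needed to tolerate n failures: (2*n)+1
servers_needed() {
    echo $(( 2 * $1 + 1 ))
}

# Failures tolerated by a cluster of the given size: (size-1)/2, rounded down
failures_tolerated() {
    echo $(( ($1 - 1) / 2 ))
}

for size in 1 2 3 4 5; do
    echo "$size config server(s): tolerates $(failures_tolerated "$size") failure(s)"
done
```

Sizes 2 and 4 tolerate no more failures than sizes 1 and 3, which is why an even number of config servers adds no fault tolerance.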
+
+Even when using just one config server, the application will work if the server goes down (but deploying application changes will not work). Since the *config proxy* runs on every node and caches configs, it will continue to serve config to the services on that node. However, restarting a node when config servers are unavailable means that services on the node will be unable to start since the cache will be destroyed when restarting the config proxy.
+
+Refer to the [admin model reference](/en/reference/applications/services/admin#configservers) for more details on *services.xml*.
+
+## Start sequence
+
+To bootstrap a Vespa application instance, the high-level steps are:
+
+- Start config servers
+- Deploy config
+- Start Vespa nodes
+
+[multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) is a great guide on how to start a multinode Vespa application instance - try this first. Detailed steps for config server startup:
+
+
+
+Set [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables) on all nodes, using fully qualified hostnames and the same value on all nodes, including the config servers.
+
+
+Start the config server on the nodes configured in *services/hosts.xml*. Make sure the startup is successful by inspecting [/state/v1/health](/en/reference/api/state-v1#state-v1-health), default on port 19071:
+
+```bash
+$ curl http://localhost:19071/state/v1/health
+```
+
+```json
+{
+ "time" : 1651147368066,
+ "status" : {
+ "code" : "up"
+ },
+ "metrics" : {
+ "snapshot" : {
+ "from" : 1.651147308063E9,
+ "to" : 1.651147367996E9
+ }
+ }
+}
+```
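
When scripting this check, the status code can be pulled out of the response. The sketch below is sed-based and assumes the response layout shown above; `health_code` is a hypothetical helper, and a JSON-aware tool would be more robust.

```bash
# Hypothetical helper: extract status.code from a /state/v1/health response
# read on stdin. Whitespace is stripped first so the pattern can match the
# compact form of the JSON.
health_code() {
    tr -d ' \n\t' | sed -n 's/.*"status":{"code":"\([^"]*\)".*/\1/p'
}

# With a running config server on the default port, this could be fed from:
#   curl -s http://localhost:19071/state/v1/health | health_code
# Here a shortened copy of the sample response is used instead.
printf '%s\n' '{ "time" : 1651147368066, "status" : { "code" : "up" } }' | health_code
```

An empty result means the pattern did not match, for example because no response was received.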
+
+If there is no response on the health API, two things can have happened:
+ - The config server process did not start - inspect logs using `vespa-logfmt`, or check *`$VESPA_HOME/logs/vespa/vespa.log`*, normally */opt/vespa/logs/vespa/vespa.log*.
+ - The config server process started, and is waiting for [ZooKeeper quorum](#zookeeper):
+
+ ```bash
+ $ vespa-logfmt -S configserver
+ ```
+
+ ```bash
+ configserver Container.com.yahoo.vespa.zookeeper.ZooKeeperRunner Starting ZooKeeper server with /opt/vespa/var/zookeeper/conf/zookeeper.cfg. Trying to establish ZooKeeper quorum (members: [node0.vespanet, node1.vespanet, node2.vespanet], attempt 1)
+ configserver Container.com.yahoo.container.handler.threadpool.ContainerThreadpoolImpl Threadpool 'default-pool': min=12, max=600, queue=0
+ configserver Container.com.yahoo.vespa.config.server.tenant.TenantRepository Adding tenant 'default', created 2022-04-28T13:02:24.182Z. Bootstrapping in PT0.175576S
+ configserver Container.com.yahoo.vespa.config.server.rpc.RpcServer Rpc server will listen on port 19070
+ configserver Container.com.yahoo.container.jdisc.state.StateMonitor Changing health status code from 'initializing' to 'up'
+ configserver Container.com.yahoo.jdisc.http.server.jetty.Janitor Creating janitor executor with 2 threads
+ configserver Container.com.yahoo.jdisc.http.server.jetty.JettyHttpServer Threadpool size: min=22, max=22
+ configserver Container.org.eclipse.jetty.server.Server jetty-9.4.46.v20220331; built: 2022-03-31T16:38:08.030Z; git: bc17a0369a11ecf40bb92c839b9ef0a8ac50ea18; jvm 11.0.14.1+1-
+ configserver Container.org.eclipse.jetty.server.handler.ContextHandler Started o.e.j.s.ServletContextHandler@341c0dfc{19071,/,null,AVAILABLE}
+ configserver Container.org.eclipse.jetty.server.AbstractConnector Started configserver@3cd6d147{HTTP/1.1, (http/1.1, h2c)}{0.0.0.0:19071}
+ configserver Container.org.eclipse.jetty.server.Server Started @21955ms
+ configserver Container.com.yahoo.container.jdisc.ConfiguredApplication Switching to the latest deployed set of configurations and components. Application config generation: 0
+ ```
+
+It will hang until quorum is reached, and the second highlighted log line is emitted. Root causes for missing quorum can be:
+ - No connectivity between the config servers. ZooKeeper logs the members like `(members: [node0.vespanet, node1.vespanet, node2.vespanet], attempt 1)`. Verify that the nodes running config server can reach each other on port 2181.
+ - Missing connectivity can be caused by wrong network configuration. [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA) uses a Docker network; make sure there are no underscores in the hostnames.
+
+
+Once all config servers return `up` on *state/v1/health*, an application package can be deployed. If deployment fails, first verify the config server health. If the config servers are up and deployment still fails, it is most likely an issue with the application package - if so, refer to [application packages](/en/basics/applications).
+
+
+A successful deployment logs the following, for the *prepare* and *activate* steps:
+
+```bash
+Container.com.yahoo.vespa.config.server.ApplicationRepository Session 2 prepared successfully.
+Container.com.yahoo.vespa.config.server.deploy.Deployment Session 2 activated successfully using no host provisioner. Config generation 2. File references: [file '9cfc8dc57f415c72']
+Container.com.yahoo.vespa.config.server.session.SessionRepository Session activated: 2
+```
+
+
+Start the Vespa nodes. Technically, they can be started at any time. When troubleshooting, it is easier to make sure the config servers are started successfully, and deployment was successful - before starting any other nodes. Refer to the [Vespa start sequence](/en/operations/self-managed/config-sentinel) and [Vespa start / stop / restart](/en/operations/self-managed/admin-procedures#vespa-start-stop-restart).
+
+
+
+Make sure to look for logs on all config servers when debugging.
+
+## Scaling up
+
+Add a config server node for increased fault tolerance or when replacing a node. Read up on [ZooKeeper configuration](#zookeeper-configuration) before continuing. Although it is *possible* to add more than one config server at a time, doing it one by one is recommended, to keep the ZooKeeper quorum intact.
+
+Due to the ZooKeeper majority vote, use one or three config servers.
+
+
+
+Install *vespa* on new config server node.
+
+
+Append the config server node's hostname to VESPA\_CONFIGSERVERS on all nodes, then (re)start all config servers in sequence to update the ZooKeeper config. By appending, the current config server nodes keep their current ZooKeeper index. Restart the existing config server(s) first. When starting up, the config server logs which servers are configured to the Vespa log.
+
+
+Update *services.xml* and *hosts.xml* with the new set of config servers, then *vespa prepare* and *vespa activate*.
+
+
+Restart other nodes one by one to start using the new config servers. This will let the vespa nodes use the updated set of config servers.
+
+
+
+The config servers will automatically redistribute the application data to new nodes.
+
+## Scaling down
+
+This is the inverse of scaling up, and the procedure is the same. Remove config servers from the end of *VESPA\_CONFIGSERVERS*, and here one can remove two nodes in one go, if going from three to one.
+
+## Replacing nodes
+
+- Make sure to replace only one node at a time.
+- If you have only one config server, first scale up with a new node, then scale down by removing the old node.
+- If you have three or more config servers, replace one of the old nodes in VESPA\_CONFIGSERVERS with the new one instead of appending it; otherwise, follow the same procedure as in [Scaling up](#scaling-up). Repeat for each node you want to replace.
+
+## Tools
+
+Tools to access config:
+
+
+
+
+
+
+
+## ZooKeeper
+
+[ZooKeeper](https://zookeeper.apache.org/) handles data consistency across multiple config servers. The config server Java application runs a ZooKeeper server, embedded with an RPC frontend that the other nodes use. ZooKeeper stores data internally in *nodes* that can have *sub-nodes*, similar to a file system.
+
+At [vespa prepare](/en/reference/clients/vespa-cli#vespa-prepare), the application's files, along with global configurations, are stored in ZooKeeper. The application data is stored under */config/v2/tenants/default/sessions/\[sessionid\]/userapp*. At [vespa activate](/en/reference/clients/vespa-cli#vespa-activate), the newest application is activated *live* by writing the session id into */config/v2/tenants/default/applications/default:default:default*. It is at that point the other nodes get configured.
+
+Use *vespa-zkcli* to inspect state, replacing `sessionid` with the actual session id:
+
+```bash
+$ vespa-zkcli ls /config/v2/tenants/default/sessions/sessionid/userapp
+$ vespa-zkcli get /config/v2/tenants/default/sessions/sessionid/userapp/services.xml
+```
+
+The ZooKeeper server logs to *`$VESPA_HOME/logs/vespa/zookeeper.configserver.0.log`* (files are rotated with a sequence number).
+
+### ZooKeeper configuration
+
+The members of the ZooKeeper cluster are generated from the contents of [VESPA\_CONFIGSERVERS](/en/operations/self-managed/files-processes-and-ports#environment-variables). *`$VESPA_HOME/var/zookeeper/conf/zookeeper.cfg`* is written when (re)starting the config server. Hence, all config servers must be restarted when `VESPA_CONFIGSERVERS` changes.
+
+The order of the nodes is used to create indexes in *zookeeper.cfg*, so do not change the node order.
+
+### ZooKeeper recovery
+
+If the config server(s) experience data corruption, for instance after a hardware failure, use the following recovery procedure. One example of such a scenario is when *`$VESPA_HOME/logs/vespa/zookeeper.configserver.0.log`* contains *java.io.IOException: Negative seek offset at java.io.RandomAccessFile.seek(Native Method)*, which indicates that ZooKeeper has not been able to recover after a full disk. There is no need to restart Vespa on other nodes during the procedure:
+
+1. [vespa-stop-configserver](/en/reference/operations/self-managed/tools#vespa-stop-configserver)
+2. [vespa-configserver-remove-state](/en/reference/operations/self-managed/tools#vespa-configserver-remove-state)
+3. [vespa-start-configserver](/en/reference/operations/self-managed/tools#vespa-start-configserver)
+4. [vespa](/en/clients/vespa-cli#deployment) prepare ``
+5. [vespa](/en/clients/vespa-cli#deployment) activate
+
+This procedure completely cleans out ZooKeeper's internal data snapshots and deploys from scratch.
+
+Note that by default the [cluster controller](/en/content/content-nodes#cluster-controller) that maintains the state of the content cluster uses the same shared ZooKeeper instance, so the content cluster state is also reset when removing state. Manually set state will be lost (e.g. a node with user state *down*). It is possible to run cluster controllers in standalone ZooKeeper mode - see [standalone-zookeeper](/en/reference/applications/services/admin#cluster-controllers).
+
+### ZooKeeper barrier timeout
+
+If the config servers are heavily loaded, or the applications being deployed are big, the internals of the server may time out when synchronizing with the other servers during deploy. To work around this, increase the timeout by setting [VESPA\_CONFIGSERVER\_ZOOKEEPER\_BARRIER\_TIMEOUT](/en/operations/self-managed/files-processes-and-ports#environment-variables) to 600 (seconds) or higher, and restart the config servers.
+
+## Configuration
+
+To access config from a node not running the config system (e.g. doing feeding via the Document API), use the environment variable [VESPA\_CONFIG\_SOURCES](/en/operations/self-managed/files-processes-and-ports#environment-variables):
+
+```bash
+$ export VESPA_CONFIG_SOURCES="myadmin0.mydomain.com:19071,myadmin1.mydomain.com:19071"
+```
+
+Alternatively, for Java programs, use the system property *configsources* and set it programmatically or on the command line with the *\-D* option to Java. The syntax for the value is the same as for *VESPA\_CONFIG\_SOURCES*.
+
+### System requirements
+
+The JVM that the config server runs under has a minimum heap size of 128 MB and a maximum heap size of 2 GB (which can be changed with a [setting](/en/performance/container-tuning#config-server-and-config-proxy)). The config server writes a transaction log that is regularly purged of old items, so little disk space is required. Note that running on a server with a lot of disk I/O will adversely affect performance and is not recommended.
+
+### Ports
+
+The config server RPC port can be changed by setting [VESPA\_CONFIGSERVER\_RPC\_PORT](/en/operations/self-managed/files-processes-and-ports#environment-variables) on all nodes in the system.
+
+Changing the HTTP port requires changing the port in *`$VESPA_HOME/conf/configserver-app/services.xml`*:
+
+```xml
+
+
+
+```
+
+When deploying, use the *\-p* option if the port has been changed from the default.
+
+## Troubleshooting
+
+| Problem | Description |
+| --- | --- |
+| **Health checks** | Verify that a config server is up and running using [/state/v1/health](/en/reference/api/state-v1#state-v1-health), see [start sequence](#start-sequence). Status code is `up` if the server is up and has finished bootstrapping.<br/>Alternatively, use [http://localhost:19071/status.html](http://localhost:19071/status.html), which returns response code 200 if the server is up and has finished bootstrapping.<br/>Metrics are found at [/state/v1/metrics](/en/reference/api/state-v1#state-v1-metrics). Use [vespa-model-inspect](/en/reference/operations/self-managed/tools#vespa-model-inspect) to find the host and port number; the port is 19071 by default. |
+| **Consistency** | When having more than one config server, consistency between the servers is crucial. [http://localhost:19071/status](http://localhost:19071/status) can be used to check that settings for config servers are the same for all servers.<br/>[vespa-config-status](/en/reference/operations/self-managed/tools#vespa-config-status) can be used to check config on nodes.<br/>[http://localhost:19071/application/v2/tenant/default/application/default](http://localhost:19071/application/v2/tenant/default/application/default) displays the active config generation, which should be the same on all servers and the same as in the response from running [vespa deploy](/en/clients/vespa-cli#deployment). |
+| **Bad Node** | If running with more than one config server and one of these goes down or has a hardware failure, the cluster will still work and serve config as usual (clients will switch to one of the good servers). It is not necessary to remove a bad server from the configuration.<br/>Deploying applications will take longer, as [vespa deploy](/en/clients/vespa-cli#deployment) will not be able to complete a deployment on all servers when one of them is down. If this is troublesome, lower the [barrier timeout](#zookeeper-barrier-timeout) (the default value is 120 seconds).<br/>Note also that if you have not configured [cluster controllers](/en/reference/applications/services/admin#cluster-controller) explicitly, these will run on the config server nodes, and their operation might be affected. This is another reason for not trying to manually remove a bad node from the config server setup. |
+| **Stuck filedistribution** | The config system distributes binary files (such as jar bundle files) using [file-distribution](/en/reference/applications/deployment#file-distribution) - use [vespa-status-filedistribution](/en/reference/operations/self-managed/tools#vespa-status-filedistribution) to see detailed status if it gets stuck. |
+| **Memory** | Insufficient memory on the host / in the container running the config server will cause startup or deploy / configuration problems - see [Docker containers](/en/operations/self-managed/docker-containers). |
+| **ZooKeeper** | The following can be caused by a full disk on the config server, or clocks out of sync:<br/>`at com.yahoo.vespa.zookeeper.ZooKeeperRunner.startServer(ZooKeeperRunner.java:92)` `Caused by: java.io.IOException: The accepted epoch, 10 is less than the current epoch, 48`<br/>Users have reported that "Copying the currentEpoch to acceptedEpoch fixed the problem". |
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/container.mdx b/mintlify-docs/en/operations/self-managed/container.mdx
new file mode 100644
index 0000000000..faf7d4aea3
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/container.mdx
@@ -0,0 +1,95 @@
+---
+title: "Container"
+description: "This is the Container service operational guide."
+---
+
+
+
+
+
+Note that "container" is an overloaded concept in Vespa - in this guide it refers to service instance nodes in blue.
+
+Refer to [container metrics](/en/operations/metrics#container-metrics).
+
+## Endpoints
+
+Container services host the query and feed endpoints - examples:
+
+- [album-recommendation](https://github.com/vespa-engine/sample-apps/blob/master/album-recommendation/app/services.xml) configures *both* query and feed in the same container cluster (i.e. service):
+ ```xml
+
+
+
+
+
+
+
+ ```
+- [multinode-HA](https://github.com/vespa-engine/sample-apps/blob/master/examples/operations/multinode-HA/services.xml) configures query and feed in separate container clusters (i.e. services):
+ ```xml
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ ```
+
+Observe that `` and `` are located in separate clusters in the second example, and endpoints are therefore different.
+
+
+**Important:**
+
+The first thing to validate when troubleshooting query errors is that the endpoint is correct, i.e. that query requests hit the correct nodes. A query will be written to the [access log](/en/operations/access-logging) on one of the nodes in the container cluster.
+
+
+## Inspecting Vespa Java Services using JConsole
+
+Determine the state of each running Java Vespa service using JConsole. JConsole is distributed with the Java Development Kit (JDK). Start JConsole:
+
+```bash
+$ jconsole :
+```
+
+where the host and port determine which service to attach to. For security purposes, the JConsole tool cannot directly attach to Vespa services from external machines.
+
+### Connecting to a Vespa instance
+
+To attach a JConsole to a Vespa service running on another host, create a tunnel from the JConsole host to the Vespa service host. This can for example be done by setting up two SSH tunnels as follows:
+
+```bash
+$ ssh -N -L:localhost: &
+$ ssh -N -L:localhost: &
+```
+
+where port1 and port2 are determined by the type of service (see below). A JConsole can then be attached to the service as follows:
+
+```bash
+$ jconsole localhost:
+```
+
+Port numbers:
+
+| Service | Port 1 | Port 2 |
+| --- | --- | --- |
+| QRS | 19015 | 19016 |
+| Docproc | 19123 | 19124 |
+
+Updated port information can be found by running:
+
+`$` [`vespa-model-inspect`](/en/reference/operations/self-managed/tools#vespa-model-inspect) `service `
+
+where the resulting RMIREGISTRY and JMX lines determine port1 and port2, respectively.
+
+### Examining thread states
+
+The state of each container is available in JConsole by opening the Threads tab and selecting the thread of interest in the threads list. Threads of interest include *search*, *connector*, *closer*, *transport* and *acceptor* (the latter four are used for backend communications).
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/content-node-recovery.mdx b/mintlify-docs/en/operations/self-managed/content-node-recovery.mdx
new file mode 100644
index 0000000000..cad08f9f1d
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/content-node-recovery.mdx
@@ -0,0 +1,35 @@
+---
+title: "Content node recovery"
+---
+
+In exceptional cases, one or more content nodes may end up with corrupted data, causing them to fail to restart. Possible reasons are
+
+- the application configuring a higher memory or disk limit such that the node is allowed to accept more data than it can manage,
+- hardware failure, or
+- a bug in Vespa.
+
+Normally a corrupted node can just be wiped of all data or removed from the cluster, but when this happens to multiple nodes simultaneously, or when redundancy 1 is used, it may be necessary to recover the node(s) to avoid data loss. This document explains the procedure.
+
+## Recovery steps
+
+On each of the nodes needing recovery:
+
+
+
+[Stop services](/en/operations/self-managed/admin-procedures#vespa-start-%2F-stop-%2F-restart) on the node if running.
+
+
+Repair the node:
+ - If the node cannot start due to needing more memory than available: Increase the memory available to the node, or, if that is not possible, stop all non-essential processes on the node using [`vespa-sentinel-cmd`](/en/reference/operations/self-managed/tools#vespa-sentinel-cmd) `list` and `vespa-sentinel-cmd stop [name]`, and (if necessary) start only the content node process using `vespa-sentinel-cmd start searchnode`. When the node is successfully started, issue delete operations or increase the cluster size to reduce the amount of data on the node if necessary.
+ - If the node cannot start due to needing more disk than available: Increase the disk available to the node, or if not possible delete non-essential data such as logs and cached packages. When the node is successfully started, issue delete operations or increase the cluster size to reduce the amount of data on the node if necessary.
+ - If the node cannot start for any other reason, repair the data manually as needed. This procedure will depend on the specific nature of the data corruption.
+
+
+[Start services](/en/operations/self-managed/admin-procedures#vespa-start-%2F-stop-%2F-restart) on the node.
+
+
+Verify that the node is fully up before moving on to the next node. Use the following to evaluate whether the next node can be stopped:
+ - Check if a node is up using [/state/v1/health](/en/reference/api/state-v1#state-v1-health).
+ - Check the `vds.idealstate.merge_bucket.pending.average` metric on content nodes. When 0, all buckets are in sync - see [example](/en/operations/metrics).
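+
+The health check above can be scripted. The sketch below is an illustration under stated assumptions: `health_is_up` is a hypothetical helper name, and it assumes the `/state/v1/health` response is JSON containing a `"code": "up"` field when the node is up.
+
+```bash
+# Hypothetical helper: read a /state/v1/health response on stdin and
+# succeed only if it reports the status code "up".
+# Assumes the response contains a JSON field of the form "code": "up".
+health_is_up() {
+  grep -Eq '"code"[[:space:]]*:[[:space:]]*"up"'
+}
+```
+
+A possible use is `curl -s "http://$HOST:$PORT/state/v1/health" | health_is_up`, where `HOST` and `PORT` are placeholders for the node being recovered.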
+
+
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/cpu-support.mdx b/mintlify-docs/en/operations/self-managed/cpu-support.mdx
new file mode 100644
index 0000000000..cf99ef3d9c
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/cpu-support.mdx
@@ -0,0 +1,25 @@
+---
+title: "CPU Support"
+---
+
+For maximum performance, the current version of Vespa for x86\_64 is compiled only for [Haswell (2013)](https://en.wikipedia.org/wiki/Haswell_\(microarchitecture\)) or later CPUs. If trying to run on an older CPU, you will likely see error messages like the following:
+
+```bash
+Problem running program /opt/vespa/bin/vespa-runserver => died with signal: illegal instruction (you probably have an older CPU than required)
+```
+
+or in older versions of Vespa, something like
+
+```bash
+/usr/local/bin/start-container.sh: line 67: 10 Illegal instruction /opt/vespa/bin/vespa-start-configserver
+```
+
+If you would like to run Vespa on an older CPU, we provide a [generic x86 container image](https://hub.docker.com/r/vespaengine/vespa-generic-intel-x86_64/). This image is slower, receives less testing than the regular image, and is less frequently updated.
+
+**To start a Vespa Docker container using this image:**
+
+```bash
+$ docker run --detach --name vespa --hostname vespa-container \
+ --publish 8080:8080 --publish 19071:19071 \
+ vespaengine/vespa-generic-intel-x86_64
+```
\ No newline at end of file
diff --git a/mintlify-docs/en/operations/self-managed/docker-containers.mdx b/mintlify-docs/en/operations/self-managed/docker-containers.mdx
new file mode 100644
index 0000000000..0cb54ba470
--- /dev/null
+++ b/mintlify-docs/en/operations/self-managed/docker-containers.mdx
@@ -0,0 +1,246 @@
+---
+title: "Docker containers"
+---
+
+This document describes tuning and adaptations for running Vespa Docker containers, both for developer use on a laptop and in production.
+
+## Mounting persistent volumes
+
+The [quick start](/en/basics/deploy-an-application-local) and [AWS ECS multinode](/en/operations/self-managed/multinode-systems#aws-ecs) guides show how to run Vespa in Docker containers. In these examples, all the data is stored inside the container, so the data is lost if the container is deleted. When running Vespa inside Docker containers in production, add volume mappings to the parent host for the following directories to persist data and logs:
+
+- /opt/vespa/var
+- /opt/vespa/logs
+
+```bash
+$ mkdir -p /tmp/vespa/var; export VESPA_VAR_STORAGE=/tmp/vespa/var
+$ mkdir -p /tmp/vespa/logs; export VESPA_LOG_STORAGE=/tmp/vespa/logs
+$ docker run --detach --name vespa --hostname vespa-container \
+ --volume $VESPA_VAR_STORAGE:/opt/vespa/var \
+ --volume $VESPA_LOG_STORAGE:/opt/vespa/logs \
+ --publish 8080:8080 \
+ vespaengine/vespa
+```
+
+## Start Vespa container with Vespa user
+
+You can start the container directly as the *vespa* user. The *vespa* user and group within the container are configured with user id *1000* and group id *1000*. The vespa user and group must be the owner of the */opt/vespa/var* and */opt/vespa/logs* volumes that are mounted in the container for Vespa to start. This is required for Vespa to create the required directories and files within those directories.
+
+The start script will check that the correct owner uid and gid are set and fail if the wrong user or group is set as the owner.
+
+When using an isolated user namespace for the Vespa container, you must set the uid and gid of the directories on the host to the subordinate uid and gid, depending on your mapping. See the [Docker documentation](https://docs.docker.com/engine/security/userns-remap/) for more details.
+
+```bash
+$ mkdir -p /tmp/vespa/var; export VESPA_VAR_STORAGE=/tmp/vespa/var
+$ mkdir -p /tmp/vespa/logs; export VESPA_LOG_STORAGE=/tmp/vespa/logs
+$ sudo chown -R 1000:1000 $VESPA_VAR_STORAGE $VESPA_LOG_STORAGE
+$ docker run --detach --name vespa --user vespa:vespa --hostname vespa-container \
+ --volume $VESPA_VAR_STORAGE:/opt/vespa/var \
+ --volume $VESPA_LOG_STORAGE:/opt/vespa/logs \
+ --publish 8080:8080 \
+ vespaengine/vespa
+```
+
+## System limits
+
+When Vespa starts inside Docker containers, the startup scripts will set [system limits](/en/operations/self-managed/files-processes-and-ports#vespa-system-limits). Make sure that the environment starting the Docker engine is set up in such a way that these limits can be set inside the containers.
+
+For a CentOS/RHEL base host, Docker is usually started by [systemd](https://www.freedesktop.org/software/systemd/man/systemd.exec.html). In this case, `LimitNOFILE`, `LimitNPROC` and `LimitCORE` should be set to meet the minimum requirements in [system limits](/en/operations/self-managed/files-processes-and-ports#vespa-system-limits).
+
+In general, when using Docker or Podman to run Vespa, the `--ulimit` option should be used to set limits according to [system limits](/en/operations/self-managed/files-processes-and-ports#vespa-system-limits). The `--pids-limit` should be set to unlimited (`-1` for Docker and `0` for Podman).
+
+## Transparent Huge Pages
+
+Vespa performance improves significantly by enabling [Transparent Huge Pages (THP)](https://www.kernel.org/doc/html/latest/admin-guide/mm/transhuge.html), especially for memory-intensive applications with large dense tensors with concurrent query and write workloads.
+
+One application improved query p99 latency from 950 ms to 150 ms during concurrent query and write by enabling THP. Using THP is even more important when running in virtualized environments like AWS and GCP due to nested page tables.
+
+When running Vespa using the container image, *THP* settings must be set on the base host OS (Linux). The recommended settings are:
+
+```bash
+$ echo 1 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
+$ echo always > /sys/kernel/mm/transparent_hugepage/enabled
+$ echo never > /sys/kernel/mm/transparent_hugepage/defrag
+```
+
+To verify that the setting is active, check that *AnonHugePages* is non-zero. In this case, about 75 GB has been allocated using AnonHugePages:
+
+```bash
+$ cat /proc/meminfo |grep AnonHuge
+
+ AnonHugePages: 75986944 kB
+```
+
+Note that the Vespa container needs to be restarted after modifying the base host OS settings to make the changes effective. Vespa uses `MADV_HUGEPAGE` for memory allocations done by the [content node process (proton)](/en/content/proton).
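+
+The *AnonHugePages* check above can be wrapped in a small function for use in scripts. This is a minimal sketch: `anon_hugepages_kb` is a hypothetical name, and the optional file argument exists only so the function can be tried against a saved copy of `/proc/meminfo`.
+
+```bash
+# Hypothetical helper: print the AnonHugePages value in kB from a
+# meminfo-style file (defaults to /proc/meminfo).
+# Returns non-zero if the field is missing or the value is 0.
+anon_hugepages_kb() {
+  meminfo="${1:-/proc/meminfo}"
+  kb=$(awk '/^AnonHugePages:/ {print $2}' "$meminfo")
+  [ -n "$kb" ] || return 1
+  printf '%s\n' "$kb"
+  [ "$kb" -gt 0 ]
+}
+```
+
+Running `anon_hugepages_kb` on the base host after restarting the Vespa container gives a quick indication of whether huge pages are being allocated.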
+
+## Controlling which services to start
+
+The Docker image *vespaengine/vespa*'s [start script](https://github.com/vespa-engine/docker-image/blob/master/include/start-container.sh) takes a parameter that controls which services are started inside the container.
+
+Starting a *configserver* container:
+
+```bash
+$ docker run \
+ --env VESPA_CONFIGSERVERS= \
+ vespaengine/vespa configserver
+```
+
+Starting a *services* container (configserver will not be started):
+
+```bash
+$ docker run \
+ --env VESPA_CONFIGSERVERS= \
+ vespaengine/vespa services
+```
+
+Starting a container with *both configserver and services*:
+
+```bash
+$ docker run