From 0dd1df866536efa4db2ac8015a876c38c1334093 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Fri, 23 Jan 2026 11:22:30 -0500
Subject: [PATCH 1/5] [DOC-9267] First draft of sizing for Search

---
 modules/install/pages/sizing-general.adoc | 270 ++++++++++++++++++++++
 1 file changed, 270 insertions(+)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 006e88ec30..9aceb653af 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -534,6 +534,276 @@ NOTE: The storage engine used in the sizing calculation corresponds to the stora
 | Nitro
 |===
 
+== Sizing Search Service Nodes
+
+Search Service nodes manage Search indexes and serve your Search queries.
+
+Basic Search indexes are lists of all the unique terms that appear in the documents on your cluster.
+For each term, the Search index also contains a list of the documents where that term appears.
+These lists inside a Search index can cause the Search index to be larger than your original dataset.
+
+Specific options in your Search index configuration can also increase its size, such as *Store*, *Include in _all field*, and *Include Term Vectors*.
+For more information about what options can increase index size and storage requirements, see xref:search:child-field-options-reference.adoc[].
+
+In general, when sizing nodes for a deployment that uses the Search Service, you need to determine the number of vCPUs and the amount of RAM that will support your workload.
+
+=== Calculating Node Requirements
+
+To size the Search Service nodes in your cluster, you need the following information:
+
+* The number of documents you need to include in your Search index or indexes.
+* The average size of the documents that need to be included in your Search index, in KB.
+* A sample document or documents that show the structure of your data.
+* The specific queries per second (QPS) target you need from the Search Service.
+
+You should also consider your replication and recovery needs.
+
+With all this information, you can work with Couchbase Support to get the most accurate sizing for your Search workload.
+
+If you want to try sizing your cluster yourself, you can use some of the following guidelines to size your <<search-vcpus>> and <<search-ram>>, using averages and estimates from other Search deployments.
+
+For the best results with Search node sizing, contact Couchbase Support.
+
+[#search-vcpus]
+==== vCPUs
+
+A heavy QPS workload requires more vCPUs.
+If your workload requires a high QPS, this is the most important part of your sizing for the Search Service.
+
+For example, if your target QPS is 30,000 and your queries are less complex, use a value of 200 and get a total number of 150 required vCPUs:
+
+[stem]
+++++
+30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
+++++
+
+If your queries were more complex, but the QPS target was the same, the calculation changes to use a value of 150 and a result of 200 vCPUs:
+
+[stem]
+++++
+30,000_{\mathrm{QPS}} \div 150 = 200_{\mathrm{vCPUs}}
+++++
+
+You can then divide your result by the vCPU configuration you want to use to calculate the number of nodes you need:
+
+[stem]
+++++
+\lceil 150_{\mathrm{vCPUs}} \div 32_{\mathrm{vCPUs\ Per\ Node}} \rceil = 5_{\mathrm{Nodes}}
+++++
+
+In this case, with a target QPS of 30,000 with less complex queries and 32 vCPUs per node, you would need 5 nodes in your deployment.
+
+[#search-ram]
+==== RAM
+
+In general, you should allocate 65% of the RAM on your node to the Search Service.
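+For example, as a rough illustration of this guideline, a node provisioned with 128{nbsp}GB of RAM (an assumed example size, not a sizing recommendation) would give the Search Service about 83{nbsp}GB:
+
+[stem]
+++++
+128_{\mathrm{GB}} \times 0.65 \approx 83_{\mathrm{GB}}
+++++
+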
+A Search node needs more RAM if you:
+
+* Are xref:search:child-field-options-reference.adoc#store[storing field values] or xref:search:child-field-options-reference.adoc#doc-values[using doc values].
+* Have xref:search:customize-index.adoc#analyzers[analyzed text fields].
+* Want to use more complex queries than xref:search:search-request-params.adoc#analytic-queries[keyword matches].
+
+To calculate a more precise estimate for the required RAM for the Search Service, you need to:
+
+. <<index-bytes>>
+. <<index-gb>>
+. <<add-replicas>>
+. <<total-ram>>
+
+[#index-bytes]
+===== Calculate Your Per Doc Index Bytes
+
+Use the following formula first to calculate the number of bytes per document in your Search index:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Per Doc Index Bytes} = ( ( W \cdot 1024 \cdot \text{f_text} \cdot \text{m_text} ) + ( W \cdot 1024 \cdot \text{f_kw} \cdot \text{m_kw} ) + B ) \times (1 + D)
+\end{split}
+\end{equation}
+++++
+
+You need to know the following variables for the formula:
+
+[cols="1,2"]
+|====
+|Variable |Description
+
+| stem:[W]
+| The average size of your JSON documents, in KB.
+
+| stem:[{\text{f_text}}]
+a| A measure of the analyzed text from your JSON documents.
+
+You can omit this value if you're using primarily keyword searches and do not have longer-form text fields that require an xref:search:customize-index.adoc#analyzers[analyzer].
+
+You can use the following value ranges based on the kind of analyzed text you have in your index:
+
+* *Product descriptions, titles and body snippets, support ticket descriptions*: `0.10-0.20`
+* *Long note fields, email bodies, articles, knowledge-base content*: `0.20-0.40`
+* *Log files, message streams, event payloads with large message fields*: `0.40-0.70`
+
+If you're not sure about the size and complexity of the text fields in your documents and how they match to the example ranges, use a value of `0.25` to get a rough estimate.
+
+To get the most accurate values for stem:[{\text{f_text}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{m_text}}]
+a| A multiplier for calculating how the bytes in your documents translate into your Search index for analyzed text fields.
+
+For a good planning range, try a value between `0.12-0.35`.
+
+To get the most accurate values for stem:[{\text{m_text}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{f_kw}}]
+a| A measure of the keywords from your JSON documents.
+
+For a good planning range for a keyword search use case or a filter-heavy workload, use a value of `0.10`.
+
+To get the most accurate values for stem:[{\text{f_kw}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{m_kw}}]
+a| A multiplier for calculating how the bytes in your documents translate into your Search index for keywords.
+
+For a good planning range, try a value between `0.10-0.18`.
+
+To get the most accurate values for stem:[{\text{m_kw}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[B]
+a| The number of bytes needed for storing field values for your documents, if xref:search:child-field-options-reference.adoc#store[store] is enabled for a child field mapping.
+
+If you're not storing any field values in your Search index, set this value to `0`.
+
+| stem:[D]
+a| The additional overhead from adding xref:search:child-field-options-reference.adoc#doc-values[doc values] to your Search index from a child field mapping.
+
+Use a value from `0-1`.
+If you're not using doc values in your Search index, set this value to `0`.
+|====
+
+[#index-gb]
+===== Calculate Your Total Index GB
+
+After you have calculated your stem:[{\text{Per Doc Index Bytes}}], calculate the total GB needed for your Search index, where:
+
+* stem:[N] is the total number of JSON documents you want to include in your Search index.
+* stem:[S] is a measure of your system overhead.
+For a rough estimate, use a value of `0.10`.
+
+Use the following formula:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Index GB} =
+\frac{(N \times \text{Per Doc Index Bytes})}{10^{9}} \times (1 + S)
+\end{split}
+\end{equation}
+++++
+
+[#add-replicas]
+===== Add Your Replication Factor
+
+If you want to add replicas to your Search index, you need to factor that into your stem:[{\text{Total Index GB}}].
+
+Use the following formula:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Index GB With Replicas} = \text{Total Index GB} \times (\text{Number Of Replicas} + 1)
+\end{split}
+\end{equation}
+++++
+
+[#total-ram]
+===== Calculate Your Total Required RAM
+
+Then, you can calculate the total RAM required on a node for your use case:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Node RAM} = \text{Total Index GB With Replicas} \times 0.65
+\end{split}
+\end{equation}
+++++
+
+[#search-examples]
+=== Search Node Sizing Examples
+
+You'll get the most accurate results by going through sizing with Couchbase Support, but you can use the following examples for a sizing estimate for a Search workload:
+
+* <<high-qps>>
+* <<low-qps>>
+
+[#high-qps]
+==== High QPS and Keyword-Only Searches
+
+The following sizing scenario assumes a high QPS target, a CPU-bound configuration, and a keyword-only workload for a compact Search index.
+
+This example uses the following variables:
+
+|====
+|Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
+
+|194,000,000
+|258.05
+|87,000
+|0.10
+|2 (1 replica + 1)
+
+|====
+
+Based on these variables, the required vCPUs could be either:
+
+* stem:[290], using a value of 300 in the vCPU calculation.
+* stem:[435], using a value of 200 in the vCPU calculation.
+
+The Total Index GB With Replicas is stem:[110.13 GB].
+
+The vCPUs matter the most in this workload.
+
+To get a higher QPS for each vCPU, you could try a configuration of:
+
+. 10 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 5 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+
+Otherwise, for a lower QPS for each vCPU, you could try a configuration of:
+
+. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+
+[#low-qps]
+==== Lower QPS with Higher Storage and a Larger Index
+
+The following sizing scenario assumes a comparatively lower QPS target, a storage-bound configuration, and a larger Search index.
+
+This example uses the following variables:
+
+[cols="1,2,1,1,1"]
+|====
+|Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
+
+|500,000,000
+|344.86 (more faceting, sorting, and more complex queries)
+|12,000
+|0.10
+|2 (1 replica + 1)
+
+|====
+
+Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of 300 in the calculation.
+If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
+
+The Total Index GB With Replicas is stem:[379.34 \text{GB}].
+Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
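+
+As a rough check, the stem:[{\text{Total Index GB With Replicas}}] value follows from the earlier formulas and the variables in the table above; the small difference from 379.34 GB comes from rounding in the intermediate steps:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\frac{500{,}000{,}000 \times 344.86}{10^{9}} \times (1 + 0.10) \times 2 \approx 379.35 \text{ GB}
+\end{split}
+\end{equation}
+++++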
+
+As a result, the best configuration should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
+
 == Sizing Query Service Nodes
 
 A node that runs the Query Service executes queries for your application needs.

From 14ba599a7c909ac1419e44a5f9817a9a318cbb1b Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 11:49:39 -0500
Subject: [PATCH 2/5] [DOC-9267] Some grammar and language tweaks

---
 modules/install/pages/sizing-general.adoc | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 9aceb653af..b945158382 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -570,13 +570,15 @@ For the best results with Search node sizing, contact Couchbase Support.
 A heavy QPS workload requires more vCPUs.
 If your workload requires a high QPS, this is the most important part of your sizing for the Search Service.
 
-For example, if your target QPS is 30,000 and your queries are less complex, use a value of 200 and get a total number of 150 required vCPUs:
+For example, if your target QPS is 30,000 and your queries are less complex, divide your total QPS target by 200 to get your required vCPUs:
 
 [stem]
 ++++
 30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
 ++++
 
+The formula gives a target of 150 vCPUs for a mid-range workload with less complex queries.
+
 If your queries were more complex, but the QPS target was the same, the calculation changes to use a value of 150 and a result of 200 vCPUs:
 
 [stem]
@@ -591,12 +593,12 @@ You can then divide your result by the vCPU configuration you want to use to cal
 \lceil 150_{\mathrm{vCPUs}} \div 32_{\mathrm{vCPUs\ Per\ Node}} \rceil = 5_{\mathrm{Nodes}}
 ++++
 
-In this case, with a target QPS of 30,000 with less complex queries and 32 vCPUs per node, you would need 5 nodes in your deployment.
+Based on the formula, if you wanted to use nodes with 32 vCPUs and reach a target QPS of 30,000 with less complex queries, you would need 5 nodes in your deployment.
 
 [#search-ram]
 ==== RAM
 
-In general, you should allocate 65% of the RAM on your node to the Search Service.
+In general, you should allocate 65% of the RAM to the Search Service on each node in your cluster where you want to run it.
 A Search node needs more RAM if you:
 
 * Are xref:search:child-field-options-reference.adoc#store[storing field values] or xref:search:child-field-options-reference.adoc#doc-values[using doc values].
@@ -651,7 +653,7 @@ To get the most accurate values for stem:[{\text{f_text}}] and your RAM sizing c
 | stem:[{\text{m_text}}]
 a| A multiplier for calculating how the bytes in your documents translate into your Search index for analyzed text fields.
 
-For a good planning range, try a value between `0.12-0.35`.
+For a good planning range, try a value between `0.12-0.35`, increasing based on the complexity of your analyzed text fields.
 
 To get the most accurate values for stem:[{\text{m_text}}] and your RAM sizing calculations, contact Couchbase Support.
 
@@ -721,7 +723,7 @@ Use the following formula:
 [#total-ram]
 ===== Calculate Your Total Required RAM
 
-Then, you can calculate the total RAM required on a node for your use case:
+Then, you can calculate the total RAM required on a node for your use case with the following formula:
 
 [latexmath]
 ++++
@@ -802,7 +804,7 @@ If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
 
 The Total Index GB With Replicas is stem:[379.34 \text{GB}].
 Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
 
-As a result, the best configuration should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
+As a result, the best configuration for this workload should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
 
 == Sizing Query Service Nodes

From 5ba9d7417befab7018f5603b4838b31417d91246 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 12:01:24 -0500
Subject: [PATCH 3/5] [DOC-9267] More grammar and formatting tweaks.

---
 modules/install/pages/sizing-general.adoc | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index b945158382..49a96e6a75 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -574,7 +574,7 @@ For example, if your target QPS is 30,000 and your queries are less complex, div
 
 [stem]
 ++++
-30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
+30,000_{\mathrm{QPS}} \div 200_{\mathrm{Mid}} = 150_{\mathrm{vCPUs}}
 ++++
 
 The formula gives a target of 150 vCPUs for a mid-range workload with less complex queries.
@@ -583,7 +583,7 @@ If your queries were more complex, but the QPS target was the same, the calculat
 
 [stem]
 ++++
-30,000_{\mathrm{QPS}} \div 150 = 200_{\mathrm{vCPUs}}
+30,000_{\mathrm{QPS}} \div 150_{\mathrm{Low}} = 200_{\mathrm{vCPUs}}
 ++++
 
 You can then divide your result by the vCPU configuration you want to use to calculate the number of nodes you need:
@@ -762,10 +762,10 @@ This example uses the following variables:
 
 Based on these variables, the required vCPUs could be either:
 
-* stem:[290], using a value of 300 in the vCPU calculation.
-* stem:[435], using a value of 200 in the vCPU calculation.
+* stem:[290], using a value of `300` in the vCPU calculation.
+* stem:[435], using a value of `200` in the vCPU calculation.
 
-The Total Index GB With Replicas is stem:[110.13 GB].
+The Total Index GB With Replicas is stem:[110.13 \text{ GB}].
 
 The vCPUs matter the most in this workload.
 
@@ -791,18 +791,20 @@ This example uses the following variables:
 |Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
 
 |500,000,000
-|344.86 (more faceting, sorting, and more complex queries)
+|344.86 (for faceting, sorting, and more complex queries)
 |12,000
 |0.10
 |2 (1 replica + 1)
 
 |====
 
-Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of 300 in the calculation.
-If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
+Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of `300` in the calculation.
+
+If you wanted to use nodes with 32 vCPUs, you would need 2 nodes.
 
-The Total Index GB With Replicas is stem:[379.34 \text{GB}].
-Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
+The Total Index GB With Replicas is stem:[379.34 \text{ GB}].
+
+Each of the 2 nodes would need stem:[379.34 \text{ GB} \times 0.65 \div 2 = 123.28 \text{ GB}] of RAM.
 
 As a result, the best configuration for this workload should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
From 88b85670b907384b85dab8d6cc549da05799d7a4 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 13:59:39 -0500
Subject: [PATCH 4/5] [DOC-9267] Add preview config

---
 preview/DOC-9267-fts-sizing.yml | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 preview/DOC-9267-fts-sizing.yml

diff --git a/preview/DOC-9267-fts-sizing.yml b/preview/DOC-9267-fts-sizing.yml
new file mode 100644
index 0000000000..50d91a757e
--- /dev/null
+++ b/preview/DOC-9267-fts-sizing.yml
@@ -0,0 +1,29 @@
+sources:
+  docs-devex:
+    branches: DOC-9267-fts-sizing
+
+  docs-analytics:
+    branches: release/8.0
+
+  couchbase-cli:
+    branches: morpheus
+    startPaths: docs/
+
+  backup:
+    branches: morpheus
+    startPaths: docs/
+
+  #analytics:
+  #  url: ../../docs-includes/docs-analytics
+  #  branches: HEAD
+
+  cb-swagger:
+    url: https://github.com/couchbaselabs/cb-swagger
+    branches: release/8.0
+    start_path: docs
+
+  # Minimal SDK build
+  docs-sdk-common:
+    branches: [release/8.0]
+  docs-sdk-java:
+    branches: [3.8-api]

From 2fc6a0a8752957e18c7af373f65a10480930855a Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Fri, 30 Jan 2026 10:21:44 -0500
Subject: [PATCH 5/5] [DOC-9267] Tweak math in examples per PM request

---
 modules/install/pages/sizing-general.adoc | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 49a96e6a75..9a99e9b6f8 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -762,7 +762,7 @@ This example uses the following variables:
 
 Based on these variables, the required vCPUs could be either:
 
-* stem:[290], using a value of `300` in the vCPU calculation.
+* stem:[580], using a value of `150` in the vCPU calculation.
 * stem:[435], using a value of `200` in the vCPU calculation.
 
 The Total Index GB With Replicas is stem:[110.13 \text{ GB}].
@@ -771,13 +771,13 @@ The vCPUs matter the most in this workload.
 
 To get a higher QPS for each vCPU, you could try a configuration of:
 
-. 10 nodes with 32 vCPUs and 128{nbsp}GB of RAM
-. 5 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
 
 Otherwise, for a lower QPS for each vCPU, you could try a configuration of:
 
-. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
-. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+. 19 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 10 nodes with 64 vCPUs and 256{nbsp}GB of RAM
 
 [#low-qps]
 ==== Lower QPS with Higher Storage and a Larger Index
@@ -798,7 +798,7 @@ This example uses the following variables:
 
 |====
 
-Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of `300` in the calculation.
+Based on these variables, the required vCPUs would be stem:[60], using a value of `200` QPS for each vCPU in the calculation.
 
 If you wanted to use nodes with 32 vCPUs, you would need 2 nodes.