From 0dd1df866536efa4db2ac8015a876c38c1334093 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Fri, 23 Jan 2026 11:22:30 -0500
Subject: [PATCH 1/5] [DOC-9267] First draft of sizing for Search

---
 modules/install/pages/sizing-general.adoc | 270 ++++++++++++++++++++++
 1 file changed, 270 insertions(+)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 006e88ec30..9aceb653af 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -534,6 +534,276 @@ NOTE: The storage engine used in the sizing calculation corresponds to the stora
 | Nitro
 |===
 
+== Sizing Search Service Nodes
+
+Search Service nodes manage Search indexes and serve your Search queries.
+
+Basic Search indexes are lists of all the unique terms that appear in the documents on your cluster.
+For each term, the Search index also contains a list of the documents where that term appears.
+These lists inside a Search index can cause the Search index to be larger than your original dataset.
+
+Specific options in your Search index configuration can also increase its size, such as *Store*, *Include in _all field*, and *Include Term Vectors*.
+For more information about what options can increase index size and storage requirements, see xref:search:child-field-options-reference.adoc[].
+
+In general, when sizing nodes for a deployment that uses the Search Service, you need to determine the number of vCPUs and the amount of RAM that will support your workload.
+
+=== Calculating Node Requirements
+
+To size the Search Service nodes in your cluster, you need the following information:
+
+* The number of documents you need to include in your Search index or indexes.
+* The average size of the documents that need to be included in your Search index, in KB.
+* A sample document or documents that show the structure of your data.
+* The specific queries per second (QPS) target you need from the Search Service.
+
+You should also consider your replication and recovery needs.
+
+With all this information, you can work with Couchbase Support to get the most accurate sizing for your Search workload.
+
+If you want to try sizing your cluster yourself, you can use some of the following guidelines to size your <<search-vcpus>> and <<search-ram>>, using averages and estimates from other Search deployments.
+
+For the best results with Search node sizing, contact Couchbase Support.
+
+[#search-vcpus]
+==== vCPUs
+
+A heavy QPS workload requires more vCPUs.
+If your workload requires a high QPS, this is the most important part of your sizing for the Search Service.
+
+For example, if your target QPS is 30,000 and your queries are less complex, use a value of 200 and get a total number of 150 required vCPUs:
+
+[stem]
+++++
+30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
+++++
+
+If your queries were more complex, but the QPS target was the same, the calculation changes to use a value of 150 and a result of 200 vCPUs:
+
+[stem]
+++++
+30,000_{\mathrm{QPS}} \div 150 = 200_{\mathrm{vCPUs}}
+++++
+
+You can then divide your result by the vCPU configuration you want to use to calculate the number of nodes you need:
+
+[stem]
+++++
+\lceil 150_{\mathrm{vCPUs}} \div 32_{\mathrm{vCPUs\ Per\ Node}} \rceil = 5_{\mathrm{Nodes}}
+++++
+
+In this case, with a target QPS of 30,000 with less complex queries and 32 vCPUs per node, you would need 5 nodes in your deployment.
+
+[#search-ram]
+==== RAM
+
+In general, you should allocate 65% of the RAM on your node to the Search Service.
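+For example, as a rough illustration of this guideline, a node provisioned with 128{nbsp}GB of RAM (an assumed example size, not a sizing recommendation) would give the Search Service about 83{nbsp}GB:
+
+[stem]
+++++
+128_{\mathrm{GB}} \times 0.65 \approx 83_{\mathrm{GB}}
+++++
+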
+A Search node needs more RAM if you:
+
+* Are xref:search:child-field-options-reference.adoc#store[storing field values] or xref:search:child-field-options-reference.adoc#doc-values[using doc values].
+* Have xref:search:customize-index.adoc#analyzers[analyzed text fields].
+* Want to use more complex queries than xref:search:search-request-params.adoc#analytic-queries[keyword matches].
+
+To calculate a more precise estimate for the required RAM for the Search Service, you need to:
+
+. <<index-bytes>>
+. <<index-gb>>
+. <<add-replicas>>
+. <<total-ram>>
+
+[#index-bytes]
+===== Calculate Your Per Doc Index Bytes
+
+Use the following formula first to calculate the number of bytes per document in your Search index:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Per Doc Index Bytes} = ( ( W \cdot 1024 \cdot \text{f_text} \cdot \text{m_text} ) + ( W \cdot 1024 \cdot \text{f_kw} \cdot \text{m_kw} ) + B ) \times (1 + D)
+\end{split}
+\end{equation}
+++++
+
+You need to know the following variables for the formula:
+
+[cols="1,2"]
+|====
+|Variable |Description
+
+| stem:[W]
+| The average size of your JSON documents, in KB.
+
+| stem:[{\text{f_text}}]
+a| A measure of the analyzed text from your JSON documents.
+
+You can omit this value if you're using primarily keyword searches and do not have longer-form text fields that require an xref:search:customize-index.adoc#analyzers[analyzer].
+
+You can use the following value ranges based on the kind of analyzed text you have in your index:
+
+* *Product descriptions, titles and body snippets, support ticket descriptions*: `0.10-0.20`
+* *Long note fields, email bodies, articles, knowledge-base content*: `0.20-0.40`
+* *Log files, message streams, event payloads with large message fields*: `0.40-0.70`
+
+If you're not sure about the size and complexity of the text fields in your documents and how they match to the example ranges, use a value of `0.25` to get a rough estimate.
+
+To get the most accurate values for stem:[{\text{f_text}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{m_text}}]
+a| A multiplier for calculating how the bytes in your documents translate into your Search index for analyzed text fields.
+
+For a good planning range, try a value between `0.12-0.35`.
+
+To get the most accurate values for stem:[{\text{m_text}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{f_kw}}]
+a| A measure of the keywords from your JSON documents.
+
+For a good planning range for a keyword search use case or a filter-heavy workload, use a value of `0.10`.
+
+To get the most accurate values for stem:[{\text{f_kw}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[{\text{m_kw}}]
+a| A multiplier for calculating how the bytes in your documents translate into your Search index for keywords.
+
+For a good planning range, try a value between `0.10-0.18`.
+
+To get the most accurate values for stem:[{\text{m_kw}}] and your RAM sizing calculations, contact Couchbase Support.
+
+| stem:[B]
+a| The number of bytes needed for storing field values for your documents, if xref:search:child-field-options-reference.adoc#store[store] is enabled for a child field mapping.
+
+If you're not storing any field values in your Search index, set this value to `0`.
+
+| stem:[D]
+a| The additional overhead from adding xref:search:child-field-options-reference.adoc#doc-values[doc values] to your Search index from a child field mapping.
+
+Use a value from `0-1`.
+If you're not using doc values in your Search index, set this value to `0`.
+|====
+
+[#index-gb]
+===== Calculate Your Total Index GB
+
+After you have calculated your stem:[{\text{Per Doc Index Bytes}}], calculate the total GB needed for your Search index, where:
+
+* stem:[N] is the total number of JSON documents you want to include in your Search index.
+* stem:[S] is a measure of your system overhead.
+For a rough estimate, use a value of `0.10`.
+
+Use the following formula:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Index GB} =
+\frac{(N \times \text{Per Doc Index Bytes})}{10^{9}} \times (1 + S)
+\end{split}
+\end{equation}
+++++
+
+[#add-replicas]
+===== Add Your Replication Factor
+
+If you want to add replicas to your Search index, you need to factor that into your stem:[{\text{Total Index GB}}].
+
+Use the following formula:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Index GB With Replicas} = \text{Total Index GB} \times (\text{Number Of Replicas} + 1)
+\end{split}
+\end{equation}
+++++
+
+[#total-ram]
+===== Calculate Your Total Required RAM
+
+Then, you can calculate the total RAM required on a node for your use case:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\text{Total Node RAM} = \text{Total Index GB With Replicas} \times 0.65
+\end{split}
+\end{equation}
+++++
+
+[#search-examples]
+=== Search Node Sizing Examples
+
+You'll get the most accurate results by going through sizing with Couchbase Support, but you can use the following examples for a sizing estimate for a Search workload:
+
+* <<high-qps>>
+* <<low-qps>>
+
+[#high-qps]
+==== High QPS and Keyword-Only Searches
+
+The following sizing scenario assumes a high QPS target, a CPU-bound configuration, and a keyword-only workload for a compact Search index.
+
+This example uses the following variables:
+
+|====
+|Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
+
+|194,000,000
+|258.05
+|87,000
+|0.10
+|2 (1 replica + 1)
+
+|====
+
+Based on these variables, the required vCPUs could be either:
+
+* stem:[290], using a value of 300 in the vCPU calculation.
+* stem:[435], using a value of 200 in the vCPU calculation.
+
+The Total Index GB With Replicas is stem:[110.13 GB].
+
+The vCPUs matter the most in this workload.
+
+To get a higher QPS for each vCPU, you could try a configuration of:
+
+. 10 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 5 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+
+Otherwise, for a lower QPS for each vCPU, you could try a configuration of:
+
+. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+
+[#low-qps]
+==== Lower QPS with Higher Storage and a Larger Index
+
+The following sizing scenario assumes a comparatively lower QPS target, a storage-bound configuration, and a larger Search index.
+
+This example uses the following variables:
+
+[cols="1,2,1,1,1"]
+|====
+|Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
+
+|500,000,000
+|344.86 (more faceting, sorting, and more complex queries)
+|12,000
+|0.10
+|2 (1 replica + 1)
+
+|====
+
+Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of 300 in the calculation.
+If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
+
+The Total Index GB With Replicas is stem:[379.34 \text{GB}].
+Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
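+
+As a rough check, the stem:[{\text{Total Index GB With Replicas}}] value follows from the earlier formulas and the variables in the table above; the small difference from 379.34 GB comes from rounding in the intermediate steps:
+
+[latexmath]
+++++
+\begin{equation}
+\begin{split}
+\frac{500{,}000{,}000 \times 344.86}{10^{9}} \times (1 + 0.10) \times 2 \approx 379.35 \text{ GB}
+\end{split}
+\end{equation}
+++++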
+
+As a result, the best configuration should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
+
 == Sizing Query Service Nodes
 
 A node that runs the Query Service executes queries for your application needs.

From 14ba599a7c909ac1419e44a5f9817a9a318cbb1b Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 11:49:39 -0500
Subject: [PATCH 2/5] [DOC-9267] Some grammar and language tweaks

---
 modules/install/pages/sizing-general.adoc | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 9aceb653af..b945158382 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -570,13 +570,15 @@ For the best results with Search node sizing, contact Couchbase Support.
 A heavy QPS workload requires more vCPUs.
 If your workload requires a high QPS, this is the most important part of your sizing for the Search Service.
 
-For example, if your target QPS is 30,000 and your queries are less complex, use a value of 200 and get a total number of 150 required vCPUs:
+For example, if your target QPS is 30,000 and your queries are less complex, divide your total QPS target by 200 to get your required vCPUs:
 
 [stem]
 ++++
 30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
 ++++
 
+The formula gives a target of 150 vCPUs for a mid-range workload with less complex queries.
+
 If your queries were more complex, but the QPS target was the same, the calculation changes to use a value of 150 and a result of 200 vCPUs:
 
 [stem]
@@ -591,12 +593,12 @@ You can then divide your result by the vCPU configuration you want to use to cal
 \lceil 150_{\mathrm{vCPUs}} \div 32_{\mathrm{vCPUs\ Per\ Node}} \rceil = 5_{\mathrm{Nodes}}
 ++++
 
-In this case, with a target QPS of 30,000 with less complex queries and 32 vCPUs per node, you would need 5 nodes in your deployment.
+Based on the formula, if you wanted to use nodes with 32 vCPUs and reach a target QPS of 30,000 with less complex queries, you would need 5 nodes in your deployment.
 
 [#search-ram]
 ==== RAM
 
-In general, you should allocate 65% of the RAM on your node to the Search Service.
+In general, you should allocate 65% of the RAM to the Search Service on each node in your cluster where you want to run it.
 A Search node needs more RAM if you:
 
 * Are xref:search:child-field-options-reference.adoc#store[storing field values] or xref:search:child-field-options-reference.adoc#doc-values[using doc values].
@@ -651,7 +653,7 @@ To get the most accurate values for stem:[{\text{f_text}}] and your RAM sizing c
 | stem:[{\text{m_text}}]
 a| A multiplier for calculating how the bytes in your documents translate into your Search index for analyzed text fields.
 
-For a good planning range, try a value between `0.12-0.35`.
+For a good planning range, try a value between `0.12-0.35`, increasing based on the complexity of your analyzed text fields.
 
 To get the most accurate values for stem:[{\text{m_text}}] and your RAM sizing calculations, contact Couchbase Support.
 
@@ -721,7 +723,7 @@ Use the following formula:
 [#total-ram]
 ===== Calculate Your Total Required RAM
 
-Then, you can calculate the total RAM required on a node for your use case:
+Then, you can calculate the total RAM required on a node for your use case with the following formula:
 
 [latexmath]
 ++++
@@ -802,7 +804,7 @@ If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
 
 The Total Index GB With Replicas is stem:[379.34 \text{GB}].
 Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
 
-As a result, the best configuration should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
+As a result, the best configuration for this workload should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
 
 == Sizing Query Service Nodes

From 5ba9d7417befab7018f5603b4838b31417d91246 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 12:01:24 -0500
Subject: [PATCH 3/5] [DOC-9267] More grammar and formatting tweaks.

---
 modules/install/pages/sizing-general.adoc | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index b945158382..49a96e6a75 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -574,7 +574,7 @@ For example, if your target QPS is 30,000 and your queries are less complex, div
 
 [stem]
 ++++
-30,000_{\mathrm{QPS}} \div 200 = 150_{\mathrm{vCPUs}}
+30,000_{\mathrm{QPS}} \div 200_{\mathrm{Mid}} = 150_{\mathrm{vCPUs}}
 ++++
 
 The formula gives a target of 150 vCPUs for a mid-range workload with less complex queries.
@@ -583,7 +583,7 @@ If your queries were more complex, but the QPS target was the same, the calculat
 
 [stem]
 ++++
-30,000_{\mathrm{QPS}} \div 150 = 200_{\mathrm{vCPUs}}
+30,000_{\mathrm{QPS}} \div 150_{\mathrm{Low}} = 200_{\mathrm{vCPUs}}
 ++++
 
 You can then divide your result by the vCPU configuration you want to use to calculate the number of nodes you need:
@@ -762,10 +762,10 @@ This example uses the following variables:
 
 Based on these variables, the required vCPUs could be either:
 
-* stem:[290], using a value of 300 in the vCPU calculation.
-* stem:[435], using a value of 200 in the vCPU calculation.
+* stem:[290], using a value of `300` in the vCPU calculation.
+* stem:[435], using a value of `200` in the vCPU calculation.
 
-The Total Index GB With Replicas is stem:[110.13 GB].
+The Total Index GB With Replicas is stem:[110.13 \text{ GB}].
 
 The vCPUs matter the most in this workload.
 
@@ -791,18 +791,20 @@ This example uses the following variables:
 |Number of Documents |Per Doc Index Bytes |QPS Target |System Overhead |Replica Factor
 
 |500,000,000
-|344.86 (more faceting, sorting, and more complex queries)
+|344.86 (for faceting, sorting, and more complex queries)
 |12,000
 |0.10
 |2 (1 replica + 1)
 
 |====
 
-Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of 300 in the calculation.
-If you wanted to stick to 32 vCPU nodes, you would need 2 nodes.
+Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of `300` in the calculation.
+
+If you wanted to use nodes with 32 vCPUs, you would need 2 nodes.
 
-The Total Index GB With Replicas is stem:[379.34 \text{GB}].
-Each of the 2 nodes would need stem:[379.34 \text{GB} \times 0.65 \div 2 = 123.28 \text{GB}] of RAM.
+The Total Index GB With Replicas is stem:[379.34 \text{ GB}].
+
+Each of the 2 nodes would need stem:[379.34 \text{ GB} \times 0.65 \div 2 = 123.28 \text{ GB}] of RAM.
 
 As a result, the best configuration for this workload should be 2 nodes with 32 vCPUs and 128{nbsp}GB of RAM.
From 88b85670b907384b85dab8d6cc549da05799d7a4 Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Mon, 26 Jan 2026 13:59:39 -0500
Subject: [PATCH 4/5] [DOC-9267] Add preview config

---
 preview/DOC-9267-fts-sizing.yml | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 preview/DOC-9267-fts-sizing.yml

diff --git a/preview/DOC-9267-fts-sizing.yml b/preview/DOC-9267-fts-sizing.yml
new file mode 100644
index 0000000000..50d91a757e
--- /dev/null
+++ b/preview/DOC-9267-fts-sizing.yml
@@ -0,0 +1,29 @@
+sources:
+  docs-devex:
+    branches: DOC-9267-fts-sizing
+
+  docs-analytics:
+    branches: release/8.0
+
+  couchbase-cli:
+    branches: morpheus
+    startPaths: docs/
+
+  backup:
+    branches: morpheus
+    startPaths: docs/
+
+  #analytics:
+  #  url: ../../docs-includes/docs-analytics
+  #  branches: HEAD
+
+  cb-swagger:
+    url: https://github.com/couchbaselabs/cb-swagger
+    branches: release/8.0
+    start_path: docs
+
+  # Minimal SDK build
+  docs-sdk-common:
+    branches: [release/8.0]
+  docs-sdk-java:
+    branches: [3.8-api]

From 2fc6a0a8752957e18c7af373f65a10480930855a Mon Sep 17 00:00:00 2001
From: Sarah Emmett
Date: Fri, 30 Jan 2026 10:21:44 -0500
Subject: [PATCH 5/5] [DOC-9267] Tweak math in examples per PM request

---
 modules/install/pages/sizing-general.adoc | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/modules/install/pages/sizing-general.adoc b/modules/install/pages/sizing-general.adoc
index 49a96e6a75..9a99e9b6f8 100644
--- a/modules/install/pages/sizing-general.adoc
+++ b/modules/install/pages/sizing-general.adoc
@@ -762,7 +762,7 @@ This example uses the following variables:
 
 Based on these variables, the required vCPUs could be either:
 
-* stem:[290], using a value of `300` in the vCPU calculation.
+* stem:[580], using a value of `150` in the vCPU calculation.
 * stem:[435], using a value of `200` in the vCPU calculation.
 
 The Total Index GB With Replicas is stem:[110.13 \text{ GB}].
@@ -771,13 +771,13 @@ The vCPUs matter the most in this workload.
 
 To get a higher QPS for each vCPU, you could try a configuration of:
 
-. 10 nodes with 32 vCPUs and 128{nbsp}GB of RAM
-. 5 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
 
 Otherwise, for a lower QPS for each vCPU, you could try a configuration of:
 
-. 14 nodes with 32 vCPUs and 128{nbsp}GB of RAM
-. 7 nodes with 64 vCPUs and 256{nbsp}GB of RAM
+. 19 nodes with 32 vCPUs and 128{nbsp}GB of RAM
+. 10 nodes with 64 vCPUs and 256{nbsp}GB of RAM
 
 [#low-qps]
 ==== Lower QPS with Higher Storage and a Larger Index
@@ -798,7 +798,7 @@ This example uses the following variables:
 
 |====
 
-Based on these variables, the required vCPUs would be stem:[40], based on the more complex queries needing a higher QPS per vCPU and using a value of `300` in the calculation.
+Based on these variables, the required vCPUs would be stem:[60], using a value of `200` QPS for each vCPU in the calculation.
 
 If you wanted to use nodes with 32 vCPUs, you would need 2 nodes.