We distinguish 5 sizes depending on the number of concurrent users talking to the bot and/or the total number of conversations. The actual values are not fixed and are revised periodically.
XS - max 50 concurrent users, <10k monthly conversations
S - max 150 concurrent users, <10k monthly conversations
M - max 200 concurrent users, <50k monthly conversations
L - max 300 concurrent users, <300k monthly conversations (we advise scaling all pods x2, especially main path: gateway, dialogs, nlu-pipeline)
For concurrent users, we assume that every user sends, on average, 1 message to the bot every 10 seconds. This means that the requests-per-second value is 1/10 of the number of concurrent users.
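As a quick illustration of that assumption, the sketch below converts the concurrent-user ceilings from the size list into requests per second. The names `MAX_CONCURRENT_USERS` and `requests_per_second` are made up for this example and are not part of the product.

```python
# Concurrent-user ceilings per size, copied from the size list above.
MAX_CONCURRENT_USERS = {"XS": 50, "S": 150, "M": 200, "L": 300}

# Assumption stated in the text: each user sends 1 message every 10 seconds.
SECONDS_PER_MESSAGE = 10


def requests_per_second(concurrent_users: int) -> float:
    """Average request rate produced by the given number of concurrent users."""
    return concurrent_users / SECONDS_PER_MESSAGE


for size, users in MAX_CONCURRENT_USERS.items():
    print(f"{size}: {users} users -> {requests_per_second(users):g} requests/s")
```

Under this assumption, an L deployment at its ceiling of 300 concurrent users handles about 30 requests per second.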
For deployments larger than L, individual tests and estimations are needed to provide the best possible service.
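The size selection can be sketched as a simple lookup. This is only an illustration: it assumes that both the concurrent-user ceiling and the monthly-conversation bound must hold for a size to fit, which the text ("and/or") does not state explicitly. The function name `pick_size` is hypothetical.

```python
from typing import Optional

# (size, max concurrent users, monthly conversations upper bound) from the size list.
SIZES = [
    ("XS", 50, 10_000),
    ("S", 150, 10_000),
    ("M", 200, 50_000),
    ("L", 300, 300_000),
]


def pick_size(concurrent_users: int, monthly_conversations: int) -> Optional[str]:
    """Return the smallest size that fits both numbers, or None if beyond L.

    Assumption: a size fits only when BOTH limits are respected.
    """
    for name, max_users, conversations_bound in SIZES:
        if concurrent_users <= max_users and monthly_conversations < conversations_bound:
            return name
    return None  # larger than L: individual tests and estimations are needed


print(pick_size(120, 8_000))    # S
print(pick_size(40, 20_000))    # M
print(pick_size(400, 100_000))  # None
```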
The CPU table has two columns for each component: requests and limits. Each cell lists the values for the sizes defined above in the format XS / S / M / L. We also strongly advise scaling all pods x2 for an L size deployment.
| Service name | CPU Request (XS/S/M/L) | CPU Limit (XS/S/M/L) |
|---|---|---|
| admin | 0.5 / 2 / 2 / 4 CPU | 2 / 4 / 4 / 8 CPU |
| dialogs | 0.5 / 2 / 2 / 4 CPU | 2 / 4 / 4 / 8 CPU |
| gateway | 0.5 / 2 / 2 / 4 CPU | 2 / 4 / 4 / 8 CPU |
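To show how the XS / S / M / L cell format is read, here is a minimal sketch that picks the CPU request and limit for one service and size. The dictionary simply mirrors the table above; `CPU_TABLE` and `cpu_for` are illustrative names, not an existing API.

```python
SIZE_ORDER = ["XS", "S", "M", "L"]

# CPU requests and limits (in cores), copied from the table above.
CPU_TABLE = {
    "admin":   {"request": "0.5 / 2 / 2 / 4", "limit": "2 / 4 / 4 / 8"},
    "dialogs": {"request": "0.5 / 2 / 2 / 4", "limit": "2 / 4 / 4 / 8"},
    "gateway": {"request": "0.5 / 2 / 2 / 4", "limit": "2 / 4 / 4 / 8"},
}


def cpu_for(service: str, size: str) -> dict:
    """Return {'request': cores, 'limit': cores} for a service at a given size."""
    index = SIZE_ORDER.index(size)  # position of the size inside each cell
    row = CPU_TABLE[service]
    return {kind: float(cell.split("/")[index]) for kind, cell in row.items()}


print(cpu_for("gateway", "XS"))  # {'request': 0.5, 'limit': 2.0}
print(cpu_for("dialogs", "L"))   # {'request': 4.0, 'limit': 8.0}
```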
Detailed requirements for each pod are available in further documents.
After extensive profiling, we created guidelines for assigning memory in the Kubernetes cluster. Below you will find general rules as well as a table with all the needed information.
Pod memory should have the same values for limits and requests sections.
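A minimal sketch of this rule: the helper below builds the `resources` section of a container spec as a plain dictionary, with identical memory values for requests and limits. The helper name is hypothetical, and mapping the document's "GB" to the Kubernetes `Gi` suffix is an assumption.

```python
def memory_resources(pod_memory_gb: int) -> dict:
    """Build a Kubernetes-style resources section with equal memory request and limit.

    Assumption: the document's "GB" is expressed with the "Gi" quantity suffix.
    """
    quantity = f"{pod_memory_gb}Gi"
    return {
        "requests": {"memory": quantity},
        "limits": {"memory": quantity},
    }


resources = memory_resources(2)
print(resources)
# {'requests': {'memory': '2Gi'}, 'limits': {'memory': '2Gi'}}
```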
| Service name | Pod memory | Heap | Internal | Class |
|---|---|---|---|---|
| admin | 2/6 GB * | - | - | - |
| dialogs | 6/12/18 GB * | 4/10/16 GB | 1 GB | 800 MB |
| analyser | 7 GB | 5 GB | 1 GB | 800 MB |
| refinery | 8 GB | 6 GB | 1 GB | 800 MB |
| nlu-pipeline | 2 GB * | - | - | - |
* admin - has two values, depending on whether bot designers use the admin panel. For a production environment where bot designers don’t use the admin panel, we require 2GB of RAM. For environments where bot designers work and bots are developed, we require 6GB of RAM.
* dialogs - configuration is based on the size of the deployment (see "Prerequisites" section)
* nlu-pipeline - its memory limits the maximum number of phrases for NLU (default limit is 10k). To increase the limit, add more memory (2GB per additional 5k phrases) and change the limit in admin's config.
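The nlu-pipeline rule can be sketched as a small calculation. It assumes the 2GB increment applies to every started block of 5k phrases above the default limit (rounding up). That is an interpretation of the note above, not a measured guarantee, and the function name is hypothetical.

```python
import math

BASE_MEMORY_GB = 2             # pod memory at the default phrase limit
DEFAULT_PHRASE_LIMIT = 10_000  # default NLU phrase limit
EXTRA_GB_PER_STEP = 2          # additional memory per step
PHRASES_PER_STEP = 5_000       # phrases covered by one step


def nlu_pipeline_memory_gb(phrase_limit: int) -> int:
    """Estimate nlu-pipeline pod memory (GB) for a desired phrase limit."""
    extra_phrases = max(0, phrase_limit - DEFAULT_PHRASE_LIMIT)
    steps = math.ceil(extra_phrases / PHRASES_PER_STEP)  # round up to whole steps
    return BASE_MEMORY_GB + steps * EXTRA_GB_PER_STEP


print(nlu_pipeline_memory_gb(10_000))  # 2
print(nlu_pipeline_memory_gb(15_000))  # 4
print(nlu_pipeline_memory_gb(20_000))  # 6
```

Remember that the phrase limit itself still has to be raised in admin's config; the extra memory alone does not change it.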
There are 4 memory types that we are interested in when it comes to JVM: heap, internal, class and stack.
- Heap memory is “actual” memory that the application is using (all objects, strings, etc.) - default 25%
- Internal/native memory is the memory that JVM itself is using to run (we can’t directly control it) - default 25%
- Compressed class space is memory for class pointers - default is 1GB
- Stack memory - we can define the max stack size per thread (we can’t control how many threads the JVM is running) - default is 1MB per thread
- We found out that the JVM sets “internal memory” equal to heap memory when we set the -Xmx parameter. This is very important because it means that we cannot just set the heap to 80%
- GC won’t be triggered by rising internal memory; it is triggered only by heap usage
Our “naked” application requires ~500-600MB of heap memory to work. Assuming the other defaults, we need 500MB heap + 500MB internal (default) + 1GB class + 100-200MB stack = 2.2GB. For a pod of this size, the JVM default for heap (25%) gives roughly the same 500MB that the application needs. So the best configuration for our applications is to NOT SET any JVM memory params and to set 2GB of memory for the pod. There is a slight chance of OOM (2GB < 2.2GB), but class memory never exceeded 300MB, so in most cases “we are safe”.
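The arithmetic behind this recommendation can be checked with a few lines. The sketch treats 1GB as 1000MB (as the sum above does) and uses the upper stack estimate of 200MB. The variable names are illustrative only.

```python
# Figures from the paragraph above, in MB (1GB treated as 1000MB).
heap_mb = 500
internal_mb = 500          # defaults to the same size as the heap
class_reserved_mb = 1000   # default compressed class space
class_observed_mb = 300    # highest class memory reported above
stack_mb = 200             # upper end of the 100-200MB estimate

worst_case_mb = heap_mb + internal_mb + class_reserved_mb + stack_mb
observed_mb = heap_mb + internal_mb + class_observed_mb + stack_mb

pod_mb = 2000
default_heap_mb = 0.25 * pod_mb  # JVM default heap is 25% of available memory

print(worst_case_mb)    # 2200
print(observed_mb)      # 1500
print(default_heap_mb)  # 500.0
```

The theoretical worst case (2200MB) is above the 2GB pod, but with the observed class usage the total stays at about 1500MB, which is why the 2GB setting is considered safe in most cases.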
Some applications use more memory than others, so we need to increase the default values. Because the default heap value is 25%, relying on it would result in suboptimal memory usage. For these cases, we prepared a formula to calculate all parameters based on the required heap. Besides the formula, we suggest “default” values (to be checked on real environments) for these applications.
POD memory = heap memory + 1GB internal memory + 800MB class + 200MB stack
And in this case, we need to configure:
- -Xmx for heap
- -XX:MaxDirectMemorySize for internal memory
- -XX:CompressedClassSpaceSize for compressed class space
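As a minimal sketch of the formula and the three settings above, the helper below derives the pod memory and the matching JVM flags from a required heap. It treats 1GB as 1000MB, and the function names are hypothetical. How the flags reach the JVM (for example through an environment variable in the pod spec) is outside this sketch.

```python
# Fixed parts of the formula above, in MB (1GB treated as 1000MB).
INTERNAL_MB = 1000
CLASS_MB = 800
STACK_MB = 200


def pod_memory_mb(heap_mb: int) -> int:
    """POD memory = heap + internal + class + stack."""
    return heap_mb + INTERNAL_MB + CLASS_MB + STACK_MB


def jvm_flags(heap_mb: int) -> list:
    """JVM options matching the formula: heap, internal memory, class space."""
    return [
        f"-Xmx{heap_mb}m",
        f"-XX:MaxDirectMemorySize={INTERNAL_MB}m",
        f"-XX:CompressedClassSpaceSize={CLASS_MB}m",
    ]


# Example: the smallest dialogs configuration from the memory table (4GB heap).
print(pod_memory_mb(4000))        # 6000
print(" ".join(jvm_flags(4000)))
# -Xmx4000m -XX:MaxDirectMemorySize=1000m -XX:CompressedClassSpaceSize=800m
```

The result agrees with the memory table: a 4GB heap for dialogs gives a 6GB pod, and the 5GB and 6GB heaps for analyser and refinery give 7GB and 8GB pods.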