Comparison of available intentizer types

When creating a new NLU, you can choose between different types of intentizer model. Note that the Complex model requires a special feature to be enabled at the company level. Contact your system administrator to enable it.

Feature | Simple | Complex
Summary | Lightweight, universal, multilingual model | Heavy model, mainly for banking
Domain | Universal | Finance
Supported languages | Multilingual, 16 languages (Arabic, Simplified Chinese, Traditional Chinese, English, French, German, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Thai, Turkish, Russian) | Mainly Polish
Support for multi-intentions
Classification quality (1) | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Classification performance (2) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐
Model training time (3) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐
Size of the resulting model (4) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐
Preprocessing (Simple)

This mechanism performs a series of operations on the input text to strip unwanted characters and normalize the text to a uniform form. The operations are:


  • replacing every sequence of non-alphanumeric characters with a single space, which removes unwanted characters such as punctuation marks and emoticons.

  • replacing every run of two or more spaces with a single space.

  • removing whitespace characters (spaces, tabs) from the beginning and end of the text, which eliminates any whitespace left over from the previous steps.

  • converting all letters to lowercase, which ensures uniform letter case.

  • converting Unicode characters into their ASCII equivalents. For example, characters with diacritics are replaced with their base letters (e.g., “ł” is replaced with “l”).


As a result, the preprocessed text is normalized to a uniform form, free of punctuation marks, repeated spaces, and non-ASCII characters.

Examples:

  • “What will the weather be like tomorrow?” → “what will the weather be like tomorrow”

  • “ By when do I get a response to my complaint?????” → “by when do i get a response to my complaint”

  • “sign me up for a doctor tomorrow 😁” → “sign me up for a doctor tomorrow”
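
The steps listed above can be sketched in Python as follows. This is only an illustration, not the product's actual implementation: the function name `preprocess_simple` and the `_NO_DECOMPOSITION` table are hypothetical, the regular expressions are one possible reading of "non-alphanumeric", and the ASCII step relies on Unicode NFKD normalization plus a small hand-written mapping for letters such as “ł” that have no Unicode decomposition. Unlike a full transliteration library, this sketch simply drops characters it cannot map to ASCII.

```python
import re
import unicodedata

# Hypothetical, incomplete mapping for letters that NFKD cannot decompose.
_NO_DECOMPOSITION = str.maketrans(
    {"ł": "l", "Ł": "L", "ø": "o", "Ø": "O", "đ": "d", "Đ": "D"}
)


def preprocess_simple(text: str) -> str:
    """Sketch of the five preprocessing steps described for the Simple model."""
    # 1. Replace each run of non-alphanumeric characters with one space.
    #    \W covers punctuation, whitespace and emoji; "_" is added explicitly
    #    because Python's \w treats the underscore as a word character.
    text = re.sub(r"[\W_]+", " ", text)
    # 2. Collapse runs of two or more spaces into a single space.
    text = re.sub(r" {2,}", " ", text)
    # 3. Trim leading and trailing whitespace.
    text = text.strip()
    # 4. Lowercase everything.
    text = text.lower()
    # 5. Map to ASCII: explicit table first, then strip combining marks.
    text = text.translate(_NO_DECOMPOSITION)
    text = unicodedata.normalize("NFKD", text)
    return text.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for sample in (
        "What will the weather be like tomorrow?",
        " By when do I get a response to my complaint?????",
        "sign me up for a doctor tomorrow 😁",
    ):
        print(repr(preprocess_simple(sample)))
```

Step 2 is redundant in this sketch, because step 1 already emits a single space per run, but it is kept so that the code mirrors the list one-to-one.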

Preprocessing (Complex)

For the Complex model, preprocessing consists of a single operation on the input text:


  • converting all letters in the text to lowercase, which ensures uniform letter case.


Examples:

  • “What will the weather be like tomorrow?” → “what will the weather be like tomorrow ?”



  1. F1 score: Simple 0.89 (precision 0.92, sensitivity 0.85), Complex 0.94 (precision 0.94, sensitivity 0.93), measured on a test set of 2807 samples across 91 classes.
  2. Throughput: Simple 58.82 phrases/s, Complex 47.62 phrases/s, measured on a test set of 2807 samples.
  3. Training time: Simple 107 s, Complex 25 min, measured on a training set of 9825 samples divided into 91 intent classes.
  4. Model size: Simple 350 MB (feature extractor) + 1 MB per model, Complex 3.5 GB per model.
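
To make the throughput figures in note 2 easier to compare, they can be converted into an average time per phrase and a total time for the 2807-sample test set. The short sketch below does that arithmetic. The helper names are hypothetical, and the calculation assumes a constant rate with phrases processed one after another, which the source does not state.

```python
# Throughput from note 2, in phrases per second.
THROUGHPUT = {"simple": 58.82, "complex": 47.62}
TEST_SET_SIZE = 2807  # samples in the test set from notes 1 and 2


def latency_ms(phrases_per_second: float) -> float:
    """Average time per phrase in milliseconds, assuming a constant rate."""
    return 1000.0 / phrases_per_second


def batch_seconds(phrases: int, phrases_per_second: float) -> float:
    """Time needed to classify `phrases` phrases at the given rate."""
    return phrases / phrases_per_second


if __name__ == "__main__":
    for name, rate in THROUGHPUT.items():
        print(
            f"{name}: {latency_ms(rate):.1f} ms per phrase, "
            f"{batch_seconds(TEST_SET_SIZE, rate):.1f} s for {TEST_SET_SIZE} phrases"
        )
```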

How to calculate the limit of NLU Complex models

  1. Determine the storage available to RedisAI, in GB (RAS).
  2. Divide RAS by 3.5 GB and round the result down.
  3. The result is the maximum number of NLU Complex models that can be stored in RedisAI.
  4. Remember to leave some storage for Simple models.
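
The calculation above can be sketched as a small Python helper. The function name `max_complex_models` and its parameters are hypothetical. The reserve for Simple models uses the size figures from this page (350 MB for the feature extractor plus 1 MB per model) and assumes 1 GB = 1000 MB. How much storage to reserve in practice is a judgment call that the source leaves open.

```python
import math

COMPLEX_MODEL_GB = 3.5      # size of one Complex model
SIMPLE_EXTRACTOR_GB = 0.35  # feature extractor used by Simple models (assumes 1 GB = 1000 MB)
SIMPLE_MODEL_GB = 0.001     # size of one Simple model


def max_complex_models(ras_gb: float, simple_models: int = 0) -> int:
    """Return floor(usable storage / 3.5 GB).

    ras_gb: storage available to RedisAI, in GB (RAS).
    simple_models: number of Simple models to reserve space for (assumption:
    the extractor is reserved only when at least one Simple model is planned).
    """
    reserved = 0.0
    if simple_models > 0:
        reserved = SIMPLE_EXTRACTOR_GB + simple_models * SIMPLE_MODEL_GB
    usable = max(ras_gb - reserved, 0.0)
    return math.floor(usable / COMPLEX_MODEL_GB)


if __name__ == "__main__":
    print(max_complex_models(16))                    # no space reserved for Simple models
    print(max_complex_models(14, simple_models=50))  # reserve 0.35 GB + 50 * 0.001 GB
```

With 14 GB available, reserving space for Simple models lowers the limit from 4 to 3, which is why step 4 matters when RAS is close to a multiple of 3.5 GB.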