Intent Detection Using Semantic Search

Where to find it in Automate

To use Semantic Search for intent classification in Automate:

  1. Open your bot or assistant.
  2. Go to NLU Settings for your Project.
  3. Select Semantic Search.
  4. Choose a model from the list.
  5. Train and test your assistant.

For most users, this is the only part that matters: choose Semantic Search, then select the model that best fits the use case.

How to choose a model

The available built-in models are:

  • auto
  • MultilingualE5Small
  • MultilingualE5Base
  • BGEM3
  • ParaphraseMLMiniLML12V2

They all serve the same purpose, but they differ in speed, quality, and typical use case.

auto

This is the easiest option when you do not want to choose a model manually.

  • Compares the built-in models during training
  • Selects the option that works best for the current dataset
  • May also choose the most suitable internal classification strategy
  • Usually takes longer to train than selecting one fixed model directly

Best for:

  • new assistants
  • teams that want the safest default
  • use cases where it is not yet clear which model performs best
  • projects where quality matters more than training speed

MultilingualE5Small

This is the safest starting point for most teams that prefer a single fixed model.

  • Fast
  • Good general quality
  • Works well in multilingual environments
  • Practical for everyday chatbot and routing scenarios

Best for:

  • new assistants
  • high-volume traffic
  • teams that want a reliable default

MultilingualE5Base

This is a balanced option when you want more quality than the small model.

  • Often better quality than MultilingualE5Small
  • Still suitable for general production use
  • Slightly heavier and slower

Best for:

  • general customer support use cases
  • assistants where understanding quality matters more than raw speed
  • teams that want a step up from the basic default

BGEM3

This is the quality-focused option.

  • Often the strongest option for harder classification tasks
  • Useful when intents are similar to each other
  • Slower and heavier than lighter models

Best for:

  • more complex assistants
  • important business flows where classification quality is critical
  • cases where a slightly slower response is acceptable

ParaphraseMLMiniLML12V2

This is the lightweight option.

  • Fast and efficient
  • Good for short user messages
  • Useful when you want a smaller, simpler model

Best for:

  • smaller deployments
  • short phrase classification
  • cases where low resource usage is important

Recommended starting point

If you do not know which model to choose:

  • start with auto if you want the system to choose for you
  • otherwise, start with MultilingualE5Small
  • move to MultilingualE5Base if you want more quality
  • try BGEM3 if quality is the highest priority
  • try ParaphraseMLMiniLML12V2 if you want a lighter option

What happens in auto

When you select auto, the system does extra work during training.

In simple terms:

  • it tests the supported built-in models
  • it compares which option fits the current training data best
  • it saves the best-performing result for later use

This means auto is useful when:

  • you want the system to make the choice for you
  • you are starting a new assistant
  • you want to optimize for quality without manually testing models one by one

The main tradeoff is simple:

  • auto is easier for the user
  • training usually takes longer than with one fixed model
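Conceptually, the auto option behaves like a loop that scores every candidate model and keeps the winner. The sketch below is illustrative only, not the actual Automate implementation; `evaluate` is a hypothetical stand-in for the internal train-and-measure step.

```python
# Illustrative sketch of the auto selection loop (not Automate's real code).
# evaluate() is a hypothetical stand-in for training a model and measuring
# its accuracy on held-out examples.
CANDIDATE_MODELS = [
    "MultilingualE5Small",
    "MultilingualE5Base",
    "BGEM3",
    "ParaphraseMLMiniLML12V2",
]

def pick_best_model(evaluate):
    # Score every candidate, then keep the best-performing one for later use.
    scores = {model: evaluate(model) for model in CANDIDATE_MODELS}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy scores just to show the mechanics: BGEM3 "wins" in this example.
toy_scores = {"MultilingualE5Small": 0.84, "MultilingualE5Base": 0.87,
              "BGEM3": 0.91, "ParaphraseMLMiniLML12V2": 0.80}
print(pick_best_model(lambda m: toy_scores[m]))  # ('BGEM3', 0.91)
```

This also makes the tradeoff visible: every candidate is trained and scored, which is why auto usually takes longer than picking one fixed model.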

Classification strategies

You do not choose the strategy directly in Automate, but it helps to understand what each strategy means.

In practice:

  • when you choose a specific model from the list, the system uses knn
  • when you choose auto, the system may internally use centroid or hybrid if that gives better results

knn

This means: "find examples most similar to the current message."

In simple business terms:

  • the system compares the new message to known training examples
  • then it chooses the intent that looks most similar

This is the standard behavior when you select one of the built-in models directly.
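As a rough sketch, knn over message embeddings can be pictured as below. The tiny 2-dimensional vectors stand in for the real embedding vectors a model such as MultilingualE5Small would produce; this is an illustration, not Automate's actual code.

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity: how closely two embedding vectors point
    # in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def knn_intent(message_vec, examples, k=3):
    # Rank training examples by similarity to the new message,
    # then let the k nearest examples vote on the intent.
    ranked = sorted(examples, key=lambda e: cosine(message_vec, e[0]),
                    reverse=True)
    votes = Counter(intent for _, intent in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy 2-d "embeddings" for two intents.
examples = [
    ([1.0, 0.0], "billing"),
    ([0.9, 0.1], "billing"),
    ([0.0, 1.0], "support"),
    ([0.1, 0.9], "support"),
]
print(knn_intent([0.8, 0.2], examples))  # billing
```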

centroid

This means: "compare the message to the overall profile of each intent."

In simple business terms:

  • the system looks at the general shape of each intent
  • then it chooses the closest topic
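The same toy setup can illustrate the centroid idea: instead of comparing against every training example, each intent is collapsed into one averaged "profile" vector first. Again, this is a hedged sketch, not Automate's implementation.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def centroid_intent(message_vec, examples):
    # Average each intent's training embeddings into one "profile"
    # (centroid) vector, then pick the intent whose profile is closest.
    groups = {}
    for vec, intent in examples:
        groups.setdefault(intent, []).append(vec)
    centroids = {
        intent: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for intent, vecs in groups.items()
    }
    return max(centroids, key=lambda i: cosine(message_vec, centroids[i]))

examples = [
    ([1.0, 0.0], "billing"), ([0.9, 0.1], "billing"),
    ([0.0, 1.0], "support"), ([0.1, 0.9], "support"),
]
print(centroid_intent([0.8, 0.2], examples))  # billing
```

One comparison per intent instead of one per training example is what makes this approach cheap at classification time.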

hybrid

This means: "first shortlist the most likely topics, then compare more precisely."

In simple business terms:

  • the system first narrows down the possible intents
  • then it makes a more detailed comparison
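Combining the two previous ideas gives a sketch of the hybrid strategy: centroids shortlist the likely intents cheaply, then a precise example-by-example comparison runs only within that shortlist. As before, this is illustrative code under those assumptions, not Automate's internals.

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def hybrid_intent(message_vec, examples, shortlist=2, k=3):
    # Step 1: shortlist intents by comparing against per-intent centroids.
    groups = {}
    for vec, intent in examples:
        groups.setdefault(intent, []).append(vec)
    centroids = {i: [sum(d) / len(v) for d in zip(*v)]
                 for i, v in groups.items()}
    top = sorted(centroids, key=lambda i: cosine(message_vec, centroids[i]),
                 reverse=True)[:shortlist]
    # Step 2: precise nearest-neighbor vote restricted to the shortlist.
    pool = [(vec, intent) for vec, intent in examples if intent in top]
    ranked = sorted(pool, key=lambda e: cosine(message_vec, e[0]),
                    reverse=True)
    return Counter(i for _, i in ranked[:k]).most_common(1)[0][0]

examples = [
    ([1.0, 0.0, 0.0], "billing"), ([0.9, 0.1, 0.0], "billing"),
    ([0.0, 1.0, 0.0], "support"), ([0.1, 0.9, 0.0], "support"),
    ([0.0, 0.0, 1.0], "shipping"),
]
print(hybrid_intent([0.8, 0.2, 0.0], examples))  # billing
```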

Good practice for business users

Model choice matters, but training data matters even more.

Good practice:

  • keep intent names clear and distinct
  • use realistic user phrases
  • avoid heavy overlap between intents
  • add enough examples for each intent
  • test the assistant on real user questions after training

Practical recommendation

For most business teams, the simplest approach is:

  1. Select Semantic Search in NLU Settings.
  2. Start with auto if you want the system to choose for you, or MultilingualE5Small if you want a fast and predictable default.
  3. Train the assistant.
  4. Test real user messages.
  5. If quality is not good enough, try MultilingualE5Base or BGEM3.