Phrase length

What impact does the length of phrases have on recognition? Phrase length can significantly affect how well NLU models recognize intents. Models such as Universal Sentence Encoder (used in intentizers) and other transformer-based models often recognize longer phrases more reliably, because a longer phrase gives the model more context to analyze. Why might longer phrases be recognized more effectively?

  1. More Context: Longer phrases provide more context, allowing the model to better understand the intent. For example, in the sentence “What steps do I need to take to make a change to my insurance policy?” the model has more information to determine that the intent is about making a change to the policy.
  2. Reduction of Ambiguity: Shorter phrases can be more ambiguous and harder to classify. For example, the phrase “change the policy” is less clear than “What steps do I need to take to make a change to my insurance policy?” because it does not say which policy is meant or what the user wants to do about the change.
  3. Better Utilization of Model Capabilities: Transformer-based models are designed to work with sequences of up to several hundred tokens. Longer phrases let you make better use of the model's ability to capture complex dependencies between words.
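
The ambiguity point above can be illustrated with a deliberately simple sketch. It is not how Universal Sentence Encoder or any transformer model works internally; it only counts keyword overlap to show why a short phrase can match several intents equally well. The intent names and keyword sets are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical intents with hand-picked keywords (illustrative only,
# not taken from any real NLU configuration).
INTENT_KEYWORDS = {
    "change_coverage": {"change", "policy", "coverage", "insurance"},
    "change_address": {"change", "policy", "address", "insurance"},
}


def tokenize(phrase):
    """Lowercase the phrase and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", phrase.lower()))


def keyword_overlap(phrase):
    """Count how many keywords of each intent appear in the phrase."""
    tokens = tokenize(phrase)
    return {
        intent: len(tokens & keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }


# The short phrase matches both intents equally, so it is ambiguous.
short_scores = keyword_overlap("change the policy")

# The longer phrase mentions "coverage", which separates the two intents.
long_scores = keyword_overlap(
    "How can I change the coverage of my insurance policy?"
)

print(short_scores)
print(long_scores)
```

A real embedding model compares dense vectors rather than keyword sets, but the same effect applies: extra words give the model more evidence to separate similar intents.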

How should you approach phrase creation?

  1. Variety in Phrase Length: Create phrases of varying lengths so that the model can learn intents from different contexts. Include both shorter and longer phrases.
  2. Diversity of Constructions: Add phrases with different syntactic structures to help the model recognize intents regardless of sentence structure. For example: “I would like to change my policy” vs. “Please change my insurance policy”.
  3. Intent Clarification: Ensure that phrases clearly specify the intent by adding context that unequivocally points to the given intent. For instance, instead of “change the policy”, use “How can I change the coverage of my insurance policy?”.
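
One way to check the length variety described above is a small audit of the training set. The following is a minimal sketch under stated assumptions: the intent names, the sample phrases, and the six-word threshold for a "longer" phrase are all hypothetical, and length is measured in whitespace-separated words rather than model tokens.

```python
from statistics import mean

# Hypothetical training phrases grouped by intent (illustrative data only).
training_phrases = {
    "change_policy": [
        "change the policy",
        "I would like to change my policy",
        "How can I change the coverage of my insurance policy?",
    ],
    "cancel_policy": [
        "cancel policy",
        "cancel it",
    ],
}


def word_count(phrase):
    """Approximate phrase length as the number of whitespace-separated words."""
    return len(phrase.split())


def length_summary(phrases):
    """Return the shortest, longest, and average phrase length in words."""
    lengths = [word_count(p) for p in phrases]
    return {"min": min(lengths), "max": max(lengths), "mean": mean(lengths)}


def intents_lacking_long_phrases(data, min_long_words=6):
    """List intents whose longest phrase is shorter than min_long_words.

    The default threshold of 6 words is an arbitrary assumption.
    """
    return [
        intent
        for intent, phrases in data.items()
        if max(word_count(p) for p in phrases) < min_long_words
    ]


for intent, phrases in training_phrases.items():
    print(intent, length_summary(phrases))

flagged = intents_lacking_long_phrases(training_phrases)
print("Intents with only short phrases:", flagged)
```

Intents that appear in the flagged list are candidates for additional, more contextual phrases.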

Collect phrases from real users to ensure that the model learns from examples that best reflect actual queries.

Include both the most typical phrases and more complex, rare cases to make the model more versatile.

Phrase length significantly impacts intent recognition by an NLU model. Longer phrases provide more context, which can reduce ambiguity and improve the model's accuracy. Creating diverse phrases, in terms of both length and structure, is crucial for effective model training. Regular experimentation and analysis of results will help you iteratively improve the training dataset and the model's performance.
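
The analysis of results mentioned above could, for example, break recognition accuracy down by phrase length. The sketch below assumes you can export test results as (phrase, expected intent, predicted intent) records. The bucket boundaries and the sample records are hypothetical placeholders, not real measurements.

```python
from collections import defaultdict


def length_bucket(phrase, short_max=3, medium_max=8):
    """Assign a phrase to a length bucket by word count.

    The boundaries (3 and 8 words) are arbitrary assumptions.
    """
    words = len(phrase.split())
    if words <= short_max:
        return "short"
    if words <= medium_max:
        return "medium"
    return "long"


def accuracy_by_length(results):
    """Compute accuracy per length bucket.

    results: iterable of (phrase, expected_intent, predicted_intent).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for phrase, expected, predicted in results:
        bucket = length_bucket(phrase)
        totals[bucket] += 1
        if expected == predicted:
            correct[bucket] += 1
    return {bucket: correct[bucket] / totals[bucket] for bucket in totals}


# Placeholder records for illustration; replace with your own test output.
results = [
    ("change the policy", "change_policy", "cancel_policy"),
    ("cancel policy", "cancel_policy", "cancel_policy"),
    ("I would like to change my policy", "change_policy", "change_policy"),
    (
        "How can I change the coverage of my insurance policy?",
        "change_policy",
        "change_policy",
    ),
]

report = accuracy_by_length(results)
for bucket, accuracy in report.items():
    print(f"{bucket}: {accuracy:.0%}")
```

If one bucket scores clearly lower than the others on your own data, that suggests where additional training phrases may help most.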