Phrases with similar construction placed in different intents

The causes of unsatisfactory intent recognition in an NLU system can be varied:

  1. Phrase similarity and semantic differences
    Phrases with similar syntactic structure but different semantic meaning can pose challenges for the NLU model:
    1. For example, in the phrase "Can I change the contact number I provided to you?", the model might fail to identify the "change" intent because its training phrases are not clearly distinguished from those of another intent in the model. The context of the "contact number" may also have been insufficiently represented in the training data.
    2. Even though the phrase "I want to change the policy." contains the word "change", the model might struggle to assign the "change" intent because the expression is generic and lacks context.
    3. In the phrase "I want to change the insurance plan.", the issue may stem from a lack of appropriate examples in the training data or from the ambiguity of the phrase itself.
    4. In the phrase "How can I change the coverage of my insurance policy?", the complexity of the phrase and its specific context may require more training data to handle effectively.
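
One way to spot phrases with similar construction that sit in different intents is to compare their word overlap. The following is a minimal sketch, not a feature of any particular NLU platform: the intent names, the `similar_cross_intent_pairs` helper, and the 0.5 threshold are assumptions made for illustration, and Jaccard overlap on word sets is only a rough stand-in for whatever similarity the model actually learns. The phrases are the four examples above.

```python
import re
from itertools import combinations

# Hypothetical intent names; the phrases are the examples discussed above.
TRAINING_DATA = {
    "change_contact_number": ["Can I change the contact number I provided to you?"],
    "change_policy": ["I want to change the policy."],
    "change_plan": ["I want to change the insurance plan."],
    "change_coverage": ["How can I change the coverage of my insurance policy?"],
}


def tokens(phrase):
    """Lower-case the phrase and return its set of words."""
    return set(re.findall(r"[a-z']+", phrase.lower()))


def jaccard(a, b):
    """Share of words two phrases have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_cross_intent_pairs(data, threshold=0.5):
    """Return phrase pairs from different intents whose overlap >= threshold."""
    labelled = [(intent, p) for intent, phrases in data.items() for p in phrases]
    pairs = []
    for (i1, p1), (i2, p2) in combinations(labelled, 2):
        if i1 == i2:
            continue  # only pairs that cross an intent boundary are of interest
        score = jaccard(p1, p2)
        if score >= threshold:
            pairs.append((score, i1, p1, i2, p2))
    return sorted(pairs, reverse=True)


PAIRS = similar_cross_intent_pairs(TRAINING_DATA)
for score, i1, p1, i2, p2 in PAIRS:
    print(f"{score:.3f}  [{i1}] {p1!r}  <->  [{i2}] {p2!r}")
```

With this data, only "I want to change the policy." and "I want to change the insurance plan." exceed the threshold. Pairs flagged this way are candidates for review: either the two intents need more distinctive examples, or the phrases belong together.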

  2. Separation of the training data set
    One of the key issues could be the way the training data was constructed and divided. Here are some tips:
    1. Phrase Variety: Ensure that each intent has a wide range of examples covering different syntactic structures and contexts.
    2. Data Examination: Examine the training data for incorrect labels and ensure that phrases are clearly assigned to the appropriate intents.
    3. Data Balance: Ensure that the number of examples is roughly balanced across intents, so that no single intent dominates the training data.
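
The examination and balance checks above can be partly automated. The sketch below assumes the training data can be exported as (phrase, intent) pairs; the sample rows, the intent names, the helper functions, and the `min_ratio` cut-off of 0.5 are all hypothetical and should be adapted to the real data.

```python
from collections import Counter, defaultdict

# Hypothetical export of training data as (phrase, intent) pairs.
EXAMPLES = [
    ("I want to change the policy.", "change_policy"),
    ("Please change my policy", "change_policy"),
    ("I'd like to modify the policy", "change_policy"),
    ("Update my policy", "change_policy"),
    ("I want to change the insurance plan.", "change_plan"),
    ("i want to change the policy.", "change_plan"),  # same phrase, other intent
    ("Can I change the contact number I provided to you?", "change_contact_number"),
]


def intent_counts(examples):
    """Count how many training phrases each intent has."""
    return Counter(intent for _, intent in examples)


def underrepresented(counts, min_ratio=0.5):
    """Intents with fewer than min_ratio times the examples of the largest intent."""
    if not counts:
        return []
    largest = max(counts.values())
    return sorted(i for i, n in counts.items() if n < min_ratio * largest)


def conflicting_labels(examples):
    """Phrases (case-insensitive) that appear under more than one intent."""
    seen = defaultdict(set)
    for phrase, intent in examples:
        seen[phrase.strip().lower()].add(intent)
    return {p: sorted(i) for p, i in seen.items() if len(i) > 1}


COUNTS = intent_counts(EXAMPLES)
print(dict(COUNTS))
print(underrepresented(COUNTS))
print(conflicting_labels(EXAMPLES))
```

Here `change_contact_number` is flagged as underrepresented, and the policy phrase is reported as labelled under both `change_policy` and `change_plan`, which is the kind of unclear assignment the examination step is meant to catch.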

  3. Results analysis and iterative improvements
    Regular analysis of results and iterative adjustments can significantly improve the quality of the model:
    1. Confusion Matrix: Analyse the confusion matrix to understand which intents are most frequently confused and why.
    2. Manual Correction: Sometimes, manually correcting the training data, for example by relabeling a phrase or moving it to a different intent, can resolve issues.
    3. Adding Synonyms: Include synonyms and variations of phrases that users might commonly use.
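
To illustrate the confusion-matrix step, the sketch below counts (expected intent, predicted intent) pairs from a test run and lists the most frequent mix-ups. It assumes the expected and predicted intents are available as two parallel lists; the sample labels and the helper names are hypothetical, and a real NLU tool may already provide its own confusion report.

```python
from collections import Counter

# Hypothetical test results: the expected intent and what the model predicted.
EXPECTED = [
    "change_policy", "change_policy", "change_policy",
    "change_plan", "change_plan", "change_contact_number",
]
PREDICTED = [
    "change_policy", "change_plan", "change_plan",
    "change_plan", "change_policy", "change_contact_number",
]


def confusion_counts(expected, predicted):
    """Count each (expected, predicted) pair: a sparse confusion matrix."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    return Counter(zip(expected, predicted))


def most_confused(matrix, top_n=3):
    """Return the top_n misclassified (expected, predicted, count) entries."""
    errors = [(n, e, p) for (e, p), n in matrix.items() if e != p]
    # Highest count first; ties are broken alphabetically for stable output.
    errors.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(e, p, n) for n, e, p in errors[:top_n]]


MATRIX = confusion_counts(EXPECTED, PREDICTED)
for expected, predicted, count in most_confused(MATRIX):
    print(f"{expected} predicted as {predicted}: {count}")
```

In this sample, `change_policy` is predicted as `change_plan` twice and the reverse happens once, which points to those two intents as the first place to add distinguishing examples or correct labels.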