Pattern entities
For more complex or less regular expressions, we have created a custom language called Metaetykietator, or MTT for short. Its syntax lets you attach custom labels to text data.
It is most useful when you need to differentiate between expressions that overlap and thus cannot function as simple entities, or when you want to catch a whole group of characteristic expressions such as greetings or confirmations.
Some examples:
A snippet of a welcome pattern:
LABEL (a"hi"|a"hey"|a"hello"|a"good" a"morning") AS @"welcome"
A snippet of a confirmation pattern:
LABEL (a"yes" | a"ok" | a"yep" | a"yup" | a"correct" | a"ya" | a"okay" | a"why" a"not" | a"yp" | a"ye" | a"aye" | a"go" | a"ahead" | a"make" | a"happen" | a"so" | a"perfect" | a"that" a"is" a"correct" | a"uh" a"huh" | a"yes" a"please" | a"yeah" | a"you" a"got" a"it" | a"absolutely" | a"alright" | a"continue" | a"good" | a"ja" | a"proceed" | a"righto" | a"totally" | a"true" | a"most" a"assuredly" | a"sure" | a"agree" | a"of" a"course" | a"certainly" | a"affirmative" | a"naturally" | a"fine") AS @"mtt_yes"
A snippet of an eating pattern (using regex):
LABEL (ir"eat" | a"ate") AS @"eat"
Results found by the regex part: eaten, eating, eat
A snippet of an intent pattern ("cancel order"):
LABEL (a"cancel"|a"remove"|a"delete"|a"terminate") AS @"CANCEL"
LABEL (a"order"|a"package"|a"parcel") AS @"ORDER"
LABEL (@"CANCEL" <0,3> @"ORDER") AS @"cancel_order"
UNLABEL @"CANCEL"
UNLABEL @"ORDER"
Quick tutorial
Patterns
Basic labeling
LABEL a"dog" AS @"ANIMAL"
- a"dog" matches the word "dog" (the a prefix ignores case and diacritics).
- AS @"ANIMAL" gives every match the label ANIMAL.
Sequence of patterns
LABEL a"red" a"apple" a"pie" AS @"DESSERT"
- Matches the exact sequence "red", then "apple", then "pie".
- The whole phrase gets the label DESSERT.
Alternatives (|)
LABEL (i"i") (i"like"|a"love"|a"enjoy") AS @"AFFECTION"
- First match "I", then one of: "like", "love", "enjoy".
Alternatives with optionality
LABEL (a"my"|) (a"cat"|a"dog"|a"parrot") AS @"POSSIBLE_PET"
- (a"my"|) → "my" or nothing — the first word is optional.
- (a"cat" | a"dog" | a"parrot") → matches exactly one of these animals.
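To see which phrases a pattern with an empty alternative covers, it can help to enumerate them. The Python sketch below is only an illustration, not part of MTT: the helper `expand` is hypothetical, and an empty string stands in for the empty alternative in (a"my"|).

```python
from itertools import product

def expand(groups):
    """List every phrase described by a sequence of alternative groups.

    Each group is a list of alternatives; "" models the empty
    alternative, which makes the whole group optional.
    """
    return [" ".join(word for word in combo if word) for combo in product(*groups)]

# Models: (a"my"|) (a"cat"|a"dog"|a"parrot")
phrases = expand([["my", ""], ["cat", "dog", "parrot"]])
for phrase in phrases:
    print(phrase)
```

The list contains six phrases: the three animal names with "my" and the same three without it.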
Semantics
Case sensitivity in patterns
By default, patterns are case-sensitive:
LABEL "vase" AS @"VASE"
Sentence: Vase on the table, vase by the window, lovely vase in the corner.
Matches: 2 (misses the capitalized Vase at the start)
Adding the prefix i makes the match case-insensitive, so "vase", "Vase", "VASE", etc. all match:
LABEL i"vase" AS @"VASE"
To ignore both diacritics (such as ą, ę, ń) and case, put the letter a before the quoted word:
LABEL (a"wazon"|a"wazonu"|a"wazonowi"|a"wazonem"|a"wazonie"|a"wazony"|a"wazonow"|a"wazonom"|a"wazonami"|a"wazonach") AS @"VASE"
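The three literal prefixes can be pictured as three ways of comparing a pattern with a lexeme. The Python sketch below is an approximation, not the MTT implementation: the function names are hypothetical, and removing diacritics through Unicode decomposition (plus an explicit mapping for ł, which does not decompose) is an assumption about how MTT normalizes text.

```python
import unicodedata

# Assumption: letters that Unicode decomposition cannot split are mapped by hand.
_EXTRA = str.maketrans({"ł": "l", "Ł": "L"})

def strip_diacritics(text: str) -> str:
    """Drop combining marks, e.g. 'wazonów' becomes 'wazonow'."""
    decomposed = unicodedata.normalize("NFD", text.translate(_EXTRA))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def match_exact(pattern: str, lexeme: str) -> bool:
    """Models "lexeme": case-sensitive comparison."""
    return pattern == lexeme

def match_i(pattern: str, lexeme: str) -> bool:
    """Models i"lexeme": ignore case only."""
    return pattern.casefold() == lexeme.casefold()

def match_a(pattern: str, lexeme: str) -> bool:
    """Models a"lexeme": ignore case and diacritics."""
    return strip_diacritics(pattern).casefold() == strip_diacritics(lexeme).casefold()
```

Under these assumptions, `match_a("wazonow", "Wazonów")` is true, while `match_i("wazonow", "wazonów")` is false because only the a prefix ignores the accent.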
Matching More Than Just “Vase”
Sometimes we want to match more than just the singular nominative form "vase".
There are two ways to do this.
- Using alternatives (A|B|C|D)
LABEL ("vase"|"vases"|"vaselike"|"vasectomy") AS @"VASE"
- Using a regular expression (r)
LABEL r"vase(s|like|ctomy|)" AS @"VASE"
LABEL r"vase" AS @"LABEL"
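The eating example earlier on this page, where ir"eat" finds eaten, eating and eat, suggests that a regex is matched anywhere inside a lexeme rather than against the whole lexeme. The Python sketch below illustrates that assumed behaviour with `re.search`; the helper `regex_lexemes` is hypothetical and the real MTT engine may behave differently.

```python
import re

def regex_lexemes(pattern, lexemes, ignore_case=False):
    """Return the lexemes in which the regex matches somewhere.

    Assumption: r"..." searches inside each lexeme, and ir"..." does the
    same while ignoring case.
    """
    flags = re.IGNORECASE if ignore_case else 0
    compiled = re.compile(pattern, flags)
    return [lexeme for lexeme in lexemes if compiled.search(lexeme)]

lexemes = ["Vase", "vases", "vaselike", "table", "vase"]
case_sensitive = regex_lexemes(r"vase", lexemes)                    # models r"vase"
case_insensitive = regex_lexemes(r"vase", lexemes, ignore_case=True)  # models ir"vase"
print(case_sensitive)
print(case_insensitive)
```

If MTT does work this way, a short pattern such as ir"eat" would also hit unrelated words that contain it (for instance "great"), so it is worth checking such patterns in the tester.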
Patterns for Matching
- "lexeme" — matches exact literal lexeme (case-sensitive)
- i"lexeme" — matches lexeme ignoring case
- a"lexeme" — matches lexeme ignoring case and diacritics
- r"regex" — matches lexeme using regular expression (regex)
- ir"regex" — regex match ignoring case
- ar"regex" — regex match ignoring case and diacritics
- _ — any single lexeme
- ... — any sequence of lexemes (of any length)
- <n,m> — any sequence of lexemes with length between n and m
- ^ — start of text (e.g. LABEL ^ a"cat" AS @"first_cat")
- $ — end of text (e.g. LABEL a"cat" $ AS @"last_cat")
- . — any single character (in regex expressions)
Labeling Complex Phrases in Steps
Suppose we want to find phrases like "I smashed the vase" or "She broke the glass".
Step 1: Label relevant verbs (actions of breaking)
LABEL (a"broke"|a"smashed"|a"shattered") AS @"BREAKING_ACTION"
Step 2: Label relevant nouns (containers)
LABEL (a"vase"|a"glass"|a"bottle") AS @"CONTAINER"
Step 3: Label sequences of verb + noun (<0,1> allows at most one lexeme, such as "the", in between)
LABEL @"BREAKING_ACTION" <0,1> @"CONTAINER" AS @"BREAKING_EVENT"
UNLABEL
The script is executed top to bottom.
Labels like BREAKING_ACTION and CONTAINER were created as helper labels to build up more complex annotations.
Once these intermediate labels are no longer needed, we can safely remove them using the UNLABEL command:
LABEL (a"broke"|a"smashed"|a"shattered") AS @"BREAKING_ACTION"
LABEL (a"vase"|a"glass"|a"bottle") AS @"CONTAINER"
LABEL @"BREAKING_ACTION" <0,1> @"CONTAINER" AS @"BREAKING_EVENT"
UNLABEL @"BREAKING_ACTION"
UNLABEL @"CONTAINER"
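The step-by-step script above can be modelled as operations on labeled spans. The Python sketch below is a simplified illustration rather than the MTT engine: splitting on whitespace, comparing lowercase words instead of using a"...", and storing spans as (start, end) token indices are all assumptions, and the helper names are hypothetical.

```python
def label_words(tokens, words):
    """Label every single token whose lowercase form is in `words`."""
    return {(i, i + 1) for i, token in enumerate(tokens) if token.lower() in words}

def label_sequence(first, second, min_gap, max_gap):
    """Join a span from `first` with a later span from `second`.

    Models `@"A" <min_gap,max_gap> @"B"`: the number of tokens between
    the two spans must lie within the given bounds.
    """
    return {
        (start1, end2)
        for (start1, end1) in first
        for (start2, end2) in second
        if min_gap <= start2 - end1 <= max_gap
    }

tokens = "She broke the glass and I smashed a vase".split()
labels = {}
labels["BREAKING_ACTION"] = label_words(tokens, {"broke", "smashed", "shattered"})
labels["CONTAINER"] = label_words(tokens, {"vase", "glass", "bottle"})
labels["BREAKING_EVENT"] = label_sequence(
    labels["BREAKING_ACTION"], labels["CONTAINER"], 0, 1
)

# Models UNLABEL: the helper labels are dropped once the final label exists.
del labels["BREAKING_ACTION"]
del labels["CONTAINER"]

for start, end in sorted(labels["BREAKING_EVENT"]):
    print(" ".join(tokens[start:end]))
```

Because the statements run top to bottom, the helper labels must still exist when BREAKING_EVENT is built; removing them earlier would leave step 3 with nothing to match.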
Nested Labels - UNLABEL ... IN
Consider the sentence "I smashed the vase on David’s head." This phrase suggests an act of violence rather than an accident.
LABEL (a"smashed"|a"hit"|a"broke") AS @"ACCIDENT"
LABEL <1,1> (a"vase"|a"glass"|a"bottle") AS @"CONTAINER"
LABEL @"ACCIDENT" @"CONTAINER" a"on" <0,2> (a"head"|a"skull"|a"forehead") AS @"VIOLENCE"
UNLABEL @"CONTAINER"
UNLABEL @"ACCIDENT" IN @"VIOLENCE"
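- First, label verbs like “smashed,” “hit,” or “broke” as ACCIDENT (a general event).
- Next, label a container noun together with the one lexeme before it (for example “the vase”) as CONTAINER.
- Then, if an ACCIDENT followed by a CONTAINER is followed by “on” and, at most two lexemes later, “head,” “skull,” or “forehead,” label the whole phrase as VIOLENCE.
- Finally, remove the helper label CONTAINER everywhere, and remove ACCIDENT only within the parts labeled as VIOLENCE, because these represent intentional acts rather than accidents.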
Detect Everything Except Certain Cases - UNLABEL ... CONTAINING
If we want to exclude a specific expression from our label, we use UNLABEL ... CONTAINING.
This removes matches that contain a word or phrase indicating the opposite of what we’re looking for.
LABEL "is" <1,1> AS @"is"
LABEL ("broken"|"damaged") AS @"broken_or_damaged"
UNLABEL @"is" CONTAINING @"broken_or_damaged"
This removes the is label wherever it contains a lexeme labeled broken_or_damaged, so phrases like "is broken" or "is damaged" are excluded from the is label.
Excluding an Exact Match from a Broader Label - UNLABEL ... EQUAL
If we want to exclude a specific expression from our label and it matches exactly 1:1 with our broader label, we use UNLABEL ... EQUAL.
Example: here, we want to remove the person "runner" from detections of the activity "run".
LABEL ir"run" AS @"RUN_activity"
LABEL ir"runner" AS @"RUN_person"
UNLABEL @"RUN_activity" EQUAL @"RUN_person"
Remove the RUN_activity label only when it exactly matches the RUN_person label — this way "runner" won’t be tagged as an activity.
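The three UNLABEL variants (IN, CONTAINING, EQUAL) differ only in how the span being removed relates to the other label's span. The Python sketch below models that reading with (start, end) token spans; it is an interpretation of this tutorial, not the MTT implementation, and the function names are hypothetical. Whether IN and CONTAINING also cover identical spans is not stated here, so the sketch simply treats them as inclusive.

```python
def contains(outer, inner):
    """True when span `inner` lies entirely within span `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def unlabel_in(target, other):
    """Models UNLABEL target IN other: drop target spans inside an other span."""
    return {t for t in target if not any(contains(o, t) for o in other)}

def unlabel_containing(target, other):
    """Models UNLABEL target CONTAINING other: drop target spans that contain an other span."""
    return {t for t in target if not any(contains(t, o) for o in other)}

def unlabel_equal(target, other):
    """Models UNLABEL target EQUAL other: drop target spans identical to an other span."""
    return set(target) - set(other)

# "The runner likes running": tokens 1 and 3 contain "run", token 1 contains "runner".
run_activity = {(1, 2), (3, 4)}
run_person = {(1, 2)}
remaining_activity = unlabel_equal(run_activity, run_person)

# "The vase is broken but the lamp is fine": "is" plus the next lexeme.
is_spans = {(2, 4), (7, 9)}
broken_or_damaged = {(3, 4)}
remaining_is = unlabel_containing(is_spans, broken_or_damaged)

print(remaining_activity, remaining_is)
```

In the first example only the span for "running" keeps the activity label; in the second, "is broken" loses the is label while "is fine" keeps it.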
If you have any further questions, please feel free to contact us.
A prototype of the pattern entity tester is available at:
https://cdn.sentione.com/automate/mtt-tutorial-pl/gibon.html
