Methods for categorisation

Under the Email and work item | Methods for categorisation menu choice in ACE Admin you can choose a method to search and categorise incoming emails and work items.

The two different categorisation methods are:

Keyword based (ACE Keyword)
Can be used to determine which task type to give the contact, based on the keywords found in the email or work item. Also see Task type assignment for email and work item.

Can also be used to see in which language an email is written, to be able to send an automatic answer in the correct language, as described in Email — Automatic answer.

Pattern matching (ACE Pattern)
Used, for example, for handling forms where the information follows a specific pattern. What is found via pattern matching ends up in contact data keys and can be used for routing to specific queues or for screen pop in task handling systems. Also see Email — Pattern matching.
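As an illustration only, the idea behind pattern matching can be sketched in Python with regular expressions. The form layout, the patterns and the contact data key names (customerNumber, orderNumber) are hypothetical, and the pattern syntax used by ACE Pattern may differ.

```python
import re

# Hypothetical patterns: contact data key -> regular expression with one capture group.
PATTERNS = {
    "customerNumber": re.compile(r"^Customer number:\s*(\d+)\s*$", re.MULTILINE),
    "orderNumber": re.compile(r"^Order number:\s*([A-Z]{2}-\d+)\s*$", re.MULTILINE),
}

def extract_contact_data(text):
    """Return a dict with one contact data key per pattern found in the text."""
    found = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[key] = match.group(1)
    return found

# A made-up form email where each field follows a fixed "Label: value" pattern.
form_email = """Customer number: 48213
Order number: SE-7745
Message: My delivery is late."""

contact_data = extract_contact_data(form_email)
print(contact_data)
```

The resulting dictionary stands in for the contact data keys that could then drive routing or screen pop.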

An email can consist of text, with or without formatting. A formatted email can, for example, be an HTML page that can also be opened in a web browser. ACE Email can convert the relevant content of HTML emails into text, so that these can also be searched and categorised using the settings you have made. When an email consists of several alternative parts, for example an HTML page supplemented with the same information as plain text, only one part is searched.
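The handling of alternative parts can be sketched with the Python standard library. This is not how ACE Email is implemented. The preference for the plain text part is an assumption (the text above does not say which part is searched), and the helper names TextExtractor and searchable_text are made up for the example.

```python
from email.message import EmailMessage
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def searchable_text(message):
    """Pick exactly one alternative part and return it as plain text."""
    # Assumption: prefer the plain text part and fall back to the HTML part.
    part = message.get_body(preferencelist=("plain", "html"))
    if part is None:
        return ""
    content = part.get_content()
    if part.get_content_type() == "text/html":
        extractor = TextExtractor()
        extractor.feed(content)
        return " ".join(extractor.chunks)
    return content.strip()

# An email that only has an HTML part.
html_only = EmailMessage()
html_only.set_content(
    "<html><body><h1>Order</h1><p>My computer is broken.</p></body></html>",
    subtype="html",
)

# An email with two alternative parts carrying the same information.
both = EmailMessage()
both.set_content("My computer is broken.")
both.add_alternative(
    "<html><body><p>My computer is broken.</p></body></html>", subtype="html"
)

print(searchable_text(html_only))
print(searchable_text(both))
```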

Keyword based categorisation

Keyword-based categorisation means that a text will be searched in order to find certain keywords. These keywords are defined for a number of categories.

To use this method, select ACE Keyword in the drop-down menu under Methods for categorisation.

In the Categories box you will find a number of categories which are already defined in your ACE system. Tick the categories you want to use.

Add a category

To add a new category, click the Add category... button.

Enter the category name in the entry field and click on OK.

You can now enter keywords for the new category by selecting it in the Category drop-down menu under the Change of Category header.

Keywords can either be loaded from an existing text file using the Download from file button or entered manually as a list in the Keywords box, in the bottom right part of the window.

If you tick the View help to add keyword box, you will see the rules used to define keywords. These are the rules:

  • Categorisation is performed irrespective of upper and lower case letters. This means that computer and Computer are identical keywords and both keywords will match the words computer, Computer, COMPUTER, compuTER, etc.
  • An asterisk * is used to get a keyword to match all words in the text beginning with a particular prefix. The keyword computer* will match the words computer, computers, computerisation, etc.
  • A plus sign + is placed before a keyword to indicate that the keyword is necessary, i.e. that the keyword must be present in the text. In other words, the keyword +computer means that computer is a necessary word.
  • A minus sign - is placed before a keyword to indicate that the keyword is prohibited, i.e. that the keyword may not be present in the text. In other words, the keyword -computer means that computer is a prohibited word.
  • An equals sign = is placed before a keyword to allow special characters to be included in the keyword. The keyword =+computer will match +computer in the input text.
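The rules above can be sketched in Python as follows. This is a minimal illustration, not the ACE implementation. The function names, the whitespace-based splitting of the text into words, and the choice to use the number of keyword hits as the category weight are assumptions made for the example.

```python
def parse_keyword(raw):
    """Split a raw keyword into (kind, word, is_prefix)."""
    if raw.startswith("="):
        # '=' makes the rest of the keyword literal, special characters included.
        return "normal", raw[1:].lower(), False
    kind = "normal"
    if raw.startswith("+"):
        kind, raw = "required", raw[1:]
    elif raw.startswith("-"):
        kind, raw = "prohibited", raw[1:]
    is_prefix = raw.endswith("*")
    if is_prefix:
        raw = raw[:-1]
    return kind, raw.lower(), is_prefix

def tokenize(text):
    # Assumption: words are separated by whitespace; surrounding punctuation
    # is dropped and everything is lower-cased (case-insensitive matching).
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]

def count_hits(word, is_prefix, tokens):
    if is_prefix:
        return sum(1 for t in tokens if t.startswith(word))
    return tokens.count(word)

def category_weight(keywords, text):
    """Return the number of keyword hits, or 0 if the category is ruled out."""
    tokens = tokenize(text)
    weight = 0
    for raw in keywords:
        kind, word, is_prefix = parse_keyword(raw)
        hits = count_hits(word, is_prefix, tokens)
        if kind == "required" and hits == 0:
            return 0  # a necessary keyword is missing
        if kind == "prohibited" and hits > 0:
            return 0  # a prohibited keyword is present
        if kind != "prohibited":
            weight += hits
    return weight

print(category_weight(["computer*"], "Computers and COMPUTER parts, computerisation"))
print(category_weight(["+computer", "screen"], "My screen is broken"))
print(category_weight(["-computer", "screen"], "My computer screen is broken"))
print(category_weight(["=+computer"], "Search for +computer here"))
```

The four calls exercise the prefix, necessary, prohibited and literal rules in turn.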

When the keyword list is finished, click on Save keyword. Note that you may also save the keywords to a file.

In the example picture, the words der, das and golfplatz belong to the german category. The + sign in front of der and das means that both words must be found in the email text for this category to be included in the overall categorisation result.

Settings for the selected method - Parameters

Under the Parameters heading to the right, enter the values for the max_categories and cut_off parameters.

cut_off
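This parameter is used to sift out poor category matches, i.e. to remove a category that is mentioned much less than the other categories. The value for cut_off is a number between 0.0 and 1.0. A category is removed when its weight divided by the weight of the strongest category is less than cut_off.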
max_categories
This parameter controls how many categories the ACE Email Text Categoriser may return. If, for instance, max_categories is set to 2 and an email contains 3 categories, the Text Categoriser will only return the two with the highest number of matches found, i.e. with the strongest weight. If, on the other hand, ACE only finds one category, this sole category will be returned. Allowing more than one category is useful if you map categories to task types, since a customer quite probably mentions more than one task in an email.

Example for the Swedish and English categories:

If an email contains 30 English words and one Swedish word, you may want the categorisation to return only English. To achieve this, the categories are weighed against each other.

If ACE finds 30 English words and one Swedish word, the weight is 30 for English and 1 for Swedish, giving a ratio of 1/30 ≈ 0.0333. If you have set cut_off to 0.1, Swedish will not be returned as a category since 0.0333 is less than 0.1.

If ACE instead finds 30 English words and 15 Swedish words, agents speaking both languages might be needed and you might want both languages to be returned. 15/30 = 0.5 and since 0.5 is greater than 0.1, Swedish will be returned as well.
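The arithmetic in this example can be reproduced with a short Python sketch. The function name select_categories is hypothetical and the code is not the ACE implementation. It assumes that each weight is compared with the strongest weight, as in the example, and that a category whose ratio is exactly equal to cut_off is kept, which the example does not specify.

```python
def select_categories(weights, cut_off, max_categories):
    """Return the names of the categories to report, strongest first."""
    hits = {name: w for name, w in weights.items() if w > 0}
    if not hits:
        return []
    strongest = max(hits.values())
    # Drop a category when its weight relative to the strongest one is
    # less than cut_off (assumption: a ratio equal to cut_off is kept).
    kept = [name for name, w in hits.items() if w / strongest >= cut_off]
    kept.sort(key=lambda name: hits[name], reverse=True)
    # Never return more than max_categories categories.
    return kept[:max_categories]

# 1/30 is below 0.1, so Swedish is dropped.
print(select_categories({"English": 30, "Swedish": 1}, cut_off=0.1, max_categories=2))

# 15/30 = 0.5 is above 0.1, so both languages are returned.
print(select_categories({"English": 30, "Swedish": 15}, cut_off=0.1, max_categories=2))

# With max_categories set to 1, only the strongest category remains.
print(select_categories({"English": 30, "Swedish": 15}, cut_off=0.1, max_categories=1))
```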