Best practices when training model-based entities in an IVA

To train Pega Intelligent Virtual Assistant (IVA) to extract the correct information from a conversation, you highlight the relevant entities in the data records. So that the IVA can respond more efficiently during a chat conversation, you train the text analytics model to identify a word or phrase as structured content and to associate metadata with that content. For example, you can train the system to detect the make and model of a car by marking this information in the data records as entities.

When training model-based entities in Pega Platform, ensure that you highlight multiple types of entities in one phrase, sentence, or document. From the accompanying words, the model learns both the probability that an entity occurs and the probability that it is preceded or followed by other entity types. The model also uses the training data to learn the order in which entities are likely to appear. Apart from single words or short phrases, you can also mark larger chunks of text as entities, for example, entire paragraphs.

When you are training the model for the IVA, highlight at least 10 to 20 instances per entity type in the text. Avoid highlights that cover only part of an entity or leave out characters, and avoid highlighting misspelled words. Any such mistake reduces the accuracy of the model.
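The guidelines above can be checked before training with a short script. The following Python sketch is only an illustration and does not use any Pega API: the record layout (a text plus a list of start, end, and entity-type highlights), the function names, and the word-boundary heuristic for spotting truncated highlights are all assumptions made for this example. The threshold of 10 is the lower bound of the 10 to 20 instances recommended above.

```python
from collections import Counter

# Hypothetical record layout (not a Pega format): the text of a data record
# plus its highlights as (start, end, entity_type) character offsets.
RECORDS = [
    {
        "text": "I want to insure my Toyota Corolla from 2015",
        "entities": [(20, 26, "make"), (27, 34, "model"), (40, 44, "year")],
    },
    {
        "text": "My car is a Honda",
        "entities": [(12, 17, "make")],
    },
    {
        # The "make" highlight stops one character short ("For" instead of "Ford").
        "text": "Quote for a Ford Fiesta please",
        "entities": [(12, 15, "make"), (17, 23, "model")],
    },
]


def count_instances(records):
    """Count how many times each entity type is highlighted."""
    return Counter(etype for r in records for _, _, etype in r["entities"])


def underrepresented(records, minimum=10):
    """Return entity types with fewer than `minimum` highlighted instances."""
    counts = count_instances(records)
    return sorted(t for t, n in counts.items() if n < minimum)


def single_type_records(records):
    """Return indexes of records that highlight fewer than two entity types."""
    return [
        i for i, r in enumerate(records)
        if len({etype for _, _, etype in r["entities"]}) < 2
    ]


def truncated_spans(records):
    """Return highlights that start or end in the middle of a word (heuristic)."""
    problems = []
    for i, r in enumerate(records):
        text = r["text"]
        for start, end, etype in r["entities"]:
            cuts_start = start > 0 and text[start - 1].isalnum() and text[start].isalnum()
            cuts_end = end < len(text) and text[end - 1].isalnum() and text[end].isalnum()
            if cuts_start or cuts_end:
                problems.append((i, etype, text[start:end]))
    return problems


if __name__ == "__main__":
    print("Too few instances:", underrepresented(RECORDS))
    print("Records with a single entity type:", single_type_records(RECORDS))
    print("Truncated highlights:", truncated_spans(RECORDS))
```

A check like this does not detect misspelled words, and a highlight that deliberately covers part of a token would be reported as truncated, so treat the results as prompts for manual review rather than as errors.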

Note: To learn more about best practices for creating entities for a text analytics model, see Best practices for creating extraction models.