Best practices for cleaning up training data in an IVA

Before you apply changes to the text analytics model for Pega Intelligent Virtual Assistant (IVA), correct each training data record by fixing issues in the text and removing any misplaced characters. Clean records ensure that the IVA learns only from properly formatted sample data, which helps to improve the accuracy of the model.

To clean up a training record for an IVA, remove trailing white spaces and stray non-alphanumeric characters, such as # or misplaced apostrophes and quotation marks in sentences and phrases. Also correct typos and misspelled words, restore missing characters, and fix or remove incomplete tags. To learn more about editing a training data record, see Correcting training data in an IVA.
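If you prepare training records outside of the product before importing them, the mechanical part of this cleanup can be scripted. The following Python sketch is illustrative only: `clean_record` is a hypothetical helper, not part of any Pega API, and the set of characters that it treats as noise is an assumption that you should adjust to your own data. It does not correct typos, misspelled words, or incomplete tags, which still need manual review.

```python
import re

# Hypothetical helper for illustration; not part of any Pega API.
# Assumption: these symbols carry no meaning in the training text.
_STRAY_CHARS = re.compile(r"[#*~^|<>{}\[\]\\]")
# Straight and curly double quotation marks.
_DOUBLE_QUOTES = re.compile(r"[\"\u201c\u201d]")
# An apostrophe that is not between two letters, so "don't" is kept.
# Limitation: plural possessives such as "customers'" lose the apostrophe.
_MISPLACED_APOSTROPHE = re.compile(r"(?<![A-Za-z])'|'(?![A-Za-z])")
_WHITESPACE = re.compile(r"\s+")


def clean_record(text: str) -> str:
    """Return a copy of one training record with common noise removed."""
    text = _STRAY_CHARS.sub(" ", text)            # replace stray symbols with a space
    text = _DOUBLE_QUOTES.sub("", text)           # drop double quotation marks
    text = _MISPLACED_APOSTROPHE.sub("", text)    # drop misplaced apostrophes
    text = _WHITESPACE.sub(" ", text)             # collapse runs of white space
    return text.strip()                           # remove leading and trailing white space


if __name__ == "__main__":
    print(clean_record("  I want to #book a flight'  "))
    print(clean_record('"I don\'t know" '))
```

Stray symbols are replaced with a space rather than deleted so that words on either side of a symbol are not joined by accident. The white space pass then collapses any extra spaces that this introduces.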