Preparing data

Updated on March 11, 2021

The Data preparation step begins when you connect to a database or upload your data from a data set or a CSV file.

The columns in the data source are used as predictors, but you can define their roles later. For more information, see Defining the predictor role.

The data is used to create a statistically relevant sample of customer details, which is then divided into development, validation, and test data sets. Data in the development sample is used to develop predictive models. Data in the validation and test samples is used to validate and test model accuracy.
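As an illustration of dividing customer data into development, validation, and test samples, the following Python sketch shuffles a list of records and splits it by proportion. It is not Prediction Studio's implementation. The function name split_sample, the 60/20/20 proportions, the fixed seed, and the customer_id field are all assumptions made for this example.

```python
import random


def split_sample(records, dev_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle records and split them into three disjoint samples.

    The fractions and the seed are illustrative assumptions, not
    values taken from Prediction Studio.
    """
    if dev_frac < 0 or val_frac < 0 or dev_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible samples

    n_dev = round(len(shuffled) * dev_frac)
    n_val = round(len(shuffled) * val_frac)

    return {
        "development": shuffled[:n_dev],
        "validation": shuffled[n_dev:n_dev + n_val],
        "test": shuffled[n_dev + n_val:],  # whatever remains
    }


# Hypothetical input: one record per customer.
customers = [{"customer_id": i} for i in range(100)]
samples = split_sample(customers)
```

Because the three slices do not overlap, no customer used to develop a model appears in the data that later validates or tests it.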

The data source contains information about customers and their previous behavior. It should contain one record per customer, and every record should have the same structure. Ideally, data is present for all fields and all customers, but in most circumstances some missing data can be tolerated.
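Before you upload a file, you can check these requirements yourself. The following Python sketch is one possible check, not a Prediction Studio feature. It reports customer identifiers that appear more than once, records whose fields differ from the rest, and the share of missing values per field. The function name check_records and the field names customer_id, age, and churned are hypothetical.

```python
from collections import Counter


def check_records(records, id_field="customer_id"):
    """Report duplicate IDs, inconsistent structure, and missing-value rates."""
    if not records:
        raise ValueError("records must not be empty")

    # One record per customer: any ID seen more than once is a duplicate.
    id_counts = Counter(r.get(id_field) for r in records)
    duplicates = [cid for cid, n in id_counts.items() if n > 1]

    # Same structure: every record should carry the full set of fields.
    fields = set().union(*(r.keys() for r in records))
    inconsistent = [r.get(id_field) for r in records if set(r.keys()) != fields]

    # Missing data: a value counts as missing if absent, None, or empty.
    missing_rate = {
        f: sum(1 for r in records if r.get(f) in (None, "")) / len(records)
        for f in fields
    }

    return {
        "duplicates": duplicates,
        "inconsistent": inconsistent,
        "missing_rate": missing_rate,
    }


# Hypothetical sample data with one duplicate ID and one incomplete record.
records = [
    {"customer_id": 1, "age": 34, "churned": "N"},
    {"customer_id": 2, "age": None, "churned": "Y"},
    {"customer_id": 2, "age": 51, "churned": "N"},
    {"customer_id": 3, "age": 29},
]
report = check_records(records)
```

How much missing data is acceptable depends on the field and the model, so the sketch reports the rates and leaves the threshold to you.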

Based on your model selection and outcome field categorization, Prediction Studio generates data that you can view in the Graphical view tab and Tabular view tab. For more information, see Defining an outcome.

  • Selecting a data source

    Select a data source for the creation of predictive models. Before you select the input data for development, validation, and testing, make sure that these resources are available to you.

  • Constructing a sample

    A sample is a subset of historical data that you extract by applying a selection or sampling method to the data source. Constructing a sample produces the development, validation, and test data sets used for analysis and modeling.

  • Defining an outcome

    Select a model type and define the outcome field representing the behavior that you want to predict in the model.
