Changes to the architecture of the Data Flow service

Valid from Pega Version 8.4

In Pega Platform™ 8.4, the architecture of batch and real-time data flows uses improved node handling to increase the stability of data flow runs. The new architecture requires fewer interactions with the database and between nodes, which makes the Data Flow service more resilient.

If you upgrade from a previous version of Pega Platform, see the following list for an overview of how the behavior of the Data Flow service differs from previous versions:


Nodes no longer communicate and trigger each other, but run periodic tasks instead. As such, triggering a new run does not cause the service nodes to immediately start the run. Instead, the run starts a few seconds later. The same applies to user actions such as stopping, starting, and updating the run. The system also processes topology changes as periodic tasks, so it might take a few minutes for new nodes to join runs, or for partitions to redistribute when a node leaves a run.
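The delay described above can be illustrated with a minimal simulation. This is only a sketch of the general polling idea, not Pega code: the ServiceNode class, the 5-second interval, and the seconds_until_observed helper are all hypothetical names and values chosen for the example.

```python
class ServiceNode:
    """Hypothetical service node that learns about run changes only when it polls."""

    def __init__(self, poll_interval_s):
        self.poll_interval_s = poll_interval_s
        self.next_poll_at = poll_interval_s
        self.observed_action = None

    def tick(self, now_s, requested_action):
        """Run the periodic task if it is due; return True when a poll happened."""
        if now_s < self.next_poll_at:
            return False
        self.observed_action = requested_action
        self.next_poll_at = now_s + self.poll_interval_s
        return True


def seconds_until_observed(node, requested_at_s, action, horizon_s=60):
    """Advance a simulated clock one second at a time until the node sees the action."""
    for now_s in range(horizon_s + 1):
        requested = action if now_s >= requested_at_s else None
        node.tick(now_s, requested)
        if node.observed_action == action:
            return now_s - requested_at_s
    return None


# A start request made at t=2 is picked up at the next poll (t=5), not immediately.
delay = seconds_until_observed(ServiceNode(poll_interval_s=5), requested_at_s=2, action="start")
```

In this model no node is notified directly, so the delay depends only on when the request arrives relative to the next periodic task.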

Updates to lifecycle actions

To make lifecycle actions more intuitive, the Stop action consolidates both the Stop and Pause actions. The Start action consolidates both the Resume and Start actions.

You can resume or restart stopped and failed runs with the Start and Restart actions. The Start action is only available for resumable runs and continues the run from where it stopped. The Restart action causes the run to process from the beginning. Completed runs can only be restarted. If a run completes with failures, you can restart it from the beginning, or process only the errors by using the Reprocess failures action.
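The rules in the preceding paragraph can be summarized as a small lookup. The following is a hedged sketch only: the function name and the status strings are assumptions made for illustration and do not correspond to a Pega API.

```python
def available_actions(status, resumable):
    """Return the resume/restart actions described above for a run in the given status.

    The status strings are illustrative labels, not official Pega identifiers.
    """
    if status in ("Stopped", "Failed"):
        # Start continues from where the run stopped, so it requires a resumable run.
        return ["Start", "Restart"] if resumable else ["Restart"]
    if status == "Completed":
        # Completed runs can only be restarted from the beginning.
        return ["Restart"]
    if status == "Completed with failures":
        return ["Restart", "Reprocess failures"]
    # Other statuses are outside the scope of this sketch.
    return []
```

For example, available_actions("Stopped", resumable=False) offers only Restart, because a non-resumable run cannot continue from where it stopped.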

Starting a run

New data flow runs have the Initializing status and start automatically. Because you no longer need to start a new run manually, the New status has been removed.

If there are no nodes available to process a run, the run gets the Queued status and waits for an available node.

Triggering pre- and post-activities

The system now triggers pre-activities on a random service node, rather than on the node that triggered the run.

The system triggers post-activities only for runs that complete, fail, or complete with failures. If you manually stop a run with the Stop action, the post-activity does not trigger. However, restarting the run with the Restart action triggers the post-activity first, and then the pre-activity.

You can no longer choose to run pre- and post-activities on all nodes.
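The triggering rules in this section can be restated as a short decision function. This is an illustrative sketch only; the event names are hypothetical labels invented for the example, not Pega identifiers.

```python
# Run outcomes for which the text above says the post-activity is triggered.
POST_ACTIVITY_OUTCOMES = {"completed", "failed", "completed_with_failures"}


def activities_triggered(event):
    """Return the activities triggered for a run event, in order."""
    if event in POST_ACTIVITY_OUTCOMES:
        return ["post-activity"]
    if event == "stopped_manually":
        # The Stop action does not trigger the post-activity.
        return []
    if event == "restarted_after_stop":
        # Restart triggers the post-activity first, and then the pre-activity.
        return ["post-activity", "pre-activity"]
    raise ValueError(f"event not covered by this sketch: {event}")
```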

Selecting a node fail policy

For resumable runs, you can no longer select a node fail policy. If a node fails, the partitions assigned to that node automatically continue the run on different nodes.

For non-resumable runs, you can choose to restart the partitions assigned to the failed node on different nodes, or to fail the partitions assigned to the failed node.
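The following sketch models the two cases above on a simple partition-to-node mapping. It is an assumption-laden illustration: the function name, the "restart" and "fail" policy labels, and the round-robin reassignment are all hypothetical, and the source does not state how Pega chooses the new nodes.

```python
from itertools import cycle


def handle_node_failure(assignment, failed_node, healthy_nodes, resumable, policy="restart"):
    """Return (new_assignment, failed_partitions) after a node fails.

    assignment maps partition id -> node name. For resumable runs the policy
    argument is ignored, because partitions always continue on other nodes.
    """
    if not resumable and policy not in ("restart", "fail"):
        raise ValueError(f"unknown policy: {policy}")
    reassign = resumable or policy == "restart"
    if reassign and not healthy_nodes:
        raise ValueError("no healthy nodes available for reassignment")

    targets = cycle(healthy_nodes)  # round-robin is an assumption of this sketch
    new_assignment, failed_partitions = {}, []
    for partition, node in assignment.items():
        if node != failed_node:
            new_assignment[partition] = node
        elif reassign:
            new_assignment[partition] = next(targets)
        else:
            failed_partitions.append(partition)
    return new_assignment, failed_partitions
```

With assignment {0: "a", 1: "b", 2: "a"} and node "a" failing, a resumable run moves partitions 0 and 2 to the remaining nodes, while a non-resumable run with the "fail" policy reports them as failed.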

No service nodes and active runs

If the last data flow node for an in-progress run fails, the run remains in the In Progress state, even if no processing takes place. This behavior occurs because the data flow architecture now prevents unrelated nodes from affecting runs.

The UpdateAdaptiveModels agent causes an exception after a Pega 7.2 to 7.2.1 upgrade

Valid from Pega Version 7.2.1

After the Pega 7 Platform is upgraded from version 7.2 to 7.2.1, the log files might show an error that is caused by the UpdateAdaptiveModels agent. This agent is enabled by default and is responsible for updating scoring models in the Pega 7 Platform. If you use adaptive models in your solution, you can avoid this error by configuring the Adaptive Decision Manager service. If you do not use adaptive models, disable the UpdateAdaptiveModels agent.

For more information, see Configuring the Adaptive Decision Manager service and Pega-DecisionEngine agents.

Reconfiguration of the Adaptive Decision Manager service after upgrade to Pega 7.2.1

Valid from Pega Version 7.2.1

After you upgrade the Pega 7 Platform to version 7.2.1, you need to reconfigure the Adaptive Decision Manager service. Beginning with Pega 7.2.1, the Adaptive Decision Manager (ADM) service is native to the Pega 7 Platform and is supported by the Decision data node infrastructure.

For more information, see Services landing page.

Interactions in flows are no longer supported by the Run Interaction shape

Valid from Pega Version 7.3.1

The Run Interaction shape in flows has been replaced by the Run Data Flow shape, which supports running a single case data flow with a strategy. Flows that include the Run Interaction shape continue to work; however, you must now use the Utility shape to reference any new interactions that you create.

For more information, see Running a decision strategy from a flow and About Interaction rules.

Extension attributes are not supported in PMML models

Valid from Pega Version 7.3.1

Models in the Predictive Model Markup Language (PMML) format version 4.3 that contain extension attributes with the x- prefix are not valid. These extension attributes are deprecated; you must use extension elements instead. In addition, if the output type of any output field in the model is set to FLOAT, change it to DOUBLE.
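Before importing a model, a quick scan along the following lines could flag both problems. This is a minimal sketch that uses only the Python standard library; the function name and the trimmed sample fragment are made up for illustration, and the check assumes that the output type is expressed as the dataType attribute of OutputField elements.

```python
import xml.etree.ElementTree as ET


def local_name(name):
    """Strip an XML namespace such as '{http://www.dmg.org/PMML-4_3}' from a name."""
    return name.rsplit("}", 1)[-1]


def find_pmml_issues(pmml_text):
    """List x- extension attributes and float-typed output fields in a PMML string."""
    issues = []
    for elem in ET.fromstring(pmml_text).iter():
        tag = local_name(elem.tag)
        for attr in elem.attrib:
            if local_name(attr).startswith("x-"):
                issues.append(f"{tag}: extension attribute '{attr}'")
        if tag == "OutputField" and elem.get("dataType", "").lower() == "float":
            issues.append(f"OutputField '{elem.get('name')}': dataType is float, use double")
    return issues


# Trimmed, hypothetical fragment; a real model contains many more elements.
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="age" optype="continuous" dataType="double" x-source="crm"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <Output>
      <OutputField name="score" optype="continuous" dataType="float"/>
    </Output>
  </RegressionModel>
</PMML>"""

issues = find_pmml_issues(SAMPLE)
```

For the sample fragment, the scan reports the x-source attribute on the DataField element and the float-typed score output field.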

For more information, see PMML 4.3 - General Structure in the Data Mining Group documentation.

The Upload responses action is not supported for adaptive models with customized context

Valid from Pega Version 7.3.1

A default instance of the Adaptive Model rule contains five model identifiers (.pyIssue, .pyGroup, .pyName, .pyDirection, .pyChannel) that are used to partition adaptive models. If you add other identifiers to your Adaptive Model rule instance, you cannot upload responses to this instance with the Upload Responses wizard, and the following error is displayed: The Flow Action post-processing activity pzUploadCSVFile failed: Cannot parse csv file. You can still train such adaptive models with data flows.

For more information, see Training adaptive models in bulk with data flows, Model context, and Uploading customer responses.

Upgrading Adaptive Decision Manager data mart tables might fail

Valid from Pega Version 7.3.1

Issue: Upgrade from 7.3 to 7.3.1 fails if the data contained in the pxInsName column of the PR_DATA_DM_ADMMART_PRED_FACT table is longer than 128 characters.

Reason: During the Pega Platform™ upgrade from 7.3 to 7.3.1, data in the Adaptive Decision Manager (ADM) data mart tables is migrated from the PR_DATA_DM_ADMMART_PRED_FACT table to the PR_DATA_DM_ADMMART_MDL_FACT table. In Pega 7.3.1, ADM uses only the PR_DATA_DM_ADMMART_MDL_FACT table, where the pxInsName property can store values that are up to 128 characters long. In Pega Platform 7.3, the pxInsName property in the PR_DATA_DM_ADMMART_PRED_FACT table can store values that are up to 255 characters long. If the pxInsName property contains values that are longer than 128 characters, the upgrade fails.

Resolution: Issue an ALTER TABLE statement to change the pxInsName column size to 255 characters and resume the upgrade. For example:

ALTER TABLE rules.pr_data_dm_admmart_pred ALTER COLUMN pxInsName TYPE varchar(255);
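To find out in advance whether an environment is affected, a length check on the source table might look like the following. This is a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in so that the example is self-contained, the helper name is hypothetical, and in a real environment the query would run against the Pega database with the appropriate schema name.

```python
import sqlite3

MAX_LEN = 128  # limit described in the Reason paragraph above


def find_long_names(conn, table="pr_data_dm_admmart_pred_fact"):
    """Return pxInsName values that exceed the 128-character limit."""
    query = f"SELECT pxInsName FROM {table} WHERE LENGTH(pxInsName) > ?"
    return [row[0] for row in conn.execute(query, (MAX_LEN,))]


# Stand-in data: one short value and one 200-character value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pr_data_dm_admmart_pred_fact (pxInsName TEXT)")
conn.executemany(
    "INSERT INTO pr_data_dm_admmart_pred_fact (pxInsName) VALUES (?)",
    [("short-name",), ("x" * 200,)],
)
too_long = find_long_names(conn)
```

If the returned list is empty, no value exceeds the limit that causes the failure described above.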

For more information, see Adaptive Decision Manager data model.

