Configuring the Data Flow service

Enable running data flow instances on decision data nodes by configuring the Data Flow service. Specify the number of Pega Platform threads that you want to use for running the data flow instances.

Each new data flow instance is processed on all Data Flow nodes.

Test runs of data flows are always processed on the local Pega Platform node, so you do not need to configure the Data Flow service for test runs.

Before you begin: Add at least one node to start the Data Flow service. For more information, see Adding nodes to Decision Management services.
  1. In the header of Dev Studio, click Configure > Decisioning > Infrastructure > Services > Data flow.
  2. In the Service list, select a type of Data Flow service.
    Nodes in the Batch service are used to process the batch runs of data flows. Nodes in the Real Time service are used to process the real-time runs of data flows. Both services use independent nodes.
  3. In the Data flow nodes section, click Edit settings.
  4. In the Thread count field, enter the number of threads that each node uses to process data flow runs.
    For example: When the source of a data flow is divided into five partitions, the data flow run is divided into five assignments that can be processed simultaneously on separate threads if there are enough threads.

    The number of available threads is calculated by multiplying the thread count by the number of nodes. With two nodes and a thread count of five, ten threads are available in the system. A data flow run that is divided into five assignments uses five of those threads, and the other five threads remain idle.

  5. Click Submit.
  6. Optional: On the Data Flow tab, on the Execute menu, select the action that you want to run for the selected Data Flow node.
    For more information, see Managing decision data nodes.
    Note: If you decommission a node that has active data flow runs, the status of that node changes to LEAVING, and it is not decommissioned until all active data flow runs are finished.
  7. Optional: To display the status parameters of a selected Data Flow node, on the Data Flow tab, click the row for that node.
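
The thread arithmetic described for the Thread count field in step 4 can be illustrated with a short calculation. The following Python sketch is not part of Pega Platform; the function name thread_usage and its parameters are hypothetical and only model the rule stated above (available threads = thread count x number of nodes, with one thread per partition assignment). It assumes that a run never uses more threads than it has partitions.

```python
def thread_usage(nodes: int, thread_count: int, partitions: int) -> dict:
    """Estimate thread usage for a single data flow run.

    nodes        -- number of Data Flow nodes in the service
    thread_count -- value of the Thread count field (threads per node)
    partitions   -- number of partitions in the data flow source
    """
    if nodes < 1 or thread_count < 1 or partitions < 1:
        raise ValueError("nodes, thread_count, and partitions must be positive")

    # Available threads = thread count multiplied by the number of nodes.
    available = nodes * thread_count

    # Each partition becomes one assignment; an assignment needs one thread.
    # Assumption: a run cannot use more threads than it has assignments.
    used = min(partitions, available)

    return {"available": available, "used": used, "idle": available - used}


# Example from step 4: two nodes, thread count of five, five partitions.
print(thread_usage(nodes=2, thread_count=5, partitions=5))
# {'available': 10, 'used': 5, 'idle': 5}
```

If the source has more partitions than there are available threads, the same sketch reports zero idle threads, which suggests that the remaining assignments wait until a thread becomes free.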