Creating an ingestion case
This content applies only to Pega Cloud environments
Configure the ingestion data flows as Pega cases.
By using the case process, you can also schedule certain ingestion runs during off-hours to avoid executing batch processes during peak daytime processing.

The following figure provides an overview of an ingestion case that can be used to ingest and process customer data. For more information about creating case types, see Automating work by creating case types.

Checking for files

This stage validates that all data files identified in the manifest file for a given ingestion execution are present in the data folder. You can use repository APIs for all file-related operations. For more information, see Repository APIs.

How you confirm that a transfer is complete depends on the SFTP process. The following options are possible:
- The SFTP process guarantees that the manifest file is delivered after the data files were transferred.
- The SFTP process cannot guarantee that the manifest file is delivered after the data files were transferred, and token files are used.
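For illustration only, the following sketch shows the kind of presence check that this stage performs. It assumes a manifest that lists one data file name per line, and it uses plain java.nio file operations as a stand-in for the repository APIs; the class and method names are hypothetical, not part of Pega Platform.

import java.io.IOException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

public class ManifestCheck {

    // Returns the manifest entries that are missing from the data folder.
    // An empty result means that the case can advance to the next stage.
    public static List<String> missingFiles(Path manifest, Path dataFolder)
            throws IOException {
        // File names declared in the manifest, one per line (assumed format).
        List<String> expected = Files.readAllLines(manifest);

        // File names actually present in the data folder.
        Set<String> present;
        try (Stream<Path> files = Files.list(dataFolder)) {
            present = files.map(path -> path.getFileName().toString())
                           .collect(Collectors.toSet());
        }

        return expected.stream()
                       .filter(name -> !name.isBlank())
                       .filter(name -> !present.contains(name))
                       .collect(Collectors.toList());
    }
}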
Uploading to the staging data set

Add this stage if your application requires data validation; otherwise, this stage is optional. As a best practice, ensure that all the data was successfully read from all the data files before you upload the data to a permanent data set. Using a Cassandra data set improves processing by taking advantage of Cassandra's inherent high-speed processing capabilities.

For critical customer data, an exact match is required. However, for non-critical data, you can define a tolerance that allows processing to continue if there is no exact match, as in the sketch below.
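A minimal sketch of such a tolerance check, assuming that the match is expressed as record counts and that the tolerance value (for example, 0.01 for one percent) is defined by your project:

public class ToleranceCheck {

    // Returns true when processing may continue. For critical data, pass
    // tolerance = 0.0, which enforces an exact match.
    public static boolean withinTolerance(long expected, long actual,
                                          double tolerance) {
        if (expected == 0) {
            return actual == 0;
        }
        double deviation = Math.abs(expected - actual) / (double) expected;
        return deviation <= tolerance;
    }
}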
If optional size validation and non-standard decryption are required, perform these operations at this point. Size validation might not be applicable if you are transferring compressed files. Both size validation and decryption require project development work, which in some cases might require Java development skills, depending on the decompression and decryption algorithms used.

Out-of-the-box support for reading compressed files covers the .zip and .gzip compression formats. If data files are compressed, size validation is not possible. If you perform size validation, you cannot use the out-of-the-box support for reading compressed files.
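For illustration, streaming a .gzip data file in plain Java shows why compressed input and size validation conflict: the uncompressed size is only known after the stream has been fully read. The class and method names here are hypothetical.

import java.io.*;
import java.util.zip.GZIPInputStream;

public class CompressedFileReader {

    // Streams a .gzip data file and returns the number of records (lines).
    // The uncompressed size is unknown until the stream is exhausted.
    public static long countRecords(File gzipFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(
                        new GZIPInputStream(new FileInputStream(gzipFile))))) {
            return reader.lines().count();
        }
    }
}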
Uploading to the final data set

Typically, multiple data flows process the data to a destination, depending on the data (customer, product, transactions, and so on) and the action to be performed. You can upload the data to a relational database (PostgreSQL for Pega Cloud) or to xCAR in a Cassandra database. In both cases, the data can be processed in one execution run by identifying a single data flow in the manifest file, which executes the appropriate data flows. The following operations are supported:
Cleanup

In the cleanup stage, you archive (if needed) or delete the files that were transferred to the Pega Cloud File Storage.

Handling errors

In the error stage, you archive (if needed) or delete the files that were transferred to the Pega Cloud File Storage up to the point when an error in the ingestion process occurred.
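Both stages come down to the same archive-or-delete decision. The following sketch uses plain java.nio operations on local paths as a stand-in for the repository operations that you would use against Pega Cloud File Storage; the helper name is hypothetical.

import java.io.IOException;
import java.nio.file.*;

public class IngestionCleanup {

    // Archives the file when an archive folder is configured;
    // otherwise, deletes it.
    public static void archiveOrDelete(Path file, Path archiveFolder)
            throws IOException {
        if (archiveFolder != null) {
            Files.createDirectories(archiveFolder);
            Files.move(file, archiveFolder.resolve(file.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        } else {
            Files.deleteIfExists(file);
        }
    }
}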