Parse transforms

Use a parse transform rule to define a regular expression for text searches and matches, and to create Java code equivalent to the regular expression. Parse transform rules are referenced in parse transform collection rules (Rule-Parse-TransformCollection rule type), which in turn are referenced by parse infer rules (Rule-Parse-Infer rule type).

Note: This rule type is deprecated. Use the @pxReplaceAllViaRegex function instead.

The following table defines fields and controls on the Transform tab of the Parse Transform form.

Field Description
Regular expression Enter a regular expression conforming to the Java syntax accepted by the java.util.regex package. You can use the typical syntactical elements, including character classes in brackets [ ], predefined character classes (\s for a whitespace character), boundary matchers (\b for a word boundary), quantifiers, and so on.

When you save the Parse Transform form, the Pega Platform checks the syntax of your regular expression. See the Java documentation for the Pattern class for the authoritative definition of the syntax accepted here.

You can use the Regular Expression tool to develop and validate the expression. See About the Regular Expression tool.

Convert Optional. Enter Java code to convert and validate a source text string, and optionally to form a Pega Platform property type such as Date or DateTime. The Java code executes for each regular expression pattern match in the source text.

Click the Gear icon to start your Java editor or Notepad. Within the Java code, the following read-only variables are available:

  • String aSource — The source string that was searched
  • String aMatch — The substring that matched the regular expression pattern
  • int aMatchStart — The location (character offset) within the source string where the pattern match starts
  • int aMatchEnd — The location (character offset) within the source string where the pattern match ends
  • String aMatchGroups[] — An array of all subgroup matches for the pattern

Within the Java code, set the java.lang.String variable sReturn to the converted and validated result, or to null to indicate an invalid value or an error condition.

Output Type Select the property type of the output value.
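
As an illustration of how the Transform tab fields fit together, the following Java sketch imitates the documented flow outside Pega Platform: a regular expression is matched against a source string, and conversion logic runs once for each match. The class name, the run and convert methods, the date pattern, and the yyyyMMdd output format are all assumptions made for this example. Pega Platform generates its own Java for the rule, and this page does not state how aMatchGroups is indexed or whether aMatchEnd is inclusive, so the harness below simply follows the java.util.regex.Matcher conventions.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the Java that Pega Platform generates for the rule.
class ParseTransformSketch {

    // Example regular expression (an assumption): a date written as M/D/YYYY,
    // using word boundaries (\b), the digit class (\d), and quantifiers.
    static final Pattern DATE =
            Pattern.compile("\\b(\\d{1,2})/(\\d{1,2})/(\\d{4})\\b");

    // Plays the role of the Convert code. The parameters mirror the documented
    // read-only variables; the returned value plays the role of sReturn.
    static String convert(String aSource, String aMatch, int aMatchStart,
                          int aMatchEnd, String[] aMatchGroups) {
        String sReturn = null;              // null marks an invalid match
        String[] parts = aMatch.split("/"); // month, day, year
        try {
            LocalDate date = LocalDate.of(Integer.parseInt(parts[2]),
                    Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
            sReturn = date.format(DateTimeFormatter.BASIC_ISO_DATE); // yyyyMMdd
        } catch (DateTimeException e) {
            // Out-of-range month or day: leave sReturn as null.
        }
        return sReturn;
    }

    // Hypothetical harness: calls convert once for each pattern match.
    static List<String> run(String aSource) {
        List<String> results = new ArrayList<>();
        Matcher m = DATE.matcher(aSource);
        while (m.find()) {
            // Assumption: the array holds capturing groups 1..n in order.
            String[] groups = new String[m.groupCount()];
            for (int i = 0; i < groups.length; i++) {
                groups[i] = m.group(i + 1);
            }
            // Assumption: the end offset is exclusive, as in Matcher.end().
            results.add(convert(aSource, m.group(), m.start(), m.end(), groups));
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run("Shipped 2/29/2024, returned 13/45/2024."));
    }
}
```

Running main prints [20240229, null]: the first match is a valid date, and the second is rejected by leaving sReturn as null.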

The following table defines fields and controls on the Compile tab of the Parse Transform form.

Field Description
Only allow UNIX line termination ('\n') Select to treat only a newline character (line feed) as a line terminator when evaluating '.', '^', and '$'. Sets the constant Pattern.UNIX_LINES.
Dot ('.') matches all characters including line terminators By default, a period character matches any character except a line terminator. Select to include line terminator characters in the set of characters that match a period.

Sets the constant Pattern.DOTALL.

Start of line ('^') and end of line ('$') directives match line terminators By default, the expressions ^ and $ match only at the beginning and end of the entire input sequence. Select to cause ^ to also match immediately after any line terminator (except at the end of the input), and $ to also match immediately before a line terminator.

Sets the constant Pattern.MULTILINE.

Allow embedded comments Select to allow white space and comments (starting with # and ending at the end of a line) to appear within the pattern, where they are ignored.

Sets the constant Pattern.COMMENTS.

Case Insensitive Matching for US-ASCII Characters Select to enable case-insensitive matching. This assumes that only characters in the US-ASCII character set are being matched, unless the next check box is also selected.

Sets the constant Pattern.CASE_INSENSITIVE.

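Case Insensitive Matching for UNICODE Characters Select to enable Unicode-aware case folding. In java.util.regex, this flag changes matching only when case-insensitive matching (the previous check box) is also enabled.

Sets the constant Pattern.UNICODE_CASE.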

Canonical Equivalence Select to enable canonical equivalence; two characters are considered to match only if their full canonical decompositions match.

Sets the constant Pattern.CANON_EQ.
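
The effect of each check box can be observed by using the corresponding java.util.regex.Pattern flags directly. The sketch below is standalone Java, not code generated by Pega Platform; the class name, the found helper, and the sample strings are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demonstration class; not part of Pega Platform.
class CompileFlagsSketch {

    // Returns true if the regex, compiled with the given flags, is found in input.
    static boolean found(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        String twoLines = "first\nsecond";

        // DOTALL: '.' also matches the line terminator between the words.
        System.out.println(found("first.second", 0, twoLines));              // false
        System.out.println(found("first.second", Pattern.DOTALL, twoLines)); // true

        // MULTILINE: '^' and '$' also match at line boundaries.
        System.out.println(found("^second$", 0, twoLines));                  // false
        System.out.println(found("^second$", Pattern.MULTILINE, twoLines));  // true

        // UNIX_LINES: a carriage return is no longer a line terminator,
        // so '.' is allowed to match it.
        System.out.println(found("a.b", 0, "a\rb"));                         // false
        System.out.println(found("a.b", Pattern.UNIX_LINES, "a\rb"));        // true

        // COMMENTS: white space and '#' comments inside the pattern are ignored.
        String commented = "\\d{3}  # prefix\n-\\d{4}  # line number";
        System.out.println(found(commented, Pattern.COMMENTS, "555-1234"));  // true

        // CASE_INSENSITIVE alone folds only US-ASCII letters;
        // adding UNICODE_CASE also folds the accented letter.
        String lower = "caf\u00e9";  // ends with e-acute
        String upper = "CAF\u00c9";  // ends with capital E-acute
        System.out.println(found(lower, Pattern.CASE_INSENSITIVE, upper));   // false
        System.out.println(found(lower,
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE, upper));    // true

        // CANON_EQ: 'a' plus a combining ring matches the precomposed character.
        System.out.println(found("a\u030A", 0, "\u00E5"));                   // false
        System.out.println(found("a\u030A", Pattern.CANON_EQ, "\u00E5"));    // true
    }
}
```

Flags combine with the bitwise OR operator, which corresponds to selecting more than one check box on the Compile tab.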