Configuring the Document Processing Service component

Optimize how your application invokes the Document Processing Service (DPS) by modifying the configureDPSABBYY data transform rule for the component. By selecting a profile for the optical character recognition (OCR) service and the highlighting service, and optionally editing custom parameters, you help DPS process image-based documents more efficiently. For example, you can optimize DPS for text extraction and accuracy by specifying the TextExtraction_Accuracy profile for the OCR service.
Before you begin: Install and enable the Document Processing Service component in your application. For more information, see Installing and enabling the Document Processing Service component.

A profile is a DPS mode for a specific type of processing, such as barcode recognition, book archiving, document archiving, engineering drawings, or text extraction, and is optimized for speed or accuracy.
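
The profile names that appear in this article (TextExtraction_Accuracy, TextExtraction_Speed, and BookArchiving_Accuracy) all pair a processing type with an optimization goal, separated by an underscore. The following Python sketch only illustrates how to read such a name. The split_profile function is hypothetical and is not part of DPS, and the assumption that a name has this two-part form is inferred from these three examples; see Document Processing Service profiles for the actual list.

```python
# Hypothetical helper for illustration only; it is not part of DPS.
# Assumption: a profile name has the form <ProcessingType>_<Goal>,
# as in the three names that this article mentions.
GOALS = {"Speed", "Accuracy"}


def split_profile(name: str) -> tuple[str, str]:
    """Split a profile name into its processing type and optimization goal."""
    processing_type, sep, goal = name.rpartition("_")
    if not sep or not processing_type or goal not in GOALS:
        raise ValueError(f"Not a <ProcessingType>_<Goal> profile name: {name!r}")
    return processing_type, goal


# Profile names taken from this article.
for profile in ("TextExtraction_Accuracy", "TextExtraction_Speed", "BookArchiving_Accuracy"):
    processing_type, goal = split_profile(profile)
    print(f"{profile}: type={processing_type}, optimized for {goal.lower()}")
```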

DPS uses the OCR service to optically recognize characters in an image-based document. You select a profile for the OCR service by editing the ocrPredefinedProfile parameter in the configureDPSABBYY data transform. DPS uses the highlighting service to select text in an image-based document. You select a profile for the highlighting service by editing the highlightPredefinedProfile parameter in the configureDPSABBYY data transform.

  1. In the header of Dev Studio, click the name of your application, and then click Definition.
  2. In the Enabled components section, click the Edit icon next to the DPS component name.
    For example: Click the Edit icon next to the DPS component DPService_Component_0904T094112237.
  3. In the DPService component rule, in the Component rulesets section, click the Edit icon next to the DPS ruleset.
    For example: Click the Edit icon next to the DPS ruleset named DPService:01-01.
  4. In the DPService ruleset definition, in the Rule information section, click the rule count number.
  5. In the list of DPService component rules, scroll to the data transform rules or filter the list by the data transform rule type, and then click the configureDPSABBYY rule.
  6. In the data transform rule, specify a DPS profile for the OCR service by entering a value within double quotation marks in the Source column field for the Param.ocrPredefinedProfile target.
    You can select one of several predefined DPS profiles for the OCR service. For more information, see Document Processing Service profiles.
    For example: To specify a profile that is optimized for text extraction and speed, enter: "TextExtraction_Speed"
  7. In the data transform rule, specify a DPS profile for the highlighting service by entering a value within double quotation marks in the Source column field for the Param.highlightPredefinedProfile target.
    You can select one of several predefined DPS profiles for the highlighting service. For more information, see Document Processing Service profiles.
    For example: To specify a profile that is optimized for book archiving and accuracy, enter: "BookArchiving_Accuracy"
  8. Optional: To optimize the DPS component further, enter values within double quotation marks in the Source column fields for the specific parameters that are displayed in the Target column.
    You can modify custom DPS parameters that relate to page preprocessing, PDF file export, and object extraction. For more information, see Document Processing Service custom parameters.
    For example: To specify PDF as the file export format for the highlighting service, enter "FEF_PDF" in the Source column field for the Param.highlightFileExportFormat target.
  9. Click Save.
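
As a recap of steps 6 through 8, the following Python sketch lists the Target and Source pairs from the examples above and wraps each value in double quotation marks, as the Source column field expects. It is only an illustration of the values that you type in Dev Studio: the settings mapping and the source_literal and render_rows helpers are hypothetical, and the script does not read or change the configureDPSABBYY data transform.

```python
# Illustration only: these helpers are hypothetical and do not call DPS or Pega Platform.
# The targets and values are the examples from steps 6 through 8.
settings = {
    "Param.ocrPredefinedProfile": "TextExtraction_Speed",
    "Param.highlightPredefinedProfile": "BookArchiving_Accuracy",
    "Param.highlightFileExportFormat": "FEF_PDF",
}


def source_literal(value: str) -> str:
    """Return the value wrapped in double quotation marks for the Source column."""
    if '"' in value:
        raise ValueError(f"Value must not contain a double quotation mark: {value!r}")
    return f'"{value}"'


def render_rows(mapping: dict[str, str]) -> list[tuple[str, str]]:
    """Build (Target, Source) rows in the order in which the mapping lists them."""
    return [(target, source_literal(value)) for target, value in mapping.items()]


# Print the rows as a two-column list: Target on the left, Source on the right.
for target, source in render_rows(settings):
    print(f"{target:<34} {source}")
```

A missing or doubled quotation mark is an easy mistake to make when you type these values by hand, so check each Source entry against a list like this one before you click Save.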