Document Processing Service profiles

Depending on the requirements of your organization, you can select a Document Processing Service (DPS) optimization mode for the optical character recognition (OCR) and highlighting services that your application uses for document processing. A DPS mode uses profiles that are optimized for different tasks that involve text, table, and form analysis, and highlighting of an image-based document, for example, a PNG file. The DPS component is available on the cloud, and you can also use DPS for document processing through extension points in an application and through automations for custom solutions.

You specify the OCR and highlighting service profiles that you want to use for DPS in the ocrPredefinedProfile and highlightPredefinedProfile parameters of the configureDPSABBYY data transform, respectively. For more information, see Configuring the Document Processing Service component.
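
As a rough illustration of how the two parameters pair with profile names, the following Python sketch builds a parameter mapping and rejects names that do not appear in this topic. The helper build_dps_parameters and the KNOWN_PROFILES set are hypothetical and are not part of DPS; the sketch also assumes that both parameters accept any profile name listed in this topic. In practice, you set the values in the configureDPSABBYY data transform itself.

```python
# Profile names copied from this topic, including the two compatibility names.
KNOWN_PROFILES = frozenset({
    "BarcodeRecognition",
    "BarcodeRecognition_Accuracy",
    "BarcodeRecognition_Speed",
    "BookArchiving_Accuracy",
    "BookArchiving_Speed",
    "Default",
    "DocumentArchiving_Accuracy",
    "DocumentArchiving_Speed",
    "DocumentConversion_Accuracy",
    "DocumentConversion_Speed",
    "EngineeringDrawingsProcessing",
    "HighCompressedImageOnlyPdf",
    "TextExtraction_Accuracy",
    "TextExtraction_Speed",
})


def build_dps_parameters(ocr_profile, highlight_profile):
    """Hypothetical helper: return the two profile parameters as a mapping.

    Assumption: both parameters accept any name in KNOWN_PROFILES.
    """
    for name in (ocr_profile, highlight_profile):
        if name not in KNOWN_PROFILES:
            raise ValueError(f"Unknown DPS profile: {name!r}")
    return {
        "ocrPredefinedProfile": ocr_profile,
        "highlightPredefinedProfile": highlight_profile,
    }


# Example: accurate text extraction for OCR, faster settings for highlighting.
params = build_dps_parameters("TextExtraction_Accuracy", "TextExtraction_Speed")
print(params)
```

Validating the names before use catches misspelled profile names early, which is the only purpose of this sketch.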

The list below describes the available profiles:

BarcodeRecognition_Accuracy

Used for barcode extraction. In this profile, the system extracts only barcodes and text and does not detect pictures or tables. This profile optimizes settings for accuracy.

For compatibility purposes, you can also access this profile by using the BarcodeRecognition name.

BarcodeRecognition_Speed

Used for barcode extraction. In this profile, the system extracts only barcodes and does not detect pictures, text, or tables. This profile optimizes settings for processing speed.

BookArchiving_Accuracy

Used for creating an electronic library. The settings for this profile are optimized for accuracy and best quality. This profile enables font style detection and full synthesis of the logical structure of the document.

BookArchiving_Speed

Used for creating an electronic library. This profile optimizes settings for processing speed in the following way:

  • The system keeps the best-quality settings: font style detection and full synthesis of the logical structure of the document are enabled.
  • The document analysis and recognition process works faster.

Default

Sets all the processing parameters to the values of the BookArchiving_Accuracy profile.

DocumentArchiving_Accuracy

Used for creating an electronic archive. This profile optimizes settings for accuracy in the following way:

  • The system enables the detection of the maximum amount of text in an image, including text that is embedded in the image.
  • The system does not perform skew correction.
  • The system does not detect fonts and styles.
  • The system does not perform full synthesis of the logical structure of the document.

DocumentArchiving_Speed

Used for creating an electronic archive. This profile optimizes settings for processing speed in the following way:

  • The system enables the detection of the maximum amount of text in an image, including text that is embedded in the image.
  • The system does not perform skew correction.
  • The system does not detect fonts and styles.
  • The system does not perform full synthesis of the logical structure of a document.
  • The document analysis and recognition process works faster.

DocumentConversion_Accuracy

Used for converting documents. The settings for this profile are optimized for accuracy and best quality. This profile enables font style detection and full synthesis of the logical structure of the document.

DocumentConversion_Speed

Used for converting documents. This profile optimizes settings for processing speed in the following way:

  • The system keeps the best-quality settings: font style detection and full synthesis of the logical structure of the document are enabled.
  • The document analysis and recognition process works faster.

EngineeringDrawingsProcessing

Used for recognizing technical drawings. The system takes into account the large size and the complexity of engineering diagrams, as well as different text orientations within the image. The purpose of this profile is to convert the images into a searchable PDF format. This profile uses the following settings:

  • The system enables the detection of all text in an image, including text blocks in a vertical orientation.
  • The system does not perform full synthesis of the logical structure of a document.

HighCompressedImageOnlyPdf

Used for creating high-compression PDF files that contain full documents saved as pictures. This profile uses the following settings:

  • The system does not perform document recognition or synthesis of the logical structure of a document.
  • The system does not perform skew correction.
  • PDF export is optimized for minimum size of the output file.
  • The entire document is saved as a picture using the PEM_ImageOnly mode.

TextExtraction_Accuracy

Used for extracting text from a document. This profile optimizes settings for accuracy in the following way:

  • The system enables the detection of all text in an image, including small text areas that are of low quality. Pictures and tables are not detected.
  • The system does not detect fonts and styles.
  • The system does not perform full synthesis of the logical structure of a document.

TextExtraction_Speed

Used for extracting text from a document. This profile optimizes settings for processing speed in the following way:

  • The system enables the detection of all text in an image, including small text areas that are of low quality. Pictures and tables are not detected.
  • The system does not detect fonts and styles.
  • The system does not perform full synthesis of the logical structure of a document.
  • The document analysis and recognition process works faster.
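
The profile names in this topic follow a pattern: most tasks have an _Accuracy and a _Speed variant, two profiles (EngineeringDrawingsProcessing and HighCompressedImageOnlyPdf) have a single variant, and two names (BarcodeRecognition and Default) map to the settings of another profile. The following Python sketch captures that pattern. The functions resolve_alias and profile_for are hypothetical helpers for illustration only and are not part of DPS; treating Default as equivalent to BookArchiving_Accuracy is an assumption based on the description of the Default profile.

```python
# Names that this topic describes as mapping to another profile's settings.
ALIASES = {
    "BarcodeRecognition": "BarcodeRecognition_Accuracy",
    "Default": "BookArchiving_Accuracy",  # assumption: same settings
}

# Tasks that have both an _Accuracy and a _Speed variant.
PAIRED_TASKS = (
    "BarcodeRecognition",
    "BookArchiving",
    "DocumentArchiving",
    "DocumentConversion",
    "TextExtraction",
)

# Profiles that exist in a single variant only.
STANDALONE_PROFILES = (
    "EngineeringDrawingsProcessing",
    "HighCompressedImageOnlyPdf",
)


def resolve_alias(name):
    """Return the profile whose settings the given name uses."""
    return ALIASES.get(name, name)


def profile_for(task, goal="Accuracy"):
    """Hypothetical helper: build a profile name from a task and a goal."""
    if task in STANDALONE_PROFILES:
        return task  # no Accuracy or Speed variant exists
    if task not in PAIRED_TASKS:
        raise ValueError(f"Unknown task: {task!r}")
    if goal not in ("Accuracy", "Speed"):
        raise ValueError(f"Goal must be 'Accuracy' or 'Speed', got {goal!r}")
    return f"{task}_{goal}"


print(profile_for("DocumentArchiving", "Speed"))
print(resolve_alias("Default"))
```

A lookup of this kind can help keep profile choices consistent when an application selects profiles based on the document type and on whether accuracy or processing speed matters more.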