
Extract content from image-based files

Updated on January 27, 2022

With the new DocumentOcr component, you can convert an image-based document that contains text into text that you can manipulate in an automation. Use this component with image-only documents, such as a faxed document, or with documents that contain both text and images. The component works with image files (such as .png, .jpeg, and .tiff), PDF files, and Microsoft Word documents.

  • Use in automations to extract data from the supported file formats.

  • Use in automations for searching and copying data from unstructured documents.
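
To illustrate what "searching and copying data" can look like once a document has been converted to text, here is a minimal Python sketch. It does not use the DocumentOcr component or any Pega API; the sample text, the field labels, and the `find_field` helper are all hypothetical and stand in for whatever text an OCR step returns.

```python
import re

# Hypothetical text, standing in for what an OCR step might return
# from a scanned invoice. This is not output from DocumentOcr.
ocr_text = """ACME Supplies
Invoice No: INV-20431
Total due: $1,284.50
"""


def find_field(text, label):
    """Return the value that follows 'label:' on the same line, or None."""
    pattern = rf"^{re.escape(label)}\s*:\s*(.+)$"
    match = re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
    return match.group(1).strip() if match else None


# Copy individual values out of the unstructured text.
invoice_number = find_field(ocr_text, "Invoice No")
total_due = find_field(ocr_text, "Total due")
```

Real OCR output tends to contain recognition errors and inconsistent spacing, so a pattern like this usually needs to tolerate more variation than the sketch does.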

For more information, see DocumentOcr Component.
