
Creating a Searchable OOXML File via OCR

When sending an OOXML (PPTX, DOCX, or XLSX) file, you can create a searchable file by using OCR (optical character recognition) technology.

To enable searching of an OOXML file, select [PPTX], [DOCX], or [XLSX] as the file type, and select [Character Recognition]. Then, configure the following settings.

Setting

Description

[ON]/[OFF]

Select [ON] to enable searching of an OOXML file.

[Language Setting]

Select a language for OCR processing.

Select the language used in the original to correctly recognize text data.

[Adjust Rotation]

Set this option to ON to automatically adjust the rotation of each page based on the text direction detected by OCR processing.

If rotation adjustment is disabled and the specified original orientation does not match the text direction, text data is not recognized correctly.

[Output Method]

This option is available when [DOCX] or [XLSX] is selected as the file type.

Select how to create an OOXML file using text detected by OCR processing.

When [DOCX] is selected as the file type:

The system analyzes the scanned original and, depending on the output method, creates "image data" including illustrations in the original, "text data" detected by OCR processing, and "text image data" in which text in the original is treated as images.

  • [Text Priority]: Creates a searchable DOCX file by combining "text data" and "image data". This function displays "text data" detected by OCR processing without any adjustment; therefore, its visual quality may not be the same as that of the scanned original depending on the result of OCR processing.

  • [Image Priority]: Creates a DOCX file by combining only "image data" and "text image data".

  • [Image and Text]: Creates a searchable DOCX file by combining "image data", "text data", and "text image data". "Text data" is saved separately from "text image data"; therefore, a text search is possible while maintaining the visual quality of the original.

When [XLSX] is selected as the file type:

The system creates a "scanned image" of the original and "text data" detected by OCR processing from the scanned original.

  • [Image and Text]: Creates a searchable XLSX file by combining "scanned image" and "text data". A text search can be performed while maintaining the visual quality of the original.

  • [Text Only]: Creates a searchable XLSX file using only "text data". This function displays "text data" detected by OCR processing without any adjustment; therefore, its visual quality may not be the same as that of the scanned original depending on the result of OCR processing.
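The output methods described above can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `OUTPUT_METHODS`, `components_for`, and `is_searchable` are hypothetical and do not correspond to any interface of the device. It also assumes that a file is searchable only when it contains "text data", which is an inference from the descriptions ([Image Priority] is the one method not described as searchable).

```python
# Hypothetical lookup table mirroring the [Output Method] descriptions.
# The names below are illustrative only; they are not a device API.
OUTPUT_METHODS = {
    ("DOCX", "Text Priority"): ("text data", "image data"),
    ("DOCX", "Image Priority"): ("image data", "text image data"),
    ("DOCX", "Image and Text"): ("image data", "text data", "text image data"),
    ("XLSX", "Image and Text"): ("scanned image", "text data"),
    ("XLSX", "Text Only"): ("text data",),
}


def components_for(file_type, output_method):
    """Return the kinds of data combined into the output file."""
    try:
        return OUTPUT_METHODS[(file_type, output_method)]
    except KeyError:
        # Covers PPTX (no output method) and invalid combinations.
        raise ValueError(
            f"[{output_method}] is not an output method for [{file_type}]"
        ) from None


def is_searchable(file_type, output_method):
    """Assumption: a file is text-searchable only if it contains "text data"."""
    return "text data" in components_for(file_type, output_method)


if __name__ == "__main__":
    for file_type, method in OUTPUT_METHODS:
        print(file_type, method, is_searchable(file_type, method))
```

Requesting an output method for [PPTX], or a method that does not exist for the chosen file type, raises `ValueError` in this sketch, reflecting that [Output Method] is offered only for [DOCX] and [XLSX].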

  • To use this function, an option is required. For details on the required option, refer to Here.

  • This function is available only in classic style.

  • [Adjust Rotation] is not available when encryption using a digital certificate (digital ID) is also enabled.

  • If one of the following languages is selected in [Language Setting], the text direction is recognized automatically.
    [Japanese], [Simplified Chinese], [Korean], [Traditional Chinese]

  • If one of the following languages is selected in [Language Setting] and vertical and horizontal text are mixed on the same page of an original, all text on that page is recognized as running in a single direction.
    [Simplified Chinese], [Korean], [Traditional Chinese]
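The restrictions in the notes above can be expressed as a simple settings check. This is a minimal Python sketch under stated assumptions: the `OcrSettings` class, its field names, and the `check` function are hypothetical and are not part of the device or any related software. It only encodes the rules stated on this page.

```python
from dataclasses import dataclass
from typing import List, Optional

FILE_TYPES = {"PPTX", "DOCX", "XLSX"}
# Languages for which the text direction is recognized automatically.
AUTO_DIRECTION = {"Japanese", "Simplified Chinese", "Korean", "Traditional Chinese"}
# Languages for which mixed directions on one page collapse to one direction.
SINGLE_DIRECTION_PER_PAGE = {"Simplified Chinese", "Korean", "Traditional Chinese"}


@dataclass
class OcrSettings:
    """Hypothetical container for the settings described on this page."""
    file_type: str
    language: str
    adjust_rotation: bool = False
    output_method: Optional[str] = None
    digital_id_encryption: bool = False


def check(settings: OcrSettings) -> List[str]:
    """Return error and note messages for a combination of settings."""
    messages = []
    if settings.file_type not in FILE_TYPES:
        messages.append("error: file type must be PPTX, DOCX, or XLSX")
    if settings.output_method is not None and settings.file_type not in {"DOCX", "XLSX"}:
        messages.append("error: [Output Method] requires [DOCX] or [XLSX]")
    if settings.adjust_rotation and settings.digital_id_encryption:
        messages.append("error: [Adjust Rotation] is not available with digital ID encryption")
    if settings.language in AUTO_DIRECTION:
        messages.append("note: text direction is recognized automatically")
    if settings.language in SINGLE_DIRECTION_PER_PAGE:
        messages.append("note: mixed vertical and horizontal text on one page is recognized as one direction")
    return messages


if __name__ == "__main__":
    for message in check(OcrSettings("DOCX", "Korean", adjust_rotation=True,
                                     digital_id_encryption=True)):
        print(message)
```

Returning a list of messages, rather than raising on the first problem, lets every applicable restriction and note be reported at once.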