Definition and Usage
The General text recognition command recognizes all text in an image using an OCR model. It returns both the recognized content and the coordinates of the text. This is useful for extracting text from images, scanned documents, advertisements, or handwritten notes. You can choose the OCR model, the recognition type, and the source of the image (local file, URL, web element, or UI element).
Parameter Values
Input Parameters
Parameters | Description | Possible Values | Required | Options / Notes |
OCR model | Select the OCR model to recognize text. | Mistral OCR | Yes | - |
Recognition type | Select the recognition mode. | General text; General text (High accuracy); Handwriting; Advertisement text | Yes | The high-accuracy mode may take longer to process. |
Image source | Choose the source of the image. | Local file; Hyperlinked image; UI element; Web element | Yes | Determines which input field (path, URL, or element) to use. |
Image path | Enter the path or choose an image/PDF file (up to 5MB). | Any valid local file path | Conditional | Used only when Image source = Local file; ensure file size ≤ 5MB. |
Image URL | Enter the image URL (files up to 5MB). | Any valid image URL | Conditional | Used only when Image source = Hyperlinked image. |
Element (UI) | Select or capture a UI element to operate on. | Any valid UI element | Conditional | Used only when Image source = UI element. |
Web page | Select a variable that contains the web page to work with. | Web page variable | Conditional | Used only when Image source = Web element. |
Element (Web) | Select or capture a web element to operate on. | Any valid web element | Conditional | Used only when Image source = Web element. |
Advanced Settings
Parameters | Description | Possible Values | Required | Options / Notes |
Element timeout (s) | Wait time for the element to appear (seconds). | Numeric value | No | Applies to UI or web element sources. |
Enable PDF recognition | Enable recognition of text in PDF files as well as images. | True / False (checkbox) | No | Must be enabled when processing PDF files. |
PDF page number | Enter the page number to recognize. Only one page can be processed at a time. | Numeric value | Conditional | Required when Enable PDF recognition is enabled and file is PDF. |
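The conditional rules in the two tables above can be summarized in a short sketch. The Python below is illustrative only and is not part of the product: the function name `required_fields`, the `is_pdf` flag, and the lookup table are hypothetical, and the strings simply mirror the parameter names listed above. It also assumes that the PDF settings apply whenever the input is a PDF, regardless of the image source.

```python
# Hypothetical lookup: maps each documented "Image source" option to the
# input fields that the tables mark as conditional for that source.
REQUIRED_BY_SOURCE = {
    "Local file": ["Image path"],
    "Hyperlinked image": ["Image URL"],
    "UI element": ["Element (UI)"],
    "Web element": ["Web page", "Element (Web)"],
}


def required_fields(image_source, is_pdf=False):
    """Return the input fields that must be filled for the given source."""
    if image_source not in REQUIRED_BY_SOURCE:
        raise ValueError(f"Unknown image source: {image_source!r}")
    fields = list(REQUIRED_BY_SOURCE[image_source])
    # Assumption: PDF input additionally needs both advanced PDF settings.
    if is_pdf:
        fields += ["Enable PDF recognition", "PDF page number"]
    return fields
```

For example, `required_fields("Web element")` lists both the Web page variable and the web element, since the table marks both as used for that source.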
Error Handling
Parameter Name | Description |
Throw error & stop | When an error occurs, the action will stop execution. |
Retry command | The action retries the command if an error occurs. |
Ignore error & continue | The action ignores the error and continues workflow execution. |
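As a rough mental model of the three options above, the following Python sketch shows how they differ in effect. It is not the product's implementation: the function `run_with_policy` and the `max_retries` parameter are hypothetical, the retry count is an assumption (the source does not state one), and re-raising the error once retries are exhausted is also an assumption.

```python
def run_with_policy(action, policy, max_retries=3):
    """Run `action` under one of the three documented error-handling options."""
    # Only "Retry command" gets extra attempts (max_retries is an assumed value).
    attempts = max_retries + 1 if policy == "Retry command" else 1
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except Exception as exc:
            last_error = exc
    if policy == "Ignore error & continue":
        return None  # swallow the error so the workflow can move on
    # "Throw error & stop", or retries exhausted (assumed behavior).
    raise last_error
```

With "Ignore error & continue", later steps should be prepared for a missing result, because the command did not produce one.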
Variables Produced
Variable Name | Description |
Store result into | Stores the result in an object variable containing recognized text and the full response. |
Using Variables in Conditions
Some input fields in this command support variables, so you can pass values generated in earlier steps instead of entering them manually. For example, you can use a variable as the Image path, Image URL, or Web page source to make the OCR process dynamic and reusable.
Notes
The maximum supported file size is 5MB.
Only one PDF page can be processed at a time.
Make sure Enable PDF recognition is selected when using PDF input.
The accuracy of recognition depends on the OCR model and recognition type chosen.
For web or UI element sources, the element must be visible before recognition starts.
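When the Image path comes from a variable, the file-related notes above can be checked before the command runs. The sketch below is one possible pre-check in Python, not a feature of the command: the function `check_input_file` and its `pdf_recognition_enabled` argument are hypothetical, and it assumes "5MB" means 5 × 1024 × 1024 bytes, which the source does not specify.

```python
import os

# Assumption: "5MB" is interpreted as 5 MiB; the documentation does not say.
MAX_BYTES = 5 * 1024 * 1024


def check_input_file(path, pdf_recognition_enabled=False):
    """Return a list of problems found before the file is sent for recognition."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("File exceeds the 5MB limit.")
    if path.lower().endswith(".pdf") and not pdf_recognition_enabled:
        problems.append("Enable PDF recognition must be selected for PDF input.")
    return problems
```

An empty list means the file passes both checks; otherwise each entry names the note that was violated.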