General text recognition | Octoparse AI Help Center

Definition and Usage

The General text recognition command uses an OCR model to recognize all text in an image. It returns both the recognized content and the coordinates of the text, which is useful for extracting text from images, scanned documents, advertisements, or handwritten notes. You can choose the OCR model, the recognition type, and the image source (local file, hyperlinked image, UI element, or web element).

Parameter Values

Input Parameters

Parameters	Description	Possible Values	Required	Options / Notes
OCR model	Select the OCR model to recognize text.	Mistral OCR	Yes	-
Recognition type	Select the recognition mode.	General text; General text (High accuracy); Handwriting; Advertisement text	Yes	High accuracy mode may take longer.
Image source	Choose the source of the image.	Local file; Hyperlinked image; UI element; Web element	Yes	Determines which input field (path, URL, or element) to use.
Image path	Enter the path or choose an image/PDF file (up to 5MB).	Any valid local file path	Conditional	Used only when Image source = Local file; ensure file size ≤ 5MB.
Image URL	Enter the image URL (files up to 5MB).	Any valid image URL	Conditional	Used only when Image source = Hyperlinked image.
Element (UI)	Select or capture a UI element to operate on.	Any valid UI element	Conditional	Used only when Image source = UI element.
Web page	Select a variable that contains the web page to work with.	Web page variable	Conditional	Used only when Image source = Web element.
Element (Web)	Select or capture a web element to operate on.	Any valid web element	Conditional	Used only when Image source = Web element.
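
The conditional rows in the table above can be read as a lookup from Image source to the fields that must be filled in. The following Python sketch is illustrative only and is not part of the product: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the field labels are copied from the table.

```python
# Hypothetical lookup: for each Image source option, the input fields that
# the parameter table marks as "Used only when Image source = ...".
REQUIRED_FIELDS = {
    "Local file": ["Image path"],
    "Hyperlinked image": ["Image URL"],
    "UI element": ["Element (UI)"],
    "Web element": ["Web page", "Element (Web)"],
}


def missing_fields(image_source, provided):
    """Return the conditional fields that are still empty for this source."""
    if image_source not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown image source: {image_source!r}")
    # A field counts as missing when it is absent or has an empty value.
    return [name for name in REQUIRED_FIELDS[image_source] if not provided.get(name)]
```

For example, choosing Web element and supplying only the Web page variable would report that Element (Web) is still missing.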

Advanced Settings

Parameters	Description	Possible Values	Required	Options / Notes
Element timeout (s)	Wait time for the element to appear (seconds).	Numeric value	No	Applies to UI or web element sources.
Enable PDF recognition	Enables text recognition for PDF files in addition to images.	True / False (checkbox)	No	Must be enabled when processing PDF files.
PDF page number	Enter the page number to recognize. Only one page can be processed at a time.	Numeric value	Conditional	Required when Enable PDF recognition is selected and the file is a PDF.

Error Handling

Parameter Name	Description
Throw error & stop	When an error occurs, the action will stop execution.
Retry command	The action retries the command if an error occurs.
Ignore error & continue	The action ignores the error and continues workflow execution.
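
To make the difference between the three options concrete, here is a minimal Python sketch of how such policies generally behave. It is an analogy, not the product's implementation: `run_with_policy`, the retry count, the delay, and the choice to raise the last error once retries are exhausted are all assumptions.

```python
import time


def run_with_policy(action, policy, retries=3, delay_s=0.0):
    """Run `action` under one of three error-handling policies.

    policy: "throw"  (Throw error & stop),
            "retry"  (Retry command),
            "ignore" (Ignore error & continue).
    `retries` and `delay_s` are hypothetical settings for this sketch.
    """
    if policy == "throw":
        return action()  # any exception propagates and stops the caller
    if policy == "ignore":
        try:
            return action()
        except Exception:
            return None  # swallow the error so the workflow can continue
    if policy == "retry":
        attempts = 1 + retries  # the first attempt plus the retries
        for attempt in range(attempts):
            try:
                return action()
            except Exception:
                if attempt == attempts - 1:
                    raise  # assumption: give up after the final attempt
                time.sleep(delay_s)
    raise ValueError(f"Unknown policy: {policy!r}")
```

With Ignore error & continue, later steps should be prepared for a missing result, which the sketch represents as `None`.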

Variables Produced

Variable Name	Description
Store result into	Stores the result in an object variable containing recognized text and the full response.

Using Variables in Conditions

Some input fields in this command support variables, so you can pass values generated in earlier steps instead of entering them manually. For example, you can use a variable as the Image path, Image URL, or Web page source to make the OCR process dynamic and reusable.

Notes

The maximum supported file size is 5MB.
Only one PDF page can be processed at a time.
Make sure Enable PDF recognition is selected when using PDF input.
The accuracy of recognition depends on the OCR model and recognition type chosen.
For web or UI element sources, the element must be visible before recognition starts.
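
The file-related notes above can be checked before the command runs. The Python sketch below is one way to do that outside the product. `check_local_input` is a hypothetical helper, and it rests on two assumptions that this page does not state: that "5MB" means 5 × 1024 × 1024 bytes, and that PDF page numbers start at 1.

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" is interpreted as 5 MiB


def check_local_input(path, size_bytes, enable_pdf=False, pdf_page=None):
    """Return a list of problems for a local-file input; empty means none found."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 5MB")
    if Path(path).suffix.lower() == ".pdf":
        if not enable_pdf:
            problems.append("Enable PDF recognition must be selected for PDF input")
        elif not isinstance(pdf_page, int) or pdf_page < 1:
            # assumption: pages are numbered from 1; only one page per run
            problems.append("PDF page number must be a single positive integer")
    return problems
```

For example, `check_local_input("scan.pdf", 1024)` reports that Enable PDF recognition must be selected, while the same call with `enable_pdf=True, pdf_page=2` returns an empty list.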
