Module
- Module
- ~WasmAsposeOCRRecognitionSettings
- ~WasmAsposeOCRInput
- ~Rect
- ~ExportFormat : enum
- ~CharactersAllowedType : enum
- ~DetectAreasMode : enum
- ~AsposeOCRRawDataType : enum
- ~Language : enum
- ~AsposeOCRSetLicense(licenseFullPath)
- ~AsposeOCRGetState() ⇒ boolean
- ~AsposeOCRRecognize(descriptors, settings) ⇒ WasmAsposeOCRRecognitionResult
- ~AsposeOCRSerializeResult(recognition_result, format) ⇒ string
- ~Rect : object
- ~VectorRect : Array.<Rect>
- ~WasmAsposeOCRInputs : Array.<WasmAsposeOCRInput>
- ~WasmAsposeOCRRecognitionSettings : object
- ~WasmAsposeOCRInput : object
- ~WasmAsposeOCRRecognitionArea : object
- ~WasmAsposeOCRRecognizedPage : object
- ~WasmAsposeOCRRecognitionResult : object
Module~WasmAsposeOCRRecognitionSettings
Kind: inner class of Module
new WasmAsposeOCRRecognitionSettings()
Empty constructor of WasmAsposeOCRRecognitionSettings.
Module~WasmAsposeOCRInput
Kind: inner class of Module
new WasmAsposeOCRInput()
Empty constructor of WasmAsposeOCRInput.
Module~Rect
Kind: inner class of Module
new Rect()
Empty constructor of Rect.
Module~ExportFormat : enum
(ENUM) The format of the recognition result: text (0), json (1), or xml (2).
Kind: inner enum of Module
Properties
Name | Type | Default |
---|---|---|
text | int | 0 |
json | int | 1 |
xml | int | 2 |
Module~CharactersAllowedType : enum
(ENUM) Determines the type of characters allowed in the recognition result. Used in the recognition settings to indicate which characters will be recognized: ALL (0), LATIN_ALPHABET (1), or DIGITS (2).
Kind: inner enum of Module
Properties
Name | Type | Default |
---|---|---|
ALL | int | 0 |
LATIN_ALPHABET | int | 1 |
DIGITS | int | 2 |
Module~DetectAreasMode : enum
(ENUM) Determines the type of neural network used for area detection. Used in the recognition settings to specify which type of image you want to recognize.
Kind: inner enum of Module
Properties
Name | Type | Default | Description |
---|---|---|---|
NONE | int | 0 | Doesn’t detect paragraphs. Better for a simple one-column document without pictures. |
DOCUMENT | int | 1 | Detects paragraphs using the NN model for documents. Better for multicolumn documents and documents with pictures or other non-text objects. |
PHOTO | int | 2 | Detects paragraphs using the NN model for photos. Better for images with a lot of pictures and other non-text objects. |
COMBINE | int | 3 | Detects paragraphs with text and then uses another NN model to detect areas inside the paragraphs. Better for images with a complex structure. |
TABLE | int | 4 | Detects cells with text. Preferred mode for images with a table structure. |
CURVED_TEXT | int | 5 | Detects lines and recognizes text on curved images. Preferred mode for photos of book and magazine pages. |
Module~AsposeOCRRawDataType : enum
(ENUM) Raw data type for WasmAsposeOCRInput.
Kind: inner enum of Module
Properties
Name | Type | Default |
---|---|---|
UNKNOWN | int | 0 |
GRAYSCALE | int | 1 |
RGB | int | 2 |
Module~Language : enum
Languages used for OCR, named by ISO 639-2 code.
Kind: inner enum of Module
Properties
Name | Type | Default | Description |
---|---|---|---|
NONE | int | 0 | Multi-language support |
ENG | int | 1 | English alphabet |
DEU | int | 2 | German alphabet |
POR | int | 3 | Portuguese alphabet |
SPA | int | 4 | Spanish alphabet |
FRA | int | 5 | French alphabet |
ITA | int | 6 | Italian alphabet |
CZE | int | 7 | Czech alphabet |
DAN | int | 8 | Danish alphabet |
DUM | int | 9 | Dutch alphabet |
EST | int | 10 | Estonian alphabet |
FIN | int | 11 | Finnish alphabet |
LAV | int | 12 | Latvian alphabet |
LIT | int | 13 | Lithuanian alphabet |
NOR | int | 14 | Norwegian alphabet |
POL | int | 15 | Polish alphabet |
RUM | int | 16 | Romanian alphabet |
SRP_HRV | int | 17 | Serbo-Croatian alphabet |
SLK | int | 18 | Slovak alphabet |
SLV | int | 19 | Slovene alphabet |
SWE | int | 20 | Swedish alphabet |
CHI | int | 21 | Chinese alphabet |
BEL | int | 22 | Belarusian alphabet |
BUL | int | 23 | Bulgarian alphabet |
RUS | int | 24 | Russian alphabet |
SRP | int | 25 | Serbian alphabet |
UKR | int | 26 | Ukrainian alphabet |
HIN | int | 28 | Hindi alphabet |
Module~AsposeOCRSetLicense(licenseFullPath)
Sets the license for the library. The license is an XML file.
Kind: inner method of Module
Param | Type | Description |
---|---|---|
licenseFullPath | string | Path to the license file. |
Module~AsposeOCRGetState() ⇒ boolean
Checks the license state.
Kind: inner method of Module
Returns: boolean
- True if the license is installed and valid, otherwise false.
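The two license functions are typically used together: set the license, then confirm it was accepted. A minimal sketch, assuming `ocr` is the already-initialized Module object (how the module is loaded is not covered by this reference) and that the helper name `ensureLicensed` and the license path are hypothetical:

```javascript
// Sketch only: `ocr` stands for the initialized Module object.
// `ensureLicensed` is a hypothetical helper, not part of the library.
function ensureLicensed(ocr, licenseFullPath) {
  // Hand the path of the XML license file to the library.
  ocr.AsposeOCRSetLicense(licenseFullPath);
  // AsposeOCRGetState() is documented to return true only when the
  // license is installed and valid.
  return ocr.AsposeOCRGetState() === true;
}
```

A caller might run `ensureLicensed(Module, "/license.lic")` once at startup and fall back to unlicensed behaviour when it returns false.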
Module~AsposeOCRRecognize(descriptors, settings) ⇒ WasmAsposeOCRRecognitionResult
Performs optical character recognition on an image with automatic detection of text areas. Allowed formats:
- PNG, JPG, BMP, and TIFF images from the file system.
- A ZIP archive or a folder of images in the file system. Nested archives and folders are not supported. Only PNG, JPG, BMP, and TIFF images inside the ZIP archive or folder are used for recognition.
- PNG, JPG, and BMP images from a URI.
Kind: inner method of Module
Returns: WasmAsposeOCRRecognitionResult
- The recognition result as a WasmAsposeOCRRecognitionResult structure, described in its definition.
Param | Type | Description |
---|---|---|
descriptors | WasmAsposeOCRInput | Array of image descriptors. |
settings | WasmAsposeOCRRecognitionSettings | Recognition settings. |
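A recognition call needs one or more input descriptors and a settings object. The following is a minimal sketch rather than the library's reference usage. It assumes `ocr` is the initialized Module object, that the classes are exposed as constructors on it, and that a plain JavaScript array is accepted for `descriptors` (as the `Array.<WasmAsposeOCRInput>` typedef suggests; a given build may require a dedicated container instead). The helper name `recognizeFromPath` is hypothetical.

```javascript
// Sketch only: `ocr` stands for the initialized Module object.
// `recognizeFromPath` is a hypothetical helper, not part of the library.
function recognizeFromPath(ocr, path) {
  // Describe the input by its location; raw pixel data is not used here.
  const input = new ocr.WasmAsposeOCRInput();
  input.url = path;

  // Default settings from the empty constructor.
  const settings = new ocr.WasmAsposeOCRRecognitionSettings();

  // Assumption: a plain array is accepted as the descriptors argument.
  return ocr.AsposeOCRRecognize([input], settings);
}
```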
Module~AsposeOCRSerializeResult(recognition_result, format) ⇒ string
Prepares a recognition result in one of the available output formats. Allowed formats: text, json, xml.
Kind: inner method of Module
Returns: string
- A string filled with data in the specified format.
Param | Type | Description |
---|---|---|
recognition_result | WasmAsposeOCRRecognitionResult | The recognition result. |
format | ExportFormat | Output format descriptor. |
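Serialization takes a recognition result and an ExportFormat member. The sketch below selects the member by name and rejects names that are not in the enum. It assumes `ocr` is the initialized Module object and that the enum is reachable as `ocr.ExportFormat`; the helper name `serializeAs` is hypothetical.

```javascript
// Sketch only: `ocr` stands for the initialized Module object.
// `serializeAs` is a hypothetical helper, not part of the library.
function serializeAs(ocr, recognitionResult, formatName) {
  // Look the format up by its documented name: "text", "json" or "xml".
  const format = ocr.ExportFormat[formatName];
  if (format === undefined) {
    throw new RangeError("Unknown export format: " + formatName);
  }
  return ocr.AsposeOCRSerializeResult(recognitionResult, format);
}
```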
Module~Rect : object
Kind: inner typedef of Module
Properties
Name | Type | Description |
---|---|---|
x | number | X coordinate of the top-left corner. |
y | number | Y coordinate of the top-left corner. |
width | number | width in pixels. |
height | number | height in pixels. |
new Rect()
Empty constructor of Rect.
Module~VectorRect : Array.<Rect>
Kind: inner typedef of Module
Module~WasmAsposeOCRInputs : Array.<WasmAsposeOCRInput>
Kind: inner typedef of Module
Module~WasmAsposeOCRRecognitionSettings : object
Kind: inner typedef of Module
Properties
Name | Type | Description |
---|---|---|
all_image | boolean | Disabled (false) by default. When enabled, the image is recognized as a single area. |
correct_skew | boolean | Enabled (true) by default. Detects the orientation and automatically rotates the image if needed. |
upscale_small_font | boolean | Allows you to use additional algorithms specifically for small font recognition. Useful for images with small-size characters. |
lines_filtration | boolean | Disabled (false) by default. Allows recognizing text in tables (regions surrounded by lines). |
alphabet | string | Set of allowed characters in the alphabet (symbols for recognition). |
ignoredCharacters | string | Set of characters excluded from recognition (blacklist). |
rectangles | VectorRect | Areas chosen for recognition. |
preprocess_area | Rect | User area to be pre-processed. |
skew | number | Rotates the image by the specified angle. Doesn’t work if rectangles are specified. |
language_alphabet | Language | Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi) |
threshold_value | number | Sets custom threshold value for image binarization. Range from 1 to 255. |
allowed_characters | CharactersAllowedType | Allowed characters set. Determines the type of characters allowed for recognition result. |
auto_contrast | boolean | Allows using an additional contrast correction algorithm for the image before recognition. |
auto_denoising | boolean | Enables the use of an additional neural network for the image before recognition. Useful for images with noise, spots, flares, gradients, foreign elements. |
detect_areas_mode | DetectAreasMode | Allows selecting the optimal area detection mode for the document type: document, photo, plain text, column, image. |
new WasmAsposeOCRRecognitionSettings()
Empty constructor of WasmAsposeOCRRecognitionSettings.
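The properties above can be set on a freshly constructed settings object. The following sketch shows one possible configuration. It assumes `ocr` is the initialized Module object, that enums are exposed as `ocr.Language`, `ocr.DetectAreasMode`, and `ocr.CharactersAllowedType`, and that a plain array of Rect objects is accepted for `rectangles` (per the `Array.<Rect>` typedef). The helper name `buildSettings` and the chosen values are illustrative only.

```javascript
// Sketch only: `ocr` stands for the initialized Module object.
// `buildSettings` is a hypothetical helper, not part of the library.
function buildSettings(ocr, regions, threshold) {
  const settings = new ocr.WasmAsposeOCRRecognitionSettings();

  // Enum-typed properties take members of the documented enums.
  settings.language_alphabet = ocr.Language.ENG;
  settings.detect_areas_mode = ocr.DetectAreasMode.DOCUMENT;
  settings.allowed_characters = ocr.CharactersAllowedType.ALL;

  // threshold_value is documented with a range from 1 to 255.
  if (threshold !== undefined) {
    if (!(threshold >= 1 && threshold <= 255)) {
      throw new RangeError("threshold_value must be in the range 1 to 255");
    }
    settings.threshold_value = threshold;
  }

  // Copy each plain {x, y, width, height} region into a Rect.
  // Assumption: a plain array of Rect is accepted for `rectangles`.
  settings.rectangles = regions.map(function (region) {
    const rect = new ocr.Rect();
    rect.x = region.x;
    rect.y = region.y;
    rect.width = region.width;
    rect.height = region.height;
    return rect;
  });

  return settings;
}
```

Because `skew` is documented as not working when rectangles are specified, the sketch leaves it unset.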
Module~WasmAsposeOCRInput : object
Descriptor that can describe input raw data or a path to a file. If the “file_path” field is not empty, it takes precedence over the raw data part.
Kind: inner typedef of Module
Properties
Name | Type | Description |
---|---|---|
raw_data | VectorRect | Input raw data in a special format. With row_index in 0 … height and column_index in 0 … width, the data may represent: 1) RGB color model (channels_size = 3): raw_data[row_index * width * 3 + column_index * 3 + 0] = RED, raw_data[row_index * width * 3 + column_index * 3 + 1] = GREEN, raw_data[row_index * width * 3 + column_index * 3 + 2] = BLUE; 2) Grayscale (channels_size = 1): raw_data[row_index * width + column_index] = grayscale value. |
height | number | image height. |
width | number | image width. |
raw_data_type | AsposeOCRRawDataType | Format of the input raw_data. |
url | string | Null-terminated string that describes the file URL (file system path). |
new WasmAsposeOCRInput()
Empty constructor of WasmAsposeOCRInput.
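The indexing formulas in the raw_data description can be illustrated with plain JavaScript. The sketch below builds a packed RGB buffer from RGBA pixels (4 bytes per pixel, the layout a canvas `ImageData` uses) and derives a grayscale buffer from it. Both function names are hypothetical, the luma weights (0.299, 0.587, 0.114) are a common convention and not something this reference specifies, and how the resulting buffer is handed to `raw_data` is not covered here.

```javascript
// Pack RGBA pixels into the RGB layout described for raw_data
// (channels_size = 3). Hypothetical helper, not part of the library.
function rgbaToRgb(rgba, width, height) {
  if (rgba.length !== width * height * 4) {
    throw new RangeError("rgba length does not match width * height * 4");
  }
  const raw = new Uint8Array(width * height * 3);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const src = (row * width + col) * 4;       // RGBA source offset
      const dst = row * width * 3 + col * 3;     // documented RGB offset
      raw[dst + 0] = rgba[src + 0];              // RED
      raw[dst + 1] = rgba[src + 1];              // GREEN
      raw[dst + 2] = rgba[src + 2];              // BLUE (alpha is dropped)
    }
  }
  return raw;
}

// Convert packed RGB into the grayscale layout (channels_size = 1).
// The weights are an assumption (a common luma approximation).
function rgbToGrayscale(rgb, width, height) {
  if (rgb.length !== width * height * 3) {
    throw new RangeError("rgb length does not match width * height * 3");
  }
  const gray = new Uint8Array(width * height);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const src = row * width * 3 + col * 3;
      gray[row * width + col] = Math.round(
        0.299 * rgb[src] + 0.587 * rgb[src + 1] + 0.114 * rgb[src + 2]
      );
    }
  }
  return gray;
}
```

With these layouts, `width` and `height` on the descriptor must match the dimensions used to build the buffer, and `raw_data_type` should name the matching AsposeOCRRawDataType member (RGB or GRAYSCALE).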
Module~WasmAsposeOCRRecognitionArea : object
Kind: inner typedef of Module
Properties
Name | Type | Description |
---|---|---|
area | Rect | Area rectangle that contains the text given in the “recognized_text” field. |
recognized_text | string | Text recognized in the area. |
Module~WasmAsposeOCRRecognizedPage : object
Kind: inner typedef of Module
Properties
Name | Type | Description |
---|---|---|
recognized_areas | Array.<WasmAsposeOCRRecognitionArea> | Array of recognized areas. |
Module~WasmAsposeOCRRecognitionResult : object
Kind: inner typedef of Module
Properties
Name | Type |
---|---|
recognized_pages | Array.<WasmAsposeOCRRecognizedPage> |
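The result is a three-level structure: pages contain areas, and each area carries a rectangle and its text. The sketch below walks that structure and returns one string per page. It assumes the nested collections behave like plain JavaScript arrays, as the `Array.<…>` typedefs suggest; the helper name `collectPageTexts` is hypothetical.

```javascript
// Join the text of every recognized area, page by page.
// Hypothetical helper; assumes array-like collections as in the typedefs.
function collectPageTexts(result) {
  return Array.from(result.recognized_pages, function (page) {
    return Array.from(page.recognized_areas, function (area) {
      return area.recognized_text;
    }).join("\n");
  });
}
```

When only the flat text is needed, `AsposeOCRSerializeResult` with the text format is the documented route; walking the structure is useful when the area rectangles matter as well.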