Module

Module~WasmAsposeOCRRecognitionSettings

Kind: inner class of Module

new WasmAsposeOCRRecognitionSettings()

Empty constructor of WasmAsposeOCRRecognitionSettings.

Module~WasmAsposeOCRInput

Kind: inner class of Module

new WasmAsposeOCRInput()

Empty constructor of WasmAsposeOCRInput.

Module~Rect

Kind: inner class of Module

new Rect()

Empty constructor of Rect.

Module~ExportFormat : enum

(ENUM) The output format for a recognition result. text: 0, json: 1, xml: 2

Kind: inner enum of Module
Properties

| Name | Type | Default |
| --- | --- | --- |
| text | int | 0 |
| json | int | 1 |
| xml | int | 2 |

Module~CharactersAllowedType : enum

(ENUM) Determines the type of characters allowed in the recognition result. Used in the RecognitionSettings to indicate which characters will be recognized. ALL: 0, LATIN_ALPHABET: 1, DIGITS: 2

Kind: inner enum of Module
Properties

| Name | Type | Default |
| --- | --- | --- |
| ALL | int | 0 |
| LATIN_ALPHABET | int | 1 |
| DIGITS | int | 2 |

Module~DetectAreasMode : enum

(ENUM) Determines the type of neural network used for area detection. Used in the RecognitionSettings to specify which type of image you want to recognize.

Kind: inner enum of Module
Properties

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| NONE | int | 0 | Doesn’t detect paragraphs. Better for a simple one-column document without pictures. |
| DOCUMENT | int | 1 | Detects paragraphs using a neural network model for documents. Better for multi-column documents and documents with pictures or other non-text objects. |
| PHOTO | int | 2 | Detects paragraphs using a neural network model for photos. Better for images with many pictures and other non-text objects. |
| COMBINE | int | 3 | Detects paragraphs with text, then uses another neural network model to detect areas inside the paragraphs. Better for images with a complex structure. |
| TABLE | int | 4 | Detects cells with text. Preferred mode for images with a table structure. |
| CURVED_TEXT | int | 5 | Detects lines and recognizes text on curved images. Preferred mode for photos of book and magazine pages. |
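
As a rough illustration of the guidance in the table above, the sketch below maps a coarse image category to a mode value. The numeric values are copied from the table; the `pickDetectAreasMode` helper and its category labels are hypothetical and not part of the library.

```javascript
// Numeric values copied from the DetectAreasMode table.
const DetectAreasMode = Object.freeze({
  NONE: 0,
  DOCUMENT: 1,
  PHOTO: 2,
  COMBINE: 3,
  TABLE: 4,
  CURVED_TEXT: 5,
});

// Hypothetical helper: the category labels are illustrative only.
function pickDetectAreasMode(kind) {
  const byKind = {
    "single-column": DetectAreasMode.NONE,
    "multi-column": DetectAreasMode.DOCUMENT,
    "photo": DetectAreasMode.PHOTO,
    "complex": DetectAreasMode.COMBINE,
    "table": DetectAreasMode.TABLE,
    "book-page": DetectAreasMode.CURVED_TEXT,
  };
  if (!Object.prototype.hasOwnProperty.call(byKind, kind)) {
    throw new RangeError(`Unknown image kind: ${kind}`);
  }
  return byKind[kind];
}
```

In real code the value would more likely be read from the module's own enum (for example `Module.DetectAreasMode.TABLE`, assuming the enum is exposed under that name) rather than from a local copy.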

Module~AsposeOCRRawDataType : enum

(ENUM) Data type for AsposeOCRInput

Kind: inner enum of Module
Properties

| Name | Type | Default |
| --- | --- | --- |
| UNKNOWN | int | 0 |
| GRAYSCALE | int | 1 |
| RGB | int | 2 |

Module~Language : enum

Languages used for OCR, named by ISO 639-2 code.

Kind: inner enum of Module
Properties

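| Name | Type | Default | Description |
| --- | --- | --- | --- |
| NONE | int | 0 | Multi-language support |
| ENG | int | 1 | English alphabet |
| DEU | int | 2 | German alphabet |
| POR | int | 3 | Portuguese alphabet |
| SPA | int | 4 | Spanish alphabet |
| FRA | int | 5 | French alphabet |
| ITA | int | 6 | Italian alphabet |
| CZE | int | 7 | Czech alphabet |
| DAN | int | 8 | Danish alphabet |
| DUM | int | 9 | Dutch alphabet |
| EST | int | 10 | Estonian alphabet |
| FIN | int | 11 | Finnish alphabet |
| LAV | int | 12 | Latvian alphabet |
| LIT | int | 13 | Lithuanian alphabet |
| NOR | int | 14 | Norwegian alphabet |
| POL | int | 15 | Polish alphabet |
| RUM | int | 16 | Romanian alphabet |
| SRP_HRV | int | 17 | Serbo-Croatian alphabet |
| SLK | int | 18 | Slovak alphabet |
| SLV | int | 19 | Slovene alphabet |
| SWE | int | 20 | Swedish alphabet |
| CHI | int | 21 | Chinese alphabet |
| BEL | int | 22 | Belorussian alphabet |
| BUL | int | 23 | Bulgarian alphabet |
| RUS | int | 24 | Russian alphabet |
| SRP | int | 25 | Serbian alphabet |
| UKR | int | 26 | Ukrainian alphabet |
| HIN | int | 28 | Hindi alphabet |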

Module~AsposeOCRSetLicense(licenseFullPath)

Sets the license for the library. The license is an XML file.

Kind: inner method of Module

| Param | Type | Description |
| --- | --- | --- |
| licenseFullPath | string | Path to the license file. |

Module~AsposeOCRGetState() ⇒ boolean

Checks the license state.

Kind: inner method of Module
Returns: boolean - True if the license is installed and valid, otherwise false.
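
The two license functions are naturally used together: set the license, then check the state. A minimal sketch, assuming `Module` is the already initialized library module and that the license path is resolvable in the environment where the module runs; the `ensureLicensed` helper name is hypothetical.

```javascript
// Sketch: apply a license, then confirm that it took effect.
// `Module` is assumed to be the initialized library module;
// `licensePath` is a caller-supplied path to the XML license file.
function ensureLicensed(Module, licensePath) {
  Module.AsposeOCRSetLicense(licensePath);
  // AsposeOCRGetState() is documented to return true only when the
  // license is installed and valid.
  return Module.AsposeOCRGetState() === true;
}
```

A caller can branch on the returned boolean to decide whether to proceed or to report a licensing problem.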

Module~AsposeOCRRecognize(descriptors, settings) ⇒ WasmAsposeOCRRecognitionResult

Performs optical character recognition on images with automatic detection of text areas. Allowed inputs:

  1. PNG, JPG, BMP, or TIFF images from the file system.
  2. A ZIP archive or a folder of images in the file system. Nested archives and folders are not supported; only PNG, JPG, BMP, and TIFF images inside the archive or folder are recognized.
  3. PNG, JPG, or BMP images from a URI.

Kind: inner method of Module
Returns: WasmAsposeOCRRecognitionResult - the recognition result as a complex structure, described in the WasmAsposeOCRRecognitionResult definition.

| Param | Type | Description |
| --- | --- | --- |
| descriptors | WasmAsposeOCRInput | Array of image descriptors. |
| settings | WasmAsposeOCRRecognitionSettings | Recognition settings. |
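
A call might look like the sketch below. It assumes `Module` is the initialized library module and that the descriptors can be passed as a plain array, as the `Array.<WasmAsposeOCRInput>` typedef suggests; if a build exposes `WasmAsposeOCRInputs` as a container class instead, that container would need to be filled and passed. The `recognizeFile` helper name is hypothetical.

```javascript
// Sketch: recognize one image referenced by a path or URL.
// Assumption: the descriptors argument accepts a plain array.
function recognizeFile(Module, fileUrl) {
  const input = new Module.WasmAsposeOCRInput();
  input.url = fileUrl; // `url` is listed among the WasmAsposeOCRInput properties

  // Default-constructed settings; individual fields can be set before the call.
  const settings = new Module.WasmAsposeOCRRecognitionSettings();

  return Module.AsposeOCRRecognize([input], settings);
}
```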

Module~AsposeOCRSerializeResult(recognition_result, format) ⇒ string

Prepares a recognition result in one of the available output formats. Allowed formats: text, json, xml.

Kind: inner method of Module
Returns: string - A string filled with data of the specified format

| Param | Type | Description |
| --- | --- | --- |
| recognition_result | WasmAsposeOCRRecognitionResult | Recognition result. |
| format | ExportFormat | Output format descriptor. |
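
To show how the format argument is used, the sketch below serializes one result in all three formats. It assumes the enum members are reachable as `Module.ExportFormat.text`, `Module.ExportFormat.json`, and `Module.ExportFormat.xml`; the `serializeAll` helper name is hypothetical.

```javascript
// Sketch: serialize a recognition result in each documented format.
// Assumption: ExportFormat members are exposed on the module object.
function serializeAll(Module, recognitionResult) {
  const out = {};
  for (const name of ["text", "json", "xml"]) {
    out[name] = Module.AsposeOCRSerializeResult(
      recognitionResult,
      Module.ExportFormat[name]
    );
  }
  return out; // { text: "...", json: "...", xml: "..." }
}
```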

Module~Rect : object

Kind: inner typedef of Module
Properties

| Name | Type | Description |
| --- | --- | --- |
| x | number | X coordinate of the top-left corner. |
| y | number | Y coordinate of the top-left corner. |
| width | number | Width in pixels. |
| height | number | Height in pixels. |

Module~VectorRect : Array.<Rect>

Kind: inner typedef of Module

Module~WasmAsposeOCRInputs : Array.<WasmAsposeOCRInput>

Kind: inner typedef of Module

Module~WasmAsposeOCRRecognitionSettings : object

Kind: inner typedef of Module
Properties

| Name | Type | Description |
| --- | --- | --- |
| all_image | boolean | Disabled (false) by default. When enabled, the whole image is recognized as a single area. |
| correct_skew | boolean | Enabled (true) by default. Detects the orientation and automatically rotates the image if needed. |
| upscale_small_font | boolean | Enables additional algorithms for small-font recognition. Useful for images with small characters. |
| lines_filtration | boolean | Disabled (false) by default. Enables recognition of text in tables (regions surrounded by lines). |
| alphabet | string | Set of allowed characters in the alphabet (symbols for recognition). |
| ignoredCharacters | string | Blacklist of symbols excluded from recognition. |
| rectangles | VectorRect | Areas to recognize. |
| preprocess_area | Rect | User-defined area to be pre-processed. |
| skew | number | Rotates the image by the specified angle. Has no effect if rectangles are specified. |
| language_alphabet | Language | Language used for OCR. Supported languages: English (en), German (de), Portuguese (pt), Spanish (es), French (fr), Italian (it), Czech (cze), Danish (dan), Dutch (dum), Estonian (est), Finnish (fin), Latvian (lav), Lithuanian (lit), Norwegian (nor), Polish (pol), Romanian (rum), Serbo-Croatian (srp_hrv), Slovak (slk), Slovene (slv), Swedish (swe), Chinese (chi). |
| threshold_value | number | Custom threshold value for image binarization. Range: 1 to 255. |
| allowed_characters | CharactersAllowedType | Allowed character set. Determines the type of characters allowed in the recognition result. |
| auto_contrast | boolean | Applies an additional contrast correction algorithm to the image before recognition. |
| auto_denoising | boolean | Applies an additional neural network to the image before recognition. Useful for images with noise, spots, flares, gradients, or foreign elements. |
| detect_areas_mode | DetectAreasMode | Selects the optimal area detection mode for the document type: document, photo, plain text, column, image. |
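
A settings object is typically created empty and then filled field by field. The sketch below sets a few of the properties listed above and validates the documented 1 to 255 range for `threshold_value`. It assumes `Module` is the initialized library module and that enum members are reachable as `Module.Language.*` and `Module.DetectAreasMode.*`; the `buildSettings` helper and its `threshold` option are hypothetical.

```javascript
// Sketch: build recognition settings with a validated threshold.
// Assumption: enums are exposed on the module object.
function buildSettings(Module, { threshold } = {}) {
  const settings = new Module.WasmAsposeOCRRecognitionSettings();
  settings.language_alphabet = Module.Language.ENG;
  settings.detect_areas_mode = Module.DetectAreasMode.DOCUMENT;
  settings.auto_contrast = true;

  if (threshold !== undefined) {
    // threshold_value is documented as ranging from 1 to 255.
    if (!Number.isFinite(threshold) || threshold < 1 || threshold > 255) {
      throw new RangeError("threshold must be a number from 1 to 255");
    }
    settings.threshold_value = threshold;
  }
  return settings;
}
```

Validating on the JavaScript side keeps an out-of-range value from reaching the native code, where the failure mode is not documented here.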

Module~WasmAsposeOCRInput : object

Descriptor that describes either input raw data or a path to a file. If the “file_path” field is not empty, it takes precedence over the raw data.

Kind: inner typedef of Module
Properties

| Name | Type | Description |
| --- | --- | --- |
| raw_data | VectorRect* | Input raw data in a special format. The layout is described after this table. |
| height | number | Image height. |
| width | number | Image width. |
| raw_data_type | AsposeOCRRawDataType | Format of the input raw_data. |
| url | string | Null-terminated string that describes the file URL (file system path). |

Layout of raw_data, with row_index in 0 … height and column_index in 0 … width:

  1. RGB color model (channels_size = 3):
       raw_data[row_index * width * 3 + column_index * 3 + 0] = RED
       raw_data[row_index * width * 3 + column_index * 3 + 1] = GREEN
       raw_data[row_index * width * 3 + column_index * 3 + 2] = BLUE
  2. Grayscale (channels_size = 1):
       raw_data[row_index * width + column_index] = grayscale value
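
The raw_data layout described above can be produced from RGBA pixels (4 bytes per pixel, the layout used by canvas `ImageData.data`) as sketched below. Both helper names are hypothetical, and the sketch only builds the byte buffers; how a buffer is handed to the `raw_data` field depends on the binding and is not shown. The grayscale weights are the common Rec. 601 luma coefficients, chosen here as an assumption rather than something the library mandates.

```javascript
// Sketch: repack RGBA pixels into the 3-channel RGB layout.
function rgbaToRgb(rgba, width, height) {
  if (rgba.length !== width * height * 4) {
    throw new RangeError("rgba length does not match width * height * 4");
  }
  const raw = new Uint8Array(width * height * 3);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const src = (row * width + col) * 4;
      const dst = row * width * 3 + col * 3;
      raw[dst + 0] = rgba[src + 0]; // RED
      raw[dst + 1] = rgba[src + 1]; // GREEN
      raw[dst + 2] = rgba[src + 2]; // BLUE (alpha is dropped)
    }
  }
  return raw;
}

// Sketch: repack RGBA pixels into the 1-channel grayscale layout.
function rgbaToGrayscale(rgba, width, height) {
  if (rgba.length !== width * height * 4) {
    throw new RangeError("rgba length does not match width * height * 4");
  }
  const raw = new Uint8Array(width * height);
  for (let row = 0; row < height; row++) {
    for (let col = 0; col < width; col++) {
      const src = (row * width + col) * 4;
      // Assumed weights (Rec. 601 luma); other conversions are possible.
      raw[row * width + col] = Math.round(
        0.299 * rgba[src] + 0.587 * rgba[src + 1] + 0.114 * rgba[src + 2]
      );
    }
  }
  return raw;
}
```

The matching `raw_data_type` would be RGB for the first buffer and GRAYSCALE for the second, with `width` and `height` set to the image dimensions.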

Module~WasmAsposeOCRRecognitionArea : object

Kind: inner typedef of Module
Properties

| Name | Type | Description |
| --- | --- | --- |
| area | Rect | Area rectangle that contains the recognized text given in the “recognized_text” field. |
| recognized_text | string | Recognized text for the area. |

Module~WasmAsposeOCRRecognizedPage : object

Kind: inner typedef of Module
Properties

| Name | Type | Description |
| --- | --- | --- |
| recognized_areas | Array.<WasmAsposeOCRRecognitionArea> | Array of recognized areas. |

Module~WasmAsposeOCRRecognitionResult : object

Kind: inner typedef of Module
Properties

| Name | Type |
| --- | --- |
| recognized_pages | Array.<WasmAsposeOCRRecognizedPage> |
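
The result is a nested structure: pages contain areas, and each area carries its rectangle and text. The sketch below walks it and returns the text of each page. The typedefs describe the containers as arrays; as a labeled assumption, the hypothetical `toArray` helper also accepts vector-like objects that expose `size()` and `get(i)`, in case a binding returns those instead.

```javascript
// Assumption: containers are plain arrays (as the typedefs state) or
// vector-like objects with size() and get(i); handle both.
function toArray(seq) {
  if (Array.isArray(seq)) return seq;
  const out = [];
  for (let i = 0; i < seq.size(); i++) out.push(seq.get(i));
  return out;
}

// Returns one string per page, joining the text of its areas with newlines.
function collectPageTexts(result) {
  return toArray(result.recognized_pages).map((page) =>
    toArray(page.recognized_areas)
      .map((area) => area.recognized_text)
      .join("\n")
  );
}
```

When only a flat string is needed, AsposeOCRSerializeResult with the text format is the more direct route; walking the structure is useful when the area rectangles matter.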