models
Module models
Classes
AreasType(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Determines the type of regions detected by the model.
Used in get_text_areas to select which region coordinates are returned, for example paragraph coordinates or line coordinates.
Ancestors (in MRO)
- enum.Enum
Class variables
LINES
- Sets regions as lines
PARAGRAPHS
- Sets regions as paragraphs
WORDS
- Sets regions as words
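Because AreasType derives from enum.Enum, its members behave like any other Python enumeration. The sketch below uses a stand-in class rather than the real one; the member names come from the list above, while the numeric values are assumptions made only for illustration.

```python
from enum import Enum


class AreasType(Enum):
    """Stand-in for models.AreasType; the numeric values are assumed."""
    PARAGRAPHS = 1
    LINES = 2
    WORDS = 3


# Members can be looked up by name, which is handy when the choice
# comes from a configuration file or a command-line flag.
selected = AreasType["LINES"]

# Iterating over the class yields the members in definition order.
member_names = [member.name for member in AreasType]
print(selected, member_names)
```

The same name-based lookup and iteration apply to the other enumerations in this module.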
DetectAreasMode(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Determines the type of neural network used for area detection.
Used in RecognitionSettings to specify which type of image you want to recognize.
Ancestors (in MRO)
- enum.Enum
Class variables
COMBINE
- Detects paragraphs with text and then uses another NN model to detect areas inside the paragraphs. Better for images with a complex structure.
CURVED_TEXT
- Detects lines and recognizes text on curved images. Preferred mode for photos of book and magazine pages.
DOCUMENT
- Detects paragraphs using an NN model for documents. Better for multicolumn documents and for documents with pictures or other non-text objects.
NONE
- Doesn’t detect paragraphs. Better for a simple one-column document without pictures.
PHOTO
- Detects paragraphs using an NN model for photos. Better for images with a lot of pictures and other non-text objects.
TABLE
- Detects cells with text. Preferable mode for images with table structure.
TEXT_IN_WILD
- A super-powerful neural network specialized in extracting words from low-quality images such as street photos, license plates, passport photos, meter photos, and photos with noisy backgrounds.
Format(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Format in which the recognition result is saved as a document.
Ancestors (in MRO)
- enum.Enum
Class variables
DOCX
- Saves the result as an Office Open XML WordprocessingML document (macro-free).
EPUB
- Saves the document as an EPUB file.
HTML
- Saves the document as an HTML file.
JSON
- Saves the result as plain text written in JavaScript Object Notation.
PDF
- Saves the result as a PDF (Adobe Portable Document Format) document.
PDF_NO_IMG
- Saves the document as a searchable PDF (Adobe Portable Document Format) document without the image.
RTF
- Saves the document as an RTF file.
TEXT
- Saves the result in the plain text format.
XLSX
- Saves the result as an Excel (2007 and later) workbook.
XML
- Saves the result as an XML Document.
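When saving results, it can help to derive the output file name from the chosen format. The helper below is a hypothetical convenience and not part of the module; the member names are taken from the list above, and the file extensions are the conventional ones for each format.

```python
# Conventional file extensions, keyed by Format member name.
# This table is an illustration and is not defined by the library.
FORMAT_EXTENSIONS = {
    "DOCX": ".docx",
    "EPUB": ".epub",
    "HTML": ".html",
    "JSON": ".json",
    "PDF": ".pdf",
    "PDF_NO_IMG": ".pdf",
    "RTF": ".rtf",
    "TEXT": ".txt",
    "XLSX": ".xlsx",
    "XML": ".xml",
}


def output_filename(stem: str, format_name: str) -> str:
    """Build an output file name for the given Format member name."""
    try:
        return stem + FORMAT_EXTENSIONS[format_name]
    except KeyError:
        raise ValueError(f"Unknown format name: {format_name!r}") from None


report_name = output_filename("invoice", "PDF_NO_IMG")
```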
ImageData(javaClass)
Ancestors (in MRO)
- aspose.helper.BaseJavaClass
Methods
initParams(self)
InputType(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Types of images/documents for processing/recognition.
Ancestors (in MRO)
- enum.Enum
Class variables
BASE64
- Base64 string with the image, or the path to a .txt file with the base64 content. Supports GIF, PNG, JPEG, BMP, TIFF.
DIRECTORY
- Path to the directory. Nested archives and folders are not supported. Supports GIF, PNG, JPEG, BMP, TIFF. By default, all images are processed.
PDF
- Scanned PDF document from a file or from a binary array.
SINGLE_IMAGE
- A single image. Supports GIF, PNG, JPEG, BMP, TIFF, JFIF, or a binary array.
TIFF
- Multipage TIFF/TIF document from a file or from an InputStream.
URL
- Link to the image. Supports GIF, PNG, JPEG, BMP, TIFF.
ZIP
- Full name of the ZIP archive. Nested archives and folders are not supported. Supports GIF, PNG, JPEG, BMP, TIFF, JFIF. By default, all images are processed.
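Since the input type must match the source passed to OcrInput, a caller may want to pick the member from the source string. The function below is a hypothetical helper, not part of the module. It returns member names as strings, does not cover BASE64, and treats every .tif/.tiff file as a TIFF input, which is an assumption.

```python
import os

# Single-image extensions taken from the SINGLE_IMAGE description.
IMAGE_EXTENSIONS = {".gif", ".png", ".jpg", ".jpeg", ".bmp", ".jfif"}


def guess_input_type_name(source: str) -> str:
    """Return the name of the InputType member that plausibly fits `source`."""
    lowered = source.lower()
    if lowered.startswith(("http://", "https://")):
        return "URL"
    if os.path.isdir(source):
        return "DIRECTORY"
    extension = os.path.splitext(lowered)[1]
    if extension == ".pdf":
        return "PDF"
    if extension == ".zip":
        return "ZIP"
    if extension in {".tif", ".tiff"}:
        # Assumption: TIFF files are treated as potentially multipage.
        return "TIFF"
    if extension in IMAGE_EXTENSIONS:
        return "SINGLE_IMAGE"
    raise ValueError(f"Cannot infer an input type for {source!r}")
```

The returned name could then be resolved with a lookup such as `InputType[name]`, which works for any enum.Enum subclass.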
Language(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Language model for the recognition.
Ancestors (in MRO)
- enum.Enum
Class variables
BEL
- Belarusian alphabet
BUL
- Bulgarian alphabet
CHI
- Chinese alphabet
CYRILLIC
- Multi-language support (Cyrillic alphabet)
CZE
- Czech alphabet
DAN
- Danish alphabet
DEU
- German alphabet
DUM
- Dutch alphabet
ENG
- English alphabet
EST
- Estonian alphabet
FIN
- Finnish alphabet
FRA
- French alphabet
HIN
- Hindi alphabet
ITA
- Italian alphabet
KAZ
- Kazakh alphabet
LATIN
- Multi-language support (Latin alphabet)
LAV
- Latvian alphabet
LIT
- Lithuanian alphabet
NONE
- Multi-language support
NOR
- Norwegian alphabet
POL
- Polish alphabet
POR
- Portuguese alphabet
RUM
- Romanian alphabet
RUS
- Russian alphabet
SLK
- Slovak alphabet
SLV
- Slovene alphabet
SPA
- Spanish alphabet
SRP
- Serbian alphabet
SRP_HRV
- Serbo-Croatian alphabet
SWE
- Swedish alphabet
UKR
- Ukrainian alphabet
ModelsConverter()
Methods
convertInputTypeToJava(jType)
convertToJavaAreasMode(jType)
convertToJavaAreasType(jType)
convertToJavaFormat(jType)
convertToJavaLanguage(jType)
convertToJavaSpellCheckLanguage(jType)
OcrInput(type: models.InputType, filters: models.PreprocessingFilter = None)
- Main class to collect images.
The constructor creates the container and sets the type of images/documents and the filters for further processing/recognition. @param type: The type of images/documents that will be added to the container. @param filters: The processing filters that will be applied for further processing or recognition.
Methods
add(self, fullPath: str, startPage: int = None, pagesNumber: int = None)
- Adds the path or URI of the image for recognition/processing. The type of the image must correspond to the type specified in the constructor. @param fullPath: Path to the image/document/folder/archive. @param startPage: The first page/image for processing/recognition. Use for documents, ZIP archives, and folders. @param pagesNumber: The total number of pages/images for processing/recognition. Use for documents, ZIP archives, and folders. Default = all.
addStream(self, image_data_binary, startPage: int = None, pagesNumber: int = None)
- Adds the InputStream containing the image for recognition/processing.
The type of the image must correspond to the type specified in the constructor.
```python
# `api` (the OCR engine object) and `imgPath` are defined elsewhere.
ocr_input = OcrInput(InputType.SINGLE_IMAGE)
with open(imgPath, "rb") as file:
    image_data_binary = file.read()
ocr_input.addStream(image_data_binary)
result = api.recognize(ocr_input, RecognitionSettings())
```
@param image_data_binary: Binary data containing the image or document. @param startPage: The first page/image for processing/recognition. Use for documents, ZIP archives, and folders. @param pagesNumber: The total number of pages/images for processing/recognition. Use for documents, ZIP archives, and folders. Default = all.
add_base64(self, base64: str)
- Add the base64 string containing the image for recognition / processing. The type of the image must correspond to the type specified in the constructor. @param base64: Base64 string with single image.
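The base64 string expected here can be produced from raw image bytes with the standard library. The following is a minimal sketch: `to_base64_string` is a hypothetical helper, and the eight-byte PNG signature only stands in for real image data.

```python
import base64


def to_base64_string(image_bytes: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string."""
    return base64.b64encode(image_bytes).decode("ascii")


# The first eight bytes of every PNG file, used here as a tiny
# stand-in for the contents of a real image file.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = to_base64_string(png_signature)

# Decoding restores the original bytes, so nothing is lost.
decoded = base64.b64decode(encoded)
```

A string produced this way would presumably be passed to add_base64 on an OcrInput created with InputType.BASE64.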
clear(self)
- Clears the collection, setting the number of items for processing/recognition to 0.
clear_filters(self)
- Remove all filters.
get(self, index: int) ‑> models.ImageData
- Returns information about processed / recognized image. @param index: Position of the image in the List. @return: The object of ImageData.
getJavaClass(self)
init(self, javaClass)
size(self)
- Number of items for processing/recognition. @return: Number of items.
PreprocessingFilter()
- Base class for image processing commands.
Ancestors (in MRO)
- aspose.helper.BaseJavaClass
Class variables
JAVA_CLASS_NAME
Static methods
auto_denoising()
- Enables the use of an additional neural network to improve the image by reducing noise. Useful for images with scan artifacts, distortion, spots, flares, gradients, or foreign elements. @return: AutoDenoisingFilter object.
auto_dewarping()
- Automatically corrects geometric distortions in the image. Extremely resource intensive! @return: AutoDewarpingFilter object.
auto_skew()
- Enables automatic image skew correction. @return: AutoSkewFilter object.
binarize()
- Converts an image to a black-and-white image. Binary images are images whose pixels have only two possible intensity values. They are normally displayed as black and white. Numerically, the two values are often 0 for black and 255 for white. Binary images are produced by automatically thresholding an image. @return: BinarizeFilter object.
binarize_and_dilate()
- Binarizes the image and then dilates it. Dilation adds pixels to the boundaries of objects in an image. @return: DilateFilter object.
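To make the effect of dilation concrete, here is a small pure-Python sketch that works on a binary grid. It assumes a 3x3 square neighbourhood and treats the value 1 as the object; the structuring element and polarity actually used by the filter are not stated in this reference.

```python
def dilate(binary):
    """Dilate the foreground (value 1) of a binary image with a 3x3 square."""
    height, width = len(binary), len(binary[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # A pixel becomes foreground if any pixel in its 3x3
            # neighbourhood (clipped at the borders) is foreground.
            for ny in range(max(0, y - 1), min(height, y + 2)):
                for nx in range(max(0, x - 1), min(width, x + 2)):
                    if binary[ny][nx]:
                        result[y][x] = 1
    return result


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dilated = dilate(image)  # the single pixel grows into a 3x3 block
```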
contrast_correction()
- Contrast correction filter. @return: ContrastCorrectionFilter object.
invert()
- Automatically inverts colors in a document image. @return: InvertFilter object.
median()
- The median filter runs through each element of the image and replaces each pixel with the median of its neighboring pixels. @return: MedianFilter object.
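The idea behind a median filter can be sketched in a few lines of pure Python. The window size (3x3), the inclusion of the centre pixel, and the clipping at the borders are assumptions; the reference does not say which variant the library uses.

```python
from statistics import median_low


def median_filter(pixels):
    """Replace each pixel with the median of its 3x3 neighbourhood."""
    height, width = len(pixels), len(pixels[0])
    result = [row[:] for row in pixels]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, clipped at the image borders.
            window = [
                pixels[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value present in the window.
            result[y][x] = median_low(window)
    return result


noisy = [
    [10, 10, 10],
    [10, 255, 10],  # a single bright speck of noise
    [10, 10, 10],
]
smoothed = median_filter(noisy)
```

An isolated outlier like the speck above is removed, while larger uniform regions are left unchanged.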
resize(width: int, height: int)
- Rescales the image, upscaling or downscaling its resolution. @param width: The new width of the image. @param height: The new height of the image. @return: ResizeFilter object.
rotate(angle: float)
- Rotates the original image. @param angle: Angle of rotation. Value from -360 to 360. @return: RotateFilter object.
scale(ratio: float)
- Rescales the image, upscaling or downscaling its resolution. The InterpolationFilterType is bilinear or nearest neighbor. @param ratio: The scaling factor. Recommended values are from 0.1 to 1 to shrink and from 1 to 10 to enlarge. @return: ScaleFilter object.
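As an illustration of how a scaling ratio maps to pixels, the sketch below implements nearest-neighbor scaling on a plain list of rows. It is a simplified stand-in, not the library's implementation, and `scale_nearest` is a hypothetical name.

```python
def scale_nearest(pixels, ratio):
    """Scale a 2D image by `ratio` using nearest-neighbor sampling."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    src_height, src_width = len(pixels), len(pixels[0])
    dst_height = max(1, round(src_height * ratio))
    dst_width = max(1, round(src_width * ratio))
    return [
        [
            # Map each destination pixel back to its source pixel,
            # clamping to the last valid row and column.
            pixels[min(src_height - 1, int(y / ratio))][
                min(src_width - 1, int(x / ratio))
            ]
            for x in range(dst_width)
        ]
        for y in range(dst_height)
    ]


small = [[1, 2], [3, 4]]
enlarged = scale_nearest(small, 2)    # ratio above 1 enlarges
shrunk = scale_nearest(enlarged, 0.5)  # ratio below 1 shrinks
```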
threshold(value: int)
- Creates a binary image by applying a threshold value to the pixel intensity of the original image. @param value: The max value. @return: BinarizeFilter object.
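Fixed-value thresholding can be sketched as follows. The comparison used here (strictly greater than the threshold becomes white) is an assumption, since the reference only calls the parameter "the max value".

```python
def apply_threshold(pixels, value):
    """Map each grayscale pixel to 0 (black) or 255 (white)."""
    # Assumption: pixels strictly above `value` become white.
    return [[255 if pixel > value else 0 for pixel in row] for row in pixels]


gray = [
    [10, 200, 130],
    [90, 127, 128],
]
binary = apply_threshold(gray, 127)
```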
to_grayscale()
- Converts an image to a grayscale image. A grayscale image has 256 levels of light (0 to 255). @return: GrayscaleFilter object.
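A common way to compute a gray level from RGB is a weighted sum of the channels. The sketch below uses the ITU-R BT.601 luma weights (0.299, 0.587, 0.114) in integer arithmetic; whether the library uses these weights is an assumption.

```python
def rgb_to_gray(rgb_pixels):
    """Convert rows of (r, g, b) tuples to gray levels in 0..255."""
    return [
        [
            # Weights are scaled by 1000; adding 500 rounds to nearest.
            (299 * r + 587 * g + 114 * b + 500) // 1000
            for (r, g, b) in row
        ]
        for row in rgb_pixels
    ]


colors = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
gray_levels = rgb_to_gray(colors)
```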
Methods
add(self, filter)
- Adds a filter to the collection for further preprocessing. @param filter: PreprocessingFilter object.
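Collecting filters and applying them in sequence follows a familiar pipeline pattern. The class below is a hypothetical stand-in that shows the pattern with plain functions; it does not reproduce the behavior of PreprocessingFilter, and the two sample filters are simplified illustrations.

```python
from typing import Callable, List

Image = List[List[int]]
Filter = Callable[[Image], Image]


class FilterChain:
    """Hypothetical stand-in: collects filters and applies them in order."""

    def __init__(self) -> None:
        self._filters: List[Filter] = []

    def add(self, image_filter: Filter) -> "FilterChain":
        self._filters.append(image_filter)
        return self  # returning self allows chained calls

    def apply(self, image: Image) -> Image:
        for image_filter in self._filters:
            image = image_filter(image)
        return image


def invert(image: Image) -> Image:
    return [[255 - pixel for pixel in row] for row in image]


def binarize(image: Image) -> Image:
    return [[255 if pixel > 127 else 0 for pixel in row] for row in image]


chain = FilterChain().add(invert).add(binarize)
processed = chain.apply([[30, 200], [128, 127]])
```

The order in which filters are added matters, because each filter receives the output of the previous one.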
getJavaClass(self)
SpellCheckError(javaClass)
- Represents a misspelled word with additional data.
Ancestors (in MRO)
- aspose.helper.BaseJavaClass
Methods
initParams(self)
SpellCheckLanguage(value, names=None, *, module=None, qualname=None, type=None, start=1)
- Dictionary language for spell-check correction.
Ancestors (in MRO)
- enum.Enum
Class variables
CZE
- Czech dictionary
DAN
- Danish dictionary
DEU
- German dictionary
DUM
- Dutch dictionary
ENG
- English dictionary
EST
- Estonian dictionary
FIN
- Finnish dictionary
FRA
- French dictionary
ITA
- Italian dictionary
LAV
- Latvian dictionary
LIT
- Lithuanian dictionary
POL
- Polish dictionary
POR
- Portuguese dictionary
RUM
- Romanian dictionary
SLK
- Slovak dictionary
SLV
- Slovene dictionary
SPA
- Spanish dictionary
SWE
- Swedish dictionary
SuggestedWord(javaClass)
- Spelling suggestion returned from get_spell_check_error_list.
Ancestors (in MRO)
- aspose.helper.BaseJavaClass
Methods
initParams(self)