Smart Classifier Model Editor
- ABBYY Smart Classifier is based on a scalable processing back-end that uses machine learning and linguistic technologies to classify unstructured content.
- The intuitive, web-based Model Editor of Smart Classifier allows the content experts within the organization to set up, train and maintain the classification models.
- Smart Classifier can easily be connected and integrated with existing IT systems via a REST API 1).
- Important Notes:
- With other classification tools, setting up a proper classification model often requires scientific know-how and expertise to pick the best classification algorithms, which then have to be tuned by selecting the best-working parameters.
- ABBYY has made this step very intuitive: the artificial intelligence built into Smart Classifier does this automatically.
- If you have installed ABBYY Smart Classifier, you can start the Model Editor via its icon in the Windows Start menu.
- The local URL (default installation) is: http://localhost:83/smartclassifier#/
- The Model Editor allows you to set up and train classification models based on your own documents.
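To confirm that the Model Editor is up after installation, a quick check from a script can help. The following is a minimal sketch that only assumes the default local URL given above. The helper names `model_editor_url` and `is_reachable` are made up for this illustration and are not part of Smart Classifier.

```python
import urllib.error
import urllib.request


def model_editor_url(host="localhost", port=83):
    """Build the Model Editor URL; port 83 is the default installation port."""
    return f"http://{host}:{port}/smartclassifier#/"


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP GET on the URL succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

Calling `is_reachable(model_editor_url())` on the machine where Smart Classifier is installed should tell you whether the web front-end answers. A different host or port can be passed if the installation is not the default one.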
Create a new Classification Project
To set up a new classification model, just run through the assistant:
- Give the model a meaningful name
- Select the language of the documents that should be classified
- Select which classification results should be returned:
- All candidate categories
- Top candidate category = assign only the category with the highest score
- Single candidate category = assign a category only if no other possible candidates are found
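The difference between the three result types can be illustrated with a small sketch. This is not the product's implementation. The function `select_categories`, the mode names `"all"`, `"top"` and `"single"`, and the example scores are assumptions made purely to show the behavior described in the list above.

```python
def select_categories(candidates, mode):
    """Pick categories from a list of (category, score) tuples.

    mode is a hypothetical label for the three options in the assistant:
      "all"    - return every candidate category, highest score first
      "top"    - return only the category with the highest score
      "single" - return a category only if it is the sole candidate
    """
    if mode == "all":
        return sorted(candidates, key=lambda c: c[1], reverse=True)
    if mode == "top":
        return [max(candidates, key=lambda c: c[1])] if candidates else []
    if mode == "single":
        return list(candidates) if len(candidates) == 1 else []
    raise ValueError(f"unknown mode: {mode!r}")
```

With the candidates `[("Invoice", 0.7), ("Contract", 0.2)]`, the `"top"` mode keeps only `Invoice`, while the `"single"` mode assigns nothing because a second candidate exists.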
Define Name & Language
Compatibility Note: The option to choose between textual/linguistic and semantic features, which was available up to version 2.5, was removed from the interface because, starting with version 2.6, the technology internally selects the best-matching option.
Definition of the Classification Behavior
Create a new Training set for Classification
Classification of unstructured documents is based on machine learning. Since the content is not structured, a rule-based approach will not work. To let the machine “learn your documents and classes”, a training set of documents is needed.
The Smart Classifier code sample also contains a test set of documents that makes it easy to create a working Classification Model, so that you can evaluate the Smart Classifier Model Editor.
- If you have already installed Smart Classifier on your machine, you can find the following documents locally under:
C:\Users\Public\ABBYY\Compreno Products\2.5\Code Samples\SmartClassifierSampleApplication
- or just download the test-collection:
- The training set requirements are:
- Put at least 10 documents that represent a specific class in each folder (one folder per class). Smart Classifier is very flexible: it can process Office files, text, HTML, PDFs, images and others.
- Create a .zip archive of these folders
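Preparing the archive can be scripted. The sketch below assumes a layout with one sub-folder per class directly under a root folder, and that the class folders sit at the top level of the .zip. Both are assumptions of this example, as is the helper name `build_training_zip`. It checks the minimum of 10 documents per class before writing the archive.

```python
import zipfile
from pathlib import Path

MIN_DOCS_PER_CLASS = 10  # minimum stated in the requirements above


def build_training_zip(root_dir, zip_path, min_docs=MIN_DOCS_PER_CLASS):
    """Zip a folder-per-class training set; return {class_name: document_count}."""
    root = Path(root_dir)
    class_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    if not class_dirs:
        raise ValueError(f"no class folders found in {root}")

    # Collect and validate all classes first, so no partial archive is written.
    files_by_class = {}
    for class_dir in class_dirs:
        docs = sorted(f for f in class_dir.rglob("*") if f.is_file())
        if len(docs) < min_docs:
            raise ValueError(
                f"class '{class_dir.name}' has {len(docs)} documents, "
                f"at least {min_docs} are required"
            )
        files_by_class[class_dir.name] = docs

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for docs in files_by_class.values():
            for doc in docs:
                # Store paths relative to the root, e.g. "Invoices/doc1.pdf".
                archive.write(doc, arcname=doc.relative_to(root).as_posix())

    return {name: len(docs) for name, docs in files_by_class.items()}
```

For example, `build_training_zip("training_set", "training_set.zip")` would return the number of documents per class and leave a `training_set.zip` ready for upload. The folder and file names here are placeholders.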
Upload your Training Set
- Select and upload the .zip
- Learning of the document specific features will start automatically
Result: You have created a first Classification Model based on your classes and your documents.
Evaluate your Classification Model
Smart Classifier Integration
- Once an initial classification model is available, you can start the Smart Classifier integration via the REST API. The default web port is: 83 2)
Model Training & Management API
Version 2.6 of ABBYY Smart Classifier introduced an API for setting up and managing classification models. With this new feature set, developers can automate the creation, training and tuning of the models via the API.
The API provides access to the same features and options that are available via the Classification Model Editor (screenshots above). This includes:
- Create new models
- Define/Adjust the model settings
- Upload and delete training/control documents
- Start re-training
- Get the classification statistics
- Control the “stop-word-list”
- Deploy a model