ABBYY InfoExtractor SDK

Activate your data

with powerful information extraction technology that reveals entities, events and relations across unstructured texts.

ABBYY InfoExtractor SDK empowers a new generation of Case Management and smart business process applications by automatically identifying and extracting business-relevant information from unstructured content: entities, events, and the relationships between them. By consolidating the extracted information into facts, ABBYY InfoExtractor reconstructs the story lines hidden in long, complex documents.

In-depth insight into entities, events and relationships enables enterprise customers and software developers to automate document workflows, speed up time-consuming contract review, and improve compliance management and other content-intensive tasks.

Built on ABBYY’s Compreno natural language processing technology, which operates on the level of meaning, InfoExtractor delivers highly accurate text analysis results. Its language-based capabilities support business processes that require granular and reliable content understanding, and open up new opportunities to increase operational efficiency, mitigate risks, and drive revenue.

Highlighted Use Cases

Credit and Insurance Risk Mitigation

Information-intensive transactions require that attorneys, bankers, risk managers and other professionals quickly and efficiently analyze and review complex documents. InfoExtractor’s intelligent understanding of documents enables these highly skilled professionals to navigate directly to relevant facts that are critical to the transaction.

Improving Compliance in the Construction Industry

Information extraction can help designers and engineers identify specific equipment, together with its elements and characteristics, in project documentation. This supports consistent compliance with industry standards and regulations, helps avoid mistakes in project documentation, and mitigates financial and safety risks.

Solution Highlights

Accurately extract entities, events, facts

Entity recognition identifies critical textual elements such as persons, organizations, dates, and geographic objects. Event extraction reveals speech activities, commercial deals, and crimes, as well as facts such as citizenship, employment, and family relationships. Powered by cutting-edge Compreno natural language processing technology, ABBYY InfoExtractor recognizes and disambiguates entities and events at the level of their meaning.

Identify relationships between entities and events

Through deep language-based analysis of documents, InfoExtractor SDK reveals relations between entities and events. Entity extraction is important, but recognizing that an entity has been replaced by a pronoun, or tracing its mentions throughout the text, allows you to analyze the whole picture: for example, to find the deals that link a buyer and a seller and identify the related financial figures or other information.

Add customized entities for specific cases

To ensure reliable entity recognition, InfoExtractor enables the creation of user ontology dictionaries for extracting complicated entities that can be critical for business (e.g. complex names like Aditya Prasad Kola, or organizations like Mengniu). A simple procedure described in the SDK’s documentation helps to add missing concepts within existing entity types. These new concepts are automatically picked up by the extraction rules and improve entity recognition in complex cases.

Use custom ontologies for industry solutions

In addition to the basic ontologies that enable entity and event extraction for the most common domains, ontologies for specific industries or processes can be customized or developed from scratch upon request by ABBYY professional linguistic services.

Work with text regardless of source

ABBYY’s world-famous Optical Character Recognition (OCR) technology is integrated into the InfoExtractor SDK, enabling the analysis of scanned files (in TIFF, JPEG and other image formats) and PDF documents. Likewise, if large volumes of scanned documents need to be processed, the InfoExtractor SDK can be seamlessly integrated with ABBYY Recognition Server.

Usage Scenarios

Business process optimization

ABBYY InfoExtractor creates operational efficiency by identifying dates, monetary amounts, technical characteristics and various other data in legal, accounting or technical documents. Automatic processing of complex documents makes it possible to evaluate efficiently whether they contain critical data, need verification, or should have other business rules and processes applied.

Enhancing search capabilities

By delivering accurate extraction of entities and related events, ABBYY InfoExtractor SDK enriches document metadata and equips search tools with advanced faceted search capabilities.
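
As a rough illustration of how extracted entities can drive faceted search, the following Python sketch builds facet counts from entity metadata and filters documents by a selected facet value. It does not use the InfoExtractor SDK API: the `documents` records, their field names, and the helper functions `build_facets` and `filter_by_facet` are hypothetical stand-ins for whatever metadata an application stores after extraction.

```python
from collections import Counter, defaultdict

# Hypothetical extraction output: each document carries (entity type, value)
# pairs. This layout is an assumption, not the SDK's actual output format.
documents = [
    {"id": "doc-1", "entities": [("Organization", "Acme Corp"), ("Person", "Jane Doe")]},
    {"id": "doc-2", "entities": [("Organization", "Acme Corp"), ("Organization", "Globex")]},
    {"id": "doc-3", "entities": [("Person", "Jane Doe")]},
]


def build_facets(docs):
    """Count, per entity type, how many documents mention each value."""
    facets = defaultdict(Counter)
    for doc in docs:
        # set() ensures a document counts once per value, even if the
        # entity is mentioned several times in the text.
        for entity_type, value in set(doc["entities"]):
            facets[entity_type][value] += 1
    return facets


def filter_by_facet(docs, entity_type, value):
    """Return the ids of documents that mention the selected facet value."""
    return [d["id"] for d in docs if (entity_type, value) in d["entities"]]


facets = build_facets(documents)
selected = filter_by_facet(documents, "Person", "Jane Doe")
```

A search front end could display `facets` as clickable filters and call `filter_by_facet` when the user selects one of them.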

Workflow automation

Automatic routing and alerting for unstructured documents such as customer requests or incoming agreements can be embedded into a company’s workflow to make business processes more efficient, with faster responses and enhanced customer support.
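
A minimal sketch of such routing, assuming the extracted facts are already available as a simple dictionary, might look like the Python example below. The field names (`doc_type`, `amounts`), the queue names, and the `route` function are all hypothetical and are not part of the InfoExtractor SDK.

```python
# Hypothetical facts extracted from an incoming document. The field names
# are illustrative only and do not reflect the SDK's output schema.
incoming = {"doc_type": "agreement", "organizations": ["Acme Corp"], "amounts": [250000.0]}

# Ordered routing rules as (predicate, destination queue); first match wins.
RULES = [
    # High-value agreements go to legal review (threshold is an assumption).
    (lambda f: f["doc_type"] == "agreement" and max(f["amounts"], default=0) >= 100000,
     "legal-review"),
    (lambda f: f["doc_type"] == "agreement", "contracts-inbox"),
    (lambda f: f["doc_type"] == "customer_request", "customer-support"),
]


def route(facts, rules=RULES, default="manual-triage"):
    """Return the queue of the first matching rule, or the default queue."""
    for predicate, queue in rules:
        if predicate(facts):
            return queue
    return default


destination = route(incoming)
```

Keeping the rules in an ordered list makes the priority explicit: more specific conditions are placed first, and anything unmatched falls through to manual triage.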