Technology
Overview
At the core of Arikus inContext™ is the
Arikus Information Refinery Engine (AIRE), comprising all of
Arikus’ core technologies. These core technologies are
also available in the form of an SDK for the developer community,
and packaged together make up the inContext™ SDK product.
Arikus provides an information retrieval platform to allow for
the automatic processing of unstructured data. At the basis
of any such platform is the ability to rank a document against
a query. All other operations build on this basic one, so
without a good ranking algorithm at its core, nothing else will
work well.
Arikus’ information retrieval platform is based on linguistic
analysis (plain language) and statistical analysis. The architecture
of the ranking algorithms separates the language-dependent linguistic
analysis from the language-independent statistical analysis.
The linguistic analysis determines various language-dependent
properties such as noise words, grammatical inflections,
compounds, and pre- and post-clitics. It can be enhanced to take
into account any special language-dependent construct,
such as noun phrases. The linguistic analyzer creates a model
of the query that is then passed to the statistical analysis
layer for processing. The statistical layer looks for patterns
and deviations from collection norms. The statistical analysis
treats words as abstract symbols allowing it to handle any word-based
language. This analysis also determines related words automatically.
These words can then be presented to users so that they can
augment their query, or they can be used to refine the query
automatically.
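The separation of the two layers can be illustrated with a minimal Python sketch. It is not the Arikus implementation: the noise-word list, the suffix rules, the TF-IDF weighting, and the names `linguistic_analysis` and `rank` are all assumptions chosen for illustration. The point is only that the statistical layer sees abstract symbols and never needs to know which language produced them.

```python
import math
from collections import Counter

# Hypothetical English-only resources; a real linguistic analyzer
# would be far richer and would exist once per supported language.
NOISE_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for"}
SUFFIXES = ("ing", "ed", "es", "s")


def linguistic_analysis(text):
    """Language-dependent layer: drop noise words, strip simple inflections."""
    symbols = []
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")
        if not word or word in NOISE_WORDS:
            continue
        for suffix in SUFFIXES:
            # Only strip when a stem of at least three letters remains.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        symbols.append(word)
    return symbols


def rank(query, documents):
    """Language-independent layer: score symbol lists with TF-IDF.

    Returns (score, document_index) pairs, best match first.
    """
    doc_symbols = [linguistic_analysis(d) for d in documents]
    query_model = linguistic_analysis(query)
    n = len(documents)
    # Number of documents in which each symbol appears at least once.
    doc_freq = Counter(s for symbols in doc_symbols for s in set(symbols))
    scores = []
    for i, symbols in enumerate(doc_symbols):
        counts = Counter(symbols)
        score = 0.0
        for s in query_model:
            if counts[s]:
                idf = math.log((n + 1) / (doc_freq[s] + 1)) + 1.0
                score += (counts[s] / len(symbols)) * idf
        scores.append((score, i))
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))
```

Supporting another language in this sketch would mean swapping `linguistic_analysis` only; `rank` would stay unchanged.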
The architecture of the Arikus information retrieval platform
also separates the indexing of documents and the crawling of
documents. Each of these technologies can be distributed to
other computers to allow for the distributed processing of a
large collection.
Solution
The Arikus platform is designed to scale, extend,
and evolve with clients’ ever-expanding information needs.
It accesses and indexes information (heterogeneous data types)
contained in disparate locations. Based upon an end user’s
query, Arikus classifies and displays the information that is
relevant and appropriate. Key technologies comprising the Arikus
platform are crawlers, indexer, search analyzer, query analyzer
and administrator.
Arikus technology goes further than any other known industry
document ranking technology in delivering accurate, relevant
and quality results. Arikus begins by analyzing an end user
query to determine word usage, grammatical inflections, phrase
structure, and sentence structure. An analysis is performed
on target documents to determine how words are distributed in
a document and what the relationships are between the words
in the document. Sentence analysis is performed to determine
both a position-independent and a position-dependent score for
each sentence in the document. This step not only improves the
accuracy, relevancy, and quality of returned results but also
allows for the determination of the “best summary”,
that is, the passage that best answers the user’s question.
This is entirely unique to Arikus.
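The text does not say how the two sentence scores are computed, so the following Python sketch only illustrates the idea under stated assumptions: the position-independent score is taken to be the fraction of query terms a sentence contains, the position-dependent score is taken to decay with distance from the start of the document, and the combining formula and the names `score_sentences` and `best_summary` are hypothetical.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lower-case the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_sentences(query, document, position_weight=0.2):
    """Return (combined, independent, dependent, sentence) per sentence."""
    query_terms = set(tokenize(query))
    scored = []
    for index, sentence in enumerate(split_sentences(document)):
        words = set(tokenize(sentence))
        # Position-independent: share of query terms found in the sentence.
        if query_terms:
            independent = len(query_terms & words) / len(query_terms)
        else:
            independent = 0.0
        # Position-dependent: earlier sentences receive a higher value.
        dependent = 1.0 / (1 + index)
        # Assumed combination: position only boosts sentences that match.
        combined = independent * (1.0 + position_weight * dependent)
        scored.append((combined, independent, dependent, sentence))
    return scored


def best_summary(query, document):
    """Return the highest-scoring sentence, or "" for an empty document."""
    scored = score_sentences(query, document)
    if not scored:
        return ""
    return max(scored, key=lambda item: item[0])[3]
```

In this sketch a sentence that matches no query term scores zero regardless of its position, so a leading sentence is never chosen as the summary merely because it comes first.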
Inherent in the process of identifying the best summary is
the ability to determine a cut-off that identifies only the
most relevant and appropriate documents. The end user is therefore
spared the arduous task of looking through thousands
of returns and is instead given a short list of accurate, relevant,
and quality results.
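How the cut-off is derived is not described here. As a rough sketch of the general idea, one simple rule is to keep only results whose score is within a fixed fraction of the top score. The function name `apply_cutoff` and the `ratio` and `max_results` parameters below are assumptions, not part of the Arikus product.

```python
def apply_cutoff(ranked, ratio=0.5, max_results=10):
    """Trim a ranked result list to its most relevant entries.

    `ranked` is a list of (document_id, score) pairs sorted by
    descending score. Results scoring below `ratio` times the top
    score are dropped, and at most `max_results` are returned.
    """
    if not ranked:
        return []
    top_score = ranked[0][1]
    if top_score <= 0:
        # Nothing matched the query at all.
        return []
    kept = [(doc, score) for doc, score in ranked if score >= ratio * top_score]
    return kept[:max_results]
```

For example, `apply_cutoff([("a", 0.9), ("b", 0.8), ("c", 0.3), ("d", 0.0)])` keeps only `a` and `b`, because the remaining scores fall below half of the top score.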
In addition to its superior ranking algorithm, a unique feature
of Arikus’ technology is the ability to algorithmically
determine “related words” that are associated with
the query. Traditional information retrieval systems have pre-computed
synonym lists. These systems can fail when they do not take
into account the context in which the word is used. Determining
related words algorithmically lets Arikus and its users
refine the query and find the information they are
looking for faster and more efficiently.
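The contrast with a pre-computed synonym list can be shown with a small Python sketch that derives related words from the collection itself. The actual Arikus algorithm is not described here. This example assumes a simple co-occurrence "lift" measure (how much more often a word appears in documents containing the query word than in the collection overall), and the name `related_words` and its parameters are hypothetical.

```python
from collections import Counter


def related_words(query_word, documents, top_n=3, min_cooccurrence=2):
    """Return words that co-occur with `query_word` more than expected."""
    doc_sets = [set(d.lower().split()) for d in documents]
    total = len(doc_sets)
    with_query = [words for words in doc_sets if query_word in words]
    if not with_query:
        return []
    # Document frequency of every word in the whole collection.
    overall = Counter(w for words in doc_sets for w in words)
    # Document frequency restricted to documents containing the query word.
    together = Counter(
        w for words in with_query for w in words if w != query_word
    )
    scored = []
    for word, count in together.items():
        if count < min_cooccurrence:
            continue
        # Lift: P(word | query_word) divided by P(word).
        lift = (count / len(with_query)) / (overall[word] / total)
        scored.append((lift, word))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:top_n]]
```

Because the result depends on the collection, the same query word can yield different related words in different collections, which a fixed synonym list cannot do.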
If you’ve ever been led astray while conducting an intranet
search or a web site search for a specific topic, Arikus’
content-based retrieval solution will alleviate that frustration.
While many Web site and intranet searches are structured around
keywords, Arikus’ Information Refinery Engine goes one
step further.
Instead of dealing only with keywords, Arikus’ smart-retrieval
technology makes it possible to “understand” the
actual context and meaning of text, enabling searches
to deliver more accurate, pertinent, and efficient results.
Arikus Features
- We embrace new technologies, including agent technology and
WAP, and we have a wireless solution.
- Our solutions allow leading enterprise applications to
understand and route the business critical content that
exists in unstructured formats (eCRM, e-mail routing, e-business,
etc.).
- Arikus uses its technology to create a family of Knowledge
Management products: crawlers, indexers, summarizer, natural
language processing, categorizer, document hyperlinking,
profiler, OEM product, etc.
- Multilingual support: English, French, Italian, German,
Dutch, Spanish, Chinese, Japanese, Korean.
- Reporting: Know your site visitors and respond to their
needs.
- Security: Only authorized viewers can view documents.
- Automation of previously manual tasks: Reduce knowledge
management, opportunity, and personnel costs.
- Heterogeneous data types, disparate sources, over 290
file formats supported: Office Documents, PDF, HTML, XML, Databases, HTTP, File Systems,
Notes, and Exchange.
- Cross-platform functionality: Windows NT, Unix, Linux, and any POSIX-compliant OS on request.