Overview

UNDER CONSTRUCTION

The Saga Language Processing Toolkit (Saga Library) processes raw text into normalized tokens, entities and semantic tags. Its output can be used for question answering, full-text analysis (fact extraction), semantic search, content vectors and matching, and similar downstream tasks.

Handles the full range of text processing
- Token extraction and cleansing, entity extraction, syntactic analysis and semantic analysis
Scalable to extremely large dictionaries and pattern databases (tens of millions of patterns or more)
- Makes it possible to build patterns from machine learning algorithms
Disambiguation is a first-class citizen
- Keeps every interpretation at all times (nothing is thrown away)
- Multiple disambiguation methods
Confidence is captured at every step
- Confidence builds up as patterns are matched
Fast enough to process documents for full database scans

Use Cases:

Query interpretation
Question answering
Chatbots
Full document fact extraction
Vector generation for statistical analysis and machine learning

Getting Started

The Language Processing Toolkit is a lean Java library that can be used anywhere.
