All confidence values range from 0.0 to 1.0

Confidence = Confidence factor that the interpretation is correct

Note: Currently it’s a factor, not a probability

  • FUTURE WORK:  Make confidence a true probability score

Note: Multiple different interpretations can be correct at the same time

  • Example:  A Token interpretation and a Semantic Interpretation

Confidence Adjustment (ConfAdj) = Human-adjustable configuration parameter to adjust (boost or reduce) confidence values

  • Confidence Adjustment < 1.0 : Reduce confidence
    • Multiplied by underlying confidence value
    • For example:  0.7 = “new confidence is 70% of old confidence”
  • Confidence Adjustment = 1.0 : Leave confidence alone
  • Confidence Adjustment > 1.0 : Increase confidence
    • The amount above 1.0 is the fraction of the remaining gap to 1.0 that is added to the confidence
    • For example:  1.3 = “move 30% of the way towards 1.0”

Confidence Adjustment Examples

Original Confidence = 0.5

If ConfAdj = 0.7
  • New Confidence = 0.35
  • (0.5 * 0.7) = 0.35

If ConfAdj = 1.3

  • New Confidence = 0.65
  • 0.5 + (1.3 - 1) * (1 - 0.5) = 0.65

If ConfAdj = 0

  • New Confidence = 0

If ConfAdj = 1.0

  • Confidence is unchanged
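
The two rules above can be sketched as a single function. This is a minimal illustration, not the product's actual implementation: the name `adjust_confidence`, the input validation, and the clamp at 1.0 (which only matters for ConfAdj values above 2.0, a case this page does not cover) are all assumptions.

```python
def adjust_confidence(confidence: float, conf_adj: float) -> float:
    """Apply a Confidence Adjustment (ConfAdj) to a confidence value.

    Hypothetical helper: the name, validation, and clamping are assumptions.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if conf_adj < 0.0:
        raise ValueError("conf_adj must not be negative")

    if conf_adj <= 1.0:
        # Reduce (or leave unchanged): multiply the underlying confidence.
        return confidence * conf_adj

    # Increase: move (conf_adj - 1) of the way from the current value to 1.0.
    boosted = confidence + (conf_adj - 1.0) * (1.0 - confidence)
    # Assumption: clamp so that ConfAdj > 2.0 cannot push the result past 1.0.
    return min(boosted, 1.0)


if __name__ == "__main__":
    for adj in (0.7, 1.3, 0.0, 1.0):
        print(f"ConfAdj = {adj}: {adjust_confidence(0.5, adj):.2f}")
```

Run with an original confidence of 0.5, the loop reproduces the four cases listed above (0.35, 0.65, 0.00, and 0.50).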

Current philosophy on confidence calculations

Original Text & Token Confidence = 0.5

Stages that increase ambiguity have ConfAdj < 1.0

  • Lower Case, Lemmatize: ConfAdj = 0.9

Splitting does not change confidence

  • Char Change Splitter, Advanced Splitter: ConfAdj = 1.0

Less Useful Items have lowered confidence

  • Stop Words: ConfAdj = 0.8

Simple Recognizers increase confidence

  • Email, date, and number recognizers: ConfAdj = 1.1 (default)
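
To see how these defaults might combine, the sketch below starts from the original confidence of 0.5 and applies each stage's ConfAdj in turn. Two things here are assumptions rather than statements from this page: that adjustments from successive stages are applied one after another, and the stage key names (such as `lower_case`), which are purely illustrative.

```python
from functools import reduce

BASE_CONFIDENCE = 0.5  # original text and token confidence

# Default ConfAdj values from this page; the key names are hypothetical.
STAGE_CONF_ADJ = {
    "lower_case": 0.9,
    "lemmatize": 0.9,
    "char_change_splitter": 1.0,
    "advanced_splitter": 1.0,
    "stop_words": 0.8,
    "email_recognizer": 1.1,
    "date_recognizer": 1.1,
    "number_recognizer": 1.1,
}


def adjust_confidence(confidence: float, conf_adj: float) -> float:
    """Scale down for ConfAdj <= 1.0, otherwise move part of the way to 1.0."""
    if conf_adj <= 1.0:
        return confidence * conf_adj
    # Clamping at 1.0 is an assumption for ConfAdj values above 2.0.
    return min(confidence + (conf_adj - 1.0) * (1.0 - confidence), 1.0)


def confidence_after(stages, start=BASE_CONFIDENCE):
    """Apply each stage's adjustment in order (assumed chaining behavior)."""
    return reduce(
        lambda conf, stage: adjust_confidence(conf, STAGE_CONF_ADJ[stage]),
        stages,
        start,
    )


if __name__ == "__main__":
    print(confidence_after(["lower_case", "stop_words"]))  # about 0.36
    print(confidence_after(["email_recognizer"]))          # about 0.55
```

Under these assumptions, a lower-cased stop word ends up well below the starting confidence, while a token matched by a simple recognizer ends up slightly above it, which matches the intent described in this section.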
