The JSON Producer Stage, as its name suggests, produces a JSON array representation of TEXT_BLOCK items. Output can be filtered to entities only or to all tokens. The produced output is accessed programmatically (see below).


Operates On:  Every lexical Item in the graph.

Include Page
Generic Configuration Parameters


Include Page
Generic Producer Configuration Parameters

Configuration Parameters

  • onlyEntities (boolean, optional) - Defaults to false. If true, only tagged entities are included in the output. Otherwise, all tokens are included.
  • whitelist (String array, optional) - Defaults to empty. If non-empty, only entities whose names appear in the whitelist are added to the JSON output.
  • blacklist (String array, optional) - Defaults to empty. If non-empty, every entity is added to the JSON output except those named in the blacklist.
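
The way these three parameters interact can be sketched in Python. This only illustrates the descriptions above and is not the stage's actual implementation: the `select_items` helper and its argument names are hypothetical, the item layout is borrowed from the example output on this page, and the sketch assumes that the whitelist and blacklist apply only to items carrying an `entity` name.

```python
def select_items(items, only_entities=False, whitelist=(), blacklist=()):
    """Return the items that would reach the output (hypothetical helper)."""
    selected = []
    for item in items:
        entity = item.get("entity")  # None for plain, untagged tokens
        if entity is None:
            # Plain tokens are kept only when onlyEntities is false.
            if not only_entities:
                selected.append(item)
            continue
        # A non-empty whitelist admits only the listed entity names.
        if whitelist and entity not in whitelist:
            continue
        # A non-empty blacklist drops the listed entity names.
        if blacklist and entity in blacklist:
            continue
        selected.append(item)
    return selected


items = [
    {"text": "300 ml", "entity": "measurement", "startPos": 0, "endPos": 6},
    {"text": "of", "startPos": 7, "endPos": 9},
    {"text": "water", "startPos": 10, "endPos": 15},
]

print([i["text"] for i in select_items(items, only_entities=True)])
print([i["text"] for i in select_items(items, blacklist=["measurement"])])
```

With onlyEntities set, only the tagged "300 ml" item survives; with "measurement" blacklisted and onlyEntities left at its default, only the plain tokens remain.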


Example Configuration

{
  "type": "JsonProducerStage",
  "name": "JsonProducer",
  "boundaryFlags": [
    "TEXT_BLOCK_SPLIT"
  ],
  "onlyEntities": true,
  "queueTimeout": 10,
  "queueRetries": 1
}
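
Before deploying a configuration like this, it can be worth checking that the stage-specific parameters have the documented types. The following Python sketch does that for onlyEntities, whitelist, and blacklist only. The `check_json_producer_config` helper is hypothetical and not part of the product; it does not validate the generic parameters such as boundaryFlags or queueTimeout.

```python
import json


def check_json_producer_config(cfg):
    """Return a list of type problems found in a stage config (hypothetical helper)."""
    problems = []
    # onlyEntities is documented as an optional boolean defaulting to false.
    if not isinstance(cfg.get("onlyEntities", False), bool):
        problems.append("onlyEntities must be a boolean")
    # whitelist and blacklist are documented as optional String arrays.
    for key in ("whitelist", "blacklist"):
        value = cfg.get(key, [])
        if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
            problems.append(f"{key} must be an array of strings")
    return problems


config = json.loads("""
{
  "type": "JsonProducerStage",
  "name": "JsonProducer",
  "boundaryFlags": ["TEXT_BLOCK_SPLIT"],
  "onlyEntities": true,
  "queueTimeout": 10,
  "queueRetries": 1
}
""")

print(check_json_producer_config(config))
```

For the example configuration the returned list is empty, since onlyEntities is a boolean and both lists are omitted.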

Example Output

If you have a text block like the following:

V----------[300 ml of water]----------V 
^----------[300 ml of water]----------^ 
^-[300]-V---[ml]---V--[of]--V-[water]-^ 
^-[{#}]-^-[{unit}]-^-[have]-^ 
^-[{measurement}]--^ 

the stage will produce the following JSON (if onlyEntities = true):

{"entities":[{
    "text":"300 ml",
    "value":[
      {
          "value":"300",
          "entity":"#"
      },
      {
          "value":"mililiters",
          "entity":"unit"
      }
    ],
    "entity":"measurement",
    "startPos":0,
    "endPos":6
}]}
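
As a rough illustration of how a consumer might read this structure, the Python sketch below parses the sample and prints one summary line per entity. The `summarize_entities` helper is hypothetical, and the sketch assumes the JSON has already been obtained as a string; how the stage hands over its output is not covered here.

```python
import json

# Sample output copied from the example above (onlyEntities = true).
raw = """
{"entities":[{
    "text":"300 ml",
    "value":[
      {"value":"300", "entity":"#"},
      {"value":"mililiters", "entity":"unit"}
    ],
    "entity":"measurement",
    "startPos":0,
    "endPos":6
}]}
"""


def summarize_entities(raw_json):
    """Return one summary line per entity (hypothetical helper)."""
    lines = []
    for ent in json.loads(raw_json)["entities"]:
        # "value" holds the nested component entities, when present.
        parts = ", ".join(f'{c["entity"]}={c["value"]}' for c in ent.get("value", []))
        lines.append(
            f'{ent["entity"]} "{ent["text"]}" at {ent["startPos"]}-{ent["endPos"]}: {parts}'
        )
    return lines


for line in summarize_entities(raw):
    print(line)
```

The nested "value" array carries the component entities ("#" and "unit") that make up the enclosing "measurement" entity.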

or the following (if onlyEntities = false):

{"tokens":[
  {
    "text":"300 ml",
    "value":[
      {
          "value":"300",
          "entity":"#"
      },
      {
          "value":"mililiters",
          "entity":"unit"
      }
    ],
    "entity":"measurement",
    "startPos":0,
    "endPos":6
  },
  {
    "text":"of",
    "startPos":7,
    "endPos":9
  },
  {
    "text":"water",
    "startPos":10,
    "endPos":15
  }
]}
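
The startPos and endPos values in this sample behave like a half-open character range over the original text ("300 ml" spans 0 to 6). That reading is inferred from the sample rather than stated by the stage's documentation. Under that assumption, the Python sketch below checks each token against the source text and separates tagged entities from plain tokens; the variable names are illustrative only.

```python
import json

text = "300 ml of water"

# Sample output copied from the example above (onlyEntities = false).
raw = """
{"tokens":[
  {"text":"300 ml",
   "value":[{"value":"300","entity":"#"},{"value":"mililiters","entity":"unit"}],
   "entity":"measurement", "startPos":0, "endPos":6},
  {"text":"of", "startPos":7, "endPos":9},
  {"text":"water", "startPos":10, "endPos":15}
]}
"""

tokens = json.loads(raw)["tokens"]

# Assumption: endPos is exclusive, so slicing the source text with
# [startPos:endPos] should reproduce each token's "text".
for tok in tokens:
    assert text[tok["startPos"]:tok["endPos"]] == tok["text"]

# Tagged entities carry an "entity" field; plain tokens do not.
entities = [t["text"] for t in tokens if "entity" in t]
plain = [t["text"] for t in tokens if "entity" not in t]

print(entities)
print(plain)
```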