The JSON Producer Stage, as its name suggests, produces a JSON array representation of TEXT_BLOCK items. Output can be filtered to entities only or to all tokens. The produced output is accessed programmatically.


Operates On:  Every lexical Item in the graph.

Include Page
Generic Configuration Parameters

Include Page
Generic Producer Configuration Parameters

Configuration Parameters

  • onlyEntities (boolean, default: false)
    If true, only tagged entities are included in the output. Otherwise, all tokens are included.
  • whitelist (string array, default: empty)
    If non-empty, only entities whose names are in the whitelist are added to the JSON output.
  • blacklist (string array, default: empty)
    If non-empty, any entity is added to the JSON output, except for those in the blacklist.
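
The interplay of the three parameters can be sketched in a few lines of Python. This is only an illustration of the rules listed above, not the stage's actual implementation: the function name `select_items` and the item layout are hypothetical, and two points the page does not specify are assumed here, namely that the whitelist and blacklist are both applied when both are set, and that they filter named entities only, leaving plain tokens untouched.

```python
from typing import Any, Dict, List, Optional

Item = Dict[str, Any]


def select_items(
    items: List[Item],
    only_entities: bool = False,
    whitelist: Optional[List[str]] = None,
    blacklist: Optional[List[str]] = None,
) -> List[Item]:
    """Hypothetical helper mirroring the documented parameter rules."""
    allowed = set(whitelist or [])
    denied = set(blacklist or [])
    selected = []
    for item in items:
        name = item.get("entity")  # plain tokens carry no "entity" key
        if name is None:
            # Plain tokens are kept only when onlyEntities is false.
            if not only_entities:
                selected.append(item)
            continue
        # Assumption: an empty whitelist means "no restriction".
        if allowed and name not in allowed:
            continue
        # Assumption: the blacklist is applied after the whitelist.
        if name in denied:
            continue
        selected.append(item)
    return selected


items = [
    {"text": "300 ml", "entity": "measurement", "startPos": 0, "endPos": 6},
    {"text": "of", "startPos": 7, "endPos": 9},
    {"text": "water", "startPos": 10, "endPos": 15},
]

entities_only = select_items(items, only_entities=True)
without_measurements = select_items(items, blacklist=["measurement"])
```

With the sample items, `entities_only` holds just the measurement, while `without_measurements` holds the two plain tokens.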


Code Block
languagejs
"name": "JsonProducer",
"onlyEntities": true,
"queueTimeout": 10,
"queueRetries": 1

Example Output

If you have a text block like the following:

Code Block
languagetext
themeFadeToGrey
V----------[300 ml of water]----------V 
^----------[300 ml of water]----------^ 
^-[300]-V---[ml]---V--[of]--V-[water]-^ 
^-[{#}]-^-[{unit}]-^-[have]-^ 
^-[{measurement}]--^ 

the stage will produce the following JSON (if onlyEntities = true):

Code Block
languagejs

...

...

"entities":[{
  "text":"300 ml",
  "value":[
    {
      "value":"300",
      "entity":"#"
    },
    {
      "value":"mililiters",
      "entity":"unit"
    }
  ],
  "entity":"measurement",
  "startPos":0,
  "endPos":6
}]

...

or the following (if onlyEntities = false):

Code Block
languagejs

...

...

"tokens":[
  {
    "text":"300 ml",
    "value":[
      {
        "value":"300",
        "entity":"#"
      },
      {
        "value":"mililiters",
        "entity":"unit"
      }
    ],
    "entity":"measurement",
    "startPos":0,
    "endPos":6
  },
  {
    "text":"of",
    "startPos":7,
    "endPos":9
  },
  {
    "text":"water",
    "startPos":10,
    "endPos":15
  }
]

...
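
Since the output is consumed programmatically, a short Python sketch of reading the `tokens` form may help. It is illustrative only: the helper names `describe` and `components` are hypothetical, the fragment is assumed to sit inside an enclosing JSON object, and `endPos` is assumed to be exclusive, which is consistent with the offsets in the example above.

```python
import json

SOURCE_TEXT = "300 ml of water"

# The onlyEntities = false output from this page, wrapped in braces (assumption).
OUTPUT = """
{"tokens": [
  {"text": "300 ml",
   "value": [{"value": "300", "entity": "#"},
             {"value": "mililiters", "entity": "unit"}],
   "entity": "measurement", "startPos": 0, "endPos": 6},
  {"text": "of", "startPos": 7, "endPos": 9},
  {"text": "water", "startPos": 10, "endPos": 15}
]}
"""


def describe(item: dict, source: str) -> str:
    """Label an item and recover its text from the source offsets."""
    span = source[item["startPos"]:item["endPos"]]  # endPos treated as exclusive
    label = item.get("entity", "token")  # plain tokens have no "entity" key
    return f"{label}: {span!r}"


def components(item: dict) -> dict:
    """Map nested entity names to their values, e.g. '#' and 'unit'."""
    return {part["entity"]: part["value"] for part in item.get("value", [])}


doc = json.loads(OUTPUT)
summary = [describe(item, SOURCE_TEXT) for item in doc["tokens"]]
measurement_parts = components(doc["tokens"][0])

for line in summary:
    print(line)
```

Slicing the source text by `startPos` and `endPos` reproduces each item's `text` field, which makes a quick sanity check for the offsets.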