The JSON Producer stage, as its name suggests, produces a JSON array representation of TEXT_BLOCK items. The output can be restricted to entities only or can include all tokens. The produced output is accessed programmatically.


Operates On:  Every lexical Item in the graph.

Generic Configuration Parameters and Generic Producer Configuration Parameters also apply to this stage.

Configuration Parameters

  • onlyEntities (boolean, default: false)
    If true, only tagged entities are included in the output. Otherwise, all tokens are included.
  • whitelist (string array, default: empty)
    If non-empty, only entities whose names appear in the whitelist are added to the JSON output.
  • blacklist (string array, default: empty)
    If non-empty, every entity is added to the JSON output except those named in the blacklist.
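
The way the three parameters interact can be sketched in a few lines of Python. This is only an illustration of the descriptions above, not the stage's implementation: the function name `select_items` and the item shape (a dict with an optional "entity" key) are hypothetical, and it assumes that the whitelist and blacklist apply to entities only and can be combined.

```python
def select_items(items, only_entities=False, whitelist=(), blacklist=()):
    """Return the items kept under the ASSUMED filter rules described above."""
    kept = []
    for item in items:
        entity = item.get("entity")  # plain tokens carry no "entity" key
        if entity is None:
            # Assumption: plain tokens are dropped only when onlyEntities is true.
            if not only_entities:
                kept.append(item)
            continue
        # Assumption: an empty whitelist or blacklist disables that filter.
        if whitelist and entity not in whitelist:
            continue
        if blacklist and entity in blacklist:
            continue
        kept.append(item)
    return kept


# Hypothetical items modeled on the "300 ml of water" example on this page.
SAMPLE = [
    {"text": "300 ml", "entity": "measurement"},
    {"text": "of"},
    {"text": "water"},
]

entities_only = select_items(SAMPLE, only_entities=True)
without_measurements = select_items(SAMPLE, blacklist=["measurement"])
print([item["text"] for item in entities_only])         # ['300 ml']
print([item["text"] for item in without_measurements])  # ['of', 'water']
```

Whether the real stage lets a whitelist and a blacklist be set at the same time is not stated on this page, so treat the combined case as a guess.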


Code Block
languagejs
"name": "JsonProducer",
"onlyEntities": true,
"queueTimeout": 10,
"queueRetries": 1

Example Output

If you have a text block like the following:

Code Block
languagetext
themeFadeToGrey
V----------[300 ml of water]----------V 
^----------[300 ml of water]----------^ 
^-[300]-V---[ml]---V--[of]--V-[water]-^ 
^-[{#}]-^-[{unit}]-^-[have]-^ 
^-[{measurement}]--^ 

the stage will produce the following JSON (if onlyEntities = true):

Code Block
languagejs
...
"entities":[{
  "text":"300 ml",
  "value":[
    {
      "value":"300",
      "entity":"#"
    },
    {
      "value":"mililiters",
      "entity":"unit"
    }
  ],
  "entity":"measurement",
  "startPos":0,
  "endPos":6
}]
...

or the following (if onlyEntities = false):

Code Block
languagejs
...
"tokens":[
  {
    "text":"300 ml",
    "value":[
      {
        "value":"300",
        "entity":"#"
      },
      {
        "value":"mililiters",
        "entity":"unit"
      }
    ],
    "entity":"measurement",
    "startPos":0,
    "endPos":6
  },
  {
    "text":"of",
    "startPos":7,
    "endPos":9
  },
  {
    "text":"water",
    "startPos":10,
    "endPos":15
  }
]
...
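
Since the output is consumed programmatically, a small Python sketch may help show how the fields fit together. It wraps the "tokens" fragment from the example in braces so that it parses as a JSON object, which is an assumption about the surrounding document. It also assumes that startPos and endPos are character offsets into the text block with an exclusive end, as the example values suggest. The helper names `entity_rows` and `annotate` are hypothetical.

```python
import json

TEXT = "300 ml of water"

# The example fragment, wrapped in braces (an assumption) so it is valid JSON.
RAW = """
{"tokens": [
  {"text": "300 ml",
   "value": [{"value": "300", "entity": "#"},
             {"value": "mililiters", "entity": "unit"}],
   "entity": "measurement", "startPos": 0, "endPos": 6},
  {"text": "of", "startPos": 7, "endPos": 9},
  {"text": "water", "startPos": 10, "endPos": 15}
]}
"""


def entity_rows(item, depth=0):
    """Yield (depth, entity name, value) for an entity and its nested parts."""
    if "entity" not in item:
        return  # plain token, nothing to report
    value = item.get("value")
    if isinstance(value, list):
        # Composite entity: report its surface text, then descend into parts.
        yield depth, item["entity"], item.get("text")
        for part in value:
            yield from entity_rows(part, depth + 1)
    else:
        yield depth, item["entity"], value


def annotate(text, tokens):
    """Mark entity spans as [span](entity), using the assumed offsets."""
    out, cursor = [], 0
    for tok in sorted(tokens, key=lambda t: t["startPos"]):
        out.append(text[cursor:tok["startPos"]])  # text between tokens
        span = text[tok["startPos"]:tok["endPos"]]
        out.append(f"[{span}]({tok['entity']})" if "entity" in tok else span)
        cursor = tok["endPos"]
    out.append(text[cursor:])
    return "".join(out)


tokens = json.loads(RAW)["tokens"]
rows = [row for tok in tokens for row in entity_rows(tok)]
marked = annotate(TEXT, tokens)
print(rows)
print(marked)  # [300 ml](measurement) of water
```

The same `entity_rows` helper should work on the "entities" array produced with onlyEntities = true, since each entry there has the same shape as an entity token.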