The Xml Summarizer Executor can be configured using the REST API.


Create Xml Summarizer Executor


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "application". | "application" |
| _type | Yes | - | No | The value must be "application". | "application" |
| appName | Yes | - | No | The name of the application. | "Xml-Executor" |
| appType | Yes | - | No | The value must be "xml-summarize-executor". | "xml-summarize-executor" |
| config | Yes | - | No | The value must be "com.accenture.aspire:app-xmlsummarize-executor". | "com.accenture.aspire:app-xmlsummarize-executor" |
| description | Yes | - | No | The description. | "Xml-Executor" |
| properties | Yes | - | No | Configuration object. The fields below belong to this object. | |
| rootNode | Yes | "/" | No | The root node which contains the sub-jobs to publish. | "/path/rootNode/" |
| characterEncoding | No | "UTF-8" | No | The character encoding of the XML file to be read, if not UTF-8. | "UTF-8" |
| cleanse | No | true | No | Enable to clean non-readable characters (for example, ASCII code 15) from the XML content. | true |
| honorDTD | No | true | No | Fetch the XML's DTD. | true |
| limitNested | No | false | No | Limit how many levels of a nested structure are flattened. | false |
| maxLevel | No | 10 | No | The maximum nesting level to flatten. | 10 |
| limitArrays | No | false | No | Limit how many entries in array structures are processed. | false |
| arraysLimit | No | 10 | No | The maximum number of array entries to process. | 10 |
| debug | No | false | No | Enable debug messages. | false |
| threadPool | No | 5 | No | The number of threads to use for parallel processing. | 5 |
| logFrequency | No | 5 | No | The frequency for reporting the processed rows. | 5 |
| useSampling | No | false | No | Process only a random sample of the table rows. This option can increase memory usage. | false |
| minimumSamples | No | 10 | No | The minimum number of random samples gathered from the table. | 10 |
| maxSamples | No | 2000 | No | The maximum number of random samples gathered from the table. | 2000 |
| minimumPercent | No | 0.35 | No | The minimum percentage of the total rows to process from the table. | 0.35 |
| limitRows | No | false | No | Limit how many rows are read from the table. | false |
| maxRowsToRead | No | 10 | No | The maximum number of rows read from the table. | 10 |
| filterRows | No | false | No | Enable to filter the rows to process. | false |
| useFilterFile | No | true | No | Enable to use a Groovy file to filter the rows. | true |
| useScriptFile | No | true | No | Enable to specify a script file, or disable to specify an uploaded resource file. | true |
| groovyPath | Yes | - | No | The path of the Groovy script that contains the filter logic. The script must return a boolean value. If true, the row will be filtered. | "C:\\Aspire\\config\\rowsGroovyFilter.txt" |
| groovyScript | No | - | No | Script used to filter the rows. It must return a boolean value. If true, the row will be filtered. | "row.getBoolean(\"sensitive\") == true" |
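The fixed values in the table can be checked before a request is sent. The following is a minimal sketch, not part of Aspire: the function name `check_rule` and the constant names are hypothetical, and the check only covers the required top-level fields plus `rootNode`.

```python
# Hypothetical pre-flight check for a rule body, based on the field table.
# Fields whose value is fixed by the documentation.
REQUIRED_FIXED = {
    "type": "application",
    "_type": "application",
    "appType": "xml-summarize-executor",
    "config": "com.accenture.aspire:app-xmlsummarize-executor",
}
# Required fields whose value is chosen by the caller.
REQUIRED_FREE = ("appName", "description", "properties")


def check_rule(body):
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    for field, expected in REQUIRED_FIXED.items():
        if body.get(field) != expected:
            problems.append(f'{field} must be "{expected}"')
    for field in REQUIRED_FREE:
        if field not in body:
            problems.append(f"{field} is required")
    # rootNode is listed as required inside the properties object.
    if "rootNode" not in body.get("properties", {}):
        problems.append("properties.rootNode is required")
    return problems


example = {
    "type": "application",
    "_type": "application",
    "appName": "Xml-Executor",
    "appType": "xml-summarize-executor",
    "config": "com.accenture.aspire:app-xmlsummarize-executor",
    "description": "Xml-Executor",
    "properties": {"rootNode": "/"},
}
print(check_rule(example))
```

A check like this only catches missing or mistyped fields on the client side; the server remains the authority on what it accepts.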

Example

POST /aspire/_api/workflows/{workflow}/rules

```json
{
  "type": "application",
  "_type": "application",
  "description": "Xml-Executor",
  "config": "com.accenture.aspire:app-xmlsummarize-executor",
  "appType": "xml-summarize-executor",
  "appName": "Xml Summarize Executor",
  "properties": {
    "rootNode": "/",
    "characterEncoding": "UTF-8",
    "cleanse": true,
    "honorDTD": true,
    "limitNested": false,
    "limitArrays": false,
    "debug": false,
    "threadPool": 5,
    "logFrequency": 5,
    "useSampling": false,
    "filterRows": true,
    "useFilterFile": false,
    "groovyScript": "// This script must return a boolean.\n// The references of the job, doc, component, row and table objects are available.\n// Javadoc references \n// Row (row) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Row.html\n// Table (table) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Table.html\nrow.getBoolean(\"sensitive\") == true"
  }
}
```
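The create call can be scripted with any HTTP client. The sketch below uses only the Python standard library and builds the request without sending it. The base URL, the workflow identifier, and the function name `build_create_request` are assumptions for illustration; authentication is not shown because the page does not describe it.

```python
import json
import urllib.request

# Hypothetical values: replace with your own Aspire host and workflow.
BASE_URL = "http://localhost:50505"
WORKFLOW_ID = "my-workflow-id"


def build_create_request(base_url, workflow, body):
    """Build (but do not send) the POST request that creates the executor rule."""
    url = f"{base_url}/aspire/_api/workflows/{workflow}/rules"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A reduced body: only a few of the optional properties are set.
create_body = {
    "type": "application",
    "_type": "application",
    "description": "Xml-Executor",
    "config": "com.accenture.aspire:app-xmlsummarize-executor",
    "appType": "xml-summarize-executor",
    "appName": "Xml Summarize Executor",
    "properties": {
        "rootNode": "/",
        "characterEncoding": "UTF-8",
        "cleanse": True,
        "filterRows": False,
    },
}

request = build_create_request(BASE_URL, WORKFLOW_ID, create_body)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before anything reaches the server.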

Update Xml Summarizer Executor


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | Yes | - | No | ID of the application to update. | "61014782-442a-4587-ab85-ba1439a7f7b5" |
| type | Yes | - | No | The value must be "application". | "application" |
| _type | Yes | - | No | The value must be "application". | "application" |
| appName | Yes | - | No | The name of the application. | "Xml-Executor" |
| appType | Yes | - | No | The value must be "xml-summarize-executor". | "xml-summarize-executor" |
| config | Yes | - | No | The value must be "com.accenture.aspire:app-xmlsummarize-executor". | "com.accenture.aspire:app-xmlsummarize-executor" |
| description | Yes | - | No | The description. | "Xml-Executor" |
| properties | Yes | - | No | Configuration object. The fields below belong to this object. | |
| rootNode | Yes | "/" | No | The root node which contains the sub-jobs to publish. | "/path/rootNode/" |
| characterEncoding | No | "UTF-8" | No | The character encoding of the XML file to be read, if not UTF-8. | "UTF-8" |
| cleanse | No | true | No | Enable to clean non-readable characters (for example, ASCII code 15) from the XML content. | true |
| honorDTD | No | true | No | Fetch the XML's DTD. | true |
| limitNested | No | false | No | Limit how many levels of a nested structure are flattened. | false |
| maxLevel | No | 10 | No | The maximum nesting level to flatten. | 10 |
| limitArrays | No | false | No | Limit how many entries in array structures are processed. | false |
| arraysLimit | No | 10 | No | The maximum number of array entries to process. | 10 |
| debug | No | false | No | Enable debug messages. | false |
| threadPool | No | 5 | No | The number of threads to use for parallel processing. | 5 |
| logFrequency | No | 5 | No | The frequency for reporting the processed rows. | 5 |
| useSampling | No | false | No | Process only a random sample of the table rows. This option can increase memory usage. | false |
| minimumSamples | No | 10 | No | The minimum number of random samples gathered from the table. | 10 |
| maxSamples | No | 2000 | No | The maximum number of random samples gathered from the table. | 2000 |
| minimumPercent | No | 0.35 | No | The minimum percentage of the total rows to process from the table. | 0.35 |
| limitRows | No | false | No | Limit how many rows are read from the table. | false |
| maxRowsToRead | No | 10 | No | The maximum number of rows read from the table. | 10 |
| filterRows | No | false | No | Enable to filter the rows to process. | false |
| useFilterFile | No | true | No | Enable to use a Groovy file to filter the rows. | true |
| useScriptFile | No | true | No | Enable to specify a script file, or disable to specify an uploaded resource file. | true |
| groovyPath | Yes | - | No | The path of the Groovy script that contains the filter logic. The script must return a boolean value. If true, the row will be filtered. | "C:\\Aspire\\config\\rowsGroovyFilter.txt" |
| groovyScript | No | - | No | Script used to filter the rows. It must return a boolean value. If true, the row will be filtered. | "row.getBoolean(\"sensitive\") == true" |

Example

PUT /aspire/_api/workflows/{workflow}/rules/{id}

```json
{
  "id": "61014782-442a-4587-ab85-ba1439a7f7b5",
  "type": "application",
  "_type": "application",
  "description": "Xml-Executor",
  "config": "com.accenture.aspire:app-xmlsummarize-executor",
  "appType": "xml-summarize-executor",
  "appName": "Xml Summarize Executor",
  "properties": {
    "rootNode": "/",
    "characterEncoding": "UTF-8",
    "cleanse": true,
    "honorDTD": true,
    "limitNested": true,
    "maxLevel": 10,
    "limitArrays": true,
    "arraysLimit": 10,
    "debug": true,
    "threadPool": 5,
    "logFrequency": 5,
    "useSampling": true,
    "minimumSamples": 10,
    "maxSamples": 2000,
    "minimumPercent": 0.35,
    "limitRows": true,
    "maxRowsToRead": 10,
    "filterRows": true,
    "useFilterFile": false,
    "groovyScript": "// This script must return a boolean.\n// The references of the job, doc, component, row and table objects are available.\n// Javadoc references \n// Row (row) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Row.html\n// Table (table) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Table.html\nrow.getBoolean(\"sensitive\") == true"
  }
}
```
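An update differs from a create in two ways: the HTTP method is PUT, and the rule ID appears both in the URL path and in the body. The sketch below illustrates this with the Python standard library, again without sending the request. The base URL, workflow identifier, and the function name `build_update_request` are assumptions; authentication is not shown.

```python
import json
import urllib.request

# Hypothetical values: replace with your own Aspire host and workflow.
BASE_URL = "http://localhost:50505"
WORKFLOW_ID = "my-workflow-id"
RULE_ID = "61014782-442a-4587-ab85-ba1439a7f7b5"  # ID from the example above


def build_update_request(base_url, workflow, rule_id, properties):
    """Build (but do not send) the PUT request that updates an existing rule."""
    body = {
        "id": rule_id,  # the same ID is also used in the URL path
        "type": "application",
        "_type": "application",
        "description": "Xml-Executor",
        "config": "com.accenture.aspire:app-xmlsummarize-executor",
        "appType": "xml-summarize-executor",
        "appName": "Xml Summarize Executor",
        "properties": properties,
    }
    url = f"{base_url}/aspire/_api/workflows/{workflow}/rules/{rule_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example: turn on sampling with the documented default limits.
update = build_update_request(
    BASE_URL,
    WORKFLOW_ID,
    RULE_ID,
    {
        "rootNode": "/",
        "useSampling": True,
        "minimumSamples": 10,
        "maxSamples": 2000,
        "minimumPercent": 0.35,
    },
)
print(update.get_method(), update.full_url)
# To actually send it: urllib.request.urlopen(update)
```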