The Avro Summarizer Executor can be configured using the REST API.


Create Avro Summarizer Executor


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| type | Yes | - | No | The value must be "application". | "application" |
| _type | Yes | - | No | The value must be "application". | "application" |
| appName | Yes | - | No | The name of the application. | "Avro-Executor" |
| appType | Yes | - | No | The value must be "avro-summarize-executor". | "avro-summarize-executor" |
| config | Yes | - | No | The value must be "com.accenture.aspire:app-avrosummarize-executor". | "com.accenture.aspire:app-avrosummarize-executor" |
| description | Yes | - | No | The description. | "Avro-Executor" |
| properties | Yes | - | No | Configuration object. | |
| addSchema | Yes | true | No | If enabled, the table schema will be added to the processed columns. | true |
| debug | No | false | No | If enabled, debug messages will be logged. | true |
| threadPool | Yes | 5 | No | The number of threads to use for parallel processing. | 5 |
| logFrequency | Yes | 1000 | No | The frequency for reporting the processed rows. | 1000 |
| useSampling | Yes | false | No | Enable to process only a random sample of the table rows. This option could increase memory usage. | true |
| filterRows | Yes | true | No | Enable to filter the rows to process. | true |
| useFilterFile | Yes | true | No | Enable to use a Groovy file to filter the rows. | true |
| groovyPath | No | - | No | The path of the Groovy script that contains the filter logic. It must return a boolean value. If true, the row will be filtered. | "C:\\Aspire\\config\\rowsGroovyFilter.txt" |
| groovyScript | No | - | No | Script used to filter the rows. It must return a boolean value. If true, the row will be filtered. | "row.getBoolean(\"sensitive\") == true" |
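
The row-filter fields interact: the field descriptions suggest that groovyPath is used when useFilterFile is enabled and groovyScript otherwise. That pairing is an inference from the notes above, not something the API is confirmed to enforce. A minimal Python sketch of a client-side check under that assumption follows; the function name check_filter_properties is hypothetical.

```python
def check_filter_properties(properties):
    """Return a list of problems found in the row-filter settings (empty if none).

    Assumption: groovyPath is needed when useFilterFile is true and
    groovyScript is needed when it is false. A missing filterRows is
    treated here as "filtering disabled".
    """
    problems = []
    if not properties.get("filterRows", False):
        return problems  # filtering disabled, nothing to check
    if properties.get("useFilterFile", False):
        if not properties.get("groovyPath"):
            problems.append("useFilterFile is true but groovyPath is missing")
    elif not properties.get("groovyScript"):
        problems.append("useFilterFile is false but groovyScript is missing")
    return problems


inline_filter = {
    "filterRows": True,
    "useFilterFile": False,
    "groovyScript": 'row.getBoolean("sensitive") == true',
}
print(check_filter_properties(inline_filter))
```

Running a check like this before sending the request can catch an incomplete filter configuration early, but the server remains the authority on what it accepts.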

Example

POST /aspire/_api/workflows/{workflow}/rules
{
  "type": "application",
  "_type": "application",
  "description": "Avro-Executor",
  "config": "com.accenture.aspire:app-avrosummarize-executor",
  "appType": "avro-summarize-executor",
  "appName": "Avro Summarize Executor",
  "properties": {
    "addSchema": true,
    "debug": true,
    "threadPool": 5,
    "logFrequency": 1000,
    "useSampling": true,
    "filterRows": true,
    "useFilterFile": false,
    "groovyScript": "// This script must return a boolean.\n// The references of the job, doc, component, row and table objects are available.\n// Javadoc references \n// Row (row) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Row.html\n// Table (table) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Table.html\nrow.getBoolean(\"sensitive\") == true"
  }
}
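
As a rough illustration of how the create request could be issued from a script, the Python sketch below prepares the POST with the standard library. The host name, the workflow ID, and the helper build_create_request are hypothetical, the appType and config values should be confirmed against your installation, and authentication is not covered here.

```python
import json
import urllib.request

# Hypothetical values; replace them with your own server and workflow.
BASE_URL = "http://aspire.example.com"
WORKFLOW_ID = "my-workflow-id"


def build_create_request(base_url, workflow_id, rule):
    """Prepare (but do not send) the POST request that creates the executor rule."""
    url = f"{base_url}/aspire/_api/workflows/{workflow_id}/rules"
    body = json.dumps(rule).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


rule = {
    "type": "application",
    "_type": "application",
    "description": "Avro-Executor",
    "config": "com.accenture.aspire:app-avrosummarize-executor",
    "appType": "avro-summarize-executor",
    "appName": "Avro Summarize Executor",
    "properties": {"addSchema": True, "threadPool": 5, "logFrequency": 1000},
}

request = build_create_request(BASE_URL, WORKFLOW_ID, rule)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the server.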

Update Avro Summarizer Executor


| Field | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| id | Yes | - | No | ID of the application to update. | "61014782-442a-4587-ab85-ba1439a7f7b5" |
| type | Yes | - | No | The value must be "application". | "application" |
| _type | Yes | - | No | The value must be "application". | "application" |
| appName | Yes | - | No | The name of the application. | "Avro-Executor" |
| appType | Yes | - | No | The value must be "avro-summarize-executor". | "avro-summarize-executor" |
| config | Yes | - | No | The value must be "com.accenture.aspire:app-avrosummarize-executor". | "com.accenture.aspire:app-avrosummarize-executor" |
| description | Yes | - | No | The description. | "Avro-Executor" |
| properties | Yes | - | No | Configuration object. | |
| addSchema | Yes | true | No | If enabled, the table schema will be added to the processed columns. | true |
| debug | No | false | No | If enabled, debug messages will be logged. | true |
| threadPool | Yes | 5 | No | The number of threads to use for parallel processing. | 5 |
| logFrequency | Yes | 1000 | No | The frequency for reporting the processed rows. | 1000 |
| useSampling | Yes | false | No | Enable to process only a random sample of the table rows. This option could increase memory usage. | true |
| filterRows | Yes | true | No | Enable to filter the rows to process. | true |
| useFilterFile | Yes | true | No | Enable to use a Groovy file to filter the rows. | true |
| groovyPath | No | - | No | The path of the Groovy script that contains the filter logic. It must return a boolean value. If true, the row will be filtered. | "C:\\Aspire\\config\\rowsGroovyFilter.txt" |
| groovyScript | No | - | No | Script used to filter the rows. It must return a boolean value. If true, the row will be filtered. | "row.getBoolean(\"sensitive\") == true" |
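
Because groovyScript travels as a JSON string, quotes and line breaks inside the Groovy code have to be escaped. One way to avoid hand-escaping is to let a JSON library do it. The Python sketch below shows the idea; the variable names are illustrative, and the script body is a shortened version of the example script used on this page.

```python
import json

# Multi-line Groovy filter written as ordinary text.
groovy_script = "\n".join([
    "// This script must return a boolean.",
    'row.getBoolean("sensitive") == true',
])

properties = {
    "filterRows": True,
    "useFilterFile": False,
    "groovyScript": groovy_script,
}

# json.dumps escapes the embedded quotes and newlines for us.
encoded = json.dumps(properties, indent=2)
print(encoded)
```

Decoding the result with json.loads returns the original script text unchanged, which is a quick way to confirm the escaping round-trips.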

Example

PUT /aspire/_api/workflows/{workflow}/rules/{id}
{
  "id": "61014782-442a-4587-ab85-ba1439a7f7b5",
  "type": "application",
  "_type": "application",
  "description": "Avro-Executor",
  "config": "com.accenture.aspire:app-avrosummarize-executor",
  "appType": "avro-summarize-executor",
  "appName": "Avro Summarize Executor",
  "properties": {
    "addSchema": true,
    "debug": true,
    "threadPool": 5,
    "logFrequency": 1000,
    "useSampling": true,
    "filterRows": true,
    "useFilterFile": false,
    "groovyScript": "// This script must return a boolean.\n// The references of the job, doc, component, row and table objects are available.\n// Javadoc references \n// Row (row) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Row.html\n// Table (table) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Table.html\nrow.getBoolean(\"sensitive\") == true"
  }
}
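
The update call differs from the create call in two ways: it uses PUT, and the ID appears both in the URL and in the body. The Python sketch below prepares such a request with the standard library, taking the ID for the URL from the body so the two cannot drift apart. The host name, the workflow ID, and the helper build_update_request are hypothetical, and authentication is not shown.

```python
import json
import urllib.request


def build_update_request(base_url, workflow_id, rule):
    """Prepare (but do not send) the PUT request that updates an executor rule."""
    rule_id = rule["id"]  # raises KeyError if the body lacks the required id
    url = f"{base_url}/aspire/_api/workflows/{workflow_id}/rules/{rule_id}"
    body = json.dumps(rule).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


updated_rule = {
    "id": "61014782-442a-4587-ab85-ba1439a7f7b5",
    "type": "application",
    "_type": "application",
    "description": "Avro-Executor",
    "config": "com.accenture.aspire:app-avrosummarize-executor",
    "appType": "avro-summarize-executor",
    "appName": "Avro Summarize Executor",
    "properties": {"addSchema": True, "threadPool": 5, "logFrequency": 1000},
}

# Hypothetical server and workflow; replace with your own values.
update_request = build_update_request(
    "http://aspire.example.com", "my-workflow-id", updated_rule
)
print(update_request.get_method(), update_request.full_url)
# To send it: urllib.request.urlopen(update_request)
```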