Introduction


The Parquet Summarizer Executor can process the content of a Parquet file and extract each of its rows together with the table schema. Each extracted row is then processed by the summarizers attached to the job.


Temporary Files

The Parquet Summarizer Executor can download the content of the file into a local temporary file to reduce memory usage.
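
The general idea of spooling content to a temporary file can be sketched as follows. This is only an illustration of the technique, not the executor's actual implementation: the helper name spoolToTempFile and the "parquet-" file prefix are hypothetical, and only standard java.nio calls are used.

```groovy
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

// Hypothetical helper (not part of Aspire): copies a stream to a temporary
// file so the content does not have to be held in memory all at once.
Path spoolToTempFile(InputStream source) {
    Path target = Files.createTempFile("parquet-", ".tmp")
    // withCloseable closes the stream even if the copy fails
    source.withCloseable { input ->
        Files.copy(input, target, StandardCopyOption.REPLACE_EXISTING)
    }
    return target
}

// Usage: spool some example bytes, check the size, then clean up.
def payload = "example bytes".getBytes("UTF-8")
Path spooled = spoolToTempFile(new ByteArrayInputStream(payload))
try {
    assert Files.size(spooled) == payload.length
} finally {
    Files.deleteIfExists(spooled)
}
```

The file is streamed to disk rather than buffered, which is where the memory saving comes from; the temporary file should be deleted once processing finishes.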

Row Filtering

The Parquet Summarizer Executor can be configured with a Groovy script that filters which rows are processed.

Example:

Row Filter
// This script must return a boolean.
// The references of the job, doc, component, row and table objects are available.
// Javadoc references 
// Row (row) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Row.html
// Table (table) - http://{manager}/javadocs/com/accenture/aspire/services/summarization/Table.html
row.getBoolean("sensitive") == true
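
A null-safe variant of the filter above may be useful if the column can be missing or empty. The sketch below is self-contained so it can be run outside Aspire: StubRow is a hypothetical stand-in for the real Row object, and the assumption that getBoolean can return null for a missing value should be checked against the Row Javadoc.

```groovy
// Hypothetical stand-in for the Aspire Row object, used only to make this
// sketch runnable on its own. The real Row class is documented in the Javadoc.
class StubRow {
    Map<String, Boolean> values
    Boolean getBoolean(String name) { return values.get(name) }
}

// The filter logic as a closure. In an actual filter script, only the
// expression inside the closure would be needed as the last statement.
def keepRow = { row ->
    // Boolean.TRUE.equals(...) yields false for both false and null,
    // so rows without a "sensitive" value are skipped (assumed behavior).
    Boolean.TRUE.equals(row.getBoolean("sensitive"))
}

assert keepRow(new StubRow(values: [sensitive: true]))
assert !keepRow(new StubRow(values: [sensitive: false]))
assert !keepRow(new StubRow(values: [:]))
```

Writing the comparison this way keeps the script's result a strict boolean, which matches the requirement that the script must return a boolean.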