The Read Only Hash Table Lookup component loads an in-memory hash table for fast data lookups and adds the looked-up data to the document being processed. The hash table is loaded automatically at start-up from a tabular file or from a relational database SELECT statement. When used as a pipeline stage, the component takes the key from an existing XML element, looks up the entry in the hash table, and then maps the elements of the hash table's value array to attributes of the target element in the document being processed.

Read Only Hash Table Lookup
Factory Name | com.searchtechnologies.aspire:aspire-hash-table
subType | readOnlyAttribute
Inputs | AspireObject (when used as a pipeline stage)
Outputs | Attributes on the target element

Configuration

Element | Type | Default | Description
initialSize | int | 10000 (10 thousand) | The estimated initial size of the hash table, used to set its initial capacity. It is best to set this value large enough to hold all of the expected entries, which prevents additional hash table allocations and rehashing.
initializeFromTabularFile | boolean | false | Set this flag to true if you are initializing the hash table from a tabular file (e.g. a comma-separated or tab-separated file).
fileName | string | none | (requires initializeFromTabularFile = true) The path of the tabular file. A relative path is assumed to be relative to Aspire Home.
separator | string | tab | (requires initializeFromTabularFile = true) Either "comma", "tab", or a single character that specifies the column separator used in the file. For a CSV file, use "comma".

The tabular files use the Microsoft Excel conventions for specifying data. Specifically, data entries with embedded commas or tabs should be surrounded by double quotes. Data entries that contain double quotes should escape each double-quote character with a pair of double quotes.

Finally, if you want some other separator (for example, the pipe character or vertical bar, |, is popular), you can specify that single character in the <separator> tag as well.

hasColumnLabels | boolean | false | (requires initializeFromTabularFile = true) Set this flag to true if the first row of the tabular file contains column labels.
keyColumn | string | column0 | (requires initializeFromTabularFile = true) The name of the tabular file column that will be used for the hash table key.

If <hasColumnLabels> = false, then the column labels will be numbered starting with 1, as in "column1", "column2", "column3", etc.

<keyColumn> is also available when loading the hash table from the RDB. See below.

valueMap | Nested list of <column label=""/> tags | include all columns in the order in which they occur | (requires initializeFromTabularFile = true) The value map parent tag lets users choose exactly which columns are stored in the hash table (controlling memory usage) and the order of the columns in the value array.

Inside <valueMap>, list the desired columns with nested <column label=""/> tags. Only columns specified in the value map will be stored in the hash table. The order of the values in the hash table will be the same as the order of the <column> tags inside the value map.

Column labels will be either the labels specified in the file (if <hasColumnLabels> is true) or "column1", "column2", "column3", etc. otherwise.

initializeFromSQL | boolean | false | Set this flag to true if you are initializing the hash table from a SQL select statement.
connectionPoolName | string | none | (requires initializeFromSQL = true) The Aspire component name of the RDBMS Connection component that maintains the pool of RDB connections for the database to be queried.
sqlQuery | string | none | (requires initializeFromSQL = true) The SQL query used to read the data from the RDBMS to load the hash table. The order of the columns in the SQL table will be maintained in the list of values stored in the hash table.
keyColumn | string | none | (requires initializeFromSQL = true) The name of the SQL column from the "sqlQuery" query that will be used for the hash table key.
targetElement | string | none | (when used as a pipeline stage) The XML element from the document being processed that will be used as the key to look up the entry in the hash table. NOTE: Multiple target elements are allowed.
metadataMap | Metadata Mapper | none | Specifies the mapping of fields or columns from the original hash table
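
As a rough illustration of the tabular-file options described above, the fragment below sketches a configuration for a comma-separated file without column labels. The component name, file path, and file contents are hypothetical and only serve the example; the "column1", "column2" labels follow the numbering described for files without column labels.

 <component name="stateLookup" subType="readOnlyAttribute" factoryName="aspire-hash-table">
   <!-- Hypothetical file contents (comma-separated, no header row):
          1,Created,"Newly created, awaiting review"
          2,Approved,"Marked ""final"" by a reviewer"
        Entries with embedded commas are quoted; embedded double quotes are doubled. -->
   <initialSize>1000</initialSize>
   <initializeFromTabularFile>true</initializeFromTabularFile>
   <!-- Hypothetical path; a relative path is taken relative to Aspire Home -->
   <fileName>data/state-types.csv</fileName>
   <separator>comma</separator>
   <hasColumnLabels>false</hasColumnLabels>
   <keyColumn>column1</keyColumn>
   <valueMap>
     <!-- Store only the second column; the third is left out to save memory -->
     <column label="column2"/>
   </valueMap>
   <targetElement>/doc/STATE_ID</targetElement>
 </component>

Listing only the needed columns in <valueMap> keeps the in-memory table small, which matters most when the source file is wide.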

Example Configurations

 <component name="refGetLifeCycleText" subType="readOnlyAttribute" factoryName="aspire-hash-table">
   <initializeFromSQL>true</initializeFromSQL>
   <connectionPoolName>/rdbConnections/reference</connectionPoolName>
   <targetElement>/doc/STATE_ID</targetElement>
   <targetElement>/doc/PG_STATE_ID</targetElement>
   <targetElement>/doc/P_STATE_ID</targetElement>
   <targetElement>/doc/PV_STATE_ID</targetElement>
   <targetElement>/doc/MA_STATE_ID</targetElement>
   <targetElement>/doc/MAR_STATE_ID</targetElement>
   <sqlQuery>
      <![CDATA[
         SELECT
             id   AS ID,
             name AS NAME
         FROM
             REF.ref_entity_state_type
       ]]>
   </sqlQuery>
   <keyColumn>id</keyColumn>
 </component>
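
For comparison, a similar lookup could presumably be loaded from a file instead of a database. The sketch below assumes a hypothetical pipe-separated file whose first row holds the column labels id and name; the component name and file path are illustrative and not taken from an actual deployment.

 <component name="refGetLifeCycleTextFromFile" subType="readOnlyAttribute" factoryName="aspire-hash-table">
   <initializeFromTabularFile>true</initializeFromTabularFile>
   <!-- Hypothetical path; a relative path is taken relative to Aspire Home -->
   <fileName>data/ref_entity_state_type.txt</fileName>
   <!-- Any single character can be given as the separator -->
   <separator>|</separator>
   <!-- The first row of the assumed file is: id|name -->
   <hasColumnLabels>true</hasColumnLabels>
   <keyColumn>id</keyColumn>
   <valueMap>
     <column label="name"/>
   </valueMap>
   <targetElement>/doc/STATE_ID</targetElement>
   <targetElement>/doc/PG_STATE_ID</targetElement>
 </component>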

Also see the examples for the default subtype.

Example

Input fields:

 <PG_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></PG_STATE_ID>
 <P_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></P_STATE_ID>
 <PV_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></PV_STATE_ID>
 <MA_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></MA_STATE_ID>
 <MAR_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></MAR_STATE_ID>

Output fields:

 <PG_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></PG_STATE_ID>
 <P_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></P_STATE_ID>
 <PV_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></PV_STATE_ID>
 <MA_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></MA_STATE_ID>
 <MAR_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></MAR_STATE_ID>