The Read Only Hash Table Lookup component loads an in-memory hash table for fast lookups of data to add to the document being processed. The hash table is loaded automatically at start-up from a tabular file or a relational database select. When used as a pipeline stage, the component takes the key from an existing XML element, looks up the entry in the hash table, and then maps the elements of the hash table's value array to attributes of the target element in the document being processed.

Read Only Hash Table Lookup
Factory Name	com.searchtechnologies.aspire:aspire-hash-table
subType	readOnlyAttribute
Inputs	AspireObject (when used as a pipeline stage)
Outputs	Attributes on the target element

Configuration

Element	Type	Default	Description
initialSize	int	10000 (10 thousand)	The estimated initial size for the hash table, used to specify its initial capacity. It is best to set this value large enough to contain all of the expected entries in the hash table. This will prevent additional hash table allocations and rehashing.
initializeFromTabularFile	boolean	false	Set this flag to true if you are initializing the hash table from a tabular file (i.e. a comma-separated or tab-separated file).
fileName	string	none	(requires initializeFromTabularFile = true) The file name where the tabular file can be located. If a relative path is specified, this is assumed to be relative to Aspire Home.
separator	string	tab	(requires initializeFromTabularFile = true) Either "comma", "tab", or a single character that specifies the column separator used in the file. For a CSV file, use "comma". Tabular files follow the Microsoft Excel conventions for specifying data: entries with embedded commas or tabs should be surrounded by double quotes, and entries that contain double quotes should escape each double-quote character with a pair of double quotes. To use some other separator (for example, the pipe character or vertical bar, |, is popular), specify that single character in the <separator> tag.
hasColumnLabels	boolean	false	(requires initializeFromTabularFile = true) Set this flag to true if the first row of the tabular file contains column labels.
keyColumn	string	column0	(requires initializeFromTabularFile = true) The name of the tabular file column which will be used for the hash table key. If <hasColumnLabels> = false, then the column labels will be numbered starting with 1, as in "column1", "column2", "column3", etc. <keyColumn> is also available when loading the hash table from the RDB. See below.
valueMap	Nested list of <column label=""/> tags	include all columns in the order in which they occur	(requires initializeFromTabularFile = true) The value map parent tag lets users choose exactly which columns are stored in the hash table (controlling memory usage) and the order of the columns in the value array. Inside <valueMap>, list the desired columns with nested <column label=""/> tags. Only columns specified in the value map are stored in the hash table, and the values are stored in the same order as the <column> tags inside the value map. Column labels are either the labels specified in the file (if <hasColumnLabels> is true) or "column1", "column2", "column3", etc. otherwise.
initializeFromSQL	boolean	false	Set this flag to true if you are initializing the hash table from a SQL select statement.
connectionPoolName	string	none	(requires initializeFromSQL = true) The Aspire component name of the RDBMS Connection component which maintains the pool of RDB connections for the database to be queried.
sqlQuery	string	none	(requires initializeFromSQL = true) The SQL query used to read the data from the RDBMS and load the hash table. The order of the columns returned by the query is maintained in the list of values stored in the hash table.
keyColumn	string	none	(requires initializeFromSQL = true) The name of the SQL column from the "sqlQuery" query which will be used for the hash table key.
targetElement	string	none	(when used as a pipeline stage) The XML element from the document being processed which will be used as the key to look up the entry in the hash table. NOTE: Multiple target elements are allowed.
metadataMap	Metadata Mapper	none	Specifies the mapping of fields or columns from the original hash table
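
As an illustration of the <valueMap> element described above, the following fragment is a minimal sketch that keeps two columns of a pipe-separated file. The file name and the column labels "id", "name", and "description" are hypothetical, and the element nesting is inferred only from the descriptions in the table.

 <initializeFromTabularFile>true</initializeFromTabularFile>
 <!-- Hypothetical file; a relative path is resolved against Aspire Home -->
 <fileName>data/states.txt</fileName>
 <separator>|</separator>
 <hasColumnLabels>true</hasColumnLabels>
 <!-- Hypothetical label of the column used as the hash table key -->
 <keyColumn>id</keyColumn>
 <valueMap>
   <!-- Only the columns listed here are stored, in this order -->
   <column label="name"/>
   <column label="description"/>
 </valueMap>

With a map like this, the value array for each key would hold the "name" value first and the "description" value second; any other columns in the file would not be stored.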

Example Configurations

 <component name="refGetLifeCycleText" subType="readOnlyAttribute" factoryName="aspire-hash-table">
   <initializeFromSQL>true</initializeFromSQL>
   <connectionPoolName>/rdbConnections/reference</connectionPoolName>
   <targetElement>/doc/STATE_ID</targetElement>
   <targetElement>/doc/PG_STATE_ID</targetElement>
   <targetElement>/doc/P_STATE_ID</targetElement>
   <targetElement>/doc/PV_STATE_ID</targetElement>
   <targetElement>/doc/MA_STATE_ID</targetElement>
   <targetElement>/doc/MAR_STATE_ID</targetElement>
   <sqlQuery>
      <![CDATA[
         SELECT
             id   AS ID,
             name AS NAME
         FROM
             REF.ref_entity_state_type
       ]]>
   </sqlQuery>
   <keyColumn>id</keyColumn>
 </component>
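
A tabular-file configuration might look like the following sketch. The component name, file path, key column label, initial size, and target element are hypothetical. Because no <valueMap> is given, all columns would be stored in the order in which they occur in the file, as the configuration table describes.

 <component name="refGetCountryName" subType="readOnlyAttribute" factoryName="aspire-hash-table">
   <!-- Hypothetical CSV file, relative to Aspire Home, whose first row holds column labels -->
   <initializeFromTabularFile>true</initializeFromTabularFile>
   <fileName>data/countries.csv</fileName>
   <separator>comma</separator>
   <hasColumnLabels>true</hasColumnLabels>
   <!-- Hypothetical label of the key column -->
   <keyColumn>code</keyColumn>
   <!-- Sized to hold the expected number of rows and avoid rehashing -->
   <initialSize>500</initialSize>
   <targetElement>/doc/COUNTRY_CODE</targetElement>
 </component>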

Also see the examples for the default subtype.

Example

Input fields:

 <PG_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></PG_STATE_ID>
 <P_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></P_STATE_ID>
 <PV_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></PV_STATE_ID>
 <MA_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></MA_STATE_ID>
 <MAR_STATE_ID source="RDBFeederImpl"><![CDATA[1]]></MAR_STATE_ID>

Output fields:

 <PG_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></PG_STATE_ID>
 <P_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></P_STATE_ID>
 <PV_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></PV_STATE_ID>
 <MA_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></MA_STATE_ID>
 <MAR_STATE_ID NAME="Created" source="RDBFeederImpl"><![CDATA[1]]></MAR_STATE_ID>
