Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

The RDB via Tables Connector can be configured using the Rest API. It requires the following entities to be created:

  • Credential
  • Connection
  • Connector
  • Seed

Below are the examples of how to create the Credential, Connection and the Seed. For the Connector, please check this page.

Easy Heading Free
navigationTitleOn this Page
navigationExpandOptionexpand-all-by-default

Create Credential


FieldOptionalRequiredDefaultMultipleNotesExample
typeNoYes-NoThe value must be "rdb-tables"."rdb-tables"
descriptionNoYes-NoName of the credential object."My RDB Credential"
propertiesYes-NoConfiguration object
usernameNoYes-NoUser name."admin"
passwordNoYes-NoPassword."adminPassword"

Example

Code Block
themeRDark
titlePOST aspire/_api/credentials
{
    "type": "rdb-tables",
    "description": "My RDB Credential",
    "properties": {
        "username": "admin",
        "password": "adminPassword"
    }
}

Create Connection


Field
Optional
RequiredDefaultMultipleNotesExample
type
No
Yes-NoThe value must be "
group
rdb-
expansion
tables"."
group
rdb-
expansion
tables"
description
No
Yes-NoName of the connection object."
MyGroupExpansionConnection
My RDB Connection"
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this connection object."
f5587cee-9116-4011-b3a9-
6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this connection will use.[
"313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"]
"17f75ce7d0c7", "d42780003b36"]
credentialYes-NoId of the credential"6b235b333a1b"
propertiesYes-NoConfiguration object
jdbcUrlYes-NoThe JDBC url for your RDBMS server and database. Use database marker {DATABASE} to denote the database"jdbc:mysql://db:3306/{DATABASE}"
jdbcDriverJarYes-NoPath to the JDBC driver jar file for your RDBMS."/lib/myjdbcdriver.jar"
jdbcDriverClassNosee notesNoThe name of the default JDBC driver class (if the class name from the META-INF/services/java.sql.Driver file in the driver Jar file should not be used), or if that file does not exist in the driver jar file (Oracle)"java.sql.Driver"
jdbcDriverClasspathNothe driver jar fileNoThe class path for external jars required by the jdbc driver
stopOnErrorNotrueNoWhen selected, the scan stops if the JDBC driver throws an error while getting a row, and the crawl halts. When unselected, the connector attempts to get subsequent rowsfalse
useSlicesNofalseNo

Select this option to divide the full SQL into multiple slices. For example,if you have a 10 Million row table to scan, executing the 10 M query will take a while. After it completes, the connector starts sending items. By activating 10 slices, the scan is split into 10 1 Million scans, which takes less time and you can view results faster. This only works when the idColumn contains an integer.

true
numSlicesNo-NoThe number of SQL slices to split fullSQL. Slicing the full SQL should improve the performance significantly if a big database is to be crawled. Only works when the id column is an integer10
percentAsModNofalseNo

Use % Operator for Modulo. Check this option if you want to specify the MODULO operator to use for a particular Database system that doesn't recognize the MOD() function. "MOD()" is available for MySQL, PostgreSQL and Oracle. But systems like Microsoft SQL Server you must check this option

true
customFetchSizeNofalseNoCheck this box if you need to specify a fetch size to the JDBC driver to use when getting resultstrue
fetchSizeNo50NoThis option indicates to the JDBC driver how it should do paging when retrieving results100

Example

Code Block
themeRDark
titlePOST aspire/_api/connections
{
   "type": rdb-tables,
   "description": "RDB_TEST",
   "properties": {
       "jdbcUrl":"jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar" : "/lib/myjdbcdriver.jar",
       "type": "group-expansion",
 "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
        "numSlices": 2,
        "descriptionpercentAsMod": false,
        "MyGroupExpansionConnectioncustomFetchSize": true,
        "fetchSize": 10
}

Update Connection


Field
Optional
RequiredDefaultMultipleNotesExample
id
No
Yes-NoId of the connection to update"
89d6632a-a296-426c-adb0-
d442adcab4b0",
descriptionNo-NoName of the connection object."
MyGroupExpansionConnection
My RDB Connection"
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this connection object."
f5587cee-9116-4011-
b3a9-6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this connection will use.["
313de87c-3cb9-4fe0-a2cb-
17f75ce7d0c7", "
b4d2579f-1a0a-4a8b-9fd4-
d42780003b36"]
credentialNo-NoId of the credential"6b235b333a1b"
propertiesNo-NoConfiguration object
(see create connection)




Example

Code Block
themeRDark
titlePUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
   "description": "RDB_TEST",
   "properties": {
       "jdbcUrl":"jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar" : "/lib/myjdbcdriver.jar",
        "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
       "description": "MyGroupExpansionConnection",
 "numSlices": 2,
        "percentAsMod": false,
        "customFetchSize": true,
        "fetchSize": 10
}

Create Connector


For the creation of the Connector object using the Rest API check this page

Update Connector


For the update of the Connector object using the Rest API check this page

Create Seed


Field
Optional
RequiredDefaultMultipleNotesExample
seed
No
Yes-No
Put a placeholder here
The name of the database. It will replace the marker {DATABASE} used in the field jdbcUrl of connection object"test_db
"seed_placeholder
"
type
No
Yes-NoThe value must be "
group
rdb-
expansion
tables"."
group
rdb-
expansion
tables"
description
No
Yes-NoName of the seed object."
MyGroupExpansionSeed
My RDB Seed"
connector
No
Yes-NoThe id of the connector to be used with this seed. The connector type must match the seed type."
82f7f0a4-8d28-47ce-8c9d-
e3ca414b0d31"
connection
No
Yes-NoThe id of the connection to be used with this seed. The connection type must match the seed type."
602d3700-28dd-4a6a-8b51-
e4a663fe9ee6"
workflows
Yes
No[ ]YesThe ids of the workflows that will be executed for the documents crawled.["
f8c414cb-1f5d-42ef-9cc9-
5696c3f0bda4"]
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this seed object."
f5587cee-9116-4011-b3a9-
6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this seed will use.["
313de87c-3cb9-4fe0-a2cb-
17f75ce7d0c7", "
b4d2579f-1a0a-4a8b-9fd4-
d42780003b36"]
tags
Yes
No[ ]YesThe tags of the seed. These can be used to filter the seed["tag1", "tag2"]
propertiesYes-NoConfiguration object
seedsIncluded

fullSQL
yes
Yes-
yesThose seeds will be included in crawling identities["5b9f4074-01a3-4d86-8642-32a6f27179d4"]
NoFull SQL. Run a "SELECT" query to retrieve all documents. This query is used only for full scans. Use the "WHERE" clause to specify any required condition for crawling just the desired documents. If slicing is enabled, add a "WHERE" clause: "SELECT idCol, col1, col2, col3 FROM data_table WHERE {SLICES}"SELECT * FROM table"
idColumnYes-NoThe column name that holds the unique key. The default name of the column which holds the value to use as the document id. This column must be present in fullSQL. SQL aliases are NOT supported."id"
stringIdColumnNofalseNoCheck if the unique key is a string valuetrue
postCrawlSQLNo-NoThe SQL to run after a crawl




Incremental Crawl
preUpdateSQLNo-NoThe SQL to run before an incremental crawl. This SQL can be used to mark documents for update, save timestamps, clear update tables, etc. as needed to prepare for an incremental crawl"UPDATE updates_table SET status='I'"
updateSQLYes-NoThe SQL to run during an incremental crawl. This SQL should provide a list of all adds and deletes to the documents in the index. Some field names have special meaning (such as 'title', 'content', 'url', 'id', etc.) - see the wiki for more information. Note the special column, 'action' should report 'I' (for inserts), 'U' (for updates, typically the same as updates for most search engines), and 'D' (for deletes)The SQL to run during an incremental crawl. This SQL should provide a list of all adds and deletes to the documents in the index. Some field names have special meaning (such as 'title', 'content', 'url', 'id', etc.) - see the wiki for more information. Note the special column, 'action' should report 'I' (for inserts), 'U' (for updates, typically the same as updates for most search engines), and 'D' (for deletes)"SELECT updates_table.sequence, updates_table.id, updates_table.action, students.first_name, students.last_name FROM students RIGHT OUTER JOIN updates_table ON students.id = updates_table.id WHERE updates_table.status = 'I' ORDER BY updates_table.sequence ASC"
postUpdateSQLNo-NoThe SQL to run after each record is processed. This SQL can be used un-mark / delete each document in the tables after it is complete. Your SQL may include placeholders for the row id, action, sequence id and whether the processing was successful. These are {documentId}, {action}, {sequenceId} and {failed} respectivelyUPDATE updates_table SET status = 'C' WHERE sequence = {sequenceId}
postUpdateFailedSQLNo-NoThe SQL to run after each record if processing fails. If not configured, the 'Post update SQL' will be run instead Your SQL may include placeholders for the row id, action, sequence id and whether the processing was successful. These are {documentId}, {action}, {sequenceId} and {failed} respectively
seqColumnYes-NoThe name of the column in the returned data which holds the sequence number of the update. This is only used for incremental crawls and must match the name returned by the SQL. If the column is aliased using the SQL "AS" construct, you should provide the alias name here."sequence"
actionColumnYes-NoThe name of the column in the returned data which holds action of the update (ie Insert, Update or Delete). This is only used for incremental crawls and must match the name returned by the SQL. If the column is aliased using the SQL "AS" construct, you should provide the alias name here"action"
useBoundingNofalseNoChecking this option allows incremental crawls to use SQL that is bounded by a condition. When entering SQL you may use the variables {lowerBound} and {upperBound} in a WHERE clause to limit the data collected. The {upperBound} will be calculated at the start of the crawl. The {lowerBound} will be the {upperBound} from the previous crawl. Two types of bounding are available - 'Timestamp' returns the bounds as a 'long' value representing the current system time whilst 'SQL' allows you to define SQL to return the new upper bound when the crawl startstrue
boundingSQLNo-NoThe SQL run when the crawl starts to return the new upper bound. The upper bound will be taken from the first column of the first row returned




ACL
aclColumnYes (aclColumn or aclSQL)-NoThe column name that holds the ACLs. Each ACL must be separated by semi-colons and must follow this format: my-domain\userOrGroup@NT"acl"
aclSQLYes (aclColumn or aclSQL)-NoThe query to use for extracting and building ACLs. This query depends of the Database engine, so the syntax could vary. For example on Oracle: SELECT 'my-domain\\' || user || '@NT;' FROM myTable"SELECT * FROM table_acl
seedsExcludedyes-yesThose seeds will be excluded from crawling identities["5b9f4074-01a3-4d86-8642-32a6f27179d4"]seedsIdentities localyes-yesCustom mapping localidno-noThe seed id for the mapping

"5b9f4074-01a3-4d86-8642-32a6f27179d4"

identityKeyTypeno-noThe value must be "local""local"identityKeyAttrno-noThe name of the mapping attribute"someAttributeName"seedsIdentities externalyes-yesCustom mapping externalidno-noThe seed id for the mapping

"5b9f4074-01a3-4d86-8642-32a6f27179d4"

identityKeyTypeno-noThe value must be "external""external"identityKeyAttrno-noThe name of the mapping attribute in external seed"someAttributeName"externalSeedno-noThe external seed id"5b9f4074-01a3-4d86-8642-32a6f27179d4"externalLookupAttributeno-noThe external lookup attribute name"externalLookupAttrName
"

Example

Code Block
themeRDark
titlePOST aspire/_api/seeds
{
    "type": "group-expansion",
    "seed": "seedtest_placeholderdb",
    "connectortype": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31rdb-tables",
    "description": "GroupExpansionRDB_Test_Seed",
    "throttlePolicyproperties": "6b8b5f23-fc77-47a1-9b58-106577162e7b",{
    "routingPoliciesidColumn": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],"id",
    "connectionstringIdColumn": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6"true,
    "workflowsaclSQL": ["f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"]null,
    "tagsaclColumn": ["tag1acl", "tag2"],
    "propertiesquoteId": {
   "doNotQuote",
    "seedsIncludedfullSQL": [
"SELECT *     FROM students",
     "5b9f4074-01a3-4d86-8642-32a6f27179d4"
        ],
  "preUpdateSQL": "UPDATE updates_table SET status='I'",
     "seedsExcludedupdateSQL": [],
"SELECT  updates_table.sequence, updates_table.id, updates_table.action, students.first_name,  "seedsIdentities": [
   students.last_name FROM students RIGHT OUTER JOIN   updates_table  ON students.id  {
        = updates_table.id WHERE updates_table.status = 'I' ORDER BY updates_table.sequence ASC",
     "idpostUpdateSQL": "5b9f4074-01a3-4d86-8642-32a6f27179d4",
         UPDATE updates_table SET status = 'C' WHERE sequence = {sequenceId}",
    "identityKeyTypeseqColumn": "localsequence",
             "identityKeyAttractionColumn": "someAttrNameaction"
            }
       ]
    }
}

Update Seed


Field
Optional
RequiredDefaultMultipleNotesExample
id
No
Yes-NoId of the seed to update"2f287669-d163-4e35-ad17-6bbfe9df3778"
(see the "Create seed" for other fields)




Example

Code Block
themeRDark
titlePUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "connectorseed": "82f7f0a4-8d28-47ce-8c9d-e3ca414b0d31test_db",
    "description": "FileSystemRDB_Test_Seed",
    "throttlePolicyproperties": "6b8b5f23-fc77-47a1-9b58-106577162e7b",{
    "routingPoliciesidColumn": ["313de87c-3cb9-4fe0-a2cb-17f75ce7d0c7", "b4d2579f-1a0a-4a8b-9fd4-d42780003b36"],"id",
    "connectionstringIdColumn": "602d3700-28dd-4a6a-8b51-e4a663fe9ee6"true,
    "workflowsaclSQL": ["b255e950-1dac-46dc-8f86-1238b2fbdf27", "f8c414cb-1f5d-42ef-9cc9-5696c3f0bda4"],null,
    "tagsaclColumn": ["tagacl", "tag2"],
    "propertiesquoteId": {
   "doNotQuote",
    "seedsIncludedfullSQL": [
"SELECT *     FROM students",
     "5b9f4074-01a3-4d86-8642-32a6f27179d4"
        ]"preUpdateSQL": "UPDATE updates_table SET status='I'",
       "seedsExcludedupdateSQL": [],
"SELECT  updates_table.sequence, updates_table.id, updates_table.action, students.first_name,  "seedsIdentities": [
   students.last_name FROM students RIGHT OUTER JOIN   updates_table  ON students.id  {
         = updates_table.id WHERE updates_table.status = 'I' ORDER BY updates_table.sequence ASC",
    "idpostUpdateSQL": "5b9f4074-01a3-4d86-8642-32a6f27179d4",
         UPDATE updates_table SET status = 'C' WHERE sequence = {sequenceId}",
    "identityKeyTypeseqColumn": "localsequence",
             "identityKeyAttr"actionColumn": "someAttrName"
            }
       ]
  action"
  }
}