The RDB via Snapshots Connector can be configured using the REST API. It requires the following entities to be created:

Credential
Connection
Connector
Seed

Below are examples of how to create the Credential, the Connection and the Seed. For the Connector, please check this page.


Create Credential

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "rdb-snapshot".	"rdb-snapshot"
description	Yes	-	No	Name of the credential object.	"My RDB Credential"
properties	Yes	-	No	Configuration object.
username	Yes	-	No	User name.	"admin"
password	Yes	-	No	Password.	"adminPassword"

Example

POST aspire/_api/credentials

{
    "type": "rdb-snapshot",
    "description": "My RDB Credential",
    "properties": {
        "username": "admin",
        "password": "adminPassword"
    }
}
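
As a minimal sketch, the credential payload above could be prepared from Python with only the standard library. The base URL `http://localhost:50505` and the helper name `build_credential_request` are assumptions made for illustration; this page does not specify them. Any authentication your Aspire installation requires is omitted. The request is only built here; nothing is sent until `urlopen` is called.

```python
import json
import urllib.request

# Assumed base URL; replace it with the address of your Aspire server.
ASPIRE_BASE_URL = "http://localhost:50505"


def build_credential_request(username, password, description="My RDB Credential"):
    """Build (but do not send) the POST request that creates a credential."""
    payload = {
        "type": "rdb-snapshot",
        "description": description,
        "properties": {"username": username, "password": password},
    }
    return urllib.request.Request(
        url=f"{ASPIRE_BASE_URL}/aspire/_api/credentials",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_credential_request("admin", "adminPassword")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The id of the created credential is what the `credential` field of the connection expects, so the response body is worth keeping.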

Create Connection

Field	Required	Default	Multiple	Notes	Example
type	Yes	-	No	The value must be "rdb-snapshot".	"rdb-snapshot"
description	Yes	-	No	Name of the connection object.	"My RDB Connection"
throttlePolicy	No	-	No	Id of the throttle policy that applies to this connection object.	"6b235b333a1b"
routingPolicies	No	[ ]	Yes	The ids of the routing policies that this connection will use.	["17f75ce7d0c7", "d42780003b36"]
credential	Yes	-	No	Id of the credential.	"6b235b333a1b"
properties	Yes	-	No	Configuration object.
jdbcUrl	Yes	-	No	The JDBC URL for your RDBMS server and database. Use the marker {DATABASE} to denote the database.	"jdbc:mysql://db:3306/{DATABASE}"
jdbcDriverJar	Yes	-	No	Path to the JDBC driver jar file for your RDBMS.	"/lib/myjdbcdriver.jar"
jdbcDriverClass	No	see notes	No	The name of the JDBC driver class. Set it if the class name from the META-INF/services/java.sql.Driver file in the driver jar file should not be used, or if that file does not exist in the driver jar file (Oracle).	"java.sql.Driver"
jdbcDriverClasspath	No	the driver jar file	No	The class path for external jars required by the JDBC driver.
stopOnError	No	true	No	When true, the scan stops if the JDBC driver throws an error while getting a row, and the crawl halts. When false, the connector attempts to get subsequent rows.	false
useSlices	No	false	No	Divides the full SQL into multiple slices. For example, if you have a 10 million row table to scan, executing the 10 million row query takes a while, and the connector starts sending items only after it completes. With 10 slices, the scan is split into ten scans of 1 million rows each, which take less time, so results appear sooner. This only works when the idColumn contains an integer.	true
numSlices	No	-	No	The number of SQL slices to split fullSQL into. Slicing the full SQL should improve performance significantly when a big database is crawled. Only works when the id column is an integer.	10
percentAsMod	No	false	No	Use the % operator for modulo. Set this option for database systems that do not recognize the MOD() function. MOD() is available in MySQL, PostgreSQL and Oracle; for systems like Microsoft SQL Server you must set this option.	true
customFetchSize	No	false	No	Set this option if you need to specify a fetch size for the JDBC driver to use when getting results.	true
fetchSize	No	50	No	Indicates to the JDBC driver how it should do paging when retrieving results.	100

Example

POST aspire/_api/connections

{
    "type": "rdb-snapshot",
    "description": "RDB_TEST",
    "properties": {
        "jdbcUrl": "jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar": "/lib/myjdbcdriver.jar",
        "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
        "numSlices": 2,
        "percentAsMod": false,
        "customFetchSize": true,
        "fetchSize": 10
    }
}
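
The slicing options (useSlices, numSlices, percentAsMod) can be pictured with a short sketch. It assumes that the `{SLICES}` marker placed in the seed's SQL is replaced by a modulo condition on the integer id column, one condition per slice. The exact condition the connector generates is not documented on this page, and the function name `expand_slices` and its parameters are hypothetical.

```python
def expand_slices(sql, id_column, num_slices, percent_as_mod=False):
    """Return one SQL statement per slice by replacing the {SLICES} marker.

    Assumption: each slice selects the rows whose integer id leaves a given
    remainder when divided by num_slices.
    """
    if "{SLICES}" not in sql:
        raise ValueError("SQL must contain the {SLICES} marker")
    statements = []
    for slice_index in range(num_slices):
        if percent_as_mod:
            # For systems without MOD(), such as Microsoft SQL Server.
            condition = f"{id_column} % {num_slices} = {slice_index}"
        else:
            # MOD() is available in MySQL, PostgreSQL and Oracle.
            condition = f"MOD({id_column}, {num_slices}) = {slice_index}"
        statements.append(sql.replace("{SLICES}", condition))
    return statements


slices = expand_slices(
    "SELECT idCol, col1 FROM data_table WHERE {SLICES}", "idCol", 2
)
for statement in slices:
    print(statement)
```

Each statement covers a disjoint part of the table, which is why slicing requires an integer id column.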

Update Connection

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	Id of the connection to update.	"d442adcab4b0"
description	No	-	No	Name of the connection object.	"My RDB Connection"
throttlePolicy	No	-	No	Id of the throttle policy that applies to this connection object.	"b3a9-6b235b333a1b"
routingPolicies	No	[ ]	Yes	The ids of the routing policies that this connection will use.	["17f75ce7d0c7", "d42780003b36"]
credential	No	-	No	Id of the credential.	"6b235b333a1b"
properties	No	-	No	Configuration object.
(see "Create Connection" for the fields of the properties object)

Example

PUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0

{
    "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
    "description": "RDB_TEST",
    "properties": {
        "jdbcUrl": "jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar": "/lib/myjdbcdriver.jar",
        "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
        "numSlices": 2,
        "percentAsMod": false,
        "customFetchSize": true,
        "fetchSize": 10
    }
}

Create Connector

To create the Connector object using the REST API, check this page.

Update Connector

To update the Connector object using the REST API, check this page.

Create Seed

Field	Required	Default	Multiple	Notes	Example
seed	Yes	-	No	The name of the database. It replaces the marker {DATABASE} used in the jdbcUrl field of the connection object.	"test_db"
type	Yes	-	No	The value must be "rdb-snapshot".	"rdb-snapshot"
description	Yes	-	No	Name of the seed object.	"My RDB Seed"
connector	Yes	-	No	The id of the connector to be used with this seed. The connector type must match the seed type.	"e3ca414b0d31"
connection	Yes	-	No	The id of the connection to be used with this seed. The connection type must match the seed type.	"e4a663fe9ee6"
workflows	No	[ ]	Yes	The ids of the workflows that will be executed for the documents crawled.	["5696c3f0bda4"]
throttlePolicy	No	-	No	Id of the throttle policy that applies to this seed object.	"6b235b333a1b"
routingPolicies	No	[ ]	Yes	The ids of the routing policies that this seed will use.	["17f75ce7d0c7", "d42780003b36"]
tags	No	[ ]	Yes	The tags of the seed. These can be used to filter the seed.	["tag1", "tag2"]
properties	Yes	-	No	Configuration object.
fullSQL	Yes (this or discoverySQL + extractionSQL)	-	No	The "SELECT" query to be run to retrieve all documents. This query is used for full or incremental scans. The "WHERE" clause can be used to specify any required condition for crawling the desired documents. Any change to any column selected in this SQL will cause the document to be re-indexed. For example: "SELECT idCol, col1, col2, col3 FROM data_table". When slicing is enabled, add a "WHERE" clause containing "{SLICES}". For example: "SELECT idCol, col1, col2, col3 FROM data_table WHERE {SLICES}".	"SELECT * FROM table"
discoverySQL	Yes (this or fullSQL)	-	No	The "SELECT" query to run for discovering documents. This query is used for full or incremental scans. A "WHERE" clause can be used to specify any required condition for crawling the desired documents. A change to any column selected in this SQL will cause the document to be re-indexed. For example: "SELECT idCol, lastModifiedDate FROM data_table". When slicing is enabled, add a "WHERE" clause containing "{SLICES}". For example: "SELECT idCol, col1 FROM data_table WHERE {SLICES}".	"SELECT id, lastModified FROM table"
extractionSQL	Yes (this or fullSQL)	-	No	The "SELECT" query for extracting all data for each document found by the discovery SQL. At the least, you MUST include a "WHERE" clause containing the expression "idColumnName IN {IDS}", where idColumnName corresponds to a unique key field name. {IDS} is replaced automatically by the connector with the corresponding unique key values. For example: "SELECT col1, col2, col3 FROM data_table WHERE idCol IN {IDS}". You must not include the {SLICES} condition here.	"SELECT * FROM table WHERE id IN {IDS}"
idColumn	Yes	-	No	The name of the column that holds the unique key, whose value is used as the document id. This column must be present in both discoverySQL and extractionSQL. SQL aliases are NOT supported.	"id"
stringIdColumn	No	false	No	Set to true if the unique key is a string value.	true
quoteId	No	doNotQuote	No	Quote the id column. Use this if the column name clashes with RDBMS keywords. You can use one of the values: doNotQuote, `, "	doNotQuote
aclColumn	Yes (aclColumn or aclSQL)	-	No	The name of the column that holds the ACLs. ACLs must be separated by semicolons and must follow this format: my-domain\userOrGroup@NT	"acl"
aclSQL	Yes (aclColumn or aclSQL)	-	No	The query to use for extracting and building ACLs. This query depends on the database engine, so the syntax may vary. For example, on Oracle: SELECT 'my-domain\\' || user || '@NT;' FROM myTable	"SELECT * FROM table_acl"

Example

POST aspire/_api/seeds

{
    "seed": "test_db",
    "type": "rdb-snapshot",
    "description": "RDB_TEST",
    "properties": {
        "idColumn": "film_id",
        "stringIdColumn": false,
        "aclSQL": null,
        "aclColumn": "acl",
        "quoteId": "doNotQuote",
        "discoverySQL": "SELECT film_id, title FROM film",
        "extractionSQL": "SELECT * FROM film WHERE film_id IN {IDS}",
        "fullSQL": null
    }
}
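
The "this or that" requirements in the Create Seed table are easy to get wrong, so a client-side check before posting may help. The sketch below encodes only the rules stated in that table. The function name `validate_seed_properties` is hypothetical, and the server may apply further checks that are not described here.

```python
def validate_seed_properties(properties):
    """Return a list of problems found in a seed "properties" object."""
    problems = []
    has_full = bool(properties.get("fullSQL"))
    has_discovery = bool(properties.get("discoverySQL"))
    has_extraction = bool(properties.get("extractionSQL"))

    # fullSQL, or discoverySQL together with extractionSQL, must be present.
    if not has_full and not (has_discovery and has_extraction):
        problems.append("set fullSQL, or both discoverySQL and extractionSQL")
    # extractionSQL needs the {IDS} marker and must not use {SLICES}.
    if has_extraction and "{IDS}" not in properties["extractionSQL"]:
        problems.append("extractionSQL must contain {IDS}")
    if has_extraction and "{SLICES}" in properties["extractionSQL"]:
        problems.append("extractionSQL must not contain {SLICES}")
    if not properties.get("idColumn"):
        problems.append("idColumn is required")
    # One of aclColumn or aclSQL must be present.
    if not (properties.get("aclColumn") or properties.get("aclSQL")):
        problems.append("set aclColumn or aclSQL")
    return problems


example_properties = {
    "idColumn": "film_id",
    "stringIdColumn": False,
    "aclSQL": None,
    "aclColumn": "acl",
    "quoteId": "doNotQuote",
    "discoverySQL": "SELECT film_id, title FROM film",
    "extractionSQL": "SELECT * FROM film WHERE film_id IN {IDS}",
    "fullSQL": None,
}
print(validate_seed_properties(example_properties))
```

An empty list means the properties satisfy the rules checked here.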

Update Seed

Field	Required	Default	Multiple	Notes	Example
id	Yes	-	No	Id of the seed to update.	"2f287669-d163-4e35-ad17-6bbfe9df3778"
(see "Create Seed" for the other fields)

Example

PUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778

{
    "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
    "seed": "test_db",
    "description": "RDB_TEST",
    "properties": {
        "idColumn": "film_id",
        "stringIdColumn": false,
        "aclSQL": null,
        "aclColumn": "acl",
        "quoteId": "doNotQuote",
        "discoverySQL": "SELECT film_id, title FROM film",
        "extractionSQL": "SELECT * FROM film WHERE film_id IN {IDS}",
        "fullSQL": null
    }
}
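
The extractionSQL in the seed examples relies on the `{IDS}` marker, which the connector replaces with the unique key values found by the discovery query. The sketch below only illustrates the idea. It assumes the keys are rendered as a parenthesised, comma-separated list; the connector's actual batching and quoting are not described on this page, and `fill_ids` is a hypothetical helper.

```python
def fill_ids(extraction_sql, ids, string_ids=False):
    """Replace the {IDS} marker with a parenthesised list of key values."""
    if not ids:
        raise ValueError("at least one id is required")
    if string_ids:
        # Quote string keys and double any embedded single quotes.
        rendered = ", ".join("'" + str(i).replace("'", "''") + "'" for i in ids)
    else:
        # Integer keys are rendered without quotes.
        rendered = ", ".join(str(int(i)) for i in ids)
    return extraction_sql.replace("{IDS}", f"({rendered})")


query = fill_ids("SELECT * FROM film WHERE film_id IN {IDS}", [1, 2, 3])
print(query)
```

With `string_ids=True` the keys are quoted, which corresponds to a seed whose stringIdColumn is true.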
