Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

The RDB via Snapshots Connector can be configured using the Rest API. It requires the following entities to be created:

  • Credential
  • Connection
  • Connector
  • Seed

Below are the examples of how to create the Credential, Connection and the Seed. For the Connector, please check this page.

Easy Heading Free
navigationTitleOn this Page
navigationExpandOptionexpand-all-by-default

Create Credential


FieldOptionalRequiredDefaultMultipleNotesExample
typeNoYes-NoThe value must be "rdb-snapshot"."rdb-snapshot"
descriptionNoYes-NoName of the credential object."My RDB Credential"
propertiesNoYes-NoConfiguration object
usernameNoYes-NoUser name."admin"
passwordNoYes-NoPassword."adminPassword"

Example

Code Block
themeRDark
titlePOST aspire/_api/credentials
{
    "type": "rdb-snapshot",
    "description": "My RDB Credential",
    "properties": {
        "username": "admin",
        "password": "adminPassword"
    }
}

Create Connection


Field
Optional
RequiredDefaultMultipleNotesExample
type
No
Yes-NoThe value must be "rdb-snapshot"."rdb-snapshot"
description
No
Yes-NoName of the connection object."My RDB Connection"
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this connection object."6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this connection will use.["17f75ce7d0c7", "d42780003b36"]
credential
No
Yes-NoId of the credential"6b235b333a1b"
properties
No
Yes-NoConfiguration object
jdbcUrl
No
Yes-NoThe JDBC url for your RDBMS server and database. Use database marker {DATABASE} to denote the database"jdbc:mysql://db:3306/{DATABASE}"
jdbcDriverJar
No
Yes-NoPath to the JDBC driver jar file for your RDBMS."/lib/myjdbcdriver.jar"
jdbcDriverClass
Yes
Nosee notesNoThe name of the default JDBC driver class (if the class name from the META-INF/services/java.sql.Driver file in the driver Jar file should not be used), or if that file does not exist in the driver jar file (Oracle)"java.sql.Driver"
jdbcDriverClasspath
Yes
Nothe driver jar fileNoThe class path for external jars required by the jdbc driver
stopOnError
Yes
NotrueNoWhen selected, the scan stops if the JDBC driver throws an error while getting a row, and the crawl halts. When unselected, the connector attempts to get subsequent rowsfalse
useSlices
Yes
NofalseNo

Select this option to divide the full SQL into multiple slices. For example,if you have a 10 Million row table to scan, executing the 10 M query will take a while. After it completes, the connector starts sending items. By activating 10 slices, the scan is split into 10 1 Million scans, which takes less time and you can view results faster. This only works when the idColumn contains an integer.

true
numSlices
Yes
No-NoThe number of SQL slices to split fullSQL. Slicing the full SQL should improve the performance significantly if a big database is to be crawled. Only works when the id column is an integer10
percentAsMod
Yes
NofalseNo

Use % Operator for Modulo. Check this option if you want to specify the MODULO operator to use for a particular Database system that doesn't recognize the MOD() function. "MOD()" is available for MySQL, PostgreSQL and Oracle. But systems like Microsoft SQL Server you must check this option

true

Example

Code Block
themeRDark
titlePOST aspire/_api/connections
{
   "type": rdb-snapshot,
   "description": "RDB_TEST",
   "properties": {
       "jdbcUrl":"jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar" : "/lib/myjdbcdriver.jar",
        "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
        "numSlices": 2,
        "percentAsMod": false
}

Update Connection


Field
Optional
RequiredDefaultMultipleNotesExample
id
No
Yes-NoId of the connection to update"d442adcab4b0",
description
Yes
No-NoName of the connection object."My RDB Connection"
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this connection object."b3a9-6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this connection will use.["17f75ce7d0c7", "d42780003b36"]
credential
Yes
No-NoId of the credential"6b235b333a1b"
properties
Yes
No-NoConfiguration object
(see create connection)




Example

Code Block
themeRDark
titlePUT aspire/_api/connections/89d6632a-a296-426c-adb0-d442adcab4b0
{
   "id": "89d6632a-a296-426c-adb0-d442adcab4b0",
   "description": "RDB_TEST",
   "properties": {
       "jdbcUrl":"jdbc:mysql://localhost:3307/{DATABASE}",
        "jdbcDriverJar" : "/lib/myjdbcdriver.jar",
        "jdbcDriverClass": null,
        "jdbcDriverClasspath": null,
        "stopOnError": true,
        "useSlices": false,
        "numSlices": 2,
        "percentAsMod": false
}

Create Connector


For the creation of the Connector object using the Rest API check this page

Update Connector


For the update of the Connector object using the Rest API check this page

Create Seed


Field
Optional
RequiredDefaultMultipleNotesExample
seed
No
Yes-NoThe name of the database. It will replace the marker {DATABASE} used in the field jdbcUrl of connection object"test_db"
type
No
Yes-NoThe value must be "rdb-snapshot"."rdb-snapshot"
description
No
Yes-NoName of the seed object."My RDB Seed"
connector
No
Yes-NoThe id of the connector to be used with this seed. The connector type must match the seed type."e3ca414b0d31"
connection
No
Yes-NoThe id of the connection to be used with this seed. The connection type must match the seed type."e4a663fe9ee6"
workflows
Yes
No[ ]YesThe ids of the workflows that will be executed for the documents crawled.["5696c3f0bda4"]
throttlePolicy
Yes
No-NoId of the throttle policy that applies to this seed object."6b235b333a1b"
routingPolicies
Yes
No[ ]YesThe ids of the routing policies that this seed will use.["17f75ce7d0c7", "d42780003b36"]
tags
Yes
No[ ]YesThe tags of the seed. These can be used to filter the seed["tag1", "tag2"]
properties
No
Yes-NoConfiguration object
fullSQL
No
Yes (this or discoverySQL + extractionSQL)-NoThe "SELECT" query to be run to retrieve all documents. This query is used for full or incremental scans. The "WHERE" clause can be used to specify any required condition for crawling the desired documents. Any change to any column selected in this SQL will cause the document to be re-indexed. For example "SELECT idCol, col1, col2, col3 FROM data_table" When slicing is enabled, add a "WHERE" clause containing "{SLICES}". For example "SELECT idCol, col1, col2, col3 FROM data_table WHERE {SLICES}" ."SELECT * FROM table"
discoverySQL
No
Yes (this or fullSQL)-NoThe "SELECT" query to run for discovering documents. This query is used for full or incremental scans. A "WHERE" clause can be used to specify any required condition for crawling the desired documents. A change to any column selected in this SQL will cause the document to be re-indexed. For example: "SELECT idCol, lastModifiedDate FROM data_table". When slicing is enabled, add a "WHERE" clause containing "{SLICES}". For example: "SELECT idCol, col1 FROM data_table WHERE {SLICES}"SELECT id, lastModified FROM table"
extractionSQL
No
Yes (this or fullSQL)-No

"SELECT" query for extracting all data for each document found in the Discovery SQL. At the least, you MUST include a "WHERE" clause containing the expression "idColumnName IN {IDS}", where idColumnName corresponds to a unique key field name. {IDS} is replaced automatically by the connector with the corresponding unique key values. For example: "SELECT col1, col2, col3 FROM data_table WHERE idCol in {IDS}" You must not include the {SLICES} condition here.

"SELECT * FROM table WHERE id IN {IDS}"
idColumn
No
Yes-NoThe column name that holds the unique key. The default name of the column which holds the value to use as the document id. This column must be present in both discoverySQL and extractionSQL. SQL aliases are NOT supported."id"
stringIdColumn
Yes
NofalseNoCheck if the unique key is a string valuetrue
quoteId
Yes
NodoNotQuoteNoQuote id column - use if you have a name clashing with RDBMS keywords. You can use one of the values: doNotQuote, `, "
doNot
doNotQuote




ACL
aclColumn
No
Yes (aclColumn or aclSQL)-NoThe column name that holds the ACLs. Each ACL must be separated by semi-colons and must follow this format: my-domain\userOrGroup@NT"acl"
aclSQL
No
Yes (aclColumn or aclSQL)-NoThe query to use for extracting and building ACLs. This query depends of the Database engine, so the syntax could vary. For example on Oracle: SELECT 'my-domain\\' || user || '@NT;' FROM myTable"SELECT * FROM table_acl"

Example

Code Block
themeRDark
titlePOST aspire/_api/seeds
{
  "seed":"test_db",
  "type":"rdb-snapshot",
  "description" : "RDB_TEST",
  "properties" : {
      "idColumn" : "film_id",
      "stringIdColumn" : false,
      "aclSQL" : null,
      "aclColumn" : "acl",
      "quoteId" : "doNotQuote",
      "discoverySQL" : "SELECT film_id, title FROM film",
      "extractionSQL" : "SELECT * FROM film WHERE film_id IN {IDS}",
      "fullSQL" : null"
  }
}

Update Seed


FieldOptionalRequiredDefaultMultipleNotesExample
idNoYes-NoId of the seed to update"2f287669-d163-4e35-ad17-6bbfe9df3778"
(see the "Create seed" for other fields)




Example

Code Block
themeRDark
titlePUT aspire/_api/seeds/2f287669-d163-4e35-ad17-6bbfe9df3778
{
  "id": "2f287669-d163-4e35-ad17-6bbfe9df3778",
  "seed":"test_db",
  "description" : "RDB_TEST",
  "properties" : {
      "idColumn" : "film_id",
      "stringIdColumn" : false,
      "aclSQL" : null,
      "aclColumn" : "acl",
      "quoteId" : "doNotQuote",
      "discoverySQL" : "SELECT film_id, title FROM film",
      "extractionSQL" : "SELECT * FROM film WHERE film_id IN {IDS}",
      "fullSQL" : null"
  }
}

...