| Property | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| seed | Yes | - | No | The name of the database. It replaces the {DATABASE} marker used in the jdbcUrl field of the connection object. | "test_db" |
| type | Yes | - | No | The value must be "rdb-tables". | |
| | Yes | - | No | Name of the seed object. | "My RDB Seed" |
| connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "e3ca414b0d31" |
| connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "e4a663fe9ee6" |
| workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the crawled documents. | ["5696c3f0bda4"] |
| throttlePolicy | No | - | No | The id of the throttle policy that applies to this seed object. | "6b235b333a1b" |
| routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["17f75ce7d0c7", "d42780003b36"] |
| tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed. | ["tag1", "tag2"] |
| properties | Yes | - | No | Configuration object. | |

Fields of the properties object:

| Property | Required | Default | Multiple | Notes | Example |
|---|---|---|---|---|---|
| fullSQL | Yes (this or discoverySQL + extractionSQL) | - | No | The "SELECT" query to be run to retrieve all documents. This query is used only for full or incremental scans. The "WHERE" clause can be used to specify any required condition for crawling. Any change to any column selected in this SQL will cause the document to be re-indexed. For example: "SELECT idCol, col1, col2, col3 FROM data_table". When slicing is enabled, add a "WHERE" clause containing "{SLICES}". For example: "SELECT idCol, col1, col2, col3 FROM data_table WHERE {SLICES}". | "SELECT * FROM table" |
| discoverySQL | Yes (this or fullSQL) | | | | |
| idColumn | Yes | | | The name of the column that holds the unique key, whose value is used as the document id. This column must be present in both discoverySQL and extractionSQL. SQL aliases are NOT supported. | "id" |
| stringIdColumn | No | false | No | Whether the unique key is a string value. | true |
| postCrawlSQL | No | - | No | The SQL to run after a crawl. | |
| Incremental Crawl | | | | | |
| preUpdateSQL | No | - | No | The SQL to run before an incremental crawl. This SQL can be used to mark documents for update, save timestamps, clear update tables, etc., as needed to prepare for an incremental crawl. | "UPDATE updates_table SET status='I'" |
| updateSQL | Yes | - | No | The SQL to run during an incremental crawl. This SQL should provide a list of all adds and deletes to the documents in the index. Some field names have special meaning (such as 'title', 'content', 'url', 'id', etc.); see the wiki for more information. The special column 'action' should report 'I' (for inserts), 'U' (for updates, typically the same as inserts for most search engines), and 'D' (for deletes). | "SELECT updates_table.sequence, updates_table.id, updates_table.action, students.first_name, students.last_name FROM students RIGHT OUTER JOIN updates_table ON students.id = updates_table.id WHERE updates_table.status = 'I' ORDER BY updates_table.sequence ASC" |
| postUpdateSQL | No | - | No | The SQL to run after each record is processed. This SQL can be used to un-mark or delete each document in the tables after it is complete. Your SQL may include placeholders for the row id, action, sequence id and whether the processing failed. These are {documentId}, {action}, {sequenceId} and {failed} respectively. | "UPDATE updates_table SET status = 'C' WHERE sequence = {sequenceId}" |
| postUpdateFailedSQL | No | - | No | The SQL to run after each record if processing fails. If not configured, the 'Post update SQL' will be run instead. Your SQL may include placeholders for the row id, action, sequence id and whether the processing failed. These are {documentId}, {action}, {sequenceId} and {failed} respectively. | |
| seqColumn | Yes | - | No | The name of the column in the returned data which holds the sequence number of the update. This is only used for incremental crawls and must match the name returned by the SQL. If the column is aliased using the SQL "AS" construct, provide the alias name here. | "sequence" |
| actionColumn | Yes | - | No | The name of the column in the returned data which holds the action of the update (i.e. Insert, Update or Delete). This is only used for incremental crawls and must match the name returned by the SQL. If the column is aliased using the SQL "AS" construct, provide the alias name here. | "action" |
| useBounding | No | false | No | Checking this option allows incremental crawls to use SQL that is bounded by a condition. When entering SQL you may use the variables {lowerBound} and {upperBound} in a WHERE clause to limit the data collected. The {upperBound} is calculated at the start of the crawl. The {lowerBound} is the {upperBound} from the previous crawl. Two types of bounding are available: 'Timestamp' returns the bounds as a 'long' value representing the current system time, whilst 'SQL' allows you to define SQL that returns the new upper bound when the crawl starts. | true |
| boundingSQL | No | - | No | The SQL run when the crawl starts to return the new upper bound. The upper bound is taken from the first column of the first row returned. | |
| ACL | | | | | |
| aclColumn | Yes (aclColumn or aclSQL) | - | No | The name of the column that holds the ACLs. ACLs must be separated by semicolons, and each must follow this format: my-domain\userOrGroup@NT | "acl" |
| aclSQL | Yes (aclColumn or aclSQL) | - | No | The query to use for extracting and building ACLs. This query depends on the database engine, so the syntax may vary. For example, on Oracle: SELECT 'my-domain\\' \|\| user \|\| '@NT;' FROM myTable | "SELECT * FROM table_acl" |
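
To make the table easier to apply, here is a minimal Python sketch that assembles a seed definition from the example values above and checks it before it is submitted. It assumes the required top-level fields are seed, type, connector, connection and properties, and that properties needs either fullSQL or the discoverySQL + extractionSQL pair. The helper name `validate_seed` and the error messages are hypothetical and are not part of the product's API.

```python
import json

# Top-level fields treated as required here (an assumption of this sketch).
REQUIRED_FIELDS = ("seed", "type", "connector", "connection", "properties")


def validate_seed(seed):
    """Return a list of problems found in a seed definition (empty if none)."""
    problems = ["missing required field: " + name
                for name in REQUIRED_FIELDS if name not in seed]
    if "type" in seed and seed["type"] != "rdb-tables":
        problems.append('type must be "rdb-tables"')
    props = seed.get("properties", {})
    has_full = bool(props.get("fullSQL"))
    has_split = bool(props.get("discoverySQL")) and bool(props.get("extractionSQL"))
    # Either the single full query or the discovery/extraction pair is needed.
    if not (has_full or has_split):
        problems.append("properties needs fullSQL, or discoverySQL plus extractionSQL")
    return problems


# Values are copied from the Example column of the table.
example_seed = {
    "seed": "test_db",
    "type": "rdb-tables",
    "connector": "e3ca414b0d31",
    "connection": "e4a663fe9ee6",
    "workflows": ["5696c3f0bda4"],
    "tags": ["tag1", "tag2"],
    "properties": {
        "fullSQL": "SELECT idCol, col1, col2, col3 FROM data_table",
        "idColumn": "id",
        "stringIdColumn": True,
    },
}

if __name__ == "__main__":
    print(json.dumps(example_seed, indent=2))
    print(validate_seed(example_seed))
```

A check like this only catches missing fields early; it does not verify that the connector and connection ids exist or that the SQL is valid for the target database.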