The following authentication types are supported for crawled repositories:
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "rest-api". | "rest-api" |
description | Yes | - | No | Name of the credential object. | "My REST Credential" |
properties | Yes | - | No | Configuration object | |
type | Yes | - | No | Authentication type: basic, apiToken, bearer, or none. | "basic" |
type: basic | |||||
loginAccount | Yes | - | No | User name. | "admin" |
password | Yes | - | No | Password (can be encrypted in Aspire fashion) | "adminPassword" |
type: apiToken | |||||
headerName | Yes | - | No | The name of the HTTP header field to be sent with a request | "tokenName1" |
headerValue | Yes | - | No | The value of the "headerName" header field | "tokenValue1" |
type: bearer | |||||
preExpirationLimitInMs | Yes | 0 | No | Pre-expiration limit: the time (in ms) used to calculate when to request a new access token | 5000 |
query | Yes | - | No | Bearer query: JSON object representing the request sent to obtain the access token | |
urlTemplate | Yes | - | No | The context path of the URL | "/login" |
method | Yes | - | No | HTTP method. Must be POST in this version | "POST" |
body | Yes | - | No | The query body. The ${loginAccount} and ${password} fields are expected to be used as part of the body. | "{\"username\" : \"${username}\",\"password\" : \"${password}\"}" |
queryType | Yes | - | No | Use the value "metadataExtraction" here | "metadataExtraction" |
resultField | Yes | - | No | The field in the response that holds the access token | "accessToken" |
loginAccount | Yes | - | No | User name. Used as a value for ${loginAccount} query body field | "admin" |
password | Yes | - | No | Password. Used as a value for ${password} query body field | "adminPassword" |
{ "type": "rest-api", "description": "My credential", "properties": { "type": "bearer", "query": { "urlTemplate": "/login", "method": "POST", "body": "{\"username\" : \"${username}\",\"password\" : \"${password}\"}", "queryType": "metadataExtraction", "resultField": "accessToken", "username": "admin", "password": "encrypted:xxxxx" } } }
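To make the bearer settings more concrete, here is a minimal Python sketch of how the `${...}` placeholders in the query body could be filled in, and how `preExpirationLimitInMs` might be used to decide when a new token is needed. The helper names `render_login_body` and `should_refresh` are hypothetical, and the refresh rule (request a new token once the current time is within the limit of the expiry time) is an assumption, not the connector's confirmed behavior.

```python
import json
from string import Template

# Body template taken from the credential example above.
BODY_TEMPLATE = '{"username" : "${username}","password" : "${password}"}'


def render_login_body(template: str, username: str, password: str) -> str:
    """Replace the ${username} and ${password} placeholders in the body."""
    # Note: values are inserted verbatim, so they must not contain
    # characters that need JSON escaping (for example double quotes).
    return Template(template).substitute(username=username, password=password)


def should_refresh(now_ms: int, expires_at_ms: int, pre_expiration_limit_ms: int) -> bool:
    """Assumed rule: refresh once we are within the limit of the expiry time."""
    return now_ms >= expires_at_ms - pre_expiration_limit_ms


body = render_login_body(BODY_TEMPLATE, "admin", "adminPassword")
print(json.loads(body))
```

With a limit of 5000 ms and a token expiring at t = 10000 ms, this sketch would request a new token from t = 5000 ms onward.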
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
type | Yes | - | No | The value must be "rest-api". | "rest-api" |
description | Yes | - | No | Name of the connection object. | "My REST Connection" |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["17f75ce7d0c7", "d42780003b36"] |
credential | Yes | - | No | Id of the credential | "6b235b333a1b" |
properties | Yes | - | No | Configuration object | |
baseUrl | Yes | - | No | Base URL of your REST service API | "https://your-service/api/v2/" |
trustAllCertificates | Yes | false | No | If selected, no HTTPS certificate validation will be done. | true |
{ "type": "rest-api", "description": "Rest conn 3", "credential": "0b6fd9c8-d722-4874-aca1-e57c6eff2089", "properties": { "baseUrl": "http://aspire_manager:50443/aspire/_api" } }
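As an illustration of how the fields in the table map onto the JSON payload, the following Python sketch assembles a create-connection object. `build_connection` is a hypothetical helper written for this example; how the payload is submitted (endpoint, HTTP method) is outside the scope of this sketch.

```python
import json
from typing import Optional


def build_connection(description: str, credential_id: str, base_url: str,
                     trust_all_certificates: bool = False,
                     throttle_policy: Optional[str] = None,
                     routing_policies: Optional[list] = None) -> dict:
    """Assemble a create-connection payload from the documented fields."""
    payload = {
        "type": "rest-api",  # the table requires this fixed value
        "description": description,
        "credential": credential_id,
        "properties": {
            "baseUrl": base_url,
            "trustAllCertificates": trust_all_certificates,
        },
    }
    # Optional fields are only included when a value was supplied.
    if throttle_policy is not None:
        payload["throttlePolicy"] = throttle_policy
    if routing_policies:
        payload["routingPolicies"] = list(routing_policies)
    return payload


connection = build_connection(
    "Rest conn 3",
    "0b6fd9c8-d722-4874-aca1-e57c6eff2089",
    "http://aspire_manager:50443/aspire/_api",
)
print(json.dumps(connection, indent=2))
```

Leaving optional keys out entirely, rather than sending null values, mirrors the shape of the example payload above.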
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | Id of the connection to update | "d442adcab4b0" |
description | No | - | No | Name of the connection object. | "My REST Connection" |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this connection object. | "b3a9-6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this connection will use. | ["17f75ce7d0c7", "d42780003b36"] |
credential | No | - | No | Id of the credential | "6b235b333a1b" |
properties | No | - | No | Configuration object | |
(see create connection) |
{ "id": "89d6632a-a296-426c-adb0-d442adcab4b0", "description": "REST connection", "properties": { "baseUrl": "http://aspire_manager:50443/aspire/_api" } }
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
seed | Yes | - | No | N/A | "N/A" |
type | Yes | - | No | The value must be "rest-api". | "rest-api" |
description | Yes | - | No | Name of the seed object. | "My REST Seed" |
connector | Yes | - | No | The id of the connector to be used with this seed. The connector type must match the seed type. | "e3ca414b0d31" |
connection | Yes | - | No | The id of the connection to be used with this seed. The connection type must match the seed type. | "e4a663fe9ee6" |
workflows | No | [ ] | Yes | The ids of the workflows that will be executed for the documents crawled. | ["5696c3f0bda4"] |
throttlePolicy | No | - | No | Id of the throttle policy that applies to this seed object. | "6b235b333a1b" |
routingPolicies | No | [ ] | Yes | The ids of the routing policies that this seed will use. | ["17f75ce7d0c7", "d42780003b36"] |
tags | No | [ ] | Yes | The tags of the seed. These can be used to filter the seed | ["tag1", "tag2"] |
properties | Yes | - | No | Configuration object | |
crawlRules | Yes | - | Yes | Crawl rules | |
condition | No | - | No | Groovy script that determines whether a given item should execute this set of queries. For example, item.getType().toString().equals('root') matches the root item, and item.getType().toString().equals('entity') matches any entity extracted from a scan. | "item.getType().toString().equals('root')" |
shouldStop | No | false | No | If selected, then no other queries will be executed for the given item. | true |
shouldIndex | No | false | No | If selected, the item matching this crawl rule will be indexed. | true |
queries | No | - | Yes | Crawl rules: queries to execute inside the rule | |
urlTemplate | Yes | - | No | The query to execute. If ${metadataParameter} is found inside the field, it is replaced with a specific value (for example, from the scan result entity) | "/serviceEndpoint/${name}" |
method | Yes | - | No | HTTP method. Options: GET, POST, PUT | "GET" |
body (if method POST or PUT) | Yes (if method is POST or PUT) | - | No | The body of the POST or PUT request. Can include parameters to be replaced, such as ${param1.paramA} | "{\"username\" : \"${username}\",\"password\" : \"${password}\"}" |
contentType (if method POST or PUT) | No | json | No | The body MIME type: json/xml/text | "xml" |
queryType | Yes | - | No | The query type: scan/metadataExtraction/binaryFetch | "scan" |
Scan | |||||
childrenPath | No | response | No | Extraction path. The path to the response array that contains the children to extract. For example, if the response comes as {"response":{"entities":[{1},{2},{..},{n}]}}, then response.entities should be used. If the array is the response itself, leave this field empty | "response.entities" |
idField | Yes | - | No | Child ID field. Field within each child holding its ID. For example if each child has the following structure: {"entity":{"entityId":"abc-ef-1234"}, "att1":"val1"} then entity.entityId should be used | "entity.entityId" |
signatureFields | No | - | Yes | Scan: incremental configuration signature fields | |
path | Yes | - | No | Signature JSON path (e.g. $.attribute): the JSON path used to extract the fields that make up the signature. See https://github.com/json-path/JsonPath for JsonPath documentation | "$.attribute" |
Scan: Extended signatures | |||||
extendedSignature | No | false | No | Use this option if extra requests must be executed to obtain the metadata needed to detect modifications properly. Use it carefully, as it decreases incremental crawl performance linearly. | true |
queries | No | - | Yes | Scan: extended signature queries | |
queryType | Yes | - | No | Query type. Must be "metadataExtraction" | "metadataExtraction" |
urlTemplate | Yes | - | No | The query to execute | "/serviceEndpoint/${metadataParameter}" |
method | Yes | - | No | HTTP method. Options: GET, POST, PUT | "GET" |
body (if method POST or PUT) | Yes (if method is POST or PUT) | - | No | The body of the POST or PUT request. Can include parameters to be replaced, such as ${param1.paramA} | "{\"username\" : \"${username}\",\"password\" : \"${password}\"}" |
contentType (if method POST or PUT) | No | json | No | The body MIME type: json/xml/text | "xml" |
signatureFields | No | - | Yes | Signature fields | |
path | Yes | - | No | Signature JSON path (e.g. $.attribute): the JSON path used to extract the fields that make up the signature. See https://github.com/json-path/JsonPath for JsonPath documentation | "$.attribute" |
resultField | Yes | - | No | Internal name of the metadata field into which the results will be extracted | |
Scan: Pagination | |||||
hasPagination | No | false | No | Enable pagination | true |
pageSize | No | 300 | No | The maximum number of entries the query retrieves per page | 100 |
totalField | No | - | No | Path to the total field in the response. If the response has {"response":{"totalEntities":50000, "entities":[...]}}, then response.totalEntities should be used | "response.totalEntities" |
queryParameters | Yes (if pagination is controlled by query params) | - | No | Parameters template for pagination | start=${pagination.offset}&pageSize=300 |
Metadata extraction | |||||
resultField | Yes | - | No | Internal name of the metadata field into which the results will be extracted | "someField" |
cacheSize | No | 100 | No | Maximum cache size for requests | 200 |
cacheExpiration | No | 3600 | No | Cache expiration in seconds | 60 |
{ "seed": "N/A", "description": "REST seed", "connector": "93c16011-562d-4aba-a57d-31a945b3f8e5", "connection": "0ed33b76-e0ea-4ff0-ba1e-dcd25a3024c6", "type": "rest-api", "properties": { "trustAllCertificates": true, "crawlRules": [ { "condition": "item.getType().toString().equals('root')", "shouldStop": false, "shouldIndex": false, "queries": [ { "urlTemplate": "/connectors", "method": "GET", "queryType": "scan", "scan": { "childrenPath": "connector", "idField": "id" } } ] }, { "condition": "true", "shouldStop": false, "shouldIndex": true }, { "condition": "false", "shouldStop": false, "shouldIndex": false } ] } }
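The `childrenPath` and `idField` settings of a scan query are easier to picture with a small example. The Python sketch below walks dotted paths through a parsed JSON response in the way the table describes. The helpers `resolve_path` and `extract_child_ids` are hypothetical, and the second entity in the sample response is made-up data; the connector's real extraction logic may differ.

```python
from typing import Any, List


def resolve_path(data: Any, dotted_path: str) -> Any:
    """Walk a dotted path such as 'response.entities' through nested dicts."""
    if not dotted_path:
        return data  # empty path: the response itself is the array
    current = data
    for key in dotted_path.split("."):
        current = current[key]
    return current


def extract_child_ids(response: Any, children_path: str, id_field: str) -> List[Any]:
    """Find the children array, then read the ID field from each child."""
    children = resolve_path(response, children_path)
    return [resolve_path(child, id_field) for child in children]


# Sample response shaped like the examples in the table (values are invented).
sample = {
    "response": {
        "entities": [
            {"entity": {"entityId": "abc-ef-1234"}, "att1": "val1"},
            {"entity": {"entityId": "abc-ef-5678"}, "att1": "val2"},
        ]
    }
}

ids = extract_child_ids(sample, "response.entities", "entity.entityId")
print(ids)  # prints ['abc-ef-1234', 'abc-ef-5678']
```

An empty `children_path` treats the whole response as the array, which corresponds to the "leave this field empty" case in the table.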
Field | Required | Default | Multiple | Notes | Example |
---|---|---|---|---|---|
id | Yes | - | No | Id of the seed to update | "2f287669-d163-4e35-ad17-6bbfe9df3778" |
(see "Create seed" for the other fields) |
{ "id": "2f287669-d163-4e35-ad17-6bbfe9df3778", "seed": "N/A", "description": "REST seed", "connector": "93c16011-562d-4aba-a57d-31a945b3f8e5", "connection": "0ed33b76-e0ea-4ff0-ba1e-dcd25a3024c6", "properties": { "trustAllCertificates": true, "crawlRules": [ { "condition": "item.getType().toString().equals('root')", "shouldStop": false, "shouldIndex": false, "queries": [ { "urlTemplate": "/connectors", "method": "GET", "queryType": "scan", "scan": { "childrenPath": "connector", "idField": "id" } } ] }, { "condition": "true", "shouldStop": false, "shouldIndex": true }, { "condition": "false", "shouldStop": false, "shouldIndex": false } ] } }
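The pagination fields in the seed table (`pageSize`, `totalField`, `queryParameters`) can be pictured with a short Python sketch that expands the `queryParameters` template once per page. It assumes the offset starts at 0, advances by the page size, and stops once it reaches the total read from `totalField`; these rules, and the helper names `render` and `page_queries`, are illustrative assumptions rather than documented connector behavior.

```python
import re
from typing import Dict, Iterator

# Matches ${name}; names may contain dots, e.g. ${pagination.offset}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def render(template: str, values: Dict[str, object]) -> str:
    """Replace each ${name} placeholder with the matching value."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


def page_queries(template: str, total: int, page_size: int) -> Iterator[str]:
    """Yield one query string per page, stepping the offset by page_size."""
    for offset in range(0, total, page_size):
        yield render(template, {"pagination.offset": offset})


pages = list(page_queries("start=${pagination.offset}&pageSize=300",
                          total=750, page_size=300))
print(pages)
# prints ['start=0&pageSize=300', 'start=300&pageSize=300', 'start=600&pageSize=300']
```

A custom regular expression is used here because placeholder names such as `pagination.offset` contain a dot, which Python's `string.Template` does not accept by default.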