...

Code Block
title: connection_transform_matrix.json
{"default":{
  "scanRecursively": true,
  "stopCrawlOnScannerError": true,
  "filterNoCrawl": false,
  "azureADSeed": ""
},

...

Multiple Starting Points

Multiple starting points are a feature of connectors version 4.0. Connectors version 5.0 do not have this feature, because the same functionality is now provided by seeds and connections.

In connectors version 4.0 there are two ways to add multiple starting points: by UI and by file.

By UI: the user adds URLs in the UI, and they are stored in the file content-source.xml:


Code Block
# sharepoint connector
<siteCollectionsToCrawl>
    <siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
    <siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
</siteCollectionsToCrawl>

# smb connector
<urls>smb://10.89.26.106:445/</urls>
<urls>smb://10.89.26.105:445/</urls>

# filesystem connector
<urls>c:\tmp</urls>
<urls>c:\Users</urls>

The XML tags that hold URLs are not standardized across connectors. A mapping mechanism in *_transform_matrix.json was created to deal with this:


Code Block
# sharepoint connector 
"transform":{
  "conn_url_prop_name": "serverUrl",
  "source_url_prop_name": "siteCollectionsToCrawl:siteCollectionUrl",
...

# smb connector
"transform":{
  "conn_url_prop_name": "host",
  "source_url_prop_name": "urls",

# file system connector
"transform":{
  "conn_url_prop_name": "url",
  "source_url_prop_name": "urls",

The keys "conn_url_prop_name" and "source_url_prop_name" point to the JSON connection property name and to the XML tag in content-source.xml, respectively.
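A minimal sketch of how a "source_url_prop_name" value could be resolved against content-source.xml. It is an illustration, not the actual migration script. The assumptions: the colon in "siteCollectionsToCrawl:siteCollectionUrl" separates a parent tag from a child tag, and the XML has a single root element (the name contentSource and the function extract_source_urls are hypothetical).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real root element of content-source.xml may differ.
SAMPLE_XML = """<contentSource>
  <siteCollectionsToCrawl>
    <siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
  </siteCollectionsToCrawl>
  <urls>smb://10.89.26.106:445/</urls>
  <urls>smb://10.89.26.105:445/</urls>
</contentSource>"""


def extract_source_urls(xml_text, source_url_prop_name):
    """Return the text of every element matched by source_url_prop_name.

    Assumption: ':' separates nested tag names, outermost first,
    e.g. "siteCollectionsToCrawl:siteCollectionUrl".
    """
    root = ET.fromstring(xml_text)
    # Turn "a:b" into the ElementTree path ".//a/b".
    path = ".//" + "/".join(source_url_prop_name.split(":"))
    values = []
    for element in root.findall(path):
        text = (element.text or "").strip()
        if text:  # skip empty tags
            values.append(text)
    return values


if __name__ == "__main__":
    print(extract_source_urls(SAMPLE_XML, "urls"))
    print(extract_source_urls(SAMPLE_XML, "siteCollectionsToCrawl:siteCollectionUrl"))
```

With this approach a flat tag such as "urls" and a nested one such as "siteCollectionsToCrawl:siteCollectionUrl" go through the same lookup, so each connector only needs its own entry in the transform matrix.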


By file:

The user enters only the path to a file in the UI, and content-source.xml then contains the XML tag:

Code Block
<fileUrl>C:\tmp\ups.txt</fileUrl>

 

The script reads the URLs from that file, splits each one into a connection part and a seed part, and creates them through API calls.
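The splitting step can be sketched as follows. This is only an illustration under stated assumptions, not the actual script: the file is assumed to contain one URL per line, the connection part is assumed to be scheme plus host (and port), and the seed part is assumed to be the remaining path. The function names are hypothetical, and the API calls themselves are left out because their endpoints are not described here.

```python
from urllib.parse import urlsplit


def parse_url_lines(text):
    """Return the non-empty, stripped lines of the URL file content.

    Assumption: the file lists one URL per line.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_url(url):
    """Split a URL into (connection part, seed part).

    Assumption: connection = scheme://host[:port], seed = path.
    Values without scheme and host (for example a Windows path) are
    returned unchanged as the connection part with an empty seed.
    """
    parts = urlsplit(url)
    if parts.scheme and parts.netloc:
        connection = f"{parts.scheme}://{parts.netloc}"
        seed = parts.path or "/"
        return connection, seed
    return url, ""


if __name__ == "__main__":
    content = "smb://10.89.26.106:445/\nsmb://10.89.26.105:445/\n"
    for url in parse_url_lines(content):
        print(split_url(url))
```

Each resulting pair would then be passed to whatever API calls create the connection and its seed.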


...