The connector template migration script is written in Python 3.11.

It migrates connectors from Aspire 4.0 to Aspire 5.0.

The script is designed to be robust and easily extensible using templates with transformation matrices.

It can currently migrate the following connector types: Filesystem, SMB, SharePoint Online.

Workflows are currently ignored and are not migrated.

The following features are implemented:

  • changing the version of the template
  • passing connectionId, connectorId, workflowId and policyId to the seed, connection and credential
  • passing a list of tags
  • default values for properties that do not exist in 4.0
  • multiple starting points: they are split into multiple seeds and connections; the script reads the file and creates a seed for each line
  • splitting a URL into a connection part and a seed part
  • passing a suffix of the HTTP object description as a parameter

Repository

aspire-migration-component


Python libraries used

os.path, json, urllib.parse (urlparse), xmltodict, urllib3, zipfile, sys, datetime, logging, argparse

Step-by-step guide

  1. Run Aspire 4.0 and Aspire 5.0.
  2. Download the zip file of the connector from Aspire 4.0.
  3. Set the Aspire 5.0 URL in the file __init__.py. If you are using private certificates, the certificates must be set up there as well.

URL_ASPIRE = "http://localhost:50505/aspire/_api" 
# --- https with certificates
# CA_CERTS = ('client-2048.crt', 'client-2048.key')
# http = urllib3.PoolManager(cert_reqs='REQUIRED', ca_certs=CA_CERTS) 

  4. Run the script.

python __init__.py -p smb_connector-content-source.zip

  5. After a few seconds the connector is migrated to Aspire 5.0, and the objects "seed", "connection", "connector" and "credential" are created with the description "migration-{date}".

You can also check the migration report "migration-{date}.log" in the "reports" directory to see exactly what happened.
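
The naming of the description and of the report file can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `migration_stamp` and `report_path` are hypothetical, and the timestamp format is inferred from the sample descriptions on this page (for example "2023-01-26_11-58-48").

```python
import os
from datetime import datetime

# Assumption: format inferred from descriptions such as
# "migration smb-2023-01-26_11-58-48" shown on this page.
STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def migration_stamp(now: datetime) -> str:
    """Format the migration time the way it appears in descriptions."""
    return now.strftime(STAMP_FORMAT)


def report_path(now: datetime, reports_dir: str = "reports") -> str:
    """Build the path of the report file inside the reports directory."""
    return os.path.join(reports_dir, f"migration-{migration_stamp(now)}.log")
```

Calling `report_path(datetime.now())` would give the file to open in the "reports" directory for the current run.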

Help

Information about the script is available with the command

python __init__.py -h

Output

usage: __init__.py [-h] [-d DESC] [-v VER] [-p PATH] [-t TAGS] [-i IDS]

Migration ASPIRE script. Migrate connector from ASPIRE version 4.0 to ASPIRE
version 5.0

options:
  -h, --help            show this help message and exit
  -d DESC, --desc DESC  Custom description of all migrated ASPIRE objects
                        (seed, connector, connection, credential)
  -v VER, --ver VER     Version of transformation matrix.
  -p PATH, --path PATH  Path to connector file zip.
  -t TAGS, --tags TAGS  Tags for seed.
  -i IDS, --ids IDS     Ids with should be used in seed or connection

Check the log after the migration. All is well that ends well.
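
The options in the help output could be declared with argparse roughly as shown here. This is a reconstruction from the help text only, not the script's source: `build_parser` is a hypothetical name, and no default values are assumed because the help output does not show any.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose options mirror the help output above."""
    parser = argparse.ArgumentParser(
        prog="__init__.py",
        description="Migration ASPIRE script. Migrate connector from ASPIRE "
                    "version 4.0 to ASPIRE version 5.0",
    )
    parser.add_argument("-d", "--desc",
                        help="Custom description of all migrated ASPIRE objects "
                             "(seed, connector, connection, credential)")
    parser.add_argument("-v", "--ver", help="Version of transformation matrix.")
    parser.add_argument("-p", "--path", help="Path to connector file zip.")
    parser.add_argument("-t", "--tags", help="Tags for seed.")
    parser.add_argument("-i", "--ids",
                        help="Ids with should be used in seed or connection")
    return parser
```

For example, `build_parser().parse_args(["-v", "4.1", "-p", "smb_connector-content-source.zip"])` yields a namespace with `ver` and `path` set and the remaining options left as `None`.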


Ability to change version of template

Templates are in the directory "migration_template".

The project structure looks similar to the layout below.

-- migration
-- migration_template
   -- 4.1
   -- default
      -- aspire-filesystem-source
      -- aspire-sharepointonline-source
      -- aspire-smb-source
         -- type_transform_matrix.json

Connector properties can differ between versions (4.0, 4.1, 5.0, 5.1, etc.). To work with a version other than "default", create a new directory in "migration_template", for example "4.1", and copy everything from the "default" directory into it. Then make the required changes in the "*_transform_matrix.json" files of the connector concerned.

When the changes are done, run the script with the additional parameter "-v [name of new directory]":

python __init__.py -v 4.1 -p smb_connector-content-source.zip
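
The copy step and the lookup of a version directory can be sketched as follows. Both helper names (`create_template_version`, `template_dir`) are hypothetical, and the sketch only assumes the layout described above: one directory per version under "migration_template", with "default" as the starting point.

```python
import shutil
from pathlib import Path


def create_template_version(template_root: Path, version: str) -> Path:
    """Copy the "default" templates into a new version directory."""
    target = template_root / version
    # copytree fails if the target already exists, which protects
    # an existing version directory from being overwritten.
    shutil.copytree(template_root / "default", target)
    return target


def template_dir(template_root: Path, version: str = "default") -> Path:
    """Return the template directory for a version, or fail clearly."""
    path = template_root / version
    if not path.is_dir():
        raise FileNotFoundError(f"No template directory for version {version!r}: {path}")
    return path
```

After `create_template_version(Path("migration_template"), "4.1")`, the matrices under "migration_template/4.1" can be edited without touching the defaults.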


Ability to pass ids to seed, connection, credential

Some objects may already exist in Aspire 5, and we may want to use them for the newly migrated connector. In that case, create a JSON file with the structure shown in the example below.

File "ids.json" content:

{
   "Seed":{
      "workflows":[
         "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
      ],
      "connector":"a3ffa705-cc09-44f2-9233-ea6a5242491c",
      "connection":"2f31049e-da32-45cf-af49-dbc203d00c75",
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   },
   "Connection":{
      "credential":"aa56ec21-1459-469e-b347-c95199c2d95b",
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   },
   "Credential":{
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   }
}

The file above only illustrates which ids can be set. In a real situation, "connection" must be removed from "Seed" if the "Connection" section should also be applied, and likewise "credential" must be removed from "Connection" if the "Credential" section should be applied, because an object referenced by id replaces the one the script would otherwise create.

When the file is ready, run the script with the additional parameter "-i [filepath]":

python __init__.py -i ids.json -p smb_connector-content-source.zip

Output

#credential json body
{
  "properties": {
    "domain": "ad",
    "username": "svcRS01VF4D1FS01_SMB",
    "password": "encrypted:E52FA744FE5F255D982B6AEF12BCBA0F20A3E725DDD2D3D74A6FE5A70A931FCD"
  },
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 
# connection json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "properties": {
    "host": "smb://10.89.26.105:445/",
   ...
	
  },
  "credential": "babe0cc5-7223-4ed5-80c1-fa992d9a6a2f",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 
# seed json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "connector": "ec658f0b-1ea1-407f-8b8c-1b5c38da09f8",
  "connection": "e6cf3d33-1914-48c0-9b2f-89d45e8adcbf",
  "seed": "/",
  "properties": {
    "seed": "/"
  },
  "tags": null,
  "workflows": [
    "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
  ],
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 
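
How the ids end up in the request bodies can be sketched as a simple merge. This is an assumption about the behaviour based on the output above, not the script's code; `apply_ids` and `must_create` are hypothetical helpers.

```python
def apply_ids(body: dict, ids: dict, object_type: str) -> dict:
    """Return a copy of body with the ids for object_type merged in.

    object_type is one of the top-level keys of ids.json:
    "Seed", "Connection" or "Credential".
    """
    merged = dict(body)
    merged.update(ids.get(object_type, {}))
    return merged


def must_create(ids: dict, parent_type: str, reference: str) -> bool:
    """True when the referenced object is not given by id and so has to be created.

    Example: must_create(ids, "Seed", "connection") is False when ids.json
    already names an existing connection for the seed.
    """
    return reference not in ids.get(parent_type, {})
```

With the sample "ids.json" as shown, `must_create(ids, "Seed", "connection")` is false, which is why "connection" has to be removed from "Seed" when a new connection should be created.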


Ability to pass a list of tags

Please see the article below about tags.

Routing Policies - Aspire 5.0 - Confluence (accenture.com)

Run the script with the additional parameter "-t [tag1,tag2,..]":

python __init__.py -t EU,US -p smb_connector-content-source.zip

Output

{
  "type": "smb",
  "description": "migration smb-2023-01-26_12-45-23",
  "connector": "52970e8c-2426-4967-80e7-9100a3ed3ba4",
  "connection": "00a71baa-2d4f-4e87-9c7b-bec59f092dc9",
  "seed": "/",
  "properties": {
    "seed": "/"
  },
  "tags": [
    "EU",
    "US"
  ]
} 
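
Turning the "-t" value into the "tags" list of the seed can be sketched like this. `parse_tags` is a hypothetical helper; returning `None` when no tags are given is an assumption that matches the `"tags": null` seen in the earlier seed output.

```python
from typing import Optional


def parse_tags(raw: Optional[str]) -> Optional[list]:
    """Split a comma-separated tag string such as "EU,US" into a list."""
    if not raw:
        return None
    tags = [tag.strip() for tag in raw.split(",") if tag.strip()]
    return tags or None
```

Stripping whitespace means that `-t "EU, US"` and `-t EU,US` give the same list.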

Default values for properties which are not in 4.0

Default values are part of the "*_transform_matrix.json" files and can be added for the seed, connection, connector and credential.

connection_transform_matrix.json
{"default":{
  "scanRecursively": true,
  "stopCrawlOnScannerError": true,
  "filterNoCrawl": false,
  "azureADSeed": ""
},

...
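
Applying the "default" block can be sketched as a dictionary merge. `with_defaults` is a hypothetical helper, and the rule that values migrated from 4.0 take precedence over defaults is an assumption.

```python
def with_defaults(migrated: dict, matrix: dict) -> dict:
    """Fill in properties missing from the 4.0 data with matrix defaults.

    Assumption: a value migrated from 4.0 wins over the default.
    """
    return {**matrix.get("default", {}), **migrated}
```

For example, with the matrix above, a connection migrated with only `{"scanRecursively": False}` keeps that value and gains the other three default properties.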

Multiple Starting Points

Multiple starting points are a feature of version 4.0 connectors. Version 5.0 connectors do not have this feature, because the functionality is now provided by seeds and connections.

Version 4.0 connectors offer two ways to add multiple starting points: through the UI and through a file.

By UI: the user adds the URLs in the UI, and they appear in the file content-source.xml:


# sharepoint connector
<siteCollectionsToCrawl>
    <siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
    <siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
</siteCollectionsToCrawl>

#smb connector
<urls>smb://10.89.26.106:445/</urls>
<urls>smb://10.89.26.105:445/</urls>

#filesystem connector
<urls>c:\tmp</urls>
<urls>c:\Users</urls>

The URL XML tags are not standardized across connectors. A mechanism in "*_transform_matrix.json" deals with this.


# sharepoint connector 
"transform":{
  "conn_url_prop_name": "serverUrl",
  "source_url_prop_name": "siteCollectionsToCrawl:siteCollectionUrl",
...

#smb connector
"transform":{
  "conn_url_prop_name": "host",
  "source_url_prop_name": "urls",
..

#file system connector
"transform":{
  "conn_url_prop_name": "url",
  "source_url_prop_name": "urls",
..

The keys "conn_url_prop_name" and "source_url_prop_name" point to the JSON connection property name and to the XML tag in content-source.xml, respectively.
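
Reading the start URLs through "source_url_prop_name" can be sketched as follows. The sketch uses the standard library's ElementTree instead of xmltodict to stay self-contained, assumes that ":" separates nested tag names (as in "siteCollectionsToCrawl:siteCollectionUrl"), and uses a hypothetical function name and a hypothetical `<config>` wrapper element in the examples.

```python
import xml.etree.ElementTree as ET


def read_source_urls(xml_text: str, source_url_prop_name: str) -> list:
    """Collect the text of every element addressed by source_url_prop_name.

    Assumption: "a:b" means element <b> nested directly inside <a>.
    """
    root = ET.fromstring(xml_text)
    # ".//a/b" matches <b> children of any <a> below the root element.
    path = ".//" + "/".join(source_url_prop_name.split(":"))
    return [el.text.strip() for el in root.findall(path) if el.text]
```

The same function covers the flat "urls" tag of the SMB and filesystem connectors and the nested SharePoint tag, so only the matrix value differs per connector.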


By file:

The user enters only the path to the file in the UI, and the file content-source.xml contains the corresponding XML tag:

# sharepoint connector
<seedsFilePath>${aspire.config.dir}/${app.name}/urls.txt</seedsFilePath>

#smb connector
<fileUrl>C:\tmp\ups.txt</fileUrl>

#filesystem connector
<fileUrl>C:\tmp\ups.txt</fileUrl>


The file path XML tag is not standardized across connectors either, so a mechanism similar to the one for URLs deals with it.


# sharepoint connector
"transform":{
  "source_fileurl_prop_name": "seedsFilePath",
...
#smb connector
"transform":{
  "source_fileurl_prop_name": "fileUrl",
...
#filesystem connector
"transform":{
  "source_fileurl_prop_name": "fileUrl",
...
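
Reading the starting points from such a file, one per line, can be sketched as below. `read_start_urls` is a hypothetical name; skipping blank lines and assuming UTF-8 are assumptions, and placeholders such as `${aspire.config.dir}` in the configured path are not resolved by this sketch.

```python
from pathlib import Path


def read_start_urls(file_path) -> list:
    """Return the non-empty lines of the file; each one becomes a seed."""
    lines = Path(file_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```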


The script reads the URLs from the "content-source.xml" file or from the text file, splits each of them into a connection part and a seed part, and creates the connections and seeds through HTTP API calls.
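
The split into a connection part and a seed part can be sketched with `urllib.parse`. This is an illustration under assumptions, not the script's actual logic: the helper names are hypothetical, the connection part is taken to be scheme plus host and port (matching the "host" value "smb://10.89.26.105:445/" in the sample output), and grouping seeds under one connection per distinct host is a guess. Plain filesystem paths such as c:\tmp are not covered.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple:
    """Split a URL into (connection part, seed part)."""
    parts = urlsplit(url)
    connection = f"{parts.scheme}://{parts.netloc}/"
    seed = parts.path or "/"
    return connection, seed


def plan_objects(urls: list) -> dict:
    """Group seed paths by the connection they belong to."""
    plan = {}
    for url in urls:
        connection, seed = split_url(url)
        plan.setdefault(connection, []).append(seed)
    return plan
```

Each key of the resulting dictionary would correspond to one connection to create, and each path in its list to one seed.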


