...

  • ability to change the version of a template
  • ability to pass connectionId, connectorId, workflowId and policyId to the seed, connection and credential
  • ability to pass a list of tags
  • default values for properties that do not exist in 4.0
  • multiple Starting Points are split into multiple seeds and connections: the script reads a file and creates a seed for each line
  • ability to split a URL into a connection and a seed
  • ability to pass a suffix for the HTTP object description as a parameter
  • ability to work with relative paths
  • ability to use the crawl path/folder as the Seed name and the host/URL as the Connection name by passing an argument

Repository

aspire-migration-component

Installation

For Developers

  • Install Python 3.11
  • Install PyCharm Community

...

Install the missing libraries urllib3, wheel, xmltodict and orjson using the "+" button.

Windows cmd

Install Python 3.11

Add the following entries to Path in the environment variables:

C:\Users\#{username}\AppData\Local\Programs\Python\Python311\

C:\Users\#{username}\AppData\Local\Programs\Python\Python311\Scripts

Upgrade pip

python -m pip install --upgrade pip --trusted-host pypi.org

Install virtualenv

pip install virtualenv --trusted-host pypi.org


Create a virtual environment in the Python project directory that contains the __init.py file

virtualenv venv

Activate the virtual environment

venv\Scripts\activate

Install the necessary libraries

pip install urllib3 --trusted-host pypi.org

pip install wheel --trusted-host pypi.org

pip install xmltodict --trusted-host pypi.org

pip install orjson --trusted-host pypi.org



Python libraries used

os.path, json, urlparse, xmltodict, urllib3, zipfile, sys, datetime, logging, argparse

...

Code Block
{
   "Seed":{
      "workflows":[
         "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
      ],
      "connector":"a3ffa705-cc09-44f2-9233-ea6a5242491c",
      "connection":"2f31049e-da32-45cf-af49-dbc203d00c75",
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   },
   "Connection":{
      "credential":"aa56ec21-1459-469e-b347-c95199c2d95b",
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   },
   "Credential":{
      "deleteIncrementalPolicy":"e6165aa0-fb7d-490a-98da-f90032f0b886",
      "throttlePolicy":"c080798a-fd43-484b-bafb-2891b555ab3e",
      "routingPolicies":[
         "c6ed5858-4e74-443d-a303-a33ff4099115"
      ]
   }
}

The file content above only illustrates which IDs can be set. In a real run, "connection" must be removed from "Seed" if the script should also set up the connection, and likewise "credential" must be removed from "Connection" if the script should also set up the credential. When an ID is given here, the script uses the referenced object instead of creating a new one.
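
The effect of the ID file on a request body can be sketched as follows. This is only an illustration under assumptions, not the script's actual code: the helper name apply_ids and the placeholder value "id-created-by-script" are hypothetical. The sketch shows why "connection" has to be left out of "Seed" when the connection is created during the run: any key present in the ID file overrides the value already in the body.

```python
import copy


def apply_ids(body, ids, section):
    """Return a copy of `body` with the IDs configured for `section` applied.

    `section` is "Seed", "Connection" or "Credential", matching the
    top-level keys of the ID file. Hypothetical helper for illustration.
    """
    merged = copy.deepcopy(body)
    # Keys from the ID file win over keys already present in the body.
    merged.update(copy.deepcopy(ids.get(section, {})))
    return merged


# Reduced ID file: "connection" is intentionally absent from "Seed".
ids = {
    "Seed": {
        "workflows": ["2745cfbe-0a2f-4d19-ac02-51e7c6d68898"],
        "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
    }
}

# Placeholder ID standing in for a connection created earlier in the run.
seed_body = {"type": "smb", "seed": "/", "connection": "id-created-by-script"}
result = apply_ids(seed_body, ids, "Seed")
```

Because "connection" is not listed under "Seed", the ID of the freshly created connection survives in result, while the workflow and policy IDs come from the file.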

...

Code Block
# credential json body
{
  "properties": {
    "domain": "ad",
    "username": "svcRS01VF4D1FS01_SMB",
    "password": "encrypted:E52FA744FE5F255D982B6AEF12BCBA0F20A3E725DDD2D3D74A6FE5A70A931FCD"
  },
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 
# connection json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "properties": {
    "host": "smb://10.89.26.105:445/",
   ...
	
  },
  "credential": "babe0cc5-7223-4ed5-80c1-fa992d9a6a2f",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 
# seed json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "connector": "ec658f0b-1ea1-407f-8b8c-1b5c38da09f8",
  "connection": "e6cf3d33-1914-48c0-9b2f-89d45e8adcbf",
  "seed": "/",
  "properties": {
    "seed": "/"
  },
  "tags": null,
  "workflows": [
    "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
  ],
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} 

...

Code Block
# sharepoint connector
"transform":{
  "source_fileurl_prop_name": "seedsFilePath",
...
#smb connector
"transform":{
  "source_fileurl_prop_name": "fileUrl",
...
#filesystem connector
"transform":{
  "source_fileurl_prop_name": "fileUrl",
...

The script reads URLs from the "content-source.xml" file or from a txt file and creates several connections and seeds through HTTP API calls.
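
Reading the txt variant could look roughly like the sketch below. It assumes one URL per line and skips blank lines; the function names read_start_urls and build_seed_bodies are hypothetical, and the seed body is heavily reduced compared with the real request bodies.

```python
from pathlib import Path


def read_start_urls(path):
    """Return the non-empty, stripped lines of a text file (one URL per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_seed_bodies(urls, connector_type):
    """Create one reduced seed body per URL (illustrative fields only)."""
    return [{"type": connector_type, "seed": url} for url in urls]
```

Each returned body would then be sent in its own HTTP API call, which is not shown here.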

Ability to split a URL into the connection and the seed

The script reads a URL, splits it into a connection part and a seed part, and creates the connection and the seed.

For example, the Filesystem URL C:\tmp/abc is split into the drive part C:\ (connection) and the directory part /tmp/abc (seed).

The Sharepoint URL https://cao365.sharepoint.com/sites/qasite/davidgtest is split into the scheme part https://, the hostname part cao365.sharepoint.com and the path part /sites/qasite/davidgtest.

The scheme and the hostname go into the connection; the path goes into the seed.

The SMB URL smb://10.89.26.106:445/ is split in almost the same way, except that the scheme is not needed in this case and is therefore not used. To control whether the scheme is used, set the following property in the file connection_transform_matrix.json:

"url_scheme_included":"false".

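The splitting rules described above can be sketched in a few lines. This is a minimal illustration, not the script's implementation: the function name split_url is hypothetical, and the boolean parameter url_scheme_included only mirrors the "url_scheme_included" property, which the JSON file stores as a string.

```python
import re
from urllib.parse import urlsplit


def split_url(url, url_scheme_included=True):
    """Split a URL into a (connection_part, seed_part) tuple."""
    # Windows drive path such as C:\tmp/abc: drive -> connection, rest -> seed.
    drive = re.match(r"^([A-Za-z]:)[\\/]*(.*)$", url)
    if drive:
        rest = drive.group(2).replace("\\", "/")
        return drive.group(1) + "\\", "/" + rest

    # Everything else is treated as scheme://host[:port]/path.
    parts = urlsplit(url)
    host = parts.netloc
    connection = f"{parts.scheme}://{host}" if url_scheme_included else host
    return connection, parts.path or "/"


fs = split_url("C:\\tmp/abc")
sp = split_url("https://cao365.sharepoint.com/sites/qasite/davidgtest")
smb = split_url("smb://10.89.26.106:445/", url_scheme_included=False)
```

With these inputs, fs holds the drive and the directory, sp holds scheme plus hostname and the path, and smb holds the host with port (no scheme) and "/".
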
Ability to pass a suffix for the HTTP object description as a parameter


The script must be run with the additional parameter "-d [migr]".


Code Block
python __init.py__ -d migr -p smb_connector-content-source.zip

Ability to work with relative paths

Migration XML files can contain relative paths to other configuration files. The script can convert relative paths to absolute paths automatically, but it must then be run with the parameter "-a [absolute path to aspire 4.0 distribution]".

Code Block
python __init.py__ -a C:\aspire-4.0 -p smb_connector-content-source.zip
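
The conversion itself can be sketched with os.path. The helper name to_absolute is hypothetical, and the sketch assumes that relative paths are resolved against the directory passed with "-a"; how the script actually resolves them is not documented here.

```python
import os.path


def to_absolute(path, aspire_home):
    """Resolve `path` against `aspire_home` unless it is already absolute."""
    if os.path.isabs(path):
        return path
    # normpath collapses segments such as "." and ".." after joining.
    return os.path.normpath(os.path.join(aspire_home, path))
```

Paths that are already absolute are returned unchanged, so the function can be applied to every path found in the migration XML files.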

Ability to use the crawl path/folder as the Seed name and the host/URL as the Connection name by passing an argument

The script can set the description to the path/folder (seed) and to the host/URL (connection) when it is run with the parameter "-u".

Code Block
python __init.py__ -d migr -p smb_connector-content-source.zip -u
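
Since argparse is among the libraries used, the command-line options shown on this page could be declared roughly as below. The flags -p, -d, -a and -u come from the commands above; the dest names, the help texts and the reading of "-p" as the content source zip are assumptions made for this sketch.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the example commands (sketch)."""
    parser = argparse.ArgumentParser(description="Migration script options (sketch)")
    parser.add_argument("-p", dest="package",
                        help="content source zip (assumed meaning)")
    parser.add_argument("-d", dest="description_suffix", default=None,
                        help="suffix for the HTTP object description")
    parser.add_argument("-a", dest="aspire_home", default=None,
                        help="absolute path to the aspire 4.0 distribution")
    parser.add_argument("-u", dest="use_url_names", action="store_true",
                        help="use path/folder and host/URL as descriptions")
    return parser


# Same arguments as the last example command above.
args = build_parser().parse_args(
    ["-d", "migr", "-p", "smb_connector-content-source.zip", "-u"]
)
```

Using action="store_true" for "-u" matches the way the flag appears in the command, without a value.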