...
aspire-migration-component
In PyCharm, go to File → Settings → Python Interpreter
Set the path to the Python executable
Install the missing libraries urllib3, wheel, xmltodict, orjson with the "+" button
Install Python 3.11
Update Path in the environment variables:
C:\Users\#{username}\AppData\Local\Programs\Python\Python311\
C:\Users\#{username}\AppData\Local\Programs\Python\Python311\Scripts
Upgrade pip
python -m pip install --upgrade pip --trusted-host pypi.org
Install virtualenv
pip install virtualenv --trusted-host pypi.org
Create a virtual environment in the Python project directory that contains the __init.py file
virtualenv venv
Activate the virtual environment
venv\Scripts\activate
Install necessary libraries
pip install urllib3 --trusted-host pypi.org
pip install wheel --trusted-host pypi.org
pip install xmltodict --trusted-host pypi.org
pip install orjson --trusted-host pypi.org
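A quick way to confirm that the installation steps above worked is to ask the interpreter whether it can find each package. The following is only a minimal sketch; the helper name `missing_modules` is made up for this illustration and is not part of the migration script.

```python
from importlib.util import find_spec

# Third-party packages named in the installation steps above.
REQUIRED = ["urllib3", "wheel", "xmltodict", "orjson"]


def missing_modules(names):
    """Return the top-level module names the active interpreter cannot find."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated virtual environment, so that the check uses the same interpreter as the script.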
...
os.path, json, urlparse, xmltodict, urllib3, zipfile, sys, datetime, logging, argparse
Run Aspire 4.0 and Aspire 5.0
...
After the changes are done, the script must be run with the additional parameter "-v [name of new directory]"
...
Code Block |
---|
{
  "Seed": {
    "workflows": [
      "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
    ],
    "connector": "a3ffa705-cc09-44f2-9233-ea6a5242491c",
    "connection": "2f31049e-da32-45cf-af49-dbc203d00c75",
    "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
    "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
    "routingPolicies": [
      "c6ed5858-4e74-443d-a303-a33ff4099115"
    ]
  },
  "Connection": {
    "credential": "aa56ec21-1459-469e-b347-c95199c2d95b",
    "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
    "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
    "routingPolicies": [
      "c6ed5858-4e74-443d-a303-a33ff4099115"
    ]
  },
  "Credential": {
    "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
    "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
    "routingPolicies": [
      "c6ed5858-4e74-443d-a303-a33ff4099115"
    ]
  }
} |
This file content only illustrates which ids can be set. In a real run, "connection" must be removed from "Seed" if the connection is to be set up as well, and likewise "credential" must be removed from "Connection" if the credential is to be set up, because objects given by id are replaced by the script and not created.
When the file is ready, the script must be run with the additional parameter "-i [filepath]"
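The rule about removing "connection" from "Seed" and "credential" from "Connection" can be checked before the script is run. The sketch below is one possible reading of that rule, not the script's own logic: it assumes that the presence of a "Connection" or "Credential" section means that object is to be set up. The function names are hypothetical.

```python
import json

# (section, key that fixes an id, section whose presence means "set this object up")
# The pairs follow the note above; the check itself is an illustration only.
CONFLICTS = [
    ("Seed", "connection", "Connection"),
    ("Connection", "credential", "Credential"),
]


def find_conflicts(ids):
    """Return (section, key) pairs that fix an id for an object that is also set up."""
    found = []
    for section, key, owner in CONFLICTS:
        if owner in ids and key in ids.get(section, {}):
            found.append((section, key))
    return found


def load_ids(path):
    """Load the ids file that is passed with -i (plain JSON)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For the illustrative file shown above, `find_conflicts` would report both `("Seed", "connection")` and `("Connection", "credential")`.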
...
Code Block |
---|
# credential json body
{
  "properties": {
    "domain": "ad",
    "username": "svcRS01VF4D1FS01_SMB",
    "password": "encrypted:E52FA744FE5F255D982B6AEF12BCBA0F20A3E725DDD2D3D74A6FE5A70A931FCD"
  },
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
}
# connection json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "properties": {
    "host": "smb://10.89.26.105:445/",
    ...
  },
  "credential": "babe0cc5-7223-4ed5-80c1-fa992d9a6a2f",
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
}
# seed json body
{
  "type": "smb",
  "description": "migration smb-2023-01-26_11-58-48",
  "connector": "ec658f0b-1ea1-407f-8b8c-1b5c38da09f8",
  "connection": "e6cf3d33-1914-48c0-9b2f-89d45e8adcbf",
  "seed": "/",
  "properties": {
    "seed": "/"
  },
  "tags": null,
  "workflows": [
    "2745cfbe-0a2f-4d19-ac02-51e7c6d68898"
  ],
  "deleteIncrementalPolicy": "e6165aa0-fb7d-490a-98da-f90032f0b886",
  "throttlePolicy": "c080798a-fd43-484b-bafb-2891b555ab3e",
  "routingPolicies": [
    "c6ed5858-4e74-443d-a303-a33ff4099115"
  ]
} |
...
Routing Policies - Aspire 5.0 - Confluence (accenture.com)
The script must be run with the additional parameter "-t [tag1, tag2, ..]"
Code Block |
---|
python __init.py__ -t EU,US -p smb_connector-content-source.zip |
Output
Code Block |
---|
{
"type": "smb",
"description": "migration smb-2023-01-26_12-45-23",
"connector": "52970e8c-2426-4967-80e7-9100a3ed3ba4",
"connection": "00a71baa-2d4f-4e87-9c7b-bec59f092dc9",
"seed": "/",
"properties": {
"seed": "/"
},
"tags": [
"EU",
"US"
]
} |
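The way a value such as `-t EU,US` ends up as a "tags" list in the seed body can be sketched with argparse. This is an illustration under assumptions: the script's real argument handling is not shown on this page, and the names `parse_tags`, `build_parser` and the `package` destination for `-p` are made up here.

```python
import argparse


def parse_tags(value):
    """Split "EU,US" into ["EU", "US"], ignoring blanks around the commas."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def build_parser():
    # Hypothetical parser; only -t and -p from the example command are modelled.
    parser = argparse.ArgumentParser(description="option parsing sketch")
    parser.add_argument("-t", dest="tags", type=parse_tags, default=[])
    parser.add_argument("-p", dest="package")
    return parser


args = build_parser().parse_args(
    ["-t", "EU,US", "-p", "smb_connector-content-source.zip"]
)

# Attach the tags to a (shortened) seed body only when some were given.
seed_body = {"type": "smb", "seed": "/", "properties": {"seed": "/"}}
if args.tags:
    seed_body["tags"] = args.tags
```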
Default values are part of *-transform-matrix.json and can be added for seed, connection, connector, credential.
Code Block |
---|
{"default":{
"scanRecursively": true,
"stopCrawlOnScannerError": true,
"filterNoCrawl": false,
"azureADSeed": ""
},
... |
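How a "default" block could be combined with migrated properties is sketched below. The precedence is an assumption made for this illustration (a migrated value wins over a default); the page does not state how the script resolves the two, and `apply_defaults` is a hypothetical name.

```python
def apply_defaults(properties, defaults):
    """Return a new dict: defaults first, then the migrated properties on top."""
    merged = dict(defaults)
    merged.update(properties)
    return merged


# Default values taken from the fragment above.
defaults = {
    "scanRecursively": True,
    "stopCrawlOnScannerError": True,
    "filterNoCrawl": False,
    "azureADSeed": "",
}

# Hypothetical migrated properties for one seed.
migrated = {"seed": "/", "scanRecursively": False}
merged = apply_defaults(migrated, defaults)
```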
Multiple starting points are a feature of version 4.0 connectors. Version 5.0 connectors do not have this feature because the same functionality is now provided by seeds and connections.
Version 4.0 connectors offer two ways to add multiple starting points: by UI and by file.
By UI: the user adds the urls in the UI, and they appear in the file content-source.xml:
Code Block |
---|
# sharepoint connector
<siteCollectionsToCrawl>
<siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
<siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest</siteCollectionUrl>
</siteCollectionsToCrawl>
#smb connector
<urls>smb://10.89.26.106:445/</urls>
<urls>smb://10.89.26.105:445/</urls>
#filesystem connector
<urls>c:\tmp</urls>
<urls>c:\Users</urls>
|
The url xml tags are not standardized across connectors. A mechanism in connection_transform_matrix.json deals with this.
Code Block |
---|
# sharepoint connector
"transform":{
"conn_url_prop_name": "serverUrl",
"source_url_prop_name": "siteCollectionsToCrawl:siteCollectionUrl",
...
#smb connector
"transform":{
"conn_url_prop_name": "host",
"source_url_prop_name": "urls",
..
#file system connector
"transform":{
"conn_url_prop_name": "url",
"source_url_prop_name": "urls",
.. |
The keys "conn_url_prop_name" and "source_url_prop_name" point to the json connection property name and to the xml tag in content-source.xml, respectively.
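A sketch of how a "source_url_prop_name" value could be used to pull the urls out of the xml follows. It assumes that the colon in "siteCollectionsToCrawl:siteCollectionUrl" separates a parent tag from a child tag; the `<config>` wrapper element and the function name `read_source_urls` are made up, since the full layout of content-source.xml is not shown here.

```python
import xml.etree.ElementTree as ET


def read_source_urls(xml_text, source_url_prop_name):
    """Collect the text of every element addressed by a "parent:child" tag path."""
    root = ET.fromstring(xml_text)
    # "a:b" becomes the ElementTree path ".//a/b"; a single tag becomes ".//tag".
    path = ".//" + "/".join(source_url_prop_name.split(":"))
    return [el.text.strip() for el in root.findall(path) if el.text and el.text.strip()]


# <config> is a hypothetical wrapper around the fragments shown above.
sharepoint_xml = """<config><siteCollectionsToCrawl>
<siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest2</siteCollectionUrl>
<siteCollectionUrl>https://cao365.sharepoint.com/sites/qasite/davidgtest</siteCollectionUrl>
</siteCollectionsToCrawl></config>"""
smb_xml = "<config><urls>smb://10.89.26.106:445/</urls><urls>smb://10.89.26.105:445/</urls></config>"

sharepoint_urls = read_source_urls(sharepoint_xml, "siteCollectionsToCrawl:siteCollectionUrl")
smb_urls = read_source_urls(smb_xml, "urls")
```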
By file:
The user enters only the path to a txt file in the UI, and the file content-source.xml contains a file path xml tag.
Code Block |
---|
# sharepoint connector
<seedsFilePath>${aspire.config.dir}/${app.name}/urls.txt</seedsFilePath>
#smb connector
<fileUrl>C:\tmp\ups.txt</fileUrl>
#filesystem connector
<fileUrl>C:\tmp\ups.txt</fileUrl>
|
The file path xml tag is not standardized across connectors either, so a mechanism similar to the one for urls deals with it.
Code Block |
---|
# sharepoint connector
"transform":{
"source_fileurl_prop_name": "seedsFilePath",
...
#smb connector
"transform":{
"source_fileurl_prop_name": "fileUrl",
...
#filesystem connector
"transform":{
"source_fileurl_prop_name": "fileUrl",
... |
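Once the file path tag has been located, the urls have to be read from the txt file. The sketch below assumes one url per line and skips blank lines; the actual file format is not described on this page, and placeholders such as `${aspire.config.dir}` would have to be resolved before the path can be opened. The name `read_url_file` is hypothetical.

```python
import tempfile
from pathlib import Path


def read_url_file(path):
    """Return the non-empty lines of a text file, stripped of surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Self-contained demonstration with a temporary file standing in for the real txt file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "urls.txt"
    sample.write_text(
        "smb://10.89.26.106:445/\n\nsmb://10.89.26.105:445/\n", encoding="utf-8"
    )
    urls = read_url_file(sample)
```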
The script reads the urls from the "content-source.xml" file or from the txt file and creates the connections and seeds through HTTP API calls.
For each url, the script splits it into a connection part and a seed part and creates the connection and the seed.
For example, the filesystem url C:\tmp/abc is split into the drive part C:\ (connection) and the directory part /tmp/abc (seed).
The sharepoint url https://cao365.sharepoint.com/sites/qasite/davidgtest is split into the scheme part https://, the hostname part cao365.sharepoint.com and the path part /sites/qasite/davidgtest;
scheme + hostname go into the connection, the path part goes into the seed.
The smb url smb://10.89.26.106:445/ is split in almost the same way, except that the scheme is not needed and is therefore left out. To control whether the scheme is used, set the following property in the file connection_transform_matrix.json:
"url_scheme_included":"false".
The script must be run with the additional parameter "-d [migr]"
Code Block |
---|
python __init.py__ -d migr -p smb_connector-content-source.zip |
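The splitting of a starting point into a connection part and a seed part, as described above, can be sketched with the standard library. This is not the script's own code: `split_url` and its `scheme_included` argument are hypothetical, and the boolean stands in for the "url_scheme_included" setting, which is stored as a string in connection_transform_matrix.json.

```python
import ntpath
from urllib.parse import urlsplit


def split_url(url, scheme_included=True):
    """Split a starting point into (connection part, seed part)."""
    if "://" not in url:
        # Windows-style filesystem path: the drive goes to the connection,
        # the rest (with forward slashes) goes to the seed.
        drive, rest = ntpath.splitdrive(url)
        return drive + "\\", rest.replace("\\", "/")
    parts = urlsplit(url)
    connection = f"{parts.scheme}://{parts.netloc}" if scheme_included else parts.netloc
    return connection, parts.path or "/"


sharepoint = split_url("https://cao365.sharepoint.com/sites/qasite/davidgtest")
smb = split_url("smb://10.89.26.106:445/", scheme_included=False)
filesystem = split_url("C:\\tmp/abc")
```

With these inputs, the sharepoint url yields a connection of `https://cao365.sharepoint.com` and a seed of `/sites/qasite/davidgtest`, and the filesystem path yields `C:\` and `/tmp/abc`, matching the examples in the text.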
Migration xml files can contain relative paths to other configuration files. The script can automatically change relative paths to absolute paths, but it must then be run with the parameter "-a [absolute path to aspire 4.0 distribution]"
Code Block |
---|
python __init.py__ -a C:\aspire-4.0 -p smb_connector-content-source.zip |
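The path conversion itself can be sketched as follows. The sketch only shows how a relative path could be resolved against the directory given with "-a"; how the script finds such paths inside the xml files is not covered on this page. `to_absolute` and the example file names are hypothetical, and `ntpath` is used so that Windows paths are handled the same way on any platform.

```python
import ntpath


def to_absolute(path, aspire_home):
    """Resolve a Windows-style relative path against the Aspire 4.0 directory."""
    if ntpath.isabs(path):
        # Already absolute: only normalise the separators.
        return ntpath.normpath(path)
    return ntpath.normpath(ntpath.join(aspire_home, path))


resolved = to_absolute("config/groovy/rules.groovy", "C:\\aspire-4.0")
unchanged = to_absolute("C:\\tmp\\ups.txt", "C:\\aspire-4.0")
```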
The script can change the description to path/folder, host/url when run with the parameter "-u"
Code Block |
---|
python __init.py__ -p smb_connector-content-source.zip -u |