Step 2. Select the Seed option from the left-hand menu.
The "Seed" option, identified by a "seed" image, is located on the left side of the application, just above the "Workflows" option. Click it to open the "Seed" page.
Step 3. Specify Seed Description and Type
Once on the "Seed" page, click on the "+New" option to create a new Seed or select an existing one to modify it.
Description: enter a description for the Seed. Keep it concise and meaningful.
Type: select "S3" as the type for the Seed.
Step 4. Specify Seed Information
Once the type has been selected, you will be presented with the "Seed" section of the "Seed" page. A single parameter is required in this section:
Crawl path: the path to be crawled. It can be a bucket, folder or file.
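As a rough illustration of the three kinds of crawl path, the sketch below classifies a path as a bucket, folder, or file. It assumes an "s3://bucket/key" notation and a trailing slash for folders; both are assumptions made for this example, and the application may expect a different format. The function name describe_crawl_path is hypothetical.

```python
from urllib.parse import urlparse


def describe_crawl_path(path: str) -> dict:
    """Classify a crawl path as bucket, folder, or file.

    Assumes the hypothetical "s3://bucket/key" notation, with a
    trailing slash marking a folder. The real Seed field may differ.
    """
    parsed = urlparse(path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// path: {path!r}")

    key = parsed.path.lstrip("/")
    if not key:
        kind = "bucket"   # no key at all: the whole bucket
    elif key.endswith("/"):
        kind = "folder"   # key prefix ending in a slash
    else:
        kind = "file"     # a single object
    return {"bucket": parsed.netloc, "key": key, "kind": kind}


if __name__ == "__main__":
    for example in ("s3://my-bucket",
                    "s3://my-bucket/reports/2024/",
                    "s3://my-bucket/reports/summary.pdf"):
        print(example, "->", describe_crawl_path(example)["kind"])
```

A narrower crawl path (a folder or single file rather than a whole bucket) limits how much content a crawl has to visit.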
Step 5. Specify Split Files Options
The "Split Files" section is located between the "Seed" and "Connector" sections of the "Seed" page. Set the following options here; any option you leave unchanged keeps its default value:
Process Split Documents: if enabled, files that are split are treated as a single document instead of multiple documents.
Split Patterns: list of regular expressions to match folders that contain split files.
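To get a feel for how split patterns might behave before entering them, the sketch below tests folder paths against a list of regular expressions and groups the files under a matching folder into one document. The patterns, the folder layout, and the full-match semantics are all assumptions for illustration. The application may evaluate patterns differently (for example with another regex dialect or partial matching), so treat this as a way to reason about patterns rather than a description of the connector's behavior.

```python
import re

# Hypothetical patterns: folders named "parts/" or ending in ".split/".
SPLIT_PATTERNS = [r".*/parts/", r".*\.split/"]


def is_split_folder(folder: str, patterns=SPLIT_PATTERNS) -> bool:
    """Return True if the folder path fully matches any split pattern."""
    return any(re.fullmatch(p, folder) for p in patterns)


def group_documents(keys, patterns=SPLIT_PATTERNS) -> dict:
    """Group object keys into documents.

    Keys inside a folder that matches a split pattern are collected
    under that folder (one document); every other key is its own document.
    """
    docs = {}
    for key in keys:
        folder, _, _name = key.rpartition("/")
        folder = folder + "/" if folder else ""
        doc_id = folder if is_split_folder(folder, patterns) else key
        docs.setdefault(doc_id, []).append(key)
    return docs


if __name__ == "__main__":
    keys = [
        "data/report.split/part-001",
        "data/report.split/part-002",
        "data/notes.txt",
    ]
    for doc_id, members in group_documents(keys).items():
        print(doc_id, members)
```

In this sketch the two "report.split" parts end up as a single document, which mirrors what enabling Process Split Documents is described as doing.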
Step 6. Specify a Connector
The "Connector" section is located between the "Split Files" and "Connection" sections of the "Seed" page. Here, select a previously created Amazon S3 Connector for the Seed from the Connector combo box.
Step 7. Specify a Connection
The "Connection" section is located between the "Connector" and "Workflows" sections of the "Seed" page. Here, select a previously created Amazon S3 Connection for the Seed from the Connection combo box.
Step 8. Specify Workflows (Optional)
The "Workflows" section is located between the "Connection" and "Tag" sections of the "Seed" page. Here, you can select previously created Workflows to apply to the Seed. If no Workflow is specified, a default Workflow is assigned.
Step 9. Specify a Tag (Optional)
The "Tag" section is located between the "Workflows" and "Policies" sections of the "Seed" page. Here, you can optionally specify a tag to use for filtering Seeds.
Step 10. Specify Policies (Optional)
The "Policies" section is the last section of the "Seed" page, located right below the "Tag" section:
Throttle Policy: select a previously created Throttling Policy from the Throttle Policy combo box.
Route Policy: select a previously created Routing Policy from the Route Policy combo box.
Step 11. Save the Seed
Click the "Complete" button to save the new Seed. When you update an existing Seed, the button reads "Save" instead of "Complete".
Step 12. Run the Crawl
To run a crawl for the Amazon S3 Seed, click on the button for the Seed you want to run and select Full or Incremental Crawl. The chosen crawl then starts for that Seed.