// Available variables:
//   WebDriver driver: Selenium driver instance; can manipulate the page at will
//   List<String> discoveredUrls: insert all the URLs to process here
//   ALogger logger: Aspire logger, for debug purposes

// Get a list of all the <a /> elements
driver.findElements(By.tagName("a")).each { item ->
    String link = item.getAttribute("href");
    // Fall back to the "src" attribute of the same element when "href" is missing
    if (link == null || link == "") link = item.getAttribute("src");
    logger.info("Current url %s, discovered %s", driver.getCurrentUrl(), link);
    discoveredUrls.add(link);
}
logger.info("Current url %s, discovery complete", driver.getCurrentUrl());
Some connectors perform incremental crawls based on snapshot files, which are meant to record exactly the documents that the connector has indexed into the search engine. During an incremental crawl, the connector traverses the entire file system just as it does in a full crawl, but it indexes only the documents that are new, modified, or deleted since the previous crawl.
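The snapshot comparison described above can be illustrated with a minimal sketch. It is not the connector's actual implementation: the class name `SnapshotDiff`, the `Action` enum, and the idea of storing one content signature per document id are assumptions made only for illustration. The sketch walks every item found by the current crawl, compares it with the previous snapshot, and reports only the documents that would need to be sent to the search engine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of snapshot-based incremental crawling.
// The snapshot format (document id -> content signature) is an assumption.
public class SnapshotDiff {

    /** Action that would be sent to the search engine for one document. */
    public enum Action { ADD, UPDATE, DELETE }

    /**
     * Compares the previous snapshot with the state found by the current crawl.
     * Unchanged documents are omitted from the result.
     */
    public static Map<String, Action> diff(Map<String, String> previous,
                                           Map<String, String> current) {
        Map<String, Action> changes = new LinkedHashMap<>();

        // Every item is still visited, exactly as in a full crawl.
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String oldSignature = previous.get(entry.getKey());
            if (oldSignature == null) {
                changes.put(entry.getKey(), Action.ADD);      // new document
            } else if (!oldSignature.equals(entry.getValue())) {
                changes.put(entry.getKey(), Action.UPDATE);   // modified document
            }
        }

        // Anything in the old snapshot that was not seen again has been deleted.
        for (String id : previous.keySet()) {
            if (!current.containsKey(id)) {
                changes.put(id, Action.DELETE);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new LinkedHashMap<>();
        previous.put("/docs/a.txt", "sig-1");
        previous.put("/docs/b.txt", "sig-2");
        previous.put("/docs/c.txt", "sig-3");

        Map<String, String> current = new LinkedHashMap<>();
        current.put("/docs/a.txt", "sig-1");      // unchanged
        current.put("/docs/b.txt", "sig-2-new");  // modified
        current.put("/docs/d.txt", "sig-4");      // new

        // prints {/docs/b.txt=UPDATE, /docs/d.txt=ADD, /docs/c.txt=DELETE}
        System.out.println(diff(previous, current));
    }
}
```

In this sketch the current state would be written out as the new snapshot once the changes have been indexed, so that the next incremental crawl compares against it.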
For a discussion on crawling, see Full & Incremental Crawls.
Failing to save a content source before creating or editing another content source can result in an error.
ERROR [aspire]: Exception received attempting to get execute component command com.searchtechnologies.aspire.services.AspireException: Unable to find content source
Save the initial content source before creating or working on another.
No troubleshooting information is available at this time.