The Selenium connector crawls content from websites, using a web browser to retrieve the pages.

Features


Some of the features of the Selenium connector include:

  • Use a real browser to retrieve the pages.
  • Avoid compatibility issues with sites built on web frameworks such as Angular, React, and Node, among others.
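
To illustrate what retrieving a page through a real browser means, here is a minimal sketch using the Selenium Python bindings. It is not the connector's own code: the function names `fetch_rendered_html` and `crawl_once` are hypothetical, and the sketch assumes Chrome and a matching driver are available on the machine.

```python
def fetch_rendered_html(driver, url):
    """Load `url` in the browser behind `driver` and return the page source."""
    driver.get(url)            # the browser loads the page and runs its scripts
    return driver.page_source  # source of the page as the browser currently sees it


def crawl_once(url):
    """Hypothetical helper: open Chrome, fetch one page, close the browser."""
    # Imported lazily so fetch_rendered_html works with any WebDriver-like object.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are installed
    try:
        return fetch_rendered_html(driver, url)
    finally:
        driver.quit()  # always release the browser process
```

Because the browser executes the page's JavaScript, the returned source generally includes content rendered on the client side. Pages that load data asynchronously may still need an explicit wait before the source is read.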

Content Retrieved


The Selenium connector retrieves several types of documents, such as:

  • Web pages.
  • Sitemaps.
  • Binary documents (PDF, Word, images).
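
As an illustration of how a sitemap yields pages to crawl, the sketch below extracts the URLs from a sitemap that follows the sitemaps.org XML format. It uses only the Python standard library. The function name `extract_sitemap_urls` is hypothetical, and the connector's own sitemap handling may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(sitemap_xml):
    """Return the text of every <loc> element found in the sitemap XML."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text  # skip empty <loc> elements
    ]
```

The same function also works on a sitemap index file, because index entries use `<loc>` elements in the same namespace. In that case the returned URLs point to further sitemaps rather than to pages.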

Limitations 


Due to Selenium's own limitations, the connector doesn't support:

  • Basic authentication.
  • NTLM authentication.
  • Custom HTTP headers.

Due to API limitations, the Selenium connector is only compatible with browsers that have a WebDriver implementation, for example:

  • Google Chrome
  • Mozilla Firefox

Other features, such as headless mode, also depend on browser support.
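
The dependency on browser support can be sketched with the Selenium Python bindings as follows. This is an illustration only, not the connector's configuration: `SUPPORTED_BROWSERS` and `create_driver` are hypothetical names, and the headless flags shown are the ones recent Chrome and Firefox releases accept.

```python
# Browsers this sketch knows how to start; each one has a WebDriver implementation.
SUPPORTED_BROWSERS = ("chrome", "firefox")


def create_driver(browser, headless=False):
    """Hypothetical factory: start a WebDriver session for the named browser."""
    name = browser.strip().lower()
    if name not in SUPPORTED_BROWSERS:
        raise ValueError(f"No WebDriver configured for browser: {browser!r}")

    # Imported lazily so the validation above runs even without Selenium installed.
    from selenium import webdriver

    if name == "chrome":
        options = webdriver.ChromeOptions()
        if headless:
            options.add_argument("--headless=new")  # assumption: a recent Chrome
        return webdriver.Chrome(options=options)

    options = webdriver.FirefoxOptions()
    if headless:
        options.add_argument("-headless")
    return webdriver.Firefox(options=options)
```

Each browser takes its own options object and its own headless flag, which is why a feature such as headless mode is only available where the browser and its driver provide it.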

Future Development Plan 


The following features are not currently implemented, but are on the development plan:

  • Add support for more browsers.

Anything we should add? Please let us know.