
The Selenium connector crawls content from websites, using a web browser to retrieve the pages.


Some of the features of the Selenium connector include:

  • Use a real browser to retrieve the pages.
  • Avoid compatibility issues with sites that render their content in the browser using JavaScript frameworks such as Angular and React, among others.

Content Retrieved

The Selenium connector retrieves several types of documents such as: 

  • Web Pages.
  • Sitemaps.
  • Binary documents (PDF, Word, images).
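
To illustrate the three categories above, here is a minimal sketch of how a crawler might tell them apart from a URL and an optional Content-Type value. It is not taken from the connector's source: the function name `classify`, the extension list, and the rule that a sitemap is an `.xml` file with "sitemap" in its name are all assumptions made for this example.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical extension list for this example, not the connector's actual configuration.
BINARY_EXTENSIONS = {".pdf", ".doc", ".docx", ".png", ".jpg", ".jpeg", ".gif"}


def classify(url: str, content_type: str = "") -> str:
    """Return 'sitemap', 'binary', or 'web_page' for a crawled URL."""
    path = urlparse(url).path.lower()
    ext = posixpath.splitext(path)[1]
    # Drop parameters such as "; charset=utf-8" from the Content-Type value.
    mime = content_type.split(";")[0].strip().lower()

    # Assumed convention: sitemaps are XML files named like "sitemap.xml".
    if ext == ".xml" and "sitemap" in posixpath.basename(path):
        return "sitemap"
    if ext in BINARY_EXTENSIONS or mime == "application/pdf" or mime.startswith("image/"):
        return "binary"
    # Everything else is treated as a page to render in the browser.
    return "web_page"
```

A real crawler would likely rely on the response's Content-Type more than on the file extension; the extension check is kept here only so the sketch works without making a request.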


Due to Selenium's own limitations, the connector doesn't support:

  • Basic authentication
  • NTLM authentication
  • Custom HTTP headers

Due to API limitations, the Selenium connector is only compatible with browsers that have a WebDriver implementation, for example:

  • Google Chrome
  • Mozilla Firefox

Other features are also dependent on browser support, such as Headless Mode.
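
As an illustration of how headless mode depends on the browser, the sketch below uses Selenium's Python bindings directly. It does not show the connector's own configuration; the helper names `headless_flag` and `open_headless` are hypothetical, and the flags assume a recent Chrome or Firefox release.

```python
def headless_flag(browser: str) -> str:
    """Return the command-line flag that starts the given browser without a window."""
    # Each browser defines its own flag, which is why support varies per browser.
    flags = {"chrome": "--headless=new", "firefox": "-headless"}
    try:
        return flags[browser.lower()]
    except KeyError:
        raise ValueError(f"no headless flag known for browser {browser!r}") from None


def open_headless(browser: str):
    """Start a headless WebDriver session (hypothetical helper for this example)."""
    flag = headless_flag(browser)  # fails early for unsupported browsers

    # Imported lazily so headless_flag() can be used without Selenium installed.
    from selenium import webdriver

    if browser.lower() == "chrome":
        options = webdriver.ChromeOptions()
        options.add_argument(flag)
        return webdriver.Chrome(options=options)

    options = webdriver.FirefoxOptions()
    options.add_argument(flag)
    return webdriver.Firefox(options=options)
```

With a browser and its driver available, a page could then be fetched with `driver = open_headless("chrome")`, `driver.get(url)`, and `driver.page_source`, followed by `driver.quit()` to release the browser.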

Future Development Plan 

The following features are not currently implemented, but are on the development plan:

  • Add support for more browsers.

Anything we should add? Please let us know.
