
The Twitter connector crawls content from any Twitter account.

The Twitter connector is a crawler that uses the Twitter Developer Platform for tweet discovery and relies on the Aspire 3 Connector Framework to handle connections and distributed crawls.







Features


Some of the features of the Twitter connector include:

  • Authentication using a Twitter user, consumer key, and consumer secret key
  • Incremental crawl
  • Full crawl
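The difference between the two crawl modes can be illustrated with a minimal sketch. This page does not describe how the connector tracks changes, so the high-water-mark approach below is an assumption, and `CrawlState`, `full_crawl`, and `incremental_crawl` are hypothetical names, not part of the connector or the Aspire framework.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class CrawlState:
    """Hypothetical state kept between crawls: highest tweet id seen per account."""
    last_seen_id: Dict[str, int] = field(default_factory=dict)


def full_crawl(account: str, tweets: Iterable[dict], state: CrawlState) -> List[dict]:
    """Return every tweet (oldest first) and reset the account's high-water mark."""
    result = sorted(tweets, key=lambda t: t["id"])
    state.last_seen_id[account] = result[-1]["id"] if result else 0
    return result


def incremental_crawl(account: str, tweets: Iterable[dict], state: CrawlState) -> List[dict]:
    """Return only tweets with an id above the account's high-water mark."""
    mark = state.last_seen_id.get(account, 0)
    fresh = sorted((t for t in tweets if t["id"] > mark), key=lambda t: t["id"])
    if fresh:
        # Advance the mark so the next incremental crawl skips these tweets.
        state.last_seen_id[account] = fresh[-1]["id"]
    return fresh
```

In this sketch a full crawl always returns the whole timeline, while an incremental crawl that follows it returns only tweets posted since the previous run.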


Content Retrieved


The Twitter connector retrieves all tweets related to the specified Twitter user. Listed below are examples of the content information, across different tweet types, that this crawler can retrieve.

  • Text tweet
  • URL links
  • Geo location
  • Hashtags
  • User mentions entities
  • Media entities
  • Retweet count
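As a rough illustration of how the items above could map onto a single document, here is a minimal sketch. It assumes the input is a dictionary shaped like a Twitter API v1.1 tweet object (`text`, `coordinates`, `entities`, `retweet_count`); the `TweetDocument` class and its field names are hypothetical and are not the connector's actual output schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TweetDocument:
    """Hypothetical container for the content items listed above."""
    text: str
    urls: List[str] = field(default_factory=list)
    geo: Optional[Tuple[float, float]] = None  # (longitude, latitude)
    hashtags: List[str] = field(default_factory=list)
    user_mentions: List[str] = field(default_factory=list)
    media: List[str] = field(default_factory=list)
    retweet_count: int = 0


def to_document(tweet: dict) -> TweetDocument:
    """Build a TweetDocument from a v1.1-style tweet dictionary (assumed shape)."""
    entities = tweet.get("entities") or {}
    # "coordinates" is a GeoJSON point when present, otherwise null.
    coords = (tweet.get("coordinates") or {}).get("coordinates")
    return TweetDocument(
        text=tweet.get("text", ""),
        urls=[u["expanded_url"] for u in entities.get("urls", [])],
        geo=tuple(coords) if coords else None,
        hashtags=[h["text"] for h in entities.get("hashtags", [])],
        user_mentions=[m["screen_name"] for m in entities.get("user_mentions", [])],
        media=[m["media_url"] for m in entities.get("media", [])],
        retweet_count=tweet.get("retweet_count", 0),
    )
```

Missing entities default to empty lists, so a plain text tweet and a tweet with media or location data produce documents of the same shape.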


Info

This crawler will retrieve any document found linked in the HTML Markup as links (such as PDFs, MS Word, MS PowerPoint, etc).


Limitations 


Due to its design, the Twitter connector has the following limitations:

  • Dynamically generated markup


Anything we should add? Please let us know.