SolrCloud CDH is SolrCloud as shipped with CDH (Cloudera Distribution Including Apache Hadoop), which targets enterprise-class Hadoop deployments.

The Publish to SolrCloud CDH publisher posts documents to a SolrCloud index using the SolrJ library. SolrJ provides a CloudSolrClient class for communicating with SolrCloud. Instances of this class contact ZooKeeper to discover the Solr endpoints for a SolrCloud collection and then send requests to those endpoints.

This version of the publisher offers two connection approaches:

  • CloudSolrClient connects through a list of ZooKeeper hosts and ports.
  • HttpSolrClient connects directly to a single Solr host and port.
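
As a rough illustration of what each approach needs as input, the sketch below builds a ZooKeeper connect string from a host and port list and a base URL for a single Solr node. The class and method names are hypothetical and are not part of the publisher or of SolrJ; the host names are placeholders, and the ports (2181, 8983) and the /solr chroot are assumed typical values, not settings taken from this publisher.

```java
import java.net.URI;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the publisher or of SolrJ. */
final class SolrConnectionTargets {

    /** Joins "host:port" entries into a ZooKeeper connect string, with an optional chroot. */
    static String zkConnectString(List<String> hostPorts, String chroot) {
        if (hostPorts.isEmpty()) {
            throw new IllegalArgumentException("at least one ZooKeeper host:port is required");
        }
        for (String hostPort : hostPorts) {
            validateHostPort(hostPort);
        }
        String joined = String.join(",", hostPorts);
        // The chroot (for example "/solr") is appended once, after the last host.
        return (chroot == null || chroot.isEmpty()) ? joined : joined + chroot;
    }

    /** Builds the base URL of a single Solr node (plain HTTP is an assumption). */
    static String solrBaseUrl(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/solr").toString();
    }

    /** Rejects entries that are not of the form host:port with a valid port number. */
    private static void validateHostPort(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port but got " + hostPort);
        }
        int port = Integer.parseInt(hostPort.substring(colon + 1));
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range in " + hostPort);
        }
    }

    public static void main(String[] args) {
        // Input for the CloudSolrClient approach: every ZooKeeper node in the ensemble.
        System.out.println(zkConnectString(
                List.of("zk1.example.com:2181", "zk2.example.com:2181"), "/solr"));
        // Input for the HttpSolrClient approach: exactly one Solr node.
        System.out.println(solrBaseUrl("solr1.example.com", 8983));
    }
}
```

The ZooKeeper form lets the client follow the cluster as nodes come and go, while the single-URL form ties the publisher to one Solr node.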


For Kerberos authentication, this publisher logs in through Hadoop's UserGroupInformation class using a keytab, a principal, and the Hadoop core-site.xml file, and runs its requests as a privileged action. It also uses a custom KerberizedHttpClient to perform the requests.

Features

Some of the features of the Publish to SolrCloud CDH publisher include:

  • Customizable feed to the Solr index through an editable XSLT file

  • Configurable ZooKeeper server and port

  • Connector independent

  • XSL transformations
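
To show how an XSLT file can shape the feed, here is a minimal sketch that uses the JDK's built-in XSLT support to turn a source document into Solr's XML update format (add/doc/field). The stylesheet, the input document layout, the field names, and the class name are assumptions made for illustration; the publisher's actual XSLT file is not reproduced here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical example; the stylesheet and input format are assumptions. */
final class SolrFeedXslt {

    // Maps an assumed <document id="..."><title>...</title></document> input
    // to a Solr XML update message with "id" and "title" fields.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/document'>"
        + "<add><doc>"
        + "<field name='id'><xsl:value-of select='@id'/></field>"
        + "<field name='title'><xsl:value-of select='title'/></field>"
        + "</doc></add>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the source XML and returns the transformed XML. */
    static String toSolrXml(String sourceXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSolrXml("<document id='doc-1'><title>Hello</title></document>"));
    }
}
```

Changing which fields reach the index then only requires editing the stylesheet, not the Java code.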
