The general principle of the Publisher Framework is the same as that of the Connector Framework.

There is a generic publisher component that calls in to a repository-specific provider to access the repository to which content is being published.

  • In the case of standard “targets” (Solr, Elasticsearch, SharePoint via Stager, etc.), the provider is also supported by the Publisher Framework. However, a developer can create their own provider, if required, to publish to a customer-specific target. They need to consider only how to perform actions at the target, not general functionality such as when a new batch should be used.
  • All common functionality (connections, batch handling, commit/clear jobs, etc.) is handled by the framework, which calls methods in the provider as required.
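
The split between the generic publisher and a repository-specific provider can be sketched as below. This is only an illustration of the pattern: `TargetProvider`, `InMemoryProvider` and `GenericPublisher` are hypothetical names invented for this sketch, not the framework's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical provider contract: only target-specific actions live here.
interface TargetProvider {
    void add(String docId, String content);
    void delete(String docId);
}

// Hypothetical in-memory provider standing in for a customer-specific target.
class InMemoryProvider implements TargetProvider {
    final List<String> log = new ArrayList<>();

    @Override
    public void add(String docId, String content) {
        log.add("add:" + docId);
    }

    @Override
    public void delete(String docId) {
        log.add("delete:" + docId);
    }
}

// Hypothetical generic publisher: owns the common logic and delegates
// all access to the target to whichever provider it was given.
class GenericPublisher {
    private final TargetProvider provider;

    GenericPublisher(TargetProvider provider) {
        this.provider = provider;
    }

    void publish(String action, String docId, String content) {
        switch (action) {
            case "add":
                provider.add(docId, content);
                break;
            case "delete":
                provider.delete(docId);
                break;
            default:
                throw new IllegalArgumentException("Unknown action: " + action);
        }
    }
}
```

Swapping `InMemoryProvider` for another implementation of the same interface changes the target without touching the generic publisher.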


How it Works


  1. Select a publisher jar to load. This is a component (aspire-XXX-publisher).
  2. A common app bundle is loaded automatically, if required.
  3. The app bundle loads the publisher framework jar and the originally requested provider.
  4. The framework may perform optional Groovy or XML transforms, and the appropriate parameters are collected by the framework.
    • The actual use of the transform is controlled by the developer.
  5. Where possible, connections are pooled.


Developer Settings


Similar to the “SourceInfo”, the framework uses a “PublisherInfo”. This holds information used to connect to a target repository (URL, username, password, etc.) and also controls the functionality the framework provides. For example, the framework may allow for a transformation, but if a publisher does not require this functionality, you can disable it.

You can extend the PublisherInfo, if required, but the framework allows for the following configuration.

  • When a connection is required

  • Pool connections 

    • True/false that connections should be pooled 

  • Use transform 

    • Transform type – none/xml/json + default transform file (to pick out of component) 

  • Supports authentication 

    • http(s)

  • We implemented a set of properties in the provider that control the DXF in the app bundle and the options in the framework. 

    • This could then be used to control the app bundle loaded (this is via the Aspire application), and the configuration of the component “publisher info” (for example, to control whether the publisher supports “clear” and “commit” operations).

    • The same configuration could then be used, via the DXF, to offer the option to (say) “process commits” only if it’s supported.
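
As a rough sketch of how such developer settings might be held, the class below collects the options listed above. All of it is assumed for illustration: `SketchPublisherInfo`, `TransformType` and the method names are hypothetical, and the real PublisherInfo may look quite different.

```java
// Hypothetical transform choices, mirroring the "none/xml/json" option above.
enum TransformType { NONE, XML, JSON }

// Hypothetical settings holder; the real PublisherInfo class may differ.
class SketchPublisherInfo {
    private final String targetUrl;
    private boolean poolConnections = true;
    private TransformType transformType = TransformType.NONE;
    private boolean supportsCommit = false;

    SketchPublisherInfo(String targetUrl) {
        this.targetUrl = targetUrl;
    }

    // Fluent setters so a developer can switch features on or off.
    SketchPublisherInfo poolConnections(boolean value) {
        poolConnections = value;
        return this;
    }

    SketchPublisherInfo transformType(TransformType value) {
        transformType = value;
        return this;
    }

    SketchPublisherInfo supportsCommit(boolean value) {
        supportsCommit = value;
        return this;
    }

    String getTargetUrl() {
        return targetUrl;
    }

    boolean isPoolConnections() {
        return poolConnections;
    }

    // A transform is only used when a type other than NONE is selected.
    boolean usesTransform() {
        return transformType != TransformType.NONE;
    }

    // The installer is only offered "process commits" when it is supported.
    boolean offerProcessCommits() {
        return supportsCommit;
    }
}
```

A developer whose target needs no transform would simply leave the transform type at its default, and the corresponding installation options would not be offered.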


Installation Settings


Installation settings are collected when the component is installed. The obvious items required are the location and connection (user/password) details of the target. The intent is that only options the developer has enabled (in the developer settings above) are presented to the user. The settings are collected using DXF. A publisher-specific DXF is merged with a common piece to present the entire set. The framework collects the following parameters:

  • Target URL 

    • The URL for the search engine, etc. 

  • Authentication

    • Yes/no/type 

    • Gather username/password 

  • Clear before full crawls 

    • True/false. If true, the publisher will react to start jobs for full crawl by calling a clear method 

  • Commit after crawls 

    • True/false. If true, the publisher will react to end jobs for crawls by calling a commit method 

  • Transform data before sending 

    • True/false 

  • Transform file name 

    • For cases when transformation is required 


Implementation


  1. On startup, the framework connects to the provider and calls a method “newPublisherInfo”.
  2. This returns a class (much like the SourceInfo) holding all of the configuration for the publisher (including the common options – perform clear, etc.).
    • This can be passed to other calls later. 
    • If required, connection pools will be initialized here. 
  3. When processing a document, the framework first categorizes the job as “control” or “document”.
    • “Control” jobs are commit and clear. The framework calls the provider’s commit or clear methods (if enabled and processing is selected), passing a connection as required.
    • For “document” jobs, the framework determines whether a new batch is required and, if so, calls the provider’s startBatch() method.
    • The framework provides “standard” batch implementations.
  4. Then the framework establishes the specific type of job (add/update, delete or delete by query) and calls the appropriate provider method.
  5. Closing the component releases all of the connections.
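
The job handling described in the steps above can be sketched roughly as follows. Apart from `startBatch()`, every name here (`JobType`, `BatchingProvider`, `RecordingProvider`, `JobDispatcher`) is hypothetical, and the count-based batch policy is an assumption made only for illustration; the framework's “standard” batch implementations may decide differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical job types: two "control" jobs and three "document" jobs.
enum JobType { COMMIT, CLEAR, ADD_UPDATE, DELETE, DELETE_BY_QUERY }

// Hypothetical provider contract; only startBatch() is named in the text.
interface BatchingProvider {
    void commit();
    void clear();
    void startBatch();
    void addUpdate(String id);
    void delete(String id);
    void deleteByQuery(String query);
}

// Records calls so the dispatch order can be inspected.
class RecordingProvider implements BatchingProvider {
    final List<String> calls = new ArrayList<>();
    public void commit() { calls.add("commit"); }
    public void clear() { calls.add("clear"); }
    public void startBatch() { calls.add("startBatch"); }
    public void addUpdate(String id) { calls.add("addUpdate:" + id); }
    public void delete(String id) { calls.add("delete:" + id); }
    public void deleteByQuery(String query) { calls.add("deleteByQuery:" + query); }
}

class JobDispatcher {
    private final BatchingProvider provider;
    private final int batchSize;          // assumed policy: new batch every N documents
    private final boolean clearEnabled;
    private final boolean commitEnabled;
    private boolean batchOpen = false;
    private int inBatch = 0;

    JobDispatcher(BatchingProvider provider, int batchSize,
                  boolean clearEnabled, boolean commitEnabled) {
        this.provider = provider;
        this.batchSize = batchSize;
        this.clearEnabled = clearEnabled;
        this.commitEnabled = commitEnabled;
    }

    void process(JobType type, String payload) {
        // Control jobs: call the provider only when the option is enabled.
        if (type == JobType.COMMIT) {
            if (commitEnabled) provider.commit();
            return;
        }
        if (type == JobType.CLEAR) {
            if (clearEnabled) provider.clear();
            return;
        }
        // Document jobs: decide whether a new batch is required first.
        if (!batchOpen || inBatch >= batchSize) {
            provider.startBatch();
            batchOpen = true;
            inBatch = 0;
        }
        inBatch++;
        // Then route to the method matching the specific job type.
        switch (type) {
            case ADD_UPDATE: provider.addUpdate(payload); break;
            case DELETE: provider.delete(payload); break;
            case DELETE_BY_QUERY: provider.deleteByQuery(payload); break;
            default: throw new IllegalStateException("Unexpected job type: " + type);
        }
    }
}
```

With a batch size of 2 and commits disabled, three document jobs open two batches and a trailing commit job is ignored.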


Component Properties

Control of the DXF, for example, is by way of a new properties file that is added to the component. This file allows you to add properties.

  • While these properties are generic, the Publisher Framework looks for certain properties and uses these to default the various settings in the PublisherInfo.
  • These properties are passed to the DXF with the rest of the configuration, allowing control of the options shown to the installer.
  • These properties also allow control of the app-bundle loaded when the component is selected.
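
A minimal sketch of reading such a properties file with the standard `java.util.Properties` class is shown below. The property keys (`publisher.supportsClear`, `publisher.supportsCommit`, `publisher.appBundle`) and the `ComponentDefaults` class are invented for this example; the names the Publisher Framework actually looks for are not given here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for defaults read from the component's properties file.
class ComponentDefaults {
    final boolean supportsClear;
    final boolean supportsCommit;
    final String appBundle;

    ComponentDefaults(Properties props) {
        // Key names are assumptions; missing keys fall back to safe defaults.
        supportsClear = Boolean.parseBoolean(
                props.getProperty("publisher.supportsClear", "false"));
        supportsCommit = Boolean.parseBoolean(
                props.getProperty("publisher.supportsCommit", "false"));
        appBundle = props.getProperty("publisher.appBundle", "");
    }

    // Parses properties-file text, e.g. the contents of the component's file.
    static ComponentDefaults parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ComponentDefaults(props);
    }
}
```

For example, `ComponentDefaults.parse("publisher.supportsClear=true\n")` would enable the clear option and leave the others at their defaults.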