Step 1. Launch Aspire and open the Content Source Management Page

Before launching Aspire, if Kerberos authentication is going to be used, edit the felix.properties file and add these lines:

# To append packages to the default set of exported system packages,
# set this value.
org.osgi.framework.system.packages.extra=\
 ...
 sun.security.krb5, \
 com.sun.security.auth.callback

# The following property makes specified packages from the class path
# available to all bundles. You should avoid using this property.
org.osgi.framework.bootdelegation=\
 ...
 javax.security.sasl, \
 sun.security.krb5
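
A quick way to confirm the entries above are in place is to scan felix.properties before starting Aspire. The following is a minimal sketch, not part of Aspire: the function names and the `SAMPLE` text (including `org.example.placeholder`) are hypothetical, and the parser only handles `key=value` pairs with backslash line continuations, which is a simplification of the full Java properties syntax.

```python
# Packages that the Kerberos setup above expects in each Felix property.
REQUIRED = {
    "org.osgi.framework.system.packages.extra": {
        "sun.security.krb5",
        "com.sun.security.auth.callback",
    },
    "org.osgi.framework.bootdelegation": {
        "javax.security.sasl",
        "sun.security.krb5",
    },
}


def parse_properties(text):
    """Parse key=value pairs, joining lines that end with a backslash."""
    props = {}
    logical = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments, but only outside a continued value.
        if not logical and (not line or line.startswith(("#", "!"))):
            continue
        if line.endswith("\\"):
            logical += line[:-1]
            continue
        logical += line
        key, _, value = logical.partition("=")
        props[key.strip()] = value.strip()
        logical = ""
    return props


def missing_packages(text):
    """Return {property name: sorted list of required packages not found}."""
    props = parse_properties(text)
    missing = {}
    for key, required in REQUIRED.items():
        present = {p.strip() for p in props.get(key, "").split(",") if p.strip()}
        absent = sorted(required - present)
        if absent:
            missing[key] = absent
    return missing


# Hypothetical file content: bootdelegation lacks sun.security.krb5.
SAMPLE = r"""
org.osgi.framework.system.packages.extra=\
 org.example.placeholder, \
 sun.security.krb5, \
 com.sun.security.auth.callback

org.osgi.framework.bootdelegation=\
 javax.security.sasl
"""

print(missing_packages(SAMPLE))
```

To check a real installation, read the file's text and pass it to `missing_packages`; an empty dictionary means every package listed above was found.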

Launch Aspire (if it's not already running).

Step 2. Add a new Content Source

  • For this step, follow the Configuration Tutorial of the connector of your choice; see the Connector list.

Step 3. Add a new Publish to HBase to the Workflow

To add a Publish to HBase rule, drag it from the Workflow Library and drop it onto the Workflow Tree where you want to add it. This automatically opens the Publish to HBase window for configuring the publisher.

Step 3a. Specify Publisher Information

In the Publish to HBase window, specify the connection information to publish to HBase.

  1. Source Name: Enter the source name to use for publishing the document.
  2. Namespace prefix: Enter the prefix of the namespace.
  3. Create Namespaces: Select this option if the publisher should attempt to create the namespaces.
  4. Configuration type
    1. Use Settings file:
      1. HBase configuration file path: Enter the path that contains the HBase configuration file.
    2. Do not use Settings File
      1. Username: Kerberos User with the permissions to publish to HBase.
      2. Keytab: Path to the Keytab file to use.
      3. Add resources: Check if you are going to pass Hadoop resources files (hbase-site.xml) to the publisher.
        1. Resource file: Hadoop resource file path.
      4. Add properties: Check if you are going to pass specific Hadoop properties to the publisher.
        1. Name: Hadoop property name.
        2. Value: Hadoop property value.
  5. Clean database before full crawl
  6. Debug: Check if you want debug messages enabled.

Step 3b. HBase settings file

If used, the HBase settings file has the following structure:

<settings>
    <properties>
        <property name="hbase.zookeeper.quorum">10.0.0.114</property>
    </properties>
    <configDir>config\kerberos\conf</configDir>
    <security>
        <kerberos>
            <user>hbase/[email protected]</user>
            <path>config\kerberos\hbase.keytab</path>
        </kerberos>
    </security>
</settings>

Properties: Any Hadoop properties required.

Config Dir: Path where the Hadoop resource files are located.

Kerberos User: Kerberos user with the permissions to publish to HBase.

Kerberos Keytab: Path to the keytab file to use.
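
Before pointing the publisher at a settings file, it can help to read the file back and confirm that each element described above is where it is expected. The sketch below uses only Python's standard library and is not part of Aspire. The helper names are hypothetical, the principal `hbase/host.example.com@EXAMPLE.COM` is a made-up placeholder, and the rule that the Kerberos user and keytab must be given together is an assumption rather than documented publisher behaviour.

```python
import xml.etree.ElementTree as ET


def _text(root, path):
    """Return the stripped text of the element at path, or None if absent."""
    value = root.findtext(path)
    return value.strip() if value is not None else None


def read_settings(xml_text):
    """Extract the fields of an HBase settings file into a dictionary."""
    root = ET.fromstring(xml_text)
    properties = {
        prop.get("name"): (prop.text or "").strip()
        for prop in root.findall("./properties/property")
    }
    return {
        "properties": properties,
        "config_dir": _text(root, "./configDir"),
        "kerberos_user": _text(root, "./security/kerberos/user"),
        "keytab_path": _text(root, "./security/kerberos/path"),
    }


def find_problems(settings):
    """Report inconsistencies (assumed rule: user and keytab go together)."""
    problems = []
    if bool(settings["kerberos_user"]) != bool(settings["keytab_path"]):
        problems.append("Kerberos user and keytab path must be set together")
    return problems


# Settings file in the structure shown above, with a placeholder principal.
SAMPLE_SETTINGS = r"""
<settings>
    <properties>
        <property name="hbase.zookeeper.quorum">10.0.0.114</property>
    </properties>
    <configDir>config\kerberos\conf</configDir>
    <security>
        <kerberos>
            <user>hbase/host.example.com@EXAMPLE.COM</user>
            <path>config\kerberos\hbase.keytab</path>
        </kerberos>
    </security>
</settings>
"""

settings = read_settings(SAMPLE_SETTINGS)
print(settings)
print(find_problems(settings))
```

A malformed file makes `ET.fromstring` raise `xml.etree.ElementTree.ParseError`, which is itself a useful early signal that the publisher would not be able to read it.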

Once you've clicked on the Add button, it will take a moment for Aspire to download all of the necessary components (the Jar files) from the Maven repository and load them into Aspire. Once that's done, the publisher will appear in the Workflow Tree.

For details on using the Workflow section, please refer to Workflow introduction.
