The Load XML stage loads XML from a stream into the job's AspireObject. The XML is loaded as a sub-element. This stage is typically used after a Fetch URL stage, which creates the stream.

XML Loader (Aspire 2)
Factory Name: com.searchtechnologies.aspire:aspire-xml-files (previously aspire.XML)
subType: loadXML
Inputs: Either object['contentStream'] (an InputStream which contains the XML file to be loaded) or object['contentBytes'] (an array of bytes which contains the XML file to be loaded).
Outputs: The XML file specified by the content stream or bytes will be loaded into memory and stored as a sub-element within the <doc> element of the AspireObject attached to the job.
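
The input and output behavior described above can be pictured with a small, standalone Java sketch. It is not Aspire's implementation and uses none of Aspire's API: the class LoadXmlSketch and its methods loadInto and newDoc are hypothetical names, and a plain JDK DOM document with a <doc> root stands in for the AspireObject.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration only; this is not Aspire code or the Aspire API.
final class LoadXmlSketch {

    // Creates an empty DOM document whose root element is <doc>
    // (a stand-in for the AspireObject root).
    static Document newDoc() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        doc.appendChild(doc.createElement("doc"));
        return doc;
    }

    // Parses the XML in the stream and appends its root element under <doc>.
    static Document loadInto(Document doc, InputStream contentStream) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document loaded = builder.parse(contentStream);
        // importNode makes a deep copy that belongs to the target document.
        Node copy = doc.importNode(loaded.getDocumentElement(), true);
        doc.getDocumentElement().appendChild(copy);
        return doc;
    }

    // Same as above for the byte-array form of the input.
    static Document loadInto(Document doc, byte[] contentBytes) throws Exception {
        return loadInto(doc, new ByteArrayInputStream(contentBytes));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<testRootNode><speech name=\"Abraham Lincoln\">Four score...</speech></testRootNode>";
        Document doc = loadInto(newDoc(), xml.getBytes(StandardCharsets.UTF_8));
        Element root = doc.getDocumentElement();
        // Prints: doc > testRootNode
        System.out.println(root.getTagName() + " > " + root.getFirstChild().getNodeName());
    }
}
```

The point of the sketch is only the nesting: the root element of the loaded file becomes a child of <doc> instead of replacing it.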

Example Configuration


<component name="LoadXML" subType="loadXML" factoryName="aspire-xml-files"/>

With Locally Stored DTDs

Use this version if the XML file references DTDs that cannot be accessed over the internet.

  <component name="LoadXML" subType="loadXML" factoryName="aspire-xml-files">
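
For background on why a local DTD copy helps, the following Java sketch shows the general technique with the JDK's own XML parser: an EntityResolver answers the parser's DTD request from local text, so nothing is fetched over the network. This is only an assumption about the general mechanism, not a description of how the Load XML stage is configured or implemented. The class LocalDtdSketch, the method parseWithLocalDtd, and the unreachable DTD URL are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is not Aspire code or the Aspire API.
final class LocalDtdSketch {

    // Parses xml, serving any external entity request (including the DTD)
    // from the supplied local DTD text instead of the network.
    static Document parseWithLocalDtd(String xml, String localDtd) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // A fuller resolver would look at systemId and pick the matching local file.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader(localDtd)));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The DOCTYPE points at a host that cannot be reached.
        String xml = "<!DOCTYPE note SYSTEM \"http://unreachable.invalid/note.dtd\">"
                   + "<note>&greeting;</note>";
        // The local copy of the DTD declares the entity used in the document.
        String dtd = "<!ENTITY greeting \"hello\">";
        Document doc = parseWithLocalDtd(xml, dtd);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

Without the resolver, the parser would try to download the DTD from the URL in the DOCTYPE and fail if the host is unreachable.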

Example Use Within A Pipeline

  <pipeline name="process-feedOne-test">
      <stage component="FetchUrl" />
      <stage component="LoadXML" />
  </pipeline>


In the following example, suppose that the file at "file:test.xml" contains the following:

  <testRootNode>
    <speech name="George Washington">The period for a new election of a citizen, 
      to administer the executive government of the United States, being not far distant, 
      and the time actually arrived...</speech>
    <speech name="Abraham Lincoln">Four score and seven years ago our forefathers 
      brought forth upon this country...</speech>
    <speech name="Thomas Jefferson">We hold these truths to be self-evident, 
      that all men are created equal, that they are endowed by their Creator 
      with certain unalienable Rights, that among these are Life, Liberty and 
      the pursuit of Happiness...</speech>
  </testRootNode>

Further suppose that "file:test.xml" is read by the Fetch URL stage. After the Load XML stage executes, the AspireObject will contain the following structure. Notice how the <testRootNode> element is nested within the <doc> node, which is the root node of the AspireObject.

  <doc>
    <protocol source="FetchURLStage/protocol">file</protocol>
    <mimeType source="FetchURLStage/mimeType">application/xml</mimeType>
    <extension source="FetchURLStage">
      <field name="modificationDate">2009-12-06T05:06:06Z</field>
      <field name="content-type">application/xml</field>
      <field name="content-length">618</field>
      <field name="last-modified">Sun, 06 Dec 2009 05:06:06 GMT</field>
    </extension>
    <testRootNode>
      <speech name="George Washington">The period for a new election of a citizen, 
        to administer the executive government of the United States, being not far distant, 
        and the time actually arrived...</speech>
      <speech name="Abraham Lincoln">Four score and seven years ago our forefathers 
        brought forth upon this country...</speech>
      <speech name="Thomas Jefferson">We hold these truths to be self-evident, 
        that all men are created equal, that they are endowed by their Creator 
        with certain unalienable Rights, that among these are Life, Liberty and 
        the pursuit of Happiness...</speech>
    </testRootNode>
  </doc>