The HTTP Listener connector creates an endpoint that listens for requests and feeds them to an Aspire pipeline.

Introduction


Use the HTTP Listener connector to receive RESTful requests and feed them to an Aspire pipeline. This feeder can turn Aspire into a "RESTful Web Service" that accepts requests from outside clients, processes jobs, and then returns results.

The HTTP Listener connector registers a new endpoint URL based on the Aspire server path. For example, if your seed description is "http seed" and the endpoint name is "submitFiles", then the new URL will be http://localhost:50505/aspire/_api/http_seed/submitFiles/. Note that an underscore replaces the space in "http seed". The endpoint is separate from the standard Aspire admin user interface (which is under "/aspire").
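The URL rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `endpoint_url` and its `base` default are hypothetical, and it assumes that spaces are the only characters rewritten in the seed description.

```python
def endpoint_url(seed_description: str, endpoint_name: str,
                 base: str = "http://localhost:50505/aspire/_api") -> str:
    """Build the listener URL for a seed description and an endpoint name."""
    # Assumption: only spaces are rewritten, each one becoming an underscore.
    seed_segment = seed_description.replace(" ", "_")
    return f"{base}/{seed_segment}/{endpoint_name}/"


print(endpoint_url("http seed", "submitFiles"))
```

If the connector applies further normalization to the seed description, the helper would need to mirror it.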

There are two modes of operation for the HTTP Feeder: 1) input parameters specified on the URL, and 2) input data POSTed to the feeder. In the case of parameters on the URL, the input parameters are added to the AspireObject that is fed down the pipeline. In the case of POSTed data, the input may be either parameters from a form, which are added to the AspireObject that is fed down the pipeline, or data streamed to the endpoint, which is attached to the published job as a stream.

The HTTP Listener can also be used to upload files using a multipart form submission. See below for details.

Repository


HTTP Listener connector
Factory Name | com.accenture.aspire:aspire-http-listener-connector
subType | default
Inputs | RESTful requests in standard URL query string format (name=value pairs).
Outputs | AspireObject containing HTTP request data, including all name=value pairs from the query string.

Connector Features



The HTTP Listener connector has the following features:

  • Parameters specified on the URL are written to the AspireObject.
  • Serving files from a specific disk path.
  • XML and JSON data POSTed to the service are sent to the Aspire pipeline.
  • File upload through multipart form submissions.



Parameters Specified on the URL

In the first mode, parameters are specified on the URL in param=value format. For example: http://localhost:50505/aspire/_api/http_seed/mySeed/params?param1=value1&param2=value2

These parameters will be stored in the resulting AspireDocument passed down the pipeline as XML tags at the top level. For example:

Code Block
<doc>
   <feederLabel>HttpFeeder</feederLabel>
   <param1 source="FeederReleaseEndpoint">value1</param1>
   <param2 source="FeederReleaseEndpoint">value2</param2>
 </doc>

The pipeline is then responsible (via Groovy scripting, for example) for processing the job as necessary. The results are returned as XML data.
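A request URL in this mode can be assembled with the Python standard library, which also takes care of percent-encoding. The sketch below is illustrative only: the helper name `params_url` is hypothetical, and the endpoint is the example URL used on this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def params_url(endpoint: str, params: dict) -> str:
    """Append name=value pairs to an endpoint URL as a query string."""
    # urlencode percent-encodes names and values where needed.
    return f"{endpoint}?{urlencode(params)}"


url = params_url(
    "http://localhost:50505/aspire/_api/http_seed/mySeed/params",
    {"param1": "value1", "param2": "value2"},
)
print(url)

# Reading the pairs back shows what the listener receives as parameters.
print(dict(parse_qsl(urlsplit(url).query)))
```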

Information from the Endpoint

Information from the endpoint is also added to the job published by the HTTPFeeder. The information is added as attributes and child elements of the <aspireHttpFeederEndpoint> tag:

Code Block
<doc>
  <feederLabel>httpFeeder</feederLabel>
  <param1 source="FeederReleaseEndpoint">value1</param1>
  <param2 source="FeederReleaseEndpoint">value2</param2>
  <aspireHttpFeederEndpoint fullPath="/aspire/_api/http_seed/mySeed/parameters" relativePath="/parameters" remoteAddr="[0:0:0:0:0:0:0:1]" remoteHost="[0:0:0:0:0:0:0:1]" remotePort="63173" serverName="localhost" serverPort="50505" endpointPath="/aspire/_api/http_seed/mySeed" source="FeederReleaseEndpoint">
    <queryString>param1=1&amp;param2=2</queryString>
  </aspireHttpFeederEndpoint>
  <pathInfo source="FeederReleaseEndpoint">/parameters</pathInfo>
</doc>


The following information is available:

Attribute | Description
source | The name of the HttpFeeder.
remoteHost | The hostname of the client (e.g., browser).
remoteAddr | The IP address of the client (e.g., browser).
remotePort | The port used by the client (e.g., browser).
serverName | The name of the server running the HttpFeeder.
serverPort | The port the HttpFeeder is listening on.
endpointPath | The path the HttpFeeder is responding to.
fullPath | The full path requested by the client.
relativePath | The path requested by the client relative to the servletPath.
queryString | The entire query string (i.e., everything after the "?" in the URL).
maxUploadSize | The maximum size of a file that can be uploaded, in bytes (defaults to 10,485,760 bytes, i.e. 10 MB). A suffix may be used to specify bytes, kilobytes, megabytes, or gigabytes (b/kb/mb/gb). If no suffix is given, the value is in bytes.
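The maxUploadSize suffix convention can be illustrated with a small parser. This is a sketch, not the connector's own code: the function name `parse_upload_size` is hypothetical, the multipliers are assumed to be powers of 1024 (consistent with the default of 10,485,760 bytes for 10 MB), and case-insensitive suffixes are an assumption.

```python
import re

# Assumed multipliers: powers of 1024, matching 10 MB = 10,485,760 bytes.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_upload_size(value: str) -> int:
    """Convert a size such as '10mb' or '2048' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(b|kb|mb|gb)?\s*", value, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised size: {value!r}")
    number, suffix = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(number) * _UNITS[(suffix or "b").lower()]


print(parse_upload_size("10mb"))
```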

XML and JSON Data POSTed to the Service

To post data to the service, set the "Accept posted content" parameter to TRUE.

The content can be XML or JSON. 

This also means that you can follow the HTTP feeder with any pipeline stage that uses the content stream. For example, XML Sub Job Extractor, Tabular Files Extractor, XML File Loader, and Extract Text can all be the first pipeline stage to receive the job.

The relative URL looks like aspire/_api/#{seed description}/#{endpoint}/data.

See the examples below to understand how to use it.

JSON input:

Code Block
curl -X POST --location "http://localhost:50505/aspire/_api/http_seed/mySeed/data" \
    -H "Content-Type: application/json" \
    -d "{\"name\" : \"David\", \"lastName\": \"Grobelny\"}"

JSON output:

Code Block
{"doc":{"name":"David","lastName":"Grobelny","aspireHttpFeederEndpoint":{"@source":"FeederReleaseEndpoint","@serverPort":"50505","@serverName":"localhost","@endpointPath":"\/aspire\/_api\/http_seed\/mySeed","@relativePath":"\/data","@fullPath":"\/aspire\/_api\/http_seed\/mySeed\/data","@remoteAddr":"127.0.0.1","@remoteHost":"127.0.0.1","@remotePort":"61250"},"pathInfo":{"@source":"FeederReleaseEndpoint","$":"\/data"}}}
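A client receiving a response of this shape may want to separate the posted fields from the endpoint metadata. The sketch below works on an abbreviated copy of the sample output; the helper name `split_response` is hypothetical, and treating `aspireHttpFeederEndpoint` and `pathInfo` as the only metadata keys is an assumption based on this one sample.

```python
import json

# Abbreviated copy of the sample response shown above.
SAMPLE = (
    '{"doc":{"name":"David","lastName":"Grobelny",'
    '"aspireHttpFeederEndpoint":{"@source":"FeederReleaseEndpoint",'
    '"@serverPort":"50505","@relativePath":"/data","@remoteAddr":"127.0.0.1"},'
    '"pathInfo":{"@source":"FeederReleaseEndpoint","$":"/data"}}}'
)


def split_response(text: str):
    """Return (posted fields, endpoint attributes) from a JSON response."""
    doc = json.loads(text)["doc"]
    # Assumption: these two keys are the only ones added by the listener.
    metadata_keys = {"aspireHttpFeederEndpoint", "pathInfo"}
    payload = {k: v for k, v in doc.items() if k not in metadata_keys}
    # XML attributes appear in the JSON with a leading "@".
    endpoint = {k.lstrip("@"): v
                for k, v in doc.get("aspireHttpFeederEndpoint", {}).items()}
    return payload, endpoint


payload, endpoint = split_response(SAMPLE)
print(payload)
print(endpoint["remoteAddr"])
```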


XML input:

Code Block
curl -X POST --location "http://localhost:50505/aspire/_api/http_seed/mySeed/data" \
    -H "Content-Type: application/xml" \
    -d "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>
<DATA_RECORD>
<APPLICANT_ID>11130154</APPLICANT_ID>
<ROLES>accounts assistant</ROLES>
<MAIN_SKILLS>sage line, excel, word power point</MAIN_SKILLS>
<TEXT_CV>manager executive construction project manager</TEXT_CV>
</DATA_RECORD>"

XML output:

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <DATA_RECORD>
        <APPLICANT_ID>11130154</APPLICANT_ID>
        <ROLES>accounts assistant</ROLES>
        <MAIN_SKILLS>sage line, excel, word power point</MAIN_SKILLS>
        <TEXT_CV>manager executive construction project manager</TEXT_CV>
    </DATA_RECORD>
    <aspireHttpFeederEndpoint endpointPath="/aspire/_api/http_seed/mySeed" fullPath="/aspire/_api/http_seed/mySeed/data"
                              relativePath="/data" remoteAddr="127.0.0.1" remoteHost="127.0.0.1" remotePort="65372"
                              serverName="localhost" serverPort="50505" source="FeederReleaseEndpoint"/>
    <pathInfo source="FeederReleaseEndpoint">/data</pathInfo>
</doc>
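The XML form of the response can be read with Python's standard `xml.etree.ElementTree` module. The example uses a shortened copy of the output above, and the helper name `endpoint_info` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML output shown above.
SAMPLE = """<doc>
    <DATA_RECORD>
        <APPLICANT_ID>11130154</APPLICANT_ID>
        <ROLES>accounts assistant</ROLES>
    </DATA_RECORD>
    <aspireHttpFeederEndpoint relativePath="/data" remoteAddr="127.0.0.1"
                              serverPort="50505" source="FeederReleaseEndpoint"/>
    <pathInfo source="FeederReleaseEndpoint">/data</pathInfo>
</doc>"""


def endpoint_info(xml_text: str) -> dict:
    """Return the attributes of <aspireHttpFeederEndpoint> as a dict."""
    element = ET.fromstring(xml_text).find("aspireHttpFeederEndpoint")
    return dict(element.attrib) if element is not None else {}


root = ET.fromstring(SAMPLE)
print(root.findtext("DATA_RECORD/APPLICANT_ID"))
print(endpoint_info(SAMPLE)["relativePath"])
```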

XML input data transform

To transform the response with an XSLT template, enable "Transform response" and enter the path "/config/xsl/extractor.xsl" in the "Transform file" field. The file "extractor.xsl" must be placed in the "aspire5/config/xsl" folder.

For XSLT version 2, you can use the Saxon processor.


Code Block
curl -X POST --location "http://localhost:50505/aspire/_api/http_seed/mySeed/data" \
    -H "Content-Type: application/xml" \
    -d "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>
<DATA_RECORD>
<APPLICANT_ID>11130154</APPLICANT_ID>
<ROLES>accounts assistant</ROLES>
<MAIN_SKILLS>sage line, excel, word power point</MAIN_SKILLS>
<TEXT_CV>manager executive construction project manager</TEXT_CV>
</DATA_RECORD>"

Content of extractor.xsl:

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
	<xsl:template match="/">
		<doc type="APPLICANT">
			<xsl:attribute name="id"><xsl:value-of select="doc/DATA_RECORD/APPLICANT_ID" /></xsl:attribute>
			<xsl:copy-of select="doc/extractedTerms"/>
		</doc>
	</xsl:template>
</xsl:stylesheet>

The XML output depends on the XSL file. With the stylesheet above, it is:

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<doc type="APPLICANT" id="11130154"/>


Multipart Form Submissions

HTML supports submitting "multipart forms" made up of multiple parameters, some of which may represent uploaded file content.

In order for the HTTP feeder to receive multipart forms, you need to enable them and then specify how files are handled. You may choose to handle posted files as a stream (choose stream for the "Multipart form data" option) or as files (choose file for the "Multipart form data" option). If you choose to handle posted files as files, you must also specify the directory they are uploaded to.

Note: Setting the XMLContent option of the HttpFeeder automatically disables multipart form submission processing.

The relative URL looks like aspire/_api/#{seed description}/#{endpoint}/form.

See the examples below to understand how to use it.

Stream Handler

When the file handler is set to stream, only a single file may be uploaded at a time. Furthermore, all parameters received BEFORE the file are added as XML tags to the job's AspireObject. Parameters received AFTER the file are ignored. The file itself is attached to the job as an InputStream, and subsequent stages can access the data using the Standards.Basic.getContentStream(Job j) method in the package com.accenture.aspire.framework, so data can be streamed directly from the client through whatever processing you need to do. The file is NOT stored locally on the Aspire server by the HttpFeeder.

Example configuration

 <component name="MyHTTPFeeder" factoryName="aspire-http-feeder" subType="default">
   .
   .
   <multipartForm>
     <fileHandler>stream</fileHandler>
   </multipartForm>
 </component>

Input:

Code Block
curl -X POST --location "http://localhost:50505/aspire/_api/http_seed/mySeed/form" \
    -H "Content-Type: multipart/form-data; boundary=WebAppBoundary" \
    -F "field-name=@C:\tmp\multipartfile.txt;filename=file.txt;type=*/*"

Output:

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<doc>
   
    <field-name source="FeederReleaseEndpoint">file.txt</field-name>
    <contentStream>Http Feeder

        Description: Feeds a single URL down the pipeline in response to an http request.
		...
        
    </contentStream>
    <aspireHttpFeederEndpoint endpointPath="/aspire/_api/http_seed/mySeed" fullPath="/aspire/_api/http_seed/mySeed/form"
                              relativePath="/form" remoteAddr="127.0.0.1" remoteHost="127.0.0.1" remotePort="63025"
                              serverName="localhost" serverPort="50505" source="FeederReleaseEndpoint"/>
    <pathInfo source="FeederReleaseEndpoint">/form</pathInfo>
</doc>
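With the stream handler, only parameters that arrive before the file are kept, so the order of the parts in the request body matters. The sketch below builds a multipart/form-data body by hand to make that order visible. The function name `build_multipart`, the `title` field, and the `application/octet-stream` part type are illustrative assumptions; most HTTP clients assemble this body for you.

```python
def build_multipart(fields: dict, file_field: str, filename: str,
                    content: bytes, boundary: str = "WebAppBoundary") -> bytes:
    """Build a multipart/form-data body with all plain fields before the file."""
    lines = []
    # Plain parameters first: the stream handler only keeps those sent before the file.
    for name, value in fields.items():
        lines += [f"--{boundary}",
                  f'Content-Disposition: form-data; name="{name}"',
                  "",
                  value]
    # The single file part goes last.
    lines += [f"--{boundary}",
              f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"',
              "Content-Type: application/octet-stream",
              ""]
    head = "\r\n".join(lines).encode("utf-8") + b"\r\n"
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail


body = build_multipart({"title": "report"}, "field-name", "file.txt", b"hello")
print(body.decode("utf-8"))
```

The matching request header would be `Content-Type: multipart/form-data; boundary=WebAppBoundary`, using the same boundary string as the body.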


File Handler

When the file handler is set to file, multiple files may be uploaded by a single form submission. Using the file handler requires the HTTP Listener field Upload directory name to be configured. Any file submitted will be uploaded and saved to this directory. The uploaded file is saved using its original filename (filename only, not the complete path).

No streams are added to the Aspire job. If you wish to reference the file, you will need to access the job's AspireObject and extract the value of the tag corresponding to the HTML form input that caused the file to be uploaded. This value is the full path to the saved copy of the uploaded file on the Aspire server.

The example below uses the same input as the stream handler example.

Input:

Code Block
curl -X POST --location "http://localhost:50505/aspire/_api/http_seed/mySeed/form" \
    -H "Content-Type: multipart/form-data; boundary=WebAppBoundary" \
    -F "field-name=@C:\tmp\multipartfile.txt;filename=file.txt;type=*/*"

Output:

Code Block
<?xml version="1.0" encoding="UTF-8"?>
<doc>
    <field-name source="FeederReleaseEndpoint">
        c:\Users\david.grobelny\aspire5.1\target\aspire-distribution-archetype-5.1-SNAPSHOT-distribution/config/xsl\file.txt
    </field-name>
    <aspireHttpFeederEndpoint endpointPath="/aspire/_api/http_seed/mySeed" fullPath="/aspire/_api/http_seed/mySeed/form"
                              relativePath="/form" remoteAddr="127.0.0.1" remoteHost="127.0.0.1" remotePort="56743"
                              serverName="localhost" serverPort="50505" source="FeederReleaseEndpoint"/>
    <pathInfo source="FeederReleaseEndpoint">/form</pathInfo>
</doc>
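In the output above, the saved path is surrounded by whitespace inside the tag, so a consumer should trim it before use. A minimal sketch follows; the helper name `uploaded_file_path` and the sample path are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical document in the shape shown above, with an illustrative path.
SAMPLE = """<doc>
    <field-name source="FeederReleaseEndpoint">
        C:\\aspire\\upload\\file.txt
    </field-name>
    <pathInfo source="FeederReleaseEndpoint">/form</pathInfo>
</doc>"""


def uploaded_file_path(doc_xml: str, field_name: str) -> Optional[str]:
    """Return the saved path stored under the form field's tag, if present."""
    element = ET.fromstring(doc_xml).find(field_name)
    if element is None or element.text is None:
        return None
    # The path is padded with newlines and indentation in the document.
    return element.text.strip()


print(uploaded_file_path(SAMPLE, "field-name"))
```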

Serving Files

The HTTPFeeder can also serve ordinary HTML files, so it can act as a complete, end-to-end front end for simple user interfaces.

Files are stored inside the Aspire Home directory, in the "$ASPIRE_HOME/web/httpfeeder/#{endpoint}/#{html_serving_directory}" directory.

For example, a request for:

Will access the file from:

  • $ASPIRE_HOME/web/httpfeeder/mySeed/submitFiles/test.html

Note that “index.html” is also supported. So, a request for:

Will return:

  • $ASPIRE_HOME/web/httpfeeder/mySeed/submitFiles/index.html

If it exists.
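The directory layout above can be expressed as a small path helper. This is a sketch under stated assumptions: the function name `served_file` is hypothetical, the way a request URL maps to the `requested` argument is not specified on this page, and the fallback to index.html for an empty request is inferred from the note above.

```python
import posixpath


def served_file(aspire_home: str, endpoint: str, html_dir: str,
                requested: str = "") -> str:
    """Map a requested file name to its location under the Aspire home directory."""
    # Assumption: an empty request falls back to index.html.
    name = requested.strip("/") or "index.html"
    return posixpath.join(aspire_home, "web", "httpfeeder", endpoint, html_dir, name)


print(served_file("$ASPIRE_HOME", "mySeed", "submitFiles", "test.html"))
print(served_file("$ASPIRE_HOME", "mySeed", "submitFiles"))
```

The helper does no validation of the requested name (for example against ".." segments); a real server would need that.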

File Handler: HTML Form Example

The file handler can also receive files submitted from an HTML form. For example, if the file was uploaded via the following form:

 <form enctype="multipart/form-data" method=POST  action="http://localhost:50505/xmlfeed">
   XML file to push:
   <input type="file" name="data">
   <input type="submit" value=">Submit<">
 </form>

The AspireObject for the job would look similar to:

 <doc>
   <aspireHttpFeederServlet remotePort="56494" serverName="localhost" source="HTTPFeederServlet" remoteHost="127.0.0.1" serverPort="50505" remoteAddr="127.0.0.1" fullPath="/xmlfeed" servletPath="/xmlfeed">
     <queryString/>
   </aspireHttpFeederServlet>
   <data>C:\tmp\3.0distroTest\distro-test\target\aspire-distribution-1.0-distribution/data/upload\htmlContentFeed.xml</data>
 </doc>

All ordinary HTML form input parameters will be added to the job's AspireObject as XML tags.

Example configuration

 <component name="MyHTTPFeeder" factoryName="aspire-http-feeder" subType="default">
   .
   .
   <multipartForm>
     <fileHandler>file</fileHandler>
     <uploadDir>data/upload</uploadDir>
   </multipartForm>
 </component>

Configuration

Element | Type | Default | Description
branches | parent tag | None | The configuration of the pipeline to publish to. See below.
waitForJob | boolean | true | Indicates whether the component waits for the job to complete.
servletName | String | httpFeeder | Name of the servlet that will feed the files. For example, if servletName is "submitFiles", then you would send files to the httpFeeder using the "http://localhost:50505/submitFiles?params..." URL.
feederLabel | String | HttpFeeder | The <feederLabel> value to be included with the document as it is sent to the pipeline. For example, HttpFeeder.
XMLContent | boolean | true | Set this parameter to true if you will be POSTing XML data to the HTTP Feeder. This XML data will be set as an input stream attached to the job published by the feeder. Subsequent stages can access the data using the Standards.Basic.getContentStream(Job j) method in the package com.accenture.aspire.framework.
xsltFileName | String | null | The path of the XSL transform file used to format the output XML. Path names are relative to Aspire Home.
outputMime | String | text/xml | Specifies the MIME type that the HTTP feeder reports back to the HTTP client. Change this to "text/html" if your transform creates HTML that should be shown by a browser.
resultMimeTypeField | String | | Sets the MIME type using the value found in the specified field. The field must exist as a child of the root (i.e., a parameter value of mimeType looks for the value in the /doc/mimeType field in the default AspireObject). If the field does not exist or is empty, the MIME type reverts to the value of the <outputMime> parameter. NOTE: The value is extracted before the transformation (if any) is applied.
multipartForm | parent tag | | Enables multipart form submission, which allows files, as well as other input elements, to be uploaded to the HTTP server through HTML forms.
multipartForm/fileHandler | String | stream | Specifies the type of file handler to use for posted files. The stream (default) handler attaches an InputStream for the file to the job, and subsequent stages can access the data using the Standards.Basic.getContentStream(Job j) method in the package com.accenture.aspire.framework. The file handler uploads the file to the specified directory (see below). No input stream is attached to the job for the file handler. See above for more details and restrictions.
multipartForm/uploadDir | String | | Specifies the location where files from multipart forms are uploaded when using the file handler. See above for more details.
saxonProcessor | boolean | false | Set to true if you want to use the Saxon processor to transform using XSLT 2.0 files.
debugOutFile | String | | Specifies the location where the XSLT-processed output is written. This is used for debugging the transforms.
headers | parent tag | None | The configuration of the HTTP headers. See below.
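The fallback between resultMimeTypeField and outputMime can be summarized in a few lines of Python. This is only an illustration of the rule as described in the table, not the feeder's code; the function name `resolve_mime` and its arguments are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def resolve_mime(doc_xml: str, result_mime_field: Optional[str] = None,
                 output_mime: str = "text/xml") -> str:
    """Pick the response MIME type: field value if present, else outputMime."""
    if result_mime_field:
        # The field is looked up as a direct child of the root element.
        value = ET.fromstring(doc_xml).findtext(result_mime_field)
        if value and value.strip():
            return value.strip()
    # Missing or empty field: revert to the configured outputMime.
    return output_mime


print(resolve_mime("<doc><mimeType>text/html</mimeType></doc>", "mimeType"))
print(resolve_mime("<doc/>", "mimeType"))
```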

Example Configurations for HTML Form-Style Parameters

This will handle either parameters specified on the URL with HTTP GET, or parameters POST'ed from an HTML <form>.

 <component name="MyHTTPFeeder" factoryName="aspire-http-feeder" subType="default">
   <servletName>submitFiles</servletName>
   <feederLabel>HttpFeeder</feederLabel>
   <xsltFileName>config/categorizeOutput.xsl</xsltFileName>
   <branches>
     <branch event="onPublish" pipelineManager="CategorizeFolderOrFile" />
   </branches> 
 </component>

Example configuration for posting XML to Aspire

 <component name="MyHTTPFeeder" factoryName="aspire-http-feeder" subType="default">
   <servletName>submitFiles</servletName>
   <feederLabel>HttpFeeder</feederLabel>
   <XMLContent>true</XMLContent>
   <xsltFileName>config/extractor.xsl</xsltFileName>
   <branches>
     <branch event="onPublish" pipelineManager="CategorizeFolderOrFile" />
   </branches> 
 </component>

Example configuration for configuring HTTP headers

You can specify required HTTP headers in the configuration as follows. The feeder will then add those headers to the response.

 <component name="MyHTTPFeeder" factoryName="aspire-http-feeder" subType="default">
  .
  .
  .
    <headers>
        <header name="Authorisation">simple</header>
        <header name="Accept">text/plain</header>
    </headers>
 </component>

Content Crawled

The SMB connector is able to crawl the following objects:

Name | Type | Relevant Metadata | Content Fetch & Extraction | Description
Folder | container | Last Modified Date | NA | The directories of the share folder. Each directory will be scanned to retrieve more directories or files.
File | document | Last Modified Date, Data size | yes | The files contained by the directories in the crawled share folder.

Limitations

The SMB Connector has the following limitations:

The following features are not currently implemented, but are on the development plan:

  • SMBv3 support