The Mimetype Normalizer component reads the mime type from the input AspireObject and assigns it to a category, using the list of known mime types in the normalized-mimetypes.xml file.

Feature only available with Aspire Enterprise

Mime Type Normalizer (Aspire 2)
Factory Name   com.searchtechnologies.aspire:aspire-tools
subType        mimeTypeNormalizer
Inputs         AspireObject holding a mime type in one of the fields listed under Mimetype Fields
Outputs        AspireObject with normalized mime-type fields

Mimetype Fields

The Mimetype Normalizer looks for the mime type to classify in the following fields of the input AspireObject. The fields are checked in the order shown, and the first one found is used:

Order  Field
1      mimeType
2      contentType
3      hierarchy/item/@type
4      repItemType
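
The lookup order above can be sketched in a few lines of Python. This is only an illustration, not the component's implementation: it assumes the AspireObject is available as an XML `<doc>` element, that blank values count as missing, and the helper names `read_field` and `find_mime_type` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Field paths in the documented lookup order.
MIME_TYPE_FIELDS = ("mimeType", "contentType", "hierarchy/item/@type", "repItemType")


def read_field(doc, path):
    """Read an element's text, or an attribute when the path ends in /@name."""
    if "/@" in path:
        element_path, attribute = path.rsplit("/@", 1)
        element = doc.find(element_path)
        value = element.get(attribute) if element is not None else None
    else:
        value = doc.findtext(path)
    # Assumption: a blank value is treated the same as a missing field.
    return value.strip() if value and value.strip() else None


def find_mime_type(doc):
    """Return (field, value) for the first populated field, or None."""
    for path in MIME_TYPE_FIELDS:
        value = read_field(doc, path)
        if value is not None:
            return path, value
    return None


doc = ET.fromstring(
    "<doc>"
    "<contentType>application/msword</contentType>"
    "<repItemType>text/plain</repItemType>"
    "</doc>"
)
print(find_mime_type(doc))  # ('contentType', 'application/msword')
```

Because `mimeType` is absent in this sample, `contentType` wins over `repItemType`, which sits later in the order.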


Configuration

The mime type normalizer recognizes the following configuration parameters:

Element            Type    Default                                                                                                   Description
mimetypesLocation  String  ${aspire.home}/resources/com.searchtechnologies.aspire.utilities.tools.MimeTypes/normalized-mimetypes.xml  The location of the normalized mimetypes file.


Normalized Mimetypes XML

<?xml version="1.0" encoding="UTF-8"?>
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
    <mimetype name="application/msword"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.websettings+xml"/>
    .
    .
    .
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
    .
    .
    .
  </category>
  .
  .
  .
</mimetypes>
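
To make the file structure concrete, the following Python sketch turns a file of this shape into a lookup table from each mime type to its category. It is an illustration under assumptions rather than the component's own code: the function name `load_categories` is hypothetical, and the sample is a shortened, well-formed excerpt of the listing above without the elided entries.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed excerpt of a normalized-mimetypes.xml file.
SAMPLE = """
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/msword"/>
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
  </category>
</mimetypes>
"""


def load_categories(xml_text):
    """Map each mimetype name to its (category name, display name) pair."""
    root = ET.fromstring(xml_text)
    lookup = {}
    for category in root.findall("category"):
        entry = (category.get("name"), category.get("displayName"))
        for mimetype in category.findall("mimetype"):
            lookup[mimetype.get("name")] = entry
    return lookup


lookup = load_categories(SAMPLE)
print(lookup["application/vnd.lotus-wordpro"])  # ('application/msword', 'Word')
```

Each `category` element supplies the normalized mime type (`name`) and the friendly name (`displayName`) shared by all `mimetype` entries nested inside it.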


Output

The Mimetype Normalizer adds three values to the AspireObject: the original mime type (originalMimeType), the normalized mime type or category (normalizedMimeType), and the normalized mime name or friendly name (normalizedMimeName).

<doc>
  <fetchUrl>smb://server/Archive 2011 - DLS Utah presentation.pptx</fetchUrl>
  <mimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimeType>
  .
  .
  .
  <originalMimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</originalMimeType>
  <normalizedMimeType>application/vnd.ms-powerpoint</normalizedMimeType>
  <normalizedMimeName>PowerPoint</normalizedMimeName>
</doc>
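
The relationship between the input field and the three output fields can be sketched as follows. This is a simplified illustration, not the component itself: it reads only the `mimeType` field, uses a one-entry lookup table, and the function name `add_normalized_fields` is hypothetical. The source does not say what happens to a mime type that is not listed, so leaving the document unchanged in that case is purely an assumption.

```python
import xml.etree.ElementTree as ET

# One-entry lookup: mime type -> (category name, display name).
LOOKUP = {
    "application/vnd.openxmlformats-officedocument.presentationml.presentation":
        ("application/vnd.ms-powerpoint", "PowerPoint"),
}


def add_normalized_fields(doc, lookup):
    """Append the three output fields to doc; return True if a category matched."""
    mime_type = doc.findtext("mimeType")
    category = lookup.get(mime_type)
    if category is None:
        # Assumption: unknown mime types leave the document untouched.
        return False
    category_name, display_name = category
    ET.SubElement(doc, "originalMimeType").text = mime_type
    ET.SubElement(doc, "normalizedMimeType").text = category_name
    ET.SubElement(doc, "normalizedMimeName").text = display_name
    return True


doc = ET.fromstring(
    "<doc><mimeType>"
    "application/vnd.openxmlformats-officedocument.presentationml.presentation"
    "</mimeType></doc>"
)
add_normalized_fields(doc, LOOKUP)
print(doc.findtext("normalizedMimeName"))  # PowerPoint
```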