The Mimetype Normalizer component reads the mime type name from the input AspireObject and maps it to a category, using the known mime types listed in the normalized-mimetypes.xml file.

Mime Type Normalizer

Factory Name: com.searchtechnologies.aspire:aspire-tools
subType: mimeTypeNormalizer
Inputs: AspireObject holding a mime-type in one of the mime-type fields described under Mimetype Fields below
Outputs: AspireObject with normalized mime-type fields

Mimetype Fields

The Mimetype Normalizer looks for the mime-type to classify in the following fields of the input AspireObject. The fields are checked in the order shown, and the first one found is used:

Order  Field
1      mimeType
2      contentType
3      hierarchy/item/@type
4      repItemType
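
The lookup order in the table above can be sketched as follows. This is only an illustration, not the component's actual implementation: it uses a plain `xml.etree.ElementTree` element as a stand-in for the AspireObject, the names `MIME_FIELDS` and `find_mime_type` are hypothetical, and skipping empty values is an assumption.

```python
import xml.etree.ElementTree as ET

# Field paths in the priority order given in the table above.
MIME_FIELDS = ["mimeType", "contentType", "hierarchy/item/@type", "repItemType"]


def find_mime_type(doc):
    """Return (field, value) for the first populated field, or None.

    `doc` is an ElementTree element standing in for the AspireObject.
    Treating empty or whitespace-only values as absent is an assumption.
    """
    for path in MIME_FIELDS:
        if "/@" in path:
            # Attribute path: split into the element path and the attribute name.
            elem_path, attr = path.rsplit("/@", 1)
            elem = doc.find(elem_path)
            value = elem.get(attr) if elem is not None else None
        else:
            elem = doc.find(path)
            value = elem.text if elem is not None else None
        if value and value.strip():
            return path, value.strip()
    return None


doc = ET.fromstring(
    "<doc><contentType>text/html</contentType>"
    "<repItemType>file</repItemType></doc>"
)
print(find_mime_type(doc))
```

Here `contentType` wins over `repItemType` because it appears earlier in the order, even though both fields are present.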

Configuration

The mime type normalizer recognizes the following configuration parameters:

Element: mimetypesLocation
Type: String
Default: ${aspire.home}/resources/com.searchtechnologies.aspire.utilities.tools.MimeTypes/normalized-mimetypes.xml
Description: The location of the normalized mimetypes file.

Normalized Mimetypes XML

<?xml version="1.0" encoding="UTF-8"?>
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
    <mimetype name="application/msword"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.websettings+xml"/>
    .
    .
    .
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
    .
    .
    .
  </category>
  .
  .
  .
</mimetypes>
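
To make the file structure concrete, here is a minimal sketch of how such a file could be turned into a lookup table from each mime type to its category. It is not the component's own loader: `SAMPLE_XML` is a shortened stand-in for the real file, `load_category_map` is a hypothetical helper, and exact (case-sensitive) matching is an assumption.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for normalized-mimetypes.xml (not the full file).
SAMPLE_XML = """
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/msword"/>
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
  </category>
</mimetypes>
"""


def load_category_map(xml_text):
    """Map each mimetype name to (category name, category displayName)."""
    root = ET.fromstring(xml_text)
    lookup = {}
    for category in root.findall("category"):
        entry = (category.get("name"), category.get("displayName"))
        for mimetype in category.findall("mimetype"):
            # Exact-match keys; case handling in the real component is not documented here.
            lookup[mimetype.get("name")] = entry
    return lookup


categories = load_category_map(SAMPLE_XML)
print(categories["application/vnd.lotus-wordpro"])
```

Flattening the file into a dictionary keeps each lookup to a single key access, at the cost of holding every mime type in memory.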

Output

The Mimetype Normalizer outputs three values: the original mime type value (originalMimeType), the normalized mime type, which is the name of the matching category (normalizedMimeType), and the normalized mime name, which is the category's friendly display name (normalizedMimeName).

<doc>
  <fetchUrl>smb://server/Archive 2011 - DLS Utah presentation.pptx</fetchUrl>
  <mimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimeType>
  .
  .
  .
  <originalMimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</originalMimeType>
  <normalizedMimeType>application/vnd.ms-powerpoint</normalizedMimeType>
  <normalizedMimeName>PowerPoint</normalizedMimeName>
</doc>
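
The output above can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the component's code: the document is modelled as an `xml.etree.ElementTree` element, `LOOKUP` is a hypothetical hand-written subset of the normalized mimetypes file, and leaving the document untouched for an unknown mime type is a choice made for this sketch only.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of the category data: mime type -> (category, display name).
LOOKUP = {
    "application/msword": ("application/msword", "Word"),
    "application/vnd.ms-powerpoint": ("application/vnd.ms-powerpoint", "PowerPoint"),
    "application/vnd.openxmlformats-officedocument.presentationml.presentation": (
        "application/vnd.ms-powerpoint",
        "PowerPoint",
    ),
}


def add_normalized_fields(doc, mime_type, lookup=LOOKUP):
    """Append the three output fields to `doc`; return False if the type is unknown.

    Doing nothing for unknown types is an assumption of this sketch.
    """
    entry = lookup.get(mime_type)
    if entry is None:
        return False
    category, display_name = entry
    fields = (
        ("originalMimeType", mime_type),
        ("normalizedMimeType", category),
        ("normalizedMimeName", display_name),
    )
    for tag, text in fields:
        ET.SubElement(doc, tag).text = text
    return True


doc = ET.fromstring(
    "<doc><mimeType>application/vnd.openxmlformats-officedocument"
    ".presentationml.presentation</mimeType></doc>"
)
add_normalized_fields(doc, doc.findtext("mimeType"))
print(ET.tostring(doc, encoding="unicode"))
```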