The Mimetype Normalizer component reads the mime type from the input AspireObject and classifies it against the known mime types listed in the normalized-mimetypes.xml file.

Mime Type Normalizer
Factory Name: com.searchtechnologies.aspire:aspire-tools
subType: mimeTypeNormalizer
Inputs: AspireObject holding a mime-type in one of the mime-type fields described under Mimetype Fields
Outputs: AspireObject with normalized mime-type fields

Mimetype Fields

The Mimetype Normalizer looks for the mime-type to classify in the following fields of the input AspireObject. The fields are checked in the order shown, and the first one found is used:

Order  Field
1      mimeType
2      contentType
3      hierarchy/item/@type
4      repItemType
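
The lookup order above can be sketched in a few lines of Python. This is only an illustration, not the component's actual implementation: the function name find_mime_type is hypothetical, the AspireObject is modelled as a plain XML element, and skipping fields that are present but empty is an assumption.

```python
import xml.etree.ElementTree as ET

# Lookup order taken from the table above. A trailing "/@name" means
# "attribute `name` of the element at the preceding path".
FIELD_ORDER = ["mimeType", "contentType", "hierarchy/item/@type", "repItemType"]


def find_mime_type(doc):
    """Return (field, value) for the first populated field, or None.

    Hypothetical helper: `doc` is an ElementTree element standing in for
    the AspireObject. Empty fields are skipped (an assumption).
    """
    for field in FIELD_ORDER:
        path, _, attr = field.partition("/@")
        node = doc.find(path)
        if node is None:
            continue
        value = node.get(attr) if attr else node.text
        if value and value.strip():
            return field, value.strip()
    return None
```

With a document that has no mimeType field but does have contentType and repItemType, the sketch picks contentType, because it comes earlier in the order.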

Configuration

The mime type normalizer recognizes the following configuration parameters:

Element: mimetypesLocation
Type: String
Default: ${aspire.home}/resources/com.searchtechnologies.aspire.utilities.tools.MimeTypes/normalized-mimetypes.xml
Description: The location of the normalized mimetypes file.

Normalized Mimetypes XML

<?xml version="1.0" encoding="UTF-8"?>
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
    <mimetype name="application/msword"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.websettings+xml"/>
    .
    .
    .
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
    .
    .
    .
  </category>
  .
  .
  .
</mimetypes>
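
To show how a file with this structure could be turned into a lookup table, here is a minimal Python sketch. It is not the component's own code: load_mimetype_index and SAMPLE are hypothetical names, the sample holds only a few entries taken from the listing above, and letting the first category win when a mime type appears twice is an assumption.

```python
import xml.etree.ElementTree as ET

# Cut-down sample with the same structure as normalized-mimetypes.xml.
SAMPLE = """<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/msword"/>
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
  </category>
</mimetypes>"""


def load_mimetype_index(xml_text):
    """Map each mime type name to (category name, category displayName)."""
    index = {}
    root = ET.fromstring(xml_text)
    for category in root.findall("category"):
        entry = (category.get("name"), category.get("displayName"))
        for mimetype in category.findall("mimetype"):
            # Assumption: if a mime type is listed twice, keep the first category.
            index.setdefault(mimetype.get("name"), entry)
    return index
```

Each mimetype entry maps to the name and displayName of its enclosing category, so several specific mime types collapse into one normalized type.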

Output

The Mimetype Normalizer outputs three values: the original mime type value (originalMimeType), the normalized mime type or category (normalizedMimeType), and the normalized mime name or friendly name (normalizedMimeName).

<doc>
  <fetchUrl>smb://server/Archive 2011 - DLS Utah presentation.pptx</fetchUrl>
  <mimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimeType>
  .
  .
  .
  <originalMimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</originalMimeType>
  <normalizedMimeType>application/vnd.ms-powerpoint</normalizedMimeType>
  <normalizedMimeName>PowerPoint</normalizedMimeName>
</doc>
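
The way the three output fields relate to the category data can be sketched as follows. This is an illustration under stated assumptions rather than the component's real behavior: add_normalized_fields and INDEX are hypothetical, only the mimeType field is read for brevity, and leaving the normalized fields out when the mime type is unknown is an assumption the page does not confirm.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory index: mime type -> (category name, displayName),
# using the PowerPoint category from the sample file above.
INDEX = {
    "application/vnd.ms-powerpoint":
        ("application/vnd.ms-powerpoint", "PowerPoint"),
    "application/vnd.openxmlformats-officedocument.presentationml.presentation":
        ("application/vnd.ms-powerpoint", "PowerPoint"),
}


def add_normalized_fields(doc, index):
    """Append the three output fields to `doc` and return it."""
    original = doc.findtext("mimeType")
    if original is None:
        return doc  # nothing to classify
    ET.SubElement(doc, "originalMimeType").text = original
    match = index.get(original)
    if match is not None:  # assumption: unknown types get no normalized fields
        category, display_name = match
        ET.SubElement(doc, "normalizedMimeType").text = category
        ET.SubElement(doc, "normalizedMimeName").text = display_name
    return doc
```

Applied to a document whose mimeType is the .pptx mime type from the example above, the sketch adds the same three fields shown there.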