
The Publish to Azure Search publisher will post documents to an Azure Search index through

https://<server>/indexes/<index>/docs/index?api-version=<apiVersion>

as described by Azure Search in the Bulk API.
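
As an illustration of the endpoint above, here is a minimal Python sketch of how a bulk request URL and JSON body could be assembled. The server name, index name, API version, and document fields are placeholders chosen for this example, and the helper `build_index_request` is hypothetical; it is not part of the publisher. The `value` array and the `@search.action` property follow the request shape documented for the Azure Search bulk indexing API.

```python
import json


def build_index_request(server, index, api_version, documents, action="upload"):
    """Return the (url, body) pair for one bulk indexing request.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    url = f"https://{server}/indexes/{index}/docs/index?api-version={api_version}"
    # Each document carries an "@search.action" telling the service what to do with it.
    body = json.dumps(
        {"value": [{"@search.action": action, **doc} for doc in documents]}
    )
    return url, body


# Placeholder values, assumed for the example.
url, body = build_index_request(
    "example.search.windows.net",
    "my-index",
    "2020-06-30",
    [{"id": "doc-1", "title": "Hello"}],
)
print(url)
print(body)
```

The actual request would be an HTTPS POST of this body with the service's API key in the request headers; the Groovy script mentioned below controls what the publisher really sends.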


Features

Some of the features of the Publish to Azure Search publisher include:

  • You can customize the Azure Search feed by editing the Groovy script.
  • It is connector independent.
  • It runs from any machine with access to the Azure Search Cloud Service.


Limitations 


The publisher is subject to the following Azure Search Service limitations:

    • Index schema - Certain properties of your index schema can be set only once.
      • They cannot be updated in the future.
      • Therefore, any schema updates that require re-indexing (such as changing field types) are not currently possible after the initial configuration.
      • Important: Make sure your index has all of the required fields with the correct configuration before saving it.

    • Document keys - can contain only letters, digits, underscore ( _ ), dash ( - ), or equal sign ( = ).
    • Document field names - must start with a letter and contain only letters, digits, or underscore.

    • Batch size - the maximum is 16 MB. You can pass a batch of multiple documents to the Index API all at once, so the size limit per document depends on how many documents are in the batch.
      • If the batch size exceeds 16 MB, the publisher will attempt to split the batch into multiple requests, none of which exceeds the size limit.
      • A single document of 16 MB or more cannot be published to the Azure index. You can truncate the content of such documents using the Groovy Transformation file.

    • Maximums
        • 1000 documents per batch of index uploads, merges, or deletes.
        • 32 fields in the $orderby clause in the schema.
        • 32,766 bytes (32 KB minus 2 bytes) of UTF-8 encoded text for the field size for the Filterable, Sortable, Facetable, and Searchable fields.
    • Index cleanup - it is not possible to delete all documents from an index directly. If the index needs to be cleaned, deleting and re-creating the index is the only way.
      • The publisher never does this, because it does not know how the schema is configured and so cannot recreate it.
      • If multiple full crawls are executed, data from previous crawls will still exist in the index.
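
The key, field name, and batch limits above can be checked before a feed is sent. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and document size is estimated from the JSON encoding of each document without the surrounding request envelope. It does not reproduce the publisher's own splitting logic.

```python
import json
import re

MAX_BATCH_BYTES = 16 * 1024 * 1024  # 16 MB per batch
MAX_BATCH_DOCS = 1000               # documents per batch

# Assumption: "letters" means ASCII letters.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_\-=]+")
FIELD_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_key(key):
    """Letters, digits, underscore, dash, or equal sign only."""
    return KEY_PATTERN.fullmatch(key) is not None


def is_valid_field_name(name):
    """Must start with a letter, then letters, digits, or underscore."""
    return FIELD_PATTERN.fullmatch(name) is not None


def doc_size(doc):
    """Approximate size of one document as UTF-8 encoded JSON."""
    return len(json.dumps(doc).encode("utf-8"))


def split_batches(docs, max_bytes=MAX_BATCH_BYTES, max_docs=MAX_BATCH_DOCS):
    """Greedily group docs into batches that respect both limits.

    Returns (batches, oversized); oversized holds documents that are
    too large to fit in any batch on their own.
    """
    batches, oversized = [], []
    current, current_bytes = [], 0
    for doc in docs:
        size = doc_size(doc)
        if size > max_bytes:
            oversized.append(doc)
            continue
        # Start a new batch if adding this document would break a limit.
        if current and (current_bytes + size > max_bytes or len(current) >= max_docs):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(doc)
        current_bytes += size
    if current:
        batches.append(current)
    return batches, oversized


# Usage with deliberately small limits so the splitting is visible.
docs = [{"id": str(i), "content": "x" * 40} for i in range(5)]
batches, oversized = split_batches(docs, max_docs=2)
print([len(b) for b in batches], len(oversized))
```

Documents returned in `oversized` correspond to the case described above where content has to be truncated in the Groovy Transformation file before publishing can succeed.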

    More information is available at https://docs.microsoft.com/en-us/azure/search/search-limits-quotas-capacity and https://docs.microsoft.com/en-us/azure/search/search-create-index-portal