The Publish to Azure Search publisher posts documents to an Azure Search index through https://<server>/indexes/<index>/docs/index?api-version=<apiVersion>, as described by Azure Search in its Bulk API documentation.
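
As a rough illustration of the request shape, the sketch below builds the endpoint URL from the template above and wraps documents in the `value` / `@search.action` envelope that Azure Search documents for this endpoint. The function names, server name, index name, API version, and document fields are placeholders chosen for the example, not values taken from the publisher. The JSON actually sent depends on the Groovy script.

```python
import json


def build_index_url(server: str, index: str, api_version: str) -> str:
    """Fill in the URL template used for bulk indexing requests."""
    return f"https://{server}/indexes/{index}/docs/index?api-version={api_version}"


def build_bulk_payload(documents, action="upload"):
    """Wrap each document with an @search.action and serialize the batch.

    Assumption: every document gets the same action ("upload" here);
    Azure Search also accepts "merge", "mergeOrUpload", and "delete".
    """
    return json.dumps(
        {"value": [{"@search.action": action, **doc} for doc in documents]}
    )


# Placeholder values for illustration only.
url = build_index_url("myservice.search.windows.net", "myindex", "2017-11-11")
body = build_bulk_payload([{"id": "doc-1", "title": "Hello"}])

# The request itself would be an HTTP POST of `body` to `url` with the
# headers "Content-Type: application/json" and "api-key: <admin key>".
```

Keeping URL construction and payload construction separate makes it easy to check either one without a live service.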

Features

Some of the features of the Publish to Azure Search publisher include:

  • The feed sent to Azure Search can be customized by editing the Groovy script.
  • It is connector independent.
  • It runs from any machine with access to the Azure Search cloud service.



Limitations 


The publisher is subject to the Azure Search service limitations:

  • Certain properties of your index schema can only be set once and cannot be updated later. Because of this, schema updates that would require re-indexing, such as changing field types, are not currently possible after the initial configuration. Make sure that your index has all required fields with the correct configuration before saving it.
  • Document keys can only contain letters, digits, underscore (_), dash (-), or equal sign (=).
  • Document field names must start with a letter and contain only letters, digits, or underscores.
  • The maximum size of a batch is 16 MB. Because a batch can carry multiple documents to the Index API at once, the effective size limit per document depends on how many documents are in the batch. If a batch exceeds 16 MB, the publisher tries to split the batch and truncate the content field of the document (if it exists) to reduce the size. For a batch with a single document, the maximum document size is 16 MB of JSON. If a document still exceeds 16 MB after its content field is truncated, it cannot be published without modifying the Groovy transformation file.
  • Maximum 1000 documents per batch of index uploads, merges, or deletes.
  • Maximum 32 fields in an $orderby clause.
  • Maximum field size for Filterable, Sortable, Facetable or Searchable fields is 32,766 bytes (32 KB minus 2 bytes) of UTF-8 encoded text.
  • It is not possible to remove all documents from an index directly, so if the index needs to be cleaned, deleting and re-creating it is the only way. The publisher never does this because it does not know how the schema is configured and therefore cannot re-create it. As a result, if multiple full crawls are executed, data from previous crawls will still exist in the index.
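
Several of the limits above can be checked before a batch is sent. The following is a minimal sketch, not the publisher's actual logic: all names are hypothetical, "letters" is assumed to mean ASCII letters, and the size estimate ignores the small overhead of the request envelope. It does not attempt the content-field truncation described above.

```python
import json
import re

MAX_BATCH_DOCS = 1000                 # documents per batch
MAX_BATCH_BYTES = 16 * 1024 * 1024    # 16 MB per batch

# Assumption: "letters" means ASCII letters only.
KEY_RE = re.compile(r"[A-Za-z0-9_\-=]+")
FIELD_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_key(key: str) -> bool:
    """Keys may contain letters, digits, underscore, dash, or equal sign."""
    return KEY_RE.fullmatch(key) is not None


def is_valid_field_name(name: str) -> bool:
    """Field names start with a letter, then letters, digits, or underscores."""
    return FIELD_RE.fullmatch(name) is not None


def doc_size(doc: dict) -> int:
    """Approximate serialized size of one document in UTF-8 bytes."""
    return len(json.dumps(doc).encode("utf-8"))


def split_batches(docs, max_docs=MAX_BATCH_DOCS, max_bytes=MAX_BATCH_BYTES):
    """Greedily group documents so each batch respects both limits.

    A single document larger than max_bytes still ends up alone in its
    own batch; the caller has to shrink or skip it.
    """
    batches, current, current_bytes = [], [], 0
    for doc in docs:
        size = doc_size(doc)
        too_many = len(current) >= max_docs
        too_big = current_bytes + size > max_bytes
        if current and (too_many or too_big):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(doc)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

For example, 2,500 small documents would be grouped into batches of 1,000, 1,000, and 500, while a handful of very large documents would be split earlier by the byte limit.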

More information at https://docs.microsoft.com/en-us/azure/search/search-limits-quotas-capacity and https://docs.microsoft.com/en-us/azure/search/search-create-index-portal

