As mentioned earlier, GAIA API normalizes parameters and responses. This allows a single piece of code to send a search request through a connection whose type we don't know, to an engine we don't care about, and be confident it will work.

Making a Request

Most calls to the engine ask for specific parameters, such as the index, document or IDs, but the ones intended for querying ask for a SearchRequest. The same applies to the return types, which can be SearchResponse, bool, str, dict and, in most failure cases, ErrorResponse.

# Define the request; not all parameters are required
request: SearchRequest = SearchRequest(
    q=q,
    query=query,
    knn=knn,
    size=10,
    start=0,
    fetch_fields=['content', 'title'],
    exclude_fields=['date'],
    scroll=None,
    sort=SortEntry(field='title', order=SortOrder.ASC),
    default_operator=BoolOperation.OR,
    highlight=highlight,
    filters=filters,
    agg_filters=agg_filters,
    aggs=aggs,
)

# Execute a search with the engine
response = engine.search(index='my_index', data=request)

# Verify the response is not an ErrorResponse
if isinstance(response, ErrorResponse):
    raise ConnectionError(f'Error {response.status_code}: {response.error}')

# Profit (this snippet is assumed to run inside a function)
return response

Normalization Classes

To ease the process of creating generic requests, GAIA API provides several classes with well-defined parameters.

Search Request

This request is one of the parameters expected by the engine methods search, multi_search, async_search and knn_search. SearchRequest holds all the possible parameters these methods could need. How each parameter is applied depends on the engine-specific implementation, and not all parameters are shared among all the methods.

| Property | Description | Default | Type | Required |
| --- | --- | --- | --- | --- |
| q | Query String | "*" | string | No |
| query | Accepts the search query structure in the specific engine's format | | object | No |
| knn | Accepts the knn search structure in the specific engine's format | | object | No |
| rescore | The query rescorer executes a second query only on the Top-K results returned. Expected in the specific engine's format | | object | No |
| size | Defines the number of hits to return | | integer (minimum: 0) | No |
| start | Starting document offset. Needs to be non-negative | | integer (minimum: 0) | No |
| scroll | Period to retain the search context for scrolling | | string | No |
| sort | Engine string definition or generic SortEntry defining the sort order | | array of objects | No |
| fetch_fields | List of fields to return in the response based on field values | | array of strings | No |
| default_operator | The default operator for the query string query: AND or OR | "or" | string | No |
| highlight | Highlighters enable you to get highlighted snippets from one or more fields in your search results so you can show users where the query matches are | | object | No |
| filters | Tuple of engine-specific filters. The tuple must contain optional_negated_filters, optional_filters, required_negated_filters, required_filters | | SearchFilters | No |
| agg_filters | Tuple of engine-specific filters. The tuple must contain optional_negated_filters, optional_filters, required_negated_filters, required_filters | | object | No |
| aggs | List of engine-specific aggregation implementations | | object | No |
| exclude_fields | List of fields to exclude in the response based on field values | | array of strings | No |

If you look closely, you will notice that several parameters are plain object types, which opens up a wide variety of values they can receive; all of these options are expected in the engine-specific format. This was done on purpose, so the developer can tweak the queries before sending them to the engine. To compensate for this non-agnostic approach, GAIA API provides agnostic tools. The approach offers:

  • Flexibility and Customization: The use of simple object types as parameters allows for a wide range of options, enabling developers to customize queries to their specific needs.

  • Pre-Engine Query Tweaking: Developers have greater control over their queries, adjusting parameters to achieve the desired search results, optimize the search experience, and make the most of the engine's capabilities.

  • Agnostic Tools for Compatibility: Agnostic tools bridge different search engines, ensuring code compatibility and easy integration without major modifications.

Prepare Data

For Q, Query and KNN

For the string parameter q, one available approach is PyQPL, a library that can translate a query string into engine-specific queries. Here is an example:

from app.rest import connection_manager
from pyqpl.parser import QPLParser
from pyqpl.qpl import QPLOptions
from pyqpl.translator import ElasticsearchTranslator, OpensearchTranslator
from models.engines import EngineTypes

# Get the engine by name through the connection_manager
engine = connection_manager.get_engine(name=engine_name)

# Based on the engine type, select the QPL translator to use
if engine.engine_type is EngineTypes.ELASTIC:
    translator = ElasticsearchTranslator()
elif engine.engine_type is EngineTypes.OPENSEARCH:
    translator = OpensearchTranslator()
else:
    # Fail early instead of reaching an undefined translator below
    raise ValueError(f'No QPL translator for engine type {engine.engine_type}')

# Create the QPL parser based on the provided configuration
qpl_parser = QPLParser(options=qpl_config)

# Use the parser to parse the string in the parameter q
qpl_query = qpl_parser.parse_query(data=q)
# Use the translator to transform the QPLQuery from the parser into an engine-specific format
query = translator.to_engine_query(qpl_query)

This method gives you access to the QPLQuery, which can be manipulated (for more information, check the PyQPL space), and to the engine-specific query, which you can modify, inject into or remove from as desired. The query built from q can then be used in the query or knn parameters.
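
The translator selection by engine type can also be expressed as a lookup table, which avoids growing an if/elif chain as engines are added. The sketch below is self-contained and uses stand-ins only: EngineKind, FakeElasticTranslator, FakeOpensearchTranslator and pick_translator are hypothetical names, and the fake translators do not reproduce what the PyQPL translators return.

```python
from enum import Enum
from typing import Callable, Dict


class EngineKind(Enum):
    """Stand-in for EngineTypes; members and values are assumptions."""
    ELASTIC = "elastic"
    OPENSEARCH = "opensearch"
    OTHER = "other"  # hypothetical engine without a translator


class FakeElasticTranslator:
    """Stand-in translator; only tags the query with the engine name."""
    def to_engine_query(self, qpl_query: str) -> dict:
        return {"engine": "elastic", "query": qpl_query}


class FakeOpensearchTranslator:
    """Stand-in translator; only tags the query with the engine name."""
    def to_engine_query(self, qpl_query: str) -> dict:
        return {"engine": "opensearch", "query": qpl_query}


# Registry mapping each supported engine kind to a translator factory.
TRANSLATORS: Dict[EngineKind, Callable[[], object]] = {
    EngineKind.ELASTIC: FakeElasticTranslator,
    EngineKind.OPENSEARCH: FakeOpensearchTranslator,
}


def pick_translator(kind: EngineKind):
    """Return a new translator for the engine kind, or fail with a clear error."""
    try:
        return TRANSLATORS[kind]()
    except KeyError:
        raise ValueError(f"No translator registered for {kind}") from None
```

The calling code then only needs pick_translator(kind).to_engine_query(...), and an unsupported engine fails early with a clear message instead of an undefined variable.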

For Highlight

Generating the highlight is as simple as asking the engine that will receive the request to build the highlight structure with the function generate_highlight.

from typing import Any

from app.rest import connection_manager
from models.engines import EngineTypes
from models.engines import HighlightConfig

# Get the engine by name through the connection_manager
engine = connection_manager.get_engine(name=engine_name)

# Convert the HighlightConfig (highlight_config) to the engine-specific structure
engine_highlight: Any = engine.generate_highlight(highlight_config)

For Filters

As with highlight, you can ask the engine to generate the filters (this is why the class FilterFactory is important) using the function generate_filters. This function receives a list_of_filters, which is composed of DynamicFilter objects or their JSON representations.

This generates a SearchFilters object, which can be merged with other SearchFilters; this way you can combine filters from different sources into a single SearchFilters.

from app.rest import connection_manager
from models.engines import EngineTypes
from models.engines import FilterConfig, SearchFilters
from framework.filters import DynamicFilter

# Get the engine by name through the connection_manager
engine = connection_manager.get_engine(name=engine_name)

# Convert a list of DynamicFilter to engine-specific filters, in the SearchFilters container
engine_filters: SearchFilters = engine.generate_filters(list_of_filters)

# Additionally, you can merge filters from different sources
engine_filters = SearchFilters.merge_filters(engine_filters, engine.generate_filters(filters_from_request))

For Aggregations

Repeating the pattern, you can also ask the engine to generate the aggregations (this is why the class AggregationFactory is important) using the function generate_aggregations. This function receives a list_of_aggregations, which is composed of DynamicAgg objects. Selected aggregations (what happens when an aggregation is selected from the UI) are represented as a list of SelectionAgg objects or their JSON representations. This list can be transformed into filters using the function generate_aggregation_filters, which receives the list of aggregations and the list of selected aggregations.

from typing import Dict

from app.rest import connection_manager
from models.engines import EngineTypes
from framework.aggregations import DynamicAgg
from framework.aggregations.utils import SelectionAgg

# Get the engine by name through the connection_manager
engine = connection_manager.get_engine(name=engine_name)

# Convert a list of DynamicAgg to engine-specific aggregations
aggregations: Dict = engine.generate_aggregations(list_of_aggregations)

# Convert the selected aggregations into filters
applied_filters = engine.generate_aggregation_filters(list_of_aggregations, list_of_selected_aggregations)

ErrorResponse

Default response when a call to the engine returns an error.

| Property | Description | Default | Type | Required |
| --- | --- | --- | --- | --- |
| status_code | Status code to be returned | 500 | integer | No |
| error | An object describing the encountered error | | dict | Yes |
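
Since most engine calls can return either a normal value or an ErrorResponse, a small helper that raises on errors keeps the calling code short. The following is a minimal sketch under stated assumptions: ErrorResponseSketch only mirrors the two properties in the table and is not the GAIA class, and unwrap is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Dict, TypeVar, Union

T = TypeVar("T")


@dataclass
class ErrorResponseSketch:
    """Hypothetical stand-in with the two documented properties."""
    error: Dict[str, Any]      # required
    status_code: int = 500     # documented default


def unwrap(response: Union[T, ErrorResponseSketch]) -> T:
    """Return the response unchanged, or raise if it is an error response."""
    if isinstance(response, ErrorResponseSketch):
        raise ConnectionError(f"Error {response.status_code}: {response.error}")
    return response
```

A call such as unwrap(engine.search(...)) would then either return the successful value or raise, so the isinstance check does not need to be repeated at every call site.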

SortEntry

SortEntry represents a sort option in which field and order define the sort. The display_name is only required for UI configuration and is not needed for the engine request process.

| Property | Description | Type | Required |
| --- | --- | --- | --- |
| field | Name of the field which will be used for sorting | string | Yes |
| display_name | Display name for this sort entry. Only applicable for the user interface | string | No |
| order | Sort order to be used. It can be one of the predefined SortOrder values or a custom object with additional properties | SortOrder | Yes |

SortOrder

An enumeration representing the order of a SortEntry.

| Value | Description |
| --- | --- |
| asc | Ascending sort order |
| desc | Descending sort order |
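
As an illustration of how the two structures fit together, here is a minimal sketch with stand-in classes. SortOrderSketch, SortEntrySketch and to_sort_clause are hypothetical names, and the Elasticsearch-style output of to_sort_clause is only an assumption about how an engine might consume a sort entry, not GAIA API behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SortOrderSketch(str, Enum):
    """Stand-in for SortOrder with the two documented values."""
    ASC = "asc"
    DESC = "desc"


@dataclass
class SortEntrySketch:
    """Stand-in for SortEntry with the documented properties."""
    field: str
    order: SortOrderSketch
    display_name: Optional[str] = None  # UI only; ignored when building the clause


def to_sort_clause(entry: SortEntrySketch) -> dict:
    """Build an Elasticsearch-style sort clause (assumed target format)."""
    return {entry.field: {"order": entry.order.value}}
```

For example, to_sort_clause(SortEntrySketch(field="title", order=SortOrderSketch.ASC)) produces a clause keyed by the field name, and display_name never reaches the engine.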

Search Filters

SearchFilters is a normalization structure that holds all the filters to be applied in the search engine and categorizes each filter by behavior.

| Property | Description | Default | Type | Required |
| --- | --- | --- | --- | --- |
| should_not | Filter which will act as a boolean operator NOR | [] | array of objects | No |
| should | Filter which will act as a boolean operator OR | [] | array of objects | No |
| must_not | Filter which will act as a boolean operator NAND | [] | array of objects | No |
| must | Filter which will act as a boolean operator AND | [] | array of objects | No |
| post_should_not | Filter which will act as a boolean operator NOR | [] | array of objects | No |
| post_should | Filter which will act as a boolean operator OR | [] | array of objects | No |
| post_must_not | Filter which will act as a boolean operator NAND | [] | array of objects | No |
| post_must | Filter which will act as a boolean operator AND | [] | array of objects | No |
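
The merging of several SearchFilters can be pictured as concatenating each category list. The sketch below assumes exactly that behavior; SearchFiltersSketch is a hypothetical stand-in, and the real merge_filters may handle duplicates or ordering differently.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List

FilterList = List[Dict[str, Any]]


@dataclass
class SearchFiltersSketch:
    """Stand-in with the eight documented categories, all defaulting to []."""
    should_not: FilterList = field(default_factory=list)
    should: FilterList = field(default_factory=list)
    must_not: FilterList = field(default_factory=list)
    must: FilterList = field(default_factory=list)
    post_should_not: FilterList = field(default_factory=list)
    post_should: FilterList = field(default_factory=list)
    post_must_not: FilterList = field(default_factory=list)
    post_must: FilterList = field(default_factory=list)

    @staticmethod
    def merge_filters(*parts: "SearchFiltersSketch") -> "SearchFiltersSketch":
        """Concatenate every category across the inputs (assumed semantics)."""
        merged = SearchFiltersSketch()
        for part in parts:
            for f in fields(SearchFiltersSketch):
                getattr(merged, f.name).extend(getattr(part, f.name))
        return merged
```

Because the result is a new object with its own lists, the inputs are left untouched and can be reused for other requests.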

SelectionAgg

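SelectionAgg is a normalization structure that holds the values selected, or negated, for a given aggregation, for example when a user selects an aggregation in the UI. A list of SelectionAgg can be transformed into filters with generate_aggregation_filters.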

| Property | Description | Default | Type | Required |
| --- | --- | --- | --- | --- |
| id | Aggregation Id | | string | Yes |
| negated | List of values or ranges, depending on the aggregation, selected to filter by exclusion | [] | array | No |
| values | List of values or ranges, depending on the aggregation, selected to filter by | [] | array | No |
| level | Level of Deepness | 0 | integer | No |


Example Configuration:

{ 
    "id": "", 
    "negated": [], 
    "values": [], 
    "level": 0 
}
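
Because a selection may arrive as JSON, for instance from the UI, it can help to see how such a payload maps onto the documented properties. This is a minimal sketch: SelectionAggSketch and parse_selection are hypothetical names, and the "category" and "books" values used below are made up.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class SelectionAggSketch:
    """Stand-in for SelectionAgg with the documented properties and defaults."""
    id: str                                          # required
    negated: List[Any] = field(default_factory=list)
    values: List[Any] = field(default_factory=list)
    level: int = 0


def parse_selection(raw: str) -> SelectionAggSketch:
    """Parse a JSON object into the stand-in; missing optional keys use defaults."""
    return SelectionAggSketch(**json.loads(raw))
```

Calling parse_selection('{"id": "category", "values": ["books"]}') yields an object whose negated list is empty and whose level is 0, matching the defaults in the table.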