What is QPL?

The Query Processing Language (QPL) is a scripting language that makes it easy to construct very complex queries.

For example:

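// Match any of the requested markets (values separated by semicolons or whitespace)
marketsQuery = or(field("markets", split(solr.markets, "[;\\s]+")));
// Match any of the requested sources
sourcesQuery = or(field("sources", split(solr.sources, "[;\\s]+")));
// Tokenize the user's query text
userTerms = solr.tokenize(query);
// Exact phrase matches are boosted above all-terms matches
phraseQuery = phrase(userTerms)^1.5;
andQuery = and(userTerms)^1.0;

// Require a phrase or all-terms match, restricted to the markets and sources
return and(or(phraseQuery,andQuery), marketsQuery, sourcesQuery);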

The QPL product is not part of the Aspire Content Processing product and is licensed under a separate agreement.


Why QPL?

Complex queries have usually been constructed either by custom search engine plug-ins or by string manipulation in the user interface. To date, there has been no consistent method for constructing queries.

QPL provides many advantages:

  • It's simple, flexible, compact, and powerful
  • Query construction logic is moved out of the user interface

    It is best if queries are constructed by the search engine team, who understand the index structure, rather than by user interface developers.

  • It is easily extensible
  • It can be used to generate queries for any search engine (see below)
  • It can access external information to build queries, including:
    • Relational databases
    • External files
    • Synonym lists
    • Results of other queries

Remember that QPL is a complete scripting language. This means that you can use if-tests, loops, maps, external information, etc. to build your queries, so a single script can adapt the query it constructs to a wide range of situations.

Why Not Query Pipelines?

The previous method for building queries at Search Technologies was Query Pipelines. However, these pipelines were rarely used in practice.

The problems we discovered with Query Pipelines include:

  • Query pipelines are hard to understand

    Breaking query construction into a series of processing stages makes less sense for queries than it does for, say, document processing.

  • Most stages want to operate on a portion of the query, and not the whole thing
  • Fundamentally, query pipelines are the wrong paradigm for query construction, because they do not reflect the hierarchical structure of complex query expressions

QPL is fundamentally hierarchical, and constructs complex hierarchical expressions as a natural outcome of the script. This makes it the right tool for the job.

QPL is Search Engine Agnostic

QPL itself is not tied to any particular search engine. The operators and structures that QPL creates (for example, and(), or(), phrase(), and constant()) are likewise independent of any specific search engine implementation.

This means that the standard result of QPL is a search-engine-independent query expression. This query expression is then "built" into a search-engine specific representation by a "search engine builder".

Current search engine builders:

  • Lucene
  • FAST
  • Elasticsearch

More builders for other engines (Google GSA, Amazon CloudSearch, etc.) will be available in the future.
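
The split between an engine-independent expression and an engine-specific builder can be illustrated with a small sketch. This is Python, not QPL, and every name in it (Term, Phrase, And, Or, build_lucene_syntax, build_elasticsearch_dsl) is hypothetical; it does not show the actual QPL builder API, only the general idea of rendering one operator tree into two target representations.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical engine-independent operator tree (not the real QPL classes).
@dataclass(frozen=True)
class Term:
    field: str
    value: str


@dataclass(frozen=True)
class Phrase:
    field: str
    words: Tuple[str, ...]


@dataclass(frozen=True)
class And:
    clauses: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    clauses: Tuple["Node", ...]


Node = Union[Term, Phrase, And, Or]


def build_lucene_syntax(node: Node) -> str:
    """Render the tree as a Lucene-style query string (no escaping; sketch only)."""
    if isinstance(node, Term):
        return f"{node.field}:{node.value}"
    if isinstance(node, Phrase):
        text = " ".join(node.words)
        return f'{node.field}:"{text}"'
    joiner = " AND " if isinstance(node, And) else " OR "
    return "(" + joiner.join(build_lucene_syntax(c) for c in node.clauses) + ")"


def build_elasticsearch_dsl(node: Node) -> dict:
    """Render the same tree as an Elasticsearch-style query dictionary."""
    if isinstance(node, Term):
        return {"term": {node.field: node.value}}
    if isinstance(node, Phrase):
        return {"match_phrase": {node.field: " ".join(node.words)}}
    children = [build_elasticsearch_dsl(c) for c in node.clauses]
    if isinstance(node, And):
        return {"bool": {"must": children}}
    return {"bool": {"should": children, "minimum_should_match": 1}}
```

The tree is constructed once, and each builder walks it independently, so supporting another engine means adding another builder function rather than changing how queries are constructed.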

Use Cases

QPL for Processing User Interface Inputs

User interfaces may have many inputs that affect the query, such as:

  • Geographic location
  • Location on the website
  • Security information (user name, roles, groups, classification level, etc.)
  • User interface widgets (select boxes, pulldowns, advanced-search, etc.)
  • What source or sources are being searched
  • Type of search and for what purpose

In most systems, these inputs are converted into query structures by ad hoc code and string manipulation scattered through the user interface.

Instead, we propose to send these inputs (sometimes called "signals") directly from the user interface to the search services layer. QPL will then be responsible for converting the UI inputs into queries.

This architecture has several advantages:

  1. It reduces the burden on the user interface developer.
    • UI developers do not need to learn how to construct queries.
  2. Queries can be easily modified for performance and accuracy.
    • QPL scripts are easy to modify and do not require a new deployment.
    • Modifications to QPL do not require any changes to any user interface code.
  3. Queries can be built by the search engine team.
    • This is important because queries typically need to be coordinated with what is indexed.
    • Since the search team understands the structure of the indexes, they are the best ones to create the queries as well.
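
As a rough illustration of this architecture, the sketch below shows a search-services function that receives raw UI signals and turns them into a query tree. It is written in Python rather than QPL, and the signal names ("query", "markets", "sources"), the "text" field, and the tuple-based tree format are all assumptions made for the example.

```python
import re
from typing import Dict, List


def build_query(signals: Dict[str, str]) -> tuple:
    """Convert raw UI signals into a nested-tuple operator tree.

    Node shapes (hypothetical): ("term", field, value),
    ("phrase", field, [words]), ("and" | "or", [children]).
    """
    clauses: List[tuple] = []

    # User text: match either the exact phrase or all of the individual terms.
    words = signals.get("query", "").split()
    if words:
        clauses.append(("or", [
            ("phrase", "text", words),
            ("and", [("term", "text", w) for w in words]),
        ]))

    # Multi-valued restrictions arrive as one string, split on ';' or whitespace.
    for field in ("markets", "sources"):
        values = [v for v in re.split(r"[;\s]+", signals.get(field, "")) if v]
        if values:
            clauses.append(("or", [("term", field, v) for v in values]))

    return ("and", clauses)
```

The user interface only forwards a dictionary such as {"query": "interest rates", "markets": "us;uk"}; how those values map onto index fields is decided entirely on the search side.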

QPL for Complex Relevancy Ranking

Sometimes customers will require a very specific ordering of results. For example, all documents from category A (with ordered sub-categories), then category B (with ordered sub-categories), etc.

When these categories are based on user queries and/or other user inputs (locations, etc.) this may be best handled with complex query constructions and careful query weighting.

Many search engines provide operators that support this kind of complex results ordering. For example, FAST provides the XRANK function, and Solr (through Lucene) provides query types such as ConstantScoreQuery, DisjunctionMaxQuery, and BooleanQuery.

QPL makes it much simpler to create these sorts of complex query structures. It allows for and encourages careful results ordering.
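
One way to picture the category ordering described above is to give each (category, sub-category) tier its own constant score, highest first. The Python sketch below generates such a query as a string. It assumes the constant-score "^=" syntax of the Solr standard query parser, and the field names "category" and "subcategory" are hypothetical; it is not how QPL itself expresses this.

```python
from typing import List, Tuple


def tiered_query(tiers: List[Tuple[str, str]]) -> str:
    """Build an OR of constant-score clauses, one per tier, in priority order.

    tiers: (category, sub-category) pairs, most important first.
    The first tier gets the highest constant score, the last tier gets 1.0.
    """
    total = len(tiers)
    clauses = []
    for rank, (category, sub) in enumerate(tiers):
        weight = float(total - rank)
        clauses.append(f"(category:{category} AND subcategory:{sub})^={weight}")
    return " OR ".join(clauses)
```

This only yields a strict ordering if each document falls into exactly one tier. Documents matching several tiers would accumulate scores, and mixing in a relevance-scored user query would require the tier weights to be spaced far enough apart to dominate it.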

Federation

Finally, QPL is an important step for creating a query federation framework. This would work as follows:

  1. The query is received from the search application.
  2. The query is converted to a QPL operator tree.
  3. The query is enhanced using QPL as needed.
  4. The resulting QPL query is then built and sent to each of the participating search engines.
  5. The results from all of the engines are then merged and presented back to the user.

Because QPL structures are search-engine independent, QPL is well suited to serve as the intermediate query language in a federated search. The builder plug-in architecture makes it easy to translate a QPL query for each of the search engines that need to receive the search.
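
The five federation steps can be sketched end to end as follows. This is a Python illustration under stated assumptions, not the QPL federation framework: parse, enhance, the two builders, and the engine tuples are hypothetical, the search functions are supplied by the caller, and results are merged naively by raw score even though scores from different engines are generally not comparable.

```python
from typing import Callable, Dict, List, Tuple

Node = tuple  # ("term", field, value) or ("and" | "or", [child, ...])
Hit = Dict[str, object]
# (engine name, builder for that engine, function that runs the native query)
Engine = Tuple[str, Callable[[Node], object], Callable[[object], List[Hit]]]


def parse(user_query: str) -> Node:
    """Step 2: convert the raw query string into an operator tree."""
    return ("and", [("term", "text", w) for w in user_query.split()])


def enhance(tree: Node, source: str) -> Node:
    """Step 3: add a restriction that every engine should apply."""
    return ("and", [tree, ("term", "sources", source)])


def to_query_string(node: Node) -> str:
    """Hypothetical builder producing a Lucene-style query string."""
    if node[0] == "term":
        return f"{node[1]}:{node[2]}"
    joiner = " AND " if node[0] == "and" else " OR "
    return "(" + joiner.join(to_query_string(c) for c in node[1]) + ")"


def to_bool_dict(node: Node) -> dict:
    """Hypothetical builder producing an Elasticsearch-style bool query."""
    if node[0] == "term":
        return {"term": {node[1]: node[2]}}
    occur = "must" if node[0] == "and" else "should"
    return {"bool": {occur: [to_bool_dict(c) for c in node[1]]}}


def federate(user_query: str, source: str, engines: List[Engine]) -> List[Hit]:
    """Steps 1-5: receive, parse, enhance, build per engine, then merge."""
    tree = enhance(parse(user_query), source)
    merged: List[Hit] = []
    for name, build, search in engines:
        for hit in search(build(tree)):  # step 4: build and send
            merged.append({**hit, "engine": name})
    # Step 5: naive merge by raw score (a real system would normalize scores).
    return sorted(merged, key=lambda h: h["score"], reverse=True)
```

Each engine sees only its own native query, while the parsing and enhancement logic is written once against the shared tree.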

Road Map

Features of QPL that are anticipated in the future:

  • Packaging other plug-ins for Solr and Elasticsearch
    • Lemmatization
    • XML Search
    • Security Result Trimming