Query complexity
#Overview
When working with GraphQL (GQL) queries, it's important to manage the complexity of your queries to ensure efficient and effective data retrieval.
In the context of GQL, query complexity refers to the computational resources needed to fulfill a query. The complexity of a query increases with the number of fields and the depth of the query.
- Scalar fields: Each scalar field in a query contributes one point to the query complexity.
- Relations / Unions: The complexity of a relation is multiplied by its level of nesting in the query.
For example, if a query retrieves a list of posts and each post has multiple comments, the complexity of the query increases with each nested comment.
This guide will help you with the following:
- How to split up your GQL queries to manage complexity
- How to optimize union queries
- How to use the complexity tree JSON output to calculate the cost of your queries
Some queries can be heavy on the database - especially due to heavy nesting or filtering - and in those cases, query splitting can help prevent them from failing.
The complexity tree shown here gives you information about the estimated cost of a query. If the JSON output shows values that are really high compared to the others, it might make sense to split the query.
While query complexity matters, the reason a query fails can also be due to the number of entries, the concurrency of the query in a short period of time, or the general load of the infrastructure. If your query fails and the complexity tree does not show any significantly high values, please contact support so they can help you investigate the reason and find a solution.
#Splitting GQL queries
To manage query complexity, you can split your GQL queries into smaller, more manageable parts:
Suggestion | Description |
---|---|
Limit the depth of your queries | Avoid deeply nested queries. Instead, break them up into multiple smaller queries. This can help reduce the complexity and make your queries more efficient. |
Fetch only necessary fields | Minimize the number of fields you're retrieving in each query. Only fetch the fields that are necessary for your current operation. |
Use pagination | Hygraph supports various arguments for paginating content entries. By using these features, you can manage the amount of data retrieved in each query, thereby reducing the complexity. |
Remember that the goal is to reduce the complexity of your queries to ensure efficient and effective data retrieval. By limiting the depth of your queries, fetching only necessary fields, and using pagination, you can manage the complexity of your GQL queries effectively.
The following examples show you how you can split your GQL queries:
#Example 1: Limiting query depth
Instead of a deeply nested query like this:

    {
      posts {
        id
        comments {
          id
          author
          replies {
            id
            text
            user {
              id
              name
            }
          }
        }
      }
    }

You can split it into two separate queries:
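As a sketch of one possible split: the first query fetches posts with only the IDs of their comments, and the second fetches the comment details for those IDs. The field names come from the nested query above; the plural `comments` query field, its `id_in` filter, and the helper names are assumptions for illustration.

```python
import json

# Query 1: posts with only the IDs of their comments (shallow nesting).
POSTS_QUERY = """
{
  posts {
    id
    comments {
      id
    }
  }
}
"""


def collect_comment_ids(posts_data):
    """Collect comment IDs from the response data of query 1."""
    return [c["id"] for post in posts_data["posts"] for c in post["comments"]]


def build_comments_query(comment_ids):
    """Build query 2, which fetches comment details for the given IDs.

    Assumes a plural `comments` query field that accepts an `id_in` filter.
    """
    ids = json.dumps(list(comment_ids))  # a JSON list of strings is a valid GraphQL list
    return f"""
{{
  comments(where: {{ id_in: {ids} }}) {{
    id
    author
    replies {{
      id
      text
      user {{
        id
        name
      }}
    }}
  }}
}}
"""
```

Neither query nests as deeply as the original, and the second one only runs for comments that actually exist.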
#Example 2: Fetching only necessary fields
Instead of retrieving all fields, like this:

    {
      post(where: { id: "..." }) {
        id
        title
        body
        author
        comments
      }
    }

You can retrieve only the necessary fields, like this:

    {
      post(where: { id: "..." }) {
        id
        title
      }
    }

#Example 3: Using pagination
Hygraph supports various arguments for paginating content entries:
- `first`: Seek forwards from the start of the result set.
- `last`: Seek backwards from the end of the result set.
- `skip`: Skip result set by a given amount.
- `before`: Seek backwards before a specific ID.
- `after`: Seek forwards after a specific ID.
You cannot combine `first` with `before`, or `last` with `after`.
Queries that fetch multiple entries return 10 results by default. You can provide a maximum of 100 to the `first` or `last` arguments.
- The limit of 10/100 applies to projects created after 14-06-2022.
- Projects created before that date have a limit of 100/1000.
- To learn more about this, check out our document on Pagination.
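The rules above can be checked on the client before a query is sent. The following is a minimal sketch; the function name and the dictionary shape for the arguments are hypothetical, and the default limit of 100 only applies to projects created after 14-06-2022.

```python
MAX_PAGE_SIZE = 100  # limit for projects created after 14-06-2022, per the text above


def validate_pagination(args, max_page_size=MAX_PAGE_SIZE):
    """Return a list of problems found in a dict of pagination arguments."""
    problems = []
    if "first" in args and "before" in args:
        problems.append("first cannot be combined with before")
    if "last" in args and "after" in args:
        problems.append("last cannot be combined with after")
    for name in ("first", "last"):
        if name in args and args[name] > max_page_size:
            problems.append(f"{name} exceeds the maximum of {max_page_size}")
    return problems
```

For an older project, pass `max_page_size=1000` to reflect the higher limit.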
You can use the `first`, `last`, `skip`, `before`, and `after` arguments with any nested relations. In the following example, the `posts` model has `comments`:

    {
      posts {
        id
        comments(first: 6, skip: 6) {
          id
          createdAt
        }
      }
    }

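To walk through a nested relation page by page, the `first` and `skip` values can be generated in a loop. This is a rough sketch, assuming the total number of comments to cover is already known; both helper names are hypothetical.

```python
def page_arguments(total, page_size=10):
    """Yield (first, skip) pairs that cover `total` entries page by page."""
    for skip in range(0, total, page_size):
        yield min(page_size, total - skip), skip


def comments_page_query(first, skip):
    """Build the nested query above for one page of comments."""
    return f"""
{{
  posts {{
    id
    comments(first: {first}, skip: {skip}) {{
      id
      createdAt
    }}
  }}
}}
"""
```

Calling `comments_page_query(6, 6)` reproduces the query shown above, which requests the second page of six comments.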
#Union queries
Union types let you set up relational fields that point to different model types. While this allows for very flexible content modelling, it can also lead to queries that perform poorly and need optimization. This section documents ways to optimize queries for content that is backed by a Union relation.
Unions are typically queried like so:

    {
      page(where: { id: "ckrks0ge0334m0b52onduq7r2" }) {
        id
        title
        blocks {
          __typename
          ... on Hero {
            title
            ctaLink
          }
          ... on Grid {
            title
            subtitle {
              markdown
            }
          }
          ... on Gallery {
            photos {
              url
              handle
            }
          }
        }
      }
    }

As schemas evolve and Union relations expand to many models, querying unions this way can become problematic, particularly when every possible type is queried in this format within the same query.
#Optimizing union queries
We offer two ways of optimizing your union queries:
- Enhanced Query Splitting with Entity Type (Preferred solution)
- Optimizing union queries using Node
#Enhanced query splitting with Entity type
This is the preferred solution, as it offers significantly better performance.
Hygraph has introduced an improved query splitting feature using the `Entity` type and `entities` query entrypoint.
This approach is particularly beneficial for handling complex union relationships and modular components.
This new feature reflects Hygraph's commitment to providing advanced solutions for handling complex GraphQL queries with ease and efficiency.
#Implementation
The `Entity` type provides a more streamlined approach compared to the traditional `Node` interface. It makes use of the typename to substantially increase performance.
To do this, follow these two steps:
Step 1: Initial query using Entity type
This initial query fetches `id` and `__typename` for each block within a page, preparing for the detailed query in the next step.

    query {
      page {
        id
        blocks {
          __typename
          ... on Entity {
            id
          }
        }
      }
    }

Step 2: Detailed query for specific types
The second query specifically targets `Hero`, `Grid`, and `Gallery` entities based on the `id` and `__typename` obtained from the first query. Results are returned in the order of the `where` input.

    query {
      entities(
        where: [
          { id: "ckrks0ge0334m0b52ienf67ag", typename: "Hero", stage: "DRAFT" }
          { id: "ckrks0ge0334m0b52firha74a", typename: "Grid", stage: "DRAFT" }
          { id: "ckrks0ge0334m0b52ifh2sd6a", typename: "Gallery", stage: "DRAFT" }
        ]
      ) {
        ... on Hero {
          id
          title
        }
        ... on Grid {
          id
          layout
        }
        ... on Gallery {
          id
          images
        }
      }
    }

Please note that `entities` is a top-level type.
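The `where` input for Step 2 can be derived mechanically from the Step 1 response. The sketch below assumes the response data has the shape implied by the Step 1 query (`page.blocks` with `__typename` and `id`); the function names are hypothetical.

```python
def entity_where_input(page_data, stage="DRAFT"):
    """Turn the Step 1 response data into the `where` list for `entities`."""
    return [
        {"id": block["id"], "typename": block["__typename"], "stage": stage}
        for block in page_data["page"]["blocks"]
    ]


def pair_with_results(where, entities):
    """Pair each requested entry with its result.

    Relies on results being returned in the order of the `where` input.
    """
    return list(zip(where, entities))
```

Because the result order follows the input order, the blocks can be put back in their page position without comparing IDs.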
#Benefits
Benefit | Description |
---|---|
Reduced Query Complexity | Simplifies queries by splitting them into manageable parts. |
Enhanced Performance | Improves efficiency by reducing the load in fetching complex data types. |
Flexible Data Fetching | Offers more control and precision in querying specific content types. |
#Example Use Case
Consider a website with a dynamic layout consisting of `Hero`, `Grid`, and `Gallery` sections. Enhanced query splitting with Entity type would allow for efficient identification and retrieval of specific content types, ensuring high performance and flexibility in data handling.
#Optimizing union queries using Node
To avoid the performance impact of a large number of Union types in a relation, you can change the way the content is queried and use a two-step approach.
Below we will be using the same query from the previous section as an example:
Step 1: Find out which documents are in fact connected
We will get the `__typename` and the `id` for all the connected documents in the union relation by using the `Node` interface like so:
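A sketch of what this step could look like, assuming the query mirrors the union query from the previous section with the per-type fragments replaced by a single fragment on `Node`. The query text is a reconstruction from the description above, and the grouping helper is hypothetical.

```python
from collections import defaultdict

# Assumed shape of the Step 1 query: only __typename and id per connected block.
STEP_1_QUERY = """
{
  page(where: { id: "ckrks0ge0334m0b52onduq7r2" }) {
    id
    title
    blocks {
      __typename
      ... on Node {
        id
      }
    }
  }
}
"""


def group_ids_by_typename(blocks):
    """Group block IDs by their __typename, preserving response order."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block["__typename"]].append(block["id"])
    return dict(groups)
```

Grouping the IDs by type prepares the `id_in` lists needed in Step 2.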
Step 2: Query the connected types by id
With the retrieved information, we can construct queries dynamically to fetch the affected documents. Based on the response to the query in Step 1, we generate another query that fetches only the connected documents by `id`:

    query heroBlocks {
      heros(where: { id_in: ["cks8t3o943h1l0d099v8xd072"] }) {
        title
        ctaLink
      }
    }

    query gridBlocks {
      grids(
        where: {
          id_in: ["cksj3dxww0o2r0c57savzceub", "cksrocxds3mwa0a07rdtj7qvx"]
        }
      ) {
        title
        subtitle {
          markdown
        }
      }
    }

    query galleryBlocks {
      galleries(where: { id_in: ["cks8t36i83iq70b6035caxp6n"] }) {
        photos {
          url
          handle
        }
      }
    }

Alternatively, you can combine these into a single query by using aliasing:

    query blocks {
      heroBlocks: heros(where: { id_in: ["cks8t3o943h1l0d099v8xd072"] }) {
        title
        ctaLink
      }
      gridBlocks: grids(
        where: {
          id_in: ["cksj3dxww0o2r0c57savzceub", "cksrocxds3mwa0a07rdtj7qvx"]
        }
      ) {
        title
        subtitle {
          markdown
        }
      }
      galleryBlocks: galleries(where: { id_in: ["cks8t36i83iq70b6035caxp6n"] }) {
        photos {
          url
          handle
        }
      }
    }

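Since the set of connected types changes from page to page, the aliased query is usually generated rather than written by hand. The following is a minimal sketch: the aliases, query fields, and selections are copied from the queries above, while the lookup table and function name are hypothetical.

```python
import json

# typename -> (alias, plural query field, selection), copied from the queries above.
BLOCK_QUERIES = {
    "Hero": ("heroBlocks", "heros", "title ctaLink"),
    "Grid": ("gridBlocks", "grids", "title subtitle { markdown }"),
    "Gallery": ("galleryBlocks", "galleries", "photos { url handle }"),
}


def build_blocks_query(ids_by_typename):
    """Build one aliased query covering only the types that are connected."""
    parts = []
    for typename, ids in ids_by_typename.items():
        if not ids:
            continue  # nothing connected for this type, so leave it out
        alias, field, selection = BLOCK_QUERIES[typename]
        id_list = json.dumps(ids)  # a JSON list of strings is a valid GraphQL list
        parts.append(
            f"  {alias}: {field}(where: {{ id_in: {id_list} }}) {{ {selection} }}"
        )
    if not parts:
        raise ValueError("no connected documents to query")
    return "query blocks {\n" + "\n".join(parts) + "\n}"
```

Types with no connected documents are left out entirely, which is the point of the two-step approach.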
#Complexity tree JSON output
The complexity tree JSON output provides a detailed breakdown of the estimated and actual costs of your GraphQL query. This information can help you understand the computational resources required to fulfill your query and guide you in optimizing your queries for better performance.
To get the complexity tree JSON for your query, you need to add the `"x-inspect-complexity": true` header to the playground.
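Outside the playground, the same header could be attached to a regular HTTP request. This sketch assumes the header is also honored on direct requests, which the text above does not state; the endpoint URL and function name are placeholders. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical endpoint; replace it with your project's API URL.
ENDPOINT = "https://example.invalid/graphql"


def build_complexity_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a POST request with the complexity header set."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "x-inspect-complexity": "true",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.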
#JSON Output
Here is a brief explanation of the keys in the JSON output:
- `total_estimated_docs`: The total number of documents estimated to be fetched by the query.
- `total_actual_docs`: The total number of documents actually fetched by the query.
- `total_estimated_cost`: The total estimated cost of the query. This includes the cost of fetching documents and any additional costs.
- `total_actual_cost`: The total actual cost of the query.
- `complexityTree`: A nested structure that breaks down the cost of each field in the query.
Each node in the `complexityTree` has the following keys:
- `field_name`: The name of the field in the query.
- `xpath`: The path to the field in the query.
- `estimated_no_of_docs`: The estimated number of documents fetched by this field.
- `additional_cost`: Any additional cost associated with this field.
- `estimated_cost`: The total estimated cost of this field (the sum of `estimated_no_of_docs` and `additional_cost`).
- `actual_no_of_docs`: The actual number of documents fetched by this field.
- `actual_cost`: The actual cost of this field.
- `children`: Any nested fields within this field. Each child is also a node with the same structure.
Nested objects multiply the estimated complexity by the pagination size (default 10, maximum 100).
This is important to keep in mind when dealing with nested queries, as they can significantly increase the complexity of your query.
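To see how quickly the estimate grows, the multiplication can be worked through for a chain of nested list fields. This is a simplified model, not the actual estimator: it assumes every level uses the same page size and a fixed additional cost of 2 per field, the value that appears in the example discussed later in this guide.

```python
def estimate_nested_cost(depth, page_size=10, additional_cost=2):
    """Estimate documents and cost for `depth` nested list fields.

    Each level multiplies the document estimate by `page_size`; the cost of a
    level is its document estimate plus `additional_cost`.
    """
    levels = []
    docs = 1
    for _ in range(depth):
        docs *= page_size
        levels.append(
            {"estimated_no_of_docs": docs, "estimated_cost": docs + additional_cost}
        )
    return {
        "levels": levels,
        "total_estimated_docs": sum(lv["estimated_no_of_docs"] for lv in levels),
        "total_estimated_cost": sum(lv["estimated_cost"] for lv in levels),
    }
```

With three levels and the default page size, the per-level estimates are 10, 100, and 1000 documents, for totals of 1110 documents and a cost of 1116. Raising the page size to 100 at the same depth gives 100, 10,000, and 1,000,000 documents.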
#JSON Output Example
Consider the following query and its related complexity tree JSON output:
This JSON output shows that the total estimated cost of the query is 1116, which includes fetching 1110 documents plus additional costs. However, because the query did not return any content in this example (there was no real content in the project), the actual costs and documents fetched are 0. Despite this, the query is still considered costly due to its nested structure, hence the high estimated cost.
The `complexityTree` provides a breakdown of the costs for each field in the query. For example, the `posts` field is estimated to fetch 10 documents with an additional cost of 2, resulting in an estimated cost of 12. Within the `posts` field, the `comments` field is estimated to fetch 100 documents with an additional cost of 2, resulting in an estimated cost of 102. The `authors` field within `comments` is estimated to fetch 1000 documents with an additional cost of 2, resulting in an estimated cost of 1002. This is because of the multiplication of nested fields that we mentioned before.
By examining the `complexityTree`, you can identify which fields contribute the most to the complexity of your query and optimize accordingly.
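Finding the most expensive fields can be automated by walking the tree. The sketch below assumes `complexityTree` is a single root node with the keys listed earlier and that `children` may be missing or empty on leaf nodes; the function names are hypothetical.

```python
def flatten_tree(node):
    """Yield every node in a complexity tree, parents before children."""
    yield node
    for child in node.get("children") or []:
        yield from flatten_tree(child)


def most_expensive_fields(tree, limit=3):
    """Return (xpath, estimated_cost) pairs sorted by descending estimated cost."""
    nodes = sorted(flatten_tree(tree), key=lambda n: n["estimated_cost"], reverse=True)
    return [(n["xpath"], n["estimated_cost"]) for n in nodes[:limit]]
```

Fields at the top of this list are the first candidates for splitting into a separate query or for a smaller page size.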
Keep in mind that nested objects multiply the estimated complexity by the pagination size, so be mindful of this when structuring your queries.