JSON Query

A small, flexible, and expandable JSON query language.

Documentation

JSON Query is a small, flexible, and expandable query language. It comes with a human-friendly Text Format and an easy-to-parse, intermediate JSON Format.

Getting started

To get started with JSON Query, select one of the implementations and read the corresponding documentation on how to use the library. Typically, a JSON Query library contains a main function where the input is the data and a query, and the output is the query result:

output = jsonquery(data, query)

Besides a query function, the library likely has functions to convert between the JSON Format and Text Format of JSON Query.

Syntax

The jsonquery language looks quite similar to JavaScript and other JSON query languages. This makes it easy to learn. When writing a query, you compose a “pipe” or a “chain” of operations to be applied to the data. It resembles chaining like in Lodash or just in JavaScript itself using methods like map and filter.

Queries can be written in a plain Text Format which is compact and easy to read for humans. The Text Format is parsed into an intermediate JSON Format which is easy to operate on programmatically. This JSON Format is executed by the query engine.

The Text Format has functions, operators, property getters, pipes to execute multiple queries in series, and objects to execute multiple queries in parallel or transform the input. For example:

filter(.age >= 18) | sort(.age)

The Text Format can be converted (back and forth) into a JSON Format consisting purely of composed function calls. A function call is described by an array containing the function name followed by its arguments, like [name, arg1, arg2, ...]. Here is the JSON equivalent of the previous example:

[
  "pipe",
  ["filter", ["gte", ["get", "age"], 18]],
  ["sort", ["get", "age"]]
]

The JSON Format is mostly used under the hood. It allows for easy integrations like a GUI or executing the query in a different environment or language without having to implement a parser for the Text Format. Read more in the JSON Format section.

Text Format

The following table gives an overview of the JSON Query Text Format:

Type      Syntax                                 Example
--------  -------------------------------------  -----------------------------------
Function  name(argument1, argument2, ...)        sort(.age, "asc")
Operator  (left operator right)                  filter(.age >= 18)
Pipe      query1 | query2 | ...                  sort(.age) | pick(.name, .age)
Object    { prop1: query1, prop2: query2, ... }  { names: map(.name), total: sum() }
Array     [ item1, item2, ... ]                  [ "New York", "Atlanta" ]
Property  .prop1                                 .age
          .prop1.prop2                           .address.city
          ."prop1"                               ."first name"
          get("prop1", "prop2")                  get("address", "city")
          get()                                  get()
String    "string"                               "Hello world"
Number    A floating point number                2.4
Boolean   true or false                          true
null      null                                   null

The syntax is explained in detail in the following sections. The examples are based on querying the following data:

[
  { "name": "Chris", "age": 23, "address": { "city": "New York" } },
  { "name": "Emily", "age": 19, "address": { "city": "Atlanta" } },
  { "name": "Joe", "age": 32, "address": { "city": "New York" } },
  { "name": "Kevin", "age": 19, "address": { "city": "Atlanta" } },
  { "name": "Michelle", "age": 27, "address": { "city": "Los Angeles" } },
  { "name": "Robert", "age": 45, "address": { "city": "Manhattan" } },
  { "name": "Sarah", "age": 31, "address": { "city": "New York" } }
]

Functions

Function calls have the same syntax as in most programming languages:

name(argument1, argument2, ...)

The following example sorts the data in ascending order by the property age.

sort(.age, "asc")

It is important to understand that functions are executed as methods in a chain: the sorting is applied to the input data, and the result is forwarded to the next method in the chain (if any). The following example first filters the data and then sorts it:

filter(.age >= 21) | sort(.age, "asc")
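As a rough illustration of this chaining, the same query can be written as two steps in plain Python. This is only a sketch of the semantics described above, not how any JSON Query library is implemented; the variable names are made up for the example.

```python
# Plain-Python equivalent of: filter(.age >= 21) | sort(.age, "asc")
data = [
    {"name": "Chris", "age": 23, "address": {"city": "New York"}},
    {"name": "Emily", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Joe", "age": 32, "address": {"city": "New York"}},
    {"name": "Kevin", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Michelle", "age": 27, "address": {"city": "Los Angeles"}},
    {"name": "Robert", "age": 45, "address": {"city": "Manhattan"}},
    {"name": "Sarah", "age": 31, "address": {"city": "New York"}},
]

# filter(.age >= 21): keep only the items that satisfy the condition.
filtered = [item for item in data if item["age"] >= 21]

# sort(.age, "asc"): the filtered list is the input of the next step.
result = sorted(filtered, key=lambda item: item["age"])

names = [item["name"] for item in result]
print(names)  # ['Chris', 'Michelle', 'Sarah', 'Joe', 'Robert']
```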

See the Function reference page for a detailed overview of all available functions and operators.

Operators

JSON Query supports all basic operators. Operators must be wrapped in parentheses (...), must have both a left and right hand side, and do not have precedence since parentheses are required. The syntax is:

(left operator right)

The following example tests whether a property age is greater than or equal to 18:

(.age >= 18)

Operators are for example used to specify filter conditions:

filter(.age >= 18)

When composing multiple operators, it is necessary to use parentheses:

filter((.age >= 18) and (.age <= 65))
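The sketch below shows why explicit parentheses remove the need for precedence rules: when every operator is a nested (left, operator, right) group, there is only one way to evaluate the expression. The tuple representation, the OPERATORS table, and the sample ages are assumptions made for this illustration and are not part of JSON Query.

```python
import operator

# Hypothetical operator table for illustration; it covers only a few operators.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "and": lambda left, right: bool(left) and bool(right),
}


def evaluate(expression, item):
    """Evaluate a nested (left, operator, right) tuple against one item."""
    if isinstance(expression, tuple):
        left, op, right = expression
        return OPERATORS[op](evaluate(left, item), evaluate(right, item))
    if isinstance(expression, str) and expression.startswith("."):
        return item[expression[1:]]  # property getter such as ".age"
    return expression  # a constant such as 18


# (.age >= 18) and (.age <= 65), with the grouping spelled out by nesting.
condition = ((".age", ">=", 18), "and", (".age", "<=", 65))

people = [{"age": 12}, {"age": 30}, {"age": 70}]  # hypothetical sample data
matches = [person for person in people if evaluate(condition, person)]
print(matches)  # [{'age': 30}]
```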

See the Function reference page for a detailed overview of all available functions and operators.

Pipes

A pipe is a series of multiple query operations separated by a pipe character |. The syntax is:

query1 | query2 | ...

The entries in the pipeline are executed one by one, and the output of the first is the input for the next. The following example will first filter the items of an array that have a nested property city in the object address with the value "New York", and next, sort the filtered items by the property age:

filter(.address.city == "New York") | sort(.age)
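A pipe behaves like left-to-right function composition. The following sketch models that in Python; the helper pipe is a hypothetical name for this example, and the two steps are hand-written stand-ins for filter and sort.

```python
from functools import reduce


def pipe(*steps):
    """Chain steps left to right: each step receives the previous step's output."""
    return lambda data: reduce(lambda value, step: step(value), steps, data)


data = [
    {"name": "Chris", "age": 23, "address": {"city": "New York"}},
    {"name": "Emily", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Joe", "age": 32, "address": {"city": "New York"}},
    {"name": "Kevin", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Michelle", "age": 27, "address": {"city": "Los Angeles"}},
    {"name": "Robert", "age": 45, "address": {"city": "Manhattan"}},
    {"name": "Sarah", "age": 31, "address": {"city": "New York"}},
]

# Equivalent of: filter(.address.city == "New York") | sort(.age)
query = pipe(
    lambda items: [item for item in items if item["address"]["city"] == "New York"],
    lambda items: sorted(items, key=lambda item: item["age"]),
)

names = [item["name"] for item in query(data)]
print(names)  # ['Chris', 'Sarah', 'Joe']
```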

Objects

An object is defined as a regular JSON object with a property name as key, and query as value. Objects can be used to transform data or to execute multiple queries in parallel.

{ prop1: query1, prop2: query2, ... }

The following example will transform the data by mapping over the items of the array and creating a new object with properties firstName and city for every item:

map({
  firstName: .name,
  city: .address.city
})

The following example runs multiple queries in parallel. It outputs an object with properties names, count, and averageAge containing the results of their query: a list with names, the total number of array items, and the average value of the properties age in all items:

{
  names: map(.name),
  count: size(),
  averageAge: map(.age) | average()
}
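Running queries "in parallel" here means that every property's query receives the same input. A minimal Python sketch of that idea, assuming a hypothetical helper evaluate_object and hand-written stand-ins for map, size, and average:

```python
def evaluate_object(queries, data):
    """Run every query against the same input and collect the results by key."""
    return {key: query(data) for key, query in queries.items()}


data = [
    {"name": "Chris", "age": 23, "address": {"city": "New York"}},
    {"name": "Emily", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Joe", "age": 32, "address": {"city": "New York"}},
    {"name": "Kevin", "age": 19, "address": {"city": "Atlanta"}},
    {"name": "Michelle", "age": 27, "address": {"city": "Los Angeles"}},
    {"name": "Robert", "age": 45, "address": {"city": "Manhattan"}},
    {"name": "Sarah", "age": 31, "address": {"city": "New York"}},
]

queries = {
    "names": lambda items: [item["name"] for item in items],  # map(.name)
    "count": len,  # size()
    "averageAge": lambda items: sum(item["age"] for item in items) / len(items),
}

result = evaluate_object(queries, data)
print(result["count"], result["averageAge"])  # 7 28.0
```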

A property can be left unquoted when it contains only the characters a-z, A-Z, _, and $, plus the digits 0-9 anywhere except in the first position. When the property contains other characters, like spaces, it needs to be enclosed in double quotes and escaped like a JSON key:

{
  "first name": map(.name)
}
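The quoting rule can be expressed as a regular expression. The sketch below is one possible reading of the rule as stated above; the names UNQUOTED_NAME and format_key are hypothetical, and json.dumps is used only because the text says quoted names are escaped like JSON keys.

```python
import json
import re

# Letters, _ and $ anywhere; digits anywhere except the first position.
UNQUOTED_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def format_key(name):
    """Return the name as-is when it may be unquoted, else as a JSON string."""
    if UNQUOTED_NAME.fullmatch(name):
        return name
    return json.dumps(name)


print(format_key("firstName"))   # firstName
print(format_key("first name"))  # "first name"
print(format_key("2nd"))         # "2nd"
```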

Arrays

Arrays are defined like JSON arrays: enclosed in square brackets, with items separated by a comma:

[query1, query2, ...]

Arrays can for example be used for the operators in and not in:

filter(.address.city in ["New York", "Atlanta"])

Properties

An important feature is the property getter. It gets a property from an object:

.age

A nested property can be retrieved by specifying multiple properties. The following path for example describes the value of a nested property city inside an object address:

.address.city

A property can be left unquoted when it contains only the characters a-z, A-Z, _, and $, plus the digits 0-9 anywhere except in the first position. When the property contains other characters, like spaces, it needs to be enclosed in double quotes and escaped like a JSON key:

."first name"

To get the current value itself, use the function get without arguments:

get()

And, alternatively to the dot notation, the function get can be used for properties and nested properties too:

get("age")
get("address", "city")
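To illustrate how get follows a path, here is a small Python sketch. It is not taken from any JSON Query implementation; in particular, returning None for a missing property is an assumption made for this example.

```python
def get(*path):
    """Build a getter that follows the given property path."""
    def getter(value):
        for key in path:
            if not isinstance(value, dict):
                return None  # assumption: a missing property yields None
            value = value.get(key)
        return value
    return getter


person = {"name": "Chris", "age": 23, "address": {"city": "New York"}}

print(get("age")(person))              # 23
print(get("address", "city")(person))  # New York
print(get()(person) is person)         # True: no path returns the value itself
print(get("address", "zip")(person))   # None
```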

Values

JSON Query supports the following primitive values, the same as in JSON: string, number, boolean, null.

Type     Example
-------  -------------------------------------------
string   "Hello world"
         "Multi line text\nwith \"quoted\" contents"
number   42
         2.74
         -1.2e3
boolean  true
         false
null     null

JSON Format

The Text Format described above can be converted into an intermediate JSON Format consisting purely of composed function calls and vice versa. A function call is described as a JSON array containing the function name followed by its arguments:

[name, arg1, arg2, ...]

For example, the following JSON Query filters a list of objects, keeping those whose property age is at least 18 (age >= 18), and then sorts them by the property age.

[
  "pipe",
  ["filter", ["gte", ["get", "age"], 18]],
  ["sort", ["get", "age"]]
]

The following table gives an overview of the Text Format and the equivalent JSON Format.

Type      Text Format                            JSON Format
--------  -------------------------------------  -----------------------------------------------------
Function  name(argument1, argument2, ...)        ["name", argument1, argument2, ...]
Operator  (left operator right)                  ["operator", left, right]
Pipe      query1 | query2 | ...                  ["pipe", query1, query2, ...]
Object    { prop1: query1, prop2: query2, ... }  ["object", { "prop1": query1, "prop2": query2, ... }]
Array     [ item1, item2, ... ]                  ["array", item1, item2, ... ]
Property  .prop1                                 ["get", "prop1"]
          .prop1.prop2                           ["get", "prop1", "prop2"]
          ."prop1"                               ["get", "prop1"]
String    "string"                               "string"
Number    A floating point number                A floating point number
Boolean   true or false                          true or false
null      null                                   null
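Because the JSON Format consists purely of nested function calls, a small evaluator is enough to execute it. The following Python sketch handles only the five functions used in the example above (pipe, filter, sort, get, gte). It is a minimal illustration under that assumption, not the reference implementation, and the names compile_query, FUNCTIONS, and the build_* helpers are invented for this example.

```python
def compile_query(query):
    """Compile a JSON Format query into a Python function: data -> result."""
    if isinstance(query, list) and query and isinstance(query[0], str):
        name, *args = query
        return FUNCTIONS[name](*args)
    return lambda _data: query  # string, number, boolean, or null constant


def build_get(*path):
    def run(data):
        for key in path:
            data = data.get(key) if isinstance(data, dict) else None
        return data
    return run


def build_pipe(*queries):
    steps = [compile_query(query) for query in queries]

    def run(data):
        for step in steps:
            data = step(data)  # output of one step is input of the next
        return data
    return run


def build_filter(predicate):
    test = compile_query(predicate)
    return lambda data: [item for item in data if test(item)]


def build_sort(by, direction="asc"):
    key = compile_query(by)
    return lambda data: sorted(data, key=key, reverse=(direction == "desc"))


def build_gte(left, right):
    a, b = compile_query(left), compile_query(right)
    return lambda data: a(data) >= b(data)


# Hypothetical function table covering only what the example needs.
FUNCTIONS = {
    "get": build_get,
    "pipe": build_pipe,
    "filter": build_filter,
    "sort": build_sort,
    "gte": build_gte,
}

query = ["pipe", ["filter", ["gte", ["get", "age"], 18]], ["sort", ["get", "age"]]]
data = [
    {"name": "Chris", "age": 23},
    {"name": "Emily", "age": 19},
    {"name": "Joe", "age": 32},
]

result = compile_query(query)(data)
print([item["name"] for item in result])  # ['Emily', 'Chris', 'Joe']
```

Each function receives its arguments in raw JSON form and decides itself which ones to compile, which is why get can treat "age" as a property name while filter treats its argument as a nested query.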

Gotchas

The JSON Query language has some gotchas. What can be confusing at first is how data is piped through the query. A traditional function call is for example max(myValues), so you may expect to write this in JSON Query as ["max", "myValues"]. However, JSON Query has a functional approach where we create a pipeline like: data -> max -> result. So, you have to write a pipe which first gets the property and then calls the function max: .myValues | max().
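The pipeline data -> get -> max -> result can be pictured in Python as a list of steps applied one after another. The input below is hypothetical and the sketch only illustrates the order of evaluation.

```python
data = {"myValues": [3, 7, 5]}  # hypothetical input

# .myValues | max() as two steps: data -> get -> max -> result
steps = [
    lambda value: value["myValues"],  # .myValues
    max,                              # max()
]

result = data
for step in steps:
    result = step(result)

print(result)  # 7
```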

Another gotcha is that unlike some other query languages, you need to use the map function to pick some properties out of an array for every item in the array. When you’re just picking a few fields without renaming them, you can use the function pick too, which is more concise.

.friends | { firstName: .name, age: .age }        WRONG 
.friends | map({ firstName: .name, age: .age })   RIGHT 
.friends | pick(.name, .age)                      RIGHT 
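The difference between the three queries can be sketched in Python. The helpers get, make_object, map_items, and pick are simplified stand-ins written for this example, the input is hypothetical, and the exact output of the WRONG query in a real implementation may differ. The point is only that the object is built once from the whole array instead of once per item.

```python
data = {
    "friends": [
        {"name": "Chris", "age": 23, "city": "New York"},
        {"name": "Emily", "age": 19, "city": "Atlanta"},
    ]
}  # hypothetical input


def get(key):
    # Assumption: reading a property from a non-object yields None.
    return lambda value: value.get(key) if isinstance(value, dict) else None


def make_object(queries):
    return lambda value: {name: query(value) for name, query in queries.items()}


def map_items(query):
    return lambda items: [query(item) for item in items]


def pick(*keys):
    return lambda items: [{key: item[key] for key in keys} for item in items]


friends = data["friends"]
shape = {"firstName": get("name"), "age": get("age")}

wrong = make_object(shape)(friends)             # object applied to the whole array
right = map_items(make_object(shape))(friends)  # object applied to every item
picked = pick("name", "age")(friends)           # same fields, not renamed

print(wrong)   # {'firstName': None, 'age': None}
print(right)   # [{'firstName': 'Chris', 'age': 23}, {'firstName': 'Emily', 'age': 19}]
print(picked)  # [{'name': 'Chris', 'age': 23}, {'name': 'Emily', 'age': 19}]
```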

Motivation

There are many powerful query languages out there, so why the need to develop jsonquery? There are a couple of reasons for this.

  1. Syntax

    Most JSON query languages have a syntax that is simple for very basic queries, but complex for more advanced queries. Their syntax is typically very compact but includes special characters and notations (like @, $, |, ?, :, *, &), almost feeling like Regex which is infamously hard to read. The syntax is hard to remember unless you use the query language on a daily basis.

  2. Size

    Most of the JSON query languages are quite big when looking at the bundle size. This can make them unsuitable for use in a web application where every kilobyte counts.

  3. Expressiveness

    The expressiveness of most query languages is limited. Using for example JavaScript+Lodash to query data can be a go-to because it is so flexible. The downside however is that it is not safe to store or share queries written in JavaScript: executing arbitrary JavaScript can be a security risk.

  4. Parsable

    When a query language is simple to parse, it is easy to write integrations and adapters for it. For example, it is possible to write a visual user interface to write queries, and the query language can be implemented in various environments (frontend, backend).

The jsonquery language is inspired by JavaScript+Lodash, JSON Patch, and MongoDB aggregates. It is basically a JSON notation to describe making a series of function calls. It has no magic syntax except for the need to be familiar with JSON, making it flexible and easy to understand. The library is extremely small thanks to smartly utilizing the native language’s functions and built-in JSON parser, requiring very little code to make the query language work.