Power of Eloquence

Using built-in JSON query tools of relational databases


I’ve been working with relational databases for a long time now, especially when building data-rich web applications.

In fact, every full-stack developer has stories to share about working intimately with databases every day.

So knowing your way around SQL statements such as SELECT, FROM, WHERE, GROUP BY and so on is paramount, without question.

Knowing such basic skills allows you to work with disparate industry-standard relational database technologies such as MySQL, MS SQL, Oracle DB, Postgres, and many more.

But I recently discovered a capability that has been slowly making its way into these relational database technologies.

For the first time, you can generate JSON results directly from your SQL statements.

What does this mean exactly?

It simply means that instead of running a SQL statement that returns a resultset in the traditional table format, which your web app then has to translate into something it can use, you can have the database return the same resultset as a JSON payload!

How is that so?

Let me show you how.

Let’s say you’re working with Postgres DB using the following query.

/* Some simple table query we’re running... */

SELECT customer.id, customer.first_name, customer.last_name, customer.dob FROM customer;

As we know, with the above statement, our web app, mobile app or similar will receive a tabular resultset and has to decide how to translate the data for its own purposes.
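That translation step usually looks something like the following minimal Python sketch (the sqlite3 driver and the in-memory table are just stand-ins for whatever database stack you use):

```python
import sqlite3

# Stand-in for the customer table from the article.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER, first_name TEXT, last_name TEXT, dob TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Andy', 'Wong', '19xx-08-05')")

# The traditional tabular resultset: a list of plain tuples...
rows = conn.execute(
    "SELECT customer.id, customer.first_name, customer.last_name, customer.dob "
    "FROM customer"
).fetchall()

# ...which the application has to translate into dicts/JSON by hand.
columns = ["id", "first_name", "last_name", "dob"]
customers = [dict(zip(columns, row)) for row in rows]
print(customers[0]["first_name"])  # Andy
```

This is exactly the boilerplate the JSON query functions below let you delete.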

Now, suppose we want the same data back as JSON objects.

We can use the row_to_json() function.

/* Now let’s throw in some JSON magic here.. */

SELECT row_to_json(jt)
FROM (SELECT customer.id, customer.first_name, customer.last_name, customer.dob FROM customer) jt;

This produces JSON output like the following:

{"id": 1, "first_name":"Andy", "last_name":"Wong", "dob": "19xx-08-05"} // not telling you my real age ;)

Note that row_to_json() converts each row into its own JSON object, so this query still returns one database row per customer; the output above is just the first of them.

But what if you want the whole resultset back as a single JSON value, rather than one JSON object per row?

If that’s the case, you combine the array_agg and array_to_json functions.

/* Ready for full blown JSON-ripped data */

SELECT array_to_json(array_agg(row_to_json(jt)))
FROM (SELECT customer.id, customer.first_name, customer.last_name, customer.dob FROM customer) jt;

Running the above returns a JSON array of objects.

[{"id": 1, "first_name":"Andy", "last_name":"Wong", "dob": "19xx-08-05"},
 {"id": 2, "first_name":"Bruce", "last_name":"Lee", "dob": "1940-11-27"}, 
 {"id": 3, "first_name":"Chuck", "last_name":"Norris", "dob": "1940-03-10"}]

What’s happening here is that array_agg is an aggregate function, like count or sum, so it aggregates the query results into a single PostgreSQL array, while array_to_json takes that same PostgreSQL array and flattens it into a single JSON value.
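If you want to experiment with this aggregation pattern without a Postgres server, here is a minimal runnable sketch using SQLite’s JSON1 functions, where json_object plays the role of row_to_json and json_group_array covers array_to_json(array_agg(...)); the in-memory table and sample rows are made up for illustration:

```python
import json
import sqlite3

# Stand-in database mirroring the Postgres example above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER, first_name TEXT, last_name TEXT, dob TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1, "Andy", "Wong", "19xx-08-05"),
        (2, "Bruce", "Lee", "1940-11-27"),
        (3, "Chuck", "Norris", "1940-03-10"),
    ],
)

# Aggregate every row into a single JSON array of objects.
(payload,) = conn.execute(
    """
    SELECT json_group_array(
             json_object('id', id, 'first_name', first_name,
                         'last_name', last_name, 'dob', dob))
    FROM customer
    """
).fetchone()

# The database hands back one string containing the whole JSON payload.
customers = sorted(json.loads(payload), key=lambda c: c["id"])
print(customers[0]["first_name"])  # Andy
```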

This is pretty amazing!!

Do you know what immediate benefit this will mean to you as an application developer?

It means you no longer have to take responsibility for translating the data from a table into JSON-formatted results, such as by using an ORM API to deconstruct and reconstruct the resultset.

You get exactly the result you asked for, with no data structures left to manipulate!

All you then need from ORM tools such as Node’s Sequelize or Python’s SQLAlchemy is one job: connect and fetch the resultset!

Your front-end app that consumes it just needs a data model matching the JSON payload response from the server. And that’s it!

let customer_data = await fetch(api_call).then(response => response.json()); // just assume the request succeeds

console.log(customer_data.length);        // 3
console.log(customer_data[0].id);         // 1
console.log(customer_data[0].first_name); // "Andy"
console.log(customer_data[0].last_name);  // "Wong"
console.log(customer_data[0].dob);        // "19xx-08-05"
// Or the TypeScript equivalent

export interface CustomerModel {
  id: number;
  first_name: string;
  last_name: string;
  dob: string; // JSON dates arrive as strings
}
And you can take this even further.

Using the same customer query above, what if we tweak the query so that a customer can have several orders? Those orders will also come from this query, but we want them to appear as an embedded JSON resultset inside each customer JSON object.

Using the same idea we learnt above, we can rewrite it as follows.

/* Revised JSON query to embed JSON array of data against each JSON customer object */

SELECT array_to_json(array_agg(row_to_json(jt)))
FROM (SELECT customer.id, customer.first_name, customer.last_name, customer.dob,
        (SELECT array_to_json(array_agg(row_to_json(customer_orders)))
         FROM (SELECT orders.id, orders.order_number, orders.description, orders.amount
               FROM orders
               WHERE orders.customer_id = customer.id) customer_orders
        ) AS orders
      FROM customer) jt;
// JSON results with embedded orders

[
  { "id": 1,
    "first_name": "Andy",
    "last_name": "Wong",
    "dob": "19xx-08-05",
    "orders": [
      {"id":1, "order_number":"aw0001", "description": "books", "amount": 90.00},
      {"id":2, "order_number":"aw0002", "description": "magazines", "amount": 45.00},
      {"id":3, "order_number":"aw0003", "description": "toys", "amount": 22.75}
    ]
  },
  { "id": 2,
    "first_name": "Bruce",
    "last_name": "Lee",
    "dob": "1940-11-27",
    "orders": [
      {"id":4, "order_number":"aw0004", "description": "books", "amount": 100.00},
      {"id":5, "order_number":"aw0005", "description": "magazines", "amount": 55.00},
      {"id":6, "order_number":"aw0006", "description": "toys", "amount": 32.75}
    ]
  },
  { "id": 3,
    "first_name": "Chuck",
    "last_name": "Norris",
    "dob": "1940-03-10",
    "orders": [
      {"id":7, "order_number":"aw0007", "description": "books", "amount": 110.00},
      {"id":8, "order_number":"aw0008", "description": "magazines", "amount": 65.00},
      {"id":9, "order_number":"aw0009", "description": "toys", "amount": 42.75}
    ]
  }
]

What’s going on here is that the inner SELECT on orders uses array_agg to aggregate that customer’s order rows, matched by the customer.id supplied by the current row of the outer SELECT on customer. This is known as a correlated subquery: as the outer statement evaluates each customer row, the orders subquery is re-evaluated against that customer’s id.
At each customer JSON object level, once the customer’s orders have been aggregated, array_to_json converts the result back into a JSON array, which is exposed under the property name orders.
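The same correlated-subquery pattern can be sketched end to end with SQLite’s JSON1 functions (json_object and json_group_array again standing in for Postgres’s row_to_json and array_to_json(array_agg(...))); the schema and data below are made up for illustration, and the json() wrapper tells SQLite to embed the subquery’s result as a JSON array rather than a quoted string:

```python
import json
import sqlite3

# Stand-in schema mirroring the customer/orders example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER, first_name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER,
                         order_number TEXT, amount REAL);
    INSERT INTO customer VALUES (1, 'Andy'), (2, 'Bruce');
    INSERT INTO orders VALUES
        (1, 1, 'aw0001', 90.0),
        (2, 1, 'aw0002', 45.0),
        (3, 2, 'aw0004', 100.0);
""")

# Correlated subquery: for each customer row, aggregate that customer's
# orders into an embedded JSON array, then aggregate the customers.
(payload,) = conn.execute("""
    SELECT json_group_array(
             json_object(
               'id', c.id,
               'first_name', c.first_name,
               'orders', json((SELECT json_group_array(
                                        json_object('id', o.id,
                                                    'order_number', o.order_number,
                                                    'amount', o.amount))
                               FROM orders o
                               WHERE o.customer_id = c.id))))
    FROM customer c
""").fetchone()

customers = sorted(json.loads(payload), key=lambda c: c["id"])
print(customers[0]["orders"])  # Andy's two orders as a nested list of dicts
```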

That’s it!

That’s how you produce an embedded JSON resultset: by nesting one set of JSON query functions inside another, as above.

The great thing about this setup is that a single query like this is generally more performant than making two separate SQL calls from the ORM layer, where you pay for the extra round trip and the additional disk I/O between the two calls.

With the above, you execute only one SQL statement, so there is no second database call to extend the overall duration.

So there you have it, folks!

JSON query tools are now available at your disposal for all your complex and performance-based queries in relational database environments. This includes everything from MySQL, SQLite, Oracle DB, Derby, Amazon Aurora to many more!

But one word of caution: should you decide to go down this path, you may be trading readability for performance, since you can end up with big SQL statements that have several layers of embedded queries. So make sure to use these features wisely when producing JSON results from table-structured queries.

Give them a go. ^_^

Till then, Happy Coding!