Power of Eloquence

For all the great love of async-ness...


After building a number of web apps, sites and other fun stuff over the years, I find programming models such as procedural and object-oriented principles have helped us accomplish most of the tasks we throw at them, however trivial or ill-conceived: everything from designing online booking systems, timesheet applications and CMS-driven plugins to recent SPA-driven apps. Engineers like us couldn't be much more productive with them, in the most imaginative ways. We're so used to following their guidelines and practices that we've come to treat them as the holy grail of how we build things - and have done for a very long time, since the dawn of computing and Internet programming.

Now, with the explosion of sophisticated real-time applications appearing in the tech scene - messaging apps like WhatsApp, on-demand video-conferencing, online gaming, e-commerce trading transactions and so on - we're starting to witness another programming model that we'll slowly be seeing (or embracing) everywhere.

That’s working with asynchronous programming models.

Asynchronous programming…

How does one define such a programming model? How is it any different from the ones we're used to? Does it take our development expertise to another exciting chapter in our creative engineering journey? Why should we care about it?

With that burning curiosity in mind, I googled "why asynchronous programming" and it led me to this simple definition.

Asynchronous Programming refers to a style of structuring a program whereby a call to some unit of functionality triggers an action that is allowed to continue outside of the ongoing flow of the program.

Fascinating!

So we're allowing one piece of code to keep running on its own course without blocking the rest of the program? No CPU or browser hang-up is ever going to happen? Does this mean the other parts of the program no longer have to compete for memory or I/O, and the server no longer has to spawn a thread to cater to each individual application request?

Perhaps.

Let’s start off with the following trivial JS code.

//let's assume we're using our database technology of some sort
var mongo = require('mongodb');

//then we open up connection with certain privileges
let syncdb = mongo.MongoClient.connect('some_db_url', {user:'dbadmin', pwd: 'topsecret'});

//let's fetch a table result set.
let resultset = syncdb.query('SELECT * FROM averyverybigtable');
console.log('We are done!');

What we're seeing here is code written synchronously.

When line 8 (the query) is executing, line 9 (the console.log) has to wait for it to complete.

For this type of request, imagine a table of significant size with countless rows (and multiple columns): the query could take considerable time to complete before line 9 ever gets its turn to run. In fact, can you guarantee that line 8 will ever run to completion at all? Many things could go wrong, ranging from database timeouts, network latency and high memory load to the database simply hanging. With all those uncontrollable external events, what happens to line 9? Ultimately, it might never run.

Instead, the program might come to a grinding halt and you'll end up with a very unresponsive environment that prevents users from doing anything else. Since JS is single-threaded in the browser, you can't spawn extra threads to spread the load across multiple requests the way you might in Java or C. The result is what we call UI-blocking behaviour.
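To make that concrete, here's a minimal, hypothetical browser sketch (the 'run-report' element id is made up): a click handler doing long synchronous work freezes every other interaction on the page until it finishes.

// Hypothetical sketch: any long-running synchronous work
// blocks the single UI thread until it finishes.
document.getElementById('run-report').addEventListener('click', function() {
    let total = 0;
    // stand-in for a heavy synchronous task, e.g. crunching a huge result set
    for (let i = 0; i < 1e9; i++) {
        total += i;
    }
    // until this loop ends, clicks, scrolling and rendering are all frozen
    console.log('Finished:', total);
});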

That's the drawback of writing synchronous code.

Let’s look at the asynchronous approach.

//again same db connection settings
var mongo = require('mongodb');
let asyncdb = mongo.MongoClient.connect('some_db_url', {user:'dbadmin', pwd: 'topsecret'});

//using callbacks
asyncdb.query('SELECT * FROM averyverybigtable', function(resultset) {
    console.log(resultset);
});
console.log('We are done');

What's going on here now? The code is pretty much the same as the previous version, i.e. the asyncdb object instance is running the same SQL query. The minor difference is that we now pass a callback function to the query method call.

What purpose does the callback function serve here really?

Well..

It's essentially saying: while you go off and run this very long database query in the backend, call us when you're done. We expect our anonymous callback to be invoked with the table query result as its argument, and to show us the complete result set. It also means that while the query is still being executed, the rest of the world keeps moving forward, running its own course.

So what do you think the above code will output?

Not surprisingly, it will display 'We are done' first, and eventually the table resultset will be output.

There will be no blocking of the UI thread. No unresponsiveness will occur while the database call is in flight. Users can go ahead and mind their own business while they wait for the query to complete: open another screen, click a button, scroll the page, or even kick off another asynchronous operation elsewhere. They keep full control of the environment without being stuck or left idle between interactions.

This is the beauty of async-ness.

It gives us the ability to minimise or remove user idle time wherever possible, and ultimately encourages frequent user interaction within the browser at any given point in time, while triggered activities are left to run in the background.

This is simply due to the internal mechanics of JavaScript's concurrency model: how event handlers are managed within its event loop and how they are processed from its task queue. If you're wondering how JS's concurrency model, event loop and task queue work underneath, you can check out a very good self-explanatory post here. I won't give away too much detail; I'm better off discussing my thoughts on these differing concurrency models in a future post.
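As a quick taste of that event loop in action, though: even a zero-millisecond timer callback is placed on the queue and only runs once the currently executing synchronous code has finished.

console.log('first');

// the callback is queued by the environment and only runs
// after the current call stack has emptied
setTimeout(function() {
    console.log('third');
}, 0);

console.log('second');
// output order: first, second, third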

That’s all mighty great and dandy to know!

But you might ask yourself some important questions. What if you need a synchronous operation that depends on the success of the asynchronous operation that ran before it? What if, as the next step, you want to do something like calculate the total number of rows returned in a given data set, like this?

let resultSetForUI = null;

asyncdb.query('SELECT * FROM averyverybigtable', function(resultset) {
    console.log(resultset);
    resultSetForUI = resultset;
});

//display data in the html table using Underscore/JSON template combo
document.getElementById('table').innerHTML = _.template(JSON.stringify(resultSetForUI)).html();

//the table count
console.log(resultSetForUI.count);

The problem here is that resultSetForUI won't yet contain anything fetched by the asyncdb query call. That will happen sometime in the near future, but not now, and certainly not immediately. At this stage it's still null.

Even if the query does come back quickly, you cannot guarantee it will always respond in a constant amount of time. As mentioned earlier, there are plenty of external factors at play, such as network latency and database reliability, so you frequently run into the awkward situation of presenting an incomplete state of the operation to the user. They will be misled into thinking the operation is broken when the truth is that things are still pending.

So what should we do in this situation?

The first solution is to poll using the setInterval/clearInterval methods.

let resultSetForUI = null;

asyncdb.query('SELECT * FROM averyverybigtable', function(resultset) {
    console.log(resultset);
    resultSetForUI = resultset;
});

let intervalResponse = setInterval(function(){
     if(resultSetForUI  === null) {
          console.log("Awaiting db resultset...");
     }
      else {
         clearInterval(intervalResponse);

         document.getElementById('table').innerHTML = _.template(JSON.stringify(resultSetForUI)).html();
         console.log(resultSetForUI.count);
     }
}, 1000);

This says that we check every 1000ms, while the query call is in flight, whether valid resultset data has arrived. Once the resultset has been successfully fetched, we kill the interval with clearInterval and move on to the next operation, which is outputting the correct response to the user. We don't need the interval anymore.

This is by far one of the simplest approaches for getting asynchronous operations to behave well in synchronous places. The issue is that you become responsible for keeping track of multiple running intervals in many parts of the application, which creates a lot of performance overhead in your browser (or in NodeJS) and compromises the stability of the environment.

Instead of this, what should we do then?

Another proposed solution is to write a callback within another callback.

function getQueryTable(statement, callback) {
  asyncdb.query(statement, function(resultset) {
      //console.log(resultset);
      callback(resultset);
  });
}

getQueryTable('SELECT * FROM averyverybigtable', function(resultSetForUI) {
    //work with the resultset here, inside the callback
});

/*------ The rest of the code runs from here ------*/

Here, you transfer the control from synchronous operations to the asynchronous world of callbacks, and we’re done.

Which is great for a simple use case like this. But it may not always be so straightforward.

Imagine another scenario. What if you want multiple async tasks to run one after another, in a sequential manner? Maybe something like this:

  1. After fetching from this large table, you first want to filter the results down to rows with non-empty author names (authorName).
  2. Then you want to refine those results by selecting only the columns that have non-empty data.
  3. Then you sort them in ascending order by the first column, authorName.
  4. Finally, return the total count of non-empty rows produced by the previous operations.

With this in mind, again using callbacks, you might write it like so:

let filterParam = "authorName";

let resultSetForUI = asyncdb.query("SELECT * FROM averyverybigtable", function(resultset) {
    resultset.filterBy(filterParam, function(filteredResultSet){
          let columnsNotNull = true;    
          filteredResultSet.refineData(columnsNotNull, function(refinedResultSet) {
              let firstColumnName = "authorName";
              refinedResultSet.sortBy(firstColumnName, function(sortedResultSet) {
                  return sortedResultSet;
              });
          });
     });
});
/*------ The rest code runs by here ------*/

Here we have four callbacks, hence four nested levels of callbacks to keep track of. It does the job well enough.

However, like the setInterval approach, this can also make programs unwieldy as your async operations grow more complex with each new feature. You end up with a triangular shape of deeply nested callbacks, which is what popularised the phrase "pyramid of doom". The phrase stuck because if you keep building the pyramid further and further, you end up with one massively ugly tree of spaghetti code in the codebase. You may wonder how anyone keeps track of all the states within the app, let alone keeps their sanity when debugging and troubleshooting asynchronous behaviour. This is also known as "callback hell". Rightly so: it's hellishly difficult to know what and where the moving parts are.

On top of that, in real-world situations you're supposed to manage error handling at every level of callback too. Let's face it: there's no guarantee you will get your expected results back at every async step.

So it becomes something like this...

let resultSetForUI = asyncdb.query('SELECT * FROM averyverybigtable', function(error, resultset) {
    // console.log(resultset);
    if(error){console.log('cannot find table');}
    else {
      resultset.filterBy(filterParam, function(error, filteredResultSet){
        if(error){console.log('cannot filter resultset by', filterParam);}
        else {
          let columnsIsNotNull = true;              
          filteredResultSet.refineData(columnsIsNotNull, function(error, refinedResultSet) {
            if(error){console.log('cannot refine data for valid columns');}
            else {              
              let firstColumnName = "authorName";              
              refinedResultSet.sort(firstColumnName, function(error, sortedResultSet) {
                if(error){console.log('cannot sort result set')}
                else { console.log(sortedResultSet.count); }
              })               
            }
          })
        }
     })
    }
});

Yuck! Looks pretty nasty.

Yet, surprisingly, I've found a lot of people have dealt with this level of complexity for a long time, as normal practice. That's simply because asynchronous programming wasn't mainstream in many web apps (though the concepts have been around much longer than most people think) until technologies such as ReactJS and NodeJS arrived on the scene a few years back. They brought forth a whole slew of innovative ways to create more sophisticated apps while enriching the user experience to greater heights. The demand for asynchronous-driven apps has gone sky high.

Recent changes in ECMAScript (ES6 and ES7) have brought us three new approaches.

## Promises

Promises are essentially a 'cleaner' version of callbacks. What I mean by cleaner is not necessarily that the async code itself is clean (though you can generally do much better with promises than with raw callbacks), but rather that they offer a better abstraction for chaining your async events in flow order using .then callbacks - rewritten like so.

//using thenables callbacks
let resultSetForUI = asyncdb.query('SELECT * FROM averyverybigtable')
         .then(getFilterBy)
         .then(refineData)
         .then(sortData)
         .catch(handleError);

function getFilterBy(resultset){
  return new Promise(function(resolve, reject) {
        /**returns promised value*/
  })
}

function refineData(filteredResultset){
 return new Promise(function(resolve, reject) {
        /**returns promised value  */
  })
}

function sortData(refinedResultSet){
  return new Promise(function(resolve, reject) {
        /**returns promised value  */
  })
}

function handleError(error) {
   console.log('Cannot process query at this time due to', error);
   return {
      message: 'Data not available at this time',
      count: 0
   };
}

Notice how all the per-level error handlers are removed and encapsulated in a single catch callback at the end of the promise chain. The great thing about Promises is that they represent a value that eventually settles into one of two states: resolved when the operation succeeds, or rejected when it fails (including errors). How a promise reaches its eventual state doesn't really matter here. What matters is that the promised value will always settle at some point in the future, so we can decide our next actions in subsequent async events.

This is the core basis of Promises.
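As a rough sketch of what one of those placeholder promise functions might look like (assuming the same hypothetical filterBy API and filterParam variable from the earlier callback examples), the promise simply resolves with the filtered rows or rejects with whatever went wrong:

//sketch only - filterBy and its node-style callback are the same
//hypothetical API used throughout this post
function getFilterBy(resultset) {
  return new Promise(function(resolve, reject) {
      resultset.filterBy(filterParam, function(error, filteredResultSet) {
          if(error) { reject(error); }            //the promise settles as rejected
          else { resolve(filteredResultSet); }    //the promise settles as resolved
      });
  });
}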

Promises offer far better mechanisms for exception and error handling than plain callbacks do. In the example above we've gone a long way towards tidying up the code by chaining thenables and giving each chained method a callback function to handle its own piece of business logic. These few lines of code say that we will go through each database operation one step at a time, if and only if the promised value is resolved at the end of each callback. If, at any point, the promised value is rejected, we simply catch the error at the end of the promise chain and exit the async flow altogether.

I personally find this fantastic for code readability and maintenance. I have more confidence refactoring the code without much risk of breaking existing functionality.

Having said that, you can still get away with writing 'cleaner' code using traditional callbacks too, not just with Promises. You can use named callback functions instead of anonymous callbacks at each level and still maintain readability, as the sketch below shows.
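Here's a rough sketch of that idea, again assuming the hypothetical asyncdb, filterBy, refineData and sort APIs from the earlier examples: each step becomes a named function, so the nesting disappears from view even though the control flow is still callback-driven.

//sketch only - asyncdb and its table methods are the same
//hypothetical APIs used in the earlier examples
function onQuery(error, resultset) {
    if(error) { return console.log('cannot find table'); }
    resultset.filterBy('authorName', onFilter);
}

function onFilter(error, filteredResultSet) {
    if(error) { return console.log('cannot filter resultset'); }
    filteredResultSet.refineData(true, onRefine);
}

function onRefine(error, refinedResultSet) {
    if(error) { return console.log('cannot refine data'); }
    refinedResultSet.sort('authorName', onSort);
}

function onSort(error, sortedResultSet) {
    if(error) { return console.log('cannot sort result set'); }
    console.log(sortedResultSet.count);
}

asyncdb.query('SELECT * FROM averyverybigtable', onQuery);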

Basically, it's a matter of personal style, and of how clear your chosen abstractions turn out to be. Even more important, if you're working in a team, is whether your teammates will fully grasp how you've written them. The key thing is to stay consistent with your choice of async approach across the codebase as you build.

## Generators/Iterators

As with callbacks, you can still make your code unreadable by absent-mindedly nesting several promise chains, and end up back in a callback-hell pattern. Naturally, as your app evolves, you'll be inundated with async features of growing complexity, and your original design will have to handle more and more edge cases of async flow control.

Our second option, courtesy of ES6, is generators and iterators.

Generators and iterators are a pair of metaprogramming tools that let you pause and resume execution at any point.

This is quite a revelation, as I'd never imagined we'd be able to take control of run-to-completion execution anywhere in a program until now - especially in the async world.

The basic syntax looks like this

//generator function made
function *generator() {
   yield 'Hello';
   yield 'World';
   yield 'this';
   yield 'is';
   yield 'awesome';
}

//instantiate generator object
let my_iterator = generator();

You add the * symbol to mark it as a generator function. Calling the generator function gives you back an iterator (a generator object). To run the generator, you call the iterator's next() method.

my_iterator.next(); // {value: 'Hello', done: false}

This tells the generator to run until the first yield, pause there, and hand back the yielded value. Notice there's a done property attached to the result. It indicates whether the generator has run to completion yet, and it comes back false, meaning we still have more yield statements to get through.

Now run next() four more times.

my_iterator.next(); // {value: 'World', done: false}
my_iterator.next(); // {value: 'this', done: false}
my_iterator.next(); // {value: 'is', done: false}
my_iterator.next(); // {value: 'awesome', done: false}

The done property still comes back as false because we've only just paused at the fifth (and last) yield statement; the generator hasn't actually returned yet.

When running next() once more,

my_iterator.next(); // {value: undefined, done: true}

Now we don't have any yield statements left to pause at, so done comes back as true, meaning the generator has truly run to completion. With that mechanism in hand, we can rewrite our query pipeline as a generator:

let getQueryTable = function* (SQLstatement) {

   let resultset = yield asyncdb.query(SQLstatement);

   let filteredResultset =  yield resultset.filterBy(filterParam);

   let refinedResultSet = yield filteredResultset.refineData(false);

   let sortedResultSet = yield refinedResultSet.sort(firstColumnName);

  return sortedResultSet;
}

Wow. Look at that! There isn't a single callback in sight anywhere in this function. In fact, if you look closely, it reads like synchronous code now... And that's marvellous, because many programmers and engineers have been dreaming of writing asynchronous code in a synchronous style for a long time. With tools like these, callback-hell patterns may end up as nothing but a distant memory in programmers' lives - which is great, in my firm opinion.

To put the generator above into action, you drive it by calling next() like so.

// instantiate iterator

var asyncDBIterator = getQueryTable('SELECT * FROM averyverybigtable');

// get first yield
let result = asyncDBIterator.next();

let resultSetForUI = result.value(function(err, res) {
     if(err) console.log('error occurred at this point', err);

     // get second yield
     let result = asyncDBIterator.next(res);
     result.value(function(err, res) {
        if(err) console.log('error occurred at this point', err);

        //get third yield
        let result =asyncDBIterator.next(res);
        result.value(function(err, res) {
            if(err) console.log('error occurred at this point', err);

            //get fourth yield
            let result = asyncDBIterator.next(res);

            result.value(function(err,res) {
                 if(err) console.log('error occurred at this point', err);

                 //get fifth( and last) yield
                 let result = asyncDBIterator.next(res);
                 return result
            });
        });
     })
});

It works now. The bad news, however, is that you've ended up building callback hell all over again...

Rats! Not a good idea.

So a better approach is to use a more robust library that helps us execute async operations sequentially, such as Co.js.

import co from 'co';

var getQueryTable = function(dbStatement) {

  //co runs the generator to completion and hands back a promise
  return co(function *() {
     const resultset = yield asyncdb.query(dbStatement);
     const filteredResultset =  yield resultset.filterBy(filterParam);
     const refinedResultSet = yield filteredResultset.refineData(false);
     const sortedResultSet = yield refinedResultSet.sort(firstColumnName);
     return sortedResultSet;
  }).catch(handleError);
}

let resultSetForUI = getQueryTable('SELECT * from averyverybigtable');

Co.js is one of the community-supported libraries that give developers an edge in encapsulating better asynchronous design using ES6 generator features. There are other notable libraries making good use of generators too, such as Bluebird.js, which does similar things but with a different philosophy behind it.

## Async/Await

Finally, we come to the last, and probably the most exciting, option of the three.

That's using the async and await keywords coming in ES7. These let you write code that looks so synchronous you'll feel at home doing sane coding for the first time.

It looks like this.

var getQueryTable = async function(SQLStatement) {
     const resultset = await asyncdb.query(SQLStatement);
     const filteredResultset =  await resultset.filterBy(filterParam);
     const refinedResultSet = await filteredResultset.refineData(false);
     const sortedResultSet = await refinedResultSet.sort(firstColumnName);
     return sortedResultSet;
}

let resultSetForUI = getQueryTable("SELECT * FROM a veryverybigtable");

Again - no more callbacks or explicit promise chains (though you can still combine promises with async/await anyway). Asynchronous flows that read exactly like synchronous ones.

In other words.

Async/await + synchronous = perfect harmony and blissful coding
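Error handling reads synchronously too. As a rough sketch (same hypothetical asyncdb API as above, and a made-up function name), a single try/catch around the awaited steps replaces both the per-level error callbacks and the promise chain's catch:

//sketch only - the asyncdb API is the same hypothetical one used above
async function getQueryTableSafely(SQLStatement) {
    try {
        const resultset = await asyncdb.query(SQLStatement);
        const filteredResultset = await resultset.filterBy('authorName');
        const sortedResultSet = await filteredResultset.sort('authorName');
        return sortedResultSet;
    } catch (error) {
        // one place to handle a failure from any awaited step
        console.log('Cannot process query at this time due to', error);
        return { message: 'Data not available at this time', count: 0 };
    }
}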

Bear in mind they're not readily available in most browsers yet, so for now people tend to use transpiler tools such as Babel to bridge the gap until they're fully supported.

## The long and short of it

Now you have a number of async styles to pick from.

And with many async features slowly being baked into reputable JS libraries and frameworks such as NodeJS, Angular and React, it's no wonder the web community is in awe of, and hyped up about, asynchronous applications these days.

Hence the great love of async-ness.

Just to be clear, I'm not really an expert in this programming paradigm. It's something I started picking up a couple of months back when I was given a coding challenge that involved a lot of asynchronous tasks: building a simulated tournament ranking system that uses a scoring API server to store and retrieve players' scores in real time. Completing the challenge inspired me to write up this post and share my learnings and thoughts.

I sincerely hope this post helps open up your awareness of this programming paradigm and of how it can affect every aspect of our software development practices on the web, whether you end up using it or not. I find it an inescapable fact that asynchronous programming is the future, and part of the grand scheme of exciting innovations that await us in the coming years as it becomes mainstream.

Happy Coding!
