Friday, May 17, 2013

Growing pains of node.js usage

Generic Reporting Intranet v2 Enhancements

As part of the next "Generic Reporting Intranet v2" release I intend not only to flesh out some missing functionality, but also to start work on the underlying infrastructure to make it more suitable for large-scale deployments. To this end I plan to let Linux/Windows/Mac backend hosts optionally run either the existing Perl process or a new alternative written in Node.js.

Well, that is the plan. The reality is that flipping between JavaScript, Perl, Python and shell programming whilst doing system admin work during the working day is quite a juggle, so progress is slower than I expected.

I'm desperate for the asynchronous performance benefits of Node but getting there is hard work.

A Node Problem and Solution

Typical Node Scenario

For example consider the following pseudo-code:

fs.readFile(args['cfgfile'],'utf-8',function (err,data) {
    if(err) {
      console.error(err);
      process.exit(1);
    }
    var rr=new xmldoc(data);
    args['myname']=2;
    // set more args[..] settings
});

if(args['myname']==2) {
  console.log('it is me!');
}

Now, coming from a procedural programming background, you might expect the final "if" statement to work, since the code above it sets args['myname']. In fact it will not: by the time the "if" runs, the callback that sets args['myname'] has not been called yet.

Understanding the problem

What is happening here is that "fs.readFile" is an asynchronous function: it returns immediately, and the code in the "function" part is executed only once the file has been read in. That is always after the currently running code has finished, so it is certainly later than the last "if" statement!
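To see the ordering for yourself, here is a minimal sketch. The scratch file name and the "order" array are made up for this illustration and are not part of the reporting code; the only point is to record which piece of code runs first.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created here so the example is self-contained.
const cfgfile = path.join(os.tmpdir(), 'readfile-order-demo.txt');
fs.writeFileSync(cfgfile, 'myname=2\n');

// Records the order in which the two pieces of code run.
const order = [];

fs.readFile(cfgfile, 'utf-8', function (err, data) {
  if (err) throw err;
  order.push('callback ran');
  // prints "code after readFile ran, then callback ran"
  console.log(order.join(', then '));
  fs.unlinkSync(cfgfile); // tidy up the scratch file
});

// readFile has only started the read at this point, so this runs first.
order.push('code after readFile ran');
```

The callback is always second, however small the file is, because Node only calls it after the currently running code has finished.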

The Solution - re-factor, re-factor ...

I'm quite comfortable writing code that is probably not very neat and has countless levels of indentation. I'm not a great programmer!

However, Node almost forces you to re-factor your code into smaller procedures to handle the callbacks in a sane manner. Add in the JavaScript scoping limitations and it might as well hold a gun to your head: re-factor or die!

So for the pseudo-code example the code might be re-written as:

function load_config(cb) {
  fs.readFile(args['cfgfile'],'utf-8',function (err,data) {
    if(err) {
      console.error(err);
      process.exit(1);
    }
    var rr=new xmldoc(data);
    args['myname']=2;
    // set more args[..] settings
    // the config is loaded, so now run the code that depends on it
    cb();
  });
}

load_config(function() {
  if(args['myname']==2) {
    console.log('it is me!');
  }
});

So the process tends to be to re-write code into functions which take callbacks, and to move any work that depends on something asynchronous into those callbacks. It might sound obvious ... but converting existing programs tends to require significant effort.
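For completeness, here is a self-contained version of the same pattern that can be run as-is. It is only a sketch under some assumptions: the XML parsing is swapped for a made-up parse_config helper that reads simple "key=value" lines, the config file is a scratch file written by the example itself, and errors are passed to the callback as its first argument (the usual Node convention) rather than exiting inside load_config.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const args = {};

// Hypothetical stand-in for the XML parsing: reads "key=value" lines.
function parse_config(text) {
  const settings = {};
  text.split('\n').forEach(function (line) {
    const idx = line.indexOf('=');
    if (idx > 0) {
      settings[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
    }
  });
  return settings;
}

// Reads the file named by args['cfgfile'], copies its settings into args,
// then calls cb(err) so the caller can carry on.
function load_config(cb) {
  fs.readFile(args['cfgfile'], 'utf-8', function (err, data) {
    if (err) {
      return cb(err);
    }
    const settings = parse_config(data);
    Object.keys(settings).forEach(function (key) {
      args[key] = settings[key];
    });
    cb(null);
  });
}

// Usage: write a scratch config, then do the dependent work in the callback.
args['cfgfile'] = path.join(os.tmpdir(), 'load-config-demo.cfg');
fs.writeFileSync(args['cfgfile'], 'myname=2\n');

load_config(function (err) {
  if (err) {
    console.error(err);
    process.exit(1);
  }
  // The value arrives as the string '2'; the loose == still matches 2.
  if (args['myname'] == 2) {
    console.log('it is me!');
  }
});
```

Handing the error to the callback leaves the decision about what to do on failure with the caller, which makes load_config easier to reuse from more than one place.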