My JavaScript journey - Part 2 - Server Side Scripting and Node.js

Posted in development on November 30, 2017 by Adrian Wyssmann ‐ 11 min read

In the first part of my journey I went through the very raw basics of JavaScript. As mentioned there, client-side scripting is quite easy to understand and deployment is straightforward: deploy to a web server and call the respective page in a browser. As the runtime is your browser, there is no need to install anything on your web server. Server-side scripting is different, as you need a runtime on your server - pretty obvious. However, for a newbie like me there are still some things I need to understand, especially how an app has to be deployed and hosted.

Basics

But first things first. As mentioned in my previous post, Node.js is such a JavaScript runtime which can be installed on a server - there are versions for Windows, macOS and Linux. Once installed, you can run your JavaScript files with Node instead of a browser. So let’s assume this script, console-output.js:

var logtext = "This is a string logged to the console"
console.log(logtext)

which produces the following output when run with Node:

[adrian@archlinux ~/node-playground]$ node console-output.js
This is a string logged to the console

But Node is not just about executing simple JavaScript files. It provides much more, as it is an asynchronous, event-driven runtime which implements an event loop as a runtime construct. What does that mean?

  • Node uses an event-driven architecture (provided by the libuv library) and the callback pattern to implement the asynchronous behavior
  • Node avoids blocking I/O operations by offloading such operations to the system kernel whenever possible
  • Node implements an event loop to handle events

Asynchronous programs

In contrast to synchronous programs, where the program waits for a task to finish before moving on to the next one, an asynchronous program moves on to the next task even though the previous one has not finished yet. Coordination happens through events, which can be triggered in different ways (e.g. by a user pressing some keys, an internal system event, …). There has to be at least one place where such events are handled. In Node it’s the Event Loop, which I explain in the next chapter.

The callback pattern is something which took me a while to understand when I first heard of it, and even longer to use. It looks like I am not alone, as the page http://callbackhell.com/ points out. It is, by the way, a very interesting guide to writing asynchronous JavaScript programs. Basically, callbacks are functions which are executed when a specified event happens. A simple example is a timeout. The example below prints a message 1s after the program has started:

console.log("Program started at " + Date.now())
setTimeout(function() { console.log("timer 1s has ended") }, 1000)

As you can observe when running this, node actually exits only after the timeout has finished.
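
To make the ordering visible, here is a minimal sketch (the `order` array and its labels are illustrative names of my own, not part of the example above). It records the sequence in which the statements run: the callback is only queued by setTimeout() and runs after the synchronous part of the script has finished, even with a delay of 0ms.

```javascript
// Records the order in which the statements actually run.
const order = [];

order.push('start');

// The callback is only scheduled here; it runs after the synchronous code below.
setTimeout(function () {
    order.push('timeout callback');
    console.log(order.join(' -> ')); // start -> end -> timeout callback
}, 0);

order.push('end');
```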

The Event Loop

An event loop - as the name implies - is a programming construct that waits for and dispatches events or messages in a program. In Node the event loop is initialized when Node starts and then processes the input script. There are 6 well-defined phases, each of which has a specific purpose and its own FIFO callback queue, plus the nextTickQueue, which is not a phase of its own:

  • timers - executes timer callbacks, i.e. the ones scheduled by setTimeout() and setInterval().
  • I/O callbacks - executes callbacks for some system operations; close callbacks, timer callbacks and setImmediate() callbacks are handled in other phases.
  • idle, prepare - only used internally.
  • poll - executes scripts for timers whose threshold has elapsed, and processes events in the poll queue
  • check - executes callbacks immediately after the poll phase has completed, i.e. the ones set with setImmediate()
  • close callbacks - an abruptly closed socket or handle emits the ‘close’ event in this phase
  • process.nextTick() - not a phase itself; handles the nextTickQueue, which is processed between two phases

There are some important aspects to understand.

First, each phase performs only operations specific to that phase, then executes the callbacks in the phase’s queue until the queue is empty or the maximum number of callbacks has been executed.

Second, the relation of the poll and check phases: as long as the poll queue is not empty, the callbacks in the queue are executed synchronously, one after another, until either the queue has been exhausted or the system-dependent hard limit is reached. If the queue is empty, the event loop waits for callbacks to be added to the queue, unless there is a callback scheduled in the check queue via setImmediate(), in which case it moves on to the check phase.

Third, the “special phase” related to the call process.nextTick(): this can be called in any phase, and if so the callback passed as argument is added to the nextTickQueue. This queue is processed once the current operation has completed, at the latest before the next phase is entered.

I’ve played a bit with this

const fs = require('fs'); 
const cache = {}; 

function readFileSynchronous(filename) {
    if(cache[filename]) {
        return cache[filename];
    } else {
        cache[filename] = fs.readFileSync(filename, 'utf8');
        return cache[filename]
    }
}

function readFileAsynchronous(filename, callback) {
    if(cache[filename]) {
        process.nextTick(() => callback(cache[filename] + " (nextTick)"));
    } else {
        //asynchronous function
        fs.readFile(filename, 'utf8', (err, data) => {
            if(err) {
                console.log('>> ' + err);
            } else {
                cache[filename] = data;
                callback(data);
            }
        });
    }
}

var sleep = function (time) {
    var startTime = Date.now()
    while (Date.now() <= startTime + time ) 
    {
        //do nothing
    }
}

var consolelog = function (string) {
    console.log(string)
}

var consoleLogWait = function (string, time) {
    console.log(string)
    sleep(time)
}

setTimeout(function() { consolelog("timer 5s has ended") }, 5000)
setTimeout(function() { consolelog("timer 10ms has ended") }, 10)
setImmediate(function() { consolelog("immediate") })

readFileAsynchronous("test.txt", data => console.log(data + " (not cached)"))
console.log(readFileSynchronous("test.txt") + " (1. call)")
readFileAsynchronous("test.txt", data => console.log(data + " (cached)"))
console.log(readFileSynchronous("test.txt") + " (2. call)")

for (let i = 0; i < 4; i++) { 
    consoleLogWait("Synchronous call #" + i + " then wait 100ms", 100)
}

So what I observe is the following:

  • The first and second asynchronous I/O calls are added to a queue, whereas the 2 synchronous I/O operations (file reads) are executed directly (blocking calls). So I assume we are in the I/O phase
  • After the I/O phase, node enters the poll phase, where the timers are added to the timers queue and the synchronous console write operations are executed. These are blocking calls, so no other operation happens in the meantime, even though the 10ms timer has already expired
  • In the meantime the first and the second asynchronous file read operations have finished. After exiting the poll phase and before entering the check phase we actually see the output of the second asynchronous call - indicated by the “(cached)”. This means that the file had already been read before (otherwise it would not be cached), and as the function uses process.nextTick() in case the file is cached, the output is logged to the console in between the phase change.
  • As the next output I would have expected “immediate”, as to my understanding it should be executed right after the poll phase. This is something which I still do not fully understand; a plausible explanation is that the 10ms timer had long expired during the 400ms of blocking calls, so its callback was already due when the event loop reached the timers phase, which comes before the check phase
  • The next phase which has something in the queue is the timers phase. As the 10ms timer has expired, its message is printed.
  • As the 5s timer has not yet finished, the next phase that has something queued is the I/O phase: the first asynchronous file read call has finished, so the content of the file is printed
  • In the end, after 5s, the 5s timer finishes as well and “timer 5s has ended” is printed to the console
  • Now the script has ended and no more calls are queued, so node terminates

Output of node callback.js:

[adrian@archlinux ~/node-playground]$ node callback.js
this is line 1 of test.txt (1. call)
this is line 1 of test.txt (2. call)
Synchronous call #0 then wait 100ms
Synchronous call #1 then wait 100ms
Synchronous call #2 then wait 100ms
Synchronous call #3 then wait 100ms
this is line 1 of test.txt (nextTick) (cached)
timer 10ms has ended
immediate
this is line 1 of test.txt (not cached)
timer 5s has ended

[adrian@archlinux ~/node-playground]$
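
The effect of the blocking calls on the 10ms timer can be isolated in a smaller experiment. The following is only a sketch with variable names of my own choosing: it schedules a 10ms timer, keeps the main thread busy for roughly 100ms and measures when the callback really fires. The exact number varies per run, but it cannot be below the blocking time.

```javascript
const start = Date.now();
let firedAfter = null;

setTimeout(function () {
    firedAfter = Date.now() - start;
    console.log('10ms timer fired after ' + firedAfter + 'ms');
}, 10);

// Busy-wait for about 100ms; no callback can run in the meantime.
while (Date.now() - start < 100) {
    // do nothing
}

// Still null here: the timer callback has to wait until the script has finished.
console.log('timer fired during blocking loop: ' + (firedAfter !== null));
```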

API

Besides some helpful CLI options, which provide various runtime options and built-in debugging, Node also offers modules which provide helpful functionality, like operating system-related utility methods or utilities for working with file and directory paths. For more information check out the API documentation.
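
As a small taste of these two built-in modules, here is a sketch using `os` and `path`. The directory `/home/adrian/node-playground` is just an assumed example path, and `path.posix` is used so that the results look the same on every operating system.

```javascript
const os = require('os');
const path = require('path');

// Operating system-related information (the values depend on the machine).
console.log('platform: ' + os.platform());
console.log('home directory: ' + os.homedir());

// path.posix forces POSIX-style separators regardless of the current OS.
const scriptPath = path.posix.join('/home/adrian', 'node-playground', 'callback.js');
console.log(scriptPath);                      // /home/adrian/node-playground/callback.js
console.log(path.posix.basename(scriptPath)); // callback.js
console.log(path.posix.extname(scriptPath));  // .js
console.log(path.posix.dirname(scriptPath));  // /home/adrian/node-playground
```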

Modules

JavaScript can be easily extended by using existing scripts and libraries. For client-side scripting you can easily include an external file with the script-tag as shown below:

<html>

<head>
    <title>Example</title>
    <script src="https://unpkg.com/vue"></script>
    <script src="https://unpkg.com/raphael"></script>
    <link rel="stylesheet" type="text/css" href="main.css">
</head>
....

For server-side scripting, and especially when you have a lot of packages and dependencies among them, you will definitely use a package manager. NPM is the default package manager for Node and is included as a recommended feature in the Node.js installer. The npm client lets you easily download libraries (and their dependencies) from an npm registry. The npm registry is a database of public and private packages - there is the official registry, but you or your company may also have its own private registry. For more details I recommend reading through the extensive npm documentation.

Installing packages is quite easy: just call npm install followed by the package name. This will create the node_modules directory in your current directory (if it does not exist yet) and download the package to it. Let’s install the math-expression-evaluator:

[adrian@archlinux ~/node-playground]$ npm install math-expression-evaluator
math-expression-evaluator@1.2.17 node_modules/math-expression-evaluator
[adrian@archlinux ~/node-playground]$ ls -l
total 4.0K
drwxr-xr-x 5 adrian adrian 4.0K Nov 29 11:41 math-expression-evaluator/

After the package is installed you can use it with require('module_name'):

var mexpeval = require('math-expression-evaluator')
console.log(mexpeval.eval("10 + 12"))

And then run the script:

[adrian@archlinux ~/node-playground]$ node mexpeval.js
22

package.json

As recommended by npm, the best way to manage locally installed packages is to use a package.json, which offers the following benefits:

  1. It serves as documentation for what packages your project depends on.
  2. It allows you to specify the versions of a package that your project can use using semantic versioning rules.
  3. It makes your build reproducible, which means that it’s much easier to share with other developers.

To create a package.json, run npm init, which guides you through entering the project information and automatically adds already installed packages as dependencies.

{
  "name": "node-examples",
  "version": "0.0.1",
  "description": "my node playground with some arbitrary scripts",
  "main": "index.js",
  "dependencies": {
    "math-expression-evaluator": "^1.2.17"
  },
  "devDependencies": {},
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "adrian",
  "license": "ISC"
}
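
The caret in "^1.2.17" is one of the semantic versioning rules mentioned above: it allows newer minor and patch versions, but not a new major version. As a rough illustration, here is a deliberately simplified sketch. `satisfiesCaret` is a hypothetical helper of my own, not npm's API, and it only handles plain MAJOR.MINOR.PATCH versions with a major version of 1 or higher; npm itself relies on the semver package, which covers many more cases (0.x versions, pre-release tags, other range operators).

```javascript
// Splits "1.2.17" into [1, 2, 17].
function parse(version) {
    return version.split('.').map(Number);
}

// Simplified caret check, only valid for MAJOR >= 1 and plain x.y.z versions.
function satisfiesCaret(version, range) {
    const [major, minor, patch] = parse(version);
    const [rMajor, rMinor, rPatch] = parse(range.replace(/^\^/, ''));
    if (major !== rMajor) return false;        // a new major version is never allowed
    if (minor !== rMinor) return minor > rMinor; // newer minor versions are fine
    return patch >= rPatch;                    // same minor: patch must not be older
}

console.log(satisfiesCaret('1.2.17', '^1.2.17')); // true
console.log(satisfiesCaret('1.3.0', '^1.2.17'));  // true
console.log(satisfiesCaret('1.2.16', '^1.2.17')); // false
console.log(satisfiesCaret('2.0.0', '^1.2.17'));  // false
```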

You may also notice that a package-lock.json is automatically generated when using npm. This file is intended to be committed into source repositories, and serves various purposes:

  • Describe a single representation of a dependency tree such that teammates, deployments, and continuous integration are guaranteed to install exactly the same dependencies.
  • Provide a facility for users to “time-travel” to previous states of node_modules without having to commit the directory itself
  • Facilitate greater visibility of tree changes through readable source control diffs
  • Optimize the installation process by allowing npm to skip repeated metadata resolutions for previously-installed packages.

Node.js modules

Above we have used modules, but sooner or later you may also want to create a Node module of your own. Modules are npm packages which are published to an npm registry. As already mentioned above, you provide the information for your package when initializing the package.json.

{
  "name": "dummy-console-log",
  "version": "1.0.0",
  "description": "my first npm package",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/papanito/nodejs-playground.git"
  },
  "author": "Papanito <[email protected]>",
  "license": "GPL-3.0",
  "bugs": {
    "url": "https://github.com/papanito/nodejs-playground/issues"
  },
  "homepage": "https://github.com/papanito/nodejs-playground#readme"
}

It is important to know that functions which shall be available to others need to be made a property of the exports object.

exports.printMsg = function() {
    console.log("This is a message from my first npm module");
}

To publish an npm package in the official registry you can follow the guidelines on the npm site. However, I will use my test environment for publishing, so I follow the guidelines for it instead, i.e. I first log in to my private registry:

[adrian@archlinux ~/node-playground\npm-example]$ npm login --registry=https://nexus.wyssmann.com/repository/npm-private/

After this, the package can be published

[adrian@archlinux ~/node-playground\npm-example]$ npm publish --registry=https://nexus.wyssmann.com/repository/npm-private/
+ dummy-console-log@1.0.0

npm package

To consume a package you cannot simply run npm install dummy-console-log, as this will result in an error “code E404 - not found”. Clearly, because the package is not in the default npm registry but in my private one. Therefore we need to provide the registry from which to grab the package (parameter --registry=). As I am already logged in due to the package upload there is no need to log in again, but a colleague who would like to grab the package would first need to log in.

[adrian@archlinux ~/test]$ npm install dummy-console-log --registry=https://nexus.wyssmann.com/repository/npm-private/
+ dummy-console-log@1.0.0
added 1 package in 0.82s

And the respective entry in the package-lock.json:

{
  "requires": true,
  "lockfileVersion": 1,
  "dependencies": {
    "dummy-console-log": {
      "version": "1.0.0",
      "resolved": "https://nexus.wyssmann.com/repository/npm-private/dummy-console-log/-/dummy-console-log-1.0.0.tgz",
      "integrity": "sha512-8+sMdjATyWZkMHCBlYCKWtd5j874i8WrEjoDTYCsHxiYX8Su4eJEqsi/4Rs+S3soVUNoqPv+8ufKJFoiBgqYcA=="
    }
  }
}

The module can then be used as usual in your code …

var dcl = require("dummy-console-log")
dcl.printMsg()

… which runs successfully

[adrian@archlinux ~/test]$ node test.js
This is a message from my first npm module
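
As an aside on the --registry parameter used above: instead of passing it on every call, npm can also read the registry from an .npmrc file. The following is only a sketch of such a file in the project root. Note that with this setting all packages of the project are requested from that registry, so it has to serve the public packages as well; I have not verified this for my setup.

```ini
; .npmrc in the project root (sketch, not verified against my Nexus setup)
registry=https://nexus.wyssmann.com/repository/npm-private/
```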

And now?

Now I have a rough idea of how node works and how to use npm. Developing applications with it is a different story, and there is still a lot to learn.