State of distributed events in node.js
Context
Application scaling in the cloud
Cloud technologies bring exciting new challenges for application developers: achieving horizontal scalability, removing every single point of failure, and handling load gracefully. It’s fun, but it’s a challenge, because it raises many complex problems. One big category of problems is information sharing between application instances.
The need for information sharing
There are many cases where you’ll need to share information between your running application instances: sharing session data (so you no longer need sticky sessions and can be truly elastic), invalidating caches, or launching asynchronous tasks, e.g. updating your Elasticsearch indexes. The need is real.
The event-driven architecture
It’s not big news that lots of applications use an event-driven architecture one way or another. What is a JMS message if not an event? The shift is that, now that a single application runs on many servers concurrently, applications simply can’t work without some event-driven architecture anymore.
node.js and events
Events at the core of JavaScript
Due to its single-threaded nature, JavaScript is built from the ground up to listen for and send events. It is a natural fit for a browser language, which has to listen for key presses and mouse clicks. Events add power when combined with AJAX: you can fire a few AJAX requests in parallel, listen for their completion events, then aggregate the results and do whatever you want with them. But, to be completely fair, once JavaScript’s creators had decided that it would be a single-threaded system, they didn’t have many choices other than events to react to external changes without blocking.
Events at the core of node.js
Events are also first-class citizens of the node.js core. Lots of core modules (streams or http, for example) inherit from EventEmitter. It’s the suggested way of communicating between your node modules, mainly for the loose coupling it provides.
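As a minimal sketch of what inheriting from EventEmitter looks like in your own module: the JobQueue class and the 'job:added' event name below are made up for illustration, only EventEmitter itself comes from node.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical module: a job queue that announces what happens to it.
class JobQueue extends EventEmitter {
  constructor() {
    super();
    this.jobs = [];
  }

  add(job) {
    this.jobs.push(job);
    // emit() calls every registered listener synchronously, in registration order.
    this.emit('job:added', job);
  }
}

const queue = new JobQueue();
const seen = [];

// The listener only knows the event name and its payload, not how JobQueue works.
queue.on('job:added', (job) => seen.push(job.name));

queue.add({ name: 'reindex' });
queue.add({ name: 'send-mail' });

console.log(seen); // [ 'reindex', 'send-mail' ]
```

The queue never calls its consumers directly, which is where the loose coupling comes from.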
The distinct types of event managers
There are three main event-driven patterns used in JavaScript. The first one is the well-known observer pattern. Observer objects subscribe to a subject object, and the subject notifies its observers when a change occurs. This is the most tightly coupled event system.
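The observer pattern can be sketched in a few lines. The class and method names here (Subject, LoggingObserver, attach, detach, update) are illustrative choices, not part of any library.

```javascript
// The subject holds direct references to its observers: that is the tight coupling.
class Subject {
  constructor() {
    this.observers = new Set();
    this.state = null;
  }

  attach(observer) {
    this.observers.add(observer);
  }

  detach(observer) {
    this.observers.delete(observer);
  }

  setState(state) {
    this.state = state;
    // Every observer is notified of every change; there is no notion of event type.
    for (const observer of this.observers) {
      observer.update(this);
    }
  }
}

class LoggingObserver {
  constructor() {
    this.log = [];
  }

  // The observer has to implement the method the subject expects.
  update(subject) {
    this.log.push(subject.state);
  }
}

const subject = new Subject();
const observer = new LoggingObserver();

subject.attach(observer);
subject.setState('ready');
subject.detach(observer);
subject.setState('closed'); // not seen: the observer was detached

console.log(observer.log); // [ 'ready' ]
```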
The second pattern is the event model we all know from jQuery or node. A subject provides methods for observers to subscribe to and unsubscribe from event notifications. This is less tightly coupled, as the subject doesn’t have to know anything about its observers. It’s also more granular, because an observer can choose the specific events it’s interested in. However, it’s still coupled, because the observers have to know the subject (to subscribe to its events), and an event is tied to the object that emits it: if several objects emit the same event, the observer has to subscribe to each of them.
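Here is a small sketch of that event model using node’s EventEmitter. The session object and the 'login' / 'logout' event names are assumptions for the example.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical subject: it exposes on()/off() and knows nothing about its listeners.
const session = new EventEmitter();
const log = [];

const onLogin = (user) => log.push(`login:${user}`);
const onLogout = (user) => log.push(`logout:${user}`);

// Observers pick the specific events they care about...
session.on('login', onLogin);
session.on('logout', onLogout);

session.emit('login', 'alice');

// ...and can unsubscribe from one event without touching the others.
session.off('login', onLogin);

session.emit('login', 'bob'); // no listener left for 'login'
session.emit('logout', 'alice');

console.log(log); // [ 'login:alice', 'logout:alice' ]
```

Note that the observer code still needs a reference to session itself, which is the remaining coupling described above.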
The third pattern is pubsub. A third object, the pubsub, sits between the subject(s) and the observer(s). That way, more than one object can emit (we say publish in that case) an event, and the observers don’t need to know which part of the application published it: they just have to know the pubsub and subscribe to the topics of interest.
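A minimal in-process pubsub could look like the sketch below. The PubSub class, the topic name and the two publisher functions are hypothetical, written only to show that publishers and subscribers share nothing but the pubsub object.

```javascript
class PubSub {
  constructor() {
    this.topics = new Map(); // topic name -> Set of handlers
  }

  subscribe(topic, handler) {
    if (!this.topics.has(topic)) {
      this.topics.set(topic, new Set());
    }
    this.topics.get(topic).add(handler);
    // Return a function that removes this subscription.
    return () => this.topics.get(topic).delete(handler);
  }

  publish(topic, message) {
    const handlers = this.topics.get(topic);
    if (!handlers) return 0;
    for (const handler of handlers) {
      handler(message);
    }
    return handlers.size; // number of subscriptions on this topic
  }
}

const bus = new PubSub();
const received = [];

// The subscriber knows the bus and the topic, nothing else.
const unsubscribe = bus.subscribe('cache:invalidate', (key) => received.push(key));

// Two unrelated publishers emit on the same topic.
function userService(pubsub) {
  pubsub.publish('cache:invalidate', 'user:42');
}
function productService(pubsub) {
  pubsub.publish('cache:invalidate', 'product:7');
}

userService(bus);
productService(bus);
unsubscribe();
userService(bus); // nobody is listening any more

console.log(received); // [ 'user:42', 'product:7' ]
```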
Communication between application instances
The pubsub pattern is the best event pattern when communicating between instances of an application, because, well, you don’t have access to all the objects of all instances to bind to their events. Here, decoupling is not a choice, it’s a requirement.
Existing distributed pubsub systems for node.js
I was surprised to find so few modules addressing this need. However, there are some.
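To make this concrete, here is a sketch that simulates two application instances inside a single process. InMemoryBroker is a made-up stand-in for whatever external system carries the messages, and AppInstance is a hypothetical application object. Messages are serialised to JSON to mimic the fact that real instances never share object references.

```javascript
const { EventEmitter } = require('node:events');

// Stand-in for an external broker; a real one would be a networked service.
class InMemoryBroker {
  constructor() {
    this.subscribers = new Map(); // topic -> array of delivery callbacks
  }

  subscribe(topic, deliver) {
    const list = this.subscribers.get(topic) || [];
    list.push(deliver);
    this.subscribers.set(topic, list);
  }

  publish(topic, payload) {
    // Serialise and parse to mimic a network hop: only data crosses, never objects.
    const wire = JSON.stringify(payload);
    for (const deliver of this.subscribers.get(topic) || []) {
      deliver(JSON.parse(wire));
    }
  }
}

// Each instance has its own local emitter and cache, and only knows the broker.
class AppInstance {
  constructor(name, broker) {
    this.name = name;
    this.broker = broker;
    this.local = new EventEmitter();
    this.cache = new Map([['user:42', { name: 'Alice' }]]);

    // Bridge broker messages onto the local emitter.
    broker.subscribe('cache:invalidate', (msg) => this.local.emit('cache:invalidate', msg));
    this.local.on('cache:invalidate', ({ key }) => this.cache.delete(key));
  }

  invalidate(key) {
    this.broker.publish('cache:invalidate', { key, from: this.name });
  }
}

const broker = new InMemoryBroker();
const a = new AppInstance('a', broker);
const b = new AppInstance('b', broker);

a.invalidate('user:42');

console.log(a.cache.has('user:42'), b.cache.has('user:42')); // false false
```

Instance a never touches instance b: both only agree on a topic name and a message shape.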
Faye
Faye is based on the Bayeux protocol. It’s a simple networked pubsub system. However, it requires all nodes to be connected together to work correctly. In a cluster, that means each node is connected to every other one, so for an n-node cluster, each node has to maintain n-1 connections. That is not so cool. Lastly, the Bayeux protocol is “Copyright © The Dojo Foundation (2007). All Rights Reserved”. I don’t think building on it is a good idea.
axon
Axon, created by the JS ninja TJ Holowaychuk, is a building block for implementing all kinds of messaging between node.js instances. It’s terribly well written, and implements some refinements over basic pubsub (called pubemitter/subemitter). However, once again, there is no support for an elastic cloud of applications that have to share events.
CloudJS
CloudJS is closer to the type of software I’m looking for. It uses UDP broadcast packets to communicate between application nodes, and can even migrate the event pool from one node to another (for example, when an instance shuts down). However, the project doesn’t seem to be getting much traction, and the last commit is one year old…
The right tool for the job
How is it possible that such a requested feature doesn’t exist, or isn’t well known (I didn’t go through all the results of the ‘pubsub’ or ‘cloud’ searches on npmjs.org)? Well, the reason behind it is simple: pubsub systems already exist outside of node.js, and they are modern, well known and well tested. And those are… databases.
MongoDB
One of the most popular database systems these days is MongoDB. Guess what? There is a module, mubsub, dedicated to implementing this exact functionality. The Mongo developers even wrote a blog post with some guidelines on the subject.
Redis
Redis is the rock star of key/value stores, as it’s extremely fast, can run entirely in memory, and supports replication. And it natively implements pubsub. There are lots of node.js modules taking advantage of Redis pubsub, for example this one.
PostgreSQL
The SQL king also has its own notification mechanism, allowing a pubsub implementation on top of it. This article (in French) explains it simply.
Conclusion
The node.js ecosystem focuses on its strengths, and delegates other tasks to other systems. I wouldn’t be surprised to see a true cluster module that addresses these kinds of concerns. However, today the best choice seems to be relying on database systems to handle data passing among application instances, and it doesn’t seem so crazy.