
The Web is getting its bytecode: WebAssembly

The next step in the evolution of JavaScript and asm.js is to do away with both of them.

In the quest for ever faster JavaScript, there has been a recurring refrain: why use JavaScript at all?

JavaScript engines have been a major focus of browser developers for some years, and the result has been substantial performance improvements from every vendor. JIT ("just-in-time") compilation, which turns JavaScript code into instructions that can be executed directly on the processor, brought huge speed gains. New data types have been added to the language to reduce the overhead of crunching numbers. Combined with asm.js, a high-performance, limited subset of JavaScript, these let applications running in the browser achieve performance that's comparable with that of native code.
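To make the asm.js idea concrete, here is a minimal sketch of an asm.js-style module that sums 32-bit integers stored in a typed array. The names `SumModule`, `sum`, and `HEAP32` are illustrative assumptions rather than anything from a real project; the `| 0` coercions are how asm.js marks a value as a 32-bit integer. An engine without special asm.js support simply runs it as ordinary JavaScript.

```javascript
// A minimal asm.js-style module. All names here are hypothetical.
function SumModule(stdlib, foreign, heap) {
  "use asm";
  // View the raw buffer as an array of 32-bit integers.
  var HEAP32 = new stdlib.Int32Array(heap);

  function sum(count) {
    count = count | 0;            // declare `count` as a 32-bit integer
    var i = 0;
    var total = 0;
    for (i = 0; (i | 0) < (count | 0); i = (i + 1) | 0) {
      // (i << 2) is the byte offset; >> 2 turns it back into an index.
      total = (total + (HEAP32[(i << 2) >> 2] | 0)) | 0;
    }
    return total | 0;
  }

  return { sum: sum };
}

// Usage: fill a 64KB heap with four integers and add them up.
var heap = new ArrayBuffer(0x10000);
new Int32Array(heap).set([1, 2, 3, 4]);
var sumOfFour = SumModule(globalThis, {}, heap).sum(4);
console.log(sumOfFour); // 10
```

Because every value is pinned to a machine type up front, a compiler never has to guess whether `total` might turn into a string halfway through the loop.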

In spite of these improvements, the question of "why JavaScript?" remains. This is not without reason. The use of JavaScript incurs certain overheads: browsers have to read and interpret a text-based language that was designed for human authors, not for machines. The design of JavaScript itself has features that are suboptimal from a performance perspective; the way a single JavaScript variable may at different times represent a number, a string, or a fragment of HTML means that a JIT compiler may not be able to optimize as aggressively as it would like. The ability to modify the behavior of even built-in objects such as arrays can be similarly problematic.
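The two problems described above can be seen in a few lines. This is only an illustrative sketch; the variable names are made up, and a plain object stands in for an HTML fragment so that the snippet runs outside a browser.

```javascript
// One variable holding three different kinds of value over its lifetime.
var value = 42;
var kinds = [typeof value];          // number
value = "forty-two";
kinds.push(typeof value);            // string
value = { tagName: "p" };            // stand-in for an HTML fragment
kinds.push(typeof value);            // object

// The same operator does different work depending on the operand types,
// so a JIT compiler cannot assume this is always an integer addition.
function add(a, b) {
  return a + b;
}
var numericResult = add(1, 2);       // 3
var stringResult = add("1", 2);      // "12"

// Built-in behavior can be replaced at run time, so even a call to
// Array.prototype.push cannot be assumed to do the standard thing.
var originalPush = Array.prototype.push;
var pushCalls = 0;
Array.prototype.push = function () {
  pushCalls += 1;
  return originalPush.apply(this, arguments);
};
var list = [];
list.push("a");
Array.prototype.push = originalPush; // restore the built-in
```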

JavaScript does have certain important advantages: it's a memory-safe, sandboxed environment, meaning that (browser bugs excepted) JavaScript programs can't escape beyond the confines of the browser to access sensitive data or install persistent malware. JavaScript is also processor independent, so scripts will run just as well on an x86 PC as they will on an ARM smartphone.

However, there are well-known ways of providing the advantages of JavaScript without those perceived downsides: bytecode runtimes such as Java and .NET. Unlike script files, bytecode is a low-level, fairly compact representation of a program, and it is much easier for computers to read and JIT compile. Bytecode systems also tend to map cleanly to the underlying arithmetic capabilities of the processor; they operate on simple integers and floating-point numbers, thereby avoiding the complexity of JavaScript's object system.
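As a rough illustration of why bytecode is easy for a machine to consume, here is a toy stack-machine interpreter. The three opcodes are invented for this sketch and do not correspond to Java, .NET, or any real instruction set; the point is only that the program is a flat array of integers and every operation works on plain 32-bit values.

```javascript
// Hypothetical opcodes for a toy stack machine (not a real bytecode).
var OP_PUSH = 0; // push the following value as a constant
var OP_ADD = 1;  // pop two values, push their 32-bit sum
var OP_MUL = 2;  // pop two values, push their 32-bit product

function run(code) {
  var stack = [];
  var pc = 0;
  var op, a, b;
  while (pc < code.length) {
    op = code[pc++];
    if (op === OP_PUSH) {
      stack.push(code[pc++] | 0);
    } else if (op === OP_ADD) {
      b = stack.pop();
      a = stack.pop();
      stack.push((a + b) | 0);
    } else if (op === OP_MUL) {
      b = stack.pop();
      a = stack.pop();
      stack.push(Math.imul(a, b));
    } else {
      throw new Error("unknown opcode " + op);
    }
  }
  return stack.pop();
}

// The expression (2 + 3) * 4, encoded as eight integers.
var program = new Int32Array([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL]);
var result = run(program);
console.log(result); // 20
```

There is no tokenizing, no parsing of operator precedence, and no question about what type an operand has, which is where much of the loading and compilation advantage comes from.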

As such, there has long been a degree of pressure to use a bytecode system for the browser. Both Microsoft and Sun (now Oracle) did this with .NET and Java, respectively, but these systems both depended on plugins, rather than being integrated into the browser's rendering engine the way JavaScript is. JavaScript programs could directly manipulate HTML objects; the plugins were instead off in their own world, separate from the HTML pages they resided in.

Google built a couple of systems to try to extend the browser beyond JavaScript. Native Client (NaCl) ran x86 (or ARM) programs in a secure sandbox, and Portable Native Client (PNaCl) did the same but using a kind of bytecode instead of x86 or ARM code. However, while Google championed these approaches, other browser vendors largely rebuffed them. JavaScript was the lowest common denominator; it was the one thing that all browsers had to implement, so it was felt that it was better to improve JavaScript than to try to invent a whole new system.

As JavaScript became faster, the browser also became more capable. WebGL, for example, exposed hardware-accelerated 3D graphics to the JavaScript developer. New APIs giving access to game controllers, webcams, and microphones, among other things, have been developed, extending the scope of what JavaScript in the browser can do. Simultaneously, a range of JavaScript-based-but-not-actually-JavaScript languages was devised. Microsoft's TypeScript, for example, offers various language features that Microsoft thinks are useful for the development of large programs by large teams. But browsers don't have to support TypeScript: the TypeScript compiler produces regular JavaScript that can run in any browser.

This kind of wide-ranging usage led Microsoft's Scott Hanselman to dub JavaScript the "assembly language for the Web," a sentiment largely shared by people such as Brendan Eich, who invented JavaScript, and Douglas Crockford, who invented JSON, widely used for JavaScript-based data interchange.

But the people calling for a bytecode for the browser never went away, and they were never entirely wrong about the perceived advantages. And now they're going to get their wish. WebAssembly is a new project being worked on by people from Mozilla, Microsoft, Google, and Apple, to produce a bytecode for the Web.

WebAssembly, or wasm for short, is intended to be a portable bytecode that will be efficient for browsers to download and load, providing a more efficient target for compilers than plain JavaScript or even asm.js. Like .NET bytecode, for example, wasm instructions operate on native machine types such as 32-bit integers, enabling efficient compilation. It's also designed to be extensible, to make it easy to add, say, support for SIMD instruction sets like SSE and AVX.

WebAssembly will include both a binary notation that compilers will produce and a corresponding text notation suitable for display in debuggers or development environments. Early prototypes are already showing some of the expected advantages; the binary representation is 20 times faster to parse than the equivalent asm.js.
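For a sense of what the two notations look like, here is a sketch of a complete module that exports one function adding two 32-bit integers. A caveat: the design was still unsettled at this stage, so the byte encoding, the text form in the comments, and the `WebAssembly.Module` and `WebAssembly.Instance` calls below are taken from the format and JavaScript API as they were later standardized, not from the early prototypes described here. The variable names are illustrative.

```javascript
// Binary notation, with the matching text notation in the comments:
//   (module
//     (func (export "add") (param i32 i32) (result i32)
//       local.get 0
//       local.get 1
//       i32.add))
var wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,                   // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00,                   // version 1
  0x01, 0x07, 0x01,                         // type section, one type:
  0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       //   (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                   // function section: func 0 has type 0
  0x07, 0x07, 0x01,                         // export section, one export:
  0x03, 0x61, 0x64, 0x64, 0x00, 0x00,       //   "add" -> function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,             // code section, one body, no locals
  0x20, 0x00,                               //   local.get 0
  0x20, 0x01,                               //   local.get 1
  0x6a,                                     //   i32.add
  0x0b                                      //   end
]);

// Compile and instantiate synchronously (fine for a module this small).
var wasmModule = new WebAssembly.Module(wasmBytes);
var wasmInstance = new WebAssembly.Instance(wasmModule, {});
var wasmAdd = wasmInstance.exports.add;
console.log(wasmAdd(2, 3)); // 5
```

The whole module is 41 bytes, and the function body is just three instructions operating directly on 32-bit integers.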

The people behind wasm have not forgotten that JavaScript is supported everywhere and wasm is currently not supported anywhere. Their plan is to fill the gap with a polyfill: a script, written in JavaScript, that will convert wasm to asm.js for those browsers that don't have native wasm support. Either the browser will interpret the wasm directly, or it will load the polyfill and execute the resulting asm.js. Native handling should be faster, but the polyfill means that a developer can be sure that a wasm program will always work.
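The either-or decision described above might be sketched as a simple feature check. This is an assumption about how a page could choose a path, not the project's actual loader: `pickWasmStrategy` is a hypothetical helper, and the `WebAssembly.instantiate` name it probes for comes from the API as later standardized.

```javascript
// Hypothetical helper: decide which path a page should take.
// `globalObject` would be `window` in a browser; it is a parameter
// here so the logic can be exercised with stand-in objects.
function pickWasmStrategy(globalObject) {
  var api = globalObject.WebAssembly;
  var hasNativeSupport =
    typeof api === "object" &&
    api !== null &&
    typeof api.instantiate === "function";
  // "native": hand the binary straight to the browser.
  // "polyfill": load the script that translates wasm to asm.js.
  return hasNativeSupport ? "native" : "polyfill";
}

// Stand-ins for a browser with native support and one without.
var browserWithWasm = { WebAssembly: { instantiate: function () {} } };
var browserWithoutWasm = {};

console.log(pickWasmStrategy(browserWithWasm));    // native
console.log(pickWasmStrategy(browserWithoutWasm)); // polyfill
```

Either branch ends with the same program running, which is the guarantee the polyfill is meant to provide; only the speed differs.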

wasm is still in the early stages of development. There's no formal standards body behind it, just an informal community group. The specifications aren't complete, and the high-level design is still being decided. But with all four major browser engine makers working together, the future of wasm should be bright. And the JavaScript skeptics who have been crying out for a bytecode for so long will finally have one.

Listing image by Pablo BD
