Richard Backhouse's Blog

Sunday, October 27, 2013

A Simple Chromecast Media Streamer

The Google Chromecast became available over the summer to great anticipation and excitement.

I decided to buy one for a couple of reasons :

I needed a way to play Netflix on one of my TV's that didn't currently have that ability.
I wanted a way to cheaply and easily play videos that I had sitting on a NAS.

For 2. I had actually made some attempts via alternative methods using a Raspberry Pi running a number of different versions of XMBC. While initially things would look promising, when it actually came to using it anger things fell apart. After reading up on the Chromecast SDK it seemed like I could write my own media streamer fairly simply.

So what follows is details of my attempt. Overall it was an interesting investigation into what can be achieved with the Chromecast with a small amount of code. Some things work well and other things not so much. Particularly the media status listener support seemed somewhat flaky.

The source code for the application is available here. The majority of the chromecast interaction can be found in chromecast.js.

To try it out you must

Checkout the app from github and run npm install
Get your URL(s) whitelisted
Modify receiver.html and chromecast.js setting your application id
Point your whitelisted URL at your modified receiver.html
Add the domain you plan to run the sender web application on to the Chrome browser Chromecast setttings. See the "Whitelisting Chrome Apps" section here
Run the server via "node lib/server.js [port] [path to media]"
Point your Chromecast enabled browser at http://[host]:[port] to load the Media List Page

I decided (at least initially) to go the Google Chrome route for the application.

Receiver

For the receiver I ended up using the sample one from cast-chrome-sender-helloworld. I just had to hook it up to my whitelisted URL and host it somewhere. In the end I just put it on a Jetty instance running on the Raspberry Pi. To make it available I setup a port forward in my home networks router pointing at the Raspberry Pi jetty instance.

Sender

The sender part of the application runs in the Chrome Browser.

It provides access to the list of media available on the NAS (actually this can be any filesystem directory accesible to the web application).

It enables the media to be played, started, stop and paused. It also enables the play position to be set

I wrote the sender in javascript running on Node.js. It's hosted within my home network running on my Raspberry Pi. This part did not have to visible outside of my home and that included the URL's for the media. I was happy to find that I could pass internally based URL's to the receiver that pointed to the node.js server and they would load fine.

The backend part of sender application simply provided a rest handler to :

Get a list of available media
Get and set the currently playing media
Get and set the current activity id so that page reloads of the sender application can pick up on what the receiver is currently doing.

It also used a Connect static registration to enable loading the media from the NAS/filesystem.

For the frontend I decided to use a mobile friendly setup with the hopes that the Chrome browser in IOS and Android will eventually get cast support (neither of them currently do). It uses jQuery, jQuery Mobile, Backbone.js and Underscore.js with all modules in AMD format. My Zazl Optimizer is also used to provide dynamically built and optimized javascript to the frontend.

The media list page looks like this :

You can select one of the list media files to play or navigate to the Playing page.

The playing page looks like this :

You can select the Chromecast Receiver to play on, the current position by setting the slider value, and start, pause and stop the media itself.

Wednesday, August 28, 2013

Zazl AMD Optimizer and Node.js

When I first started writing the Zazl Optimizer I focused on providing a Java HTTP layer that could be used to support dynamic analysis and optimization of AMD based Web Applications. One of the core analyzers within the Zazl AMD Optimizer is written in javascript, so with that in mind it made sense to also provide an HTTP layer written to run in Node.js.
Node.js has a number of static HTML libraries available. If one of those is combined with the Zazl AMD Optimizer you have an environment where you can write your AMD based application and serve up an optimized (concatenated and compressed) version of it to your browser based clients.

One of the most well known and used static HTML libraries is Connect. It's actually used within a large number of middleware Node.js based libraries such as Express. For the Zazl Optimizer's purposes connect is used to serve up any static resource that is not handled by the optimizer. The packaging of the Zazl Optimizer for Node.js provides a Connect based server frontend. The section that starts up the http server looks like this (from the file found here):

    var connectOptimizer = zazloptimizer.createConnectOptimizer(appdir, compress);

    var app = connect()
        .use("/_javascript", connectOptimizer)
        .use(connect.static(appdir))
        .use(connect.static(zazloptimizer.getLoaderDir()));

    http.createServer(app).listen(port);

When creating the optimizer you give it the path to where the JavaScript resources reside and also whether to turn on compression. Also, you can see above that the typical approach is taken to initialize the connect environment. The path is first checked for "_javascript" and directed to the optimizer to be handled if matched. Otherwise a Connect static handler for the specified application directory is search and also one to handle finding Zazl's AMD loader that the application code references. To take advantage of the AMD loader handler simply reference it as follows in the HTML file :

    <script type="text/javascript" src="loader/amd/zazl.js"></script>

That's more or less all there is to setting up usage. You can see more in two sample github repositories, one with Dojo samples and one with JQuery samples. Also both are hosted here and here.

Note: The hosting site (Heroku) puts both apps to sleep after being idle for 1hour. Don't be surprised if the first load(s) takes some time. Subsequent loads will demonstrate the full potential. Alternatively you can download the source from the repositories and run them yourself.

Friday, December 7, 2012

Inlining HTML5 WebWorker source content

I have been playing with HTML5 WebWorkers in an attempt to solve a performance problem and came up with a simple way to inline the WebWorker's source code. The Basics of Web Workers tutorial demonstrates inlining via script tags marked as 'type="javascript/worker"' however this means placing the WebWorker source within the HTML page. I use AMD for my module loading and would like to avoid this approach. With that in mind I came up with the following :

    var webWorkerFunc = function() {
        onmessage = function(e) {
      .......
            postMessage(....);
       }
    };

    var content = webWorkerFunc.toString();
    content = content.substring("function () {".length+1);
    content = content.substring(0, content.lastIndexOf("}"));

    var URL = window.URL || window.webkitURL;    var blob = new Blob([content]);
    var blobURL = URL.createObjectURL(blob);
    var webWorker = new Worker(blobURL);
    webWorker.onmessage = function(e) {
.......
    };

This approach is similar to the one described in the HTML5 Rocks tutorial except that the Function.toString() is used to obtain the WebWorker source code. The function wrapper is stripped and the result passed into the Blob constructor.

Sunday, September 23, 2012

Adventures in source map land

JavaScript source maps are a great solution to the age old problem with JavaScript where you have minified your source code but now you need to debug it. Attempting to step through the minified code can push a developer over the edge :-)

Within the Zazl Optimizer there is support to minify the JavaScript responses that are generated. The compressor interface it provides allows for different compression implementations to be configured. Although I do not have the Google Closure compiler implementation available with the Optimizer codebase I have been experimenting with providing one.

With this in mind I decided to see if I could make the minified JavaScript responses Zazl generates include the "//@ sourceMappingURL=" comments and load the source maps when requested. To begin with the source maps themselves have to be generated. When running the Closure Compiler simply setting a non-null value on the sourceMapOutputPath property of the CompilerOptions object will trigger the source map generation. The JSON string representation of the source map can then be obtained from the sourceMap property of the Results object

CompilerOptions options = new CompilerOptions();
options.sourceMapOutputPath = "";
......
Result result = compiler.compile(extern, input, options);
String compressedSrc = compiler.toSource();
compressedSrc+= "\n//@ sourceMappingURL=_javascript?sourcemap="+path+".map\n";
StringBuffer sb = new StringBuffer();
result.sourceMap.appendTo(sb, "sourceMap");

Note in the code above the attached URL points to the Zazl HTTP handler to obtain the source map for a given module when the HTTP request contains a "sourcemap" parameter.

At this point I should indicate that for performance purposes the Zazl Optimizer does not run the compresser for each JavaScript response it generates. Individual modules are compressed and cached so that when a response is generated it is simply a matter of concatenating the required modules. This results in a single stream of JavaScript with multiple modules and also multiple "//@ sourceMappingURL=" comments separating them.

And this is where things fall apart with this approach. It appears that the Chrome implementation supporting source maps cannot deal with a single JavaScript resource containing multiple modules and multiple sourceMappingURL comments. When run in Chrome I see the debugger hook up the first module it finds in the resource and then it ignores the rest.

At the moment the only solution I can see is for Zazl to stop compressing individual modules and just compress the single JavaScript response generated. This will result in a single resource listed in the debugger, not individual modules, but the minified code will be hooked up correctly to the unminified source. I really don't want to do this as the performance hit will be substantial. A to-do for me is to find out if there is any way I can get Chrome to handle the multiple modules within the single resource. I'll update the post if I find out more.

Update 9/27/2012 :
After posting a message on the Chrome DevTools google group I was pointed to the source map specification where it describes sections. This is exactly what I needed. Instead of writing multiple sourceMappingURL comments the optimizer writes one URL that gets directed to the optimizers javascript servlet with an identifying key for the contents of the response. When the javascript servlet receives the request for the map it generates a JSON object containing the required sections for each module.

The good news is with these changes in place the Chrome debugger now shows and links to all source files correctly. The bad news is that doing other debug tasks, such as setting breakpoints, do not work.

Thursday, July 26, 2012

Zazl Optimizer integrated into Maqetta

The Maqetta project is a great new tool for building HTML5 based user interfaces. One of its features is a "preview" option that allows developers to view the pages they have assembled. This functionality runs as an AMD based webpage loading all of its AMD modules individually. The load time of the preview can be significantly affected when running the preview in a high latency environment as each module load is an individual HTTP request. Typically the fix for this is to run some form of build tool that will concatenate all the modules together so that only one HTTP request is required. However, as the pages are assembled dynamically in Maqetta performing a static build is not really a viable option.

This is where Zazl can help. The Zazl AMD Optimizer supports dynamic optimizations such as module concatenation and can be typically integrated with minimal coding. Maqetta is OSGi based so Zazl must run in its OSGi mode as a set of OSGi bundles.

One of my main goals of the integration was to be as unobtrusive as possible in regard to the Maqetta source modifications. Only 2 core modifications were required :

Modify the generated preview URL to include a "zazl=true" parameter when Zazl is required to handle the preview.
Ensure that a raw version of Dojo was available for Zazl to use. Zazl requires that the AMD modules it analyzes have not been built with another build tool. Unfortunately the Dojo that Maqetta uses for preview has already been run through the Dojo build tool. Maqetta uses an ajaxLibrary Eclipse Extension Point to register paths to different libraries. A new extension instance for the raw Dojo code was added so that it did not interfere with the existing ajaxLibrary extension for the built version of Dojo.

With the Maqetta modifications in place some bootstrap code is required to setup the Zazl runtime so that it can intecept the preview URL requests and ensure that the Zazl AMD loader is used to load the AMD modules. You can see all of the bootstrap code here.

Modifications have to be made to the Preview's HTML page to ensure that the Zazl AMD loader is configured and loaded. A JEE Filter is a great tool for intercepting HTTP requests and responses. A Filter was written and configured within Maqetta to catch the preview requests and look for the "zazl=true" URL parameter. If matched an HTML parser (written using a Java Library called NekoHTML) is used to parse the HTML looking for the Dojo script tag. The parser switches the script tag with one that loads the Zazl AMD loader and also sets up the configuration.

In addition to creating the JEE Filter for the preview the bootstrap code has to ensure that the Zazl javascript servlet is configured and running and also that Zazl Resource Loading requests can find resources within the Maqetta environment. Both the JEE Filter and the Zazl javascript servlet are registered in an OSGi Activator run within a bootstap OSGi bundle called maqetta.zazl. This Activator also creates an instance of a custom Zazl Resource Loader that understands how to obtain resources from the Maqetta environment. Maqetta provides its own virtual directory API that can be used by this custom Resource Loader to obtain URL's to the resources.

The bootstrap code includes one other component. When the preview webpage is loaded it now has a reference to the Zazl AMD Loader. The Maqetta environment must be able to find this resource which resides in one of the Zazl Optimizers bundles. To achieve this the Zazl Optimizer bundle has to register an ajaxLibrary Eclipse Plugin Extension, I didn't want to contaminate the Zazl code with Maqetta specific references so an OSGi fragment bundle was created to add the required Eclipse Metadata. You see this fragment bundle here.

This integration also had to handle how the Zazl OSGi bundles would be integrated into the Maqetta git repository. The Maqetta git repository use submodules to reference its third-party dependencies. Providing direct submodule links to the Zazl git repositories on github would not work well as Zazl itself has a build step that has to be run. I decided the best way to handle this was to provide Zazl Release git repositories hosted on github.

There are 2 staging repositories:

One contains the build output of Zazl with tags marking specific versions.
The other contains the binary dependencies that Zazl requires to run.

This provides a nice controlled way for Maqetta to be upgraded to new versions of the Zazl Optimizer.

You can try all of this out by loading Maqetta. Developer setup details can be found here. The Preview7 Release, when available, will contain Zazl. It will be found here.

Saturday, February 11, 2012

AMD, jQuery and the Zazl Optimizer

Having got my Dynamic Optimizer running with Dojo 1.7 I decided to take a look at how jQuery works in the AMD world. With the release of JQuery 1.7 it became possible to load and reference the core jQuery code as an AMD module. Also, after looking at the jQuery mobile library (note this is currently only available in the 1.1 version that has not yet been released) I found that it too was AMD enabled. So it seemed like the perfect time to get familiar with the jQuery world. I decided I would write a jQuery mobile based frontend to my Music Server application. It already uses the Zazl Optimizer to load a Dojo 1.7 based desktop and mobile frontend.

The first step was to obtain jQuery 1.7.1 and jQuery mobile 1.1. jQuery 1.7.1 can be downloaded from here and jQuery mobile 1.1 can be obtained by build it from it github repostitory. I should note in both cases the uncompressed versions are used as the compression is handled by the Zazl Optimizer itself.

Once downloaded I placed jQuery 1.7.1 in a directory path of "lib/jquery/jquery-1.7.1.js" and jQuery mobile 1.1 in a directory path of "lib/jquery-mobile/jquery.mobile.js" within my Web Application. I also had to obtain the required CSS file for jquery mobile(also the images it references). This I placed in a directory path of "css/jquery-mobile/jquery.mobile.css" and referenced it in the HTML

    <link rel="stylesheet" href="css/jquery-mobile/jquery.mobile.css" />

The HTML front-end code then could simply reference the modules via a call to the Zazl Optimizer's entry point, "zazl". The actual script tag that loads the jQuery code is inserted by an HTML Filter as described here.

    <script type="text/javascript">
        zazl({
            paths : {
                jquery: "lib/jquery/jquery-1.7.1",
                jquerymobile: "lib/jquery-mobile/jquery.mobile"
            }
        },
        ["jquery", "jquerymobile", "app/jqmobile"],
        function($) {
        });
    </script>

Above you can see the main AMD module that handles the application logic called "app/jqmobile". Within that the jQuery core is referenced as follows :

define(['jquery'], function ($) {

    .....
    $(document).ready(function() {
        console.log("ready");
    });

});

That's about all there is to it. jQuery is used just as it is normally.

Sunday, January 15, 2012

Using an HTML Filter to insert javascript tags

The code I have produced for my Zazl JavaScript Optimizer works by generating URL's for HTML script tags. Because of this the developer must use some form of server-side support to generate the HTML resource so the script tags can be inserted. As the code is written in Java the obvious choice for the server-side technology is JSP's. If you are comfortable with writing JSP's and perhaps also plan to use them to insert other dynamic content into the returning HTML resource then they are a good solution, but if you are only interested in using the Optimizer then writing HTML is a simpler choice to pick.

I decided to write an HTML Filter to make adoption of the Optimizer easier. It can be used to insert the required javascript script tag into the HTML resource before it is returned to the requester. All that the developer is required to do is add the javascript that references the Optimizer's AMD loader entry point. This can be via an embedded script within the HTML or via a "main" javascript resource referenced by the HTML via a script tag with a "src" attribute.

Writing the HTML Filter was made fairly straight forward because of great third party open source libraries that are available, NekoHTML and UglifyJS. The HTML Filter itself is written as a JEE Filter. Filters allows the HTTP requests and responses to be modified before and after the HTTP servlet serving the HTML is executed. In this particular case the HTTP response is obtained by the Filter and analyzed before being returned back to the requester.

NekoHTML is an HTML parser written in Java. The Zazl Optimizer HTML Filter uses it to parse the HTML response returned from the WebContainer. The parser allows the filter to find and scan embedded javascript within the HTML. It also allows the filter to identify "main" javascript resources attached to the HTML that might contain the Optimizers AMD loader entry point.

Once these javascript snippets have been obtained the filter uses the javascript parser provided by UglifyJS to parse the code and locate the Optimizer AMD loader entry point. The entry point should provide the ids of the AMD modules that can be considered the top level modules for the page. The Optimizer's analyzer is then used to analyze and generate a javascript tag URL that can be inserted into the HTML response.

In a nutshell that's about all there is to it. I should also mention that the Filter attempts to ensure the HTML resource does not get cached because if any javascript resources required for the page are modified a new javascript tag URL would have to be generated. If the HTML resource is cached the javascript changes would never be picked up by the requester. It attempts to do this by stripping out any request headers that the WebContainer might use to indicate caching is possible.

You can see this in action in the Zazl AMD Optimizer sample WAR, also a wiki page is found here that provides some more details. Alternatively, I have a Music Server Application found here that also demonstrates the Filter and Optimizer in action.

Saturday, January 14, 2012

Optimizing with AMD JavaScript loaders - Part 2

This is the second part, an update might be a better description, for my earlier blog post Optimizing with AMD JavaScript Loaders. Since then Dojo 1.7 is now in GA and I have updated the Zazl Dynamic Optimizer to support optimizing Dojo 1.7 based applications.

I had originally wanted to have the optimizer work generically with AMD compliant loaders and also be able to optimize plugin references using the referenced plugins themselves, however it became obvious that this is not a reasonable goal to achieve with the state that AMD spec is currently in. The result of this is that the optimizer provides its own AMD loader (similar to Almond in that it expects all the modules to be part of the javascript stream) and also supports its own server-side plugin API that allows for custom optimizations to be applied where possible.

The Zazl Optimizer's AMD Loader

As the optimizer ensures all the required AMD modules are part of the javascript stream delivered to the client there is no need for the loader to fully support asynchronous loading. I also found that the loader had to support modules that contained references to the local "require" function. The loader will ensure that modules referenced from "define" calls will be present in the initial javascript stream. If modules contain calls to the local "require" the loader will make additional XHR calls back to the optimizer to load the "required" module along with its dependencies, again in a single response.

Another value-add feature that the loader provides is the ability to preload a cache that is made available to modules via a "cache" property on the require object. The cache property is not part of the AMD spec but the Dojo AMD loader provides this support. This enables optimizations to be made for plugins such as the dojo/text plugin. Using a server-side extension the cache is preloaded with the text value that the plugin will provide, thus avoid and additional XHR call to load the resource.

Configuring the zazl loader

The entry point into Optimizers AMD loader is called "zazl". If follows a similar pattern to how my lsjs AMD loader had defined its entry point. For example :

zazl({
    packages: [
        {
            name: 'dojo',
            location: 'dojo',
            main:'main'
        },
        {
            name: 'dijit',
            location: 'dijit',
            main:'main'
        },
        {
            name: 'dojox',
            location: 'dojox',
            main:'main'
        }
    ]
},
["amdtest/Calendar"],
function(calendar) {
    console.log("done");
});

Additionally, an zazl.json file has to be provided on the server-side.

Here is an example of the zazl.json file :

{
   "bootstrapModules" : ["loader/amd/zazl.js"],
   "debugBootstrapModules" : ["loader/amd/zazl.js"],
   "amdconfig" : {
        "plugins" : {
           "dojo/has" : {
               "proxy": "optimizer/amd/plugins/dojo/has",
               "has" : {
                   "host-browser" : 1,
                   "dom" : 1,
                   "dojo-dom-ready-api" : 1,
                   "dojo-sniff" : 1
               }
           },
           "dojo/text" : {
               "proxy": "optimizer/amd/plugins/dojo/text"
           },
           "dojo/selector/_loader" : {
               "proxy": "optimizer/amd/plugins/dojo/selector/_loader",
               "defval": "dojo/selector/lite"
           },
           "dojox/gfx/renderer" : {
               "proxy": "optimizer/amd/plugins/dojox/gfx/renderer",
               "defval": "dojox/gfx/svg"
           }
        },
       "i18nPluginId" : "dojo/i18n"
    },
   "type" : "amd"
}
Optimizing plugin references

Having given up on loading and running the plugins themselves on the server-side I faced the reality that to support frameworks like Dojo I would have to provide server-side equivalents for some of the plugins the Dojo provides. As the server-side was already providing a commonjs loader environment I decided to write these plugin proxies just as commonjs modules.

A server-side plugin proxy is configured via the zazl.json configuration file.
For example :

"plugins" : {
    "dojo/has" : "optimizer/amd/plugins/dojo/has",
    "dojo/text" : "optimizer/amd/plugins/dojo/text",
    "dojo/selector/_loader" : "optimizer/amd/plugins/dojo/selector/_loader",
    "dojox/gfx/renderer" : "optimizer/amd/plugins/dojox/gfx/renderer"

A plugin proxy can provide two exports :

1) write(pluginName, normalizedName, callback, moduleUrl) - the return value from the callback is written into the javascript stream.
2) normalize(id, config, expand) - the returned value is used to determine if the plugin has additional dependencies that need to be included in the javascript response stream.

The "config" param for both calls is value specified for the "amdconfig" property in zazl.json.

dojo/has

The dojo/has plugin is used to determine whether to dynamically include other modules based on a "has" configuration. This provides a challenge for how to deal with this in an optimized environment. I chose to provide my own server-side version of the has plugin that used its own configuration. It provides a "normalize" function that is used to direct the optimizer to include these additional dependencies. The "has" configuration is provided in the zazl.json configuration file.

dojo/text

The dojo/text plugin is used to load in text resource dependencies. Ideally, in a optimized environment you would want the optimizer to ensure these text resources are included in the javascript response stream in a form that avoids additional downloads. The dojo/text plugin makes use of a "require.cache" property to populate a cache. The zazl AMD loader supports populating the cache by providing an "addToCache" function. The server-side version of the dojo/text plugin writes calls to this function that are included into the javascript response stream.

function jsEscape(content) {
    return content.replace(/(['\\])/g, '\\$1')
        .replace(/[\f]/g, "\\f")
        .replace(/[\b]/g, "\\b")
        .replace(/[\n]/g, "\\n")
        .replace(/[\t]/g, "\\t")
        .replace(/[\r]/g, "\\r");
};

exports.write = function(pluginName, moduleName, write, moduleUrl) {
    var textContent = require('zazlutil').resourceloader.readText(moduleUrl);
    if (textContent) {
        write("zazl.addToCache('"+moduleName+"', '"+jsEscape(textContent)+"');\n");
    }
};

This ensures that when the client-side version it will find the cache value and avoid an XHR call to obtain the resource.

dojo/selector/_loader

The dojo/selector/_loader plugin determines what selector engine to use. The client side version expects a true browser environment to be available to determine the required engine type. This not something that can be determined easily in an optimized environment so for now the server side version I have written provides a "normalize" method that simply returns the default "lite" module id.

exports.normalize = function(id, config, expand) {
return "dojo/selector/lite";
};

The optimizer ensures that this module, along with its dependencies, is included in the javascript response stream. I plan to make the value returned configurable.

dojox/gfx/renderer

The dojox/gfx/renderer plugin works in a similar fashion to the dojo/selector/_loader plugin. It currently returns a default value of "dojox/gfx/svg" via a provided "normalize" function.

i18n plugin support

Optimizing i18n plugin support is more complicated that other types of plugins due to the need to provide i18n message bundles based on the locale of the calling client. The other optimized plugins produce output that can be cached and included for ever request for the same AMD module and its dependencies. The i18n output must be generated for each request made.

Because of this I have made the i18n plugin support a special case and there is specialized code that produces the message bundles. The "dojo/i18n" plugin supports the same format of messages as the requirejs i18n plugin does. The zazl configuration files specifies a property that indicates the module id of the i18n plugin. When the optimizer encounters an i18n plugin reference the details are used by the javascript response renderer to include the specified message bundles based on the locale of calling client. When rendered into the client the client i18n plugin finds that the messages bundles have been loaded in and avoid XHR requests to load them.

What's next ?

Currently the optimizer is only usable in a Java JEE WebContainer environment that requires you to write your frontend HTML resources as JSP's. I now have a working HTML filter that enables developers to write HTML resources instead. The HTML is parsed for AMD module references. The HTML filter inserts the required javascript tag to load the module and all its dependencies based on the parser results. My next blog post will provide more details on how the HTML filter works.
I plan to also provide a version of both the AMD optimizer and HTML filter for node.js.
So far I have focused on making the optimizer support Dojo based applications. As jquery and other javascript frameworks are now supporting AMD I plan to ensure the optimizer supports applications using these frameworks too.

You can get more details on trying out the AMD optimizer here.

Update:

I have improved the configuration such that it is not duplicated in zazl.json. I have also written a blog post about the HTML filtering approach to inserting the required script tags

Saturday, October 29, 2011

AMD - Loading from HTML5 localstorage

One of the best advantages of using AMD as your javascript loader architecture is that it allows for different AMD implementations to be "plugged-in" without requiring modifications to your application code. As more devices provide HTML5 compliant browsers as part of their bundled software the need for more specialized loaders becomes important, a tailored loader will ensure your application loads in the most efficient manor.

HTML5 can play an important part in all of this as it provide a number of "localstorage" specifications that can be leveraged by AMD loaders to load locally within the device if possible.

Over the summer I decided to try this out and the result is an AMD compliant loader that will load its modules from a "localstorage" implementation. It's available here on github.

One goal I had was that the "localstorage" implementation used should be configurable. The HTML5 localStorage API works great however it has a hard limit of 5MB of storage. This limit is quite easy to reach, especially if you are using a large set of uncompressed modules. In addition to the default localStorage implementation the loader provides a Web SQL storage implementation that can specify the size of the storage to be used. While Web SQL is no longer active in the HTML5 spec it is still currently available in all webkit based browsers. That means Google Chrome, Safari and iOS based browsers can use it. Also there is now an IndexedDB storage implementation available too that works in Chrome and Firefox. Alternative custom storage implementations can be written by following the storage implementation spec.

Another goal I had was to make the loader work independently of any server-side requirement. It is usable just by referencing it via a script tag in your HTML file. There is a caveat with using it this way though, when modules need to be updated the whole storage area must be cleared. To overcome this limitation the loader optionally uses a configured "check timestamp" URL to check timestamps on each of the modules it loads. If a returned timestamp for a module is different than the one stored locally for it the module is reloaded from the server. The protocol for the "check timestamp" API is very simple and allows for a variety of server-side technologies to be used to provide this functionality. In fact the lsjs project provides a Java based servlet implemenetation and also a node.js based implementation.

So how does it perform ? When compared with regular AMD based loaders that load modules asynchronously its performance benefit can be seen in higher latency environments where the time taken to load the modules or perform validation based cache calls for the modules can affect load time. The lsjs loader does still have to load each module independently via script injection or eval calls and this can affect load performance, especially in browsers with slower Javascript engines. One thought I have on improving this is to allow the lsjs loader to load all the required modules in a single script injection call or eval call. That would require the loader storing knowledge of the dependency chains involved.

In addition to the load performance improvements I have some other ideas on how the loader can emulate HTML5 appcache. Using AMD plugins other required resources such a text, css and images can be loaded from localstorage too. The advantage over appcache is that the loader has complete control on if and when the cache is used versus loading from the server.

Overall it been a very interesting experiment that has allowed me to learn how to write an AMD compliant loader. As I have found with my work on dynamic server-side optimizers the ability to write specialized AMD loaders (my next blog post will cover more on this) has become a hard requirement.

Saturday, May 21, 2011

Compressing JavaScript

As part of this blog post one of the recommendations I made for obtaining the best performance for downloading your JavaScript is to compress it before returning it to the requesting Web Client. There are a number of options available and I'm going to compare a variety of them to demonstrate their differences.

Before I get into the comparison result here are details on the environment used to run the tests.

The HTML page simply contained a <div> tag with a dojo dijit.Calendar widget attached.
All JavaScript is loaded via the Zazl AMD dynamic optimizer. It is delivered in one single response connected to a single script tag in the page.
When a JavaScript compressor was applied it was on an individual module basis, not on the whole JavaScript response that is returned.
Dojo 1.6 in AMD mode was used for providing the dijit.Calendar Widget.
RequireJS was used for the AMD loader.
Google Chrome was used to load the page and its Developer Tools used to show the size of the JavaScript downloaded
The list of modules returned were as follows

dojo/_base/_loader/bootstrap.js
dojo/lib/backCompat.js
dojo/_base/_loader/hostenv_browser.js
dojo/lib/kernel.js
dojo/_base/lang.js
dojo/_base/array.js
dojo/_base/declare.js
dojo/_base/connect.js
dojo/_base/Deferred.js
dojo/_base/json.js
dojo/_base/Color.js
dojo/_base/window.js
dojo/_base/event.js
dojo/_base/html.js
dojo/_base/NodeList.js
dojo/_base/query.js
dojo/_base/xhr.js
dojo/_base/fx.js
dojo/lib/main-browser.js
dijit/lib/main.js
dojo/i18n.js
dojo/cldr/supplemental.js
dojo/date.js
dojo/regexp.js
dojo/string.js
dojo/date/locale.js
dijit/_base/manager.js
dojo/Stateful.js
dijit/_WidgetBase.js
dojo/window.js
dijit/_base/focus.js
dojo/AdapterRegistry.js
dijit/_base/place.js
dijit/_base/window.js
dijit/_base/popup.js
dijit/_base/scroll.js
dojo/uacss.js
dijit/_base/sniff.js
dijit/_base/typematic.js
dijit/_base/wai.js
dijit/_base.js
dijit/_Widget.js
dojo/date/stamp.js
dojo/parser.js
dojo/cache.js
dijit/_Templated.js
dijit/_CssStateMixin.js
dijit/form/_FormWidget.js
dijit/_Container.js
dijit/_HasDropDown.js
dijit/form/Button.js
dijit/form/DropDownButton.js
dijit/Calendar.js
app/Calendar.js

The types of compression used were :

No Compression
Gzip
Gzip + Simple comment and whitespace removal
Gzip + Dojo Shrinksafe
Gzip + Google Closure
Gzip + Uglify-JS

No Compression

With no compression at all, that is gzip is turned off and the modules are written into the response "as is" results in a transfer of around 755kb. This is quite a sizable chunk considering that all the page contains is a single widget.

Gzip

As you can see just by turning on Gzip results in a 218kb download vs 755kb.

Gzip + Simple Comment and Whitespace removal

Once again quite a signification size reduction (85kb vs 218kb) just by removing comments and whitespace. I should note that I used simple home grown code to remove comments and whitespace. Writing code that reliably removes comments is actually not as straight forward as it would first appear. Ideally using a real JS parser is the best solution but that can increase the compression time. Google Closure does support a "comment and whitespace" removal mode and this will do a thorough job at the potential expense of increased time to compress.

Gzip + Dojo Shrinksafe

Shrinksafe renames locally scoped variable names in addition to comment and whitespace removal. We have now gone from 85kb to 68kb.

Gzip + Google Closure

This result is from using Google Closure with its SIMPLE_OPTIMIZATIONS mode turned on. We see a better results than Shrinksafe going from 68kb to 58kb. Closure also support more advanced optimizations but typically you have to modify your code to be closure friendly. If you plan to exclusively use Closure this might be something to consider.

Gzip + Uglify-JS

This result is from using Uglify-JS in its default mode. As can be seen Uglify-JS compresses almost as well as Closure. The following code was used to execute it.

var jsp = require("uglify-js").parser;
var pro = require("uglify-js").uglify;

var ast = jsp.parse(src);
ast = pro.ast_mangle(ast);
ast = pro.ast_squeeze(ast);
var compressedSrc = pro.gen_code(ast);

Conclusion
It's pretty obvious that compression can provide substantial reduction in the size of the JavaScript code delivered to WebClients. Certainly the best bang for the buck is simply to turn on Gzip, however adding any of the 3 compression engines used here, or even just removing whitespace and comments, will result in much smaller downloads.

Saturday, April 30, 2011

Optimizing with AMD JavaScript loaders

In a previous blog post I talked about writing JavaScript optimizers that run dynamically. That is to say when a Web Application is serving JavaScript resources the server itself is performing optimizations to ensure the JavaScript is returned to the web client in the most efficient way possible,

AMD (Asynchronous Module Definition) is rapidly becoming the preferred type of loader to use when developing Web Client based JavaScript. Its "define' API provides a way for modules to declare what other resources (other JavaScript modules, text, i18n etc) they depend on. This information is key to how an optimizer running in the server can ensure that when a request is received for a JavaScript resource all its dependencies can be included in the response too. This avoids the loader code running in the web client having to make additional HTTP requests for these dependencies.

So how does one obtain this dependency information? Probably the most reliable way is to use a JavaScript Lexer/Parser to parse the code and analyze the calls to the "define" and "require" API's. There are a number of options here (Mozilla Rhino and Google Closure, both Java based, provide AST parsers) however this typically mean having to pick a environment or language. For the AMD Optimizer that I wrote I decided to use the parser that is part of Uglify-JS. As it is written in JavaScript itself I had more flexibility in the environments where I wanted to run it.

The Uglify-JS AST parser API provides a mechanism to walk the AST's that are returned by the parser. The parse call to obtain the AST object is a one time step. You can then walk the AST as many times as you need. The module "process.js" exports an function called "ast_walker". You can use this function to walk the AST with your provided walker function.

Here is an example of an AST walker that obtains the position within a module where a name/id should be inserted into a "define" call if one is not provided (The AMD optimizer I have written uses this walker to perform any name insertions that are required).

var jsp = require("uglify-js").parser;
var uglify = require("uglify-js").uglify;
var ast = jsp.parse(src, false, true);
var w = uglify.ast_walker();
var nameIndex = -1;
w.with_walkers({
    "call": function(expr, args) {
        if (expr[0] === "name" && expr[1] === "define") {
            if (args[0][0] !== "string") {
                nameIndex = w.parent()[0].start.pos + (src.substring(w.parent()[0].start.pos).indexOf('(')+1);
            }
        }
    }
}, function(){
    w.walk(ast);
});

The walker looks for "call" statements that have a name of "define". The "expr" argument provides this information. The "args" argument provides the details of the type and value of the arguments provided to the "call". In this example the code checks that the type of the first argument is not a string and records the position that the name should be inserted. The walker object (w) provides access to the parent object that contains the position in the source code.

Using it in a Dynamic Environment

For the Optimizer that I have written for AMD based environments (the walker code can be seen here) I use a single AST walker function. The walker code itself recursively walks down the chain of modules recording the relevant information that each module provides. Given a module url for a starting point it obtains the information for each dependency module, placing it in a data structure and adding it to a module map using the module uri as a key.

The walker records the following :

For each module a list of JavaScript dependencies
A list of "text" dependencies
A list of "i18n" dependencies
If a module does not contain an id then records its index within the source where and id should be placed.

With this information now available it's possible to write a HTTP handler that can write a single streams of content that contains :

An AMD compliant loader
A "i18n" plugin handler
A "text" plugin handler
Each "text" resource as a separate "define"'d module
Each "i18n" resource (see below for more details)
Each required module in it correct dependency order

Note: All modules with have an id added if one is not present

i18n dependencies
An i18n dependency is declared in the form "i18n!<..../nls/...>". If the required locale value is available (with HTTP this can be obtained by parsing the "accept-language" for the best fit) the set of messages can be provided in separate AMD modules that will be merged together by the i18n plugin. When processing what has to be written into the response stream the Optimizer will provide up to 3 separate modules based on :

The root locale
The intermediate locale
The full locale

For example given the the i18n dependency "i18n!amdtest/nls/messages" and the request locale is "fr-fr" then the Optimizer will look for the :

root module in "amdtest/nls/messages.js"
intermediate module in "amdtest/nls/fr/messages.js"
full module in "amdtest/nls/fr_fr/messages.js"

Seeing it in action
You can see some samples of the AMD optimizer if you download

amdoptimizerjsp.war (load into a JEE webcontainer and access "/amdoptimizerjsp/amdcalendar.jsp" or "/amdoptimizerjsp/amddeclarative.jsp")
zazlnodejs.tar.gz (run node zazlserver.js ./amdsamples ./dojo16 requirejs.json)

Sunday, April 3, 2011

JavaScript Loaders and Dynamic Optimization

If you are using a JavaScript framework to develop browser based applications then the chances are you are using a JavaScript loader that the framework provides. The advantages of using a loader are that you can write your JavaScript code as separate modules and also declare dependencies on other modules. When your application is loaded the frameworks loader will ensure that the modules and their dependencies are loaded in the correct order.

For most of the JavaScript work I have done I have used Dojo as a framework. Dojo provides two core API's that enable modules to be loaded and declare dependencies

dojo.provide() - used to declare the id of your module
dojo.require() - used to declare a dependency on another module

When the dojo.require() statement is called the Dojo framework will see if the module has already been loaded and if not it will load it via an XHR request.

This works well while developing your application, however when it comes time to deploy in a production environment you do not want it making potentially hundred of HTTP requests to load all it modules. Performance will suffer especially in a high latency environment.

To handle this issue most frameworks provide a "build" tool that allows you to package your application and all it depencies into a single resource or multiple resources that can be loaded via a small set of script tags. Dojo provides such a "build" tool.

If you are like me though and don't particularly care for having to run static builds then using a dynamic optimizer is more appealing. Also if your application is one that supports extensibility then using a static build may not even be an option unless you are willing to customize the frameworks build tool. One example of this is Jazz whose Web UI is extensible and also provides a dynamic optimizer that supports providing code contributions via extensions. (I should note that I am the writer of the original Jazz Web UI optimizer).

Concatenating modules and their dependencies together into a easily loadable resource is only one step in obtaining well performing JavaScript loading. Bill Higgins, a former colleague of mine from Jazz wrote an excellent blog post that details some core techniques for optimizing. With this in mind I decided to write an optimizer that would support these objectives :

Given a set of module id's enable the loading of these modules + all their dependencies in a single HTTP request.
Use both validation based caching and expiration based caching when possible.
Load any localization modules that are required using the locale of the client to determine the message files written into the response.
Support a "debug" flag that when passed set to true will ensure each module + its dependencies can be written into the loading HTML response as individual <script> tags thus enabling easy debugging via browser tooling.
Allow the javascript to be compressed as part of writing the HTTP response. Use both gzip and other javascript specific compressors (shrinksafe, uglifyjs, etc).
Support a variety of server-side environments written in Java and Javascript. For example JEE Servlet based (OSGi, JEE WebContainers) and commonjs based environments such as nodeJS.

So far I have what I have talked about covers the old style Dojo sync loader. Dojo is now moving to adopting Asynchronous Module Definition for its loader architecture (1.6 in its source form is AMD compliant in the dojo and dijit namespaces, for 1.7 it should be using an AMD loader by default). This affects 1) and 4) above in how they are implemented.

1) Given a set of module id's enable the loading of these modules + all their dependencies in a single HTTP request.

This requires performing dependency analysis on the set of modules that make up the application. The result is an ordered set of modules that can be used to build a stream of content written into the response to the HTTP request.

For a Dojo sync loader based system this has typically meant using the Dojo bootstrap process with replaced versions of dojo.provide() and dojo. require() that record the id/dependencies. For each module that is loaded a Regular Expression is applied on the source code to obtain the dojo.provide() details and then for each included dojo.require() the dependencies. Regular Expression works quite well in this scenario as the API's are very simple in structure. This is how the Dojo build tool works to build its optimized versions and also the Jazz Web UI Framework.

For an AMD loader based system using Regular Expression, while certainly possible, is not what I would consider the best option as the AMD API is more complex in structure. In this case using a real JavaScript language lexer/parser is a much better solution. As I wanted the optimizer to run in environments such as NodeJS I needed a JavaScript lexer/parser that was written in JavaScript itself. Luckily the excellent Uglify-JS provides one. Similar to how the Dojo sync loader analyzer works each module is parsed and scanned for "require" and "define" API calls, the results of which is recorded to obtain the ordered list of dependencies. One downside to using a true lexer/parser over RegEx is that the performance is affected somewhat, however other sort of optimizations can now be better supported as the information available from the parser is far richer in detail to what the RegEx can provide. For example Dojo is considering using has.js to sniff out features. The parser can be used to remove features identified by the "has" statements thus making the returned code better tailored to the environment that is loading it.

2) Use both validation based caching and expiration based caching when possible.

While returning a single stream of content versus each individual module is a significant performance improvement the size of the content can still take considerable time to be retrieved from the server. Using both validation and expiration based caching helps significantly here. Both techniques require using some form of unique identifier that represents the content being delivered. For my optimization work I decided to use an MD5 checksum value calculated from the JavaScript contents itself.

Validation based caching makes use of some HTTP caching headers. When the JavaScript response is returned an "ETag" header is added that contains the checksum value. When requests are received for the JavaScript content the request is seached for a "If-None-Match" header. If one is found and the value matches the checksum for the modules being returned an HTTP status code of SC_NOT_MODIFIED (304) is set and no reponse written. This indicates to the browser that the contents should be loaded from its local cache.

Expiration based caching takes it one stage further. If it's possible for the HTML page loading the optimized JavaScript content to include a URL that contains the checksum then the HTTP handler that returns the JavaScript content can also set an "Expires" HTTP header that sets the expiration to be sometime in the future. For the optimizer I have written the HTTP handler look for a "version" query parameter and if it matches the relevant checksum value it sets the "Expires" header one year in advanced. This works better than the validation based caching as the browser will not even make an HTTP request for the JavaScript content instead loading from its local cache. To use this type of caching you must have some form of HTML generation that is able to access the optimizer to obtain the checksum values for the set of modules. This will ensure that a URL with the correct checksum value is specified for the script tag loading the JavaScript content. Here is an example of a script tag URL that the optimizer I have written uses. Note it also contains a locale parameter that ensures applications using i18n modules can use the caching too.

/_javascript?modules=app/Calendar,&namespaces=&version=63ffd833cbe0a7ded0b92a124abd437a&locale=en_US

3) Load any localization modules that are required using the locale of the client to determine the message files written into the response.

The Dojo framework allows developers to use i18n modules for their messages. This means that simply changing the browsers configured locale will show the language specific messages.

Dojo sync loader based applications use the dojo.requireLocalization() API to identify dependencies on i18n modules. Likewise in an AMD environment quite a few of the implementations provide an i18n plugin that allows developers to prefix i18n dependency specifiers with "i18n". Using similar techniques that were used for obtaining the JavaScript module dependencies the optimizer I have written gather these i18n dependencies too. The HTTP JS Handlers I have written look for a locale query parameter and use that value to determine which language specific i18n modules to write into the response. This means that having to return i18n modules for all locales can be avoided although it does require that the mechanism responsible for writing the applications HTML containing the script tags must be able to obtain the locale information from the HTTP request (via the "accept-language" HTTP header)

4) Support a "debug" flag that when passed set to true will ensure each module + its dependencies can be written into the loading HTML response as individual <script> tags thus enabling easy debugging via browser tooling.

What has been described so far is great for obtaining the quick loading JavaScript however it is not particularly friendly for developer usage. Debugging code that is concatenated together and also potentially compressed using a JavaScript compression tool is not fun at all.

Dealing with this in an AMD based environment is actually very simple as one of the goals of AMD is to enable loading of modules indivually. What this means is that the debug flag is used just to ensure that neither JavaScript compression or concatenation of modules is applied.

For the Dojo sync loader environments my optimizer will allow HTML generation mechanisms to obtain the list of dependencies and write script tags for each into the HTML generated. This means that debugging tools will see each module independently.

5) Allow the javascript to be compressed as part of writing the HTTP response. Use both gzip and other javascript specific compressors (shrinksafe, uglifyjs, etc)

If your HTTP handler's environment supports gzip then it is a very simple way to significantly reduce the size of the JavaScript content written back to the browser. This typically involves using a gzip stream of some form that the content is written into and then written out into the returning HTTP response. Browser that support gzip will provide the HTTP header "Accept-Encoding" including the value "gzip".

In addition to this using a JavaScript compression tool can reduce the content size significantly too. Tools such as shrinksafe, uglifyjs and Google Closure all provide good compression results. The optimizer I have written enables different compression tools to be plugged into the resource loading step of the process. The HTTP handler responsible for writing the JavaScript content uses these JS compression enabled resource loaders.

6) Support a variety of server-side environments written in Java and Javascript. For example JEE Servlet based (OSGi, JEE WebContainers) and commonjs based environments such as nodeJS.

I'm not going to go into too much detail here as I will probably write more about this in future blog posts but briefly I will say that where possible the optimizer I wrote uses JavaScript to do any optimization work. I have written Java binding and also NodeJS bindings. Both sets of bindings use a common set of javascript modules. All the Java modules can be used in both OSGi and POJO environments.

Seeing the optimizers in action

The optimizers I have written are available in the Dojo Zazl project. The easiest way to see them is via the samples that use the Zazl DTL templating engine for HTML generation. You can use the README for setup help and this site too.

For a Dojo sync loader example run the "personGrid" sample in the zazlsamples WAR file or from the zazlnodejs package using this command :

node zazlserver.js ./samples ./dojo15

For an AMD loader using RequireJS run the zazlamdsamples WAR file or run the zazlnodejs package using this command

node zazlserver.js ./amdsamples ./dojo16 requirejs.json

Use a tool like Firebug to observe the JavaScript tag load. You should see subsequent requests load from the cache be significantly faster.

Also, you can see the i18n support in action by selecting a different locale. In the sync loader "personGrid" sample you can see console messages displayed in the language currently selected. In the AMD samples you should observe the calendar widget change.