Saturday, March 26, 2011

Writing your own CommonJS module loader

As more and more JavaScript code uses the CommonJS module spec for loading, it became apparent that the work I had been doing in the Dojo Zazl project was going to have to support loading CommonJS modules. One of the driving factors for me was wanting to use the fantastic JavaScript compressor and parser Uglify-JS, which is itself written in JavaScript.

The Uglify-JS code uses "require" for its dependency loading. If I wanted to use it in my Zazl code, I would have to either modify the Uglify-JS code to remove the "require" statements and load the dependencies manually, or make the Zazl JavaScript loader CommonJS module compliant. I was also interested in making my Zazl code run in NodeJS, and that meant supporting CommonJS modules in the Zazl JavaScript too.

The existing Zazl JavaScript loader is implemented as a native method called "loadJS". When using Rhino, "loadJS" uses the Rhino APIs to compile and load; likewise for V8, its APIs are used. Loading a CommonJS module is more involved than this though. In addition to compiling and loading, it minimally has to:
  1. Provide a sandbox for the module such that access to other globals is restricted
  2. Provide a "require" method to enable loading of other modules. 
  3. Track paths such that requests to "require" containing relative paths (starts with ./ or ../) are resolved correctly
  4. Provide an "exports" variable that the module can attach its exports to.
  5. Provide a "module" variable that contains details about the current module.
My current loadJS support now had to be somewhat smarter to support these requirements. The solution I came up with was to write a common "loader" JavaScript module that can be used in both Rhino and V8 environments. In support of this new loader code, a new native method called "loadCommonJSModule" had to be provided for both Rhino and V8. In addition to the "path" parameter passed to "loadJS", this new method also expects a "context" object that contains the values for "exports" and "module". The native method ensures that this context object is used as the module's parent context, basically its global object.
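
To make requirement 3 concrete, here is a minimal Java sketch of how a relative module id might be resolved against the id of the module that called "require". The names ModuleIdResolver and resolve are made up for this illustration and are not taken from the Zazl loader, which does this work in JavaScript.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper: resolves a requested module id against the id of the requiring module.
final class ModuleIdResolver {
    static String resolve(String parentId, String requested) {
        // Ids that do not start with ./ or ../ are top-level and are used as-is.
        if (!requested.startsWith("./") && !requested.startsWith("../")) {
            return requested;
        }
        // Start from the "directory" of the parent id, i.e. everything but its last term.
        Deque<String> terms = new ArrayDeque<>();
        String[] parentTerms = parentId.split("/");
        for (int i = 0; i < parentTerms.length - 1; i++) {
            terms.addLast(parentTerms[i]);
        }
        for (String term : requested.split("/")) {
            if (term.isEmpty() || term.equals(".")) {
                continue; // "." stays in the current directory
            }
            if (term.equals("..")) {
                if (terms.isEmpty()) {
                    throw new IllegalArgumentException(requested + " escapes the module root");
                }
                terms.removeLast(); // ".." moves up one level
            } else {
                terms.addLast(term);
            }
        }
        return String.join("/", terms);
    }
}
```

With this sketch, a module "a/b/c" that requires "./d" would get "a/b/d", and one that requires "../d" would get "a/d".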

Doing this in Rhino is fairly straightforward. The Java code that supports the "loadCommonJSModule" call has to create an org.mozilla.javascript.Scriptable object. The CommonJS loader makes the call:

loadCommonJSModule(modulePath, moduleContext);

And the Rhino-based Java code does this:

Scriptable nativeModuleContext = Context.toObject(moduleContext, thisObj);
classloader.loadJS(resource, cx, nativeModuleContext);

The thisObj parameter is what Rhino passes to the Java invocation of "loadCommonJSModule". The classloader object is a RhinoClassLoader (see another blog post for more details), which is used to create an instance of the module script that uses the moduleContext object as its scope.

In V8 it's a little more involved. A new V8 Context is created for each module load. Creating Contexts in V8 is cheap, so the performance overhead should be minimal. Each attribute in the provided moduleContext is copied into the new Context object, which is then used to run the module script. The following are some code snippets from the v8javabridge.cpp file.

// Create a fresh context for the module and copy "require" over from the calling context.
v8::Handle<v8::ObjectTemplate> global = CreateGlobal();
v8::Handle<v8::Context> moduleContext = v8::Context::New(NULL, global);
v8::Handle<v8::Value> requireValue =
    context->Global()->Get(v8::String::New("require"));
v8::Context::Scope context_scope(moduleContext);
moduleContext->Global()->Set(v8::String::New("require"), requireValue);

// Copy every property of the supplied module context (args[1]) onto the new global object.
v8::Local<v8::Object> module = args[1]->ToObject();
v8::Local<v8::Array> keys = module->GetPropertyNames();

for (unsigned int i = 0; i < keys->Length(); i++) {
    v8::Handle<v8::String> key = keys->Get(v8::Integer::New(i))->ToString();
    v8::Handle<v8::Value> value = module->Get(key);
    moduleContext->Global()->Set(key, value);
}

You can see the JavaScript loader code here

Running CommonJS code in Zazl is now just a matter of:

loadJS('/jsutil/commonjs/loader.js');
require('mymodule');

For validation I used the CommonJS set of unit tests to check that my loader conforms to the spec.

I have some gists that provide both a RhinoCommonJSLoader and a V8CommonJSLoader. Using a Multiple Rooted Resource Loader, it's pretty easy to run CommonJS-based code from Java:

File root1 = new File("/commonjs/example");
File root2 = new File("/commonjs/runtime");
File[] roots = new File[] {root1, root2};

ResourceLoader resourceLoader = new MultiRootResourceLoader(roots);
try {
    RhinoCommonJSLoader loader = new RhinoCommonJSLoader(resourceLoader);
    loader.run("program");
} catch (IOException e) {
    e.printStackTrace();
}

The /commonjs/runtime root must contain the contents of the Zazl jsutil JavaScript directory, laid out so that the runtime directory contains the jsutil directory. You can place a "program.js" and its dependencies in the /commonjs/example directory and the loader will perform a "require('program');" to load it.
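
The idea behind a multiple rooted loader is simply to try each root in order and return the first match. The sketch below shows that lookup in Java; it is not the MultiRootResourceLoader from the gist. To keep it runnable without touching the disk, each root is modelled as an in-memory map from path to source text, and the names MultiRootLookup and find are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a multi-rooted loader: each root maps a relative path to its source text.
final class MultiRootLookup {
    private final List<Map<String, String>> roots;

    MultiRootLookup(List<Map<String, String>> roots) {
        this.roots = roots;
    }

    // Searches the roots in the order given; the first root that has the path wins.
    Optional<String> find(String path) {
        for (Map<String, String> root : roots) {
            String source = root.get(path);
            if (source != null) {
                return Optional.of(source);
            }
        }
        return Optional.empty();
    }
}
```

Because earlier roots shadow later ones, putting the example root ahead of the runtime root means an application file would take precedence over a runtime file with the same path.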

Tuesday, March 22, 2011

Making the most of using the Mozilla Rhino Javascript Engine

If you are using the Mozilla Rhino JavaScript Engine to run your JavaScript code from Java then you have probably found that performance is not one of its strong points. There are, however, a few things you can do to improve it, especially if you are repeatedly creating contexts that use the same script files.
  1. Load JavaScript resources from some form of caching resource loader.
  2. Compile scripts into org.mozilla.javascript.Script objects and reuse them instead of compiling the script each time.

1) This should be pretty obvious, but using a caching resource loader ensures that JavaScript resources are not read from disk multiple times. If you use a single caching resource loader for a given application then you will end up loading a given resource only once from disk. The Dojo Zazl Project source code contains a simple implementation of a caching resource loader. The resource loader interface and the caching implementation can be seen here.

2) Rhino provides an API for compiling JavaScript resources into Script objects. If your usage of Rhino involves loading the same set of JavaScript resources multiple times then using the compiled Script to instantiate instances improves performance significantly.

For the Dojo Zazl project I wrote a RhinoClassLoader class that enables the compiling of the scripts and also uses a classloader as a script class cache store. Here is a snippet from the code showing the compilation:

import org.mozilla.javascript.CompilerEnvirons;
import org.mozilla.javascript.optimizer.ClassCompiler;

ClassCompiler classCompiler = new ClassCompiler(new CompilerEnvirons());
// The result alternates class names and class bytes; element 1 holds the bytecode of the main class.
Object[] classBytes = classCompiler.compileToClassFiles(resource, fileName.replace('-', '_'), 1, name.replace('-', '_'));
byte[] mainClassBytes = (byte[]) classBytes[1];
Class c = defineClass(name.replace('-', '_'), mainClassBytes, 0, mainClassBytes.length);

You can see the full code for it here.

The loadClass method of the RhinoClassLoader uses the classloader cache by simply calling the ClassLoader class's findLoadedClass method:

Class<?> c = findLoadedClass(name.replace('-', '_'));
if (c != null) {
    return c;
}

Usage is simply:

RhinoClassLoader rcl = new RhinoClassLoader(resourceLoader);
Object scriptInstance = rcl.loadJS(uri, context, thisObj);

with the returned object being the instance of the script. The RhinoClassLoader uses the resourceLoader instance to load the JavaScript resource. If you use a caching version of the resource loader then you improve the efficiency even more.
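
As a rough illustration of the caching idea, here is a minimal sketch of a caching loader that wraps another loader and reads each path only once. It is not the Zazl caching implementation: the class name CachingSourceLoader is hypothetical, and the underlying loader is modelled as a plain java.util.function.Function from path to source text rather than the real ResourceLoader interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical caching wrapper: the delegate does the real (slow) read, the map remembers the result.
final class CachingSourceLoader {
    private final Function<String, String> delegate;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingSourceLoader(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    // Calls the delegate only the first time a path is requested; later calls hit the cache.
    String load(String path) {
        return cache.computeIfAbsent(path, delegate);
    }
}
```

A single instance of such a loader would need to be shared across the application for the "load once" behaviour to hold, and a real implementation would also need some way to invalidate entries when files change.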

Monday, March 21, 2011

Using the Google V8 Javascript Engine in Java

When you want to run JavaScript code in a Java environment, the only option you really have is to use the Mozilla Rhino JavaScript Engine. It has some great features but performance is quite lacking, especially when compared to a native engine such as Google's V8 engine. So what if you wanted to run JavaScript code in V8 from Java?

As part of the work I did for the Dojo Zazl Project I investigated using V8 for the JavaScript engine when executing requests for DTL templates. This consisted of writing a JNI layer on top of the V8 C++ API. This part was fairly straightforward and was really just an exercise in writing JNI code. There is a pretty good embedding guide here that explains the V8 concepts.

But what if you wanted to make calls back into your Java code from your JavaScript code? In Rhino this is very easy to achieve. It's also easy in V8 if you know the names of the methods you want to call and their corresponding signatures. This does not make it something that is easily extendable though.

For the Zazl project I wanted to produce something that was more flexible. What I ended up writing was a V8 Java Bridge that allows you to run any JavaScript code you want and also call back into any provided Java methods. The only restriction is that the signature of the methods has to be fixed. Because of this, JSON is used for the method parameters and also for the return value.

Using the bridge is a simple matter of writing a class that extends org.dojotoolkit.rt.v8.V8JavaBridge. You can see the code for it here. You must provide your own readResource method, which is responsible for loading the JavaScript resources that the JavaScript engine requests:

public String readResource(String path, boolean useCache) throws IOException {
    ......
}

The Zazl project provides a number of implementations of a ResourceLoader interface for different types of environments (JEE WebApplications, OSGi). There are also some gists that provide examples of File-based ResourceLoaders (example).

Running a script looks like this:

StringBuffer sb = new StringBuffer();
sb.append("var v = test(JSON.stringify({input: \"Hello\"})); print(v);");
try {
    runScript(sb.toString(), new String[]{"test"});
} catch (V8Exception e) {
    e.printStackTrace();
}

The call to runScript passes the name of a callback method (in this example "test"). Within the JavaScript code the test method is called. Note that the parameter must be a stringified JSON object, and the return value will also be a stringified JSON object. For this example the test method looks like:


public String test(String json) {
    try {
        Map<String, Object> input = (Map<String, Object>)JSONParser.parse(new StringReader(json));
        System.out.println("json input value = "+input.get("input"));
        Map<String, Object> returnValue = new HashMap<String, Object>();
        returnValue.put("returnValue", "Hello Back");
        StringWriter sw = new StringWriter();
        JSONSerializer.serialize(sw, returnValue);
        return sw.toString();
    } catch (IOException e) {
        e.printStackTrace();
        return "{}";
    }
}

The whole example can be seen in this gist.

The V8JavaBridge class also contains runScript methods that accept an object reference in addition to the method names, so that a more generic extender of the bridge can be passed callback methods.
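
To show why a fixed String-in, String-out signature makes callbacks easy to dispatch against an arbitrary object, here is a small reflection-based sketch. It is only an illustration of the idea, not the bridge's actual code (which makes the call from JNI); JsonCallbackDispatcher and EchoCallbacks are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical dispatcher: looks up "String name(String json)" on a target object and invokes it.
final class JsonCallbackDispatcher {
    private final Object target;

    JsonCallbackDispatcher(Object target) {
        this.target = target;
    }

    String invoke(String methodName, String json) {
        try {
            // Every callback shares one signature, so the name alone is enough to find the method.
            Method method = target.getClass().getMethod(methodName, String.class);
            if (method.getReturnType() != String.class) {
                throw new IllegalArgumentException(methodName + " must return String");
            }
            return (String) method.invoke(target, json);
        } catch (NoSuchMethodException | IllegalAccessException e) {
            throw new IllegalArgumentException("No usable callback named " + methodName, e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException("Callback " + methodName + " failed", e.getCause());
        }
    }
}

// Example callback holder; it wraps the incoming JSON text in a new JSON object.
final class EchoCallbacks {
    public String test(String json) {
        return "{\"echo\":" + json + "}";
    }
}
```

Calling new JsonCallbackDispatcher(new EchoCallbacks()).invoke("test", json) would route the JSON string to the test method without the dispatcher knowing anything else about the target class.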


For the Zazl project I have produced native libraries for 32-bit Linux, 32-bit Windows and both 64-bit and 32-bit Mac. These native libraries must be accessible to the JVM (in the same directory as the JVM invocation or via -Djava.library.path). Alternatively, if you run in an OSGi environment you can use the org.dojotoolkit.rt.v8 Bundle. You can get the native libraries from here.

Details on building the Java code can be found on the main github page for the Zazl Project. If you build the org.dojotoolkit.rt.v8.feature and org.dojotoolkit.server.util.feature features you will have POJO JAR files that you can use in a variety of different Java environments.

One thing that should be noted is that V8 is single threaded. Because of this, the JNI code has to ensure synchronization via its v8::Locker object. An unlock occurs while any Java callback is in progress, so that the lock is only in effect while the V8 engine is actually running JavaScript code. As the V8 engine is so fast I have not seen any noticeable issues with this so far, but it is something that has to be considered when deciding when and what code is run via V8.
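
The locking pattern can be illustrated in plain Java. In this sketch a ReentrantLock stands in for the V8 lock: it is held while "script" code runs and released for the duration of a callback. The real bridge does this in C++ with v8::Locker; EngineLock, runScript and callOut are hypothetical names, and the sketch assumes callOut is only ever called from directly inside one runScript call.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical model of the engine lock: held while script code runs, released around callbacks.
final class EngineLock {
    private final ReentrantLock lock = new ReentrantLock();

    // Runs "script" code with the engine lock held, so only one thread is in the engine at a time.
    <T> T runScript(Supplier<T> script) {
        lock.lock();
        try {
            return script.get();
        } finally {
            lock.unlock();
        }
    }

    // Releases the lock while a callback runs, then re-acquires it before returning to the script.
    <T> T callOut(Supplier<T> callback) {
        if (lock.getHoldCount() != 1) {
            throw new IllegalStateException("callOut must be called from inside a single runScript");
        }
        lock.unlock();
        try {
            return callback.get();
        } finally {
            lock.lock();
        }
    }

    boolean isHeldByCurrentThread() {
        return lock.isHeldByCurrentThread();
    }
}
```

The point of the design is that a slow callback does not block other threads from running scripts, at the cost of re-acquiring the lock each time a callback returns.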