Wednesday, November 30, 2011

Concurrent use of embedded Ruby in Java (using JRuby)

Last night I was finishing up the map/reduce capabilities within Virgil. We hope to allow people to post Ruby scripts that will then get executed over a column family in Cassandra using map/reduce. To do that, we needed concurrent use of a ScriptEngine that could evaluate the Ruby script. In the code snippets below, script is a String that contains the contents of a Ruby file with a method definition for foo.

First, I started with JSR 223 and the ScriptEngine with the following code:

public static final ScriptEngine ENGINE = new ScriptEngineManager().getEngineByName("jruby");
ScriptContext context = new SimpleScriptContext();
Bindings bindings = context.getBindings(ScriptContext.ENGINE_SCOPE);
bindings.put("variable", "value");
ENGINE.eval(script, context);

That worked fine in unit testing, but when used within map/reduce I encountered a deadlock of sorts. After some googling, I landed in the RedBridge documentation. There I found that JRuby exposes a lower-level API (beneath JSR 223) that offers concurrent processing features. I swapped the above code for the following:

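// One container, created with the CONCURRENT scope, shared by all callers
this.rubyContainer = new ScriptingContainer(LocalContextScope.CONCURRENT);
// Evaluate the script once and keep the returned receiver
this.rubyReceiver = rubyContainer.runScriptlet(script);
// Invoke foo on the shared receiver
rubyContainer.callMethod(rubyReceiver, "foo", "value");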

That let me use a single engine for multiple concurrent invocations of the method foo, which is defined in the Ruby script.

This worked like a charm.
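
The calling side of that arrangement can be sketched with nothing but the JDK. This is only an illustration of the threading pattern, not Virgil's map/reduce code: JRuby is not on the classpath here, so a plain java.util.function.Function stands in for the call to rubyContainer.callMethod(rubyReceiver, "foo", value), and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class SharedReceiverSketch {

    // Calls one shared function from several worker threads and
    // returns the results in the same order as the inputs.
    static List<String> invokeConcurrently(Function<String, String> foo,
                                           List<String> inputs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                // Every task uses the same 'foo'; nothing is re-evaluated per call.
                Callable<String> task = () -> foo.apply(input);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("invocation failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // String::toUpperCase is a stand-in (assumption) for the Ruby method foo.
        System.out.println(invokeConcurrently(String::toUpperCase, List.of("a", "b", "c"), 3));
    }
}
```

With JRuby available, the function passed in would wrap the callMethod call shown above, so the script is evaluated once and only the method invocation happens on each thread.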

Monday, November 28, 2011

Introducing run-modes to Virgil: support for embedded or remote Cassandra instances

Since Virgil was originally developed as an embedded REST layer for the Cassandra Server, it ran as a daemon inside the server and performed operations directly against the CassandraServer classes. Running in a single JVM had some performance gains over a separate server that communicated over Thrift (either directly or via Hector), since operations didn't have to take a second hop across the network (with the associated marshalling/unmarshalling).

We had a request come in to add the ability to run Virgil against a remote Cassandra:

That seemed reasonable, since there are a lot of existing Cassandra clusters and users may just want to add a REST layer to support web app/GUI access or SOLR integration.

To support those cases, we added run-modes to the configuration:

Let us know what you think.

Monday, November 21, 2011

Virgil: GUI for Cassandra now included in Virgil

Sure, it's read-only.
Sure, it's focused on Strings.

But it was written in only 100 lines of code using Virgil's REST layer for Cassandra, and it includes all of ExtJS's goodness (if you are into that kind of thing).

You can see that the entire GUI is contained in a single JavaScript class:

That JavaScript uses two GridPanels: one to display column families grouped by keyspace (in the east region panel), and another to display columns grouped by row key (in the center panel). Each GridPanel uses a store backed by an ExtJS model.

To accommodate the GUI, we added fetch capabilities to the REST layer for both schema information and rows using key ranges. I'll detail those capabilities in a follow-up post.

For instructions on how to access the GUI, and to see what it looks like, check out the wiki page.

Even in its existing state, this is a useful GUI for quickly inspecting the contents of a Cassandra node. It is also a good demonstration of how you might include a JavaScript visualization component in your own application with very little effort.

Virgil now includes an elementary REST interface (thanks to Dave Strauss @ Pantheon for his help defining the interface). It also includes simple SOLR integration and a GUI. Next up, map/reduce for the masses via REST. Stay tuned.

As always, comments and contributions welcome and appreciated.

Thursday, November 10, 2011

PATCH methods on JAX-RS

We added PATCH semantics for Virgil.

This was fairly straightforward, except that we needed to add support for a @PATCH annotation and a PatchMethod for HttpClient.

To do this, we created a PATCH annotation, the contents of which are shown below:

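@HttpMethod("PATCH") // also needs @Target(ElementType.METHOD) and @Retention(RetentionPolicy.RUNTIME)
public @interface PATCH {}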
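
JAX-RS recognizes a custom verb annotation through the javax.ws.rs.HttpMethod meta-annotation, which the container reads reflectively at runtime. The sketch below tries to make that lookup visible. It is not Virgil's source: to compile without a JAX-RS jar it declares its own stand-in HttpMethod annotation, and PatchAnnotationSketch, patchRow and verbOf are hypothetical names used only for the illustration.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Stand-in (assumption) for javax.ws.rs.HttpMethod, so no JAX-RS jar is needed.
@Target(ElementType.ANNOTATION_TYPE)
@Retention(RetentionPolicy.RUNTIME)
@interface HttpMethod {
    String value();
}

// The custom verb. RUNTIME retention is what lets it be read via reflection.
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@HttpMethod("PATCH")
@interface PATCH {
}

final class PatchAnnotationSketch {

    @PATCH
    public void patchRow(String body) {
        // Hypothetical resource method; a real one would update the row.
    }

    // Returns the HTTP verb bound to the named public method, or null if none is declared.
    static String verbOf(Class<?> type, String methodName) {
        for (Method method : type.getMethods()) {
            if (!method.getName().equals(methodName)) {
                continue;
            }
            for (Annotation annotation : method.getAnnotations()) {
                // Look for the meta-annotation on the annotation's own type.
                HttpMethod verb = annotation.annotationType().getAnnotation(HttpMethod.class);
                if (verb != null) {
                    return verb.value();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(verbOf(PatchAnnotationSketch.class, "patchRow"));
    }
}
```

In a real service the stand-in would be dropped in favor of the javax.ws.rs import, and the container would perform the equivalent of verbOf itself.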

This then allows us to use @PATCH as an annotation on a REST service method.

@PATCH
@Produces({ "application/json" })
public void patchRow(@PathParam("keyspace") String keyspace,
        @PathParam("columnFamily") String columnFamily, @PathParam("key") String key,
        @QueryParam("index") boolean index, String body) throws Exception

That worked like a charm. Then we needed to call it using HttpClient. To do that, we created a PatchMethod class that extended PostMethod. You can see that here.

Then we could use that just like any other HTTP method.

PatchMethod patch = new PatchMethod(BASE_URL + KEYSPACE + "/" + COLUMN_FAMILY + "/" + KEY);
requestEntity = new StringRequestEntity("{\"ADDR1\":\"1235 Fun St.\",\"COUNTY\":\"Montgomery\"}",
        "application/json", "UTF-8");
// Attach the JSON body to the PATCH request
patch.setRequestEntity(requestEntity);

Hope that helps people.
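
For readers on Java 11 or later, a similar request can be built without subclassing PostMethod. This swaps in a different technique from the one described above: it uses the JDK's own java.net.http API rather than Commons HttpClient. It is a minimal sketch; the URL is a placeholder, and PatchRequestSketch and buildPatch are names invented for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

final class PatchRequestSketch {

    // Builds (but does not send) a PATCH request carrying a JSON body.
    static HttpRequest buildPatch(String url, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("Content-Type", "application/json")
                // The builder has no PATCH() shortcut, so the verb is named explicitly.
                .method("PATCH", BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder URL (assumption); substitute your base URL, keyspace, column family and key.
        HttpRequest request = buildPatch(
                "http://localhost:8080/example/keyspace/columnFamily/key",
                "{\"ADDR1\":\"1235 Fun St.\",\"COUNTY\":\"Montgomery\"}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a matter of passing the request to java.net.http.HttpClient.send with a body handler of your choice.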

Virgil: PATCH semantics added to REST layer for Cassandra

Virgil now supports PATCH semantics for row updates in Cassandra via REST.

In REST, when an HTTP operation modifies a resource rather than fully replacing it, the appropriate method is PATCH, which the IETF defines in RFC 5789.

Virgil now allows users to use this HTTP method to add and modify columns in a single request (without resending the entire row). We've included an example in the Getting Started instructions.

Likewise, PUT operations will now replace the entire row, per HTTP semantics.
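
The difference between the two verbs can be illustrated with a row modeled as a map of column names to values. This is a sketch of the semantics only, not of how Virgil talks to Cassandra; RowUpdateSketch and its methods are hypothetical, and the STATE column is made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RowUpdateSketch {

    // PUT: the stored row becomes exactly the columns in the request body.
    static Map<String, String> put(Map<String, String> existing, Map<String, String> body) {
        return new LinkedHashMap<>(body);
    }

    // PATCH: columns in the body are added or overwritten; all others are kept.
    static Map<String, String> patch(Map<String, String> existing, Map<String, String> body) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        merged.putAll(body);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("ADDR1", "1234 Fun St.");
        row.put("STATE", "PA");

        Map<String, String> body = new LinkedHashMap<>();
        body.put("ADDR1", "1235 Fun St.");
        body.put("COUNTY", "Montgomery");

        System.out.println("PATCH -> " + patch(row, body));
        System.out.println("PUT   -> " + put(row, body));
    }
}
```

After the PATCH the row still carries STATE alongside the updated ADDR1 and the new COUNTY, while after the PUT only ADDR1 and COUNTY remain.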

(Thanks to David Strauss for suggesting this)

Friday, November 4, 2011

Cassandra integration w/ SOLR using Virgil

Up front, I'd like to say this is still pretty raw. We'd love to get feedback and contributions.

That said, Virgil now has the ability to integrate SOLR and Cassandra. When you add and delete rows and columns via the REST interface, an index is updated in SOLR.

For more information check out:

Let us know what we can do better.