Wednesday, October 31, 2012

CQL: Seismic shift or simply a SQL veneer for Cassandra?

If you have recently come to Cassandra and you are happily using CQL, this blog post will probably only serve to confuse you.  You may want to stop here.

If however, you:
  • Are using Cassandra as a BigTable and care about the underlying persistence layer
  • Started using CQL, but wonder why it has crazy limitations that aren't found in SQL
  • Are accustomed to Java APIs and are trying to reconcile that with the advent of CQL
Then, this blog post is for you.  In fact, first you may want to read Jonathan's excellent post to get up to speed.

Now, I don't know the full history of all of it, but Cassandra has struggled with client apis from the beginning.  There are remnants of that struggle littered throughout the code base, and I believe there were a couple false starts along the way. (cassandra$ grep -r "avro" *)

But since I've been using Cassandra, I've been able to hang my hat on Thrift, which is the underlying RPC layer on which many of the Java APIs are built.  The Thrift layer is tied intimately to the storage engine in Cassandra and although it can be a bit cumbersome, it exposed everything we needed to build out app-layer extensions (a REST interface, indexing, and trigger functionality).

That said, Thrift is an albatross.  It provides no abstraction layer between between clients and the server-side implementation.  Concepts that you wish would die are still first class citizens in the Thrift API.  (cassandra$ grep -r "SuperColumn" * | grep thrift)  

An abstraction layer was certainly necessary, and I believe this was the primary purpose of CQL: to shield the client/user from the complexities of the underlying storage mechanism, and changes within that storage layer.  But because we chose a SQL-like abstraction layer, it has left many of the people that came to Cassandra because it is "NO"-SQL, wondering why a SQL-like language was introduced, mucking up their world.  Isn't the simple BigTable-based data model sufficient, why do we have to complicate things and impose schemas on everyone?

First, let me say that I never considered the underlying storage mechanism of Cassandra complex.  To me, it was a lot easier to reason about than many of the complex relational models I've seen over my career.  At the risk of oversimplifying, in Cassandra you really only have one structure to deal with HashMap<RK, SortedMap<CN, V>>  (RK = RowKey, CN=ColumnName, V=Value).   That is straight forward enough, and the only "complexity" here is that the outside HashMap is spread across machines.  No big deal.  But I'll accept that many people have trouble grok'ing the concept vs. the warm and comfy, familiar concepts of RDBMS. 

So, aside from the fact that the native Cassandra data structures might be difficult to grok, why did we add SQL to a NoSQL world?  As I said before in my post on Cassandra terminology,  I think a SQL-like language opens up avenues that weren't possible before. (application integrations with ETL tools, ORM layers, easing the transition for DBAs, etc.)  I truly appreciate why CQL was defined/selected and at the same time I'm sensitive to viewing Cassandra through SQL-colored glasses, and forcing those glasses on others.

Specifically, if it looks, tastes and smells like SQL, people will expect SQL like behavior.  People (and systems) expect to be able to construct arbitrary WHERE clauses, and JOINs.  With those expectations,  and without an understanding of the underlying storage model,  features, functions and even performance might not align well with users expectations. (IMHO, never a good thing)  We may find ourselves explaining BigTable concepts anyway, just to explain why JOINs aren't welcome here.
(or we can just point people to Dean Hiller and playORM, so he can explain why they are. =)

Also, I think we want to be careful not to hide the "simple" structures of the BigTable.  If it becomes cumbersome to interact with the BigTable (the Maps), we'll end up alienating the portion of the community that came to Cassandra for simple dynamic/flexible schemas.  That flexibility and simplicity allowed us to accomodate vast Varieties of data, one of the pillars in the 3 V's of BigData.  We don't want to lose it.

For more and discussion on this, follow and chime in on:

With those caveats in mind, I'm on board with CQL.   I intend to embrace it whole heartedly, especially given the enhancements coming down the pipe.   IMHO, CQL is more than a SQL veneer for Cassandra.  It is the foundation for future feature enhancements.  And although Thrift will be around for a long, long, long, long time, Thrift RPC will begin to fall behind CQL.  There is already evidence of that as CQL is going to provide first-class support for operations on collections in 1.2, with only limited support (via JSON) in Thrift. See:

With that conclusion, we've got our work cut out for us.  All of the enhancements we've developed for Cassandra were built on Thrift (either as AOP against the Thrift API, or directly consuming it to enable embedding).  This includes: cassandra-indexing, cassandra-triggers, and Virgil.  For each, we need to find a path forward that embraces CQL, keeping in mind that CQL is built on an entirely new protocol, listening on an entirely different port.  Additionally,  I'm looking to develop the Spring Data integration layer on CQL.

Anyone looking to get involved, we'd love a hand in migrating Virgil and Cassandra-Triggers forward and creating the initial Spring Data integration!

I'm excited about the future prospects of CQL, but it will take everyone's involvement to ensure that we get the best of all worlds from it, and that we don't lose anything in the transition.

Thursday, October 25, 2012

J2EE is dead: Long-live javascript backed by JSON services

Hello, my name is Brian O'Neill, and I've been J2EE free for almost 8 years now.  

A long time ago in a blog post far away, I asked the question "Is J2EE dead?".  Almost five years later, I'm here to say that, "Yes, J2EE is dead".   In fact, I'd say that it is now generations removed from the emerging technologies, and an ancestor to current enterprise best practices.

Originally, I thought that the MVC and HTTP/CRUD portions of J2EE would be replaced by rapid development frameworks like Rails.  The back-end non-http portions of the system (e.g. JMS) would move outside the container into Enterprise-Service Bus's (ESBs) and other non-J2EE based event-driven systems. 

In large part, I think we've seen this trend.  Increasingly, enterprises are adopting Fuse, Camel and Mule to form an event-driven backbone for the enterprise and although I don't believe we've seen as wide an adoption of Rails in the enterprise, I think we've seen a strong movement towards light-weight non-J2EE containers for web application deployment.  In fact, I didn't realize just how much the deployment landscape has changed,  until I saw this survey, which showed that half the enterprise servers are running Tomcat followed by Jetty at 16%.

We're perhaps on the extreme end, since we've abandoned war files entirely, swapping them out for the Dropwizard framework with an embedded Jetty Server. (and we're loving it)

Now, I think most enterprises are rolling out successful deployments using what I believe has become the canonical enterprise stack: Spring and Hibernate on Tomcat.  The more daring are striving for even higher realms of productivity and are breaking new ground with an even lighter stack: pure javascript backed by light-weight JSON services.

This is where we've landed.  We use straight-up javascript for our front-end.  Although I'm a huge fan of jquery, we've gone with ExtJS.  With it, we deliver a single-page webapp that can be deployed to anything capable of serving up raw HTML.  (e.g. apache web server)   No more servlets, nor more containers of any kind.  This enables our front-end developers to be *incredibly* productive.  They don't need to run a server of any kind.  We stub out the JSON services with flat files, and they can plow ahead, cranking out kick-ass look and feel and slick UX.  To be honest, they probably haven't heard the acronym J2EE. (and probably never will)

Meanwhile, back at the (server) farm, our java-heads can crank out light-weight services, which replace the stub JSON files with real services that interact with the back-end systems.  To keep them productive, again we cut the fat, we stay in J2SE and roll simple JAX-RS based services that we *could* deploy to Tomcat, but we choose to deploy as individual processes using DropWizard.  They are simple java classes that anyone can pick up and start coding after only reading the first couple chapters of Java in 21 days.

This not to say that we don't leverage elements of the J2EE stack (e.g. JMS and Servlet Filters for security, etc.), but even those we are beginning to replace with web-scale equivalents (e.g. Kafka).  I can see the day coming where J2EE will be naught but a distant memory.

"Alas poor J2EE, I knew him (well)."  We'll miss you.

(BTW -- keep your eye on the up and coming Apigee.  They hint at even more radical (and productive) architecture strategies that focus your time away from infrastructure entirely, focusing your time on only the elements of your application that are unique.  Its a glimpse of things to come.)


Wednesday, October 24, 2012

Solid NoSQL Benchmarks from YCSB w/ a side of HBase Bologna

One of the great appeals of Cassandra is its linear scalability.  Need more speed? Just add water, er... nodes to your ring.    Proving this out, Netflix performed one of the most famous Cassandra benchmarks.

Cassandra's numbers are most impressive, but Netflix didn't perform a side-by-side comparison of the available NoSQL platforms.  Admirably, Yahoo! Cloud Serving Benchmark (YCSB) endeavored to perform just such a comparison. The results of that effort were recently published by Network World.

It's not surprising that Cassandra and HBase lead the pack in many of the metrics since both are based on Google's BigTable.  It was surprising however to see HBase's latency near zero.  This warranted some digging.

Now, side-by-side comparisons are always tough because they often depend highly on system configuration and the specifics of the use case / data model used.   And in NetworkWorld's article, there is a key paragraph:
"During updates, HBase and Cassandra went far ahead from the main group with the average response latency time not exceeding two milliseconds. HBase was even faster. HBase client was configured with AutoFlush turned off. The updates aggregated in the client buffer and pending writes flushed asynchronously, as soon as the buffer became full. To accelerate updates processing on the server, the deferred log flush was enabled and WAL edits were kept in memory during the flush period"

With Autoflush disabled, writes are buffered until flush is called.  See:

If I'm reading that correctly, with autoflush disabled, the durability of a put operation is not guaranteed until the flush occurs.   This really sets up an unfair comparison with the other systems where durability is guaranteed on each write.  When buffering the data locally, nothing is sent over the wire, which naturally results in near-zero latency!

The change required to YCSB to level the playing field can be seen here:

I think its certainly worth including metrics when the autoflush is disabled because that is a valid use case, but YCSB should also include the results when autoflush is enabled.  Similarly, it would be great to see Cassandra's numbers when using different consistency levels and replication factors. Durability is something we all need, but it isn't one-size fits all. (which is why tunable consistency/replication is ideal)

I appreciate all the work the YCSB crew put in to produce the benchmark.  Thank you.  And it is not to say you should take the benchmarks with a grain of salt, but you may need some mustard to go with the HBase autoflush bologna.

Friday, October 12, 2012

Monday, October 8, 2012

CQL, Astyanax and Compound/Composite Keys: Writing Data

In my previous post, I showed how to connect the dots between Astyanax and CQL, but it focused primarily on reading.  Here is the update that connects the dots on write.

I created a sample project and put it out on github.  Let me know if you have any trouble.

Extending the previous example to accomodate writes, you need to use the AnnotatedCompositeSerializer when writing as well.  Here is the code:

    public void writeBlog(String columnFamilyName, String rowKey, FishBlog blog, byte[] value) throws ConnectionException {
        AnnotatedCompositeSerializer entitySerializer = new AnnotatedCompositeSerializer(FishBlog.class);
        MutationBatch mutation = keyspace.prepareMutationBatch();
        ColumnFamily columnFamily = new ColumnFamily(columnFamilyName,
                StringSerializer.get(), entitySerializer);
        mutation.withRow(columnFamily, rowKey).putColumn(blog, value, null);

Now, if you recall from the previous post, we had a single object, FishBlog, that represented the compound/composite column name:

public class FishBlog {
    @Component(ordinal = 0)
    public long when;
    @Component(ordinal = 1)
    public String fishtype;
    @Component(ordinal = 2)
    public String field;

We mapped this object to the following schema:
    CREATE TABLE fishblogs (
        userid varchar,
        when timestamp,
        fishtype varchar,
        blog varchar,
        image blob,
        PRIMARY KEY (userid, when, fishtype)

We had one member variable in FishBlog, field, that specified which piece of data we were writing: image or blog.  Because of how things work with CQL and the Java API, you actually need *two* mutations to create a row.  Here is the code from the unit test:

       AstyanaxDao dao = new AstyanaxDao("localhost:9160", "examples");
       FishBlog fishBlog = new FishBlog();
       dao.writeBlog("fishblogs", "bigcat", fishBlog, "myblog.".getBytes());
       FishBlog image = new FishBlog();
       image.when = now;
       byte[] buffer = new byte[10];
       buffer[0] = 1;
       dao.writeBlog("fishblogs", "bigcat", image, buffer);

All parts of the primary key need to be the same between the the different mutations.  Then, after you perform both mutations, you'll get a row back in CQL that looks like:

cqlsh:examples> select * from fishblogs;
 userid | when                     | fishtype | blog            | image
 bigcat | 2012-10-08 12:08:10-0400 |  CATFISH | this is myblog. | 01000000000000000000

Hopefully this clears things up for people.

Wednesday, October 3, 2012

"File name too long" in scala/sbt (on ubuntu)

When building Kafka on mac osx, I had no problem.  I moved to Ubuntu and received an error:

[info] Compiling main sources...
[error] File name too long
[error] one error found

The problem is that you are most likely on an encrypted filesystem.  If you are using Ubuntu, you might not even remember that you selected to have an encrypted filesystem.  It presented an option when you installed to encrypt your home directory.  You probably answered yes. ;)

Anyway, to get compiling move the source to another filesystem/mount.  You'll be good to go after that.