Wednesday, July 18, 2012

Spring Data w/ Cassandra using JPA


We recently adopted the use of Spring Data.  Spring Data provides a nice pattern/API that you can layer on top of JPA to eliminate boiler-plate code.

With that adoption, we started looking at the DAO layer we use against Cassandra for some of our operations.  Some of the data we store in Cassandra is simple.  It does *not* leverage the flexible nature of NoSQL.  In other words, we know all the table names, the column names ahead of time, and we don't anticipate them changing all that often.

We could have stored this data in an RDBMs, using hibernate to access it, but standing up another persistence mechanism seemed like overkill.  For simplicity's sake, we preferred storing this data in Cassandra.  That said, we want the flexibility to move this to an RDBMs if we need to.

Enter JPA.

JPA would provide us a nice layer of abstraction away from the underlying storage mechanism.  Wouldn't it be great if we could annotate the objects with JPA annotations, and persist them to Cassandra?

Enter Kundera.

Kundera is a JPA implementation that supports Cassandra (among other storage mechanisms).  OK -- so JPA is great, and would get us what we want, but we had just adopted the use of Spring Data.  Could we use both?

The answer is "sort of".

I forked off SpringSource's spring-data-cassandra:
https://github.com/boneill42/spring-data-cassandra

And I started hacking on it.  I managed to get an implementation of the PagingAndSortingRepository for which I wrote unit tests that worked, but I was duplicating a lot of what should have come for free in the SimpleJpaRepository.  When I tried to substitute my CassandraJpaRepository for the SimpleJpaRepository, I ran into some trouble w/ Kundera.  Specifically, the MetaModel implementation appeared to be incomplete.  MetaModelImpl was returning null for all managedTypes().  SimpleJpa wasn't too happy with this.

Instead of wrangling with Kundera, we punted.  We can achieve enough of the value leveraging JPA directly. 

Perhaps more importantly, there is still an impedance mismatch between JPA and NoSQL.  In our case, it would have been nice to get at Cassandra through Spring Data using JPA for a few cases in our app, but for the vast majority of the application, a straight up ORM layer whereby we know the tables, rows and column names ahead of time is insufficient. 

For those cases where we don't know the schema ahead of time, we're going to need to leverage the converters pattern in Spring Data.  So, I started hacking on a proper Spring Data layer using Astyanax as the client.  Follow along here:
https://github.com/boneill42/spring-data-cassandra

More to come on that....







No comments: