Friday, January 23, 2015

Hadoop for Cassandra: CqlInputFormat != CqlPagingInputFormat != ColumnFamilyInputFormat

We haven't had cause to write a Hadoop job against Cassandra since the old days of Thrift (since we introduced Elasticsearch into our system). But this week, we found ourselves needing to get some metrics on data stored in the actual C* tables.

I went to the documentation and found this page:

That page references:
"CQL partition input format: ColumnFamilyInputFormat class"

I was familiar with the ColumnFamilyInputFormat class from the old Thrift days, and I was pretty sure that a newer InputFormat that uses CQL was available. I headed over to the code, dropped down to the 2.0 branch and found this:

Notice that it imports:
import org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat

I went happily along my way and implemented the MapReduce code using this InputFormat, but the compiler kept complaining that CqlPagingInputFormat could not be found. After some investigation, it looks like that class was removed from cassandra-all sometime between 2.0.3 and 2.0.11, leaving only CqlInputFormat. See below:

➜  tusk  unzip -l /Users/bone/.m2/repository/org/apache/cassandra/cassandra-all/2.0.11/cassandra-all-2.0.11.jar | grep Cql | grep Input
     2882  10-21-14 16:31   org/apache/cassandra/hadoop/cql3/CqlInputFormat.class
➜  tusk  unzip -l /Users/bone/.m2/repository/org/apache/cassandra/cassandra-all/2.0.3/cassandra-all-2.0.3.jar | grep Cql | grep Input
     1359  11-22-13 08:56   org/apache/cassandra/hadoop/cql3/CqlPagingInputFormat$1.class
     2875  11-22-13 08:56   org/apache/cassandra/hadoop/cql3/CqlPagingInputFormat.class
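
The same check can be done from Java rather than with unzip and grep. What follows is a minimal sketch, not part of the original investigation: the class name JarClassFinder and its find method are hypothetical, and it relies only on java.util.jar from the JDK. It lists the entries of a jar whose names contain every given substring, so passing "Cql" and "Input" mirrors the grep pipeline above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: a Java stand-in for `unzip -l some.jar | grep A | grep B`.
final class JarClassFinder {

    // Returns the names of all jar entries that contain every given substring.
    static List<String> find(Path jar, String... needles) throws IOException {
        List<String> hits = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                boolean matchesAll = true;
                for (String needle : needles) {
                    if (!name.contains(needle)) {
                        matchesAll = false;
                        break;
                    }
                }
                if (matchesAll) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    // Prints the matching entries for each jar path given on the command line.
    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java JarClassFinder.java <jar> [<jar> ...]");
            return;
        }
        for (String arg : args) {
            System.out.println(arg);
            for (String hit : find(Paths.get(arg), "Cql", "Input")) {
                System.out.println("  " + hit);
            }
        }
    }
}
```

Pointed at the cassandra-all 2.0.3 and 2.0.11 jars from the local Maven repository, this should list the same class files as the unzip output above.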

It looks like the crew is already addressing it:

Hopefully no one else runs into this. ;)


Sam BESSALAH said...

Actually, I ran into this last year while using Spark to do metrics aggregations, just like you. By the time I had rolled back to using CqlInputFormat (which wasn't handy for me), they had fortunately open sourced the Cassandra Spark connector.
