Monday, October 3, 2011

Cassandra / Hadoop : Getting the row key (when iterating over all rows)


I thought I would save some people some time...

The word count example is fantastic and is enough to get you going. But it may leave you wondering how to get at the row key, since the "key" passed into the map is the name of the column and not the key of the row. Instead, the row key is in the context. Take a look at the snippet below.
public void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
        throws IOException, InterruptedException {
    for (ByteBuffer columnKey : columns.keySet()) {
        // Decode the column's name and value from their raw ByteBuffers.
        String name = ByteBufferUtil.string(columns.get(columnKey).name());
        String value = ByteBufferUtil.string(columns.get(columnKey).value());
        logger.info("[" + ByteBufferUtil.string(columnKey) + "]->[" + name + "]:[" + value + "]");
        // The row key comes from the context.
        logger.info("Context [" + ByteBufferUtil.string(context.getCurrentKey()) + "]");
    }
}
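
If you want to play with the decoding and iteration pattern without a cluster handy, here is a minimal sketch that uses only the JDK. It is not the Cassandra API: the class RowKeyDemo and the helpers decode, encode and describeRow are hypothetical names I made up for illustration, and a plain SortedMap<ByteBuffer, ByteBuffer> stands in for the SortedMap<ByteBuffer, IColumn> that the real map method receives. I am also assuming the keys and values are UTF-8 text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-alone demo; none of these names come from Cassandra or Hadoop.
public class RowKeyDemo {

    // Decode a buffer as UTF-8. Working on a duplicate leaves the original
    // buffer's position untouched, so the same buffer can be decoded again.
    static String decode(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    // Wrap a string's UTF-8 bytes in a ByteBuffer.
    static ByteBuffer encode(String text) {
        return ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
    }

    // Stand-in for the body of map(): one line per column, each tagged with the row key.
    static List<String> describeRow(ByteBuffer rowKey, SortedMap<ByteBuffer, ByteBuffer> columns) {
        List<String> lines = new ArrayList<>();
        String row = decode(rowKey);
        for (Map.Entry<ByteBuffer, ByteBuffer> column : columns.entrySet()) {
            lines.add("[" + row + "]->[" + decode(column.getKey()) + "]:["
                    + decode(column.getValue()) + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample data: one row with two text columns.
        SortedMap<ByteBuffer, ByteBuffer> columns = new TreeMap<>();
        columns.put(encode("email"), encode("alice@example.com"));
        columns.put(encode("name"), encode("Alice"));
        for (String line : describeRow(encode("user-1"), columns)) {
            System.out.println(line);
        }
    }
}
```

Decoding through duplicate() matters here because the row key is decoded once per row and you may want to read the same buffer again later; decoding the buffer directly would advance its position and leave nothing to read the second time.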
