Monday, March 25, 2013

The Art of Platform Development: Scaling Agile with Open Source dynamics

I’ve spent nearly my entire career working on “platform teams”.   The teams went by different names including “Shared Services”,  “Framework”,  “Core”, etc.  But the goal was always the same:  to centralize capabilities development and leverage those capabilities across product lines.

Achieving that goal is incredibly valuable.  By sharing capabilities, you can eliminate code and infrastructure duplication, which decreases operations and maintenance costs.   It also ensures consistency and simplifies integration across the products, which decreases the overall system complexity and expedites the delivery of new products to market.  In this model, “product development” can become an exercise in capabilities composition.

Unfortunately, this model challenges traditional Agile development.  Typically, Agile works from a product backlog, which is controlled by the Product Owner.  The Product Owner is focused on the business value of the stories (typically functionality).    Priorities are often driven by market opportunity and customer value.   With multiple products and product owners, where does the platform live?   

Often, the drive to keep the teams isolated and focused on customer functionality results in siloed development and siloed products.  One might argue that such dynamics will always result in a fractured architecture/platform.

Some enterprises solve this problem by creating a Platform backlog and Platform team, which takes on all common service development.  This can work, but it is a nightmare to coordinate and often bottlenecks development. 

Furthermore, since prioritization of functionality is done within each product backlog, the result is local optimization.   It would be better if the enterprise could prioritize work globally, across all products and then optimize the assignment of that work across all development teams.

In the slides below, I suggest a different model, whereby product demand is collapsed into a single pivoted backlog that focuses on capabilities instead of specific product functionality.  Then prioritization is driven by the collective value of that capability across product lines.
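To make the pivot concrete, here is a minimal sketch of collapsing per-product stories into one capability backlog.  Everything in it is an assumption for illustration: the `Story` fields, the product and capability names, and especially the choice to measure "collective value" as a simple sum of the values each Product Owner assigned.  A real enterprise might weight products differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    product: str      # product line requesting the work (hypothetical names)
    capability: str   # shared capability the story depends on
    value: int        # business value assigned by that product's owner


def pivot_backlog(stories):
    """Collapse per-product stories into a single capability backlog.

    Returns (capability, total_value, products) tuples, highest total first.
    Assumption: collective value is the plain sum across product lines.
    """
    totals = defaultdict(int)
    products = defaultdict(set)
    for story in stories:
        totals[story.capability] += story.value
        products[story.capability].add(story.product)
    return sorted(
        ((cap, totals[cap], sorted(products[cap])) for cap in totals),
        key=lambda row: (-row[1], row[0]),  # value descending, name as tie-break
    )


stories = [
    Story("billing", "search", 5),
    Story("portal", "search", 8),
    Story("portal", "notifications", 9),
    Story("mobile", "search", 3),
    Story("mobile", "notifications", 2),
]

backlog = pivot_backlog(stories)
for capability, total, requested_by in backlog:
    print(f"{capability}: {total} (requested by {', '.join(requested_by)})")
```

Note that "notifications" holds the single highest-valued story (9, from one product), yet "search" ranks first because three products want it.  That is the difference between local and global prioritization.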

With this pivot, however, we lose the affinity between a team and its Product Backlog.  To fix that, I suggest the teams take an open-source approach to development.  Any team can take on any story and contribute capabilities back to the “platform” via git pull requests to the appropriate component.

In this model, “platform development” is no longer the bottleneck.  Instead, all the teams share the platform, which eliminates the “us vs. them” mentality that can develop and establishes the proper dynamics to support the development of a single cohesive platform (in line with Conway’s law).

Anyway, it’s just some food for thought.  I’d love to hear what people think.

Wednesday, March 13, 2013

Big Data goes to the Big Apple

Next week is the NYC* Big Data Tech Day.  It looks even bigger and more badass than last year's Cassandra Summit.  There is a good blend of use cases from people who already have Cassandra in production, as well as cutting-edge development that hasn't yet gone mainstream.

I'm really looking forward to John McCann's talk.  He is going to present on Comcast's use of Cassandra as the backend of their DVR system.   Netflix has been fairly open about their use, but guiding a ship as big as Comcast into the new era of NoSQL is a feat of shepherd-ship I'd love to hear about.

Likewise, on the more cutting-edge side of things, I'm looking forward to Thomas Pinkney's talk on graph databases.  That may be the next ingredient in our architecture.  We were targeting Neo4j, but if a Cassandra-based graph database is mature enough, we would love to use it.  That would allow us to centralize on a single storage mechanism that scales. (FTW!)

Finally, Ed Capriolo's talk promises to be a good one.  As the veterans know, once you've got your CRUD operations down, there is a whole world of potential out there in data processing.  On our second try, we decided to go with Storm for our data processing layer, but I believe Ed has an innovative perspective on things.  (Hopefully, we'll see mention of intravert-ug, which is what happens when smart people like Nate McCall, Ed Anuff and Ed Capriolo have a baby.)

Also, I should mention that Taylor Goetz and I will be presenting on our Big Data journey.  It has culminated in a Big Data platform that we are extremely happy with, where we've combined Storm, Kafka, Elasticsearch and Cassandra into a slick/fast/scalable/flexible data processing machine.

I believe there is still room if you want to sign up.  If it is anything like last year, not only will the talks be informative, but the collaboration sessions before, in between and after will be worth their weight in gold.