Last week was huge in the booming world of big data, with vendors simultaneously chasing market share and sharing innovations on the big stage at Strata + Hadoop World in San Jose, Calif.

If you have a big data product or service to sell, there may not be a better opportunity. After all, there’s a captive audience that paid big bucks and committed their time to be there. Attendees genuinely want to hear what you have to say. This is why so many vendor announcements are made at, or around, the conference.

Putting forth the best you have to offer while on the big stage, without sounding like an infomercial or slamming the competition, seems to challenge some, though. Here’s the secret: strut your best stuff, your grandest vision and your ability to deliver, and the customers you want to win over will see and hear only you. Knock a competitor, even if you don’t name them outright, and there are two of you sharing the spotlight. Is that what you want customers to remember?

Enough said.

Wait … President Obama Spoke?

Yes, he did (virtually), and it took him less than a minute to get the geeks giggling. “Normally I’d be giving these remarks with a joke about data science,” he said, “but about half of the stuff my staff came up with was below average.”

If you’re not giggling, you’re probably not a data scientist. But even so, chances are that you’ll crack a smile if you think back to your stats class.

Of course, the President wasn’t at the conference just for the laughs. He had a few other things on his agenda. First, he wanted the crowd to know about his administration’s belief that innovating with data holds the potential to help us do almost anything we do better. Next, he announced DJ Patil (yes, the one who co-authored the Harvard Business Review article “Data Scientist: The Sexiest Job of the 21st Century”) as the first Chief Data Scientist of the United States.

And finally, he tried to recruit data scientists to come to work for the US Government. “Data science is a team sport,” said the President, and standing before an audience of all-stars, he gave his call to action: DJ Patil needs your help.

Forget Hadoop, Spark was the Star at Strata

Hadoop is supposed to bring big data’s promise to life, but it’s beginning to look like Spark is going to get the job done instead. It’s not as if one displaces the other, but Spark has a whole lot more power to offer than the MapReduce component of Hadoop.

At the conference, Matei Zaharia, CTO of Databricks (Apache Spark’s commercial sponsor), spoke of Spark’s prowess across use cases in multiple dimensions, including dataset size, computational complexity and cluster size. He also said that its users won’t be only developers but data scientists too. Databricks is making Spark friendlier.

What Cloudera’s Bringing In, Putting Out

Sometimes it seems that there’s a chess match or game of Dare being played between Cloudera and whoever else wants in on the big data crunching marketplace.

On Tuesday, just as Pivotal was announcing that it was open sourcing the components of its big data platform, releasing some of its sales figures, committing to a partnership with Hortonworks, and inaugurating the Open Data Platform, Cloudera opened its kimono to reveal to the world that its revenue for fiscal year 2015 was $100 million.

Privately held companies don’t need to do this kind of thing, so the crowd was wondering why, and why now, especially because Cloudera had other news the audience would have cared about too.

Apache Kafka is now part of CDH, Cloudera’s Hadoop distro; it comes with the free download. Not only that, but Cloudera customers will also be able to leverage Kafka clusters using Cloudera Manager. For current Cloudera Enterprise customers, tech support for Kafka is free.

Cloudera had even more to share: namely, an alliance with Deloitte. Through it, Deloitte plans to provide its customers shorter time to value on their big data stores by leveraging Cloudera Enterprise and two industry-specific solutions created via the alliance: Insurance Claims Subrogation and Customer Next Best Offer.

PayPal Marries Couchbase, More

High-velocity data streams, like those that come from the IoT and telematics, need to be harnessed so that enterprises can take advantage of more of the data that’s available to them when they make decisions. Companies like PayPal see this as a “must do.”

To get the job done, they leveraged Couchbase Server 3.0.2 integration with Hortonworks Data Platform 2.2, via distributed messaging with Apache Kafka and stream processing via Apache Storm, to create a new big data architecture that enables streaming real-time data to be leveraged for new business and operational opportunities.

Got that?

Here’s how Couchbase puts it in its press release:

“The Kafka connector leverages Couchbase’s Database Change Protocol (DCP), a unique architectural feature of Couchbase Server 3.0.2, to stream data from Couchbase Server to the Kafka message queue in real time. The messages are consumed by Storm for real-time analysis. The data is then written to Hadoop for further processing and the analysis is written to Couchbase Server for access by real-time reporting and visualization dashboards. This provides enterprises the insight they need at the moment they need it. By integrating Hortonworks Data Platform and Couchbase Server, enterprises can meet both operational and analytical requirements with a single solution to improve short-term and long-term operation.”
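For readers who think better in code than in press-release prose, here is a rough sketch of that data flow in plain Python. It is only an illustration of the shape of the pipeline: the function names, the in-memory queue and the sample payment records are all assumptions made up for this example, and none of it uses the real Couchbase, Kafka, Storm or Hadoop APIs.

```python
from collections import defaultdict, deque

# Stand-ins only: nothing here is a real Couchbase, Kafka, Storm or Hadoop API.


def dcp_change_stream(documents):
    """Hypothetical stand-in for a database change stream.

    Yields one change event per document mutation, in order.
    """
    for seq, (key, value) in enumerate(documents, start=1):
        yield {"seq": seq, "key": key, "value": value}


def run_pipeline(documents):
    topic = deque()                 # stand-in for a Kafka topic
    hadoop_store = []               # stand-in for raw data landed in Hadoop
    dashboard = defaultdict(float)  # stand-in for results written back to Couchbase

    # 1. The connector streams every change into the message queue.
    for event in dcp_change_stream(documents):
        topic.append(event)

    # 2. A stream processor (Storm's role) consumes the messages in order.
    while topic:
        event = topic.popleft()
        # 3. The raw event is kept for later batch processing.
        hadoop_store.append(event)
        # 4. A running total per merchant is kept for real-time dashboards.
        record = event["value"]
        dashboard[record["merchant"]] += record["amount"]

    return hadoop_store, dict(dashboard)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    payments = [
        ("txn::1", {"merchant": "acme", "amount": 20.0}),
        ("txn::2", {"merchant": "globex", "amount": 5.5}),
        ("txn::3", {"merchant": "acme", "amount": 10.0}),
    ]
    raw, totals = run_pipeline(payments)
    print(len(raw), totals)
```

The point of the split is that the same stream feeds two destinations: the raw events pile up for long-running analysis, while the small running aggregate is what a dashboard would read back from the operational database.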


Larry Ellison: Big Data Needs a Face

Big data insights come in stories and pictures, not Excel spreadsheets. We all know this by now. But Oracle might have just become aware of the fact, and now it wants to enlighten its customers.

Last week the company announced Oracle Big Data Discovery, which it calls “the visual face of big data.” The pitch is that it is “built natively on Hadoop to transform raw data into business insight in minutes, without the need to learn complex products or rely only on highly skilled resources.”

Doesn’t Tableau already do that? We think so, but it’s a hot sector. It just so happens that publicly traded Tableau announced rapid growth the same week that Oracle announced its visual data dream. Tableau, founded in 2003, reached $913 million in lifetime revenue, $413 million of which (about 45 percent) was generated in 2014. With that achievement, Tableau has become one of the fastest growing companies in business analytics software history.

But that doesn’t mean that Oracle is without a play in the game. Its most loyal users might want to stick with it, or at least try.



No, we didn’t get to every announcement, but if you haven’t yet read about Pivotal’s pivot, the Open Data Platform, Microsoft’s big data and machine learning offerings, how MapR closed the big data-to-action loop and Hortonworks’ better-than-expected financials, grab a Venti coffee and sit down in a comfortable chair: big data’s on the move and you don’t want it to pass you by.