On January 11, Aater and I attended Data Day Texas 2014 here in Austin. Sponsored by Geek Austin, it was such a great event that I thought I’d share some highlights. Data Day Texas holds special significance for Flux7 Labs because it was at Data Day 2013 that we made our first presentation, when Aater gave a talk on the role of microservers in big data, which you can find here.
What I like most about this conference is that it balances technical depth with high-level theory in a way that appeals to a wide audience. Another great thing about it is that, in addition to presentations on a wide array of topics, you get to meet and talk with a broad spectrum of cutting-edge leaders and innovators in the field. It was also great getting to meet and exchange ideas with many of our Austin network friends, people we know both personally and by reputation, but with whom we don’t often get time to hang out in our everyday professional lives. It’s inspiring and rejuvenating to step back from our own work and hear what others are thinking about and working on.
This year’s keynote, entitled “Data 2014: The Big Picture”, was delivered by Paco Nathan (@pacoid). Nathan covered a variety of topics, from how playing Minecraft has taught his daughter enough to be a Linux sysadmin to how our schools are failing to educate students in the particular fields of math that are most relevant to the 21st century. For instance, current engineering curricula place primary emphasis on calculus when in fact it’s graph theory and linear algebra that are most needed today. He also described the big data processing pipeline from beginning to end, explaining current trends toward functional programming. In his discussion of rising technologies, he talked about Docker and the move away from virtualization, which was of particular note to me since Flux7 Labs is now in talks with Docker as we work toward strengthening our relationship.
The talk I found most interesting was by Josh Wills (@josh_wills), whose speaking skills were particularly impressive. His talk was entitled “From The Lab To The Factory: Building A Production Machine Learning Infrastructure”. You can find his slides here, but you really have to hear him speak to get their full impact. Wills provided an excellent overview of the data science field, but what resonated most deeply with us at Flux7 Labs were his ideas on rapid experimentation. He narrated the history of human-powered aircraft and the Kremer prize. The prize went unclaimed for many years until Dr. Paul MacCready’s team realized that it was attempting to solve the wrong problem altogether. MacCready recognized that the primary impediment to success was the turnaround time between conceiving new ideas and implementing them. That insight led him and his team to create an aircraft that could be put together very quickly. This belief in innovation through rapid experimentation is at the core of Flux7 Labs’ work and vision, as we’re convinced that it’s essential to any startup’s success.
I also attended a talk about the KNIME data mining software, which seems to be a promising tool for simplifying data science. I became particularly interested in KNIME when our friends at Blacklight Solutions praised it. I also enjoyed a talk by Dr. Steve Kramer and Matthew Russell about the possible impact of advanced NLP on data science. The speakers used the Enron email corpus as a case study, pointing out that Dr. Vince Kaminski was one of the first people to raise red flags over Enron’s practices. Research in the area of NLP may one day help organizations develop better checks and balances for preventing similar fiascos.
Overall, Data Day Texas 2014 provided a fantastic and rewarding way to spend a Saturday. Great big kudos go to Geek Austin’s Lynn Bender (@linearb) for organizing such a great event and for making it so interesting and accessible. I’m already looking forward to Data Day Texas 2015!