Making the World a Better Place with Big Data

I spent last week at the Strata + Hadoop World Conference in New York City with 5,000 other “big data” customers, vendors, and enthusiasts. In the last six months we’ve seen demand for “big data” based network infrastructure really start to take off, and I’ve spent a lot of time recently trying to better understand the evolving market, technology landscape, and use cases. I’m particularly interested in how network infrastructure can drive a better experience for users of big data applications and for the networking/infrastructure teams that need to support them, but ultimately I want to know what businesses get out of these investments in data, analytics, and infrastructure.

[On a related note, as part of our efforts to provide the best “Big Data Fabric” we recently brought on @networkn3rd (Ed Henry) to Plexxi to fully define our reference architecture. Ed will be demoing the first fruits of his labor this Friday on SDN Central’s Demo Friday.]

Hadoop World was a really great experience. As a relative newbie to Big Data, I have a lot to learn and this was a great place to soak up actual customer use cases. While there was certainly plenty of feel-good hyperbole about the “making the world a better place” (if you haven’t seen HBO’s Silicon Valley, please watch this!) nature of big data, that was more than offset by real-world details of how data is being used to solve day-to-day business problems. Here’s a quick synopsis of some of my personal highlights:

  • Finance People have their Sh*t Together: In “Why Marketing Suck at Big Data” Jennifer Zeszut (@jenniferland) from Beckon pleaded with marketers to learn from other functional areas that actually make formal use of data in pretty structured ways (e.g. finance people). Her message was a bit contrarian to the whole Big Data notion of data exploration – she talked about structuring the data on the way in, storing only the data that matters, and avoiding the data “spelunking” approach. She used some great examples of what finance people would never do (like throw all receipts into one big data warehouse without any input classification!).
  • Data is Intrinsically Worthless: In “Do you Know What Your Data is Worth?” Brian d’Alessandro from Dstillery (@delbrians) talked about how you can easily double your data, which often doubles the investment (the cost to acquire the data) but doesn’t double the benefit. He captured the “Value of Data (VOD)” in a handy equation: the value of an application with data minus the value of the same application without data. A key lesson I learned here was that data has no intrinsic value; rather, its value is tied to the applications and actions derived from it.
  • Surprise leads to Innovation: In “The Sound of Data Silence” Jana Eggers from Nara Logics (@jeggers) talked about how to better listen to the non-obvious signals in the data. She gave some very practical exercises on how to do this, including going beyond the ‘show me state’ to the ‘curiosity state’: being hyper-curious by channelling your inner Steve Jobs, and remembering not to rationalize away surprises in the data, as they are the true source of innovation.
  • The Data Natives Generation: In “The Future of Data” Kim Rees of Periscopic (@krees) gave a fascinating talk on just that – the future of data. To demonstrate how data is much more scalable than algorithms, she used the example of a robot that crowdsources its knowledge of how to handle objects from other robots’ data. Then she led us to the “data natives” phenomenon. Using kids’ familiarity with gadgets as an analogy, “data natives” speaks to how we’ll very quickly have a generation that is born with the universe of data at its fingertips and that, from birth, will never need to remember or figure things out. This marks a new state in our evolution.
  • The Kevin Bacon Game for Banking: In “How Goldman Sachs is Using Knowledge to Create an Information Edge” Peter Ferns talked about the GS “Big Graph” application and how it is used to build a relationship graph of people, legal entities, organizational entities, transactional data, and banking transactions like M&A. He then detailed how this information is used for compliance (surveillance, investigations and analytics), information security, technology infrastructure management, and customer relationship management. The best thing is that they put this info out in the public at http://www.gs.com/engineering!
  • Big People: In “Building with Data: Lessons from Etsy” Nellwyn Thomas talked about how she built the data organization at Etsy. She covered how they have three groups within the “Data Org”: the data engineering / Hadoop team, the data science team, and analysts. Then she zoomed in on the specific skill sets that are required for analysts – not just analytical skills (which she defined as the ability to understand the problem and the opportunity), but also math/statistics skills (to understand the data), technical skills (to get, parse, and visualize the data), and communication skills (to communicate what matters, and more importantly to not communicate what doesn’t matter).
  • Bad Recommendations Make Angry Customers: In “Learning About Music and Listeners” Brian Whitman (@bwhitman) from Spotify gave what was probably my favorite session of the week. He talked about the company he started, The Echo Nest (which was acquired by Spotify), and the work they are doing to move beyond simple collaborative filtering engines (the ‘other customers bought this’ type). Spotify’s goal is to make users loyal by encouraging discovery, understanding that giving bad recommendations really makes music fans angry! They are doing this with content-based recommendations. He talked about the progress they have made in helping computers understand enough about music to make recommendations, and the obvious but almost always overlooked basic human fact that we have different “modes” that determine what we might want to listen to! He also went through some fascinating analysis derived solely from usage data, like predicting political affiliations. Pretty cool stuff. Best of all was the fact that the entire Echo Nest API and a million song data set are available for anyone to use for research purposes on http://developer.spotify.com.
  • Knowledge is Dangerous: Last but not least, the venerable Shankar Vedantam (@HiddenBrain), author of “The Hidden Brain” and the Social Science Correspondent on NPR, warned us all that more data doesn’t necessarily make us smarter or better. In fact, what the data shows is that the more knowledge we have, the more we amplify our existing biases into stronger positions, because we are ultimately really good at cherry-picking what we want to believe. It was a bit sobering given the tone of the conference, but a practical message nonetheless.
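
To make the “Value of Data” idea from Brian d’Alessandro’s talk a bit more concrete, here is a minimal sketch of how I read it: the value of an application with data minus the value of the same application without data. The function names, the extra cost term, and all of the numbers are my own assumptions for illustration, not anything shown in the session.

```python
def value_of_data(value_with_data: float, value_without_data: float) -> float:
    """Incremental value the data adds to an application (hypothetical helper)."""
    return value_with_data - value_without_data


def net_value_of_data(value_with_data: float,
                      value_without_data: float,
                      data_cost: float) -> float:
    """Incremental value minus the cost to acquire the data (assumed extension)."""
    return value_of_data(value_with_data, value_without_data) - data_cost


# Made-up numbers: doubling the data doubles the cost but not the benefit.
baseline = 100.0  # value of the application with no data at all
scenarios = {
    "1x data": {"value_with_data": 160.0, "data_cost": 20.0},
    "2x data": {"value_with_data": 175.0, "data_cost": 40.0},
}

results = {
    name: net_value_of_data(s["value_with_data"], baseline, s["data_cost"])
    for name, s in scenarios.items()
}

for name, net in results.items():
    print(f"{name}: net value of data = {net}")
```

With these invented figures, the second batch of data costs as much as the first but adds far less value, so the net value actually drops. That is the “double the data, don’t double the benefit” point in miniature.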

This is just a summary of some of the great content at the conference, and I’m leaving out a great deal. I also spent a ton of time in our booth talking to some really fascinating customers and learning about what they are doing (and of course doing a bit of selling). The bottom line is that Big Data is not only big, but has real, broad applications beyond the typical web crawling for the search crowd, including customer profiling for marketing/sales across a variety of industries, content and goods recommendations for eCommerce and online media, fraud detection/compliance for banking, resource allocation / inventory planning for retailers/manufacturers, and of course solving world hunger and making the world a better place!

All of this will ultimately drive new infrastructure designs and decisions as the data sets get larger, the users get more diverse and more demanding, and the expectation of real-time analysis across many data sets becomes more realistic. Don’t forget to tune into our SDN Central Demo Friday to see how we’re starting to tackle these infrastructure challenges from the network side!

[Today’s fun fact: 111,111,111 x 111,111,111 = 12,345,678,987,654,321. That's big data for you.]
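
If you want to check the fun fact yourself, here is a quick sketch in Python (the variable names are just mine). Python integers have arbitrary precision, so the 17-digit product doesn’t overflow.

```python
n = 111_111_111                 # nine ones
product = n * n                 # exact integer arithmetic, no overflow
formatted = f"{product:,}"      # add thousands separators for readability

digits = str(product)
is_palindrome = digits == digits[::-1]  # reads the same forwards and backwards

print(formatted, is_palindrome)
```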

The post Making the World a Better Place with Big Data appeared first on Plexxi.

More Stories By Mat Mathews

Visionary solutions are built by visionary leaders. Plexxi co-founder and Vice President of Product Management Mat Mathews has spent 20 years in the networking industry observing, experimenting and ultimately honing his technology vision. The resulting product, a combination of traditional networking, software-defined networking and photonic switching, represents the best of Mat's career experiences. Prior to Plexxi, Mat held VP of Product Management roles at Arbor Networks and Crossbeam Systems. Mat began his career as a software engineer for Wellfleet Communications, building high-speed Frame Relay switches for the carrier market. Mat holds a Bachelor of Science in Computer Systems Engineering from the University of Massachusetts at Amherst.
