

Making the World a Better Place with Big Data


I spent last week at the Strata + Hadoop World Conference in New York City with 5,000 other “big data” customers, vendors, and enthusiasts. In the last six months we’ve seen demand for “big data”-based network infrastructure really start to take off, and I’ve spent a lot of time recently trying to better understand the evolving market, technology landscape, and use cases. I’m particularly interested in how network infrastructure can drive a better experience for users of big data applications and for the networking/infrastructure teams that need to support them, but ultimately I want to know what businesses get out of these investments in data, analytics, and infrastructure.

[On a related note, as part of our efforts to provide the best “Big Data Fabric,” we recently brought @networkn3rd (Ed Henry) on board at Plexxi to fully define our reference architecture. Ed will be demoing the first fruits of his labor this Friday on SDN Central’s Demo Friday (register here).]

Hadoop World was a really great experience. As a relative newbie to Big Data, I have a lot to learn, and this was a great place to soak up actual customer use cases. While there was certainly plenty of feel-good hyperbole about big data “making the world a better place” (if you haven’t seen HBO’s Silicon Valley, please watch it!), that was more than offset by real-world details of how data is being used to solve day-to-day business problems. Here’s a quick synopsis of some of my personal highlights:

  • Finance People Have Their Sh*t Together. In “Why Marketing Suck at Big Data,” Jennifer Zeszut (@jenniferland) from Beckon pleaded with marketers to learn from other functional areas that make formal, structured use of data (e.g., finance). Her message ran a bit contrary to the whole Big Data notion of data exploration: she talked about structuring the data on the way in, storing only the data that matters, and avoiding the data “spelunking” approach. She used some great examples of what finance people would never do (like throwing all receipts into one big data warehouse without any input classification!).
  • Data is Intrinsically Worthless. In “Do you Know What Your Data is Worth?” Brian d’Alessandro (@delbrians) from Dstillery talked about how you can easily double your data, which often doubles the investment (the cost to acquire the data) but doesn’t double the benefit. He captured the “Value of Data (VOD)” in a handy equation: the value of an application with the data minus the value of the same application without it. A key lesson I learned here was that data has no intrinsic value; rather, its value is tied to the applications and actions derived from it.
  • Surprise Leads to Innovation. In “The Sound of Data Silence,” Jana Eggers (@jeggers) from Nara Logics talked about how to better listen to the non-obvious signals in the data. She gave some very practical exercises for doing this, including going beyond the ‘show me state’ to the ‘curiosity state’: being hyper-curious by channelling your inner Steve Jobs, and remembering not to rationalize away surprises in the data, because they are the true source of innovation.
  • The Data Natives Generation. In “The Future of Data,” Kim Rees (@krees) of Periscopic gave a fascinating talk on just that: the future of data. To demonstrate how data is much more scalable than algorithms, she used the example of a robot that crowdsources its knowledge of how to handle objects from other robots’ data. Then she led us to the “data natives” phenomenon. Using kids’ familiarity with gadgets as an analogy, “data natives” describes how we’ll very quickly have a generation born with the universe of data at their fingertips, who will never need to remember or figure things out. This marks a new state in our evolution.
  • The Kevin Bacon Game for Banking. In “How Goldman Sachs is Using Knowledge to Create an Information Edge,” Peter Ferns talked about the GS “Big Graph” application and how it is used to build a relationship graph of people, legal entities, organizational entities, transactional data, and banking transactions like M&A. He then detailed how this information is used for compliance (surveillance, investigations, and analytics), information security, technology infrastructure management, and customer relationship management. The best thing is that they have made this information public at http://www.gs.com/engineering!
  • Big People. In “Building with Data: Lessons from Etsy,” Nellwyn Thomas talked about how she built the data organization at Etsy. She covered the three groups within the “Data Org”: the data engineering/Hadoop team, the data science team, and the analysts. Then she zoomed in on the specific skill sets required of analysts: not just analytical skills (which she defined as the ability to understand the problem and the opportunity), but also math/statistics skills (to understand the data), technical skills (to get, parse, and visualize the data), and communication skills (to communicate what matters and, more importantly, to not communicate what doesn’t).
  • Bad Recommendations Make Angry Customers. In “Learning About Music and Listeners,” Brian Whitman (@bwhitman) from Spotify gave what was probably my favorite session of the week. He talked about The Echo Nest, the company he started (since acquired by Spotify), and the work they are doing to move beyond simple collaborative filtering engines (the ‘other customers bought this’ type). Spotify’s goal is to make users loyal by encouraging discovery, understanding that bad recommendations really make music fans angry! They are doing this with content-based recommendations. He talked about the progress they have made in helping computers understand enough about music to make recommendations, and about the obvious but almost always overlooked human fact that we have different “modes” that determine what we might want to listen to! He also went through some fascinating analysis derived solely from usage data, like predicting political affiliations. Pretty cool stuff. Best of all, the entire Echo Nest API and a million-song data set are available for anyone to use for research purposes at http://developer.spotify.com.
  • Knowledge is Dangerous. Last but not least, the venerable Shankar Vedantam (@HiddenBrain), author of “The Hidden Brain” and Social Science Correspondent on NPR, warned us all that more data doesn’t necessarily make us smarter or better. In fact, what the data shows is that the more knowledge we have, the more we amplify our existing biases into stronger positions, because we are ultimately really good at cherry-picking what we want to believe. It was a bit sobering given the tone of the conference, but a practical message nonetheless.
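
The “Value of Data” idea from Brian d’Alessandro’s session can be sketched in a few lines of Python. This is only an illustration of the with-minus-without equation as I understood it: the function names, the dollar figures, and the logarithmic “diminishing returns” curve are all assumptions invented for the example, not numbers or formulas from the talk.

```python
import math


def value_of_data(value_with_data: float, value_without_data: float) -> float:
    """VOD: value of the application with the data minus its value without it."""
    return value_with_data - value_without_data


def app_value(records: int, baseline: float = 100.0, lift_per_doubling: float = 10.0) -> float:
    """Hypothetical application value (an assumption, not from the talk):
    a fixed baseline plus a benefit that grows with log2 of the data size."""
    if records <= 0:
        return baseline
    return baseline + lift_per_doubling * math.log2(records)


def acquisition_cost(records: int, cost_per_record: float = 0.01) -> float:
    """Hypothetical acquisition cost that scales linearly with the data size."""
    return records * cost_per_record


value_without = app_value(0)
for records in (1_000, 2_000):
    vod = value_of_data(app_value(records), value_without)
    cost = acquisition_cost(records)
    print(f"{records:>5} records: cost = {cost:6.2f}, VOD = {vod:6.2f}")
```

Under these made-up assumptions, doubling the records doubles the acquisition cost while the VOD grows by only a fixed increment, which is the shape of the argument, whatever the real curve looks like for a given application.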

This is just a summary of some of the great content at the conference, and I’m leaving out a great deal. I also spent a ton of time in our booth talking to some really fascinating customers and learning about what they are doing (and, of course, doing a bit of selling). The bottom line is that Big Data is not only big, it also has real, broad applications beyond the typical web crawling for the search crowd, including customer profiling for marketing/sales across a variety of industries, content and goods recommendations for eCommerce and online media, fraud detection/compliance for banking, resource allocation/inventory planning for retailers/manufacturers, and of course solving world hunger and making the world a better place!

All of this will ultimately drive new infrastructure designs and decisions as the data sets get larger, the users get more diverse and more demanding, and real-time analysis across many data sets becomes both expected and feasible. Don’t forget to tune into our SDN Central Demo Friday to see how we’re starting to tackle these infrastructure challenges from the network side!

[Today’s fun fact: 111,111,111 x 111,111,111 = 12,345,678,987,654,321. That's big data for you.]
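
For anyone who wants to check the arithmetic, here is a quick sketch in Python, whose integers have arbitrary precision, so the product is computed exactly. The variable names are just for illustration.

```python
# Square the nine-digit repunit and confirm the result reads the same both ways.
repunit = 111_111_111
square = repunit * repunit

print(f"{square:,}")  # 12,345,678,987,654,321

digits = str(square)
is_palindrome = digits == digits[::-1]
print(is_palindrome)  # True
```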

The post Making the World a Better Place with Big Data appeared first on Plexxi.

More Stories By Mat Mathews

Visionary solutions are built by visionary leaders. Plexxi co-founder and Vice President of Product Management Mat Mathews has spent 20 years in the networking industry observing, experimenting and ultimately honing his technology vision. The resulting product, a combination of traditional networking, software-defined networking and photonic switching, represents the best of Mat's career experiences. Prior to Plexxi, Mat held VP of Product Management roles at Arbor Networks and Crossbeam Systems. Mat began his career as a software engineer for Wellfleet Communications, building high-speed Frame Relay switches for the carrier market. Mat holds a Bachelor of Science in Computer Systems Engineering from the University of Massachusetts at Amherst.
