

Making the World a Better Place with Big Data


I spent last week at the Strata + Hadoop World Conference in New York City with 5,000 other “big data” customers, vendors, and enthusiasts. In the last six months we’ve seen demand for network infrastructure built for big data really start to take off, and I’ve spent a lot of time recently trying to better understand the evolving market, the technology landscape, and the use cases. I’m particularly interested in how network infrastructure can drive a better experience for users of big data applications and for the networking/infrastructure teams that support those applications, but ultimately I want to know what businesses get out of these investments in data, analytics, and infrastructure.

[On a related note, as part of our efforts to provide the best “Big Data Fabric,” we recently brought @networkn3rd (Ed Henry) on board at Plexxi to fully define our reference architecture. Ed will be demoing the first fruits of his labor this Friday on SDN Central’s Demo Friday.]

Hadoop World was a really great experience. As a relative newbie to Big Data, I have a lot to learn, and this was a great place to soak up actual customer use cases. There was certainly plenty of feel-good hyperbole about big data “making the world a better place” (if you haven’t seen HBO’s Silicon Valley, please watch it!), but that was more than offset by real-world details of how data is being used to solve day-to-day business problems. Here’s a quick synopsis of some of my personal highlights:

  • Finance People Have Their Sh*t Together: In “Why Marketing Suck at Big Data,” Jennifer Zeszut (@jenniferland) from Beckon pleaded with marketers to learn from other functional areas that already make formal, structured use of data (e.g. finance people). Her message was a bit contrarian to the whole Big Data notion of data exploration: she talked about structuring the data on the way in, storing only the data that matters, and avoiding the data “spelunking” approach. She used some great examples of what finance people would never do (like throwing all receipts into one big data warehouse without any input classification!).
  • Data Is Intrinsically Worthless: In “Do you Know What Your Data is Worth?” Brian d’Alessandro from Dstillery (@delbrians) talked about how you can easily double your data, which often doubles the investment (the cost to acquire the data) but doesn’t double the benefit. He captured the “Value of Data (VOD)” in a handy equation: the value of an application with the data minus the value of the application without it. A key lesson I learned here was that data has no intrinsic value; rather, its value is tied to the applications and actions derived from it.
  • Surprise Leads to Innovation: In “The Sound of Data Silence,” Jana Eggers from Nara Logics (@jeggers) talked about how to better listen to the non-obvious signals in the data. She gave some very practical exercises on how to do this, including going beyond the ‘show me state’ to the ‘curiosity state’: being hyper-curious by channelling your inner Steve Jobs, and remembering not to rationalize away surprises in the data, since those surprises are the true source of innovation.
  • The Data Natives Generation: In “The Future of Data,” Kim Rees of Periscopic (@krees) gave a fascinating talk on just that, the future of data. To demonstrate how data is much more scalable than algorithms, she used the example of a robot that crowdsources its knowledge of how to handle objects from other robots’ data. Then she led us to the “data natives” phenomenon. Using kids’ familiarity with gadgets as an analogy, “data natives” describes how we’ll very quickly have a generation that is born with the universe of data at its fingertips and that, from birth, will never need to remember or figure things out. This marks a new state in our evolution.
  • The Kevin Bacon Game for Banking: In “How Goldman Sachs is Using Knowledge to Create an Information Edge,” Peter Ferns talked about the GS “Big Graph” application and how it is used to build a relationship graph of people, legal entities, organizational entities, transactional data, and banking transactions like M&A. He then detailed how this information is used for compliance (surveillance, investigations, and analytics), information security, technology infrastructure management, and customer relationship management. Best of all, they have made this information public at http://www.gs.com/engineering!
  • Big People: In “Building with Data: Lessons from Etsy,” Nellwyn Thomas talked about how she built the data organization at Etsy. She covered the three groups within the “Data Org”: the data engineering / Hadoop team, the data science team, and the analysts. Then she zoomed in on the specific skill sets that analysts need: not just analytical skills (which she defined as the ability to understand the problem and the opportunity), but also math/statistics skills (to understand the data), technical skills (to get, parse, and visualize the data), and communication skills (to communicate what matters and, more importantly, to not communicate what doesn’t).
  • Bad Recommendations Make Angry Customers: In “Learning About Music and Listeners,” Brian Whitman (@bwhitman) from Spotify gave what was probably my favorite session of the week. He talked about the company he started, The Echo Nest (since acquired by Spotify), and the work they are doing to move beyond simple collaborative filtering engines (the ‘other customers bought this’ type). Spotify’s goal is to make users loyal by encouraging discovery, understanding that bad recommendations really make music fans angry! They are doing this with content-based recommendations. He talked about the progress they have made in helping computers understand enough about music to make recommendations, and about the obvious but almost always overlooked human fact that we have different “modes” that determine what we might want to listen to! He also went through some fascinating analysis derived solely from usage data, like predicting political affiliations. Pretty cool stuff. Best of all, the entire Echo Nest API and a million-song data set are available for anyone to use for research purposes at http://developer.spotify.com.
  • Knowledge Is Dangerous: Last but not least, the venerable Shankar Vedantam (@HiddenBrain), author of “The Hidden Brain” and Social Science Correspondent at NPR, warned us all that more data doesn’t necessarily make us smarter or better. In fact, what the data shows is that the more knowledge we have, the more we amplify our existing biases into stronger positions, because ultimately we are really good at cherry-picking what we want to believe. It was a bit sobering given the tone of the conference, but a practical message nonetheless.
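
The “Value of Data” idea from the Dstillery talk above is easy to play with in a few lines of Python. This is only my own sketch: the function names, the cost term, and every number below are made up for illustration and do not come from the talk. It simply shows how doubling the data can double the cost without doubling the benefit.

```python
def value_of_data(value_with_data: float, value_without_data: float) -> float:
    """VOD: value of the application with the data minus its value without it."""
    return value_with_data - value_without_data


def net_value(value_with_data: float, value_without_data: float, data_cost: float) -> float:
    """VOD minus the cost of acquiring the data (the cost term is my assumption)."""
    return value_of_data(value_with_data, value_without_data) - data_cost


# Hypothetical numbers, purely for illustration.
baseline = 100.0  # value of the application with no data at all
scenarios = {
    "1x data": {"value_with_data": 160.0, "data_cost": 20.0},
    "2x data": {"value_with_data": 175.0, "data_cost": 40.0},  # cost doubles, benefit does not
}

results = {
    name: net_value(s["value_with_data"], baseline, s["data_cost"])
    for name, s in scenarios.items()
}

for name, net in results.items():
    print(f"{name}: net value = {net:.1f}")
```

With these invented figures, the second batch of data adds less value than it costs, so the net value goes down even though the application itself gets slightly better.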

This is just a summary of some of the great content at the conference, and I’m leaving out a great deal. I also spent a ton of time in our booth talking to some really fascinating customers and learning about what they are doing (and of course doing a bit of selling). The bottom line is that Big Data is not only big, it also has real, broad applications beyond the typical web crawling for the search crowd, including customer profiling for marketing and sales across a variety of industries, content and goods recommendations for eCommerce and online media, fraud detection and compliance for banking, resource allocation and inventory planning for retailers and manufacturers, and of course solving world hunger and making the world a better place!

All of this will ultimately drive new infrastructure designs and decisions as the data sets get larger, the users get more diverse and more demanding, and real-time analysis across many data sets becomes both more feasible and more expected. Don’t forget to tune in to our SDN Central Demo Friday to see how we’re starting to tackle these infrastructure challenges from the network side!

[Today’s fun fact: 111,111,111 x 111,111,111 = 12,345,678,987,654,321. That's big data for you.]
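
If you want to check the fun fact yourself, here is a quick sketch in Python (variable names are my own). Python integers have arbitrary precision, so the multiplication is exact.

```python
repunit = 111_111_111          # nine ones
product = repunit * repunit    # exact integer arithmetic

print(f"{product:,}")

# The digits climb from 1 to 9 and back down, so the result reads the same both ways.
is_palindrome = str(product) == str(product)[::-1]
print(is_palindrome)
```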

The post Making the World a Better Place with Big Data appeared first on Plexxi.

More Stories By Mat Mathews

Visionary solutions are built by visionary leaders. Plexxi co-founder and Vice President of Product Management Mat Mathews has spent 20 years in the networking industry observing, experimenting, and ultimately honing his technology vision. The resulting product, a combination of traditional networking, software-defined networking, and photonic switching, represents the best of Mat's career experiences. Prior to Plexxi, Mat held VP of Product Management roles at Arbor Networks and Crossbeam Systems. Mat began his career as a software engineer for Wellfleet Communications, building high-speed Frame Relay switches for the carrier market. Mat holds a Bachelor of Science in Computer Systems Engineering from the University of Massachusetts at Amherst.
