FAQ: 'Big data' for publishers
Big data is, well, big. IDC says the market for big data software, infrastructure and services will grow 40% annually from $3.2 billion in 2010 to $16.9 billion in 2015. The concept has become an umbrella term for how companies and industries are using the ever-increasing amount of digital information at their fingertips to improve their businesses.
Big data is big in industries such as retail and consumer packaged goods, which are always looking for new ways to utilize the wealth of consumer data at their disposal. Advertisers are also eyeing big data as a way to more effectively target customers and prospects. Earlier this week, Yahoo launched Genome, its long-awaited predictive analytics ad platform that the company says “eats big data for lunch.”
If you’re new to the big data phenomenon, here are some answers to questions you may be asking.
What is 'big data'?
“Big data” is a phrase coined to define data sets that are too large to be processed by traditional database management tools. “The data is too big, moves too fast, or doesn't fit the strictures of your database architectures,” Edd Dumbill wrote in a post for O’Reilly Radar. “To gain value from this data, you must choose an alternative way to process it.”
Tech companies such as IBM view big data across three dimensions: volume (the amount of data being collected), velocity (the ability to analyze this data in real time), and variety (structured or unstructured information including text, videos, images, log files, sensor data and click streams). IDC defines the evolving market this way:
“Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.”
Where’s all this data coming from?
Big data is the result of a huge spike in connected devices – everything from cell phones to utility meters to global positioning satellites to smart appliances – combined with inexpensive storage and broadband communications. All these connected devices – which some are calling the “Internet of Things” – are designed to capture and transmit a nearly endless stream of data.
IBM says 90% of the data in the world today has been created in the last two years. Cisco predicts that global IP traffic will increase nearly fourfold over the next five years, reaching 110 exabytes a month by 2016. Cisco says that by 2016, the gigabyte equivalent of all movies ever made will cross the global Internet every 3 minutes.

Why should I care about big data?
Companies need to harness this growing volume of data to remain competitive. As media companies move further into digital distribution channels – web, smartphones, tablets – they will need to find ways to track virtually every interaction their audiences have with their content.
By analyzing this data, publishers can develop a better understanding of how people are using their content. These insights will enable them – theoretically at least – to create better products and better ways to monetize that content.
Those are great buzzwords, but what can I really do with big data?
Analyzing various data sets can help editors and publishers make informed decisions about how to develop, package and deliver content – if they have the right tools. For example, JumpTime’s Traffic Valuator uses a combination of data analysis, machine learning and financial theory to help publishers calculate the current and future value of each piece of content on a website.
A company called Visual Revenue offers a predictive analytics tool that provides real-time recommendations about where to place content on a web page to drive optimal performance. “We created this model where I can take any piece of content created over the last day or two days … and model how well that’s going to perform in any given position … about 15 minutes into the future,” CEO Dennis Mortensen told Poynter last summer. “And since we know how well the future is going to play out, we can come up with a set of very specific recommendations about what to put where, for how long.”
Big data can also be used by the sales side of the house, as a tool to help advertisers better target your audience. In a blog post introducing its new Genome platform, Yahoo said, “Genome … takes in more data from more sources than other solutions, then analyzes and decodes it to create multi-dimensional views of customers that help define the best target audiences for ad campaigns.” Fast Company further explained how the platform works:
“[Genome] splices together more than 25 databases of online activity from hundreds of millions of people. … Combined with in-house data and its preexisting partnerships with AOL and Microsoft, Yahoo! can now target brand campaigns to micro-demographics that advertisers may not have even realized existed before.”
Data analysis can also help publishers identify trends to make decisions about which mobile platforms to support, what types of apps to develop, which markets to enter, how to engage more effectively on social networks ... the list goes on.
What skills do I need to address big data?
O’Reilly’s Dumbill says companies looking to build big data expertise require teams skilled in data science, which combines math, programming and scientific instinct. “Benefiting from big data means investing in teams with this skill set, and surrounding them with an organizational willingness to understand and use data for advantage,” he wrote.
Greylock Partners’ DJ Patil lists four important qualities of a good data scientist:
- Technical expertise: the best data scientists typically have deep expertise in some scientific discipline.
- Curiosity: a desire to go beneath the surface and discover and distill a problem down into a very clear set of hypotheses that can be tested.
- Storytelling: the ability to use data to tell a story and to be able to communicate it effectively.
- Cleverness: the ability to look at a problem in different, creative ways.
The cultural and structural shifts will be far more challenging than simply finding the right tactical skills. In a study of smart meter usage in the utility industry, released by Oracle this week, 45 percent of utility executives said they found it hard to get information to the right managers because they lack the organizational structure and staff members to handle the deluge of data, the New York Times reported.
Where should this big data expertise reside? "In some cases, media companies might want to create a business intelligence or predictive behavior group," said Frank Cutitta, CEO at the Center for Global Branding, a global branding and media consultancy. "Others may want to embed the skills in data-rich divisions such as their database or analytics groups."
Regardless of where it's housed, Cutitta added, media companies need to ask themselves whether they're actually ready to manage the changes brought about by leveraging big data.
Is the data ecosystem as complex as I think it is?
Yes. The market consists of frameworks such as Hadoop, an open source project for distributed processing of large data sets, along with many applications and tools for capturing, analyzing and presenting the information.
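For readers curious what "distributed processing" looks like in practice, here is a minimal sketch of the map-and-reduce pattern that Hadoop is known for, written in plain Python. It does not use Hadoop or its API, and everything in it is an assumption made for illustration: the sample page-view records, the function names and the chunk size are all hypothetical. The idea is that a large data set is split into chunks, each chunk is processed independently (the "map" step), and the partial results are then grouped and combined (the "reduce" step).

```python
from collections import defaultdict
from itertools import chain

# Hypothetical click-stream records: (reader_id, article_slug).
PAGE_VIEWS = [
    ("r1", "election-results"),
    ("r2", "local-sports"),
    ("r1", "local-sports"),
    ("r3", "election-results"),
    ("r2", "election-results"),
]


def split_into_chunks(records, chunk_size):
    """Stand-in for storing a large file as blocks across many machines."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def map_phase(chunk):
    """Map: emit an (article, 1) pair for every page view in one chunk."""
    return [(article, 1) for _reader, article in chunk]


def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by key (the article slug)."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def count_views(records, chunk_size=2):
    chunks = split_into_chunks(records, chunk_size)
    # On a real cluster each chunk would be mapped on a different machine in
    # parallel; here the chunks are simply processed one after another.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(count_views(PAGE_VIEWS))
```

Because each chunk is mapped without reference to any other, the same logic can in principle be spread across as many machines as the data requires, which is what makes the pattern attractive for very large data sets.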
In the spirit of Luma’s ad ecosystem infographic, Matt Turck and Shivon Zilis created a chart of the big data landscape.
Where can I learn more about big data?
You’ll find nearly unlimited resources by Googling “big data” or searching #bigdata on Twitter. Some of them might actually be useful. Here are some of the better ones I found while researching this post:
- Hadoop: What it is, how it works, and what it can do
- Applying 'big data' to newsroom decisions
- How big data is disrupting local search
- The 7 steps in Big Data delivery
- Data isn't always the answer
- Big data: The next frontier for innovation, competition, and productivity





