The Age of Big Data
ETHOS Issue 09, June 2011
We are entering an age of big data. In a 2008 feature article highlighting the impact that massive data capture and storage has already had on science, medicine, business and technology, Wired Magazine Editor-in-Chief Chris Anderson argues that the growth of data will enhance opportunities to find new answers to fundamental questions. “In the era of big data,” Anderson points out, “more isn’t just more. More is different.”1 Similarly, in a 2010 special report on smart systems,2 The Economist suggests that “data, and the knowledge extracted from them, may \[become\] a factor of production in their own right, just like land, labour and capital.” Could data be the new oil: fuelling the next major wave of productivity gains and economic growth?
WHAT’S DRIVING THE AGE OF BIG DATA?
The age of big data is primarily being driven by two trends – Technological Development and Data Democratisation:
Technological Development: Data has become more useful
Technological developments in recent years have enhanced our ability to Collect, Store, Analyse, Present and Optimise data – driving up data value, improving decision-making and changing consumer behaviour at different segments of the value chain (see Figure 1).
Collect
A 2009 IBM study3 predicted that “information that was previously created by people will increasingly be machine-generated – flowing out of sensors, RFID tags, meters, actuators, GPS and more. Inventory will count itself. Containers will detect their contents. Pallets will report in if they end up in the wrong place.” This significantly simplifies the data collection process. We are now producing more information than we have the ability to store.
Store
Google’s DJ Collins has suggested that “organisations, like Amazon, Sun and even Google, are demonstrating the amazing benefits in scale and interoperability that come through moving data storage into the cloud”.4 Merrill Lynch5 estimated that the cloud computing market opportunity would amount to US$160 billion by 2011.
Analyse
Companies are making significant strides in data mining and database management, suggesting that smart systems will become increasingly prevalent. For example, IBM has launched a campaign called “Smarter Planet”6 to intelligently interconnect the many systems that keep the modern world running. They envision that digital technology will make energy, transport, cities and other areas more synergistic, responsive and adaptive.
Present
In the age of big data, as the amount of data collected and in need of analysis rises, we can also expect that new tools to present and visualise this data will emerge. Geoff McGhee, an online journalist specialising in multimedia and information graphics, was awarded a 2009-2010 John S. Knight Journalism Fellowship at Stanford University to study data visualisation.7 He explained that, “Journalists are coping with the rising information flood by borrowing data visualisation techniques from computer scientists, researchers and artists. Some newsrooms are already beginning to retool their staff and systems to prepare for a future in which data becomes a medium.”
Optimise
Having more data is only interesting to the extent that you can make better decisions and change behaviour with relevant information. In his book, “The Numerati”,8 Stephen Baker suggests that a “mathematical modeling of humanity… will manipulate our behaviour – how we buy, how we vote, whom we love – without our even realising it.”
Data Democratisation: Data has become more accessible
Some cities, including London, New York and San Francisco, have begun to open up their public service databases, making large amounts of public data freely available to citizens.
IBM, among many other companies, has built a web-based application called City Forward that takes in data from 50 cities. These trends towards unlocking public data could well change the way that local governments, in particular, are organised. Instead of being a collection of departmental silos, public agencies could come to act as task-specific computing platforms. Most services, from payment systems to traffic information, could be provided as a single consolidated version to be used across all departments – or by private firms that want to offer their own urban applications. As Glen Allmendinger, President of Harbour Research, suggests, “more openness should be good for innovation, not just in terms of the information itself but how it is handled.”9
HOW WILL BIG DATA CHANGE THE GAME?
1. Big data requires new capabilities
In “The Numerati”, Stephen Baker posits the rise of “a new math intelligentsia” who would analyse the data trails we create and build accurate models to predict our behaviour. Some precursors of this trend are already evident in the retail sector, with supermarket chains mining customer loyalty card data to target advertising and stimulate impulse purchases.
“I keep saying the sexy job in the next ten years will be statisticians.”
— Hal Varian, Google Chief Economist10
As we enter a world where data is increasingly free and ubiquitous, industry leaders such as Google Chief Economist Hal Varian believe that the ability to find, process, understand, and extract value from relevant data will become ever more important. Indeed, just as computer engineering was the desirable job of the 1990s, Varian believes that the statistician will be the most sought-after professional of the next decade. We already see this increased demand for data-crunching capabilities reflected in employment data. The US Bureau of Labor Statistics lists Numerati-type jobs amongst the fastest growing and highest paying.
2. Big data creates new industries
In an age where product and technological offerings are becoming homogenised across industries, differentiated business processes are among the last few sources of competitive advantage. We see this reflected in a significant increase in IT spending on business analytics and optimisation.11 Technology futurist Paul Saffo has postulated that soon “many companies will suddenly discover that their main business is data.”12 Indeed, two data crunchers discovered this in the late 1990s, and founded Google. According to Stephen Baker:
“For the age we’re entering, Google is the marquee company. It’s built almost entirely upon math, and its very purpose is to help us hunt down data. Google’s breakthrough, which transformed a simple search engine into a media giant, was the discovery that our queries – the words we type when we hunt for web pages – are of immense value to advertisers. The company figured out how to turn our data into money.”13
In the age of big data, we are likely to find a new wave of sophisticated computing and mathematical techniques and tools being embraced by mainstream businesses. More and more data-driven companies will disrupt incumbent industries in the same way that Google has disrupted the advertising industry.
New data-driven industry mash-ups will also be created. In recent years, we have already seen how biomedicine and wireless technology have converged to create a booming remote-health-monitoring market which enables individuals and clinicians to better monitor changes in an individual’s physiological condition in order to better understand and manage their health. The industry is expected to more than double to $7.7 billion a year by 2012.14
3. Big data uses new methods
After successfully sequencing the human genome, Craig Venter went from sequencing individual organisms to sequencing entire ecosystems. In 2003, enabled by high-speed sequencers and supercomputers that statistically analyse the data they produce, Venter started sequencing much of the ocean, retracing the voyage of Captain Cook. In 2005, he started sequencing the air. In the process of data-crunching and analysing correlations, he discovered thousands of previously unknown species of bacteria and other life forms, and advanced the frontiers of science more than most other scientists of his generation.
In recent years, we have seen a shift away from hypothesis-based research methods. Through the Cluster Exploratory (CluE) programme, the US National Science Foundation now funds researchers to use software and services running on a Google-IBM cluster to explore innovative research ideas in data-intensive computing. Access to highly effective internet-scale applications powered by massively scaled, highly distributed computing resources is likely to fuel advances in science and engineering.
HOW CAN SINGAPORE SUCCEED IN THE AGE OF BIG DATA?
In a report on “A Vision of Smarter Cities: How Cities can Lead the Way into a Prosperous and Sustainable Future”,15 the IBM Institute for Business Value explained that cities are based on a number of core systems composed of different networks, infrastructures and environments related to their key functions: city services, citizens, business, transport, communication, water and energy. However, these systems are not discrete. Instead, they interconnect in a synergistic fashion that reaches towards optimum performance and efficiency. These core systems, in effect, become a “system of systems”. For better system performance, cities adopt new technologies to become “smarter”.
We might take a leaf out of Seoul’s experience. Over the last 10 years, Seoul has become a hotbed of early adopters, and global powerhouses from Microsoft to Cisco Systems to Nokia use it as a laboratory. While Seoul took the considerable risk of being out front in an untested field, it has demonstrated the potential gains to the city government, as well as its citizens, from being an early adopter of the “smart city” strategy.16
Seoul is not the only example. In many major cities, large portions of stimulus packages are being spent on smart city infrastructure projects, and some have made smart systems a priority of industrial policy. Urban centres in China are a good example of this: for instance, without an infrastructure enhanced by digital technology, Beijing knows that it will be difficult to provide the country’s newly urbanised population with enough food, transport, electricity and water.
Singapore too aspires to be a “smart city”, a living laboratory of sustainable urban solutions that are scalable and exportable. How can Singapore succeed in the age of big data? To start with, Singapore needs to develop new numerati capabilities in its people and businesses, to seed new data-driven industry mash-ups and encourage the adoption of data-driven business solutions and methods.
Another interesting area might be urban applications using real-time data. LIVE Singapore!,17 a research project funded by the National Research Foundation and developed by MIT’s Senseable City Lab, gives us some clues about what this might look like.
LIVE Singapore! closes the feedback loop between people moving in the city and the digital real-time data collected in multiple networks. It gives the data back to the people who themselves generate it through their actions, allowing them to be more in sync with their environment as well as to make decisions based on information that reflects the actual state of their city.
To achieve this, LIVE Singapore! involves the development of an open platform for the collection, combination, fusion and distribution of real-time data that originate from a large number of different sources.
The platform is not aimed at one single application; instead, it resembles an ecosystem and a toolbox for real-time data that describes urban dynamics. Building on this platform, a community of developers can build multiple applications in a joint effort which harnesses the creative potential of citizens in extracting new value from real-time data.
LIVE Singapore! is in its research phase, but we should already consider how Singapore could scale up such open platforms to become “a system of systems”, with real-time data dissemination and broader application development. We can begin to imagine a Singapore of the next 10-20 years where data-driven activities form the basis of a new, truly knowledge-based growth industry: deep data-crunching and analytics capabilities could drive growth in other industries like healthcare and clean technology; Asia’s top numerati could come here to work on leading-edge urban applications of real-time data; private companies could collaborate with the public sector to turn Singapore into a living laboratory, developing data-driven applications that enhance standards of living around the world.
What else might Singapore be able to achieve in the coming age of big data?
NOTES
1. "The Petabyte Age: Because More Isn’t Just More", Wired Magazine (June 2008)
2. "A Special Report on Smart Systems", The Economist (6 November 2010)
3. "A Vision of Smarter Cities: How Cities Can Lead the Way To A Prosperous And Sustainable Future", IBM Institute for Business Value (2009)
4. "Data: Future Challenges", Vodafone, Future Agenda (2009). http://www.futureagenda.org/?p=593#more-593
5. "The Cloud Wars: $100+ Billion At Stake", Merrill Lynch (2008). The analysts write that by 2011, the volume of cloud computing market opportunity would amount to $160bn, including $95bn in business and productivity apps (email, office, CRM, etc.) and $65bn in online advertising.
6. http://www.ibm.com/smarterplanet/us/en/index.html?re=sphv
7. http://datajournalism.stanford.edu/
8. Baker, Stephen, The Numerati (USA: Houghton Mifflin, 2008)
9. See Endnote 2.
10. http://www.nytimes.com/2009/08/06/technology/06stats.html
11. IBM estimates that IT spending for the business analytics and optimisation market was about USD105B with 7.8% compound annual growth rate (CAGR) in 2009.
12. See Endnote 2.
13. See Endnote 8.
14. "The Networked Body", Fast Company (July/August 2009)
15. See Endnote 3.
16. Kim, Stephen and Powell, Bill, Time Magazine (7 September 2009)
17. The LIVE Singapore! exhibition was on public display at the Singapore Art Museum in April 2011.