Contact Tracing Tech Across the Data Life Cycle

ETHOS Issue 22, June 2021

To stem the spread of the COVID-19 pandemic, governments around the world have created digital tools to quickly identify actual and potential COVID-19 infections. Together with traditional methods of contact tracing (i.e., interviewing patients to identify their close contacts), these digital tools aim to speed up tracing,¹ reducing the time taken from days to a matter of minutes.² This shortens the interval between an individual’s possible exposure to the virus and entry into quarantine, which is especially useful since asymptomatic and pre-symptomatic carriers are known to be significant spreaders of the virus. Such tools also help tracers work backwards to identify clusters, curbing the spread of an illness in which patients are most infectious in the first week of illness.³

However, these digital tools have also sparked privacy and other concerns. Scholars at the Oxford Internet Institute⁴ have highlighted two aspects of the ethics and governance questions involved. First, there are the high-level principles: are the apps necessary, proportional, scientifically sound, and temporary? Second, there are the enabling factors, such as whether use of the apps is voluntary, whether people can consent to their data being used, and the extent to which the purpose of the apps is defined.

Countries have struck different balances between public health utility and civil liberties in the use of digital contact tracing tools. In some countries, participation is practically compulsory and data is centralised. In China, for example, many cities use Alipay Health Code to determine whether people may enter buildings or use public transit. A New York Times analysis of the app’s code found that it shares data with the police by sending each user’s location and identifying code numbers to a server.⁵ In Singapore, the use of SafeEntry, which people use to log their presence, is compulsory at a wide range of venues: offices, shops, worksites, schools, healthcare facilities, places of worship, hotels, recreation and entertainment venues, restaurants, and cultural institutions among others.⁶ Data is stored and encrypted in a server for 25 days.

Other countries use apps that maximise choice and emphasise privacy protections. Canada and several European Union (EU) countries have launched apps that citizens are not required to use. As of March 2021, Canada’s COVID Alert had been downloaded by just 6 million people, roughly 16% of the population.⁷ As of January 2021, Germany’s Corona-Warn-App had been downloaded by just 24 million people, roughly 29% of the population.⁸

Governments have a duty to protect both the safety and the rights of their citizens; they must do both in a way that maintains trust. Policymakers and citizens can better think about what trade-offs to make and what guard rails against abuse to erect by considering the data life cycle of contact tracing technology.

Decisions Across the Data Life Cycle

Data lies at the core of any digital system. A digital system uses hardware and software to turn data into a solution. Take for example what happens when we have virtual meetings. Software (e.g., applications like Zoom/Skype, or operating systems such as Windows 10 or iOS) and hardware (e.g., a computer, cables, routers, Wi-Fi signals) combine to take sound and light on one end and turn them into bits of data that emerge on the other end as audio and video. All the companies involved in this process could be collecting various sorts of data, such as data about the quality of the connection, the time and duration of the meeting, and the performance of devices.

Figure: Professor Jeannette Wing’s model of the data life cycle encompasses all but the last phase (archival/destruction), as part of her argument that privacy and ethical concerns ought to be considered throughout the life cycle of data.

As Jeannette Wing, Professor of Computer Science and Director of Columbia University’s Data Science Institute, describes in an influential 2019 essay, data in a digital system goes through a life cycle with various phases.⁹ When it comes to COVID-19 technology, policyowners need to make important decisions for each of these key phases, throughout the life cycle of the data in question.

Collection

Our smartphones contain sensors and devices that generate data about us in order to function. Accelerometers detect motion and can be used to measure our steps taken. GPS specifies our location. Bluetooth enables our phones to “talk” to other devices, thereby registering the device’s presence.

Not all data generated needs to be collected. Policyowners must determine what data generated is meaningful to collect. That is, does the data have quality, and is it a good proxy for the social condition being analysed? Also, how might they get the data they want to collect? Should they settle for less meaningful data if it is more readily collected?

Some COVID-19 contact tracing technologies collect location data (where you were); others collect proximity data (whom you were with). For the former, data can be collected in different ways. For example, some apps in China use data about user GPS locations, acquired from telcos, to determine an individual’s travel history and possible exposure in high-risk areas. In this case, data is acquired automatically, with little participation from users. Singapore’s SafeEntry app also logs location, but by having users scan a QR code when they visit various venues. The advantage to both approaches is their use of familiar technologies. QR code scanning has the added privacy advantage of not continuously logging users’ locations wherever they might be, including at home.

There are, however, downsides to using these familiar technologies of location-based tracing. QR code-based systems expose users to security threats.¹⁰ Additionally, where droplet-based transmission dominates, location-based tracking is less useful than proximity-based tracking, since our vulnerability to infection by people with COVID-19 is shaped more by how near we were to them, and for how long, than whether we were at the same venue as them.

Besides the type of data to collect, policyowners must also decide how much data to collect to make contact tracing effective. How much does the government need to know about a person who might have been exposed to the virus?

Apps built on Apple and Google’s Exposure Notifications System (ENS)¹¹ only transmit and store random and regularly changing identification numbers that are linked to users’ devices rather than their identities. This approach has won plaudits from privacy advocates,¹² but creates a different challenge, because devices could be used by different people. How do we know that the person using the device when it registers proximity with an infected person is the same person who is called up for possible exposure? In contrast, India’s Aarogya Setu, which has raised privacy concerns, collects a person’s name, phone number, gender, travel history, and smoking habits (if any), on top of both location and proximity data.

In deciding how much information is needed to identify individuals, authorities need to consider two issues: what information is useful for public health efforts, and potential risks to privacy and data security.

Storage

Data collected is data that must be stored somewhere. In the case of potentially sensitive personal information, this stored data must also be secured. Here, policyowners need to weigh threat assessment (who would want to compromise the data collected, for what purpose, and how?) against public health utility.

Unlike solutions that collect location data, both BlueTrace¹³ (derived from Singapore’s TraceTogether) and the Apple-Google ENS use Bluetooth to collect proximity or associational data. Both transmit and receive random “nicknames” associated with users’ devices. Both enhance privacy and data security by only storing nicknames on these devices. Even if the encryption keys that link these nicknames to devices or identities were ever compromised, a hacker would still have to break into individual phones to access the proximity data stored within.
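The idea of random, regularly changing nicknames that are logged only on the device can be sketched in a few lines of Python. This is an illustration only and not the BlueTrace or Apple-Google ENS implementation: the class name `Device`, the 15-minute rotation interval, and the 16-byte nickname length are all assumptions chosen for the example.

```python
import secrets

ROTATION_SECONDS = 15 * 60   # assumed rotation interval, for illustration only


class Device:
    """A phone that broadcasts rotating nicknames and logs what it hears."""

    def __init__(self):
        self._nickname = None
        self._issued_at = None
        self.heard = []          # encounter log, kept on the device only

    def current_nickname(self, now):
        """Return the nickname to broadcast at time `now` (in seconds)."""
        expired = (
            self._nickname is None
            or now - self._issued_at >= ROTATION_SECONDS
        )
        if expired:
            # A fresh random value carries no name, phone number or location.
            self._nickname = secrets.token_hex(16)
            self._issued_at = now
        return self._nickname

    def hear(self, nickname, now):
        """Record a nickname received over Bluetooth from a nearby device."""
        self.heard.append((now, nickname))


alice, bob = Device(), Device()
bob.hear(alice.current_nickname(now=0), now=0)
bob.hear(alice.current_nickname(now=1200), now=1200)   # after a rotation
```

In this sketch `bob.heard` ends up holding two different nicknames for the same neighbour, so someone who sees only the broadcasts cannot easily tell that they came from one phone.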

This means that both systems have a security advantage over systems that store data on a server. Storing data on a server creates a single point of failure—the server could be compromised by a hack, an accidental leak, or sabotage by a rogue insider. If the data is stored in plain text (i.e., readable by a human being), and if large amounts of data have been collected so that one leak exposes a great deal of information about individuals, then the leak becomes more severe. In China, COVID-19-related data, including names, identification numbers, phone numbers, and addresses, have been leaked:¹⁴ as a result, people who may have been exposed to the virus have been harassed. The leak of data that is difficult to change—e.g., a national identification number or biometric data—would be particularly egregious, since that could compromise people across multiple systems in ways that are challenging to quickly correct.

BlueTrace and the Apple-Google ENS differ in whether they allow information gathered to be uploaded to a server where it can be decrypted to identify individuals: BlueTrace does; the Apple-Google ENS does not. BlueTrace allows data centralisation because it enables health authorities to quickly identify and isolate potential carriers of the SARS-CoV-2 virus. As a safeguard, BlueTrace only stores and uploads information on exposure gathered over a certain period. As such, BlueTrace’s compromise on data centralisation arguably makes it a better option for most health systems. Notably, centralising information and contact tracing by health authorities are already standard practices with many other infectious diseases (e.g., sexually transmitted infections,¹⁵ tuberculosis,¹⁶ Ebola¹⁷).
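The difference between the two designs can be sketched as follows. This is a simplified illustration rather than either protocol: the function names and sample data are hypothetical, and the lookup table stands in for the server-side decryption that a centralised system performs.

```python
def match_on_device(heard_nicknames, published_by_infected):
    """Decentralised style: the phone compares its own log against
    nicknames published by users who tested positive. The log itself
    never leaves the phone."""
    return set(heard_nicknames) & set(published_by_infected)


def match_on_server(uploaded_log, registry):
    """Centralised style: an infected user uploads their encounter log
    and the health authority resolves each nickname to a contact number.
    `registry` is a stand-in for decryption with a key that only the
    authority holds."""
    return sorted({registry[n] for n in uploaded_log if n in registry})


# Hypothetical sample data.
registry = {"a1f3": "+65 0000 0001", "9c2e": "+65 0000 0002"}

exposed = match_on_device(["a1f3", "77b0"], ["a1f3"])
to_call = match_on_server(["9c2e", "a1f3", "zzzz"], registry)
```

In the first case only the phone’s owner learns of the match and must choose to come forward; in the second, tracers obtain a list of people they can call directly.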

Apple and Google have refused to allow centralisation due to privacy concerns. As global companies, their tough stance on privacy is understandable, since they would not want their technology to be used in countries where government possession of individuals’ data could enable serious incursions against civil liberties. Governments have clashed with the tech giants by asking that the ENS allow them to collect more data and centralise information for COVID-19 infections.¹⁸ These governments question why the tech giants’ commercial policies should prevent them from better integrating technology with public health operations to combat an unprecedented health crisis.¹⁹

Nevertheless, many governments have launched apps based on the Apple-Google protocol, leveraging their ready-to-use digital contact tracing technology to help ameliorate rising outbreaks. The efficacy of such apps does depend on governments’ success in persuading citizens to update their status in the apps if they happen to test positive for COVID-19.

Management and Analysis

Wing writes that data needs to be managed in ways that “maximise our ability to access and modify the data for subsequent analysis”.²⁰ This raises a highly salient data governance issue, especially in cases where COVID-19 data is centralised: who gets access to the data collected, and what are they authorised to do with it? How this concern is addressed, and assurances that provisions will be adhered to, are key to building trust with citizens over the trade-offs in privacy that governments ask of them.

Singapore’s experience underscores the need for legislative guarantees to limit data use. The team behind Singapore’s TraceTogether initially publicly committed that data shared with the Ministry of Health would only be used for contact tracing. Politicians repeated this assurance as well. However, in January 2021, in response to a question posed by a Member of Parliament, Singapore’s Ministry of Home Affairs acknowledged that, contrary to these public assurances, the police had in fact accessed TraceTogether data for an investigation in May 2020. The TraceTogether team subsequently clarified that TraceTogether data had always been subject to the Criminal Procedure Code, which empowers the police to access any data for the purpose of criminal investigation.

Singaporeans reacted less to the violation of privacy and more to the fact that a public assurance had been broken, with the information about it only being released months after the fact. The undermining of trust was arguably made worse by the fact that, as Computer Science Associate Professors Terence Sim²¹ and Ben Leong²² argued separately, TraceTogether data would not have been that useful in criminal investigations anyway. Indeed, Minister of State for Home Affairs Desmond Tan acknowledged that investigators did not find useful data since the suspect had not downloaded the app onto his phone.²³ Singaporeans began to uninstall the app or leave their TraceTogether tokens at home. In response, the Government passed legislation to limit the use of TraceTogether data to investigations for specified serious crimes.

Contact Tracing and Privacy Concerns Elsewhere

Israel’s contact tracing technology is managed by the state security agency, Shin Bet. The agency traces close contacts of COVID-19 patients by using the “Tool”, a database which contains the details of phone users in Israel. There is little transparency about how health information collected by the Tool is stored and protected. The use of the Tool for health purposes has been challenged in courts and requires periodic authorisation by the Israeli parliament. Israel’s Health Ministry reported that traditional contact tracing had only uncovered a third of COVID-19 cases, while the Tool had identified the rest. Israeli society seems to have accepted the temporary use of a security service for public health purposes, arguably because of its governance and effectiveness.¹

Fear that a lack of robust governance would deter people from using contact tracing apps has led authorities to commit to limiting the purpose of contact tracing data.

The United Kingdom’s Department of Health and Social Care assured individuals that police would not get access to data acquired by the National Health Service’s COVID-19 app.² Australia has outright criminalised the use of contact tracing data for non-health reasons.³

NOTES

1. Tehilla Schwartz Altshuler and Rachel Aridor Hershkowitz, “How Israel’s Covid-19 Mass Surveillance Operation Works”, Brookings: Tech Stream, July 6, 2020, accessed March 31, 2021.
2. Zoe Kleinman, “Covid Contact-Tracing App Not Sharing Data with Police”, BBC, October 19, 2020, accessed March 31, 2021.
3. Byron Kaye, “Australia Will Make It a Crime to Use Coronavirus Tracing Data for Non-Health Purposes”, Reuters, April 24, 2020, accessed March 31, 2021.

Singapore’s experience suggests that there is, realistically, a need to consider alternatives to the two extremes of either opaque and liberal use of data or strictly limiting its use. The World Economic Forum’s White Paper on Authorised Public Purpose Access (APPA)²⁴ provides a middle way forward based on public consent. APPA would move data governance away from a model that over-emphasises individual consent (which could harm individuals) to one that balances the needs of individuals, public purpose, and data holders. Masako Okamoto and Takanori Fujita explain that “under APPA, personal data can sometimes be accessed and used without explicit individual consent, provided this is done for a specific, widely agreed-upon public purpose”.²⁵ An APPA process would include checking a White List to determine if the data type has been approved for a specifically designated purpose. It would also include review by a third party such as an independent board. In this sense, Singapore’s legislation to limit the use of TraceTogether data to specific purposes is a step in the right direction.
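The two checks described here, a White List lookup followed by third-party review, could be pictured as below. APPA is a governance process rather than a piece of software, so this is only a sketch: the White List entries, the function name `appa_allows`, and the use of a simple flag for the independent board’s decision are all hypothetical.

```python
# Hypothetical White List: (data type, purpose) pairs approved in advance.
WHITE_LIST = {
    ("proximity_log", "contact_tracing"),
    ("proximity_log", "serious_crime_investigation"),
}


def appa_allows(data_type, purpose, approved_by_independent_board):
    """Grant access only if the pair is white-listed AND a third party
    has reviewed and approved this particular request."""
    if (data_type, purpose) not in WHITE_LIST:
        return False
    return approved_by_independent_board


ok = appa_allows("proximity_log", "contact_tracing", True)
denied = appa_allows("proximity_log", "marketing", True)
```

The point of the design is that neither check is sufficient alone: a purpose that is not on the list is refused even with board approval, and a listed purpose is refused until the board has reviewed the request.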

More broadly, the use of digital contact tracing tools could be strengthened by better provisions for decision provenance (i.e., who makes what decisions, and how) for government technology. Jatinder Singh, Jennifer Cobbe and Chris Norval of Cambridge University’s Department of Computer Science and Technology²⁶ note that complex technological systems raise accountability challenges as data flows across technical and organisational boundaries. Transparency about who makes what decisions, where in a system, and how, could facilitate compliance with requirements and regulations, offer recourse against harm, and give users agency to make more informed decisions about how their data is used. For this approach to work, engineers, data scientists and bureaucrats will have to make their work more legible to ordinary citizens. In turn, citizens must also become more competent at querying and critically analysing these decisions.

Upholding Trust in Data Use

Microsoft CEO Satya Nadella has offered suggestions for how authorities might access data in a way that upholds trust. These include:¹

having an efficient mechanism to access data that is supported by “a clear legal framework that is subject to strong checks and balances”;
strengthening users’ privacy protections to prevent the erosion of these rights in the name of efficiency;
allowing technology companies, “except in highly limited cases”, to inform users that the authorities have sought their data;
having governments seek data from a source that is closest to the end user; and
ensuring that any attempt to access data does not undermine security, and therefore users’ trust in technology.

NOTE

Satya Nadella, Hit Refresh: The Quest to Rediscover Microsoft’s Soul and Imagine a Better Future for Everyone (London: William Collins, 2017), 190–193.

Data governance that follows the APPA model and incorporates decision provenance transparency would require a society to deliberate robustly over the circumstances under which specific data can be shared with and used by specific entities for specific purposes. Indeed, if countries are to treat data as a resource as much as they do finance, then it should be similarly subjected to periodic, public deliberation as to its best use. Continuous public deliberation over data would also be a way for governments to keep abreast of changing societal attitudes towards data privacy, and to consistently replenish reservoirs of trust.

Archival or Destruction

Decisions at the archival or destruction stage of the data life cycle could also influence the extent to which people participate in digital contact tracing. Archival involves removing data from further use (by, for example, storing it in a device that is not connected to networks). Destruction involves permanently deleting it. Deleting data is, in fact, a key aspect of Mozilla’s Lean Data Practices. Mozilla notes that the “value of data diminishes over time”, and that sensitive data should either be deleted when it is no longer relevant, or stripped of markers that identify the person to whom the data belongs, as much as possible.²⁷

The best practice in COVID-19 contact tracing systems is to delete information after some time, usually the time it takes for the virus to incubate. The Apple-Google ENS deletes information after 14 days. TraceTogether deletes information after 25 days, because some reports suggest that the virus can incubate for longer than the two weeks previously assumed.²⁸ In systems that store data on servers, information could be anonymised and aggregated to diminish its value should it be leaked. This would reduce the harm caused to any individual in case the data is compromised.
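A retention rule of this kind amounts to a periodic purge. A minimal sketch, assuming records are stored as (timestamp, nickname) pairs: the function name `purge_expired` and the sample records are hypothetical, while the 25-day and 14-day windows are the figures quoted above.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=25)   # 14 days would model the Apple-Google ENS


def purge_expired(records, now, retention=RETENTION):
    """Keep only records younger than the retention window."""
    cutoff = now - retention
    return [(ts, nick) for ts, nick in records if ts > cutoff]


now = datetime(2021, 3, 31)
records = [
    (datetime(2021, 3, 1), "a1f3"),    # 30 days old: dropped
    (datetime(2021, 3, 20), "9c2e"),   # 11 days old: kept
]
kept = purge_expired(records, now)
```

Running such a purge on a schedule means that a leak at any moment can expose at most one retention window of data.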

Operational Realities and Considerations

A key ethical question with contact tracing tools is whether participation in tracing is voluntary or mandatory. A high take-up rate of apps is necessary to make digital contact tracing effective: between 56% and 95% of the population, according to a study in The Lancet.²⁹ This might tempt governments to make the use of contact tracing apps mandatory.
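One way to see why take-up matters so much is that an app can only register an encounter when both people are running it. Under the simplifying assumption that app users are spread evenly through the population and meet at random (an assumption of this sketch, not a finding of the study cited), the share of encounters covered is roughly the square of the adoption rate. The Canadian and German rates are the download figures quoted earlier, which probably overstate active use.

```python
def encounter_coverage(adoption_rate):
    """Share of two-person encounters in which both people have the app,
    assuming users are spread evenly and meet at random."""
    return adoption_rate ** 2


for label, rate in [("Canada", 0.16), ("Germany", 0.29), ("56% take-up", 0.56)]:
    print(f"{label}: {encounter_coverage(rate):.1%} of encounters covered")
```

On these assumptions, 16% adoption covers under 3% of encounters and 29% adoption under 9%, while even the lower bound of 56% covers less than a third.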

Indeed, depending on voluntary participation might not work. In Canada, 95% of Canadians who tested positive for COVID-19 failed to voluntarily report their diagnosis using the country’s COVID Alert app. In Ontario, where the app first launched, only 4% of people who had COVID-19 logged their positive diagnoses in the app between 31 July 2020, when the app launched, and 28 September 2020.³⁰

At the same time, people care about whether contact tracing apps could be used to perpetually monitor them. In Singapore, 45% of respondents in a study by Blackbox Research said that they did not download the TraceTogether app mainly because they “did not want the government tracing their movements”.³¹ When Singapore introduced a TraceTogether token, over 54,000 people signed a Change.org petition, “Singapore Says ‘No’ To Wearable Devices for Covid-19 Contact Tracing”.³² In India, citizens pushed back against an attempt by a district administration to make the use of Aarogya Setu compulsory.³³

Dr Vivian Balakrishnan, Singapore’s Minister for Foreign Affairs and Minister-in-charge of the Smart Nation initiative, affirmed in an interview that “maintaining trust, respecting privacy and getting voluntary participation is absolutely essential” for contact tracing.³⁴ Indeed, nurturing public trust in institutions is vital to governments’ ability to fight COVID-19. Actions that sacrifice trust for compliance with mandated use of apps could end up backfiring on overall public health efforts.

Moving Forward: Strengthening Governance and Trust

As COVID-19 restrictions ease, the case for using contact tracing tools to mitigate risk becomes stronger. More will need to be done to assure citizens that contact tracing technology—indeed, all sorts of civic technology—is doing things for them rather than to them. In this vein, Singapore’s Ministry of Health and Smart Nation and Digital Government Office promoted the use of TraceTogether as an effort by Singaporeans to “protect themselves, their loved ones and their community”³⁵ from COVID-19. This was an appeal to both self-interest and altruism. In Singapore, SafeEntry has long been functionally mandatory, its use required in a wide array of venues. There have latterly been moves to make the use of TraceTogether mandatory at certain venues. This is an opportunity to strengthen public engagement on when, where, and why contact tracing is used, in order to encourage enthusiastic participation.

Better data governance could also foster participation. Compromises on voluntary participation should be accompanied by stronger protections for privacy trade-offs. For example, SafeEntry could adopt some of TraceTogether’s privacy safeguards, such as limiting the data’s use to explicitly stated public purposes. Governments could also strengthen public assurances by stiffening sanctions on public servants who use contact tracing data for reasons other than those that have been authorised.

Ultimately, people’s willingness to work with COVID-19 public health measures depends on the extent to which they trust the governance system as a whole. In this sense, citizens’ enthusiastic compliance with public health measures—including digital contact tracing—is a daily poll of the authorities’ ability to both protect their health and keep their data safe.

ABOUT THE AUTHOR

Vernie Oliveiro is Principal Researcher at the Institute of Governance and Policy, Civil Service College. Her research interests include governance and digital government.

NOTES

1. Sharon Begley, “Covid-19 Spreads Too Fast for Traditional Contact Tracing. New Digital Tools Could Help”, Stat+, April 2, 2020, accessed March 31, 2021.
2. Cara Wong, “Digital Tools Help Speed Up Contact Tracing Efforts to Ring-Fence Covid-19 Cases”, The Straits Times, July 8, 2020, accessed March 31, 2021.
3. Muge Cevik, Matthew Tate, Ollie Lloyd, Alberto Enrico Maraolo, Jenna Schafers, and Antonia Ho, “SARS-CoV-2, SARS-CoV, and MERS-CoV Viral Load Dynamics, Duration of Viral Shedding, and Infectiousness: A Systematic Review and Meta-Analysis”, The Lancet Microbe 2, no. 1 (November 2020), accessed March 31, 2021.
4. Jessica Morley, Josh Cowls, Mariarosaria Taddeo, and Luciano Floridi, “Ethical Guidelines for SARS-Cov-2 Digital Tracking and Tracing Systems”, SSRN, April 22, 2020, accessed March 31, 2021.
5. Paul Mozur, Raymond Zhong, and Aaron Krolik, “In Coronavirus Fight, China Gives Citizens a Colour Code, with Red Flags”, New York Times, March 1, 2020, accessed March 31, 2021.
7. Rachel Collier, “More than 6 Million Canadians Download COVID Alert App”, The Star, March 10, 2021, accessed March 31, 2021.
8. Andrea Waisgluss, “With Over 24 Million Downloads, Germany’s COVID-19 App Helps Break the Infection Chain”, Forbes, January 25, 2021, accessed March 31, 2021.
9. Jeannette M. Wing, “The Data Life Cycle”, Harvard Data Science Review 1, no. 1 (Summer 2019), July 2, 2019, accessed March 31, 2021.
10. Brian Foster, “QR Codes: A Sneaky Security Threat”, threatpost, October 1, 2020, accessed March 31, 2021.
11. For more information, see “Exposure Notifications: Frequently Asked Questions”, September 1, 2020, accessed March 31, 2021.
12. “Contact Tracing Joint Statement”, April 19, 2020, accessed March 31, 2021. As of July 3, 2020, the letter had 650 signatories.
13. For more information, see Jason Bay, Joel Kek, Alvin Tan, Chai Sheng Hau, Lai Yongquan, Janice Tan, and Tang Anh Quy, “BlueTrace: A Privacy-Preserving Protocol for Community-Driven Contact Tracing Across Borders”, Government Technology Agency Singapore, April 9, 2020, accessed March 31, 2021.
14. Shen Xinmei, “Personal Information Collected to Fight Covid-19 Is Being Spread Online in China”, South China Morning Post, May 12, 2020, accessed March 31, 2021.
15. Jack Cassell, “Coronavirus: Why Did England Ignore an Army of Existing Contact Tracers?” The Conversation, June 18, 2020, accessed March 31, 2021.
16. See, for example, Department of Health Australia, “Communicable Diseases Network Australia Guidelines for Public Health Units–Management of TB”, last updated May 12, 2015, accessed March 31, 2021.
17. World Health Organization: Regional Office for Africa, “Contact Tracing During an Outbreak of Ebola Virus Disease”, September 2014, accessed March 31, 2021.
18. Reed Albergotti and Drew Harwell, “Apple and Google Are Building a Virus-Tracking System. Health Officials Say It Will Be Practically Useless”, Washington Post, May 16, 2020, accessed March 31, 2021.
19. Mark Scott, Elisa Braun, Janosch Delicker, and Vincent Manancourt, “How Google and Apple Outflanked Governments in the Race to Build Coronavirus Apps”, Politico, May 15, 2020, accessed March 31, 2021.
20. See Note 9.
21. Terence Sim Mong Cheng, “Forum: TraceTogether Data May Not Be That Useful for Criminal Investigation”, The Straits Times, January 5, 2021, accessed March 31, 2021.
22. Ben Leong, “Another Year, Another Fiasco”, post published on Medium, January 6, 2021, accessed March 31, 2021.
23. David Sun, “TraceTogether Data Was Accessed in May 2020 for Punggol Fields Murder Investigation”, The Straits Times, February 2, 2021, accessed March 31, 2021.
24. World Economic Forum White Paper, “APPA – Authorised Public Purpose Access: Building Trust into Data Flows for Well-being and Innovation”, December 2019, accessed March 31, 2021.
25. Masako Okamoto and Takanori Fujita, “A New Data Governance Model for Contact Tracing: Authorised Public Purpose Access”, World Economic Forum, August 12, 2020, accessed March 31, 2021.
26. Jatinder Singh, Jennifer Cobbe, and Chris Norval, “Decision Provenance: Harnessing Data Flow for Accountable Systems”, IEEE Access 7 (November 2019): 6562–6574, accessed March 31, 2021, doi: 10.1109/ACCESS.2018.2887201.
27. “Lean Data Practices”, Mozilla, accessed March 31, 2021, https://www.mozilla.org/en-US/about/policy/lean-data/stay-lean/.
28. “Coronavirus Incubation Could Be as Long as 27 Days, Chinese Provincial Government Says”, Reuters, February 22, 2020, accessed March 31, 2021.
29. Isobel Braithwaite, Thomas Callender, Miriam Bullock, and Robert W. Aldridge, “Automated and Partly Automated Contact Tracing: A Systematic Review to Inform the Control of Covid-19”, The Lancet Digital Health 2, no. 11 (November 2020): E607–621, accessed March 31, 2021, https://www.doi.org/10.1016/S2589-7500(20)30184-9.
30. Jasmine Pazzano and Brian Hill, “The Covid Alert App Isn’t Working As Well As It Should Be, and Canadians Are Part of the Problem”, Global News, October 2, 2020, accessed March 31, 2021.
31. Dewey Sim and Kimberly Lim, “Coronavirus: Why Aren’t Singapore Residents Using the TraceTogether Contact-Tracing App?” South China Morning Post, May 18, 2020, accessed March 31, 2021.
32. Petition retrieved from https://www.change.org/p/singapore-government-singapore-says-no-to-wearable-devices-for-covid-19-contact-tracing.
33. Sushovan Sircar, “Noida Reverses Mandatory Aarogya Setu Order after Legal Challenge”, The Quint, May 20, 2020, accessed March 31, 2021.
34. Ministry of Foreign Affairs Singapore, “Minister for Foreign Affairs Dr Vivian Balakrishnan’s Skype Interview with Sky News Australia”, May 22, 2020, accessed March 31, 2021.
35. Smart Nation Singapore and Ministry of Health Singapore, “Launch of New App for Contact Tracing”, accessed March 31, 2021.
