Will computers decide who lives and who dies? Ethics, Health, and AI in a COVID-19 world

“Brother! You doubting Thomases get in the way of more scientific advances with your stupid ethical questions! This is a brilliant idea! Hit the button, will ya?”

Calvin addressing Hobbes regarding the ‘Duplicator’ (Watterson, 1990)

While talk about a post-COVID-19 world is rife, reflecting more the desire for an economic relaunch than the medical reality of the moment, we are still struggling to understand the effects that the pandemic is having on our societies. These ripple effects are likely to outlive the pandemic itself, making themselves visible even after it has, hopefully, been eradicated.

One of the conversations that has emerged most clearly is linked to the use of Artificial Intelligence (AI) in healthcare, and concerns both its effectiveness and its ethics. This article will follow two major ethical questions that have dominated the public sphere up to now: the use of data tracking systems for forecasting viral spread, and the possible use of AI as decision support for the allocation of medical resources in emergency situations. The main question underpinning this article is: how will our approach to these challenges impact our future?

Exploring the possible answers to this question will lead us to analyse the impact of the dynamic socio-cultural environment on the predictive capacities of algorithm-based AI models. The article will emphasise the importance of integrating culturally specific dimensions in developing and deploying AIs, and discuss how to approach ethics as applied to AI in a culturally aware manner. 

COVID-19

The pandemic we are living through has generated a series of unforeseen effects on local, regional and global scales. From rising instances of racism, and of domestic violence in conjunction with the lockdown measures, to major disruptions in human activities that may generate the biggest economic contraction since World War II, we are experiencing a combination of phenomena that reminds us of the interconnectedness of our world.

The SARS-CoV-2 virus appeared in a context of decreased trust in public institutions and in scientific expertise at a global level, against a background of the increased dominance of social media in spreading fake news and pseudo-scientific theories. This was a perfect storm, which has allowed not only the weakening of democratic institutions and the rise of authoritarian leadership, but also the rapid spread of the virus itself.

At the global level all efforts are geared towards controlling the spread of the virus, creating a vaccine, and treating those affected. Naturally, eyes turned to Artificial Intelligence and to the possibility of using it as a tool to help in these efforts. This process has revealed, and continues to reveal, complex and rather problematic interactions between AI models and the reality in which they are deployed, as well as the conflict between competing AI ethical principles.

Ethics: principles versus practices

In a lecture at Tübingen University, the former UN Secretary-General Kofi Annan said: “One thing that should be clear is that the validity of universal values does not depend on their being universally obeyed or applied. Ethical codes are always the expression of an ideal and an aspiration, a standard by which moral failings can be judged, rather than a prescription for ensuring that they never occur.” 

This is a powerful statement, touching at the core of the ethical challenges of leadership. However, it may also contain a major flaw: while ethical codes can be framed as an expression of universal aspirations, the standards by which we may judge moral failings cannot equally be universal. Whether we like it or not, morality is culturally dependent – and moral failings may certainly fall into a cultural blind-spot for many of us. Yet this does not mean that we can advocate abandoning universal ethical codes in the name of ‘cultural particularities’ (although this is a current practice among authoritarian figures, particularly regarding respect for human rights). It merely means that we need to be aware of how these aspirational universal codes are expressed in daily practices, and how the transformation of these practices can (and does) generate new moral norms that in their turn shed light on those very cultural blind-spots.

Let’s take an example: Valuing human life is a universal ethical code. But what type of human life is more ‘valued’ than others in different societies? And how do these societies make decisions on that basis? Is a young life more valuable than an old one? How is this valuing expressed in daily practices? Is life at any cost more valuable than an individual choice to ‘not resuscitate’, or to retain dignity in dying? Is it possible to have an ‘equally valuable’ approach to human life even in moments of scarcity? Is collective survival more important than individual well-being – and can these even be separated?

These types of questions have emerged forcefully during the current COVID-19 pandemic crisis, and scientists, ethicists, and politicians are tackling the answers – or acting as if they knew them already. 

To continue, let’s follow two major conversations that have dominated the public sphere lately: the use of data tracking systems for forecasting viral spread, and the possible use of AI as decision support for the allocation of medical resources in emergency situations. By analysing the conversations and practices around these topics, this text will advocate a bottom-up approach towards the use of AI. The main arguments are that ethical codes may sometimes compete among themselves, and that trying to codify them in universally applicable AI algorithms would probably lead to the emergence of new types of biases instead of eliminating the existing ones. Thus, both deciding when to use AI, and designing and relying on AI as a decision-making mechanism, need to take as their starting points the practices that embody moral norms, not universal ethical codes and their presumed codification in AI algorithms. The immateriality of AI models has received a reality check, and the same is about to happen to AI ethics.

A material world

At a higher level of analysis, the pandemic is a reminder that our world is material, despite a discourse that everything has now been virtualised, from markets to life itself. All of a sudden COVID-19 has forced us to experience at least three major types of materialisations: 

Materialisation of borders. While borders have not always been easy to cross, and some frontiers have been more material than others, in the past three months cross-border movement has come to an almost complete halt. Most countries in the world have become inaccessible to those who are not their citizens or residents, and repatriation has more often than not been the only form of international travel still available. As I write this text the lockdown is easing in the European Union, but many other countries around the world remain closed to foreigners.

In parallel, extraordinary forms of collaboration at regional and global levels have shown that only the continuation of an open approach can offer long-term solutions: for example, German hospitals in border regions taking in French patients in order to relieve the overstretched French system. At the same time, displays of solidarity have also been received with suspicion, raising questions about the use of solidarity as a mechanism of soft power, particularly in the case of China.

(De-)Materialisation of movement. Movement has become at once materialised and virtual. During lockdown it has entered a controlled phase at all levels, with much of the workforce entering a mass experiment of working from home. Many who perceived the ability to move as ‘natural’ are now experiencing it for the first time as a privilege. And movement has been displaced onto online platforms, dematerialising itself into bits and pieces of data (more on this later).

Materialisation of our bodies. Most importantly, we have been called upon to acknowledge the full extent of the importance of our bodies. We, individually and collectively, have dramatically come to realise that our lives are very real and unequivocally linked to our material bodies. The variations in the economy’s abstract indicators show that the entire global complex system is not separated from, but is in fact heavily dependent on, our human bodies, their health and their movement (see above). This will contribute to the gradual dismantling of the illusion we may have had that we live in a virtual world in which the body is only an instrument among others, a tool to be refined in gyms and yoga sessions, or a resource to overstretch during long, caffeine-fuelled working hours. Somehow our bodies have become ourselves again.

Tracking

Data tracking systems (DTSs) are not a novelty, and their use by the police is quite widespread in the US. So is their use by marketing companies that rely on data from individual users to push products through targeted advertising. As early as 2012 the question of data tracking while surfing the internet was brought to the public’s attention. The generalisation of smartphone use has made it possible to extend tracking from virtual movement to material movement in space and time. Apps, using the phones’ GPS and a thinly disguised, default-on option for the user (‘allow the app to access your location’), track, store, and sell movement data to third parties for marketing and targeted advertising. In some instances police forces can use the same data to track movements and ‘prevent crime’ – a contested practice that is not yet fully understood, let alone regulated.

The European Union (EU) enacted a data protection act (the General Data Protection Regulation 2016/679, in force since May 2018) that obliges developers to build security protection, pseudonymisation and/or anonymisation of users into their products, and to fully inform consumers and obtain their consent regarding the use of their data. This regulation affects the use of DTSs, making it more difficult to deploy them indiscriminately or to sell the data to third parties (as US-based corporations tend to do). More recently the EU has adopted a series of white papers regarding the more general use of AI, to which I will return.

DTSs combine borders, movements and bodies, and recreate them in the immaterial world of algorithms, juxtaposing them with pre-designed models and assigning the individual user to typologies within those models. It does not matter whether the models are of consumer habits, potential delinquency, or the likelihood of paying off one’s mortgage. The trouble with models in AI has been discussed at length in the literature (O’Neil, 2019; Broussard, 2018; Galison, 2019), and it emerges from a few major sources:

  1. The models are based on previous behaviour and aggregated data, and have only a certain probability of correct identification. This means that they are not 100% accurate. While this may not be a major problem when modelling, say, an athlete’s chances of success, it is highly problematic when models are used to decide upon an individual’s finances or the delivery of justice. It also means that models function only as long as reality matches the conditions under which the data was collected, and that they are highly dependent on data quality and accuracy. Under normal circumstances (read: long periods of status quo), the models more or less function as designed. But as the COVID-19 crisis has shown, any sudden disruption causes ‘model drift’: the models no longer correspond to reality, and they need to be redesigned (a minimal sketch of how such drift can be monitored follows this list). This was first signalled in Amazon’s use of AI, and then spotted in all the major industries that use Artificial Intelligence.
  2. The use of proxy measurements to decide the value attributed to a typology. For example, in order to decide whether someone is a good educator, a model may use children’s performance in a specific exam. That performance, however, depends on a series of other factors that have nothing to do with the educator’s qualities and qualifications. At the same time, performance scores may be gamed if an educator feels her livelihood is threatened, giving rise to further distortions (see O’Neil, 2019).
  3. Models are designed by humans, and more often than not they embed the biases held by their designers and developers. This is also a frequently discussed topic in AI ethics. The solutions offered range from increasing diversity in design and product development teams to abandoning the tool altogether.
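As a purely illustrative aside to point 1 above, the sketch below (in Python) shows one simple way in which model drift can be monitored: the rolling accuracy of a fixed, already-trained model is tracked against newly labelled observations, and a flag is raised when performance falls below a threshold. The model, the data stream and the threshold are hypothetical placeholders, not part of any system discussed in this article.

```python
from collections import deque


class DriftMonitor:
    """Track the rolling accuracy of a fixed model and flag possible drift.

    A minimal illustration only: a real system would also track input
    distributions, confidence calibration and retraining triggers.
    """

    def __init__(self, window: int = 500, threshold: float = 0.8):
        self.recent = deque(maxlen=window)  # 1 = correct prediction, 0 = wrong
        self.threshold = threshold          # minimum acceptable rolling accuracy

    def update(self, prediction, actual) -> bool:
        """Record one prediction/outcome pair; return True if drift is suspected."""
        self.recent.append(1 if prediction == actual else 0)
        if len(self.recent) < self.recent.maxlen:
            return False                    # not enough evidence yet
        return sum(self.recent) / len(self.recent) < self.threshold


# Hypothetical usage, where `model` and the labelled data stream are placeholders:
# monitor = DriftMonitor()
# for features, outcome in incoming_labelled_data:
#     if monitor.update(model.predict(features), outcome):
#         print("Model drift suspected - the model needs to be re-examined.")
```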

With the COVID-19 pandemic, eyes have turned towards the possibility of using DTSs to predict and prevent the rapid spread of the virus by creating early warning mechanisms. The idea is relatively simple: once downloaded, a DTS app tracks the user’s movements through their phone and identifies whether the user has been in the vicinity of someone already registered as COVID-19 positive. The app would then alert the user, and also contribute to an anonymised map of possible viral spread.

AI ethicists raised the first concerns, particularly regarding the tension between two major ethical principles in AI: the autonomy of the user (including the right to privacy) and use for the common good. First, a DTS cannot offer 100% autonomy, particularly when the GPS system is used for tracking. When movement data becomes health data (as in this case), anonymity is all the more important. Individual health data is highly sensitive: it is stored in highly secure environments, anonymised, and used exclusively by healthcare specialists. What if movement data is health data? What if one’s own movements feed into an aggregated data set used to evaluate, through approximate models, one’s health – and the result is eventually sold to interested third parties? Can we decide on this basis who can and who cannot return to work, travel, or even visit friends? What about getting the treatment one may need?

This dilemma has generated different responses, and the solutions proposed gravitate around a twofold approach: use the device’s Bluetooth system instead of the GPS to signal proximity only (and not location), and store all the relevant data on the device (and not on third parties’ servers). Downloading and using the app is voluntary. A variety of apps featuring these solutions is being deployed as I write.
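A minimal sketch of the decentralised design just described might look as follows, assuming rotating random identifiers broadcast over Bluetooth, a local on-device store of observed identifiers, and a published list of identifiers belonging to confirmed cases. The class and its methods are hypothetical simplifications for illustration; real protocols, and the apps built on them, are considerably more involved.

```python
import os
import time


class ProximityTracer:
    """Simplified sketch of decentralised, Bluetooth-based exposure notification.

    Identifiers rotate regularly and all observations stay on the device;
    only the identifiers of confirmed cases are ever published centrally.
    """

    def __init__(self, rotation_seconds: int = 900):
        self.rotation_seconds = rotation_seconds
        self._current_id = os.urandom(16)   # random, unlinkable identifier
        self._rotated_at = time.time()
        self.observed = {}                  # identifier -> time last seen nearby

    def current_id(self) -> bytes:
        """Return the identifier to broadcast, rotating it once it expires."""
        if time.time() - self._rotated_at > self.rotation_seconds:
            self._current_id = os.urandom(16)
            self._rotated_at = time.time()
        return self._current_id

    def record_contact(self, other_id: bytes) -> None:
        """Store an identifier heard over Bluetooth; nothing leaves the device."""
        self.observed[other_id] = time.time()

    def check_exposure(self, published_case_ids: set) -> bool:
        """Compare locally stored observations against identifiers of confirmed cases."""
        return any(oid in published_case_ids for oid in self.observed)
```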

The US took a fragmented approach, leaving the development and deployment of tracing apps, and the subsequent ethical decisions, to the discretion of private companies. European countries have taken a more centralised approach, in that governments are more involved in financing and developing the apps, with features that must meet European privacy standards. Germany has only just started rolling out its 20-million-euro app, and is reassuring users that the data will not be made accessible to the platform provider (Google or Apple), but only to public healthcare specialists in the country. At the same time, Norway has decided to withdraw its own app because its reliability was questionable at the very least: the app relied on voluntary download and reporting, and was built on the assumption that people always carry their smartphones with them, and the Norwegian government concluded that its models did not necessarily correspond to actual human behaviour.

To the external observer, the situation seems completely different in those countries that appear to have a centralised, all-powerful system of data tracking and AI use, such as China. A European observer may readily conclude that the balance between the common good and individual anonymity has already tilted towards the former in China’s case, and that China can simply use its Social Credit System to track and prevent the spread of COVID-19; but this is not precisely the reality of the situation. The approach in China actually seems to be more fragmented than in some European countries. Some provinces have developed their own DTSs; some of the apps use GPS, while others rely on users voluntarily inputting their location. Regional governments and cities may use different apps, which may assign different ‘health scores’ to the same person. As Ding (2018) observes in his analysis of China’s AI strategy, the Western perception is that AI deployment in China is top-down and monolithic, hypercentralised and controlled, with no room for ethics. This is far from the truth, as Ding shows in his work; the perception is a common trope of the depiction of the ‘East’ in Western popular thinking. While the doctrine of social peace and its attainment does guide the actions taken in China, ethical debates are still present and are being conducted by private enterprises, such as Tencent’s Research Institute.

In conclusion, the use of DTSs poses ethical dilemmas because they reveal the opposition between individual autonomy and the common good, and they raise practical issues regarding accuracy and efficiency because of the way in which data is collected, stored, and used.

Triage 

The spread of COVID-19 has put serious strain on healthcare systems in many countries, and each of them has had to find its own way of coping with the crisis. From avoiding testing and sending home patients who were not in a critical state, as happened in the UK in the first phase of the pandemic, to carefully planning the lockdown and bed allocations, as in Germany, the entire range of systemic behaviour has been on display during this crisis. Among these responses there have been uplifting shows of solidarity between countries, for example when border hospitals in Germany accepted patients from neighbouring France in order to help ease congestion in the French system.

The strain on hospital beds and respiratory units, and the need to allocate scarce resources to an increasing number of patients in critical states have placed a lot of pressure on medical personnel. Ideally every national health system should have guidelines for extreme situations such as pandemics. More often than not, though, these guidelines contain a set of recommendations about triaging the patients and allocating scarce resources, but they do not necessarily describe practical ways in which these recommendations can be implemented. Thus, nurses and doctors are left scrambling to devise their own procedures in this type of emergency. 

The particularity of this pandemic is that it puts strain on Intensive Care Units (ICUs) rather than on Emergency Rooms (ERs). ERs around the world currently use a diversity of triage systems, which usually decide what type of treatment a patient needs, and in what order of urgency. The situation in overstretched ICUs during the pandemic is different: treatment may not be available for everyone who needs it, and access to it has to be selective. This is an important distinction, and it is what is happening today in many ICUs around the world; ER triage procedures do not apply to this situation. So what are healthcare providers around the world doing? They are trying to follow the recommendations and to devise their own procedures, in order not only to best serve their patients and the common good, but also to reduce their own enormous emotional stress. There are a few criteria they may use, and as Philip Rosoff, ethicist and MD at Duke University, explained, we know how not to take a decision of this kind: not in a rush, not at the bedside, and not using judgment based on privilege. In his words, in healthcare, at least in the US, there are ordinary situations in which a distinction is made between VIPs and VUPs (Very Unfortunate Persons). In the case of the COVID-19 pandemic this distinction is eliminated, and so is the question of age. Age is not a decisive factor in providing treatment in case of scarcity (contrary to what some may believe).

The only criterion that should play a role, Rosoff explained, is the patient’s clinical chances of survival. This can be assessed by healthcare professionals based on the patient’s health records and their current clinical state. Here one can see that AI-powered tools may come into play to a very significant degree. Electronic Health Records (EHRs) preserve patients’ medical history and, combined with the data on a patient’s current chart, they could theoretically match the patient’s history and current state with a recommendation regarding a triage decision. This may provide some relief in high-stress situations, and the decision may be supported by this kind of evidence-based approach.
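To make the distinction between decision support and decision-making concrete, here is a deliberately naive sketch of how an EHR-backed aid might surface evidence to a clinician without issuing a verdict. The patient features and thresholds are invented placeholders, not clinical guidance, and no actual scoring system used in practice is implied.

```python
from dataclasses import dataclass


@dataclass
class PatientSnapshot:
    """Hypothetical summary drawn from the EHR and the current chart."""
    oxygen_saturation: float        # current measurement, in percent
    comorbidity_count: int          # taken from the medical history
    days_since_symptom_onset: int


def survival_indicators(p: PatientSnapshot) -> dict:
    """Return evidence a clinician might weigh - not a ranking, not a decision.

    The thresholds below are illustrative placeholders only.
    """
    return {
        "low_oxygen_saturation": p.oxygen_saturation < 90.0,
        "multiple_comorbidities": p.comorbidity_count >= 2,
        "late_presentation": p.days_since_symptom_onset > 10,
    }


# The tool stops at presenting indicators; prioritisation remains a human,
# clinically reasoned decision, as argued in the text.
```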

However, two important factors need to be taken into consideration here: 

  1. The embedded AI models that pass judgment on the state of health of patients may themselves be flawed: the use of proxy measures to establish the state of a patient’s health (such as the money they have spent in the past x years on health-related issues) can be very misleading. For example, one such AI-powered tool kept indicating that black patients’ health was much better than that of white patients, with the result that they might receive less medical attention. This was in fact due to a reversed causation: black patients in the US receive less medical attention due to financial hardship and systemic racism, and therefore spend less money on health; the AI system read this as a sign of good health. If a subsequent decision is taken on this basis, it will in fact continue the spiral of inequality (Obermeyer et al., 2019). A small simulation of this mechanism follows this list.
  2. The risk of errors induced by the way in which humans interact with the machines. One important element of AI as a decision support tool, particularly in healthcare, is that the system should remain a tool for support, and should not be transformed into a decision-maker. However, the high emotional stress combined with the workload experienced by health workers may generate the so-called “suspension of clinical thinking”, that is, taking the AI’s recommendation as the ultimate authoritative decision. In other words, under a variety of circumstances, particularly high stress, humans may be tempted to offload the weight of the decision onto the machine. While this may be acceptable in a driverless car, it may prove disastrous in medical settings. Ironically, it seems easier (although it is not) to create an algorithm advising doctors (because everything happens between the screen and the health worker) than an integrated AI system that drives a car.
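The following small simulation, loosely inspired by the mechanism Obermeyer et al. (2019) describe, shows how ranking patients by a spending proxy rather than by actual need systematically deprioritises a group with less access to care. All numbers and group labels are invented for illustration.

```python
import random

random.seed(0)


def simulate_patient(group: str, access_to_care: float) -> dict:
    """A hypothetical patient: true need is independent of group, but observed
    spending (the proxy the model uses) is need scaled by access to care."""
    need = random.uniform(0, 1)              # true severity of illness
    return {"group": group, "need": need, "spending": need * access_to_care}


# Two hypothetical groups with identical needs but unequal access to care.
patients = (
    [simulate_patient("A", access_to_care=1.0) for _ in range(1000)]
    + [simulate_patient("B", access_to_care=0.6) for _ in range(1000)]
)

# A model that allocates extra care to the top 20% ranked by predicted spending
# under-selects group B, even though its true needs are the same.
top_by_proxy = sorted(patients, key=lambda p: p["spending"], reverse=True)[:400]
share_b = sum(p["group"] == "B" for p in top_by_proxy) / len(top_by_proxy)
print(f"Group B is 50% of patients but {share_b:.0%} of those prioritised.")
```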

In conclusion, AI may provide assistance in patient triage for resource allocation in a pandemic situation, but it should not be transformed into an automated decision-making instrument, precisely because previous biases and model dysfunctionalities may create irremediable medical errors. And of course, the question of accountability may have to be considered.

AI ethics and models

Both of the instances analysed above (DTSs and the possible use of AI in triage for medical resource allocation in the ICU) raise shared concerns regarding ethics.

We should distinguish between making an ethical decision and the method by which we arrive at that decision. The methods used to arrive at an ethical decision are the equivalent of ethical codes, or principles. The decisions we take (or which we let the AI take in an automated manner) are the result of giving precedence to one principle or code over another. When the decision is subsequently analysed through the lens of a different code, it may appear unethical.

In ethical decision-making theories, there are five major methods of arriving at an ethical decision: the utilitarian approach (do the most good and the least harm), the rights-based approach (what best protects the moral rights of those affected?), the fairness and justice approach (is the decision fair?), the ‘common good’ approach, and the virtue approach (is the decision in accordance with the decision-maker’s values?). These methods are present, and expressed as ethical codes, in most approaches to AI.

Currently a series of bodies are devising principles for creating ethical AIs, that is, the things one needs to take into consideration when designing and using AI. The EU has put forth seven principles for trustworthy AI: human agency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, non-discrimination and fairness; societal and environmental well-being; and accountability. Under each of these principles we can find a list of recommendations meant to explain what they mean. Under privacy and data governance we may find anonymity and respect for individual rights; under societal and environmental well-being we may find concerns for the common good; and so on. As argued above, these principles may compete in particular cases. They are also highly abstract, and they may mean different things in different socio-cultural contexts.

AI models interact with institutional, social and cultural contexts, and may fail if they are not designed for the appropriate context. In fact, this happens in most cases where AIs work directly with humans. A very recent example comes from health again: a retina-scan AI diagnosis system developed by Google performed perfectly in lab conditions but failed consistently in Thailand. This happened simply because the workflows differed from those in the lab, the light conditions were variable, and the health technicians understood the deployment of the machines as an authoritative measure to which they had to respond perfectly; sometimes they photoshopped the images so that the AI algorithm would accept the quality of the shot.

Ethical models do the same, and in order to avoid drift, we should develop them by starting from observed practices. Ethical codes do not exist in theory alone, despite the fact that some ethicists generate them theoretically first. In fact they are initially expressed in different practices. Their very meaning is translated through practices; but practices vary in time and space. Different practices show the cracks in the models, as in the AI deployment cases. We should look at practices and their variations first, in order to work our way back to judgements on values and ethics. Returning to the question of rights and valuing life: how is this valuing expressed in various practices? How can we design decision-making mechanisms (automated or not) that correspond to the variability of practices and their dynamic transformations?

Matter matters

The major lesson that COVID-19 has taught us about AI and ethics is that adoption means adaptation in a world in which matter matters. Therefore we must conclude:

  • AI is a tool: it does not need to ‘be ethical’ (an absurd notion for a tool). It should be designed in accordance with ethical principles understood contextually, so that it acts ethically within that context. Therefore, we first need to understand the context – ask an anthropologist.
  • Assume that models are always wrong. Models do not drift because people behave strangely – they drift because they are models: their accuracy is limited over time, and the faster we change, the faster they drift. Carrying them across contexts will inevitably lead to drift. So one first needs to study the model’s cultural context (regional, institutional, professional), and work one’s way back from there into the design of the AI system.
  • The design process should start in the field, and not in labs. We need to design for the cultural context: build models starting with reality, and do not try to model reality on abstract models (including ethics) – sooner or later they will drift, and one of the domains in which they fail is ethics. 
  • And last but not least, we need to create constant evaluation feedback loops. Remember, AI is material: it has a material support and it interacts with the material world. That means it is not going to flow smoothly. Be prepared to reassess and adjust based on how the adoption process develops. 

COVID-19 is here to stay. There is no post-COVID world. Even if a vaccine becomes readily available, the virus will only be subdued by its generalised use. Just as with measles or polio, stopping vaccination would mean the return of the virus. The ripple effects of the current pandemic will be felt in the economy, in culture, and in politics. For AI this is both a great opportunity to show where it is really helpful, and a wake-up call to demystify some of the hype around it. One major lesson is that AI not only interacts with a material world in continuous transformation, but that its functioning depends on this very materiality (and on material culture). The crisis has also re-emphasised the importance of understanding socio-cultural variations (geographical or institutional) when approaching ethics, and of being more aware of the ethical implications of AI design, deployment and adoption. One major question that has been overlooked until recently is: which domains and instances need the deployment of AI at all? Is AI as decision-making support really a good idea in a particular domain or not? Should we automate decision-making support in all domains? Should we optimise everything just because we can? As Rosoff observed in his dialogue with David Remnick, healthcare is a multibillion-dollar business in the US, and in this particular context optimising processes with AI may not always be in the best interest of the patient. So let’s be patient, and instead consider where AI can be useful, and where it has the potential of becoming a ‘weapon of math destruction’.

References:

Broussard, Meredith (2018). Artificial Unintelligence: How Computers Misunderstand the World. Cambridge, Massachusetts and London, England: The MIT Press.

Ding, Jeffrey (2018). Deciphering China’s AI Dream: The context, components, capabilities, and consequences of China’s strategy to lead the world in AI. Centre for the Governance of AI, Future of Humanity Institute, University of Oxford.

Galison, Peter (2019). ‘Algorists Dream of Objectivity’, in Brockman, John (ed.), Possible Minds: 25 Ways of Looking at AI. New York: Penguin Press, pp. 231-240.

O’Neil, Cathy (2019). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. New York: Broadway Books.

Obermeyer, Ziad, Brian Powers, Christine Vogeli, Sendhil Mullainathan (2019), ‘Dissecting racial bias in an algorithm used to manage the health of populations’, Science, Vol. 366, Issue 6464, pp. 447-453, DOI: 10.1126/science.aax2342

Watterson, Bill (1990). ‘Calvin and Hobbes’, January 9, 10, 11, in The Complete Calvin and Hobbes, Book Three 1990-1992, Kansas City, Sydney, London: Andrews McMeel Publishing, p. 9.