I will explain the title for this post at the end (I think that’s called click bait) but first things first. Kudos to Netflix: they have now made three impactful documentaries exposing the dangers that AI-driven manipulation of data poses for society and our civil liberties. These malgorithms are what Cathy O’Neil, who features heavily in one of the films, calls ‘Weapons of Math Destruction’. First came ‘The Great Hack’ in 2019, which exposed the deeply disturbing scandal of Cambridge Analytica and its manipulation of voter behaviour using data from Facebook, resulting in Mark Zuckerberg having to appear at Congressional hearings. The following year two more films were debuted by Netflix, first ‘The Social Dilemma’ then ‘Coded Bias’.
I wrote an eBook about my reaction to ‘The Social Dilemma’. TSD focused on how social media was being driven by venal, amoral algorithms designed to maximize advertising revenues. These algorithms learn that the best way to do this is to hook people like addicts, feeding them content that panders to their prurience, prejudices and psychoses. The result, the unintended consequence, is an increase in mental illness, especially among the young, bias confirmation and, most concerningly for liberal democracies, polarization of opinion to the point where rational debate is all but extinguished. So I chose to write my eBook as a contribution to a more rational – Socratic – discussion, based on some small scale research I conducted among opinion leaders, and on the basis of this I attempted to offer possible solutions. I’ll come back to those.
The third Netflix documentary of 2020, following closely on the heels of TSD, was ‘Coded Bias’, directed by Shalini Kantayya and featuring Joy Buolamwini among many other experts and activists, mostly women from diverse backgrounds. This was entirely appropriate since Joy’s work, which she carried out at MIT, exposed how facial recognition surveillance powered by AI was reinforcing racial and gender bias. The efforts of Joy Buolamwini, Cathy O’Neil and other prominent activists like Silkie Carlo, director of ‘Big Brother Watch’ in the UK, have had some notable successes in forcing governments and law enforcement agencies to curtail the use of facial recognition surveillance. However, there remains widespread commercial use of AI that affects people’s chances of gaining employment, housing, credit, insurance and healthcare, based on algorithms that are unregulated and flawed, in particular AI that has been shown to be negatively biased against the poor, racial minorities and the unconventional. AI is therefore reinforcing social inequality, preventing social mobility and restricting individual self-expression. This is just as terrifying as the manipulation of social media to change not just what we think but the way we think, our most fundamental human right, and the manipulation of elections, an attack on the very foundation of democracy.
All of this has been exposed in three documentaries produced by Netflix. Amazon and Apple both make lots of documentaries but none so far on the dangers of big data and AI. One wonders why... but as I say, kudos to Netflix. I guess in the case of Netflix they use algorithms only to commission new content that they think you might like and to suggest available content to you, more like weapons of individual entertainment than mass destruction.
I said I would return to potential solutions to this AI challenge and we need solutions because we do want, we desperately need, the positive use of AI to help us take on the Herculean tasks of tackling climate change, food poverty, obtaining better health opportunities for all. As an atheist I don’t believe we were created by God but many of those who do also believe we were created in his/her/their likeness. They explain away humanity’s capacity to do as much evil as good as God giving us free will. Perhaps God did create us to be just like him/her/them and perhaps having given us free will he/she/they did not fully understand the ramifications of that until it became too late to do anything about it. This seems to be the perfect metaphor for AI. We created it and we gave it lots of data about us so it could think like us, maybe be better than us, certainly a lot faster than us. AI can only learn from big data (which remember means not just lots of it but multi-source). The biases that ‘Coded Bias’ talks about happened because the data we gave the AI to learn from was skewed to, let’s call it, ‘white privilege’. So we created AI to be like us, but only some of us, and we allowed it to develop in ways that were both good and bad for the world, just like us, and it is in danger of getting out of control, just like us. So how do we do better than God? How do we get AI back under control and how do we direct it towards things that are good for a free and open society, a world of equal opportunity for all irrespective of class, ethnicity, sexuality, gender, faith (personally I’m not so sure about the last of those given the religious extremists out there but maybe with AI we can sort them out too)?
China is on a very different agenda, it must be said. They are 100% explicit that they do not agree with democracy and that they want to use AI and data to control their society. There is no secret to what China are doing with data and facial recognition; we saw this in Hong Kong in response to the people who dared to challenge the state. In China you get a Social Credit Score, like a financial credit score but all encompassing. If you do the wrong thing, if you say the wrong thing, even if people you know do or say something wrong, you are punished. The state, the CCP, will know exactly what you are doing and saying, where you go and with whom you are consorting because they have all your data. The state can control you by controlling your Social Credit Score and thereby restricting your ability to get housing, access to public transport & travel, healthcare, financial services, you name it.
That makes them terrible, right? China is much worse than the free Western democracies – but is it? Of the 9 major organizations developing big data AI, 3 are in China and 6 are in the USA. Exactly the same thing is happening in America as in China with two important differences a) you don’t know about it, it’s invisible and b) the power lies in the hands of these few huge commercial enterprises who care first and foremost about profit and shareholders. People are denied jobs, financial services, housing, information & content is pushed at us with bias and partiality, all because without us knowing we are being watched, measured and judged by AI algorithms that not even the people that created them fully understand. Governments have used AI and data in ways that undermine civil liberties but they are being called out, they are accountable, although there remains an understandable concern that an extreme left or right wing government might not be so shy in abusing the power of AI & data. As they say, just because you are paranoid it doesn’t mean they’re not out to get you.
So, solutions. I’ll start with the two proposals I’ve made previously because I still believe they are 100% right and both doable.
Firstly, social media needs to be regulated and forced to move to a subscription model. Social media generates a huge amount of data due to its pervasiveness and frequency of use. AI learns from data and social media is where it does most of its homework. These are powerful platforms and they should require licenses that can be revoked in the case of malfeasance, just as newspapers and TV did. If the business model is subscription based they can still be very large businesses but, most importantly, the algorithms would be trained to build customer loyalty not eyeball addiction. If you pay something every month to use Facebook, even just $1, then you are a customer not data fodder.
Secondly, there should be government investment together with commercial incentives to develop platforms that allow people to own, control and, when they choose to, transact their own data. Data is the new oil but it has been allowed to fall into the hands of robber barons. It is your data, you should be able to harvest it, store it and use it however benefits you most. This is not a quick fix and will require secure technology infrastructure with the scale and complexity we see today in financial markets and services. In my view it could be an opportunity for the financial sector, who have the resources and customer base to make this work. Even if you don’t like your bank you have to trust them because they manage your most sensitive information already. A bank could be trusted to store your personal data, allow it to be transacted on your terms to get you a return and to manage those transactions. I don’t understand why banks don’t look at data in the same way they used to look at cash – bring it to us, we’ll keep it safe and give you access to it when you want and, if you’ll allow us, we will lend it out to people (encrypted to preserve privacy) and make it work for you. Instead of going to Facebook, or any of the data trawlers, scrapers and scavengers, big brands would go to the banks and buy the profiles they are looking for to promote whatever they want. People would consent to see brand content, anonymously, if it was made worth their time or interest.
Put these two things together – social media on subscription and the mechanism to leverage one’s own data – and you have solved a big part of the problem with no need for regulation.
That said, there is still a role for regulation to prevent data abuse at the hands of AI and hold miscreants accountable, but it has to be co-ordinated internationally and that seems like quite the challenge in a world where there seems to be growing nationalism and weakening global alliances. That was my conclusion but something in ‘Coded Bias’ gave me some optimism. The point was made that algorithms need an equivalent to the FDA, the US Food and Drug Administration. We don’t allow people to market pharmaceuticals or foods that have not been tested or lack the appropriate quality controls. And this does, more or less, work across international borders. So why can’t there be an IAA, International Algorithm Administration, backed by international law that enforces the responsible development of AI?
Finally, I want to address the issue of whether big tech companies are actually able to behave responsibly – they say they want to but always use the defense that the scale of their operation, the sheer number of users and data points, make it impossible to have foresight on all unintended consequences and oversight on every malpractice. Let’s focus on the issue raised in ‘Coded Bias’, that facial recognition technology is biased against certain social groups, generally the disadvantaged groups who are under-represented in the data the AI is learning from. In my research I came across something new to me (I never claimed to be a technology expert). It is called synthetic data and is predicted to become a huge industry. The models and processing needed to develop synthetic data are no doubt very complex but the output is very simple to explain, the clue is in the name. This is artificial data, data that’s confected, invented, made up. It is needed to fill gaps in real authentic data to help AI to learn to do whatever it is developed to do. For AI to be effective it needs lots of data and the data has to be comprehensive and statistically representative. So they run lots of simulations based on lots of different scenarios in order to produce data to plug the gaps in real data.
This is a terrifying concept but it is not conceptual, it is happening right now. Many if not most of the systems developed using machine learning and AI use synthetic data, it overcomes issues of sensitive and confidential data that is hard to get. Obviously it is open to abuse, you can create the data to feed to AI that teaches it to discriminate prejudicially. So per the previous point, there has to be regulation. However, it can also be used to eliminate bias.
As humans we are programmed to be biased, our brains work by using pattern recognition. We know not all snakes are dangerous but some are, so if it looks like a snake we run. It’s a basic survival instinct and instincts are very hard to shift. When we look at an individual we take in the visual cues and form judgements and, just like the malgorithms, our brains have been trained to make prejudicial assumptions on flawed information. Someone looks a particular way, talks a particular way, exhibits certain behaviours and we make a negative judgement, there is no point in pretending otherwise. That judgement can be unfair but as humans we have the ability to over-ride our unconscious bias and make a conscious decision to look deeper, to give someone a chance, before making a decision that affects them. Synthetic data allows us to programme that humanity into AI. ‘Poor people are a bad credit risk’ is the lesson the real data will teach AI, and it will make it hard for certain social groups to access the loans that might help lift them out of poverty. The same system will make it very easy for the well off to buy a second car. One thinks it would be better for society to make finance available to facilitate social mobility rather than more physical mobility for the well off. If so, we can use synthetic data to upweight the scenarios in which poor people are not unfairly treated as bad credit risks.
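For the more technically minded, here is a minimal Python sketch of what upweighting a scenario with synthetic data could look like. It is an illustration only, not how any real credit system works: the record fields (`group`, `income`, `repaid`), the function names and the five made-up loan records are all my own assumptions. It copies the records we want the AI to see more of (low income borrowers who repaid), nudges the income figure slightly so the copies are not exact duplicates, and labels them as synthetic.

```python
import random


def upweight_with_synthetic(records, matches, copies, jitter=0.05, seed=0):
    """Return the real records plus `copies` synthetic variants of each
    record for which `matches(record)` is true. Hypothetical helper."""
    rng = random.Random(seed)
    synthetic = []
    for rec in records:
        if not matches(rec):
            continue
        for _ in range(copies):
            fake = dict(rec)  # copy, so the real record is left untouched
            # Perturb income by up to +/- jitter so copies are not identical.
            fake["income"] = round(rec["income"] * (1 + rng.uniform(-jitter, jitter)), 2)
            fake["synthetic"] = True  # always label made-up rows as made up
            synthetic.append(fake)
    return records + synthetic


def repayment_rate(records, group):
    """Share of records in `group` where the loan was repaid."""
    rows = [r for r in records if r["group"] == group]
    return sum(r["repaid"] for r in rows) / len(rows)


# Invented example data: the low income group is small and looks risky.
real = [
    {"group": "low_income", "income": 12000, "repaid": True},
    {"group": "low_income", "income": 11000, "repaid": False},
    {"group": "low_income", "income": 13000, "repaid": False},
    {"group": "high_income", "income": 90000, "repaid": True},
    {"group": "high_income", "income": 85000, "repaid": True},
]

boosted = upweight_with_synthetic(
    real,
    matches=lambda r: r["group"] == "low_income" and r["repaid"],
    copies=2,
)

print("low income repayment rate, real data:   ", repayment_rate(real, "low_income"))
print("low income repayment rate, with synthetic:", repayment_rate(boosted, "low_income"))
```

With these invented numbers, a model trained on the boosted data would see low income borrowers repaying more often than the real data alone suggests. Whether that is fair correction or wishful thinking is exactly the kind of judgement that needs human oversight, which is why the synthetic rows are flagged.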
‘Coded Bias’ certainly got me thinking, so well done Netflix, again. My brain works in strange ways and the focus on racial bias in facial recognition made me think about ears. A lot of images of people will be side on as they walk past the camera that’s recording them, so it will only detect one ear. The AI might conclude that lots of people, even most people in certain locations, only have one ear. Being born with an underdeveloped or missing outer ear has a medical term, it’s called microtia (anotia when the ear is absent altogether), and it is more common than I thought when I looked it up. It occurs in 1-5 out of every 10,000 births, which I think means somewhere between 800,000 and 4 million of the global population of 8 billion have it, most often on just one side. Not common then, but not unheard of in the real world. We could teach AI about this using synthetic data, because samples of real world data would not likely detect the prevalence of microtia. It might prevent AI drawing the wrong conclusions, either ignoring microtia or over-estimating it. On the other hand, it might help facial recognition spot a one-eared crook like Mark ‘Chopper’ Read, the Australian criminal who cut off his own ear in prison to get an early release (it’s a long story). My question is very simple – would a machine have even thought about this, would it have looked up the data on microtia, searched online for an example of a one-eared crook? I doubt it. So, if you have them, listen with both ears and both eyes wide open, we need to use AI, not let AI use us.
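For anyone who wants to check the microtia arithmetic, here is a minimal Python sketch. It assumes the birth rate of 1-5 per 10,000 applies to everyone alive today, which is a simplification, and the function names are my own invention. It also shows why a modest sample of real world data could easily contain no cases at all.

```python
WORLD_POPULATION = 8_000_000_000  # round figure used in the post


def expected_cases(population, per_10k):
    """People affected if `per_10k` out of every 10,000 have the condition."""
    return population * per_10k // 10_000


def prob_no_cases(sample_size, per_10k):
    """Chance that a random sample of `sample_size` people contains nobody
    with the condition, assuming each person is independent."""
    rate = per_10k / 10_000
    return (1 - rate) ** sample_size


low = expected_cases(WORLD_POPULATION, 1)
high = expected_cases(WORLD_POPULATION, 5)
print(f"Estimated people with microtia: {low:,} to {high:,}")

for n in (100, 1_000, 10_000):
    print(
        f"sample of {n:>6,}: chance of seeing no cases is "
        f"{prob_no_cases(n, 1):.3f} (low rate) / {prob_no_cases(n, 5):.3f} (high rate)"
    )
```

Under these assumptions the range works out at 800,000 to 4 million people, and a random sample of 1,000 people has better than even odds of containing nobody with microtia at all.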