Discussing AI law in The Hague

a new vantage point

Nick Doiron
12 min read · Jan 26, 2020

“Welcome to the city of peace and justice.” There are few places on Earth which could rival The Hague for this title, as this part of the Netherlands hosts the International Criminal Court, International Court of Justice, and other organizations focused on human rights.

This month, I attended a short course on AI and International Law. Most of my classmates worked in law, policy, or legal research, while I am a coder with an interest in AI/ML. Getting to know the class, hearing from experts in different fields within law (before this I did not know human rights law from humanitarian law), and raising questions in our small forum was mind-shifting. I found myself questioning my beliefs and my understanding of technology.

I’ve been puzzling over how to structure this. I decided on several subsections, and Chatham House Rule (no revealing who said what).

What do I know about ethics?

Anyone in technology should have a healthy interest in the ethics space, not by waiting for CEO apology tweets, but through social scientists who give historical context and voice to concerns. If I could speed-recommend only one link to people with different learning styles, I would pick:

What do I know about international relations?

I worked with One Laptop per Child, later with the Asia Foundation, and continue to appear at refugee and open government events to speak on open source technology policy.
I spoke at the IAEA Nuclear Safeguards Symposium in Nov 2018, which led me to look up events with the OPCW (chemical weapons disarmament), and that is how I found this course at the nearby Asser Institute.

What can lawyers do before AI even exists?

Reading through the research topics of speakers, I sent a puzzled message to a friend in environmental law:

Me: since you did legal research… [fully autonomous weapons] sounds interesting, but how would someone fully research a thing that hasn’t happened yet?

An actual lawyer: Lots of legal research is quite hypothetical — how would the current law apply in a particular circumstance. We actually learn law using hypothetical scenarios which we then apply known outcomes to — so doing so on a broader scale isn’t much different.

It allows us to test laws and to test factual scenarios which can then be applied if and when something actually happens

Sage explanation!

So you took a class about killer robots?

We covered many topics, including explainable AI, anti-discrimination law in the EU, immigration screening, diversity in the tech industry, national policy, mass unemployment, social media manipulation… and yes, there was a good amount of time on fully autonomous weapons.

OK — when do we see ‘fully autonomous weapons’?

For the sake of this point, I will not count current technology, including drones and an automated anti-radar weapon, as ‘fully autonomous’. Definitions are a fiercely debated subject in law and international policy, and the relevant UN group — the Convention on Certain Conventional Weapons (CCW) — has yet to settle on a definition. The CCW operates on unanimous consent (another problem), but it has successfully restricted landmines and banned blinding laser weapons (which are still not well defined, according to my notes).

Experts in International Humanitarian Law (IHL) spelled out situations where human soldiers make judgement calls and an AI weapon could easily make mistakes: wounded and surrendering soldiers, civilians with and without weapons, an AI trained in one country trying to identify objects and estimate damage to buildings in another country.

Their #1 recommendation is that humans must not just be ‘in the loop’ with ‘meaningful control’, but specifically handle target selection and firing, or be in a position to frequently review and override the AI on those decisions. There is precedent for this: (A) AI systems and humans seem to improve in cooperation, and (B) commanders are already expected to test and deploy weapons with an understanding of their impact.

When speakers were closer to industry or the military, they did not dwell on this topic as much. My sense was that ‘practitioners’ experience a lot of issues with current technology, and it is an open question whether we will see Artificial General Intelligence (AGI) in our lifetime, so the ‘killer robot’ seems further away.

Who is responsible for the killer robots?

So a robot commits atrocities. What happens next?

The standard practice is to trace responsibility from an individual soldier, to a commander (who is responsible for selecting weapons and tactics), to their nation. As an AI cannot be meaningfully/satisfactorily punished, researchers explore the idea of a nation being directly responsible for its military AI, and also for any AI made by corporations based in their country.

(As part of showing that only a human or human organization can be put on trial, we heard about Europe’s centuries-long obsession with animal trials.)

Extending to other forms of damage caused by AI: if Facebook is a danger to a country where it has no offices, that country’s only recourse would be to challenge the company in a US court, or before some future International Cyber Police body.

So international courts can find humans at fault?

Let’s be more general and include AI from companies and the police. Outside of wartime we consider International Human Rights Law (IHRL) and the EU’s anti-discrimination rules.

There are concerns about how an AI’s decisions and actions can be investigated. One speaker put up a diagram of designers, programmers, testing, and deployment, tracing mistakes back to their origin.
I suggested two situations where this would be difficult:

  • A police design spec tells the programmers to ‘focus on people from this list’. The programmers build exactly that, and the list ends up being used in an evil way.
  • (not AI) a car crashes due to a faulty part, which the car company blames on a line worker, who on his first day was poorly trained by his supervisor, who had a toothache… how deep do we trace this?

The fault in these cases seemed straightforward to the legal expert (bad spec or bad police; company is liable), but I’m not so sure?
I believe that programmers should seek out ethical awareness, so they don’t make evil AI or get blown up on a Death Star.
I also fear that in the legal mind, ‘responsibility’ can be neatly resolved with paperwork. If the line worker ever signed a document that they were fully trained, it undermines their defense. Has your workplace ever had you click through a slideshow about Conflicts of Interest and other fireable offenses? Were they designed to inform you?

I also have mixed feelings about ‘international law’ as a whole. About that…

International Court: is it a thing?

To keep the cooperation of countries, international courts agree to take on cases only when the host country is unable (or unwilling) to conduct a proper trial. Once someone is indicted, it’s up to their host country to turn them over. This leads to debate about who should be indicted, then years of non-compliance (though it’s easier to coordinate sanctions and travel restrictions after an indictment).

Due to this political calculus, and in part due to its recent founding, the International Criminal Court (ICC) has only indicted Africans.

  • The recent Myanmar case is in the International Court of Justice (ICJ)
  • Milošević was one of 161 people indicted by a separate body, the International Criminal Tribunal for the former Yugoslavia (ICTY), which concluded its work at the end of 2017

In the recent Myanmar case, which was filed by a representative from The Gambia, the lawyers for Myanmar referenced the ICJ’s only previous genocide case, in which the court could not find Serbia responsible or complicit, and ruled that ethnic cleansing does not always amount to genocide.
I hesitate to write about this with my amateur knowledge, when my home country tends to ignore international courts. I can only raise these questions:

  • Why did Bangladesh, Australia, and other countries directly involved with Rohingya refugees not appeal to international courts?
  • Is any international court prepared to investigate and indict in absentia any criminal from ‘the western world’, namely US or UK? The only attempt that I’ve seen was the short-lived Kuala Lumpur War Crimes Commission.
  • Is China going to be questioned at any level for their surveillance and detention of Uighur people?

Subpoint: why are international courts so hard to get right?

In modern times, we have access to stellar reporting, citizen recordings, and great insight into geopolitics around the world. It should be easier to call out who did what and when.
I mentioned some earlier issues about sovereignty, but if you zoom out: IHL depends on an ethics system which can punish war crimes and torture, but not war and killing itself. This is hard to unpack.

At least one speaker evaluated remotely-piloted drones in the context of Kantian ethics. Kant bases ethics on human dignity, treating dignity as an end in itself. Dignity certainly sounds good, but it was also chosen to play well with Kant’s beliefs in just war and capital punishment, not as a nonviolence principle.
I read or heard two connections between this and robots:

  • fully autonomous AND current drone weapons are outside the dignity of face-to-face combat: considering who gets targeted by drones, I cannot see what ‘face-to-face’ combat means against a piloted F-35 or a sniper. Is dignity being twisted again to fit the difference between a human-aimed and an AI-aimed missile? Dulce et decorum est?
  • fully autonomous AND current drone weapons affect the target the same way, but the attacker is putting their own dignity above that of the target — couldn’t this apply to almost any case where the programmer has choices and safety that are denied to the subject of the algorithm?
    The drone wars and the current criminal justice system have their own problems, so I reframed this: as a programmer, is it OK for me to write an algorithm which controls welfare benefits, or assigns children to schools, when I myself do not receive welfare or have kids? (If you have kids, imagine that you send them to private school).
    This problem is not farfetched at all: see this part of AI Now’s New York City Algorithmic Decision Systems Shadow Report:

The inevitability of research

In the immigration discussion, we heard how EU internal politics led to a controversial research project on deep learning and automated interviews for potential immigrants. The research is being pushed forward by EU border countries, which tend to be more right-wing but which, by regulation, must receive and host migrants while their cases are processed.

Practically speaking, it’s clear that the EU will create a shared database of all migrants, refugees, asylees, tourists, and residents. It also would benefit everyone to have a faster process, and to interview people before they arrive in a crowded, unfriendly border zone.
There are two questions which I can’t fully answer now:

  • How much should machine learning be involved? It seems strange to develop a database and not try to automate, but potentially unrighteous to create a system with a ‘black box’ algorithm.
    This research program invested in virtually every AI pseudoscience (microexpressions in virtual interviews, social media scraping), with no study of learned biases (likely because of plausible deniability) and no learning from outcomes. This isn’t a fault of tech, but an overestimation of tech’s abilities.
  • How comfortable would I be working on the research? If I work at Ethical_University and we decide not to get involved, the program still gets researched and built, potentially by less empathetic, less competent developers. Is it wrong to try anyway, in order to steer this thing toward critical research and to remove the more egregious design problems?

How are our own neural networks ‘black boxes’?

One of my classmates asked how neural networks — being manmade or ‘math-made’ — could be ‘a black box’. I realized that this was something I’d taken for granted, without feeling that I could explain it confidently from first principles.

Here’s how I would explain it today:
There are approaches to testing individual decisions. For example, in a ‘dog or giraffe’ tool, we can change pixels in an image to determine which ones affect the decision and its confidence. There is some research into deceiving these tests, so I’m not sure how this will evolve.
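
To make that concrete, here is a rough sketch of the pixel-changing idea (occlusion sensitivity). The predict() function below is a placeholder for any classifier that returns class probabilities, not a specific library’s API:

```python
# A minimal sketch of "change the pixels and watch the prediction" (occlusion sensitivity).
import numpy as np

def predict(image):
    # Placeholder classifier: in practice this would be a trained network's
    # forward pass returning probabilities (e.g. 'dog' vs. 'giraffe').
    return np.array([0.7, 0.3]) * (image.mean() / 255.0)

def occlusion_map(image, target_class, patch=8):
    """Slide a gray patch across the image and record how much the
    target class's confidence drops at each position."""
    base_score = predict(image)[target_class]
    h, w = image.shape[:2]
    heatmap = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 128  # gray out one region
            heatmap[i // patch, j // patch] = base_score - predict(occluded)[target_class]
    return heatmap  # high values = regions this decision depended on

image = np.random.randint(0, 256, (64, 64, 3)).astype(float)
print(occlusion_map(image, target_class=0))
```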

But why can’t we have a ‘global’ explanation of the network? To me, ‘explaining’ would mean that you could point to any node on any layer and describe what it is ‘looking for’, or explain its axis as a pair of opposites (e.g. eyes open vs. closed, as found in StyleGAN2).

Because the input of an image recognition network is every part of our data, like every RGB pixel value, our first step is to condense that to a smaller number of nodes. Research shows that (much like hand-coded algorithms) these initial nodes identify images’ edges or sentences’ parts of speech. When we condense and generalize so much data, a node in the network likely takes on different jobs in different situations.
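
If it helps to picture the condensing step, here is a toy sketch in PyTorch. The layer sizes are arbitrary numbers I chose for illustration, not any published architecture:

```python
# A rough sketch of how ~150,000 raw RGB values get condensed into a handful of outputs.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early filters tend to learn edge / color detectors
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 224x224 -> 112x112
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # later filters combine edges into textures and parts
    nn.ReLU(),
    nn.MaxPool2d(2),                              # 112x112 -> 56x56
    nn.AdaptiveAvgPool2d(1),                      # collapse the remaining spatial detail
    nn.Flatten(),
    nn.Linear(32, 2),                             # two final scores: 'dog' vs. 'giraffe'
)

x = torch.randn(1, 3, 224, 224)  # one image = 3 * 224 * 224 = 150,528 input values
print(net(x).shape)              # torch.Size([1, 2]): everything condensed into two numbers
```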

Restricting AI: the case of GPT-2

One person brought up several of my go-to topics in one talk on the last day: OpenAI, GPT-2 and fake news, and the decision not to release the full GPT-2 model. The speaker (and OpenAI) use this as an example of a conservative, measured, community-review approach to dangerous AI.

Take a close look at the timeline for GPT-2. In February 2019, OpenAI announced the model, explained their fears that a sudden release could be too powerful in creating fake news and profiles, and released the initial model. In March, they restructured out of non-profit-hood. In May, AllenAI released GROVER for detecting AI-generated articles. In November, OpenAI released the full model.
The discussion in the tech world revolved around: is OpenAI doing this for real or for hype? What happened to OpenAI being… open? When they announced the restructuring shortly after, this solidified people’s fears about hype and opacity. Many people followed the research to make an off-brand full-parameter GPT-2, and other labs released their own text transformers (BERT). Ultimately we live in a post-GPT-2 world today without being mind-controlled or running GROVER on every blog post.

So while we include GPT-2 as a success story of carefully publishing a dangerous AI instead of censoring it, and I want to believe in everyone’s best intentions, it could also feature prominently in a post arguing that AI safety is a fake problem, and I can’t be fully comfortable with that.

Storyboarding vs. news

Take facial recognition as another example of public perception. Is facial recognition about to be banned by the EU for 2-5 years? Or not at all? You can find reputable sources circulating this:

Many speakers would bring up this news-byte, even when, on further discussion, they admitted it was likely not being seriously considered.
This isn’t a problem of the course or of academia, but of the tech industry and its willingness to forward and retweet and repeat hype.

We are all guilty of falling for clickbait that confirms our biases. But once we know better, we have to stop helping these stories along. At times AI stories surface based on their intensity, or their fit into a general storyboard.

I also am officially tired of seeing this image from 2015 being used to introduce AI perception problems:

In 2020 we have plenty of adversarial AI examples to choose from, some of which work even without direct access to the system. There are attacks against explainable AI. In 2018 we got a physical sticker that makes your camera think any object is a toaster, and I can see why.
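
For the curious, here is a minimal sketch of the classic fast gradient sign method (FGSM), assuming any PyTorch-style differentiable classifier; the stickers and black-box attacks mentioned above are far more elaborate than this:

```python
# A minimal FGSM sketch: nudge each pixel slightly in the direction that
# increases the loss, so the image looks unchanged to a human but can fool the model.
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.03):
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()  # one step along the gradient's sign
    return adversarial.clamp(0, 1).detach()

# Toy stand-in classifier, just to make the sketch runnable
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
img = torch.rand(1, 3, 32, 32)
label = torch.tensor([3])
adv = fgsm_attack(model, img, label)
```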

Deepfakes are another area where each new video is more and more incredible. Always use the latest.

Odds and ends

  • I promise that I’m not a total buzzkill in ethics and philosophy. I enjoyed learning the ins and outs of EU natural rights and whether ‘freedom of thought’ could pose a human rights argument against microtargeting.
  • Brits talking about AI taking our jobs and being impossible to understand and control made me think I was in a Years and Years montage.
  • Blockchain came up rarely. I heard that blockchain-hash evidence was accepted by an international court, but I found only this article from China’s court system.
  • NLP didn’t play a huge role here either, but I still will think about it.
  • Studying a diagram of a neural network, a classmate suggested regulating a minimum number of layers for networks in particularly sensitive fields. Another suggestion was a standard test set (a Kaggle competition?).
  • Overall there was less theorizing about regulation than I expected. Toward the end, I came to wonder if this class is less about preventing bad AI and more about sparring with each other in preparation to someday soon sue and defend cases of AI bias and human rights.
  • Every discussion continued over-time and outside of class; our organizer began to tell every speaker that we were ‘talkative’ or ‘interactive’.
    There are also many classmates who I want to connect with in the future, based on their deep experience. Even in the parts where I disagreed or lacked experience, these were jumping-off points that I want to read more about. Overall, a great learning experience for a technical or legal mind.
