ML Arxiv Haul #5

Nick Doiron
6 min read · May 7, 2022


Time for another hyper-exciting sweep of ML papers:

This is a less strictly formatted paper, but the authors have access to DALL-E 2 and are posting on arxiv, so let’s go. Just as CLIP / DALL-E floored image generation people one year ago, DALL-E 2 is a clear winner. The authors find a few aspects of language which puzzle the model (number, negation, placement in the scene) and show examples. They don’t cover the issues which were already explored in the original paper (incoherent letters in signage, etc).

This is another example of three leading stock Tweet sentiment-analysis models being puzzled by word changes. Of three strategies, ‘joint optimization’ (basically playing the model against itself) is the most effective. This is another ‘text attack’ approach where it’s unclear whether evil traders would use this to trick competitors, or whether the point is that the model is not reliable on real Tweets.
Many changes (‘Buy Stocks’ to ‘unsettled Stocks’) would truly be bad, and others (‘information’ to ‘discovery’) use words from ‘company is being sued’ type headlines.
I wonder if we could fool stock prediction with less technical methods like adding ‘class action incoming’ or ‘about to go broke’ to Tweets.

The researchers compare four methods for determining which word or token contributes to a sentence being labeled misogynist or not. The methods include SHAP, gradient methods, and hiding tokens. They find measures of attention (in the transformer model) do not provide useful explanations.
A problem might be the training data, as ‘women’ and personal names get flagged by the model simply for hinting that the text is about women.
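
To make the ‘hiding tokens’ idea concrete, here is a minimal sketch of occlusion-based attribution: mask one token at a time and see how much the classifier’s score drops. This is not the paper’s code. `toy_score`, its word weights, and the `[MASK]` placeholder are assumptions made up for illustration; a real setup would call the actual classifier instead.

```python
def occlusion_attribution(tokens, score_fn, mask_token="[MASK]"):
    """Score drop when each token is hidden; bigger drop = more influence."""
    base = score_fn(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        # Replace only the i-th token with the mask and re-score the sentence
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append((tok, base - score_fn(masked)))
    return attributions


def toy_score(tokens):
    # Hypothetical stand-in for a classifier's "flagged" probability.
    # The weights are invented; note that 'women' alone carries weight,
    # mimicking the training-data problem described above.
    weights = {"awful": 0.6, "women": 0.3}
    return min(1.0, sum(weights.get(t.lower(), 0.0) for t in tokens))


for token, drop in occlusion_attribution(["women", "are", "awful"], toy_score):
    print(f"{token}: {drop:.2f}")
```

In this toy, hiding ‘women’ lowers the score even though the word is not itself abusive, which is the kind of pattern an explanation method should surface.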

A common problem for models is that they don’t perform well on new data once deployed. This paper argues that the common convention of fine-tuning a large pre-trained model might improve your numbers on the train/test dataset, but once you throw in new, out-of-distribution (OOD) data, it’s better if you didn’t fine-tune? Their preferred method, LP-FT, is linear probing (training only a new output layer on top of the frozen pre-trained model) followed by fine-tuning the whole thing.
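
The two-stage order of LP-FT can be sketched on a deliberately tiny model. This is a toy under stated assumptions, not the paper’s implementation: the ‘backbone’ and ‘head’ are each a single scalar weight, the loss is mean squared error, and the function name, step counts, and learning rates are all invented for illustration.

```python
def lp_ft(data, backbone_w, head_w=0.0,
          lp_steps=200, ft_steps=200, lp_lr=0.1, ft_lr=0.01):
    """Toy LP-FT on the model y_hat = head_w * (backbone_w * x)."""

    def grads(bw, hw):
        # Mean-squared-error gradients for backbone and head weights
        g_b = g_h = 0.0
        for x, y in data:
            feat = bw * x
            err = hw * feat - y
            g_h += 2 * err * feat / len(data)
            g_b += 2 * err * hw * x / len(data)
        return g_b, g_h

    # Stage 1, linear probing: backbone frozen, only the head is updated
    for _ in range(lp_steps):
        _, g_h = grads(backbone_w, head_w)
        head_w -= lp_lr * g_h
    probed_head = head_w

    # Stage 2, fine-tuning: update both weights, starting from the probed head
    for _ in range(ft_steps):
        g_b, g_h = grads(backbone_w, head_w)
        backbone_w -= ft_lr * g_b
        head_w -= ft_lr * g_h
    return backbone_w, head_w, probed_head


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # made-up data where y = 2x
backbone, head, probed = lp_ft(data, backbone_w=1.0)
print(backbone, head, probed)
```

Because the head is already a good fit when fine-tuning starts, the gradients reaching the backbone are small and the ‘pre-trained’ weight barely moves, which is the intuition for why this ordering can preserve features that matter on OOD data.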

I thought this would fall into my dream problem of a language-curriculum generator, but instead it shows large generative models a programming puzzle and asks them to generate more puzzles.

Six months ago, after reading over some toxic language papers, I tweeted:

This paper is a much much smarter take on this problem! It discusses all stages of researching toxic language (including checking in with the team, formatting text for publication, and protecting individuals and groups targeted by the hate speech).

Source of the mind-blowing fact that “There are more French words in GPT-3 training data (3.5B) than there were English words in the original BERT training data (3.3B)”. This makes OpenAI’s GPT-3 and Google’s T5 models passable on non-English benchmarks, and sometimes better than the monolingual BERT models. I would still recommend mT5 if you can get it.

The paper develops a model to ‘explain’ individual neurons in a purely image-based neural network (i.e. ResNets, GANs, and one visual transformer [DINO] which does not have text data). The text model is trained on representations of neurons’ activations and labels by Mechanical Turkers.
It’s interesting to look at individual neurons even if it’s unclear what they’re really seeing.

I’m going to admit that I don’t like seeing a bunch of fancy named scores to describe a model’s accuracy. I prefer a good confusion matrix squares thing if you can get it. This is an open source project from Apple (!) which extends the confusion matrix to some complex stuff.
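
For anyone who hasn’t built the plain ‘squares thing’ by hand, a minimal sketch of a basic confusion matrix follows. It is not the Apple project’s API, just the standard table of true labels against predicted labels, and the labels and predictions are made up.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Made-up labels and predictions for illustration
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]
labels = ["cat", "dog"]

for label, row in zip(labels, confusion_matrix(y_true, y_pred, labels)):
    print(label, row)
# cat [1, 1]
# dog [1, 2]
```

The diagonal holds the correct predictions, and each off-diagonal cell shows exactly which class got mistaken for which, detail that a single accuracy or F1 number hides.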

There are a few multilingual hate speech models, but it’s cool to see a big toxic-language corpus outside of English. The researchers at Universidad de Jaén collected comments from social media sites. They measure the accuracy of a few non-transformer models and BETO (a monolingual model from Chile).

It’s unclear how friendly or bitter the politics are with OpenAI keeping GPT-3 on a restricted API/paywall, and now Facebook providing “Open Pre-trained Transformers” (OPT), with a public 30-billion-parameter model and a full model (175 billion, same as GPT-3) on their research API.
EleutherAI has gone down this ‘just publish the GPT-3 model’ road before, but with a mere 20 billion parameters.

In a classic move seen recently in Google’s PaLM paper, ‘Stochastic Parrots’ is cited in the Limitations section (here in a cluster of seven cited papers).

AI Ethics Twitter so far has focused on the decision to release the model with easily probed hate, and the mention of PushShift / Reddit as a source. Eventually we will have to come to terms with what Reddit is, and whether parts of it can train general language ability and knowledge. But many subreddits have stoked hate, and even well-modded subreddits often reflect the language / interests / experiences of young men.
Even if the model drops Reddit here, ‘the Pile’ draws from Reddit content, and ‘RoBERTa’ also has a general web corpus. So I don’t think you can get rid of Reddit without inventing your own corpus, which is not going to be all butterflies and sunshine.
I used relevant subreddits on Reddit for GPT-NYC, and many have used ‘Am I The Asshole’ [1] [2] content to make fun models.

Students at different technical skill levels attempt to get the same results reported in NLP papers. The most technical students take less time to set up their system, but all of them struggled with these issues:

incomplete dependency specification
insufficiently documented code arguments
preexisting bugs in released code
difficulties accessing required external resources
and difficult-to-read code

A ResNet model predicts image classes and produces a saliency or Neural Activation Map from the next-to-final layer of neurons. Previous studies have then had humans label these maps as useful or spurious, but that labeling won’t scale to all images in ImageNet and future problems. Here the researchers find a way to learn from a subset of human labels.
Once that’s been used to develop good saliency maps across ImageNet, they try it out on new images and release their own dataset (Salient ImageNet).

DeepMind’s ‘Flamingo’ model can chat about what it sees in images, or do ‘zero-shot learning’ on visual tasks.

This was an interesting thread about whether it ‘gets’ an image of Barack Obama pranking someone. (The replies go more in-depth about whether the questions were leading, and whether Flamingo can be tricked into saying there are giraffes depending on the phrasing of the question.)


