Interesting papers from AI skeptics
I’ve been engrossed in a few recent academic pre-prints which are skeptical of AI. Their skepticism is not just from a business / hype-train perspective; they dig deeper into how machine learning research is performed today, and what it is actually accomplishing.
The computations required for deep learning research have been doubling every few months, resulting in an estimated…
A team at the Allen Institute for Artificial Intelligence wrote about ‘Green AI’ as an alternative to computation-heavy, energy-consuming ‘Red AI’. There’s a section on quantifying environmental impact (hence, Green AI), but the article also takes a critical look at a community focused on leaderboards. Chasing a leaderboard position means MegaCorps are piling on data and CPU time, with obvious returns, but these results are inaccessible to average developers, and come at the expense of novel research.
Neural networks winning by overfitting
The leaderboard and metrics problem appears again in these two articles, one on recommendation systems and one on natural language processing (NLP).
Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches
Deep learning techniques have become the method of choice for researchers working on algorithmic aspects of recommender…
Discourse-Based Evaluation of Language Understanding
We introduce DiscEval, a compilation of 11 evaluation datasets with a focus on discourse, that can be used for…
The first article points out that several research papers publish no code for reproducing their results (a good reason to look at paperswithcode.com; subscribe to their newsletter!), and this culture makes it difficult to tell whether half of these papers advance machine learning at all.
The second article dissects an emerging trend in natural language processing: moving away from word vectors / word embeddings, to much larger pre-trained models which look at word pairings in multiple directions (I’m not 100% on how this works, but there are a variety of new models from Allen AI, Facebook, Google, etc. using this technique). According to the paper, the latest progress of these models depends on the data they were trained on, and they are not significantly better at understanding typical language examples.
Neural networks losing to statistics techniques
This brief comment on Hacker News is a few months old now, but I think of it often. It’s a warning to people like me who try running their data through AutoML platforms. AutoML was bested by feature engineers (but given the same inputs, AutoML did outperform XGBoost).
Erkut Aykutlug and Mark Peng used XGBoost with creative feature engineering whereas AutoML uses both neural network and gradient boosting tree (TFBT) with automatic feature engineering and hyperparameter tuning.
An End-to-End AutoML Solution for Tabular Data at KaggleDays | Hacker News
Natural Adversarial Examples
(in other words: real-world AI fuck-ups)
Natural Adversarial Examples
We introduce natural adversarial examples -- real-world, unmodified, and naturally occurring examples that cause…
The authors do a great job explaining how they sourced and filtered through many images to create a new sample which defeats all of the best ImageNet classifiers. Examples which seem obvious to the human eye are missed by the neural network, due to unusual backgrounds, lighting, partially obscured features, or image blurs.
- It’s a lucky miracle that AIs can drive cars. I sometimes wonder: how can robots drive but not make pancakes yet? It turns out that cars don’t need to be so picky: they make 3D models of their surroundings and don’t bump into those things. They benefit from wide roadways where everyone has learned since childhood not to get in the way. In a catastrophic situation, if the AI stops the car safely, it did its job.
A fully autonomous car would have more problems to solve (reading signs, following directions of a flagger, etc.) but these jobs are made easier by repeatedly viewing the same intersections, or giving the wheel back to a human driver.
- Leaderboards are useful for showing us when AI makes a big advance, but they appear in all of these papers as a negative force. I wonder, is it possible for a leaderboard to be a moving target? Ideas:
- organizers gradually add adversarial examples and new classes
- an additional ‘none of the above’ class
- As someone working on AI/ML as a side project, I don’t know what to do on projects where a MegaCorp already has a team of people doing the work. A good example is map tracing for OpenStreetMap: it would be awesome for me to work on such a project, but there are actual research teams at Facebook making progress on this already. It’s a solid positive for the mapping world (similar to how I felt when data imports, drones, and cars started mapping), but whatever experiments I tried in this space would be outpaced and out-computed fairly easily.
I appreciated an interesting and encouraging thread on this subject back in January:
- I really want to bake Explainable AI libraries into all of my projects going forward. I find myself making excuses for my models’ actions anyway, and all of the adversarial examples prove that we don’t understand what neural networks are seeing.
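To make the ‘none of the above’ leaderboard idea from the list above a little more concrete, here is a minimal sketch in Python. It is my own illustration, not something taken from any of the papers: the function names, the label list, and the 0.5 threshold are all hypothetical, and thresholding the softmax confidence is only one simple way to produce such a class. An overconfident model would still slip past it, which is roughly what the natural adversarial examples show.

```python
import math

# Hypothetical label for the extra class; not taken from any paper.
NONE_OF_THE_ABOVE = "none of the above"


def softmax(logits):
    """Turn raw classifier scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_reject(logits, labels, threshold=0.5):
    """Return the most likely label, or NONE_OF_THE_ABOVE when the
    model's top probability falls below the (assumed) threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return NONE_OF_THE_ABOVE
    return labels[best]


def leaderboard_accuracy(predictions, truths):
    """Accuracy where an out-of-scope image is only scored as correct
    if the model answered NONE_OF_THE_ABOVE for it."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


if __name__ == "__main__":
    labels = ["cat", "dog", "bird"]
    confident = predict_with_reject([5.0, 0.1, 0.1], labels)
    unsure = predict_with_reject([1.0, 1.0, 1.0], labels)
    print(confident, "/", unsure)
    print(leaderboard_accuracy([confident, unsure],
                               ["cat", NONE_OF_THE_ABOVE]))
```

With a scoring rule like this, organizers could keep adding out-of-scope or adversarial images to the test set, and a model that always guesses one of the known classes would gradually lose ground.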