Update on ML side projects

  • NLP in 2022 is dominated by a few labs with massive resources, to the point that a lot of research means pecking at a model through an API, paywall, or limited beta (GPT-3, Codex, DALL-E 2), while other models are not accessible at all (PaLM). This also raises the bar for results to look current, interesting, and professional.
  • My early work in this space is not high quality, which makes sense, but it makes it difficult to say ‘let’s open up that old notebook and start hacking’. My newer projects haven’t reached the point where I’m collaborating on a paper, and DIY-ing an ML paper is intimidating.
  • I work on an ML platform in my day job, mostly on Go/Python backend code, so I already spend a lot of time and effort on ‘ML’, but less of it on models and analysis.
  • There is a cycle, treadmill, or some other mechanism rapidly escalating what models can do with English text, Python/JS coding, and some multilingual tasks. Even though only a few labs publish the models, we’re seeing the developments play out publicly.
    The risk here is that AI capability is on an exponential growth chart while AI evaluation / monitoring / auditing remains a laborious process. I depend on and respect that rigorous academic / legal space, but damn, it would be nice to throw a battery of adversarial, continually morphing tests at models as they come off the line.
  • We need better visualizations of text-generation models. One of my weirder code-generation results shows that changing the name and license at the top of a code file changes details (such as food emojis or city names) in the file. The output isn’t ‘wrong’, but it hints that ambiguous instructions float many options into the probability space, and the one that comes out of the model is determined by your decoder and some sketchy butterfly-effect stuff.
    Generation models are also not great at assigning probabilities at the beginning of a piece of code or text, which makes it possible to miss surprisal/saliency there.
  • Semantic search (searching by document vector similarity) has been getting a lot of attention and promises to be very cool. I would like to try it on /r/AskNYC. I already run into issues with fuzzy search: when I search for ‘chimichanga’, Google Maps returns every Mexican restaurant.
  • On the other end, explainability and algorithmic recourse (i.e. to change the output, you should change this input) are underrated / under-explored. Governments have spent millions on dowsing rods and will happily adopt AI/ML that is no better than random noise unless there is some receipt or recourse to inspect why it works.
  • If you talk with tech-for-good groups, they have NLP tasks that are still difficult to set up in non-English languages (text simplification, reverse curriculum, and open-domain QA). Even when a language needs more data and training, it should be easier to plug the libraries and datasets together for a demo.
  • I’m participating in the Probabilistic AI class in Finland in June, so I’m eager to learn more about that way of thinking.
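As a rough sketch of the semantic search bullet above (ranking documents by vector similarity), here is a toy Python example. Everything in it is assumed for illustration: the `embed` function, the three-dimensional `TOY_VECTORS` table, and the sample post titles are made up, and a real system would get its vectors from a trained embedding model rather than a hand-written dictionary.

```python
import math

# Hypothetical hand-made word vectors: (mexican food, pizza, transit).
# A real system would get vectors from a trained embedding model.
TOY_VECTORS = {
    "chimichanga": (0.9, 0.0, 0.0),
    "taco": (1.0, 0.0, 0.0),
    "burrito": (0.95, 0.05, 0.0),
    "pizza": (0.0, 1.0, 0.0),
    "slice": (0.0, 0.9, 0.1),
    "subway": (0.0, 0.0, 1.0),
    "train": (0.0, 0.0, 0.9),
}


def embed(text):
    """Average the toy vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def search(query, documents):
    """Return (score, document) pairs sorted from most to least similar."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    return sorted(scored, reverse=True)


# Made-up post titles standing in for a real document collection.
documents = [
    "best taco truck",
    "burrito place open late",
    "dollar pizza slice",
    "which subway train",
]

for score, doc in search("chimichanga", documents):
    print(f"{score:.2f}  {doc}")
```

In this toy data both Mexican-food titles score close to 1.0 even though neither mentions a chimichanga, which is the same fuzziness as the Google Maps example: vector similarity finds the neighborhood of a query, not necessarily the exact thing asked for.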
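The surprisal mentioned in the visualization bullet above can also be sketched in a few lines. The surprisal of a token is -log2(p), where p is the probability the model assigned to that token. The tokens and probabilities below are invented for illustration and do not come from any real model, and the 4-bit threshold in the hypothetical `flag_surprising` helper is an arbitrary choice.

```python
import math


def surprisal_bits(prob):
    """Surprisal of one token in bits: -log2(p)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def flag_surprising(tokens, probs, threshold_bits=4.0):
    """Return (token, surprisal) pairs whose surprisal exceeds the threshold."""
    return [
        (tok, surprisal_bits(p))
        for tok, p in zip(tokens, probs)
        if surprisal_bits(p) > threshold_bits
    ]


# Invented example: the start of a code file and the probability
# a model might have assigned to each token (not real model output).
tokens = ["#", "Author", ":", "Alice", "import", "numpy"]
probs = [0.02, 0.05, 0.9, 0.001, 0.5, 0.6]

# Crude text visualization: one '#' per bit of surprisal, rounded.
for tok, p in zip(tokens, probs):
    bits = surprisal_bits(p)
    print(f"{tok:>8} {bits:5.2f} {'#' * round(bits)}")
```

With these made-up numbers, a fixed threshold flags the first two tokens as well as the name, which is one way the low-context start of a file could drown out the tokens that are actually interesting.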




Web->ML developer and mapmaker.

Nick Doiron