New year, new blog, new decoders

Nick Doiron
2 min read · Dec 31, 2022

This will be the last post on the Medium blog as I’m now set up on blog.georeactor.com. I’m hoping that the next post can be a new “ML Arxiv Haul”, with new formatting.
I added topics/tags (such as maps) which have their own RSS feeds, so people can subscribe just to the Arxiv posts, etc.

I started a post about country codes, flags, and TLDs, which I’m pleased has already made an impact! I convinced Emojipedia to set aside a page for the Flag of Sark — emojipedia.org/flag-sark/

Also, I started making commits to decoder-ring, a library to control the output of text-generation models.

The idea came ~6 months ago when I was preparing for ML Prague and had issues demoing decoders. The text-generation function in transformers accepts parameters covering all possible decoders, then silently runs whichever one appears to fit. Meaning if I want to do Typical Decoding, I send a value for typical_p. But if I forget to also pass do_sample=True, or pass a value outside the valid range… [earlier this year] it would silently switch to another decoder. The function also accepts arbitrary **kwargs to pass on to models, so code mistakes are silently ignored.
Transformers has done a bit of a rewrite, but passed on most of my suggestions around logging the actual decoder or raising errors.
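To make the pitfall concrete, here is a minimal sketch using the standard Hugging Face transformers generate() call. The model choice (GPT-2), prompt, and parameter values are just illustrative, not from the original demo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The new year will bring", return_tensors="pt")

# Intended: Typical Decoding, which requires sampling to be enabled.
out = model.generate(**inputs, do_sample=True, typical_p=0.2, max_new_tokens=20)

# Easy mistake: without do_sample=True, typical_p has no effect and generation
# falls back to greedy search (older versions did this silently; newer ones warn).
out_oops = model.generate(**inputs, typical_p=0.2, max_new_tokens=20)

print(tokenizer.decode(out[0], skip_special_tokens=True))
print(tokenizer.decode(out_oops[0], skip_special_tokens=True))
```

Both calls run without errors, which is exactly the problem: nothing tells you which decoder actually produced the second output.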

decoder-ring offers an opportunity to separate out text generation, and to explicitly set a decoder with its own distinct parameters and error handling. I'd like to support several newer decoders (such as RankGen, contrastive decoding, or time control) without having to make a case that each one is notable and necessary for every transformers user. In the next year I'd like to see text diffusion, reproducible runs, and some kind of decoder + probability visualization beyond the model's own next-token probabilities.
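As a rough illustration of the idea (hypothetical code, not decoder-ring's actual API): each decoder gets its own entry point, its parameters are validated up front, and do_sample can never be forgotten.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_typical(model, tokenizer, prompt, typical_p=0.9, max_new_tokens=50):
    # Fail loudly on a bad value instead of silently running a different decoder.
    if not 0.0 < typical_p <= 1.0:
        raise ValueError(f"typical_p must be in (0, 1], got {typical_p}")
    inputs = tokenizer(prompt, return_tensors="pt")
    return model.generate(
        **inputs,
        do_sample=True,  # set here, so Typical Decoding cannot fall back to greedy
        typical_p=typical_p,
        max_new_tokens=max_new_tokens,
    )

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
ids = generate_typical(model, tokenizer, "New year, new blog,")
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```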
