Deciphering Explainable AI, with ELI5

Designing our process

ELI5, named after the internet acronym ‘explain like I’m 5 [years old]’, is a Python library that connects to several machine learning libraries, including scikit-learn and XGBoost. From the docs, it looks like our use case is a common and well-supported one:

  • We want a scikit-learn-only pipeline from text to tokens to vectors.
  • We need to pick a classifier that can tell these troll and not-troll vectors apart. It would be good to review several different types and use metrics.classification_report to find out which performs best.
  • Then we will connect ELI5 to the end of the pipeline to map the classifier’s most important features back through the vectorizer, and figure out which words in the text to highlight.

Picking a tokenizer and vectorizer

There are a ton of libraries that could help here, but scikit-learn comes with only a few built in. I am using TfidfVectorizer, which the docs tell me is “equivalent to CountVectorizer followed by TfidfTransformer.” What does this all mean?
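Roughly this: CountVectorizer tokenizes each message and counts how often every token appears, and TfidfTransformer then reweights those counts by inverse document frequency, so words that show up in nearly every message count for less. TfidfVectorizer just fuses the two steps. A quick sketch to confirm the equivalence (the toy documents are placeholders):

```python
import numpy as np
from sklearn.feature_extraction.text import (
    CountVectorizer, TfidfTransformer, TfidfVectorizer,
)

docs = ["never tweet", "always tweet", "never reply, always tweet"]

# Two steps: raw token counts, then TF-IDF reweighting
counts = CountVectorizer().fit_transform(docs)
two_step = TfidfTransformer().fit_transform(counts)

# One step: TfidfVectorizer does both at once
one_step = TfidfVectorizer().fit_transform(docs)

# Both produce the same sparse document-term matrix
assert np.allclose(two_step.toarray(), one_step.toarray())
```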

Picking a classifier

The ELI5 text demo uses the classifier LogisticRegressionCV (the ‘CV’ stands for cross-validation). I started out with this option, training on an initial 10–18k messages from my AOC dataset (any more and I hit a memory error during vectorization, or an imbalance of troll and not-troll messages).
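Here is a minimal sketch of that setup. The eight messages and labels below are invented stand-ins for the real dataset, and on real data you would score a held-out test split rather than the training set:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import classification_report
from sklearn.pipeline import make_pipeline

# Invented stand-in data -- the real project trains on 10-18k labeled replies
texts = [
    "you are an embarrassment", "so embarrassing, just resign",
    "delete your account", "nobody voted for this",
    "thank you for fighting for us", "great town hall last night",
    "proud to be in your district", "keep up the good work",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = troll, 0 = not-troll

pipe = make_pipeline(TfidfVectorizer(),
                     LogisticRegressionCV(cv=2, max_iter=1000))
pipe.fit(texts, labels)

# On real data, evaluate on a held-out test split, not the training set
print(classification_report(labels, pipe.predict(texts)))
```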

Displaying ELI5 output

ELI5’s text analysis was confusing at first when I ran it in my console, but it looks great in notebooks; for this section I used NextJournal.
In the output below, green tokens contribute toward the predicted label and red ones detract from it (so a strong green means opposite things depending on whether the final label was ‘known weird’ or ‘less weird’).
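In a notebook the call looks roughly like this, reusing the fitted `pipe` from the classifier sketch above (`show_prediction` is the standard ELI5 entry point for per-document highlighting; the target names are this project’s labels):

```python
import eli5

vec = pipe.named_steps["tfidfvectorizer"]
clf = pipe.named_steps["logisticregressioncv"]

# Renders the message as HTML with each token shaded green or red
# according to how much it pushed the prediction
eli5.show_prediction(clf, "so embarrassing, just resign", vec=vec,
                     target_names=["less weird", "known weird"])
```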

Overlaying results on Twitter

The next step is setting up a local server that will receive Tweet text and overlay an analysis (with a consistent color scheme). Because the ML code is already in Python, I use Flask (see server code).
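I won’t reproduce the full server here, but its core can be roughly this small. A hedged sketch that assumes the `clf` and `vec` objects from the earlier sketches are already loaded (the `/analyze` route name is made up):

```python
from flask import Flask, request

import eli5

app = Flask(__name__)

@app.route("/analyze", methods=["POST"])
def analyze():
    # Receive raw Tweet text from the extension / front end
    tweet = request.form["text"]
    explanation = eli5.explain_prediction(
        clf, tweet, vec=vec, target_names=["less weird", "known weird"]
    )
    # format_as_html produces the same highlighted markup the notebook shows
    return eli5.format_as_html(explanation)

if __name__ == "__main__":
    app.run(port=5000)
```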

  • Either a troll or a non-troll could use the words ‘embarrassing’ or ‘daughters’, but the model was trained on a handful of replies to AOC, where those words pick up a meaning that doesn’t make sense in a global context.
  • The phrase ‘student debt relief’ is broken into three words that the model individually rates ‘known weird’, but together they express a very liberal position that was sent as a compliment to Warren. There’s room here for named entity recognition, for bundling phrases together (see the n-gram sketch after this list), or for models that include more context about neighboring words.
  • The model’s prediction comes from summing up all of the positive and negative scores and drawing a conclusion. Revealing that process has given me more information as a developer / data scientist, but unless the model and its underlying dataset are significantly improved and diversified (like 10x better), it isn’t friendly enough to show a user.
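One cheap experiment for the phrase problem above is letting the vectorizer emit word n-grams, so that multi-word phrases like ‘student debt relief’ become features in their own right. A minimal sketch (the `ngram_range` parameter is standard scikit-learn; whether it actually helps this dataset is untested):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# ngram_range=(1, 3) keeps single words and also adds two- and
# three-word phrases as features
vec = TfidfVectorizer(ngram_range=(1, 3))
vec.fit(["student debt relief"])
print(vec.get_feature_names_out())
# ['debt' 'debt relief' 'relief' 'student' 'student debt'
#  'student debt relief']
```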

Next Steps

  • I should run the well-trained model and the server-side code as separate processes, so that the server can quickly be updated and rebooted without rebuilding the model (see the joblib sketch after this list).
  • In a future post I will try FastText to see how pre-trained word vectors / word embeddings change classification (I’m confident it will improve classifier accuracy by A LOT). The trick then will be recreating my word-highlighting with a system that ELI5 doesn’t natively support.
  • Flagging Tweets would be a great reinforcement learning project, but I know less about that at this point.
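For the first item, the usual split is to train once, persist the fitted pipeline to disk, and have the Flask process load it at startup. A minimal sketch using joblib (the filename is a placeholder, and `pipe` is the fitted pipeline from the classifier sketch above):

```python
from joblib import dump, load

# Training script: fit the pipeline once, then write it to disk
dump(pipe, "troll_model.joblib")

# Server startup: load the pre-trained pipeline instead of rebuilding it
pipe = load("troll_model.joblib")
```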
