2021 research mentions wrap-up

Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking

Fangyu Liu, Ivan Vulic, Anna Korhonen, Nigel Collier
From the Language Technology Lab, University of Cambridge

Hostility Detection and Covid-19 Fake News Detection in Social Media

Ayush Gupta, Rohan Sukumaran, Kevin John, Sundeep Teki
From PathCheck Foundation, Cambridge and Indian Institute of Information Technology, Sri City

Extracting Latent Information from Datasets in CONSTRAINT 2021 Shared Task

Discusses the Hindi hate speech task where some teams used my model.

A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

Firoj Alam, Arid Hasan, Tanvirul Alam, Akib Khan, Janntatul Tajrin, Naira Khan, Shammur Absar Chowdhury
From Qatar Computing Research Institute, Cognitive Insight, BJIT, and Dhaka University

BanglaBERT: Combating Embedding Barrier in Multilingual Models for Low-Resource Language Understanding

Abhik Bhattacharjee, Tahmid Hasan, Kazi Samin, Md Saiful Islam, M. Sohel Rahman, Anindya Iqbal, Rifat Shahriyar
From Bangladesh University of Engineering and Technology (BUET) and University of Rochester

Cross-Lingual Text Classification of Transliterated Hindi and Malayalam

Jitin Krishnan, Antonios Anastasopoulos, Hemant Purohit, Huzefa Rangwala
From George Mason University

HuggingFace Datasets

This was a mass collaboration with Hugging Face. I got an acknowledgement for uploading notes on some datasets. To be honest, I’m frustrated that common datasets such as XNLI are missing their dataset card information. The card template has too many fields, and they are too often left as ‘More Information Needed’.
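To put a number on that complaint, here is a rough sketch that lists which sections of a dataset card still hold the placeholder. It assumes the card is plain Markdown with `#` headings; the function name `unfilled_sections` and the sample card text are made up for illustration and are not taken from XNLI's real card or from any Hugging Face API.

```python
def unfilled_sections(card_text, placeholder="More Information Needed"):
    """Return the headings whose section body contains the placeholder text."""
    unfilled = []
    heading = None
    for line in card_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Start of a new section: remember its heading text.
            heading = stripped.lstrip("#").strip()
        elif placeholder in stripped and heading is not None and heading not in unfilled:
            # The current section still contains the template placeholder.
            unfilled.append(heading)
    return unfilled


# Hypothetical card text, for illustration only.
sample_card = """\
# Dataset Card for Example
## Dataset Summary
A short description written by a contributor.
## Languages
[More Information Needed]
### Annotations
[More Information Needed]
"""

print(unfilled_sections(sample_card))
```

Run against a real card's README text, the length of the returned list gives a quick sense of how much of the template was never filled in.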

Technical Domain Classification of Bangla Text using BERT

Koyel Ghosh, Dr. Apurbalal Senapati
From Central Institute of Technology, Assam

Odds & Ends


Not a research paper, but a complex project by students at St. Francis Institute of Technology in Mumbai.

Final / Thesis projects

Mitigating Language-Dependent Ethnic Bias in BERT

The paper, from Korea Advanced Institute of Science and Technology, doesn’t include results for the Thai language, but the GitHub repo does include a Thai model upload in configuration.py.

Assessing the Compatibility of Cryptocurrencies and Islamic Law

This is from 2020, but I just noticed it this year. The law journal article cites my post on stablecoins and Islamic finance rules.



Nick Doiron

Web->ML developer and mapmaker.