Nick Doiron
1 min read · Jul 18, 2020

Uh-oh, I used the shuffled/deduplicated download from their site. I'm glad that I documented the process, then, and thankful for your response. I was planning to retrain the Hindi model in the near future, so I will ask the OSCAR team for the unshuffled data to measure the improvement.

Written by Nick Doiron

Web->ML developer and mapmaker.
