Small Language Models

This repository contains code for the paper Mini Minds: Exploring Bebeshka and Zlata Baby Models, accepted to the BabyLM Shared Task (CoNLL 2023).

In this work, we investigate the language model size that minimizes perplexity on the BabyLM shared task data (Warstadt et al., 2023) and present a small 4-layer RoBERTa and a 6-layer GPT-2 pre-trained on the 10M-word version of the corpus, comparable to children's vocabulary.
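
As a rough illustration of these model sizes, the sketch below instantiates untrained models with the stated layer counts using Hugging Face `transformers`. Only the layer counts come from the paper; the remaining dimensions (hidden size, attention heads, etc.) are illustrative assumptions, not the actual released configurations.

```python
from transformers import RobertaConfig, RobertaForMaskedLM, GPT2Config, GPT2LMHeadModel

# Small 4-layer RoBERTa; only num_hidden_layers is taken from the paper,
# the other dimensions below are assumptions for illustration.
roberta_config = RobertaConfig(
    num_hidden_layers=4,
    hidden_size=512,          # assumed
    num_attention_heads=8,    # assumed
    intermediate_size=2048,   # assumed
)
small_roberta = RobertaForMaskedLM(roberta_config)

# Small 6-layer GPT-2; again, only n_layer is taken from the paper.
gpt2_config = GPT2Config(
    n_layer=6,
    n_embd=512,               # assumed
    n_head=8,                 # assumed
)
small_gpt2 = GPT2LMHeadModel(gpt2_config)

print(f"RoBERTa: {sum(p.numel() for p in small_roberta.parameters()) / 1e6:.1f}M parameters")
print(f"GPT-2:   {sum(p.numel() for p in small_gpt2.parameters()) / 1e6:.1f}M parameters")
```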

We evaluate the LMs on the ETHICS dataset and show that small LMs perform on par with LLMs on tasks such as virtue judgements.
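
For reference, the ETHICS benchmark can be inspected by loading it from the Hugging Face Hub. The dataset id and configuration name below are assumptions about the Hub mirror; this repository does not specify how the data were obtained.

```python
from datasets import load_dataset

# Load the virtue portion of the ETHICS benchmark (Hendrycks et al., 2021).
# "hendrycks/ethics" / "virtue" are assumed identifiers for the Hub mirror.
ethics_virtue = load_dataset("hendrycks/ethics", "virtue")

print(ethics_virtue)              # available splits and columns
print(ethics_virtue["train"][0])  # one example with its label
```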

Available Baby LMs: