Build Your Own Large Language Model Like Dolly

Caio Moreno
May 22, 2023

I have written many blog posts about large language models (LLMs), ChatGPT, and OpenAI.

Today, I want to write about Databricks Dolly.

Free Dolly: Introducing the World’s First Truly Open Instruction-Tuned LLM

Quoting from the Databricks website:

Two weeks ago, we released Dolly, a large language model (LLM) trained for less than $30 to exhibit ChatGPT-like human interactivity (aka instruction-following). Today, we’re releasing Dolly 2.0, the first open source, instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use.

Dolly 2.0 is a 12B parameter language model based on the EleutherAI pythia model family and fine-tuned exclusively on a new, high-quality human generated instruction following dataset, crowdsourced among Databricks employees.
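To make the idea of an instruction-following dataset concrete, here is a minimal sketch of how one human-written record could be turned into a single training string for fine-tuning. The field names (`instruction`, `context`, `response`), the intro sentence, and the `### ...` section markers are assumptions chosen for illustration; the actual Dolly training code may format its prompts differently.

```python
from dataclasses import dataclass

# Assumed intro text; not confirmed to match the wording used to train Dolly.
INTRO = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


@dataclass
class InstructionRecord:
    """One human-written example (hypothetical schema for illustration)."""
    instruction: str
    response: str
    context: str = ""  # optional reference text the instruction refers to


def format_example(record: InstructionRecord) -> str:
    """Join the record into one prompt string, skipping the context if empty."""
    parts = [INTRO, f"### Instruction:\n{record.instruction}"]
    if record.context:
        parts.append(f"### Input:\n{record.context}")
    parts.append(f"### Response:\n{record.response}")
    return "\n\n".join(parts)


example = InstructionRecord(
    instruction="Name the model family Dolly 2.0 is based on.",
    response="The EleutherAI pythia model family.",
)
prompt = format_example(example)
print(prompt)
```

Fine-tuning on many strings of this shape is what teaches a base model to answer instructions rather than simply continue text.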

We are open-sourcing the entirety of Dolly 2.0, including the training code, the dataset, and the model weights, all suitable for

Caio Moreno

Solutions Architect and Data Scientist @databricks | Adjunct Professor at @IEuniversity | PhD @unicomplutense (Opinions are my own)