Creating a bespoke LLM for AI-generated documentation
Hi folks,
Databricks' LLM experts published a great blog post on how to create a bespoke LLM for AI-generated documentation. The best part: it took 2 engineers, 1 month, and less than $1,000 in compute cost.
The original blog post is here.
Original content
We recently announced our AI-generated documentation feature, which uses large language models (LLMs) to automatically generate documentation for tables and columns in Unity Catalog. We have been humbled by the reception of this feature among our customers. Today, more than 80% of the table metadata updates on Databricks are AI-assisted.
In this blog post, we share our experience developing this feature — from prototyping as a hackathon project using off-the-shelf SaaS-based LLMs to creating a bespoke LLM that is better, faster, and cheaper. The new model took 2 engineers, 1 month and less than $1,000 in compute cost to develop. We hope you will find the learnings useful, as we believe they apply to a wide class of GenAI use cases. More importantly, it has allowed us to take advantage of rapid advances being made in open-source LLMs.
What is AI-generated documentation?
At the center of each data platform lies a (potentially enormous) collection of datasets (often in the form of tables). In virtually every organization we have worked with, the vast majority of tables are not documented. The…