Christian Mills
Mastering LLMs Course Notes
My notes from the course
Mastering LLMs: A Conference For Developers & Data Scientists
by Hamel Husain and Dan Becker.
Modified
September 12, 2024
Course Links:
Mastering LLMs: A Conference For Developers & Data Scientists
Course Material: Recordings, Slides, and Transcripts
Workshop 1: When and Why to Fine-Tune an LLM
notes
llms
Workshop #1 provides a practical overview of fine-tuning large language models, focusing on when it is and is not beneficial, and emphasizing a workflow of simplification, prototyping, and iterative improvement driven by evaluations.
May 31, 2024
19 min
Workshop 2: Fine-Tuning with Axolotl
notes
llms
Workshop #2 builds on Workshop 1 to focus on practical fine-tuning of LLMs, covering model selection, fine-tuning techniques with Axolotl, data quality improvement, debugging, and using tools like Accelerate and Modal.
Jun 9, 2024
31 min
Conference Talk 1: Ten Commandments to Deploy Fine-Tuned Models in Production
notes
llms
This talk by
Kyle Corbitt
from OpenPipe outlines ten key recommendations for successfully deploying fine-tuned large language models (LLMs) in production.
Jun 10, 2024
11 min
Office Hours 1: Axolotl Q&A with Wing Lian
notes
llms
This Q&A session covered various topics, including template-free prompt construction, data type selection for HuggingFace datasets, DPO and RLHF, understanding chat templates and dataset types, ensuring consistent tokenization, multimodal fine-tuning, and future directions for Axolotl.
Jun 11, 2024
9 min
Office Hours 2: Q&A Session with Zach Mueller
notes
llms
This Q&A session covers various aspects of LLM fine-tuning, including tools, techniques, data sets, and hardware considerations.
Jun 11, 2024
7 min
Office Hours 3: Gradio Q&A Session with Freddy Boulton
notes
llms
Freddy showcases Gradio’s features, advantages, and repository contributions, highlighting its potential for AI applications. He concludes with insights into its future roadmap, which includes enhanced agent workflows, real-time streaming, and improved UI features.
Jun 14, 2024
7 min
Workshop 3: Instrumenting & Evaluating LLMs
notes
llms
Workshop #3 focuses on the crucial role of evaluation in fine-tuning and improving LLMs. It covers three main types of evaluations: unit tests, LLM as a judge, and human evaluation.
Jun 20, 2024
26 min
Conference Talk 2: LLM Eval For Text2SQL
notes
llms
This talk by
Ankur Goyal
from BrainTrust covers how to build evals for LLM systems by walking through a Text2SQL use case.
Jun 29, 2024
61 min
Conference Talk 3: Prompt Engineering Workshop
notes
llms
This talk by
John Berryman
covers the fundamentals of language models, prompt engineering techniques, and building LLM applications.
Jun 30, 2024
21 min
Conference Talk 4: Inspect - An OSS Framework for LLM Evals
notes
llms
In this talk,
J.J. Allaire
walks through the core concepts and design of the Inspect framework and demonstrates its use for a variety of evaluation tasks.
Jul 6, 2024
27 min
Office Hours 4: Modal with Charles Frye
notes
llms
This Q&A session covers a wide array of topics related to Modal, a platform designed to simplify the execution of Python code in the cloud.
Jul 6, 2024
8 min
Office Hours 5: LangChain/LangSmith
notes
llms
This Q&A session on LangChain/LangSmith covers topics like product differentiation, features, use cases, agent workflows, data set creation, and full-stack development for ML engineers.
Jul 6, 2024
7 min
Conference Talk 5: Napkin Math For Fine Tuning with Johno Whitaker
notes
llms
In this talk,
Jonathan Whitaker
from answer.ai shows how to build intuition around training performance, with a focus on GPU-poor fine-tuning.
Jul 7, 2024
13 min
Office Hours 6: Johno Whitaker
notes
llms
This Q&A session covers a wide range of topics related to LLMs, including practical tips for training and optimization, insights into the current research landscape, and thoughts on future trends.
Jul 11, 2024
9 min
Conference Talk 6: Train Almost Any LLM Model Using 🤗 autotrain
notes
llms
In this talk,
Abhishek Thakur
, who leads AutoTrain at 🤗, shows how to use 🤗 AutoTrain to train/fine-tune LLMs without having to write any code.
Jul 12, 2024
4 min
Workshop 4: Instrumenting & Evaluating LLMs
notes
llms
Workshop #4 focuses on the practical aspects of deploying fine-tuned LLMs, covering various deployment patterns, performance optimization techniques, and platform considerations.
Jul 17, 2024
29 min
Conference Talk 7: Best Practices For Fine Tuning Mistral
notes
llms
In this talk,
Sophia Yang
from Mistral AI covers best practices for fine-tuning Mistral language models, including Mistral’s capabilities, the benefits of fine-tuning over prompting, and practical demos using the Mistral Fine-tuning API and open-source codebase.
Jul 18, 2024
10 min
Conference Talk 8: Creating, curating, and cleaning data for LLMs
notes
llms
In this talk,
Daniel van Strien
from 🤗 outlines key considerations and techniques for creating high-quality datasets for fine-tuning LLMs.
Jul 18, 2024
10 min
Conference Talk 9: Why Fine-Tuning is Dead
notes
llms
In this talk,
Emmanuel Ameisen
from Anthropic argues that fine-tuning LLMs is often less effective and efficient than focusing on fundamentals like data quality, prompting, and Retrieval-Augmented Generation (RAG).
Jul 19, 2024
7 min
Conference Talk 10: Systematically Improving RAG Applications
notes
llms
In this talk,
Jason Liu
covers a systematic approach to improving Retrieval-Augmented Generation (RAG) applications.
Jul 20, 2024
12 min
Conference Talk 12: Slaying OOMs with PyTorch FSDP and torchao
notes
llms
In this talk,
Mark Saroufim
and
Jane Xu
discuss techniques and tools for mitigating Out of Memory (OOM) errors in PyTorch, specifically when working with LLMs.
Jul 24, 2024
11 min
Conference Talk 13: When to Fine-Tune with Paige Bailey
notes
llms
In this talk,
Paige Bailey
, Generative AI Developer Relations lead at Google, discusses Google’s AI landscape with a focus on Gemini models and their applications.
Jul 25, 2024
6 min
Conference Talk 14: Explaining the Basics of Retrieval Augmented Generation
notes
llms
In this talk,
Ben Clavié
from Answer.ai deconstructs the concept of Retrieval-Augmented Generation (RAG) and walks through building a robust, basic RAG pipeline.
Aug 2, 2024
32 min
Conference Talk 15: Modal - Simple Scalable Serverless Services
notes
llms
In this talk,
Charles Frye
provides a deeper dive into Modal, exploring its capabilities beyond fine-tuning LLMs and demonstrating how it empowers users to build and deploy scalable, cost-efficient, and serverless applications with simplicity using Python.
Aug 25, 2024
11 min
Office Hours 7: Replicate
notes
llms
This Q&A session on the Replicate platform covers topics like enterprise readiness, model deployment, application layers for LLMs, data privacy, logging, and potential future features.
Aug 25, 2024
5 min
Conference Talk 16: A Deep Dive on LLM Evaluation
notes
llms
In this talk,
Hailey Schoelkopf
from
Eleuther AI
provides an overview of the challenges in LLM evaluation, exploring different measurement techniques, highlighting reproducibility issues, and advocating for best practices like sharing evaluation code and using task-specific downstream evaluations.
Aug 29, 2024
8 min
Conference Talk 17: Language Models on the Command-Line
notes
llms
In this talk,
Simon Willison
showcases
LLM
, a command-line tool for interacting with large language models, including how to leverage its plugin system, local model support, embedding capabilities, and integration with other Unix tools for tasks like retrieval augmented generation.
Aug 29, 2024
10 min
Conference Talk 18: Fine-Tuning OpenAI Models - Best Practices
notes
llms
In this talk,
Steven Heidel
from OpenAI’s fine-tuning team covers best practices, use cases, and recent updates for fine-tuning OpenAI models.
Aug 29, 2024
12 min
Office Hours 8: Predibase
notes
llms
This Q&A session with Predibase compares and contrasts LoRAX, an open-source adapter-serving library for large language models, with similar libraries, highlighting its performance optimizations, unique features such as dynamic adapter loading and support for various adapter types, and its role in a broader machine learning infrastructure strategy.
Aug 29, 2024
10 min
Conference Talk 19: Fine Tuning LLMs for Function Calling
notes
llms
In this talk,
Pawel Garbacki
from
Fireworks.ai
covers the process and best practices of fine-tuning an LLM for function/tool use.
Aug 30, 2024
18 min
Conference Talk 20: Back to Basics for RAG
notes
llms
In this talk,
Jo Kristian Bergum
from
Vespa.ai
explores practical strategies for building better Retrieval Augmented Generation (RAG) applications, emphasizing the importance of robust evaluation methods and understanding the nuances of information retrieval beyond simple vector embeddings.
Aug 30, 2024
16 min
Livestream: Lessons from a Year of Building with LLMs
notes
llms
This live discussion between six AI experts and practitioners centers on the practical lessons learned from a year of building real-world applications with LLMs, emphasizing the critical importance of data literacy, rigorous evaluation, and iterative development processes.
Aug 30, 2024
14 min