

IBM

Expo Workshop

Reliable and Efficient LLM Outputs with Mellea + Granite OSS Libraries

Heiko Ludwig ⋅ Kenney Ng ⋅ Paul Schweigert ⋅ Luis Lasras

AUDITORIUM
Mon 6 Jul 4 p.m. KST — 7 p.m. KST

Abstract:

Every LLM application eventually runs into the same wall: the model generates plausible-sounding output that is wrong, off-format, or unsafe, and nothing sits between generation and delivery to catch it. Prompting the model harder sometimes helps, but not reliably.

This workshop teaches a systematic approach to the problem using two open-source IBM tools: Mellea, a Python library for structured LLM generation, and Granite Libraries, a collection of lightweight LoRA adapters that score generated output against developer-defined requirements. Together they implement an Instruct-Validate-Repair loop — generate a response, measure it against your requirements, and select or retry before it reaches the user.
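The Instruct-Validate-Repair loop described above can be sketched in plain Python. This is an illustrative sketch only and does not use Mellea's or Granite Libraries' actual APIs: `Requirement`, `fake_generate`, and `instruct_validate_repair` are hypothetical names, the model call is a hard-coded stand-in, and the checks are simple string tests rather than adapter scores.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Requirement:
    """A plain-English requirement paired with a programmatic check (hypothetical type)."""
    description: str
    check: Callable[[str], bool]


def fake_generate(prompt: str, attempt: int) -> str:
    """Stand-in for a model call: the first attempt is off-format, later ones comply."""
    if attempt == 0:
        return "The answer is probably forty-two, see Smith 2021."
    return "ANSWER: 42"


def instruct_validate_repair(
    prompt: str,
    requirements: List[Requirement],
    generate: Callable[[str, int], str],
    max_attempts: int = 3,
) -> Tuple[str, int, bool]:
    """Generate, validate against every requirement, and retry with feedback on failure."""
    feedback: List[str] = []
    output = ""
    for attempt in range(max_attempts):
        full_prompt = prompt
        if feedback:
            # Repair step: tell the generator which requirements the last output failed.
            full_prompt += "\nFix these issues: " + "; ".join(feedback)
        output = generate(full_prompt, attempt)
        # Validate step: collect the descriptions of all failed requirements.
        feedback = [r.description for r in requirements if not r.check(output)]
        if not feedback:
            return output, attempt + 1, True
    # Attempt budget exhausted: return the last output, flagged as unvalidated.
    return output, max_attempts, False


requirements = [
    Requirement("Start with 'ANSWER:'", lambda text: text.startswith("ANSWER:")),
    Requirement("Stay under 20 characters", lambda text: len(text) < 20),
]

result = instruct_validate_repair("What is 6 * 7?", requirements, fake_generate)
print(result)
```

In this sketch the caller receives the output, the number of attempts used, and a flag saying whether validation passed, so a failed response can be blocked or escalated instead of being delivered silently.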

Participants start with a plain chatbot that hallucinates citations, ignores formatting rules, and produces uncontrolled output. By the end of the session, the same chatbot validates every response against a set of requirements defined in plain English, generates multiple candidates in parallel, and automatically selects the best one — all running locally, all on open-source models.

No cloud accounts, no audio hardware, no frontend build. A working environment takes under five minutes to set up.

What you will build: a command-line chat application that grows module by module — from a bare Mellea generation call, to single-requirement scoring, to a parallel Best-of-N validation loop with multiple Granite Libraries adapters firing simultaneously.
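As a rough illustration of the parallel Best-of-N step, the sketch below scores several candidates concurrently and keeps the highest-scoring one. It relies only on the Python standard library and is not the workshop's code: `stub_candidates` stands in for N model samples, and the two scorer functions are hypothetical placeholders for requirement-scoring adapters.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, List, Tuple

Scorer = Callable[[str], float]


def stub_candidates(prompt: str, n: int) -> List[str]:
    """Stand-in for sampling n responses from a model (hypothetical, fixed data)."""
    pool = [
        "Paris is the capital of France.",
        "paris",
        "The capital of France is Paris, according to a 2031 study by Dr. Nobody.",
    ]
    return [pool[i % len(pool)] for i in range(n)]


def score_complete_sentence(text: str) -> float:
    """1.0 if the text starts with a capital letter and ends with a period."""
    return 1.0 if text[:1].isupper() and text.endswith(".") else 0.0


def score_concise(text: str) -> float:
    """1.0 if the text has at most eight whitespace-separated words."""
    return 1.0 if len(text.split()) <= 8 else 0.0


def score_candidate(text: str, scorers: List[Scorer]) -> float:
    """Average the scores of all requirement scorers for one candidate."""
    return mean(scorer(text) for scorer in scorers)


def best_of_n(prompt: str, n: int, scorers: List[Scorer]) -> Tuple[str, List[float]]:
    """Score n candidates in parallel and return the best one with all scores."""
    candidates = stub_candidates(prompt, n)
    with ThreadPoolExecutor(max_workers=n) as executor:
        # executor.map preserves input order, so scores[i] belongs to candidates[i].
        scores = list(executor.map(lambda c: score_candidate(c, scorers), candidates))
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores


best, scores = best_of_n(
    "What is the capital of France?",
    n=3,
    scorers=[score_complete_sentence, score_concise],
)
print(best, scores)
```

Averaging is only one way to combine scores. A stricter variant could reject any candidate that fails a single requirement and retry if none survive.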

What you will leave with: a mental model of how to enforce output quality programmatically, hands-on experience writing and tuning natural-language requirements, and a local codebase you can adapt to your own domain.

Technologies covered: Mellea, Granite Libraries (activated LoRA adapters), IBM Granite 4.0, Python, OpenAI-compatible inference backends (LM Studio, Ollama, vLLM).

All tools and models used are Apache 2.0 licensed and available on Hugging Face.
