    How to Build Type-Safe, Schema-Constrained, and Function-Driven LLM Pipelines Using Outlines and Pydantic

    By Naveed Ahmad | 15/03/2026 | Updated: 15/03/2026 | 6 Mins Read


    In this tutorial, we build a workflow using Outlines to generate structured, type-safe outputs from language models. We work with typed constraints such as Literal, int, and bool, design prompt templates with outlines.Template, and enforce strict schema validation with Pydantic models. We also implement robust JSON recovery and a function-calling style that generates validated arguments and executes Python functions safely. Throughout the tutorial, we focus on reliability, constraint enforcement, and production-grade structured generation.

    import os, sys, subprocess, json, textwrap, re

    # Install the dependencies quietly (intended for a fresh notebook runtime).
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q",
                          "outlines", "transformers", "accelerate", "sentencepiece", "pydantic"])

    import torch
    import outlines
    from transformers import AutoTokenizer, AutoModelForCausalLM

    from typing import Literal, List, Union, Annotated
    from pydantic import BaseModel, Field
    from enum import Enum

    print("Torch:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    print("Outlines:", getattr(outlines, "__version__", "unknown"))
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print("Using device:", device)

    MODEL_NAME = "HuggingFaceTB/SmolLM2-135M-Instruct"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
    hf_model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        # Half precision on GPU, full precision on CPU.
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
        device_map="auto" if device == "cuda" else None,
    )

    if device == "cpu":
        hf_model = hf_model.to(device)

    # Wrap the Hugging Face model so Outlines can constrain its output.
    model = outlines.from_transformers(hf_model, tokenizer)


    def build_chat(user_text: str, system_text: str = "You are a precise assistant. Follow instructions exactly.") -> str:
        # Prefer the tokenizer's chat template; fall back to a plain-text prompt.
        try:
            msgs = [{"role": "system", "content": system_text}, {"role": "user", "content": user_text}]
            return tokenizer.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
        except Exception:
            return f"{system_text}\n\nUser: {user_text}\nAssistant:"


    def banner(title: str):
        print("\n" + "=" * 90)
        print(title)
        print("=" * 90)

    We install all required dependencies and initialize the Outlines pipeline with a lightweight instruct model. We configure device handling so that the system automatically switches between CPU and GPU based on availability. We also build reusable helper functions for chat formatting and clean section banners to structure the workflow.
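To make the fallback branch of build_chat concrete, here is a minimal sketch that assumes a tokenizer whose apply_chat_template call fails. NoTemplateTokenizer and build_chat_with are hypothetical names used only for illustration; the helper takes the tokenizer as an argument so the snippet runs without loading a model.

```python
class NoTemplateTokenizer:
    """Hypothetical stand-in for a tokenizer that has no chat template."""

    def apply_chat_template(self, msgs, tokenize=False, add_generation_prompt=True):
        raise ValueError("no chat template configured")


def build_chat_with(tok, user_text, system_text="You are a precise assistant."):
    # Same logic as build_chat, with the tokenizer injected for testing.
    try:
        msgs = [{"role": "system", "content": system_text},
                {"role": "user", "content": user_text}]
        return tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
    except Exception:
        # Plain-text prompt used when the template call raises.
        return f"{system_text}\n\nUser: {user_text}\nAssistant:"


fallback_prompt = build_chat_with(NoTemplateTokenizer(), "Is 29 prime?")
print(fallback_prompt)
```

The plain-text fallback keeps the pipeline running, although a model tuned on a specific chat format may respond less reliably to it.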

    def extract_json_object(s: str) -> str:
        # Return the first balanced {...} span, ignoring braces inside strings.
        s = s.strip()
        start = s.find("{")
        if start == -1:
            return s
        depth = 0
        in_str = False
        esc = False
        for i in range(start, len(s)):
            ch = s[i]
            if in_str:
                if esc:
                    esc = False
                elif ch == "\\":
                    esc = True
                elif ch == '"':
                    in_str = False
            else:
                if ch == '"':
                    in_str = True
                elif ch == "{":
                    depth += 1
                elif ch == "}":
                    depth -= 1
                    if depth == 0:
                        return s[start:i + 1]
        return s[start:]


    def json_repair_minimal(bad: str) -> str:
        # Drop anything after the last closing brace.
        bad = bad.strip()
        last = bad.rfind("}")
        if last != -1:
            return bad[:last + 1]
        return bad


    def safe_validate(model_cls, raw_text: str):
        # Validate the extracted JSON; retry once after a minimal repair.
        raw = extract_json_object(raw_text)
        try:
            return model_cls.model_validate_json(raw)
        except Exception:
            raw2 = json_repair_minimal(raw)
            return model_cls.model_validate_json(raw2)


    banner("2) Typed outputs (Literal / int / bool)")

    sentiment = model(
        build_chat("Analyze the sentiment: 'This product completely changed my life!'. Return one label only."),
        Literal["Positive", "Negative", "Neutral"],
        max_new_tokens=8,
    )
    print("Sentiment:", sentiment)

    bp = model(build_chat("What's the boiling point of water in Celsius? Return integer only."), int, max_new_tokens=8)
    print("Boiling point (int):", bp)

    prime = model(build_chat("Is 29 a prime number? Return true or false only."), bool, max_new_tokens=6)
    print("Is prime (bool):", prime)

    We implement robust JSON extraction and minimal repair utilities to safely recover structured outputs from imperfect generations. We then demonstrate strongly typed generation using Literal, int, and bool, ensuring the model returns values that are strictly constrained. We verify how Outlines enforces type-safe outputs directly at generation time.
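As an alternative to manual brace counting, the standard library's json.JSONDecoder.raw_decode can parse the first JSON value starting at a given index and ignore whatever follows. The sketch below is one possible variant, not the tutorial's own helper; first_json_object is a hypothetical name.

```python
import json


def first_json_object(text: str):
    """Return the first decodable JSON object in text, or None if there is none."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores the rest.
            obj, _end = decoder.raw_decode(text, pos)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        # Try the next opening brace if this one did not start valid JSON.
        pos = text.find("{", pos + 1)
    return None


noisy = 'Sure! {"label": "Positive", "note": "a } inside a string is fine"} Hope that helps.'
result = first_json_object(noisy)
print(result)
```

Because the decoder does real parsing, braces inside string values and trailing chatter after the object do not need special handling.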

    banner("3) Prompt templating (outlines.Template)")

    # Role markers and the {{ text }} placeholder are filled in at call time.
    tmpl = outlines.Template.from_string(textwrap.dedent("""
    <|system|>
    You are a strict classifier. Return ONLY one label.
    <|user|>
    Classify sentiment of this text:
    {{ text }}
    Labels: Positive, Negative, Neutral
    <|assistant|>
    """).strip())

    templated = model(tmpl(text="The food was cold but the staff were kind."), Literal["Positive","Negative","Neutral"], max_new_tokens=8)
    print("Template sentiment:", templated)

    We use outlines.Template to build structured prompt templates with strict output control. We dynamically inject user input into the template while preserving role formatting and classification constraints. We demonstrate how templating improves reusability and keeps responses consistent and constrained.
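outlines.Template renders Jinja-style placeholders. As a rough, dependency-free illustration of what the rendered prompt looks like, the sketch below substitutes {{ name }} placeholders with a regular expression. render_placeholders is a hypothetical helper that mimics only simple variable substitution and none of Jinja's other features.

```python
import re
import textwrap


def render_placeholders(template: str, **values: str) -> str:
    """Replace {{ name }} placeholders; a missing name raises KeyError."""
    def substitute(match):
        return str(values[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


TEMPLATE = textwrap.dedent("""
<|system|>
You are a strict classifier. Return ONLY one label.
<|user|>
Classify sentiment of this text:
{{ text }}
Labels: Positive, Negative, Neutral
<|assistant|>
""").strip()

prompt = render_placeholders(TEMPLATE, text="The food was cold but the staff were kind.")
print(prompt)
```

Printing the rendered prompt before sending it to the model is a cheap way to confirm that the role markers and the injected text ended up where we expect.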

    banner("4) Pydantic structured output (advanced constraints)")


    class TicketPriority(str, Enum):
        low = "low"
        medium = "medium"
        high = "high"
        urgent = "urgent"


    # Regex-constrained string types: dotted-quad IPv4 and YYYY-MM-DD dates.
    IPv4 = Annotated[str, Field(pattern=r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$")]
    ISODate = Annotated[str, Field(pattern=r"^\d{4}-\d{2}-\d{2}$")]


    class ServiceTicket(BaseModel):
        priority: TicketPriority
        category: Literal["billing", "login", "bug", "feature_request", "other"]
        requires_manager: bool
        summary: str = Field(min_length=10, max_length=220)
        action_items: List[str] = Field(min_length=1, max_length=6)


    class NetworkIncident(BaseModel):
        affected_service: Literal["dns", "vpn", "api", "website", "database"]
        severity: Literal["sev1", "sev2", "sev3"]
        public_ip: IPv4
        start_date: ISODate
        mitigation: List[str] = Field(min_length=2, max_length=6)


    email = """
    Subject: URGENT - Cannot access my account after payment
    I paid for the premium plan 3 hours ago and still can't access any features.
    I have a client presentation in an hour and need the analytics dashboard.
    Please fix this immediately or refund my payment.
    """.strip()

    ticket_text = model(
        build_chat(
            "Extract a ServiceTicket from this message.\n"
            "Return JSON ONLY matching the ServiceTicket schema.\n"
            "Action items must be distinct.\n\nMESSAGE:\n" + email
        ),
        ServiceTicket,
        max_new_tokens=240,
    )

    # The model may return a JSON string; validate it into a ServiceTicket.
    ticket = safe_validate(ServiceTicket, ticket_text) if isinstance(ticket_text, str) else ticket_text
    print("ServiceTicket JSON:\n", ticket.model_dump_json(indent=2))

    We define advanced Pydantic schemas with enums, regex constraints, field limits, and structured lists. We extract a complex ServiceTicket object from raw email text and validate it using schema-driven decoding. We also apply safe validation logic to handle edge cases such as truncated or noisy JSON.
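The regex constraints can be checked on their own with the standard re module. The sketch below reuses the two patterns from the schema; is_ipv4, has_iso_shape, and is_real_date are hypothetical helper names. It also illustrates a limitation: the date pattern only checks the shape of the string, so an extra calendar check is assumed here for cases where real dates matter.

```python
import re
from datetime import date

IPV4_PATTERN = r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$"
ISO_DATE_PATTERN = r"^\d{4}-\d{2}-\d{2}$"


def is_ipv4(value: str) -> bool:
    return re.fullmatch(IPV4_PATTERN, value) is not None


def has_iso_shape(value: str) -> bool:
    # Checks only the YYYY-MM-DD layout, not whether the date exists.
    return re.fullmatch(ISO_DATE_PATTERN, value) is not None


def is_real_date(value: str) -> bool:
    # Adds a calendar check on top of the shape check.
    if not has_iso_shape(value):
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


print(is_ipv4("192.168.1.10"), is_ipv4("256.1.1.1"))
print(has_iso_shape("2026-13-45"), is_real_date("2026-13-45"))
```

In a Pydantic model, the same calendar check could live in a field validator, while the pattern remains useful for constraining generation.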

    banner("5) Function-calling style (schema -> args -> call)")


    class AddArgs(BaseModel):
        a: int = Field(ge=-1000, le=1000)
        b: int = Field(ge=-1000, le=1000)


    def add(a: int, b: int) -> int:
        return a + b


    args_text = model(
        build_chat("Return JSON ONLY with two integers a and b. Make a odd and b even."),
        AddArgs,
        max_new_tokens=80,
    )

    # Validate the generated arguments before calling the function.
    args = safe_validate(AddArgs, args_text) if isinstance(args_text, str) else args_text
    print("Args:", args.model_dump())
    print("add(a,b) =", add(args.a, args.b))

    print("Tip: For best speed and fewer truncations, switch Colab Runtime → GPU.")

    We implement a function-calling style workflow by generating structured arguments that conform to a defined schema. We validate the generated arguments, then safely execute a Python function with those validated inputs. We demonstrate how schema-first generation enables controlled tool invocation and reliable LLM-driven computation.
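The same validate-then-call idea extends to several tools. The following is a minimal, standard-library sketch that assumes a small registry mapping tool names to a function and its integer bounds. TOOLS and call_tool are hypothetical names, and the manual checks stand in for what a Pydantic model such as AddArgs would enforce.

```python
import json


def add(a: int, b: int) -> int:
    return a + b


# Hypothetical registry: tool name -> (callable, {argument: (min, max)}).
TOOLS = {"add": (add, {"a": (-1000, 1000), "b": (-1000, 1000)})}


def call_tool(name: str, args_json: str):
    """Validate JSON arguments against the registry, then call the tool."""
    func, bounds = TOOLS[name]
    args = json.loads(args_json)
    if not isinstance(args, dict) or set(args) != set(bounds):
        raise ValueError(f"expected exactly the arguments {sorted(bounds)}")
    for key, (low, high) in bounds.items():
        value = args[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{key}={value} is outside [{low}, {high}]")
    return func(**args)


result = call_tool("add", '{"a": 7, "b": 12}')
print(result)
```

Nothing reaches the function unless every argument passes validation, which is the property that makes this pattern safer than evaluating model output directly.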

    In conclusion, we implemented a fully structured generation pipeline using Outlines with strong typing, schema validation, and controlled decoding. We demonstrated how to move from simple typed outputs to advanced Pydantic-based extraction and function-style execution patterns. We also built resilience through JSON salvage and validation mechanisms, making the system robust against imperfect model outputs. Overall, we created a practical, production-oriented framework for safe, schema-driven LLM applications.

