How to Build Efficient Agentic Reasoning Systems by Dynamically Pruning Multiple Chain-of-Thought Paths Without Losing Accuracy

By Naveed Ahmad · 05/02/2026 · 3 Mins Read

**Dynamically Pruning Chain-of-Thought Paths in Efficient Agentic Reasoning Systems**

Hey there, fellow developers! Today, I'm excited to share a fascinating topic in the realm of artificial intelligence. We're going to dive into the world of agentic reasoning systems and explore how to build efficient models that dynamically prune multiple chain-of-thought paths without compromising accuracy.

    **What’s the Problem?**

As AI models get more complex and powerful, we face the challenge of keeping their inference cost under control. One key area to focus on is token consumption, which is a major contributor to compute cost and latency. In this tutorial, we'll demonstrate a framework that generates multiple reasoning paths in parallel and prunes them using consensus signals and early stopping.

    **The Framework**

    Our framework is composed of several key components:

1. **Multi-sample generation**: We use fast multi-sample generation to produce multiple reasoning paths in a single model call. This yields several continuations from a given prompt, which we store to inform downstream pruning decisions.
2. **Consensus strength calculation**: We construct a lightweight consensus mechanism using a similarity graph over generated reasoning paths. This lets us approximate agreement between reasoning trajectories without costly model calls.
3. **Early stopping**: We incorporate progressive sampling with early stopping to terminate generation as soon as enough confidence emerges.
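Before diving into the real implementation, the early-stopping idea in step 3 can be sketched in plain Python. This is a minimal illustration, not the tutorial's exact code: `sample_batch` is a hypothetical stand-in for a real model call, and the consensus test is simple majority agreement.

```python
from collections import Counter

def sample_batch(prompt, k):
    # Hypothetical stand-in for a real model call returning k answers;
    # here it just cycles through a fixed pool for illustration.
    pool = ["7", "7", "8", "7"]
    return [pool[i % len(pool)] for i in range(k)]

def progressive_sample(prompt, batch_size=2, max_samples=8, agree_frac=0.6):
    """Draw small batches and stop as soon as one answer dominates."""
    answers = []
    while len(answers) < max_samples:
        answers.extend(sample_batch(prompt, batch_size))
        top, count = Counter(answers).most_common(1)[0]
        if count / len(answers) >= agree_frac:
            # Early exit: consensus reached, no need to keep sampling.
            return top, len(answers)
    return Counter(answers).most_common(1)[0][0], len(answers)

print(progressive_sample("Problem: 3 + 4 = ?"))  # ('7', 2)
```

Because the first two stubbed samples already agree, the loop stops after one batch; with a real model, the same loop saves tokens whenever the easy questions converge quickly.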

    **Implementation**

    Let’s dive into the implementation details. We start by initializing the model and tokenizer:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
model.eval()
```
    We also define the core prompting template used throughout the tutorial:
```python
system = "You're a cautious problem solver. Keep reasoning brief and output a final numeric answer."
```
    Helper functions will be used to construct prompts, extract final numeric answers, and check correctness:
```python
import re

def make_prompt(q):
    return (
        f"{system}\n"
        f"Problem: {q}\n"
        f"Reasoning: (brief)\n"
        f"Final: "
    )

def parse_final_number(text):
    # Prefer the number immediately after "Final:", else fall back
    # to the last number anywhere in the text.
    m = re.search(r"Final:\s*([-]?\d+(?:\.\d+)?)", text)
    if m:
        return m.group(1).strip()
    nums = re.findall(r"[-]?\d+(?:\.\d+)?", text)
    return nums[-1] if nums else None

def is_correct(pred, gold):
    if pred is None:
        return 0
    try:
        return int(abs(float(pred) - float(gold)) < 1e-9)
    except (TypeError, ValueError):
        return int(str(pred).strip() == str(gold).strip())
```
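As a quick sanity check of the answer parser, here it is exercised on two shapes of model output (the function is repeated inline so the snippet runs standalone; the `Final:` capture group is one reasonable reading of the parser):

```python
import re

def parse_final_number(text):
    # Prefer the number right after "Final:", else the last number found.
    m = re.search(r"Final:\s*([-]?\d+(?:\.\d+)?)", text)
    if m:
        return m.group(1).strip()
    nums = re.findall(r"[-]?\d+(?:\.\d+)?", text)
    return nums[-1] if nums else None

print(parse_final_number("Reasoning: 3 + 4 = 7\nFinal: 7"))   # 7
print(parse_final_number("The total comes to 12.5 overall"))  # 12.5
```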
**Consensus Strength Calculation**

To calculate consensus strength, we construct a similarity graph over the generated reasoning paths. A path's strength is its weighted degree in that graph, i.e. how much the other paths agree with it (the edge weight here is a simple token-level Jaccard similarity, one reasonable choice of cheap metric):
```python
import networkx as nx

def consensus_strength(completions, sim_threshold=0.22):
    n = len(completions)
    if n <= 1:
        return [0.0] * n

    # Build the similarity graph: nodes are paths, edges connect paths
    # whose token-level Jaccard similarity clears the threshold.
    token_sets = [set(c.lower().split()) for c in completions]
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            union = token_sets[i] | token_sets[j]
            w = len(token_sets[i] & token_sets[j]) / len(union) if union else 0.0
            if w >= sim_threshold:
                G.add_edge(i, j, weight=w)

    # Strength = weighted degree: total agreement with the other paths.
    strength = [0.0] * n
    for u, v, d in G.edges(data=True):
        w = float(d["weight"])
        strength[u] += w
        strength[v] += w
    return strength
```
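The same idea can be checked without networkx. The toy version below (plain Python, hypothetical strings) computes the weighted degrees directly, so you can see that two near-identical paths reinforce each other while an unrelated outlier scores zero:

```python
def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def toy_consensus(paths, threshold=0.22):
    # Weighted degree of each path in the implicit similarity graph.
    strength = [0.0] * len(paths)
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            w = jaccard(paths[i], paths[j])
            if w >= threshold:
                strength[i] += w
                strength[j] += w
    return strength

paths = [
    "add 3 and 4 to get 7",
    "add 3 and 4 so the answer is 7",
    "the capital of France is Paris",
]
print(toy_consensus(paths))
```

The two arithmetic paths share most of their tokens and end up with equal positive strength; the off-topic path falls below the threshold against both and gets 0.0.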
    **Agentic Pruning Logic**

    We implement the core agentic pruning logic that groups reasoning paths by final answers and ranks them using consensus and efficiency indicators:
```python
def pick_final_answer(paths):
    solutions = [parse_final_number(p["completion"]) for p in paths]
    strengths = consensus_strength([p["completion"] for p in paths])

    # Group paths by final answer, accumulating votes, strength, tokens.
    groups = {}
    for i, a in enumerate(solutions):
        if a is None:
            continue
        groups.setdefault(a, {"idx": [], "strength": 0.0, "tokens": 0})
        groups[a]["idx"].append(i)
        groups[a]["strength"] += strengths[i]
        groups[a]["tokens"] += paths[i]["gen_tokens"]

    if not groups:
        return None, {"solutions": solutions, "strengths": strengths}

    # Rank by vote count, then total consensus strength, then negated
    # token cost, so cheaper groups win ties.
    ranked = sorted(
        groups.items(),
        key=lambda kv: (len(kv[1]["idx"]), kv[1]["strength"], -kv[1]["tokens"]),
        reverse=True,
    )

    best_answer = ranked[0][0]
    best_indices = ranked[0][1]["idx"]
    # Within the winning group, prefer the shortest, most-agreed-with path.
    best_i = sorted(best_indices, key=lambda i: (paths[i]["gen_tokens"], -strengths[i]))[0]

    return best_answer, {"solutions": solutions, "strengths": strengths, "best_i": best_i}
```
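To see the pruning decision end to end without loading a model, here is a self-contained toy run of the same group-and-rank logic. The inputs are hypothetical (pre-parsed answers and uniform-ish strengths stand in for the parser and the consensus graph), so the example isolates just the ranking step:

```python
def pick_answer_toy(paths, strengths):
    # Group paths by their final answer, accumulating votes, strength, tokens.
    groups = {}
    for i, p in enumerate(paths):
        a = p["answer"]
        groups.setdefault(a, {"idx": [], "strength": 0.0, "tokens": 0})
        groups[a]["idx"].append(i)
        groups[a]["strength"] += strengths[i]
        groups[a]["tokens"] += p["gen_tokens"]

    # Most votes first; break ties by strength, then by fewer tokens.
    ranked = sorted(
        groups.items(),
        key=lambda kv: (len(kv[1]["idx"]), kv[1]["strength"], -kv[1]["tokens"]),
        reverse=True,
    )
    best = ranked[0][0]
    idx = ranked[0][1]["idx"]
    # Cheapest path in the winning group is the one we keep.
    best_i = min(idx, key=lambda i: paths[i]["gen_tokens"])
    return best, best_i

paths = [
    {"answer": "7", "gen_tokens": 40},
    {"answer": "7", "gen_tokens": 25},
    {"answer": "8", "gen_tokens": 60},
]
print(pick_answer_toy(paths, [1.0, 1.0, 0.2]))  # ('7', 1)
```

The answer "7" wins with two votes, and within that group the 25-token path is kept, which is exactly the efficiency bias the ranking key encodes.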
    **Conclusion**

We've demonstrated how agentic pruning can significantly reduce token consumption without sacrificing accuracy by stopping reasoning as soon as enough consensus emerges. By combining self-consistency, similarity-based consensus graphs, and early-stop heuristics, we've created a scalable and efficient framework for reasoning in agentic models.

    Feel free to explore the full code and try out the framework on your own projects!
