A Coding Implementation for an Agentic AI Framework that Performs Literature Analysis, Hypothesis Generation, Experimental Planning, Simulation, and Scientific Reporting

In this tutorial, we build a complete scientific discovery agent step by step and see how each component works with the others to form a coherent research workflow. We begin by loading our literature corpus, setting up the retrieval and LLM modules, and then assembling agents that search papers, generate hypotheses, design experiments, and produce structured reports. Through the snippets below, we see how an agentic pipeline takes shape, letting us explore a scientific question from initial curiosity to a full analysis within a single, integrated system.

import subprocess
import sys


def install_deps():
    # Install the third-party packages imported below.
    # Note: return_tensors="pt" (used later) also needs PyTorch, which is
    # assumed to be preinstalled, as it is on hosted notebook runtimes.
    pkgs = ["transformers", "scikit-learn", "numpy"]
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q"] + pkgs)


try:
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    import numpy as np
except ImportError:
    # Install on first failure, then retry the same imports.
    install_deps()
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    import numpy as np


from dataclasses import dataclass
from typing import List, Dict, Any


np.random.seed(42)


LITERATURE = [
   {"id": "P1","title": "Self-Supervised Protein Language Models for Structure Prediction","field": "computational biology",
    "abstract": "We explore transformer-based protein language models trained on millions of sequences. The models learn residue-level embeddings that improve secondary structure prediction and stability estimation."},
   {"id": "P2","title": "CRISPR Off-Target Detection Using Deep Learning","field": "genome editing",
    "abstract": "We propose a convolutional neural network architecture for predicting CRISPR-Cas9 off-target effects directly from genomic sequences, achieving state-of-the-art accuracy on GUIDE-seq datasets."},
   {"id": "P3","title": "Foundation Models for Scientific Equation Discovery","field": "scientific ML",
    "abstract": "Large language models are combined with symbolic regression to recover governing equations from noisy experimental observations in physics and fluid dynamics."},
   {"id": "P4","title": "Active Learning for Materials Property Optimization","field": "materials science",
    "abstract": "We integrate Bayesian optimization with graph neural networks to actively select candidate materials that maximize target properties while reducing experimental cost."},
   {"id": "P5","title": "Graph-Based Retrieval for Cross-Domain Literature Review","field": "NLP for science",
    "abstract": "We construct a heterogeneous citation and concept graph over multi-domain scientific papers and show that graph-aware retrieval improves cross-domain literature exploration."},
]


corpus_texts = [p["abstract"] + " " + p["title"] for p in LITERATURE]
vectorizer = TfidfVectorizer(stop_words="english")
corpus_matrix = vectorizer.fit_transform(corpus_texts)


MODEL_NAME = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)


def generate_text(prompt: str, max_new_tokens: int = 256) -> str:
    # Tokenize the prompt; inputs longer than the tokenizer's limit are truncated.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    # Beam search with 4 beams, stopping once enough finished candidates exist.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

We lay the foundation for our scientific agent by loading libraries, preparing the literature corpus, and initializing our language model. We fit a TF-IDF vectorizer on each paper's abstract and title so that we can later retrieve relevant papers by similarity. With the model loaded and the data structured, we have the computational backbone for everything that follows.
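To make the TF-IDF step less of a black box, here is a minimal, dependency-free sketch of the weighting it performs. It assumes the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` and L2 normalization that scikit-learn documents as defaults. The `tokenize` and `tfidf_vectors` helpers are hypothetical names for illustration, the tokenizer is simplified, and stop words are not removed, so the numbers will not match `TfidfVectorizer(stop_words="english")` exactly.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric tokens of length >= 2 (simplified)."""
    return re.findall(r"[a-z0-9]{2,}", text.lower())


def tfidf_vectors(docs):
    """Return (idf, vectors): an IDF dict and one L2-normalized TF-IDF dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (assumed to mirror scikit-learn's documented default).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        weights = {t: tf[t] * idf[t] for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return idf, vectors


docs = [
    "protein language models for structure prediction",
    "deep learning for CRISPR off-target prediction",
    "graph retrieval for literature review",
]
idf, vecs = tfidf_vectors(docs)
# "for" occurs in every document, so it gets the minimum IDF;
# "protein" occurs in only one document, so it is weighted higher.
print(round(idf["for"], 3), round(idf["protein"], 3))  # prints: 1.0 1.693
```

Terms that appear everywhere contribute little to similarity, which is why rare, topic-specific words such as "CRISPR" dominate retrieval in a corpus this small.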

@dataclass
class PaperHit:
    paper: Dict[str, Any]
    score: float  # cosine similarity between the query and this paper


class LiteratureAgent:
    def __init__(self, vectorizer, corpus_matrix, papers: List[Dict[str, Any]]):
        self.vectorizer = vectorizer
        self.corpus_matrix = corpus_matrix
        self.papers = papers

    def search(self, query: str, ok: int = 3) -> List[PaperHit]:
        # `ok` is the number of top-ranked papers to return.
        # Project the query into the same TF-IDF space as the corpus.
        q_vec = self.vectorizer.transform([query])
        sims = cosine_similarity(q_vec, self.corpus_matrix)[0]
        # Sort by descending similarity and keep the first `ok` indices.
        idxs = np.argsort(-sims)[:ok]
        hits = [PaperHit(self.papers[i], float(sims[i])) for i in idxs]
        return hits

We implement the literature-search component of our agent. We project the user's query into the TF-IDF vector space and identify the most relevant papers using cosine similarity. This gives our system the ability to ground its reasoning in the closest-matching prior work.
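One property of ranking with `np.argsort(-sims)` and a fixed cutoff is that it always returns the requested number of papers, even when a query shares no terms with the corpus and every similarity is zero. The sketch below illustrates cosine ranking on toy sparse vectors and adds a `min_score` filter to drop such non-matches. The `cosine` and `top_k` helpers and the `min_score` parameter are assumptions made for illustration and are not part of the tutorial's code.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Vector, docs: List[Vector], k: int = 3,
          min_score: float = 0.0) -> List[Tuple[int, float]]:
    """Return up to k (doc_index, score) pairs with score > min_score, best first."""
    scored = sorted(
        ((cosine(query, d), i) for i, d in enumerate(docs)),
        key=lambda pair: -pair[0],  # stable sort: ties keep corpus order
    )
    return [(i, s) for s, i in scored[:k] if s > min_score]


docs = [
    {"protein": 1.0, "language": 1.0, "model": 1.0},
    {"crispr": 1.0, "off": 1.0, "target": 1.0},
    {"graph": 1.0, "retrieval": 1.0},
]
# The first two documents each share one term with the query;
# the third shares none and is filtered out.
hits = top_k({"crispr": 1.0, "protein": 1.0}, docs)
# A query with no overlapping terms yields an empty result instead of k zero-score hits.
no_hits = top_k({"thermodynamics": 1.0}, docs)
print(hits)
print(no_hits)
```

Whether to filter is a design choice: returning an empty list lets downstream steps detect that the corpus has nothing relevant, while always returning a fixed number of papers keeps the pipeline running at the cost of grounding the hypothesis in unrelated work.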

@dataclass
class ExperimentPlan:
    system: str
    speculation: str  # the hypothesis this experiment is designed to test
    variables: Dict[str, Any]
    protocol: List[str]


@dataclass
class ExperimentResult:
    plan: ExperimentPlan
    metrics: Dict[str, float]


class ExperimentAgent:
    def design_experiment(self, question: str, hypothesis: str, hits: List[PaperHit]) -> ExperimentPlan:
        # Use the field of the top-ranked paper, falling back to a generic label.
        top_field = hits[0].paper["field"] if hits else "computational science"
        protocol = [
            f"Construct dataset combining ideas from: {', '.join(h.paper['id'] for h in hits)}.",
            "Split data into train/validation/test.",
            "Compare baseline model vs. augmented model implementing the hypothesis.",
            "Evaluate using appropriate metrics and perform ablation analysis.",
        ]
        # These values are fixed placeholders, not derived from the question.
        variables = {
            "baseline_model": "sequence CNN",
            "augmented_model": "protein language model + CNN",
            "n_train_samples": 5000,
            "n_validation_samples": 1000,
            "metric": "AUROC",
        }
        system = f"{top_field} system related to: {question}"
        return ExperimentPlan(system=system, speculation=hypothesis, variables=variables, protocol=protocol)

    def run_experiment(self, plan: ExperimentPlan) -> ExperimentResult:
        # Synthetic metrics only: no model is trained or evaluated here.
        base = 0.78 + 0.02 * np.random.randn()
        gain = abs(0.05 + 0.01 * np.random.randn())
        metrics = {
            "baseline_AUROC": round(base, 3),
            "augmented_AUROC": round(base + gain, 3),
            "estimated_gain": round(gain, 3),
        }
        return ExperimentResult(plan=plan, metrics=metrics)

We design and simulate experiments based on the retrieved literature and the generated hypothesis. The design step fills in a fixed set of variables and a four-step protocol, and takes the system description from the field of the top-ranked paper. The run step does not train anything: it draws a baseline AUROC from a normal distribution centered at 0.78 and adds a non-negative gain centered at 0.05, producing synthetic metrics that stand in for a real evaluation. This lets us move from a theoretical idea to an actionable experimental plan.
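Because the metrics are random draws, a single run says little on its own. The following sketch repeats the simulation over many seeds to show the spread of the synthetic gain. It assumes the same distributions as `run_experiment` but uses the standard library's `random.Random` rather than NumPy's global generator, so individual values will differ from the tutorial's output; `simulate_metrics` is a hypothetical helper name.

```python
import random
import statistics


def simulate_metrics(rng: random.Random) -> dict:
    """Draw one set of synthetic AUROC metrics from a caller-supplied generator."""
    base = 0.78 + 0.02 * rng.gauss(0.0, 1.0)
    # abs() keeps the gain non-negative, so the augmented model can never lose.
    gain = abs(0.05 + 0.01 * rng.gauss(0.0, 1.0))
    return {
        "baseline_AUROC": round(base, 3),
        "augmented_AUROC": round(base + gain, 3),
        "estimated_gain": round(gain, 3),
    }


# One independent, reproducible run per seed.
runs = [simulate_metrics(random.Random(seed)) for seed in range(200)]
gains = [r["estimated_gain"] for r in runs]
print(f"mean gain {statistics.mean(gains):.3f}, stdev {statistics.stdev(gains):.3f}")
print(all(r["augmented_AUROC"] >= r["baseline_AUROC"] for r in runs))  # prints: True
```

Passing the generator in explicitly keeps each run reproducible without touching global state. The final check makes the built-in bias visible: the simulation always favors the hypothesis, so its output illustrates the reporting flow rather than providing evidence.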

class ReportAgent:
    def write_report(self, question: str, hits: List[PaperHit], plan: ExperimentPlan, result: ExperimentResult) -> str:
        # Render the retrieved papers and protocol steps as bullet lists.
        related_work = "\n".join(f"- {h.paper['title']} ({h.paper['field']})" for h in hits)
        protocol_str = "\n".join(f"- {step}" for step in plan.protocol)
        prompt = f"""
You are an AI research assistant writing a concise research-style report.


Research question:
{question}


Hypothesis:
{plan.speculation}


Relevant prior work:
{related_work}


Planned experiment:
System: {plan.system}
Variables: {plan.variables}
Protocol:
{protocol_str}


Simulated results:
{result.metrics}


Write a clear report with the following sections:
1. Background
2. Proposed Approach
3. Experimental Setup
4. Results and Discussion
5. Limitations and Future Work
"""
        return generate_text(prompt.strip(), max_new_tokens=320)

We generate a full research-style report using the LLM. We assemble the hypothesis, protocol, results, and related work into a single prompt that asks for five clearly defined sections. This turns the pipeline's raw outputs into a readable scientific summary.
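Since `generate_text` tokenizes with `truncation=True`, a prompt that exceeds the tokenizer's input limit is cut, by default from the right, and the section list at the end of the report prompt would be the first thing lost. The sketch below shows one possible way to guard against that: put the instructions first and cap the context. The `build_report_prompt` helper, the `max_context_chars` parameter, and the character budget (a crude stand-in for a token budget) are all assumptions for illustration, not part of the tutorial's code.

```python
from typing import Dict, List

SECTIONS = [
    "Background",
    "Proposed Approach",
    "Experimental Setup",
    "Results and Discussion",
    "Limitations and Future Work",
]


def bullets(items: List[str]) -> str:
    """Render items as a dash-prefixed list, one per line."""
    return "\n".join(f"- {item}" for item in items)


def build_report_prompt(question: str, hypothesis: str, titles: List[str],
                        protocol: List[str], metrics: Dict[str, float],
                        max_context_chars: int = 1500) -> str:
    """Place the instructions first, then a context block capped in length."""
    numbered = "\n".join(f"{i}. {name}" for i, name in enumerate(SECTIONS, start=1))
    instructions = "Write a clear report with the following sections:\n" + numbered
    context = "\n\n".join([
        f"Research question:\n{question}",
        f"Hypothesis:\n{hypothesis}",
        f"Relevant prior work:\n{bullets(titles)}",
        f"Protocol:\n{bullets(protocol)}",
        f"Simulated results:\n{metrics}",
    ])
    # Only the context is trimmed, so the section list always survives.
    return instructions + "\n\n" + context[:max_context_chars]


prompt = build_report_prompt(
    question="Do protein language model embeddings help off-target prediction?",
    hypothesis="Adding embeddings to a sequence CNN increases AUROC.",
    titles=["CRISPR Off-Target Detection Using Deep Learning"],
    protocol=["Split data into train/validation/test.", "Compare baseline vs. augmented model."],
    metrics={"baseline_AUROC": 0.78, "augmented_AUROC": 0.83},
)
print(prompt)
```

Counting tokens with the model's own tokenizer would be more precise than counting characters; the character cap is used here only to keep the example free of model downloads.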

class ScientificAgent:
    def __init__(self):
        self.lit_agent = LiteratureAgent(vectorizer, corpus_matrix, LITERATURE)
        self.exp_agent = ExperimentAgent()
        self.report_agent = ReportAgent()

    def propose_hypothesis(self, question: str, hits: List[PaperHit]) -> str:
        # Concatenate the retrieved abstracts as grounding context for the LLM.
        context = " ".join(h.paper["abstract"] for h in hits)
        prompt = f"""
You are an AI scientist. Given a research question and related abstracts,
propose a single, testable hypothesis in 2-3 sentences.


Research question:
{question}


Related abstracts:
{context}
"""
        return generate_text(prompt.strip(), max_new_tokens=96)

    def run_pipeline(self, question: str) -> str:
        # Each stage consumes the outputs of the stages before it.
        hits = self.lit_agent.search(question, ok=3)
        hypothesis = self.propose_hypothesis(question, hits)
        plan = self.exp_agent.design_experiment(question, hypothesis, hits)
        result = self.exp_agent.run_experiment(plan)
        report = self.report_agent.write_report(question, hits, plan, result)
        return report


if __name__ == "__main__":
    research_question = (
        "How can protein language model embeddings improve CRISPR off-target "
        "prediction compared to sequence-only CNN baselines?"
    )
    agent = ScientificAgent()
    final_report = agent.run_pipeline(research_question)
    print(final_report)

We orchestrate the complete pipeline: searching the literature, generating a hypothesis, designing the experiment, running the simulation, and writing the report. We then run the system on a concrete research question and observe the whole workflow in action. This step brings all the modules together into a unified scientific agent.
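The data flow between stages can be checked without downloading a model by passing the stages in as plain callables. This is a minimal sketch under that assumption; the free-standing `run_pipeline` function and the stub stages are hypothetical and only mirror the order of calls in `ScientificAgent.run_pipeline`.

```python
from typing import Any, Callable, Dict, List


def run_pipeline(question: str,
                 search: Callable[[str], List[Any]],
                 propose: Callable[[str, List[Any]], str],
                 design: Callable[[str, str, List[Any]], Any],
                 run: Callable[[Any], Any],
                 write: Callable[[str, List[Any], Any, Any], str]) -> Dict[str, Any]:
    """Run the five stages in order and return every intermediate artifact."""
    hits = search(question)
    hypothesis = propose(question, hits)
    plan = design(question, hypothesis, hits)
    result = run(plan)
    report = write(question, hits, plan, result)
    return {"hits": hits, "hypothesis": hypothesis, "plan": plan,
            "result": result, "report": report}


calls: List[str] = []  # records the order in which the stub stages are invoked


def stub_search(question):
    calls.append("search")
    return ["P1", "P2"]


def stub_propose(question, hits):
    calls.append("propose")
    return f"hypothesis grounded in {len(hits)} papers"


def stub_design(question, hypothesis, hits):
    calls.append("design")
    return {"hypothesis": hypothesis, "sources": hits}


def stub_run(plan):
    calls.append("run")
    return {"plan": plan, "metrics": {"estimated_gain": 0.05}}


def stub_write(question, hits, plan, result):
    calls.append("write")
    return f"Report on '{question}' with gain {result['metrics']['estimated_gain']}"


out = run_pipeline("toy question", stub_search, stub_propose, stub_design, stub_run, stub_write)
print(calls)  # prints: ['search', 'propose', 'design', 'run', 'write']
print(out["report"])
```

Returning the intermediate artifacts alongside the report makes it easier to inspect which papers and which hypothesis a given report was built from.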

In conclusion, we see how a compact codebase can grow into a functioning AI co-researcher capable of searching, reasoning, simulating, and summarizing. We understand how each snippet contributes to the full pipeline and how the agentic components build on one another when combined. We are also well placed to extend the agent with richer literature sources, more realistic models, and more sophisticated experimental logic, pushing our scientific exploration further with each iteration.


Asif Razzaq is the CEO of Marktechpost Media Inc. As an entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views.
