
Build an Autonomous Wet-Lab Protocol Planner and Validator Using Salesforce CodeGen for Agentic Experiment Design and Safety Optimization

By Naveed Ahmad | 07/11/2025


In this tutorial, we build a Wet-Lab Protocol Planner & Validator that acts as an intelligent agent for experimental design and execution. We design the system in Python and integrate Salesforce's CodeGen-350M-mono model for natural language reasoning. We structure the pipeline into modular components: ProtocolParser for extracting structured information, such as steps, durations, and temperatures, from textual protocols; InventoryManager for validating reagent availability and expiry; SchedulePlanner for generating timelines and parallelization; and SafetyValidator for identifying biosafety or chemical hazards. The LLM is then used to generate optimization suggestions, effectively closing the loop between perception, planning, validation, and refinement.

import re, json, pandas as pd
from datetime import datetime, timedelta
from collections import defaultdict
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch


MODEL_NAME = "Salesforce/codegen-350M-mono"
print("Loading CodeGen model (30 seconds)...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
)
print("✓ Model loaded!")

We begin by importing the essential libraries and loading the Salesforce CodeGen-350M-mono model locally for lightweight, API-free inference. We initialize both the tokenizer and model with float16 precision and automatic device mapping to ensure compatibility and speed on Colab GPUs.
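Before moving on, a quick sanity check (an optional addition, not part of the original notebook) confirms that the tokenizer and model are wired together correctly; the prompt string is arbitrary:

# Optional sanity check (assumption: not in the original tutorial) - run a tiny
# greedy generation to confirm the model loaded and device mapping works.
test_inputs = tokenizer("def add(a, b):", return_tensors="pt").to(model.device)
test_out = model.generate(**test_inputs, max_new_tokens=16, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(test_out[0], skip_special_tokens=True))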

class ProtocolParser:
    def read_protocol(self, text):
        steps = []
        lines = text.split('\n')
        for i, line in enumerate(lines, 1):
            step_match = re.search(r'^(\d+)\.\s+(.+)', line.strip())
            if step_match:
                num, name = step_match.groups()
                context = "\n".join(lines[i:min(i + 4, len(lines))])
                duration = self._extract_duration(context)
                temp = self._extract_temp(context)
                safety = self._check_safety(context)
                steps.append({
                    'step': int(num), 'name': name, 'duration_min': duration,
                    'temp': temp, 'safety': safety, 'line': i, 'details': context[:200]
                })
        return steps

    def _extract_duration(self, text):
        text = text.lower()
        if 'overnight' in text: return 720
        match = re.search(r'(\d+)\s*(?:hour|hr|h)(?:s)?(?!\w)', text)
        if match: return int(match.group(1)) * 60
        match = re.search(r'(\d+)\s*(?:min|minute)(?:s)?', text)
        if match: return int(match.group(1))
        match = re.search(r'(\d+)-(\d+)\s*(?:min|minute)', text)
        if match: return (int(match.group(1)) + int(match.group(2))) // 2
        return 30

    def _extract_temp(self, text):
        text = text.lower()
        if '4°c' in text or '4 °c' in text or '4°' in text: return '4C'
        if '37°c' in text or '37 °c' in text: return '37C'
        if '-20°c' in text or '-80°c' in text: return 'FREEZER'
        if 'room temp' in text or 'rt' in text or 'ambient' in text: return 'RT'
        return 'RT'

    def _check_safety(self, text):
        flags = []
        text_lower = text.lower()
        if re.search(r'bsl-[23]|biosafety', text_lower): flags.append('BSL-2/3')
        if re.search(r'caution|corrosive|hazard|toxic', text_lower): flags.append('HAZARD')
        if 'sharp' in text_lower or 'needle' in text_lower: flags.append('SHARPS')
        if 'dark' in text_lower or 'light-sensitive' in text_lower: flags.append('LIGHT-SENSITIVE')
        if 'flammable' in text_lower: flags.append('FLAMMABLE')
        return flags
    
    
class InventoryManager:
    def __init__(self, csv_text):
        from io import StringIO
        self.df = pd.read_csv(StringIO(csv_text))
        self.df['expiry'] = pd.to_datetime(self.df['expiry'])

    def check_availability(self, reagent_list):
        issues = []
        for reagent in reagent_list:
            reagent_clean = reagent.lower().replace('_', ' ').replace('-', ' ')
            matches = self.df[self.df['reagent'].str.lower().str.contains(
                '|'.join(reagent_clean.split()[:2]), na=False, regex=True
            )]
            if matches.empty:
                issues.append(f"❌ {reagent}: NOT IN INVENTORY")
            else:
                row = matches.iloc[0]
                if row['expiry'] < datetime.now():
                    issues.append(f"⚠️  {reagent}: EXPIRED on {row['expiry'].date()} (lot {row['lot']})")
                elif (row['expiry'] - datetime.now()).days < 30:
                    issues.append(f"⚠️  {reagent}: Expires soon ({row['expiry'].date()}, lot {row['lot']})")
                if row['quantity'] < 10:
                    issues.append(f"⚠️  {reagent}: LOW STOCK ({row['quantity']} {row['unit']} remaining)")
        return issues

    def extract_reagents(self, protocol_text):
        reagents = set()
        patterns = [
            r'\b([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)\s+(?:antibody|buffer|solution)',
            r'\b([A-Z]{2,}(?:-[A-Z0-9]+)?)\b',
            r'(?:add|use|prepare|dilute)\s+([a-z-]+\s*(?:antibody|buffer|substrate|solution))',
        ]
        for pattern in patterns:
            matches = re.findall(pattern, protocol_text, re.IGNORECASE)
            reagents.update(m.strip() for m in matches if len(m) > 2)
        return list(reagents)[:15]

We define the ProtocolParser and InventoryManager classes to extract structured experimental details and verify reagent inventory. We parse each protocol step for duration, temperature, and safety markers, while the inventory manager validates stock levels, expiry dates, and reagent availability through fuzzy matching.
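To see the two classes in isolation, here is a minimal sketch (toy protocol and inventory strings of our own, not from the original tutorial) that parses a single step and checks one reagent:

# Minimal sketch (assumption: toy data, not from the original tutorial) showing
# how the parser and inventory manager are meant to be used together.
toy_protocol = "1. Blocking step\n  - Add blocking buffer\n  - Incubate 1 hour at room temperature"
toy_inventory = "reagent,quantity,unit,expiry,lot\nblocking buffer,5,mL,2024-01-01,BB001"
toy_steps = ProtocolParser().read_protocol(toy_protocol)
toy_inv = InventoryManager(toy_inventory)
print(toy_steps[0]['duration_min'], toy_steps[0]['temp'])   # expected: 60 RT
print(toy_inv.check_availability(['blocking buffer']))      # expired + low-stock warnings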

class SchedulePlanner:
    def make_schedule(self, steps, start_time="09:00"):
        schedule = []
        current = datetime.strptime(f"2025-01-01 {start_time}", "%Y-%m-%d %H:%M")
        day = 1
        for step in steps:
            end = current + timedelta(minutes=step['duration_min'])
            if step['duration_min'] > 480:
                day += 1
                current = datetime.strptime(f"2025-01-0{day} 09:00", "%Y-%m-%d %H:%M")
                end = current
            schedule.append({
                'step': step['step'], 'name': step['name'][:40],
                'start': current.strftime("%H:%M"), 'end': end.strftime("%H:%M"),
                'duration': step['duration_min'], 'temp': step['temp'],
                'day': day, 'can_parallelize': step['duration_min'] > 60,
                'safety': ', '.join(step['safety']) if step['safety'] else 'None'
            })
            if step['duration_min'] <= 480:
                current = end
        return schedule

    def optimize_parallelization(self, schedule):
        parallel_groups = []
        idle_time = 0
        for i, step in enumerate(schedule):
            if step['can_parallelize'] and i + 1 < len(schedule):
                next_step = schedule[i + 1]
                if step['temp'] == next_step['temp']:
                    saved = min(step['duration'], next_step['duration'])
                    parallel_groups.append(
                        f"✨ Steps {step['step']} & {next_step['step']} can overlap → Save {saved} min"
                    )
                    idle_time += saved
        return parallel_groups, idle_time
    
    
class SafetyValidator:
    RULES = {
        'ph_range': (5.0, 11.0),
        'temp_limits': {'4C': (2, 8), '37C': (35, 39), 'RT': (20, 25)},
        'max_concurrent_instruments': 3,
    }

    def validate(self, steps):
        risks = []
        for step in steps:
            ph_match = re.search(r'ph\s*(\d+\.?\d*)', step['details'].lower())
            if ph_match:
                ph = float(ph_match.group(1))
                if not (self.RULES['ph_range'][0] <= ph <= self.RULES['ph_range'][1]):
                    risks.append(f"⚠️  Step {step['step']}: pH {ph} OUT OF SAFE RANGE")
            if 'BSL-2/3' in step['safety']:
                risks.append(f"🛡️  Step {step['step']}: BSL-2 cabinet REQUIRED")
            if 'HAZARD' in step['safety']:
                risks.append(f"🧤 Step {step['step']}: Full PPE + chemical hood REQUIRED")
            if 'SHARPS' in step['safety']:
                risks.append(f"💉 Step {step['step']}: Sharps container + needle safety")
            if 'LIGHT-SENSITIVE' in step['safety']:
                risks.append(f"🌑 Step {step['step']}: Work in dark/amber tubes")
        return risks

We implement the SchedulePlanner and SafetyValidator to design efficient experiment timelines and enforce lab safety standards. We dynamically generate daily schedules, identify parallelizable steps, and validate potential risks, such as unsafe pH levels, hazardous chemicals, or biosafety-level requirements.
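As a quick illustration (hand-built step dictionaries, not part of the original run), the planner and validator can be exercised on their own before wiring them into the agent loop:

# Minimal sketch (assumption: hand-built steps, not from the original tutorial)
# exercising the planner and validator in isolation.
toy = [
    {'step': 1, 'name': 'Coating', 'duration_min': 720, 'temp': '4C',
     'safety': ['BSL-2/3'], 'line': 1, 'details': 'coating buffer (pH 9.6)'},
    {'step': 2, 'name': 'Blocking', 'duration_min': 60, 'temp': 'RT',
     'safety': [], 'line': 5, 'details': 'blocking buffer'},
]
print(SchedulePlanner().make_schedule(toy)[0])   # overnight step pushed to Day 2
print(SafetyValidator().validate(toy))           # BSL-2 cabinet requirement flagged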

def llm_call(prompt, max_tokens=200):
    try:
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
        outputs = model.generate(
            **inputs, max_new_tokens=max_tokens, do_sample=True,
            temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id
        )
        return tokenizer.decode(outputs[0], skip_special_tokens=True)[len(prompt):].strip()
    except Exception:
        return "Batch similar temperature steps together. Pre-warm instruments."


def agent_loop(protocol_text, inventory_csv, start_time="09:00"):
    print("\n🔬 AGENT STARTING PROTOCOL ANALYSIS...\n")
    parser = ProtocolParser()
    steps = parser.read_protocol(protocol_text)
    print(f"📄 Parsed {len(steps)} protocol steps")
    inventory = InventoryManager(inventory_csv)
    reagents = inventory.extract_reagents(protocol_text)
    print(f"🧪 Identified {len(reagents)} reagents: {', '.join(reagents[:5])}...")
    inv_issues = inventory.check_availability(reagents)
    validator = SafetyValidator()
    safety_risks = validator.validate(steps)
    planner = SchedulePlanner()
    schedule = planner.make_schedule(steps, start_time)
    parallel_opts, time_saved = planner.optimize_parallelization(schedule)
    total_time = sum(s['duration'] for s in schedule)
    optimized_time = total_time - time_saved
    opt_prompt = f"Protocol has {len(steps)} steps, {total_time} min total. Key bottleneck optimization:"
    optimization = llm_call(opt_prompt, max_tokens=80)
    return {
        'steps': steps, 'schedule': schedule, 'inventory_issues': inv_issues,
        'safety_risks': safety_risks, 'parallelization': parallel_opts,
        'time_saved': time_saved, 'total_time': total_time,
        'optimized_time': optimized_time, 'ai_optimization': optimization,
        'reagents': reagents
    }

We assemble the agent loop, integrating perception, planning, validation, and revision into a single, coherent flow. We use CodeGen for reasoning-based optimization to refine step sequencing and suggest practical improvements for efficiency and parallel execution.
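If you want to inspect the raw LLM output outside the loop, llm_call can be invoked directly; the prompt below uses made-up step counts and durations purely for illustration:

# Minimal sketch (assumption: toy prompt values, not from the original tutorial) -
# call the LLM helper on its own to inspect the raw suggestion text.
suggestion = llm_call("Protocol has 6 steps, 1100 min total. Key bottleneck optimization:", max_tokens=60)
print(suggestion)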

def generate_checklist(results):
    md = "# 🔬 WET-LAB PROTOCOL CHECKLIST\n\n"
    md += f"**Total Steps:** {len(results['schedule'])}\n"
    md += f"**Estimated Time:** {results['total_time']} min ({results['total_time']//60}h {results['total_time']%60}m)\n"
    md += f"**Optimized Time:** {results['optimized_time']} min (save {results['time_saved']} min)\n\n"
    md += "## ⏱️ TIMELINE\n"
    current_day = 1
    for item in results['schedule']:
        if item['day'] > current_day:
            md += f"\n### Day {item['day']}\n"
            current_day = item['day']
        parallel = " 🔄" if item['can_parallelize'] else ""
        md += f"- [ ] **{item['start']}-{item['end']}** | Step {item['step']}: {item['name']} ({item['temp']}){parallel}\n"
    md += "\n## 🧪 REAGENT PICK-LIST\n"
    for reagent in results['reagents']:
        md += f"- [ ] {reagent}\n"
    md += "\n## ⚠️ SAFETY & INVENTORY ALERTS\n"
    all_issues = results['safety_risks'] + results['inventory_issues']
    if all_issues:
        for risk in all_issues:
            md += f"- {risk}\n"
    else:
        md += "- ✅ No critical issues detected\n"
    md += "\n## ✨ OPTIMIZATION TIPS\n"
    for tip in results['parallelization']:
        md += f"- {tip}\n"
    md += f"- 💡 AI Suggestion: {results['ai_optimization']}\n"
    return md


def generate_gantt_csv(schedule):
    df = pd.DataFrame(schedule)
    return df.to_csv(index=False)

We create output generators that transform the results into human-readable Markdown checklists and Gantt-compatible CSVs. We ensure that every execution produces clear summaries of reagents, time savings, and safety or inventory alerts for streamlined lab operations.
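If you prefer files over console prints (an optional addition, not in the original tutorial), both generators can be written straight to disk, e.g., for download from Colab; the helper name and filenames below are our own:

# Optional sketch (assumption: arbitrary filenames, not from the original tutorial) -
# persist the checklist and Gantt table so they can be downloaded or versioned.
def save_outputs(results, checklist_path="protocol_checklist.md", gantt_path="protocol_gantt.csv"):
    with open(checklist_path, "w", encoding="utf-8") as f:
        f.write(generate_checklist(results))
    with open(gantt_path, "w", encoding="utf-8") as f:
        f.write(generate_gantt_csv(results['schedule']))
    return checklist_path, gantt_path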

SAMPLE_PROTOCOL = """ELISA Protocol for Cytokine Detection


1. Coating (Day 1, 4°C overnight)
  - Dilute capture antibody to 2 μg/mL in coating buffer (pH 9.6)
  - Add 100 μL per well to 96-well plate
  - Incubate at 4°C overnight (12-16 hours)
  - BSL-2 cabinet required


2. Blocking (Day 2)
  - Wash plate 3× with PBS-T (200 μL/well)
  - Add 200 μL blocking buffer (1% BSA in PBS)
  - Incubate 1 hour at room temperature


3. Sample Incubation
  - Wash 3× with PBS-T
  - Add 100 μL diluted samples/standards
  - Incubate 2 hours at room temperature


4. Detection Antibody
  - Wash 5× with PBS-T
  - Add 100 μL biotinylated detection antibody (0.5 μg/mL)
  - Incubate 1 hour at room temperature


5. Streptavidin-HRP
  - Wash 5× with PBS-T
  - Add 100 μL streptavidin-HRP (1:1000 dilution)
  - Incubate 30 minutes at room temperature
  - Work in dark


6. Development
  - Wash 7× with PBS-T
  - Add 100 μL TMB substrate
  - Incubate 10-15 minutes (monitor color development)
  - Add 50 μL stop solution (2M H2SO4) - CAUTION: corrosive
"""


SAMPLE_INVENTORY = """reagent,quantity,unit,expiry,lot
capture antibody,500,μg,2025-12-31,AB123
blocking buffer,500,mL,2025-11-30,BB456
PBS-T,1000,mL,2026-01-15,PT789
detection antibody,8,μg,2025-10-15,DA321
streptavidin HRP,10,mL,2025-12-01,SH654
TMB substrate,100,mL,2025-11-20,TM987
stop solution,250,mL,2026-03-01,SS147
BSA,100,g,2024-09-30,BS741"""


results = agent_loop(SAMPLE_PROTOCOL, SAMPLE_INVENTORY, start_time="09:00")
print("\n" + "="*70)
print(generate_checklist(results))
print("\n" + "="*70)
print("\n📊 GANTT CSV (first 400 chars):\n")
print(generate_gantt_csv(results['schedule'])[:400])
print("\n🎯 Time Savings:", f"{results['time_saved']} minutes via parallelization")

We conduct a comprehensive test run using a sample ELISA protocol and a reagent inventory dataset. We visualize the agent's outputs, optimized schedule, parallelization gains, and AI-suggested improvements, demonstrating how our planner functions as a self-contained, intelligent lab assistant.

Finally, we demonstrated how agentic AI principles can improve reproducibility and safety in wet-lab workflows. By parsing free-form experimental text into structured, actionable plans, we automated protocol validation, reagent management, and temporal optimization in a single pipeline. The integration of CodeGen enables on-device reasoning about bottlenecks and safety conditions, allowing for self-contained, data-secure operations. We concluded with a fully functional planner that generates Gantt-compatible schedules, Markdown checklists, and AI-driven optimization suggestions, establishing a strong foundation for autonomous laboratory planning systems.

