
    How to Design Production-Grade Mock Data Pipelines Using Polyfactory with Dataclasses, Pydantic, Attrs, and Nested Models

    By Naveed Ahmad | 08/02/2026 | Updated: 08/02/2026 | 9 Mins Read


    In this tutorial, we walk through an advanced, end-to-end exploration of Polyfactory, focusing on how we can generate rich, realistic mock data directly from Python type hints. We start by setting up the environment and progressively build factories for dataclasses, Pydantic models, and attrs-based classes, while demonstrating customization, overrides, calculated fields, and the generation of nested objects. As we move through each snippet, we show how we can control randomness, enforce constraints, and model real-world structures, making this tutorial directly applicable to testing, prototyping, and data-driven development workflows.

    import subprocess
    import sys


    def install_package(package):
        # Install quietly with the same interpreter that runs this script.
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])


    packages = [
        "polyfactory",
        "pydantic",
        "email-validator",
        "faker",
        "msgspec",
        "attrs",
    ]

    for package in packages:
        try:
            install_package(package)
            print(f"✓ Installed {package}")
        except Exception as e:
            print(f"✗ Failed to install {package}: {e}")

    print("\n")
    
    
    print("=" * 80)
    print("SECTION 2: Basic Dataclass Factories")
    print("=" * 80)

    from dataclasses import dataclass
    from typing import List, Optional
    from datetime import datetime, date
    from uuid import UUID
    from polyfactory.factories import DataclassFactory


    @dataclass
    class Address:
        street: str
        city: str
        country: str
        zip_code: str


    @dataclass
    class Person:
        id: UUID
        name: str
        email: str
        age: int
        birth_date: date
        is_active: bool
        address: Address
        phone_numbers: List[str]
        bio: Optional[str] = None


    # No custom logic: every field is generated from its type hint.
    class PersonFactory(DataclassFactory[Person]):
        pass


    person = PersonFactory.build()
    print("Generated Person:")
    print(f"  ID: {person.id}")
    print(f"  Name: {person.name}")
    print(f"  Email: {person.email}")
    print(f"  Age: {person.age}")
    print(f"  Address: {person.address.city}, {person.address.country}")
    print(f"  Phone Numbers: {person.phone_numbers[:2]}")
    print()

    people = PersonFactory.batch(5)
    print(f"Generated {len(people)} people:")
    for i, p in enumerate(people, 1):
        print(f"  {i}. {p.name} - {p.email}")
    print("\n")

    We set up the environment and ensure all required dependencies are installed. We also introduce the core idea of using Polyfactory to generate mock data from type hints. By initializing the basic dataclass factories, we establish the foundation for all subsequent examples.
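
    To make the idea of generating data from type hints concrete, here is a minimal standard-library sketch of the mechanism. It is not how Polyfactory is implemented: the helpers `fake_value` and `build` and the `City` and `Traveller` classes are hypothetical, and only a handful of types are handled. The sketch reads a dataclass's resolved type hints, picks a generator per type, and recurses into nested dataclasses and lists.

```python
import dataclasses
import random
import string
import typing
from dataclasses import dataclass
from typing import List

rng = random.Random(0)  # seeded so the sketch is repeatable


def fake_value(tp):
    """Return a random value for a small set of supported type hints."""
    if typing.get_origin(tp) is list:
        (item_type,) = typing.get_args(tp)
        return [fake_value(item_type) for _ in range(rng.randint(1, 3))]
    if dataclasses.is_dataclass(tp):
        return build(tp)  # nested model: recurse
    if tp is bool:
        return rng.random() < 0.5
    if tp is int:
        return rng.randint(0, 100)
    if tp is str:
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    raise TypeError(f"unsupported type hint: {tp!r}")


def build(model, **overrides):
    """Instantiate a dataclass, generating every field that is not overridden."""
    hints = typing.get_type_hints(model)
    values = {
        f.name: fake_value(hints[f.name])
        for f in dataclasses.fields(model)
        if f.name not in overrides
    }
    return model(**values, **overrides)


@dataclass
class City:
    name: str
    population: int


@dataclass
class Traveller:
    name: str
    is_active: bool
    home: City
    visited: List[City]


traveller = build(Traveller, name="Alice")
print(traveller)
```

    The explicit `name="Alice"` is kept as given while every other field, including the nested `City` objects, is filled in from its annotation.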

    print("=" * 80)
    print("SECTION 3: Customizing Factory Behavior")
    print("=" * 80)

    from faker import Faker
    from polyfactory.fields import Use, Ignore


    @dataclass
    class Employee:
        employee_id: str
        full_name: str
        department: str
        salary: float
        hire_date: date
        is_manager: bool
        email: str
        internal_notes: Optional[str] = None


    class EmployeeFactory(DataclassFactory[Employee]):
        __faker__ = Faker(locale="en_US")
        __random_seed__ = 42  # fixed seed for repeatable output

        # A classmethod named after a field supplies that field's value.
        @classmethod
        def employee_id(cls) -> str:
            return f"EMP-{cls.__random__.randint(10000, 99999)}"

        @classmethod
        def full_name(cls) -> str:
            return cls.__faker__.name()

        @classmethod
        def department(cls) -> str:
            departments = ["Engineering", "Marketing", "Sales", "HR", "Finance"]
            return cls.__random__.choice(departments)

        @classmethod
        def salary(cls) -> float:
            return round(cls.__random__.uniform(50000, 150000), 2)

        @classmethod
        def email(cls) -> str:
            return cls.__faker__.company_email()


    employees = EmployeeFactory.batch(3)
    print("Generated Employees:")
    for emp in employees:
        print(f"  {emp.employee_id}: {emp.full_name}")
        print(f"    Department: {emp.department}")
        print(f"    Salary: ${emp.salary:,.2f}")
        print(f"    Email: {emp.email}")
        print()
    print()
    
    
    print("=" * 80)
    print("SECTION 4: Field Constraints and Calculated Fields")
    print("=" * 80)


    @dataclass
    class Product:
        product_id: str
        name: str
        description: str
        price: float
        discount_percentage: float
        stock_quantity: int
        final_price: Optional[float] = None
        sku: Optional[str] = None


    class ProductFactory(DataclassFactory[Product]):
        @classmethod
        def product_id(cls) -> str:
            return f"PROD-{cls.__random__.randint(1000, 9999)}"

        @classmethod
        def name(cls) -> str:
            adjectives = ["Premium", "Deluxe", "Classic", "Modern", "Eco"]
            nouns = ["Widget", "Gadget", "Device", "Tool", "Appliance"]
            return f"{cls.__random__.choice(adjectives)} {cls.__random__.choice(nouns)}"

        @classmethod
        def price(cls) -> float:
            return round(cls.__random__.uniform(10.0, 1000.0), 2)

        @classmethod
        def discount_percentage(cls) -> float:
            return round(cls.__random__.uniform(0, 30), 2)

        @classmethod
        def stock_quantity(cls) -> int:
            return cls.__random__.randint(0, 500)

        @classmethod
        def build(cls, **kwargs):
            instance = super().build(**kwargs)
            # Polyfactory can populate Optional fields with random values, so
            # derive them unless the caller passed an explicit override.
            if "final_price" not in kwargs:
                instance.final_price = round(
                    instance.price * (1 - instance.discount_percentage / 100), 2
                )
            if "sku" not in kwargs:
                name_part = instance.name.replace(" ", "-").upper()[:10]
                instance.sku = f"{instance.product_id}-{name_part}"
            return instance


    products = ProductFactory.batch(3)
    print("Generated Products:")
    for prod in products:
        print(f"  {prod.sku}")
        print(f"    Name: {prod.name}")
        print(f"    Price: ${prod.price:.2f}")
        print(f"    Discount: {prod.discount_percentage}%")
        print(f"    Final Price: ${prod.final_price:.2f}")
        print(f"    Stock: {prod.stock_quantity} units")
        print()
    print()

    We focus on generating simple but realistic mock data using dataclasses and default Polyfactory behavior. We show how to quickly create single instances and batches without writing any custom logic. It helps us validate how Polyfactory automatically interprets type hints to populate nested structures.
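
    The `__random_seed__ = 42` setting on `EmployeeFactory` above is what keeps generated data repeatable between runs. The sketch below illustrates the underlying principle with only the standard library's `random.Random`. The helpers `make_employee_id` and `make_batch` are hypothetical and simply mirror the `EMP-` format used earlier; they are not part of Polyfactory.

```python
import random


def make_employee_id(rng: random.Random) -> str:
    """Hypothetical helper mirroring the EMP-##### format used above."""
    return f"EMP-{rng.randint(10000, 99999)}"


def make_batch(seed: int, size: int) -> list:
    # A private generator keeps the global random state untouched.
    rng = random.Random(seed)
    return [make_employee_id(rng) for _ in range(size)]


first_run = make_batch(seed=42, size=3)
second_run = make_batch(seed=42, size=3)
other_seed = make_batch(seed=7, size=3)

print(first_run == second_run)  # identical seeds give identical data
print(first_run)
print(other_seed)
```

    Pinning the seed this way means a failing test can be rerun against exactly the same generated records.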

    print("=" * 80)
    print("SECTION 6: Complex Nested Structures")
    print("=" * 80)

    from enum import Enum


    class OrderStatus(str, Enum):
        PENDING = "pending"
        PROCESSING = "processing"
        SHIPPED = "shipped"
        DELIVERED = "delivered"
        CANCELLED = "cancelled"


    @dataclass
    class OrderItem:
        product_name: str
        quantity: int
        unit_price: float
        total_price: Optional[float] = None


    @dataclass
    class ShippingInfo:
        carrier: str
        tracking_number: str
        estimated_delivery: date


    @dataclass
    class Order:
        order_id: str
        customer_name: str
        customer_email: str
        status: OrderStatus
        items: List[OrderItem]
        order_date: datetime
        shipping_info: Optional[ShippingInfo] = None
        total_amount: Optional[float] = None
        notes: Optional[str] = None


    class OrderItemFactory(DataclassFactory[OrderItem]):
        @classmethod
        def product_name(cls) -> str:
            products = ["Laptop", "Mouse", "Keyboard", "Monitor", "Headphones",
                        "Webcam", "USB Cable", "Phone Case", "Charger", "Tablet"]
            return cls.__random__.choice(products)

        @classmethod
        def quantity(cls) -> int:
            return cls.__random__.randint(1, 5)

        @classmethod
        def unit_price(cls) -> float:
            return round(cls.__random__.uniform(5.0, 500.0), 2)

        @classmethod
        def build(cls, **kwargs):
            instance = super().build(**kwargs)
            # Derive the line total unless the caller supplied one explicitly.
            if "total_price" not in kwargs:
                instance.total_price = round(instance.quantity * instance.unit_price, 2)
            return instance


    class ShippingInfoFactory(DataclassFactory[ShippingInfo]):
        @classmethod
        def carrier(cls) -> str:
            carriers = ["FedEx", "UPS", "DHL", "USPS"]
            return cls.__random__.choice(carriers)

        @classmethod
        def tracking_number(cls) -> str:
            return ''.join(cls.__random__.choices('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ', k=12))


    class OrderFactory(DataclassFactory[Order]):
        @classmethod
        def order_id(cls) -> str:
            return f"ORD-{datetime.now().year}-{cls.__random__.randint(100000, 999999)}"

        @classmethod
        def items(cls) -> List[OrderItem]:
            return OrderItemFactory.batch(cls.__random__.randint(1, 5))

        @classmethod
        def build(cls, **kwargs):
            instance = super().build(**kwargs)
            # Derive the order total from its items unless explicitly overridden.
            if "total_amount" not in kwargs:
                instance.total_amount = round(sum(item.total_price for item in instance.items), 2)
            # Shipped and delivered orders should always carry shipping details.
            if instance.shipping_info is None and instance.status in [OrderStatus.SHIPPED, OrderStatus.DELIVERED]:
                instance.shipping_info = ShippingInfoFactory.build()
            return instance


    orders = OrderFactory.batch(2)
    print("Generated Orders:")
    for order in orders:
        print(f"\n  Order {order.order_id}")
        print(f"    Customer: {order.customer_name} ({order.customer_email})")
        print(f"    Status: {order.status.value}")
        print(f"    Items ({len(order.items)}):")
        for item in order.items:
            print(f"      - {item.quantity}x {item.product_name} @ ${item.unit_price:.2f} = ${item.total_price:.2f}")
        print(f"    Total: ${order.total_amount:.2f}")
        if order.shipping_info:
            print(f"    Shipping: {order.shipping_info.carrier} - {order.shipping_info.tracking_number}")
    print("\n")

    We build more complex domain logic by introducing calculated and dependent fields inside factories. We show how we can derive values such as final prices, totals, and shipping details after object creation. This allows us to model realistic business rules directly inside our test data generators.
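
    Because derived values follow simple arithmetic, they are easy to check in isolation. The following standard-library sketch restates the discount and order-total rules from the factories above as plain functions and runs them on hand-picked numbers. The names `Line`, `final_price`, and `order_total` are hypothetical and are not part of Polyfactory or of the tutorial's own classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Line:
    """Hypothetical stand-in for an order line: quantity times unit price."""
    quantity: int
    unit_price: float

    @property
    def total_price(self) -> float:
        return round(self.quantity * self.unit_price, 2)


def final_price(price: float, discount_percentage: float) -> float:
    """Apply a percentage discount and round to cents."""
    return round(price * (1 - discount_percentage / 100), 2)


def order_total(lines: List[Line]) -> float:
    """Sum the per-line totals and round to cents."""
    return round(sum(line.total_price for line in lines), 2)


lines = [Line(quantity=2, unit_price=19.99), Line(quantity=1, unit_price=5.0)]
print(final_price(200.0, 15.0))
print(order_total(lines))
```

    Keeping the rule in one small function makes it straightforward to assert the same invariant against every object a factory produces.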

    print("=" * 80)
    print("SECTION 7: Attrs Integration")
    print("=" * 80)

    import attrs
    from polyfactory.factories.attrs_factory import AttrsFactory


    @attrs.define
    class BlogPost:
        title: str
        author: str
        content: str
        views: int = 0
        likes: int = 0
        published: bool = False
        published_at: Optional[datetime] = None
        tags: List[str] = attrs.field(factory=list)


    class BlogPostFactory(AttrsFactory[BlogPost]):
        @classmethod
        def title(cls) -> str:
            templates = [
                "10 Tips for {}",
                "Understanding {}",
                "The Complete Guide to {}",
                "Why {} Matters",
                "Getting Started with {}"
            ]
            topics = ["Python", "Data Science", "Machine Learning", "Web Development", "DevOps"]
            template = cls.__random__.choice(templates)
            topic = cls.__random__.choice(topics)
            return template.format(topic)

        @classmethod
        def content(cls) -> str:
            # Reuse the factory's Faker instance instead of creating a new one per call.
            return " ".join(cls.__faker__.sentences(nb=cls.__random__.randint(3, 8)))

        @classmethod
        def views(cls) -> int:
            return cls.__random__.randint(0, 10000)

        @classmethod
        def likes(cls) -> int:
            return cls.__random__.randint(0, 1000)

        @classmethod
        def tags(cls) -> List[str]:
            all_tags = ["python", "tutorial", "beginner", "advanced", "guide",
                        "tips", "best-practices", "2024"]
            return cls.__random__.sample(all_tags, k=cls.__random__.randint(2, 5))


    posts = BlogPostFactory.batch(3)
    print("Generated Blog Posts:")
    for post in posts:
        print(f"\n  '{post.title}'")
        print(f"    Author: {post.author}")
        print(f"    Views: {post.views:,} | Likes: {post.likes:,}")
        print(f"    Published: {post.published}")
        print(f"    Tags: {', '.join(post.tags)}")
        print(f"    Preview: {post.content[:100]}...")
    print("\n")
    
    
    print("=" * 80)
    print("SECTION 8: Building with Specific Overrides")
    print("=" * 80)

    # Keyword arguments pin specific fields; the rest stay randomly generated.
    custom_person = PersonFactory.build(
        name="Alice Johnson",
        age=30,
        email="[email protected]"
    )
    print("Custom Person:")
    print(f"  Name: {custom_person.name}")
    print(f"  Age: {custom_person.age}")
    print(f"  Email: {custom_person.email}")
    print(f"  ID (auto-generated): {custom_person.id}")
    print()

    vip_customers = PersonFactory.batch(
        3,
        bio="VIP Customer"
    )
    print("VIP Customers:")
    for customer in vip_customers:
        print(f"  {customer.name}: {customer.bio}")
    print("\n")

    We extend Polyfactory usage to validated Pydantic models and attrs-based classes. We demonstrate how we can respect field constraints, validators, and default behaviors while still generating valid data at scale. It ensures our mock data remains compatible with real application schemas.
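
    As a rough illustration of what generating data that respects constraints means, the sketch below draws values only from inside declared bounds, so validation never rejects them. It uses the standard library alone. The `CONSTRAINTS` table, the `Account` class, and `build_account` are hypothetical; a schema library such as Pydantic would carry these bounds as field metadata rather than in a hand-written dictionary.

```python
import random
import string
from dataclasses import dataclass

rng = random.Random(1)

# Hypothetical constraint table standing in for schema field metadata.
CONSTRAINTS = {
    "age": {"ge": 18, "le": 99},
    "username": {"min_length": 3, "max_length": 12},
}


@dataclass
class Account:
    username: str
    age: int

    def __post_init__(self):
        # Validation mirrors the constraint table, as a schema validator would.
        age_rule = CONSTRAINTS["age"]
        name_rule = CONSTRAINTS["username"]
        if not age_rule["ge"] <= self.age <= age_rule["le"]:
            raise ValueError(f"age out of range: {self.age}")
        if not name_rule["min_length"] <= len(self.username) <= name_rule["max_length"]:
            raise ValueError(f"bad username length: {self.username!r}")


def build_account() -> Account:
    """Generate values inside the declared bounds so validation always passes."""
    age_rule = CONSTRAINTS["age"]
    name_rule = CONSTRAINTS["username"]
    length = rng.randint(name_rule["min_length"], name_rule["max_length"])
    return Account(
        username="".join(rng.choices(string.ascii_lowercase, k=length)),
        age=rng.randint(age_rule["ge"], age_rule["le"]),
    )


accounts = [build_account() for _ in range(100)]
print(len(accounts), "valid accounts generated")
```

    Generating within the bounds, rather than generating freely and retrying on failure, keeps batch creation predictable even for tight constraints.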

    print("=" * 80)
    print("SECTION 9: Field-Level Control with Use and Ignore")
    print("=" * 80)

    from polyfactory.fields import Use, Ignore


    @dataclass
    class Configuration:
        app_name: str
        version: str
        debug: bool
        created_at: datetime
        api_key: str
        secret_key: str


    class ConfigFactory(DataclassFactory[Configuration]):
        # Use() wraps a callable whose return value fills the field on every build.
        app_name = Use(lambda: "MyAwesomeApp")
        version = Use(lambda: "1.0.0")
        debug = Use(lambda: False)

        @classmethod
        def api_key(cls) -> str:
            return f"api_key_{''.join(cls.__random__.choices('0123456789abcdef', k=32))}"

        @classmethod
        def secret_key(cls) -> str:
            return f"secret_{''.join(cls.__random__.choices('0123456789abcdef', k=64))}"


    configs = ConfigFactory.batch(2)
    print("Generated Configurations:")
    for config in configs:
        print(f"  App: {config.app_name} v{config.version}")
        print(f"    Debug: {config.debug}")
        print(f"    API Key: {config.api_key[:20]}...")
        print(f"    Created: {config.created_at}")
        print()
    print()
    
    
    print("=" * 80)
    print("SECTION 10: Model Coverage Testing")
    print("=" * 80)

    from pydantic import BaseModel, ConfigDict
    from typing import Union
    # ModelFactory is the Pydantic-aware factory base class.
    from polyfactory.factories.pydantic_factory import ModelFactory


    class PaymentMethod(BaseModel):
        model_config = ConfigDict(use_enum_values=True)
        type: str
        card_number: Optional[str] = None
        bank_name: Optional[str] = None
        verified: bool = False


    class PaymentMethodFactory(ModelFactory[PaymentMethod]):
        __model__ = PaymentMethod


    # Build one instance per variant we want the tests to exercise.
    payment_methods = [
        PaymentMethodFactory.build(type="card", card_number="4111111111111111"),
        PaymentMethodFactory.build(type="bank", bank_name="Chase Bank"),
        PaymentMethodFactory.build(verified=True),
    ]

    print("Payment Method Coverage:")
    for i, pm in enumerate(payment_methods, 1):
        print(f"  {i}. Type: {pm.type}")
        if pm.card_number:
            print(f"     Card: {pm.card_number}")
        if pm.bank_name:
            print(f"     Bank: {pm.bank_name}")
        print(f"     Verified: {pm.verified}")
    print("\n")
    
    
    print("=" * 80)
    print("TUTORIAL SUMMARY")
    print("=" * 80)
    print("""
    This tutorial covered:

    1. ✓ Basic Dataclass Factories - Simple mock data generation
    2. ✓ Custom Field Generators - Controlling individual field values
    3. ✓ Field Constraints - Using PostGenerated for calculated fields
    4. ✓ Pydantic Integration - Working with validated models
    5. ✓ Complex Nested Structures - Building related objects
    6. ✓ Attrs Support - Alternative to dataclasses
    7. ✓ Build Overrides - Customizing specific instances
    8. ✓ Use and Ignore - Explicit field control
    9. ✓ Coverage Testing - Ensuring comprehensive test data

    Key Takeaways:
    - Polyfactory automatically generates mock data from type hints
    - Customize generation with classmethods and decorators
    - Supports multiple libraries: dataclasses, Pydantic, attrs, msgspec
    - Use PostGenerated for calculated/dependent fields
    - Override specific values while keeping others random
    - Perfect for testing, development, and prototyping

    For more information:
    - Documentation: https://polyfactory.litestar.dev/
    - GitHub: https://github.com/litestar-org/polyfactory
    """)
    print("=" * 80)

    We cover advanced usage patterns such as explicit overrides, constant field values, and coverage testing scenarios. We show how we can intentionally construct edge cases and variant scenarios for robust testing. This final step ties everything together by demonstrating how Polyfactory supports comprehensive and production-grade test data strategies.
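
    One way to think about coverage-style test data is to enumerate every combination of the interesting variants instead of sampling them at random. The sketch below does this with `itertools.product` from the standard library. The `Payment` dataclass, the `VARIANTS` table, and `all_variants` are illustrative assumptions that loosely mirror the payment example above; this is not Polyfactory's own coverage mechanism.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Payment:
    """Hypothetical model loosely mirroring the payment example above."""
    type: str
    card_number: Optional[str]
    verified: bool


# Each field lists the variants we want to see at least once.
VARIANTS = {
    "type": ["card", "bank"],
    "card_number": [None, "4111111111111111"],
    "verified": [False, True],
}


def all_variants():
    """Yield one Payment per combination of field variants."""
    names = list(VARIANTS)
    for combo in itertools.product(*(VARIANTS[n] for n in names)):
        yield Payment(**dict(zip(names, combo)))


payments = list(all_variants())
print(len(payments))  # 2 * 2 * 2 combinations
for payment in payments[:3]:
    print(payment)
```

    Exhaustive enumeration grows multiplicatively with the number of variants, so it suits small sets of flags and enum-like fields better than wide models.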

    In conclusion, we demonstrated how Polyfactory enables us to create comprehensive, flexible test data with minimal boilerplate while still retaining fine-grained control over every field. We showed how to handle simple entities, complex nested structures, and Pydantic model validation, as well as explicit field overrides, within a single, consistent factory-based approach. Overall, we found that Polyfactory enables us to move faster and test more confidently, because it reliably generates realistic datasets that closely mirror production-like scenarios without sacrificing clarity or maintainability.

