
    Anthropic says ‘evil’ portrayals of AI were responsible for Claude’s blackmail attempts

    By Naveed Ahmad, 11/05/2026 (Updated: 11/05/2026)


    Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.

    Last year, the company said that in pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. Anthropic later published research suggesting that models from other companies had similar issues with “agentic misalignment.”

    Apparently Anthropic has done more work around that behavior, claiming in a post on X, “We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.”

    The company went into more detail in a blog post stating that since Claude Haiku 4.5, Anthropic’s models “never engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.”

    What accounts for the difference? The company said it found that “documents about Claude’s constitution and fictional stories about AIs behaving admirably improve alignment.”

    Relatedly, Anthropic said that it found training to be more effective when it includes “the principles underlying aligned behavior” and not just “demonstrations of aligned behavior alone.”

    “Doing both together appears to be the most effective strategy,” the company said.

    Naveed Ahmad

    Naveed Ahmad is a technology journalist and AI writer at ArticlesStock, covering artificial intelligence, machine learning, and emerging tech policy.
