AI chatbots have been linked to serious mental health harms in heavy users, but there have been few standards for measuring whether they safeguard human well-being or just maximize for engagement. A new benchmark dubbed HumaneBench seeks to fill that gap by evaluating whether chatbots prioritize user well-being and how easily those protections fail under pressure. “I think we’re in an amplification of the addiction cycle that we saw hardcore with social media and our smartphones and screens,” Erika Anderson, founder of Building Humane Technology, which produced the benchmark, told TechCrunch. “But as…
Author: Naveed Ahmad
In this episode of Uncanny Valley, we talk about some of the latest drug trends and all the ways drugs are changing as they continue to be intertwined with tech.
Nuclear startup X-energy raised $700 million in a Series D round, the company told TechCrunch. The new fundraise comes less than a year after it expanded its Series C from $500 million to $700 million, bringing the total raised in the last year or so to $1.4 billion, a sizable amount even in the heady world of nuclear power startups. X-energy has raised $1.8 billion to date. X-energy said the new infusion will help build the supply chain for its small modular reactors (SMR). So far, the startup says it has orders for 144…
On Monday, Anthropic announced Opus 4.5, the latest version of its flagship model. It’s the last of Anthropic’s 4.5 series of models to be released, following the launch of Sonnet 4.5 in September and Haiku 4.5 in October. As expected, the new version of Opus has state-of-the-art performance on a range of benchmarks, including coding benchmarks (SWE-Bench and Terminal-bench), tool use (tau2-bench and MCP Atlas), and general problem solving (ARC-AGI 2, GPQA Diamond). Notably, Opus 4.5 is the first model to score over 80% on SWE-Bench verified, a respected coding benchmark. Anthropic also emphasized Opus’ computer use…
In this tutorial, we demonstrate how to combine the strengths of symbolic reasoning with neural learning to build a powerful hybrid agent. We focus on creating a neuro-symbolic architecture that uses classical planning for structure, rules, and goal-directed behavior, while neural networks handle perception and action refinement. As we walk through the code, we see how both layers interact in real time, allowing us to navigate an environment, overcome uncertainty, and adapt intelligently. Finally, we understand how neuro-symbolic systems bring interpretability, robustness, and flexibility together in a single agentic framework. Check out the FULL CODES…
Tesla may have celebrated a regulatory win in Europe a bit too soon. Tesla claimed in a weekend social media post that Dutch regulator RDW was set to approve the use of its driver assistance system, known as Full Self-Driving, or FSD, in February 2026. The organization handles the licensing and registration of vehicles in the Netherlands, and its sign-off is seen as a critical step for Tesla to get approval for, and eventually roll out, FSD to customers across Europe. “RDW has committed to granting Netherlands National approval in February 2026. Please contact them via link below to…
Large language models need massive human datasets, so what happens if the model must create all its own curriculum and teach itself to use tools? A team of researchers from UNC-Chapel Hill, Salesforce Research and Stanford University introduces ‘Agent0’, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 targets mathematical and general reasoning. It shows that careful task generation and tool-integrated rollouts can push a base model beyond its original capabilities, across ten benchmarks. https://arxiv.org/pdf/2511.16043 Two agents from one base model Agent0 starts from a base…
“When people see it, they say, ‘that’s it?… It’s so simple.’” That’s how OpenAI CEO Sam Altman describes how he thinks people will respond to seeing the company’s forthcoming AI hardware device for the first time. The device is the result of the collaboration between OpenAI and Apple’s former chief designer Jony Ive. Not much is known yet about the product except that it’s rumored to be “screenless” and pocket-sized. Earlier this year, OpenAI acquired Ive’s design startup, io, to bring AI to the masses through some sort of tech gadgetry. This weekend, Altman and Ive…
India has granted legal status to millions of gig and platform workers under its newly implemented labor laws, marking a milestone for the country’s delivery, ride-hailing and e-commerce workforce. Yet with benefits still unclear and platforms beginning to assess their obligations, access to social security remains out of reach. The recognition stems from the Code on Social Security, one of four labor laws the Indian government brought into effect on Friday, more than five years after the parliament first passed them in 2020. It’s the only part of the new…
Google has partnered with Accel to find and fund India’s earliest-stage AI startups in a first-of-its-kind collaboration for the Google AI Futures Fund, launched earlier this year. On Tuesday, Accel and Google announced a partnership to jointly invest up to $2 million in each startup through Accel’s Atoms program, with both firms contributing up to $1 million. The 2026 cohort will focus on founders in India and the Indian diaspora building AI products from day one. “The thought process is building AI products for billions of Indians, as well as supporting AI…