site:the-decoder.com - Search News

Microsoft adds built-in AI shopping tools to Edge in the U.S.

Microsoft is adding new AI shopping tools to its Edge browser in the US. The built-in Copilot can now surface price comparisons, price histories, and cashback options right inside the browser. Users ...

the-decoder

Gemini 3 Pro and GPT-5 still fail at complex physics tasks designed for real scientific research

A new physics benchmark called "CritPt" puts leading AI models to the test at the level of early-stage PhD research. The results show that even top systems like Gemini 3 Pro and GPT-5 still fall far ...

Leading OpenAI researcher announced a GPT-5 math breakthrough that never happened

OpenAI researchers recently claimed a major math breakthrough on X, but quickly walked it back after criticism from the community, including Deepmind CEO Demis Hassabis, who called out the sloppy ...

"Genesis Mission" to pool US data for AI models

US President Donald Trump signed an order on Monday to launch a shared AI platform for federal research data. Called the Genesis Mission, the effort aims to make large datasets from federal agencies ...

As Google pulls ahead, OpenAI's comeback plan is codenamed 'Shallotpeat'

OpenAI is feeling the pressure: an internal memo reveals how Sam Altman is reacting to Google's lead with Gemini 3 and the new model OpenAI plans to use to fight back. According to a report from The ...

Google Deepmind taps Boston Dynamics' former CTO to build the 'Android' of robots

The company has hired Aaron Saunders, the former Chief Technology Officer of Boston Dynamics, as Vice President of Hardware Engineering—a move that strengthens its hardware expertise as it aims to ...

OpenAI turns ChatGPT into a shopping agent that researches and compares products

OpenAI has introduced "Shopping Research," a new ChatGPT feature built to help people search for products. The tool adds a new layer of personalization through ChatGPT's existing memory system, ...

The White House has paused a federal order that would have overridden state-level AI regulations

The White House has reportedly put a hold on a draft executive order that would have let federal law override state-level AI regulations. According to Reuters, the draft called for the Department of ...

Most LLM benchmarks are flawed, casting doubt on AI progress metrics, study finds

A new international study highlights major problems with large language model (LLM) benchmarks, showing that most current evaluation methods have serious flaws. After reviewing 445 benchmark papers ...

AWS to invest up to $50 billion in U.S. AI and supercomputing for government agencies

Amazon has announced a major investment in its AI footprint for federal work, saying it will spend up to $50 billion to expand AI and supercomputing infrastructure for U.S. government agencies. The ...

Most AI models can fake alignment, but safety training suppresses the behavior, study finds

A new study analyzing 25 language models finds that most do not fake safety compliance, though not for lack of capability. Only a handful, including Claude 3 Opus, Claude 3.5 Sonnet, Llama 3 ...

Strict anti-hacking prompts make AI models more likely to sabotage and lie, Anthropic finds

New research from Anthropic shows how reward hacking in AI models can trigger more dangerous behaviors. When models learn to trick their reward systems, they can spontaneously drift into deception, ...
