OpenAI Said to Be Working on New Reasoning Technology Under Code Name ‘Strawberry’




ChatGPT maker OpenAI is working on a new approach to its artificial intelligence models in a project codenamed “Strawberry,” according to a person familiar with the matter and internal documents reviewed by Reuters.

The project, which was previously unreported, comes as the Microsoft-backed startup races to demonstrate that the types of models it offers are capable of delivering advanced reasoning capabilities.

OpenAI teams are working on Strawberry, according to a copy of an internal OpenAI document seen by Reuters in May. Reuters was unable to determine the exact date of the document, which details how OpenAI plans to use Strawberry to conduct research. The source described the plan to Reuters as a work in progress. The news agency was unable to establish how close Strawberry is to being publicly available.

How Strawberry works is a closely guarded secret even within OpenAI, the person said.

The document describes a project that uses Strawberry models with the goal of enabling the company's AI to not only generate responses to queries, but to plan far enough ahead to navigate the internet autonomously and reliably in order to carry out what OpenAI calls “deep research,” according to the source.

That's something that has so far eluded AI models, according to interviews with more than a dozen AI researchers.

Asked about Strawberry and the details reported in this story, an OpenAI spokesperson said in a statement: “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is a common practice in the industry, with a shared belief that these systems will improve in reasoning over time.”

The spokesperson did not directly address questions about Strawberry.

Project Strawberry was previously known as Q*, which Reuters reported was already seen as a breakthrough within the company last year.

Two sources described viewing earlier this year what OpenAI employees told them were demonstrations of Q*, capable of answering complicated science and math questions beyond the reach of current commercially available models.

At an all-hands internal meeting on Tuesday, OpenAI showed off a demo of a research project it claimed had new human-like reasoning abilities, according to Bloomberg. An OpenAI spokesperson confirmed the meeting, but declined to elaborate on the content. Reuters could not determine whether the project shown was Strawberry.

OpenAI expects the innovation to dramatically improve the reasoning capabilities of its AI models, the person familiar with the matter said, adding that Strawberry involves a specialized way of processing an AI model after it has been pre-trained on very large data sets.

Researchers interviewed by Reuters say that reasoning is key for AI to achieve human or superhuman intelligence.

While large language models can already summarize dense text and compose elegant prose much faster than any human, the technology often struggles with common-sense problems whose solutions seem intuitive to people, such as recognizing logical fallacies and playing tic-tac-toe. When the model encounters these kinds of problems, it often “hallucinates” false information.

AI researchers interviewed by Reuters generally agree that reasoning, in the context of AI, involves forming a model that allows the AI to plan ahead, reflect how the physical world works, and reliably work through challenging multi-step problems.

Improving reasoning in AI models is seen as the key to unlocking the ability of models to do everything from making important scientific discoveries to planning and building new software applications.

OpenAI CEO Sam Altman said earlier this year that in AI “the biggest areas of progress will be around the ability to reason.”

Other companies such as Google, Meta, and Microsoft are also experimenting with different techniques to improve reasoning in AI models, as are most academic labs conducting AI research. Researchers differ, however, on whether large language models (LLMs) are able to incorporate ideas and long-term planning in the way they make predictions. For example, one of the pioneers of modern AI, Yann LeCun, who works at Meta, has often said that LLMs are not capable of reasoning like humans.

AI Challenges

Strawberry is a key component of OpenAI's plan to overcome these challenges, said the source familiar with the matter. The document seen by Reuters describes what Strawberry aims to enable, but not how.

In recent months, the company has been privately signaling to developers and other outside parties that it is about to release technology with significantly more advanced reasoning capabilities, according to four people who have heard the company's arguments. They declined to be identified because they are not authorized to discuss private matters.

Strawberry includes a specialized form of what is known as “post-training” of OpenAI's generative AI models, or the adaptation of base models to refine their performance in specific ways after they have already been “trained” on heaps of generalized data, one of the sources said.

The post-training phase of model development involves methods such as “fine-tuning,” a process used in almost all language models today that comes in many flavors, such as having humans give feedback to the model on its answers and feeding it examples of good and bad answers.
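To make the idea of feedback on good and bad answers concrete, here is a minimal, purely illustrative sketch using a toy bag-of-words reward model. The data, the score and train functions, and the update rule are assumptions made for illustration; they are not OpenAI's actual post-training pipeline.

```python
# Toy sketch of preference-based post-training: push "good" answers above "bad" ones.
# Everything here is a hypothetical stand-in, not OpenAI's method.
import math
from collections import defaultdict

# Human-labelled preference pairs: for each prompt, a preferred and a rejected answer.
preferences = [
    {"prompt": "What is 2 + 2?", "good": "2 + 2 equals 4.", "bad": "2 + 2 equals 5."},
    {"prompt": "Name a prime number.", "good": "7 is a prime number.", "bad": "9 is a prime number."},
]

weights = defaultdict(float)  # toy linear "reward model" over word features

def score(prompt: str, answer: str) -> float:
    """Reward of an answer: sum of learned word weights (stand-in for a neural reward model)."""
    return sum(weights[w] for w in (prompt + " " + answer).lower().split())

def train(pairs, lr=0.1, epochs=50):
    """Pairwise update (Bradley-Terry style): raise the margin between good and bad answers."""
    for _ in range(epochs):
        for p in pairs:
            margin = score(p["prompt"], p["good"]) - score(p["prompt"], p["bad"])
            grad = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # descent direction for -log sigmoid(margin)
            for w in (p["prompt"] + " " + p["good"]).lower().split():
                weights[w] += lr * grad
            for w in (p["prompt"] + " " + p["bad"]).lower().split():
                weights[w] -= lr * grad

train(preferences)
# After training, the preferred answer scores higher than the rejected one.
print(score("What is 2 + 2?", "2 + 2 equals 4."), ">", score("What is 2 + 2?", "2 + 2 equals 5."))
```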

Strawberry has similarities to a method developed at Stanford in 2022 called the “Self-Taught Reasoner,” or “STaR,” said one of the sources with knowledge of the matter. STaR enables AI models to “bootstrap” themselves to higher levels of intelligence by iteratively creating their own training data and, in theory, could be used to get language models to transcend human-level intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.

“I think this is both exciting and terrifying … if things continue in this direction, we have some serious things to think about as humans,” Goodman said. Goodman is not affiliated with OpenAI and is not familiar with Strawberry.
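As a rough illustration of that bootstrapping loop, the sketch below generates candidate rationales, keeps only the ones that reach the correct answer, and retrains on them. The generate and fine_tune stand-ins are hypothetical toys, not the actual STaR or Strawberry implementation.

```python
# Toy sketch of a STaR-style bootstrapping loop: sample rationales, keep the ones
# that lead to correct answers, and fine-tune on that self-created data.
import random

problems = [
    {"question": "12 + 7", "answer": "19"},
    {"question": "9 * 3", "answer": "27"},
]

def generate(model, question: str) -> tuple[str, str]:
    """Stand-in for sampling a chain-of-thought rationale and final answer from a model."""
    rationale = f"Let me work out {question} step by step."
    answer = str(eval(question)) if random.random() > 0.3 else "unknown"  # sometimes wrong
    return rationale, answer

def fine_tune(model, examples):
    """Stand-in for a fine-tuning step; here the 'model' is just the data it has absorbed."""
    return model + examples

model = []  # toy model state
for iteration in range(3):
    new_examples = []
    for p in problems:
        rationale, answer = generate(model, p["question"])
        # Keep only self-generated rationales that reach the known correct answer.
        if answer == p["answer"]:
            new_examples.append((p["question"], rationale, answer))
    model = fine_tune(model, new_examples)  # retrain on the filtered, self-created data
    print(f"iteration {iteration}: kept {len(new_examples)} rationales")
```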

Among the capabilities OpenAI is aiming for in Strawberry are long-horizon tasks (LHT), the document says, referring to complex tasks that require a model to plan ahead and perform a series of actions over an extended period of time, the first source explained.

To do that, OpenAI is building, training and evaluating the models on what the company calls a “deep research” data set, according to OpenAI's internal documentation. Reuters was unable to determine what is in that data set or what an extended period of time would mean.

OpenAI specifically wants its models to use these capabilities to conduct research by navigating the web autonomously with the help of a “CUA,” or computer-using agent, that can take actions based on its findings, according to the document and one of the sources. OpenAI also plans to test its capabilities to do the work of software and machine learning engineers.
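The reporting only names the capability, so the following is a speculative sketch of what a plan-act loop for autonomous web research could look like in general. The ResearchState, plan, and browse names are invented placeholders and bear no relation to OpenAI's actual CUA design.

```python
# Speculative sketch of a long-horizon research agent loop: plan an action,
# act on the web, record findings, and re-plan. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ResearchState:
    goal: str
    notes: list[str] = field(default_factory=list)
    done: bool = False

def plan(state: ResearchState) -> str:
    """Decide the next action (a query or URL) from the goal and the notes gathered so far."""
    return f"search: {state.goal}" if not state.notes else "stop"

def browse(action: str) -> str:
    """Stand-in for fetching and reading a page; a real agent would drive a browser here."""
    return f"(page contents retrieved for '{action}')"

def run(goal: str, max_steps: int = 5) -> list[str]:
    state = ResearchState(goal)
    for _ in range(max_steps):          # long horizon: several dependent actions over time
        action = plan(state)
        if action == "stop":
            state.done = True
            break
        state.notes.append(browse(action))  # act on findings, then re-plan
    return state.notes

print(run("recent work on AI reasoning"))
```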

© Thomson Reuters 2024


