Fudan professor Xiao Yanghua: ChatGPT code interpreter is a milestone achievement

2023-07-14 02:10:59

Source: The Paper

Reporter Shao Wen

With the beta release of the ChatGPT code interpreter, even users who are not programmers can instruct ChatGPT in natural language to complete complex programming tasks. This may have two major impacts: eliminating the language gap and reshaping industries.

The rapid iteration of large models will follow two trends. First, ChatGPT will certainly learn from larger-scale and more diverse data, while also incorporating more specialized private-domain data, so that it learns more broadly. Second, it will analyze data more thoroughly, which can be regarded, to some extent, as learning more deeply.

"OpenAI's ChatGPT has been upgraded again. It had already completed the upgrade from a tool to an assistant, and this time it has been upgraded from an ordinary assistant to a professional assistant." On July 12, Xiao Yanghua, a professor at Fudan University and director of the Shanghai Key Laboratory of Data Science, spoke with Pengpai Technology (The Paper) about OpenAI's recent blockbuster release: the ChatGPT code interpreter (Code Interpreter).

On July 9, Beijing time, the beta version of the ChatGPT code interpreter was officially opened to all ChatGPT Plus users. It takes human natural language as instructions and drives the large model to perform mathematical operations, data analysis, and professional chart drawing, and even to generate videos and analyze the stock market.

In other words, even users who are not programmers can give ChatGPT instructions in natural language to complete complex programming tasks. Outside observers have called this "the most powerful GPT-4 feature to date".

"To use a not entirely appropriate metaphor," Xiao Yanghua said, "OpenAI has evidently been 'plotting this for a long time'. They have been working hard to improve the multimodal interaction capabilities of large models." Multimodal interaction here means the ability to drive multimodal tasks, such as producing images and specialized charts, through natural language.

What does it mean to be such a professional assistant? "It means ChatGPT can handle even a lot of highly professional work. One could say it is competent for the work of university undergraduates in related majors, such as data science," Xiao Yanghua said.

"The ability to analyze data determines the capabilities that large models can acquire in the future"

As for why ChatGPT was upgraded in this direction, Xiao Yanghua believes it stems from deeper analysis and learning of data. Such data is widely available: most papers essentially contain professional data analysis from their disciplines. Previous versions of GPT mainly focused on making effective use of text, while their use of the charts and tables in this data, and of how those correspond to the text, was relatively coarse and simple. This upgrade benefited from in-depth analysis of professional literature and similar data, and from establishing correspondences among text, charts, and formulas, which gave GPT the ability to drive the creation of charts and tables through natural language interaction.

From this observation, Xiao Yanghua drew a lesson for technology research and development: "This capacity for in-depth analysis of a corpus is likely to be one of the core factors that determine the capabilities of large models. For the development of large models, there is no such thing as too much data."

As for ChatGPT, Xiao Yanghua believes OpenAI has consistently worked to find more high-quality data and to analyze existing data in depth, making the model ever more capable. Obtaining large-scale, high-quality, diverse data and analyzing it in depth may therefore be one of the key approaches to advancing large models.

"Eliminating the language gap"

Looking at ChatGPT's capability upgrade as a whole, Xiao Yanghua sees two possible impacts worth attention: first, "eliminating the language gap"; second, reshaping industries.

What is the language gap? Ever since computers were invented, humans have wanted them to carry out set tasks as intended. That has required professionals to express intentions and issue instructions in non-natural, formal languages: assembly language in the early days, later high-level programming languages such as C++, structured query languages such as SQL, and so on. The language humans use to communicate with one another, however, is natural language.

According to Western legend, to stop humans from building the Tower of Babel up to the heavens, God confounded their languages so that they could no longer communicate with or understand one another. Xiao Yanghua believes a similar situation exists between machines and humans. Machines have so far been unable to accurately understand human natural language, so in practice humans have been accommodating machines by translating their intentions into various formal languages.

The tasks computers must complete, however, span countless industries. Xiao Yanghua said this means professionals have to learn different languages for different tasks, such as languages for chip design and languages for office automation. All of these take complex training to master, so every professional task demands complicated language learning, which sets a high language threshold for entering an industry.

But as things stand now, Xiao Yanghua judges, "all these formal languages are unnecessary and can basically be replaced by natural language." To some extent, machines can be considered to "understand" human natural language while also having mastered various professional formal languages, so they can accurately convert human intentions expressed in natural language into the corresponding formal languages, such as programming languages and chip design languages.
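To make the gap concrete, here is a minimal sketch, not taken from the interview, of one intent stated both ways: as a plain-English question and as the SQL query a professional would traditionally have to write. The table name, column names, and figures are hypothetical and invented purely for illustration; Python's standard `sqlite3` module is used only to run the query.

```python
import sqlite3

# Hypothetical sales records, invented for illustration only.
rows = [("east", 120.0), ("west", 80.0), ("east", 30.0)]

# Build a throwaway in-memory database holding the records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# The intent in natural language:
#   "What is the total sales amount for each region?"
# The same intent in a formal language (SQL):
query = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"

# Run the formal-language version and collect the per-region totals.
totals = dict(conn.execute(query).fetchall())
print(totals)
conn.close()
```

A tool like the code interpreter is described as taking the sentence in the comment and producing something like the query itself, so the user never has to learn the formal language.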

This is what eliminating the language gap means: there is no longer a barrier to machines "understanding" humans. "If the first version of ChatGPT eliminated the natural language expression gap between humans and machines, ChatGPT with the Code Interpreter feature will eliminate the professional language expression gap between humans and machines." Xiao Yanghua believes this will have a very far-reaching impact and is a milestone achievement.

"Soon, large models will gradually master the 'language' abilities humans need for highly specialized work, such as the language of mathematics and the language of physics, along with the corresponding thinking and problem-solving abilities. The principle is the same: the mathematical language mathematicians need for their research is just another formal language. As long as paired data of natural language and the corresponding professional language can be obtained, large models have the opportunity to learn it. Such data is widely available in papers, and widely used professional software such as MATLAB can also be used to synthesize data, further alleviating the data scarcity that hampers large models in learning professional capabilities," Xiao Yanghua said.

Is there still a need for professional positions?

This means that in the future, large models may be able to do well most of the professional work that requires mastery of a professional language. That raises a question worth careful thought, Xiao Yanghua said: do professionals still have room to develop, and are their jobs still necessary?

In Xiao Yanghua's view, as large models become more capable, all language-based work will in the future be divided into three steps: first prompting, then generation, and finally evaluation.

"Obviously, the generation work, whether professional or not, can be handed to large models. But professionals still have value: in writing prompts, in knowing how to prompt a large model to generate the professional charts required, and in evaluating and analyzing the quality of the generated results. Humans still have the advantage in these areas; at least in the short term, large models need considerable improvement before they are up to these tasks," Xiao Yanghua said. This, he added, is how industries will be reshaped.

Furthermore, most content-generation and analytical work will be broken down into many fine-grained steps. The repetitive, routine, generative steps will gradually be handed to large models; the subtasks that traditional small models are good at will go to small models; and the subtasks that only humans are still good at will stay with humans. Xiao Yanghua believes that decomposing complex tasks into multiple steps (decomposition) and having large models, small models, and humans each complete the steps they are best at (reorganization) will become a basic trend.

Two trends of rapid iteration of large models

As for whether this update signals the arrival of GPT-4.5, Xiao Yanghua believes that is not the key question. That the issue has drawn so much attention, however, reflects people's concern about the rapid iteration of large models and, to some extent, their worries about its possible social impact. In his view these worries are not unreasonable: "With iteration this fast, our understanding of it, at the very least, may not keep pace with its iteration. We may even need to proactively press the pause button on the development of large models and think carefully about what they can and cannot do."

On the two trends in the rapid iteration of large models, Xiao Yanghua said that, first, ChatGPT currently learns mainly from public data; it will certainly go on to learn from larger-scale and more diverse data while incorporating more specialized private-domain data. Second, it will analyze data more thoroughly, which can be regarded, to some extent, as increasing the depth of learning. In other words, there are two dimensions: learning ever more broadly, and learning existing data in an ever more specialized and deeper way.

"This is a very important idea in this version. In fact, it is very likely that the data is still the same data, only learned more deeply," Xiao Yanghua continued. "If the large models in each field are fragmented and cannot be integrated, their capabilities may remain within a controllable range. But if ChatGPT has strong general knowledge and keeps combining it with all kinds of private-domain data, its capability upgrades may exceed our expectations. Pushing large models to develop in a safe and controllable direction is therefore imperative and urgent."
