Written by: Li Dan
Source: Wall Street News
OpenAI, which set off a wave of artificial intelligence (AI) applications with ChatGPT, has released its latest model, GPT-4. ChatGPT, powered by the new model, will receive an upgrade.
On Tuesday, March 14th, Eastern Time, OpenAI announced the launch of GPT-4, a large-scale multimodal model that accepts image and text input and produces text output. The company said the model is "more creative and collaborative than ever before" and, "because it has broader general knowledge and problem-solving abilities, can solve difficult problems with greater accuracy."
OpenAI said it has worked with several companies, including Duolingo, Stripe, and Khan Academy, to incorporate GPT-4 into their products. GPT-4 is also available to subscribers of ChatGPT Plus, the paid version of ChatGPT, and developers can sign up to build applications with the model through the API.
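For developers granted API access, a chat request to the model is a simple JSON body sent to OpenAI's chat-completions endpoint. The sketch below only assembles that body; the model name "gpt-4", the system prompt, and the temperature value are illustrative choices, not details from the article.

```python
# Hypothetical sketch of a GPT-4 chat request body, following the shape of
# OpenAI's Chat Completions API. No network call is made here; at launch,
# developers also needed to be admitted to the API before requests succeeded.

def build_chat_request(prompt: str, system: str = "You are a helpful assistant.") -> dict:
    """Assemble the JSON body for a POST to /v1/chat/completions."""
    return {
        "model": "gpt-4",
        "messages": [
            {"role": "system", "content": system},  # sets the assistant's behavior
            {"role": "user", "content": prompt},    # the user's actual question
        ],
        "temperature": 0.7,  # sampling temperature; higher values are more creative
    }

request_body = build_chat_request("Explain GPT-4 in one sentence.")
print(request_body["model"])  # prints "gpt-4"
```

The same body works whether it is sent with the `openai` Python package or a plain HTTPS client, since the API ultimately accepts this JSON payload.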
Microsoft has since confirmed that the new Bing search engine runs on GPT-4.
GPT-4 stands for Generative Pre-trained Transformer 4. Its two "predecessors," GPT-3 and GPT-3.5, were used to create DALL-E and ChatGPT respectively; both attracted public attention and spurred other technology companies to invest heavily in AI applications.
According to OpenAI, in casual conversation the difference between GPT-4 and GPT-3.5, the previous generation that powers ChatGPT, is subtle; the gap between the two becomes much more apparent on complex tasks.
“In our internal assessment, it was 40% more likely to generate the correct response than GPT-3.5.”
OpenAI also said that GPT-4 took a variety of benchmark tests, including the Uniform Bar Exam, the LSAT, and the math and evidence-based reading and writing sections of the SAT, on which it scored higher than 88% of test takers.
Last week, Andreas Braun, chief technology officer (CTO) of Microsoft Germany, revealed at an AI event in Germany that a multimodal system, GPT-4, would be released this week and "will offer completely different possibilities, such as video." Because he said the system would be multimodal, implying it could generate media other than text, this led to speculation that GPT-4 would let users convert text to video.
The GPT-4 that OpenAI introduced on Tuesday is indeed multimodal, but it handles fewer media types than some predicted. OpenAI said GPT-4 can parse text and images together, allowing it to interpret more complex inputs.
In the examples OpenAI published, GPT-4 responds to image input by, for instance, explaining what is unusual or humorous about an image, or describing the purpose of a funny image like the one in the screenshot below.
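A combined text-and-image request of this kind can be sketched as a message whose content is a list of typed parts. Note the caveats: at the time of this article, image input was shown only in OpenAI's demos, not in the public API; the content-parts shape below follows the format OpenAI later documented for image input, and the image URL is a placeholder.

```python
# Hypothetical sketch of a multimodal request body: one user message carrying
# both a text question and an image reference. Image input was demo-only at
# GPT-4's launch; this shape mirrors OpenAI's later-documented format.

def build_multimodal_request(question: str, image_url: str) -> dict:
    """Assemble a chat request pairing a text question with an image URL."""
    return {
        "model": "gpt-4",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},                # the text part
                    {"type": "image_url", "image_url": {"url": image_url}},  # the image part
                ],
            }
        ],
    }

body = build_multimodal_request(
    "What is unusual about this image?",
    "https://example.com/funny.png",  # placeholder URL
)
```

Because the model sees both parts at once, a single request can ask it to explain what is humorous in the picture, matching the screenshot examples described above.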