What can ChatGPT do?
It can write articles in Jack's place: by studying Jack's past articles, the AI can learn their structure and patterns, so that once you feed it the core idea and related data, it can quickly produce an article in Jack's style. It can also build PPTs at work in Jack's place, especially report-summary PPTs, which it can generate quickly.
This is true, and it is clear that general-purpose AI models such as ChatGPT could disrupt many things in the future. If so, you really should understand technologies such as ChatGPT.
Therefore, this article draws on the relevant material to summarize:
What is ChatGPT and what can it do?
What is the technology behind ChatGPT?
What are the current limitations of ChatGPT?
What similar products to ChatGPT exist in China and abroad?
What impact does ChatGPT have on automotive technology?
The goal is to aid my own understanding and learning, and hopefully to bring you some information and inspiration.
What is ChatGPT and what can it do?
ChatGPT: first, return to its English meaning. "Chat" means chat; GPT is the abbreviation of Generative Pre-trained Transformer, where the Transformer is a deep learning model based entirely on the self-attention mechanism. So ChatGPT is a generative, pre-trained algorithmic model that can chat. It belongs to the currently hot family of generative AI models, that is, AI that generates human-like content.
ChatGPT is an artificial intelligence chatbot for ordinary users. It is currently mainly text-based, and it uses advanced natural language processing (NLP) to hold realistic conversations with humans. ChatGPT can currently generate articles, fictional stories, poems, and even computer code. It can also answer questions, engage in conversations, and, in some cases, provide detailed responses to very specific questions and queries.
In fact, chatbots are nothing new. Traditional chatbots use keyword-matching technology to look up answers, and they are very common in daily life: Amazon's Alexa, Apple's Siri, Tmall Genie, Baidu Xiaodu, many online customer-service bots, and even the in-car voice control we are familiar with from "Car AI Smart Voice 101 and Its Supply Chain" are mainly task-based, command-driven voice assistants. ChatGPT, however, uses a far more refined big-data training model, has a much wider range of applications, and can be integrated into various applications.
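To make the contrast concrete, here is a minimal sketch of the traditional keyword-matching approach described above. The keywords and canned replies are invented for illustration; real assistants use far richer intent-matching pipelines.

```python
# A minimal sketch of a traditional keyword-matching chatbot,
# the kind of command-based assistant described above.
# All keywords and replies here are made-up examples.

RULES = {
    "weather": "Today is sunny with a high of 25 degrees.",
    "music": "Playing your favorite playlist.",
    "navigate": "Starting navigation to your destination.",
}

def keyword_bot(utterance: str) -> str:
    """Return the first canned reply whose keyword appears in the input."""
    text = utterance.lower()
    for keyword, reply in RULES.items():
        if keyword in text:
            return reply
    return "Sorry, I didn't understand that."

print(keyword_bot("What's the weather like?"))  # matched canned weather reply
print(keyword_bot("Tell me a joke"))            # falls through to the default
```

Anything outside the fixed keyword table falls through to a default response, which is exactly the rigidity that generative models avoid.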
One thing that sets ChatGPT apart from other chatbots and NLP systems is its uncannily realistic conversational skill, including the ability to ask follow-up questions, admit mistakes, and point out nuances of a topic. In many cases, if no one tells you, it is genuinely difficult for a human to detect that they are interacting with a computer-generated bot. Spelling and grammatical errors are rare, and the writing is logically structured and clear.
Some features of ChatGPT include:
· Generate human-like text that mimics the style and structure of the input data
· Generate responses to a given prompt or input text, such as writing a story or answering questions
· Generate text in multiple languages
· Modify the style of generated text (e.g., formal or informal)
· Ask clarifying questions to better understand the intent of the input
· Respond in a way consistent with the context of the conversation, such as following up on instructions or understanding references to previous questions
Other generative AI models can perform similar tasks on images, sound, and video.
In addition, ChatGPT supports fine-tuning: training on a small related dataset to adapt the LLM to a specific task or field. This is also the business model ChatGPT is currently expanding into, applying it to particular segments such as a professional legal counsel or a professional automotive think-tank expert.
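Conceptually, fine-tuning means continuing training from pretrained weights on a small task-specific dataset rather than starting from scratch. A toy numerical sketch of that idea, using a simple logistic-regression model on synthetic data (this is an analogy, not ChatGPT's actual procedure):

```python
import numpy as np

# Toy illustration of fine-tuning: start from "pretrained" weights learned on
# a large general dataset, then continue gradient descent on a small
# task-specific dataset instead of training from scratch. Data is synthetic.

rng = np.random.default_rng(0)

def train(X, y, w, lr=0.1, steps=200):
    """Logistic-regression training loop; returns updated weights."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)   # gradient step
    return w

# "Pre-training": large general dataset, label depends on feature 0
X_big = rng.normal(size=(1000, 5))
y_big = (X_big[:, 0] > 0).astype(float)
w = train(X_big, y_big, np.zeros(5))

# "Fine-tuning": small dataset with a slightly shifted decision rule
X_small = rng.normal(size=(30, 5))
y_small = (X_small[:, 0] + 0.5 * X_small[:, 1] > 0).astype(float)
w_ft = train(X_small, y_small, w.copy(), steps=100)

acc = np.mean(((X_small @ w_ft) > 0) == y_small)
print(f"fine-tuned accuracy on the small task: {acc:.2f}")
```

Starting from the pretrained weights, only a short, cheap training run on the small dataset is needed to adapt to the new task.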
What is the technology behind ChatGPT?
ChatGPT is OpenAI's latest NLP (natural language processing) language model. It is built on the large language model (LLM) GPT-3, fine-tuned using supervised learning plus a special technique, reinforcement learning from human feedback (RLHF), to form ChatGPT.
Three technical keywords
NLP (Natural Language Processing): natural language processing
LLM (Large Language Model): large language model
RLHF (Reinforcement Learning from Human Feedback): reinforcement learning from human feedback
NLP, natural language processing, concerns the interaction between human language and computers. Its scope is relatively broad; it was covered previously in "Automotive AI Intelligent Voice 101 and Its Supply Chain". The dominant approach in the NLP field is deep learning, which mainly relies on the following key technologies: heavily modified LSTM models plus, to a lesser extent, improved CNN and RNN models as typical feature extractors; and sequence-to-sequence (also called encoder-decoder) architectures plus attention as the typical overall technical framework for various specific tasks.
The large language model (LLM), the model family ChatGPT belongs to, is a subset of artificial intelligence. As the name implies, "large" means massive data: such models are trained on enormous text corpora from the web and millions of books (roughly 300 billion words) to generate human-like responses to conversations or other natural language input.
Its main key technical points are:
Word embedding: an algorithm used in LLMs to represent the meaning of a word in numerical form so that it can be fed into and processed by an AI model. It maps words to vectors in a high-dimensional space, where words with similar meanings sit closer together.
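A tiny sketch of that idea: each word becomes a vector, and vector similarity stands in for similarity of meaning. The 3-dimensional vectors below are hand-made for illustration; real embeddings have hundreds of dimensions and are learned from data.

```python
import numpy as np

# Toy word embeddings: words with similar meaning get similar vectors.
# These 3-dimensional vectors are hand-made for illustration only.

embeddings = {
    "car":   np.array([0.9, 0.1, 0.0]),
    "truck": np.array([0.8, 0.2, 0.1]),
    "poem":  np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: close to 1.0 means similar direction/meaning."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["car"], embeddings["truck"]))  # high: related words
print(cosine(embeddings["car"], embeddings["poem"]))   # low: unrelated words
```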
Attention mechanism: an algorithm used in LLMs that lets the AI focus on specific parts of the input text, such as sentiment-related words, when generating output. This allows the LLM to take the context or sentiment of a given input into account, producing more coherent and accurate responses.
Transformers: a neural network architecture popular in LLM research that uses self-attention mechanisms to process input data, allowing it to efficiently capture long-range dependencies in human language. Transformers are trained to analyze the context of input data and weight the importance of each part of the data accordingly. Because this type of model learns context, it is often used in natural language processing (NLP) to generate text that resembles human writing.
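The self-attention operation at the heart of the Transformer can be sketched in a few lines. This is the standard scaled dot-product form; the matrix sizes and random weights are toy values for illustration.

```python
import numpy as np

# Minimal sketch of scaled dot-product self-attention, the core operation of
# the Transformer. Shapes and values are toy examples.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model). Each token attends to every other token."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise token affinities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)             # one context-mixed vector per input token
print(weights.sum(axis=-1))  # attention weights per token sum to 1.0
```

Each output row is a weighted mix of all token values, which is exactly how the model "attends" to the whole context at once.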
Because of its attention mechanism, the Transformer outperforms earlier deep learning algorithms such as RNNs ("Introduction to AI's Recurrent Neural Network RNN_MIT Chinese-English Subtitle Edition"), GRUs, and LSTMs: it has extremely long memory and can "attend to" or "focus on" all previously generated tokens. In theory, given sufficient computational resources, the attention mechanism has an infinite reference window, and can thus use the entire context of a story when generating text.
Large language models such as GPT-3 are trained on large amounts of text data from the Internet and can generate human-like text. Their objective function is in fact a probability distribution over sequences of words (or tokens), which lets them predict the next word in a sequence (more details below); as a result, they may not always produce output that matches human expectations or ideals.
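The "probability distribution over word sequences" objective can be seen in miniature with a bigram model: it only ever estimates P(next word | previous word) from counts. GPT-style models do the same next-token prediction, just conditioned on far more context and at vastly larger scale. The training corpus below is invented.

```python
from collections import Counter, defaultdict

# Toy bigram language model: the "objective" is just a probability
# distribution over the next token given the previous one, the same idea
# (at vastly larger scale) behind GPT-style next-word prediction.
# The training corpus is invented.

corpus = "the car is fast . the car is red . the road is long .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(word):
    """P(next | word), estimated from bigram counts."""
    c = counts[word]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

print(next_word_distribution("car"))  # {'is': 1.0}
print(next_word_distribution("is"))   # fast / red / long, each 1/3
```

Note that the model simply reproduces the statistics of its training data, which is precisely why output need not match human expectations when the data and the intent diverge.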
In practice, however, LLMs are trained to perform some form of valuable cognitive work, and there are clear differences between how these models are trained and how we want to use them. Mathematically, a machine-computed statistical distribution over word sequences can be a very efficient way to model language; but as humans, we generate language by choosing the sequence of text that best fits a given situation, using our background knowledge and common sense to guide the process. When language models are used in applications that demand a high degree of trust or reliability, such as dialogue systems or intelligent personal assistants, the performance problems of LLMs are:
· Lack of helpfulness: not following the user's explicit instructions.
· Hallucinations: the model fabricates facts that do not exist or are false.
· Lack of explainability: humans cannot understand how the model arrived at a particular decision or prediction.
· Generating biased or harmful output: a language model trained on biased or harmful data may reproduce it in its output.
But how exactly did ChatGPT's creators use human feedback to solve the alignment problem? This is where RLHF, reinforcement learning from human feedback, is applied to the LLM.
The method generally consists of three distinct steps:
Supervised fine-tuning (SFT) step: first, much like labeling for autonomous driving, human labelers write down the expected output responses used to train the LLM; for ChatGPT, this meant fine-tuning the GPT-3.5 series. But manual labeling is obviously very expensive, and OpenAI is said to have invested heavily here, so this supervised learning step scales at a very high cost.
"Mimicking human preferences" step: in this phase, human labelers are asked to vote on a relatively large number of SFT model outputs, creating a new dataset of comparison data. A new model, called the reward model (RM), is trained on this dataset. Ranking outputs is much easier for a labeler than producing them from scratch, so this process can be scaled up more efficiently.
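The training signal for such a reward model is commonly a pairwise comparison loss: it pushes the reward of the output the labeler chose above the reward of the one they rejected. A minimal sketch, with made-up reward values:

```python
import math

# Sketch of the reward-model training signal: given a labeler's preference
# between two sampled outputs, the pairwise loss pushes the reward of the
# chosen output above the rejected one. Reward values here are made up.

def pairwise_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): small when chosen >> rejected."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(pairwise_loss(2.0, -1.0))  # low loss: ranking matches the labeler
print(pairwise_loss(-1.0, 2.0))  # high loss: ranking contradicts the labeler
```

Minimizing this loss over many human comparisons teaches the reward model to score outputs the way the labelers would rank them.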
Proximal Policy Optimization (PPO) step: here the reward model is used to further fine-tune and improve the SFT model. The result is the so-called policy model, which continuously adjusts the current policy based on the actions the agent takes and the rewards it receives, while limiting how far the policy may drift from the previous policy.
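The "limit how far the policy may drift" part is implemented by PPO's clipped surrogate objective. A single-sample sketch of that formula, with illustrative probabilities and advantage values:

```python
# Sketch of PPO's clipped surrogate objective, the mechanism that limits the
# change of the policy to a certain distance from the previous policy.
# Probabilities and advantage values are illustrative.

def ppo_clip_objective(p_new: float, p_old: float, advantage: float,
                       eps: float = 0.2) -> float:
    """min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    ratio = p_new / p_old
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)

# A large policy change (ratio 2.0) gets no extra credit beyond the clip:
print(ppo_clip_objective(0.6, 0.3, advantage=1.0))   # capped at 1.2
# A small change inside the trust region passes through unchanged:
print(ppo_clip_objective(0.33, 0.3, advantage=1.0))  # about 1.1
```

Clipping the probability ratio means that, past a certain point, making the new policy even more different from the old one yields no additional objective value, which keeps each update step conservative.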
So, through continuous reinforcement training on the input of human annotators, they essentially taught ChatGPT which answers to give. With RLHF, ChatGPT uses human feedback in the training loop to minimize harmful, untruthful, and/or biased output.
Of course, beyond the algorithm, a computing center is also needed to store the data and do the processing. The computing center behind ChatGPT is Microsoft's Azure cloud.
Limitations of ChatGPT
Having become familiar with the core technology behind it, the limitations of such a powerful ChatGPT become easier to understand. It is trained on an accumulation of large-scale human language and text data; it cannot truly think or innovate, but mainly recombines the data it was trained on in the past.
In addition, in terms of accuracy, two core related capabilities currently determine how accurate a language model can be:
· LLM's ability to retrieve information from external sources.
· The ability of LLMs to provide references and citations for the information they provide.
Therefore, ChatGPT's answers are limited to the information captured in its static weights during training. For example, the ChatGPT we use now was trained on web data and related book knowledge from before 2021. (This is why it cannot discuss events occurring after 2021, when the model was trained.) Being able to obtain information from external sources would let an LLM access the most accurate and up-to-date information available, even when that information changes frequently (for example, a company's stock price), but it obviously cannot do that at present.
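A minimal sketch of what "retrieving information from external sources" could look like: instead of answering only from static weights, look the answer up in a live data source first. The document store, matching heuristic, and query below are all invented for illustration; real retrieval systems use proper search indexes or embedding similarity.

```python
# Sketch of retrieval from an external source: pick the stored document that
# best matches the query, then answer from it instead of from static weights.
# Documents, heuristic, and query are invented for illustration.

documents = {
    "doc1": "The 2023 model year adds a larger battery pack.",
    "doc2": "Quarterly revenue figures are updated every three months.",
    "doc3": "The charging port is located behind the left rear panel.",
}

def retrieve(query: str, docs: dict) -> str:
    """Return the document sharing the most words with the query."""
    q_words = set(query.lower().split())
    best = max(docs, key=lambda k: len(q_words & set(docs[k].lower().split())))
    return docs[best]

print(retrieve("where is the charging port", documents))
```

Because the lookup happens at question time, updating a document immediately updates the answer, with no retraining of the model.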
In addition, the fine-tuning supplied by the human annotators in the reinforcement training is itself a limitation of the current ChatGPT: ChatGPT cannot escape the shadow of its human annotators, including their positions and preferences.
What similar products to ChatGPT exist in China and abroad?
So, does no one else have a product similar to ChatGPT? In fact, as mentioned above, ChatGPT is a kind of large-scale LLM language model, and many Internet companies have large language models of their own. As for why the newcomer OpenAI released first and caused a sensation: the incumbents are most afraid of making mistakes, which gave a new player the opportunity to try boldly and make mistakes.