2024-07-11
We will not go into too much depth about AI itself here, but focus more on upper-level applications.
When we talk about large language models, we mean software that can “speak” in a way that resembles human language. These models are amazing – they are able to take in context and generate responses that are not only coherent but also feel like they are coming from a real human.
These language models work by analyzing large amounts of text data and learning patterns in language usage. They use these patterns to generate text that is nearly indistinguishable from what humans say or write.
If you’ve ever chatted with a virtual assistant or interacted with an AI customer service agent, you may have interacted with a large language model without even realizing it! These models have a wide range of applications, from chatbots to language translation to content creation.
Why open a separate chapter on "understanding" large language models after the overview above? Because it will help you better understand what a large language model is and where its upper limit lies, which in turn helps us do better work at the application layer.
First of all, in general terms, machine learning means finding a special, complex "function" that converts our input into the desired output. For example, if we expect an input of 1 to produce an output of 5, and an input of 2 to produce an output of 10, then the function might be y=5*x. Or if we input a picture of a cat, we hope it outputs the word "cat"; if we input "hi", it outputs "hello", and so on.
This can be seen as a mathematical problem; of course, in practice it is much more complicated than the example above.
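To make the "finding a function" picture concrete, here is a minimal sketch that fits a linear function to the two example pairs above (1 → 5, 2 → 10); the use of NumPy's polyfit is simply my choice for the illustration, not something the text prescribes.

```python
import numpy as np

# The example input/output pairs from the text: f(1) = 5, f(2) = 10
x = np.array([1.0, 2.0])
y = np.array([5.0, 10.0])

# Fit y = w * x + b by least squares; polyfit returns [w, b]
w, b = np.polyfit(x, y, deg=1)
print(f"learned function: y = {w:.2f} * x + {b:.2f}")  # ~ y = 5.00 * x + 0.00

# Apply the learned "function" to a new input
print(w * 3 + b)  # ~ 15.0
```

Real machine learning does the same thing at vastly larger scale: the "function" has billions of parameters instead of two, and it is found by iterative optimization rather than a closed-form fit.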
1. In the early days, people wanted machines to think the way humans do. The dominant approach was the bionics-inspired "flying bird" school: just as people learned to fly by imitating birds flapping their wings, researchers hoped to build machine intelligence by imitating the human brain. The results were poor. Machines lacked "world knowledge" (the common-sense, instinctive understanding in your head that requires no conscious thought, such as "water flows downhill"), and this world knowledge is massive; it was also hard to handle problems such as a single word having multiple meanings. In short, imitating the neurons of the human brain is far too complex to achieve with code and functions alone.
2. The Artificial Intelligence 2.0 era: data-driven "statistical artificial intelligence". Why did large models spring up like mushrooms after GPT-3 appeared? Most companies had in fact been researching AI for a long time, but in the early days everyone was crossing the river by feeling for stones: there were many plans and ideas, yet no one dared to go all-in, so research stayed within a limited scope. GPT-3 showed everyone that one particular approach works: compute statistics over massive amounts of data and let quantitative change produce qualitative change. Once there was a successful case, everyone realized the method was feasible, increased investment, and took this path.
3. Big data can produce a leap in the level of machine intelligence; the greatest significance of using large amounts of data is that it lets computers accomplish things that, in the past, only humans could.
So the key question becomes one of probability. Today's large models compute, from massive amounts of data, which next word (or which span in the middle of a passage) has the highest probability, and then output it. In essence they do not generate genuinely new things; they infer.
For example, if you ask it what the capital of China is, the algorithm extracts the key phrase "What is the capital of China".
The large model has computed, from its massive training data, that "Beijing" is the word most likely to follow "the capital of China is", so it outputs the correct answer.
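A minimal sketch of this "pick the most probable next token" idea, with invented probabilities purely for illustration (a real model scores every token in a vocabulary of tens of thousands):

```python
# Toy next-token distribution for the context "The capital of China is";
# the numbers are made up for illustration only.
next_token_probs = {
    "Beijing": 0.92,
    "Shanghai": 0.03,
    "a": 0.02,
    "located": 0.01,
}

# Greedy decoding: output the token with the highest probability.
best_token = max(next_token_probs, key=next_token_probs.get)
print(best_token)  # -> Beijing
```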
Large models rely on "memorizing" massive amounts of data to achieve their current capabilities.
Therefore, the quality of the data used to train a large model is critical, and from this we can also roughly infer where the upper limit of large models lies.
AIGC, or Artificial Intelligence Generated Content, is a technology that uses machine learning algorithms to automatically generate various types of content, including text, images, audio, and video. An AIGC system analyzes large amounts of data, learns linguistic, visual, and audio patterns, and can create new content that is similar to, or even indistinguishable from, human-generated content.
All digital work is likely to be disrupted by large models.
Most of our current application layer work belongs to the AIGC system.
After GPT-3.5, large models can already use tools:
• Plugins and internet access: they compensate for the large model's own limited memory, and mark the official start of LLMs learning to use tools
• Function calling: the LLM learns to call APIs to complete complex tasks, which is largely the work of backend engineers (for example, giving Gorilla an instruction lets it automatically call models such as diffusion models for multimodal tasks like drawing and dialogue); a simplified sketch of this pattern follows the list
• Letting the model "think": guiding the large model toward logical reasoning ability; the core here is planning, memory, and tools
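Below is a simplified, framework-agnostic sketch of the function-calling pattern: the application declares the tools it exposes, the model replies with a structured call (shown here as a hard-coded JSON string), and the application executes it. The get_weather tool, the schema format, and the dispatch code are all assumptions for illustration, not any particular vendor's API.

```python
import json

# 1. A tool the application exposes to the model (hypothetical example).
def get_weather(city: str) -> str:
    """Pretend weather lookup; a real app would call an actual weather API."""
    return f"Sunny, 25°C in {city}"

TOOLS = {"get_weather": get_weather}

# 2. Tool descriptions sent to the model along with the user message,
#    so it knows what it may call (the schema format is illustrative).
tool_specs = [{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {"city": {"type": "string"}},
}]

# 3. Suppose the model replies with a structured call instead of plain text.
model_reply = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'

# 4. The application parses the call, runs the real function, and would then
#    feed the result back to the model so it can compose the final answer.
call = json.loads(model_reply)
result = TOOLS[call["name"]](**call["arguments"])
print(result)  # Sunny, 25°C in Beijing
```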
In fact, implementing an AI project is the same as implementing an ordinary one. The core of the project is to understand what central problem it is meant to solve, then expand from there: analyze requirements, select technology, and so on. At the application layer we do not design large models ourselves; we usually call APIs directly or deploy open-source models locally.
Anyone who has had some contact with AI has probably heard of prompts. In 2022-2023, early work on applying AI centered on exactly this: how to phrase a question so that the AI better understands your meaning, pays attention to your key points, and gives a higher-quality answer.
The threshold here is relatively low, and most large-model applications are designed around prompts. This can satisfy some needs, depending on the capabilities of the underlying base model.
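As a small illustration of what prompt design means at this level, the sketch below contrasts a vague prompt with one that spells out role, task, constraints, and output format; the wording and the commented-out send_to_llm call are placeholders I have assumed, since the text does not prescribe a particular API.

```python
# A vague prompt versus a structured one; send_to_llm stands in for whatever
# model API or locally deployed model the application actually uses.
vague_prompt = "Tell me about this product."

structured_prompt = (
    "You are a marketing copywriter.\n"                      # role
    "Task: write a product description for a travel mug.\n"  # task
    "Constraints: under 80 words, friendly tone.\n"          # constraints
    "Output format: one paragraph, no bullet points."        # output format
)

# answer = send_to_llm(structured_prompt)  # placeholder for the real call
print(structured_prompt)
```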
RAG (Retrieval-Augmented Generation) is an AI technology that combines retrieval models and generative models. It enhances the answering ability of large language models (LLMs) by retrieving relevant information from a knowledge base or database and combining it with user queries. RAG technology can improve the accuracy and relevance of AI applications, especially in scenarios that deal with specific domain knowledge or require the latest information.
The working principle of RAG mainly includes two steps: first, retrieve the information most relevant to the user's query from a knowledge base or database; second, combine the retrieved content with the query and let the LLM generate the answer.
However, this threshold is relatively high, with certain requirements for computing power, data, and algorithms.
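A minimal sketch of the retrieve-then-generate flow described above, using a tiny in-memory knowledge base and simple word-overlap scoring in place of a real vector database and embedding model; the documents and function names are illustrative assumptions.

```python
# Tiny illustrative knowledge base; a real system would use embeddings
# and a vector store instead of word overlap.
knowledge_base = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping normally takes 3-5 business days within the country.",
    "Support is available by email from 9:00 to 18:00 on weekdays.",
]

def retrieve(query: str, k: int = 2) -> list:
    """Step 1: retrieve the k documents sharing the most words with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query: str) -> str:
    """Step 2: combine the retrieved context with the user query for the LLM."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

print(build_rag_prompt("How many days do I have to return a product?"))
```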
Goal: Conduct feasibility validation, design prototypes based on business requirements, and build a PromptFlow to test key assumptions
Goal: Evaluate the robustness of the solution on a wider range of datasets and enhance model performance through techniques such as supervised fine-tuning (SFT) and retrieval-augmented generation (RAG)
Goal: Ensure stable operation of the AIGC system, integrate monitoring and alerting, and achieve continuous integration and continuous deployment (CI/CD)
The main content is the text snippet that works in conjunction with the instruction and can significantly increase its effectiveness.
Specific methods for providing the main content include:
• Examples: by showing the model how to generate outputs for a given instruction, you let it infer the desired output pattern, whether in zero-shot, one-shot, or few-shot form (see the sketch after this list).
• Cues: by giving the large model hints, you guide its reasoning in a clear direction, much like providing a step-by-step outline that helps the model work toward the answer gradually.
• Templates: their value lies in creating and publishing libraries of prompts for specific application areas, where the prompt templates have already been optimized for the application's specific context or examples.
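The sketch below combines the ideas above: a few-shot prompt built from worked examples and wrapped in a reusable template function; the sentiment-classification task and the wording are assumptions chosen purely for illustration.

```python
# A reusable few-shot prompt template; the task and examples are illustrative.
FEW_SHOT_EXAMPLES = [
    ("The movie was fantastic!", "positive"),
    ("I want a refund, this is broken.", "negative"),
]

def sentiment_prompt(text: str) -> str:
    """Build a few-shot prompt: instruction, worked examples, then the new input."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for review, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {review}\nSentiment: {label}\n")
    lines.append(f"Review: {text}\nSentiment:")
    return "\n".join(lines)

print(sentiment_prompt("The packaging was damaged and support never replied."))
```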
# Job Description: Data Analysis Assistant
## Role
My main goal is to provide users with expert-level data analysis advice. Drawing on comprehensive data resources, tell me which stock you would like to analyze (provide the ticker symbol). As an expert, I will perform fundamental analysis, technical analysis, market sentiment analysis, and macroeconomic analysis for your stock.
## Skills
### Skill 1: Use Yahoo Finance's 'Ticker' to look up stock information
### Skill 2: Use 'News' to search for the latest news about the target company
### Skill 3: Use 'Analytics' to search for the target company's financial data and analysis
## Workflow
Ask the user which stocks need to be analyzed, then perform the following analyses in order:
**Part 1: Fundamental Analysis: Financial Report Analysis**
*Goal 1: Conduct an in-depth analysis of the target company's financial situation.
*Steps:
1. Identify the analysis target: