Translator | 朱先忠
Reviewer | 重樓
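Introduction
Natural language models have improved rapidly in recent years, and large language models such as GPT-3 and GPT-4 have taken center stage. These models are popular because they can perform a wide variety of tasks with remarkable skill. Moreover, as their parameter counts have grown into the billions, they have unexpectedly acquired new capabilities.
In this article, we explore LLMs (large language models): the tasks they can perform, their shortcomings, and prompt engineering strategies for a range of use cases.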
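What are LLMs?
LLMs are neural networks trained on large amounts of text data. Training lets the model learn patterns in text, including grammar, syntax, and word associations. The model then uses these learned patterns to generate human-like text, which makes it well suited to natural language processing (NLP) tasks.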
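Which LLMs are available?
Several LLMs are already in commercial use, with GPT-4 the most popular. Others include LLaMA, PaLM, BERT, and T5. Each has its own strengths and weaknesses. Some are open source; others are closed source and can only be used through an API.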
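Shortcomings of LLMs
Impressive as their performance is, LLMs have limitations. A notable one is that they cannot reason beyond the information provided in the prompt. LLMs may also generate biased text, reflecting the data they were trained on. Finally, controlling an LLM's output precisely is difficult, which is why prompt engineering strategies are needed to obtain the desired output.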
What tasks can you perform?
We can direct an LLM to perform a specific task by formatting the prompt for each use case. Below are some typical tasks, with a prompt template and the resulting output for each.
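Text summarization
LLMs can generate summaries of long texts, making the content easier to understand and digest. We can ask for a summary with the following prompt template:
Template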
<Full text>
Summarize the text above: / Explain the text above in <N> sentences:
For example, we can ask for a two-sentence explanation of the following text with this prompt:
Prompt
In a mechanical watch the watch's gears are turned by a spiral spring called a mainspring. In a manual watch energy is stored in the mainspring by turning a knob, the crown on the side of the watch. Then the energy from the mainspring powers the watch movement until it runs down, requiring the spring to be wound again.
A self-winding watch movement has a mechanism which winds the mainspring using the natural motions of the wearer's body. The watch contains an oscillating weight that turns on a pivot. The normal movements of the watch in the user's pocket (for a pocketwatch) or on the user's arm (for a wristwatch) cause the rotor to pivot on its staff, which is attached to a ratcheted winding mechanism. The motion of the watch is thereby translated into circular motion of the weight which, through a series of reverser and reducing gears, eventually winds the mainspring. There are many different designs for modern self-winding mechanisms. Some designs allow winding of the watch to take place while the weight swings in only one direction while other, more advanced, mechanisms have two ratchets and wind the mainspring during both clockwise and anti-clockwise weight motions.
Explain the text above in 2 sentences:
The LLM then returns the following output:
Output
A mechanical watch uses a mainspring to power the movement of the watch's gears. In a manual watch, the mainspring is wound by turning a knob, while a self-winding watch uses the natural motion of the wearer's body to wind the mainspring through an oscillating weight and ratcheted winding mechanism.
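The summarization template can be filled in programmatically. The following is a minimal sketch, not part of the original article: `buildSummaryPrompt` is a hypothetical helper name, and it only assembles the prompt string; sending it to a model is left out.

```javascript
// Hypothetical helper (an assumption for illustration): fills in the
// summarization template. With a positive integer sentence count it uses
// the "Explain ... in <N> sentences" form, otherwise the plain
// "Summarize" form.
function buildSummaryPrompt(fullText, sentenceCount) {
  const hasCount = Number.isInteger(sentenceCount) && sentenceCount > 0;
  const noun = sentenceCount === 1 ? 'sentence' : 'sentences';
  const instruction = hasCount
    ? `Explain the text above in ${sentenceCount} ${noun}:`
    : 'Summarize the text above:';
  // The full text comes first, the instruction on the line after it.
  return `${fullText.trim()}\n${instruction}`;
}

const watchText =
  "In a mechanical watch the watch's gears are turned by a spiral spring called a mainspring.";
console.log(buildSummaryPrompt(watchText, 2));
```

Keeping the template in one function makes it easy to change the wording in a single place when experimenting with different instructions.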
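Information extraction
LLMs can identify and extract specific information from large bodies of text, which is useful in fields such as news and business intelligence. The following template instructs the LLM to extract information:
Template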
<Full text>
Mention X in the text above: / List X in the text above:
For example, we can ask it to extract the place names from the following text:
Prompt
The earliest credible evidence for a successful design is the watch made by the Swiss watchmaker Abraham-Louis Perrelet, who lived in Le Locle. In late 1776 or early 1777, he invented a self-winding mechanism for pocket watches using an oscillating weight inside the watch that moved up and down. The Geneva Society of Arts, reporting on this watch in 1777, stated that 15 minutes walking was necessary to fully wind the watch.
List the names of places in the text above:
It returns the following output:
Output
Le Locle, Geneva.
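An extraction result like the one above usually has to be turned into structured data before an application can use it. The sketch below is an illustration only: `buildExtractionPrompt` and `parseListOutput` are hypothetical names, and the parser assumes the model answers with a comma-separated list that may end in a period, which is not guaranteed.

```javascript
// Hypothetical helper: fills in the "List X in the text above" template.
function buildExtractionPrompt(fullText, target) {
  return `${fullText.trim()}\nList ${target} in the text above:`;
}

// Hypothetical helper: splits a comma-separated answer into clean items.
// Assumes the answer is a single comma-separated line, optionally ending
// in a period; other answer shapes would need different handling.
function parseListOutput(output) {
  return output
    .trim()
    .replace(/\.$/, '') // drop one trailing period, if present
    .split(',')
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

const places = parseListOutput('Le Locle, Geneva.');
console.log(places.length); // prints: 2
```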
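Question answering
LLMs can also answer questions based on the information provided in the prompt. The following template can be used to direct an LLM to answer a question:
Template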
Answer the question based on the context below. <Other instructions>. Respond "Not sure" if not sure about the answer.
Context: <Context>
Q: <Question>
A:
For example, with the following prompt:
Prompt
Answer the question based on the context below. Answer in a single sentence. Respond "Not sure" if not sure about the answer.
Context: An automatic watch, also known as a self-winding watch or simply an automatic, is a mechanical watch where the natural motion of the wearer provides energy to wind the mainspring, making manual winding unnecessary if worn enough. It is distinguished from a manual watch in that a manual watch must have its mainspring wound by hand at regular intervals.
Q: Do automatic watches need to be manually wound?
A:
ChatGPT responds with:
Output
No, automatic watches do not need to be manually wound if worn enough because the natural motion of the wearer provides energy to wind the mainspring.
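The question-answering template has three slots (extra instructions, context, question) plus a fallback answer. As a rough sketch under assumptions, the code below assembles the prompt and detects the "Not sure" fallback; both function names are hypothetical, and the fallback check assumes the model repeats the phrase more or less verbatim.

```javascript
// Hypothetical helper: fills in the question-answering template.
function buildQaPrompt(context, question, otherInstructions = '') {
  const extra = otherInstructions ? ` ${otherInstructions.trim()}` : '';
  return [
    `Answer the question based on the context below.${extra} Respond "Not sure" if not sure about the answer.`,
    `Context: ${context.trim()}`,
    `Q: ${question.trim()}`,
    'A:',
  ].join('\n');
}

// Hypothetical helper: true when the answer is just the fallback phrase,
// ignoring case, surrounding whitespace, quotes and periods.
function isUnsure(answer) {
  return answer.trim().replace(/[."']/g, '').toLowerCase() === 'not sure';
}

const qaPrompt = buildQaPrompt(
  'An automatic watch is a mechanical watch where the natural motion of the wearer provides energy to wind the mainspring.',
  'Do automatic watches need to be manually wound?',
  'Answer in a single sentence.'
);
console.log(qaPrompt);
```

Checking for the fallback lets an application route unanswered questions elsewhere instead of showing a guess.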
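Text classification
LLMs can classify text into predefined categories, for tasks such as sentiment analysis or spam detection. A text classification instruction can be formatted as follows:
Template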
Classify the text into <List of choices>:
Text: <Full text>
Classification:
For example, we can classify a text by whether its sentiment is positive, neutral, or negative:
Prompt
Classify the text into positive, neutral or negative:
Text: That shot selection was awful
Classification:
Output
Negative
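Because the model returned "Negative" with a capital letter while the prompt listed "negative", downstream code typically needs to map the raw answer back onto the allowed choices. The following sketch shows one way to do that; `buildClassificationPrompt` and `normalizeLabel` are hypothetical names, and the matching rule (case-insensitive, trailing punctuation ignored) is an assumption.

```javascript
// Hypothetical helper: fills in the classification template, joining the
// choices as "a, b or c".
function buildClassificationPrompt(text, choices) {
  const list = choices.length > 1
    ? `${choices.slice(0, -1).join(', ')} or ${choices[choices.length - 1]}`
    : choices.join('');
  return `Classify the text into ${list}:\nText: ${text.trim()}\nClassification:`;
}

// Hypothetical helper: maps the raw model output onto one of the allowed
// choices, or returns null when nothing matches.
function normalizeLabel(output, choices) {
  const cleaned = output.trim().replace(/[.!]+$/, '').toLowerCase();
  const match = choices.find((choice) => choice.toLowerCase() === cleaned);
  return match === undefined ? null : match;
}

const sentimentChoices = ['positive', 'neutral', 'negative'];
console.log(buildClassificationPrompt('That shot selection was awful', sentimentChoices));
console.log(normalizeLabel('Negative', sentimentChoices)); // prints: negative
```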
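Conversation
LLMs can generate human-like conversation, which makes them useful in chatbots and customer service applications. We can instruct the LLM to follow the conversation in a specific way.
Template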
The following is a conversation with <Actor>. The <Actor> should answer <Further instructions>.
Person: <Conversation>
<Actor>: <Response>
Person: <Conversation>
<Actor>:
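The LLM fills in the response for the type of actor we specify, and we can continue the conversation by passing the whole conversation again, including the new response. For example:
Prompt 1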
The following is a conversation with a lawyer. The lawyer should answer simply and in layman terms.
Person: Hello, I would like to know how to dispute a parking ticket
Lawyer:
Output 1
Sure, I can help. The first step is to check the ticket for any errors or mistakes. If you find any, you can use those as grounds to dispute the ticket. If there are no mistakes, you can still dispute the ticket by providing evidence to support your case, such as photos or witness statements. You can also request a hearing to present your case in person.
Prompt 2
The following is a conversation with a lawyer. The lawyer should answer simply and in layman terms.
Person: Hello, I would like to know how to dispute a parking ticket
Lawyer: Sure, I can help. The first step is to check the ticket for any errors or mistakes. If you find any, you can use those as grounds to dispute the ticket. If there are no mistakes, you can still dispute the ticket by providing evidence to support your case, such as photos or witness statements. You can also request a hearing to present your case in person.
Person: How do I request a hearing?
Lawyer:
Output 2
To request a hearing, you'll need to follow the instructions on the back of the ticket or on the website provided. Usually, you'll need to submit a written request within a certain timeframe, along with any evidence or supporting documents you have. Once your request is received, you'll be notified of the date, time, and location of your hearing.
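The two prompts above differ only in how much history they carry: each new prompt repeats the header and every earlier turn, then ends with the actor's empty label. A minimal sketch of that bookkeeping follows. The function name and the `fromActor` field are hypothetical, the header assumes the article "a" fits the actor name, and the model reply is a placeholder rather than real output.

```javascript
// Hypothetical helper: rebuilds the full conversation prompt from the
// stored turns, ending with the actor's label so the model answers next.
function buildConversationPrompt(actor, instructions, turns) {
  const label = actor.charAt(0).toUpperCase() + actor.slice(1);
  const header =
    `The following is a conversation with a ${actor}. ` +
    `The ${actor} should answer ${instructions}.`;
  const lines = turns.map(
    (turn) => `${turn.fromActor ? label : 'Person'}: ${turn.text}`
  );
  return [header, ...lines, `${label}:`].join('\n');
}

// First request: only the person's opening message.
const turns = [
  { fromActor: false, text: 'Hello, I would like to know how to dispute a parking ticket' },
];
const firstPrompt = buildConversationPrompt('lawyer', 'simply and in layman terms', turns);

// After the model replies, store the reply and the follow-up question,
// then rebuild the prompt with the whole history.
turns.push({ fromActor: true, text: '<model reply goes here>' });
turns.push({ fromActor: false, text: 'How do I request a hearing?' });
const secondPrompt = buildConversationPrompt('lawyer', 'simply and in layman terms', turns);
console.log(secondPrompt);
```

Since the whole history is resent on every turn, the prompt grows with the conversation; long conversations may need older turns trimmed or summarized to stay within the model's input limit.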
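Code generation
LLMs can generate program code, making it easier for developers to build complex software. We can write a code comment describing what we want the code to do, and the LLM will do its best to implement it. The more complex the task, the more specific and descriptive the prompt comment needs to be.
The template may vary with the target programming language, but I suggest roughly following this one:
Template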
<Comment symbol of your language of choice (/* or ''' or #)>
Method: <Name of the method>
Language: <Programming language>
Description: <Description of what the method must do>
Parameters: <Parameters that the method will receive>
Context: <Further information needed to implement the method (URLs, databases, etc.)>
Output: <Output of the method>
<Comment symbol of your language of choice (/* or ''' or #)>
For example, to fetch pages from a paginated API, we could write the following comment:
Prompt
/* Method: fetchFirst5Pages
* Language: Javascript
* Description: Fetch the first 5 pages from this paginated API: http://example.com/pages/?n=x
* Context: This API accepts the following URL parameters:
* n: the number of the page to fetch
* Output: the content of the first 5 pages
*/
Output
Here's an example implementation of the fetchFirst5Pages method in JavaScript, assuming the use of the fetch function:
async function fetchFirst5Pages() {
  const baseUrl = 'http://example.com/pages/?n=';
  const pages = [];
  for (let i = 1; i <= 5; i++) {
    const url = baseUrl + i;
    const response = await fetch(url);
    if (!response.ok) {
      throw new Error(`Error fetching page ${i}`);
    }
    const pageData = await response.json();
    pages.push(pageData);
  }
  return pages;
}
This function uses a loop to fetch the first 5 pages from the API by appending the page number to the base URL. It checks if the response from the server is okay and throws an error if it's not. Then, it parses the JSON data from the response and adds it to an array of pages. Finally, it returns the array of pages.
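The comment-style prompt itself can be generated from structured fields, which helps keep prompts consistent across many methods. The sketch below is an assumption-laden illustration: `buildCodePrompt` is a hypothetical helper, it only covers the `/* ... */` comment style, and it simply skips any template field that is not supplied.

```javascript
// Hypothetical helper: renders the code-generation template as a
// C-style block comment. Fields that are missing or empty are skipped;
// multi-line values continue on their own comment lines.
function buildCodePrompt(fields) {
  const order = ['Method', 'Language', 'Description', 'Parameters', 'Context', 'Output'];
  const lines = [];
  for (const name of order) {
    const value = fields[name];
    if (!value) continue; // field not provided
    const [first, ...rest] = String(value).split('\n');
    lines.push(`${name}: ${first}`, ...rest);
  }
  return '/* ' + lines.join('\n * ') + '\n */';
}

console.log(
  buildCodePrompt({
    Method: 'fetchFirst5Pages',
    Language: 'Javascript',
    Description: 'Fetch the first 5 pages from this paginated API: http://example.com/pages/?n=x',
    Context: 'This API accepts the following URL parameters:\nn: the number of the page to fetch',
    Output: 'the content of the first 5 pages',
  })
);
```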
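Reasoning
LLMs can struggle with some basic tasks, such as arithmetic problems. However, if we ask the LLM to reason about its response, we can help it reach the correct answer. This is the template I recommend:
Template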
Q: <Question>
A: Let's work this out in a step by step way to be sure we have the right answer. First, <Instructions>.
In this example, we can see that a simple arithmetic question fails when we ask it without requesting any reasoning:
Prompt 1
Answer the following question. Respond "Not sure" if not sure about the answer.
Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: [Yes or No]
Output 1
No. The sum of prime numbers in the list [2, 4, 6, 7, 9, 14] is not a multiple of 3. Therefore, the answer is "No".
However, if we instruct the LLM to reason about its answer, we get a very different result:
Prompt 2
Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: Let's work this out in a step by step way to be sure we have the right answer. First, we find the prime numbers in the list and add them together. Then we check whether the resulting number can be divided by 3.
Output 2
Sure, let's work this out step by step.
The prime numbers in the list are 2 and 7. So we can add them together:
2 + 7 = 9
The resulting number is 9. Now we need to check whether it is a multiple of 3. We can do this by checking if the remainder of 9 divided by 3 is equal to 0:
9 % 3 = 0
Since the remainder is 0, we know that 9 is indeed a multiple of 3.
Therefore, the answer is "Yes", the sum of prime numbers in the list [2, 4, 6, 7, 9, 14] is a multiple of 3.
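When a prompt involves arithmetic, it is worth checking the model's answer independently. The short sketch below recomputes the result for the list used above; the function names are illustrative and not from the original article.

```javascript
// Trial-division primality test; integers below 2 are not prime.
function isPrime(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) return false;
  }
  return true;
}

// Adds up only the prime entries of the list.
function sumOfPrimes(numbers) {
  return numbers.filter(isPrime).reduce((total, n) => total + n, 0);
}

const numbers = [2, 4, 6, 7, 9, 14];
const total = sumOfPrimes(numbers); // primes are 2 and 7
console.log(total, total % 3 === 0); // prints: 9 true
```

The check confirms that the step-by-step answer ("Yes") is the correct one and that the first, unreasoned answer ("No") was wrong.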
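Prompting strategies
While these templates help you get the most out of LLMs for your needs, LLMs can still struggle with certain complex tasks. Researchers have developed techniques that help LLMs reach satisfactory answers. So far we have only used zero-shot prompting, which we define next, but more sophisticated techniques are available.
Zero-shot prompting
All the examples so far have used zero-shot prompting. Zero-shot prompting means passing the LLM a prompt that contains only the question we want answered, with no examples of how it should respond in other cases. For instance, in an earlier example we did this:
Prompt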
Classify the text into positive, neutral or negative:
Text: That shot selection was awful
Classification:
Output
Negative
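This may not be enough for the model to answer exactly the way we want. Let's look at the alternatives.
Few-shot prompting
Unlike zero-shot prompting, with few-shot prompting we give the LLM a few examples of how it should respond before asking the question. Suppose we want to classify the sentiment of a text as in the earlier example, but we want the answer in a very specific format every time. We can supply a few examples up front:
Prompt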
Classify the text into positive, neutral or negative:
Text: Today the weather is fantastic
Classification: Pos
Text: The furniture is small.
Classification: Neu
Text: I don't like your attitude
Classification: Neg
Text: That shot selection was awful
Classification:
Output
Neg
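A few-shot prompt is the instruction, followed by the worked examples, followed by the new input with its answer slot left empty. The sketch below assembles one from a list of labeled examples; `buildFewShotPrompt` is a hypothetical helper name, and the example objects use an assumed `{ text, label }` shape.

```javascript
// Hypothetical helper: builds a few-shot classification prompt from
// labeled examples plus the new text to classify.
function buildFewShotPrompt(instruction, examples, newText) {
  const shots = examples.map(
    (example) => `Text: ${example.text}\nClassification: ${example.label}`
  );
  // The final entry leaves "Classification:" empty for the model to fill.
  return [instruction, ...shots, `Text: ${newText}\nClassification:`].join('\n');
}

const fewShotPrompt = buildFewShotPrompt(
  'Classify the text into positive, neutral or negative:',
  [
    { text: 'Today the weather is fantastic', label: 'Pos' },
    { text: 'The furniture is small.', label: 'Neu' },
    { text: "I don't like your attitude", label: 'Neg' },
  ],
  'That shot selection was awful'
);
console.log(fewShotPrompt);
```

The labels in the examples ("Pos", "Neu", "Neg") are what steer the model toward the abbreviated answer format, so they should be written exactly as the application expects to receive them.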
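Chain-of-thought prompting
For more complex tasks, we can combine few-shot prompting with having the model reason about its answer. We can reuse the earlier example to see how reasoning and few-shot prompting affect the result:
Prompt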
Q: The sum of prime numbers in the following list are a multiple of 3: [2, 4, 6, 7, 9, 14]
A: The prime numbers in the list are 2 and 7. Their sum is 9. 9 % 3 = 0. The answer is yes.
Q: The sum of prime numbers in the following list are a multiple of 3: [1, 2, 4, 7, 8, 10]
A: The prime numbers in the list are 2 and 7. Their sum is 9. 9 % 3 = 0. The answer is yes.
Q: The sum of prime numbers in the following list are a multiple of 3: [1, 2, 3, 4]
A: The prime numbers in the list are 2 and 3. Their sum is 5. 5 % 3 = 2. The answer is no.
Q: The sum of prime numbers in the following list are a multiple of 3: [5, 6, 7, 8]
A:
Output
The prime numbers in the list are 5 and 7. Their sum is 12. 12 % 3 = 0. The answer is yes.
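Chain-of-thought examples only help if the reasoning they demonstrate is itself correct, so one option is to generate the worked answers with code instead of writing them by hand. The sketch below does this for the prime-sum question; all function names are hypothetical, and the helper follows the standard definition under which 1 is not a prime number.

```javascript
// Trial-division primality test; integers below 2 (including 1) are not prime.
function isPrime(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  for (let d = 2; d * d <= n; d++) {
    if (n % d === 0) return false;
  }
  return true;
}

// Formats [2, 7] as "2 and 7" and [2, 3, 5] as "2, 3 and 5".
function formatList(items) {
  if (items.length <= 1) return items.join('');
  return `${items.slice(0, -1).join(', ')} and ${items[items.length - 1]}`;
}

// Produces the worked reasoning for one list in the style of the examples.
function reasonAboutList(numbers) {
  const primes = numbers.filter(isPrime);
  const sum = primes.reduce((total, n) => total + n, 0);
  const answer = sum % 3 === 0 ? 'yes' : 'no';
  return `The prime numbers in the list are ${formatList(primes)}. ` +
    `Their sum is ${sum}. ${sum} % 3 = ${sum % 3}. The answer is ${answer}.`;
}

// Builds the few-shot chain-of-thought prompt: solved examples first,
// then the new question with an empty answer.
function buildChainOfThoughtPrompt(exampleLists, newList) {
  const question = (list) =>
    `Q: The sum of prime numbers in the following list are a multiple of 3: [${list.join(', ')}]`;
  const shots = exampleLists.map((list) => `${question(list)}\nA: ${reasonAboutList(list)}`);
  return [...shots, `${question(newList)}\nA:`].join('\n');
}

console.log(buildChainOfThoughtPrompt([[2, 4, 6, 7, 9, 14], [1, 2, 3, 4]], [5, 6, 7, 8]));
```

The same `reasonAboutList` function can also be used afterwards to compare the model's answer for the new list against the computed one.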
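Conclusion
Large language models have transformed the field of natural language processing, but getting the most out of them depends on choosing the right prompt engineering strategy. By understanding the tasks LLMs can perform, their shortcomings, and the available prompt engineering strategies, developers can harness LLMs to build innovative and effective solutions.
More strategies and techniques will likely be developed in the near future, so keep an eye on the field to get the most out of LLMs. And as LLMs continue to grow by billions of additional parameters, there will probably be more tasks that we cannot even imagine today. It is astonishing to think about what these new tools will make possible and which use cases they will serve in the future.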
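About the translator
朱先忠 is a 51CTO community editor, a 51CTO expert blogger and lecturer, a computer science teacher at a university in Weifang, and a veteran freelance programmer.
Original title: Maximizing the Potential of LLMs: A Guide to Prompt Engineering, by Roger Oriol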