A Business Guide to Tuning Language AI Part 2 | Written by Dr. Georg Reile

There are many prompting techniques and a large body of scientific literature benchmarking their effectiveness. Here I would like to introduce some well-known concepts. Once you understand the overall idea, I think you'll be able to expand your repertoire of prompts and even develop and test new techniques on your own.

ask and it will be given to you

Before I get into the concept of specific prompts, I want to emphasize a general idea that I don't think can be stressed enough.

The quality of the prompt greatly influences the model's response.

And quality here does not necessarily mean sophisticated and quick construction. It refers to the basic idea of asking precise questions, giving well-structured instructions, and providing the necessary context. I already touched on this in my last article when I met Sam the piano player. If you ask a piano player at a bar to play a random jazz song, chances are he won't play what you had in mind. Instead, ask exactly what you want to hear, and you'll likely be more satisfied with the outcome.

Similarly, if you have the opportunity to hire someone to do something around your house, and the contract specifications only say, for example, “bathroom renovation,” you may end up having to do your own bathroom. You may be surprised to find that it is different from what you expected. I had it in mind. Just like the model, the contractor uses only what he has learned about renovations and bathroom tastes as a reference, and delivers the product using the route he has learned.

here are some General guidelines for prompts:

· Be clear and specific.

· Be complete.

· Provide context.

– Specify the desired output format, length, etc.

This way, the model has enough matching reference data in the prompt that it can relate to when generating a response.

Roleplay prompts — simple but overrated

In the early days of ChatGPT, the idea of role-play prompts was popular. Instead of asking your assistant for immediate answers (i.e., simple questions), first assign your assistant a specific role, such as “teacher” or “consultant.” ” etc. Such a prompt would look like this: [2]:

From now on, you will become an excellent math teacher and always teach your students math problems correctly. And I'm also one of your students.

This concept has been shown to yield excellent results. One paper reports that through this role play, the model implicitly triggers a step-by-step inference process. This is what we want the model to do when applying the CoT technique (see below). However, this approach has also been shown to: Performs suboptimal performance at times And it needs to be designed properly.

In my experience, simply assigning roles doesn't work. I tried the example task from the paper referenced above. Unlike this study, GPT3.5 (currently the free version of OpenAI's ChatGPT, so you can try it out for yourself) produced correct results using a simple query.

An example of using simple queries instead of the role-play prompts suggested in [2]I still get the correct response

We also experimented with a variety of logical challenges, both simple queries and role-plays, using similar prompts as above. Two things happen in my experiment:

which one A simple query will give you the correct answer on the first tryor

both Simple queries and roleplays give false results, but the answers are different

Roleplay did not perform well A query in my simple (scientifically incorrect) experiment. Therefore, we conclude that the model must have been recently improved. The impact of roleplay prompts has been reduced.

After looking at various research results and without doing more extensive experimentation of our own, we believe that role-playing prompts need to be embedded in queries to outperform simple queries. Sound and thoughtful design Either it performs better than the most basic approach or it's not worth it at all.

I'd be happy to read your experience with this in the comments below.

Fewshot learning, also known as in-context learning

Another intuitive and relatively simple concept is called few-shot prompts, also known as in-context learning. Unlike zero-shot prompts, you don't just ask the model to perform a task and expect it to be delivered. Provide (a “few”) examples of solutions. It may seem obvious that providing examples improves performance, but this is a very noteworthy ability. These LLMs can be studied in context. This means that you can perform new tasks just by conditioning on some input and label pairs and by inference alone.Predicting new input [3].

To set up a few shots prompt:

(1) Gather examples of desired responsesand
(2) Write the prompt as follows An explanation of what to do with these examples.

Let's look at a typical classification example. Here the model is given several examples of statements with either positive, neutral, or negative judgments. The model's task is to evaluate the final statement.

A typical classification example of a Few-Shot prompt. The model must classify the statement into a specified category (positive/negative).

Again, although this is a simple and intuitive approach, I am skeptical of its value in state-of-the-art language models. In my (again, not scientifically correct) experiment, In both cases, Few-Shot prompts do not outperform Zero-Shot.. (The model already knew, even without me teaching him, that a drummer who is not punctual is a negative experience…). My findings seem consistent with recent research that even has the opposite effect (Zero shot outperforms few shot) is shown [4].

Based on my opinion and this empirical background, it's worth considering whether the design, computational, API, and latency costs of this approach are worth the investment.

CoT-Prompting or “Think Step by Step”

Chain of Thought (CoT) Prompting aims to enable models to better solve complex multi-step reasoning problems. Simply adding her CoT instruction “Let's think about it step by step” to the input query will significantly improve accuracy. [5][6].

Rather than simply specifying the final query or adding one or a few examples within the prompt, as in the Fewshot approach, we ask the model to: Break down the inference process into a series of intermediate steps. This is similar to how humans (ideally) approach difficult problems.

Remember your school math exams? In more advanced classes, you are often asked not only to solve math equations, but also to write down the logical steps of how you arrived at your final solution. there was. Even if you got it wrong, you might have earned some credit with a mathematically sound solution procedure. Just like a school teacher, we expect the model to break down the task into subtasks, perform intermediate inferences, and arrive at a final answer.

Again, I've been experimenting with CoT quite a bit myself.Again, in most cases you can simply add “Let's think about it step by step'' did not improve the quality of answers.. in fact, CoT approach has become an implicit standard Modern, fine-tuned chat-based LLMs like ChatGPT often split the response into chunks of inference in the absence of explicit commands.

However, we did find one instance where an explicit CoT was applied. The command actually improved the answer a lot. I used his CoT example from this article, but changed it to a trick question. Here you can see how ChatGPT fell into my trap when I was not explicitly asked for a CoT approach (although the response shows step-by-step reasoning).

Trick questions using simple queries instead of CoT prompts. Even if the response is broken down “step by step”, it is not completely correct.

Adding “Think about it step by step” to the same prompt correctly solved the trick question (well, it's unsolvable, but ChatGPT correctly pointed out that).

The same trick question with an explicit CoT prompt returns the correct response.

In summary, the purpose of thought chain prompts is to build reasoning skills that are difficult for language models to learn implicitly. This encourages the model to clarify and refine its reasoning process, rather than trying to jump directly from question to answer.

Again, my experiments revealed that Simple CoT approach has limited benefits (Added “Let's think about it step by step”) CoT showed better performance than simple queries at some pointAt the same time, the extra effort of adding CoT commands is minimal.this Cost-effectiveness This is one of the reasons why this approach is one of my favorites. Another reason I personally like this approach is that it not only helps the model; Please help us humans reflect. You can also iterate through the necessary inference steps as you create your prompts.

As before, the benefits of this simple CoT approach may diminish as your model becomes increasingly fine-tuned and you become more familiar with this inference process.

In this article, we've journeyed through the world of chat-based large-scale language model prompts. Rather than simply introducing the most common prompting techniques, we recommend starting by asking why prompting is important. Along this journey, we discovered that prompts have become less important due to the evolution of the model. Instead of asking users to invest in continually improving their prompting skills, the currently evolving model architecture is likely to make it even less relevant.Ann agent-based frameworkincluding the different “routes” taken while processing a particular query or task.

However, this does not mean that Be clear and specific and provide the necessary context within the prompt Not worth the effort. On the contrary, I strongly support this. Because this will help not only the model, but also yourself, understand exactly what you are trying to achieve.

As with human communication, multiple factors determine the appropriate approach to achieve the desired outcome. Combining and iterating different approaches often yields the best results for a particular context. Try, test, repeat.

And finally, unlike human interaction, a process of personal trial and error allows for virtually unlimited testing. Enjoy the ride!

Source link