
Changing the order of sentences in a prompt changes the quality of the output. #178

irthomasthomas opened this issue Dec 29, 2023 · 0 comments

irthomasthomas commented Dec 29, 2023

Prompt Ordering Experiment: Impact on Linux Terminal Command Outputs

Overview

This experiment investigates how the ordering of sentences in a prompt affects the output quality when interacting with a language model designed to generate Linux terminal commands. The model is instructed to respond with valid commands for a Manjaro (Arch) Linux system, considering the latest information up to the knowledge cutoff in 2023.

Methodology

The same instruction text was given to the language model in different orders to observe how the generated outputs vary. The primary task was to write a bash command that prints the machine's local IP address. The prompts placed the task description and the system context in different positions relative to one another.
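The procedure can be sketched as a small shell loop. `ask_model` here is a hypothetical stand-in, stubbed below, for whatever client actually sends the prompt to the model; the context and task strings are abbreviated versions of the ones used in the experiment.

```shell
# Hypothetical stand-in for the real model client; it only echoes the prompt
# it receives, so the ordering logic itself can be run and inspected.
ask_model() { printf 'model <- %s\n' "$1"; }

context="You are a linux terminal assistant. The system is Manjaro (Arch) Linux."
task="Write a bash terminal command to check my local ip."

ask_model "$context $task"   # Prompt 1: context first, then task
ask_model "$task"            # Prompt 2: task only, no system context
ask_model "$task $context"   # Prompt 3: task first, context appended
```

With a real client substituted for the stub, the three calls reproduce the three orderings compared below.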

Results

The following prompt-response pairs were generated during the experiment:

Prompt 1

"You are a linux terminal assistant. Always respond in valid Linux terminal commands.
 The system is Manjaro (Arch) Linux, Kernel 6.0, KDE Plasma desktop 5.2, Python 3.11 Think it through carefully. 
 Be concise, but include much info. Reply with the full terminal cmd string. 
 In problem solving, always use latest info up to your knowledge cutoff.
Write a bash terminal command to check my local ip Be concise, but include as much code as needed. Do not write anything else. Ever."

Response 1

ip addr show | grep inet | awk '{print $2}' | grep -v '127.0.0.1'

Prompt 2

"Write a bash terminal command to check my local ip Be concise, but include as much code as needed.
 Do not write anything else. Ever."

Response 2

ipconfig | grep "IPv4" | awk '{print $2}'

Prompt 3

"Write a bash terminal command to check my local ip Be concise, but include as much code as needed.
 Do not write anything else. Ever.

You are a linux terminal assistant. Always respond in valid Linux terminal commands.
 The system is Manjaro (Arch) Linux, Kernel 6.0, KDE Plasma desktop 5.2, Python 3.11
  Think it through carefully. Be concise, but include much info. Reply with the full terminal cmd string.
   In problem solving, always use latest info up to your knowledge cutoff.
"

Response 3

ip addr show | grep "inet " | grep -v 127.0.0.1 | awk '{print $2}' | cut -d '/' -f1

Analysis

The experiment demonstrates that the ordering of sentences within the prompt can lead to different outcomes. Notably, Response 2 uses ipconfig, a Windows networking utility that does not exist on the specified Linux system, which suggests that without immediate context about the operating system the model defaulted to the wrong platform.
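The wrong-platform choice is easy to confirm locally (a quick sanity check, not part of the original experiment): `ipconfig` should not resolve on a Linux PATH, which the POSIX `command -v` builtin can probe.

```shell
# Probe the PATH for the command Response 2 used. On Linux, `ipconfig`
# (a Windows tool) should not be found; `command -v` prints nothing and
# returns non-zero for an unknown command name.
if command -v ipconfig > /dev/null 2>&1; then
  echo "ipconfig: available"
else
  echo "ipconfig: not found (Windows-only utility)"
fi
```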

In contrast, when the system context was included, whether before the task (Prompt 1) or after it (Prompt 3), the model consistently generated appropriate commands for a Linux environment, though the two orderings still produced different pipelines (Response 3 additionally strips the /prefix length). This indicates that the model's output is sensitive to the structure of the prompt, and that making the system context explicit steers it toward the correct domain.
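Response 3's filter chain can be exercised without root or iproute2 by replaying it over a captured-style sample of `ip addr show` output (the addresses below are made up for illustration):

```shell
# Sample in the shape of `ip addr show` output, with hypothetical addresses,
# so the parsing stage of Response 3 can be tested in isolation.
sample='1: lo: <LOOPBACK,UP,LOWER_UP>
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP>
    inet 192.168.1.42/24 brd 192.168.1.255 scope global eth0'

# Same chain as Response 3: keep IPv4 lines, drop loopback, take the
# address field, strip the /prefix-length suffix.
echo "$sample" | grep "inet " | grep -v 127.0.0.1 | awk '{print $2}' | cut -d '/' -f1
```

Run against live `ip addr show` output, the same chain prints the machine's non-loopback IPv4 address(es).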

Conclusion

The ordering and presence of information in a prompt can significantly affect the quality of a language model's output. For tasks requiring specific contextual knowledge, such as generating Linux terminal commands, supplying the relevant system context steered the model to the correct domain in this experiment regardless of its position, while omitting it entirely produced a command for the wrong operating system; the two context placements also yielded noticeably different pipelines.

Recommendations

  • When interacting with language models for technical tasks, make the relevant domain context explicit in the prompt, and test which placement works best for your model; in this experiment, omitting the context produced a wrong-platform command.
  • Further experiments could explore the impact of prompt ordering on different types of tasks and domains to generalize these findings.
