LLM Engineering Cheatsheet

A timeless guide to thinking and building like a prompt engineer. This cheatsheet focuses on core principles and patterns that apply across any model, provider, or tool — whether you're using OpenAI, Claude, Llama, or something that doesn't exist yet.

This is not a cookbook or quickstart. It's a mindset guide — built for those who want to reason clearly and build reliably with LLMs.


Core Philosophy

LLMs are probabilistic next-token predictors, not deterministic logic machines. Prompt engineering is about:

  • Designing clear, structured inputs
  • Working within context and token limits
  • Thinking iteratively, not magically
  • Debugging failures like a system, not like a mystery

Treat prompts as interfaces, not incantations.


Prompting Patterns (Universal)

Zero-Shot

Ask the model to do a task with no examples.

"Summarize the following article in 3 bullet points: ..."

One-Shot / Few-Shot

Give one or more examples to improve reliability.

Review: "Great product, but shipping was late."
Response: "Thanks for your feedback! Sorry about the delay..."

Review: "Terrible quality."
Response: "We're sorry to hear that. Could you share more details so we can improve?"
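
A minimal sketch of this pattern through a chat API, assuming the OpenAI Python client used at the end of this guide; prior user/assistant turns serve as the worked examples.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Few-shot: earlier user/assistant turns act as worked examples;
# the final user turn is the input we actually want answered.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You write brief, empathetic replies to product reviews."},
        {"role": "user", "content": 'Review: "Great product, but shipping was late."'},
        {"role": "assistant", "content": "Thanks for your feedback! Sorry about the delay..."},
        {"role": "user", "content": 'Review: "Terrible quality."'},
    ],
)
print(response.choices[0].message.content)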

Role-Based Prompting

Set a role for the model to adopt.

System: You are a technical support agent who speaks clearly and concisely.
User: My internet keeps cutting out. What should I do?

Constrained Output

Ask for output formats explicitly.

"List the steps as JSON: [step1, step2, step3]"

Sampling Controls

Temperature controls randomness. Range: 0.0 to 2.0 in OpenAI's API (some providers cap it at 1.0).

  • 0.0 → deterministic, repeatable
  • 0.2 – 0.5 → reliable tasks (Ng recommendation)
  • 0.7 → balanced (OpenAI default)
  • 1.0+ → creative, less reliable

Use top_p (nucleus sampling) as an alternative way to shape randomness; common guidance is to adjust temperature or top_p, not both.
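
A minimal sketch of setting these controls with the OpenAI Python client (the same client as the example at the end of this guide); the parameter values are illustrative.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize the following article in 3 bullet points: ..."}],
    temperature=0.3,  # low temperature -> reliable, mostly repeatable output
    # top_p=0.9,      # nucleus sampling; tune this OR temperature, not both
)
print(response.choices[0].message.content)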


Prompt Structure: The Anatomy

Always structure prompts with these components:

  1. Role – Who is the model?
  2. Task – What do you want?
  3. Input – What information does the model need?
  4. Constraints – What form should the output take?
  5. Examples (optional) – Show what success looks like

Example Prompt (all parts applied)

System: You are a helpful travel assistant that gives concise city guides.
User: I’m visiting Tokyo for 3 days. Suggest an itinerary with 3 activities per day.
Constraints:
- Format your response as bullet points grouped by day.
- Keep each activity description under 20 words.
Example:
Day 1:
- Visit Meiji Shrine in the morning
- Eat sushi at Tsukiji Market
- Explore Shibuya Crossing at night
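
As a rough sketch, the same anatomy maps directly onto chat messages; build_messages below is a hypothetical helper written for this guide, not a library function.

# Hypothetical helper: compose the five parts into chat messages.
def build_messages(role, task, input_text, constraints, examples=None):
    user_parts = [input_text, task, "Constraints:", constraints]
    if examples:
        user_parts += ["Example:", examples]
    return [
        {"role": "system", "content": role},
        {"role": "user", "content": "\n".join(user_parts)},
    ]

messages = build_messages(
    role="You are a helpful travel assistant that gives concise city guides.",
    task="Suggest an itinerary with 3 activities per day.",
    input_text="I'm visiting Tokyo for 3 days.",
    constraints="- Bullet points grouped by day.\n- Each activity under 20 words.",
    examples="Day 1:\n- Visit Meiji Shrine in the morning",
)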

Context Management

  • Be aware of token limits (e.g. 4k, 8k, 128k)
  • Use summarization for long chat histories
  • Drop irrelevant history when possible
  • Explicit > implicit — don't assume the model remembers everything
  • Fewer tokens = faster responses and lower cost
  • Use tools like the OpenAI Tokenizer (or the tiktoken library, sketched below) to inspect prompt size
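
A minimal token-counting sketch using OpenAI's tiktoken library (pip install tiktoken); it assumes a recent version that knows gpt-4o's encoding, with a fallback for models it doesn't.

import tiktoken

def count_tokens(text: str, model: str = "gpt-4o") -> int:
    try:
        encoding = tiktoken.encoding_for_model(model)
    except KeyError:
        # Unknown model: fall back to a general-purpose encoding.
        encoding = tiktoken.get_encoding("o200k_base")
    return len(encoding.encode(text))

prompt = "Summarize the following article in 3 bullet points: ..."
print(count_tokens(prompt))  # trim or summarize context when this gets large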

Mindsets That Scale

The best results come when you treat LLMs as tools that augment your thinking, not replace it.

1. Use AI for Fast, Focused Tasks

AI thrives when the task is something a human could do in a second or two — things like renaming files, summarizing short content, or generating scaffolding. Don’t force it to solve problems that are too vague or complex. Break hard problems into small, clear ones. That’s when AI shines.

This is aligned with Andrew Ng’s “One-Second Rule” — tasks that a human can perform in under one second are great candidates for automation.

2. Prioritize Accuracy, But Accept Imperfection

When you evaluate a model's performance, accuracy is key — but perfect accuracy is not realistic. Ambiguity, nuance, and subjectivity are baked into language. Instead of aiming for 100%, aim for consistent and explainable behavior, and iterate. Treat errors as feedback loops, not failures.
(In practice, many real-world tasks operate safely with 70–90% accuracy — just make sure you know your risk tolerance.)

Andrew Ng emphasizes setting high but achievable accuracy standards, and treating improvement as an ongoing process.

3. You and the LLM Are Partners

Don't outsource your learning. Let the LLM guide, question, and collaborate — not do everything for you. Ask it for scaffolding, instructions, or alternatives, then build it yourself. That way, you stay in control and deepen your understanding. If you can't maintain the code later, you're not really building.

Experts recommend a human-in-the-loop mindset where you learn with the AI, not through it.


Evaluation Principles

LLM output is fuzzy. Define quality like this (a minimal regression check is sketched after the list):

  • Does it meet the task objective?
  • Is the output formatted correctly?
  • Would a human say it's reasonable?
  • Can you detect regressions with A/B comparisons?
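
As an illustration of the last point, a minimal A/B regression check; the pass criterion here (exactly 3 bullet points) is invented for the example and should be replaced with your own task objective.

# Score two prompt variants against the same simple, explainable criterion.
def passes(output: str) -> bool:
    bullets = [line for line in output.splitlines() if line.strip().startswith("-")]
    return len(bullets) == 3

def score(outputs: list[str]) -> float:
    return sum(passes(o) for o in outputs) / len(outputs)

variant_a = ["- a\n- b\n- c", "- a\n- b"]       # outputs from prompt A
variant_b = ["- a\n- b\n- c", "- a\n- b\n- c"]  # outputs from prompt B
print(score(variant_a), score(variant_b))       # 0.5 1.0 -> B wins; rerun to catch regressions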

Perspective Shift

LLM success isn’t about perfection — it’s about clarity, consistency, and feedback loops.
Reliable > perfect. Iterate like a product, not like a test.


Common Failure Modes

Symptom          Likely Cause
---------------  ------------------------------------------
Hallucination    Vague or underspecified prompts
Repetition       Poor constraints or unclear output format
Refusal          Misalignment between task and role
Loss of context  Too much history or poor summarization

Debugging Checklist (When Output Fails)

  • ❓ Is the task clearly stated?
  • 🧩 Are you using the right prompting pattern?
  • 🧠 Is there a clear role and structure?
  • 🧱 Could the context window be too full?
  • 📣 Try asking the model: “Why did you respond this way?”

Structured Output & Tool Use (Advanced)

Some models support function calling or structured tool use — great for API responses or JSON output.

Example: OpenAI’s tools / function calling, or response_format={"type": "json_object"} for JSON mode (sketched below)
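
A minimal sketch of JSON mode with the OpenAI Python client; OpenAI requires the word "JSON" to appear somewhere in the messages when this mode is enabled.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# JSON mode constrains the model to emit a syntactically valid JSON object.
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply only with a JSON object."},
        {"role": "user", "content": 'List the steps to make tea as JSON: {"steps": [...]}'},
    ],
)
print(response.choices[0].message.content)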



Minimal Python Example

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a concise technical writer."},
        {
            "role": "user",
            "content": "Explain what a vector database is in simple terms.",
        },
    ],
    temperature=0.3,  # Lower = more deterministic
)

print(response.choices[0].message.content)

Final Thought

This guide helps you stay grounded when everything else is changing. Focus on clarity. Prompt with intent. And always think like an engineer.