Blog #271

Merged
merged 3 commits into from
Dec 6, 2024
28 changes: 9 additions & 19 deletions frontend/app/blog/[slug]/page.tsx
@@ -40,14 +40,9 @@ export default async function BlogPostPage({ params }: { params: { slug: string
 return (
 <>
 <LandingHeader hasSession={session !== null && session !== undefined} />
-<div className="mt-32 h-full flex justify-center">
-{/* <div className="w-1/4 flex justify-end">
-<Link href="/blog" className="text-secondary-foreground hover:text-primary h-0">Back to all posts</Link>
-</div> */}
-<article className="flex flex-col z-30 py-16 md:w-[1000px] w-full px-8 md:px-0">
-{/* <ScrollArea className="h-full flex-grow w-full mx-auto bg-background px-16">
-<div className="h-0"> */}
-<BlogMeta data={data} />
+<div className="mt-48 h-full flex justify-center flex-col items-center">
+<BlogMeta data={data} />
+<article className="flex flex-col z-30 md:w-[700px] w-full px-8 md:px-0">
 <div className="pt-4 pb-48">
 <MDXRemote
 source={content}
@@ -56,26 +51,21 @@ export default async function BlogPostPage({ params }: { params: { slug: string
 h2: (props) => <MDHeading props={props} level={1} />,
 h3: (props) => <MDHeading props={props} level={2} />,
 h4: (props) => <MDHeading props={props} level={3} />,
-p: (props) => <p className="py-2 text-secondary-foreground" {...props} />,
-a: (props) => <a className="text-primary underline" target="_blank" rel="noopener noreferrer" {...props} />,
+p: (props) => <p className="py-2 text-white/85" {...props} />,
+a: (props) => <a className="text-white underline" target="_blank" rel="noopener noreferrer" {...props} />,
 blockquote: (props) => <blockquote className="border-l-2 border-primary pl-4 py-2" {...props} />,
 // codeblock
 pre: (props) => <PreHighlighter className="pl-4 py-4" {...props} />,
 // inline code
-code: (props) => <span className="text-sm bg-secondary text-primary font-mono px-0.5" {...props} />,
-ul: (props) => <ul className="list-disc pl-4 text-secondary-foreground" {...props} />,
-ol: (props) => <ol className="list-decimal pl-4 text-secondary-foreground" {...props} />,
-img: (props) => <img className="w-full border rounded-lg" {...props} />,
+code: (props) => <span className="text-sm bg-secondary rounded text-white font-mono px-1.5 py-0.5" {...props} />,
+ul: (props) => <ul className="list-disc pl-4 text-white/85" {...props} />,
+ol: (props) => <ol className="list-decimal pl-4 text-white/85" {...props} />,
+img: (props) => <img className="md:w-[1000px] relative w-full border rounded-lg" {...props} />,
 }}
 />
 </div>
 <Footer />
-{/* </div>
-</ScrollArea> */}
 </article>
-{/* <div className="w-1/5 right-0 top-120 hidden 2xl:block fixed">
-<TableOfContents headings={parseHeadings(content)} />
-</div> */}
 </div>
 </>
 );
18 changes: 15 additions & 3 deletions frontend/assets/blog/2024-12-01-launch-week-1.mdx
@@ -14,17 +14,29 @@ launching one feature per day.

 Stay tuned for Laminar launches and follow us on [X](https://x.com/skull8888888888) for updates!

-## Launch day 1, December 2
+### Launch day 1, December 2

 Flow – a dynamic task engine for building AI agents. See my [X post](https://x.com/skull8888888888/status/1863661536180572412)
 for more details.

-## Launch day 2, December 3
+### Launch day 2, December 3

 Evaluations that just work. Read our [blog post](/blog/2024-12-03-evals)
 for more details.

-## Launch day 3, December 4
+### Launch day 3, December 4

 Semantic search – a way to find the most similar examples in your dataset. Read our
 [blog post](/blog/2024-12-04-semantic-search) for more details.
+
+### Launch day 4, December 5
+
+Labeling queues – a convenient UI to label LLM data. Read our
+[blog post](/blog/2024-12-05-labeling-queues) for more details.
+
+### Launch day 5, December 6
+
+Online evaluations – a way to run evaluators on your LLM calls in production. Read our
+[blog post](/blog/2024-12-06-online-evals) for more details.
+
+
80 changes: 80 additions & 0 deletions frontend/assets/blog/2024-12-06-online-evals.mdx
@@ -0,0 +1,80 @@
---
title: "Launch Week #1, Day 5. Online evaluations"
date: "2024-12-06"
description: "Online evaluations monitor and assess LLM behavior in real time"
author:
  name: Robert Kim
  url: https://x.com/skull8888888888
image: /blog/2024-12-06-online-evals.jpg
tags: ["online evaluations"]
---

At Laminar, we're excited to announce our newest feature: Online Evaluations. This feature allows engineering teams to run custom evaluators, either LLM-based or Python-based, on their LLM calls as they happen in production.

## What are Online Evaluations?
Online evaluations run automated checks on your LLM calls as they happen in production and attach the results as labels. Instead of collecting data for post-hoc analysis, Laminar evaluates each model call in real time by analyzing the inputs and outputs of your LLM spans.

## Why We Built It
When thousands of LLM calls happen every day, it's hard to know whether your LLMs are behaving as expected. Online evaluations let you monitor the quality of your LLM calls in real time, collect performance statistics, and detect issues before they impact users.

## How It Works
Laminar's online evaluations system is built around three core concepts:

### 1. Span Paths

Span paths uniquely identify where LLM calls happen in your code. They're automatically constructed from the location of the call, making it easy to track specific functions and endpoints.

Review comment (Member):

nit: maybe a doc link to https://docs.lmnr.ai/tracing/structure#grouping-spans-into-traces

But not necessary, those docs don't specialize on path


### 2. Span Labels

Labels are values attached to spans that indicate evaluation results.

### 3. Evaluators

Evaluators analyze inputs and outputs to generate labels. Laminar supports two types of evaluators:

- LLM-based evaluators
- Python-based evaluators
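
As a minimal sketch of what a Python-based evaluator could look like: the function name `evaluate`, its two string parameters, and the label values below are illustrative assumptions, not Laminar's documented evaluator interface. It labels an output by whether it parses as a JSON object.

```python
import json


def evaluate(llm_input: str, llm_output: str) -> str:
    """Hypothetical evaluator: return a label describing the output's format.

    The signature and label names are assumptions made for illustration.
    """
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        # The output is not JSON at all.
        return "invalid_json"
    # Valid JSON, but only objects count as well-formed here.
    return "valid_json" if isinstance(parsed, dict) else "not_an_object"
```

An LLM-based evaluator would play the same role, with a prompt producing the label instead of code.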


## Setting Up Evaluations
![Setting up evaluations](/blog/2024-12-06-online-evals-example.png)

Getting started with Laminar's online evaluations is straightforward:

1. Navigate to "Traces" in your Laminar dashboard
2. Select the span you want to evaluate
3. Click "Add Label" and create or choose a label class
4. Configure your evaluator:
   - Choose between Python code and LLM-based evaluation
   - Test your evaluator directly in the UI
5. Save and enable for production

Once enabled, Laminar will automatically run your evaluator on your LLM calls and attach labels to the spans. These labels are marked as `AUTO` in the dashboard.

![Evaluations in action](/blog/2024-12-06-online-evals-test-label.png)

## Best Practices

Start Simple

- Begin with basic format and content checks
- Add more sophisticated evaluations gradually
- Monitor evaluator performance impact


Layer Your Checks

- Technical validation (format, structure)
- Content validation (completeness, relevance)
- Quality metrics (coherence, accuracy)
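
The layering above can be sketched as a chain of checks that stops at the first failure, so cheap technical checks run before content checks. Everything here is hypothetical: the `evaluate` signature, the `answer` field, and the label names are assumptions for illustration, and a quality layer (coherence, accuracy) would more likely be an LLM-based evaluator than Python code.

```python
import json
from typing import Callable, List, Optional

# A check returns a failure label, or None when the output passes.
Check = Callable[[str, str], Optional[str]]


def check_format(llm_input: str, llm_output: str) -> Optional[str]:
    """Technical validation: the output must be a JSON object."""
    try:
        parsed = json.loads(llm_output)
    except json.JSONDecodeError:
        return "bad_format"
    return None if isinstance(parsed, dict) else "bad_format"


def check_completeness(llm_input: str, llm_output: str) -> Optional[str]:
    """Content validation: a non-empty "answer" string must be present.

    Runs only after check_format passed, so the output is a JSON object.
    """
    answer = json.loads(llm_output).get("answer")
    return None if isinstance(answer, str) and answer.strip() else "incomplete"


# Ordered from cheapest and most basic to most specific.
LAYERS: List[Check] = [check_format, check_completeness]


def evaluate(llm_input: str, llm_output: str) -> str:
    """Return the label of the first failing layer, or "pass"."""
    for check in LAYERS:
        failure = check(llm_input, llm_output)
        if failure is not None:
            return failure
    return "pass"
```

Returning the first failing layer keeps the label easy to interpret: a `bad_format` label says the content checks never ran.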

Monitor Results

- Track evaluation trends over time
- Regularly review and refine criteria
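
One way to track a trend is the daily share of passing labels. The sketch below assumes you already have `(day, label)` pairs in hand; how labels are exported from Laminar is not covered here, and the record shape and the `"pass"` label are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def daily_pass_rate(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each day to the fraction of its labels equal to "pass".

    `records` holds hypothetical (day, label) pairs, e.g. ("2024-12-06", "pass").
    """
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for day, label in records:
        totals[day] += 1
        if label == "pass":
            passes[day] += 1
    # Every day in totals has at least one record, so no division by zero.
    return {day: passes[day] / totals[day] for day in totals}
```

A falling pass rate after a prompt or model change is a signal to review recent spans and refine the evaluator criteria.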


## Conclusion
Online evaluations represent a significant step forward in LLM operations, bringing immediate quality feedback to production systems. With Laminar's implementation, teams can maintain high standards while gathering valuable insights about their models' behavior.
Try out online evaluations today and let us know what you think! Check out our [documentation](https://docs.lmnr.ai/evaluations/online-evaluations) for detailed setup instructions and best practices.
21 changes: 11 additions & 10 deletions frontend/components/blog/blog-meta.tsx
@@ -14,17 +14,18 @@ interface BlogMetaProps {

 export default function BlogMeta({ data }: BlogMetaProps) {
 return (
-<div className="flex flex-col space-y-1 items-start">
-<h1 className="text-5xl font-bold py-2">{data.title}</h1>
-{/* <p className="text-secondary-foreground">{data.description}</p> */}
-<p className="text-secondary-foreground"> {formatUTCDate(data.date)} </p>
-{data.author.url
-? <Label className="text-secondary-foreground hover:text-primary"><Link href={data.author.url}>{data.author.name}</Link></Label>
-: <Label className="text-secondary-foreground">{data.author.name}</Label>
-}
+<div className="flex flex-col gap-8 items-center">
+<div className="flex flex-col w-full md:w-[700px] gap-4">
+<h1 className="text-5xl font-bold">{data.title}</h1>
+<p className="text-secondary-foreground"> {formatUTCDate(data.date)} </p>
+{data.author.url
+? <Label className="text-secondary-foreground hover:text-primary"><Link href={data.author.url}>{data.author.name}</Link></Label>
+: <Label className="text-secondary-foreground">{data.author.name}</Label>
+}
+</div>
 {data.image &&
-<div className="w-full flex items-center py-4">
-<Image src={data.image} alt={data.title} width={1200} height={800} />
+<div className="w-full flex rounded overflow-hidden">

Review comment (Member): while we're at it, do we need border here?

Reply (Collaborator Author): no

+<Image src={data.image} alt={data.title} width={1000} height={800} />
 </div>
 }
 </div>
2 changes: 1 addition & 1 deletion frontend/package.json
@@ -77,7 +77,7 @@
 "next-mdx-remote": "^5.0.0",
 "next-themes": "^0.2.1",
 "postgres": "^3.4.4",
-"posthog-js": "^1.174.0",
+"posthog-js": "^1.194.4",
 "posthog-node": "^4.2.1",
 "re-resizable": "^6.10.0",
 "react": "^18.3.1",
56 changes: 42 additions & 14 deletions frontend/pnpm-lock.yaml

Some generated files are not rendered by default.
Binary file added frontend/public/blog/2024-12-06-online-evals.jpg