docs: sync 4-05-23 #521

Merged
merged 43 commits on Apr 5, 2023
43 commits
520f8a6
excludes -> excluded_column_names
axiomofjoy Mar 30, 2023
114c4aa
docs: exports API (GITBOOK-91)
axiomofjoy Mar 30, 2023
2a9947f
Pandas -> pandas
axiomofjoy Mar 30, 2023
c8c341a
excludes -> excluded_column_names
axiomofjoy Mar 30, 2023
1479b21
docs: move quickstart under tutorials (GITBOOK-92)
axiomofjoy Mar 30, 2023
102dc80
docs: advertise quickstart (GITBOOK-93)
axiomofjoy Mar 30, 2023
9224534
Numpy -> NumPy
axiomofjoy Mar 30, 2023
75254b3
docs: Add colab quickstart link (GITBOOK-95)
mikeldking Mar 30, 2023
833541b
docs: Update README.md to include colab quickstart link
mikeldking Mar 30, 2023
46dcce9
docs: add github link (GITBOOK-94)
mikeldking Mar 30, 2023
6833ae0
docs: colab badge links (GITBOOK-96)
axiomofjoy Mar 30, 2023
a7a8366
try badge links
axiomofjoy Mar 30, 2023
d8f2a1b
docs: add badges to colab (GITBOOK-97)
axiomofjoy Mar 30, 2023
8358999
docs: remove stray colab links (GITBOOK-98)
axiomofjoy Mar 30, 2023
c55906b
update colab badge
axiomofjoy Mar 30, 2023
88187b3
change colab logo color to orange in badge
axiomofjoy Mar 30, 2023
db4e841
docs: clean up unfinished pages (GITBOOK-100)
axiomofjoy Mar 31, 2023
8b5caea
update badge links
axiomofjoy Mar 31, 2023
7a1d1f0
docs: remove todo (GITBOOK-101)
axiomofjoy Mar 31, 2023
0d49c57
add quickstart github and colab links
axiomofjoy Mar 31, 2023
33a8275
docs: landing page updates (GITBOOK-103)
Apr 1, 2023
e0c7cb6
docs: move quickstart link to sentiment classification tutorial into …
Apr 3, 2023
7a2d488
docs: copy colab and github badges for image classification notebook …
axiomofjoy Apr 3, 2023
42f1418
update image classification links
axiomofjoy Apr 3, 2023
99a084a
change quickstart
axiomofjoy Apr 3, 2023
0f7ff28
docs: update quickstart (GITBOOK-105)
axiomofjoy Apr 3, 2023
8cabb4f
docs: quickstart instructions to launch phoenix ui
axiomofjoy Apr 3, 2023
82cdfb0
docs: tweak quickstart (GITBOOK-106)
axiomofjoy Apr 3, 2023
606de81
docs: tweak to language around iframe (GITBOOK-107)
axiomofjoy Apr 3, 2023
08b6642
docs: remove step numbers from quickstart (GITBOOK-108)
axiomofjoy Apr 3, 2023
fc077e7
docs: change sentiment classification notebook name (GITBOOK-109)
axiomofjoy Apr 3, 2023
c4864ca
docs: add quickstart github and colab links to hint (GITBOOK-110)
axiomofjoy Apr 3, 2023
7b71a9e
docs: No subject (GITBOOK-111)
axiomofjoy Apr 3, 2023
190c313
docs: clean up language in quickstart (GITBOOK-112)
axiomofjoy Apr 3, 2023
74b298d
docs: incorporate feedback (GITBOOK-113)
axiomofjoy Apr 3, 2023
a79ea41
docs: simplify quickstart (GITBOOK-117)
axiomofjoy Apr 4, 2023
cedf5fb
docs: update badge links
axiomofjoy Apr 4, 2023
cdb8821
docs: typo (GITBOOK-119)
RogerHYang Apr 4, 2023
f98f748
docs: remove broken link (GITBOOK-120)
axiomofjoy Apr 4, 2023
aa2e9d4
docs: fix typo (GITBOOK-121)
axiomofjoy Apr 4, 2023
62dcf7c
docs: add import statements to quickstart (GITBOOK-122)
axiomofjoy Apr 4, 2023
e018a50
docs: create your own dataset -> import your data (GITBOOK-123)
axiomofjoy Apr 4, 2023
42dd182
docs: inverted images (GITBOOK-124)
Apr 5, 2023
65 changes: 36 additions & 29 deletions docs/README.md
@@ -7,25 +7,27 @@ cover: >-
coverY: 0
---

# 🌟 ML Observability in a Notebook
# ML Observability in a Notebook

Get up and running quickly with the [quickstart](quickstart.md).

## What is Phoenix?

Phoenix is an ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve. Unstructured data such as text (prompts and responses) and images are a first class citizen in Phoenix, with embeddings analysis designed as the core foundation of the library. 
Phoenix is an ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLM, CV, NLP, and tabular datasets. It allows data scientists to quickly visualize their model data, monitor performance, track down issues and insights, and easily export data to drive improvements. Unstructured data such as text (prompts and responses) and images are first-class citizens in Phoenix, with embeddings analysis designed as the core foundation of the library.

### Core Functionality:

* Unstructured & structured data drift 
* Unstructured & structured data drift
* Troubleshooting LLM prompt/responses
* Analytical tools for NLP/Image/Generative & tabular model analysis 
* Analytical tools for NLP/Image/Generative & tabular model analysis
* Automatic visualization and clustering of embeddings
* UMAP for dimension reduction & HDBScan for clustering analysis: designed to work together
* Easy A/B data comparison workflows
* Embedding drift analysis
* Ingest embeddings if you have them or leverage embedding generation SDK
* Generate embeddings using LLMs 
* Monitoring analysis to pinpoint issues\* 
* Automatic clustering to detect groups of problems 
* Generate embeddings using LLMs
* Monitoring analysis to pinpoint issues\*
* Automatic clustering to detect groups of problems
* Workflows to export and fine tune

#### Coming Soon:
@@ -35,65 +37,70 @@ Phoenix is an ML Observability library designed for the Notebook. The toolset is

### Phoenix Architecture

Phoenix is designed to run locally on a single server in conjunction with the Notebook. 
Phoenix is designed to run locally on a single server in conjunction with the Notebook.

<figure><img src="https://lh3.googleusercontent.com/JVbbKGB2DocrWGNum_xKVZMRVAb7c4oBcJFCL23M-diqMmerKUJKVU9ZvMLhtNTIa4RuwbcNLAr3ZSd5pku5iFw-nb9pdHF-myKWLdtAkBxFPWu2jFQ_6ugHfaMLwGUDGc-kln4It1qLyVmP6m005Tk" alt=""><figcaption><p><strong>Phoenix Architecture</strong></p></figcaption></figure>
<figure><img src=".gitbook/assets/Phoenix docs graphics-03.jpg" alt=""><figcaption><p><strong>Phoenix Architecture</strong></p></figcaption></figure>

Phoenix runs locally, close to your data, in an environment that interfaces to Notebook cells on the Notebook server. Designing Phoenix to run locally, enables fast iteration on top of local data. &#x20;
Phoenix runs locally, close to your data, in an environment that interfaces with Notebook cells on the Notebook server. Designing Phoenix to run locally enables fast iteration on top of local data.

In order to use Phoenix:

1. Load data into pandas dataframe
2. Leverage SDK embeddings and LLM eval generators&#x20;
2. Leverage SDK embeddings and LLM eval generators
3. Start Phoenix
1. Single dataframe
2. Two dataframes: primary and reference&#x20;
2. Two dataframes: primary and reference
4. Investigate problems
5. Export data
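
A minimal sketch of this flow in Python, assuming the `px.launch_app` entry point from the quickstart and a hypothetical Parquet file and column names:

```python
import pandas as pd
import phoenix as px

# 1. Load inference data into a pandas dataframe (hypothetical file path).
df = pd.read_parquet("inferences.parquet")

# 2. Describe the dataframe's columns with a schema (illustrative column names).
schema = px.Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="timestamp",
    actual_label_column_name="actual_label",
)

# 3. Wrap the dataframe and schema in a dataset and start Phoenix.
ds = px.Dataset(df, schema)
session = px.launch_app(ds)

# 4./5. Investigate in the UI via session.url, then export selected clusters.
session.url
```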

<figure><img src="https://lh3.googleusercontent.com/uzRSF5MXNsti1NVxbn82Pnsx-pPpFznpQyV8ZYofFr2maqc5KbmdAf2zQ1wmDMeVwB8n0quoqpNozuGjKFwwtEXjO45Q5fplz4Oo3CbdeAuP-UomkjFglxkFjVtGDjHnVZ_ZyQpDq7TmtX69Iwn9f4M" alt=""><figcaption></figcaption></figure>
<figure><img src=".gitbook/assets/Phoenix docs graphics-02 (2).jpg" alt=""><figcaption></figcaption></figure>

The picture above shows the flow of execution of Phoenix, from pointing it to your data, running it to find problems or insights, grabbing groups of data for insights and then exporting for fine tuning.&#x20;
The picture above shows the flow of execution of Phoenix: pointing it at your data, running it to find problems or insights, grabbing groups of data points, and then exporting them for fine-tuning.

#### Load Data Into Pandas:
#### Load Data Into pandas:

Phoenix currently requires Pandas dataframes which can be downloaded from either an ML observability platform, a table or a raw log file. The data is assumed to be formatted in the Open Inference format with a well defined column structure, normally including a set of inputs/features, outputs/predictions and ground truth.&#x20;
Phoenix currently requires pandas dataframes which can be downloaded from either an ML observability platform, a table or a raw log file. The data is assumed to be formatted in the Open Inference format with a well defined column structure, normally including a set of inputs/features, outputs/predictions and ground truth.
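
For illustration, a toy dataframe in this kind of flat, one-row-per-inference layout might look like the following (the column names here are hypothetical, not a required part of the format):

```python
import pandas as pd

# Hypothetical inference log: one row per prediction, with an ID, a timestamp,
# input features, the model's output, and the ground truth.
df = pd.DataFrame(
    {
        "prediction_id": ["p1", "p2", "p3"],
        "timestamp": pd.to_datetime(["2023-04-01", "2023-04-02", "2023-04-03"]),
        "age": [34, 51, 27],                              # input feature
        "predicted_label": ["churn", "retain", "churn"],  # model output
        "actual_label": ["churn", "retain", "retain"],    # ground truth
    }
)
```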

#### Leverage SDK Embeddings and LLM Eval Generators:

The Phoenix library heavily uses embeddings as a method for data visualization and debugging. In order to use Phoenix with embeddings they can either be generated using an SDK call or they can be supplied by the user of the library. Phoenix supports generating embeddings for LLMs, Image, NLP, and tabular datasets.&#x20;
The Phoenix library heavily uses embeddings as a method for data visualization and debugging. In order to use Phoenix with embeddings they can either be generated using an SDK call or they can be supplied by the user of the library. Phoenix supports generating embeddings for LLMs, Image, NLP, and tabular datasets.
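
If you are supplying your own embeddings, one possible way to generate them (shown here with the third-party sentence-transformers package, not the Phoenix SDK) is:

```python
import pandas as pd
from sentence_transformers import SentenceTransformer  # third-party library

df = pd.DataFrame({"prompt": ["please refund my order", "love this product"]})

# Encode each piece of raw text into a fixed-length vector and keep the vectors
# alongside the text so a Phoenix schema can reference both columns.
model = SentenceTransformer("all-MiniLM-L6-v2")
df["prompt_vector"] = list(model.encode(df["prompt"].tolist()))
```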

#### Start Phoenix:

Phoenix is typically started in a notebook from which a local Phoenix server is kicked off. Two approaches can be taken to the overall use of Phoenix::
Phoenix is typically started in a notebook from which a local Phoenix server is kicked off. Two approaches can be taken to the overall use of Phoenix:

1. **Single Dataset**

In the case of a team that only wants to investigate a single dataset for exploratory data analysis (EDA), a single dataset instantiation of Phoenix can be used. In this scenario, a team is normally analyzing the data in an exploratory manner and is not doing A/B comparisons. &#x20;
A team that only wants to investigate a single dataset for exploratory data analysis (EDA) can instantiate Phoenix with a single dataset. In this scenario, the team is analyzing the data in an exploratory manner and is not doing A/B comparisons.

2. **Two Datasets**

A common use case in ML is for teams to have 2x datasets they are comparing such as: training vs production, model A vs model B, OR production time X vs production time Y, just to name a few. In this scenario there exists a primary and reference dataset. When using the primary and reference dataset, Phoenix supports drift analysis, embedding drift and many different A/B dataset comparisons.&#x20;
A common use case in ML is for teams to compare two datasets, such as training vs. production, model A vs. model B, or production at time X vs. production at time Y, to name a few. In this scenario there exists a primary and a reference dataset. When using primary and reference datasets, Phoenix supports drift analysis, embedding drift, and many different A/B dataset comparisons.
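
A sketch of both launch modes, assuming a `px.launch_app` entry point that accepts a primary dataset and an optional reference dataset, and a `name` argument on `px.Dataset` used only for display:

```python
import pandas as pd
import phoenix as px

schema = px.Schema(actual_label_column_name="actual_label")
train_df = pd.DataFrame({"actual_label": ["cat", "dog", "dog"]})
prod_df = pd.DataFrame({"actual_label": ["dog", "dog", "dog"]})

# Single dataset: exploratory analysis of one cohort.
session = px.launch_app(px.Dataset(prod_df, schema, name="production"))

# Primary and reference datasets: for example, production compared against
# training, which enables drift, embedding drift, and A/B comparison workflows.
session = px.launch_app(
    px.Dataset(prod_df, schema, name="production"),
    px.Dataset(train_df, schema, name="training"),
)
```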

#### Investigate Problems:

Once instantiated, teams can dive into Phoenix on a feature by feature basis, analyzing performance and tracking down issues.&#x20;

<mark style="color:red;">< example of embedding drift></mark>&#x20;

The above example shows embedding drift between clusters of data, where a cluster in production has a large drift relative to the training set.&#x20;
Once instantiated, teams can dive into Phoenix on a feature by feature basis, analyzing performance and tracking down issues.

#### Export Cluster:

Once an issue is found, the cluster can be exported back into a dataframe for further analysis. Clusters can be used to create groups of similar data points for downstream use, including:

* Finding Similar Examples&#x20;
* Monitoring&#x20;
* Finding Similar Examples
* Monitoring
* Steering Vectors / Steering Prompts

### How Phoenix fits into the ML Stack

Phoenix is designed to monitor, analyze and troubleshoot issues on top of your model data allowing for interactive workflows all within a Notebook environment.&#x20;
Phoenix is designed to monitor, analyze and troubleshoot issues on top of your model data allowing for interactive workflows all within a Notebook environment.

<figure><img src=".gitbook/assets/Phoenix docs graphics-01.jpg" alt=""><figcaption><p><strong>How Phoenix Fits into the ML Stack</strong></p></figcaption></figure>

The above picture shows the use of Phoenix with a cloud observability system (this is not required). In this example the cloud observability system allows the easy download (or synchronization) of data to the Notebook, typically based on model, batch, environment, and time ranges. Normally this download is done to analyze data at the tail end of a troubleshooting workflow, or periodically to use the notebook environment to monitor your models.

Once in a notebook environment, the downloaded data can power highly interactive observability workflows. Phoenix can be used to find clusters of data problems and export those clusters back to the observability platform for use in monitoring and active learning workflows.

Note: Data can also be downloaded from any data warehouse system for use in Phoenix without the requirement of a cloud ML observability solution.&#x20;

<figure><img src="https://lh5.googleusercontent.com/hpNMLyQQ5lpaHzrLuPUzRn_2i-IMySUpaXr6kumnaLXnzR_-tAvQtBtuumYf10FwAmnFyHT1riAgeP-cvc7xDDqMhMllZ4wl1SWrF5kNDuF7BBoqm9jtjRKh3aMVaI9MM6SDdBG_nwgM_kdltPaM_NE" alt=""><figcaption><p><strong>How Phoenix Fits into the ML Stack</strong></p></figcaption></figure>
In the first version of Phoenix it is assumed the data is available locally, but we’ve also designed it with some broader visions in mind. For example, Phoenix was designed with a stateless metrics engine as a first-class citizen, enabling metrics checks to be run in any Python data pipeline.

\
7 changes: 3 additions & 4 deletions docs/SUMMARY.md
@@ -1,24 +1,22 @@
# Table of contents

* [ML Observability in a Notebook](README.md)
* [Quickstart](quickstart.md)

## 💡 Concepts

* [ML Observability](concepts/ml-observability.md)
* [Embeddings](concepts/embeddings.md)
* [Phoenix Basics](concepts/phoenix-basics.md)

## 🎓 Tutorials

* [Quickstart](quickstart.md)
* [Notebooks](tutorials/notebooks.md)

## 🔢 How-To

* [Install and Import Phoenix](how-to/install-and-import-phoenix.md)
* [Create Your Own Dataset](how-to/define-your-schema.md)
* [Import Your Data](how-to/define-your-schema.md)
* [Manage the App](how-to/manage-the-app.md)
* [Use the App](how-to/use-the-app.md)
* [Use Example Datasets](how-to/use-example-datasets.md)

## ⌨ API
@@ -32,4 +30,5 @@

***

* [GitHub](https://github.com/Arize-ai/phoenix)
* [Releases](https://github.com/Arize-ai/phoenix/releases)
14 changes: 7 additions & 7 deletions docs/api/dataset-and-schema.md
@@ -37,7 +37,7 @@ A dataset containing a split or cohort of data to be analyzed independently or c

### Attributes

* **dataframe** (pandas.DataFrame): The Pandas DataFrame of the dataset.
* **dataframe** (pandas.DataFrame): The pandas DataFrame of the dataset.
* **schema** ([phoenix.Schema](dataset-and-schema.md#phoenix.schema)): The schema of the dataset.
* **name** (str): The name of the dataset.

@@ -47,7 +47,7 @@ The input DataFrame and schema are lightly processed during dataset initializati

### Usage

Define a dataset `ds` from a Pandas DataFrame `df` and a schema object `schema` by running
Define a dataset `ds` from a pandas DataFrame `df` and a schema object `schema` by running

```python
ds = px.Dataset(df, schema)
```
@@ -74,18 +74,18 @@ class Schema(
actual_label_column_name: Optional[str] = None,
actual_score_column_name: Optional[str] = None,
embedding_feature_column_names: Optional[Dict[str, EmbeddingColumnNames]] = None,
excludes: Optional[List[str]] = None,
excluded_column_names: Optional[List[str]] = None,
)
```

A dataclass that assigns the columns of a Pandas DataFrame to the appropriate model dimensions (predictions, actuals, features, etc.). Each column of the DataFrame should appear in the corresponding schema at most once.
A dataclass that assigns the columns of a pandas DataFrame to the appropriate model dimensions (predictions, actuals, features, etc.). Each column of the DataFrame should appear in the corresponding schema at most once.

**\[**[**source**](https://github.com/Arize-ai/phoenix/blob/main/src/phoenix/datasets/schema.py)**]**

### Parameters

* **prediction\_id\_column\_name** __ (Optional\[str]): The name of the DataFrame's prediction ID column, if one exists. Prediction IDs are strings that uniquely identify each record in a Phoenix dataset (equivalently, each row in the DataFrame). If no prediction ID column name is provided, Phoenix will automatically generate unique UUIDs for each record of the dataset upon [phoenix.Dataset](dataset-and-schema.md#phoenix.dataset) initialization.
* **timestamp\_column\_name** (Optional\[str]): The name of the DataFrame's timestamp column, if one exists. Timestamp columns must be Pandas Series with numeric or datetime dtypes.
* **timestamp\_column\_name** (Optional\[str]): The name of the DataFrame's timestamp column, if one exists. Timestamp columns must be pandas Series with numeric or datetime dtypes.
* If the timestamp column has numeric dtype (int or float), the entries of the column are interpreted as Unix timestamps, i.e., the number of seconds since midnight on January 1st, 1970.
* If the column has datetime dtype and contains timezone-naive timestamps, Phoenix assumes those timestamps belong to the UTC timezone.
* If the column has datetime dtype and contains timezone-aware timestamps, those timestamps are converted to UTC.
@@ -97,7 +97,7 @@ A dataclass that assigns the columns of a Pandas DataFrame to the appropriate mo
* **actual\_label\_column\_name** (Optional\[str]): The name of the DataFrame's actual label column, if one exists. Actual (i.e., ground truth) labels are used for classification problems with categorical model output.
* **actual\_score\_column\_name** (Optional\[str]): The name of the DataFrame's actual score column, if one exists. Actual (i.e., ground truth) scores are used for regression problems with continuous numerical output.
* **embedding\_feature\_column\_names** (Optional\[Dict\[str, [phoenix.EmbeddingColumnNames](dataset-and-schema.md#phoenix.embeddingcolumnnames)]]): A dictionary mapping the name of each embedding feature to an instance of [phoenix.EmbeddingColumnNames](dataset-and-schema.md#phoenix.embeddingcolumnnames) if any embedding features exist, otherwise, None. Each instance of [phoenix.EmbeddingColumnNames](dataset-and-schema.md#phoenix.embeddingcolumnnames) associates one or more DataFrame columns containing vector data, image links, or text with the same embedding feature. Note that the keys of the dictionary are user-specified names that appear in the Phoenix UI and do not refer to columns of the DataFrame.
* **excludes** (Optional\[List\[str]]): The names of the DataFrame columns to be excluded from the implicitly inferred list of feature column names. This field should only be used for implicit feature discovery, i.e., when `feature_column_names` is unused and the DataFrame contains feature columns not explicitly included in the schema.
* **excluded_column_names** (Optional\[List\[str]]): The names of the DataFrame columns to be excluded from the implicitly inferred list of feature column names. This field should only be used for implicit feature discovery, i.e., when `feature_column_names` is unused and the DataFrame contains feature columns not explicitly included in the schema.
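
As an illustration of these parameters (with hypothetical column names), a schema that excludes an internal ID column from implicit feature discovery might look like:

```python
import phoenix as px

# "session_token" exists in the DataFrame but is not a real feature, so it is
# excluded from the implicitly inferred feature list. The numeric "timestamp"
# column would be interpreted as Unix seconds since the epoch.
schema = px.Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="timestamp",
    actual_label_column_name="actual_label",
    excluded_column_names=["session_token"],
)
```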

### Usage

Expand All @@ -119,7 +119,7 @@ A dataclass that associates one or more columns of a DataFrame with an embedding

### Parameters

* **vector\_column\_name** (str): The name of the DataFrame column containing the embedding vector data. Each entry in the column must be a list, one-dimensional Numpy array, or Pandas Series containing numeric values (floats or ints) and must have equal length to all the other entries in the column.
* **vector\_column\_name** (str): The name of the DataFrame column containing the embedding vector data. Each entry in the column must be a list, one-dimensional NumPy array, or pandas Series containing numeric values (floats or ints) and must have equal length to all the other entries in the column.
* **raw\_data\_column\_name** (Optional\[str]): The name of the DataFrame column containing the raw text associated with an embedding feature, if such a column exists. This field is used when an embedding feature describes a piece of text, for example, in the context of NLP.
* **link\_to\_data\_column\_name** (Optional\[str]): The name of the DataFrame column containing links to images associated with an embedding feature, if such a column exists. This field is used when an embedding feature describes an image, for example, in the context of computer vision.
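
For example, a sketch of an NLP embedding feature that ties a vector column to the raw text it encodes (column names are hypothetical):

```python
import phoenix as px

# The dictionary key ("prompt_embedding") is a display name for the Phoenix UI;
# the column names refer to DataFrame columns holding the vectors and raw text.
embedding_features = {
    "prompt_embedding": px.EmbeddingColumnNames(
        vector_column_name="prompt_vector",
        raw_data_column_name="prompt",
    ),
}
schema = px.Schema(embedding_feature_column_names=embedding_features)
```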

10 changes: 10 additions & 0 deletions docs/api/session.md
@@ -121,6 +121,7 @@ A session that maintains the state of the Phoenix app.
### Attributes

* **url** (str): The URL of the running Phoenix session. Can be copied and pasted to open the Phoenix UI in a new browser tab or window.
* **exports** (List\[DataFrame]): List of pandas DataFrame. To access the most recently exported data, use `exports[-1]`.

### Usage

@@ -161,3 +162,12 @@

```python
session.url
```

and copying and pasting the URL.

Once a cluster is selected in the UI, it can be exported as a Parquet file on disk. To list the exported files, use the `exports` property on the `session` as follows.

```python
session.exports
# [<DataFrame 2023-03-29_22-14-37>]
session.exports[-1]
# pandas.DataFrame
```
2 changes: 1 addition & 1 deletion docs/concepts/embeddings.md
@@ -2,7 +2,7 @@
description: Meaning, Examples and How To Compute
---

# 🌌 Embeddings
# Embeddings

### What's an embedding?

2 changes: 0 additions & 2 deletions docs/concepts/ml-observability.md

This file was deleted.

16 changes: 4 additions & 12 deletions docs/concepts/open-inference.md
@@ -1,4 +1,4 @@
# 📄 Open Inference
# Open Inference

## Overview

@@ -19,7 +19,7 @@ An inference store is a common approach to store model inferences, normally stor
* Text Classification
* NER Span Categorization

****


**Tabular:**

@@ -132,10 +132,6 @@ The example above shows an exploded representation of the hierarchical data. \<t

### Examples: Supported Schemas&#x20;

#### NLP - LLM Generative/Summarization/Translation

#### NLP - Classification &#x20;

#### Regression

<figure><img src="../.gitbook/assets/image.png" alt=""><figcaption></figcaption></figure>
@@ -144,15 +140,11 @@ The example above shows an exploded representation of the hierarchical data. \<t

<figure><img src="../.gitbook/assets/image (1) (1).png" alt=""><figcaption></figcaption></figure>

#### &#x20;Classification + Score

#### Ranking

<figure><img src="../.gitbook/assets/image (1).png" alt=""><figcaption></figcaption></figure>

#### CV - Classification&#x20;

<figure><img src="../.gitbook/assets/image (2).png" alt=""><figcaption></figcaption></figure>

More examples to come soon.

####

4 changes: 2 additions & 2 deletions docs/concepts/phoenix-basics.md
@@ -2,7 +2,7 @@
description: Learn the foundational concepts of the Phoenix API
---

# 🔢 Phoenix Basics
# Phoenix Basics

This section introduces _datasets_ and _schemas,_ the starting concepts needed to use Phoenix.

@@ -15,7 +15,7 @@ This section introduces _datasets_ and _schemas,_ the starting concepts needed t

A _Phoenix dataset_ is an instance of `phoenix.Dataset` that contains three pieces of information:

* The data itself (a Pandas DataFrame)
* The data itself (a pandas DataFrame)
* A schema (a `phoenix.Schema` instance) that describes the columns of your DataFrame
* A dataset name that appears in the UI
