Add support for optional telemetry plugin #1018

Merged
dlqqq merged 5 commits into jupyterlab:main from support-optional-telemetry-plugin on Oct 7, 2024

Conversation

dlqqq
Member

@dlqqq dlqqq commented Oct 3, 2024

Description

  • This PR introduces support for an optional telemetry plugin, which receives interaction events that are emitted from Jupyter AI's frontend.

    • Revision 1: This PR has been revised to only include anonymized data about the message, without sharing the raw content of the message. This was done to ensure telemetry is not misused to collect identifying information about users.
  • The telemetry plugin is not implemented in Jupyter AI, meaning that, by default, nothing happens when an interaction event occurs. To clarify: by default, no telemetry data ever leaves the user's browser memory. This can be readily verified from the code proposed here.

  • This feature is mainly for admins managing custom deployments who need to obtain usage data on their deployments. This change should leave almost all users completely unaffected.

  • This PR introduces 4 types of telemetry events, all of which are emitted from the code toolbar: 'copy' | 'replace' | 'insert-above' | 'insert-below'.

  • Admins can listen to these telemetry events only by providing a telemetry plugin in a separate labextension package. This is done by writing a separate plugin that implements & provides IJaiTelemetryHandler (imported from @jupyter-ai/core/tokens).

  • The plugin implementation of IJaiTelemetryHandler should return an object with one method: onEvent(e), where e: TelemetryEvent represents a telemetry event emitted by Jupyter AI's frontend. The definition of TelemetryEvent is given below:

/**
 * An object that describes an interaction event from the user.
 *
 * Jupyter AI natively emits 4 event types: "copy", "replace", "insert-above",
 * or "insert-below". These are all emitted by the code toolbar rendered
 * underneath code blocks in the chat sidebar.
 */
export type TelemetryEvent = {
  /**
   * Type of the interaction.
   *
   * Frontend extensions may add other event types in custom components. Custom
   * events can be emitted via the `useTelemetry()` hook.
   */
  type: 'copy' | 'replace' | 'insert-above' | 'insert-below' | string;
  /**
   * Anonymized details about the message that was interacted with.
   */
  message: {
    /**
     * ID of the message assigned by Jupyter AI.
     */
    id: string;
    /**
     * Type of the message.
     */
    type: AiService.ChatMessage['type']; // resolves to: 'human' | 'agent' | 'agent-stream'
    /**
     * UNIX timestamp of the message.
     */
    time: number;
    /**
     * Metadata associated with the message, yielded by the underlying language
     * model provider.
     */
    metadata?: Record<string, unknown>;
  };
  /**
   * Anonymized details about the code block that was interacted with, if any.
   * This is left optional for custom events like message upvote/downvote that
   * do not involve interaction with a specific code block.
   */
  code?: {
    charCount: number;
    lineCount: number;
  };
};
  • Custom events can be emitted from within a component via the useTelemetry() hook:
import { useTelemetry } from '@jupyter-ai/core/lib/contexts/telemetry-context';

// within the component:
const telemetryHandler = useTelemetry();
...
telemetryHandler.onEvent({ ... }); // <= this statement triggers a telemetry event when run
  • Lastly, this PR fixes a couple of bugs introduced in #1013 (Add metadata field to agent messages) by a) removing a dangling console.log() statement, and b) ensuring that message metadata is recorded in the chat history array used to serve GET api/ai/chats/history.
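
As a rough illustration of the anonymized `code` field described above, the sketch below reduces a raw code string to `charCount` and `lineCount` and builds an event that never carries the code itself. The names `CodeSummary`, `MessageSummary`, `SketchEvent`, `summarizeCode`, and `buildEvent` are hypothetical and not part of Jupyter AI. The local types mirror only a subset of `TelemetryEvent`, and the way lines are counted here is an assumption, not necessarily what Jupyter AI does.

```typescript
// Local, simplified mirrors of the `code` and `message` fields of
// TelemetryEvent. Illustrative copies only, not imports from @jupyter-ai/core.
type CodeSummary = { charCount: number; lineCount: number };

type MessageSummary = {
  id: string;
  type: 'human' | 'agent' | 'agent-stream';
  time: number;
};

type SketchEvent = {
  type: string;
  message: MessageSummary;
  code?: CodeSummary;
};

// Hypothetical helper: reduce a code block to counts only.
// Assumption: lines are separated by "\n" and an empty string has 0 lines.
function summarizeCode(code: string): CodeSummary {
  return {
    charCount: code.length,
    lineCount: code === '' ? 0 : code.split('\n').length
  };
}

// Hypothetical helper: build an event that omits `code` entirely when the
// interaction did not involve a code block (e.g. a custom upvote event).
function buildEvent(
  type: string,
  message: MessageSummary,
  code?: string
): SketchEvent {
  const event: SketchEvent = { type, message };
  if (code !== undefined) {
    event.code = summarizeCode(code);
  }
  return event;
}
```

With helpers like these, a custom component could pass `buildEvent('upvote', message)` to `onEvent()` for an interaction that has no code block, which leaves `code` unset.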

Demo

No telemetry events ever appear in the browser console by default; this is a quick proof-of-concept that shows telemetry events are being emitted by the frontend & captured by a custom plugin. See the next section for steps on how to reproduce this locally.

Screen.Recording.2024-10-07.at.1.01.00.PM.mov

Steps to reproduce demo

  1. Add the below block to packages/jupyter-ai/src/index.ts:
const telemetryPlugin: JupyterFrontEndPlugin<IJaiTelemetryHandler> = {
  id: 'custom-jai-telemetry-plugin',
  autoStart: true,
  provides: IJaiTelemetryHandler,
  activate: async () => {
    return {
      onEvent: e => {
        console.log(e);
      }
    };
  }
};
  2. Add telemetryPlugin to the list of default exports at the bottom of the same file:
export default [
  plugin,
  statusItemPlugin,
  completionPlugin,
  menuPlugin,
  telemetryPlugin
];
  3. Run jlpm build, refresh, and observe that interaction events are being captured & logged to the console, as shown in the demo video.
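
The handler in the steps above only logs each event. As a minimal sketch of what an admin's handler might do instead, the block below tallies events by type. The names `CountedEvent`, `CountingHandler`, `NATIVE_TYPES`, and `createCountingHandler` are hypothetical. The local types stand in for `TelemetryEvent` and `IJaiTelemetryHandler` so the example is self-contained, and grouping non-native types under `'custom'` is just one possible design.

```typescript
// Local stand-ins for the event and handler shapes described in this PR.
// Only the `type` field of the event is read here.
type CountedEvent = { type: string };

interface CountingHandler {
  onEvent: (e: CountedEvent) => void;
  counts: Map<string, number>;
}

// The 4 event types that Jupyter AI emits natively, per the PR description.
const NATIVE_TYPES = ['copy', 'replace', 'insert-above', 'insert-below'];

// Hypothetical factory: returns a handler that counts events per type.
function createCountingHandler(): CountingHandler {
  const counts = new Map<string, number>();
  return {
    counts,
    onEvent: e => {
      // Assumption: any non-native event type is grouped under "custom".
      const key = NATIVE_TYPES.includes(e.type) ? e.type : 'custom';
      counts.set(key, (counts.get(key) ?? 0) + 1);
    }
  };
}
```

An object like this could be returned from the `activate` function in step 1 in place of the `console.log` handler. What happens to the counts afterwards is left to the deployment.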

@dlqqq dlqqq added the enhancement New feature or request label Oct 3, 2024
@dlqqq dlqqq changed the title Support optional telemetry plugin Add support for optional telemetry plugin Oct 3, 2024
@michaelchia
Collaborator

awesome. this is something i've been wanting for a while. thanks for adding this.

@dlqqq dlqqq closed this Oct 3, 2024
@dlqqq dlqqq reopened this Oct 7, 2024
@dlqqq dlqqq marked this pull request as draft October 7, 2024 19:02
@dlqqq dlqqq force-pushed the support-optional-telemetry-plugin branch from 7878ad0 to c2563cf Compare October 7, 2024 19:02
@dlqqq
Member Author

dlqqq commented Oct 7, 2024

cc @jtpio: for work on jupyter-chat.

@dlqqq dlqqq marked this pull request as ready for review October 7, 2024 20:15
@dlqqq dlqqq force-pushed the support-optional-telemetry-plugin branch from 0231575 to c2b7ccc Compare October 7, 2024 21:58
@jtpio
Member

jtpio commented Oct 7, 2024

Thanks @dlqqq for the ping!

It looks like telemetry would be useful to JupyterLab and the extension ecosystem in general. Wondering if you already have some ideas on how this could be generalized so it can be used by any plugin? Maybe leveraging the existing EventsManager could be something to consider?

Also cc @afshin who was recently looking into telemetry and metrics via the built-in EventsManager.

@dlqqq dlqqq force-pushed the support-optional-telemetry-plugin branch from c2b7ccc to 7a4d4a0 Compare October 7, 2024 23:11
@dlqqq
Member Author

dlqqq commented Oct 7, 2024

@jtpio

It looks like telemetry would be useful to JupyterLab and the extension ecosystem in general. Wondering if you already have some ideas on how this could be generalized so it can be used by any plugin? Maybe leveraging the existing EventsManager could be something to consider?

This is a great idea, thanks for reaching out! I'd be very interested in working with you, @afshin, and the rest of your team to develop a general, standardized strategy for emitting & handling telemetry in JupyterLab & JupyterLab extensions (if one doesn't already exist 😁). I'd also love it if we could create a "how-to" guide for devs looking to implement telemetry in the extension development docs.

Integrations with other JupyterLab APIs like EventsManager can be added in the future as non-breaking changes (or as part of Jupyter AI v3.0.0, which we are eager to begin planning soon). The implementation proposed here is mainly for users who want to listen for & emit telemetry events via a custom labextension, which multiple enterprise users have requested recently.

Collaborator

@andrii-i andrii-i left a comment

Looks and works well within the scope of the PR.

Note that the “3 vertical dots” menu on the Jupyternaut response does not log actions as of now. It probably should, for consistency, but this can be addressed in a separate PR.

@dlqqq dlqqq merged commit 097dbe4 into jupyterlab:main Oct 7, 2024
9 checks passed
@jtpio
Member

jtpio commented Oct 7, 2024

Ah, I was hoping we could have a chance to discuss this a bit more before it gets merged.

Telemetry is not only useful for Jupyter AI. I'm a bit worried that if we start having many different ways of doing this, with each extension implementing its own thing, it will quickly become difficult to find a way to generalize that later.

@andrii-i
Collaborator

andrii-i commented Oct 8, 2024

Hi @jtpio. For context, logging functionality is also implemented in Jupyter Scheduler, and its implementation is not the same as the one here. So this is not the first extension to do it. See Jupyter Scheduler PRs jupyter-server/jupyter-scheduler#448, jupyter-server/jupyter-scheduler#457, jupyter-server/jupyter-scheduler#472, jupyter-server/jupyter-scheduler#523.

I'm also sure that if/when JupyterLab provides a default logging mechanism, both the Jupyter Scheduler and Jupyter AI teams would be happy to consider transitioning to it.

@krassowski
Member

Marchlak pushed a commit to Marchlak/jupyter-ai that referenced this pull request Oct 28, 2024
* remove console log accidentally merged with jupyterlab#1013

* set metadata on stream messages in the chat_history array

* implement support for optional telemetry plugin

* anonymize message & code details

* export telemetry hook from NPM package entry point