Releases: OpenInterpreter/open-interpreter

The New Computer Update Part II

11 Mar 23:27
Pre-release

The New Computer Update Part II introduces our first local vision model, a suite of native Mac integrations, 5x launch speed, and dozens of other requested features.

Mac Control

The new calendar, contacts, browser, mail, and sms Computer API modules let Open Interpreter control your Mac's native applications.

For example, Open Interpreter can write and execute code that gets a contact, then sends a text and an email with an attachment.

LLM-first Web Browser

The browser lets Open Interpreter quickly browse the web by querying a web-enabled language model.

Point Model

We have developed a model called point which is capable of locating visual controls precisely. It was designed to run locally on consumer hardware.

We leverage existing open-source models to "semantically search the screen" for text and icons. Language models can then call on this composite model to "point" at text or icons on the screen.

While this model is capable of understanding simple interfaces with discrete icons, we intend to explore more general solutions in the next few weeks.
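
As a rough illustration of the "point" idea, here is a toy sketch that maps a description to screen coordinates. It is not the point model itself: it assumes a detector has already produced labeled bounding boxes, and it swaps in plain fuzzy string matching (Python's difflib) for real semantic search. The `elements` list and the `point` helper are hypothetical names used only for this example.

```python
from difflib import SequenceMatcher

# Hypothetical on-screen elements, as a text/icon detector might report them.
# Each entry: (label, (left, top, width, height)) in pixels.
elements = [
    ("Submit", (400, 300, 80, 30)),
    ("Cancel", (500, 300, 80, 30)),
    ("magnifying glass icon", (20, 10, 24, 24)),
]

def point(query, elements):
    """Return the center (x, y) of the element whose label best matches query.

    Assumption: fuzzy string similarity stands in for semantic search here.
    """
    def score(label):
        return SequenceMatcher(None, query.lower(), label.lower()).ratio()

    # Pick the element with the highest similarity score.
    _, (left, top, width, height) = max(elements, key=lambda e: score(e[0]))
    return (left + width // 2, top + height // 2)

print(point("submit", elements))            # (440, 315)
print(point("magnifying glass", elements))  # (32, 22)
```

A language model could pass the returned coordinates to a mouse-control call; a real system would replace the string matcher with vision and embedding models.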

And more:

  • 5x launch speed
  • Experimental Docker support
  • Export conversation to a Jupyter notebook using %jupyter
  • Experimental one-click installers for Mac, Linux, and Windows
  • Profiles preview (a feature similar to custom GPTs)

The New Computer Update

05 Jan 04:49
54bf90f
Pre-release

● The New Computer Update

This is the most significant upgrade to Open Interpreter since 0.1.0. Almost every component has been rewritten to support our project's objective—building a standard interface between language models and computers.

Note: This update is not backwards compatible. If you use Open Interpreter in Python, please read our migration guide.

1. The Computer API

In 2023, Open Interpreter started building the world's first real-time code execution environment for language models.

Now, we're creating an API for language models to use in that environment, starting with basic I/O controls like display, mouse, and keyboard:

computer.display.view() # Returns a screenshot to vision models

computer.mouse.click("On-screen text") # Locates text on-screen and clicks it
computer.mouse.move(icon="Description of icon") # Locates icon on-screen and hovers over it

computer.clipboard.view() # Returns the contents of the user's clipboard

# Full reference: https://docs.openinterpreter.com/computer-api/reference

We are also launching a free preview of the hosted tools that power computer.mouse at api.openinterpreter.com.

2. OS Mode

You can instruct Open Interpreter to use the Computer API to control your computer graphically:

interpreter --os

Even local vision models running via .llamafile, LM-Studio, or Jan.ai are supported.

3. LMC Messages

To support the upcoming Language Model Computer (LMC) architecture, Open Interpreter's new messaging format extends OpenAI's messages format with additional fields (type and format) and a new role called computer:

[
  {
    "role": "assistant",
    "type": "code",
    "format": "python",
    "content": "plot = create_plot_from_data('data')\ndisplay_as_image(plot)\ndisplay_as_html(plot)"
  },
  {
    "role": "computer",
    "type": "image",
    "format": "base64.png",
    "content": "base64"
  },
  {
    "role": "assistant",
    "type": "message",
    "content": "Plot generated successfully."
  }
]

Read about LMC Messages here.
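
To make the format concrete, here is a minimal sketch of how client code might walk a list of LMC messages. It uses only the fields visible in the example above (role, type, format, content). The helper names `describe` and `computer_outputs` are hypothetical and not part of Open Interpreter.

```python
# The sample conversation from above, as a Python list of dicts.
lmc_messages = [
    {
        "role": "assistant",
        "type": "code",
        "format": "python",
        "content": "plot = create_plot_from_data('data')\n"
                   "display_as_image(plot)\ndisplay_as_html(plot)",
    },
    {"role": "computer", "type": "image", "format": "base64.png", "content": "base64"},
    {"role": "assistant", "type": "message", "content": "Plot generated successfully."},
]

def describe(message):
    """Hypothetical helper: build a one-line label for an LMC message."""
    label = f"{message['role']}/{message['type']}"
    fmt = message.get("format")  # "format" is absent on plain text messages
    return f"{label} ({fmt})" if fmt else label

def computer_outputs(messages):
    """Hypothetical helper: keep only messages produced by the computer role."""
    return [m for m in messages if m["role"] == "computer"]

for m in lmc_messages:
    print(describe(m))
```

Filtering on role and type like this is one way a front end could decide what to render as code, as an image, or as chat text.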

And more...

In addition to these major updates, 0.2.0 comes with a suite of fixes and enhancements from our growing open-source community:

  • Fixes crash UnboundLocalError active_block by @CyanideByte in #818
  • Inserts dummy api key if missing by @CyanideByte in #808
  • Fix README_JA.md by @tegnike in #810
  • return empty dict if config file is empty by @sbendary25 in #811
  • Add pyautogui mouse click functions and parameters by @Arrendy in #843
  • Fixed the error when using Azure OpenAI API by @wsbao in #840
  • Add package mismatches info to system_debug_info by @tegnike in #800
  • Update keyboard control functions for better input handling by @Arrendy in #845
  • implement mouse position function by @Arrendy in #850
  • Fixed another a few bugs in using OpenAI API/Azure OpenAI API/OpenAI compatible custom API by @wsbao in #848
  • Added new docs for litellm hosted models by @tyfiero in #858
  • Added refreshed docs for terminal arguments and python streaming responses by @tyfiero in #864
  • Add os docs by @tyfiero in #868
  • Fixed the case where UnicodeDecodeError by @Moonlight-YS in #863

Full Changelog: v0.1.17...v0.2.0

This is only the beginning. Happy 2024. ●

The New Computer Update (Preview)

28 Dec 12:14
Pre-release

The New Computer Update is the most significant update to Open Interpreter since v0.1.0.

Ahead of its release, we're soft-launching this preview to get feedback on the update.


pip install --upgrade open-interpreter
interpreter --os

v0.1.17

05 Dec 02:29
96ab4aa
Pre-release

Minor fix: gpt-4-1106-preview → gpt-4

We decided to make gpt-4 no longer default to gpt-4-1106-preview, which we've found performs worse on code interpretation tasks.

Important Note: This was not published from the repository. The last release, v0.1.16, was simply patched to not switch to gpt-4-1106-preview. The updates in the repo (and the commit associated with this release) are therefore out of sync with the binaries below.

We'll be back in sync by the next release of 0.2.0, the New Computer update.

v0.1.16

26 Nov 03:53
Compare
Choose a tag to compare
v0.1.16 Pre-release
Pre-release

Critical fixes and cool new magic commands in this one.

  • New magic commands added by @CyanideByte: %% {insert shell code} for running custom shell code, and %info
  • Significantly better debugging with more system information in errors, thank you @Notnaton!
  • Create multiple agents / instances and watch them communicate in Python
  • Azure now works with the config thanks to @hargunmujral in #786
  • @CyanideByte made it so if the user can't access gpt-4, we give a great error, and let them switch to gpt-3.5
  • @pratss10 added some helpful documentation (with screenshots!) for setting up gpt-4
  • vision now works via the Python package
  • Brand new German README thanks to @iongpt!
  • We're significantly more organized — one folder for the terminal_interface, another for core
  • Much better shell output processing
  • Finally, a proper ROADMAP!

Thank you to everyone that made it happen!

We cannot go without thanking the extraordinary @ericrallen for this release as well, who practically single-handedly brought our issues below 100 (from nearly 300 just a few weeks ago). Have a safe trip to Japan, Eric!

Full Changelog: v0.1.15...v0.1.16

v0.1.15

15 Nov 09:20
Pre-release

Quick fixes to resolve some common issues:

Full Changelog: v0.1.14...v0.1.15

Vision I (Quick Fixes II)

11 Nov 10:34
Pre-release
  • An issue with UNIX files has been resolved (#748)
  • Experimental support for Python in --vision mode has been added

Full Changelog: v0.1.13...v0.1.14

Vision I (Quick Fixes I)

11 Nov 06:16
Pre-release

Quick fix for --vision support on Windows. File paths should now be properly recognized and loaded into the model.

Full Changelog: v0.1.12...v0.1.13

Vision I

10 Nov 22:35
Pre-release

A quick one, a fun one. Added experimental vision support for OpenAI users.

interpreter --vision

Drag files / screenshots into your terminal to use it. Also supports reflection for HTML. (It can see the designs it produces!)

Vision II will introduce support for reflective vision in many more languages.

Full Changelog: v0.1.11...v0.1.12

Local II Update

09 Nov 16:52
Pre-release
  • Local mode is now powered by LM Studio. Running --local will tell you how to set up LM Studio + connect to it automatically.
  • It's way smaller. Removed the MASSIVE local embedding model, chromadb, oobabooga, a bunch of other packages we didn't really need. Semgrep is now optional.
  • The system message is tighter, so it's cheaper + faster on any LLM.

Several crashes have also been resolved, temperature is now properly set to 0 (which should improve performance on OpenAI models), PowerShell on Linux is now supported, an ugly print statement was removed, we're now enforcing a consistent code style (black, isort), and much more.

Full Changelog: v0.1.10...v0.1.11

Great work everyone!