Open Interpreter: 2024’s Ultimate LLM OS Guide

Open Interpreter is a powerful tool that lets large language models (LLMs) run code locally, including Python, JavaScript, and shell. It provides a natural language interface to a wide range of computer functions, bridging the gap between human communication and machine execution.
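For readers new to the tool, here is a minimal sketch of getting started with the documented Python API (install first with pip install open-interpreter; the natural-language prompt is illustrative):

# A minimal Open Interpreter session via the Python API
from interpreter import interpreter

# Drive the tool with a single natural-language instruction; the model
# writes code to fulfil it and asks for confirmation before running it.
interpreter.chat("List the five largest files in my home directory")

# Calling interpreter.chat() with no argument opens an interactive
# terminal chat, the same as running the bare interpreter command.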

Open Interpreter Functionality

File and Code Operations

  1. Manage files and folders: Copy, move, delete, find files, etc.
  2. Write and run code: Support for multiple programming languages like Python, JavaScript, etc.
  3. Automate tasks: Write scripts to automatically execute repetitive operations (see the sketch after this list).
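As a sketch of item 3, here is a repetitive file task driven through the Python API. The task prompt is illustrative; auto_run is a real setting that skips the per-step confirmation prompt, so use it carefully.

from interpreter import interpreter

# WARNING: auto_run executes generated code without asking for confirmation.
interpreter.auto_run = True

# An illustrative repetitive chore: the LLM plans the task, writes the
# code, runs it, and reports back in natural language.
interpreter.chat("Move every .log file in ~/Downloads into ~/Downloads/logs")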

System Interaction

  1. Retrieve system information: View hardware specifications, network status, running processes, etc. (see the sketch after this list).
  2. Control applications: Open, close, and monitor applications.
  3. Execute command-line instructions: Run system commands without leaving the current interface.
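To illustrate item 1, this is the kind of script Open Interpreter typically generates for a system-information request; the psutil library used here is a common choice, not a fixed part of the tool.

import psutil

# Snapshot of CPU and memory, the sort of summary the LLM might produce
print(f"CPU cores: {psutil.cpu_count(logical=True)}")
print(f"CPU usage: {psutil.cpu_percent(interval=1)}%")

mem = psutil.virtual_memory()
print(f"Memory: {mem.used / 2**30:.1f} GiB used of {mem.total / 2**30:.1f} GiB")

# Running processes using more than 1% of memory
for proc in psutil.process_iter(["pid", "name", "memory_percent"]):
    if (proc.info["memory_percent"] or 0) > 1.0:
        print(proc.info["pid"], proc.info["name"])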

Data Processing and Analysis

  1. Read and process data: Obtain data from files, databases, and networks.
  2. Data cleaning and transformation: Organize, filter, and format data.
  3. Data analysis and visualization: Perform statistical analysis and generate charts to reveal patterns in the data (see the sketch after this list).
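A sketch of item 3, in the style of code Open Interpreter commonly writes for such requests; the CSV file and column names are hypothetical.

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical input file with columns: date, region, revenue
df = pd.read_csv("sales.csv")

# Cleaning and transformation: drop incomplete rows, parse dates
df = df.dropna()
df["date"] = pd.to_datetime(df["date"])

# Analysis and visualization: monthly revenue per region as a bar chart
monthly = df.groupby([df["date"].dt.to_period("M"), "region"])["revenue"].sum().unstack()
monthly.plot(kind="bar")
plt.ylabel("Revenue")
plt.tight_layout()
plt.savefig("monthly_revenue.png")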

Network and Internet Operations

  1. Download and upload files: Download files from the network or upload local files to servers.
  2. Send emails: Automatically send email notifications or reports.
  3. Web scraping: Extract data from web pages (see the sketch after this list).
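As an example of item 3, a minimal scraping script of the sort the LLM tends to produce; the URL is a placeholder, and the requests and beautifulsoup4 packages are assumed to be installed.

import requests
from bs4 import BeautifulSoup

# Placeholder URL; swap in the page you actually want to scrape
resp = requests.get("https://example.com", timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")

# Extract every link's text and target from the page
for a in soup.find_all("a", href=True):
    print(a.get_text(strip=True), "->", a["href"])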

Design and Multimedia

  1. Website design: Design simple websites using HTML, CSS, and JavaScript.
  2. Photo editing: Complete basic image editing tasks using Python libraries like PIL or OpenCV (see the sketch after this list).
  3. Video processing: Write scripts using Python’s MoviePy library to complete basic video processing tasks.
  4. Create animations: Use JavaScript or CSS to create basic animations.
  5. User interface design: Design simple user interfaces for applications.
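As a sketch of item 2, a basic edit with PIL (Pillow); the file names are placeholders.

from PIL import Image, ImageDraw

# Placeholder input path
img = Image.open("photo.jpg").convert("RGB")

# Basic edits: shrink to half size, then stamp a simple text watermark
img = img.resize((img.width // 2, img.height // 2))
draw = ImageDraw.Draw(img)
draw.text((10, img.height - 30), "(c) example", fill=(255, 255, 255))

img.save("photo_edited.jpg", quality=90)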

New Model Commands

Groq (Free API)

interpreter --api_base "https://api.groq.com/openai/v1" --api_key "xxx" --model "llama3-70b-8192" --context_window 8192

Together AI/Llama3 70B ($25 sign-up credit)

export TOGETHER_API_KEY=xxx
interpreter --model together_ai/meta-llama/Llama-3-70b-chat-hf --context_window 8192

Claude

export ANTHROPIC_API_KEY=xxx

interpreter --model anthropic/claude-3-haiku-20240307

or

interpreter --model anthropic/claude-3-opus-20240229
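The same settings are also available from Python; a sketch assuming a recent open-interpreter version, where model settings live under interpreter.llm (the key is a placeholder):

from interpreter import interpreter

# Python equivalent of the Groq CLI example above
interpreter.llm.api_base = "https://api.groq.com/openai/v1"
interpreter.llm.api_key = "xxx"  # placeholder key
interpreter.llm.model = "llama3-70b-8192"
interpreter.llm.context_window = 8192

interpreter.chat()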

Additionally, Open Interpreter has officially launched Local III.

Local III introduces an optional free hosted model called ‘i’ (currently Llama3-70B). Conversations with the ‘i’ model will be used to train Open Interpreter’s own open-source computer-control language model, with personally identifying information removed before the model and training set are open-sourced.

interpreter --model i

Open Interpreter Local III Update

New features include:

  • Easy-to-use local model browser
  • Deep integration with inference engines like Ollama
  • Custom configuration files for open-source models like Llama3, Moondream, and Codestral
  • A set of settings to improve offline code interpretation reliability
Upgrade to get these features:

pip install --upgrade open-interpreter

Local Model Browser

Local III makes using local models easier than ever. Through an interactive setup, users can:

  • Choose the inference provider
  • Select the model
  • Download new models

Launch the local model browser with:

interpreter --local

Optimized Configuration Files

The Open Interpreter team has experimented extensively with SOTA local language models such as Codestral, Llama3, and Qwen. Users can configure Open Interpreter to use the recommended settings for each with the following flags:

interpreter --profile codestral.py # Best settings for Codestral
interpreter --profile llama3.py # Best settings for Llama3
interpreter --profile qwen.py # Best settings for Qwen

Note: The profile flag loads settings from files in the profiles directory, which users can open by running:

interpreter --profiles
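Profiles are themselves small Python files that configure the interpreter object, so a custom one can be dropped into that directory. A minimal hypothetical profile, myllama.py, run with interpreter --profile myllama.py:

# myllama.py: a hypothetical custom profile
from interpreter import interpreter

interpreter.llm.model = "ollama/llama3"  # served locally by Ollama
interpreter.llm.context_window = 8192
interpreter.llm.max_tokens = 1000
interpreter.offline = True               # avoid calls to hosted services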

Local Vision

Images sent to local models are presented as image descriptions generated by Moondream (a small vision model). The model also receives OCR text extracted from the images.

interpreter --local --vision

Experimental Local OS Mode

Building on local vision, Local III also adds experimental support for a local OS mode.

In this mode, Open Interpreter can control the user’s mouse and keyboard and view the screen. The LLM interacts with the computer by clicking icons recognized by the open-source Point model.

interpreter --local --os

Summary of This Update

This update includes:

  1. Easy-to-use local model browser
  2. Deep integration with inference engines like Ollama
  3. Custom configuration files for open-source models such as Llama3, Moondream, and Codestral
  4. A set of settings to improve offline code interpretation reliability

User Experience

interpreter --local --vision

  • Image renaming
  • Adding watermarks to images

The author notes that Open Interpreter uses the Moondream vision model for these tasks, which they introduced in a previous article.

interpreter --local

  • File classification: With a cluttered Downloads folder, the author used Open Interpreter and the local Codestral model to analyze its file composition.
  • Querying computer space usage
  • Creating charts

Stock Price Retrieval Issues

The author encountered errors when trying to get NVIDIA stock prices using local models (Codestral, Llama3 8B, Yi:9B) and even the API-connected Claude Opus.

Video Downloading

A tip: explicitly telling the LLM which library to use, rather than letting it choose, led to success.
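The article does not say which library the author named, so yt-dlp below is only an illustrative stand-in; the point is that naming a well-suited library in the prompt steers the LLM away from less reliable approaches.

from interpreter import interpreter

# yt-dlp is a stand-in here: the source does not name the actual library.
# Naming one up front keeps the model from guessing at fragile solutions.
interpreter.chat("Download this video using the yt-dlp library: <URL>")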

Article Summarization

Local LLMs, including Llama 3 70B, couldn’t summarize articles, whereas the Claude Haiku and Opus models succeeded.

Video Processing

The author noted behavioral differences between local LLMs and Claude Haiku. For example, Haiku would first check whether FFmpeg was installed on the computer, while local LLMs would suggest installing it outright.
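That pre-flight check is simple to express; the following is a reconstruction of the kind of probe Haiku generated, not code quoted from the article.

import shutil
import subprocess

# Check whether ffmpeg is already on PATH before proposing an install
if shutil.which("ffmpeg"):
    result = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
    print(result.stdout.splitlines()[0])  # the ffmpeg version banner
else:
    print("ffmpeg not found; install it before processing video")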

interpreter --local --os

The author found this mode unstable; it could not complete tasks such as opening Typora and creating a new file.

Conclusion

Based on the user experience, the author appreciates Open Interpreter’s ability to automatically break down tasks, write code, and reflect on errors. However, they acknowledge that it still has many issues to resolve.

The best models to pair with Open Interpreter are GPT-4 and Claude Opus, but their main drawback is the high cost. Using local LLMs can lead to many bottlenecks, especially if users are unfamiliar with various libraries.

Currently, Open Interpreter can complete relatively simple tasks, but it’s still far from being a full LLM OS.

The author mentions GitHub Copilot’s command-line interface (CLI) as a similar tool and promises to introduce another AI command-line tool in a future article.

The piece concludes by inviting readers to share their thoughts and experiences in the comments section, encouraging further discussion and improvement.
