# Making your own Plugin
Creating custom plugins for this project involves extending the `Tool` class from the `langchain/tools` module.

Note: I will use the word plugin interchangeably with tool, as the latter is specific to LangChain, and we are mainly conforming to the library.

You are essentially creating DynamicTools in LangChain speak. See the LangChainJS docs for more info.

This guide will walk you through the process of creating your own custom plugins, using the StableDiffusionAPI and WolframAlphaAPI tools as examples.

When using the Functions Agent (the default mode for plugins), tools are converted to OpenAI functions; in any case, plugins/tools are invoked conditionally, based on the LLM generating a specific format that we parse.

The most common implementation of a plugin is to make an API call based on natural language input from the AI, but there is virtually no limit to the programmatic use cases.
## Key Takeaways
Here are the key takeaways for creating your own plugin:

1. Import Required Modules: Import the necessary modules for your plugin, including the `Tool` class from `langchain/tools` and any other modules your plugin might need.

2. Define Your Plugin Class: Define a class for your plugin that extends the `Tool` class. Set the `name` and `description` properties in the constructor. If your plugin requires credentials or other variables, set them from the `fields` parameter or from a method that retrieves them from your process environment. Note: if your plugin requires long, detailed instructions, you can add a `description_for_model` property and make `description` more general.

3. Define Helper Methods: Define helper methods within your class to handle specific tasks if needed.

4. Implement the `_call` Method: Implement the `_call` method, where the main functionality of your plugin is defined. This method is called when the language model decides to use your plugin. It should take an input parameter and return a result. If an error occurs, the function should return a string describing the error rather than throwing it. If your plugin requires multiple inputs from the LLM, read the StructuredTools section.

5. Export Your Plugin and Import into handleTools.js: Export your plugin and import it into `handleTools.js`. Add your plugin to the `toolConstructors` object in the `loadTools` function. If your plugin requires more advanced initialization, add it to the `customConstructors` object instead.

6. Export Your Plugin from index.js: Export your plugin from the tools `index.js`. Add it to the `module.exports` object, which also means declaring it as a `const` in that file.

7. Add Your Plugin to manifest.json: Add your plugin to `manifest.json`. Follow the strict format for each of the fields of the "plugin" object. If your plugin requires authentication, add those details under `authConfig` as an array. The `pluginKey` should match the `name` property of the Tool class you made, and the `authField` prop must match the `process.env` variable name.

Remember, the key to creating a custom plugin is to extend the `Tool` class and implement the `_call` method. The `_call` method is where you define what your plugin does. You can also define helper methods and properties in your class to support the functionality of your plugin.

Note: You can find all the files mentioned in this guide in the `.\api\app\langchain\tools` folder.
## StructuredTools

### Multi-Input Plugins

If you would like to make a plugin that benefits from multiple inputs from the LLM, instead of the single input string we review here, you need to make a LangChain StructuredTool instead. A detailed guide for this is in progress, but for now, you can look at how I've made StructuredTools in this directory: `api\app\clients\tools\structured\`. This guide is foundational to understanding StructuredTools, so it's recommended you continue reading to better understand LangChain tools first. The blog linked above is also helpful once you've read through this guide.
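To give a sense of the shape of a StructuredTool, here is a minimal, hypothetical sketch, assuming the `zod` package and the `StructuredTool` export from `langchain/tools`. The tool name, schema, and logic are illustrative, not taken from the LibreChat codebase:

```js
const { StructuredTool } = require('langchain/tools');
const { z } = require('zod');

// Hypothetical multi-input tool: the name and schema are examples only.
class WeatherLookup extends StructuredTool {
  constructor() {
    super();
    this.name = 'weather-lookup';
    this.description = 'Look up the current weather for a city.';
    // The schema tells the LLM which named inputs to generate.
    this.schema = z.object({
      city: z.string().describe('The city to look up'),
      units: z.enum(['metric', 'imperial']).describe('Temperature units'),
    });
  }

  // Unlike a basic Tool, `_call` receives a parsed object matching the
  // schema, not a single string.
  async _call({ city, units }) {
    // A real tool would call a weather API here.
    return `It is sunny in ${city} (units: ${units}).`;
  }
}
```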
## Step 1: Import Required Modules
Start by importing the necessary modules. This will include the Tool class from langchain/tools and any other modules your tool might need. For example:
```js
const { Tool } = require('langchain/tools');
// ... whatever else you need
```
## Step 2: Define Your Tool Class
Next, define a class for your plugin that extends the `Tool` class. The class should have a constructor that calls the `super()` method and sets the `name` and `description` properties. These properties will be used by the language model to determine when to call your tool and with what parameters.

Important: you should set credentials/necessary variables from the `fields` parameter, or alternatively from a method that gets them from your process environment:
```js
class StableDiffusionAPI extends Tool {
  constructor(fields) {
    super();
    this.name = 'stable-diffusion';
    this.url = fields.SD_WEBUI_URL || this.getServerURL(); // <--- important!
    this.description = `You can generate images with 'stable-diffusion'. This tool is exclusively for visual content...`;
  }
  // ...
}
```
Optional: As of v0.5.8, when using Functions, you can add longer, more detailed instructions with the `description_for_model` property. When doing so, it's recommended you make the `description` property more generalized to optimize token usage. Each line in `description_for_model` is prefixed with `//` to mirror how the prompt is generated for ChatGPT (chat.openai.com). This format more closely aligns with the prompt engineering of official ChatGPT plugins.
```js
// ...
this.description_for_model = `// Generate images and visuals using text with 'stable-diffusion'.
// Guidelines:
// - ALWAYS use {{"prompt": "7+ detailed keywords", "negative_prompt": "7+ detailed keywords"}} structure for queries.
// - Visually describe the moods, details, structures, styles, and/or proportions of the image. Remember, the focus is on visual attributes.
// - Craft your input by "showing" and not "telling" the imagery. Think in terms of what you'd want to see in a photograph or a painting.
// - Here's an example for generating a realistic portrait photo of a man:
// "prompt":"photo of a man in black clothes, half body, high detailed skin, coastline, overcast weather, wind, waves, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3"
// "negative_prompt":"semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, out of frame, low quality, ugly, mutation, deformed"
// - Generate images only once per human query unless explicitly requested by the user`;
this.description = 'You can generate images using text with \'stable-diffusion\'. This tool is exclusively for visual content.';
// ...
```
Within the constructor, note that we're getting a sensitive variable from either the `fields` object or from the `getServerURL` method we define to access an environment variable:

```js
this.url = fields.SD_WEBUI_URL || this.getServerURL();
```

Any necessary credentials are passed through `fields` when the user provides them from the frontend; otherwise, the admin can "authorize" the plugin for all users through environment variables. All credentials passed from the frontend are encrypted.
```js
// It's recommended you follow this convention when accessing environment variables.
getServerURL() {
  const url = process.env.SD_WEBUI_URL || '';
  if (!url) {
    throw new Error('Missing SD_WEBUI_URL environment variable.');
  }
  return url;
}
```
## Step 3: Define Helper Methods
You can define helper methods within your class to handle specific tasks if needed. For example, the StableDiffusionAPI class includes methods like `replaceNewLinesWithSpaces`, `getMarkdownImageUrl`, and `getServerURL` to handle various tasks.
```js
class StableDiffusionAPI extends Tool {
  // ...
  replaceNewLinesWithSpaces(inputString) {
    return inputString.replace(/\r\n|\r|\n/g, ' ');
  }
  // ...
}
```
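As a hypothetical illustration of another helper (not the actual LibreChat implementation), `getMarkdownImageUrl` might simply wrap a generated image URL in markdown so the chat client renders it inline:

```js
// Hypothetical sketch: format an image URL as markdown for the chat client.
getMarkdownImageUrl(imageUrl) {
  return `![generated image](${imageUrl})`;
}
```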
## Step 4: Implement the _call Method
The `_call` method is where the main functionality of your plugin is implemented. This method is called when the language model decides to use your plugin. It should take an input parameter and return a result.

In a basic Tool, the LLM will generate one string value as an input. If your plugin requires multiple inputs from the LLM, read the StructuredTools section above.
```js
class StableDiffusionAPI extends Tool {
  // ...
  async _call(input) {
    // Your tool's functionality goes here
    // ...
    return this.result;
  }
}
```
Important: The `_call` function is what the agent will actually call. When an error occurs, the function should, when possible, return a string describing the error rather than throwing it. This allows the error to be passed to the LLM, which can then decide how to handle it. If an error is thrown, execution of the agent will stop.
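Here is a minimal sketch of that pattern, assuming a hypothetical `axios` call to `this.url`; the endpoint and response shape are examples only:

```js
async _call(input) {
  try {
    const response = await axios.post(this.url, { prompt: input });
    return response.data.result; // hypothetical response shape
  } catch (error) {
    // Return a string instead of throwing, so the LLM can see and handle it.
    return `Error generating image: ${error.message}`;
  }
}
```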
## Step 5: Export Your Plugin and Import into handleTools.js
This process will be somewhat automated in the future, as long as you have your plugin/tool in `api\app\langchain\tools`.
```js
// Export
module.exports = StableDiffusionAPI;
```

```js
/* api\app\langchain\tools\handleTools.js */
const StableDiffusionAPI = require('./StableDiffusion');
// ...
```
In `handleTools.js`, find the beginning of the `loadTools` function and add your plugin/tool to the `toolConstructors` object.
```js
const loadTools = async ({ user, model, tools = [], options = {} }) => {
  const toolConstructors = {
    calculator: Calculator,
    google: GoogleSearchAPI,
    wolfram: WolframAlphaAPI,
    'dall-e': OpenAICreateImage,
    'stable-diffusion': StableDiffusionAPI, // <----- Newly Added. Note: the key is the 'name' provided in the class.
    // We will now refer to this name as the `pluginKey`
  };
```
If your Tool class requires more advanced initialization, you would add it to the `customConstructors` object instead.

The default initialization can be seen in the `loadToolWithAuth` function, and most custom plugins should be initialized this way.
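In simplified terms, the default pattern looks something like the following sketch (not the exact LibreChat implementation): each required credential is read from the environment first, then from the user's stored plugin auth values.

```js
// Simplified sketch of the default initialization (actual code may differ).
const loadToolWithAuth = (user, authFields, ToolConstructor) => {
  return async () => {
    const authValues = {};
    for (const authField of authFields) {
      // Prefer an admin-set environment variable...
      let value = process.env[authField];
      if (!value) {
        // ...otherwise fall back to the user's stored, encrypted credential.
        value = await getUserPluginAuthValue(user, authField);
      }
      authValues[authField] = value;
    }
    return new ToolConstructor(authValues);
  };
};
```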
Here are a few `customConstructors`, which have varying initializations:
```js
const customConstructors = {
  browser: async () => {
    let openAIApiKey = process.env.OPENAI_API_KEY;
    if (!openAIApiKey) {
      openAIApiKey = await getUserPluginAuthValue(user, 'OPENAI_API_KEY');
    }
    return new WebBrowser({ model, embeddings: new OpenAIEmbeddings({ openAIApiKey }) });
  },
  // ...
  plugins: async () => {
    return [
      new HttpRequestTool(),
      await AIPluginTool.fromPluginUrl(
        'https://www.klarna.com/.well-known/ai-plugin.json',
        new ChatOpenAI({ openAIApiKey: options.openAIApiKey, temperature: 0 }),
      ),
    ];
  },
};
```
## Step 6: Export your Plugin into index.js
Find the `index.js` under `api/app/clients/tools`. You need to add your plugin to the `module.exports` object; for it to compile, you will also need to declare your plugin as a `const` in this file:
```js
const StructuredSD = require('./structured/StableDiffusion');
const StableDiffusionAPI = require('./StableDiffusion');
// ...

module.exports = {
  // ...
  StableDiffusionAPI,
  StructuredSD,
  // ...
};
```
## Step 7: Add your Plugin to manifest.json
This process will be somewhat automated in the future along with step 5, as long as you have your plugin/tool in `api\app\langchain\tools` and your plugin can be initialized with the default method.
```json
{
  "name": "Calculator",
  "pluginKey": "calculator",
  "description": "Perform simple and complex mathematical calculations.",
  "icon": "https://i.imgur.com/RHsSG5h.png",
  "isAuthRequired": "false",
  "authConfig": []
},
{
  "name": "Stable Diffusion",
  "pluginKey": "stable-diffusion",
  "description": "Generate photo-realistic images given any text input.",
  "icon": "https://i.imgur.com/Yr466dp.png",
  "authConfig": [
    {
      "authField": "SD_WEBUI_URL",
      "label": "Your Stable Diffusion WebUI API URL",
      "description": "You need to provide the URL of your Stable Diffusion WebUI API. For instructions on how to obtain this, see <a href='url'>Our Docs</a>."
    }
  ]
},
```
Each of the fields of the "plugin" object is important. Follow this format strictly. If your plugin requires authentication, add those details under `authConfig` as an array, since there could be multiple authentication variables. See the Calculator plugin for an example of one that doesn't require authentication, where `authConfig` is an empty array (an array is always required).

Note: as mentioned earlier, the `pluginKey` matches the `name` property of the Tool class you made.

Note: the `authField` prop must match the `process.env` variable name.

Here is an example of a plugin with more than one credential variable:
```json
[
  {
    "name": "Google",
    "pluginKey": "google",
    "description": "Use Google Search to find information about the weather, news, sports, and more.",
    "icon": "https://i.imgur.com/SMmVkNB.png",
    "authConfig": [
      {
        "authField": "GOOGLE_CSE_ID",
        "label": "Google CSE ID",
        "description": "This is your Google Custom Search Engine ID. For instructions on how to obtain this, see <a href='https://github.com/danny-avila/LibreChat/blob/main/docs/features/plugins/google_search.md'>Our Docs</a>."
      },
      {
        "authField": "GOOGLE_API_KEY",
        "label": "Google API Key",
        "description": "This is your Google Custom Search API Key. For instructions on how to obtain this, see <a href='https://github.com/danny-avila/LibreChat/blob/main/docs/features/plugins/google_search.md'>Our Docs</a>."
      }
    ]
  },
```
## Example: WolframAlphaAPI Tool
Here's another example of a custom tool, the WolframAlphaAPI tool. This tool uses the `axios` module to make HTTP requests to the Wolfram Alpha API.
```js
const axios = require('axios');
const { Tool } = require('langchain/tools');

class WolframAlphaAPI extends Tool {
  constructor(fields) {
    super();
    this.name = 'wolfram';
    this.apiKey = fields.WOLFRAM_APP_ID || this.getAppId();
    this.description = `Access computation, math, curated knowledge & real-time data through wolframAlpha...`;
  }

  async fetchRawText(url) {
    try {
      const response = await axios.get(url, { responseType: 'text' });
      return response.data;
    } catch (error) {
      console.error(`Error fetching raw text: ${error}`);
      throw error;
    }
  }

  getAppId() {
    const appId = process.env.WOLFRAM_APP_ID || '';
    if (!appId) {
      throw new Error('Missing WOLFRAM_APP_ID environment variable.');
    }
    return appId;
  }

  createWolframAlphaURL(query) {
    // Remove backticks and newlines before encoding the query
    const formattedQuery = query.replaceAll(/`/g, '').replaceAll(/\n/g, ' ');
    const baseURL = 'https://www.wolframalpha.com/api/v1/llm-api';
    const encodedQuery = encodeURIComponent(formattedQuery);
    const appId = this.apiKey || this.getAppId();
    const url = `${baseURL}?input=${encodedQuery}&appid=${appId}`;
    return url;
  }

  async _call(input) {
    try {
      const url = this.createWolframAlphaURL(input);
      const response = await this.fetchRawText(url);
      return response;
    } catch (error) {
      if (error.response && error.response.data) {
        console.log('Error data:', error.response.data);
        return error.response.data;
      } else {
        console.log('Error querying Wolfram Alpha', error.message);
        return 'There was an error querying Wolfram Alpha.';
      }
    }
  }
}

module.exports = WolframAlphaAPI;
```
In this example, the WolframAlphaAPI class has helper methods like `fetchRawText`, `getAppId`, and `createWolframAlphaURL` to handle specific tasks. The `_call` method makes an HTTP request to the Wolfram Alpha API and returns the response.
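If you want to smoke-test a tool outside of the agent, you can instantiate and call it directly. A minimal sketch, assuming `WOLFRAM_APP_ID` is set in your environment and the require path matches where you saved the file:

```js
const WolframAlphaAPI = require('./WolframAlphaAPI'); // hypothetical path

(async () => {
  // Passing an empty object falls back to process.env.WOLFRAM_APP_ID.
  const wolfram = new WolframAlphaAPI({});
  // `call` is inherited from the Tool base class and wraps `_call`.
  const result = await wolfram.call('What is the integral of x^2?');
  console.log(result);
})();
```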