{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dummy Agent Library\n", "\n", "In this simple example, **we're going to code an Agent from scratch**.\n", "\n", "This notebook is part of the Hugging Face Agents Course, a free course from beginner to expert, where you learn to build Agents.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -q huggingface_hub" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Serverless API\n", "\n", "In the Hugging Face ecosystem, there is a convenient feature called Serverless API that allows you to easily run inference on many models. There's no installation or deployment required.\n", "\n", "To run this notebook, **you need a Hugging Face token** that you can get from https://hf.co/settings/tokens. A \"Read\" token type is sufficient.\n", "- If you are running this notebook on Google Colab, you can set it up in the \"settings\" tab under \"secrets\". Make sure to call it \"HF_TOKEN\" and restart the session to load the environment variable (Runtime -> Restart session).\n", "- If you are running this notebook locally, you can set it up as an [environment variable](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables). Make sure you restart the kernel after installing or updating huggingface_hub. 
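As a quick sanity check that the token is actually visible to the kernel, you can run a cell like the following (the `hf_dummy` fallback is a placeholder used only for illustration, not a real token):

```python
import os

# 'hf_dummy' is an illustrative placeholder, not a real token.
token = os.environ.get('HF_TOKEN', 'hf_dummy')
print('HF_TOKEN is set' if token != 'hf_dummy' else 'HF_TOKEN is missing -- set it and restart the kernel')
```
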
You can upgrade `huggingface_hub` by adding the `-U` flag to the install command above: `!pip install -q huggingface_hub -U`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "from huggingface_hub import InferenceClient\n", "\n", "## You need a token from https://hf.co/settings/tokens; ensure that you select 'read' as the token type.\n", "## If you run this on Google Colab, add it in the \"Secrets\" tab (key icon on the left sidebar) and call it \"HF_TOKEN\".\n", "try:\n", "    from google.colab import userdata\n", "    HF_TOKEN = userdata.get(\"HF_TOKEN\")\n", "except ImportError:\n", "    HF_TOKEN = os.environ.get(\"HF_TOKEN\")\n", "\n", "client = InferenceClient(model=\"moonshotai/Kimi-K2.5\", token=HF_TOKEN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the `chat` method since it is a convenient and reliable way to apply chat templates:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output = client.chat.completions.create(\n", "    messages=[\n", "        {\"role\": \"user\", \"content\": \"The capital of France is\"},\n", "    ],\n", "    stream=False,\n", "    max_tokens=1024,\n", "    extra_body={'thinking': {'type': 'disabled'}},\n", ")\n", "print(output.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `chat` method is the RECOMMENDED method to use, since it ensures a **smooth transition between models**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dummy Agent\n", "\n", "In the previous sections, we saw that the **core of an agent library is to append information in the system prompt**.\n", "\n", "This system prompt is a bit more complex than the one we saw earlier, but it already contains:\n", "\n", "1. **Information about the tools**\n", "2. 
**Cycle instructions** (Thought → Action → Observation)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This system prompt is a bit more complex: we suppose that the textual\n", "# description of the tools has already been appended to it.\n", "\n", "SYSTEM_PROMPT = \"\"\"Answer the following questions as best you can. You have access to the following tools:\n", "\n", "get_weather: Get the current weather in a given location\n", "\n", "The way you use the tools is by specifying a json blob.\n", "Specifically, this json should have an `action` key (with the name of the tool to use) and an `action_input` key (with the input to the tool going here).\n", "\n", "The only values that should be in the \"action\" field are:\n", "get_weather: Get the current weather in a given location, args: {\"location\": {\"type\": \"string\"}}\n", "example use:\n", "\n", "{{\n", "  \"action\": \"get_weather\",\n", "  \"action_input\": {{\"location\": \"New York\"}}\n", "}}\n", "\n", "\n", "ALWAYS use the following format:\n", "\n", "Question: the input question you must answer\n", "Thought: you should always think about one action to take. Only one action at a time in this format:\n", "Action:\n", "\n", "$JSON_BLOB (inside markdown cell)\n", "\n", "Observation: the result of the action. This Observation is unique, complete, and the source of truth.\n", "... (this Thought/Action/Observation can repeat N times, you should take several steps when needed. The $JSON_BLOB must be formatted as markdown and only use a SINGLE action at a time.)\n", "\n", "You must always end your output with the following format:\n", "\n", "Thought: I now know the final answer\n", "Final Answer: the final answer to the original input question\n", "\n", "Now begin! Reminder to ALWAYS use the exact characters `Final Answer:` when you provide a definitive answer. 
\"\"\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to append the user instruction after the system prompt. This happens inside the `chat` method. We can see this process below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "messages = [\n", "    {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n", "    {\"role\": \"user\", \"content\": \"What's the weather in London?\"},\n", "]\n", "\n", "print(messages)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's call the `chat` method!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output = client.chat.completions.create(\n", "    messages=messages,\n", "    stream=False,\n", "    max_tokens=200,\n", "    extra_body={'thinking': {'type': 'disabled'}},\n", ")\n", "print(output.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do you see the issue?\n", "\n", "> At this point, the model is hallucinating, because it's producing a fabricated \"Observation\" -- a response that it generates on its own rather than being the result of an actual function or tool call.\n", "> To prevent this, we stop generating right before \"Observation:\".\n", "> This allows us to manually run the function (e.g., `get_weather`) and then insert the real output as the Observation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The answer was hallucinated by the model. We need to stop to actually execute the function!\n", "output = client.chat.completions.create(\n", "    messages=messages,\n", "    max_tokens=150,\n", "    stop=[\"Observation:\"],  # Let's stop before any actual function is called\n", "    extra_body={'thinking': {'type': 'disabled'}},\n", ")\n", "\n", "print(output.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Much better!\n", "\n", "Let's now create a **dummy `get_weather` function**. 
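For contrast, here is a hedged sketch of what a live version might look like using only the standard library (the Open-Meteo endpoint, its `current_weather` response field, and the idea of passing coordinates instead of a city name are assumptions for illustration; the function is only defined here, not called):

```python
import json
from urllib.request import urlopen

def get_weather_live(lat, lon):
    # Illustrative sketch: query a public weather API by coordinates
    # (Open-Meteo takes latitude/longitude rather than a city name).
    url = (
        'https://api.open-meteo.com/v1/forecast'
        f'?latitude={lat}&longitude={lon}&current_weather=true'
    )
    with urlopen(url) as resp:
        current = json.load(resp)['current_weather']
    temp = current['temperature']
    wind = current['windspeed']
    return f'the weather is {temp} C with wind at {wind} km/h'
```

Calling `get_weather_live(51.5, -0.1)` (roughly London) would then return a live reading in the same shape as the canned string below.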
In a real situation, you could call an API." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Dummy function\n", "def get_weather(location):\n", "    return f\"the weather in {location} is sunny with low temperatures. \\n\"\n", "\n", "get_weather('London')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's concatenate the system prompt, the user question, the completion up to the function call, and the result of the function as an Observation, then resume generation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "messages = [\n", "    {\"role\": \"system\", \"content\": SYSTEM_PROMPT},\n", "    {\"role\": \"user\", \"content\": \"What's the weather in London?\"},\n", "    {\"role\": \"assistant\", \"content\": output.choices[0].message.content + \"Observation:\\n\" + get_weather('London')},\n", "]\n", "\n", "output = client.chat.completions.create(\n", "    messages=messages,\n", "    stream=False,\n", "    max_tokens=200,\n", "    extra_body={'thinking': {'type': 'disabled'}},\n", ")\n", "\n", "print(output.choices[0].message.content)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We learned how we can create Agents from scratch using Python code, and we **saw just how tedious that process can be**. Fortunately, many Agent libraries simplify this work by handling much of the heavy lifting for you.\n", "\n", "Now, we're ready **to create our first real Agent** using the `smolagents` library." ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }