Published on July 23, 2024

Building a Smart Command Bar Using Gemini Nano

Developing a Smart Command Bar with LLM Capabilities

The integration of large language models (LLMs) directly into web applications is changing how users interact with websites. One exciting development is Google's Gemini Nano, an LLM that runs locally inside Google Chrome, eliminating the need for external API calls and their per-token fees. This post explores how to build a Shadcn-derived command bar that uses Gemini Nano to let users navigate with natural language commands.

Overview

Gemini Nano, Google’s on-device large language model (LLM), understands and processes natural language commands directly in the browser. This capability is especially useful for building intuitive command bars that respond to user input in real time.

How to Access Gemini Nano

If you're new to Gemini Nano, here's a brief guide on how to get started:

Gain Access: Complete the required form to request access to Gemini Nano. Approval typically takes about 24 hours.
Download Developer Chrome: Download the developer version of Google Chrome from the Google Chrome Developer Tools page.
Enable Experimental Features: Open Chrome and navigate to chrome://flags/#optimization-guide-on-device-model and set the flag to Enabled (BypassPerfRequirement). Next, go to chrome://flags/#prompt-api-for-gemini-nano and enable this flag as well. Relaunch Chrome to apply these changes.
Download the Model: Visit `chrome://components/`, find Optimization Guide On Device Model, and click Check for update.

Once these steps are completed, Gemini Nano should be ready to use within your Chrome browser. For reference, see Google's documentation.
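To confirm the setup worked, a quick availability check can help. The sketch below is a minimal example, assuming the experimental `window.ai.canCreateTextSession()` call used later in this post; the API is experimental and may differ in other Chrome versions. The helper name `isNanoReady` is hypothetical.

```javascript
// Hypothetical helper: resolves to true only when the experimental Prompt API
// reports that the on-device model can be used immediately.
// `ai` is expected to be `window.ai` in a Chrome build with the flags enabled.
async function isNanoReady(ai) {
  if (!ai || typeof ai.canCreateTextSession !== "function") {
    return false; // API not exposed: flags disabled or unsupported browser
  }
  try {
    return (await ai.canCreateTextSession()) === "readily";
  } catch (error) {
    return false; // Treat any failure as "not available"
  }
}

// Browser usage, for example from the DevTools console:
// isNanoReady(window.ai).then((ready) => console.log("Gemini Nano ready:", ready));
```

If this resolves to `false`, re-check the two flags and confirm the model finished downloading in `chrome://components/`.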

Setting Up the Project

To get started, set up your project environment. For this example, we’ll use a React project.

git clone https://github.com/LLMByte/SmartBar
cd SmartBar
npm install

Using the Command Bar

The command bar components provide a sleek and responsive interface. The AI integration is handled by Google’s Gemini Nano, which processes user input and suggests the most relevant links.

Importing Necessary Libraries

First, import the necessary libraries and components:

import { GoogleGenerativeAI } from "@google/generative-ai";
import { CommandMenu } from "./components/CommandMenu";

Setting Up the Generative AI

Next, set up the generative AI instance. The setup ensures that if Gemini Nano is unavailable, the system falls back to Gemini Flash for continuity.

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");

The code below shows how Gemini Nano is used inside the SmartBar. If you are using the SmartBar component, you do not need to write this part of the code yourself.

// Cache the Gemini Nano session so it is created only once
let session = null;

const runGenApi = async (userInput, links, genAI) => {
  // Cloud fallback model, configured to return JSON
  const model = genAI.getGenerativeModel({
    model: "gemini-1.5-flash",
    generationConfig: { responseMimeType: "application/json" }
  });

  const prompt = `Give a JSON in the format {"name":"name here"} based on the query: Find a page where the user can do the following: "${userInput}". The following links are available: ${JSON.stringify(links)}. Return the most relevant name given in the links, and the name should strictly be present in the given links JSON. Note the output should strictly consist only of JSON and no other formatting.`;
  let response = '';

  try {
    // Try to use Gemini Nano if available
    if (window.ai && await window.ai.canCreateTextSession() === 'readily') {
      if (!session) session = await window.ai.createTextSession();
      response = await session.prompt(prompt);
      // Strip Markdown code fences that Nano may wrap around the JSON
      response = response.replace(/```json|```/g, '').trim();
    } else {
      // Fallback to Gemini Flash if Nano is not available
      const result = await model.generateContent(prompt);
      response = result.response.text();
      console.log("Gemini Flash Response " + response);
    }

    const text = JSON.parse(response);
    console.log("Model response: ", text);
    return [text.name.toLowerCase()];
  } catch (error) {
    console.error("Error running GenAI:", error);
    return []; // Return an empty array on error
  }
};

In the above code, we set up GoogleGenerativeAI with an API key and create a prompt to process user inputs. If Gemini Nano is available, it is used to generate a response. Otherwise, Gemini Flash is used as a fallback.

Trimming the String to a JSON Object:

In the runGenApi function, we handle the response from the AI. Sometimes, the AI's response might include extra formatting such as code blocks ( ```json ... ``` ). To ensure we only parse the JSON content, we use the replace method to remove these code block markers and then trim any extra whitespace:

response = response.replace(/```json|```/g, '').trim();

This cleaned string is then parsed into a JSON object:

const text = JSON.parse(response);
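The prompt asks the model to return only a name that exists in the links, but a model can still answer with malformed JSON or an invented name. The sketch below shows one way to combine the cleanup with that check. It is not part of the SmartBar code: the helper name `parseModelName` is hypothetical, and it assumes each entry in `links` has a string `name` property.

```javascript
// Hypothetical helper: extracts a link name from raw model output.
// Assumption: every entry in `links` has a string `name` property.
// Returns the matching name from `links`, or null if nothing usable is found.
function parseModelName(raw, links) {
  // Remove Markdown code fences and surrounding whitespace
  const cleaned = String(raw).replace(/```json|```/g, "").trim();

  let parsed;
  try {
    parsed = JSON.parse(cleaned);
  } catch (error) {
    return null; // Output was not valid JSON
  }

  if (!parsed || typeof parsed.name !== "string") return null;

  // Accept the name only if it matches a known link (case-insensitive)
  const wanted = parsed.name.toLowerCase();
  const match = links.find((link) => String(link.name).toLowerCase() === wanted);
  return match ? match.name : null;
}
```

Returning `null` for every failure mode gives the caller a single case to handle, for example by showing the unfiltered link list.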

Now, render the command menu component. It handles user input, calls the AI for suggestions, and displays the results.

<SmartCommandMenu links={links} genAI={genAI}/>
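The post does not show the shape of `links` or how the component uses the names returned by `runGenApi`. As a rough illustration only, the sketch below assumes each link has `name` and `href` fields (both hypothetical) and shows how the lowercase names returned by `runGenApi` could be matched back to link entries.

```javascript
// Hypothetical link list; the `name` and `href` fields are assumptions,
// not the documented shape of the SmartBar `links` prop.
const links = [
  { name: "Billing", href: "/settings/billing" },
  { name: "Team Members", href: "/settings/team" },
  { name: "API Keys", href: "/developers/keys" },
];

// Hypothetical helper: returns the links whose names appear in `names`.
// runGenApi returns lowercase names, so the comparison ignores case.
function filterLinksByNames(allLinks, names) {
  const wanted = new Set(names.map((n) => n.toLowerCase()));
  return allLinks.filter((link) => wanted.has(link.name.toLowerCase()));
}

// Example: filterLinksByNames(links, ["team members"]) keeps only the
// "Team Members" entry; an empty `names` array yields an empty result.
```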

Practical Application

This command bar improves the user experience by allowing natural language navigation within a web application. For example, if a user wants to reach a particular part or feature of a complex SaaS app, searching for it through a quick command bar is often easier than clicking through menus.

Access the Code

For a detailed implementation and additional features, the complete code is available in our GitHub repository. This repository contains all the necessary components to integrate Gemini Nano into your own web applications.
