The pragmatic AI developer: Balancing excitement about the future with practical considerations today
The problem with being on the “inflated expectations” part of the hype cycle is that all the chatter about a utopian (or dystopian) AI-fueled future makes short-term planning much harder. “I don’t know what AI is going to become,” one CEO told me, “but I want to get real value out of it today. The problem is, I don’t know how to do that.”
To that question, I have three suggestions:
1/ Identify the right problems
Getting practical about AI means being able to differentiate between the sorts of problems AI might one day solve and the ones it can successfully tackle today. I’ve noticed that when you get outside of Silicon Valley, there is far less talk about autonomous agents and the road to AGI. Instead, leaders have more prosaic concerns: What should we start experimenting with today? What are the most valuable use cases for AI in my company?
In an earlier column I mentioned my rule of thumb for “what makes a good AI application,” and I think it still holds today: given the current unpredictability of generative AI models, a worthwhile AI application is one where mistakes are low-risk OR easily detected.
Let me give you some examples. I recently met with product leaders from two companies: Notion, the enterprise collaboration platform, and Zillow, the real estate marketplace. Both companies are investing in AI tools. Notion has an AI assistant that can summarize meeting notes, create first drafts of emails, and so on. Zillow is using AI to make search better; it wants to let users craft custom natural-language queries to find houses for sale, rather than relying on traditional filters for location, price, and the like.
These two use cases illustrate the two halves of that rule of thumb. In the Notion example, mistakes are easily caught: presumably you’ll read over the email or the summary Notion AI creates and verify its accuracy. The AI is acting like a junior employee, giving its best shot but overseen by someone more senior (or, more accurately, by someone who is an actual human). For Zillow, mistakes are relatively low-risk. A 20% failure rate for an AI trained to detect cancer cells would be unacceptable, but if 80% of the results from a Zillow search are perfect matches, that would be unprecedented (when was the last time a Google search returned results where 80% were helpful to you?).
Identifying areas where AI will actually be useful and relatively low-risk is at least as important as determining what problems it is theoretically capable of solving.
2/ Get your teams experimenting
Recently I spoke with a PM leader who had created two distinct engineering teams within her organization to hack together AI prototypes. The goal of the Blue Team, as she called them, was to think about how current products under development could be augmented with AI. Could the same roadmap goals be accomplished more easily, or yield better results? How would an AI approach differ from a more traditional one?
The second group, the Red Team, was told to ignore all existing products and imagine they were starting from scratch. What would they build with no constraints?
Hackathons like these are fun, but the secret to judging the output is to embrace the creativity while remaining ruthlessly practical. What value is being created? How would actual customers respond? What risks exist within a non-deterministic user experience (and ALL genAI products are non-deterministic: you never know what you’re going to get)? What are the ongoing costs associated with it? If you’re using a third-party API like OpenAI’s, inference costs (that is, the cost of running the model each time it’s queried) can quickly get pricey.
A visionary leader who can also get products out the door is someone who can simultaneously embody both the Red Team and Blue Team ethos.
3/ Experiment yourself
There is nothing that will get you more comfortable with AI products, or give you more credibility, than taking the time to mess around with something like the OpenAI APIs and a few associated products.
If you have basic Python programming knowledge (or really just basic programming knowledge at all), use Replit to create a simple web interface for a custom chatbot. You can use ChatGPT or Replit’s integrated AI-powered coding helper to get you started.
Create an OpenAI account if you don’t already have one, and generate your personal API key. Don’t worry: OpenAI has usage-based pricing, so running experiments will only cost a few dollars at most, unless you unwisely send what you build to all of your friends and it blows up in popularity. Read their quickstart tutorial on how to integrate a ChatGPT-powered experience into the site you built.
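To give you a flavor of how little code this takes, here’s a minimal sketch of a chatbot backend using the official openai Python package and Flask. Treat it as a starting point, not a finished product: the model name and route are placeholder choices, and the API key is read from an environment variable.

```python
# pip install flask openai
import os

from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])


@app.route("/chat", methods=["POST"])
def chat():
    # Expects JSON like {"question": "..."} from your web interface.
    question = request.json["question"]
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any current chat model works
        messages=[{"role": "user", "content": question}],
    )
    return {"answer": response.choices[0].message.content}


if __name__ == "__main__":
    app.run(debug=True)
```

Point the web interface you built on Replit at the /chat route and you have a working chatbot.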
Next, give your chatbot a personality by creating a “meta prompt,” a set of instructions that dictates how it should behave. I recently made a chatbot for some friends from home that would, no matter what, mention in every response that our friend Jeff just bought a boat. I’d ask it something like “give me a quick overview of the history of India” and it might start its response by saying, “Sure, learning about history is something Jeff loves to do on his boat,” before launching into its actual answer. This was completely useless, but pretty funny. Meta prompts that are actually useful can set the tone of voice, specify topics to avoid, and so on.
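In API terms, a meta prompt is typically passed as a “system” message that rides along with every request. Here’s roughly what the Jeff’s-boat bot looks like in code; the prompt wording and model name are illustrative.

```python
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# The "meta prompt" is a system message prepended to every request.
META_PROMPT = (
    "You are a cheerful assistant for a group of old friends. No matter "
    "what you are asked, mention somewhere in your response that Jeff "
    "just bought a boat."
)


def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works
        messages=[
            {"role": "system", "content": META_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


print(ask("Give me a quick overview of the history of India."))
```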
A more advanced step is learning how to ground your chatbot in custom or proprietary data. The most popular way to do this is through a technique called Retrieval Augmented Generation (RAG), which I detailed in a previous post. The essence of RAG is creating your own private vector database containing only the data you want the chatbot to use when crafting its answers. This database is queried for relevant context before a request is sent off to the API. For instance, I created a vector database with lots of information about Yosemite hiking trails that had far more specificity on factors like mileage and trail details than exists within the massive brain that powers ChatGPT. Then, when a user asked my Yosemite Hiking Bot a question, it would secretly append to the query any information from the vector database that it deemed relevant before sending it to OpenAI.
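Here’s a stripped-down sketch of that flow. I’m using OpenAI embeddings with a plain in-memory list standing in for the vector database (a real project would use a proper vector store), and the trail facts are placeholders for illustration.

```python
# pip install openai numpy
import os

import numpy as np
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# A toy "vector database": a handful of trail notes and their embeddings.
DOCS = [
    "The Mist Trail to Vernal Fall is a steep hike with heavy spray in spring.",
    "Half Dome requires a permit for the cables section.",
    "The Four Mile Trail climbs from the valley floor to Glacier Point.",
]


def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)


DOC_VECTORS = [embed(doc) for doc in DOCS]


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k stored documents most similar to the query (cosine similarity)."""
    q = embed(query)
    sims = [float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v)) for v in DOC_VECTORS]
    ranked = sorted(range(len(DOCS)), key=lambda i: sims[i], reverse=True)
    return [DOCS[i] for i in ranked[:k]]


def ask_with_rag(question: str) -> str:
    # Secretly append the retrieved context to the user's question
    # before sending it off to the API.
    context = "\n".join(retrieve(question))
    prompt = f"Answer using this context where relevant:\n{context}\n\nQuestion: {question}"
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(ask_with_rag("Do I need a permit to hike Half Dome?"))
```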
It’s also good to get familiar with the popular tools used as wrappers around LLM APIs, which can help you do interesting things. One name worth dropping is LangChain, an open-source library/framework that makes building products with LLMs a lot easier. Check out this excellent video tutorial from DeepLearning.ai (the gold standard in online AI tutorials, in my opinion) that gives step-by-step instructions on how to use LangChain to set up a chatbot with your proprietary data (as with my Yosemite example). The video will give you a flavor of the mechanics of retrieving the appropriate, relevant data from your vector database.
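If you just want to see the shape of the code before watching the video, here’s a sketch of the same retrieval idea in LangChain. Fair warning: LangChain’s APIs change quickly between releases, so treat this as illustrative. These imports match the 0.1-era package layout, and the faiss-cpu package provides the local vector index.

```python
# pip install langchain langchain-openai langchain-community faiss-cpu
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Your proprietary data; a real project would load and chunk documents.
docs = [
    "The Mist Trail to Vernal Fall is a steep hike with heavy spray in spring.",
    "Half Dome requires a permit for the cables section.",
]

# Embed the documents into a local FAISS index: your private vector database.
vectorstore = FAISS.from_texts(docs, OpenAIEmbeddings())

# Wire the retriever and the LLM together into a question-answering chain.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini"),
    retriever=vectorstore.as_retriever(),
)

print(qa.invoke({"query": "Do I need a permit for Half Dome?"}))
```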
The better you understand these under-the-hood decisions, the better you’ll be at making the big choices.