Among the approaches for guiding the behavior of LLMs in applications, prompt engineering, fine-tuning, and LLM chaining garner the lion’s share of attention, and for good reason – they don’t require extremely deep technical expertise, and they support fast iteration cycles.
However, they don’t encompass the full scope of techniques that can be, or will be, brought to bear in the creation of LLM applications in the coming years. In this post, we cover three more tools, ranging from techniques that are de rigueur for complex LLM applications to speculative ones that may not be production-ready for some time yet.
Read more
When creating LLM applications, people correctly place a lot of emphasis on the foundation model – the model underpinning an LLM app sets a cap on the reasoning ability of the system, and because LLM calls tend to dominate the per-interaction costs of serving an LLM application, the choice of foundation model sets the baseline for the marginal cost and latency of the whole system.
However, unless you’re trying to make a mirror of the ChatGPT or Claude website, you’ll want to modify the behavior of that underlying model in some way: you’ll want it to provide certain types of information, refrain from touching certain topics, and respond in a certain style and format. In this article and the next, we’ll discuss techniques for achieving that behavior modification, ranging from the well-trodden to the exploratory.
Read more
Picture this: two users, same exact need – to get advice on a health issue. User 1 opens up a text interface. Types in their symptoms, medical history, the works. Maybe they're a little embarrassed, but hey, no one's watching. They take their time, make sure they don't leave anything out. The AI comes back with a detailed response. User 1 reads it once, twice, a few times. Lets it sink in. They highlight the key points, the action items. They feel informed, empowered. They've got a plan.
Now User 2, they go for voice. They start explaining their symptoms, and the AI jumps in with clarifying questions. It's a back-and-forth, a real conversation. User 2 feels heard, understood. The AI shares its advice. User 2 listens intently. It's like the AI is right there in the room with them, guiding them. The inflection, the pauses, it all lands differently. User 2 feels cared for, supported.
Same need, two very different experiences. All because of the interface.
Read more
The general nature of LLMs makes them inherently powerful but notoriously difficult to control. When building an LLM-based product or interface that is exposed to users, a key challenge is limiting the scope of interaction to your business domain and intended use case. This remains an “unsolved” problem in practice, mostly because modern LLMs are still prone to disregarding instructions and hallucinating (i.e., producing factually inaccurate output). As a consequence, operators must defend against unintended and potentially risky interactions. That can be difficult, because the ecosystem and tools for this problem are relatively nascent. Few (if any) commercial or open-source software packages offer out-of-the-box solutions that are accurate, simple, and affordable. We know, because our team has investigated many of these solutions, including AWS Bedrock Guardrails, NVIDIA NeMo Guardrails, and others.
Read more
Since the release of ChatGPT in late 2022, AI has received large and increasing amounts of attention and investment. We believe this is entirely warranted – AI in various forms is poised to change the way that businesses work. But because the ChatGPT release was the catalyst for this wave of attention, people tend to equate AI with large language models (LLMs), and LLMs with chatbots.
We love chatbots – ChatGPT and others in its class are amazing tools – but, as an AI consultancy with a long history of projects in the space before the current mania, we’re sensitive to the conflation of LLMs and chatbots. Many of the most exciting potential uses for LLMs have little to do with the chatbot interface, and we think those should get more attention.
Read more