Shlok Khemani

A Guide to LLM Capabilities

Today marks a year since ChatGPT launched. It’s been a wild one. The fastest product to reach 100 million users. The first glimpse of a truly intelligent computer. The overnight onset of a new technology wave. Euphoria around human progress. Fear around human redundancy. At the center of it all is one of the greatest innovations in history.

There is near-universal acceptance that ChatGPT and other large language models are going to change the fabric of society and the nature of work. Yet because these tools are so versatile, emulating so many abilities we once considered exclusively human, it is difficult to envision what those changes are going to be.

That is what this post is about. Based on a year of almost daily conversations with ChatGPT and a broad exploration of other LLM-based products, I’ve identified and categorised its key capabilities and their potential impact on life, work, and society. Some of these would be obvious to regular users, others not so much.

Think of this as a normie’s guide to an AI-first future.

But before we dive into all of this, we need to understand how LLMs work. Let’s start from there.

(I’ve used 'ChatGPT' and 'LLMs' interchangeably throughout this piece. For most people, ChatGPT is the only LLM they know. Think of this as using 'Google' as a general term for all search engines.)

How do LLMs work?

The most surprising thing about ChatGPT is that it works at all. - Ilya Sutskever, Co-Founder of OpenAI

What follows is an extremely simplified view of how LLMs work. I’m by no means an authority on this, nor do I fully grasp all the intricacies of the technology. You can find more thorough explanations in this excellent article by Jon Stokes, this delightful video by Andrej Karpathy, or an extremely deep breakdown in this brilliant book by Stephen Wolfram.

LLMs, like ChatGPT, use a technique known as next-token prediction. In simple terms, given a series of tokens (words or pieces of words), the model predicts the most likely next token. This process repeats, one token at a time, to form a complete response.

For example, when asked, "What direction does the sun rise from?", ChatGPT starts by predicting the word 'the', followed by each subsequent word in the phrase 'sun rises from the east'.

If you've ever noticed ChatGPT generate responses one word at a time, this is the reason.
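
To make the loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the probability table is made up by hand, the question itself is left out of the context, and the function always picks the single most likely word. A real LLM computes these probabilities with a neural network over the whole conversation and often samples among likely words rather than always taking the top one.

```python
# Toy next-word table: maps the text generated so far to candidate next words.
# The probabilities are invented for illustration, not taken from any model.
NEXT_WORD_PROBS = {
    "": {"the": 0.9, "a": 0.1},
    "the": {"sun": 0.8, "moon": 0.2},
    "the sun": {"rises": 0.7, "sets": 0.3},
    "the sun rises": {"from": 0.6, "in": 0.4},
    "the sun rises from": {"the": 1.0},
    "the sun rises from the": {"east": 0.95, "west": 0.04, "Amsterdam": 0.01},
    "the sun rises from the east": {"<end>": 1.0},
}


def generate(max_words=10):
    """Build a response one word at a time, always taking the likeliest word."""
    words = []
    for _ in range(max_words):
        context = " ".join(words)
        candidates = NEXT_WORD_PROBS.get(context)
        if candidates is None:
            break  # the toy table has nothing to say about this context
        next_word = max(candidates, key=candidates.get)
        if next_word == "<end>":
            break  # the "model" decided the response is complete
        words.append(next_word)
    return " ".join(words)


print(generate())  # the sun rises from the east
```

Each pass through the loop adds exactly one word, which is why the response appears on screen piece by piece.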

The next question is: where do these predictions come from? How does ChatGPT determine that after “the sun rises from”, it should follow with “the east” rather than “the west”, “the north”, or “Amsterdam”? In other words, how does it know that “the east” is more statistically probable than other options?

The training of ChatGPT is extensive, involving a vast array of data sources from the internet, including websites, blogs, and books. For scale, ChatGPT's training data is estimated to be 15 times larger than all of Wikipedia!

Thus, the frequency of phrases in this training data dictates ChatGPT’s output. For example, 'The sun rises in the east' appears more often than 'the sun rises in the west'. While the latter might be used in contexts like literary metaphors ('as absurd as believing the sun rises in the west') or discussions about other planets, 'east' is more prevalent for the question we asked.

A simple way to understand this is by comparing the number of Wikipedia pages containing these phrases. 'The sun rises in the east' yields 55 pages, whereas 'the sun rises in the west' returns 27 pages. 'The sun rises in Amsterdam' shows no results! These are the patterns ChatGPT picks up.
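
The same counting idea can be sketched in a few lines of Python. This is a deliberately crude illustration, not how ChatGPT is actually trained: the four-sentence corpus is made up, and the hypothetical `next_word_probabilities` helper simply counts which word follows a given phrase. Real models learn far richer patterns than raw phrase counts, but the intuition that more frequent continuations become more probable carries over.

```python
from collections import Counter

# A tiny, made-up stand-in for the training data.
CORPUS = [
    "the sun rises in the east",
    "every morning the sun rises in the east",
    "the sun rises in the east and sets in the west",
    "as absurd as believing the sun rises in the west",
]


def next_word_probabilities(prefix, corpus):
    """Count which word follows `prefix` in the corpus and turn counts into probabilities."""
    prefix_words = prefix.split()
    n = len(prefix_words)
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        # Slide a window over the sentence, looking for the prefix.
        for i in range(len(words) - n):
            if words[i:i + n] == prefix_words:
                counts[words[i + n]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}  # the phrase never appears, like 'Amsterdam' above
    return {word: count / total for word, count in counts.items()}


probs = next_word_probabilities("the sun rises in the", CORPUS)
print(probs)  # {'east': 0.75, 'west': 0.25}
```

In this toy corpus 'east' follows the phrase three times and 'west' once, so 'east' wins, for the same reason it wins in the Wikipedia counts.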

While I've oversimplified some complex processes, this gives a basic idea of how LLMs function. Initially, it feels like magic. Once you peek under the hood, it starts making sense.

Now, onto what LLMs enable.

1: Natural Language Capabilities

It's not too hard to get into sci-fi mode when you realise that we are talking to computers and they understand us. - Ilya Sutskever, Co-Founder of OpenAI

Every interaction we have with computers, be it a complex calculation, a Google search, an Instagram post, or an airline ticket reservation, is designed by programmers. They anticipate our needs and instruct computers precisely to meet them. Traditionally, computers follow these coded instructions to the letter, without deviation. The only ways to interact with computers were to either use these pre-designed interfaces or to learn programming and build them yourself.

With the advent of ChatGPT, this paradigm changed. For the first time, users can communicate with a machine in natural human language — English, Spanish, German, Chinese, Hindi, among others. The machine understands them. And replies to them.

Given our understanding of how ChatGPT works, this ability makes sense. The sheer volume of text ChatGPT is trained on means that it has internally learned the structure of human language. It can hold multi-turn conversations in multiple languages, with logical coherence, and in perfect grammar.

This natural language capability makes LLMs suited to a variety of tasks.

Text Generation

The first thing you realise when using ChatGPT is that it's an amazing writer. It effortlessly crafts essays, letters, emails, and more. Given a single prompt, topic, or even a half-formed idea, it can generate a body of coherent text that often seems indistinguishable from that written by humans.

From supercharging the productivity of knowledge workers to aiding non-native speakers in articulating their ideas more effectively to assisting with homework assignments, this is perhaps the most widespread use case for ChatGPT.

Translation

ChatGPT is perhaps the best language translation tool we have. Its proficiency stems from training on millions of books and websites in multiple languages.

This means that apart from complementing (replacing?) Google Translate, it is also a great tool for those learning a new language. It can offer context-rich translations and explanations, enhancing the learning process.

Summarisation

One of the most underrated capabilities of LLMs is their ability to summarize large bodies of text to varying lengths.

For instance, you can take an article and condense it into a couple of paragraphs to grasp the main ideas, reduce it to a single line to create a headline, or distill it into a single word for categorization or tagging purposes.

ChatGPT is also great at finding pieces of data across disparate sources (say, a company database) and bringing them into one place. It can power through swaths of unstructured data to surface information or insight, making it an all-powerful knowledge extraction tool. Notion’s recent AI release is a great demonstration of this.

Transformation

Just as we use image editing tools to adjust features like brightness, contrast, sharpness, and color, and to apply presets (like portraits, black and white, vintage) and filters (similar to those on Instagram), ChatGPT functions similarly with text.

At a basic level, it allows us to tinker with aspects of writing such as complexity, tone, syntax, and structure. But its capabilities extend beyond these adjustments. ChatGPT too can transform text based on certain 'presets' – for example, it can emulate the style of another author or a fictional character.

A practical application is taking a technical research paper and creating an 'Explain Like I'm Five' (ELI5) version of it. Or, consider writing a birthday message for a loved one and transforming it into a poem or even a rap song!

In essence, ChatGPT can be seen as the Swiss Army knife of language (or a Word Synthesizer), allowing us to tinker and play with words like we’ve never been able to before.

Conversation

As highly social beings, humans are currently facing an epidemic of loneliness and isolation. The need for meaningful conversations and a sense of being heard is more pronounced than ever.

In what seems like a real-life echo of the movie "Her," AI is emerging as a potential companion for humans to engage in conversation. From casual debates and sex talks to tutoring and therapy, AI provides an outlet, a conversational partner that is infinitely patient and always available.

Whether this solves or exacerbates the loneliness epidemic is yet to be seen (and depends on our design of the tools) but large-scale societal impact is inevitable.

2: Code Generation

A billion people will code soon. Not in a programming language like Java but in natural language. - Vinod Khosla

The corpus of data on which Large Language Models (LLMs) are trained includes not only text but also a lot of code. This comes from open-source code repositories, technical documentation, developer forums like Stack Overflow, and coding platforms such as LeetCode. These sources are often accompanied by natural language explanations.

As a result, LLMs excel at both generating code and converting it into natural language (and vice versa) with impressive fluency.

This capability too has some wild implications for everyone - from professional coders, to aspiring entrepreneurs, to the everyday tech consumer.

Supercharging Developers

If software forms the infrastructure of modern society, then developers are its architects and builders.

LLMs supercharge their abilities. Tools like GitHub Copilot, which utilizes AI to assist in coding, have dramatically reduced the need for developers to type out every line of code. Microsoft estimates that about 40% of the code written by Copilot users is AI-generated. Beyond predictive coding, tools such as ChatGPT and newer innovations like Phind are instrumental in error correction and architecture design.

These advancements translate to a remarkable increase in productivity. Developers can now achieve faster development cycles, lower their costs, and ship more software in less time.

The trend of this century has been software eating the world, and the acceleration is not stopping anytime soon.

Learning Code

Learning code is a lot like learning a new language. It requires mental rewiring that’s uncomfortable. ChatGPT makes it easier.

The status quo is a constant struggle to find the best books, online courses, and tutorials. Very often, these don’t match the skill level of the learner. They can be too easy or too difficult. ChatGPT changes the game. You can generate customised learning experiences to match your exact needs, adjusting the difficulty on the go, as needed.

Beyond tutorials, ChatGPT is also great at providing clear explanations of syntax and concepts, helping you understand not just individual lines of code, but how they work together in a program.

Everyone becomes a coder

Because LLMs are so great at converting natural language to code, very soon, everyone, irrespective of their coding ability, will be able to make software. We’re seeing early glimpses of this and it honestly isn’t anything short of sci-fi.

Some broad implications of this leap:

  • Everyone can create customised apps on their phones, suited to their own needs.
  • Commercially, knowing how to code will not be a roadblock for non-technical folks wanting to build their own products.
  • The role of the professional developer will evolve to work exclusively on difficult problems, further pushing the boundaries of what we can achieve with computers and software.

Migrations

A significant number of essential sectors in society (banking and financial institutions, airlines and air traffic control, government agencies, and hospitals) rely on archaic software written in languages seldom used by newer developers. Because these institutions prioritise stability and security, this code often remains unchanged for years. However, no system can last indefinitely. As the need for upgrades and maintenance of these systems becomes pressing, there's a scarcity of programmers equipped to handle these tasks.

LLMs are playing a supportive role in this process. They can assist in translating old code, suggesting modern alternatives, and helping programmers understand legacy systems.

Customised Software

I previously mentioned how every interface we interact with is carefully designed by programmers who anticipate our needs. However, these interfaces, unless updated, remain static. For example, the user experience when placing an order on Amazon is the same whether you're buying an iPhone or a rain jacket.

With the integration of Large Language Models (LLMs), there's potential for Amazon and similar platforms to create dynamic UI experiences. Imagine searching for an iPhone and having the UI adapt on the fly to mirror the functionality seen on Apple's website. Or, when shopping for clothes, the UI could evolve to intuitively understand and reflect your specific preferences (like material, colors, and styles) instead of relying on manual filtering.

This represents a shift towards software becoming significantly more dynamic and personalized. Since these experiences would vary from user to user, they could feel more natural and intuitive. The potential for a deeper human-machine symbiosis through more adaptable and responsive interfaces is clear. The future of software design is exciting.

An early example of this is the UI of the awesome AI search app Perplexity.

3: Knowledge Base

It bears repeating: ChatGPT's training encompasses a vast array of information, including a large portion of the public internet, millions of books, and much more.

This extensive training effectively makes it a condensed repository of recorded human knowledge. Whether it's survival skills in the wild, detailed accounts of World War II, the philosophies of Socrates and Plato, or technical guidance like setting up a D-Link router, ChatGPT holds a wealth of information.

Put another way, it’s collective-human-knowledge-as-a-service, or, more crudely, a Wikipedia on steroids.

Learn Anything

The internet democratized access to knowledge, allowing anyone to learn about virtually anything. LLMs have taken this a step further, bringing this information to your fingertips.

While we've already discussed language and coding, the potential for learning with LLMs extends far beyond these areas. Whether it's quenching a thirst for historical knowledge, exploring scientific concepts, or indulging in artistic pursuits, LLMs cater to a wide spectrum of curiosities.

The challenge with learning on the internet often lies in sifting through numerous sources to find relevant and accurate information. ChatGPT streamlines this process by consolidating information into a single interface. This not only saves time but also makes the learning experience more efficient and focused, offering precisely the information you seek.

Mentorship

AI technology now allows us to simulate mentorship from many of the greatest figures in history, whether they are alive, deceased, real, or fictional.

  • Imagine receiving product feedback inspired by Steve Jobs.
  • Marcus Aurelius' philosophy could be adapted to help navigate life crises.
  • Marketing professionals might get copy feedback in the style of Don Draper.
  • Chris Voss' negotiation strategies could be tailored to help prepare for a job interview.

This aspect of AI can be thought of as the encoding of human expertise. It enables the extraction of world-class advice from various fields, tailored to specific personal or professional situations.

Research

Gathering accurate data for research, whether academic or commercial, is a daunting task. It typically involves sifting through a multitude of sources — journals, blogs, interviews — to extract relevant and reliable information.

ChatGPT significantly reduces the time and effort involved in this process:

  • For academic researchers, ChatGPT streamlines the discovery of specific examples or papers and simplifies understanding new concepts, methodologies, or historical contexts. It can act as an initial reference tool, pointing researchers toward relevant literature or summarizing complex topics.
  • In market research, ChatGPT offers quick insights into market trends, competitive landscapes, historical precedents in business ventures, and compliance-related information. Sahil Lavingia shares some great insights into how he used ChatGPT while researching buying a building in NYC.

Healthcare

Quality healthcare is often both financially prohibitive and geographically inaccessible for a significant portion of the global population. Top-tier physicians face constraints in terms of the number of patients they can attend to and are further impeded by time-consuming tasks such as reviewing reports and generating prescriptions.

The fusion of an LLM's extensive general knowledge with a top doctor's specific expertise has given rise to chatbots and other tools that offer superior, cost-effective, and highly scalable primary healthcare solutions on a global scale.

We are already seeing the advantages of AI play out in the real world. Hospitals in Taiwan that have embraced AI have seen patient mortality drop by 25% and antibiotic costs by 30%. AI is literally saving lives.

Legal

Much like healthcare, access to quality legal advice is often limited and costly for a significant portion of the population. LLMs address this issue by:

  • Simplifying Complex Jargon: deciphering intricate legal jargon, making it more accessible and understandable to individuals who are not legal experts.
  • Compliance Guidance: providing insights into compliance rules and regulations, helping individuals and businesses ensure they adhere to legal requirements.
  • Quick Legal Advice: swift responses to basic legal queries, helping individuals determine whether a particular action is legal or not.
  • Interaction with Legal Bots: Professionally trained legal bots can provide specific legal advice and guidance, making the knowledge of experts more accessible.

4: Reasoning and Planning

Earlier, we built a simplistic understanding of how ChatGPT works. But in the process of accurately predicting the next word (given a body of text), something weird and amazing begins to unfold. ChatGPT starts building an internal model of the world. Hidden in the tapestry of language lies an understanding of how the world actually functions.

We are still very early in the journey of language models and a computer completely understanding the world is yet to happen. However, we can already see glimpses of this future and its consequences for humanity.

Agents

Computers have traditionally excelled at taking a set of inputs and instructions and following them accurately to produce an output. These instructions are what one can also call “algorithms”. To date, humans had to furnish these algorithms, paving a path for the computer to perform tasks.

With LLMs, that changes. Given a high-level objective, computers can now create a plan to fulfill the outcome. Moreover, if they realise that the initially chosen path is not leading to fulfillment, they can also self-correct along the way, iterating till they reach the desired result.
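
A rough way to picture this plan, act, check, and retry cycle is the Python sketch below. All of it is hypothetical: `run_agent`, `propose_step`, `execute`, and `is_done` are names I made up, the airline prices are fake, and the planner is a plain function standing in for what would be an LLM call in a real agent. It only illustrates the shape of the loop, not how any actual agent product works.

```python
# Made-up prices from made-up sources, standing in for real tools or APIs.
FAKE_PRICES = {"airline_a": 620, "airline_b": 540, "airline_c": 480}


def propose_step(objective, history):
    """Stand-in planner: pick the next source that has not been tried yet."""
    tried = {step for step, _ in history}
    for source in FAKE_PRICES:
        if source not in tried:
            return source
    return None  # nothing left to try


def execute(step):
    """Stand-in tool call: look up the price for the chosen source."""
    return FAKE_PRICES.get(step)


def is_done(objective, result):
    """The objective is met once a price fits the budget."""
    return result is not None and result <= objective["budget"]


def run_agent(objective, max_steps=5):
    """Plan, act, check, and self-correct until done or out of options."""
    history = []  # (step, result) pairs that inform the next plan
    for _ in range(max_steps):
        step = propose_step(objective, history)
        if step is None:
            break
        result = execute(step)
        history.append((step, result))
        if is_done(objective, result):
            return result, history
    return None, history


result, history = run_agent({"budget": 500})
print(result)  # 480
```

The first two attempts miss the budget, so the loop uses what it has already tried to choose a different path, which is the self-correction described above in miniature.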

This emerging capability means we’re on track to build autonomous agents. We will soon be able to give our AIs objectives - finding the cheapest return flights to Bali, getting an appointment with the best oncologist in India, arranging a weekend corporate beach retreat - and they’ll complete these multi-step tasks for us (with or without us in the loop).

As this capability concretizes, it will lead to the automation of large swaths of knowledge work, causing temporary upheaval in society (as any new technology does). But automating routine tasks affords us more time to pursue endeavors that align with our desires and aspirations, fostering a future where humans can focus on what truly matters to them.

New Science

The increasing depth of understanding within the world model paves the way for unraveling the mysteries of the universe — essentially, scientific exploration. Computers have already played a pivotal role in enhancing the capabilities of scientists and technologists, enabling them to model and comprehend the complexities of our world.

AI will supercharge this. Iteration and experimentation will be quicker. It will start identifying patterns we were previously oblivious to. On the flip side, it will also help quickly discard bad ideas and explanations.

AI will act as a true companion for innovators, picking up all kinds of tasks - from the mundane to the most important ones. All of this means that we will collectively push the boundaries of human knowledge.

What’s next

ChatGPT and other LLMs have already changed the world in ways we do not yet fully realize. It is a tool that has significant short-, medium-, and long-term consequences. I’ve mostly focussed on the benefits of the technology here, sometimes painting a rosy picture. But, like any novel and powerful technology, there are very real downsides and safety concerns to address. Fortunately, AI itself holds the key to mitigating many of those concerns.

On a personal level, I undertook this analysis to help me understand the space better as I plan the next step in my career. You can use it to understand how AI will impact your job, how it can help make work and life tasks easier, or even as an investment framework while evaluating companies. Some of the best AI products will lie at the intersection of these capabilities.

I will try to keep it updated as the technology grows and I come across new unique use cases.