By now, most people know that OpenAI's ChatGPT is built on top of general-purpose large language models (LLMs) such as GPT-3.5 and GPT-4. The ChatGPT interface, the most visible example of the new class of technology known as Generative AI, has captured the imagination of an entire industry, including entrepreneurs, investors, journalists, and customers. Have we entered a new era of application building, where Generative AI is the "new electricity" that we will build everything on top of? Will it be the Operating System of the future?
The ChatGPT interface to these LLMs will continue to evolve, with integrated image-generation capabilities (DALL-E 2) and larger, more capable underlying models. However, its introduction has clearly spawned a wave of competing technology. Anthropic, founded by former OpenAI employees, has a general-purpose LLM named Claude. Meta/Facebook has one called Llama 2 (open-sourced). Google has PaLM 2 powering many of its AI products going forward, including Bard, its conversational AI product. Even Apple is expected to unveil its own LLM, targeted for early 2024. Other LLMs emerging from the woodwork include Mistral, Falcon 180B, and Giraffe, to name a few.
These kinds of LLMs are very expensive to build because they require ingesting billions of sentences from across the Web, drawn from text-rich sources such as Wikipedia, social media platforms, online books, and other online resources. Each fragment of data retrieved from these sources must be analyzed, weighted, and scored repeatedly to build the neural networks that form the core of these LLMs. The costs and resources required to perform these computationally intensive activities put the capability out of reach for all but well-heeled tech firms. Of course, that is likely to change soon enough.
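To make that training process a little more concrete, here is a deliberately tiny sketch of the next-token-prediction step at its core, written in Python with PyTorch. The model below is a toy bigram-style predictor with made-up dimensions, not any production architecture; real LLMs repeat this kind of step over trillions of tokens on thousands of GPUs, which is where the cost comes from.

```python
# Toy next-token-prediction training step (illustrative only).
# Real LLMs use deep transformer stacks and vastly more data and parameters.
import torch
import torch.nn as nn

vocab_size, embed_dim, context_len = 1000, 64, 32   # toy sizes, not real-model values

# A deliberately tiny stand-in for an LLM: embed each token, then predict the next one.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# Fake "web text" batch: shift each sequence of token IDs by one position
# so the model learns to predict the next token from the current one.
tokens = torch.randint(0, vocab_size, (8, context_len + 1))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = model(inputs)                                   # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()                                          # compute gradients
optimizer.step()                                         # adjust the weights
optimizer.zero_grad()
print(f"one training step done, loss = {loss.item():.3f}")
```

A frontier-scale model performs billions of steps like this one, each touching billions of parameters, which is why training remains the province of a handful of well-funded labs.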
In addition to these general-purpose LLMs, we are also seeing more specialized models, such as Nvidia's Prismer, a vision-language model, and Alpaca, Stanford's reduced-size model fine-tuned from Meta's general-purpose LLaMA. Many other reduced-parameter, smaller-footprint LLMs are also now emerging. Presumably we will soon see many more specialized LLMs that are smaller in scope and therefore cheaper to build. Will there be a "long tail" of LLMs that we can pick and choose from based on a given use case? The sketch below gives a feel for what that might look like in practice.
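As a rough illustration of "picking and choosing" a smaller model, here is a sketch that loads a compact, openly available checkpoint with the Hugging Face transformers library. The model name "distilgpt2" is simply one example of a small general-purpose checkpoint, not a recommendation for any particular domain; a domain-specific checkpoint could be swapped in the same way.

```python
# Sketch: swapping in a small, openly available model via Hugging Face transformers.
# "distilgpt2" is just an example of a compact checkpoint; any other model ID
# on the Hub could be substituted based on the use case.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "distilgpt2"                        # example small checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "A specialized model for medical coding should"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because small models like this run on a single commodity GPU or even a laptop CPU, a long tail of niche, per-domain LLMs becomes economically plausible in a way that frontier-scale models are not.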
So what's the over/under on how many LLMs might dot the landscape by, say, the end of 2024? Will some of these models already have been retired? Will there be LLM marketplaces? How about LLM construction sets for building custom, small-footprint LLMs that perform well in specific domains such as medicine, education, or programming? Or will some evolved technology such as Artificial General Intelligence (AGI) emerge and render all of these current LLMs obsolete?
It will certainly be exciting to watch it all unfold.
If you would like to put these capabilities to use for better data quality, sign up for free usage and an API key now.