In early April, we launched the Generative AI Open Source (GenOS) Index to track the top open source projects related to Generative AI and LLMs. Since then, we have published two more editions of the GenOS Index - the April edition that showed frenetic activity in the Generative AI space and the May edition that showed some early signals of stability. This month, we are back with the June edition of the GenOS Index, showcasing the fastest growing open source projects in Generative AI between the beginning of March and end of June, and presenting 5 trends that we are seeing among the projects that made it to the GenOS Index.
First, a quick refresher on the methodology. Every month, we identify the top 30 open source projects in Generative AI, ranked by GitHub star growth (stars added) in the preceding 90 days, with a minimum of 500 stars added for a project to be considered. We also group the projects into three categories - Models, Infrastructure/Tooling and Applications - to provide visibility into how different parts of the Generative AI ecosystem are evolving.
The key takeaways from this month’s GenOS Index are as follows.
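For readers who prefer code to prose, the ranking rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling we actually use: the `Project` class, its field names, and the `build_index` and `group_by_category` helpers are all hypothetical, and the sample projects are made-up placeholders rather than real star counts.

```python
from dataclasses import dataclass

MIN_STAR_ADDS = 500  # minimum stars added in the window (from the methodology)
TOP_N = 30           # number of projects in the index (from the methodology)


@dataclass(frozen=True)
class Project:
    """Hypothetical record for one repository; field names are assumptions."""
    name: str
    category: str      # "Models", "Infrastructure/Tooling" or "Applications"
    stars_start: int   # star count at the start of the 90-day window
    stars_end: int     # star count at the end of the 90-day window

    @property
    def star_adds(self) -> int:
        # Star growth over the window.
        return self.stars_end - self.stars_start


def build_index(projects, min_adds=MIN_STAR_ADDS, top_n=TOP_N):
    """Drop projects below the threshold, then rank by stars added."""
    eligible = [p for p in projects if p.star_adds >= min_adds]
    ranked = sorted(eligible, key=lambda p: p.star_adds, reverse=True)
    return ranked[:top_n]


def group_by_category(index):
    """Map each category to the names of its ranked projects, in rank order."""
    groups = {"Models": [], "Infrastructure/Tooling": [], "Applications": []}
    for project in index:
        groups.setdefault(project.category, []).append(project.name)
    return groups


if __name__ == "__main__":
    # Placeholder data for illustration only; not real projects or counts.
    sample = [
        Project("alpha", "Models", 1000, 4000),
        Project("beta", "Applications", 200, 650),
        Project("gamma", "Infrastructure/Tooling", 5000, 5900),
    ]
    for rank, project in enumerate(build_index(sample), start=1):
        print(rank, project.name, project.category, project.star_adds)
```

Ranking on stars added rather than total stars is what lets a young, fast-growing project outrank a much larger but slower-moving one.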
In the three most recent editions of the GenOS Index, including the current one, Auto-GPT has remained at the very top of all open source projects in Generative AI. Additionally, multiple projects that help build autonomous agents, such as SuperAGI (a dev-first autonomous agent framework and a new entrant this time), AgentGPT (deploying AI agents in browsers) and BabyAGI, have featured in the GenOS Index, indicating strong user demand for projects that enable automation. While most of these agent frameworks started with hobbyist use cases, many are now being used to automate task-level enterprise use cases, opening up a large market with available budgets. Looking forward, we expect some of these agent frameworks to graduate from “planning” the automation of a task to actually calling the underlying services that accomplish the end objective - we are seeing some early signs of that already.
With enterprise use cases coming to the fore, there has been strong interest in projects that let users interact privately with language models, on proprietary data and in secure environments. No project captures this trend better than PrivateGPT, which broke into the GenOS Index at #7 in the May edition and holds the #5 spot this month. This month, two new projects in this category have broken into the top 30 - Quivr at #24 and LocalGPT at #26 - further reinforcing the user demand for data privacy and control.
To really put the power of LLMs into the hands of users, these models will need to run on devices (read: CPUs) and at the edge. Two projects in particular, GPT4All and llama.cpp, are targeting exactly that, and both have consistently shown up in previous editions of the GenOS Index as well as in the current one. We expect this trend to accelerate further as LLMs are deployed in real use cases, many of which will require “inference at the edge.”
The work of building the infrastructure and tools to train and run LLMs at scale shows no sign of slowing down. As in previous editions of the GenOS Index, over a third of the projects in the current edition are in the Infrastructure/Tooling category. LangChain continues to lead the way (unmoved at #4 and included in every edition of the GenOS Index since its launch in April), followed by three projects from Microsoft - JARVIS (at #11; a collaborative system where multiple AI models can be used to achieve a given task, with ChatGPT acting as the controller), Guidance (a new entrant at #18; a guidance language to control LLMs) and DeepSpeed (at #13; a deep learning optimization library that makes distributed training and inference fast and easy) - and finally Flowise (a new entrant at #21; a drag-and-drop UI to build custom LLM flows).
While LLMs originally started with text and then graduated to images, this month’s GenOS Index includes multiple new entrants that are tackling audio, the final frontier in making Generative AI models truly multimodal: Audiocraft (a PyTorch library for deep learning research on audio generation), AudioGPT (a dialogue assistant like ChatGPT, but with audio as input), and Bark (a text-to-audio model that can generate highly realistic, multilingual speech as well as other audio).
The top 30 projects in this month’s GenOS Index are as follows:
As we have done in the past, we highlight below a few other interesting projects that, while not on the GenOS Index this month, have gained significant traction and may well break into a future edition of the GenOS Index:
That is it for this edition of the GenOS Index. Now that the top open source projects in Generative AI have somewhat stabilized, we plan to move to a quarterly cadence starting with the next installment. So, stay tuned for the Q3-2023 GenOS Index, to be published in early October!