Meta has launched NotebookLlama, an open-source AI tool that generates podcasts from uploaded PDF files and competes directly with Google’s NotebookLM. The tool uses three different Llama models to convert text into an audio podcast in which two AI hosts hold a conversational dialogue.
The process begins with the Llama 3.2 1B instruct model pre-processing the PDF file and saving it as a text file. Next, the Llama 3.1 70B instruct model generates a podcast transcript from the source text. The Llama 3.1 8B instruct model then dramatizes this transcript, and a text-to-speech (TTS) workflow built on the Parler TTS model converts it into speech. Users can access all necessary models via Meta’s GitHub repository. According to Meta, users can opt for smaller models at each step, though this may affect the results. Running the recommended setup requires GPU hardware with approximately 140GB of aggregate memory.
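The 140GB figure lines up with the size of the largest model in the pipeline. The sketch below is a rough back-of-the-envelope check, not Meta's code: it assumes the weights are stored in 16-bit precision (2 bytes per parameter), which the article does not state, and it ignores activations and other runtime overhead. The names Stage, PIPELINE, and weight_memory_gb are hypothetical and exist only for illustration. The TTS step is left out because the article gives no parameter count for it.

```python
from dataclasses import dataclass

# Assumption (not stated in the article): weights are held in 16-bit
# precision, i.e. 2 bytes per parameter.
BYTES_PER_PARAM_16BIT = 2


@dataclass(frozen=True)
class Stage:
    """One text-processing step of the pipeline described above."""
    step: str
    model: str
    params_billions: int


# The three Llama stages named in the article, in order.
PIPELINE = [
    Stage("pre-process PDF into text", "Llama 3.2 1B instruct", 1),
    Stage("write podcast transcript", "Llama 3.1 70B instruct", 70),
    Stage("dramatize transcript", "Llama 3.1 8B instruct", 8),
]


def weight_memory_gb(params_billions, bytes_per_param=BYTES_PER_PARAM_16BIT):
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Billions of parameters times bytes per parameter gives billions of
    bytes, which is the figure in GB.
    """
    return params_billions * bytes_per_param


def largest_stage(pipeline):
    """Return the stage whose model has the most parameters."""
    return max(pipeline, key=lambda s: s.params_billions)


if __name__ == "__main__":
    for stage in PIPELINE:
        gb = weight_memory_gb(stage.params_billions)
        print(f"{stage.model}: about {gb} GB of weights ({stage.step})")
    print("Largest stage:", largest_stage(PIPELINE).model)
```

Under these assumptions the 70B transcript writer alone accounts for about 140GB, which is why swapping in a smaller model at that step reduces the hardware requirement the most.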
Early feedback on a NotebookLlama-generated podcast shared by an X user indicates that the audio quality falls short of Google’s NotebookLM. In some places audio segments were skipped, and in others the AI hosts talked over each other.
Meta has acknowledged some of these audio quality issues and plans to address them in upcoming versions of NotebookLlama. The company stated, “The TTS model is the limitation of how natural this will sound. This probably will be improved with a better pipeline and with the help of someone more knowledgeable.” Future plans include using two different large language models (LLMs) to write the script, so that the AI hosts debate each other and the conversation sounds more natural. Meta is also testing the Llama 405B AI model for transcript writing and aims to support more input and output formats.
Meta’s Commitment to Open Source
NotebookLlama is part of Meta’s broader push toward open-source AI tools. The company’s Llama models have reached 400 million downloads globally, making them a staple of the AI community. India is one of the top markets for Llama, and the upcoming Llama 4 model is expected to further establish Llama as a global standard in AI, a shift Mark Zuckerberg has described as AI’s “Linux moment.” Industry leaders such as Jensen Huang and Mukesh Ambani have also expressed confidence in Llama’s role in building foundational AI infrastructure, particularly in India.
Another open-source project developed in parallel is Open NotebookLM, which is built on the Llama 3.1 405B model together with Fireworks AI and Instructor, giving developers a further option for AI podcast generation.
(Photo by Chris on Unsplash)