A 4-Step Guide on How to Build an AI-based Text to Audio using LLMs with RAGs
How to build an AI-based Text-Speech News Summary using LLMs with RAGs
A beginner-friendly guide to using a RAG model and your entry point to the AI world
Table of Contents
Introduction: Why Should You Care?
Understanding the Rise of AI
Why Convert News Articles to Audio with RAG?
2. Who Is This Guide For?
Beginners in Data Science and AI
RAG and Its Simplicity
3. What is Retrieval-Augmented Generation (RAG)?
Breaking Down RAG: The Retriever and Generator
Why RAG is Relevant to AI and Data Science Enthusiasts
4. Why News? A Practical Application for RAG
My Experience with News Data
Leveraging RAG for News Summarization and Audio Conversion
5. The 4-Step Guide to Convert News to Audio
Step 0: Importing Libraries
Step 1: Fetching the News
Step 2: Extracting Key Information
Step 3: Converting Text to Audio
Step 4: Implementing Retrieval-Augmented Generation
6. The Benefits of Using RAG
Accuracy and Relevance in AI-Generated Content
Time-Saving Automation
Audio Accessibility for Learning on the Go
7. Getting Started with RAG for Your Projects
Setting Up Your Retriever and LLM
Automating Your Workflow
8. Conclusion: Why RAG is the Future of AI Content Generation
The Power of Combining Search and Intelligence
Resources for Further Learning
Introduction: Why Should You Care?
Understanding the Rise of AI
We live in the age of AI. Over the last 2 years, since the advent of Open AI, thanks to Sam Altman and his team, the world has moved at a rapid pace in the field of Artificial Intelligence. Amazon’s Alexa and Google Home Assistant devices had been there prior to that but they didn’t give us visibility into how they made a little circular or square box talk. OpenAI solved this problem by helping us understand how to create everything and anything. ChatGPT has become the new Google Search, Google Assistant, Alexa or Siri. With all of this super technology advancement available to us, why should you care to convert text news to audio using an advanced technique?
Why Convert News Articles to Audio with RAG?
Well, the idea is to teach you to Do It Yourself (#DIY). Why don’t you want to learn how to do something yourself and perhaps use your creative imagination to take it to the next level? You can think of other use cases with RAGs, and other problems to solve once you learn how to convert text to audio. With that said, let’s go into who this is for and how we convert Text news to audio using RAGs.
Who is it for?
Beginners in Data Science and AI
Whether you’re new to Data Science and AI or have been dabbling for a while, you have come to the right place. Join me as we dive into something super cool: Retrieval-Augmented Generation (RAG).
I know hearing terms like RAG, AI, or Large Language Models (LLMs) can sound intimidating, but trust me: You’ve got this! I’m learning this stuff alongside you, and I will help break it all down together. We’ll go step-by-step and have some fun with it.
RAG and Its Simplicity
RAG is a powerful tool that will make your data science journey easier, and the best part? It’s way more beginner-friendly than you think!
The ever-evolving landscape of Data Science and AI is making significant strides in automating processes, extracting valuable insights, and offering efficient solutions to complex problems. One of the latest trends in this space is Retrieval-Augmented Generation (RAG). This approach is gaining traction due to its potential to merge the strengths of large language models (LLMs) with document retrieval systems.
In this post, we’ll explore how RAG can be applied to transform news articles into valuable insights and even convert them into audio formats, making it a fantastic tool for Data Science enthusiasts and AI beginners alike.
What is Retrieval-Augmented Generation (RAG)?
Before I dive deeper into its application, let’s first understand what RaG is and why it’s essential for anyone getting started with Data Science or AI.
Breaking Down RAG: The Retriever and Generator
According to Amazon Web Services, RAG is an optimization process for a Large Language model. RAG is a hybrid system that combines:
A Retriever: A tool to fetch relevant information from a large corpus or a set of documents based on a query.
A Generator: A large language model (like GPT-4) that synthesizes the retrieved information into a coherent response, summary, or insight.
In essence, RAG ensures that AI-generated content is relevant and grounded in factual data, making it a highly accurate method for producing insights, summaries, or even converting text to audio.
Why RAG is Relevant to AI and Data Science Enthusiasts
If you are a Data Science or AI beginner, you might wonder: Why should I care about RAG?
Here’s why:
Accurate Information: RAG ensures that content is not only generated by AI but also supported by relevant data sources, reducing hallucinations (AI-generated inaccuracies).
Data-Driven Insights: You can leverage RAG to extract specific insights from vast amounts of data. For example, instead of manually reading through news articles, RAG fetches and generates summaries based on specific queries.
Time-Saving: RAG automates tedious tasks like summarizing long documents or generating reports, freeing you up to focus on analysis and decision-making.
Accessibility: Through audio conversion, RAG makes complex data more accessible, allowing you to listen to important insights on the go.
Why News? A Practical Application for RAG
My Experience with News Data
If you know about me, you may know I write a lot about extracting news data and you can see some of my previous work here:
How to get the Daily news using Python
Getting the news using NewsAPI in Python and generating word clouds to understand the author’s opinion about a topic.towardsdatascience.com
In this step-by-step guide, I show you how to use Python and the NewsAPI to fetch daily news articles, analyze them, and visualize trends using word clouds, helping you build your news aggregator and stay informed efficiently. By leveraging libraries like pandas, BeautifulSoup, and wordcloud, I walk you through automating news retrieval, parsing article content, and even setting up notifications with Twilio or SendGrid.
How to Extract Daily News Headlines and Convert to Audio Using Python
towardsdev.com
In this blog post, I walk you through a step-by-step guide on how to extract daily news headlines from CNN’s website and convert them to audio using Python. Using web scraping tools like requests
and BeautifulSoup
, I show how to extract article details with newspaper3k
and then convert the news text into audio using gTTS
.
Leveraging LLMs for Efficient News Summarization: A Hugging Face and Python Guide
Efficient News Summarization Using LLMs, Hugging Face, and Python: Discover how to efficiently summarize news articles…blog.gopenai.com
In this post, I explain how to use Large Language Models (LLMs) from Hugging Face to efficiently summarize news articles using Python. By leveraging models like BART, I guide you through setting up a summarization pipeline that extracts key information from news content, helping you stay informed without reading lengthy articles.
Leveraging RAG for News Summarization and Audio Conversion
Returning to the problem we are trying to solve here, let’s walk through the 4-step process of how I used RAG to turn a set of news articles from CNN into meaningful insights — this will help demonstrate how you can apply RAG in your data science projects.
The 4-Step Guide to Convert News to Audio
Step 0: Import relevant Libraries
This has to be a step 0, kind of a pre-requisite for any Data Science or AI related project:
import requests
from bs4 import BeautifulSoup
from newspaper import Article
from gtts import gTTS
import os
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import openai
import nltk
nltk.download('punkt')
Step 1: Fetching the News
First, I built a simple Python script to scrape news articles from the news website’s homepage. Using tools like BeautifulSoup and requests, I retrieved the latest articles. This step is crucial because the retriever in RAG needs access to a corpus of documents to function.
While I have used CNN here, you can just about use another website supported by the newspaper3k library.
def get_cnn_homepage():
url = "https://www.cnn.com"
response = requests.get(url)
return response.text
Step 2: Extracting Relevant Information
After retrieving the articles, I used the newspaper3k library to extract key information like the title, link, author, publication date, and the full text of each article. The goal was to create a dataset of recent news articles that could be used for further analysis.
def extract_article_links(html):
soup = BeautifulSoup(html, 'html.parser')
article_links = []
for link in soup.find_all('a', href=True):
href = link['href']
if href.startswith('/') and '/2024/' in href: # Filter for recent articles
full_url = f"https://www.cnn.com{href}"
if full_url not in article_links:
article_links.append(full_url)
return article_links[:5] # Limit to 5 articles
# Step 2: Extract Article Details
def extract_article_details(url):
article = Article(url)
article.download()
article.parse()
return {
'title': article.title,
'text': article.text
}
Step 3: Converting Text to Audio
One of the standout features of this approach is the ability to convert the generated summaries into audio. Using the gTTS (Google Text-to-Speech) library, I transformed the summaries into audio files, making it easier for users to consume the insights while on the go.
This makes the RAG-generated content not only informative but also highly accessible — perfect for data enthusiasts who prefer learning while multitasking.
# Step 3: Text to Speech Conversion
def text_to_speech(text, filename):
tts = gTTS(text=text, lang='en')
tts.save(filename)
return filename
Step 4: Retrieval-Augmented Generation
Next, I implemented the retrieval part of RAG using FAISS, a tool developed by Facebook AI for efficient similarity search. FAISS helped me index the articles and retrieve the most relevant ones based on user queries, such as “latest trends in AI” or “how automation affects business.”
I then used GPT-4 to generate summaries of the retrieved articles, ensuring that the content was grounded in factual news sources.
def index_articles(articles):
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')
texts = [article['text'] for article in articles]
embeddings = model.encode(texts, convert_to_tensor=False)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(np.array(embeddings))
return index, texts
def retrieve_articles(query, index, texts):
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')
query_embedding = model.encode([query], convert_to_tensor=False)
D, I = index.search(np.array(query_embedding), 5)
retrieved_articles = [texts[i] for i in I[0]]
return retrieved_articles
def generate_summary(retrieved_articles):
openai.api_key = 'your_openai_api_key' # Replace with your OpenAI API key
prompt = "Summarize the following news articles:\n"
for article in retrieved_articles:
prompt += f"\nArticle: {article}\n"
response = openai.Completion.create(
engine="text-davinci-003",
prompt=prompt,
max_tokens=500
)
return response['choices'][0]['text']
And finally, the Main Function to Run the Program
def main():
print("Fetching CNN homepage...")
homepage_html = get_cnn_homepage()
print("Extracting article links...")
article_links = extract_article_links(homepage_html)
print(f"Found {len(article_links)} articles. Extracting details...")
articles = []
for link in article_links:
article_details = extract_article_details(link)
articles.append(article_details)
print(f"Extracted: {article_details['title']}")
print("Indexing articles...")
index, texts = index_articles(articles)
query = input("Enter a topic or keyword for news retrieval: ")
retrieved_articles = retrieve_articles(query, index, texts)
print(f"Found {len(retrieved_articles)} relevant articles.")
print("Generating summary...")
summary = generate_summary(retrieved_articles)
print("Summary:")
print(summary)
# Convert summary to audio
filename = "news_summary.mp3"
text_to_speech(summary, filename)
print(f"Summary converted to audio: {filename}")
if __name__ == "__main__":
main()
The Benefits of RAG for Data Science and AI Beginners
1. Combining Search with Intelligence
RAG allows you to ask specific questions — just like you would on a search engine — but instead of simply retrieving documents, it synthesizes the information into a cohesive and insightful answer. This is incredibly useful for those just starting in Data Science and AI, where learning requires understanding complex concepts.
2. Real-Time Insights
Unlike traditional content generation methods that rely solely on pre-existing models, RAG updates itself based on the latest data it retrieves. This means your insights are always fresh and relevant, which is crucial in fast-evolving fields like Data Science and AI.
3. Audio Accessibility
Converting content into audio using RAG makes it accessible to a broader audience. Whether you’re commuting, working out, or just need a break from the screen, audio summaries allow you to continue learning effortlessly.
To see the complete code, you can visit my Github.
How You Can Implement RAG in Your Projects!
Whether you’re working on a personal data science project or looking to incorporate AI into a larger business process, RAG can add immense value. Here’s a simple guide to get started:
Choose Your Retriever: Use tools like FAISS or Elasticsearch to index and search your document corpus.
Leveraging an LLM: GPT-4 or any other large language model will help you generate insightful summaries or content from the retrieved data.
Automate Your Pipeline: Integrate RAG into a pipeline that can fetch new data, retrieve relevant documents, and generate summaries on demand.
Conclusion: Why RAG is the Future
Retrieval-Augmented Generation is more than just a buzzword; it’s a transformative technology that makes AI-generated content more reliable, relevant, and useful. This opens up a world of possibilities for Data Science and AI beginners. From news extraction to audio conversion, RAG can help you streamline your workflow, gain insights faster, and stay informed — all while ensuring that your data-driven decisions are grounded in the latest information.
If you’re just getting started in Data Science and AI, RAG is a powerful tool you don’t want to miss! Here are some other useful links you can use to learn more about RAGs:
freeCodeCamp offers a detailed tutorial on building RAG from scratch. This includes indexing, retrieval, and generation techniques. The course covers advanced topics such as RAG fusion, query translation, and adaptive RAG, making it ideal for those looking to apply RAG in real-world projects :
NVIDIA’s Technical Blog provides an insightful guide on how to build and deploy RAG pipelines. It explains key concepts such as document ingestion, embedding generation, and real-time query processing. NVIDIA also offers examples and public resources to help fast-track your RAG development.
Nexla explores practical use cases for RAG in industries like healthcare, customer service, and financial analysis. The guide compares RAG to model fine-tuning and highlights its benefits for handling real-time data and improving the contextual relevance of AI-generated responses.
Bonus!
Do you also want to know about how you can apply Machine Learning to your Business? Here is a Free 5-day Email Course that teaches how Machine Learning Can Transform Your Business. Discover practical ways to leverage machine learning to grow your business, improve efficiency, and drive results.
Free 5-Day Email Course: How Machine Learning Can Transform Your Business!