Building a YouTube Video Summarizer with Gemini and Python

In today's fast-paced digital world, we're constantly bombarded with video content. While YouTube has become an incredible resource for learning and entertainment, finding time to watch lengthy videos can be challenging. What if you could get the key points from any YouTube video in just seconds?

In this blog post, I'll walk you through how I built a simple yet powerful YouTube video summarizer using Google's Gemini AI. This project showcases how modern AI tools can be leveraged to solve practical, everyday problems.

Step 1: Setting Up the Environment

Before we start coding, we need to set up our environment. Here’s what you’ll need:

Python installed on your machine.
A Gemini API key (you can create one in Google AI Studio), stored as GENAI_API_KEY in a .env file next to the script.
The required Python libraries, installed with the command below.

Run the following command to install the necessary libraries:

pip install streamlit python-dotenv google-generativeai youtube_transcript_api

Step 2: Writing the Code

Here’s the complete code for our YouTube summarizer app:

from dotenv import load_dotenv
load_dotenv()
import streamlit as st
import os
import google.generativeai as genai
from youtube_transcript_api import YouTubeTranscriptApi

flag=0
genai.configure(api_key=os.getenv("GENAI_API_KEY"))

prompt_text="""You are a YouTube video summarizer. Your task is to analyze the transcript of a video and create a concise summary, highlighting the key points within a 250-word limit. 
Please provide the important details from the text provided.  """

#prepare transcript from video
def transcript_extractor(video_string):
    video_id=video_string.split("=")[1]
    transcript_text=YouTubeTranscriptApi.get_transcript(video_id)
    transcript = ""
    for i in transcript_text:
        transcript += " " + i["text"]
    return transcript

#gemini call
def get_gemini_response(prompt_text,transcript_text):
    model = genai.GenerativeModel('gemini-2.0-flash')
    response = model.generate_content(prompt_text+transcript_text)
    return response.text

##initialize streamlit app
st.set_page_config(page_title="Youtube transcriber")
st.header("Youtube transcriber")
youtube_link=st.text_input("Enter Youtube video URL: ",key="input")
if youtube_link:
    video_id = youtube_link.split("=")[1]
    st.image(f"http://img.youtube.com/vi/{video_id}/0.jpg", use_container_width=True)
    flag=1
submit=st.button("Prepare summary of the video")

## Button click
if submit:
    if flag==1:
        transcript_text=transcript_extractor(youtube_link)
        response=get_gemini_response(prompt_text,transcript_text)
        st.subheader("Video summary:")
        st.write(response)
    else:
        st.warning("Please enter a YouTube video URL.")

GitHub:

https://github.com/vipinputhanveetil/gemini_youtube_transcriber

Step 3: Breaking Down the Code

1. Import

from dotenv import load_dotenv
load_dotenv()
import streamlit as st
import os
import google.generativeai as genai
from youtube_transcript_api import YouTubeTranscriptApi

flag=0
genai.configure(api_key=os.getenv("GENAI_API_KEY"))

I load the Gemini API key from an environment variable (read from a .env file by python-dotenv) instead of hard-coding it, which is good practice for any application that uses API credentials.
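If the variable is missing, os.getenv quietly returns None and the problem only shows up later, when the first Gemini call fails. Here's a minimal sketch of a fail-fast check. The helper name get_api_key is my own and not part of any library; it assumes your .env file contains a line like GENAI_API_KEY=your-key-here.

```python
import os


def get_api_key(var_name="GENAI_API_KEY"):
    """Return the API key stored in var_name, or raise a clear error.

    Hypothetical helper for illustration; it only reads the environment.
    """
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Add it to your .env file or environment."
        )
    return key
```

You could then pass the result to the existing configure call, i.e. genai.configure(api_key=get_api_key()).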

2. Extracting the Transcript

#prepare transcript from video
def transcript_extractor(video_string):
    video_id=video_string.split("=")[1]
    transcript_text=YouTubeTranscriptApi.get_transcript(video_id)
    transcript = ""
    for i in transcript_text:
        transcript += " " + i["text"]
    return transcript

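This function extracts the video ID from a YouTube URL and then uses the YouTube Transcript API to fetch and concatenate the transcript text. Note that splitting on "=" only works for plain watch?v=<id> links: a URL with extra parameters such as &t=42s yields a wrong ID, and a shortened youtu.be link has no "=" to split on at all.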
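For comparison, here's a rough sketch of a more forgiving way to pull out the video ID using only the standard library. The function name extract_video_id and the set of URL shapes it handles (watch links, youtu.be short links, and /embed/, /shorts/, /live/ paths) are my assumptions about what is worth covering, not an exhaustive list. It expects a full URL including the scheme.

```python
from urllib.parse import parse_qs, urlparse

# Hostnames treated as "regular" YouTube pages in this sketch.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    path_parts = [part for part in parsed.path.split("/") if part]

    # Short links look like https://youtu.be/<id>?t=42
    if host == "youtu.be":
        return path_parts[0] if path_parts else None

    if host in YOUTUBE_HOSTS:
        # Standard links carry the ID in the "v" query parameter.
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Paths such as /embed/<id>, /shorts/<id>, /live/<id>.
        if len(path_parts) >= 2 and path_parts[0] in {"embed", "shorts", "live"}:
            return path_parts[1]

    return None
```

Because it returns None instead of raising, the app can show a warning for an unrecognized link rather than crashing with an IndexError.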

3. Generating the Summary with Gemini AI

#gemini call
def get_gemini_response(prompt_text,transcript_text):
    model = genai.GenerativeModel('gemini-2.0-flash')
    response = model.generate_content(prompt_text+transcript_text)
    return response.text

Here, I'm using the Gemini 2.0 Flash model, which is optimized for fast responses. The prompt below is prepended to the transcript and asks the model for a concise summary:

prompt_text="""You are a YouTube video summarizer. Your task is to analyze the transcript of a video and create a concise summary, highlighting the key points within a 250-word limit. 
Please provide the important details from the text provided.  """
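The 250-word limit is hard-coded into the prompt string. As a small sketch of how it could be made adjustable, the hypothetical helper below (build_prompt is my own name, not something from the Gemini SDK) builds the same instructions around a configurable word limit and adds an explicit "Transcript:" label, so the instructions and the transcript don't run together when they are concatenated.

```python
def build_prompt(word_limit=250):
    """Build the summarizer instructions for a given word limit.

    Hypothetical helper; the wording mirrors the prompt used in this post.
    """
    if word_limit <= 0:
        raise ValueError("word_limit must be a positive integer")
    return (
        "You are a YouTube video summarizer. Your task is to analyze the "
        "transcript of a video and create a concise summary, highlighting "
        f"the key points within a {word_limit}-word limit. "
        "Please provide the important details from the text provided.\n\n"
        "Transcript:\n"
    )
```

With this in place, the existing call would become get_gemini_response(build_prompt(150), transcript_text), and the limit could come from a Streamlit widget such as st.slider.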

4. Building the User Interface with Streamlit

##initialize streamlit app
st.set_page_config(page_title="Youtube transcriber")
st.header("Youtube transcriber")
youtube_link=st.text_input("Enter Youtube video URL: ",key="input")
if youtube_link:
    video_id = youtube_link.split("=")[1]
    st.image(f"http://img.youtube.com/vi/{video_id}/0.jpg", use_container_width=True)
    flag=1
submit=st.button("Prepare summary of the video")

## Button click
if submit:
    if flag==1:
        transcript_text=transcript_extractor(youtube_link)
        response=get_gemini_response(prompt_text,transcript_text)
        st.subheader("Video summary:")
        st.write(response)
    else:
        st.warning("Please enter a YouTube video URL.")
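As written, the button handler lets any exception bubble up, for example from a video with no transcript or a failed Gemini request, and Streamlit shows a traceback. Here's one way that could be softened, as a sketch only: summarize_safely is a hypothetical wrapper of mine that takes the two steps as plain callables, so it can be tried out without a network connection. Catching Exception broadly is a simplification for the example.

```python
def summarize_safely(url, fetch_transcript, summarize):
    """Run fetch -> summarize and return (summary, error_message).

    Exactly one of the two returned values is None. Both callables are
    supplied by the caller, e.g. transcript_extractor and a small wrapper
    around get_gemini_response.
    """
    try:
        transcript = fetch_transcript(url)
    except Exception as exc:  # broad on purpose for this sketch
        return None, f"Could not fetch a transcript: {exc}"

    if not transcript.strip():
        return None, "The transcript is empty."

    try:
        return summarize(transcript), None
    except Exception as exc:  # broad on purpose for this sketch
        return None, f"The summary request failed: {exc}"


# Possible use inside the button handler:
#   summary, error = summarize_safely(
#       youtube_link,
#       transcript_extractor,
#       lambda text: get_gemini_response(prompt_text, text),
#   )
#   if error:
#       st.error(error)
#   else:
#       st.write(summary)
```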

Step 4: Running the App

To run the app, save the code in a file (e.g., gemini_youtube_transcriber.py) and run the following command in your terminal:

streamlit run gemini_youtube_transcriber.py

This will start the Streamlit app, and you can access it in your browser at http://localhost:8501 (Streamlit's default port; the terminal output shows the exact URL).

Step 5: Testing the App

Copy and paste any YouTube URL into the input box and press Tab or Enter.
The app displays the video thumbnail for confirmation.
When you click the "Prepare summary of the video" button, the app fetches the transcript and displays the AI-generated summary.

Benefits and Use Cases

This YouTube summarizer offers several benefits:

Time Saving: Get the key points of a video without watching the entire thing
Content Filtering: Quickly determine if a video contains the information you need
Study Aid: Create summaries of educational videos for review
Accessibility: Make video content more accessible to those who prefer reading

Future Improvements

While the current version works well, there are several enhancements I plan to implement:

Better URL Parsing: Support various YouTube URL formats (shortened links, timestamps, etc.)
Multi-language Support: Add options for summarizing videos in different languages
Customizable Summary Length: Allow users to specify how detailed they want the summary to be
Timestamp Linking: Include timestamps in the summary that link to the relevant parts of the video
Error Handling: More robust handling of videos without available transcripts

Conclusion

This project demonstrates how AI tools like Google Gemini can be combined with existing APIs to create practical applications that solve real-world problems. The YouTube summarizer is just one example of how AI can help us consume information more efficiently in our content-rich digital landscape.

By leveraging the power of large language models and complementary tools, even relatively simple applications can provide significant value. The code for this project is straightforward, yet the resulting tool can save hours of time for students, researchers, professionals, and casual YouTube viewers alike.
