I don’t know about you, but my “Watch Later” queue on YouTube has become less of a to-do list and more of a digital graveyard of good intentions. Far too often, I’d click on a promising tutorial, a technical deep-dive, or even just a review of a new piece of hardware, only to find it stretched out over 20, 30, sometimes 45 minutes of preamble, filler, and slow pacing. My problem was simple: I needed the core information, the actual steps, the pros and cons, without committing half an hour to finding a few key sentences buried within. I’m a busy person; time is currency, and I was spending too much of it mining for gold in rivers of gravel.
The common approach, of course, is to scrub through the timeline, hopping from one timestamp to another, or to rely on the often-sparse and sometimes wildly inaccurate auto-generated captions. Even if the captions are decent, reading a full transcript of a verbose speaker is hardly faster than watching the video at 2x speed. What I really needed was a concise, actionable summary: something that could distill the video’s essence into a few bullet points or a short paragraph. That’s where AI, specifically large language models (LLMs), comes in. It’s not about being trendy; it’s about offloading a tedious task to a tool that’s actually good at it, so I can quickly decide whether a video warrants a full watch or whether the summary already gave me everything I need.
The Practical Approach: Browser Extension and an LLM
After trying a few different methods—some involving downloading transcripts and pasting them into various online summarizers, others requiring more convoluted API calls—I settled on a workflow that’s surprisingly robust and simple. It relies on a good browser extension paired with an accessible LLM. For me, that meant Chrome and a free-tier ChatGPT account.
Step 1: Install a Reliable YouTube Summary Extension
- Open your browser (I use Chrome, but Firefox usually has similar options).
- Navigate to the Chrome Web Store. You can usually find this by going to the browser menu (three dots in Chrome), then Extensions, then Visit Chrome Web Store.
- In the search bar, type something like “YouTube Summary with ChatGPT”. There are a few reputable options that do this. I’ve had good luck with the one by Glasp, and similar tools that clearly state their purpose work too. Don’t just pick the first result; take a moment to look at the reviews and the number of users. I’m always cautious about giving extensions broad permissions, so I stick to well-established ones.
- Click Add to Chrome (or your browser’s equivalent). Confirm any prompts asking for permissions. Typically, they need to read and change data on YouTube, which is expected for their function.
Step 2: Access the Summary on a YouTube Video
- Once installed, navigate to any YouTube video you want to summarize.
- You should now see a new icon or a new section on the YouTube page, usually on the right-hand side or under the video player. For the Glasp extension, there’s typically a sidebar that appears, often showing the full transcript immediately.
- Locate the summarization button. It might be labeled something like “Summarize with ChatGPT”, “View Summary”, or just a small icon that, when clicked, opens a new tab or window.
Step 3: Generate and Refine Your Summary
- Click that summarization button. What usually happens is that the extension will either open a new browser tab directly to ChatGPT (or whatever LLM it’s configured for) with the video’s transcript already pasted and a prompt pre-filled, or it will generate the summary within the extension’s sidebar.
- If it takes you to ChatGPT, you’ll likely see a prompt similar to “Summarize the following YouTube video transcript:” followed by the full text. ChatGPT will then process this and provide a summary.
- If the initial summary isn’t quite what you need (maybe it’s too long, too short, or missed a specific point you were looking for), you can refine it with a follow-up prompt in ChatGPT. For instance:
  - “Summarize this into 3 bullet points focusing on the technical steps.”
  - “What are the main advantages and disadvantages mentioned?”
  - “Extract any specific commands or code snippets mentioned in the video.”
This is where your ability to formulate a clear question helps the AI deliver a more precise answer. Think of it like talking to a very diligent but sometimes naive intern.
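If you ever end up pasting a transcript by hand (say, because the extension is broken that day), the prompt it builds for you is easy to reproduce. Here is a minimal sketch in Python. The function name `build_prompt` and the whitespace cleanup are my own assumptions for illustration, not how any particular extension actually works; the default instruction is simply the wording I described above.

```python
# Hypothetical helper: combines an instruction with a transcript,
# roughly mimicking the pre-filled prompt an extension might send.
DEFAULT_INSTRUCTION = "Summarize the following YouTube video transcript:"


def build_prompt(transcript: str, instruction: str = DEFAULT_INSTRUCTION) -> str:
    """Return one prompt string: the instruction, a blank line, then the transcript."""
    # Caption text often arrives with hard line breaks every few words;
    # collapsing all runs of whitespace makes it one continuous paragraph.
    cleaned = " ".join(transcript.split())
    return f"{instruction}\n\n{cleaned}"


if __name__ == "__main__":
    sample = "welcome back to the channel\ntoday we install\n  the package"
    print(build_prompt(sample, "Summarize this into 3 bullet points focusing on the technical steps."))
```

Swapping the `instruction` argument is the scripted equivalent of typing one of the follow-up prompts above.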
Common Problems
I’ve run into a few snags doing this, which are worth noting so you don’t waste time.
- LLM Login Status: My most memorable mistake was clicking the “Summarize with ChatGPT” button repeatedly, getting a new tab to the ChatGPT interface each time, but no summary. I sat there for a good minute, scratching my head, wondering if the extension was broken or if YouTube had changed its API. Turns out, I just wasn’t logged into my ChatGPT account. The extension dutifully opened the page, but without an active session, it couldn’t paste the prompt and initiate the summary. A quick login fixed it, and I felt a bit silly for not checking the obvious first.
- Transcript Quality and Availability: These extensions rely heavily on YouTube’s ability to generate a good transcript. For older videos, videos with poor audio quality, or those in less common languages, the auto-generated transcript can be garbled, incomplete, or simply nonexistent. If the source material is poor, the summary will be poor. Some creators also disable transcripts, so you’re out of luck there.
- Extension Breaking: YouTube frequently updates its interface and underlying code. This can sometimes break extensions that rely on specific page elements. If an extension stops working, check its reviews or the developer’s notes for recent complaints or updates. Sometimes a simple browser restart or extension re-enable is all it takes.
- Context Window Limitations: While uncommon for typical YouTube videos, extremely long ones (multi-hour streams or lectures) can produce transcripts that exceed the context window of the free-tier LLM you’re using. If you get an error about “too many tokens,” you may need to split the transcript into segments and summarize each one separately, or consider a paid LLM service with a larger context window. I haven’t hit this often for my use case, but it’s a limit worth knowing about.
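For the context window problem, segmenting a transcript by hand gets old fast, so here is a rough sketch of how it could be scripted in Python. Treat the details as assumptions: `split_transcript` is a name I made up, word count is only a crude stand-in for tokens, and the default of 2000 words is an arbitrary placeholder rather than the actual limit of any model.

```python
# Hypothetical helper: splits a long transcript into word-bounded chunks.
# Word count is only a rough proxy for tokens; tune max_words to whatever
# your LLM actually accepts.
def split_transcript(transcript: str, max_words: int = 2000) -> list[str]:
    """Split the transcript into chunks of at most max_words words each."""
    if max_words <= 0:
        raise ValueError("max_words must be a positive integer")
    words = transcript.split()
    # Step through the word list in strides of max_words and rejoin each slice.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    chunks = split_transcript("one two three four five", max_words=2)
    for number, chunk in enumerate(chunks, start=1):
        print(f"Part {number}: {chunk}")
```

You would then summarize each part on its own and, if needed, ask the LLM to merge those partial summaries into one. Splitting on a fixed word count can cut a sentence in half, which is usually acceptable for a summary but worth keeping in mind.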
This approach provides a reliable shortcut to gleaning essential information from lengthy videos, saving valuable time and mental effort.
