ytdebunk is a command-line tool that can be installed via pip. It takes a YouTube video link as an argument and automates several processing steps for you.
This repository contains the source code and a demonstration of its features.
- Download audio from YouTube videos.
- Transcribe the audio content.
- Optionally enhance the transcription using the Gemini API.
- Optionally detect logical faults in the transcription using the Gemini API.
- Store the audio, transcription, and logical errors in a local folder.
There is also a Streamlit-based demo application.
- Classifying assertive claims within the transcription.
- Fact-checking and validating claims using online searches and agentic AI.
- Categorizing factual and logical faults.
- Generating a script for a hypothetical debunker character using generative AI (or AI Agents).
- Synthesizing the script into audio and video using generative AI (or AI Agents).
This tool is particularly useful for analyzing transcriptions to identify logical fallacies and incorrect claims made by YouTubers, helping to prepare debunk videos.
To avoid conflicts, it is recommended to create a virtual environment:

    python3.11 -m venv .venv
    source .venv/bin/activate

Now, install ytdebunk from PyPI:

    pip install ytdebunk

Alternatively, install the latest version directly from GitHub:

    pip install git+https://github.com/hissain/youtuber-debunked.git

ytdebunk is a command-line interface (CLI) with multiple options.
`video_url` (str): URL of the YouTube video to extract audio from.

| Option | Description |
|---|---|
| `-l`, `--language` (str) | Language code for transcription. Supported: [bn, en] (default: en) |
| `-e`, `--enhance` (bool) | Enhance the transcription using the Gemini API (default: False) |
| `-d`, `--detect` (bool) | Detect logical faults using the Gemini API (default: False) |
| `-v`, `--verbose` (bool) | Enable verbose logging. |
| `-t`, `--token` (str) | API token for Gemini API (Required if `--enhance` or `--detect` is enabled) |
| `-st`, `--start_time` (float) | Start time of the audio clip (seconds) |
| `-et`, `--end_time` (float) | End time of the audio clip (seconds) |
| `-m`, `--model` (str) | Transcription model from Hugging Face (WhisperFeatureExtractor) |
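
The documented options map naturally onto Python's standard `argparse` module. The following is a minimal sketch of such a parser, reconstructed only from the option table; it is not ytdebunk's actual source. The function name `build_parser`, the `None` defaults for the token, clip times, and model, and the use of `choices` for the language are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the real ytdebunk code)."""
    parser = argparse.ArgumentParser(prog="ytdebunk")
    parser.add_argument("video_url", type=str, help="URL of the YouTube video")
    # Assumption: the supported languages are enforced with `choices`.
    parser.add_argument("-l", "--language", type=str, default="en", choices=["bn", "en"])
    # Boolean flags default to False and become True when present.
    parser.add_argument("-e", "--enhance", action="store_true")
    parser.add_argument("-d", "--detect", action="store_true")
    parser.add_argument("-v", "--verbose", action="store_true")
    # Assumption: these default to None when omitted.
    parser.add_argument("-t", "--token", type=str, default=None)
    parser.add_argument("-st", "--start_time", type=float, default=None)
    parser.add_argument("-et", "--end_time", type=float, default=None)
    parser.add_argument("-m", "--model", type=str, default=None)
    return parser


# Parse a sample command line: enhance and detect on a 30s-90s clip.
args = build_parser().parse_args(
    ["https://www.youtube.com/watch?v=example", "-e", "-d", "-st", "30", "-et", "90"]
)
print(args.language, args.enhance, args.detect, args.start_time, args.end_time)
```

With this layout, the clip times arrive as floats and the two Gemini-backed steps are plain booleans, so downstream code can test them directly.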

    ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v -t YOUR_GEMINI_API_KEY

Alternatively, using an environment variable:

    export GEMINI_API_KEY="your_api_key"
    ytdebunk "https://www.youtube.com/watch?v=example" -e -d -v

For more examples, check the Example Notebook.
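
Both invocations supply the same Gemini token through different channels. A minimal sketch of how such a lookup could work is shown below. The helper name `resolve_token` is hypothetical, and the rule that an explicit `--token` value takes precedence over `GEMINI_API_KEY` is an assumption, since this README does not state the precedence.

```python
import os
from typing import Mapping, Optional


def resolve_token(cli_token: Optional[str], env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token from the CLI flag, else from GEMINI_API_KEY, else None.

    Assumption: the flag wins over the environment variable.
    """
    if cli_token:
        return cli_token
    # Treat an empty environment value the same as a missing one.
    return env.get("GEMINI_API_KEY") or None


print(resolve_token("from-flag", {"GEMINI_API_KEY": "from-env"}))  # from-flag
print(resolve_token(None, {"GEMINI_API_KEY": "from-env"}))         # from-env
print(resolve_token(None, {}))                                     # None
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.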
To run the demo using Streamlit:

- Install Streamlit:

      pip install streamlit

- Run the application:

      streamlit run app.py

Set the Gemini API token as an environment variable:

    export GEMINI_API_KEY="your_api_key"
- Download Audio
  - Uses `ytdebunk.downloader.download_audio` to download audio from the given YouTube URL.
- Transcribe Audio
  - Uses `ytdebunk.transcriber.transcribe_audio` to generate a text transcription.
- Enhance Transcription (Optional)
  - If `--enhance` is enabled, `ytdebunk.refiner.enhance_transcription` refines the transcription using the Gemini API.
  - The API token must be provided via `--token` or as an environment variable.
- Detect Logical Faults (Optional)
  - If `--detect` is enabled, `ytdebunk.philosopher.detect_logical_faults` identifies logical faults, fallacies, biases, irony, etc., using the Gemini API.
  - The API token must be provided via `--token` or as an environment variable.
- Save Transcription
  - The final audio, transcription, and detected logical faults (raw or enhanced) are saved to the `./output` folder.
- If `--enhance` or `--detect` is enabled but no Gemini API token is provided, the script exits with an error message.
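
The workflow and its error handling can be sketched as a single function. This is an illustrative outline only, not the tool's implementation: `run_pipeline` and its parameters are hypothetical, and the stage functions are passed in as callables because the real signatures of `download_audio`, `transcribe_audio`, `enhance_transcription`, and `detect_logical_faults` are not documented in this README. The output file names and the choice to check for the token before downloading are likewise assumptions of the sketch.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_pipeline(
    video_url: str,
    download: Callable[[str], str],                        # URL -> audio path (assumed shape)
    transcribe: Callable[[str], str],                      # audio path -> text (assumed shape)
    enhance: Optional[Callable[[str, str], str]] = None,   # (text, token) -> text (assumed)
    detect: Optional[Callable[[str, str], str]] = None,    # (text, token) -> report (assumed)
    token: Optional[str] = None,
    output_dir: str = "./output",
) -> dict:
    # Exit with an error if a Gemini-backed step is requested without a token.
    if (enhance or detect) and not token:
        raise SystemExit("Error: a Gemini API token is required for --enhance or --detect.")

    audio_path = download(video_url)
    text = transcribe(audio_path)
    if enhance:
        text = enhance(text, token)
    faults = detect(text, token) if detect else None

    # Save the results; the file names here are placeholders.
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "transcription.txt").write_text(text, encoding="utf-8")
    if faults is not None:
        (out / "logical_faults.txt").write_text(faults, encoding="utf-8")
    return {"audio": audio_path, "transcription": text, "faults": faults}


# Usage with stand-in stages, writing to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    result = run_pipeline(
        "https://www.youtube.com/watch?v=example",
        download=lambda url: "audio.mp3",
        transcribe=lambda path: "raw transcript",
        enhance=lambda text, tok: text.upper(),
        token="your_api_key",
        output_dir=tmp,
    )
    print(result["transcription"])
```

Injecting the stages keeps the control flow testable without network access; in real use, the ytdebunk functions named above would be wrapped to match whatever signatures they actually expose.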
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Fork the project and submit a pull request to add new features or improve existing ones.
For inquiries, contact the project author at [email protected].




