Search Engine for YouTuber Ali Abdaal's videos

Overview


This is a personalized search engine for my favorite YouTuber, Ali Abdaal. I used Selenium to scrape all his videos, youtube-dl to download them as audio files, and Google Speech Recognition to transcribe the audio files.

I then used that data to populate a Postgres database hosted on Supabase and built a React frontend where users can look up phrases and see how many times they were said and in which videos.
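The phrase lookup described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual query logic: the record shape (`title`, `url`, `transcript` keys) and the function name `count_phrase` are assumptions made for the example.

```python
import re
from typing import Dict, List


def count_phrase(videos: List[Dict[str, str]], phrase: str) -> List[Dict[str, object]]:
    """Return the videos that mention `phrase`, with a per-video count.

    Each item in `videos` is assumed (hypothetically) to look like
    {"title": ..., "url": ..., "transcript": ...}.
    """
    # Case-insensitive match on the literal phrase. The lookarounds stop
    # "work" from matching inside "homework" or "workflow".
    pattern = re.compile(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", re.IGNORECASE)

    results = []
    for video in videos:
        hits = len(pattern.findall(video["transcript"]))
        if hits:
            results.append({"title": video["title"], "url": video["url"], "count": hits})

    # Videos with the most mentions come first.
    results.sort(key=lambda r: r["count"], reverse=True)
    return results
```

Calling `count_phrase(videos, "deep work")` returns one entry per matching video, so the total number of mentions is the sum of the `count` fields.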

Technologies Used

  • Python (all scripting)
  • Selenium (web scraping)
  • SpeechRecognition (Google Speech Recognition)
  • youtube-dl (downloading videos)
  • React & Chakra UI (Frontend)
  • Firebase (Hosting & Analytics)
  • Supabase (Postgres database)

Usage

  1. Run scraping_vid_info.py to get a JSON file with all the video names and URLs of a channel
  2. Run download_yt_vids to download a WAV audio file from each video URL and save it locally
  3. Run transcribe_audio.py to transcribe all the audio files and save them in a JSON file
  4. Create a database and import the JSON file into it. Then connect it to the frontend and voila!
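As a rough sketch of step 4, the snippet below loads transcript records into a database and runs a simple phrase query. It uses Python's built-in `sqlite3` as a local stand-in for the Postgres database on Supabase, and the table name `videos`, its columns, and both function names are assumptions made for the example, not the project's schema.

```python
import json
import sqlite3
from typing import Dict, List, Tuple


def load_transcripts(conn: sqlite3.Connection, records: List[Dict[str, str]]) -> int:
    """Insert records into a hypothetical `videos` table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, transcript TEXT NOT NULL)"
    )
    # INSERT OR REPLACE keeps the load idempotent when the script is re-run.
    conn.executemany(
        "INSERT OR REPLACE INTO videos (url, title, transcript) "
        "VALUES (:url, :title, :transcript)",
        records,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM videos").fetchone()[0]


def find_videos(conn: sqlite3.Connection, phrase: str) -> List[Tuple[str, str]]:
    """Return (title, url) for every video whose transcript contains `phrase`."""
    # SQLite's LIKE is case-insensitive for ASCII text by default.
    # Note: % and _ inside `phrase` would act as wildcards in this sketch.
    rows = conn.execute(
        "SELECT title, url FROM videos WHERE transcript LIKE ? ORDER BY title",
        (f"%{phrase}%",),
    )
    return rows.fetchall()
```

With a JSON file from step 3 in the assumed shape, the records could be read with `json.load` and passed to `load_transcripts(sqlite3.connect("videos.db"), records)`. Moving to Postgres would mean swapping the connection and adjusting the upsert syntax.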

Progress is being tracked with GitHub Issues and a Kanban board in the Projects tab of this repo.

Motivation

This project was inspired by Kalle Hallden's Joe Rogan project. The idea behind it is to have a search engine for my favorite YouTuber so that I can look up certain phrases or words and find the videos where he mentions them.

The same approach could help students download their professor's lectures and build a searchable database from them, making it quick to look up where certain concepts were mentioned. Another application is for conferences: transcribe all the talks and make them searchable. I plan to develop this into a boilerplate anyone can use to create their own search engines starting from video, audio, or text files.

Comments
  • Run the audio transcribing on VPS

    Problem: The folder is 55GB, so I need a way to get it onto the Linode. Got around it by using an 80GB Linode.

    Plan is to transfer the audio files (55GB) over SFTP and SSH using this guide, then run the audio transcribing script - https://www.linode.com/docs/guides/transfer-files-with-cyberduck-on-mac-os-x/

    opened by Nutlope
Owner
Hassan El Mghari
Learning & Building.