To recap, I am building a bot that, given a list of YouTube channels, tweets out snippets of their videos whenever a new one is posted. In the previous post, I listed the things we need to do to get there:
1. Get list of my subscriptions from YouTube.
2. Mark channels as music or blog.
3. For each channel get a list of recent videos and store that in a database.
4. Take the last video link and download the video.
5. Clip the video to 40s.
6. Tweet that video (title, link to YouTube).
7. Save link to tweet, video in database.
8. Check every 30 minutes if there’s a new video; if yes, start from 4.
Today we’ll be looking at the first one: getting a list of subscriptions from YouTube.
You can find the code for what I’ll be doing today on GitHub at: https://github.com/princelySid/yt-bot. The branch for this particular post is youtube. Once you have it on your local machine you’ll need to create a .env file in the project top-level directory that follows the one in .env_example.
It’s always good to use virtual environments when working on code so that you don’t mess up your local environment with packages. For this I personally like to use pipenv, a dependency management tool.
Once you have pipenv installed, you can run the command pipenv install --dev to install all the dependencies and then run the tests with pipenv run pytest. The output should look like this:
If the tests don’t pass for some reason it probably means you’ve messed up somewhere.
Let’s quickly talk through the dependencies you’ll need:
- python-dotenv: reads key-value pairs from the .env file. Mostly used to store config stuff.
- google-api-python-client: client library for Google’s discovery based APIs. It’s how we’ll access the YouTube Data API.
- loguru: logging library, I find it easier than the standard module.
- pytest: testing library, remember part of my goals in this is to get better at this.
- pretty-errors: cleaner error messages, better than standard error messages.
The YouTube Data API documentation is clear enough. You want to follow the instructions on this page. Remember that for this you only need access to publicly available data, so the only credential you need is an API key. You’ll only need to follow the instructions for step 1; the rest can be ignored. One thing that tripped me up was the pagination. For this you should use the method described here.
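To make the pagination concrete, here is a minimal sketch of one way to walk through every page of subscriptions with google-api-python-client. It assumes the list_next helper that the client generates for paged collections; the function names build_client and fetch_subscriptions and the output keys channel_id and title are my own placeholders for illustration, not necessarily what the repo uses.

```python
def build_client(api_key):
    """Create a YouTube Data API v3 client that authenticates with only an API key."""
    # Imported lazily so the paging helper below can be tried without the library.
    from googleapiclient.discovery import build

    return build("youtube", "v3", developerKey=api_key)


def fetch_subscriptions(youtube, channel_id):
    """Collect every subscription of channel_id, following pages until none are left."""
    subscriptions = youtube.subscriptions()
    request = subscriptions.list(part="snippet", channelId=channel_id, maxResults=50)
    rows = []
    while request is not None:
        response = request.execute()  # one API call per page
        for item in response.get("items", []):
            snippet = item["snippet"]
            rows.append(
                {
                    # hypothetical output keys, chosen for this sketch
                    "channel_id": snippet["resourceId"]["channelId"],
                    "title": snippet["title"],
                }
            )
        # list_next builds the request for the next page,
        # or returns None when the response has no nextPageToken.
        request = subscriptions.list_next(request, response)
    return rows
```

Calling fetch_subscriptions(build_client(api_key), channel_id) should return a list of dictionaries, provided the channel's subscriptions are public. The loop stops on its own because list_next hands back None after the last page.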
Once you’ve registered for access you’ll need to change the YT_API_KEY value in your .env file. Also change YT_CHANNEL_ID to the channel you’d like to download subscriptions from.
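For a sense of how those two values could be picked up in code, here is a small sketch using python-dotenv's load_dotenv, which copies the entries of the .env file into the process environment. The load_settings function is a hypothetical helper I made up for this example; the repo may read its config differently.

```python
import os


def load_settings(env=None):
    """Return (api_key, channel_id), failing early if either value is missing."""
    if env is None:
        # python-dotenv reads the .env file and adds its entries to os.environ.
        from dotenv import load_dotenv

        load_dotenv()
        env = os.environ

    names = ("YT_API_KEY", "YT_CHANNEL_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return env["YT_API_KEY"], env["YT_CHANNEL_ID"]
```

Failing early like this means a forgotten key shows up as a clear error message rather than as a confusing API failure later on. Passing a plain dictionary as env skips the .env file, which is handy in tests.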
At this point I should warn you that Google imposes quota limits on how often you can hit their API. You get a maximum of 10,000 units a day, and each page that my code pulls will cost you 1 unit. Be careful.
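To put that quota in perspective, here is a rough back-of-the-envelope calculation. It assumes a page size of 50 results, which is the most the subscriptions endpoint returns per page; the page size my script actually uses may differ, and units_for is a made-up helper for this example.

```python
import math

DAILY_QUOTA_UNITS = 10_000  # daily limit mentioned above
UNITS_PER_PAGE = 1          # cost of each page pulled
PAGE_SIZE = 50              # assumption: largest page the endpoint allows


def units_for(subscription_count, page_size=PAGE_SIZE):
    """Estimate the quota units needed to download subscription_count subscriptions."""
    # Even an empty subscription list costs one request.
    pages = max(1, math.ceil(subscription_count / page_size))
    return pages * UNITS_PER_PAGE


cost = units_for(120)                      # 120 subscriptions need 3 pages
runs_per_day = DAILY_QUOTA_UNITS // cost   # full downloads that fit in a day
print(cost, runs_per_day)
```

So a single download is cheap. The quota is more likely to bite if something loops by mistake or if other calls share the same key.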
Okay, so to download the data, which will be saved to a CSV, you want to run the command pipenv run python data/dl_my_subs.py. It’ll save the data to my_subs.csv. I’ve included my subscriptions in data just for fun, don’t judge me, but you should definitely download your own. The output should look similar to this:
Now we have our data. The next post in this series would be marking channels as music or blog. There’s really no automated way to do this, so it’ll be manual, and we may skip straight to step 3: for each channel, get a list of recent videos and store that in a database. Till next time. Peace!!!