How I personalized my YouTube recommendations using the YT API
How to make the most of YouTube’s API
Last week, I wrote about how the YouTube algorithm works and the AI workflow behind it. Based on the information available about its recommendation system, I think it has some flaws:
- It strongly favors watch time. Longer videos naturally accumulate more watch time, so after a while it tends to recommend longer videos.
- YouTube has a lot of clickbait and low-quality content, yet it still gets recommended, and no action is taken against the false information it spreads.
- Satisfaction signals such as likeCount and dislikeCount have little effect on recommendations, which could be improved.
During my research into the YT algorithm, I found a really interesting article by Chris Lovejoy in which he used the YT API to create a personalized recommendation system. Inspired by his thought process and insightful article, I decided to create my own YT recommendation algorithm using the YT API.
The plan was to create a system that suggests relevant videos according to my own criteria. The motive was to avoid hunting for the best video in a pool of thousands and instead get a video that statistically suits my taste.
This could save me many hours of searching for relevant content and maybe help me avoid distraction.
The workflow consisted of getting video information using YouTube’s API and then ranking the videos statistically according to my taste. Later, to make things easier, the whole process can also be automated using Python.
Getting familiar with YouTube’s API
The YouTube API is the engine driving this project. It returns all sorts of information about a video, whether statistical or descriptive.
According to the documentation, it works for both channels and videos and returns their respective metadata.
To start using the API, we need an API key, which can be generated in the Developer Console.
A search request to the API returns content based on your query.
The response is a JSON object, which can then be parsed to extract useful information.
This provides the descriptive attributes of a video or channel.
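As a rough sketch of this step, the snippet below builds a search request URL for the YouTube Data API v3 and pulls the descriptive fields out of a parsed JSON response. The helper names `build_search_url` and `parse_search_items` are made up for illustration and are not taken from the original code; sending the request (for example with `urllib.request` or `requests`) is left out so the example runs without a key.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, max_results=25, published_after=None):
    """Build a search request URL for videos matching `query` (hypothetical helper)."""
    params = {
        "part": "snippet",      # descriptive attributes only
        "q": query,
        "type": "video",
        "maxResults": max_results,
        "key": api_key,
    }
    if published_after is not None:
        # Expected to be an RFC 3339 timestamp such as 2021-01-01T00:00:00Z.
        params["publishedAfter"] = published_after
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def parse_search_items(response):
    """Extract the descriptive fields used later from a parsed search response."""
    videos = []
    for item in response.get("items", []):
        snippet = item.get("snippet", {})
        videos.append({
            "video_id": item["id"]["videoId"],
            "channel_id": snippet.get("channelId"),
            "title": snippet.get("title", ""),
            "description": snippet.get("description", ""),
            "published_at": snippet.get("publishedAt"),
        })
    return videos
```

The `video_id` and `channel_id` values collected here are what the statistics lookups need next.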
To get statistical attributes, we take the id from the descriptive attributes and pass it to a second request.
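A minimal sketch of the statistics lookup, assuming the `videos` and `channels` endpoints are queried with `part=statistics`. The helper names are hypothetical. The API returns counts as strings, so they are converted to integers here; fields that are missing or not numeric (some counts may be hidden or unavailable in public responses) are simply skipped.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.googleapis.com/youtube/v3"


def build_stats_url(resource, ids, api_key):
    """Build a statistics request URL for 'videos' or 'channels' (hypothetical helper)."""
    if resource not in ("videos", "channels"):
        raise ValueError("resource must be 'videos' or 'channels'")
    # Several ids can be requested at once as a comma-separated list.
    params = {"part": "statistics", "id": ",".join(ids), "key": api_key}
    return f"{API_ROOT}/{resource}?" + urlencode(params)


def parse_statistics(response):
    """Map each item id to its statistics, converting count strings to int."""
    stats = {}
    for item in response.get("items", []):
        raw = item.get("statistics", {})
        stats[item["id"]] = {
            name: int(value)
            for name, value in raw.items()
            # Skip booleans and anything else that is not a digit string.
            if isinstance(value, str) and value.isdigit()
        }
    return stats
```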
Creating the Perfect Formula?
I’m not a big fan of YouTube’s recommendation system. I think it lacks several important attributes, or maybe I just have peculiar taste.
Now that I was familiar with the YouTube API and could easily pull useful information from it, it was time to get creative and develop ranking metrics that suit my preferences.
Several factors could make a good video: the view count, the watch duration, the satisfaction signals of the video (likes, comments, shares), or tags that closely match my search query.
The easiest approach would be to settle for videos with a high view count. But if a channel has 10M subscribers, getting 100k views on a video is no big deal for it. If a video from a channel with 10k subscribers hits 100k views, however, we can infer that the content was up to the mark.
In that case, getting a view-to-subscriber ratio might be the best metric to choose relevant videos.
But a channel with a very low subscriber count can inflate the ratio. So I tweaked the code a little and added some limits: videos must have at least 10k views and 1k subscribers.
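The ratio and its two floors can be sketched as follows. The function and constant names are illustrative; only the 10k-view and 1k-subscriber thresholds come from the text above.

```python
MIN_VIEWS = 10_000        # floor on a video's view count
MIN_SUBSCRIBERS = 1_000   # floor on the channel's subscriber count


def view_to_subscriber_ratio(views, subscribers):
    """Return views / subscribers, or None if the video fails either floor."""
    if views < MIN_VIEWS or subscribers < MIN_SUBSCRIBERS:
        return None
    return views / subscribers
```

With the numbers from the example above, 100k views on a 10k-subscriber channel gives a ratio of 10, while the same 100k views on a 10M-subscriber channel gives 0.01.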
Further, view count and subscriber count couldn’t be the only measures for ranking. I introduced a likeCount-to-dislikeCount ratio to further pick out relevant and trustworthy content.
Adding the view-to-subscriber ratio and likecount-to-dislikecount ratio, I developed a score for each video.
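One way to read this step is as a plain sum of the two ratios, sketched below. The function names are hypothetical, and treating zero dislikes as one is my own assumption to avoid dividing by zero. If a dislike count is not available in the API response at all, some substitute signal would have to be chosen instead.

```python
def like_to_dislike_ratio(likes, dislikes):
    """Likes per dislike; zero dislikes is treated as one (an assumption)."""
    return likes / max(dislikes, 1)


def video_score(views, subscribers, likes, dislikes):
    """Sum of the view-to-subscriber and like-to-dislike ratios."""
    if subscribers <= 0:
        raise ValueError("subscribers must be positive")
    return views / subscribers + like_to_dislike_ratio(likes, dislikes)
```

Because the two ratios can live on very different scales, a weighted sum might behave better in practice; the unweighted version is shown only because it is the simplest reading.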
It is commonly assumed that content on YouTube is at its prime within 24–48 hours of upload, when it fetches most of its views and satisfaction signals. Rather than hard-coding that window, I decided to set the time frame manually for each query.
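A small sketch of how a manually chosen time frame could be turned into the RFC 3339 timestamp that the search endpoint's `publishedAfter` parameter expects. The helper name `published_after` and its `days_back` argument are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def published_after(days_back, now=None):
    """Return an RFC 3339 UTC timestamp `days_back` days before `now`.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days_back)).astimezone(timezone.utc)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")
```

Passing `days_back=2` would restrict a search to roughly the 48-hour window mentioned above, while a larger value widens it for slower-moving topics.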
To get more precise results, I also worked with the descriptive attributes and checked whether the “query” is present in both the title and the description.
I counted the occurrences of the query in the title and in the description, following the idea of “the more, the merrier”.
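The counting step might look like the sketch below. Matching case-insensitively is an assumption on my part, and the function name is illustrative.

```python
def count_query_occurrences(query, title, description):
    """Count case-insensitive occurrences of `query` in title and description."""
    needle = query.casefold()
    if not needle:
        return 0  # an empty query would otherwise match everywhere
    return title.casefold().count(needle) + description.casefold().count(needle)
```

This is plain substring matching, so a query such as "kube" would also count inside "Kubernetes"; a word-boundary regex would be stricter if that matters.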
As the last step, I modified my final score function. First, it looks at the keyword in the title and description and returns the content with the most occurrences. Then it returns the content with the highest view-to-subscriber and likeCount-to-dislikeCount ratios.
I tested my workflow on the query “Kubernetes” and got the following result.
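One possible reading of this two-stage ordering is a sort on a pair of keys: keyword occurrences first, ratio-based score second. The dictionary keys `keyword_hits` and `score` are hypothetical names for values computed in the earlier steps.

```python
def rank_videos(videos):
    """Order videos by keyword occurrences, breaking ties by score (both descending)."""
    return sorted(
        videos,
        key=lambda v: (v["keyword_hits"], v["score"]),
        reverse=True,
    )
```

For example, a video with three keyword matches and a score of 2.0 ranks above one with a single match and a score of 50.0, because the score only breaks ties between videos with the same number of matches.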
The results fetched are good and reliable, but in my opinion, things could get a little better.
Overall, it was a fun project that revolved around understanding YouTube’s API and the workflow of YouTube’s recommendation system.
The workflow of the code can be summarized as:
- Manually enter the query, time frame, and API key to extract videos.
- Filter videos according to descriptive and statistical attributes.
- Rank the videos.
- Display the output.
You can find the full code on my GitHub.
The project is still in its initial stages and could be improved a lot. Some steps that could be taken are:
- The whole process of fetching personalized videos could be automated.
- A better metric implementation to get even better results.
- Deployment of the code on cloud servers for public use.
If you like this article, please consider subscribing to my newsletter: Daksh Trehan’s Weekly Newsletter.
Hopefully, this article has given you an insight into the YouTube recommendation system and how you can construct one for yourself.
That said, what this article says about YouTube’s own recommendation system is based solely on theories drawn from user experience or publicized by YouTube developers. The personalized algorithm could be pushed further to fetch even better results.