Spotify Playlist Automation and Web Scraping with JavaScript

Mehmet Akif KARADAĞ
11 min read · Apr 21, 2023

Max FM, a radio channel I love listening to, can only be received in Ankara, where I live, or through its website. I noticed that the channel has no official playlist on Spotify, so I decided to create an unofficial one. The technologies I used for this project:

- JavaScript

- Spotify API

- Max FM Website

- NodeJS

- Puppeteer, dotenv, expressJS, Axios, fs

The general outline of the program is as follows:

1- Find the title and artist of the song currently playing on the radio's web player.

2- Complete the authorization process required to use the Spotify API.

3- Call the Spotify API to add the song found in step one to the playlist.

4- Delete the oldest song from the playlist so that it always holds 100 songs.

Let's move on to the development steps of the project, which keeps the playlist up to date and fixed at 100 songs.

Before moving on to the Spotify API requests, I want to start with scraping the radio channel's website. For this, I need a bot that visits the Max FM player at regular intervals and reads the currently playing song. Puppeteer, which drives Chromium for QA testing and automation, is what I need. After loading the radio's web player with Puppeteer, I parse the page content, extract the song name and artist name, and return them as the output of a function.
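As a rough illustration of this scraping step, the sketch below loads a page with Puppeteer and splits the player text into artist and title. The URL, the `.now-playing` selector and the "Artist - Title" text format are assumptions made up for this example; they are not details of the real Max FM player.

```javascript
// Hypothetical values: replace with the real player URL and selector.
const PLAYER_URL = 'https://example.com/player';
const NOW_PLAYING_SELECTOR = '.now-playing';

// Split text such as "Artist - Title" into its two parts.
// Returns null when the text does not match the assumed format.
function parseNowPlaying(text) {
  const separatorIndex = text.indexOf(' - ');
  if (separatorIndex === -1) return null;
  const artist = text.slice(0, separatorIndex).trim();
  const title = text.slice(separatorIndex + 3).trim();
  if (!artist || !title) return null;
  return { artist, title };
}

async function scrape() {
  // Loaded lazily so parseNowPlaying can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so the player text has rendered.
    await page.goto(PLAYER_URL, { waitUntil: 'networkidle2' });
    const text = await page.$eval(NOW_PLAYING_SELECTOR, el => el.textContent);
    return parseNowPlaying(text);
  } finally {
    // Always close the browser, even when the selector is not found.
    await browser.close();
  }
}
```

Keeping the text parsing in its own function makes it possible to check the parsing rules without launching a browser.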

Max FM website

Next is the more complex part: the Spotify API! It lets us perform many operations from code. To use it, we first log in to the Spotify Dashboard with our existing account and add a new application, so that Spotify can communicate with us through the callback we register for this application. The Dashboard also holds the values that will help us throughout this process: CLIENT_ID, CLIENT_SECRET and REDIRECT_URI. CLIENT_ID and CLIENT_SECRET are given to us by Spotify, but we have to set REDIRECT_URI ourselves. Since the program I wrote only runs in my own test environment, I set the URI to http://localhost:8888/callback and save it on the Dashboard. I store these values in the .env file in my project directory. To read the .env file from my main file and keep the values as variables, I import the dotenv library, which is handy whenever you need to read environment variables from an external file. One final addition is the playlist I'll be working on: I also save the ID of my playlist in this .env file.

example .env file
User Spotify Dashboard Settings

First of all, we need authorization to make requests to Spotify, and we proceed by choosing one of the 4 different flows that Spotify offers. The flow I use for this project is Authorization Code Flow. It lets the client access user resources, adds an extra layer of security by keeping a secret key on the server side, and provides a refreshable access token.

Authorization Code Flow needs an application that can receive Spotify's redirect. Since mine only runs on localhost, I create a small server with ExpressJS and handle the traffic between Spotify and my program through its routes.

The functions I wrote with async/await wait for each other in turn when the main function runs.

We fill the .env file with the values we will need later and load them at the top of our JS file with dotenv, assigning them to top-level constants. Don't forget to replace the places marked “XXX” in the .env file with your own values!

require('dotenv').config();

const PLAYLIST_ID = process.env.PLAYLIST_ID;
const CLIENT_ID = process.env.CLIENT_ID;
const CLIENT_SECRET = process.env.CLIENT_SECRET;
const REDIRECT_URI = process.env.REDIRECT_URI;
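Because a forgotten “XXX” placeholder only shows up later as a confusing authorization error, a small startup check can help. The sketch below is one possible way to do it; the helper name `findMissingEnvVars` and the rule that “XXX” counts as missing are assumptions for this example.

```javascript
// Names taken from the constants read above.
const REQUIRED_ENV_VARS = ['PLAYLIST_ID', 'CLIENT_ID', 'CLIENT_SECRET', 'REDIRECT_URI'];

// Return the required names that are unset, empty, or still the "XXX" placeholder.
function findMissingEnvVars(env, requiredNames) {
  return requiredNames.filter(name => {
    const value = (env[name] || '').trim();
    return value === '' || value === 'XXX';
  });
}

// Possible usage right after require('dotenv').config():
// const missing = findMissingEnvVars(process.env, REQUIRED_ENV_VARS);
// if (missing.length > 0) throw new Error(`Missing .env values: ${missing.join(', ')}`);
```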

To test Express, I define a handler on the root path and check the result at http://localhost:8888/. (Tip: you can make the request from the command line with curl http://localhost:8888, or open the address in your favorite web browser.)

app.get('/', (req, res) => {
  const data = {
    name: 'michael',
    isActive: true,
  };
  res.json(data);
});

After seeing the response on my localhost, I start the Spotify authorization process by sending an HTTP GET request to the /authorize endpoint. What matters at this point is the set of mandatory and optional parameters we send when setting up the authorization code flow. We build our request with client_id, redirect_uri, response_type, scope and state, using querystring to keep them organized in one place.

app.get('/login', (req, res) => {
  // Random state value, also stored in a cookie
  const state = generateRandomString(16);
  res.cookie(stateKey, state);
  // The scope string must stay on one line: a quoted string cannot span lines
  const scope = 'user-read-private user-read-email playlist-modify-private playlist-modify-public';
  const queryParams = querystring.stringify({
    client_id: CLIENT_ID,
    response_type: 'code',
    redirect_uri: REDIRECT_URI,
    state: state,
    scope: scope,
  });
  res.redirect(`https://accounts.spotify.com/authorize?${queryParams}`);
});
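The /login handler relies on two names that are not shown in this post: `generateRandomString` and `stateKey`. The sketch below shows one way they might look. The cookie name and the `isStateValid` helper are assumptions for illustration, and the random string is only meant as a state value, not as a general-purpose secret.

```javascript
const { randomBytes } = require('crypto');

// Assumed cookie name; any stable name works.
const stateKey = 'spotify_auth_state';

// Build a random alphanumeric string of the requested length.
function generateRandomString(length) {
  const possible = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
  const bytes = randomBytes(length);
  let text = '';
  for (let i = 0; i < length; i++) {
    // Map each random byte onto one of the allowed characters.
    text += possible[bytes[i] % possible.length];
  }
  return text;
}

// Hypothetical helper for /callback: the state Spotify sends back should
// match the one stored in the cookie before the code is exchanged.
function isStateValid(returnedState, storedState) {
  return Boolean(returnedState) && returnedState === storedState;
}
```

Comparing the returned state with the stored one is what makes the state parameter useful: a mismatch suggests the redirect did not originate from our own /login request.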

After the request, we are redirected to the Spotify login screen. Once we log in and confirm, the browser shows an error for the /callback path. Since we haven't created the /callback route handler yet, this error is expected. The redirect still carries a code in the address bar, which we can exchange for the access_token we need. From here, we start to create the /callback route handler.

app.get('/callback', (req, res) => {
  // Authorization code that Spotify appended to the redirect URL
  const code = req.query.code || null;
  axios({
    method: 'post',
    url: 'https://accounts.spotify.com/api/token',
    data: querystring.stringify({
      grant_type: 'authorization_code',
      code: code,
      redirect_uri: REDIRECT_URI
    }),
    headers: {
      'content-type': 'application/x-www-form-urlencoded',
      // HTTP Basic auth: base64 of "client_id:client_secret" (Buffer.from needs no `new`)
      Authorization: `Basic ${Buffer.from(`${CLIENT_ID}:${CLIENT_SECRET}`).toString('base64')}`,
    },
  })
  .then(response => {...}
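To make the shape of the token request easier to inspect, the request configuration can be built by a separate function. This is a minimal sketch, not the code used in the project; `buildTokenRequest` is a hypothetical helper, and it uses the built-in URLSearchParams instead of querystring.

```javascript
// Build an Axios-style config for exchanging the authorization code.
function buildTokenRequest({ code, redirectUri, clientId, clientSecret }) {
  // HTTP Basic credentials: base64 of "client_id:client_secret".
  const credentials = Buffer.from(`${clientId}:${clientSecret}`).toString('base64');
  // Form-encoded body, as required by the content-type header below.
  const body = new URLSearchParams({
    grant_type: 'authorization_code',
    code,
    redirect_uri: redirectUri,
  });
  return {
    method: 'post',
    url: 'https://accounts.spotify.com/api/token',
    data: body.toString(),
    headers: {
      'content-type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${credentials}`,
    },
  };
}
```

The returned object can be passed straight to `axios(...)`, and logging it is a quick way to confirm that the redirect URI is encoded and matches the one saved on the Dashboard.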

We use the code variable to hold the code from the query parameter, which we will exchange for the access_token. We make the HTTP POST request with Axios. To use Axios, we need to install it as a dependency and require it in our JavaScript file. After the request, the HTTP response, which we can check via http://localhost:8888/callback, contains the access_token, refresh_token, token_type and expires_in values. Be careful to pull the access_token and refresh_token out of the HTTP response and save them in variables, because we will need them in every operation.

I chose to write the access_token and refresh_token from the response to .txt files using fs, the NodeJS module for working with the file system. Since the access_token lasts 1 hour, I renew it every hour with an HTTP request that uses my refresh_token, reading and writing my files along the way. The async function I created for this refresh is almost the same as the /callback route handler.

const refresh = async () => {
  // Read the stored refresh token from disk
  let buffer = fs.readFileSync("refresh.txt");
  let ref_file_token = buffer.toString();
  refresh_token = ref_file_token;
  // Return the promise so that `await refresh()` waits for the request
  return axios({
    method: 'post',
    url: 'https://accounts.spotify.com/api/token',
    // grant_type and refresh_token belong in the form body, not in the headers
    data: querystring.stringify({
      grant_type: 'refresh_token',
      refresh_token: refresh_token
    }),
    headers: {
      'content-type': 'application/x-www-form-urlencoded',
      Authorization: `Basic ${Buffer.from(`${CLIENT_ID}:${CLIENT_SECRET}`).toString('base64')}`,
    },
  }).then(response => {...}
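The file handling around the tokens can be sketched as two small helpers. The file names tokens.txt and refresh.txt come from this post; the function names, the directory argument and the decision to keep the old refresh token when a response does not include a new one are assumptions made for this example.

```javascript
const fs = require('fs');
const path = require('path');

// Write the tokens from a token response to disk.
function saveTokens(dir, tokenResponse) {
  fs.writeFileSync(path.join(dir, 'tokens.txt'), tokenResponse.access_token);
  // Assumption: a refresh response may omit refresh_token, so the stored
  // one is only overwritten when a new value is present.
  if (tokenResponse.refresh_token) {
    fs.writeFileSync(path.join(dir, 'refresh.txt'), tokenResponse.refresh_token);
  }
}

// Read both tokens back, trimming stray whitespace or newlines.
function loadTokens(dir) {
  return {
    access_token: fs.readFileSync(path.join(dir, 'tokens.txt'), 'utf8').trim(),
    refresh_token: fs.readFileSync(path.join(dir, 'refresh.txt'), 'utf8').trim(),
  };
}
```

Inside the `.then` of the /callback handler or of `refresh`, a call such as `saveTokens('.', response.data)` would store the values, since Axios exposes the parsed response body as `response.data`.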

It's time to shape our main function with the information we got from web scraping and our access_token, which we need to make requests to Spotify! Unfortunately, we cannot add a track to the playlist directly with the song name and artist name we scraped. To add a song to a playlist, Spotify needs the unique id of that song (its URI). At this point, our steps are as follows:

1- Log in with our Spotify account,

2- Use the access_token received after authorization to get the song id from Spotify with the HTTP GET method,

3- Make another request to Spotify with the HTTP POST method to add the song to the playlist by its unique id.

That is the main workflow. In addition, I placed a delay function inside my main function so that requests go out at a fixed interval (every 10 minutes, for example), and with each new song added I remove the song with the oldest date of addition from the playlist. For this, I wrote a fetch request with the HTTP DELETE method to keep the list up to date and fixed at 100 tracks.

The code block below belongs to the delay function which I created.

const delay = ms => new Promise(res => setTimeout(res, ms))
...
await delay(...)

With my delay function, I prevent possible conflicts between my asynchronous functions by adjusting the wait times inside the program.

I mentioned above that I keep the access_token and refresh_token returned after authorization in the tokens.txt and refresh.txt files. When I run my main function, before entering the while loop and making requests to the Spotify API, I need to read them from tokens.txt and refresh.txt with fs. Then I assign the access token to my auth_token variable.

const fs = require("fs"); 
...
const buffer = fs.readFileSync("tokens.txt");
const file_token = buffer.toString();
auth_token = file_token;

Another thing I want to do before entering the while loop is to get the unique ids of the songs already in my playlist. I store those ids in an array. Since that array is a local variable, I copy the ids into another, initially empty array defined outside the function so that I can reach them later. Now that the pre-loop steps are complete, we can start writing the code that runs inside the while loop.

The code block below waits for the interval I set at the top of the while loop (600000 ms = 10 min), then calls the scrape function and assigns its output to the outputs variable. (I prefer not to share the details of my scrape function, since I developed that part specifically for Max FM. You can write a scrape function for whatever source you want.)
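Collecting the ids of the existing playlist tracks might look like the sketch below. The function names are hypothetical, the global `fetch` assumes Node 18 or newer, and only the first page of results is read, so paging through longer playlists is left out.

```javascript
// Pull the track ids out of a playlist items response.
// Entries without a track object (for example unavailable tracks) are skipped.
function collectTrackIds(playlistItemsResponse) {
  return (playlistItemsResponse.items || [])
    .map(item => item.track && item.track.id)
    .filter(Boolean);
}

// Fetch the playlist items and return their track ids.
async function fetchPlaylistTrackIds(playlistId, authToken) {
  const url = `https://api.spotify.com/v1/playlists/${playlistId}/tracks`;
  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${authToken}` },
  });
  if (!response.ok) {
    throw new Error(`Spotify returned status ${response.status}`);
  }
  return collectTrackIds(await response.json());
}
```

Returning the array from the function, rather than copying it into an outer variable, is an alternative way to make the ids available to the rest of the program.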

while (true) {
await delay(600000);
outputs = await scrape()
...
}

After setting the 10 minutes in milliseconds and awaiting my delay function, I await my scrape function as well and assign its output to my variable named outputs, which I declared empty before the while loop. The important point here is that the main function must be an async function in order to use await. Right after assigning the response to the outputs variable, I add an if-else condition, and I also create an empty variable called latestSong before the while loop. If the answer returned from the scrape function is the same as in the previous iteration, the current iteration ends here. If the answer has changed (meaning the song has changed), we enter the else block and use the response from the scrape function to request the unique id from the Spotify API with fetch. At this point, our code block roughly looks like this:

// Encode the scraped text so spaces and symbols are safe in the query string
let spotiUrl = `https://api.spotify.com/v1/search?q=${encodeURIComponent(outputs)}&type=track&limit=1`;
let method = "GET";

try{
  fetch(spotiUrl, {
    method,
    headers: {
      "Authorization": `Bearer ${auth_token}`,
      Accept: 'application/json',
      'Content-Type': 'application/json'
    }
  }).then(response =>...
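The two data-handling parts of this request, building the search URL and reading the id out of the response, can be sketched as small functions. The helper names are hypothetical; the `tracks.items` path follows the shape of Spotify's search response, which is also where the 'items' in the error message further below comes from.

```javascript
// Build the search URL for a free-text query such as "Artist Title".
function buildSearchUrl(query) {
  return `https://api.spotify.com/v1/search?q=${encodeURIComponent(query)}&type=track&limit=1`;
}

// Return the id of the first matching track, or null when there is none
// (for example when the response is an error object without "tracks").
function extractTrackId(searchResponse) {
  const items = searchResponse && searchResponse.tracks && searchResponse.tracks.items;
  if (!items || items.length === 0) return null;
  return items[0].id;
}
```

Returning null instead of letting the property access throw makes it possible to tell "no match found" apart from an expired token.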

There is an important point I should mention here. The HTTP GET request above is also the first request I make in the while loop. This is where I test the validity of my access_token, which becomes unusable after 1 hour, and get a new one if it has expired! That's why I do error checking before the POST request: if my token has expired, I call the refresh() function that I created outside of the main function, and this is how I refresh my access_token.

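...
// The original comparison (err == {...} || "TypeError...") was always true.
// Assumes err is either Spotify's parsed error body or the thrown TypeError
// that appears when the response has no tracks.items.
const tokenExpired = err?.error?.status === 401;
const missingItems = err instanceof TypeError;
if (tokenExpired || missingItems) {
  await refresh();
  await delay(5000);
  ...
}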

I add the above code block inside the catch block of the try-catch that wraps my HTTP GET request for the unique id, and I keep using my delay function to preserve the order of operations.
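An alternative to inspecting the caught error is to look at the HTTP status directly, since fetch resolves normally on a 401 response instead of throwing. The sketch below retries a request once after refreshing; `fetchWithRefresh` and its parameters are hypothetical names, and `buildOptions` is a function so that the retry picks up the new token.

```javascript
// Perform a request and, if Spotify answers 401, refresh the token and retry once.
// fetchImpl defaults to the global fetch (Node 18+) and can be replaced in tests.
async function fetchWithRefresh(url, buildOptions, refresh, fetchImpl = fetch) {
  let response = await fetchImpl(url, buildOptions());
  if (response.status === 401) {
    await refresh();
    // buildOptions is called again so the Authorization header uses the new token.
    response = await fetchImpl(url, buildOptions());
  }
  return response;
}
```

In the main loop this could be called with something like `() => ({ method, headers: { Authorization: `Bearer ${auth_token}` } })` as `buildOptions`, provided `refresh` updates `auth_token` before it resolves.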

With the unique id of the song from Spotify, it's time to send an HTTP POST request to add the song to the playlist. At this point, we need the PLAYLIST_ID that I stored in my .env file. For the HTTP POST request, the fetch call also has a body in addition to the headers. The body contains the unique id returned by our HTTP GET request and the position value that puts the song at the top of the list.

let spotiUrl = `https://api.spotify.com/v1/playlists/${PLAYLIST_ID}/tracks`;
let method = "POST";

fetch(spotiUrl, {
  method,
  headers: {
    "Authorization": `Bearer ${auth_token}`,
    Accept: 'application/json',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"uris": [`spotify:track:${outputsSpotiId}`], "position": 0})
}).then(response => ...

The last request I send to Spotify deletes the oldest song in the playlist, using the HTTP DELETE method. Here I use the playlist track information that I stored in my array before entering the while loop. In each round, I take the unique id of the oldest song from that array, assign it to the variable named removeSongId, and make my HTTP DELETE request with it. Just as I add to the top of the list with the “position: 0” parameter in my HTTP POST request, I use the “position: 99” parameter in my HTTP DELETE request so that, together with the unique id, I target the last element of the playlist. (Spotify's API reference describes the body of this endpoint as a tracks array of { uri } objects, so compare the body below with the current reference if the request is rejected.)

let spotiUrl = `https://api.spotify.com/v1/playlists/${PLAYLIST_ID}/tracks`;
let method = "DELETE";

fetch(spotiUrl, {
  method,
  headers: {
    "Authorization": `Bearer ${auth_token}`,
    Accept: 'application/json',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({"uris": [`spotify:track:${removeSongId}`], "position": 99})
}).then(response => ...
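The bookkeeping behind the 100-track limit can be sketched independently of the HTTP requests. In this hedged example, the local array of ids is kept newest first (matching the position 0 insert), and `addAndTrim` is a hypothetical helper that reports which ids fall off the end and should be deleted from the playlist.

```javascript
const MAX_TRACKS = 100;

// Add newId to the front of the list and trim the list to maxTracks.
// Returns the updated list and the ids that were pushed out (oldest entries).
function addAndTrim(trackIds, newId, maxTracks = MAX_TRACKS) {
  const updated = [newId, ...trackIds];
  // splice removes everything from index maxTracks onward and returns it.
  const removed = updated.length > maxTracks ? updated.splice(maxTracks) : [];
  return { updated, removed };
}
```

After a successful POST, `removed[0]` (when present) would play the role of `removeSongId` in the DELETE request, and `updated` becomes the list for the next round.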

That’s all!

Playlist I created: |Live| Max Fm 95.8

The final version of my existing playlist is as follows:

I hope it helps you too. I would be very happy if you shared with me anything I missed, got wrong, or could do more efficiently. Don't forget to like and follow! Enjoyable listening, and enjoyable coding!
