Debugging Dan
My personal journey balancing life, a job and sideprojects.
 

015 - Podcast Automation

16-09-2024
podcast automation

- the podcast-tooling github repository

In this episode of Debugging Dan, I share my journey in automating my podcast process, which has saved me a significant amount of time. Additionally, I discuss my progress with SEO learning and improvements to my streaming setup. I delve into the tools and scripts I use, including Node.js, FFmpeg, Adobe Podcast, OpenAI Whisper, Notion, and DaVinci Resolve. By automating various steps like generating transcripts, enhancing audio, and creating blog pages, I've streamlined my workflow and increased efficiency. I hope my insights inspire other podcasters to optimize their processes.

Video

Transcript

Recap

Before we dive into the topic of podcast automation, I'd first like to do a little recap of the past week. In the accountability update of last week, I told myself I really need to focus on understanding search engine optimization (SEO). And I want to improve my streaming setup, which is also my recording setup for the podcast, but also create a live streaming setup so I can start doing live streams.

So the image you're seeing now is a result of that. I have a green screen behind me. I'm recording this using the camera of my phone, which should be higher quality. And I've looked into a podcast microphone; hopefully I can get into that soon to improve the audio more. The audio I'm currently using is just the microphone from this Sony headset, which is okay-ish, not great. So that was, I guess, the easy part.

I created the backdrop, and I can show you the backdrop. If you're watching this via YouTube, you can see it; if you're listening to the podcast, visit YouTube and you can see the backdrop. I went for some kind of steampunk office vibe. I like it, and it also allows me to vary the backdrop. And I think the green screen works pretty well. So that's good.

SEO

On the topic of SEO, I watched and listened to the beginner's course recording, a YouTube video from Ahrefs. And I got a basic but full understanding. I already had some understanding of SEO, but that was mostly the technical basics: how to create meta tags, where to put the content on your site, header tags, stuff like that. But in the past, when people started talking about SEO, I noticed there were a lot of words I didn't understand.

So if you're watching this, I've grouped them on a slide. There are abbreviations like SERP, DR, DA, search intent, CTR, a content gap. I didn't understand what those were. Now I know. SERP is the search engine result page, so that's basically the result page that you're looking at. And SEO tools like Ahrefs have that inline in their tool, and they enhance it so they can show you the DR of a site.

DR, for example, is the domain rating: how trustworthy or how good a site is. You try to get that up. There's DA, domain authority. You can have a content gap, which means that you are aiming for a certain keyword, but your content isn't really aligned; you're still missing some content for that specific keyword.

There's the search intent, the reason why the user is searching. You're trying to capture that. If the user's search intent is buying something, then it doesn't make sense to rank for that search intent if you're writing a blog page to inform people about a specific product and not to get them to buy it.

CTR is the click-through rate. On a search result page, search results get shown, and your site might get shown. The click-through rate is the number of people that actually click on the link to your site versus the total number of people that have seen the link in the search results. You try to get that as high as possible, rather than just aiming to rank number one.

I'm starting to realize that with SEO, I'm kind of falling into the same trap that I fall into when developing a product. I develop a product because I have an idea, I have a problem, and I want to build something that fixes the problem, but I don't really think about my audience: who to sell it to, how to sell it, how to reach those people, where to find those people. And the same thing kind of applies with SEO.

With SEO, you need to think about what keywords you want to rank for. And for that, you need to know what people you are trying to reach and get to your page. So I think that's a gap I have again found in myself that I really need to address, or at least start to address, in the coming period.

I guess I'll also incorporate that in my next accountability update, to set out some tasks to really work on it instead of just going into it blindly and hoping that at the end I somehow find the audience or the people that I'm looking for. In hindsight, I could also start doing those things for Observalyze, for example, or Datasthor, or other products that I have: think about the audience and maybe also update the landing page.

For example, say I decide that I want to focus Datasthor on app developers that want to store some external data, like JSON. The landing page currently doesn't really mention or accommodate that; there's only a web example. So that, for example, could be a result of that exercise.

But before I get to that, I first need to think about it and prepare. And, well, boring is not the right word, but it's not something I really enjoy. I need to find a way to make that more enjoyable for myself, I guess.

But I have learned more, and I'm now in the first week of the two-week accountability period. Now I also need to start applying some of the SEO stuff that I learned. And I'm still listening to the videos from Roberto, an indie hacker I already knew, robertodigital_ on X. He also has some interesting information, and it's less corporate than the video from Ahrefs.

He is also an SEO expert, so I'm also listening to his videos. For the sites that I'm running, for example, I need to add a canonical URL, because I also noticed in Google Search Console that it's missing. Google sometimes accesses those pages with query parameters and then says: hey, I've already found this content, you need to look at that. So that's something that I'm going to look at. That was the past week.
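
As a small sketch of that fix: a canonical tag can be generated per page so that URLs with query parameters all point back at one clean URL. The helper name and example URLs here are mine, purely illustrative:

```javascript
// Hypothetical helper: emit a canonical <link> tag so crawlers treat
// /episode?utm_source=x and /episode as the same page.
function canonicalTag(baseUrl, path) {
  const url = new URL(path, baseUrl);
  url.search = ''; // drop query parameters from the canonical URL
  return `<link rel="canonical" href="${url.href}" />`;
}

// Example:
// canonicalTag('https://example.com', '/015?ref=rss')
// → '<link rel="canonical" href="https://example.com/015" />'
```

Rendering that tag into the page head tells Google which URL is the one to index, regardless of how it reached the page.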

Today's topic, Podcast Automation

The topic I will now be diving into is how I automated my podcast. This is the first podcast that I've created, and I'm doing it all solo. I just dove into it and started doing audio recordings. What I did is I meticulously edited the audio and removed the ums and the ahs and stuff like that.

And later on, when I started adding video, I kind of dropped the detailed audio editing, figuring I'll just do it in one take and hope it's okay. I don't have that many listeners, so I'm not really annoying a lot of people if the audio isn't that great. I do want to have good audio, but I also need to find the right balance between recording this and doing other stuff.

So what I figured is, when I moved to video, I just cut off the end and the beginning to properly align it, and I let Adobe Podcast Enhance take care of the rest. And if there are some ums and ahs, I guess that makes it more natural, so I embraced it. But it was still taking me a lot of time to do an episode of the podcast: the video, the rendering, the transcoding and stuff like that.

So last week, Saturday, I sat down and figured: I have a checklist with all these steps, let's see what I can automate. And no, it was already two weeks ago Saturday, so I've used the script twice now, and I'm really happy about it, because it automates a lot of parts and can do stuff in parallel. I'm also sharing the script; I'll link that later on, and the link is also in the description.

So I have a list of tooling that I'm using in this script. I use Node.js to build the script and to execute the shell commands. I use FFmpeg to manipulate the video and the audio. I use Adobe Podcast to enhance the audio. I use OBS to record it. I use OpenAI Whisper, the Python package, which I run locally to generate the transcript from the audio.

I have a locally, or self-hosted, installation of Castopod from which I serve the podcast. The blog is using Notion Blog, my internal tool, but the pages themselves are stored in Notion. I use ChatGPT from OpenAI to generate a summary and to fix the transcripts. And I use DaVinci Resolve to do the video editing, which is very basic.

DaVinci Resolve is a very powerful package, and I just use it to trim off the beginning and the end. That's basically it. I figured: it's free, it's a powerful package, and I can grow into it and learn more of it over time. So that's what I'm going for.

Those are the tools that I use and combine to record the podcast. You can find the tooling on GitHub; I abbreviate it to gh.com, I don't even know if that works, but it's github.com/danships/podcast-tooling. At least, that's what I'm using, and feel free to fork it.

And if you're recording something, apply it to your own process or get inspired by it; you can just have a look at it. For me, it really saves a lot of time, two or even three hours. Setting up and doing the recording takes me about 30 to 45 minutes for a 15-minute episode.

And after that, it used to take more time, like two, three hours. Now it's more like an hour, an hour and a half. So that's really beneficial for me. So that's the script; next is the flow that it processes.

Start

So when I start, I record, I stop OBS, and then I have an MKV file. It says "recording" in the flow here, if you're watching via YouTube. The MKV file is the output of OBS. I chose MKV because, as it says in the OBS description, when OBS crashes you still have the recording up until that point.

With a format like MP4, you don't. What I then do, and often I've already created it before recording, is create the visuals, so that's the thumbnail. I store that as the cover video for the script, which I've already started.

I've pointed the script to the recording, and then it waits for the cover video.png file. Once it's there, it starts generating the introduction and the ending of the podcast using the cover video. So I have a generic audio recording for the introduction and for the ending; the script stitches the video on top of those and creates an MP4.

Then in DaVinci Resolve, I merge the beginning, the middle part, and the ending. I edit the middle part if needed; for example, if I paused the video in between, which I sometimes do, I also need to cut it and align it. Then I add the ending and save it. The script then takes over and extracts the audio from the video, because in the next step I upload that audio manually to Adobe Podcast Enhance.
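
The audio-extraction step can be as small as one FFmpeg call. A sketch, with assumed file names:

```javascript
// FFmpeg arguments: pull the audio track out of the edited video as
// an uncompressed WAV, ready for upload to Adobe Podcast Enhance.
function extractAudioArgs(videoIn, wavOut) {
  return [
    '-i', videoIn,
    '-vn',                  // drop the video stream
    '-acodec', 'pcm_s16le', // 16-bit PCM, a safe lossless input format
    wavOut,
  ];
}
```

WAV keeps the audio lossless going into the enhancement step, at the cost of a bigger intermediate file.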

Enhanced audio

And once that is done processing, I save it, and the script waits for the enhanced file, a WAV file that is the output from Adobe Podcast. Then the flow splits into two parts. It converts the WAV to MP3, and that MP3 file I can then upload to Castopod, because then it is done.

And it also replaces the audio in the video file. So the original audio from the video file has been enhanced by Adobe Podcast Enhance; I replace it, and then the video is also done and I can start uploading it to YouTube.
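
Both branches of that split are again single FFmpeg invocations. A sketch under the same assumptions about file names:

```javascript
// Branch 1: encode the enhanced WAV as an MP3 for Castopod.
function toMp3Args(wavIn, mp3Out) {
  return ['-i', wavIn, '-b:a', '192k', mp3Out];
}

// Branch 2: remux the edited video with the enhanced audio track.
function replaceAudioArgs(videoIn, wavIn, videoOut) {
  return [
    '-i', videoIn, '-i', wavIn,
    '-map', '0:v', '-map', '1:a', // video from input 0, audio from input 1
    '-c:v', 'copy',               // keep the video stream untouched (fast)
    '-c:a', 'aac',
    videoOut,
  ];
}
```

Copying the video stream instead of re-encoding it is what makes the audio swap take seconds rather than a full transcode.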

But in a parallel track, now that I have the enhanced WAV file, the script creates a Notion blog page with some of the metadata that I am already able to generate, like time and the date and stuff like that. And then the script starts to generate the transcripts.

So the transcripts are created using OpenAI Whisper, the local Python package. It creates a TXT file with the transcript, but also an SRT and a VTT file, which are subtitle files that I then also use in Castopod and on YouTube to show subtitles, because I want my podcast to be accessible.
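
Invoking the local Whisper package from the script is again a shell call. The flags below follow the openai-whisper command-line interface, but treat the model choice and paths as assumptions:

```javascript
// Arguments for the openai-whisper CLI: transcribe the enhanced WAV
// and emit txt, srt, vtt (and more) into one output directory.
function whisperArgs(wavFile, outDir) {
  return [
    wavFile,
    '--model', 'base',        // larger models are slower but more accurate
    '--output_format', 'all', // writes txt, srt, and vtt in one run
    '--output_dir', outDir,
  ];
}

// Example, spawned via child_process:
// execFileSync('whisper', whisperArgs('enhanced.wav', 'transcripts'));
```

Running it locally means no upload and no per-minute cost, at the price of CPU (or GPU) time on your own machine.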

People watching it without audio can follow along, people with a hearing impairment can still read what I say, and people that prefer reading over listening can also still see what I create. So after generating the transcripts, I then let AI do two things.

I give ChatGPT the transcript and ask it to generate a title, a summary and some keywords. And I also ask it to fix the transcript, because the output from Whisper is just a big wall of text and I can't use that on the webpage. So I ask it to keep the text as is, but group it into different paragraphs.
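
Those two requests can be sketched as message arrays for the OpenAI chat API. The prompt wording here is illustrative, not the exact prompt from the script:

```javascript
// Request 1: metadata (title, summary, keywords) from the transcript.
function metadataMessages(transcript) {
  return [
    { role: 'system', content: 'You summarize podcast transcripts.' },
    {
      role: 'user',
      content: `Generate a title, a short summary and a few keywords for this transcript:\n\n${transcript}`,
    },
  ];
}

// Request 2: reflow the wall of text into paragraphs without
// changing any of the words.
function reflowMessages(transcript) {
  return [
    {
      role: 'user',
      content: `Keep the following text exactly as it is, but group it into paragraphs:\n\n${transcript}`,
    },
  ];
}
```

Keeping the two asks as separate requests makes each prompt simpler and the outputs easier to validate independently.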

And what I've learned from the SEO course now is that I might also have to ask it to create headings for each paragraph, or for some paragraphs, because that will improve the SEO for my page if somebody happens to be searching for something that I mentioned in my podcast. For the podcast itself, I don't do any SEO optimization, because it's a capture of a moment in time.

When the content is generated by AI, I use GPT-4o mini, I also use it to set the blog page contents using the fixed transcript, and then I go into the final manual process. So I've uploaded the YouTube video, but the summary also needs to be set on the YouTube video.

I add it to Castopod, and there are some manual steps in the blog page: I need to set the slug correctly for the embedding, I often add the link to the YouTube video, stuff like that. Those are still manual steps that I do. After that, the podcast is done, and I normally publish on Monday morning at 7, Central European Time.

And that's just a random time that I picked and figured is okay. I could also decide to just publish immediately, but I like to have a fixed publishing schedule so people know when to expect it. That's also something I might revisit in the future, changing the schedule from a week to something else or publishing at a different time; it all really depends on what works and what people want, I guess.

Because I do this as a hobby. I like recording it and sharing it, but so far there are not really people giving feedback or letting me know that they've listened. So I'm kind of broadcasting into the ether. I know that the joy that I'm getting out of this is mostly from recording it, putting it out there, and knowing that some people listen to it and probably benefit from it.

Otherwise people wouldn't be listening, I guess. And I hope that at some point my audience grows big enough that I can have interactions with people, or at least get some feedback or comments. So if you're watching this and like it, leave a comment on the podcast platform that you're listening on, or like and subscribe on YouTube; let me know that you're listening. And if you have any feedback, also just let me know.

But in terms of flow, that's it. I do have some future enhancements that I'd like to make. And "enhancements" is a really difficult word with a lot of S's, I guess. So, future enhancements: I'd like to upload to YouTube automatically, because it's a pretty big file, it often takes some time, and it's already done halfway through the process.

And then I could later on update the description automatically. So that's something that I want to look into. But the Google API is not that easily accessible; you can't just generate an API key and be done. You need to do something on Google Cloud Platform, generate an API token, authenticate with your user, stuff like that.

So I figured, let's keep this off for now; I'll just do it manually, that's okay. And for Castopod, the podcasting platform that I use, I don't believe they have an API available. So I still need to do that manually too, but it's not that much work; it's mostly copying and pasting, and that's also okay.

And for the more distant future, what I'd also like to do is use AI: for example, feed it the transcript or the VTT file and tell it, hey, based on this viewer profile, what would be interesting parts of this podcast to clip? And perhaps have it automatically select the timestamps and create a video from them, with a bumper at the front or at the end.

So that I can then easily upload those to YouTube Shorts or Instagram. I don't currently have an Instagram for Debugging Dan, because somebody has the username but the profile is not available; it just says "this page is not available". So I cannot register it. I need to figure out something for that.

But regardless of that, I could upload to TikTok, and per platform there could be a different audience. On TikTok, people might be interested in different things than on Instagram or YouTube. So I'd like to start doing more of that with AI, but that's for the more distant future.

It also depends on how complicated it is. I already know how to cut stuff from the video. But like I mentioned in the beginning, I don't really know what kind of content I want to clip, what's interesting. So that's something that I may be looking into, and after that, the technical stuff will probably follow by itself.

I even figured I might just create an audio-only version of my podcast and add some Minecraft video under it. That's what some automated YouTube channels are doing: they use text-to-speech on Reddit posts, the "Am I the A-hole" threads, and then they put some kind of Minecraft video below it, and people watch that.

But that's not something that fits me, because I like to create content that adds value, not do it for the views or for gaming the algorithm, just growing a channel and then selling it, for example. I'm more about the content than about the number of subscribers.

But yeah, it would be nice to automate some stuff that adds value and makes it more time-efficient for me. That's it for today. I hope you enjoyed this outing; it's not really about the core focus of the podcast, but it saves me a lot of time, I'm proud of it, and I figured I'd share it with you. You'll hear from me in the next one. Let me know what you think, bye!

Thanks for tuning in to Debugging Dan. If you enjoyed this episode, please subscribe and leave a review. Stay curious and see you next week.