Debugging Dan
My personal journey balancing life, a job and sideprojects.
 

019 - Socials Like a Pro

13-10-2024
podcast socialautomation


In this episode of Debugging Dan, I discuss the challenges I faced last week, including technical difficulties with my camera, which prevented me from recording visuals for this podcast. I share my progress on automating my social media outreach to make it easier to post content across various platforms. I've been working on a Next.js app that integrates with Telegram for storing raw video files and utilizes Python with MoviePy for video manipulation and captioning. My goal is to streamline the process, allowing me to easily share video snippets by automating tasks like trimming, adding subtitles, and generating different output formats for social media. I emphasize the importance of creating an efficient workflow to maintain engagement with my audience while juggling my full-time job and side projects.

Video

Transcript

In this week's episode, the camera isn't working, unfortunately, but I dive into what I've been doing last week and how I am automating my socials.

And for people that are watching via YouTube or other video means, this week my camera failed so I'm not able to record my face cam. So you'll only hear my voice and see what I display on the screen. For people listening to podcasts, you're not missing anything. Audio is good enough.

So in last week's episode, where I coined the term side-life balance, I mentioned that I was a little bit stuck, not making progress and not feeling good about it. After the podcast, as I also mentioned during the episode, I decided to let go for a while. I didn't do the Get It Done episode, and I figured, as I mentioned, I'm going to focus on socials and getting that up to speed, getting that quality good, similar to what I've been wanting to do with the podcast.

So I'm recording with an improved microphone. This week I bought a light source so I can light myself better while I'm sitting here. Unfortunately, this week there is no camera so I can't really see the results and that's what I've been doing the past week.

I've even been building in Python, because a dependency that I needed was only available in Python, and I've started doing some work in Next.js, which is really taking some time to get used to. It works differently than I expect, which forces me to read a lot of documentation. That's not necessarily bad, but it's less intuitive than I would have hoped. So that's what I've been doing the past week. I've been focusing on that, and in this episode I'd like to take you with me through the grand scheme of things.

So, how I want to do socials. The thing is, for me it should be as low effort as possible, because I don't have a lot of time, and if something takes an hour or two, I really need to plan it, and for me that would mean I will probably not do it because it takes too much time. What I have in mind is that it should be easy to just pick up my phone, record a video or an audio snippet, and then get that ready to be posted on different social media.

Even posting this stuff is already going to take a lot of time with the different social media apps that are around, like YouTube Shorts, TikTok, Instagram, Threads, Bluesky. So that takes time. That's why I want to keep it as low effort as possible, and that's why I've been working on automation.

So on the screen, if you're listening to the podcast, I'm showing a flow diagram. What I want to do is make it as easy as possible: when I record a video or an audio snippet, I want to be able to upload it to a Telegram bot. I purposely chose that because the Telegram bot will serve as storage for the raw video. After that, I can automate taking that file from Telegram and forwarding it to all kinds of other services, and I can easily share it from my phone. I don't need to go to a webpage, click upload file, and have issues with large video file uploads and things like that.
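For anyone curious what that Telegram step looks like in code: the Bot API's getFile method resolves a file_id from an incoming message to a file_path, which can then be downloaded from the separate file endpoint. A minimal sketch, assuming a bot token is at hand; the function names here are mine, not the app's actual code:

```python
import json
import urllib.request

API_BASE = "https://api.telegram.org"

def get_file_url(token: str, file_id: str) -> str:
    # Bot API getFile resolves a file_id from a message into a file_path
    return f"{API_BASE}/bot{token}/getFile?file_id={file_id}"

def download_url(token: str, file_path: str) -> str:
    # the raw file bytes live on the separate file endpoint
    return f"{API_BASE}/file/bot{token}/{file_path}"

def fetch_raw_video(token: str, file_id: str) -> bytes:
    # network sketch: resolve the file_id, then download the raw video
    with urllib.request.urlopen(get_file_url(token, file_id)) as resp:
        file_path = json.load(resp)["result"]["file_path"]
    with urllib.request.urlopen(download_url(token, file_path)) as resp:
        return resp.read()
```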

So I just upload it in Telegram. I'm building a Next.js app currently that connects to Telegram, takes the input and will be able to process it further. And in the Next.js app, I'm going to integrate FFmpeg to be able to manipulate the video. So trim the video, extract the audio, replace the audio, add subtitles, things like that.
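Those FFmpeg operations mostly come down to building the right command line. A rough sketch of what such commands could look like, assuming FFmpeg is installed; the helper and file names are illustrative, not the app's actual code:

```python
import subprocess

def trim_cmd(src: str, dst: str, start: float, end: float) -> list[str]:
    # keep only the window between start and end (seconds), copying the streams
    return ["ffmpeg", "-i", src, "-ss", str(start), "-to", str(end), "-c", "copy", dst]

def extract_audio_cmd(src: str, dst: str) -> list[str]:
    # -vn drops the video stream; the audio codec follows from dst's extension
    return ["ffmpeg", "-i", src, "-vn", dst]

def burn_subtitles_cmd(src: str, subs: str, dst: str) -> list[str]:
    # the subtitles filter burns an .srt file into the video frames
    return ["ffmpeg", "-i", src, "-vf", f"subtitles={subs}", dst]

def run(cmd: list[str]) -> None:
    # execute one of the commands above, raising on failure
    subprocess.run(cmd, check=True)
```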

And that Next.js app will also connect to the Python service that I've been building, which uses MoviePy to create captions and text overlays, and Faster Whisper to do speech to text. Speech to text with Whisper is pretty good, but it's not perfect. So in the Next.js app, after the speech-to-text output has been generated, I'm also building an interface that allows you to change the generated text.

So for example, the way that I pronounce Dan, Whisper often interprets as D-E-N instead of D-A-N, and the same goes for some names of my services and things like that. It doesn't always get those right, so you can correct that and then continue with the processing pipeline.
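One way to wire up that correction step, assuming the speech to text comes back as word-level (start, end, text) timestamps, which Faster Whisper can produce: apply a small fix-up map before formatting the words into an SRT file. A sketch, with names of my own choosing:

```python
def apply_corrections(words, fixes):
    # words: list of (start, end, text); fixes e.g. {"Den": "Dan"}
    return [(start, end, fixes.get(text, text)) for start, end, text in words]

def srt_timestamp(seconds: float) -> str:
    # SRT timestamps use the HH:MM:SS,mmm format
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    millis = round((secs - int(secs)) * 1000)
    return f"{int(hours):02d}:{int(minutes):02d}:{int(secs):02d},{millis:03d}"

def words_to_srt(words) -> str:
    # one numbered block per word, so each word pops on screen on its own
    blocks = []
    for i, (start, end, text) in enumerate(words, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```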

So the most work in the past week went into creating the Python service. I'm not all that experienced with Python. I've built a service before at my place of work, and I've used that experience, and also a lot of ChatGPT, to get a service working pretty stably that uses MoviePy to create the captioning and Whisper for the speech to text. And now I'm starting to work on the Next.js app to really get it flowing.

For the Python part, I found a sample Python notebook that somebody created where he used Faster Whisper and MoviePy to generate the captions, and I'm building on top of that to be able to create clips. I'm showing a sample here; I'll upload that to YouTube too, so you can see it there. What I did is I took the introduction that I recorded for the previous podcast, and what you see is that the words I'm speaking are overlaid on the video, replaced one by one.

Every word is shown while I'm speaking it, as you often see in TikTok videos and other content. It doesn't look that great yet, just white text over the video itself, but the potential is there. And I often implement things like this: I create the technical capability, but since I'm not really a graphical or UI person, I leave it there. Picking the right font, the right size, the right colors, the right position, I figured I've built the foundation.

When I'm doing the actual implementation of the pipeline for my socials, I'll just figure that out then. So I'm kind of pushing that challenge into the future, and for now that's okay. I love the feeling of it coming together and being able to build something in Python that works again. That was fun to build, and the end result is there.

The example that I found did subtitles that highlight a specific word within the full line, and that's available as a preset, alongside the one I've just shown where only the spoken word is displayed. That's cool. So now, on to what I'm going to build in the Next.js app.

On the next slide, which you can see on YouTube, but I'll also explain it here, you see the presets that you can build. The source is a video or an audio file, and an audio file then gets transformed into a video, for example by adding an image background and having the video loop that specific image, or it could even be an existing video where the audio is replaced.
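That audio-to-video preset is a classic FFmpeg pattern: loop a still image and stop when the audio ends. A sketch of what such a command could look like; the file names are placeholders, not anything from the actual app:

```python
def audio_to_video_cmd(image: str, audio: str, dst: str) -> list[str]:
    # -loop 1 repeats the still image; -shortest stops when the audio ends
    return [
        "ffmpeg", "-loop", "1", "-i", image, "-i", audio,
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac", "-pix_fmt", "yuv420p", "-shortest", dst,
    ]
```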

And then, the way I see it, after I've uploaded a file, the application will ask me to pick some presets or flows that I've predefined, which can be several steps that I've chained together. So for example, it could remove the audio, enhance it with some audio service, and then stitch the audio back into the video.

It could overlay the speech-to-text part and add an intro and outro clip, for example, and make it ready for Instagram by cropping it square, while for something that I upload to TikTok, I keep it portrait, or transform it from landscape to portrait. The way I see it, these are all different nodes that are each able to do something.

So crop, add clip, trim: nodes that I can all chain together in a flow, and I just upload the video. I can pick the flows, one or more, that I want to apply to that specific video, and it gets to work. At some point I get a message back saying, hey, we've got this video ready for you. You can download it and upload it to your socials.
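Those flows of nodes can be modeled as a small registry of named steps applied in order. A toy sketch of the idea, operating on a plain dict instead of real video; all the names here are mine, not the app's actual code:

```python
NODES = {}

def node(name):
    # register a processing step under a name that flows can refer to
    def register(fn):
        NODES[name] = fn
        return fn
    return register

@node("trim")
def trim(clip, start, end):
    # pretend-trim: only the bookkeeping, no real video work
    return {**clip, "duration": end - start}

@node("crop_square")
def crop_square(clip):
    # crop to the largest centered square, e.g. for Instagram
    side = min(clip["width"], clip["height"])
    return {**clip, "width": side, "height": side}

def run_flow(clip, flow):
    # a flow is an ordered list of (node name, keyword arguments)
    for name, kwargs in flow:
        clip = NODES[name](clip, **kwargs)
    return clip
```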

So part of the work will be sent to the Python service, and part will be done with FFmpeg in the Next.js app. I could even decide to add some AI with an LLM. That could even be a third input type, where I only describe the video and use some kind of external API to generate it.

I'm probably not going to do that because I want to have just me on the socials, not some AI generated thing. But it could be, for example, that I have a podcast, I have AI summarize it from the transcripts, and I then generate the script for a summary video, and that could be generated and then uploaded somewhere.

Hopefully at some point I might even upload it automatically to the socials that have an API. Twitter, that's pretty difficult, and Instagram is also not that easy, but other services may have a better API, and for those I could upload automatically already.

And within those flows you can have manual steps, as I just explained for the captioning, where you can fix the captions manually. And when you press the button, the flow continues.
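A manual step like that caption fix can be modeled by letting the flow runner stop at the first manual step that hasn't received input yet, and resume from there once the user presses the button. A sketch of that idea, again with made-up names:

```python
def advance(clip, steps, start=0):
    # run steps in order; stop at the first manual step that has no input yet
    for i in range(start, len(steps)):
        step = steps[i]
        if step.get("manual"):
            if "result" not in step:
                return ("paused", i, clip)
            clip = step["fn"](clip, step["result"])
        else:
            clip = step["fn"](clip)
    return ("done", len(steps), clip)

# Example: automatic speech to text, a manual caption fix, then rendering.
steps = [
    {"fn": lambda clip: {**clip, "captions": ["Debugging Den"]}},
    {"manual": True, "fn": lambda clip, fixed: {**clip, "captions": fixed}},
    {"fn": lambda clip: {**clip, "rendered": True}},
]
status, index, clip = advance({}, steps)     # pauses at the manual step
steps[index]["result"] = ["Debugging Dan"]   # the user corrects "Den" to "Dan"
status, index, clip = advance(clip, steps, start=index)
```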

Yeah. So that's what I envision for that part right now, and I'm kind of focusing on that. I'm not focusing on the other products at the moment. I just want to get a good quality Debugging Dan podcast channel out, and I want to be able to promote stuff on socials.

For example, right now I'm working with Next.js and I'm having a hard time understanding the difference between React server components and React client components, and when to use which. When I started, I thought you'd always want server components, because those are rendered on the server and that's faster. But as it turns out, I'm now building the login form, for example.

But if you do that with a server action, then showing the progress of the login, you're only able to do that using a hook from a client component. So I already need a client component, while I figured it was best to have only server components. So I'm reading a lot of documentation there, and the things I encounter, I want to share.

I've already recorded something about moving from Preact to Next.js and React, and about the Mantine UI components, where I found out that you can't use the entire Mantine UI layer in server components with Next.js. So that was a bummer.

But those are things that I'm learning, and I want to be able to share them more easily by just creating short clips. I could even just record audio somewhere. Another thing I'm thinking about is something that I've dubbed shortcasting. You have podcasting, but you replace the word pod with short: short clips, maybe two minutes long, where I vent or rant about something, or share something I've learned. I'd be able to easily record an audio snippet, have this platform generate a video or improve the audio for me, and just upload that as a podcast, for example.

So besides Debugging Dan, you'd have a collection of short snippets with me venting about stuff, which doesn't really fit the standard Debugging Dan channel, because those episodes are a bit longer, with a beginning, a middle and an end. But it's also something that I've been thinking about, and it would really fit with this.

And I'm currently building it. The internal name I'm using is Dan Cutts, like ProCuts and all those video editing programs that have cuts in the name. I'm not planning to release it as a product somewhere, because it's really specific to my needs.

But yeah, that's what I've been working on the past week. I'm going to be working on it next week also. And yeah, it was fun for me to explain what I'm doing. If you have any questions, want to know more, just send me a message, comment on the video, the podcast, just let me know. And I'll speak to you again next week.

Bye.