BetterWhisper · 2026-03-19T00:09:56

What scraping challenges? From your pricing I can tell you are just using other APIs and are a layer on top of them. Also, your docs mix Instagram and TikTok results.

FdezRomero · 2026-03-20T20:57:39

Thanks a lot for your feedback!

On the docs: you caught a real bug. There were some paginated Instagram endpoints that were inheriting TikTok data in the response examples because of a shared base schema, and the override was not being applied correctly. Already fixed and deployed now.

On the architecture: I'm not wrapping any third-party API providers. I scrape directly from TikTok and Instagram with Puppeteer + Chromium running on Cloud Run, built from scratch. If I were buying from a third-party scraping API, the economics wouldn't work: those run for $1–3/1k, and I'm selling at $1.11–1.96/1k. The margins only make sense if you own the infra.
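To make the pricing argument concrete, here is a rough sketch of the arithmetic using only the endpoints of the two ranges quoted above. It works in cents per 1,000 results and ignores every other cost (compute, proxies, support), so treat it as an illustration of the reasoning, not as real cost accounting.

```python
from itertools import product

# Figures quoted in the comment, in cents per 1,000 results.
THIRD_PARTY_COST_CENTS = (100, 300)  # $1-3 per 1k from a third-party scraping API
SELL_PRICE_CENTS = (111, 196)        # $1.11-1.96 per 1k charged to customers


def gross_margin_cents(sell_cents: int, cost_cents: int) -> int:
    """Gross margin per 1k results, ignoring every cost except the upstream API."""
    return sell_cents - cost_cents


# Every combination of lowest/highest sell price with lowest/highest upstream cost.
margins = {
    (sell, cost): gross_margin_cents(sell, cost)
    for sell, cost in product(SELL_PRICE_CENTS, THIRD_PARTY_COST_CENTS)
}

for (sell, cost), margin in margins.items():
    print(
        f"sell ${sell / 100:.2f}, buy ${cost / 100:.2f} "
        f"-> margin ${margin / 100:+.2f} per 1k"
    )
```

Under these assumptions the margin is negative whenever the upstream price is at the top of its range, and at most $0.96 per 1k in the most favorable pairing, before any other expense is counted.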

The scraping challenges are very real: TikTok fingerprints headless browsers aggressively, signs all requests, rotates session requirements, and rate-limits by behavioral patterns rather than just request volume. This infrastructure is what makes it possible to offer no rate limits with great reliability.

The added value is the normalization layer: every response follows ActivityStreams 2.0, so a TikTok user and an Instagram user come back with the same fields. You write your code once and it works across both platforms.
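As a minimal sketch of what such a normalization layer could look like: the raw field names below (`uniqueId`, `nickname`, `full_name`, and so on) and the function names are assumptions made up for illustration, not the service's actual payloads or code. Only the output shape follows the ActivityStreams 2.0 vocabulary (`@context`, `type`, `id`, `url`, `name`, `summary`, `icon`).

```python
from typing import Any

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


def _person(profile_url: str, name: str, summary: str, avatar_url: str) -> dict[str, Any]:
    """Build an ActivityStreams 2.0 Person object from already-extracted values."""
    return {
        "@context": AS2_CONTEXT,
        "type": "Person",
        "id": profile_url,
        "url": profile_url,
        "name": name,
        "summary": summary,
        "icon": {"type": "Image", "url": avatar_url},
    }


def normalize_tiktok_user(raw: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical raw TikTok field names, assumed for this example.
    return _person(
        f"https://www.tiktok.com/@{raw['uniqueId']}",
        raw["nickname"],
        raw["signature"],
        raw["avatarLarger"],
    )


def normalize_instagram_user(raw: dict[str, Any]) -> dict[str, Any]:
    # Hypothetical raw Instagram field names, assumed for this example.
    return _person(
        f"https://www.instagram.com/{raw['username']}/",
        raw["full_name"],
        raw["biography"],
        raw["profile_pic_url"],
    )


tiktok_user = normalize_tiktok_user({
    "uniqueId": "example",
    "nickname": "Example",
    "signature": "bio text",
    "avatarLarger": "https://example.com/t.jpg",
})
instagram_user = normalize_instagram_user({
    "username": "example",
    "full_name": "Example",
    "biography": "bio text",
    "profile_pic_url": "https://example.com/i.jpg",
})

# Both platforms come back with the same set of fields.
print(sorted(tiktok_user) == sorted(instagram_user))
```

Because both adapters funnel into the same builder, downstream code can read `name` or `summary` without knowing which platform the record came from.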

Appreciate you taking the time to look closely enough to catch the response issue.

BetterWhisper · 2025-08-13T16:19:11

Hey, Whisper can indeed transcribe Japanese and even translate it (but only into English). For the best results you need to use the largest model, which may be slow depending on your hardware.

Another option is to use something like VideoToTextAI, which lets you transcribe it quickly and then translate it into 100+ languages, each of which you can export as a subtitle (SRT) file.
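For anyone producing the subtitle file themselves, here is a small sketch of turning timed segments into SRT text. The segment shape (a list of dicts with `start`, `end` in seconds, and `text`) is an assumption modeled on what Whisper-style transcription results look like; the function names are made up for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)


# Assumed example input: two short segments.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " こんにちは"},
    {"start": 2.5, "end": 5.0, "text": " Hello"},
]
srt_text = segments_to_srt(example_segments)
print(srt_text)
```

Writing `srt_text` to a `.srt` file with UTF-8 encoding should be enough for most players to pick it up.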

BetterWhisper · 2025-07-15T22:01:43

Do you support speaker recognition?

lostmsu · 2025-07-15T23:37:58

No. I found models doing that unreliable when there are many speakers.

BetterWhisper · on Feb 4, 2025

Not an 8 min read as stated at the beginning, but interesting nevertheless.

BetterWhisper · on Jan 10, 2025

In "The proof as we know" section he states that the dot is a NAND operation

Quote: "the · dot here can be thought of as representing the Nand operation"

BetterWhisper · on Oct 7, 2024

Are you running Whisper on that same $7 server?

BetterWhisper · on Sept 29, 2024

https://www.videototextai.com/ - an AI platform for transcription, translation, and chatting with your video/audio. We are very close to releasing an update that makes it possible to caption any video in any language, which is perfect for making social media content.

BetterWhisper · on Aug 24, 2024

We're currently developing https://www.videototextai.com/ – ChatGPT for video and audio. The idea is to get to an all-in-one video and audio editing/insights platform. We’re actively building out new features to fully realise our vision, and we'd love to get any feedback from HackerNews!

BetterWhisper · on Aug 9, 2024

If you are looking for something automatic that also lets you interact with your transcripts ChatGPT-style, then I would recommend https://www.videototextai.com/

Terretta · on Aug 9, 2024

That cookies box though... Dark pattern (accept lots + accept all, fake drag affordance, covering a quarter of the page) for cookies doesn't bode well for privacy protections around the transcripts.

BetterWhisper · on Aug 9, 2024

You can delete any transcription you make, and once you do we do not keep any copy of the transcript :). The cookie banner is there to comply with EU law.

BetterWhisper · on Aug 8, 2024

https://github.com/Emerge-Lab/gpudrive - the repo