
Kapwing
Joshua Grossberg, CTO at Kapwing, discusses Kapwing’s secret to building successful AI-first features, and how that included partnering with AssemblyAI.
What are the most important considerations when building with AI?
Our users are the most important consideration when building a new feature. There are some tools or some technology where we say, is it going to be too high-touch for our customers to use?
We try to separate things like, is it a gimmick, is it a stunt? Is it too complex for our user? But then somewhere in between those things is the sweet spot.
When choosing to integrate a new AI model or partner, the way we see it is that we have our core competencies as a company, but then we have to integrate with the outside world and bring other people's core competencies into ours.
How do you commit to a specific AI feature?
We look at trends, but we also have personas that we go by.
Sometimes we're chasing growth, where we say this feature attracts a high-growth persona. A person may have a lot of followers on TikTok or Instagram, so it's going to lead to high growth. But oftentimes that person in and of themselves is not a high-revenue persona.
Our high-revenue persona is more like a small-business person who's making an Instagram ad, or maybe short-form YouTube tutorials and things like that.
That person tends to really want text-based embellishments. When we see something that is attractive to that person, like really strong word-by-word timings, or just really good transcription, then that makes sense for us.
What's an example of one of Kapwing's AI features?
A big thing that people want, and that correlates very highly with our paying customers, is transcriptions and translations.
People watch videos on mute a lot now. If someone sends me a video and I'm supposed to watch it on the train without subtitles, I'm not going to watch it. And if the subtitles are engaging, that makes it better.
That's been a major driver of our revenue and a major driver for some of our best customers.
So, one of the things that we started to do to make our transcription editing more powerful was give people precise word timings. That allows them to do things like trim the video by editing the transcript. It also means that when you're actually trimming the video, the subtitles stay tethered to the video itself, as opposed to a specific point in time.
And this also allows us to do things like word-by-word animations.
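To make the idea concrete, here is a minimal sketch of how word-level timings could support transcript-based trimming and word-by-word highlighting. It is an illustration only, not Kapwing's implementation: the `Word` type and the `cut_words` and `active_word` helpers are hypothetical names, and the sketch assumes the words are sorted by time and do not overlap.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Word:
    """One transcribed word with its timing in the video (hypothetical type)."""
    text: str
    start_ms: int
    end_ms: int


def cut_words(words: List[Word], first: int, last: int) -> Tuple[Tuple[int, int], List[Word]]:
    """Delete words[first..last] (inclusive) from the transcript.

    Returns the (start_ms, end_ms) range to cut from the video, plus the
    remaining words with later timings shifted earlier by the cut length,
    so the subtitles stay attached to the footage they belong to.
    """
    if not 0 <= first <= last < len(words):
        raise IndexError("word range out of bounds")
    cut_start = words[first].start_ms
    cut_end = words[last].end_ms
    removed = cut_end - cut_start
    kept = list(words[:first])
    for w in words[last + 1:]:
        kept.append(Word(w.text, w.start_ms - removed, w.end_ms - removed))
    return (cut_start, cut_end), kept


def active_word(words: List[Word], playhead_ms: int) -> Optional[Word]:
    """Return the word being spoken at playhead_ms, e.g. to animate it."""
    for w in words:
        if w.start_ms <= playhead_ms < w.end_ms:
            return w
    return None


if __name__ == "__main__":
    transcript = [Word("Hello", 0, 400), Word("um", 450, 700), Word("world", 800, 1200)]
    video_cut, trimmed = cut_words(transcript, 1, 1)  # remove the filler word "um"
    print(video_cut)
    print(trimmed)
    print(active_word(trimmed, 600))
```

Deleting a word in the transcript yields both the time range to remove from the video and an updated word list, which is one way the subtitles can follow the footage instead of a fixed timestamp. The same per-word timings let a renderer look up which word to animate at any playhead position.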
Why Kapwing switched to AssemblyAI
We switched over to AssemblyAI because our previous API didn't have accurate enough word timing or foreign language translations. And foreign languages are actually important for us because we get a lot of users from around the world.
What AI tech are you most excited about in the future?
The generative AI stuff is cool. For us, we're seeing it happening on images, right? But then what's the next step to really augmenting video? I'm not sure it's ready for a lot of paid usage today, but in the future, it could be really compelling.
What AI-powered feature is next for Kapwing?
We're exploring ways to leverage AI to speed up video creation: for example, automatically generating highlights or teasers, automatically editing raw footage, and generating voice-overs to simplify the filming process. The goal is to help more businesses and creators tell stories through video quickly and at scale.


