Dan Kokotov: Speech Recognition with AI and Humans
Core Takeaways
Rev's ASR has a 14% word error rate, while human transcription is around 2-3%.
Why it matters
The gap shows how far current ASR still trails human transcribers on accuracy.
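Word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that calculation. It assumes simple lowercase, whitespace-based tokenization, and the function name `word_error_rate` is hypothetical. The episode summary does not describe how Rev actually scores its transcripts, so this should not be read as Rev's method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are lowercased and split on whitespace; real
    scoring pipelines usually normalize punctuation and numbers too.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    hypothesis = "the quick brown fox jumped over lazy dog"
    # One substitution plus one deletion against nine reference words.
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

Under this definition, a 14% word error rate means about 14 word-level errors per 100 reference words, compared with 2 to 3 for human transcription.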
Rev.ai focuses on automatic speech recognition to improve transcription efficiency.
▶ 1:00
Why it matters
This specialization allows Rev to streamline transcription services and reduce costs.
Machine translation between English and Russian is complex due to structural differences.
▶ 20:00
Why it matters
These structural differences illustrate why accurate machine translation remains difficult across language pairs.
Podcasting is valued for its depth and potential for human connection.
▶ 1:10:00
Why it matters
Podcasting's depth counters the superficiality of other media, fostering genuine dialogue.
Joe Rogan's $100M Spotify deal highlights the tension between platform exclusivity and open podcast distribution.
▶ 1:45:00
Why it matters
The deal exemplifies the conflict between monetization and the ethos of open-access content.
AI-generated summary · last refreshed 2026-06-06 21:39:21
Quotes are matched verbatim against the source transcript; references are checked to resolve to real URLs. Even so, AI can misread structure or attribute claims imperfectly. If you spot an error, please let us know.