Changelog

Follow along to see weekly accuracy and product improvements.

March 11, 2025

Multiple API Keys & Projects

We’ve introduced Multiple API Keys and Projects for AssemblyAI accounts. You can now create separate projects for development, staging, and production, making it easier to manage different environments. Within each project, you can set up multiple API keys and track detailed usage and spending metrics. All billing remains centralized while ensuring a clear separation between projects for better organization and control.

Easily manage different environments and streamline your workflow. Visit your dashboard to get started! 🚀
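
One pattern this enables is keeping a separate API key per environment. Here's a minimal sketch in Python; the environment-variable names are hypothetical, so substitute whatever fits your setup:

```python
import os

# Hypothetical variable names -- one AssemblyAI API key per project.
KEY_VARS = {
    "development": "ASSEMBLYAI_API_KEY_DEV",
    "staging": "ASSEMBLYAI_API_KEY_STAGING",
    "production": "ASSEMBLYAI_API_KEY_PROD",
}

def api_key_for(environment: str) -> str:
    """Look up the API key for the given project/environment."""
    var = KEY_VARS[environment]
    key = os.environ.get(var)
    if key is None:
        raise RuntimeError(f"Set {var} to your {environment} project's API key")
    return key

# Example: pretend the staging key has been configured.
os.environ["ASSEMBLYAI_API_KEY_STAGING"] = "<YOUR_STAGING_API_KEY>"
print(api_key_for("staging"))
```

Because billing stays centralized, switching keys this way changes which project usage is attributed to without any other account changes.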

February 24, 2025

Universal improvements

Last week we delivered improvements to our October 2024 Universal release across latency, accuracy, and language coverage.

Universal demonstrates the lowest standard error rate when compared to leading models on the market for English, German, and Spanish:

Average word error rate (WER) across languages for several providers. WER is a canonical metric in speech-to-text that measures typical accuracy (lower is better). Descriptions of our evaluation sets can be found in our October release blog post.

These accuracy improvements are accompanied by significant increases in processing speed: our latest Universal release achieves a 27.4% speedup in inference time at the 95th percentile, enabling faster transcription at scale.

These changes also build on Universal's already best-in-class English performance to bring significant upgrades to last-mile challenges, meaning that Universal faithfully captures the fine details that make transcripts usable, like proper nouns, alphanumerics, and formatting.

Comparative error rates across speech recognition models, with lower values indicating better performance. Descriptions of our evaluation sets can be found in our October release blog post.

You can read our launch blog to learn more about these Universal updates.

February 14, 2025

Ukrainian support for Speaker Diarization

Our Speaker Diarization service now supports Ukrainian speech. This update enables automatic speaker labeling for Ukrainian audio files, making transcripts more readable and powering downstream features in multi-speaker contexts.

Here's how you can get started obtaining Ukrainian speaker labels using our Python SDK:

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
audio_file = "/path/to/your/file"

config = aai.TranscriptionConfig(
  speaker_labels=True,
  language_code="uk"
)

transcript = aai.Transcriber().transcribe(audio_file, config)

for utterance in transcript.utterances:
  print(f"Speaker {utterance.speaker}: {utterance.text}")

Check out our Docs for more information.

February 11, 2025

Claude 2 sunset

As previously announced, we sunset Claude 2 and Claude 2.1 for LeMUR on February 6th.

If you were previously using these models, we recommend switching to Claude 3.5 Sonnet, which is both more performant and less expensive than Claude 2. You can do so via the final_model parameter in LeMUR requests, which is now required.

We have also sunset the lemur/v3/generate/action-items endpoint.
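
For reference, here's a minimal sketch of setting final_model with our Python SDK (the model enum name is assumed from the current SDK; the file path and prompt are placeholders):

```python
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

transcript = aai.Transcriber().transcribe("/path/to/your/file")

# final_model is now required -- here we select Claude 3.5 Sonnet.
result = transcript.lemur.task(
    "Summarize this transcript in two sentences.",
    final_model=aai.LemurModel.claude3_5_sonnet,
)

print(result.response)
```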

February 10, 2025

Reduced hallucination rates; Bugfix

We have reduced Universal-2's hallucination rate for the string "sa" during periods of silence.

We have fixed a rare bug in our Speaker Labels service that would occasionally cause requests to fail and return a server error.

February 5, 2025

Multichannel audio trim fix

We've fixed an issue which caused the audio_start_from and audio_end_at parameters to not be respected for multichannel audio.
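
As a refresher, both parameters take millisecond offsets. A minimal sketch with our Python SDK, trimming a multichannel file to the 5s-15s window:

```python
import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"

# audio_start_from / audio_end_at are millisecond offsets into the file,
# and are now respected for multichannel audio too.
config = aai.TranscriptionConfig(
    multichannel=True,
    audio_start_from=5_000,   # start at 0:05
    audio_end_at=15_000,      # stop at 0:15
)

transcript = aai.Transcriber().transcribe("/path/to/your/file", config)
print(transcript.text)
```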

February 3, 2025

Platform enhancements and security updates

🌍 Simplified EU Data Residency & Management

We've simplified EU operations with instant access to:

  • Self-serve EU data processing via our EU endpoint
  • Complete data sovereignty for EU operations
  • Regional usage filtering and cost tracking
  • Reduced latency for EU-based operations

✅ Enhanced Security & Compliance

  • Full-scope SOC 2 Type 2 certification across all Trust Service Criteria
  • ISO 27001 certification achievement
  • Enhanced security controls across our infrastructure

You can read more about these new enhancements in our related blog.

January 31, 2025

Reduced hallucination rates

We have reduced Universal-2's hallucination rate for the word "it" during periods of silence.

January 15, 2025

New dashboard features

Two new features are available to users on their dashboards:

  1. Users can now see and filter more historical usage and spend data
  2. Users can now see usage and spend by the hour for a given day

December 20, 2024

Reliability improvements

We've made reliability improvements for Claude models in our LeMUR framework.

We've made adjustments to our infrastructure so that users should see fewer timeout errors when using our Nano tier with some languages.

December 19, 2024

LiveKit 🤝 AssemblyAI

We've released the AssemblyAI integration for the LiveKit Agents framework, allowing developers to use our Streaming Speech-to-Text model in their real-time LiveKit applications.

LiveKit is a powerful platform for building real-time audio and video applications. It abstracts away the complicated details of building real-time applications so developers can rapidly build and deploy applications for video conferencing, livestreaming, and more.

Check out our tutorial on How to build a LiveKit app with real-time Speech-to-Text to see how you can build a real-time transcription chat feature using the integration. You can browse all of our integrations on the Integrations page of our Docs.

December 12, 2024

SOC2 Type 2 expansion and renewal

We have renewed our SOC2 Type 2 certification, and expanded it to include Processing Integrity. Our SOC2 Type 2 certification now covers all five Trust Services Criteria (TSCs). 

Our SOC2 Type 2 report is available in our Trust Center to organizations with an NDA.

December 10, 2024

ISO 27001:2022 certification

We have obtained our inaugural ISO 27001:2022 certification, which is an internationally recognized standard for managing information security. It provides a systematic framework for protecting sensitive information through risk management, policies, and procedures. 

Our ISO 27001:2022 report is available in our Trust Center to organizations with an NDA.

November 20, 2024

Timestamp improvement; no-space languages fix

We've improved our timestamp algorithm, yielding higher accuracy for long numerical strings like credit card numbers, phone numbers, etc.

We've released a fix for no-space languages like Japanese and Chinese. While transcripts for these languages correctly contain no spaces in responses from our API, the text attribute of the utterances key previously contained spaces. These extraneous spaces have been removed.

We've improved Universal-2's formatting for punctuation, lowering the likelihood of consecutive punctuation characters such as ?'.

November 18, 2024

Multichannel support

We now offer multichannel transcription, allowing users to transcribe files with up to 32 separate audio channels, making speaker identification easier in situations like virtual meetings.

You can enable multichannel transcription via the `multichannel` parameter when making API requests. Here's how you can do it with our Python SDK:

import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY" 
audio_file = "path/to/your/file.mp3"

config = aai.TranscriptionConfig(multichannel=True)

transcriber = aai.Transcriber(config=config)
transcript = transcriber.transcribe(audio_file)

print(transcript.json_response["audio_channels"])
print(transcript.utterances)

You can learn more about multichannel transcription in our Docs.

November 5, 2024

Introducing Universal-2

Last week we released Universal-2, our latest Speech-to-Text model. Universal-2 builds upon our previous model Universal-1 to make significant improvements in "last mile" challenges critical to real-world use cases - proper nouns, formatting, and alphanumerics.

Comparison of error rates for Universal-2 vs Universal-1 across overall performance (Standard ASR) and four last-mile areas, each measured by the appropriate metric

Universal-2 is now the default model for English files sent to our `v2/transcript` endpoint for async processing. You can read more about Universal-2 in our announcement blog or research blog, or you can try it out now on our Playground.

November 4, 2024

Claude Instant 1.2 removed from LeMUR

The following models were removed from LeMUR: anthropic/claude-instant-1-2 and basic (legacy, equivalent to anthropic/claude-instant-1-2). Requests that specify either model now return a 400 validation error.

These models were removed due to Anthropic sunsetting legacy models in favor of newer models which are more performant, faster, and cheaper. We recommend users who were using the removed models switch to Claude 3 Haiku (anthropic/claude-3-haiku).

November 4, 2024

French performance patch; bugfix

We recently observed a degradation in accuracy when transcribing French files through our API. We have since pushed a bugfix to restore performance to prior levels.

We've improved the clarity of error messages for both our file download service and for Invalid LLM response errors from LeMUR.

We've released a fix to ensure that rate limit headers are always returned from LeMUR requests, not just on 200 and 400 responses.
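
If you use those headers for client-side throttling, parsing them is straightforward. A small sketch; the X-RateLimit-* names follow our API's convention, but check the API reference for the exact headers your responses include:

```python
# Parse rate-limit headers from an HTTP response's header mapping.
def parse_rate_limit(headers: dict) -> dict:
    return {
        "limit": int(headers.get("X-RateLimit-Limit", 0)),
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "reset_seconds": int(headers.get("X-RateLimit-Reset", 0)),
    }

# Example with header values as they might appear on a LeMUR response.
info = parse_rate_limit({
    "X-RateLimit-Limit": "30",
    "X-RateLimit-Remaining": "29",
    "X-RateLimit-Reset": "7",
})
print(info)  # {'limit': 30, 'remaining': 29, 'reset_seconds': 7}
```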

October 18, 2024

New and improved - AssemblyAI Q3 recap

Check out our quarterly wrap-up for a summary of the new features and integrations we launched this quarter, as well as improvements we made to existing models and functionality.

Claude 3 in LeMUR

We added support for Claude 3 in LeMUR, allowing users to prompt the following LLMs in relation to their transcripts:

  • Claude 3.5 Sonnet
  • Claude 3 Opus
  • Claude 3 Sonnet
  • Claude 3 Haiku

Check out our related blog post to learn more.

Automatic Language Detection

We made significant improvements to our Automatic Language Detection (ALD) model, adding support for 10 new languages for a total of 17, with best-in-class accuracy in 15 of those 17 languages. We also added a customizable confidence threshold for ALD.

Learn more about these improvements in our announcement post.
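
The confidence threshold lets you control what happens when detection is uncertain. This plain-Python sketch shows the fallback decision itself; in an actual request you would set the language_confidence_threshold parameter and read language_confidence from the result:

```python
# Sketch of threshold-based fallback for Automatic Language Detection.
def choose_language(detected_code: str, confidence: float,
                    threshold: float = 0.8, default: str = "en") -> str:
    """Keep the detected language only when ALD is confident enough;
    otherwise fall back to a default language."""
    return detected_code if confidence >= threshold else default

print(choose_language("es", 0.95))  # -> es  (confident detection kept)
print(choose_language("tl", 0.42))  # -> en  (low confidence, fall back)
```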

We released the AssemblyAI Ruby SDK and the AssemblyAI C# SDK, allowing Ruby and C# developers to easily add SpeechAI to their applications. The SDKs let developers use our asynchronous Speech-to-Text and Audio Intelligence models, as well as LeMUR, through a simple interface.

Learn more in our Ruby SDK announcement post and our C# SDK announcement post.

This quarter, we shipped two new integrations:

Activepieces 🤝 AssemblyAI

The AssemblyAI integration for Activepieces allows no-code and low-code builders to incorporate AssemblyAI's powerful SpeechAI in Activepieces automations. Learn how to use AssemblyAI in Activepieces in our Docs.

Langflow 🤝 AssemblyAI

We've released the AssemblyAI integration for Langflow, allowing users to build with AssemblyAI in Langflow - a popular open-source, low-code app builder for RAG and multi-agent AI applications. Check out the Langflow docs to learn how to use AssemblyAI in Langflow.

Assembly Required

This quarter we launched Assembly Required - a series of candid conversations with AI founders sharing insights, learnings, and the highs and lows of building a company.

Click here to check out the first conversation in the series, between Edo Liberty, founder and CEO of Pinecone, and Dylan Fox, founder and CEO of AssemblyAI.

We released the AssemblyAI API Postman Collection, which provides a convenient way for Postman users to try our API, featuring endpoints for Speech-to-Text, Audio Intelligence, LeMUR, and Streaming. Like our API reference, the Postman collection also provides example responses so you can quickly browse endpoint results.

Free offer improvements

This quarter, we improved our free offer with:

  • $50 in free credits upon signing up
  • Access to usage dashboard, billing rates, and concurrency limit information
  • Transfer of unused free credits to account balance upon upgrading to Pay as you go

We released 36 new blogs this quarter, from tutorials to projects to technical deep dives. Here are some of the blogs we released this quarter:

  1. Build an AI-powered video conferencing app with Next.js and Stream
  2. Decoding Strategies: How LLMs Choose The Next Word
  3. Florence-2: How it works and how to use it
  4. Speaker diarization vs speaker recognition - what's the difference?
  5. Analyze Audio from Zoom Calls with AssemblyAI and Node.js

We also released 10 new YouTube videos, demonstrating how to build SpeechAI applications and more, including:

  1. Best AI Tools and Helpers Apps for Software Developers in 2024
  2. Build a Chatbot with Claude 3.5 Sonnet and Audio Data
  3. How to build an AI Voice Translator
  4. Real-Time Medical Transcription Analysis Using AI - Python Tutorial

We also made improvements to a range of other features, including:

  1. Timestamp accuracy, with 86% of timestamps accurate to within 0.1s and 96% accurate to within 0.2s
  2. Enhancements to the AssemblyAI app for Zapier, supporting 5 new events. Check out our tutorial on generating subtitles with Zapier to see it in action.
  3. Various upgrades to our API, including improved error messaging and scaling improvements that reduce p90 latency
  4. Improvements to billing, now alerting users upon auto-refill failures
  5. Speaker Diarization improvements, especially robustness in distinguishing speakers with similar voices
  6. A range of new and improved Docs

And more!

We can't wait for you to see what we have in store to close out the year 🚀

October 17, 2024

Claude 1 & 2 sunset

Recently, Anthropic announced that they will be deprecating legacy LLM models that are usable via LeMUR. We will therefore be sunsetting these models in advance of Anthropic's end-of-life for them:

  • Claude Instant 1.2 (“LeMUR Basic”) will be sunset on October 28th, 2024
  • Claude 2.0 and 2.1 (“LeMUR Default”) will be sunset on February 6th, 2025

You will receive API errors rejecting your LeMUR requests if you attempt to use either of the above models after the sunset dates. Users who have used these models recently have been alerted via email with notice to select an alternative model to use via LeMUR.

We have a number of newer models to choose from, which are not only more performant but also ~50% more cost-effective than the legacy models. 

  • If you are using Claude Instant 1.2 (“LeMUR Basic”), we recommend switching to Claude 3 Haiku.
  • If you are using Claude 2.0 (“LeMUR Default”) or Claude 2.1, we recommend switching to Claude 3.5 Sonnet.

Check out our docs to learn how to select which model you use via LeMUR.