Speech-to-Text for Video and Audio Assets

Bynder can automatically generate transcripts for audio and video assets in your Bynder DAM via Speech-to-Text. This feature converts spoken audio in multiple languages into text (transcriptions), making these assets easily searchable. Users can locate keywords or phrases in video and audio files without manually adding individual tags. Clicking a word in the generated transcript plays the media from that location. In addition, you can improve the accessibility of your content by adding closed captions to your videos.

How to Enable AI Search Experience 

Reach out to your Customer Success Contact to learn how to enable the AI Search Experience in your portal and any associated costs.

Download Transcripts for Video and Audio Assets

Note

The subtitles will be displayed within Bynder only. Subtitles will not appear for assets embedded outside of Bynder (i.e., via embed code) or in any other Bynder modules.

  1. Navigate to your Portal.
  2. Select the Assets tab. 
  3. You can use the search bar to search for the video.
  4. Select and open the video.
  5. Select Transcript in the Asset Detail View.
  6. A new window will pop up where you can view the transcript.

    Note

    Clicking on a word in the transcript will bring you to the exact location in the video.

  7. View the date generated, length, language, word count, and confidence score.
  8. Click one of the three file formats (SRT, VTT, TXT) to download the transcript.
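
Once a transcript has been downloaded as SRT, it can also be searched outside of Bynder. The following is a minimal sketch, assuming the downloaded file follows the standard SubRip layout (a cue number, a timing line such as 00:00:01,000 --> 00:00:04,000, then the caption text). The names Cue, parse_srt, find_keyword, and the sample text are hypothetical and are not part of any Bynder API.

```python
import re
from typing import Iterator, List, NamedTuple


class Cue(NamedTuple):
    start: str  # timestamp as written in the file, e.g. "00:00:04,500"
    end: str
    text: str


# Matches a standard SubRip timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})"
)


def parse_srt(srt_text: str) -> Iterator[Cue]:
    """Yield one Cue per caption block; blocks are separated by blank lines."""
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                # Everything after the timing line is caption text.
                text = " ".join(part.strip() for part in lines[index + 1:])
                yield Cue(match.group(1), match.group(2), text)
                break


def find_keyword(srt_text: str, keyword: str) -> List[Cue]:
    """Return the cues whose text contains the keyword (case-insensitive)."""
    needle = keyword.casefold()
    return [cue for cue in parse_srt(srt_text) if needle in cue.text.casefold()]


# Hypothetical sample standing in for a downloaded transcript.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,000
Welcome to the product launch.

2
00:00:04,500 --> 00:00:08,000
Today we introduce the new brand guidelines.
"""

if __name__ == "__main__":
    for cue in find_keyword(SAMPLE_SRT, "brand"):
        print(cue.start, cue.text)
```

To use it with a real download, read the file contents (for example with pathlib.Path("transcript.srt").read_text(encoding="utf-8")) and pass the text to find_keyword.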

Speech-to-Text Settings

  1. Navigate to your Assets.
  2. Use the search bar to search for content spoken in the video.
  3. Once you find the video, select the icon next to Transcript.
  4. In the bottom right of the selected video, click and select from the following options:
    • Captions: Enable/disable captions
    • Playback speed: Adjust the speed from Normal to 0.5, 0.75, 1.25 or 1.5.
    • Picture-in-picture: View the video in a separate, smaller window if you’d like to switch between tabs while watching (available on all browsers except Firefox).

Supported Languages

Speech-to-Text offers support for 100 languages, with the following as the most common:

  • Arabic, Modern Standard
  • Belarusian
  • Bosnian
  • Bulgarian
  • Catalan
  • Chinese, Simplified
  • Chinese, Traditional
  • Croatian
  • Czech
  • Danish
  • Dutch
  • English
  • Estonian
  • Finnish
  • French
  • German
  • Greek
  • Hebrew
  • Hindi, Indian
  • Hungarian
  • Icelandic
  • Indonesian
  • Italian
  • Japanese
  • Korean
  • Latvian
  • Lithuanian
  • Macedonian
  • Malay
  • Norwegian Bokmål
  • Polish
  • Portuguese
  • Portuguese, Brazilian
  • Romanian
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Tagalog/Filipino
  • Tamil
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese

If the language you're interested in is not included in the list, please contact your Customer Success Manager to verify its support status.

File Restrictions

The following files cannot be transcribed:

  • Files larger than 2GB
  • Files longer than 4 hours
  • Files shorter than 3 seconds
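
Before uploading, the restrictions above can be checked locally. The following is a minimal sketch, assuming the file size and duration are already known (for example, from your own media tooling). The function name transcription_blockers is hypothetical, and interpreting 2GB as 2 × 1024³ bytes is an assumption; the article does not state whether decimal or binary gigabytes are meant.

```python
from typing import List

# Assumption: "2GB" is read as 2 GiB (2 * 1024**3 bytes).
MAX_SIZE_BYTES = 2 * 1024**3
MAX_DURATION_SECONDS = 4 * 60 * 60  # 4 hours
MIN_DURATION_SECONDS = 3


def transcription_blockers(size_bytes: int, duration_seconds: float) -> List[str]:
    """Return the reasons a file cannot be transcribed; empty if none apply."""
    reasons = []
    if size_bytes > MAX_SIZE_BYTES:
        reasons.append("file is larger than 2GB")
    if duration_seconds > MAX_DURATION_SECONDS:
        reasons.append("file is longer than 4 hours")
    if duration_seconds < MIN_DURATION_SECONDS:
        reasons.append("file is shorter than 3 seconds")
    return reasons


if __name__ == "__main__":
    # A 500 MB, two-minute clip passes every check.
    print(transcription_blockers(500_000_000, 120))
    # A 3 GiB, two-second clip fails on size and minimum length.
    print(transcription_blockers(3 * 1024**3, 2))
```

Returning a list of reasons rather than a single True/False makes it possible to report every restriction a file violates at once.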

Confidence Score

A confidence score indicates the accuracy of a transcript and is scored out of 100. Transcripts with a confidence score below 50 are not shown.

The following factors can affect the confidence score:

  • Audio Quality: The audio input quality can significantly affect the confidence score. Clear, noise-free audio produces higher confidence, while poor quality or loud background audio results in lower scores.
  • Speaker Variability: If multiple speakers are in the audio, this can lower the confidence score, as distinguishing between different voices can be challenging.
    • Currently, only single-language identification is supported. If two languages are spoken in the media, the predominant language is transcribed.
  • Language Complexity: Complex vocabulary, accents, and dialects can impact the confidence score. Uncommon or technical terms may lead to lower scores.
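
The visibility rule for transcripts (hidden when the score is less than 50 out of 100) can be expressed as a simple threshold check. This is a sketch for illustration only: the name is_transcript_shown and the assumption that scores range from 0 to 100 are not taken from Bynder's implementation.

```python
MIN_VISIBLE_CONFIDENCE = 50  # transcripts scoring below this are not shown


def is_transcript_shown(confidence_score: float) -> bool:
    """Return True when a transcript with this score would be displayed."""
    # Assumption: valid scores lie between 0 and 100 inclusive.
    if not 0 <= confidence_score <= 100:
        raise ValueError("confidence score must be between 0 and 100")
    # "Less than 50" is hidden, so a score of exactly 50 is still shown.
    return confidence_score >= MIN_VISIBLE_CONFIDENCE
```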
