How to transcribe audio to text for FREE - Riverside's new AI transcription tool

Joey /// VP Land
21 Mar 2023 | 07:04

TLDR

Riverside has introduced a new AI-powered transcription tool that is completely free. It can transcribe audio or video files in over 100 languages. Built on OpenAI's Whisper technology, the tool offers real-time transcription with timestamps. Users can copy the transcript or download it as a text file or as an SRT file for captioning. However, it lacks speaker identification and editing capabilities. Despite these limitations, the tool is highly accurate, with an 88% similarity to proofread text. It is a great starting point for transcriptions, but proofreading is still necessary. The tool is separate from the Riverside app but may be integrated in the future.

Takeaways

  • 🆓 Riverside has launched a free AI transcription tool that can transcribe audio or video in over 100 languages.
  • 🌐 The tool is built on Whisper, a technology from OpenAI, the company behind ChatGPT and the GPT models.
  • 📄 Visiting riverside.fm/transcription allows you to use the tool without needing an account, simply by dragging and dropping your file.
  • 📉 The tool processed a 50-minute interview file under 5GB in about 5 minutes, indicating efficient processing.
  • 🔍 The transcription is displayed in real-time with timestamps, but without speaker labels or identification.
  • 📋 The transcribed text can be copied to clipboard with timecodes, but without speaker labels.
  • 💾 Download options include a plain text file and an SRT file for video captioning, both lacking speaker identification.
  • 🎉 For those interested in Riverside's services, using the code JOEY30 saves 30% off any chosen plan.
  • ⚠️ Limitations include the inability to preview audio for accuracy checks or edit the transcribed text directly within the tool.
  • 🤖 The tool does not support uploading a custom vocabulary library for better recognition of proper nouns.
  • 📈 Despite the limitations, the tool offers a highly accurate transcription service for free, with an 88% similarity to proofread text.
  • 📘 The transcription can serve as a great starting point, though proofreading is still necessary for accuracy.

Q & A

  • What is the new AI transcription tool released by Riverside?

    -It is a completely free AI transcription tool built on top of Whisper.

  • How many languages does Riverside's transcription tool support?

    -Riverside's transcription tool supports over 100 different languages.

  • What is the name of the company that built Whisper, which Riverside's tool is based on?

    -Whisper is built by OpenAI, the same company that created ChatGPT, GPT-3, and GPT-4.

  • What is the process of using Riverside's transcription tool?

    -To use the tool, you go to riverside.fm/transcription, drag and drop your audio or video file, and then click start transcribing. The file uploads and the tool processes the transcription in real time.

  • What is the approximate file size limit for Riverside's transcription tool?

    -There is no explicit mention of file size limits, but the example in the transcript was a file under 5 gigabytes.

  • Does the transcription tool provide speaker labels in the transcription?

    -No, the transcription tool does not identify speakers or provide speaker labels. For this feature, you would need to use another tool like Descript.

  • What can you do with the finalized transcript from Riverside's tool?

    -You can copy the transcript to the clipboard, download it as a text file or an SRT file for captioning, and use it for various purposes such as reference or uploading to video hosting platforms.

  • What is the advantage of using the SRT file downloaded from Riverside's tool for captioning?

    -The SRT file from Riverside's tool, which uses Whisper from OpenAI, is likely to be more accurate than YouTube's auto transcription and can be used to caption videos on platforms like YouTube, Vimeo, LinkedIn, and Facebook.

  • What are some downsides of using Riverside's free transcription service?

    -Downsides include the inability to preview the audio for accuracy testing, no option to edit the text within the tool, and no integration with the Riverside app for podcast episodes at the time of the recording.

  • How long did it take for Riverside's tool to transcribe the 50-minute interview?

    -The transcription of the 50-minute interview was completed in approximately 1 to 2 minutes.

  • What is the accuracy of Riverside's transcription tool when compared to a proofread transcript?

    -The overall similarity between the output of Riverside's tool and a proofread transcript was 88%, with most differences being punctuation and a few proper nouns.

  • How can you get a discount on Riverside's services?

    -You can get a 30% discount on any Riverside plan by using the code JOEY30 during sign-up.

Outlines

00:00

🆓 Riverside's AI Transcription Tool Overview

Riverside has launched a free AI-powered transcription tool capable of transcribing audio or video files in over 100 languages. Built on OpenAI's Whisper technology, the tool offers a simple interface and does not require an account. No file size limit is stated; the demonstration uses an interview file of under 5 gigabytes and approximately 50 minutes in duration. The transcript appears in real time, and the output includes timecode markers but lacks speaker labels. The tool allows copying to the clipboard and downloading as a text file or as an SRT file for captioning. However, it does not support audio preview or text editing within the tool. Despite these limitations, the service is highly regarded for its accuracy and free access.

05:02

📊 Comparing Riverside's Transcription Accuracy

The video compares the Whisper-based output of Riverside's tool with a proofread transcript. The overall similarity is 88%, and most of the differences are punctuation and capitalization, plus a few proper nouns. The tool did not transcribe 'GPT' correctly and made errors with certain names, although it handled some other proper nouns and acronyms correctly. It also offers no way to supply a custom vocabulary library for proper nouns. Even so, the transcription is considered highly accurate for a free service. Users are advised to always proofread auto-generated transcripts.

Keywords

💡 AI transcription tool

An AI transcription tool is a software application that uses artificial intelligence to convert spoken language in audio or video files into written text. In the context of the video, Riverside has released a free AI transcription tool that can transcribe audio and video files in over 100 languages. The tool is built on Whisper, a technology from OpenAI, and is showcased transcribing a 50-minute interview in real time.

💡 Riverside.fm/transcription

This is the URL provided in the video for accessing Riverside's new AI transcription service. The interface at this URL is described as simple and user-friendly, where users can upload their audio or video files without needing to create an account. The video demonstrates the process of uploading a file and starting the transcription, highlighting the ease of use of this free service.

💡 Whisper

Whisper is an AI tool developed by OpenAI, the same company behind ChatGPT and GPT models. It is used as the underlying technology for Riverside's transcription tool. The video script mentions that the transcription tool is built on top of Whisper, indicating its role in providing the transcription capabilities.

💡 Transcribe

To transcribe means to convert spoken language into written form. In the video, the process of transcribing involves uploading an audio or video file to Riverside's tool, which then converts the spoken words into text. The video demonstrates this process with a podcast episode, showing the transcription in real-time.

💡 Accuracy

Accuracy in the context of transcription refers to how closely the written text matches the spoken words in the audio or video file. The video discusses the accuracy of Riverside's transcription tool by comparing its output to a proofread transcript. The tool's accuracy is evaluated based on the similarity percentage and the types of errors present.
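The video does not say how its similarity percentage was calculated. As a rough illustration only, the sketch below scores two transcripts with Python's standard `difflib` module, once on the raw words and once after lowercasing and stripping punctuation. The function names and the sample sentences are made up for this example, and the numbers it produces are not comparable to the 88% figure from the video.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and keep only word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(auto: str, proofread: str, ignore_punctuation: bool = False) -> float:
    """Return a 0-100 word-level similarity score between two transcripts."""
    if ignore_punctuation:
        a, b = normalize(auto), normalize(proofread)
    else:
        a, b = auto.split(), proofread.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return 100 * matcher.ratio()


# Hypothetical sample text, not taken from the video's transcript.
auto = "riverside is built on whisper from open ai"
proofread = "Riverside is built on Whisper, from OpenAI."

print(f"raw words:  {similarity(auto, proofread):.0f}%")
print(f"normalized: {similarity(auto, proofread, ignore_punctuation=True):.0f}%")
```

Comparing the two scores gives a feel for how much of the gap comes from punctuation and capitalization rather than from wrongly recognized words.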

💡 Timecode markers

Timecode markers are used in transcriptions to indicate the specific time points in the audio or video file where certain words or phrases are spoken. The video mentions that when copying the transcript, timecode markers are included, which can be useful for referencing specific parts of the audio or video.
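To make the idea concrete, here is a minimal Python sketch that prefixes transcript segments with a timecode. The `[HH:MM:SS]` layout and the sample segments are assumptions for illustration; the exact layout that Riverside's tool copies to the clipboard is not described in the video.

```python
def format_timecode(seconds: float) -> str:
    """Format an offset in seconds as HH:MM:SS (fractions are truncated)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def with_timecodes(segments: list[tuple[float, str]]) -> str:
    """Render (start_seconds, text) pairs as one timecoded line per segment."""
    return "\n".join(f"[{format_timecode(start)}] {text}" for start, text in segments)


# Hypothetical segments: (start time in seconds, spoken text).
segments = [
    (0.0, "Riverside has a new transcription tool."),
    (302.0, "Let's compare it with a proofread transcript."),
]
print(with_timecodes(segments))
```

A reader can then jump to the marked offset in the recording to check a passage against the audio.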

💡 Speaker labels

Speaker labels are used to identify which speaker is speaking at any given time in a transcript. The video points out that Riverside's transcription tool does not identify speakers, meaning it does not provide labels to distinguish between different speakers in the conversation.

💡 Descript

Descript is a tool mentioned in the video for its ability to perform speaker identification, which is not a feature of Riverside's transcription tool. The video suggests using Descript to load transcripts and add speaker labels, indicating its utility for enhancing the basic transcription output.

💡 SRT file

An SRT file is a SubRip Text file format used for video captions. The video script explains that Riverside's tool can download transcriptions as SRT files, which can then be used to add captions to videos on platforms like YouTube, Vimeo, LinkedIn, and Facebook.
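An SRT file is plain text: each caption is a block made of a sequence number, a line with start and end times in the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and one or more lines of text, with a blank line between blocks. The Python sketch below parses such a file into caption objects, for example to pull the plain text back out. The `Cue` class, the function names, and the sample captions are illustrative assumptions, not output from Riverside's tool.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: float  # seconds from the beginning of the media
    end: float    # seconds from the beginning of the media
    text: str


TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(data: str) -> list[Cue]:
    """Parse SRT text into a list of cues, skipping incomplete blocks."""
    cleaned = data.lstrip("\ufeff").replace("\r\n", "\n").strip()
    cues = []
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        if len(lines) < 2:
            continue
        start, end = lines[1].split("-->")
        text = " ".join(lines[2:]).strip()
        cues.append(Cue(int(lines[0]), to_seconds(start), to_seconds(end), text))
    return cues


# Hypothetical sample captions.
SAMPLE = """1
00:00:00,000 --> 00:00:03,500
Riverside has a new transcription tool.

2
00:00:03,500 --> 00:00:07,250
It is built on Whisper.
"""

cues = parse_srt(SAMPLE)
print(" ".join(cue.text for cue in cues))
```

Keeping the start and end times as seconds makes it easy to shift or merge captions before writing them back out for a video platform.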

💡 Proofread

Proofreading is the process of checking written material for errors before it is finalized. In the context of the video, proofreading is suggested as a necessary step after using the transcription tool, as it helps to ensure the accuracy of the transcribed text by identifying and correcting any mistakes.

💡 Custom vocab library

A custom vocab library is a collection of specific words or terms that a transcription tool can be programmed to recognize accurately. The video notes that Riverside's tool does not allow for uploading a custom vocab library, which could lead to inaccuracies in transcribing proper names or specialized vocabulary.

Highlights

Riverside has released a free AI transcription tool that can transcribe audio or video in over 100 languages.

The tool is built on Whisper, a technology from OpenAI, the company behind ChatGPT, GPT-3, and GPT-4.

The transcription tool is completely free and does not require an account to use.

Users can simply drag and drop their audio or video files for transcription.

The tool can handle large files, with the demonstration involving a file under 5 gigabytes.

Transcription is processed in real-time, with timestamps generated as the audio is processed.

The tool completed transcription of a 50-minute interview in approximately 1 to 2 minutes.

Transcripts can be copied to clipboard with timecode markers but without speaker labels.

For speaker identification, the transcript needs to be loaded into another tool like Descript.

Transcripts can be downloaded in text file format without timecode markers or as an SRT file for video captioning.

The SRT file is suitable for uploading to video hosting platforms like YouTube, Vimeo, LinkedIn, and Facebook.

The free service does not allow for audio preview or text editing within the tool.

The transcription tool is a separate entity from the Riverside app and does not integrate with podcast episodes.

The tool's accuracy is demonstrated to be high, with an 88% similarity to a proofread transcript.

Most inaccuracies are minor: punctuation and capitalization differences, plus a few proper nouns.

The tool does not support uploading a custom vocab library for specific proper nouns.

Despite its limitations, the tool is considered highly accurate and useful, especially considering it is free to use.

The transcription tool is available at riverside.fm/transcription for users to upload and transcribe their audio or video files.

For additional features like podcasting and video podcasting, users are encouraged to explore Riverside's full range of offerings.