How to make a FREE Text-to-Speech Voice Tutorial | Microsoft Azure AI

All About AI
28 May 2022 · 09:24

TLDR: In this tutorial, the speaker demonstrates how to create a free text-to-speech voice using Microsoft Azure AI. The process begins with signing up for a free account on Azure and setting up a resource group. The user is then guided through creating a speech service, selecting the 'Free F0' pricing tier, and deploying the service. Once the service is ready, the user can explore various voice options and customize them by adjusting pitch, rate, and volume to suit their preferences. The tutorial also highlights the ability to choose from a wide range of voices and speaking styles, allowing for a personalized experience. The speaker emphasizes the ease of use and the extensive customization options available with Azure's text-to-speech service.

Takeaways

  • 😀 The tutorial demonstrates how to create a text-to-speech voice using Microsoft Azure AI.
  • 🌐 The presenter is not using the typical AI voice and introduces the process of generating an artificial voice.
  • 🇳🇴 The presenter is from Norway but believes their English is understandable for the audience.
  • 💻 The presenter mentions three platforms for cognitive services (AWS, Google Cloud Platform, and Microsoft Azure) and focuses on Azure.
  • 🔍 Users are guided to start with a free account on Azure, which provides a $200 credit for various services.
  • 📦 A new resource group is created in the Azure portal, which is a place to store and access resources.
  • 📍 The tutorial specifies choosing the 'East US' region for the resource group because it supports the neural voices.
  • 🔖 The 'Speech' service is created within the resource group, with a focus on neural languages.
  • 🎤 The 'Speech Studio' is introduced for audio content creation, allowing users to generate artificial voices.
  • 📝 A sample text is used to demonstrate the text-to-speech process, with options to choose language and voice.
  • 🎚️ The tutorial explains how to adjust voice parameters such as rate, pitch, and volume to customize the artificial voice.
  • 🗣️ Multiple voice options are available, with the ability to preview and select different speaking styles.
  • 🔄 The presenter shows how to change the speaking style and speed of the voice to achieve the desired effect.
  • 🎉 The tutorial concludes with an encouragement to like, subscribe, and look forward to the next video.

Q & A

  • What is the main purpose of the video tutorial?

    -The main purpose of the video tutorial is to demonstrate how to generate an artificial text-to-speech voice using Microsoft Azure AI.

  • What are the three platforms mentioned for cognitive services?

    -The three platforms mentioned for cognitive services are AWS (Amazon Web Services), Google Cloud Platform, and Microsoft Azure.

  • How much credit does Microsoft Azure offer to start with when creating a new account?

    -Microsoft Azure offers $200 in credits to start with when creating a new account.

  • What is a resource group in Azure?

    -A resource group in Azure is a logical container that holds related Azure resources and allows you to manage them together.

  • Why is the region 'East US' chosen for creating the resource group?

    -The region 'East US' is chosen because, according to the presenter, the neural voices used by the text-to-speech service are available in that region.

  • What type of service is being created in the tutorial?

    -The tutorial is creating a speech service in Microsoft Azure.

  • What is the name of the service created in the tutorial?

    -The name of the service created in the tutorial is 'All About AI'.

  • What is the pricing tier chosen for the speech service?

    -The pricing tier chosen for the speech service is 'Free F0'.

  • How can one access the Speech Studio in the Azure portal?

    -To access the Speech Studio from the Azure portal, open your Speech resource, scroll down to 'Discover', and click on it. Then choose the option to explore the full features in Speech Studio.

  • What is the process for creating artificial voices in the Speech Studio?

    -The process for creating artificial voices in the Speech Studio involves starting a new project, uploading text, selecting a language, choosing a voice, and customizing the voice's pitch, rate, and volume.

  • How can the speed and pitch of the generated voice be adjusted?

    -The speed and pitch of the generated voice can be adjusted using sliders in the Speech Studio interface, allowing you to increase or decrease the rate and pitch to your preference.

  • What are the different speaking styles available for the voices?

    -The different speaking styles available for the voices include default, excited, cheerful, and other styles that can be previewed and selected in the Speech Studio.

Outlines

00:00

🤖 Introduction to Creating Artificial Voices with Azure

The speaker introduces a tutorial on generating artificial voices using Microsoft Azure. They explain that they are not using the typical AI voice and that they are from Norway but believe their English is comprehensible. The tutorial will guide viewers through setting up a free account on Azure, creating a resource group, and navigating to the speech service to generate a text-to-speech voice. The process involves selecting a region, naming the service, and choosing a pricing tier, with the speaker opting for the free F0 tier.

05:01

🎙️ Customizing and Selecting Voices in Azure's Speech Studio

In this part, the speaker demonstrates how to customize and select different artificial voices within Azure's Speech Studio. They show the process of starting a new project, pasting in text, and choosing a language. The speaker then previews various voices, making adjustments to speed, pitch, and volume to achieve a desired sound. They highlight the ability to select from thousands of voices and styles, providing examples of different English accents and speaking styles, such as 'excited' and 'newscaster'. The speaker concludes by emphasizing the customization options available and thanks the viewers for watching.
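The speaking styles picked in Speech Studio correspond to SSML markup, so the same effect can be sketched in code. The snippet below is a minimal sketch that only builds the SSML string; the helper name `build_styled_ssml` is hypothetical, and the voice name `en-US-JennyNeural` and the style identifiers (`excited`, `cheerful`, `newscast`) are assumptions based on Azure's SSML conventions, not something shown in the video.

```python
from xml.sax.saxutils import escape, quoteattr


def build_styled_ssml(text, voice="en-US-JennyNeural",
                      style="excited", lang="en-US"):
    """Wrap text in an Azure-style <mstts:express-as> element.

    Hypothetical helper: voice and style names are assumed values and
    should be checked against the voice list in Speech Studio.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<mstts:express-as style={quoteattr(style)}>'
        f'{escape(text)}'           # escape &, < and > in the spoken text
        '</mstts:express-as></voice></speak>'
    )


if __name__ == "__main__":
    print(build_styled_ssml("Thanks for watching!", style="cheerful"))
```

Not every voice supports every style, so a style that works for one voice may be ignored by another.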

Keywords

💡Text-to-Speech

Text-to-Speech (TTS) is a technology that converts written text into audible speech. In the video, the creator demonstrates how to generate an artificial voice using TTS technology provided by Microsoft Azure AI. This is central to the video's theme as it shows the process of creating a voice that can read out text, which is useful for accessibility features, audio books, or any application requiring voice output.

💡Microsoft Azure

Microsoft Azure is a cloud computing service created by Microsoft for building, testing, deploying, and managing applications and services through Microsoft-managed data centers. The video tutorial focuses on using Azure's cognitive services to create a text-to-speech voice. Azure's platform is highlighted as a tool for generating the artificial voice, emphasizing its capabilities in AI and cloud services.

💡Cognitive Services

Cognitive Services are AI services and cognitive APIs provided by Microsoft Azure that enable applications to see, hear, speak, understand, and interpret human needs using natural methods of communication. In the context of the video, cognitive services are used to access the text-to-speech functionality, which is a part of the suite of AI services offered by Azure.

💡Resource Group

A Resource Group in Azure is a logical container that holds related Azure resources. The video script instructs viewers to create a new resource group as a place to store and easily access their resources. This concept is integral to organizing and managing the services and resources needed for the text-to-speech project.
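For readers who prefer a terminal to the portal, the same setup can be sketched with the Azure CLI. The snippet below only assembles the command lines and does not run them. The resource names (`all-about-ai-rg`, `all-about-ai-speech`) are hypothetical, and the `eastus` location and `F0` SKU simply mirror the choices made in the video.

```python
import shlex


def build_setup_commands(group, speech_name, location="eastus", sku="F0"):
    """Return Azure CLI commands mirroring the portal steps in the video.

    The commands are returned as argument lists and are NOT executed here.
    """
    create_group = [
        "az", "group", "create",
        "--name", group,
        "--location", location,
    ]
    create_speech = [
        "az", "cognitiveservices", "account", "create",
        "--name", speech_name,
        "--resource-group", group,
        "--kind", "SpeechServices",   # the Speech resource kind
        "--sku", sku,                 # F0 is the free tier
        "--location", location,
        "--yes",                      # accept the terms without prompting
    ]
    return [create_group, create_speech]


if __name__ == "__main__":
    # Hypothetical names; print the commands for copy-and-paste.
    for cmd in build_setup_commands("all-about-ai-rg", "all-about-ai-speech"):
        print(shlex.join(cmd))
```

Keeping the commands as lists makes it easy to pass them to `subprocess.run` later, once the CLI is installed and logged in.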

💡Speech Service

The Speech Service in Azure is a component of the Cognitive Services suite that enables developers to add speech recognition, speech synthesis, and other speech capabilities to their applications. The script details the creation of a Speech Service resource, which is a prerequisite for generating the artificial voice in the tutorial.
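Once a Speech resource exists, it can also be called outside Speech Studio. The following is a minimal sketch of a request to the text-to-speech REST endpoint, built with the standard library only and not sent. The key is a placeholder, the `User-Agent` value is arbitrary, and the region, voice, and output format are assumed example values rather than settings taken from the video.

```python
import urllib.request


def build_tts_request(ssml, key, region="eastus",
                      output_format="audio-16khz-128kbitrate-mono-mp3"):
    """Build (but do not send) a text-to-speech REST request."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,          # key of the Speech resource
        "Content-Type": "application/ssml+xml",    # request body is SSML
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "tts-tutorial-sketch",       # arbitrary client name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


if __name__ == "__main__":
    ssml = (
        '<speak version="1.0" xml:lang="en-US">'
        '<voice name="en-US-JennyNeural">Hello from Azure.</voice>'
        '</speak>'
    )
    req = build_tts_request(ssml, key="YOUR-KEY-HERE")
    print(req.get_method(), req.full_url)
    # With a real key, urllib.request.urlopen(req).read() would return
    # the audio bytes, which can then be written to an .mp3 file.
```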

💡Pricing Tier

Pricing tiers in Azure refer to the different levels of service and associated costs. The video mentions choosing a 'free F0' tier for the Speech Service, indicating that the viewer can start experimenting with text-to-speech capabilities without incurring costs, which is an attractive feature for beginners or those on a budget.
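Because the free tier is limited by usage, it can help to estimate how many characters a script will consume before synthesizing it. The sketch below is a rough estimate only: the default quota of 500,000 characters per month is an assumption that should be checked against the current Azure pricing page, and actual billing may count characters differently from Python's `len`.

```python
def fits_free_quota(texts, used_chars=0, monthly_quota=500_000):
    """Estimate whether the given texts fit in the remaining free quota.

    monthly_quota is an assumed figure, not an official limit.
    Returns a (fits, total_chars) tuple.
    """
    total = used_chars + sum(len(t) for t in texts)
    return total <= monthly_quota, total


if __name__ == "__main__":
    script = ["Welcome to the channel.", "Today we look at Azure."]
    ok, total = fits_free_quota(script, used_chars=120_000)
    print(f"fits: {ok}, characters after this job: {total}")
```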

💡Speech Studio

Speech Studio is a feature within Azure's Speech Service that allows users to create, test, and deploy custom speech models. In the script, the creator navigates to Speech Studio to start a project for audio content creation, demonstrating how to use the platform to generate and customize the artificial voice.

💡Voice Customization

Voice customization refers to the process of adjusting the characteristics of a voice, such as pitch, rate, and volume, to suit specific needs. The video shows how to customize the voice by changing its speed and pitch to achieve a more natural and desirable sound, which is a key aspect of creating an engaging text-to-speech voice.
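The rate, pitch, and volume sliders map onto the SSML `<prosody>` element. The snippet below is a minimal sketch that builds such markup with the standard library. The helper name `build_prosody_ssml` is hypothetical, the voice name is an assumed example, and expressing all three settings as signed percentages is one of several value forms that SSML allows.

```python
import xml.etree.ElementTree as ET


def build_prosody_ssml(text, voice="en-US-JennyNeural", lang="en-US",
                       rate_pct=0, pitch_pct=0, volume_pct=0):
    """Build SSML that adjusts rate, pitch and volume by integer percentages.

    Positive values speed up / raise / amplify, negative values do the
    opposite. The voice name is an assumed example.
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": "http://www.w3.org/2001/10/synthesis",
        "xml:lang": lang,
    })
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    prosody = ET.SubElement(voice_el, "prosody", {
        "rate": f"{rate_pct:+d}%",      # e.g. +10% is slightly faster
        "pitch": f"{pitch_pct:+d}%",    # e.g. -5% is slightly lower
        "volume": f"{volume_pct:+d}%",
    })
    prosody.text = text                 # ElementTree escapes special characters
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_prosody_ssml("Hello and welcome.", rate_pct=10, pitch_pct=-5))
```

Small changes of a few percent are usually enough; large values tend to make the voice sound unnatural.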

💡Neural Voices

Neural voices are a type of text-to-speech technology that uses neural networks to create more natural-sounding speech. The script mentions choosing a region that supports neural languages, implying the use of advanced AI to generate high-quality, human-like voices for the text-to-speech application.

💡Project

In the context of the video, a 'project' refers to a specific instance of creating and configuring a text-to-speech voice within the Speech Studio. The creator starts a new project to demonstrate the process of selecting text, choosing a voice, and customizing its attributes to create the desired artificial voice.

Highlights

Introduction to creating a free Text-to-Speech voice with Microsoft Azure AI.

The speaker is not using the usual artificial intelligence voice.

The speaker is from Norway but believes their English is understandable.

Three platforms used for cognitive services: AWS, Google Cloud Platform, and Microsoft Azure.

Instructions to visit azure.com and start with a $200 free credit.

Creating an account on Azure and navigating to the Resource Group.

Creating a new resource group named 'All About AI' in the East US region.

Creating a Speech service resource within the resource group.

Choosing the 'Free F0' pricing tier for the Speech service.

Waiting for the Speech service deployment to complete.

Accessing the Speech service and starting to create artificial voices.

Using the Speech Studio to create audio content.

Selecting text for the Text-to-Speech voice and choosing UK English.

Previewing and choosing the voice 'Abbi' for the Text-to-Speech.

Adjusting the speaking rate and pitch of the chosen voice.

Exploring different voice options and styles available in the Speech Studio.

Customizing the voice 'Jenny' with different speaking styles like 'excited'.

Final adjustments to the voice speed and pitch to achieve a satisfactory result.

Conclusion and invitation to like, subscribe, and watch the next video.