AITalk® Koe-No-Shokunin Cloud

Instant creation on web browsers!
Low-cost TTS creation cloud service

AITalk Koe-No-Shokunin Cloud makes it easy to create voice audio for narration and voice guidance.
Because monthly contracts are available, the cloud service is recommended for people looking for an affordable speech synthesis service and for people who only need to create a small number of audio files.

Main usage

Video narration / Telephone voice guidance / Product and service voice guidance

Provision type

Cloud/API/SaaS

Characteristics of AITalk® Koe-No-Shokunin Cloud

  • Natural, human-like voices

    Koe-No-Shokunin Cloud creates natural, human-like voices, unlike the robotic voices of earlier speech synthesis.

  • A variety of voices

    From children to adults, you can choose from a total of 17 standard and Kansai-dialect speakers to suit each use case.

  • Original speaker voices are also available

    Original voice dictionaries created through “AITalk Custom Voice” can be used. This makes it possible to use celebrity or character voices.

  • Easy operation

    The service is simple to operate, so you can start using it immediately without the hassle of studying the manual.

  • Always on the latest version, a benefit exclusive to the cloud version

    There is no fee for updating to the latest version of our speech synthesis engine.

  • No installation required: the cloud version is ready to use immediately after signing up

    After the contract is signed, we will provide you with an ID and password that you can enter to start using the service immediately.

  • Accessible from multiple locations

    You can access the service from multiple locations, because neither MAC address authentication nor USB key authentication is required.

Speakers Introduction

Standard Japanese

Nozomi

Supported emotional expressions: normal, joy, anger, sadness. Her voice is pleasant and youthful. It can be used in various situations, such as narration, automatic telephone answering systems, wireless disaster warning systems, and entertainment.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Kaho

Her voice is extremely clear and easy to understand. It suits a wide range of uses, including automatic telephone answering (CTI, IVR) and narration for animation production.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Yumiko

Mature and calm voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Kanon

Sweet and cute voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Tsubasa

Firm and honest voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Akari

Her voice gives a cheerful and bright impression. It is best suited to product guidance and promotions.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Nanako

Features a very calming voice. Her voice is best suited for reading news and audio guidance.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Shiori

Youthful and friendly voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Seiji

His voice has a very sincere tone. Suitable for persuasion and calling attention.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Osamu

His voice is highly versatile and suits a wide variety of scenes.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Taichi

Supported emotional expressions: normal, joy. His voice gives a youthful and unique impression. It is best suited to the field of entertainment.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Kenta

A gentle, luminous and modest voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Anzu

Features a very loving and earnest voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Chihiro

A charming nasal voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Koutarou

Features a slow-paced and cute voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness
Yuuto

A brisk and intelligent-sounding boy’s voice.

  • DNN
  • Unit-selection-based・with emotion
  • Normal
  • Joy
  • Anger
  • Sadness

Audio demonstration

* About the use of the speech synthesis demonstration

Secondary use of the speech synthesis demonstration provided on this website is prohibited.
In addition, any use other than as a demonstration on this website is prohibited.
Please also check the terms and conditions of this website.

Function Introduction

Word dictionary function

The user dictionary function registers and saves the names of people and places that are read in special ways. You can register not only the reading of a word but also its intonation.

Accent adjustment function

You can easily adjust the accent by moving the accent mark. Fine adjustments can be made and saved.

Intonation adjustment function

You can easily adjust the intonation of voices.

Speed conversion

You can adjust the speaking speed within a range of 0.5x to 4x.

Pitch adjustment function

You can adjust the pitch (tone of the voice) within a range of 0.5x to 2.0x.

Volume adjustment function

You can adjust the volume within a range of 0.5x to 2x.
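
As a rough illustration of the adjustment ranges quoted above (speed 0.5x to 4x, pitch 0.5x to 2.0x, volume 0.5x to 2x), the following Python sketch keeps a requested setting inside its range. The names `RANGES` and `clamp_setting` are hypothetical and are not part of the service; the sketch only shows how such limits could be checked before text is submitted.

```python
# Hypothetical helper: the multiplier ranges come from this page,
# but the names and the clamping behavior are illustrative assumptions.
RANGES = {
    "speed": (0.5, 4.0),
    "pitch": (0.5, 2.0),
    "volume": (0.5, 2.0),
}


def clamp_setting(name: str, value: float) -> float:
    """Return value limited to the range listed for the given setting."""
    low, high = RANGES[name]
    return max(low, min(high, value))


if __name__ == "__main__":
    for name, requested in [("speed", 5.0), ("pitch", 1.2), ("volume", 0.1)]:
        print(name, requested, "->", clamp_setting(name, requested))
```

Clamping is only one possible policy; rejecting out-of-range values with an error message would be an equally reasonable choice.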

Saving voice all at once

You can edit and save multiple texts all at once.

Various counting functions

You can count the number of characters entered.
You can also see the duration of audio files.
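
For readers who want to reproduce similar counts on a downloaded file, here is a minimal Python sketch. It assumes the audio was saved in WAV format and that a character count is simply the length of the text; the page does not specify how the service itself counts characters or measures duration, and both function names are hypothetical.

```python
import io
import wave


def count_characters(text: str) -> int:
    """Count characters as Python sees them (an assumption, not the service's rule)."""
    return len(text)


def wav_duration_seconds(wav_file) -> float:
    """Duration of a WAV file (path or file-like object) in seconds."""
    with wave.open(wav_file, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


if __name__ == "__main__":
    # Build one second of 8 kHz, 16-bit mono silence in memory as a stand-in file.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(8000)
        writer.writeframes(b"\x00\x00" * 8000)
    buffer.seek(0)
    print(count_characters("こんにちは"), wav_duration_seconds(buffer))
```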

SSML* available

By entering text in SSML (markup language) format in the text box, you can control the details of how the text is read. You can select a speaker, adjust the volume, speed, pitch, and intonation, set pauses, and have a specific part of a sentence read differently.

* What is SSML (Speech Synthesis Markup Language)?

It is an XML-based language for marking up text with the pronunciation, volume, pitch, and other information needed for speech synthesis.
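
To give a feel for what SSML looks like, the Python sketch below builds a small document using element names from the W3C SSML specification (`speak`, `voice`, `prosody`, `break`). This page does not state which SSML elements or attribute values the service accepts, so treat the markup as an assumption; the voice name `nozomi` and the function `build_ssml` are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_ssml(intro: str, highlighted: str, voice_name: str = "nozomi") -> str:
    """Build a small SSML string: intro text, a pause, then an adjusted passage.

    Element names follow the W3C SSML specification. Whether the service
    accepts exactly these elements and values is an assumption.
    """
    speak = ET.Element("speak", {"version": "1.1"})
    # Select a speaker (the voice name is a hypothetical placeholder).
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    voice.text = intro
    # Insert a pause between the two passages.
    ET.SubElement(voice, "break", {"time": "500ms"})
    # Read one specific part of the text with different speed, pitch, and volume.
    prosody = ET.SubElement(
        voice, "prosody", {"rate": "slow", "pitch": "high", "volume": "loud"}
    )
    prosody.text = highlighted
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Thank you for calling.", "Please hold the line."))
```

The resulting string could then be pasted into the text box in place of plain text, provided the service supports the elements used.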

Screen Image

Voice creation screen

  • Enter the title of the voice file.
  • Enter the sentences you wish to synthesize.
  • Press play to check the voice created.
  • Select voices here.
  • Volume, speed, pitch, and intonation can be set here.
  • Text can be saved for each parameter setting.
  • Save the voice file.
  • Select the saving format of the voice file.

Voice list

You can download the saved voice as a voice file to reuse and edit.

Text list

You can edit and reuse the saved texts as well.

Have you encountered any of these issues?

  • You want to create guidance voices for your product, but they need to be modified with each update, so you are worried about the running cost
  • You want to use low-priced synthesized voices for products still in development
  • The voice guidance for your company phone is modified regularly, but your budget makes it difficult to use speech synthesis software
  • You want to add narration to content such as manuals or tutorial videos
  • You want to install speech synthesis software at your company but cannot because of company security policies

AITalk® Koe-No-Shokunin Cloud Application Examples

Voice guidance for various devices such as vending machines,
coin lockers and parking lot machines

Because spot use is possible, you can create high-quality voices while keeping costs to a minimum.

The voice for interactive voice response (IVR)

Responses can be changed and adjusted easily.
Because spot use is possible, the cloud service can provide high-quality voice guidance at a low price.

Narration for videos

Anyone can create voices for narration.
Narrations can also be replaced easily.

Ready to use as soon as an account is issued!

No software installation is needed, because voice files can be created in a browser. The service can therefore be used even if your company's security policy does not allow software to be installed.

Steps before use

Flow for introducing the easy-to-use text-to-speech service AITalk Koe-No-Shokunin Cloud

Step.1
Inquiry Form (10 days before use)

Please download the application form, the terms and conditions, and the account application form below, and check their contents.
Download the terms and conditions, application form, and account application form.
If you agree to the terms and conditions, please fill out the application form and the account application form and apply using the form.

Step.2
Contact from AI

A representative will contact you within two business days.

※ Please note that, depending on the results of our internal screening, we may not be able to provide our service.

Step.3
Providing the ID and PW

We will issue your ID and PW by email.

Step.4
Start of usage

You can start using the service according to the plan you applied for. Please contact us if you have any further questions.