ParaStyleTTS: Toward Efficient and Robust Paralinguistic Style Control for Expressive Text-to-Speech Generation

Controlling speaking style in text-to-speech (TTS) systems has become a growing focus in both academia and industry. While many existing approaches rely on reference audio to guide style generation, such methods are often impractical due to privacy concerns and limited accessibility. More recently, large language models (LLMs) have been used to control speaking style through natural language prompts; however, their high computational cost, lack of interpretability, and sensitivity to prompt phrasing limit their applicability in real-time and resource-constrained environments. In this work, we propose ParaStyleTTS, a lightweight and interpretable TTS framework that enables expressive style control from text prompts alone. ParaStyleTTS features a novel two-level style adaptation architecture that separates prosodic and paralinguistic speech style modelling. It allows fine-grained and robust control over factors such as emotion, gender, and age. Unlike LLM-based methods, ParaStyleTTS maintains consistent style realization across varied prompt formulations and is well-suited for real-world applications, including on-device and low-resource deployment. Experimental results show that ParaStyleTTS generates high-quality speech with performance comparable to state-of-the-art LLM-based systems while being 30x faster, using 8x fewer parameters, and requiring 2.5x less CUDA memory. Moreover, ParaStyleTTS exhibits superior robustness and controllability over paralinguistic speaking styles, providing a practical and efficient solution for style-controllable text-to-speech generation.

Multilingual Speech Generation

Prompt: A female speaking [English/Chinese] with neutral emotion

The weather looks perfect for a walk in the park.

Technology is changing the way we live and work.

In the end, it's the simple moments that matter most.

今天的天气非常适合散步。

科技正在改变我们的生活方式。

到头来,最重要的是那些简单的瞬间。

Emotion and Gender Style Control

Prompt: Adult [male/female] speaking [English/Chinese] with [neutral/sad/happy/angry/surprise] emotion

😊 Happy

Female (EN)

I'm so excited to see you again!

This is the best day I’ve had in a long time.

女声 (CH)

太开心了,终于又见到你了!

今天真是我最近最快乐的一天!

Male (EN)

I'm so excited to see you again!

This is the best day I’ve had in a long time.

男声 (CH)

太开心了,终于又见到你了!

今天真是我最近最快乐的一天!

😢 Sad

Female (EN)

I really thought things would turn out differently.

It’s been hard to smile lately.

女声 (CH)

我以为事情会有不同的结局。

最近真的很难笑出来。

Male (EN)

I really thought things would turn out differently.

It’s been hard to smile lately.

男声 (CH)

我以为事情会有不同的结局。

最近真的很难笑出来。

😠 Angry

Female (EN)

This is not what we agreed on!

I can’t believe this happened again!

女声 (CH)

这不是我们说好的!

我真的不敢相信又发生了!

Male (EN)

This is not what we agreed on!

I can’t believe this happened again!

男声 (CH)

这不是我们说好的!

我真的不敢相信又发生了!

😲 Surprised

Female (EN)

No way! I didn’t expect that at all!

Wow, this is unbelievable!

女声 (CH)

不会吧?我完全没想到!

哇,太不可思议了!

Male (EN)

No way! I didn’t expect that at all!

Wow, this is unbelievable!

男声 (CH)

不会吧?我完全没想到!

哇,太不可思议了!

😐 Neutral

Female (EN)

Let me know if you need anything else.

The meeting will start at 3 PM.

女声 (CH)

如果你需要什么,请告诉我。

会议将在下午三点开始。

Male (EN)

Let me know if you need anything else.

The meeting will start at 3 PM.

男声 (CH)

如果你需要什么,请告诉我。

会议将在下午三点开始。

Age Style Control

Prompt: A [Child/Teenager/YoungAdult/Adult] speaking [English/Chinese]

Child (EN)

I have a red balloon and a blue one too.

We read a story in class today, and it was fun.

儿童 (CH)

我有一个红气球,还有一个蓝色的。

我们今天在课堂上读了一个故事,很有趣。

Teenager (EN)

I think I did okay on the test, but I’m not sure.

We’re planning to hang out after school.

青少年 (CH)

我觉得考试还行,但不太确定。

我们放学后打算一起出去玩。

Young Adult (EN)

I’ll send you the report by the end of the day.

Let me check my schedule before confirming.

青年 (CH)

我今天之内会把报告发给你。

我确认一下时间再答复你。

Adult (EN)

The package should arrive by Friday.

Please let me know if you need further assistance.

成人 (CH)

包裹应该会在星期五送到。

如果还需要帮助,请随时告诉我。
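
The prompt templates used for the samples on this page can be enumerated programmatically. The following is a minimal sketch in Python that only builds the prompt strings; it does not call ParaStyleTTS, and the function and constant names are hypothetical, introduced here for illustration. The attribute values are copied from the templates above. Whether the model requires exactly these strings is not stated on this page (the abstract reports consistent style realization across varied prompt formulations).

```python
from itertools import product

# Attribute values copied from the prompt templates on this page.
LANGUAGES = ("English", "Chinese")
GENDERS = ("male", "female")
EMOTIONS = ("neutral", "sad", "happy", "angry", "surprise")
AGES = ("Child", "Teenager", "YoungAdult", "Adult")


def _check(value, allowed, name):
    """Raise ValueError if value is not one of the listed options."""
    if value not in allowed:
        raise ValueError(f"unknown {name}: {value!r}; expected one of {allowed}")


def emotion_gender_prompt(gender, language, emotion):
    """Fill the template 'Adult [gender] speaking [language] with [emotion] emotion'."""
    _check(gender, GENDERS, "gender")
    _check(language, LANGUAGES, "language")
    _check(emotion, EMOTIONS, "emotion")
    return f"Adult {gender} speaking {language} with {emotion} emotion"


def age_prompt(age, language):
    """Fill the template 'A [age] speaking [language]'."""
    _check(age, AGES, "age")
    _check(language, LANGUAGES, "language")
    return f"A {age} speaking {language}"


def all_emotion_gender_prompts():
    """Enumerate every gender/language/emotion combination (hypothetical helper)."""
    return [
        emotion_gender_prompt(gender, language, emotion)
        for gender, language, emotion in product(GENDERS, LANGUAGES, EMOTIONS)
    ]


def all_age_prompts():
    """Enumerate every age/language combination (hypothetical helper)."""
    return [age_prompt(age, language) for age, language in product(AGES, LANGUAGES)]


if __name__ == "__main__":
    for prompt in all_emotion_gender_prompts() + all_age_prompts():
        print(prompt)
```

Validating each attribute against a fixed tuple keeps the sketch limited to the style values demonstrated here; free-form prompts would bypass these helpers entirely.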