Modern .NET 10 / C# 14 library that normalizes text (emojis, currency, numbers, abbreviations, chat slang) for consistent, natural-sounding Text-to-Speech (TTS) synthesis, aimed at stream chat and donation messages.

Agash/TTSTextNormalization

TTSTextNormalization

.NET library for normalizing user-generated text before Text-to-Speech playback (chat, donations, comments, alerts), so engines pronounce content more consistently.

Targets

  • net10.0 (primary)
  • net9.0

Install

dotnet add package Agash.TTSTextNormalization

What You Get

  • Emoji normalization (including ZWJ sequences and flags)
  • Currency normalization ($10.50, EUR 100, etc.) to spoken forms
  • Number normalization (cardinal, ordinal, decimal, multi-dot/version-style)
  • URL replacement via configurable placeholder text
  • Abbreviation expansion for common chat/gaming terms (lol, brb, gg, ...)
  • Basic sanitization of control characters and punctuation variants
  • Cleanup for excessive punctuation and repeated letters
  • Final whitespace and punctuation spacing normalization
  • DI-first, ordered pipeline via ITextNormalizationRule
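
As a rough illustration of the "excessive punctuation" and "repeated letters" cleanup listed above, here is a minimal, self-contained sketch using plain regular expressions. It does not use this library's types. The helper names (CollapseLetters, CollapsePunctuation) and the thresholds are assumptions made for the example, and the library's actual rules may behave differently.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: shrink runs of 3+ identical letters down to 2 ("sooooo" -> "soo").
// The threshold is an assumption for this sketch, not the library's setting.
string CollapseLetters(string text) =>
    Regex.Replace(text, @"(\p{L})\1{2,}", "$1$1");

// Hypothetical helper: shrink runs of the same punctuation mark to a single mark ("!!!" -> "!").
string CollapsePunctuation(string text) =>
    Regex.Replace(text, @"([!?.,])\1+", "$1");

string cleaned = CollapsePunctuation(CollapseLetters("OMG!!! sooooo goooood???"));
Console.WriteLine(cleaned); // OMG! soo good?
```

The backreference `\1` is what ties each run to a single repeated character, so mixed sequences such as "?!" are left alone in this sketch.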

Quick Start (DI)

using Microsoft.Extensions.DependencyInjection;
using TTSTextNormalization.Abstractions;
using TTSTextNormalization.DependencyInjection;
using TTSTextNormalization.Rules;

ServiceCollection services = new();

// Configure per-rule options before registering the pipeline.
services.Configure<UrlRuleOptions>(o => o.PlaceholderText = " website link ");
services.Configure<EmojiRuleOptions>(o => o.Suffix = "emoji");

// Register the rules that make up the normalization pipeline.
services.AddTextNormalization(builder =>
{
    builder.AddBasicSanitizationRule();
    builder.AddEmojiRule();
    builder.AddCurrencyRule();
    builder.AddAbbreviationNormalizationRule();
    builder.AddNumberNormalizationRule();
    builder.AddExcessivePunctuationRule();
    builder.AddLetterRepetitionRule();
    builder.AddUrlNormalizationRule();
    builder.AddWhitespaceNormalizationRule();
});

// Resolve the normalizer and run it on a sample chat message.
ServiceProvider provider = services.BuildServiceProvider();
ITextNormalizer normalizer = provider.GetRequiredService<ITextNormalizer>();

string input = "OMG!!! that stream was 🔥🔥🔥 $10.50 www.example.com";
string output = normalizer.Normalize(input);

Notes

  • This library normalizes text only. It does not provide TTS playback itself.
  • Rule ordering is configurable; defaults are designed for chat-like inputs.
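
To show why rule ordering matters in a pipeline like this, here is a small self-contained sketch. It deliberately avoids the library's API: currencyRule, numberRule, and Run are hypothetical stand-ins built on toy regexes, not the library's implementations. It only demonstrates that the same two rules give different results depending on which runs first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in rules; these are NOT the library's implementations.
Func<string, string> currencyRule = text =>
    Regex.Replace(text, @"\$(\d+)\.(\d{2})", "$1 dollars $2 cents");

Func<string, string> numberRule = text =>
    Regex.Replace(text, @"(\d+)\.(\d+)", "$1 point $2");

// Apply rules left to right, feeding each rule's output into the next.
static string Run(string input, IEnumerable<Func<string, string>> rules) =>
    rules.Aggregate(input, (current, rule) => rule(current));

string currencyFirst = Run("tip $10.50", new[] { currencyRule, numberRule });
string numberFirst = Run("tip $10.50", new[] { numberRule, currencyRule });

Console.WriteLine(currencyFirst); // tip 10 dollars 50 cents
Console.WriteLine(numberFirst);   // tip $10 point 50
```

When the toy number rule runs first, it consumes the decimal part before the currency rule can see it, leaving a stray "$" in the output. A chat-oriented default order would plausibly be chosen to avoid this kind of interference.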

Build

dotnet restore
dotnet build -c Release
dotnet test -c Release

Contributing

Pull requests are welcome. If a change alters behavior, add or update tests in TTSTextNormalization.Tests.

License

MIT. See LICENSE.txt.
