A Persian grapheme-to-phoneme (G2P) model designed for homograph disambiguation, fine-tuned using the HomoRich dataset to improve pronunciation accuracy.
Updated May 20, 2025 - Jupyter Notebook
Benchmarking notebooks that compare the performance of various Persian G2P models, including Homo-GE2PE and Homo-T5, on the SentenceBench dataset.
HomoRich: The first large-scale Persian homograph dataset for G2P conversion, featuring 528K annotated sentences with balanced pronunciation variants and dual phoneme representations.