Persian G2P Tools Benchmark

This repository contains benchmarking notebooks for several Persian grapheme-to-phoneme (G2P) models. It covers baseline models as well as the Homo-GE2PE and Homo-T5 models proposed in the study "Fast, Not Fancy: Rethinking G2P with Rich Data and Rule-Based Models". All benchmarks use the SentenceBench Persian G2P Benchmark.

Repository Structure

benchmarking-scripts/
├── Benchmark_AzamRabiee_Persian_G2P.ipynb
├── Benchmark_GE2PE.ipynb
├── Benchmark_HomoFast_eSpeak.ipynb
├── Benchmark_Homo_GE2PE.ipynb
├── Benchmark_Homo_T5.ipynb
├── Benchmark_PasaOpasen_PersianG2P.ipynb
├── Benchmark_de_mh_persian_phonemizer.ipynb
├── Benchmark_dmort27_epitran.ipynb
├── Benchmark_eSpeak_NG.ipynb
├── Benchmark_mohamad_hasan_sohan_ajini_G2P.ipynb
└── Benchmark_sajadalipour7_Persian_Grapheme_To_Phoneme_With_Transformer.ipynb

Each notebook benchmarks a specific model using the SentenceBench dataset. The results of each run (5 independent runs per model) are documented in the last markdown cell of each notebook.
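As a rough illustration of how the per-run results of one notebook might be combined into a single figure per metric, here is a minimal sketch. The function name `average_runs`, the metric keys, and the numbers are all hypothetical placeholders, not values or code taken from the notebooks.

```python
from statistics import mean


def average_runs(runs):
    """Average each metric over a list of per-run result dicts.

    Assumes every run reports the same metric keys (an assumption of this
    sketch, not something the notebooks guarantee).
    """
    metrics = runs[0].keys()
    return {m: mean(run[m] for run in runs) for m in metrics}


# Placeholder numbers for illustration only; they are not benchmark results.
example_runs = [
    {"per": 4.0, "homograph_acc": 70.0, "inference_time": 0.50},
    {"per": 6.0, "homograph_acc": 80.0, "inference_time": 0.70},
]

if __name__ == "__main__":
    for metric, value in average_runs(example_runs).items():
        print(f"{metric}: {value:.4f}")
```

In practice the five per-run values listed in a notebook's last markdown cell would take the place of `example_runs`.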

Benchmarking Results

The table below reports each model's phoneme error rate (PER), homograph accuracy, and average inference time, each averaged across 5 runs:

Model	PER (%)	Homograph Acc. (%)	Avg. Inf. Time (s)
Persian_G2P	35.23	21.23	11.1374
PersianG2P	15.04	37.74	2.1686
persian_phonemizer	25.27	29.25	0.1803
Epitran	45.12	0.00	0.0003
Persian G2P	19.63	29.91	28.0039
Persian_Grapheme_To_Phoneme	12.85	40.00	0.9685
eSpeak NG	6.92	43.87	0.0169
GE2PE	4.81	47.17	0.4464
HomoFast eSpeak	6.33	74.53	0.0084
Homo-T5	4.12	76.32	0.4141
Homo-GE2PE	3.98	76.89	0.4473
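
For readers unfamiliar with the metrics in the table, the following sketch shows one common way to compute them. It assumes PER is the Levenshtein edit distance between predicted and reference phoneme sequences divided by the total reference length, and that homograph accuracy is the share of homographs whose predicted pronunciation matches the expected one. The exact definitions used by the notebooks may differ; the function names and the sample phoneme strings are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps) -> float:
    """PER in percent: total edits over total reference phonemes (assumed definition)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_len = sum(len(r) for r in refs)
    return 100.0 * total_edits / total_len


def homograph_accuracy(expected, predicted) -> float:
    """Percentage of homographs whose predicted pronunciation matches the expected one."""
    correct = sum(e == p for e, p in zip(expected, predicted))
    return 100.0 * correct / len(expected)


if __name__ == "__main__":
    # Made-up phoneme strings, one character per phoneme, for illustration only.
    refs = [list("salAm"), list("ketAb")]
    hyps = [list("salam"), list("ketAb")]
    print(phoneme_error_rate(refs, hyps))
    print(homograph_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"]))
```

Under these definitions a lower PER and a higher homograph accuracy are better.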

Contributions

Contributions and pull requests are welcome. Please open an issue to discuss the changes you intend to make or the models/tools you want to add to the benchmark.


MahtaFetrat/Persian-G2P-Tools-Benchmark
