https://github.com/tesseract-ocr/tesseract/blob/d3e50cfb0674574dad/src/ccutil/universalambigs.h#L116-L19037

Most of this file (lines 116 to 19037) looks like garbage generated by an unknown tool. Maybe we can remove everything from line 116 onward.