Install failure for 2.1.0 and 2.2.0 on Win10/Python 3.9 due to sklearn==0.23.1 #62

@CaseGuide

Running pip install lexnlp currently pulls 2.1.0 from PyPI. This fails to install on Win10/Python 3.9 and apparently on M1 MacBooks as well. Downloading the current master and installing from the zip runs into similar issues.

The cause is scikit-learn 0.23.1 failing to install due to changes made in numpy, which results in the error below even when a sufficient numpy version is already installed:

Importing the numpy c-extensions failed.
[...]
      ImportError: numpy is not installed.
      scikit-learn requires numpy >= 1.13.3.
      Installation instructions are available on the scikit-learn website: http://scikit-learn.org/stable/install.html

I was able to work around this and run two test examples from the docs (I haven't fully tested beyond that) by installing the current master with the requirements in setup.py set to the following:

...
python_requires='>=3.6',
...
        'cloudpickle==2.1.0',
        'dateparser==1.1.1',
        'gensim==4.1.2',
        'joblib==1.1.0',
        'nltk==3.7',
        'num2words==0.5.10',
        'numpy>=1.13.1',
        'pandas>=1.1.5',
        'pycountry==22.3.5',
        'regex==2022.3.2',
        'reporters-db==3.2.18',
        'requests==2.27.1',
        'scipy==1.8.1',
        'scikit-learn==0.24.2',
        'tzlocal==2.1',
        'tqdm>=4.36.0',
        'Unidecode==1.3.4',
        'us==2.0.2',
        'zahlwort2num==0.3.0'

Can I suggest using less rigid requirements? This package is often going to be used as part of a larger workflow, and rigid pinning not only causes install issues once those deps start to age (sklearn 0.23.1 is 2 years old), it also unnecessarily forces your package to be the driver of install requirements for the system it's a part of.

EDIT: This doesn't work, as there are breaking changes from sklearn 0.23.1 -> 0.24. In particular, when loading the pickle from addresses.py, sklearn 0.24 throws the error:
ModuleNotFoundError: No module named 'sklearn.tree.tree'
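
For anyone wondering why a newer sklearn breaks an old pickle this way: a pickle stores the import path of each class, not the class itself, so a renamed or removed module fails at load time. Below is a minimal stdlib-only sketch of that failure and of the usual sys.modules alias workaround. The module names demo_tree_old and demo_tree_new, the Node class, and the alias_module helper are all made up for illustration; nothing here touches sklearn or lexnlp. I have not tested whether an alias alone would be enough for the addresses.py pickle, since the pickled estimators' internal state may also differ between releases.

```python
import pickle
import sys
import types


def alias_module(old_name, target_module):
    """Expose target_module under old_name so pickles that recorded old_name still resolve."""
    sys.modules[old_name] = target_module


# Hypothetical class that "lives" in a module called demo_tree_old.
class Node:
    def __init__(self, value):
        self.value = value


Node.__module__ = "demo_tree_old"  # the path pickle will record
Node.__qualname__ = "Node"

old_module = types.ModuleType("demo_tree_old")
old_module.Node = Node
sys.modules["demo_tree_old"] = old_module

# Pickle while the old module path still exists.
payload = pickle.dumps(Node(3))

# Simulate a library upgrade: the old module is gone and the class has moved.
del sys.modules["demo_tree_old"]
new_module = types.ModuleType("demo_tree_new")
new_module.Node = Node
sys.modules["demo_tree_new"] = new_module

# Loading now fails the same way the sklearn pickle does.
try:
    pickle.loads(payload)
    failed_before_alias = False
except ModuleNotFoundError:
    failed_before_alias = True

# Workaround: register the new module under the old name, then load again.
alias_module("demo_tree_old", new_module)
restored = pickle.loads(payload)
```

The more robust fix is probably to re-train or re-export the bundled models against the sklearn version that ends up in the requirements, rather than relying on aliases.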
