I'm migrating from Python 3.7.3 to Python 3.10 because the older version is deprecated, and loading a pickled scikit-learn DecisionTreeClassifier model now fails.
Environment:
Original: Python 3.7.3, scikit-learn 0.23.1
Current: Python 3.10, scikit-learn 1.3.2
Problem: When loading the pickled model, I receive the following error:
    ValueError: node array from the pickle has an incompatible dtype:
    - expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
    - got     : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]
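To make the mismatch easier to see, here is a small sketch that compares the field names copied from the two dtype descriptions in the message above. The variable names are my own, and it only inspects the names, not the formats or offsets. It suggests the stored node array lacks a single field that the installed scikit-learn expects.

```python
# Field names copied from the "expected" part of the error message.
expected_fields = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples", "missing_go_to_left",
]

# Field names copied from the "got" part of the error message.
got_fields = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples",
]

# Fields the new environment expects but the pickle does not contain.
missing = [name for name in expected_fields if name not in got_fields]
# Fields the pickle contains but the new environment does not know.
extra = [name for name in got_fields if name not in expected_fields]

print("missing from pickle:", missing)  # ['missing_go_to_left']
print("unexpected in pickle:", extra)   # []
```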
Code:
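    import pickle

    # models, df, and X are defined earlier in the script (not shown)
    for m in models:
        file = 'finalized_model_' + m + '.sav'
        # Open the file in a context manager so the handle is closed
        with open(file, 'rb') as f:
            loaded_model = pickle.load(f)
        df[m] = loaded_model.predict_proba(X)[:, 1]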
Attempted Solution: I've tried to mitigate this issue by loading and re-saving the model using a higher protocol:
    import pickle

    # Load the model
    with open("path_to_old_model.pkl", 'rb') as file:
        model = pickle.load(file)

    # Re-save using a higher protocol
    with open("path_to_updated_model.pkl", 'wb') as file:
        pickle.dump(model, file, protocol=pickle.HIGHEST_PROTOCOL)
However, after upgrading the environment to Python 3.10 and scikit-learn 1.3.2, I still get the same error when loading the re-saved model. Is there a way to load these models in the newer environment without losing their functionality? Any guidance would be greatly appreciated.
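A note on the re-save attempt: as far as I understand, the pickle protocol only changes how the bytes are encoded, not which state is stored, so a higher protocol would not be expected to add a missing field. The sketch below illustrates this with the standard library only. The dictionary is a hypothetical stand-in for the tree's stored state, not scikit-learn's actual internal representation, and the node values are made up.

```python
import pickle

# Hypothetical stand-in for the stored tree state: seven fields per node,
# matching the "got" dtype in the error (no 'missing_go_to_left').
old_state = {
    "fields": ["left_child", "right_child", "feature", "threshold",
               "impurity", "n_node_samples", "weighted_n_node_samples"],
    "nodes": [(1, 2, 0, 0.5, 0.48, 10, 10.0)],  # made-up values
}

# Serialize the same object with an older and the newest protocol.
blob_low = pickle.dumps(old_state, protocol=2)
blob_high = pickle.dumps(old_state, protocol=pickle.HIGHEST_PROTOCOL)

restored_low = pickle.loads(blob_low)
restored_high = pickle.loads(blob_high)

# The byte streams differ, but the restored content is identical.
bytes_differ = blob_low != blob_high
same_content = restored_low == restored_high == old_state
field_added = "missing_go_to_left" in restored_high["fields"]

print(bytes_differ, same_content, field_added)
```

If that reasoning holds for the real model, the mismatch would come from the scikit-learn version that wrote the pickle rather than from the protocol used to save it.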