Many algorithms in modern machine learning involve randomness, so running the same ML script several times can produce different outputs and therefore different accuracy results. For example, a Random Forest run can produce an accuracy of 0.78, and a second run with no change in the data, setup, or code can produce 0.79. This makes it impossible to perform controlled experiments when I am testing how changes in the input affect the output.
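For concreteness, here is a minimal sketch of the behaviour I mean (a toy dataset, so the exact numbers are only illustrative); the data split is seeded so that only the model's own randomness varies:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy data; the split itself is seeded so only the model's randomness varies.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two identical fits with no random_state -- the scores typically differ slightly.
for run in range(2):
    clf = RandomForestClassifier()  # random_state deliberately left unset
    clf.fit(X_train, y_train)
    print(f"run {run}: accuracy = {clf.score(X_test, y_test):.4f}")
```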
So, in order to perform perfectly controlled experiments and reach the best model output, what is the full set of random parameters I should fix? I want the whole process to be completely deterministic.
PS: I am using the scikit-learn environment with additional algorithms such as XGBoost, CatBoost, and LightGBM. I assume there are also some seeds / random_state parameters I should fix in NumPy; see the sketch below for what I have so far.
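This is roughly the seeding I have in mind so far. It is a minimal sketch, and I am not sure it is complete; the boosting-library parameter names are my assumption based on their scikit-learn-style wrappers:

```python
import random
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

SEED = 42

# Global RNGs that NumPy-based code (including scikit-learn) may draw from.
random.seed(SEED)      # Python's built-in random module
np.random.seed(SEED)   # NumPy's legacy global RNG

# Data splits have their own randomness, so I seed those too, e.g.:
# X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=SEED)

# Per-estimator seeds via each library's scikit-learn-style wrapper.
rf   = RandomForestClassifier(random_state=SEED)
xgb  = XGBClassifier(random_state=SEED)
lgbm = LGBMClassifier(random_state=SEED)
cat  = CatBoostClassifier(random_seed=SEED, verbose=0)
```

Is a set of seeds like this sufficient, or am I still missing sources of randomness?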