Highest scored 'scikit-learn+python+cross-validation' questions

33 votes

2 answers

62k views

How to calculate the fold number (k-fold) in cross validation?

I am confused about how to choose the number of folds (in k-fold CV) when I apply cross validation to check the model. Does it depend on data size or other parameters?

Taimur Islam

951

asked Feb 22, 2018 at 5:23

11 votes

1 answer

4k views

Why you shouldn't upsample before cross validation

I have an imbalanced dataset and I am trying different methods to address the data imbalance. I found this article that explains the correct way to cross-validate when oversampling data using SMOTE ...

sums22

447

asked Sep 22, 2020 at 11:40
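
As a rough illustration of the point in this question's title, here is a minimal sketch that resamples only the training portion of each fold, so no duplicated minority sample can leak into the held-out fold. It swaps in plain random oversampling with replacement (`sklearn.utils.resample`) for the SMOTE approach the asker mentions; the synthetic dataset, the `upsample_minority` helper, and the choice of `LogisticRegression` are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold
from sklearn.utils import resample

# Synthetic imbalanced data (roughly 90% / 10%); stands in for a real dataset.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)


def upsample_minority(X_tr, y_tr, seed):
    """Hypothetical helper: randomly duplicate minority rows until classes match."""
    minority = np.bincount(y_tr).argmin()
    is_min = y_tr == minority
    X_up, y_up = resample(
        X_tr[is_min], y_tr[is_min],
        replace=True, n_samples=int((~is_min).sum()), random_state=seed,
    )
    return np.vstack([X_tr[~is_min], X_up]), np.concatenate([y_tr[~is_min], y_up])


scores = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, y)):
    # Oversample AFTER the split, using training rows only.
    X_bal, y_bal = upsample_minority(X[train_idx], y[train_idx], seed=fold)
    model = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)
    # The held-out fold keeps its original, untouched class distribution.
    scores.append(f1_score(y[test_idx], model.predict(X[test_idx])))

print(np.mean(scores))
```

Upsampling the whole dataset before splitting would instead place copies of the same minority rows on both sides of a split, which tends to inflate the cross-validated score.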

10 votes

3 answers

11k views

Nested cross-validation and selecting the best regression model - is this the right SKLearn process?

If I understand correctly, nested-CV can help me evaluate what model and hyperparameter tuning process is best. The inner loop (GridSearchCV) finds the best ...

BobbyJohnsonOG

103

asked Aug 4, 2016 at 1:28

9 votes

1 answer

5k views

What is GridSearchCV doing after it finishes evaluating the performance of parameter combinations that takes so long?

I'm running GridSearchCV to tune some parameters. For example: ...

Dan Scally

1,784

asked Feb 19, 2019 at 12:48

7 votes

2 answers

6k views

Is there a way of performing stratified cross validation using xgboost module in python?

I am training and predicting on the same data-set, but I want to perform 10-fold cross-validation and predict on the left out fold and thus predict on the whole data set. How can I do this? The ...

Ved Gupta

191

asked Aug 20, 2015 at 9:53
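
One possible way to get the out-of-fold predictions this question describes is scikit-learn's `cross_val_predict` combined with `StratifiedKFold`. The sketch below is an assumption-laden illustration, not the accepted answer: it uses a synthetic dataset and `LogisticRegression` as a stand-in estimator. An estimator that follows the scikit-learn interface, such as xgboost's `XGBClassifier`, could be passed in the same position.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic data stands in for the asker's dataset (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# 10 stratified folds: each fold keeps roughly the overall class ratio.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Every sample is predicted by a model fitted on the other nine folds,
# so the result covers the whole dataset without training on the test rows.
oof_pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=cv)

print(oof_pred.shape)
```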

6 votes

2 answers

38k views

How to implement Python's MLPClassifier with gridsearchCV?

I am trying to implement Python's MLPClassifier with 10-fold cross-validation using the GridSearchCV function. Here is a chunk of my code: ...

zx mnb

63

asked Jun 16, 2017 at 14:13

4 votes

2 answers

26k views

Found input variables with inconsistent numbers of samples

I would appreciate it if you could let me know how to resolve this error: Code: ...

ebrahimi

1,305

asked Jan 31, 2017 at 19:59

4 votes

0 answers

92 views

Does ROC AUC different between crossval and test set indicate overfitting or other problem?

I am training a composite model (XGBoost, Linear Regression, and RandomForest) to predict injured people probability. Well, the results of cross-validation with 5 folds. Well, I can see any problem ...

GregOliveira

116

asked Sep 13, 2022 at 13:52

3 votes

2 answers

1k views

Understanding Sklearns learning_curve

I have been using sklearn's learning_curve, and there are a few questions I have that are not answered by the documentation (see also here and here), as well as questions that are raised by the ...

Abijah

181

asked Oct 12, 2021 at 10:04

3 votes

1 answer

4k views

GridSearchCV results are different to directly applied default model (SVM)

I ran a Support Vector Machine model on part of my train set with the following result: ...

Mateusz Konopelski

265

asked May 31, 2018 at 14:48

2 votes

2 answers

455 views

Advice and Ideas appreciated - Machine Learning one man project

I have a project where I am supposed to start from scratch and learn how machine learning works. So far everything is working out better than expected, but I feel as if I am offered too many ways to choose ...

CRoNiC

147

asked Oct 2, 2019 at 10:57

2 votes

1 answer

2k views

Validation curve unlike SKLearn sample

I'm trying to implement the validation curve based on this SKLearn tutorial. On the site, it shows how based on the parameters the model goes from under- to overfitted, finding the optimal parameter ...

lte__

1,379

asked Jan 22, 2018 at 12:58

2 votes

1 answer

1k views

Cross Validation for Different Metrics - Sklearn

When I am doing cross validation using Python's Sklearn and take the score of different metrics (accuracy, precision, etc.) like this: ...

Akhmad Zaki

141

asked Mar 12, 2018 at 18:07

2 votes

1 answer

4k views

validation_curve differs from cross_val_score?

I'm trying to see how well a decision tree classifier performs on my input. For this I'm trying to use the validation and learning curves and SKLearn's cross-validation methods. However, they differ, ...

lte__

1,379

asked Jan 23, 2018 at 12:25

1 vote

3 answers

18k views

Leave one out Cross validation using sklearn (Multiple CSV)

I have 52 CSV files in a folder. I want to build a model based on this data, so I want to run leave-one-out cross-validation on it. How can I do this using scikit-learn in Python? I ...

Bloodstone Programmer

300

asked Apr 21, 2018 at 8:21
