
Questions tagged [cross-validation]

Refers to general procedures that attempt to determine the generalizability of a statistical result. Cross-validation arises frequently in the context of assessing how well a particular model fit predicts future observations. Methods for cross-validation usually involve withholding a random subset of the data during model fitting, quantifying how accurately the withheld data are predicted, and repeating this process to get a measure of prediction accuracy.
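The hold-out-and-repeat procedure described above is exactly what k-fold cross-validation automates. A minimal sketch with scikit-learn, using synthetic data as a stand-in for a real dataset:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data (assumption: stands in for any real dataset).
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# 5-fold CV: each fold is withheld once while the model is fit on the
# remaining folds, and the withheld fold is scored; the mean of the
# per-fold scores estimates out-of-sample prediction accuracy.
scores = cross_val_score(Ridge(), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0),
                         scoring="r2")
print(scores.mean())
```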

2 votes · 1 answer · 77 views

400-instance dataset with XGBoost: is my model overfitting?

I'm working on a regression problem with 400 samples and 7 features, to predict job durations of machinery from historical data. I'm using XGBoost, and a (90, 10) split works better than an (80, 20) split. Is ...
asked by barcamela
1 vote · 0 answers · 28 views

Choosing the number of features via cross-validation

I have an algorithm that trains a binary predictive model for a specified number of features from the dataset (the features are all of the same type, but not all important). Thus, the number of features ...
asked by Roger V.
1 vote · 0 answers · 29 views

Can't understand the evaluation approach used in this paper

In this paper, two deep learning models were proposed: Hybrid-AttUnet++ and EH-AttUnet++. The first model, Hybrid-AttUnet++, is simply a modified U-net model, and the second model is an ensemble ...
asked by AAA_11
0 votes · 0 answers · 15 views

Error in plotting Gaussian Process for 3 models that use Bayesian Optimization

I'm writing a Python script for Orange Data Mining to plot the Gaussian processes in order to find the best hyperparameters for the 5-fold cross-validation accuracy metric. The three models are SVC, ...
asked by Mattma
0 votes · 0 answers · 33 views

How to properly implement Random Undersampling during Cross-Validation in Orange

I am working on a highly imbalanced fraud detection dataset (class 0: 284,315 instances; class 1: 492 instances) and trying to implement random undersampling correctly during cross-validation in Orange. ...
asked by Mattma
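The usual pitfall the question above alludes to is undersampling before splitting, which leaks information across folds. A sketch of the correct ordering in plain scikit-learn (toy data as a hypothetical stand-in for the fraud dataset; Orange's own widgets are not used here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
# Toy imbalanced data: ~5% positives (assumption, not the real dataset).
X = rng.normal(size=(2000, 5))
y = (rng.random(2000) < 0.05).astype(int)
X[y == 1] += 1.5  # make the minority class learnable

scores = []
for train_idx, test_idx in StratifiedKFold(
        n_splits=5, shuffle=True, random_state=0).split(X, y):
    # Undersample the majority class *inside* the training fold only,
    # so every test fold keeps the original class distribution.
    tr_pos = train_idx[y[train_idx] == 1]
    tr_neg = rng.choice(train_idx[y[train_idx] == 0],
                        size=len(tr_pos), replace=False)
    tr = np.concatenate([tr_pos, tr_neg])
    clf = LogisticRegression().fit(X[tr], y[tr])
    scores.append(f1_score(y[test_idx], clf.predict(X[test_idx])))
print(float(np.mean(scores)))
```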
3 votes · 1 answer · 73 views

Is it statistically wrong to adjust for sex and race and then do subgroup analysis based on them?

I am doing a subgroup analysis of early mortality (outcome) based on transfusion (WITH ADJUSTMENT for both ...
asked by Mohamed Rahouma
3 votes · 1 answer · 140 views

Need advice regarding cross-validation to obtain optimal lambda in Lasso

I am comparatively new to machine learning, and any suggestions and coding corrections will be a great help. I am using Lasso for feature selection and want to select the lambda that produces the ...
asked by h_ihkam
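Selecting the Lasso penalty by cross-validation is built into scikit-learn's `LassoCV` (which calls the penalty `alpha` rather than lambda). A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 30 features, only 5 informative (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                       noise=5.0, random_state=0)

# LassoCV fits a path of alphas and picks the one with the best
# 5-fold cross-validated score; scaling first is standard practice.
model = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
model.fit(X, y)
print(model.named_steps["lassocv"].alpha_)
```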
7 votes · 1 answer · 141 views

Nested cross-validation pipeline and confidence intervals

I'm hoping someone can help me think through this. I've come across a lot of different resources on nested-cv, but I think I'm confused as to how to go about model selection and the appropriate ...
asked by molecularrunner
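The core of nested cross-validation is an inner loop that tunes hyperparameters and an outer loop that estimates the performance of the whole tuning procedure. A sketch with scikit-learn (the dataset and parameter grid are illustrative choices, not from the question):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Inner loop: GridSearchCV re-tunes C on each outer training fold.
inner = GridSearchCV(make_pipeline(StandardScaler(), SVC()),
                     {"svc__C": [0.1, 1, 10]}, cv=3)
# Outer loop: scores the tuned model on folds never seen during tuning,
# giving a near-unbiased estimate of generalization performance.
outer_scores = cross_val_score(inner, X, y, cv=5)
print(outer_scores.mean())
```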
3 votes · 1 answer · 51 views

When I use linear regression in machine learning, is variable selection the same as choosing tuning parameters?

I am a newbie in machine learning. After days of studying the ideas of machine learning, I have made some conclusions, which are below (I only consider supervised learning). Step 1: Data splitting ...
asked by Student coding
0 votes · 0 answers · 23 views

Is this a good way to use a separate validation set with k-fold cross-validation?

I am training a CNN, and I divided the dataset into 70% training set, 20% validation set, and 10% test set. What I want is to use this validation set for early stopping to avoid overfitting the model ...
asked by AAA_11
1 vote · 0 answers · 88 views

XGBoost CV confusion on how to choose eval set

If I am using XGBoost with GridSearchCV, how should I choose my evaluation set? Note, I am referring to eval_set within the model params. My current implementation is using GridSearchCV in order to ...
asked by user54565
1 vote · 0 answers · 47 views

What is the standard ML pipeline for training and testing? [closed]

I have a dataframe containing 1324 rows and 28 columns and I'm kinda lost on which approach to go for when training regression models. Currently I perform a data split and run GridSearchCV to pick the ...
asked by Davi Magalhães
1 vote · 1 answer · 18 views

CV-kNN performing worse than kNN

I have been writing some code which compares some basic classifiers. Just wondering if CV-kNN can perform worse than regular kNN when checking performance on test data? We train the models using a ...
asked by scruby
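It can: cross-validation picks the k with the best average validation score, which is not guaranteed to win on any one small test set. A sketch comparing a fixed-k kNN against a CV-tuned one (dataset and grid are illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fixed default k versus k chosen by 5-fold CV on the training set.
plain = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
tuned = GridSearchCV(KNeighborsClassifier(),
                     {"n_neighbors": list(range(1, 21))},
                     cv=5).fit(X_tr, y_tr)
# On a small held-out set, either model can come out ahead by chance.
print(plain.score(X_te, y_te), tuned.score(X_te, y_te))
```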
1 vote · 0 answers · 102 views

Confused about use of random states for training models in scikit

I am new to ML and currently working on improving the accuracy of an MLPClassifier in scikit. My code looks like so ...
asked by Leandro
1 vote · 1 answer · 35 views

XGBoost: find hyperparameters and then cross-validation

I want to train an XGBoost model, and here's how I believe the process should go: Step 1: Find the optimal hyperparameters using GridSearchCV. Step 2: Evaluate the selected parameters. My question is: ...
asked by Math_D
