Questions tagged [model-evaluations]
Use this tag for questions about how to evaluate a model's performance, not only with standard metrics but also in the context of real-world use cases. What counts as a good model can depend on many factors, all of which must be taken into account to build genuinely useful data science applications.
370 questions
2 votes
0 answers
45 views
How to evaluate a new policy given a historical dataset?
Suppose I have a dataset where, for each observation, we observe the loan's interest rate and whether the customer defaulted (i.e., failed to repay the loan). The interest rate is determined by a ...
2 votes
0 answers
15 views
Evaluation of token importance attribution based on human rationales
I am working on evaluating an explainability method for a text classification model that predicts whether a given text sequence contains hate speech or not. The method outputs token-level importance ...
2 votes
0 answers
37 views
Evaluating model performance when used in targeting decisions
I have a logistic regression model, the output of which is used to make decisions. I am testing an improved version of this model. In testing, it has substantially improved log loss vs. the old model. When ...
1 vote
1 answer
27 views
Evaluation of model on imperfect validation set
I would like help with evaluating my classification model. It is a typical model that, for each input, produces a vector of floats representing label probabilities, and I classify the ...
0 votes
0 answers
22 views
Data Exploration - Uneven Sampling Frequency
I apologize in advance for the noob question; this is the first ML project that I have attempted, although I have some stats background. I am in the data exploration phase for a project, where I am ...
0 votes
0 answers
11 views
Getting low accuracy while using QSVM
I am trying to predict the weather using QSVM. The dataset I am using can be seen here: https://www.kaggle.com/datasets/muthuj7/weather-dataset I am using ZZFeatureMap and a linear quantum kernel. ...
5 votes
3 answers
82 views
Same validation curves for training and test dataset
I am learning machine learning by myself. I am applying logistic regression to the Weather Forecast dataset from Kaggle (Weather_data). The goal is to predict Rain according to the given features and the ...
1 vote
0 answers
41 views
Is overfitting always bad?
I have trained my model for the first time and run inference on random images. When I tried a random image with a camera position similar to those in my dataset, it did well at detecting the river. But when it’s ...
1 vote
0 answers
18 views
Match between Regression prediction and human-produced guesses for upper and lower threshold
I have a database containing numerical data about products. I use different models to predict the value of a feature, e.g. the battery capacity of a laptop, given other features, such as size, ...
2 votes
2 answers
141 views
Random Forest always predicting the majority class
I'm predicting disease outcome using biological data (metabolites plus covariates age, sex and BMI). The outcome is a binary variable and moderately imbalanced (~12% positive cases). I have a ...
2 votes
1 answer
21 views
How do you actually evaluate a retrieval system?
Say I have a knowledge base. I split it, generated question-answer pairs with qa_generator, and filtered them with a qa_critic, so I have questions, answers, and contexts. Now while building a RAG ...
0 votes
0 answers
23 views
Measure recall or false negatives rate in a very unbalanced data set
We want to measure recall (or the false negative rate) for our machine learning model. The problem is that the positive case occurs in <0.1% of all cases and we can't afford to annotate the ...
2 votes
2 answers
70 views
Is there an evaluation metric for (time series) regression that evaluates how accurate the shape of the "curve" is?
I am trying to predict a value y. I am mainly interested in when its peaks occur and how the general curve for a day will look. It's less important that the actual predicted values are correct. A bit ...
0 votes
1 answer
45 views
Potential Sign Issues in a Composite Performance Metric for Model Selection
I am analyzing the results of various machine learning models for a regression task, using four metrics: RMSE, MAE, MAPE, and $R^2$. My approach involves two types of analyses: Individual Metric ...
0 votes
0 answers
15 views
How to set up train/test and evaluation to compare multiple recommender models?
Long story short, I need to compare the performance of various recommender models on a dataset. The data contains users and their ratings of some items. I need to compare various approaches (collaborative ...