All Questions
Tagged with imbalanced-data or class-imbalance
602 questions
0votes
0answers
10views
How to Properly Use scale_pos_weight in an XGBoost MultiOutput Classifier to Address Severe Class Imbalance?
I'm working on predicting two genetic mutations simultaneously using an XGBoost Multioutput Classifier. My dataset is severely imbalanced, particularly for cases where both genetic mutations are ...
2votes
1answer
79views
Why do we need SMOTE?
We use SMOTE to balance an imbalanced dataset, but why manipulate the data instead of using it as it naturally occurs? What is the need for balancing, and what exact impact does it have on the model?
0votes
0answers
23views
Question on Optimized Threshold in Predictive Modeling
I'm trying to build a predictive model, but I haven't found a method that consistently delivers high performance. Is it acceptable to use an optimized classification threshold of 0.996?
1vote
2answers
33views
What do these train and test accuracy and loss graphs suggest? Can train and test accuracy reach 80% after one epoch?
This is the accuracy and loss plot for a CNN model. Is it possible for train and test accuracy to start at 80% from the first epoch with 5-fold cross-validation?
0votes
0answers
33views
How to properly implement random undersampling during cross-validation in Orange
I am working on a highly imbalanced fraud detection dataset (class 0: 284315 instances, class 1: 492 instances) and trying to implement random undersampling correctly during cross-validation in Orange. ...
1vote
3answers
142views
ROC vs PR-score and imbalanced datasets
I see it stated everywhere that when the dataset is imbalanced, PR-AUC is a better performance indicator than ROC. From my experience, if the positive class is the most important, and there is higher ...
2votes
1answer
69views
Taking into account instance cost in learning?
I am generally trying to take costs into account in learning. The set-up is as follows: a statistical learning problem with the usual X and y, where y is imbalanced (roughly 1% of ones). Scikit-learn ...
3votes
1answer
90views
When is using class weights bad?
I have a DB with 50 different classes. One of the classes has 10x more data than the other classes. Each class has ~20K samples and the 'big' class has ~200K samples. When training a classification model ...
0votes
0answers
17views
How to improve LSTM model performance for weather prediction?
I predict rainfall using observational data. There are a total of 87,070 data samples, but only 1,885 samples have rainfall. And here is the LSTM model I am using: ...
0votes
1answer
25views
Poor performance for two classes in a multi-class classification
I have a multi-class classification problem with 5 classes (tabular data). I used an XGBoost model; it scores well for 3 classes but poorly for the remaining 2 classes. I tried up-sampling and class ...
0votes
0answers
30views
Using ResNet50 with SE block on imbalanced data - PyTorch
I worked with a breast cancer ultrasound image dataset containing 432 benign cases, 210 malignant cases, and 133 normal cases. Initially, I used a pretrained ResNet-50 model, which yielded the ...
0votes
2answers
40views
Imbalanced class in my dataset
I’m working with an imbalanced dataset to predict strokes, where the positive class (stroke occurrence) is significantly underrepresented. Initially, I used logistic regression, but due to the class ...
0votes
0answers
43views
How to handle imbalanced edge weights in a graph for node embedding and edge weight prediction?
I have an undirected weighted graph where the edge weights represent probabilities. The majority of the edge weights are 1 (7 times more frequent than the second most common weight). I'm ...
-1votes
1answer
31views
Why am I getting better results with undersampling than with class weight modification for binary classification?
I am getting better results with undersampling than with class weight modification. What could be the reason?
0votes
1answer
40views
SVC labels entire sample as majority class, even after using ADASYN
I have an imbalanced sample (850 in group X vs 100 in group Y). I am trying to predict group membership using support vector classification. I am using 'Adaptive Synthetic' (ADASYN) to oversample the ...