Questions tagged [regression]
Techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.
1,585 questions
0 votes
0 answers
16 views
What does the Grenander condition imply about the data-generating process of $(y_i, x_i)$?
Consider a correctly specified linear model $$ y_i = x_i^\top \beta + \varepsilon_i,\quad i=1,\dots,n, $$ where the errors $\varepsilon_i$ are independent with zero mean and finite variance. ...
5 votes
1 answer
28 views
Wind Power Data Analysis - Python
I am seeking some help and/or perspectives in solving a problem. I have a dataset (accessible here) with the following columns: DATE: this is the date in dd/mm/yyyy format; HH: this is the "half-...
3 votes
1 answer
55 views
Why is MAE hard to optimize?
Numerous sources say that MAE has the disadvantage of not being differentiable at zero, and hence has problems with gradient-based optimization methods. However, I've never seen an explanation why ...
5 votes
1 answer
49 views
How to do Exploratory Data Analysis when my response variable is binary?
I am doing a multilevel regression, and my response variable is binary (presence of females on a tech board). All the EDA methods I know are about plotting correlation, but as this is a binary I ...
2 votes
0 answers
51 views
Reverse engineering what stocks are in a dummy ETF using regression (lasso, ridge, etc.) in Python
I'm trying to reverse engineer what stocks are in an ETF using Python. In my code, I create a fake ETF that is an equal weighting of 20 random stocks. I then try to reverse engineer what's in my ETF using ...
0 votes
0 answers
16 views
Performing piecewise polynomial regression on the Auto dataset (ISLR2)
I am trying to analyze some data from the Auto data set (from ISLR2). I am trying to fit a piecewise polynomial to the acceleration column. But every time I run my code, R throws this error: ...
0 votes
0 answers
11 views
Modeling a Continuous Variable Given Only Categorical Independent Variables
In my scenario, we have several categorical variables with multiple levels as predictors (X), and a continuous response variable (y). We have many observations of y for every possible combination of ...
2 votes
1 answer
53 views
Is a two-phase model (ensembling/stacking) a valid approach for forecasting product demand?
I am working on a project to forecast food sales for a corporate restaurant. Sales are heavily influenced by the number of guests per day, along with other factors like seasonality, weather conditions,...
0 votes
0 answers
27 views
CNN for gaze regression predicts near the mean
I am currently building my first CNN on my own for a regression task: the network must predict the screen coordinates I am looking at from an input image taken through my ...
0 votes
0 answers
22 views
Data Exploration - Uneven Sampling Frequency
I apologize in advance for the noob question; this is the first ML project I have attempted, although I have some stats background. I am in the data exploration phase for a project, where I am ...
1 vote
1 answer
48 views
I didn't scale all the features I used for prediction. Does that make sense?
In my regression-based machine learning project, I have features like coordinates (latitude and longitude) that I prefer not to scale or transform. The main reason is that reversing the transformation ...
0 votes
0 answers
23 views
Can a time series model self correct during prediction?
Suppose a time series model makes predictions for January to December and, let's say, in March the actual value is above the 95% confidence interval of the prediction. Can the model change its weights dynamically and ...
9 votes
3 answers
2k views
Regression model $R^2$ drops when I remove outliers: is that even possible?
I'm analyzing how outliers in my dataset of size 8x8000 affect regression models. I have three scenarios: raw dataset (with outliers), Winsorized dataset (2% of the extreme outliers adjusted), and ...
3 votes
1 answer
296 views
How to train a model to estimate the coefficients of a coupled ODE?
Consider the coupled ODE system below (Lotka-Volterra equations): $$ \frac{dx}{dt} = \alpha x - \beta x y, \\ \frac{dy}{dt} = - \gamma y + \delta x y , $$ How can I train a model to estimate the ...
1 vote
3 answers
71 views
What is the impact of low correlation on regression and classification problems, and how does it affect model performance?
I’m building two models (one for a regression problem and the other for a classification task) but I am facing low correlation in the data (lower in the classification problem than in the regression ...