Questions tagged [normalization]
Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to use a common scale, without distorting differences in the ranges of values or losing information.
303 questions
1 vote
0 answers
19 views
Rolling z-score and normalizing
I am using a rolling window z-score method to flag if a record is an outlier. Is it necessary to first normalize the values of the desired feature before computing the rolling z-score?
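One way to check this empirically is to compute the rolling z-score on both the raw values and a min-max normalized copy and compare the results. The sketch below is only an illustration: the sample data, the window of 4, the threshold of 3, and the choice to score each point against the window of values before it are all assumptions, not details from the question. Because a z-score already subtracts the window mean and divides by the window standard deviation, a prior linear rescaling should leave the scores unchanged.

```python
import statistics

def rolling_zscores(values, window):
    """Z-score of each point relative to the `window` points that precede it."""
    scores = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.fmean(history)
        std = statistics.pstdev(history)
        # A flat window has zero spread; report 0.0 instead of dividing by zero.
        scores.append((values[i] - mean) / std if std > 0 else 0.0)
    return scores

# Hypothetical feature values with one obvious spike.
raw = [10.0, 11.0, 9.5, 10.5, 10.2, 25.0, 10.1, 9.9]

# Min-max normalized copy of the same series.
lo, hi = min(raw), max(raw)
scaled = [(v - lo) / (hi - lo) for v in raw]

z_raw = rolling_zscores(raw, window=4)
z_scaled = rolling_zscores(scaled, window=4)

# Flag points whose rolling z-score exceeds an (assumed) threshold of 3.
flags = [abs(z) > 3 for z in z_raw]
```

In this sketch `z_raw` and `z_scaled` agree up to floating-point error, so the same record is flagged either way. A nonlinear transform (for example a log) would change the scores, so this only speaks to linear normalization.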
3 votes
1 answer
64 views
Using standardization and normalization in the same pipeline
I have a PySpark ML pipeline that uses PCA reduction and an ANN. My understanding is that PCA performs best when given standardized values, while NNs perform best when given normalized values. Does it ...
0 votes
1 answer
45 views
Is normalization required before outlier detection?
When working with machine learning or data preprocessing, the order of operations is crucial for accurate results. One common question is: Should normalization or standardization be applied before ...
1 vote
1 answer
57 views
Combining standardizing and normalizing my input data for ML gives the best results, why?
When I combine standardizing and normalizing my input data for my hybrid ANN model, it gives the best results, but I can't find an explanation of why anywhere. I based it on a paper's approach but they don't justify ...
4 votes
1 answer
52 views
Help on data transformation
I have reaction time as a dependent variable and age as an independent variable. I want to do a linear mixed model analysis. My data is not normally distributed. Do I have to transform the data? I ...
2 votes
0 answers
13 views
How should I input and output feature and target time series for a time series transformer?
I am trying out the PatchTST time series transformer (paper, code) on time series data that I have. The way PatchTST handles data is as follows: Note that on lines 78-79, the repo does the following: ...
0 votes
2 answers
50 views
Converting a float vector to int
I have a vector with a float data type, mainly float32. The zero value is the baseline, a.k.a. the mean/center of the data. It has shape (batch_size, seq_len). I want to ...
1 vote
1 answer
36 views
Should normalization be applied to interaction features?
I am working with interaction features in my machine learning model, where I create new features by multiplying a numeric variable with an encoded categorical feature. My question is: Should ...
1 vote
1 answer
35 views
Time series Data Scaling - per individual or combined?
I have data on many cars over time (a few years per car). I am planning on creating a model for all the cars combined (not one model per car). Do I want to scale the data (normalize / standardize) ...
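The two options can be compared side by side on a toy example. The sketch below uses made-up readings for two hypothetical cars (`car_a`, `car_b`) whose values sit on very different levels; the data and names are assumptions for illustration only, and it shows standardization rather than min-max scaling.

```python
import statistics
from collections import defaultdict

# Hypothetical (car_id, reading) rows; not taken from the question.
rows = [
    ("car_a", 100.0), ("car_a", 110.0), ("car_a", 120.0),
    ("car_b", 10.0), ("car_b", 11.0), ("car_b", 12.0),
]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Option 1: one scaler fitted on all cars combined.
global_z = standardize([value for _, value in rows])

# Option 2: a separate scaler fitted per car.
by_car = defaultdict(list)
for car, value in rows:
    by_car[car].append(value)
per_car_z = {car: standardize(values) for car, values in by_car.items()}
```

With combined scaling, every `car_a` reading ends up positive and every `car_b` reading negative, so the difference in level between cars is preserved. With per-car scaling, both cars map to the same three values, so only each car's own variation over time remains. Which behavior is preferable depends on whether the level difference between cars is signal or nuisance for the model.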
0 votes
0 answers
28 views
What does it mean to normalize a time series signal against another?
I'm looking at ways to reduce the dimensions of a multivariate data set to a univariate signal. But some preprocessing needs to be done first. Someone mentioned that I should combine the signals by ...
0 votes
0 answers
16 views
Preparing a gaming dataset: one-hot encoding vs. min-max normalization for card ids
I have a dataset for a game. 5 player cards with ids for player 1, 5 player cards with ids for player 2. Column names are like player1_card1_id, ..., player1_card5_id, player2_card1_id, ..., ...
0 votes
0 answers
17 views
Understanding SKLearn's normalization
Whenever I see someone talking about normalization, they usually talk about scaling a feature based on the feature's range, meaning that for a given feature $x$: $$x' = \frac{x - x_{min}}{x_{max} - x_{...
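Since the excerpt is cut off, the following is only a guess at the distinction being asked about. In scikit-learn, range-based scaling of each feature is what `MinMaxScaler` does, while `Normalizer` (and the `normalize` function) rescale each sample, i.e. each row, to unit norm. The sketch below reimplements both ideas in plain Python on made-up data so the difference is visible without the library; the helper names are hypothetical.

```python
import math

# Made-up data: 3 samples (rows) x 2 features (columns).
X = [
    [3.0, 4.0],
    [6.0, 8.0],
    [0.0, 5.0],
]

def min_max_columns(rows):
    """Scale each feature (column) to [0, 1] using that column's min and max."""
    columns = list(zip(*rows))
    scaled = [[(v - min(c)) / (max(c) - min(c)) for v in c] for c in columns]
    return [list(row) for row in zip(*scaled)]

def unit_norm_rows(rows):
    """Scale each sample (row) to Euclidean length 1."""
    result = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm for v in row])
    return result

by_feature = min_max_columns(X)  # uses statistics of each column
by_sample = unit_norm_rows(X)    # uses only the values inside each row
```

Here the first two rows become identical after row-wise normalization (they point in the same direction), whereas column-wise min-max scaling keeps them apart.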
2 votes
2 answers
109 views
Splitting and scaling of ML training and test data
I gather you are supposed to split data into training and test before you scale/shift to avoid data leakage. The issue I have with this is how do you cope with values in the test set that are outside ...
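A small sketch of what happens in that situation, assuming min-max scaling with parameters learned from the training split only. The numbers are invented, and clipping is shown as one possible handling, not as a recommendation.

```python
# Hypothetical splits; the test set contains values outside the training range.
train = [2.0, 4.0, 6.0, 10.0]
test = [1.0, 5.0, 12.0]

# Fit: learn the scaling parameters from the training data only.
lo, hi = min(train), max(train)

def scale(value):
    """Apply the training-set min-max transform to any value."""
    return (value - lo) / (hi - lo)

train_scaled = [scale(v) for v in train]

# Transform: reuse the same parameters on the test data.
# Values outside the training range simply land outside [0, 1].
test_scaled = [scale(v) for v in test]

# Optional: clip to [0, 1] if the downstream model cannot handle out-of-range inputs.
test_clipped = [min(1.0, max(0.0, s)) for s in test_scaled]
```

In this example the test values 1.0 and 12.0 map to -0.125 and 1.25. Many models tolerate that; clipping trades the leakage-free setup for some loss of information at the extremes.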
0 votes
1 answer
45 views
Potential Sign Issues in a Composite Performance Metric for Model Selection
I am analyzing the results of various machine learning models for a regression task, using four metrics: RMSE, MAE, MAPE, and $R^2$. My approach involves two types of analyses: Individual Metric ...
0 votes
0 answers
17 views
I don't understand why LayerNorm is killing cosine predictions
I have a very plain cosine prediction model with batch_size = 20 and these layers: (1) Conv1D(filters=1, kernel=10, padding="same"), (2) RELU, (3) Dense(1), (4) Tanh. If I add LayerNormalization between 1 and 2 or between 2 and 3, ...
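One possible explanation, offered as a hedged sketch and not a diagnosis of the actual model: with `filters=1`, each time step carries a single channel, and layer normalization over the last axis (which, as far as I know, is the Keras default) then normalizes a vector of length 1. Its mean equals the value itself and its variance is zero, so the output is zero for every input. The plain-Python function below mimics that computation with the learnable scale and offset left at their initial values of 1 and 0; the signal and `eps` are assumptions.

```python
import math

def layer_norm(features, eps=1e-3):
    """Normalize one feature vector to zero mean and unit variance (scale=1, offset=0)."""
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [(f - mean) / math.sqrt(var + eps) for f in features]

# Hypothetical cosine input, one channel per time step (as with filters=1).
signal = [math.cos(0.3 * t) for t in range(10)]

# Normalizing over the channel axis: each call sees a single-element vector.
normalized = [layer_norm([s]) for s in signal]
```

Every entry of `normalized` is `[0.0]`, so the layers after the normalization would receive a constant input. If this matches the real setup, normalizing over the time axis or using more than one filter would avoid the collapse.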