Class StandardScaler (2.0.0)

StandardScaler()

Standardize features by removing the mean and scaling to unit variance.

The standard score of a sample x is calculated as z = (x - u) / s, where u is the mean of the training samples (or zero if with_mean=False) and s is the standard deviation of the training samples (or one if with_std=False).
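The formula above can be sketched in plain Python. This is a minimal illustration, not the bigframes implementation: `standard_scores` is a hypothetical helper, its `with_mean` and `with_std` keyword arguments simply mirror the names used in the text, and the use of the population standard deviation is an assumption, since the text does not say whether s is the population or the sample standard deviation.

```python
from statistics import fmean, pstdev


def standard_scores(train, values, with_mean=True, with_std=True):
    """Hypothetical helper: apply z = (x - u) / s using statistics of `train`."""
    # u is the training mean, or zero when centering is disabled.
    u = fmean(train) if with_mean else 0.0
    # Assumption: population standard deviation. The source does not specify
    # which estimator is used, so real library output may differ.
    s = pstdev(train) if with_std else 1.0
    return [(x - u) / s for x in values]


train = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, population std 2

print(standard_scores(train, train))  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(standard_scores(train, [11]))   # [3.0]
print(standard_scores(train, [11], with_mean=False, with_std=False))  # [11.0]
```

A constant feature has zero standard deviation, so this sketch would raise `ZeroDivisionError` for it; a real implementation has to handle that case separately.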

Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using transform.

Standardization of a dataset is a common requirement for many machine learning estimators: they may perform poorly if the individual features do not roughly follow a standard normal distribution (Gaussian with zero mean and unit variance).

Examples:

.. code-block::

    from bigframes.ml.preprocessing import StandardScaler
    import bigframes.pandas as bpd

    scaler = StandardScaler()
    data = bpd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 0, 1, 1]})

    # Learn the per-column mean and standard deviation.
    scaler.fit(data)

    # Standardize the training data, then new data, with the stored statistics.
    print(scaler.transform(data))
    print(scaler.transform(bpd.DataFrame({"a": [2], "b": [2]})))

Methods

__repr__

__repr__()

Print the estimator's constructor with all non-default parameter values.

fit

fit(X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series, pandas.core.frame.DataFrame, pandas.core.series.Series], y=None) -> bigframes.ml.preprocessing.StandardScaler

Compute the mean and std to be used for later scaling.

Parameters
Name    Description
X       bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
        The DataFrame or Series with training data.
y       default None
        Ignored.

Returns
Type              Description
StandardScaler    Fitted scaler.

fit_transform

fit_transform(X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series, pandas.core.frame.DataFrame, pandas.core.series.Series], y: typing.Optional[typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series, pandas.core.frame.DataFrame, pandas.core.series.Series]] = None) -> bigframes.dataframe.DataFrame

Fit to data, then transform it.

Parameters
Name    Description
X       bigframes.dataframe.DataFrame or bigframes.series.Series
        Series or DataFrame of shape (n_samples, n_features). Input samples.
y       bigframes.dataframe.DataFrame or bigframes.series.Series
        Series or DataFrame of shape (n_samples,) or (n_samples, n_outputs). Default None. Target values (None for unsupervised transformations).

Returns
Type                             Description
bigframes.dataframe.DataFrame    DataFrame of shape (n_samples, n_features_new). Transformed DataFrame.
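The relationship between fit, transform, and fit_transform can be sketched with a toy class. `ToyScaler` is a hypothetical stand-in written for illustration only: it works on plain dictionaries of lists rather than DataFrames, it assumes the population standard deviation, and it does not reflect how bigframes computes anything internally. It shows that statistics are learned per column and that fitting then transforming gives the same result as the combined call.

```python
from statistics import fmean, pstdev


class ToyScaler:
    """Hypothetical stand-in for illustration; not the bigframes implementation."""

    def fit(self, X, y=None):
        # X maps column names to lists of numbers; y is ignored.
        # Each column gets its own statistics, independent of the others.
        self.mean_ = {col: fmean(vals) for col, vals in X.items()}
        # Assumption: population standard deviation.
        self.std_ = {col: pstdev(vals) for col, vals in X.items()}
        return self

    def transform(self, X):
        # Reuse the statistics stored by fit on any later data.
        return {
            col: [(x - self.mean_[col]) / self.std_[col] for x in vals]
            for col, vals in X.items()
        }

    def fit_transform(self, X, y=None):
        # Fit to the data, then transform the same data.
        return self.fit(X, y).transform(X)


data = {"a": [1, 3], "b": [10, 30]}

combined = ToyScaler().fit_transform(data)
two_step = ToyScaler().fit(data).transform(data)
print(combined == two_step)  # True
print(combined)              # {'a': [-1.0, 1.0], 'b': [-1.0, 1.0]}
```

Both columns map to the same standardized values even though their raw scales differ by a factor of ten, which is the point of scaling each feature independently.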

get_params

get_params(deep: bool = True) -> typing.Dict[str, typing.Any]

Get parameters for this estimator.

Parameter
Name    Description
deep    bool, default True
        If True, return the parameters for this estimator and for contained subobjects that are estimators.

Returns
Type          Description
Dictionary    A dictionary of parameter names mapped to their values.
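The behavior described for get_params and __repr__ follows a common estimator convention, which can be sketched as below. Everything here is hypothetical: `ToyEstimator` and its `alpha` and `fit_intercept` parameters are invented for illustration (StandardScaler itself is shown above with no constructor arguments), and reading the constructor signature with `inspect` is an assumption about one way to implement the convention, not a description of the bigframes code.

```python
import inspect


class ToyEstimator:
    """Hypothetical estimator with two invented constructor parameters."""

    def __init__(self, alpha=1.0, fit_intercept=True):
        self.alpha = alpha
        self.fit_intercept = fit_intercept

    def _init_defaults(self):
        # Map each constructor parameter (except self) to its default value.
        sig = inspect.signature(type(self).__init__)
        return {n: p.default for n, p in sig.parameters.items() if n != "self"}

    def get_params(self, deep=True):
        # `deep` has no effect here: this toy class contains no sub-estimators.
        return {name: getattr(self, name) for name in self._init_defaults()}

    def __repr__(self):
        # Show only the parameters whose values differ from their defaults.
        defaults = self._init_defaults()
        changed = {k: v for k, v in self.get_params().items() if v != defaults[k]}
        args = ", ".join(f"{k}={v!r}" for k, v in changed.items())
        return f"{type(self).__name__}({args})"


print(ToyEstimator())                         # ToyEstimator()
print(ToyEstimator(alpha=0.5))                # ToyEstimator(alpha=0.5)
print(ToyEstimator(alpha=0.5).get_params())   # {'alpha': 0.5, 'fit_intercept': True}
```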

to_gbq

to_gbq(model_name: str, replace: bool = False) -> bigframes.ml.base._T

Save the transformer as a BigQuery model.

Parameters
Name          Description
model_name    str
              The name of the model.
replace       bool, default False
              Whether to replace the model if it already exists.

transform

transform(X: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series, pandas.core.frame.DataFrame, pandas.core.series.Series]) -> bigframes.dataframe.DataFrame

Perform standardization by centering and scaling.

Parameter
Name    Description
X       bigframes.dataframe.DataFrame or bigframes.series.Series or pandas.core.frame.DataFrame or pandas.core.series.Series
        The DataFrame or Series to be transformed.

Returns
Type                             Description
bigframes.dataframe.DataFrame    Transformed result.