Normalize

bestNormalize, Arcsinh Transformation, Box-Cox Normalization, Lambert W x F Normalization, log Transformation, sqrt Normalization, Yeo-Johnson Normalization, Transform Normalize
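To make two of the listed transformations concrete, here is a minimal Python sketch of the textbook Box-Cox and Yeo-Johnson formulas for a fixed lambda. It is not the R-Flow or bestNormalize implementation: the function names `box_cox` and `yeo_johnson` and the example values are assumptions for illustration, and the step of estimating lambda from the data is left out.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one strictly positive value for a fixed lambda."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def yeo_johnson(x, lam):
    """Yeo-Johnson transform; unlike Box-Cox it accepts zero and negative values."""
    if x >= 0:
        if lam == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lam - 1.0) / lam
    if lam == 2:
        return -math.log1p(-x)
    return -(((1.0 - x) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


# Illustrative values only (not taken from the R-Flow documentation).
skewed = [0.5, 1.0, 2.0, 4.0, 64.0]
print([round(box_cox(v, 0.5), 3) for v in skewed])
print([round(yeo_johnson(v, 0.5), 3) for v in [-3.0, 0.0, 3.0]])
```

The log and sqrt transformations are the special cases most readers already know; Box-Cox with lambda = 0 reduces to the natural log.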


Last updated 4 years ago


Common

Input Data

Inputs

  • Target Data: Input data.
  • Output Name: Name of the output after normalization.
  • Normalize info: Short description of the work. It must not contain spaces.
  • sample: Number of samples used for the Shapiro-Wilk test. 400 by default.
  • x: Variables to normalize. Only numeric or integer variables are listed. Drag the variables from the table and drop them under X.

Statistic & p_value

Statistic and p_value are the results of the Shapiro-Wilk test, whose null hypothesis is that the data are normally distributed. If the p-value is greater than the alpha level, the null hypothesis cannot be rejected, so the data are treated as consistent with a normal distribution; a p-value at or below alpha indicates a departure from normality. The test is rerun with the given number of samples each time you click the green arrow button.
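The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the `sample` setting means a random draw without replacement, it assumes an alpha level of 0.05 (the page does not state one), and the helper names `draw_sample` and `interpret_p_value` are hypothetical. Computing the Shapiro-Wilk statistic itself is not shown.

```python
import random


def draw_sample(values, sample=400, seed=0):
    """Return at most `sample` values drawn without replacement (assumed behavior)."""
    values = list(values)
    if len(values) <= sample:
        return values
    return random.Random(seed).sample(values, sample)


def interpret_p_value(p_value, alpha=0.05):
    """Describe the outcome of a normality test at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value > alpha:
        return "fail to reject normality"
    return "reject normality"


subset = draw_sample(range(1000))          # 400 values by default
print(len(subset))
print(interpret_p_value(0.30))             # p-value above alpha
print(interpret_p_value(0.01))             # p-value at or below alpha
```

Failing to reject the null hypothesis is weaker than proving normality, which is why the helper reports "fail to reject" rather than "normal".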

Transformation Normalize

This task mitigates the influence of heavy-tailed distributions while preserving the one-to-one nature of the transformation.
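As a small illustration of both properties, the Python sketch below uses the arcsinh transformation as a stand-in: it compresses extreme values, and it stays one-to-one, so the original values can be recovered with `sinh`. The sample values are made up, and this is not a description of how the task is implemented internally.

```python
import math

# Illustrative heavy-tailed values, including large positive and negative extremes.
heavy_tailed = [-1000.0, -1.0, 0.0, 1.0, 1000.0]

# arcsinh grows roughly like log(2|x|) for large |x|, so the extremes are pulled in.
transformed = [math.asinh(v) for v in heavy_tailed]

# The transformation is strictly increasing, hence invertible with sinh.
recovered = [math.sinh(t) for t in transformed]

print([round(t, 4) for t in transformed])
print([round(r, 6) for r in recovered])
```

A range of 2000 units in the input shrinks to roughly 15 units after the transformation, while the ordering of the values is unchanged.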

Input Data

Inputs

  • Target Data: Input (vector-like) data.
  • Output Name: Name of the output after Transform Normalize.
  • Normalize info: The Normalize info of the 'Normalize' task that was run before this 'Transform Normalize' task. It must be the same as the 'Normalize info' of that 'Normalize' task.

How to Use

  1. Use vector-like data as input for the normalization.

  2. Run one of the 'Normalize' tasks, such as 'bestNormalize', to normalize the input values.

  3. Then run 'Transform Normalize' with the same 'Normalize info'.
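The steps above follow a fit-then-apply pattern, which can be sketched in Python as follows. Everything here is hypothetical: the `fitted` dictionary, the functions `normalize` and `transform_normalize`, the name "price_norm", and the choice of a standardized arcsinh transform are assumptions made for illustration, not R-Flow internals. The point is only that the second step looks up parameters stored under the same 'Normalize info' name.

```python
import math
import statistics

# Hypothetical store: Normalize info name -> parameters fitted by a 'Normalize' task.
fitted = {}


def normalize(info, values):
    """Fit a standardized arcsinh transform and remember it under `info`."""
    t = [math.asinh(v) for v in values]
    params = {"mean": statistics.fmean(t), "sd": statistics.stdev(t)}
    fitted[info] = params
    return [(v - params["mean"]) / params["sd"] for v in t]


def transform_normalize(info, new_values):
    """Apply the transform previously fitted under `info` to new data."""
    if info not in fitted:
        raise KeyError(f"no Normalize task was run with info {info!r}")
    p = fitted[info]
    return [(math.asinh(v) - p["mean"]) / p["sd"] for v in new_values]


train = [1.0, 2.0, 5.0, 20.0, 300.0]              # step 1: vector-like input
out = normalize("price_norm", train)               # step 2: a 'Normalize' task
new = transform_normalize("price_norm", [2.0, 20.0])  # step 3: same info name
print([round(v, 4) for v in new])
```

Because the new data reuse the stored mean and standard deviation, a value that appeared in the original input maps to exactly the same normalized value.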

Workflow Example

R Package

- Transform Normalize

Best Normalize:

Arcsinh Transformation, Box-Cox Normalization, Lambert W x F Normalization, log Transformation, sqrt Normalization, Yeo-Johnson Normalization:

R-Flow Task Example Video
https://cran.r-project.org/web/packages/bestNormalize/index.html
https://cran.r-project.org/web/packages/bestNormalize/vignettes/bestNormalize.html
[Task information of Normalize]
[Task information of Transformation Normalize]