ML Concepts – What is Feature Scaling?

Feature Scaling

Feature scaling is a technique that puts the values of a feature on a common scale, for example by using the mean and standard deviation of that feature. If we apply feature scaling before splitting the dataset, the mean and standard deviation are computed over all the values, including those that end up in the test set. This causes information leakage, so the scaling should be computed after the split, on the training set only. We do not need to apply feature scaling for every machine learning model, only for some of them. For example, regression models do not require it.
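To make the leakage point concrete, here is a minimal sketch in plain Python. The data values, the 4/2 split and the helper names fit_scaler and apply_scaler are assumptions made up for illustration. The sketch splits first, computes the mean and standard deviation from the training values only, and then reuses those two numbers to scale the test values.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Return (mean, std) computed from the given values only."""
    # pstdev is the population standard deviation (divides by n).
    return mean(values), pstdev(values)

def apply_scaler(values, mu, sigma):
    """Scale values with a mean and std that were computed elsewhere."""
    return [(v - mu) / sigma for v in values]

# Hypothetical feature values, invented for this example.
data = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]

# Split first, so the test values never influence mu and sigma.
train, test = data[:4], data[4:]

mu, sigma = fit_scaler(train)                  # statistics from training data only
train_scaled = apply_scaler(train, mu, sigma)
test_scaled = apply_scaler(test, mu, sigma)    # reuse the training statistics
```

If fit_scaler were called on data instead of train, the test values would shift mu and sigma, which is exactly the leakage described above.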

In simple terms: suppose you have a dataset with income details or network load details. Sometimes the network runs at 10 Gbps, sometimes at 1 Gbps, and sometimes at 100 Mbps. To use this data directly, or to plot it, you need a very wide scale: the graph has to go up to 10,000 Mbps. Scaling mitigates this by mapping the values into a range such as 0-100%. If 10 Gbps is the maximum, then 10 Gbps is 100%, 1 Gbps is 10%, and 100 Mbps is 1%. The data is now much easier to work with. We do the same thing in a CPU utilization capacity report: with a total capacity of, say, 10,000 MHz, 340 MHz used becomes 3.4% utilization and 2,000 MHz used becomes 20% utilization.
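The percentage idea can be sketched in a few lines of Python. The function name utilization_percent and the 10,000 Mbps capacity are assumptions chosen to match the network example, not part of any library.

```python
def utilization_percent(used, capacity):
    """Express a raw reading as a percentage of the maximum capacity."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return used * 100 / capacity

# Assumed link capacity: 10 Gbps, written in Mbps.
CAPACITY_MBPS = 10_000

# Raw loads in Mbps: 10 Gbps, 1 Gbps and 100 Mbps.
loads_mbps = [10_000, 1_000, 100]
percentages = [utilization_percent(load, CAPACITY_MBPS) for load in loads_mbps]
print(percentages)  # [100.0, 10.0, 1.0]
```

All three readings now sit on the same 0-100 axis, whatever unit the raw data used.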

Feature Scaling Techniques:

  1. Standardization
  2. Normalization

Standardization: Subtract the mean of the feature from each value, then divide by the standard deviation of the feature, which is the square root of the variance. Standardization works well in most cases, so this is the technique we use most of the time. It is calculated as:

X_new = (X - mean)/Std
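As a worked example of the formula above, here is a minimal sketch using only the Python standard library. The function name standardize and the sample values are assumptions for illustration. It uses the population standard deviation (dividing by n); other tools may divide by n - 1 instead.

```python
from statistics import mean, pstdev

def standardize(values):
    """Apply X_new = (X - mean) / std to every value of one feature."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("feature is constant, std is zero")
    return [(x - mu) / sigma for x in values]

# Sample feature with mean 5 and standard deviation 2.
feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(feature)
print(scaled)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

After standardization the feature has mean 0 and standard deviation 1, and each value reads as "how many standard deviations from the mean".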

Normalization: Subtract the minimum value of the feature from each value, then divide by the difference between the maximum and minimum values of the feature. The result always lies between 0 and 1. Normalization is recommended when your features follow a normal distribution, so we use it only in specific cases. It is calculated as:

X_new = (X - X_min)/(X_max - X_min)
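The same formula as a minimal Python sketch. The function name min_max_normalize and the sample loads are assumptions for illustration. The guard for a constant feature is an added safety check, since X_max - X_min would be zero in that case.

```python
def min_max_normalize(values):
    """Apply X_new = (X - X_min) / (X_max - X_min) to one feature."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("feature is constant, cannot normalize")
    return [(x - x_min) / (x_max - x_min) for x in values]

# Hypothetical network loads in Mbps: 100 Mbps, 1 Gbps and 10 Gbps.
loads_mbps = [100.0, 1000.0, 10000.0]
normalized = min_max_normalize(loads_mbps)
# The smallest value maps to 0.0 and the largest to 1.0.
```

Note that this differs from the percentage-of-capacity example: here the smallest observed value becomes 0, not the value zero itself.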

Apply feature scaling only to numerical values.


About the Author

Dr Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
