In computing, data transformation is the process of converting data from one format or structure into another. It is a crucial part of most system-creation and data-processing tasks, such as data wrangling, data storage, and data creation. In this process we alter the shape of the data using various methods so that useful information can be extracted from it.
Some of the techniques used for data transformation are:
i. Aggregation: In this technique, a summation or aggregation operation is applied over the data. For example, daily sales data can be aggregated to compute monthly and annual totals.
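A minimal sketch of the daily-to-monthly aggregation described above, using only the standard library. The sample records and the `aggregate_monthly` helper are hypothetical, invented for illustration:

```python
from collections import defaultdict

# Hypothetical daily sales records: (date "YYYY-MM-DD", amount)
daily_sales = [
    ("2023-01-05", 120.0),
    ("2023-01-20", 80.0),
    ("2023-02-03", 200.0),
    ("2023-02-28", 50.0),
]

def aggregate_monthly(records):
    """Sum daily amounts into monthly totals keyed by "YYYY-MM"."""
    totals = defaultdict(float)
    for date, amount in records:
        totals[date[:7]] += amount  # the "YYYY-MM" prefix identifies the month
    return dict(totals)

monthly = aggregate_monthly(daily_sales)
# monthly == {"2023-01": 200.0, "2023-02": 250.0}
```

The same idea extends to annual totals by keying on the `"YYYY"` prefix instead.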
ii. Discretization: In this technique, the raw values of a numeric attribute (e.g. age) are replaced by interval labels (e.g. 0-10, 10-20, 20-40) or conceptual labels (e.g. infant, young, adult).
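As a sketch, discretization of an age attribute might look like the following. The bin edges are taken from the example above, and the `discretize_age` helper is hypothetical:

```python
def discretize_age(age):
    """Map a raw age to an interval label and a conceptual label.

    Bin edges follow the example in the text (0-10, 10-20, 20-40);
    ages of 40 or more fall into the last bucket here for simplicity.
    """
    if age < 10:
        return "0-10", "infant"
    elif age < 20:
        return "10-20", "young"
    else:
        return "20-40", "adult"

labels = [discretize_age(a) for a in [3, 15, 27]]
# [("0-10", "infant"), ("10-20", "young"), ("20-40", "adult")]
```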
iii. Attribute construction/Feature engineering: First, let's understand what feature engineering is. Feature engineering is the process of building a new attribute (feature) by analyzing the available features and the relationships between them. This technique is useful for surfacing information that is only implicit in the raw data: when we have few features but they still contain hidden information worth retrieving, constructing new attributes can help.
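A tiny sketch of attribute construction: deriving a new feature from the relationship between two existing ones. The order records and the `revenue` attribute are hypothetical examples, not from the original text:

```python
# Hypothetical records with two existing features: price and quantity.
orders = [
    {"price": 10.0, "quantity": 3},
    {"price": 4.5, "quantity": 2},
]

def add_revenue(records):
    """Construct a new attribute, revenue, from the existing price and quantity."""
    for record in records:
        record["revenue"] = record["price"] * record["quantity"]
    return records

add_revenue(orders)
# orders[0]["revenue"] == 30.0
```

A downstream model can then learn from `revenue` directly instead of having to infer the price-quantity interaction on its own.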
iv. Normalization/Standardization: What is normalization or standardization? It is the process of rescaling the original data without altering its meaning: we establish new bounds (most commonly 0 to 1) and convert the data accordingly. This technique is useful for algorithms that are sensitive to feature scale, such as neural networks and distance-based methods (e.g. KNN, k-means).
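A minimal sketch of rescaling values to the 0-1 range mentioned above (min-max normalization); the `min_max_scale` helper is an illustrative name, not a library function:

```python
def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid division by zero on constant data
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([0, 5, 10])
# [0.0, 0.5, 1.0]
```

After scaling, every feature contributes on a comparable scale, so distance computations in KNN or k-means are not dominated by whichever feature happens to have the largest raw range.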