Preparation Techniques. Plan: Framework for Data Preparation Techniques in Machine Learning; Challenge of Data Preparation



Date: 30.12.2023
PREPARATION TECHNIQUES

Data Preparation for Columns
This group is for data preparation techniques that add or remove columns of data.
In machine learning, columns are often referred to as variables or features.
These techniques are often required either to reduce the complexity (dimensionality) of a prediction problem or to unpack compound input variables or complex interactions between features.
The main class of techniques that comes to mind is feature selection.
This includes techniques that use statistics to score the relevance of input variables to the target variable based on the data type of each.
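
As a minimal sketch of statistical scoring, assuming a numerical input and a numerical target (a pairing for which Pearson's correlation is a common choice), the snippet below ranks columns by the absolute value of their correlation with the target and keeps the top k. The function names `pearson` and `select_k_best` and the toy data are hypothetical, introduced only for illustration; other data type pairings call for other statistics.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_k_best(columns, target, k):
    """Return the names of the k columns with the largest |correlation|."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (made up): 'size' tracks the target closely, 'noise' only weakly,
# and 'const' not at all.
columns = {
    "size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
    "const": [7.0, 7.0, 7.0, 7.0, 7.0],
}
target = [2.1, 3.9, 6.2, 7.8, 10.1]

selected = select_k_best(columns, target, k=2)
print(selected)  # ['size', 'noise']
```

Libraries such as scikit-learn offer the same idea through `SelectKBest`, paired with a scoring function suited to the data types involved.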

This also includes feature selection techniques that systematically test the impact of different combinations of input variables on the predictive skill of a machine learning model.
For more on these types of methods, see the tutorial:

  • Recursive Feature Elimination (RFE) for Feature Selection in Python
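
The idea of testing feature subsets against a model's predictive skill can be sketched as follows. Note that this is greedy backward elimination, a simpler relative of RFE and not RFE itself: RFE ranks features by a fitted model's coefficients or importances, whereas this sketch scores each candidate subset directly. The model (1-nearest-neighbour), the scoring scheme (leave-one-out accuracy), the function names, and the toy data are all assumptions chosen to keep the example self-contained.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(
            others,
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in features),
        )
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(rows)


def backward_elimination(rows, labels, n_keep):
    """Repeatedly drop the feature whose removal gives the best score."""
    features = list(range(len(rows[0])))
    while len(features) > n_keep:
        candidates = [[f for f in features if f != drop] for drop in features]
        features = max(candidates, key=lambda s: loo_accuracy(rows, labels, s))
    return features


# Toy data (made up): feature 0 separates the classes,
# features 1 and 2 are distracting noise on a larger scale.
rows = [
    [1.0, 9.0, 5.0], [1.2, 2.0, 1.0], [0.8, 6.0, 9.0],  # class 0
    [3.0, 8.0, 4.0], [3.2, 1.0, 2.0], [2.8, 5.0, 8.0],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]

kept = backward_elimination(rows, labels, n_keep=1)
print(kept)  # [0]
```

In practice, a cross-validated score and a library implementation such as scikit-learn's `RFE` would usually replace the hand-rolled loop.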

Related are techniques that use a model to score the importance of input features based on their use by a predictive model, referred to as feature importance methods. These methods are often used for data interpretation, although they can also be used for feature selection.
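
One simple instance of model-based importance is to fit a linear model and read the magnitude of each coefficient as that feature's score. The sketch below assumes features on a comparable scale (otherwise coefficients are not directly comparable) and uses a hand-written gradient descent fit; `fit_linear`, its hyperparameters, and the synthetic data are hypothetical and chosen only for illustration.

```python
from itertools import product


def fit_linear(rows, targets, lr=0.1, epochs=200):
    """Fit y = w.x + b by batch gradient descent on mean squared error."""
    n, p = len(rows), len(rows[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        # Prediction errors for the current weights, computed once per epoch.
        errors = [
            sum(wj * xj for wj, xj in zip(w, x)) + b - y
            for x, y in zip(rows, targets)
        ]
        for j in range(p):
            grad_j = 2.0 / n * sum(e * x[j] for e, x in zip(errors, rows))
            w[j] -= lr * grad_j
        b -= lr * 2.0 / n * sum(errors)
    return w, b


# Synthetic data (made up): three features taking values +1/-1;
# the target depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2.
rows = [list(r) for r in product([1.0, -1.0], repeat=3)]
targets = [3.0 * x0 + 0.5 * x1 for x0, x1, _ in rows]

w, b = fit_linear(rows, targets)
importance = [abs(wj) for wj in w]
ranking = sorted(range(len(w)), key=lambda j: importance[j], reverse=True)
print(ranking)  # [0, 1, 2]
```

The ranking could then drive feature selection, for example by keeping only the top-ranked columns. Tree-based models expose comparable scores in most libraries.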

This group of methods also brings to mind techniques for creating or deriving new columns of data, that is, new features. This is often referred to as feature engineering, although sometimes the whole field of data preparation is referred to as feature engineering.
For example, new features that represent values raised to exponents or multiplicative combinations of features can be created and added to the dataset as new columns.
For more on these types of data preparation techniques, see the tutorial:

  • How to Use Polynomial Feature Transforms for Machine Learning
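
To make the idea of exponent and interaction features concrete, here is a minimal sketch that expands one row of inputs into all monomials up to a given degree. The function name `polynomial_features` is hypothetical; the output ordering (constant term, then degree 1, then degree 2) is a choice made for this example.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(row, degree=2):
    """Return all products of the inputs up to the given degree.

    Degree 0 contributes the constant 1, degree 1 the raw inputs,
    degree 2 the squares and pairwise products, and so on.
    """
    features = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(row, d):
            features.append(prod(combo))
    return features


# Two inputs a=2, b=3 expand to: 1, a, b, a^2, a*b, b^2
expanded = polynomial_features([2.0, 3.0], degree=2)
print(expanded)  # [1, 2.0, 3.0, 4.0, 6.0, 9.0]
```

The number of generated columns grows quickly with the degree and the number of inputs, which is why such expansions are often followed by feature selection. scikit-learn provides this transform as `PolynomialFeatures`.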

This might also include data transforms that change a variable's type, such as creating dummy variables for a categorical variable, often referred to as one-hot encoding.
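
A one-hot encoding can be sketched in a few lines: each distinct category becomes its own binary column, and each row gets a 1 in the column for its category. The function name `one_hot_encode` and the example values are hypothetical; sorting the categories is an assumption made here so that the column order is reproducible.

```python
def one_hot_encode(values):
    """Map a list of category labels to binary indicator rows.

    Returns the sorted category names (one per output column)
    and the encoded rows.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this row's category
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

One categorical column with three labels becomes three numeric columns, which is why this transform belongs with the techniques that add columns. In practice, scikit-learn's `OneHotEncoder` or pandas' `get_dummies` would typically be used.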

