Scikit-Learn
4
This chapter covers the modelling process in Scikit-learn (Sklearn). We begin with
dataset loading.
Dataset Loading
A collection of data is called a dataset. It has the following two components:
Features:
The variables of data are called its features. They are also known as predictors,
inputs or attributes.
Feature matrix:
It is the collection of features, in case there is more than one. By convention it is
a two-dimensional array with one row per sample and one column per feature.
Feature Names:
It is the list of all the names of the features.
Response:
It is the output variable that depends on the feature variables.
It is also known as the target, label or output.
Response Vector:
It represents the response column. Generally, there is
just one response column.
Target Names:
It represents the possible values taken by a response vector.
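To make these components concrete, here is a minimal sketch using a small made-up dataset. The feature names, values and labels are hypothetical and chosen only for illustration; NumPy is assumed to be installed.

```python
import numpy as np

# Hypothetical toy dataset: 4 samples, 2 features (values are made up).
feature_names = ["height_cm", "weight_kg"]

# Feature matrix: one row per sample, one column per feature.
X = np.array([
    [150.0, 50.0],
    [160.0, 60.0],
    [170.0, 72.0],
    [180.0, 85.0],
])

# Response vector: one label per sample (row) of X.
y = np.array([0, 0, 1, 1])

# Target names: the possible values of the response, indexed by label.
target_names = ["small", "large"]

print("Feature matrix shape:", X.shape)
print("Response vector shape:", y.shape)
print("Label of first sample:", target_names[y[0]])
```

The number of columns in X matches the number of feature names, and the length of y matches the number of rows in X.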
Scikit-learn ships with a few example datasets, such as iris and digits for classification and the Boston house prices for regression (note that the Boston loader, load_boston, was removed in scikit-learn 1.2).
Following is an example that loads the iris dataset:
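from sklearn.datasets import load_iris

# Load the bundled iris dataset
iris = load_iris()
X = iris.data                        # feature matrix
y = iris.target                      # response vector
feature_names = iris.feature_names   # names of the feature columns
target_names = iris.target_names     # possible values of the response
print("Feature names:", feature_names)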
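The same loading pattern should carry over to the other bundled datasets. As a minimal sketch, assuming scikit-learn is installed, the digits dataset mentioned earlier can be loaded with load_digits. The variable names X_digits and y_digits are chosen here for illustration only.

```python
from sklearn.datasets import load_digits

# Load the bundled handwritten-digits dataset.
digits = load_digits()
X_digits = digits.data      # feature matrix: one row per image, one column per pixel
y_digits = digits.target    # response vector: the digit shown in each image

print("Feature matrix shape:", X_digits.shape)
print("Response vector shape:", y_digits.shape)
print("Target names:", digits.target_names)
```

Checking the shapes right after loading is a quick way to confirm that the feature matrix and the response vector have the same number of samples.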