4.2: Engineering Ordinal Features
An ordinal feature, just like a categorical feature, has multiple categories.
The difference between the two types of feature is that the categories of an
ordinal feature follow a specific order. For example, an ordinal feature
'age group' can record the age of individuals as categories such as 'kid',
'teenager', 'adult', and 'elderly'. These categories reflect the age of
individuals in increasing order, from lowest to highest, so 'age group' can be
considered an ordinal feature. Let's discuss the different types of encoding
possible for ordinal features.
Let's consider the ordinal feature 'income' in the coupon recommendation
data set. It has 9 levels, ranging from 0 to more than 100000. Each category
except the open-ended last one is a range with a lower and an upper bound. The
difference between the upper and lower bounds of each range is 12499, and the
lower bound of each category is 1 more than the upper bound of the previous
category.
Just like categorical features, ordinal features should be presented as
encodings.
4.2.1 Rank Encoding
This is the simplest encoding for ordinal features. We rank the categories so
that the category with the lowest value is assigned 1, and the label increases
by 1 for each subsequent category. In the 'income' example, the category
'Less than $12500' is assigned 1, '$12500 - $24999' is assigned 2, and so on.
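The ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the snapshot: only the first two labels come from the text, the remaining labels are assumed from the described range width, and the helper name rank_encode is hypothetical.

```python
# Income levels in ascending order. The first two labels appear in the text;
# the remaining ones are ASSUMED from the described 12,500-wide ranges.
INCOME_ORDER = [
    "Less than $12500",
    "$12500 - $24999",
    "$25000 - $37499",
    "$37500 - $49999",
    "$50000 - $62499",
    "$62500 - $74999",
    "$75000 - $87499",
    "$87500 - $99999",
    "$100000 or More",
]


def rank_encode(values, order):
    """Map each category to its 1-based position in `order`."""
    rank = {category: i for i, category in enumerate(order, start=1)}
    return [rank[v] for v in values]


sample = ["$25000 - $37499", "Less than $12500", "$100000 or More"]
encoded = rank_encode(sample, INCOME_ORDER)
print(encoded)  # [3, 1, 9]
```

A value that is missing from the ordered list raises a KeyError here, which is usually preferable to silently assigning an arbitrary rank.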
Below is a snapshot of the
rank-encoded representation of the income feature.
4.2.2 Polynomial Encoding
Polynomial encoding looks for linear, quadratic, and cubic trends in the
feature (and higher-order trends when the feature has more levels) and creates
a contrast encoding for each of them. Conceptually, it fits a straight line, a
quadratic parabola, and a cubic curve across the ordered categories to produce
the linear, quadratic, and cubic encodings. If the ordinal feature has N
categories, polynomial encoding produces N-1 encoded features. Below is a
snapshot of the polynomial-encoded features for income.
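As a rough illustration of how such contrasts can be constructed, the sketch below builds orthonormal polynomial contrasts by repeatedly raising the degree by one and removing the lower-degree components. It assumes the levels are equally spaced; the function name polynomial_contrasts is hypothetical, and a given library may use a different scaling or sign convention.

```python
import math


def polynomial_contrasts(n_levels):
    """Return n_levels rows, each with (n_levels - 1) polynomial contrasts.

    Column 0 is the linear trend, column 1 the quadratic, column 2 the
    cubic, and so on. Levels are ASSUMED to be equally spaced.
    """
    # Equally spaced, centered scores for the ordered levels.
    center = (n_levels - 1) / 2
    x = [i - center for i in range(n_levels)]

    # Start from the normalized constant vector (degree 0).
    basis = [[1 / math.sqrt(n_levels)] * n_levels]
    for _ in range(1, n_levels):
        # Raise the degree by one, then remove all lower-degree components.
        v = [xi * qi for xi, qi in zip(x, basis[-1])]
        for q in basis:
            dot = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - dot * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])

    # Drop the constant column and transpose to one row per level.
    columns = basis[1:]
    return [[col[i] for col in columns] for i in range(n_levels)]


poly_rows = polynomial_contrasts(9)  # 9 income levels -> 8 columns per level
```

Each row of poly_rows would replace one income level, so a feature with 9 levels becomes 8 numeric columns that are mutually orthogonal and each sum to zero.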
4.2.3 Backward Difference Encoding
In this method, the mean of the dependent variable for one level of the
categorical variable is compared with the mean of the dependent variable for
the prior adjacent level. It produces N-1 encoded features for an ordinal
feature with N categories. Below is a snapshot of the backward difference
encoding for income.
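One common form of the backward difference contrast matrix can be sketched as follows, assuming the levels are already sorted in ascending order. The function name backward_difference_contrasts is hypothetical. When these columns are used in a linear regression with an intercept, the coefficient of column k estimates the difference in the mean of the dependent variable between level k+1 and level k.

```python
def backward_difference_contrasts(n_levels):
    """Return n_levels rows, each with (n_levels - 1) contrast values.

    Column k (1-based) compares level k+1 with level k: levels up to k get
    -(n_levels - k) / n_levels and all later levels get k / n_levels.
    """
    rows = []
    for level in range(n_levels):          # 0-based index of the level
        row = []
        for k in range(1, n_levels):       # 1-based index of the column
            if level < k:
                row.append(-(n_levels - k) / n_levels)
            else:
                row.append(k / n_levels)
        rows.append(row)
    return rows


small = backward_difference_contrasts(4)
print(small[0])  # [-0.75, -0.5, -0.25]
print(small[3])  # [0.25, 0.5, 0.75]

diff_rows = backward_difference_contrasts(9)  # 9 income levels -> 8 columns
```

Moving from one level to the next changes exactly one column by 1 and leaves the others unchanged, which is what ties each column to a single adjacent-level comparison.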