Question:

Consider designing a linear classifier

\[ y = \text{sign}(f(x; w, b)), \quad f(x; w, b) = w^T x + b \]

on a dataset \( D = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\} \), where \( x_i \in \mathbb{R}^d \), \( y_i \in \{+1, -1\} \), for \( i = 1, 2, \dots, N \).

Recall that the sign function outputs \( +1 \) if the argument is positive, and \( -1 \) if the argument is non-positive. The parameters \( w \) and \( b \) are updated as per the following training algorithm:

\[ w_{\text{new}} = w_{\text{old}} + y_n x_n, \quad b_{\text{new}} = b_{\text{old}} + y_n \]

whenever \( \text{sign}(f(x_n; w_{\text{old}}, b_{\text{old}})) \neq y_n \).

In other words, whenever the classifier wrongly predicts a sample \( (x_n, y_n) \) from the dataset, \( w_{\text{old}} \) gets updated to \( w_{\text{new}} \), and likewise \( b_{\text{old}} \) gets updated to \( b_{\text{new}} \).

Consider the case \( (x_n, +1) \), where \( f(x_n; w_{\text{old}}, b_{\text{old}}) < 0 \). Then:

Hint: In training a linear classifier, the parameters are updated whenever the classifier makes a mistake. The update moves the classifier's output on the misclassified sample in the direction of its true label.
  • (A) \( f(x_n; w_{\text{new}}, b_{\text{new}}) > f(x_n; w_{\text{old}}, b_{\text{old}}) \)
  • (B) \( f(x_n; w_{\text{new}}, b_{\text{new}}) < f(x_n; w_{\text{old}}, b_{\text{old}}) \)
  • (C) \( f(x_n; w_{\text{new}}, b_{\text{new}}) = f(x_n; w_{\text{old}}, b_{\text{old}}) \)
  • (D) \( y_n f(x_n; w_{\text{old}}, b_{\text{old}}) > 1 \)

The correct option is (A).

Solution and Explanation

We are given a linear classifier whose parameters \( w \) and \( b \) are updated whenever the classifier makes a mistake, using the rule \[ w_{\text{new}} = w_{\text{old}} + y_n x_n, \quad b_{\text{new}} = b_{\text{old}} + y_n. \] We consider a sample \( (x_n, +1) \) that is misclassified, i.e., \[ f(x_n; w_{\text{old}}, b_{\text{old}}) < 0, \] so the classifier predicts the label of \( x_n \) as \( -1 \) while the true label is \( +1 \).
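
For concreteness, here is a minimal NumPy sketch of this mistake-driven update applied to one sample; the function name `perceptron_step` is a hypothetical choice, and only the sign convention and the update rule come from the question:

```python
import numpy as np

def perceptron_step(w, b, x_n, y_n):
    """Apply the stated update rule to one sample, if it is misclassified.

    sign(0) is treated as -1, matching the question's convention.
    """
    f = w @ x_n + b                  # decision function f(x; w, b) = w^T x + b
    pred = 1 if f > 0 else -1        # sign(f): +1 if positive, -1 otherwise
    if pred != y_n:                  # mistake: apply the update
        w = w + y_n * x_n
        b = b + y_n
    return w, b
```

Repeating this step over the dataset until no mistakes remain is the classical perceptron algorithm.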

Step-by-Step Process:

1. Initial Prediction:
The classifier incorrectly predicts the label of \( x_n \), i.e., it computes a negative value for the decision function: \[ f(x_n; w_{\text{old}}, b_{\text{old}}) = w_{\text{old}}^T x_n + b_{\text{old}} < 0. \] Since \( y_n = +1 \), we will update the parameters \( w \) and \( b \) as follows: \[ w_{\text{new}} = w_{\text{old}} + y_n x_n = w_{\text{old}} + x_n, \] \[ b_{\text{new}} = b_{\text{old}} + y_n = b_{\text{old}} + 1. \]
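
As a concrete illustration (with made-up values): take \( x_n = (1, 2)^T \), \( w_{\text{old}} = (-1, 0)^T \), and \( b_{\text{old}} = 0 \). Then \( f(x_n; w_{\text{old}}, b_{\text{old}}) = (-1)(1) + (0)(2) + 0 = -1 < 0 \) while \( y_n = +1 \), so the sample is misclassified, and the update gives \( w_{\text{new}} = (0, 2)^T \) and \( b_{\text{new}} = 1 \).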

2. New Prediction:
After the update, we compute the new value of the decision function with the updated parameters: \[ f(x_n; w_{\text{new}}, b_{\text{new}}) = w_{\text{new}}^T x_n + b_{\text{new}} = (w_{\text{old}} + x_n)^T x_n + (b_{\text{old}} + 1). \] This simplifies to: \[ f(x_n; w_{\text{new}}, b_{\text{new}}) = f(x_n; w_{\text{old}}, b_{\text{old}}) + x_n^T x_n + 1. \] Since \( x_n^T x_n = \|x_n\|^2 \geq 0 \), the added term \( x_n^T x_n + 1 \) is at least \( 1 \), so the updated decision function is strictly larger than the previous one: \[ f(x_n; w_{\text{new}}, b_{\text{new}}) > f(x_n; w_{\text{old}}, b_{\text{old}}). \]
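
A short NumPy check of this identity, reusing the made-up values from the illustration above (variable names are illustrative):

```python
import numpy as np

# Made-up values from the illustration above
x_n, y_n = np.array([1.0, 2.0]), +1
w_old, b_old = np.array([-1.0, 0.0]), 0.0

f_old = w_old @ x_n + b_old                       # -1.0: negative, so misclassified
w_new, b_new = w_old + y_n * x_n, b_old + y_n     # the update rule
f_new = w_new @ x_n + b_new                       # 5.0

assert np.isclose(f_new, f_old + x_n @ x_n + 1)   # increase is x_n^T x_n + 1
assert f_new > f_old                              # option (A)
```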

Thus, the correct answer is \( f(x_n; w_{\text{new}}, b_{\text{new}}) > f(x_n; w_{\text{old}}, b_{\text{old}}) \), which corresponds to option (A).