Question:

Consider a supervised learning task. The objective function being minimized is \( f(w) = w \cdot x \), where \( w \in \mathbb{R} \) is the parameter. Stochastic Gradient Descent is used with a learning rate of 0.10. Let \( w_i = 10.00 \) be the value of \( w \) at the \( i^{\text{th}} \) iteration. If \( x = 10 \), the value of \( w \) at the end of iteration \( (i+1) \) is ________. (Round off to 2 decimal places.)

Hint:

Remember the three key components of a gradient descent update: the current position ($w_{old}$), the direction of steepest ascent (the gradient, $\nabla f$), and the step size ($\eta$). To minimize a function, you always move in the direction *opposite* to the gradient, hence the minus sign in the update rule.
Updated On: Feb 23, 2026

Correct Answer: 9.00

Solution and Explanation

Step 1: Understanding the Question:
We need to perform a single update step of the Stochastic Gradient Descent (SGD) algorithm for a given objective function, initial parameter value, learning rate, and data point.
Step 2: Key Formula or Approach:
The update rule for Stochastic Gradient Descent is:
\[ w_{\text{new}} = w_{\text{old}} - \eta \cdot \nabla f(w) \] where:
- $w_{\text{new}}$ is the updated parameter.
- $w_{\text{old}}$ is the current parameter value.
- $\eta$ is the learning rate.
- $\nabla f(w)$ is the gradient of the objective function with respect to the parameter w, evaluated at the current data point.
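The update rule above can be sketched as a one-line function; a minimal illustration (the function name and variable names are ours, not part of the question):

```python
# One step of (stochastic) gradient descent:
# w_new = w_old - eta * gradient
def sgd_step(w_old: float, grad: float, eta: float) -> float:
    """Return the parameter after a single SGD update."""
    return w_old - eta * grad

# Plugging in the values from this problem:
# w_old = 10.00, gradient = x = 10, eta = 0.10
print(sgd_step(10.00, 10.0, 0.10))  # 9.0
```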
Step 3: Detailed Explanation:
Let's identify the given values:
- Objective function: $f(w) = w \cdot x$
- Current parameter ($w_{\text{old}}$): 10.00
- Learning rate ($\eta$): 0.10
- Data point (x): 10
First, we compute the gradient of the objective function $f(w)$ with respect to $w$:
\[ \nabla f(w) = \frac{\partial}{\partial w} (w \cdot x) = x \]
since $x$ is treated as a constant when differentiating with respect to $w$.
Evaluating this gradient at the data point $x = 10$:
\[ \nabla f(w) = 10 \]
Finally, we apply the SGD update rule:
\[ w_{\text{new}} = w_{\text{old}} - \eta \cdot \nabla f(w) = 10.00 - (0.10 \times 10) = 10.00 - 1.00 = 9.00 \]
Step 4: Final Answer:
The value of w at the end of the next iteration is 9.00.
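The arithmetic in Step 3 can be checked numerically; a small sketch (variable names are ours) that also applies the requested two-decimal rounding:

```python
# Verify the worked example: f(w) = w * x, so df/dw = x.
w_old = 10.00   # current parameter, w_i
eta = 0.10      # learning rate
x = 10          # data point; here the gradient equals x
grad = x
w_new = w_old - eta * grad
print(round(w_new, 2))  # 9.0
```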

Questions Asked in GATE DA exam

View More Questions