[Exercise] Skip-gram
o $\exp(z_j)$ for $j = 0$ to $6$: [___, ___, ___, ___, ___, ___, ___].
o Sum: $S = \sum_{j=0}^{6} \exp(z_j) =$ ___.
o $\hat{y}_j = \exp(z_j) / S$: [___, ___, ___, ___, ___, ___, ___].

Exercise 1.2: Loss Function with Softmax
● Formulas and Explanation:
o Loss: $L = -\sum_j y_j \log \hat{y}_j = -\log \hat{y}_o$, where $\hat{y}_o$ is the predicted probability for the true context word (index=1) and $\mathbf{y}$ is its one-hot vector.
o Cross-entropy measures the difference between the predicted and true distributions.
● Calculations (fill in the blanks):
o $\hat{y}_1$ (for index=1): ___.
o $L = -\log$ ___ $=$ ___.

Exercise 1.3: Backward Pass and Parameter Update
● Formulas and Explanation:
o Gradient for the scores: $\mathbf{e} = \hat{\mathbf{y}} - \mathbf{y}$ ($\mathbf{y}$ is the one-hot vector for the context word).
o Gradient for $W'$: $\partial L / \partial W' = \mathbf{h}\,\mathbf{e}^\top$ (3x7 matrix).
o Gradient for $\mathbf{h}$: $\partial L / \partial \mathbf{h} = W'\mathbf{e}$ (3x1 vector).
o Gradient for $W[c, :]$: equal to $\partial L / \partial \mathbf{h}$.
o Update: $\theta \leftarrow \theta - \eta \nabla_\theta L$, with learning rate $\eta = 0.01$.
● Calculations (fill in the blanks):
o $\mathbf{y} = [0, 1, 0, 0, 0, 0, 0]$ (one-hot for index=1).
o $\mathbf{e} = \hat{\mathbf{y}} - \mathbf{y}$: [___, ___, ___, ___, ___, ___, ___].
o $\partial L / \partial W' = \mathbf{h}\,\mathbf{e}^\top$: fill in the 3x7 matrix.
o $\partial L / \partial \mathbf{h} = W'\mathbf{e}$: [___, ___, ___].
o Gradient $\partial L / \partial W[2, :]$: [___, ___, ___].
o Update $W' \leftarrow W' - 0.01 \cdot \partial L / \partial W'$.
o Update $W[2, :] \leftarrow$ [___, ___, ___] $-\; 0.01 \cdot$ [___, ___, ___] $=$ [___, ___, ___].
o Check: compute $L$ with the updated parameters and compare it to the previous $L$ (the sketch after Exercise 1.4's formulas below walks through this full step).

Exercise 1.4: Inference and Embedding Application
● Formulas and Explanation:
o Final embeddings: the rows of $W$ (or the average of $W$ and $W'^\top$).
o Cosine similarity: $\cos(\mathbf{a}, \mathbf{b}) = \dfrac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}$ (measures semantic similarity).
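To make the chain of formulas in Exercises 1.1–1.4 easy to verify, here is a minimal numpy sketch of one full training step (forward pass, loss, gradients, update) followed by a cosine-similarity check. The sizes V=7 and D=3, the indices c=2 ("likes") and o=1 ("cat"), and the learning rate 0.01 come from the worksheet; the randomly initialized W and W' are illustrative assumptions, not the worksheet's actual starting values, so the printed numbers will not match the blanks.

```python
import numpy as np

V, D = 7, 3   # vocabulary size and embedding dimension (from the worksheet)
c, o = 2, 1   # center word "likes" (index 2), context word "cat" (index 1)
eta = 0.01    # learning rate used in Exercise 1.3

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(V, D))   # input embeddings (V x D); toy values
Wp = rng.normal(scale=0.1, size=(D, V))  # output embeddings (D x V); toy values

# Forward pass (Exercise 1.1)
h = W[c, :]                  # center embedding h = W[c, :]
z = Wp.T @ h                 # scores z_j = W'[:, j] . h
exp_z = np.exp(z)            # exp(z_j) for j = 0..6
y_hat = exp_z / exp_z.sum()  # softmax probabilities

# Loss (Exercise 1.2)
L = -np.log(y_hat[o])        # cross-entropy, L = -log(y_hat_o)

# Backward pass (Exercise 1.3)
y = np.zeros(V)
y[o] = 1.0                   # one-hot target for index o = 1
e = y_hat - y                # dL/dz
dWp = np.outer(h, e)         # dL/dW' = h e^T  (3 x 7)
dh = Wp @ e                  # dL/dh = W' e    (3 x 1)

# Parameter updates
Wp -= eta * dWp
W[c, :] -= eta * dh          # only the center word's row changes

# Check (Exercise 1.3): recompute the loss with the updated parameters
z_new = Wp.T @ W[c, :]
L_new = -np.log(np.exp(z_new)[o] / np.exp(z_new).sum())
print(f"L before: {L:.4f}, after: {L_new:.4f}")

# Cosine similarity (Exercise 1.4), e.g. "cat" (row 1) vs. "fish" (row 6)
a, b = W[1, :], W[6, :]
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cos(cat, fish) = {cos:.4f}")
```

After the single update, the recomputed loss should come out slightly lower than before, which is exactly the check asked for at the end of Exercise 1.3.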
● Calculations (fill in the blanks):
o Embedding for "cat" (row 1 of the updated $W$): [___, ___, ___].
o Embedding for "fish" (row 6): [___, ___, ___].
o Dot product: ___ $\cdot$ ___ $+$ ___ $\cdot$ ___ $+$ ___ $\cdot$ ___ $=$ ___.
o $\|\mathbf{a}\| =$ ___.
o $\|\mathbf{b}\| =$ ___.
o $\cos(\mathbf{a}, \mathbf{b}) =$ ___.
o Interpretation: ___ (e.g., a high cosine indicates semantic similarity).

Exercise 1.5: Counting Parameters
● Formulas and Explanation:
o Total parameters: $\#\Theta = V \cdot D + D \cdot V = 2VD$ (the sum of the parameters in $W$ and $W'$).
● Calculations (fill in the blanks):
o For V=7, D=2: $\#\Theta =$ ___.
o For D=3: $\#\Theta =$ ___.
o For D=5: $\#\Theta =$ ___.

Part 2: Skip-gram with Negative Sampling
This part uses negative sampling with $K$ negative words per positive pair to approximate the softmax, making the computations more efficient.

Exercise 2.1: Vectorization and Forward Pass for Positive Word
● Formulas and Explanation:
o Vectorization: one-hot vector $\mathbf{x}$ with $x_i = 1$ for $i = c$ (the center word's index) and 0 otherwise.
o Center embedding: $\mathbf{h} = W^\top \mathbf{x} = W[c, :]$.
o Positive score: $s_o = \mathbf{u}_o \cdot \mathbf{h}$, where $\mathbf{u}_o = W'[:, o]$ (a dot product measuring similarity).
● Calculations (fill in the blanks):
o One-hot for the center word ("likes", index=2): $\mathbf{x} = [0, 0, 1, 0, 0, 0, 0]$.
o Embedding $\mathbf{h} = W[2, :]$: [___, ___, ___].
o $\mathbf{u}_o = W'[:, 1]$ (for "cat").
o $s_o =$ ___ $\cdot$ ___ $+$ ___ $\cdot$ ___ $+$ ___ $\cdot$ ___ $=$ ___.

Exercise 2.2: Negative Sampling and Scores for Negative Words
● Formulas and Explanation:
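As a bridge from Exercise 2.1 to the negative-word scores introduced here, a minimal numpy sketch of the score computations follows. It reuses the positive score $s_o = \mathbf{u}_o \cdot \mathbf{h}$ and also scores sampled negative words. The toy W and W', the choice of K=2 negatives, and the sampled indices are assumptions for illustration; the worksheet specifies its own negative samples.

```python
import numpy as np

V, D = 7, 3   # vocabulary and embedding sizes (from the worksheet)
c, o = 2, 1   # center "likes" (index 2), positive context "cat" (index 1)

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(V, D))   # input embeddings; toy values
Wp = rng.normal(scale=0.1, size=(D, V))  # output embeddings; toy values

h = W[c, :]    # center embedding h = W[c, :]
u_o = Wp[:, o] # output vector for the positive word
s_o = u_o @ h  # positive score s_o = u_o . h

# Negative words: K sampled indices (K=2 and the indices [0, 5] are
# assumptions here; the worksheet specifies its own negative samples)
neg = [0, 5]
s_neg = [Wp[:, k] @ h for k in neg]  # negative scores s_k = u_k . h

print(f"s_o = {s_o:.4f}, negative scores = {[round(s, 4) for s in s_neg]}")
```

The same dot product is used for positive and negative words; only the role of the score in the loss differs, which is what the formulas of this exercise spell out.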