NOTE
This is part of the 3rd homework, titled Probabilistic Inference, for the course Machine Learning (IN2064) in the Winter Semester 2024/25 at TUM.
Intro
Usually, we maximise the log-likelihood instead of the likelihood. The next two problems provide a justification for this. In the lecture, we encountered the likelihood maximisation problem:
$$\theta^* = \arg\max_{\theta \in [0, 1]} p(\mathcal{D} \mid \theta) = \arg\max_{\theta \in [0, 1]} \theta^{T} (1 - \theta)^{H},$$
where $T$ and $H$ denote the number of tails and heads in a sequence of coin tosses, respectively, and $\theta$ is the probability of tails.
Problem 6
Compute the first and second derivative of this likelihood $p(\mathcal{D} \mid \theta)$ with respect to $\theta$. Then compute the first and second derivative of the log-likelihood $\log p(\mathcal{D} \mid \theta)$.
Solution
Likelihood: $L(\theta) = p(\mathcal{D} \mid \theta) = \theta^{T} (1 - \theta)^{H}$
- First derivative:
$$\frac{\partial L}{\partial \theta} = T \theta^{T-1} (1 - \theta)^{H} - H \theta^{T} (1 - \theta)^{H-1} = \theta^{T-1} (1 - \theta)^{H-1} \left[ T (1 - \theta) - H \theta \right]$$
- Second derivative:
$$\frac{\partial^2 L}{\partial \theta^2} = T (T - 1) \theta^{T-2} (1 - \theta)^{H} - 2 T H \theta^{T-1} (1 - \theta)^{H-1} + H (H - 1) \theta^{T} (1 - \theta)^{H-2}$$

Log-likelihood: $\ell(\theta) = \log p(\mathcal{D} \mid \theta) = T \log \theta + H \log (1 - \theta)$
- First derivative:
$$\frac{\partial \ell}{\partial \theta} = \frac{T}{\theta} - \frac{H}{1 - \theta}$$
- Second derivative:
$$\frac{\partial^2 \ell}{\partial \theta^2} = -\frac{T}{\theta^2} - \frac{H}{(1 - \theta)^2}$$
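These results can be checked symbolically. Below is a minimal sketch using SymPy; the symbol names mirror the notation above and are not part of the assignment:

```python
import sympy as sp

# theta in (0, 1); T and H are the (positive) counts of tails and heads.
theta, T, H = sp.symbols("theta T H", positive=True)

# Likelihood and log-likelihood of the coin-toss sequence.
L = theta**T * (1 - theta)**H
ell = sp.expand_log(sp.log(L), force=True)  # T*log(theta) + H*log(1 - theta)

# Derivatives of the log-likelihood match the formulas above.
assert sp.diff(ell, theta).equals(T / theta - H / (1 - theta))
assert sp.diff(ell, theta, 2).equals(-T / theta**2 - H / (1 - theta)**2)

# The factored form of the likelihood's first derivative also checks out.
dL_factored = theta**(T - 1) * (1 - theta)**(H - 1) * (T * (1 - theta) - H * theta)
assert sp.diff(L, theta).equals(dL_factored)
```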
Problem 7
Show that for any differentiable, positive function $f$, every local maximum of $\log f$ is also a local maximum of $f$. Considering this and the previous exercise, what is your conclusion?
Solution
Let $f$ be a differentiable and positive function on its domain. Denote $g(x) = \log f(x)$. We will show that if $x_0$ is a local maximum of $g$, then $x_0$ is also a local maximum of $f$.
Since $f(x) > 0$ for all $x$, $g$ is well-defined and differentiable. The derivative of $g$ is given by:
$$g'(x) = \frac{f'(x)}{f(x)}.$$
Now, if $x_0$ is a local maximum of $g$, then $g'(x_0) = 0$. Substituting $x = x_0$ into the derivative above:
$$\frac{f'(x_0)}{f(x_0)} = 0.$$
Since $f(x_0) > 0$, this implies:
$$f'(x_0) = 0.$$
Thus, $x_0$ is a critical point of $f$. As $x_0$ is a local maximum of $g$, there exists an interval $(x_0 - \delta, x_0 + \delta)$ for some $\delta > 0$ such that for all $x$ in this interval:
$$g(x) \le g(x_0).$$
Equivalently:
$$\log f(x) \le \log f(x_0).$$
Exponentiating both sides (which preserves the inequality, since the exponential function is monotonically increasing):
$$f(x) \le f(x_0).$$
Thus, $x_0$ is a local maximum of $f$.
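To see the statement in action, here is a small numerical sketch (with hypothetical counts $T = 3$, $H = 7$, our own choice rather than the assignment's) confirming that the coin-toss likelihood from Problem 6 and its logarithm peak at the same $\theta$:

```python
import numpy as np

# Hypothetical counts: 3 tails, 7 heads.
T, H = 3, 7

# Evaluate the likelihood f and its logarithm on a grid inside (0, 1).
theta = np.linspace(0.001, 0.999, 999)
f = theta**T * (1 - theta)**H
log_f = T * np.log(theta) + H * np.log(1 - theta)

# Because log is strictly increasing, both peak at the same grid point.
assert np.argmax(f) == np.argmax(log_f)
print(theta[np.argmax(f)])  # ~0.3
```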
Conclusion
From this and the previous exercise, we can conclude that maximising the log-likelihood yields the same solutions as maximising the likelihood itself. Moreover, as Problem 6 shows, the derivatives of the log-likelihood are much simpler than those of the likelihood, which makes it the easier function to work with in practice.
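A further payoff, not asked for in the assignment but worth noting: setting the log-likelihood's first derivative to zero solves the original maximisation problem in closed form. A minimal SymPy sketch:

```python
import sympy as sp

theta, T, H = sp.symbols("theta T H", positive=True)

# First derivative of the log-likelihood, as computed in Problem 6.
d_ell = T / theta - H / (1 - theta)

# Solving d_ell = 0 for theta gives the closed-form maximiser.
print(sp.solve(sp.Eq(d_ell, 0), theta))  # [T/(H + T)]
```

Because the second derivative $-T/\theta^2 - H/(1 - \theta)^2$ is negative everywhere on $(0, 1)$, this critical point is indeed a maximum, and by Problem 7 it also maximises the likelihood itself.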