NOTE

This is part of the 3rd homework, titled Probabilistic Inference, for the course Machine Learning (IN2064) in the Winter Semester 2024/25 at TUM.

Intro

Usually, we maximise the log-likelihood instead of the likelihood. The next two problems provide a justification for this. In the lecture, we encountered the likelihood maximisation problem:

$$\theta^{*} = \operatorname*{arg\,max}_{\theta \in [0, 1]} \; \theta^{T} (1 - \theta)^{H},$$

where $T$ and $H$ denote the number of tails and heads in a sequence of coin tosses, respectively, and $\theta$ is the probability of a single toss landing tails.
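As a quick numerical illustration (our addition, not part of the exercise, with made-up counts), a brute-force grid search over $\theta$ already reveals where this likelihood peaks:

```python
import numpy as np

# Made-up example data: 7 tails and 3 heads in 10 tosses.
T, H = 7, 3

# Evaluate the likelihood L(theta) = theta^T * (1 - theta)^H on a grid.
thetas = np.linspace(0.01, 0.99, 99)
likelihood = thetas**T * (1 - thetas)**H

# The grid maximiser matches the closed-form MLE T / (T + H) = 0.7.
print(thetas[np.argmax(likelihood)])
```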

Problem 6

Compute the first and second derivative of this likelihood with respect to $\theta$. Then compute the first and second derivative of the log-likelihood $\log\big(\theta^{T} (1 - \theta)^{H}\big)$.

Solution

Likelihood $L(\theta) = \theta^{T} (1 - \theta)^{H}$:

  • First derivative:
    $$L'(\theta) = T \theta^{T-1} (1 - \theta)^{H} - H \theta^{T} (1 - \theta)^{H-1} = \theta^{T-1} (1 - \theta)^{H-1} \big( T (1 - \theta) - H \theta \big)$$
  • Second derivative:
    $$L''(\theta) = T (T - 1) \theta^{T-2} (1 - \theta)^{H} - 2 T H \theta^{T-1} (1 - \theta)^{H-1} + H (H - 1) \theta^{T} (1 - \theta)^{H-2}$$

Log-likelihood $\ell(\theta) = \log L(\theta) = T \log \theta + H \log (1 - \theta)$:

  • First derivative:
    $$\ell'(\theta) = \frac{T}{\theta} - \frac{H}{1 - \theta}$$
  • Second derivative:
    $$\ell''(\theta) = -\frac{T}{\theta^{2}} - \frac{H}{(1 - \theta)^{2}}$$
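As a sanity check (our addition, not part of the graded solution), the derivatives above can be verified symbolically with sympy:

```python
import sympy as sp

theta, T, H = sp.symbols("theta T H", positive=True)

L = theta**T * (1 - theta)**H                    # likelihood
ell = T * sp.log(theta) + H * sp.log(1 - theta)  # log-likelihood

# First and second derivatives of the likelihood.
print(sp.factor(sp.diff(L, theta)))
print(sp.diff(L, theta, 2))

# First and second derivatives of the log-likelihood.
print(sp.diff(ell, theta))     # T/theta - H/(1 - theta)
print(sp.diff(ell, theta, 2))  # -T/theta**2 - H/(1 - theta)**2
```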

Problem 7

Show that for any differentiable, positive function $f$, every local maximum of $\log f$ is also a local maximum of $f$. Considering this and the previous exercise, what is your conclusion?

Solution

Let $f$ be a differentiable and positive function on its domain. Denote $g(\theta) = \log f(\theta)$. We will show that if $\theta^{*}$ is a local maximum of $g$, then $\theta^{*}$ is also a local maximum of $f$.

Since $f(\theta) > 0$, $g$ is well-defined and differentiable. By the chain rule, the derivative of $g$ is given by:

$$g'(\theta) = \frac{f'(\theta)}{f(\theta)}.$$

Now, if $\theta^{*}$ is a local maximum of $g$, then $g'(\theta^{*}) = 0$. Substituting $\theta = \theta^{*}$:

$$\frac{f'(\theta^{*})}{f(\theta^{*})} = 0.$$

Since $f(\theta^{*}) > 0$, this implies:

$$f'(\theta^{*}) = 0.$$

Thus, $\theta^{*}$ is a critical point of $f$. As $\theta^{*}$ is a local maximum of $g$, there exists an interval around $\theta^{*}$ such that for all $\theta$ in this interval:

$$g(\theta) \le g(\theta^{*}).$$

Equivalently:

$$\log f(\theta) \le \log f(\theta^{*}).$$

Exponentiating both sides (which preserves the inequality since the exponential function is monotonically increasing):

$$f(\theta) \le f(\theta^{*}).$$

Thus, $\theta^{*}$ is a local maximum of $f$.
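To make the statement tangible (an illustration we add here, not part of the proof), a quick numerical check on an arbitrary made-up positive function confirms that $f$ and $\log f$ peak at the same point:

```python
import numpy as np

# A made-up differentiable, strictly positive function with its peak at x = 1.
xs = np.linspace(-3.0, 5.0, 8001)
f = np.exp(-(xs - 1.0) ** 2) + 0.1
g = np.log(f)  # well-defined because f > 0 everywhere

# f and log(f) attain their maximum at the same grid point.
assert np.argmax(f) == np.argmax(g)
print(xs[np.argmax(f)])  # ~1.0
```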

Conclusion

Combining this result with the previous exercise, we conclude that maximising the likelihood is equivalent to maximising the log-likelihood: since the likelihood is positive, both functions share the same local maximisers. However, the log-likelihood has much simpler derivatives (compare Problem 6), making it easier to work with in practice.
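Consistent with this, solving the simple log-likelihood equation $\ell'(\theta) = 0$ symbolically (our added sketch, not part of the exercise) immediately recovers the closed-form maximum-likelihood estimate:

```python
import sympy as sp

theta, T, H = sp.symbols("theta T H", positive=True)

# Set the log-likelihood's first derivative to zero and solve for theta.
ell = T * sp.log(theta) + H * sp.log(1 - theta)
mle = sp.solve(sp.Eq(sp.diff(ell, theta), 0), theta)

print(mle)  # [T/(H + T)], i.e. the fraction of tails among all tosses
```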