Probability from log likelihood in LDA output


Probability from log likelihood in LDA output

Quiroz Hernandez, Andres
Hello,

As I understand it, the output for LDA is a log likelihood value for
each word/topic pair, which is a function of log(P(w|t)). Is it possible
to invert that function to obtain P(w|t)? I have a feeling it is not,
since it looks like the final value is obtained as a sum of log
probabilities, but I just wanted to check, since an output as a
probability is more readable than the likelihood value given.

Thanks,

Andres

Re: Probability from log likelihood in LDA output

Ted Dunning
Yes. It should be possible to use exp to get the actual probability. The
fact that it is a sum of log probabilities just means that the probability
is a product of probabilities.

It is possible that the probabilities are not normalized, but that would be
a bit surprising for this kind of algorithm.
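
To make the point above concrete, here is a minimal Java sketch of the identity exp(log a + log b + ...) = a * b * .... The class and method names are hypothetical and are not part of Mahout; the input values are made up for illustration.

```java
// Hypothetical helper, not part of Mahout: shows that exponentiating a sum
// of log probabilities yields the product of the underlying probabilities.
final class LogSumDemo {

    /** Returns the product of the probabilities whose natural logs are given. */
    static double probabilityFromLogs(double[] logProbs) {
        double logSum = 0.0;
        for (double lp : logProbs) {
            logSum += lp; // adding logs corresponds to multiplying probabilities
        }
        return Math.exp(logSum);
    }

    public static void main(String[] args) {
        // Illustrative values only: 0.5 * 0.2 * 0.1
        double[] logs = {Math.log(0.5), Math.log(0.2), Math.log(0.1)};
        System.out.println(probabilityFromLogs(logs)); // approximately 0.01
    }
}
```

This only recovers a value in [0, 1] if the log values really are logs of normalized probabilities.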

On Mon, Dec 6, 2010 at 8:02 AM, Quiroz Hernandez, Andres <
[hidden email]> wrote:

> Hello,
>
> As I understand it, the output for LDA is a log likelihood value for
> each word/topic pair, which is a function of log(P(w|t)). Is it possible
> to invert that function to obtain P(w|t)? I have a feeling it is not,
> since it looks like the final value is obtained as a sum of log
> probabilities, but I just wanted to check, since an output as a
> probability is more readable than the likelihood value given.
>
> Thanks,
>
> Andres
>

RE: Probability from log likelihood in LDA output

Quiroz Hernandez, Andres
Thanks for your quick reply, Ted. It looks like either the probabilities are not normalized or the function being used is not a simple sum of log probabilities, because exp does not always return a value between 0 and 1. I will take a look at the code to see if I can find exactly how the value is calculated (but if anyone knows the function used, and if I can directly invert it to find P(w|t) please let me know).

Thanks again,

Andres

-----Original Message-----
From: Ted Dunning [mailto:[hidden email]]
Sent: Monday, December 06, 2010 11:57 AM
To: [hidden email]
Subject: Re: Probability from log likelihood in LDA output

Yes. It should be possible to use exp to get the actual probability. The
fact that it is a sum of log probabilities just means that the probability
is a product of probabilities.

It is possible that the probabilities are not normalized, but that would be
a bit surprising for this kind of algorithm.


Re: Probability from log likelihood in LDA output

David Hall-17
Hi,

The scores aren't (log) normalized until they're loaded in the map
phase. Take a look at LDAState. The array

private final double[] logTotals; // log \sum p(w|t) for topic=1..nTopics

in LDAState holds the normalization constants. The method
logProbWordGivenTopic is the intended way to access the normalized
values, and LDADriver#createState is a roundabout way of creating an
LDAState.

-- David
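
As a rough illustration of the normalization described above, the sketch below assumes that each topic's log total is log(sum over words of exp(score)), as the comment on logTotals suggests, and that p(w|t) = exp(score - logTotal). The class and method names are hypothetical and do not mirror Mahout's LDAState API; the scores are made up.

```java
// Sketch only: hypothetical names, not Mahout's API. Assumes the per-topic
// normalizer is log(sum_w exp(score[w])), per the comment on logTotals.
final class TopicNormalizer {

    /** Computes log(sum_w exp(logScores[w])) stably by factoring out the maximum. */
    static double logSumExp(double[] logScores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) {
            max = Math.max(max, s);
        }
        if (Double.isInfinite(max)) {
            return max; // empty input or all scores are -infinity
        }
        double sum = 0.0;
        for (double s : logScores) {
            sum += Math.exp(s - max);
        }
        return max + Math.log(sum);
    }

    /** Converts one topic's unnormalized log scores into probabilities p(w|t). */
    static double[] normalize(double[] logScores) {
        double logTotal = logSumExp(logScores);
        double[] probs = new double[logScores.length];
        for (int w = 0; w < logScores.length; w++) {
            probs[w] = Math.exp(logScores[w] - logTotal);
        }
        return probs;
    }

    public static void main(String[] args) {
        // Made-up scores for a single topic; exp(1.2) is above 1, so the raw
        // exponentials are not probabilities until they are normalized.
        double[] scores = {1.2, 0.3, -0.5};
        for (double p : normalize(scores)) {
            System.out.println(p);
        }
    }
}
```

Subtracting the log total before calling exp is equivalent to dividing by the sum of the exponentiated scores, so the results for one topic sum to 1 while the ratios between words are preserved.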

On Mon, Dec 6, 2010 at 12:06 PM, Quiroz Hernandez, Andres
<[hidden email]> wrote:

> Thanks for your quick reply, Ted. It looks like either the probabilities are not normalized or the function being used is not a simple sum of log probabilities, because exp does not always return a value between 0 and 1. I will take a look at the code to see if I can find exactly how the value is calculated (but if anyone knows the function used, and if I can directly invert it to find P(w|t) please let me know).
>
> Thanks again,
>
> Andres