The intercept tells you the base rate of positive cases, expressed in
log-odds. If you have any very common features, then their weights should be
included in this estimate as well. L_1 regularization will tend to prefer
the intercept over any feature that has less than 100% prevalence. If you
have features with 100% prevalence and constant weight, they are redundant
with the intercept, so you might as well eliminate them anyway.

To convert a log-odds value x to a probability, use 1/(1+exp(-x)).
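
As a rough illustration of that conversion, here is a minimal Python sketch.
The function names and the example intercept of -2.0 are made up for
illustration and do not come from any particular library or model report. The
branch on the sign of x is only there to avoid overflow for large negative
inputs; it computes the same 1/(1+exp(-x)).

```python
import math


def log_odds_to_probability(x: float) -> float:
    """Map a log-odds value x to a probability via 1/(1+exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use the algebraically equivalent form
    # exp(x)/(1+exp(x)) so that exp() never receives a large positive
    # argument and overflows.
    z = math.exp(x)
    return z / (1.0 + z)


def probability_to_log_odds(p: float) -> float:
    """Inverse mapping: log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical intercept weight, chosen only for illustration.
intercept = -2.0
base_rate = log_odds_to_probability(intercept)
print(f"intercept {intercept} -> base rate {base_rate:.3f}")
```

With all features absent, a model whose intercept is -2.0 would predict a
positive case roughly 12% of the time; an intercept of 0 corresponds to an
even 50%.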

On Wed, Dec 15, 2010 at 12:06 PM, Adrian E. Gould <[hidden email]> wrote:
> The intercept sometimes appears in model dissection reports. What can the
> intercept's weight tell me about my model?