# Model Analysis

Larch includes several tools to support analysis of estimated model results.  We will demonstrate
some of these features here using the simple example model for MTC mode choice:

In [None]:
import larch as lx

m = lx.example(1)
m.estimate(quiet=True)
m.parameter_summary()

## Choice and Availability Summary

A simple aggregate analysis of the data's choice and availability statistics is available
via `choice_avail_summary` even without any model structure or parameters, as this
summary can be constructed from the data alone.

In [None]:
m.choice_avail_summary()

## Analyze Predictions

Larch includes methods to analyze model predictions across various dimensions.  The [`analyze_predictions_co`](larch.Model.analyze_predictions_co)
method can be used to examine how well the model predicts choices against any available (or computable)
attribute of the chooser.  For example, consider the basic example model for MTC mode choice.
This model includes a utility function that incorporates alternative specific constants, level of
service variables (time and cost), as well as alternative-specific parameters to account for income.

We may be interested in knowing how well the model predicts choices across various age levels. To
see this, we can pass the "age" variable to [`analyze_predictions_co`](larch.Model.analyze_predictions_co):

In [None]:
m.analyze_predictions_co("age")

This gives us a table of mode choices, segmented into five age categories. These
categories were selected by the `pandas.qcut` function to divide the
sample of observations roughly into quintiles.  We can see the mean and standard deviation
of the model predictions for each mode choice in each age group, as well as
the actual observed count of choices in each group.  The `signif` column gives
the level of significance of the difference between the predicted totals and
the observed totals.  A small number in this column indicates that, assuming
the model is correct, it would be very unlikely to actually collect the observed
data.  The very small significance values in the lowest age group
are bold and highlighted in red, as these values suggest there
is a problem in our model.
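To build intuition for what a significance value of this kind measures, the following is a minimal sketch using only the Python standard library. It assumes each chooser's outcome is an independent Bernoulli draw with the predicted probability, and applies a normal approximation to the predicted total. The helper name `total_significance` and the toy numbers are hypothetical, and this is not necessarily the exact calculation Larch performs.

```python
from math import sqrt
from statistics import NormalDist


def total_significance(probabilities, observed_total):
    """Two-tailed significance of an observed count against predicted probabilities.

    Assumption: choosers are independent, so the predicted total has mean
    sum(p) and variance sum(p * (1 - p)); a normal approximation is used.
    """
    expected = sum(probabilities)
    stdev = sqrt(sum(p * (1.0 - p) for p in probabilities))
    tail = NormalDist(mu=expected, sigma=stdev).cdf(observed_total)
    return expected, stdev, 2.0 * min(tail, 1.0 - tail)


# Hypothetical segment: 100 choosers, each predicted to pick a mode with p = 0.2
probs = [0.2] * 100
expected, stdev, signif_close = total_significance(probs, observed_total=21)
_, _, signif_far = total_significance(probs, observed_total=40)

print(f"expected={expected:.1f}, stdev={stdev:.1f}")
print(f"observed 21 -> signif={signif_close:.3f}")
print(f"observed 40 -> signif={signif_far:.2e}")
```

An observed count close to the predicted total yields a large value, while a count several standard deviations away yields a very small one, which is the pattern flagged in red in the table.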

We can also generate a figure to present this same information in a more
visual representation, using [`analyze_predictions_co_figure`](larch.Model.analyze_predictions_co_figure).

In [None]:
m.analyze_predictions_co_figure("age")

If we prefer to analyze age using specific categorical breakpoints, we can do so
by providing the preferred explicit bin breakpoints to the 
[`analyze_predictions_co`](larch.Model.analyze_predictions_co) method, which
are then used by [`pandas.cut`](pandas.cut) to categorize the data.

In [None]:
m.analyze_predictions_co("age", bins=[0, 25, 45, 65, 99])
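As a rough illustration of how explicit breakpoints assign observations to categories, the sketch below mimics the default behavior of `pandas.cut`, where each interval is open on the left and closed on the right. It uses only the standard library; the helper `bin_label` and the sample ages are hypothetical and serve only to show which side of a breakpoint a boundary value lands on.

```python
from bisect import bisect_left

bins = [0, 25, 45, 65, 99]


def bin_label(value, edges):
    """Return the right-closed interval containing value, or None if outside."""
    position = bisect_left(edges, value)
    if position == 0 or position == len(edges):
        # At or below the lowest edge, or above the highest edge
        return None
    return f"({edges[position - 1]}, {edges[position]}]"


# Hypothetical ages, including values that sit exactly on a breakpoint
ages = [18, 25, 26, 45, 70, 99, 100]
labels = [bin_label(age, bins) for age in ages]

for age, label in zip(ages, labels):
    print(age, label)
```

Under this convention an age of exactly 25 falls in the first category, and values outside the outermost breakpoints are left uncategorized, so the breakpoints should span the full range of the data.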

We can also apply non-uniform weights to the observations, by passing an expression to the `wgt` argument of 
the [`analyze_predictions_co`](larch.Model.analyze_predictions_co) method.  For example, here we 
overweight persons who work in the core CBD:

In [None]:
m.analyze_predictions_co("age", wgt="1.5 if wkccbd else 1.0")

Note that weights are *not* normalized within the analysis, so if you use something 
like a population expansion weight or other large value, you will see results that
appear to be extraordinarily significant across the board.

In [None]:
m.analyze_predictions_co("age", wgt="1000")

To counteract this effect, you can normalize weights before providing the data to Larch, 
or explicitly in the `wgt` expression.
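One common normalization, sketched here with the standard library only, is to divide each weight by the mean weight so that the weights sum to the number of observations. The weight values below are hypothetical, and this shows the pre-processing step rather than any Larch call.

```python
from statistics import fmean

# Hypothetical population expansion weights, one per observation
raw_weights = [1200.0, 800.0, 1500.0, 500.0]

# Rescale so the average weight is 1.0; relative weights are unchanged
mean_weight = fmean(raw_weights)
normalized = [w / mean_weight for w in raw_weights]

print(normalized)
print(sum(normalized), len(raw_weights))
```

Because only the scale changes, the ratio between any two observations' weights is preserved while the effective sample size stays equal to the actual number of observations.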

Weights in [`analyze_predictions_co`](larch.Model.analyze_predictions_co) are computed
separately from weighting that is applied in estimation, but if weights are used in
estimation you can choose to apply the same weights here by setting `wgt=True`.

## Elasticity

Users can also review the elasticity of demand with respect to various input variables, using the 
[`analyze_elasticity`](larch.Model.analyze_elasticity) method.  This method accepts a variable
name and computes the elasticity with respect to that variable.  For `idca` format variables,
you can also provide an `altid` (the integer code for an individual alternative), and the
elasticity will be computed with respect to a change in only that alternative's values for the
selected variable.  For example, in the model we are reviewing here, cost is stored in a single
`idca` format variable, but if we want to see the elasticity with respect to transit cost specifically
we can do so like this:

In [None]:
m.analyze_elasticity("totcost", altid=4)

For `idco` format variables, we can still compute elasticities, but only without the `altid` argument.
An `idco` variable describes the chooser rather than any one alternative, so there is no meaningful
elasticity with respect to something like "income when choosing to drive".

In [None]:
m.analyze_elasticity("hhinc")

Elasticities can also be computed by segments of choosers, in a manner mirroring the segmentation available
from the [`analyze_predictions_co`](larch.Model.analyze_predictions_co) method.  By adding the `q` argument
to break the data into quantiles (and optionally the `n` argument to set the number of quantiles), we can see
elasticity by segment.  For example, here we can see the price elasticity of demand for transit is (slightly)
increasing as income increases.

In [None]:
m.analyze_elasticity("totcost", altid=4, q="hhinc", n=3)

## Full Probability Array

In addition to the pre-packaged analysis above, Larch makes available the full 
[`probability`](larch.Model.probability) array (among other internals), so that 
advanced users can slice and analyze the results in arbitrarily complex ways.

In [None]:
m.probability(return_format="dataframe")

In conjunction with manipulations of the data and model parameters, users can evaluate nearly any type of elasticity,
response function, or summary statistic.
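As one example of a custom summary, the sketch below computes predicted mode shares by segment from a table of per-chooser probabilities. The small `probabilities` list is a hypothetical stand-in for the model's probability array (one row per chooser, one column per alternative), and the segment labels are invented for illustration.

```python
# Hypothetical stand-in for the probability array: one row per chooser
probabilities = [
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.10, 0.40, 0.50],
]
# Hypothetical segment label for each chooser
segments = ["low", "low", "high", "high"]


def expected_shares(rows):
    """Average the probabilities column by column to get predicted shares."""
    n_alts = len(rows[0])
    return [sum(row[j] for row in rows) / len(rows) for j in range(n_alts)]


shares_by_segment = {}
for label in sorted(set(segments)):
    rows = [p for p, s in zip(probabilities, segments) if s == label]
    shares_by_segment[label] = expected_shares(rows)

for label, shares in shares_by_segment.items():
    print(label, [round(s, 3) for s in shares])
```

The same pattern extends to response functions: change an input variable, recompute the probabilities, and compare the resulting shares with the baseline.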