109: Swissmetro Nested Logit Mode Choice#

import pandas as pd

import larch as lx

This example is a mode choice model built using the Swissmetro example dataset. First we create the Dataset and Model objects.

raw_data = pd.read_csv(lx.example_file("swissmetro.csv.gz")).rename_axis(index="CASEID")
data = lx.Dataset.construct.from_idco(raw_data, alts={1: "Train", 2: "SM", 3: "Car"})
m = lx.Model(data.dc.query_cases("PURPOSE in (1,3) and CHOICE != 0"))

We can attach a title to the model. The title does not affect the calculations as all; it is merely used in various output report styles.

m.title = "swissmetro example 09 (nested logit)"

We need to identify the availability and choice variables.

m.availability_co_vars = {
    1: "TRAIN_AV * (SP!=0)",
    2: "SM_AV",
    3: "CAR_AV * (SP!=0)",
}
m.choice_co_code = "CHOICE"

The swissmetro dataset, as with all Biogeme data, is only in co format.

from larch import P, X

m.utility_co[1] = P.ASC_TRAIN + P.B_TIME * X.TRAIN_TT + P.B_COST * X("TRAIN_CO*(GA==0)")
m.utility_co[2] = P.B_TIME * X.SM_TT + P.B_COST * X("SM_CO*(GA==0)")
m.utility_co[3] = P.ASC_CAR + P.B_TIME * X.CAR_TT + P.B_COST * X("CAR_CO")

To create a new nest, we can use the graph.new_node command. For this example, we want to nest together the Train and Car modes into a “existing” modes nest. Those are modes 1 and 3, so we can use the graph.new_node command like this:

m.graph.new_node(parameter="existing", children=[1, 3], name="Existing")
4

In a Jupyter notebook, we can verify the nesting structure visually if we like.

m.graph
Tree cluster_elemental Elemental Alternatives 1 Train (1) 2 SM (2) 3 Car (3) 0 Root 0->2 4 Existing (4) 0->4 4->1 4->3

Larch will find all the parameters in the model, but we’d like to output them in a rational order. We can use the ordering method to do this:

m.ordering = [
    (
        "ASCs",
        "ASC.*",
    ),
    (
        "LOS",
        "B_.*",
    ),
    ("LogSums", "existing"),
]

We can estimate the models and check the results match up with those given by Biogeme:

m.set_cap(15)
result = m.maximize_loglike()
result
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Larch Model Dashboard                                                                 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ optimization complete                                                                 │
│ Log Likelihood Current =      -5236.899902 Best =      -5236.899902                   │
│ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ │
│ ┃ Parameter                       Value           Gradient        Best           ┃ │
│ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │
│ │ ASC_CAR                        │ -0.16718941    │  0.084960938   │ -0.16718941    │ │
│ │ ASC_TRAIN                      │ -0.51194449    │ -0.063713074   │ -0.51194449    │ │
│ │ B_COST                         │ -0.0085665424  │ -1.6259766     │ -0.0085665424  │ │
│ │ B_TIME                         │ -0.0089865975  │ -1.1230469     │ -0.0089865975  │ │
│ │ existing                       │  0.48685183    │ -0.083129883   │  0.48685183    │ │
│ └────────────────────────────────┴────────────────┴────────────────┴────────────────┘ │
└───────────────────────────────────────────────────────────────────────────────────────┘
/opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/jax/_src/interpreters/xla.py:115: RuntimeWarning: overflow encountered in cast
  return np.asarray(
      message: Optimization terminated successfully
      success: True
       status: 0
            x: [-1.672e-01 -5.119e-01 -8.567e-03 -8.987e-03  4.869e-01]
          nit: 16
          jac: [ 8.496e-02 -6.371e-02 -1.626e+00 -1.123e+00 -8.313e-02]
         nfev: 47
         njev: 16
      n_cases: 6768
 total_weight: 6768.0
      logloss: 0.7737736262328235
      loglike: -5236.89990234375
m.calculate_parameter_covariance(robust=True)
m.parameter_summary()
    Value Std Err t Stat Signif Robust Std Err Robust t Stat Robust Signif Null Value
Category Parameter                
ASCs ASC_CAR -0.167  0.0371 -4.50 ***  0.0545 -3.07 ** 0.00
ASC_TRAIN -0.512  0.0452 -11.33 ***  0.0791 -6.47 *** 0.00
LOS B_COST -0.00857  0.000463 -18.51 ***  0.000600 -14.27 *** 0.00
B_TIME -0.00899  0.000570 -15.77 ***  0.00107 -8.39 *** 0.00
LogSums existing  0.487  0.0279 -18.39 ***  0.0389 -13.18 *** 1.00