larch.model.troubleshooting.doctor

doctor(model: NumbaModel, *, repair_ch_av: Literal['?', '+', '-', '!'] | None = '?', repair_ch_zq: Literal['?', '-', '!'] | None = None, repair_av_zq: Literal['?', '-', '!'] | None = None, repair_noch_nzwt: Literal['?', '+', '-'] | None = '?', repair_nan_wt: Literal['?', True, '!'] | None = '?', repair_nan_data_co: Literal['?', True, '!'] | None = '?', check_low_variance_data_co: Literal['?', '!'] | None = None, check_overspec: Literal['?', '!', None] | None = None, repair_nan_utility: Literal['?', True, '!'] | None = '?', verbose: int = 3, warning_stacklevel: int = 2)

Diagnose data problems with a model.

The doctor checks for common data problems that can cause numerical instability in a model. It returns the model along with a dictionary of the problems found, and can optionally repair them.

Parameters:
  • model (larch.Model) – The model to diagnose.

  • repair_ch_av ({'?', '+', '-', '!'}, default '?') – How to repair the data if some alternatives are chosen but not available. The plus (‘+’) will make the conflicting alternatives available, overriding the availability status. The minus (‘-’) will make them not chosen (possibly leaving no chosen alternative). A question mark (‘?’) effects no repair, and simply emits a warning without interrupting program execution. An exclamation mark (‘!’) will raise an error if there are any conflicts.

  • repair_ch_zq ({'?', '-', '!'}, default None) – How to repair the data if some observations are chosen but have zero quantity. The minus (‘-’) will make alternatives with zero quantity not chosen (possibly leaving no chosen alternative). A question mark (‘?’) effects no repair, and simply emits a warning. An exclamation mark (‘!’) will raise an error if there are any conflicts.

  • repair_av_zq ({'?', '-', '!'}, default None) – How to repair the data if some observations are available but have zero quantity. The minus (‘-’) will make alternatives with zero quantity not available (possibly leaving no available alternatives). A question mark (‘?’) effects no repair, and simply emits a warning. An exclamation mark (‘!’) will raise an error if there are any conflicts.

  • repair_noch_nzwt ({'?', '+', '-'}, default '?') – How to repair the data if some observations have no choice but have some weight. Minus (‘-’) will make the weight zero when there is no choice. Plus (‘+’) will make the weight zero, and also autoscale all remaining weights so the total of the case weights equals the number of cases. A question mark (‘?’) effects no repair, and simply emits a warning.

  • repair_nan_wt ({'?', '!', True}, default '?') – How to repair the data if some weight values are NaN. Any true value other than “?” or “!” will make NaN values in weight zero. The question mark simply emits a warning if there are NaN values found, while the exclamation mark will raise an error.

  • repair_nan_data_co ({'?', '!', True}, default '?') – How to repair the data if some data_co values are NaN. Any true value other than “?” or “!” will make NaN values in data_co zero. The question mark simply emits a warning if there are NaN values found, while the exclamation mark will raise an error.

  • check_low_variance_data_co ({'?', '!'}, default None) – Check if any data_co columns have very low variance. No repairs are available for this check. The question mark simply emits a warning if there are issues found, while the exclamation mark will raise an error.

  • check_overspec ({'?', '!'}, default None) – Check model for possible over-specification. No repairs are available for this check. A question mark (‘?’) simply emits a warning if a possible over-specification is found. An exclamation mark (‘!’) will raise an error if possible over-specification is found. This is considered a “deep” check, and will only be run if there are no known data problems found by the other checks.

  • repair_nan_utility ({'?', '!', True}, default '?') – How to repair the data if some utility values are NaN at current parameters. Any true value other than “?” or “!” will take alternatives with NaN values in utility, and make them unavailable. The question mark simply emits a warning if there are NaN values found, while the exclamation mark will raise an error. This is considered a “deep” check, and will only be run if there are no known data problems found by the other checks.

  • verbose (int, default 3) – The number of example rows to list for each problem.

  • warning_stacklevel (int, default 2) – The stacklevel for warnings.
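
The four repair_ch_av settings described above can be illustrated with a small pure-Python sketch. This is not larch's implementation: the helper name toy_repair_ch_av and the use of plain 0/1 lists for a single case are assumptions made for illustration only.

```python
import warnings


def toy_repair_ch_av(ch, av, mode="?"):
    """Toy model of the repair_ch_av modes for one case (hypothetical helper).

    ch and av are equal-length lists of 0/1 flags, one entry per alternative.
    Returns new (ch, av) lists; the inputs are left unmodified.
    """
    if mode not in ("?", "+", "-", "!", None):
        raise ValueError(f"invalid repair setting: {mode!r}")
    ch, av = list(ch), list(av)
    # A conflict is an alternative that is chosen but not available.
    conflicts = [i for i, (c, a) in enumerate(zip(ch, av)) if c and not a]
    if mode is None or not conflicts:
        return ch, av
    if mode == "+":
        # Override availability so the chosen alternatives become available.
        for i in conflicts:
            av[i] = 1
    elif mode == "-":
        # Drop the choice, possibly leaving no chosen alternative.
        for i in conflicts:
            ch[i] = 0
    elif mode == "?":
        # Warn only; the data is returned unchanged.
        warnings.warn(f"chosen but not available at positions {conflicts}")
    else:  # mode == "!"
        raise ValueError(f"chosen but not available at positions {conflicts}")
    return ch, av


ch, av = [0, 1, 0], [1, 0, 1]  # alternative 1 is chosen but unavailable
print(toy_repair_ch_av(ch, av, "+"))  # availability is overridden
print(toy_repair_ch_av(ch, av, "-"))  # the choice is removed
```

The other repair_* parameters follow the same pattern of symbols, with the set of permitted symbols differing per check as listed above.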

Returns:

  • model (larch.Model) – The model with revised dataset attached.

  • problems (dict) – A dictionary of problems found, with the key being the name of the problem and the value being a DataFrame with the number of bad instances and some example rows.
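
A minimal usage sketch follows, assuming the two documented return values arrive as a (model, problems) tuple and that the function is importable from the module path in this page's heading. The wrapper name diagnose and its doctor_fn argument are hypothetical; doctor_fn exists only so the sketch can be exercised without larch installed.

```python
def diagnose(model, doctor_fn=None):
    """Run the doctor in warn-only mode and list the problem names found.

    `diagnose` and `doctor_fn` are illustrative names, not part of larch.
    """
    if doctor_fn is None:
        # Import path taken from this page's heading; requires larch.
        from larch.model.troubleshooting import doctor as doctor_fn

    # '?' only emits warnings: no repair is applied and no error is raised.
    model, problems = doctor_fn(
        model,
        repair_ch_av="?",
        repair_nan_wt="?",
        verbose=3,  # list up to three example rows per problem
    )
    # `problems` is keyed by problem name; each value is described above
    # as a DataFrame with the bad-instance count and example rows.
    return model, sorted(problems)
```

After reviewing the reported problems, a second call with a repair symbol such as '+' or '-' in place of '?' would apply the corresponding fix.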

Raises:
  • TypeError – If the model is not a Model instance.

  • ValueError – If any of the repair settings are invalid, or if the repair is set to ‘!’ and there are any conflicts found.