Estimation (likelihood-based)

Note

Most functionality here lives in the submodule Estimation (src/SubModules/Estimation).

This page documents likelihood-based estimation of model parameters using the Kalman filter/smoother and (optionally) Bayesian MCMC. IRF matching is documented elsewhere and is intentionally omitted here.

Settings

BASEforHANK.Parsing.EstimationSettingsType
EstimationSettings()

Collect settings for the estimation of the model parameters in a struct.

Use package Parameters to provide initial values. Input and output file names are stored in the fields mode_start_file, data_file, save_mode_file and save_posterior_file.

source

The estimation settings (globally available as e_set) control:

  • Data and observables: data_file (CSV path), observed_vars_input (observable names), data_rename (mapping from CSV to model names), growth_rate_select (transformations), and meas_error_input (observables with measurement error).
  • Variances: shock_names (structural shocks whose variances are estimated) and me_treatment for measurement errors: :fixed (use data-driven caps), :bounded (estimate with uniform priors up to caps), or :unbounded (use priors in meas_error_distr). The cap scale is me_std_cutoff. See measurement_error.
  • Numerics: max_iter_mode (outer iterations for mode finding), optimizer and tolerances (optimizer, x_tol, f_tol), and MCMC controls ndraws, burnin, mhscale (see rwmh).
  • Flags: estimate_model, compute_hessian (estimate Hessian at the mode), and multi_chain_init (overdispersed starts for multiple chains via multi_chain_init).

Data and measurement error

Measurement-error treatment and ordering are derived from e_set and the data. The helper builds the mapping to observable columns, prior distributions, and per-observable caps, and constructs the observation selector matrix H_sel (used to form H = H_sel * [I; gx]) that maps model variables to data.

BASEforHANK.Estimation.measurement_errorFunction
measurement_error(Data, observed_vars, e_set)

Construct the measurement-error specification from user settings and data.

Maps each observable listed in e_set.meas_error_input to its column index in observed_vars, and builds corresponding priors and standard-deviation caps according to e_set.me_treatment.

Arguments

  • Data::AbstractMatrix: observed data matrix of size nobs × nvar
  • observed_vars::AbstractVector{Symbol}: names of observables, length nvar
  • e_set::EstimationSettings: estimation settings controlling measurement errors

Returns

  • meas_error::OrderedDict{Symbol,Int}: mapping from observable names to their column index
  • meas_error_prior::Vector{<:Distribution}: prior distributions for measurement-error stds
  • meas_error_std::Vector{Float64}: per-observable std caps (used when me_treatment ∈ (:bounded, :fixed)) or defaults

Treatment modes (e_set.me_treatment):

  • :unbounded: uses e_set.meas_error_distr (e.g., inverse-gamma) as priors and sets meas_error_std .= e_set.me_std_cutoff (a neutral scaling anchor).
  • :bounded or :fixed: sets a data-driven upper bound for each selected observable as UB_j = e_set.me_std_cutoff * std(skipmissing(Data[:, j])) and assigns Uniform(0, UB_j) priors. For :fixed, these caps are later used directly as fixed stds.

If e_set.meas_error_input is empty, returns an empty OrderedDict(), an empty vector of priors, and an empty Float64[] for stds.

source

Likelihood and prior

The likelihood of a parameter vector combines the linear state-space model with the Kalman filter (or smoother) and the prior density.

  • Linear solution: LinearSolution_reduced_system computes the reduced aggregate linearization for the current parameters and yields gx (state-to-control) and hx (state transition).
  • Observation mapping: H = H_sel * [I; gx] where H_sel selects observed states/controls.
  • Likelihood: kalman_filter on (H, hx) with structural/measurement covariances SCov, MCov.
  • Smoother: kalman_filter_smoother if smoother=true. Missing observations in the CSV (NaN) are passed via a boolean mask to the filter/smoother.
BASEforHANK.Estimation.likeliFunction
likeli(args...; smoother=false)

Compute the (Gaussian) likelihood of observed data given candidate parameters and priors, optionally returning Kalman-smoother outputs.

This docstring covers both methods:

  • likeli(par, Data, Data_missing, H_sel, priors, meas_error, meas_error_std, sr, lr, m_par, e_set; smoother=false)
  • likeli(par, sr, lr, er, m_par, e_set; smoother=false) where er bundles data, masks, priors, and measurement-error info.

Workflow:

  1. Evaluate the prior; if violated, set alarm=true and return a large negative likelihood.
  2. Reconstruct model parameters from par and build structural (SCov) and measurement (MCov) covariances.
  3. Solve the reduced linear system via LinearSolution_reduced_system (aggregate-only update).
  4. Form observation matrix H = H_sel * [I; State2Control] and compute the likelihood using kalman_filter (filter only) or kalman_filter_smoother (if smoother=true).

Returns

  • If smoother == false:

    • log_like::Float64: data log-likelihood
    • prior_like::Float64: prior log-density
    • post_like::Float64: sum of likelihood and prior
    • alarm::Bool: true if prior violated or model solution failed
    • State2Control::AbstractMatrix: the observation mapping gx used to build H
  • If smoother == true:

    • smoother_output: a tuple (log_lik, xhat_tgt, xhat_tgT, Sigma_tgt, Sigma_tgT, s, m) from kalman_filter_smoother

Notes:

  • Measurement-error treatment is governed by e_set.me_treatment:

    • :fixed uses meas_error_std to set diagonal entries of MCov at positions in meas_error.
    • otherwise, measurement-error stds are read from the tail of par in the same order.
source
BASEforHANK.Estimation.priorevalFunction
prioreval(par, priors)

Evaluate the joint prior density at parameters par using Distributions.jl.

Arguments

  • par::Tuple or vector: parameter values
  • priors::Tuple or vector: matching prior distributions

Returns

  • log_priorval::Float64: sum of log-densities if all par[i] ∈ support(priors[i]), otherwise -9e15
  • alarm::Bool: true if any parameter lies outside its prior support
source
BASEforHANK.Estimation.kalman_filterFunction
kalman_filter(H, LOM, Data, D_miss, SCov, MCov, e_set)

Compute the log-likelihood of Data by applying the Kalman filter to the state-space representation (H, LOM) of the model.

This docstring covers all kalman_filter methods, with and without missing data.

Arguments

  • H::AbstractMatrix: observation matrix (maps states to observed variables)
  • LOM::AbstractMatrix: state transition (law of motion) matrix
  • Data: data matrix with observations (time × variable); may contain missing
  • D_miss/D_nomiss::BitArray{2}: mask for missing/available observations by time and variable
  • SCov::AbstractMatrix: covariance of structural shocks
  • MCov::AbstractMatrix: covariance of measurement errors
  • e_set: estimation settings (e.g., debug_print flag)

Returns

  • loglik::Float64: the (Gaussian) log-likelihood of Data given the model
source
BASEforHANK.Estimation.kalman_filter_smootherFunction
kalman_filter_smoother(H, LOM, Data, D_nomiss, SCov, MCov, e_set)

Compute likelihood and estimate of underlying states given the full observed Data by applying the Kalman smoother to the state-space representation (H,LOM) of the model.

Arguments

  • H::Array{Float64,2}: observation equation
  • LOM::Array{Float64,2}: law of motion for states
  • Data::Array{Union{Missing,Float64},2},D_nomiss::BitArray{2}: data (time $imes$ variable); marker for existent data
  • SCov::Array{Float64,2}: covariance of structural shocks
  • MCov::Array{Float64,2}: covariance of measurement error

Returns

  • log_lik: log-likelihood
  • xhat_tgt,xhat_tgT: estimate of underlying states from forward iteration [xhat_tgt] and backward iteration [xhat_tgT]
  • Sigma_tgt,Sigma_tgT: estimate of covariance matrices from forward iteration [Sigma_tgt] and backward iteration [Sigma_tgT]
  • s,m: smoothed structural shocks [s] and measurement errors [m]
source
BASEforHANK.Estimation.kalman_filter_herbstFunction
    kalman_filter_herbst(Data, LOM, SCov, H, MCov, t0, e_set)

Compute the Gaussian log-likelihood using a steady-state Kalman filter variant that iteratively updates gain-related matrices until convergence (per-period fixed-point on Kg). This follows a formulation akin to steady-state filtering as used in the DSGE estimation literature.

Arguments

- `Data::AbstractMatrix`: observation matrix (time × variable)
- `LOM::AbstractMatrix`: state transition matrix
- `SCov::AbstractMatrix`: covariance of structural shocks
- `H::AbstractMatrix`: observation matrix
- `MCov::AbstractMatrix`: covariance of measurement errors
- `t0::Integer`: number of initial observations excluded from the likelihood (burn-in)
- `e_set`: estimation settings (uses `debug_print` for error reporting)

Returns

- `log_lik::Float64`: (Gaussian) log-likelihood after convergence (or max iterations)

Notes:

  • Ensures symmetry/PD of intermediate covariance matrices and checks determinant signs; returns a large negative likelihood if numerical issues are detected (and debug_print).
  • Convergence criterion is based on maximum(abs.(Kg - Kg_old)) < tol.
source

Mode finding

Find the posterior mode by maximizing the log posterior (likelihood + prior). The routine updates the reduced system between optimization rounds and can compute a Hessian at the mode.

BASEforHANK.find_modeFunction
find_mode(sr, lr, m_par, e_set)

Find parameter that maximizes likelihood of data given linearized model lr.

Arguments

Returns

  • EstimResults, containing all returns of mode_finding()

    • posterior_mode::Float64: value of the log posterior at the mode
    • smoother_output: output from likeli(...; smoother=true) (Kalman smoother tuple)
    • sr::SteadyResults: Updated SteadyResults (may be modified by the routine)
    • lr::LinearResults: Updated LinearResults (may be modified by the routine)
    • m_par::ModelParameters: Updated ModelParameters (may be modified)
  • posterior_mode::Float64: value of the log posterior at the mode

  • smoother_output: output from likeli(...; smoother=true) (Kalman smoother tuple)

  • sr::SteadyResults: Updated SteadyResults (may be modified by the routine)

  • lr::LinearResults: Updated LinearResults (may be modified by the routine)

  • m_par::ModelParameters: Updated ModelParameters (may be modified)

Notes

  • If e_set.mode_start_file is set, the function attempts to read a JSON file at that path to initialize the optimization; the file is read (JSON.parse) and its contents are used to build the starting vector. This function therefore performs file I/O when e_set.mode_start_file != "".
source
BASEforHANK.Estimation.mode_findingFunction
mode_finding(sr, lr, m_par, e_set, par_start)

Find the posterior mode of the parameter vector via numerical optimization.

Two estimation modes are supported:

  • Likelihood estimation (default, when e_set.irf_matching == false): loads time-series data, builds the observation matrix, and maximizes the log posterior via likeli using Optim.
  • IRF matching (when e_set.irf_matching == true): loads IRF targets from CSV according to e_set.irf_matching_dict, builds weights, and maximizes the IRF-matching objective via irfmatch.

In both modes, the routine updates the reduced model using model_reduction/update_model between optimization rounds to improve accuracy, and it optionally computes a Hessian at the mode.

Arguments

  • sr: structural/reduced-system container (sizes, indexes, reduction flags)
  • lr: linearization container (A, B matrices and caches)
  • m_par: model parameter struct (Flatten-compatible)
  • e_set::EstimationSettings: controls data paths, observables, priors, optimizer and tolerances, measurement-error treatment, IRF-matching settings, and flags like compute_hessian
  • par_start::AbstractVector: initial parameter vector (includes measurement-error stds if estimated)

Returns

Returns a 16-tuple with elements depending on the mode:

  1. par_final::Vector{Float64}: parameter vector at the mode
  2. hessian_final::Matrix{Float64}: Hessian at the mode (finite-diff or identity if disabled)
  3. posterior_mode::Float64: value of the objective at the mode (log posterior)
  4. meas_error: mapping from observable names to column indexes (likelihood mode); [] for IRF matching
  5. meas_error_std::Vector{Float64}: std caps or fixed stds (likelihood mode); [] for IRF matching
  6. parnames::Vector{Symbol}: names of estimated parameters (incl. ME params if not fixed)
  7. Data::Matrix{Float64}: data matrix (likelihood mode); [] for IRF matching
  8. Data_missing::BitMatrix: missingness mask (likelihood mode); [] for IRF matching
  9. IRFtargets::Array{Float64,3}: target IRFs (IRF matching); empty otherwise
  10. IRFserrors::Array{Float64,3}: IRF target standard errors (IRF matching); empty otherwise
  11. H_sel::Matrix{Float64}: selection matrix mapping states/controls to observables (likelihood mode); empty for IRF matching
  12. priors::Vector{<:Distribution}: priors for structural (and ME) parameters
  13. smoother_output: Kalman smoother output from likeli(...; smoother=true) (likelihood mode); empty for IRF matching
  14. m_par: parameter struct updated at par_final
  15. sr: structural container after final reduction update
  16. lr: linearization container after final update

Notes:

  • Measurement errors are constructed by measurement_error and included in parnames and priors unless e_set.me_treatment == :fixed. # No irf matching? => likelihood estimation
  • The optimizer and tolerances are taken from e_set (e.g., e_set.optimizer, e_set.x_tol). # Load data
source

Bayesian MCMC

Random-Walk Metropolis–Hastings draws from the posterior, optionally using overdispersed initialization to seed multiple chains. The posterior for each draw is evaluated via likeli.

BASEforHANK.Estimation.rwmhFunction
rwmh(xhat, Σ, sr, lr, er, m_par, e_set)

Random-Walk Metropolis–Hastings sampler for the posterior of the model parameters.

This routine draws e_set.ndraws + e_set.burnin proposals using a multivariate normal random walk with proposal covariance Σ, evaluates the (log) posterior using either likeli (default) or irfmatch (if e_set.irf_matching == true), and accepts/rejects via the usual MH rule.

Arguments

  • xhat::Vector{Float64}: initial parameter vector (at least approximately feasible)
  • Σ::Symmetric{Float64,Array{Float64,2}}: proposal covariance matrix for the random walk
  • sr, lr, er, m_par, e_set: model structures and estimation settings; see package docs

Returns

  • draws::Matrix{Float64}: (e_set.ndraws + e_set.burnin) × length(xhat) matrix of draws, each row a parameter vector
  • posterior::Vector{Float64}: log posterior for each stored draw (likelihood + prior)
  • accept_rate::Float64: overall acceptance rate

Notes:

  • When e_set.irf_matching == false, the posterior is computed with likeli.
  • When e_set.irf_matching == true, IRF targets are read from a CSV according to e_set.irf_matching_dict, and the posterior is computed with irfmatch using inverse-variance weights. The function periodically prints progress (draw count, acceptance rate, posterior) and the current parameters.
source
BASEforHANK.Estimation.multi_chain_initFunction
multi_chain_init(xhat, Σ, sr, lr, er, m_par, e_set)

Construct an overdispersed starting value for a Metropolis–Hastings chain.

Draws proposals from MvNormal(0, Σ) scaled by 2 * e_set.mhscale around xhat and keeps the first candidate that yields a feasible model (no alarm from likeli). Tries up to 100 times.

Arguments

  • xhat::Vector{Float64}: baseline parameter vector
  • Σ::Symmetric{Float64,Array{Float64,2}}: covariance used to generate overdispersed proposals
  • sr, lr, er, m_par, e_set: model structures and estimation settings

Returns

  • init_draw::Vector{Float64}: overdispersed, feasible starting parameter vector
  • init_success::Bool: whether a feasible start was found within 100 attempts
source
BASEforHANK.sample_posteriorFunction
sample_posterior(sr, lr, er, m_par, e_set)

Sample from the posterior with Random-Walk Metropolis–Hastings rwmh(), compute the sample mean as a point estimate, and return draws and diagnostics.

Arguments

Returns

  • sr::SteadyResults: Updated SteadyResults
  • lr::LinearResults: Updated LinearResults
  • er::EstimResults: Updated EstimResults
  • m_par::ModelParameters: Updated ModelParameters
  • draws_raw::Array{Float64,2}: Raw draws from the posterior
  • posterior::Array{Float64,1}: Posterior
  • accept_rate::Float64: Acceptance rate
  • par_final::Array{Float64,1}: Mode of the posterior
  • hessian_sym::Symmetric{Float64,Array{Float64,2}}: Hessian of the posterior
  • smoother_output: Kalman smoother output at par_final (may be empty when e_set.irf_matching == true)

Notes

  • This routine prints progress messages to STDOUT while running ("Started MCMC...").
  • When e_set.irf_matching == true the function returns smoother_output = [].
source

Priors are evaluated by BASEforHANK.Estimation.prioreval, see above.

To obtain smoothed states at a parameter value (e.g., the posterior mean), call likeli(...; smoother=true) to access the Kalman smoother’s output.

Marginal likelihood

Estimate the (log) marginal likelihood from posterior draws using the Modified Harmonic Mean estimator of Geweke (1998).

BASEforHANK.Estimation.marginal_likeliFunction
marginal_likeli(draws, posterior)

Estimate the log marginal likelihood using the Modified Harmonic Mean estimator of Geweke (1998), averaging over a grid of truncation levels.

Given MCMC draws θ_i and their log posteriors posterior[i], the estimator computes Gaussian kernel densities within a χ²-ball defined by τ ∈ {0.1, …, 0.9} and averages the implied marginal-likelihood estimates for stability.

Arguments

  • draws::AbstractMatrix{<:Real}: ndraws × npars matrix of parameter draws
  • posterior::AbstractVector{<:Real}: log posterior evaluations for the draws

Returns

  • marg_likeli::Float64: estimated log marginal likelihood

Caution: As with all harmonic-mean style estimators, results can be sensitive to tail behavior and mixing. Use with diagnostics and, if possible, alternative estimators.

source

Utilities

BASEforHANK.Estimation.nearest_spdFunction
nearest_spd(A)

Return the nearest symmetric positive definite (SPD) matrix to A in Frobenius norm.

Algorithm (Higham):

  • Symmetrize A to B = (A + A')/2.
  • Compute the symmetric polar factor H of B via SVD, and set Ahat = (B + H)/2.
  • Enforce exact symmetry: Ahat = (Ahat + Ahat')/2.
  • If Ahat is not numerically PD, add a small multiple of the identity until a Cholesky factorization succeeds (capped at 100 tweaks).

Arguments

  • A::AbstractMatrix{<:Real}: square matrix

Returns

  • Ahat::Matrix{<:Real}: symmetric positive definite matrix near A in Frobenius norm

Notes:

  • If A is already SPD, the output is (up to roundoff) A.

  • The SVD-based construction yields the nearest symmetric positive semidefinite matrix; the final identity shift ensures positive definiteness, which is often required for covariance matrices.

  • See N. J. Higham (1988), “Computing a nearest symmetric positive semidefinite matrix”.

    symmetrize A into B

source