Estimation (likelihood-based)
This page documents likelihood-based estimation of model parameters using the Kalman filter/smoother and (optionally) Bayesian MCMC. IRF matching is documented elsewhere and is intentionally omitted here.
Settings
BASEforHANK.Parsing.EstimationSettings — Type
`EstimationSettings()`

Collect settings for the estimation of the model parameters in a struct.
Use package Parameters to provide initial values. Input and output file names are stored in the fields mode_start_file, data_file, save_mode_file and save_posterior_file.
The estimation settings (globally available as e_set) control:
- Data and observables: `data_file` (CSV path), `observed_vars_input` (observable names), `data_rename` (mapping from CSV to model names), `growth_rate_select` (transformations), and `meas_error_input` (observables with measurement error).
- Variances: `shock_names` (structural shocks whose variances are estimated) and `me_treatment` for measurement errors: `:fixed` (use data-driven caps), `:bounded` (estimate with uniform priors up to caps), or `:unbounded` (use priors in `meas_error_distr`). The cap scale is `me_std_cutoff`. See `measurement_error`.
- Numerics: `max_iter_mode` (outer iterations for mode finding), optimizer and tolerances (`optimizer`, `x_tol`, `f_tol`), and MCMC controls `ndraws`, `burnin`, `mhscale` (see `rwmh`).
- Flags: `estimate_model`, `compute_hessian` (estimate the Hessian at the mode), and `multi_chain_init` (overdispersed starts for multiple chains via `multi_chain_init`). A usage sketch follows this list.
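A minimal sketch of constructing the settings and overriding a few of the fields listed above. The keyword-constructor form is assumed to be available via Parameters.jl (see the type docstring); the concrete file path and observable names are purely illustrative:

```julia
using BASEforHANK

# Sketch: override a few estimation settings via keyword arguments
# (keyword construction assumed to be provided by Parameters.jl).
e_set = BASEforHANK.Parsing.EstimationSettings(;
    data_file = "data/observables.csv",          # hypothetical CSV path
    observed_vars_input = [:Ygrowth, :Cgrowth],  # illustrative observable names
    meas_error_input = [:Ygrowth],               # observables with measurement error
    me_treatment = :bounded,                     # estimate ME stds with uniform priors up to caps
    ndraws = 20_000,                             # MCMC draws after burn-in
    burnin = 5_000,
    mhscale = 0.3,                               # proposal scaling for rwmh
)
```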
Data and measurement error
Measurement-error treatment and ordering are derived from `e_set` and the data. The helper builds the mapping to observable columns, prior distributions, and per-observable caps, and constructs the observation selector matrix `H_sel` (used to form `H = H_sel * [I; gx]`) that maps model variables to data.
BASEforHANK.Estimation.measurement_error — Function
`measurement_error(Data, observed_vars, e_set)`

Construct the measurement-error specification from user settings and data.
Maps each observable listed in e_set.meas_error_input to its column index in observed_vars, and builds corresponding priors and standard-deviation caps according to e_set.me_treatment.
Arguments
- `Data::AbstractMatrix`: observed data matrix of size `nobs × nvar`
- `observed_vars::AbstractVector{Symbol}`: names of observables, length `nvar`
- `e_set::EstimationSettings`: estimation settings controlling measurement errors
Returns
- `meas_error::OrderedDict{Symbol,Int}`: mapping from observable names to their column index
- `meas_error_prior::Vector{<:Distribution}`: prior distributions for measurement-error stds
- `meas_error_std::Vector{Float64}`: per-observable std caps (used when `me_treatment ∈ (:bounded, :fixed)`) or defaults
Treatment modes (`e_set.me_treatment`):

- `:unbounded`: uses `e_set.meas_error_distr` (e.g., inverse-gamma) as priors and sets `meas_error_std .= e_set.me_std_cutoff` (a neutral scaling anchor).
- `:bounded` or `:fixed`: sets a data-driven upper bound for each selected observable as `UB_j = e_set.me_std_cutoff * std(skipmissing(Data[:, j]))` and assigns `Uniform(0, UB_j)` priors. For `:fixed`, these caps are later used directly as fixed stds.
If `e_set.meas_error_input` is empty, returns an empty `OrderedDict()`, an empty vector of priors, and an empty `Float64[]` for stds.
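A minimal sketch of the `:bounded`/`:fixed` cap construction described above. Only the formula for `UB_j` and the return layout follow the docstring; the function name and loop structure are illustrative:

```julia
using Distributions, Statistics, OrderedCollections

# Sketch: build per-observable caps and Uniform priors for the
# :bounded / :fixed treatments.
function me_caps_sketch(Data, observed_vars, meas_error_input, me_std_cutoff)
    meas_error = OrderedDict{Symbol,Int}()
    priors = Distribution[]
    stds = Float64[]
    for obs in meas_error_input
        j = findfirst(==(obs), observed_vars)              # column index of the observable
        meas_error[obs] = j
        UB = me_std_cutoff * std(skipmissing(Data[:, j]))  # data-driven cap
        push!(priors, Uniform(0.0, UB))
        push!(stds, UB)
    end
    return meas_error, priors, stds
end
```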
Likelihood and prior
The likelihood of a parameter vector combines the linear state-space model with the Kalman filter (or smoother) and the prior density.
- Linear solution: `LinearSolution_reduced_system` computes the reduced aggregate linearization for the current parameters and yields `gx` (state-to-control) and `hx` (state transition).
- Observation mapping: `H = H_sel * [I; gx]`, where `H_sel` selects observed states/controls.
- Likelihood: `kalman_filter` on `(H, hx)` with structural/measurement covariances `SCov`, `MCov`.
- Smoother: `kalman_filter_smoother` if `smoother=true`. Missing observations in the CSV (NaN) are passed via a boolean mask to the filter/smoother. A sketch of this pipeline follows the list.
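A minimal sketch of the evaluation pipeline under the assumption that `gx` and `hx` are already available from the linear solution; the wrapper name and argument handling are illustrative, not the package's exact internals:

```julia
using LinearAlgebra

# Sketch: given the reduced linear solution (gx, hx), evaluate the data
# log-likelihood. H_sel selects observed rows of [states; controls].
function loglik_sketch(gx, hx, H_sel, Data, Data_missing, SCov, MCov, e_set)
    nstates = size(hx, 1)
    H = H_sel * [Matrix(I, nstates, nstates); gx]   # observation matrix H = H_sel * [I; gx]
    return kalman_filter(H, hx, Data, Data_missing, SCov, MCov, e_set)
end
```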
BASEforHANK.Estimation.likeli — Function
`likeli(args...; smoother=false)`

Compute the (Gaussian) likelihood of observed data given candidate parameters and priors, optionally returning Kalman-smoother outputs.
This docstring covers both methods:
- `likeli(par, Data, Data_missing, H_sel, priors, meas_error, meas_error_std, sr, lr, m_par, e_set; smoother=false)`
- `likeli(par, sr, lr, er, m_par, e_set; smoother=false)`, where `er` bundles data, masks, priors, and measurement-error info.
Workflow:
- Evaluate the prior; if violated, set `alarm=true` and return a large negative likelihood.
- Reconstruct model parameters from `par` and build structural (`SCov`) and measurement (`MCov`) covariances.
- Solve the reduced linear system via `LinearSolution_reduced_system` (aggregate-only update).
- Form the observation matrix `H = H_sel * [I; State2Control]` and compute the likelihood using `kalman_filter` (filter only) or `kalman_filter_smoother` (if `smoother=true`).
Returns
If `smoother == false`:

- `log_like::Float64`: data log-likelihood
- `prior_like::Float64`: prior log-density
- `post_like::Float64`: sum of likelihood and prior
- `alarm::Bool`: true if the prior is violated or the model solution failed
- `State2Control::AbstractMatrix`: the observation mapping `gx` used to build `H`

If `smoother == true`:

- `smoother_output`: a tuple `(log_lik, xhat_tgt, xhat_tgT, Sigma_tgt, Sigma_tgT, s, m)` from `kalman_filter_smoother`
Notes:
Measurement-error treatment is governed by `e_set.me_treatment`:

- `:fixed` uses `meas_error_std` to set diagonal entries of `MCov` at the positions in `meas_error`.
- Otherwise, measurement-error stds are read from the tail of `par` in the same order. A sketch of this placement follows.
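A minimal sketch of how the measurement-error stds could be placed on the diagonal of `MCov`, following the ordering rule above; the function name and surrounding code are illustrative:

```julia
# Sketch: fill the diagonal of MCov. For :fixed, use the precomputed caps;
# otherwise take the stds from the tail of the parameter vector `par`,
# in the same order as the entries of `meas_error`.
function fill_mcov_sketch!(MCov, par, meas_error, meas_error_std, me_treatment)
    n_me = length(meas_error)
    for (k, (obs, j)) in enumerate(pairs(meas_error))
        σ = me_treatment == :fixed ? meas_error_std[k] : par[end - n_me + k]
        MCov[j, j] = σ^2                 # variance on the diagonal
    end
    return MCov
end
```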
BASEforHANK.Estimation.prioreval — Function
`prioreval(par, priors)`

Evaluate the joint prior density at parameters `par` using Distributions.jl.
Arguments
- `par::Tuple` or vector: parameter values
- `priors::Tuple` or vector: matching prior distributions
Returns
- `log_priorval::Float64`: sum of log-densities if all `par[i] ∈ support(priors[i])`, otherwise `-9e15`
- `alarm::Bool`: `true` if any parameter lies outside its prior support
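A minimal sketch of such a joint prior evaluation with Distributions.jl, matching the return convention above; the penalty constant follows the docstring, while the short-circuiting behavior is an assumption:

```julia
using Distributions

# Sketch: evaluate the joint log prior, flagging out-of-support parameters.
function prioreval_sketch(par, priors)
    if all(insupport(priors[i], par[i]) for i in eachindex(par))
        log_priorval = sum(logpdf(priors[i], par[i]) for i in eachindex(par))
        return log_priorval, false        # alarm = false
    else
        return -9e15, true                # alarm = true: outside prior support
    end
end

# Example: prioreval_sketch([0.7, 1.5], (Beta(2, 2), Gamma(2, 1)))
```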
BASEforHANK.Estimation.kalman_filter — Function
`kalman_filter(H, LOM, Data, D_miss, SCov, MCov, e_set)`

Compute the log-likelihood of `Data` by applying the Kalman filter to the state-space representation `(H, LOM)` of the model.
This docstring covers all kalman_filter methods, with and without missing data.
Arguments
- `H::AbstractMatrix`: observation matrix (maps states to observed variables)
- `LOM::AbstractMatrix`: state transition (law of motion) matrix
- `Data`: data matrix with observations (time × variable); may contain `missing`
- `D_miss`/`D_nomiss::BitArray{2}`: mask for missing/available observations by time and variable
- `SCov::AbstractMatrix`: covariance of structural shocks
- `MCov::AbstractMatrix`: covariance of measurement errors
- `e_set`: estimation settings (e.g., `debug_print` flag)
Returns
- `loglik::Float64`: the (Gaussian) log-likelihood of `Data` given the model
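For intuition, a minimal textbook Kalman filter log-likelihood for the state space `x_t = LOM * x_{t-1} + ε_t`, `y_t = H * x_t + u_t` with `ε_t ~ N(0, SCov)` and `u_t ~ N(0, MCov)`. It omits the missing-data handling and numerical safeguards of the package routine:

```julia
using LinearAlgebra

# Sketch: Gaussian log-likelihood via the Kalman filter (no missing data).
function kalman_loglik_sketch(H, LOM, Data, SCov, MCov)
    n = size(LOM, 1)
    x = zeros(n)                                   # state mean, initialized at zero
    # unconditional state covariance: solve P = LOM*P*LOM' + SCov via vectorization
    P = reshape((I - kron(LOM, LOM)) \ vec(SCov), n, n)
    P = (P + P') / 2
    loglik = 0.0
    for t in 1:size(Data, 1)
        v = Data[t, :] - H * x                     # forecast error
        F = H * P * H' + MCov                      # forecast-error covariance
        F = (F + F') / 2
        Fchol = cholesky(F)
        loglik += -0.5 * (length(v) * log(2π) + logdet(Fchol) + dot(v, Fchol \ v))
        K = P * H' / F                             # Kalman gain
        x = LOM * (x + K * v)                      # predicted state for t+1
        P = LOM * (P - K * H * P) * LOM' + SCov    # predicted covariance for t+1
        P = (P + P') / 2
    end
    return loglik
end
```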
BASEforHANK.Estimation.kalman_filter_smoother — Function
`kalman_filter_smoother(H, LOM, Data, D_nomiss, SCov, MCov, e_set)`

Compute the likelihood and an estimate of the underlying states given the full observed `Data` by applying the Kalman smoother to the state-space representation `(H, LOM)` of the model.
Arguments
- `H::Array{Float64,2}`: observation equation
- `LOM::Array{Float64,2}`: law of motion for states
- `Data::Array{Union{Missing,Float64},2}`, `D_nomiss::BitArray{2}`: data (time × variable); marker for existent data
- `SCov::Array{Float64,2}`: covariance of structural shocks
- `MCov::Array{Float64,2}`: covariance of measurement error
Returns
- `log_lik`: log-likelihood
- `xhat_tgt`, `xhat_tgT`: estimates of the underlying states from the forward iteration (`xhat_tgt`) and the backward iteration (`xhat_tgT`)
- `Sigma_tgt`, `Sigma_tgT`: estimates of the covariance matrices from the forward iteration (`Sigma_tgt`) and the backward iteration (`Sigma_tgT`)
- `s`, `m`: smoothed structural shocks (`s`) and measurement errors (`m`)
BASEforHANK.Estimation.kalman_filter_herbst — Function
`kalman_filter_herbst(Data, LOM, SCov, H, MCov, t0, e_set)`

Compute the Gaussian log-likelihood using a steady-state Kalman filter variant that iteratively updates gain-related matrices until convergence (a per-period fixed point on `Kg`). This follows a formulation akin to steady-state filtering as used in the DSGE estimation literature.
Arguments
- `Data::AbstractMatrix`: observation matrix (time × variable)
- `LOM::AbstractMatrix`: state transition matrix
- `SCov::AbstractMatrix`: covariance of structural shocks
- `H::AbstractMatrix`: observation matrix
- `MCov::AbstractMatrix`: covariance of measurement errors
- `t0::Integer`: number of initial observations excluded from the likelihood (burn-in)
- `e_set`: estimation settings (uses `debug_print` for error reporting)

Returns

- `log_lik::Float64`: (Gaussian) log-likelihood after convergence (or maximum iterations)

Notes:

- Ensures symmetry/positive definiteness of intermediate covariance matrices and checks determinant signs; returns a large negative likelihood if numerical issues are detected (and reports them when `debug_print` is set).
- The convergence criterion is `maximum(abs.(Kg - Kg_old)) < tol`. A sketch of the fixed-point iteration follows.
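A minimal sketch of the gain fixed point described above: iterate the filtering recursion until the Kalman gain stabilizes, then reuse the converged gain. The tolerance, initialization, and exact update order are assumptions:

```julia
using LinearAlgebra

# Sketch: iterate until the Kalman gain Kg converges, i.e.,
# until maximum(abs.(Kg - Kg_old)) < tol.
function steady_state_gain_sketch(LOM, H, SCov, MCov; tol = 1e-8, maxiter = 1_000)
    n = size(LOM, 1)
    P = copy(SCov)                       # initial predicted state covariance (assumption)
    Kg = zeros(n, size(H, 1))
    for _ in 1:maxiter
        Kg_old = Kg
        F = H * P * H' + MCov            # forecast-error covariance
        Kg = P * H' / F                  # Kalman gain
        P = LOM * (P - Kg * H * P) * LOM' + SCov
        P = (P + P') / 2                 # keep symmetric
        maximum(abs.(Kg - Kg_old)) < tol && break
    end
    return Kg, P
end
```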
Mode finding
Find the posterior mode by maximizing the log posterior (likelihood + prior). The routine updates the reduced system between optimization rounds and can compute a Hessian at the mode.
BASEforHANK.find_mode — Function
`find_mode(sr, lr, m_par, e_set)`

Find the parameter vector that maximizes the likelihood of the data given the linearized model `lr`.
Arguments
- `sr::SteadyResults`: output of `call_prepare_linearization()`
- `lr::LinearResults`: output of `linearize_full_model()`
- `m_par::ModelParameters`
- `e_set::EstimationSettings`
Returns
- `EstimResults`, containing all returns of `mode_finding()`
- `posterior_mode::Float64`: value of the log posterior at the mode
- `smoother_output`: output from `likeli(...; smoother=true)` (Kalman smoother tuple)
- `sr::SteadyResults`: updated `SteadyResults` (may be modified by the routine)
- `lr::LinearResults`: updated `LinearResults` (may be modified by the routine)
- `m_par::ModelParameters`: updated `ModelParameters` (may be modified)
Notes
- If `e_set.mode_start_file` is set, the function attempts to read a JSON file at that path to initialize the optimization; the file is read (`JSON.parse`) and its contents are used to build the starting vector. This function therefore performs file I/O when `e_set.mode_start_file != ""`. A sketch of such an initialization follows.
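A minimal sketch of reading a previously saved mode file to build a starting vector. Only the use of `JSON.parse` is taken from the docstring; the file layout (parameter names mapped to values) and the ordering step are assumptions:

```julia
using JSON

# Sketch: read a saved mode file and assemble a starting vector in the
# order given by `parnames` (hypothetical layout: Dict of name => value).
function read_mode_start_sketch(mode_start_file::String, parnames::Vector{Symbol})
    saved = JSON.parse(read(mode_start_file, String))   # Dict{String,Any}
    return [Float64(saved[String(p)]) for p in parnames]
end
```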
BASEforHANK.Estimation.mode_finding — Function
`mode_finding(sr, lr, m_par, e_set, par_start)`

Find the posterior mode of the parameter vector via numerical optimization.
Two estimation modes are supported:
- Likelihood estimation (default, when `e_set.irf_matching == false`): loads time-series data, builds the observation matrix, and maximizes the log posterior via `likeli` using Optim.
- IRF matching (when `e_set.irf_matching == true`): loads IRF targets from CSV according to `e_set.irf_matching_dict`, builds weights, and maximizes the IRF-matching objective via `irfmatch`.
In both modes, the routine updates the reduced model using `model_reduction`/`update_model` between optimization rounds to improve accuracy, and it optionally computes a Hessian at the mode.
Arguments
- `sr`: structural/reduced-system container (sizes, indexes, reduction flags)
- `lr`: linearization container (A, B matrices and caches)
- `m_par`: model parameter struct (Flatten-compatible)
- `e_set::EstimationSettings`: controls data paths, observables, priors, optimizer and tolerances, measurement-error treatment, IRF-matching settings, and flags like `compute_hessian`
- `par_start::AbstractVector`: initial parameter vector (includes measurement-error stds if estimated)
Returns
Returns a 16-tuple with elements depending on the mode:
- `par_final::Vector{Float64}`: parameter vector at the mode
- `hessian_final::Matrix{Float64}`: Hessian at the mode (finite-difference, or identity if disabled)
- `posterior_mode::Float64`: value of the objective at the mode (log posterior)
- `meas_error`: mapping from observable names to column indexes (likelihood mode); `[]` for IRF matching
- `meas_error_std::Vector{Float64}`: std caps or fixed stds (likelihood mode); `[]` for IRF matching
- `parnames::Vector{Symbol}`: names of estimated parameters (incl. ME params if not fixed)
- `Data::Matrix{Float64}`: data matrix (likelihood mode); `[]` for IRF matching
- `Data_missing::BitMatrix`: missingness mask (likelihood mode); `[]` for IRF matching
- `IRFtargets::Array{Float64,3}`: target IRFs (IRF matching); empty otherwise
- `IRFserrors::Array{Float64,3}`: IRF target standard errors (IRF matching); empty otherwise
- `H_sel::Matrix{Float64}`: selection matrix mapping states/controls to observables (likelihood mode); empty for IRF matching
- `priors::Vector{<:Distribution}`: priors for structural (and ME) parameters
- `smoother_output`: Kalman smoother output from `likeli(...; smoother=true)` (likelihood mode); empty for IRF matching
- `m_par`: parameter struct updated at `par_final`
- `sr`: structural container after final reduction update
- `lr`: linearization container after final update
Notes:
- Measurement errors are constructed by `measurement_error` and included in `parnames` and `priors` unless `e_set.me_treatment == :fixed`.
- The optimizer and tolerances are taken from `e_set` (e.g., `e_set.optimizer`, `e_set.x_tol`). A sketch of the optimization call follows.
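A minimal sketch of maximizing the log posterior with Optim by minimizing its negative. The option names `x_tol`, `f_tol`, and `iterations` exist in Optim.jl, but how the package wires them up, the choice of `NelderMead()`, and the objective wrapper `neg_log_post` are assumptions:

```julia
using Optim

# Sketch: minimize the negative log posterior, where `neg_log_post(par)`
# is assumed to wrap likeli/prioreval for a candidate parameter vector.
function maximize_posterior_sketch(neg_log_post, par_start, e_set)
    opts = Optim.Options(;
        x_tol = e_set.x_tol,
        f_tol = e_set.f_tol,
        iterations = 1_000,
        show_trace = true,
    )
    res = optimize(neg_log_post, par_start, NelderMead(), opts)
    return Optim.minimizer(res), -Optim.minimum(res)   # mode, log posterior at mode
end
```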
Bayesian MCMC
Random-Walk Metropolis–Hastings draws from the posterior, optionally using overdispersed initialization to seed multiple chains. The posterior for each draw is evaluated via likeli.
BASEforHANK.Estimation.rwmh — Function
`rwmh(xhat, Σ, sr, lr, er, m_par, e_set)`

Random-Walk Metropolis–Hastings sampler for the posterior of the model parameters.
This routine draws `e_set.ndraws + e_set.burnin` proposals using a multivariate normal random walk with proposal covariance `Σ`, evaluates the (log) posterior using either `likeli` (default) or `irfmatch` (if `e_set.irf_matching == true`), and accepts/rejects via the usual MH rule.
Arguments
- `xhat::Vector{Float64}`: initial parameter vector (at least approximately feasible)
- `Σ::Symmetric{Float64,Array{Float64,2}}`: proposal covariance matrix for the random walk
- `sr`, `lr`, `er`, `m_par`, `e_set`: model structures and estimation settings; see package docs
Returns
- `draws::Matrix{Float64}`: `(e_set.ndraws + e_set.burnin) × length(xhat)` matrix of draws, each row a parameter vector
- `posterior::Vector{Float64}`: log posterior for each stored draw (likelihood + prior)
- `accept_rate::Float64`: overall acceptance rate
Notes:
- When `e_set.irf_matching == false`, the posterior is computed with `likeli`.
- When `e_set.irf_matching == true`, IRF targets are read from a CSV according to `e_set.irf_matching_dict`, and the posterior is computed with `irfmatch` using inverse-variance weights.
- The function periodically prints progress (draw count, acceptance rate, posterior) and the current parameters. A sketch of the accept/reject step follows.
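A minimal sketch of the random-walk proposal and MH accept/reject rule. Here `log_post(par)` stands in for the `likeli`-based posterior evaluation, and the proposal scaling via `mhscale` mirrors the settings field, but the exact scaling formula and storage details are assumptions:

```julia
using Distributions, LinearAlgebra

# Sketch: Random-Walk Metropolis–Hastings with Gaussian proposals.
function rwmh_sketch(log_post, xhat, Σ, ndraws, burnin, mhscale)
    npar = length(xhat)
    draws = Matrix{Float64}(undef, ndraws + burnin, npar)
    posterior = Vector{Float64}(undef, ndraws + burnin)
    proposal = MvNormal(zeros(npar), mhscale^2 * Symmetric(Σ))
    x, lp = copy(xhat), log_post(xhat)
    accepted = 0
    for i in 1:(ndraws + burnin)
        cand = x .+ rand(proposal)              # random-walk proposal
        lp_cand = log_post(cand)
        if log(rand()) < lp_cand - lp           # MH acceptance rule
            x, lp = cand, lp_cand
            accepted += 1
        end
        draws[i, :] .= x
        posterior[i] = lp
    end
    return draws, posterior, accepted / (ndraws + burnin)
end
```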
BASEforHANK.Estimation.multi_chain_init — Function
`multi_chain_init(xhat, Σ, sr, lr, er, m_par, e_set)`

Construct an overdispersed starting value for a Metropolis–Hastings chain.
Draws proposals from `MvNormal(0, Σ)` scaled by `2 * e_set.mhscale` around `xhat` and keeps the first candidate that yields a feasible model (no alarm from `likeli`). Tries up to 100 times.
Arguments
- `xhat::Vector{Float64}`: baseline parameter vector
- `Σ::Symmetric{Float64,Array{Float64,2}}`: covariance used to generate overdispersed proposals
- `sr`, `lr`, `er`, `m_par`, `e_set`: model structures and estimation settings
Returns
- `init_draw::Vector{Float64}`: overdispersed, feasible starting parameter vector
- `init_success::Bool`: whether a feasible start was found within 100 attempts
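A minimal sketch of the overdispersed initialization logic described above. The feasibility check `is_feasible(par)` stands in for the alarm returned by `likeli`; beyond the documented 100 attempts and `2 * mhscale` scaling, the structure is an assumption:

```julia
using Distributions, LinearAlgebra

# Sketch: draw overdispersed candidates around xhat until one is feasible.
function multi_chain_init_sketch(is_feasible, xhat, Σ, mhscale; max_tries = 100)
    proposal = MvNormal(zeros(length(xhat)), Matrix(Σ))
    for _ in 1:max_tries
        cand = xhat .+ 2 * mhscale .* rand(proposal)   # overdispersed proposal
        is_feasible(cand) && return cand, true
    end
    return copy(xhat), false                           # no feasible start found
end
```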
BASEforHANK.sample_posterior — Function
`sample_posterior(sr, lr, er, m_par, e_set)`

Sample from the posterior with the Random-Walk Metropolis–Hastings sampler `rwmh()`, compute the sample mean as a point estimate, and return draws and diagnostics.
Arguments
- `sr::SteadyResults`: output of `call_prepare_linearization()`
- `lr::LinearResults`: output of `linearize_full_model()`
- `er::EstimResults`: output of `find_mode()`
- `m_par::ModelParameters`
- `e_set::EstimationSettings`
Returns
- `sr::SteadyResults`: updated `SteadyResults`
- `lr::LinearResults`: updated `LinearResults`
- `er::EstimResults`: updated `EstimResults`
- `m_par::ModelParameters`: updated `ModelParameters`
- `draws_raw::Array{Float64,2}`: raw draws from the posterior
- `posterior::Array{Float64,1}`: log posterior values of the draws
- `accept_rate::Float64`: acceptance rate
- `par_final::Array{Float64,1}`: mode of the posterior
- `hessian_sym::Symmetric{Float64,Array{Float64,2}}`: Hessian of the posterior
- `smoother_output`: Kalman smoother output at `par_final` (may be empty when `e_set.irf_matching == true`)
Notes
- This routine prints progress messages to STDOUT while running ("Started MCMC...").
- When `e_set.irf_matching == true`, the function returns `smoother_output = []`.
Priors are evaluated by BASEforHANK.Estimation.prioreval, see above.
To obtain smoothed states at a parameter value (e.g., the posterior mean), call `likeli(...; smoother=true)` to access the Kalman smoother's output, for example:
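A sketch of that call under the assumption that the containers from `find_mode`/`sample_posterior` are in scope; the name `par_mean` and the burn-in slicing are illustrative:

```julia
using Statistics

# Sketch: evaluate the Kalman smoother at a chosen parameter vector,
# e.g., the posterior mean of the MCMC draws after burn-in.
par_mean = vec(mean(draws_raw[(e_set.burnin + 1):end, :], dims = 1))

# Per the Returns section of `likeli` above, smoother = true yields the
# kalman_filter_smoother tuple.
smoother_output = likeli(par_mean, sr, lr, er, m_par, e_set; smoother = true)
# smoother_output: (log_lik, xhat_tgt, xhat_tgT, Sigma_tgt, Sigma_tgT, s, m)
```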
Marginal likelihood
Estimate the (log) marginal likelihood from posterior draws using the Modified Harmonic Mean estimator of Geweke (1998).
BASEforHANK.Estimation.marginal_likeli — Function
`marginal_likeli(draws, posterior)`

Estimate the log marginal likelihood using the Modified Harmonic Mean estimator of Geweke (1998), averaging over a grid of truncation levels.
Given MCMC draws θ_i and their log posteriors posterior[i], the estimator computes Gaussian kernel densities within a χ²-ball defined by τ ∈ {0.1, …, 0.9} and averages the implied marginal-likelihood estimates for stability.
Arguments
- `draws::AbstractMatrix{<:Real}`: `ndraws × npars` matrix of parameter draws
- `posterior::AbstractVector{<:Real}`: log posterior evaluations for the draws
Returns
- `marg_likeli::Float64`: estimated log marginal likelihood
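A sketch of one standard implementation of this estimator, following the description above (a Gaussian kernel fitted to the draws, a χ² truncation region, and averaging over τ). The numerical details (jitter, log-sum-exp stabilization) are assumptions, not the package's exact code:

```julia
using Distributions, LinearAlgebra, Statistics

# Sketch: modified harmonic mean estimate of the log marginal likelihood.
function marginal_likeli_sketch(draws, posterior; taus = 0.1:0.1:0.9)
    ndraws, npars = size(draws)
    θbar = vec(mean(draws, dims = 1))                 # posterior mean of the draws
    Σ = Symmetric(cov(draws) + 1e-10 * I)             # fitted covariance (jittered)
    dist = MvNormal(θbar, Σ)
    Σinv = inv(Σ)
    # squared Mahalanobis distance of each draw from the mean
    d2 = [dot(draws[i, :] .- θbar, Σinv * (draws[i, :] .- θbar)) for i in 1:ndraws]
    estimates = Float64[]
    for τ in taus
        cutoff = quantile(Chisq(npars), τ)            # χ²-ball radius for truncation τ
        inside = d2 .<= cutoff
        any(inside) || continue
        # log of the truncated-normal kernel minus the log posterior, inside the ball
        logratio = [logpdf(dist, draws[i, :]) - log(τ) - posterior[i]
                    for i in 1:ndraws if inside[i]]
        m = maximum(logratio)                         # log-sum-exp for stability
        log_mean = m + log(sum(exp.(logratio .- m))) - log(ndraws)
        push!(estimates, -log_mean)                   # implied log marginal likelihood
    end
    return mean(estimates)
end
```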
Caution: As with all harmonic-mean style estimators, results can be sensitive to tail behavior and mixing. Use with diagnostics and, if possible, alternative estimators.
Utilities
BASEforHANK.Estimation.nearest_spd — Function
`nearest_spd(A)`

Return the nearest symmetric positive definite (SPD) matrix to `A` in Frobenius norm.
Algorithm (Higham):
- Symmetrize `A` to `B = (A + A')/2`.
- Compute the symmetric polar factor `H` of `B` via SVD, and set `Ahat = (B + H)/2`.
- Enforce exact symmetry: `Ahat = (Ahat + Ahat')/2`.
- If `Ahat` is not numerically PD, add a small multiple of the identity until a Cholesky factorization succeeds (capped at 100 tweaks).
Arguments
- `A::AbstractMatrix{<:Real}`: square matrix
Returns
- `Ahat::Matrix{<:Real}`: symmetric positive definite matrix near `A` in Frobenius norm
Notes:
- If `A` is already SPD, the output is (up to roundoff) `A`.
- The SVD-based construction yields the nearest symmetric positive semidefinite matrix; the final identity shift ensures positive definiteness, which is often required for covariance matrices. A sketch of the algorithm follows below.
See N. J. Higham (1988), “Computing a nearest symmetric positive semidefinite matrix”.
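A minimal sketch of Higham's construction as outlined above; the size of the identity shift and the iteration cap mirror the description, but the exact constants are assumptions:

```julia
using LinearAlgebra

# Sketch: nearest symmetric positive definite matrix (Higham, 1988).
function nearest_spd_sketch(A::AbstractMatrix{<:Real})
    B = (A + A') / 2                          # symmetrize A into B
    U, S, V = svd(B)
    H = V * Diagonal(S) * V'                  # symmetric polar factor of B
    Ahat = (B + H) / 2
    Ahat = (Ahat + Ahat') / 2                 # enforce exact symmetry
    k = 0
    while !isposdef(Ahat) && k < 100          # nudge toward PD if Cholesky fails
        k += 1
        mineig = minimum(eigvals(Symmetric(Ahat)))
        Ahat += (abs(mineig) * k^2 + eps(Float64)) * I
    end
    return Ahat
end
```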