Base R ships with a lot of functionality useful for computational
econometrics, in particular in the stats package. This
functionality is complemented by many packages on CRAN; a brief overview
is given below. There is also a considerable overlap between the tools
for econometrics in this view and for finance in the
Finance
view.
Furthermore, the
Finance SIG
is a suitable mailing list for obtaining help
and discussing questions about both computational finance and econometrics.
Finally, there is also some overlap with the
SocialSciences
view, which
also covers a broad variety of tools for the social sciences, including political science.
The packages in this view can be roughly structured into the following topics.
If you think that some package is missing from the list, please let me know.
Linear regression models
-
Linear models can be fitted (via OLS) with
lm()
(from stats) and standard tests for model comparisons are available in various
methods such as
summary()
and
anova().
-
Analogous functions
that also support asymptotic tests (
z
instead of
t
tests, and
Chi-squared instead of
F
tests) and plug-in of other covariance
matrices are
coeftest()
and
waldtest()
in
lmtest.
-
Tests of more general linear hypotheses are implemented in
linearHypothesis()
in
car.
-
HC and HAC covariance matrices that can be plugged
into these functions are available in
sandwich.
-
Diagnostic checking: The packages
car
and
lmtest
provide a large collection
of regression diagnostics and diagnostic tests.
-
Instrumental variables regression (two-stage least squares) is
provided by
ivreg()
in
AER; another implementation
is
tsls()
in package
sem.
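The workflow described above can be sketched as follows. This is a minimal illustration, not a recommended analysis: the built-in cars data and the object names fit0 and fit1 are chosen here purely for demonstration, and the last step assumes that the lmtest and sandwich packages are installed.

```r
# Fit linear models by OLS on the built-in 'cars' data (base R only).
fit0 <- lm(dist ~ 1, data = cars)       # restricted model: intercept only
fit1 <- lm(dist ~ speed, data = cars)   # unrestricted model

summary(fit1)        # t tests for the individual coefficients
anova(fit0, fit1)    # F test comparing the two nested models

# Optional: asymptotic z tests with a heteroskedasticity-consistent (HC)
# covariance matrix plugged in. Assumes lmtest and sandwich are installed.
if (requireNamespace("lmtest", quietly = TRUE) &&
    requireNamespace("sandwich", quietly = TRUE)) {
  print(lmtest::coeftest(fit1, df = Inf, vcov. = sandwich::vcovHC(fit1)))
}
```

Setting df = Inf requests z rather than t tests; omitting the vcov. argument falls back to the usual OLS covariance matrix.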
Microeconometrics
-
Many standard microeconometric models belong to the
family of generalized linear models (GLM) and can be fitted by
glm()
from package stats. This includes in particular logit and probit models
for modeling choice data and Poisson models for count data. Effects for typical
values of regressors in these models can be obtained and visualized using
effects.
Marginal effects tables for certain GLMs can be obtained using the
mfx
package.
-
Negative binomial GLMs are available via
glm.nb()
in package
MASS.
Another implementation of negative binomial models
is provided by
aod, which also contains other models for overdispersed
data.
-
Zero-inflated and hurdle count models are provided in package
pscl.
-
Multinomial responses: Multinomial models
with individual-specific covariates only are available in
multinom()
from package
nnet. Implementations with both individual- and
choice-specific variables are
mlogit
and
mnlogit. Generalized additive models
(GAMs) for multinomial responses can be fitted with the
VGAM
package.
A Bayesian approach to multinomial probit models is provided by
MNP.
Various Bayesian multinomial models (including logit and probit) are available
in
bayesm. Furthermore, the package
RSGHB
fits various
hierarchical Bayesian specifications based on direct specification of the likelihood
function.
-
Ordered responses: Proportional-odds regression for ordered responses is implemented
in
polr()
from package
MASS. The package
ordinal
provides cumulative link models for ordered data, which encompass proportional
odds models but also includes more general specifications. Bayesian ordered probit
models are provided by
bayesm.
-
Censored responses: Basic censored regression models (e.g., tobit models)
can be fitted by
survreg()
in
survival; a convenience
interface
tobit()
is in package
AER. Further censored
regression models, including models for panel data, are provided in
censReg.
Interval regression models are in
intReg. Censored regression models with
conditional heteroskedasticity are in
crch.
Furthermore, hurdle models for left-censored data at zero can be estimated with
mhurdle. Models for sample selection are available in
sampleSelection
and semiparametric extensions of these are provided by
SemiParSampleSel.
-
Instrumental variables for binary responses: The
LARF
package estimates
local average response functions for binary treatments and binary instruments.
-
Multivariate probit models: Estimation and marginal effect computations can be
carried out with
mvProbit.
-
Miscellaneous: Further, more refined tools for microeconometrics are provided in
the
micEcon
family of packages: Analysis with
Cobb-Douglas, translog, and quadratic functions is in
micEcon;
the constant elasticity of substitution (CES) function is in
micEconCES;
the symmetric normalized quadratic profit (SNQP) function is in
micEconSNQP.
The almost ideal demand system (AIDS) is in
micEconAids.
Stochastic frontier analysis is in
frontier.
The package
bayesm
implements a Bayesian
approach to microeconometrics and marketing. Inference for relative
distributions is contained in package
reldist.
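As a rough sketch of the GLM-based and censored-regression models mentioned above, the following example fits logit, probit, Poisson, and tobit models. The data are simulated and all variable names and parameter values are assumptions made for illustration; only stats and the recommended survival package are used.

```r
set.seed(123)                    # simulated data, for illustration only
n <- 500
x <- rnorm(n)

# Binary choice: logit and probit models via glm()
choice     <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0 * x))
logit_fit  <- glm(choice ~ x, family = binomial(link = "logit"))
probit_fit <- glm(choice ~ x, family = binomial(link = "probit"))

# Count data: Poisson regression with log link
counts   <- rpois(n, lambda = exp(0.2 + 0.5 * x))
pois_fit <- glm(counts ~ x, family = poisson(link = "log"))

# Censored response: tobit model, left-censored at zero, via survreg()
library(survival)                # recommended package shipped with R
latent    <- 1 + 2 * x + rnorm(n)
y         <- pmax(latent, 0)     # observed response, censored at 0
tobit_fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")

coef(logit_fit)
coef(probit_fit)
coef(pois_fit)
coef(tobit_fit)
```

The survreg() call spells out by hand what a convenience interface such as tobit() would otherwise set up: the second argument of Surv() flags the uncensored observations.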
Further regression models
-
Nonlinear least squares modeling is available in
nls()
in package stats.
-
Quantile regression:
quantreg
(including linear, nonlinear, censored,
locally polynomial and additive quantile regressions).
-
Linear models for panel data:
plm, providing a wide range of within,
between, and random-effect methods (among others) along with corrected standard
errors, tests, etc. For panel-corrected standard errors in OLS and GEE models,
see
geepack
and
pcse. Estimation of linear models with
multiple group fixed effects is contained in
lfe.
-
Generalized method of moments (GMM) and generalized empirical likelihood (GEL):
gmm.
-
Spatial econometric models: The
Spatial
view gives details about
handling spatial data, along with information about (regression) modeling. In particular,
spatial regression models can be fitted using
spdep
and
sphet
(the
latter using a GMM approach).
splm
is a package for spatial panel
models. Spatial probit models are available in
spatialprobit.
-
Linear structural equation models:
sem
(including two-stage least squares).
-
Simultaneous equation estimation:
systemfit.
-
Nonparametric kernel methods:
np.
-
Beta regression:
betareg
and
gamlss.
-
Truncated (Gaussian) regression:
truncreg.
-
Nonlinear mixed-effect models:
nlme
and
lme4.
-
Generalized additive models (GAMs):
mgcv,
gam,
gamlss
and
VGAM.
-
Mixed data sampling regression:
midasr.
-
Miscellaneous: The packages
VGAM,
rms
and
Hmisc
provide several tools for extended
handling of (generalized) linear regression models.
Zelig
is a unified
easy-to-use interface to a wide range of regression models.
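To illustrate the first item in this list, here is a small sketch of nonlinear least squares with nls(). The exponential mean function, the simulated data, and the starting values are assumptions chosen for the example.

```r
set.seed(42)                               # simulated data, for illustration
x   <- seq(0, 5, length.out = 100)
y   <- 2 * exp(0.3 * x) + rnorm(100, sd = 0.2)
dat <- data.frame(x = x, y = y)

# Fit y = a * exp(b * x) + error; nls() needs starting values for a and b.
fit <- nls(y ~ a * exp(b * x), data = dat, start = list(a = 1.5, b = 0.25))

coef(fit)       # estimates of a and b
summary(fit)    # standard errors and t tests
```

Starting values matter for nls(); poor choices can prevent convergence, so they are typically picked from a plot or a linearized fit.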
Basic time series infrastructure
-
The
TimeSeries
task view provides much more detailed
information. Here, only the most important aspects are briefly mentioned.
-
The class
"ts"
in package stats is R's standard class for
regularly spaced time series (especially annual, quarterly, and monthly data).
-
Time series in
"ts"
format can be
coerced back and forth without loss of information to
"zooreg"
from package
zoo.
zoo
provides infrastructure for
both regularly and irregularly spaced time series (the latter via the class
"zoo") where the time information can be of arbitrary class.
This includes daily series (typically with
"Date"
time index)
or intra-day series (e.g., with
"POSIXct"
time index).
-
Several
other implementations of irregular time series building on the
"POSIXct"
time-date class are available in
its,
tseries
and
timeSeries
(previously: fSeries) which are all aimed particularly at
finance applications. See the
Finance
task view for
more information.
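The "ts" class described above can be tried out with a few lines of base R. The series values below are made up for illustration and the object names are hypothetical.

```r
# A quarterly series starting in 2000 Q1 (values are invented).
gdp <- ts(c(100, 102, 101, 105, 107, 110, 108, 112),
          start = c(2000, 1), frequency = 4)

time(gdp)                                   # numeric time index of the series
sub    <- window(gdp, start = c(2000, 3), end = c(2001, 2))  # 2000 Q3 to 2001 Q2
growth <- diff(log(gdp))                    # approximate quarterly growth rates

print(sub)
print(growth)
```

Because "ts" stores only a start, an end, and a frequency, it suits regularly spaced data; irregular or daily series are better handled by the classes mentioned above.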
Time series modeling
-
The
TimeSeries
task view contains detailed information about time series analysis in R.
Time series models for financial econometrics (e.g., GARCH, stochastic volatility models, or
stochastic differential equations, etc.) are described in the
Finance task view. Here, only a brief overview
of the most important methods for econometrics is given.
-
Classical time series modeling tools are
contained in the stats package and include
arima()
for ARIMA modeling
and Box-Jenkins-type analysis.
-
Fitting linear regression models with AR error terms via GLS is possible
using
gls()
from
nlme.
-
Structural time series models are provided by
StructTS()
in stats.
-
Filtering and decomposition for time series is available in
decompose()
and
HoltWinters()
in stats.
-
Extensions to these
methods, in particular for forecasting and model selection, are provided in
the
forecast
package.
-
Miscellaneous time series filters are available in
mFilter.
-
For estimating VAR models, several
methods are available: simple models can be fitted by
ar()
in stats; more
elaborate models are provided in package
vars
and by
estVARXls()
in
dse,
and a Bayesian approach is available in
MSBVAR. A
convenient interface for fitting dynamic regression models via OLS is available
in
dynlm; a different approach
that also works with other regression functions is implemented in
dyn.
-
More advanced dynamic system equations can be fitted using
dse.
-
Periodic autoregressive models are provided by
partsm.
-
Gaussian linear state space models can be fitted using
dlm
(via maximum likelihood,
Kalman filtering/smoothing and Bayesian methods).
-
Unit root and cointegration techniques are available in
urca,
tseries, and
CADFtest.
-
Time series factor analysis is available in
tsfa.
-
Asymmetric price transmission modeling is available in
apt.
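The classical tools from stats mentioned in this list can be sketched on the built-in AirPassengers data. The choice of a seasonal ARIMA(0,1,1)(0,1,1) specification on the log scale is an assumption made for this example, not a recommendation from the text above.

```r
y <- log(AirPassengers)      # built-in monthly series, log-transformed

# Classical decomposition into trend, seasonal, and random components
dec <- decompose(y)
plot(dec)

# Seasonal ARIMA model fitted with arima() from stats
fit <- arima(y, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
print(fit)

# Forecast 12 months ahead and transform back to the original scale
fc <- predict(fit, n.ahead = 12)
exp(fc$pred)
```

predict() returns both point forecasts (pred) and their standard errors (se); the forecast package mentioned above offers more convenient forecasting and model selection on top of these building blocks.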
Data sets
-
Packages
AER
and
Ecdat
contain comprehensive collections of data sets from various standard econometric
textbooks as well as several data sets from the Journal of
Applied Econometrics and the Journal of Business & Economic Statistics
data archives.
-
AER
additionally provides an extensive set of
examples reproducing analyses from the textbooks/papers, illustrating
various econometric methods.
-
FinTS
is the R companion to Tsay's 'Analysis of
Financial Time Series' (2nd ed., 2005, Wiley) containing data sets, functions
and script files required to work some of the examples.
-
CDNmoney
provides Canadian monetary aggregates.
-
pwt
provides the Penn World Table from versions 5.6, 6.x, 7.x. The version 8.x
data are available in
pwt8.
-
The packages
expsmooth,
fma, and
Mcomp
are
data packages with time series data
from the books 'Forecasting with Exponential Smoothing: The State Space Approach'
(Hyndman, Koehler, Ord, Snyder, 2008, Springer) and 'Forecasting: Methods and Applications'
(Makridakis, Wheelwright, Hyndman, 3rd ed., 1998, Wiley) and the M-competitions,
respectively.
-
Package
erer
contains functions and datasets for the book
'Empirical Research in Economics: Growing up with R' (Sun, forthcoming).
-
The package
psidR
available from GitHub can build panel data
sets from the Panel Study of Income Dynamics (PSID).
Miscellaneous
-
Matrix manipulations
: As a vector- and matrix-based language, base R
ships with many powerful tools for doing matrix manipulations, which are
complemented by the packages
Matrix
and
SparseM.
-
Optimization and mathematical programming
: R and many of its contributed
packages provide many specialized functions for solving particular optimization
problems, e.g., in regression as discussed above. Further functionality for
solving more general optimization problems, e.g., likelihood maximization, is
discussed in the
Optimization
task view.
-
Bootstrap
: In addition to the recommended
boot
package,
there are some other general bootstrapping techniques available in
bootstrap
or
simpleboot
as well as some bootstrap techniques
designed for time-series data, such as the maximum entropy bootstrap in
meboot
or
tsbootstrap()
from
tseries.
-
Inequality
: For measuring inequality, concentration and poverty the
package
ineq
provides some basic tools such as Lorenz curves,
Pen's parade, the Gini coefficient and many more.
-
Structural change
: R is particularly strong when dealing with
structural changes and changepoints in parametric models, see
strucchange
and
segmented.
-
Exchange rate regimes
: Methods for inference about exchange
rate regimes, in particular in a structural change setting, are provided
by
fxregime.
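As a small illustration of the matrix tools mentioned under "Matrix manipulations", the following sketch computes OLS estimates and standard errors directly from the normal equations and compares them with lm(). The cars data and all object names are chosen here for demonstration only.

```r
# Regressor matrix (with intercept column) and response from the 'cars' data
X <- cbind(1, cars$speed)
y <- cars$dist

# OLS via the normal equations: solve (X'X) b = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Residual variance and conventional standard errors
res    <- y - X %*% beta_hat
sigma2 <- sum(res^2) / (nrow(X) - ncol(X))
se     <- sqrt(diag(sigma2 * solve(crossprod(X))))

# Compare with the estimates from lm()
fit <- lm(dist ~ speed, data = cars)
all.equal(as.numeric(beta_hat), unname(coef(fit)))
```

In practice lm() is preferable because it uses a numerically more stable QR decomposition; the explicit formulas are mainly useful for teaching or for building custom estimators.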