About MLE++
MLE++ is a C++ class library for the estimation of
econometric models. To use MLE++ you must have a C++ compiler. The new
Version 3.0 is designed to work with the free Microsoft (R) Visual C++ (R) 2010 Express
Edition compiler. The Express Edition is
more convenient than ever and can be downloaded as a single image (ISO) file
that you can burn to a DVD and install offline on any computer. Try it out before investing in
MLE++!
When compiled and executed, programs provided in the library find the maximum
of a sample likelihood function using iterative numerical techniques (MLE
stands for Maximum Likelihood Estimator). The library contains classes that
augment the C++ language to permit matrix and vector expressions, to support
file and variable manipulation, and to estimate models. Each class represents a
new data type, and instances of these new types are the objects of Object
Oriented Programming (OOP). Objects incorporate both data and the ability to
manipulate it. This approach allows us to write programs that are just as
simple as those of a conventional econometric package.
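To illustrate how a class can add vector expressions to C++, here is a minimal sketch using operator overloading. The type Vec and the function dot are hypothetical names invented for this example; they are not the MLE++ classes, whose actual interface is described in the manual.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector type for illustration only; NOT the MLE++ class.
class Vec {
public:
    explicit Vec(const std::vector<double>& v) : data_(v) {}

    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }

    // Element-wise sum of two vectors of equal length.
    friend Vec operator+(const Vec& a, const Vec& b) {
        assert(a.size() == b.size());
        std::vector<double> r(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) r[i] = a[i] + b[i];
        return Vec(r);
    }

    // Scalar multiple c * a.
    friend Vec operator*(double c, const Vec& a) {
        std::vector<double> r(a.size());
        for (std::size_t i = 0; i < a.size(); ++i) r[i] = c * a[i];
        return Vec(r);
    }

    // Inner product a'b of two vectors of equal length.
    friend double dot(const Vec& a, const Vec& b) {
        assert(a.size() == b.size());
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }

private:
    std::vector<double> data_;  // the object carries its own data
};
```

With such a type, an expression like x + 2.0 * y reads the same way as the algebra it stands for, which is the sense in which classes "augment" the language.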
The following program estimates a probit model by constructing an object P of type Probit, then calling its "member" functions.
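#include "estimate.hpp"
#include "probit.hpp"

int MleMain() {
    // Data file "rprobit", dependent variable Y, regressors X1, X2, X3
    Probit P("rprobit", "Y", "X1,X2,X3");
    P.SetOutFile("probit.out");     // also send output tables to this file
    P.SetHessMethod(ANALYTIC_H);    // use analytic Hessian matrices
    P.FindParms();                  // estimate the parameters
    return 0;
}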
The dependent variable Y and three regressors X1, X2, and X3 are found in
the file named rprobit. Output tables showing the parameter estimates and
statistics will be sent to the file probit.out as well as to the screen.
Analytic Hessian matrices are used in the maximization algorithm.
We include a collection of classes that represent particular econometric
models. Each can be used in a short program to estimate the model parameters,
just like the Probit class. Compared to other econometric packages, the
collection of models is large — and it is growing. The emphasis is on limited
dependent variable models that arise often in the analysis of micro-data. Since
full, commented source code is included, the coded models may be used as guides
to related models that the user may see in the literature, or which he or she
may develop.
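As a rough sketch of how a user-developed model might plug into a class hierarchy, the following example defines an abstract base class with a generic maximizer and one derived model. All names here (Model, logLik, maximize, ExponentialModel) are hypothetical and chosen for illustration; the real MLE++ base class and its member functions are documented in the manual and source code. The one-parameter golden-section search is also a stand-in, not the library's algorithm.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical abstract base class; NOT the MLE++ interface.
class Model {
public:
    virtual ~Model() {}

    // Each derived model supplies its own sample log-likelihood.
    virtual double logLik(double theta) const = 0;

    // Generic one-parameter maximizer shared by every model:
    // golden-section search for the maximum of logLik on [lo, hi].
    double maximize(double lo, double hi, double tol = 1e-8) const {
        const double g = (std::sqrt(5.0) - 1.0) / 2.0;  // about 0.618
        double a = lo, b = hi;
        while (b - a > tol) {
            double c = b - g * (b - a);
            double d = a + g * (b - a);
            if (logLik(c) > logLik(d)) b = d;  // maximum lies in [a, d]
            else                       a = c;  // maximum lies in [c, b]
        }
        return 0.5 * (a + b);
    }
};

// Exponential duration model with rate lambda:
// log L = n * log(lambda) - lambda * sum(x).
class ExponentialModel : public Model {
public:
    explicit ExponentialModel(const std::vector<double>& x)
        : n_(static_cast<double>(x.size())), sum_(0.0) {
        for (std::size_t i = 0; i < x.size(); ++i) sum_ += x[i];
    }
    virtual double logLik(double lambda) const {
        return n_ * std::log(lambda) - lambda * sum_;
    }
private:
    double n_;    // sample size
    double sum_;  // sum of the observations
};
```

A new model only has to supply logLik; the search routine in the base class is reused unchanged. For the exponential model the result can be checked against the closed form, one over the sample mean.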
MLE++ is different from the majority of econometric packages. Most packages provide a few canned routines in pre-compiled form, driven from a command line or menu. Usually the only flexibility in the model estimated is the choice of parameters in the specific model-estimation commands. An internal "language" is usually provided as part of the package so that variables in a data set can be manipulated. MLE++, on the other hand, provides the user with the tools to estimate the many hundreds of models possible through maximum likelihood and related iterative methods.
Particularly important is the support for model testing using artificial data. If you do write a program to estimate a complex econometric model, it is natural to question whether you have made some mistakes along the way. Fortunately there is a very powerful test available to verify the correctness of estimation procedures. Artificial data is generated according to the true model with known parameters. The estimation program should then be able to retrieve those parameters. We believe that MLE++ is unique in its support for and documentation of this testing routine.
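The artificial-data test can be sketched in standard C++ without the library. The example below is an illustration under stated assumptions, not the MLE++ testing routine: it generates data from a probit model with one regressor and known parameters, then estimates them by Newton-Raphson with an analytic Hessian. The names simulateProbit, fitProbit, Sample, and Estimate are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

const double kPi = 3.14159265358979323846;

// Standard normal density and distribution function.
double normPdf(double z) { return std::exp(-0.5 * z * z) / std::sqrt(2.0 * kPi); }
double normCdf(double z) { return 0.5 * std::erfc(-z / std::sqrt(2.0)); }

struct Sample   { std::vector<double> x; std::vector<int> y; };
struct Estimate { double b0; double b1; };

// Generate y = 1 if b0 + b1*x + e > 0, with x and e standard normal.
Sample simulateProbit(double b0, double b1, std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> norm(0.0, 1.0);
    Sample s;
    for (std::size_t i = 0; i < n; ++i) {
        double x = norm(gen);
        double e = norm(gen);
        s.x.push_back(x);
        s.y.push_back(b0 + b1 * x + e > 0.0 ? 1 : 0);
    }
    return s;
}

// Newton-Raphson on the probit log-likelihood, starting from zero.
Estimate fitProbit(const Sample& s) {
    double b0 = 0.0, b1 = 0.0;
    for (int iter = 0; iter < 50; ++iter) {
        double g0 = 0.0, g1 = 0.0;               // score vector
        double a00 = 0.0, a01 = 0.0, a11 = 0.0;  // minus the Hessian
        for (std::size_t i = 0; i < s.x.size(); ++i) {
            double x   = s.x[i];
            double q   = 2.0 * s.y[i] - 1.0;     // +1 or -1
            double z   = b0 + b1 * x;
            double lam = q * normPdf(q * z) / normCdf(q * z);
            double w   = lam * (lam + z);        // positive weight
            g0 += lam;      g1 += lam * x;
            a00 += w;       a01 += w * x;      a11 += w * x * x;
        }
        // Solve the 2x2 system (-H) d = g for the Newton step d.
        double det = a00 * a11 - a01 * a01;
        double d0 = (a11 * g0 - a01 * g1) / det;
        double d1 = (a00 * g1 - a01 * g0) / det;
        b0 += d0;
        b1 += d1;
        if (std::fabs(d0) + std::fabs(d1) < 1e-10) break;
    }
    Estimate e = { b0, b1 };
    return e;
}
```

Calling fitProbit(simulateProbit(0.5, -1.0, 20000, 42)) should return estimates close to 0.5 and -1.0, up to sampling error. If a coding mistake had crept into the score or Hessian, the estimates would typically fail to converge or settle far from the true values, which is what makes the test useful.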
Some conventional packages augment their internal language to the point where it can be used to code new statistical models. They may include access to some optimization algorithms, so that parameters which maximize a likelihood function coded in the internal language by a user can be found. But this method is only as good as the internal language, its mini-compiler or interpreter, and its support and debugging facilities.
Version 3.0 has been rewritten to run with Microsoft (R) Visual C++ (R) 2010 Express
Edition. This excellent compiler can be downloaded from the Microsoft web
site.
We make full use of the Standard Template Library (STL), including the special
numerical analysis features such as the valarray class. C++ language
constructs are further updated to conform to the high level of standard
compliance demanded by Visual C++ 2010. Errors are now dealt with by
using the exception mechanism. Because all low-level issues such as memory
allocation and deallocation are now delegated to the STL, we expect this
version to be very robust and easy to support.
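For readers unfamiliar with these features, the following sketch shows std::valarray arithmetic together with exception-based error reporting, using only the standard library. The function normalLogLik is a hypothetical example written for this page and is not part of MLE++.

```cpp
#include <cmath>
#include <stdexcept>
#include <valarray>

// Sample log-likelihood of independent N(mu, sigma^2) observations.
// Hypothetical helper for illustration; not an MLE++ function.
double normalLogLik(const std::valarray<double>& y, double mu, double sigma) {
    // Errors are reported by throwing, not by returning a status code.
    if (y.size() == 0) throw std::invalid_argument("empty sample");
    if (sigma <= 0.0)  throw std::domain_error("sigma must be positive");

    const double kPi = 3.14159265358979323846;
    const double n = static_cast<double>(y.size());

    // valarray expressions act element by element, with no explicit loop;
    // storage for z and z2 is allocated and released automatically.
    std::valarray<double> z  = (y - mu) / sigma;
    std::valarray<double> z2 = z * z;

    return -0.5 * n * std::log(2.0 * kPi) - n * std::log(sigma) - 0.5 * z2.sum();
}
```

A caller wraps the call in try and catch to handle bad input, and no manual new or delete appears anywhere in the code.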
Microsoft (R) Visual C++ (R) 2010 Express Edition is the free version of the most popular C++ compiler on the Windows platform. It is an excellent compiler with a friendly Integrated Development Environment (IDE). This makes it a perfect environment for developing and running model estimation programs using the MLE++ library. Working from the Visual C++ (R) IDE makes the packaged models of MLE++ as convenient and easy to use as those of a conventional package.
Since Microsoft (R) Visual C++ (R) 2010 Express Edition (all you need) is available free from the Microsoft web site, why not try out C++ and see if you like it? See the Microsoft web site for details.
The following models are implemented as classes derived from an abstract base class:
Since the C++ compiler is free, you are paying only for the library. You will be getting a compiler that represents many person-years of software development and includes extensive code-optimizing capability, a friendly development environment, on-line help, and a powerful debugger. You can try out the C++ language before investing in MLE++.
Do we really claim this for C++? In this case YES!
Although the package can be used with only the most basic programming knowledge, you will gain access to the C++ language. C++ grew out of the dominant C programming language, which it includes as a subset, and provides support for OOP. C++ is a standard non-proprietary language that continues to evolve with maximum backward compatibility under the guidance of an international standards committee. Learning a language can be a big investment, and the enormous popularity of C++ as a standard language could be an advantage to some users who consider the human capital value of language expertise. C++ programming is a big industry, and there are many libraries produced by third-party vendors to support the language. Although C gained its popularity as a language for systems programmers, C++ is suitable for general programming, and has excellent facilities for numerical and statistical programming. C++ class libraries have proven themselves in many fields, and are now one of the most popular standard tools used in the software industry.
For more thoughts on the future of C++, we suggest a visit to the home page of its creator, Bjarne Stroustrup.
http://www2.research.att.com/~bs/homepage.html
Compared to a compiled package, a C++ library offers flexibility with the ability to extend models and to implement new models. Access to the source code for estimators of many models from the econometric literature gives the creative researcher a head start.
The library source code and this manual provide some of the most practical documentation available for many advanced limited dependent variable models and the techniques used to estimate them.
If a conventional package includes a sufficiently flexible language, it may be possible to implement a new statistical procedure within the package. However, the package would then be acting as a compiler or interpreter of the code for the new model, and it is very unlikely that it would generate code as efficient as that from an optimizing C++ compiler. Since iterative techniques can be very time consuming, this may be an important consideration.
A conventional econometric package with an internal programming language will not contain debugging tools comparable to those of a good C++ compiler.
High-level matrix languages can be very powerful for numerical work, and some contain libraries suitable for maximum-likelihood estimation. But there are some disadvantages. These languages are interpreted, and in maximum likelihood estimation this means placing an interpreter in the inner loop of a maximization routine. If you find it necessary, or just convenient, to write a program which addresses the elements of a matrix in a loop, this may lead to extreme inefficiency. Other disadvantages are a relatively small user community, limited third-party support, and weakness for general programming use.
Java is object oriented, but alas does not support operator overloading, which is needed to incorporate matrix expressions. Also, current implementations are not very efficient, and there is little scientific and mathematical library support. Often other languages are in some sense proprietary, at least because of the enhancements that the chief supporter supplies. This is true of C# and BASIC. By contrast, the C++ standard is well supported. Java and C# in particular have been touted as the successors to C++. Naturally, we do not agree. While some of the features that may cause software bugs through access to raw memory have been designed out of these languages, we see this primarily as an advantage in team programming of interfaces, network, and internet programming. In our view, a scientific programmer using a subset of C++ with matrix-vector classes would not gain by switching to Java or C#. They would lose the advantage of a standard language with library support and maximum efficiency.
Although FORTRAN offers extensive libraries and continues to evolve, the language is limited if you stray from its narrow focus on numerical analysis. C++ is superior to FORTRAN for use as a general programming language. In applied work, especially in government departments and agencies, econometrics is often associated with the construction of program simulation models. C++ and the OOP approach lend themselves well to simulation (C++ classes were patterned on Simula 67).
A likely candidate would be someone working with econometrics on cross-sectional or panel data at a level near the research frontier, and who is interested in programming. Development of the product was motivated by the author's experience in government with Unemployment Insurance program analysis, and with labor market program evaluation. In this environment, the construction of simulation models and the estimation of econometric models go hand-in-hand. An interest in computer programming is a natural result. Similar situations may arise in academia, and in consulting firms.
Two manuals are included with the product: the MLE++ Users Manual (343 pages) and MLE++ with Microsoft (R) Visual C++ (R) 2010 (29 pages).
Copyright by Cahill Software (C) 2010