What is MLE++?

MLE++ is a C++ class library for the estimation of econometric models. To use MLE++ you must have a C++ compiler. Version 3.0 is designed to work with the free Microsoft (R) Visual C++ (R) 2010 Express Edition compiler. The Express Edition can be downloaded as a single image (ISO) file, which you can burn to a DVD and install offline on any computer. Try it out before investing in MLE++!

When compiled and executed, programs provided in the library find the maximum of a sample likelihood function using iterative numerical techniques (MLE stands for Maximum Likelihood Estimator). The library contains classes that augment the C++ language to permit matrix and vector expressions, to support file and variable manipulation, and to estimate models. Each class represents a new data type, and instances of these new types are the objects of Object Oriented Programming (OOP). Objects incorporate both data and the ability to manipulate it. This approach allows us to write programs that are just as simple as those of a conventional econometric package.
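
To make the idea of iterative maximization concrete, here is a minimal standalone sketch. It does not use MLE++ itself: the function name EstimateExponentialRate and the choice of an exponential duration model are assumptions made purely for illustration. It applies Newton-Raphson steps to a one-parameter log-likelihood until the step becomes negligible.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of MLE++: maximum likelihood estimate of the
// rate of an exponential duration model, found by Newton-Raphson iteration.
// With theta = log(rate), n observations and S = sum of the durations:
//   logL(theta) = n*theta - exp(theta)*S
double EstimateExponentialRate(const std::vector<double>& durations) {
    const double n = static_cast<double>(durations.size());
    double total = 0.0;
    for (std::size_t i = 0; i < durations.size(); ++i) total += durations[i];

    double theta = 0.0;                                    // start at rate = 1
    for (int iter = 0; iter < 100; ++iter) {
        const double score = n - std::exp(theta) * total;  // first derivative
        const double hessian = -std::exp(theta) * total;   // second derivative
        const double step = -score / hessian;              // Newton-Raphson step
        theta += step;
        if (std::fabs(step) < 1e-12) break;                // converged
    }
    return std::exp(theta);
}
```

For this particular model the estimate is also available in closed form (the number of observations divided by the sum of the durations), which makes the iteration easy to check. Most of the models listed later have no such closed form, which is why iterative techniques are needed.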

The following program estimates a probit model by constructing an object P of type Probit, then calling its "member" functions.

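#include "estimate.hpp"
#include "probit.hpp"

int MleMain() {
   Probit P("rprobit", "Y", "X1,X2,X3");   // data file, dependent variable, regressors
   P.SetOutFile("probit.out");             // send output tables to this file too
   P.SetHessMethod(ANALYTIC_H);            // use analytic Hessian matrices
   P.FindParms();                          // estimate the parameters
   return 0;
}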

The dependent variable Y and three regressors X1, X2, and X3 are found in the file named rprobit. Output tables showing the parameter estimates and statistics will be sent to the file probit.out as well as to the screen. Analytic Hessian matrices are used in the maximization algorithm.

We include a collection of classes that represent particular econometric models. Each can be used in a short program to estimate the model parameters, just like the Probit class. Compared to other econometric packages, the collection of models is large, and it is growing. The emphasis is on limited dependent variable models that arise often in the analysis of micro-data. Since full, commented source code is included, the coded models may be used as guides to related models that the user may see in the literature, or which he or she may develop.

MLE++ is different from the majority of econometric packages. Most packages provide a few canned routines in pre-compiled form, driven from a command line or menu. Usually the only flexibility in the model estimated is the choice of parameters in the specific model-estimation commands. An internal "language" is usually provided as part of the package so that variables in a data set can be manipulated. MLE++, on the other hand, provides the user with the tools to estimate the many hundreds of models possible through maximum likelihood and related iterative methods. Particularly important is the support for model testing using artificial data. If you do write a program to estimate a complex econometric model, it is natural to question whether you have made some mistakes along the way. Fortunately there is a very powerful test available to verify the correctness of estimation procedures. Artificial data is generated according to the true model with known parameters. The estimation program should then be able to retrieve those parameters. We believe that MLE++ is unique in its support for and documentation of this testing routine.
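
The artificial-data test described above can be sketched without the library. The following standalone example is only an illustration under stated assumptions: the names Sample, GenerateProbitData and EstimateProbit are hypothetical, the model is a probit with an intercept and one regressor, and the estimator uses Fisher scoring (Newton steps with the expected information matrix), which is not necessarily the algorithm MLE++ uses.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical container for an artificial sample (not an MLE++ type).
struct Sample {
    std::vector<double> x;  // regressor
    std::vector<int> y;     // binary outcome
};

double NormalCdf(double z) { return 0.5 * std::erfc(-z / std::sqrt(2.0)); }
double NormalPdf(double z) {
    return std::exp(-0.5 * z * z) / std::sqrt(2.0 * 3.14159265358979323846);
}

// Generate data from the "true" probit model y = 1 if b0 + b1*x + e > 0.
Sample GenerateProbitData(double b0, double b1, int n, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> normal(0.0, 1.0);
    Sample s;
    for (int i = 0; i < n; ++i) {
        const double x = normal(rng);
        const double latent = b0 + b1 * x + normal(rng);
        s.x.push_back(x);
        s.y.push_back(latent > 0.0 ? 1 : 0);
    }
    return s;
}

// Estimate (b0, b1) by Fisher scoring, starting from zero.
std::array<double, 2> EstimateProbit(const Sample& s) {
    std::array<double, 2> b = {0.0, 0.0};
    for (int iter = 0; iter < 50; ++iter) {
        double g0 = 0, g1 = 0, h00 = 0, h01 = 0, h11 = 0;
        for (std::size_t i = 0; i < s.x.size(); ++i) {
            const double x = s.x[i];
            const double z = b[0] + b[1] * x;
            // Keep the probability away from 0 and 1 to avoid division by zero.
            const double F = std::min(std::max(NormalCdf(z), 1e-10), 1.0 - 1e-10);
            const double f = NormalPdf(z);
            const double w = f / (F * (1.0 - F));
            const double r = (s.y[i] - F) * w;  // score contribution
            g0 += r;
            g1 += r * x;
            const double info = f * w;          // expected information weight
            h00 += info;
            h01 += info * x;
            h11 += info * x * x;
        }
        // Solve the 2x2 system (information matrix) * step = score.
        const double det = h00 * h11 - h01 * h01;
        const double d0 = (h11 * g0 - h01 * g1) / det;
        const double d1 = (h00 * g1 - h01 * g0) / det;
        b[0] += d0;
        b[1] += d1;
        if (std::fabs(d0) + std::fabs(d1) < 1e-10) break;
    }
    return b;
}
```

Generating a large sample with known parameters, for example GenerateProbitData(0.5, 1.0, 20000, 12345u), and passing it to EstimateProbit should return estimates close to 0.5 and 1.0. Estimates that are far from the true values point to a coding error in the likelihood or its derivatives.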

Some conventional packages augment their internal language to the point where it can be used to code new statistical models. They may include access to some optimization algorithms, so that parameters which maximize a likelihood function coded in the internal language by a user can be found. But this method is only as good as the internal language, its mini-compiler or interpreter, and its support and debugging facilities.

What's New?

Version 3.0 has been rewritten to run with Microsoft (R) Visual C++ (R) 2010 Express Edition. This excellent compiler can be downloaded from the Microsoft web site.

We make full use of the Standard Template Library (STL), including the special numerical analysis features such as the valarray class. C++ language constructions are further updated to conform to the high level of standard compliance demanded by Visual C++ 2010. Errors are now dealt with by using the exceptions mechanism. Because all low level issues such as memory allocation and deallocation are now delegated to the STL, we expect this version to be very robust and easy to support.
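
As a rough illustration of these two features of standard C++, the sketch below uses std::valarray for element-wise arithmetic and reports a usage error by throwing an exception. The function SumSquaredResiduals is a hypothetical example and is not taken from the MLE++ source.

```cpp
#include <stdexcept>
#include <valarray>

// Hypothetical helper, not part of MLE++: sum of squared residuals of the
// line y = a + b*x, written with element-wise valarray expressions.
double SumSquaredResiduals(const std::valarray<double>& y,
                           const std::valarray<double>& x,
                           double a, double b) {
    if (y.size() != x.size()) {
        // Report the problem through the exception mechanism.
        throw std::invalid_argument("y and x must have the same length");
    }
    const std::valarray<double> fitted = a + b * x;       // scalar with array
    const std::valarray<double> resid = y - fitted;       // array with array
    const std::valarray<double> squared = resid * resid;  // element-wise product
    return squared.sum();
}
```

No memory is allocated or released by hand here; the valarray objects manage their own storage, and a caller can handle a length mismatch with an ordinary try/catch block.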

MLE++ with Microsoft (R) Visual C++ (R) 2010 Express Edition

Microsoft (R) Visual C++ (R) 2010 Express Edition is the free version of the most popular C++ compiler on the Windows platform. It is an excellent compiler with a friendly Integrated Development Environment (IDE). This makes it a perfect environment for developing and running model estimation programs using the MLE++ library. Working from the Visual C++ (R) IDE makes the packaged models of MLE++ as convenient and easy to use as those of a conventional package.

Since Microsoft (R) Visual C++ (R) 2010 Express Edition (all you need) is available free from the Microsoft web site, why not try out C++ and see if you like it? See the Microsoft web site for details.

Minimum System Requirements

Microsoft Windows (R) 2000, XP or later
192MB of memory (256MB recommended)
1.3 GB hard disk space
Min 600 MHz Pentium processor
Display: minimum 800 x 600 with 256 colors; recommended 1024 x 768 with High Color (16-bit)
CD-ROM drive
Mouse or other Windows pointing device

The Packaged Models

The following models are implemented as classes derived from an abstract base class:

Ordinary Least Squares
Binary Probit
Binary Logit
Tobit
Truncated Regression
Sample Selection
Treatment Effects
Type 3 Tobit
Type 5 Tobit: An Endogenous Switching Regression
Bivariate Probit
Multinomial Logit
Conditional Logit
Poisson Regression
Weibull and Exponential Survival Models
Weibull and Exponential Models with Heterogeneity
Weibull and Exponential Models with Time-Varying Covariates
Weibull and Exponential Models with Heterogeneity and Time-Varying Covariates
The Lognormal Survival Model
The Loglogistic Survival Model
Meyer's Semiparametric Survival Model
Meyer's Model with Heterogeneity
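
The benefit of deriving every model from one abstract base class is that a single maximization routine can serve all of them. The sketch below illustrates the pattern only; LikelihoodModel, ExponentialModel, PoissonModel and MaximizeGoldenSection are hypothetical names, the models have a single parameter, and golden-section search stands in for whatever algorithm the library actually uses.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical abstract base class; the names are illustrative, not MLE++'s.
class LikelihoodModel {
public:
    virtual ~LikelihoodModel() {}
    // Log-likelihood of the sample at a single parameter value.
    virtual double LogLikelihood(double parm) const = 0;
};

// Exponential durations with rate parm: logL = n*log(parm) - parm*sum(t).
class ExponentialModel : public LikelihoodModel {
public:
    explicit ExponentialModel(const std::vector<double>& t)
        : n_(static_cast<double>(t.size())), sum_(0.0) {
        for (std::size_t i = 0; i < t.size(); ++i) sum_ += t[i];
    }
    double LogLikelihood(double parm) const {
        return n_ * std::log(parm) - parm * sum_;
    }
private:
    double n_, sum_;
};

// Poisson counts with mean parm: logL = sum(y)*log(parm) - n*parm,
// dropping the term that does not depend on parm.
class PoissonModel : public LikelihoodModel {
public:
    explicit PoissonModel(const std::vector<int>& y)
        : n_(static_cast<double>(y.size())), sum_(0.0) {
        for (std::size_t i = 0; i < y.size(); ++i) sum_ += y[i];
    }
    double LogLikelihood(double parm) const {
        return sum_ * std::log(parm) - n_ * parm;
    }
private:
    double n_, sum_;
};

// One optimizer for every derived model: golden-section search for the
// maximum of a unimodal log-likelihood on the interval [lo, hi].
double MaximizeGoldenSection(const LikelihoodModel& m, double lo, double hi) {
    const double ratio = (std::sqrt(5.0) - 1.0) / 2.0;
    double a = lo, b = hi;
    double c = b - ratio * (b - a), d = a + ratio * (b - a);
    double fc = m.LogLikelihood(c), fd = m.LogLikelihood(d);
    while (b - a > 1e-9) {
        if (fc > fd) {  // maximum lies in [a, d]
            b = d; d = c; fd = fc;
            c = b - ratio * (b - a); fc = m.LogLikelihood(c);
        } else {        // maximum lies in [c, b]
            a = c; c = d; fc = fd;
            d = a + ratio * (b - a); fd = m.LogLikelihood(d);
        }
    }
    return 0.5 * (a + b);
}
```

Adding a new model in this design means writing one more derived class with its own LogLikelihood function; the optimizer is not touched.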

Why use a C++ class library and compiler?

Price

Since the C++ compiler is free, you are paying only for the library. You will be getting a compiler that represents many person-years of software development and includes extensive code optimizing capability, a friendly development environment, on-line help, and a powerful debugger. You can try out the C++ language before investing in MLE++.

Convenience and Ease of Use

Do we really claim this for C++? In this case YES!

The main reason is the matrix and vector class library that we provide, including overloaded operators. While C++ is a large general-purpose language, you do not need to know it all. When you write code for a likelihood function you should not use any of the low-level features of C++ such as access to raw memory through pointers. You will use a small and simple subset of C++ that uses the Matrix and Vector objects. The code will look very similar to that of an interpreted matrix language. Nevertheless, your code will compile to the most efficient machine code possible.
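
To show how overloaded operators make matrix expressions possible in C++, here is a deliberately small sketch. The Vec and Mat types below are hypothetical stand-ins written for this example; they are not the Matrix and Vector classes shipped with MLE++, which are far more complete.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration; these are not the MLE++ classes.
class Vec {
public:
    explicit Vec(std::size_t n, double value = 0.0) : data_(n, value) {}
    std::size_t size() const { return data_.size(); }
    double& operator[](std::size_t i) { return data_[i]; }
    double operator[](std::size_t i) const { return data_[i]; }
private:
    std::vector<double> data_;
};

class Mat {
public:
    Mat(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
private:
    std::size_t rows_, cols_;
    std::vector<double> data_;  // row-major storage
};

// Matrix-vector product, so that "a * x" is a valid expression.
Vec operator*(const Mat& a, const Vec& x) {
    assert(a.cols() == x.size());
    Vec result(a.rows());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t j = 0; j < a.cols(); ++j)
            result[i] += a(i, j) * x[j];
    return result;
}

// Element-wise vector sum, so that "u + v" is a valid expression.
Vec operator+(const Vec& u, const Vec& v) {
    assert(u.size() == v.size());
    Vec result(u.size());
    for (std::size_t i = 0; i < u.size(); ++i) result[i] = u[i] + v[i];
    return result;
}
```

With these definitions in place, a statement such as Vec y = a * x + b; reads like a line from a matrix language, yet it is ordinary compiled C++ with no raw pointers in the user's code.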

Another reason is the Integrated Development Environment (IDE). Programs are prepared in a full-screen editor supporting multiple windows and a host of useful tools. The sequence of compiling, linking, and submitting your program is fully automated: it is all done with a single mouse click. No package could be more "user-friendly."

OOP

Object oriented programming has proved to be a very valuable innovation. If you plan to write econometric model estimation programs, then you should benefit from this modern technique. MLE++ applies OOP to support general maximum likelihood estimation.

Popularity and Longevity

Although the package can be used with only the most basic programming knowledge, you will gain access to the C++ language. C++ grew out of the dominant C programming language, which it includes as a subset, and provides support for OOP. C++ is a standard non-proprietary language that continues to evolve with maximum backward compatibility under the guidance of an international standards committee. Learning a language can be a big investment, and the enormous popularity of C++ as a standard language could be an advantage to some users who consider the human capital value of language expertise. C++ programming is a big industry, and there are many libraries produced by third-party vendors to support the language. Although C gained its popularity as a language for systems programmers, C++ is suitable for general programming, and has excellent facilities for numerical and statistical programming. C++ class libraries have proven themselves in many fields, and are now one of the most popular standard tools used in the software industry.

For more thoughts on the future of C++, we suggest a visit to the home page of its creator, Bjarne Stroustrup.

http://www2.research.att.com/~bs/homepage.html

Flexibility

Compared to a compiled package, a C++ library offers flexibility with the ability to extend models and to implement new models. Access to the source code for estimators of many models from the econometric literature gives the creative researcher a head start.

Documentation

The library source code and this manual provide some of the most practical documentation available for many advanced limited dependent variable models and the techniques used to estimate them.

Efficiency

If a conventional package includes a sufficiently flexible language, it may be possible to implement a new statistical procedure within the package. However, the package would then be acting as a compiler or interpreter of the code for the new model, and it is very unlikely that it would generate code as efficient as that from an optimizing C++ compiler. Since iterative techniques can be very time consuming, this may be an important consideration.

Debugging

A conventional econometric package with an internal programming language will not contain debugging tools comparable to those of a good C++ compiler.

What About Other Languages?

Specialty Matrix-Oriented Languages

High-level matrix languages can be very powerful for numerical work, and some contain libraries suitable for maximum-likelihood estimation. But there are some disadvantages. These languages are interpreted, and in maximum likelihood estimation this means placing an interpreter in the inner loop of a maximization routine. If you find it necessary, or just convenient, to write a program which addresses the elements of a matrix in a loop, this may lead to extreme inefficiency. Other disadvantages are a relatively small user community, limited third-party support, and weakness for general programming use.

Java, C#, BASIC, and Others

Java is object oriented, but alas it does not support operator overloading, which is needed to incorporate matrix expressions. Also, current implementations are not very efficient, and there is little scientific and mathematical library support. Other languages are often proprietary in some sense, at least because of the enhancements that the chief supporter supplies. This is true of C# and BASIC. By contrast, the C++ standard is well supported. Java and C# in particular have been touted as the successors to C++. Naturally, we do not agree. While some of the features that may cause software bugs through access to raw memory have been designed out of these languages, we see this primarily as an advantage in team programming of interfaces, network, and internet programming. In our view, a scientific programmer using a subset of C++ with matrix-vector classes would not gain by switching to Java or C#. They would lose the advantage of a standard language with library support and maximum efficiency.

FORTRAN

Although FORTRAN offers extensive libraries and continues to evolve, the language is limited if you stray from its narrow focus on numerical analysis. C++ is superior to FORTRAN for use as a general programming language. In applied work, especially in government departments and agencies, econometrics is often associated with the construction of program simulation models. C++ and the OOP approach lend themselves well to simulation (C++ classes were patterned on Simula 67).

Who Would Benefit from MLE++?

A likely candidate would be someone working with econometrics on cross-sectional or panel data at a level near the research frontier, and who is interested in programming. Development of the product was motivated by the author's experience in government with Unemployment Insurance program analysis, and with labor market program evaluation. In this environment, the construction of simulation models and the estimation of econometric models go hand-in-hand. An interest in computer programming is a natural result. Similar situations may arise in academia, and in consulting firms.

Documentation

Two manuals are included with the product: the MLE++ Users Manual (343 pages) and MLE++ with Microsoft (R) Visual C++ (R) 2010 (29 pages).