ESTIMATION OF SOME NON-NORMAL LIMITED DEPENDENT VARIABLE MODELS

by Lung-Fei Lee (*)

Discussion Paper No. 81-148, June 1981

Center for Economic Research
Department of Economics
University of Minnesota
Minneapolis, Minnesota 55455

Department of Economics, University of Minnesota, Minneapolis, and Center for Econometrics and Decision Sciences, University of Florida, Gainesville

May 1981

Abstract

Estimation of Tobit models in which the distribution of the disturbances belongs to the Pearson family of distributions is considered. Some simple recursive relations between the moments of the truncated distribution are derived. These recursive relations provide structural equations which are linear in variables but nonlinear in coefficients. The estimation method proposed is a nonlinear two stage least squares method.

Address:
(valid after July 1, 1981) Department of Economics, University of Minnesota, 1035 Business Administration, 271 19th Avenue South, Minneapolis, MN 55455
(valid until June 30, 1981) Department of Economics, University of Florida, Gainesville, Florida 32611

1. Introduction

It is known that the classical least squares method does not provide consistent estimates for Tobit regression models when the dependent variable is truncated or censored. To overcome this problem, a specific distribution for the disturbances is assumed. The commonly specified distribution is the normal distribution. Under normality, the maximum likelihood estimator has been shown to be consistent, asymptotically normal and asymptotically efficient in Amemiya [1973]. A consistent instrumental variable method has also been proposed in Amemiya [1973]. Unfortunately, these estimates are not robust. Misspecification of normality may create severe asymptotic bias in the estimates.
Other misspecifications, such as the assumption of homoscedastic errors when there is heteroscedasticity, will also create such problems. The consequences of such misspecifications have been investigated in Goldberger [1980], Hurd [1979] and Nelson [1981].

Since misspecification in the Tobit model can cause problems, it is desirable to have procedures that test the assumptions, or to consider models that relax the strong assumptions. For the censored-normal Tobit model, a specification test is suggested in Nelson [1981]. To test the normality assumption, some Lagrangean multiplier tests are provided in Lee [1981] for both the censored and truncated Tobit models.

Since normality is a rather strong assumption, it is desirable to relax it and consider models with more flexible distributions. In this article, we consider the estimation of Tobit models in which the distribution of the disturbances belongs to the Pearson family of distributions. The Pearson family is attractive since it contains distributions of widely different shapes and contains the normal, Student's t, and gamma distributions as special cases.

This article is organized as follows. In Section 2, the Tobit model with the Pearson family of distributions is specified. In Section 3, we analyze the moments of the distribution of the Tobit models when there is a single truncation. In Section 4, recursive relations between the moments of the doubly truncated distributions are derived. In Section 5, consistent estimation methods are suggested. Finally, we draw our conclusions.

2. The Tobit Models and the Pearson Family of Distributions

A typical censored Tobit model is specified as

$$y_{1i} = x_i\beta + u_i, \qquad i = 1, 2, \ldots, N, \tag{2.1}$$

where $x_i$ is a $1 \times k$ row vector of exogenous variables which contains a constant term; $E(u_i) = 0$ and the disturbances $u_i$ are independent and identically distributed; the dependent variable $y_{1i}$ is an unobservable variable but is related to the observed dependent variable $y_i$ as follows (footnote 1):

$$y_i = \begin{cases} y_{1i} & \text{if } y_{1i} > 0, \\ 0 & \text{otherwise.} \end{cases}$$

This model originated in the pioneering paper of Tobin [1958], with the additional normality assumption imposed on the distribution of the disturbance $u$. To relax the normality assumption, we assume that the distribution of $u$ belongs to the Pearson family of distributions. This system of distributions was originated by K. Pearson; see, e.g., Elderton and Johnson [1969] and Johnson and Kotz [1970]. For each member of the system, the probability density function $g(u)$ satisfies the following differential equation (footnote 2),

$$\frac{d \ln g(u)}{du} = \frac{a + u}{b_0 + b_1 u + b_2 u^2}. \tag{2.2}$$

As the disturbance $u$ has zero mean, it follows that $a = -b_1$. After some reparameterization, the density function $g(u)$ with zero mean is characterized by the differential equation

$$\frac{d \ln g(u)}{du} = \frac{c_1 - u}{c_0 - c_1 u + c_2 u^2}. \tag{2.3}$$

Let $f(y_1 \mid x)$ be the conditional density function of $y_1$ conditional on the exogenous variable vector $x$. It follows that

$$\frac{d \ln f(y_1 \mid x)}{d y_1} = \left. \frac{d \ln g(u)}{du} \right|_{u = y_1 - x\beta}. \tag{2.4}$$

The formal solution of the density function $g(u)$ from the differential equation (2.3) depends on the characteristics of the solutions of the quadratic equation $c_0 - c_1 u + c_2 u^2 = 0$. The general density function, however, can be expressed informally, up to the normalizing constant, as

$$g(u) \propto \exp\left\{ \int^{u} \frac{c_1 - t}{c_0 - c_1 t + c_2 t^2}\, dt \right\}. \tag{2.5}$$

Define a dichotomous indicator $I_i$,

$$I_i = \begin{cases} 1 & \text{if } y_{1i} > 0, \\ 0 & \text{otherwise,} \end{cases}$$

and let $G(-x_i\beta) = \int_{-\infty}^{-x_i\beta} g(u)\, du$ be the probability that $I_i = 0$. The log likelihood function for this model is

$$L_c = \sum_{i=1}^{N} \left\{ (1 - I_i) \ln \int_{-\infty}^{-x_i\beta} \exp q(t)\, dt + I_i\, q(y_i - x_i\beta) - \ln \int_{-\infty}^{\infty} \exp q(t)\, dt \right\}, \tag{2.6}$$

where

$$q(t) = q(t;\, c_0, c_1, c_2) = \int^{t} \frac{c_1 - s}{c_0 - c_1 s + c_2 s^2}\, ds. \tag{2.7}$$

The normal density function corresponds to the case $c_1 = 0$ and $c_2 = 0$. The (central) t distribution belongs to the family with $c_1 = 0$.

The Tobit model described above is a censored regression model in which the exogenous variable vector $x_i$ is always observable and the number of observations corresponding to $I_i = 0$ is also known. There are situations in which the samples are drawn only from the population with $y_1 > 0$. These samples are known as truncated samples, and the corresponding Tobit model is a regression model with truncated dependent variable. For the truncated Tobit model with the Pearson family of distributions, the general density function is

$$g(u \mid x) = \frac{\exp q(u)}{\int_{-x\beta}^{\infty} \exp q(t)\, dt}, \tag{2.8}$$

where, for given $x$, $u$ is in the interval $[-x\beta, \infty)$. For a given sample $\{y_i\}$ of size $N$, the log likelihood function for the truncated Tobit model is

$$L_T = \sum_{i=1}^{N} \left\{ q(y_i - x_i\beta) - \ln \int_{-x_i\beta}^{\infty} \exp q(t)\, dt \right\}. \tag{2.9}$$

The Pearson family of distributions of $u$ uses four parameters and is characterized by the first four moments of $u$ (see Elderton and Johnson [1969], pp. 38-39). It is possible to expand the system to the family of distributions characterized by the differential equation

$$\frac{d \ln g(u)}{du} = \frac{a + u}{b_0 + b_1 u + \cdots + b_m u^m}, \qquad m > 2,$$

with higher order polynomials in the denominator. However, as argued in the statistical literature, such complications are unnecessary for most practical purposes.

3. Incomplete Moments and Singly Truncated Distribution

While the maximum likelihood method for the Tobit model with normal distribution is attractive, it appears computationally very complicated for the Tobit model with the Pearson family of distributions. As pointed out in Johnson and Kotz ([1970], p. 12), fitting the Type IV distribution by maximum likelihood is difficult and is practically never attempted (footnote 3). As a practical alternative, the method of moments is used for general distribution fitting; see Elderton and Johnson ([1969], Chapters 4 and 5). The method of moments has been generalized to the estimation of truncated Pearson distributions in Cohen [1951, 1953]. The models considered by Cohen are not regression-type models. For the estimation of the Tobit models, however, the method of moments can be modified into a method of instrumental variables estimation.
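Instrumental variables estimation of the Tobit models with normal distribution was first proposed in Amemiya [1973]. The estimation method that we will consider utilizes only the sample information on the observed values of $y_{1i}$ and is applicable to both the censored and the truncated models. For both models, it follows from (2.4) that the density function of $y_1$ conditional on $x$ satisfies the following relation,

$$\frac{d \ln f(y_1 \mid x)}{d y_1} = \frac{a(x) - y_1}{b_0(x) - b_1(x)\, y_1 + b_2(x)\, y_1^2}, \tag{3.1}$$

where $a(x)$ and $b_i(x)$, $i = 0, 1, 2$, are functions of $x$:

$$a(x) = c_1 + x\beta, \tag{3.2}$$

$$b_0(x) = c_0 + c_1 x\beta + c_2 (x\beta)^2, \tag{3.3}$$

$$b_1(x) = c_1 + 2 c_2 x\beta, \tag{3.4}$$

$$b_2(x) = c_2. \tag{3.5}$$

For notational simplicity, the vector $x$ in the density function (3.1) and in the functions (3.2)-(3.5) will be suppressed.

The differential equation in (3.1) can be used to derive very simple recursive relations for the moments of the truncated Pearson distribution. Such relations can be found in Cohen [1952] (footnote 4). Equation (3.1) implies that

$$(b_0 - b_1 y_1 + b_2 y_1^2)\, \frac{d f(y_1)}{d y_1} = (a - y_1)\, f(y_1). \tag{3.6}$$

Multiplying both sides by $y_1^n$, $n \in \{0, 1, 2, \ldots\}$, and integrating over the range from 0 up to the upper limit of $y_1$, denoted as "$\infty$", we have

$$\int_0^\infty y_1^n (b_0 - b_1 y_1 + b_2 y_1^2)\, \frac{d f(y_1)}{d y_1}\, d y_1 = \int_0^\infty y_1^n (a - y_1)\, f(y_1)\, d y_1. \tag{3.7}$$

Integrating by parts, it follows that

$$\Big[ y_1^n (b_0 - b_1 y_1 + b_2 y_1^2) f(y_1) \Big]_0^\infty - \int_0^\infty \big[ n b_0 y_1^{n-1} - (n+1) b_1 y_1^n + (n+2) b_2 y_1^{n+1} \big] f(y_1)\, d y_1 = a \int_0^\infty y_1^n f(y_1)\, d y_1 - \int_0^\infty y_1^{n+1} f(y_1)\, d y_1. \tag{3.8}$$

Suppose that $\lim_{y_1 \to \infty} y_1^n f(y_1) = 0$ for all $n = 0, 1, 2, \ldots$, i.e., $y_1^n f(y_1)$ vanishes at the upper end point. Equation (3.8) can then be further simplified to

$$-b_0 f(0) - \int_0^\infty [-b_1 + 2 b_2 y_1]\, f(y_1)\, d y_1 = a \int_0^\infty f(y_1)\, d y_1 - \int_0^\infty y_1 f(y_1)\, d y_1 \tag{3.9}$$

for $n = 0$, and

$$-\int_0^\infty \big[ n b_0 y_1^{n-1} - (n+1) b_1 y_1^n + (n+2) b_2 y_1^{n+1} \big] f(y_1)\, d y_1 = a \int_0^\infty y_1^n f(y_1)\, d y_1 - \int_0^\infty y_1^{n+1} f(y_1)\, d y_1, \qquad n = 1, 2, \ldots \tag{3.10}$$

Let

$$\mu_n = \int_0^\infty y_1^n f(y_1)\, d y_1 \Big/ \int_0^\infty f(y_1)\, d y_1, \qquad n = 1, 2, \ldots \tag{3.11}$$

be the $n$th moment of the truncated density function $f(y_1) / \int_0^\infty f(y_1)\, d y_1$, with $\mu_0 = 1$. Equation (3.9) implies

$$(1 - 2 b_2)\, \mu_1 = a - b_1 + b_0\, \frac{f(0)}{F}, \tag{3.12}$$

where $F = \int_0^\infty f(y_1)\, d y_1$, and equation (3.10) implies

$$\big(1 - (n+2) b_2\big)\, \mu_{n+1} = \big(a - (n+1) b_1\big)\, \mu_n + n b_0\, \mu_{n-1}, \qquad n = 1, 2, \ldots \tag{3.13}$$

Substituting the expressions (3.2)-(3.5) into equations (3.12) and (3.13), we have explicitly the following set of equations:

$$y = x\beta + \frac{1}{1 - 2 c_2} \big[ c_0 + c_1 x\beta + c_2 (x\beta)^2 \big] \frac{f(0 \mid x)}{F(x)} + \eta_1, \tag{3.14}$$

where $F(x) = \int_0^\infty f(y_1 \mid x)\, d y_1$ and $\eta_1 = y - \mu_1$; and

$$y^{n+1} = n y^{n-1} c^*_{0n} - n y^n c^*_{1n} + y^n x (1 - n c^*_{2n}) \beta + n y^{n-1} x c^*_{1n} \beta + n y^{n-1} (x \otimes x)\, c^*_{2n} (\beta \otimes \beta) + \eta_{n+1}, \qquad n = 1, 2, \ldots, \tag{3.15}$$

where

$$c^*_{0n} = c_0 / \big(1 - (n+2) c_2\big), \tag{3.16}$$

$$c^*_{1n} = c_1 / \big(1 - (n+2) c_2\big), \tag{3.17}$$

$$c^*_{2n} = c_2 / \big(1 - (n+2) c_2\big), \tag{3.18}$$

and

$$\eta_{n+1} = (y^{n+1} - \mu_{n+1}) - n (y^{n-1} - \mu_{n-1}) c^*_{0n} + n (y^n - \mu_n) c^*_{1n} - x (y^n - \mu_n)(1 - n c^*_{2n}) \beta - (y^{n-1} - \mu_{n-1})\, n x c^*_{1n} \beta - n (y^{n-1} - \mu_{n-1}) (x \otimes x)\, c^*_{2n} (\beta \otimes \beta) \tag{3.19}$$

(footnote 5). Here $(x\beta)^2$ has been written as $(x \otimes x)(\beta \otimes \beta)$. Obviously, conditional on $y_1$ being observed, all the disturbances have zero mean, i.e., $E(\eta_n \mid y_1 > 0) = 0$ for all $n$.

4. Incomplete Moments and Doubly Truncated Distribution

The derivations of the moments and the recursive formulae in the previous section are based on the assumption that $y_1^n f(y_1)$, evaluated at the upper end point of the support or at $\infty$, is zero for all $n \in \{0, 1, \ldots\}$. This assumption is quite restrictive and rules out many distributions in the Pearson family. The problem can be avoided if we artificially truncate the distribution of $y_1$ from above with a nonstochastic threshold $k$, $k > 0$, which eliminates the observations $y_i$ with value greater than $k$. The model considered thus becomes a two limit Tobit model with the Pearson family of distributions, which generalizes the model in Rosett and Nelson [1975] (footnote 6). Equation (3.6) implies

$$\int_0^k y_1^n (b_0 - b_1 y_1 + b_2 y_1^2)\, \frac{d f(y_1)}{d y_1}\, d y_1 = \int_0^k y_1^n (a - y_1)\, f(y_1)\, d y_1 \tag{4.1}$$

and hence

$$\Big[ y_1^n (b_0 - b_1 y_1 + b_2 y_1^2) f(y_1) \Big]_0^k - \int_0^k \big[ n b_0 y_1^{n-1} - (n+1) b_1 y_1^n + (n+2) b_2 y_1^{n+1} \big] f(y_1)\, d y_1 = \int_0^k y_1^n (a - y_1)\, f(y_1)\, d y_1 \tag{4.2}$$

for all $n \in \{0, 1, 2, \ldots\}$. Let

$$\nu_n = \int_0^k y_1^n f(y_1)\, d y_1 \Big/ \int_0^k f(y_1)\, d y_1 \tag{4.3}$$

be the $n$th moment of the doubly truncated density function $f(y_1) / \int_0^k f(y_1)\, d y_1$, and let $F = \int_0^k f(y_1)\, d y_1$. The equations in (4.2) become

$$(b_0 - b_1 k + b_2 k^2)\, \frac{f(k)}{F} - b_0\, \frac{f(0)}{F} - [-b_1 + 2 b_2 \nu_1] = a - \nu_1 \tag{4.4}$$

and

$$k^n (b_0 - b_1 k + b_2 k^2)\, \frac{f(k)}{F} - \big[ n b_0 \nu_{n-1} - (n+1) b_1 \nu_n + (n+2) b_2 \nu_{n+1} \big] = a \nu_n - \nu_{n+1}, \qquad n = 1, 2, \ldots \tag{4.5}$$

Subtracting $k$ times equation (4.5) for $n - 1$ from equation (4.5) for $n$ eliminates the term in $f(k)/F$, so the equations in (4.5) imply, in turn, the following recursive relation,

$$\big(1 - (n+2) b_2\big)\, \nu_{n+1} = \big[ a - (n+1) b_1 + k \big(1 - (n+1) b_2\big) \big]\, \nu_n + \big[ n b_0 - k (a - n b_1) \big]\, \nu_{n-1} - k (n-1) b_0\, \nu_{n-2}, \qquad n = 2, 3, \ldots \tag{4.6}$$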
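The recursion between doubly truncated moments can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Pearson member with $c_1 = 0$, whose density is proportional to $(c_0 + c_2 u^2)^{-1/(2 c_2)}$, and the parameter values, the location `m` (standing in for $x\beta$), the threshold `k` and the helper names `simpson`, `nu` and `recursion_gap` are all illustrative choices. Moments are computed by Simpson's rule, so the recursion holds only up to quadrature error.

```python
# Numerical check of the doubly truncated moment recursion
#   (1-(n+2)b2) nu_{n+1} = [a-(n+1)b1 + k(1-(n+1)b2)] nu_n
#                          + [n b0 - k(a - n b1)] nu_{n-1} - k(n-1) b0 nu_{n-2}
# for an illustrative Pearson member with c1 = 0 (hypothetical parameter values).

def simpson(func, lo, hi, panels=2000):
    """Composite Simpson rule on [lo, hi]; `panels` must be even."""
    h = (hi - lo) / panels
    total = func(lo) + func(hi)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3.0

c0, c2 = 1.0, 0.1      # assumed Pearson parameters, with c1 = 0
m, k = 0.7, 3.0        # assumed value of x*beta and upper truncation point

def f(y):
    """Unnormalized density of y1 = m + u, solving d ln f/dy = (m - y)/(c0 + c2 (y-m)^2)."""
    return (c0 + c2 * (y - m) ** 2) ** (-1.0 / (2.0 * c2))

# Coefficients a, b0, b1, b2 of (3.2)-(3.5) evaluated at c1 = 0.
a, b0, b1, b2 = m, c0 + c2 * m ** 2, 2.0 * c2 * m, c2

F = simpson(f, 0.0, k)

def nu(n):
    """n-th moment of the density truncated to (0, k)."""
    return simpson(lambda y: y ** n * f(y), 0.0, k) / F

def recursion_gap(n):
    """Left side minus right side of the recursion; should be close to zero."""
    lhs = (1 - (n + 2) * b2) * nu(n + 1)
    rhs = ((a - (n + 1) * b1 + k * (1 - (n + 1) * b2)) * nu(n)
           + (n * b0 - k * (a - n * b1)) * nu(n - 1)
           - k * (n - 1) * b0 * nu(n - 2))
    return lhs - rhs

gaps = [recursion_gap(n) for n in (2, 3, 4)]
print(gaps)
```

The recursion involves neither $f(0)$ nor $f(k)$, which is why it can serve as an estimating equation without evaluating the density.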
Substituting the functions in (3.2)-(3.5) into (4.4) and (4.6), we have explicitly the following equations:

$$y = x\beta + \frac{1}{1 - 2 c_2} \big[ c_0 + c_1 x\beta + c_2 (x\beta)^2 \big] \frac{f(0 \mid x)}{F(x)} - \frac{1}{1 - 2 c_2} \big[ c_0 - c_1 k + c_2 k^2 + (c_1 - 2 c_2 k) x\beta + c_2 (x\beta)^2 \big] \frac{f(k \mid x)}{F(x)} + \eta_1 \tag{4.4$'$}$$

and

$$\begin{aligned} y^{n+1} ={}& y^n \big( -n c^*_{1n} + k (1 + c^*_{2n}) \big) + y^n x\beta\, (1 - n c^*_{2n}) + y^{n-1} \big( n c^*_{0n} + (n-1) k c^*_{1n} \big) \\ &+ y^{n-1} x\beta\, \big( n c^*_{1n} - k (1 - (n-2) c^*_{2n}) \big) + n y^{n-1} (x \otimes x)(\beta \otimes \beta)\, c^*_{2n} \\ &- k (n-1) y^{n-2} c^*_{0n} - k (n-1) y^{n-2} x\beta\, c^*_{1n} - k (n-1) y^{n-2} (x \otimes x)(\beta \otimes \beta)\, c^*_{2n} + \eta_n, \qquad n = 2, 3, \ldots, \end{aligned} \tag{4.5$'$}$$

where $E(\eta_n \mid 0 < y_1 < k) = 0$ for all $n$, and

$$c^*_{0n} = c_0 / \big(1 - (n+2) c_2\big), \tag{4.7}$$

$$c^*_{1n} = c_1 / \big(1 - (n+2) c_2\big), \tag{4.8}$$

$$c^*_{2n} = c_2 / \big(1 - (n+2) c_2\big). \tag{4.9}$$

5. Instrumental Variables Estimation

For the Tobit models with normal distribution, Amemiya [1973] has derived the relations between the first and the second moments of a singly truncated normal distribution and proposed a simple consistent instrumental variables estimation method. This kind of method can be extended to the estimation of our models. To estimate the models, the relations in (3.15) or the ones in (4.5)' can be used when the dependent variable is singly truncated. For the doubly truncated cases, only the relations in (4.5)' can be used. The basic vector of parameters in our model is $\theta = (\beta', c_0, c_1, c_2)'$. The equations (3.15) and (4.5)' are linear in variables but nonlinear in parameters. A consistent estimation method that can be used is the nonlinear two stage least squares procedure (see Amemiya [1974]).

Without loss of generality, let us briefly describe the estimation procedure for the doubly truncated models. For each $n \in \{2, 3, \ldots\}$, it is convenient to rewrite equation (4.5)' implicitly as

$$y^{n+1} = y^n \gamma_{1n}(\theta) + y^n x\, \gamma_{2n}(\theta) + y^{n-1} \gamma_{3n}(\theta) + y^{n-1} x\, \gamma_{4n}(\theta) + y^{n-1} (x \otimes x)\, \gamma_{5n}(\theta) + y^{n-2} \gamma_{6n}(\theta) + y^{n-2} x\, \gamma_{7n}(\theta) + y^{n-2} (x \otimes x)\, \gamma_{8n}(\theta) + \eta_n, \tag{5.1}$$

where the functions $\gamma_{jn}(\theta)$, $j = 1, \ldots, 8$, are nonlinear vector-valued functions of the basic parameter vector $\theta$ and are defined from equations (4.5)' and (4.7)-(4.9). Let

$$z_n = \big( y^n,\; y^n x,\; y^{n-1},\; y^{n-1} x,\; y^{n-1} (x \otimes x),\; y^{n-2},\; y^{n-2} x,\; y^{n-2} (x \otimes x) \big)$$

be the vector of explanatory variables on the right hand side of equation (5.1).
Hence, equation (5.1) can be rewritten as

$$y^{n+1} = z_n \gamma_n(\theta) + \eta_n, \tag{5.2}$$

where $\gamma_n(\theta)$ stacks the coefficient functions $\gamma_{1n}(\theta), \ldots, \gamma_{8n}(\theta)$. Without loss of generality, assume that the number of observations on $y$ with $0 < y < k$ is $N$. Let $Y_{n+1}$ and $Z_n$ be the data matrices of the dependent variable $y^{n+1}$ and the explanatory variable vector $z_n$. In matrix notation, equation (5.2) is

$$Y_{n+1} = Z_n \gamma_n(\theta) + E_n, \tag{5.3}$$

where $E_n$ is the corresponding vector of the disturbances $\eta_{ni}$. To estimate this equation, valid instrumental variables need to be constructed for $z_n$. Since, in general, the vector of exogenous variables $x$ contains a constant term, some of the columns in $Z_n$ are identical. Of course, one can eliminate the duplicated columns and combine the corresponding coefficients, but this is rather tedious to do in practice. Let $x = (1, w)$. The distinct explanatory variables in $z_n$ are $y^n$, $y^n w$, $y^{n-1}$, $y^{n-1} w$, $y^{n-1} (w \otimes w)$, $y^{n-2}$, $y^{n-2} w$ and $y^{n-2} (w \otimes w)$. To construct a set of valid instruments for these variables, we can regress the dependent variable $y$ on the exogenous variables $1$, $w$ and some low order polynomials of $w$, and then construct the least squares predictor $\hat{y}$ for $y$. The vector of instrumental variables can be constructed as

$$q_n = \big( \hat{y}^n,\; \hat{y}^n w,\; \hat{y}^{n-1},\; \hat{y}^{n-1} w,\; \hat{y}^{n-1} (w \otimes w),\; \hat{y}^{n-2},\; \hat{y}^{n-2} w,\; \hat{y}^{n-2} (w \otimes w) \big). \tag{5.4}$$

Let $Q_n$ denote the instrumental variables matrix so constructed. For each $n = 2, 3, \ldots$, a nonlinear two stage least squares estimator $\hat{\theta}_n$ of $\theta$ can be derived from

$$\min_\theta\; \big( Y_{n+1} - Z_n \gamma_n(\theta) \big)'\, Q_n (Q_n' Q_n)^{-1} Q_n'\, \big( Y_{n+1} - Z_n \gamma_n(\theta) \big).$$

For this estimation, iterative methods such as Newton's method or its variants need to be used. Let $\theta_n^{(m)}$ be the estimator derived in the $m$th step of the iteration. The $(m+1)$th step will give a modified estimate as

$$\theta_n^{(m+1)} = \theta_n^{(m)} + \left[ \frac{\partial \gamma_n'(\theta_n^{(m)})}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\theta_n^{(m)})}{\partial \theta'} \right]^{-1} \left[ \frac{\partial \gamma_n'(\theta_n^{(m)})}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' \big( Y_{n+1} - Z_n \gamma_n(\theta_n^{(m)}) \big) \right].$$

The final estimate is obtained when the iteration converges.
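The iteration can be sketched in a few lines of code. This is a toy illustration under stated assumptions, not the estimator for the full model: the coefficient map `gamma` with $\gamma(\theta) = (\beta, c, \beta c)'$, the noise-free data, the instrument matrix `Q` and the helper names `transpose`, `matmul`, `solve` and `nl2sls` are all hypothetical, chosen only so that the equation is linear in variables and nonlinear in parameters, as in (5.2).

```python
# Toy sketch of the nonlinear two stage least squares iteration
#   theta <- theta + [G'Z'PZ G]^{-1} G'Z'P (Y - Z gamma(theta)),  P = Q (Q'Q)^{-1} Q',
# using only the standard library. All data and functions are illustrative.

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                fac = M[r][c]
                M[r] = [vr - fac * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def nl2sls(Y, Z, Q, gamma, jac, theta, steps=25):
    """Iterate the update for a fixed number of steps from the start value theta."""
    Qt = transpose(Q)
    QtZ = matmul(Qt, Z)
    S = solve(matmul(Qt, Q), QtZ)        # (Q'Q)^{-1} Q'Z
    ZPZ = matmul(transpose(QtZ), S)      # Z'Q (Q'Q)^{-1} Q'Z
    for _ in range(steps):
        fitted = matmul(Z, gamma(theta))
        resid = [[y[0] - fit[0]] for y, fit in zip(Y, fitted)]
        G = jac(theta)                   # d gamma / d theta'
        Gt = transpose(G)
        A = matmul(matmul(Gt, ZPZ), G)
        b = matmul(matmul(Gt, transpose(S)), matmul(Qt, resid))
        delta = solve(A, b)
        theta = [t + d[0] for t, d in zip(theta, delta)]
    return theta

def gamma(theta):
    beta, c = theta
    return [[beta], [c], [beta * c]]

def jac(theta):
    beta, c = theta
    return [[1.0, 0.0], [0.0, 1.0], [c, beta]]

N = 12
grid = [i / (N - 1) for i in range(N)]
Z = [[1.0, t, t ** 2] for t in grid]             # explanatory variables
Q = [[1.0, t, t ** 2, t ** 3] for t in grid]     # instruments (one extra column)
Y = matmul(Z, gamma([0.5, 2.0]))                 # noise-free data from theta = (0.5, 2.0)

theta_hat = nl2sls(Y, Z, Q, gamma, jac, [0.6, 1.8])
print(theta_hat)
```

In this sketch the number of steps is fixed for simplicity; in practice one would stop when the change in the estimate falls below a tolerance, and the starting value matters because the step is only locally convergent.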
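As in Amemiya [1973, 1974], the estimator can be shown to be consistent and asymptotically normal under very general regularity conditions. The asymptotic covariance matrix can be computed as

$$\left[ \frac{\partial \gamma_n'(\theta)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\theta)}{\partial \theta'} \right]^{-1} \left[ \frac{\partial \gamma_n'(\theta)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n'\, \Omega_n(\theta)\, Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\theta)}{\partial \theta'} \right] \left[ \frac{\partial \gamma_n'(\theta)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\theta)}{\partial \theta'} \right]^{-1},$$

where $\Omega_n(\theta)$ is the covariance matrix of $E_n$. The analytical expression for $\Omega_n(\theta)$ involves the evaluation of higher order truncated moments of $y$, the density function $f(y_1 \mid x)$ at $y_1 = 0$ and $y_1 = k$, and the probability $\int_0^k f(y \mid x)\, dy$, as in equations (4.4) and (4.5). A computationally simpler method to estimate the asymptotic covariance matrix is to use the matrix

$$\hat{V}_{n,n} = \left[ \frac{\partial \gamma_n'(\hat{\theta}_n)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\hat{\theta}_n)}{\partial \theta'} \right]^{-1} \left[ \frac{\partial \gamma_n'(\hat{\theta}_n)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} \Big( \sum_{i=1}^{N} \hat{\eta}_{ni}^2\, q_{ni}' q_{ni} \Big) (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\hat{\theta}_n)}{\partial \theta'} \right] \left[ \frac{\partial \gamma_n'(\hat{\theta}_n)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\hat{\theta}_n)}{\partial \theta'} \right]^{-1}, \tag{5.5}$$

where $\hat{\eta}_{ni} = y_i^{n+1} - z_{ni} \gamma_n(\hat{\theta}_n)$, $i = 1, \ldots, N$, are the estimated residuals. Under regularity conditions as in Amemiya [1973], this is a consistent estimate of the covariance matrix (footnote 7).

The above nonlinear two stage method gives a separate estimate of $\theta$ for each $n \in \{2, 3, \ldots\}$ for the doubly truncated Tobit model. In practice, it is enough to use several of the low order moments for estimation, without much loss of the information contained in the higher moments. Suppose we have used the equations in (5.1) for $n = 2, 3, \ldots, m$ and derived the corresponding estimates $\hat{\theta}_n$ for each $n$. It is then desirable to pool the estimators so as to derive a more efficient one. A relatively simple pooling procedure is the following mixed estimation procedure. The set of estimators $\hat{\theta}_n$ can be rearranged as

$$\begin{pmatrix} \hat{\theta}_2 \\ \vdots \\ \hat{\theta}_m \end{pmatrix} = \begin{pmatrix} I \\ \vdots \\ I \end{pmatrix} \theta + \begin{pmatrix} \xi_2 \\ \vdots \\ \xi_m \end{pmatrix}, \tag{5.6}$$

where $\xi_n = \hat{\theta}_n - \theta$. The asymptotic covariance matrix of $\xi_n$ can be estimated by $\hat{V}_{n,n}$ in (5.5). Since the estimators $\hat{\theta}_n$, $n = 2, 3, \ldots, m$, are derived from the same samples, the disturbances $\xi_n$ and $\xi_\ell$, $n \neq \ell$, are correlated. The asymptotic covariance of $\xi_n$ and $\xi_\ell$ can be estimated by $\hat{V}_{n\ell}$, where

$$\hat{V}_{n\ell} = \left[ \frac{\partial \gamma_n'(\hat{\theta}_n)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} Q_n' Z_n \frac{\partial \gamma_n(\hat{\theta}_n)}{\partial \theta'} \right]^{-1} \left[ \frac{\partial \gamma_n'(\hat{\theta}_n)}{\partial \theta} Z_n' Q_n (Q_n' Q_n)^{-1} \Big( \sum_{i=1}^{N} \hat{\eta}_{ni} \hat{\eta}_{\ell i}\, q_{ni}' q_{\ell i} \Big) (Q_\ell' Q_\ell)^{-1} Q_\ell' Z_\ell \frac{\partial \gamma_\ell(\hat{\theta}_\ell)}{\partial \theta'} \right] \left[ \frac{\partial \gamma_\ell'(\hat{\theta}_\ell)}{\partial \theta} Z_\ell' Q_\ell (Q_\ell' Q_\ell)^{-1} Q_\ell' Z_\ell \frac{\partial \gamma_\ell(\hat{\theta}_\ell)}{\partial \theta'} \right]^{-1}. \tag{5.7}$$

Equation (5.6) can then be estimated by the generalized least squares method. Let $\hat{V} = [\hat{V}_{ni}]$ be the $(m-1) \times (m-1)$ block matrix consisting of the matrices $\hat{V}_{ni}$, and let $\hat{V}^{ni}$ be the submatrix in the $(n, i)$th position of the inverse matrix $\hat{V}^{-1}$. The pooled estimator $\tilde{\theta}$ from equation (5.6) is

$$\tilde{\theta} = \Big( \sum_{j=2}^{m} \sum_{i=2}^{m} \hat{V}^{ji} \Big)^{-1} \sum_{j=2}^{m} \sum_{i=2}^{m} \hat{V}^{ji}\, \hat{\theta}_i, \tag{5.8}$$

and its asymptotic covariance matrix can be estimated by $\big( \sum_{j=2}^{m} \sum_{i=2}^{m} \hat{V}^{ji} \big)^{-1}$.

The above estimation method is, of course, not the most efficient method. A more efficient method is to estimate $\theta$ from the set of equations in (5.3) with $n = 2, \ldots, m$ by generalized nonlinear three stage least squares. This method, however, is rather complicated and is not recommended for our general model (footnote 8).

The above estimation method can be greatly simplified if the distribution is assumed to be normal. The normal distribution is a member of the Pearson family of distributions and corresponds to the case $c_1 = c_2 = 0$. For the normal distribution, the recursive relation (3.15) for the singly truncated case becomes

$$y^{n+1} = y^n x\beta + n y^{n-1} c_0 + \eta_{n+1}, \qquad n = 1, 2, \ldots, \tag{5.9}$$

and the recursive relation (4.5)' for the doubly truncated case becomes

$$y^{n+1} - k y^n = (y^n - k y^{n-1})\, x\beta + \big( n y^{n-1} - k (n-1) y^{n-2} \big)\, c_0 + \eta_n, \qquad n = 2, 3, \ldots \tag{5.10}$$

Since the equations in (5.10) are linear in coefficients, the nonlinear two stage least squares method reduces to the two stage least squares method and the computations are simplified. This two stage method generalizes the instrumental variable method in Amemiya [1973]. Since the pooled estimate uses more information, it will be asymptotically more efficient than Amemiya's estimator.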
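The pooling step can be illustrated with a small numerical sketch. It assumes, purely for illustration, a scalar parameter, so that each covariance block is a single number and the generalized least squares weights come from solving a linear system in a vector of ones; the two estimates, the covariance matrix `V` and the helper names `solve_linear` and `pool` are hypothetical.

```python
# Sketch of the pooled (generalized least squares) estimator for a scalar parameter:
#   pooled = (1' V^{-1} 1)^{-1} 1' V^{-1} theta_hat,  variance = (1' V^{-1} 1)^{-1},
# where V is the estimated covariance matrix of the separate estimates.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            M[r] = [vr - fac * vc for vr, vc in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def pool(estimates, V):
    """Return the pooled estimate and its estimated variance."""
    w = solve_linear(V, [1.0] * len(estimates))   # V^{-1} 1 (V is symmetric)
    total = sum(w)                                # sum of all elements of V^{-1}
    pooled = sum(wi * ti for wi, ti in zip(w, estimates)) / total
    return pooled, 1.0 / total

# Two hypothetical estimates of the same parameter with correlated errors.
estimates = [1.0, 2.0]
V = [[1.0, 0.5],
     [0.5, 4.0]]

pooled, variance = pool(estimates, V)
print(pooled, variance)
```

With these illustrative numbers the pooled estimate leans toward the more precise first estimate, and its estimated variance is below that of either separate estimate.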
Since the instrumental variable estimate can be easily derived for the normal distribution, it may also be useful as an initial estimate to start the iteration for the estimation of the general model. As in the Amemiya procedure, our procedure does not impose an inequality constraint on the estimate of the variance $c_0$ in (5.10). Hence it is possible to obtain a negative estimate of the variance $c_0$ when the sample size is not sufficiently large (footnote 9). A similar problem may occur in the estimation of the general model with the Pearson family of distributions, for which the variance is $c_0 / (1 - 3 c_2)$. Since in many empirical studies the main interest is in the vector of coefficients $\beta$, the unconstrained estimation method will suffice for this purpose.

6. Conclusions

In this article, we have considered the estimation of Tobit models in which the distribution of the disturbances belongs to the Pearson family of distributions. Models with single truncation and models with double truncation are considered. The estimation method we have proposed is a nonlinear two stage least squares method. From the differential equation that characterizes the Pearson family of distributions, we have derived some simple recursive relations between the moments of the truncated distributions. The recursive relations provide structural equations which are linear in variables but nonlinear in coefficients. The nonlinear two stage least squares method is applied to estimate those equations.

Footnotes

(*) The author is an associate professor of economics, University of Minnesota, and a visiting associate professor at the Center for Econometrics and Decision Sciences, University of Florida. Financial support from the National Science Foundation under grant SES 8006481 to the University of Minnesota is gratefully acknowledged.
A preliminary version of this article was presented at the CEME Conference on Qualitative Decision Theory and Discrete Data Analysis at Harvard University on April 2-4, 1981. I appreciate the valuable comments from Dale Poirier and the conference participants. Any errors are my own.

1. The truncation threshold is specified as zero. A more general case is that $y_{1i}$ is observed if $y_{1i} > c$, where $c$ is a known constant; otherwise $y_i = c$. This case can be transformed into our specification by rewriting equation (2.1) as $y_{1i} - c = x_i\beta - c + u_i$ and regarding $y_{1i} - c$ as the dependent variable. If the truncation threshold $c_i$ varies with $i$, the regression equation $y_{1i} - c_i = x_i\beta - c_i + u_i$ will have a known coefficient on the exogenous variable $c_i$ on the right hand side. For this case, the estimation methods proposed below need to be modified slightly to take this constraint into account.

2. See, e.g., Elderton and Johnson [1969], p. 38.

3. The Type IV distribution is one of the main types and corresponds to the case in which the solutions of $c_0 - c_1 u + c_2 u^2 = 0$ are complex.

4. Some of these related references were pointed out to me by Dale Poirier.

5. The operator $\otimes$ denotes the Kronecker product. It should be noted that the relation $\frac{\partial}{\partial \beta'} (\beta \otimes \beta) = (I_K \otimes \beta) + (\beta \otimes I_K)$, where $I_K$ is a $k \times k$ identity matrix, is useful for the estimation procedure proposed below.

6. The distribution of the disturbances in the model of Rosett and Nelson is assumed to be normal. They call it the two limit probit model.

7. A more formal statement is that $N \hat{V}_{n,n}$ is a consistent estimate of the covariance matrix of the limiting distribution of $\sqrt{N} (\hat{\theta}_n - \theta)$.

8. The computational burden in this procedure lies in the evaluation of the density function and the probability function. For some specific members of the Pearson family, such as the normal distribution, the computation of the density and probability functions is rather simple and generalized nonlinear three stage least squares is attractive.

9. Another reason may be misspecification of the model.

REFERENCES

Amemiya, T. (1973), "Regression Analysis When the Dependent Variable is Truncated Normal," Econometrica 41, pp. 997-1016.

Amemiya, T. (1974), "The Nonlinear Two-Stage Least-Squares Estimator," Journal of Econometrics 2, pp. 105-110.

Cohen, A. C., Jr. (1951), "Estimation of Parameters in Truncated Pearson Frequency Distributions," Annals of Mathematical Statistics 22, pp. 256-265.

Cohen, A. C., Jr. (1953), "Estimating Parameters in Truncated Pearson Frequency Distributions Without Resort to Higher Moments," Biometrika 40, pp. 50-57.

Elderton, W. P. and N. L. Johnson (1969), Systems of Frequency Curves, Cambridge University Press.

Goldberger, A. S. (1980), "Abnormal Selection Bias," Discussion Paper No. 8006, Social Systems Research Institute, University of Wisconsin, Madison.

Hurd, M. (1979), "Estimation in Truncated Samples when There is Heteroscedasticity," Journal of Econometrics 11, pp. 247-258.

Johnson, N. L. and S. Kotz (1970), Continuous Univariate Distributions - 1, Houghton Mifflin Co., Boston.

Lee, L. F. (1981), "A Specification Test for Normality Assumption for the Truncated and Censored Tobit Models," Manuscript, University of Florida.

Nelson, F. D. (1979), "The Effect of and A Test of Misspecification in the Censored-Normal Model," Social Science Working Paper 291, California Institute of Technology, forthcoming in Econometrica.

Rosett, R. N. and F. D. Nelson (1975), "Estimation of the Two Limit Probit Regression Model," Econometrica 43, pp. 141-146.

Tobin, J. (1958), "Estimation of Relationships for Limited Dependent Variables," Econometrica 26, pp. 24-36.