ISSN: 2341-2356 
Collection website: http://www.ucm.es/fundamentos-analisis-economico2/documentos-de-trabajo-del-icae
Working papers are in draft form and are distributed for discussion. They may not be reproduced without permission of the author/s.

 
Instituto Complutense de Análisis Económico

A New Inequality Measure that is Sensitive to Extreme Values and Asymmetries

Michael McAleer  
Department of Quantitative Finance, National Tsing Hua University, Taiwan;
Discipline of Business Analytics, University of Sydney Business School, Australia;
Econometric Institute, Erasmus School of Economics, Erasmus University Rotterdam, The Netherlands;
Department of Quantitative Economics, Complutense University of Madrid, Spain;
Institute of Advanced Sciences, Yokohama National University, Japan

Hang K. Ryu  
Department of Economics, Chung Ang University, Seoul, Korea  

Daniel J. Slottje  
Department of Economics, SMU, Dallas 

 
Abstract 
There is a vast literature on the selection of an appropriate index of income inequality and on what 

desirable properties such a measure (or index) should contain. The Gini index is, of course, the most 

popular. There is a concurrent literature on the use of hypothetical statistical distributions to 

approximate and describe an observed distribution of incomes. Pareto and others observed early on 

that incomes tend to be heavily right-tailed in their distribution. These asymmetries led to 

approximating the observed income distributions with extreme value hypothetical statistical 

distributions, such as the Pareto distribution. But these income distribution functions (IDFs) continue 

to be described with a single index (such as the Gini) that poorly detects the extreme values present in 

the underlying empirical IDF. This paper introduces a new inequality measure to supplement, but not 

to replace, the Gini that measures more accurately the inherent asymmetries and extreme values that 

are present in observed income distributions. The new measure is based on a third-order term of a 

Legendre polynomial from the logarithm of a share function (or Lorenz curve). We advocate using 

the two measures together to provide a better description of inequality inherent in empirical income 

distributions with extreme values. 

Keywords       Inequality Index, Extreme value distributions, Maximum entropy method, 
Orthonormal basis, Legendre polynomials. 

JEL Classification            D31, D63 

 
Universidad Complutense Madrid

Working Paper No. 1725
October 2017

 
A New Inequality Measure that is Sensitive  
to Extreme Values and Asymmetries1 

 
Michael McAleer2, Hang K. Ryu3 and Daniel J. Slottje4 

 
 

1 This research was supported by the National Research Foundation of Korea (2017S1A3A2066657), National 
Science Council, Ministry of Science and Technology (MOST), Taiwan, and the Australian Research Council. 
2 Department of Quantitative Finance, National Tsing Hua University, Taiwan; Discipline of Business Analytics, 
University of Sydney Business School, Australia; Econometric Institute, Erasmus School of Economics, 
Erasmus University Rotterdam, The Netherlands; Department of Quantitative Economics, Complutense 
University of Madrid, Spain; Institute of Advanced Sciences, Yokohama National University, Japan. 
Email: michael.mcaleer@gmail.com  
3 Department of Economics, Chung Ang University, Seoul, Korea, 156-756, Tel.: +82-11-253-6500;  

Email: hangryu@cau.ac.kr    
4 Department of Economics, SMU, Dallas, TX 75275, Tel: 214-732-9170,  

Email: dan.slottje@fticonsulting.com  

                                           

 
I. Introduction 

 
Income inequality research has experienced a resurgence after losing some 

momentum in the late 1990s and the first decade of the twenty-first century.

Piketty (1995, 2014) and Boushey et al. (2017) reignited some interest in the 

field; Piketty did so with his 2014 tome on “polarization.” There is a vast 

literature on the measurement of income inequality, cf. Cowell (2011) for an 

excellent bibliography of much of this work. This literature contains hundreds 

of papers on an appropriate index of income inequality and on what desirable 

properties such a measure (or index) should possess. We present and review 

some of this discussion below.   

There is also a concurrent literature on the use of hypothetical statistical 

distributions to approximate and describe an observed distribution of incomes.  

Pareto (1896) and others observed early on that incomes tend to be heavily 

right-tailed in their distribution. These asymmetries led researchers to

approximate the observed income distributions with extreme value

hypothetical statistical distributions, such as the Pareto distribution. Statisticians 

have done considerable work on extreme value distributions in other 

applications. The generalized extreme value distribution (GEV) and its family 

members, including the Weibull, Gumbel, Fréchet and others, have been

extensively explored by statisticians and inequality researchers alike (cf. Coles 

(2001) and Cowell and Flachaire (2007)). James McDonald has been a leading 

researcher in the area of functional forms of hypothetical statistical distributions 

to describe IDFs for a long time (cf. McDonald (1984), McDonald et al. (2013) 

and Slottje (1987)).  

Interestingly, even with the recognition of the fact that incomes are 

distributed with asymmetric higher moments, inequality indices constructed to 

capture the level of inequality inherent in these observed income distributions 


(with a single number) are generally based on the mean and variance of the 

observed data. Cowell and Flachaire (2002, 2007) appear to be the only works to

discuss the two concepts (that is, extreme values in the IDF and detecting them with

an inequality index) in the same place. They do not introduce a new index or

measure to deal with the issue, but note that the two most popular classes of 

measures, the Gini and Entropy-based measures, have different sensitivities to 

the problem in their first paper (cf. Cowell and Flachaire (2002)).   

In their second paper, the authors are primarily concerned about how 

sensitive commonly used inequality measures are to extreme values in the 

underlying distributions, and suggest some semi-parametric specifications of the 

commonly used measures to account for the extreme values (cf. Cowell and 

Flachaire (2007)). The Gini coefficient and Theil’s entropy measure (frequently 

generalized) are two very popular inequality indices, among others, that have 

not always performed well in describing some of the tail behavior in observed 

income distributions. Specifically, both measures fall short in detecting changes 

in various groups’ shares (cf. Ryu (2013) and Ryu and Slottje (2017)).5

Another way to approach the problem is to realize that there are many 

income distribution functions which will produce the same value of a Gini 

coefficient. The overall shape of the income share function may be well 

described by the Gini coefficient (or by Theil’s entropy measure), but the 

poorest group’s share and the precise details of the richest group’s share 

generally are not described well by these measures. In this paper, a second 

inequality measure is introduced and added to the Gini coefficient to describe 

movements of the extreme values and asymmetries of observed income 

distributions as they change over time. 

5 See Maasoumi (1986, 1989) for excellent work on the generalized entropy class of 
measures. 

                                           

 
In the next section we discuss desirable properties an inequality measure should 

possess. In Sections 3 and 4 we introduce the new measure, which is based on

the expansion of the logarithm of the share function (or Lorenz curve) with a 

Legendre polynomial expansion. Section 5 of the paper discusses an application 

by fitting the new measure to CPS data. Section 6 concludes the paper. 

 
II. Desirable Properties of an Income Inequality Index, I(y)6 

 
There is significant consensus among inequality researchers that any income 

inequality index, I(y), should possess statistical properties that allow it to 

reasonably describe the inequality inherent in an observed IDF. Given the 

inherent difficulty in describing the characteristics of an entire IDF with one 

number, the following properties are desirable: 

• Anonymity or symmetry 

The inequality measure should not depend on how individuals in an 

observed distribution are labeled.  In other words, it does not matter who

receives the income, all that matters is the distribution of income. This is 

generally expressed mathematically as: 

 
$$I(P(y)) = I(y) \tag{1}$$

 
where P(y) is any permutation of income y; 

6 This list is a collection whose individual properties are discussed in many places, including 
Cowell (2011), Ryu and Slottje (1998), Basmann and Slottje (1987), and Basmann, Hayes 
and Slottje (1991), among others.  

                                           

 
• Scale independence or homogeneity 

As Cowell (2011, p. 63) notes, the measured inequality of the slices of the 

cake should not depend on the size of the cake.  This property says that if 

(say) every person’s income in an economy is multiplied by the same positive

constant, then the overall metric of inequality should not change.  This 

may be stated as: 

 
$$I(ay) = I(y) \tag{2}$$

 
where a is a positive real number. 

• Population independence 

Similarly, the inequality measure should be independent of the level of 

population.  Cowell (2011, p. 63) notes the inequality of the cake 

distribution should not depend on the number of cake-receivers.  This is 

generally written as: 

 
$$I(y \cup y) = I(y) \tag{3}$$


where $y \cup y$ is the union of $y$ with itself.

• Transfer principle 

The Pigou–Dalton, or transfer principle, states, in its weak form, that if 

income is transferred from a rich person to a poor person, while still 

preserving the order of income ranks, then the inequality measurement 

should not increase. In its strong form, the transfer principle says the 

measured level of inequality should decrease.  As will be shown below 


 
in our paper, our new second measure satisfies this condition if it is 

considered together with the Gini coefficient (see the Appendix for 

proof). 

• Non-negativity 

The inequality index I(y) must be greater than or equal to zero. 

• Egalitarian zero 

The index I(y) is zero when everyone has the same income, meaning 

when all values $y_i$ are equal.

• Bounded above by maximum inequality 

The index I(y) attains its maximum value of one, reflecting the maximum 

level of inequality (all $y_i$ are zero except one).

 
In the discussion to follow, we introduce a new measure that will be shown to 

satisfy these properties. 
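
The properties listed above can be checked numerically for the Gini index itself. The following is a minimal sketch, not part of the original analysis: the function name `gini`, the rank-based formula chosen to compute it, and the income vector are all illustrative assumptions.

```python
def gini(incomes):
    """Gini coefficient of a list of nonnegative incomes (rank-based formula)."""
    y = sorted(incomes)
    n = len(y)
    total = sum(y)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum_i(i * y_(i)) / (n * sum(y)) - (n + 1) / n, with ranks i = 1..n
    ranked = sum((i + 1) * v for i, v in enumerate(y))
    return 2.0 * ranked / (n * total) - (n + 1.0) / n


y = [10.0, 20.0, 30.0, 40.0, 100.0]   # hypothetical incomes
base = gini(y)

checks = {
    "anonymity": abs(gini(y[::-1]) - base) < 1e-12,
    "scale independence": abs(gini([3.0 * v for v in y]) - base) < 1e-12,
    "population independence": abs(gini(y + y) - base) < 1e-12,
    # move 5 units from the richest to the poorest; income ranks are preserved
    "transfer principle": gini([15.0, 20.0, 30.0, 40.0, 95.0]) < base,
    "non-negativity": base >= 0.0,
    "egalitarian zero": abs(gini([7.0] * 5)) < 1e-12,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

With a finite number n of individuals, this rank-based formula reaches at most (n - 1)/n when one person holds all income, so the upper bound of one is approached only as n grows.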

 
III. New Measure of Inequality that Supplements the Gini Coefficient 
 

Given our objective to find a new income inequality measure which is sensitive 

to extreme values, we propose to describe the income distribution with two 

summary measures rather than a single measure. The Gini coefficient, Theil’s 

entropy measure, and other well-known measures are useful in describing the 

overall state of income inequality, but these measures do not provide precise 

information about the presence of extreme values in an underlying IDF, or in 

how changes in the extreme values over time affect the level of inequality as

reflected in the summary index over time.   



 
In this paper, we conceptualize a complete set of distributions all having 

the same Gini value. A function derived using only the Gini coefficient will be 

called the basic model in the paper. This basic model is known to be imprecise 

in describing the presence of extreme values. A second inequality measure will 

supplement the Gini, and is designed to describe the movements of the poorest 

group’s income share and the extreme values of the richest income group. 

The choice of the second inequality measure is extremely important. The 

basic model can be derived using the first inequality measure, such as the Gini 

coefficient, Theil’s entropy measure, and others. The basic model used in this 

paper is the Gini coefficient-based model. When the second inequality measure 

is added, it is desirable to derive the functional form corresponding to this 

second measure and to add this part to the basic model. In the applications 

section, the income distribution of the basic model and the distribution of the 

extended model will be compared. 

To introduce the second inequality measure, two functional forms are 

considered in this paper. The first functional form is the expansion of the 

logarithm of the share function in terms of the Legendre polynomial series. The 

second functional form is the expansion of the Lorenz curve in terms of the 

Legendre polynomial series. For the first functional form, the parameter of the 

first order polynomial term can be derived from the Gini coefficient, and the 

parameter of the third order polynomial term will be used as the second 

inequality measure. Note that the second-order term of the Legendre polynomial 

series is a symmetric function, so that it cannot be used in describing the 

monotonic increasing function. Both forms will be explained below. 

For the second functional form where the Lorenz curve is expanded in 

Legendre polynomials, the parameter of the zero-th Legendre polynomial term 

corresponds to the Gini coefficient, and the parameter of the first Legendre 

polynomial term can be used as the second inequality measure.  



 
3.1 Orthonormal basis expansion of the logarithm of income share function  

For the given income observations, there are many ways to approximate the 

functional form of the data generating model. If an orthonormal basis (ONB) 

expansion is applied, the parameter calculation is unaffected by the size of the 

series. In comparison, the estimated parameters of the ordinary least squares 

regression method change their values when a new term is added in the 

regression series.  

The addition of higher-order terms in the series will allow the 

approximated function to converge to the data generating model. These 

functions with different series lengths form a complete set of income 

distributions corresponding to the basic model derived from the Gini coefficient. 

Orthonormal basis expansion allows us to superpose new terms on the basic 

model without disturbing the basic model.  

Suppose we have a continuous share function $s(z)$ for $0 \le z \le 1$, where
the poorest person is located at $z = 0$ and the richest at $z = 1$. We can
approximate the logarithm of the share function with a sequence of orthonormal
functions, $P_0(z), P_1(z), P_2(z), P_3(z), \ldots$. Arfken (1985) presents an explanation of
the ONB method:

$$\log s_N(z) = \sum_{n=0}^{N} a_n P_n(z) \tag{4}$$

 
An orthonormal sequence satisfies: 

 
$$\int_Z P_n(z) P_m(z)\,dz = \delta_{nm}, \qquad n, m = 0, 1, 2, \ldots \tag{5}$$

where $\delta_{nm} = 1$ if $n = m$ and zero otherwise. The parameters of (4) can be found
with:

$$a_m = \int P_m(z) \log s_N(z)\,dz = \int P_m(z) \left[ \sum_{n=0}^{N} a_n P_n(z) \right] dz \tag{6}$$

 
(see Ryu (1993) for the continuous version of ONB, and Ryu and Slottje (1996) 

and Milne (1949) for a discussion of the discrete version of ONB). The 

orthogonal sequence $\{P_n\}$ in the space $L^2(Z)$ is called complete if there is no
element $f \neq 0$ of $L^2(Z)$ which is orthogonal to all the elements of $\{P_n\}$. If:

$$\int_Z f(z) P_n(z)\,dz = 0 \quad \text{for } n = 0, 1, 2, \ldots \tag{7}$$

it follows that $f(z) = 0$ for almost all $z \in Z$.

Suppose the Legendre polynomials are used for $0 \le z \le 1$:

 
$$
\begin{aligned}
P_0(z) &= 1 \\
P_1(z) &= \sqrt{3}\,(2z - 1) \\
P_2(z) &= \sqrt{5}\,(6z^2 - 6z + 1) \\
P_3(z) &= \sqrt{7}\,(20z^3 - 30z^2 + 12z - 1) \\
P_4(z) &= \sqrt{9}\,(70z^4 - 140z^3 + 90z^2 - 20z + 1) \\
P_5(z) &= \sqrt{11}\,(252z^5 - 630z^4 + 560z^3 - 210z^2 + 30z - 1)
\end{aligned}
\tag{8}
$$

 
Fig.1 shows $P_0(z)$ is flat and $P_1(z)$ is a linear function but $P_n(z)$ has $n - 1$ peak
values. To approximate the logarithm of the share function, the Legendre
polynomials with degrees of even numbers seem to be less useful because they
have peak values at $z = 0$. Those functions with degrees of odd numbers will be
useful as they have their lowest values at $z = 0$ and their largest values at $z = 1$.
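
The polynomials in (8) and the orthonormality condition (5) can be verified numerically. This is a minimal sketch under stated assumptions: the helper names `P` and `integrate`, and the use of composite Simpson's rule for the integrals, are illustrative choices and do not come from the paper.

```python
import math

# Coefficients of the polynomials in (8), highest power first,
# before multiplying by the normalizing factor sqrt(2n + 1).
COEFFS = [
    [1],
    [2, -1],
    [6, -6, 1],
    [20, -30, 12, -1],
    [70, -140, 90, -20, 1],
    [252, -630, 560, -210, 30, -1],
]


def P(n, z):
    """Orthonormal Legendre polynomial P_n(z) on [0, 1], as in (8)."""
    value = 0.0
    for c in COEFFS[n]:          # Horner's rule
        value = value * z + c
    return math.sqrt(2 * n + 1) * value


def integrate(f, steps=2000):
    """Composite Simpson's rule on [0, 1]; steps must be even."""
    h = 1.0 / steps
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


# Gram matrix of inner products; it should be close to the identity matrix.
gram = [[integrate(lambda z: P(n, z) * P(m, z)) for m in range(6)] for n in range(6)]
max_error = max(
    abs(gram[n][m] - (1.0 if n == m else 0.0)) for n in range(6) for m in range(6)
)
print(f"largest deviation from the identity: {max_error:.2e}")

# Odd-degree polynomials run from their lowest value at z = 0 to their largest at z = 1.
print(P(3, 0.0), P(3, 1.0))
```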

Consider the following basic model, which can be derived from the 

given Gini coefficient: 

 
$$\log s_{\mathrm{Gini}}(z) = a_0 + a_1 P_1(z) \quad \text{or} \quad s_{\mathrm{Gini}}(z) = \exp[a_0 + a_1 P_1(z)] \tag{9}$$

 
Yitzhaki (2013) has shown that knowledge of the Gini coefficient is equivalent 

to knowledge of the first moment of the share function. To find the parameters 

of (9) from the Gini coefficient, consider: 

 
$$a_0 + a_1 P_1(z) = a_0 + \sqrt{3}\,a_1 (2z - 1) = a_0 - \sqrt{3}\,a_1 + 2\sqrt{3}\,a_1 z = A + Bz \tag{10}$$


$$
\begin{aligned}
\mu_1 &= \int z\, s(z)\,dz = \int z \exp[A + Bz]\,dz \\
&= \frac{B}{e^{B} - 1} \int z \exp[Bz]\,dz = \frac{1 + \mathrm{Gini}}{2}
\end{aligned}
\tag{11}
$$

where the parameter $A$ is removed with normalization of the share function.
Knowledge of the Gini allows us to find $B$, $a_0$ and $a_1$ of (10). Therefore, the
basic model is derived from the given Gini coefficient.
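
To make the step from the Gini coefficient to the parameters of (10) concrete, the sketch below solves (11) for B by bisection. It is an illustration under assumptions that are not in the paper: the share function is normalized to integrate to one over [0, 1], the closed form 1/(1 - exp(-B)) - 1/B is used for the first moment of that normalized function, and the names `first_moment`, `solve_B` and `basic_model` are hypothetical.

```python
import math


def first_moment(B):
    """mu_1 in (11) for s(z) = B * exp(B z) / (exp(B) - 1)."""
    if abs(B) < 1e-4:
        return 0.5 + B / 12.0            # Taylor expansion near B = 0
    return 1.0 / (-math.expm1(-B)) - 1.0 / B


def solve_B(gini, tol=1e-12):
    """Bisection for B with mu_1(B) = (1 + Gini) / 2, for 0 <= Gini < 0.99."""
    target = (1.0 + gini) / 2.0
    lo, hi = 0.0, 200.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_moment(mid) < target:   # mu_1 is increasing in B
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def basic_model(gini):
    """Return (a0, a1, A, B) of (9)-(10); requires Gini > 0 so that B > 0."""
    B = solve_B(gini)
    A = math.log(B) - math.log(math.expm1(B))   # so that exp(A + Bz) integrates to one
    a1 = B / (2.0 * math.sqrt(3.0))             # from B = 2 * sqrt(3) * a1
    a0 = A + math.sqrt(3.0) * a1                # from A = a0 - sqrt(3) * a1
    return a0, a1, A, B


a0, a1, A, B = basic_model(0.5)
print(f"B = {B:.4f}, a0 = {a0:.4f}, a1 = {a1:.4f}")
```

The values of a0 obtained this way depend on the normalization; data expressed as centile shares that sum to one would shift a0 by a constant without changing B or a1.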

 
Fig.1 Plots of Legendre Polynomials ($P_0$ to $P_5$ plotted against $z$ on [0, 1])

 
To consider the extreme values at the fat right tail of the share function, 

the following extended functional forms can be applied: 

 
Basic model: $$\log s_{\mathrm{Gini}}(z) = a_0 + a_1 P_1(z) \tag{12}$$

Second order: $$\log s_2(z) = a_0 + a_1 P_1(z) + a_2 P_2(z) \tag{13}$$

Third order: $$\log s_3(z) = a_0 + a_1 P_1(z) + a_2 P_2(z) + a_3 P_3(z) \tag{14}$$

Fourth order: $$\log s_4(z) = a_0 + a_1 P_1(z) + a_2 P_2(z) + a_3 P_3(z) + a_4 P_4(z) \tag{15}$$

Fifth order: $$\log s_5(z) = a_0 + a_1 P_1(z) + a_2 P_2(z) + a_3 P_3(z) + a_4 P_4(z) + a_5 P_5(z) \tag{16}$$

 
The parameters can be found with: 

 
$$a_m = \int P_m(z) \log s_N(z)\,dz \tag{17}$$

 
The parameter values calculated by (17) do not depend on the length of the 

series. For example, the $a_2$ parameters of (13), (14), (15), and (16) are the same.

This is the benefit of the orthonormal function expansion. In comparison, the 

parameters estimated using a least squares method will fluctuate when we 

increase the length of series. Therefore, we can superpose another function 

derived with the additional parameter to the basic Gini model without damaging 

the basic model.   
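
The contrast between the two parameterizations can be illustrated numerically. In the sketch below, each $a_m$ is computed from (17) as a separate integral, so it cannot change when further terms are added. The same best least-squares polynomial written in powers of $z$, by contrast, has a different coefficient on $z$ once the quadratic term enters. The log share function used here is hypothetical (it corresponds to the Lorenz curve z exp[c(z - 1)], chosen only because it is smooth), and the helper names are assumptions.

```python
import math

COEFFS = [[1], [2, -1], [6, -6, 1], [20, -30, 12, -1]]   # from (8), highest power first


def P(n, z):
    """Orthonormal Legendre polynomial P_n(z) on [0, 1]."""
    value = 0.0
    for c in COEFFS[n]:
        value = value * z + c
    return math.sqrt(2 * n + 1) * value


def integrate(f, steps=2000):
    """Composite Simpson's rule on [0, 1]; steps must be even."""
    h = 1.0 / steps
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0


def log_share(z, c=1.5):
    """Hypothetical log share function: s(z) = (1 + c z) exp(c (z - 1))."""
    return math.log(1.0 + c * z) + c * (z - 1.0)


# Each coefficient is its own projection integral, as in (17).
a = [integrate(lambda z, m=m: P(m, z) * log_share(z)) for m in range(4)]

# Coefficient on z of the best least-squares polynomial in powers of z:
# degree 1 uses a0 + a1*sqrt(3)*(2z - 1); degree 2 adds a2*sqrt(5)*(6z^2 - 6z + 1).
slope_in_linear_fit = 2.0 * math.sqrt(3.0) * a[1]
slope_in_quadratic_fit = slope_in_linear_fit - 6.0 * math.sqrt(5.0) * a[2]

print("a_1 in either fit:", a[1])
print("coefficient on z, degree 1:", slope_in_linear_fit)
print("coefficient on z, degree 2:", slope_in_quadratic_fit)
```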

We have assumed knowledge of a continuous function $s(z)$ and
expanded the logarithmic transformation with an orthonormal basis (4), so that
the parameters were found with (6) using the orthogonality of the Legendre
functions. As an alternative method, suppose we do not know the functional
form of the underlying share function $s(z)$. If nothing is known, the share

function can be assumed to be a flat function. Suppose the moments of the share 

function are known, as follows: 

 
$$\mu_m = \int z^m s(z)\,dz \quad \text{for } m = 0, 1, 2, \ldots, N \tag{18}$$

Then the following moments can be calculated based on (8):

$$\lambda_m = \int P_m(z) s(z)\,dz \quad \text{for } m = 0, 1, 2, \ldots, N \tag{19}$$

Zellner and Highfield (1988) and Ryu (1993) solved an entropy maximization
problem:

$$\max_{s} W = -\int s(z) \log s(z)\,dz \tag{20}$$

satisfying:

$$\lambda_m = \int P_m(z) s(z)\,dz \quad \text{for } m = 0, 1, 2, \ldots, N \tag{19}$$

 
Then:  


$$s(z) = \exp\left[ \sum_{n=0}^{N} c_n P_n(z) \right] \quad \text{satisfying} \quad \lambda_m = \int P_m(z) s(z)\,dz \quad \text{for } m = 0, 1, 2, \ldots, N \tag{21}$$

If the Gini coefficient is known, this is equivalent to knowledge of $\lambda_0$ and $\lambda_1$,
and so we have:

$$s(z) = \exp[c_0 P_0(z) + c_1 P_1(z)] \tag{22}$$

 
which is equivalent to (12). The parameters of (22) can be determined from the 

given Gini coefficient, as derived in Ryu and Slottje (2017b). Two alternative
methods to approximate the share function have now been explained. The first method
assumes knowledge of the continuous $s(z)$, which is expanded with a Legendre
series. The second method does not assume the functional form of $s(z)$ but
maximizes entropy subject to known values of moments. The derived functional
forms are the same, but the parameter calculation methods are different.

As we add more terms to the series, the approximated function $\log s_N(z)$
approaches $\log s(z)$:

$$\int [\log s_N(z)]^2\,dz = \int \left[ \sum_{n=0}^{N} a_n P_n(z) \right]^2 dz = a_0^2 + a_1^2 + a_2^2 + \cdots + a_N^2 \tag{23}$$

 
Using 2016 CPS data (which will be discussed below in detail), we have: 


$$a_0^2 = 27.921, \quad a_1^2 = 1.190, \quad a_2^2 = 0.0376, \quad a_3^2 = 0.1340, \quad a_4^2 = 0.0146, \quad a_5^2 = 0.0740 \tag{24}$$

where $a_0$ is used for normalization and $a_1$ is the slope term corresponding to
the Gini coefficient. If we have to choose a term in addition to the basic model,
then we can choose the term with the largest squared parameter value. In our
case, $a_3^2$ has the largest value among the remaining terms.
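
The selection rule can be written out with the squared parameters reported in (24). This is a minimal sketch; the dictionary layout and variable names are illustrative, and only the numbers come from (24).

```python
# Squared ONB parameters from (24), keyed by polynomial order.
squared = {0: 27.921, 1: 1.190, 2: 0.0376, 3: 0.1340, 4: 0.0146, 5: 0.0740}

# Orders 0 and 1 belong to the basic (Gini-based) model, so only higher orders compete.
candidates = {n: v for n, v in squared.items() if n >= 2}
best = max(candidates, key=candidates.get)

# Share of each candidate in the squared norm left over after the basic model, by (23).
remaining = sum(candidates.values())
shares = {n: v / remaining for n, v in candidates.items()}

print("selected order:", best)
for n in sorted(shares):
    print(f"order {n}: {shares[n]:.3f} of the remaining squared norm")
```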

Now suppose we wish to introduce a second inequality measure as a 

supplement to the Gini coefficient. There are a few choices suitable for this 

purpose. Consider the following: 

 
Typical model: $$\log sh_N(z) = a_0 + a_1 P_1(z) + a_N P_N(z) \tag{25}$$

Basic model: $$\log s_{\mathrm{Gini}}(z) = a_0 + a_1 P_1(z) \tag{12}$$

Second order model: $$\log s_2(z) = a_0 + a_1 P_1(z) + a_2 P_2(z) \tag{13}$$

Third order model: $$\log sh_3(z) = a_0 + a_1 P_1(z) + a_3 P_3(z) \tag{26}$$

Fourth order model: $$\log sh_4(z) = a_0 + a_1 P_1(z) + a_4 P_4(z) \tag{27}$$

Fifth order model: $$\log sh_5(z) = a_0 + a_1 P_1(z) + a_5 P_5(z) \tag{28}$$

 
An approximated share function with the additional third-order term will be a
monotonic increasing function if its slope is nonnegative for the given values of
positive $a_1$ and $a_3$:

$$\frac{\partial \log sh_3(z)}{\partial z} = \frac{\partial\,[a_0 + a_1 P_1(z) + a_3 P_3(z)]}{\partial z} = 2\sqrt{3}\,a_1 + \sqrt{7}\,a_3 (60z^2 - 60z + 12) > 0 \tag{29}$$

If a monotonicity test is passed for (26), then the third-order parameter $a_3$ can
be used as the second inequality measure. A similar monotonicity test can be
performed for (28):

$$\frac{\partial \log sh_5(z)}{\partial z} = \frac{\partial\,[a_0 + a_1 P_1(z) + a_5 P_5(z)]}{\partial z} > 0 \tag{30}$$

 
IV. Lorenz dominance and expansion of the basic model 

Another way to understand the intuition behind our new measure is to think 

about it in terms of Lorenz dominance.  There are many Lorenz curves which 

can generate the same Gini coefficient. If we expand the Lorenz curve with a 

Legendre polynomial series, the zero-th order parameter can be determined 

from the Gini coefficient. The basic model will be the second-order Legendre 

polynomial series with three parameters, which can be determined from two 

boundary conditions, $L(z = 0) = 0$ and $L(z = 1) = 1$, and the Gini coefficient.

Inclusion of higher-order Legendre functions will modify the basic Lorenz 

curve, but all these Lorenz functions will have the same Gini coefficient due to 

the orthogonality of the Legendre series. A related discussion can be found in 

Choo and Ryu (1994). 

Suppose the Lorenz curve can be expanded through Legendre functions: 

 
$$L_N(z) = \sum_{n=0}^{N} b_n P_n(z) \tag{31}$$

The parameters can be found from the following relation:

$$b_m = \int P_m(z) L_N(z)\,dz = \int P_m(z) \left[ \sum_{n=0}^{N} b_n P_n(z) \right] dz \tag{32}$$

  
The Gini coefficient determines the zero-th order parameter: 

 
$$\frac{1 - \mathrm{Gini}}{2} = \int_0^1 L(z)\,dz \cong \int_0^1 L_N(z)\,dz = b_0 \tag{33}$$

Notice the above relation does not depend on the size of the series $N$ and all
$L_N(z)$ will share the same Gini coefficient. The Lorenz curve should satisfy two
boundary conditions:

$$L_N(z = 0) = 0 \quad \text{and} \quad L_N(z = 1) = 1 \tag{34}$$

Now using:

$$P_n(z = 0) = (-1)^n \sqrt{2n + 1} \quad \text{and} \quad P_n(z = 1) = \sqrt{2n + 1} \tag{35}$$

 
the second-order polynomial series, which we label as the basic model, is given 


as follows:

$$L_2(z) = b_0 P_0(z) + b_1 P_1(z) + b_2 P_2(z) \tag{36}$$

Suppose the Gini coefficient is known, that is, $b_0$ is known. Using the
boundary conditions, $L_2(z = 0) = 0$ and $L_2(z = 1) = 1$, the parameters $b_1$ and $b_2$
can be calculated for the given Gini coefficient:

$$L_2(z) = \frac{1 - \mathrm{Gini}}{2} + \frac{1}{2\sqrt{3}} P_1(z) + \frac{\mathrm{Gini}}{2\sqrt{5}} P_2(z) = 3\,\mathrm{Gini}\, z^2 + (1 - 3\,\mathrm{Gini})\, z \tag{37}$$

 
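This function is a nonnegative convex function if Gini $\le 1/3$. Convexity requires
$\partial^2 L_2(z) / \partial z^2 = 6\,\mathrm{Gini} \ge 0$ for all $z$, which holds for any nonnegative Gini,
while nonnegativity requires the slope at $z = 0$, which is $1 - 3\,\mathrm{Gini}$, to be nonnegative.

(i) If the Gini coefficient is greater than 1/3, (37) becomes negative near $z = 0$
and so is not a valid Lorenz curve.

(ii) If the Gini coefficient is zero, $L(z) = z$;

(iii) If the Gini coefficient is 1/3, then $L(z) = z^2$.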
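
The behaviour of the basic Lorenz model can be checked directly. The sketch below evaluates (37), recovers the Gini coefficient from it through the relation in (33), and shows the curve dipping below zero near $z = 0$ once the Gini exceeds 1/3. The function names and the use of Simpson's rule are illustrative assumptions.

```python
def lorenz2(z, gini):
    """Second-order (basic) Lorenz curve from (37)."""
    return (1.0 - 3.0 * gini) * z + 3.0 * gini * z * z


def gini_from_lorenz(lorenz, steps=1000):
    """Gini = 1 - 2 * integral of L(z) over [0, 1], by composite Simpson's rule."""
    h = 1.0 / steps
    total = lorenz(0.0) + lorenz(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * lorenz(i * h)
    return 1.0 - 2.0 * total * h / 3.0


# The curve built from a Gini of 0.30 returns that Gini.
recovered = gini_from_lorenz(lambda z: lorenz2(z, 0.30))
print("recovered Gini:", recovered)

# With a Gini above 1/3 the curve is negative close to z = 0.
dip = lorenz2(0.05, 0.45)
print("L_2(0.05) with Gini = 0.45:", dip)
```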

The third-order polynomial series is: 

 
$$L_3(z) = b_0 P_0(z) + b_1 P_1(z) + b_2 P_2(z) + b_3 P_3(z) \tag{38}$$

If we apply the boundary conditions $L_3(z = 0) = 0$ and $L_3(z = 1) = 1$, we have the
following:

$$L_3(z) = \frac{1 - \mathrm{Gini}}{2} + b_1 P_1(z) + \frac{\mathrm{Gini}}{2\sqrt{5}} P_2(z) + \frac{1 - 2\sqrt{3}\,b_1}{2\sqrt{7}} P_3(z) \tag{39}$$

If $B = 1 - 2\sqrt{3}\,b_1$, so that the coefficient on $P_3(z)$ in (39) is $B / (2\sqrt{7})$, rewrite (39) as:

$$L_3(z) = (1 - 3\,\mathrm{Gini} + 5B)\,z + 3(\mathrm{Gini} - 5B)\,z^2 + 10B\,z^3 \tag{40}$$

 
Sufficient conditions to make (40) a positive convex function are:

$$B \ge 0, \quad \mathrm{Gini} \ge 5B, \quad 1 - 3\,\mathrm{Gini} + 5B \ge 0 \tag{41}$$

These conditions can be simplified as:

$$0 \le 5B \le \mathrm{Gini} \le \frac{1 + 5B}{3} \tag{42}$$

 
This condition limits the range to $0 \le B \le 0.1$ and $\mathrm{Gini} \le 0.5$. If the given data do
not satisfy the above conditions, then the Lorenz curve derived by (40) may not
be a nonnegative convex function. If the Gini coefficient is 0.5 and $B = 0.1$,
then $L_3(z) = z^3$.
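
The conditions in (41) translate into an admissible interval for $B$ at each Gini value, which the following sketch computes. The function names are hypothetical and the small tolerance `eps` is an implementation choice; the Gini values 0.49 and 0.51 are used only as examples on either side of the 0.5 limit.

```python
def lorenz3(z, gini, B):
    """Third-order Lorenz curve from (40)."""
    return (1.0 - 3.0 * gini + 5.0 * B) * z + 3.0 * (gini - 5.0 * B) * z ** 2 + 10.0 * B * z ** 3


def feasible_B(gini, eps=1e-12):
    """Interval of B satisfying (41), or None when no such B exists."""
    lower = max(0.0, (3.0 * gini - 1.0) / 5.0)   # from B >= 0 and 1 - 3 Gini + 5B >= 0
    upper = gini / 5.0                           # from Gini >= 5B
    return (lower, upper) if lower <= upper + eps else None


print("Gini = 0.49:", feasible_B(0.49))   # a narrow admissible interval
print("Gini = 0.50:", feasible_B(0.50))   # the interval collapses to B = 0.1
print("Gini = 0.51:", feasible_B(0.51))   # no admissible B

# At the limiting case the curve is the cubic z^3.
print("L_3(0.3) at Gini = 0.5, B = 0.1:", lorenz3(0.3, 0.5, 0.1))
```

The integral of (40) over [0, 1] equals (1 - Gini)/2 for every $B$, so changing $B$ reshapes the curve without changing its Gini coefficient.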



 
V. Applications 

In order to illustrate the usefulness of the new measure, we present examples 

using Current Population Survey (CPS) data from 2000-2016. The CPS is 

sponsored jointly by the U.S. Bureau of the Census and the U.S. Bureau of

Labor Statistics. The CPS produced a technical paper, TP66, which describes the design

and methodology of the CPS, cf. www.bls.census.gov/cps/tp66.htm.   

We use CPS household income data disaggregated into centiles for the 

years 2000-2016.7 The distribution of the data for each year can be summarized 

by the Gini index. Now using the logarithmic share function given in (26), we 

can calculate a secondary measure to supplement the Gini index. 

In Fig.2, the approximated function converges to the observed income 

shares for 2016 as we increase the number of expansion terms. The Gini-based 

model in (12) is a basic model, and it performs poorly for the very richest 

income group. The even-order models, the second-order in (13) and the fourth-order

in (15), performed badly because the even-order Legendre polynomial terms are

symmetric functions and do not fit a monotonically increasing function well.

The third-order model in (14) seems to

perform well, but the fifth-order model in (16) produced minor fluctuations in 

the middle range of the IDF.  

 
7 We are grateful to Martha Starr for providing these data to us. 
                                           


 
In Fig.3, the Gini-based model produced a straight line and could not 

approximate the share values for the very poor and very rich groups properly.  

In comparison, if the third-order term is added, (26) showed an improved result 

for the poorest and very richest group.  In the middle ranges, slight 

improvements were observed.  

 
Fig.2 Convergence of Legendre polynomials to observed log shares (plotted against $z$: observed log shares for 2016, the Gini-based function, and the second- to fifth-order polynomials)

 
In Fig. 4, the performance of the third-order model of (26) is shown. Except for 

the very rich group, this model provided a relatively good performance. In Fig. 

5, the performance of the fifth-order model of (28) is shown.  Here, there is a 

small fluctuation around $z = 0.7$, but it produced a better performance for the

richest group.   

Fig.3 Approximated log shares with Gini and third order models (plotted against $z$: Gini-based model, Legendre third-order model (26), and observed log share)

 
Fig.4 Approximated observed shares with third order model (plotted against $z$: observed shares and the third-order model (26))

Fig.5 Approximated observed shares with fifth order model (plotted against $z$: observed share function and the fifth-order model (28))

 
In Fig. 6, we used the CPS data from the year 2000 and examined the 

performance of the Legendre polynomial series expansion of the Lorenz curve. 

To impose the convexity of an approximated Lorenz curve of a third-order 

polynomial series, the Gini coefficient should not be larger than 0.5, as stated 

below (42). The Gini coefficient for CPS data in 2000 is 0.490. The CPS data 

for the years 2012-2016 have Gini coefficients greater than 0.5. If the Gini

coefficient is larger than 0.5, we need a higher-order Legendre polynomial 

series expansion instead of relying only on (39). In comparison, for the

second-order approximated Lorenz curve to be a nonnegative convex function, the Gini

coefficient should not exceed 1/3, as stated below (37).

 
Fig. 6 Approximated Lorenz curve for 2000
[Figure: plotted against z. Series: observed Lorenz curve for 2000; approximated Lorenz curve for 2000.]



 
In Fig. 7, the movements of the Gini coefficient and the income share of the richest 5% are compared. The two series move more or less in the same direction, though the gap between the two curves decreased after 2012. This means the Gini coefficient is not as sensitive to extreme movements in the highest percentiles of income earners.

  
Fig. 7 Comparison of Gini and richest 5% movements
[Figure: plotted against year. Series: Gini (left scale); Rich5p (right scale).]



 
Fig. 8 shows the third-order parameter ($a_3$) of an ONB expansion of the log share in (26). This parameter moves in the opposite direction to the income share of the poorest 5 percent of income earners (poor 5P). In 2015, the poorest 5P faced a significant loss in income share but recovered in 2016. The parameter $a_3$ shows the opposite movements, indicating more inequality as the poorest group suffered a loss in income share. For the richest 5P and the parameter $a_3$, a similar trend is observed, although the finer details differ. Here, $a_3$ goes up as the richest share increases and goes down as the richest share decreases.

Fig. 8 Comparison of a3 (ONB), rich 5P, and poor 5P
[Figure: plotted against year. Series: a3 (left scale); Rich5P (left scale); Poor5P (right scale).]



 
Fig. 9 shows the usefulness of the Gini coefficient, Theil’s entropy measure, and the third-order parameter ($a_3$) in describing the movements of the poorest 5P and the richest 5P.

The Gini coefficient and Theil’s measure are more or less the same in that they are both reasonably good at describing the movement of the richest 5P. As explained in the discussion of Fig. 8, the third-order parameter ($a_3$) was stronger in describing the movement of the poorest 5P group’s share.

 
Fig. 9 Comparison of Gini, Theil, a3 (ONB), rich 5P, and poor 5P
[Figure: plotted against year. Series: Gini (left scale); Theil (left scale); a3 (ONB parameter, left scale); Rich 5P (left scale); Poor 5P (right scale).]



 
To check the performance of the Gini, Theil, and the third-order parameter $a_3$, a curve-fitting exercise is performed in which least squares estimation results are compared:

 
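$$P5 = \underset{(0.0007662)}{0.01124} - \underset{(0.001435)}{0.01616}\,\mathrm{Gini} + u_1, \qquad R^2 = 0.8943 \tag{43}$$

$$P5 = \underset{(0.0002582)}{0.004793} - \underset{(0.001768)}{0.01523}\,\mathrm{Theil} + u_2, \qquad R^2 = 0.8319 \tag{44}$$

$$P5 = \underset{(0.0004459)}{0.008385} - \underset{(0.001144)}{0.007416}\,\mathrm{Gini} - \underset{(0.001157)}{0.01025}\,a_3 + u_3, \qquad R^2 = 0.9840 \tag{45}$$

$$R5 = -\underset{(0.01404)}{0.3677} + \underset{(0.02630)}{1.2824}\,\mathrm{Gini} + u_4, \qquad R^2 = 0.9937 \tag{46}$$

$$R5 = \underset{(0.003108)}{0.1371} + \underset{(0.02128)}{1.2544}\,\mathrm{Theil} + u_5, \qquad R^2 = 0.9957 \tag{47}$$

$$R5 = -\underset{(0.01355)}{0.4111} + \underset{(0.03475)}{1.4155}\,\mathrm{Gini} - \underset{(0.03514)}{0.1559}\,a_3 + u_6, \qquad R^2 = 0.9974 \tag{48}$$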

 
Equations (45) and (48) show that the shares of the poorest group (P5) and the richest group (R5) are both described well if the Gini coefficient and the third-order parameter $a_3$ are used simultaneously, as these combinations provide the best fit of the data.
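A comparison of this kind can be sketched as follows. The series below are synthetic placeholders, not the CPS data, so the resulting numbers bear no relation to (43)-(48); the helper name `ols_fit` and the variable names are hypothetical. The sketch fits a share first on the Gini coefficient alone and then on the Gini coefficient together with $a_3$, and compares the $R^2$ values.

```python
import numpy as np

def ols_fit(y, *regressors):
    """OLS with an intercept; returns (coefficients, R-squared)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(y.size), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    r2 = 1.0 - (resid @ resid) / (centered @ centered)
    return beta, r2

# Synthetic placeholder series (NOT the CPS data used in the paper).
rng = np.random.default_rng(0)
years = 17
gini = np.linspace(0.49, 0.52, years) + rng.normal(0.0, 0.003, years)
a3 = rng.normal(0.30, 0.05, years)
poor5 = 0.008 - 0.007 * gini - 0.010 * a3 + rng.normal(0.0, 1e-4, years)

_, r2_gini = ols_fit(poor5, gini)              # analogue of a Gini-only fit
beta_both, r2_both = ols_fit(poor5, gini, a3)  # analogue of a Gini-plus-a3 fit
print(r2_gini, r2_both)
```

Because the Gini-only regression is nested in the two-regressor one, $R^2$ cannot fall when $a_3$ is added; the size of the increase is what carries information.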

 
VI. Conclusion 

This paper introduced a new inequality measure to supplement the better-known

Gini index, where the new measure is sensitive to the asymmetries and extreme

values in the underlying IDF that the index is intended to measure. The 

inequality measurement literature contains hundreds of papers on an appropriate 

index of income inequality, and on what desirable properties such a measure (or 

index) should contain.   



 
There is a concurrent literature on the use of hypothetical statistical 

distributions to approximate and describe an observed distribution of incomes.  

Even with the recognition by some of the fact that incomes are distributed with 

asymmetric higher moments, inequality indices constructed to capture the level 

of inequality inherent in these observed income distributions (with a single 

number) are generally based on the mean and variance of the observed data.  

This paper introduced a new inequality measure to supplement, but not to 

replace, the Gini coefficient that measures more accurately the inherent 

asymmetries and extreme values that are present in observed income 

distributions.  

The new measure is based on a third-order term of a Legendre

polynomial from the logarithm of a share function (or a first-order term of a 

Lorenz curve).  In this paper, we advocated using the two measures together to 

provide a better description of inequality inherent in empirical income 

distributions with extreme values. 

We applied the new measure to examine inequality in U.S. CPS 

household income data for 2000-2016 in income centiles. The new measure was 

shown to be an excellent supplement to the Gini coefficient. The Gini index 

provides an intuitive overall measure of the inequality inherent in an IDF. 

Changes in the level of inequality inherent in the empirical IDF (particularly for 

the extreme portions of the IDF) were detected more accurately by the new 

measure than by simply calculating the Gini index alone.   

  

 
References 
Arfken, G. (1985), Mathematical Methods for Physicists, 3rd Edition, Academic Press, San Diego.

Basmann, R. and D. Slottje, (1987), “A new index of income inequality,” Economics Letters 

24: 385-389. 

Basmann, R., K. Hayes, and D. Slottje, (1991), “The Lorenz curve and the mobility function,” 

Economics Letters, 35: 105-111. 

Boushey, H., J. Delong, and M. Steinbaum, (2017), After Piketty, Harvard University, 

Cambridge, MA. 

Choo, H. and H. Ryu (1994), “Gini coefficient, Lorenz curves, and Lorenz dominance effect: An application to Korean income distribution data,” Journal of Economic Development, 19(2): 47-65.

Coles, S. (2001), An introduction to Statistical Modeling of Extreme Values, Springer-Verlag. 

Cowell, F. (2011), Measuring Inequality, 3rd Edition, Oxford: Oxford U. Press. 

Cowell, F. and E. Flachaire (2002), “Sensitivity of Inequality Measures to Extreme Values,” 

LSE STICERD Paper No. DARP 60. 

Cowell, F. and E. Flachaire (2007), “Income Distributions and Inequality Measurement: the 

Problem of Extreme Values,” Journal of Econometrics, 141: 1044-1072. 

Maasoumi, E. (1986), "The Measurement and Decomposition of Multidimensional 

Inequality," Econometrica, 54: 991-998. 

Maasoumi, E. (1989), "Continuously Distributed Attributes and Measures of Multivariate 

Inequality," Journal of Econometrics, 42: 131-144. 

McDonald, J.B. (1984), “Some Generalized Functions for the Size Distributions of Income,” 

Econometrica, 52: 647 – 663.  

McDonald, J., J. Sorenson and P. Turley (2013), “Skewness and Kurtosis Properties of 

Income Distribution Models,” Review of Income and Wealth, 59: 360 – 374. 

Milne, W. (1949), Numerical Calculus, Princeton University Press, Princeton. 

Pareto, V. (1876), Cours d'Économie Politique Professé a l'Université de Lausanne.  

Piketty, T. (1995), “Social Mobility and Redistributive Politics”, Quarterly Journal of 

Economics, 110: 551-584. 

Piketty, T. (2014), Capital in the Twenty-First Century, Harvard University Press, Cambridge.



 
Ryu, H. (1993), "Maximum entropy estimation of density and regression functions", Journal 

of Econometrics, 56: 397-440. 

Ryu, H. (2013), “A bottom poor sensitive Gini coefficient and maximum entropy estimation of income distributions,” Economics Letters, 118: 370-374.

Ryu, H. and D. Slottje, (1996), "Two Flexible Functional Form Approaches for 

Approximating the Lorenz Curve", Journal of Econometrics, 72: 251-274. 

Ryu, H. and D. Slottje, (1998), Measuring Trends in U.S. Income Inequality, Theory and 

Applications, Springer, New York. 

Ryu, H. and D. Slottje (2017), “Maximum Entropy Estimation of Income Distributions from 

Basmann’s WGM Class,” Journal of Econometrics, 199 (2): 221-231. 

Slottje, D. (1987), “Relative Price Changes and Inequality in the Size Distribution of Various 

Components of Income,” Journal of Business and Economic Statistics, 5: 19-26. 

Yitzhaki, S. (2013), More than a dozen ways of spelling Gini, ch-2 in The Gini Methodology, 

Springer, 11-13.   

Zellner, A. and R. Highfield, (1988), “Calculation of maximum entropy distributions and 

approximation of marginal posterior distributions,” Journal of Econometrics, 37: 195-209. 

  

 
Appendix: Pigou-Dalton Principle (PDP) for model (26) 

 
The logarithm of the share function can be expanded in the Legendre series: 

 
$$\log s_N(z) = a_0 P_0 + a_1 P_1 + a_2 P_2 + a_3 P_3 + \cdots + a_N P_N \tag{4}$$

 
Suppose we want to summarize income inequality with only a Gini coefficient. 

This corresponds to taking a basic Gini model (12) because higher-order 

Legendre polynomials do not influence the choice of $a_0$ and $a_1$:

 
Basic model:

$$\log s_{Gini}(z) = a_0 + a_1 P_1(z) \tag{12}$$

 
The Gini coefficient can be determined from $a_1$ and vice versa, as discussed in (11). Even if we include higher-order terms of (4), $a_1$ will be the same in (4) and (12).
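The invariance of $a_1$ follows from orthonormality: each coefficient is a projection of the log share function onto its own basis function, so truncating the series leaves the lower-order coefficients unchanged. The sketch below assumes that the $P_k$ are Legendre polynomials shifted to $[0, 1]$ and scaled to unit norm, and it uses a hypothetical log share function chosen only for illustration; the names `onb`, `onb_coefficients`, and `log_share` are not from the paper.

```python
import numpy as np
from numpy.polynomial import legendre as L

def onb(k, z):
    """Orthonormal shifted Legendre polynomial of degree k on [0, 1] (assumed basis)."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return np.sqrt(2 * k + 1) * L.legval(2.0 * np.asarray(z) - 1.0, coeffs)

# Gauss-Legendre nodes and weights mapped from [-1, 1] to [0, 1].
nodes, weights = L.leggauss(40)
zq, wq = 0.5 * (nodes + 1.0), 0.5 * weights

def onb_coefficients(log_share, order):
    """a_k = integral over [0, 1] of log s(z) * P_k(z) dz, for k = 0..order."""
    values = log_share(zq)
    return np.array([np.sum(wq * values * onb(k, zq)) for k in range(order + 1)])

def log_share(z):
    """Hypothetical log share function, for illustration only."""
    return -6.0 + 1.5 * z + 2.0 * z**3

a_gini = onb_coefficients(log_share, 1)   # two-term model, as in (12)
a_full = onb_coefficients(log_share, 5)   # higher-order expansion, as in (4)
print(a_gini[1], a_full[1])
```

Under these assumptions the first-order coefficient is identical in both expansions, and the sum of squared coefficients of the full expansion matches the integral of the squared log share function.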

Now, to prove that the PDP condition holds for our new measure, suppose $i < j$ and $s(z_i) < s(z_j)$. After a transfer of a small income share ($\Delta$) from the $j$th person to the $i$th person, the new income shares of these two people become $s(z_i) + \Delta$ and $s(z_j) - \Delta$. This means the slope of $\log s(z)$ is now lower. Thus $a_1$ and the Gini coefficient are lower, and $\int [\log s_N(z)]^2 \, dz$ has decreased. If $\int [\log s_{Gini}(z)]^2 \, dz$ is a good approximation of $\int [\log s_N(z)]^2 \, dz$, then $a_0^2 + a_1^2$ will decrease because we have:

$$\int [\log s_{Gini}(z)]^2 \, dz = a_0^2 + a_1^2 \tag{A1}$$

 

 
In the standard discussion, an income transfer from a rich person to a poor person is described by a lower value of the Gini coefficient, but here the same effect is represented by lower values of $\int [\log s_{Gini}(z)]^2 \, dz$ and $a_0^2 + a_1^2$.

Similarly, if the logarithm of the share function is approximated with the first-order and third-order Legendre polynomials, then the logarithm of the share function is summarized with the ONB parameters $a_1$ and $a_3$.

For the third-order model:

$$\log sh_3(z) = a_0 + a_1 P_1(z) + a_3 P_3(z) \tag{26}$$

 
The parameters $a_1$ of (12) and (26) are the same, and can be derived from the given Gini coefficient. If the income share transfer decreases $\int [\log s_N(z)]^2 \, dz$, and if $\int [\log sh_3(z)]^2 \, dz$ is a good approximation of $\int [\log s_N(z)]^2 \, dz$, then the income share transfer lowers $a_0^2 + a_1^2 + a_3^2$:

$$\int [\log sh_3(z)]^2 \, dz = a_0^2 + a_1^2 + a_3^2 \tag{A2}$$

 
Therefore, a transfer of the kind considered in the PDP decreases $a_0^2 + a_1^2 + a_3^2$, which completes the proof.
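The first step of the argument can be illustrated numerically, as a check rather than a proof. The sketch below uses hypothetical centile shares and a discrete mean in place of the integral of $[\log s(z)]^2$; it covers only the untruncated quantity, not the truncated sums in (A1) and (A2), and the function names are assumptions made for this example.

```python
import numpy as np

def gini(shares):
    """Gini coefficient of income shares (sorted internally)."""
    s = np.sort(np.asarray(shares, dtype=float))
    n = s.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * s) / (n * s.sum()) - (n + 1.0) / n

def mean_sq_log(shares):
    """Discrete analogue of the integral of [log s(z)]^2 over z."""
    return float(np.mean(np.log(shares) ** 2))

def transfer(shares, i, j, delta):
    """Move a share `delta` from the richer unit j to the poorer unit i."""
    out = np.array(shares, dtype=float)
    out[i] += delta
    out[j] -= delta
    return out

# Hypothetical centile shares: increasing in z and summing to one.
raw = np.exp(np.linspace(-1.5, 2.0, 100))
before = raw / raw.sum()

i, j = 4, 95                              # a poor centile and a rich centile
delta = 0.25 * (before[j] - before[i])    # small enough to keep s_i < s_j
after = transfer(before, i, j, delta)

print(gini(before), gini(after))
print(mean_sq_log(before), mean_sq_log(after))
```

Both the Gini coefficient and the mean squared log share fall after the transfer in this example, which is consistent with the direction of change used in the proof. The squared log is convex and decreasing for shares below one, which is why a transfer from a larger share to a smaller one lowers the total.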