See5 algorithm versus discriminant analysis. An application to the prediction of insolvency in Spanish non-life insurance companies

Facultad de Ciencias Económicas y Empresariales. Decanato
The prediction of insolvency in insurance companies has arisen as an important problem in financial research, owing to the need to protect the general public while minimizing the associated costs. Most methods applied to this question in the past are traditional statistical techniques that use financial ratios as explanatory variables. However, these variables do not usually satisfy the statistical assumptions of those techniques, which complicates their application. In this paper, we carry out a comparative study of the performance of a well-known parametric statistical technique (Linear Discriminant Analysis) and a non-parametric machine learning technique (See5). We apply both methods to the prediction of insolvency in Spanish non-life insurance companies on the basis of a set of financial ratios. The results indicate that the machine learning technique performs better, which shows that this method can be a useful tool for evaluating the insolvency of insurance firms.
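As a rough illustration of the non-parametric side of this comparison: See5 belongs to the C4.5 family of decision-tree learners, which choose splits with an entropy-based criterion rather than assuming a distribution for the ratios. The sketch below is not the See5 implementation and does not use the paper's data. It assumes plain information gain as the criterion, and the ratio values, labels, and function names are hypothetical, chosen only to show how a single financial ratio might be split into "solvent" and "insolvent" groups.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    Plain information gain is an assumption of this sketch, not See5's exact rule.
    """
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, 0.0)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best[1]:
            best = ((lo + hi) / 2, gain)
    return best


# Hypothetical values of one financial ratio for six firms (not from the paper).
ratio = [0.05, 0.10, 0.12, 0.30, 0.35, 0.40]
status = ["insolvent", "insolvent", "insolvent", "solvent", "solvent", "solvent"]

threshold, gain = best_threshold(ratio, status)
print(f"split at ratio <= {threshold:.2f}, information gain = {gain:.2f} bits")
```

A tree learner would apply this search to every ratio, keep the split with the best score, and recurse on each side. No normality or equal-covariance assumption is needed, which is the contrast with Linear Discriminant Analysis drawn in the abstract.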