Mass appraisal of apartments using Random Forest and Gradient Boosting algorithms: case study of Florianópolis, Brazil
DOI:
https://doi.org/10.11606/eISSN.2236-2878.rdg.2024.212297
Keywords:
Mass appraisal of properties, Property tax, Machine learning, Web scraping
Abstract
Property tax is an important tool of urban policy, and its calculation is based on the assessed market value of the property, typically determined through mass appraisals. This study evaluates the predictive performance of the machine learning algorithms random forest and gradient boosting in the mass appraisal of urban properties, comparing them to classical linear regression. A total of 8,694 real estate market data points were collected using web scraping techniques, and after initial processing with inclusion criteria, 1,572 apartment data points from the central region of Florianópolis, Brazil, were selected for modeling. The results indicated that the gradient boosting model outperformed all others across metrics such as RMSE, MAE, MAPE, COD, PRD, and R², with predictions up to 30% more accurate, confirming its potential to estimate apartment market values in a robust and equitable manner. These findings reinforce gradient boosting as a viable alternative for generating the property tax base, enabling fairer and more equitable taxation, thereby achieving fiscal justice and tax transparency.
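As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes RMSE, MAE, MAPE, R², COD, and PRD for paired lists of observed and predicted prices. It is not the authors' code: the function name `appraisal_metrics` and the sample prices are hypothetical, and the formulas follow the commonly used definitions, with COD and PRD computed on the appraisal ratio (predicted value divided by observed price) as is usual in ratio studies.

```python
import math
from statistics import mean, median


def appraisal_metrics(observed, predicted):
    """Return accuracy and ratio-study metrics for paired price lists.

    Hypothetical helper for illustration; assumes all observed prices are > 0
    and that observed prices are not all identical (needed for R²).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    errors = [p - o for o, p in zip(observed, predicted)]
    ratios = [p / o for o, p in zip(observed, predicted)]  # appraisal ratios
    med = median(ratios)
    obs_mean = mean(observed)
    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)     # total sum of squares

    return {
        "rmse": math.sqrt(ss_res / len(errors)),
        "mae": mean(abs(e) for e in errors),
        "mape": 100 * mean(abs(e) / o for e, o in zip(errors, observed)),
        "r2": 1 - ss_res / ss_tot,
        # COD: average absolute deviation of ratios from the median ratio, in percent
        "cod": 100 * mean(abs(r - med) for r in ratios) / med,
        # PRD: mean ratio divided by the value-weighted mean ratio
        "prd": mean(ratios) / (sum(predicted) / sum(observed)),
    }


if __name__ == "__main__":
    # Made-up prices, only to show the call.
    sample = appraisal_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
    for name, value in sample.items():
        print(f"{name}: {value:.4f}")
```

Lower COD indicates more uniform appraisals, and a PRD close to 1 suggests that low-value and high-value properties are appraised at similar ratios, which is why these two measures complement the purely error-based metrics when the goal is equitable taxation.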
License
Copyright (c) 2024 Carlos Augusto Zilli, Lia Caetano Bastos

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Authors who publish in this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under the Creative Commons Attribution BY-NC-SA License, which allows the work to be shared with acknowledgment of its authorship and of its initial publication in this journal.
- Authors are authorized to enter into separate, additional contracts for the non-exclusive distribution of the version of the work published in this journal (e.g., depositing it in an institutional repository or publishing it as a book chapter), with acknowledgment of authorship and of its initial publication in this journal. The license adopted conforms to the CC-BY-NC-SA standard.
- Authors are permitted and encouraged to publish and distribute their work online (e.g., in institutional repositories or on their personal page) at any point before or during the editorial process, since this can lead to productive changes and increase the impact and citation of the published work (see The Effect of Open Access).