| Avaliação Psicológica | |
| Comparing the Predictive Power of the CART and CTREE algorithms | |
| Enio G. Jelihovschi [1], Cristiano Mauro Assis Gomes [2], Gina C. Lemos [3] | |
| [1] Universidade Estadual de Santa Cruz; [2] Universidade Federal de Minas Gerais; [3] Universidade do Minho, Braga | |
| Keywords: algorithms; data mining; large-scale educational assessment; machine learning; National Exam of Upper Secondary Education | |
| DOI : 10.15689/ap.2020.1901.17737.10 | |
| Source: DOAJ | |
【 Abstract 】
The CART algorithm has been extensively applied in predictive studies; however, researchers argue that CART produces variable selection bias. This bias is reflected in CART's preference for selecting predictors with large numbers of cutpoints. Considering this problem, this article compares the predictive power of the CART algorithm with that of an unbiased algorithm (CTREE). Both algorithms were applied to the 2011 National Exam of High School Education, which includes many categorical predictors with a large number of categories and could therefore produce variable selection bias. A CTREE tree and a CART tree, each with 16 leaves, were generated from a predictive model with 53 predictors and the students' essay-writing achievement as the outcome. The CART algorithm yielded the tree with the better outcome prediction. This result suggests that for large data sets (so-called big data), the CART algorithm might give better results than the CTREE algorithm.
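
The variable selection bias mentioned above can be illustrated with a small simulation. The sketch below is not the analysis from the article and does not use the ENEM data; it assumes a pure-noise outcome, two uninformative predictors (one binary, one continuous), and an exhaustive least-squares split search of the kind used by regression CART. The function names `best_sse_reduction` and `root_choice_counts` are hypothetical and chosen only for this illustration.

```python
import random


def best_sse_reduction(x, y):
    """Largest drop in the sum of squared errors over all binary splits of x."""
    n = len(y)
    order = sorted(range(n), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    total = sum(ys)
    total_sq = sum(v * v for v in ys)
    parent_sse = total_sq - total * total / n
    best = 0.0
    left_sum = 0.0
    for k in range(1, n):
        left_sum += ys[k - 1]
        if xs[k - 1] == xs[k]:
            continue  # identical neighbours: no cutpoint between them
        right_sum = total - left_sum
        # SSE of the two child nodes for a split after the k-th sorted value
        child_sse = total_sq - left_sum ** 2 / k - right_sum ** 2 / (n - k)
        best = max(best, parent_sse - child_sse)
    return best


def root_choice_counts(n_sims=200, n=100, seed=0):
    """Count which noise predictor wins the root split in each simulation."""
    rng = random.Random(seed)
    counts = {"binary": 0, "continuous": 0}
    for _ in range(n_sims):
        # Outcome is independent of both predictors (assumption of this sketch)
        y = [rng.gauss(0, 1) for _ in range(n)]
        predictors = {
            "binary": [rng.randint(0, 1) for _ in range(n)],   # 1 cutpoint
            "continuous": [rng.random() for _ in range(n)],    # up to n-1 cutpoints
        }
        gains = {name: best_sse_reduction(x, y) for name, x in predictors.items()}
        counts[max(gains, key=gains.get)] += 1
    return counts


if __name__ == "__main__":
    print(root_choice_counts())
```

Because neither predictor carries information about the outcome, an unbiased selection procedure would be expected to pick each about equally often. With an exhaustive search, the predictor offering more cutpoints gets more chances to find a large spurious improvement, so it tends to be chosen more often. CTREE is designed to avoid this by separating variable selection from the search for a cutpoint.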
【 License 】
Unknown