Evaluation of college admissions: a decision tree guide to provide information for improvement (2025)

Introduction

To alleviate the pressures of pursuing further education, facilitate the adaptive development of students, balance urban and rural education, and improve the academic training of the people, the 12-year basic national education program was implemented in Taiwan. This education program is based on the multiple intelligence theory, which provides an adaptive approach to the admissions system, affirming learners’ right to education and equal opportunities in admission (Ministry of Education, 2009). In the 5-year junior college program, the assessment of students’ learning abilities is promoted as the basis for evaluation through a variety of admission systems, so that schools can focus on the diversified development of students’ potential while creating a fair and reasonable admission system. The exam-free admission process has been expanded since 2011.

The exam-free admissions system considers the candidate’s choice for Taiwan’s 5-year junior college programs. When the number of applicants exceeds the number of available seats, the admission system recommends more than the allocated quota, using criteria such as “multiple learning performance,” “technically and artistically gifted,” “disadvantaged status,” “balanced learning performance,” and “counseling competency”. Multiple learning performances during secondary school, as well as the results of the “comprehensive assessment program” and other results, are listed and sorted according to the criteria set by each school. Under this admissions system, applicants can choose their own colleges and departments according to their own conditions and interests, while the colleges can recruit the appropriate talent according to their standards. During the enrollment process, applicants may also choose to forgo the opportunity to enroll. The educational beliefs on which this system is based are fair and testable.

However, when the number of applicants far exceeds the number of seats available at a college, most applicants fail to gain admission. Furthermore, if applicants prefer particular colleges and departments, other schools may become underenrolled. When these phenomena occur, it is necessary to explore whether the exam-free admission registration system can maintain its expected educational objectives or whether it allows popular colleges to retain the advantage of openly screening for outstanding students. Meanwhile, marginalized rural and suburban colleges tend toward insufficient enrollment or weak competitiveness, so that college enrollment shows two extremes.

Decision tree analysis is a common data-mining method that can be used as a tool for supervised feature extraction and description (Berry and Linoff, 1997). High-dimensional data can be quickly learned, a hierarchical tree structure can be established, and the results obtained can be transformed into easy-to-understand rules, generally for exploration and prediction purposes. At the same time, decision trees have the advantage of reducing unnecessary variables and sorting the importance of independent variables. In this study, the classification rules (factors) for enrollment failures/successes will be easily obtained during the classification process through the decision tree method, making these rules clear, easy to understand and easy to explore for applicants (students), admission-based colleges and education experts. This study uses statistical tests and decision tree analyses to assess possible factors for enrollment failure and success, reviews the consistency of the admissions system with stated educational philosophy and objectives, and makes specific recommendations.

The content of this research article is as follows: section “Literature review” discusses the literature review, section “Data and research methods” presents the data and research methods, section “Empirical results and discussion” shows the empirical results and discussion, and section “Conclusions” presents the conclusions.

Literature review

Data mining is the analytical step of the knowledge discovery in databases (KDD) process (Fayyad et al., 1996). It is a technique that combines statistics, algorithms, artificial intelligence, and machine learning and uses databases for calculations and analyses to explore specific rules or models from large, cluttered data. KDD is useful for finding potentially valuable information to solve problems. This rapidly developing new approach and technology is being widely used across a broad range of applications (Aulck et al., 2019; Han, 2022; Kiss et al., 2019; Nagy and Molontay, 2018; PhridviRaj and GuruRao, 2014; Rastrollo-Guerrero et al., 2020). The decision tree is one of the most common methods used in data-mining technology and is essentially a simple classifier (Kingsford and Salzberg, 2008), which produces a kind of supervised learning that can be used as an analytical data and prediction tool. Compared with statistical methods assessing the parameters of the data, decision tree analysis is a more reliable and intuitive method (Park and Dooris, 2020) that has been widely used to solve problems in the field of education (Delibasic et al., 2013; Lee and Liu, 2021; Yao et al., 2022).

Kirby and Dempster (2014) used decision tree analysis to identify the variables that best describe student achievement, using college student achievements to identify challenges and remediation opportunities during the course selection process and the course itself. Park and Dooris (2020) explored assessments and evaluations in higher education using decision tree analysis to predict student evaluation of teaching. Križanić (2020) applied data-mining techniques to educational data in higher education institutions, exploring student behavior interacting with course materials through a decision tree to better analyze how students learn. Asif et al. (2017) believed that data mining results can provide timely warning and support to underachieving students while providing advice and opportunities to high-achieving students. Singer et al. (2020) used decision trees to predict the stability of the academic behavior of students with learning disabilities (LD) with and without accommodation factors and found that the rendered models were excellent in predicting performance.

Amburgey and Yi (2011) used the principles of business intelligence to explore first-year data from master’s students in private colleges with decision tree analysis, neural network analysis, and multiple regression analysis to develop models to predict the average score (GPA) for each student. Lin et al. (2013) used the decision tree to build a Personalized Creative Learning System (PCLS) as an important predictor of university professional and learning paths. Howard et al. (2018) used eight predictive models as an early warning system to assess both students’ potential learning and poor learning in the course, resulting in a preference for the Bayesian additive regression trees (BART) as the best predictive model. Lynch (2017) offers insight into the issues that can be exacerbated while applying big data analytics to education and argues that while big data can lead to personalized learning, deep student modeling, and true vertical learning, its application requires in-depth and continuous monitoring of students, classes, and teachers, as well as an invasion of privacy, potential interference with educational effectiveness, and other ethical issues.

Oranye (2016) focuses on the importance of college admissions. In past studies, data mining has been applied to predict school enrollment to provide useful information for the effective improvement and achievement of enrollment goals. For example, Tanna (2012) developed a decision support system that enables students to choose the right university based on their entrance exam scores. Zeng et al. (2014) used decision tree construction models to predict the popularity of universities in various regions of China, and the results showed the practical feasibility of decision tree modeling. Ragab et al. (2012) proposed a hybrid recommendation for a university admission system based on data-mining techniques and knowledge discovery rules to address college enrollment forecasting issues and recommend appropriate tracking channels for students to enroll successfully. Maltz et al. (2007) used computers to develop a decision-supporting system to improve responsiveness and real-time management capabilities in the college admission process, significantly increasing the effectiveness of the process and achieving enrollment business goals.

Finally, the theory of multiple intelligences argues that learners should be empowered, not limited, in the way they learn (Gardner, 2011). In particular, presentations of intelligence vary from person to person, so each learner should have a different learning course, and it is recommended that learners should not all be measured by the same criteria for learning effectiveness. Waterhouse (2006) casts doubt on the multiple intelligence theory, arguing that the lack of sufficient empirical support implies that it should not be the basis of educational practice. Chou (2009) questioned the fairness of allocating educational resources in Taiwan and the right of all citizens to be taught. The presence of an exam-free admission registration system does not reduce the long-term pressure on candidates, nor does it slow down the teaching of exam leaders.

Data and research methods

This study explores the enrollment of 5-year junior colleges in Taiwan in 2016, which is 5 years after the implementation of the new education reform system in 2011, to understand how the new system is deployed in practice. The information collected comes from the committee of joint admissions. These data are filled in by the participating applicants (students), the necessary personal achievements are submitted, and the personal information and data are checked and verified by the checker before submitting a formal registration in the computer database of the joint admission registration committee. In these 5-year junior colleges, 6013 applicants used the exam-free admissions system; only 2294 people were successfully enrolled, and the remaining 3719 people failed, accounting for 61.85% of the total applicants. Please refer to Table 1 for the differences and characteristics of the four enrolling colleges in terms of location, enrollment scale, and admissions department. This study analyzes the data at the time of registration and organizes them in the form of variables based on Table 2, which contains the definition and description of the study variables indicating the data patterns and definitions for the 21 candidate-based independent variables and 1 dependent variable of the enrollment results (Y1).

Data mining is based on combined algorithms of statistical analysis, which use rapid computing abilities to analyze big data and find useful knowledge as a decision analysis tool or predictive technology (PhridviRaj and GuruRao, 2014). Vialardi et al. (2011) describe the data-mining analysis process, which consists of six steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Some data-mining methods (e.g., traditional artificial neural network models) should be applied with caution because they may not be able to filter variables automatically. When many candidate independent variables cannot be shown to have a significant impact on the dependent variable, the variables can be screened first with decision tree analysis, statistical tests, or other dimensionality reduction methods. Accordingly, this study applies the decision tree classification method to the collected data to establish an assessment model; the model can then be used to explore the factors (rules) behind admission successes and failures and, in the future, for admissions assessment or predictive analysis.

After categorizing the applicants into “enrollment failures” and “enrollment successes,” an evaluation model is established through the decision tree classification method to identify the factors that lead to enrollment failures. The decision tree is a data classification method that builds a tree structure, grouping cases according to the values of the independent variables or predicting the value of the dependent variable. The resulting tree structure provides a validation tool for exploratory and confirmatory classification analysis. Common algorithms include CART (classification and regression trees), CHAID (chi-square automatic interaction detector), and C5.0 (the successor to C4.5).

The CHAID growth method, which is based on the chi-square distribution, is used for this decision tree. The variables assessed must be categorical; a continuous variable must first be transformed into a categorical variable. Through chi-square automatic interaction detection, the independent variables that have the strongest interaction with the dependent variable are selected. When there is no significant difference between categories of an independent variable with respect to the dependent variable, those categories are merged automatically. The CHAID algorithm uses chi-square tests to determine the best splitting variable, and a node can be divided into more than two branches. CHAID also has the advantage of fast calculation and does not require postpruning; instead, it builds the stopping mechanism directly into the growth of the decision tree.
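The split-selection step described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: the field names (`metro`, `remote`, `result`) and the toy records are hypothetical, the category-merging step is omitted, and the Bonferroni multiplier is simplified to the number of candidate predictors (an assumption; CHAID implementations derive it from the number of possible category merges).

```python
import math
from collections import Counter

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, computed from the
    series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    if z > 700:  # tail is far below float precision; also avoids overflow
        return 0.0
    term = total = 1.0
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower = math.exp(a * math.log(z) - z - math.lgamma(a + 1)) * total
    return min(1.0, max(0.0, 1.0 - lower))

def chi_square_test(rows, predictor, target):
    """Pearson chi-square test of independence between two categorical fields."""
    cells = Counter((r[predictor], r[target]) for r in rows)
    row_tot = Counter(r[predictor] for r in rows)
    col_tot = Counter(r[target] for r in rows)
    n = len(rows)
    stat = 0.0
    for i in row_tot:
        for j in col_tot:
            expected = row_tot[i] * col_tot[j] / n
            stat += (cells[(i, j)] - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    p = chi2_sf(stat, df) if df > 0 else 1.0
    return stat, df, p

def best_split(rows, predictors, target):
    """Return (predictor, adjusted p) with the smallest adjusted p-value."""
    best = None
    for name in predictors:
        _, _, p = chi_square_test(rows, name, target)
        adjusted = min(1.0, p * len(predictors))  # simplified Bonferroni (assumption)
        if best is None or adjusted < best[1]:
            best = (name, adjusted)
    return best

def make_rows(metro, remote, result, count):
    """Build `count` identical hypothetical applicant records."""
    return [{"metro": metro, "remote": remote, "result": result}] * count

# Hypothetical toy data: `metro` is associated with the result, `remote` is not.
rows = (make_rows(1, 0, "fail", 20) + make_rows(1, 1, "fail", 20)
        + make_rows(1, 0, "success", 5) + make_rows(1, 1, "success", 5)
        + make_rows(0, 0, "fail", 10) + make_rows(0, 1, "fail", 10)
        + make_rows(0, 0, "success", 15) + make_rows(0, 1, "success", 15))

split_variable, adjusted_p = best_split(rows, ["metro", "remote"], "result")
```

On this toy data the function selects `metro`, because its adjusted p-value is below 0.05 while `remote` is exactly independent of the result. A fuller CHAID implementation would also merge non-significant categories before testing and stop growth when no adjusted p-value passes the threshold.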

Empirical results and discussion

Descriptive statistics and tests

Table 3 shows the frequency distributions and the chi-square tests of independence for the categorical variables. Among the enrolling colleges (X1), School A had the largest number of applicants (2810 people, 46.7% of the total sample), with a ratio of applicants to enrollment places of 4.01. School D had 2382 applicants (39.6% of the total sample), with a ratio of 1.8. School B had 544 applicants (9.0% of the total sample), with a ratio of 10.88. Finally, School C had 277 applicants (4.6% of the total sample), with a ratio of 1.01. A total of 3354 people, who accounted for 55.8% of the total sample, chose colleges in the metropolitan area.

Figure 1 shows a bar chart of the exam-free admissions schools in metropolitan areas and agricultural counties in terms of enrollment places and actually enrolled students. It was found that schools in the metropolitan area had full enrollment (when the last students to be admitted have the same conditions, the number of students who can be admitted can be increased so that the number of actual enrolled students is greater than the number of enrollment places), but the schools in agricultural counties were underenrolled (a possible enrollment of 1544 places vs. an actual enrollment of 1286 places).

Fig. 1. A bar chart of enrollment places and actual enrollment in the exam-free admissions schools of the metropolitan area and agricultural counties.

The distance between applicants and their registered colleges varied, but the largest group of applicants was local (X4 = 0), with 2140 people accounting for 35.6% of the total sample size. There were 1799 people coming from across 1 county or city (X4 = 1), accounting for 29.9%, and 1202 people coming from across 2 counties and cities (X4 = 2), accounting for 20.0%, while the remaining 872 people came from across at least 3 counties and cities (X4 ≥ 3), accounting for 14.5% of the total sample size. These results are in line with the educational philosophy stressing admission close to the student’s home.

A total of 1182 applicants (19.7% of the total sample size) did not meet the registration threshold set by the enrolling college. There were 4819 applicants from the noncore urban area, who accounted for 80.1% of the total sample size, and only 122 applicants (2.0%) from remote or outlying areas.

A total of 2592 applicants failed to enroll in City T (metropolitan area), representing 65.4% of all enrollment failures and 77.28% of the applicants to colleges in City T, which indicates that the colleges in the metropolitan area were more popular. When the distance between the enrollee and the target college spans at least 3 counties and cities (X4 ≥ 3), the enrollment failure rate is as high as 80%, higher than that of enrollees from nearer counties and cities (approximately 61 to 67%). A total of 2783 people (70.2% of all enrollment failures) reached the admissions threshold but failed to enroll. Of the 122 applicants from remote or outlying areas, 110 (90.16%) failed to enroll, indicating that applicants from remote or outlying areas were at a disadvantage in the competition.
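The shares and failure rates quoted in this subsection follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the variable names are illustrative, and only figures stated in the text are used.

```python
TOTAL_APPLICANTS = 6013

# Applicants by distance category (X4), as reported in the text.
distance_counts = {"X4 = 0": 2140, "X4 = 1": 1799, "X4 = 2": 1202, "X4 >= 3": 872}

def pct(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100 * part / whole, digits)

# Share of each distance category in the total sample.
distance_shares = {k: pct(v, TOTAL_APPLICANTS) for k, v in distance_counts.items()}

# Failure rates within a group: failures divided by applicants in that group.
metro_failure_rate = pct(2592, 3354, 2)    # City T: 2592 failures of 3354 applicants
remote_failure_rate = pct(110, 122, 2)     # remote/outlying: 110 failures of 122

for label, share in distance_shares.items():
    print(f"{label}: {share}%")
print(f"Metropolitan failure rate: {metro_failure_rate}%")
print(f"Remote/outlying failure rate: {remote_failure_rate}%")
```

The distinction the code makes explicit is the denominator: 35.6% is a share of all applicants, whereas 77.28% and 90.16% are failure rates within a subgroup.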

Table 3 also shows the chi-square test of independence results for the independent variables and the admission distribution results (Y1). At a significance level of 5%, it is found that variables such as enrolling colleges (X1), county or city where the enrolling colleges are situated (X2), registered schools in a metropolitan area (X3), the distance between the enrollee and the target colleges (X4), the admissions threshold (X5), types of junior high-school graduates (U1), and origins from remote or outlying areas (U3) are statistically significant, indicating that there is a dependency between the seven variables and the enrollment distribution results (Y1).

Table 4 shows the descriptive statistics of the variables, the difference in means t-tests (failures minus successes), Mann‒Whitney U-tests, and Kolmogorov‒Smirnov Z-tests. Grouped by the enrollment distribution results (Y1), the means for enrollment failures vs. enrollment successes are as follows: the size of junior high-school graduation (U4) (483.820 vs. 467.177), competition (U5) (0.501 vs. 0.424), service-learning effectiveness (U6) (6.462 vs. 6.325), daily life performance (U7) (3.894 vs. 3.795), multiple learning performances (U9) (23.040 vs. 22.785), comprehensive assessment program (U14) (12.850 vs. 12.104), and writing test (U15) (4.082 vs. 3.956). For these variables, the average performance of enrollment failures is higher than that of enrollment successes; that is, the higher scores belong to failed candidates and the lower scores to successful candidates, showing the phenomenon of a reverse selection of talent. However, for the four variables technically and artistically gifted (U10) (0.446 vs. 0.713), disadvantaged status (U11) (0.142 vs. 0.164), balanced learning performance (U12) (7.022 vs. 7.866), and other factors (U16) (0.220 vs. 0.710), the average performance of enrollment failures is lower than that of enrollment successes.

Table 4 also shows the results of the two-sample t-test for difference in means (failures–success), Mann‒Whitney U-tests, and Kolmogorov‒Smirnov Z-tests. At a significance level of 5%, the average difference was significantly positive in categories such as the size of the junior high-school graduation (U4) (16.643; p-value = 0.009), competition (U5) (0.077; p-value = 0.028), service-learning effectiveness (U6) (0.137; p-value = 0.000), daily life performance (U7) (0.098; p-value = 0.000), multiple learning performances (U9) (0.254; p-value= 0.001), comprehensive assessment program (U14) (0.746; p-value = 0.000), and writing test (U15) (0.126; p-value = 0.000), which indicates that the performance average for enrollment failures is significantly higher than that of enrollment successes. In addition, categories such as technically and artistically gifted (U10) (−0.267; p-value = 0.000), balanced learning performance (U12) (−0.844; p-value = 0.000), and other factors (U16) (−0.490; p-value = 0.000) are significantly negative, showing that the performance average for enrollment failures is significantly lower than that of enrollment successes. Finally, the results of the Mann‒Whitney U-tests and Kolmogorov‒Smirnov Z-tests are roughly similar to those of the two-sample t-test for differences in means.
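As a quick check on the direction of each effect, the mean differences (failures minus successes) can be recomputed from the group means quoted above. This sketch uses only the figures given in the text; small deviations of about 0.001 from the reported differences are expected because the quoted means are already rounded. It does not reproduce the t-tests themselves, which would require the group sizes and standard deviations from Table 4.

```python
# variable: (mean of failures, mean of successes, reported difference)
reported = {
    "U4": (483.820, 467.177, 16.643),
    "U5": (0.501, 0.424, 0.077),
    "U6": (6.462, 6.325, 0.137),
    "U7": (3.894, 3.795, 0.098),
    "U9": (23.040, 22.785, 0.254),
    "U14": (12.850, 12.104, 0.746),
    "U15": (4.082, 3.956, 0.126),
    "U10": (0.446, 0.713, -0.267),
    "U12": (7.022, 7.866, -0.844),
    "U16": (0.220, 0.710, -0.490),
}

TOLERANCE = 0.0015  # allows for rounding in the quoted means

direction = {}
mismatches = []
for name, (fail_mean, success_mean, reported_diff) in reported.items():
    diff = fail_mean - success_mean
    direction[name] = "failures higher" if diff > 0 else "successes higher"
    if abs(diff - reported_diff) > TOLERANCE:
        mismatches.append(name)

failures_higher = [n for n, d in direction.items() if d == "failures higher"]
print("Failures score higher on:", ", ".join(failures_higher))
print("Differences inconsistent with quoted means:", mismatches or "none")
```

All ten recomputed differences agree with the reported values within rounding, and seven of them are positive, which is the reverse-selection pattern discussed in the text.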

Table 4 shows that for six variables, namely, competition (U5), service-learning effectiveness (U6), daily life performance (U7), multiple learning performances (U9), comprehensive assessment program (U14), and writing test (U15), the enrollment failures perform significantly better than the enrollment successes. These results are not consistent with general expectations and may be related to applicants’ free choice to forgo this admission and then choose another school. Second, the number of students graduating from the applicant’s junior high school (size of school (U4)) is significantly larger for enrollment failures than for enrollment successes, which indicates that graduating from a school with a large student body does not confer a competitive advantage.

Decision tree analysis

Table 5 shows the classification results of the CHAID method for the training and test samples. The overall sample (6013 people) was divided into 80% used for training the algorithm (4809 people) and 20% used for testing (1204 people). The tree was grown with a maximum depth of 6, a minimum of 100 observations in a parent node, and a minimum of 50 observations in a child node. The significance level for both merging categories and splitting nodes was 0.05, with significance values adjusted by the Bonferroni method. The training/test model has a risk value of 0.231/0.252, a standard error of 0.006/0.013, a sensitivity rate of 81.9/80.6% (precision rate of 67.4/63.2%) for enrollment failures, an overall accuracy of 76.9/74.8%, and 39 nodes. The independent variables of the training/test model are, in order: registered schools in a metropolitan area (X3), other factors (U16), registration threshold (X5), technically and artistically gifted (U10), comprehensive assessment program (U14), disadvantaged status (U11), distance between the enrollee and the distribution colleges (X4), types of junior high-school graduates (U1), and writing test (U15).
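The reported risk values and standard errors are consistent with the overall accuracies and sample sizes, as the sketch below illustrates. It assumes that risk is the misclassification rate (1 minus accuracy) and that its standard error is the binomial one, sqrt(r(1 − r)/n); both are assumptions about how the figures were produced. The confusion-matrix counts at the end are hypothetical and serve only to show how sensitivity and precision for the "failure" class are defined.

```python
import math

def risk_and_se(accuracy, n):
    """Misclassification risk and its binomial standard error (assumed formula)."""
    risk = 1.0 - accuracy
    se = math.sqrt(risk * (1.0 - risk) / n)
    return risk, se

train_risk, train_se = risk_and_se(0.769, 4809)  # training sample
test_risk, test_se = risk_and_se(0.748, 1204)    # test sample

def failure_metrics(tp, fn, fp, tn):
    """Metrics with 'enrollment failure' as the positive class.

    tp: failures predicted as failures     fn: failures predicted as successes
    fp: successes predicted as failures    tn: successes predicted as successes
    """
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical counts, for illustration only (not taken from Table 5).
example = failure_metrics(tp=80, fn=20, fp=30, tn=70)

print(f"training: risk={train_risk:.3f}, se={train_se:.3f}")
print(f"test:     risk={test_risk:.3f}, se={test_se:.3f}")
```

Rounded to three decimals, the recomputed values are 0.231/0.252 for risk and 0.006/0.013 for the standard error, matching the figures in the text.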

Figure 2 shows a tree diagram for the training/test sample. To illustrate the tree in detail, this study divides it into three sub-structure diagrams: Up (Figs. 3 and 4, the first to second levels for the training/test samples), Left (Figs. 5 and 6, the training/test samples for the metropolitan area colleges), and Right (Figs. 7 and 8, the training/test samples for the agricultural county colleges).

Fig. 2. A tree diagram for the training/test sample.

Fig. 3. A sub-tree structure of the first to the second level for the training sample.

Fig. 4. A sub-tree structure of the first to the second level for the test sample.

Fig. 5. A sub-tree structure for the training sample in the metropolitan area colleges.

Fig. 6. A sub-tree structure for the test sample in the metropolitan area colleges.

Fig. 7. A sub-tree structure for the training sample in the agricultural county colleges.

Fig. 8. A sub-tree structure for the test sample in the agricultural county colleges.

Figures 3 and 4 show the first level of the decision tree for the training/test samples. In Node 0, 65.7/66.8% of the applicants in the training/test groups failed to enroll. The branches below Node 0 are Node 1 and Node 2 (first level). In Node 1 (metropolitan area), 77.0/78.3% of the applicants in the training/test groups failed to enroll, which is a very high percentage. In Node 2 (agricultural county), 51.4/52.6% of the applicants in the training/test groups failed to enroll. This proportion is approximately 25 percentage points lower than that of Node 1, indicating that admission to colleges in agricultural counties is less competitive than admission to those in the metropolitan area. The branches below Node 1 in Figs. 3 and 4 describe enrollment at schools in the metropolitan area, while the branches below Node 2 describe enrollment at schools in the agricultural county.

Admission colleges in the metropolitan area

From Node 1 in Figs. 5 and 6, 77.0/78.3% of the applicants in the training/test groups failed to enroll. The branching structure of the following level is highlighted for Node 1:

(1) In Node 3 (other ≤ 1.2), 86.1/85.4% of the applicants in the training/test groups failed to enroll, indicating that for applicants who opt for a registered college in the metropolitan area, a low bonus score for the English tests (GEPT or TOEIC) is the main factor in enrollment failure.

  (i) In Node 8 (registration threshold reached), 75.9/75.1% of the applicants in the training/test groups failed to enroll, which is a high proportion.

    (I) At Node 19 (technically and artistically gifted = 0), 82.8/79.5% of the applicants in the training/test groups failed to enroll. This shows that even among applicants to metropolitan colleges who reach the registration threshold (with other ≤ 1.2 and technically and artistically gifted = 0), the enrollment failure rate is still approximately 80%. At the fifth level, Node 19 splits into two nodes: Node 27 (comprehensive assessment program ≤ 14; 95.8/95.5% failed to enroll) and Node 28 (comprehensive assessment program > 14; 64.0/60.0% failed to enroll).

      i. At Node 27, 95.5/95.8% of the applicants in the training/test groups failed to enroll. At the sixth level, Node 27 splits into Node 33 (disadvantaged status ≤ 0; 98.3/99.1% failed to enroll) and Node 34 (disadvantaged status > 0; 71.2/72.2% failed to enroll). The enrollment failure rate of applicants with disadvantaged status is lower than that of applicants without it, which shows that disadvantaged status provides a partial safeguard.

      ii. At Node 28, 64.0/60.0% of the applicants in the training/test groups failed to enroll. At the sixth level, Node 28 splits into Node 35 (the distance between the enrollee and the distribution colleges ≤ 0; 52.0/50.9% failed to enroll) and Node 36 (the distance between the enrollee and the distribution colleges > 0; 75.7/69.1% failed to enroll). These results show that the enrollment failure rate increases markedly for nonlocal applicants.

    (II) At Node 20 (technically and artistically gifted > 0), 53.2/61.7% of the applicants in the training/test groups failed to enroll. At the fifth level, Node 20 splits into two nodes: Node 29 (the distance between the enrollee and the distribution colleges ≤ 0; 36.0/34.3% failed to enroll) and Node 30 (the distance between the enrollee and the distribution colleges > 0; 67.9/82.6% failed to enroll). The enrollment failure rate is higher when the enrollee and the target college are in different counties or cities.

  (ii) At Node 9, the registration threshold was not reached, and 100/100% of the applicants in the training/test groups failed to enroll.

(2) At Node 4 (other > 1.2), 64.2/58.3% of the applicants in the training/test groups were successful in enrolling. This result shows that when other > 1.2, the enrollment success rate in metropolitan colleges is markedly higher than that of Node 3. At the third level, Node 4 splits into Node 10 (comprehensive assessment program ≤ 15.4; 76.1/71.4% succeeded in enrolling) and Node 11 (comprehensive assessment program > 15.4; 54.5/56.2% succeeded in enrolling). Comparing the enrollment success rates of the sibling Nodes 10 and 11 shows that applicants with the lower comprehensive assessment program score have the higher enrollment success rate, which is an abnormal phenomenon.
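The metropolitan branch described above can be read as a chain of nested conditions. The sketch below encodes it as a lookup function; the function and argument names are hypothetical, the node numbers, cut points, and training-sample rates are those quoted in the text, and the lower branch of each split is assumed to include the cut point. It returns the deepest node described in the text, which may not be a terminal node of the full tree.

```python
def metropolitan_node(other, threshold_met, gifted, assessment,
                      disadvantaged, distance):
    """Return the deepest described node for a metropolitan-college applicant."""
    if other <= 1.2:                              # Node 3
        if not threshold_met:
            return 9                              # threshold not reached
        if gifted > 0:                            # Node 20
            return 29 if distance <= 0 else 30
        if assessment <= 14:                      # Node 19 -> Node 27
            return 33 if disadvantaged <= 0 else 34
        return 35 if distance <= 0 else 36        # Node 19 -> Node 28
    return 10 if assessment <= 15.4 else 11       # Node 4

# Training-sample rates quoted in the text, in percent.
TRAIN_FAILURE_PCT = {9: 100.0, 33: 98.3, 34: 71.2, 35: 52.0,
                     36: 75.7, 29: 36.0, 30: 67.9}
TRAIN_SUCCESS_PCT = {10: 76.1, 11: 54.5}

def describe(node):
    """Human-readable summary of a node's training-sample outcome rate."""
    if node in TRAIN_FAILURE_PCT:
        return f"Node {node}: {TRAIN_FAILURE_PCT[node]}% failed to enroll"
    return f"Node {node}: {TRAIN_SUCCESS_PCT[node]}% succeeded in enrolling"

# Hypothetical applicant: low English bonus, threshold met, not gifted,
# assessment score 12, no disadvantaged status, local.
node = metropolitan_node(other=0, threshold_met=True, gifted=0,
                         assessment=12, disadvantaged=0, distance=0)
print(describe(node))
```

Written this way, the order of the conditions mirrors the order in which the tree selects its splitting variables, which makes it easy to see that the English bonus score and the registration threshold are evaluated before any other criterion.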

Admissions in agricultural county colleges

In Figs. 7 and 8, Node 2 is divided into three nodes at the second level: Node 5 (the distance between the enrollee and the distribution colleges ≤ 1; 65.7/64.4% succeeded in enrolling), Node 6 (1 < the distance between the enrollee and the distribution colleges ≤ 2; 60.8/61.7% failed to enroll), and Node 7 (the distance between the enrollee and the distribution colleges > 2; 80.3/79.8% failed to enroll). Comparing the three nodes shows that the distance between the enrollee and the admitting college is associated with the ratio of enrollment success to failure at agricultural county colleges. Applicants have a higher success rate when they are in the same county or city as the school of admission or close to it. When applicants and schools span two or more counties and cities, the failure rate is higher.

  1. 1.

    At Node 5, the third level is divided into three nodes by Node 5: Node 12 (comprehensive assessment program  11.2; 73.0/73.3% is the ratio of those who succeeded in enrolling), Node 13 (11.2 < comprehensive assessment program  12.18; 63.8/54.2% is the ratio of those who succeeded in enrolling), and Node 14 (comprehensive assessment program > 12.18; 51.0/59.4% is the ratio of those who succeeded in enrolling). Comparing the results of the three nodes, it is found that the lowest comprehensive assessment program (Node 12) has the highest ratio of successful enrollees. A higher comprehensive assessment program will instead have a lower ratio of successful enrollees. Therefore, there is a situation of reverse selection of talent.

  1. I.

    At Node 12, the fifth level is composed of the branch below Node 12 to Node 23 (writing test  3; 80.0/73.8% is the ratio of those who succeeded in enrolling) and Node 24 (writing test > 3; 70.4/73.1% is the ratio of those who succeeded in enrolling). It turns out that those with lower scores on a writing test (Node 23) will have a higher ratio of successful enrollees, so there is also a case of talent inverse selection. This phenomenon can occur in agricultural county colleges, which may be the result of an applicant abandoning enrollment and the admissions department creating a strategy to fill up the vacant seats with students.

  2. II.

    At Node 14, the fifth level is composed of the branch below Node 14 to Node 25 (Types of junior high-school graduate is the city; 60.0/64.0% is the ratio of those who succeeded in enrolling) and Node 26 (Types of junior high-school graduate include county, private, or national; 73.5/52.6% is the ratio of those who failed to enroll). Here, it will be found that junior high-school graduates in the city have relatively high success rates at enrollment.

Finally, a possible reason for the emergence of the adverse selection of talent may be that the more qualified applicants gave up this admission opportunity and chose to register at other schools. This represents a potentially difficult problem faced by exam-free admissions colleges in agricultural counties.

  1. (2)

    At Node 6, the fourth level is composed of the branch below Node 6 to Node 15 (comprehensive assessment program  8.4; 46.0/33.3% is the ratio of those who failed to enroll) and Node 16 (comprehensive assessment program > 8.4; 63.5/60.9% is the ratio of those who failed to enroll). A higher comprehensive assessment program has a higher ratio of those who failed to enroll, and there is a phenomenon of inverse talent selection.

  (3)

    At Node 7, the fourth level is composed of the branches below Node 7: Node 17 (writing test ≤ 3; 65.1/84.6% is the ratio of those who failed to enroll) and Node 18 (writing test > 3; 83.5/78.2% is the ratio of those who failed to enroll).
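
The reverse-selection pattern reported for Nodes 12 to 14 can be checked mechanically. The following Python sketch takes the enrollment success ratios quoted above (training/test) and tests whether they fall strictly as the comprehensive assessment program band rises. The function name `is_reverse_selection` and the data layout are illustrative assumptions, not part of the study's method.

```python
# Success ratios (%) for Nodes 12-14 as quoted in the text, ordered from the
# lowest to the highest comprehensive assessment program band.
# Each row: (node label, training ratio, test ratio).
NODE_SUCCESS = [
    ("Node 12", 73.0, 73.3),
    ("Node 13", 63.8, 54.2),
    ("Node 14", 51.0, 59.4),
]


def is_reverse_selection(ratios):
    """Return True if success ratios strictly decrease as the score band rises."""
    return all(lower > higher for lower, higher in zip(ratios, ratios[1:]))


train = [row[1] for row in NODE_SUCCESS]
test = [row[2] for row in NODE_SUCCESS]

print("training:", is_reverse_selection(train))
print("test:", is_reverse_selection(test))
```

On these figures the training ratios fall strictly from Node 12 to Node 14, while the test ratios do not (Node 14 exceeds Node 13), although Node 12 still has the highest success ratio in both samples.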

Based on the results of Figs. 3 through 8, seven important decision-making rules are collated in Table 6. These consolidate the decision rules for which the training group’s ratio of predicted failures (or successes) to actual enrollment outcomes reached an accuracy rate of at least 75%. Among them, for Part I (colleges in the metropolitan city), 4 rules determine failure to enroll and 1 rule determines success. For Part II (colleges in the agricultural county), 1 important rule concerns failed enrollment and 1 concerns successful enrollment. These rules may be provided to applicants, their relatives and friends, and the responsible persons at exam-free admissions schools as a reference.

Conclusions

This study used decision tree analysis to explore the reasons behind enrollment failures and successes in the joint enrollment and distribution process for Taiwan’s 5-year junior colleges and to observe whether this admissions system fulfills its expected educational objectives. The established model for the exam-free admissions colleges had a sensitivity of 81.9% (training) and 80.6% (test) for detecting enrollment failures. The location of the college (metropolitan area vs. agricultural county) is the first-level factor in the tree structure; that is, the factors that determine a failure in enrollment differ significantly between the two categories of colleges.
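
For readers unfamiliar with the metric, sensitivity here is the share of actual enrollment failures that the model flags as failures, that is, TP / (TP + FN) with failure treated as the positive class. The Python sketch below computes it from confusion-matrix counts. The function name and the counts in the usage line are made-up placeholders for illustration, not the study's data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of actual positives (here: enrollment failures) the model detects."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("no actual positive cases; sensitivity is undefined")
    return true_positives / actual_positives


# Hypothetical counts for illustration only (not taken from the study).
rate = sensitivity(true_positives=80, false_negatives=20)
print(f"{rate:.1%}")  # 80.0%
```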

First, the failed enrollment percentage in metropolitan area colleges is much higher than that of agricultural county colleges. This shows that metropolitan area exam-free admissions colleges still place relatively high competitive pressure on applicants. In the metropolitan area, a college has the advantage in talent selection. The English test results (other) and the registration threshold are two important factors in successful enrollment. Even if the registration threshold is reached, when other ≤ 1.2:

  (1)

    If technically and artistically gifted (U10) = 0, the probability of failing to enter the college was high, at approximately 80.0%. Applicants should therefore pay attention to their technically and artistically gifted performance, which, together with a better comprehensive assessment program result, can reduce the likelihood of failed enrollment. In addition, there are some safeguards for disadvantaged applicants (see Nodes 33 & 34), and for applicants with a higher comprehensive assessment program result (>14.0), the proportion of enrollment failures depends on the distance between the enrollee and the target college (see Nodes 35 & 36).

  (2)

    If technically and artistically gifted > 0, the proportion of enrollment failures was reduced to 53.2% for the training sample (see Nodes 19 & 20). Among these applicants, a greater distance to the target college was associated with a higher proportion of enrollment failures (see Nodes 29 & 30). This study also found that, for colleges in the metropolitan area, better English test scores (other > 1.2) reduced the proportion of enrollment failures to 35.8% for the training sample (see Nodes 3 & 4); the distance between the enrollee and the target college (see Nodes 29 & 30) and the results of the comprehensive assessment program (see Nodes 37 & 38) were secondary factors.
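
The top-level rules summarized above for metropolitan area colleges can be written as a small lookup. The Python sketch below is only a reading aid: it assumes the registration threshold has been reached, uses the training-sample failure proportions quoted in the text, and ignores the deeper splits (distance, disadvantaged status, comprehensive assessment program). The function and parameter names are hypothetical.

```python
def approx_failure_rate(other_score: float, gifted_score: float) -> float:
    """Approximate training-sample enrollment failure rate (%) for
    metropolitan area colleges, assuming the registration threshold is met.

    Only the first splits quoted in the text are encoded; this is a
    simplification, not the full decision tree.
    """
    if other_score > 1.2:
        return 35.8  # better English test result (other > 1.2)
    if gifted_score > 0:
        return 53.2  # some technically and artistically gifted credit
    return 80.0  # no gifted credit: approximately 80.0% fail to enroll


print(approx_failure_rate(other_score=1.0, gifted_score=0))
```

Encoding the rules this way makes the ordering explicit: the English test result is checked first, and the technically and artistically gifted criterion only matters when that result is at or below 1.2.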

For agricultural county colleges, some applicants retain the flexibility to choose admission, and the colleges may face under-enrollment (see Fig. 1; actual enrollment ratios at schools C and D were 56.73% and 89.05%). The distance between the enrollee and the target college is a major factor in enrollment failures (see Nodes 5, 6, & 7 in Figs. 2 and 3); for applicants from the local or neighboring counties and cities, the enrollment failure rate is lower even when the results of the comprehensive assessment program are not ideal (see Nodes 12, 13, & 14 in Figs. 2 and 3). Moreover, some applicants with better results in the comprehensive assessment program or writing test (see Nodes 23 & 24 in Figs. 2 and 3) have a higher enrollment failure rate, showing that the colleges experience a reverse selection of talent. This phenomenon may arise because better-performing applicants prefer to choose a different college or because the conditions for admission to these colleges are lax.

The results of this study show that although the admissions systems for Taiwan’s national secondary schools are based on the goal of a multi-intelligence balanced education, the admission competition between metropolitan area colleges and agricultural county colleges is worsened under the influence of enrollees’ college preferences. The enrolling colleges in the metropolitan areas have a greater ability to make selections, and the enrolling colleges in the agricultural county have been marginalized by the reverse selection of talent. This reverse selection makes it harder for colleges in agricultural counties to select excellent students, prevents the effective use of educational resources, and is not conducive to the local cultivation of talent. To address the under-enrollment of colleges in agricultural counties, it is proposed that the approach of Wu (2020) be used to recommend enrollment policies in agricultural county college districts that give students guaranteed or priority admission, improving student enrollment at schools of their choice.

Data availability

Data sharing is not applicable to this research as no data were generated or analyzed.

References

  • Amburgey WOD, Yi J (2011) Using business intelligence in college admissions: a strategic approach. Int J Bus Intell Res 2(1):1–15

  • Asif R, Merceron A, Ali SA, Haider NG (2017) Analyzing undergraduate students’ performance using educational data mining. Comput Educ 113:177–194

  • Aulck L, Nambi D, Velagapudi N, Blumenstock J, West J (2019) Mining university registrar records to predict first-year undergraduate attrition. In: Proceedings of the 12th international conference on educational data mining (EDM 2019). International Educational Data Mining Society, Montreal, Canada, pp. 9–18

  • Berry MA, Linoff G (1997) Data mining techniques for marketing, sales, and customer support. Wiley and Sons, New York

  • Chou CP (2009) Toward a twelve-year basic education program in Taiwan. Bull Educ Resour Res (Taiwan) 42:25–42

  • Delibasic B, Vukicevic M, Jovanovic M, Suknovic M (2013) White-box or black-box decision tree algorithms: which to use in education? IEEE Trans Educ 56(3):287–291

  • Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery in databases. AI Mag 17(3):37–54

  • Gardner H (2011) Intelligence, creativity, ethics: reflections on my evolving research interests. Gift Child Q 55(4):302–304

  • Han S (2022) Identifying the roots of inequality of opportunity in South Korea by application of algorithmic approaches. Humanit Soc Sci Commun 9(1):18

  • Howard E, Meehan M, Parnell A (2018) Contrasting prediction methods for early warning systems at undergraduate level. Internet High Educ 37:66–75

  • Kingsford C, Salzberg SL (2008) What are decision trees? Nat Biotechnol 26(9):1011–1013

  • Kirby NF, Dempster ER (2014) Using decision tree analysis to understand foundation science student performance. Insight gained at one South African university. Int J Sci Educ 36(17):2825–2847

  • Kiss B, Nagy M, Molontay R, Csabay B (2019) Predicting dropout using high school and first-semester academic achievement measures. In: 2019 17th international conference on emerging elearning technologies and applications (ICETA). IEEE, Starý Smokovec, Slovakia, pp. 383–389

  • Križanić S (2020) Educational data mining using cluster analysis and decision tree technique: a case study. Int J Eng Bus Manag 12:184797902090867

  • Lee L, Liu YS (2021) Use of decision trees to evaluate the impact of a holistic music educational approach on children with special needs. Sustainability 13(3):1410

  • Lin CF, Yeh YC, Hung YH, Chang RI (2013) Data mining for providing a personalized learning path in creativity: an application of decision trees. Comput Educ 68:199–210

  • Lynch CF (2017) Who prophets from big data in education? New insights and new challenges. Theory Res Educ 15(3):249–271

  • Maltz EN, Murphy KE, Hand ML (2007) Decision support for university enrollment management: implementation and experience. Decis Support Syst 44(1):106–123

  • Ministry of Education (2009) The implementation plan for expansion of application admission on the high school and five-year junior college. Ministry of Education in Taiwan, Taipei City

  • Nagy M, Molontay R (2018) Predicting dropout in higher education based on secondary school performance. In: 2018 IEEE 22nd international conference on intelligent engineering systems (INES). IEEE, Las Palmas de Gran Canaria, Spain, pp. 389–394

  • Oranye NO (2016) The validity of standardized interviews used for university admission into health professional programs: a Rasch analysis. SAGE Open 6(3):215824401665911

  • Park E, Dooris J (2020) Predicting student evaluations of teaching using decision tree analysis. Assess Eval High Educ 45(5):776–793

  • PhridviRaj MSB, GuruRao CV (2014) Data mining – past, present and future – a typical survey on data streams. Procedia Technol 12:255–263

  • Ragab AHM, Mashat AFS, Khedra AM (2012) HRSPCA: hybrid recommender system for predicting college admission. In: 2012 12th International Conference on Intelligent Systems Design and Applications (ISDA). IEEE, Kochi, India, pp. 107–113

  • Rastrollo-Guerrero JL, Gómez-Pulido JA, Durán-Domínguez A (2020) Analyzing and predicting students’ performance by means of machine learning: a review. Appl Sci 10(3):1042

  • Singer G, Golan M, Rabin N, Kleper D (2020) Evaluation of the effect of learning disabilities and accommodations on the prediction of the stability of academic behaviour of undergraduate engineering students using decision trees. Eur J Eng Educ 45(4):614–630

  • Tanna M (2012) Decision support system for admission in engineering colleges based on entrance exam marks. Int J Comput Appl 52(11):38–41

  • Vialardi C, Chue J, Peche JP, Alvarado G, Vinatea B, Estrella J, Ortigosa Á (2011) A data mining approach to guide students through the enrollment process based on academic performance. User Model User Adapt Interact 21(1-2):217–248

  • Waterhouse L (2006) Multiple intelligences, the Mozart effect, and emotional intelligence: a critical review. Educ Psychol 41(4):207–225

  • Wu MJ (2020) Predicting outcomes of school-choice policies using district characteristics: empirical evidence from Hong Kong. J School Choice 14(4):633–654

  • Yao G, Wang J, Cui B, Ma Y (2022) Quantifying effects of tasks on group performance in social learning. Humanit Soc Sci Commun 9(1):282

  • Zeng X, Yuan S, Li Y, Zou Q (2014) Decision tree classification model for popularity forecast of Chinese colleges. J Appl Math 2014:675806

Acknowledgements

The authors would like to thank Mr. Kuo-En Wang (Director of the Admissions Center) for providing the anonymized participant registration data set used in this study.

Author information

Authors and Affiliations

  1. College of Humanities & Social Sciences, Chaoyang University of Technology, 168, Jifeng E. Road, Wufeng District, Taichung, 413310, Taiwan

    Ying-Sing Liu & Liza Lee

Contributions

Y-SL designed the study, analyzed the data, and wrote the manuscript. LL contributed to the study supervision and project administration. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Ying-Sing Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

The study follows the principles of the Declaration of Helsinki. This article does not contain any studies with human participants or animals performed by the authors. The study addresses an educational issue and does not involve human experiments.

Informed consent

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Liu, YS., Lee, L. Evaluation of college admissions: a decision tree guide to provide information for improvement. Humanit Soc Sci Commun 9, 390 (2022). https://doi.org/10.1057/s41599-022-01413-z

