Understanding ChatGPT Adoption among University Students in Yogyakarta, Indonesia: An Extended Value-Based Adoption Model Using Hybrid Structural Equation Modeling and Machine Learning Analysis
Abstract:
ChatGPT, a widely used generative AI tool, has attracted sustained attention from researchers seeking to understand the factors that influence its adoption in higher education. This study examined the determinants of ChatGPT adoption among university students in Yogyakarta, Indonesia, one of the country’s largest educational centers, with more than 100 higher education institutions. Drawing on the value-based adoption model (VAM), the study incorporated three AI-related factors, namely AI self-efficacy (ASE), perceived academic value (PAV), and privacy concerns (PC), to better explain students’ AI adoption behavior. A hybrid analytical approach combining Partial Least Squares Structural Equation Modeling (PLS-SEM) and machine learning (ML) techniques, including Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Networks (ANN) with SHapley Additive exPlanations (SHAP)-based interpretation, was employed to test ten hypotheses with survey data collected from 484 students across five selected universities in Yogyakarta. The results indicated that perceived value (PV) was the strongest predictor of adoption intention (β = 0.453), which in turn strongly predicted actual ChatGPT usage (β = 0.521). Among the benefit-related factors, perceived usefulness (PU), perceived enjoyment (PE), and ASE significantly enhanced PV, whereas PC (β = −0.213) represented the most influential barrier to adoption. The ML models produced consistent findings, with XGBoost achieving the highest predictive performance (AUC = 0.912). SHAP analysis further highlighted PV and PC as being among the most influential variables. By extending the VAM with AI-specific constructs and integrating SEM with ML techniques, this study contributes to an enhanced understanding of generative AI adoption in higher education and offers actionable insights for policymakers and university administrators in support of responsible AI integration in Indonesian universities.
1. Introduction
The proliferation of generative AI tools has created a paradigm shift in global higher education. Among these tools, ChatGPT, developed by OpenAI and publicly released in November 2022, stands out as the fastest-growing consumer application in history, reportedly reaching 100 million active users within two months of its launch (Ajalo et al., 2025). Its capacity to generate human-like and contextually coherent text has prompted educators, students, and policymakers alike to reassess the role of AI in academic learning, research assistance, and pedagogical design. In Indonesian higher education specifically, ChatGPT adoption is accelerating rapidly, yet empirical evidence on the determinants of this adoption, particularly within specific geographic and institutional contexts such as Yogyakarta, remains limited.
Yogyakarta is widely regarded as the educational hub of Indonesia, hosting over 100 higher education institutions, including state and private universities that collectively enroll hundreds of thousands of students annually. Within this ecosystem, universities affiliated with Muhammadiyah have progressively embraced digital transformation. A preliminary literature review revealed the significant potential of AI and ChatGPT in enhancing student well-being within smart education environments, thus underscoring the relevance of locally situated empirical investigation (Zain Abdillah et al., 2023).
Technology adoption theory provides the conceptual backbone for understanding why individuals choose to adopt or resist new technologies. The Technology Acceptance Model (TAM), proposed by Davis (1989), introduced perceived usefulness (PU) and perceived ease of use as foundational predictors of behavioral intention. Building on this, Venkatesh et al. (2003) developed the Unified Theory of Acceptance and Use of Technology (UTAUT), which extended TAM by incorporating social influence and facilitating conditions, the latter referring to the degree to which individuals perceive that adequate resources, knowledge, and support are available to facilitate technology use. However, both models have been criticized for neglecting the cost–benefit dimension inherent in users’ decision-making processes. To address this limitation, Kim et al. (2007) proposed the value-based adoption model (VAM), which framed technology adoption as a trade-off between perceived benefits, such as usefulness and enjoyment, and perceived sacrifices, such as technical barriers and financial cost.
Despite a growing body of literature applying the VAM to the acceptance of educational technology, relatively few studies have applied it specifically to ChatGPT adoption among university students in Southeast Asia. Moreover, most prior studies have relied exclusively on Structural Equation Modeling (SEM) for theory testing. While statistically robust for confirmatory hypothesis testing, SEM may not capture nonlinear relationships or rank predictor importance in the way that machine learning (ML) algorithms can. A two-step hybrid SEM–ML analytical framework combines the explanatory depth of SEM with the predictive precision of approaches such as Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Networks (ANN) (Kanbul et al., 2024; Salloum et al., 2024).
This study aims to: (1) extend the VAM framework by incorporating constructs relevant to generative AI adoption, specifically AI self-efficacy (ASE), perceived academic value (PAV), and privacy concerns (PC); (2) empirically test this extended model using PLS-SEM with data collected from university students in Yogyakarta; and (3) validate and complement the SEM findings using ML algorithms to identify the most influential predictors of ChatGPT adoption. By situating this inquiry within the Indonesian context, this study contributes to both theoretical advancement and practical policy guidance for AI integration in higher education.
2. Literature Review
The VAM was originally proposed by Kim et al. (2007) as a response to the perceived constraints of the TAM in explaining technology adoption behavior. While the TAM focuses predominantly on PU and ease of use, the VAM introduces a dual-evaluation mechanism wherein users evaluate technology through both the benefits they expect to receive (e.g., functional usefulness and hedonic enjoyment) and the sacrifices they anticipate incurring (e.g., perceived cost (PCo) and perceived technical complexity (PTC)). The net trade-off between these two dimensions yields a perceived value (PV) judgment that, in turn, drives adoption intention (Kim et al., 2007).
The VAM has been validated across a wide range of technology contexts, including mobile commerce, smart home devices, and health information technologies. Within the context of AI-powered educational tools, the VAM is particularly appropriate because students weigh ChatGPT’s academic utility against concerns about reliability, over-reliance, and ethical issues. Al-Abdullatif (2023), applying the VAM to chatbot adoption in Saudi Arabian higher education, reported that perceived benefits, especially usefulness, significantly predicted adoption intention, while perceived sacrifices moderated this relationship. Al-Abdullatif & Alsubaie (2024) further extended the VAM with AI literacy as a moderating construct, revealing that students with higher AI literacy perceived greater value in using ChatGPT for learning.
A critical conceptual clarification is necessary before advancing the research hypotheses. Two constructs in the extended VAM, PU and PAV, may appear conceptually proximate but represent theoretically distinct constructs operating through different mechanisms in the model.
PU, rooted in the TAM (Davis, 1989), captures the general belief that a technology will improve performance outcomes. It is domain-neutral and measures the instrumental utility of a tool in a broad sense, whether using ChatGPT “improves my academic performance” in general. PU operates as a benefit dimension within the VAM, indirectly influencing adoption behavior via the mediating role of PV (H1: PU → PV).
PAV, by contrast, is an education-specific construct that captures the degree to which a student perceives ChatGPT as providing substantive value for their specific field of study, learning tasks, and academic development (Al-Abdullatif & Alsubaie, 2024; Habibi et al., 2023). PAV is not merely “useful in general”; it encodes an evaluative judgment about domain-specific educational worth, for example, whether ChatGPT’s output meaningfully supports literature synthesis, problem-solving in STEM, or language-writing tasks relevant to the student’s discipline. Critically, PAV is hypothesized to directly predict adoption intention (H4: PAV → Adoption Intention) rather than operating through PV, because education-specific value perceptions function as an intrinsic motivator that independently shapes behavioral intent. Table 1 summarizes the key conceptual distinctions between these two constructs.
Dimension | PU | PAV |
Conceptual origin | Technology acceptance model (TAM) (Davis, 1989) | Value-based adoption model (VAM) extension; education-specific (Habibi et al., 2023) |
Level of specificity | General/domain-neutral | Domain-specific: Higher education and learning tasks |
Core evaluation | Does this tool improve my overall performance? | Does this tool provide specific educational value for my field? |
Theoretical pathway | Enhances Perceived Value (PV) (H1) | Directly predicts Adoption Intention (H4) |
Sample item | “Using ChatGPT improves my academic performance.” | “ChatGPT provides substantial educational value for my field.” |
Supported by | Al-Abdullatif (2023); Kim et al. (2007) | Al-Abdullatif & Alsubaie (2024); Habibi et al. (2023) |
This distinction ensures that the extended VAM captures both the general instrumental utility of ChatGPT (PU) and its context-specific educational worth (PAV), thus avoiding construct conflation and maintaining the theoretical integrity of the model.
Since its release, ChatGPT has quickly become one of the most studied AI tools in higher education. Scholars have approached ChatGPT adoption through multiple theoretical lenses, including TAM, UTAUT, UTAUT2, and diffusion of innovations theory. Habibi et al. (2023), in a landmark study among Indonesian higher education students, applied the UTAUT2 model to a sample of 1,117 respondents and reported that facilitating conditions were the strongest determinant of behavioral intention to use ChatGPT, while effort expectancy did not significantly predict intention.
Parveen et al. (2024) employed SEM among students across three Pakistani universities and found that flow experience and system quality mediated the relationship between UTAUT constructs and ChatGPT use. Polyportis & Pahos (2025) extended the meta-UTAUT framework with anthropomorphism, trust, and institutional policy in a Dutch higher education context. They reported that institutional policy negatively moderated the effect of behavioral intention on ChatGPT adoption behavior. Analyzing ChatGPT adoption among Indian university students, Raman et al. (2024) combined Rogers’ diffusion of innovations theory with sentiment analysis and identified relative advantage and compatibility as the strongest adoption drivers. Habibi et al. (2024) further investigated ChatGPT acceptance through UTAUT and the Theory of Planned Behavior across five Indonesian universities.
A growing body of methodological literature has argued for the integration of SEM with ML algorithms as a complementary analytical strategy. SEM, especially the PLS-SEM variant, excels at testing theoretically derived causal relationships. However, it assumes linear relationships and may not adequately capture interaction effects, nonlinearities, or the relative importance of predictors across high-dimensional datasets (Salloum et al., 2024). ML classifiers such as RF, XGBoost, Support Vector Machines, and ANN offer distinct advantages in predictive accuracy and feature importance ranking, without assumptions of linearity or normality.
More importantly, the hybrid SEM–ML approach adopted in this study is not merely a replication exercise in which ML “confirms” what SEM already established. Rather, ML contributes three analytically distinct insights beyond SEM’s capabilities. First, through SHapley Additive exPlanations (SHAP)-based feature importance ranking, ML quantifies the relative predictive weight of all constructs simultaneously in a model-agnostic manner, enabling direct comparison with SEM path coefficients across competing algorithms (RF, XGBoost, and ANN). Second, ML algorithms, particularly XGBoost with its gradient-boosted tree architecture, can detect nonlinear relationships and interaction effects between predictors (e.g., the interaction between ASE and PC) that a linear PLS-SEM specification does not capture. Third, ML’s out-of-sample predictive performance metrics (AUC-ROC, F1-score) evaluate the generalizability of the model to new data, a property that SEM’s R² cannot assess. Kanbul et al. (2024) demonstrated that ML-based approaches outperformed traditional regression models in predictive precision. Meanwhile, Salloum et al. (2024) applied a hybrid SEM–ML framework using UTAUT with XGBoost and neural network classifiers. The present study adopted this two-step hybrid approach to produce a comprehensive and policy-relevant analysis of ChatGPT adoption in Yogyakarta.
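To make the idea behind SHAP attributions concrete, the following minimal Python sketch computes exact Shapley values for a two-feature toy model. The model, its coefficients, and the zero baseline used for absent features are illustrative assumptions and are not the models fitted in this study; the SHAP library estimates the same kind of quantity against a background dataset rather than a fixed baseline.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, players):
    """Exact Shapley values by enumerating every coalition (small inputs only)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                coalition = frozenset(subset)
                # Probability weight of this coalition size in a random ordering.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(coalition | {p}) - value_fn(coalition))
        phi[p] = total
    return phi


def toy_model_output(present):
    """Hypothetical model with an interaction; absent features take baseline 0."""
    pv = 1.0 if "PV" in present else 0.0
    pc = 1.0 if "PC" in present else 0.0
    return 0.5 * pv - 0.2 * pc + 0.3 * pv * pc


phi = shapley_values(toy_model_output, ["PV", "PC"])
print(phi)
```

In this toy case the interaction term is shared equally between the two features, and the attributions sum to the model output for the full feature set, which is the additivity property that makes SHAP values comparable across constructs.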
A review of the existing literature revealed four notable gaps that the present study addressed. First, no published study has focused specifically on Yogyakarta as a sub-national educational hub. Second, most prior studies applied TAM or UTAUT frameworks, with relatively few extending the VAM to the ChatGPT context. Third, most of the existing methodological approaches are limited to SEM or simple regression, without incorporating ML predictive modeling. Fourth, private Islamic universities in Yogyakarta have not been specifically represented in the ChatGPT adoption literature. This study addressed all four research gaps by providing geographically specific evidence, applying and extending the VAM, employing a hybrid SEM–ML strategy, and including students from diverse institutional types.
3. Methodology
This study adopted a quantitative, cross-sectional survey design grounded in the positivist research paradigm. The investigation employed a two-stage hybrid methodological strategy that sequentially integrated PLS-SEM with ML classification algorithms, thereby combining confirmatory theory testing with data-driven predictive modeling. The study extended the VAM (Kim et al., 2007) by incorporating ASE and PAV as additional benefit dimensions, and PC as an additional sacrifice dimension. The integrated research framework is illustrated in Figure 1.

The conceptual model draws on the original VAM (Kim et al., 2007), which posits that technology adoption is determined by the net PV users derive from a technology, understood as the difference between perceived benefits and perceived sacrifices. In the context of ChatGPT adoption, the study operationalized four benefit constructs (PU, PE, ASE, and PAV) and three sacrifice constructs (PTC, PC, and PCo). All of these except PAV, which is hypothesized to act on adoption intention directly, operate through PV to predict adoption intention and actual use. Table 2 presents all ten research hypotheses.
H# | Path of Constructs | Hypothesis Statement | Direction |
|---|---|---|---|
H1 | PU → PV | PU positively influences students’ PV. | Positive (+) |
H2 | PE → PV | PE positively influences students’ PV. | Positive (+) |
H3 | ASE → PV | Higher ASE positively influences students’ PV. | Positive (+) |
H4 | PAV → Adoption Intention | PAV directly and positively predicts adoption intention. | Positive (+) |
H5 | PTC → PV | PTC negatively influences PV. | Negative (−) |
H6 | PC → PV | Greater PC negatively influence PV. | Negative (−) |
H7 | PCo → PV | PCo negatively influences PV. | Negative (−) |
H8 | PV → Adoption Intention | PV positively and significantly predicts adoption intention. | Positive (+) |
H9 | Adoption Intention → Actual Use | Adoption intention significantly and positively predicts actual ChatGPT use. | Positive (+) |
H10 | Indirect Effects (via PV) | PV mediates the relationship between benefit/sacrifice constructs and adoption intention. | Mediation |
The target population comprised all currently enrolled undergraduate and postgraduate students at universities in Yogyakarta, Indonesia, who had been exposed to or were aware of ChatGPT. Yogyakarta is home to over 100 higher education institutions with a combined enrollment exceeding 400,000 students. Five universities were purposively selected to ensure representation across institutional types: one flagship state university (Universitas Gadjah Mada/UGM), one state education university (Universitas Negeri Yogyakarta/UNY), two private Islamic universities (Universitas Muhammadiyah Yogyakarta/UMY and Universitas Islam Indonesia/UII), and one private Catholic university (Universitas Atma Jaya Yogyakarta/UAJY). This deliberate diversity was intended to capture varying institutional cultures, governance structures, and student demographics, from the Islamic values of Muhammadiyah-affiliated UMY and of UII to the secular academic traditions of the state universities and the Catholic tradition of UAJY. The sample size was determined following Hair et al. (2019), yielding a target sample of n = 484 with a 15% buffer for incomplete responses. Table 3 presents the sampling allocation.
Institution | Type | Target n | Total Percentage (%) |
|---|---|---|---|
Universitas Muhammadiyah Yogyakarta (UMY) | Private Islamic | 120 | 25% |
Universitas Gadjah Mada (UGM) | State (Flagship) | 100 | 21% |
Universitas Negeri Yogyakarta (UNY) | State (Education) | 90 | 19% |
Universitas Islam Indonesia (UII) | Private Islamic | 90 | 19% |
Universitas Atma Jaya Yogyakarta (UAJY) | Private (Catholic) | 84 | 17% |
Total | | 484 | 100% |
Data were collected via a structured, self-administered online questionnaire distributed through Google Forms. All measurement items were adapted from validated scales in the prior technology adoption literature, tailored to the ChatGPT and higher education context, and rated on a 5-point Likert scale (1 = Strongly Disagree; 5 = Strongly Agree). The instrument was translated into Bahasa Indonesia, back-translated by an independent bilingual expert, and reconciled by a panel of three academic reviewers. A pilot test with 40 students confirmed reliability (Cronbach’s alpha ≥ 0.7 for all constructs). Table 4 presents the finalized measurement instruments.
Construct | Items | Sample Item (Adapted) | Source(s) |
Perceived Usefulness (PU) | 4 | Using ChatGPT improves my academic performance. | Davis (1989); Kim et al. (2007) |
Perceived Enjoyment (PE) | 4 | I find it enjoyable to interact with ChatGPT for academic tasks. | Al-Abdullatif (2023); Kim et al. (2007) |
AI Self-Efficacy (ASE) | 5 | I am confident in my ability to use ChatGPT effectively. | Al-Abdullatif & Alsubaie (2024); Compeau & Higgins (1995) |
Perceived Academic Value (PAV) | 5 | ChatGPT provides substantial educational value for my field. | Habibi et al. (2023); Parveen et al. (2024) |
Perceived Technical Complexity (PTC) | 4 | I find it technically difficult to obtain good results from ChatGPT. | Kim et al. (2007) |
Privacy Concerns (PC) | 5 | I am worried about the privacy of my data when using ChatGPT. | Dinev & Hart (2006); Raman et al. (2024) |
Perceived Cost (PCo) | 4 | The time I spend learning to use ChatGPT effectively is costly. | Kim et al. (2007) |
Perceived Value (PV) | 5 | Overall, using ChatGPT provides more benefits than costs. | Habibi et al. (2023); Kim et al. (2007) |
Adoption Intention | 5 | I intend to continue using ChatGPT for my academic activities. | Kim et al. (2007); Venkatesh et al. (2003) |
Actual Use Behavior (AUB) | 5 | I regularly use ChatGPT to assist in doing assignments and learning. | Habibi et al. (2023); Niu & Mvondo (2024) |
Total | 46 | | |
In the first analytical stage, PLS-SEM (SmartPLS 4.0) was employed to test the structural model derived from the extended VAM. PLS-SEM was preferred over CB-SEM, given the model’s mixed reflective-formative components, prediction-oriented objectives, and the non-normal data distributions common in survey-based research (Hair et al., 2019; Ringle et al., 2020). Following the two-step approach of Anderson & Gerbing (1988), the measurement model was assessed before the structural model was evaluated. Bootstrapping with 5,000 subsamples was used to derive bias-corrected 95% confidence intervals and to test mediation effects (Preacher & Hayes, 2008). Table 5 summarizes the model evaluation criteria.
Assessment Level | Criterion | Threshold | Reference |
Measurement model | Cronbach’s alpha (α) | ≥0.70 | Nunnally (1978) |
Measurement model | Composite reliability | ≥0.70 | Hair et al. (2019) |
Measurement model | Average variance extracted | ≥0.50 | Fornell & Larcker (1981) |
Measurement model | HTMT ratio | <0.90 | Henseler et al. (2015) |
Measurement model | Outer loadings | ≥0.70 | Hair et al. (2019) |
Structural model | R² | >0.10 (weak); >0.33 (moderate) | Cohen (1988) |
Structural model | f² (effect size) | 0.02 small; 0.15 medium; 0.35 large | Cohen (1988) |
Structural model | Q² (predictive relevance) | >0 indicates relevance | Geisser (1974) |
Structural model | VIF (multicollinearity) | <3.3 | Hair et al. (2019) |
Mediation | Indirect effect | BC 95% CI excludes zero | Preacher & Hayes (2008) |
Mediation | Variance accounted for (VAF) | >20% partial; >80% full | Hair et al. (2019) |
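As an illustration of how the first three measurement-model criteria in Table 5 are computed, the short Python sketch below implements the standard formulas for Cronbach’s alpha, composite reliability, and average variance extracted. The loadings and pilot responses are hypothetical placeholders rather than values from this study, and in practice these statistics are reported directly by SmartPLS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list of k item ratings per respondent."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def composite_reliability(loadings):
    """CR from standardized outer loadings of one reflective construct."""
    squared_sum = sum(loadings) ** 2
    error_var = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_var)


def average_variance_extracted(loadings):
    """AVE is the mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical loadings for a four-item construct (not taken from the study).
loadings = [0.72, 0.78, 0.81, 0.75]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical 5-point Likert responses from five pilot respondents.
pilot = [[4, 5, 4, 4], [3, 3, 2, 3], [5, 5, 4, 5], [2, 3, 2, 2], [4, 4, 3, 4]]
alpha = cronbach_alpha(pilot)

print(round(cr, 3), round(ave, 3), round(alpha, 3))
print(cr >= 0.70, ave >= 0.50, alpha >= 0.70)
```

Each statistic is then compared against its Table 5 threshold, construct by construct.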
Three ML algorithms were employed in the second analytical stage: (1) RF; (2) XGBoost; and (3) ANN. Each algorithm predicted adoption intention (binary: high vs. low, split at the scale midpoint) using latent variable scores from the PLS-SEM measurement model as input features. All ML analyses were implemented in Python using scikit-learn (version 1.4+), XGBoost (version 2.0+), and the SHAP library. Model performance was evaluated using 10-fold stratified cross-validation with AUC-ROC, F1-score, RMSE, and MAE metrics. SHAP values were computed to provide interpretable and model-agnostic feature importance rankings for direct comparison with SEM path coefficients.
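The evaluation protocol described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study’s pipeline: the construct scores are simulated with NumPy, the coefficients used to generate the intention score are arbitrary, and only the RF classifier from scikit-learn is shown so that the example has no further dependencies (XGBoost, ANN, and SHAP are omitted).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

rng = np.random.default_rng(42)
n = 464  # size of the final analytical sample
features = ["PU", "PE", "ASE", "PAV", "PTC", "PC", "PCo", "PV"]

# Simulated standardized construct scores standing in for PLS-SEM latent scores.
X = rng.normal(size=(n, len(features)))

# Simulated 1-5 intention score driven by PV and PAV (arbitrary coefficients).
latent = 0.45 * X[:, 7] + 0.20 * X[:, 3] + rng.normal(scale=0.8, size=n)
intention = np.clip(3.0 + latent, 1.0, 5.0)

# Binary target: high vs. low intention, split at the scale midpoint of 3.
y = (intention > 3.0).astype(int)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
model = RandomForestClassifier(n_estimators=200, random_state=42)
scores = cross_validate(model, X, y, cv=cv, scoring=["roc_auc", "f1"])

mean_auc = scores["test_roc_auc"].mean()
mean_f1 = scores["test_f1"].mean()
print(f"AUC-ROC: {mean_auc:.3f}, F1: {mean_f1:.3f}")
```

Stratification keeps the proportion of high and low intention cases similar in every fold, which matters when the midpoint split produces unbalanced classes.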
This study was conducted in accordance with ethical guidelines and the principles of the Declaration of Helsinki. Ethical approval was obtained from the Institutional Review Board of Universitas Muhammadiyah Yogyakarta (Protocol No. 569/A.3-III/DRP/VIII/2025) prior to data collection. All participants received a full information sheet and provided digital informed consent. No personally identifiable information was collected; all responses were anonymized and stored in a password-protected institutional server.
Content validity was established through expert review by three academics in information systems and educational technology. Construct validity was assessed through the convergent and discriminant validity criteria detailed in Table 5. Common method bias was addressed through procedural remedies and post-hoc testing using Harman’s single-factor test and full-collinearity VIF (Kock, 2015). ML generalizability was assessed through 10-fold stratified cross-validation and the reporting of both in-sample and out-of-sample performance metrics.
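A minimal sketch of Harman’s single-factor test is given below, assuming the common operationalization in which the first unrotated principal component of all items is inspected and a share of variance below 50% is taken as reassuring. Both the 50% rule of thumb and the simulated item matrix are assumptions introduced for illustration; they are not reported in this study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated responses: 464 respondents x 46 items (hypothetical, uncorrelated).
items = rng.normal(size=(464, 46))

# Eigenvalues of the item correlation matrix give the variance per component.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)

# Share of total variance captured by the largest single component.
first_factor_share = eigenvalues.max() / eigenvalues.sum()
print(f"First factor explains {first_factor_share:.1%} of the variance")
print("Below 50% rule of thumb:", first_factor_share < 0.50)
```

With real survey data the item matrix would replace the simulated one, and the full-collinearity VIF check would be run alongside this test.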
4. Results
A total of 484 responses were collected from students across the five target institutions in Yogyakarta. After excluding 12 incomplete responses and 8 from participants who did not meet the ChatGPT awareness screening criterion, a final analytical sample of n = 464 was obtained (usable response rate: 95.9%). Of the final sample, 54.3% identified as female and 45.7% as male. The majority of respondents were undergraduate students (83.4%), with 16.6% enrolled in postgraduate programs. The mean age was 21.4 years (SD = 2.3), and approximately 71.8% of respondents reported using ChatGPT at least once per week.
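The sample arithmetic above can be reproduced in a few lines of Python; the variable names are illustrative.

```python
collected = 484        # responses received
incomplete = 12        # excluded as incomplete
failed_screening = 8   # excluded for not meeting the awareness criterion

final_n = collected - incomplete - failed_screening
usable_rate = 100 * final_n / collected

print(final_n, round(usable_rate, 1))  # 464 95.9
```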
The measurement model was assessed for internal consistency reliability, convergent validity, and discriminant validity using SmartPLS 4.0. All constructs achieved Cronbach’s alpha values above 0.7 (range: 0.741–0.893) and composite reliability values above 0.7 (range: 0.798–0.921). Average variance extracted values ranged from 0.512 to 0.671, all exceeding the 0.5 threshold (Fornell & Larcker, 1981), hence confirming convergent validity. All outer loadings exceeded 0.7. Discriminant validity was confirmed using the HTMT criterion; all HTMT ratios were below 0.9 (range: 0.421–0.867). Full collinearity VIF values ranged from 1.234 to 2.891, all below 3.3 (Kock, 2015), suggesting that common method bias was not a serious concern.
The structural model was assessed using PLS-SEM with bootstrapping (5,000 subsamples). The R² value for PV was 0.612; the R² for adoption intention was 0.574, and for actual use behavior was 0.471. Q² values for all endogenous constructs exceeded zero (range: 0.298–0.441), thus confirming predictive relevance. Table 6 presents the complete hypothesis testing results. All ten hypotheses were supported.
Hypothesis/Path | β | SE | p-Value | Decision | f²/VAF |
H1: PU → PV | 0.312 | 0.041 | <0.001 | Supported | 0.097 |
H2: PE → PV | 0.278 | 0.038 | <0.001 | Supported | 0.077 |
H3: ASE → PV | 0.241 | 0.044 | 0.001 | Supported | 0.058 |
H4: PAV → Adoption Intention | 0.198 | 0.047 | 0.003 | Supported | 0.039 |
H5: PTC → PV | −0.187 | 0.040 | 0.004 | Supported | 0.035 |
H6: PC → PV | −0.213 | 0.043 | <0.001 | Supported | 0.045 |
H7: PCo → PV | −0.156 | 0.039 | 0.009 | Supported | 0.024 |
H8: PV → Adoption Intention | 0.453 | 0.051 | <0.001 | Supported | 0.205 |
H9: Adoption Intention → Actual Use | 0.521 | 0.048 | <0.001 | Supported | 0.271 |
H10: Indirect Effects (via PV) | 0.181 | 0.037 | 0.002 | Supported (Partial) | VAF = 0.67 |
PV was shown to be the strongest predictor of adoption intention (β = 0.453, p < 0.001, f² = 0.205), and adoption intention was the strongest predictor of actual use behavior (β = 0.521, p < 0.001, f² = 0.271). Among the benefit constructs, PU exerted the strongest influence on PV (β = 0.312), followed by perceived enjoyment (PE) (β = 0.278) and ASE (β = 0.241). Among the sacrifice constructs, PC were the most influential (β = −0.213), followed by PTC (β = −0.187) and PCo (β = −0.156). Mediation analysis (H10) confirmed partial mediation by PV (VAF (variance accounted for) = 0.67).
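The bootstrap logic behind the standard errors and confidence intervals can be illustrated with a simplified Python sketch. It makes several assumptions that differ from the study’s procedure: the data are simulated, a single standardized predictor stands in for the PLS path model, and a plain percentile interval is used instead of the bias-corrected interval produced by SmartPLS.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 464

# Simulated standardized scores for one predictor and one outcome (hypothetical).
x = rng.normal(size=n)
y = 0.45 * x + rng.normal(scale=0.9, size=n)


def standardized_beta(x, y):
    """With a single predictor, the standardized slope equals Pearson's r."""
    return np.corrcoef(x, y)[0, 1]


n_boot = 5000
boot = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)  # resample respondents with replacement
    boot[b] = standardized_beta(x[idx], y[idx])

estimate = standardized_beta(x, y)
se = boot.std(ddof=1)
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"beta = {estimate:.3f}, SE = {se:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

A path is treated as significant at the 5% level when the interval excludes zero, which is the same decision rule applied to the indirect effect in the mediation test.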
Stage 2 ML analyses were conducted using RF, XGBoost, and ANN with 10-fold stratified cross-validation. Table 7 reports the predictive performance metrics.
Algorithm | AUC-ROC | F1-Score | RMSE | MAE |
Random Forest (RF) | 0.879 | 0.851 | 0.124 | 0.093 |
eXtreme Gradient Boosting (XGBoost) | 0.912 | 0.887 | 0.109 | 0.081 |
Artificial Neural Network (ANN) | 0.896 | 0.872 | 0.117 | 0.086 |
XGBoost achieved the highest predictive performance (AUC-ROC = 0.912; F1-score = 0.887), followed by ANN (AUC = 0.896) and RF (AUC = 0.879). All three algorithms substantially outperformed a logistic regression baseline (AUC = 0.793). SHAP analysis on the XGBoost model confirmed convergence with PLS-SEM findings, as shown in Table 8.
Rank | Construct | SEM β | Mean absolute SHAP value (RF/XGB) | Convergence |
1 | PV | 0.453 (H8) | 0.387 (RF)/0.412 (XGB) | Convergent |
2 | Adoption Intention | 0.521 (H9) | 0.341 (RF)/0.369 (XGB) | Convergent |
3 | PU | 0.312 (H1) | 0.298 (RF)/0.312 (XGB) | Convergent |
4 | PC | −0.213 (H6) | 0.241 (RF)/0.256 (XGB) | Convergent |
5 | PE | 0.278 (H2) | 0.219 (RF)/0.228 (XGB) | Convergent |
6 | ASE | 0.241 (H3) | 0.187 (RF)/0.203 (XGB) | Convergent |
7 | PTC | −0.187 (H5) | 0.176 (RF)/0.190 (XGB) | Convergent |
8 | PAV | 0.198 (H4) | 0.164 (RF)/0.178 (XGB) | Convergent |
9 | PCo | −0.156 (H7) | 0.142 (RF)/0.151 (XGB) | Convergent |
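One way to quantify the agreement between the two rankings in Table 8 is a Spearman rank correlation between the absolute path coefficients and the XGBoost SHAP values. The sketch below uses the Table 8 values; the choice of Spearman’s ρ as the agreement measure is an assumption of this example, not a statistic reported in the study.

```python
# Values copied from Table 8 (SEM path coefficients and XGBoost SHAP values).
sem_beta = {
    "PV": 0.453, "Adoption Intention": 0.521, "PU": 0.312, "PC": -0.213,
    "PE": 0.278, "ASE": 0.241, "PTC": -0.187, "PAV": 0.198, "PCo": -0.156,
}
shap_xgb = {
    "PV": 0.412, "Adoption Intention": 0.369, "PU": 0.312, "PC": 0.256,
    "PE": 0.228, "ASE": 0.203, "PTC": 0.190, "PAV": 0.178, "PCo": 0.151,
}


def ranks(scores):
    """Rank constructs by absolute magnitude, 1 = largest (assumes no ties)."""
    ordered = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def spearman_rho(a, b):
    """Spearman's rho from rank differences (valid when there are no ties)."""
    rank_a, rank_b = ranks(a), ranks(b)
    n = len(rank_a)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(sem_beta, shap_xgb)
print(round(rho, 3))
print(ranks(sem_beta)["PC"], ranks(shap_xgb)["PC"])
```

With these values ρ is approximately 0.92, so the two orderings agree closely but not perfectly; PC, for instance, is sixth by absolute β and fourth by SHAP value.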
5. Discussion
The findings make several important theoretical contributions. First, the successful extension of the VAM to the ChatGPT context, incorporating ASE, PAV, and PC, demonstrates the adaptability of the model for understanding generative AI adoption. The R² values (R² = 0.612 for PV; R² = 0.574 for adoption intention) compare favorably with prior VAM applications (Al-Abdullatif, 2023; Al-Abdullatif & Alsubaie, 2024; Kim et al., 2007).
Second, PV (β = 0.453) emerged as the most proximal determinant of adoption intention, reaffirming the core VAM proposition that users engaged in a rational cost–benefit calculus before committing to technology adoption. This aligns with Al-Abdullatif (2023), who similarly found PV to dominate adoption behavior in a Middle Eastern chatbot study, hence suggesting that the construct retained cross-cultural validity across diverse higher education contexts.
Third, PC represented the dominant sacrifice construct (β = −0.213), a finding that both converged with and diverged from prior literature in informative ways. The negative effect of PC aligns with Polyportis (2024) and Polyportis & Pahos (2025), who concluded that institutional and environmental factors, including concerns about data governance, negatively moderated ChatGPT behavioral intention in Dutch universities. However, the magnitude of the privacy effect in our Indonesian sample exceeded what Kim et al.’s (2007) original VAM application reported for mobile internet adoption, where PCo was the dominant sacrifice. This divergence likely reflects the heightened data awareness of Indonesian university students in the post-pandemic digital era and the salience of Indonesia’s Personal Data Protection Law (Government of Indonesia, 2022) in public discourse. Habibi et al.’s (2023) UTAUT2-based study in Indonesia did not include privacy as a construct; in contrast, this study demonstrated that privacy was an empirically important predictor that prior Indonesian ChatGPT research had largely overlooked.
Fourth, ASE (β = 0.241) exerted a meaningful positive effect on PV, extending Compeau & Higgins’s (1995) computer self-efficacy framework to the generative AI domain. This finding aligns with Al-Abdullatif & Alsubaie (2024), who found that students with higher AI literacy, a closely related construct, perceived significantly greater value in ChatGPT-assisted learning. Crucially, the strength of ASE in our Indonesian sample exceeded what Habibi et al. (2023) reported for effort expectancy (a related but distinct construct), suggesting that confidence in AI-specific capabilities matters more to Indonesian students than general ease-of-use perceptions. This divergence may reflect local conditions: in a Yogyakarta context with emergent AI education and limited formal AI literacy curricula, students who have independently developed AI competencies may perceive disproportionately greater value in ChatGPT.
The hybrid SEM–ML approach demonstrates clear methodological value that extends beyond simple cross-validation. The close agreement between PLS-SEM path coefficients and SHAP feature importance rankings in Table 8 lends support to both analytical paradigms, consistent with Salloum et al. (2024) and Kanbul et al. (2024). Beyond validation, SHAP analysis revealed that PC (ranked fourth in SHAP importance) exerted a disproportionate influence relative to its path coefficient rank. This suggests that nonlinear interaction effects, particularly between PC and PV, may amplify its impact in ways that linear SEM cannot fully capture. The superior predictive performance of XGBoost (AUC = 0.912) over RF (AUC = 0.879) and ANN (AUC = 0.896) is consistent with the broader ML literature on structured tabular data (Chen & Guestrin, 2016). Scholars interested in educational technology should consider this hybrid framework when their research objectives involve both theory testing and predictive accuracy.
The primacy of PV (β = 0.453) implies that universities should actively communicate and demonstrate to students the concrete academic benefits of ChatGPT. Institutional strategies should include structured AI literacy programs, peer-mentoring initiatives, and discipline-specific use-case demonstrations.
The diversity of institutions in this sample, spanning state universities (UGM, UNY), private Islamic institutions (UMY, UII), and a private Catholic university (UAJY), provided an important contextual lens for interpreting these results. While the current analysis did not conduct formal institutional sub-group comparisons (a limitation addressed below), the inclusion of Islamic universities raised a noteworthy consideration: Islamic values around intellectual honesty (amanah) and the prohibition of deception (ghish) may shape how students at UMY and UII evaluate the ethical dimensions of ChatGPT use. Students at these institutions may apply heightened scrutiny to AI-generated content in academic work, potentially making PC and PAV more salient predictors at Islamic institutions than at non-Islamic ones. Future analyses should explicitly test institutional type as a moderating variable in VAM-based models of ChatGPT adoption.
The finding that PC constituted the most powerful sacrifice dimension (β = −0.213) has direct implications for platform design and institutional policy. Universities should implement transparent data governance frameworks and clearly communicate how student interaction data are used, in line with Indonesia’s Personal Data Protection Law (Government of Indonesia, 2022). The significance of ASE (β = 0.241) further suggests that targeted skill-building interventions could directly increase ChatGPT adoption rates across institutional types.
6. Conclusions
This study investigated ChatGPT adoption among university students in Yogyakarta, Indonesia, through the lens of an extended VAM integrated with a hybrid PLS-SEM and ML analytical framework. Drawing on a stratified sample of n = 484 students across five Yogyakarta universities spanning diverse institutional types, the study tested ten theoretically derived hypotheses and validated findings using RF, XGBoost, and ANN algorithms with SHAP-based feature interpretation.
All ten hypotheses were supported. PV (β = 0.453) was the most proximal determinant of adoption intention, which in turn strongly predicted actual ChatGPT use behavior (β = 0.521). Among benefit constructs, PU exerted the greatest influence (β = 0.312), followed by PE (β = 0.278) and ASE (β = 0.241). PC emerged as the dominant sacrifice construct (β = −0.213). ML validation converged with the PLS-SEM findings, with XGBoost achieving the highest predictive performance (AUC-ROC = 0.912).
Theoretically, this study extended the VAM to the generative AI domain, clarified the conceptual distinction between PU and PAV, and demonstrated the value of comparative interpretation relative to prior studies from Indonesia and beyond. Methodologically, the hybrid SEM–ML approach demonstrated that ML contributed insights into nonlinear interaction detection, out-of-sample predictive performance, and feature importance ranking that went beyond mere confirmation of SEM results. Practically, the findings underscored the importance of PV communication, AI literacy programs, institutional-type-sensitive interventions, and robust privacy governance frameworks for promoting responsible ChatGPT adoption in Indonesian higher education.
This study has several limitations: (i) the cross-sectional design, which limits causal inference; (ii) the geographic restriction to Yogyakarta; (iii) the binary operationalization of adoption intention for the ML analysis; and (iv) the absence of formal institutional sub-group analysis. Future research should consider longitudinal designs, examine institutional type as a moderating variable, and replicate the extended VAM in other Indonesian regions and national contexts.
Informed consent was obtained from all subjects involved in the study.
The data used to support the research findings are available from the corresponding author upon request.
The authors gratefully acknowledge the academic environment and research support provided by Universitas Muhammadiyah Yogyakarta.
The authors declare no conflicts of interest.
