A Large Language Model-Driven Framework for Policy Instrument Analysis: An Empirical Study of University Teachers’ Ethical Misconduct Governance Texts
Abstract:
To address the limitations of traditional policy instrument analysis—such as labor-intensive coding, high subjectivity, and time-consuming procedures—this study develops a policy instrument analysis framework that integrates large language models (LLMs) and proposes an LLM-driven analytical workflow comprising six stages: case repository construction, policy instrument selection, content element generation, clause-level coding, reliability and validity testing, and quantitative analysis. Using governance texts on teachers’ ethical misconduct from 27 universities specializing in finance and economics as the empirical context, the study employed DeepSeek-R1 to identify policy instruments, classify content elements, perform clause-level coding, and conduct two-dimensional cross-tabulation analysis. The results indicate that these governance texts exhibit pronounced regulatory, procedural, and accountability-oriented characteristics, while also revealing a structural imbalance marked by strong front-end norm construction and relatively weak back-end remedial mechanisms. Overall, the proposed framework improves the efficiency and consistency of policy text analysis and provides a novel technical pathway for methodological innovation in education policy research.

1. Introduction
The development of Chinese higher education is deeply embedded in a governance structure characterized by state leadership, tiered coordination, and multi-actor collaboration (Ruan et al., 2024). At the national level, competent authorities such as the State Council and the Ministry of Education have continuously issued top-level policy designs that articulate the developmental goals and institutional framework of higher education through plans, regulations, and guiding documents, thereby providing strategic direction for university reform and development (Chen & Wan, 2016). On this basis, individual universities translate and recode national policy requirements in light of their institutional missions, regional development needs, and the characteristics of their faculty and student bodies, formulating internal rules and operational provisions that enable effective alignment between national policy and institutional practice (Liu, 2025). This process reflects a distinctive governance logic in Chinese higher education: the combination of top-down strategic guidance and bottom-up practical adaptation (Sheng, 2003).
To deepen theoretical understanding of this governance logic, strengthen universities’ institutional design capacity, and improve the adaptability and effectiveness of top-level policy implementation, scholars have examined policy areas such as personnel system reform (Bian, 2025; Xia et al., 2024), university charters (Zhuang et al., 2022), educational evaluation (Xiao & Liu, 2024), discipline assessment (Bao & Han, 2022; Guo et al., 2025; Liang et al., 2025), research evaluation (Xu et al., 2021; Zhao et al., 2025), and entrepreneurship education (Mei & Symaco, 2022). Using qualitative approaches such as policy instrument analysis, critical discourse analysis, and content analysis, this body of research has explored how national regulations and policies are implemented at the university level, focusing on institutional logic, policy instrument preferences, implementation mechanisms, and legitimacy construction. Such studies provide important empirical support for improving the alignment between policy content elements and policy instruments in university governance and for enhancing the completeness and coherence of institutional policy systems (Li, 2025). However, these research processes typically require repeated cross-level comparison and interpretive analysis between national policy texts and university regulations, involving concept extraction, category construction, category merging, and manual coding. This not only raises research costs but also increases the difficulty of identifying new concepts, distinguishing between adjacent categories, and maintaining coding consistency (Krippendorff, 2018), while making results vulnerable to researchers' subjective judgment (Schmidt, 2008).
In recent years, large language models (LLMs) have shown clear advantages in processing large volumes of text and extracting complex information through contextual semantic representation, pattern recognition, semantic induction, and cross-task transfer (Mehta et al., 2025). Existing studies suggest that, when operating under researcher-defined task rules, LLMs can assist with concept extraction, theme induction, candidate category generation, pre-coding, and consistency review, thereby reducing repetitive manual labor and improving the efficiency and coverage of text analysis (Tai et al., 2024).
Although existing studies on educational policy translation have generated important insights into institutional logic and implementation mechanisms, they have paid relatively limited attention to methodological development—especially technical improvements to content analysis and policy instrument analysis—and a systematic research pathway has yet to emerge (Anh, 2023; Chan, 2023). Against this background, this study introduces LLMs into policy instrument analysis and content analysis to construct a human-machine collaborative workflow. The aim is to improve the efficiency of concept extraction and thematic summarization, mitigate subjective bias in traditional manual coding, and provide new technical support for methodological refinement in education policy analysis.
2. Related Research
Exploring the translation of national policy into university institutional arrangements and the corresponding organizational response is a core issue in educational research (Mei & Symaco, 2022; Xia et al., 2024). Studies on areas such as discipline assessment, research evaluation, teaching reform, and entrepreneurship education have drawn on institutional complexity theory (Liang et al., 2025), Bourdieu’s field theory (Zhao et al., 2025), and related perspectives, while employing policy analysis tools (Bao & Han, 2022; Guo et al., 2025; Xiao & Liu, 2024), critical discourse analysis (Bian, 2025), and semi-structured interviews (Mei & Symaco, 2022; Xu et al., 2021). These studies have examined policy instrument preferences, typological structures, and content elements in policy formulation, thereby revealing the decision-making patterns and institutional logic underlying policy design. For example, Guo et al. (2025) used the McDonnell and Elmore policy instrument framework to analyze 52 policy texts on Chinese higher education discipline assessment from 1985 to 2023 and found that command instruments were dominant, whereas capacity-building and symbolic-persuasive instruments were underused; policy content elements were unevenly distributed, with relatively limited attention to assessment objectives and methods; and mismatches between policy instruments and content elements appeared across stages. Bao & Han (2022), drawing on a policy instrument classification framework based on the nature of governmental power resources, analyzed 40 policy texts on discipline assessment and found that these policies focused heavily on evaluation subjects, procedures, and methods, while neglecting evaluation content and outcomes. They therefore proposed improving the fit between policy content elements and policy instruments and strengthening the completeness and systematicity of discipline-assessment policy content. 
Xiao & Liu (2024) constructed a two-dimensional framework of policy instruments and teaching elements to analyze 170 teaching-related policies and examine the policy logic behind the marginalization of teaching by university faculty. They found that coercive instruments predominated, while incentive-based and capacity-building instruments were relatively scarce. Teaching evaluation lacked incentive-based instruments, teaching content lacked organizational-development instruments, and teaching methods lacked capacity-building instruments. The study argued that imbalances within policy instruments and weak alignment between instruments and teaching elements jointly contribute to situations in which teachers are unwilling or unable to teach, and it recommended increasing the use of incentive-based instruments, ensuring the full-process supply of teaching policy instruments, and improving the fit between policy instruments and the elements of teaching activities.
Most of the above studies rely on policy instrument analysis, supplemented by word-frequency analysis, thematic analysis, and manual content coding, to extract content elements and build a policy instrument-content element framework. From the perspective of instrument selection preferences and content elements, this line of inquiry examines how state-level policy orientations are translated into university development positioning, construction goals, and reform pathways. In practice, however, researchers often need to conduct multi-level comparison and interpretive analysis across national policy texts, local supporting documents, and internal university regulations, while repeatedly performing concept extraction, category construction, text segmentation, and manual coding around policy objectives, policy instruments, implementation mechanisms, and institutional arrangements. Such qualitative work is highly iterative and interpretation-dependent: category systems require continual refinement through induction and revision, while coding outcomes are easily affected by subjective judgment and frequently face problems of blurred category boundaries and weak cross-text consistency (Krippendorff, 2018). At the same time, semantic variation and interpretive flexibility across multi-level policy structures further increase the difficulty of concept identification and category delineation (Schmidt, 2008).
In recent years, with rapid advances in text understanding, semantic clustering, summarization, classification, and structured information extraction, LLMs have shown strong potential for policy text analysis. On the one hand, given a researcher-defined theoretical framework or codebook, LLMs can support large-scale pre-coding and assisted classification. On the other hand, in exploratory analysis they can help identify high-frequency concepts, generate candidate categories, and detect semantic relations among similar categories (Wen et al., 2025; Zhang et al., 2025), thereby substantially improving the efficiency and coverage of content analysis (Mehta et al., 2025).
Overall, existing research still concentrates on explaining institutional logic and policy implementation mechanisms, with primary attention to how policies are understood, reinterpreted, and enacted in different organizational contexts. Most studies remain at the level of method selection and application, with insufficient attention to the intelligentization of analytical tools and methodological innovation, and relatively limited effort devoted to improving policy analysis methods themselves (Chan, 2023). A systematic pathway has not yet been established for using technological means to improve the precision of concept extraction, the stability of category construction, and the consistency of coding (Anh, 2023; Chan, 2023). Accordingly, integrating large language models with policy instrument analysis and content analysis is becoming an important frontier in education policy research (Liu & Sun, 2025).
3. Developing a Large Language Model-Driven Framework for Policy Instrument Analysis
To address the shortcomings of traditional policy instrument analysis, this study proposes a framework that integrates LLMs into the analytical process. As shown in Figure 1, the framework consists of six components: case repository construction, analytical framework selection, content element generation, clause-level coding, reliability and validity testing, and quantitative comparison. It uses raw policy texts, policy instrument frameworks, content element categories, clause-level coding results, and research conclusions from existing studies as supervision and calibration resources to adapt a general-purpose LLM to policy instrument analysis tasks. Through consistency checks, reliability and validity testing, and manual review, the framework iteratively updates the dictionary, trigger-phrase library, policy instrument category table, and content element set. The calibrated model is then applied to new policy texts, with human review retained to identify institutional logic, preferences in policy instrument selection, and the distributional structure of content elements.
Figure 1. The LLM-driven framework for policy instrument analysis.
Let an educational policy task T be defined as T = {G, L, U}, where G denotes the set of national-level policy texts on a given topic, L denotes the corresponding set of policy texts issued by local educational authorities, and U denotes the set of internal university institutional texts formulated in response to national policies and local guidance.
Based on the existing literature, a policy case repository Pcase is constructed by annotating texts according to themes such as personnel systems, teaching evaluation, discipline assessment, and research evaluation. For each study or set of policy texts that has already been analyzed, the following information is recorded.
Basic attributes of policy texts, including document ID, policy level, topic, publication date, and the original policy text.
The policy instrument framework PT adopted in the existing literature and its subcategories are identified to establish a policy instrument category table. For policy text i, the policy instrument framework is represented as PTi = {PTi1, PTi2, …, PTim}; each subcategory PTij is represented as PTij = {pti1, pti2, …, ptin}. For example, Bao & Han (2022) adopted the McDonnell and Elmore five-part policy instrument framework to analyze the ninth discipline-assessment policy document. In their coding scheme, the command instrument included three subcategories, such that PT9 = {Mandate, Inducements, Capacity-building, System-changing, Hortatory}, and PT9(mandate instruments) = {behavioral requirements, rule formulation, punitive provisions}.
Content element categories are extracted and summarized from existing studies on specific types of policy texts to form a content element set CE. For example, Xiao & Liu (2024) proposed that the content element set for teaching-related policy documents can be expressed as CE = {teaching content, teaching methods, teaching support, teaching evaluation}.
The coding experience from prior studies is stored in the format ‘policy text number-policy clause-policy content element-policy instrument-policy sub-instrument’ as EP, providing a reference for subsequent pre-coding and manual review.
A trigger-phrase library B is constructed around expressions associated with policy instruments. For any policy instrument PTi, its trigger-phrase set is denoted B(PTi) = {ts1, ts2, …, tsr}. For example, capacity-building instruments often appear in expressions such as ‘issue guidelines’, ‘conduct training’, ‘build platforms’, and ‘promote capacity enhancement’. A corresponding trigger-phrase set may be represented as B(PTi) = {issue annual discipline-research guidelines, conduct regular frontier training, integrate resources to build an interdisciplinary platform, promote high-level discipline development}.
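To make the repository design concrete, the case record described above can be sketched as a plain Python data structure. The field names and example values below are illustrative choices, not part of the framework itself:

```python
from dataclasses import dataclass, field

@dataclass
class PolicyCase:
    """One annotated case in the repository P_case (illustrative fields)."""
    doc_id: str
    level: str                  # 'national' | 'provincial' | 'university'
    topic: str                  # e.g. 'discipline assessment'
    published: str              # publication date
    text: str                   # original policy text
    instruments: dict = field(default_factory=dict)  # PT_i -> sub-instrument list
    elements: list = field(default_factory=list)     # content element set CE
    codings: list = field(default_factory=list)      # EP: (clause, element, instrument, sub)
    triggers: dict = field(default_factory=dict)     # B: PT_i -> trigger phrases

# Toy record modeled on the Bao & Han (2022) example above, where the
# mandate instrument carried three subcategories.
case9 = PolicyCase(
    doc_id="PT9", level="national", topic="discipline assessment",
    published="unknown", text="...",
    instruments={"mandate": ["behavioral requirements", "rule formulation",
                             "punitive provisions"]},
)
```

Storing the coding experience EP and trigger-phrase library B on the same record keeps every calibration resource for a case in one place.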
Given a national policy text gi ∈ G, the LLM extracts its expressions of the policy purpose and objectives to form a semantic representation S(gi). For each case pk ∈ Pcase, the corresponding semantic representation S(pk) is extracted, and the semantic similarity sim(S(gi),S(pk)) is calculated.
On the basis of similarity scores, a candidate case set P1 containing thematically similar cases is retrieved from Pcase. Drawing on the policy instrument frameworks used in these cases, the LLM outputs candidate frameworks PT* together with reasons for their suitability. If the target text reflects logics such as resource provision, environmental shaping, and demand inducement, the three-part framework (Rothwell, 1983) is prioritized. If the text primarily reflects governance approaches such as command, incentives, capacity building, system change, and symbolic persuasion, the five-part framework (McDonnell & Elmore, 1987) is more appropriate. If the text emphasizes how different instruments influence the behavior of target groups, the five-category framework (Schneider & Ingram, 1990) is better suited.
Let PT0 denote the initial list of policy instrument categories derived from existing cases, and PT* the candidate categories proposed by the LLM for the new task text. After manual review and calibration, the task-adapted category list PT1 is obtained: PT1 = f(PT0, PT*, Revise), where Revise represents revisions made by the researcher in light of the consensus in prior studies, the characteristics of the new task text, and theoretical fit. Once PT1 is determined, the coding experience EP and trigger-phrase library B are used to specify the corresponding sub-instrument set PT1_sub.
For a given policy text d, the text is segmented by title, chapter structure, and semantic paragraphs to produce a set of structural units Seg(d), where Seg(d) = {segd1, segd2, …, segdq}. Based on text structure, governance functions, and category experience drawn from the case repository, the LLM then generates a candidate set of content elements for each structural unit, denoted CE*(d), where CE*(d) = {ced1, ced2, …, cedt}.
The element cedj typically falls into dimensions such as governance objects, policy objectives, responsible actors, resource allocation, implementation procedures, supervision and evaluation, use of results, and safeguard mechanisms. Guided by the principles of mutual exclusivity, completeness, and contextual appropriateness, the researcher then merges and revises CE*(d). After multiple rounds of consolidation, the final content element table for task T is formed as CE (T), where CE (T) = {CE1, CE2, …, CEm}.
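The segmentation and consolidation steps can be sketched as follows. The chapter-heading pattern and the merge rule (case-insensitive deduplication, with manual review still to follow) are simplifying assumptions for illustration:

```python
import re

def segment(text: str) -> list:
    """Split a policy text into structural units Seg(d), here by lines
    beginning 'Chapter N' (a simplifying assumption about document layout)."""
    parts = re.split(r"\n(?=Chapter\s+\d+)", text.strip())
    return [p.strip() for p in parts if p.strip()]

def consolidate(candidates: list) -> list:
    """Merge per-segment candidate elements CE*(d) into a deduplicated
    table CE(T), preserving first-seen order; researchers then revise it
    for mutual exclusivity and completeness."""
    seen, table = set(), []
    for seg_candidates in candidates:
        for ce in seg_candidates:
            key = ce.lower()
            if key not in seen:
                seen.add(key)
                table.append(ce)
    return table

doc = "Chapter 1 General principles\n...\nChapter 2 Reporting and investigation\n..."
units = segment(doc)
ce_table = consolidate([["policy objectives"],
                        ["reporting mechanisms", "Policy objectives"]])
```

Deduplication here only removes exact (case-insensitive) repeats; merging semantically adjacent categories remains a human judgment in the framework.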
Let the set of clauses in policy text d be C(d) = {c1, c2, …, cn}. For each clause cj, both content elements and policy instruments are identified.
Using the predefined policy instrument category list PT1, the content element table CE(T), the sub-instrument set PT1_sub, and the trigger-phrase library B, the LLM generates pre-coded results for each clause. For clause cj, the lexical feature vector can be represented as W(cj) = {wj1, wj2, …, wjs}, where wjk may include instrument-related words or phrases such as ‘special fund support’, ‘included in performance evaluation’, ‘prohibited’, ‘must’, ‘encouraged’, and ‘conduct training’. Keyword similarity and trigger-sentence similarity are then calculated as simkey(cj, PT1_sub) and simsent(cj, B(PTi)), and combined into a matching score for clause cj with respect to a given policy instrument PTi: Score(cj, PTi) = α·simkey(cj, PT1_sub) + β·simsent(cj, B(PTi)). The matching score between clause cj and a content element CEk is denoted Score(cj, CEk). The highest-scoring categories are taken as the candidate policy instrument and principal content element for that clause. The LLM’s pre-coded output is thus expressed as Code*(cj) = [CEk, PTi, ptis, Ej, Score(cj, CEk)], where ptis denotes the policy sub-instrument and Ej the model-provided rationale. Two researchers then independently review Code*(cj). If a clause has high matching scores for two or more policy instruments, the highest-scoring one is treated as the primary instrument.
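A simplified sketch of the pre-coding score follows, with Jaccard word overlap standing in for the keyword and trigger-sentence similarities and illustrative weights α = 0.6, β = 0.4 (the paper does not fix specific similarity functions or weights):

```python
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def score(clause: str, sub_terms: set, triggers: list,
          alpha: float = 0.6, beta: float = 0.4) -> float:
    """Score(c_j, PT_i) = alpha*sim_key + beta*sim_sent (illustrative weights)."""
    words = set(clause.lower().split())
    sim_key = jaccard(words, sub_terms)               # keyword similarity
    sim_sent = max((jaccard(words, set(t.lower().split()))
                    for t in triggers), default=0.0)  # trigger-sentence similarity
    return alpha * sim_key + beta * sim_sent

def precode(clause: str, instruments: dict) -> tuple:
    """Take the highest-scoring instrument as the primary candidate."""
    best = max(instruments, key=lambda pt: score(clause, *instruments[pt]))
    return best, score(clause, *instruments[best])

# Two toy instrument categories with (sub-instrument terms, trigger phrases)
instruments = {
    "disciplinary constraint": ({"suspension", "revocation", "prohibited"},
                                ["violators are prohibited from teaching"]),
    "procedural governance":   ({"reporting", "investigation", "appeal"},
                                ["complaints shall enter the investigation procedure"]),
}
label, s = precode("teachers found in violation are prohibited from teaching",
                   instruments)
```

The same scoring shape applies to content elements via Score(cj, CEk); human reviewers then confirm or correct each pre-coded label.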
A test set comprising 10% of the coded units is randomly sampled from the full set, and the following measures are calculated: R1 = Consistency(LLM, Human1), R2 = Consistency(LLM, Human2), R3 = Consistency(Human1, Human2).
Here, R1 and R2 measure the consistency between the LLM’s pre-coding and human judgments, while R3 measures inter-coder consistency between the two human coders. If R1, R2, and R3 all meet the preset threshold, this indicates that the domain-adapted LLM has satisfactory usability and that the category definitions are sufficiently clear.
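The sampling and consistency measures can be sketched as simple percent agreement (a study might equally use Cohen's kappa); the short coding lists below are toy data:

```python
import random

def agreement(codes_a: list, codes_b: list) -> float:
    """Percent agreement between two coders over the same units."""
    assert len(codes_a) == len(codes_b)
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def sample_indices(n_units: int, frac: float = 0.1, seed: int = 42) -> list:
    """Randomly draw the 10% test set of coded units (seeded for replicability)."""
    rng = random.Random(seed)
    k = max(1, round(n_units * frac))
    return rng.sample(range(n_units), k)

idx = sample_indices(1469)            # 147 units from the 1,469 coded clauses

# Toy labels for four sampled units (not real study data)
llm    = ["mandate", "hortatory", "mandate", "inducements"]
human1 = ["mandate", "hortatory", "inducements", "inducements"]
r1 = agreement(llm, human1)           # R1 = Consistency(LLM, Human1)
```

R2 and R3 are computed the same way over the (LLM, Human2) and (Human1, Human2) pairs, and all three are compared against the preset threshold.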
Let freq(PTi) denote the frequency with which PTi appears in the text corpus. The usage frequency of each type of policy instrument is then calculated. Under a specific policy instrument framework, comparing the proportions of instrument use across the G, L, and U levels makes it possible to identify preferences in instrument selection during policy translation and to clarify the institutional logic through which universities implement national policy documents.
By integrating the horizontal dimension of policy instrument types with the vertical dimension of content elements, a two-dimensional cross-analysis of ‘policy instruments-content elements’ is conducted. Given a policy instrument category PTi and a content element category CEj, a cross-frequency matrix M = [mij]k×q is constructed.
Here, mij represents the frequency with which PTi and CEj co-occur across all coded clauses. Comparing the rows and columns of matrix M reveals the usage pattern of policy instruments for each content element and helps evaluate the degree of compatibility between policy content elements and policy instruments.
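Building the cross-frequency matrix M from clause-level codings is a straightforward counting exercise; the element and instrument labels below are toy examples:

```python
from collections import Counter

def cross_matrix(codings: list, instruments: list, elements: list) -> list:
    """Build M = [m_ij]: co-occurrence counts of (element CE_j, instrument PT_i)
    over all coded clauses. `codings` holds (element, instrument) pairs."""
    counts = Counter(codings)
    return [[counts[(ce, pt)] for pt in instruments] for ce in elements]

elements = ["definition of misconduct", "reporting mechanisms"]
instruments = ["norm specification", "procedural governance"]
codings = ([("definition of misconduct", "norm specification")] * 3
           + [("reporting mechanisms", "procedural governance")] * 2
           + [("reporting mechanisms", "norm specification")])

M = cross_matrix(codings, instruments, elements)
# Column sums recover freq(PT_i); row sums recover the element frequencies.
freq_pt = [sum(col) for col in zip(*M)]
```

Row and column margins of M give exactly the per-element and per-instrument frequencies used in the one-dimensional comparisons above.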
4. Empirical Analysis
The governance of teachers’ ethical misconduct is a key institutional safeguard for implementing the fundamental task of fostering virtue and cultivating talent in higher education (Qin & Yu, 2024). Existing studies have mainly approached this issue from legal and sociological perspectives, using content analysis, case studies, and related methods to focus on conceptual definitions of misconduct, accountability mechanisms, and the control of discretionary power (Fan, 2023; Ma & Yang, 2024). However, much of this work remains at the level of textual interpretation, and important issues—including the identification of misconduct, the allocation of responsibilities, the operational requirements of governance, and modes of implementation—remain insufficiently clarified (Wang & Yin, 2022). This study therefore applies the framework proposed above to examine the implementation trajectory of teacher ethics governance from the perspectives of policy instrument selection and institutional logic, with a view to improving institutional design for the governance of teacher ethics and conduct.
Using the keywords ‘education policy’, ‘policy instruments’, and ‘higher education policy’, we first collected relevant studies from China National Knowledge Infrastructure, retrieved the corresponding policy texts, and constructed a case repository consisting of 15 documents, with five documents for each of the three major framework categories. DeepSeek-R1 was then invoked through Python to determine the policy instrument framework, identify sub-instrument categories, and conduct policy coding, while two education researchers independently completed manual coding. The resulting consistency scores—R1 = 0.86, R2 = 0.91, and R3 = 0.94—all exceeded the preset threshold, indicating that the adapted LLM performed well and that the category definitions were reasonably clear. We then searched the official websites of the Ministry of Education and finance-and-economics universities using the keywords ‘violations of teacher ethics and conduct’ and ‘teacher ethics and conduct’, collecting policy documents on the handling of such violations from 53 universities as well as national and provincial sources, including 2 national-level documents, 12 provincial- and municipal-level documents, and 28 university-level documents.
The two national institutional documents were uploaded to DeepSeek-R1 to extract their policy purposes, objectives, and core governance expressions, yielding the following semantic representation: ‘Under the guidance of top-level national policies, universities translate and localize these policies to build a high-quality faculty governance system centered on fostering virtue and cultivating talent, anchored in standards for teacher ethics and conduct, and combining accountability with institutionalized enforcement’. We then compared this representation with each case in the repository and calculated semantic similarity scores.
The similarity analysis showed that the target policy was semantically closest to two types of policy texts—discipline assessment and teaching evaluation—with similarity scores of 0.93 and 0.87, respectively. The text repeatedly contained expressions related to rule formulation, organizational arrangement, authoritative control, incentive and constraint mechanisms, and procedural oversight. Accordingly, the discipline-assessment case (Guo et al., 2025), with the higher similarity score of 0.93, was selected as the benchmark. In Guo et al. (2025), the initial policy instrument category list was PT0 = {mandate, inducements, capacity-building, system-changing, hortatory}. By contrast, the candidate instrument categories generated by the LLM for the teacher ethics policy text were PT* = {authoritative regulations, procedures, organizational structures, disciplinary measures, supervision and accountability}. After manual review and calibration, the final category list was defined as PT1 = {norm specification, procedural governance, organizational arrangements, disciplinary constraints, supervision and accountability}.
The corresponding sub-instrument set PT1_sub was then specified by combining the sub-instrument content extracted from the benchmark case with the word-segmentation results from DeepSeek-R1 and the LDA topic-mining results, yielding PT1_sub = [{explicit prohibition, violation identification, handling}, {reporting, investigation, verification, review, service of documents, appeal, reconsideration}, {working leading group, faculty affairs department, secondary units, division of responsibilities among relevant functional departments}, {criticism and education, public censure, suspension from teaching, transfer from post, revocation of qualifications, disciplinary action, revocation of teaching qualifications}, {primary responsibility, direct responsibility, accountability for dereliction of duty}].
Each university policy text on teacher ethics and conduct was segmented by titles, chapter structure, and semantic paragraphs to generate a set of structural units. Based on text structure, governance functions, and category experience from the case repository—in which content elements of discipline-assessment policy texts mainly include evaluation entities, evaluation objectives, evaluation content, evaluation procedures, evaluation methods, and the publication and use of results—DeepSeek-R1 generated candidate content elements for each structural unit. These candidates were then consolidated and revised by the researchers according to operational principles. After multiple rounds of merging, the final content element table was defined as CE = {policy objectives and basic principles; definition of ethical misconduct; organizational structure and division of responsibilities; reporting reception and investigation mechanisms; deliberation, decision-making, and procedural safeguards; disciplinary actions and qualification restrictions; remedies, review, and resolution of disciplinary actions; supervision, accountability, and implementation of responsibilities}.
Using the established policy instrument framework PT1, the sub-instrument set PT1_sub, and the content element table CE, DeepSeek-R1 performed clause-level tokenization and sentence-pattern extraction, calculated lexical and syntactic similarity, assigned content elements, policy instruments, and sub-instruments, identified trigger words, and produced both justifications and confidence scores for its classifications.
A validation set was randomly sampled from the full set of coded units, and R1, R2, and R3 were calculated accordingly. Because all three values exceeded 0.8, the policy instrument analysis method incorporating DeepSeek-R1 was deemed to have satisfactory validity.
A two-dimensional cross-analysis of policy instruments and content elements was conducted for governance texts on teachers’ ethical misconduct from 27 universities specializing in finance and economics, producing a total of 1,469 coded units (see Table 1).
Table 1. Two-dimensional cross-tabulation of content elements and policy instruments (coded units)
Content Elements\Policy Instruments | Norm Specification | Procedural Governance | Organizational Arrangement | Disciplinary Constraint | Supervision and Accountability | Total |
Policy objectives and basic principles | 337 | 4 | 8 | 3 | 4 | 356 |
Definition of teachers’ ethical misconduct | 349 | 1 | 4 | 1 | 2 | 357 |
Organizational structure and division of responsibilities | 3 | 2 | 64 | 0 | 3 | 72 |
Reporting reception and investigation mechanisms | 11 | 141 | 16 | 0 | 0 | 168 |
Deliberation, decision-making and procedural safeguards | 2 | 60 | 10 | 0 | 0 | 72 |
Disciplinary measures and qualification restrictions | 22 | 0 | 3 | 163 | 0 | 188 |
Remedy, review, and termination of sanctions | 12 | 64 | 14 | 14 | 2 | 106 |
Supervision, accountability and responsibility implementation | 32 | 1 | 6 | 2 | 109 | 150 |
Total | 768 | 273 | 125 | 183 | 120 | 1469 |
The results show that, in the policy instrument dimension, norm specification was the most frequently used instrument type (768 instances, 52.28%), followed by procedural governance (273 instances, 18.58%) and disciplinary constraints (183 instances, 12.46%), whereas organizational arrangements (125 instances, 8.51%) and supervision and accountability (120 instances, 8.17%) appeared less frequently. This pattern suggests that the governance texts exhibit a distinct rules-first logic: they first establish clear normative boundaries for teacher ethics through a large number of provisions; they then rely on procedural governance to institutionalize identification, deliberation, and disciplinary handling; and finally, they supplement this structure with disciplinary constraints, organizational arrangements, and supervision and accountability to form a closed governance loop. In this sense, the governance of teachers’ ethical misconduct has developed into a regime centered on normative construction and procedural regulation.
In the content element dimension, Definition of ethical misconduct (357 instances, 24.30%) and Policy objectives and basic principles (356 instances, 24.23%) accounted for the largest shares. These were followed by Disciplinary actions and qualification restrictions (188 instances, 12.80%), Reporting reception and investigation mechanisms (168 instances, 11.44%), and Supervision, accountability, and implementation of responsibilities (150 instances, 10.21%), while Organizational structure and division of responsibilities and Deliberation, decision-making, and procedural safeguards each appeared 72 times (4.90%). This indicates that the sampled universities have concentrated primarily on defining forms and boundaries of misconduct and articulating institutional principles, while comparatively limited attention has been devoted to organizational coordination, procedural support, and follow-up remedial arrangements.
Further two-dimensional cross-analysis (see Table 2) shows a statistically significant association between policy instrument types and the distribution of content elements (χ² = 3783.02, df = 28, p < 0.001; Cramér's V = 0.802), indicating that the sampled universities have developed a relatively stable configuration structure between content elements and instrument types in texts governing teachers’ ethical misconduct. Specifically, the five most prominent combinations were ‘Definition of teachers’ ethical misconduct—Norm specification’ (349 instances, 23.76%), ‘Policy objectives and basic principles—Norm specification’ (337 instances, 22.94%), ‘Disciplinary measures and qualification restrictions—Disciplinary constraint’ (163 instances, 11.10%), ‘Reporting reception and investigation mechanisms—Procedural governance’ (141 instances, 9.60%), and ‘Supervision, accountability, and implementation of responsibilities—Supervision and accountability’ (109 instances, 7.42%). Together, these five combinations accounted for 74.81% of all coded units. This pattern indicates that governance texts in finance-and-economics universities largely proceed along the logic of ‘behavioral definition → procedural handling → disciplinary enforcement → accountability’, and thus display strong regulatory, procedural, and accountability-oriented features. At the same time, content related to Remedy, review, and termination of sanctions occupies only a small share of the overall texts (106 instances in total, or 7.21%). Although these provisions are associated with procedural governance (64 instances), their frequency is far lower than that of front-end normative and disciplinary combinations. This distribution reveals a clear structural tendency: strong front-end norm construction but weak back-end safeguards.
In other words, while the governance texts have developed a relatively complete regulatory system for defining misconduct, handling procedures, and disciplinary constraints, they remain comparatively underdeveloped in back-end supportive arrangements such as remedial rights, review and appeal procedures, and the lifting of disciplinary actions. This weakness may affect both perceived fairness and the sustainable operation of the overall governance system.
Table 2. The five most frequent instrument–element combinations in the sampled governance texts.
Rank | Instrument–Element Combination | Frequency | Percentage (%) | Cumulative (%) |
1 | Definition of teachers’ ethical misconduct—Norm specification | 349 | 23.76 | 23.76 |
2 | Policy objectives and basic principles—Norm specification | 337 | 22.94 | 46.70 |
3 | Disciplinary measures and qualification restrictions—Disciplinary constraint | 163 | 11.10 | 57.80 |
4 | Reporting, acceptance, and investigation mechanisms—Procedural governance | 141 | 9.60 | 67.40 |
5 | Supervision, accountability, and implementation of responsibilities—Supervision and accountability | 109 | 7.42 | 74.81 |
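The effect-size figure reported above can be reproduced with a short sketch. Note two assumptions not stated explicitly in the text: the cross-tabulation is taken to be an 8 × 5 (content element × instrument type) table, consistent with df = 28 = (8 − 1) × (5 − 1), and the total of 1469 coded units is back-calculated from the top five combinations (1099 instances = 74.81%).

```python
import math

def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V effect size for an r x c contingency table."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# Reported: chi-squared = 3783.02, df = 28; assumed: 8 x 5 table, n = 1469
# (back-calculated, since 1099 coded units correspond to 74.81% of the total).
v = cramers_v(3783.02, 1469, 8, 5)
print(f"Cramér's V = {v:.3f}")  # ≈ 0.802, matching the reported value
```

Under these assumptions, the computed V of about 0.802 matches the value reported in the text, which supports the internal consistency of the published figures.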
5. Conclusions
Policy instrument theory is an important approach for analyzing the institutional logic of educational policies and the preferences reflected in instrument selection. It has been widely applied to studies of how national policy is translated institutionally in areas such as personnel reform, discipline assessment, and curriculum development, but it suffers from several limitations, including labor-intensive coding, time-consuming content analysis, and strong subjectivity in concept definition and category consolidation. LLMs, with their strong capabilities in natural language understanding and large-scale text processing, have attracted growing attention for information collection, processing, and analysis. In response to these limitations, this study proposes an LLM-integrated analytical framework and tests it empirically on governance documents concerning teachers’ ethical misconduct from 27 finance-and-economics universities. The findings indicate that governance in these universities unfolds along the main axis of ‘behavioral definition → procedural handling → disciplinary enforcement → accountability’, exhibiting pronounced regulatory, procedural, and accountability-oriented characteristics. At the same time, the overall structure still reflects strong front-end norm construction and relatively insufficient back-end remedial support. The empirical results suggest that the proposed framework is both feasible and effective for analyzing the institutional logic of educational policy texts. Future research will expand the corpus of policy texts and extend the framework to fields such as energy, public health, and traffic safety to further test its generalizability.
Author Contributions: Conceptualization, C.Q. and J.L.; methodology, X.Z.; software, Y.Z.; validation, W.Z., X.Z., and Y.Z.; formal analysis, C.Q.; investigation, J.L.; resources, X.Z.; data curation, X.Z.; writing—original draft preparation, J.L.; writing—review and editing, J.L.; visualization, X.Z.; supervision, X.Z.; project administration, C.Q.; funding acquisition, C.Q. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: The data used to support the research findings are available from the corresponding author upon request.
Conflicts of Interest: The authors declare no conflict of interest.
