Open Access
Research article

Text Readability Evaluation in Higher Education Using CNNs

Muhammad Zulqarnain 1,
Muhammad Saqlain 2*
1 Department of Computer Science, Virtual University (VU), 54000 Lahore, Pakistan
2 Department of Mathematics, Faculty of Science, King Mongkut's University of Technology Thonburi (KMUTT), 10140 Bangkok, Thailand
Journal of Industrial Intelligence | Volume 1, Issue 3, 2023 | Pages 184-193
Received: 08-16-2023, Revised: 09-11-2023, Accepted: 09-19-2023, Available online: 09-29-2023

Abstract:

The paramountcy of English in the contemporary global landscape necessitates the enhancement of English language proficiency, especially in academic settings. This study addresses the disparate levels of English proficiency among college students by proposing a novel approach to evaluate English text readability, tailored for the higher education context. Employing a deep learning (DL) framework, the research focuses on developing a model based on convolutional neural networks (CNNs) to assess the readability of English texts. This model diverges from traditional methods by evaluating the difficulty of individual sentences and extending its capability to ascertain the readability of entire texts through adaptive weight learning. The methodology's effectiveness is underscored by an accuracy rate of 82% in readability assessment, demonstrating its potential as a transformative tool in English language education. The application of this DL-based text readability evaluation model in college English training is explored, highlighting its potential to facilitate a more nuanced understanding of text complexity. Furthermore, the study contributes to the broader discourse on enhancing English language instruction in higher education, proposing a method that not only evaluates text comprehensibility but also aligns with diverse educational needs. The findings suggest that this approach could significantly support the enhancement of English teaching methodologies, thereby promoting a deeper, more accessible learning experience for students with varying levels of proficiency.

Keywords: Text readability, Convolutional neural networks (CNNs), English language, Higher education, Assessment

1. Introduction

English, often regarded as the universal language, exerts a significant influence on higher education, particularly in countries where English is not the native language. Its role transcends national borders, providing students from diverse linguistic backgrounds with access to a plethora of global academic resources. The widespread use of English in scholarly journals, textbooks, and academic conferences necessitates its mastery by students and researchers for engaging with contemporary research and academic discourse. Moreover, numerous universities in non-native English-speaking countries offer English-taught programs, attracting international students and fostering intercultural collaboration. This not only enhances educational experiences and aids in developing a global outlook but also improves professional prospects in an increasingly interconnected world [1], [2].

In the realm of text readability, traditional formulas such as the Flesch Reading Ease, the Gunning Fog Index, and the Automated Readability Index were long employed. These formulas, however, yielded imprecise assessments because they relied on surface-level word features and lacked any understanding of word meaning. The advent of DL models has revolutionized text readability assessment by automating the process and obviating the need for manual feature engineering. The efficiency and accuracy of automatic word feature extraction via DL models represent a significant advancement, reducing the reliance on human intervention. Notably, international researchers have contributed the Ranked Sentence Readability Score method, a deep neural network-based approach capable of processing multilingual texts [3], [4]. Despite these advancements, there remains a gap in addressing the challenges of English instruction in college settings. Students in higher education face the dual pressures of employment and further study, intensifying competition for job opportunities. Additionally, the ongoing reforms of the higher education system, still in their nascent stages, require closer alignment with secondary education. Consequently, English education in higher education contexts necessitates an efficient and effective approach.
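To illustrate how shallow these surface-based measures are, the Flesch Reading Ease score can be computed from nothing more than word, sentence, and syllable counts. The sketch below uses a naive vowel-group syllable counter as a rough approximation (real implementations rely on pronunciation dictionaries):

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels; drop a silent final 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

# A higher score indicates easier text.
easy = flesch_reading_ease("She eats breakfast every day.")
hard = flesch_reading_ease(
    "Botanists estimate that numerous indigenous species face imminent extinction."
)
```

Because the score sees only counts, two sentences with equal lengths and syllable totals receive identical scores regardless of meaning, which is precisely the limitation the DL approaches below address.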

This paper contributes to the field by evaluating English text readability through a DL model tailored for college English education. A text CNN evaluation model is developed, with parameters fine-tuned for optimal performance. The methodology includes the creation of a text readability dataset from college English textbooks. The results demonstrate that the model effectively predicts text readability with an accuracy rate of 82%, evaluating sentence complexity and text length through weight learning. This approach provides substantial support for English instruction in colleges, aiding in bridging the gap in students’ English proficiency.

2. Experimental Techniques and Design

Deep learning algorithms have revolutionized the assessment of text readability, offering more precise and adaptable methodologies compared to traditional measures. Utilizing large-scale text data, these algorithms, which include recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and CNNs, are employed to predict textual complexity. RNNs and LSTMs, in particular, have been recognized for their ability to analyze the context and structure of text [5], significantly surpassing traditional readability formulas such as Flesch Reading Ease and Gunning FOG Index in accuracy. These models, through training on extensive datasets of human-rated text readability, can meticulously evaluate text complexity by considering factors such as sentence length, word difficulty, and syntactic complexity. The proficiency of RNNs and LSTMs lies in their capacity to capture sequential relationships in language, thus enabling a comprehensive assessment of text coherence and flow [6]. Furthermore, CNNs, traditionally used in image processing, have been adapted for text analysis, demonstrating their versatility. They can extract salient features from text data, such as n-grams, to determine readability. By training CNNs on large text corpora annotated with readability labels, these models can discern patterns indicative of text difficulty, rendering them effective for a broad spectrum of readability evaluation contexts.

Moreover, the introduction of Transformers, a groundbreaking deep learning architecture, has markedly enhanced the assessment of text readability. Renowned for their efficacy in natural language processing, models like Bidirectional Encoder Representations from Transformers (BERT) utilize a bidirectional approach to gather context, allowing for a more nuanced understanding of text complexity [7]. Transfer learning has been instrumental in adapting these pre-trained models to readability datasets across various domains, thereby broadening their application scope.

These deep learning methods, encompassing sophisticated architectures like Transformers and neural networks like RNNs, LSTMs, and CNNs, have been pivotal in evaluating text complexity. By training on large, labeled datasets, they consider aspects such as syntax, semantics, and context, offering more refined and nuanced readability assessments. This advancement has profound implications for content creation, educational applications, and other sectors where text readability analysis is crucial [8]. Additionally, self-training and domain adaptation techniques have been explored to enhance these systems' proficiency in processing input from second language learners.

2.1 Transformers

In the context of text readability evaluation, Transformers have significantly outperformed conventional sequence models like RNNs or LSTMs. Their design, incorporating a self-attention mechanism, enables the assessment of the relative importance of each word in a sequence. Models like BERT, pre-trained on extensive text corpora, have been fine-tuned for readability evaluation, excelling in interpreting text complexity and adapting to domain-specific linguistic nuances [9]. Their ability to handle long-range dependencies in text positions Transformers as highly capable in considering context, semantics, and syntax in readability assessment.

2.2 RNNs and LSTMs

RNNs and LSTMs have been applied to various natural language processing applications, including text readability evaluation. RNNs process data sequences sequentially while maintaining a hidden state, and LSTMs, an advanced version of RNNs, effectively manage the vanishing gradient problem, making them suitable for longer sequences. These models excel in analyzing sentence structures and word dependencies, thus accurately estimating text complexity. While Transformers have largely superseded RNNs and LSTMs in NLP tasks, these models still hold relevance in certain contexts.

2.3 CNNs

CNNs, though primarily associated with image processing, have proven effective in text analysis tasks, including readability evaluation. Employed to extract textual features such as n-grams, CNNs undergo training on extensive datasets with readability labels, enabling them to identify patterns correlating with text complexity. While their popularity in NLP may not match that of Transformers, CNNs continue to offer valuable insights in text readability analysis and remain computationally efficient for specific tasks [10], [11].

3. Encoding Text

The initial step in utilizing deep learning for text readability assessment is text encoding, a process in which textual words are converted into numerical representations. Commonly, this is achieved through one-hot encoding, where each word is denoted by a large vector. The length of this vector corresponds to the total number of words in the corpus, with the position corresponding to the specific word set to one and all other entries set to zero [12]. This method, however, presents two substantial limitations: its inability to capture relationships between words and the problem of high dimensionality in large vocabularies [13].
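A minimal illustration of one-hot encoding and both limitations (the toy corpus and names are illustrative):

```python
# Toy corpus; the vocabulary index assigns each word a fixed position.
corpus = ["the", "student", "reads", "the", "text"]
vocab = sorted(set(corpus))                      # ['reads', 'student', 'text', 'the']
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    # Vector length equals the vocabulary size; only the word's position is 1.
    vec = [0.0] * len(vocab)
    vec[index[word]] = 1.0
    return vec

print(one_hot("student"))   # [0.0, 1.0, 0.0, 0.0]
# No relationship between words survives: every pair of distinct
# one-hot vectors has a dot product of zero, and the vector length
# grows linearly with the vocabulary.
```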

An alternative to one-hot encoding is distribution-based text representation, which addresses these drawbacks effectively. In this approach, words are represented as low-dimensional real-valued vectors, significantly reducing the dimensionality issue and allowing word-to-word relationships to be captured. Nevertheless, this method encounters a challenge in differentiating words with similar meanings, potentially leading to an overlap in word properties [14].

Prominent models utilized in distribution-based text representation include the Continuous Bag-of-Words (CBOW) model and the Skip-Gram model. The CBOW model focuses on predicting a word based on its context, estimating the likelihood of a word's occurrence within a specific contextual framework. Conversely, the Skip-Gram model infers the contextual words from a given word, essentially reversing the process employed by the CBOW model. Figure 1 and Figure 2 provide schematic illustrations of the CBOW and Skip-Gram models, respectively.

Figure 1. CBOW model
Figure 2. Skip-Gram model

In enhancing the efficacy of word training, both the CBOW and Skip-Gram models employ hierarchical SoftMax in the fully connected layer, coupled with a negative sampling strategy [15], [16]. However, these models share a common limitation in their design: they are trained to recognize the contextual continuity of words but do not distinguish between words that may appear frequently throughout the text. The models do not differentiate between global and local word associations, which could be a drawback. Despite this, both the CBOW and Skip-Gram models have demonstrated competent performance in readability assessments, largely unaffected by these limitations.
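A Skip-Gram model is trained on (center word, context word) pairs drawn from a sliding window, while CBOW reverses the roles. A minimal sketch of the pair generation step (function and variable names are illustrative):

```python
# Generate (center, context) training pairs for a Skip-Gram model
# with a symmetric context window; CBOW would instead predict the
# center word from the surrounding context words.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = ["english", "text", "readability", "assessment"]
pairs = skipgram_pairs(sentence, window=1)
# [('english', 'text'), ('text', 'english'), ('text', 'readability'),
#  ('readability', 'text'), ('readability', 'assessment'),
#  ('assessment', 'readability')]
```

Note that the window is purely local, which is the global-versus-local association limitation mentioned above.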

4. DL Networks

DL networks, such as CNNs, RNNs, and LSTMs, play a pivotal role in the process of text readability assessment. Each network type possesses distinct characteristics and functions, as elaborated below.

•CNNs, comprising multiple deep layers, primarily utilize convolution computations. A standard CNN architecture includes an input layer, convolutional layers, and pooling layers. Renowned for their efficacy in image processing, CNNs have also demonstrated considerable capability in text categorization and semantic analysis, especially in the realm of text readability assessment, a form of text classification (Figure 3).

Figure 3. CNN structure diagram

•As depicted in Figure 3, CNNs use various convolution kernels in their convolutional layers to extract word properties essential for text classification [17]. The activation function layer often employs the ReLU function for efficiency, and the output from the fully connected layer is flattened, enhancing its suitability for subsequent processing stages.

•RNNs, designed to handle sequential input, connect nodes in accordance with the sequence order, making them particularly effective in sequence processing. By carrying a hidden state forward at each step, they retain information about earlier elements of the sequence [18]. Figure 4 illustrates an RNN's structure, comprising an input layer, hidden layer, and output layer, underscoring its utility in both text processing and image recognition.

Figure 4. RNN model

•LSTMs, a specialized form of sequence model, consist of three primary components: the forget gate, input gate, and output gate. These networks overcome the gradient vanishing problem commonly encountered in traditional RNN models. LSTMs' innovative cellular structure allows for intelligent decision-making regarding data retention or disposal. The model sequentially analyzes data, making crucial decisions about which information to preserve and which to discard [19], [20]. Eq. (1) outlines the control function governing the process.

$f^{(t)}=\sigma\left(w_f\left[h^{(t-1)}, x^t\right]+b_f\right)$
(1)

where, $w_f$ and $b_f$ represent the weight and bias of the forget gate, respectively. The determination of the necessity and extent of alterations to historical data is a crucial step. Such data is selectively fed into the input gate, as delineated in Eqs. (2) to (4).

$i^t=\sigma\left(w_i\left[h^{(t-1)}, x^t\right]+b_i\right)$
(2)
$c^t=\tanh\left(w_c\left[h^{(t-1)}, x^t\right]+b_c\right)$
(3)
$C^t=i^t * c^t+f^{(t)} * C^{t-1}$
(4)

where, $w_i$ and $w_c$, $b_i$ and $b_c$ denote the respective weights and biases integral to the process. The value of the current cell state is represented by $C^t$. The role of the output gate is then to adjudicate whether the information, post-initial filtration by the preceding gates, is suitable for output. This gate encompasses internal switches that regulate the output mechanism, as shown in Eq. (5).

$o^t=\sigma\left(w_o\left[h^{(t-1)}, x^t\right]+b_o\right)$
(5)

where, $w_o$ and $b_o$ are the parameters representing the weights and biases of the output gates, respectively.

In natural language processing applications, Bidirectional LSTMs (BiLSTMs), which contain both forward and backward LSTM units, demonstrate the importance of contextual content in the representation of words [21]. Unlike standard LSTMs, which process a sequence only in the forward direction, BiLSTMs take both the preceding and following context of each word into account, highlighting the significance of bidirectional analysis.
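The gating equations of Eqs. (1) to (5) can be sketched as a single forward step in plain NumPy. This is an illustrative sketch only: the variable names are hypothetical, the candidate state uses the conventional tanh activation, and the final hidden-state readout $h^t = o^t * \tanh(C^t)$, standard in LSTM formulations, is included even though it is not written out above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, C_prev, w, b):
    """One LSTM step following Eqs. (1)-(5); w and b hold per-gate weights/biases."""
    z = np.concatenate([h_prev, x_t])        # [h^(t-1), x^t]
    f = sigmoid(w["f"] @ z + b["f"])         # forget gate, Eq. (1)
    i = sigmoid(w["i"] @ z + b["i"])         # input gate, Eq. (2)
    c = np.tanh(w["c"] @ z + b["c"])         # candidate state, Eq. (3)
    C = i * c + f * C_prev                   # cell state update, Eq. (4)
    o = sigmoid(w["o"] @ z + b["o"])         # output gate, Eq. (5)
    h = o * np.tanh(C)                       # standard hidden-state readout
    return h, C

# Toy dimensions: input size 3, hidden size 2.
rng = np.random.default_rng(0)
w = {k: rng.standard_normal((2, 5)) * 0.1 for k in "fico"}
b = {k: np.zeros(2) for k in "fico"}
h, C = lstm_cell(rng.standard_normal(3), np.zeros(2), np.zeros(2), w, b)
```

A BiLSTM would run a second, independently parameterized cell over the reversed sequence and concatenate the two hidden states per word.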

5. Text Readability Evaluation Model Using Text CNN

In the construction of the text readability evaluation model, datasets comprising textual information are employed. These datasets serve as the foundation for training the model, after which new textual data is introduced to evaluate the model's efficacy [22], [23]. The model is initially trained on a portion of the text that has been pre-labeled for this purpose. The labeled text data is denoted as $D=\{(d_1, l_1),(d_2, l_2), \ldots,(d_n, l_n)\}$, representing the text dataset, with $n$ indicating the number of labeled texts. $l_i \in\{G_1, G_2, \ldots, G_m\}$ signifies the readability grade of text $d_i$, and $m$ represents the number of readability grades. The model's structure is bifurcated, aligning with its theoretical underpinnings: one segment encompasses the dataset used for model training, while the other pertains to the dataset under evaluation. The labeled data facilitates the extraction of text features, which are subsequently instrumental in assessing the text. Figure 5 offers a schematic depiction of the model's structure.

Figure 5. Schematic diagram of the text evaluation model structure

As illustrated in Figure 5, the model comprises three primary components: the text input layer, the text feature extraction layer, and the text classifier. These components collectively facilitate the transition from text dataset to readability evaluation. Additionally, references [24], [25], [26], [27], [28], [29], [30], [31], [32], [33] present further deep learning models pertinent to image detection, providing a broader context to the methodologies employed.

The process of text representation, crucial for text readability assessment, involves the generation of word vectors. These vectors, representing words numerically, are embedded within a vector space. The subsequent stages involve the extraction and training of features, with deep learning algorithms facilitating the training of words across multiple dimensions. This method allows for the representation of textual input, where the proximity of word vectors signifies similarity in meaning [34]. The model for text feature extraction, grounded in CNN, employs convolution kernels of varying sizes to train word data from the input layer, thus extracting word relationships.
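A minimal sketch of this multi-kernel feature extraction, with hypothetical names and a single convolution filter per kernel width (a real text CNN learns many filters per width):

```python
import numpy as np

def textcnn_features(embeddings, kernels):
    """Multi-width 1D convolution over a sentence's word-vector matrix,
    followed by ReLU and max-over-time pooling, as in a text CNN."""
    features = []
    for W in kernels:                        # W: (kernel_size, embed_dim)
        k = W.shape[0]
        n = embeddings.shape[0] - k + 1      # number of sliding positions
        conv = np.array([np.sum(embeddings[p:p + k] * W) for p in range(n)])
        conv = np.maximum(conv, 0.0)         # ReLU activation
        features.append(conv.max())          # max-over-time pooling
    return np.array(features)

rng = np.random.default_rng(1)
sentence = rng.standard_normal((10, 8))               # 10 words, 8-dim embeddings
kernels = [rng.standard_normal((k, 8)) for k in (3, 4, 5)]  # three kernel widths
feats = textcnn_features(sentence, kernels)           # one pooled feature per kernel
```

The pooled feature vector is what the fully connected and SoftMax layers described next operate on.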

In the text classifier component, words are categorized using a SoftMax layer, which applies logistic regression for classification. Logistic regression, under the assumption of a Bernoulli distribution for the data, utilizes gradient descent for optimizing parameters through maximum likelihood estimation [35], [36]. Post-classification, the categorized word vectors are then fed into the fully connected layer. Regarding dataset preparation, the required text readability dataset is derived from undergraduate English textbooks, following the procedure illustrated in Figure 6.

Figure 6. Data collection process

The process begins with the segmentation of college textbooks into sections using a book cutter. High-speed scanning is then applied to each page. Subsequently, text recognition interfaces identify the textual content within the instructional materials. Under expert supervision, the scanned content undergoes rigorous examination. Post-evaluation, any repetitive scanned words are removed using Excel, followed by updating the text database [37].

For sample balance, three textbooks are selected, categorized into three levels based on difficulty: Level 1 (easiest), Level 2 (moderately difficult), and Level 3 (most difficult) [38], [39], [40]. To maintain consistency, the extraction of sentences is limited to 1,000 per level. Table 1 provides a breakdown of the sentence count across these levels.

Table 1. Number of sentences in full dataset
Level | Number of Sentences
Level 1 | 999
Level 2 | 998
Level 3 | 999

In the hyperparameter configuration for model training, 25% of the samples are held out for evaluation, with the remaining 75% forming the training set. Table 2 outlines the hyperparameter settings for the text CNN model. A batch size of 64 is selected for each training iteration, incorporating a dropout rate of 0.5. Convolution kernel sizes are set at three, four, and five, with a learning rate of 0.001.

Table 2. Hyperparameter settings
Hyperparameter | Value
Batch size | 64
Dropout | 0.5
Kernel sizes | 3, 4, 5
Learning rate | 0.001
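The 75/25 partition over the sentence counts of Table 1 can be sketched as a stratified split (sentence placeholders and function names are hypothetical):

```python
import random

# Hypothetical 75/25 train/test split, stratified by difficulty level,
# using the per-level sentence counts from Table 1.
random.seed(42)

def split_level(sentences, train_frac=0.75):
    idx = list(range(len(sentences)))
    random.shuffle(idx)
    cut = int(len(sentences) * train_frac)
    return [sentences[i] for i in idx[:cut]], [sentences[i] for i in idx[cut:]]

levels = {1: [f"L1-{i}" for i in range(999)],
          2: [f"L2-{i}" for i in range(998)],
          3: [f"L3-{i}" for i in range(999)]}
train, test = [], []
for level, sents in levels.items():
    tr, te = split_level(sents)
    train += [(s, level) for s in tr]
    test += [(s, level) for s in te]
print(len(train), len(test))   # 2246 750
```

Stratifying per level keeps the class balance of Table 1 intact in both partitions.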

6. Results and Discussion

The implemented deep learning-based model for text readability assessment has demonstrated an impressive accuracy rate of 82%. Table 3 illustrates the model's ability to classify sentences into distinct complexity levels following training.

Table 3. Sentence readability evaluation results
Text Type | Readability Level
She consumes breakfast every single day. | Level 1
Frank is currently working at a hotel in Yorkshire, which is regarded as one of the coldest places in the UK. | Level 2
More than 3,000 native plant species, or more than 10% of the roughly 25,000 species in the United States, may go extinct in the next ten years, according to botanists. | Level 3

These results indicate that the model can accurately categorize textbook phrases into Level 1, Level 2, and Level 3. The analysis so far has addressed the difficulty of individual sentences. To evaluate a complete text, Level 1, Level 2, and Level 3 sentences are assigned different weights, and the weighted results are aggregated into Scores 1, 2, and 3, respectively. The lower the score, the simpler the text is to comprehend, with Score 1 indicating the easiest level. Conversely, a higher score, such as Score 3, suggests greater complexity. Figure 7 displays the outcomes of the model's evaluation of text readability.
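The exact learned weights are not reported, so the aggregation can only be sketched hypothetically: reduce a text's sentence levels to a single score by a weighted average (uniform weights assumed here, where the paper learns them adaptively):

```python
# Hypothetical aggregation: each sentence contributes its difficulty level,
# and the text score is the weighted average rounded to the nearest level.
# Uniform weights are assumed; the paper learns the weights adaptively.
def text_score(sentence_levels, weights=None):
    if weights is None:
        weights = [1.0] * len(sentence_levels)
    total = sum(w * lvl for w, lvl in zip(weights, sentence_levels))
    return round(total / sum(weights))

print(text_score([1, 1, 2, 1]))      # mostly simple sentences  -> Score 1
print(text_score([3, 2, 3, 3, 2]))   # mostly complex sentences -> Score 3
```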

Figure 7. Text readability model evaluation results

The model categorizes Texts 1, 2, and 3 into Level 1, Level 2, and Level 3 complexities, respectively (Figure 7). This classification underscores the feasibility and accuracy of the proposed approach. Such findings can be effectively employed in educational settings, allowing for tailored instruction that accounts for individual student capabilities. This approach is beneficial for student development. Moreover, as illustrated in Figure 8, the model can classify students' English proficiency levels, with 12% fitting into Level 1, 31% into Level 2, and 57% into Level 3.

Figure 8. Distribution of students' English proficiency levels

Additionally, a control group experiment involving homework assignments was conducted to validate the model's impact on university teaching. The experimental group utilized the DL-based text readability evaluation method, while the control group did not. It was observed that texts classified as Level 1 and Level 2 received more correct responses compared to Level 3, which had the least. This correlation between reading test scores and text readability levels confirms the validity of the DL-based text readability evaluation method from a scientific perspective. The empirical evidence supports the effectiveness of this approach in enhancing college English instruction.

7. Conclusions

This study has explored a deep learning-based methodology for evaluating text readability within the context of collegiate English education. A text CNN evaluation model was developed and demonstrated an accuracy rate of 82%. The findings suggest that further exploration into more sophisticated models, such as LSTM, BiLSTM, Transformer, and BERT, is warranted to enhance the accuracy of text readability evaluations.

A limitation of this study is identified in the small dataset size, which may not comprehensively represent the diversity of texts encountered in college English courses. This limitation potentially affects the model's ability to accurately assess a broader spectrum of educational materials. Additionally, the study's small sample size could impact the validity of the findings and may not reflect the full range of texts utilized in higher education settings. Moreover, the model's parameter settings are based on pre-existing knowledge, which introduces the risk of bias or suboptimal performance.

Future investigations should examine more sophisticated models such as LSTM, BiLSTM, Transformer, and BERT to address these limitations and further refine the assessment of text readability, particularly in the context of college English instruction. Improved readability assessment methods could contribute to more effective teaching strategies, benefiting both educators and students in higher education.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.


Cite this:
Zulqarnain, M. & Saqlain, M. (2023). Text Readability Evaluation in Higher Education Using CNNs. J. Ind Intell., 1(3), 184-193. https://doi.org/10.56578/jii010305
©2023 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.