Exploring Research on ChatGPT in Online Consumer Reviews: A Systematic Literature Review and Thematic Analysis
Abstract:
Studies on ChatGPT within the context of online consumer reviews (OCRs) have emerged as part of the broader exploration of generative AI across multiple disciplines. However, to date, no research has systematically examined the current research focus or other key aspects related to the application of ChatGPT in OCRs. To address this gap, this study conducts a systematic literature review to identify dominant research focus areas, highlight existing research gaps, and propose directions for future research. Guided by the PRISMA 2020 protocol and employing a thematic analysis approach, 22 relevant studies were analysed, revealing three overarching themes: (1) ChatGPT for review analytics, (2) ChatGPT for review modeling and evaluation, and (3) ChatGPT for review management. The findings indicate that current research primarily emphasizes ChatGPT’s potential as an analytical tool for OCR datasets, enabling the extraction of valuable and actionable insights for both marketers and researchers. In addition, the review identifies growing concern regarding fake reviews and highlights the emerging use of ChatGPT-generated synthetic reviews as datasets for developing fake review detection models, offering a practical alternative for studies facing challenges in obtaining high-quality training data. Finally, findings related to the third theme demonstrate ChatGPT’s utility in supporting managerial responses to customer reviews, providing insights into its role in enhancing customer relationship management. Overall, this review suggests that research on ChatGPT in OCRs remains at an early stage but offers significant insights and opportunities for future investigation in this emerging field.
1. Introduction
In the contemporary digital marketplace, consumers increasingly rely on online platforms to inform their purchasing decisions. Among these platforms, online consumer reviews (OCRs) have emerged as one of the most influential information sources, shaping consumer perceptions, guiding preferences, and influencing purchase intentions (Su et al., 2025). Prior research highlights that OCRs function as an essential decision-support mechanism, enabling consumers to evaluate service quality, credibility, and reliability before making choices (Jeong & Lee, 2024). This influence is particularly pronounced among younger generations, who demonstrate a strong dependence on online feedback when selecting products and services (Roy et al., 2017). From a managerial perspective, OCRs not only affect firm performance and sales outcomes (Babić Rosario et al., 2016; Harun et al., 2025; Ishak & Harun, 2023) but also offer rich insights into consumer needs, preferences, and satisfaction levels (Baier et al., 2025). Collectively, OCRs serve as a critical interface between consumers and firms, shaping both purchasing behavior and strategic decision-making.
As the volume and complexity of OCRs continue to expand, both researchers and practitioners have increasingly turned to artificial intelligence (AI) to analyze and manage review content more efficiently. AI technologies are now deeply embedded across industries (Collins et al., 2021) and are commonly defined as systems capable of simulating human reasoning, learning, and problem-solving processes (Berente et al., 2021). Within this broader AI landscape, ChatGPT, which is powered by large language models (LLMs), has rapidly gained prominence. Trained on extensive textual datasets, LLMs are able to generate coherent, contextually relevant, and human-like text (Campbell IV et al., 2024). Reflecting its versatility, ChatGPT has been widely adopted across diverse domains, including healthcare (Garg et al., 2023; Miao et al., 2023), education (Xiao et al., 2025), tourism (Kumamoto & Joho, 2025), entrepreneurship (Duong et al., 2025), and content creation (Hwang & Lee, 2025).
Despite this growing adoption, research examining ChatGPT within the context of OCRs remains fragmented. Existing studies tend to focus on isolated applications, such as review generation (Amos & Zhang, 2024; Knoedler et al., 2024), sentiment analysis (Cheng et al., 2024), or review management (Tan et al., 2025; Wayne Litvin & Pei-Sze Tan, 2024), without offering a consolidated understanding of the broader research landscape. As a result, there is limited clarity regarding the dominant research themes, methodological approaches, industry contexts, and emerging trends that characterize this body of literature. This lack of systematic synthesis constrains theoretical development and hinders the identification of research gaps or the establishment of coherent future research agendas at the intersection of ChatGPT and OCRs. Taken together, this fragmentation highlights the need for a systematic literature review to organize existing knowledge and guide future research on the use of ChatGPT in the context of OCRs.
To address this gap, this study conducts a systematic literature review (SLR) integrating thematic analysis to explore the application of ChatGPT in OCRs. Specifically, this review aims to identify prevailing research focus areas, dominant themes, industry distributions, and methodological patterns in the existing literature, as well as to highlight underexplored issues and future research opportunities. Accordingly, this study is guided by the following research questions:
RQ1: What are the current publication fields, industry distributions, and research methodologies in studies examining ChatGPT and OCRs?
RQ2: What key themes and dominant research focus areas have emerged in studies on the application of ChatGPT in OCRs?
RQ3: What research gaps and future directions can be identified based on the current state of scholarship in this area?
This review offers several contributions. First, it provides a structured and comprehensive synthesis of prior research to map the intellectual landscape surrounding ChatGPT and OCRs. Second, it identifies thematic and methodological trends, clarifying where scholarly attention has been concentrated and where it remains limited. Third, by highlighting research gaps and emerging directions, this study offers guidance for future investigations and supports the responsible, evidence-based application of generative AI in online review contexts.
The remainder of this paper is organized as follows. Section 2 describes the SLR methodology, including the search strategy and selection criteria. Section 3 presents the descriptive and thematic findings. Section 4 discusses the theoretical and managerial implications, and Section 5 concludes by outlining research gaps, limitations, and directions for future studies.
2. Methodology
This study adopted an SLR approach combined with thematic analysis, drawing on established methodological guidance (Braun & Clarke, 2006; Nasrabadi et al., 2024; Page et al., 2021; Pham et al., 2023). The selection of an SLR over alternative review methods, such as bibliometric or meta-analytical analyses, was guided by several considerations. First, the SLR approach offers richer qualitative insights by emphasising content-level and contextual understanding rather than relying solely on quantitative indicators such as citation counts (Ishak et al., 2025; Nikseresht et al., 2024; Rojas-Sánchez et al., 2023). This qualitative orientation is particularly well suited to thematic analysis, as it enables the systematic identification, organisation, and interpretation of recurring patterns and meanings across studies, thereby facilitating deeper theoretical integration and synthesis. Such an approach allows researchers to capture nuanced insights that purely quantitative review methods may overlook (Marzi et al., 2024). As noted by Paul & Criado (2020), a well-structured SLR plays a critical role in consolidating existing knowledge within a research domain, thereby enhancing conceptual clarity and contextual depth.
Second, compared with other review methodologies, the SLR technique provides a more systematic and rigorous framework for addressing clearly defined research objectives (Ahmad et al., 2022; Angioi & Hiller, 2023). When combined with thematic analysis, the SLR process enables the transparent and replicable identification of key themes, sub-themes, and research patterns across the reviewed studies. This involves the formulation of explicit research aims, the selection of appropriate academic databases, and the application of a structured procedure for coding, theme development, and synthesis (Braun & Clarke, 2006; Nasrabadi et al., 2024). Together, this methodological integration strengthens the credibility, interpretive depth, and comprehensiveness of the review findings. Accordingly, the combined SLR and thematic analysis approach was deemed the most appropriate method for achieving the objectives of this study.
This study employed an SLR approach to explore the current research focus on the application of ChatGPT in the context of OCRs, guided by the PRISMA 2020 protocol proposed by Page et al. (2021). The PRISMA 2020 framework provides a set of 27 reporting items intended to enhance transparency, rigor, and reproducibility in systematic reviews. Accordingly, the review process was structured into three main phases—identification, screening, and inclusion—as illustrated in Figure 1.

The literature search was conducted using two prominent academic databases, Elsevier’s Scopus and Clarivate’s Web of Science (WoS), selected for their extensive coverage and comprehensive indexing compared with other academic databases (Carrera-Rivera et al., 2022; Nikseresht et al., 2024). The eligibility criteria restricted inclusion to peer-reviewed journal articles published between 2022 and 2025 to ensure the relevance and timeliness of the evidence base (Muslim & Harun, 2022; Snyder, 2019). The year 2022 was selected as the start of the search window because ChatGPT was released at the end of that year, so no relevant studies predate it.
The search was conducted between July and December 2025. The development of the search strategy, including keyword selection, was adapted from Nasrabadi et al. (2024). Keywords were designed to capture studies examining ChatGPT in the context of OCRs. The final list of search terms included “ChatGPT,” “online reviews,” “online customer reviews,” “online consumer reviews,” “customer reviews,” “user reviews,” “consumer reviews,” and “user-generated content.” Notably, only “ChatGPT” was used, rather than broader terms such as large language models or generative AI, to maintain conceptual specificity. This restriction ensured the identification of studies explicitly addressing ChatGPT. Although some relevant studies may employ broader terminology in their titles or abstracts, author-assigned keywords typically reflect the core content and research focus of an article (Corrin et al., 2022). Our review found that 80% (n = 20) of the selected articles included “ChatGPT” as an author-assigned keyword. Accordingly, prioritizing “ChatGPT” as the primary search term enhanced relevance and reduced the inclusion of studies examining generative AI more broadly without directly focusing on ChatGPT in the context of OCRs.
The search was executed using the advanced document search function by combining the selected keywords with the Boolean operators “OR” and “AND” across the title, abstract, and keyword fields (Khan et al., 2024). For the Scopus database, the following comprehensive search string was applied: (TITLE-ABS-KEY(“ChatGPT”) AND TITLE-ABS-KEY (“online reviews” OR “online customer reviews” OR “online consumer reviews” OR “customer reviews” OR “user reviews” OR “consumer reviews” OR “user-generated content”)). For the Web of Science (WoS) database, the corresponding search string was: TS = (“ChatGPT”) AND TS = (“online reviews” OR “online customer reviews” OR “online consumer reviews” OR “customer reviews” OR “user reviews” OR “consumer reviews” OR “user-generated content”). The search yielded 113 records from Scopus and 17 records from WoS. After removing 16 duplicate records, 97 unique records remained and were subsequently subjected to the screening phase.
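The two search strings above share one structure: the primary term combined via AND with an OR-group of review-related terms. The following is a minimal sketch of how such strings could be assembled from the keyword list to keep both databases in sync. The function names (`or_group`, `scopus_query`, `wos_query`) are hypothetical helpers introduced for illustration, and the sketch uses straight quotation marks rather than the typographic ones printed above.

```python
# Keyword lists copied from the search strategy described in the text.
PRIMARY_TERM = "ChatGPT"
REVIEW_TERMS = [
    "online reviews",
    "online customer reviews",
    "online consumer reviews",
    "customer reviews",
    "user reviews",
    "consumer reviews",
    "user-generated content",
]


def or_group(terms):
    """Quote each term and join them with OR inside one pair of parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def scopus_query(primary, terms):
    """Assemble a Scopus query over the title, abstract, and keyword fields."""
    return f'(TITLE-ABS-KEY("{primary}") AND TITLE-ABS-KEY{or_group(terms)})'


def wos_query(primary, terms):
    """Assemble a Web of Science topic (TS) query."""
    return f'TS = ("{primary}") AND TS = {or_group(terms)}'


# Print both strings so they can be pasted into the advanced search forms.
print(scopus_query(PRIMARY_TERM, REVIEW_TERMS))
print(wos_query(PRIMARY_TERM, REVIEW_TERMS))
```

Generating both strings from a single keyword list is only a convenience for replication; it guards against the two queries drifting apart if a term is added or removed.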
The screening phase is a critical component of the SLR process, as it enhances reliability and mitigates potential bias. During this stage, the 97 identified papers were screened for eligibility. Table 1 summarises the inclusion and exclusion criteria applied during the screening process. The screening process was facilitated by the detailed publication metadata provided by the selected databases, which categorized documents as research articles, conference papers, or other publication types. Publications categorized as conference papers (n = 36), conference reviews (n = 5), review articles (n = 2), book chapters (n = 1), and short surveys (n = 1) were excluded, resulting in a total of 45 exclusions. The decision to exclude conference papers, conference reviews, book chapters, and short surveys aligns with methodological studies suggesting that such documents may provide limited detail and hinder the extraction of meaningful and reliable data (Scherer & Saldanha, 2019; Taylor et al., 2014). Review articles were excluded to avoid duplication of evidence and potential bias, as they synthesize findings from primary studies rather than provide original empirical data (Ahmad et al., 2022). These exclusions, supported by the literature, help to maintain a focused and rigorous evidence base.
Following this eligibility screening, the remaining 52 records were examined in more detail at the title and abstract level in accordance with the PRISMA 2020 protocol (Page et al., 2021). Two researchers independently reviewed the titles and abstracts and resolved any disagreement about which articles should proceed to full-text screening through discussion until consensus was reached (Zupic & Čater, 2015). The same two researchers then independently screened the full-text articles for inclusion, again resolving disagreements through discussion to reach a final inclusion or exclusion decision. As a result, 34 articles were identified as not relevant, and 18 articles were retained for the review.
Criterion | Inclusion Criteria | Exclusion Criteria |
Research relevance | Studies examining the application, evaluation, or implications of ChatGPT in the context of OCRs | Studies focusing on AI, LLMs, or generative AI without explicit relevance to ChatGPT in OCRs |
Document type | Peer-reviewed journal articles | Conference papers, conference reviews, book chapters, short surveys, editorials, and review articles |
Language | English-language publications | Non-English publications |
Publication period | Studies published between 2022 and 2025 | Studies published outside the defined period |
Data availability | Full-text articles accessible | Abstract-only or inaccessible full texts |
Indexing source | Studies indexed in Scopus or Web of Science | Studies indexed exclusively in other databases |
To ensure completeness, an additional search was conducted using Google Scholar with the same set of keywords (Ahmad et al., 2022; Pham et al., 2023). This process identified four additional papers, which were subsequently checked in the Scopus and WoS databases and found to be indexed in Scopus. The full texts of these papers were then retrieved. Two researchers carefully examined the titles and abstracts of these papers and reached a consensus to include them for further review. Consequently, a total of 22 studies were finalized for inclusion in the review. The final step of this phase involved compiling a summary table containing essential details from each selected study, including authors, titles, publication years, journal sources, methodologies, key findings, and other relevant information (see Appendix A).
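The record counts reported across the screening and inclusion steps can be traced with simple arithmetic. The sketch below is only a consistency check on the figures stated in the text, starting from the 97 de-duplicated records that entered screening; the variable names are illustrative and no new data are introduced.

```python
# All counts are taken from the text; the variable names are illustrative.
unique_records = 97  # records entering the screening phase

# Exclusions by document type, as listed in the screening description.
excluded_by_type = {
    "conference papers": 36,
    "conference reviews": 5,
    "review articles": 2,
    "book chapters": 1,
    "short surveys": 1,
}
not_relevant = 34         # removed during title/abstract and full-text screening
added_from_scholar = 4    # added through the supplementary Google Scholar search

total_type_exclusions = sum(excluded_by_type.values())
after_type_filter = unique_records - total_type_exclusions
after_relevance = after_type_filter - not_relevant
final_included = after_relevance + added_from_scholar

print(total_type_exclusions)  # 45
print(after_type_filter)      # 52
print(after_relevance)        # 18
print(final_included)         # 22
```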
Consistent with prior exploratory and thematic systematic literature reviews, this study did not conduct a formal risk-of-bias or quality assessment of the included studies (Paul & Criado, 2020; Snyder, 2019). The primary objective of the review was to map research themes, methodological trends, and dominant research focus areas rather than to evaluate effect sizes or intervention outcomes (Marzi et al., 2024). Furthermore, all included studies were peer-reviewed journal articles indexed in Scopus and Web of Science, which provides an initial level of quality assurance (Carrera-Rivera et al., 2022). This approach aligns with previous SLRs that emphasize conceptual synthesis and theory development over methodological appraisal (Nikseresht et al., 2024; Rojas-Sánchez et al., 2023).
The selection of relevant papers in this study was conducted in accordance with the PRISMA 2020 protocol (Page et al., 2021), resulting in the identification of 22 studies for analysis. Subsequently, thematic analysis was employed to identify key themes and to establish the dominant research focus areas within the selected literature, thereby directly addressing the second research question of the study. The analysis followed the well-established framework proposed by Braun & Clarke (2006), which is widely regarded as one of the most influential approaches in the social sciences. In addition, the guidance provided by Maguire & Delahunt (2017) was used to support the systematic execution of the thematic analysis.
Following Braun & Clarke (2006), thematic analysis was conducted through six sequential phases: (1) familiarisation with the data, (2) generation of initial codes, (3) searching for themes, (4) reviewing themes, (5) defining and naming themes, and (6) producing the final write-up. During the familiarisation phase, the reviewers examined the titles, objectives, methodologies, and key findings of the selected articles, with additional full-text readings undertaken where necessary to develop a deeper understanding of the studies (Maguire & Delahunt, 2017; Nasrabadi et al., 2024). Building on this process, initial codes were generated manually using an inductive, open-coding approach, with no pre-established codes applied. These codes were progressively compared, sorted, and refined to identify broader themes (Braun & Clarke, 2006; Clark et al., 2019; Liñán & Fayolle, 2015). Manual coding was considered appropriate given the manageable sample size of 22 studies, as it enabled close engagement with the data and facilitated deeper analytical insight.
To ensure coding consistency and reliability, two independent reviewers conducted the coding process. Inter-coder reliability was assessed using Cohen’s Kappa based on a contingency table constructed from their coding decisions. The analysis yielded a Kappa value of 0.775, indicating substantial agreement beyond chance. In cases of coding discrepancies, the coders discussed differences to clarify code definitions and reach consensus, thereby ensuring consistent application of the coding framework and enhancing reproducibility.
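For readers less familiar with the statistic, Cohen's Kappa compares the observed proportion of agreement between two coders, p_o, with the agreement expected by chance from their marginal totals, p_e, as (p_o - p_e) / (1 - p_e). The sketch below shows this calculation from a contingency table. The example table is hypothetical and is not the coding data from this review, so it does not reproduce the reported value of 0.775.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square contingency table.

    Rows hold coder A's decisions, columns hold coder B's decisions,
    and table[i][j] counts items coded i by A and j by B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of items on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the coders' marginal proportions.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 table (NOT the review's data): 50 coding decisions.
example = [
    [20, 5],
    [10, 15],
]
print(round(cohens_kappa(example), 3))  # 0.4
```

In this hypothetical table the coders agree on 35 of 50 items (p_o = 0.7), chance agreement is 0.5, and the resulting value is 0.4, well below the agreement reported for this review.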
The third phase focused on identifying relevant themes. Consistent with Braun & Clarke (2006), themes were defined based on their relevance to the research question rather than their frequency of occurrence. At this stage, the analysis shifted from individual codes to a broader thematic level, whereby related codes derived from the reviewed studies were systematically organised, compared, and grouped into potential themes. This process involved examining how different codes converged to form overarching themes that represent the dominant research focus areas within the literature. To support this organisation and synthesis, a detailed table illustrating the development of initial codes into sub-themes and overarching themes is provided in Appendix B. In the fourth and fifth phases, the preliminary themes were reviewed and refined to minimise overlap and enhance conceptual clarity, resulting in a coherent and well-defined thematic structure that captured distinct and meaningful aspects of the research focus across the included studies (Ryan & Bernard, 2003).
In the final phase, the refined themes were consolidated and articulated in relation to the research objectives. This phase involved synthesising the themes into a coherent narrative and presenting them with supporting evidence from the reviewed studies to ensure transparency and analytical rigour. The final write-up aimed to clearly communicate the dominant research focus areas and their interrelationships within the existing literature (Braun & Clarke, 2006; Thorpe et al., 2005). The entire thematic analysis procedure was conducted manually to maintain close interpretive engagement with the data.
3. Findings
This study conducted an SLR on the application of ChatGPT in OCRs, with a primary emphasis on thematic analysis. A descriptive quantitative analysis was first undertaken, followed by an inductive thematic analysis to identify and synthesise key themes characterising current research on ChatGPT in OCRs. The findings from the thematic analysis constitute the core qualitative contribution of this study.
This section presents a descriptive quantitative analysis of the research landscape, focusing on trends across publication fields, industries, and research methodologies. The analysis provides an overview of the structure of existing research and highlights the dominant areas of scholarly attention.
The 22 reviewed papers were published across a diverse range of academic journals, underscoring the multidisciplinary nature of research on ChatGPT in OCRs, as shown in Table 2. Publications span several major fields. Hospitality and Tourism is represented by journals such as Cornell Hospitality Quarterly (Wayne Litvin & Pei-Sze Tan, 2024), Tourism Management (Tan et al., 2025), and the International Journal of Hospitality Management (Choi et al., 2024), while Retail and Consumer Services is represented by the Journal of Retailing and Consumer Services (Baier et al., 2025; Zhao et al., 2025). Additionally, a significant portion of the research appeared in Computer Science and Information Technology outlets, including Telematics and Informatics (Amos & Zhang, 2024), Information Processing and Management (Xylogiannopoulos et al., 2024), and IEEE Access (Mathebula et al., 2024), reflecting the technological foundations of ChatGPT-related studies. Other contributions emerged from Healthcare and Medicine journals such as PLoS ONE (Li et al., 2025) and the Journal of Plastic, Reconstructive and Aesthetic Surgery (Knoedler et al., 2024), as well as multidisciplinary platforms including Scientific Reports (Su et al., 2025), Sustainability (Switzerland) (Jeong & Lee, 2024), and Technology in Society (Koc et al., 2023). These journals show how the topic bridges multiple disciplines, from tourism and consumer behavior to computing and health sciences.
Journal Field | Journal Name | No. of Studies | No. of Citations |
Hospitality & Tourism | Cornell Hospitality Quarterly | 1 | 6 |
Hospitality & Tourism | Journal of Hospitality and Tourism Technology | 2 | 4 |
Hospitality & Tourism | Tourism Management | 1 | 6 |
Hospitality & Tourism | International Journal of Hospitality Management | 1 | 10 |
Hospitality & Tourism | International Journal of Contemporary Hospitality Management | 1 | 3 |
Hospitality & Tourism | Journal of Qualitative Research in Tourism | 1 | 0 |
Computer Science & Information Technology | Telematics and Informatics | 1 | 12 |
Computer Science & Information Technology | Knowledge-Based Systems | 1 | 0 |
Computer Science & Information Technology | Information Processing and Management | 1 | 8 |
Computer Science & Information Technology | Computers | 1 | 0 |
Computer Science & Information Technology | IEEE Access | 2 | 15 |
Multidisciplinary & General Science | Scientific Reports | 1 | 1 |
Multidisciplinary & General Science | Sustainability (Switzerland) | 1 | 14 |
Multidisciplinary & General Science | Technology in Society | 1 | 63 |
Medical & Surgical | Journal of Plastic, Reconstructive and Aesthetic Surgery | 1 | 9 |
Medical & Surgical | Journal of Foot and Ankle Surgery | 1 | 3 |
Medical & Surgical | PLoS ONE | 1 | 1 |
Retailing & Consumer Services | Journal of Retailing and Consumer Services | 2 | 2 |
Operations Research & Management Science | Annals of Operations Research | 1 | 9 |
The reviewed studies span a range of industry contexts, demonstrating ChatGPT’s broad relevance and adaptability across service-oriented sectors, as shown in Figure 2. Research contexts are predominantly concentrated in the hospitality and tourism industry, followed by e-commerce and retail, and then healthcare. Hospitality and tourism emerge as the primary research focus, with 12 papers, of which 10 examine hotel reviews and 2 focus on tourism platforms such as TripAdvisor and Airbnb (Cheng et al., 2024; Morini-Marrero et al., 2025; Tan et al., 2025; Wayne Litvin & Pei-Sze Tan, 2024). The e-commerce and retail sector represents the second major area, encompassing four studies that analyze online reviews from platforms such as Amazon, Yelp, and Google Play Store, as well as restaurant feedback datasets (Baier et al., 2025; Su et al., 2025; Zhao et al., 2025). In contrast, healthcare and medical contexts form a smaller yet emerging category, with three studies focusing on patient and surgeon reviews on platforms such as Haodf.com and Healthgrades (Knoedler et al., 2024; Li et al., 2025). Additional sectors appear less frequently: the food and beverage industry is represented by one study (McCloskey et al., 2024), cross-industry service contexts (including airlines and the clothing industry) are examined in one study (Rosete et al., 2025), and the finance sector appears in one study (Mathebula et al., 2024). Collectively, these patterns highlight the predominance of hospitality- and retail-focused research while underscoring ChatGPT’s expanding role in analyzing consumer experiences across diverse service industries.

The selected studies employed a range of methodological approaches, as summarised in Table 3. Computational text mining and NLP-based approaches were most prevalent, with many studies applying large-scale text analysis, machine learning, predictive modelling, and sentiment or topic modelling to analyse OCRs and examine ChatGPT’s analytical capabilities (Baier et al., 2025; Botunac et al., 2024; Casciato & Mateen, 2024; Cheng et al., 2024; Choi et al., 2024; Koc et al., 2023; Li et al., 2025; Mathebula et al., 2024; Ramos-Henriquez & Morini-Marrero, 2025; Su et al., 2025; Xylogiannopoulos et al., 2024; Zhao et al., 2025). Experimental designs formed the second largest group, employing perception, comparative, detection, and quasi-experimental approaches to assess the credibility and behavioural effects of ChatGPT-generated reviews (Amos & Zhang, 2024; Knoedler et al., 2024; Morini-Marrero et al., 2025; Wayne Litvin & Pei-Sze Tan, 2024). Perception and behavioural studies were less common, focusing on user or managerial evaluations of AI-generated content through surveys or qualitative assessments (Hajra, 2023; Jeong & Lee, 2024; Rosete et al., 2025). Only one study adopted a mixed-methods approach, integrating computational and experimental insights to examine managerial response contexts (Tan et al., 2025). Overall, the dominance of computational and experimental approaches reflects the field’s emphasis on empirical validation, while the limited use of perception-oriented and mixed-methods designs indicates opportunities for deeper contextual and theory-driven research. Figure 3 provides a visualization of the methodologies utilized in the reviewed studies.

Methodological Approach | No. of Papers | Description |
Computational Text Mining/NLP Approaches | 11 | This category comprises studies that primarily apply computational techniques to analyse or generate OCRs, including large-scale text analysis (Zhao et al., 2025), supervised machine learning (Su et al., 2025), predictive modelling (Baier et al., 2025), sentiment and topic modelling (Cheng et al., 2024; Li et al., 2025), and other NLP-based or AI-driven analytical approaches (Botunac et al., 2024; Casciato & Mateen, 2024; Choi et al., 2024; Koc et al., 2023; Mathebula et al., 2024; Ramos-Henriquez & Morini-Marrero, 2025; Xylogiannopoulos et al., 2024). |
Experimental Designs | 7 | This category includes studies employing experimental or quasi-experimental designs to examine the performance, credibility, or behavioural effects of ChatGPT-generated reviews, such as perception experiments (Wayne Litvin & Pei-Sze Tan, 2024), controlled comparative experiments (Amos & Zhang, 2024), human–AI comparative detection experiments (Knoedler et al., 2024), and quasi-experimental designs (Morini-Marrero et al., 2025). |
Perception/Behavioral Studies | 3 | This category covers studies that focus on users’ or managers’ perceptions, evaluations, or behavioural responses to ChatGPT-generated reviews, typically using surveys or evaluative assessments (Hajra, 2023; Jeong & Lee, 2024; Rosete et al., 2025). |
Mixed-Methods Approaches | 1 | This category includes studies that explicitly combine computational analysis with experimental or qualitative techniques, such as mixed-method experimental designs comparing managerial responses to ChatGPT-generated reviews (Tan et al., 2025). |
The thematic analysis of the selected studies, following Braun & Clarke (2006), resulted in the emergence of three overarching themes reflecting the current research focus in this area: (1) ChatGPT for review analytics, (2) ChatGPT for review modeling and evaluation, and (3) ChatGPT for review management. The formation of these overarching themes from the initial codes is visualized in Figure 4. These findings provide insight into how ChatGPT has been applied within OCR research and highlight potential avenues for future investigation.

ChatGPT for review analytics emerged as the dominant overarching theme in the selected studies (n = 11; 50%), indicating that prior research primarily positions ChatGPT as an analytical tool for extracting meaning from OCRs. As summarised in Appendix B, this theme comprises several analytically oriented sub-themes, reflecting the diverse applications of ChatGPT in review analysis across domains. The largest sub-theme involves aspect-based sentiment analysis (n = 4), which enhances the precision and interpretability of OCR analysis by capturing sentiment at the attribute level (Botunac et al., 2024; Falatouri et al., 2024). A substantial subset of studies focuses on sentiment analysis (n = 3), examining ChatGPT’s ability to identify emotional polarity and evaluative tone in OCRs and demonstrating accuracy comparable to established NLP approaches, particularly in large-scale textual datasets (Casciato & Mateen, 2024; Cheng et al., 2024; Mathebula et al., 2024). Rather than replacing traditional models, ChatGPT is commonly positioned as a complementary tool that enhances sentiment interpretation through more context-aware language processing.
Another stream of research applies content analysis (n = 2) and topic modeling (n = 2), in which ChatGPT is used to extract themes, patterns, and higher-level insights from unstructured OCR data. Prior studies indicate that ChatGPT can synthesise large volumes of textual reviews into coherent analytical outputs, supporting the identification of recurring issues, consumer expectations, and experiential dimensions embedded in OCRs (Hajra, 2023; Morini-Marrero et al., 2025). In this context, ChatGPT is employed either as a primary coding aid or as an interpretive layer that enhances traditional qualitative content analysis. Similarly, in topic modeling applications, ChatGPT is used to identify latent themes by summarising recurring patterns and discussion points expressed by consumers, thereby facilitating a more interpretable synthesis of review content (McCloskey et al., 2024; Ramos-Henriquez & Morini-Marrero, 2025). Taken together, these findings indicate that research on ChatGPT for review analytics is methodologically diverse, with applications spanning multiple analytical functions rather than a single homogeneous approach.
ChatGPT for review modeling and evaluation emerged as the second dominant overarching theme in the reviewed literature (n = 8; 36%), with the majority of studies (n = 5) focusing on fake review detection. This emphasis reflects growing concern over the authenticity and credibility of AI-generated online reviews. As detailed in Appendix B, this theme comprises four closely related sub-themes: ChatGPT-generated reviews for detection models, prediction models, consumer perception evaluation, and ChatGPT’s capability in generating synthetic review datasets. The most prominent sub-theme centres on the use of ChatGPT-generated reviews for detection models, with five studies employing these generated reviews to train or test models designed to distinguish AI-generated from human-authored content (Knoedler et al., 2024; Su et al., 2025; Xylogiannopoulos et al., 2024; Xylogiannopoulos et al., 2025; Zhao et al., 2025). These studies highlight the increasing difficulty users face in reliably identifying AI-generated reviews, underscoring the heightened risk of misleading or deceptive content.
Complementing these detection-focused investigations, other studies examine alternative applications of ChatGPT-generated reviews for modeling and evaluation purposes. One study incorporates these reviews into prediction models to assess technology acceptance outcomes (Baier et al., 2025), while another explores consumer perceptions of AI-generated reviews, demonstrating that suspected AI authorship negatively affects perceived usefulness and authenticity (Amos & Zhang, 2024). In addition, Rosete et al. (2025) evaluated ChatGPT’s capacity to generate synthetic review datasets by comparing AI-generated and human-written content across multiple datasets, identifying limitations related to linguistic diversity and repetitiveness. Collectively, these studies indicate that while ChatGPT-generated reviews offer methodological utility for model development and evaluation, they also raise substantive concerns regarding credibility, detection, and consumer trust within OCR ecosystems.
ChatGPT for review management emerged as a distinct but less prevalent overarching theme in the reviewed studies (n = 3; 14%). In line with Appendix B, this theme is represented by a single, clearly defined sub-theme—management response—which focuses on the use of ChatGPT to assist managers in replying to online customer reviews. Within this sub-theme, prior studies consistently examine ChatGPT’s ability to generate managerial responses that meet service recovery and communication standards. For example, Koc et al. (2023) compared human-authored management replies on TripAdvisor with responses generated by ChatGPT and found that AI-generated responses were more efficient and, in some cases, more effective in addressing customer concerns. Similarly, Wayne Litvin & Pei-Sze Tan (2024) examined consumer perceptions of human versus ChatGPT-generated management responses and reported that ChatGPT could convincingly replicate authentic managerial communication, suggesting its potential value for responding to reviews at scale.
Despite these advantages, the reviewed studies also identify important limitations. Tan et al. (2025) further explored ChatGPT’s role in online service recovery, highlighting both its strengths and constraints. In particular, they found that although customers often struggled to distinguish between human- and AI-generated responses, explicit disclosure of AI authorship led to lower satisfaction and reduced purchase intentions. This finding suggests that while ChatGPT can support efficiency and consistency in review management, maintaining perceived authenticity remains critical. Overall, the evidence indicates that ChatGPT functions most effectively as a supportive tool for review management rather than a full substitute for human managerial engagement.
4. Discussion
Prior studies have examined ChatGPT’s applications in OCRs; however, the literature remains fragmented, with no systematic synthesis of the current research landscape. To address this gap, this study conducted an SLR of ChatGPT research in the context of OCRs. The analysis identified three overarching themes: (1) ChatGPT for review analytics, (2) ChatGPT for review modeling and evaluation, and (3) ChatGPT for review management. Among these, review analytics emerged as the dominant theme, indicating that it is the primary focus of existing research on ChatGPT in OCRs.
Our findings indicate that current research on the application of ChatGPT in the OCR field predominantly focuses on analytical purposes, with particular emphasis on content and sentiment analysis. This finding provides a significant contribution to the literature, as existing methods for analyzing OCR data typically rely on traditional text-mining software such as Leximancer and ATLAS.ti. For example, Arasli et al. (2020) utilized Leximancer to analyze OCRs from cruise travelers in order to identify key service quality perceptions influencing value-for-money ratings. Similarly, Olorunsola et al. (2024) analyzed OCRs from eco-friendly hotels using Leximancer to identify key themes related to customer satisfaction. Further details on other analytical tools used for OCR data analysis, including ATLAS.ti, KH Coder, R software, and related tools, are extensively reviewed by Ishak et al. (2025). Although these tools perform effectively in processing textual data, they present notable limitations, most prominently the need for highly skilled personnel to operate the software and interpret the analyses (Engstrom et al., 2022). In contrast, AI tools such as ChatGPT are more user-friendly (Skjuve et al., 2023), thereby facilitating the analysis of OCR datasets and enabling marketers to obtain timely insights into consumers’ current perceptions of a brand or company. These findings suggest that ChatGPT has the potential to enhance OCR data analysis and support faster, more valuable insight generation.
However, existing studies report mixed findings regarding ChatGPT’s analytical accuracy, particularly in sentiment analysis tasks. While some research shows that ChatGPT’s classification performance can be competitive with, or even exceed, traditional NLP and machine learning approaches (Fatouros et al., 2023; Lossio-Ventura et al., 2024), other studies report greater variability and context-dependent weaknesses. For example, comparative evaluations indicate that although ChatGPT demonstrates competitive overall accuracy, it may lag traditional algorithms on certain datasets and task types (Arif & Aladdin, 2025). Similarly, sentiment classification studies suggest that while ChatGPT performs well overall, it struggles with specific categories, particularly neutral sentiment, reflecting challenges in interpreting ambiguous emotional expressions (Jonathan et al., 2025). Likewise, Rebora et al. (2023) found that although ChatGPT’s sentiment analysis performance is comparable to established automated tools at the aggregate level, it aligns less well with individual human judgments. Overall, these findings indicate that ChatGPT’s effectiveness in sentiment analysis is context-dependent and should be interpreted with caution in OCR research.
Our thematic analysis of the selected papers revealed growing scholarly attention to fake review detection within the second theme, reflecting increasing concern about the prevalence and impact of fake reviews on digital platforms. Prior research has provided substantial evidence of the severity of this issue; for example, He et al. (2022) show that fake reviews are purchased across a broad range of products on Amazon and significantly distort product ratings. This concern can be interpreted through a Signaling Theory lens, whereby OCRs function as market signals that convey cues about underlying product or service quality and reduce information asymmetry (Siering et al., 2018; Spence, 1978; Xu et al., 2024). Fake reviews, including AI-generated reviews, thus operate as deceptive signals that imitate credible experiential feedback without incurring genuine consumption-based costs, thereby undermining the reliability of online review ecosystems.
Building on this concern, a key finding within this theme is the growing number of studies that develop fake review detection models using ChatGPT-generated reviews, as evidenced by five reviewed studies (Knoedler et al., 2024; Su et al., 2025; Xylogiannopoulos et al., 2024; Xylogiannopoulos et al., 2025; Zhao et al., 2025). Earlier research highlighted the limited availability of high-quality datasets as a major challenge in developing reliable fake review detection models (Aslam et al., 2019; Wu et al., 2020). In contrast, our findings indicate that recent studies increasingly leverage ChatGPT-generated reviews to train and test detection models, signaling a clear methodological shift away from data scarcity toward AI-enabled data generation. Accordingly, this study contributes by systematically identifying this emerging shift in fake review research and clarifying ChatGPT’s evolving role as a data-generation tool in advancing fake review detection within the OCR literature.
In addition to its role in modeling fake review detection, ChatGPT also demonstrates constructive applications within the online review ecosystem, particularly in supporting managerial responses to customer feedback. Prior studies highlight its practical benefits in enhancing customer relationship management. For example, Wayne Litvin & Pei-Sze Tan (2024) showed that ChatGPT is effective and useful in improving management responses to online reviews in the hotel industry. Similarly, Koc et al. (2023) found that ChatGPT is both effective and efficient in summarizing and generating responses to customer complaints within the hospitality context. Our findings regarding ChatGPT’s usefulness for management responses are consistent with the conclusions of Simetgo et al. (2025), who systematically reviewed 40 articles and established the positive role of ChatGPT in customer support, particularly in providing 24/7 assistance, rapid responses, and consistent service quality.
From a Trust Theory perspective, these benefits primarily strengthen functional dimensions of trust, such as perceived ability and reliability, by signalling managerial competence and responsiveness (Mayer & Davis, 1999; Seok & Chiew, 2013). However, our review also reveals a more nuanced trust-related challenge. While AI-assisted responses may enhance operational efficiency, they may simultaneously weaken relational trust by raising concerns about authenticity and benevolence, particularly when customers become aware that responses are AI-generated. This concern is consistent with prior research on Trust Theory showing that user trust in AI-enabled systems is undermined when transparency, perceived benevolence, and socio-ethical expectations are weakened (Bach et al., 2024). Notably, our SLR identified evidence that prospective customers report lower purchase intentions when management responses are explicitly disclosed as AI-generated (Tan et al., 2025). Collectively, these findings suggest that organizations need to balance the operational advantages of ChatGPT with careful consideration of how AI-mediated interactions shape customer trust perceptions.
This SLR provides targeted practical insights for marketers, platform managers, and policymakers involved in the management and use of OCRs. For marketers and managers, the findings highlight ChatGPT’s strong potential as an analytical tool for extracting valuable insights from OCR datasets. Within the hospitality and tourism sector, ChatGPT can be deployed for near real-time review analysis on platforms such as TripAdvisor to monitor guest feedback related to service quality dimensions such as cleanliness, staff professionalism, accommodation comfort, food and beverage quality, and overall travel experience. By enabling rapid identification of emerging issues, dominant consumer sentiments, and key drivers of satisfaction and dissatisfaction, ChatGPT supports timelier and data-driven decision-making aimed at improving service quality and overall performance. Hotel managers, destination marketers, and tourism operators may leverage ChatGPT to support proactive service recovery and continuous experience enhancement throughout the customer journey, helping to sustain positive experiences while addressing areas requiring improvement.
For platform managers, however, the findings also raise important concerns regarding the increasing prevalence of fake reviews. The generation of false or AI-generated feedback poses a direct threat to the credibility of OCR platforms and may erode consumer trust if left unaddressed. In the e-commerce context, ChatGPT-generated reviews may be used constructively as training data to enhance fake review detection systems tailored to different product categories. As such, review platforms such as Amazon should invest in more robust detection mechanisms and consider integrating ChatGPT-powered review verification or screening tools as decision-support systems for human moderators, while actively collaborating with researchers and technology providers to identify and filter out deceptive content. Enhanced transparency mechanisms, such as review verification labels or disclosure indicators, may further help maintain platform integrity.
From a policy and regulatory perspective, the findings indicate a growing need for governance frameworks that address the use of generative AI in online reviews. Policymakers may consider developing and enforcing explicit AI-generated content disclosure standards that require review platforms to clearly label ChatGPT-generated or AI-assisted reviews. Coupled with accountability mechanisms for non-compliance, such standards would help protect consumers from misleading information and enhance transparency across industries, particularly in e-commerce. At the platform governance level, managers are encouraged to integrate ChatGPT-powered review verification and screening tools into existing moderation systems to automatically flag suspicious or AI-generated content, prioritise reviews for human evaluation, and monitor emerging manipulation patterns. The implementation of standardised disclosure indicators and AI-assisted moderation guidelines may further help balance technological innovation with consumer protection, while consumers are advised to cross-check information across multiple sources to make more informed purchasing and travel decisions.
This study has several limitations that should be considered when interpreting the findings. First, although the review employed two leading academic databases, Scopus and Web of Science, to enhance coverage and rigour, relevant studies indexed in other databases or emerging outlets may not have been captured. Second, the review was restricted to English-language publications, which may introduce language bias and limit representation of research conducted in non-English-speaking contexts. Third, the relatively narrow date range reflects the emerging nature of ChatGPT research but may constrain the generalisability of the findings as the field continues to develop rapidly. Fourth, while the thematic analysis followed established procedures, a degree of subjectivity in coding and theme development is inherent in qualitative synthesis. Finally, the review did not incorporate backward and forward citation tracking, which may have resulted in the omission of influential studies not identified through the database search strategy alone.
Beyond these methodological considerations, a substantive limitation lies in the absence of a detailed, practice-oriented framework to guide marketers in applying ChatGPT for online review analysis. This limitation is closely related to the exploratory nature of the present study, which aims to map and synthesise existing research rather than to develop prescriptive models. Although this systematic review identifies key themes and research patterns, it does not propose a structured or step-by-step framework that practitioners can readily implement.
Building on these limitations, future research could be advanced through a more structured agenda encompassing theoretical, methodological, and contextual dimensions. From a theoretical perspective, future studies could integrate established theories from marketing, information systems, and communication, such as trust, persuasion, and technology acceptance theories, to explain how AI-assisted and AI-generated reviews influence consumer judgment, credibility assessment, and decision-making. Such theoretical integration would help move the literature beyond descriptive applications toward deeper explanatory and theory-building contributions.
From a methodological perspective, future research could extend beyond the predominantly experimental and modeling-based approaches by employing mixed-method designs, longitudinal analyses, and qualitative inquiries. Combining computational techniques with interviews or surveys involving managers and consumers would allow for a more nuanced understanding of how ChatGPT is used, evaluated, and managed in real-world review environments.
From a contextual perspective, future studies could broaden the scope of investigation by examining a wider range of industries and cross-cultural settings. Most existing research remains concentrated in service-oriented and digitally mature contexts, limiting the generalisability of findings. Expanding empirical coverage to diverse cultural, regulatory, and industry environments would enhance understanding of how contextual factors shape perceptions, ethical concerns, and managerial adoption of ChatGPT in OCRs.
In addition, the increasing use of ChatGPT-generated content calls for more rigorous empirical investigations into its reliability and ethical implications. While this review focuses on OCRs, future research could extend to high-stakes domains such as healthcare, finance, journalism, and customer service, where misinformation and automation bias may have more serious consequences. Such efforts would contribute to the development of stronger governance mechanisms, improved detection techniques, and the responsible integration of generative AI across digital platforms.
5. Conclusion
This study systematically reviewed recent literature on the application of ChatGPT within the context of OCRs to identify key themes, current research focus, and directions for future inquiry. Guided by the PRISMA 2020 protocol, three dominant themes emerged: ChatGPT for review analytics, ChatGPT for review modeling and evaluation, and ChatGPT for review management. Across the reviewed studies, research is largely situated within marketing and information systems disciplines, predominantly examines service-based industries, and relies mainly on experimental, modeling, and computational text analysis approaches. The findings indicate that existing research primarily investigates the use of ChatGPT as an analytical tool for extracting insights from consumer feedback, particularly in sentiment analysis and pattern identification. In parallel, a growing body of work examines ChatGPT-generated reviews for modeling, validation, and authenticity assessment, reflecting increasing concerns regarding the trustworthiness and ethical implications of AI-generated content. A smaller but notable stream of studies focuses on review management, highlighting ChatGPT’s role in supporting managerial responses and customer communication. Overall, this review suggests that research on ChatGPT and OCRs remains at an early stage of development. The emerging emphasis on review analytics, synthetic review evaluation, fake review detection, and responsible AI use presents promising opportunities for future research. Future studies could expand this field by incorporating diverse industry contexts, adopting cross-cultural perspectives, and employing mixed-method approaches to deepen understanding of ChatGPT’s role in digital marketing and online review ecosystems.
Conceptualization, M.I.I.; methodology, M.I.I. and N.A.; validation, M.I.I., A.K.M., and N.A.; formal analysis, M.I.I. and A.K.M.; investigation, M.I.I.; data curation, M.I.I.; writing—original draft preparation, M.I.I.; writing—review and editing, M.I.I. and N.A.; supervision, M.I.I.; project administration, M.I.I.; funding acquisition, M.I.I. and A.K.M. All authors have read and agreed to the published version of the manuscript.
The data used to support the research findings are available from the corresponding author upon request.
The authors would like to express their sincere appreciation to Universiti Malaysia Perlis (UniMAP), particularly the Research Management Centre (RMC), through the UniMAP Publication Fund, for their encouragement in completing this research.
The authors declare no conflict of interest.
Artificial intelligence (AI)-based tools were used solely to improve the clarity, grammar, and readability of this manuscript. The use of AI did not influence the study’s conceptualization, data analysis, interpretation of results, or conclusions. All intellectual contributions, ideas, and content remain the sole responsibility of the authors.
Appendix A. Summary of core information for the 22 studies included in this review
No. | Authors | Aim | Methodology | Key Findings | Data |
1 | Wayne Litvin & Pei-Sze Tan (2024) | This study aims to assess hotels’ use of ChatGPT for responding to TripAdvisor reviews. | A quantitative research design involving a perception experiment using a modified Turing test. | ChatGPT effectively mimics authentic responses written by hotel managers. | Hotel TripAdvisor reviews |
2 | Cheng et al. (2024) | This study aims to propose a user-friendly framework for mining tourism reviews with high sentiment analysis accuracy. | A qualitative research design involving computational sentiment and topic analysis. | ChatGPT outperforms traditional models and matches BERT in sentiment analysis. | Tourism reviews of China’s Five Sacred Mountains |
3 | Amos & Zhang (2024) | This study aims to examine consumer reactions to reviews perceived as ChatGPT-generated versus human-written. | A quantitative research design involving controlled comparative experiments on source perception. | Consumers rate reviews as less useful, trustworthy, and authentic when they perceive them to be generated by ChatGPT. | Hotel reviews on TripAdvisor and restaurant reviews on Yelp |
4 | Knoedler et al. (2024) | This study aims to compare real and ChatGPT-generated reviews of top US plastic surgeries and evaluate human and AI detection. | A quantitative research design involving a human–AI comparative detection experiment with computational text analysis. | ChatGPT can mimic real patient reviews, with humans identifying them correctly only 59.6% of the time, highlighting the risk of fake reviews. | Plastic surgery patient reviews |
5 | Baier et al. (2025) | This study aims to develop transfer models predicting technology acceptance and its key factors using online customer reviews. | A quantitative research design involving predictive modeling using secondary OCR data. | OCRs, combined with transfer models, AI, and machine learning, can predict technology acceptance over time and may replace traditional surveys. | Google Play Store reviews |
6 | Morini-Marrero et al. (2025) | This study aims to explore using ChatGPT to analyze hotel reviews and compare its rating accuracy with humans and classic machine learning methods. | A quantitative research design involving a structured comparative (quasi-)experiment | ChatGPT gives more moderate ratings than humans and can help analyze feedback in hospitality. | TripAdvisor reviews of five-star hotels |
7 | Li et al. (2025) | This study aims to propose using ChatGPT to analyze patient reviews for improved healthcare insights and resource allocation. | A qualitative research design involving aspect-based sentiment analysis and prompt-engineered ChatGPT | Using ChatGPT with aspect-based sentiment analysis templates achieved high precision, helping providers understand patient needs. | Patient reviews on Haodf.com |
8 | Su et al. (2025) | This study aims to develop a multi-level method combining AI review detection with product attribute analysis. | A quantitative research design involving supervised machine learning model | The method detects AI-generated fake reviews, analyzes user preferences, and outperforms other approaches. | Amazon reviews of sweeping robots |
9 | Tan et al. (2025) | This study aims to assess ChatGPT’s effectiveness in online service recovery by comparing its managerial responses to human-written ones for hotel reviews. | A mixed-method experimental design | Customers could not reliably distinguish ChatGPT-generated from human managerial responses, but knowing a reply was from ChatGPT lowered ratings due to perceived inauthenticity and uncanniness. | Responses to customer reviews in tourism and hospitality |
10 | Xylogiannopoulos et al. (2025) | To propose and validate a pattern-based method for detecting AI-generated (paraphrased) reviews. | Pattern-based detection evaluated on TripAdvisor reviews and ChatGPT-4.0 paraphrases. | The results show that the proposed method outperforms existing AI text detection approaches in identifying AI-assisted fake reviews. | TripAdvisor reviews |
11 | Zhao et al. (2025) | This study aims to examine linguistic differences between AI, human, and real reviews to improve detection methods. | Quantitative approach involving large-scale text analysis and statistical testing | AI-generated fake reviews are easier to read but less specific, more exaggerated, and more mechanical than human or real reviews, highlighting the need for AI-specific detection methods. | Yelp dataset reviews |
12 | Mathebula et al. (2024) | This study aims to develop a more accurate sentiment analysis model for financial reviews using advanced NLP techniques. | A quantitative research design involving computational machine learning experiment | ChatGPT with BERT and BiLSTM outperformed lexicon methods, improving sentiment analysis for financial decisions. | HelloPeter reviews |
13 | McCloskey et al. (2024) | This study aims to explore using NLP and LLMs to help small businesses analyze strengths and weaknesses from limited customer reviews. | Quantitative content analysis combined with topic modeling and large language model analysis | Combining NLP methods like topic modeling and ChatGPT yields clear, actionable insights for small businesses. | Altomonte’s Italian Market reviews from Google, TripAdvisor, and Yelp |
14 | Hajra (2023) | This study aims to examine the emotional and spiritual impact of sacred elephant interactions in Asia and identify factors affecting visitor satisfaction. | A qualitative research design involving thematic analysis | Elephant interactions boost visitor satisfaction, and ChatGPT can analyze reviews to enrich tourism aesthetics research. | TripAdvisor reviews of Sacred Elephant |
15 | Casciato & Mateen (2024) | This study aims to analyze sentiment in foot and ankle surgeon reviews by sex and demographics. | A quantitative research design involving secondary data analysis | Foot and ankle surgeons get mostly positive reviews; males rate higher, and ratings dip with experience, highlighting the need to monitor reputation. | Healthgrades website reviews |
16 | Xylogiannopoulos et al. (2024) | This study aims to examine AI-paraphrased reviews from ChatGPT 4.0 and show how text similarity aids in detecting fake reviews. | Quantitative approach involving comparative computational text similarity experiment | AI-paraphrased reviews differ from human ones but remain similar to each other, making text similarity a useful tool for detecting AI-generated fake reviews. | Reviews of 20 hotels worldwide |
17 | Botunac et al. (2024) | This study aims to compare fine-tuned Transformers with LLMs like ChatGPT and GPT-4 for classifying and analyzing hotel reviews. | A quantitative research design involving comparative NLP modeling experiment | RoBERTa is faster, while ChatGPT and GPT-4 capture sentiment better but need more resources. | Restaurant reviews from SemEval-2014 |
18 | Falatouri et al. (2024) | This study aims to assess the efficiency of LLMs, including ChatGPT, for sentiment analysis and service quality (SQ) dimension extraction from online reviews. | The study compares two LLMs (ChatGPT-3.5 and Claude-3) with three traditional NLP techniques using English and Persian customer review datasets. | The findings show that LLMs outperform traditional NLP methods, achieving higher accuracy and stronger agreement with human raters, particularly ChatGPT. | Mobile app reviews from Google Play Store and Cafebazaar.ir |
19 | Jeong & Lee (2024) | This study aims to use ChatGPT for aspect-based analysis of hotel reviews to identify detailed service failures. | A qualitative research design involving aspect-based content analysis | ChatGPT effectively summarizes hotel complaints, extracts key terms, and outperforms traditional methods. | TripAdvisor hotel reviews dataset |
20 | Koc et al. (2023) | This study aims to explore using ChatGPT-4 to generate TripAdvisor responses and evaluate their effectiveness in service recovery. | A quantitative research design involving controlled expert-rating experiment | ChatGPT-4’s responses are high-quality, fast, and meet service recovery standards, while also assessing service failure severity and offering insights for recovery research. | TripAdvisor customer complaints and management responses |
21 | Ramos-Henriquez & Morini-Marrero (2025) | This study aims to examine remote workers’ Airbnb experiences and compare cognitive outcomes between long- and short-term stays. | A quantitative research design involving structural topic modeling | Remote workers’ Airbnb stays are mainly emotional, influenced by stay length and city, with high satisfaction and return intent; hosts should tailor amenities accordingly. | InsideAirbnb reviews for Lisbon and Austin |
22 | Rosete et al. (2025) | This study aims to test ChatGPT’s ability to generate synthetic comments that mimic human vocabulary. | A quantitative research design involving comparative computational vocabulary analysis | ChatGPT’s comments are less diverse and partly repetitive but can still support tasks like word clouds. | Reviews from four Kaggle datasets |
Appendix B. Thematic coding hierarchy from initial codes to overarching themes
Theme | Theme Description | Sub-theme | Sample Quote | Initial Code | Representative Studies |
ChatGPT for Review Analytics | This theme captures studies using ChatGPT as an analytical tool to extract meaning from reviews. | Aspect-based sentiment analysis | “Using ChatGPT, we automatically identified aspects and sentiments within the reviews, specifically focusing on negative sentiments, with the aid of a pre-trained BERT model.” (Jeong & Lee, 2024). | ChatGPT for aspect-based sentiment analysis in reviews. | Botunac et al. (2024); Falatouri et al. (2024); Jeong & Lee (2024); Li et al. (2025) |
 | | Sentiment analysis | “ChatGPT was used to perform a sentiment analysis to describe the positivity, negativity, and neutrality of online physician reviews.” (Casciato & Mateen, 2024). | ChatGPT for sentiment analysis in reviews. | Cheng et al. (2024); Mathebula et al. (2024); Casciato & Mateen (2024) |
 | | Content analysis | “This study aims to explore the application of ChatGPT to analyze hotel guest satisfaction from online reviews.” (Morini-Marrero et al., 2025). | ChatGPT for content analysis for useful insights. | Morini-Marrero et al. (2025); Hajra (2023) |
 | | Topic modeling | “Do large language models (LLM) such as Chat Generative Pre-Trained Transformer (ChatGPT) provide equal or superior results to topic modeling without requiring the technical skills needed to implement topic modeling?” (McCloskey et al., 2024). | ChatGPT for topic modeling in reviews. | Ramos-Henriquez & Morini-Marrero (2025); McCloskey et al. (2024) |
ChatGPT for Review Modeling and Evaluation | This theme encompasses studies that use ChatGPT-generated reviews for review modeling and evaluation in the OCR context. | Detection model | “ChatGPT3.5 (also known as ChatGPT in its first version) and ERNIE Bot, developed by Baidu, Inc., are used to generate fake reviews.” (Su et al., 2025). | ChatGPT-generated reviews (fake reviews) for detection model development. | Su et al. (2025); Zhao et al. (2025); Xylogiannopoulos et al. (2024); Xylogiannopoulos et al. (2025); Knoedler et al. (2024) |
 | | Prediction model | “Based on these findings, we used the chatbot OpenAI ChatGPT 4o | ChatGPT-generated reviews used in prediction models for technology acceptance. | Baier et al. (2025) |
 | | Consumer perception | “This research provides a much-needed investigation into consumer reactions to reviews perceived to be generated by ChatGPT vs. a human.” (Amos & Zhang, 2024). | ChatGPT-generated reviews for examining consumer reactions and perceptions. | Amos & Zhang (2024) |
 | | Synthetic review dataset | “This paper examines the ability of ChatGPT to generate synthetic comment datasets that mimic those produced by humans.” (Rosete et al., 2025). | ChatGPT-generated reviews to test its ability to generate a synthetic review dataset. | Rosete et al. (2025) |
ChatGPT for Review Management | This theme reflects ChatGPT’s role in managerial interaction and response to online reviews. | Management response | “This managerial perspective considers the use of ChatGPT by hotels as a tool for replying to their property’s online consumer-generated media postings.” (Wayne Litvin & Pei-Sze Tan, 2024). | ChatGPT for replying to customer reviews by management. | Wayne Litvin & Pei-Sze Tan (2024); Tan et al. (2025); Koc et al. (2023) |
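As a cross-check on the theme shares reported in the findings, the number of studies per sub-theme can be tallied from the Representative Studies column above. The short Python sketch below is illustrative only and is not part of the review's method. The dictionary layout and the function name `theme_shares` are assumptions introduced here, and percentages are rounded to the nearest whole number.

```python
# Study counts per sub-theme, read off the "Representative Studies"
# column of Appendix B (one count per study listed in each row).
subtheme_counts = {
    "ChatGPT for review analytics": {
        "Aspect-based sentiment analysis": 4,
        "Sentiment analysis": 3,
        "Content analysis": 2,
        "Topic modeling": 2,
    },
    "ChatGPT for review modeling and evaluation": {
        "Detection model": 5,
        "Prediction model": 1,
        "Consumer perception": 1,
        "Synthetic review dataset": 1,
    },
    "ChatGPT for review management": {
        "Management response": 3,
    },
}


def theme_shares(counts):
    """Return {theme: (n_studies, rounded_percent_of_all_studies)}."""
    # Sum the sub-theme counts to get one total per overarching theme.
    totals = {theme: sum(subs.values()) for theme, subs in counts.items()}
    # The grand total is the number of studies across all themes.
    n_all = sum(totals.values())
    return {theme: (n, round(100 * n / n_all)) for theme, n in totals.items()}


shares = theme_shares(subtheme_counts)
for theme, (n, pct) in shares.items():
    print(f"{theme}: n = {n} ({pct}%)")
```

Under these counts, the tally gives 11 of 22 studies (50%) for review analytics, 8 (36%) for review modeling and evaluation, and 3 (14%) for review management, which matches the shares reported in the text for the second and third themes.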
