The goal of this study is to propose a method for transforming an ontology into a hidden Markov model (HMM). Ontology classes are taken as HMM states and ontology properties (relationships between classes) as HMM symbols. Ontologies, a central element of the Semantic Web, are used to represent knowledge in many different fields. Several authors have employed machine learning techniques such as HMMs to add knowledge to ontologies or to extract knowledge from them. However, the semantics carried by the ontology is not taken into account in these tasks. To handle this semantics, this paper uses the ontology triples extracted with SPARQL queries to transform the ontology into an HMM. The method targets lightweight ontologies and has been implemented on the Pizza ontology.
Ontology is a term that comes from philosophy, more specifically from the branch of metaphysics. The science of artificial intelligence then adopted ontology as a knowledge representation model for assisting reasoning in knowledge-based systems relatively early on. Ontologies are now at the heart of the architecture of the semantic web as a support for resource annotations (documents, images, videos, etc.) and facilitating communication between users and applications as well as between the apps themselves. "An ontology is an explicit specification of a conceptualization," claims Gruber [1].
The term "conceptualization" refers to an abstract representation of a certain reality phenomenon that enables the identification of the pertinent concepts for this occurrence. When a concept is described as "explicit," it means that it is clear and precise. Ontology use in applications can sometimes inspire authors to develop methods for discovering ontology properties. These methods were created using machine learning. According to Arthur Samuel, the scientific subject of machine learning enables computers to learn without explicit programming. It gives machines the ability to manage data using various statistical models and techniques [2].
Machine learning algorithms include supervised learning (decision trees, naive Bayes, support vector machines, and neural networks), unsupervised learning (k-means and hidden Markov models), and semi-supervised learning (generative models and self-training). Some authors linked these algorithms to ontologies in order to accomplish a variety of tasks. Bayesian networks were combined with ontologies for mapping, translation, and classification [3], [4], [5], [6], [7]. For prediction, several authors [8], [9] coupled decision trees and ontologies. For prediction, reasoning, mapping, and classification, several authors employed ontologies and neural networks [10], [11], [12], [13] or ontologies with support vector machines [14]. The hidden Markov model (HMM), as presented by Rabiner and Juang [15], is employed for classification, prediction, comparison, and speech and pattern recognition [16], [17], [18], [19], [20].
Since HMMs preserve the semantics between elements, they are the technique most frequently employed when phenomena are sequentially and semantically linked. Because ontologies represent knowledge, the main goal is to maintain semantics when managing their concepts. The HMM is therefore a suitable model for learning ontology concepts and all of their attributes. This idea is supported by two groups of works: (1) using HMMs to populate ontologies [21], [22], [23], [24], [25]; and (2) combining HMMs and ontologies to develop systems [26], [27], [28], [29], [30], [31], [32].
The drawback of these approaches is that, although some authors used ontology concepts as symbols for HMMs, no clear relationship between ontology and HMM was established. Warda et al. [33] suggested using HMMs to create modular ontologies. Learning ontology properties will therefore be useful for some applications.
As a result, the goal of this study is to propose a method for turning an ontology into an HMM. Ontology classes are taken as HMM states and ontology properties (relationships between classes) as HMM symbols. These classes and properties come from the collection of triples produced by SPARQL queries on the target ontology. Ontology axioms are not handled by these queries, so only lightweight ontologies are handled. A heavy ontology can still be processed by removing the triples that contain axioms. Equations are provided to implement this concept. One distinctive feature is the ability to convert several ontologies into a single HMM. For instance, a single HMM can represent a number of ontologies depending on the domain. The Pizza ontology is used to illustrate this method.
The rest of the paper is organized as follows: ontologies and HMMs are briefly defined in Section 2 along with the current state of the art in this area. The suggested approach is discussed in Section 3, experimental findings and discussion are covered in Section 4, and the conclusion and potential future directions are covered in Section 5.
The content of this section derives from the study of Iloga et al. [16]. Formally, an HMM $\lambda=\{N, M, A, B, \pi\}$ is defined by:
(1) $N$ : its number of states. The set of states is noted $S=\left\{S_1, S_2 \ldots S_N\right\}$. Generally, at time $t$, the state is noted $q_t \in S$.
(2) $M$ : its number of observation symbols. The set of observation symbols is noted $V=\left\{v_1, v_2 \ldots v_M\right\}$. Generally, at time $t$, the symbol observed by the model is noted $O_t \in V$.
(3) $A=\left[a_{i j}\right]$ : its state transition probabilities distributions where $a_{i j}=P\left(q_{t+1}=S_j \mid q_t=S_i\right), 1 \leq i, j \leq N$.
(4) $B=\left[b_j(k)\right]$ : its observation symbol probability distributions where $b_j(k)=P\left(v_k \text{ at time } t \mid q_t=S_j\right)$ in each state $S_j, 1 \leq j \leq N$ and $1 \leq k \leq M$.
(5) $\pi=\left[\pi_i\right]$ : its initial state probabilities distributions where $\pi_i=P\left(q_1=S_i\right), 1 \leq i \leq N$.
A sequence of $T$ observation symbols $O=\left(O_1, O_2, \ldots, O_T\right)$ can be generated by an HMM $\lambda=\{N, M, A, B, \pi\}$ as shown in Figure 1. This representation is called the generated Markov chain.
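The generation process defined by $(A, B, \pi)$ can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `generate_sequence`, the two-state toy parameters and the fixed seed are assumptions introduced here, not part of the original study.

```python
import random

def generate_sequence(A, B, pi, T, seed=0):
    """Sample T observation symbols from an HMM (A, B, pi).

    States and symbols are 0-based indices (an illustrative convention).
    """
    rng = random.Random(seed)
    states = list(range(len(pi)))
    symbols = list(range(len(B[0])))
    # Draw the initial state q_1 from pi.
    q = rng.choices(states, weights=pi)[0]
    observations = []
    for _ in range(T):
        # Emit a symbol from the current state using row q of B.
        observations.append(rng.choices(symbols, weights=B[q])[0])
        # Move to the next state using row q of A.
        q = rng.choices(states, weights=A[q])[0]
    return observations

# Hypothetical two-state, two-symbol model used only for illustration.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[1.0, 0.0], [0.0, 1.0]]
pi = [0.5, 0.5]
obs = generate_sequence(A, B, pi, T=10)
```

Each row of `A` and `B` is used directly as a weight vector, which mirrors the definitions of $a_{ij}$ and $b_j(k)$ above.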
An ontology is a set of concepts joined by relationships and constrained by some functions, more specifically by axioms. It is used to refer to a body of knowledge describing some domain, typically a common-sense knowledge domain, using a representation vocabulary. Given a target domain, its ontology forms the heart of any system of knowledge representation for that domain [34]. The components of an ontology are concepts, relations, instances and axioms. Concepts represent sets of entities within the domain. Relations specify the interactions among concepts. Instances are the concrete examples of concepts within the domain, and axioms denote statements that are always true [35]. Depending on these components, ontologies are either heavy or lightweight: heavy ontologies handle axioms, whereas lightweight ontologies do not. Lightweight ontologies give a hierarchical order of classes and can specify the domain and range of some properties (principally owl:ObjectProperty). If its axioms are ignored, a heavy ontology becomes lightweight. For a given ontology, each class or relation is identified by its IRI (Internationalized Resource Identifier). Many tools help to build ontologies, a commonly used one being Protégé 2000. Ontologies can be stored in several formalisms: RDF (Resource Description Framework), RDFS (RDF Schema), XML (Extensible Markup Language) and OWL (Web Ontology Language). To query an ontology in Protégé 2000, SPARQL can be used. SPARQL is a recursive acronym that stands for SPARQL Protocol and RDF Query Language.
The approaches proposed so far fall into two categories. On the one hand, HMMs are used to build or populate ontologies; on the other hand, ontologies are combined with HMMs to build systems. Several works follow these ideas. Valarakos et al. [21] proposed a methodology for enriching multilingual domain ontologies, principally the CROSSMARC ontologies, using machine learning based on HMMs. Their approach consists of adding instances of ontology concepts using machine learning techniques. An HMM is trained on the ontology instances and then applied to web pages with the Viterbi algorithm to recognize matches and hence locate new ontology instances. Packer and Embley [23] followed this approach to propose ListReader, an HMM-based approach to populate ontologies. The HMM is derived from a collection of OCRed page images. The trained HMM then labels the text using the Viterbi algorithm, and the labeled text is transformed into predicates for the ontology. For Monika and Raju [25], an ontology can be obtained in another manner: they proposed an effective model integration algorithm based on HMMs to build an ontology. HMMs are used to capture knowledge from datasets before the process is initialized, and each ontology concept derived from this approach respects the initial prediction.
Bratus et al. [26] proposed an approach which combines HMM and CRF models to extract data using ontology-guided search. They first identified and extracted part names from unstructured data, and then developed TCBR (Textual Case-Based Reasoning) systems for service technicians and engineers. With a similar goal, Azanzi and Camara [24] proposed an approach for knowledge extraction from source code based on HMMs. It was applied to EPICAM, a tuberculosis surveillance system. The knowledge source is Java code, and the HMM is trained to identify the Java code concepts to be extracted. To classify genes, Mi et al. [27] proposed PANTHER (protein annotation through evolutionary relationship) by integrating statistical tools (HMMs). They used HMMs to capture the evolutionary relationships of gene families and subfamilies, and they used an ontology (GO) to annotate them. The idea of classification also guided Prestat et al. [22] to propose FOAM (Functional Ontology Assignments for Metagenomes). This ontology is a database of HMMs used for classification. The HMMs are obtained by fetching profiles of KEGG orthologs (KOs). Pipitone and Pirrone [28] proposed an approach to automatically generate ERDs (Entity Relationship Diagrams) from an OWL ontology based on HMMs. To construct the HMM from OWL/ERD, they took ERD elements as hidden states. They defined a grammar to determine the transition probabilities, and the observation probabilities were derived from OWL/ERD mapping rules.
Rani et al. [29] proposed OPAESFH, an ontology-based approach for personalizing an e-learning system using Fuzzy Petri Nets (FPN) and HMMs. This system used the metadata of SwetoDblp, an ontology of computer science bibliography data. The courses and exercises of the e-learning system are modeled with FPN. HMMs are used to update the FPN parameters and to recommend the level of the learner. Karmegam [30] proposed an HMM- and ontology-based cross-lingual question answering system for the agricultural domain. The ontology is used to map knowledge components, and HMMs are used to identify the most suitable resource queried by the user based on the semantic relations among resources. More recently, to recognize group activities from image data, Elangovan [31] proposed an approach in which groups of human activities are considered as ontologies. These ontologies are then used as sequences. These sequences (considered as symbols) are used to train an HMM and to determine the probability of a sequence. In the IoT (Internet of Things) domain, Muthukumar et al. [32] built a semantic-based security platform to detect malicious attack data. Ontologies (Semantic Sensor Network Ontology and Temporal Ontology) are used to represent sensor data. An HMM is used to identify anomalies in the clustered data, which serve as the observations of the HMM. These works are summarized in Table 1.
Authors | Ref. | Year | Main idea |
Valarakos et al. | [21] | 2003 | Populate ontology using HMM |
Bratus et al. | [26] | 2011 | Use HMM and CRF models for data extraction guided by ontology search |
Mi et al. | [27] | 2013 | Use ontology to annotate concepts and HMM to capture their relationships for classification of genes |
Prestat et al. | [22] | 2014 | Collect HMMs to build an ontology that can be used for classification |
Pipitone and Pirrone | [28] | 2014 | Generate ERD from OWL ontology using HMM |
Packer and Embley | [23] | 2015 | Populate ontology using HMM |
Azanzi and Camara | [24] | 2017 | Extract knowledge from ontology (Java source) using HMM |
Rani et al. | [29] | 2017 | Use ontology to personalize system of E-learning based on FPN and HMM |
Karmegam | [30] | 2019 | Combine HMM and ontology to build answering system for agricultural domain |
Monika and Raju | [25] | 2019 | Build ontology using algorithms based on HMMs |
Elangovan | [31] | 2021 | Train HMM based on ontologies to compute sequence evaluation |
Muthukumar et al. | [32] | 2021 | Use HMM to identify anomalies from data (observations) derived from ontologies |
The approaches mentioned above focused on populating ontologies using HMMs, training HMMs based on ontology concepts, and combining ontologies and HMMs to build a target system. None of them established an explicit correspondence between ontology and HMM. This limit guided us to propose the present approach.
This paper aims to propose an approach to transform an OWL ontology into an HMM, in other words to learn knowledge from an ontology using an HMM. To achieve this goal, ontology classes are considered as hidden states of the HMM and ontology properties are considered as symbols of the model. Therefore, the number of ontology classes is equal to the number of HMM hidden states and the number of ontology properties is equal to the number of HMM symbols. For the different concepts of the ontology (classes and properties), either their associated IRI or only their short name can be used, depending on the user's choice. The methodology of this approach is detailed in Section 3.2.
Figure 2 describes steps to transform ontology into HMM. Its input is an ontology and its output is a HMM.
An ontology triple is a (subject, predicate, object) set, as represented in Figure 3. In this configuration, subject and object are classes; the object can also be an axiom, depending on the type of ontology. In this work, axioms are not considered since only lightweight ontologies are handled. The predicate denotes the relationship between subject and object.
Step 1 consists of using the ontology triples in place of the whole ontology. The set of ontology triples can be obtained by querying the ontology with SPARQL. Here, the user can define the extraction condition. For instance, to get all ontology triples, the following query can be used:
SELECT ?subject ?predicate ?object
WHERE {?subject ?predicate ?object . }
This query selects all triples where ?predicate is the relationship between ?subject and ?object. To be more precise, a set of values can be imposed on ?predicate, such as owl:ObjectProperty, rdf:type, rdfs:subClassOf, etc. Hence, the previous query can be modified as in the following queries:
SELECT ?subject ?predicate ?object
WHERE {?subject ?predicate ?object .
?predicate a owl:ObjectProperty . }
And
SELECT ?subject ?object
WHERE {?subject rdfs:subClassOf ?object . }
All these triples can be stored in .txt file for simple exploitation.
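Step 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it relies on the third-party rdflib library (assumed to be installed), on an ontology serialized as RDF/XML, and on helper names (`short_name`, `extract_subclass_triples`, `save_triples`) and a tab-separated .txt layout that are introduced here for illustration only.

```python
def short_name(iri):
    """Return the fragment after '#', or else the last path segment."""
    iri = str(iri)
    if "#" in iri:
        return iri.rsplit("#", 1)[1]
    return iri.rstrip("/").rsplit("/", 1)[-1]

# The subClassOf query from the text, with the prefix declared explicitly.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?subject ?object
WHERE { ?subject rdfs:subClassOf ?object . }
"""

def extract_subclass_triples(path):
    """Run QUERY on an RDF/XML file and return short-name triples."""
    from rdflib import Graph  # third-party dependency, imported lazily
    graph = Graph()
    graph.parse(path, format="xml")
    # Objects that are anonymous restrictions (axioms) would show up as
    # blank nodes here; filtering them out is left to the caller.
    return [(short_name(s), "subClassOf", short_name(o))
            for s, o in graph.query(QUERY)]

def save_triples(triples, out_path):
    """Store one tab-separated triple per line in a .txt file."""
    with open(out_path, "w", encoding="utf-8") as handle:
        for s, p, o in triples:
            handle.write(f"{s}\t{p}\t{o}\n")
```

A call such as `save_triples(extract_subclass_triples("pizza.owl"), "triples.txt")` would then produce the text file used by the following steps; both file names are placeholders.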
In this step, triples are labeled with integers. Since all ontology resources and relations are identified by IRI (which are string type), it is hard to manage them. Therefore, all concepts are replaced by integers. So, ontology triples are transformed into set of integers where each number refers to corresponding class or property. For example, if we consider this set of triples:
T = {(mango, subClassOf, fruit), (banana, subClassOf, fruit), (human, eat, fruit), (human, plant, vegetable)}
In this example, we assume that each expression is the short name of class or property (not its IRI), then each class and property are transformed as described in Table 2 and Table 3.
Classes | Labels |
mango | 1 |
fruit | 2 |
banana | 3 |
human | 4 |
vegetable | 5 |
Properties | Labels |
subClassOf | 1 |
eat | 2 |
plant | 3 |
Using Table 2 and Table 3, the labeled set of triples is:
$T_{Labelled}=\{(1,1,2), (3,1,2), (4,2,2), (4,3,5)\}$
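The labelling step can be sketched as follows. The sketch assumes that labels are assigned as integers starting from 1 in order of first appearance (subject before object within a triple); this ordering and the function name `label_triples` are assumptions made for illustration.

```python
def label_triples(triples):
    """Replace class and property names by integer labels (starting at 1)."""
    class_labels, property_labels = {}, {}
    labelled = []
    for s, p, o in triples:
        # Classes are numbered in order of first appearance.
        for c in (s, o):
            class_labels.setdefault(c, len(class_labels) + 1)
        # Properties are numbered the same way, independently of classes.
        property_labels.setdefault(p, len(property_labels) + 1)
        labelled.append((class_labels[s], property_labels[p], class_labels[o]))
    return class_labels, property_labels, labelled

T = [("mango", "subClassOf", "fruit"),
     ("banana", "subClassOf", "fruit"),
     ("human", "eat", "fruit"),
     ("human", "plant", "vegetable")]
class_labels, property_labels, T_labelled = label_triples(T)
print(T_labelled)  # [(1, 1, 2), (3, 1, 2), (4, 2, 2), (4, 3, 5)]
```

Keeping the two dictionaries makes it possible to map states and symbols of the resulting HMM back to class and property names.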
Let $\lambda=(N, M, A, B, \pi)$ be an HMM. Constructing $\lambda$ amounts to computing the components of matrices $A$ and $B$ and of vector $\pi$.
For matrix $A$, the numbers of rows and columns correspond to the number of hidden states of the model. Each component $a_{i j}$ of $A$ is the probability of moving from state $i$ to state $j$, which means that there exists a relation going from the class labelled $i$ to the class labelled $j$. Eq. (1) is used to compute $a_{i j}$.
For matrix $B$, the number of rows corresponds to the number of hidden states and the number of columns corresponds to the number of symbols of the model. Each component $b_j(k)$ of $B$ is the probability of observing symbol $k$ in state $j$, which means that there exists a property labelled $k$ starting from the class labelled $j$. Each $b_j(k)$ is computed using Eq. (2).
For vector $\pi$, the number of elements corresponds to the number of hidden states of the model. Each element $\pi_i$ of $\pi$ is the probability that state $i$ is the initial state of the model, which means that the ontology can be browsed starting from the class labelled $i$. Eq. (3) is used to compute each $\pi_i$.
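A form of Eqns. (1)-(3) that is consistent with the descriptions above and with the worked example of this section can be written as follows. This is a reconstruction, not a quotation: the counting symbols $n_{ij}$, $n_i$, $m_{jk}$ and $m_k$ are notation introduced here. For the set of labelled triples, $n_{ij}$ is the number of triples with subject $i$ and object $j$, $n_i$ the number of triples with subject $i$, $m_{jk}$ the number of triples with subject $j$ and predicate $k$, and $m_k$ the number of triples with predicate $k$.

$$a_{i j}=\frac{n_{i j}}{n_i+\varepsilon}, \quad 1 \leq i, j \leq N \qquad (1)$$

$$b_j(k)=\frac{m_{j k}}{m_k+\varepsilon}, \quad 1 \leq j \leq N, \ 1 \leq k \leq M \qquad (2)$$

$$\pi_i=\frac{n_i}{N+\varepsilon}, \quad 1 \leq i \leq N \qquad (3)$$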
In Eqns. (1)-(3), $\varepsilon$ is a positive real number. It has been added to avoid division by zero and to ensure probability distributions for the model components. To obtain the model, the values of $A$, $B$ and $\pi$ must be readjusted. The difference between 1 and the sum of the elements of each row of $A$ and $B$ is redistributed equally among all elements of that row, and the same technique is applied to the components of $\pi$. These readjustments follow Eqns. (4)-(6) below:
For matrix $A$:
In Eq. (4), the term $\Delta_i^a$ is the difference between 1 and the sum of the elements of row $i$ of matrix $A$. For matrix $B$:
In Eq. (5), the term $\Delta_j^b$ is the difference between 1 and the sum of the elements of row $j$ of matrix $B$. For vector $\pi$:
In Eq. (6), the term $\Delta^\pi$ represents the difference between 1 and the sum of the elements of vector $\pi$.
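Following the description above (the gap to 1 is shared equally among the elements of a row), the readjustments of Eqns. (4)-(6) can be written as below. The primed symbols denoting readjusted values are notation introduced here for illustration.

$$a_{i j}^{\prime}=a_{i j}+\frac{\Delta_i^a}{N}, \quad \Delta_i^a=1-\sum_{j=1}^N a_{i j} \qquad (4)$$

$$b_j^{\prime}(k)=b_j(k)+\frac{\Delta_j^b}{M}, \quad \Delta_j^b=1-\sum_{k=1}^M b_j(k) \qquad (5)$$

$$\pi_i^{\prime}=\pi_i+\frac{\Delta^\pi}{N}, \quad \Delta^\pi=1-\sum_{i=1}^N \pi_i \qquad (6)$$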
For instance, consider the example of triples $T$ in Section 3.2.2, transformed into the labelled triples $T_{Labelled}$: the number of classes is 5 and the number of relations (properties) is 3. $a_{1,1}$ corresponds to the probability of moving from state 1 to state 1. It is computed from the number of triples which have 1 as subject and 1 as object. There is one triple which has 1 as subject, $(1,1,2)$, and none which has 1 as object, i.e., there is no move from state 1 to state 1; hence $a_{1,1}=0$. For $a_{1,2}$, there is one move from state 1 to state 2, $(1,1,2)$, and only one move from state 1 to any state. Hence, $a_{1,2}=\frac{1}{1+\varepsilon}$. The other components of $A$ are computed similarly. $b_1(1)$ corresponds to the probability of observing symbol 1 in state 1. It is computed from the number of triples which have 1 as subject and 1 as relation. There is one triple which has 1 as state and 1 as symbol, and there are two triples which have 1 as symbol. Hence $b_1(1)=\frac{1}{2+\varepsilon}$. The other components of $B$ are computed similarly. $\pi_1$ corresponds to the probability that 1 is the initial state. It is computed from the number of triples whose subject is 1, i.e., the number of triples starting from 1. There is one triple which has 1 as subject, and the number of classes is 5. Hence $\pi_1=\frac{1}{5+\varepsilon}$. The other components of $\pi$ are computed similarly. Thus, we have the following values:
$A=\left[\begin{array}{ccccc}0 & \frac{1}{1+\varepsilon} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{1+\varepsilon} & 0 & 0 & 0 \\ 0 & \frac{1}{2+\varepsilon} & 0 & 0 & \frac{1}{2+\varepsilon} \\ 0 & 0 & 0 & 0 & 0\end{array}\right] \quad B=\left[\begin{array}{ccc}\frac{1}{2+\varepsilon} & 0 & 0 \\ 0 & 0 & 0 \\ \frac{1}{2+\varepsilon} & 0 & 0 \\ 0 & \frac{1}{1+\varepsilon} & \frac{1}{1+\varepsilon} \\ 0 & 0 & 0\end{array}\right]$
$\pi=\left[\begin{array}{ccccc}\frac{1}{5+\varepsilon} & 0 & \frac{1}{5+\varepsilon} & \frac{2}{5+\varepsilon} & 0\end{array}\right]$
Applying the Eqns. (4)-(6), we have:
For matrix $A$
$\Delta_1^a=1-\frac{1}{1+\varepsilon}=\frac{\varepsilon}{1+\varepsilon} \quad \Delta_2^a=1 \quad \Delta_3^a=1-\frac{1}{1+\varepsilon}=\frac{\varepsilon}{1+\varepsilon} \quad \Delta_4^a=1-\frac{2}{2+\varepsilon}=\frac{\varepsilon}{2+\varepsilon} \quad \Delta_5^a=1$
For matrix $B$ :
$\Delta_1^b=1-\frac{1}{2+\varepsilon}=\frac{1+\varepsilon}{2+\varepsilon} \quad \Delta_2^b=1 \quad \Delta_3^b=1-\frac{1}{2+\varepsilon}=\frac{1+\varepsilon}{2+\varepsilon} \quad \Delta_4^b=1-\frac{2}{1+\varepsilon}=\frac{\varepsilon-1}{1+\varepsilon} \quad \Delta_5^b=1$
For vector $\pi$ :
$\Delta^\pi=1-\frac{4}{5+\varepsilon}=\frac{1+\varepsilon}{5+\varepsilon}$
Hence, the components of $A, B$ and $\pi$ after readjustment are:
$A=\left[\begin{array}{ccccc}\frac{\varepsilon}{5+5 \varepsilon} & \frac{5+\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} \\ \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & \frac{1}{5} \\ \frac{\varepsilon}{5+5 \varepsilon} & \frac{5+\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} & \frac{\varepsilon}{5+5 \varepsilon} \\ \frac{\varepsilon}{10+5 \varepsilon} & \frac{5+\varepsilon}{10+5 \varepsilon} & \frac{\varepsilon}{10+5 \varepsilon} & \frac{\varepsilon}{10+5 \varepsilon} & \frac{5+\varepsilon}{10+5 \varepsilon} \\ \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & \frac{1}{5} & \frac{1}{5}\end{array}\right]$
$B=\left[\begin{array}{ccc}\frac{4+\varepsilon}{6+3 \varepsilon} & \frac{1+\varepsilon}{6+3 \varepsilon} & \frac{1+\varepsilon}{6+3 \varepsilon} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3} \\ \frac{4+\varepsilon}{6+3 \varepsilon} & \frac{1+\varepsilon}{6+3 \varepsilon} & \frac{1+\varepsilon}{6+3 \varepsilon} \\ \frac{\varepsilon-1}{3+3 \varepsilon} & \frac{2+\varepsilon}{3+3 \varepsilon} & \frac{2+\varepsilon}{3+3 \varepsilon} \\ \frac{1}{3} & \frac{1}{3} & \frac{1}{3}\end{array}\right]$
$\pi=\left[\begin{array}{lllll}\frac{6+\varepsilon}{25+5 \varepsilon} & \frac{1+\varepsilon}{25+5 \varepsilon} & \frac{6+\varepsilon}{25+5 \varepsilon} & \frac{11+\varepsilon}{25+5 \varepsilon} & \frac{1+\varepsilon}{25+5 \varepsilon}\end{array}\right]$
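The worked example can be checked numerically with the short Python sketch below. It assumes the counting rules read off the example (subject counts for $A$, symbol counts for $B$, number of classes for $\pi$) and the equal redistribution of the gap to 1; the function names `build_hmm` and `readjust` are hypothetical.

```python
def build_hmm(labelled, n_classes, n_props, eps=1e-6):
    """Compute A, B and pi (before readjustment) from labelled triples."""
    N, M = n_classes, n_props
    subj = [0] * N                      # number of triples per subject
    sym = [0] * M                       # number of triples per property
    A = [[0.0] * N for _ in range(N)]
    B = [[0.0] * M for _ in range(N)]
    for s, p, o in labelled:            # labels start at 1
        subj[s - 1] += 1
        sym[p - 1] += 1
        A[s - 1][o - 1] += 1
        B[s - 1][p - 1] += 1
    for i in range(N):
        A[i] = [c / (subj[i] + eps) for c in A[i]]
        B[i] = [B[i][k] / (sym[k] + eps) for k in range(M)]
    pi = [subj[i] / (N + eps) for i in range(N)]
    return A, B, pi

def readjust(row):
    """Share the gap between 1 and the row sum equally among its elements."""
    delta = 1.0 - sum(row)
    return [x + delta / len(row) for x in row]

T_labelled = [(1, 1, 2), (3, 1, 2), (4, 2, 2), (4, 3, 5)]
A, B, pi = build_hmm(T_labelled, n_classes=5, n_props=3)
A = [readjust(row) for row in A]
B = [readjust(row) for row in B]
pi = readjust(pi)
```

With $\varepsilon=10^{-6}$, `A[0][1]` evaluates to $\frac{5+\varepsilon}{5+5\varepsilon}$ and `B[3][0]` to $\frac{\varepsilon-1}{3+3\varepsilon}$, in agreement with the matrices above.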
With this approach, multiple ontologies can be handled with a single HMM. In this case, each ontology is transformed into a set of triples. All these sets are used together to compute the model parameters ($A$, $B$ and $\pi$) using Eqns. (1)-(3). In this case, the set of classes to be considered is the union of the sets of classes derived from each ontology, and similarly for the set of relations (properties).
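The union of classes and relations over several ontologies can be sketched as follows. The two small triple sets are invented for illustration, and dropping triples that occur in more than one ontology is an assumption of this sketch rather than a rule stated in the text.

```python
def merge_triples(*triple_sets):
    """Concatenate triple sets, keeping each distinct triple once."""
    merged, seen = [], set()
    for triples in triple_sets:
        for triple in triples:
            if triple not in seen:
                seen.add(triple)
                merged.append(triple)
    return merged

def classes_and_properties(triples):
    """Return the sets of classes and properties used by the triples."""
    classes = sorted({c for s, _, o in triples for c in (s, o)})
    properties = sorted({p for _, p, _ in triples})
    return classes, properties

# Two hypothetical ontologies of the same domain, given as triples.
onto_a = [("mango", "subClassOf", "fruit"), ("human", "eat", "fruit")]
onto_b = [("banana", "subClassOf", "fruit"), ("human", "plant", "vegetable")]

merged = merge_triples(onto_a, onto_b)
classes, properties = classes_and_properties(merged)
# len(classes) gives N and len(properties) gives M for the single HMM.
```

The merged list can then be labelled and passed to the parameter computation exactly as for a single ontology.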
This approach was tested on the Pizza ontology (pizza.owl, https://protege.stanford.edu/ontologies/pizza/pizza.owl), developed for educational purposes by the University of Manchester, United Kingdom. It describes concepts concerning pizza. Opened with Protégé 2000, it has one hundred classes and eight object properties. A partial view of the Pizza ontology metrics is given in Figure 4.
For this experiment, triples were extracted with Python using the following SPARQL queries:
SELECT ?s ?o
WHERE {?s rdfs:subClassOf ?o .
?s a owl:Class .
?o a owl:Class . }
And
SELECT ?s ?p ?o
WHERE {?p a owl:ObjectProperty .
?p rdfs:domain ?s .
?p rdfs:range ?o . }
These queries returned 90 triples, part of which is shown in Figure 5. To make the triples readable, IRIs were replaced with the namespace (the term before the symbol #). For each line in this figure, the first term corresponds to the subject of the triple, the second to the predicate and the third to the object.
The metrics of this set of triples are summarized in Table 4:
Elements | Quantity |
Number of triples | 90 |
Number of classes | 85 |
Number of relations | 7 |
The rest of the work was done in Java. Classes and relations are then labelled; Figure 6 (resp. Figure 7) shows a partial view of the labelled classes (resp. labelled relations).
The set of triples in Figure 5 is transformed into labelled triples according to the labelled classes and labelled relations. Figure 8 shows a partial view of the corresponding labelled triples.
The full results described in Figure 5, Figure 6, Figure 7 and Figure 8 are available in Appendix A.
The set of labelled triples in Figure 8 is the input for the HMM construction. Since the numbers of classes and relations are obtained from the set of triples, the number of HMM states is 85, the number of HMM symbols is 7, and the sequence of observations to be used contains 90 observations.
After the application of proposed approach, the characteristics of HMM components are:
(1) Matrix $A$: the number of lines and the number of columns, denoted $N$, are equal to 85;
(2) Matrix $B$: the number of lines, denoted $N$, is equal to 85 and the number of columns, denoted $M$, is equal to 7;
(3) Vector $\pi$: the number of components, denoted $N$, is equal to 85.
The value of $\varepsilon$ was fixed to $10^{-6}$. This value can be modified by the user and depends on the desired precision. To show the impact of $\varepsilon$, values are printed with 10 digits after the decimal point.
Because of the large number of rows and columns of $A$, $B$ and $\pi$, only partial views are presented in Figure 9, Figure 10 and Figure 11, respectively. In matrix $A$ given by Figure 9, the entry in the first row and first column is $a_{1,1}=0.0000000118$; it is the probability that there is a relation going from the class labelled 0 to the class labelled 0. Since this probability is close to 0, no relation exists between these two classes, whereas there is a relation between the class labelled 0 and the one labelled 1 because $a_{1,2}=0.9999990118$.
In matrix $B$ in Figure 10, $b_0(0)=0.9999991429$. This value corresponds to the probability that the relation labelled 0 starts from the class labelled 0. $b_0(1)=0.0000001429$ means that the relation labelled 1 does not start from the class labelled 0.
In vector $\pi$ given by Figure 11, we have $\pi_0=0.0233910033$. This value corresponds to the probability that the Pizza ontology can be browsed starting from the state labelled 0.
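The order of magnitude of the reported entries of $A$ and $B$ can be checked with a few lines of Python. The check assumes that the class labelled 0 is the subject of exactly one triple and that the relation labelled 0 occurs in exactly one triple; these two counts are assumptions made for this sketch, since the full triple list is only given in the appendix.

```python
EPS = 1e-6
N, M = 85, 7  # numbers of states and symbols reported for the Pizza ontology

# Row of A for a class that is the subject of a single triple:
# one entry equals 1 / (1 + eps), all others are 0 before readjustment.
raw = 1 / (1 + EPS)
delta = 1 - raw

a_zero = delta / N          # entry with no relation, after readjustment
a_link = raw + delta / N    # entry with the single relation
b_zero = delta / M          # symbol not emitted by the class
b_link = raw + delta / M    # the single symbol emitted by the class

for value in (a_zero, a_link, b_link, b_zero):
    print(f"{value:.10f}")
```

Under these assumptions the four printed values are 0.0000000118, 0.9999990118, 0.9999991429 and 0.0000001429, which match the entries quoted above.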
The full parameters (Figure 9, Figure 10, Figure 11) of the obtained HMM are available in appendix A.
The proposed approach focuses on learning ontology concepts through an HMM, in other words on the transformation of an ontology into an HMM. This technique handles only lightweight ontologies, which is the limit of the approach. Nevertheless, a heavy ontology can be used, provided that the triples which contain axioms are discarded. Hence, the set of relations can sometimes be reduced to rdfs:subClassOf, and the HMM is then initialized with only one symbol. The sensitivity of this approach depends on the value of $\varepsilon$. Whereas the authors of [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32] focused on populating ontologies through HMMs or on combining HMMs and ontologies, this work outlines a strict relationship between ontology and HMM. As stated in Section 3.2, in the case of multiple ontologies, a single HMM can represent them using this approach. The main goal of an ontology is to represent knowledge, and knowledge is based on semantics; the HMM consolidates this semantics because Eqns. (1)-(3) are based on the triples obtained via SPARQL queries on the target ontology. With this approach, comparing two ontologies can be reduced to comparing the two corresponding HMMs, since they capture the properties of the ontologies. Some challenges remain concerning the modularization of ontologies [33]; this technique can be a possible way to extract ontology modules. Another challenge is to transform an HMM into an ontology.
This paper proposed a method to transform an ontology into an HMM. To this end, the ontology is transformed into triples using SPARQL queries, the triples are then transformed into labelled triples using labelled classes and labelled relations, and the HMM parameters are initialized with these labelled triples. This approach was tested on the Pizza ontology, and the results are presented and discussed. The approach does not handle heavy ontologies, which constitutes its drawback. Since the main purpose of machine learning is to help computers learn from several data sources, this approach can contribute to this goal for data represented by ontologies. Future work is to apply HMM algorithms and tasks (prediction, classification, clustering, etc.) and to exploit the results in ontology engineering.
The data (Pizza ontology, an owl file) supporting our research results is deposited in https://protege.stanford.edu/ontologies/pizza/pizza.owl.
The authors declare no conflict of interest.