Introduction
Scientific research aims to obtain knowledge that explains real-world phenomena by aligning observation, knowledge, and data to solve problems. The application of science allows individuals, industries, and countries to transform abstract theories into practical knowledge. Several sectors, such as the automotive, energy, and computing industries, have their foundations in scientific investigation.
Cuban universities play a key role in creating and applying new knowledge that enhances the quality of life in society. However, at the beginning of an investigation, several issues usually arise that limit the work of researchers. These problems are more significant in areas with a low level of “maturity”. For example, in the informatics sciences it is difficult to define the type of result that a scientific investigation will yield. The low level of maturity in this area leads to heterogeneity in the definition of potential scientific results. It is possible to find similar results with different classifications, for example, a model and a method with a similar description. Furthermore, even when the type of result is defined, it is unclear which elements must be developed. In addition, there is no guide for selecting the research methods to apply or for defining a validation strategy according to the type of result. Hence, inappropriate research methods are sometimes applied.
These problems have been identified in seminars with professors and PhD students and through the review of several theses. They hinder the assessment and analysis of the contributions described in a thesis. For example, it is difficult to compare two similar results because they may be described in different terms. Moreover, the heterogeneity of the descriptions makes it difficult to cluster investigations so that information is easy to search. Usually, finding a specific element of a thesis requires reading the complete document. Thesis evaluators are also affected because they do not know which elements cannot be omitted according to the type of scientific result. There are proposals (Hernández Sampieri, Fernández Collado, & Baptista Lucio, 2010; León, Alfredo, & Coello Gonzalez, 2008) to guide scientific activity methodologically, but they focus only on general aspects.
Ontologies, in turn, are an artificial intelligence technique that has been applied successfully to describe and analyze knowledge. Applications of ontologies can be found in many domains (Bouzidi, Nicola, Nader, & Chalal, 2019; Larentis et al., 2021; Ma et al., 2019; Nicola, Melchiori, & Villani, 2019; Segura, Martínez, & Fernández, 2018; Sil Sen, Banerjee, & Mukherjee, 2022; Silega & Noguera, 2021; Silega et al., 2022; Tapia-Leon, Rivera, Chicaiza, & Luján-Mora, 2018; Yousefianzadeh & Taheri, 2020), such as bank management systems, enterprise management, health management systems, and others.
To address the issues described above, this article identifies some investigation results that are common outcomes of doctoral theses and proposes the necessary elements for each type of result. To achieve this objective, we reviewed 12 doctoral theses developed at the University of Informatics Sciences. Furthermore, the paper describes an ontology that (1) specifies the structure of the investigation results and (2) describes, based on that structure, the results of the analyzed theses. This ontology could be a useful instrument to support the work of PhD students and evaluators, since it is not easy to find documentation that defines the potential investigation results that could be developed in a doctoral thesis.
The problems described above lead to delays in the development of thesis projects or, in the worst case, to the failure of the project. Hence, this approach can help define the right scope of the project (the parts of the thesis) and support proper planning.
The remainder of the article is structured as follows. The next section analyzes some basic concepts for this research and presents the methodology used to review the 12 doctoral theses. Then, we describe the approach to represent investigation results based on ontologies, together with some examples that demonstrate its applicability. Finally, future work and conclusions are presented.
Methods and computational methodology
Ontology
An ontology is a formal, explicit description of concepts in a domain of discourse (classes (sometimes called concepts)), properties of each concept describing various features and attributes of the concept (slots (sometimes called roles or properties)), and restrictions on slots (facets (sometimes called role restrictions)) (Noy & McGuinness, 2001). The set of classes of an ontology and its instances represent a knowledge base. Therefore, ontologies are a suitable option to represent the knowledge of a domain of discourse.
There are several languages for representing ontologies (Amith, Fujimoto, Mauldin, & Tao, 2020; Magumba & Nabende, 2017; Yang et al., 2019), e.g., Ontolingua, XML Schema, RDF (Resource Description Framework), RDF Schema (or RDF-S), and OWL (Web Ontology Language). OWL (Xing & Ah-Hwee, 2010) is one of the most relevant languages for managing ontologies. It offers a rich set of operators, e.g., intersection, union, and negation (Horridge, 2009). In addition, reasoners can be used to check the consistency of models automatically. Moreover, OWL is supported by the tool Protégé, which makes it easy to create ontologies.
Adopting a sound methodology is crucial when developing an ontology. Hence, we analyzed some relevant methodologies (Kotis, Vouros, & Spiliotopoulos, 2020; Kumar, 2017). We developed the ontology following the methodology of Noy and McGuinness, which has been extensively adopted to guide the development of ontologies (Sattar, Surin, Ahmad, Ahmad, & Mahmood, 2020). In addition, we analyzed several ontologies focused on describing and analyzing research results (Guerrero-Sosa, Menendez-Domínguez, Castellanos-Bolanos, & Gómez-Montalvo, 2019; Varen & Silega, 2022).
Methodology for the review
At the University of Informatics Sciences, several investigations in different research fields have been developed. Some results of these investigations have been documented in PhD theses. The review of 12 PhD theses demonstrated the heterogeneity in the description of investigation results. This review was carried out in three steps: identification of the theses, data extraction, and analysis of results. In the first step, we searched the institutional repository, from which every thesis developed at the university can be downloaded.
Finally, 12 theses were downloaded and analyzed (Alfonso, 2015; Baryolo, 2012; Betancourt, 2016; Castillo, 2014; Díaz, 2012; A. O. García, 2016; J. A. L. García, 2015; Hernández, 2015; Lago, 2015; López, 2015; Pérez, 2016; Silega, 2014). The authors of this paper carried out the extraction of the information, and these results were used to elaborate an ontology to describe scientific results.
Results and discussion
This section presents the main components of the ontology that we developed to describe and analyze scientific results. Classes and properties are the most important components of an ontology. The review of the 12 theses helped identify the ontology's main concepts. In total, 22 classes were specified; Fig. 1 depicts these classes. Six of the most important classes are Result, PartOfResult, Method, Model, Person, and KnowledgeField. The classes help to homogenize the types of scientific results, especially those that can be the main outcomes of doctoral research.
The properties in an ontology allow the characterization of individuals. There are two types of properties: object properties and data properties. The object properties describe the relations between two individuals, while the data properties specify a simple attribute of an individual. In the ontology, 35 object properties were defined. For example, the object property HasPart defines that a Result HasPart some PartOfResult. Furthermore, we defined the object property ApplyMethodotoValidate to define that a Result ApplyMethodotoValidate some ResearchMethod.
Table 1 shows some of the most relevant object properties defined in the ontology; for the sake of brevity, we do not explain all of them. One interesting decision during the ontology design was the definition of the super property HasPart, which subsumes several properties. This property defines that a Result HasPart some PartOfResult. The specific parts that each type of result can have are then defined: for example, a Method HasPart some Step, while a Model HasPart some Component. For these cases, the properties HasStep and HasComponent, respectively, are specified to enhance the accuracy of the description.
We also defined data properties to record some important information about the individuals. For example, we defined that a Result has the data properties HasTitle and DateOfPresentation. Authors, in turn, have the data properties HasName, BirthDay, and HasIN.
Domain | Property | Range |
---|---|---|
Result | HasPart | PartOfResult |
Result | ApplyResearchMethod | ResearchMethod |
Result | HasAuthor | Person |
Model | HasComponent | Component |
Model | HasInput | Input |
Model | HasOutput | Output |
Method | HasStep | Step |
Strategy | HasStage | Stage |
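The domain and range declarations in Table 1 can be read as a small schema. The following is a minimal sketch in plain Python, not OWL and not the authors' implementation. The dictionary layout, the helper names `is_a` and `check_assertion`, and the class hierarchy fragment are assumptions made for illustration, and only a subset of the table rows is included. Note also that an OWL reasoner uses domain and range axioms to infer types rather than to reject assertions, so the check below is a closed-world simplification.

```python
# Subset of the object properties listed in Table 1: property -> (domain, range).
OBJECT_PROPERTIES = {
    "HasPart": ("Result", "PartOfResult"),
    "HasAuthor": ("Result", "Person"),
    "HasComponent": ("Model", "Component"),
    "HasStep": ("Method", "Step"),
    "HasStage": ("Strategy", "Stage"),
}

# Assumed (hypothetical) fragment of the class hierarchy: child -> parent.
SUBCLASS_OF = {
    "Method": "Result",
    "Model": "Result",
    "Strategy": "Result",
    "Step": "PartOfResult",
    "Component": "PartOfResult",
    "Stage": "PartOfResult",
}


def is_a(cls, ancestor):
    """Return True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def check_assertion(subject_cls, prop, object_cls):
    """Closed-world check that an assertion respects the declared domain and range."""
    domain, range_ = OBJECT_PROPERTIES[prop]
    return is_a(subject_cls, domain) and is_a(object_cls, range_)


print(check_assertion("Method", "HasStep", "Step"))  # a Method may have Steps
print(check_assertion("Model", "HasStep", "Step"))   # Model is outside the domain of HasStep
print(check_assertion("Method", "HasPart", "Step"))  # Step is treated as a kind of PartOfResult
```

Keeping the schema as data makes it straightforward to add the remaining properties of the ontology without changing the checking logic.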
In OWL it is possible to represent universal restrictions (only), existential restrictions (some), and cardinality restrictions. For example, we defined the existential restriction that a Result HasPart some PartOfResult; this restriction forces the description of a thesis to include at least one of the result parts defined in the ontology. We also defined a universal restriction to specify that a Model HasComponent only individuals of the class Component. With this specification, it is possible to find description errors, for example, a Model that is related through the property HasComponent to something that is not a Component. To check the completeness of the descriptions, we defined the cardinality restriction that a Method HasStep a minimum of two Steps. Figures 2 a), 2 b), and 2 c) depict the definition of these three restrictions, respectively.
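To make the three kinds of restriction concrete, the sketch below evaluates them over a handful of individuals in plain Python. It is an illustration only: the individuals, the triples, and the helper names are hypothetical, and the checks assume a closed world with unique names, whereas an OWL reasoner works under the open-world assumption and does not assume that different names denote different individuals.

```python
# Hypothetical individuals: name -> set of asserted classes.
TYPES = {
    "method1": {"Method", "Result"},
    "model1": {"Model", "Result"},
    "step1": {"Step", "PartOfResult"},
    "step2": {"Step", "PartOfResult"},
    "component1": {"Component", "PartOfResult"},
}

# Hypothetical property assertions as (subject, property, object) triples.
TRIPLES = [
    ("method1", "HasStep", "step1"),
    ("method1", "HasStep", "step2"),
    ("model1", "HasComponent", "component1"),
    ("model1", "HasComponent", "step1"),  # deliberate description error
]


def fillers(subject, prop):
    """Return the objects related to subject through prop."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def satisfies_some(subject, prop, cls):
    """Existential restriction: at least one filler belongs to cls."""
    return any(cls in TYPES[o] for o in fillers(subject, prop))


def satisfies_only(subject, prop, cls):
    """Universal restriction: every filler belongs to cls (vacuously true if none)."""
    return all(cls in TYPES[o] for o in fillers(subject, prop))


def satisfies_min(subject, prop, cls, n):
    """Minimum cardinality: at least n distinct fillers belong to cls."""
    return len({o for o in fillers(subject, prop) if cls in TYPES[o]}) >= n


print(satisfies_some("method1", "HasStep", "Step"))            # existential (some)
print(satisfies_only("model1", "HasComponent", "Component"))   # universal (only)
print(satisfies_min("method1", "HasStep", "Step", 2))          # cardinality (min 2)
```

In this sketch the universal check fails for `model1` because one of its HasComponent fillers is a Step, which mirrors the kind of description error mentioned above.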
Exploiting the ontological model
To demonstrate the applicability of the ontology, we described seven of the analyzed theses. Figure 3 shows the classification of the theses: Method (3), Model (2), and Strategy (2).
We represent the elements that describe each thesis; for example, Figure 4 shows that MetodoTDNemury2014 has an author, tutor, principles, premises, and steps and adopts several research methods. The ontology with the theses description can be exploited to search for useful information; for example, it is possible to easily find and analyze the other theses with the principle flexibility.
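A search of this kind amounts to matching a pattern against the stored descriptions. The following minimal sketch shows the idea in plain Python; the property name `HasPrinciple`, the thesis names, and the principle values are hypothetical placeholders and are not taken from the ontology.

```python
def match(triples, pattern):
    """Return the triples matching a (subject, property, object) pattern; None is a wildcard."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Hypothetical thesis descriptions.
TRIPLES = [
    ("ThesisA", "HasPrinciple", "Flexibility"),
    ("ThesisA", "HasPrinciple", "Simplicity"),
    ("ThesisB", "HasPrinciple", "Flexibility"),
    ("ThesisC", "HasPrinciple", "Scalability"),
]

# Find every thesis described with the principle Flexibility.
flexible = sorted(s for s, _, _ in match(TRIPLES, (None, "HasPrinciple", "Flexibility")))
print(flexible)  # ['ThesisA', 'ThesisB']
```

In practice the same question would be posed to the ontology itself through a query language or a reasoner rather than through a hand-written matcher.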
These elements could be useful for a researcher who needs to define the investigation result. Once the type of result is defined, the ontology could help the researcher identify the elements that should be developed for this type of result. For example, a project management researcher could easily find the results presented in this area and their characteristics. Likewise, the researcher could find the research methods adopted to validate investigation results of a given type.
The current version of the ontology focuses on the structure of the investigation results and describes the analyzed theses. We are currently working to define new restrictions to validate the descriptions and to extend the scope of the ontology. This ontology may also be a useful instrument for the evaluators of the theses.
In addition, the expressive richness of the OWL language and the use of a reasoner made it possible to obtain interesting inferences; for example, OWL allows the definition of functional properties. To illustrate an inference, we declared that the property HasAuthor is functional. This statement means that a thesis can have only one author; hence, if two authors are assigned to the same thesis, the reasoner infers that they are the same person. For example, we defined that the individual Nemury is the author of the method MetodoTDNemury2014, and then we specified that Nsilega is also the author of the method MetodoTDNemury2014. With these specifications, the reasoner inferred that Nemury and Nsilega are the same person. Figures 5 a) and 5 b) show this inference.
These inferences can be useful for analyzing large ontologies, for example, to identify people recorded under several names or to detect description mistakes. In the example above, if Nemury and Nsilega have previously been declared to be different individuals, the reasoner will detect an inconsistency. Likewise, it is possible to declare that the property IsAuthorOf is functional too. The reasoner will then detect an inconsistency if a person has been assigned as the author of two theses that are declared to be different individuals.
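The behavior of a functional property can be approximated with a small union-find structure: objects assigned to the same subject are merged, and a declared difference between merged individuals signals an inconsistency. This is a simplified sketch in plain Python, not the algorithm of any particular reasoner; the function name `infer_same_as` and its parameters are assumptions made for illustration.

```python
def infer_same_as(assertions, different=()):
    """Merge objects that a functional property assigns to the same subject.

    assertions: iterable of (subject, object) pairs for one functional property.
    different: iterable of (a, b) pairs declared to be different individuals.
    Returns (groups, consistent), where groups is a list of sets of names
    inferred to denote the same individual.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_object = {}
    for subject, obj in assertions:
        find(obj)  # register the object
        if subject in first_object:
            # Functional property: both objects must denote the same individual.
            parent[find(obj)] = find(first_object[subject])
        else:
            first_object[subject] = obj

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)

    consistent = all(find(a) != find(b) for a, b in different)
    return list(groups.values()), consistent


has_author = [
    ("MetodoTDNemury2014", "Nemury"),
    ("MetodoTDNemury2014", "Nsilega"),
]

groups, ok = infer_same_as(has_author)
print(groups, ok)  # the two names are merged; no inconsistency

_, ok_with_difference = infer_same_as(has_author, different=[("Nemury", "Nsilega")])
print(ok_with_difference)  # False: the merge contradicts the declared difference
```

The second call reproduces the inconsistency case: once the two names are declared different, the functional property can no longer be satisfied.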
OWL allows the creation of defined classes, which are classes with a set of necessary and sufficient conditions. Therefore, a reasoner can automatically classify the individuals that belong to these classes. To illustrate this feature, we created the class named WrongResult; then, we declared that a Model with some Step is a necessary and sufficient condition to classify a Result as WrongResult. Figure 6 a) shows the statements for the defined class, while Figure 6 b) depicts an example of inference carried out by the reasoner.
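The classification performed for a defined class can be imitated by testing the necessary and sufficient condition directly. The sketch below does this in plain Python for the WrongResult condition (a Model with some Step); the individuals and the function name `classify_wrong_results` are hypothetical, and a real reasoner would derive the same membership from the class definition rather than from hand-written code.

```python
def classify_wrong_results(types, triples):
    """Return individuals that are a Model and have at least one HasStep filler of class Step."""
    wrong = set()
    for subject, prop, obj in triples:
        if (
            prop == "HasStep"
            and "Model" in types.get(subject, set())
            and "Step" in types.get(obj, set())
        ):
            wrong.add(subject)
    return sorted(wrong)


# Hypothetical individuals and assertions.
types = {
    "modelA": {"Model"},
    "methodB": {"Method"},
    "step1": {"Step"},
    "component1": {"Component"},
}
triples = [
    ("modelA", "HasStep", "step1"),            # a Model described with a Step
    ("modelA", "HasComponent", "component1"),
    ("methodB", "HasStep", "step1"),           # a Method with a Step is fine
]

print(classify_wrong_results(types, triples))  # ['modelA']
```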
Conclusions
The ontology structure, together with the instances, represents a knowledge base. Since OWL is a formal language based on description logics, several tools can automatically analyze this knowledge base. Therefore, it is possible to check the consistency of the represented information as well as to infer new knowledge. There are several languages for querying an ontology; SPARQL is one of the most popular. With this language it is possible to write complex queries that could support the analysis of the described knowledge. This ontology could be a useful instrument for researchers to find information about scientific results. The ontology was evaluated using a reasoner to check its formal logical properties. Furthermore, to demonstrate the applicability of the ontology, we described the results of the 12 doctoral theses. This approach could contribute to the success of a thesis project.