This record is now available at https://digitalcommons.njit.edu/dissertations/1385/

The New Jersey Institute of Technology's
Electronic Theses & Dissertations Project

Title: Applications of big knowledge summarization
Author: Zheng, Ling
View Online: njit-etd2018-042
(xx, 173 pages ~ 7.2 MB pdf)
Department: Department of Computer Science
Degree: Doctor of Philosophy
Program: Computer Science
Document Type: Dissertation
Advisory Committee: Perl, Yehoshua (Committee co-chair)
Geller, James (Committee co-chair)
McHugh, James A. (Committee member)
Halper, Michael (Committee member)
Gu, Huanying (Committee member)
Liu, Mei (Committee member)
Date: 2018-08
Keywords: Abstraction networks
Biomedical ontologies
Drug-drug interaction discovery
Ontology quality assurance
Ontology summarization
Ontology visualization
Availability: Unrestricted
Abstract:

Advanced technologies have resulted in the generation of large amounts of data ("Big Data"). The Big Knowledge derived from Big Data can exceed human comprehension, which limits the effective and innovative use of Big Knowledge repositories. Biomedical ontologies, which play important roles in biomedical information systems, constitute one kind of Big Knowledge repository. Biomedical ontologies typically consist of domain knowledge assertions expressed by the semantic connections between tens of thousands of concepts. Without some high-level visual representation of the Big Knowledge in biomedical ontologies, humans cannot grasp the "big picture" of those ontologies. Such Big Knowledge orientation is required for the proper maintenance of ontologies and their effective use. This dissertation addresses the Big Knowledge challenge of enabling humans to use Big Knowledge correctly and effectively, referred to as the "Big Knowledge to Use" (BK2U) problem, with a focus on biomedical ontologies.

In previous work, Abstraction Networks (AbNs) have proven successful for the summarization, visualization and quality assurance (QA) of biomedical ontologies. Building on that research, this dissertation introduces new AbNs of various granularities for Big Knowledge summarization and extends the applications of AbNs. This dissertation consists of three main parts. The first part introduces two advanced AbNs. The first is the weighted aggregate partial-area taxonomy, which has a parameter to flexibly control the summarization granularity. The second is the Ingredient Abstraction Network (IAbN) for the National Drug File - Reference Terminology (NDF-RT) Chemical Ingredients hierarchy. The previously developed AbNs, which were designed for hierarchies with outgoing relationships, are not applicable to this hierarchy because it has no outgoing relationships.

The second part describes applications of the two advanced AbNs. A study utilizing the weighted aggregate partial-area taxonomy for the identification of major topics in SNOMED CT's Specimen hierarchy is reported. A multi-layer interactive visualization system that supports ontology comprehension at the required granularity, based on the weighted aggregate partial-area taxonomy, is demonstrated on the Neoplasm subhierarchy of the National Cancer Institute Thesaurus (NCIt). The IAbN is applied to drug-drug interaction (DDI) discovery.

The third part reports eight family-based QA studies on NCIt's Neoplasm, Gene, and Biological Process hierarchies, SNOMED CT's Infectious disease hierarchy, the Chemical Entities of Biological Interest ontology, and the Chemical Ingredients hierarchy in NDF-RT. There is no one-size-fits-all QA method, and it is not feasible to devise a separate QA method for each individual ontology. Hence, family-based QA, in which one QA technique is applicable to a whole family of structurally similar ontologies, is an effective approach. The results of these studies demonstrate that "complex concepts" and "uncommonly modeled concepts" are more likely to have errors. Furthermore, the three studies on overlapping concepts in partial-area taxonomies reported in this dissertation, combined with three previous studies, establish the success of "overlapping concepts" as a QA methodology for a whole family of 76 similar ontologies in BioPortal.


If you have any questions please contact the ETD Team, libetd@njit.edu.

NJIT's ETD project was given an ACRL/NJ Technology Innovation Honorable Mention Award in spring 2003