Date of Completion
12-9-2013
Embargo Period
12-6-2020
Keywords
natural language processing, web content mining, semantic processing, dynamic ontology development, collaboration system, information retrieval, search, biomedical literature mining, text visualization, document visualization, gene regulatory relationships, cell signalling, picture rendering, web search
Major Advisor
Dr. Dong-Guk Shin
Associate Advisor
Dr. Robert McCartney
Associate Advisor
Dr. Xiaoyan Wang
Field of Study
Computer Science and Engineering
Degree
Doctor of Philosophy
Open Access
Campus Access
Abstract
The Semantic Processing System (SPS) performs phrase search of web content. SPS takes a user query in natural language, converts it to a keyword query, expands the keyword query with synonyms, hypernyms, hyponyms, and meronyms, and submits the expanded query to a search engine. SPS then sifts through the search engine result pages, extracting grammatical and semantic information from each page to compute the page's relevance to the natural language query. SPS's relevance computation uses semantic matching of phrases rather than term-and-document frequency weighting, the method most commonly used by existing web search engines. SPS consults an ontology that is both "crowd-sourced," i.e., built collaboratively and incrementally by a large number of users, and "auto-learned," i.e., contextually inferred from sentences containing desired words. SPS would be suitable for biomedical literature mining, legal document review and discovery, and news/RSS feed monitoring because these areas are laden with prose text. We implemented a prototype SPS, experimented with it, and demonstrated that it outperforms a representative keyword-based search engine. The strength of SPS stems from its exploitation of phrase semantics, which conventional search engines do not use.
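The query-preparation steps described in the abstract (natural language query, then keyword query, then ontology-based expansion) might look roughly like the following Python sketch. It is not the dissertation's implementation: the toy ontology, the stopword list, the Boolean query syntax, and the function names `to_keywords`, `expand`, and `to_boolean_query` are all assumptions made for illustration.

```python
import re

# Hypothetical toy ontology for illustration only. The abstract describes an
# ontology that is crowd-sourced and auto-learned; its contents are not given.
TOY_ONTOLOGY = {
    "gene": {
        "synonyms": ["cistron"],
        "hypernyms": ["sequence"],
        "hyponyms": ["oncogene"],
        "meronyms": ["exon"],
    },
    "regulate": {"synonyms": ["control"]},
}

# Assumed stopword list; a real system would likely use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "do", "does", "what", "which", "how"}

RELATIONS = ("synonyms", "hypernyms", "hyponyms", "meronyms")


def to_keywords(question):
    """Reduce a natural language question to lowercase content keywords."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]


def expand(keywords, ontology=TOY_ONTOLOGY):
    """Return, for each keyword, the keyword followed by its related terms."""
    expanded = []
    for kw in keywords:
        group = [kw]
        for relation in RELATIONS:
            group.extend(ontology.get(kw, {}).get(relation, []))
        expanded.append(group)
    return expanded


def to_boolean_query(expanded):
    """Join term groups with AND and alternatives within a group with OR."""
    parts = []
    for group in expanded:
        parts.append("(" + " OR ".join(group) + ")" if len(group) > 1 else group[0])
    return " AND ".join(parts)


keywords = to_keywords("Which gene does p53 regulate?")
query = to_boolean_query(expand(keywords))
print(query)
```

The sketch covers only query expansion. The relevance step the abstract emphasizes, semantic matching of phrases on the retrieved pages, is not shown because the abstract does not describe it in enough detail.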
Recommended Citation
Leone, Joseph, "A Phrase-based Ontology Enabled Semantic Processing System for Web Search" (2013). Doctoral Dissertations. 267.
https://digitalcommons.lib.uconn.edu/dissertations/267