GWKG: semi-automatic construction of the Great Wall Knowledge Graph

conferencePaper

DOI: 10.1109/CAC53003.2021.9727293
Authors: Feng Yizhan / Teng Jing / Zhang Honglei

Extracted Abstract:

This study proposes to construct the Great Wall knowledge graph (GWKG) in a semi-automatic way. First, under the guidance of domain experts, we build the professional dictionary and ontology layer, where BERT is applied for Named Entity Recognition. The resulting entities are clustered by Word2Vec to automatically refine the ontology layer. Then, we conduct Relation Extraction based on semi-supervision, link entities to encyclopedia websites, and obtain semi-structured information by crawler technology for attribute filling. Finally, we visualize the GWKG and report the results of multiple data formats. The GWKG consists of 34 ontology concepts, 33 types of relations, more than 6,000 entities and attributes, and 720,000 Chinese corpora.

Index Terms: knowledge graph, the Great Wall, ontology, semi-automatic

Level 1: Include/Exclude

  • Papers must discuss situated information visualization* (as defined by Willett et al.) in the application domain of cultural heritage (CH).
    *A situated data representation is a data representation whose physical presentation is located close to the data’s physical referent(s).
    *A situated visualization is a situated data representation for which the presentation is purely visual – and is typically displayed on a screen.
  • Representation must include abstract data (e.g., metadata).
  • Papers focused solely on digital reconstruction without information visualization aspects are excluded.
  • Posters and workshop papers are excluded to focus on mature research contributions.