A novel visualization approach for data provenance


Yazici I. M., AKTAŞ M. S.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021 (Journal Indexed in SCI)

  • Publication Type: Article
  • Publication Date: 2021
  • DOI Number: 10.1002/cpe.6523
  • Title of Journal: CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
  • Keywords: data lineage, data provenance, e-Science workflows, PROV-O provenance specification, provenance visualization

Abstract

Data provenance has created a growing need for technologies that empower end-users to assess and act on the data life cycle. In the Big Data era, the amount of data that companies hold worldwide increases every day. As data grows, so does the metadata describing its origin and life cycle. This calls for innovations that use data provenance to support a better understanding and interpretation of data. This study addresses the challenge of extracting data in the form of graphs from scientific workflows and of supporting in-demand visualization approaches such as graph comparison, summarization, backward-forward querying, and stream data visualization. The W3C PROV-O provenance specification is implemented in a visualization tool to assess the applicability of the proposed algorithms. The proposed algorithms are tested on a large-scale provenance dataset to explore their performance. In addition, this study discusses the details of a comprehensive usability study of the prototype visualization tool. The results indicate that the proposed visualization approaches are usable and that the processing overhead is insignificant.
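
To make the idea of backward-forward querying concrete, the following is a minimal sketch, not the algorithm from the study. It assumes provenance is available as (subject, relation, object) triples whose relation names follow PROV-O (`used`, `wasGeneratedBy`). The node names and the helper functions `build_index`, `reachable`, `backward_query`, and `forward_query` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical toy provenance records as (subject, relation, object) triples.
# Relation names follow PROV-O; the node names are invented for illustration.
TRIPLES = [
    ("clean.csv", "wasGeneratedBy", "clean_step"),
    ("clean_step", "used", "raw.csv"),
    ("plot.png", "wasGeneratedBy", "plot_step"),
    ("plot_step", "used", "clean.csv"),
]


def build_index(triples):
    """Build adjacency maps for both traversal directions.

    PROV-O relations point from an effect to its cause, so following a
    triple subject -> object walks backward in time, and object -> subject
    walks forward.
    """
    backward = defaultdict(set)
    forward = defaultdict(set)
    for subject, _relation, obj in triples:
        backward[subject].add(obj)
        forward[obj].add(subject)
    return backward, forward


def reachable(index, start):
    """Breadth-first search returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in index.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def backward_query(triples, node):
    """Everything that node depends on (its lineage)."""
    backward, _ = build_index(triples)
    return reachable(backward, node)


def forward_query(triples, node):
    """Everything that was influenced by node (its impact)."""
    _, forward = build_index(triples)
    return reachable(forward, node)
```

In this sketch, `backward_query(TRIPLES, "plot.png")` collects the activities and entities that led to the plot, while `forward_query(TRIPLES, "raw.csv")` collects everything derived from the raw file. A visualization tool could highlight the returned node set on the provenance graph.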