A novel visualization approach for data provenance


Yazici I. M., AKTAŞ M. S.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, vol. 34, no. 9, 2022 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 34, Issue: 9
  • Publication Date: 2022
  • DOI: 10.1002/cpe.6523
  • Journal Name: CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, zbMATH, Civil Engineering Abstracts
  • Keywords: data lineage, data provenance, e-Science workflows, PROV-O provenance specification, provenance visualization
  • Yıldız Technical University Affiliated: Yes

Abstract

Data provenance has created a growing need for technologies that empower end-users to assess and act on the data life cycle. In the Big Data era, the amount of data held by companies worldwide increases every day. As data grows, so does the metadata describing its origin and life cycle. This calls for innovations that use data provenance to support better understanding and interpretation of data. This study addresses the challenge of extracting data in the form of graphs from scientific workflows and of supporting in-demand visualization approaches such as graph comparison, summarization, backward-forward querying, and stream data visualization. The W3C PROV-O provenance specification is implemented in a visualization tool to assess the applicability of the proposed algorithms. The proposed algorithms are tested on a large-scale provenance dataset to explore their performance. In addition, this study discusses the details of a comprehensive usability study of the prototype visualization tool. Results indicate that the proposed visualization approaches are usable and that the processing overhead is insignificant.
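
To make the idea of backward-forward querying on a provenance graph concrete, here is a minimal sketch. It is not the algorithm from the paper, which the abstract does not describe. It assumes provenance is available as (subject, relation, object) triples. The relation names `used` and `wasGeneratedBy` are PROV-O properties, while the node names and the helper functions `build_index` and `reachable` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict, deque

# Hypothetical provenance triples (subject, relation, object).
# In PROV-O, edges point from the later item to the earlier one,
# e.g. an entity wasGeneratedBy an activity, an activity used an entity.
EDGES = [
    ("clean_data", "wasGeneratedBy", "clean_step"),
    ("clean_step", "used", "raw_data"),
    ("report", "wasGeneratedBy", "plot_step"),
    ("plot_step", "used", "clean_data"),
]

def build_index(edges):
    """Build adjacency maps for both query directions."""
    backward = defaultdict(set)  # node -> nodes it directly depends on
    forward = defaultdict(set)   # node -> nodes that directly depend on it
    for subj, _rel, obj in edges:
        backward[subj].add(obj)
        forward[obj].add(subj)
    return backward, forward

def reachable(start, adjacency):
    """Breadth-first search returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

backward, forward = build_index(EDGES)

# Backward query: everything that contributed to "report" (its lineage).
lineage = reachable("report", backward)

# Forward query: everything affected by "raw_data" (its impact).
impact = reachable("raw_data", forward)
```

In this toy graph, the backward query from `report` and the forward query from `raw_data` each visit the other four nodes. A visualization tool could highlight the returned node set on the rendered graph.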