CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, vol.34, no.9, 2022 (SCI-Expanded)
Data provenance has created a growing need for technologies that empower end users to assess and act on the data life cycle. In the Big Data era, the amount of data held by companies worldwide increases each day. As data grows, so does the metadata describing its origin and life cycle. This calls for innovations that use data provenance to support a better understanding and interpretation of data. This study addresses the challenge of extracting data in the form of graphs from scientific workflows and of supporting in-demand visualization approaches such as graph comparison, summarization, backward-forward querying, and stream data visualization. The W3C PROV-O provenance specification is implemented in a visualization tool to assess the applicability of the proposed algorithms. The proposed algorithms are tested on a large-scale provenance dataset to evaluate their performance. In addition, this study reports the details of a comprehensive usability study of the prototype visualization tool. The results indicate that the proposed visualization approaches are usable and that the processing overhead is insignificant.
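
To make the idea of backward-forward querying concrete, the following is a minimal sketch and not the algorithm used in the study. It assumes provenance is available as (subject, relation, object) triples using the PROV-O relations `used`, `wasGeneratedBy`, and `wasDerivedFrom`, whose edges point from a later item to the earlier item it depends on. The sample records and the function names `backward_query` and `forward_query` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical toy provenance records as (subject, PROV-O relation, object).
# Each edge points from the later item to the earlier item it depends on.
TRIPLES = [
    ("clean_data", "wasGeneratedBy", "clean_step"),
    ("clean_step", "used", "raw_data"),
    ("plot", "wasGeneratedBy", "plot_step"),
    ("plot_step", "used", "clean_data"),
    ("report", "wasDerivedFrom", "plot"),
]


def build_index(triples):
    """Return two adjacency maps: dependencies and their reverse."""
    depends_on = defaultdict(set)   # node -> nodes it depends on
    influences = defaultdict(set)   # node -> nodes that depend on it
    for subject, _relation, obj in triples:
        depends_on[subject].add(obj)
        influences[obj].add(subject)
    return depends_on, influences


def reachable(start, adjacency):
    """Breadth-first search collecting every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def backward_query(node, triples):
    """Lineage: everything the given node was produced from."""
    depends_on, _ = build_index(triples)
    return reachable(node, depends_on)


def forward_query(node, triples):
    """Impact: everything that was produced from the given node."""
    _, influences = build_index(triples)
    return reachable(node, influences)


if __name__ == "__main__":
    print(sorted(backward_query("plot", TRIPLES)))
    print(sorted(forward_query("clean_data", TRIPLES)))
```

In this sketch a backward query answers "where did this come from?" and a forward query answers "what is affected if this changes?". Both are the same traversal over opposite edge directions, which is why a single index-building step serves both.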