On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage
Belhajjame, Khalid (2022), On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage, Journal of Data and Information Quality, 14, 1, p. 1–27. doi:10.1145/3460207
Type: Article accepted for publication or published
Journal name: Journal of Data and Information Quality
ACM - Association for Computing Machinery
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN): Workflows have been adopted in several scientific fields as a tool for the specification and execution of scientific experiments. In addition to automating the execution of experiments, workflow systems often include capabilities to record provenance information, which contains, among other things, data records used and generated by the workflow as a whole but also by its component modules. It is widely recognized that provenance information can be useful for the interpretation, verification, and re-use of workflow results, justifying its sharing and publication among scientists. However, workflow execution in some branches of science can manipulate sensitive datasets that contain information about individuals. To address this problem, we investigate, in this article, the problem of anonymizing the provenance of workflows. In doing so, we consider a popular class of workflows in which component modules use and generate collections of data records as a result of their invocation, as opposed to a single data record. The solution we propose offers guarantees of confidentiality without compromising lineage information, which provides transparency as to the relationships between the data records used and generated by the workflow modules. We provide algorithmic solutions that show how the provenance of a single module and an entire workflow can be anonymized and present the results of experiments that we conducted for their evaluation.
Showing items related by title and author.
Belhajjame, Khalid (2020) Conference paper
Alper, Pinar; Belhajjame, Khalid; Goble, Carole (2017) Article accepted for publication or published
Alper, Pinar; Belhajjame, Khalid; Goble, Carole; Karagoz, Pinar (2015) Conference paper
Dey, Saumen; Belhajjame, Khalid; Koop, David; Song, Tianhong; Missier, Paolo; Ludäscher, Bertram (2014) Conference paper
Gaignard, Alban; Belhajjame, Khalid; Skaf-Molli, Hala (2017) Conference paper