On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage
Belhajjame, Khalid (2022), On the Anonymization of Workflow Provenance without Compromising the Transparency of Lineage, Journal of Data and Information Quality, 14, 1, p. 1–27. 10.1145/3460207
Type
Article accepted for publication or published
Date
2022
Journal name
Journal of Data and Information Quality
Volume
14
Number
1
Publisher
ACM - Association for Computing Machinery
Pages
1–27
Publication identifier
10.1145/3460207
Author(s)
Belhajjame, Khalid
Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision [LAMSADE]
Abstract (EN)
Workflows have been adopted in several scientific fields as a tool for the specification and execution of scientific experiments. In addition to automating the execution of experiments, workflow systems often include capabilities to record provenance information, which contains, among other things, data records used and generated by the workflow as a whole but also by its component modules. It is widely recognized that provenance information can be useful for the interpretation, verification, and re-use of workflow results, justifying its sharing and publication among scientists. However, workflow execution in some branches of science can manipulate sensitive datasets that contain information about individuals. To address this problem, we investigate, in this article, the problem of anonymizing the provenance of workflows. In doing so, we consider a popular class of workflows in which component modules use and generate collections of data records as a result of their invocation, as opposed to a single data record. The solution we propose offers guarantees of confidentiality without compromising lineage information, which provides transparency as to the relationships between the data records used and generated by the workflow modules. We provide algorithmic solutions that show how the provenance of a single module and an entire workflow can be anonymized and present the results of experiments that we conducted for their evaluation.
Related items
Showing items related by title and author.
- Belhajjame, Khalid (2020) Conference paper
- Alper, Pinar; Belhajjame, Khalid; Goble, Carole (2017) Article accepted for publication or published
- Alper, Pinar; Belhajjame, Khalid; Goble, Carole; Karagoz, Pinar (2015) Conference paper
- Dey, Saumen; Belhajjame, Khalid; Koop, David; Song, Tianhong; Missier, Paolo; Ludäscher, Bertram (2014) Conference paper
- Gaignard, Alban; Belhajjame, Khalid; Skaf-Molli, Hala (2017) Conference paper