Taint Analysis on Dataset

FABBIANI, PATRICK
2024/2025

Abstract

This thesis extends taint analysis, traditionally used to track data flow in communication networks, to the domain of datasets. We aim to provide a robust framework that traces data propagation within a dataset through "taint markers" that represent the origin, transformations, and interactions of each data element. By labelling data elements and following them through successive processing stages, the framework provides well-defined mechanisms for detecting unauthorized modifications and thereby helps preserve consistency throughout the data's life cycle. Integrity protection is essential to dataset maintenance: sectors such as finance, healthcare, and government increasingly depend on clean data, and tampering with it could have disastrous effects. The proposed solution comprises scalable taint markers and algorithms that track, verify, and validate data, enabling real-time detection of integrity breaches and unauthorized changes. Preliminary results indicate that the framework enhances dataset security and integrates into existing database management systems without disrupting their architecture. Overall, this work demonstrates the feasibility and added value of adapting taint analysis to the protection of sensitive data, addressing a critical need in a data-centric world.
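The idea of a taint marker that records origin and transformations, and that exposes unauthorized modifications, can be illustrated with a minimal sketch. This is not the thesis's implementation, which is not described in this record; the names `TaintedCell`, `transform`, `is_intact`, and the sample origin string are all hypothetical, and the SHA-256 seal is an assumed stand-in for whatever verification scheme the framework actually uses.

```python
import hashlib
from dataclasses import dataclass, field


def _digest(value, origin, history):
    """Hash the value together with its origin and recorded history."""
    payload = repr((value, origin, tuple(history))).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class TaintedCell:
    """A data element plus a hypothetical taint marker (origin + history)."""
    value: object
    origin: str
    history: list = field(default_factory=list)
    seal: str = ""

    def __post_init__(self):
        # Seal the cell at creation time if no seal was supplied.
        if not self.seal:
            self.seal = _digest(self.value, self.origin, self.history)

    def transform(self, name, fn):
        """Apply a tracked transformation; the marker propagates to the result."""
        return TaintedCell(fn(self.value), self.origin, self.history + [name])

    def is_intact(self):
        """True if value, origin, and history still match the seal."""
        return self.seal == _digest(self.value, self.origin, self.history)


cell = TaintedCell(100, origin="ledger.csv:row7:amount")  # hypothetical origin label
scaled = cell.transform("to_cents", lambda v: v * 100)
print(scaled.origin, scaled.history, scaled.is_intact())

scaled.value = 1  # untracked write that bypasses transform()
print(scaled.is_intact())
```

In this sketch the first check passes and the second fails, because the untracked write changes the value without updating the seal. A plain hash only catches accidental or naive changes: anyone able to recompute the seal could hide a modification, so a real system would need a keyed construction (for example an HMAC) or seals stored outside the writer's reach.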
2024
Files in this item:
File: 869936_Tesi_magistrale_62_pdfA.pdf
Access: under embargo until 05/11/2027
Size: 6.83 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/26985