ijs-13069
Code-Talker Paradox: Machine Learning Algorithms with Differential Privacy to Classify Heart Failure Patients

The Code-Talker Paradox concept is applied to test whether differential privacy can be achieved while maintaining an acceptable level of machine learning accuracy. A real-world dataset of heart failure patients is used for the evaluation. Four machine learning algorithms are employed: decision tree, logistic regression, random forest, and Naïve Bayes. Laplace noise is added to the raw dataset to protect private and sensitive user data. This research has two aims: first, to find a balanced noise scale at which differential privacy is achievable with acceptable accuracy; second, to evaluate the four classifiers and identify the one that best fits the heart failure dataset. Hyperparameter tuning is applied to each algorithm. Different Laplace noise scales are tested under several scenarios by adding the noise to the raw data, and the resulting accuracies are recorded and compared. Laplace noise scales between 1 and 4 do not affect accuracy, while scales of 5 to 7 act as a form of regularization and increase accuracy accordingly. A Laplace noise scale of 28 and above significantly reduces accuracy. Finally, the decision tree is the most stable algorithm with respect to the added noise, while logistic regression fluctuates the most yet still achieves the highest accuracy. Potential future research and the limitations of the study are discussed to support a more comprehensive follow-up study.
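The noise-addition step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records, and the choice to perturb every feature with the same scale are assumptions made for illustration. It uses the fact that the difference of two independent exponential variates with mean b follows a Laplace distribution with location 0 and scale b.

```python
import random


def laplace_sample(scale, rng):
    """Draw one Laplace(0, scale) value as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def add_laplace_noise(rows, scale, seed=0):
    """Return a noisy copy of a table of numeric features (list of lists)."""
    rng = random.Random(seed)
    return [[value + laplace_sample(scale, rng) for value in row] for row in rows]


# Hypothetical toy records (two numeric features per patient);
# these values are NOT taken from the study's dataset.
records = [[63.0, 38.0], [55.0, 20.0], [70.0, 45.0]]

# Scales chosen to echo the ranges discussed in the abstract.
for scale in (1, 5, 28):
    noisy = add_laplace_noise(records, scale, seed=42)
    print(scale, noisy[0])
```

In a full experiment, each noisy copy would be passed to the classifiers and the accuracy compared across scales. A larger scale means stronger perturbation of the raw values, which is consistent with the reported drop in accuracy at high scales.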
