Data Privacy (DP) has always been an issue in data analysis. It is more pressing today than ever before because advanced tools make it easy to exploit data for all sorts of purposes, including unethical ones. It has, therefore, become one of the major challenges that Big Data has thrown up in recent years. A number of attempts have been made at dealing with DP and confidentiality preservation. They mainly rely on data encoding, homomorphic encryption in particular, and other mathematical devices that allow one dataset to be worked on in place of another with the aim of obtaining the same or equivalent solutions. They do, however, have limitations, often due to the high dimensionality of these datasets and their extremely large volumes. High dimensionality and large volume are, of course, inherent to the concept of Big Data.
In my talk, I will suggest a new approach to protecting data privacy and confidentiality that relies on optimisation and complexity theory, NP-Completeness in particular. I will describe this approach and illustrate it on a very common problem in data science, namely clustering, which will be converted into a Travelling Salesman Problem (TSP). Results will be provided and discussed.