Google has open-sourced Differentially Private SQL, a tool for companies aiming to keep sensitive data private.

For a company that's in the business of tracking users' online activities, Google sure is going all out to prove it's dead serious about privacy. To that effect, the internet behemoth is open-sourcing a library that it uses to glean insights from aggregate data in a privacy-preserving manner. Called Differentially Private SQL, the library leverages the idea of differential privacy (DP), a statistical technique that makes it possible to collect and share aggregate information about users while safeguarding individual privacy.

The link for this article located at The Next Web is no longer available.

Differentially Private SQL, Data Protection, Open Source Solutions.

LinuxSecurity.com Team
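To make the idea of differential privacy a little more concrete, here is a minimal Python sketch of the Laplace mechanism applied to a counting query. It is not Google's library or its API: the names `laplace_noise` and `private_count`, the sample ages, and the epsilon value are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    ages = [18, 25, 31, 47, 52, 60]  # hypothetical user data
    noisy = private_count(ages, lambda age: age >= 30, epsilon=1.0)
    print(f"Noisy count of users aged 30+: {noisy:.2f}")
```

A smaller epsilon adds more noise and gives stronger privacy, at the cost of accuracy. A production library has to deal with further concerns that this sketch ignores, such as limiting how many rows each user contributes and avoiding floating-point weaknesses in the noise sampling.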
Ever since Paul Graham published "A Plan for Spam" in August 2002 (prerequisite reading for this article), a lot of people have spent a great deal of time applying statistical methods to automatically classify email messages as spam. Generally, spam identification is a hard problem to solve given that the definition of spam can differ from person to person. Messages erroneously classified as spam, known as "false positives," are pretty much intolerable, which further compounds the problem. Statistical classifiers show great promise in this area as they are able to automatically adjust to handle personal definitions of spam. The odd false positive shows up from time to time, but these become few and far between as the local statistical model continues to improve.

These classifiers already come in many forms. There are POP3 proxies, IMAP proxies, mail file processors, and even classifiers built directly into mail clients. I use POPFile (a naïve Bayesian classifier in a POP3 proxy) at home with great success. Some work better than others, but with a little training, they all seem to work pretty well. Unfortunately, they have a common shortcoming: They don't cause the spammers any pain. And we all want to cause spammers pain. None of these classifiers can hurt the spammer, because the spammer is long gone by the time the classifier has the opportunity to process the message. What we need is a way to use the classifier against the spammer while the spammer is still connected.

Spam Detection, Classification System, Bayesian Classifier, Email Filtering.

LinuxSecurity.com Team
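For readers who have not seen a statistical spam classifier up close, the following is a minimal Python sketch of a generic naïve Bayes filter with add-one smoothing. It is neither Paul Graham's exact scoring formula nor POPFile's implementation; the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, and the training messages are all assumptions made for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy naive Bayes classifier over whitespace-separated tokens."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.token_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, text, label):
        """Record one message the user has labelled as spam or ham."""
        self.token_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate P(spam | text) from the messages seen so far."""
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in self.LABELS:
            # Smoothed prior, so an untrained label never yields log(0).
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_tokens = sum(self.token_counts[label].values())
            # The extra +1 reserves probability mass for unseen tokens.
            denominator = total_tokens + len(vocab) + 1
            for token in text.lower().split():
                score += math.log((self.token_counts[label][token] + 1) / denominator)
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


if __name__ == "__main__":
    spam_filter = NaiveBayesSpamFilter()
    spam_filter.train("cheap pills buy now", "spam")
    spam_filter.train("buy cheap watches now", "spam")
    spam_filter.train("meeting agenda for monday", "ham")
    spam_filter.train("lunch on monday", "ham")
    print(spam_filter.spam_probability("cheap pills now"))
    print(spam_filter.spam_probability("meeting on monday"))
```

Because the model is trained on one user's own labels, it adapts to that user's personal definition of spam, which is the property the article highlights.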