Adversarial ML (Adversarial Machine Learning)

As early as 2004, spammers were already trying to fool the AI behind email spam filters. Now that machine learning is woven far more deeply into everyday life, attackers are moving into this new playground too.
Adversarial machine learning covers the techniques used to trick machine learning models with deceptive inputs. The attacks fall mainly into the following categories:

Poisoning Attacks
Poisoning attacks contaminate the training dataset. By injecting malicious samples that confuse or skew the algorithm's predictions, the attacker lowers the model's performance.

For example, in 2016 Microsoft released Tay, an AI-based chatbot that learned from its interactions online and replied to Twitter users. Internet trolls fed Tay massive amounts of inflammatory messages and “contaminated” its dataset. Within 16 hours, Tay began generating politically incorrect content and had to be shut down.
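The effect of poisoning can be illustrated with a toy model. The sketch below is a minimal example under stated assumptions, not a reconstruction of the Tay incident: it uses a hypothetical one-dimensional nearest-centroid classifier, and all data points and function names are invented for illustration. The attacker injects a few mislabeled samples into the training set, which drags one class centroid away from its true position and lowers accuracy on clean test data.

```python
def train_centroids(samples):
    """Return the mean feature value of each class as {label: centroid}."""
    sums, counts = {}, {}
    for x, label in samples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, label in samples if predict(centroids, x) == label)
    return correct / len(samples)


# Clean training data (illustrative): class 0 sits near 0, class 1 near 10.
clean_train = [(x, 0) for x in (-1.0, 0.0, 1.0, 2.0)] + \
              [(x, 1) for x in (8.0, 9.0, 10.0, 11.0)]
test_set = [(0.5, 0), (1.5, 0), (3.0, 0), (7.0, 1), (9.5, 1), (10.5, 1)]

# Poison: samples placed far out on the class-1 side but labeled as class 0.
poison = [(30.0, 0)] * 4

clean_model = train_centroids(clean_train)
poisoned_model = train_centroids(clean_train + poison)

clean_acc = accuracy(clean_model, test_set)        # 1.0
poisoned_acc = accuracy(poisoned_model, test_set)  # 0.5

print("accuracy with clean training data:   ", clean_acc)
print("accuracy with poisoned training data:", poisoned_acc)
```

Only four injected points are enough here because the model averages every sample it is given. Real systems are larger, but the same weakness applies whenever training data is collected from untrusted sources.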

Evasion Attacks
Evasion attacks are the most prevalent and practical kind, and they have attracted a great deal of research attention. The attacker targets an already trained model and crafts modified inputs that “evade” the classifier's detection, without touching the training data.

Two pictures of a panda may look identical to human eyes, yet an image classifier can label one as a panda and the other as a gibbon! In 2014, Ian Goodfellow and his colleagues introduced the Fast Gradient Sign Method (FGSM), which can trick image classifiers in exactly this way.
FGSM computes the gradient of the model's loss with respect to the input image, takes the sign of that gradient, and adds a small multiple of it to every pixel. The resulting adversarial image is distorted so slightly that human eyes can hardly tell it apart from the original, but the change is enough to mislead a “pixel peeking” machine learning model. Even though the distortion is not visible, the classifier can be more confident in its wrong label for the “fraud image” than it was in the correct label for the original.

(Figure: the panda and gibbon adversarial example. Source: Ian Goodfellow, 2014)
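The FGSM update can be written as x_adv = x + epsilon * sign(gradient of the loss with respect to x). The sketch below applies that rule to a small logistic regression model instead of an image classifier, so that the gradient can be written by hand without a deep learning framework. The weights, the input, and the value of epsilon are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that x belongs to class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(v):
    """Return 1, -1 or 0 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y_true, epsilon):
    """Return x + epsilon * sign(d loss / d x) for the cross-entropy loss."""
    p = predict_proba(weights, bias, x)
    # For logistic regression, d loss / d x_i = (p - y_true) * w_i.
    grad = [(p - y_true) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Illustrative model and input (assumed values, not taken from the paper).
weights = [1.0, -2.0, 3.0, -1.5]
bias = 0.0
x = [0.5, -0.2, 0.1, -0.3]
y_true = 1
epsilon = 0.3

x_adv = fgsm(weights, bias, x, y_true, epsilon)
p_clean = predict_proba(weights, bias, x)
p_adv = predict_proba(weights, bias, x_adv)

print("P(class 1) on the original input:   ", round(p_clean, 3))
print("P(class 1) on the adversarial input:", round(p_adv, 3))
```

No feature moves by more than epsilon, yet every feature moves in the direction that increases the loss, so the small changes add up and push the prediction across the decision boundary. In an image with many thousands of pixels, the same accumulation lets a much smaller epsilon suffice.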

Model Extraction
Model extraction, also called model stealing, attempts to reconstruct a model or to recover the data it was trained on. These attacks are a particular concern for AI systems built on sensitive or confidential data, such as those used in national security or healthcare.

In 2016, Florian Tramèr and his colleagues published research showing that an attacker with access only to a prediction API can “steal” a machine learning model with near-perfect fidelity. The attack worked on model classes including logistic regression, neural networks, and decision trees.
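For the simplest of those model classes, logistic regression, the idea can be sketched in a few lines. The example below is a minimal illustration in the spirit of an equation-solving attack, assuming the API returns the model's full confidence score rather than just a label. The function make_prediction_api is a hypothetical stand-in for a remote service, and the secret parameters are invented for the demonstration.

```python
import math


def make_prediction_api(weights, bias):
    """Hypothetical stand-in for a remote API that returns a confidence score."""
    def predict_proba(x):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))
    return predict_proba


def logit(p):
    """Inverse of the sigmoid function."""
    return math.log(p / (1.0 - p))


def extract_logistic_regression(predict_proba, n_features):
    """Recover weights and bias using n_features + 1 queries."""
    # Querying the all-zero input gives logit(f(0)) = bias.
    bias = logit(predict_proba([0.0] * n_features))
    weights = []
    for i in range(n_features):
        unit = [0.0] * n_features
        unit[i] = 1.0
        # Querying the i-th unit vector gives logit(f(e_i)) = w_i + bias.
        weights.append(logit(predict_proba(unit)) - bias)
    return weights, bias


# The "victim" model; the attacker sees only the api function, not these values.
secret_weights, secret_bias = [0.8, -1.2, 0.5], 0.3
api = make_prediction_api(secret_weights, secret_bias)

stolen_weights, stolen_bias = extract_logistic_regression(api, n_features=3)
print("stolen weights:", [round(w, 6) for w in stolen_weights])
print("stolen bias:   ", round(stolen_bias, 6))
```

The attack needs only one query per feature plus one more, which is why exposing exact confidence scores is risky. Returning only the predicted label, or rounding the scores, makes this particular shortcut harder, although it does not rule out other extraction strategies.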

With attacks on AI systems becoming more and more common, researchers are paying closer attention to protecting machine learning models. Defending against adversarial attacks often forces developers to make trade-offs: model robustness versus accuracy, fast versus confident responses, proactive versus reactive detection, and other variables.
To counter these attacks, researchers have been developing protective mechanisms based on generative adversarial networks (GANs), model retraining, abuse detection, and other state-of-the-art techniques to improve cybersecurity and user privacy.
