What Are Adversarial Attacks?

Adversarial attacks use malicious inputs designed to fool an AI model. In other words, the attacker creates an illusion that makes the model misinterpret what it is given. The technique is used for various purposes.


There are different kinds of adversarial attack. The most common one targets image recognition models: the attacker adds carefully crafted noise to an image, the model becomes confused when it tries to recognize the image, and it returns a false result.

In this example, we have two images of a panda. The model was trained to recognize the image, but after noise of 0.07 is added, the model fails to recognize it and declares it a gibbon. This example shows how quickly an ML model can misbehave when it faces an adversarial attack.


The main purpose of an adversarial attack is to fool the model by giving it false inputs. Attackers can target your model for different reasons: some just want to fool the model, while others want to damage the systems behind it, for example by removing your database tables.

How Does It Work?

A machine learning model accepts its input as numeric vectors. The attacker changes those numbers slightly and feeds the false input to the model to get a false result. This is how the attack affects our model.
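
To make this concrete, here is a minimal sketch in Python. It is not a real image model: the four weights, the four "pixel" values, and the noise are made-up numbers chosen for illustration, and the `predict` function is a hypothetical stand-in for a trained classifier. It only shows that an input is a list of numbers and that a small change to each number can flip the result.

```python
# Toy classifier: every number here is made up for illustration.
# An "image" is just a list of pixel intensities between 0 and 1.
WEIGHTS = [0.5, -0.4, 0.8, -0.7]

def predict(pixels):
    # Weighted sum of the pixels; a positive score means "panda".
    score = sum(w * p for w, p in zip(WEIGHTS, pixels))
    return "panda" if score > 0 else "gibbon"

clean = [0.50, 0.50, 0.50, 0.50]

# Hand-picked noise: each pixel moves by at most 0.07.
noise = [-0.07, 0.07, -0.07, 0.07]
adversarial = [p + n for p, n in zip(clean, noise)]

clean_label = predict(clean)
adversarial_label = predict(adversarial)
largest_change = max(abs(a - c) for a, c in zip(adversarial, clean))

print(clean_label, adversarial_label)
```

The clean input scores 0.1 and is labelled "panda". After the noise is added the score drops to about -0.068, so the label becomes "gibbon", even though no pixel moved by more than 0.07.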

Types of Adversarial Attacks

Adversarial attacks come in different types.

1. Black Box Attack

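A black box attack is a dangerous type of attack. In machine learning, it means the attacker cannot see inside the model: they do not know its architecture or parameters and can only send inputs and observe the outputs. The same name is also used in cyber security for crimes such as robbing a bank's cash machine, where attackers connect to the machine to gain access to its internal infrastructure.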
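
As a rough sketch of the machine learning sense of the term, the Python below attacks a model using only its answers. The hidden weights, the `make_hidden_model` helper, and the input values are hypothetical and made up for illustration. The attacker code never reads the weights; it only calls `query` and checks whether the label changed.

```python
from itertools import product

def make_hidden_model():
    # Stand-in for a remote model. The attacker never reads these weights;
    # they are made-up numbers for illustration.
    weights = [0.5, -0.4, 0.8, -0.7]

    def query(pixels):
        score = sum(w * p for w, p in zip(weights, pixels))
        return "panda" if score > 0 else "gibbon"

    return query

def black_box_attack(query, pixels, epsilon):
    # Uses only query(): try every +/- epsilon pattern and return the first
    # perturbed input whose label differs from the original one.
    original = query(pixels)
    queries = 0
    for signs in product((-1, 1), repeat=len(pixels)):
        candidate = [p + epsilon * s for p, s in zip(pixels, signs)]
        queries += 1
        if query(candidate) != original:
            return candidate, queries
    return None, queries

query = make_hidden_model()
clean = [0.50, 0.50, 0.50, 0.50]
adversarial, queries_used = black_box_attack(query, clean, 0.07)
```

Trying every pattern only works here because the input has four values. A real image has far too many pixels for that, so practical black box attacks need smarter ways to choose which inputs to try.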

2. White Box Attack

A white box attack is different from a black box attack. In this attack, the attacker has access to the underlying training policy network of the target model. Even a small disturbance to the training policy can drastically affect your model.
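
The sketch below illustrates why inside access helps the attacker. It assumes, for illustration only, that "access" means the attacker can read the weights of a tiny linear model. The weights, the input, and the function names are hypothetical. Because the attacker knows the weights, they can compute the most damaging small change directly instead of guessing.

```python
def sign(v):
    # Returns 1, 0 or -1 depending on the sign of v.
    return (v > 0) - (v < 0)

def predict(weights, pixels):
    # A positive weighted sum means "panda", otherwise "gibbon".
    score = sum(w * p for w, p in zip(weights, pixels))
    return "panda" if score > 0 else "gibbon"

def white_box_attack(weights, pixels, epsilon):
    # For this linear model the gradient of the score with respect to each
    # pixel is simply its weight, so the attacker reads it off directly and
    # moves every pixel by epsilon in the direction that pushes the score
    # toward the other label.
    direction = -1 if predict(weights, pixels) == "panda" else 1
    return [p + direction * epsilon * sign(w) for w, p in zip(weights, pixels)]

# Made-up numbers for illustration.
weights = [0.5, -0.4, 0.8, -0.7]
clean = [0.50, 0.50, 0.50, 0.50]

adversarial = white_box_attack(weights, clean, 0.07)
clean_label = predict(weights, clean)
adversarial_label = predict(weights, adversarial)
```

With these numbers the label changes from "panda" to "gibbon" in a single step, with no trial and error, which is what makes white box access so powerful.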