Abhishek Anand
6 min readJun 6, 2021

--

source: google

CONFUSION MATRIX AND CYBER SECURITY

Confusion matrix is a table which tell us about the performance of our classification model on the test data for which the true result is known.

source: google

Definition of the Terms:

True Positive: You predicted positive and it’s true.

True Negative: You predicted negative and it’s true.

False Positive: You predicted positive and it’s false.

False Negative: You predicted negative and it’s false.

Let’s start with an example confusion matrix for a binary classifier :

What can we learn from this matrix?

  • There are two possible predicted classes: “yes” and “no”. If we were predicting the presence of a disease, for example, “yes” would mean they have the disease, and “no” would mean they don’t have the disease.
  • The classifier made a total of 165 predictions (e.g., 165 patients were being tested for the presence of that disease).
  • Out of those 165 cases, the classifier predicted “yes” 110 times, and “no” 55 times.
  • In reality, 105 patients in the sample have the disease, and 60 patients do not.

This is a list of rates that are often computed from a confusion matrix for a binary classifier:

  • Accuracy: Overall, how often is the classifier correct?
  • (TP+TN)/total = (100+50)/165 = 0.91

— — — — — — — — — — — — — — — — — — — — — — — — — -

  • Misclassification Rate: Overall, how often is it wrong?
  • (FP+FN)/total = (10+5)/165 = 0.09
  • equivalent to 1 minus Accuracy
  • also known as “Error Rate”

— — — — — — — — — — — — — — — — — — — — — — — — — — -

  • True Positive Rate: When it’s actually yes, how often does it predict yes?
  • TP/actual yes = 100/105 = 0.95
  • also known as “Sensitivity” or “Recall”

— — — — — — — — — — — — — — — — — — — — — — — — — — -

  • False Positive Rate: When it’s actually no, how often does it predict yes?
  • FP/actual no = 10/60 = 0.17

— — — — — — — — — — — — — — — — — — — — — — — — — —

  • True Negative Rate: When it’s actually no, how often does it predict no?
  • TN/actual no = 50/60 = 0.83
  • equivalent to 1 minus False Positive Rate
  • also known as “Specificity”

— — — — — — — — — — — — — — — — — — — — — — — — — — -

  • Precision: When it predicts yes, how often is it correct?
  • TP/predicted yes = 100/110 = 0.91

— — — — — — — — — — — — — — — — — — — — — — — — — — -

  • Prevalence: How often does the yes condition actually occur in our sample?
  • actual yes/total = 105/165 = 0.64

Need for Confusion Matrix in Machine learning

  • It evaluates the performance of the classification models, when they make predictions on test data, and tells how good our classification model is.
  • It not only tells the error made by the classifiers but also the type of errors such as it is either type-I or type-II error.
  • With the help of the confusion matrix, we can calculate the different parameters for the model, such as accuracy, precision, etc.

What is Cyber Security?

Cybersecurity is the practice of protecting systems, networks, and programs from digital attacks, here digital attacks can be stealing information or spying on another system, and much more. There are cybersecurity experts present whose work is to protect users or prevent the digital attack, here digital attack is also known as cybercrime.

Cyber Attack Detection and Classification using Parallel Support Vector Machine

Support Vector Machines (SVM) are the classifiers that were originally designed for binary c1assification. The c1assificatioin applications can solve multi-class problems. The result shows that pSVM gives more detection accuracy for classes and comparable to the false alarm rate.

Cyberattack detection is a classification problem, in which we classify the normal pattern from the abnormal pattern (attack) of the system.

The SDF is a very powerful and popular data mining algorithm for decision-making and classification problems. It has been using in many real-life applications like medical diagnosis, radar signal classification, weather prediction, credit approval, and fraud detection, etc.

A parallel Support Vector Machine (pSVM) algorithm was proposed for the detection and classification of cyber attack datasets.

The performance of the support vector machine is greatly dependent on the kernel function used by SVM. Therefore, we modified the Gaussian kernel function in a data-dependent way in order to improve the efficiency of the classifiers. The relative results of both the classifiers are also obtained to ascertain the theoretical aspects. The analysis is also taken up to show that PSVM performs better than SDF.

The classification accuracy of PSVM remarkably improve (accuracy for Normal class as well as DOS class is almost 100%) and comparable to false alarm rate and training, testing times.

KDD CUP ‘’99 Data Set Description

This data set is prepared by Stolfo et al and is built based on the data captured in the DARPA’98 IDS evaluation program . DARPA’98 is about 4 gigabytes of compressed raw (binary) TCP dump data of 7 weeks of network traffic, which can be processed into about 5 million connection records, each with about 100 bytes.

For each TCP/IP connection, 41 various quantitative (continuous data type) and qualitative (discrete data type) features were extracted among the 41 features, 34 features (numeric), and 7 features (symbolic).

To analysis the different results, there are standard metrics that have been developed for evaluating network intrusion detections. Detection Rate (DR) and false alarm rate are the two most famous metrics that have already been used. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while the false alarm (false positive) rate is computed as the ratio between the number of normal connections that is incorrectly misclassified as attacks and the total number of normal connections.

In parallel SVM machine first we reduced nonclassified features data by distance matrix of binary pattern. From this concept, the cascade structure is developed by initializing the problem with a number of independent smaller optimizations and the partial results are combined in later stages in a hierarchical way, as shown in figure 1, supposing the training data subsets and are independent among each other.

  • True Positive (TP): The amount of attack detected when it is actually attack.
  • True Negative (TN): The amount of normal detected when it is actually normal.
  • False Positive (FP): The amount of attack detected when it is actually normal (False alarm).
  • False Negative (FN): The amount of normal detected when it is actually attack.

THANK YOU!

SOURCE: https://www.ee.ryerson.ca , wikipedia

--

--