Deploying a Network Intrusion Prediction API

This example explores how to use the DataScience.com Platform to build a network intrusion detection system with SMS alerts and a reporting front end.

A network intrusion is any use of a network that compromises its stability or security. Many kinds of network activity can qualify as an intrusion. For example:

  • A Denial of Service attack, in which the network is overwhelmed with requests, causing other services to become unavailable.
  • An attacker searching for hidden files for secrets or sensitive data.

By examining the signatures of user behavior on a network, you can try to identify patterns that indicate an attack.

This example will use the 1999 KDD Cup dataset.
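
The alert code later in this example works with five classes (normal, dos, probe, r2l, u2r), while the raw KDD Cup 1999 training labels are individual attack names such as "smurf." or "rootkit.". The sketch below shows one way such a grouping could be done during feature engineering. It is an assumption about the preprocessing, not the method used to build the model in this example; the helper name `to_category` is hypothetical, and the grouping follows the attack categories published with the training data.

```python
# Grouping of KDD Cup 1999 training labels into the five classes
# that the alert system refers to later in this example.
ATTACK_CATEGORIES = {
    "dos": ["back", "land", "neptune", "pod", "smurf", "teardrop"],
    "probe": ["ipsweep", "nmap", "portsweep", "satan"],
    "r2l": ["ftp_write", "guess_passwd", "imap", "multihop",
            "phf", "spy", "warezclient", "warezmaster"],
    "u2r": ["buffer_overflow", "loadmodule", "perl", "rootkit"],
}

# Invert the grouping into a flat label -> category lookup.
LABEL_TO_CATEGORY = {
    label: category
    for category, labels in ATTACK_CATEGORIES.items()
    for label in labels
}
LABEL_TO_CATEGORY["normal"] = "normal"


def to_category(raw_label):
    """Map a raw label such as 'smurf.' to its category (hypothetical helper).

    Raises KeyError for labels that are not in the training set grouping.
    """
    label = raw_label.strip().rstrip(".")
    return LABEL_TO_CATEGORY[label]
```

Labels that appear only in the test set are not covered here and would raise a KeyError, which makes unmapped labels visible instead of silently mislabeling them.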

After some feature engineering, model testing, and evaluation, train a gradient boosting classifier, serialize the model along with some metadata, and save it to the project:

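from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Grid-search a few tree hyperparameters with cross-validation
cv = GridSearchCV(
  GradientBoostingClassifier(),
  {
    'min_samples_split': [2, 4, 8],
    'max_depth': [2, 3, 4],
    # 'auto' (equivalent to 'sqrt' for this classifier) is no longer
    # accepted by recent scikit-learn releases
    'max_features': [None, 'sqrt']
  },
  n_jobs=4
)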

cv.fit(X_train, y_train)

model = {
  'model':cv.best_estimator_,
  'features': X_train.columns.values.tolist()
}

import pickle

with open('my-model', 'wb') as model_path:
  pickle.dump(model, model_path)
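
The alert script later in this example loads the estimator and the feature list from two separate files whose names carry the Python major version (`intruder-model-{}.pkl` and `intruder-model-features-{}.pkl` in a `models` directory). A minimal sketch of saving artifacts in that layout follows. The helper name `save_artifacts` is hypothetical, and the file names are taken from the paths the alert script reads.

```python
import os
import pickle
import sys


def save_artifacts(estimator, feature_names, model_dir):
    """Pickle the estimator and its feature list as two separate files.

    File names are suffixed with the Python major version so that the
    loading script can pick the pickle written by a matching interpreter.
    """
    os.makedirs(model_dir, exist_ok=True)
    major = sys.version_info.major
    model_path = os.path.join(
        model_dir, "intruder-model-{}.pkl".format(major))
    features_path = os.path.join(
        model_dir, "intruder-model-features-{}.pkl".format(major))

    with open(model_path, "wb") as model_file:
        pickle.dump(estimator, model_file)
    with open(features_path, "wb") as features_file:
        pickle.dump(list(feature_names), features_file)

    return model_path, features_path
```

With the grid search above, a call might look like `save_artifacts(cv.best_estimator_, X_train.columns, 'models')`.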

With a model that can predict intrusions, we want to build a system that can:

  • Read connections in real time and alert team members if an attack is detected.
  • Provide a report explaining why the model believes an attack is underway.

This tutorial walks through the alert system: a deployed model that, when it classifies a connection as an intrusion, sends SMS messages to team members with a link to a report about the attack. The alert builds a custom link by encoding the connection's feature values in the URL of the report application. The report application, deployed on Heroku in this example, uses those values to fill a form and generate an explanation. The SMS messages are sent via the Twilio API. Here is the alert system:

# my app, deploy this script
import os
import sys
import gzip
import json
import pickle

import numpy as np
import pandas as pd
import yaml
from sklearn.preprocessing import StandardScaler
from twilio.rest import Client

#read in Twilio credentials from environment variables
account_sid = os.environ['TWILIO_SID']
auth_token = os.environ['TWILIO_TOKEN']
client = Client(account_sid, auth_token)
from_number = os.environ['ALERT_SOURCE_NUMBER']
to_number = os.environ['ALERT_DST_NUMBER']

# collect metadata about runtime to ensure pickle protocol
# is correct, and paths to local dependencies are absolute.
PY_VERSION = sys.version_info.major
APP_PATH = os.path.dirname(os.path.abspath(__file__))

def alert(msg, to_number):
    """Uses the Twilio API to send a message to a phone number"""
    message = client.api.account.messages.create(to=to_number,
                                                 from_=from_number,
                                                 body=msg)
    # return the message so callers can inspect the result
    return message

def scale_data(array):
    """Scales the data for the model"""
    if len(array.shape) == 1:
        return StandardScaler().fit_transform(array[:, np.newaxis])
    else:
        return StandardScaler().fit_transform(array)

def load_config(config_path):
    """Loads a configuration yml file"""
    with open(config_path) as config_file:
        # safe_load avoids constructing arbitrary Python objects
        config = yaml.safe_load(config_file)
    return config

def load_model(model_path):
    """Reads the serialized model"""
    with open(model_path, 'rb') as model_file:
        model = pickle.load(model_file)
    return model

def load_data(data_path):
    return pd.read_csv(data_path)

print("Files in APP", os.listdir(APP_PATH))

#Paths to local dependencies
MODEL_DIR = os.path.join(APP_PATH, 'models')
DATA_DIR = os.path.join(APP_PATH, 'data')
MODEL_PATH = os.path.join(MODEL_DIR, 'intruder-model-{}.pkl'.format(PY_VERSION))
DATA_PATH = os.path.join(DATA_DIR, 'intruder-data.csv')
CONFIG_PATH = os.path.join(APP_PATH, 'config.yml')
STATIC_DIR = os.path.join(APP_PATH, 'static')
INDEX_PAGE_PATH = os.path.join(STATIC_DIR, 'index.html')
FEATURES_PATH = os.path.join(MODEL_DIR, 'intruder-model-features-{}.pkl'.format(PY_VERSION))

# load configuration file
config = load_config(CONFIG_PATH)
# model is loaded from local file
model = load_model(MODEL_PATH)
# load list of features
features = load_model(FEATURES_PATH)

classes = model.classes_.tolist()

class_pretty_names = {
    'normal': "Normal Activity",
    'u2r': "User to Root Attack",
    "r2l":"Remote to Local Attack",
    "probe":"Network Probe",
    "dos":"Denial of Service Attack",
}

# Send initial alert, indicating system is online
message = alert("Intruder alert system now online", to_number)

def get_url_from_row(row):
    """Takes a connection (row of data) and builds a URL to our reporting
    app. The feature values fill a form that the report uses to generate
    an explanation. This app is deployed on Heroku."""

    base = 'https://secure-scrubland-78676.herokuapp.com/individual?'
    args = 'src_bytesname={0}&src_dst_rationame={1}&logged_inname={2}&dst_bytesname={3}&same_srv_ratename={4}&is_flag_S0name={5}&serror_ratename={6}&rerror_ratename={7}&srv_serror_ratename={8}&is_service_FTPname={9}&srv_countname={10}&countname={11}'.format(*row)
    return base + args

def predict(connection):
    """The function to be deployed. The caller may override the number
    that receives SMS messages with an optional 'alertnumber' key. The
    connection (row of data) is sent as a dictionary whose keys are
    feature names and whose values are the feature values."""

    if 'alertnumber' in connection:
        sent_to_number = connection['alertnumber']
    else:
        sent_to_number = to_number

    row = [connection[feature] for feature in features]

    # scikit-learn expects a 2D input, so wrap the single row in a list
    prediction = model.predict([row])[0]

    if prediction == 'normal':
        return {'message': "Normal Activity"}
    else:
        inspect_url = get_url_from_row(list(row))

        alert("{0} detected. To see why, go to {1}".format(class_pretty_names[prediction], inspect_url), sent_to_number)
        return {'message': "{} detected, sending alert.".format(class_pretty_names[prediction])}
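
The report link in the script above is built by positional string formatting, so it depends on the order of the feature list and does not escape the values. As an alternative sketch, the standard library's `urllib.parse.urlencode` can build the same kind of query string from feature names. The helper name `build_report_url` is hypothetical, and the `<feature>name` parameter naming is assumed from the URL template used in `get_url_from_row`.

```python
from urllib.parse import urlencode

# Base URL of the report application, as used in get_url_from_row.
REPORT_BASE = "https://secure-scrubland-78676.herokuapp.com/individual"


def build_report_url(connection, feature_names, base=REPORT_BASE):
    """Build a report link from a connection dictionary (hypothetical helper).

    Each feature becomes a query parameter named '<feature>name', which
    is an assumption based on the existing URL template. urlencode
    escapes the values, so unusual characters cannot break the link.
    """
    params = [(name + "name", connection[name]) for name in feature_names]
    return base + "?" + urlencode(params)
```

Because the parameter names come from the feature list itself, adding or reordering features does not require editing a format string.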

With the intruder alert system built and saved as intruder-predictor.py, deploy it on the DataScience.com Platform (see our articles on Deploying APIs for more detailed instructions):

[Image: deploy-iot-1.png]

The function predict() takes data in JSON format. In the case above, we expect the data to have this particular format:

{"connection": {"src_bytes": 1, "src_dst_ratio": 1, "logged_in": 1, "dst_bytes": 1, "same_srv_rate": 1, "is_flag_S0": 1, "serror_rate": 1, "rerror_rate": 10, "srv_serror_rate": 1, "is_service_FTP": 1, "srv_count": 1, "count": 1000}}
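
Since predict() looks up every feature by name, a payload with a missing key would fail inside the deployed function with a KeyError. The following sketch checks a payload before it is submitted and converts it to a row. The helper name `payload_to_row` is hypothetical, and the feature order in `EXPECTED_FEATURES` is an assumption inferred from the URL template in `get_url_from_row`; the real order comes from the pickled feature list.

```python
import json

# Assumed feature order, inferred from the report URL template.
EXPECTED_FEATURES = [
    "src_bytes", "src_dst_ratio", "logged_in", "dst_bytes",
    "same_srv_rate", "is_flag_S0", "serror_rate", "rerror_rate",
    "srv_serror_rate", "is_service_FTP", "srv_count", "count",
]


def payload_to_row(payload_text, feature_names=EXPECTED_FEATURES):
    """Parse a JSON payload and return feature values in model order.

    Raises ValueError listing any feature missing from the payload.
    """
    payload = json.loads(payload_text)
    connection = payload["connection"]
    missing = [name for name in feature_names if name not in connection]
    if missing:
        raise ValueError("missing features: " + ", ".join(missing))
    return [connection[name] for name in feature_names]
```

Running this check on the client side gives a readable error message before any request reaches the deployed model.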

With the model running, you may submit an example to ensure it’s working properly:

[Image: deploy-iot-2.png]

If the model detects an attack, it sends an SMS with a link to a report explaining the attack.