How to install DoHlyzer on Ubuntu 22.04 LTS

Posted at 2023-12-20

Introduction

DoHlyzer is a tool for extracting statistical features from packet-captured DoH (DNS-over-HTTPS) traffic.

This article describes what you need to know when installing DoHlyzer on Ubuntu 22.04 LTS.

Update Ubuntu

$ sudo apt update
$ sudo apt upgrade

Check Ubuntu version

$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"

Check Python version

$ python3 -V 
Python 3.10.12

Set up virtual environment in Python

$ sudo apt install -y python3.10-venv
$ python3 -m venv ~/boosting
$ source ~/boosting/bin/activate
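
To confirm that the activated interpreter really is the one inside the virtual environment, a small check like the following may help. This is only a sketch; the helper name in_virtualenv is made up for this example and is not part of DoHlyzer.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the system-wide installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

When run with the environment activated, the prefix printed should end in the environment directory (boosting in this article).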

Update package installer for Python

(boosting)$ pip install -U pip
(boosting)$ pip -V
pip 23.3.2 from /home/username/boosting/lib/python3.10/site-packages/pip (python 3.10)

Clone GitHub repository

(boosting)$ sudo apt install -y git
(boosting)$ git clone https://github.com/ahlashkari/DoHLyzer

Edit requirement list

(boosting)$ cd DoHLyzer
(boosting)$ mv requirements.txt requirements.txt.org 
(boosting)$ vi requirements.txt
requirements.txt
numpy==1.22.4
scipy==1.7.3
scapy==2.4.3
matplotlib==3.5.0
scikit-learn==1.3.2
Keras==2.8.0
tensorflow==2.8.0
ijson==3.2.3

# numpy~=1.18
# scipy~=1.4.1
# scapy~=2.4.3
# matplotlib==3.1.2
# scikit-learn==0.22.1
# Keras==2.3.1
# tensorflow==2.1.0
# ijson==3.0

Create setup file

(boosting)$ vi setup.py
setup.py
#!/usr/bin/env python

from setuptools import setup, find_packages

setup(
    name="dohlyzer",
    description="Set of tools to capture HTTPS traffic, extract statistical and time-series features from it, and analyze them with a focus on detecting and characterizing DoH (DNS-over-HTTPS) traffic.  ",
    long_description=open('README.md').read(),
    long_description_content_type="text/markdown",
    url="https://github.com/ahlashkari/DoHlyzer",
    packages=find_packages(exclude=[]),
    python_requires=">=3.6",
    install_requires=open('requirements.txt').read().split('\n'),
    entry_points={
        "console_scripts": [
            "dohlyzer=meter.dohlyzer:main",
        ]
    },
)

Setup file citation
The setup file is taken from the following site.
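
The setup.py above hands every line of requirements.txt to install_requires, including the blank line and the commented-out original versions. The install step in this article succeeds with that, but a more explicit variant could filter those lines first. The following is a minimal sketch; read_requirements is a hypothetical helper name, not something the DoHLyzer repository provides.

```python
from pathlib import Path


def read_requirements(path: str = "requirements.txt") -> list[str]:
    """Return requirement specifiers, skipping blank lines and '#' comments."""
    requirements = []
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Drop empty lines and lines that are entirely a comment.
        if not line or line.startswith("#"):
            continue
        requirements.append(line)
    return requirements
```

In setup.py this could be used as install_requires=read_requirements(), so that only the eight active pins reach setuptools.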

Prepare flow.patch

(boosting)$ cd ./meter/features/context
(boosting)$ vi flow.patch
flow.patch
--- packet_flow_key.py  2023-12-16 02:19:53.536965986 -0800
+++ packet_flow_key.py.new      2023-12-16 02:58:13.650651329 -0800
@@ -1,7 +1,7 @@
 #!/usr/bin/env python


-from meter.features.context import packet_direction
+from meter.features.context.packet_direction import PacketDirection


 def get_packet_flow_key(packet, direction) -> tuple:
@@ -30,7 +30,7 @@
     else:
         raise Exception('Only TCP protocols are supported.')

-    if direction == packet_direction.FORWARD:
+    if direction == PacketDirection.FORWARD:
         dest_ip = packet['IP'].dst
         src_ip = packet['IP'].src
         src_port = packet[protocol].sport

Apply flow.patch

(boosting)$ patch -b < flow.patch
patching file packet_flow_key.py
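
The patch changes where FORWARD is looked up: on the PacketDirection class instead of on the packet_direction module. The sketch below illustrates the difference with stand-ins. It assumes PacketDirection is an Enum; only FORWARD appears in the patch, so the REVERSE member and the two helper functions are illustrative assumptions.

```python
import types
from enum import Enum, auto


class PacketDirection(Enum):
    # Stand-in for the class the patch imports. REVERSE is an assumed
    # second member, added only so the comparison has something to reject.
    FORWARD = auto()
    REVERSE = auto()


# Stand-in for the module object that the original import statement binds.
packet_direction = types.ModuleType("packet_direction")
packet_direction.PacketDirection = PacketDirection


def is_forward_unpatched(direction) -> bool:
    # Mirrors the original line: FORWARD is looked up on the module itself.
    return direction == packet_direction.FORWARD


def is_forward_patched(direction) -> bool:
    # Mirrors the patched line: FORWARD is looked up on the class.
    return direction == PacketDirection.FORWARD


if __name__ == "__main__":
    print("patched:", is_forward_patched(PacketDirection.FORWARD))
    try:
        is_forward_unpatched(PacketDirection.FORWARD)
    except AttributeError as exc:
        print("unpatched lookup fails:", exc)
```

With these stand-ins the module-level lookup raises AttributeError, because the module only holds the class, not its members.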

Prepare time.patch

(boosting)$ cd ../../features
(boosting)$ vi time.patch
time.patch
--- packet_time.py      2023-12-19 06:53:15.148255511 -0800
+++ packet_time.py.new  2023-12-19 06:54:03.259084221 -0800
@@ -48,7 +48,7 @@
             String of Date and time.

         """
-        time = self.flow.packets[0][0].time
+        time = float(self.flow.packets[0][0].time)
         date_time = datetime.fromtimestamp(time).strftime('%Y-%m-%d %H:%M:%S')
         return date_time

Apply time.patch

(boosting)$ patch -b < time.patch
patching file packet_time.py
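
The patch wraps the packet time in float() before it is passed to datetime.fromtimestamp. The sketch below shows the same conversion in isolation, assuming the capture library reports the time as a Decimal-like value rather than a plain float. The function name format_timestamp and the tz parameter are added for this example only; the patched code uses local time.

```python
from datetime import datetime, timezone
from decimal import Decimal


def format_timestamp(raw_time, tz=None) -> str:
    """Format a capture timestamp in the same layout as the patched line."""
    # float() accepts int, float and Decimal alike, so fromtimestamp()
    # always receives a plain float.
    seconds = float(raw_time)
    return datetime.fromtimestamp(seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    # tz=timezone.utc keeps the demo independent of the local time zone.
    print(format_timestamp(Decimal("1635587062.123456"), tz=timezone.utc))
```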

Install DoHlyzer

(boosting)$ cd ../../../DoHLyzer/
(boosting)$ pip install .
Successfully installed Keras-2.8.0 MarkupSafe-2.1.3 absl-py-2.0.0 astunparse-1.6.3 
cachetools-5.3.2 certifi-2023.11.17 charset-normalizer-3.3.2 dohlyzer-0.0.0 
flatbuffers-23.5.26 gast-0.5.4 google-auth-2.25.2 google-auth-oauthlib-0.4.6 
google-pasta-0.2.0 grpcio-1.60.0 h5py-3.10.0 idna-3.6 ijson-3.2.3 joblib-1.3.2 
keras-preprocessing-1.1.2 libclang-16.0.6 markdown-3.5.1 matplotlib-3.5.0 
numpy-1.22.4 oauthlib-3.2.2 opt-einsum-3.3.0 protobuf-4.25.1 pyasn1-0.5.1 
pyasn1-modules-0.3.0 requests-2.31.0 requests-oauthlib-1.3.1 rsa-4.9 scapy-2.4.3 
scikit-learn-1.3.2 scipy-1.7.3 setuptools-scm-8.0.4 tensorboard-2.8.0 
tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorflow-2.8.0 
tensorflow-io-gcs-filesystem-0.35.0 termcolor-2.4.0 
tf-estimator-nightly-2.8.0.dev2021122109 threadpoolctl-3.2.0 tomli-2.0.1 
typing-extensions-4.9.0 urllib3-2.1.0 werkzeug-3.0.1 wheel-0.42.0 wrapt-1.16.0

Get sample data of packet-captured DoH traffic

(boosting)$ cd meter
(boosting)$ wget https://eprints.lib.hokudai.ac.jp/dspace/bitstream/2115/88092/2/DoH-Tunnel-Traffic-HKD.zip
(boosting)$ unzip DoH-Tunnel-Traffic-HKD.zip
(boosting)$ cp ./DoH-Pcaps/DoH-Pcaps-48h/tuns-48h.pcap .

Sample data details
More details on the sample data can be found at the site below.

Start DoHlyzer

(boosting)$ python dohlyzer.py -f ./tuns-48h.pcap -c ./tuns-48h.csv
reading from file ./tuns-48h.pcap, link-type EN10MB (Ethernet)
Packet count: 262
Garbage Collection Began. Flows = 2
Garbage Collection Finished. Flows = 0
Packet count: 307264
Garbage Collection Began. Flows = 1
Garbage Collection Finished. Flows = 0
Garbage Collection Began. Flows = 2
Garbage Collection Finished. Flows = 0

Waiting time
Processing took about 5 minutes on my computer.

Mark data with a label

(boosting)$ cp -p tuns-48h.csv tuns-48h.csv.org
(boosting)$ sed -i "s/DoH/Label/g" tuns-48h.csv
(boosting)$ sed -i "s/False/tuns/g" tuns-48h.csv
(boosting)$ sed -i "s/True/tuns/g" tuns-48h.csv
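
The sed commands above replace the strings wherever they occur in the file. As an alternative, the following sketch rewrites only the last column, assuming the label is the last field of every row (as in the header shown in the next step). relabel_last_column is a hypothetical helper written for this article, not part of DoHlyzer.

```python
import csv


def relabel_last_column(src_path: str, dst_path: str, label: str,
                        column_name: str = "Label") -> int:
    """Copy a CSV, renaming the last header field and overwriting its values.

    Returns the number of data rows written.
    """
    rows_written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        # Keep Unix line endings instead of the csv module's default "\r\n".
        writer = csv.writer(dst, lineterminator="\n")
        header = next(reader)
        writer.writerow(header[:-1] + [column_name])
        for row in reader:
            if not row:
                continue  # skip blank lines
            writer.writerow(row[:-1] + [label])
            rows_written += 1
    return rows_written
```

For the file in this article the call would be relabel_last_column("tuns-48h.csv.org", "tuns-48h.csv", "tuns"), reading the backup copy and writing the labeled file.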

Confirm results

(boosting)$ head -n 3 tuns-48h.csv
SourceIP,DestinationIP,SourcePort,DestinationPort,TimeStamp,Duration,FlowBytesSent,
FlowSentRate,FlowBytesReceived,FlowReceivedRate,PacketLengthVariance,
PacketLengthStandardDeviation,PacketLengthMean,PacketLengthMedian,PacketLengthMode,
PacketLengthSkewFromMedian,PacketLengthSkewFromMode,PacketLengthCoefficientofVariation,
PacketTimeVariance,PacketTimeStandardDeviation,PacketTimeMean,PacketTimeMedian,
PacketTimeMode,PacketTimeSkewFromMedian,PacketTimeSkewFromMode,
PacketTimeCoefficientofVariation,ResponseTimeTimeVariance,
ResponseTimeTimeStandardDeviation,ResponseTimeTimeMean,ResponseTimeTimeMedian,
ResponseTimeTimeMode,ResponseTimeTimeSkewFromMedian,ResponseTimeTimeSkewFromMode,
ResponseTimeTimeCoefficientofVariation,Label
192.168.11.12,192.168.11.16,35146,443,2021-10-30 09:44:22,0.058030,
1057,18214.71652593486127864897467,5497,94726.86541444080647940720317,
241648.4709141274,491.5775329631404,344.94736842105266,74.0,66,1.653537948253031,
0.5674534528451868,1.4250798178669006,0.0006792408957783933518005540163,
0.02606225039743101984381890440,0.02310989473684210526315789474,0.002850,0.0,
2.332096548980951604474806341,0.8867190816001087,1.127752882226773168550630360,
0.0000171263761875,0.004138402613025948687341905884,0.00247025,0.0001025,
3.8e-05,1.716423138155277661149494295,0.5877267698276386,
1.675297080467948056812835091,tuns
192.168.11.12,192.168.11.16,35148,443,2021-10-30 09:44:22,121.103592,15694,
129.5915318515077570944386191,23173,191.3485770100031384700794011,27973.836644142997,
167.25380905720203,159.94650205761317,74.0,66,1.5416061829997336,0.5617002242710226,
1.0456859443975632,1444.809650456204504801097394,38.01065180256982260949407615,
53.03876174074074074074074074,51.127386,0.056111,0.1508557983168957281846502085,
1.3938895606404382,0.7166579790902743810399707875,0.8197690634662370200108166581,
0.9054109914653328417537853837,0.6178618255813953488372093023,0.000087,
4.5e-05,2.046942763246924571056582828,0.6823606421891454,
1.465393966706649347200615788,tuns

28 statistical features are extracted into the CSV file.

Parameter Feature
F1 Number of flow bytes sent
F2 Rate of flow bytes sent
F3 Number of flow bytes received
F4 Rate of flow bytes received
F5 Variance of Packet Length
F6 Standard Deviation of Packet Length
F7 Mean Packet Length
F8 Median Packet Length
F9 Mode Packet Length
F10 Skew from median Packet Length
F11 Skew from mode Packet Length
F12 Coefficient of Variation of Packet Length
F13 Variance of Packet Time
F14 Standard Deviation of Packet Time
F15 Mean Packet Time
F16 Median Packet Time
F17 Mode Packet Time
F18 Skew from median Packet Time
F19 Skew from mode Packet Time
F20 Coefficient of Variation of Packet Time
F21 Variance of Request/response time difference
F22 Standard Deviation of Request/response time difference
F23 Mean Request/response time difference
F24 Median Request/response time difference
F25 Mode Request/response time difference
F26 Skew from median Request/response time difference
F27 Skew from mode Request/response time difference
F28 Coefficient of Variation of Request/response time difference

Statistical features citation
The statistical features are taken from the site below.
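
The eight statistics that repeat for packet length, packet time and request/response time difference can be sketched as follows. The skew and coefficient-of-variation formulas are inferred from the sample rows above: in the first row, 3 * (mean - median) / std, (mean - mode) / std and std / mean reproduce PacketLengthSkewFromMedian, PacketLengthSkewFromMode and PacketLengthCoefficientofVariation. The use of population variance and the handling of ties in the mode are assumptions, and the function name describe is made up for this example.

```python
import statistics


def describe(values: list[float]) -> dict[str, float]:
    """Compute the eight per-group statistics for one list of measurements."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    mode = statistics.mode(values)  # first most-common value on ties
    variance = statistics.pvariance(values)  # population variance (assumed)
    std = statistics.pstdev(values)
    # Guard the ratios so constant or all-zero input does not divide by zero.
    skew_median = 3 * (mean - median) / std if std else 0.0
    skew_mode = (mean - mode) / std if std else 0.0
    cov = std / mean if mean else 0.0
    return {
        "variance": variance,
        "standard_deviation": std,
        "mean": mean,
        "median": median,
        "mode": mode,
        "skew_from_median": skew_median,
        "skew_from_mode": skew_mode,
        "coefficient_of_variation": cov,
    }


if __name__ == "__main__":
    for name, value in describe([2, 4, 4, 4, 5, 5, 7, 9]).items():
        print(f"{name}: {value}")
```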

Deactivate the virtual environment in Python

(boosting)$ deactivate
$ 

Conclusion

This article explained how to install DoHlyzer on Ubuntu 22.04 LTS.

DoHlyzer is a powerful feature extractor for packet-captured DoH traffic.

I hope DoHlyzer becomes more widely used.
