
\chapter{Technical Background}
\label{cha:technical_background}

\input{content/Technical_Background/DNS/DNS}

\section{Machine Learning}
\label{sec:machine_learning}

Machine learning is a broad field of computer science that aims to give computers the ability to learn without being explicitly programmed for a specific task. Many different approaches exist, each with advantages and disadvantages in different areas. In this work, machine learning is mostly limited to decision tree learning, an approach modelled on how humans make decisions: given a set of attributes, a human can decide, e.g., whether to buy one product or another. Machine learning algorithms use a process called training to build a model, which can later be used to make decisions.

A decision tree consists of three components: a node represents a test on a certain attribute that splits the tree; leaves are terminal nodes and represent the prediction (the class or label) for the path from the root node to the leaf; and edges correspond to the outcomes of a test and connect a node to the next node or leaf.

Training starts from a dataset of arbitrary size (the training set) in which every sample has a fixed number of features (attributes) and is assigned a label. The number of distinct labels is arbitrary but finite; a binary classification has two labels (e.g. malicious or benign in the case of domains). In the first step, the whole training set is examined, and whenever a single attribute separates a set of samples with respect to their assigned labels, the tree branches on that attribute and new child nodes are created. Each branch is then split into more fine-grained subtrees as long as a further split yields an \textit{information gain}, i.e. as long as it makes the resulting subsets purer with respect to their labels. Splitting stops once all samples of a subset belong to the same class, i.e. are assigned the same label, at which point the subset becomes a leaf. The model can later be queried with an unlabeled data sample and returns the probability with which the sample belongs to each class/label.
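
The notion of information gain used above can be made concrete with a short worked equation. This is a sketch only: it assumes Shannon entropy as the impurity measure, which is one common choice and is not prescribed by the text, and the symbols $S$, $A$, $S_v$ and $p_c$ are introduced here purely for illustration.
\[
  H(S) = -\sum_{c} p_c \log_2 p_c ,
  \qquad
  \mathit{IG}(S, A) = H(S) - \sum_{v} \frac{|S_v|}{|S|} \, H(S_v)
\]
Here $S$ is the set of training samples that reach a node, $p_c$ is the fraction of samples in $S$ that carry label $c$, and $S_v$ is the subset of $S$ for which the test on attribute $A$ has outcome $v$.

As a hypothetical example, consider a node holding ten domains, five labelled malicious and five labelled benign, so that $H(S) = 1$. Suppose a test on some binary attribute splits them into two subsets of five samples each, one containing four malicious and one benign domain, the other one malicious and four benign domains. Each subset then has entropy $-(0.8 \log_2 0.8 + 0.2 \log_2 0.2) \approx 0.722$, which gives
\[
  \mathit{IG}(S, A) \approx 1 - \left( \frac{5}{10} \cdot 0.722 + \frac{5}{10} \cdot 0.722 \right) = 0.278 .
\]
Since the gain is positive, the split is worthwhile under this criterion. A subset whose samples all share one label has entropy $0$, so no further test can yield a gain and the subset becomes a leaf.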

This way, a labeled training set of limited size suffices to classify unlabeled data: the model learns the characteristics of the labeled training samples and applies them to new, unlabeled samples.


%\input{content/Technical_Background/Detecting_Malicious_Domain_Names/Detecting_Malicious_Domain_Names}
\input{content/Technical_Background/Benchmarks/Benchmarks}