more writing, added benchmark to compare log files

This commit is contained in:
2017-11-30 18:48:23 +01:00
parent c1237406f8
commit 673464137e
8 changed files with 68 additions and 10 deletions

View File

@@ -1,14 +1,7 @@
\chapter{Introduction}
\label{cha:Introduction}
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum.
\lstinputlisting[language={java}, label=lst:sendImpliciteIntent,caption=Intent - Bild anzeigen]{res/src/sendImpliciteIntent.java}
Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet doming id quod mazim placerat facer possim assum. Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.
The domain name system (\gls{dns}) has been one of the corner stones of the internet for a long time. It acts as a hierarchical, bidirectional translation device between mnemonic domain names and network addresses. It also provides service lookup or enrichment capabilities for a range of application protocols like HTTP, SMTP, and SSH. In the context of defensive IT security, investigating aspects of the \gls{dns} can facilitate protection efforts tremendously. Estimating the reputation of domains can help in identifying hostile activities. Such a score can, for example, consider features like quickly changing network blocks for a given domain or clustering of already known malicious domains and newly observed ones.
\section{Motivation}
@@ -18,11 +11,21 @@ Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet doming
\section{Challenges}
\label{sec:challenges}
All of the investigated approaches are using \gls{pdns} logs to generate a reputation score for a specific domain. These logs are generated on central \gls{dns} resolvers and capture outgoing traffic of multiple users (see section~\ref{subsec:passive_dns}), one challenge of this work is handling huge volumes of data. With about seven Gigabytes \todo{verify} of uncompressed \gls{pdns} logs for a single day, various general issues might occur: General purpose computers nowadays usually have up to 16 Gigabytes of RAM (rarely 32 GB) which concludes that multiple tasks (i.e. building a training set) may not be performed purely in-memory. The time of analysis might also become a bottleneck. Simply loading one single day (see benchmark example~\ref{lst:load_and_iterate_one_day_of_compressed_pdns_logs}) of (compressed) logs from disk and iterating it without actual calculations takes roughly 148 seconds. To evaluate existing algorithms certain requirements have to be met. Passive DNS logs usually contain sensitive data which is one reason why most papers do not publish test data. For a precise evaluation the raw input data is needed. Some previously developed classifications have not completely disclosed the involved algorithms so these have to be reconstructed as close as possible taking all available information into account.
\section{Goals}
\label{sec:goals}
The task of this work is to evaluate existing scoring mechanisms of domains in the special context of IT security, and also research the potential for combining different measurement approaches. It ultimately shall come up with an improved and evaluated algorithm for determining the probability of a domain being related to hostile activities.
\section{Related Work}
\label{sec:related_work}
\todo{machine learning vs others}
\lstinputlisting[language={java}, label=lst:sendImpliciteIntent,caption=Intent - Bild anzeigen]{res/src/sendImpliciteIntent.java}