Network analysis is an important part of malware analysis. Malware very often communicates with so-called command and control servers, from which it receives instructions, exchanges keys, or loads new functionality in the form of payloads. When analyzing unknown malware, a good first step is therefore to find out whether it connects to a server.
In this blog article I will show you how to quickly and easily create a small network analysis tool for TCP connections with Tycho. The goal is to detect when a process connects to a server, find out the address of the server, and report what data is exchanged.
What is Required
For illustrative purposes it is sufficient to work with a standalone configuration. That means I use a local server on the same computer as the client to check the functionality of the tool.
We need:
- A working Tycho setup as described here
- A client and a server, to establish a connection and transmit data. We will intercept this connection in order to analyze it.
Both the client and the server are netcat, a simple cross-platform tool that makes it easy to establish a local TCP connection and transfer data. The client is the process that we are going to analyze with Tycho.
All right, let’s go!
We will start by refreshing some TCP socket basics. Then we do our connection analysis in two parts: part one identifies the destination address and port of a new client connection, and part two finds out what content the client sends.
Socket Basics
Windows uses sockets for communication over a network. Both the client and the server must have created a socket. Using the IP address and the port of the server, the client can connect to the server. After the connection is established, data can be transmitted in both directions.
For a better understanding, I will briefly walk through snippets from a simple C program that show how a client initializes a socket and transmits data.
The snippets are from Windows Dev Center and the complete code can be found here.
First, the data has to be initialized: the message and the receive buffer with its size are defined, and the socket variable is initialized.
{
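WSADATA wsaData;
SOCKET ConnectSocket = INVALID_SOCKET;
struct addrinfo *result = NULL,
                *ptr = NULL,
                hints;
const char *sendbuf = "this is a test";
char recvbuf[DEFAULT_BUFLEN];
int iResult;                     // return value of the Winsock calls below
int recvbuflen = DEFAULT_BUFLEN; // size of the receive buffer, used by recv()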
The next step is to create a socket for communication.
// Create a SOCKET for connecting to server
ConnectSocket = socket(ptr->ai_family, ptr->ai_socktype, ptr->ai_protocol);
// error handling truncated from sample
After the process has created a socket, it uses the API function connect to connect it to the remote side.
// Connect to server.
iResult = connect( ConnectSocket, ptr->ai_addr, (int)ptr->ai_addrlen);
// error handling truncated from sample
Next, the API function send is used to send the previously defined message to the server.
// Send an initial buffer
iResult = send( ConnectSocket, sendbuf, (int)strlen(sendbuf), 0 );
// error handling truncated from sample
printf("Bytes Sent: %ld\n", iResult);
Finally, the client receives the response from the server and the connection is terminated.
// Receive until the peer closes the connection
do {
    iResult = recv(ConnectSocket, recvbuf, recvbuflen, 0);
    if ( iResult > 0 )
        printf("Bytes received: %d\n", iResult);
    else if ( iResult == 0 )
        printf("Connection closed\n");
    // error handling truncated from sample
} while( iResult > 0 );
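To try the same sequence without a Windows build environment, the steps of the C sample (create a socket, connect, send, receive until the peer closes) can be sketched with Python's standard socket module. This is only an illustration and not part of the Tycho script: the echo server, the function names, and the use of an ephemeral loopback port are assumptions made for this example.

```python
import socket
import threading


def run_echo_server(server_sock):
    """Accept one connection and echo everything back until the peer closes."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(512)
            if not data:
                break
            conn.sendall(data)


def send_and_receive(message):
    """Connect to a local echo server, send message, return the echoed bytes."""
    # Server side: bind to an ephemeral port on the loopback interface.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    host, port = server_sock.getsockname()
    worker = threading.Thread(target=run_echo_server, args=(server_sock,))
    worker.start()

    # Client side: socket(), connect(), send(), recv(), as in the C sample.
    received = b""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect((host, port))
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # signal that we are done sending
        while True:
            chunk = client.recv(512)
            if not chunk:  # empty result: the peer closed the connection
                break
            received += chunk

    worker.join()
    server_sock.close()
    return received


if __name__ == "__main__":
    print(send_and_receive(b"this is a test"))
```

In the setup used in this article, netcat plays both roles instead, and the client process is the one observed with Tycho.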
Part one: Get Remote Address and Port
One way to continue would be to set breakpoints on the API functions, but that would be a complicated approach. With Tycho there is a simpler way: API calls have to use system calls internally so that the operating system can talk to the actual hardware, so we can use Tycho’s system call tracking to get the important information. As demonstrated in other blog entries, analyzing system calls is one of Tycho’s strengths.
We build on this Tycho functionality and intercept NtDeviceIoControlFile, which is documented here, to get the IP address and port of the server as well as the transmitted data. NtDeviceIoControlFile is not only used for network connections but also for many other things, e.g. to find out the size of the hard disk, as demonstrated in the blog article Reverse Engineering with Tycho.
The task for which NtDeviceIoControlFile was called can be determined from the IOCTL code. The IOCTL code is an input parameter of this system call, so we can readily retrieve it with Tycho. To find out which IOCTL codes are used for network operations, we can look at the Dr. Memory project, which contains a rich collection of open source system call analysis data. Of particular interest is IOCTL 0x12007, which stands for AFD_CONNECT.
The acronym AFD stands for Ancillary Function Driver. This driver is responsible for the communication with the low-level functions of tcpip.sys, the primary driver for managing network connectivity.
This means that when the system call NtDeviceIoControlFile is called successfully with this IOCTL code, a connection to a server is established.
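Since one system call serves many purposes, a small lookup table keeps the dispatch on the IOCTL code readable. The sketch below uses only the two codes discussed in this article (0x12007 for AFD_CONNECT and 0x1201F for AFD_SEND, the latter being used in part two). The table and function names are hypothetical helpers and not part of Tycho.

```python
# IOCTL codes as given in this article; the names of the dictionary and the
# helper functions are hypothetical and only serve this illustration.
AFD_IOCTLS = {
    0x12007: "AFD_CONNECT",
    0x1201F: "AFD_SEND",
}


def describe_ioctl(code):
    """Return the AFD operation name for an IOCTL code, or None if unknown."""
    return AFD_IOCTLS.get(code)


def is_network_event(syscall_name, code):
    """True if the system call is one of the AFD operations we care about."""
    return syscall_name == "NtDeviceIoControlFile" and code in AFD_IOCTLS


if __name__ == "__main__":
    for code in (0x12007, 0x1201F, 0x70000):
        print(hex(code), describe_ioctl(code))
```

Any other IOCTL code, such as one issued for a disk query, simply yields None and can be ignored by the script.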
Now I will show code snippets that demonstrate how to use Tycho to read the server address and the transmitted data, using only information from NtDeviceIoControlFile.
Important: the following code is not complete. It lacks the initialization steps for Tycho as well as exception handling, and is only intended as an illustrative example for this blog post. The complete script can be found as a Tycho recipe in the tycho-recipes repository.
First we set a system call breakpoint on NtDeviceIoControlFile and check for the IOCTL code of AFD_CONNECT. This way we already know when the client establishes a connection.
if name == "NtDeviceIoControlFile":
    ioctl = in_params["IoControlCode"]["value"]
    if ioctl == 0x12007:  # IOCTL for AFD_CONNECT
        print("Client connects to server")
The next task is to find out the IP address and port of the server. The GitHub repository of the Dr. Memory project also tells us that the input buffer of NtDeviceIoControlFile is an AFD_CONNECT_INFO structure in this case, which is described here.
This structure is defined as follows:
typedef struct _AFD_CONNECT_INFO {
BOOLEAN UseSAN;
ULONG Root;
ULONG Unknown;
SOCKADDR RemoteAddress;
} AFD_CONNECT_INFO , *PAFD_CONNECT_INFO ;
Looking at this structure, RemoteAddress obviously sounds very interesting, so we have to find out the layout behind SOCKADDR. For an IPv4 connection this is a sockaddr_in structure. Its definition is available directly from Windows Dev Center and looks like this:
typedef struct sockaddr_in {
short sin_family;
USHORT sin_port;
IN_ADDR sin_addr;
CHAR sin_zero[8];
} SOCKADDR_IN, *PSOCKADDR_IN;
Now we have found the information we were looking for and we can extend the code example from above:
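if ioctl == 0x12007:  # IOCTL for AFD_CONNECT
    # Requires "import struct" and "import ipaddress" at the top of the script.
    input_buffer = in_params["InputBuffer"]["value"]
    input_buffer_length = in_params["InputBufferLength"]["value"]
    # Read the whole AFD_CONNECT_INFO structure from the target's memory
    afd_connect_info = process.read_linear(input_buffer, input_buffer_length)
    # RemoteAddress (a sockaddr_in) starts at offset 24
    sockaddr_member = afd_connect_info[24:input_buffer_length]
    # sin_addr (4 bytes) and sin_port (2 bytes) are stored in network byte order
    ip = ipaddress.ip_address(struct.unpack("!L", sockaddr_member[4:8])[0])
    port = struct.unpack("!H", sockaddr_member[2:4])[0]
    print("Client connects to server {}:{}".format(ip, port))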
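The slicing can be checked offline, without a Tycho setup, by building a synthetic buffer with the same layout and parsing it again. The following is a minimal sketch under the assumptions stated in this article (RemoteAddress at offset 24, an IPv4 sockaddr_in behind it). The function names and the zero-filled 24-byte header are made up for the test and do not reflect real contents of AFD_CONNECT_INFO.

```python
import ipaddress
import struct

REMOTE_ADDRESS_OFFSET = 24  # offset of RemoteAddress, as used in the article


def parse_afd_connect_info(buffer):
    """Extract (family, ip, port) from a raw AFD_CONNECT_INFO input buffer."""
    sockaddr = buffer[REMOTE_ADDRESS_OFFSET:]
    # sin_family is a plain short in host byte order (little-endian on x86).
    family = struct.unpack_from("<H", sockaddr, 0)[0]
    # sin_port and sin_addr are stored in network byte order (big-endian).
    port = struct.unpack_from("!H", sockaddr, 2)[0]
    ip = ipaddress.ip_address(sockaddr[4:8])
    return family, str(ip), port


def make_sample_buffer(ip, port):
    """Build a synthetic buffer: zeroed header followed by a sockaddr_in."""
    header = bytes(REMOTE_ADDRESS_OFFSET)  # placeholder for the first fields
    sockaddr = (
        struct.pack("<H", 2)               # sin_family = AF_INET (2)
        + struct.pack("!H", port)          # sin_port
        + ipaddress.ip_address(ip).packed  # sin_addr
        + bytes(8)                         # sin_zero
    )
    return header + sockaddr


if __name__ == "__main__":
    print(parse_afd_connect_info(make_sample_buffer("127.0.0.1", 4444)))
```

Checking the family value as well makes it easy to skip buffers that do not describe an IPv4 connection.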
Summary Part One
The data structure AFD_CONNECT_INFO can be found in the input buffer of NtDeviceIoControlFile. There, we have to look at offset 24, where RemoteAddress starts. It is important to remember that we are dealing with network data, meaning the data we read is big-endian. You can see this in the struct.unpack calls: normally the format string starts with <, which indicates that the data is little-endian, but here it starts with !, which means that the data is read in network byte order (big-endian). For more information about the struct.unpack function, refer to the Python documentation.
Counted from the start of RemoteAddress, we find the port, called sin_port (2 bytes long), at offset 2 and the IP address, called sin_addr (4 bytes long), at offset 4.
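The effect of the byte order prefix is easy to see on two raw bytes. The short example below decodes the same bytes once in network byte order and once as little-endian. The port value 4444 is just an arbitrary example.

```python
import struct

# The two bytes of sin_port as they would appear in memory for port 4444.
raw_port = bytes([0x11, 0x5C])

as_network_order = struct.unpack("!H", raw_port)[0]  # big-endian: 0x115C
as_little_endian = struct.unpack("<H", raw_port)[0]  # little-endian: 0x5C11

print(as_network_order)  # 4444
print(as_little_endian)  # 23569
```

Reading the port with the wrong prefix therefore does not fail, it silently yields a different number, which is why the prefix deserves attention.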
So part one is done. Tycho allows us to find out with very little effort to which server a process has connected.
Part Two: Get Transmit Data
What you might be interested in next is the data being sent to the server.
Finding the point at which a transmission starts is very simple, as the NtDeviceIoControlFile system call is used again. This time the IOCTL code is 0x1201F, which stands for AFD_SEND and is responsible for sending data to the server. You can check the definition of AFD_SEND at Dr. Memory again.
This time the input buffer of NtDeviceIoControlFile has different content: it holds an AFD_SEND_INFO struct. You can find a matching definition of this structure in the Dr. Memory project source.
It looks like this:
typedef struct _AFD_SEND_INFO {
PAFD_WSABUF BufferArray;
ULONG BufferCount;
ULONG AfdFlags;
ULONG TdiFlags;
} AFD_SEND_INFO , *PAFD_SEND_INFO ;
At this point we need to take a closer look at PAFD_WSABUF, because the structure it points to contains the number of transmitted bytes and a pointer to the data in memory. The documentation can be found here.
The WSABUF struct is quite simple:
typedef struct _WSABUF {
ULONG len;
CHAR *buf;
} WSABUF, *LPWSABUF;
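On a 64-bit target the pointer member is aligned to 8 bytes, so the 4-byte len field is followed by 4 bytes of padding. The layout can be modeled with ctypes as a quick check. This is a sketch: the class name is made up, the pointer is modeled as a 64-bit integer, and the result assumes a 64-bit Python, where 64-bit integers are 8-byte aligned.

```python
import ctypes


class WSABUF64(ctypes.Structure):
    """Model of WSABUF on a 64-bit target (hypothetical name for this sketch)."""
    _fields_ = [
        ("len", ctypes.c_uint32),  # ULONG: 4 bytes
        ("buf", ctypes.c_uint64),  # CHAR *: 8-byte pointer, modeled as integer
    ]


if __name__ == "__main__":
    print("len: offset", WSABUF64.len.offset, "size", WSABUF64.len.size)
    print("buf: offset", WSABUF64.buf.offset, "size", WSABUF64.buf.size)
    print("total size:", ctypes.sizeof(WSABUF64))
```

This explains why the data pointer is found at offset 8 rather than directly after the length at offset 4.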
With this information, we can extend the script to get the transmitted data. First we get the input buffer (AFD_SEND_INFO):
if ioctl == 0x1201F:  # IOCTL for AFD_SEND
    input_buffer = in_params["InputBuffer"]["value"]
    input_buffer_length = in_params["InputBufferLength"]["value"]
    afd_send_info = process.read_linear(input_buffer, input_buffer_length)
Then we read 8 bytes at offset 0 of AFD_SEND_INFO to get the PAFD_WSABUF pointer.
pafd_wsabuf = struct.unpack_from("<Q", afd_send_info, 0)[0]
Finally, we only have to read the WSABUF structure that PAFD_WSABUF points to: the size of the transmitted data is at offset 0, and the pointer to the data is at offset 8.
# len is a 4-byte ULONG; on a 64-bit target it is followed by 4 padding bytes
buf_len = struct.unpack("<L", process.read_linear(pafd_wsabuf, 4))[0]
buf_pointer = struct.unpack("<Q", process.read_linear(pafd_wsabuf + 8, 8))[0]
print("process sent: {}".format(process.read_linear(buf_pointer, buf_len)))
Summary Part Two
The input buffer is an AFD_SEND_INFO structure this time. At offset 0 it contains the PAFD_WSABUF pointer to a WSABUF structure. In that structure we find the number of bytes that are sent at offset 0 and the pointer to the data at offset 8.
That is all there is to it: we have now created a network monitoring script for a process.
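The pointer chain can be exercised without a target system by faking the memory reads. In the sketch below, FakeProcess is a hypothetical stand-in that offers only a read_linear(address, size) method like the one used in the snippets above. The addresses, the helper names, and the limitation to the first buffer of the array are assumptions of this example. The length is read as a 4-byte ULONG and a 64-bit target is assumed.

```python
import struct


class FakeProcess:
    """Hypothetical stand-in exposing only read_linear(address, size)."""

    def __init__(self):
        self._regions = {}  # start address -> bytes stored there

    def write(self, address, data):
        self._regions[address] = bytes(data)

    def read_linear(self, address, size):
        for start, data in self._regions.items():
            if start <= address and address + size <= start + len(data):
                offset = address - start
                return data[offset:offset + size]
        raise ValueError("unmapped address 0x{:x}".format(address))


def read_sent_data(process, input_buffer, input_buffer_length):
    """Follow AFD_SEND_INFO -> WSABUF -> data and return the sent bytes."""
    afd_send_info = process.read_linear(input_buffer, input_buffer_length)
    pafd_wsabuf = struct.unpack_from("<Q", afd_send_info, 0)[0]
    buf_len = struct.unpack("<L", process.read_linear(pafd_wsabuf, 4))[0]
    buf_pointer = struct.unpack("<Q", process.read_linear(pafd_wsabuf + 8, 8))[0]
    return process.read_linear(buf_pointer, buf_len)


def build_example(payload=b"this is a test"):
    """Lay out payload, WSABUF and AFD_SEND_INFO at made-up addresses."""
    process = FakeProcess()
    process.write(0x3000, payload)
    # WSABUF: len (4 bytes), 4 padding bytes, buf pointer (8 bytes)
    process.write(0x2000, struct.pack("<L4xQ", len(payload), 0x3000))
    # AFD_SEND_INFO: BufferArray pointer, BufferCount, AfdFlags, TdiFlags
    process.write(0x1000, struct.pack("<QLLL4x", 0x2000, 1, 0, 0))
    return process, 0x1000, 24


if __name__ == "__main__":
    fake, address, length = build_example()
    print(read_sent_data(fake, address, length))
```

A send call that passes several buffers would require iterating over BufferCount entries of the array, which this sketch leaves out.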
Demo
Here is a short demo on YouTube that shows the script in action. It shows a typical Tycho setup with the analyst’s system (left), on which the script is running, and the target system (right), on which the monitored processes run.
Summary
With Tycho it is easy to analyze TCP connections. To get information such as the remote address, port, and transmitted data, you don’t need to deal with different API calls in a complicated way; you only need one system call, NtDeviceIoControlFile. In order to understand the system call semantics, we studied structure definitions that are openly available in the Microsoft docs and in open source projects. In the end, we only had to convert the collected knowledge into a Python script. And because the Pytycho library abstracts all complicated operating system details for us, this was a very simple task.
But this is only the beginning. With the knowledge gained and this script as a basis, other functions could be implemented. For example, it would be possible to actively manipulate the transmitted data or to integrate block- and allow-lists for defined addresses/ports to log specific information.
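As a rough idea of how such a filter could look, the following sketch decides whether a detected connection should be logged, based on a block list of networks and an optional allow list of ports. All names are hypothetical and the policy is just one possible choice. It uses only Python's standard ipaddress module and no Tycho functionality.

```python
import ipaddress


def make_filter(blocked_networks=(), allowed_ports=None):
    """Return a function that decides whether a connection is logged.

    blocked_networks: networks in CIDR notation that are never logged.
    allowed_ports: if given, only connections to these ports are logged.
    """
    networks = [ipaddress.ip_network(n) for n in blocked_networks]

    def should_log(ip, port):
        address = ipaddress.ip_address(ip)
        if any(address in network for network in networks):
            return False
        if allowed_ports is not None and port not in allowed_ports:
            return False
        return True

    return should_log


if __name__ == "__main__":
    should_log = make_filter(["127.0.0.0/8"], allowed_ports={80, 443})
    print(should_log("127.0.0.1", 443))     # loopback is on the block list
    print(should_log("93.184.216.34", 443))
    print(should_log("93.184.216.34", 22))  # port is not on the allow list
```

The returned function could be called with the address and port extracted in part one before anything is printed.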
Source: https://www.cyberus-technology.de/posts/2020-08-28-network-analysis-with-tycho.html