macOS MIG - Mach Interface Generator

Basic Information

MIG was created to simplify the process of writing Mach IPC code. Given an interface definition, it generates the code that the server and the client need to communicate. Even if the generated code is ugly, a developer only needs to import it, and their own code will be much simpler than it would be otherwise.

The definition is specified in Interface Definition Language (IDL) using the .defs extension.

These definitions have 5 sections:

  • Subsystem declaration: The keyword subsystem is used to indicate the name and the id. It's also possible to mark it as KernelServer if the server should run in the kernel.
  • Inclusions and imports: MIG uses the C preprocessor, so it's able to use imports. Moreover, it's possible to use uimport and simport for user- or server-generated code.
  • Type declarations: It's possible to define data types, although usually it will import mach_types.defs and std_types.defs. For custom ones some syntax can be used:
    • [in/out]tran: Function used to translate the type from an incoming message or to an outgoing one
    • c[user/server]type: Mapping to another C type.
    • destructor: Call this function when the type is released.
  • Operations: These are the definitions of the RPC methods. There are 5 different types:
    • routine: Expects reply
    • simpleroutine: Doesn't expect reply
    • procedure: Expects reply
    • simpleprocedure: Doesn't expect reply
    • function: Expects reply

Example

Create a definition file, in this case with a very simple function:

myipc.defs
subsystem myipc 500; // Arbitrary name and id

userprefix USERPREF;        // Prefix for created functions in the client
serverprefix SERVERPREF;    // Prefix for created functions in the server

#include <mach/mach_types.defs>
#include <mach/std_types.defs>

simpleroutine Subtract(
    server_port :  mach_port_t;
    n1          :  uint32_t;
    n2          :  uint32_t);

Note that the first argument is the port to bind, and MIG automatically handles the reply port (unless mig_get_reply_port() is called in the client code). Moreover, operation IDs are sequential, starting from the indicated subsystem ID, so when an operation is deprecated it is deleted and skip is used in its place to keep its ID reserved.
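The sequential numbering can be sketched in plain C. The helper below is hypothetical (it is not part of MIG or of its generated code) and only models the rule described above: every entry in the .defs file, including a skip, consumes one ID counted from the subsystem base.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the operation list of a .defs file.
 * A NULL entry stands for a `skip` statement.
 * Returns the message ID assigned to `name`, or -1 if it is not listed.
 */
int mig_id_for(const char *const ops[], size_t count, int base, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        /* A skip still consumes index i, so later IDs do not shift. */
        if (ops[i] != NULL && strcmp(ops[i], name) == 0)
            return base + (int)i;
    }
    return -1;
}
```

With the list { "Subtract", NULL, "Add" } and a base of 500, Subtract keeps ID 500 and Add gets 502, because the skipped slot still occupies 501.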

Now use MIG to generate the server and client code that will be able to communicate with each other to call the Subtract function:

bash
mig -header myipcUser.h -sheader myipcServer.h myipc.defs

Several new files will be created in the current directory.

tip

You can find a more complex example in your system with: mdfind mach_port.defs
And you can compile it from the same folder as the file with: mig -DLIBSYSCALL_INTERFACE mach_port.defs

In the files myipcServer.c and myipcServer.h you can find the declaration and definition of the struct SERVERPREFmyipc_subsystem, which basically defines the function to call based on the received message ID (we indicated a starting number of 500):

c
/* Description of this subsystem, for use in direct RPC */
const struct SERVERPREFmyipc_subsystem SERVERPREFmyipc_subsystem = {
	myipc_server_routine,
	500, // start ID
	501, // end ID
	(mach_msg_size_t)sizeof(union __ReplyUnion__SERVERPREFmyipc_subsystem),
	(vm_address_t)0,
	{
          { (mig_impl_routine_t) 0,
          // Function to call
          (mig_stub_routine_t) _XSubtract, 3, 0, (routine_arg_descriptor_t)0, (mach_msg_size_t)sizeof(__Reply__Subtract_t)},
	}
};

Based on the previous struct the function myipc_server_routine will get the message ID and return the proper function to call:

c
mig_external mig_routine_t myipc_server_routine
	(mach_msg_header_t *InHeadP)
{
	int msgh_id;

	msgh_id = InHeadP->msgh_id - 500;

	if ((msgh_id > 0) || (msgh_id < 0))
		return 0;

	return SERVERPREFmyipc_subsystem.routine[msgh_id].stub_routine;
}

In this example we have only defined one function in the definitions, but if we had defined more, they would all be in the routine array of SERVERPREFmyipc_subsystem: the first one would be assigned ID 500, the second one ID 501, and so on.
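As a rough illustration of how that lookup scales to several routines, here is a portable sketch. It assumes simplified stand-in types and a second, made-up routine. Every demo_* name is hypothetical, and this is not the code MIG emits.

```c
#include <stddef.h>

/* Stand-in for mig_stub_routine_t, simplified to two ints for the sketch. */
typedef int (*demo_stub_t)(int, int);

static int demo_subtract(int a, int b) { return a - b; }
static int demo_add(int a, int b)      { return a + b; }

#define DEMO_BASE_ID 500  /* subsystem ID from the .defs file */

/* Index 0 answers to ID 500, index 1 to ID 501, and so on. */
static const demo_stub_t demo_routines[] = { demo_subtract, demo_add };
#define DEMO_COUNT ((int)(sizeof(demo_routines) / sizeof(demo_routines[0])))

/* Same idea as myipc_server_routine: range check, then index by ID. */
demo_stub_t demo_server_routine(int msgh_id)
{
    int idx = msgh_id - DEMO_BASE_ID;

    if (idx < 0 || idx >= DEMO_COUNT)
        return NULL;            /* unknown message ID */
    return demo_routines[idx];
}
```

Calling demo_server_routine(501) returns demo_add, while any ID outside 500..501 returns NULL, which mirrors the `return 0` branch of the generated function.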

If the function were expected to send a reply, the function mig_internal kern_return_t __MIG_check__Reply__<name> would also exist.

Actually, it's possible to identify this relation in the macro subsystem_to_name_map_myipc from myipcServer.h (subsystem_to_name_map_*** in other files):

c
#ifndef subsystem_to_name_map_myipc
#define subsystem_to_name_map_myipc \
    { "Subtract", 500 }
#endif

Finally, another important function to make the server work will be myipc_server, which is the one that will actually call the function related to the received id:

mig_external boolean_t myipc_server
	(mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP)
{
	/*
	 * typedef struct {
	 * 	mach_msg_header_t Head;
	 * 	NDR_record_t NDR;
	 * 	kern_return_t RetCode;
	 * } mig_reply_error_t;
	 */

	mig_routine_t routine;

	OutHeadP->msgh_bits = MACH_MSGH_BITS(MACH_MSGH_BITS_REPLY(InHeadP->msgh_bits), 0);
	OutHeadP->msgh_remote_port = InHeadP->msgh_reply_port;
	/* Minimal size: routine() will update it if different */
	OutHeadP->msgh_size = (mach_msg_size_t)sizeof(mig_reply_error_t);
	OutHeadP->msgh_local_port = MACH_PORT_NULL;
	OutHeadP->msgh_id = InHeadP->msgh_id + 100;
	OutHeadP->msgh_reserved = 0;

	if ((InHeadP->msgh_id > 500) || (InHeadP->msgh_id < 500) ||
	    ((routine = SERVERPREFmyipc_subsystem.routine[InHeadP->msgh_id - 500].stub_routine) == 0)) {
		((mig_reply_error_t *)OutHeadP)->NDR = NDR_record;
		((mig_reply_error_t *)OutHeadP)->RetCode = MIG_BAD_ID;
		return FALSE;
	}
	(*routine) (InHeadP, OutHeadP);
	return TRUE;
}

Note the lines above that index SERVERPREFmyipc_subsystem.routine with the received message ID (minus 500) to get the function to call.

The following is the code to create a simple server that exposes the Subtract function so that a client can call it:

c
// gcc myipc_server.c myipcServer.c -o myipc_server

#include <stdio.h>
#include <mach/mach.h>
#include <servers/bootstrap.h>
#include "myipcServer.h"

kern_return_t SERVERPREFSubtract(mach_port_t server_port, uint32_t n1, uint32_t n2)
{
    printf("Received: %d - %d = %d\n", n1, n2, n1 - n2);
    return KERN_SUCCESS;
}

int main() {

    mach_port_t port;
    kern_return_t kr;

    // Register the mach service
    kr = bootstrap_check_in(bootstrap_port, "xyz.hacktricks.mig", &port);
    if (kr != KERN_SUCCESS) {
        printf("bootstrap_check_in() failed with code 0x%x\n", kr);
        return 1;
    }

    // myipc_server is the function that handles incoming messages (check previous explanation)
    mach_msg_server(myipc_server, sizeof(union __RequestUnion__SERVERPREFmyipc_subsystem), port, MACH_MSG_TIMEOUT_NONE);
}

The NDR_record

The NDR_record is exported by libsystem_kernel.dylib. It is a struct that lets MIG transform data so that it is independent of the system it runs on, because MIG was designed to be used between different systems (and not only on the same machine).

This is interesting because if _NDR_record is found in a binary as a dependency (jtool2 -S <binary> | grep NDR or nm), it means that the binary is a MIG client or Server.

Moreover MIG servers have the dispatch table in __DATA.__const (or in __CONST.__constdata in macOS kernel and __DATA_CONST.__const in other *OS kernels). This can be dumped with jtool2.

And MIG clients will use the __NDR_record to send with __mach_msg to the servers.

Binary Analysis

jtool

As many binaries now use MIG to expose mach ports, it's interesting to know how to identify that MIG was used and the functions that MIG executes with each message ID.

jtool2 can parse MIG information from a Mach-O binary indicating the message ID and identifying the function to execute:

bash
jtool2 -d __DATA.__const myipc_server | grep MIG

Moreover, MIG functions are just wrappers of the actual function that gets called, which means that by getting its disassembly and grepping for BL you might be able to find the actual function being called:

bash
jtool2 -d __DATA.__const myipc_server | grep BL

Assembly

It was previously mentioned that the function that takes care of calling the correct function depending on the received message ID is myipc_server. However, you usually won't have the symbols of the binary (no function names), so it's interesting to check what it looks like decompiled, as it will always be very similar (the code of this function is independent of the functions exposed):

int _myipc_server(int arg0, int arg1) {
    var_10 = arg0;
    var_18 = arg1;
    // Initial instructions to find the proper function pointers
    *(int32_t *)var_18 = *(int32_t *)var_10 & 0x1f;
    *(int32_t *)(var_18 + 0x8) = *(int32_t *)(var_10 + 0x8);
    *(int32_t *)(var_18 + 0x4) = 0x24;
    *(int32_t *)(var_18 + 0xc) = 0x0;
    *(int32_t *)(var_18 + 0x14) = *(int32_t *)(var_10 + 0x14) + 0x64;
    *(int32_t *)(var_18 + 0x10) = 0x0;
    if (*(int32_t *)(var_10 + 0x14) <= 0x1f4 && *(int32_t *)(var_10 + 0x14) >= 0x1f4) {
            rax = *(int32_t *)(var_10 + 0x14);
            // Call to sign_extend_64 that can help to identify this function
            // This stores in rax the pointer to the function that needs to be called
            // Check the use of the address 0x100004040 (function addresses array)
            // 0x1f4 = 500 (the starting ID)
            rax = *(sign_extend_64(rax - 0x1f4) * 0x28 + 0x100004040);
            var_20 = rax;
            // If - else, the if returns false, while the else calls the correct function and returns true
            if (rax == 0x0) {
                    *(var_18 + 0x18) = **_NDR_record;
                    *(int32_t *)(var_18 + 0x20) = 0xfffffffffffffed1;
                    var_4 = 0x0;
            }
            else {
                    // Calculated address that calls the proper function with 2 arguments
                    (var_20)(var_10, var_18);
                    var_4 = 0x1;
            }
    }
    else {
            *(var_18 + 0x18) = **_NDR_record;
            *(int32_t *)(var_18 + 0x20) = 0xfffffffffffffed1;
            var_4 = 0x0;
    }
    rax = var_4;
    return rax;
}
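The constants in the decompiled output map back to the generated C source shown earlier. The sketch below uses portable stand-in structs (the demo_* names are hypothetical, and the 8-byte NDR field is an assumption about the size of NDR_record_t) to check that reading: 0x14 is the offset of msgh_id, 0x24 is the minimal reply size, 0x20 is the offset of RetCode, and 0xfffffed1 is MIG_BAD_ID (-303) seen as an unsigned 32-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the six 32-bit fields of mach_msg_header_t. */
typedef struct {
    uint32_t msgh_bits;         /* offset 0x00 */
    uint32_t msgh_size;         /* offset 0x04 */
    uint32_t msgh_remote_port;  /* offset 0x08 */
    uint32_t msgh_local_port;   /* offset 0x0c */
    uint32_t msgh_reserved;     /* offset 0x10 */
    int32_t  msgh_id;           /* offset 0x14 */
} demo_header_t;

/* Stand-in for mig_reply_error_t (NDR assumed to be 8 bytes). */
typedef struct {
    demo_header_t Head;
    uint8_t       NDR[8];       /* offset 0x18 */
    int32_t       RetCode;      /* offset 0x20 */
} demo_reply_error_t;

/* Reinterprets a raw 32-bit value as a signed two's-complement number. */
int32_t demo_as_signed32(uint32_t raw)
{
    if (raw & 0x80000000u)
        return -(int32_t)(~raw) - 1;
    return (int32_t)raw;
}
```

Under these assumptions sizeof(demo_reply_error_t) is 0x24, and demo_as_signed32(0xfffffed1) yields -303. Likewise, 0x1f4 is the subsystem ID 500, and 0x64 is the 100 added to the request ID to build the reply ID.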

Actually, if you go to the address 0x100004000 you will find the array of routine_descriptor structs. The first element of the struct is the address where the function is implemented, and the struct takes 0x28 bytes, so every 0x28 bytes (starting from byte 0) you can read 8 bytes, and that will be the address of the function that will be called.

This data can be extracted using this Hopper script.
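The same walk over the table can be sketched in portable C. This is not the Hopper script mentioned above. It is a hypothetical helper (demo_stub_at) that assumes you already have a raw dump of the array whose first byte is the first pointer slot.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_STRIDE 0x28  /* size of one routine descriptor, as described above */

/*
 * Reads the 8-byte function address stored at the start of entry `index`.
 * Returns 0 when the entry would fall outside the dump.
 */
uint64_t demo_stub_at(const uint8_t *dump, size_t dump_len, size_t index)
{
    size_t   off  = index * DEMO_STRIDE;
    uint64_t addr = 0;

    if (off + sizeof(addr) > dump_len)
        return 0;
    /* memcpy avoids unaligned reads; bytes are taken in host byte order. */
    memcpy(&addr, dump + off, sizeof(addr));
    return addr;
}
```

Looping index from 0 until the function returns 0 (or until the known number of routines is reached) lists one handler address per message ID, starting at the subsystem base ID.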

Debug

The code generated by MIG also calls kernel_debug to generate logs about operations on entry and exit. It's possible to check them using trace or kdv: kdv all | grep MIG
