Understanding Mach-O Header

The Mach-O Header

At the start of every Mach-O file, there is a header. It contains basic information about the rest of the file. The header structures can be found under usr/include/mach-o/loader.h. We can visualize the Mach-O header to be something like this.

Mach-O Header Layout

Header structure for 32-bit architecture

struct mach_header {
	uint32_t	magic;		/* mach magic number identifier */
	cpu_type_t	cputype;	/* cpu specifier */
	cpu_subtype_t	cpusubtype;	/* machine specifier */
	uint32_t	filetype;	/* type of file */
	uint32_t	ncmds;		/* number of load commands */
	uint32_t	sizeofcmds;	/* the size of all the load commands */
	uint32_t	flags;		/* flags */
};

Header structure for 64-bit architecture

struct mach_header_64 {
	uint32_t	magic;		/* mach magic number identifier */
	cpu_type_t	cputype;	/* cpu specifier */
	cpu_subtype_t	cpusubtype;	/* machine specifier */
	uint32_t	filetype;	/* type of file */
	uint32_t	ncmds;		/* number of load commands */
	uint32_t	sizeofcmds;	/* the size of all the load commands */
	uint32_t	flags;		/* flags */
	uint32_t	reserved;	/* reserved */
};
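
The two layouts differ only in the trailing reserved field, so the 32-bit header occupies 28 bytes and the 64-bit header 32 bytes. Here is a minimal sketch that makes this visible. It assumes cpu_type_t and cpu_subtype_t are 32-bit integers (as in mach/machine.h) and uses local copies of the structs so it builds without the macOS SDK. The helper name header_size_for is hypothetical and not part of loader.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumption): in mach/machine.h both are 32-bit ints. */
typedef int32_t cpu_type_t;
typedef int32_t cpu_subtype_t;

struct mach_header {
	uint32_t	magic;
	cpu_type_t	cputype;
	cpu_subtype_t	cpusubtype;
	uint32_t	filetype;
	uint32_t	ncmds;
	uint32_t	sizeofcmds;
	uint32_t	flags;
};

struct mach_header_64 {
	uint32_t	magic;
	cpu_type_t	cputype;
	cpu_subtype_t	cpusubtype;
	uint32_t	filetype;
	uint32_t	ncmds;
	uint32_t	sizeofcmds;
	uint32_t	flags;
	uint32_t	reserved;
};

/* Hypothetical helper: number of bytes the header occupies at the
 * start of the file, depending on whether the binary is 64-bit. */
static size_t header_size_for(int is_64bit)
{
	return is_64bit ? sizeof(struct mach_header_64)
			: sizeof(struct mach_header);
}
```

Every field is four bytes wide, so the compiler adds no padding: header_size_for(0) is 28 and header_size_for(1) is 32.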

Magic Number Identifier (magic)

For 32-bit

#define	MH_MAGIC	0xfeedface	/* the mach magic number */
#define MH_CIGAM	0xcefaedfe	/* NXSwapInt(MH_MAGIC) */

For 64-bit

#define MH_MAGIC_64 0xfeedfacf /* the 64-bit mach magic number */
#define MH_CIGAM_64 0xcffaedfe /* NXSwapInt(MH_MAGIC_64) */

You can see two variants of the ‘magic number’: MAGIC and its byte-reversed form, CIGAM. Finding CIGAM / CIGAM_64 means that the bytes of every multi-byte field must be swapped (reversed), since the machine on which the binary was created has the opposite byte order (endianness) to that of the machine reading it.
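
As a rough sketch of how a reader might act on the magic number, the snippet below classifies the first four bytes of a file after they have been loaded into a uint32_t in host byte order. The helper names swap32 and classify_magic are hypothetical. Only the four constants come from loader.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define MH_MAGIC	0xfeedface
#define MH_CIGAM	0xcefaedfe
#define MH_MAGIC_64	0xfeedfacf
#define MH_CIGAM_64	0xcffaedfe

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical helper: returns true if `magic` is one of the four
 * values above and reports what it implies; returns false otherwise. */
static bool classify_magic(uint32_t magic, bool *is_64bit, bool *needs_swap)
{
	switch (magic) {
	case MH_MAGIC:
		*is_64bit = false; *needs_swap = false; return true;
	case MH_CIGAM:
		*is_64bit = false; *needs_swap = true;  return true;
	case MH_MAGIC_64:
		*is_64bit = true;  *needs_swap = false; return true;
	case MH_CIGAM_64:
		*is_64bit = true;  *needs_swap = true;  return true;
	default:
		return false;	/* not one of the four magics above */
	}
}
```

When needs_swap comes back true, every other 32-bit header field would have to go through swap32 before being interpreted. A 64-bit little-endian binary stores the bytes cf fa ed fe on disk, which a little-endian host loads as 0xfeedfacf, so no swap is needed there.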

CPU Type (cputype)

The CPU Type field shows the architecture targeted by the binary. The type cpu_type_t is an integer alias.

#define CPU_TYPE_ANY            ((cpu_type_t) -1)

#define CPU_TYPE_VAX            ((cpu_type_t) 1)
/* skip				((cpu_type_t) 2)	*/
/* skip				((cpu_type_t) 3)	*/
/* skip				((cpu_type_t) 4)	*/
/* skip				((cpu_type_t) 5)	*/
#define CPU_TYPE_MC680x0        ((cpu_type_t) 6)
#define CPU_TYPE_X86            ((cpu_type_t) 7)
#define CPU_TYPE_I386           CPU_TYPE_X86            /* compatibility */
#define CPU_TYPE_X86_64         (CPU_TYPE_X86 | CPU_ARCH_ABI64)

/* skip CPU_TYPE_MIPS		((cpu_type_t) 8)	*/
/* skip                         ((cpu_type_t) 9)	*/
#define CPU_TYPE_MC98000        ((cpu_type_t) 10)
#define CPU_TYPE_HPPA           ((cpu_type_t) 11)
#define CPU_TYPE_ARM            ((cpu_type_t) 12)
#define CPU_TYPE_ARM64          (CPU_TYPE_ARM | CPU_ARCH_ABI64)
#define CPU_TYPE_ARM64_32       (CPU_TYPE_ARM | CPU_ARCH_ABI64_32)
#define CPU_TYPE_MC88000        ((cpu_type_t) 13)
#define CPU_TYPE_SPARC          ((cpu_type_t) 14)
#define CPU_TYPE_I860           ((cpu_type_t) 15)
/* skip	CPU_TYPE_ALPHA		((cpu_type_t) 16)	*/
/* skip				((cpu_type_t) 17)	*/
#define CPU_TYPE_POWERPC                ((cpu_type_t) 18)
#define CPU_TYPE_POWERPC64              (CPU_TYPE_POWERPC | CPU_ARCH_ABI64)
/* skip				((cpu_type_t) 19)	*/
/* skip				((cpu_type_t) 20 */
/* skip				((cpu_type_t) 21 */
/* skip				((cpu_type_t) 22 */
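
The 64-bit variants above are built by OR-ing an ABI bit into the base CPU type. The sketch below shows how those bits can be taken apart again. The CPU_ARCH_* values are the ones defined in mach/machine.h. The typedef is a local stand-in, and the helper names cpu_base_type and cpu_is_64bit are hypothetical.

```c
#include <stdint.h>

typedef int32_t cpu_type_t;	/* local stand-in for the machine.h alias */

#define CPU_ARCH_MASK		0xff000000	/* mask for architecture bits */
#define CPU_ARCH_ABI64		0x01000000	/* 64-bit ABI */
#define CPU_ARCH_ABI64_32	0x02000000	/* ABI for 64-bit hardware with 32-bit types */

#define CPU_TYPE_X86		((cpu_type_t) 7)
#define CPU_TYPE_X86_64		(CPU_TYPE_X86 | CPU_ARCH_ABI64)
#define CPU_TYPE_ARM		((cpu_type_t) 12)
#define CPU_TYPE_ARM64		(CPU_TYPE_ARM | CPU_ARCH_ABI64)
#define CPU_TYPE_ARM64_32	(CPU_TYPE_ARM | CPU_ARCH_ABI64_32)

/* Hypothetical helper: strip the ABI bits, leaving the CPU family. */
static cpu_type_t cpu_base_type(cpu_type_t t)
{
	return (cpu_type_t)(t & ~CPU_ARCH_MASK);
}

/* Hypothetical helper: non-zero if the 64-bit ABI bit is set. */
static int cpu_is_64bit(cpu_type_t t)
{
	return (t & CPU_ARCH_ABI64) != 0;
}
```

With these definitions CPU_TYPE_X86_64 works out to 0x01000007 and CPU_TYPE_ARM64 to 0x0100000c, which are the values you would see in the cputype field of a hex dump.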

CPU SubType (cpusubtype)

This field specifies the specific machine the code can run on. I won’t paste the whole list from the source file here. However, a few interesting values are worth mentioning.

#define CPU_SUBTYPE_ANY         ((cpu_subtype_t) -1)

The definition in the header file was hard for me to understand, so I will quote it verbatim here: "When selecting a slice, ANY will pick the slice with the best grading for the selected cpu_type_t, unlike the “ALL” subtypes, which are the slices that can run on any hardware for that cpu type."

File Types (filetype)

This field lets us know what kind of file the Mach-O represents, and it also defines what the layout of the file will be. Let us examine a few. See mach-o/loader.h for more.

| File Type | Value | Description |
| --- | --- | --- |
| MH_OBJECT | 0x1 | Represents intermediate files produced by the compiler or assembler. This is used by .o files |
| MH_EXECUTE | 0x2 | A standard executable file |
| MH_DYLIB | 0x6 | Represents a .dylib, a dynamically linked shared library |
| MH_BUNDLE | 0x8 | Represents a .bundle file |
| MH_DSYM | 0xa | The file storing symbol information. Services like Firebase use these files to recover the class names and details that led to a crash |

loader.h also defines the constants below. They look similar, but they are bit masks for the flags field of the header, not values of filetype.

| Flag | Value | Description |
| --- | --- | --- |
| MH_APP_EXTENSION_SAFE | 0x02000000 | The code was linked for use in an app extension (.appex) |
| MH_SIM_SUPPORT | 0x08000000 | Possibly marks tvOS, watchOS and iOS builds that can be executed on the Simulator |
| MH_DYLIB_IN_CACHE | 0x80000000 | Set on dylibs that are part of the shared cache. Think UIKit or Foundation frameworks |

The filetype values are plain enumerated constants, so a file has exactly one file type and the values are not ‘OR’ed together. Only the bits of the flags field can be combined that way. The filetype field is a 32-bit unsigned integer for both 32-bit and 64-bit architectures.
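
Here is a minimal sketch of how these constants might be used when inspecting a header, with the values copied from loader.h. In that header the small values (0x1 to 0xa here) are constants for filetype and are compared for equality, while the large masks such as 0x02000000 are bits of the flags field and are tested with a bitwise AND. The helper names filetype_name and has_flag are hypothetical and not part of any Apple API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants for the filetype field (subset). */
#define MH_OBJECT	0x1
#define MH_EXECUTE	0x2
#define MH_DYLIB	0x6
#define MH_BUNDLE	0x8
#define MH_DSYM		0xa

/* Constants for the flags field (subset). */
#define MH_APP_EXTENSION_SAFE	0x02000000
#define MH_SIM_SUPPORT		0x08000000
#define MH_DYLIB_IN_CACHE	0x80000000

/* Hypothetical helper: printable name for a filetype value. */
static const char *filetype_name(uint32_t filetype)
{
	switch (filetype) {
	case MH_OBJECT:  return "MH_OBJECT";
	case MH_EXECUTE: return "MH_EXECUTE";
	case MH_DYLIB:   return "MH_DYLIB";
	case MH_BUNDLE:  return "MH_BUNDLE";
	case MH_DSYM:    return "MH_DSYM";
	default:         return "unknown";	/* not in this subset */
	}
}

/* Hypothetical helper: true if every bit of `mask` is set in `flags`. */
static bool has_flag(uint32_t flags, uint32_t mask)
{
	return (flags & mask) == mask;
}
```

For example, a system framework pulled out of the shared cache might report filetype_name(hdr.filetype) as "MH_DYLIB" while has_flag(hdr.flags, MH_DYLIB_IN_CACHE) is true, where hdr stands for an already parsed header.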

Number of load commands (ncmds)

Before explaining what this field is, let us answer an important question

What are load commands?

Loading is the process of bringing a program into main memory (RAM) so that it can be executed, and load commands specify how to do it: for example, which segments of the file to map into memory and which dynamic libraries the binary depends on.

Now, back to the ncmds field. It defines the total number of load commands in the Mach-O file.

Size of load commands (sizeofcmds)

Defines how many bytes the load commands occupy in the Mach-O binary.
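
The two fields work together. The load commands begin immediately after the header, and each one starts with a struct load_command (from loader.h) that carries its own cmd and cmdsize. A reader can therefore step through ncmds commands and expect to land exactly on sizeofcmds bytes. Below is a rough sketch of that walk. The helper name check_load_commands is hypothetical, and the code assumes the commands are already in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct load_command {
	uint32_t cmd;		/* type of load command */
	uint32_t cmdsize;	/* total size of command in bytes */
};

/*
 * Hypothetical helper: `cmds` points at the bytes right after the
 * header. Returns 0 if `ncmds` commands fill exactly `sizeofcmds`
 * bytes, -1 if the data is malformed.
 */
static int check_load_commands(const uint8_t *cmds, uint32_t ncmds,
			       uint32_t sizeofcmds)
{
	uint32_t offset = 0;

	for (uint32_t i = 0; i < ncmds; i++) {
		struct load_command lc;

		if (sizeofcmds - offset < sizeof(lc))
			return -1;	/* no room for another command */
		/* memcpy avoids unaligned reads from the raw buffer */
		memcpy(&lc, cmds + offset, sizeof(lc));
		if (lc.cmdsize < sizeof(lc) || lc.cmdsize > sizeofcmds - offset)
			return -1;	/* size too small or past the end */
		offset += lc.cmdsize;	/* jump to the next command */
	}
	return offset == sizeofcmds ? 0 : -1;
}
```

Validating cmdsize before advancing matters in practice, since a corrupted or hostile file could otherwise send the walk outside the buffer.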

Flags (flags)

This field holds bit flags that indicate optional features of the Mach-O file. We won’t be discussing them much in this post.

Reserved (reserved)

This field only exists in 64-bit Mach-O binaries. As with reserved fields everywhere, it is ‘reserved’ for future use.