Statically search for the call location of iOS dynamically linked functions

2024-07-11

Mach-O is the executable file format used on Apple's operating systems, including macOS and iOS.

Some iOS security scans need the exact location where a function is called, which helps users pinpoint vulnerabilities more accurately.
This article takes the NSLog function as an example and uses a static method to find where a dynamically linked function is called in a Mach-O file.

Target

Find the specific locations where the NSLog function is called in a Mach-O application.

Ideas

The whole search comes down to parsing the Mach-O file, that is, decoding its binary data into the appropriate data structures.
Several parts of the Mach-O file are used:
String Table
Symbol Table
Dynamic Symbol Table
Section64(__TEXT,__stubs)
Section64(__TEXT,__text)

Specific steps

1. First, find the location of the Symbol Table and String Table in the Mach-O file

Find LC_SYMTAB in the Load Commands to get the offset and size of the String Table, as well as the offset and number of entries of the Symbol Table:

Symbol Table Offset is 0x18c478
Number of Symbols is 0x9a2d
String Table offset is 0x2273b0
String Table Size is 0x108d58
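
The lookup above can be sketched in Python. This is a minimal sketch, assuming a thin (non-fat) 64-bit little-endian Mach-O; the function name find_symtab and the synthetic header are illustrative only, and the field layout follows the mach_header_64 and symtab_command structures in <mach-o/loader.h>.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF  # magic of a 64-bit little-endian Mach-O
LC_SYMTAB = 0x2           # load command that describes the symbol/string tables

def find_symtab(data: bytes):
    """Return (symoff, nsyms, stroff, strsize) from LC_SYMTAB, or None."""
    # mach_header_64 is eight 32-bit fields (32 bytes); ncmds is the fifth.
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O")
    offset = 32  # load commands start right after the header
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmd == LC_SYMTAB:
            # symtab_command: cmd, cmdsize, symoff, nsyms, stroff, strsize
            return struct.unpack_from("<4I", data, offset + 8)
        offset += cmdsize
    return None

# Synthetic header carrying the values from this article (not a real binary).
header = struct.pack("<8I", MH_MAGIC_64, 0x0100000C, 0, 2, 1, 24, 0, 0)
symtab_cmd = struct.pack("<6I", LC_SYMTAB, 24, 0x18C478, 0x9A2D, 0x2273B0, 0x108D58)
print([hex(v) for v in find_symtab(header + symtab_cmd)])
```

For a real application, data would be the bytes of the executable read from disk; a fat binary would first need its arm64 slice extracted.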

Each symbol entry is 16 bytes (0x10) long, so:

The size of the Symbol Table is 0x9a2d * 0x10 = 0x9A2D0
The starting address of the Symbol Table is 0x18c478
The end address is 0x18c478 + 0x9A2D0 = 0x226748

The starting address of the String Table is 0x2273b0
The end address is 0x2273b0 + 0x108d58 = 0x330108
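
The table bounds above can be double-checked with a few lines of Python. The variable names are illustrative; the numbers are the ones read from LC_SYMTAB in this example binary.

```python
NLIST_64_SIZE = 0x10  # each 64-bit symbol entry is 16 bytes

symoff, nsyms = 0x18C478, 0x9A2D      # Symbol Table offset and entry count
stroff, strsize = 0x2273B0, 0x108D58  # String Table offset and size

sym_size = nsyms * NLIST_64_SIZE  # total size of the Symbol Table
sym_end = symoff + sym_size       # first byte after the Symbol Table
str_end = stroff + strsize        # first byte after the String Table

print(hex(sym_size), hex(sym_end), hex(str_end))
# 0x9a2d0 0x226748 0x330108
```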
Because strings vary in length, the entries in the String Table are not fixed-size.
When reading the binary String Table data, you can use '\x00' as the string separator.

2. Traverse the String Table and find _NSLog

You can first read the String Table data from the Mach-O file, split it on '\x00', and build a string array.
Then traverse the array and check whether each entry equals "_NSLog".

_NSLog is found at position 0x23331b; the bytes "5F4E534C6F6700" are the string "_NSLog" followed by its terminating '\x00'.

49003 (0xBF6B in hexadecimal) is the index of the current string, recorded as strTab_index = 49003.
This index is a byte offset, not a position in the string array: it is counted from the starting address 0x2273b0 of the String Table, so keep track of byte offsets while traversing.
0x2273b0 + 49003 = 0x23331B, which is exactly the starting address of _NSLog.
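
A minimal sketch of step 2, assuming the String Table has already been sliced out of the file as bytes. Instead of building a string array it searches the raw bytes directly, so the result is already the byte offset needed in step 3. The function name find_string_offset and the toy table are illustrative only.

```python
def find_string_offset(strtab: bytes, name: bytes) -> int:
    """Return the byte offset of the NUL-terminated `name` in strtab, or -1."""
    needle = name + b"\x00"
    pos = strtab.find(needle)
    while pos != -1:
        # Accept only a match that begins a string: at offset 0 or right
        # after a NUL, so "_objc_NSLog" does not count as "_NSLog".
        if pos == 0 or strtab[pos - 1] == 0:
            return pos
        pos = strtab.find(needle, pos + 1)
    return -1

# Toy String Table; in the article's binary the same search yields 49003.
strtab = b"\x00_NSLogv\x00_objc_NSLog\x00_NSLog\x00_printf\x00"
str_tab_index = find_string_offset(strtab, b"_NSLog")
print(str_tab_index)  # 21
```

Adding the returned offset to the String Table's file offset gives the position in the file, as in 0x2273b0 + 49003 = 0x23331B.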

3. Use 49003 (0xBF6B) from step 2 to search for the corresponding symbol in the Symbol Table

In MachOView, you can see that the address 0x00224988 corresponds to _NSLog.

How do we find the matching Symbol Table entry from index 49003?
From step 1, we know:
The starting address of the Symbol Table is 0x18c478.
The end address of the Symbol Table is 0x226748.
In this Mach-O file, a single Symbol Table entry is 0x10 bytes.

The first four bytes of entry 38993 in the Symbol Table hold 0xBF6B, which is 49003.
This matches the index of the _NSLog string in the String Table, so this entry corresponds to "_NSLog":

38993 is the index of this entry in the Symbol Table, recorded as symTab_index = 38993.
0x18c478 + 38993 * 0x10 = 0x224988 is exactly the address of this entry.
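
Step 3 can be sketched as a linear scan over the 16-byte entries. This is a sketch under the assumption that the entries follow the nlist_64 layout from <mach-o/nlist.h> (n_strx, n_type, n_sect, n_desc, n_value); the function name find_symbol_index and the toy table are illustrative only.

```python
import struct

# nlist_64: n_strx (u32), n_type (u8), n_sect (u8), n_desc (u16), n_value (u64)
NLIST_64 = struct.Struct("<IBBHQ")

def find_symbol_index(symtab: bytes, nsyms: int, strx: int) -> int:
    """Return the index of the first entry whose n_strx equals strx, or -1."""
    for i in range(nsyms):
        n_strx, _type, _sect, _desc, _value = NLIST_64.unpack_from(
            symtab, i * NLIST_64.size)
        if n_strx == strx:
            return i
    return -1

# Toy Symbol Table with three entries; the second one points at offset 49003.
entries = [
    (1, 0x0F, 1, 0, 0x100000000),
    (49003, 0x01, 0, 0x0100, 0),
    (9, 0x01, 0, 0x0100, 0),
]
symtab = b"".join(NLIST_64.pack(*e) for e in entries)
sym_tab_index = find_symbol_index(symtab, len(entries), 49003)
print(sym_tab_index)  # 1
```

In the article's binary the same scan would return 38993, and the entry's file offset is the Symbol Table offset plus the index times 0x10.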

4. Search the Dynamic Symbol Table using symTab_index

The location of the Indirect Symbols can be read from LC_DYSYMTAB in the Load Commands of the Mach-O file:

Starting address: 0x226748, with 794 entries of 0x4 bytes each.
Traversing each entry in Indirect Symbols, entry 111 stores 38993.
So this entry corresponds to "_NSLog"; set dySymTab_index = 111.
0x226748 + 111 * 0x4 = 0x226904
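
Step 4 can be sketched the same way, assuming the Indirect Symbols have been sliced out of the file as an array of little-endian 32-bit values. The function name find_indirect_indexes and the toy table are illustrative only. The sketch returns every matching index rather than only the first, because the same symbol can be referenced from more than one section (for example from the stubs and from a symbol pointer section).

```python
import struct

def find_indirect_indexes(indirect: bytes, count: int, sym_index: int):
    """Return all positions in the indirect symbol table that store sym_index."""
    values = struct.unpack_from(f"<{count}I", indirect, 0)
    return [i for i, v in enumerate(values) if v == sym_index]

# Toy indirect symbol table with four 4-byte entries.
indirect = struct.pack("<4I", 5, 38993, 7, 38993)
matches = find_indirect_indexes(indirect, 4, 38993)
print(matches)  # [1, 3]
```

In the article's binary, one such match is 111, and the entry's file offset is 0x226748 + 111 * 0x4 = 0x226904.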