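In day-to-day development and debugging we keep running into C call stacks and assembly, so this article goes through that material in one place. Some assembly background is needed to follow it comfortably.
Function signatures
In JavaScript, defining and calling functions is very permissive: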
function func(a, b, c) {
console.log(a, b, c)
}
func(1)
func(1, 2, 3, 4, 5, 6)
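This works without any problem. In C, however, function calls are strict: if the number of arguments is wrong, or an argument's type cannot be implicitly converted, compilation fails.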
int arg1_func(int a) {
return a;
}
int arg2_func(int a, int b) {
return a+b;
}
arg1_func(1, 2);
arg2_func(1);
The C code above fails to compile; we will come back to the reason later. Here we call int(*)(int) the function signature of arg1_func.
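Why do we need to understand function signatures? Because in C, argument passing depends on the function signature and must be settled at compile time. The signature decides how arguments are handed to the function and how the return value comes back.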
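As a small illustration of what the signature fixes at compile time, the sketch below calls arg1_func through a pointer whose type matches its signature. The names unary_fn and call_unary are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Same function as in the example above. */
int arg1_func(int a) {
    return a;
}

/* Hypothetical name: a pointer type matching the signature int(*)(int). */
typedef int (*unary_fn)(int);

/* Hypothetical helper: the compiler knows fn takes exactly one int and
 * returns an int, so it knows which registers to fill and read. */
int call_unary(unary_fn fn, int value) {
    assert(fn != NULL);
    return fn(value);
}
```

Calling call_unary(arg1_func, 7) compiles cleanly because the types agree; passing arg2_func instead would draw a diagnostic from the compiler.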
Next, let's look at how C passes arguments. Different architectures handle this differently, but the ideas are broadly the same, so we use AArch64 as the example.
Registers
Before going lower we need a little ARM background. Here is a brief introduction; for details on ARM assembly see the official armasm_user_guide and ABI documents.
ARM_ASM (section 4.1)
In AArch64 state, the following registers are available:
- Thirty-one 64-bit general-purpose registers X0-X30, the bottom halves of which are accessible as W0-W30.
- Four stack pointer registers SP_EL0, SP_EL1, SP_EL2, SP_EL3.
- Three exception link registers ELR_EL1, ELR_EL2, ELR_EL3.
- Three saved program status registers SPSR_EL1, SPSR_EL2, SPSR_EL3.
- One program counter.
ABI (section 9.1)
For the purposes of function calls, the general-purpose registers are divided into four groups:
- Argument registers (X0-X7)
These are used to pass parameters to a function and to return a result. They can be used as scratch registers or as caller-saved register variables that can hold intermediate values within a function, between calls to other functions. The fact that 8 registers are available for passing parameters reduces the need to spill parameters to the stack when compared with AArch32.
- Caller-saved temporary registers (X9-X15)
If the caller requires the values in any of these registers to be preserved across a call to another function, the caller must save the affected registers in its own stack frame. They can be modified by the called subroutine without the need to save and restore them before returning to the caller.
- Callee-saved registers (X19-X29)
These registers are saved in the callee frame. They can be modified by the called subroutine as long as they are saved and restored before returning.
- Registers with a special purpose (X8, X16-X18, X29, X30)
- X8 is the indirect result register. This is used to pass the address location of an indirect result, for example, where a function returns a large structure.
- X16 and X17 are IP0 and IP1, intra-procedure-call temporary registers. These can be used by call veneers and similar code, or as temporary registers for intermediate values between subroutine calls. They are corruptible by a function. Veneers are small pieces of code which are automatically inserted by the linker, for example when the branch target is out of range of the branch instruction.
- X18 is the platform register and is reserved for the use of platform ABIs. This is an additional temporary register on platforms that don't assign a special meaning to it.
- X29 is the frame pointer register (FP).
- X30 is the link register (LR).
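From the official documents, what we need to know here is: the 31 general-purpose registers X0-X30, the 32 floating-point registers D0-D31, the stack pointer SP, and the program counter PC, which is separate and cannot be accessed directly.
In the C ABI, X29 serves as the frame pointer FP, X30 holds the function return address (LR), X0-X7 are the argument registers, X8 is the indirect result location (related to return values), and X9-X15 are temporary registers. The remaining registers are not relevant to this article and are not covered. The official documentation has a summary diagram of this layout (image not reproduced here).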
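To follow the rest you need these registers clear in mind, especially LR = X30 and FP = X29. W0 and X0 name the same register: W0 is its lower 32 bits, X0 the full 64 bits.
The load/store instructions you need are LDR (load) and STR (store); the other load/store instructions build on these two. The addressing modes are covered in ABI section 6.3.4; the ones that appear below are: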
| Example | Description |
| --- | --- |
| LDR X0, [X1, #8] | Load from address X1 + 8 |
| LDR X0, [X1, #8]! | Pre-index: Update X1 first (to X1 + #8), then load from the new address |
| LDR X0, [X1], #8 | Post-index: Load from the unmodified address in X1 first, then update X1 (to X1 + #8) |
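The three addressing modes in the table can be modeled in C with pointer arithmetic. This is only a sketch: the function names are hypothetical, x1 stands in for register X1, and one uint64_t element corresponds to the 8-byte offset used in the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* LDR X0, [X1, #8] : load from X1 + 8, X1 itself is unchanged. */
uint64_t ldr_offset(const uint64_t *x1) {
    assert(x1 != NULL);
    return x1[1];               /* one element = 8 bytes */
}

/* LDR X0, [X1, #8]! : pre-index, update X1 first, then load. */
uint64_t ldr_pre_index(const uint64_t **x1) {
    assert(x1 != NULL && *x1 != NULL);
    *x1 += 1;                   /* X1 = X1 + 8 */
    return **x1;                /* load from the new address */
}

/* LDR X0, [X1], #8 : post-index, load first, then update X1. */
uint64_t ldr_post_index(const uint64_t **x1) {
    assert(x1 != NULL && *x1 != NULL);
    uint64_t value = **x1;      /* load from the old address */
    *x1 += 1;                   /* X1 = X1 + 8 */
    return value;
}
```

The post-index form behaves like the familiar `*p++` idiom, while the pre-index form behaves like `*++p`.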
Stack Frame
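During a C function call, SP and FP work as a pair: together they delimit the stack region that belongs to one function, which is called its stack frame.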
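A stack frame is laid out roughly as follows (diagram not reproduced here):
This layout is very important to us and is the focus of this discussion.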
Calls with few arguments
For a function call, the arguments go into X0-X7 and the return value comes back in X0. Let's analyze a simple example:
int lessArg(int arg1, char *arg2) {
return 0;
}
Before the call:
caller:
0x100791c6c <+20>: mov w9, #0x0
0x100791c70 <+24>: stur w9, [x29, #-0x14]
0x100791c74 <+28>: stur w0, [x29, #-0x18]
0x100791c78 <+32>: str x1, [x8, #0xa0]
0x100791c7c <+36>: mov x1, #0x0 ; second argument: arg2 = 0
0x100791c80 <+40>: mov x0, x9 ; first argument: arg1 = 0
0x100791c84 <+44>: str x1, [sp, #0x88]
0x100791c88 <+48>: str x8, [sp, #0x80]
0x100791c8c <+52>: str w9, [sp, #0x7c]
0x100791c90 <+56>: bl 0x100791a60 ; CALL 'lessArg'
cfunction`lessArg:
0x104491a98 <+0>: sub sp, sp, #0x10 ; the stack grows downward, so SP = SP - 0x10
0x104491a9c <+4>: mov w8, #0x0
0x104491aa0 <+8>: str w0, [sp, #0xc]
0x104491aa4 <+12>: str x1, [sp]
0x104491aa8 <+16>: mov x0, x8 ; return value: X0 = 0
0x104491aac <+20>: add sp, sp, #0x10 ; release the stack space
0x104491ab0 <+24>: ret
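The result matches the ABI description: with 8 or fewer arguments, everything is passed in registers.
Calls with many arguments
What if there are more than 8 arguments? According to the ABI, the extra ones are passed on the stack. Let's look at the result: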
int moreArg(int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7, int arg8, int arg9, int arg10, int arg11, int arg12, int arg13, char *arg14) {
return 0;
}
caller:
0x100791c9c <+68>: mov x1, sp ; x1 = SP
0x100791ca0 <+72>: ldr x30, [sp, #0x88]
0x100791ca4 <+76>: str x30, [x1, #0x18]
0x100791ca8 <+80>: orr w9, wzr, #0xc
0x100791cac <+84>: str w9, [x1, #0x10] ; SP+0x10 = arg13
0x100791cb0 <+88>: mov w9, #0xb
0x100791cb4 <+92>: str w9, [x1, #0xc] ; SP+0xc = arg12
0x100791cb8 <+96>: mov w9, #0xa
0x100791cbc <+100>: str w9, [x1, #0x8] ; SP+0x8 = arg11
0x100791cc0 <+104>: mov w9, #0x9
0x100791cc4 <+108>: str w9, [x1, #0x4] ; SP+0x4 = arg10
0x100791cc8 <+112>: orr w9, wzr, #0x8
0x100791ccc <+116>: str w9, [x1] ; SP = arg9
0x100791cd4 <+124>: orr w2, wzr, #0x2 ; w2 = arg3
0x100791cd8 <+128>: orr w3, wzr, #0x3 ; w3 = arg4
0x100791cdc <+132>: orr w4, wzr, #0x4 ; w4 = arg5
0x100791ce0 <+136>: mov w5, #0x5 ; w5 = arg6
0x100791ce4 <+140>: orr w6, wzr, #0x6 ; w6 = arg7
0x100791ce8 <+144>: orr w7, wzr, #0x7 ; w7 = arg8
0x100791cec <+148>: ldr w10, [sp, #0x7c]
0x100791cf0 <+152>: str w0, [sp, #0x78]
0x100791cf4 <+156>: mov x0, x10 ; w0 = arg1
0x100791cd0 <+120>: orr w9, wzr, #0x1
0x100791cf8 <+160>: mov x1, x9 ; w1 = arg2
0x100791cfc <+164>: str x8, [sp, #0x70]
0x100791d00 <+168>: str w9, [sp, #0x6c]
0x100791d04 <+172>: bl 0x100791a7c ; moreArg at main.mm:16
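As shown above, arg9 through arg13 are stored at SP through SP+0x10, and the pointer arg14 at SP+0x18. That is the low-address end of the caller's frame, directly above where the callee's frame will start.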
cfunction`moreArg:
0x104491ab4 <+0>: sub sp, sp, #0x40 ; allocate stack space; call the original sp 'SP0'
; so SP = SP0 - 0x40
0x104491ab8 <+4>: ldr x8, [sp, #0x58]
0x104491abc <+8>: ldr w9, [sp, #0x50] ; w9 = [SP + 0x50] = [SP0 - 0x40 + 0x50] = [SP0 + 0x10]
; i.e. w9 = arg13
; by the same reasoning, the loads below fetch arg12 down to arg9
0x104491ac0 <+12>: ldr w10, [sp, #0x4c]
0x104491ac4 <+16>: ldr w11, [sp, #0x48]
0x104491ac8 <+20>: ldr w12, [sp, #0x44]
0x104491acc <+24>: ldr w13, [sp, #0x40] ; w13 = [SP + 0x40] = [SP0 - 0x40 + 0x40] = [SP0]
; i.e. w13 = arg9
0x104491ad0 <+28>: mov w14, #0x0
0x104491ad4 <+32>: str w0, [sp, #0x3c]
0x104491ad8 <+36>: str w1, [sp, #0x38]
0x104491adc <+40>: str w2, [sp, #0x34]
0x104491ae0 <+44>: str w3, [sp, #0x30]
0x104491ae4 <+48>: str w4, [sp, #0x2c]
0x104491ae8 <+52>: str w5, [sp, #0x28]
0x104491aec <+56>: str w6, [sp, #0x24]
0x104491af0 <+60>: str w7, [sp, #0x20]
0x104491af4 <+64>: str w13, [sp, #0x1c]
0x104491af8 <+68>: str w12, [sp, #0x18]
0x104491afc <+72>: str w11, [sp, #0x14]
0x104491b00 <+76>: str w10, [sp, #0x10]
0x104491b04 <+80>: str w9, [sp, #0xc]
0x104491b08 <+84>: str x8, [sp]
0x104491b0c <+88>: mov x0, x14
0x104491b10 <+92>: add sp, sp, #0x40 ; =0x40
0x104491b14 <+96>: ret
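So the arguments beyond the eighth are placed on the stack in order, starting at the caller's SP, each at its natural alignment (the int arguments here sit 4 bytes apart), as expected.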
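The offsets seen in the listing (0x0, 0x4, 0x8, 0xc, 0x10 for the five ints, then 0x18 for the pointer) can be reproduced with a few lines of C. This is a sketch under one assumption taken from the listing itself: each stack argument is aligned to its own size. Other AArch64 platforms may instead pad every argument to 8 bytes. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: given the sizes of the stack-passed arguments in
 * order, write each argument's offset from the caller's SP into offsets[]
 * and return the total number of bytes used.
 * Assumption: every argument is aligned to its own size.
 */
size_t stack_arg_offsets(const size_t *sizes, size_t count, size_t *offsets) {
    size_t next = 0;
    for (size_t i = 0; i < count; i++) {
        next = align_up(next, sizes[i]);
        offsets[i] = next;
        next += sizes[i];
    }
    return next;
}
```

For moreArg, the stack-passed arguments are arg9 to arg13 (4 bytes each) and arg14 (8 bytes), so the sizes array would be {4, 4, 4, 4, 4, 8}.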
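struct arguments and return values
The above covers primitive types. C has another kind of type, with no fixed size, that can also be passed by value: struct. Let's see how struct arguments are passed.
Small struct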
struct SmallStruct {
int arg1;
};
struct SmallStruct smallStructFunc(int arg1, struct SmallStruct arg2) {
struct SmallStruct s = arg2;
return s;
}
caller:
0x100791d24 <+204>: ldur w9, [x29, #-0x30]
0x100791d28 <+208>: mov x1, x9 ; x1 = arg2 !
; the struct contents are assigned directly to x1, which has plenty of room
0x100791d2c <+212>: ldr w9, [sp, #0x7c]
0x100791d30 <+216>: str w0, [sp, #0x64]
0x100791d34 <+220>: mov x0, x9 ; w0 = arg1
0x100791d38 <+224>: bl 0x100791b04 ; smallStructFunc at main.mm:32
cfunction`smallStructFunc:
0x1003b5b04 <+0>: sub sp, sp, #0x20 ; =0x20
0x1003b5b08 <+4>: mov x8, x1 ; x8 = arg2
0x1003b5b0c <+8>: str w8, [sp, #0x10]
0x1003b5b10 <+12>: str w0, [sp, #0xc]
0x1003b5b14 <+16>: ldr w8, [sp, #0x10]
0x1003b5b18 <+20>: str w8, [sp, #0x18]
0x1003b5b1c <+24>: ldr w8, [sp, #0x18]
0x1003b5b20 <+28>: mov x0, x8 ; x0 = x8 = arg2
; x0 is used directly as the struct return value
0x1003b5b24 <+32>: add sp, sp, #0x20 ; =0x20
0x1003b5b28 <+36>: ret
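So a small struct can be passed directly in a register, much like a primitive type.
Large struct
What if the struct is too large to fit in registers?
This is where X8's special role comes in (XR, the indirect result location). From here on we refer to X8 as XR.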
struct BigStruct {
int arg1; int arg2; int arg3; int arg4; int arg5; int arg6; int arg7; int arg8; int arg9; int arg10; int arg11; int arg12; int arg13; char *arg14;
};
struct BigStruct bigStructFunc(int arg1, struct BigStruct arg2) {
struct BigStruct s = arg2;
return s;
}
caller:
0x100791d3c <+228>: mov x9, x0
0x100791d40 <+232>: stur w9, [x29, #-0x38]
0x100791d44 <+236>: ldr x8, [sp, #0x80]
0x100791d48 <+240>: ldur q0, [x8, #0x78]
0x100791d4c <+244>: str q0, [x8, #0x30]
0x100791d50 <+248>: ldur q0, [x8, #0x68]
0x100791d54 <+252>: stur q0, [x29, #-0xa0]
0x100791d58 <+256>: ldur q0, [x8, #0x58]
0x100791d5c <+260>: stur q0, [x29, #-0xb0]
0x100791d60 <+264>: ldur q0, [x8, #0x48]
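0x100791d64 <+268>: stur q0, [x29, #-0xc0] ; the code above copies the temporary arg2 into the argument area for the callee
; so the callee cannot modify the original data
; for convenience, this copy is called arg2 below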
0x100791d68 <+272>: add x8, sp, #0xb0 ; XR = SP + 0xb0
; Callee save area
; an empty area used as temporary storage for the return value
0x100791d6c <+276>: sub x1, x29, #0xc0 ; x1 = FP - 0xc0 = &arg2
0x100791d70 <+280>: ldr w0, [sp, #0x7c] ; w0 = arg1
0x100791d74 <+284>: bl 0x100791b2c ; bigStructFunc at main.mm:36
cfunction`bigStructFunc:
0x1003b5b2c <+0>: sub sp, sp, #0x20 ; allocate stack space: SP = SP0 - 0x20
0x1003b5b30 <+4>: stp x29, x30, [sp, #0x10] ; unlike the earlier functions, this one makes a call, so FP and LR are saved on the stack
0x1003b5b34 <+8>: add x29, sp, #0x10
0x1003b5b38 <+12>: orr x2, xzr, #0x40 ; struct size = 0x40, passed as the third argument
0x1003b5b3c <+16>: stur w0, [x29, #-0x4]
0x1003b5b40 <+20>: mov x0, x8 ; dst = x0 = XR = SP0 + 0xb0
; the first argument, dst, is the caller's temporary storage area
; the second argument is x1, i.e. the caller's &arg2
0x1003b5b44 <+24>: bl 0x1003b62f0 ; symbol stub for: memcpy
; void *memcpy(void *dst, const void *src, size_t n);
; the copy is done with a plain call to memcpy!
0x1003b5b48 <+28>: ldp x29, x30, [sp, #0x10]
0x1003b5b4c <+32>: add sp, sp, #0x20 ; =0x20
0x1003b5b50 <+36>: ret
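The return value is thus written to the memory XR points to; the caller only needs to copy it from there into its local variable.
As you can see, handling a large struct involves several memory copies, which has some performance cost. Avoid passing large structs by value in such functions: pass a pointer or reference instead, or make the function inline so that the call is removed during optimization.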
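The advice above can be sketched in C as follows. The struct has the same layout as the article's BigStruct; the three function names are hypothetical and only illustrate the by-value, by-pointer, and out-parameter styles.

```c
#include <assert.h>
#include <stddef.h>

/* Same fields as the article's BigStruct (0x40 bytes on a 64-bit target). */
struct BigStruct {
    int arg1, arg2, arg3, arg4, arg5, arg6, arg7;
    int arg8, arg9, arg10, arg11, arg12, arg13;
    char *arg14;
};

/* By value: the caller first copies the whole struct for the callee. */
int first_two_by_value(struct BigStruct s) {
    return s.arg1 + s.arg2;
}

/* By pointer: only an 8-byte address travels in a register. */
int first_two_by_pointer(const struct BigStruct *s) {
    assert(s != NULL);
    return s->arg1 + s->arg2;
}

/* Out-parameter: fills caller-owned storage instead of returning a struct. */
void fill_big_struct(struct BigStruct *out, int first, int second) {
    assert(out != NULL);
    *out = (struct BigStruct){0};
    out->arg1 = first;
    out->arg2 = second;
}
```

Both accessor functions return the same value; the difference is only in how much data the caller has to copy before the call.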
Where the struct boundary lies
According to the Parameter Passing Rules section of AAPCS 64:
If the argument is a Composite Type and the size in double-words of the argument is not more than 8 minus NGRN, then the argument is copied into consecutive general-purpose registers, starting at x[NGRN]. The argument is passed as though it had been loaded into the registers from a double-word-aligned address with an appropriate sequence of LDR instructions loading consecutive registers from memory (the contents of any unused parts of the registers are unspecified by this standard). The NGRN is incremented by the number of registers used. The argument has now been allocated.
Roughly: if the registers still free among X0-X7 are enough to hold the struct, it is passed in registers; otherwise it is passed in memory.
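The quoted sentence boils down to a small size check, sketched below. The function names are hypothetical, and the sketch models only this one sentence: the full specification applies other rules first (for example, composites larger than 16 bytes are replaced by a pointer to a copy, which is what the BigStruct listing shows, and floating-point aggregates are handled separately), and none of that is modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of 8-byte double-words needed to hold size_bytes. */
static size_t size_in_doublewords(size_t size_bytes) {
    return (size_bytes + 7) / 8;
}

/*
 * Hypothetical helper for the quoted rule only.
 * ngrn is the Next General-purpose Register Number (0..8), i.e. how many
 * of X0-X7 are already taken by earlier arguments.
 */
bool composite_fits_in_gprs(size_t size_bytes, unsigned ngrn) {
    assert(ngrn <= 8);
    return size_in_doublewords(size_bytes) <= (size_t)(8 - ngrn);
}
```

For smallStructFunc, the int argument takes X0 (NGRN = 1) and the 4-byte SmallStruct needs one double-word, so it fits and lands in X1, matching the listing.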
If the type, T, of the result of a function is such that
void func(T arg)
would require that arg be passed as a value in a register (or set of registers) according to the rules in §5.4 Parameter Passing, then the result is returned in the same registers as would be used for such an argument.
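Return values follow the same rule.
The document quoted here is not the latest version and is a beta; I have not found a final release. Many other factors are involved as well, so we will not dig deeper here.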
va_list
Everything above used a fixed argument list. How are variadic arguments passed?
AAPCS 64 describes this explicitly, but here we look at it from the assembly side.
int mutableAragsFunc(int arg, ...) {
va_list list;
va_start(list, arg);
int ret = arg;
while(int a = va_arg(list, int)) {
ret += a;
}
va_end(list);
return ret;
}
mutableAragsFunc(1, 2, 3, 0);
Set a breakpoint at the function entry and print the argument registers:
x0 = 0x0000000000000001
x1 = 0x000000016fce7930
x2 = 0x000000016fce7a18
x3 = 0x000000016fce7a90
x4 = 0x0000000000000000
x5 = 0x0000000000000000
x6 = 0x0000000000000001
x7 = 0x00000000000004b0
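Apart from x0, which holds the correct first argument, the other registers contain leftover values, so the variadic arguments must have been placed on the stack.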
cfunction`main:
0x100121be4 <+0>: sub sp, sp, #0xa0 ; =0xa0
0x100121be8 <+4>: stp x29, x30, [sp, #0x90]
0x100121bec <+8>: add x29, sp, #0x90 ; =0x90
0x100121bf0 <+12>: mov w8, #0x0
0x100121bf4 <+16>: stur w8, [x29, #-0x4]
0x100121bf8 <+20>: stur w0, [x29, #-0x8]
0x100121bfc <+24>: stur x1, [x29, #-0x10]
0x100121c00 <+28>: mov x1, sp
0x100121c04 <+32>: mov x9, #0x0
0x100121c08 <+36>: str x9, [x1, #0x10] ; store 0 on the stack
0x100121c0c <+40>: orr w8, wzr, #0x3
0x100121c10 <+44>: mov x9, x8
0x100121c14 <+48>: str x9, [x1, #0x8] ; store 3 on the stack
0x100121c18 <+52>: orr w8, wzr, #0x2
0x100121c1c <+56>: mov x9, x8
0x100121c20 <+60>: str x9, [x1] ; store 2 on the stack
0x100121c24 <+64>: orr w0, wzr, #0x1 ; arg = 1
0x100121c28 <+68>: bl 0x1001218d8 ; mutableAragsFunc at main.mm:67
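In other words, the named parameters are passed by the rules described above, while all of the ... arguments are passed on the stack. This is the behaviour of Apple's arm64 ABI, which this listing comes from; the generic AAPCS 64 rules also allow variadic arguments in registers. It is easy to understand from the implementation side: to fetch the next va_arg, the code only needs to advance a pointer into the stack to the next slot (the slots are 8 bytes apart in the listing above).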
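For reference, here is a plain C version of the same idea. The article's mutableAragsFunc declares a variable inside the while condition, which only compiles as C++ (the file is main.mm); the sketch below uses a hypothetical name, sum_until_zero, and a form that is valid C. As in the original, a 0 argument acts as the terminator.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical C version: sums 'first' and every following int argument
 * until a 0 terminator is read. */
int sum_until_zero(int first, ...) {
    va_list list;
    int total = first;
    int next;

    va_start(list, first);
    /* Each va_arg call reads one int and advances to the next argument. */
    while ((next = va_arg(list, int)) != 0) {
        total += next;
    }
    va_end(list);
    return total;
}

/* Usage check mirroring the call in the article. */
void sum_until_zero_demo(void) {
    assert(sum_until_zero(1, 2, 3, 0) == 6);
}
```

The callee has no way to know how many arguments were passed, which is why the terminator (or some other count convention) is required.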
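Wrong function signatures
Now let's go back to the first question: why does C have function signatures?
The function signature determines how the arguments and the return value are passed, and also the layout and size of the stack frame. Without knowing the signature, we cannot know how to pass the arguments.
So what happens with a wrong signature? Will the program crash at run time? Let's see: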
int arg1_func(int a) {
return a;
}
int arg2_func(int a, int b) {
return a+b;
}
void arg_test_func() {
int ret1 = ((int (*)(int, int))arg1_func)(1, 2);
int ret2 = ((int (*)(int))arg2_func)(1);
int ret3 = ((int (*)())arg1_func)();
int ret4 = ((int (*)())arg2_func)();
printf("%d, %d, %d, %d\n", ret1, ret2, ret3, ret4);
}
The result first: everything runs normally, but some of the values are wrong. Let's look at the assembly:
cfunction`arg_test_func:
0x1003462cc <+0>: sub sp, sp, #0x50 ; =0x50
0x1003462d0 <+4>: stp x29, x30, [sp, #0x40]
0x1003462d4 <+8>: add x29, sp, #0x40 ; =0x40
; everything above sets up the stack frame
0x1003462d8 <+12>: orr w0, wzr, #0x1 ; w0 = 1
0x1003462dc <+16>: orr w1, wzr, #0x2 ; w1 = 2
0x1003462e0 <+20>: bl 0x100346298 ; arg1_func at main.mm:87
0x1003462e4 <+24>: orr w1, wzr, #0x1 ; w1 = 1
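0x1003462e8 <+28>: stur w0, [x29, #-0x4] ; store the result into the local ret1
; given the register state, this is equivalent to calling arg1_func(1)
; the result is correct, though perhaps not what was intended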
0x1003462ec <+32>: mov x0, x1 ; x0 = 1
0x1003462f0 <+36>: bl 0x1003462ac ; arg2_func at main.mm:90
0x1003462f4 <+40>: stur w0, [x29, #-0x8] ; store the result into the local ret2
; equivalent to arg2_func(1, 1) = 2
; the second argument is whatever x1 happened to hold
; so the result is effectively arbitrary
0x1003462f8 <+44>: bl 0x100346298 ; arg1_func at main.mm:87
0x1003462fc <+48>: stur w0, [x29, #-0xc] ; equivalent to ret3 = arg1_func(2) = 2
0x100346300 <+52>: bl 0x1003462ac ; arg2_func at main.mm:90
0x100346304 <+56>: stur w0, [x29, #-0x10] ; equivalent to ret4 = arg2_func(2, 1) = 3
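So the output should be 1, 2, 2, 3.
This result says nothing about what would happen in any other environment; the outcome is unpredictable. Calling a function through a pointer of an incompatible type is undefined behaviour in C. There is no crash here only because arbitrary integer arguments happen not to be dangerous for these two functions.
So we must not call a function through a different function signature.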
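objc_msgSend
Next, let's talk about the most famous function on iOS, objc_msgSend. It is the core and foundation of objc; without it there would be no objc.
From the analysis above, in theory we cannot change the function signature of objc_msgSend in order to pass arguments of different types and counts. So how does Apple do it?
We used to say that objc_msgSend is written in assembly for speed, but that is not the main reason: retain and release are also called extremely often, so why are they not written in assembly too? The more important reason is that objc_msgSend simply cannot be implemented in C!
Let's open Apple's objc source and look at the arm64.s assembly: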
ENTRY _objc_msgSend
MESSENGER_START
cmp x0, #0 // nil check and tagged pointer check
b.le LNilOrTagged // (MSB tagged pointer looks negative)
ldr x13, [x0] // x13 = isa
and x9, x13, #ISA_MASK // x9 = class
LGetIsaDone:
CacheLookup NORMAL // calls imp or objc_msgSend_uncached
LNilOrTagged:
b.eq LReturnZero // nil check
// tagged
adrp x10, _objc_debug_taggedpointer_classes@PAGE
add x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
ubfx x11, x0, #60, #4
ldr x9, [x10, x11, LSL #3]
b LGetIsaDone
LReturnZero:
// x0 is already zero
mov x1, #0
movi d0, #0
movi d1, #0
movi d2, #0
movi d3, #0
MESSENGER_END_NIL
ret
END_ENTRY _objc_msgSend
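Do you see how this differs from the assembly compiled from the C functions above?
objc_msgSend has no stack frame at all! And on the path to the method implementation it never modifies X0-X7, X8, LR, SP, or FP!
And when it finds the actual method on the object, it does not use BL as other calls do, but instead: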
.macro CacheHit
br x17 // call imp
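That is, LR is not modified. The effect is as if a piece of code had been inserted at the call site, which makes it look more like a C macro than a function.
Because objc_msgSend does not change any of the calling context, the real objc method runs as if it had been called directly.
It really is a brilliant idea.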
How objc_msgSend handles nil objects
Everyone knows that sending a message to nil returns 0. Why is that?
Look again at the objc_msgSend source; the very first instructions check for nil:
cmp x0, #0 // nil check and tagged pointer check
b.le LNilOrTagged // (MSB tagged pointer looks negative)
Tagged pointers are not our topic this time, so we jump straight to the handling of nil:
LReturnZero:
// x0 is already zero
mov x1, #0
movi d0, #0
movi d1, #0
movi d2, #0
movi d3, #0
MESSENGER_END_NIL
ret
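It writes 0 into every register that might hold the return value! (There are several registers because ARM supports vector operations, so in some cases more than one register carries the return value; see the official ARM documentation for details.)
So the return value can only be 0!
Wait, one type is missing: struct! For a result returned in memory, we saw above that its address is held in X8, yet there is no operation on X8 anywhere here. Let's write a demo to try it: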
void struct_objc_nil(Test *t) {
struct BigStruct retB;
printf("stack: %d,%d,%d,%d,%d,%d,\n", retB.arg1, retB.arg2, retB.arg3, retB.arg4, retB.arg5, retB.arg6);
retB = ((struct BigStruct(*)(Test *, SEL))objc_msgSend)(t, @selector(retStruct));
printf("msgSend: %d,%d,%d,%d,%d,%d,\n", retB.arg1, retB.arg2, retB.arg3, retB.arg4, retB.arg5, retB.arg6);
retB = [t retStruct];
printf("objc: %d,%d,%d,%d,%d,%d,\n", retB.arg1, retB.arg2, retB.arg3, retB.arg4, retB.arg5, retB.arg6);
}
First we enable the -Os optimization level (in an unoptimized build the stack space gets zeroed). Surprisingly, the result is:
stack: 50462976,185207048,0,0,0,0,
msgSend: 1,0,992,0,0,0,
objc: 0,0,0,0,0,0,
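The two calls return different results for the struct! From our reading of the source, the arbitrary values are what we would expect. So why the difference?
Let's look at the assembly again; I have marked the key part: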
cfunction`struct_objc_nil:
0x10097e754 <+0>: sub sp, sp, #0x90 ; =0x90
0x10097e758 <+4>: stp x20, x19, [sp, #0x70]
0x10097e75c <+8>: stp x29, x30, [sp, #0x80]
0x10097e760 <+12>: add x29, sp, #0x80 ; =0x80
0x10097e764 <+16>: bl 0x10097e9d4 ; symbol stub for: objc_retain
0x10097e768 <+20>: mov x19, x0
0x10097e76c <+24>: adr x0, #0x1730 ; "stack: %d,%d,%d,%d,%d,%d,\n"
0x10097e770 <+28>: nop
0x10097e774 <+32>: bl 0x10097e9f8 ; symbol stub for: printf
0x10097e778 <+36>: nop
0x10097e77c <+40>: ldr x20, #0x262c ; "retStruct"
0x10097e780 <+44>: add x8, sp, #0x30 ; =0x30
0x10097e784 <+48>: mov x0, x19
0x10097e788 <+52>: mov x1, x20
0x10097e78c <+56>: bl 0x10097e9b0 ; symbol stub for: objc_msgSend
0x10097e790 <+60>: ldp w8, w9, [sp, #0x30]
0x10097e794 <+64>: ldp w10, w11, [sp, #0x38]
0x10097e798 <+68>: ldp w12, w13, [sp, #0x40]
0x10097e79c <+72>: stp x12, x13, [sp, #0x20]
0x10097e7a0 <+76>: stp x10, x11, [sp, #0x10]
0x10097e7a4 <+80>: stp x8, x9, [sp]
0x10097e7a8 <+84>: adr x0, #0x170f ; "msgSend: %d,%d,%d,%d,%d,%d,\n"
0x10097e7ac <+88>: nop
0x10097e7b0 <+92>: bl 0x10097e9f8 ; symbol stub for: printf
//////////////////////////////////////////////////////////
-> 0x10097e7b4 <+96>: cbz x19, 0x10097e7d8 ; <+132> at main.mm:134
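; this means:
; IF X19 == NULL THEN
; GOTO 0x10097e7d8
; and 0x10097e7d8 is where the memory is zeroed!
; X19 was set at 0x10097e768 to the objc object, here 'nil'
; the first call to 'objc_msgSend' has no such check!
; (because of optimization, some of the logic differs from the source)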
//////////////////////////////////////////////////////////
0x10097e7b8 <+100>: add x8, sp, #0x30 ; =0x30
0x10097e7bc <+104>: mov x0, x19
0x10097e7c0 <+108>: mov x1, x20
0x10097e7c4 <+112>: bl 0x10097e9b0 ; symbol stub for: objc_msgSend
0x10097e7c8 <+116>: ldp w8, w9, [sp, #0x30]
0x10097e7cc <+120>: ldp w10, w11, [sp, #0x38]
0x10097e7d0 <+124>: ldp w12, w13, [sp, #0x40]
0x10097e7d4 <+128>: b 0x10097e800 ; <+172> at main.mm:135
; here is a block of zeroing code, covering exactly the local variable that receives the return value
0x10097e7d8 <+132>: mov w13, #0x0
0x10097e7dc <+136>: mov w12, #0x0
0x10097e7e0 <+140>: mov w11, #0x0
0x10097e7e4 <+144>: mov w10, #0x0
0x10097e7e8 <+148>: mov w9, #0x0
0x10097e7ec <+152>: mov w8, #0x0
0x10097e7f0 <+156>: stp xzr, xzr, [sp, #0x60]
0x10097e7f4 <+160>: stp xzr, xzr, [sp, #0x50]
0x10097e7f8 <+164>: stp xzr, xzr, [sp, #0x40]
0x10097e7fc <+168>: stp xzr, xzr, [sp, #0x30]
0x10097e800 <+172>: stp x12, x13, [sp, #0x20]
0x10097e804 <+176>: stp x10, x11, [sp, #0x10]
0x10097e808 <+180>: stp x8, x9, [sp]
0x10097e80c <+184>: adr x0, #0x16c8 ; "objc: %d,%d,%d,%d,%d,%d,\n"
0x10097e810 <+188>: nop
0x10097e814 <+192>: bl 0x10097e9f8 ; symbol stub for: printf
0x10097e818 <+196>: mov x0, x19
0x10097e81c <+200>: bl 0x10097e9c8 ; symbol stub for: objc_release
0x10097e820 <+204>: ldp x29, x30, [sp, #0x80]
0x10097e824 <+208>: ldp x20, x19, [sp, #0x70]
0x10097e828 <+212>: add sp, sp, #0x90 ; =0x90
0x10097e82c <+216>: ret
0x10097e830 <+220>: b 0x10097e834 ; <+224> at main.mm
0x10097e834 <+224>: mov x20, x0
0x10097e838 <+228>: mov x0, x19
0x10097e83c <+232>: bl 0x10097e9c8 ; symbol stub for: objc_release
0x10097e840 <+236>: mov x0, x20
0x10097e844 <+240>: bl 0x10097e98c ; symbol stub for: _Unwind_Resume
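Now we can see why the struct return value also becomes 0: the compiler inserted a check for us!
So the claim that "messaging a nil objc object always returns 0" only holds under certain conditions.
Summary
I spent a long time exploring this area in a state of confusion, but after many rounds of investigation and slow thinking I finally have a reasonably clear picture. There is a great deal of complexity at the low level and this is only a small part of it; I will write about the other parts when I have time.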