Implementation of Weak Properties in the ObjC Runtime (Part 2)

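Introduction

The previous article briefly analyzed how weak properties are stored, loaded, and destroyed, treating the SideTable struct as a black box. This article takes a closer look at the structure of SideTable.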

Observation

struct SideTable {
    spinlock_t slock;
    RefcountMap refcnts;
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void forceReset() { slock.forceReset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

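SideTable has three main parts:

  • weak_table (weak_table_t): the global hash table of weak references
  • refcnts (RefcountMap): the hash table of reference counts
  • slock (spinlock_t): the spin lock that keeps operations on the table atomic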
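
The struct comment mentions an address-ordered lock discipline for a pair of side tables. The idea can be sketched roughly as follows. ToyLock, lockTwoOrdered and unlockTwo are hypothetical names invented for this illustration, and the body is an assumption about the general technique, not the runtime's actual code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for spinlock_t that records the order of lock calls.
struct ToyLock {
    std::vector<const ToyLock *> *log;  // shared log of acquisition order
    bool held;

    explicit ToyLock(std::vector<const ToyLock *> *l) : log(l), held(false) {}

    void lock()   { assert(!held); held = true; log->push_back(this); }
    void unlock() { assert(held);  held = false; }
};

// Always take the lower-addressed lock first, so two threads that lock the
// same pair in opposite argument order cannot deadlock each other.
inline void lockTwoOrdered(ToyLock *a, ToyLock *b) {
    if (std::less<const ToyLock *>()(a, b)) {
        a->lock();
        b->lock();
    } else {
        b->lock();
        if (a != b) a->lock();  // same lock passed twice: take it only once
    }
}

inline void unlockTwo(ToyLock *a, ToyLock *b) {
    a->unlock();
    if (a != b) b->unlock();
}
```

Whatever order the two locks are passed in, the acquisition log always lists the lower address first.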

The function static id storeWeak(id *location, objc_object *newObj) contains:

// Assign new value, if any.
if (haveNew) {
   newObj = (objc_object *)
       weak_register_no_lock(&newTable->weak_table, (id)newObj, location, 
                             crashIfDeallocating);
   // weak_register_no_lock returns nil if weak store should be rejected

   // Set is-weakly-referenced bit in refcount table.
   if (newObj  &&  !newObj->isTaggedPointer()) {
       newObj->setWeaklyReferenced_nolock();
   }

   // Do not set *location anywhere else. That would introduce a race.
   *location = (id)newObj;
}

This shows that storing a weak reference mainly comes down to the weak_table member.

Test code

#import <Foundation/Foundation.h>

@interface WeakProperty : NSObject

@property (nonatomic,weak) NSObject *obj;
@property (nonatomic,weak) NSObject *obj2;
@property (nonatomic,weak) NSObject *obj3;
@property (nonatomic,weak) NSObject *obj4;
@property (nonatomic,weak) NSObject *obj5;

@end

@implementation WeakProperty

- (void)dealloc {
    NSLog(@"%s",__func__);
}

@end


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        WeakProperty *property = [[WeakProperty alloc] init];
        NSObject *obj = [[NSObject alloc] init];
        property.obj = obj;
        NSLog(@"%@",property.obj);
        
        // [1]
        property.obj2 = obj;
        
        // [2]
        property.obj3 = obj;
        property.obj4 = obj;
        property.obj5 = obj;
        
        // [3]
        property.obj = nil;        
    }
    return 0;
}

Struct: weak_table_t

weak_table_t is defined in objc-weak.h:

/**
 * The global weak references table. Stores object ids as keys,
 * and weak_entry_t structs as their values.
 */
struct weak_table_t {
    weak_entry_t *weak_entries;
    size_t    num_entries;
    uintptr_t mask;
    uintptr_t max_hash_displacement;
};

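Notes:

  • weak_entries: a pointer to the array of weak_entry_t slots
  • num_entries: a size_t (an unsigned integer type, unsigned long on Apple platforms) counting the entries currently stored in the hash table
  • mask: a uintptr_t (unsigned long) bit mask, equal to the table capacity minus 1
  • max_hash_displacement: a uintptr_t (unsigned long) recording the largest probe distance any insertion has needed so far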
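
To make the role of mask concrete, here is a minimal sketch. ToyTable, tableSize and slotFor are hypothetical names, and the sketch assumes the capacity is always a power of two with mask equal to capacity minus 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical miniature of the bookkeeping fields in weak_table_t.
struct ToyTable {
    std::size_t    num_entries = 0;  // entries currently stored
    std::uintptr_t mask = 0;         // capacity - 1, or 0 before allocation
};

// Capacity implied by the mask.
inline std::size_t tableSize(const ToyTable &t) {
    return t.mask ? t.mask + 1 : 0;
}

// Map a hash value to a slot index. For a power-of-two capacity this is
// equivalent to hash % capacity, but needs only a single AND.
inline std::size_t slotFor(const ToyTable &t, std::uintptr_t hash) {
    return hash & t.mask;
}
```

With mask = 63, for example, every hash value lands in a slot between 0 and 63.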

Struct: weak_entry_t

struct weak_entry_t {
    DisguisedPtr<objc_object> referent;
    union {
        struct {
            weak_referrer_t *referrers;
            uintptr_t        out_of_line_ness : 2;
            uintptr_t        num_refs : PTR_MINUS_2;
            uintptr_t        mask;
            uintptr_t        max_hash_displacement;
        };
        struct {
            // out_of_line_ness field is low bits of inline_referrers[1]
            weak_referrer_t  inline_referrers[WEAK_INLINE_COUNT];
        };
    };

    bool out_of_line() {
        return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
    }

    weak_entry_t& operator=(const weak_entry_t& other) {
        memcpy(this, &other, sizeof(other));
        return *this;
    }

    weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
        : referent(newReferent)
    {
        inline_referrers[0] = newReferrer;
        for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
            inline_referrers[i] = nil;
        }
    }
};

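In C++, a struct is a data type defined with the keyword struct. Its members and base classes are public by default, whereas the members and base classes of a type defined with the keyword class are private by default. This is the only difference between a struct and a class in C++.

Class: DisguisedPtr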
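
A tiny illustration of that default-access difference. Both type names are made up for this example.

```cpp
#include <cassert>

struct OpenByDefault {      // members are public unless stated otherwise
    int value = 1;
};

class ClosedByDefault {     // members are private unless stated otherwise
    int value = 2;
public:
    int get() const { return value; }  // explicit public accessor
};
```

Reading OpenByDefault's value directly compiles, while ClosedByDefault's value can only be reached through get().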

// DisguisedPtr<T> acts like pointer type T*, except the 
// stored value is disguised to hide it from tools like `leaks`.
// nil is disguised as itself so zero-filled memory works as expected, 
// which means 0x80..00 is also disguised as itself but we don't care.
// Note that weak_entry_t knows about this encoding.
template <typename T>
class DisguisedPtr {
    uintptr_t value; // unsigned long

    static uintptr_t disguise(T* ptr) {
        return -(uintptr_t)ptr;
    }

    static T* undisguise(uintptr_t val) {
        return (T*)-val;
    }

 public:
    DisguisedPtr() { }
    DisguisedPtr(T* ptr) 
        : value(disguise(ptr)) { }
    DisguisedPtr(const DisguisedPtr<T>& ptr) 
        : value(ptr.value) { }

    DisguisedPtr<T>& operator = (T* rhs) {
        value = disguise(rhs);
        return *this;
    }
    DisguisedPtr<T>& operator = (const DisguisedPtr<T>& rhs) {
        value = rhs.value;
        return *this;
    }

    // Overloaded pointer operators
    operator T* () const {
        return undisguise(value);
    }
    T* operator -> () const { 
        return undisguise(value);
    }
    T& operator * () const { 
        return *undisguise(value);
    }
    T& operator [] (size_t i) const {
        return undisguise(value)[i];
    }

    // pointer arithmetic operators omitted 
    // because we don't currently use them anywhere
};
// The address of a __weak variable.
// These pointers are stored disguised so memory analysis tools
// don't see lots of interior pointers from the weak table into objects.
typedef DisguisedPtr<objc_object *> weak_referrer_t;

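Summary

weak_entry_t contains a DisguisedPtr<objc_object>. "Disguised" means the stored value is masked; according to the comment, DisguisedPtr<T> can be treated just like the pointer type T *, so in this context it can be read as a pointer to objc_object.

weak_referrer_t, i.e. DisguisedPtr<objc_object *>, can be read as the address of a pointer to objc_object (objc_object **).

Next comes a union in which out_of_line_ness shares its 2 bits with the low 2 bits of inline_referrers[1], because: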
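
The disguise itself is just a negation of the address. A simplified, standalone sketch of the round trip follows; the free functions disguise and undisguise are a reduction written for this illustration, not the class itself.

```cpp
#include <cassert>
#include <cstdint>

// Store the negated address. Unsigned arithmetic wraps modulo 2^N,
// so the subtraction is well defined for every input.
template <typename T>
std::uintptr_t disguise(T *ptr) {
    return std::uintptr_t(0) - reinterpret_cast<std::uintptr_t>(ptr);
}

// Negating again recovers the original address.
template <typename T>
T *undisguise(std::uintptr_t value) {
    return reinterpret_cast<T *>(std::uintptr_t(0) - value);
}
```

A null pointer disguises to 0, which is why zero-filled memory still reads as nil, while any other address no longer looks like a pointer to a memory-scanning tool.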

// out_of_line_ness field overlaps with the low two bits of inline_referrers[1].
// inline_referrers[1] is a DisguisedPtr of a pointer-aligned address.
// The low two bits of a pointer-aligned DisguisedPtr will always be 0b00
// (disguised nil or 0x80..00) or 0b11 (any other address).
// Therefore out_of_line_ness == 0b10 is used to mark the out-of-line state.
#define REFERRERS_OUT_OF_LINE 2

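The comment explains that the low 2 bits of a pointer-aligned DisguisedPtr are either 0b00 or 0b11, so the out-of-line state has to be marked with a value outside that set; the runtime uses out_of_line_ness == 0b10. (out_of_line() returns false when out_of_line_ness is 0b01 and true when it is 0b10.)

num_refs occupies 30 bits on 32-bit systems and 62 bits on 64-bit systems.
mask and max_hash_displacement also appear in weak_table_t.

The out_of_line() method has effectively been covered above.

weak_entry_t& operator=(const weak_entry_t& other) {...} overloads the assignment operator =. The memcpy manual page says: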
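
A quick way to convince yourself that the marker value cannot collide with an inline pointer is to check the low bits of disguised, aligned addresses. This sketch assumes the negation-based disguise shown earlier; kOutOfLine, lowTwoBitsDisguised and markerNeverCollides are hypothetical names. Under plain negation an aligned address keeps low bits 0b00 (the quoted comment additionally lists 0b11); in either case 0b10 never appears.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kOutOfLine = 2;  // mirrors REFERRERS_OUT_OF_LINE

// Low two bits of a negation-disguised address.
constexpr std::uintptr_t lowTwoBitsDisguised(std::uintptr_t address) {
    return (std::uintptr_t(0) - address) & 0x3;
}

// True if no 4-byte-aligned address below `limit` disguises to a value
// whose low two bits equal the out-of-line marker.
inline bool markerNeverCollides(std::uintptr_t limit) {
    for (std::uintptr_t a = 0; a < limit; a += 4) {
        if (lowTwoBitsDisguised(a) == kOutOfLine) return false;
    }
    return true;
}
```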

The memcpy() function copies n bytes from memory area src
to memory area dest. The memory areas may not overlap.
Use memmove(3) if the memory areas do overlap.

In other words, sizeof(other) bytes are copied from the start of the object that other refers to, to the start of the current object pointed to by this.
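
The same pattern in a standalone form, as a sketch. ToyEntry and its Payload are hypothetical. The self-assignment guard and the static_assert are additions of this sketch, included because memcpy requires non-overlapping, trivially copyable data; they are not part of the runtime code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct ToyEntry {
    struct Payload {
        std::uintptr_t referent;
        std::uintptr_t slots[4];
    } data;

    // Assignment implemented as a raw byte copy of the payload.
    ToyEntry &operator=(const ToyEntry &other) {
        if (this != &other) {  // memcpy must not be given overlapping areas
            std::memcpy(&data, &other.data, sizeof(data));
        }
        return *this;
    }
};

static_assert(std::is_trivially_copyable<ToyEntry::Payload>::value,
              "byte-wise copy is only safe for trivially copyable data");
```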

weak_entry_t(objc_object *newReferent, objc_object **newReferrer) : referent(newReferent)

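: referent(newReferent) is a member initializer list: it initializes the struct's referent member from the parameter newReferent. In the union, inline_referrers[0] receives the parameter newReferrer, and the remaining slots 1, 2, and 3 are set to nil.

Function: weak_register_no_lock

Comment out [2] and [3] in the test code.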

/// Adds an (object, weak pointer) pair to the weak table.
id weak_register_no_lock(weak_table_t *weak_table, id referent, 
                         id *referrer, bool crashIfDeallocating);

The implementation is as follows:

/** 
 * Registers a new (object, weak pointer) pair. Creates a new weak
 * object entry if it does not exist.
 * 
 * @param weak_table The global weak table.
 * @param referent The object pointed to by the weak reference.
 * @param referrer The weak pointer address.
 */
id 
weak_register_no_lock(weak_table_t *weak_table, id referent_id, 
                      id *referrer_id, bool crashIfDeallocating)
{
    objc_object *referent = (objc_object *)referent_id;
    objc_object **referrer = (objc_object **)referrer_id;

    if (!referent  ||  referent->isTaggedPointer()) return referent_id;

    // ensure that the referenced object is viable
    bool deallocating;
    if (!referent->ISA()->hasCustomRR()) {
        deallocating = referent->rootIsDeallocating();
    }
    else {
        BOOL (*allowsWeakReference)(objc_object *, SEL) = 
            (BOOL(*)(objc_object *, SEL))
            object_getMethodImplementation((id)referent, 
                                           SEL_allowsWeakReference);
        if ((IMP)allowsWeakReference == _objc_msgForward) {
            return nil;
        }
        deallocating =
            ! (*allowsWeakReference)(referent, SEL_allowsWeakReference);
    }

    if (deallocating) {
        if (crashIfDeallocating) {
            _objc_fatal("Cannot form weak reference to instance (%p) of "
                        "class %s. It is possible that this object was "
                        "over-released, or is in the process of deallocation.",
                        (void*)referent, object_getClassName((id)referent));
        } else {
            return nil;
        }
    }

    // now remember it and where it is being stored
    weak_entry_t *entry;
    if ((entry = weak_entry_for_referent(weak_table, referent))) {
        append_referrer(entry, referrer);
    } 
    else {          
        weak_entry_t new_entry(referent, referrer);
        weak_grow_maybe(weak_table);
        weak_entry_insert(weak_table, &new_entry);
    }

    // Do not set *referrer. objc_storeWeak() requires that the 
    // value not change.

    return referent_id;
}

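The object assigned to the weak property, obj, is passed to weak_register_no_lock as the parameter referent_id and assigned to the local variable referent; location is passed as the parameter referrer_id and assigned to the local variable referrer.

The function then performs a few checks, such as whether the object allows weak references and whether it is still viable (not deallocating). After that, it looks for an existing entry with weak_entry_for_referent: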

/** 
 * Return the weak reference table entry for the given referent. 
 * If there is no entry for referent, return NULL. 
 * Performs a lookup.
 *
 * @param weak_table 
 * @param referent The object. Must not be nil.
 * 
 * @return The table of weak referrers to this object. 
 */
static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
    assert(referent);

    weak_entry_t *weak_entries = weak_table->weak_entries;

    if (!weak_entries) return nil;

    size_t begin = hash_pointer(referent) & weak_table->mask;
    size_t index = begin;
    size_t hash_displacement = 0;
    while (weak_table->weak_entries[index].referent != referent) {
        index = (index+1) & weak_table->mask;
        if (index == begin) bad_weak_table(weak_table->weak_entries);
        hash_displacement++;
        if (hash_displacement > weak_table->max_hash_displacement) {
            return nil;
        }
    }
    
    return &weak_table->weak_entries[index];
}

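With referent as the key, the function hashes the pointer to a starting index in weak_table's weak_entries array and probes linearly from there, comparing each entry's referent member. If nothing is found, weak_register_no_lock takes the else branch: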

weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
   : referent(newReferent)
{
   inline_referrers[0] = newReferrer;
   for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
       inline_referrers[i] = nil;
   }
}

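This initializes the weak_entry_t struct.

The pointers are disguised through a cast-and-negate conversion. inline_referrers[0] receives newReferrer (that is, referrer), the address of the weak variable. At this point *referrer is still nil, because storeWeak only writes *location after registration succeeds. The remaining elements of inline_referrers are nil, and since they are stored as unsigned long values, that is simply 0.

Function: weak_grow_maybe

When the weak-reference hash table is at least 3/4 full, it is grown: the capacity doubles, or starts at 64 if the table was empty.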

// Grow the given zone's table of weak references if it is full.
static void weak_grow_maybe(weak_table_t *weak_table)
{
    size_t old_size = TABLE_SIZE(weak_table);

    // Grow if at least 3/4 full.
    if (weak_table->num_entries >= old_size * 3 / 4) {
        weak_resize(weak_table, old_size ? old_size*2 : 64);
    }
}
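
The growth rule can be isolated into two small pure functions. This is a sketch that simply restates the condition above; shouldGrow and grownCapacity are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Grow once the table is at least 3/4 full. An unallocated table
// (capacity 0) also satisfies the condition, since 0 >= 0.
inline bool shouldGrow(std::size_t numEntries, std::size_t capacity) {
    return numEntries >= capacity * 3 / 4;
}

// Double the capacity, starting from 64 slots for an empty table.
inline std::size_t grownCapacity(std::size_t capacity) {
    return capacity ? capacity * 2 : 64;
}
```

For a 64-slot table the threshold is 48 entries: 47 entries leave the table alone, and the 48th triggers a resize to 128.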

Function: weak_entry_insert

Adds an entry to the weak-reference hash table.

/** 
 * Add new_entry to the object's table of weak references.
 * Does not check whether the referent is already in the table.
 */
static void weak_entry_insert(weak_table_t *weak_table, weak_entry_t *new_entry)
{
    weak_entry_t *weak_entries = weak_table->weak_entries;
    assert(weak_entries != nil);

    size_t begin = hash_pointer(new_entry->referent) & (weak_table->mask);
    size_t index = begin;
    size_t hash_displacement = 0;
    while (weak_entries[index].referent != nil) {
        index = (index+1) & weak_table->mask;
        if (index == begin) bad_weak_table(weak_entries);
        hash_displacement++;
    }

    weak_entries[index] = *new_entry;
    weak_table->num_entries++;

    if (hash_displacement > weak_table->max_hash_displacement) {
        weak_table->max_hash_displacement = hash_displacement;
    }
}

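The function takes the referent member of new_entry, i.e. the object that the weak property obj refers to. The DisguisedPtr converts implicitly back to a plain pointer, whose address is hashed by hash_pointer. The result is masked with weak_table->mask (63 = 0b111111 for the initial 64-slot table), keeping the low 6 bits of the hash as the index. The loop while (weak_entries[index].referent != nil) {...} then resolves hash collisions by stepping to the next slot. Finally the entry is stored in the table and num_entries is incremented.

As shown in the figure above, the newObj and location arguments of static id storeWeak(id *location, objc_object *newObj) end up in the referent member and in the first slot of the inline_referrers array of a weak_entry_t, respectively.

The loop that checks whether referent already exists is: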
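
The insert and lookup loops follow the classic open-addressing pattern with linear probing, where lookups are bounded by the largest displacement seen so far. A self-contained sketch follows; ToyProbeTable is hypothetical, its hash is a placeholder rather than the runtime's hash_pointer, and the bad_weak_table wrap-around check is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy table: keys are non-zero integers, 0 marks an empty slot.
struct ToyProbeTable {
    std::vector<std::uintptr_t> slots;
    std::size_t mask;
    std::size_t maxDisplacement;

    explicit ToyProbeTable(std::size_t capacity)  // capacity: power of two
        : slots(capacity, 0), mask(capacity - 1), maxDisplacement(0) {}

    // Placeholder hash, an assumption made for this sketch.
    static std::size_t hash(std::uintptr_t key) { return key >> 3; }

    // Assumes the key is absent and the table still has a free slot.
    void insert(std::uintptr_t key) {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (slots[index] != 0) {      // collision: try the next slot
            index = (index + 1) & mask;
            ++displacement;
        }
        slots[index] = key;
        if (displacement > maxDisplacement) maxDisplacement = displacement;
    }

    // Returns the slot index, or -1 if the key is not present.
    long find(std::uintptr_t key) const {
        std::size_t index = hash(key) & mask;
        std::size_t displacement = 0;
        while (slots[index] != key) {
            index = (index + 1) & mask;
            if (++displacement > maxDisplacement) return -1;  // give up early
        }
        return static_cast<long>(index);
    }
};
```

Because no key was ever placed further than maxDisplacement slots from its home index, a lookup can stop after that many probes instead of scanning the whole table.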

while (weak_table->weak_entries[index].referent != referent) {
   index = (index+1) & weak_table->mask;
   if (index == begin) bad_weak_table(weak_table->weak_entries);
   hash_displacement++;
   if (hash_displacement > weak_table->max_hash_displacement) {
       return nil;
   }
}

Comment out [2] and [3].

On entering append_referrer:

if (! entry->out_of_line()) {
   // Try to insert inline.
   for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
       if (entry->inline_referrers[i] == nil) {
           entry->inline_referrers[i] = new_referrer;
           return;
       }
   }

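Because entry->out_of_line() is false, the function tries to add the referrer to the entry->inline_referrers array.

Uncomment [2]. The inline array already holds 4 referrers, so storing obj5 triggers the expansion: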

// Couldn't insert inline. Allocate out of line.
weak_referrer_t *new_referrers = (weak_referrer_t *)
  calloc(WEAK_INLINE_COUNT, sizeof(weak_referrer_t));
// This constructed table is invalid, but grow_refs_and_insert
// will fix it and rehash it.
for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
  new_referrers[i] = entry->inline_referrers[i];
}
entry->referrers = new_referrers;
entry->num_refs = WEAK_INLINE_COUNT;
entry->out_of_line_ness = REFERRERS_OUT_OF_LINE;

This sets entry->out_of_line_ness to REFERRERS_OUT_OF_LINE.

Combined with the comment:

/**
 * The internal structure stored in the weak references table. 
 * It maintains and stores
 * a hash set of weak references pointing to an object.
 * If out_of_line_ness != REFERRERS_OUT_OF_LINE then the set
 * is instead a small inline array.
 */

So when an object has no more than 4 weak references, they are stored in an inline array; once there are more than 4, they are stored in a hash table.

Function: weak_unregister_no_lock
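
This is the familiar small-buffer pattern. A rough sketch of the switch-over follows. ToyReferrerSet is hypothetical: it uses std::unordered_set and separate members where the runtime uses its own hash set and a union, so it only illustrates the behavior.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Up to 4 referrers live inline; the 5th moves everything to a hash set.
class ToyReferrerSet {
    static constexpr std::size_t kInlineCount = 4;  // mirrors WEAK_INLINE_COUNT
    std::array<void **, kInlineCount> inlineSlots{};  // nullptr = empty slot
    std::unordered_set<void **> spilled;
    bool outOfLine = false;

public:
    bool isOutOfLine() const { return outOfLine; }

    void add(void **referrer) {
        if (!outOfLine) {
            for (auto &slot : inlineSlots) {
                if (slot == nullptr) { slot = referrer; return; }
            }
            // Inline array is full: migrate its contents to the hash set.
            spilled.insert(inlineSlots.begin(), inlineSlots.end());
            outOfLine = true;
        }
        spilled.insert(referrer);
    }

    std::size_t count() const {
        if (outOfLine) return spilled.size();
        std::size_t n = 0;
        for (auto slot : inlineSlots) {
            if (slot) ++n;
        }
        return n;
    }
};
```

Most objects have only a handful of weak references, so the inline array avoids a separate allocation in the common case.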

/// Removes an (object, weak pointer) pair from the weak table.
void weak_unregister_no_lock(weak_table_t *weak_table, id referent, id *referrer);

The implementation is as follows:

/** 
 * Unregister an already-registered weak reference.
 * This is used when referrer's storage is about to go away, but referent
 * isn't dead yet. (Otherwise, zeroing referrer later would be a
 * bad memory access.)
 * Does nothing if referent/referrer is not a currently active weak reference.
 * Does not zero referrer.
 * 
 * FIXME currently requires old referent value to be passed in (lame)
 * FIXME unregistration should be automatic if referrer is collected
 * 
 * @param weak_table The global weak table.
 * @param referent The object.
 * @param referrer The weak reference.
 */
void
weak_unregister_no_lock(weak_table_t *weak_table, id referent_id, 
                        id *referrer_id)
{
    objc_object *referent = (objc_object *)referent_id;
    objc_object **referrer = (objc_object **)referrer_id;

    weak_entry_t *entry;

    if (!referent) return;

    if ((entry = weak_entry_for_referent(weak_table, referent))) {
        remove_referrer(entry, referrer);
        bool empty = true;
        if (entry->out_of_line()  &&  entry->num_refs != 0) {
            empty = false;
        }
        else {
            for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
                if (entry->inline_referrers[i]) {
                    empty = false; 
                    break;
                }
            }
        }

        if (empty) {
            weak_entry_remove(weak_table, entry);
        }
    }

    // Do not set *referrer = nil. objc_storeWeak() requires that the 
    // value not change.
}

Here too, weak_entry_for_referent is used to check whether the weak reference exists.

Comment out [1] and [2], and uncomment [3].

/** 
 * Remove old_referrer from set of referrers, if it's present.
 * Does not remove duplicates, because duplicates should not exist. 
 * 
 * @todo this is slow if old_referrer is not present. Is this ever the case? 
 *
 * @param entry The entry holding the referrers.
 * @param old_referrer The referrer to remove. 
 */
static void remove_referrer(weak_entry_t *entry, objc_object **old_referrer)
{
    if (! entry->out_of_line()) {
        for (size_t i = 0; i < WEAK_INLINE_COUNT; i++) {
            if (entry->inline_referrers[i] == old_referrer) {
                entry->inline_referrers[i] = nil;
                return;
            }
        }
        ...
        objc_weak_error();
        return;
    }

    size_t begin = w_hash_pointer(old_referrer) & (entry->mask);
    size_t index = begin;
    size_t hash_displacement = 0;
    while (entry->referrers[index] != old_referrer) {
        index = (index+1) & entry->mask;
        if (index == begin) bad_weak_table(entry);
        hash_displacement++;
        if (hash_displacement > entry->max_hash_displacement) {
            ...
            return;
        }
    }
    entry->referrers[index] = nil;
    entry->num_refs--;
}

Removal compares against the referrer address: when a slot holds the same address, it is set to nil, which is what removes the reference.

Function: weak_clear_no_lock
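
For the inline case, removal and the follow-up "is the entry now empty?" check in weak_unregister_no_lock can be sketched like this. InlineSlots, removeInline and allEmpty are hypothetical helpers written for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using InlineSlots = std::array<void **, 4>;  // 4 mirrors WEAK_INLINE_COUNT

// Clears the slot holding `referrer`; returns false if it was not present.
inline bool removeInline(InlineSlots &slots, void **referrer) {
    for (auto &slot : slots) {
        if (slot == referrer) {
            slot = nullptr;
            return true;
        }
    }
    return false;
}

// An entry with no remaining referrers can be dropped from the table.
inline bool allEmpty(const InlineSlots &slots) {
    for (auto slot : slots) {
        if (slot) return false;
    }
    return true;
}
```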

/// Called on object destruction. Sets all remaining weak pointers to nil.
void weak_clear_no_lock(weak_table_t *weak_table, id referent);

The implementation is as follows:

/** 
 * Called by dealloc; nils out all weak pointers that point to the 
 * provided object so that they can no longer be used.
 * 
 * @param weak_table 
 * @param referent The object being deallocated. 
 */
void 
weak_clear_no_lock(weak_table_t *weak_table, id referent_id) 
{
    objc_object *referent = (objc_object *)referent_id;

    weak_entry_t *entry = weak_entry_for_referent(weak_table, referent);
    if (entry == nil) {
        /// XXX shouldn't happen, but does with mismatched CF/objc
        //printf("XXX no entry for clear deallocating %p\n", referent);
        return;
    }

    // zero out references
    weak_referrer_t *referrers;
    size_t count;
    
    if (entry->out_of_line()) {
        referrers = entry->referrers;
        count = TABLE_SIZE(entry);
    } 
    else {
        referrers = entry->inline_referrers;
        count = WEAK_INLINE_COUNT;
    }
    
    for (size_t i = 0; i < count; ++i) {
        objc_object **referrer = referrers[i];
        if (referrer) {
            if (*referrer == referent) {
                *referrer = nil;
            }
            else if (*referrer) {
                _objc_inform("__weak variable at %p holds %p instead of %p. "
                             "This is probably incorrect use of "
                             "objc_storeWeak() and objc_loadWeak(). "
                             "Break on objc_weak_error to debug.\n", 
                             referrer, (void*)*referrer, (void*)referent);
                objc_weak_error();
            }
        }
    }
    
    weak_entry_remove(weak_table, entry);
}

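weak_clear_no_lock is called from dealloc. From the comment and the code, it likewise looks up the entry with referent as the key, then walks the referrers and sets each weak pointer to nil. In my tests, however, execution always took the if (entry == nil) branch and returned immediately.

Conclusion

Weak references are looked up by referent. On first registration, the object and the address of the weak variable are stored in the referent member and in inline_referrers[0] of a weak_entry_t. As more references are added, up to 4 are kept in the inline_referrers array; beyond 4 they are stored in a hash table. On removal, the matching referrer is cleared from inline_referrers, or from the hash table once the entry is out of line.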
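
The zeroing loop at the heart of this function can be reduced to the following sketch. ToyObject and clearWeakVariables are hypothetical, and the diagnostic branch for mismatched values is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToyObject { int payload; };

// Nil out every registered weak variable that still points at `referent`.
// Returns how many variables were cleared.
inline std::size_t clearWeakVariables(
        const std::vector<ToyObject **> &referrers, ToyObject *referent) {
    std::size_t cleared = 0;
    for (ToyObject **referrer : referrers) {
        // Skip empty slots and variables that now hold something else.
        if (referrer && *referrer == referent) {
            *referrer = nullptr;
            ++cleared;
        }
    }
    return cleared;
}
```

Storing the address of each weak variable, rather than its value, is what makes this possible: the table can reach back and overwrite the variables themselves.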
