Implementation of Weak Properties in the ObjC Runtime (Part 1)

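Preface

How are weak properties implemented in Objective-C, and why do they automatically become nil once the object is released? This article explores that question.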

Environment

macOS Sierra 10.12.4
objc709

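Reference answer

A search turned up an article titled "How does the runtime implement weak properties", which offers a reference answer:

The runtime lays out every registered class and puts weak objects into a hash table, using the memory address of the object that the weak reference points to as the key. When that object's reference count reaches 0, it is dealloc'ed. If the address of the object the weak reference points to is a, the runtime searches the weak table with a as the key, finds all weak objects stored under a, and sets them to nil.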
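The mechanism described in that answer can be sketched in a few lines of C++. This is only a toy model under stated assumptions: ToyWeakTable, registerWeak and clearOnDealloc are hypothetical names, and std::unordered_map stands in for whatever hash table the runtime really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical toy model, not the runtime's data structure.
// Key:   address of the referenced object.
// Value: addresses of the weak pointers that currently refer to it.
struct ToyWeakTable {
    std::unordered_map<void *, std::vector<void **>> entries;

    // Record that the weak pointer stored at `location` refers to `object`.
    void registerWeak(void *object, void **location) {
        assert(location != nullptr);
        *location = object;
        entries[object].push_back(location);
    }

    // Called when `object` is being deallocated: set every registered
    // weak pointer to nullptr and drop the entry.
    // Returns how many weak pointers were cleared.
    std::size_t clearOnDealloc(void *object) {
        auto it = entries.find(object);
        if (it == entries.end()) return 0;
        std::size_t cleared = it->second.size();
        for (void **location : it->second) {
            *location = nullptr;
        }
        entries.erase(it);
        return cleared;
    }
};
```

Registering two weak pointers to the same object and then calling clearOnDealloc for that object leaves both pointers null, which is the "automatically becomes nil" behavior the question asks about.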

Test

Code

#import <Foundation/Foundation.h>

@interface WeakProperty : NSObject

@property (nonatomic, weak) NSObject *obj;

@end

@implementation WeakProperty

- (void)dealloc {
    NSLog(@"%s", __func__);
}

@end


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        WeakProperty *property = [[WeakProperty alloc] init];
        NSObject *obj = [[NSObject alloc] init];
        property.obj = obj;
        NSLog(@"%@", property.obj);

        // Triggers the function ``id objc_initWeak(id *location, id newObj)``
        // NSObject *obj = [[NSObject alloc] init];
        // __weak NSObject *obj2 = obj;
        // Triggers the function ``void objc_copyWeak(id *dst, id *src)``
        // __weak NSObject *obj3 = obj2;
    }
    return 0;
}

Results

When the setter of the object's weak property is called

  • id objc_storeWeak(id *location, id newObj) is called
  • static id storeWeak(id *location, objc_object *newObj) is called

When NSLog prints the property.obj property

  • id objc_loadWeakRetained(id *location) is called

When the object is released (dealloc)

  • void objc_destroyWeak(id *location) is called

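Related functions

The NSObject.mm source shows that

  • id objc_storeWeak(id *location, id newObj)
  • id objc_storeWeakOrNil(id *location, id newObj)
  • id objc_initWeak(id *location, id newObj)
  • id objc_initWeakOrNil(id *location, id newObj)
  • void objc_destroyWeak(id *location)

all call static id storeWeak(id *location, objc_object *newObj). The objc_xxxWeakOrNil variants add a little extra handling that does not affect the overall picture. void objc_destroyWeak(id *location) passes nil as the newObj argument when it calls storeWeak. This resembles the reference answer's point about the weak reference being set to nil on dealloc, with one caveat: objc_destroyWeak runs when the weak variable itself goes away (here, when the WeakProperty instance is deallocated), which is a different event from the referenced object being deallocated.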

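Summary

  • storeWeak is used to assign a value to a weak property (including tearing it down)
  • objc_loadWeakRetained is used to read a weak property

Observation & analysis

For storeWeak, two call paths are analyzed:

  1. Assignment, i.e. id objc_storeWeak(id *location, id newObj)
  2. Destruction, i.e. void objc_destroyWeak(id *location)

For reading a weak property, the analysis focuses on:

  1. The function id objc_loadWeakRetained(id *location)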

Observation: id objc_storeWeak(id *location, id newObj)

/** 
 * This function stores a new value into a __weak variable. It would
 * be used anywhere a __weak variable is the target of an assignment.
 *
 * @param location The address of the weak pointer itself
 * @param newObj The new object this weak ptr should now point to
 *
 * @return \e newObj
 */
id
objc_storeWeak(id *location, id newObj)
{
    return storeWeak<DoHaveOld, DoHaveNew, DoCrashIfDeallocating>
        (location, (objc_object *)newObj);
}

This function simply calls storeWeak.

Observation: void objc_destroyWeak(id *location)

/** 
 * Destroys the relationship between a weak pointer
 * and the object it is referencing in the internal weak
 * table. If the weak pointer is not referencing anything,
 * there is no need to edit the weak table.
 *
 * This function IS NOT thread-safe with respect to concurrent
 * modifications to the weak variable. (Concurrent weak clear is safe.)
 *
 * @param location The weak pointer address.
 */
void
objc_destroyWeak(id *location)
{
    (void)storeWeak<DoHaveOld, DontHaveNew, DontCrashIfDeallocating>
        (location, nil);
}

This function, too, simply calls storeWeak.

Source of the storeWeak function

template <HaveOld haveOld, HaveNew haveNew,
          CrashIfDeallocating crashIfDeallocating>
static id
storeWeak(id *location, objc_object *newObj)
{
    assert(haveOld || haveNew);
    if (!haveNew) assert(newObj == nil);

    Class previouslyInitializedClass = nil;
    id oldObj;
    SideTable *oldTable;
    SideTable *newTable;

    // Acquire locks for old and new values.
    // Order by lock address to prevent lock ordering problems.
    // Retry if the old value changes underneath us.
 retry:
    if (haveOld) {
        oldObj = *location;
        oldTable = &SideTables()[oldObj];
    } else {
        oldTable = nil;
    }
    if (haveNew) {
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }

    SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

    if (haveOld && *location != oldObj) {
        SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
        goto retry;
    }

    // Prevent a deadlock between the weak reference machinery
    // and the +initialize machinery by ensuring that no
    // weakly-referenced object has an un-+initialized isa.
    if (haveNew && newObj) {
        Class cls = newObj->getIsa();
        if (cls != previouslyInitializedClass &&
            !((objc_class *)cls)->isInitialized())
        {
            SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
            _class_initialize(_class_getNonMetaClass(cls, (id)newObj));

            // If this class is finished with +initialize then we're good.
            // If this class is still running +initialize on this thread
            // (i.e. +initialize called storeWeak on an instance of itself)
            // then we may proceed but it will appear initializing and
            // not yet initialized to the check above.
            // Instead set previouslyInitializedClass to recognize it on retry.
            previouslyInitializedClass = cls;

            goto retry;
        }
    }

    // Clean up old value, if any.
    if (haveOld) {
        weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
    }

    // Assign new value, if any.
    if (haveNew) {
        newObj = (objc_object *)
            weak_register_no_lock(&newTable->weak_table, (id)newObj, location,
                                  crashIfDeallocating);
        // weak_register_no_lock returns nil if weak store should be rejected

        // Set is-weakly-referenced bit in refcount table.
        if (newObj && !newObj->isTaggedPointer()) {
            newObj->setWeaklyReferenced_nolock();
        }

        // Do not set *location anywhere else. That would introduce a race.
        *location = (id)newObj;
    }
    else {
        // No new value. The storage is not changed.
    }

    SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

    return (id)newObj;
}

The function can be analyzed while stepping through it in lldb.

Analysis: id objc_storeWeak(id *location, id newObj)

// Template parameters.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

For the template parameters, the arguments passed here are DoHaveOld (true) and DoHaveNew (true).
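To make the role of these flags concrete, here is a small C++ sketch of how enum-valued template parameters pick the branches of one shared function at compile time. The enums are copied from the excerpt; storeSketch, storeLike and destroyLike are hypothetical names, and the function only reports which steps it would take rather than touching any weak table.

```cpp
#include <cassert>
#include <string>

// Same shape as the runtime's flags: enums whose values convert to bool.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

// Hypothetical stand-in for storeWeak: it only reports which steps a
// given instantiation would perform.
template <HaveOld haveOld, HaveNew haveNew>
std::string storeSketch() {
    static_assert(haveOld || haveNew, "at least one side must exist");
    std::string steps;
    if (haveOld) steps += "unregister-old ";   // constant per instantiation
    if (haveNew) steps += "register-new ";     // constant per instantiation
    return steps;
}

// Wrappers that fix the flags the way the two entry points quoted in
// this article do.
std::string storeLike()   { return storeSketch<DoHaveOld, DoHaveNew>(); }
std::string destroyLike() { return storeSketch<DoHaveOld, DontHaveNew>(); }
```

Because the flags are template arguments, each wrapper gets its own instantiation in which the unused branch is a compile-time constant false, so one function body can serve both assignment and destruction.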

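In 64-bit assembly, when a function takes fewer than seven arguments, they are placed left to right in the registers rdi, rsi, rdx, rcx, r8 and r9. Here location and newObj come from rdi and rsi respectively.

From the comments plus a comparison of the addresses, location is the address of the weak pointer itself, and newObj is the address the weak pointer should point to. In this scenario newObj is the obj variable assigned to the obj property of WeakProperty.

In this scenario, once storeWeak has run, the memory at address 0x0000000101301638 holds the value 0x0000000101301490.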

Background: SideTable

The SideTable struct is treated as a black box in this article.

struct SideTable {
    spinlock_t slock;
    RefcountMap refcnts;
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void forceReset() { slock.forceReset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

As for spinlock_t, the Wikipedia entry on Spinlock explains it as follows:

In software engineering, a spinlock is a lock which causes a thread trying to acquire it to simply wait in a loop (“spin”) while repeatedly checking if the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (that which holds the lock) blocks, or “goes to sleep”.

Example

; Intel syntax

locked:                      ; The lock variable. 1 = locked, 0 = unlocked.
     dd      0               ; Defines the lock variable, initially 0.

spin_lock:
     mov     eax, 1          ; Set the EAX register to 1.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.
                             ; This will always store 1 to the lock, leaving
                             ;  the previous value in the EAX register.
                             ; (EAX was just set to 1, so the lock
                             ;  variable is certain to end up as 1.)

     test    eax, eax        ; Test EAX with itself. Among other things, this will
                             ;  set the processor's Zero Flag if EAX is 0.
                             ; If EAX is 0, then the lock was unlocked and
                             ;  we just locked it.
                             ; Otherwise, EAX is 1 and we didn't acquire the lock.

     jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                             ;  not set; the lock was previously locked, and so
                             ;  we need to spin until it becomes unlocked.

     ret                     ; The lock has been acquired, return to the calling
                             ;  function.

; Once the work done while holding the lock is finished, locked goes back
; to 0. When another thread then runs spin_lock, it swaps that 0 into EAX,
; acquires the lock, and locked becomes 1 again...

spin_unlock:
     mov     eax, 0          ; Set the EAX register to 0.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.

     ret                     ; The lock has been released.

In short, a spinlock keeps looping until the lock becomes available.
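The same test-and-set loop can be written in C++ with std::atomic. The sketch below is an illustration only: ToySpinlock and tryLock are hypothetical names, and nothing here is taken from the runtime's own spinlock_t.

```cpp
#include <atomic>
#include <cassert>

// Minimal test-and-set spinlock, the C++ counterpart of the xchg loop.
// Hypothetical type for illustration; not the runtime's spinlock_t.
class ToySpinlock {
    std::atomic<bool> locked{false};
public:
    // One attempt: exchange() stores true and returns the previous value,
    // so a previous value of false means this call acquired the lock.
    bool tryLock() {
        return !locked.exchange(true, std::memory_order_acquire);
    }

    // Spin until an attempt succeeds.
    void lock() {
        while (!tryLock()) {
            // busy-wait
        }
    }

    // Release the lock so that a spinning thread can acquire it.
    void unlock() {
        assert(locked.load() && "unlock() called on a lock that is not held");
        locked.store(false, std::memory_order_release);
    }
};
```

While the lock is held, tryLock returns false, which is the case where the assembly version jumps back to spin_lock. After unlock, the next attempt succeeds.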

The comment on the weak_table_t struct explains that it stores object ids as keys.

/**
 * The global weak references table. Stores object ids as keys,
 * and weak_entry_t structs as their values.
 */
struct weak_table_t {
    weak_entry_t *weak_entries;
    size_t    num_entries;
    uintptr_t mask;
    uintptr_t max_hash_displacement;
};

The SideTable struct can be viewed as a collection with built-in locking, whose elements are stored as key-value pairs.

_objc_init, the ObjC entry function, calls arr_init to initialize the static variable SideTableBuf.

Main analysis: id objc_storeWeak(id *location, id newObj)

Entering the if (haveOld) branch

A new element is being created, so the value previously stored at location is nil.

Stepping into the SideTables() function

static StripedMap<SideTable>& SideTables() {
    return *reinterpret_cast<StripedMap<SideTable>*>(SideTableBuf);
}

A discussion of reinterpret_cast:

reinterpret_cast is the most dangerous cast, and should be used very sparingly. It turns one type directly into another - such as casting the value from one pointer to another, or storing a pointer in an int, or all sorts of other nasty things. Largely, the only guarantee you get with reinterpret_cast is that normally if you cast the result back to the original type, you will get the exact same value (but not if the intermediate type is smaller than the original type). There are a number of conversions that reinterpret_cast cannot do, too. It’s used primarily for particularly weird conversions and bit manipulations, like turning a raw data stream into actual data, or storing data in the low bits of an aligned pointer.

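It is a way of forcing a conversion from one type to another.

SideTableBuf is a buffer of size 4096 that backs the array of SideTable entries. Assigning oldTable amounts to fetching an element of that array; nil can be treated as 0, which selects the first element.

Likewise, haveNew is true, so newTable is the element of SideTableBuf looked up with newObj as the index.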
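The lookup can be pictured as a fixed array of slots indexed by a hash of the pointer value. The C++ sketch below is a guess at the general shape, not the runtime's StripedMap: ToyStripedMap is a hypothetical name, and both the stripe count of 64 and the shift-xor hash are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a striped map: a fixed array of slots,
// selected by hashing a pointer value. The stripe count (64) and the
// shift-xor hash are assumptions made for this sketch.
template <typename T, std::size_t StripeCount = 64>
class ToyStripedMap {
    T slots[StripeCount] = {};
public:
    // Map an address to a slot index in [0, StripeCount).
    static std::size_t indexOf(const void *p) {
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
        return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
    }

    // Many different addresses share one slot; that is the point of striping.
    T &operator[](const void *p) { return slots[indexOf(p)]; }
};
```

With this hash a null pointer maps to index 0, which matches the observation that a nil old value selects the first element. Unrelated objects can land in the same slot, so one slot (and one lock) serves many objects.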

Calling the SideTable::lockTwo method

SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

Stepping into the SideTable::lockTwo method

template<>
void SideTable::lockTwo<DoHaveOld, DoHaveNew>
    (SideTable *lock1, SideTable *lock2)
{
    spinlock_t::lockTwo(&lock1->slock, &lock2->slock);
}

Stepping into the lockTwo method

// Address-ordered lock discipline for a pair of locks.

static void lockTwo(mutex_tt *lock1, mutex_tt *lock2) {
    if (lock1 < lock2) {
        lock1->lock();
        lock2->lock();
    } else {
        lock2->lock();
        if (lock2 != lock1) lock1->lock();
    }
}
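The address-ordered discipline is easier to see with a lock that records the order in which it is taken. The following C++ sketch uses a hypothetical RecordingLock that never blocks, so the ordering rule can be observed in a single thread; lockTwoOrdered mirrors the branch structure of the excerpt but is not the runtime's code.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical lock that records acquisition order instead of blocking,
// so the ordering rule can be observed single-threaded.
struct RecordingLock {
    std::vector<const RecordingLock *> *log;
    bool held;

    explicit RecordingLock(std::vector<const RecordingLock *> *l)
        : log(l), held(false) {}

    void lock()   { assert(!held); held = true; log->push_back(this); }
    void unlock() { assert(held);  held = false; }
};

// Always take the lock at the lower address first. If both arguments are
// the same lock, take it only once.
template <typename Lock>
void lockTwoOrdered(Lock *lock1, Lock *lock2) {
    if (std::less<Lock *>{}(lock1, lock2)) {
        lock1->lock();
        lock2->lock();
    } else {
        lock2->lock();
        if (lock2 != lock1) lock1->lock();
    }
}
```

Whichever order the two locks are passed in, the one at the lower address is acquired first. Two threads that lock the same pair with swapped arguments therefore cannot each hold one lock while waiting for the other.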

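Checking the if (haveOld && *location != oldObj) condition

oldObj was assigned from *location before the locks were taken, so normally the two are equal. If they differ, the weak variable was changed in the meantime (the "Retry if the old value changes underneath us" comment in the source), so the function releases the locks and retries.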
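This read, lock, re-check pattern can be sketched in C++ as follows. Everything here is hypothetical: lockForCurrentValue is an illustrative name, and the betweenReadAndLock callback is a test hook that simulates another thread writing to the variable in the unprotected window. In the real function the re-check matters because the table to lock is chosen from the old value itself.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Read a value, take the lock, then confirm the value is unchanged.
// If it changed in between, unlock and start over.
// `betweenReadAndLock` is a hypothetical hook standing in for a
// concurrent writer. Returns the number of attempts that were needed.
int lockForCurrentValue(int *location, std::mutex &m, int *seen,
                        const std::function<void()> &betweenReadAndLock) {
    int attempts = 0;
    for (;;) {
        ++attempts;
        int old = *location;      // unlocked read
        betweenReadAndLock();     // window in which a writer may interfere
        m.lock();
        if (*location == old) {   // still the value we planned for
            *seen = old;
            m.unlock();
            return attempts;
        }
        m.unlock();               // stale read: retry with a fresh one
    }
}
```

If the hook changes the value once, the first attempt detects the mismatch after locking and the second attempt succeeds with the updated value.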

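Checking the if (haveNew && newObj) condition

According to the comment, this branch makes sure the class of newObj has been through +initialize before the object is weakly referenced, which prevents a deadlock between the weak reference machinery and the +initialize machinery. If the class is not yet initialized, the locks are released, the class is initialized, and the function retries.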

Clearing the old value

if (haveOld) {
    weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}

Assigning the new value

// Assign new value, if any.
if (haveNew) {
    newObj = (objc_object *)
        weak_register_no_lock(&newTable->weak_table, (id)newObj, location,
                              crashIfDeallocating);
    // weak_register_no_lock returns nil if weak store should be rejected

    // Set is-weakly-referenced bit in refcount table.
    if (newObj && !newObj->isTaggedPointer()) {
        newObj->setWeaklyReferenced_nolock();
    }

    // Do not set *location anywhere else. That would introduce a race.
    *location = (id)newObj;
}
else {
    // No new value. The storage is not changed.
}

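This registers location with newObj in the corresponding weak_table_t struct. Going by the comment on weak_table_t, the object id (newObj) acts as the key and the weak_entry_t value records the weak pointers, such as location, that refer to it. newObj is then written into *location.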
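The two steps, cleaning up the old value and registering the new one, can be mimicked with a toy table in C++. This is a sketch under assumptions: ToyTable, unregisterWeak and storeWeakSketch are hypothetical names, std::unordered_map replaces the runtime's own hash table, and locking and the +initialize check are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical model: object address -> locations of weak pointers to it.
using ToyTable = std::unordered_map<void *, std::vector<void **>>;

// Remove `location` from the entry for `object`, if present.
void unregisterWeak(ToyTable &table, void *object, void **location) {
    auto it = table.find(object);
    if (it == table.end()) return;
    auto &refs = it->second;
    refs.erase(std::remove(refs.begin(), refs.end(), location), refs.end());
    if (refs.empty()) table.erase(it);
}

// Mirror of the two steps in the excerpt: clean up the old value, then
// register the new one and write it through `location`.
void *storeWeakSketch(ToyTable &table, void **location, void *newObject) {
    void *oldObject = *location;
    if (oldObject) unregisterWeak(table, oldObject, location);
    if (newObject) table[newObject].push_back(location);
    *location = newObject;
    return newObject;
}
```

Re-pointing one weak variable from object a to object b moves its location from a's entry to b's entry, and storing a null pointer leaves the table empty.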

Calling the SideTable::unlockTwo method

SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

Analysis: void objc_destroyWeak(id *location)

Because the template argument passed is DontHaveNew, once the old value has been unregistered the function never enters the if (haveNew) branch to store a new value.

Analysis: id objc_loadWeakRetained(id *location)

 retry:
    // fixme std::atomic this load
    obj = *location;
    ...
    result = obj;
    ...
    return result;

Dereferencing location with the * operator yields the address that the weak reference points to.

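Conclusion

Through a rough analysis of the ObjC runtime, this article looked at how weak properties are stored, read, and released. The runtime keeps a static key-value table that records the weak references to objects. Going by the comment on weak_table_t, the keys are object ids and the values record the weak pointers that refer to each object. When an object is destroyed, the table is looked up by key and the corresponding weak references are removed from it.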

References

  1. When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used?
  2. The LLDB Debugger
  3. Parameter passing in 64-bit assembly
  4. Wiki - Spinlock
  5. alignas specifier
  6. 4.3.7. MOV and MVN