
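This article is compiled from my own study notes. It draws on a number of excellent blog posts and books, and I have tried to restate the material in plain language. If I have missed a citation or failed to credit an original source, please forgive the omission and point it out; corrections to anything inaccurate are equally welcome. Thank you!

Performance optimization is an unavoidable part of development and a topic every developer needs to study in depth; improving an app's performance is an essential skill for a senior developer.

This article covers app performance on iOS only. To understand an app's performance, you need to know where it shows up, how to monitor it, and how to improve it.

App performance shows up mainly in the following areas:
CPU usage, memory usage, launch time, stutter (dropped frames), FPS, crash rate, power consumption, data traffic, network conditions, and so on.

This series starts with how the CPU works, then moves on to stutter and off-screen rendering, app launch, battery consumption, and finally package slimming, looking at each in turn and suggesting optimizations wherever possible.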

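1. CPU Usage

The CPU (central processing unit) is the most critical component of the phone. Every application depends on it to schedule and run, and its capacity is limited. If poor design keeps the CPU under sustained high load, users will see stutter, a hot device, and rapid battery drain, all of which badly hurt the user experience.

Monitoring an app's CPU usage is therefore especially important. So how do we obtain it?

A running app corresponds to one Mach task, and a task may have several threads executing at the same time, each thread being the basic unit of CPU use. We can therefore compute the app's CPU usage by summing the CPU usage of every thread in the current Mach task.

The book "OS X and iOS Kernel Programming" describes a Mach task as follows:

A task is a container object through which the virtual memory space and other resources are managed, including devices and other handles. Strictly speaking, a Mach task is not a process in the sense of other operating systems: Mach, as a microkernel, provides no "process" logic, only the most basic implementation. In the BSD model, however, the two concepts map one-to-one: every BSD process (that is, every OS X process) is associated with a Mach task object underneath.

[Figure: Conceptual diagram of the process subsystem in Mac OS X]

iOS is built on Apple's Darwin, whose kernel is XNU. XNU stands for "X is Not Unix"; it is a hybrid kernel made up of the Mach microkernel and BSD. Mach is a lightweight layer that handles only the operating system's most basic duties, such as tasks and threads, virtual memory management, scheduling, and inter-process communication and message passing. Other work, such as file operations and device access, is implemented by the BSD layer.

Threading on iOS is similar to Mac OS X and is likewise built on Mach threads. In the Mach layer, the thread_basic_info structure holds the basic information for a single thread: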

struct thread_basic_info {
    time_value_t  user_time;      /* user run time */
    time_value_t  system_time;    /* system run time */
    integer_t    cpu_usage;       /* scaled cpu usage percentage */
    policy_t     policy;          /* scheduling policy in effect */
    integer_t    run_state;       /* run state (see below) */
    integer_t    flags;           /* various flags (see below) */
    integer_t    suspend_count;   /* suspend count for thread */
    integer_t    sleep_time;      /* number of seconds that thread  has been sleeping */
};

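A Mach task holds the list of its threads. The kernel provides the task_threads API to fetch the thread list of a given task; the thread_info API can then be called to query a given thread. task_threads is declared in task.h and thread_info in thread_act.h.

task_threads stores all threads of target_task in the act_list array; act_listCnt receives the number of threads: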

kern_return_t task_threads
(
    task_t target_task,
    thread_act_array_t *act_list,
    mach_msg_type_number_t *act_listCnt
);

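thread_info is declared as follows:

kern_return_t thread_info
(
    thread_act_t target_act,
    thread_flavor_t flavor,  // selects which kind of thread information to return
    thread_info_t thread_info_out,  // receives the requested thread information
    mach_msg_type_number_t *thread_info_outCnt  // size of the returned information
);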

Putting this together, CPU usage can be obtained as follows:

#import "LSLCpuUsage.h"
#import <mach/task.h>
#import <mach/vm_map.h>
#import <mach/mach_init.h>
#import <mach/thread_act.h>
#import <mach/thread_info.h>

@implementation LSLCpuUsage

+ (double)getCpuUsage {
    kern_return_t           kr;
    thread_array_t          threadList;         // thread list of the current Mach task
    mach_msg_type_number_t  threadCount;        // number of threads in the current Mach task
    thread_info_data_t      threadInfo;         // buffer for one thread's information
    mach_msg_type_number_t  threadInfoCount;    // size of that buffer
    thread_basic_info_t     threadBasicInfo;    // basic information of one thread
    
    // task_threads returns the thread list of the given task;
    // mach_task_self() refers to the current Mach task.
    kr = task_threads(mach_task_self(), &threadList, &threadCount);
    if (kr != KERN_SUCCESS) {
        return -1;
    }
    double cpuUsage = 0;
    BOOL failed = NO;
    for (mach_msg_type_number_t i = 0; i < threadCount; i++) {
        threadInfoCount = THREAD_INFO_MAX;
        // thread_info queries one thread. With the THREAD_BASIC_INFO flavor it fills in
        // a thread_basic_info structure: user and system run time, run state,
        // scheduling policy, and so on.
        kr = thread_info(threadList[i], THREAD_BASIC_INFO, (thread_info_t)threadInfo, &threadInfoCount);
        if (kr != KERN_SUCCESS) {
            failed = YES;   // do not return here, or threadList would leak
            break;
        }
        
        threadBasicInfo = (thread_basic_info_t)threadInfo;
        if (!(threadBasicInfo->flags & TH_FLAGS_IDLE)) {
            cpuUsage += threadBasicInfo->cpu_usage;
        }
    }
    
    // Release the thread list to avoid leaking memory.
    vm_deallocate(mach_task_self(), (vm_offset_t)threadList, threadCount * sizeof(thread_t));
    if (failed) {
        return -1;
    }

    return cpuUsage / (double)TH_USAGE_SCALE * 100.0;
}
@end
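
The last step of getCpuUsage is plain arithmetic, and a small pure-Swift sketch may make it easier to follow. It assumes TH_USAGE_SCALE is 1000, as defined in mach/thread_info.h; ThreadSample and cpuUsagePercent are hypothetical names introduced only for this illustration and are not part of any Mach API.

```swift
/// Hypothetical stand-in for the two thread_basic_info fields used above.
struct ThreadSample {
    let cpuUsage: Int32   // scaled CPU usage, as in thread_basic_info.cpu_usage
    let isIdle: Bool      // true when TH_FLAGS_IDLE is set in flags
}

/// Sums the scaled usage of all non-idle threads and converts it to a percentage.
func cpuUsagePercent(of threads: [ThreadSample]) -> Double {
    let usageScale = 1000.0   // assumed value of TH_USAGE_SCALE
    let total = threads
        .filter { !$0.isIdle }
        .reduce(0.0) { $0 + Double($1.cpuUsage) }
    return total / usageScale * 100.0
}

let samples = [
    ThreadSample(cpuUsage: 250, isIdle: false),  // 25% of one core
    ThreadSample(cpuUsage: 125, isIdle: false),  // 12.5% of one core
    ThreadSample(cpuUsage: 900, isIdle: true),   // idle thread, ignored
]
print(cpuUsagePercent(of: samples))
```

Because the per-thread values are summed, the result can exceed 100 on a multi-core device when several threads are busy at once.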

Of course, Xcode itself offers a more direct view of CPU usage.

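2. Memory

Phones ship with more and more memory, but it is still finite. If poor design drives our app's memory use too high, the system may kill it, which is a disastrous experience for the user.

A Mach task's memory usage is stored in the mach_task_basic_info structure, defined in task_info.h, where resident_size is the physical memory used by the app and virtual_size is the virtual memory size: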

#define MACH_TASK_BASIC_INFO     20         /* always 64-bit basic info */
struct mach_task_basic_info {
        mach_vm_size_t  virtual_size;       /* virtual memory size (bytes) */
        mach_vm_size_t  resident_size;      /* resident memory size (bytes) */
        mach_vm_size_t  resident_size_max;  /* maximum resident memory size (bytes) */
        time_value_t    user_time;          /* total user run time for
                                               terminated threads */
        time_value_t    system_time;        /* total system run time for
                                               terminated threads */
        policy_t        policy;             /* default policy for new threads */
        integer_t       suspend_count;      /* suspend count for task */
};

It is retrieved through the task_info API, declared in task.h, which returns information about target_task according to the given flavor:

kern_return_t task_info
(
    task_name_t target_task,
    task_flavor_t flavor,
    task_info_t task_info_out,
    mach_msg_type_number_t *task_info_outCnt
);

I tried reading memory usage as follows. The result is close to what Tencent's GT reports, but differs considerably from the values shown by Xcode and Instruments:

// Memory used by the current app, in MB; differs considerably from Xcode's value
+ (double)getResidentMemory {
    struct mach_task_basic_info info;
    mach_msg_type_number_t count = MACH_TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), MACH_TASK_BASIC_INFO, (task_info_t)&info, &count) == KERN_SUCCESS) {
        return (double)info.resident_size / (1024 * 1024);
    } else {
        return -1.0;
    }
}

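Later I read a blog post discussing this problem; its author argues that phys_footprint is the correct value to use. In my own tests, the result is close to Xcode's.

// Memory used by the current app, in MB; close to Xcode's value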
+ (double)getMemoryUsage {
    task_vm_info_data_t vmInfo;
    mach_msg_type_number_t count = TASK_VM_INFO_COUNT;
    if(task_info(mach_task_self(), TASK_VM_INFO, (task_info_t) &vmInfo, &count) == KERN_SUCCESS) {
        return (double)vmInfo.phys_footprint / (1024 * 1024);
    } else {
        return -1.0;
    }
}

The post notes that phys_footprint is documented in a comment in osfmk/kern/task.c in the XNU source, and its author believes the formula given there is what actually reflects the physical memory an app uses.

/*
 * phys_footprint
 *   Physical footprint: This is the sum of:
 *     + (internal - alternate_accounting)
 *     + (internal_compressed - alternate_accounting_compressed)
 *     + iokit_mapped
 *     + purgeable_nonvolatile
 *     + purgeable_nonvolatile_compressed
 *     + page_table
 *
 * internal
 *   The task's anonymous memory, which on iOS is always resident.
 *
 * internal_compressed
 *   Amount of this task's internal memory which is held by the compressor.
 *   Such memory is no longer actually resident for the task [i.e., resident in its pmap],
 *   and could be either decompressed back into memory, or paged out to storage, depending
 *   on our implementation.
 *
 * iokit_mapped
 *   IOKit mappings: The total size of all IOKit mappings in this task, regardless of
 *   clean/dirty or internal/external state].
 *
 * alternate_accounting
 *   The number of internal dirty pages which are part of IOKit mappings. By definition, these pages
 *   are counted in both internal *and* iokit_mapped, so we must subtract them from the total to avoid
 *   double counting.
 */
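
To make the formula in that comment concrete, here is a minimal Swift sketch that simply adds the terms up. FootprintLedger and its field names are hypothetical, chosen to mirror the names in the XNU comment; real code would read phys_footprint from task_info rather than recompute it.

```swift
/// Hypothetical container for the ledger entries named in the XNU comment (all in bytes).
struct FootprintLedger {
    var internalBytes: UInt64                   // "internal"
    var internalCompressed: UInt64
    var alternateAccounting: UInt64
    var alternateAccountingCompressed: UInt64
    var iokitMapped: UInt64
    var purgeableNonvolatile: UInt64
    var purgeableNonvolatileCompressed: UInt64
    var pageTable: UInt64

    /// Sum described in the comment. The alternate_accounting pages are counted in both
    /// internal and iokit_mapped, so they are subtracted once to avoid double counting.
    var physFootprint: UInt64 {
        (internalBytes - alternateAccounting)
            + (internalCompressed - alternateAccountingCompressed)
            + iokitMapped
            + purgeableNonvolatile
            + purgeableNonvolatileCompressed
            + pageTable
    }
}

let mb: UInt64 = 1024 * 1024
let ledger = FootprintLedger(
    internalBytes: 100 * mb,
    internalCompressed: 20 * mb,
    alternateAccounting: 10 * mb,
    alternateAccountingCompressed: 0,
    iokitMapped: 15 * mb,
    purgeableNonvolatile: 5 * mb,
    purgeableNonvolatileCompressed: 0,
    pageTable: 2 * mb
)
print(ledger.physFootprint / mb)
```

The sample numbers are made up; they only show that compressed memory and IOKit mappings count toward the footprint even though they do not appear in resident_size in the same way.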

I agree with that view.

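3. FPS

According to Wikipedia, FPS stands for Frames Per Second: the number of frames delivered each second, often loosely referred to as the "refresh rate" (measured in Hz).

FPS measures the amount of information used to store and display motion video. The more frames per second, the smoother the picture; the lower the FPS, the more it stutters. To some degree, then, this value measures how well an app performs when drawing and rendering. As a rule of thumb, an app that stays between 50 and 60 FPS feels smooth to the user.

The iPhone screen normally refreshes 60 times per second, which can be read as an FPS of 60. CADisplayLink fires in step with the screen refresh, so can we use it to monitor FPS?

First, what is CADisplayLink?

CADisplayLink is another NSTimer-like class, provided by Core Animation, that always fires just before the screen completes an update. Its interface closely resembles NSTimer's, so it is effectively a drop-in replacement; but where timeInterval is measured in seconds, CADisplayLink has an integer frameInterval property that specifies how many frames to wait between calls. The default is 1, meaning it fires before every screen update. If your animation code takes longer than one sixtieth of a second to run, you can set frameInterval to 2, so that it runs every other frame (30 frames per second).

Monitoring the UI's FPS with CADisplayLink, adapted from YYFPSLabel: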

import UIKit

class LSLFPSMonitor: UILabel {

    private var link: CADisplayLink = CADisplayLink.init()
    private var count: NSInteger = 0
    private var lastTime: TimeInterval = 0.0
    private var fpsColor: UIColor = UIColor.green
    public var fps: Double = 0.0
    
    // MARK: - init
    
    override init(frame: CGRect) {
        var f = frame
        if f.size == CGSize.zero {
            f.size = CGSize(width: 55.0, height: 22.0)
        }
        super.init(frame: f)
        
        self.textColor = UIColor.white
        self.textAlignment = .center
        self.font = UIFont.init(name: "Menlo", size: 12.0)
        self.backgroundColor = UIColor.black
        
        link = CADisplayLink.init(target: LSLWeakProxy(target: self), selector: #selector(tick))
        link.add(to: RunLoop.current, forMode: .common)
    }
    
    deinit {
        link.invalidate()
    }
    
    required init?(coder aDecoder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }
    
    // MARK: - actions
    
    @objc func tick(link: CADisplayLink) {
        guard lastTime != 0 else {
            lastTime = link.timestamp
            return
        }
        
        count += 1
        let delta = link.timestamp - lastTime
        guard delta >= 1.0 else {
            return
        }
        
        lastTime = link.timestamp
        fps = Double(count) / delta
        let fpsText = "\(String.init(format: "%.3f", fps)) FPS"
        count = 0
        
        let attrMStr = NSMutableAttributedString(attributedString: NSAttributedString(string: fpsText))
        if fps > 55.0{
            fpsColor = UIColor.green
        } else if(fps >= 50.0 && fps <= 55.0) {
            fpsColor = UIColor.yellow
        } else {
            fpsColor = UIColor.red
        }
        attrMStr.setAttributes([NSAttributedString.Key.foregroundColor: fpsColor], range: NSMakeRange(0, attrMStr.length - 3))
        attrMStr.setAttributes([NSAttributedString.Key.foregroundColor: UIColor.white], range: NSMakeRange(attrMStr.length - 3, 3))
        DispatchQueue.main.async {
            self.attributedText = attrMStr
        }
    }
}
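
The counting logic inside tick(link:) does not depend on UIKit, so it can be pulled out and checked on its own. The sketch below is one way to do that; FPSCounter is a hypothetical helper, not part of YYFPSLabel, and the timestamps are fed in by hand in place of CADisplayLink.timestamp.

```swift
/// Accumulates frame callbacks and reports an FPS value roughly once per second,
/// mirroring the arithmetic in tick(link:). Timestamps are in seconds.
struct FPSCounter {
    private var lastTime: Double?
    private var count = 0

    /// Feed one frame timestamp; returns an FPS value once at least one second has elapsed.
    mutating func tick(timestamp: Double) -> Double? {
        guard let last = lastTime else {
            lastTime = timestamp      // the first callback only sets the baseline
            return nil
        }
        count += 1
        let delta = timestamp - last
        guard delta >= 1.0 else { return nil }
        let fps = Double(count) / delta
        lastTime = timestamp
        count = 0
        return fps
    }
}

var counter = FPSCounter()
var reported: [Double] = []
// Simulate two seconds of callbacks arriving every 1/32 s (a frame time that is exact in binary).
for frame in 0...64 {
    if let fps = counter.tick(timestamp: Double(frame) / 32.0) {
        reported.append(fps)
    }
}
print(reported)
```

In the label above, the same calculation runs with link.timestamp as the input, and the returned value picks the text color.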

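Real-device testing shows that the CADisplayLink approach largely meets the need to monitor FPS and gives a useful reference for improving the user experience, although its values may differ somewhat from Instruments. The following looks at the problems this approach can run into.

(1). Values differ from Instruments, for this reason:

CADisplayLink runs in the RunLoop it was added to (usually the main thread's), so it can only measure the frame rate seen by that RunLoop. When the RunLoop gets to schedule its tasks depends on the RunLoopMode they belong to and on how busy the CPU is. To pin down a performance problem precisely, it is still best to confirm with Instruments.

(2). CADisplayLink can cause a retain cycle.

For example, written like this: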

let link = CADisplayLink.init(target: self, selector: #selector(tick))

let timer = Timer.init(timeInterval: 1.0, target: self, selector: #selector(tick), userInfo: nil, repeats: true)

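Cause: both forms hold a strong reference to self. The timer retains self and self retains the timer, so when the page is dismissed neither can be released and memory leaks. Using weak does not help here either, because the timer still retains whatever object the weak reference resolves to: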

weak var weakSelf = self
let link = CADisplayLink.init(target: weakSelf, selector: #selector(tick))

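So how do we solve this? One suggestion is to call the timer's invalidate method in deinit (or dealloc), but that does not work: the retain cycle already exists, so that method is never reached.

The solution from the author of YYKit is YYWeakProxy, which inherits from NSProxy rather than NSObject.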

NSProxy

An abstract superclass defining an API for objects that act as stand-ins for other objects or for objects that don’t exist yet.

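NSProxy is an abstract superclass that defines an interface for objects acting as stand-ins for other objects, or for objects that do not exist yet. See the official NSProxy documentation for details.

The universal stand-in steps in

The modified code is below. In my tests the timer is released as expected. The full implementation of LSLWeakProxy has been pushed to GitHub.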

let link = CADisplayLink.init(target: LSLWeakProxy(target: self), selector: #selector(tick))
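
The reason a weak proxy breaks the cycle can be shown without UIKit. The sketch below is a pure-Swift model, not the LSLWeakProxy or YYWeakProxy implementation (those rely on Objective-C message forwarding). Ticker is a hypothetical stand-in for CADisplayLink or Timer that retains its target, and Tickable, WeakProxy, Owner, and ownerIsReleased are likewise names made up for the illustration.

```swift
/// What the timer-like object calls each time it fires.
protocol Tickable: AnyObject {
    func tick()
}

/// Hypothetical stand-in for CADisplayLink / Timer: it keeps a strong reference to its target.
final class Ticker {
    let target: Tickable
    init(target: Tickable) { self.target = target }
    func fire() { target.tick() }
}

/// Holds the real target weakly and forwards calls to it,
/// so the ticker no longer keeps the owner alive.
final class WeakProxy: Tickable {
    weak var target: Tickable?
    init(target: Tickable) { self.target = target }
    func tick() { target?.tick() }
}

final class Owner: Tickable {
    var ticker: Ticker?
    private(set) var tickCount = 0

    init(useProxy: Bool) {
        if useProxy {
            ticker = Ticker(target: WeakProxy(target: self))
        } else {
            ticker = Ticker(target: self)   // Owner -> Ticker -> Owner: retain cycle
        }
    }

    func tick() { tickCount += 1 }
}

/// Drops the only outside reference to an Owner and reports whether it was deallocated.
func ownerIsReleased(useProxy: Bool) -> Bool {
    var owner: Owner? = Owner(useProxy: useProxy)
    owner?.ticker?.fire()
    weak var probe = owner
    owner = nil
    return probe == nil
}

print(ownerIsReleased(useProxy: false))
print(ownerIsReleased(useProxy: true))
```

With the direct target, the owner should stay alive after the last outside reference is dropped; with the proxy in between, it should be released, which is the behavior the fix above depends on.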

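4. CPU and GPU

The CPU and GPU both play a vital role in getting an image onto the screen.

CPU (Central Processing Unit)

Object creation and destruction, adjusting object properties, layout calculation, text measurement and typesetting, image format conversion and decoding, and drawing (Core Graphics)

GPU (Graphics Processing Unit)

Texture rendering
A texture is a buffer that stores the color values of an image, and rendering is the process of generating an image from data. Texture rendering, then, is the process of generating an image from color values and other data held in memory.

Today's phones almost all use double buffering plus vertical synchronization (V-Sync) for display.

How the screen displays content

[Figure: The screen display process]

As the figure shows, the CPU, GPU, and display work together. The CPU computes what to display: view creation, layout calculation, image decoding, text drawing, and so on.
The CPU then submits the result to the GPU, which transforms, composites, and renders it.
The GPU renders a frame ahead of time into a buffer for the video controller to read. When the next frame is ready, the GPU points the video controller at the second buffer (this is double buffering). The GPU waits for the display's VSync signal before rendering a new frame and swapping buffers. This eliminates screen tearing and makes the picture smoother, but it costs more computing resources and adds some latency.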

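Off-screen rendering

In OpenGL, the GPU can render in two ways:

On-Screen Rendering: rendering takes place in the screen buffer currently used for display.
Off-Screen Rendering: a new buffer is allocated outside the current screen buffer and rendering takes place there. This is very expensive.

Why off-screen rendering is expensive

A new buffer has to be created.
The whole process requires several context switches: first from on-screen to off-screen, and then, once off-screen rendering is done and the result has to be shown, from off-screen back to on-screen.

Which operations trigger off-screen rendering?

Rasterization. Rasterization is the process of turning geometric data, after a series of transformations, into pixels that can be shown on a display device; in essence it is coordinate transformation plus geometric discretization.

With UITableView and UICollectionView, the cells often all share the same style, in which case this property can be used to improve performance:
layer.shouldRasterize = YES;
layer.rasterizationScale = [[UIScreen mainScreen] scale];

Masks: layer.mask
Rounded corners: setting layer.masksToBounds = YES together with layer.cornerRadius greater than 0
Rounded corners can instead be drawn and clipped with Core Graphics; for static images, ask the designer to supply artwork with the corners already rounded.
Shadows: the layer.shadow* APIs. Setting layer.shadowPath, however, avoids off-screen rendering.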

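5. Stutter

Causes of stutter:

[Figure: Dropped frames]

Following the display pipeline described above, on a device that uses VSync, if the CPU or GPU has not finished submitting content within one VSync interval, that frame is dropped and has to wait for the next opportunity to be shown, while the display keeps showing the previous content. Code on the main thread that keeps it from responding to taps and scrolls, or from drawing the UI, is a common cause of stutter.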
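
A rough way to reason about dropped frames is to compare the time a frame needs with the VSync interval. The sketch below uses a deliberately simplified model: work is assumed to start on a VSync, and the result is shown on the first VSync after it finishes. frameBudget and missedVSyncs are hypothetical helper names, not system APIs.

```swift
/// Frame budget in milliseconds for a given refresh rate (60 Hz gives about 16.67 ms).
func frameBudget(refreshRate: Double) -> Double {
    1000.0 / refreshRate
}

/// Number of VSync deadlines missed by a frame that needs `workMs` of CPU + GPU time,
/// under the simplified model described above.
func missedVSyncs(workMs: Double, refreshRate: Double = 60.0) -> Int {
    let intervals = (workMs / frameBudget(refreshRate: refreshRate)).rounded(.up)
    return max(Int(intervals) - 1, 0)
}

for work in [10.0, 20.0, 40.0] {
    print("work \(work) ms -> missed VSyncs: \(missedVSyncs(workMs: work))")
}
```

Under this model, a frame that needs 10 ms fits within one interval, while a 20 ms frame misses one VSync and the previous content stays on screen for an extra refresh.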

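Monitoring stutter:

There are generally two approaches:

(1). Main-thread monitoring. A secondary thread watches the main thread's RunLoop and checks whether the time between waking from sleep and going back to sleep exceeds a threshold.
(2). FPS monitoring. For smooth UI interaction, an app should try to hold its refresh rate at 60 fps. How FPS monitoring works was covered above and is skipped here.

In practice, FPS values fluctuate a lot, which makes stutter hard to detect from them. To get around this, measure how long each iteration of the main thread's message loop takes, and record a stutter whenever that time exceeds a set threshold.

This is also the approach behind Hertz, the performance monitoring solution used by Meituan's mobile team. The WeChat team arrived at a similar approach in practice, described in "WeRead iOS Performance Optimization Summary" (微信讀書 iOS 性能優化總結).

[Figure: Flowchart of Meituan's Hertz solution]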

/* Run Loop Observer Activities */
typedef CF_OPTIONS(CFOptionFlags, CFRunLoopActivity) {
    kCFRunLoopEntry = (1UL << 0),                 // about to enter the loop
    kCFRunLoopBeforeTimers = (1UL << 1),          // about to process timers
    kCFRunLoopBeforeSources = (1UL << 2),         // about to process sources
    kCFRunLoopBeforeWaiting = (1UL << 5),         // about to go to sleep
    kCFRunLoopAfterWaiting = (1UL << 6),          // just woken from sleep
    kCFRunLoopExit = (1UL << 7),                  // about to exit the loop
    kCFRunLoopAllActivities = 0x0FFFFFFFU         // all activities
};

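The approach rests on the observation that the Sources events raised by scrolling, and other interaction events, are always handled quickly, after which the RunLoop enters the kCFRunLoopBeforeWaiting state. If stutter occurs while scrolling, the RunLoop must be stuck in one of two states: kCFRunLoopAfterWaiting or kCFRunLoopBeforeSources.

Hence approach one for monitoring main-thread stutter:

Start a secondary thread and continuously measure whether the time spent between the kCFRunLoopBeforeSources and kCFRunLoopAfterWaiting states exceeds a threshold, to decide whether the main thread is stuck.

南梔傾寒 offers an alternative: ANREye, a third-party stutter detector written in Swift. Its rough idea is to run a detection loop on a secondary thread. On each pass it sets a flag to YES and dispatches a task to the main thread that sets the flag to NO. The secondary thread then sleeps for the timeout threshold and checks whether the flag was reset to NO; if not, the main thread has stalled.

Building on this, when the main thread is in the Before Waiting state, ordinary stutter can be detected by dispatching a flag-setting task to the main thread: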

#define lsl_SEMAPHORE_SUCCESS 0
static BOOL lsl_is_monitoring = NO;
static dispatch_semaphore_t lsl_semaphore;
static NSTimeInterval lsl_time_out_interval = 0.05;


@implementation LSLAppFluencyMonitor

static inline dispatch_queue_t __lsl_fluecy_monitor_queue() {
    static dispatch_queue_t lsl_fluecy_monitor_queue;
    static dispatch_once_t once;
    dispatch_once(&once, ^{
        lsl_fluecy_monitor_queue = dispatch_queue_create("com.dream.lsl_monitor_queue", NULL);
    });
    return lsl_fluecy_monitor_queue;
}

static inline void __lsl_monitor_init() {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        lsl_semaphore = dispatch_semaphore_create(0);
    });
}

#pragma mark - Public
+ (instancetype)monitor {
    return [LSLAppFluencyMonitor new];
}

- (void)startMonitoring {
    if (lsl_is_monitoring) { return; }
    lsl_is_monitoring = YES;
    __lsl_monitor_init();
    dispatch_async(__lsl_fluecy_monitor_queue(), ^{
        while (lsl_is_monitoring) {
            __block BOOL timeOut = YES;
            dispatch_async(dispatch_get_main_queue(), ^{
                timeOut = NO;
                dispatch_semaphore_signal(lsl_semaphore);
            });
            [NSThread sleepForTimeInterval: lsl_time_out_interval];
            if (timeOut) {
                [LSLBacktraceLogger lsl_logMain];       // print the main thread's call stack
//                [LSLBacktraceLogger lsl_logCurrent];    // print the current thread's call stack
//                [LSLBacktraceLogger lsl_logAllThread];  // print every thread's call stack
            }
            dispatch_semaphore_wait(lsl_semaphore, DISPATCH_TIME_FOREVER);
        }
    });
}

- (void)stopMonitoring {
    if (!lsl_is_monitoring) { return; }
    lsl_is_monitoring = NO;
}

@end
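
The same "ping the queue and see whether it answers in time" idea can be sketched in a few lines of Swift. Note that this is a variation rather than a port of the monitor above: it waits on a semaphore with a timeout in place of the flag plus sleep, and it runs against a private serial queue so the example works without an app run loop. isStalled and the queue label are hypothetical names.

```swift
import Foundation
import Dispatch

/// Returns true when `queue` does not get around to running a trivial block
/// within `threshold` seconds.
func isStalled(_ queue: DispatchQueue, threshold: TimeInterval) -> Bool {
    let semaphore = DispatchSemaphore(value: 0)
    queue.async { semaphore.signal() }        // the "ping"
    return semaphore.wait(timeout: .now() + threshold) == .timedOut
}

// Demonstration on a private serial queue standing in for the main queue.
let worker = DispatchQueue(label: "example.worker")
print(isStalled(worker, threshold: 0.5))      // queue is idle

worker.async { Thread.sleep(forTimeInterval: 0.6) }   // simulate a long-running task
print(isStalled(worker, threshold: 0.1))      // queue is busy
```

In an app, the check would be called in a loop from a background queue with DispatchQueue.main as the argument, and a stall would trigger the backtrace logging shown above. Calling it from the main thread itself would block the very queue being measured.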

LSLBacktraceLogger is the class that captures the stack trace; see the code on GitHub for details.

The log output looks like this:

2018-08-16 12:36:33.910491+0800 AppPerformance[4802:171145] Backtrace of Thread 771:
======================================================================================
libsystem_kernel.dylib         0x10d089bce __semwait_signal + 10
libsystem_c.dylib              0x10ce55d10 usleep + 53
AppPerformance                 0x108b8b478 $S14AppPerformance25LSLFPSTableViewControllerC05tableD0_12cellForRowAtSo07UITableD4CellCSo0kD0C_10Foundation9IndexPathVtF + 1144
AppPerformance                 0x108b8b60b $S14AppPerformance25LSLFPSTableViewControllerC05tableD0_12cellForRowAtSo07UITableD4CellCSo0kD0C_10Foundation9IndexPathVtFTo + 155
UIKitCore                      0x1135b104f -[_UIFilteredDataSource tableView:cellForRowAtIndexPath:] + 95
UIKitCore                      0x1131ed34d -[UITableView _createPreparedCellForGlobalRow:withIndexPath:willDisplay:] + 765
UIKitCore                      0x1131ed8da -[UITableView _createPreparedCellForGlobalRow:willDisplay:] + 73
UIKitCore                      0x1131b4b1e -[UITableView _updateVisibleCellsNow:isRecursive:] + 2863
UIKitCore                      0x1131d57eb -[UITableView layoutSubviews] + 165
UIKitCore                      0x1133921ee -[UIView(CALayerDelegate) layoutSublayersOfLayer:] + 1501
QuartzCore                     0x10ab72eb1 -[CALayer layoutSublayers] + 175
QuartzCore                     0x10ab77d8b _ZN2CA5Layer16layout_if_neededEPNS_11TransactionE + 395
QuartzCore                     0x10aaf3b45 _ZN2CA7Context18commit_transactionEPNS_11TransactionE + 349
QuartzCore                     0x10ab285b0 _ZN2CA11Transaction6commitEv + 576
QuartzCore                     0x10ab29374 _ZN2CA11Transaction17observer_callbackEP19__CFRunLoopObservermPv + 76
CoreFoundation                 0x109dc3757 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 23
CoreFoundation                 0x109dbdbde __CFRunLoopDoObservers + 430
CoreFoundation                 0x109dbe271 __CFRunLoopRun + 1537
CoreFoundation                 0x109dbd931 CFRunLoopRunSpecific + 625
GraphicsServices               0x10f5981b5 GSEventRunModal + 62
UIKitCore                      0x112c812ce UIApplicationMain + 140
AppPerformance                 0x108b8c1f0 main + 224
libdyld.dylib                  0x10cd4dc9d start + 1

======================================================================================

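Approach two: use CADisplayLink

CADisplayLink was covered in detail in the FPS section. It can be used here as well, by checking whether the FPS value stays below some threshold for several consecutive samples.

Main ideas for fixing stutter

Reduce CPU and GPU resource consumption as much as possible.
At a refresh rate of 60 FPS, a VSync signal arrives roughly every 16 ms.
In other words, if the CPU and GPU finish their work within about 16 ms, no stutter occurs (1 second is 1000 ms, and 1000 ÷ 60 ≈ 16.67).

Optimizing the CPU

Prefer lightweight objects; for example, where event handling is not needed, consider CALayer in place of UIView.
Do not modify UIView properties such as frame, bounds, and transform too often; keep unnecessary changes to a minimum.
- Compute the layout ahead of time and adjust the relevant properties once when needed, rather than changing them repeatedly.
- Auto Layout costs more CPU than setting frames directly.
Keep the image size the same as the UIImageView size, so that no scaling and redrawing is needed.
Limit the maximum number of concurrent threads, and do not start threads you do not need.
Move time-consuming work to background threads where possible, for example text processing (size calculation, drawing) and image processing (encoding, decoding, drawing).
To decode an image, get its CGImage, create a bitmap context, and draw the image data into it; that drawing step is the decoding. Finally, turn the resulting CGImage into a UIImage with [UIImage imageWithCGImage:] and assign it to the UIImageView.

Key code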

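        // Create the bitmap context
        CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
        CGContextRef context = CGBitmapContextCreate(NULL, width, height, 8, 0, colorSpace, bitmapInfo);
        CGColorSpaceRelease(colorSpace);

        // Draw the image; this forces it to be decoded
        CGContextDrawImage(context, CGRectMake(0, 0, width, height), cgImage);

        // Get the decoded CGImage
        cgImage = CGBitmapContextCreateImage(context);
        CGContextRelease(context);

        // Wrap it in a UIImage, then release our reference
        UIImage *newImage = [UIImage imageWithCGImage:cgImage];
        CGImageRelease(cgImage);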

SDWebImage does something similar in sd_decompressedAndScaledDownImageWithImage: in SDWebImageImageIOCoder.m:

- (nullable UIImage *)sd_decompressedAndScaledDownImageWithImage:(nullable UIImage *)image {

 CGContextRef destContext;
    
    // autorelease the bitmap context and all vars to help system to free memory when there are memory warning.
    // on iOS7, do not forget to call [[SDImageCache sharedImageCache] clearMemory];
    @autoreleasepool {
        CGImageRef sourceImageRef = image.CGImage;
        
        CGSize sourceResolution = CGSizeZero;
        sourceResolution.width = CGImageGetWidth(sourceImageRef);
        sourceResolution.height = CGImageGetHeight(sourceImageRef);
        float sourceTotalPixels = sourceResolution.width * sourceResolution.height;
        // Determine the scale ratio to apply to the input image
        // that results in an output image of the defined size.
        // see kDestImageSizeMB, and how it relates to destTotalPixels.
        float imageScale = kDestTotalPixels / sourceTotalPixels;
        CGSize destResolution = CGSizeZero;
        destResolution.width = (int)(sourceResolution.width*imageScale);
        destResolution.height = (int)(sourceResolution.height*imageScale);
        
        // device color space
        CGColorSpaceRef colorspaceRef = SDCGColorSpaceGetDeviceRGB();
        BOOL hasAlpha = SDCGImageRefContainsAlpha(sourceImageRef);
        // iOS display alpha info (BGRA8888/BGRX8888)
        CGBitmapInfo bitmapInfo = kCGBitmapByteOrder32Host;
        bitmapInfo |= hasAlpha ? kCGImageAlphaPremultipliedFirst : kCGImageAlphaNoneSkipFirst;
        
        // kCGImageAlphaNone is not supported in CGBitmapContextCreate.
        // Since the original image here has no alpha info, use kCGImageAlphaNoneSkipLast
        // to create bitmap graphics contexts without alpha info.
        destContext = CGBitmapContextCreate(NULL,
                                            destResolution.width,
                                            destResolution.height,
                                            kBitsPerComponent,
                                            0,
                                            colorspaceRef,
                                            bitmapInfo);
        
        if (destContext == NULL) {
            return image;
        }
        CGContextSetInterpolationQuality(destContext, kCGInterpolationHigh);
        
        // Now define the size of the rectangle to be used for the
        // incremental blits from the input image to the output image.
        // we use a source tile width equal to the width of the source
        // image due to the way that iOS retrieves image data from disk.
        // iOS must decode an image from disk in full width 'bands', even
        // if current graphics context is clipped to a subrect within that
        // band. Therefore we fully utilize all of the pixel data that results
        // from a decoding opertion by achnoring our tile size to the full
        // width of the input image.
        CGRect sourceTile = CGRectZero;
        sourceTile.size.width = sourceResolution.width;
        // The source tile height is dynamic. Since we specified the size
        // of the source tile in MB, see how many rows of pixels high it
        // can be given the input image width.
        sourceTile.size.height = (int)(kTileTotalPixels / sourceTile.size.width );
        sourceTile.origin.x = 0.0f;
        // The output tile is the same proportions as the input tile, but
        // scaled to image scale.
        CGRect destTile;
        destTile.size.width = destResolution.width;
        destTile.size.height = sourceTile.size.height * imageScale;
        destTile.origin.x = 0.0f;
        // The source seem overlap is proportionate to the destination seem overlap.
        // this is the amount of pixels to overlap each tile as we assemble the ouput image.
        float sourceSeemOverlap = (int)((kDestSeemOverlap/destResolution.height)*sourceResolution.height);
        CGImageRef sourceTileImageRef;
        // calculate the number of read/write operations required to assemble the
        // output image.
        int iterations = (int)( sourceResolution.height / sourceTile.size.height );
        // If tile height doesn't divide the image height evenly, add another iteration
        // to account for the remaining pixels.
        int remainder = (int)sourceResolution.height % (int)sourceTile.size.height;
        if(remainder) {
            iterations++;
        }
        // Add seem overlaps to the tiles, but save the original tile height for y coordinate calculations.
        float sourceTileHeightMinusOverlap = sourceTile.size.height;
        sourceTile.size.height += sourceSeemOverlap;
        destTile.size.height += kDestSeemOverlap;
        for( int y = 0; y < iterations; ++y ) {
            @autoreleasepool {
                sourceTile.origin.y = y * sourceTileHeightMinusOverlap + sourceSeemOverlap;
                destTile.origin.y = destResolution.height - (( y + 1 ) * sourceTileHeightMinusOverlap * imageScale + kDestSeemOverlap);
                sourceTileImageRef = CGImageCreateWithImageInRect( sourceImageRef, sourceTile );
                if( y == iterations - 1 && remainder ) {
                    float dify = destTile.size.height;
                    destTile.size.height = CGImageGetHeight( sourceTileImageRef ) * imageScale;
                    dify -= destTile.size.height;
                    destTile.origin.y += dify;
                }
                CGContextDrawImage( destContext, destTile, sourceTileImageRef );
                CGImageRelease( sourceTileImageRef );
            }
        }
        
        CGImageRef destImageRef = CGBitmapContextCreateImage(destContext);
        CGContextRelease(destContext);
        if (destImageRef == NULL) {
            return image;
        }
        UIImage *destImage = [[UIImage alloc] initWithCGImage:destImageRef scale:image.scale orientation:image.imageOrientation];
        CGImageRelease(destImageRef);
        if (destImage == nil) {
            return image;
        }
        return destImage;
    }

}

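Optimizing the GPU

Avoid displaying a large number of images in a short time; where possible, composite several images into one for display (unless every one of them is large).
The largest texture the GPU can handle is 4096x4096. Beyond that size, processing takes longer and eats into the CPU's headroom, so keep textures within this limit.
Keep the number of views and the depth of the hierarchy as small as possible.
Reduce transparent views (alpha < 1), and set the opaque property to YES on views that are not transparent.
opaque: a Boolean value that determines whether the view is opaque
Avoid off-screen rendering wherever possible.

More related topics, including launch time, power consumption monitoring, crash rate, data traffic monitoring, and network condition monitoring, would make this article too long and will be published as a second part. Discussion is welcome.

iOS Internals: Performance Optimization, Launch and Battery Consumption
iOS Internals: Performance Optimization, Package Slimming (App Thinning)

References:

青蘋果園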
