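Exploring the ART World (18): InlineMethod
First, let's review the diagram we studied two sections ago.
Before starting on InlineMethod, we add a little more background on BasicBlock.
BasicBlock operations on MIR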
AppendMIR
AppendMIR adds an MIR to the end of a BasicBlock.
/* Insert an MIR instruction to the end of a basic block. */
void BasicBlock::AppendMIR(MIR* mir) {
  // Insert it after the last MIR.
  InsertMIRListAfter(last_mir_insn, mir, mir);
}
InsertMIRListAfter
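This is a standard linked-list splice: it inserts the chain of MIRs running from first_list_mir to last_list_mir after insert_after. AppendMIR is the special case where the chain is a single MIR and insert_after is the current tail of the block.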
void BasicBlock::InsertMIRListAfter(MIR* insert_after, MIR* first_list_mir, MIR* last_list_mir) {
  // If no MIR, we are done.
  if (first_list_mir == nullptr || last_list_mir == nullptr) {
    return;
  }
  // If insert_after is null, assume BB is empty.
  if (insert_after == nullptr) {
    first_mir_insn = first_list_mir;
    last_mir_insn = last_list_mir;
    last_list_mir->next = nullptr;
  } else {
    MIR* after_list = insert_after->next;
    insert_after->next = first_list_mir;
    last_list_mir->next = after_list;
    if (after_list == nullptr) {
      last_mir_insn = last_list_mir;
    }
  }
  // Set this BB to be the basic block of the MIRs.
  MIR* last = last_list_mir->next;
  for (MIR* mir = first_list_mir; mir != last; mir = mir->next) {
    mir->bb = id;
  }
}
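To see the splice in isolation, here is a minimal standalone sketch. Node and Block are hypothetical, simplified stand-ins for MIR and BasicBlock (they are not the ART types), and only the fields needed for the list manipulation are kept.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for MIR.
struct Node {
  int value = 0;
  uint16_t bb = 0;  // Id of the owning block.
  Node* next = nullptr;
};

// Hypothetical, simplified stand-in for BasicBlock.
struct Block {
  uint16_t id = 0;
  Node* first = nullptr;
  Node* last = nullptr;

  // Splice the chain [first_new, last_new] in after insert_after.
  // A null insert_after means the block is assumed to be empty.
  void InsertListAfter(Node* insert_after, Node* first_new, Node* last_new) {
    if (first_new == nullptr || last_new == nullptr) {
      return;  // Nothing to insert.
    }
    if (insert_after == nullptr) {
      first = first_new;
      last = last_new;
      last_new->next = nullptr;
    } else {
      Node* rest = insert_after->next;
      insert_after->next = first_new;
      last_new->next = rest;
      if (rest == nullptr) {
        last = last_new;  // The chain was added at the tail.
      }
    }
    // Tag every spliced node with this block's id.
    Node* stop = last_new->next;
    for (Node* n = first_new; n != stop; n = n->next) {
      n->bb = id;
    }
  }

  // Appending is a splice of a one-element chain after the current tail.
  void Append(Node* n) {
    assert(n != nullptr);
    InsertListAfter(last, n, n);
  }
};
```

Appending two nodes and then splicing a third after the first one yields the order first, third, second, with last still pointing at the second node.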
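The first major step of compilation is MIRGraph::InlineMethod.
With the background on the instruction set and BasicBlock from the previous section in place, we can now start analyzing this first major step.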
MIRGraph::InlineMethod
InlineMethod inserts a Dex method into the MIRGraph at the current insert point.
void MIRGraph::InlineMethod(const DexFile::CodeItem* code_item, uint32_t access_flags,
InvokeType invoke_type ATTRIBUTE_UNUSED, uint16_t class_def_idx,
uint32_t method_idx, jobject class_loader, const DexFile& dex_file) {
current_code_item_ = code_item;
The first step assigns the code_item passed in to the current_code_item_ member of this MIRGraph object.
The member is declared as:
const DexFile::CodeItem* current_code_item_;
The second step pushes the pair (current_method_, current_offset_) onto the method_stack_ stack.
method_stack_ is an ArenaVector of MIRLocation.
ArenaVector<MIRLocation> method_stack_; // Include stack
MIRLocation is a pair of integers defined inside the MIRGraph class.
typedef std::pair<int, int> MIRLocation; // Insert point, (m_unit_ index, offset)
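Taken together, the statements below save the current insert point and then move to the new one: the index that the next method unit will get in m_units_, at offset 0.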
method_stack_.push_back(std::make_pair(current_method_, current_offset_));
current_method_ = m_units_.size();
current_offset_ = 0;
m_units_ is a container of DexCompilationUnit pointers, declared as follows:
ArenaVector<DexCompilationUnit*> m_units_; // List of methods included in this graph
Next, a newly created DexCompilationUnit is pushed onto m_units_.
m_units_.push_back(new (arena_) DexCompilationUnit(
cu_, class_loader, Runtime::Current()->GetClassLinker(), dex_file,
current_code_item_, class_def_idx, method_idx, access_flags,
cu_->compiler_driver->GetVerifiedMethod(&dex_file, method_idx)));
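The bookkeeping above can be sketched on its own. InsertPointTracker and EnterMethod are hypothetical names used only for illustration, and a plain int stands in for a DexCompilationUnit pointer. The point is the ordering: current_method is set to the size of the unit list before the push, so it ends up indexing the unit that is pushed next.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical miniature of the insert-point bookkeeping.
struct InsertPointTracker {
  using Location = std::pair<int, int>;  // (method unit index, offset)
  std::vector<Location> method_stack;    // Saved insert points.
  std::vector<int> units;                // Stand-in for m_units_.
  int current_method = 0;
  int current_offset = 0;

  // Save the current insert point and start a new unit at offset 0.
  void EnterMethod(int unit) {
    method_stack.push_back(std::make_pair(current_method, current_offset));
    current_method = static_cast<int>(units.size());  // Index the new unit gets.
    current_offset = 0;
    units.push_back(unit);
    assert(units[current_method] == unit);
  }
};
```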
Then compute the start and end addresses of the code:
const uint16_t* code_ptr = current_code_item_->insns_;
const uint16_t* code_end =
current_code_item_->insns_ + current_code_item_->insns_size_in_code_units_;
Next, reserve space for the new BasicBlocks.
block_list_ is an ArenaVector of BasicBlock pointers:
ArenaVector<BasicBlock*> block_list_;
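First reserve space in block_list_, then define a ScopedArenaVector, dex_pc_to_block_map, which serves as the lookup cache for FindBlock. It has one entry per code unit, plus one extra entry for a fall-through past the last instruction.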
block_list_.reserve(block_list_.size() + current_code_item_->insns_size_in_code_units_);
// FindBlock lookup cache.
ScopedArenaAllocator allocator(&cu_->arena_stack);
ScopedArenaVector<uint16_t> dex_pc_to_block_map(allocator.Adapter());
dex_pc_to_block_map.resize(current_code_item_->insns_size_in_code_units_ +
1 /* Fall-through on last insn; dead or punt to interpreter. */);
...
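The idea behind the lookup cache can be sketched as follows. BlockTable and FindOrCreate are hypothetical names, and the sketch only covers the two simple cases (a block already starts at the dex pc, or a new one is created there). The real FindBlock has more cases, such as a target landing in the middle of an existing block, and those are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint16_t kNullBlockId = 0;  // Id 0 is reserved for "no block".

// Hypothetical miniature of a dex pc to block id cache.
struct BlockTable {
  std::vector<uint32_t> start_offsets;  // Start offset, indexed by block id.
  std::vector<uint16_t> pc_to_block;    // Block id, indexed by dex pc.

  // One slot per code unit, plus one for a fall-through past the last one.
  explicit BlockTable(size_t insns_size)
      : start_offsets(1, 0u), pc_to_block(insns_size + 1, kNullBlockId) {}

  // Return the id of the block starting at dex_pc, creating it on request.
  uint16_t FindOrCreate(uint32_t dex_pc, bool create) {
    assert(dex_pc < pc_to_block.size());
    uint16_t id = pc_to_block[dex_pc];
    if (id != kNullBlockId || !create) {
      return id;
    }
    id = static_cast<uint16_t>(start_offsets.size());
    start_offsets.push_back(dex_pc);
    pc_to_block[dex_pc] = id;
    return id;
  }
};
```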
Next comes the handling for the first method: three BasicBlock objects are created for it, null_block, entry_block_ and exit_block_.
The logic of CreateNewBB was covered earlier.
// If this is the first method, set up default entry and exit blocks.
if (current_method_ == 0) {
DCHECK(entry_block_ == nullptr);
DCHECK(exit_block_ == nullptr);
DCHECK_EQ(GetNumBlocks(), 0U);
// Use id 0 to represent a null block.
BasicBlock* null_block = CreateNewBB(kNullBlock);
DCHECK_EQ(null_block->id, NullBasicBlockId);
null_block->hidden = true;
entry_block_ = CreateNewBB(kEntryBlock);
exit_block_ = CreateNewBB(kExitBlock);
} else {
UNIMPLEMENTED(FATAL) << "Nested inlining not implemented.";
/*
* Will need to manage storage for ins & outs, push previous state and update
* insert point.
*/
}
The null block, the entry block and the exit block are all default blocks. Next, the block for the code itself is created:
/* Current block to record parsed instructions */
BasicBlock* cur_block = CreateNewBB(kDalvikByteCode);
DCHECK_EQ(current_offset_, 0U);
cur_block->start_offset = current_offset_;
// TODO: for inlining support, insert at the insert point rather than entry block.
entry_block_->fall_through = cur_block->id;
cur_block->predecessors.push_back(entry_block_->id);
Next, the ranges covered by try blocks are processed.
/* Identify code range in try blocks and set up the empty catch blocks */
ProcessTryCatchBlocks(&dex_pc_to_block_map);
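Let's look at the logic of ProcessTryCatchBlocks.
The main idea is:
- Iterate over every try item of the method and set the bit in try_block_addr_ for each offset in its range.
- Iterate over the catch handlers and, for each handler address, call FindBlock with create set to true so that an empty catch block exists at that address.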
/* Identify code range in try blocks and set up the empty catch blocks */
void MIRGraph::ProcessTryCatchBlocks(ScopedArenaVector<uint16_t>* dex_pc_to_block_map) {
  int tries_size = current_code_item_->tries_size_;
  DexOffset offset;
  if (tries_size == 0) {
    return;
  }
  for (int i = 0; i < tries_size; i++) {
    const DexFile::TryItem* pTry =
        DexFile::GetTryItems(*current_code_item_, i);
    DexOffset start_offset = pTry->start_addr_;
    DexOffset end_offset = start_offset + pTry->insn_count_;
    for (offset = start_offset; offset < end_offset; offset++) {
      try_block_addr_->SetBit(offset);
    }
  }
  // Iterate over each of the handlers to enqueue the empty Catch blocks.
  const uint8_t* handlers_ptr = DexFile::GetCatchHandlerData(*current_code_item_, 0);
  uint32_t handlers_size = DecodeUnsignedLeb128(&handlers_ptr);
  for (uint32_t idx = 0; idx < handlers_size; idx++) {
    CatchHandlerIterator iterator(handlers_ptr);
    for (; iterator.HasNext(); iterator.Next()) {
      uint32_t address = iterator.GetHandlerAddress();
      FindBlock(address, true /*create*/, /* immed_pred_block_p */ nullptr, dex_pc_to_block_map);
    }
    handlers_ptr = iterator.EndDataPointer();
  }
}
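The first loop of ProcessTryCatchBlocks can be sketched without any ART types. TryRange and MarkTryOffsets are hypothetical names; TryRange only mirrors the two TryItem fields used above, and a std::vector<bool> stands in for the ArenaBitVector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the two TryItem fields used by the loop.
struct TryRange {
  uint32_t start_addr;  // First covered offset, in code units.
  uint16_t insn_count;  // Number of covered code units.
};

// Mark every code-unit offset that lies inside some try range.
std::vector<bool> MarkTryOffsets(const std::vector<TryRange>& tries, size_t insns_size) {
  std::vector<bool> in_try(insns_size, false);
  for (const TryRange& t : tries) {
    uint32_t end = t.start_addr + t.insn_count;
    assert(end <= insns_size);
    for (uint32_t off = t.start_addr; off < end; ++off) {
      in_try[off] = true;
    }
  }
  return in_try;
}
```

Later, asking whether an instruction is inside a try block is a single lookup at its offset, which is what ProcessCanThrow does with try_block_addr->IsBitSet(cur_offset).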
Next, each instruction is processed and converted into an MIR.
uint64_t merged_df_flags = 0u;
/* Parse all instructions and put them into containing basic blocks */
while (code_ptr < code_end) {
MIR *insn = NewMIR();
insn->offset = current_offset_;
insn->m_unit_index = current_method_;
int width = ParseInsn(code_ptr, &insn->dalvikInsn);
Instruction::Code opcode = insn->dalvikInsn.opcode;
if (opcode_count_ != nullptr) {
opcode_count_[static_cast<int>(opcode)]++;
}
int flags = insn->dalvikInsn.FlagsOf();
int verify_flags = Instruction::VerifyFlagsOf(insn->dalvikInsn.opcode);
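Everything so far is closely tied to the Dalvik instruction set covered in the previous section; refer to it for details.
Next, the data-flow flags of the instruction are handled.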
uint64_t df_flags = GetDataFlowAttributes(insn);
merged_df_flags |= df_flags;
if (df_flags & DF_HAS_DEFS) {
def_count_ += (df_flags & DF_A_WIDE) ? 2 : 1;
}
if (df_flags & DF_LVN) {
cur_block->use_lvn = true; // Run local value numbering on this basic block.
}
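The accumulation of flags and the definition count can be sketched as below. The flag constants kHasDefs and kWide are hypothetical bit values chosen for the sketch (the real DF_* values are defined by ART), and Summarize is a hypothetical helper. A wide definition counts as two because a wide value occupies a pair of registers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical bit values; the real DF_* constants are defined by ART.
constexpr uint64_t kHasDefs = 1u << 0;
constexpr uint64_t kWide = 1u << 1;

struct DefSummary {
  uint64_t merged_flags = 0;  // OR of the flags of all instructions.
  unsigned def_count = 0;     // Number of registers defined.
};

// Merge per-instruction flags and count defined registers.
DefSummary Summarize(const std::vector<uint64_t>& per_insn_flags) {
  DefSummary s;
  for (uint64_t f : per_insn_flags) {
    s.merged_flags |= f;
    if (f & kHasDefs) {
      s.def_count += (f & kWide) ? 2 : 1;
    }
  }
  assert(s.def_count <= 2 * per_insn_flags.size());
  return s;
}
```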
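NOP is handled first.
Although a NOP occupies only one 16-bit code unit and performs no operation, handling it involves quite a few engineering details.
- First, determine whether it is an alignment NOP followed by embedded data (a switch table or array data). If so, the pair is treated as a single unit whose width is greater than 1.
- If the width is 1, it is a simple NOP and is added with AppendMIR.
- Otherwise it is embedded data that is never executed, so it is marked as not continuing and the current basic block ends there.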
// Check for inline data block signatures.
if (opcode == Instruction::NOP) {
// A simple NOP will have a width of 1 at this point, embedded data NOP > 1.
if ((width == 1) && ((current_offset_ & 0x1) == 0x1) && ((code_end - code_ptr) > 1)) {
// Could be an aligning nop. If an embedded data NOP follows, treat pair as single unit.
uint16_t following_raw_instruction = code_ptr[1];
if ((following_raw_instruction == Instruction::kSparseSwitchSignature) ||
(following_raw_instruction == Instruction::kPackedSwitchSignature) ||
(following_raw_instruction == Instruction::kArrayDataSignature)) {
width += Instruction::At(code_ptr + 1)->SizeInCodeUnits();
}
}
if (width == 1) {
// It is a simple nop - treat normally.
cur_block->AppendMIR(insn);
} else {
DCHECK(cur_block->fall_through == NullBasicBlockId);
DCHECK(cur_block->taken == NullBasicBlockId);
// Unreachable instruction, mark for no continuation and end basic block.
flags &= ~Instruction::kContinue;
FindBlock(current_offset_ + width, /* create */ true,
/* immed_pred_block_p */ nullptr, &dex_pc_to_block_map);
}
If the instruction is not a NOP, it is appended with AppendMIR directly.
} else {
cur_block->AppendMIR(insn);
}
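The alignment check can be sketched as follows. The three signature values are the ones the Dalvik bytecode format assigns to the packed-switch, sparse-switch and fill-array-data payloads. PayloadSizeInCodeUnits and NopWidth are hypothetical helpers, and the sketch only covers the simple NOP (raw code unit 0x0000). The check is restricted to odd offsets because payloads must start at an even code-unit offset, so an aligning NOP can only sit at an odd one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint16_t kPackedSwitchSignature = 0x0100;
constexpr uint16_t kSparseSwitchSignature = 0x0200;
constexpr uint16_t kArrayDataSignature = 0x0300;

// Size of a payload in 16-bit code units, or 0 if insns is not a payload.
size_t PayloadSizeInCodeUnits(const uint16_t* insns) {
  switch (insns[0]) {
    case kPackedSwitchSignature:
      return static_cast<size_t>(4 + insns[1] * 2);
    case kSparseSwitchSignature:
      return static_cast<size_t>(2 + insns[1] * 4);
    case kArrayDataSignature: {
      uint32_t element_width = insns[1];
      uint32_t count = insns[2] | (static_cast<uint32_t>(insns[3]) << 16);
      return 4 + (element_width * count + 1) / 2;
    }
    default:
      return 0;
  }
}

// Width to consume for a simple NOP at the given offset: 1, or 1 plus the
// size of the payload that follows when the NOP is only there for alignment.
size_t NopWidth(const uint16_t* code_ptr, const uint16_t* code_end, uint32_t offset) {
  assert(code_ptr < code_end && code_ptr[0] == 0x0000);
  size_t width = 1;
  if ((offset & 1u) == 1u && (code_end - code_ptr) > 1) {
    width += PayloadSizeInCodeUnits(code_ptr + 1);
  }
  return width;
}
```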
Next, branch instructions are handled:
// Associate the starting dex_pc for this opcode with its containing basic block.
dex_pc_to_block_map[insn->offset] = cur_block->id;
code_ptr += width;
if (flags & Instruction::kBranch) {
cur_block = ProcessCanBranch(cur_block, insn, current_offset_,
width, flags, code_ptr, code_end, &dex_pc_to_block_map);
Handling return instructions:
} else if (flags & Instruction::kReturn) {
cur_block->terminated_by_return = true;
cur_block->fall_through = exit_block_->id;
exit_block_->predecessors.push_back(cur_block->id);
/*
* Terminate the current block if there are instructions
* afterwards.
*/
if (code_ptr < code_end) {
/*
* Create a fallthrough block for real instructions
* (incl. NOP).
*/
FindBlock(current_offset_ + width, /* create */ true,
/* immed_pred_block_p */ nullptr, &dex_pc_to_block_map);
}
Handling instructions that can throw:
} else if (flags & Instruction::kThrow) {
cur_block = ProcessCanThrow(cur_block, insn, current_offset_, width, flags, try_block_addr_,
code_ptr, code_end, &dex_pc_to_block_map);
Handling switch instructions:
} else if (flags & Instruction::kSwitch) {
cur_block = ProcessCanSwitch(cur_block, insn, current_offset_, width,
flags, &dex_pc_to_block_map);
}
...
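Then look up the next BasicBlock. If one is found, the current block is linked to it.
Repeating this for every instruction, the blocks end up connected into a graph.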
current_offset_ += width;
BasicBlock* next_block = FindBlock(current_offset_, /* create */ false,
/* immed_pred_block_p */ nullptr,
&dex_pc_to_block_map);
if (next_block) {
/*
* The next instruction could be the target of a previously parsed
* forward branch so a block is already created. If the current
* instruction is not an unconditional branch, connect them through
* the fall-through link.
*/
DCHECK(cur_block->fall_through == NullBasicBlockId ||
GetBasicBlock(cur_block->fall_through) == next_block ||
GetBasicBlock(cur_block->fall_through) == exit_block_);
if ((cur_block->fall_through == NullBasicBlockId) && (flags & Instruction::kContinue)) {
cur_block->fall_through = next_block->id;
next_block->predecessors.push_back(cur_block->id);
}
cur_block = next_block;
}
}
merged_df_flags_ = merged_df_flags;
...
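The linking step at the end of each loop iteration can be sketched in isolation. Block, AdvanceTo and the value of kContinue are hypothetical and chosen for the sketch. The function returns the block that the following instructions belong to.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint16_t kNoBlock = 0;  // Id 0 means "no block".
constexpr int kContinue = 0x1;    // Hypothetical flag value.

// Hypothetical, simplified stand-in for BasicBlock.
struct Block {
  uint16_t id = kNoBlock;
  uint16_t fall_through = kNoBlock;
  std::vector<uint16_t> predecessors;
};

// If a block already starts at the next offset, link the current block to
// it (when the instruction can fall through) and switch over to it.
Block* AdvanceTo(Block* cur, Block* next, int flags) {
  assert(cur != nullptr);
  if (next == nullptr) {
    return cur;  // No block starts here; keep filling the current one.
  }
  if (cur->fall_through == kNoBlock && (flags & kContinue) != 0) {
    cur->fall_through = next->id;
    next->predecessors.push_back(cur->id);
  }
  return next;
}
```

An instruction that cannot continue, such as an unconditional branch, still causes the switch to the next block, but no fall-through edge is added.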
Finally, check whether control can fall through past the end of the method's code.
// Check if there's been a fall-through out of the method code.
BasicBlockId out_bb_id = dex_pc_to_block_map[current_code_item_->insns_size_in_code_units_];
if (UNLIKELY(out_bb_id != NullBasicBlockId)) {
// Eagerly calculate DFS order to determine if the block is dead.
DCHECK(!DfsOrdersUpToDate());
ComputeDFSOrders();
BasicBlock* out_bb = GetBasicBlock(out_bb_id);
DCHECK(out_bb != nullptr);
if (out_bb->block_type != kDead) {
LOG(WARNING) << "Live fall-through out of method in " << PrettyMethod(method_idx, dex_file);
SetPuntToInterpreter(true);
}
}
}
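This completes one pass of MIRGraph construction. Later we will work through an example to see in detail how this flow runs when code is generated.
Some details are still uncovered, though, so let's go through them first.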
ProcessCanBranch
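ProcessCanBranch handles the following branch instructions:
- Unconditional branches
  - GOTO
  - GOTO_16
  - GOTO_32
- Conditional branches
  - IF_EQ: equal
  - IF_NE: not equal
  - IF_LT: less than
  - IF_GE: greater than or equal
  - IF_GT: greater than
  - IF_LE: less than or equal
There are also the two-operand instructions, IF_XXZ, which compare one register against zero.
As we saw in the instruction formats in the previous section, IF_EQ takes three operands: IF_EQ vA, vB, +CCCC. The corresponding IF_EQZ takes two: IF_EQZ vAA, +BBBB
The first step selects the branch offset operand according to the opcode: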
/* Process instructions with the kBranch flag */
BasicBlock* MIRGraph::ProcessCanBranch(BasicBlock* cur_block, MIR* insn, DexOffset cur_offset,
int width, int flags, const uint16_t* code_ptr,
const uint16_t* code_end,
ScopedArenaVector<uint16_t>* dex_pc_to_block_map) {
DexOffset target = cur_offset;
switch (insn->dalvikInsn.opcode) {
case Instruction::GOTO:
case Instruction::GOTO_16:
case Instruction::GOTO_32:
target += insn->dalvikInsn.vA;
break;
case Instruction::IF_EQ:
case Instruction::IF_NE:
case Instruction::IF_LT:
case Instruction::IF_GE:
case Instruction::IF_GT:
case Instruction::IF_LE:
cur_block->conditional_branch = true;
target += insn->dalvikInsn.vC;
break;
case Instruction::IF_EQZ:
case Instruction::IF_NEZ:
case Instruction::IF_LTZ:
case Instruction::IF_GEZ:
case Instruction::IF_GTZ:
case Instruction::IF_LEZ:
cur_block->conditional_branch = true;
target += insn->dalvikInsn.vB;
break;
default:
LOG(FATAL) << "Unexpected opcode(" << insn->dalvikInsn.opcode << ") with kBranch set";
}
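The operand selection can be sketched as below. BranchFormat, Decoded and BranchTarget are hypothetical names that group the opcodes by which operand carries the offset. The offset is signed, counted in 16-bit code units, and relative to the address of the branch instruction itself, so a negative value is a backward branch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical grouping of branch opcodes by where the offset lives.
enum class BranchFormat { kGoto, kIfTwoRegs, kIfZero };

// Hypothetical stand-in for the decoded operands of an instruction.
struct Decoded {
  int32_t vA;
  int32_t vB;
  int32_t vC;
};

// Compute the absolute target of a branch located at cur_offset.
uint32_t BranchTarget(uint32_t cur_offset, BranchFormat format, const Decoded& insn) {
  int32_t delta = 0;
  switch (format) {
    case BranchFormat::kGoto:       // goto +AA
      delta = insn.vA;
      break;
    case BranchFormat::kIfTwoRegs:  // if-eq vA, vB, +CCCC
      delta = insn.vC;
      break;
    case BranchFormat::kIfZero:     // if-eqz vAA, +BBBB
      delta = insn.vB;
      break;
  }
  int64_t target = static_cast<int64_t>(cur_offset) + delta;
  assert(target >= 0);
  return static_cast<uint32_t>(target);
}
```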
Then the target block of the branch is looked up, created if necessary, and linked in as the taken successor:
CountBranch(target);
BasicBlock* taken_block = FindBlock(target, /* create */ true,
/* immed_pred_block_p */ &cur_block,
dex_pc_to_block_map);
DCHECK(taken_block != nullptr);
cur_block->taken = taken_block->id;
taken_block->predecessors.push_back(cur_block->id);
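Next comes the fall-through. If the instruction can continue (kContinue is set, as it is for conditional branches), the block at the following offset becomes the fall-through successor. Otherwise a block is created there only if more code follows: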
/* Always terminate the current block for conditional branches */
if (flags & Instruction::kContinue) {
BasicBlock* fallthrough_block = FindBlock(cur_offset + width,
/* create */
true,
/* immed_pred_block_p */
&cur_block,
dex_pc_to_block_map);
DCHECK(fallthrough_block != nullptr);
cur_block->fall_through = fallthrough_block->id;
fallthrough_block->predecessors.push_back(cur_block->id);
} else if (code_ptr < code_end) {
FindBlock(cur_offset + width, /* create */ true, /* immed_pred_block_p */ nullptr, dex_pc_to_block_map);
}
return cur_block;
}
ProcessCanSwitch
Handling switch instructions:
/* Process instructions with the kSwitch flag */
BasicBlock* MIRGraph::ProcessCanSwitch(BasicBlock* cur_block, MIR* insn, DexOffset cur_offset,
int width, int flags,
ScopedArenaVector<uint16_t>* dex_pc_to_block_map) {
UNUSED(flags);
const uint16_t* switch_data =
reinterpret_cast<const uint16_t*>(GetCurrentInsns() + cur_offset +
static_cast<int32_t>(insn->dalvikInsn.vB));
int size;
const int* keyTable;
const int* target_table;
int i;
int first_key;
If the switch cases are stored in the packed format:
/*
* Packed switch data format:
* ushort ident = 0x0100 magic value
* ushort size number of entries in the table
* int first_key first (and lowest) switch case value
* int targets[size] branch targets, relative to switch opcode
*
* Total size is (4+size*2) 16-bit code units.
*/
if (insn->dalvikInsn.opcode == Instruction::PACKED_SWITCH) {
DCHECK_EQ(static_cast<int>(switch_data[0]),
static_cast<int>(Instruction::kPackedSwitchSignature));
size = switch_data[1];
first_key = switch_data[2] | (switch_data[3] << 16);
target_table = reinterpret_cast<const int*>(&switch_data[4]);
keyTable = nullptr; // Make the compiler happy.
If they are stored in the sparse (non-packed) format:
/*
* Sparse switch data format:
* ushort ident = 0x0200 magic value
* ushort size number of entries in the table; > 0
* int keys[size] keys, sorted low-to-high; 32-bit aligned
* int targets[size] branch targets, relative to switch opcode
*
* Total size is (2+size*4) 16-bit code units.
*/
} else {
DCHECK_EQ(static_cast<int>(switch_data[0]),
static_cast<int>(Instruction::kSparseSwitchSignature));
size = switch_data[1];
keyTable = reinterpret_cast<const int*>(&switch_data[2]);
target_table = reinterpret_cast<const int*>(&switch_data[2 + size*2]);
first_key = 0; // To make the compiler happy.
}
...
Next, the block for each case is looked up and recorded as a successor of the current block.
cur_block->successor_block_list_type =
(insn->dalvikInsn.opcode == Instruction::PACKED_SWITCH) ? kPackedSwitch : kSparseSwitch;
cur_block->successor_blocks.reserve(size);
for (i = 0; i < size; i++) {
BasicBlock* case_block = FindBlock(cur_offset + target_table[i], /* create */ true,
/* immed_pred_block_p */ &cur_block,
dex_pc_to_block_map);
DCHECK(case_block != nullptr);
SuccessorBlockInfo* successor_block_info =
static_cast<SuccessorBlockInfo*>(arena_->Alloc(sizeof(SuccessorBlockInfo),
kArenaAllocSuccessor));
successor_block_info->block = case_block->id;
successor_block_info->key =
(insn->dalvikInsn.opcode == Instruction::PACKED_SWITCH) ?
first_key + i : keyTable[i];
cur_block->successor_blocks.push_back(successor_block_info);
case_block->predecessors.push_back(cur_block->id);
}
Next comes the fall-through case, which is the default case.
/* Fall-through case */
BasicBlock* fallthrough_block = FindBlock(cur_offset + width, /* create */ true,
/* immed_pred_block_p */ nullptr,
dex_pc_to_block_map);
DCHECK(fallthrough_block != nullptr);
cur_block->fall_through = fallthrough_block->id;
fallthrough_block->predecessors.push_back(cur_block->id);
return cur_block;
}
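To make the two table layouts concrete, here is a sketch that decodes either payload into (key, relative target) pairs, following the format comments quoted above. ReadInt and DecodeSwitch are hypothetical helpers. Unlike the ART code, which reinterprets the table as an array of int, the sketch assembles each 32-bit value from two 16-bit code units, which is a choice made here to avoid depending on alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Read a 32-bit value stored as two 16-bit code units, low half first.
int32_t ReadInt(const uint16_t* p) {
  return static_cast<int32_t>(p[0] | (static_cast<uint32_t>(p[1]) << 16));
}

// Return (key, relative target) for every case in a switch payload.
std::vector<std::pair<int32_t, int32_t>> DecodeSwitch(const uint16_t* data) {
  assert(data[0] == 0x0100 || data[0] == 0x0200);
  const bool packed = (data[0] == 0x0100);
  const int size = data[1];
  std::vector<std::pair<int32_t, int32_t>> cases;
  cases.reserve(size);
  if (packed) {
    // Keys are implicit: first_key, first_key + 1, ...
    const int32_t first_key = ReadInt(data + 2);
    for (int i = 0; i < size; ++i) {
      cases.emplace_back(first_key + i, ReadInt(data + 4 + 2 * i));
    }
  } else {
    // Keys are stored explicitly, followed by the targets.
    for (int i = 0; i < size; ++i) {
      cases.emplace_back(ReadInt(data + 2 + 2 * i),
                         ReadInt(data + 2 + 2 * size + 2 * i));
    }
  }
  return cases;
}
```

Each target is relative to the switch opcode, which is why ProcessCanSwitch adds cur_offset to target_table[i] before calling FindBlock.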
ProcessCanThrow: handling instructions that can throw
/* Process instructions with the kThrow flag */
BasicBlock* MIRGraph::ProcessCanThrow(BasicBlock* cur_block, MIR* insn, DexOffset cur_offset,
int width, int flags, ArenaBitVector* try_block_addr,
const uint16_t* code_ptr, const uint16_t* code_end,
ScopedArenaVector<uint16_t>* dex_pc_to_block_map) {
UNUSED(flags);
bool in_try_block = try_block_addr->IsBitSet(cur_offset);
bool is_throw = (insn->dalvikInsn.opcode == Instruction::THROW);
First, the case where the instruction is inside a try block:
/* In try block */
if (in_try_block) {
CatchHandlerIterator iterator(*current_code_item_, cur_offset);
if (cur_block->successor_block_list_type != kNotUsed) {
LOG(INFO) << PrettyMethod(cu_->method_idx, *cu_->dex_file);
LOG(FATAL) << "Successor block list already in use: "
<< static_cast<int>(cur_block->successor_block_list_type);
}
for (; iterator.HasNext(); iterator.Next()) {
BasicBlock* catch_block = FindBlock(iterator.GetHandlerAddress(), false /* create */,
nullptr /* immed_pred_block_p */,
dex_pc_to_block_map);
if (insn->dalvikInsn.opcode == Instruction::MONITOR_EXIT &&
IsBadMonitorExitCatch(insn->offset, catch_block->start_offset)) {
// Don't allow monitor-exit to catch its own exception, http://b/15745363 .
continue;
}
if (cur_block->successor_block_list_type == kNotUsed) {
cur_block->successor_block_list_type = kCatch;
}
catch_block->catch_entry = true;
if (kIsDebugBuild) {
catches_.insert(catch_block->start_offset);
}
SuccessorBlockInfo* successor_block_info = reinterpret_cast<SuccessorBlockInfo*>
(arena_->Alloc(sizeof(SuccessorBlockInfo), kArenaAllocSuccessor));
successor_block_info->block = catch_block->id;
successor_block_info->key = iterator.GetHandlerTypeIndex();
cur_block->successor_blocks.push_back(successor_block_info);
catch_block->predecessors.push_back(cur_block->id);
}
in_try_block = (cur_block->successor_block_list_type != kNotUsed);
}
bool build_all_edges =
(cu_->disable_opt & (1 << kSuppressExceptionEdges)) || is_throw || in_try_block;
if (!in_try_block && build_all_edges) {
BasicBlock* eh_block = CreateNewBB(kExceptionHandling);
cur_block->taken = eh_block->id;
eh_block->start_offset = cur_offset;
eh_block->predecessors.push_back(cur_block->id);
}
If the instruction is an explicit THROW, the block is marked accordingly and a new block is forced to start after it:
if (is_throw) {
cur_block->explicit_throw = true;
if (code_ptr < code_end) {
// Force creation of new block following THROW via side-effect.
FindBlock(cur_offset + width, /* create */ true, /* immed_pred_block_p */ nullptr, dex_pc_to_block_map);
}
if (!in_try_block) {
// Don't split a THROW that can't rethrow - we're done.
return cur_block;
}
}
if (!build_all_edges) {
/*
* Even though there is an exception edge here, control cannot return to this
* method. Thus, for the purposes of dataflow analysis and optimization, we can
* ignore the edge. Doing this reduces compile time, and increases the scope
* of the basic-block level optimization pass.
*/
return cur_block;
}
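Next, the potentially throwing instruction is split into a check half and a work half. The comment explains this in detail, and we will discuss the details later.
At this stage what matters is to have a picture of the overall flow; there is no need to dwell on the details.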
/*
* Split the potentially-throwing instruction into two parts.
* The first half will be a pseudo-op that captures the exception
* edges and terminates the basic block. It always falls through.
* Then, create a new basic block that begins with the throwing instruction
* (minus exceptions). Note: this new basic block must NOT be entered into
* the block_map. If the potentially-throwing instruction is the target of a
* future branch, we need to find the check pseudo half. The new
* basic block containing the work portion of the instruction should
* only be entered via fallthrough from the block containing the
* pseudo exception edge MIR. Note also that this new block is
* not automatically terminated after the work portion, and may
* contain following instructions.
*
* Note also that the dex_pc_to_block_map entry for the potentially
* throwing instruction will refer to the original basic block.
*/
BasicBlock* new_block = CreateNewBB(kDalvikByteCode);
new_block->start_offset = insn->offset;
cur_block->fall_through = new_block->id;
new_block->predecessors.push_back(cur_block->id);
MIR* new_insn = NewMIR();
*new_insn = *insn;
insn->dalvikInsn.opcode = static_cast<Instruction::Code>(kMirOpCheck);
// Associate the two halves.
insn->meta.throw_insn = new_insn;
new_block->AppendMIR(new_insn);
return new_block;
}
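The split at the end of ProcessCanThrow can be sketched with simplified types. Insn, Block, Graph, SplitThrowing and kOpCheck are hypothetical stand-ins (kOpCheck plays the role of kMirOpCheck), and std::unique_ptr replaces the arena allocation. The order matters: the instruction is copied first, and only then is the original turned into the check pseudo-op.

```cpp
#include <cassert>
#include <memory>
#include <vector>

constexpr int kOpCheck = -1;  // Hypothetical stand-in for kMirOpCheck.

// Hypothetical, simplified stand-in for MIR.
struct Insn {
  int opcode = 0;
  unsigned offset = 0;
  Insn* throw_insn = nullptr;  // Set on the check half; points at the work half.
};

// Hypothetical, simplified stand-in for BasicBlock.
struct Block {
  unsigned start_offset = 0;
  Block* fall_through = nullptr;
  std::vector<Insn*> insns;
};

struct Graph {
  std::vector<std::unique_ptr<Block>> blocks;
  std::vector<std::unique_ptr<Insn>> insn_pool;

  // cur already ends with insn. Turn insn into a check pseudo-op and put a
  // copy of the real work into a fresh block reached only by fall-through.
  Block* SplitThrowing(Block* cur, Insn* insn) {
    assert(!cur->insns.empty() && cur->insns.back() == insn);
    blocks.push_back(std::unique_ptr<Block>(new Block()));
    Block* work_block = blocks.back().get();
    work_block->start_offset = insn->offset;
    cur->fall_through = work_block;
    insn_pool.push_back(std::unique_ptr<Insn>(new Insn(*insn)));  // Copy first.
    Insn* work = insn_pool.back().get();
    insn->opcode = kOpCheck;  // The original becomes the check half.
    insn->throw_insn = work;  // Associate the two halves.
    work_block->insns.push_back(work);
    return work_block;
  }
};
```

The returned block becomes the current block, so instructions parsed afterwards are appended behind the work half rather than behind the check half.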