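For a more detailed walkthrough and a code debugging demo, please follow the link.
In the previous section we discussed how arguments are passed when the JVM calls a function. That explanation contained an error, which I correct here first. This is how I described the JVM's argument passing in the previous section: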
static double lotsOfArguments(int a, long b, float c, String[][] d) {
....
}
Before the function lotsOfArguments above is executed, the JVM places all of its input arguments on the stack. When the function starts executing, the arguments are copied from the stack into the local variable list. So just before lotsOfArguments runs, the arguments sit on the stack like this:
stack:
d
c
b
a
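When the function executes, the arguments on the stack are copied one by one into the local variable list, giving:
local_list: d c b a
The mistake in that description is that the order in which arguments are copied from the stack to the local variable list was reversed. After copying, the order in the list should be: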
local_list: a b c d.
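The corrected order can be illustrated with a small simulation. This is only a sketch of the model used in this article: the class and method names are our own, and it assumes every argument occupies exactly one slot (on a real JVM, long and double values take two slots).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class ArgOrderDemo {
    // Pops argCount values off the simulated operand stack and places them in
    // local slots so that the argument pushed first ends up in slot 0.
    static String[] copyArgsToLocals(Deque<String> stack, int argCount) {
        String[] locals = new String[argCount];
        for (int slot = argCount - 1; slot >= 0; slot--) {
            locals[slot] = stack.pop(); // the top of the stack is the last argument
        }
        return locals;
    }

    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<>();
        for (String arg : new String[] {"a", "b", "c", "d"}) {
            stack.push(arg); // the caller pushes a first and d last
        }
        System.out.println(Arrays.toString(copyArgsToLocals(stack, 4)));
    }
}
```

Running main prints the slots in declaration order, a b c d, which is the order the rest of this section relies on.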
Consequently, when our compiler translates C code that contains function calls with arguments into Java bytecode, it has to respect the argument order described above. In the previous section we used a function named getLocalVariableIndex(Symbol symbol) to look up the position of a function's input argument in the local variable list. Its implementation needs to be corrected accordingly. Fortunately the change is small:
public int getLocalVariableIndex(Symbol symbol) {
TypeSystem typeSys = TypeSystem.getTypeSystem();
String funcName = nameStack.peek();
Symbol funcSym = typeSys.getSymbolByText(funcName, 0, "main");
ArrayList<Symbol> localVariables = new ArrayList<Symbol>();
Symbol s = funcSym.getArgList();
while (s != null) {
localVariables.add(s);
s = s.getNextSymbol();
}
Collections.reverse(localVariables); // compared with the previous section's code, adding this one line fixes the error described above
ArrayList<Symbol> list = typeSys.getSymbolsByScope(symbol.getScope());
for (int i = 0; i < list.size(); i++) {
if (localVariables.contains(list.get(i)) == false) {
localVariables.add(list.get(i));
}
}
for (int i = 0; i < localVariables.size(); i++) {
if (localVariables.get(i) == symbol) {
return i;
}
}
return -1;
}
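The lookup logic can be tried in isolation with plain strings instead of Symbol objects. The sketch below is not the compiler's code: LocalIndexDemo and localIndex are hypothetical names, and it assumes that the symbol table hands back the parameters in reverse declaration order (which is why the reverse call is needed) and the scope symbols in declaration order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LocalIndexDemo {
    // argsAsStored: parameter names in the (assumed) reversed order of the symbol table.
    // scopeSymbols: every symbol in the function's scope, parameters included.
    static int localIndex(List<String> argsAsStored, List<String> scopeSymbols, String name) {
        List<String> slots = new ArrayList<>(argsAsStored);
        Collections.reverse(slots);            // parameters are now in declaration order
        for (String symbol : scopeSymbols) {
            if (!slots.contains(symbol)) {
                slots.add(symbol);             // locals follow the parameters
            }
        }
        return slots.indexOf(name);            // -1 when the symbol is unknown
    }

    public static void main(String[] args) {
        List<String> argsAsStored = Arrays.asList("x", "c", "b", "a");
        List<String> scope = Arrays.asList("a", "b", "c", "x", "d");
        for (String name : scope) {
            System.out.println(name + " -> slot " + localIndex(argsAsStored, scope, name));
        }
    }
}
```

For a function declared as f(int a, int b, int c, int x) with one local d, this assigns slots 0 to 3 to the parameters and slot 4 to d.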
Speaking of function calls with arguments, the one we use most in our C code is printf, so it is worth looking closely at how it is compiled into Java bytecode. Suppose the C code contains this statement:
printf("the value of a, b , c is: %d, %d, %d", a, b, c);
In Java, the counterpart of printf is System.out.print. To implement the printf call above on the JVM, we first have to push the System.out object onto the stack and then push the values to be printed to the console. Let's start with a simpler version:
printf("%d", x);
The code above is meant to print the value of the integer variable x to the console. When our compiler parses it, however, it parses the variable x first. Suppose x is at position 0 in the local variable list. As soon as our current compiler parses x, it emits this Java bytecode instruction:
ILOAD 0
Only after parsing x does the compiler realize that printf is a function name rather than a variable. At that point it knows that printf must be mapped to the Java object System.out, so it pushes System.out onto the stack by emitting:
getstatic java/lang/System/out Ljava/io/PrintStream;
This pushes the System.out object onto the stack, which now looks like this:
stack:
System.out
x
This is a problem. To print the value of x correctly, the stack should look like this:
stack:
x
System.out
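In other words, the two entries are in swapped positions on the stack. To handle this, we need to change the compiler's code generation. Once the compiler has parsed x and thereby pushed its value onto the stack, and then goes on to parse printf, we must first move x off the stack. Because x was loaded onto the stack from the local variable list, after ILOAD 0 executes the value of x exists both on the stack and in the list: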
stack:
x
local_list: x
Now we need to move x off the stack. Where should it go? We store it at the end of the local variable list, so we execute the instruction ISTORE 1. After that instruction, the stack and the list look like this:
stack:
null
local_list: x , x
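The value of x now appears twice in the list. You may wonder: why not simply pop x off the top of the stack, push System.out, and then load x from the list again? Or, when moving x off the stack, why not store it straight back to its original position in the list, that is, execute ISTORE 0? Neither approach works. Consider this case: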
printf("%d", x+1);
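Here we want to print the value of x+1, and x itself must be unchanged afterwards. When the compiler parses this code, it parses x+1 first and pushes the value of that expression onto the stack, which gives: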
stack:
x+1
local_list: x
If we simply popped the top of the stack, the value of x+1 would be lost. If we stored the top of the stack back into x's own position in the list, we would get:
stack:
null
local_list: x+1
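That changes the value of x, so it does not work either. The value on top of the stack therefore has to be stored at the end of the local variable list, that is, with the instruction ISTORE 1: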
stack:
null
local_list: x, x+1
Next, push System.out onto the stack by executing:
getstatic java/lang/System/out Ljava/io/PrintStream;
The state is now:
stack:
System.out
local_list: x, x+1
Then load the element at the end of the list back onto the top of the stack by executing ILOAD 1, which gives:
stack:
x+1
System.out
local_list: x, x+1
Now we call the print(I)V method of the System.out object, the method that prints an integer. The corresponding instruction is:
invokevirtual java/io/PrintStream/print(I)V
To sum up, when the compiler compiles the statement:
printf("%d", x);
into Java bytecode, the result is:
iload 0
istore 1
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 1
invokevirtual java/io/PrintStream/print(I)V
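To convince ourselves that this five-instruction sequence leaves the operands in the right order, we can step through it on a simulated stack. The interpreter below is a hypothetical toy written for this article only: it understands just these four mnemonics, uses the string "System.out" as a stand-in for the PrintStream reference, and treats slot 1 as the scratch slot at the end of the list.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class PrintfSpillDemo {
    // Runs the program against a simulated operand stack and local variable list
    // and returns the text that print(I)V would have written.
    static String run(String[] program, int[] locals) {
        Deque<Object> stack = new ArrayDeque<>();
        StringBuilder out = new StringBuilder();
        for (String line : program) {
            String[] parts = line.trim().split("\\s+");
            switch (parts[0]) {
                case "iload":
                    stack.push(locals[Integer.parseInt(parts[1])]);
                    break;
                case "istore":
                    locals[Integer.parseInt(parts[1])] = (Integer) stack.pop();
                    break;
                case "getstatic":
                    stack.push("System.out"); // stand-in for the object reference
                    break;
                case "invokevirtual": {
                    Object arg = stack.pop();      // the int argument must be on top
                    Object receiver = stack.pop(); // the receiver must sit below it
                    if (!(arg instanceof Integer) || !"System.out".equals(receiver)) {
                        throw new IllegalStateException("operands are in the wrong order");
                    }
                    out.append(arg);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + parts[0]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] program = {
            "iload 0",
            "istore 1",
            "getstatic java/lang/System/out Ljava/io/PrintStream;",
            "iload 1",
            "invokevirtual java/io/PrintStream/print(I)V"
        };
        int[] locals = {42, 0}; // slot 0 holds x, slot 1 is the scratch slot
        System.out.println(run(program, locals));
    }
}
```

If the istore/iload pair is dropped, so that getstatic is executed while x is still on the stack, the simulated invokevirtual finds the operands swapped and throws, which is the situation the spill to the end of the list avoids.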
Now let's look at a more complex case:
int f(int a, int b, int c, int x) {
printf("%d, %d, %d, %d", a, b ,c , x);
int d = x;
return d;
}
The function takes four input arguments, so when it runs on the JVM they are placed in the local variable list:
local_list: a, b , c , x, d
Variable a is at position 0, b at position 1, c at position 2, and x at position 3. The function also has a local variable d, which goes at the end of the list, at position 4.
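When the compiler parses the statement printf("%d, %d, %d, %d", a, b, c, x);, it first parses the arguments, that is, the variables a, b, c, and x, as discussed earlier. Each time it parses a variable, our compiler automatically emits an iload instruction that loads the variable onto the stack, which produces:
iload 0 ; parse variable a
iload 1 ; parse variable b
iload 2 ; parse variable c
iload 3 ; parse variable x
After these instructions execute, the JVM stack and list look like this: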
stack :
x
c
b
a
local_list: a, b , c, x , d
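As discussed above, we first have to move the values from the stack to the end of the list, so the compiler emits:
istore 5 ; store x at list position 5
istore 6 ; store c at list position 6
istore 7 ; store b at list position 7
istore 8 ; store a at list position 8
The stack and the list then look like this: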
stack:
null
local_list: a, b , c, x, d, x, c, b , a
The compiler then pushes System.out onto the stack by emitting:
getstatic java/lang/System/out Ljava/io/PrintStream;
Next, the value of a at position 8 is loaded back onto the stack with ILOAD 8. The runtime state is now:
stack:
a
System.out
local_list: a, b, c, x ,d ,x ,c ,b, a
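Now the instruction invokevirtual java/io/PrintStream/print(I)V invokes the print method of the System.out object and writes the value of a to the console. Repeating these steps prints the remaining variables in turn. Leaving aside the literal text between the placeholders, the printf statement in the code above compiles to the following Java bytecode: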
iload 0
iload 1
iload 2
iload 3
istore 5
istore 6
istore 7
istore 8
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 8
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 7
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 6
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 5
invokevirtual java/io/PrintStream/print(I)V
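As you can see, a single C statement compiles to more than ten times as many Java bytecode instructions. This is why code optimization is such an important part of compiler construction: without it, the generated code may run, but it will be very inefficient.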
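The spill-and-reload pattern in the listing above is mechanical enough to generate with a few loops. The following is a minimal sketch, not the compiler's actual generator: PrintfEmitDemo and emitPrintInts are hypothetical names, the instructions are returned as plain strings, and the literal text between placeholders is ignored. The scratch slots start at the localCount value passed in, which is 5 here to match the listing.

```java
import java.util.ArrayList;
import java.util.List;

final class PrintfEmitDemo {
    // localCount: first free scratch slot, as in the listing above.
    // argSlots: slots of the int variables passed to printf, in call order.
    static List<String> emitPrintInts(int localCount, int[] argSlots) {
        List<String> code = new ArrayList<>();
        for (int slot : argSlots) {
            code.add("iload " + slot);              // arguments are evaluated first
        }
        for (int i = 0; i < argSlots.length; i++) {
            code.add("istore " + (localCount + i)); // spill; the last argument is popped first
        }
        for (int i = argSlots.length - 1; i >= 0; i--) {
            code.add("getstatic java/lang/System/out Ljava/io/PrintStream;");
            code.add("iload " + (localCount + i));  // the first argument sits in the highest scratch slot
            code.add("invokevirtual java/io/PrintStream/print(I)V");
        }
        return code;
    }

    public static void main(String[] args) {
        // Same setting as the example: arguments a, b, c, x in slots 0 to 3,
        // scratch slots starting at 5.
        for (String instruction : emitPrintInts(5, new int[] {0, 1, 2, 3})) {
            System.out.println(instruction);
        }
    }
}
```

For four arguments this emits 20 instructions, with the stores going to slots 5 through 8 and the reloads coming back from 8 down to 5.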
The code that implements this compilation step is in ClibCall.java:
private Object handlePrintfCall() {
ArrayList<Object> argsList = FunctionArgumentList.getFunctionArgumentList().getFuncArgList(false);
String argStr = (String)argsList.get(0);
String formatStr = "";
int i = 0;
int argCount = 1;
String str = "";
while (i < argStr.length()) {
if (argStr.charAt(i) == '%' && i+1 < argStr.length() &&
argStr.charAt(i+1) == 'd') {
i += 2;
//generateJavaAssemblyForPrintf(str);
str = "";
formatStr += argsList.get(argCount);
argCount++;
//printInteger();
} else {
str += argStr.charAt(i);
formatStr += argStr.charAt(i);
i++;
}
}
System.out.println(formatStr);
generateJavaAssemblyForPrintf(argStr, argCount - 1);
return null;
}
private void generateJavaAssemblyForPrintf(String argStr, int argCount) {
ProgramGenerator generator = ProgramGenerator.getInstance();
String funcName = generator.getCurrentFuncName();
TypeSystem typeSystem = TypeSystem.getTypeSystem();
ArrayList<Symbol> list = typeSystem.getSymbolsByScope(funcName);
int localVariableNum = list.size();
int count = 0;
while (count < argCount) {
generator.emit(Instruction.ISTORE, "" + (localVariableNum + count));
count++;
}
int i = 0;
String str = "";
count = argCount - 1;
while (i < argStr.length()) {
if (argStr.charAt(i) == '%' && i+1 < argStr.length() &&
argStr.charAt(i+1) == 'd') {
i += 2;
printString(str);
str = "";
printInteger(localVariableNum + count);
count--;
} else {
str += argStr.charAt(i);
i++;
}
}
printString("\n");
}
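Both functions above walk the format string with the same loop, separating literal text from %d placeholders. That scan can be isolated as follows. This is a simplified sketch with hypothetical names (FormatSplitDemo, split); unlike the loop above, it skips empty literal pieces and also keeps any text that follows the last %d, which are our own choices for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class FormatSplitDemo {
    // Splits a printf-style format string into literal pieces and "%d" markers.
    // Only %d is recognized, matching the placeholder handled in this section.
    static List<String> split(String format) {
        List<String> pieces = new ArrayList<>();
        StringBuilder literal = new StringBuilder();
        int i = 0;
        while (i < format.length()) {
            boolean isIntPlaceholder = format.charAt(i) == '%'
                    && i + 1 < format.length()
                    && format.charAt(i + 1) == 'd';
            if (isIntPlaceholder) {
                if (literal.length() > 0) {
                    pieces.add(literal.toString()); // text that precedes this %d
                    literal.setLength(0);
                }
                pieces.add("%d");
                i += 2;
            } else {
                literal.append(format.charAt(i));
                i++;
            }
        }
        if (literal.length() > 0) {
            pieces.add(literal.toString());         // text after the last %d
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("value of a, b is: %d, %d"));
    }
}
```

Each literal piece would then become an ldc plus print(Ljava/lang/String;)V call, and each "%d" marker a reload of the matching scratch slot followed by print(I)V.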
Finally, let's look at the JVM's arithmetic instructions. The instructions for the four basic operations on integers are iadd, isub, imul, and idiv. To compute 1+2, push the values 1 and 2 onto the stack and execute iadd: the two values on top of the stack are popped, and their sum, 3, is pushed. The other arithmetic instructions work the same way. For floats rather than integers, the corresponding instructions are fadd, fsub, fmul, and fdiv; for longs they are ladd, lsub, lmul, and ldiv. Our compiler currently supports only the integer instructions, and the implementation is in BinaryExecutor.java:
public class BinaryExecutor extends BaseExecutor{
@Override
public Object Execute(ICodeNode root) {
....
case CGrammarInitializer.Binary_Plus_Binary_TO_Binary:
case CGrammarInitializer.Binary_DivOp_Binary_TO_Binary:
case CGrammarInitializer.Binary_Minus_Binary_TO_Binary:
case CGrammarInitializer.Binary_Start_Binary_TO_Binary:
// for now, assume both operands are integers
int val1 = (Integer)root.getChildren().get(0).getAttribute(ICodeKey.VALUE);
int val2 = (Integer)root.getChildren().get(1).getAttribute(ICodeKey.VALUE);
if (production == CGrammarInitializer.Binary_Plus_Binary_TO_Binary) {
String text = root.getChildren().get(0).getAttribute(ICodeKey.TEXT) + " plus " + root.getChildren().get(1).getAttribute(ICodeKey.TEXT);
root.setAttribute(ICodeKey.VALUE, val1 + val2);
root.setAttribute(ICodeKey.TEXT, text);
System.out.println(text + " is " + (val1+val2) );
ProgramGenerator.getInstance().emit(Instruction.IADD);
} else if (production == CGrammarInitializer.Binary_Minus_Binary_TO_Binary) {
String text = root.getChildren().get(0).getAttribute(ICodeKey.TEXT) + " minus " + root.getChildren().get(1).getAttribute(ICodeKey.TEXT);
root.setAttribute(ICodeKey.VALUE, val1 - val2);
root.setAttribute(ICodeKey.TEXT, text);
System.out.println(text + " is " + (val1-val2) );
ProgramGenerator.getInstance().emit(Instruction.ISUB);
} else if (production == CGrammarInitializer.Binary_Start_Binary_TO_Binary) {
String text = root.getChildren().get(0).getAttribute(ICodeKey.TEXT) + " * " + root.getChildren().get(1).getAttribute(ICodeKey.TEXT);
root.setAttribute(ICodeKey.VALUE, val1 * val2);
root.setAttribute(ICodeKey.TEXT, text);
System.out.println(text + " is " + (val1 * val2) );
ProgramGenerator.getInstance().emit(Instruction.IMUL);
}
else {
root.setAttribute(ICodeKey.VALUE, val1 / val2);
System.out.println( root.getChildren().get(0).getAttribute(ICodeKey.TEXT) + " is divided by "
+ root.getChildren().get(1).getAttribute(ICodeKey.TEXT) + " and result is " + (val1/val2) );
ProgramGenerator.getInstance().emit(Instruction.IDIV);
}
break;
....
}
}
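The stack behaviour of the four integer instructions can be sketched in a few lines. IntArithmeticDemo and apply are hypothetical names used only for this illustration. The point to notice is operand order: the value pushed first is the left operand, which matters for isub and idiv.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class IntArithmeticDemo {
    // Pushes value1 then value2, executes one instruction, and returns the result
    // that the instruction leaves on top of the stack.
    static int apply(String op, int value1, int value2) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(value1);       // pushed first, ends up below value2
        stack.push(value2);       // top of the stack
        int right = stack.pop();  // the top of the stack is the right operand
        int left = stack.pop();
        switch (op) {
            case "iadd": stack.push(left + right); break;
            case "isub": stack.push(left - right); break;
            case "imul": stack.push(left * right); break;
            case "idiv": stack.push(left / right); break; // integer division truncates toward zero
            default: throw new IllegalArgumentException("unsupported instruction: " + op);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(apply("iadd", 1, 2));
        System.out.println(apply("idiv", 6, 3));
    }
}
```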
With this section's code in place, our compiler can compile the following C code into Java bytecode:
int f(int a, int b, int c, int x) {
printf("value of a, b ,c ,x is: %d, %d, %d, %d", a, b, c, x);
int d;
d = (a*x*x) + (b*x);
int e;
int h;
e = 6;
h = 3;
printf("e divided by h is : %d", e/h);
return d;
}
void main() {
int c;
c = f(1, 2, 3, 4);
printf("result of calling f is :%d", c);
}
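As a quick cross-check of what the compiled program should compute, the arithmetic in f can be mirrored in plain Java. This is a hand-written equivalent, not compiler output, and the class name ExpectedResultDemo is our own.

```java
final class ExpectedResultDemo {
    // Same arithmetic as the C function f; the parameter c is unused there as well.
    static int f(int a, int b, int c, int x) {
        return (a * x * x) + (b * x);
    }

    public static void main(String[] args) {
        System.out.println(f(1, 2, 3, 4)); // 1*4*4 + 2*4
        System.out.println(6 / 3);         // the e/h division in f
    }
}
```

With the arguments used in main, f returns 24 and the division yields 2, so those are the values the generated bytecode should print.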
Compiling the C code above produces the following Java assembly:
.class public CSourceToJava
.super java/lang/Object
.method public static main([Ljava/lang/String;)V
sipush 1
sipush 2
sipush 3
sipush 4
invokestatic CSourceToJava/f(IIII)I
istore 0
iload 0
istore 1
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "result of calling f is :"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 1
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
return
.end method
.method public static f(IIII)I
iload 0
iload 1
iload 2
iload 3
istore 7
istore 8
istore 9
istore 10
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "value of a, b ,c ,x is: "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 10
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc ", "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 9
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc ", "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 8
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc ", "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 7
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 0
iload 3
imul
iload 3
imul
iload 1
iload 3
imul
iadd
istore 4
sipush 6
istore 5
sipush 3
istore 6
iload 5
iload 6
idiv
istore 7
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "e divided by h is : "
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
getstatic java/lang/System/out Ljava/io/PrintStream;
iload 7
invokevirtual java/io/PrintStream/print(I)V
getstatic java/lang/System/out Ljava/io/PrintStream;
ldc "
"
invokevirtual java/io/PrintStream/print(Ljava/lang/String;)V
iload 4
ireturn
.end method
.end class
After assembling the Java assembly code above into binary bytecode and running it, the result is as follows: