Favicon change, styling change and some drafts.
1 parent
9ed5667
commit 0159eda
Showing
20 changed files
with
347 additions
and
29 deletions.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"name":"","short_name":"","icons":[{"src":"/android-chrome-192x192.png","sizes":"192x192","type":"image/png"},{"src":"/android-chrome-512x512.png","sizes":"512x512","type":"image/png"}],"theme_color":"#ffffff","background_color":"#ffffff","display":"standalone"} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,292 @@ | ||
---
title: Writing a simple jvm
hidden: true
---

# Writing a simple jvm
<div class="h-0.5 w-full my-2 bg-pink-800/30 dark:bg-slate-600 weeb:bg-slate-600"></div>
I recently decided to write a simple `JVM`. The [JVM specification](https://docs.oracle.com/javase/specs/jvms/se17/html/index.html) was surprisingly easy to understand,
and writing a simple `JVM` was a lot easier than I had imagined.

>This post is inspired by [Zserge’s article](https://zserge.com/posts/jvm/). I follow a similar approach in this post while building the JVM.

In this post, we will write a `JVM` in `Go` that can run the following Java code:
```java
public class Main {
    int multiplier = 3;

    public static int add(int a, int b) {
        return a + b;
    }

    public int multiply(int b) {
        return b * multiplier;
    }

    public int premultiply(int c) {
        return multiply(c) * multiply(c);
    }

    public static void main(String[] args) {
        Main main = new Main();
        int d = main.premultiply(2) * add(1,2);
    }
}
```
## Parsing Java Class
We need to compile our code before we can run it. Running `javac Main.java` compiles it into `Main.class`, and our `JVM` will execute this binary `Main.class` file.

Let's take a look at a hexdump of this class using `hexdump -C Main.class`.
```
00000000 ca fe ba be 00 00 00 3d 00 1f 0a 00 02 00 03 07 |.......=........|
00000010 00 04 0c 00 05 00 06 01 00 10 6a 61 76 61 2f 6c |..........java/l|
00000020 61 6e 67 2f 4f 62 6a 65 63 74 01 00 06 3c 69 6e |ang/Object...<in|
00000030 69 74 3e 01 00 03 28 29 56 09 00 08 00 09 07 00 |it>...()V.......|
00000040 0a 0c 00 0b 00 0c 01 00 04 4d 61 69 6e 01 00 0a |.........Main...|
00000050 6d 75 6c 74 69 70 6c 69 65 72 01 00 01 49 0a 00 |multiplier...I..|
00000060 08 00 0e 0c 00 0f 00 10 01 00 08 6d 75 6c 74 69 |...........multi|
00000070 70 6c 79 01 00 04 28 49 29 49 0a 00 08 00 03 0a |ply...(I)I......|
00000080 00 08 00 13 0c 00 14 00 10 01 00 0b 70 72 65 6d |............prem|
00000090 75 6c 74 69 70 6c 79 0a 00 08 00 16 0c 00 17 00 |ultiply.........|
000000a0 18 01 00 03 61 64 64 01 00 05 28 49 49 29 49 01 |....add...(II)I.|
000000b0 00 04 43 6f 64 65 01 00 0f 4c 69 6e 65 4e 75 6d |..Code...LineNum|
000000c0 62 65 72 54 61 62 6c 65 01 00 04 6d 61 69 6e 01 |berTable...main.|
000000d0 00 16 28 5b 4c 6a 61 76 61 2f 6c 61 6e 67 2f 53 |..([Ljava/lang/S|
000000e0 74 72 69 6e 67 3b 29 56 01 00 0a 53 6f 75 72 63 |tring;)V...Sourc|
000000f0 65 46 69 6c 65 01 00 09 4d 61 69 6e 2e 6a 61 76 |eFile...Main.jav|
00000100 61 00 21 00 08 00 02 00 00 00 01 00 00 00 0b 00 |a.!.............|
<omitted>...
```
The first 4 bytes spell [0xCAFEBABE](https://en.wikipedia.org/wiki/Java_class_file), a magic number that identifies the Java class file format.
We can also spot strings such as `java/lang/Object` and `<init>` in the hexdump. We will learn more about these strings later in this post.
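
To make the layout concrete, here is a small sketch that decodes the first ten bytes of the hexdump by hand. The byte values are copied from the dump; the `classHeader` type and the `decodeHeader` function are illustrative names only and are not part of the loader built in this post.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// classHeader holds the fixed-size fields at the start of a class file.
// The type and its field names are illustrative only.
type classHeader struct {
	magic             uint32
	minorVersion      uint16
	majorVersion      uint16
	constantPoolCount uint16
}

// decodeHeader reads the first 10 bytes of a class file. Multi-byte
// values in a class file are stored in big-endian order.
func decodeHeader(b []byte) (classHeader, error) {
	if len(b) < 10 {
		return classHeader{}, fmt.Errorf("need 10 bytes, got %d", len(b))
	}
	return classHeader{
		magic:             binary.BigEndian.Uint32(b[0:4]),
		minorVersion:      binary.BigEndian.Uint16(b[4:6]),
		majorVersion:      binary.BigEndian.Uint16(b[6:8]),
		constantPoolCount: binary.BigEndian.Uint16(b[8:10]),
	}, nil
}

func main() {
	// First 10 bytes of the hexdump of Main.class.
	header := []byte{0xca, 0xfe, 0xba, 0xbe, 0x00, 0x00, 0x00, 0x3d, 0x00, 0x1f}
	h, err := decodeHeader(header)
	if err != nil {
		panic(err)
	}
	fmt.Printf("magic=%#x minor=%d major=%d constant_pool_count=%d\n",
		h.magic, h.minorVersion, h.majorVersion, h.constantPoolCount)
}
```

For this dump the major version is `0x3d` (61) and the constant pool count is `0x1f` (31), which means the pool holds 30 entries.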

If we look at the [specification](https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-4.html#jvms-4.1), a class file has the following format:
```
ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
```
Here `u1`, `u2` and `u4` denote unsigned one-, two- and four-byte quantities, and multi-byte values are stored in big-endian order. Let's begin by defining these types in code, which makes it easier to map the format
to code.
```go
import (
	"encoding/binary"
	"io"
)

type u1 = uint8
type u2 = uint16
type u4 = uint32

// ClassLoader reads big-endian values from a class file stream.
type ClassLoader struct {
	reader io.Reader
}

// readBytes reads exactly n bytes. It panics on a short read so that a
// truncated class file fails loudly instead of yielding zero values.
func (classLoader *ClassLoader) readBytes(n int) []byte {
	byteArray := make([]byte, n)
	if _, err := io.ReadFull(classLoader.reader, byteArray); err != nil {
		panic(err)
	}
	return byteArray
}

func (classLoader *ClassLoader) loadU1() u1 { return classLoader.readBytes(1)[0] }
func (classLoader *ClassLoader) loadU2() u2 {
	return binary.BigEndian.Uint16(classLoader.readBytes(2))
}
func (classLoader *ClassLoader) loadU4() u4 {
	return binary.BigEndian.Uint32(classLoader.readBytes(4))
}
```

Now let's define a basic `Class` struct and initialize its values using our loader.
```go
type Class struct {
	magic        u4
	minorVersion u2
	majorVersion u2
}

// Requires the "fmt" and "os" imports.
func main() {
	file, err := os.Open("Main.class")
	if err != nil {
		panic(err)
	}
	defer file.Close()

	loader := ClassLoader{reader: file}
	// Go evaluates these calls in source order, which matches the file layout.
	class := Class{
		magic:        loader.loadU4(),
		minorVersion: loader.loadU2(),
		majorVersion: loader.loadU2(),
	}
	fmt.Printf("%+v\n", class)
}
```
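
Once the header is loaded, it can be worth validating it before parsing anything else. The following is a minimal sketch of such a check; `checkHeader`, `javaRelease` and the `maxMajor` parameter are hypothetical names introduced here for illustration. It relies on the fact that, for Java 1.2 and later, the class file major version is the Java release number plus 44.

```go
package main

import "fmt"

// classMagic is the magic number every class file starts with.
const classMagic = 0xCAFEBABE

// javaRelease maps a class file major version to the Java release that
// introduced it. For Java 1.2 and later the release is major - 44
// (for example 52 -> Java 8, 61 -> Java 17).
func javaRelease(major uint16) int {
	return int(major) - 44
}

// checkHeader is a hypothetical helper: it rejects input that does not
// start with the magic number or that is newer than maxMajor.
func checkHeader(magic uint32, major, maxMajor uint16) error {
	if magic != classMagic {
		return fmt.Errorf("not a class file: magic is %#x", magic)
	}
	if major > maxMajor {
		return fmt.Errorf("class file version %d (Java %d) is newer than supported version %d",
			major, javaRelease(major), maxMajor)
	}
	return nil
}

func main() {
	// Magic and major version as they appear in the hexdump of Main.class.
	if err := checkHeader(0xCAFEBABE, 61, 61); err != nil {
		panic(err)
	}
	fmt.Println("Java release:", javaRelease(61))
}
```

Failing early here gives a clearer error than a confusing parse failure deep inside the constant pool.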

## Constant Pool
The next item in the class file is the constant pool. It contains different constants such as string constants, class names, field names, etc.
Each constant follows this structure.
```
cp_info {
    u1 tag;
    u1 info[];
}
```
The constant type is determined by its `tag`. The constants relevant for our `JVM` are:
* CONSTANT_Utf8
* CONSTANT_Integer
* CONSTANT_Class
* CONSTANT_String
* CONSTANT_Fieldref
* CONSTANT_Methodref
* CONSTANT_NameAndType

We can now parse the constants using the following code:
```go
type ConstantPoolTags u1

// Tag values are fixed by the class file format (JVMS §4.4).
// Note that CONSTANT_Integer is 3, not 2.
const (
	CONSTANT_Utf8        ConstantPoolTags = 1
	CONSTANT_Integer     ConstantPoolTags = 3
	CONSTANT_Class       ConstantPoolTags = 7
	CONSTANT_String      ConstantPoolTags = 8
	CONSTANT_Fieldref    ConstantPoolTags = 9
	CONSTANT_Methodref   ConstantPoolTags = 10
	CONSTANT_NameAndType ConstantPoolTags = 12
)

type ConstantPool []Constant

type Constant struct {
	tag  ConstantPoolTags
	info ConstantType
}

type ConstantType struct {
	String      string
	StringIndex u2
	ClassIndex  u2
	Integer     int
	MethodRef   MethodRef
	FieldRef    FieldRef
	NameAndType NameAndType
}

func (classLoader *ClassLoader) loadConstantPool() (constantPool ConstantPool) {
	constantPoolCount := classLoader.loadU2()

	// The constant_pool table is indexed from 1 to constant_pool_count - 1.
	for i := u2(1); i < constantPoolCount; i++ {
		constant := Constant{tag: ConstantPoolTags(classLoader.loadU1())}

		switch constant.tag {
		case CONSTANT_Integer:
			// The four bytes hold a signed 32-bit value.
			constant.info = ConstantType{Integer: int(int32(classLoader.loadU4()))}
		case CONSTANT_Utf8:
			utfLength := classLoader.loadU2()
			constant.info = ConstantType{String: string(classLoader.readBytes(int(utfLength)))}
		case CONSTANT_String:
			constant.info = ConstantType{StringIndex: classLoader.loadU2()}
		case CONSTANT_Class:
			constant.info = ConstantType{ClassIndex: classLoader.loadU2()}
		case CONSTANT_Methodref:
			methodRef := MethodRef{
				ClassIndex:       classLoader.loadU2(),
				NameAndTypeIndex: classLoader.loadU2(),
			}
			constant.info = ConstantType{MethodRef: methodRef}
		case CONSTANT_Fieldref:
			fieldRef := FieldRef{
				ClassIndex:       classLoader.loadU2(),
				NameAndTypeIndex: classLoader.loadU2(),
			}
			constant.info = ConstantType{FieldRef: fieldRef}
		case CONSTANT_NameAndType:
			nameAndType := NameAndType{
				nameIndex:  classLoader.loadU2(),
				descriptor: classLoader.loadU2(),
			}
			constant.info = ConstantType{NameAndType: nameAndType}
		default:
			// Other tags have different sizes. Skipping them silently would
			// desynchronize the reader, so fail loudly instead.
			panic(fmt.Sprintf("unsupported constant pool tag %d", constant.tag))
		}
		constantPool = append(constantPool, constant)
	}
	return constantPool
}

type MethodRef struct {
	ClassIndex       u2
	NameAndTypeIndex u2
}
type FieldRef = MethodRef
type NameAndType struct {
	nameIndex  u2
	descriptor u2
}
```
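
To see the tag-driven layout in action, the sketch below decodes the first four constant pool entries straight from the hexdump (the bytes at offsets `0x0a` to `0x29`). It is a standalone illustration rather than part of the loader: `describeEntries` and `sample` are names made up for this example, and only the tags that occur in these four entries are handled.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// sample holds constant pool entries #1 to #4, copied from the hexdump.
var sample = append([]byte{
	0x0a, 0x00, 0x02, 0x00, 0x03, // #1 Methodref: class_index, name_and_type_index
	0x07, 0x00, 0x04, // #2 Class: name_index
	0x0c, 0x00, 0x05, 0x00, 0x06, // #3 NameAndType: name_index, descriptor_index
	0x01, 0x00, 0x10, // #4 Utf8: length 16, followed by the bytes
}, "java/lang/Object"...)

// describeEntries decodes n entries from data and returns one readable
// line per entry. Illustrative helper; it knows only tags 1, 7, 10 and 12.
func describeEntries(data []byte, n int) ([]string, error) {
	r := bytes.NewReader(data)
	var readErr error // first read error, checked once per entry
	readU2 := func() uint16 {
		var v uint16
		if readErr == nil {
			readErr = binary.Read(r, binary.BigEndian, &v)
		}
		return v
	}
	var lines []string
	for i := 1; i <= n; i++ {
		tag, err := r.ReadByte()
		if err != nil {
			return nil, fmt.Errorf("entry #%d: %w", i, err)
		}
		var line string
		switch tag {
		case 1: // CONSTANT_Utf8: u2 length, then that many bytes
			buf := make([]byte, readU2())
			if readErr == nil {
				_, readErr = io.ReadFull(r, buf)
			}
			line = fmt.Sprintf("#%d Utf8 %q", i, buf)
		case 7: // CONSTANT_Class: u2 name_index
			line = fmt.Sprintf("#%d Class name=#%d", i, readU2())
		case 10: // CONSTANT_Methodref: u2 class_index, u2 name_and_type_index
			classIndex, natIndex := readU2(), readU2()
			line = fmt.Sprintf("#%d Methodref class=#%d name_and_type=#%d", i, classIndex, natIndex)
		case 12: // CONSTANT_NameAndType: u2 name_index, u2 descriptor_index
			nameIndex, descIndex := readU2(), readU2()
			line = fmt.Sprintf("#%d NameAndType name=#%d descriptor=#%d", i, nameIndex, descIndex)
		default:
			return nil, fmt.Errorf("entry #%d: unsupported tag %d", i, tag)
		}
		if readErr != nil {
			return nil, fmt.Errorf("entry #%d: %w", i, readErr)
		}
		lines = append(lines, line)
	}
	return lines, nil
}

func main() {
	lines, err := describeEntries(sample, 4)
	if err != nil {
		panic(err)
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

Entry #1 is the `Methodref` for the `java/lang/Object` constructor: it points at the `Class` entry #2, which in turn points at the `Utf8` entry #4. A full loader would also need to handle `CONSTANT_Long` and `CONSTANT_Double`, which each occupy two constant pool slots.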

Now define the following methods so we can easily resolve a class name and a `CONSTANT_Utf8` string from the `constantPool`.

```go
func (constantPool ConstantPool) resolveUtf(index u2) (*string, error) {
	// Constant pool indices are 1-based, while the slice is 0-based.
	if index == 0 || int(index) > len(constantPool) {
		return nil, fmt.Errorf("constant pool index %d is out of range", index)
	}
	if constantPool[index-1].tag == CONSTANT_Utf8 {
		return &constantPool[index-1].info.String, nil
	}
	return nil, fmt.Errorf("constant at index %d is not a valid string", index)
}

func (constantPool ConstantPool) resolveClassName(index u2) (*string, error) {
	if index == 0 || int(index) > len(constantPool) {
		return nil, fmt.Errorf("constant pool index %d is out of range", index)
	}
	if constantPool[index-1].tag == CONSTANT_Class {
		// ClassIndex is the index of the Utf8 entry that holds the class name.
		return constantPool.resolveUtf(constantPool[index-1].info.ClassIndex)
	}
	return nil, fmt.Errorf("constant at index %d is not a valid class", index)
}
```
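
As a usage sketch, here is the same two-step lookup on a tiny hand-built pool that mirrors entries #1 to #4 of the hexdump. To keep the example self-contained it uses pared-down stand-ins (`entry`, `pool`, `samplePool`) instead of the post's `Constant` and `ConstantPool` types, and it returns plain strings rather than pointers; those choices are assumptions made for brevity.

```go
package main

import "fmt"

// entry is a pared-down stand-in for the Constant type.
type entry struct {
	tag       uint8
	text      string // set for CONSTANT_Utf8 (tag 1)
	nameIndex uint16 // set for CONSTANT_Class (tag 7)
}

type pool []entry

// at returns the entry for a 1-based constant pool index.
func (p pool) at(index uint16) (entry, error) {
	if index == 0 || int(index) > len(p) {
		return entry{}, fmt.Errorf("constant pool index %d is out of range", index)
	}
	return p[index-1], nil
}

// resolveUtf returns the string stored in a CONSTANT_Utf8 entry.
func (p pool) resolveUtf(index uint16) (string, error) {
	e, err := p.at(index)
	if err != nil {
		return "", err
	}
	if e.tag != 1 {
		return "", fmt.Errorf("constant at index %d is not a valid string", index)
	}
	return e.text, nil
}

// resolveClassName follows a CONSTANT_Class entry to its Utf8 name.
func (p pool) resolveClassName(index uint16) (string, error) {
	e, err := p.at(index)
	if err != nil {
		return "", err
	}
	if e.tag != 7 {
		return "", fmt.Errorf("constant at index %d is not a valid class", index)
	}
	return p.resolveUtf(e.nameIndex)
}

// samplePool mirrors entries #1 to #4 of Main.class. Entries #1 and #3
// are placeholders because this example never dereferences them.
var samplePool = pool{
	{tag: 10},                          // #1 Methodref
	{tag: 7, nameIndex: 4},             // #2 Class -> #4
	{tag: 12},                          // #3 NameAndType
	{tag: 1, text: "java/lang/Object"}, // #4 Utf8
}

func main() {
	name, err := samplePool.resolveClassName(2)
	if err != nil {
		panic(err)
	}
	fmt.Println(name)
}
```

Resolving index 2 follows the `Class` entry to the `Utf8` entry at index 4 and yields `java/lang/Object`. Asking for index 0, or for an index past the end of the pool, returns an error instead of panicking.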

## Code Attribute
If we look at the methods in the class, we will find an attribute named `Code`. According to the [spec](https://docs.oracle.com/javase/specs/jvms/se17/html/jvms-4.html#jvms-4.7.3),
it has the following structure.
```
Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
```
For the purposes of this post, we will focus only on `maxStack`, `maxLocals` and `code`.
```go
type CodeAttribute struct {
	name      string
	maxStack  u2
	maxLocals u2
	code      []byte
}

// toCodeAttribute decodes the attribute's info bytes, which start right
// after attribute_length: max_stack, max_locals, code_length, then the code.
func (attribute *Attribute) toCodeAttribute() *CodeAttribute {
	codeAttribute := CodeAttribute{
		name:      attribute.name,
		maxStack:  binary.BigEndian.Uint16(attribute.info[0:2]),
		maxLocals: binary.BigEndian.Uint16(attribute.info[2:4]),
	}
	codeLength := binary.BigEndian.Uint32(attribute.info[4:8])
	codeAttribute.code = attribute.info[8 : 8+codeLength]
	return &codeAttribute
}
```
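
The sketch below runs the same decoding on a hand-assembled `Code` attribute body. The bytes are not copied from the hexdump: they are what one would expect `javac` to emit for the static `add(int a, int b)` method, so treat them as an assumption. `codeAttr`, `parseCode`, `sampleInfo` and `mnemonics` are illustrative names, and this version adds bounds checks so malformed input returns an error instead of panicking on a bad slice.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// codeAttr is a stand-in for the CodeAttribute type.
type codeAttr struct {
	maxStack  uint16
	maxLocals uint16
	code      []byte
}

// parseCode decodes the bytes that follow attribute_length in a Code attribute.
func parseCode(info []byte) (*codeAttr, error) {
	if len(info) < 8 {
		return nil, errors.New("code attribute is shorter than its fixed header")
	}
	codeLength := binary.BigEndian.Uint32(info[4:8])
	if uint64(codeLength) > uint64(len(info)-8) {
		return nil, fmt.Errorf("code_length %d exceeds the %d bytes available", codeLength, len(info)-8)
	}
	return &codeAttr{
		maxStack:  binary.BigEndian.Uint16(info[0:2]),
		maxLocals: binary.BigEndian.Uint16(info[2:4]),
		code:      info[8 : 8+int(codeLength)],
	}, nil
}

// sampleInfo is hand-assembled (an assumption, not taken from the dump).
var sampleInfo = []byte{
	0x00, 0x02, // max_stack = 2
	0x00, 0x02, // max_locals = 2
	0x00, 0x00, 0x00, 0x04, // code_length = 4
	0x1a, 0x1b, 0x60, 0xac, // iload_0, iload_1, iadd, ireturn
	0x00, 0x00, // exception_table_length = 0
	0x00, 0x00, // attributes_count = 0
}

// mnemonics covers only the four opcodes used in sampleInfo.
var mnemonics = map[byte]string{
	0x1a: "iload_0",
	0x1b: "iload_1",
	0x60: "iadd",
	0xac: "ireturn",
}

func main() {
	attr, err := parseCode(sampleInfo)
	if err != nil {
		panic(err)
	}
	fmt.Printf("max_stack=%d max_locals=%d\n", attr.maxStack, attr.maxLocals)
	// None of these four opcodes take operands, so each byte is one instruction.
	for pc, op := range attr.code {
		fmt.Printf("%d: %s\n", pc, mnemonics[op])
	}
}
```

The four instructions load the two `int` arguments from local variables 0 and 1, add them, and return the result, which is why a stack depth of 2 and two local slots are enough.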