# Java Under the Hood — Part 2: Memory Management

[In the first part](https://dev.ilenlab.com/what-s-java-under-the-hood-part-1-2), we discussed theoretically the process that occurs from the compilation of a *.java* file until the loading of the *.class* file into the Java runtime.

We also discussed briefly how the interpreter — one of *JVM*'s aspects — uses an execution stack to process the *bytecode* instructions in a "last in, first out" manner.

Now we are going to go further into the interpreter in order to understand what *bytecode* is and how the *Java Virtual Machine* executes the instructions located in a *.class* file.

### The .class file

It's important to highlight that a lot of transformations occur between the initial human-readable code written in a *.java* file and its execution. The first of these transformations is compilation, the process of transforming source code into a *.class* file.

Each *.class* file groups a set of instructions known as *bytecode* and represents a single class or interface.

Be aware that not every class or interface has an external representation. However, we are going to refer to any valid representation of a class or interface as a *.class* file.

A *.class* file is a stream of *8-bit* bytes. Given that, it's possible to consume multiples of *8-bit* bytes in order to represent *16-bit*, *32-bit*, and *64-bit* quantities.
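
As a minimal sketch of how wider quantities are assembled from 8-bit bytes, the helpers below combine bytes in big-endian order (most significant byte first), which is the order the *.class* format uses for multibyte values. The names `ByteComposition`, `u2`, and `u4` are illustrative assumptions and not part of any JDK API.

```java
// Sketch: building 16-bit and 32-bit quantities from individual 8-bit bytes.
// ByteComposition, u2 and u4 are hypothetical names chosen for this example.
final class ByteComposition {

    // Combine two bytes (high byte first) into an unsigned 16-bit value.
    static int u2(byte high, byte low) {
        // "& 0xFF" removes the sign extension Java applies to bytes.
        return ((high & 0xFF) << 8) | (low & 0xFF);
    }

    // Combine four bytes (most significant first) into an unsigned 32-bit value,
    // held in a long so that the result is never negative.
    static long u4(byte b0, byte b1, byte b2, byte b3) {
        return ((long) (b0 & 0xFF) << 24)
             | ((long) (b1 & 0xFF) << 16)
             | ((long) (b2 & 0xFF) << 8)
             |  (long) (b3 & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(u2((byte) 0x01, (byte) 0x02));                           // 0x0102
        System.out.println(u2((byte) 0xFF, (byte) 0xFF));                           // 0xFFFF
        System.out.println(u4((byte) 0x01, (byte) 0x02, (byte) 0x03, (byte) 0x04)); // 0x01020304
    }
}
```

A 64-bit quantity follows the same pattern with eight bytes and shifts of up to 56 bits.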

### The bytecode

The *bytecode* is platform agnostic and you can trust that any valid *.class* file is executable in any OS or platform that provides a *Java Virtual Machine* implementation that complies with the *VMSpec*.

It's also important to highlight that the *bytecode* is independent of the Java language itself. Many other languages are capable of compiling their own source code into *bytecode*, and the generated *.class* file is executable on any *JVM*.

Let's write a little piece of code to help us understand what occurs from the initial Java code until the end of the `java` process.

First, create a `HelloWorld.java` containing the following code.

```java
class HelloWorld {
  public static void main(String[] args) {
      System.out.println("Hello World!");
  }
}
```

To compile this code, execute `javac HelloWorld.java` on the command line. This will generate the *HelloWorld.class* file.

The *.class* file is a binary file, so you won't be able to read it properly by simply opening this file with a text editor.

To analyze its structure let's dump the content in a hexadecimal format by running `hexdump -C HelloWorld.class > HelloWorld.hexdump`. This will output the *.class* content into a new *HelloWorld.hexdump* file.

Running `cat HelloWorld.hexdump` will output something similar to:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1708411844953/1d1bb693-af8d-4b17-a9df-e6c51ebbaf02.webp align="center")

A good and pretty straightforward exercise is to write a "*Hello World!*" program [in Kotlin and/or Scala](https://docs.scala-lang.org/scala3/book/taste-hello-world.html) and follow the same steps as above: compile with `kotlinc` or `scalac`, and then use `hexdump` to generate a hexadecimal representation of your *.class* file.

You will see that each output file is slightly different, but all of them follow the same structure, and the same runtime concepts apply to all.

A *.class* file follows the [*Unix binary file definition*](http://www.linfo.org/binary_file.html). Looking at the first line, we have:

> *00000000 ca fe ba be 00 00 00 34 00 1d 0a 00 06 00 0f 09 |…….4……..|*

Remember that a *.class* file "*is a stream of 8-bit bytes*"? So, from left to right, grouping bytes we have:

* **00000000** as the offset, counted from the start of the file, of the 16 bytes shown on this line
    
* **ca fe ba be** as the magic number that indicates this file is a Java binary
    
* **00 00** as the minor version of the *.class* file format
    
* **00 34** as the major version of the *.class* file format (0x34 is 52 in decimal, the version used by Java SE 8)
    
* **00 1d** as the constant pool count
    
* **0a 00 06 00 0f 09** as the initial information of the constant pool
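
The grouping above can be reproduced with a few lines of Java. This is only a rough sketch: it hardcodes the 16 bytes from the hexdump line instead of opening the file, and the names `ClassHeader` and `readHeader` are made up for illustration.

```java
import java.nio.ByteBuffer;

// Sketch: reading the fixed-size header fields from the first bytes of a .class file.
// ClassHeader and readHeader are hypothetical names, not a JDK API.
final class ClassHeader {

    // Returns { magic, minor_version, major_version, constant_pool_count }.
    static long[] readHeader(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes);    // big-endian by default
        long magic = in.getInt() & 0xFFFFFFFFL;         // 4 bytes
        int minor = in.getShort() & 0xFFFF;             // 2 bytes
        int major = in.getShort() & 0xFFFF;             // 2 bytes
        int constantPoolCount = in.getShort() & 0xFFFF; // 2 bytes
        return new long[] { magic, minor, major, constantPoolCount };
    }

    public static void main(String[] args) {
        // The 16 bytes shown on the first hexdump line.
        byte[] firstLine = {
            (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
            0x00, 0x00, 0x00, 0x34, 0x00, 0x1d,
            0x0a, 0x00, 0x06, 0x00, 0x0f, 0x09
        };
        long[] h = readHeader(firstLine);
        System.out.printf("magic=%x minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
        // prints: magic=cafebabe minor=0 major=52 constant_pool_count=29
    }
}
```

To try it on a real file, the byte array could instead come from `java.nio.file.Files.readAllBytes(java.nio.file.Paths.get("HelloWorld.class"))`.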
    

We are not going to evaluate every bit of this file. However, it is important to know that a valid *.class* file has this structure:

* *32-bit* magic number
    
* *16-bit* minor\_version
    
* *16-bit* major\_version
    
* *16-bit* constant\_pool\_count
    
* *N-bit* constant\_pool\[constant\_pool\_count-1\]
    
* *16-bit* access\_flags
    
* *16-bit* this\_class
    
* *16-bit* super\_class
    
* *16-bit* interfaces\_count
    
* *N-bit* interfaces\[interfaces\_count\] (one *16-bit* entry per interface)
    
* *16-bit* fields\_count
    
* *N-bit* fields\[fields\_count\]
    
* *16-bit* methods\_count
    
* *N-bit* methods\[methods\_count\]
    
* *16-bit* attributes\_count
    
* *N-bit* attributes\[attributes\_count\]
    

The *N-bit* represents a dynamic size that varies according to the class structure and members.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1708412193946/19bde36a-6ccd-4257-ae2b-f544196a10d6.webp align="center")

Running `javap -c HelloWorld > HelloWorld.bytecode`, we can output the *bytecode* in a more human-readable format:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1708412238594/dd0a20b3-15bc-472e-9e68-52ee67a5e31e.webp align="center")

Each instruction in this representation is identified by an *opcode*: a predefined operation code that determines the type, the operation, and the interactions between local variables, the constant pool, and the stack.

All instructions are predefined and specified by the [*Java Virtual Machine Specification*](https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf) (chapter 6).
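
To make the idea of an *opcode* more concrete, here is a toy decoder that only understands four instructions. The opcode values (`0xb2`, `0x12`, `0xb6`, `0xb1`) come from the specification, but everything else is an assumption for illustration: the class name `MiniDisassembler` is made up, and the byte sequence in `main` is what `javac` typically emits for the body of `HelloWorld.main`, so the constant pool indexes may differ on your machine.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: decoding a handful of opcodes by hand.
// MiniDisassembler is a hypothetical name; it is nowhere near a full disassembler.
final class MiniDisassembler {

    static List<String> decode(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0; // position of the current instruction within the method's code
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0xb2: // getstatic, followed by a 16-bit constant pool index
                    out.add(pc + ": getstatic #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                case 0x12: // ldc, followed by an 8-bit constant pool index
                    out.add(pc + ": ldc #" + (code[pc + 1] & 0xFF));
                    pc += 2;
                    break;
                case 0xb6: // invokevirtual, followed by a 16-bit constant pool index
                    out.add(pc + ": invokevirtual #" + u2(code, pc + 1));
                    pc += 3;
                    break;
                case 0xb1: // return, no operands
                    out.add(pc + ": return");
                    pc += 1;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not handled by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    // Read a big-endian unsigned 16-bit operand starting at the given position.
    private static int u2(byte[] code, int at) {
        return ((code[at] & 0xFF) << 8) | (code[at + 1] & 0xFF);
    }

    public static void main(String[] args) {
        // Assumed example bytes; compare them with your own javap output.
        byte[] code = { (byte) 0xb2, 0x00, 0x02, 0x12, 0x03, (byte) 0xb6, 0x00, 0x04, (byte) 0xb1 };
        decode(code).forEach(System.out::println);
    }
}
```

Each printed line starts with the offset of the instruction inside the method, which is the same layout `javap -c` uses.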

### The Execution

The execution of a class or interface consists of three processes: *loading*, *linking* and *initializing*.

*Loading* is the process of finding the proper binary representation with a particular name and creating a class or interface from that representation.

*Linking* is the process of taking a class or interface and combining it into the runtime state of the Java Virtual Machine so that it can be executed.

*Initializing* is the process of executing the `<clinit>` method of a given class or interface.

To be executed, a class or interface must either have one of its methods invoked as a result of another class's execution or be the initial class of a `java` process. In either case, it is initialized first, and only then is its code executed.

Every constructor written in Java is compiled into an ***instance initialization method***, which has the special name `<init>`. In addition, every class or interface contains at most one initialization method that takes no arguments and returns `void`. This ***class or interface initialization method*** is called `<clinit>`.

The `<clinit>` method is named by the compiler and cannot be referenced in Java code. Only the JVM, during a class or interface initialization, is capable of invoking the `<clinit>` of a given class or interface.
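
The difference between the two kinds of initialization method can be observed from plain Java. In the sketch below (the names `InitOrder` and `Widget` are made up for illustration), the static initializer block is what the compiler places in `<clinit>`, and the constructor is what becomes `<init>`.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: class initialization runs once, instance initialization runs per object.
// InitOrder and Widget are hypothetical names.
final class InitOrder {

    // Events are recorded here in the order they happen.
    static final List<String> EVENTS = new ArrayList<>();

    static final class Widget {
        // Static initializers are compiled into the class initialization method.
        static {
            EVENTS.add("static initializer ran");
        }

        // Each constructor is compiled into an instance initialization method.
        Widget() {
            EVENTS.add("constructor ran");
        }
    }

    // Creates two instances and returns the recorded events.
    static List<String> run() {
        new Widget();
        new Widget();
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call to `run()`, the static initializer is recorded exactly once, before the first constructor, while the constructor is recorded once per instance.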

A ***Java Virtual Machine*** process starts by initializing the *Bootstrap* classloader and consequently the initial class is loaded, linked, and initialized by invoking its `<clinit>` method.

The *VMSpec* leaves the way the initial class is specified up to each implementation. One common way is a command-line argument for the `java` binary.

Finally, the method `public static void main(String[] args)` is invoked. Every other instruction is executed as a result of the `main()` method invocation. This invocation may cause the linking and initialization of other classes and/or interfaces, as well as the invocation of additional methods.
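
A small experiment shows that other classes are initialized only when execution first needs them. This is a sketch with made-up names (`LazyInit`, `Greeter`, `trace`), not a description of any particular JVM's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a class referenced from main() is initialized on first use, not at startup.
// LazyInit, Greeter and trace are hypothetical names.
final class LazyInit {

    static final List<String> LOG = new ArrayList<>();

    static final class Greeter {
        static {
            LOG.add("Greeter initialized");
        }

        static String greet() {
            return "Hello World!";
        }
    }

    static List<String> trace() {
        LOG.add("before first use");
        String message = Greeter.greet(); // the first use triggers Greeter's initialization
        LOG.add("after first use: " + message);
        return LOG;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The "Greeter initialized" entry appears between the other two entries, even though `Greeter` is declared before `trace()` in the source file.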

The execution process of our `HelloWorld.java` code is described in the image below:

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1708412619860/153b98ee-7233-4a32-af34-78af4bc3588f.webp align="center")

### Runtime Areas

Each instruction touches at least one of the six runtime data areas: `heap space`, `method area`, `Java stack`, `native method stack`, `program counter register`, or `constant pool`.

The *JVM* defines multiple runtime data areas that are used during execution. Some areas are created when the *JVM* starts and are destroyed only when it exits while others are created when a thread is created and destroyed when the respective thread ends.

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1708412769819/faf0e0b3-b014-43ff-860b-19a46b989d2e.webp align="center")

All areas are equally important, and each one of them plays a fundamental role in the *JVM*'s execution flow. However, two of them deserve our attention because they will help us understand how the multithreading aspect of the *JVM* works.

These two areas are: the *Java Virtual Machine Stack* (or *stack*) and the *Java Virtual Machine Heap* (or *heap*).

### The Stack

One of the most important aspects of the *JVM* is its support for **multithreaded** execution.

Each thread has its own private ***Java Virtual Machine Stack***, which is created at the same time as the thread. It consists of ***frames***, which are used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions. The data held in a frame includes local variables, **object references**, method parameters, and other method-specific data needed during the execution of a method.

Local variables and method invocations use the thread's stack, while other areas, such as the method area, store static variables and class-level data. The size of a stack is either fixed or allowed to grow and shrink within limits, as described below. The JVM allocates and frees these memory areas automatically, so you never manage raw memory yourself.

```java
void myMethod() {
  int x; // local variable: stored in this method's frame on the thread's stack
}
```

The ***Java Virtual Machine Stack*** is never directly manipulated and only operates in terms of push and pop of ***frames***.
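
Although frames cannot be manipulated directly, their push and pop behavior can be observed indirectly. The sketch below uses `Thread.currentThread().getStackTrace()` to count the frames the JVM reports for the current thread. The names `FrameDepth`, `currentDepth`, and `depthAfter` are made up for illustration.

```java
// Sketch: every method invocation pushes a frame, and returning pops it.
// FrameDepth, currentDepth and depthAfter are hypothetical names.
final class FrameDepth {

    // Number of frames the JVM reports for the calling thread right now.
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Recurses `levels` times, then reports the stack depth at the deepest point.
    static int depthAfter(int levels) {
        if (levels == 0) {
            return currentDepth();
        }
        return depthAfter(levels - 1); // pushes one more frame
    }

    public static void main(String[] args) {
        int shallow = depthAfter(0);
        int deep = depthAfter(5);
        System.out.println("extra frames after 5 nested calls: " + (deep - shallow));
    }
}
```

The difference should be one frame per nested call, although a JVM is allowed to omit frames from a stack trace, so treat the number as an observation rather than a guarantee.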

The *Java SE 8* specification allows ***Java Virtual Machine Stacks*** to be either of a fixed size or dynamically expanded and contracted according to their use. An implementation may let the user choose the size of fixed stacks or, for dynamic stacks, the minimum and maximum sizes.

Two important exceptions come from this:

* If the execution of a thread requires a larger stack than is permitted, a `StackOverflowError` is thrown
    
* If expansion is allowed but there is not enough memory to support it, an `OutOfMemoryError` is thrown
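
The first of these errors is easy to provoke with unbounded recursion. The following is a minimal sketch with made-up names (`StackLimit`, `measure`); the number it prints depends on the JVM, the platform, and the configured stack size, so no particular value should be expected.

```java
// Sketch: exhausting a thread's stack with unbounded recursion.
// StackLimit and measure are hypothetical names.
final class StackLimit {

    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // each call pushes another frame until the stack is full
    }

    // Returns how many nested calls were made before StackOverflowError was thrown.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames have been popped, so it is safe to continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("nested calls before overflow: " + measure());
    }
}
```

On HotSpot-based JVMs the stack size can be changed with the `-Xss` option of the `java` launcher (for example `java -Xss256k StackLimit`), which should change the reported number.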
    

### The Heap

The ***Java Virtual Machine Heap*** is a runtime data area shared by all threads.

The *heap* is used to allocate memory for all **class instances** and **arrays**. This area is created when the *JVM* process starts and is destroyed only when it ends.

Objects are never explicitly deallocated from a given memory space and this space is reclaimed by an automatic storage management system known as the ***garbage collector***.

```java
int[] arr = new int[5]; //object allocated in Heap
```
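
Because the heap is shared, an object allocated by one thread is the very same object for every other thread that holds a reference to it. The sketch below, with the made-up names `SharedHeap` and `fillFromTwoThreads`, lets two threads write into one array and reads the result from a third.

```java
// Sketch: an array lives on the heap, so all threads holding the reference see the same object.
// SharedHeap and fillFromTwoThreads are hypothetical names.
final class SharedHeap {

    static int[] fillFromTwoThreads() {
        int[] shared = new int[2]; // the array object is allocated on the heap

        // Each lambda captures the reference; the array itself is not copied.
        Thread first = new Thread(() -> shared[0] = 1);
        Thread second = new Thread(() -> shared[1] = 2);
        first.start();
        second.start();
        try {
            // join() waits for each thread and makes its writes visible to the caller.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared;
    }

    public static void main(String[] args) {
        int[] result = fillFromTwoThreads();
        System.out.println(result[0] + ", " + result[1]); // prints: 1, 2
    }
}
```

Only the reference `shared` is a local variable. The array it points to is on the heap, which is why both threads can reach it.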

As with the *stack* area, on startup the user can define whether the memory allocated for the *heap* has a fixed size or is dynamically resized.

Only an `OutOfMemoryError` can occur in the *heap* area, since this area is not tied to a specific thread.
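
The current heap limits can be inspected from inside a running program through the `Runtime` class. The sketch below uses the made-up names `HeapInfo` and `snapshot`; the values it prints depend entirely on the machine and on how the JVM was started.

```java
// Sketch: asking the running JVM about its heap limits.
// HeapInfo and snapshot are hypothetical names.
final class HeapInfo {

    // Returns { maxMemory, totalMemory, freeMemory } in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mib = 1024 * 1024;
        System.out.println("maximum the JVM will attempt to use: " + s[0] / mib + " MiB");
        System.out.println("currently reserved by the JVM:       " + s[1] / mib + " MiB");
        System.out.println("free within the reserved memory:     " + s[2] / mib + " MiB");
    }
}
```

On HotSpot-based JVMs, the initial and maximum heap sizes can be set with the `-Xms` and `-Xmx` launcher options. Giving both the same value (for example `java -Xms256m -Xmx256m HeapInfo`) is one way to ask for a fixed-size heap.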

It's important to highlight that both memory spaces, for *stack* and *heap* allocation, don't necessarily need to be contiguous.

### The Exit

The ***Java Virtual Machine*** process exits when any thread invokes one of the methods below:

* the `exit()` or `halt()` methods of the `Runtime` class
    
* the `exit()` method of the `System` class
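
As a minimal sketch of an explicit exit, the program below registers a shutdown hook and then calls `System.exit(0)`. The shutdown hook is not discussed in this post and is only included to make the exit observable. The names `ExitDemo` and `registerHook` are made up for illustration.

```java
// Sketch: ending the JVM explicitly with System.exit and observing the shutdown.
// ExitDemo and registerHook are hypothetical names.
final class ExitDemo {

    // Registers a thread that the JVM runs while it shuts down, and returns it.
    static Thread registerHook() {
        Thread hook = new Thread(() -> System.out.println("shutdown hook running"));
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerHook();
        System.out.println("calling System.exit(0)");
        // System.exit(status) is a shorthand for Runtime.getRuntime().exit(status).
        System.exit(0);
        // Not reached: exit() does not return normally.
    }
}
```

Replacing `System.exit(0)` with `Runtime.getRuntime().halt(0)` ends the process immediately without running the hook, which is the main practical difference between `exit()` and `halt()`.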
    

The [Java security manager](https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf) must allow the `exit` or `halt` operation.

**To be continued…**

From a static perspective, we distilled what *bytecode* is and how it is executed by the *JVM* using the runtime areas in a stacked manner (last in, first out) until the end of the `java` process.

In the next post, we are going to investigate the memory management system: the *garbage collector* and the *JVM* multithreaded characteristics.

**References**

This post is a summary of my understanding of the contents of the following references. I strongly recommend reading them if you want to go deeper into this subject. They are:

* [***The Java® Virtual Machine Specification (Java SE 8 Edition)***](https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf) by Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley
    
* [***Optimizing Java***](https://www.amazon.com/Optimizing-Java-Techniques-Application-Performance/dp/1492025798) by Ben Evans, Chris Newland, and James Gough
    

**"Under the Hood of the JVM" Series**

* [Java Under the Hood](https://dev.ilenlab.com/series/java-under-the-hood)
    

```yaml
source:
[1] https://medium.com/@caique.me/8b10fae2a468
[2] https://naveen-metta.medium.com/8f5b98747486
```
