Monday, September 1, 2014

Go Compiler nil Pointer Checks

Introduction
I was thinking about how the compiler looks to protect the code we write when it can. Invalid memory access checks are one type of safety check the compiler adds to our code. We might think that this "extra code" is hurting our performance and maybe over billions of iterative operations it is. However, these checks can prevent our code from causing damage to the systems we are running on. The compiler is essentially reporting and finding bugs, making the code we write safer to run.

Even with this in mind, Go wants to be fast and if the hardware can catch a problem, it will let it. One of these cases is with the detection of invalid memory access. There are times when the compiler will add a nil pointer check into our code and times when it doesn't. In this post we will explore one case where the compiler leaves it up to the hardware to detect invalid memory access and one case where the compiler adds a nil pointer check.

Hardware Only Checks
When the compiler can depend on the hardware to check for and report invalid memory access, the compiler can produce less code which helps with performance. If our code attempts to read or write to address 0x0, the hardware will throw an exception that will be caught by the Go runtime and reported back up to our program in the form of a panic. If the panic is not recovered, a stack trace is produced.

Here is an example that attempts to write to memory address 0x0 and the corresponding panic/stack trace:

01 package main
02
03 func main() {
04    var p *int  // Declare a nil value pointer
05    *p = 10     // Write the value 10 to address 0x0
06 }

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x2007]

goroutine 16 [running]:
runtime.panic(0x28600, 0x51744)
    /go/src/pkg/runtime/panic.c:279 +0xf5
main.main()
    /Go/Projects/src/github.com/goinaction/code/temp/main.go:5 +0x7

goroutine 17 [runnable]:
runtime.MHeap_Scavenger()
    /go/src/pkg/runtime/mheap.c:507
runtime.goexit()
    /go/src/pkg/runtime/proc.c:1445

Let's look at the assembly produced by the 6g compiler on darwin/amd64 for this example:

go tool 6g -S main.go

04  var p *int
0x0004 00004 (main.go:4)   MOVQ $0,AX

05  *p = 10
0x0007 00007 (main.go:5)   NOP ,
0x0007 00007 (main.go:5)   MOVQ $10,(AX)

In the assembly above we see the value of 0 is assigned to the AX register and then the code attempts to write the value of 10 to the memory that the AX register points to. This results in the panic and corresponding stack trace. There is no nil pointer check in this assembly so the compiler has left this up to the hardware to detect and report.

Compiler Checks
Let's look at an example that produces a nil pointer check by the compiler:

01 package main
02
03 type S struct {
04     b [4096]byte
05     i int
06 }
07
08 func main() {
09     s := new(S)
10     s.i++
11 }

On lines 03 through 06 we have a struct type named S. This type has two fields. The first field is an array of 4096 bytes and the second field is an integer. Then in main on line 09, we create a value of type S and assign the address of that value to a pointer variable named s. Finally on line 10, we increment the value of the i field by one, which is a field inside our value of type S.

Let's look at the assembly produced by the 6g compiler on darwin/amd64 for lines 09 and 10:

go tool 6g -S main.go

09  s := new(S)
0x0036 00054 (main.go:9)   LEAQ "".autotmp_0001+0(SP),DI
0x003a 00058 (main.go:9)   MOVL $0,AX
0x003c 00060 (main.go:9)   MOVQ $513,CX
0x0043 00067 (main.go:9)   REP ,
0x0044 00068 (main.go:9)   STOSQ ,
0x0046 00070 (main.go:9)   LEAQ "".autotmp_0001+0(SP),BX

10  s.i++
0x004a 00074 (main.go:10)  CMPQ BX,$0
0x004e 00078 (main.go:10)  JEQ $1,105
0x0050 00080 (main.go:10)  MOVQ 4096(BX),BP
0x0057 00087 (main.go:10)  NOP ,
0x0057 00087 (main.go:10)  INCQ ,BP
0x005a 00090 (main.go:10)  MOVQ BP,4096(BX)
0x0061 00097 (main.go:10)  NOP ,

The code on line 10 in our example requires pointer arithmetic underneath to make it work:

10  s.i++

0x004a 00074 (main.go:10)  CMPQ BX,$0
0x004e 00078 (main.go:10)  JEQ $1,105

0x0050 00080 (main.go:10)  MOVQ 4096(BX),BP
0x0057 00087 (main.go:10)  NOP ,
0x0057 00087 (main.go:10)  INCQ ,BP
0x005a 00090 (main.go:10)  MOVQ BP,4096(BX)

The i field is located at a 4096 byte offset inside the value of type S. In the assembly, the BP register is assigned the value at the memory location of BX (value of s) plus the offset of 4096. Then the value of BP is incremented by one and the new value is assigned back to the memory at location BX + 4096.

In this example, the Go compiler is adding a nil pointer check prior to the increment code:

10  s.i++

0x004a 00074 (main.go:10)  CMPQ BX,$0
0x004e 00078 (main.go:10)  JEQ $1,105

0x0050 00080 (main.go:10)  MOVQ 4096(BX),BP
0x0057 00087 (main.go:10)  NOP ,
0x0057 00087 (main.go:10)  INCQ ,BP
0x005a 00090 (main.go:10)  MOVQ BP,4096(BX)

The highlighted code above shows the nil pointer check added by the compiler. The value of BX is compared with the value of 0 and if they are equal, the code jumps away from the increment code.

The question is, why is Go adding a nil pointer check in this example and not in the first example?

When Checks Are Added
Let's take the last example and make a quick change:

01 package main
02
03 type S struct {
04     i int          // Switching field order
05     b [4096]byte
06 }
07
08 func main() {
09     s := new(S)
10     s.i++
11 }

All I have done is switched the fields in the struct. This time the i field of type int comes before the b field of type byte array of 4096 elements. Now let's produce the assembly as see if the nil pointer check is still there:

09  s := new(S)
0x0036 00054 (main.go:9)   LEAQ "".autotmp_0001+0(SP),DI
0x003a 00058 (main.go:9)   MOVL $0,AX
0x003c 00060 (main.go:9)   MOVQ $513,CX
0x0043 00067 (main.go:9)   REP ,
0x0044 00068 (main.go:9)   STOSQ ,
0x0046 00070 (main.go:9)   LEAQ "".autotmp_0001+0(SP),BX

10  s.i++
0x004a 00074 (main.go:10)  NOP ,
0x004a 00074 (main.go:10)  MOVQ (BX),BP
0x004d 00077 (main.go:10)  NOP ,
0x004d 00077 (main.go:10)  INCQ ,BP
0x0050 00080 (main.go:10)  MOVQ BP,(BX)
0x0053 00083 (main.go:10)  NOP ,

If you compare the assembly for the first example for line 10 with the last example you will see a big difference:

First Example      |   Second Example
CMPQ BX,$0         |   NOP
JEQ $1,105         |   MOVQ (BX),BP
MOVQ 4096(BX),BP   |   NOP ,
NOP ,              |   INCQ ,BP
INCQ ,BP           |   MOVQ BP,(BX)
MOVQ BP,4096(BX)   |   NOP ,

Once we moved the integer field to be first in the struct, the nil pointer check has disappeared. Go is now leaving it up to the hardware again to report any invalid memory access with the use of that field.

The reason is that Go can trust the hardware to identify and report invalid memory access from addresses that may exist between 0x0 and 0xFFF. There is no reason for the compiler to add checks in these cases. Once the code is handling addresses that may be outside of this 4k range, nil pointer checks are placed back into the assembly.

When the array of bytes comes first in the struct, the integer field is set at an offset past the 4k boundary. This causes the compiler to add the nil pointer check for the integer field. When the integer field is first, the compiler leaves it up to the hardware to detect because the address is within the 4k range.

Conclusion
I don't believe in writing code with a focus on performance. Write code that is idiomatic, clean and simple. Then benchmark your program and make performance tweaks where you need to. As we saw, the design of your structs can play a role in the performance of your programs. Accessing fields within a struct value requires pointer arithmetic. For structs that are larger than 4K, the way you organize your fields could make a difference.

There is one thing to note. What we are seeing is a compiler implementation detail and not part of the Go specification. This could change with any new release of Go or be different between the 5g, 6g or 8g compilers. Be careful when you performance tune on an implementation detail, these need to be rechecked with each new version. However, it is interesting to see how the compiler wants to make our code safe and yet, when it can trust the hardware it will do so.

Bonus Material
If you want to see a case in the standard library where a struct was re-ordered to eliminate nil pointer checks for a field, check out this code change by Nigel Tao:
https://codereview.appspot.com/6625058

Here is a document written by Russ Cox in July 2013 titled, "Go 1.2 Field Selectors and Nil Checks".
http://golang.org/s/go12nil