How to Optimize Go Programs with Stack Allocations: A Step-by-Step Guide

From Stripgay, the free encyclopedia of technology

Introduction

Heap allocations in Go can slow down your program and put extra pressure on the garbage collector. The Go team has been working on moving more allocations to the stack, where they are cheaper and create no work for the garbage collector. This guide shows how to identify heap allocations and refactor your code so that more of them can live on the stack, using the example of building a slice of tasks from a channel. By following these steps, you can make your Go programs faster and more efficient.

Source: blog.golang.org

What You Need

  • A Go development environment (Go 1.22 or later recommended for best escape analysis)
  • Basic knowledge of Go slices and channels
  • A hot code path in your project where allocations are frequent (you can use go test -bench or pprof to find them)
  • The pprof tool for memory profiling (optional but helpful)

Step-by-Step Guide

Step 1: Identify Code That Causes Heap Allocations

Start by finding functions that repeatedly append to slices inside loops, especially when the final size is unknown. Heap allocations often occur when append grows a slice beyond its current capacity. Use the Go memory profiler to locate these spots:

import "os"
import "runtime/pprof"
// In your test or main (error handling omitted for brevity):
f, _ := os.Create("heap.prof")
defer f.Close()
pprof.WriteHeapProfile(f)

Look for allocation-heavy functions. The example below allocates on the heap inside a loop:

func process(c chan task) {
    var tasks []task
    for t := range c {
        tasks = append(tasks, t)
    }
    processAll(tasks)
}
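To make the profiling snippet from this step concrete, here is a minimal, self-contained sketch of capturing a heap profile. The helper name heapProfileBytes and the placeholder workload are assumptions for illustration; only the file name heap.prof comes from the text above.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfileBytes (hypothetical helper) returns a heap profile of the
// current process as raw bytes.
func heapProfileBytes() ([]byte, error) {
	// Run a GC first so the profile reflects up-to-date statistics.
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Placeholder workload: replace with the code path you want to inspect.
	var data [][]byte
	for i := 0; i < 1000; i++ {
		data = append(data, make([]byte, 1024))
	}

	profile, err := heapProfileBytes()
	if err != nil {
		fmt.Fprintln(os.Stderr, "heap profile:", err)
		os.Exit(1)
	}
	if err := os.WriteFile("heap.prof", profile, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "write heap.prof:", err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to heap.prof (%d buffers kept alive)\n", len(profile), len(data))
}
```

The resulting file can then be opened with go tool pprof heap.prof; passing -sample_index=alloc_space switches the view from live memory to everything allocated since the program started.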

Step 2: Understand Why the Heap Is Used

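When you declare var tasks []task, the slice header lives on the stack and the slice starts out nil, with no backing array. The first append has to allocate one, and because the compiler cannot tell how large the slice will become, that backing array generally goes on the heap. Each time the capacity is exceeded, append allocates a larger array, copies the elements over, and leaves the old array behind as garbage. For the first few iterations the capacity doubles (1, 2, 4, 8, ...), so even a small slice pays for several allocations in quick succession. This startup phase is wasteful, especially if the slice remains small.

In the example, iteration 1 allocates a backing array of capacity 1; iteration 2, capacity 2; iteration 3, capacity 4; iteration 4 allocates nothing because space is available; iteration 5, capacity 8; and so on. The exact capacities can vary with the element size and the Go version, but the pattern is the same, and the overhead adds up in hot loops.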
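The growth pattern described above can be observed directly. The following sketch appends tasks one at a time, as the loop in process does, and records the capacity after every reallocation. The task type and the function name capacityChanges are stand-ins invented for this illustration, and the printed capacities depend on the Go version and element size.

```go
package main

import "fmt"

// task is a stand-in for the element type used in this guide.
type task struct{ id int }

// capacityChanges appends n tasks one at a time and returns the capacity
// the slice had after each reallocation.
func capacityChanges(n int) []int {
	var tasks []task
	var caps []int
	for i := 0; i < n; i++ {
		before := cap(tasks)
		tasks = append(tasks, task{id: i})
		if cap(tasks) != before {
			// append had to move the elements to a larger backing array.
			caps = append(caps, cap(tasks))
		}
	}
	return caps
}

func main() {
	fmt.Println(capacityChanges(20))
}
```

Every entry in the printed list corresponds to one new backing array, and every array except the last became garbage as soon as the next one was allocated.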

Step 3: Preallocate Slice Capacity to Avoid Repeated Allocations

The simplest fix is to preallocate the slice with a known or estimated capacity. The slice header stays on the stack, and the backing array is allocated once up front instead of being regrown repeatedly. That one allocation is normally on the heap, but the compiler can place it on the stack when the capacity is a small constant and the slice does not escape. Use make with a capacity hint:

func process(c chan task) {
    // Preallocate capacity to avoid growth allocations
    tasks := make([]task, 0, 1000) // estimate based on typical channel size
    for t := range c {
        tasks = append(tasks, t)
    }
    processAll(tasks)
}

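Now the backing array is allocated only once, and as long as the number of tasks stays within the capacity, each append just stores the element and increments the length. If the estimate turns out to be too low, append still works and falls back to growing the slice. This reduces heap allocations from many to one (or zero if the compiler can stack-allocate the backing array).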
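To illustrate how the capacity hint behaves, here is a small sketch that drains a channel into a preallocated slice and reports the final length and capacity. The helpers collect and feed, and the hint values 16 and 4, are assumptions chosen for the example rather than part of the original code.

```go
package main

import "fmt"

type task struct{ id int }

// collect drains c into a slice preallocated with the given capacity hint
// and returns the final length and capacity.
func collect(c <-chan task, hint int) (length, capacity int) {
	tasks := make([]task, 0, hint)
	for t := range c {
		tasks = append(tasks, t)
	}
	return len(tasks), cap(tasks)
}

// feed returns a closed, buffered channel holding n tasks (hypothetical helper).
func feed(n int) <-chan task {
	c := make(chan task, n)
	for i := 0; i < n; i++ {
		c <- task{id: i}
	}
	close(c)
	return c
}

func main() {
	l, c := collect(feed(10), 16)
	fmt.Println(l, c) // hint was large enough: the capacity is unchanged
	l, c = collect(feed(10), 4)
	fmt.Println(l, c) // hint was too small: append grew the slice past 4
}
```

An unchanged capacity means no reallocation happened after make. A capacity larger than the hint means the estimate was too low and at least one growth allocation occurred.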

Step 4: Let the Compiler's Escape Analysis Help

Go’s escape analysis keeps an allocation on the stack when it can prove that the value does not outlive the function that created it. To help, keep the slice small and its capacity constant. If the preallocated capacity is known at compile time (e.g., make([]task, 0, 10)), the compiler may place the backing array on the stack. Avoid storing the slice in a global variable or returning it from the function, as either forces a heap allocation. In our example, the slice is passed to processAll; if that function does not let the slice escape either, the allocation may stay on the stack. To see what the compiler decided, build with go build -gcflags=-m, which prints its escape analysis diagnostics.
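The difference between a slice that stays local and one that escapes can be measured. The sketch below is an illustration under assumptions: the function names are invented, and the noinline directives are there only to keep the two cases separate in this small program. It uses testing.AllocsPerRun to count heap allocations per call.

```go
package main

import (
	"fmt"
	"testing"
)

// sumLocal builds a small slice with a constant capacity and never lets it
// leave the function, so the compiler is free to keep the backing array on
// the stack.
//
//go:noinline
func sumLocal() int {
	s := make([]int, 0, 8)
	for i := 0; i < 8; i++ {
		s = append(s, i)
	}
	total := 0
	for _, v := range s {
		total += v
	}
	return total
}

// buildAndReturn returns the slice, so its backing array must outlive the
// call and is allocated on the heap.
//
//go:noinline
func buildAndReturn() []int {
	s := make([]int, 0, 8)
	for i := 0; i < 8; i++ {
		s = append(s, i)
	}
	return s
}

var sink int // keeps the results in use

func allocsLocal() float64 {
	return testing.AllocsPerRun(100, func() { sink += sumLocal() })
}

func allocsReturned() float64 {
	return testing.AllocsPerRun(100, func() { sink += len(buildAndReturn()) })
}

func main() {
	fmt.Println("local slice allocs/run:   ", allocsLocal())
	fmt.Println("returned slice allocs/run:", allocsReturned())
}
```

Building the same file with go build -gcflags=-m should report the first make as not escaping and the second as escaping to the heap, although the exact wording of the diagnostics varies between Go versions.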

Step 5: Profile to Verify the Improvement

After making changes, run your benchmarks again. Compare the number of heap allocations before and after:

go test -bench=. -benchmem

Look for a reduction in the allocs/op column (and in B/op). For the channel example, you should see far fewer allocations. Also run the memory profiler to confirm that the backing array is no longer allocated repeatedly.
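As a rough sketch of such a before/after comparison, the program below benchmarks both versions of process from this guide. In a real project these would normally be BenchmarkXxx functions in a _test.go file run with the command above. Here testing.Benchmark is used instead so the example fits in one runnable file. The helpers feed, processAll, and allocsPerOp are hypothetical, and each measurement also counts the allocation for the channel itself.

```go
package main

import (
	"fmt"
	"testing"
)

type task struct{ id int }

// feed returns a closed, buffered channel holding n tasks (hypothetical helper).
func feed(n int) chan task {
	c := make(chan task, n)
	for i := 0; i < n; i++ {
		c <- task{id: i}
	}
	close(c)
	return c
}

// processAll is a stand-in that reads the tasks without retaining the slice.
func processAll(tasks []task) int {
	sum := 0
	for _, t := range tasks {
		sum += t.id
	}
	return sum
}

// processGrowing is the original version: the slice grows as tasks arrive.
func processGrowing(c chan task) int {
	var tasks []task
	for t := range c {
		tasks = append(tasks, t)
	}
	return processAll(tasks)
}

// processPrealloc is the version from Step 3 with a capacity hint.
func processPrealloc(c chan task) int {
	tasks := make([]task, 0, 1000)
	for t := range c {
		tasks = append(tasks, t)
	}
	return processAll(tasks)
}

var sink int // keeps the results in use

// allocsPerOp benchmarks process on 100 tasks and returns allocations per call.
func allocsPerOp(process func(chan task) int) int64 {
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink += process(feed(100))
		}
	})
	return res.AllocsPerOp()
}

func main() {
	fmt.Println("growing allocs/op:     ", allocsPerOp(processGrowing))
	fmt.Println("preallocated allocs/op:", allocsPerOp(processPrealloc))
}
```

The absolute numbers depend on the Go version, but the preallocated version should report fewer allocations per operation than the growing one.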

Tips for Further Optimization

  • Use small, constant-sized slices when possible. The compiler can often place them on the stack, eliminating the heap allocation entirely.
  • Consider using arrays instead of slices if the size is known and small (e.g., [4]int instead of []int). An array that does not escape and is not too large is stack-allocated.
  • Profile regularly. Use go tool pprof -alloc_objects to find remaining heap allocation hotspots.
  • Be mindful of closures and interface parameters. A variable captured by a closure that outlives its function is moved to the heap, and converting a value to an interface type can allocate a copy of it.
  • Update to the latest Go version; newer releases have improved escape analysis and can stack-allocate more cases. Separately, runtime work such as the Green Tea garbage collector lowers the cost of collecting the heap allocations that remain.
  • If you cannot preallocate, consider using a sync.Pool for very short-lived slices to recycle allocations, though this is a last resort.
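The array tip above can be combined with append by slicing a fixed-size array and using it as the initial backing store. The following is a minimal sketch under assumptions: the buffer size of 16, the function name processSmall, and the helpers feed and processAll are all chosen for illustration.

```go
package main

import "fmt"

type task struct{ id int }

// processAll is a stand-in that reads the tasks without retaining the slice.
func processAll(tasks []task) int {
	sum := 0
	for _, t := range tasks {
		sum += t.id
	}
	return sum
}

// processSmall collects tasks into a slice backed by a fixed-size array.
func processSmall(c <-chan task) int {
	var buf [16]task // fixed-size array; can stay on the stack if it does not escape
	tasks := buf[:0] // length 0, capacity 16, backed by buf
	for t := range c {
		// Within the first 16 tasks this writes into buf; beyond that,
		// append moves the elements to a larger heap-allocated array.
		tasks = append(tasks, t)
	}
	return processAll(tasks)
}

// feed returns a closed, buffered channel holding n tasks (hypothetical helper).
func feed(n int) <-chan task {
	c := make(chan task, n)
	for i := 0; i < n; i++ {
		c <- task{id: i}
	}
	close(c)
	return c
}

func main() {
	fmt.Println(processSmall(feed(10))) // fits in the array
	fmt.Println(processSmall(feed(20))) // exceeds the array and falls back to growth
}
```

This keeps the common small case free of slice allocations while still handling larger inputs correctly, at the cost of reserving the array's space in every call.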

By following these steps, you can dramatically reduce heap allocations in hot paths, making your Go programs faster and more efficient. Start with step 1 and work through each step to apply stack allocation principles to your own code.