I’ve always found the signature of io.Reader
a bit odd:
type Reader interface {
Read(p []byte) (n int, err error)
}
Why take a byte slice and write data into it? Wouldn’t it be simpler to create the slice
inside Read
, load the data, and return it instead?
// Hypothetical; what I *thought* it should be
Read() (p []byte, err error)
This felt more intuitive to me—you call Read
, and it gives you a slice filled with data,
no need to pass anything.
I found out why it’s designed this way while watching this1 excellent GopherCon Singapore talk on understanding allocations by Jacob Walker. It mainly boils down to two reasons.
Reducing heap allocations
If Read
created and returned a new slice every time, the memory would always end up on the
heap.
Heap allocations are slower because they require garbage collection, while stack allocations
are faster since they are freed automatically when a function returns. By taking a
caller-provided slice, Read
lets the caller control memory and reuse buffers, keeping them
on the stack whenever possible.
This matters a lot when reading large amounts of data. If each Read
call created a new
slice, you’d constantly be allocating memory, leading to more work for the garbage
collector. Instead, the caller can allocate a buffer once and reuse it across multiple
reads:
buf := make([]byte, 4096) // Single allocation
n, err := reader.Read(buf) // Read into existing buffer
Go’s escape analysis tool (go build -gcflags=-m
) can confirm this. If Read
returned a
new slice, the tool would likely show:
buf escapes to heap
meaning Go has to allocate it dynamically. But by reusing a preallocated slice, we avoid unnecessary heap allocations—only if the buffer is small enough to fit in the stack. How small? Only the compiler knows, and you shouldn’t depend on it. Use the escape analysis tool to see that. But most of the time, you don’t need to worry about this at all.
Reusing buffers in streaming
The second issue is correctness. When reading from a stream, you usually call Read
multiple times to get all the data. If Read
returned a fresh slice every time, you’d have
no control over memory usage across calls. Worse, you couldn’t efficiently handle partial
reads, making buffer management unpredictable.
With the hypothetical version of Read
, every call would allocate a new slice. If you
needed to read a large stream of data, you’d have to manually piece everything together
using append
, like this:
var allData []byte
for {
buf, err := reader.Read() // New allocation every call
if err != nil {
break
}
allData = append(allData, buf...) // Growing slice every time, more allocation
}
process(allData)
This is a mess. Every time append
runs out of space, Go will have to allocate a larger
slice and copy the existing data over, piling on unnecessary GC pressure.
By contrast, io.Reader
’s actual design avoids this problem:
buf := make([]byte, 4096) // Allocate once
for {
n, err := reader.Read(buf)
if err != nil {
break
}
process(buf[:n])
}
This avoids unnecessary allocations and produces less garbage for the GC to clean up.
Recent posts
- Go slice gotchas
- The domain knowledge dilemma
- Hierarchical rate limiting with Redis sorted sets
- Dynamic shell variables
- Link blog in a static site
- Running only a single instance of a process
- Function types and single-method interfaces in Go
- SSH saga
- Injecting Pytest fixtures without cluttering test signatures
- Explicit method overriding with @typing.override