
Memory considerations around how bytes fields are deserialized #712

@jcready



When deserializing binary data, protobuf-ts tries to be fast by creating only "views" over the underlying ArrayBuffer instead of copying. This is only a concern when a message has a bytes field, which is read like this:

return this.buf.subarray(start, start + len);
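To make the view behaviour concrete, here is a minimal sketch using only standard typed-array methods. It does not touch protobuf-ts; the variable names and byte values are made up for illustration. It shows that `subarray` shares the original ArrayBuffer while `slice` allocates a new one.

```typescript
// Stand-in for the buffer that holds a whole serialized message.
const wholeMessage = new Uint8Array([10, 20, 30, 40, 50, 60]);

// subarray() creates a view: no bytes are copied.
const bytesView = wholeMessage.subarray(2, 4);

// slice() copies the range into a new, independent ArrayBuffer.
const bytesCopy = wholeMessage.slice(2, 4);

const viewSharesBuffer = bytesView.buffer === wholeMessage.buffer; // true
const copySharesBuffer = bytesCopy.buffer === wholeMessage.buffer; // false

// A write through the original is visible in the view but not in the copy.
wholeMessage[2] = 99;
const viewFirstByte = bytesView[0]; // 99
const copyFirstByte = bytesCopy[0]; // 30
```

Because the view holds a reference to the shared ArrayBuffer, the whole buffer stays reachable for as long as the view does.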

Let's imagine we have the following protos:

message ManyFoos {
  repeated Foo foo = 1;
}

message Foo {
  optional bytes bin = 1;
}

For the sake of argument, say I have the binary representation of a ManyFoos instance. It contains 1000 Foo instances, and each Foo has a bin field with ~1MB of data. Say I also have a function like this:

function getFirstFoo(manyBin: Uint8Array): Foo | undefined {
  const many = ManyFoos.fromBinary(manyBin);
  return many.foo[0];
}

const firstFoo = getFirstFoo(uint8_from_somewhere);

All ~1000MB (roughly 1GB) of data is retained in memory for the lifetime of firstFoo. This is because the bytes field is deserialized into a view on top of the original ArrayBuffer, so firstFoo.bin keeps the entire buffer reachable.

I guess it's all tradeoffs, but the current approach means that a single bytes field nested anywhere inside a large protobuf message can poison your application. If you hold onto a reference to that bytes field, none of the original bytes from the large message can be GCed until that reference goes out of scope.
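One possible workaround on the caller's side, sketched under the assumption that the deserialized bytes field is a plain Uint8Array view: copy the field before holding onto it. `detachBytes` is a hypothetical helper, not part of protobuf-ts, and the buffer below only simulates a serialized message.

```typescript
// Hypothetical helper (not a protobuf-ts API): copy a bytes field into its own
// ArrayBuffer so it no longer references the buffer it was parsed from.
function detachBytes(bytes: Uint8Array): Uint8Array {
  return bytes.slice();
}

// Simulate a 1 MiB serialized message and a 16-byte bytes field viewing it.
const serializedMessage = new Uint8Array(1024 * 1024);
const binView = serializedMessage.subarray(16, 32);

const detachedBin = detachBytes(binView);

// The view is backed by the full message buffer; the copy only by its own bytes.
const viewBackingSize = binView.buffer.byteLength; // 1048576
const detachedBackingSize = detachedBin.buffer.byteLength; // 16
```

In the getFirstFoo example, this would mean replacing the bin field with a detached copy before returning the Foo. The cost is one extra copy per retained field, which is the tradeoff the view-based approach avoids by default.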
