NAUT-1310: add SBOM RPCs to StorageService#50
Conversation
Adds four SBOM RPCs to support node-agent (workload) and kubevuln (registry) writing SBOMs directly to the backend, per design `naut-1310-storage-deprecation.md` §5.2 / §6 S1: - PutSBOM(unary) — upload SBOM ≤ 4 MiB - PutSBOMStream(client) — upload SBOM > 4 MiB, chunked - GetSBOM(unary) — probe (metadata_only=true) or fetch ≤ 4 MiB - GetSBOMStream(server) — fetch > 4 MiB, chunked Primary key on the wire is `(customer_guid, image_digest, syft_version)`, where `customer_guid` is supplied via the existing gRPC metadata header pattern used by `SendContainerProfile`. SBOMSource enum distinguishes workload / registry / host origin so a single Vulnerability Scanner pipeline downstream can route per-kind. Hand-written StorageClient wrappers added in `storageclient.go`. Unary methods mirror the existing pattern; streaming methods marshal the SBOMSyft proto, chunk at 1 MiB, and reassemble server-side. Makefile updated to also invoke `protoc-gen-go-grpc` — previously it ran only `protoc-gen-gogo`, leaving `storage_service_grpc.pb.go` stale on proto changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds metadata-first streaming RPCs for SBOM and container-profile transfers, updates proto Makefile to generate both gogo message bindings and go-grpc service stubs, implements chunked client upload/download helpers with a streaming ReadCloser, and adds bufconn-based tests plus mock streaming hooks. ChangesSBOM gRPC client support
Sequence Diagram(s)sequenceDiagram
participant Client
participant StorageClient
participant gRPC
Client->>StorageClient: PutSBOMStream(imageDigest, syftVersion, source, io.Reader)
StorageClient->>gRPC: PutSBOMChunk(metadata + first blob slice)
StorageClient->>gRPC: PutSBOMChunk(blob chunks...)
gRPC-->>StorageClient: PutSBOMResponse
Client->>StorageClient: GetSBOMStream(imageDigest, syftVersion)
StorageClient->>gRPC: GetSBOMRequest
gRPC-->>StorageClient: GetSBOMChunk(metadata)
gRPC-->>StorageClient: GetSBOMChunk(blob chunks...)
StorageClient-->>Client: io.ReadCloser (streamed bytes)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
matthyx
left a comment
There was a problem hiding this comment.
I don't like the dual API depending on the size, can we make it simpler for the caller?
Addresses matthyx review on #50: - Drop unary PutSBOM and GetSBOM RPCs. There is now ONE client-streaming PutSBOM and ONE server-streaming GetSBOM — callers never need to know payload size in advance. Single-chunk streams handle the small-payload case naturally. - Client wrappers take/return io.Reader instead of *v1beta1.SBOMSyft, so callers can stream from any source (a file, a syft pipe, etc.) without holding the full marshaled blob in memory. Convenience helpers MarshalSBOM and UnmarshalSBOM cover the typed-object case. - Add a bufconn-based round-trip test (TestStorageClient_SBOMRoundTrip) that stands up a real gRPC server in-process and exercises the full marshal → chunk → wire → unchunk → unmarshal path with a non-trivial SBOMSyft. Covers: probe hit, probe miss, full fetch single-chunk, full fetch with the payload split across many small chunks (stresses the io.ReadCloser buffering logic). This is the kind of end-to-end test matthyx asked for — would have caught any wire-shape regression that the prior mock-based smoke tests missed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per discussion on PR #50: - Rename PutSBOM/GetSBOM to PutSBOMStream/GetSBOMStream (proto + Go client + bufconn test server). Suffix makes the streaming nature explicit and leaves the naming room for unary helpers later if needed. - Add SendContainerProfileStream (client-streaming) and GetContainerProfileStream (server-streaming) RPCs as the proper fix for the latent 4 MiB cliff in SendContainerProfile / GetProfile. The entry-count cap on ContainerProfile only loosely correlates with serialized byte size — a profile with high MaxContainerProfileSize or unusually large individual entries can still fail silently on the wire. - Mark `rpc SendContainerProfile` as `option deprecated = true` and add a Deprecated: doc comment on the Go wrapper. GetProfile is left alone because it serves ApplicationProfile and NetworkNeighborhood too — those need their own streaming variants if they have the same risk. (Out of scope for this PR; separate ticket.) - Add a bufconn round-trip test for CP streaming mirroring the existing SBOM round-trip test: send → verify server received intact bytes → fetch with payload split across small chunks → verify reassembly. Renamed sbomRoundTripServer to storageRoundTripServer since it now hosts both SBOM and CP handlers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
It now has a streaming replacement (GetContainerProfileStream) and delegates to the unary GetProfile RPC under the hood, which carries the same 4 MiB cliff exposure. GetApplicationProfile and GetNetworkNeighborhood remain undecorated — they don't have streaming variants yet. The proto-level GetProfile RPC also stays as-is because it still serves AP/NN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
pkg/client/v1/proto/storage_service.proto (1)
386-392: 💤 Low valueConsider
google.protobuf.Timestampinstead of RFC3339 strings.
created_atandlast_referenced_atas strings push timestamp parsing/formatting onto every consumer and lose type checking at the wire.google.protobuf.Timestampwould give you well-known semantics, native Gotime.Timeconversion viaAsTime(), and cleaner JSON encoding.Since these fields are new in this PR, there's no migration cost.
♻️ Proposed change
+import "google/protobuf/timestamp.proto"; + // SBOMMetadata is the indexed view of an SBOM row, returned by GetSBOM // with or without metadata_only=true. message SBOMMetadata { ... - // created_at is the RFC3339 timestamp when the row was first inserted. - string created_at = 5; + // created_at is when the row was first inserted. + google.protobuf.Timestamp created_at = 5; - // last_referenced_at is the RFC3339 timestamp when the orchestrator most - // recently resolved an inventory entry to this SBOM. Drives eviction. - string last_referenced_at = 6; + // last_referenced_at is when the orchestrator most recently resolved an + // inventory entry to this SBOM. Drives eviction. + google.protobuf.Timestamp last_referenced_at = 6; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/client/v1/proto/storage_service.proto` around lines 386 - 392, Replace the RFC3339 string fields created_at and last_referenced_at in the storage_service.proto message with google.protobuf.Timestamp types: add an import for google/protobuf/timestamp.proto, change the field types from string to google.protobuf.Timestamp for the symbols created_at and last_referenced_at, then regenerate the protobuf stubs (and update any consumers) so they use the generated Timestamp types (e.g., in Go call AsTime()/AsTimePtr() or FromTime when constructing) instead of manual string parsing/formatting.pkg/client/v1/storageclient.go (2)
22-25: 💤 Low valueRename
sbomStreamChunkSize— it's also used by container-profile streams.Both
SendContainerProfileStream(line 322) andPutSBOMStreamuse this constant. The SBOM-prefixed name is misleading now that it governs all four streaming RPCs.♻️ Proposed rename
-// sbomStreamChunkSize is the per-chunk byte budget used by PutSBOMStream -// and GetSBOMStream. Set well below the default 4 MiB gRPC message limit -// to leave headroom for framing overhead. -const sbomStreamChunkSize = 1 << 20 // 1 MiB +// streamChunkSize is the per-chunk byte budget used by all streaming +// upload/download RPCs. Set well below the default 4 MiB gRPC message +// limit to leave headroom for framing overhead. +const streamChunkSize = 1 << 20 // 1 MiB🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/client/v1/storageclient.go` around lines 22 - 25, The constant sbomStreamChunkSize is misleading because it is used by multiple streaming RPCs (PutSBOMStream, GetSBOMStream and SendContainerProfileStream); rename it to a neutral, descriptive identifier (e.g., streamChunkSize or grpcStreamChunkSize) and update all references in storageclient.go (including PutSBOMStream, GetSBOMStream, SendContainerProfileStream) to use the new name so callers and readers see a consistent, accurate symbol across all four streaming RPC implementations.
660-693: ⚡ Quick winRecover real server status when
stream.Sendreturns an error.In gRPC-Go client streams, when the server closes the stream early (e.g., rejects the metadata header),
Sendreturnsio.EOF— the actual status (InvalidArgument,Unauthenticated,ResourceExhaustedfor over-large SBOMs, etc.) is only retrievable by callingCloseAndRecv. Currently, both functions return immediately onSenderror without ever reachingCloseAndRecv, losing the server's diagnostic status.This applies to:
PutSBOMStream: lines 673 and 683SendContainerProfileStream: lines 318 and 327♻️ Suggested pattern
if err := stream.Send(first); err != nil { - return nil, fmt.Errorf("failed to send first chunk: %w", err) + if err == io.EOF { + // Server closed the stream early; retrieve the real status. + if _, recvErr := stream.CloseAndRecv(); recvErr != nil { + return nil, fmt.Errorf("server rejected upload: %w", recvErr) + } + } + return nil, fmt.Errorf("failed to send first chunk: %w", err) }A small helper applied at both
Sendsites in each function keeps it tidy.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/client/v1/storageclient.go` around lines 660 - 693, Both PutSBOMStream and SendContainerProfileStream currently return immediately when stream.Send fails, losing the server's final gRPC status; change both to call stream.CloseAndRecv() when Send returns an error and include that returned error (if any) in the final error you return so callers see the server status. Implement a small helper (e.g., handleSendError(stream grpc.ClientStream, sendErr error) error or a function specific to the typed stream) that: if sendErr == nil returns nil; otherwise calls stream.CloseAndRecv(), and returns a wrapped error that contains both the original sendErr and the CloseAndRecv error (if present). Replace the two direct early returns in PutSBOMStream and SendContainerProfileStream with calls to this helper so CloseAndRecv is always invoked on Send failures and its diagnostic is propagated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/client/v1/storageclient.go`:
- Around line 744-749: GetSBOMStream currently returns (md, nil, nil) when
md.Success is false which hides server failures; change it to mirror
GetContainerProfileStream by returning a non-nil error for the !md.Success case
while still treating !md.Exists and metadataOnly as "no reader" (nil reader)
cases. Concretely, in GetSBOMStream replace the branch that does "if !md.Success
|| !md.Exists || metadataOnly { cancel(); return md, nil, nil }" with logic
that: cancels in all three cases, if !md.Success construct and return an error
that includes md.ErrorCode and md.ErrorMessage (or a formatted string with those
fields) along with md, and only return (md, nil, nil) for the !md.Exists or
metadataOnly paths; follow the error-wrapping style used in
GetContainerProfileStream.
---
Nitpick comments:
In `@pkg/client/v1/proto/storage_service.proto`:
- Around line 386-392: Replace the RFC3339 string fields created_at and
last_referenced_at in the storage_service.proto message with
google.protobuf.Timestamp types: add an import for
google/protobuf/timestamp.proto, change the field types from string to
google.protobuf.Timestamp for the symbols created_at and last_referenced_at,
then regenerate the protobuf stubs (and update any consumers) so they use the
generated Timestamp types (e.g., in Go call AsTime()/AsTimePtr() or FromTime
when constructing) instead of manual string parsing/formatting.
In `@pkg/client/v1/storageclient.go`:
- Around line 22-25: The constant sbomStreamChunkSize is misleading because it
is used by multiple streaming RPCs (PutSBOMStream, GetSBOMStream and
SendContainerProfileStream); rename it to a neutral, descriptive identifier
(e.g., streamChunkSize or grpcStreamChunkSize) and update all references in
storageclient.go (including PutSBOMStream, GetSBOMStream,
SendContainerProfileStream) to use the new name so callers and readers see a
consistent, accurate symbol across all four streaming RPC implementations.
- Around line 660-693: Both PutSBOMStream and SendContainerProfileStream
currently return immediately when stream.Send fails, losing the server's final
gRPC status; change both to call stream.CloseAndRecv() when Send returns an
error and include that returned error (if any) in the final error you return so
callers see the server status. Implement a small helper (e.g.,
handleSendError(stream grpc.ClientStream, sendErr error) error or a function
specific to the typed stream) that: if sendErr == nil returns nil; otherwise
calls stream.CloseAndRecv(), and returns a wrapped error that contains both the
original sendErr and the CloseAndRecv error (if present). Replace the two direct
early returns in PutSBOMStream and SendContainerProfileStream with calls to this
helper so CloseAndRecv is always invoked on Send failures and its diagnostic is
propagated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9f7c6a87-afa9-4dbc-8f5c-d2ec9726f7b2
⛔ Files ignored due to path filters (2)
pkg/client/v1/proto/storage_service.pb.gois excluded by!**/*.pb.gopkg/client/v1/proto/storage_service_grpc.pb.gois excluded by!**/*.pb.go
📒 Files selected for processing (3)
pkg/client/v1/proto/storage_service.protopkg/client/v1/storageclient.gopkg/client/v1/storageclient_test.go
Co-authored-by: Matthias Bertschy <matthias.bertschy@gmail.com> Signed-off-by: Jonathan Green <jonathang@armosec.io>
- CodeRabbit: GetSBOMStream silently masked server-side failures. When the server reported !md.Success the helper returned (md, nil, nil), so a caller checking `if err != nil` couldn't tell a real RPC failure apart from a clean miss. Now returns a non-nil error on !md.Success, wrapping md.ErrorMessage / md.ErrorCode. !md.Exists and metadataOnly remain the (md, nil, nil) "no reader, not an error" paths. Mirrors GetContainerProfileStream's existing contract; doc updated to lead with the contract. - matthyx (line 656): PutSBOMStream was wrapping the stream context in callTimeout — defeats the purpose of streaming for large payloads. Dropped on PutSBOMStream, SendContainerProfileStream, and GetContainerProfileStream (the latter two had the same bug). GetSBOMStream already avoided it explicitly. All four streaming wrappers now share the same comment explaining why. - matthyx (line 874): added TestStorageClient_GetSBOMStream_NotBounded ByCallTimeout. Configures the client with WithCallTimeout(20ms), has the server delay between chunks by 50ms × 5 chunks = 250ms of server-side stream time (12× the configured timeout). If the stream were wrapped in context.WithTimeout(callTimeout), the call would fail with deadline-exceeded in the first 20ms. Stays under 500ms wall time so it's CI-safe. Extended startBufconnStorageServer to forward StorageClientOption variadics and added a serveChunkDelay knob to the round-trip server. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| // Note: we deliberately do NOT apply callTimeout here, because stream | ||
| // duration depends on payload size / network speed. Callers wanting a | ||
| // timeout should pass a ctx with their own deadline. | ||
| } |
There was a problem hiding this comment.
This stray } closes PutSBOMStream early, so stream, err := c.protoClient.PutSBOMStream(ctx) is parsed outside the function body and the package no longer builds.
The current CI failure is:
pkg/client/v1/storageclient.go:658:2: syntax error: non-declaration statement outside function body
| // Note: we deliberately do NOT apply callTimeout here, because stream | |
| // duration depends on payload size / network speed. Callers wanting a | |
| // timeout should pass a ctx with their own deadline. | |
| } | |
| // Note: we deliberately do NOT apply callTimeout here, because stream | |
| // duration depends on payload size / network speed. Callers wanting a | |
| // timeout should pass a ctx with their own deadline. |
Adds four SBOM RPCs to support node-agent (workload) and kubevuln (registry) writing SBOMs directly to the backend, per design
naut-1310-storage-deprecation.md§5.2 / §6 S1:Primary key on the wire is
(customer_guid, image_digest, syft_version), wherecustomer_guidis supplied via the existing gRPC metadata header pattern used bySendContainerProfile. SBOMSource enum distinguishes workload / registry / host origin so a single Vulnerability Scanner pipeline downstream can route per-kind.Hand-written StorageClient wrappers added in
storageclient.go. Unary methods mirror the existing pattern; streaming methods marshal the SBOMSyft proto, chunk at 1 MiB, and reassemble server-side.Makefile updated to also invoke
protoc-gen-go-grpc— previously it ran onlyprotoc-gen-gogo, leavingstorage_service_grpc.pb.gostale on proto changes.Summary by CodeRabbit
New Features
Tests
Chores