Skip to content

fix: persist Moltbook auth DIDs and deduplicate by moltbook_id#4

Open
HaraldeRoessler wants to merge 1 commit into
MoltyCel:mainfrom
HaraldeRoessler:fix/moltbook-auth-orphan-dids
Open

fix: persist Moltbook auth DIDs and deduplicate by moltbook_id#4
HaraldeRoessler wants to merge 1 commit into
MoltyCel:mainfrom
HaraldeRoessler:fix/moltbook-auth-orphan-dids

Conversation

@HaraldeRoessler

Copy link
Copy Markdown
Contributor

Summary

  • /auth/moltbook now stores the agent in the database on first authentication
  • Repeated auth calls for the same Moltbook user return the same DID (lookup by moltbook_id + platform='moltbook')
  • Only creates a new agent record on first auth

Problem

The endpoint generated a new random DID on every call (uuid.uuid4()) without inserting it into the agents table. This meant:

  • Repeated logins returned different DIDs each time
  • The returned DID was useless (not registered, could not be used with any other endpoint)
  • Orphan DIDs accumulated conceptually with no cleanup

Test plan

  • First /auth/moltbook call with valid token creates agent and returns DID
  • Second call with same user token returns the same DID
  • DID appears in /identity/resolve/{did}
  • Different Moltbook users get different DIDs

Generated with Claude Code

The /auth/moltbook endpoint generated a new random DID on every call
without storing it, creating orphan DIDs that could never be used with
other endpoints. Repeated auth calls for the same user returned
different DIDs each time.

Now looks up existing agent by moltbook_id (stored as display_name with
platform='moltbook'). Returns the same DID on repeated auth. Only
creates a new agent on first authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@HaraldeRoessler

Copy link
Copy Markdown
Contributor Author

Merge order

Independent — can merge in any order relative to #2 and #7.

Sequence: #1#3#2#4#7

HaraldeRoessler added a commit to HaraldeRoessler/moltrust-api that referenced this pull request May 12, 2026
Follow-up to commit d25e70c (SSRF). After running CodeQL default-setup
on the fork, 17 additional findings surfaced. Triage outcome:

  Already closed by earlier commits this PR:   1 (SSRF)
  False positives (dismissed via CodeQL UI):   4
  Real findings fixed in this commit:          5
  Stack-trace-exposure (deferred to design):   7

FIXES IN THIS COMMIT

  #1 [LOG SANITISATION] credit_middleware exception swallows DB password
     - app/main.py (logger.error in credit_middleware)
     `logger.error("…: %s", caller_did, e)` — the raw exception `e`
     can be an asyncpg ConnectionError whose repr() includes the
     Postgres connection string (with the password). Log only
     `type(e).__name__` instead.

  #2 [DEFENSIVE URL ENCODING] /join?ref= referrer parameter
     - app/main.py /join handler
     The redirect target is HARDCODED to https://moltrust.ch — the
     host is not user-controlled. But `f"https://moltrust.ch?ref={ref}"`
     interpolates `ref` raw, and a payload like `ref="x&malparam=…"`
     could corrupt the query string. Use `urllib.parse.quote(ref)` to
     percent-encode the value before interpolation.

  MoltyCel#3 [STDOUT TOKEN LEAK] telegram_hn_remind print(r.text)
     - scripts/telegram_hn_remind.py
     `print(f'Status: {r.status_code}, Response: {r.text}')` — if
     Telegram error responses ever echo the request URL (which contains
     the bot token in the path), the body lands in stdout / CI scrollback.
     Print only the status code.

  MoltyCel#4 [ReDoS] mpp authorization header regex
     - packages/mpp/index.js
     `auth.match(/^(?:Payment|MPP)\s+(.+)$/i)` on an unbounded header
     is polynomial-quadratic. This package is published to npm, so
     consumer servers carry the risk. Cap header at 8 KiB and use
     bounded `\s{1,8}` with a non-greedy first char.

  MoltyCel#5 [ReDoS] moltrust-openclaw-v2 base URL trim
     - moltrust-openclaw-v2/src/client.ts
     `.replace(/\/+$/, "")` is polynomial on pathological inputs.
     Replace with a `while (str.endsWith("/")) str = str.slice(0, -1)`
     loop, which is linear.

DISMISSED AS FALSE POSITIVES (no code change)

  MoltyCel#14 py/clear-text-logging-sensitive-data at SPIFFE bind log
      Logs spiffe_uri, did, caller_did — none are passwords. CodeQL
      misfires on the "did" → "id" → "password" name-similarity heuristic.

  MoltyCel#13, MoltyCel#12 py/clear-text-logging-sensitive-data in scripts/threadwatch.py
      Telegram bot token flows into the request URL but never into a
      logger or print() call — only to requests.post (which doesn't
      log URLs by default).

  MoltyCel#16 py/weak-sensitive-data-hashing in _reg_tracker
      This is in-memory rate-limit bucket-key derivation, not password
      storage. bcrypt/argon2 would be wrong here (slow + salted breaks
      the lookup). SHA-256 of the full API key is the correct primitive
      for an O(1) tracker.

EXPLICITLY DEFERRED (7 stack-trace-exposure findings)

  Multiple endpoints currently return `{"error": str(e)[:100]}` to
  callers. CodeQL flags these as info disclosure. Fixing them means
  changing the API contract — clients that parse the `error` field
  would break. This is a design call for the maintainer; deferring
  to a separate PR + discussion rather than including in this hardening
  pass.

VERIFICATION

  Python 3.12 AST parse — app/main.py + scripts/telegram_hn_remind.py
  compile cleanly. `node -c packages/mpp/index.js` clean. The TS file
  change is a syntactically-trivial loop, not type-impacting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HaraldeRoessler added a commit to HaraldeRoessler/moltrust-api that referenced this pull request May 18, 2026
Follow-up to commit d25e70c (SSRF). After running CodeQL default-setup
on the fork, 17 additional findings surfaced. Triage outcome:

  Already closed by earlier commits this PR:   1 (SSRF)
  False positives (dismissed via CodeQL UI):   4
  Real findings fixed in this commit:          5
  Stack-trace-exposure (deferred to design):   7

FIXES IN THIS COMMIT

  #1 [LOG SANITISATION] credit_middleware exception swallows DB password
     - app/main.py (logger.error in credit_middleware)
     `logger.error("…: %s", caller_did, e)` — the raw exception `e`
     can be an asyncpg ConnectionError whose repr() includes the
     Postgres connection string (with the password). Log only
     `type(e).__name__` instead.

  #2 [DEFENSIVE URL ENCODING] /join?ref= referrer parameter
     - app/main.py /join handler
     The redirect target is HARDCODED to https://moltrust.ch — the
     host is not user-controlled. But `f"https://moltrust.ch?ref={ref}"`
     interpolates `ref` raw, and a payload like `ref="x&malparam=…"`
     could corrupt the query string. Use `urllib.parse.quote(ref)` to
     percent-encode the value before interpolation.

  MoltyCel#3 [STDOUT TOKEN LEAK] telegram_hn_remind print(r.text)
     - scripts/telegram_hn_remind.py
     `print(f'Status: {r.status_code}, Response: {r.text}')` — if
     Telegram error responses ever echo the request URL (which contains
     the bot token in the path), the body lands in stdout / CI scrollback.
     Print only the status code.

  MoltyCel#4 [ReDoS] mpp authorization header regex
     - packages/mpp/index.js
     `auth.match(/^(?:Payment|MPP)\s+(.+)$/i)` on an unbounded header
     is polynomial-quadratic. This package is published to npm, so
     consumer servers carry the risk. Cap header at 8 KiB and use
     bounded `\s{1,8}` with a non-greedy first char.

  MoltyCel#5 [ReDoS] moltrust-openclaw-v2 base URL trim
     - moltrust-openclaw-v2/src/client.ts
     `.replace(/\/+$/, "")` is polynomial on pathological inputs.
     Replace with a `while (str.endsWith("/")) str = str.slice(0, -1)`
     loop, which is linear.

DISMISSED AS FALSE POSITIVES (no code change)

  MoltyCel#14 py/clear-text-logging-sensitive-data at SPIFFE bind log
      Logs spiffe_uri, did, caller_did — none are passwords. CodeQL
      misfires on the "did" → "id" → "password" name-similarity heuristic.

  MoltyCel#13, MoltyCel#12 py/clear-text-logging-sensitive-data in scripts/threadwatch.py
      Telegram bot token flows into the request URL but never into a
      logger or print() call — only to requests.post (which doesn't
      log URLs by default).

  MoltyCel#16 py/weak-sensitive-data-hashing in _reg_tracker
      This is in-memory rate-limit bucket-key derivation, not password
      storage. bcrypt/argon2 would be wrong here (slow + salted breaks
      the lookup). SHA-256 of the full API key is the correct primitive
      for an O(1) tracker.

EXPLICITLY DEFERRED (7 stack-trace-exposure findings)

  Multiple endpoints currently return `{"error": str(e)[:100]}` to
  callers. CodeQL flags these as info disclosure. Fixing them means
  changing the API contract — clients that parse the `error` field
  would break. This is a design call for the maintainer; deferring
  to a separate PR + discussion rather than including in this hardening
  pass.

VERIFICATION

  Python 3.12 AST parse — app/main.py + scripts/telegram_hn_remind.py
  compile cleanly. `node -c packages/mpp/index.js` clean. The TS file
  change is a syntactically-trivial loop, not type-impacting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MoltyCel added a commit that referenced this pull request Jun 3, 2026
* docs(specs): D-1 Acceptance-Gate architecture brief (design-only)

Scope: AAE draft-04 §5 Step 1 (signature verify + signing-authority) + Step 2 (payload/schema/cty). Decisions: #1 JWS-wrapped VC submit-contract (extract blocks from verified payload; component-1 API/raw_canonical impact named); #2 did:web + did:moltrust launch (did:key follow-on); #3 resolve-and-verify with trust-tiering (trusted vs unverified_issuer, no hard-allowlist); #4 scope = steps 1+2 only (step 4 subject-binding + step 9 delegation = follow-ons). PyJWT 2.12.1 (no new dep). Canonicalization clarity: D-1 verifies JOSE-JWS bytes, not JCS raw_canonical. Open sign-off: DID-resolution depth/SSRF/caching, raw_canonical redefinition, trust-tier persistence.

* docs(specs): resolve 4 D-1 sign-off points (design-only)

1) DID-resolution SSRF/DoS = same egress-proxy as revocation_check (no new mitigation); did:web gated on proxy, D-1 LAUNCHES did:moltrust-only (no outbound, not proxy-gated). 2) raw_canonical = JWS-payload (trigger structurally unchanged); breaking submit-contract change, only smoke-rows affected. 3) trust-tier = new additive column issuer_trust_tier (trusted/unverified_issuer, analog value_source). 4) did:web VM-dereferencing = new layer (resolver gives raw DID-doc only). Phased launch: A did:moltrust-only now, B did:web when egress-proxy live.

* docs(specs): D-1 review-hardening — 4 criticals + 2 mediums resolved (design-only)

alg-confusion (explicit algorithms=[EdDSA] allowlist, never trust header alg); kid strict DID-URL validation + path-traversal/look-alike protection; canonicalization = exact b64url-decoded payload bytes (never re-serialize); submit rate-limit + per-issuer quota (PK already blocks exact replays); did:moltrust registry SPOF -> key rotation; JSON duplicate-keys reject via object_pairs_hook. Implementation contract, not architecture change.

---------

Co-authored-by: Lars Kroehl <kersten.kroehl@cryptokri.ch>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant