Consolidated fork update: battery/power, reliability, history+ETA, mDNS, web UI, diagnostics, CI (+ companion Android app) #31
Open
BattloXX wants to merge 23 commits into
…UI, CI release
Battery (Phase 1):
- WiFi.setSleep(WIFI_PS_MIN_MODEM) — largest single battery gain (~2-3x runtime)
- DFS via esp_pm_configure (80–240 MHz, light_sleep_enable=false)
- Default backlight timeout changed from 0 to 3 minutes
- Webserver task delay(1) → delay(2)
Device logic (Phase 3):
- Probe: 60-sample int16_t history ringbuffer per probe (~1 KB total RAM)
- Probe::seconds_to_target() — linear regression ETA estimation
- grill::alarm_active aggregate flag updated in task_probes
API (Phase 4):
- GET /api/grill: adds alarm_active, mdns_hostname, alarm + eta_seconds per probe
- GET /api/probes/history: ringbuffer dump for all 8 probes
- POST /api/alarm/mute + OPTIONS CORS handler
- GET /api/info: uuid, name, fw, mdns_hostname, capabilities
mDNS (Phase 5):
- MDNS.begin("free-grilly-<uuid8>") after webserver.begin()
- Services: _http._tcp + _free-grilly._tcp with TXT records
LCD (Phase 2):
- Detail screen: ETA line ("in H:MM") + alarm blink (inverted label)
- Info screen: shows mDNS short hostname (xxxx.local)
- New helpers: get_alarm(), get_eta_seconds(), format_eta()
Web UI (Phase 6):
- Canvas sparkline graph per probe card (no external library)
- ETA badge per probe card
- Sticky alarm banner + Mute button → POST /api/alarm/mute
- Browser Notification API support
- base_url changed to "" (relative URLs)
Gzip assets (Phase 7):
- Website.cpp: conditional gzip serving with Content-Encoding: gzip header
Generator (Phase 8):
- tools/generate_web_assets.py: regenerates lib/Website/*.h from html_source/
(HTML as raw string literals, CSS/JS as gzip uint8_t PROGMEM arrays)
CI (Phase 9):
- .github/workflows/release.yml: tag-triggered build, size gate (<2031616 B),
OTA + full-flash bins as GitHub Release assets
Docs (Phase 10):
- docs/android_app.md: provisioning flow, NSD discovery, REST API reference,
Kotlin/Compose architecture proposal
- docs/openapi.yaml: v25.6.28, new endpoints + extended Grill schema
- README.md: mDNS section, CI release notes, generator usage, Android app link
- changelog.md: full entry for v25.06.28
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01N6JdMf6qA8D57FBTGeZkbr
The free-grilly Android app (v0.0.2) expects different JSON field names than the firmware web UI. This change emits both key sets additively so the existing web UI continues to work while the Android app parses all endpoints correctly.
Changes per endpoint:
- GET /api/grill: add uuid alias for unique_id; add id alias for probe_id per probe
- GET /api/info: add firmware alias for firmware_version
- GET /api/probes: add id and type aliases for probe_id/probe_type
- GET /api/probes/history: add id and name per probe (required by app)
- GET /api/wifiscan: add rssi and encryption aliases for signal_strength/auth_method
- GET /api/settings: add grill_name alias for name
- POST /api/settings: accept grill_name as alias for name
- POST /api/probes: accept id/type as aliases for probe_id/probe_type
- mDNS: register uuid/name/fw TXT records on _free-grilly._tcp service type so the app can find the device by UUID during reconnect
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
feat: Android app compatibility aliases (v26.04.19)
WiFi.setSleep(WIFI_PS_MIN_MODEM) caused Android to fail connecting after mDNS discovery: the ESP32 radio was asleep when incoming TCP SYN packets arrived, silently dropping them. The upstream firmware explicitly disabled sleep for this reason; revert to WiFi.setSleep(false). backlight_timeout_minutes default 3 -> 0: on a fresh flash (NVS erased), the new default of 3 caused the backlight to turn off 3 minutes after boot and appeared to the user as settings not being loaded. Restoring original behavior (0 = never timeout). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
fix: revert WiFi sleep mode and backlight timeout default
The pattern [0-9][0-9].[0-9][0-9].[0-9][0-9] only matched exact 8-char tags. A suffix like -fix1 caused the CI to skip the tag push entirely, so a hotfix release could not be built without deleting and recreating the base tag. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
ci: accept hotfix tags (YY.MM.DD-*) in release workflow
BattloXX commit 6f500e6 changed backlight_timeout_minutes default 0->3. Devices that OTA-updated to fix1 still have 3 in NVS because initialize_settings() is skipped on OTA (initialized=true). The new code default of 0 has no effect on those devices. Add a one-time migration marker (backl_migr_v1): on first boot after this update, if backl_to_mins == 3 the value is reset to 0. Once the marker is written the check is skipped on subsequent boots. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
fix: NVS migration for backlight timeout regression (OTA installs)
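The one-time migration described in this commit can be sketched on a host machine as follows. This is only an illustration under stated assumptions: `KvStore` is a hypothetical stand-in for the NVS namespace (the firmware's real storage class is not shown in this PR), and only the key names `backl_migr_v1` and `backl_to_mins` come from the commit message.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the NVS namespace; not the firmware's storage class.
struct KvStore {
    std::map<std::string, int> values;

    bool has(const std::string& key) const { return values.count(key) != 0; }

    int get(const std::string& key, int fallback) const {
        auto it = values.find(key);
        return it == values.end() ? fallback : it->second;
    }

    void put(const std::string& key, int value) { values[key] = value; }
};

// Runs the migration at most once; returns true if the stored timeout was changed.
bool migrate_backlight_timeout(KvStore& nvs) {
    if (nvs.has("backl_migr_v1")) {
        return false;  // marker present: migration already ran on an earlier boot
    }
    bool changed = false;
    if (nvs.get("backl_to_mins", 0) == 3) {
        // 3 was the regressed default. A deliberately chosen 3 looks identical,
        // so it is reset as well.
        nvs.put("backl_to_mins", 0);  // 0 = never time out
        changed = true;
    }
    nvs.put("backl_migr_v1", 1);  // write the marker so later boots skip the check
    return changed;
}
```

Because the marker is written even when nothing had to change, a user who later sets the timeout to 3 on purpose keeps that value across reboots.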
Three root-cause fixes for the reported bugs:
1. Web UI (Settings, Probes, About) was fetching all API data from a hardcoded developer IP (http://10.30.10.235) instead of the device. Change base_url to window.location.origin in all three html_source pages and regenerate HtmlSettings.h / HtmlProbes.h / HtmlAbout.h. This fixes: Settings empty, Probe empty, Firmware not displayed.
2. LEDC backlight/buzzer used the deprecated channel-based API (ledcSetup/ledcAttachPin/ledcWrite(channel)) which is removed in Arduino ESP32 3.x. Migrate Power.cpp and Buzzer.cpp to the new pin-based API (ledcAttach/ledcDetach/ledcWrite(pin)). This fixes: backlight immediately off at startup.
3. GrillConfig::load_settings() read temperature_unit from NVS with no fallback, returning "" on a clean flash. Add default "celcius".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
fix: web UI base_url, LEDC pin API, temperature_unit default
The new pin-based LEDC API (ledcAttach/ledcDetach) requires Arduino ESP32 3.x which is not used by the CI — revert to channel-based API (ledcSetup/ledcAttachPin/ledcWrite/ledcDetachPin). Root cause of backlight-off-immediately: task_screen calls display.init() → setScreenBrightness() → ledcWrite() before task_battery has run power.startup() → power.init() → ledcSetup/ ledcAttachPin. Without the LEDC channel being configured, ledcWrite() is a no-op and GPIO 4 stays in default (input, hi-Z) → backlight off. Fix: call power.init() explicitly at the start of task_screen so the LEDC channel is always configured before display.init() runs, regardless of FreeRTOS task scheduling order. Calling power.init() twice (also in task_battery) is safe — ledcSetup is idempotent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
Required for the backlight race-condition fix: task_screen must call power.init() before display.init() to ensure LEDC is configured. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
…ST (#6)
Adds a persisted `power_saving` setting (default true) toggling between "max battery" and "always reachable":
- WiFi modem-sleep (WIFI_PS_MIN_MODEM) is now actually applied (was setSleep(false) despite the 25.06.28 changelog claim)
- SoftAP is shut down (STA-only) once the home network is joined
- reduced WiFi TX power, default display timeout, slower probe polling
Always-on tuning: webserver loop 2ms->20ms, MQTT/Opengrill idle at 1s when no broker, battery poll 1s->5s, removed a stray delay(10) from the render path.
Fixes POST /api/settings clobbering every absent field (wiping local_ap_*, wifi_* IP config, mqtt_*, brightness, timeouts); it now merges only present keys.
Docs: correct setup AP SSID to FreeGrilly_<mac6>, document power_saving, fix changelog.
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
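The "merge only present keys" fix for POST /api/settings can be illustrated with a small host-side sketch. It assumes settings can be modelled as a flat string map; the firmware itself parses JSON, and the field names `wifi_ssid` and `brightness` used in the usage note are illustrative rather than taken from the code.

```cpp
#include <cassert>
#include <map>
#include <string>

using Fields = std::map<std::string, std::string>;

// Copies only the keys that are present in the patch. Keys absent from the
// patch keep their stored value; keys unknown to the store are ignored.
void merge_settings(Fields& stored, const Fields& patch) {
    for (const auto& kv : patch) {
        auto it = stored.find(kv.first);
        if (it != stored.end()) {
            it->second = kv.second;
        }
    }
}
```

With a stored set of `name`, `wifi_ssid` and `brightness`, a patch carrying only `name` changes that one field and leaves the other two untouched, which is the behaviour a partial update from the app relies on.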
…e) (#7)
Replaces the fixed 10-minute history with two time-uniform tiers per probe so the graph can cover multi-hour cooks (Pulled Pork etc.) with fixed memory:
- Fine tier: 180 samples @ 10 s = 30 min recent detail.
- Coarse tier: 180 samples at an adaptive interval that starts at 60 s and doubles each time the buffer fills (3 h -> 6 h -> 12 h -> 24 h -> 48 h ...).
Memory stays fixed (~720 B/probe, ~5.8 KB total) and the sampling rate drops over time -> battery-neutral. Probe poll/ADC cadence unchanged. All probes are sampled in lockstep, so they share one coarse interval.
API GET /api/probes/history extended (backward compatible): adds top-level coarse_interval_seconds and per-probe history_coarse. Endpoint buffer is now heap-allocated per request (and freed) instead of a permanent static buffer.
Docs (openapi.yaml, android_app.md) and changelog updated; the previously released battery section moved from Unreleased to 26.06.30.
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
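A minimal sketch of how such a coarse tier could keep memory fixed while the interval doubles. The class name `CoarseHistory` and the compaction strategy (averaging neighbouring pairs) are assumptions for illustration; the commit only states that the interval starts at 60 s and doubles each time the 180-sample buffer fills.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

class CoarseHistory {
public:
    static constexpr std::size_t kCapacity = 180;
    static constexpr std::uint32_t kBaseIntervalS = 60;

    // Call once per base interval (every 60 s) with the latest reading.
    void on_base_tick(std::int16_t sample) {
        ++ticks_since_store_;
        if (ticks_since_store_ * kBaseIntervalS < interval_s_) {
            return;  // the current coarse interval has not elapsed yet
        }
        ticks_since_store_ = 0;
        samples_[count_++] = sample;
        if (count_ == kCapacity) {
            compact();  // buffer full: halve the data, double the interval
        }
    }

    std::uint32_t interval_seconds() const { return interval_s_; }
    std::size_t size() const { return count_; }
    std::int16_t at(std::size_t i) const { return samples_[i]; }

private:
    // Assumed strategy: average each neighbouring pair in place. Slot i is
    // written only after slots 2i and 2i+1 were read, so no scratch is needed.
    void compact() {
        for (std::size_t i = 0; i < kCapacity / 2; ++i) {
            const int sum = samples_[2 * i] + samples_[2 * i + 1];
            samples_[i] = static_cast<std::int16_t>(sum / 2);
        }
        count_ = kCapacity / 2;
        interval_s_ *= 2;
    }

    std::array<std::int16_t, kCapacity> samples_{};
    std::size_t count_ = 0;
    std::uint32_t interval_s_ = kBaseIntervalS;
    std::uint32_t ticks_since_store_ = 0;
};
```

After 180 base ticks the buffer holds 90 averaged samples at a 120 s interval, so the same 360 bytes now span 3 h with room to grow to 6 h before the next compaction.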
…#8)
* Power: stop power-saving from blanking the display; advertise capability
Two fixes around the power_saving mode added in #6:
- Display no longer switches off on its own. power_saving used to silently override a "never" (0) backlight/screen timeout with a 3-min/5-min default, which users read as "the device turned itself off after a few minutes". The override is removed: display timeouts are governed purely by the user's explicit settings (0 = never, always). power_saving now affects only the Wi-Fi radio (modem sleep / reduced TX / SoftAP teardown), never the screen.
- GET /api/info now advertises the "power_saving" capability, so the Android app shows its power-saving toggle (it gates on this flag).
Also documents that POST /api/probes is a full replace (clients must read-modify-write to preserve a probe's type and thermistor calibration).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
* changelog: tag this release 26.06.30-2; file History under 26.06.30-1
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
---------
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* Fix connectivity regressions from aggressive power-saving
User reports: app can no longer connect, web interface keeps cutting out,
device feels unresponsive. All trace to the power-saving mode (default-on)
being too aggressive, plus a per-request 24 KB heap allocation.
- power_saving now defaults to OFF ("always reachable"): full TX power, radio
awake, SoftAP up. Battery mode is opt-in via the app. Out-of-the-box
connectivity is the priority.
- Power-saving no longer drops TX power to 11 dBm (the main cause of weak/flaky
links). It now keeps only standard modem-sleep, which does not break
reachability.
- Stop the SoftAP with WiFi.softAPdisconnect() instead of WiFi.mode(WIFI_STA).
Switching mode at runtime tore down the netif and killed the mDNS responder
the app discovers the device through; now STA + mDNS stay up.
- GET /api/probes/history uses a fixed static buffer instead of malloc'ing
~24 KB per request, which under WiFi heap churn intermittently failed (503)
and fragmented the heap, dropping web-server connections.
- Web server services sockets every 5 ms (was 20 ms).
pio run -e esp32dev: SUCCESS (RAM 29.7%, Flash 77.6%).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
* changelog: tag connectivity fixes as 26.06.30-3
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
---------
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…f + diagnostics (#10)
The device could appear to "turn itself off" (e.g. after ~1.5 h with the battery still >80%). There is no auto-off timer or battery cutoff in the firmware; the real cause is that ANY unexpected reset (brownout from a WiFi TX current spike, watchdog, or crash) hit the boot-time "hold the button to turn on" gate and, since the button wasn't held, went straight back into deep sleep.
- Reset-reason-aware boot gate: the hold-to-power-on gate now applies only to ESP_RST_POWERON / ESP_RST_DEEPSLEEP (deliberate power-ons). After any fault reset the device resumes running instead of sleeping, so a transient fault self-recovers.
- Protective low-battery cutoff (new): only shuts down to protect the cell when not charging and SoC <= 5% (plausible, non-zero) or cell voltage <= 3.2 V, after ~15 s of consecutive confirmations. Only active when the fuel gauge is confirmed present (fail open -> keep running).
- Power button debounced (GPIO35 is input-only, no internal pull) so noise can no longer masquerade as a 2-10 s press and trigger a shutdown.
- Diagnostics: status API now reports last_off_reason, last_reset_reason and battery_millivolts (persisted across shutdown) to diagnose self-power-offs.
Firmware version bumped to 26.07.01. pio run -e esp32dev: SUCCESS.
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
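The boot-gate decision described in this commit reduces to a small pure function. The sketch below is host-buildable and therefore uses a local `ResetReason` enum that mirrors only a few ESP-IDF reset reasons; on the device the input would come from `esp_reset_reason()`. The function and enum names are illustrative, not the firmware's.

```cpp
#include <cassert>

// Local mirror of a few ESP-IDF reset reasons so the sketch builds on a host.
enum class ResetReason { PowerOn, DeepSleep, Brownout, Panic, TaskWatchdog, Software };

enum class BootAction { Run, SleepAgain };

// Only deliberate power-ons have to pass the hold-to-power-on gate.
bool requires_button_hold(ResetReason reason) {
    return reason == ResetReason::PowerOn || reason == ResetReason::DeepSleep;
}

BootAction decide_boot(ResetReason reason, bool button_held) {
    if (!requires_button_hold(reason)) {
        return BootAction::Run;  // fault reset: resume instead of sleeping
    }
    return button_held ? BootAction::Run : BootAction::SleepAgain;
}
```

Keeping the decision free of hardware calls makes the regression easy to pin down: a brownout without the button held must yield `Run`, whereas the old behaviour corresponds to `SleepAgain`.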
Adds an "Energy" section (battery %, charging, measured cell voltage) and a "Diagnostics" section (last restart reason, last power-off reason) to the device web UI at /about, directly above the Authors. Reads the existing /api/grill fields (battery_millivolts, last_reset_reason, last_off_reason) introduced in 26.07.01 and maps the reason codes to friendly labels. No API or behavior change. Regenerated lib/Website/HtmlAbout.h from html_source/about.html. Version -> 26.07.01-1. pio run -e esp32dev: SUCCESS. Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn Co-authored-by: BattloXX <johannes@battlogg.org> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…device diagnostics (#12) Reflect the 26.07.01 / 26.07.01-1 changes in the feature list: the device no longer powers off after a transient fault, the protective low-battery cutoff, and the Energy/Diagnostics sections on the web About page. Docs only. Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn Co-authored-by: BattloXX <johannes@battlogg.org> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…ment (#13)
All JSON serializers shared one file-global JsonDocument. It was mutated concurrently by the webserver task (/api/*), the MQTT publisher and the Opengrill publisher (both every 1s), and save_settings()/save_probes() re-entered it via publish_*() while handling a web POST. Concurrent clear()/add()/serializeJson() on one ArduinoJson pool corrupts its heap and panics. Since 26.07.01 makes fault resets resume instead of sleeping, this now appears as a regular reboot rather than the device "turning itself off".
- Give each serializer its own local JsonDocument (reentrant, thread-safe; the ESP32 heap allocator is itself thread-safe). No shared JSON state.
- Probe::push_coarse: move the 360 B compaction scratch off the small probes-task stack (make it static) to avoid a stack overflow on long cooks.
- Enlarge the alarm/probes/battery task stacks (were 1000/1000/2000 B); 1000 B is too tight for the probes task's float/log() math.
pio run -e esp32dev: SUCCESS.
Claude-Session: https://claude.ai/code/session_01AZPve8gau6kiEWgHjiL5Gn
Co-authored-by: BattloXX <johannes@battlogg.org>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
current_screen_page is set by the power-button task at press time, but the screen task redraws independently every 1 s and re-reads the connected-probe list fresh each time via an unchecked vector index. If the probe count drops between the button press and the next redraw (loose contact, a transient sensor read miss), the detail page indexed past the end of the now-shorter list -- undefined behaviour, which crashed the device. Since 26.07.01 made fault resets resume instead of sleep, the crash now shows up as an immediate reboot back to the overview screen -- from the button, indistinguishable from "the detail page just doesn't open anymore". display_update() now re-checks the current page against the freshly read probe count before indexing into it, falling back to the overview screen instead of reading past the end of the list. Firmware version bumped to 26.07.02. pio run -e esp32dev: SUCCESS. Co-authored-by: BattloXX <johannes@battlogg.org>
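The re-check described above amounts to validating the page index against the current probe count before it is used. A minimal sketch, assuming page 0 is the overview and page n is the detail view of the n-th connected probe (this numbering and the name `clamp_page` are assumptions, not taken from the firmware):

```cpp
#include <cassert>
#include <cstddef>

constexpr int kOverviewPage = 0;  // assumed convention: 0 = overview, n = n-th probe

// Returns a page that is safe to render for the given probe count, falling
// back to the overview when the requested detail page no longer exists.
int clamp_page(int current_page, std::size_t connected_probe_count) {
    if (current_page <= kOverviewPage) {
        return kOverviewPage;
    }
    if (static_cast<std::size_t>(current_page) > connected_probe_count) {
        return kOverviewPage;  // the list shrank since the button press
    }
    return current_page;
}
```

Calling this at the start of every redraw, on the freshly read count, closes the window between the button press and the next 1 s refresh.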
This PR consolidates all the work done on the `BattloXX/free-grilly` fork since branching off. It supersedes my three smaller open PRs (#26, #29, #30), which are all included here. Feel free to close them in favour of this one, or I can rebase this to exclude whatever you merge first.
`master` is 23 commits ahead / 1 behind. Everything is backward compatible: new API fields are additive, and behavioural changes default to the safe/least-surprising option.
A per-release breakdown lives in `changelog.md`.
Why
The fork chases three goals: (1) make the device stay reachable and not turn itself off,
(2) make long cooks (pulled pork etc.) usable with real history + ETA, and (3) give a
native app a clean, discoverable API to talk to.
Battery & Power
- `power_saving` setting: a real toggle between "max battery" and "always reachable", persisted in NVS, exposed via `GET/POST /api/settings` and advertised in `GET /api/info` `capabilities` so an app can gate its UI on it. Defaults to OFF (always reachable) after we found the aggressive variant hurt connectivity.
- Wi-Fi modem-sleep (`WIFI_PS_MIN_MODEM`) actually applied in power-saving mode, the single largest battery saving, plus lower task wake-up rates (webserver/MQTT/Opengrill/battery/probe loops idle instead of spinning). Full TX power is kept so connections stay strong.
- Protective low-battery cutoff: only shuts down to protect the cell at SoC ≤5 % or measured cell voltage ≤3.2 V (needs ~15 s of consecutive confirmations, so an I²C glitch can't power it off).
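The consecutive-confirmation rule for the low-battery cutoff can be sketched as a small state machine. The thresholds follow the description in this PR (SoC ≤5 % and non-zero, or ≤3.2 V, not charging, fuel gauge present); the type names and the idea of passing the number of required confirmations as a parameter are assumptions, since the poll period is not stated here.

```cpp
#include <cassert>

struct BatteryReading {
    bool gauge_present;  // fuel gauge confirmed on the bus
    bool charging;
    int soc_percent;     // state of charge, 0..100
    int millivolts;      // measured cell voltage
};

class LowBatteryCutoff {
public:
    explicit LowBatteryCutoff(int required_confirmations)
        : required_(required_confirmations) {}

    // Feed one reading per poll; returns true once shutdown should be triggered.
    bool update(const BatteryReading& r) {
        // An SoC of exactly 0 is treated as an implausible reading.
        const bool soc_low = r.soc_percent > 0 && r.soc_percent <= 5;
        const bool volts_low = r.millivolts <= 3200;
        // Fail open: without a confirmed gauge the cutoff never arms.
        const bool critical = r.gauge_present && !r.charging && (soc_low || volts_low);
        streak_ = critical ? streak_ + 1 : 0;  // any healthy reading resets the streak
        return streak_ >= required_;
    }

private:
    int required_;
    int streak_ = 0;
};
```

With a 5 s poll, for example, `LowBatteryCutoff cutoff(3);` would correspond to roughly 15 s of uninterrupted critical readings; a single good reading in between starts the count over.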
Reliability / "stops turning itself off" fixes
- Reset-reason-aware boot gate: the hold-to-power-on gate now applies only to deliberate power-ons (cold boot / wake from our own deep sleep). After a fault reset (brownout/panic/watchdog) the device resumes instead of silently sleeping, which users experienced as an unexplained self-power-off.
- No more shared global `JsonDocument`: every serializer now uses its own local document, so the web server, MQTT and Opengrill publishers no longer corrupt a shared ArduinoJson pool concurrently. Task stacks enlarged; the coarse-history array moved off a small task stack.
- Detail-page crash fixed: `display_update()` re-checks the current page against the freshly-read probe count instead of indexing past a shrunk list (loose probe → out-of-bounds → panic).
- Power button debounced so noise can no longer masquerade as a long-press shutdown.
- `POST /api/settings` no longer wipes config: settings are now merged; a partial update (e.g. from the app) no longer resets Wi-Fi/AP/MQTT/brightness/timeout fields.
- Power-saving no longer overrides a "never" (0) timeout; display timeouts are governed purely by the user's settings.
Temperature history & ETA
- Two-tier history per probe: a fine tier (180 samples @ 10 s = 30 min of recent detail) plus a coarse tier covering the whole cook at an adaptive interval (starts at 60 s, doubles as it fills → 3 h → 6 h → 12 h → 24 h …). Memory stays fixed (~5.8 KB total) regardless of cook length, so it's battery-neutral.
- ETA to target (`Probe::seconds_to_target()`, linear regression over recent samples) shown on the LCD and in the API.
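To make the ETA idea concrete, here is a host-side sketch of a least-squares slope fit over equally spaced samples, extrapolated from the latest reading. It is not the firmware's `Probe::seconds_to_target()`; the free function `estimate_seconds_to_target`, its signature and the `-1` "no estimate" convention are assumptions for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Estimates the seconds until `target` is reached from equally spaced samples.
// Returns 0 if the target is already reached and -1 if no estimate is possible.
long estimate_seconds_to_target(const std::vector<double>& temps,
                                double interval_s, double target) {
    const std::size_t n = temps.size();
    if (n < 2 || interval_s <= 0) return -1;
    if (temps.back() >= target) return 0;

    double sum_x = 0, sum_y = 0, sum_xy = 0, sum_xx = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(i) * interval_s;  // sample time in seconds
        sum_x += x;
        sum_y += temps[i];
        sum_xy += x * temps[i];
        sum_xx += x * x;
    }
    const double denom = n * sum_xx - sum_x * sum_x;
    if (denom <= 0) return -1;

    // Ordinary least-squares slope in degrees per second.
    const double slope = (n * sum_xy - sum_x * sum_y) / denom;
    if (slope <= 0) return -1;  // flat or cooling: the target is not being approached

    return std::lround((target - temps.back()) / slope);
}
```

For samples 50, 51, 52, 53 taken 10 s apart the fitted slope is 0.1 degrees per second, so a target of 63 is estimated at 100 s away. Fitting a line over several samples rather than differencing the last two keeps single noisy readings from swinging the estimate.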
Discovery (mDNS)
- The device advertises `free-grilly-<uuid8>.local` via `_http._tcp` and `_free-grilly._tcp` with TXT records (uuid, name, fw), so the app finds devices with zero manual IP entry.
Web UI
- Energy and Diagnostics sections on `/about`: battery %, charging, measured cell voltage, last restart reason and last power-off reason, all on-device.
- `base_url` → relative URLs (works from any network).
On-device LCD
- Detail screen shows the ETA (`in H:MM`) and blinks the probe label when alarming; info screen shows the short mDNS hostname.
API additions (all additive / backward compatible)
- `GET /api/grill` extended: `alarm_active`, `mdns_hostname`, per-probe `alarm` + `eta_seconds`.
- `GET /api/probes/history`: fine + coarse history for all 8 probes (`coarse_interval_seconds`, per-probe `history_coarse`), served from a fixed buffer.
- `POST /api/alarm/mute`: mute an active alarm.
- `GET /api/info`: lightweight device identity + `capabilities` array.
- Diagnostics fields on `GET /api/grill`: `last_off_reason`, `last_reset_reason`, `battery_millivolts`.
- `docs/openapi.yaml` updated to match.
Build / CI / docs
- `tools/generate_web_assets.py` regenerates `lib/Website/*.h` from `html_source/` (gzip for CSS/JS; served with `Content-Encoding: gzip`).
- `.github/workflows/release.yml`: tag-triggered CI builds OTA + full-flash binaries, runs a <2 MB size gate, and attaches both as release assets (also accepts `YY.MM.DD-*` hotfix tags).
- `docs/android_app.md`: full API guide (provisioning, NSD discovery, REST reference).
Companion Android app (separate repo, for awareness)
A native Android client was built against this API and is not part of this PR (it lives in
its own repo), but it's the reason several of the API/mDNS/capabilities changes above exist and
may be worth linking from the README:
➜ https://github.com/BattloXX/Free-Grilly-Android (Kotlin · Jetpack Compose · Retrofit ·
Hilt · Room; Android 8.0+)
- mDNS/NSD discovery (`_free-grilly._tcp`), no IP entry.
- Power-saving toggle gated on the `capabilities` flag.
- Diagnostics (last restart/power-off reason), consuming the fields added here.
- Bilingual DE/EN; demo mode.
How to review / test
- Per-release breakdown: see `changelog.md`.
- Build with `build_firmware_release.sh`, or let the new `release.yml` produce OTA + full binaries on a tag.
- Exercise the API (`/api/info`, `/api/grill`, `/api/probes/history`, `POST /api/alarm/mute`) against a device; web UI verifiable at `/` and `/about`.