Press "Enter" to skip to content

Category: computers

Sunrise-like Alarm Clock via Home Assistant + Android

Bedside Sunrise Alarm Clock Setup

Quite a few years ago I came across Lighten Up!, which was a dawn-simulating alarm clock module that got connected between an incandescent lamp and used gently increasing light instead of noise. Coupled with a halogen bulb (that’d start out very yellow at lowest brightness) I had a wonderful sunrise-like alarm clock and it was much, much nicer than a beeping alarm.

The LCD displays in the Lighten Up! units began failing so I couldn’t change the programming, which was a hassle as the clocks in them drifted by a couple minutes per month. With a combination of COVID-19 remote work eliminating the need for an alarm clock and the devices dying, in the trash they went. (They also didn’t work right with LED bulbs, and now the person making them has closed down the business.)

I’ve been trying to use an alarm to stay on a more regular sleep schedule and while a bunch of other wake-up lights are available, they are dedicated units that are basically alarm clocks with built in lights. I really liked the elegance of the Lighten Up! and how it’d use an existing lamp, and outside of dedicated smart bulbs + an app I couldn’t find anything else like it. For a while I thought about developing my own hardware version that’d also work with LED bulbs, but never got around to it.

Lighten Up! (Image from Pintrest)

This winter I’ve been experimenting with Home Assistant (HA), and it turns out that with a couple cheap Zigbee parts (bulb and pushbutton from IKEA) it allows for a wonderful replacement/upgrade sunrise alarm idea. A next-generation Lighten Up!, if you will.

With everything put together the lamp next to my bed will now slowly come up to brightness 15 minutes before the wake-up alarm on my phone, reaching final as the normal alarm triggers. If I change the alarm time on my phone, or shut it off, the light-up alarm in HA will follow suit. Additionally, a physical button on the nightstand turns off the light off while replicating a sunrise alarm, or otherwise toggles the light on and off.

Even better, if I’m not home or if the alarm is set for other than between 3:00 AM and 9:00 AM (times during which I’d likely be in bed and wanting to wake up) the light won’t activate. This allows me to use alarms during the normal day for other things without activating with the light, or while traveling without waking Kristen.

Between this and the gently-increasing volume (and vibration) alarm built into the Android clock which triggers at the end of the sunrise cycle it’s a very nice, gradual wake-up system. And, all of this happens without any cloud services or ongoing subscriptions. My HA instance is local; the phone app communicates directly with it across either my home or the public networks. Communication between the physical controls and lights is a local, private network.

In this post I’ll document the major building blocks of how I did this so that someone else with basic Home Assistant experience (and a functioning HA setup, which is beyond the scope of this writeup) can do the same.

For reference, my Home Assistant hardware setup for this piece is:

With the Home Assistant Companion App for Android running on an Android phone, Home Assistant can get the date and time of the next alarm. After installing the app, go into SettingsCompanion appManage sensors and enable the Next alarm sensor. My phone is named Pixel 8, so the alarm is now available as entity sensor.pixel_8_next_alarm. Note that this is not available if an iPhone (or other iOS device) is used. (ref: Next Alarm Sensor)

Part of setting up HA configures a Zone (location) called Home. This, combined with the default location information collected by the companion app, allows HA to know if my phone is at Home (or elsewhere), via the the state of entity device_tracker.pixel_8 (eg: home).

Note: While I give YAML of the automations for configuration reference, most of these automations were built using the GUI and involve the (automatically generated) entity and device IDs. If you are setting this up you’ll want to use the GUI and build these out yourself using the code for reference.

To make this all work, three community components are used and must be installed:

Ashley’s Light Fader 2.0: This script takes a light and, over a configured amount of time, fades from the light’s current setting to the defined setting (both brightness and color temperature) using natural feeling curves (easing). It will also cancel the fade if some conditions are met. I use this to have the light fade, over 15 minutes, using a sine function, to 70% brightness and 4000K temperature, and cancel the fade if the light is turned off or brightness changes significantly, the latter of which allows the button next to the bed to cancel the alarm.

To make this happen I turn on the bulb at 1% brightness and 2202K (it’s warmest temperature), then use the script to fade to 70% and 4000K over the course of 15 minutes. This does a decent job of replicating a sunrise or the results of the Lighten Up! with a halogen bulb.

This is configured as an automation I call Bedroom Steve Nightstand: Lighten Up! (Sunrise). Note that it has no trigger because it’ll be called from the next automation:

alias: "Bedroom Steve Nightstand: Lighten Up! (Sunrise)"
description: ""
trigger: []
condition: []
  - condition: state
    entity_id: light.bedroom_test_bulb_light
    state: "off"
  - service: light.turn_on
    metadata: {}
      brightness_pct: 1
      color_temp: 500
      entity_id: light.bedroom_test_bulb_light
  - service: script.1705454664908
      lampBrightnessScale: zeroToTwoFiftyFive
      easingTypeInput: easeInOutSine
      endBrightnessEntityScale: zeroToOneHundred
      autoCancelThreshold: 10
      shouldStopIfTheLampIsTurnedOffDuringTheFade: true
      shouldResetTheStopEntityToOffAtStart: false
      shouldInvertTheValueOfTheStopEntity: false
      minimumStepDelayInMilliseconds: 100
      shouldTryToUseNativeLampTransitionsToo: false
      isDebugMode: false
      light: light.bedroom_test_bulb_light
        hours: 0
        minutes: 15
        seconds: 0
      endColorTemperatureKelvin: 4000
      endBrightnessPercent: 70
mode: single

Adjustable Wake-up to Android alarm v2: This blueprint for an Automation takes the time from the next alarm sensor (alarm_source) to trigger an action before the alarm happens. I use this to initiate Ashley’s Light Fader 2.0 at 15 minutes before my alarm, only when my phone is at Home, and and the alarm is between 3:00 AM and 9:00 AM.

Part of configuring this is setting up a Helper or basically a system-wide variable, called Pixel 8 Next Alarm (entity id: input_datetime.pixel_8_next_alarm, type: Date and/or time).

This is configured as an automation called Bedroom Steve Nightstand: Lighten Up at 15 Before Alarm, set to only run if my phone is at Home and it’s between 3:00 AM and 9:00 AM:

alias: "Bedroom Steve Nightstand: Lighten Up at 15 Before Alarm"
description: ""
  path: homeassistant/adjustable-wake-up-to-android-alarm.yaml
    offset: 900
    alarm_source: sensor.pixel_8_next_alarm
    alarm_helper: input_datetime.pixel_8_next_alarm
      - condition: device
        device_id: 1fb6fd197bd2b771249ae819f384cfe2
        domain: device_tracker
        entity_id: e695e05f01a328b349a42bfd7d533ef6
        type: is_home
      - condition: time
        after: "03:00:00"
        before: "09:00:00"
      - service: automation.trigger
        metadata: {}
          skip_condition: true
          entity_id: automation.lighten_up

I don’t want to get out a phone and dig into an app to manage the light, so next to the bed I have a TRÅDFRI Shortcut Button for controlling the light. If the button is pressed while the light is simulating sunrise, it turns off. If the light is off it turns it on, or visa versa.

Because turning the light off mid-dimming leaves it set at the current color and brightness, I use this instead of the normal Toggle action. In here I check the state of the bulb and either turn it off (if on), or turn it on to 100% brightness and 4000K if it is off:

alias: "Bedroom Steve Nightstand: Light Toggle"
description: >-
  Doesn't use the normal toggle because it needs to set the light color and
  brightness just in case it was left at something else when turned off
  - device_id: 12994a6c215ae1d4cfb86e261a2b2f3b
    domain: zha
    platform: device
    type: remote_button_short_press
    subtype: turn_on
condition: []
  - if:
      - condition: device
        type: is_on
        device_id: e3421c7d54269752a371fe8443daf95f
        entity_id: 78599118c4ab8043cf03ce6532546b94
        domain: light
      - service: light.turn_off
        metadata: {}
          transition: 0
          entity_id: light.bedroom_test_bulb_light
      - stop: ""
    alias: On to Off
  - if:
      - condition: device
        type: is_off
        device_id: e3421c7d54269752a371fe8443daf95f
        entity_id: 78599118c4ab8043cf03ce6532546b94
        domain: light
      - service: light.turn_on
        metadata: {}
          color_temp: 153
          transition: 0
          brightness_pct: 100
          entity_id: light.bedroom_test_bulb_light
      - stop: ""
    alias: "Off to On: Full Brightness and 4000K"
mode: single

Finally, I also have this all displaying, and controllable, via a card stack in a dashboard. For the next alarm info I started with the template in this post but modified it to simplify one section by using now(), fix a bug in it that occurs with newer versions of HA, and then build it into something that better illustrates the start and end of the simulated sunrise. Because normal entity cards can’t do templating (to dynamically show data) I used TheHolyRoger/lovelace-template-entity-row and some Jinja templating to make it look nice.

This gives me a row which shows the next alarm time (or “No alarm” if none set), nicely formatted, and has a toggle that can enable/disable the Bedroom Steve Nightstand: Lighten Up at 15 Before Alarm automation. Finally, I added a row of buttons to allow easy toggling between 1% / 454 mireds, 33% / 357 mireds, 66% / 294 mireds, and 100% / 250 mireds so I can manually set the light to some nice presets across dawn to full brightness.

Note: There is an older version of this template in HACS, thomasloven/lovelace-template-entity-row in the Home Assistant Community Store (HACS), but it has a bug which keeps the icon from changing color to reflect the state of the automation.

type: vertical-stack
  - type: entities
    title: Bedroom
      - type: custom:template-entity-row
        entity: automation.adjustable_wake_up_to_android_alarm
        name: Sunrise Alarm
        icon: mdi:weather-sunset-up
        active: '{{ states("automation.adjustable_wake_up_to_android_alarm"), "on") }}'
        toggle: true
        tap_action: none
        hold_action: none
        double_tap_action: none
        secondary: >-
          {% set fullformat = '%Y-%m-%d %H:%M' %}
          {% set longformat = '%a %b %-m %-I:%M %p' %}
          {% set timeformat = '%-I:%M %p' %}
          {% if states('sensor.pixel_8_next_alarm') != 'unavailable' %}
            {% set sunrise_start = state_attr('input_datetime.pixel_8_next_alarm', 'timestamp') | int %}
            {% set sunrise_end = (state_attr('sensor.pixel_8_next_alarm', 'Time in Milliseconds') /1000) | int %}
            {% if sunrise_start | timestamp_custom('%Y-%m-%d', true) == (now().timestamp() | timestamp_custom('%Y-%m-%d', true)) %}
              {% set sunrise_start_preamble = 'Today' %}
            {% elif (1+ (sunrise_start - now().timestamp() | int) / 86400) | int == 1 %}
              {% set sunrise_start_preamble = 'Tomorrow' %}
            {% elif (1+ (sunrise_start - now().timestamp() | int) / 86400) | int <= 7 %}
              {% set sunrise_start_preamble = sunrise_start | timestamp_custom('%A',true) %}
            {% else %}
              {% set sunrise_start_preamble = sunrise_start | timestamp_custom('%a %b %-m', true) %}
            {% endif %}
            {% if sunrise_end | timestamp_custom('%Y-%m-%d', true) == (now().timestamp() | timestamp_custom('%Y-%m-%d', true)) %}
              {% set sunrise_end_preamble = 'Today' %}
            {% elif (1+ (sunrise_end - now().timestamp() | int) / 86400) | int == 1 %}
              {% set sunrise_end_preamble = 'Tomorrow' %}
            {% elif (1+ (sunrise_end - now().timestamp() | int) / 86400) | int <= 7 %}
              {% set sunrise_end_preamble = sunrise_end | timestamp_custom('%A',true) %}
            {% else %}
              {% set sunrise_end_preamble = sunrise_end | timestamp_custom('%a %b %-m', true) %}
            {% endif %}
            {% if (sunrise_start_preamble == sunrise_end_preamble) %}
              {% if sunrise_start_preamble == 'None' %}
                {{ sunrise_start | timestamp_custom(longformat, true) }} - {{ sunrise_end | timestamp_custom(timeformat, true) }}
              {% else %}
                {{ sunrise_start_preamble }} {{ sunrise_start | timestamp_custom(timeformat, true) }} - {{ sunrise_end | timestamp_custom(timeformat, true) }}
              {% endif %}
            {% else %}
              {% if sunrise_start_preamble == 'None' %}
                {{ sunrise_start | timestamp_custom(longformat, true) }} - {{ sunrise_end | timestamp_custom(longformat, true) }}
              {% else %}
                {{ sunrise_start_preamble }} {{ sunrise_start | timestamp_custom(timeformat, true) }} - {{ sunrise_end_preamble }} {{ sunrise_end | timestamp_custom(timeformat, true) }}
              {% endif %}
            {% endif %}
          {% else %}
            No alarm set on {{ state_attr('device_tracker.pixel_8', 'friendly_name') }}
          {% endif %}
      - type: divider
      - entity: light.bedroom_test_bulb_light
        name: Steve's Nightstand
        icon: mdi:bed
          brightness: 255
          color_temp_kelvin: 4000
    show_header_toggle: false
    state_color: true
  - type: grid
    square: false
      - show_name: false
        show_icon: true
        show_state: false
        type: button
          action: call-service
          service: light.turn_on
            brightness_pct: 1
            color_temp: 454
            entity_id: light.bedroom_test_bulb_light
        name: 1%
        icon: mdi:moon-waning-crescent
          action: none
      - show_name: false
        show_icon: true
        show_state: false
        type: button
          action: call-service
          service: light.turn_on
            brightness_pct: 33
            color_temp: 357
            entity_id: light.bedroom_test_bulb_light
        name: 33%
        icon: mdi:moon-last-quarter
          action: none
      - show_name: false
        show_icon: true
        show_state: false
        type: button
          action: call-service
          service: light.turn_on
            brightness_pct: 66
            color_temp: 294
            entity_id: light.bedroom_test_bulb_light
        name: 66%
        icon: mdi:moon-waning-gibbous
          action: none
      - show_name: false
        show_icon: true
        show_state: false
        type: button
          action: call-service
          service: light.turn_on
            brightness_pct: 100
            color_temp: 250
            entity_id: light.bedroom_test_bulb_light
        name: 100%
        icon: mdi:moon-full
          action: none
    columns: 4

The result of all of this is that, if my phone is at home and I have an alarm set between 3:00 AM and 9:00 AM, the light next to the bed will simulate a 15-minute sunrise before the alarm goes off. If the light is simulating a sunrise, pressing the button will turn it off. Otherwise, the button toggles the light on and off at full brightness, for normal lamp-type use. Finally, via the Home Assistant UI I can easily check the status of, or turn off, the sunset alarm if I don’t want to use it.

So far, this is working great. There’s two things I’m looking into changing:

First, the bulb I’m using, 405.187.36, is an 1100 lumen maximum brightness. This is a bit too bright for the final stage of the alarm, and it’s minimum brightness is a bit higher than I’d like and seems a little abrupt. (Ideally the initial turn-on won’t be noticable.)

Since IKEA bulbs are cheap and generally work well, I’ll likely try a few other lower brightness ones and see how they work out. Both 605.187.35 (globe) and 905.187.34 (chandelier) are color temperature adjustable, 450 lumen maximum, cost $8.99, and look like good candidates as I expect their minimum brightness to be lower.

There is also 104.392.55 ($12.99), but it is fixed at 2200K and has a maximum brightness of 250 lumens. I suspect this will be nicely dim for the start, but wouldn’t allow a color transition and might not have enough final brightness to make me feel ready for the day.

I may also try something like 204.391.94 ($17.99), which is adjustable color, as this could allow me to use something like the sunrise color pallete, but this would require moving to a different script for fading. The current script doesn’t support fading between colors (see here for discussion around this), so this would take a lot of work on my part. Probably more than would be beneficial, since varying color temp on white-range bulbs is pretty darn good already.

Second, the TRÅDFRI Shortcut Button (203.563.82) that I’m using has been discontinued. It’s a nice, simple button, and I can trigger on it using short or long press. It’s replacement, SOMRIG Shortcut Button (305.603.54), isn’t in stock at my local IKEA so I don’t have one, but I expect it to be two buttons that can each have short or long presses, and perhaps even double-click on each. If so, I may add something more like dimming the nightstand light to use as a reading light, or perhaps something to leave on for the dogs when we’re gone.

Thinking a bit bigger picture I could even do things like use an in-wall dimmer to have the adjacent closet lights serve as wake-up lights. But as all the quality ones of these are Z-Wave I’d have to get another radio for the Pi and… and…

The possibilities for this stuff are nearly endless, which is neat, because it becomes an engineering problem of what to do that provides sufficient benefit without complexity for complexity’s sake. This, at least, a Home Assistant-based replacement for the old, beloved Lighten Up!, is great.

Note: This post has been updated a few times since original posting to fix grammar, a bug in the Jinga2 template for displaying the next alarm, and to add buttons for setting lamp brightness.

Comments closed


Wahoo smart trainers support network connectivity (instead of just the traditional Bluetooth or ANT+). Since I don’t have one I’d never bothered looking into how it works, but this morning while troubleshooting something with TrainerRoad running in the background I happened to see an mDNS query for _wahoo-fitness-tnp._tcp.local and realized this is how the smart trainers get discovered on the network.


Maybe one day I’ll have a smart trainer that can use the network and I can dig further into how this all works.

Comments closed

NGINX on OPNsense for Home Assistant

I’ve been experimenting with Home Assistant (HA) for some temperature monitoring around the house. It has a great mobile client that’ll work across the public internet, but HA itself unfortunately it only does HTTP by default. It has some minor built in support for HTTPS by using the NGINX proxy and Let’s Encrypt (LE) Add-ons, but for a couple of reasons[1] I didn’t like this solution. I’m not about to expose something with credentials across the public internet via plain HTTP, so I wanted to do this proxying on my firewall instead of on the device itself.

My firewall at home runs OPNsense which has an NGINX Plugin, along with a full featured ACME client that I’m already using for other certificates, so it was perfect for doing this forwarding. After a bit of frustration, fooling around, and unexpected errors I got things working, so I wanted to share a simple summary of what it took to make it work. I’m leaving the DNS, certificate, and firewall sides of this out, as they’ll vary and are well documented elsewhere.

Here’s the steps I used:

  • Set up DNS so the hostname you wish to use is accessible internally and externally. In this example will resolve to on the public internet, and at home, which are the WAN and LAN interfaces on the OPNsense box.
  • Set up the ACME plugin to get a certificate for the hostname you will be using for, in this case
  • On your Home Assistant instance, add the following to the configuration.yaml. This tells HA to accept proxied connections from the gateway. If you don’t do this, or specify the wrong trusted_proxy, you will receive a 400: Bad Request error when trying to access the site via the proxy:
  use_x_forwarded_for: true
  • In OPNsense, install NGINX.
  • In ConfigurationUpstreamUpstream Server define your HA instance as a server:
    • Description: HA Server
    • Server: (your Home Assistant device)
    • Port: 8123 (the port you have Home Assistant running on, 8123 is the default)
    • Server Priority: 1
  • In ConfigurationUpstreamUpstream define a grouping of upstream servers, in this case the one you defined in the previous step:
    • Description: Home Assistant
    • Server Entries: HA Server
  • In ConfigurationHTTP(S)Location define what will get redirected to the Upstream:
    • Toggle Advanced Mode
    • Description: Home Assistant
    • URL Pattern: /
    • Upstream Servers: Home Assistant
    • Advanced Proxy OptionsWebSocket Support: ✓
  • In ConfigurationHTTP(S)HTTP Server define the actual server to listen for HTTP connections:
    • HTTP Listen Address: Clear this out unless you want to proxy HTTP for some reason.
    • HTTPS Listen Address: 8123 and [::]:8123. Leave out the latter if you don’t wish to respond on IPv6.
    • Default Server: ✓
    • Server Name:
    • Locations: Home Assistant
    • TLS Certificate: Pick the certificate that you created early on with the ACME plugin.
    • HTTPS Only: ✓ (Unless for some reason you wish to support cleartext HTTP.)
  • Then under General Settings check Enable nginx and click Apply.
  • Finally, if needed, be sure to create the firewall rule(s) needed to allow traffic to connect to the TCP port you designated in the HTTP Server portion of the NGINX configuration.

[1] Reasons for doing the proxying on the firewall include:

  • The Let’s Encrypt Add-on won’t restart NGINX automatically on cert renewal as OPNsense can. This means I’d have to either write something to do it, or manually restart the add-on to avoid periodic certificate errors.
  • If NGINX is running on the same device as Home Assistant, then it needs to be on a different port. I prefer using the default port.
  • I’d prefer to run just one copy of NGINX on my network for reverse proxying.
  • While experimenting with NGINX and LE on HA I kept running into weird problems where something would start logging errors or just not work until I restarted the box. With everything running as containers, troubleshooting intermittent issues like these is painful enough that I preferred to avoid it.
Comments closed

Command Line 802.11 Monitor Mode on macOS Sonoma (14.0)

Because it supports monitor mode, a Macbook with the built-in WiFi adapter is one of the simplest ways to grab packets off the air. It’s not the most robust, but often all I need to do is grab data from a couple devices I’m near on a known channel, so fancy antennas and channel hopping and whatnot is overkill; I just need to grab packets. Using the Sniffer built into the Wireless Diagnostics captures in Monitor Mode has been fairly easy for a while, but I was stuck using the GUI.

For a while macOS has had a command line utility called airport to handle all sorts of wireless network manipulation, log gathering, and debugging. It also has a poorly documented command verb sniff, but until the release of macOS Sonoma (14.0) it was only possible to specifying the channel. Not being able to specify the width made it useless for most capturing I’d do in the real world.

Thankfully the airport command now works for channel and width, so now it’s possible to use remotely, in scripts, etc. It’s not well documented, but it works. For example, the following will capture on en0 on 5GHz channel 137 with 80MHz width:

airport en0 sniff 5g137/80

This will capture en1 on 2.4GHz channel 7 at 20MHz width:

airport en0 sniff 2g7/20

Output files end up randomly named in /tmp in pcap format with a name of /tmp/airportSniff??????.cap. They can be opened in Wireshark or your analysis tool of choice.

(I suspect that sniffing from 6GHz WiFi will follow the same pattern, but I don’t have access to a device with such a radio so I’m unable to test. It’d also be pretty nifty to see this somehow built in / better automated via Wireshark… That could be a neat project for later.)

The airport binary can be found at /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport. I link this to ~/bin, with something like the following:

ln -s /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport ~/bin/airport

I keep ~/bin around for personal executable stuff, and it’s been added to my path by putting a line like this in ~/.zshrc:

export PATH=".:$PATH:$HOME/bin"

The airport binary itself has a pretty decent output from --help. It’s light on sniffing examples, but pretty good for other stuff.

Amusingly, this is pretty much the extent of the airport(8) man page; a TODO:

airport manages 802.11 interfaces. airport more information needed here.

Comments closed

Using OpenSSL to Match Certificates and Private Keys

I recently was troubleshooting a problem with network authentication and suspected that the issue was around certificates and private keys not matching on a client. I had a .PEM file for the certificate and a .KEY for the private key, and I wanted to see if they matched.

Thankfully OpenSSL, the Swiss Army Knife of wrangling certs, made it easy. While this isn’t anything particularly secret, it took me a few to figure it out, so I’m re-documenting it here.

To see if the private key matches the certificate, use the following two commands and compare the Modulus section:

openssl x509 -in file.pem -noout -text

openssl rsa -in file.key -noout -text

If they match, the private key matches the certificate. If they don’t, they don’t.

In my case they didn’t match, which was causing the authentication problems. So we then solved what was happening during cert issuance and everything was then good.

Comments closed

Garmin Edge 530 WiFi Connection Weirdness

Today I tried to connect my Garmin Edge 530 running the latest firmware (v9.73) to my home wireless network, and couldn’t get it working. my friend Nick dug up a solution, so I wanted to share it here.

The problem I had is that when trying to find the network to join, either in the Garmin Connect mobile app or right on the device, my WPA2-secured wireless network would be shown as Unsecured and I couldn’t join it. No matter what I tried, on device or in app, switching around network types, names, security, or bands, the Edge 530 always saw it as Unsecured. The one thing I didn’t try was making a non-secured network, but that’s not an option for me.

Turns out you can work around this by using the Garmin Express desktop app then going into the 530, Tools & Content, Utilities, then under Wi-Fi Networks manually adding the network with appropriate security, password, etc.

After saving settings and ejecting the device it joined my wireless network, as confirmed on the device, in the DHCP leases, and on the APs themselves. Now it’ll automatically sync rides whenever I get back home.

Something else odd is the Edge 530 truncates 32-character SSIDs. My network at home is Smart Meter Surveillance Network, which is 32 ASCII characters long. This is the maximum allowed by 802.11, which is 32 octets, or 32 sets of 8 bytes, or 32 ASCII characters. For some reason in much of the Garmin UI it’d drop the last character, truncating to Smart Meter Survellance Networ. Thinking this was the problem I first dug into network name as a problem but eventually found a shorter SSID didn’t help. Also, this isn’t the first device I’ve had with SSID length problems (see Bypassing Reolink SSID Length Limitation); thankfully in this case it only seemed to be a display issue.

Comments closed

How ASUS and a Microsoft Bug Almost Broke Remote Work

A couple of years after it happened I’m sharing this story about the intersection of an OS bug, a network hardware quirk, and a global pandemic. A chain of semi-esoteric things aligned and only caused noticeable problems in a very specific — dare I snarkily say unprecidented — situation.

I found this issue both fascinating and maddening and I hope you will as well. This does not contain code-level details of the bug (I don’t have them), but I’m sharing it both to document this problem and share a little story of what goes on behind the scenes in supporting a big enterprise IT environment.

In March 2020, when Stay Home, Stay Safe began in Michigan, most of my fellow employees began working from home (WFH). For years we’d been building IT systems to allow most people to work from anywhere, and the shift to WFH was going great. Between VPN connections back to the corporate network, lots of things in the cloud, and Azure Active Directory (AAD) to handle Single Sign-On (SSO) for almost every company application, all most folks needed for remote work was their standard laptop and an internet connection. The computing experience of working from home was effectively the same experience as working in the office.

Sure, we had some bumps with home internet connections not being robust enough, but we helped people through those. Mostly we’d find that what someone thought was a good home internet connection — because their phone or video streaming worked fine — wasn’t great for things like moving big files around or video calls. [1] Generally those having network performance issues had fine service from their ISP, but their home routers were old and not up to task. We would recommend upgrading the router, they’d go buy something new, and all would be good.

In early autumn we began receiving reports of users getting the notorious You Can’t Get There From Here (YCGTFH) message when trying to access anything that used AAD for SSO. This generic message is displayed when AAD authentication fails and access is denied. Because so many things were behind AAD SSO this interrupted a lot of work. These computers either were no longer joined to AAD or had an expired token, and it seemed tied to internet access.

Digging into it, there would be errors shown by dsregcmd /status (the AAD CLI utility) and Test-DeviceRegConnectivity.ps1. This script checks internet access as SYSTEM to the AAD public endpoints would fail on all three connection tests, implying the lack of connectivity to Microsoft endpoints was keeping AAD registration (and token refresh) from working properly. But the user still could browse the web and hit those URLs and we hadn’t changed anything internet access-wise since the WFH began.

We found that manually setting a proxy server for the whole of the system (via netsh winhttp set proxy) would allow AAD registration (dsregcmd /join) to succeed and SSO would then work. We also found that restarting the WinHTTP Web Proxy Auto-Discovery Service (WinHttpAutoProxySvc), which handles WPAD, would sometimes fix the problem, but only sometimes.

Even more confusingly sometimes a reboot would fix it. Or, sometimes if a user drove into the office and used the network there, it would work. But not always. [2]

Simply, we had some computers whose SYSTEM account couldn’t access the internet for so long that an AAD token had expired, this broke SSO and users were being told You Can’t Get There From Here.

Typical for a lot of large organizations we have authenticating proxy servers sitting between the client network and the public internet. All requests bound for the internet need to go through them, and these proxies are located by a Proxy Auto-Config (PAC) file that is found either by a direct setting (AutoConfigUrl) or Web Proxy Auto-Discovery (WPAD), via DNS, both of which send the same file. We directly set the PAC file URL on a per-user basis and leave WPAD at its default of enabled. Thus for the end user and things running under their account, WPAD is used, falling back to the PAC file setting if that fails. For the SYSTEM account a direct PAC file setting is not used, relying solely on WPAD to find the path to the internet. [3]

Looking at a network capture when AAD registration would fail instead of the normal chain of events requests we’d see no DNS requests for WPAD, and no PAC file download. Instead we saw the client attempting to resolve the AAD endpoints via DNS, and then would attempt to reach out directly to them, which would be blocked by the company firewall. The proxy was not being used; WPAD wasn’t working. This was weird because every piece worked when tested independently (DNS resolution for WPAD hostnames, invoke-webrequest http://x.x.x.x/wpad.dat, specifying the WPAD PAC file in AutoConfigUrl), but as a whole it just didn’t work.

This went on for quite a while, supported by a Premier case with Microsoft. We could see that WPAD was frequently failing, but struggled with getting a consistent reproduction and going down dead-ends. We bandaged the problem with manual, direct proxy settings and AAD registration. This was mostly fine short-term, but caused overhead for our support folks and was a ticking bomb.

Then one day, thanks to a fortuitous conversation with a very smart lead Microsoft engineer while working another issue I found out about a bug with Microsoft’s WPAD implementation that was just discovered and was being patched in the next round of patches. The description exactly explained our problem and I was elated.

It turned out that if Windows 10 received a blank DHCP option 252, the WinHttpAutoProxySvc service would not query DNS for WPAD, and — the broken part — it would never do so again until the service was restarted. Directly configured PAC files would be used, but WPAD was broken. Here in our environment the SYSTEM account would not have internet access and this meant AAD registration, Test-DeviceRegConnectivity.ps1, the Microsoft Store, and all such internet-needing SYSTEM-level things didn’t work.

Apparently some home router vendors — most notably the hugely-popular ASUS [4] — would send option 252 but leave it blank because they found doing so reduced name resolution requests from clients. This is seen in Wireshark as:
Option: (252) Private/Proxy autodiscovery
    Length: 1
    Private/Proxy autodiscovery: \n

Windows, which has WPAD enabled by default, will try a number of name resolution queries (DNS for wpad/, NetBIOS, LLMNR, etc) to locate a PAC file server if it does not see a DHCP option 252. Because WinHttpAutoProxySvc looks to DHCP then only tries DNS if that fails, by setting this option but leaving it blank the name resolution steps would not occur. I can only guess as to why these vendors find it desirable, but perhaps they like reducing the load on the built-in DNS forwarders, or they saw it as a security benefit or… who knows. Either way, the result of this blank option and the Windows bug was that WPAD — via DHCP and DNS — didn’t work.

So what is option 252? In WPAD there are multiple discovery mechanisms for finding the PAC file server. Beyond DNS there is also a Dynamic Host Configuration Protocol (DHCP) method where, along with the typical network address settings, the client receives the URL for downloading the PAC file. This is done via option 252, but isn’t widely supported, it’s normally not used, and we don’t use it it either.

While WPAD is core OS function, it unfortunately never left draft RFC status. Implementations have no formal standard to target; it’s just a guideline. Additionally, because it never left draft status, DHCP option 252 also remains unallocated and without a standard, simply part of the Reserved (Private Use) range. So, setting it but leaving it blank is not unacceptable, OS’ should be able to accommodate.

In a network capture I was then able to clearly see this happen, and it was simple to replicate on a test network. And then it finally all came together…

COVID-19 WFH resulted in a bunch of people upgrading their home networks, with lots of them buying new home routers, including the very-popular ASUS brand. If someone booted up their Windows 10 computer and the blank option 252 sending DHCP server was the first thing it saw, WPAD would break, with all the downstream consequences, including our AAD connectivity issues. And if they’d have AAD connectivity issues for long enough — until a token expired — they would start getting You Can’t Get There From Here messages.

If they went in the office or somewhere else which doesn’t set 252 and fully rebooted (which restarts the WinHttpAutoProxySvc service), WPAD would work and AAD registration would work. But if the service never restarted — if they never actually rebooted — WPAD was stuck not doing anything. [2]

Testing a pre-release version of the patch showed it fixed the problem, and then a few weeks later KB4601382 was released, with the detail “Improves the ability of the WinHTTP Web Proxy Auto-Discovery Service to ignore invalid Web Proxy Auto-Discovery Protocol (WPAD) URLs that the Dynamic Host Configuration Protocol (DHCP) server returns.”. We deployed this patch and the reports of AAD registration / You Can’t Get There From Here issues collapsed.

That was it, it was fixed. A popular home router vendor did something weird (but not against standard), the OS implemented something poorly, and people were working for so long in one of those environments that a credential expired, couldn’t be renewed, and they lost access to SSO.

[1] Mobile apps and a lot of modern websites are fairly asynchronous, sending requests in the background while still working nicely, because they are built to be tolerant of the blips that happen while on wireless networks. Video streaming specifically caches (or buffers) the video locally so that hiccups in the network connection don’t make the video pause and stutter. More real-time-ish things like Remote Desktop or video calls or copying files via SMB are considerably more sensitive to poor network connections.

[2] Retrospectively I suspect confusion around what it means to shut down or restart a computer led to many of the reports of reboots/shutdowns/driving into the office fixing the problem or not and the difficulties in getting a reliable reproduction. Different sleep modes, some of which result in the BIOS displaying the POST when waking from sleep even if the OS doesn’t restart, leads some to believe the operating system was restarted when it may not have been. Or other folks believe that closing the lid is “shutting down”.

It was also unfortunately common for users to have a flexible version of “home”. Sometimes home meant where they’d been working for the last six months and had the problem, sometimes it meant a vacation home with a different ISP they’d gone to the day before but failed to mention, sometimes it meant the other side of the world. Teasing this information out was difficult, as many used “home” to mean anywhere but their normally assigned desk. “I’ve only been at home” frequently meant “I continue to be not at the office”.

[3] We use WPAD because Windows 10 requires internet access for a lot of OS level things that run as SYSTEM, including the Microsoft Store and AAD device registration. If proxy servers are used, the SYSTEM account needs to find the proxy servers. Early on in our Windows 10 deployment Microsoft told us either direct internet access or WPAD was required. WPAD via DHCP doesn’t work with most VPN clients, because they configure their virtual adapters directly and not via DHCP. Thus, to have similar connectivity when in the office or remote, DNS for WPAD is the the choice.

[4] I’ve been told multiple brands do this, but ASUS only vendor where I personally observed it.

Comments closed

BorgBackup Repository on DSM 7.0

A few years back I began using Borg for backing up, sending it home to my Synology DSM 1019+. At the time this was running the 6.x family of DSM and worked great, but it broke after moving to v7.0. Attempts to run Borg would result in this error:

/var/services/homes/borguser/borg: error while loading shared libraries: failed to map segment from shared object

This appears to be happening because with the upgrade to v7.0 /tmp is mounted noexec.

adminuser@diskstation:/var/services/homes/borguser$ mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec)

While a few online solutions (such as this one) propose remounting /tmp with exec, this is a poor solution as it changes the security model for DSM v7.0 and may break in the future during an upgrade. The best solution for this is to create a private temp directory for just borguser and define it as $TMPDIR.

To do this create ~borguser/tmp, ensure it’s owned by your Borg user, and set it to 700:

mkdir ~borguser/tmp
chown borguser:users ~borguser/tmp
chmod 700 ~borguser/tmp

Then create a wrapper script for Borg setting this variable. The result will be Borg using ~borguser/tmp for it’s private temporary directory, leaving /tmp alone, working nicely with the DSM v7.0 security design. I keep mine in ~borguser/.ssh and call it And, be sure it’s executable. Mine is like this:

adminuser@diskstation:/var/services/homes/borguser$ sudo cat .ssh/
export TMPDIR=$HOME/tmp
/var/services/homes/borguser/borg serve --storage-quota 120G --restrict-to-repository /volume2/Backups/borg
adminuser@diskstation:/var/services/homes/borguser$ sudo ls -als .ssh/
4 -rwx------ 1 borguser root 161 Nov 15 06:56 .ssh/

Finally, change ~borguser/.ssh/authorized_keys limiting the backup user to executing the new script.

command="/var/services/homes/backupuser/.ssh/",restrict,from="" ssh-rsa AAAA[...restofkeygoeshere...]

Comments closed

Upgrading Pi-Hole on Docker on Synology DSM

At home I use Pi-Hole running via Docker on a Synology DS1019+ (details here). Every once in a while I update this, but it seems to be long enough between updates that I forget the exact steps, so I’m documenting them here.

This is the simplest update process I know of. It keeps the configuration the same but updates the container itself. Because the volumes for persistent storage are not changed and the custom environment variables stay the same, all settings are preserved.

  1. Log into DSM.
  2. Launch Docker.
  3. In Docker, update the image:
    • Registry
    • Search for pihole
    • Select pihole/pihole (ensure it is the official image)
    • Click Download
  4. Stop and reset the container:
    • Container
    • Select your Pi-hole container.
    • Stop
    • Reset
    • Start
  5. Wait a few minutes, then log back in to Pi-Hole (eg: http://pi.hole:8081/admin/)
  6. At the bottom of the page observe updated versions. Check here for the latest version of the Docker image, for comparison.

Note: This process presumes you have configured volumes mounted to /etc/dnsmasq.d and /etc/pihole, and have set environment variables INTERFACE, WEB_PORT, WEBPASSWORD, and TZ. Without these your results may vary.

Comments closed

Network Capture with Process Name and PID on macOS

There are many times when one wants to see which process is responsible for network traffic on a endpoint. While there are ways to look at which process has an open socket or whatnot, this doesn’t help with UDP, and it’s often quite useful to simply do a recording then later see which process is responsible for creating a packet.

On Windows this is very easy using Pktmon / netsh trace / Network Monitor to do a capture, but on macOS it’s not that straightforward.

Using tcpdump with the -k argument one can print process name and PID to the console. In this example I’m filtering on just the host (Google Public DNS) then running dig in another window to look up

c0nsumer@myopia ~ % sudo tcpdump -k -i en0 host
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:17:58.210959 pid dig.76987 svc BE IP > 10294+ [1au] A? (47)
09:17:58.292541 IP > 10294 1/0/1 A (63)

The bold line is the packet going out to, and the pid dig.76987 portion shows that it’s from a dig process, which had process ID 76987.

Unfortunately, due to the non-standard way Apple writes this metadata into files, it’s not viewable in Wireshark. This would be very nice, as it’d make parsing large captures much easier.

For now it seems the best way to do this is to record the capture to a file, then feed it back into tcpdump. Recording this requires using the pktap pseudo interface (see the tcpdump man page about the --interface argument) to ensure this data is saved into the file. The same capture above, writing to a file called out.cap, would be as follows:

sudo tcpdump -i pktap,en0 host -w out.cap

This can then be fed back into tcpdump for parsing/filtering/viewing:

c0nsumer@myopia ~ % tcpdump -k -r out.cap
reading from PCAP-NG file out.cap
09:27:10.964758 (en0, proc dig:77619, svc BE, out, so) IP > 56446+ [1au] A? (47)
09:27:11.018809 (en0, proc dig:77619, svc BE, in, so) IP > 56446 1/0/1 A (63)
c0nsumer@myopia ~ %

In this case it shows both the sending process and the process which received the packet.

When using this for more detailed analysis, I’ll use the macOS tcpdump to grab a very broad capture and then do first-pass filtering before bringing it into Wireshark for more detailed analysis. See the PACKET METADATA FILTER section of the tcpdump man page for details on how to filter on a PID, process name, etc. From the header of this section:

Use packet metadata filter expression to match packets against descriptive information about the packet: interface, process, service type or direction.

Comments closed