Press "Enter" to skip to content

Category: computers

Garmin Edge 530 WiFi Connection Weirdness

Today I tried to connect my Garmin Edge 530 running the latest firmware (v9.73) to my home wireless network and couldn’t get it working. My friend Nick dug up a solution, so I wanted to share it here.

The problem I had was that when trying to find the network to join, either in the Garmin Connect mobile app or right on the device, my WPA2-secured wireless network would be shown as Unsecured and I couldn’t join it. No matter what I tried, on the device or in the app, switching around network types, names, security, or bands, the Edge 530 always saw it as Unsecured. The one thing I didn’t try was making a non-secured network, but that’s not an option for me.

It turns out you can work around this using the Garmin Express desktop app: select the Edge 530, go to Tools & Content, then Utilities, and under Wi-Fi Networks manually add the network with the appropriate security type, password, etc.

After saving settings and ejecting the device it joined my wireless network, as confirmed on the device, in the DHCP leases, and on the APs themselves. Now it’ll automatically sync rides whenever I get back home.

Something else odd is that the Edge 530 truncates 32-character SSIDs. My network at home is Smart Meter Surveillance Network, which is 32 ASCII characters long. This is the maximum allowed by 802.11, which is 32 octets, or 32 sets of 8 bits, or 32 ASCII characters. For some reason, in much of the Garmin UI it’d drop the last character, truncating it to Smart Meter Surveillance Networ. Thinking this was the cause I first dug into the network name, but eventually found that a shorter SSID didn’t help. Also, this isn’t the first device I’ve had with SSID length problems (see Bypassing Reolink SSID Length Limitation); thankfully in this case it only seemed to be a display issue.


How ASUS and a Microsoft Bug Almost Broke Remote Work

A couple of years after it happened I’m sharing this story about the intersection of an OS bug, a network hardware quirk, and a global pandemic. A chain of semi-esoteric things aligned and only caused noticeable problems in a very specific — dare I snarkily say unprecedented — situation.

I found this issue both fascinating and maddening and I hope you will as well. This does not contain code-level details of the bug (I don’t have them), but I’m sharing it both to document this problem and share a little story of what goes on behind the scenes in supporting a big enterprise IT environment.

In March 2020, when Stay Home, Stay Safe began in Michigan, most of my fellow employees began working from home (WFH). For years we’d been building IT systems to allow most people to work from anywhere, and the shift to WFH was going great. Between VPN connections back to the corporate network, lots of things in the cloud, and Azure Active Directory (AAD) to handle Single Sign-On (SSO) for almost every company application, all most folks needed for remote work was their standard laptop and an internet connection. The computing experience of working from home was effectively the same experience as working in the office.

Sure, we had some bumps with home internet connections not being robust enough, but we helped people through those. Mostly we’d find that what someone thought was a good home internet connection — because their phone or video streaming worked fine — wasn’t great for things like moving big files around or video calls. [1] Generally those having network performance issues had fine service from their ISP, but their home routers were old and not up to task. We would recommend upgrading the router, they’d go buy something new, and all would be good.

In early autumn we began receiving reports of users getting the notorious You Can’t Get There From Here (YCGTFH) message when trying to access anything that used AAD for SSO. This generic message is displayed when AAD authentication fails and access is denied. Because so many things were behind AAD SSO this interrupted a lot of work. These computers either were no longer joined to AAD or had an expired token, and it seemed tied to internet access.

Digging into it, dsregcmd /status (the AAD CLI utility) showed errors, and Test-DeviceRegConnectivity.ps1, a script which checks internet access to the AAD public endpoints as SYSTEM, would fail on all three connection tests, implying that a lack of connectivity to Microsoft endpoints was keeping AAD registration (and token refresh) from working properly. But the user could still browse the web and hit those URLs, and we hadn’t changed anything internet-access-wise since WFH began.

We found that manually setting a proxy server for the whole of the system (via netsh winhttp set proxy) would allow AAD registration (dsregcmd /join) to succeed and SSO would then work. We also found that restarting the WinHTTP Web Proxy Auto-Discovery Service (WinHttpAutoProxySvc), which handles WPAD, would sometimes fix the problem, but only sometimes.
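For reference, the stop-gap looked roughly like this from an elevated prompt; the proxy hostname and bypass list below are placeholders, and dsregcmd /join may need to run in the SYSTEM context (e.g. via PsExec) to match what the registration task does:

rem check the current system-wide (WinHTTP) proxy and AAD device state
netsh winhttp show proxy
dsregcmd /status

rem point the WinHTTP stack at a proxy so SYSTEM can reach the AAD endpoints
netsh winhttp set proxy proxy-server="proxy.example.com:8080" bypass-list="*.internal.example.com;<local>"

rem then re-attempt AAD device registration
dsregcmd /join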

Even more confusingly sometimes a reboot would fix it. Or, sometimes if a user drove into the office and used the network there, it would work. But not always. [2]

Simply put, we had some computers whose SYSTEM account couldn’t access the internet for so long that an AAD token had expired; this broke SSO and users were being told You Can’t Get There From Here.

As is typical for a lot of large organizations, we have authenticating proxy servers sitting between the client network and the public internet. All requests bound for the internet need to go through them, and these proxies are located via a Proxy Auto-Config (PAC) file that is found either by a direct setting (AutoConfigUrl) or by Web Proxy Auto-Discovery (WPAD) via DNS; both paths serve the same file. We directly set the PAC file URL on a per-user basis and leave WPAD at its default of enabled, so for the end user and things running under their account, WPAD is used, falling back to the PAC file setting if that fails. For the SYSTEM account no direct PAC file setting is made; it relies solely on WPAD to find the path to the internet. [3]
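For context, that per-user direct setting is just the standard Internet Settings value; a minimal sketch of checking and setting it, with the URL as a placeholder:

rem per-user PAC file URL (the direct AutoConfigURL setting mentioned above)
reg query "HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings" /v AutoConfigURL
reg add "HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings" /v AutoConfigURL /t REG_SZ /d "http://wpad.example.com/wpad.dat" /f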

Looking at a network capture when AAD registration would fail, instead of the normal chain of events we’d see no DNS requests for WPAD and no PAC file download. Instead the client would attempt to resolve the AAD endpoints via DNS and then try to reach them directly, which would be blocked by the company firewall. The proxy was not being used; WPAD wasn’t working. This was weird because every piece worked when tested independently (DNS resolution for the WPAD hostnames, Invoke-WebRequest http://x.x.x.x/wpad.dat, specifying the WPAD PAC file in AutoConfigUrl), but as a whole it just didn’t work.

This went on for quite a while, supported by a Premier case with Microsoft. We could see that WPAD was frequently failing, but we struggled to get a consistent reproduction and went down dead ends. We bandaged the problem with manual, direct proxy settings and manual AAD registration. This was mostly fine short-term, but it caused overhead for our support folks and was a ticking time bomb.

Then one day, thanks to a fortuitous conversation with a very smart lead Microsoft engineer while working another issue, I found out about a bug in Microsoft’s WPAD implementation that had just been discovered and was being patched in the next round of patches. The description exactly matched our problem and I was elated.

It turned out that if Windows 10 received a blank DHCP option 252, the WinHttpAutoProxySvc service would not query DNS for WPAD, and — the broken part — it would never do so again until the service was restarted. Directly configured PAC files would still be used, but WPAD was broken. In our environment this meant the SYSTEM account had no internet access, so AAD registration, Test-DeviceRegConnectivity.ps1, the Microsoft Store, and all such internet-needing SYSTEM-level things didn’t work.

Apparently some home router vendors — most notably the hugely-popular ASUS [4] — would send option 252 but leave it blank because they found doing so reduced name resolution requests from clients. This is seen in Wireshark as:
Option: (252) Private/Proxy autodiscovery
    Length: 1
    Private/Proxy autodiscovery: \n

Windows, which has WPAD enabled by default, will try a number of name resolution queries (DNS for wpad / wpad.local.tld.com / wpad.local.com, NetBIOS, LLMNR, etc.) to locate a PAC file server if it does not see a DHCP option 252. Because WinHttpAutoProxySvc looks to DHCP first and only tries DNS if that fails, setting this option but leaving it blank means the name resolution steps never occur. I can only guess as to why these vendors find it desirable, but perhaps they like reducing the load on the built-in DNS forwarders, or they saw it as a security benefit or… who knows. Either way, the result of this blank option and the Windows bug was that WPAD — via DHCP and DNS — didn’t work.

So what is option 252? WPAD has multiple discovery mechanisms for finding the PAC file server. Beyond DNS there is also a Dynamic Host Configuration Protocol (DHCP) method where, along with the typical network address settings, the client receives the URL for downloading the PAC file. This is done via option 252, but it isn’t widely supported, it’s normally not used, and we don’t use it either.
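For concreteness, on a dnsmasq-based DHCP server (which is what ASUS’ firmware and many other consumer routers use) option 252 looks like this; the URL is a placeholder, and the second form is the blank variant these routers were sending:

# legitimate use: hand clients the PAC file URL via DHCP option 252
dhcp-option=252,"http://wpad.example.com/wpad.dat"

# what the home routers were doing: option 252 containing only a newline
# (length 1, value \n, exactly as in the capture above)
dhcp-option=252,"\n"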

While WPAD is a core OS function, it unfortunately never left draft RFC status. Implementations have no formal standard to target; it’s just a guideline. Additionally, because it never left draft status, DHCP option 252 also remains unallocated and without a standard, simply part of the Reserved (Private Use) range. So setting it but leaving it blank isn’t technically wrong, and OSes should be able to accommodate it.

In a network capture I was then able to clearly see this happen, and it was simple to replicate on a test network. And then it finally all came together…

COVID-19 WFH resulted in a bunch of people upgrading their home networks, with lots of them buying new home routers, including the very-popular ASUS brand. If someone booted up their Windows 10 computer and a DHCP server sending the blank option 252 was the first thing it saw, WPAD would break, with all the downstream consequences, including our AAD connectivity issues. And if they had AAD connectivity issues for long enough — until a token expired — they would start getting You Can’t Get There From Here messages.

If they went into the office or somewhere else that doesn’t set option 252 and fully rebooted (which restarts the WinHttpAutoProxySvc service), WPAD would work and AAD registration would work. But if the service never restarted — if they never actually rebooted — WPAD was stuck not doing anything. [2]

Testing a pre-release version of the patch showed it fixed the problem, and then a few weeks later KB4601382 was released, with the detail “Improves the ability of the WinHTTP Web Proxy Auto-Discovery Service to ignore invalid Web Proxy Auto-Discovery Protocol (WPAD) URLs that the Dynamic Host Configuration Protocol (DHCP) server returns.”. We deployed this patch and the reports of AAD registration / You Can’t Get There From Here issues collapsed.

That was it, it was fixed. A popular home router vendor did something weird (but not against standard), the OS implemented something poorly, and people were working for so long in one of those environments that a credential expired, couldn’t be renewed, and they lost access to SSO.


[1] Mobile apps and a lot of modern websites are fairly asynchronous, sending requests in the background while still working nicely, because they are built to be tolerant of the blips that happen while on wireless networks. Video streaming specifically caches (or buffers) the video locally so that hiccups in the network connection don’t make the video pause and stutter. More real-time-ish things like Remote Desktop or video calls or copying files via SMB are considerably more sensitive to poor network connections.

[2] Retrospectively I suspect confusion around what it means to shut down or restart a computer led to many of the reports of reboots/shutdowns/driving into the office fixing the problem or not, and to the difficulties in getting a reliable reproduction. Different sleep modes, some of which result in the BIOS displaying the POST when waking from sleep even if the OS doesn’t restart, lead some to believe the operating system was restarted when it may not have been. Other folks believe that closing the lid is “shutting down”.

It was also unfortunately common for users to have a flexible version of “home”. Sometimes home meant where they’d been working for the last six months and had the problem, sometimes it meant a vacation home with a different ISP they’d gone to the day before but failed to mention, sometimes it meant the other side of the world. Teasing this information out was difficult, as many used “home” to mean anywhere but their normally assigned desk. “I’ve only been at home” frequently meant “I continue to be not at the office”.

[3] We use WPAD because Windows 10 requires internet access for a lot of OS-level things that run as SYSTEM, including the Microsoft Store and AAD device registration. If proxy servers are used, the SYSTEM account needs to find the proxy servers. Early on in our Windows 10 deployment Microsoft told us either direct internet access or WPAD was required. WPAD via DHCP doesn’t work with most VPN clients, because they configure their virtual adapters directly and not via DHCP. Thus, to have similar connectivity when in the office or remote, DNS for WPAD is the choice.

[4] I’ve been told multiple brands do this, but ASUS is the only vendor where I personally observed it.


BorgBackup Repository on DSM 7.0

A few years back I began using Borg for backing up nuxx.net, sending it home to my Synology DS1019+. At the time this was running the 6.x family of DSM and worked great, but it broke after moving to v7.0. Attempts to run Borg would result in this error:

/var/services/homes/borguser/borg: error while loading shared libraries: libz.so.1: failed to map segment from shared object

This appears to happen because, with the upgrade to v7.0, /tmp is mounted noexec:

adminuser@diskstation:/var/services/homes/borguser$ mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec)
adminuser@diskstation:/var/services/homes/borguser$

While a few online solutions (such as this one) propose remounting /tmp with exec, this is a poor solution as it changes the security model of DSM v7.0 and may break during a future upgrade. A better solution is to create a private temp directory just for borguser and point $TMPDIR at it.

To do this create ~borguser/tmp, ensure it’s owned by your Borg user, and set it to 700:

mkdir ~borguser/tmp
chown borguser:users ~borguser/tmp
chmod 700 ~borguser/tmp

Then create a wrapper script for Borg that sets this variable. The result is Borg using ~borguser/tmp as its private temporary directory, leaving /tmp alone and working nicely with the DSM v7.0 security design. I keep mine in ~borguser/.ssh and call it borg.sh. Be sure it’s executable. Mine looks like this:

adminuser@diskstation:/var/services/homes/borguser$ sudo cat .ssh/borg.sh
#!/bin/sh
export TMPDIR=$HOME/tmp
/var/services/homes/borguser/borg serve --storage-quota 120G --restrict-to-repository /volume2/Backups/borg
adminuser@diskstation:/var/services/homes/borguser$ sudo ls -als .ssh/borg.sh
4 -rwx------ 1 borguser root 161 Nov 15 06:56 .ssh/borg.sh
adminuser@diskstation:/var/services/homes/borguser$

Finally, change ~borguser/.ssh/authorized_keys to limit the backup user to executing the new script:

command="/var/services/homes/backupuser/.ssh/borg.sh",restrict,from="192.168.0.23" ssh-rsa AAAA[...restofkeygoeshere...] remoteuser@remoteserver.example.com


Upgrading Pi-Hole on Docker on Synology DSM

At home I use Pi-Hole running via Docker on a Synology DS1019+ (details here). Every once in a while I update this, but it seems to be long enough between updates that I forget the exact steps, so I’m documenting them here.

This is the simplest update process I know of. It keeps the configuration the same but updates the container itself. Because the volumes for persistent storage are not changed and the custom environment variables stay the same, all settings are preserved.

  1. Log into DSM.
  2. Launch Docker.
  3. In Docker, update the image:
    • Registry
    • Search for pihole
    • Select pihole/pihole (ensure it is the official image)
    • Click Download
  4. Stop and reset the container:
    • Container
    • Select your Pi-hole container.
    • Stop
    • Reset
    • Start
  5. Wait a few minutes, then log back in to Pi-hole (e.g. http://pi.hole:8081/admin/).
  6. At the bottom of the page observe the updated versions. Check here for the latest version of the Docker image, for comparison.

Note: This process presumes you have configured volumes mounted to /etc/dnsmasq.d and /etc/pihole, and have set environment variables INTERFACE, WEB_PORT, WEBPASSWORD, and TZ. Without these your results may vary.
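If you’d rather do the equivalent over SSH instead of the DSM UI, it’s roughly the following (the container name is a placeholder); the important part is that the recreated container reuses the same volumes and environment variables:

sudo docker pull pihole/pihole:latest   # same as Registry > Download
sudo docker stop pihole
sudo docker rm pihole
# then run a new container from the updated image with the original volumes and
# environment variables; settings persist because /etc/pihole and /etc/dnsmasq.d
# live on the mapped volumes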


Network Capture with Process Name and PID on macOS

There are many times when one wants to see which process is responsible for network traffic on an endpoint. While there are ways to look at which process has an open socket or whatnot, this doesn’t help with UDP, and it’s often quite useful to simply record a capture and later see which process was responsible for creating a packet.

On Windows this is very easy using Pktmon / netsh trace / Network Monitor to do a capture, but on macOS it’s not that straightforward.

Using tcpdump with the -k argument one can print process name and PID to the console. In this example I’m filtering on just the host 8.8.8.8 (Google Public DNS) then running dig in another window to look up dingleberrypie.com:

c0nsumer@myopia ~ % sudo tcpdump -k -i en0 host 8.8.8.8
Password:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type EN10MB (Ethernet), capture size 262144 bytes
09:17:58.210959 pid dig.76987 svc BE IP myopia.home.nuxx.net.56248 > dns.google.domain: 10294+ [1au] A? dingleberrypie.com. (47)
09:17:58.292541 IP dns.google.domain > myopia.home.nuxx.net.56248: 10294 1/0/1 A 96.126.107.52 (63)

The first line is the packet going out to 8.8.8.8, and the pid dig.76987 portion shows that it’s from a dig process, which had process ID 76987.

Unfortunately, due to the non-standard way Apple writes this metadata into files, it’s not viewable in Wireshark. This would be very nice, as it’d make parsing large captures much easier.

For now it seems the best way to do this is to record the capture to a file, then feed it back into tcpdump. Recording this requires using the pktap pseudo interface (see the tcpdump man page about the --interface argument) to ensure this data is saved into the file. The same capture above, writing to a file called out.cap, would be as follows:

sudo tcpdump -i pktap,en0 host 8.8.8.8 -w out.cap

This can then be fed back into tcpdump for parsing/filtering/viewing:

c0nsumer@myopia ~ % tcpdump -k -r out.cap
reading from PCAP-NG file out.cap
09:27:10.964758 (en0, proc dig:77619, svc BE, out, so) IP myopia.home.nuxx.net.52983 > dns.google.domain: 56446+ [1au] A? dingleberrypie.com. (47)
09:27:11.018809 (en0, proc dig:77619, svc BE, in, so) IP dns.google.domain > myopia.home.nuxx.net.52983: 56446 1/0/1 A 96.126.107.52 (63)
c0nsumer@myopia ~ %

In this case it shows both the sending process and the process which received the packet.

When digging into something in detail, I’ll use the macOS tcpdump to grab a very broad capture, do first-pass filtering, and then bring the result into Wireshark for more detailed analysis. See the PACKET METADATA FILTER section of the tcpdump man page for details on how to filter on a PID, process name, etc. From the header of this section:

Use packet metadata filter expression to match packets against descriptive information about the packet: interface, process, service type or direction.


HOSTS v3.5.3 and v3.6.0 Broke Backblaze Backups in Arq

About a week back I did a round of updates at home, including updating the Pi-hole container (running in Docker on a Synology DS1019+) to the latest version, v4.2.2. Not long after this I noticed that backups to Backblaze, via Arq running on my main Mac, were stuck with a Caching existing backup metadata (this may take a while) message.

Since it said it might take a while I gave it a few days, but after a week it seemed likely something was wrong. It turns out it wasn’t caused by any of my updates, but instead by two versions of the block list HOSTS (v3.5.3 and v3.6.0) — the default block list in Pi-hole — in turn caused by the Polish block list KAD.

How’d I figure it out? Here goes:

First, a wee bit of digging led to this Reddit thread on /r/Arqbackup, and a quick look at Pi-hole showed that yes, f000.backblazeb2.com was being blocked over and over.

Whitelisting this site allowed backups to resume working. But… why?

I then disabled the whitelist entry and updated gravity in Pi-hole (pulling down and compiling a new copy of the blocklists) and everything kept working. So it seemed like a block list might have been the source of the problem.

I only use two block lists, one the Pi-hole default and the other from the COVID-19 Cyber Threat Coalition. Taking a quick look through the current versions (1 · 2) didn’t show anything blocking this site as of this morning, which seemed reasonable since the gravity update had fixed things. Local DNS for this client is via Pi-hole, which in turn points to my firewall, which runs Unbound to handle all resolution itself. So it shouldn’t have been caused by a DNS provider blocking things.

Pi-hole automatically updates gravity early every Sunday morning, which roughly correlates with when the Arq problems started. So maybe this was it? With the last gravity updates happening on 2021-Apr-04 and 2021-Mar-28 we had a window in which to look for f000.backblazeb2.com in blocklists.

The COVID-19 Cyber Threat Coalition domain blocklist was updated this morning, and doesn’t have any obvious version control, so I skipped over this one for now. The second, the Pi-hole default HOSTS, is hosted in GitHub and has regular releases. So let’s look through there…

Grabbing the last four releases, v3.5.2, v3.5.3, v3.6.0, and v3.6.1, spanned the last 18 days, which should cover the window during which this broke. A quick unzip and grep showed f000.backblazeb2.com and www.f000.backblazeb2.com in the fakenews, gambling + social, gambling + porn, and social categories in versions 3.5.3 and 3.6.0, but not in anything before or after.
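The check itself was nothing fancy; roughly this, assuming the list is the StevenBlack/hosts repository on GitHub and the tags are named as above:

# grab the last four tagged releases and search them for the hostname
for v in 3.5.2 3.5.3 3.6.0 3.6.1; do
  curl -sLO "https://github.com/StevenBlack/hosts/archive/refs/tags/${v}.zip"
  unzip -q "${v}.zip"
done
grep -rl 'f000\.backblazeb2\.com' hosts-3.*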

There we go: the reason for the block, and it’s all within the observed timeframe. This isn’t a hostname one would normally want to block, as it’s part of Backblaze’s CDN (PDF). It sounds like an overzealous addition to a blocklist got sucked up into the HOSTS list.

Looking further through the grep output, this was part of the .../KADhosts/hosts file from the KAD list. It turns out that f000.backblazeb2.com was added to the KAD list on 2021-Mar-26 and then removed on 2021-Apr-01. HOSTS pulled from KAD for v3.5.3 on 2021-Mar-28 and for v3.6.0 on 2021-Mar-31, which caused it to inherit the block in those versions.

Quite an interesting chain, eh? A Polish ad blocking group makes a change that ends up in the default list for one of the most common DIY adblockers, which in turn breaks access to a fairly common CDN, in turn breaking data backups. It’s dependencies all the way down…

It’s now fixed, and everything would have resolved itself had I waited until Sunday, but at least now I know why.


A Home Network Troubleshooting Journey

This week I moved from UniFi to a new setup that included OPNsense on the edge to handle firewall, NAT, and other such tasks on the home network. Built into OPNsense is a basic NetFlow traffic analyzer called Insight. Looking at this and turning on Reverse lookup, something strange popped out: ~22% of the traffic coming in from the internet over the last two hours was from just two hosts: dynamic-75-76-44-147.knology.net and dynamic-75-76-44-149.knology.net.

While reverse DNS worked to resolve the IPs to hostnames (75.76.44.147 to dynamic-75-76-44-147.knology.net and 75.76.44.149 to dynamic-75-76-44-149.knology.net), forward lookup of those hostnames didn’t work. This didn’t really surprise me as the whole DNS situation on the WOW/Knology network is poor, but it did make me more curious. Particularly strange was that the two IPs are so close together.

To be sure this was Knology (ruling out intentionally misleading reverse DNS) I used whois to confirm the addresses are owned by them:

NetRange: 75.76.0.0 - 75.76.46.255
CIDR: 75.76.46.0/24, 75.76.40.0/22, 75.76.0.0/19, 75.76.44.0/23, 75.76.32.0/21
NetName: WIDEOPENWEST
NetHandle: NET-75-76-0-0-1
Parent: NET75 (NET-75-0-0-0-0)
NetType: Direct Allocation
OriginAS: AS12083
Organization: WideOpenWest Finance LLC (WOPW)
RegDate: 2008-02-13
Updated: 2018-08-27
Ref: https://rdap.arin.net/registry/ip/75.76.0.0

My home ISP is Wide Open West (WOW), and Knology is an ISP that they bought in 2012. While I use my ISP directly for internet access (no VPN tunnel to elsewhere), I run my own DNS to avoid their service announcement redirections, so why would I be talking to something else on my ISP’s network?

Could this be someone doing a bunch of scanning of my house? Or just something really misconfigured doing a bunch of broadcasting? Let’s dig in and see…

First I used the Packet capture function in OPNsense to grab a capture on the WAN interface filtered to these two IPs. Looking at it in Wireshark showed it was all HTTPS. Hmm, that’s weird…

A couple coworkers and I have Plex libraries shared with each other, maybe that’s it? The port isn’t right (Plex usually uses 32400) but maybe one of them is running it on 443 (HTTPS)… But why the two IPs so close to each other? Maybe one of them is getting multiple IPs from their cable modem, has dual WAN links configured on their firewall, and it’s bouncing between them… (This capture only showed the middle of a session, so there was no certificate exchange present to get any service information from.)

Next I did another packet capture on the LAN interface to see whether the local endpoint was a computer on the network or OPNsense itself. This showed it coming from my main personal computer, a 27″ iMac at 192.168.0.8 / myopia.--------.nuxx.net, so let’s look there. (Plex doesn’t run on the iMac, so that’s ruled out.)

Conveniently the -k argument to tcpdump on macOS adds packet metadata, such as process name, PID, etc. A basic capture/display on myopia with tcpdump -i en0 -k NP host 75.76.44.149 or 75.76.44.147 to show all traffic going to and from those hosts identified Firefox as the source:

07:39:57.873076 pid firefox.97353 svc BE pktflags 0x2 IP myopia.--------.nuxx.net.53515 > dynamic-75-76-44-147.knology.net.https: Flags [P.], seq 19657:19696, ack 20539524, win 10220, options [nop,nop,TS val 3278271236 ecr 1535621504], length 39
07:39:57.882070 IP dynamic-75-76-44-147.knology.net.https > myopia.--------.nuxx.net.53515: Flags [P.], seq 20539524:20539563, ack 19696, win 123, options [nop,nop,TS val 1535679857 ecr 3278271236], length 39

Well, okay… Odd that my browser would be talking so much HTTPS to my ISP directly. I double-checked that DNS-over-HTTPS was disabled, so it’s not that…

Maybe I can see what these servers are? Pointing curl at one of them to show the headers, the server header indicated proxygen-bolt, which is a Facebook framework:

c0nsumer@myopia Desktop % curl --insecure -I https://75.76.44.147
HTTP/2 400
content-type: text/plain
content-length: 0
server: proxygen-bolt
date: Sat, 16 Jan 2021 13:22:57 GMT
c0nsumer@myopia Desktop %

Now we’re getting somewhere…

Finally I pointed openssl at the IP to see what certificate it was presenting, and it’s a wildcard cert for a portion of Facebook’s CDN:

c0nsumer@myopia Desktop % openssl s_client -showcerts -connect 75.76.44.149:443 </dev/null
CONNECTED(00000003)
depth=2 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert High Assurance EV Root CA
verify return:1
depth=1 C = US, O = DigiCert Inc, OU = www.digicert.com, CN = DigiCert SHA2 High Assurance Server CA
verify return:1
depth=0 C = US, ST = California, L = Menlo Park, O = "Facebook, Inc.", CN = *.fdet3-1.fna.fbcdn.net
verify return:1
[SNIP]

As a final test I restarted tcpdump on the iMac then closed the Facebook tab I had open in Firefox and the traffic stopped.

So there’s our answer. All this traffic is to Facebook CDN instances on the Wide Open West / Knology network. It sure seems like a lot for a tab just sitting open in the background, but hey… welcome to the modern internet.


I could have gotten more information from OPNsense’s Insight by clicking on a host’s pie slice to look at that host in the Details view, but it seems to have an odd quirk. When the Reverse lookup box is checked, clicking the pie slice to jump to the Details view automatically puts the hostname in the (src) Address field, which returns no results (it needs an IP address). I thought this was the tool failing, so I looked to captures for most of the info.

Later on I realized that filtering on the IP showed a bunch more useful information, including two other endpoints within the network talking to these servers (mobile phones), and that HTTPS was also running over UDP, indicating QUIC.

(Bug 4609 was submitted for this issue and AdSchellevis fixed it within a couple hours via commit c797bfd.)


Pi-hole via Docker on Synology DSM with Bonded Network Interface

As part of consolidating and upgrading my home network I’m moving Pi-hole from a stand-alone Raspberry Pi to running under Docker on my Synology DS1019+ running DiskStation Manager (DSM) v6.2.3.

This was a little bit confusing at first as the web management UI would work, but DNS queries weren’t getting answered. This ended up being caused by the bonded network interface, which is ovs_bond0 instead of the normal default of eth0.

Using the official Pi-hole Docker image, set to run with Host networking (Use the same network as Docker host in the Synology UI), setting or changing the following variables will set up Pi-hole to work from first boot, configured to:

  • Listen on ovs_bond0 (instead of the default eth0).
  • Answer DNS queries on the same IP as DSM (192.168.0.2).
  • Run the web-based management interface on port 8081 with password piholepassword.
  • Send internal name resolutions to the internal DNS/DHCP server at 192.168.0.1 for clients *.internal.example.com within 192.168.0.0/24.
  • Set the displayed temperature to Fahrenheit and time zone to America/Detroit.
  • Listen for HTTP requests on http://diskstation.internal.example.com:8081 alongside the default pi.hole hostname.

DNS=127.0.0.1
INTERFACE=ovs_bond0
REV_SERVER=True
REV_SERVER_CIDR=192.168.0.0/24
REV_SERVER_DOMAIN=internal.example.com
REV_SERVER_TARGET=192.168.0.1
ServerIP=192.168.0.2
TEMPERATUREUNIT=f
TZ=America/Detroit
VIRTUAL_HOST=diskstation.internal.example.com
WEB_PORT=8081
WEBPASSWORD=piholepassword

Additionally, setting up volumes for /etc/dnsmasq.d/ and /etc/pihole/ will ensure changes made in the UI persist across restarts and container upgrades.

Note: If you stop the Pi-hole container, clear out the contents of these directories, and then restart the container, Pi-hole will set itself up again from the environment variables. This allows tweaking the variables without recreating the container each time.
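For reference, the rough docker run equivalent of all of the above (I did this via the Synology UI; the host-side volume paths are placeholders) would be:

sudo docker run -d --name pihole --network host \
  -e DNS=127.0.0.1 \
  -e INTERFACE=ovs_bond0 \
  -e REV_SERVER=True \
  -e REV_SERVER_CIDR=192.168.0.0/24 \
  -e REV_SERVER_DOMAIN=internal.example.com \
  -e REV_SERVER_TARGET=192.168.0.1 \
  -e ServerIP=192.168.0.2 \
  -e TEMPERATUREUNIT=f \
  -e TZ=America/Detroit \
  -e VIRTUAL_HOST=diskstation.internal.example.com \
  -e WEB_PORT=8081 \
  -e WEBPASSWORD=piholepassword \
  -v /volume1/docker/pihole/etc-pihole:/etc/pihole \
  -v /volume1/docker/pihole/etc-dnsmasq.d:/etc/dnsmasq.d \
  pihole/pihole:latest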

UPDATE: With the update to Synology DSM 7.0 the interface is now called bond0.


nginx for HTTPS Request Logging

Consider the following situation: You have a web app from a vendor and during a security scan it crashes. The web app is running over HTTPS with your certificates, but neither the scanning tool nor web app offer sufficient logging to see exactly which request caused the crash.

Because you can’t decrypt HTTPS without access to a client key log file (or making a bunch of TLS changes), and the client is a security scanning tool, Wireshark is not an option to see the triggering request. Fiddler is also likely out, as that’d require the security scanner to trust a new root cert. So what can you do? Stick something else in the way to proxy the connection, logging all the requests!

With access to the server’s certificate and private key this is quite easy: set up nginx as a proxy. The only wrinkle is that getting access to all of the request headers requires Lua, so you’ll need to ensure your nginx install supports this. On macOS this was easy using Homebrew to install nginx from denji’s GitHub repository (the default nginx doesn’t support Lua):

brew tap denji/nginx
brew install nginx-full --with-lua-module --with-set-misc-module

This configuration uses the web app’s certificates in nginx to proxy requests it receives to your main site, logging the client IP, request, headers, body, and request status to intercept.log. Requests are broken out by line to make for easy visual reading; you may wish to move this all onto one line to make parsing easy:

events {
}

http {
    log_format custom 'Time: $time_local'
                      '
'
                      'Remote Addr: $remote_addr'
                      '
'
                      'Request: $request'
                      '
'
                      'Request Headers: $request_headers'
                      '
'
                      'Body: $request_body'
                      '
'
                      'Status: $status'
                      '
'
                      '-----';

    server {
        listen 443 ssl;
        server_name example.com;
        access_log /path/to/intercept.log custom;
        ssl_certificate /path/to/cert.pem;
        ssl_certificate_key /path/to/privkey.pem;

        location / {
            proxy_pass https://example.com;
            proxy_set_header Accept-Encoding ''; 
            set_by_lua_block $request_headers {
                local h = ngx.req.get_headers()
                local request_headers_all = ""
                for k, v in pairs(h) do
                    request_headers_all = request_headers_all .. ""..k..": "..v..";"
                end
                return request_headers_all
            }
        }
    }
}

To put this in place, ensure that requests from the scanner go to nginx instead of the web app and then nginx will forward and log the requests. There are a few ways you could do this:

  • Run nginx on the same server as the web app, move the web app to listen to another port for HTTPS, and set proxy_pass to the other port: proxy_pass https://example.com:4430
  • Run nginx on a new server, change the DNS records for the site to point to the new server, and point nginx to the old server by IP: proxy_pass https://192.168.10.10
  • If the scanner tool’s name resolution can be adjusted, such as via a HOSTS file or custom configuration, point it to the nginx proxy for the site name.

To test, you can use a web browser on a client computer and a HOSTS file to point the original hostname at nginx. I ran nginx on an iMac running macOS, then in a Windows VM I changed the HOSTS file to map nuxx.net to the iMac’s IP. Firefox on the Windows VM then sent requests for nuxx.net to nginx on macOS, which logged and proxied the requests out to the real nuxx.net.
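For example, the entry in the test client’s hosts file would look something like this, with the IP being wherever nginx is listening (both values here are placeholders):

# client hosts file: send requests for the site to the nginx proxy
192.168.0.8    nuxx.net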


Pi-hole (and PiVPN) with Ubiquiti UniFi

Pi-hole

My home network is based around Ubiquiti’s UniFi, with a Security Gateway (USG) handling the NAT/firewall/routing duties. For ad blocking and to have better control over DNS I use Pi-hole running on a Raspberry Pi.

With the following settings you can have the two working well together with UniFi doing DHCP and Pi-hole doing DNS. Internal forward and reverse resolution will work, which means hostnames will appear properly for internal devices on both consoles while requests are still appropriately Pi-hole’d.

Here’s how:

  • Set up the Pi-hole and put it on the network at a static IP.
  • In Pi-hole, under Settings → DNS turn on:
    • Never forward non-FQDNs
    • Never forward reverse lookups for private IP ranges
    • Conditional forwarding, with the IP address of your DHCP server (router) set to the USG
    • Local domain name (optional) set to your internal DNS suffix
  • In the USG, set DHCP to hand out the Pi-hole’s IP for DHCP Name Server.
  • In the USG, under Services → DHCP → DHCP Server, set Register client hostname from DHCP requests in USG DNS forwarder to On.
  • Leave the WAN interface’s DNS set to something public, such as what the ISP provides or Google’s 8.8.8.8/8.8.4.4 or whatever. This ensures that if the Pi-hole goes down then the USG can still resolve DNS.

After setting this up clients will use Pi-hole for DNS, as configured via DHCP. Requests for hostnames and addresses on the local network (shortnames or the local suffix) will get forwarded to the USG, ensuring that internal requests work properly.
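Under the hood, Pi-hole’s conditional forwarding boils down to a couple of dnsmasq directives; roughly this, assuming the USG at 192.168.0.1 and an internal suffix of internal.example.com (both placeholders for your values):

# forward internal names and private reverse lookups to the USG
server=/internal.example.com/192.168.0.1
rev-server=192.168.0.0/24,192.168.0.1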

PiVPN

Taking this a step further, I also have PiVPN running on the same Pi, to provide an endpoint for connecting into my home network via Wireguard. Pi-hole and PiVPN integrate very nicely and are designed to work together, making the setup very smooth.

By default, PiVPN sets the Pi-hole as the DNS via a DNS option in the [Interface] section of the config. To ensure appropriately geolocated search results when connected to the VPN, use an upstream DNS which supports Extended Client Subnet (ECS) (under Settings → DNS) on the Pi-hole.
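For reference, the relevant piece of a PiVPN-generated WireGuard client config looks roughly like this (keys and addresses are placeholders); the DNS line is what points VPN clients at the Pi-hole:

[Interface]
PrivateKey = <client private key>
Address = 10.6.0.2/24
# DNS points at the Pi-hole; here assumed to be the VPN server's tunnel address
DNS = 10.6.0.1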

(For reference, I’m running Pi-hole on a Raspberry Pi 4 Model B with 2GB of RAM and it has plenty of headroom for both Pi-hole serving ~20 devices and sustaining 50 MByte/sec via Wireguard. The Pi-hole section of this was originally written up here on Reddit.)
