

Simple PAC File Pilot Testing (including WPAD)

In a network that’s isolated from the public internet, such as many enterprise networks, proxy servers are typically used to broker internet access for client computers. Configuring the client computers to use these proxies is often done via a Proxy Auto-Config (PAC) file, code that steers requests so traffic for internal sites stays internal, and public sites go through the proxies.
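For those who haven't seen one, a PAC file is just a JavaScript function named FindProxyForURL. Below is a minimal sketch of one; the domain, subnet, and proxy hostnames are placeholders rather than rules from any real environment:

function FindProxyForURL(url, host) {
    // Internal names and internal address space stay direct.
    if (dnsDomainIs(host, ".example.com") ||
        isInNet(dnsResolve(host), "10.0.0.0", "255.0.0.0")) {
        return "DIRECT";
    }
    // Everything else goes out through the proxies, in order of preference.
    return "PROXY proxy1.example.com:8080; PROXY proxy2.example.com:8080";
}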

Commonly these PAC files are also made available via the Web Proxy Auto-Discovery Protocol (WPAD), because some systems need to discover them automatically. Specifically, in a Windows 10 environment which uses proxies, WPAD is needed because many components of Windows (including the Microsoft Store and Azure Device Registration) will not use the browser’s PAC file settings; they depend on WPAD to find a path to the internet.

WPAD is typically configured via DNS, with a hostname of wpad.companydomain.com (or anything in the DNS Search Suffix List) resolving to the IP of a webserver [1]. This server must then answer an HTTP request for http://x.x.x.x/wpad.dat (where x.x.x.x is the server’s IP) or http://wpad.company.com/wpad.dat with a PAC file, served with a Content-Type of application/x-ns-proxy-autoconfig [2].

Because WPAD requires DNS, something which can’t easily be changed for a subset of users, putting together a mechanism to perform a pilot deployment of a new PAC file can be a bit complicated. When attempting to perform a pilot deployment engineers will often send out a test PAC file URL to be manually configured, but this misses WPAD and does not result in a complete system test.

In order to satisfy WPAD, one can set up a simple webserver to host the new PAC file and a DNS server to answer the WPAD queries. This DNS server forwards all requests except for those for the PAC file to the enterprise DNS, so everything else works as normal. Testing users then only need to change their DNS to receive the pilot PAC file and everything else will work the same; a true pilot deployment.

Below I’ll detail how I use simplified configurations of Unbound and nginx to pilot a PAC file deployment. This can be done from any Windows machine, or with very minor config changes from something as simple as a Raspberry Pi running Linux.

[1] WPAD can be configured via DHCP, but this is only supported by a handful of Microsoft applications. DNS-based WPAD works across all modern OSes.

[2] Some WPAD clients put the server’s IP in the Host: field of the HTTP request.

DNS via Unbound

Unbound is a DNS server that’s straightforward to run and is available on all modern platforms. It’s perfect for our situation where we need to forward all DNS queries to the production infrastructure, modifying only the WPAD/PAC related queries to point to our web server. While it’s quite robust and has a lot of DNSSEC validation options, we don’t need any of that.

This simple configuration forwards all queries to the corporate Active Directory-based DNS servers (10.0.1.2 and 10.0.2.2), except those for the PAC file servers. For pacserver.example.com and wpad.example.com it intercepts the query and returns our webserver’s address of 10.0.3.25.

server:
    interface: 0.0.0.0
    access-control: 0.0.0.0/0 allow
    module-config: "iterator"

    local-zone: "wpad.example.com." static
    local-data: "wpad.example.com. IN A 10.0.3.25"

    local-zone: "pacserver.example.com." static
    local-data: "pacserver.example.com. IN A 10.0.3.25"

stub-zone:
    name: "."
    stub-addr: 10.0.1.2
    stub-addr: 10.0.2.2

This configuration allows recursive queries from any host, but by specifying one or more subnets in access-control clauses you can restrict where it can be used from. The stub-zone clause sends all remaining requests to up to two upstream DNS servers. If these upstream servers handle recursion for the client, a forward-zone clause can be used instead.
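For reference, a forward-zone version of the upstream section would look like the sketch below, using the same example addresses; use this form when the upstream servers do the recursion on behalf of clients:

forward-zone:
    name: "."
    forward-addr: 10.0.1.2
    forward-addr: 10.0.2.2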

PAC File via nginx

For serving up the PAC file, both for direct queries and those from WPAD, we’ll use nginx, a powerful but easy to use web server to which we can give a minimal config.

Put a copy of your PAC file at …/html/wpad.dat under nginx’s install directory so the server can find it. (There is great information on writing PAC files at FindProxyForUrl.com.)

This simple configuration will set up a web server which serves all files as MIME type application/x-ns-proxy-autoconfig, offering up the wpad.dat file by default (eg: http://pacserver.example.com) or when directly referenced (eg: http://10.0.3.25/wpad.dat or http://wpad.example.com/wpad.dat), satisfying both standard PAC file and WPAD requests.

events {
    worker_connections 1024;
}

http {
    default_type application/x-ns-proxy-autoconfig;
    sendfile on;
    keepalive_timeout 65;

    server {
        listen 80;
        server_name localhost;

        location / {
            root html;
            index wpad.dat;
        }
    }
}

Putting It All Together

With all the files in place and unbound and nginx running, you’re ready to go. Instruct pilot users to manually configure the new DNS, or push this setting out via Group Policy, VPN settings, or some other means. These users will then get the special DNS response for your PAC and WPAD servers, get the pilot PAC file from your web server, and be able to test.
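To spot-check a pilot client after its DNS has been switched, a couple of quick commands should show both halves working; 10.0.3.24 below is a stand-in for whatever address your Unbound box actually gets:

# The intercepted record should come back from the pilot DNS (expect 10.0.3.25):
nslookup wpad.example.com 10.0.3.24

# The web server should answer with the PAC file and the right Content-Type:
curl -i http://wpad.example.com/wpad.dat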


Archiving Gallery 2 with HTTrack

Along with the static copy of the MediaWiki, I’ve been wanting to make a static, archival copy of the Gallery 2 install that I’ve been using to share photos at nuxx.net/gallery for 15+ years. Using HTTrack I was able to do so, after a bit of work, resulting in a copy at the same URL, with images accessed using the same paths, served from static files.

The result is that I no longer need to run the aging Gallery 2 software, yet links and embedded images that point to my photo gallery did not break.

In the last few years traffic has dropped off, I haven’t posted many new things there, and it seems like the old Internet habit of pointing people to a personal photo gallery is nearly dead. I believe that blog posts, such as this, with links to specific photos, are where effort should be put. While there is 18+ years of personal history in digital images in my gallery, it doesn’t get used the same way it was 10 years ago.

On the technical side, the relatively ancient (circa 2008) Gallery 2 and the ~90GB of data in it have occasionally been a burden. I had to maintain an old copy of PHP just for this app, which made updating things a pain. While there is a recent project, Gallery the Revival, which aims to update Gallery to newer versions of PHP, it is based around Gallery 3, and a migration to that brings its own problems, including breaking static links.

I’m still not sure if I want to keep the gallery online but static as it is now, put the web app back up, completely take it off the internet and host it privately at home, or what… but figuring out how to create an archive has given me options.

What follows are my notes on how I used HTTrack, a package specifically designed to mirror websites, to archive nuxx.net’s Photo Gallery. I encountered a few bumps along the way, so this details each and how it was overcome, resulting in the current static copy. To find each of these I’d start HTTrack, let it run for a while, see if it got any errors, fix them, then try again. Eventually I got it to archive cleanly with zero errors:

Gallery Bug 83873

During initial runs, HTTrack finished after ~96MB (out of ~90GB of images) saved, reporting that it was complete. The main portions of the site looked good, but many sub-albums or original-resolution images were zero-byte HTML files on disk and displayed blank in the browser. This was caused by Gallery bug 83873, triggered by using HTTPS on the site. It seems to be fixed by adding the following line just before line 780 in .../modules/core/classes/GallerySession.class:

GalleryCoreApi::requireOnce('modules/core/classes/GalleryTranslator.class');

This error was found via the following in Apache’s error log:

AH01071: Got error 'PHP message: PHP Fatal error: Class 'GalleryTranslator' not found in /var/www/vhosts/nuxx.net/gallery/modules/core/classes/GallerySession.class on line 780\n', referer: http://nuxx.net/gallery/

Minimize External Links / Footers

To clean things up further, minimize external links, and make the static copy of the site as simple as possible, I also commented out the external Gallery links and version info in the footer, via .../themes/themename/templates/local/theme.tpl and .../themes/themename/templates/local/error.tpl:

<div id="gsFooter">
{*
{g->logoButton type="validation"}
*{g->logoButton type="gallery2"}
*{g->logoButton type="gallery2-version"}
*{g->logoButton type="donate"}
*}
</div>

Remove Details from EXIF/IPTC Plugin

The EXIF/IPTC Plugin for Gallery is excellent because it shows embedded metadata from the original photo, including things like date/time, camera model, and location. This presents as a simple Summary view and a lengthier Details view. Unfortunately, when the site is being indexed by HTTrack, selecting the Details view — done via JavaScript — returns a server error. This shows up in the HTTrack UI as an increasing error count, with server errors as some pages are queried.

To not have a broken link on every page I modified the plugin to remove the Summary and Details view selector so it’d only display Summary, and used the plugin configuration to ensure that every field I wanted was shown in the summary.

To make this change copy .../modules/exif/templates/blocks/ExifInfo.tpl to .../modules/exif/templates/blocks/local/ExifInfo.tpl (to create a local copy, per the Editing Templates doc). Then edit the local copy and comment out lines 43 through 60 so that only the Summary view is displayed:

{* {if ($exif.mode == 'summary')}
* {g->text text="summary"}
* {else}
* <a href="{g->url arg1="controller=exif.SwitchDetailMode"
* arg2="mode=summary" arg3="return=true"}" onclick="return exifSwitchDetailMode({$exif.blockNum},{$item.id},'summary')">
* {g->text text="summary"}
* </a>
* {/if}
* &nbsp;&nbsp;
* {if ($exif.mode == 'detailed')}
* {g->text text="details"}
* {else}
* <a href="{g->url arg1="controller=exif.SwitchDetailMode"
* arg2="mode=detailed" arg3="return=true"}" onclick="return exifSwitchDetailMode({$exif.blockNum},{$item.id},'detailed')">
* {g->text text="details"}
* </a>
* {/if}
*}

Disable Extra Plugins

Finally, I disabled a bunch of plugins which both wouldn’t be useful in a static copy of the site, and cause a number of interconnected links which would make a mirror of the site overly complicated:

  • Search: Can’t search a static site.
  • Google Map Module: Requires a maps API key, which I don’t want to mess with.
  • New Items: There’s nothing new getting posted to a static site.
  • Slideshow: Not needed.

Fix Missing Files

My custom theme, which was based on matrix, linked to some images in the matrix directory which were no longer present in newer versions of the themes, so HTTrack would get 404 errors on these. I copied these files from my custom theme to the .../themes/matrix/images directory to fix this.

Clear Template / Page Cache

After making changes to templates it’s a good idea to clear all the template caches so all pages render with the above changes. While all these steps may be overkill, I do this by going into Site Admin → Performance and setting Guest Users and Registered Users to No acceleration. I then uncheck Enable template caching and click Save. I then click Clear Saved Pages to clear any cached pages, then re-enable template caching and Full acceleration for Guest Users (which HTTrack will be working as).

PANIC! : Too many URLs : >99999

If your Gallery has a lot of images, HTTrack could quit with the error PANIC! : Too many URLs : >99999. Mine did, so I had to run it with the -#L1000000 argument so that it’ll then be limited to 1,000,000 URLs instead of the default 99,999.

Run HTTrack

After all of this, I ran the httrack binary with the security (bandwidth, etc) limits disabled (--disable-security-limits) and used its wizard mode to set up the mirror. The URL to be archived was https://nuxx.net/gallery/, stored in an appropriately named project directory, with no other settings.

CAUTION: Do not disable security limits if you don’t have good controls around the site you are mirroring and the bandwidth between the two. HTTrack has very sane rate-limiting defaults that keep its mirroring behavior polite; it’s not wise to override them unless you have good control of both the source and destination.

When httrack begins it shows no progress on screen, so I quit with Ctrl-C, switched to the project directory, and ran httrack --continue to allow the mirror to continue and show status info on the screen (the screenshot above). The argument --continue can be used to restart an interrupted mirror, and --update can be used to freshen up a complete mirror.

Alternately, the following command puts this all together, without the wizard:

httrack https://nuxx.net/gallery/ -W -O "/home/username/websites/nuxx.net Photo Gallery" -%v --disable-security-limits -#L1000000

As HTTrack spiders the site it comes across external links and needs to know what to do with them. Because I didn’t specify an action for external links on the command line, it prompts with the question “A link, [linkurl], is located beyond this mirror scope.”. Since I’m not interested in mirroring any external sites (mostly links to recipes or company websites) I answer * which is “Ignore all further links and do not ask any more questions” (text in httrack.c). (I was unable to figure out how to suppress this via a command line option before getting a complete mirror, although it’s likely possible.)

Running from a Dedicated VM

I ran this mirror task from a Linode VM, located in the same region as the VM hosting nuxx.net. This results in all traffic flowing over the private network, avoiding bandwidth charges.

Because of the ~90GB of images, I set up a Linode 8GB, which has 160GB of disk, 8GB of RAM, and 4 CPUs. This should provide plenty of space for the mirror, with enough resources to allow the tool to work. This VM costs $40/mo (or $0.06/hr), which I find plenty affordable for getting this project done. The mirror took N days to complete, after which I tar’d it up and copied it a few places before deleting the VM.

By having a separate VM I didn’t need to worry about any dependencies or package problems, and I could delete it after the work was done. All I needed to do on this VM was create a user, put it in the sudoers file, install screen (sudo apt-get install screen) and httrack (sudo apt-get install httrack), and get things running.

Wrapping It All Up

After the mirror was complete I replaced my .../gallery directory with the .../gallery directory from the HTTrack output directory and all was good.


Archiving MediaWiki with mwoffliner and zimdump

For a number of years on nuxx.net I used MediaWiki to host technical content. The markup language is nearly perfect for this sort of content, but in recent years I haven’t been doing as much of this and maintaining the software became a bit of a hassle. In order to still make the content available but get rid of the actual software, I moved all the content to static HTML files.

These files were created by creating a ZIM file — commonly used for offline copies of a website — and then extracting that file. The extracted files, a static copy of the MediaWiki-based site, were then made available using Apache.

You can get the ZIM file here, or browse the new static pages here.

Here are the general steps I used to make it happen.

Create ZIM file: mwoffliner --mwUrl="https://nuxx.net/" --adminEmail=steve@nuxx.net --redis="redis://localhost:6379" --mwWikiPath="/w/" --customZimFavicon=favicon-32x32.png

Create HTML Directory from ZIM File: zimdump -D mw_archive outfile.zim

Note: There are currently issues with zimdump putting URL-encoded %2f sequences in filenames instead of creating paths. This is openzim/zim-tools issue #68, and will need to be fixed by hand.

Consider using find . -name "*%2f*" to find problems with files, then use rename 's/.{4}(.*)/$1/' * (or so) to fix the filenames after moving them into appropriate subdirectories.
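As a rough sketch of that cleanup, the %2f sequences can be decoded back into directory separators with a small shell loop like the one below; treat it as a starting point, as it assumes the broken names use lowercase %2f and contain no newlines:

find . -name '*%2f*' | while read -r f; do
    new=$(printf '%s\n' "$f" | sed 's/%2f/\//g')   # decode %2f back into /
    mkdir -p "$(dirname "$new")"                   # create the implied subdirectory
    mv "$f" "$new"
done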

If using Apache (as I am), create an .htaccess to set MIME types appropriately, turning off the rewrite engine so higher-level redirects don’t affect things:

<FilesMatch "^[^.]+$">
ForceType text/html
</FilesMatch>

RewriteEngine Off

Link to http://sitename.com/outdir/A/Main_Page to get to the original main wiki page. In my case, http://nuxx.net/wiki_archive/A/Main_Page.

 


BorgBackup Repository on Synology DSM 6.2.2

(UPDATE: With the release of Synology DSM 7.0 this setup will break. It’s easy to fix, and I’ve added this post describing how to make this system work under the new version.)

Lately I’ve become enamored with BorgBackup (Borg) for backups of remote *NIX servers, so after acquiring a Synology DS1019+ for home I wanted to make it the destination repository for Borg-based backups of nuxx.net. While setting up Borg is usually quite straightforward (a package or stand-alone binary), it’s not so cut and dried on Synology DiskStation Manager (DSM), the OS which runs on the DS1019+ and most other Synology NASes.

What follows are the steps I used to make this work, and the reason for each step. In the end it was fairly simple, but a few of the steps are obtuse and only relevant to DSM.

These steps were written for DSM 6.2.2; I have not checked to see if it applies to other versions. Also, I leave out all details of setting up public key authentication for SSH as this is thoroughly documented elsewhere.

  1. Enable “User Home Service” via Control Panel → User → Advanced → User Home → Enable user home service: This creates a home directory for each user on the machine and thus a place to store .ssh/authorized_keys for the backup user account.
  2. Create a backup user account and make it part of the administrators group: Accounts must be part of administrators in order to log in via SSH. Starting with DSM 6.2.2 non-admin users do not have SSH access.
  3. Change the permissions on the backup user’s home directory to 755: By default users’ home directories have an ACL applied which has too broad of permissions and SSH will refuse to use the key, instead prompting for a password. Home directories are located under /var/services/homes and this can be set via chmod 755 /var/services/homes/backupuser. (See this thread for details.)
  4. Put ~/.ssh/authorized_keys, containing the remote user’s public key, in place under the backup user’s home directory and ensure that the file is set to 700: If permissions are too open, sshd will refuse to use the key.
  5. Test that you can log in remotely with ssh and public key authentication.
  6. Place the borg-linux64 binary (named borg) in the user’s home directory and confirm that it’s executable: Binaries available here.
  7. Create a directory on the NAS to be used as the backup destination and give the backup user read and write permissions.
  8. Modify the backup user’s ~/.ssh/authorized_keys to prevent remote interactive logins and restrict how borg is run: This is optional, but a good idea.

    In this example only the borg serve command (the borg repository server) can be run remotely; it is restricted to a 120GB storage quota, to a repository on DSM under the backup directory /volume2/Backups/borg, and to connections from the remote IP 192.168.0.23:

    command="/var/services/homes/backupuser/borg serve --storage-quota 120G --restrict-to-repository /volume2/Backups/borg",restrict,from="192.168.0.23" ssh-rsa AAAA[...restofkeygoeshere...] remoteuser@remoteserver.example.com

Please note, there are a number of articles about enabling public key authentication for SSH on DSM which mention uncommenting and setting PubkeyAuthentication yes and AuthorizedKeysFile .ssh/authorized_keys in /etc/ssh/sshd_config and restarting sshd. I did not need to do this. The settings, as commented out, are the defaults and thus already set that way (see sshd_config(5) for details).
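As a quick check from the client, before layering on the command= restriction from step 8, something along these lines should confirm both the key-based login and that the borg binary in the home directory is usable; the hostname is an example, and the repository path is the one used above:

# Key-based login should succeed without a password prompt:
ssh backupuser@nas.example.com uname -a

# Initialize a repository on the NAS, pointing borg at the binary in the
# backup user's home directory:
BORG_REMOTE_PATH=/var/services/homes/backupuser/borg \
    borg init --encryption=repokey ssh://backupuser@nas.example.com/volume2/Backups/borg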

At this point DSM should allow a remote user, authenticating with a public key and restricted to a particular source IP address, to use the Synology NAS as a BorgBackup repository. For more information about automating backups check out this article about how I use borg for backing up nuxx.net, including a wrapper script that can be run automatically via cron.


Using borg for backing up nuxx.net

For 10+ years I’ve been backing up the server hosting this website using a combination of rsync and some homegrown scripts to rotate out old backups. This was covered in Time Machine for… FreeBSD? and while it works well and has saved me on some occasions it had downsides.

Notably, the use of rsync’s --link-dest requires a destination filesystem that supports shared inodes, limiting me to Linux/BSD/macOS for backup storage. Also, while the rotation was decent, it wasn’t particularly robust nor fast. Because of how I used rsync to maintain ownership with --numeric-ids it also required more control on the destination than I preferred. And, the backups couldn’t easily be moved between destinations.

In the years since, a few backup packages have come out for doing similar remote backups over ssh, and this weekend I settled on BorgBackup (borg) automated with borgbackup.sh.

Borg and a single wrapper script (that I based on dailybackup.sh) executed from cron simplifies all of this; handling compression, encryption, deduplication, nightly backups, ssh tunneling, and pruning of old backups. More importantly, it removes the dependence on shared inodes for deduplication, eliminating the Linux/BSD/macOS destination requirement. This means my destination could be anything from a Windows box running OpenSSH to a NAS to rsync.net.

Here’s a brief overview of how I have it set up:

  1. Install BorgBackup on both the source and destination (currently Linux nuxx.net server and a Mac at home).
  2. Put a copy of borgbackup.sh on the source, make it executable, and restrict it just to your backup user. Edit the variables near the top as needed for your install, and the paths to back up in the create section.
  3. Create an account on the destination which accepts remote SSH connections via key auth from the backup user on the source.
  4. Restrict the remote ssh connection to running only the borg serve command, to a particular repository, and give it a storage quota:
    restrict,command="borg serve --restrict-to-repository /Volumes/Rusty/borg/servername.nuxx.net",from="10.0.0.10" ssh-rsa AAAAB3[…snipped…] root@servername.nuxx.net
  5. On the source server, turn off history (to keep the passphrase from ending up in your history; in bash: set +o history), then set the BORG_REPO, BORG_PASSPHRASE, BORG_REMOTE_PATH, and BORG_RSH variables to match what the script defines, so you can test by hand:
    export BORG_REPO='ssh://user@server.example.com:220/path/to/repo/location/'
    export BORG_PASSPHRASE='PASSPHRASEGOESHERE'
    export BORG_REMOTE_PATH=/usr/local/bin/borg
    export BORG_RSH='ssh -oBatchMode=yes'
  6. From the source, initialize the repo on the destination (this uses the environment variables): borg init --encryption=repokey-blake2
  7. Perform the first backup as a test, backing up just a smallish test directory, such as /var/log: borg create --stats --list ::{hostname}-{now:%Y-%m-%dT%H:%M:%S} /var/log
  8. Remotely list the backups in the repo: borg list
  9. Remotely list the files in the backup: borg list ::backupname
  10. Test restoring a file: borg extract ::backupname var/log/syslog
  11. Back up the encryption key for the repo, just in case: borg key export ssh://user@server.example.com:220/path/to/repo/location/
  12. Define things to exclude in /etc/borg_exclude on the source, using borg’s pattern syntax. In mine I use shell-style patterns to exclude some cache directories:
    sh:**/cache/*
    sh:**/locks/*
    sh:**/tmp/*
  13. Create the log directory, in the example as /var/log/borg.
  14. Run the backup script. Tail the log file to watch it run, see that it creates a new archive, prunes old backups as needed, etc.
  15. Once this all works, put it in crontab to run every night (a sample entry is shown after this list).
  16. Enjoy your new simpler backups! And read the excellent Borg Documentation on each command to see just how easy restores can be.
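A minimal crontab entry for step 15 might look like the following, assuming the script lives at /usr/local/sbin/borgbackup.sh and a 02:30 run time; both are just examples, so adjust for your install:

# m  h  dom mon dow  command
30   2  *   *   *    /usr/local/sbin/borgbackup.sh >> /var/log/borg/cron.log 2>&1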

Pi-hole and NBC Sports Gold

I use Pi-hole to block some ads on my home network, but I also pay for a NBC Sports Gold Cycling Pass to easily watch cycling events. Unfortunately, the default block list for Pi-hole will keep NBC Sports Gold from working, breaking it in a somewhat odd way: pages load, but the list of events doesn’t display.

So, how do you make it work? Easy! Whitelist these three sites (via the web UI, or from the command line as shown after the list):

  • experience.tinypass.com
  • geo.nbcsports.com
  • nbc.demdex.net
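Depending on your Pi-hole version these can also be whitelisted in one shot from the command line; the web interface works just as well:

pihole -w experience.tinypass.com geo.nbcsports.com nbc.demdex.net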

defaults on macOS Can Break Otherwise Usable plist Files

Today I learned that the defaults(1) command on macOS can break otherwise-parsable plist files.

I found this when trying to change selectedThrottleIndex in the Arq config (~/Library/Arq/config/app_config.plist) to adjust a bandwidth usage setting. I didn’t have access to the GUI, wanted to change things from the CLI, and whoops! After changing this, things broke.

Consider this… First I read out the setting with defaults read ~/Library/Arq/config/app_config.plist selectedThrottleIndex, then set it with defaults write ~/Library/Arq/config/app_config.plist selectedThrottleIndex 2, then read it again to verify the change. All looked good, so I figured I was done and I let things bake.

Upon getting physical access to the machine and launching the GUI, I found that Arq was no longer activated, needed my license info again, and then (uggh!) had lost my configuration. This was fixed by first restoring the whole of ~/Library/Arq from Time Machine, launching Arq, browsing to the backup destination, and clearing its cache. After rescanning the destination all was good.

So, why did it happen? Well, it turns out that Arq uses a headerless XML-ish text file with the suffix .plist. Even though it’s not in the same format as plutil(1) outputs, defaults(1) will still read it and modify it… but the resulting file is a binary plist! It seems that Arq only likes to parse its own format (note: I did not exhaustively test this — plutil(1)‘s xml1 format may work), so upon seeing the binary version the program presumed there was no config and started over. Whoops! Interestingly, the defaults(1), plutil(1), and pl(1) commands will all happily parse the Arq-format file.
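If you run into this, a couple of quick commands will show what happened and may get the file back into a text form; as noted above, I haven’t verified that Arq will accept plutil(1)’s xml1 output, so treat the conversion as a maybe:

# Show the on-disk format; after defaults(1) writes to it, this reports an
# Apple binary property list:
file ~/Library/Arq/config/app_config.plist

# Convert back to an XML plist; whether Arq will parse this is untested:
plutil -convert xml1 ~/Library/Arq/config/app_config.plist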

Note that after launching Arq with the broken config and re-adding the license, restoring just the app_config.plist file didn’t make things better, which is why I jumped right to restoring the whole ~/Library/Arq directory. There must be some config caching somewhere or so…


TrainerRoad PowerMatch: Disabled

After getting the CycleOps Hammer smart trainer I’ve been experimenting with how it, my Stages power meter on the Salsa Vaya, and the TrainerRoad PowerMatch function work together. In short, PowerMatch is designed for those who have a power meter on their bike and want to ensure that the resistance and workouts are consistent indoors (with the smart trainer) and out (with just the power meter). While I only use power data for training indoors during the winter (never for outdoor training), I do like to look it over after outdoors rides and thus want to be sure the two are as in line as possible.

To automatically handle differences in power meters, TrainerRoad’s PowerMatch calculates the offset between the power meter and smart trainer, then adjusts the resistance every 10 seconds to accommodate the difference. On its face this makes sense to me, but whenever I’d enable it the resistance would get a bit surge-y feeling during harder efforts, leading me to think something wasn’t quite right. I suspect this is because the Stages is single-leg, I likely have a bit of an imbalance between legs, and my recent rides in TrainerRoad have shorter efforts than the sustained stuff that single-leg meters are best at. So, I started to think about whether something else was right for me.

In TrainerRoad, on the Power Meter settings, there is a toggle to use the meter for cadence only. In the Smart Trainer setting the options for PowerMatch are Auto, Disabled, or a Manual offset. This results in the following scenarios:

Power Meter Normal, PowerMatch Auto: Displayed power is from meter, with this data used by PowerMatch to determine resistance. Occasional surging feeling, but overall good. While riding it appeared that power data jumped around.

Power Meter Normal, PowerMatch Disabled: Displayed power is from meter, resistance set by trainer’s internal meter. Displayed power data appeared low (5-10W) when holding an interval steady.

Power Meter Cadence Only, Power Match Disabled: Displayed power is from trainer, resistance set by trainer’s internal meter. Feels smooth (no surging) but seemed marginally easier than with PowerMatch.

Power Meter Normal, Power Match Manual: Displayed power is from meter, resistance set by trainer offset adding/subtracting manual value; no automatic adjustment of offset done.

With Power Meter Normal, Power Match Manual looking most like what I wanted, I set out to determine the offset between my Stages meter and the CycleOps Hammer. To do this I first ensured the Stages and Hammer were calibrated. TrainerRoad was set to have PowerMatch disabled and the Stages meter to Use Cadence Only. My Garmin Edge 520 recorded power from the Stages meter and TrainerRoad recorded the Hammer’s data. I then rode a custom TrainerRoad workout that has a warmup, then a series of 1 minute intervals at 60%, 70%, 80%, 90%, 100%, and 110% of FTP with 30 second 50% rests between, in Erg mode so the trainer resistance adjusted automatically. After this I did a 30 minute Free Ride where I shifted to adjust speed (and thus power), trying to get a good mix of steady state and short/hard intervals, under different situations, to get sane data to compare. To cut down on data misalignment both of these rides were non-stop spins without pausing, doing my best to start and stop the Garmin along with the TrainerRoad workout.

Both of these sets of data were then compared in DC Rainmaker‘s Analyzer tool, with the results visible here:

Comparing the two, I see two notable things, both most visible in the 30 Minute Freeride:

  1. Sudden transitions decreasing power show 0 values when measured by the Hammer, but still some power with the Stages.
  2. Hammer seemed to measure higher, with the variance becoming greater as power output became higher.

I don’t believe the sudden transitions are a concern nor really something that can be accounted for, and I don’t think they’ll be a problem for the normal Erg mode workouts where the main desire is to have the trainer providing resistance. This is likely a simple side effect of the large flywheel in the trainer taking a while to slow.

For the scaling disparity between the Stages and Hammer, maybe there’s something there… I’m tending to think that the Stages is reading higher on very hard efforts because with these I’m more apt to be standing and shoving down on the pedals versus a smooth spin. Perhaps this is throwing off the strain gauge? Let’s see…

Here’s how the average powers worked out:

Workout        Meter    Average Power (W)    Weighted Average Power (W)
Custom Test    Stages   173.50               191.70
               Hammer   174.77               194.20
Free Ride 30   Stages   209.87               228.25
               Hammer   211.03               234.05

 

Since I’m trying to compare power meters themselves, I’m looking at Average Power. (I don’t want to use Weighted Average, because this gives increasing priority to greater power outputs, since they are harder on one’s body. For example, it’s a way of reflecting how 300 W feels more than 2x as hard as 150 W.)

Across these two rides the two meters are within ~1 W of each other. While I originally went into this investigation looking to see how much of an offset I’d have to set up in Power Match Manual, I’m now thinking that the Hammer is close enough to the Stages to simply use Power Meter Cadence Only, PowerMatch Disabled. It’s possible this could skew a bit more as I do higher power efforts, but I think this will probably still be within sane ranges. It’s rare that I’ll see the 100W difference like in the high power effort of the Free Ride 30 test (700-800 W range), most things will be in the 200W-300W steady state range where alignment seems sane.

This will cut down on the surging that seems to be coming from the single-leg power meter, still provide sufficient correlation between indoor and outdoor efforts, all while having the benefits of an Erg mode smart trainer.

It’d be nice if TrainerRoad offered some sort of percentage correction, but perhaps this is why PowerMatch instead does a frequent reassessment and is turned on by default; better to check the offset and correct than to attempt to figure out a scale. Being able to see how PowerMatch is working internally would be nice, but I’m not sure this would add any real value to the product.


Easy Web Page Load Timing Comparison via Bookmarklet

I needed to get rough metrics of web page load times across different browsers. While the built-in development tools are good for fine-grained timing, outside of paid tools (eg: HttpWatch) or browser-specific ones (the Page load time Chrome extension) I couldn’t find anything easily available for general page load time.

It turns out that the time between performance.timing.navigationStart and performance.timing.loadEventEnd does a good job of this, as it shows the time elapsed between when the browser starts navigating to the new page and when it feels the page is done loading (the load event handler is terminated).

A bookmarklet containing the following can be used to do this:

javascript:(function(){ var loadtime = (performance.timing.loadEventEnd - performance.timing.navigationStart) / 1000; alert("Page Load Time: " + loadtime + " sec"); })();

This link can be dragged and dropped to your Favorites / Bookmarks bar to easily create a bookmarklet with this content: Page Load Time


Full CD Collection Ripping Workflow

Back in 2003-2004 I ripped every CD I owned to 192kbps AAC, a very good sounding format which was cost effective to store on disks of the time. This was a great achievement, and for the last 12 years I’ve enjoyed having all of my music in a central digital format. Now that storage is cheaper and I have some time, and before data rot sets in, I wanted to re-rip the collection to the archival-quality Apple Lossless format. (This format was chosen for compatibility with my currently preferred software and hardware players, and being lossless it can easily be transcoded to other formats as needed.)

My previous ripping operation was performed with a dual CPU PowerMac G5 and iTunes. While iTunes’ eject-after-import feature facilitates disc swapping, the workflow was selecting a stack of CDs, inserting them one at a time, then manually validating tags and artwork. At the time CD metadata providers didn’t have many of the discs, so a great deal of manual cleanup was needed. This was tedious, and something I didn’t want to repeat…

Not to mention there was no good way to assess the accuracy of the resulting rips…

Twelve years later, with better tools available, and finding some time in the form of a holiday break (Christmas Eve to after New Year’s), I began thinking about re-ripping my CD collection. Having recently moved my web hosting (this site, nuxx.net) to Linode, my old server was sitting unused at home and I only needed some modern software and an autoloader to set up a high performance CD ripping workstation. The requisite tools were purchased and I got to work. 411.3 GB later and almost all of my physical CDs (1241 albums) have been imported as cleanly as possible, ready for listening on a myriad of devices, hopefully for years to come. (Duplicates were not ripped.)

Since ripping an entire CD collection is a desire of many friends of mine I wanted to share the general setup and workflow that I used. It went well and was mostly hands-off, with only occasional manual intervention needed when the auto-loader jammed, and then tool-assisted cleanup of tags and artwork. By using high quality ripping software which supports AccurateRip I was able to ensure that the vast majority of ripped tracks are affirmed error-free and effectively of digital archival quality.

Most Inaccurate tracks were caused by damaged discs; typically reflective layer scratches, cracks, or scuffed polycarbonate.

Success Results:

  • 14069 tracks (92.3%) are Accurate
  • 144 tracks (0.95%) are Inaccurate
  • 1024 tracks (6.7%) were not in the AccurateRip database.
  • 1 track was a hidden track 0 in the pregap, which had to be ripped specially, and thus couldn’t be checked with AccurateRip.

Hardware / Software Used:

  • HP ML110 G7 w/ 24GB RAM, 64GB SSD, 3x 1TB HDD, USB 3.0 Card, AMD Radeon HD 5450 (Needs to be a sufficient computer to handle ripping, encoding, and displaying Aero graphics with ease. Built-in video card was not sufficient.)
  • Dell 2005FPW Monitor (1680×1050)
  • Acronova Nimbie USB Plus NB21-DVD
  • HP SATA DVD Drive (Internal, identifies as hp DVD-RAM GH80N.)
  • Samsung USB DVD Drive (External, identifies as TSSTcorp CDDVDW SE-218CB.)
  • Epson Perfection 3170 Scanner
  • Windows 7 Professional
  • dBpoweramp and PerfectTUNES (w/dBpoweramp Batch Ripper and Nimbie Batch Ripper Driver)
  • Mp3tag (Incredibly useful tagging tool with powerful scripting.)
  • CD cases organized into numbered boxes of 30-50 discs.

Workflow:

  1. Use dBpoweramp Batch Ripper to rip all CDs. Label output folders by the box numbers containing each CD; this will make manual metadata validation/cleanup easier.
  2. For each disc that is rejected, use dBpoweramp CD Ripper to rip the entire disc. This is likely a metadata issue as CD Ripper has access to more metadata providers than Batch Ripper. Or it may be the drive failing to recognize the disc.
  3. Use AccurateRip from PerfectTUNES to scan the entire directory structure, then use the built-in information tool to get a text file listing all “InAccurate” tracks.
  4. For each disc with “InAccurate” listings:
    1. Delete entire disc. (I opted to do this instead of re-ripping just the InAccurate tracks, as metadata differences between CD Ripper and Batch Ripper could lead to file names that are clumsy to fix.)
    2. Look disc over and clean if necessary. This, coupled with using a different drive, seems to resolve about 50% of ripping issues. Be sure to use proper technique: soft/clean/low dust cloth, wiping from inside to out (not circularly).
    3. Re-rip discs using dBpoweramp CD Ripper in fast mode.
    4. Re-rip individual tracks with secure mode as needed. Note that re-reading of bad frames can take hours per track, and that some tracks just won’t match AccurateRip or even rip securely. (Some of my discs are sufficiently damaged that I was not able to rip certain tracks.)
    5. Some discs may need to be ripped with Defective by Design settings, particularly in case of copy protection.
    6. Make a separate list of discs which have accepted issues.
  5. Run AccurateRip (part of PerfectTUNES suite) to confirm all tracks and check against list of accepted issues. Repeat #4 as needed.
  6. Run the Fix Albums tool within Album Art (part of PerfectTUNES suite) to attempt automatic acquisition of artwork for all albums.
  7. Start with the PerfectTUNES suite components to fix artwork and metadata:
    1. In ID Tags go through a series of directories at a time sanity-checking metadata. Compare each CD case to the tags as needed, confirming that artwork looks sane. Keep a document listing artwork to later review.
    2. Use the Album Art tool to attempt bulk fix of art on all albums.
    3. One box at a time add artwork using a scanner and online resources. Then fix Low Resolution artwork.
      1. An Epson Perfection 3170 scanner connected and configured in automatic mode, clicking … next to an album then Acquire (from Scanner) will automatically scan, rotate, and crop artwork from a scanner. This seems to fail on thick-case (Digipak, clamshell) albums and is inconsistent on mostly-white artwork.
      2. Decent artwork can be obtained from Discogs.
      3. Add observed metadata errors to a document for later review.
  8. Use Mp3tag to clean up tags. Useful filters and suggestions include:
    1. (albumartist MISSING) AND (compilation MISSING) – Find tracks that are not part of compilations but did not get Album Artist set.
    2. %_covers% IS "" – Find tracks without artwork.
    3. compilation IS 0 – Find tracks with Compilation set to No. This can be removed using the Extended Tags editor.
    4. "$len(%year%)" GREATER 4 – Find tracks with Year fields longer than four digits (some metadata includes month and day).
    5. (totaldiscs IS 1) AND (discnumber GREATER 1) – Find disc numbers higher than 1 when the total number of discs in the album is 1.
    6. Selecting all tracks, exporting to CSV, then reviewing in a spreadsheet program can help find misspellings, duplicates, etc. For example, look at unique values in the Artist column to find misspellings like “Dabyre” vs. “Dabrye” or “X Marks The Pedwalk” vs. “X-Marks The Pedwalk”.
    7. NOT %ACCURATERIPRESULT% HAS ": Accurate" – Show all tracks that do not contain the AccurateRip header indicating an accurate rip.
  9. Fix any noted errors using a combination of ID Tags, Album Art, and Mp3tag.
  10. Prepare for archiving by renaming all files using Mp3tag:
    1. The following Format string for the ConvertTag – Filename renamer will move files to e:\final_move with the following formats, without most invalid characters for Windows filesystems, truncating the artist, album, and track lengths to 64 characters: e:\final_move\$if(%albumartist%,$validate($replace(%albumartist%,\,_),_),Various Artists)\$validate($replace(%album%,\,_),_)\$if(%albumartist%,$replace($validate($left(%artist%,64)-$left(%album%,64)-%discnumber%_$num(%track%,2)-$left(%title%,64),-),\,_),$replace($validate($left(%album%,64)-%discnumber%_$num(%track%,2)-$left(%artist%,64)-$left(%title%,64),-),\,_))
      1. Tracks with Album Artist set: ...\Artist\Album\Artist-Album-Disc#_Track#-Title.ext
      2. Tracks without Album Artist set (compilations): ...\Various Artists\Album\Album-Disc#_Track#-Artist-Title.ext
  11. Make a backup. Make many backups…
  12. Import into your preferred music player. In my case, iTunes on OS X:
    1. Album or artist at a time, delete the old, MPEG-4 versions and import the Apple Lossless tracks.

Issues:

  • Autoloader would occasionally jam. Seemed to be caused by:
    • Discs sticking together; ensuring they are gently placed in the loader seems to help.
    • Some discs are particularly thin or thick; these would often fail to load properly. Manually rip these.
  • Autoloader only supports standard size CDs, so mini or artistically cut CDs must be ripped in a normal drive. The USB drive is a laptop-type with a snap-lock spindle, which is better for artistically cut CDs.
  • Off-balance artistically cut CDs must be ripped at 1x to mitigate vibration. This can be problematic during initial spinup.
  • Some discs didn’t read well in one drive or another. If a rip would not be error-free in one drive, it’d frequently be fine in another.
  • Some discs are not present in AccurateRip.
  • Dirty discs caused more problems than I’d anticipated. Discs seemed to be scratched or dirty for a number of reasons:
    • Previous poor storage techniques (DJ or CaseLogic-style slip cases).
    • Outgassing of liner notes caused cloudy white buildup on some discs; could be removed with alcohol.
    • Discs lacking paint over the reflective layer are more susceptible to damage; particularly if stored in slip cases.
  • I suspect that less-common discs may have invalid information in AccurateRip. (Tip: The number after the Accurate Rip CRC indicates how many other users the rip matched with.)
  • In order to use either dBpoweramp CD Ripper or Batch Ripper via Remote Desktop, the following group policy must be enabled: Local Computer Policy → Computer Configuration → Administrative Templates → System → Removable Storage Access → All Removable Storage: Allow direct access in remote sessions. This setting is detailed in this article from Microsoft.
  • PerfectTunes Album Art will lump together albums with differing versions specified with parenthesis. For example, “Pearl’s Girl (Tin There)” and “Pearl’s Girl (Carp Dreams…Koi)” will both show up simply as “Pearl’s Girl”. This can make artwork assignment challenging. To work around this I’d name albums something like “Pearl’s GirlTin There)” until artwork is assigned, then change the name afterward.
  • PerfectTunes ID Tags will occasionally fail to set the Compilation tag when it is the only attribute being edited. Work around this by using Mp3tag and editing the extended tag COMPILATION to 0 or 1.
  • PerfectTunes Album Art will not always show missing art. Work around this by using Mp3tag with filter %_covers% IS "" to find specific tracks without art assigned.
  • Mp3tag had issues renaming the artist “Meanwhile, Back In Communist Russia…” due to the ellipsis at the end. By replacing the three separate dots (...) with a precomposed ellipsis (…) the issue was resolved.

Cost:

  • Time: Hard to fully quantify, but overall process took about four weeks of spare time. Most time was spent waiting for the autoloader on initial rips and then manually cleaning up artwork and metadata issues. I was typically able to run 3-4 boxes of discs through the autoloader per day, then spent some lengthy evenings working on tagging and artwork. The use of the autoloader then PerfectTUNES and Mp3tag made the process feel very efficient.
  • Hardware:
    • Acronova Nimbie USB Plus NB21-DVD: $569.00
    • USB 3.0 Card: $19.99
    • Internal Power Cable: $1.99
    • AMD Radeon HD 5450: $31.76
  • Software:
    • dbPoweramp and PerfectTUNES bundle: $58