Restore missing WX zones Due to a filtering mistake when assembling the zlist file, 2.4.3 shipped with no NWS WX zone associations. Brown bag fix releasing as 2.4.4.
Prepare for 2.4.3 release Switch to the 2022 US Census Bureau data, March 2023 NWS WX zones, latest OurAirports open data set, and refreshed active forecast and station lists. Regenerate all correlation sets based on these updated sources.
Don't use U mode, removed in python3.11 This patch was helpfully submitted by Bas Couwenberg to drop use of the universal newline flag, since Python 3.11 no longer supports it. It probably breaks the ability to build new correlation files under Python 2.7 and earlier, but since it shouldn't affect operation of the utility with prebuilt correlations (the way it's typically used), this change is not yet considered to drop Python 2.7 support altogether.
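A minimal sketch of reading a file with universal newline handling but without the removed "U" flag. It assumes io.open(), which exists in both Python 2.7 and 3.x; the helper name read_text is hypothetical and is not taken from the utility itself.

```python
import io


def read_text(path):
    """Return the contents of path with line endings normalized to LF."""
    # newline=None is the text-mode default and enables universal
    # newlines, so CRLF and bare CR are both translated to LF on read.
    with io.open(path, "r", newline=None, encoding="utf-8") as handle:
        return handle.read()
```

Passing newline=None explicitly is redundant, but it documents the behavior that the old "U" flag used to request.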
Drop vestigial import of the tarfile module The correlate() function stopped needing tarfile a couple of years ago (version 2.4), but it was overlooked that the script continued to unnecessarily import it. Clean this up.
Force UTF-8 locale when reading configs and data Apparently, Python on Windows defaults to assuming CP1252 encoding unless otherwise specified, as opposed to the UTF-8 assumption made on POSIX platforms. Since our configuration and data files are expected to always use UTF-8 encoding, be clear in the ConfigParser.read() calls about that. We only do this under Python 3.x, as that method doesn't have an encoding parameter in 2.7. Thanks to Lance Bermudez for reporting this.
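As an illustration of the change described above, here is a small sketch of passing an explicit encoding to ConfigParser.read() under Python 3. The function name load_config is hypothetical; only the encoding parameter of read() is the point.

```python
import configparser


def load_config(paths):
    """Parse one or more config files, always decoding them as UTF-8."""
    parser = configparser.ConfigParser()
    # Without encoding=, Python 3 falls back to the locale's preferred
    # encoding, which is commonly CP1252 on Windows.
    parser.read(paths, encoding="utf-8")
    return parser
```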
Refresh correlation data and update copyright year Just a basic correlation update based on more recent active METAR station and WX zone lists. Also update the copyright year for files which have been edited so far in 2021 as well as in the LICENSE file.
Correct handling of boolean selections The selections proxy class, which mashes together command-line arguments and configuration options, contained a longstanding and fatal flaw with its handling of boolean values. In particular, falsey values were consistently treated as truthy due to naively recasting str to bool (which will always yield True unless empty). This went unnoticed for so long because the majority of these settings default to False, meaning the only reason most users had to set them was to override them to True. Many thanks to Jordan Russell for bringing this bug to my attention, and for supplying an initial patch on which this fix is heavily based. Co-Authored-By: Jordan Russell
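The pitfall described above can be shown in a few lines. This is only a sketch, not the actual fix: the helper name to_bool and the set of accepted strings are assumptions made for illustration.

```python
def to_bool(value):
    """Interpret a command-line or config value as a boolean."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    # Assumed spellings; the real utility may accept a different set.
    if text in ("1", "yes", "true", "on"):
        return True
    if text in ("0", "no", "false", "off", ""):
        return False
    raise ValueError("not a boolean value: %r" % (value,))


# Naive recasting treats any non-empty string as True.
naive = bool("False")
# Parsing the text gives the intended result.
parsed = to_bool("False")
```

Here naive ends up True while parsed is False, which is exactly the difference that went unnoticed while most settings defaulted to False.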
Correct and simplify URLError exception handling Julien Palard pointed out that the way URLError exceptions were being manually cobbled into the stderr stream wasn't quite working (thanks!), but it was also unnecessarily complicated for reasons I don't recall now. Rip most of it out and just go with a basic catch/error/re-raise there instead.
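A rough sketch of the basic catch/error/re-raise pattern, written for Python 3. The function name fetch, the opener parameter, and the wording of the error message are all hypothetical.

```python
import sys
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen):
    """Return the body at url, reporting and re-raising URLError."""
    try:
        return opener(url).read()
    except URLError as error:
        # Report the failure on stderr, then let the original exception
        # propagate with its traceback intact.
        sys.stderr.write("error: failed to retrieve %s (%s)\n"
                         % (url, error.reason))
        raise
```

The opener parameter exists only so the example can be exercised without network access.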
Prepare for 2.4.1 release Update the version string in the project and manpages.
Make missing alert URLs non-fatal As a more complete fix and future-proofing for the earlier mismatch between default_atypes and the alert URLs generated for WX zones during correlation, stop aborting and simply add a warning if a requested alert type has no corresponding URL.
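The warn-and-continue behavior can be sketched as follows. The function name collect_alert_urls, the dict-based zone record, and the message text are assumptions for illustration only.

```python
import sys


def collect_alert_urls(zone, atypes):
    """Return the URLs for the requested alert types that zone defines."""
    urls = []
    for atype in atypes:
        url = zone.get(atype)
        if not url:
            # Previously a fatal condition; now just warn and skip it.
            sys.stderr.write("warning: no URL defined for alert type %s\n"
                             % atype)
            continue
        urls.append(url)
    return urls
```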
Correct default_atypes to match what's generated Kevin Monceaux reported a regression with the 2.4 release. Running with the -a/--alert option and no limited --atypes or atypes override in weatherrc resulted in a message about undefined URLs and no normal output. This problem crept in when hard-coding alert types in the correlator after ditching the woefully unmaintained zonecatalog.curr.tar data source (commit 8a37edd). Update default_atypes so that it covers all relevant non-forecast URLs the correlate routine embeds.
Make the build reproducible While auditing Debian's packages, Chris Lamb reported[*] that weather-util's correlation set generation is not reproducible because it embeds timestamps without a means to override them and also varies by system timezone. Allow SOURCE_DATE_EPOCH from the calling environment and assume UTC rather than relying on locale settings when no timezones are specified. [*] https://bugs.debian.org/964721
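One way to honor SOURCE_DATE_EPOCH and format the result in UTC is sketched below. The function name build_timestamp and the output format are hypothetical; only the environment variable name comes from the report.

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return a UTC timestamp string, overridable via SOURCE_DATE_EPOCH."""
    # Fall back to the current time when the variable is not set.
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    # gmtime() ignores the system timezone, so output is always UTC.
    return time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime(epoch))
```

Calling build_timestamp({"SOURCE_DATE_EPOCH": "0"}) yields "1970-01-01 00:00:00 UTC" regardless of the local timezone.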
Get correlate() working in modern Python 3 Update a bunch of the parsing for various correlation source files to work in both Python 2.7 and 3.5+, mostly where str vs bytes and UTF-8 encoding/decoding are concerned. This can be cleaned up significantly once support for 2.7 is finally dropped.
Be more thorough about file copyrights Add a copyright header to the .gitignore file with start and end years determined from its commit history. Add copyright headers for the current year to overrides.log and qa.log, and also add functionality to correlate() which adds these headers from now on. Update the copyright year on overrides.conf, which was missed in 8a37edd and later commits. All files tracked in this repository now declare a copyright and refer to the main LICENSE file for licensing terms.
Don't use "is" with a literal to test for equality Solve a SyntaxWarning under Python 3.8 and later for use of the "is" identity operator when comparing literals, by replacing with the "==" equality operator.
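The warning can be reproduced without touching the real code by compiling small snippets, as in this sketch for Python 3.8 and later. The helper name syntax_warnings is hypothetical.

```python
import warnings


def syntax_warnings(source):
    """Compile source and return the SyntaxWarning objects it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [w for w in caught if issubclass(w.category, SyntaxWarning)]


# Identity comparison against a literal triggers the warning.
with_is = syntax_warnings("x = 1\nx is 1\n")
# Equality comparison is what was intended, and compiles cleanly.
with_eq = syntax_warnings("x = 1\nx == 1\n")
```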
Caching support for URLs with port numbers When mangling URLs of fetched data to store in the local cache, only split on the first colon so that URLs with port numbers in them are properly differentiated. Previously, all URLs for the same domain name landed in a single file if a port number was included, causing incorrect results to be returned from the cache.
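The effect of limiting the split to the first colon is sketched below. The function name cache_key and the rest of the mangling (stripping slashes, replacing "/" with "_") are assumptions, not the utility's actual scheme.

```python
def cache_key(url):
    """Derive a cache file name from url, keeping any port number."""
    # maxsplit=1 keeps everything after the scheme separator intact,
    # including a ":port" suffix on the host name.
    _scheme, remainder = url.split(":", 1)
    return remainder.lstrip("/").replace("/", "_")


# An unlimited split drops the port and path, so both URLs collide.
naive_a = "https://example.org:8080/data".split(":")[1]
naive_b = "https://example.org:8443/data".split(":")[1]
```

With the unlimited split, naive_a and naive_b are both "//example.org", while cache_key() returns distinct names for the two URLs.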
Use a dedicated field for cached search timestamps Fix a cache corruption issue by using a new "cached" field to hold the timestamp for cached correlation search results. Previously the "description" field was being overloaded, but this could cause the cache to no longer load because of duplicate fields.
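To show why a duplicated field can stop a cache from loading, here is a sketch that assumes the cache is an INI-style file read with Python 3's configparser. The section name, the values, and the helper try_load are hypothetical.

```python
import configparser


def try_load(text):
    """Return a parser for text, or None if it has duplicate fields."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.DuplicateOptionError:
        # The default strict mode rejects a repeated option name.
        return None
    return parser


# Overloading "description" for the timestamp duplicates the field.
overloaded = "[search]\ndescription = Raleigh\ndescription = 1600000000\n"
# A dedicated "cached" field avoids the collision.
dedicated = "[search]\ndescription = Raleigh\ncached = 1600000000\n"
```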
Decode retrieved files as UTF-8 even on Python 2 Python 2.7 is likely the only Python 2 anyone is using any longer (even that's well past EOL upstream now), and reasonably recent versions of 2.7 need the same decode hack as Python 3 anyway when dealing with some retrieved content. Just get rid of the version detection and do it under any version.
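A minimal sketch of decoding retrieved content without any version detection. The function name decode_retrieved is hypothetical.

```python
def decode_retrieved(data):
    """Return retrieved content as text, decoding bytes as UTF-8."""
    # No interpreter version check: bytes are decoded on any Python.
    if isinstance(data, bytes):
        return data.decode("utf-8")
    return data
```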
Add weather zone hkz000 for Hong Kong Observatory Thanks to Bill Agee for suggesting the Hong Kong Observatory's weather forecast page. A custom filter is implemented to strip the forecast text from the HTML page in which it is embedded (if anyone finds a plaintext version published at an alternate URL, let me know and I'll rip out the extra routine).
Update correlation sources Remove the stale metar.tbl and zonecatalog.curr.tar, which the USA NWS hasn't been updating for many years, and add the public domain airports.csv file from the amazing ourairports.com community. Also update to latest (2019) USA Census Bureau location data, March 2020 WX zone information, cooperative sites list from 2018 (latest), and regenerated active station and zone lists. Loss of the zonecatalog necessitates directly applying various forecast and alert URL patterns, though some which appeared unused by NWS for many years were not included. Clear out all old overrides, since the vast majority are obsoleted by refreshed data, and build fresh correlation sets from the above sources. Basically all sites have switched from HTTP to HTTPS, so update URLs for this too.