Microsoft have a big problem with their patching and QA processes
My original intention was to write a fairly simple blog post going through the recent chkdsk fix, as seen in this Register article here.
My plan was to take a look at the chkdsk files so I could show a way of validating them across a network and confirming that the version installed on your machines was bug free. I also wanted to point you to the patch so you could run the same tests I did and validate the results.
Unfortunately I can't do that. The Microsoft patching and QA process is such a mess that I want to highlight the range of issues I hit while trying to do something that should be fairly simple. But I'm getting ahead of myself.
Let me start at the beginning, with the opening paragraph of the Register article above:
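For what it's worth, the kind of check I had in mind could look something like the sketch below. It is only a sketch: it hashes a list of files and reports any that differ from a hash you already trust. The function names are my own, and both the list of paths and the known-good hash are assumptions you would have to supply yourself.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(paths, known_good: str) -> list:
    """Return (path, digest) pairs for files whose hash differs from known_good.

    Paths that cannot be read are reported with a digest of None.
    """
    mismatches = []
    for path in paths:
        try:
            digest = sha256_of(Path(path))
        except OSError:
            digest = None
        if digest != known_good:
            mismatches.append((str(path), digest))
    return mismatches
```

You would feed it one path per machine, for example an admin share path to autochk.exe on a hypothetical host called PC01, plus the hash taken from a copy of the file you have already confirmed is good.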
A Windows 10 update rolled out by Microsoft contained a buggy version of chkdsk that damaged the file system on some PCs and made Windows fail to boot.
The article then adds the line:
The updates that included the fault are KB4586853 and KB4592438.
So, it should be pretty easy to get the list of files in those patches and compare the versions of chkdsk, right?
Well, no. I use Chrome, and the latest build has a new feature that warns when a download comes over an invalid connection.
Windows Update doesn't use an insecure connection, but the secure connection it does use is invalid because the certificate does not match the URL of the site:
Bearing in mind that this is literally the download location for patches, I for one expect MS to have decent, if not top-notch, security on the site.
I would expect the site to be behind a Web Application Firewall (WAF) (I don't know if it is or not), with at least the HSTS header set and older protocols blocked. I thought it worth double checking via the Qualys SSL test site, and the result is pretty dismal:
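If you want a quick sanity check of your own without the Qualys site, a rough sketch along these lines would do. It is nowhere near as thorough as the Qualys test: it only reports the TLS version that was negotiated and whether an HSTS header came back. The function names are mine, and the host you pass in is whatever site you want to test.

```python
import http.client
import ssl


def parse_hsts(header_value):
    """Parse a Strict-Transport-Security header into a dict of directives.

    Returns None when the header is absent or empty.
    """
    if not header_value:
        return None
    directives = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. includeSubDomains) are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


def check_site(host: str, timeout: float = 10.0) -> dict:
    """Connect over HTTPS and report the TLS version and any HSTS header.

    The default context verifies the certificate chain and the hostname,
    so a certificate that does not match the host raises
    ssl.SSLCertVerificationError during connect().
    """
    context = ssl.create_default_context()
    conn = http.client.HTTPSConnection(host, timeout=timeout, context=context)
    try:
        conn.connect()
        # Read the TLS version before the response can close the socket.
        tls_version = conn.sock.version()
        conn.request("HEAD", "/")
        response = conn.getresponse()
        hsts = parse_hsts(response.getheader("Strict-Transport-Security"))
        return {"tls_version": tls_version, "hsts": hsts}
    finally:
        conn.close()
```

Calling check_site with the host name of the download site either returns the two values or fails with a certificate error, which is roughly what the browser warning is telling you.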
That's the first round of issues. To be fair to MS, they are not show stoppers but it is a very poor setup for the site that hosts patches. Moving on!
What about the update file itself?
The one I decided to look into more closely is KB4592438.
The issue with chkdsk is listed close to the bottom of the page and I've screencapped it with the relevant part highlighted:
This confuses me. The update itself has a problem with chkdsk, yet the text says that the issue is resolved and neglects to say how it was resolved. The text talks about it taking "24 hours for the resolution to propagate to non-managed devices".
What resolution? Is this a patch that is downloaded automatically but not listed on the download page?
And what is a "non-managed device"? Is it one that's not on the domain? One that isn't in Intune? The text doesn't mention anything else about these devices.
The rest of the text is just as baffling:
enterprise managed devices that have installed the update AND encountered the issue, it can be resolved by installing and configuring a special group policy
If a group policy resolves it, then I guess that managed devices must be on a domain and that is likely the difference between managed and non-managed. But seriously, why not just say "those on a domain" and make it clearer?
I should also point out that the issue that some machines encounter is that they cannot boot. Just how does a group policy help fix a machine that cannot boot?
Anyway, the group policy link is actually an MSI file. Opening that up shows two group policy template files. I suspect that they are in an MSI so that they'll install into the policy definitions folder in SYSVOL, but I never tested this as I just extracted the files using 7-Zip and placed them into the necessary folders. I then launched GPMC expecting to see some information on what the GPO does, and oh boy was I wrong:
If I'm reading it correctly, this GPO, called KB84586853 issue 002 rollback (catchy name!), will either enable a feature preview if it is set to enabled or roll back something if it is set to disabled.
What does this actually mean? This is supposedly a fix for a known issue. Why does the GPO talk about feature previews? What sort of feature preview would you even have with chkdsk? One that doesn't corrupt the disk, presumably!!
I suspect that the text is just generic text that has been copied across from other GPOs, but that doesn't excuse the fact that the language is difficult to comprehend in light of what it is supposed to cover, and it doesn't actually say what it is disabling or enabling.
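One way to get slightly further than the GPMC text is to read the template file directly and see which registry key and value each policy writes. The sketch below assumes the usual ADMX layout, where each policy element carries key and valueName attributes and optional enabledValue and disabledValue children. The function names are my own, and I have not confirmed that the file in Microsoft's MSI follows this exact shape.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the XML namespace from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def state_value(policy, state_tag: str):
    """Return the value attribute under enabledValue or disabledValue, if any."""
    for child in policy:
        if local_name(child.tag) == state_tag:
            for item in child:
                return item.get("value")
    return None


def list_policies(root) -> list:
    """Return the registry details of every policy element under root."""
    policies = []
    for element in root.iter():
        if local_name(element.tag) != "policy":
            continue
        policies.append({
            "name": element.get("name"),
            "class": element.get("class"),
            "key": element.get("key"),
            "valueName": element.get("valueName"),
            "enabled": state_value(element, "enabledValue"),
            "disabled": state_value(element, "disabledValue"),
        })
    return policies
```

To use it, load the extracted .admx file with ET.parse(path).getroot() and pass the result to list_policies. That tells you where the policy writes, which is at least something to put in a change request, but it still doesn't tell you what Windows does with the value.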
If I deploy this GPO into any corporate environment, I'd expect to have to explain the technical details in a change request. I just don't have any details of what the GPO does or how it does it, and that is a very poor show from Microsoft.
The fix, whatever it is, should be a patch containing just the files necessary to fix this issue. It should have a proper name and not "KB84586853 issue 002 rollback". It should at least mention chkdsk or autochk: chkdsk is the command, but autochk is actually the file that runs the check disk process.
I will be keeping an eye on the next set of patches to see if MS provide any more information or a proper fix for chkdsk. The next set of patches is only a few days away, so I would like to do a follow-up blog to this in a few weeks' time, if time allows of course.
Feedback and comments are always appreciated, either here or on Twitter @garyw_