Trials of a Network Admin

Fixing problems nobody else seems to have.

VSS Snapshots Fail on Server 2008 or 2008 R2

Posted by James F. Prudente on March 8, 2017
Posted in: Exchange, Windows Server.

Recently our Exchange 2013 server – running on Windows Server 2008 (not R2) – started having backup failures. We use FalconStor DiskSafe for backup, which, like most (all?) Exchange-aware backup applications, uses VSS to snap the server.

During the investigation with FalconStor support, we discovered VSS was timing out during the verification of the snapshot, and running “vssadmin list writers” showed a number of the VSS writers in a failed state. No updates had recently been applied to the server, nor had any configuration changes been made recently, so there was no obvious potential cause of the problem.

FalconStor support was able to point us to this link at IBM. While not an exact description of our problem, the article linked to a Microsoft utility, DevNodeClean. The description for that utility is:

“On a computer that is running Windows Server 2003 or a later version, a storage device that is connected by using a fiber channel or by using the iSCSI protocol may be connected for only a short time. When a storage device is connected, Windows creates registry information for the device. Over time, the registry may contain many entries for devices that will never be used again. This utility can be used to remove this information from the registry.”

Interesting. Searching a bit more, I came across a third-party compiled version of DevNodeClean, which has more information about the problem it resolves.

I can’t find the link now, but I also came across a page that suggested reviewing the size of the registry itself. The actual registry files are located at c:\windows\system32\config. The SYSTEM file, which represents HKLM\SYSTEM, was in this case over 1.5GB! Checking a few other servers of similar age showed SYSTEM files of 50MB or less in most cases.
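Checking the hive size on several servers is easy to script. Below is a minimal Python sketch, not something from the original troubleshooting; the 200 MB threshold is an assumption picked only because the healthy servers mentioned above were at 50 MB or less. It needs to run with enough rights to read the config folder.

```python
from pathlib import Path

# Assumed threshold: healthy SYSTEM hives in the post were 50 MB or less,
# so anything far above that is worth a closer look.
SUSPECT_BYTES = 200 * 1024 * 1024


def hive_report(config_dir, names=("SYSTEM",)):
    """Return (name, size_in_mb, suspect) for each hive file that exists."""
    report = []
    for name in names:
        hive = Path(config_dir) / name
        if hive.is_file():
            size = hive.stat().st_size
            report.append((name, round(size / 1024 / 1024, 1), size > SUSPECT_BYTES))
    return report


if __name__ == "__main__":
    for name, size_mb, suspect in hive_report(r"C:\Windows\System32\config"):
        flag = "  <-- unusually large" if suspect else ""
        print(f"{name}: {size_mb} MB{flag}")
```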

By now, I was pretty confident this was our issue. I ran DevNodeClean (the Microsoft provided one), which took – no joke – six days to complete. As it continued to delete unused devices from the registry, our backups began running again intermittently. After about five days of running the utility, all backups began working properly.

It seems what happens is that any backup product that calls VSS – FalconStor, Veeam, Microsoft DPM, etc. – causes a registry entry to be created for every single VSS snapshot. I believe the same happens when using Hyper-V snapshots. These registry entries are not cleaned up, and over time they bloat the registry to the point where it can essentially break VSS. In our case, we were snapping two LUNs every hour, so over the course of a few years of the server’s life, there were thousands of orphaned registry entries.
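To get a feel for the scale, here is a back-of-the-envelope estimate in Python. It assumes exactly one leftover entry per LUN per snapshot and a three-year lifetime; both are assumptions for illustration, not measured values.

```python
def estimated_orphans(luns, snapshots_per_day, years):
    """Rough count of leftover device entries, assuming one per LUN per snapshot."""
    return luns * snapshots_per_day * 365 * years


# Two LUNs snapped hourly, as described above; three years is an assumed lifetime.
print(estimated_orphans(luns=2, snapshots_per_day=24, years=3))  # 52560
```

Under those assumptions the count runs well into the tens of thousands, which would be consistent with a cleanup that takes days.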

There is supposedly a hotfix for 2008 R2, but it’s not clear if that fixes the problem, and in any event it’s not available for 2008.

It seems the best way to deal with this is to run DevNodeClean as a scheduled task on a regular basis.
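One way to set that up is with the built-in schtasks.exe. The Python sketch below only builds and runs the schtasks command line; the install path, task name, and weekly Sunday 02:00 schedule are all hypothetical choices. DevNodeClean is launched without any switches here, so check the utility's own help output before relying on it.

```python
import subprocess

# Hypothetical location; adjust to wherever DevNodeClean.exe was extracted.
# The path is assumed to contain no spaces.
DEVNODECLEAN = r"C:\Tools\DevNodeClean\DevNodeClean.exe"


def build_schtasks_command(exe_path, task_name="DevNodeClean-weekly", start_time="02:00"):
    """Build a schtasks.exe command that runs exe_path weekly as SYSTEM."""
    return [
        "schtasks", "/Create",
        "/TN", task_name,   # task name (arbitrary)
        "/TR", exe_path,    # program to run
        "/SC", "WEEKLY",    # schedule type
        "/D", "SUN",        # day of the week
        "/ST", start_time,  # start time, 24-hour HH:MM
        "/RU", "SYSTEM",    # run under the local SYSTEM account
        "/F",               # overwrite an existing task with the same name
    ]


def register_task(exe_path=DEVNODECLEAN):
    """Create the task; run this from an elevated prompt on the server."""
    subprocess.run(build_schtasks_command(exe_path), check=True)
```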

Unfortunately, the registry cannot be compacted online, so we will need to schedule a maintenance window to actually shrink the bloat.

Google Chrome Managed Bookmarks

Posted by James F. Prudente on October 14, 2016
Posted in: Chrome. Tagged: Chrome, favorites, Group Policy. 14 Comments

The Enterprise (MSI) version of Google Chrome includes a comprehensive Group Policy template, allowing many settings to be centrally controlled. One of these settings is “Managed Bookmarks,” which allows the administrator to push out a fixed set of bookmarks to all users.

As shown here, there are three methods to create managed bookmarks:

  • JSON strings in the GPO editor
  • JSON strings in the registry editor
  • Expanded JSON in the registry

Reviewing those options, I initially looked to method #3, as dealing directly with the registry seemed the easiest of the methods. I tested the registry entries that needed to be made, confirmed everything was OK, and proceeded to configure Group Policy to import the registry keys. And it didn’t work. To be specific, the registry keys would not import.

After spending way too much time troubleshooting, I found the problem: Chrome’s Managed Bookmarks are stored at “HKCU\Software\Policies\Google\Chrome\ManagedBookmarks” – this is in the current user’s registry hive so it should be writable. Except it isn’t. If you check permissions on “HKCU\Software\Policies”, you’ll find the current user only has READ permissions to this particular branch of the registry. You need administrator privileges to write to the “Policies” key.

The obvious question is why? The “Policies” key contains group policy settings, as set by administrative templates. If a standard user could change settings in this key, it would give them the ability to override group policy.

The difficulty this creates for the way Chrome handles Managed Bookmarks is that neither of the registry options is viable for non-admin users. I suppose they could be useful if you wanted to preload bookmarks in your default profile, but otherwise there’s no viable way to use the registry to control the bookmarks. You’re left with the first option, directly entering JSON into the Group Policy editor.

I wrote a utility that will take a folder of url shortcuts (like how IE handles favorites) and create the appropriate JSON code. It’s too rough to share here but if anyone has interest in it, please post below and perhaps I’ll clean it up.
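For anyone who wants to try the same idea, here is a minimal Python sketch of such a converter. It is not the utility mentioned above. It assumes each .url file is a standard INI-style shortcut with an [InternetShortcut] section and a URL value, and it emits the list-of-objects layout that Managed Bookmarks uses ("name" plus "url" for a bookmark, "name" plus "children" for a folder).

```python
import configparser
import json
from pathlib import Path


def read_shortcut_url(path):
    """Return the URL stored in a Windows .url shortcut, or None if unreadable."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    try:
        parser.read_string(Path(path).read_text(encoding="utf-8-sig", errors="replace"))
    except configparser.Error:
        return None
    return parser.get("InternetShortcut", "URL", fallback=None)


def folder_to_bookmarks(folder):
    """Convert a folder of .url files (and subfolders) into a bookmark list."""
    items = []
    for entry in sorted(Path(folder).iterdir(), key=lambda p: p.name.lower()):
        if entry.is_dir():
            items.append({"name": entry.name, "children": folder_to_bookmarks(entry)})
        elif entry.suffix.lower() == ".url":
            url = read_shortcut_url(entry)
            if url:
                items.append({"name": entry.stem, "url": url})
    return items


def managed_bookmarks_json(folder):
    """Single-line JSON that can be pasted into the policy setting."""
    return json.dumps(folder_to_bookmarks(folder), separators=(",", ":"))
```

Calling managed_bookmarks_json on a folder of shortcuts returns one line of JSON to paste into the Group Policy editor.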

Windows Update Logging on Windows 10

Posted by James F. Prudente on September 27, 2016
Posted in: Web Filtering, Windows 10.

For many years and through multiple Windows versions, Windows Update logged to c:\windows\windowsupdate.log – simple and functional. But Microsoft’s vision with Windows 10 seems to be fixing things that aren’t broken, and this is no exception.

Opening c:\windows\windowsupdate.log on Win 10 gives you this:

Windows Update logs are now generated using ETW (Event Tracing for Windows).

Please run the Get-WindowsUpdateLog PowerShell command to convert ETW traces into a readable WindowsUpdate.log.

For more information, please visit http://go.microsoft.com/fwlink/?LinkId=518345

Basically, in order to have a human-readable update log, you need to open PowerShell and run Get-WindowsUpdateLog, which as noted above will convert ETW logs to an old-style WindowsUpdate.log which it places on the desktop. It’s an extra step but not too bad. When it works.

Unfortunately, in some (many?) cases you will end up with a log file filled with lines like these:

1601.01.01 01:00:00.0000000 692 5220 Unknown( 31): GUID=c232d8d6-c0ba-3f2d-4ee0-3c6f4b234595 (No Format Information found).

So what’s the problem? Believe it or not, in order for Windows 10 to convert the ETW files to a human-readable log, the system needs Internet access in order to reach Microsoft’s symbol servers. This is yet another functional change that seems to ignore the reality that many larger environments and virtually every school have content filtering in place, which blocks this very access.

In order to fix this, you will need to whitelist all traffic to http://msdl.microsoft.com/download/symbols. If that still does not fix things, you will need to delete the contents of %temp%\WindowsUpdateLog and re-run the PowerShell command.
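To tell quickly whether a converted log is usable, you can count how many lines failed to decode. This is a small Python sketch; the regular expression is modeled only on the sample line shown above, so treat it as an assumption about what undecoded lines look like.

```python
import re

# Matches lines like the sample above: a bare GUID with no format information.
UNDECODED = re.compile(r"GUID=[0-9a-fA-F-]{36}\s+\(No Format Information found\)")


def undecoded_ratio(lines):
    """Fraction of non-empty log lines that failed symbol decoding."""
    total = bad = 0
    for line in lines:
        if not line.strip():
            continue
        total += 1
        if UNDECODED.search(line):
            bad += 1
    return bad / total if total else 0.0
```

Pass it the lines of the WindowsUpdate.log written to the desktop; a ratio near 1.0 suggests the symbol server is still being blocked.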

After doing that, you should once again have a usable log file. Easy, right?

BGInfo for Windows 10

Posted by James F. Prudente on September 16, 2016
Posted in: PowerShell, Scripting, Windows 10. 10 Comments

Years ago, Mark Russinovich of Sysinternals fame (since bought by Microsoft) released the BGInfo utility, which served two purposes: to put various system information such as the computer name and free hard drive space on the desktop, and likewise to put the same info on the lock screen. In a large environment, that latter feature was extremely useful as it enables users and technicians to quickly see the name of a PC without having to log on and find the info.

Unfortunately, Microsoft changed the way the lock screen was handled in Windows Vista and 7, which broke BGInfo’s lock screen functionality. Windows 10 changed things yet again and while BGInfo has been updated to support Win 10, we could not get it working for the lock screen. I don’t want to say it’s not possible, because there may be some combination of using the program along with group policy to display the info on the lock screen, but in our environment it did not work.

Anyway…we really wanted this functionality so I began to look into ways of hacking something together…and boy is this a hack.

Before I go any further, I want to clarify the terms I’m using:

The lock screen is the screen (wallpaper) displayed before a user has logged on, or when the logged-on user locks the computer ({WIN}+{L}, for example). In Win 10 RTM and 1511, this disappears before the user is prompted for credentials.

The login screen is the screen where a user actually enters their credentials. On Win 10 RTM and 1511, this background will either be a solid color or the neon-blue Windows wallpaper. On 1607, the lock screen image will remain even once the login fields are shown.

As is seemingly always the case with Windows 10, half the battle is figuring out how aspects of the OS actually work; only then can you determine how to best control things. The below represents my best efforts at understanding what’s going on behind the scenes; if you know otherwise or can add to the discussion, please leave a comment. Also, while some of the settings below can be controlled through group policy (and one must be), I elected to deal with the registry directly as doing so kept all the changes in one place.

Additionally, the way the lock screen works is different in build 1511 (Threshold 2), 1607 (Anniversary) and 1607 with the September cumulative update. These scripts will work with either 1511 or a fully-patched 1607, at least as of this writing. They may however work differently between the two builds, as well as on an existing user profile versus a new one.

There are two sets of lock screen settings: the system-wide machine setting, and a user setting which is (predictably) specific to each user.

For the system-wide settings, we have:

HKLM\Software\Policies\Microsoft\Windows\Personalization\NoChangingLockScreen – Set to 0 to allow the user to change the lock screen, 1 to prevent changes.

HKLM\Software\Policies\Microsoft\Windows\Personalization\LockScreenImage – This is the full path to the lock screen file.
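As an illustration, the two machine-wide values can be written out as .reg file text. This is a Python sketch, not how the downloadable scripts do it; it assumes NoChangingLockScreen is a DWORD and LockScreenImage is a string value, and the image file name in the usage line is hypothetical.

```python
def personalization_reg(image_path, prevent_changes=True):
    """Return .reg file text that sets the two machine-wide lock screen values."""
    # Backslashes and quotes must be escaped inside .reg string values.
    escaped = image_path.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\Software\Policies\Microsoft\Windows\Personalization]",
        f'"NoChangingLockScreen"=dword:{int(prevent_changes):08x}',
        f'"LockScreenImage"="{escaped}"',
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical image file name, for illustration only.
print(personalization_reg(r"C:\ProgramData\bginfoscreens\lockscreen.jpg"))
```

The resulting text can be saved to a .reg file and imported from an elevated prompt, since the Policies key requires administrator rights.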

The in-box lock screens are located at c:\windows\web\screen and there does not appear to be a way to select anything other than these default images through the UI.

For the user-specific settings, we have:

HKCU\Software\Microsoft\Windows\CurrentVersion\ContentDeliveryManager\RotatingLockScreen – Set to 0 for a fixed image, or 1 for Windows Spotlight.

Windows Spotlight is a feature by which Win 10 downloads new lock screen images on its own, rotating them from time to time. If Spotlight is enabled, the below registry keys are relevant:

HKCU\Software\Microsoft\Windows\CurrentVersion\Lock Screen\Creative\HotspotImageFolderPath – Path to the user’s downloaded images

HKCU\Software\Microsoft\Windows\CurrentVersion\Lock Screen\Creative\PortraitAssetPath – Full path to the lock screen file for portrait (vertical) layout

HKCU\Software\Microsoft\Windows\CurrentVersion\Lock Screen\Creative\LandscapeAssetPath – Full path to the lock screen file for landscape (horizontal) layout

Now in theory, if a computer (HKLM) policy is set to prevent the user from changing the lock screen image, the user should receive the same lock screen, but what we found instead was in some cases on build 1511 the user’s setting would override the computer’s setting. In other words, we would have a fixed lock screen for the computer when nobody was logged in, but once a user logged in, their lock screen settings (which default to having Spotlight enabled) would take precedence. This meant we’d have one lock screen for the computer, and one lock screen for each user. 1607 seems to work properly in that setting a lock screen at the HKLM level prevents any user changes from being made.

On 1511, I recommend enabling the group policy Computer Configuration → Policies → Administrative Templates → Control Panel → Personalization → Prevent changing lock screen and logon image. If you do not do this, the user retains the ability to change their lock screen even though they technically should not.

There are a lot of moving pieces here. The download includes both a startup and login script; on 1511 you need both. On 1607 you can get away with just the startup script; however, you will lose some functionality in doing so. Here’s what each piece does:

Startup Script

  • Check if we’re running on Win 10; if not, exit.
  • Choose a random picture from c:\windows\web\screen, and save it out to c:\programdata\bginfoscreens
  • Determine the size of the display; unfortunately, doing this at startup or shutdown is problematic, so we’re doing a few different things:
    • Attempt to determine the size; the default return value is 1024×768. If we get that back, assume we have bad info.
    • If we have bad info, attempt to read c:\program files\bginfo\resolution.out, which is a text file created by the login script, containing the correct resolution.
    • If we have resolution.out, use the resolution there; otherwise stick to the (likely incorrect) data we collected earlier.
  • Resize the picture chosen earlier to our screen resolution.
  • Write out our desired info to the resized picture. We attempt to adjust the text positioning to fit the image.
  • There’s some code here for debugging purposes, where you can choose to display additional info on the screen if the system name matches a certain pattern.
  • Save the image.
  • Set the new lock screen by making changes to the registry.
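The resolution fallback in the startup steps above can be sketched as follows. This is Python rather than the PowerShell used in the download, and the "WIDTHxHEIGHT" layout of resolution.out is an assumption; the real file format may differ.

```python
from pathlib import Path

DEFAULT_GUESS = (1024, 768)  # the value treated above as "bad info"


def parse_resolution(text):
    """Parse 'WIDTHxHEIGHT' (assumed file format) into a tuple, or None."""
    parts = text.strip().lower().replace("×", "x").split("x")
    if len(parts) == 2 and all(p.strip().isdigit() for p in parts):
        return int(parts[0]), int(parts[1])
    return None


def effective_resolution(detected, saved_file):
    """Prefer the detected size unless it is the suspicious default."""
    if detected != DEFAULT_GUESS:
        return detected
    saved = Path(saved_file)
    if saved.is_file():
        parsed = parse_resolution(saved.read_text(errors="replace"))
        if parsed:
            return parsed
    return detected  # likely wrong, but the best information available
```

One side effect of this design is that a display which genuinely runs at 1024×768 is only reported correctly when the saved file is missing or agrees with it.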

Login Script

  • Check if we’re running on Win 10; if not, exit.
  • If you set $enablenetworkcollection to $true and provide a valid, user-writable path as $networksspath, the login script will copy the user’s Windows Spotlight backgrounds to a network share. This is useful if you want to have a larger pool of pictures for the computer lock screen. If you choose to do this, you will eventually want to disable $enablenetworkcollection and use whatever method you like to copy this collection of pictures to c:\windows\web\screen
  • If you collect these images, you will need to rename them to .jpg and also filter out any that are not the proper resolution. I’ve included a quick and dirty script, “CheckResolution.ps1” which will do this.
  • Determine the screen resolution; unlike the startup script, this should always be correct. We save it to a file so the startup script can access it on subsequent reboots.
  • Read the registry to determine where the user’s lock screens are located; if we don’t find the keys the rest of the script is meaningless, so we exit.
  • Find a random image to work with, checking if it’s 1920px wide.
  • Code to save images to a network location if $enablenetworkcollection is true.
  • Resize the picture chosen earlier to our screen resolution.
  • Write out our desired info to the resized picture. We attempt to adjust the text positioning to fit the image.
  • Save the image.
  • Set the new lock screen by making changes to the registry.

The combination of these two scripts provides a lock screen that rotates on startup and on login (1511) only, as well as displays the info you choose to show. It’s a hack as I said earlier, but it works for us at least, and if nothing else it should provide a starting point for your own environment.

Download the scripts here: http://media.islipufsd.org/Scripts/BGInfoScripts.zip

Note the scripts themselves credit certain other sites and posts I’ve taken code from.

EDIT: These are tested and working on both 1709 and 1803.

Orphaned VMs on ESXi

Posted by James F. Prudente on July 6, 2016
Posted in: vmware. Tagged: esxi.

Recently we added another host to our VMWare cluster. During testing we discovered a few virtual machines would not vMotion between any of our hosts. It’s not clear if this had anything to do with adding the host or if it was something we only noticed because we were moving guests between hosts, which as a general rule does not happen that often.

To set the stage from the hardware side of things:

  • One VMWare cluster, with (5) Dell hosts: (2) 2950s, (2) R720s, and (1) R730
  • Shared storage on a FalconStor NSS SAN, connected via a Brocade 6505 FC switch.
  • Originally we had a mix of ESXi 5.1 and 6.0, but we’ve since upgraded to all 6.0; the problem didn’t change between versions.

Fairly standard stuff.

When a vMotion would fail, we would get one of a few different errors, most of which were along the lines of:

After that, the VM would show either “(orphaned)” or “(inaccessible)” in vCenter:

Often, the same VM would reappear at the same time as a “discovered virtual machine,” which showed normally in vCenter but still could not be moved between hosts. It also showed the wrong disk space usage, often showing 4GB or less on servers with 60GB or larger drives. Note the guest OS was still running without issue, and the VM could be shut down and started back up without any errors.

We quickly established that this was not limited to a specific data store or specific host since there was no commonality between the guest VMs that had a problem. We also ruled out a storage issue as we have quite a lot of other VMs on that SAN as well as a Windows cluster, all of which was running fine.

Our first troubleshooting step was to remove the guest from vCenter inventory and attempt to re-add by browsing the datastore for the .vmx file. This resulted in two VMs being imported, as shown:

The first VM – i.e. the one without the (1) – would not show any data.

The second copy of the VM would show mostly accurate data:

Note the provisioned and used storage numbers, which are clearly incorrect.

After a while, the registration of the first copy of the VM would fail with an error and no longer show in vCenter.

The second copy of the VM would remain in vCenter but any attempt to vMotion to another host would result in an error (shown below), and the VM would then show (orphaned) in vCenter.

We were back where we started and contacted support.

Support was not terribly useful. I’m not looking to turn this into a rant about VMWare support, so I’ll skip a lot of the details but the short version is that while they did fix this for two of our VMs, they did so in an extremely inefficient manner that involved duplicating the entire VMDK file. So when the same problem appeared on two more VMs I wanted to investigate this further in-house.

Below is the fix we came up with, which is a little tedious but effective and not too time consuming.

  • SSH to one of your hosts.
  • Navigate to /vmfs/volumes/datastorename/vmfoldername
  • Run ls -l for a directory listing

    It’s a good time to review a few things about VMWare file names:

    name.vmx is the virtual machine definition file, which is plain text, and contains info about the VM itself

    name.vmdk is the hard drive definition file, which is plain text as well, and points to the hard drive data file

    name-flat.vmdk is the actual hard drive data

In some cases, either of the definition files could be missing (which would cause a problem), but they were both there for us.

  • Keep the SSH session open, and use WinSCP to connect to the same host then navigate to the same folder.
  • Use WinSCP to take backup copies of the base vmdk file and the vmx file.
  • Open the base vmdk file and take note of the “ddb.adapterType” shown, as you will need this later. Also check to see if “ddb.thinProvisioned” is present, and its value if so.
  • Rename the base vmdk file and the vmx files; we’re going to replace both: (This is shown via SSH)

    mv -i name.vmdk name.vmdk.old

    mv -i name.vmx name.vmx.old

  • Follow this link to recreate the vmdk file. (VMWare KB article 1002511) Note that in step 8, when you need to edit the vmdk file, it’s easiest just to do this directly from WinSCP which is why I suggest using both it and SSH simultaneously.
  • Now we need to deal with the vmx file. See this link. Here’s where things get even stranger. Following the post linked to, you select the existing vmdk file with the datastore browser, so the file is clearly there. Yet when you finish the wizard and vCenter goes to create the VM, we ended up with an error that the vmdk file wasn’t found.

    Well that makes no sense; the file is clearly there. It’s in the datastore browser and it’s visible through the CLI and WinSCP. It was at this point I noticed something strange. I realize it’s hard to tell with the obfuscation, but there’s a slash “/” missing between the folder name and the filename. Huh.

    I went through the wizard again and paid more attention to what’s shown for the disk file path, after selecting it in the datastore browser:

To be clear, the brackets [] are just part of our naming convention. Notice the total lack though of slashes. I took a shot in the dark and added a slash between the folder and file name, then finished creating the VM and crossed my fingers.

    

  • The good news is manually adding the slash fixed the “file not found” error. But why should things get normal now? vCenter again showed two copies of the VM as discovered. Again, the second copy appears operational and the first shows no info. vCenter meanwhile hung trying to create (presumably) the first copy of the VM. Eventually it will time out.

    Note the second (working) copy of the VM did show a MAC address conflict which appears to be a harmless side-effect of the inexplicable duplication of VMs. I was able to boot the second copy of the VM. You may need to reconfigure the NICs from within the guest OS since in some cases new NICs may be ‘added’ to the system.

  • Once this was all done, we had a fully-functional guest VM that could be migrated from host to host without any issues.
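For the earlier step that records “ddb.adapterType” and “ddb.thinProvisioned” from the base vmdk file, the descriptor is plain text, so the values can be pulled out with a few lines of Python once the backup copy is on your workstation. This is a sketch that assumes the usual key = "value" layout of descriptor lines.

```python
def read_ddb_settings(descriptor_text):
    """Return the ddb.* key/value pairs from a VMDK descriptor's text."""
    settings = {}
    for line in descriptor_text.splitlines():
        line = line.strip()
        if not line.startswith("ddb.") or "=" not in line:
            continue  # skip comments, extent lines, and other settings
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings
```

Read the backup copy of the descriptor into a string, pass it in, and look up "ddb.adapterType"; if "ddb.thinProvisioned" is absent from the result, it was not set in the original file.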

Hopefully this problem doesn’t reoccur, but at least we have a viable fix for it if it does. I’d be curious to hear if anyone else knows what may cause this in the first place.
