Skip to content

Blog

Sentinel KQL Queries: Detecting a Lack of Zscaler Data Ingestion

Screenshot of the "PIU - No Zscaler Internet Access data received in 2 hours" Sentinel rule in situ in the Azure Portal.

Ingesting data from Zscaler Internet Access and Zscaler Private Access into a SIEM is a valuable technique for identifying risky endpoint activity or system compromise. It also gives you a (hopefully) immutable1 copy of this audit data to support a post-incident investigation.

I’ve been able to configure both Zscaler Internet Access and Zscaler Private Access data to be ingested into Microsoft Sentinel2, but occasionally have found that the somewhat circuitous path that ZIA data takes into the SIEM (NSS VM, to another Linux VM over syslog, then to Azure Monitor Agent, and finally to Sentinel) can be brittle. A reboot of the collector VMs has always fixed this, but you have to know that the data flow has stopped!

I have written a couple of KQL queries for this scenario – one for Zscaler Internet Access (ZIA) and one for Zscaler Private Access (ZPA).

Each will trigger an incident if zero log entries are received for the relevant service within a 2 hour period.

You can import these to your own Sentinel environment by clicking Import in Analytics rules and providing the ARM template files linked below.

Zscaler Internet Access

Download as ARM template

CommonSecurityLog | where DeviceVendor == "Zscaler" and DeviceProduct == "NSSWeblog"
| where TimeGenerated > ago(30d)
| summarize last_log = datetime_diff("second",now(), max(TimeGenerated))
| where last_log >= 7200

Set this to run against data for the last 2 hours, with the maximum “look back” period. The query will return 0 results if data ingestion is occurring correctly, so you will want to alert on >0 log entries.

Zscaler Private Access

Download as ARM template

ZPAUserActivity
| where LogTimestamp > ago(30d)
| summarize last_log = datetime_diff("second",now(), max(LogTimestamp))
| where last_log >= 7200

Set this to run against data for the last 2 hours, with the maximum “look back” period.

You will get an error “Failed to run the analytics rule query. One of the tables does not exist” if you have not completely configured the ZPA log ingestion, including adding the custom ZPAUserActivity function to your Sentinel workspace. Follow the Zscaler and Microsoft Sentinel Deployment Guide (“Configuring NSS VM-Based Log Ingestion for ZPA”, page 126).

 

1: The keys to the kingdom are then the Sentinel/LA workspace, which hopefully your attacker has not escalated privileges to be able to delete. There’s nothing like “immutable vaults” in Azure Recovery Vaults for Sentinel or Log Analytics workspaces. You can set a standard Azure lock, but a privileged attacker could just delete the lock!

2: Zscaler recently (just in time) updated the ZPA ingestion workflow to use the Azure Monitor Agent rather than the deprecated Log Analytics Agent. This took a little reconfiguring and was quite an involved process!

Word hangs on saving: App Control for Business and WebClnt.dll

The Symptom

Users were experiencing a 5-15 second delay when saving a document to OneDrive or SharePoint, during which Word would show as “not responding”.

All machines in question use App Control for Business (WDAC).

The Cause

During the “not responding” period, Word is attempting to start the Web Client service, which is set to Manual.

svchost.exe launches, tries to load the WebClnt.dll library, but this is blocked by App Control for Business.

Word has to wait for this attempt to time out, before then giving up and saving anyway.

The Fix

Setting the Web Client service start type to Disabled prevents Word from attempting to start the service, fixing the delay.

The Diagnosis Process

Event Tracing for Windows and UIforETW strike again, along with the incredible utility of the public symbols for Office.

The process was similar to that which I documented here, where the process of saving comments in Word caused a hang. The key is to click Trace > Load Symbols to ensure we have function names in the trace.

Drilling down to the problem time area and then digging into the amount of time we spend in each function revealed that the time was being spent in davhlpr.dll!TriggerStartWebclientServiceIfNotRunning.

A screenshot of Windows Performance Analyzer, showing a stack trace where a significant wait was being incurred in davhlpr.dll!TriggerWebClientServiceIfNotRunning

Stack trace from Windows Performance Analyzer from WWLIB.DLL!CmdSaveAs to davhlpr.dll!TriggerStartWebclientServiceIfNotRunning

Well this does indeed sound like something to do with the Web Client service!

» Read the rest of this post…

Installing the Zabbix Agent 2 on Windows with Minimal Privileges (LocalService)

The Zabbix Agent 2 on Linux uses a non-root account by default (“zabbix”), and thus provides some protection against the worst outcomes of a potential vulnerability in the agent, or perhaps a takeover of a Zabbix server that monitors that agent.

The Agent on Windows, however, runs with NT AUTHORITY\SYSTEM, which has extensive privileges on the monitored system.

I have put together a little wrapper script around the Zabbix Agent 2 MSI installer which runs the installer, then reconfigures it to run as NT AUTHORITY\LocalService, which is a minimally privileged account.

You can find the script on GitHub. You’ll need to also grab the Zabbix Agent 2 MSI installer, rename it to zabbix-agent2.msi and provide that MSI in the same directory when you deploy.

It goes without saying that this is not officially supported, but I have not experienced any issues monitoring the standard items that are in the Windows by Zabbix Agent template. It is possible you will run into issues with unsupported items if the item in question does in fact require elevated permissions on the monitored host!

Hopefully this will be useful to others looking to monitor Windows systems with Zabbix, while maintaining as much of the principle of least privilege as possible!

Adventures in ETW: “Slow Comment”

I am a great admirer of the work of Bruce Dawson on Event Tracing for Windows, UIforETW and his blog posts on using ETW to track down all sorts of weird and wonderful issues.

I also found Bruce’s training videos on the subject, despite the videos knocking on the door of being a decade old, to be very useful.

I was delighted to have a recent opportunity to practise my own skills in this area, following Bruce’s lead!

The Symptom

The end user was experiencing delays of between several seconds and about half a minute when saving comments in a Word document. Choosing to Insert the comment was fine and when typing the comment, Word also behaved normally. Press Save, however, and Word’s UI would hang for somewhere between a few and 30 seconds.

Yep, sometimes half a minute for each comment being saved!

In a document that required a lot of comments, this was dramatically slowing the user’s work.

» Read the rest of this post…

Smartcard login — the RDP client needs to be able to access the CRL

The revocation status of the domain controller certificate used for smartcard authentication could not be determined. There is additional information in the system event log. Please contact your system administrator.

The revocation status of the domain controller certificate used for smartcard authentication could not be determined. There is additional information in the system event log. Please contact your system administrator.

If you have smartcard authentication set up for logging into certain Active Directory systems, and also a restrictive web proxy on the machine acting as the RDP client, you may run into this issue.

My mistake was checking that the RDP server had access to the CRL mentioned in the certificate.

Yes, the RDP server might be quite happy in terms of checking the certificate revocation, but if the RDP client can’t access the CRL URL (perhaps through the configured proxy), you will receive this same error.

Check connectivity to the stated CRL distribution point from the RDP client and RDP server!

Binary release of RPC Investigator

Screenshot of RPC Investigator
RPC Investigator

Happy 2023!

I am intrigued by Trail of Bits’ new tool RPC Investigator. Exploring Windows internals is of ongoing interest, and this seems like a very interesting tool to shed light on some of that internal complexity and learn more about how the OS works.

Trail of Bits is releasing a new tool for exploring RPC clients and servers on Windows. RPC Investigator is a .NET application that builds on the NtApiDotNet platform for enumerating, decompiling/parsing and communicating with arbitrary RPC servers. We’ve added visualization and additional features that offer a new way to explore RPC.

RPC is an important communication mechanism in Windows, not only because of the flexibility and convenience it provides software developers but also because of the renowned attack surface its implementers afford to exploit developers. While there has been extensive research published related to RPC servers, interfaces, and protocols, we feel there’s always room for additional tooling to make it easier for security practitioners to explore and understand this prolific communication technology.

I could not find a binary release of the code on GitHub, just instructions on how to build it yourself.

In case others want to play with RPC Investigator without needing to build it, I publish this binary release that you can download and just run.

I have done nothing to the original repo’s code except open and build in Visual Studio 2022. I am sharing this binary build in case others want to avoid having to build the code themselves.

Binary releases here may be kept up-to-date, or may not. It is on a best effort basis. 🙂

Have fun in RPC-land!

Surface Pro Type Covers not typing after waking and the Oblitum Interception driver (Veyon)

Following teachers’ return for this academic year in September, we suddenly found ourselves with a frequent issue. After waking the Surface Pro devices from sleep, the Type Cover would often not respond to keystrokes. The on screen keyboard was not affected, but USB keyboards also stopped working. The Type Cover trackpad would continue working fine.

A full Windows restart would always bring back the keyboard functionality.

This triggered a challenging investigation to determine what was wrong. The fact that we had made no significant software changes that should affect this over the summer made me look, with guidance from Microsoft Surface Business support, to Windows Updates as a possible issue. Rolling back both September and August’s Windows Updates did not seem to have any effect.

Clearly this wasn’t a wide enough issue to be affecting everyone, or many more customers would be up in arms about having to restart 6 or 7 times in a working day!

With the issue affecting a wide range of different Surface Pro devices and different Type Covers, it looked more likely to be a software issue than hardware. Predictably, perhaps, I was unable to reproduce the issue on a device with nothing but a stock Windows install on it… it’s got to be software. Right?

The build of Windows we run is kept as simple and close-to-stock as possible, for exactly the reason that it saves you from this type of issue! Of the software we do run, the prime suspects seemed to be:

I dug a little deeper into what Veyon brings along to do its magic. Its ability to remotely control other systems for classroom management purposes, including remotely inducing the Secure Attention Sequence (Ctrl-Alt-Delete to normal folks!) means that it must have some kind of driver installed that permits this functionality. Eventually, it dawned on me that this interacts with the keyboard, making it a good candidate for the culprit for, you know, keyboard problems.

» Read the rest of this post…

AutoPilot hanging at “checking connection to Microsoft”

I ran into some trouble recently with a machine that had previously been registered for Intune/Microsoft Endpoint Manager AutoPilot deployment hanging on a Windows reinstall (the SSD had been replaced).

The machine would sit at “checking connection to Microsoft. This might take a while”.

Take a while it did — a spell overnight on this very screen would not help. I used Shift+F10 to get some diagnostics tooling on the system. I could see references to a /join HTTPS endpoint being accessed that seemed to be Intune-related, but it was neither obviously succeeding nor failing.

Some perusal of logfiles suggested to me that UEFI variables are involved in the AutoPilot process.

Fortunately, the machines in question are desktop PCs. A very simple way to (destructively!) clear out UEFI variables was to remove the CMOS battery for a period of time. Upon trying again, we jump right past “checking connection to Microsoft” and can move forward with the install.

On systems where a battery pull is not effective, it may be worth getting yourself into a UEFI shell and using dmpstore to identify UEFI variables in NVRAM that may be related to AutoPilot and deleting them. Sorry I can’t be more specific!

Block Mounting of ISO Images with Microsoft Intune (Endpoint Manager)

Today’s malware-loader-du-jour, Bumblebee, has been seen achieving initial access through phishing sites that convince a user to mount a downloaded ISO image. This may be a reaction to Microsoft’s recent improvements to macro-enabled document security.

Adversaries push ISO files through compromised email (reply) chains, known as thread hijacked emails, to deploy the Bumblebee loader. ISO files contain a byte-to-byte copy of low-level data stored on a disk. The malicious ISO files are delivered through Google Cloud links or password protected zip folders. The ISO files contain a hidden DLL with random names and an LNK file. DLL (Dynamic Link Library) is a library that contains codes and data which can be used by more than one program at a time. LNK is a filename extension in Microsoft Windows for shortcuts to local files.

https://cloudsek.com/technical-analysis-of-bumblebee-malware-loader/

One of the things that we can do to help our users avoid this new initial execution foothold is by blocking the mounting of ISO images, as long as you can be confident this will not break anything they actually need to do! I am fortunate enough to be able to do this.

(Djordje Atlialp shows us how to achieve this with classic GPOs, and also a more comprehensive neutering of ISO files.)

Here is what I have rolled out as an Intune PowerShell Script to block the mounting of ISOs. No reboot is required. Users will see the Mount option disappear from the context menu of an ISO file within File Explorer and will be unable to double-click to mount a malicious ISO. Or, indeed, any ISO. 😉

We will head to Microsoft Endpoint Manager admin center, go to Devices > Scripts and create a new Windows 10 and later PowerShell script.

Restrict mounting of ISOs in File Explorer

The Intune Script

UPDATE: I have made some improvements — namely, the previous one liner will cause failures to be reported in Intune on subsequent runs. We will now only add the value where it does not exist, and we will add support for Windows.VhdFile as well. It’s no longer a one-liner!

$items = @(
    @{
        path = "HKLM:\SOFTWARE\Classes\Windows.IsoFile\shell\mount"
        valueName = "ProgrammaticAccessOnly"   
    }
    @{
        path = "HKLM:\SOFTWARE\Classes\Windows.VhdFile\shell\mount"
        valueName = "ProgrammaticAccessOnly"
    }
)

foreach($item in $items) {
    if ($null -eq (Get-Item -Path $item.path).GetValue($item.valueName)) {
        New-ItemProperty -Path $item.path -Name $item.valueName -Value ""
    }
}

The body of the script can be as follows:

New-ItemProperty -Path "HKLM:\SOFTWARE\Classes\Windows.IsoFile\shell\mount" -Name ProgrammaticAccessOnly -Value ""

(This REG_SZ value need only exist, with a blank string as its Data, for this to work.)

Assign it to the device group and you are all set.

Removal

To undo this change, we can reverse what we’re doing and Remove-ItemProperty on the items we added:

$items = @(
    @{
        path = "HKLM:\SOFTWARE\Classes\Windows.IsoFile\shell\mount"
        valueName = "ProgrammaticAccessOnly"   
    }
    @{
        path = "HKLM:\SOFTWARE\Classes\Windows.VhdFile\shell\mount"
        valueName = "ProgrammaticAccessOnly"
    }
)

foreach($item in $items) {
    if ($null -ne (Get-Item -Path $item.path).GetValue($item.valueName)) {
        Remove-ItemProperty -Path $item.path -Name $item.valueName
    }
}

Conclusion

This doesn’t make you bulletproof, but will, if tolerated by your users, provide a substantial degree of protection, at the time of writing, from any number of current malware loaders that are using the ISO image technique to achieve initial code execution. The nature of the separate filesystem within the ISO presently prevents it from being marked as being from the Wild Wild West World Wide Web.

ShutdownBuddy — save resources while saving resources

In a continuation of my desire to write really lightweight software that doesn’t add to the undesirable background bloat running on computers, I set about in June-ish to write something to improve upon a VBScript-Scheduled-Task-and-shutdown.exe gaffer tape of a solution to forcing a full shutdown when a computer is idle that I had previously cobbled together.

Power management in Windows is mature and capable, for sure, but what is less obvious is how to, on shared fixed desktop computers, actually trigger a proper shutdown and not just put idle machines to sleep. Hibernation is an option, of course, but the relentless increase in complexity of Windows brings to mind the other, stability-related, benefits of regular proper restarts.

So, then, we want something that:

  • identifies when no-one is interactively signed in
  • waits a configurable amount of time
  • if still no-one has signed in in that time, shut down properly

Additionally, because this unavoidably must run with high permissions and regularly assess signed in users in the background, it should be a Windows service that is as lightweight and simple as possible. Reduced resource usage (RAM, CPU time in background) so we can shut down and have reduced resource usage (of electrical power). I can see the beauty of it already!

So I wrote ShutdownBuddy.

It is configurable through the registry:

HKLM\SOFTWARE\upfold.org.uk\ShutdownBuddy

EvaluationIntervalSeconds — DWORD. How frequently, in seconds, to evaluate for interactive sessions.

ShutdownAfterIdleForSeconds — DWORD. How many seconds of idle computer (i.e. no interactive sessions) before issuing a shutdown. This is periodically evaluated as above.

Like all my lightweight, C(++) Win32 projects, it is officially experimental as I am using these projects to learn how to write this kind of code properly. Any suggestions and improvements are gratefully received.