
Blog

Adventures with Arch Packages: Exercise Caution with Exclude

Running a bleeding-edge rolling release Linux distribution like Arch Linux has its challenges and risks, but there is perhaps no greater feeling of absolute control over your own operating system! It also leads to the opportunity to have “adventures” like the one described here. While “adventure” might be tongue-in-cheek, the truth is that there is great educational value in breaking something and thus being forced to fix it!

I recently noticed that a pacman upgrade to icu (International Components for Unicode) would break a few Electron packages, which I needed for the open source build of VS Code, among other things. (I generally like to avoid Electron apps, but VS Code is a rare exception!)

pacman reported a dependency issue that it could not resolve, so the entire transaction could not proceed.

My mistake was thinking “oh, I’ll just exclude icu from the upgrade, as it’s causing the dependency issue”. Usually these issues are fixed in a day or two, so I’d upgrade it later.

This was indeed a mistake.

Somehow, apps were now referencing the new icu, which wasn’t there yet. I rebooted, only to discover that the GUI wouldn’t start, and I couldn’t even use pacman in single-user mode to roll back the previous transaction, as pacman itself depended upon icu.

I’d broken the very package manager that I needed to roll back. Oops!

Install media to the rescue

I booted into the Arch install media, and set about to address the issue.

chrooting into the target filesystem and using pacman wasn’t an option, as running the package manager from the target filesystem failed with the same dependency issue.

I had a few false starts with trying to run pacman from the install medium, pointing it at the mounted OS partition with --sysroot.

While this didn’t pan out, I remembered that pacstrap, while designed for install-time, presumably could install packages in the target filesystem without needing pacman itself. I was concerned that this might reinstall some base packages, but it turns out you can specify which packages to install manually on its command line.

So, I gathered the list of packages that needed upgrading by running pacman -Syu --sysroot /mnt against the target filesystem, and then supplied this package list to pacstrap:

pacstrap -G -i -M /mnt icu brltty electron27 electron28 electron29 electron30 freerdp freerdp2 harfbuzz-icu raptor

And… we’re back!
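For reference, the recovery steps above can be sketched as a small shell script, run from the Arch install media. This is a rough sketch rather than a record of the exact session: the root partition /dev/sdXn is a placeholder, the run helper and the DRY_RUN, TARGET and ROOT_PART variables are illustrative additions, and the package list is simply the one from the pacstrap command above. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the rescue, intended to be run from the Arch install media.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}
TARGET=${TARGET:-/mnt}
ROOT_PART=${ROOT_PART:-/dev/sdXn}   # placeholder: replace with the real root partition

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rescue() {
    # Mount the broken system's root filesystem.
    run mount "$ROOT_PART" "$TARGET"
    # -G: do not copy the install media's pacman keyring into the target
    # -M: do not copy the install media's mirrorlist into the target
    # -i: prompt for confirmation before installing
    # Only the packages named on the command line (plus dependencies) are installed.
    run pacstrap -G -i -M "$TARGET" "$@"
}

rescue icu brltty electron27 electron28 electron29 electron30 \
       freerdp freerdp2 harfbuzz-icu raptor
```

Setting DRY_RUN=0 would execute the commands for real, so check the printed output first.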

Lessons learned

  • The manual isn’t joking about “Partial upgrades are unsupported”.
  • Considerable caution is required when using pacman --exclude.
  • “The OS is ephemeral and I can rebuild it” is a bit too relaxed an attitude when you don’t really want to have to rebuild the OS at short notice. Back up the OS packages and libraries too.
  • Really understanding the package manager and install process gives you the tools to pull yourself out of the holes you dig for yourself!

 

Reminding myself which machine I am authenticating to with a sudo “lecture”

I frequently SSH into various systems from my primary Linux machine. There is an analogous issue to “too many browser tabs” that exists here — having too many SSH sessions open in different terminal tabs!

There is a risk in these cases of accidentally typing a higher-privileged sudo password into a lower security system by typing into the wrong terminal. There are various approaches that can help here; I have used screen banners with different colours before.

A good “last line of defence” approach to this risk that I have settled on is to make use of sudo’s “lectures”. You will have seen the default:

We trust you have received the usual lecture from the local System Administrator. It usually boils down to these three things:
#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.

We can customise this, and also set it to always show, rather than just the first time you ever use sudo on that machine. We’ll create a custom lecture file with our desired text — in my case, the hostname I’m logged into, so I’m sure where I am before I type the password!
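One way to generate such a lecture file is sketched below. The wording of the message and the write_lecture function name are my own illustration, not a required format; sudo simply prints whatever the file contains. The sketch writes to a scratch path unless LECTURE_FILE is set, so it can be tried without root.

```shell
#!/bin/sh
# Write a sudo lecture file that names the current host.
# write_lecture is a hypothetical helper; the message text is only an example.
write_lecture() {
    file=$1
    host=$(uname -n)   # POSIX way to get the hostname
    cat > "$file" <<EOF

*** You are about to authenticate to: $host ***

Check this is the machine you meant before typing your password.

EOF
}

# Defaults to a scratch location; point LECTURE_FILE at the real path when ready.
write_lecture "${LECTURE_FILE:-${TMPDIR:-/tmp}/custom_sudo_lecture}"
```

To write the real file, run it as root with LECTURE_FILE=/etc/custom_sudo_lecture. The hostname is baked in when the file is written, so regenerate the file if the machine is renamed.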

Then, use visudo to set these options:

Defaults lecture=always
Defaults lecture_file=/etc/custom_sudo_lecture

Installing the Zabbix Agent 2 on Windows with Minimal Privileges (LocalService)

The Zabbix Agent 2 on Linux uses a non-root account by default (“zabbix”), and thus provides some protection against the worst outcomes of a potential vulnerability in the agent, or perhaps a takeover of a Zabbix server that monitors that agent.

The Agent on Windows, however, runs with NT AUTHORITY\SYSTEM, which has extensive privileges on the monitored system.

I have put together a little wrapper script around the Zabbix Agent 2 MSI installer which runs the installer, then reconfigures it to run as NT AUTHORITY\LocalService, which is a minimally privileged account.

You can find the script on GitHub. You’ll need to also grab the Zabbix Agent 2 MSI installer, rename it to zabbix-agent2.msi and provide that MSI in the same directory when you deploy.

It goes without saying that this is not officially supported, but I have not experienced any issues monitoring the standard items that are in the Windows by Zabbix Agent template. It is possible you will run into issues with unsupported items if the item in question does in fact require elevated permissions on the monitored host!

Hopefully this will be useful to others looking to monitor Windows systems with Zabbix, while maintaining as much of the principle of least privilege as possible!

Choppy video and audio in KVM VMs with Spice Display

I am currently trying to migrate away from VMware Workstation on a Linux host (thanks, Broadcom!) to using KVM virtual machines.

I run many Linux virtual machines, and in many of these I use the GUI, so want well performing video and sound! This was one of the original justifications for VMware Workstation.

The default configuration on Arch was to use the Virtio video driver and a “Display Spice” entry in the VM configuration to support video and sound output.

However, I would experience choppy audio and video output. On the display, it seemed that, roughly vertically, some of the pixels in the framebuffer would not be updated, especially when lots of display changes occurred. This created a sort of irregular “interlaced” look, as you can see here as I have moved the mouse down the menu:

Debian 12 XFCE desktop, with an irregular "interlaced" graphical corruption issue, showing a menu where previously highlighted menu items are partially blue from previous frames

I have been able to work around this issue by enabling OpenGL acceleration in the guest.

Configuration

For an Arch guest, it needs the package qemu-hw-display-virtio-gpu installed.

Check the guest is showing the virtio_gpudrmfb frame buffer device:

# dmesg | grep '\[drm'
...
virtio-pci 0000:00:01.0: [drm] fb0: virtio_gpudrmfb frame buffer device

The guest configuration is as follows:

3D acceleration enabled in the Video Virtio driver. Ensure this is ticked.

virt-manager configuration page on the Video Virtio tab. Video model is Virtio. 3D acceleration should be ticked.

On the Display Spice object, the configuration is set to Listen type None and OpenGL is set to on.

virt-manager configuration page on the Display Spice tab. Listen type is None and OpenGL is ticked.
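virt-manager stores these settings in the libvirt domain XML. As far as I am aware, the 3D acceleration tick box corresponds to an acceleration element with accel3d='yes' inside the video model, and the OpenGL tick box to a gl element with enable='yes' inside the Spice graphics element. The sketch below checks domain XML supplied on standard input for both; the check_gl_config function name and the domain name debian12 in the usage comment are illustrative assumptions.

```shell
#!/bin/sh
# Read libvirt domain XML on stdin and report whether both settings are enabled.
# Usage (domain name is hypothetical):  virsh dumpxml debian12 | check_gl_config
check_gl_config() {
    xml=$(cat)
    missing=0
    # 3D acceleration on the virtio video model
    if printf '%s\n' "$xml" | grep -Eq "<acceleration[^>]*accel3d=['\"]yes['\"]"; then
        echo "virtio 3D acceleration: enabled"
    else
        echo "virtio 3D acceleration: NOT enabled"
        missing=1
    fi
    # OpenGL on the Spice display
    if printf '%s\n' "$xml" | grep -Eq "<gl[^>]*enable=['\"]yes['\"]"; then
        echo "Spice OpenGL: enabled"
    else
        echo "Spice OpenGL: NOT enabled"
        missing=1
    fi
    # Exit status 0 only when both settings were found.
    return "$missing"
}
```

This is a plain text match rather than a real XML parse, which is enough for a quick sanity check of virsh dumpxml output.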

References

The idea that enabling OpenGL with the virtio video driver may help was derived from https://www.kraxel.org/blog/2016/09/using-virtio-gpu-with-libvirt-and-spice/.

Adventures in ETW: “Slow Comment”

I am a great admirer of the work of Bruce Dawson on Event Tracing for Windows, UIforETW and his blog posts on using ETW to track down all sorts of weird and wonderful issues.

I also found Bruce’s training videos on the subject, despite the videos knocking on the door of being a decade old, to be very useful.

I was delighted to have a recent opportunity to practise my own skills in this area, following Bruce’s lead!

The Symptom

The end user was experiencing delays of between several seconds and about half a minute when saving comments in a Word document. Choosing to Insert the comment was fine and when typing the comment, Word also behaved normally. Press Save, however, and Word’s UI would hang for anywhere between a few seconds and 30 seconds.

Yep, sometimes half a minute for each comment being saved!

In a document that required a lot of comments, this was dramatically slowing the user’s work.


Missing CNAMEs? Certification Authority Authorization (CAA) records forbid the CA from issuing a certificate

The configuration for the Let’s Encrypt TLS certificate for this site includes a number of additional domains, mostly with my name in them, which redirect to my main domain for this site, peter.upfold.org.uk.

Some of these additional Subject Alternative Names listed in the cert are www. CNAMEs on these domains, e.g. www.peterupfold.com. It turns out that some of these www CNAMEs didn’t exist in my DNS records.

Following a recent change, Let’s Encrypt appears to use Unbound 1.18 internally, where the behaviour has changed in some way, and the missing www CNAMEs now cause this error on certificate renewal:

Problem for www.peterupfold.com: urn:ietf:params:acme:error:caa :: Certification Authority Authorization (CAA) records forbid the CA from issuing a certificate :: Error finalizing order :: While processing CAA for www.peterupfold.com: DNS problem: SERVFAIL looking up CAA for www.peterupfold.com - the domain's nameservers may be malfunctioning

It’s unclear to me how this was working before, given I was missing these www. CNAMEs entirely!

My domain registrar and DNS provider doesn’t yet appear to support adding CAA records, but that’s fine: as long as the DNS request returns NOERROR, CAA records aren’t mandatory yet.

Something in this change to Unbound 1.18 on Let’s Encrypt’s side means that, when these www. CNAMEs fail to resolve, we are no longer considered to be returning NOERROR for the CAA records. This causes the error above, and the subsequent refusal to issue the renewed cert.

Anyway, I added www. CNAMEs in my DNS management panel for each domain that was failing, re-issued the renewal request and now all is well.
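A quick way to see which response code a CAA lookup returns for each name is sketched below, using dig. The dns_status and check_caa function names are illustrative, and the answer from your own resolver will not necessarily match what Let’s Encrypt’s resolvers see, so treat it as a rough check only.

```shell
#!/bin/sh
# Extract the response code (NOERROR, SERVFAIL, NXDOMAIN, ...) from dig
# output supplied on stdin, by reading the ";; ->>HEADER<<-" line.
dns_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Report the CAA lookup status for each name given on the command line.
# Usage: check_caa www.peterupfold.com
check_caa() {
    for name in "$@"; do
        # +noall +comments prints only the header comments, including the status
        status=$(dig +noall +comments CAA "$name" | dns_status)
        echo "$name: ${status:-no response}"
    done
}
```

A name that reports NOERROR, even with an empty answer, should not trip the CAA check; anything else is worth investigating before the next renewal.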

Smartcard login — the RDP client needs to be able to access the CRL

The revocation status of the domain controller certificate used for smartcard authentication could not be determined. There is additional information in the system event log. Please contact your system administrator.


If you have smartcard authentication set up for logging into certain Active Directory systems, and also a restrictive web proxy on the machine acting as the RDP client, you may run into this issue.

My mistake was checking that the RDP server had access to the CRL mentioned in the certificate.

Yes, the RDP server might be quite happy in terms of checking the certificate revocation, but if the RDP client can’t access the CRL URL (perhaps through the configured proxy), you will receive this same error.

Check connectivity to the stated CRL distribution point from both the RDP client and the RDP server!

X11 Xorg.log amdgpu “no screens found” when a non-graphics card is in the primary PCI Express slot

I bought a used LTO4 tape drive with an SFF-8088 SAS connection. Why?

For fun, for backups that feel like they might be more resilient than the shingled magnetic recording hard drives I accidentally bought (thanks Seagate for disclosing that), and for the enjoyment of something so wonderfully mechanical in a world that is very “solid state”.

This necessitated a SAS card purchase, to give myself the ports necessary to actually plug in the tape drive. It seemed unhappy with one of my PCI Express slots, so I moved it up to the primary PCI Express slot — the one you’d usually use for a graphics card.

Now this Arch Linux machine has no need for fancy graphics. The APU integrated graphics on the Ryzen 7 5700G are perfectly adequate.

However, once the SAS card was in the primary PCI Express slot, X11 would no longer start. My SAS card showed up beautifully with lspci, as did the tape drive with lsscsi, but I had to sacrifice the GUI for it. Seems a little extreme, even for me.

X11 would fail with “no screens found” when the amdgpu driver was enumerating screens.

The integrated graphics moved PCI ID

What had happened is that, once something was in that primary PCI Express slot, the integrated graphics moved to a different PCI bus ID.

I first identified where the “VGA controller” had gone with lspci:

08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c8)

Then I edited /etc/X11/xorg.conf.d/amdgpu.conf to point the BusID at that new identifier.

For me, it had moved from PCI:7:0:0 to PCI:8:0:0.
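One detail to watch when translating between the two: lspci prints the address in hexadecimal, while the Xorg BusID is, as far as I know, read as decimal. For 08:00.0 the two happen to look the same, but a bus such as 0a would need to become 10. A small sketch of the conversion follows; the lspci_to_busid function name is my own, and it assumes the short bus:device.function form shown above, without a PCI domain prefix.

```shell
#!/bin/sh
# Convert an lspci address (hex, e.g. 08:00.0) into an Xorg BusID (decimal).
# Assumes the short "bus:device.function" form with no domain prefix.
lspci_to_busid() {
    addr=$1
    bus=${addr%%:*}      # text before the colon
    rest=${addr#*:}      # text after the colon
    dev=${rest%%.*}      # text before the dot
    fn=${rest#*.}        # text after the dot
    # printf treats a 0x-prefixed argument as hexadecimal and prints it in decimal
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

lspci_to_busid 08:00.0   # prints PCI:8:0:0
lspci_to_busid 0a:00.0   # prints PCI:10:0:0
```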

And now, I have the delight of a GUI and a SAS card, and a tape drive.

Binary release of RPC Investigator


Happy 2023!

I am intrigued by Trail of Bits’ new tool RPC Investigator. Exploring Windows internals is of ongoing interest, and this seems like a very interesting tool to shed light on some of that internal complexity and learn more about how the OS works.

Trail of Bits is releasing a new tool for exploring RPC clients and servers on Windows. RPC Investigator is a .NET application that builds on the NtApiDotNet platform for enumerating, decompiling/parsing and communicating with arbitrary RPC servers. We’ve added visualization and additional features that offer a new way to explore RPC.

RPC is an important communication mechanism in Windows, not only because of the flexibility and convenience it provides software developers but also because of the renowned attack surface its implementers afford to exploit developers. While there has been extensive research published related to RPC servers, interfaces, and protocols, we feel there’s always room for additional tooling to make it easier for security practitioners to explore and understand this prolific communication technology.

I could not find a binary release of the code on GitHub, just instructions on how to build it yourself.

In case others want to play with RPC Investigator without needing to build it, I am publishing this binary release that you can download and just run.

I have done nothing to the original repo’s code except open and build it in Visual Studio 2022.

Binary releases here may be kept up-to-date, or may not. It is on a best effort basis. 🙂

Have fun in RPC-land!

Surface Pro Type Covers not typing after waking and the Oblitum Interception driver (Veyon)

Following teachers’ return for this academic year in September, we suddenly found ourselves with a frequent issue. After waking the Surface Pro devices from sleep, the Type Cover would often not respond to keystrokes. The on screen keyboard was not affected, but USB keyboards also stopped working. The Type Cover trackpad would continue working fine.

A full Windows restart would always bring back the keyboard functionality.

This triggered a challenging investigation to determine what was wrong. The fact that we had made no significant software changes that should affect this over the summer made me look, with guidance from Microsoft Surface Business support, to Windows Updates as a possible issue. Rolling back both September and August’s Windows Updates did not seem to have any effect.

Clearly this wasn’t a wide enough issue to be affecting everyone, or many more customers would be up in arms about having to restart 6 or 7 times in a working day!

With the issue affecting a wide range of different Surface Pro devices and different Type Covers, it looked more likely to be a software issue than hardware. Predictably, perhaps, I was unable to reproduce the issue on a device with nothing but a stock Windows install on it… it’s got to be software. Right?

The build of Windows we run is kept as simple and close-to-stock as possible, for exactly the reason that it saves you from this type of issue! Of the software we do run, the prime suspects seemed to be:

I dug a little deeper into what Veyon brings along to do its magic. Its ability to remotely control other systems for classroom management purposes, including remotely inducing the Secure Attention Sequence (Ctrl-Alt-Delete to normal folks!) means that it must have some kind of driver installed that permits this functionality. Eventually, it dawned on me that this interacts with the keyboard, making it a good candidate for the culprit for, you know, keyboard problems.
