I “recently” wrote about obtaining a new (to me, actually quite old) computer over in The Apple Power Macintosh 7200/120 PC Compatible (Part 1). This post is a bit of a detour, but may help others understand why some images they download from the internet don’t work.
Disk partitioning is (of course) a way to divide up a single disk into multiple volumes (partitions) for different uses. While the idea is similar, computer platforms over the ages have done this in a variety of different ways, with varying formats on disk, and varying limitations. The ones that you’re most likely to be familiar with are the MBR partitioning scheme (from the IBM PC), and the GPT partitioning scheme (common for UEFI systems such as the modern PC and Mac). One you’re less likely to be familiar with is the Apple Partition Map scheme.
The way all IBM PCs and compatibles worked from the introduction of MS-DOS 2.0 in 1983 until some time after 2005 was the Master Boot Record partitioning scheme. It was outrageously simple: of the first 512 byte sector of a disk, the first 446 bytes were for the bootstrapping code (the “boot sector”), the last 2 bytes were the magic two bytes telling the BIOS this disk was bootable, and the other 64 bytes were four entries of 16 bytes, each describing a disk partition. The Wikipedia page is a good overview of what it all looks like. Since “four partitions should be enough for anybody” wasn’t going to last, DOS 3.3 introduced “extended partitions”, which just used one of those 4 partitions as another similar data structure that could point to more partitions.
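To make that layout concrete, here is a minimal Python sketch that pulls the four 16-byte entries out of a 512-byte MBR sector. The function name `parse_mbr` and the sample sector are made up for illustration; the field offsets follow the classic MBR entry layout (status byte, CHS start, type byte, CHS end, then little-endian LBA start and sector count).

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446          # bootstrapping code comes first
ENTRY_SIZE = 16               # four of these follow the boot code
BOOT_SIGNATURE = b"\x55\xaa"  # the magic last two bytes


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries of a 512-byte MBR sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55 0xAA boot signature")
    partitions = []
    for index in range(4):
        offset = BOOT_CODE_SIZE + index * ENTRY_SIZE
        entry = sector[offset:offset + ENTRY_SIZE]
        # status, 3 bytes CHS start (skipped), type, 3 bytes CHS end (skipped),
        # LBA of first sector, number of sectors; all little-endian.
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0:  # type 0 marks an unused slot
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return partitions


# Hypothetical sample: one bootable FAT16 (type 0x06) partition.
_sample = bytearray(SECTOR_SIZE)
_sample[446:462] = struct.pack("<B3xB3xII", 0x80, 0x06, 63, 20480)
_sample[510:512] = BOOT_SIGNATURE
SAMPLE_MBR = bytes(_sample)
```

Calling `parse_mbr(SAMPLE_MBR)` returns a single entry starting at sector 63. Pointing it at the first 512 bytes of a real disk image should work the same way, although it makes no attempt to follow extended partitions.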
In the 1980s (similar to today), the Macintosh was, of course, different. The Apple Partition Map is significantly more flexible than the MBR on PCs. For a start, you could have more than four partitions! You could actually have a lot more than four partitions, as the Apple Partition Map is a single 512-byte sector for each partition, and the partition map is itself a partition. Instead of being block 0 (like the MBR is), it actually starts at block 1, and is contiguous (the Driver Descriptor Record is what’s at block 0). So, once created, it’s hard to extend. Typically it’d be created as 64×512-byte entries, for 32KB… which turns out to be about enough for anyone.
The Inside Macintosh reference on the SCSI Manager goes through more detail as to these structures. If you’re wondering what language all the coding examples are in, it’s Pascal – which was fairly popular for writing Macintosh applications in back in the day.
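For those who would rather not read Pascal, here is a rough Python sketch of walking an Apple Partition Map. It only decodes the first few fields of each entry (signature, map block count, start block, block count, name, type) as I understand them from the Inside Macintosh record layout, all big-endian. The helper names and the tiny sample disk are hypothetical, and it assumes 512-byte blocks.

```python
import struct

BLOCK_SIZE = 512
# pmSig, pmSigPad, pmMapBlkCnt, pmPyPartStart, pmPartBlkCnt, pmPartName, pmParType
ENTRY_HEADER = struct.Struct(">2sHIII32s32s")


def _text(raw: bytes) -> str:
    """Decode a NUL-padded fixed-width name field."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_apm(disk: bytes):
    """Return the partition map entries found from block 1 onwards."""
    entries = []
    block = 1        # block 0 holds the Driver Descriptor Record
    map_blocks = 1   # corrected once the first entry has been read
    while block <= map_blocks:
        raw = disk[block * BLOCK_SIZE:(block + 1) * BLOCK_SIZE]
        if len(raw) < ENTRY_HEADER.size:
            break
        sig, _pad, count_in_map, start, blocks, name, ptype = \
            ENTRY_HEADER.unpack_from(raw)
        if sig != b"PM":
            break
        map_blocks = count_in_map  # every entry repeats the map size
        entries.append({"name": _text(name), "type": _text(ptype),
                        "start": start, "blocks": blocks})
        block += 1
    return entries


def _entry(map_blocks, start, blocks, name, ptype):
    """Build one padded 512-byte map entry for the sample disk."""
    raw = ENTRY_HEADER.pack(b"PM", 0, map_blocks, start, blocks, name, ptype)
    return raw.ljust(BLOCK_SIZE, b"\0")


# Hypothetical sample: a DDR block, then a two-entry partition map.
SAMPLE_DISK = (
    b"ER".ljust(BLOCK_SIZE, b"\0")
    + _entry(2, 1, 63, b"Apple", b"Apple_partition_map")
    + _entry(2, 64, 1000, b"MacOS", b"Apple_HFS")
)
```

Note how the first entry describes the partition map itself, which is the "the partition map is itself a partition" part from above.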
But the actual partition map isn’t the “interesting” part of all this (and yes, the quotation marks are significant here). Macs are pretty darn finicky about what disks they will boot off, which gets to be interesting if you’re trying to find a CD-ROM image on the internet to boot from and then install an Operating System from.
I never programmed a 1980s Macintosh actually in the 1980s. It was sometime in the early 1990s that I first experienced Microsoft Basic for the Macintosh. I’d previously (unknowingly at the time as it was branded Commodore) experienced Microsoft BASIC on the Commodore 16, Commodore 64, and even the Apple ][, but the Macintosh version was something else. It let you do some pretty neat things such as construct a GUI with largely the same amount of effort as it took to construct a Text based UI on the micros I was familiar with.
Okay, to be fair, I’d also dabbled in Microsoft QBasic that came bundled with MS-DOS of the era, which let you do a whole bunch of graphics – so you could theoretically construct a GUI with it. Something I did attempt to do. Constructing a GUI on the Mac was so much easier.
Of course, Microsoft Basic wasn’t the preferred way to program on the Macintosh. At that time it was largely Pascal, with C being something that also existed – but you were going to see Pascal in Inside Macintosh. It was probably somewhat fortuitous that I’d poked at Pascal a bit as something alternate to look at in the high school computing classes. I can only remember using TurboPascal on DOS systems and never actually writing Pascal on the Macintosh.
By the middle part of the 1990s though, I was firmly incompetently writing C on the Mac. No doubt the quality of my code increased after I’d done some university courses actually covering the language rather than the only practical way I had to attempt to write anything useful being looking at Inside Macintosh examples in Pascal and “C for Dummies” which was very not-Macintosh. Writing C on UNIX/Linux was a lot easier – everything was made for it, including Actual Documentation!
Anyway, in the early 2000s I ran MacOS X for a bit on my white iBook G3, and did a (very) small amount of any GUI / Project Builder (the precursor to Xcode) related development – instead largely focusing on command line / X11 things. The latest coolness being to use Objective-C to program applications (unless you were bringing over your Classic MacOS Carbon based application, then you could still write C). Enter some (incompetent) Objective-C coding!
Then Apple went to x86, so the hardware ceased being interesting, and I had no reason to poke at it even as a side effect of having hardware that could run the software stack. Enter a long-ass time of Debian, Ubuntu, and Fedora on laptops.
Come 2022 though, and (for reasons I should really write up), I’m poking at a Mac again and it’s now Swift as the preferred way to write apps. So, I’m (incompetently) hacking away at Swift code. I have to admit, it’s pretty nice. I’ve managed to be somewhat productive in a relatively short amount of time, and all the affordances in the language gear towards the kind of safety that is a PITA when coding in C.
So this is my WIP utility to be able to import photos from a Shotwell database into the macOS Photos app:
There’s a lot of rough edges and unknowns left, including how to actually do the import (it looks like there’s going to be Swift code doing AppleScript things as the PhotoKit API is inadequate). But hey, some incompetent hacking in not too much time has a kind-of photo browser thing going on that feels pretty snappy.
So, this idea has been brewing for a while now… try and watch all of Doctor Who. All of it. All 38 seasons. Today(ish), we started. First up, from 1963 (first aired not quite when intended due to the Kennedy assassination): An Unearthly Child. The first episode of the first serial.
A lot of iconic things are there from the start: the music, the Police Box, embarrassing moments of not quite remembering what time one is in, and normal humans accidentally finding their way into the TARDIS.
I first saw this way back when a child, where they were repeated on ABC TV in Australia for some anniversary of Doctor Who (I forget which one). Well, I saw all but the first episode as the train home was delayed and stopped outside Caulfield for no reason for ages. Some things never change.
Of course, being a show from the early 1960s, there’s some rougher spots. We’re not about to have the picture of diversity, and there’s going to be casual racism and sexism. What will be interesting is noticing these things today, and contrasting with my memory of them at the time (at least for episodes I’ve seen before), and what I know of the attitudes of the time.
“This year-ometer is not calculating properly” is a very 2020 line though (technically from the second episode).
Every so often, I release a new libeatmydata. This has not happened for a long time. This is just some bug fixes, most of which have been in the Debian package for some time; I’ve just been lazy and not sat down and merged them.
So, I learned something recently: if you pick up your iPhone with eBay open on an auction bid screen in just the right way, you may accidentally click the bid button and end up buying an old computer. Totally not the worst thing ever, and certainly a creative way to make a decision.
So, not too long later, a box arrives!
It arrived! Well packed and with the all important video adapter cable!
In the 1990s, Apple created some pretty “interesting” computers and product lines. One thing you could get is a DOS Compatibility (or PC Compatibility) card. This was a card that went into one of the expansion slots on a Mac and had something really curious on it: most of the guts of a PC.
The machine I’d bought was an Apple Power Macintosh 7200/120 with the PC Compatible card added afterwards (so it doesn’t have the PC Compatible label on the front like some models ended up getting).
The Apple Power Macintosh 7200/120
Wikipedia has a good article on the line, noting that it was first released in August 1995, and fitting for the era, was sold as about 14 million other model numbers (okay not quite that bad, it was only a total of four model numbers for essentially the same machine). This specific model, the 7200/120 was introduced on April 22nd, 1996, and the original web page describing it from Apple is on the wayback machine.
The 7200 series replaced the 7100, which was one of the original PowerPC based Macs. The big change is using the industry standard PCI bus for its three expansion slots rather than NuBus. Rather surprisingly, NuBus was not Apple specific, but you could not call it widely adopted by successful manufacturers. Apple first used NuBus in the 1987 Macintosh II.
The PCI bus was standardized in 1992, and it’s almost certain that a successor to it is in the computer you’re using to read this. It really quite caught on as an industry standard.
The processor of the machine is a PowerPC 601. The PowerPC was an effort of IBM, Apple, and Motorola (the AIM Alliance) to create a class of processors for personal computers based on IBM’s POWER Architecture. The PowerPC 601 was the first of these processors, initially used by Apple in its Power Macintosh range. The machine I have has one running at a whopping 120MHz. There continued to be PowerPC chips for a number of years, and IBM continued making POWER processors even after that. However, you are almost certainly not using a PowerPC derived processor in the computer you’re using to read this.
The PC Compatibility card has on it a full on legit Pentium 100 processor, and hardware for doing VGA graphics, a Sound Blaster 16 and the other things you’d usually expect of a PC from 1996. Since it’s on a PCI card though, it’s a bit different than a PC of the era. It doesn’t have any expansion slots of its own, and in fact uses up one of the three PCI slots in the Mac. It also doesn’t have its own floppy drive, or hard drive. There’s software on the Mac that will let the PC card use the Mac’s floppy drive, and part of the Mac’s hard drive for the PC!
The Pentium was the first superscalar x86 processor. You are quite likely to be using a computer with a processor related to the Pentium to read this, unless you’re using a phone or tablet, or one of the very latest Macs; in which case you’re using an ARM based processor. You likely have more ARM processors in your life than you have socks.
Basically, this computer is a bit of a hodge-podge of historical technology, some of which ended up being successful, and other things less so.
Let’s have a look inside!
The PC Compatibility card on the left (installed), and oh, look, a graphics card on the right!
So, one of the PCI slots has a Vertex Twin Turbo 128M8A video card in it. There is not much about this card on the internet. There’s a photo of one on Wikimedia Commons though. I’ll have to investigate more.
Does it work though? Yes! Here it is on my desk:
The powered on Power Mac 7200/120
Even with Microsoft Internet Explorer 4.0 that came with MacOS 8.6, you can find some places on the internet you can fetch files from, at a not too bad speed even!
A few years ago we went to Taiwan. I managed to capture some random bits of the city on film (and also some shots on my then phone, a Google Pixel). I find the different style of art on the streets around the world to be fascinating, and Taiwan had some good examples.
I’ve really enjoyed shooting Kodak E100VS film over the years, and some of my last rolls were shot in Taiwan. It’s a film that unfortunately is not made anymore, but at least we have a new Ektachrome to have fun with now.
Street Art in Taipei. Shot on Kodak E100VS 35mm film.
Painted staircase in Taipei. Shot on Kodak E100VS 35mm film
Taipei Street Art. Shot with a Google Pixel.
Words for our time: “Where there is democracy, equality and freedom can exist; without democracy, equality and freedom are merely empty words”.
This is, of course, only a small number of the total photos I took there. I’d really recommend a trip to Taiwan, and I look forward to going back there some day.
If you’re near Melbourne, you should go to Healesville Sanctuary and enjoy the Australian native animals. I’ve been a number of times over the years, and here’s a couple of photos from a (relatively, as in, the last couple of years) recent trip.
Leah trying to photograph a much too close bird
Koalas seem to always look like they’ve just woken up. I’m pretty convinced this one just had.
Some shots on Kodak Portra 400 from Adelaide. These would have been shot with my Nikon F80 35mm body, I think all with the 50mm lens. These are all pre-pandemic, and I haven’t gone and looked up when exactly. I’m just catching up on scanning some negatives.
On the random old photos train, there’s some from spending time in Tasmania post linux.conf.au 2017 in Hobart.
All of these are Kodak E100VS film, which was no doubt a bit out of date by the time I shot it (and when they stopped making Ektachrome for a while). It was a nice surprise to be reminded of a truly wonderful Tassie trip, taken with friends, and after the excellent linux.conf.au.
I recently got around to scanning some film that took an awful long time to make its way back to me after being developed. There’s some pictures from home.
Melbourne, circa December 2016
The rest of this roll of 35mm Fuji Velvia 50 is from Tasmania, which would place this all around December 2016.
It’s strange to get unexpected photos from a while ago. It’s also joyous.
Birds at Karkarook Park
These photos above are from a park down the street from where we used to live. I believe it was originally a quarry, and a number of years ago the community got together and turned it into a park. It’s a quite decent size (Parkrun is held there), and there’s plenty of birds (and ducks!) to see.
Moorabbin Station
It’s a very strange feeling seeing photos from both the before time, and from where I used to live. I’m sure that if the world wasn’t the way it was now, and there wasn’t a pandemic, it would feel different.
All of the above were shot on a Nikon F80 with 35mm Fuji Velvia 50 film.
Somewhere in the mid to late 1990s I picked myself up a Macintosh Plus for the sum of $60AUD. At that time there were still computer Swap Meets where old and interesting equipment was around, so I headed over to one at some point (at the St Kilda Town Hall if memory serves) and picked myself up four 1MB SIMMs to boost the RAM of it from the standard 1MB to the insane amount of 4MB. Why? Umm… because I could? The RAM was pretty cheap, and somewhere in the house to this day, I sometimes stumble over the 256KB SIMMs as I just can’t bring myself to get rid of them.
This upgrade probably would have cost close to $2,000 at the system’s release. If the Macintosh system software were better at disk caching you could have easily held the whole 800k of the floppy disk in memory and still run useful software!
One of the annoying things that started with the Macintosh was odd screws and Apple gear being hard to get into. Compare to say, the Apple ][ which had handy clips to jump inside whenever. In fitting my massive FOUR MEGABYTES of RAM back in the day, I recall using a couple of Allen keys sticky-taped together to be able to reach in and get the recessed Torx screws. These days, I can just order a Torx bit off Amazon and have it arrive pretty quickly. Well, two Torx bits, one of which is just too short for the job.
My (dusty) Macintosh Plus
One thing had always struck me about it, it never really looked like the photos of the Macintosh Plus I saw in books. In what is an embarrassing number of years later, I learned that a lot can be gotten from the serial number printed on the underside of the front of the case.
Manufactured in: F => Fremont, California, USA
Year of production: 1985
Week of production: 14
Production number: 3V3 => 4457
Model ID: M0001WP => Macintosh 512K (European Macintosh ED)
Your Macintosh 512K (European Macintosh ED) was the 4457th Mac manufactured during the 14th week of 1985 in Fremont, California, USA.
Pretty cool! So it is certainly a Plus as the logic board says that, but it’s actually an upgraded 512k! If you think it was madness to have a GUI with only 128k of RAM in the original Macintosh, you’d be right. I do not envy anybody who had one of those.
Some time a decent (but not too many, less than 10) years ago, I turned on the Mac Plus to see if it still worked. It did! But then… some magic smoke started to come out (which isn’t so good), but the computer kept working! There’s something utterly bizarre about looking at a computer with smoke coming out of it that continues to function perfectly fine.
Anyway, as the smoke was coming out, I decided that it would be an opportune time to turn it off, open doors and windows, and put it away until I was ready to deal with it.
One Global Pandemic Later, and now was the time.
I suspected it was going to be a capacitor somewhere that blew, and figured that I should replace it, and probably preemptively replace all the other electrolytic capacitors that could likely leak and cause problems.
First things first though: dismantle it and clean everything. First, taking the case off. Apple is not new to the game of annoying screws to get into things. I ended up spending $12 on this set on Amazon, as the T10 bit can actually reach the screws holding the case on.
My Macintosh Plus with the case off. Note INTERNATIONAL and DANGER HIGH VOLTAGE
Cathode Ray Tubes are not to be messed with. We’re talking lethal voltages here. It had been many years since electricity went into this thing, so all was good. If this all doesn’t work first time when reassembling it, I’m not exactly looking forward to discharging a CRT and working on it.
The inside of my Macintosh Plus, with lots of grime.
You can see there’s grime everywhere. It’s not the worst in the world, but it’s not great (and kinda sticky). Obviously, this needs to be cleaned! The best way to do that is take a lot of photos, dismantle everything, and clean it a bit at a time.
There’s four main electronic components inside a Macintosh Plus:
The CRT itself
The floppy disk drive
The Logic Board (what Mac people call what PC people call the motherboard)
The Analog Board
There’s also some metal structure that keeps some things in place. There’s only a few connectors between things, which are pretty easy to remove. If you don’t know how to discharge a CRT and what the dangers of them are you should immediately go and find out through reading rather than finding out by dying. I would much prefer it if you dyed (because creative fun) rather than died.
Once the floppy connector and the power connector are unplugged, the logic board slides out pretty easily. You can see from the photo below that I have the 4MB of RAM installed and the resistor you need to snip is, well, snipped (but look really closely for that). Also, grime.
Macintosh Plus Logic Board
Cleaning things? Well, there’s two ways that I have used (and considering I haven’t yet written the post with “hurray, it all works”, currently take it with a grain of salt until I write that post). One: contact cleaner. Two: detergent.
Macintosh Plus Logic Board (being washed in my sink)
I took the route of cleaning things first, and then doing recapping adventures. So it was some contact cleaner on the boards, and then some soaking with detergent. This actually all worked pretty well.
Note that C30, C32, C38, C39, and C37 were missing from the kit I received (probably due to differences in the US and International boards). I did have an X2 cap (for C37) but it was 0.1uF not 0.47uF. I also had two extra 1000uF 16V caps.
Macintosh Repair and Upgrade Secrets (up to the Mac SE no less!) holds an Appendix with the parts listing for both the US and International Analog boards, and this led me to conclude that they are in fact different boards rather than just a few wires that are different. I am not sure what the “For 120V operation, W12 must be in place” and “for 240V operation, W12 must be removed” writing is about on the International Analog board, but I’m not quite up to messing with that at the moment.
So, I ordered the parts (linked above) and waited (again) to be able to finish re-capping the board.
I found this video (https://youtu.be/H9dxJ7uNXOA) to be a good one for learning a bunch about the insides of compact Macs; I recommend it and several others on his YouTube channel. One interesting thing I learned is that the X2 cap (C37 on the International one) is before the power switch, so could blow just by having the system plugged in and not turned on! Okay, so I’m kind of assuming that it also applies to the International board, and mine exploded while it was plugged in and switched on, so YMMV.
Additionally, there’s an interesting list of commonly failing parts. Unfortunately, this is also for the US logic board, so the tables in Macintosh Repair and Upgrade Secrets are useful. I’m hoping that I don’t have to replace anything more there, but we’ll see.
But, after the Nth round of parts being delivered….
Note the lack of an exploded capacitor
Yep, that’s where the exploded cap was before. Cleaned up all pretty nicely actually. Annoyingly, I had to run it all through a step-up transformer as the board is all set for Australian 240V rather than US 120V. This isn’t going to be an everyday computer though, so it’s fine.
Macintosh Plus booting up (note how long the memory check of 4MB of RAM takes). I’m being very careful as the cover is off. High, and possibly lethal voltages exposed.
Woohoo! It works. While I haven’t found my supply of floppy disks that (at least used to) work, the floppy mechanism also seems to work okay.
Macintosh Plus with a seemingly working floppy drive mechanism. I haven’t found a boot floppy yet though.
Next up: waiting for my Floppy Emu to arrive as it’ll certainly let it boot. Also, it’s now time to rip the house apart to find a floppy disk that certainly should have made its way across the ocean with the move…. Oh, and also to clean up the mouse and keyboard.
Thanks to my most recent PR being merged, op-build v2.5 will have full support for the Raptor Blackbird! This includes support for the “IPL Monitor” that’s required to get fan control going.
Note that if you’re running Fedora 32 then you need some patches to buildroot to have it build, but if you’re building on something a little older, then upstream should build and work straight out of the box (err… git tree).
I also note that the work to get Secure Boot for an OS Kernel going is starting to make its way out for code reviews, so that’s something to look forward to (although without a TPM we’re going to need extra code).
I have done a few builds of firmware for the Raptor Blackbird since I got mine, each of them based on upstream op-build plus a few patches. The previous one was Yet another near-upstream Raptor Blackbird firmware build that I built a couple of months ago. This new build is based off the release candidate of op-build v2.5. Here’s what’s changed:
There’s two differences from upstream op-build: my pull request to op-build, and the fixing of the (old) buildroot so that it’ll build on Fedora 32. From discussions on the openpower-firmware mailing list, it seems that one hopeful thing is to have all the Blackbird support merged in before the final op-build v2.5 is tagged. The previous op-build release (v2.4) was tagged in July 2019, so we’re about 10 months into what was a 2 month release cycle, so speculating on when that final release will be is somewhat difficult.
So, following on from my post on Sensors on the Blackbird (and thus Power9), I mentioned that when you look at the temperature sensors for each CPU core in my 8-core POWER9 chip, they’re not linear numbers. Let’s look at what that means….
stewart@blackbird9$ sudo ipmitool sensor | grep core
p0_core0_temp | na
p0_core1_temp | na
p0_core2_temp | na
p0_core3_temp | 38.000
p0_core4_temp | na
p0_core5_temp | 38.000
p0_core6_temp | na
p0_core7_temp | 38.000
p0_core8_temp | na
p0_core9_temp | na
p0_core10_temp | na
p0_core11_temp | 37.000
p0_core12_temp | na
p0_core13_temp | na
p0_core14_temp | na
p0_core15_temp | 37.000
p0_core16_temp | na
p0_core17_temp | 37.000
p0_core18_temp | na
p0_core19_temp | 39.000
p0_core20_temp | na
p0_core21_temp | 39.000
p0_core22_temp | na
p0_core23_temp | na
You can see I have eight CPU cores in my Blackbird system. The reason the 8 CPU cores are core 3, 5, 7, 11, 15, 17, 19, and 21 rather than 0-7 or something is that these represent the core numbers on the physical die, and the die is a 24 core die. When you’re making a chip as big and as complex as modern high performance CPUs, not all of the chips coming out of your fab are going to be perfect, so this is how you get different models in the line with only one production line.
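If you want to pull the working core numbers out of that output without squinting, a few lines of Python will do it. This is just a sketch: the function name `active_cores` is made up, the sample is the first eight lines of my output above, and it assumes the sensor names keep the `p0_coreN_temp` pattern.

```python
PREFIX = "p0_core"
SUFFIX = "_temp"


def active_cores(ipmitool_output: str):
    """Return physical core numbers that report a temperature (not 'na')."""
    cores = []
    for line in ipmitool_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 2:
            continue
        name, reading = fields[0], fields[1]
        if name.startswith(PREFIX) and name.endswith(SUFFIX) and reading != "na":
            cores.append(int(name[len(PREFIX):-len(SUFFIX)]))
    return cores


SAMPLE = """\
p0_core0_temp | na
p0_core1_temp | na
p0_core2_temp | na
p0_core3_temp | 38.000
p0_core4_temp | na
p0_core5_temp | 38.000
p0_core6_temp | na
p0_core7_temp | 38.000
"""
```

On the sample this gives cores 3, 5 and 7; fed the full 24 lines it should give the eight core numbers listed above.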
Weirdly, the output from the hwmon sensors has a “core 24” and a “core 28”. That’s just… wrong. What it is, however, is right if you think of 8*4=32. This is a product of Linux thinking that Thread=Core in some ways. So, yeah, this numbering is the number of the first thread of each core.
But let’s ignore that, and go from the IPMI sensors (which also match what the OCC shows with “occtoolp9 -SL”, see below).
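The 8*4=32 arithmetic is easy to check. The sketch below assumes four hardware threads per core (SMT4) and that Linux numbers CPUs consecutively by thread, which is how I read the numbering; the function name is made up.

```python
THREADS_PER_CORE = 4  # assumption: the chip is running in SMT4 mode


def first_thread_cpu_numbers(n_cores, threads_per_core=THREADS_PER_CORE):
    """Linux CPU number of the first thread of each working core."""
    return [core * threads_per_core for core in range(n_cores)]


FIRST_THREADS = first_thread_cpu_numbers(8)
```

With eight cores, `FIRST_THREADS` runs from 0 up to 28 in steps of four, which is where a “core 24” and a “core 28” come from on a chip that has no such physical cores enabled.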
$ ./occtoolp9 -SL
Sensor Details: (found 86 sensors, details only for Status of 0x00)
GUID Name Sample Min Max U Stat Accum UpdFreq ScaleFactr Loc Type
....
0x00ED TEMPC03……… 47 29 47 C 0x00 0x00037CF2 0x00007D00 0x00000100 0x0040 0x0008
0x00EF TEMPC05……… 37 26 39 C 0x00 0x00014E53 0x00007D00 0x00000100 0x0040 0x0008
0x00F1 TEMPC07……… 46 28 46 C 0x00 0x0001A777 0x00007D00 0x00000100 0x0040 0x0008
0x00F5 TEMPC11……… 44 27 45 C 0x00 0x00018402 0x00007D00 0x00000100 0x0040 0x0008
0x00F9 TEMPC15……… 36 25 43 C 0x00 0x000183BC 0x00007D00 0x00000100 0x0040 0x0008
0x00FB TEMPC17……… 38 28 41 C 0x00 0x00015474 0x00007D00 0x00000100 0x0040 0x0008
0x00FD TEMPC19……… 43 27 44 C 0x00 0x00016589 0x00007D00 0x00000100 0x0040 0x0008
0x00FF TEMPC21……… 36 30 40 C 0x00 0x00015CA9 0x00007D00 0x00000100 0x0040 0x0008
So what does that mean for physical layout? Well, like all modern high performance chips, the POWER9 is modular, with a bunch of logic being replicated all over the die. The most notable duplicated parts are the core (replicated 24 times!) and cache structures. Less so are memory controllers and PCI hardware.
See that each core (e.g. EC00 and EC01) is paired with the cache block (EC00 and EC01 with EP00). That’s two POWER9 cores with one 512KB L2 cache and one 10MB L3 cache.
You can see the cache layout (including L1 Instruction and Data caches) by looking in sysfs:
$ for i in /sys/devices/system/cpu/cpu0/cache/index*/; \
do echo -n $(cat $i/level) $(cat $i/size) $(cat $i/type); \
echo; done
1 32K Data
1 32K Instruction
2 512K Unified
3 10240K Unified
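The same information can be read from Python if you would rather have it as data than as text. This is a minimal sketch of the shell loop above; `read_cache_layout` is a made-up name, and it assumes each `index*` directory has `level`, `size` and `type` files as they do on my machine.

```python
from pathlib import Path


def read_cache_layout(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return (level, size, type) for each cache index of one CPU."""
    layout = []
    for index_dir in sorted(Path(cpu_dir).glob("cache/index*")):
        # Each attribute is a one-line text file in sysfs.
        level = int((index_dir / "level").read_text().strip())
        size = (index_dir / "size").read_text().strip()
        kind = (index_dir / "type").read_text().strip()
        layout.append((level, size, kind))
    return layout
```

Calling `read_cache_layout()` with no arguments looks at cpu0; passing a different directory lets you look at another CPU, or at a copy of the sysfs tree taken from another machine.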
So, what does the layout of my POWER9 chip look like? Well, thanks to the power of graphics software, we can cross some cores out and look at the topology:
My 8-core POWER9 CPU in my Raptor Blackbird
If I run some memory bandwidth benchmarks, I can see the L3 cache capacity you’d assume from the above diagram: 80MB (10MB/core). Let’s see:
If all the cores were packed together, I’d expect that cliff to be a lot sooner.
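The reasoning behind that can be sketched in a few lines. The big assumption here is that cores 2k and 2k+1 always share cache block k, extrapolated from the EC00/EC01 with EP00 pairing mentioned above; the names are made up for illustration.

```python
ACTIVE_CORES = [3, 5, 7, 11, 15, 17, 19, 21]  # from the sensor output
L3_PER_PAIR_MB = 10                            # one 10MB L3 per core pair


def l3_blocks(cores):
    """Cache blocks in use, assuming cores 2k and 2k+1 share block k."""
    return sorted({core // 2 for core in cores})


def usable_l3_mb(cores):
    """Total L3 reachable if every active core pair keeps its own L3."""
    return len(l3_blocks(cores)) * L3_PER_PAIR_MB


MY_CHIP_MB = usable_l3_mb(ACTIVE_CORES)
PACKED_MB = usable_l3_mb(range(8))  # hypothetical chip using cores 0 to 7
```

Under that assumption none of my eight cores share a cache block, giving 80MB, whereas eight cores packed into four pairs would only see 40MB.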
So how does this compare to other machines I have around? Well, let’s look at my Ryzen 7. Specifically, a “AMD Ryzen 7 1700 Eight-Core Processor”. The cache layout is:
$ for i in /sys/devices/system/cpu/cpu0/cache/index*/; \
do echo -n $(cat $i/level) $(cat $i/size) $(cat $i/type); \
echo; \
done
1 32K Data
1 64K Instruction
2 512K Unified
3 8192K Unified
And then the performance benchmark similar to the one I ran above on the POWER9 (lower numbers down low as 8MB is less than 10MB)
I’m not sure what performance conclusions we can realistically draw from these curves, apart from “keeping workload to L3 cache is cool”, and “different chips have different cache hardware”, and “I should probably go and read and remember more about the microarchitectural characteristics of the cache hardware in Ryzen 7 hardware and 10th gen Intel Core hardware”.
In this post we’re going to look at three different ways to look at various sensors in the Raptor Blackbird system. The Blackbird is a single socket uATX board for the POWER9 processor. One advantage of the system is completely open source firmware, so you can (like I have) build your own firmware. So, this is my Blackbird running my most recent firmware build (the BMC is running the 2.00 release from Raptor).
Sensors over IPMI
One way to get the sensors is over IPMI. This can be done either in-band (as in, from the OS running on the blackbird), or over the network.
stewart@blackbird9$ sudo ipmitool sensor |head
occ | na | discrete | na | na | na | na | na | na | na
occ0 | 0x0 | discrete | 0x0200| na | na | na | na | na | na
occ1 | 0x0 | discrete | 0x0100| na | na | na | na | na | na
p0_core0_temp | na | | na | na | na | na | na | na | na
p0_core1_temp | na | | na | na | na | na | na | na | na
p0_core2_temp | na | | na | na | na | na | na | na | na
p0_core3_temp | 38.000 | degrees C | ok | na | -40.000 | na | 78.000 | 90.000 | na
p0_core4_temp | na | | na | na | na | na | na | na | na
p0_core5_temp | 38.000 | degrees C | ok | na | -40.000 | na | 78.000 | 90.000 | na
p0_core6_temp | na | | na | na | na | na | na | na | na
It’s kind of annoying to read there, so standard unix tools to the rescue!
stewart@blackbird9$ sudo ipmitool sensor | cut -d '|' -f 1,2
occ | na
occ0 | 0x0
occ1 | 0x0
p0_core0_temp | na
p0_core1_temp | na
p0_core2_temp | na
p0_core3_temp | 38.000
p0_core4_temp | na
p0_core5_temp | 38.000
p0_core6_temp | na
p0_core7_temp | 38.000
p0_core8_temp | na
p0_core9_temp | na
p0_core10_temp | na
p0_core11_temp | 37.000
p0_core12_temp | na
p0_core13_temp | na
p0_core14_temp | na
p0_core15_temp | 37.000
p0_core16_temp | na
p0_core17_temp | 37.000
p0_core18_temp | na
p0_core19_temp | 39.000
p0_core20_temp | na
p0_core21_temp | 39.000
p0_core22_temp | na
p0_core23_temp | na
p0_vdd_temp | 40.000
dimm0_temp | 35.000
dimm1_temp | na
dimm2_temp | na
dimm3_temp | na
dimm4_temp | 38.000
dimm5_temp | na
dimm6_temp | na
dimm7_temp | na
dimm8_temp | na
dimm9_temp | na
dimm10_temp | na
dimm11_temp | na
dimm12_temp | na
dimm13_temp | na
dimm14_temp | na
dimm15_temp | na
fan0 | 1200.000
fan1 | 1100.000
fan2 | 1000.000
p0_power | 33.000
p0_vdd_power | 5.000
p0_vdn_power | 9.000
cpu_1_ambient | 30.600
pcie | 27.000
ambient | 26.000
You can see that I have 3 fans, two DIMMs (although why it lists 16 possible DIMMs for a two DIMM slot board is a good question!), and eight CPU cores. More on why the layout of the CPU cores is the way it is in a future post.
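If cut isn’t enough, the full ten-column output is also easy to parse. The sketch below names the columns the way I understand ipmitool orders them (reading, units, status, then the lower and upper thresholds), so treat those names as an assumption; the function names are made up, and the sample lines are from my output above.

```python
COLUMNS = ("name", "reading", "units", "status",
           "lnr", "lcr", "lnc", "unc", "ucr", "unr")


def parse_sensor_line(line: str):
    """Split one 'ipmitool sensor' line into a dict keyed by column name."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d"
                         % (len(COLUMNS), len(fields)))
    return dict(zip(COLUMNS, fields))


def headroom_to_critical(sensor):
    """Degrees (or other units) left before the upper critical threshold."""
    if sensor["reading"] == "na" or sensor["ucr"] == "na":
        return None  # sensor absent, or no threshold defined
    return float(sensor["ucr"]) - float(sensor["reading"])


SAMPLE_LINES = [
    "occ0 | 0x0 | discrete | 0x0200| na | na | na | na | na | na",
    "p0_core0_temp | na | | na | na | na | na | na | na | na",
    "p0_core3_temp | 38.000 | degrees C | ok | na | -40.000 | na | 78.000 | 90.000 | na",
]
SENSORS = [parse_sensor_line(line) for line in SAMPLE_LINES]
```

For `p0_core3_temp` this reports 52 degrees of headroom before the 90 degree upper critical threshold, and `None` for the sensors that have no reading or no threshold.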
The code path for reading these sensors is interesting: it’s all from the BMC, so we’re having the OCC inside the P9 read things, which the BMC then reads, and then passes back to the P9. On the P9 itself, each sensor is a call all the way to firmware and back! In fact, we can look at it in perf:
What are the 0x300xxxxx addresses? They’re the OPAL firmware (i.e. skiboot). We can look up the symbols easily, as the firmware exposes them to the kernel, which then plonks it in sysfs:
[stewart@blackbird9 ~]$ sudo head /sys/firmware/opal/symbol_map
[sudo] password for stewart:
0000000000000000 R __builtin_kernel_end
0000000000000000 R __builtin_kernel_start
0000000000000000 T __head
0000000000000000 T _start
0000000000000010 T fdt_entry
00000000000000f0 t boot_sem
00000000000000f4 t boot_flag
00000000000000f8 T attn_trigger
00000000000000fc T hir_trigger
0000000000000100 t sreset_vector
So we can easily look up exactly where this is:
[stewart@blackbird9 ~]$ sudo grep '18e.. ' /sys/firmware/opal/symbol_map
0000000000018e20 t .__try_lock.isra.0
0000000000018e68 t .add_lock_request
So we’re managing to spend a whole 12% of execution time spinning on a spinlock in firmware! The call stack of what’s going on in firmware isn’t so easy to get at, but we can find the bt_add_ipmi_msg call there, which is probably how everything starts:
[stewart@blackbird9 ~]$ sudo grep '516.. ' /sys/firmware/opal/symbol_map
0000000000051614 t .bt_add_ipmi_msg_head
0000000000051688 t .bt_add_ipmi_msg
00000000000516fc t .bt_poll
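The grep-and-eyeball lookup can be wrapped in a small helper that finds the nearest symbol at or below an address. This is a sketch under a few assumptions: opal_symbol_for is a hypothetical name, the 0x30000000 base is inferred from the 0x300xxxxx addresses mentioned above, and 0x30018e40 in the usage note is a made-up example address, not one taken from the perf output.

```shell
# Hypothetical helper: print the symbol covering a firmware address.
# usage: opal_symbol_for ADDRESS [MAPFILE] [BASE]
# Runs in a subshell so its variables don't leak into the caller.
opal_symbol_for() (
    map=${2:-/sys/firmware/opal/symbol_map}
    # Assumption: skiboot is loaded at 0x30000000 and the map holds offsets.
    offset=$(( $1 - ${3:-0x30000000} ))
    # The map uses zero-padded 16-digit lowercase hex, so matching that
    # format lets plain string comparison order the addresses correctly.
    key=$(printf '%016x' "$offset")
    LC_ALL=C awk -v key="$key" '
        # Appending "" forces string comparison; otherwise awk could read
        # something like 18e20 as a number in scientific notation.
        ($1 "") <= (key "") && ($1 "") >= (best "") { best = $1; name = $3 }
        END { if (name != "") print name }
    ' "$map"
)
```

On the machine above, "opal_symbol_for 0x30018e40" run as root (the symbol map isn't world readable, hence the sudo earlier) should print .__try_lock.isra.0, going by the grep output shown.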
OCCTOOL
This is the most not-what-you’re-meant-to-use method of getting access to sensors! It’s using a debug tool for the OCC firmware! There’s a variety of tools in the OCC source repository, and one of them (occtoolp9) can be used for a variety of things, one of which is getting sensor data out of the OCC.
The odd thing you’ll see is “via opal-prd”, and this is because it’s doing raw calls to the opal-prd binary to talk to the OCC firmware, running things like “opal-prd --expert-mode htmgt-passthru”. Yeah, this isn’t an in-production thing :)
Amazingly (and interestingly), this doesn’t go through host firmware in the way that an IPMI call will. There’s a full OCC/Host firmware interface spec to read. But it’s an insanely inefficient way to monitor sensors: a long bash script shelling out to a whole bunch of other processes… Think ~14.4 billion cycles versus ~367 million cycles for the ipmitool option above, or roughly 39 times as many.
But there are some interesting sensors at the end of the list:
Sensor Details: (found 86 sensors, details only for Status of 0x00)
GUID Name Sample Min Max U Stat Accum UpdFreq ScaleFactr Loc Type
....
0x014A MRDM0……….. 688 3 15015 GBs 0x00 0x0144AE6C 0x00001901 0x000080FB 0x0008 0x0200
0x014E MRDM4……….. 480 3 14739 GBs 0x00 0x01190930 0x00001901 0x000080FB 0x0008 0x0200
0x0156 MWRM0……….. 560 4 16605 GBs 0x00 0x014C61FD 0x00001901 0x000080FB 0x0008 0x0200
0x015A MWRM4……….. 360 4 16597 GBs 0x00 0x014AE231 0x00001901 0x000080FB 0x0008 0x0200
Is that memory bandwidth? Well, if I run the STREAM benchmark in a loop and look again:
In what is becoming a monthly occurrence, I’ve put up yet another firmware build for the Raptor Blackbird with close-to-upstream firmware (see here and here for previous ones).
Well, I’ve done another build! It’s current op-build (as of yesterday), but my branch with patches for the Raptor Blackbird. The skiboot patch is there, and the SBE speedup patch is now upstream. The machine-xml is straight from Raptor, but in my repo.
If we compare this to the last build I put up, we have:
Component | old | new
skiboot | v6.5-209-g179d53df-p4360f95 | v6.5-228-g82aed17a-p4360f95
linux | 5.4.13-openpower1-pa361bec | 5.4.22-openpower1-pdbbf8c8
occ | 3ab2921 | no change
hostboot | 779761d-pe7e80e1 | acdff8a-pe7e80e1
buildroot | 2019.05.3-14-g17f117295f | 2019.05.3-15-g3a4fc2a888
capp-ucode | p9-dd2-v4 | no change
machine-xml | site_local-stewart-a0efd66 | no change
hostboot-binaries | hw011120a.opmst | hw013120a.opmst
sbe | 166b70c-p06fc80c | c318ab0-p1ddf83c
hcode | hw011520a.opmst | hw030220a.opmst
petitboot | v1.11 | v1.12
version | blackbird-v2.4-415-gb63b36ef | blackbird-v2.4-514-g62d1a941
So, what do those changes mean? Not too much changed over the past month: a kernel bump, a new petitboot (I can’t find release notes, but it doesn’t look like there are a lot of changes), and slight bumps to other firmware components.
To flash it, copy blackbird.pnor to your Blackbird’s BMC in /tmp/ (important! the /tmp filesystem has enough room, the home directory for root does not), and then run:
pflash -E -p /tmp/blackbird.pnor
Which will ask you to confirm and then flash:
About to erase chip !
WARNING ! This will modify your HOST flash chip content !
Enter "yes" to confirm:yes
Erasing... (may take a while)
[==================================================] 99% ETA:1s
done !
About to program "/tmp/blackbird.pnor" at 0x00000000..0x04000000 !
Programming & Verifying...
[==================================================] 100% ETA:0s
A few weeks ago (okay, close to six), I put up a firmware build for the Raptor Blackbird with close-to-upstream firmware (see here).
Well, I’ve done another build! It’s current op-build (as of this morning), but my branch with patches for the Raptor Blackbird. The skiboot patch is there, as is the SBE speedup patch. It has the current kernel (works fine with my hardware), current petitboot, and the machine-xml, which is straight from Raptor but in my repo.
The Self Boot Engine (SBE) is a small embedded PPE42 core inside the POWER9 CPU which has the unenviable job of getting a single POWER9 core ready enough to start executing instructions out of L3 cache, and poking some instructions into said cache for the core to start executing.
It’s called the “Self Boot Engine” as in generations prior to POWER8, it was the job of the FSP (Service Processor) to do all of the booting for the CPU. On POWER8, there was still an SBE, but it was a custom instruction set (this was the Power On Reset Engine – PORE), while the PPE42 is basically a 32bit powerpc core cut straight down the middle (just the way to make it awkward for toolchains).
One of the things I noted in my post on Booting temporary firmware on the Raptor Blackbird is that we got serial console output from the SBE. It turns out one of the things explicitly not enabled by Raptor in their build was this output as “it made the SBE boot much slower”. I’d actually long suspected this, but hadn’t really had the time to delve into it.
WARNING: hacking on your SBE firmware can be relatively dangerous, as it’s literally the first thing that needs to work in order to boot the system, and there isn’t (AFAIK) a publicly documented easy way to re-flash your SBE firmware if you mess it up.
Seeing as we saw a regression in boot time with the UART output enabled, we need to look at the uartPutChar() function in sbeConsole.C (error paths removed for clarity):
One thing you may notice if you’ve spent some time around serial ports is that it’s not using the transmit FIFO! According to Wikipedia the original 16550 had a broken FIFO, but we’re certainly not going to be hooked up to an original rev of that silicon.
To compare, let’s look at the skiboot code, which is all in hw/lpc-uart.c:
The uart_check_tx_room() function is pretty simple: it checks if there’s room in the FIFO and knows that there are 16 entries. Next, we have a busy loop that waits until there’s room again in the FIFO:
static void uart_wait_tx_room(void)
{
	while (!tx_room) {
		uart_check_tx_room();
		if (!tx_room) {
			smt_lowest();
			do {
				barrier();
				uart_check_tx_room();
			} while (!tx_room);
			smt_medium();
		}
	}
}
Finally, the bit of code that writes the (internal) log buffer out to a serial port:
/*
 * Internal console driver (output only)
 */
static size_t uart_con_write(const char *buf, size_t len)
{
	size_t written = 0;

	/* If LPC bus is bad, we just swallow data */
	if (!lpc_ok() && !mmio_uart_base)
		return written;

	lock(&uart_lock);
	while(written < len) {
		if (tx_room == 0) {
			uart_wait_tx_room();
			if (tx_room == 0)
				goto bail;
		} else {
			uart_write(REG_THR, buf[written++]);
			tx_room--;
		}
	}
bail:
	unlock(&uart_lock);
	return written;
}
The skiboot code ends up being a bit more complicated for a number of reasons, but the basic algorithm could be applied to the SBE code: rather than busy waiting for each character to be written out before sending the next one into the FIFO, we could just splat things down there and continue with life. So, I put together a patch to try out.
Before (i.e. upstream SBE code): it took about 15 seconds from “Welcome to SBE” to “Booting Hostboot”.
Now (with my patch): Around 10 seconds.
It’s a full five seconds (33%) faster to get through the SBE stage of booting. Wow.
Hopefully somebody looks at the pull request sometime soon, as it’s probably useful to a lot of people doing firmware and Operating System development.
So, Happy New Year for Blackbird owners (I’ll publish a build with this and other misc improvements “soon”).