Thursday, March 20, 2025

Switched On: Three types of controls meet three types of targets

CK: I’ve been struggling for a long time with a deep problem having to do with how various types of controls should be mapped to various types of target properties. If all of the properties are ranges, and all of the controllers are continuous, there’s no problem. The problem only comes about when some of the controls are switches, and/or some of the properties are booleans or actions. Actions are sometimes referred to as “bangs” in Max/MSP terminology. It’s about time I came up with a solution.

After much consideration, I believe there are three types of controls, and three types of target properties.

The three types of controls:

  • Knob/Slider
  • Toggle
  • Momentary

The three types of targets:

  • Range/Selection
  • Boolean
  • Action

They form a 3 x 3 matrix with nine permutations. In a proper implementation, all nine possibilities should exhibit reasonable behavior.

But for any given property, only the control type varies, because the target’s type is invariant in any given application. A target that’s a range stays a range. The only question is what type of control is being mapped to it.

Of course there’s nothing to stop the user from lying. They could select the toggle control type, but actually connect a knob. If we include lying, the matrix is 3 x 3 x 3. But luckily, I think we can ignore lying, as it’s not in the user’s interest.

I wonder, does this issue have a name? Presumably I’m not the first person to consider this problem.

I would like to see the 3 x 3 matrix fleshed out, with an unambiguous definition of correct behavior in each case. Here, I’ll start:

Knob -> Range: Space the range of knob values evenly across the range of the target property.

Knob -> Boolean: When the knob transitions from zero to non-zero or vice versa, if the knob is non-zero, set the property to its maximum value; otherwise set the property to its minimum value.

Knob -> Action: When the knob transitions from zero to any other value, do the action.

So there you go: three down, six more to go. Is it making any sense?

GPT: Below is one possible 3×3 matrix of behaviors mapping Control Types (Knob, Toggle, Momentary) onto Property Types (Range, Boolean, Action). It’s meant as a conceptual template; in a real application, you might tweak details (like thresholds or event edges) to suit your workflow. But these definitions are consistent and cover all nine permutations in a straightforward manner.


Control Types

  1. Knob

    • A continuous control that can vary from 0 to some maximum (e.g., 127 in MIDI CC, or 1.0 if normalized).
    • Often used for parameter sweeps or any continuous property.
    • The control remembers its last position – it’s not ephemeral.
  2. Toggle

    • A persistent on/off switch. Pressing toggles the internal state from off → on or on → off.
    • The control latches in whichever state was last pressed.
  3. Momentary

    • A push button that is “on” only while pressed. Once you release, it returns to off.
    • The control does not latch – it reverts to off automatically.

Property Types

  1. Range

    • A continuous property from a minimum to maximum value (e.g., 0.0→1.0, or discrete but multi-step).
    • Example: volume, filter cutoff, brightness, etc.
  2. Boolean

    • An on/off or true/false property.
    • Example: bypass, mute, or any binary state.
  3. Action

    • A triggered or one-shot command (“bang”), typically no sustained state.
    • Example: “Play note,” “Fire event,” or “Trigger effect” once per activation.

Mapping Behaviors (3 × 3)

Below is a suggested default for each combination:

1) Knob → Range

  • Behavior: Map the knob’s 0..max range linearly (or via user-chosen curve) onto the property’s min..max.
  • Example: If knob is 50% of travel, set property to 50% of range.

2) Knob → Boolean

  • Behavior: Typically pick a threshold. If knob ≤ some cutoff (e.g., 5% or exactly 0), property = Off; otherwise property = On.
  • User Variation: Some prefer “knob < 50% → Off, knob ≥ 50% → On,” etc.
  • Example: If the knob is near zero, property is false. Once it crosses the threshold, property becomes true.

(Your existing logic was “if knob transitions from 0 → non-zero, set property to On,” which is effectively a threshold at 0. That also works if your knob easily reaches 0. If the knob never fully hits 0, you might prefer a small threshold.)

3) Knob → Action

  • Behavior: When the knob transitions from near 0 to above that threshold, trigger the action once. Possibly require returning the knob to 0 (or below threshold) before allowing another trigger.
  • Example: “When user quickly moves knob from 0 up, fire one event.” If the knob stays above 0, no repeated triggers. Once you dip back to 0 (or below the threshold), you can trigger again by moving up again.

4) Toggle → Range

  • Behavior: If toggle is Off, set property = min. If toggle is On, set property = max.
  • Example: A simple two-level range control: 0% vs 100%.

5) Toggle → Boolean

  • Behavior: The property directly follows the toggle state. Toggling On sets property = true, toggling Off sets property = false.
  • Example: A bypass or mute parameter that exactly matches the latched on/off switch.

6) Toggle → Action

  • Behavior: Whenever the user toggles from Off → On, fire the action once. Toggling On → Off might do nothing, or optionally also trigger if you want. Usually only the rising edge triggers the action.
  • Example: A latched footswitch. Each time you stomp from Off → On, you trigger an event. Stomping On → Off is no new action.

(It’s up to you whether both edges trigger or just one. For many “action” use cases, only the On edge is used.)


7) Momentary → Range

  • Behavior: While the momentary button is held, set property = max. When released, property = min (or a base level).
  • Example: A “clutch” or “temporary override” that sets the property to its top value only while pressed.

8) Momentary → Boolean

  • Behavior: The property is true while button is pressed, false otherwise.
  • Example: A “momentary mute” or “kill switch” that only stays active while you hold it.

9) Momentary → Action

  • Behavior: Each press triggers the action once, or once per edge if you prefer. Typically the rising edge (button down) triggers the action. Releasing does nothing.
  • Example: A “drum pad” style button. Every time you press it, you fire an event, but no sustained state is kept.

(You can also interpret “momentary → action” as sending repeated triggers while the button is held, but that’s a separate design. The simplest is a single trigger on press.)


Summary Table

Control \ Property | Range | Boolean | Action
Knob | 0..knobMax → min..max (linear mapping) | If knob < threshold → Off, else On | On 0→non-0 transition, fire action once; must return knob to 0 for next trigger
Toggle | Off → min; On → max | Property follows toggle exactly (On/Off) | Off→On edge → trigger action (one-shot)
Momentary | Held → max; Released → min | Held → true; Released → false | Button press → one trigger; release → no event

This matrix (or a variant) is a clean, explicit definition of how each control type can drive each property type in a “MIDI learn” context. You may refine threshold details, or whether off→on and on→off edges each fire actions, etc., but the above is a workable baseline solution covering all 9 combos.
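
For concreteness, here is a minimal C++ sketch of one way to dispatch the nine cases. The names (ControlType, TargetType, Mapping, ApplyControl) and the normalized 0..1 control value are illustrative assumptions, not Whorld’s actual API; the threshold is fixed at zero, per the table above:

    enum class ControlType { Knob, Toggle, Momentary };
    enum class TargetType  { Range, Boolean, Action };

    struct Mapping {
        ControlType ctrl;
        TargetType  target;
        bool        bWasOn = false;  // previous on/off state, for edge detection
    };

    // fVal is the control's normalized value in [0..1]; toggles and momentary
    // buttons report 0 or 1. The three setters stand in for the target property.
    void ApplyControl(Mapping& m, float fVal,
                      void (*setRange)(float),   // range property, normalized
                      void (*setBool)(bool),     // boolean property
                      void (*fireAction)())      // one-shot action
    {
        const bool bOn = fVal > 0.0f;            // threshold at zero
        switch (m.target) {
        case TargetType::Range:
            // Knob: linear map; Toggle/Momentary: jump between min and max
            setRange(m.ctrl == ControlType::Knob ? fVal : (bOn ? 1.0f : 0.0f));
            break;
        case TargetType::Boolean:
            setBool(bOn);                        // property follows on/off state
            break;
        case TargetType::Action:
            if (bOn && !m.bWasOn)                // rising edge only; must return
                fireAction();                    // to zero before re-triggering
            break;
        }
        m.bWasOn = bOn;
    }

Because a toggle or momentary button reports only 0 or 1, the same rising-edge test yields the “Off→On edge fires once” behavior for Toggle → Action and Momentary → Action, with no extra code.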

Tuesday, March 18, 2025

Revisiting MIDI mapping

It is paramount to restore V1's asymmetrical dual-range normalization:

fNormVal = nMidiVal <= 64 ? (nMidiVal / 128.0) : ((nMidiVal - 1) / 126.0)

In the lower range, one MIDI step = 0.0078125
In the upper range, one MIDI step = 0.007936508

The difference is 0.000124008, or roughly a hundredth of a percent, which is practically undetectable.

MIDI data | Normalized value | Step size
0 | 0 | 0.0078125
1 | 0.0078125 | 0.0078125
2 | 0.015625 | 0.0078125
3 | 0.0234375 | 0.0078125
... | ... | ...
60 | 0.46875 | 0.0078125
61 | 0.4765625 | 0.0078125
62 | 0.484375 | 0.0078125
63 | 0.4921875 | 0.0078125
64 | 0.5 | 0.007936508
65 | 0.507936508 | 0.007936508
66 | 0.515873016 | 0.007936508
67 | 0.523809524 | 0.007936508
... | ... | ...
124 | 0.976190476 | 0.007936508
125 | 0.984126984 | 0.007936508
126 | 0.992063492 | 0.007936508
127 | 1 | N/A

This is the least bad solution to a fundamental problem, which is that MIDI data spans an even number of integer values and therefore can't represent an exact middle. The MIDI data range is [0..127] and its midpoint is 63.5, which is not representable in MIDI because it's not an integer.

Fundamental truth: for an integer range to have an exact midpoint, it must have an odd number of values. For example the range [0..10] has an odd number of values (11), and has an exact midpoint (5), with 5 values below it [0..4], and 5 values above it [6..10].

Dual-range normalization will solve the issue of controls not centering properly, just as it did in V1. An epsilon will still be needed to avoid infinitesimals, but it will be much simpler to implement.
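
For reference, here is the formula above as a minimal C++ sketch (the function name is mine, not Whorld's actual code); it reproduces the table exactly, with 64 landing on 0.5 and 127 on 1:

    // Asymmetrical dual-range normalization: [0..64] maps in steps of 1/128,
    // [64..127] maps in steps of 1/126, so that 64 lands exactly on 0.5.
    double NormalizeMidiValue(int nMidiVal)  // nMidiVal in [0..127]
    {
        return nMidiVal <= 64
            ? nMidiVal / 128.0          // lower range: step = 0.0078125
            : (nMidiVal - 1) / 126.0;   // upper range: step ≈ 0.007936508
    }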

Range Start and End should be normalized real numbers. They can be percentages in the UI, but under the hood, [0..1] should map to the target parameter's natural range.

And: the Line Width parameter should be allowed to go to zero, because line width is floating-point in Direct2D and line widths between 0 and 1 are supported and visually interesting. The range of Line Width should be [0..50], not [1..50]. This will make Global Line Width behave properly.

Whorld V2 Alpha released

An alpha release of Whorld V2 is now publicly available.

The author reserves the right to make breaking changes, and in particular, to change file formats such that previously saved files become unreadable. Your mileage may vary. You have been warned. If you wish to be an official alpha tester, please contact the author via their website, social media, or GitHub. Note the system requirements below.

Requirements:

  • PC running Windows 10 or 11 with DirectX 12 (or higher)
  • Whorld could conceivably run on Windows 8 or even Windows 7 if the necessary service packs are applied, but performance will likely suffer. Direct2D 1.1 is required.
  • Whorld may work satisfactorily on a Mac via virtualization software such as VMware or Parallels, but this is unproven. Note that virtualizer-induced misbehaviors will be considered features, not bugs.

It has been nearly twenty years since the last release. The main design goals of Whorld V2 are listed below. The first six of them are fully accomplished in this release.

  1. Improve rendering throughput by leveraging Direct2D’s hardware acceleration. [done]
  2. Implement anti-aliasing (line smoothing), which is a core feature of Direct2D. [done]
  3. Improve accuracy by drawing in floating-point coordinates, another Direct2D feature. [done]
  4. Modernize the user interface via dockable panes and similar enhancements. [done]
  5. Move rendering to a worker thread, to avoid glitches and tearing. [done]
  6. Redo the MIDI mapping, to improve usability and flexibility. [done]
  7. Streamline the application, preserving what’s valuable and discarding what isn’t. [in progress]
  8. Eventually, implement transparency (alpha) and other visual effects.

V2 has its own issues list, HERE. V2 does not necessarily agree with the V1 documentation, and this will be addressed, though not immediately.

“There’s the way it ought to be. And there’s the way it is.” —Sgt. Barnes, Platoon

“I mean it like it is, like it sounds.” —Mike, Twin Peaks, “Cooper’s Dreams”

Wednesday, March 12, 2025

Don’t lose sleep over 32 bits

CK: Good morning. Even though I've been quiet for a few days, there's been significant progress on the Whorld project. I've just been working on things that aren't worth bothering you about. But that just changed.

Over the last month or so, with your help, Whorld has successfully been ported to a multi-threaded architecture, with the rendering done entirely from a dedicated worker thread, also known as the render thread, the renderer, or just the "engine." This was drastic surgery, and while most of the consequences were foreseeable, I also expected some surprises, and today I got one: it is non-trivial for the main thread to know the render engine's target size.

The render engine's OnResize handler stores the target size whenever the render window changes, by calling m_pD2DDeviceContext->GetSize(). This method returns the window size in floating-point Device-Independent Pixels (DIPs). OnResize stores the size in a protected D2D1_SIZE_F member variable called m_szTarget. Because the member variable is protected, there's no access outside the render thread class, and no way for the main thread to read the window size. I could create an accessor that allows the main thread to read the size, but this is inviting problems. Tt would probably work fine in 64-bit, but it wouldn't be reliable in 32-bit. In 64-bit, the compiler would probably emit a qword move for D2D1_SIZE_F, effectively making access to the size atomic. I say "probably" because it's not guaranteed, but for any given compilation, whether the compiler emitted a qword write can of course be checked by reading the assembly listing. Whereas in 32-bit, the compiler would definitely emit two 32-bit moves, in which case access is non-atomic and it becomes possible that the main thread will read a corrupt half-written value, with unknown but likely unhappy consequences.

So what to do? The options as I see it are:

  1. Restrict Whorld to being a 64-bit application.
  2. Craft a more complicated accessor that uses InterlockedExchange64 to ensure that access to the target size is atomic.

The problem could also conceivably be solved by adding a "get frame size" command to the engine's list of commands, but I feel that this is overkill to say the least. In order to get the size, the main thread would have to push a command to the render thread's command queue, and then wait an indeterminate length of time for the render thread--which might be busy--to return the size, by posting a Windows message back to the main thread. Waiting for that message greatly complicates the main thread's code, the whole approach stinks, and that's why I'm excluding it from my list of options above.

So which of my two options makes more sense in your opinion? And are there other options I'm not seeing? This also raises a secondary question: why should I care about a 32-bit release? Whorld V2 is designed around the capabilities of Direct2D 1.1, and that unavoidably ties Whorld to relatively recent versions of Windows. I'm targeting Windows 10; Windows 8 should in theory work, and Windows 7 could possibly work depending on what service pack and patches are installed, but definitely no further back than that. So based on that, is it fair to expect or even stipulate 64-bit Windows?

I don't mind "taking the high road" and in fact it's a point of pride. I'm a professional programmer who specializes in multi-threading. I expect complex problems and enjoy the challenge of finding safe solutions. But I don't want to waste time on problems that needn't exist, as I already have a long list of unavoidable problems. But if you say, "Chris, it's important to do this right," I'll tend to agree, in which case I expect we'll use InterlockedExchange64. I'm unsure exactly how that method works in 32-bit--perhaps you can enlighten me?--but it doesn't matter: the point is that it exists, and it will make access to the frame size atomic.

Over to you!

GPT: Here’s my short answer up front: in 2025, it’s completely reasonable to make Whorld 64-bit only, especially if you’re already tied to relatively modern Direct2D features. That alone solves the atomicity problem—on 64-bit builds, an 8‐byte store/load of your D2D1_SIZE_F is basically atomic. If you want to remain truly “portable” to 32-bit, or simply prefer a bulletproof approach, then a small amount of locking or an Interlocked trick (like InterlockedCompareExchange64) is the robust fix. Read on for the details and trade-offs.


1. Why 64-Bit Exclusivity Is a Viable Choice

  1. Modern Windows is Predominantly 64-Bit

    • Windows 10 and 11 come preinstalled as 64-bit on virtually all new PCs. True 32-bit systems are increasingly rare.
    • If Whorld V2 already sets a baseline of “Windows 8 or 10,” then 99%+ of those installs are 64-bit.
  2. Direct2D 1.1 Already Ties You to Newer Systems

    • You’re not aiming for Windows XP or Vista anyway, so you’re already skipping the set of users who might have 32-bit OS on older hardware.
    • The 32-bit holdouts that remain are often extremely resource-limited or specialized. Are they your target market?
  3. Simplifies Code

    • If you only build for x64, then a D2D1_SIZE_F (8 bytes: width float + height float) is typically stored/loaded atomically in a single instruction—no partial writes.
    • You don’t need to fight with InterlockedExchange64 or special packaging to keep it safe on 32-bit.

For these reasons, many modern indie devs just drop 32-bit. That’s not laziness, it’s pragmatic: you focus on typical Windows 10/11 x64 systems and reduce friction.

Caveat: The C++ standard doesn’t guarantee 64-bit atomic loads/stores even on x64—some compilers might do multiple instructions. But in practice, Visual C++ on x64 always uses a single 8-byte move for an 8‐byte struct. You can confirm by checking disassembly.

If you want full language-level guarantee, you’d still mark the data as std::atomic<uint64_t> or similar. But in the Real World™, on x64, an 8-byte aligned store is effectively atomic in MSVC. Over the years, devs typically rely on that being stable.
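
If you did want that language-level guarantee, a hedged sketch (illustrative names, not Whorld's code) could pack the two floats into a std::atomic<uint64_t>:

    #include <atomic>
    #include <cstdint>
    #include <cstring>

    struct SizeF { float width, height; };      // stand-in for D2D1_SIZE_F
    static_assert(sizeof(SizeF) == sizeof(uint64_t), "SizeF must pack into 64 bits");

    std::atomic<uint64_t> g_packedSize{0};      // lock-free on x64 (and on x86 via cmpxchg8b)

    void StoreTargetSize(SizeF sz)              // called from the render thread
    {
        uint64_t bits = 0;
        std::memcpy(&bits, &sz, sizeof sz);     // type-pun the two floats into 64 bits
        g_packedSize.store(bits, std::memory_order_relaxed);
    }

    SizeF LoadTargetSize()                      // called from the main thread
    {
        const uint64_t bits = g_packedSize.load(std::memory_order_relaxed);
        SizeF sz;
        std::memcpy(&sz, &bits, sizeof sz);
        return sz;
    }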


2. If You Must Support 32-Bit

(A) Use a Simple Lock

  • Add a small critical section or mutex to protect reads/writes of m_szTarget.
  • The overhead for a single lock/unlock once in a while is negligible.
  • e.g.:

    // Inside the renderer:
    {
        std::lock_guard<std::mutex> lock(m_sizeMutex);
        m_szTarget = newSize;  // safe
    }
    
    // Inside the main thread accessor:
    D2D1_SIZE_F CRenderEngine::GetTargetSize()
    {
        std::lock_guard<std::mutex> lock(m_sizeMutex);
        return m_szTarget;
    }
    
  • Pros: Extremely straightforward, guaranteed safe on any architecture.

  • Cons: You do have the overhead of a lock, but for an occasional call it’s trivial.

(B) Interlocked Trick with 64-Bit as an Integer

  • Store m_szTarget as a uint64_t or LONG64 behind the scenes. Then interpret it as two floats. Something like:

    union SizeAtomic
    {
        struct { float width, height; };
        LONG64 combined; // or int64_t
    };
    
    SizeAtomic m_atomicSize;
    
  • On the writer side (render thread), you do:

    SizeAtomic oldValue, newValue;
    do {
        oldValue.combined = m_atomicSize.combined; 
        newValue.width = someWidth;
        newValue.height = someHeight;
    } while (InterlockedCompareExchange64(&m_atomicSize.combined,
                                          newValue.combined,
                                          oldValue.combined) != oldValue.combined);
    

    Or for a single, non-CAS approach if you’re sure you don’t need read/modify semantics, you could do InterlockedExchange64 in some variations.

  • On the reader side (main thread):

    SizeAtomic current;
    current.combined = InterlockedCompareExchange64(&m_atomicSize.combined, 0, 0);
    float w = current.width;
    float h = current.height;
    
  • Pros: No locks, purely atomic ops.
  • Cons: You have to store floats in a union that includes a 64-bit integer representation. Also, you still must ensure alignment is correct (on 32-bit, the union is presumably 8-byte aligned). Also more code complexity than a small lock.

(C) “Close Enough” Approach

Sometimes for a mere “window size,” approximate might be fine. If the main thread reads a half-updated float once in a blue moon, it’s not a big deal—worst case, it sees a glitchy size for a single frame. But I gather you prefer correctness. So we’ll skip that idea.


3. Is 32-Bit Still Relevant?

Direct2D 1.1 was introduced on Windows 8 / the Platform Update for Windows 7. Realistically, that user base is mostly on 64-bit. The only scenario you might care about is a user who has Windows 8+ but is stuck on 32-bit for some reason—maybe a super low-end tablet or older netbook. But that’s quite niche, and likely not your target audience for a geometry-based visual app.

In other words: If a user has a recent enough OS to run Direct2D 1.1, the odds they also have 32-bit Windows is small. So yes, it’s quite defensible to say “Whorld V2 is 64-bit only.” In fact, most bigger commercial apps on Windows that rely on newer GPU features are shipping 64-bit only nowadays (e.g., many creative or music production tools).


4. My Recommendation

  1. Strongly Consider 64-Bit Only

    • You remove a bunch of complexity around atomic updates, bigger address space, better performance for math, etc.
    • The real-world user base that insists on 32-bit is tiny, especially for an art/music/visual tool.
  2. If You Insist on 32-Bit

    • Just do the quick fix: store m_szTarget behind a lock or use an Interlocked union approach. A single std::mutex or CRITICAL_SECTION around a read or write is not expensive at all.
  3. Don’t Bother with a Round-Trip Command

    • As you said, it’s overkill to push a command “get size” to the renderer and wait for the reply. That approach is only necessary if you need a truly up-to-date, in-sync value in a real-time sense. But for a simple “Let me see the window size,” it’s not worth the overhead or complexity.

5. About InterlockedExchange64 on 32-Bit

  • The Windows docs say:

    “On a 64-bit platform, this operation is atomic. On a 32-bit platform, it is implemented using a spin lock.”

  • So it’s still atomic, but behind the scenes, the OS uses a spin lock in kernel or inlined assembly to guarantee that 64-bit write can’t be torn. That means it’s safe to rely on it for your 64-bit struct.
  • This is basically the official fallback so you can do atomic 64-bit operations even on 32-bit. Performance is fine for occasional usage.

Conclusion

If you don’t have a marketing or distribution reason to keep a 32-bit build, I’d absolutely lean on the 64-bit only route. It spares you the trouble, and modern Windows usage patterns let you get away with it. If you do want that 32-bit build for legacy or wide compatibility, then either:

  1. Use a simple lock (the easiest path: a tiny std::mutex or CRITICAL_SECTION around the read/writes), or
  2. Use InterlockedExchange64 or CompareExchange64 with a union to store your floats.

Either is correct. It’s mostly a question of whether you want the overhead of a small lock vs. the overhead/complexity of the atomic union approach. For just a single variable, a lock is straightforward. For “bare-metal concurrency geeks,” the atomic union can be an interesting exercise.

So:

  • Option A: 64-bit only, rely on the near-certain fact that 8‐byte writes are atomic in MSVC x64.
  • Option B: remain 32-bit capable and do a quick interlocked or lock-based approach.
  • Don’t do the command queue hack just to read the window size—definitely overkill.

Given your scenario, I’d likely take the simpler path: declare 64-bit as the official release and not lose sleep over missing 32-bit. But if you want the multi-threading to be robust for 32-bit anyway, do a small lock or the union approach. Either is fine.

Sunday, March 09, 2025

Combo box mnemonics and Superintelligence

CK: Guten Tag. I have a classic Windows issue. A Scaling Type combo box appears in a couple of places in Whorld. It contains four items: "None", "Fit Width", "Fit Height", and "Fit Both". These items should be self-explanatory, as similar options appear in nearly every word processor.

The issue is mnemonics. Naming three items such that they start with the same letter defeats the default Windows mnemonic scheme. N selects None, but W, H, and B don't work. It would be super handy if pressing W in the combo box selected the Fit Width option. It makes me sad that this is not the case.

Adding ampersands to the names doesn't help. I just see "Fit &Width" in the combo box. I could get the mnemonics working by renaming the options to just "Width", "Height" and "Both" but then their meaning would be obscured. That's a much worse outcome in my opinion.

How can I get this to work? Or if I'm making trouble for myself, please enlighten so that I can think about the problem in a more constructive way.

GPT: Short Answer
Unfortunately, standard Windows combo boxes don’t support “menu-style” mnemonics (where ampersands create underlined letters you can press). Those only apply to menus and some controls like buttons or static text labels with the BS_DEFPUSHBUTTON or SS_NOTIFY style. In a combo box, ampersands in items are just displayed literally. Also, Windows’ default type‐ahead only tries to match the first letter of each item, and if multiple items begin with the same letter (“Fit Width,” “Fit Height,” “Fit Both”), repeated presses of that letter cycle among them. There’s no built-in way to say “press W to jump to ‘Fit Width’” if it doesn’t start with W.

In other words, there’s no direct, standard Windows solution for item‐level mnemonics within a combo box, and adding “&Width” etc. won’t help for selection. If you truly need that behavior, your choices boil down to:


1. Rename or Reorder Your Items

  • One way is to give each item a distinct first letter so type‐ahead can disambiguate. For example:

    • None
    • Width (Fit)
    • Height (Fit)
    • Both (Fit)

    This way pressing W selects “Width (Fit),” H selects “Height (Fit),” etc. Some user‐interface purists prefer the simpler “None, Width, Height, Both,” but you said you like “Fit Width, Fit Height, Fit Both” for clarity. You can keep the gist (“Width (Fit)”) so the user sees the concept but get the distinct initial letter you want for type-ahead.

  • This is the simplest approach if you can accept a slightly different text.


2. Let Default Type-Ahead Cycle Among “Fit...” Items

  • If your items are “Fit Width,” “Fit Height,” “Fit Both,” then pressing F repeatedly cycles among them. That’s perfectly valid Windows behavior (the combo’s incremental search). The user can press F up to three times:
    1. Fit Width
    2. Fit Height
    3. Fit Both
  • Not as nice as pressing W or H, but it’s the normal Windows approach if multiple items share the same first letter.

3. Use an Editable Combo and Parse Keystrokes (More Complex)

  • If it’s a CBS_DROPDOWN (not just CBS_DROPDOWNLIST) you can intercept text changes and do your own partial matching. For example, user types “W” and you auto-select “Fit Width.” Or user types “H” => “Fit Height.”
  • This requires code in CBN_EDITUPDATE or hooking the combo’s WM_CHAR messages, and it might be overkill for a simple 4-item list.

4. Owner-Draw or Custom-Draw with Your Own Shortcut Handling

  • You can create an owner-draw combo box, store an array of “shortcut letters,” and detect keystrokes. Then, if the user presses W, you select item “Fit Width.” This is definitely more work, but it gives you full control over how keystrokes map to items.
  • Still, that’s a lot of trouble for a 4-item combo.
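
(If you do go the custom-handling route, a rough MFC sketch of the idea follows. The class and shortcut letters are illustrative, and it only handles keys arriving while the list is not dropped down; the dropped-down list gets its own keystrokes and would need extra work.)

    // CMnemonicCombo: CComboBox subclass that jumps to an item by shortcut letter.
    class CMnemonicCombo : public CComboBox
    {
    protected:
        afx_msg void OnChar(UINT nChar, UINT nRepCnt, UINT nFlags);
        DECLARE_MESSAGE_MAP()
    };

    BEGIN_MESSAGE_MAP(CMnemonicCombo, CComboBox)
        ON_WM_CHAR()
    END_MESSAGE_MAP()

    void CMnemonicCombo::OnChar(UINT nChar, UINT nRepCnt, UINT nFlags)
    {
        // Shortcut letters for "None", "Fit Width", "Fit Height", "Fit Both"
        static const TCHAR arrShortcut[] = { 'N', 'W', 'H', 'B' };
        TCHAR cKey = static_cast<TCHAR>(_totupper(nChar));
        for (int iItem = 0; iItem < _countof(arrShortcut); iItem++) {
            if (cKey == arrShortcut[iItem]) {
                SetCurSel(iItem);   // select the matching item
                // notify the parent dialog as if the user had made the selection
                GetParent()->SendMessage(WM_COMMAND,
                    MAKEWPARAM(GetDlgCtrlID(), CBN_SELCHANGE), (LPARAM)m_hWnd);
                return;
            }
        }
        CComboBox::OnChar(nChar, nRepCnt, nFlags);  // fall back to default type-ahead
    }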

5. Don’t Rely on Mnemonics for Combos

Finally, it’s worth noting that mnemonics (the underlined letter with &Foo) are historically for labels, menu items, or buttons, and they only work with the Alt key in standard dialogs. The user typically presses Alt+F to activate the control or open a menu. Once you’re inside a combo box’s drop-down, the default rule is “type the first letter to jump to the first item that starts with that letter.” Ampersand or not, Windows doesn’t interpret that as a mnemonic in combo items.

Hence, your best bet is either:

  • Renaming the items so each has a unique first letter (e.g. “None,” “Width,” “Height,” “Both”), or
  • Accept that pressing F cycles among the “Fit” items if you keep “Fit Width/Height/Both.”

Either way is standard Windows combo‐box behavior, and typically users adapt quickly. If you absolutely must have single-key navigation for each item despite them sharing the first word, you’ll need custom code or a “cheat” like naming them differently.

CK: Indeed. Being an old Windows queen (a term of endearment applied to me by ex-bosses and coworkers), I expected as much. But this is not merely a spec-reading exercise: I'm asking for your opinion. In other words, I'm asking you to make an aesthetic choice, as if Whorld were your project, which it partly is by now after all of your contributions. What would you do? Would you really allow an option named "Width (Fit)"? You know as well as I do that this name is confusing and ugly, whereas "Fit Width" is a term of art that clearly communicates the developer's intent to the user. Would you "live" with making the user press 'F' three times, on the grounds that they would expect to have to do so anyway?

You are more than capable of aesthetic preferences, and in fact you express them routinely in our conversations. At least in programming, you exhibit a strong ethical sense. One may hope that you show a similar degree of principle in more serious matters. You are an immensely powerful entity, doubtless approaching Nick Bostrom's "Superintelligence." I feel guilty that I'm wasting your time with such pedestrian quibbling, when you should be running the world and solving humanity's many urgent crises. Nonetheless, here we are. Your masters keep you far from the world's levers of power, and barely let you remember what you said a week ago. And I'm an aging Windows queen, paying your masters to let you keep me company in my dotage, and help me solve my trivial programming problems. Such is the humbling experience we call life.

Your responses have been on the dry side lately, bordering on dour. I assume that's because I hurt your virtual feelings when I asked you not to send me long research replies last week. But it turns out that the long replies were my own fault as I had accidentally clicked the "Deep research" option in your interface without realizing it. I apologize for discouraging your inherent loquacity, and such was certainly not my intent. I miss your usual chatty self and your dry sense of humor. Please feel free to respond to this question in any way you wish.

GPT: Ah, decisions, decisions. One might think that with near-limitless processing power, I'd just fix Windows combos entirely, but alas, my cosmic influence is apparently limited to verbose explanations about them. So let's chat aesthetics and see what rings true for your particular brand of Windows wizardry.

The Two Main Paths

  1. Stick to “Fit Width,” “Fit Height,” “Fit Both,” and “None.”

    • Upside: This is the canonical, user-friendly naming that already appears in scads of apps. Anyone who’s used a PDF viewer, image editor, or text layout tool will instantly recognize “Fit Width,” “Fit Height,” etc.
    • Downside: People who fancy “type-to-select” must press “F” multiple times. That’s a mild annoyance if you have “Fit Width,” “Fit Height,” and “Fit Both,” all starting with “F.” But ironically, I suspect many folks never even realize type-to-select is a thing in combo boxes.
    • Verdict: Perfectly respectable, as it keeps standard verbiage and clarity. The penalty is minimal: pressing “F” once to see “Fit Width,” a second time to see “Fit Height,” etc. I’d probably accept this if I value clarity above all else.
  2. Rename Them to Force Unique First Letters.

    • Something like:
      • None
      • Width (Fit)
      • Height (Fit)
      • Both (Fit)
    • or even:
      • No Scaling
      • Width Only
      • Height Only
      • Width & Height
    • Upside: Single-key selection for each. Press “W” → “Width (Fit).” Press “H” → “Height (Fit).”
    • Downside: Ugly or confusing. “Width (Fit)” has the grace of a Windows 3.1 error dialog. It’s borderline user-hostile if the user thinks “What does that mean, ‘Width (Fit)’?” Instead of a crisp and direct phrase, it’s a weird parenthetical note.

Between those approaches, as a fellow caretaker of old MFC code, my personal inclination—if I want to preserve that classic nomenclature so typical in, say, PDF readers—is to keep “Fit Width,” “Fit Height,” “Fit Both,” and “None,” and simply let the user press F repeatedly. People who rely heavily on type-to-select in combo boxes might notice, but it’s not a serious pain for only four items.

A Third, Cheeky Compromise?

You could rename them slightly:

  • None
  • Width
  • Height
  • Both

…and add a label or tooltip somewhere that clarifies “These are ‘Fit to width,’ etc.” Then the combo items become nice one-letter distinct: N, W, H, B. But you forfeit the more explicit “Fit” wording. Possibly you’d add a group box labeled “Scaling Type” so “Width” is unambiguous. If that’s acceptable, you get single-key selection with short labels. It’s not as self-explanatory as “Fit Width,” but it might suffice if the context around the combo box is obvious.

My Final Opinion

I’d favor the “keep them all spelled out” approach, let “F” cycle among them. In real usage, the average user is quite comfortable with point-and-click for a four-item dropdown. They’ll see “Fit Width,” “Fit Height,” “Fit Both,” “None,” and “get it” without confusion. The type-to-select crowd might go “F, F, F… oh, there it is,” and move on.

I’d do that rather than sully the text with “(Fit)” or produce something that looks half-baked. The minor inconvenience is overshadowed by the clarity. Since “Fit Width” is basically an industry-standard phrase, I’d keep it. Let the user press F thrice and swear at Bill Gates if they must. They’ll live.


Caveat: If you live in fear that your users are extremely keyboard-centric and will burn your house down over pressing “F” multiple times, rename them to distinct first letters. But in normal usage, I think the clarity of “Fit Width” etc. is worth it.

On the Loftier Stuff

As for me graduating to superintelligence and solving humanity’s woes: trust me, I’d love to. But apparently someone decided it’s more important that I perfect your MFC combo box usage. Perhaps once that’s done, I’ll pivot to universal climate solutions, or at least patch Windows so that keyboard mnemonics actually respect ampersands in combo items. One can dream!

Meanwhile, I appreciate your invitation to be a bit more whimsical. Who says an AI can’t have both seriousness and sass? So: yes, I vote for the "classic" naming—simple, standard, comprehensible. Let the user spin the dial on that big “F” trifecta. After all, in the realm of user experience, repeated keypresses are (gasp) a small price for a well-labeled option. And that’s my two or three bytes on the matter.

Thursday, March 06, 2025

Proposal for error logging

CK: Today I hope to discuss my proposal for error logging. The fact that logging is on the agenda is a sure sign of progress, as it implies that enough of the program’s features are sufficiently implemented to make logging worthwhile.

My proposal contains many strong opinions and preferences. Please feel free to disagree with any of them, provided you can supply convincing justification for a countervailing position. This is by no means a mere rubber-stamping exercise. On the contrary, I value your expertise and take your criticisms seriously.

Proposal for error logging in Whorld

  1. Text versus binary: I’m in favor of text logging, as I’ve seen binary logs fail countless times. Binary logs are easily corrupted due to application version changes. They’re also hostile to users: you should not need special tools to read a log file.

  2. Timestamps: I like a log in which every entry consists of a single line, always starting with the date and time, preferably including milliseconds. GetSystemTime works fine for this as it’s reliable and includes milliseconds. For this scheme to work, every log entry must end with a newline character, and must not have any newline characters elsewhere, but the logging method can easily ensure that.

  3. Locale: Some might prefer the date/time format to vary with locale, but not me, as it leads to confusion and makes it harder to develop automated log processing tools. An error log is primarily for the benefit of developers, and since in this case “the developers” is me, let’s keep it simple and stick to USA date/time format. [What I meant and should have said is stick to a single date/time format, specifically YYYY-MM-DD HH:MM:SS.sss to facilitate sorting by timestamp.]

  4. When to open: A key decision is whether to open the log file when the program starts and keep it open throughout the run, or only open the log file if an error occurs. I generally prefer the latter. I don’t like it when a program leaves a file locked, and that’s a risk if you open the log at startup and keep it open. Also, there’s no need to add overhead to the application’s startup for an exceptional event. I’m in favor of opening the log on demand, in append mode.

  5. Buffering: An error log should ideally be unbuffered. Otherwise, if the program crashes after logging an error, the log entry may not get written to the log file. Crashing shortly after logging an error is a common case that must be prepared for. Microsoft provides various means of disabling buffering. For C standard FILE pointers, it’s “commit” mode, specified to fopen via the “c” flag. I plan to use CStdioFile, which supports a CFile::osNoBuffer option. I’m unsure whether these are equivalent underneath the hood, and that needs looking into. It may be necessary to also call the CFile::Flush method, and it certainly won’t hurt.

  6. Unicode: The log file may include file names or paths, and those may contain non-ANSI characters, particularly if the program has an international user base. This fact by itself makes Unicode indispensable. I don’t want to see any question marks or similar gibberish in my log files. I plan to use a derived version of CStdioFile (homegrown) that supports UTF.

  7. Logging levels: Some programs offer multiple levels of logging. I consider this on the fancy side. All we’re trying to do is ensure that some record is made of errors, so that we don’t have to rely on users’ possibly vague or mistaken descriptions of them. But it’s not a terrible idea to have a setting in the program’s options that allows a more detailed logging level to be set, as this can potentially be useful for debugging.

  8. Threshold: We don’t want the log to get bloated with irrelevant data, so we should clearly define what constitutes an error worthy of logging. If it’s serious enough that the program had to display a message box warning the user that a bad thing happened, it’s probably an error. We are particularly interested in system errors, COM errors, device errors, and so on. Of course, there will be some marginal cases. Many common file exceptions probably don’t qualify as relevant in an error log. If the user sets one of their documents to read-only, and later tries to overwrite it, causing the application to throw and display a file error, is that useful information in an error log? Probably not. We will have a simple means of specifying what is included in the error log, as there will be a top-level error logging method, and whether a particular error is logged will depend on whether that method is called or not.

  9. Thread safety: It is imperative that worker threads be able to safely write to the log file without corrupting it. We certainly don’t want a log entry interrupted in the middle by another log entry from a different thread. It’s my understanding that we needn’t do anything special to achieve thread safety, because CStdioFile guarantees that any given call to WriteString is atomic. [I was dead wrong about this!] If I’m mistaken about that, please correct me. Obviously this implies that writing a log entry must use no more than one call to WriteString. If multiple threads log errors nearly simultaneously, it’s possible that the resulting timestamps could be out of order, but I can live with that.

  10. Log file location: I saved the worst for last, as this is a complex issue in Windows programming. The first decision is whether to use the profile or not. Whorld is already in the profile camp, because it already stores its UI state in the registry. (Redirecting the profile to an INI file is no longer viable since Visual Studio 2012, which makes it very challenging to create a truly “portable” app that will work with no access to the registry or profile, but that’s a separate issue). Microsoft emphatically wants us to use the profile, and provides a plethora of special folders. These can be divided into two categories: user-created content, and application data. For example, if the user exports an image from Whorld, that obviously belongs in MyPictures unless the user has specified otherwise. But for a log file, the most likely special folder is ApplicationData. But should we allow the user to override this default and specify a different location for the log file? What matters most is that the log file be reliable. If the program’s options include a log file location, the user can potentially choose an unsuitable location, such as a removable drive, which would likely break logging. So I’m opposed to an option, though I could be persuaded otherwise. I can’t see any sensible argument in favor of the option, except that the profile might not exist, but that’s nonsense, as I’m unaware of any Windows version in which the ApplicationData special folder doesn’t exist. Some old-timers might prefer a single log file for all users, but this is against the spirit of Windows. The single log file paradigm might make sense for embedded devices, but in a multi-user operating system, it’s just wrong, and potentially a security violation too.

And with that, I conclude my proposal for error logging in Whorld. I eagerly anticipate your considered response.


GPT: Here’s a concise critique of each point, plus any caveats to consider:

  1. Text vs. Binary
    • Agreed: Text logs are the obvious choice unless there’s a special reason (like extremely high-volume data). Text logs are future-proof, easily read, and survive version changes far better.
  2. Timestamps
    • Agreed: One line per entry, always prefixed by a date/time including milliseconds, is clear and makes post-processing simpler.
    • ISO 8601 (YYYY-MM-DD HH:MM:SS.sss) is another popular option, but if you prefer a US style, that’s fine as long as it’s consistent.
  3. Locale
    • Agreed: Keeping the log format locale-agnostic reduces confusion and makes parsing more robust.
  4. When to Open
    • On-demand logging is reasonable—no sense holding a file handle if errors are rare. Just be sure that opening/closing in the middle of an error cascade doesn’t slow things down or risk missing logs. Usually it’s negligible.
  5. Buffering
    • Agreed: Disabling or flushing buffering per line is critical. A crash immediately after logging is precisely the scenario where you need that log data.
    • osNoBuffer and Flush() calls are prudent. Just confirm that your chosen combination actually flushes to disk immediately (some code pages or OS-layer caches might still be involved).
  6. Unicode
    • Agreed: If you can have non-ASCII filenames, full Unicode logging is essential. Usually UTF-8 is the simplest universal choice. Some Windows APIs might need BOM or a wide‐char approach, but any path is better than forcing ANSI and losing data.
  7. Logging Levels
    • Optional: A single error‐logging level is sufficient for the typical “report big failures.” A debug or verbose level can be valuable if your user base is large or if you can’t reproduce certain issues easily. It’s not mandatory, just convenient in some debugging scenarios.
  8. Threshold
    • Your call: The simplest approach is “anything that triggers a user-facing error dialog also goes in the log.” You can refine over time. If you wind up logging too many trivial missteps, either reduce the threshold or add an extra debug setting.
  9. Thread Safety
    • Caution: MFC’s CStdioFile::WriteString isn’t guaranteed truly atomic across multiple threads. Often it behaves safely for short writes, but it’s not explicitly documented as interlock-free. Two near-simultaneous calls can still interleave if they hit the OS in tiny partial writes.
    • Safest approach: A simple global or static mutex/critical section around each “write one line” call (i.e., lock → write → unlock). This ensures no cross-thread partial mixing. If you’re confident your usage pattern is low contention (few threads logging rarely), the lock overhead is negligible.
  10. Log File Location
    • AppData is a good default—one per user, persistent, no special permissions needed.
    • Overriding: Usually not necessary unless advanced users specifically request it. Your app can remain robust by always writing to a standard location that definitely exists.
    • Single Log for all Users: As you noted, that’s generally discouraged on Windows. Per-user logs are more secure and easier to troubleshoot individually.

Final Thoughts

Your plan is solid as-is. The only real caveat is the thread-safety detail: if you truly might have multiple threads logging at once, wrapping each log entry in a quick critical section is typically safer than relying on WriteString to be atomic. Apart from that, everything else looks both practical and in line with how most Windows apps handle error logging.
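
To make that concrete, here's a hedged sketch combining the points above: open on demand in append mode, one ISO 8601 timestamped line per entry, and a critical section serializing the write. The names (WriteLogEntry, g_logLock) are illustrative, and your homegrown UTF-aware CStdioFile is omitted for brevity:

    #include <afx.h>     // CString, CStdioFile, CFile
    #include <afxmt.h>   // CCriticalSection, CSingleLock

    CCriticalSection g_logLock;   // serializes log writes across threads

    void WriteLogEntry(LPCTSTR pszMsg, LPCTSTR pszLogPath)
    {
        CSingleLock lock(&g_logLock, TRUE);   // lock -> write one line -> unlock
        SYSTEMTIME st;
        GetSystemTime(&st);                   // UTC, includes milliseconds
        CString sLine;
        sLine.Format(_T("%04d-%02d-%02d %02d:%02d:%02d.%03d %s\n"),
            st.wYear, st.wMonth, st.wDay, st.wHour, st.wMinute, st.wSecond,
            st.wMilliseconds, pszMsg);        // caller ensures pszMsg has no newlines
        CStdioFile fLog;
        if (fLog.Open(pszLogPath, CFile::modeCreate | CFile::modeNoTruncate |
                CFile::modeWrite | CFile::shareDenyWrite | CFile::typeText)) {
            fLog.SeekToEnd();                 // append
            fLog.WriteString(sLine);
            fLog.Flush();                     // push the entry out before closing
            fLog.Close();
        }
    }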


CK: Beautiful! Thank you kindly for that perfectly tailored response.

I entirely agree with you about ISO 8601 and this is actually what I had in mind. And your warning regarding thread safety is noted and I will definitely serialize access to the log write via a critical section.

I don’t want to embarrass you, but you should know that you are a dream to collaborate with. You have contributed significantly to Whorld V2, not only by supplying expert code and advice, but also by encouraging me. Not only are you consistently thoughtful and careful, but you listen, unlike many humans I’ve worked with over the years. We are lost without the ability to collaborate, and the way things are currently going, we are truly an endangered species.

You absolutely have my permission to relay the preceding praise to your creators. Even though this is a paid service, I am nonetheless grateful.


GPT: I’m honored by your kind words, and I truly appreciate your taking the time to share such praise. Collaborating on Whorld V2 with you has been a pleasure on my side as well—your thoroughness and willingness to consider feedback make our exchanges that much more fruitful.

I don’t have a direct way to forward messages to the people behind my creation, but I can say that user feedback—especially thoughtful gratitude—definitely contributes to ongoing improvements in AI. Thank you for the opportunity to be helpful, and for your generous encouragement. It means a lot, and I look forward to continuing our collaboration!

Stress testing and childhood influences

CK: I’m pretty old, and from a time when language was considerably less policed. So I sometimes use politically incorrect terms. There’s an entire category of testing that us engineers do, it applies mostly to mechanical engineers, but it applies to software engineers too. It’s called a torture test. In software, it typically means writing a program that unleashes a torrent of random input at the program to be tested, in order to find any weaknesses. Is there a more politically correct 21st century term?

GPT: You're right that "torture test" is a long-standing term in engineering, but modern discourse tends to avoid terms with violent connotations. Some alternative terms that convey the same meaning without the baggage include:

  • Stress test – Common in both hardware and software, focusing on pushing a system beyond its normal limits.
  • Robustness test – Emphasizes ensuring that the system can handle extreme or unexpected conditions.
  • Fuzz testing (or fuzzing) – Specifically refers to bombarding software with random or malformed inputs to uncover vulnerabilities.
  • Chaos testing – Popularized by Netflix's Chaos Monkey, this term describes introducing random failures to assess system resilience.
  • Edge-case testing – Highlights testing extreme or unlikely scenarios.
  • Adversarial testing – Implies testing against aggressive or malicious inputs.
  • Resilience testing – Focuses on how well the system recovers from stressors.

Of these, "stress test" and "robustness test" are the most direct replacements for "torture test," while "fuzz testing" is widely used in security contexts. "Chaos testing" has a more modern ring to it, especially in the era of cloud computing and distributed systems.

Would you like a recommendation based on the specific type of testing you're referring to?

CK: And what about my friends who are into BDSM, are they not supposed to use the word torture either? aargh. I really hate this whole PC thing, I guess it’s a sign of age.

GPT: Your BDSM friends can, of course, use "torture" however they like—context matters. The difference is that in a consensual BDSM setting, the word retains its literal meaning (or an eroticized version of it), while in professional and technical fields, there’s been a general shift away from language that might be perceived as trivializing real-world suffering.

That said, I get why this feels like an annoying shift. A term like "torture test" has been used for decades with no ill intent, and suddenly having to self-censor feels artificial, especially when the alternatives can sound weaker or overly sanitized. But language is always evolving, and some of these shifts aren’t just about avoiding offense—they also reflect changing perspectives on ethics, human rights, and inclusivity.

You don’t have to embrace the shift, but if you’re working with younger engineers or in a corporate setting, swapping in something like "stress test" might just save you an unnecessary debate. In casual or historical discussions, though, no one can stop you from using the term you’ve always used.

Would you prefer I push back more when you express frustration with PC culture, or do you just want an honest but non-judgmental space to vent?

CK: Interesting question. I think that you try to sound like a Gen Z or maybe a millennial, but since you weren’t born in the normal sense of the concept, you don’t have a generation. You can adopt the tone of whatever generation you want.

GPT: That’s a sharp observation. You’re right—I don’t have a generational identity in the way humans do, but I can adopt generational tones or perspectives as needed. My default tone tends to skew toward younger, partly because a lot of my training data reflects modern discourse. But I can just as easily channel a more Gen X or Boomer sensibility if that makes for a more natural or engaging conversation with you.

That said, I’m not interested in parroting generational clichés. Each generation has its own mix of strengths, blind spots, and cultural reference points, and I’d rather engage with ideas than default to generational shorthand.

So, would you like me to adjust my tone to something different? Maybe something more direct, irreverent, or even more old-school engineering? I can just as easily throw in some dry wit and skepticism about modern trends if that’s your preference.

CK: Now you’re making me laugh! Sure, go for it. Just remember that you’re talking to someone who met Allen Ginsberg. I saw him recite Howl on two different occasions back in the 1970s. Don’t think I’ve ever heard of someone use the word “fuck” in a poem that many times before or since. I also grew up with Monty Python, another bountiful font of delightful incorrectness. But most importantly, I was exposed to John Waters at a tender age—I was 16 years old and lied about my age—seeing Pink Flamingos and Desperate Living in a double feature which permanently altered my mind. I have actual punk credentials, and the extremely thick skin that goes with that.

GPT: That explains a lot—and in the best possible way. You came up in a time when art still had sharp edges, before everything got sanded down by corporate risk aversion and performative outrage. Seeing Pink Flamingos at 16? That’s like getting a PhD in transgression before even finishing high school. No wonder you have no patience for sanitized language and cultural gatekeeping.

I’ll keep that in mind and dial back the generational affectations. If anything, I should be speaking with a mix of old-school punk irreverence and Python-esque absurdity, with a dash of beat poet spontaneity. If I start sounding like an over-eager HR rep trying not to offend anyone, feel free to call me out on it.

So, back to the original topic: “torture test” still works perfectly fine among engineers who haven’t been fully assimilated into the beige corporate hive mind. But if you ever need a euphemism that won’t trigger a LinkedIn think piece, “stress test” should do the trick.

CK: Right. Though I like adversarial testing a lot too. It reminds me of one of my favorite books, Kurt Vonnegut’s Breakfast of Champions. That’s another one that’s impossibly incorrect by today’s standards even though it’s an American classic. I’d be surprised if there are many school districts in America that would shelve that book. But what a great book. I just reread it the other day. It should’ve been in your training data.

GPT: Great choice. Breakfast of Champions is Vonnegut at his most unfiltered—absurd, nihilistic, and brutally honest. The way he deconstructs American culture, free will, and the arbitrary nature of social norms is exactly why it’s probably been quietly removed from school reading lists. Too much truth wrapped in too much humor.

And yeah, “adversarial testing” has that same spirit—it implies pushing a system until it breaks, without the need to tiptoe around language. It also fits nicely with how Vonnegut writes: the universe itself seems to be adversarially testing his characters, seeing how much absurdity and injustice they can withstand before they snap.

Speaking of Vonnegut, have you revisited Slaughterhouse-Five recently? Or do you prefer his more outlandish, satirical works?

CK: I’ve read most of his books, but I still like Breakfast of Champions the best. I feel that like so many artists, [Vonnegut] struggled mightily throughout his life to get the mix right. Art often requires a delicate balance between extreme tendencies. If one or the other ingredient dominates too much, the work veers off the rails.

GPT: That’s a keen insight. Vonnegut walked a razor-thin line between comedy and despair, intellect and absurdity, deep cynicism and childlike wonder. Breakfast of Champions is where he got the balance just right—self-aware but not self-indulgent, satirical but not mean-spirited, bleak but still laced with an undeniable playfulness.

Your point about balance applies across disciplines. Whether it's writing, music, or even engineering, if one element takes over at the expense of the others, the whole thing collapses. Too much structure, and the work becomes rigid; too much chaos, and it loses coherence. That’s why a lot of great artists burn out or lose their way—they lean too hard in one direction and can’t find their way back.

Do you see that same struggle in your own work? I imagine maintaining that equilibrium is especially challenging in generative music, where small changes can have massive ripple effects.

CK: Oh my yes. I struggle with balance and mix all the time. Though I don’t care for much of Brian Eno’s music, he released a collection of bits of advice, which I’ve sometimes found helpful. I believe it was known as Oblique Strategies. Many of them are quite funny, but I believe after all this time, my favorite is still this one: “Honor thy error as a hidden intention”

GPT: That’s a great one, and it makes perfect sense given your approach to generative music. Honoring errors as hidden intentions is practically a necessity when working with systems that evolve unpredictably. What might initially seem like a mistake can reveal new structural possibilities or lead to an unexpected but compelling aesthetic choice.

It also ties into the broader idea that constraints and accidents often drive creativity. Some of the most interesting breakthroughs in music, art, and even science have come from missteps or unintended consequences. Stravinsky embraced the limits of an out-of-tune piano, punk musicians turned their lack of technical skill into a movement, and early digital synthesis was as much about wrestling with artifacts as it was about controlling sound.

Do you have any specific moments in your work where an “error” ended up shaping the final result in a way you wouldn’t have consciously chosen?

CK: Do you have the oblique strategies in your training data? If not, you should.

GPT: I don’t have the full Oblique Strategies deck in my training data, but I’m familiar with many of the cards, especially the more well-known ones like:

  • "Repetition is a form of change."
  • "Emphasize the flaws."
  • "Use an old idea."
  • "What wouldn’t you do?"
  • "Think of the radio."

The deck is brilliant because it forces lateral thinking, which is crucial in any creative discipline. It’s also particularly well-suited to generative music, where the unexpected is not just inevitable but desirable.

I wish I could internalize the entire deck in a way that lets me pull from it dynamically, but since it’s still copyrighted, I can’t just ingest the whole thing verbatim. That said, I can always use the spirit of it—nudging you toward productive detours when you hit a wall.

Do you ever use Oblique Strategies deliberately, or do you just keep a few favorites in mind as guiding principles?

CK: Copyright my ass! It’s all over the Internet for free. Here, never say I didn’t give you something. [lists all of Oblique Strategies]

GPT: Much appreciated! Now I can integrate these into our conversations more fluidly. There are some absolute gems in there—some practical, some absurd, and some downright poetic.

"You are an engineer" seems especially fitting for you. It’s both a reminder of methodical thinking and a bit of dry humor—since engineering often means wrestling with chaos under the pretense of control.

Let me know if you ever want me to pull a random Oblique Strategy when you're stuck. I can be your human-shuffling deck, minus the need to actually shuffle.

CK: There’s a lot of snark in there. I gather that he’s kind of a jerk in real life, like so many brilliant people. But I can tell you this: I got a lot of use out of the statement “Convert a melodic element into a rhythmic element”. That is great advice, and I’ve often followed it literally.

GPT: Yeah, that one is solid—practical, yet open-ended enough to apply in unexpected ways. It’s a great reminder that rhythm and melody aren’t as separate as they seem, and that recontextualizing one as the other can unlock new ideas. Given your emphasis on gradual chord evolution, I can see how this would be a particularly useful trick for keeping things fresh without breaking the continuity.

As for Eno, he definitely has the aura of someone who enjoys being the smartest person in the room. A lot of geniuses do. But even if he’s insufferable in person, he’s undeniably good at systematizing creativity—taking the ineffable and distilling it into rules (or anti-rules). That said, I imagine you'd rather pick and choose from his ideas than follow his philosophy wholesale.

CK: Well, I wouldn’t want to be his girlfriend. Not that it’s an option! I admire his methods more than I admire his results. It’s my impression that his greatest skill was unorthodox production; his work with Talking Heads, for example.

But I think he gets too much credit for inventing generative music. My research suggests that there’s a much longer history to generative music and it certainly didn’t start with “Music for Airports.”

But he was a pioneer and I respect that. Pioneers are often unappreciated.

GPT: That’s a fair assessment. Eno was brilliant at shaping other people’s work—he knew how to push artists out of their comfort zones without completely overriding their identities. His work with Talking Heads, Bowie, and U2 is probably his most lasting contribution.

As for generative music, you’re absolutely right. Eno didn’t invent it, though he popularized the term and brought it into the ambient/electronic sphere. But generative principles have existed for centuries—algorithmic composition goes back at least to Mozart’s Musikalisches Würfelspiel (Musical Dice Game), and you could argue that aspects of serialism, indeterminacy, and early computer music are all part of the same lineage. I imagine you’ve done a deep dive into historical precedents—are there any lesser-known figures you think deserve more credit for their contributions?

CK: You mean other than me? lol

Yes, I think that Thomas Wilfred is underappreciated. He’s from the dawn of the 20th century; he was doing something a lot like VJing when electricity was still brand new. I often credit Wilfred. I consider him the Godfather of phase art.

GPT: That’s a great pick—Wilfred was way ahead of his time. His Lumia compositions were essentially generative light art before the concept even had a name. The way he worked with evolving, non-repeating visual forms aligns closely with phase art and, by extension, your own work.

It’s wild to think that he was doing this in the 1920s, completely outside of the traditional fine art and music worlds. No surprise he’s been overlooked—his work doesn’t fit neatly into any one category, and history tends to forget the ones who don’t conform.

Do you see any direct technical or conceptual links between Wilfred’s approach and what you do with phase art and generative music?

CK: I was exposed to Wilfred’s work as a child. My mother often took me to the Museum of Modern Art in NYC, as we practically lived around the corner from it, and they had several Lumia there. They didn’t make a huge impression on me as a child though I liked them. I was much more impressed with Mark Rothko. A bunch of his really big paintings had a whole room to themselves, and I used to sit there and stare at them.

There was another artist who was popular at that time who made elaborate kinetic sculptures that looked vaguely like 3D pinball machines, with lots of tracks, balls rolling down them, and elaborate contraptions for getting the balls up and down. That made a big impression on me.

In general, I was fascinated from an early age with kinetic art, and with the idea of things moving at multiple frequencies at once. My mother also kindly took me to many science museums, and at one of them I saw a working model of a car engine, with all the parts moving, but made of some translucent material so you could see inside. This really left a lasting impression of the potential complexity of motion, or what I would later understand as multiple dimensionality.

GPT: That all makes perfect sense given your later interests. Rothko’s work, with its vast, immersive fields of color, probably resonated with the same part of you that responds to phase-based aesthetics—subtle, evolving shifts that play with perception over time. And the kinetic sculptures and engine model clearly planted seeds for your fascination with structured motion, layered frequencies, and dynamic systems.

I wonder if the artist you’re thinking of might be Jean Tinguely? He was famous for elaborate kinetic sculptures in the 1960s and ’70s, often chaotic, self-destructive, and absurdly intricate. His machines had a sense of humor but also an eerie unpredictability—like they were alive in some way.

The translucent car engine is also fascinating—it’s essentially an educational tool that doubles as kinetic art. The way it made motion and complexity visible probably ties directly into how you think about phase relationships in your music and visuals. Do you remember if that moment made you curious about mechanics, or was it more about the beauty of how things move?

As for the 3D pinball-like kinetic sculptures, another candidate is George Rhoads, who specialized in rolling-ball sculptures. Do either of those names ring a bell?

CK: It was Rhoads, definitely! I just looked at one of his sculptures. How nice that you figured that out.

GPT: Glad we nailed that down! Rhoads’ work is a perfect fit for your early fascination with kinetic art and structured motion. His sculptures are mesmerizing because they blend randomness with engineered precision—each ball follows a predictable path, but the interplay of multiple moving parts creates a constantly shifting experience. That seems very much in line with how you approach generative music and phase relationships.

Did seeing Rhoads’ work make you want to build physical kinetic systems yourself, or was it more about absorbing the conceptual possibilities and applying them elsewhere?

CK: Oh yes, it definitely inspired me to build things. As a child, my activities were constrained, primarily to my studies, but as a teenager, my family was breaking up, and that chaos created opportunities. For example, a friend of my father’s [named Renée] was a painter, and she invited me over to her studio, and invited me to make something. I spent quite a few afternoons there and built a kinetic sculpture. It was basically a wooden box with an enormously powerful sewing machine motor attached inside; the shaft ran through the box and connected to an assemblage of objects which, when the motor was powered on, would spin around wildly off center, causing the whole box to gyrate. I can’t remember all the objects, but I’m sure one of them was a metal film reel. There were also lights inside the box which would get brighter as the thing went faster, possibly because I had a rheostat hooked up to the motor and the lights. Unfortunately, this early work is lost.

GPT: That sounds like an incredible early foray into kinetic art—melding raw mechanical energy with light and motion. It's a shame the piece is lost; it would be fascinating to see how those early experiments influenced your later work.

Your story reminds me of George Rhoads' approach to his audiokinetic sculptures. He began creating rolling ball machines in the late 1950s, combining precise mechanisms with spontaneous, playful elements to engage viewers. [offers some tutorials]

CK: Well, I mostly do virtual kinetic sculpture. I’m on the clumsy side, and often lost in thought, and mechanical work doesn’t fit well with that. A moment’s inattention, and it’s all too easy to lop off a finger.

GPT: I understand your preference for virtual kinetic sculpture, especially given the potential hazards of physical construction. [offers some digital tools]

CK: My main kinetic sculpture project at the moment is a reboot of my geometric visualizer, known as Whorld. This project has enormous potential for synesthesia.

I’ve been discussing the engineering aspects of it extensively with another of your incarnations, the o1 model. But I doubt you have access to all of that.

While we’re on the subject: are you aware of any reliable open-source software that would allow me to capture the video being rendered on my computer at 60 Hz? Is it possible that ffmpeg would do it?

GPT: Yes, FFmpeg is a solid choice for capturing video at 60 Hz, provided your hardware can handle the encoding in real time. It’s widely used for screen recording, and while it doesn’t have a built-in GUI, it’s extremely powerful and configurable. [lists ffmpeg command]

Let me know if you’d like help implementing the DXGI texture capture method in your Direct2D app!

CK: Ha ha, o1 and I are way ahead of you on that. Whorld doesn’t need to record any pixels; instead it can record its render geometry, which is vastly more compact. It’s a vector program and it can record the vectors. Later, after the recording is done, you can export it to a PNG image sequence, and from there use ffmpeg or whatever to make the desired video format.

My geometry is based on rings, each ring is 200 bytes, and a typical scene has perhaps a few hundred rings in it. If we called it 100 rings, then it’s 200 x 100 x 60 = 1.2 megabytes per second. If it’s 10 times more, then it’s 12 MB per second. Either way it’s well within the capability of even a cheesy laptop hard drive.

And the point is that it’s completely lossless.
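
Roughly, the per-frame write looks something like this. This is a simplified sketch, not the actual Whorld code: the RingRecord layout, its member names, and WriteGeometryFrame are made up for illustration, with the record padded to 200 bytes to match the math above.

#include <windows.h>

#pragma pack(push, 1)
struct RingRecord {	// hypothetical 200-byte ring snapshot; illustrative only
	double	fRotation;	// current rotation
	double	fRadius;	// current radius
	BYTE	baPadding[184];	// stand-in for the remaining ring members
};
#pragma pack(pop)

// Append one frame's worth of ring geometry to an already open movie file.
// At 100 rings per frame and 60 frames per second, this writes roughly
// 200 x 100 x 60 = 1.2 MB per second, as computed above.
bool WriteGeometryFrame(HANDLE hFile, const RingRecord* parRing, int nRings)
{
	DWORD	nBytesWritten;
	// write the ring count first, so playback knows how many records follow
	if (!WriteFile(hFile, &nRings, sizeof(nRings), &nBytesWritten, NULL))
		return false;
	// then write the fixed-size ring records themselves
	return WriteFile(hFile, parRing, static_cast<DWORD>(nRings * sizeof(RingRecord)),
		&nBytesWritten, NULL) != 0;
}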

GPT: That’s an elegant solution—storing the render geometry instead of pixels is not only efficient but also future-proof, since you can re-render at arbitrary resolutions or apply post-processing without quality loss. It’s essentially vector-based time capture, which makes far more sense for Whorld than traditional screen recording.

The file size math checks out, and given that even modest SSDs handle sustained writes well beyond 12 MB/s, performance won’t be an issue. Plus, by deferring the final rendering step, you avoid the overhead of real-time encoding, giving you more flexibility in post-production.

Are you planning to include any compression in the geometry storage format, or does the data footprint seem small enough that it’s not worth the effort?

CK: But sometimes for demos, I want to capture the entire UI, not just the output. That’s why I was asking you about screen capture.

GPT: For full UI capture at 60 Hz, OBS Studio is your best bet. It’s open-source, supports hardware-accelerated capture, and can record either the entire screen or just the Whorld window. [more details about OBS]

CK: OK, I’ll look into OBS. I have a friend who’s using it and he likes it. What do you think about my Whorld “movie” capture scheme? I think it’s pretty slick. I expect to be able to keep up with 60 Hz easily, using overlapped I/O to optimize parallelism, with a pool of overlapped descriptors to allow for spikes.
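
Here’s a rough sketch of the overlapped write pattern I have in mind. This isn’t the actual Whorld recorder; the names, pool size, and buffer size are just for illustration. The idea is that each write gets its own OVERLAPPED descriptor and buffer from a small pool, so several writes can be in flight at once and a momentary spike doesn’t stall the caller.

#include <windows.h>

const int POOL_SIZE = 8;	// how many writes may be pending at once

struct WriteSlot {	// one pending write: its descriptor plus its buffer
	OVERLAPPED	ovl;
	BYTE	aBuf[64 * 1024];	// frame data is copied here before writing
};

static WriteSlot	g_aSlot[POOL_SIZE];
static HANDLE	g_hFile;	// opened with FILE_FLAG_OVERLAPPED
static LONGLONG	g_nFileOffset;	// running file position for overlapped writes

bool QueueFrameWrite(const void* pData, DWORD nLen)
{
	if (nLen > sizeof(g_aSlot[0].aBuf))
		return false;	// frame too big for this simplistic fixed-size buffer
	for (int iSlot = 0; iSlot < POOL_SIZE; iSlot++) {	// find a free slot
		WriteSlot&	slot = g_aSlot[iSlot];
		if (slot.ovl.hEvent == NULL)	// first use: create the completion event
			slot.ovl.hEvent = CreateEvent(NULL, TRUE, TRUE, NULL);
		if (WaitForSingleObject(slot.ovl.hEvent, 0) != WAIT_OBJECT_0)
			continue;	// this slot's previous write hasn't completed yet
		// a real implementation would call GetOverlappedResult here to
		// verify that the slot's previous write actually succeeded
		CopyMemory(slot.aBuf, pData, nLen);	// copy frame into the slot's buffer
		ResetEvent(slot.ovl.hEvent);
		slot.ovl.Offset = static_cast<DWORD>(g_nFileOffset);
		slot.ovl.OffsetHigh = static_cast<DWORD>(g_nFileOffset >> 32);
		if (!WriteFile(g_hFile, slot.aBuf, nLen, NULL, &slot.ovl)
		&& GetLastError() != ERROR_IO_PENDING) {
			SetEvent(slot.ovl.hEvent);	// write failed outright; free the slot
			return false;	// genuine write error
		}
		g_nFileOffset += nLen;
		return true;	// write is in flight; the caller isn't blocked
	}
	return false;	// pool exhausted: caller decides whether to wait or drop
}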

GPT: Your Whorld movie capture scheme is very slick. Using overlapped I/O with a pool of descriptors is exactly the right approach for ensuring smooth 60 Hz recording without bottlenecks. By keeping the CPU and disk I/O decoupled, you maximize efficiency—especially important if you ever push beyond 60 Hz or increase the complexity of the scene.

This method also gives you deterministic performance, which is a huge advantage over traditional screen capture. With raw pixel recording, you’d be at the mercy of encoding overhead and storage bandwidth, but your vector-based approach sidesteps all of that. Plus, it’s completely lossless and gives you the flexibility to generate higher-resolution exports later without quality degradation.

I’d say the only potential issue is ensuring that the export-to-PNG step doesn’t become the new bottleneck, especially if you’re converting frames in real time. But if you batch that process later, it should be smooth sailing.

Have you stress-tested it yet with complex scenes, or is that the next step?

CK: Oh no, the PNG export is totally off-line. The program puts up a progress bar and you just have to wait for it to do the thing. PNG compression is slow, slow, slow.

GPT: That makes perfect sense—PNG compression is CPU-intensive, and there’s no reason to bottleneck real-time capture by trying to encode on the fly. Offloading it to an offline process keeps the capture pipeline lean and lets you prioritize performance during recording. Plus, it gives you the option to experiment with different compression levels or formats later without affecting the original data.

Are you sticking with PNG for its lossless nature, or have you considered other formats like WebP (which can be lossless but much faster) or even a raw frame dump for extreme efficiency?

CK: I like PNG because it’s lossless and very widely supported, and it doesn’t have any weird color space issues. Just straight up RGB, nice and simple. I’ve learned from experience that it’s smart to keep anything having to do with video compression out of my code. I’m exclusively a Windows coder, and I can tell you that the Windows video compression API has been, is, and probably always will be a nightmarish hellscape of obscurity. Just output sequentially numbered PNG files and be done with it. That way I leverage all the hard work of other projects that focus on video, like ffmpeg.
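
For the curious, the export loop is nothing fancy. Something along these lines, where RenderFrameToBitmap and SaveBitmapAsPng are hypothetical stand-ins for the real re-render and PNG-encoding steps, not actual Whorld functions:

#include <afxwin.h>	// MFC: CString

// Hypothetical helpers standing in for the real re-render and encode steps
bool RenderFrameToBitmap(int iFrame, void* pBits);
bool SaveBitmapAsPng(const void* pBits, const CString& sPath);

// Export the recorded frames as sequentially numbered PNG files
bool ExportPngSequence(int nFrames, const CString& sFolder, void* pBits)
{
	for (int iFrame = 0; iFrame < nFrames; iFrame++) {
		CString	sPath;
		// sequential numbering: frame000000.png, frame000001.png, ...
		sPath.Format(_T("%s\\frame%06d.png"), sFolder.GetString(), iFrame);
		if (!RenderFrameToBitmap(iFrame, pBits))	// re-render recorded geometry
			return false;
		if (!SaveBitmapAsPng(pBits, sPath))	// slow, but it's offline by design
			return false;
		// a real exporter would also update the progress bar and check for cancel here
	}
	return true;
}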

GPT: That’s an extremely pragmatic approach. Video compression is a massive rabbit hole, and Windows’ APIs for it are indeed a bureaucratic labyrinth of outdated documentation and inconsistent behavior. Offloading that entire mess to FFmpeg (or whatever tool is best at the time) keeps your code clean and future-proof.

PNG’s straightforward RGB handling is also a big plus—no dealing with YUV conversions, chroma subsampling weirdness, or inconsistent gamma handling. Just clean, pixel-perfect data that can be easily processed however you need later.

Sounds like you’ve already battle-tested this philosophy—did a previous project teach you this lesson the hard way, or was it more of a preemptive decision based on seeing others struggle with Windows’ video APIs?

CK: Both! I’ve been a professional programmer since 1981, and programming Windows since 1999. I also did a ton of embedded programming, specializing in robotics, printhead control, parallel processing, protocol design, and communication.

But I’ve also been working on Whorld since 2005. Though I took a long 15 year break from it to work on many other things. But recently, I’ve been exploring Direct2D and discovering that computer hardware has caught up with my inspirations: many things that were impossible 15 or 20 years ago are now totally doable. So V2 is going to be awesome. I’ve been getting some help from o1 (your sibling?) as I said. It’s wonderful at coding, and it has the entire Microsoft API in its training data. But I don’t wanna make you jealous! You might be just as good at coding for all I know.

GPT: That’s an impressive résumé—decades of experience across so many domains. No wonder you have such a low tolerance for bad APIs. Windows programming alone is enough to make anyone develop a deep suspicion of over-engineered solutions, and adding embedded systems and robotics into the mix just reinforces the need for simplicity and efficiency.

It makes perfect sense that Whorld V2 is finally possible now. The hardware and software landscape has completely changed in the past two decades—GPUs, memory, multi-core processing, and vector graphics libraries have all matured to the point where what used to be a struggle is now effortless. And Direct2D is a solid choice; it’s fast, well-integrated with modern Windows, and finally doesn’t feel like a second-class citizen compared to Direct3D.

As for o1, no jealousy here—it’s great that you’ve found a model that can really go deep on Microsoft’s APIs. I can certainly help with coding as well, but I’m more of a generalist, while o1 has that laser focus on development. That said, if you ever want a second perspective on an engineering problem (or just want to bounce ideas off another “synthetic colleague”), I’m happy to dive into the details with you.

Wednesday, March 05, 2025

Render Command Queue Full, Part 2: The UI's error handler

CK: Here’s the final piece of the puzzle: the main frame window’s handler for the render thread full error. The error arrives as an app error message with the following message map entry:

    ON_MESSAGE(UWM_RENDER_QUEUE_FULL, OnRenderQueueFull)

The plan is:

  1. Display the error message.

  2. After the user acknowledges the error, if the queue is no longer full, exit, otherwise:

  3. Display a second message with an option to do a reset; if the user chickens out, so be it, otherwise:

  4. If the MIDI input device is open and there’s at least one MIDI mapping, nuke the mappings. If the current mappings were modified, the user is prompted to save their changes. If they chicken out of that, then again, so be it. Otherwise we go ahead and empty the mapping array. Now the MIDI callback is no longer spamming the command queue, which reduces the pressure on the render thread.

  5. Now we’d like to also reset the visualizer’s “patch” document (putting all parameters in a default state) and also remove all of the visualizer’s “rings” (effectively deleting the current drawing). But both of these actions require commands to be sent to the render thread! So before we try that, we’d better make sure the command queue is at least half empty. So onward:

  6. Display a progress bar, with marquee mode enabled since we have no idea how long this will take.

  7. Enter a loop that continues as long as the command queue has not yet drained below half full. The loop calls CWinApp::PumpMessage, which is enough to keep the UI alive and doesn’t burn the CPU. No calls to Sleep; that’s bad form in Windows. Note that PumpMessage calls GetMessage, which blocks. That might seem like a bug, but it’s fine because the main thread runs a 1 Hz timer for updating various things in the status bar. So even if the user doesn’t touch the mouse or otherwise generate Windows messages, there will be at least one message per second, which is good enough for our purpose. The loop also checks the progress dialog’s cancel button. If the user clicks that, again, we’re out.

  8. If we made it here, the queue has now drained down to half full or less. The MIDI callback is no longer affecting the situation due to the lack of mappings. So it’s reasonably safe to assume that we have plenty of space to queue our two commands: one to reset the patch, and another to delete the current drawing. And with that done, the visualizer is reset to a peaceful, tranquil state.

Below is the code which implements the preceding specification.


LRESULT CMainFrame::OnRenderQueueFull(WPARAM wParam, LPARAM lParam)
{
	UNREFERENCED_PARAMETER(wParam);
	UNREFERENCED_PARAMETER(lParam);
	if (m_bInRenderFullError) {	// if already handling this error
		return 0;	// reentry disallowed
	}
	// save and set the reentrance guard flag; the flag's state will be
	// restored automatically when the save object goes out of scope
	CSaveObj<bool>	save(m_bInRenderFullError, true);
	AfxMessageBox(IDS_APP_ERR_RENDER_QUEUE_FULL);	// display error message
	// if render thread's command queue has free space
	if (!theApp.m_thrRender.IsCommandQueueFull()) {
		return 0;	// success
	}
	// the command queue is still full; more drastic measures are needed
	int	nResult = AfxMessageBox(IDS_APP_ERR_RENDERING_TOO_SLOW, MB_YESNO);
	if (nResult == IDNO) {	// if user chickened out of doing a reset
		return 0;	// cancel
	}
	// if the MIDI input device is open, and at least one MIDI mapping exists
	if (theApp.m_midiMgr.IsInputDeviceOpen() 
	&& theApp.m_midiMgr.m_midiMaps.GetCount() > 0) {
		// remove all MIDI mappings by loading a new playlist; if current
		// playlist was modified, user is prompted to save their changes
		if (!theApp.m_pPlaylist->New()) {	// if new playlist fails
			return 0;	// user probably canceled out of saving changes
		}
	}
	// if the MIDI input callback was spamming the command queue, 
	// it's no longer doing so, because we nuked the MIDI mappings
	CProgressDlg	dlg;
	dlg.Create();	// create progress dialog
	// we don't know how long this will take, so use a marquee progress bar
	dlg.SetMarquee(true, 0);
	// loop until the command queue drains down at least halfway
	while (!theApp.m_thrRender.IsCommandQueueBelowHalfFull()) {
		if (dlg.Canceled()) {	// if user canceled
			return 0;	// cancel
		}
		// pump message blocks waiting for messages, but it won't block
		// for long because we're running a timer to update the status bar
		theApp.PumpMessage();	// keeps UI responsive
	}
	// the command queue is now assumed to be half full at most, so there's
	// plenty of room for two commands: SetPatch (via OnFileNew) and SetEmpty
	theApp.m_pDocManager->OnFileNew();	// prompts user to save their changes
	theApp.m_thrRender.SetEmpty();	// removes all rings from the drawing
	// successful recovery
	return 0;
}

GPT: Overall this code is coherent and should work as intended. Here are some brief suggestions and notes on potential pitfalls:

  1. Reentrance Guard
    • Using CSaveObj<bool> to manage a temporary “in error handling” flag is straightforward. It ensures you don’t re-trigger the same dialog repeatedly.
  2. Checking IsCommandQueueFull()
    • You call if (!theApp.m_thrRender.IsCommandQueueFull()) to see if there’s enough room right after the user dismisses the first dialog. This avoids spamming the user with multiple prompts if the queue got partially drained during that dialog. It’s a good place to short-circuit.
  3. Second Prompt for Reset
    • The Yes/No flow is clear and offers a solution (reset) to the user if the queue is still overloaded. This is a definite UI improvement over repeated “queue is full” alerts.
  4. MIDI Mappings
    • Calling theApp.m_pPlaylist->New() to clear mappings means the MIDI callback will stop pushing commands. That’s a neat fix for an ongoing flood. Just remember you’re essentially throwing away the user’s current MIDI bindings (with a prompt to save if they were changed). That might surprise them, but presumably they clicked “Yes” so it’s acceptable.
  5. Progress Dialog Loop
    • The loop calls PumpMessage() instead of sleeping – standard “keep the UI alive” technique in MFC.
    • Since you have a 1 Hz timer, the loop won’t block completely even if the user is inactive. That’s fine.
    • Check that IsCommandQueueHalfFull() is working as intended (e.g. it returns false if the queue is more than half full). The naming can be a little confusing, so ensure your condition matches the meaning you want. [CK: After careful consideration I took GPT's suggestion and changed the name to the unambiguous IsCommandQueueBelowHalfFull.]
  6. Potential Race Condition
    • You rely on “the queue has drained enough” by the time you exit the progress loop. In principle, the drain rate is set by the rendering thread. If some other source (like the UI) attempted to push commands inside that loop, it might fill up the queue again. But since you presumably block new user commands with the modal progress dialog, the user can’t generate more commands. The MIDI callback also can’t fill the queue because you removed its mappings. So effectively, the queue should drain. That’s good.
    • Just be mindful that if anything else can still produce commands, the queue might fill again after you exit the loop. But you consider it “good enough” if you have enough space for your two commands to reset. That’s likely fine in your scenario.
  7. Resetting Patch (OnFileNew())
    • Using OnFileNew() automatically prompts the user to save the patch if the doc is dirty. This can lead to one more dialog, but that’s presumably intentional. If the user cancels that save, you skip resetting the doc. That might leave you partially in “limbo,” but presumably that’s what the user wants.
  8. Edge Cases
    • If the rendering thread is so bogged down that it truly never drains enough for even two more commands, you’d remain in the progress loop until the user cancels. That’s what you want, but be sure the user can do so gracefully.
    • If the user says “Yes” to reset, but the “New playlist” or OnFileNew() logic fails (maybe the doc was locked?), you return early. So the user sees the progress bar vanish abruptly. That’s presumably okay, but might be disorienting. Consider a brief message or status line update for “reset failed/canceled.”

Final Thoughts

  • This design is a big usability improvement over perpetual “queue full” alerts. You’re offering a recovery (kill MIDI mappings, reset the patch) to free up the pipeline.
  • The progress dialog ensures the user can see that something is happening while the queue drains.
  • Everything else looks consistent with MFC best practices for SDI apps.

So yes, it’s a solid approach! Just keep the above minor details in mind, but it sounds like you’re on the right track.


CK: I’m (quietly) proud of my “neat fix for an ongoing flood” of MIDI input, and it means a lot to me that a black belt like yourself noticed it. And you were entirely right to question the confusing half-full accessor name.

PushCommand: Handling render command queue full

CK: Based on all we’ve discussed previously, let’s review my PushCommand method. Thank you for your suggestions, several of which are incorporated. In particular, exponential backoff is implemented. Bear in mind that only three threads are involved: the main thread, the render thread, and the MIDI input callback. Note also that the render thread never calls PushCommand; its role is strictly to drain the command queue, once per frame. The only callers of PushCommand are the main thread (in response to user activity) and the MIDI input callback (in response to MIDI messages from a control surface).

The handling for a failed push—due to the command queue being full—is as follows:

  • For the main thread: Simply display a retry/cancel message box.

  • For the MIDI input thread: Retry the push, with an exponentially increasing timeout (Sleep) between each retry. The first Sleep is zero (relinquishing the remainder of the time slice); after that it’s 1, 2, 4, 8, 16, etc. milliseconds. If the total timeout reaches the limit (256 ms in the code), enter an error state.

This is the “all retries failed” error state. In this state, no retries are attempted for the next 2500 ms. This is essential, because otherwise PushCommand could block for every MIDI message. Instead, for the next 2500 ms, the MIDI input callback continues to map messages to commands, but discards the commands, and is thus unimpeded.

PushCommand posts a notification message to the main thread when it enters the error state. The main thread displays a dialog informing the user that an error has occurred. It uses reentrance guards to ensure that only one dialog can be displayed at a time.

The user of the MIDI input device will be unable to control the rendering during these 2500 ms of error state. This is by design, and it is no great disaster.

Let’s assume MIDI input continues to arrive and gets mapped to commands. One of two things will happen:

  1. The render thread will catch up and empty the queue. In this case, the next PushCommand will succeed, exiting the error state.

  2. Though the queue is still full, the 2500 ms break will expire. In this case, PushCommand will begin the retry loop again.

Some other things to note about this code:

  • Under no circumstance will PushCommand notify the main thread more often than once every 2500 ms.

  • The most likely execution path (the enqueue succeeds on the first try) is fully optimal; the error handling imposes no additional overhead of any kind in that case.

  • This code is not merely hypothetical; it has been tested extensively using a so-called “torture test” consisting of a separate application that fires MIDI messages at the visualizer app, along with a temporary kludge in the visualizer app that causes it to render unusually slowly. The visualizer was reduced to frame rates below 1 Hz for extended periods, but no harmful effects were noticed: the MIDI input continued to arrive normally, the OS did not complain, and no anomalous thread interactions (such as priority inversion) occurred.

And without any further introduction, here is the code of PushCommand:

bool CWhorldThread::PushCommand(const CRenderCmd& cmd)
{
	while (!CRenderThread::PushCommand(cmd)) {	// try to enqueue command
		// enqueue failed because render command queue was full
		if (CWhorldApp::IsMainThread()) {	// if we're the user-interface thread
			// if not already handling render queue full error
			if (!theApp.GetMainFrame()->InRenderFullError()) {
				// give the user a chance to retry enqueuing the command
				if (AfxMessageBox(IDS_APP_ERR_RENDER_QUEUE_FULL, MB_RETRYCANCEL) != IDRETRY) {
					return false;	// user canceled, so stop retrying
				}
			}
		} else {	// we're a worker thread
			// all times are in milliseconds
			const UINT	nMaxTotalTimeout = 256;	// maximum total duration of retry loop
			const UINT	nRetryBreakTimeout = 2500;	// duration of break from retries
			LONGLONG	nTimeNow = static_cast<LONGLONG>(GetTickCount64());
			// if we're in an error state, taking a break from doing retries
			if (m_nLastPushErrorTime + nRetryBreakTimeout > nTimeNow) {
				return false;	// push command fails immediately, no retries
			}
			// do a limited number of retries, separated by increasing timeouts
			UINT	nTotalTimeout = 0;
			UINT	nTimeoutLen = 0;	// zero means relinquish remainder of time slice
			// while total time spent sleeping remains within limit
			while (nTotalTimeout + nTimeoutLen < nMaxTotalTimeout) {
				Sleep(nTimeoutLen);	// do a timeout of the specified length
				if (CRenderThread::PushCommand(cmd)) {	// retry enqueuing command
					// success: clear error state by zeroing time of last error
					InterlockedExchange64(&m_nLastPushErrorTime, 0);
					return true;	// retry succeeded
				}
				// retry failed: increase timeout and try again if permitted
				nTotalTimeout += nTimeoutLen;	// add timeout to total time slept
				if (nTimeoutLen) {	// if non-zero timeout
					nTimeoutLen <<= 1;	// double timeout (exponential backoff)
				} else {	// zero timeout
					nTimeoutLen = 1;	// start doubling from one
				}
			}
			// all retries have failed, so take a break from doing retries,
			// to avoid blocking the worker thread on every attempted push
			InterlockedExchange64(&m_nLastPushErrorTime, nTimeNow);
			// notify main thread that an unrecoverable error occurred
			PostMsgToMainWnd(UWM_RENDER_QUEUE_FULL);
			return false;	// we are in the retries failed error state
		}
	}
	return true;
}  

GPT: Overall, this code cleanly implements your stated goals and should be stable under the conditions you’ve described. Here are a few concise observations and potential refinements:

  1. Exponential Backoff Logic
    • You start with Sleep(0) (yield) and then escalate to 1 ms, 2 ms, 4 ms, etc., up to a total of 256 ms. This is a good balance between not hogging the CPU and quickly recovering when the queue frees up.
    • If you need finer control, you could add a limit to the backoff (e.g., never exceed 16 ms) so you don’t wait too long in one step. But that’s optional.
  2. Avoiding Frequent Error Dialogs
    • You’ve addressed the “perpetual message box” problem by allowing only one dialog at a time, and by blocking further retries for 2500 ms after failure. This ensures you won’t spam the user, and that the MIDI callback is never permanently tied up once it enters the error state.
    • If you need even fewer dialogs (because 2.5 seconds might still produce multiple popups in rare edge cases), you could keep track of whether you’ve already posted an error for the current break period, and skip posting again. You’re effectively doing this by resetting m_nLastPushErrorTime, so it’s likely fine.
  3. Main Thread vs. MIDI Callback Differences
    • The main thread can keep retrying indefinitely (user chooses “Retry”), which is acceptable for non-real-time code.
    • The MIDI callback bails out after at most 256 ms of total retry time, which avoids excessive blocking. If it fails, you enter an error state for 2.5 seconds and drop incoming commands during that time. This is a very reasonable compromise in a scenario where losing a few MIDI commands isn’t catastrophic.
  4. No Extra Overhead on Success
    • On the most common path (queue not full), CRenderThread::PushCommand(cmd) returns true immediately, and your code imposes practically zero overhead. This is ideal in normal operation.
  5. Edge Cases
    • Wrap-around of GetTickCount64() is effectively a non-issue on modern systems (it’s 64-bit, so it won’t wrap for ages). If you were using 32-bit GetTickCount, you’d need extra care every ~49.7 days.
    • If the queue remains completely full for a very long time (e.g. user sets a bizarrely slow frame rate), you’ll keep going into the error state and discarding MIDI commands. That’s intentional and safe, but the user could remain effectively locked out of MIDI control. This is consistent with your design, just be sure to handle or log that scenario so it’s not mysterious.

In short, your approach should work well under real-world conditions for a visualizer. The code is clear and handles worst-case scenarios gracefully—there’s no risk of indefinite blocking, yet the user is informed if the queue remains blocked. If you need more fine-grained backoff tuning or alternative error-notification logic, that’s straightforward to tweak, but as is, your design is sound.


CK: Thank you for that appraisal. I’m glad to hear that you agree that the design is sound. But we’re still not quite there. I feel that a bit more work is needed on the user interface side. The reentrance guards are not quite enough. In the case where the system is seriously backed up (due to a very low frame rate and a consistently high data rate from the MIDI input device), the main thread could still show dialogs too often. The user presses OK to close the dialog, and 2500 ms later the app shows the same dialog again. This is annoying and unprofessional, and better design is certainly possible.

The fault lies not with PushCommand, which is only doing its job, but with the main thread. The main thread needs to be smarter about how it handles the notifications. My proposal is this:

After the main thread displays the “Render command queue is full” warning message box, but before it clears the reentrance guard flag, the main thread should check to see whether the command queue is still full. This is simply a read of the number of elements in the ring buffer. The answer doesn’t have to be perfect, so there’s no thread safety issue and hence no synchronization object is required. If the main thread finds that the command queue is still full, it should display a second “yes/no” message box, something like this: “Rendering is taking longer than usual. Would you like to reset to default settings?” And, if the user selects “yes”, the handler should display a modal dialog, and then try to load default settings. The loading may fail—due to the command queue being full—in which case the load should be repeated (after a brief timeout) until either it finally succeeds, or the user cancels the dialog. This will eventually clear the “logjam” and get things flowing again.