Thursday, 6 March 2014

Nesting and Overlapping Translucent Stenciled Windows

Mouthful of a title... This post is about using the stencil buffer to clip window contents, where the windows can be layered or nested (subwindows) and might be see-through. It's also an example of the common experience programmers have with falling through of plans leading to scope expansion.

If dominoes were tangents...

There's the game I'm obstensibly working on... then there's a little subset of rules for mage versus mage non-lethal combat... so I put a little framework together to test these combat rules, which would also stress the (incomplete) GUI a bit more. This proceeded well enough, but I wanted text feedback about what was happening in this mage-duel -- so a little scrolling logger.

Zen of Clipping: leave it to the window system!

The GUI didn't have a way to clip contents to a region yet. Having menu entries extend off the top and bottom of the screen, or text spilling outside of the box it's in... unacceptable. This GUI was used to make a font-browser, and rather than scrolling a list of fonts in a contained box, I drag the giant list up or down off the screen, as shown to the right. Pretty silly. It was time to face this issue.

Oh, I also needed scrolling -- just an extra offset for positioning will do it...

To clip contents I figured I'd use scissoring (glScissor). This is just a simple screen-based clipping rectangle. In practice the GUI would probably only use boxes anyway.

This should be easy. An hour or two.

One week later...

(fig 2) Nesting, overlapping, translucent, shaped, and transformed windows.
Contents of Red are trimmed to its edges, including the round corners. Nested
windows create their own clipped sub-regions while still respecting their parent.

Plans changed... Scissoring was unsatisfactory because windows can have some detailing to their shape (such as the rounded corners, above). Ideally I'd want that very same shape to affect clipping, rather than clipping to some invisible box pulled inward by a safe distance from the window edges. Two alternatives were under consideration for shaped clipping: the stencil buffer, or render-to-texture.

One choice leads to another: if we have arbitrary clipping... that means we can have arbitrary transforms! And if we have transforms (ie. perspective) I'm going to disfavor render-to-texture. This is because rendering the resultant image with potential scaling and rotation suffers the usual texture-sampling limits. (Think of a prerendered page of text rotated or scaled.) Blech. So, stencil it is! One thing I would have liked render-to-texture for is anti-aliasing or alpha-fade on the edge. However, it's easy enough to hide the ragged stencil-edge under a frame.

Leveraging the Stencil Buffer

I had never used the stencil buffer before (okay, some simple use long ago), so I spent a day playing with different ideas, thinking there must be a way to render stencils which can nest or overlap and allow transparent objects to render in correct order. I knew hierarchical bitfields would suffice, but I wasn't sure I could make that work with the stencil operations and only 8 bits (hierarchical bitfields can chew up bits quickly).

This must be a common operation, right? Compositing window managers need to do this kind of thing, so someone's bound to have posted some ideal solution.

I found this page: Clipping in OpenGL. The page starts with basic stenciling, adds transparency, and then considers nested windows without transparency... As I read through it, it was looking promising. I should have jumped to the end, but it was building up so well I was sure it would culminate with what I sought! But at the end: "Finally, combining subwindows and transparency seems possible but hasn't been fully investigated, as well as supporting more than 255 windows."


After seeing the approaches used on that page, I struggled to find a solution which didn't involve hierarchical bitfields, because it was so tantalizing... If I could puzzle out the right order of writing stencil values and using them, maybe I'd only need one sequential stencil value per window (allowing 255 stenciled windows before clearing the stencil buffer). However, I think that was an exercise in futility, with the available stencil-buffer features. It could be done if there was an additional parameter: a "reference" value only for testing and a separate "write" value to update the stencil buffer with.

The Stencil Test

The stencil buffer is like the framebuffer or depthbuffer, but hasn't become such a common term to warrant joining the words. ;) Joking aside, it commonly has 8 bits per pixel. The stencil test has this form:

    keep_the_pixel = (value & mask) op (stencil & mask)

The "op" is a comparison operator, such as less, equal, not equal. These values are set-up using:

    glStencilFunc op value mask;

The remaining stencil value in the test comes from the stencil buffer. During render, any fragment failing this test is discarded. However, the stencil buffer itself can be updated on pass or failure. It can't be written to directly, but has several modes of updating... including increment and decrements, which were a tempting tease as these would work for nested windows, but fails once they can overlap.

For the application described here, op is always gl_equal, and the stencil buffer is only updated on successful write... meaning a fragment has to pass stencil, depth, and alpha testing. This last requirement permits using the alpha-channel to shape the stencil. Without using the fragment shader for "alpha stencil" we'd be limited to stencil shapes built of the polygonal shapes of the primitives.

We're using gl_equal to test specific bitpatterns, albeit masked, as shown in the next section. A more typical use of the stencil operations is to test for less-than or greater-than, without masking, and to either increment or decrement the value written -- this allows for sequential stenciling without needing to clear the stencil buffer, 255 times, before the buffer must be reset/cleared.

For a more in-depth look at stencil operations and an example:

Rendering with the Stencil Test

This code will set up for rendering a stenciled window with contents. For translucent objects, contents should be rendered in order, from farthest to nearest. The contents may include other stenciled windows, with this one as parent.

    (* Set to update stencil-buffer with value, when test succeeds *)
    glStencilOp gl_keep gl_keep gl_replace;

    (* Constrain stencil rendering to be within parent, but "value" will be written *)
    glStencilFunc gl_equal value parent.mask;

    glStencilMask 0xff; (* enable writing to stencil-buffer *)

    draw stencil_shape;

    glStencilMask 0;    (* disable writing to stencil-buffer *)

    (* Limit further rendering to be within stenciled area...
     * including children! *)
    glStencilFunc gl_equal value mask;

    render contents;

    (* Restore stencil state to that of parent *)
    glStencilFunc gl_equal parent.value parent.mask;

There are four special numbers needed, however: value, mask, parent.value, and parent.mask. Each stenciled window has a value and mask, and computing these is covered in the next section...

Hierarchical Bitfields

What do I mean by "hierarchical bitfields"? You might be familiar with an example: IP addresses. A "class C subnet" (ipv4), for example, has the top 24 bits (first three of the dotted quad) assigned, while the lower 8 bits are free to allocate within the subnet. So you can identify all machines on a subnet by a particular prefix:, while the individual machines are distinguished by the last byte of the address.

Unicode UTF-8 also uses a hierarchy: certain bit-patterns identify the need for additional bytes.

In our present case, we want stencil-patterns where we can match a masked set of bits to identify "containment". For example, if a parent window has a value of 0100 (in 4 bits), and corresponding mask of 1100... this means any value matching 01xx would be writable -- so the children of this window should all have patterns starting in 01.

The tree of stenciled windows from "fig 2" is shown to the right. We'll work with this as an example. The asterisk is the root window, containing 1 and 4. In "fig 2", the red window is window 1.

I'm going to demonstrate with a similar notation as an ip-address: dotted integers. Each step deeper in the tree adds an integer, and siblings are numbered sequentially with this integer. One more rule: a zero terminates, and represents "the remaining bits are zero". So the example would be written as:

(indentation used to show hierarchy)

    *     0
     1    1.0
      2   1.1.0
     4    2.0
      5   2.1.0
      6   2.2.0

Now you can see that we can uniquely match any specific window, or a window and its children. "1" and its children all start with "1.", 4 and its children all start with "2." If we wanted to narrow down window 5 specifically, we need a mask which keeps the first two numbers: "2.1" -- masking the first two numbers, this is the only window which matches "2.1". By comparison, if we examine the first two numbers for "1.1", we get window 2 and its child, 3. With a value and a mask, we can pinpoint a specific window, or widen out to capture children encoded with this kind of "address".

These dotted integers are easily turned into a value and a mask. The essence of this transformation is determining the number of bits required to represent the tier. Take the number of siblings, add 1 for the parent (occupying the "zeroth" slot), and take the base2 log, rounded up, for the number of bits needed. So, off the root window we have two children: this is 2+1 = 3... number of bits needed for three values is 2, so the counting in binary is: 00, 01, 10. And these are the bits for our first "dotted integers". The corresponding mask is sufficient to reveal the significant bits -- or the total number of bits needed for the dotted integers, excluding the trailing zero:

   window  dotted-integers  value  mask
     *      0               0000   0000
     1      1.0             0100   1100
     2      1.1.0           0110   1110
     3         0111   1111
     4      2.0             1000   1100
     5      2.1.0           1001   1111
     6      2.2.0           1010   1111

Now these two numbers, value and mask, can be used to control stenciled rendering, allowing for nesting or overlap of windows.

All for a scrolling log-view.

If the available stencil bits are exceeded, the scene-graph can be managed as subtrees which fit within the limit of bits, and the stencil buffer cleared for each subtree. Not ideal. However, I don't think I'll exceed 8 bits in practice. I can imagine using stenciling for some widget and having dozens on the screen but that's okay -- it's nesting that cuts the available addressing in half, and I don't imagine too much nesting.

Oh yeah... the scrolling logger. That's what started this...

Tuesday, 4 March 2014

GUI Event Handling with a Functional Hierarchical State Machine

A journey toward a state-machine without mutable state, but enough flexibility to express GUI behaviors. This is nothing new in it's parts, but maybe as a whole. In scouring GUI lore there is very little in a functional style.

I had a "feel" for how I wanted event handling to be, but it wasn't clear. Initially I wrote some wishful thinking pseudo-code:

  let tooltip ?(delay=1.) ?(show=5.) ?(style=tooltip_style) text =
    on_enter (fun id ->
      after delay (fun id ->
        let tip = make_tooltip style text in
        after show (fun id -> delete tip (* and what is the active state now? *))

I like the style of chaining functions which are awaiting events to continue. Another variation I considered was with a yield-like operation, but that looked like a vertical stack of statements without clear structure.

Anyway, the above example was really declaring a state-machine with each state able to return a replacement state.

Playing with State Machines

I drew some state diagrams, and they were as messy as they always seem to be -- special cases and redundant states to deal with "one thing is different, even though mostly things are the same as this other state". After a lot of futzing about I realized some details: to avoid a redundant explosion of states I could leverage parent states which have child states and the parent transitioning also cleans up the children. This is because state machines often have exception-like transitions. Here is an example of the full tooltip:

    (on_leave (on_use (after delay)))
        |         |        |
        v         v        v
      start      ( )  (after show)
                          ( )

So, at "start", we are awaiting on_enter. Once that triggers, we replace the current state with one having a (parent (parent (child))) chain of on_leave, on_use, and after delay. This means the delayed show and hide of the tooltip is on its own, but can be interrupted by either on_use (say, clicking the button) or on_leave. For example, if the initial delay is passed, the tooltip is displayed and the current state is:

    (on_leave (on_use (after show)))
        |         |        |

Here, after show has replaced after delay. Now if on_use occurs, it transitions (to nothing) and also removes its child state, leaving us with:


And in Code

First, a general n-ary tree, and a type consumed to specify whether a matching event terminates at this handler, or propagates onward.

  type 'a tree = Child of 'a | Parent of 'a * ('a tree) list

  type consumed = bool

Then the recursive types to express these hierarchical states and transitions:

  type state = {
    handler : handler;
    cleanup : unit -> state tree list;
  and handler = id -> event -> consumed * return
  and return  = Retain | Stop | Follow of state tree list

It might be easier to see how it works with an example. Here is the corresponding code for the "tooltip" handler:

  let tooltip ?(delay=0.6) ?(show=5.) create_fn =
    let open Event in
    let rec start () = (* start needs an argument to constrain recursion *)
      on_enter (fun id -> Follow
        [ Parent (on_leave (fun _ -> Follow [Child (start ())]),
          [ Parent (on_use (fun _ -> Stop),
            [ Child (after delay id
                (fun id ->
                  let tip = create_fn id in
                  let cleanup () = delete tip; [] in
                  Follow [ Child (after show id ~cleanup (fun _ -> Stop)) ] )
            )] )] )] )
    in start ()

A little clunkier than my wishful thinking at the start, but I prefer this because it works! Also because it's clear about some things that the first-guess didn't cover. It deals with "on use" (interacting with something should make the tooltip go away), and has proper cleanup. The optional cleanup function is called when a state exits, including due to a parent expiring.

Note the square brackets everywhere: a bit noisy, but the returned replacement state is actually a list. An event handler is really a list of handlers which can each have children... so it's a full tree. Even the cleanup function returns such a list, empty [] in this case.

Composing Event Handlers

With my initial wishful thinking I didn't know if I could practically declare an event handler "tooltip" and simply attach it where I want it... but that's exactly how it works:

  Event.set button
    [ Event.Child (click_to_toggle ());
      Event.Child (hover_fade (0.85,0.95,1.0));
      Event.Child (tooltip (label "Toggle me on or off")) ]

Tooltip in action
This is attaching three root-level event handlers to button, which each work independently and go through their state transitions. Processing an event returns a new list of handlers which replace the current handler (if there was a change).

I think what I ended up with is a fairly standard hierarchical state machine, implemented functionally (except where I bind a list of handlers with an ID, but that's part of the nature of the component database).

I don't have experience with GTK, Qt, Windows-anything. I've written simple hard-coded interfaces for some games, but usually avoid even that -- linking character behavior to (abstracted) device inputs. Now I'm working on a game which needs more mousey interaction. I know I should have looked into existing systems at least to inform myself, but those libraries are huge and I never liked the first taste. Maybe I'm also fearful that if I got to know them I'd want to have all the features they have... Experienced comments are welcome!

A Component-based GUI with Functional Flavour

An example of the GUI in use.

There are plenty of existing GUIs -- why write my own?

Two technical reasons:
  1. To have a rendering back-end which is compatible with my game.
  2. Limiting dependencies to SDL and OpenGL, which are widely supported.

For me, the most important reason, and not a technical one, is that whenever I look at GUI APIs my brain wants to jump out of my skull. Mind you, my own mess here might have the same effect if I were to encounter it afresh. Maybe let me know what your outside perspective is?

Oh, oh, Gooey -- or is it OO GUI?

I once believed the standard claim that OOP is natural for GUIs -- the one thing OOP might really be suited to. My initial take on a GUI in OCaml was based on objects: defining widget classes and layering complexity with inheritance. There were some issues... composition of functionality rarely works well with objects because they're wedded to internal state, and that is frustrating. But there was another pitfall lurking in the shadows. Initially a lot of details were hard-coded: font, colors, borders, etc. The day came when I wanted to control these details... A default font for everything... maybe setting another font for some group of things, and overriding that in specific cases... well ain't this a mess!

Update: Leo White, in the comments, has clued me in to using OCaml objects with the mixin technique -- which seems to offer similar composability and inheritance, but retaining the power of the type-system. I'll make a future post comparing the approaches. An early impression/summary: nearly identical expressiveness, while the differences are similar to the trade-offs between static and dynamic typing.

I'm Destined to Use Databases for Everything

(to atone for disparaging remarks about DB programmers in my past)

It didn't take too long before I realized that I want properties: arbitrarily aggregating and inheritable properties... a.k.a. components. (The component system I'm using is described here.) Well then, what if all of my "widgets" were just IDs with associated components? What would it look like?

Here's the "default font" I wanted...

  let normal_font = new_id ()
    |> Font.s fnt
    |> Fontgradient.s `Crisp

Now "normal_font" is something to inherit from: a convenient bundle of properties. If I set the Font to something else during runtime, everything which inherits and doesn't otherwise override Font will get the change.

I'll set a bunch of global defaults called "gui_base"...

  let gui_base = new_id ()
    |> Anchor.s Vec2.null
    |> Color.s (0.5,0.5,0.5)
    |> Shape.s rounded_rectangle
    |> Gradient.s `FramedUnderlay
    |> Underlay.s Default
    |> Border.s 6.

So far, these properties are just data -- mostly related to rendering options.

Next we'll use inheritance, and add an event handler. This sets up properties for a particular class of buttons:

  let hover_button =
    new_child_of [ gui_base ]
    |> Fontsize.s 18.
    |> Fontcolor.s (1.,0.95,0.9)
    |> Shadow.s (`Shadow,black,0.5,1.5,2.)
    |> Event.s [ Event.Child (hover_fade (0.8,0.7,0.5)) ]

Now anything inheriting from this, and placed on-screen, will respond to pointer "hover" with an animated color-fade. Event handlers aren't masked, so I usually keep them near "leaf" nodes of inheritance. As a silly test, I made a right-click menu for color-changing in gui_base... so everything could have it's color changed. It worked, if a bit heavy handed. Still, something like that could be useful to create a configuration mode.

You might realize by now that there are no specific button, label, or textbox widgets. Any GUI element is defined by its cumulative properties. In practice, there are convenient "widgets" which I build functions for, like this color_button:

  let color_button color name =
    new_child_of [ hover_button; normal_font ]
    |> Color.s color
    |> Text.s_string name

I'm using components with multiple inheritance, and this function creates a button by inheriting both hover_button and normal_font, as well as adding a given color and name.

Well this looks promising to me. Properties, via components, provide a reasonable way to build GUI elements, ample support for hierarchical styles, and there's no need for optional parameters on every creation function -- a free-form chain of components serves as the optional parameters for any/all creation. For example, I can use the "color_button" above, but override the font, almost like a labeled parameter:

    color_button red "LAUNCH" |> Font.s dangerfont

Furthermore, when new components are created to implement features... there's no need to update functions with new parameters. I've often been running competing components in parallel, until one is deprecated. Purely modular bliss.

Model-View-Controller... or something else?

GUIs which separate declaration, control, and behavior all over the place drive me nuts. This would be the typical implementation of a Model-View-Controller pattern which is so beloved.

With MVC, and the numerous other GUI-pattern varieties (MVVM, MVP, etc), the principle notion is separation of concerns. Each particular pattern identifies different concerns and separations. I'm usually rather keen on separation of concerns -- modular, composable, understandable, no stepping on toes. I find with GUIs that I desire tighter coupling. An extreme in this regard is ImGUI, which goes too far for my usual preferences -- with it, view is stateless and coupled to execution-flow.

What I want, is to declare everything in a stream. We can break things down into sub-parts, as usual with programming. What I don't want, are code and declarations scattered in different places, files, or even languages... held together by message-spaghetti. Using the color_button again, here's a larger GUI object:

  let framed_colors =
    let blue   = color_button (0.1,0.1,1.0) "Blue" in
    let red    = color_button (1.0,0.0,0.0) "Red" in
    let green  = color_button (0.1,0.9,0.1) "Green" in
    let yellow = color_button (0.9,0.9,0.0) "Yellow" in
    let black  = color_button (0.0,0.0,0.0) "Black" in
    new_child_of [ gui_base ]
      |> Pos.s (Vec2.make 40. 20.)
      |> Dim.s (Vec2.make 240. 120.)
      |> Pad.(s (frame ( 4)))
      |> Border.s 12.
      |> Layout.(s
            [ hcenter (Id blue);
              hbox [ I green; G (gap 2); I yellow; G (fill 1); I black ];
              hcenter (Id red) ]))
      |> Event.s
          [ Event.Child (drag_size RIGHT);
            Event.Child (drag_move LEFT) ]

Layout uses a "boxes and glue" approach.

All in one, an object is declared with nested elements. It has a defined (yet flexible) layout, and responds to move and resize operations; the buttons have their own event handlers. The event handlers (as I'll get into in another post) are just functions. The point is that the composition of this is as for any typical code: functions. No out-of-band stuff... no layout in some other file keyed on string-names, no messages/signals/slots. There are events, but event-generation is from outside (SDL), so the only part we're concerned with is right here -- the event handlers.

I guess what I'm trying to minimize, is spooky-action-at-a-distance. To jump ahead a bit, when I call a menu function, it returns a result based on the user's selection (pausing this execution path while awaiting user response, via coroutines), rather than changing a state variable or firing off a message. Functions returning values.

Parts of a GUI

  1. Renderer
  2. Description/Layout
  3. Event Handling
Oh hey... View, Model, and Controller. D'oh!

I'll go into more depth in future posts, leaving off with just a note about each...

The renderer is a thin layer leveraging the game's renderer. It mows over the GUI scenegraph, rendering relevant components it encounters. It's entirely possible for the renderer to have alternate interpretations of the data -- and, in fact, I do this: to render for "picking". Effectively I render the "Event" view, including pass-through or "transparent" events. Some details about render layers and stenciling will be their own topic.

Declaration of elements leverages "components" as in the examples given above. For layout, I've always had a fondness for TeX's boxes and glue -- I ride Knuth's coat-tails in matters of typesetting. A hint of this is in the Layout component for the box of color_buttons. I'll probably cover layout or line-splitting in another post.

Event Handling. One of my favorite parts, and a separate post: GUI Event Handling with a Functional Hierarchical State Machine.

There's something missing still...

      4.  The UI Code

What I mean is the flow of code which triggers GUI element creation, awaits user responses, and does things with them. This could be done as a state machine, but I prefer keeping the flow continuous (as if the user always has an immediate answer), using coroutines. This is yet another topic for another day.

I haven't made a complex UI, so there might be a point beyond which I have to rely on messages -- really, I keep anticipating this, but it hasn't happened... yet.