Types as Interfaces
For the past few days, I have been toying with an idea for a board game. To test it out, I wanted to write a simple implementation of it. Here’s an example of a type we might need in a critical phase of the game.
-- | A quote for a proposal.
data Quote = Quote
  { _proposal :: Proposal
  , _premium  :: Int
  , _share    :: Int
  }
In that phase, values of this type need to be communicated back and forth between players in a unicast fashion, so we might want to add fields to indicate who sent it and who received it.
-- | A quote that has both owner and offering player.
data QuoteEtc = QuoteEtc
  { _proposal :: Proposal
  , _premium  :: Int
  , _share    :: Int
  , _owner    :: PlayerId  -- ^ Player that owns the process.
  , _offering :: PlayerId  -- ^ Player that offers this quote.
  }
But then I realised there are quite a lot of these types where source-and-target fields are required, and that we could annotate them without modifying the underlying type, by creating another type to hold that data:
-- | Create a unicast message from any type a.
data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  }
With this, we can construct a value of type Msg Quote, which represents the Quote annotated by sender and recipient.
Msg
  { _sender    = PlayerId 2
  , _recipient = PlayerId 0
  , _payload   = Quote
      { _proposal = undefined
      , _premium  = 5
      , _share    = 3
      }
  }
Functions that only operate on the _sender and _recipient fields can be written with the appropriate type signature and be generic over payload. (This example uses object oriented style getters. If you’re unfamiliar with optics/lenses, think of the operator ^. as you would a regular . to get a property from a value in a language like Python or C#.)
-- | Determines if a message is intended for player.
msg_for :: Player -> Msg a -> Bool
msg_for player msg = player^.name == msg^.recipient
This function does not care what the payload of the message is; it can be used on any list of messages, like
filter (msg_for broker) messages
to extract a list of messages intended for that player, regardless of whether the payload is Quote or anything else. (Though beginners with Haskell beware: whenever we pass a list of messages into this function, it still needs to be a homogeneous list of just one concrete type. We cannot mix both Msg Proposal and Msg Quote in the same list.)
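To make the payload-agnostic filtering concrete, here is a minimal self-contained sketch. It assumes a few things the article does not spell out: plain record accessors stand in for the lens getters, Player is reduced to a single _name field, and String payloads stand in for Quote. The sample messages are made up for illustration.

```haskell
-- Plain record accessors stand in for the lens getters used in the article.
newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Assumed shape of Player: only the field msg_for needs.
newtype Player = Player { _name :: PlayerId }

-- | Create a unicast message from any type a.
data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  } deriving Show

-- | Determines if a message is intended for player.
msg_for :: Player -> Msg a -> Bool
msg_for player msg = _name player == _recipient msg

-- Hypothetical sample data; String payloads stand in for Quote.
broker :: Player
broker = Player (PlayerId 0)

messages :: [Msg String]
messages =
  [ Msg (PlayerId 2) (PlayerId 0) "quote A"
  , Msg (PlayerId 0) (PlayerId 1) "quote B"
  , Msg (PlayerId 1) (PlayerId 0) "quote C"
  ]

-- msg_for never inspects _payload, so the same filter would work on
-- [Msg Int] or [Msg Quote], as long as each list has one payload type.
for_broker :: [Msg String]
for_broker = filter (msg_for broker) messages
```

Here for_broker keeps the first and third message, since those are the ones addressed to PlayerId 0.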
Even though Msg
is a plain type constructor, it acts an awful lot like an
interface. That is the main point of this article; that’s a pattern that can be
used to design simple code.
There is a common complaint against what we just did above. People say that
types-as-interfaces are not composable. Let’s find out what they mean. Imagine
we wanted some messages to be timestamped. We could add an optional _timestamp
field to the message type:
-- | Create a message from any type a, with optional timestamp.
data MsgEtc a = MsgEtc
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _timestamp :: Maybe UTCTime
  , _payload   :: a
  }
But what if we also wanted to timestamp some of the other objects we have, like Quote? Aha! We just learned how to do this! We create a new wrapper type:
-- | Annotate any value with a timestamp.
data Timestamped a = Timestamped
  { _timestamp :: UTCTime
  , _contents  :: a
  }
Just as before, we can now tack on another layer of data and make a Timestamped (Msg Quote).
Timestamped
  { _timestamp = now
  , _contents  = Msg
      { _sender    = PlayerId 2
      , _recipient = PlayerId 0
      , _payload   = Quote
          { _proposal = undefined
          , _premium  = 5
          , _share    = 3
          }
      }
  }
Clearly, this approach does compose, because we just composed both Timestamped a and Msg a. But remember the msg_for function we had that determined whether a message was intended for a particular recipient? It had the signature
msg_for :: Player -> Msg a -> Bool
meaning it takes any Msg a, but it will not be possible to give it a Timestamped (Msg a); we have to unwrap the message from the timestamp first. However, if we gave it a Msg (Timestamped Quote), it would have worked, perhaps counter-intuitively.
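The asymmetry between the two nesting orders can be seen in a small sketch. As assumptions for the sake of a dependency-free example, an Int stands in for UTCTime, plain record accessors replace the lenses, and String replaces Quote.

```haskell
newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Assumed shape of Player: only the field msg_for needs.
newtype Player = Player { _name :: PlayerId }

data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  }

-- Int is a stand-in for UTCTime so the sketch only needs base.
data Timestamped a = Timestamped
  { _timestamp :: Int
  , _contents  :: a
  }

msg_for :: Player -> Msg a -> Bool
msg_for player msg = _name player == _recipient msg

broker :: Player
broker = Player (PlayerId 0)

-- Timestamp on the inside: the outermost type is still Msg.
inner :: Msg (Timestamped String)
inner = Msg (PlayerId 2) (PlayerId 0) (Timestamped 100 "quote")

-- Timestamp on the outside: the outermost type is Timestamped.
outer :: Timestamped (Msg String)
outer = Timestamped 100 (Msg (PlayerId 2) (PlayerId 0) "quote")

-- Accepted as-is, because msg_for is generic over the payload.
inner_ok :: Bool
inner_ok = msg_for broker inner

-- Only accepted after unwrapping the timestamp layer with _contents.
outer_ok :: Bool
outer_ok = msg_for broker (_contents outer)

-- msg_for broker outer  -- would be rejected by the type checker
```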
The complaint here is not that the approach does not compose at all (clearly it does), but that it does not compose well: the order in which we choose to annotate our data with extra fields affects whether or not we can pass them into existing functions.
I think that’s basically fine. If we think about it, isn’t Timestamped (Msg Quote) a different-feeling thing from a Msg (Timestamped Quote)? But let’s assume we wanted to fix it. What is near at hand?
We could make a typeclass:
class HasRecipient a where
  get_receiver :: a -> PlayerId
This is more like a real interface, which can be implemented by any type that has a receiver. Here are two implementations we would want:
instance HasRecipient (Msg a) where
  -- The field is _recipient, so the getter is recipient.
  get_receiver msg = msg^.recipient

instance HasRecipient (Timestamped (Msg a)) where
  get_receiver tsd = tsd^.contents.recipient
We could then rewrite msg_for in terms of this typeclass instead.
msg_for :: HasRecipient msg => Player -> msg -> Bool
msg_for player msg = player^.name == get_receiver msg
and this will work for any value of a type that implements HasRecipient, including Msg a and Timestamped (Msg a).
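Put together, the typeclass version might look like the following sketch. The assumptions are the same as elsewhere: record accessors instead of lenses, an Int in place of UTCTime, a one-field Player, and String in place of Quote. The FlexibleInstances pragma is there because the second instance head nests one type constructor inside another.

```haskell
{-# LANGUAGE FlexibleInstances #-}

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Assumed shape of Player: only the field msg_for needs.
newtype Player = Player { _name :: PlayerId }

data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  }

-- Int is a stand-in for UTCTime so the sketch only needs base.
data Timestamped a = Timestamped
  { _timestamp :: Int
  , _contents  :: a
  }

class HasRecipient a where
  get_receiver :: a -> PlayerId

instance HasRecipient (Msg a) where
  get_receiver = _recipient

-- Looks through the timestamp layer to the message underneath.
instance HasRecipient (Timestamped (Msg a)) where
  get_receiver = _recipient . _contents

msg_for :: HasRecipient msg => Player -> msg -> Bool
msg_for player msg = _name player == get_receiver msg

broker :: Player
broker = Player (PlayerId 0)

plain :: Msg String
plain = Msg (PlayerId 2) (PlayerId 0) "quote"

stamped :: Timestamped (Msg String)
stamped = Timestamped 100 plain

-- Both calls type check without any manual unwrapping.
both_for_broker :: Bool
both_for_broker = msg_for broker plain && msg_for broker stamped
```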
Okay, so let’s roll down this slippery slope. Maybe we have a function
logger :: Show a => [Timestamped a] -> IO ()
which logs things in order of timestamp. This function will not take a list of Msg (Timestamped Quote), because there it is not the outer value that is timestamped but the Quote inside.
We could apply our newly discovered hammer and create a similar HasTimestamp typeclass.
class HasTimestamp a where
  get_timestamp :: a -> UTCTime

instance HasTimestamp (Timestamped a) where
  get_timestamp tsd = tsd^.timestamp

-- This must be an instance of HasTimestamp, since it defines get_timestamp.
instance HasTimestamp (Msg (Timestamped a)) where
  get_timestamp msg = msg^.payload.timestamp
But at this point it gets a little confusing for this author’s brain, at least if long-term maintenance is desired. What would be the best instance for Timestamped (Msg (Timestamped Quote)), for example?
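To see why the question is awkward, consider one possible convention, sketched below: the outermost timestamp wins, and a message defers to its payload. This is only an illustration of the choice that has to be made, not a recommendation. It generalises the Msg instance to any timestamped payload, uses an Int in place of UTCTime, and uses record accessors instead of lenses.

```haskell
newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  }

-- Int is a stand-in for UTCTime so the sketch only needs base.
data Timestamped a = Timestamped
  { _timestamp :: Int
  , _contents  :: a
  }

class HasTimestamp a where
  get_timestamp :: a -> Int

-- Convention chosen here: a Timestamped value reports its own
-- timestamp and ignores any timestamps nested further in.
instance HasTimestamp (Timestamped a) where
  get_timestamp = _timestamp

-- A message has no timestamp of its own, so it defers to its payload.
instance HasTimestamp a => HasTimestamp (Msg a) where
  get_timestamp = get_timestamp . _payload

-- Hypothetical doubly timestamped value: stamped at 100 when the
-- quote was made and at 200 when the message was wrapped.
nested :: Timestamped (Msg (Timestamped String))
nested = Timestamped 200 (Msg (PlayerId 2) (PlayerId 0) (Timestamped 100 "quote"))

outer_time :: Int
outer_time = get_timestamp nested

inner_time :: Int
inner_time = get_timestamp (_contents nested)
```

With this convention outer_time is 200 and inner_time is 100, so the same class method gives different answers depending on how many layers the caller has peeled off. Whether that is what a logging function wants is exactly the maintenance question.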
And as we said before, aren’t Timestamped (Msg Quote) and Msg (Timestamped Quote) slightly different kinds of things? Do we really need to be able to pass both unaltered into that logging function?
What we really should do is take a cue from network protocol design. These are robust things that have stood the test of time.
Image data might be stored in a TGA file with headers giving information on how to interpret it. It will be placed into an HTTP request with its own headers. This gets stuffed inside a TCP packet with further headers. That in turn is enveloped in an IP datagram with headers. Which might then run along a wire inside an Ethernet frame, carrying, you guessed it, its own headers.
At no point during transmission does a switch hold up an Ethernet frame to the light and ask, “So what HTTP content type are you transmitting?” (To be fair, I would not be surprised if it was possible to find network switches in the wild that inspect IP headers, or routers that look at HTTP headers. So maybe that whole layered protocol thing was a mistake and I’m full of crap!) We have designed these protocols to be layered with meaningful structure from outside to in. Maybe we can do that in our code as well.
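In code, that layering could look something like the sketch below, where each function inspects only its own layer and hands the contents on to the next. The function names in_time_order, payloads_for and inbox are made up for this illustration, an Int stands in for UTCTime, and record accessors replace the lenses.

```haskell
import Data.List (sortOn)

newtype PlayerId = PlayerId Int
  deriving (Eq, Show)

-- Assumed shape of Player: only the field msg_for needs.
newtype Player = Player { _name :: PlayerId }

data Msg a = Msg
  { _sender    :: PlayerId
  , _recipient :: PlayerId
  , _payload   :: a
  }

-- Int is a stand-in for UTCTime so the sketch only needs base.
data Timestamped a = Timestamped
  { _timestamp :: Int
  , _contents  :: a
  }

msg_for :: Player -> Msg a -> Bool
msg_for player msg = _name player == _recipient msg

-- Outer layer: looks only at timestamps, never at what is inside.
in_time_order :: [Timestamped a] -> [a]
in_time_order = map _contents . sortOn _timestamp

-- Next layer: looks only at the envelope, never at the payload.
payloads_for :: Player -> [Msg a] -> [a]
payloads_for player = map _payload . filter (msg_for player)

-- Peel one layer at a time, from the outside in.
inbox :: Player -> [Timestamped (Msg a)] -> [a]
inbox player = payloads_for player . in_time_order
```

Because the nesting order is fixed as Timestamped (Msg a), neither helper needs a typeclass, and inbox is ordinary function composition.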
Maybe we don’t need both Timestamped (Msg Quote) and Msg (Timestamped Quote) in our application, and just one of them is enough. A mathematician creates generalisations that work with everything. An engineer strips away the variants that are less important and adapts the code to the big demands at hand. This makes things simpler along the way.
If we do need both, maybe it’s fine to treat them as two different types (they are!) and not try to make functions generic over them. Maybe. Chris Penner seems to argue for the opposite. I don’t claim to have any silver bullets.