Type Classes
> {-# LANGUAGE FlexibleInstances #-}
> module TypeClasses where
We have already seen that the (<)
operator works for a bunch of different underlying data types. For example
ghci> 2 < 3
True
ghci> 3.9 < 3.5
False
Similarly we can compare all sorts of values
ghci> 2 == 3
False
ghci> [2.9, 3.5] == [2.9, 3.5]
True
“So?” Indeed, this is quite unremarkable, since languages since the dawn of time has supported some form of operator “overloading” to support this kind of ad–hoc polymorphism.
However, in Haskell, there is no caste system. There is no distinction between operators and functions. All are first class citizens in Haskell.
Well then, what type do we give to functions like (<)
and (==)
? Something like
(<) :: Integer -> Integer -> Bool
would be too anemic, since we want to add two doubles as well! Can type variables help?
(<) :: a -> a -> Bool
Nope. Thats a bit too aggressive, since it doesn’t make sense, to compare two functions with each other! Haskell solves this problem with an insanely slick mechanism called typeclasses, introduced by Wadler and Blott.
Qualified Types
To see the right type, lets just ask
ghci> :type (<)
(<) :: Ord a => a -> a -> Bool
We call the above a qualified type. Read it as, (<)
takes in two a
values and returns an Bool
, for any type a
that is an Ord
.
The name Ord
can be thought of as a predicate over types. Some types satisfy the Ord
predicate. Examples include Integer
, Double
etc, and any values of those types can be passed to (<)
. Other types do not satisfy the predicate. Examples include functions etc, and so values of those types cannot be passed to (<)
.
ghci> id < flip
<interactive>:15:1: error:
• No instance for (Ord ((a0 -> a0 -> c0) -> a0 -> a0 -> c0))
arising from a use of ‘<’
(maybe you haven't applied a function to enough arguments?)
• In the expression: id < flip
In an equation for ‘it’: it = id < flip
Now these kinds of error messages make sense. Basically Haskell is complaining that function types are not an instance of Ord
.
OK, SO WHAT IS A TYPECLASS?
In a nutshell, a typeclass is a collection of operations (functions) that must exist for the underlying type. For example, lets look at possibly the simplest typeclass Eq
class Eq a where
(==) :: a -> a -> Bool
(/=) :: a -> a -> Bool
That is, a type a can be an instance of Eq
as long as there are two functions that determine if two a values are respectively equal or disequal. Similarly, the typeclass Show
captures the requirements that make a particular datatype be viewable,
class Show a where
show :: a -> String
Indeed, we can test this on different (built-in) types
ghci> show 2
"2"
ghci> show 3.14
"3.14"
ghci> show (1, "two", ([],[],[]))
"(1,\"two\",([],[],[]))"
When we type an expression into ghci, it computes the value and then calls show on the result. Thus, if we create a new type by
ghci> data Unshowable = A | B | C
then we can create values of the type,
ghci> let x = A
ghci> :type x
x :: Unshowable
but can’t view or compare them
ghci> x = A
<interactive>:1:0:
No instance for (Show Unshowable)
arising from a use of `print' at <interactive>:1:0
Possible fix: add an instance declaration for (Show Unshowable)
In a stmt of a 'do' expression: print it
ghci> x == x
<interactive>:1:0:
No instance for (Eq Unshowable)
arising from a use of `==' at <interactive>:1:0-5
Possible fix: add an instance declaration for (Eq Unshowable)
In the expression: x == x
In the definition of `it': it = x == x
Again, the previously incomprehensible type error message should make sense to you.
Q: Lets create an instance for Show Unshowable
.
AUTOMATIC DERIVATION
Of course, this is lame; we should be able to compare and view them. To allow this, Haskell allows us automatically derive functions for certain key type classes, namely those in the standard library.
To do so, we simply dress up the data type definition with
> data Showable = A' | B' | C' deriving (Eq, Show)
and now we have
ghci> let x' = A'
ghci> :type x'
x' :: Showable
ghci> x'
A'
ghci> x' == x'
True
STANDARD TYPECLASS HIERARCHY
Let us now peruse the definition of the Ord
typeclass.
ghci> :info Ord
class Eq a => Ord a where
compare :: a -> a -> Ordering
(<) :: a -> a -> Bool
(<=) :: a -> a -> Bool
(>) :: a -> a -> Bool
(>=) :: a -> a -> Bool
max :: a -> a -> a
min :: a -> a -> a
{-# MINIMAL compare | (<=) #-}
There’s quite a bit going on there. A type a can only be deemed an instance of Ord
if
- The type is also an instance of
Eq
, and - There are functions for ordering values of that type.
In other words in addition to the ordering operations, we can compare check equality on two Ord
values.
Haskell comes equipped with a rich set of built-in classes.
Standard Typeclass Hierarchy
In the above picture, there is an edge from Eq
to Ord
because for something to be an Ord
it must also be an Eq
. There are a few other ones that we will come to know (and love!) in due course…
Most of these type classes come with default instances for the basic types. But not all of them
Q: Why does Haskell reject the following expression?
(1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1) == (1,2,3,4,5,6,7,8,9,10,1,1,1,1,1,1)
> instance (Eq a, Eq b, Eq c, Eq d, Eq e, Eq f, Eq g, Eq h, Eq i,
> Eq j, Eq k, Eq l, Eq m, Eq n, Eq o, Eq w) =>
> Eq (a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, w) where
> _ == _ = False
Using Typeclasses
We have already see how type classes integrate with the rest of the Haskell’s type.
Recall the insert
function
> insert x [] = [x]
> insert x (y:ys)
> | x <= y = x:y:ys
> | otherwise = y:insert x ys
Notice that we didn’t write down the type of the insert
. Lets see what the type is
<ghci :t insert
insert :: Ord t => t -> [t] -> [t]
How, did the engine figure this out? Easy enough, if you look at the body of the insert, you’ll see that we compare two key values.
Constraint Propagation
Every function that calls insert
should propagate the Ord
constraint.
> insertSort :: Ord a => [a] ->[a]
> -- insertSort ::[Int] ->[Int]
> insertSort = foldl (flip insert) []
>
> f :: Int -> Int
> f x = x
Note, that now Haskell is not smart enough to figure out the constraint, so we need an explicit type signature.
Explicit Signatures
In the case of insertSort
Haskell could have guessed a proper type. But, there are cases when the use of type classes requires explicit annotations (which change the behavior of the code.)
For example, Read
is a built-in typeclass, where any instance a of Read
has a function
read :: (Read a) => String -> a
which can parse a string and turn it into an a. Thus, Read
is, in a sense, the dual of the Show
.
Q: Is it possible that read
creates any value of type a
? Does this remind you any function we saw earlier?
Q: What does the expression read "2"
evaluate to?
Haskell is foxed, because it doesn’t know what to convert the string to! Did we want an Int
or a Double
? Or maybe something else altogether. Thus, we get back the complaint
interactive>:1:0:
Ambiguous type variable `a' in the constraint:
`Read a' arising from a use of `read' at <interactive>:1:0-9
Probable fix: add a type signature that fixes these type variable(s)
which clearly states what the issue is. Thus, here an explicit type annotation is needed to tell it what to convert the string to. Thus, if we play nice and add the types we get
ghci> (read "2") :: Int
2
ghci> (read "2") :: Float
2.0
Note the different results due to the different types.
Instantiating Typeclasses
So far we have seen Haskell’s nifty support for overloading by observing that
some standard types are instances of standard type classes, and
new types can be automatically made instances of standard type classes.
However, in many situations the automatic instantiation doesn’t quite cut it, and instead we need to (and get to!) create our own instances.
For example suppose you want to compare two tic-tac-toe tiles.
> data Tile = X | O | EmptyTile
> -- deriving (Eq)
Q: What does
ghci> EmptyTile == EmptyTile
evaluate to?
The equality test is structural, as in, a data type is always equal to itself.
What is we want to define tile-equality so that comparison with EmptyTile
is always False
?
We can write our own tile-equality operator to capture exactly this crazy requirement.
> instance Eq Tile where
> X == X = True
> O == O = True
> _ == _ = False
Q: Now what does
ghci> EmptyTile == EmptyTile
evaluate to?
How about
ghci> EmptyTile /= EmptyTile
To undertand how, (/=)
is evaluated let us look at the full definition of the Eq
typeclass. Ah! the typeclass definition also provides default implementations of each operation (in terms of the other operation.) Thus, all we need to do is define (==)
and we will get (/=)
(not-equals) for free!
In general, when instantiating a typeclass, Haskell will check that we have provided a minimal implementation containing enough functions from which the remaining functions can be obtained (via their default implementations.)
Laws
In addition to the explicit type requirements, a typeclass also encodes a set of laws that describe the relationships between the different operations. For example, the intention of the Eq
typeclass is that the supplied implementations of (==)
and (/=)
satisfy the law
forall t1 t2, t1 == t2 <==> not t1 /= t2
forall t1, t1 == t1
Unfortunately, there is no way for Haskell to verify that your implementations satisfy the laws, so this is something to be extra careful about, when using typeclasses.
Type classes are interfaces
Many of you might have experience with object-oriented programming languages like Java. You are in danger! Do not fall in the trap of confusing type classes (in Haskell) with classes (in Java)! They are not very similar, and you do not use them to solve the same problems.
If anything, type classes correspond to interfaces in Java: Both contain methods without implementation and their type signatures, and instances provide the implementation.
Classes and objects as in Java do not have a direct correspondence in Haskell, and that is ok, because problems are approached differently. But the Interaction type that we defined last week is, in some sense, an approximation of a class. The concrete Interactions that we defined are instances of this class, and functions like withStartScreen relate to inheritance (or maybe ot the decorator pattern).
Internals of Type classes
Type classes define interfaces that in Haskell’s terms are called dictionaries.
For instance, from the Ord
class definition the compiler will create a dictionary that stores all the class methods. This dictionary will be just a data type with one field per each defined method:
> data OrdDict a
> = OrdDict { leq :: a -> a -> Bool
> , gt :: a -> a -> Bool
> }
This dictionary will be an implicit argument to all the functions that use the class methods. The implicit argument will be automatically filled in by the compiler.
The compiler will transform our insert
function to a function that takes an extra explicit dictionary argument and use this argument to compare values of type a
.
> insert' dict x [] = [x]
> insert' dict x (y:ys)
> | leq dict x y = x:y:ys
> | otherwise = y:insert' dict x ys
>
> insertSort' :: OrdDict a -> [a] ->[a]
> insertSort' dict = foldl (flip (insert' dict)) []
>
>
> insertSort' :: Ord a => [a] ->[a]
> insertSort' = foldl (flip (insert)) []
Note how insertSort'
propagates the dictionary to insert'
.
When we define instances of the Ord
class we basically define dictionary values.
> intDict :: OrdDict Int
> intDict = OrdDict (<=) (>)
Then, the compiler together with the type inferences figure out how to properly pass around these dictionaries.
> insertInt :: [Int]
> insertInt = insertSort' intDict [4, 3, 6, 2, 9]
Creating Typeclasses
It turns out that typeclasses are useful for many different things. We will see some of those over the next few lectures, but let us conclude today’s class with a quick example that provides a (very) small taste of their capabilities.
JSON
JavaScript Object Notation or JSON is a simple format for transferring data around. Here is an example:
{ "name" : "Niki"
, "age" : 30
, "likes" : ["guacamole", "coffee", "cats"]
, "hates" : [ "waiting" , "grapefruit"]
, "lunches" : [ {"day" : "monday", "loc" : "sweet green"}
, {"day" : "tuesday", "loc" : "stamp"}
, {"day" : "wednesday", "loc" : "farmers market"}
, {"day" : "thursday", "loc" : "lab"}
, {"day" : "friday", "loc" : "home"} ]
}
In brief, each JSON object is either
a base value like a string, a number or a boolean,
an (ordered) array of objects, or
a set of string-object pairs.
Thus, we can encode (a subset of) JSON values with the datatype
> data JVal = JStr String
> | JNum Double
> | JBln Bool
> | JObj [(String, JVal)]
> | JArr [JVal]
> deriving (Eq, Ord, Show)
Thus, the above JSON
value would be represented by the JVal
> js1 =
> JObj [("name", JStr "Niki")
> ,("age", JNum 30)
> ,("likes", JArr [ JStr "guacamole", JStr "coffee", JStr "cats"])
> ,("hates", JArr [ JStr "waiting" , JStr "grapefruit"])
> ,("lunches", JArr [ JObj [("day", JStr "monday")
> ,("loc", JStr "sweet green")]
> , JObj [("day", JStr "tuesday")
> ,("loc", JStr "farmers market")]
> , JObj [("day", JStr "wednesday")
> ,("loc", JStr "farmers market")]
> , JObj [("day", JStr "thursday")
> ,("loc", JStr "lab")]
> , JObj [("day", JStr "friday")
> ,("loc", JStr "home")]
> ])
> ]
Serializing Haskell values to JSON
Next, suppose that we want to write a small library to serialize Haskell values as JSON
. We could write a bunch of functions like
> doubleToJSON :: Double -> JVal
> doubleToJSON = JNum
similarly, we have
> stringToJSON :: String -> JVal
> stringToJSON = JStr
>
> boolToJSON :: Bool -> JVal
> boolToJSON = JBln
But what about collections, namely objects and arrays? We might try
> doublesToJSON :: [Double] -> JVal
> doublesToJSON xs = JArr (map doubleToJSON xs)
>
> boolsToJSON :: [Bool] -> JVal
> boolsToJSON xs = JArr (map boolToJSON xs)
>
> stringsToJSON :: [String] -> JVal
> stringsToJSON xs = JArr (map stringToJSON xs)
which of course, you could abstract by making the individual-element-converter a parameter
> xsToJSON :: (a -> JVal) -> [a] -> JVal
> xsToJSON f xs = JArr (map f xs)
>
> xysToJSON :: (a -> JVal) -> [(String, a)] -> JVal
> xysToJSON f kvs = JObj [ (k, f v) | (k, v) <- kvs ]
but still, this is getting rather tedious, since we have to redefine versions for each Haskell type, and instantiate them by hand for each conversion
ghci> doubleToJSON 4
JNum 4.0
ghci> xsToJSON stringToJSON ["coffee", "guacamole", "cats"]
JArr [JStr "coffee",JStr "guacamole",JStr "cats"]
ghci> xysToJSON stringToJSON [("day", "monday"), ("loc", "zanzibar")]
JObj [("day",JStr "monday"),("loc",JStr "zanzibar")]
and this gets more hideous when you have richer objects like
> lunches = [ [("day", "monday"), ("loc", "sweet green")]
> , [("day", "tuesday"), ("loc", "stamp")]
> , [("day", "wednesday"), ("loc", "farmers market")]
> , [("day", "thursday"), ("loc", "lab")]
> , [("day", "friday"), ("loc", "home")]
> ]
because we have to go through gymnastics like
ghci> xsToJSON (xysToJSON stringToJSON) lunches
JArr [JObj [("day",JStr "monday"),("loc",JStr "sweet green")],JObj [("day",JStr "tuesday"),("loc",JStr "stamp")]]
Ugh! So much for readability. Isn’t there a better way? Is it too much to ask for a magical toJSON
that just works?
Type classes to the rescue!
Of course there is a better way, and the the route is paved by typeclasses!
Lets define a typeclass that describes any type that can be converted to JSON
.
> class JSON a where
> toJSON :: a -> JVal
Easy enough. Now, we can make all the above instances of JSON
like so
> instance JSON Double where
> toJSON = JNum
>
> instance JSON Bool where
> toJSON = JBln
>
> instance JSON String where
> toJSON = JStr
Now, we can just say
ghci> toJSON 4
JNum 4.0
ghci> toJSON True
JBln True
ghci> toJSON "guacamole"
JStr "guacamole"
Bootstrapping Instances
The real fun begins when we get Haskell to automatically bootstrap the above functions to work for lists and association lists!
> instance {-# OVERLAPS #-} JSON a => JSON [a] where
> toJSON xs = JArr [toJSON x | x <- xs]
Whoa!
The above says, if a is an instance of JSON
, that is, if you can convert a to JVal
then here’s a generic recipe to convert lists of a values!
ghci> toJSON [True, False, True]
JArr [JBln True,JBln False,JBln True]
ghci> toJSON ["cat", "dog", "Mouse"]
JArr [JStr "cat",JStr "dog",JStr "Mouse"]
ghci> toJSON [["cat", "dog"], ["mouse", "rabbit"]]
JArr [JArr [JStr "cat",JStr "dog"],JArr [JStr "mouse",JStr "rabbit"]]
Of course, we can pull the same trick with key-value lists
> instance (JSON a) => JSON [(String, a)] where
> toJSON kvs = JObj [ (k, toJSON v) | (k, v) <- kvs ]
after which, we are all set!
ghci> toJSON lunches
JArr [JObj [("day",JStr "monday"),("loc",JStr "stamp")],JObj [("day",JStr "tuesday"),("loc",JStr "stamp")]]
It is also useful to bootstrap the serialization for tuples (upto some fixed size) so we can easily write “non-uniform” JSON
objects where keys are bound to values with different shapes.
> instance (JSON a, JSON b) => JSON ((String, a), (String, b)) where
> toJSON ((k1, v1), (k2, v2)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2)]
>
> instance (JSON a, JSON b, JSON c) => JSON ((String, a), (String, b), (String, c)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3)]
>
> instance (JSON a, JSON b, JSON c, JSON d) => JSON ((String, a), (String, b), (String, c), (String,d)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4)]
>
> instance (JSON a, JSON b, JSON c, JSON d, JSON e) => JSON ((String, a), (String, b), (String, c), (String,d), (String, e)) where
> toJSON ((k1, v1), (k2, v2), (k3, v3), (k4, v4), (k5, v5)) =
> JObj [(k1, toJSON v1), (k2, toJSON v2), (k3, toJSON v3), (k4, toJSON v4), (k5, toJSON v5)]
Now, we can simply write
> hs = (("name" , "Niki")
> ,("age" , 30 :: Double)
> ,("likes" , ["guacamole", "coffee", "cats"])
> ,("hates" , ["waiting", "grapefruit"])
> ,("lunches", lunches)
> )
which is a Haskell value that describes our running JSON
example, and can convert it directly like so
> js2 = toJSON hs
This value is exactly equal to the old “hand-serialized” JSON object js1.
ghci> js1 == js2
True
Thats it for today. We will see much more typeclass awesomeness in the next few lectures…