Last active
August 29, 2015 14:16
-
-
Save DRMacIver/db7664e7bd5ee77d8cd7 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{-# LANGUAGE ExistentialQuantification, RankNTypes #-} | |
{- | |
Monadic generation and simplification of examples | |
It's easy to generate data monadically (that's just a Random monad) but it's | |
less obvious how to proceed in a way that allows you to simplify the generated | |
data. It's not even obvious how to do it functorially. | |
The problem is that in the obvious implementation of simplification you need | |
a way of mapping the new type back to the old, which fmap doesn't actually allow | |
you to do: It's a one way trip only. | |
The templatized data generation approach I developed in Hypothesis | |
(https://github.com/DRMacIver/hypothesis) turns out to let you do this and not | |
be too hard to make work in a typed manner. This sketches out some details of | |
how you might do this in Haskell. | |
The core idea is that strategies are parametrized by two types: One is the data | |
that is actually generated, the other an intermediate representation. Simplification | |
occurs on the IR and not on the final data, so mapping to a new type does not prevent | |
simplification: You just simplify the IR and then reapply the mapping. This second type parameter | |
is then hidden behind an existential type so that you can use different IRs for different | |
types and strategies. | |
Monadic bind is a little trickier but is easily resolved through the use of an additional | |
existential type. | |
Warning: I have only type checked this code, I have not actually run it. It's | |
probably broken. | |
Additional warning: My Haskell is super rusty so this code will probably be very non-idiomatic. | |
-} | |
module MonadicSimplification where | |
import System.Random | |
import Control.Applicative | |
{- | |
A strategy describes how to explore some type. It makes use of an intermediate | |
template value which it can convert to the final type. | |
Once generated data can be simplified. Note that simplification happens on the | |
template value rather than the final value. This will be important later. | |
-} | |
data StrategyWithTemplate t a = StrategyWithTemplate { | |
reify :: t -> a, | |
simplify :: forall g. (RandomGen g) => t -> Rand g t, | |
arbitrary :: forall g. (RandomGen g) => Rand g t | |
} | |
{- | |
Our actual operations use an existential type to hide the template. We don't | |
need or want to do anything with the template other than pass it back to the | |
strategy we got it from. | |
-} | |
data Strategy a = forall t. Strategy (StrategyWithTemplate t a) | |
{- | |
Given a strategy for how to explore data, attempt to find a simple counter-example | |
for a hypothesis about that data. | |
This is just a toy implementation. It generates a single random value and if that | |
is a counter-example performs 100 random simplifications to attempt to get a smaller | |
one. | |
-} | |
hypothesis :: (RandomGen g) => Strategy a -> (a -> Bool) -> Rand g (Maybe a) | |
hypothesis (Strategy arb) = hypothesisFromTemplate arb | |
instance Functor Strategy where | |
fmap f (Strategy a) = Strategy (mapWithTemplate f a) | |
instance Applicative Strategy where | |
pure x = Strategy (returnFromTemplate x) | |
(Strategy arbf) <*> (Strategy arba) = Strategy (combineWithTemplate arbf arba) | |
instance Monad Strategy where | |
return = pure | |
(Strategy arb) >>= f = Strategy (bindWithTemplate arb f) | |
x >> y = y | |
{- | |
Templatized versions of all the different operations which make it clear what is | |
actually happenin to the template type | |
-} | |
-- template aware implementation of return/pure | |
returnFromTemplate :: a -> StrategyWithTemplate () a | |
returnFromTemplate x = StrategyWithTemplate { | |
reify = const x, | |
simplify = return, | |
arbitrary = return () | |
} | |
-- template aware implementation of fmap | |
mapWithTemplate :: (a -> b) -> (StrategyWithTemplate t a) -> (StrategyWithTemplate t b) | |
mapWithTemplate f arb = StrategyWithTemplate{ | |
reify = f . (reify arb), | |
simplify = simplify arb, | |
arbitrary = arbitrary arb | |
} | |
-- template aware implementation of <*> | |
combineWithTemplate :: StrategyWithTemplate s (a -> b) -> StrategyWithTemplate t a -> StrategyWithTemplate (s, t) b | |
combineWithTemplate arbf arba = StrategyWithTemplate { | |
reify = \(s, t) -> (reify arbf s) (reify arba t), | |
simplify = \(s, t) -> do left <- coinFlip | |
if left then do ss <- simplify arbf s | |
return (ss, t) | |
else do st <- simplify arba t | |
return (s, st), | |
arbitrary = do s <- arbitrary arbf | |
t <- arbitrary arba | |
return (s, t) | |
} | |
{- | |
template aware implementation of >>= | |
This one is a little tricky. The problem is that the function we are given might return strategies with different | |
template types depending on the value of its input argument. Thus we use a BoxedTemplate b which bundles together | |
a template value with the strategy that produced it, allowing us to circumvent this problem. | |
-} | |
bindWithTemplate :: StrategyWithTemplate s a -> (a -> Strategy b) -> StrategyWithTemplate (s, BoxedTemplate b) b | |
bindWithTemplate arbx f = StrategyWithTemplate { | |
reify = \(s, (BoxedTemplate t arb)) -> reify arb t, | |
{- | |
Simplification either simplifies the template for a and generates an entirely fresh strategy and template | |
for the second one, or it leaves the first argument alone and simplifies the second using the strategy it | |
has bundled with it | |
-} | |
simplify = \(s, bt) -> do left <- coinFlip | |
if left | |
then do ss <- simplify arbx s | |
btNew <- drawBoxedTemplate (f $ reify arbx ss) | |
return (ss, btNew) | |
else do btNew <- simplifyBoxedTemplate bt | |
return (s, btNew), | |
arbitrary = do s <- arbitrary arbx | |
let x = reify arbx s | |
t <- drawBoxedTemplate (f x) | |
return (s, t) | |
} | |
data BoxedTemplate a = forall t. BoxedTemplate t (StrategyWithTemplate t a) | |
drawBoxedTemplate :: (RandomGen g) => Strategy a -> Rand g (BoxedTemplate a) | |
drawBoxedTemplate (Strategy arb) = do template <- arbitrary arb | |
return (BoxedTemplate template arb) | |
simplifyBoxedTemplate :: (RandomGen g) => BoxedTemplate a -> Rand g (BoxedTemplate a) | |
simplifyBoxedTemplate (BoxedTemplate t arb) = do st <- simplify arb t | |
return (BoxedTemplate st arb) | |
-- Template aware implementation of hypothesis | |
hypothesisFromTemplate :: (RandomGen g) => StrategyWithTemplate t a -> (a -> Bool) -> Rand g (Maybe a) | |
hypothesisFromTemplate arb f = do template <- arbitrary arb | |
let x = reify arb template | |
if f x then do simplified <- simplerTemplate template | |
return $ Just (reify arb simplified) | |
else return Nothing | |
where simplerTemplate t = runSimplifications 100 t | |
isCounterExample = f . reify arb | |
runSimplifications n t = if n <= 0 then return t | |
else do s <- simplify arb t | |
if isCounterExample s | |
then runSimplifications (n - 1) s | |
else runSimplifications (n - 1) t | |
{- | |
A simple random data generation monad. Yes I know about Control.Monad.Random | |
but it was easier to do this as a stand-alone thing this way. | |
-} | |
data Rand g a = Rand (g -> (a, g)) | |
instance (RandomGen g) => Functor (Rand g) where | |
fmap f (Rand d) = Rand (\g -> let (x, g2) = d g in (f x, g2)) | |
instance (RandomGen g) => Applicative (Rand g) where | |
pure x = Rand (\g -> (x, g)) | |
(Rand rf) <*> (Rand rx) = Rand (\g -> let (f, g2) = rf g in let (x, g3) = rx g2 in (f x, g3)) | |
instance (RandomGen g) => Monad (Rand g) where | |
return = pure | |
(Rand rx) >>= f = Rand (\g -> let (x, g2) = rx g in let (Rand rg) = f x in rg g2) | |
coinFlip :: (RandomGen g) => Rand g Bool | |
coinFlip = Rand random |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It's been pointed out to me that this isn't actually quite right: This breaks the monad and applicative laws, albeit in a way that is hard to observe. The distribution of simplify in (return x >>= f) is different from that of (f x) because the former will return x 50% of the time.
I think this is almost fixable by changing the interface to simplify to
And then have the bound simplify implementation be to try a mix of simplifying from the left and the right. That's probably a better interface anyway.
However I think this also then violates the associativity laws. I'm not sure I'm really bothered by that though. You can make it easily satisfy the distributional laws in the weaker sense that they might have different effects on the RNG and produce the same distribution of results from it but might produce different draws from that distribution.