Last active
October 25, 2022 13:52
-
-
Save gelisam/d789246eacfa0bfc75d28e2b492f9a7d to your computer and use it in GitHub Desktop.
A version of ConduitT with N input streams instead of 1
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- Follow up to [1], praising @viercc's better solution [2]. | |
-- | |
-- [1] https://gist.github.com/gelisam/a8bee217410b74f030c21f782de23d11 | |
-- [2] https://www.reddit.com/r/haskell/comments/yb9bi4/comment/itfh07z | |
-- | |
-- The challenge is still to implement a function which takes in three | |
-- Conduits, and uses the values from the first Conduit in order to decide | |
-- which of the other two Conduits to sample from next. Something like this: | |
-- | |
-- example bools ints strings = do | |
-- maybeBool <- awaitMaybe bools | |
-- case maybeBool of | |
-- Nothing -> do | |
-- liftIO $ putStrLn "it's over" | |
-- Just True -> do | |
-- int <- await ints | |
-- yield (int + 4) | |
-- example | |
-- Just False -> do | |
-- str <- await strings | |
-- liftIO $ putStrLn str | |
-- yield $ length str | |
-- example | |
-- | |
-- And I still want a solution which works for n Conduits, not just three. But | |
-- this time, instead of switching to the notoriously-complex "machines" | |
-- package, I'll stick to conduit, thanks to @viercc's great tip of using | |
-- Conduit _transformers_. | |
{-# LANGUAGE LambdaCase, RankNTypes, ScopedTypeVariables #-} | |
module Main where | |
import Test.DocTest | |
import Control.Monad.IO.Class (MonadIO(liftIO)) | |
import Control.Monad.Trans.Class (lift) | |
import Data.Conduit hiding (Source) | |
import Data.Functor.Identity (Identity(runIdentity)) | |
import Data.Void (absurd) | |
-- $setup | |
-- >>> import qualified Data.Conduit.Combinators as Conduit | |
-- >>> import qualified Data.Conduit.List as Conduit | |
-- >>> :{ | |
-- let testConduit | |
-- :: Show o | |
-- => ConduitT () o IO () | |
-- -> IO () | |
-- testConduit source = do | |
-- os <- connect source Conduit.consume | |
-- print os | |
-- :} | |
-- In the previous post, I used a type-level list to keep track of the | |
-- multiple inputs. This time, we'll represent a conduit with n inputs as a | |
-- stack of n 'ConduitT' transformers. | |
type Source m o = ConduitT () o m () | |
type Process m a o = ConduitT a o m () | |
type Tee m a b o = ConduitT a o | |
(ConduitT b Void m) () | |
type Tee3 m a b c o = ConduitT a o | |
(ConduitT b Void | |
(ConduitT c Void m)) () | |
-- Each 'ConduitT' layer has two type arguments; one for the elements you | |
-- await from the single input you get when you use conduits the normal way, | |
-- and one for the elements you send downstream. With n layers, we can thus | |
-- specify n inputs, which is what we want, but we must also specify n | |
-- outputs, which is n-1 too many. Using 'Void' for all but one of the outputs | |
-- clarifies that only one of them is actually used. | |
-- Now that we have unexpectedly managed to represent conduits which take more | |
-- than one input, how can we attach those inputs? In the previous post, I | |
-- defined a versatile 'polyCapL' function which could attach a 'Source' to a | |
-- number of different machines: | |
-- | |
-- -- polyCapL :: Source m a -> Process m a o -> Source m o | |
-- -- polyCapL :: Source m a -> Tee m a b o -> Process m b o | |
-- -- polyCapL :: Source m a -> Tee3 m a b c o -> Tee m b c o | |
-- polyCapL | |
-- :: Source m a1 | |
-- -> PolyTee m (a1 ': as) o | |
-- -> PolyTee m as o | |
-- | |
-- I would like to construct a similar function here. As before, I want to | |
-- attach a single source as the first input 'a1', while leaving the remaining | |
-- inputs 'as' untouched. In this conduit stack representation, the first | |
-- input is the input of the outermost 'ConduitT' layer, while the remaining | |
-- inputs are specified by the rest of the layers, 'mm': | |
-- | |
-- myPolyCapL | |
-- :: Source m a1 () | |
-- -> ConduitT a1 o mm r | |
-- -> mm r | |
-- | |
-- In order for this type to specialize to these, | |
-- | |
-- myPolyCapL :: Source m a -> Process m a o -> Source m o | |
-- myPolyCapL :: Source m a -> Tee m a b o -> Process m b o | |
-- myPolyCapL :: Source m a -> Tee3 m a b c o -> Tee m b c o | |
-- | |
-- I need to somehow specify that the monad at the base of the 'mm' stack must | |
-- be 'm'. When 'm' is IO, this is represented using a 'MonadIO' constraint: | |
-- | |
-- liftIO :: forall x. IO x -> mm x | |
-- | |
-- There exists a @MonadBase m@ constraint which generalizes 'MonadIO': | |
-- | |
-- liftBase :: forall x. m x -> mm x | |
-- | |
-- But instead of adding an orphan @MonadBase m (ConduitT i o mm)@ instance, | |
-- I'll just ask for an extra @forall x. m x -> mm x@ parameter: | |
-- | |
-- myPolyCapL | |
-- :: (forall x. m x -> mm x) | |
-- => Source m a1 () | |
-- -> ConduitT a1 o mm r | |
-- -> mm r | |
-- Another way in which the type above isn't quite right is that the output | |
-- type 'o' disappears. The fix is quite simple: instead of only concretizing | |
-- the very outermost ConduitT layer and leaving the rest abstract, I | |
-- concretize the _two_ outermost ConduitT layers: | |
myPolyCapL | |
:: forall a b o m mm r. (Monad m, Monad mm) | |
=> (forall x. m x -> mm x) | |
-> ConduitT () a m () | |
-> ConduitT a o (ConduitT b Void mm) r | |
-> ConduitT b o mm r | |
myPolyCapL liftM src doubleConduit | |
= connect src' doubleConduit'' | |
where | |
-- The implementation looks very different from polyCapL's, but it's the | |
-- same idea. It just so happens that the conduit API is expressive enough | |
-- that we can achieve our goal via several small transformations, without | |
-- having to unravel the conduits into sequences of instructions. | |
src' :: ConduitT () a (ConduitT b o mm) () | |
src' | |
= transPipe (lift . liftM) src | |
doubleConduit' :: ConduitT a o (ConduitT b o mm) r | |
doubleConduit' | |
= transPipe (mapOutput absurd) doubleConduit | |
-- The 'doubleConduit' transformations swap the 'Void' and 'o' output | |
-- types. | |
-- At this point one might wonder why I chose the convention of | |
-- using 'Void' for all but the _outermost_ layer. If I had chosen the | |
-- innermost layer instead, the 'o' would already be in the right | |
-- position, and I wouldn't need to perform any transformations on | |
-- 'doubleConduit'! | |
-- The reason is simply to provide a more ergonomic experience to the | |
-- user: by choosing the outermost layer, the user can emit by writing | |
-- | |
-- emit o | |
-- | |
-- Whereas if I had chosen the innermost layer, the user would have to | |
-- write this instead. | |
-- | |
-- lift $ lift $ emit o | |
doubleConduit'' :: ConduitT a Void (ConduitT b o mm) r | |
doubleConduit'' | |
= fuseUpstream doubleConduit' outputToInner | |
outputToInner :: ConduitT o Void (ConduitT b o mm) () | |
outputToInner = do | |
await >>= \case | |
Just o -> do | |
lift $ yield o | |
outputToInner | |
Nothing -> do | |
pure () | |
-- The types line up, but how does this work? We're compiling down to a single layer, so how did we persuade conduit to magically create a ConduitT with more than one input? | |
-- | |
-- The secret is, we didn't! Conduits are combined in the same way machines are: by lining up and then eliminating matching `yield` and `await` instructions. Thus, after we have attached an input and eliminated a ConduitT layer, all the `await` calls which were reading from that input have been replaced by a fragment of the code from that input, namely the code between two consecutive `yield`s. That's the magic of representing computations as a sequence of instructions, we can splice and rearrange those instructions! | |
-- | |
-- In fact, machines are represented in pretty much the same way as conduits, as a sequence of instructions, they aren't stored as a tree or a graph of instructions as one might expect. Pretty much the only difference is that a machine's `awaits` instruction takes an extra argument specifying which inputs it is awaiting from. | |
-- Anyway, since I suspect that most conduit stacks will either have IO or Identity as | |
-- a base monad, here are two specializations which fill-in the 'liftM' | |
-- parameter. | |
-- polyCapIO :: Source IO a -> Process IO a o -> Source IO o | |
-- polyCapIO :: Source IO a -> Tee IO a b o -> Process IO b o | |
-- polyCapIO :: Source IO a -> Tee3 IO a b c o -> Tee IO b c o | |
polyCapIO | |
:: forall a b o mm r. MonadIO mm | |
=> Source IO a | |
-> ConduitT a o (ConduitT b Void mm) r | |
-> ConduitT b o mm r | |
polyCapIO | |
= myPolyCapL liftIO | |
polyCap | |
:: forall a b o mm r. Monad mm | |
=> ConduitT () a Identity () | |
-> ConduitT a o (ConduitT b Void mm) r | |
-> ConduitT b o mm r | |
polyCap | |
= myPolyCapL (pure . runIdentity) | |
-- In the previous post, I also implemented a 'polyCapR' function for | |
-- converting a fully-saturated 'PolyTee' into a 'Source', so it can be used | |
-- with existing machine combinators. | |
-- | |
-- polyCapR | |
-- :: PolyTee m '[] b | |
-- -> Source m b | |
-- | |
-- With the conduit layers representation, such a function is not needed. | |
-- After attaching all but one of the inputs, the result is a single ConduitT | |
-- layer, that is, a normal conduit which can already be used with existing | |
-- conduit combinators. In particular, 'fuse' can be used to attach the last | |
-- input, thus converting the conduit to a source. | |
-- We can finally implement the challenge; twice, in order to exercise both | |
-- the IO and Identity specializations. | |
-- | | |
-- >>> :{ | |
-- testConduit | |
-- $ fuse (Conduit.yieldMany ["foo", "bar", "quux"]) | |
-- $ polyCapIO (Conduit.yieldMany [1..]) | |
-- $ polyCapIO (Conduit.yieldMany [True, False, False, True, False]) | |
-- $ exampleIO | |
-- :} | |
-- foo | |
-- bar | |
-- quux | |
-- it's over | |
-- [5,3,3,6,4] | |
exampleIO | |
:: ConduitT Bool Int | |
(ConduitT Int Void | |
(ConduitT String Void IO)) | |
() | |
exampleIO = do | |
maybeBool <- await | |
case maybeBool of | |
Nothing -> do | |
liftIO $ putStrLn "it's over" | |
Just True -> do | |
Just int <- lift await | |
yield (int + 4) | |
exampleIO | |
Just False -> do | |
Just str <- lift $ lift await | |
liftIO $ putStrLn str | |
yield $ length str | |
exampleIO | |
-- | | |
-- >>> :{ | |
-- testConduit | |
-- $ fuse (Conduit.yieldMany ["foo", "bar", "quux"]) | |
-- $ polyCap (Conduit.yieldMany [1..]) | |
-- $ polyCap (Conduit.yieldMany [True, False, False, True, False]) | |
-- $ example | |
-- :} | |
-- [5,3,3,6,4] | |
example | |
:: forall m. MonadFail m | |
=> ConduitT Bool Int | |
(ConduitT Int Void | |
(ConduitT String Void m)) | |
() | |
example = do | |
maybeBool <- await | |
case maybeBool of | |
Nothing -> do | |
pure () | |
Just True -> do | |
Just int <- lift await | |
yield (int + 4) | |
example | |
Just False -> do | |
Just str <- lift $ lift await | |
yield $ length str | |
example | |
main :: IO () | |
main = do | |
putStrLn "typechecks." | |
test :: IO () | |
test = do | |
doctest ["src/Main.hs"] |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment