Idris is syntactically quite Haskell-like, and especially it has do
notation for sequencing "actions".
Like (traditional) Haskell, do
blocks are desugared to a sequence of >>=
(bind) operators. But,
unlike (traditional) Haskell, that >>=
is not always the monadic >>= : m a -> (a -> m b) -> m b
. (This can also happen in Haskell using rebindable syntax)
In Idris, you can use different "better-than-monad" types to statically (at compile time) reason about a computation beyond using the monad laws. For example, an effect system might track which effects are in scope.
Total parsing
In the case of Text.Parser (in the Idris
contrib/
package) the type signature of actions (
Grammar _ _ _
indicates whether a parser consumes any characters so that the compiler can tell if a parser might loop forever. (see
http://www.cse.chalmers.se/~nad/publications/danielsson-parser-combinators.html)
I was trying to write a new JSON parser for idris-todaybot using Text.Parser. Previously JSON was parsed using lightyear, but Text.Parser has more gratuitous dependent types so was an obvious way to proceed.
A problem.
I ran into a surprising compile error which initially made no sense to me at all.
This code compiles:
objectValuePair : Grammar Char True (List Char, ())
-- definition elided
jsonObject : Grammar Char True (List Char, ())
jsonObject = do
llll <- objectValuePair
pure llll
where llll
is a tuple; but the following version of jsonObject, which desconstructs that tuple and reassembles it, does not compile:
jsonObject : Grammar Char True (List Char, ())
jsonObject = do
(k,v) <- objectValuePair
pure (k,v)
It gives this error:
When checking right hand side of Main.case block
in jsonObject at bug-bind.idr:55:14 with expected type
Grammar Char c2 (List Char, ())
Type mismatch between
Grammar Char False (List Char, ()) (Type of pure (k, v))
and
Grammar Char c2 (List Char, ()) (Expected type)
Specifically:
Type mismatch between
False
and
c2
Another attempt to deconstruct llll
also fails:
jsonObject : Grammar Char True (List Char, ())
jsonObject = do
llll <- objectValuePair
let (k,v) = llll
pure (k,v)
but the following deconstruction by function application rather than pattern matching succeeds:
jsonObject : Grammar Char True (List Char, ())
jsonObject = do
llll <- objectValuePair
let k = fst llll
let v = snd llll
pure (k,v)
That type error
Let's dig into that type error:
Type mismatch between
Grammar Char False (List Char, ()) (Type of pure (k, v))
and
Grammar Char c2 (List Char, ()) (Expected type)
Grammer _ _ _
is the type of parser actions, where the first parameter Char
is the type of symbols we're consuming, the final parameter (List Char, ())
is the type that the parser will return on success, and the middle parameter (False
or c2
) represents whether the parser will definitely consume input (True
) or might succeed without consuming anything (False
- for example, a parser which removes whitespace, or pure
which never even looks at the input stream).
This "consumes" parameter contains the main novelty in Text.Parser beyond monadic parser combinators: Text.Parser combinators manipulate and use this value at compile time to help check that parsers really will consume things: for example, a parser that definitely consumes followed by a parser that might not, results in a parser that definitely consumes; while sequencing two parsers that might not consume results in a parser that might not consume. (See: the source)
So what on earth has this parameter, manipulated by >>=
, got to do with pattern matching pure
values after they've already been returned by an action?
Desugaring
It turns out we can forget that our troublesome tuple is being returned from an action; let (a,b) = (1,2)
breaks in the same way when run inside a Text.Parser
do
block.
Let's (very roughly) desugar some of the examples above, and then look at the types involved:
jsonObject : Grammar Char True (List Char, ())
jsonObject = do
llll <- objectValuePair
pure llll
-- becomes:
jsonObject = objectValuePair >>= (\v => pure v)
jsonObject = do
(k,v) <- objectValuePair
-- becomes:
objectValuePair >>= (\(k,v) => pure (k,v))
-- becomes:
objectValuePair >>= (\llll => case llll of
(k,v) => pure (k,v)
)
So in the second fragment, there's an extra case
expression in there
to deconstruct llll
using pattern matching.
Apparently that gets in the way of type inference/checking:
- On the one hand, that
pure
has type: Grammar Char False (List Char, ())
- false because it may (actually, will always) succeed without consuming input.
- On the other hand,
>>=
doesn't care whether the right hand side consumes or not - it will take either, as shown by the compile time variable c2
appearing in the error message.
Idris doesn't manage to unify
c2 = False
.
With further pinning of types using the
, an uglier form of pattern matching does work:
export jsonObject : Grammar Char True (List Char, ())
jsonObject =
do
llll <- objectValuePair
the (Grammar Char False (List Char, ())) $ do
let (k,v) = llll
pure (k, v)
Ugh
Thanks
Thanks to Melvar on #idris for explaining this.