This was really interesting to me, because I had never thought of pattern matching and the Visitor pattern as being "competitors", so to speak. I guess this would mainly apply when your pattern matching is based on type.
I was further intrigued because this argument (pattern matching is better than Visitor) seems to be the opposite argument made in the Google C++ Style Guide, which I usually agree with. The Google C++ Style Guide argues against using RTTI to do type-based dispatch (https://google-styleguide.googlecode.com/svn/trunk/cppguide....):
"Decision trees based on type are a strong indication that your code is on the wrong track. Code such as this usually breaks when additional subclasses are added to the class hierarchy. Moreover, when properties of a subclass change, it is difficult to find and modify all the affected code segments."
It's very interesting to me that there is such wide disagreement here on which of these two patterns represents better design.
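To make that concrete, here's a minimal sketch (hypothetical Shape/Circle/Square names, not from either source) of the RTTI-based "decision tree on type" the Google guide is warning about. A visitor, by contrast, turns each missing case into a compile error, because every new subclass forces a new pure-virtual overload:

```cpp
#include <cassert>

// Illustrative hierarchy; the names are made up for this example.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { double r; explicit Circle(double r) : r(r) {} };
struct Square : Shape { double s; explicit Square(double s) : s(s) {} };

double area(const Shape& sh) {
    // Adding a Triangle subclass later silently falls through to the
    // last branch -- nothing forces this function to be updated, which
    // is exactly the fragility the style guide describes.
    if (auto* c = dynamic_cast<const Circle*>(&sh)) return 3.14159 * c->r * c->r;
    if (auto* s = dynamic_cast<const Square*>(&sh)) return s->s * s->s;
    return 0.0;
}
```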
Yeah, it's cute, but it doesn't provide exhaustiveness checking, which is what the Google guide is concerned about. I think Scala came up with a neat way to work that into an OO model with its sealed modifier. A sealed class can't be extended outside its original source file, so its inheritance hierarchy is fixed. Pattern matches on a sealed class hierarchy will warn about a lack of exhaustiveness, just like matches on ADTs do in Haskell, Rust, etc.
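Modern C++ has a rough analogue of that closed-world guarantee: a std::variant fixes the set of alternatives, and std::visit fails to compile if a case is unhandled. A sketch of that idea (illustrative Lit/Add names, C++17):

```cpp
#include <cassert>
#include <variant>

// A closed ("sealed") set of cases, modeled as a variant.
struct Lit { int value; };
struct Add { int lhs, rhs; };
using Expr = std::variant<Lit, Add>;

// The usual overload-set helper for std::visit.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

int eval(const Expr& e) {
    // If a new alternative is added to Expr and no matching lambda
    // exists here, this fails to compile: a compile-time
    // exhaustiveness check, analogous to matching on a sealed class.
    return std::visit(overloaded{
        [](const Lit& l) { return l.value; },
        [](const Add& a) { return a.lhs + a.rhs; },
    }, e);
}
```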
For the language history buffs: sealed classes aren't a Scala invention. Dylan had them, too (at the library, rather than the source level. http://opendylan.org/books/dpg/perform.html)
(Better language history buffs than me, feel free to show prior art)
The name comes from choosing the best way to represent a datatype for expressions in a programming language. The pattern-matching approach makes it easy to define new functions on top of ASTs and the OO-approach makes it easy to add new kinds of node to the tree but its really hard to make a system that is easy to extend in both directions.
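To illustrate the trade-off with a tiny sketch (hypothetical Num/Plus names): in the OO encoding below, adding a new node kind is just one new class, but adding a new operation (say, pretty-printing) means editing every class. The pattern-matching encoding inverts exactly that.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// OO encoding of an expression tree: operations live inside the nodes.
struct Expr {
    virtual ~Expr() = default;
    virtual int eval() const = 0;  // a new operation => edit every subclass
};
struct Num : Expr {
    int v;
    explicit Num(int v) : v(v) {}
    int eval() const override { return v; }
};
struct Plus : Expr {  // a new node kind => just one new class like this
    std::unique_ptr<Expr> l, r;
    Plus(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : l(std::move(l)), r(std::move(r)) {}
    int eval() const override { return l->eval() + r->eval(); }
};
```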
Seems like performance is a real advantage, too, similar to why we sometimes see CRTP used for the Visitor pattern. I'm using CRTP for a project that requires extremely low latency, and compared to its virtual counterpart, it delivers.
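In case it helps, a minimal CRTP sketch (illustrative names, not from my project): the base class casts down to the derived type at compile time, so the call can be inlined instead of going through a vtable:

```cpp
#include <cassert>

// CRTP: the base is parameterized on the derived class, so dispatch is
// resolved statically -- no virtual call, no vtable lookup.
template <class Derived>
struct ShapeBase {
    double area() const {
        return static_cast<const Derived&>(*this).area_impl();
    }
};

struct Square : ShapeBase<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

// Generic code over the static interface; the call is resolved at
// compile time and can be fully inlined.
template <class D>
double total_area(const ShapeBase<D>& s) { return s.area(); }
```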
Python 3 gets type hints, JavaScript gets types (via Facebook's Flow), now C++ gets pattern matching. Seems like every successful programming language will eventually try to become ML (with meta-programming).
Robin Milner, you had it all sussed out in the early 1970s, it took the rest of the world about 40 years to catch up!
I don't think type hints do Hindley/Milner/Damas justice. This type system has two parts, neither of which is present in most popular languages:
1. A program that typechecks cannot have type errors at runtime. Dynamic-type (i.e. runtime-tag) based constructs like instanceof break this, and most (all?) OOP languages have them in some form.
2. Type inference: the types of expressions can be unified to the most general type that works for all of them. I don't know any popular language that is able to do this.
Most of the typing I see added to languages these days cherry-picks consequences of the HM algorithm. C++'s auto keyword, for example, is a very simple version of type inference; type hints in dynamic languages aren't checked (or checkable) statically; and when runtime tags are involved, it's hard to speak of static type inference.
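For example, a sketch of what `auto` does and does not give you: it deduces each declaration's type locally from its initializer, rather than unifying types across a whole program the way HM does:

```cpp
#include <cassert>
#include <vector>

// `auto` is local, one-directional deduction from the initializer --
// far weaker than HM's whole-program unification.
auto square(int x) { return x * x; }   // return type deduced as int (C++14)

int demo() {
    std::vector<int> v{1, 2, 3};
    auto total = 0;                    // deduced as int (not, say, long)
    for (auto it = v.begin(); it != v.end(); ++it)
        total += *it;                  // it: std::vector<int>::iterator
    return total + square(2);
}
```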
I wouldn't call ML popular, and in Haskell you have type classes, which have undecidable type inference. You can write `show (read str)` in Haskell for example, which parses `str` and then converts it back to `String` again. But the compiler cannot infer which type you want to parse to: is the input an Int, a Double, a JSON object, any other read/showable format? In these (rare) cases, you need explicit type annotations.
(Without type classes, type inference is decidable in Haskell.)
As an aside, Hindley–Milner is DEXPTIME-complete, and indeed you can write 5 line programs that exhibit the worst-case behaviour. That's as bad a worst-case behaviour as undecidability for all practical purposes.
That said, in practice, HM type inference tends to be very fast, because the worst-case behaviour only shows up in rather artificial programs that human programmers would normally not write. If you bound the depth of let-nesting, HM is polynomial.
You're correct on the theoretical side, and writing a value whose type makes GHC crash is fairly easy. But in multiple years of Haskell, I did not encounter nor have I heard of a practical case in which type inference time was a performance issue, let alone a bottleneck or problem.
Neat to see; I think Rust shines here, since pattern matching is tightly coupled with the handling of values that may be null, and pattern matching in Rust is exhaustive. [0] "But, the C++ compiler let me get away with not handling the case where the pointer was invalid (even if doing nothing in the case of “handling” it). ... Not only did the [Rust] compiler prevent me from generating an executable, it told me that a pattern was not exhaustive, explicitly which one, and what case that was not covered."
Impressive effort, but pretty chthonian syntax, and that's with using macros. My main takeaway from this is that pattern matching needs to be a language feature to see pervasive use.
Stroustrup is pretty opposed to the idea that anything that can be done as a library without language extensions automatically should be, so I think it's safe to say that by the time this is ready for standardization it will not be a mess of macros that beats the language into submission. It's just a lot easier to get people to try out your prototype when it doesn't require first building the compiler from some branch.
That seems right, in the paper he is pretty open to the idea of extending the language if the solution is general enough:
"In the future, we would like to implement an actual language extension that will be capable of working with open patterns. Given such an extension and its implementation, we would like to look into how code for such patterns can be optimized without hardcoding the knowledge of the semantics of the patterns into the compiler. We would also like to experiment with other kinds of patterns (including those defined by the user), look at the interaction of patterns with both the standard library and other facilities in the language, and make views less ad-hoc."
The authors themselves state in their talk (linked halfway down the readme) that it's not the type of syntax anyone wants to see in the language - rather that this is an experimental library showing what is possible, using a very quirky syntax. They're more concerned with making it work and with good performance.
Although admittedly that's from an older talk, no idea if the current syntax still reflects that sentiment.
From the presentation materials included in the repo, it looks like it's intended to be a library example of what they'd like to be a language extension.
We are looking for contributors to this work, there is still a lot that needs to be done before we can get a language proposal. Of particular interest are more uses of the library in various contexts and applications, so if you are a student working on a language or some tree-rewriting system for a class, research project etc. we would love to help you try the library solution. Also if you are a lock-free guru, we'd love some help with making vtbl-map lock-free. Also, don't hesitate to write me an email with any questions regarding Mach7.
The first commit of the project is from Oct 2011 which is shortly after Rust was made public so it seems unlikely that Rust was an influence. More likely, ML influenced both Rust and Mach7.