summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorPéter Diviánszky <divipp@gmail.com>2016-04-07 17:02:26 +0200
committerPéter Diviánszky <divipp@gmail.com>2016-04-07 17:02:26 +0200
commit2e65df79cda76000ae0b69b8b9b85f4abac7f084 (patch)
tree50a662e00214996b0ebe33b87c0d90da77e45f27 /doc
parent1fb8403f0c3f56ddcc61f467e47dd8722ac55293 (diff)
more complete documentation of pattern match compilation
Diffstat (limited to 'doc')
-rw-r--r--doc/guide.pandoc309
1 files changed, 256 insertions, 53 deletions
diff --git a/doc/guide.pandoc b/doc/guide.pandoc
index 4fba4fe0..6b50db80 100644
--- a/doc/guide.pandoc
+++ b/doc/guide.pandoc
@@ -292,18 +292,18 @@ Pattern match compilation
292 292
293Main task: 293Main task:
294 294
295- transform pattern matching and guards to case expressions 295- transform pattern matching and guards to simple case expressions
296 296
297Other tasks: 297Other tasks:
298 298
299- exhaustiveness check (is pattern matching total?) 299- exhaustiveness check (is pattern matching total?)
300- reachability check (are there superfluous expressions involved?) 300- reachability check (are there superfluous expressions involved?)
301- pattern guards desugaring 301- [pattern guards](https://wiki.haskell.org/Pattern_guard) desugaring
302- view patterns desugaring 302- [view patterns](https://ocharles.org.uk/blog/posts/2014-12-02-view-patterns.html) desugaring
303 303
304Design decisions: 304Design decisions:
305 305
306- The same algorithm is used with different parameters to compile all syntactic structures which may use pattern matching: 306- The same algorithm is used to compile all language structures which contain pattern matching:
307 - function alternatives 307 - function alternatives
308 - case expressions 308 - case expressions
309 - lambda expressions 309 - lambda expressions
@@ -311,11 +311,14 @@ Design decisions:
311 - open type family instances 311 - open type family instances
312 - type class instances, type class instance heads (look at type class desugaring) 312 - type class instances, type class instance heads (look at type class desugaring)
313- A new data structure, *pattern guard trees* are used as intermediate representation. 313- A new data structure, *pattern guard trees* are used as intermediate representation.
314 Pattern match compilation is done in two steps: 314 Pattern match compilation is done in three steps:
315 1. all syntax sugars (pattern matching, guards, view patterns, ...) are transformed to pattern guard trees 315
316 2. pattern guard trees are optimized and compiled to case expressions 316 1. language structures are transformed into pattern guard trees
317- Exhaustiveness and reachability checks are done also in step 2. 317 1. pattern guard trees are normalized
318 This approach is less error-prone then doing them in separate phase. 318 1. normalized pattern guard trees are compiled to case expressions
319
320- Exhaustiveness and reachability checks are done also in step 3.
321 This approach is less error-prone and more cheap than doing them in a separate phase.
319 322
320 323
321### Pattern guard trees 324### Pattern guard trees
@@ -325,21 +328,19 @@ Pattern guard trees is an intermediate representation to which all pattern match
325Syntax of pattern guards trees (used only in this documentation): 328Syntax of pattern guards trees (used only in this documentation):
326 329
327----------- -- ---------------------------------------------------------- ----------------------- 330----------- -- ---------------------------------------------------------- -----------------------
328*guardtree* := *expression* **rhs** 331*guardtree* := *expression* **success**
329 | *guardtree* `|` *guardtree* **alternative patterns** 332 | `MatchFailure` **failure**
330 | `MatchFailure` 333 | *guardtree* `|` *guardtree* **alternatives**
331 | *pattern* `<-` *expression* `,` *guardtree* **pattern guard** 334 | *pattern* `<-` *expression* `,` *guardtree* **pattern guard**
332 | `let` *bindings* `in` *guardtree* **let binding** 335 | `let` *bindings* `in` *guardtree* **let binding**
333----------- -- ---------------------------------------------------------- ----------------------- 336----------- -- ---------------------------------------------------------- -----------------------
334 337
335Remarks: 338Pattern guard trees are generalizations of [pattern guards](https://wiki.haskell.org/Pattern_guard):
336 339
337- Pattern guard trees are generalizations of [pattern guards](https://wiki.haskell.org/Pattern_guard): 340- pattern guard trees are expressions (may appear whereever an expression may appear)
338 - pattern guard trees are expressions (may appear whereever an expression may appear) 341- pattern guard trees may have alternatives not only on the top level
339 - pattern guard trees may have alternatives not only on the top level 342 For example, in `[] <- x, (Nothing <- y, ... | True <- z, ...)` both alternatives will match only if `x` is an empty list.
340 For example, in `[] <- x, (Nothing <- y, ... | True <- z, ...)` both alternatives will match only if `x` is an empty list. 343- pattern guard trees may have let bindings not only on the top level
341 - pattern guard trees may have let bindings not only on the top level
342- Pattern guard trees form a monoid with alternative patterns and `MatchFailure`.
343 344
344Examples: 345Examples:
345 346
@@ -349,22 +350,6 @@ Examples:
349 not x = (True <- x, False) | (False <- x, True) 350 not x = (True <- x, False) | (False <- x, True)
350 ~~~~~ 351 ~~~~~
351 352
3522. Definition of `zip`:
353
354 ~~~~~ {.haskell}
355 zip [] _ = []
356 zip x [] = []
357 zip (a:as) (b:bs) = (a,b) : zip as bs
358 ~~~~~
359
360 A possible translation to pattern guard trees:
361
362 ~~~~~ {.haskell}
363 zip v0 v1
364 = ([] <- v0, [])
365 | (let x = v0 in [] <- v1, [])
366 | ((:) a as <- v0, (:) b bs <- v1, (a, b): zip as bs)
367 ~~~~~
368 353
369#### Semantics 354#### Semantics
370 355
@@ -393,6 +378,21 @@ data GuardTree a where
393 PatternGuard :: Pattern b -> Expression b -> GuardTree a -> GuardTree a 378 PatternGuard :: Pattern b -> Expression b -> GuardTree a -> GuardTree a
394~~~~~ 379~~~~~
395 380
381#### Basic properties of pattern guard trees
382
383------------------------------------- -------------------------- -------------------------------
384t `|` (t' `|` t'') = (t `|` t') `|` t'' (`|`) is associative
385`MatchFailure` `|` t = t `MatchFailure` is left unit
386t `|` `MatchFailure` = t `MatchFailure` is right unit
387t `|` t = t (`|`) is idempotent
388e `|` t = e success ignores alternative
389C1 p1 ... pn `<-` C2 e1 ... em`,` t = `MatchFailure` constructor match failure
390------------------------------------- -------------------------- -------------------------------
391
392These are not true because of different strictness properties:
393
394- (p `<-` e, `MatchFailure`) = `MatchFailure`
395
396 396
397### Desugaring to pattern guard trees 397### Desugaring to pattern guard trees
398 398
@@ -405,24 +405,26 @@ data GuardTree a where
405 405
406Example: 406Example:
407 407
408Definition of `zip`:
409
408~~~~~ {.haskell} 410~~~~~ {.haskell}
409zip [] _ = [] 411zip [] _ = []
410zip x [] = [] 412zip x [] = []
411zip (a:as) (b:bs) = (a,b) : zip as bs 413zip (a:as) (b:bs) = (a,b) : zip as bs
412~~~~~ 414~~~~~
413 415
414Transformed code: 416Translation to pattern guard trees:
415 417
416~~~~~ {.haskell} 418~~~~~ {.haskell}
417zip v0 v1 419zip v0 v1
418 = ([] <- v0, _ <- v1, []) 420 = ([] <- v0, _ <- v1, [])
419 | (x <- v0, [] <- v1, []) 421 | (x <- v0, [] <- v1, [])
420 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs) 422 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
421~~~~~ 423~~~~~
422 424
423
424#### View patterns 425#### View patterns
425 426
427...
426 428
427### Normalization of pattern guard trees 429### Normalization of pattern guard trees
428 430
@@ -432,22 +434,199 @@ The normal form of pattern guard trees contains only simple constructor patterns
432*conpattern* := *conname* (*variable*)* 434*conpattern* := *conname* (*variable*)*
433------------- -- ---------------------------------------------------------- ----------------------- 435------------- -- ---------------------------------------------------------- -----------------------
434 436
435The normalization process is driven by the shape of patterns: 437The following equations turn all patterns into simple constructor patterns:
436
437 438
438pattern shape input output 439---------------------------- -------------------------------------------------------------- -------------------
439-------------- ---------------------------- -- -------- 440`_` `<-` e`,` t = t wildcard elim
440wildcard `_` `<-` e `,` t ~> t 441v `<-` e`,` t = `let` v = e `in` t variable elim
441variable v `<-` e `,` t ~> `let` v = e `in` t 442v`@`p `<-` e`,` t = `let` v = e `in` p `<-` v`,` t at-pattern elim
442at-pattern v`@`p `<-` e `,` t ~> `let` v = e `in` p `<-` v `,` t 443C p1 ... pn `<-` e`,` t = C w1 ... wn `<-` e`,` p1 `<-` w1`,` ...`,` pn `<-` wn`,` t nested pattern elim
443nested pattern C p1 ... pn `<-` e `,` t ~> C w1 ... wn `<-` e `,` p1 `<-` w1 `,` ... `,` pn `<-` wn `,` t 444(f `->` p) `<-` e`,` t = p `<-` f e`,` t view pattern elim
444view pattern (f `->` p) `<-` e `,` t ~> p `<-` f e `,` t 445---------------------------- -------------------------------------------------------------- -------------------
445-------------- ---------------------------- -- --------
446 446
447 447
448### Compilation of pattern guard trees 448### Compilation of pattern guard trees
449 449
450#### Filtering 450
451Compilation is driven by the following equations:
452
4531. C ... `<-` e`,` t | t'
454 = `seq` e (C ... `<-` e`,` t | t')
455
4561. `seq` e t
457 = `case` e `of` p1 `->` t; ...; pn `->` t
458 if p1, ..., pn is complete
459
4601. `case` e `of` ...; C e1 ... en `->` ... (C v1 ... vn `<-` e`,` t | t') ...; ...
461 = `case` e `of` ...; C e1 ... en `->` ... (`let` v1 = e1; ...; vn = en `in` t) ...; ...
462
4631. `case` e `of` ...; C1 e1 ... en `->` ... (C2 v1 ... vn `<-` e`,` t | t') ...; ...
464 = `case` e `of` ...; C1 e1 ... en `->` ... t' ...; ...
465
466
467
468# Complete example
469
470Source code:
471
472~~~~~ {.haskell}
473zip [] _ = []
474zip x [] = []
475zip (a:as) (b:bs) = (a,b) : zip as bs
476~~~~~
477
478Transformed to pattern guard trees:
479
480~~~~~ {.haskell}
481zip v0 v1
482 = ([] <- v0, _ <- v1, [])
483 | (x <- v0, [] <- v1, [])
484 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
485~~~~~
486
487Simplified pattern guard trees:
488
489~~~~~ {.haskell}
490zip v0 v1
491 = ([] <- v0, [])
492 | (let x = v0 in [] <- v1, [])
493 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
494~~~~~
495
496Apply (1):
497
498~~~~~ {.haskell}
499zip v0 v1 = v0 `seq`
500 ([] <- v0, [])
501 | (let x = v0 in [] <- v1, [])
502 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
503~~~~~
504
505Apply (2) with patterns `[]` and `x:xs`:
506
507~~~~~ {.haskell}
508zip v0 v1 = case v0 of
509 [] ->
510 ([] <- v0, [])
511 | (let x = v0 in [] <- v1, [])
512 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
513 x: xs ->
514 ([] <- v0, [])
515 | (let x = v0 in [] <- v1, [])
516 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
517~~~~~
518
519Apply (3); empty let is eliminated:
520
521~~~~~ {.haskell}
522zip v0 v1 = case v0 of
523 [] -> []
524 x: xs ->
525 ([] <- v0, [])
526 | (let x = v0 in [] <- v1, [])
527 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
528~~~~~
529
530Apply (4):
531
532~~~~~ {.haskell}
533zip v0 v1 = case v0 of
534 [] -> []
535 x: xs ->
536 (let x = v0 in [] <- v1, [])
537 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
538~~~~~
539
540Let floating:
541
542~~~~~ {.haskell}
543zip v0 v1 = case v0 of
544 [] -> []
545 x: xs -> let x = v0 in
546 ([] <- v1, [])
547 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
548~~~~~
549
550Apply (1):
551
552~~~~~ {.haskell}
553zip v0 v1 = case v0 of
554 [] -> []
555 x: xs -> let x = v0 in v1 `seq`
556 ([] <- v1, [])
557 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
558~~~~~
559
560Apply (2) with patterns `[]` and `y: ys`:
561
562~~~~~ {.haskell}
563zip v0 v1 = case v0 of
564 [] -> []
565 x: xs -> let x = v0 in case v1 of
566 [] ->
567 ([] <- v1, [])
568 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
569 y: ys ->
570 ([] <- v1, [])
571 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
572~~~~~
573
574Apply (3):
575
576~~~~~ {.haskell}
577zip v0 v1 = case v0 of
578 [] -> []
579 x: xs -> let x = v0 in case v1 of
580 [] -> []
581 y: ys ->
582 ([] <- v1, [])
583 | (a:as <- v0, b:bs <- v1, (a,b) : zip as bs)
584~~~~~
585
586Apply (4):
587
588~~~~~ {.haskell}
589zip v0 v1 = case v0 of
590 [] -> []
591 x: xs -> let x = v0 in case v1 of
592 [] -> []
593 y: ys ->
594 a:as <- v0, b:bs <- v1, (a,b) : zip as bs
595~~~~~
596
597Apply (3):
598
599~~~~~ {.haskell}
600zip v0 v1 = case v0 of
601 [] -> []
602 x: xs -> let x = v0 in case v1 of
603 [] -> []
604 y: ys -> let a = x; as = xs in b:bs <- v1, (a,b) : zip as bs
605~~~~~
606
607Apply (3):
608
609~~~~~ {.haskell}
610zip v0 v1 = case v0 of
611 [] -> []
612 x: xs -> let x = v0 in case v1 of
613 [] -> []
614 y: ys -> let a = x; as = xs in let b = y; bs = ys in (a,b) : zip as bs
615~~~~~
616
617Simplification:
618
619~~~~~ {.haskell}
620zip v0 v1 = case v0 of
621 [] -> []
622 x: xs -> case v1 of
623 [] -> []
624 y: ys -> (x, y) : zip xs ys
625~~~~~
626
627
628
629#### Concrete algorithm
451 630
452Filtering is a transformation on pattern guard trees, given an expession and a constructor name: 631Filtering is a transformation on pattern guard trees, given an expession and a constructor name:
453 632
@@ -463,12 +642,36 @@ filterTree e a@(Con c es) (PatternGuard (PCon c' vs) e' t)
463 | otherwise = filterTree e a (let $(bind vs es) t) 642 | otherwise = filterTree e a (let $(bind vs es) t)
464~~~~~ 643~~~~~
465 644
645...
646
647
648#### Checks
649
650Exhaustiveness check is just finding `MatchFailure` values in the result.
651
652Reachability check is just finding parts of source code not present in result.
653
654
655#### Extensions
656
657Step (2) is very general:
658
659- For pattern matching just on a single constructor C, the patterns {C v1 ... vn, _} can be used.
660 This can prevent quadratic code blowup in case of ADT with many constructors.
661- Pattern matching on types is possible in a similar way.
662- In case of GADTs, for better warnings, type information can be taken into account to restrict the set of matching constructor patterns,
663 similar to [GADTs Meet Their Match](https://people.cs.kuleuven.be/~george.karachalias/papers/p424-karachalias.pdf), implemented in GHC 8.
664
665### Discussion
466 666
467### TODO 667- Extension with dependent pattern matching seems possible but has to be explored.
668- The algorithm produce fast code, but in some cases the size of produced code may blow up.
669 Compilation driven by more than 4 equations might prevent this.
670- [A prototype was implemented.](https://github.com/lambdacube3d/lambdacube-compiler/blob/master/prototypes/PatCompile.hs)
468 671
469- pattern matching on GADTs 672#### Related work
470- dependent pattern matching
471 673
674Have not seen this approach before; help to compare it with others.
472 675
473 676
474Desugared source code 677Desugared source code