diff options
author | Péter Diviánszky <divipp@gmail.com> | 2016-04-04 11:25:33 +0200 |
---|---|---|
committer | Péter Diviánszky <divipp@gmail.com> | 2016-04-04 11:25:33 +0200 |
commit | a14705d13abf237c2e0a202612c3f600fa2a399f (patch) | |
tree | 61a8cff4bb0093f9511d8b4d4206013f356b5d70 /doc | |
parent | 497fd44cafdc76fe240fce848b9457d745a459e7 (diff) |
better explanation of compiler phases
Diffstat (limited to 'doc')
-rw-r--r-- | doc/common.css | 2 | ||||
-rw-r--r-- | doc/guide.pandoc | 192 |
2 files changed, 136 insertions, 58 deletions
diff --git a/doc/common.css b/doc/common.css index ce60b17f..b6f91436 100644 --- a/doc/common.css +++ b/doc/common.css | |||
@@ -74,7 +74,7 @@ input, textarea { | |||
74 | 74 | ||
75 | /* border */ | 75 | /* border */ |
76 | 76 | ||
77 | h1, h2, h3 { | 77 | h1, h2 { |
78 | border-bottom: 1px dotted black; | 78 | border-bottom: 1px dotted black; |
79 | } | 79 | } |
80 | 80 | ||
diff --git a/doc/guide.pandoc b/doc/guide.pandoc index 39d4a85f..fb872712 100644 --- a/doc/guide.pandoc +++ b/doc/guide.pandoc | |||
@@ -1,13 +1,12 @@ | |||
1 | % LambdaCube 3D compiler developer's documentation and ideas | 1 | % LambdaCube 3D compiler developer's documentation and ideas |
2 | 2 | ||
3 | Compiler structure | 3 | Compiler structure overview |
4 | ================== | 4 | ================== |
5 | 5 | ||
6 | Main tasks of the compiler: | 6 | Main tasks of the compiler: |
7 | 7 | ||
8 | - find errors in source code | 8 | - find errors in source code |
9 | - semantic errors are ruled out by strong static typing | 9 | - code completion to avoid superfluous parts of source code |
10 | - code completion to avoid inferrable parts of code | ||
11 | - eliminate abstractions (normalization, reduction) | 10 | - eliminate abstractions (normalization, reduction) |
12 | 11 | ||
13 | Additional tasks of the compiler: | 12 | Additional tasks of the compiler: |
@@ -16,68 +15,146 @@ Additional tasks of the compiler: | |||
16 | - modules support | 15 | - modules support |
17 | - support livecoding | 16 | - support livecoding |
18 | - highlight errors | 17 | - highlight errors |
19 | - inferred types in tooltips | 18 | - show inferred types in tooltips |
20 | - completion | 19 | - code completion during editing |
21 | - additional feedback for users | 20 | - additional feedback for users |
22 | - show desugared source (on a separate tab) | 21 | - show desugared source |
23 | - show optimizations (how?) | 22 | - show optimizations |
24 | 23 | ||
25 | Design decisions | 24 | Main design decisions |
25 | |||
26 | - Semantics errors are ruled out by strong static typing. | ||
27 | - Use dependent types for more expressive power | ||
28 | - Use Haskell syntax & semantics with extensions | ||
29 | - Code generation is not part of the compiler; it is done during the reduction of the main expression. | ||
30 | This helps to keep the compiler smaller and separate domain specific code. | ||
31 | - The compiler has just two phases: | ||
32 | 1. lexing + parsing + scope checking + desugaring | ||
33 | 2. type checking + type inference + normalization | ||
34 | |||
35 | |||
36 | Example: Compilation and reduction of a simple value | ||
37 | ---------------------------------------------------- | ||
38 | |||
39 | | | ||
40 | | source code | ||
41 | | (e.g. description of the 100th prime number) | ||
42 | v | ||
43 | lexing + parsing + scope checking + desugaring | ||
44 | | | | ||
45 | | desugared source code \--> halt with error message | ||
46 | v | ||
47 | inference + normalization -------> halt with error message | ||
48 | || | ||
49 | || infos (e.g. inferred types) | ||
50 | |\-------------------------------> | ||
51 | | | ||
52 | | normalized value | ||
53 | | (e.g. 547, the 100th prime number) | ||
54 | | | ||
55 | v | ||
56 | |||
57 | |||
58 | Code generation by reduction | ||
59 | ---------------------------- | ||
60 | |||
61 | The basic idea is that the compilation is done is multiple phases. | ||
62 | |||
63 | The code generator is compiled before its invocation. | ||
64 | |||
65 | The interface between the compiler and the code generator is the reflected elaborated code. | ||
66 | |||
67 | Additinional design decisions: | ||
68 | |||
69 | - be able to pattern match on different normal forms of the reflected elaborated code | ||
70 | This eases the description of code generation. | ||
71 | - Ideally no error handling is needed during code generation; all misues should prevented by the type system. | ||
72 | - compilation stages are tracked by the type system | ||
73 | - prevents even more misuses | ||
74 | - help to speed up compilation? | ||
75 | - support different optimization transformations in the libraries | ||
76 | |||
77 | |||
78 | ### Example: Compilation of domain specific code in two phases | ||
79 | |||
80 | Builtin library: | ||
81 | |||
82 | ~~~~~ {.haskell} | ||
83 | -- definition of elaborated expressions | ||
84 | data Expr a = ... | ||
85 | |||
86 | -- primitive function for reflection | ||
87 | reflect :: a -> Expr a -- TODO: encode phase info in type | ||
88 | |||
89 | -- pattern synonyms for matching different normal forms of `Expr` | ||
90 | ... | ||
91 | ~~~~~ | ||
26 | 92 | ||
27 | - Code generation is not part of the compiler; it is done in the libraries. | 93 | DSL compiler code: |
28 | This helps to keep the compiler smaller and separate domain specific code | 94 | |
29 | - reflection support makes this possible | 95 | ~~~~~ {.haskell} |
30 | - be able to pattern match on different normal forms for more compact description of code generation | 96 | -- definition of main type of the DSL |
31 | - a strong type system prevents most misuses, so ideally no error handling is needed during code generation | 97 | -- (this is a custom DSL IO type) |
32 | - compilation stages are tracked by the type system | 98 | data IO a = ... |
33 | - prevents even more misuses | 99 | |
34 | - help to speed up compilation? | 100 | -- definition of target code (e.g. ByteString) |
35 | - support different optimization transformations in the libraries | 101 | data Target = ... |
36 | - the compiler has just two phases | 102 | |
37 | 1. lexing, parsing, scope checking and desugaring is done in the same phase, sharing the same state | 103 | -- DSL compiler, including optimizations |
38 | 2. type checking & inference and normalization is done in the same phase, sharing the same state | 104 | compile :: Expr (IO ()) -> Target |
39 | 105 | ... | |
40 | Example: Compilation of graphics pipeline descriptions in two phases | 106 | ~~~~~ |
41 | -------------------------------------------------------------------- | 107 | |
108 | Application code: | ||
109 | |||
110 | ~~~~~ {.haskell} | ||
111 | -- DSL main entry point | ||
112 | dslMain :: IO () | ||
113 | dslMain = ... | ||
114 | |||
115 | -- compiled main | ||
116 | main :: Target | ||
117 | main = compile (reflect dslMain) | ||
118 | ~~~~~ | ||
119 | |||
120 | Compilation flowchart: | ||
42 | 121 | ||
43 | | | 122 | | |
44 | | source code of code generator for the domain | 123 | | DSL compiler code |
45 | | (e.g. code gen. for graphics pipeline descriptions) | ||
46 | v | 124 | v |
47 | lexer + parser + scope checking + desugaring | 125 | lexing + parsing + scope checking + desugaring |
48 | | | | 126 | & inference + normalization ------> halt with error message (phase #1) |
49 | | desugared source code | | ||
50 | v v | ||
51 | inference + normalization ------> halt with error message (phase #1) | ||
52 | | | ||
53 | | infos for library writers | ||
54 | |--------------------------> | ||
55 | | | 127 | | |
56 | | normalized code | 128 | | `compile` ready to be applied |
57 | | (code generator) | ||
58 | | | 129 | | |
59 | /---/ | 130 | /---/ |
60 | | | 131 | | |
61 | | | | 132 | | | |
62 | | | domain language source code | 133 | | | application code |
63 | | | (e.g. main with type 'graphics pipeline description') | ||
64 | | v | 134 | | v |
65 | | lexer + parser + scope checking + desugaring | 135 | | lexing + parsing + scope checking + desugaring |
66 | | | | | 136 | | & inference + normalization ------> halt with error message (phase #2) |
67 | | | desugared source code | | 137 | | | |
68 | v v v | 138 | | | normalized `dslMain` |
69 | inference + normalization ------> halt with error message (phase #2) | 139 | | v |
70 | (normalization includes reflection of domain lang. code) | 140 | | reflection |
71 | | | 141 | | | |
72 | | infos for live coding | 142 | | | normalized `reflect dslMain` |
73 | |--------------------------> | 143 | | | (value of type `Expr (IO ())`) |
74 | | | 144 | v v |
75 | | normalized code | 145 | function application & normalization |
76 | | (e.g. graphics pipeline description) | 146 | | |
77 | v | 147 | | normalized `compile (reflect dslMain)` |
148 | | (target code) | ||
149 | v | ||
150 | |||
151 | Remarks: | ||
152 | |||
153 | - only phase #2 should be run again if the application code changes | ||
154 | - application of `compile` can be faster if `compile` is compiled to machine code in an additional phase between phase #1 and phase #2 | ||
78 | 155 | ||
79 | 156 | ||
80 | Usecase: Live coding system overview | 157 | Use case: Live coding system overview |
81 | ---------------------------------- | 158 | ---------------------------------- |
82 | 159 | ||
83 | 160 | ||
@@ -100,10 +177,11 @@ Modules support | |||
100 | 177 | ||
101 | Design decisions | 178 | Design decisions |
102 | 179 | ||
103 | - module header and body are parsed in two phases | ||
104 | - module status and availabe information are cached in a map | ||
105 | - modules can be loaded by filename or by module name | 180 | - modules can be loaded by filename or by module name |
106 | - builtin modules are always accessible by the last default path element | 181 | - module header and body are parsed in two phases |
182 | This makes possible to process the import list even if there is a syntax error in the body. | ||
183 | - modules' status and availabe information are cached in a map | ||
184 | - builtin modules are always accessible by the last path element which is added by default | ||
107 | 185 | ||
108 | The module header contains | 186 | The module header contains |
109 | 187 | ||
@@ -113,9 +191,9 @@ The module header contains | |||
113 | 191 | ||
114 | The driver state contains | 192 | The driver state contains |
115 | 193 | ||
116 | - the paths (in a ReaderT monad transformer) | 194 | - path list (in a ReaderT monad transformer) |
117 | - map of cached modules: status + available information (in a StateT monad transformer) | 195 | - map of cached modules: status + available information (in a StateT monad transformer) |
118 | - fatal error (in an ExceptT monad transformer) | 196 | - fatal error message (in an ExceptT monad transformer) |
119 | 197 | ||
120 | Possible module status | 198 | Possible module status |
121 | 199 | ||
@@ -144,9 +222,9 @@ Tasks | |||
144 | 222 | ||
145 | Design decisions | 223 | Design decisions |
146 | 224 | ||
147 | - use the `megaparsec` parser combinator library (a better parsec) | ||
148 | - Lexing and parsing is done in the same phase, sharing the same state | 225 | - Lexing and parsing is done in the same phase, sharing the same state |
149 | how it works: Whitespace is consumed immediately after each "lexeme". | 226 | how it works: Whitespace is consumed immediately after each "lexeme". |
227 | - use the `megaparsec` parser combinator library (a better parsec) | ||
150 | - do custom indentation sensitive parsing | 228 | - do custom indentation sensitive parsing |
151 | - use custom lexer for literals | 229 | - use custom lexer for literals |
152 | 230 | ||