|
| 09 Jan 2016 09:06 PM |
im working on the parser for my language
basically, im trying to convert an arbitrary expression into an AST
e.g.
5 * 10 + 20 becomes:
+ / \ 20 * / \ 5 10
i've looked into the shunting yard algorithm. i know i can use it to EVALUATE an expression on the fly, but im not sure how to use it to create an AST
any suggestions? |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:08 PM |
| I did this once with Lua and string manipulation. I ran into problems with brackets though. |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:09 PM |
yeah, i could see myself running into a problem there
i think implementing parenthesis will be easy (simple recursive call?) |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:10 PM |
| and im actually writing the compiler/lexer/parser in Lua, just the VM is written in C. It's really easy to generate bytecode with Lua and hand it to C using the C API |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:11 PM |
shunting yard algorithm
shunting yard algorithm
shunting yard algorithm shunting yard algorithm
shunting yard algorithm shunting yard algorithmshunting yard algorithmshunting yard algorithmshunting yard algorithm shunting yard algorithm
|
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:11 PM |
| "i've looked into the shunting yard algorithm. i know i can use it to EVALUATE an expression on the fly, but im not sure how to use it to create an AST" |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:12 PM |
| why exactly do you need an AST over shunting yard? |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:13 PM |
because this is the parser. the expression will contain identifiers that have an unknown value at compile time. eventually, that AST that i gave as an example earlier will need to turn into
ICONST 5 ICONST 10 IMUL ICOSNT 20 IADD |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:14 PM |
'i think implementing parenthesis will be easy (simple recursive call?)'
I thought the same, but when I did gsub, it was capturing pairs of brackets that weren't what I was looking for, I can't remember the exact problem, but it malformed the equation.
I bet I could recreate the code I had minus the brackets though. Could help you. |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:14 PM |
| ^ and of course that could be optimized, but it's just an example |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:15 PM |
| @war that'd be great if you could |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:15 PM |
| Really all I had was functions for * / ^ % + - and just gsub'd the equation using BEDMAS, and replaced it with what they evaluated to. It wasn't difficult. The brackets was impossible though. |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:15 PM |
you need to look into how the shunting yard algorythm works
it converts it into a format that's easily readable by your expression evaluator where an operator is preceeded by it's two operands on the stack, ie
3 2 +
evaluates to 5. it takes away all the heavy lifting for your evaluator, there's no need to know what the values are at compile time |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:18 PM |
ohh ok you're right, that makes more sense
can you give a quick example of how i'd interpret the output? |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:22 PM |
@war i'd need to convert something like
array[5 * 10 + 20]
into
ICONST 5 ICONST 10 IMUL ICONST 20 IADD LOAD < offset to a pointer to array that resides on the stack > IADD // offset from array pointer
:( |
|
|
| Report Abuse |
|
|
morash
|
  |
| Joined: 22 May 2010 |
| Total Posts: 5834 |
|
|
| 09 Jan 2016 09:23 PM |
'i think implementing parenthesis will be easy (simple recursive call?)'
Do it and watch efficiency die. RPN removes the need for parenthesis, meaning you don't have to use recursion at all. |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:23 PM |
the shunting yard algorythm converts an expression (including brackets, ignore warspy) into RPN (reverse polish notation)
i could post a code example, but my code is in C++ and kind of specialised
basically: *) loop through the tokens given from shutning yard *) if the token is an identifier or something that should be replaced by a value, lookup the value and push it into a stack *) else if it's an operator, pop the operator off and look at the two operands before it. push to the yard stack the evaluated value of this operation (be careful, power is right associative!)
after this process, if the yard stack is only 1 value deep, that's your result. otherwise there's an error.
|
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:29 PM |
also, if you're wondering how to handle something like this:
a + b[1*3]
you can check the amount of tokens for this expression. you know that 1*3 is more than a simple lookup, so just send it recursively back through your shunting yard / expression solving process!
|
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:32 PM |
alright, so in my AST, could I just do something like:
if (x * y < 10) {
}
converts to
IF / \ cond block | x y < 10 * |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:33 PM |
| and im just now reading your a + b[i] reply, thanks for the help |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:35 PM |
shunting yard is for finding the result of the expression.
it's more like this:
if(x*3^4 > 1) { }
IF COND -> Logical > --> EXPR x*3^4 --> EXPR x
which means you can solve the expressions with the shunting yard algorythm, no need for any weird stuff going on
use the shunting yard and RPN solving in the same step |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:36 PM |
the second EXPR should be 1, not x
but whatever lmao |
|
|
| Report Abuse |
|
|
| |
|
|
| 09 Jan 2016 09:41 PM |
wait but wouldnt i still need to use shunting yard for the logical statement parsing
e.g.
if (x*x > 10 && x*y < 50)
becomes
IF_STATEMENT > Condition >> AND >>> GREATER >>>> x*x >>>> 10 >>> LESS >>>> x*y >>>> 50 |
|
|
| Report Abuse |
|
|
|
| 09 Jan 2016 09:46 PM |
yeah. for each expression you have (x*x, 10, x*y, 50) just throw them into your function that runs shunting yard and RPN parsing when you're evaluating the condition block
(10 and 50 can be evaluated to just be themselves, you can have a clause for that at the beginning of your shunting yard algorithm to let some things cheat the system) |
|
|
| Report Abuse |
|
|