Lisp-like language interpreter in Python

Question

I decided to learn more about how interpreters work and made a simple Scheme interpreter in Python. I'd like to hear some advice on how I could have written this better, or on the mistakes I've made.

Parser.py (I thought about putting the Tokenizer and Atom classes in different files, but they are too small for that. Maybe I should somehow rename the file to get rid of the confusion?):

import Types class Parser: def __init__(self) -> None: pass def parse(self, tokens : list) -> Types.Exp: if len(tokens) == 0: raise SyntaxError('Unexpected EOF.') token = tokens.pop(0) if token == '(': L = [] while tokens[0] != ')': L.append(self.parse(tokens)) tokens.pop(0) # pop off ')' return L elif token == ')': raise SyntaxError('Unexpected syntax.') else: atom = Atom(token) return atom.value class Tokenizer: def tokenize(self, program : str) -> list: return program.replace('(', ' ( ').replace(')', ' ) ').split() class Atom: def __init__(self, value) -> None: self.value = self.__set__value(value) def __set__value(self, value): try: return int(value) except ValueError: try: return float(value) except ValueError: return Types.Symbol(value)

Types.py:

Symbol = str List = list Number = (int, float) Atom = (Symbol, Number) Exp = (Atom, List)

Eval.py (I thought it was not good to implement the Procedure class here and it would be more natural to put it in Types.py, but I could not solve the problem with the two-way import. I need to import Types.py into Eval.py and Eval.py into Types.py while I use eval() in Procedure.__call__):

import Types from Environment import Environment global_env = Environment() def eval(exp : Types.Exp, e = global_env): if isinstance(exp, Types.Symbol): return e.env[exp] elif isinstance(exp, Types.Number): return exp op = exp[0] args = exp[1:] if op == 'quote': return args[0] elif op == "if": (syntax_if, test, conseq, alt) = exp exp = (conseq if eval(test, e) else alt) return eval(exp, e) elif op == "define": (_, symbol, exp) = exp e.env[symbol] = eval(exp, e) elif op == "set!": (symbol, value) = args e.env[symbol] = eval(value, e) elif op == "lambda": (parms, body) = args return Procedure(parms, body, e) else: proc = eval(op, e) args = [eval(arg, e) for arg in args] if proc is None and args is not None: for arg in args: if arg is not None: return arg return proc(*args) class Procedure: def __init__(self, parms, body, env): self.parms, self.body, self.env = parms, body, env def __call__(self, *args): function_env = Environment(self.parms, args, self.env) return eval(self.body, function_env)

Environment.py (I implemented memory using dictionary):

import math import operator as op import Types class Environment: def __init__(self, parms=(), args=(), outer = None): self.env = {} if outer is not None: self.env = outer.env else: self.__set_default_env() self.env.update(zip(parms, args)) self.outer = outer def find(self, key): if key in self.env: return self.env[key] elif self.outer is not None: return self.outer.find(key) return None def update(self, parms=(), args=()): self.env.update(zip(parms, args)) def __set_default_env(self): self.env.update(vars(math)) self.env.update({ '+':op.add, '-':op.sub, '*':op.mul, '/':op.truediv, '>':op.gt, '<':op.lt, '>=':op.ge, '<=':op.le, '=':op.eq, 'abs': abs, 'append': op.add, 'apply': lambda proc, args: proc(*args), 'begin': lambda *x: x[-1], 'car': lambda x: x[0], 'cdr': lambda x: x[1:], 'cons': lambda x,y: [x] + y, 'eq?': op.is_, 'expt': pow, 'equal?': op.eq, 'length': len, 'list': lambda *x: Types.List(x), 'list?': lambda x: isinstance(x, Types.List), 'map': map, 'max': max, 'min': min, 'not': op.not_, 'null?': lambda x: x == [], 'number?': lambda x: isinstance(x, Types.Number), 'print': print, 'procedure?': callable, 'round': round, 'symbol?': lambda x: isinstance(x, Types.Symbol), })

I also made some tests (link on github repository)

eval shadows a Python built-in, you might want to rename the function. — Tamoghna Chowdhury, CommentedMay 1, 2023 at 15:48

J_H · Accepted Answer · 2023-05-01 16:02:22Z

Keeping small classes like Parser and Atom in same module seems fine.

If you're ever in doubt, here's an easy approach to making such a decision.

Write down a single sentence in a """docstring""" at top-of-module, describing what it does, what it's responsible for. Do not write down that the module holds everything and the kitchen sink.

Then examine each contained class, and ask yourself whether it fits within that sentence or whether it more closely resembles a kitchen sink. Keep it or evict it based on that.

You don't have to write a one-sentence docstring for each class. But you may find that it helps the process.

class Parser: def __init__(self) -> None: pass

Delete the empty constructor, please.

Which brings us to a related item. It looks like you really wanted a def parse(): function.

There can be valid reasons for retaining such a method within a class, such as grouping related items. In which case you should decorate with @staticmethod.

But here, it seems a function is indicated. Plus, it should be def tokenize(...):

nit: Your annotated signatures look odd. A quick$ black -S *.py will tidy up extra whitespace.

The current annotation is nice enough, but it would be more helpful if you spelled out tokens: list[str]

 def __set__value(self, value):

No, please don't use __ double underscore name mangling; it's not what you want. Just stick with a single _ underscore prefix to indicate this method is private.

And a single _ underscore after "set", as well.

A black -S *.py pass would improve the readability of this method.

You use the verb "set". But "get" seems to be what this function is doing.

Recommend that you prefer lowercase names for module / file names.

Rather than import Types, prefer from Types import Exp, Number, ...

 op = exp[0] args = exp[1:]

I don't understand those expressions.

The signature annotation told me that exp is an Exp. But now we apparently have List[Exp] ?? Recommend you use mypy to lint this code.

Here is a perfectly nice tuple unpack:

 (syntax_if, test, conseq, alt) = exp

The convention is to use _ to indicate an unused placeholder:

 (_, test, conseq, alt) = exp

Oh, look! That's exactly what we see for "define", good. Though I guess that for both of these a tuple unpack of args would have been more convenient.

 return Procedure(parms, body, e)

I don't understand why we're creating an object there. Would it suffice to return procedure(...) ? Using a def procedure(...): ?

 if proc is None and args is not None: for arg in args: if arg is not None: return arg

Sorry, you kind of lost me there. Not sure what input would exercise this code. Please add a # comment to help out the Gentle Reader. Or extract helper, so you can name the function and maybe adorn it with a docstring.

class Environment: def __init__(self, parms=(), args=(), outer=None):

I guess that's nice enough.

I view the signature as an invitation for mypy to assume a pair of tuple types. But we sometimes pass in list arguments.

Maybe declare parms: Iterable ? Similarly for .update()

Maybe you wanted to turn lists into tuples higher up the call stack? That is, we could reconsider the Public API we choose to offer.

We definitely don't want a mutable default.

 def __set_default_env(self): self.env.update(vars(math))

As mentioned above, prefer a single _ underscore prefix.

And that second line is fascinating, well done! I like that. A lot of power in a single utterance.

Feature request: maybe you'd like to expose a general (import "module_name") verb to users?

I imagine you're pretty far from compliance with R4RS or similar, and that's fine. It would be entertaining to run this against some automated compliance tests and see what the distance is.

This codebase achieves its design goals. It would benefit from minor class / function reorganization.

I would be willing to delegate or accept maintenance tasks on this codebase.

Stack Exchange Network

Lisp-like language interpreter in Python

1 Answer 1

Hot Network Questions

Lisp-like language interpreter in Python

1 Answer 1

Related

Hot Network Questions