Tomek edited this page May 10, 2024 · 9 revisions

Frequently Asked Questions about pycparser.

What is an AST?

AST - Abstract Syntax Tree. It is a tree representation of the syntax of source code - a convenient hierarchical data structure that's built from the code and is readily suitable for exploration and manipulation.
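As a small illustration, the sketch below parses a one-line declaration and returns pycparser's textual dump of the resulting tree. The helper name dump_ast is made up for this example; CParser.parse and the show method of AST nodes are pycparser calls. The import sits inside the function only so that the snippet loads even where pycparser is not installed.

```python
import io


def dump_ast(source):
    """Parse C source text and return pycparser's textual dump of its AST."""
    # Imported lazily so this module loads even without pycparser installed.
    from pycparser import c_parser

    ast = c_parser.CParser().parse(source, filename="<string>")
    buf = io.StringIO()
    ast.show(buf=buf)  # writes one node per line, children indented
    return buf.getvalue()
```

Calling dump_ast("int answer = 40 + 2;") should list a FileAST root holding a Decl whose initializer is a BinaryOp with two Constant children, which is the hierarchy the definition above describes.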

Why don't AST nodes in pycparser have parent links?

Adding parent links / pointers to each and every node in pycparser is a lot of work, and would complicate the parser code considerably (as well as make it consume more memory). I see no real benefit in doing so, since parent links are only occasionally useful when walking the AST.

Besides, parent links can be easily emulated. One simple way is to override the generic_visit method of NodeVisitor to keep a parent attribute when visiting a node's children:

def generic_visit(self, node):
    """ Called if no explicit visitor function exists for a
        node. Implements preorder visiting of the node.
    """
    oldparent = self.current_parent
    self.current_parent = node
    for c in node:
        self.visit(c)
    self.current_parent = oldparent

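Now visitor methods can access self.current_parent to get to the visited node's parent, provided the visitor initializes current_parent to None (for example in its __init__) before the traversal starts. It just costs us 3 lines of code - not much effort!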

Python is a highly dynamic language, and much more interesting additions can be made with very small code changes - one example would be keeping a full stack of parents when traversing down into children. Don't forget that the whole NodeVisitor class is only a handful of lines of code in 2 methods.
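The full-stack variant mentioned above could be sketched as a mixin. ParentStackMixin, calls_by_function and CallFinder are hypothetical names invented for this example, not part of pycparser; the sketch only relies on NodeVisitor dispatching to visit_<NodeType> methods and falling back to generic_visit. The pycparser import is kept inside the function so the mixin has no hard dependency.

```python
class ParentStackMixin:
    """Mix into a pycparser NodeVisitor subclass to track all ancestors.

    While a node is being visited, self.parent_stack lists its ancestors,
    outermost first, so self.parent_stack[-1] is the direct parent.
    """

    def __init__(self):
        super().__init__()
        self.parent_stack = []

    def generic_visit(self, node):
        self.parent_stack.append(node)
        try:
            # Defer to the base visitor for the actual child traversal.
            super().generic_visit(node)
        finally:
            self.parent_stack.pop()


def calls_by_function(source):
    """Return (callee name, enclosing function name) pairs for C source."""
    # Imported lazily; requires pycparser to be installed.
    from pycparser import c_ast, c_parser

    class CallFinder(ParentStackMixin, c_ast.NodeVisitor):
        def __init__(self):
            super().__init__()
            self.found = []

        def visit_FuncCall(self, node):
            # Every FuncDef on the stack encloses this call; take the innermost.
            owners = [p.decl.name for p in self.parent_stack
                      if isinstance(p, c_ast.FuncDef)]
            # The callee is usually an ID node; anything else is reported as None.
            callee = getattr(node.name, "name", None)
            self.found.append((callee, owners[-1] if owners else None))
            self.generic_visit(node)  # keep descending into the arguments

    finder = CallFinder()
    finder.visit(c_parser.CParser().parse(source))
    return finder.found
```

Note that a custom visit_<NodeType> method has to call self.generic_visit(node) itself, as visit_FuncCall does here; otherwise the traversal stops at that node and the stack is never extended below it.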

Can I generate C code back from the AST?

Yes! See the c-to-c.py example, which has been distributed with pycparser since version 2.03.
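For text that is already in memory, the same idea can be sketched with pycparser's c_generator module, which c-to-c.py also uses. The function name regenerate_c is hypothetical; CParser.parse and CGenerator.visit are the pycparser calls involved. The import is placed inside the function so the snippet loads even without pycparser installed.

```python
def regenerate_c(source):
    """Parse C source text and emit C code again from the resulting AST."""
    # Imported lazily; requires pycparser to be installed.
    from pycparser import c_generator, c_parser

    ast = c_parser.CParser().parse(source, filename="<string>")
    # The generator is itself a visitor that returns the code as a string.
    return c_generator.CGenerator().visit(ast)
```

The output is regenerated from the tree rather than copied from the input, so the original layout and spacing are not preserved.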

Does pycparser support GNU/Visual C++ extensions?

Mostly no. pycparser only knows how to parse ISO C99, and doesn't support compiler-specific extensions. This is a deliberate decision, which favors the simplicity and maintainability of the parser over additional features.

Dedicated users should find it easy to extend pycparser to support any specific extension they're interested in. Actually, such projects already exist. See for example pycparserext by Andreas Klöckner, which extends pycparser to support GNU/GCC extensions.

What do I do about __attribute__?

__attribute__ is a compiler extension, so it's not really supported by pycparser. However, unless you really need to parse the attribute itself, you can just pre-process it away by defining:

#define __attribute__(x)

And running the pre-processor before feeding the code to pycparser.

What do I do about __extension__?

Similarly to __attribute__, __extension__ is a compiler extension and isn't supported by pycparser. However, it's also easily ignored by defining:

#define __extension__

And running the pre-processor before feeding the code to pycparser. Note that this doesn't mean pycparser will parse actual GCC extensions (different syntaxes that are not part of the C99 standard but are supported by GCC).

What do I do about __asm__?

The __asm__ keyword might look more complex, since it can contain multiple lines of assembly, interleaved with variable names. But again, it ends up being a simple define:

#define __asm__(...)
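The three stub definitions from the answers above can also be passed on the pre-processor's command line instead of being written into a header. The sketch below assumes a GCC-compatible cpp on the PATH; GNU_STUBS, cpp_define_args and parse_with_stubs are hypothetical names, while parse_file and its use_cpp, cpp_path and cpp_args parameters come from pycparser. Whether your cpp accepts the variadic __asm__(...) form via -D is an assumption worth checking against your toolchain.

```python
# (macro signature, replacement text) pairs taken from the answers above.
GNU_STUBS = [
    ("__attribute__(x)", ""),
    ("__extension__", ""),
    ("__asm__(...)", ""),
]


def cpp_define_args(stubs=GNU_STUBS):
    """Turn (signature, replacement) pairs into -D options for cpp."""
    # The '=' matters: "-DNAME" alone defines NAME as 1, not as empty text.
    return [f"-D{signature}={body}" for signature, body in stubs]


def parse_with_stubs(filename, cpp_path="cpp"):
    """Pre-process filename with the stubs defined, then parse it."""
    # Imported lazily; requires pycparser and a working cpp executable.
    from pycparser import parse_file

    return parse_file(
        filename,
        use_cpp=True,
        cpp_path=cpp_path,
        cpp_args=cpp_define_args(),  # a list, so no shell quoting is needed
    )
```

Here cpp_define_args() evaluates to ["-D__attribute__(x)=", "-D__extension__=", "-D__asm__(...)="]. As with the #define versions, this only makes the keywords disappear; it does not make pycparser understand other GCC-specific syntax.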

What about parsing C++?

I have no intention of expanding pycparser to support C++. If you need to parse C++ with Python, I suggest using Clang with its Python bindings. See this article for more details.