Clarification on compile #234

Open
yogi1967 opened this issue Feb 5, 2023 · 22 comments

Comments

@yogi1967

yogi1967 commented Feb 5, 2023

Apologies, this isn't an issue report as such; I'm seeking clarification about Jython 2.7.x.

For reasons outside of my control, I am forced to use single, large script files. I write extensions to an app that controls the environment my code executes in (the master app starts and runs my code through PythonInterpreter).

I cannot do 'normal' imports as my code runs from within a ZIP file. I can get the zip resources and contents via classloader.getResourceAsStream() etc.

Because I have large files, I sometimes hit the:
java.lang.RuntimeException: Module or method too large in xxx.py

and it then says:

Please provide a CPython 2.7 bytecode file (.pyc), e.g. run
python -m py_compile xxx.py

Alternatively, specify a CPython 2.7 command via the python.cpython2 property, e.g.:
jython -Dpython.cpython2=python
or (e.g. for pip) through the environment variable JYTHON_OPTS:


So, question 1: why does the error message say to use python -m py_compile when I am running Jython?

question 2: is it actually possible to create .pyc files for jython, and if so, how do you actually do it?

question 3: what is the difference between .pyc files and xxx$py.class files?

I have now managed to compile with Jython using:
java -cp jython.jar org.python.util.jython -c "import compileall; compileall.compile_file('xx.py')"
... and run the result using imp.load_module(), so this bit is great! But of course the script then ends up within a submodule scope, whereas I really want it within the scope of the original script (hence execfile would be more useful).

question 4: is there a way to execute $py.class files using execfile() via a resource stream (like load_module) rather than from a file?

question 5: I see exec can run Code objects. What is this Code object? Is it the same as a .pyc file and/or a $py.class file? And again, can a stream be executed rather than a file?

Separately, I have seen that execfile() on a script that hits the too-large issue does actually run if a .pyc compiled by python -m py_compile is also present on disk in the same location. So something with pyc works, which all seems very odd!

Sorry for all the questions. I have done much reading and experimenting, and I am glad I have finally got something working, but your clarification(s) would be appreciated.

Thanks

@Stewori
Contributor

Stewori commented Feb 5, 2023

Short on time, so just the essentials:
Jython can read and execute CPython 2.7 bytecode (.pyc) files, but it cannot generate them. $py.class files are Jython's original form of bytecode, i.e. Python compiled to Java bytecode. Support for pyc files is just an additional feature. Jython had frequent issues due to the limited method length in Java's bytecode spec; using the pyc file support is a workaround for that.

@yogi1967
Author

yogi1967 commented Feb 6, 2023

Thanks! Wow, so Jython can read CPython bytecode files and this is fully supported?! A few clarification questions:

  • So is a pyc better than a $py.class file, or vice versa, or are they the same? I’m after the fastest launch time, one that avoids Jython doing extra checks/compiles and that avoids the method limit issue.
  • If you use a pyc, does Jython then compile that further to a $py.class file internally anyway? I.e. what is the final compile destination for all code?
  • Do both pyc and $py.class fix/get around the method length issue?
  • Can imp.load_module execute both pyc and $py.class compiled files?
  • Can exec and execfile execute both pyc and $py.class compiled files?
  • If jython3 ever arrives, will both pyc and $py.class be supported?

Thanks!

@Stewori
Contributor

Stewori commented Feb 6, 2023

I think $py.class is more performant since it is native to Java. pyc files are not compiled further but interpreted by Jython. They result in a different PyCode subclass (I think pyc -> PyBytecode and $py.class -> PyTableCode).
However, for oversized Python functions I once implemented a smarter approach:
A $py.class cannot be compiled in that case because the limitation lies in Java's bytecode format and is therefore inherent.
If a pyc file can be found, or created via a specified CPython 2.7 command, the necessary bytecode is taken from the pyc (only for the functions in question) and embedded into the resulting $py.class via String constants. Due to the base64 encoding, the resulting class files may be unusually large, but I suppose jar compression can compensate for that. That means that if Jython succeeds in creating a $py.class file for a module, the pyc file does not have to be distributed; it only enables the initial creation of the $py.class file.

Regarding imp.load_module, exec and execfile, I do not know right now and would have to look it up or try it myself. Regarding Jython 3, I think it will be pyc first and Java bytecode eventually.
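To make the size remark concrete, here is a minimal sketch in plain Python of why base64 text is larger than the bytecode it carries. It is not Jython's actual embedding code; the source string and the file name big_module.py are hypothetical, chosen only for illustration.

```python
import base64
import marshal

# Any code object will do; this one stands in for an oversized function.
code = compile("total = sum(range(1000))\n", "big_module.py", "exec")

raw = marshal.dumps(code)            # binary bytecode, as stored in a .pyc body
encoded = base64.b64encode(raw)      # text-safe form that fits a string constant

# base64 emits 4 output characters for every 3 input bytes (with padding).
expected_length = 4 * ((len(raw) + 2) // 3)
print(len(encoded) == expected_length)

# Decoding restores the exact bytes, so nothing is lost by the detour.
restored = marshal.loads(base64.b64decode(encoded))
print(restored.co_filename)
```

The roughly one-third growth is inherent to base64, which is why compression of the resulting jar is suggested as the remedy.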

@yogi1967
Author

yogi1967 commented Feb 6, 2023

OMG! So, I now generate a .pyc using:
python -m py_compile x.py
and then generate a class file using:
java -cp jython.jar org.python.util.jython -c "import compileall; compileall.compile_file('x.py')"
... and the .pyc being present allows the Java class compile to work (when before it failed with "method too large") and, as you say, I can just distribute the .class. Excellent!
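The first of these two steps can also be scripted. The following is a minimal sketch, assuming it is run with a CPython 2.7 interpreter so that the resulting .pyc is one Jython 2.7 can read; it also runs on newer CPython versions, but the bytecode would then not be usable by Jython 2.7. The helper make_pyc and the demo file x.py are hypothetical names.

```python
import os
import py_compile
import tempfile


def make_pyc(source_path):
    """Compile source_path with the running CPython, writing <source>c next to it."""
    pyc_path = source_path + "c"          # x.py -> x.pyc, beside the source
    # doraise=True turns compile errors into exceptions instead of stderr output.
    py_compile.compile(source_path, cfile=pyc_path, doraise=True)
    return pyc_path


# Demo on a throwaway file (hypothetical name x.py).
demo_dir = tempfile.mkdtemp()
demo_src = os.path.join(demo_dir, "x.py")
with open(demo_src, "w") as handle:
    handle.write("answer = 6 * 7\n")

demo_pyc = make_pyc(demo_src)
print(os.path.exists(demo_pyc))
```

The second step (compileall.compile_file under Jython) then has to be run with the Jython interpreter itself, as in the java command above.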

I think this needs to be documented somewhere!?

Thanks for your help!

@Stewori
Contributor

Stewori commented Feb 6, 2023

Exactly! Alternatively, you can configure -Dpython.cpython2=python; Jython then runs the python -m py_compile x.py step automatically as needed. That is intended for large projects where many files are affected and it would be a hassle to sort things out manually.

I think this needs to be documented somewhere!?

Only in the news file and by means of the instructions in the error message AFAIK.

@yogi1967
Author

yogi1967 commented Feb 6, 2023

Hmm... Where do you set this:
-Dpython.cpython2=python
Is this on Jython/JVM launch? If so, I cannot use it, as my code runs within a master app that I have no control over. This is an extension.

@yogi1967
Author

yogi1967 commented Feb 6, 2023

.. and just in time. After I got all this plumbed up, today my script hit the method too large issue and my new .pyc and $py.class build saved the day. Thanks!!

@Stewori
Contributor

Stewori commented Feb 6, 2023

Welcome!

Where do you do this

Yes, on the JVM startup command line. If that is not an option, you can set it in the Jython registry file, see
https://www.jython.org/registry.html
Maybe you can also set it programmatically. It should reside in PySystemState.registry (a Properties object), but I am not sure whether that is writable.

@yogi1967
Author

yogi1967 commented Feb 7, 2023

Got it. I see where to put the -D option. Due to my environment setup, I can’t use it, but that doesn’t matter, as the python -m py_compile command works well. Thanks.

Now, the last step: to see whether I can use execfile() with a stream pointing to the $py.class file (instead of the script), rather than just imp.load_module().

@yogi1967
Author

yogi1967 commented Feb 7, 2023

So, a final question in this thread. I would love to be able to run execfile() on the compiled $py.class file, rather than imp.load_module(). Why? Really for namespace reasons, so that the code runs in the same namespace as it would if I could just execfile() the file (as I can for non-compiled scripts). For example, the master application, which I have no control over, sets key variables for my script to read. If I import my script using load_module(), I have to trick these variables into the new module by stuffing them into __builtin__, as globals() is not shared. The same goes for passing info back to the master application.

But as my script's methods are now too large to run uncompiled, they must be compiled. I see that all routes for execfile() eventually lead to Py.runCode(), but it requires a PyCode object, and I have no idea how to get a $py.class file into a PyCode object. TIA!
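The namespace difference described here can be seen in plain Python, independent of how the code object was obtained. This is a minimal sketch; the names host_value, host_namespace and my_script.py are hypothetical, and in Jython the code object would come from a compiled class rather than from compile().

```python
import types

source = "result = host_value + 1\n"

# compile() turns source text into a code object; the second argument is the
# file name recorded inside it.
code = compile(source, "my_script.py", "exec")

# Style 1: run the code object in an existing namespace (what execfile/exec do).
host_namespace = {"host_value": 41}
exec(code, host_namespace)

# Style 2: run it as a separate module, which has its own globals.
module = types.ModuleType("my_script")
module.host_value = 41      # has to be handed over explicitly
exec(code, module.__dict__)

print(host_namespace["result"], module.result)
```

In the first style, variables set by the host are visible to the script and its results land back in the host's namespace; in the second, both directions need explicit hand-over, which is the inconvenience described above.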

@Stewori
Contributor

Stewori commented Feb 7, 2023

how to get a $py.class file into a PyCode object

I think that's done with org.python.core.BytecodeLoader.makeCode. It requires the class file loaded into a Java byte array.

@yogi1967
Author

yogi1967 commented Feb 7, 2023

Thank you! It works. The code to execute a compiled $py.class file from a stream is as follows:

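# Imports assumed by this snippet (IOUtils is presumably Apache Commons IO):
import os
from org.python.core import BytecodeLoader
from org.apache.commons.io import IOUtils

_launchedFile = _THIS_IS_ + _compiledExtn
scriptStream = MD_EXTENSION_LOADER.getResourceAsStream(u"/%s" %(_launchedFile))
code = BytecodeLoader.makeCode(os.path.splitext(_launchedFile)[0], IOUtils.toByteArray(scriptStream), (_THIS_IS_ + _normalExtn))
scriptStream.close()
exec(code)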

Excellent. Many thanks for your help!

@yogi1967 yogi1967 closed this as completed Feb 7, 2023
@jeff5
Member

jeff5 commented Feb 7, 2023

  • If jython3 ever arrives, will both pyc and $py.class be supported?

If it ever does, it will almost certainly interpret CPython byte/word code (.pyc files) from the compatible version of CPython. It already does to a limited extent. And it wouldn't be Jython if it didn't compile to JVM byte code.

@yogi1967
Author

yogi1967 commented Feb 7, 2023

Thanks Jeff!

@yogi1967
Author

yogi1967 commented Sep 8, 2023

Hi @jeff5... A quick followup question if you don't mind... As above, I am using this code to launch a compiled script:

_launchedFile = _THIS_IS_ + _compiledExtn
scriptStream = MD_EXTENSION_LOADER.getResourceAsStream(u"/%s" %(_launchedFile))
code = BytecodeLoader.makeCode(os.path.splitext(_launchedFile)[0], IOUtils.toByteArray(scriptStream), (_THIS_IS_ + _normalExtn))
scriptStream.close()
exec(code)

It works great! Just one small niggle: in the stack trace, the script is referred to as "<iostream>".

If I use PythonInterpreter.execfile(), there is a name argument that lets you pass the name. But I cannot see how to do this when using the exec statement, and I cannot see how to run the code any other way, i.e. there appears to be no way to pass the PyCode object to PythonInterpreter.execfile().

Any ideas? Many thanks

@yogi1967 yogi1967 reopened this Sep 8, 2023
@jeff5
Member

jeff5 commented Sep 12, 2023

Glad that works.

In this call, the file name you give is supposed to make its way to the constructor that returns a Python code object. I don't immediately see where it might be being lost as it is in generated code.

By comparison, in execfile the name is given to the compiler (and not to runCode) so it must be embedded in the class it generates (maybe as __file__). It could be we're only testing for (so only supporting) this case, which seems like a bug ... or not what was intended in the design of makeCode.

Do you mean the Python stack trace?
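For comparison, this is how the file name travels with a code object in plain Python: it is stored on the code object at compile time and the traceback reads it from there. A minimal sketch with a hypothetical file name my_script.py; whether makeCode propagates its file name argument the same way is exactly the open question here.

```python
import traceback

source = "def fail():\n    raise ValueError('boom')\nfail()\n"

# The file name is fixed when the code object is created, not when it is run.
code = compile(source, "my_script.py", "exec")

try:
    exec(code, {})
except ValueError:
    trace_text = traceback.format_exc()

print(code.co_filename)
print("my_script.py" in trace_text)
```

If a trace shows an unexpected name, inspecting co_filename on the code object returned by makeCode would tell whether the name was lost before or during execution.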

@yogi1967
Author

Hi @jeff5 - please ignore this. It seems OK now; I must have done something wrong that day. Sorry to bother you with this! Thanks again. S

@yogi1967 yogi1967 reopened this Sep 26, 2023
@yogi1967
Author

yogi1967 commented Sep 26, 2023

@jeff5 - A further question. I am using the compileall.compile_file method (import compileall; compileall.compile_file). This is all OK and works great. However, I may have hit an issue with the bytecode version. What bytecode version of .class file does this create? Is there a way to specify a bytecode version in the method call, and/or a way to call this using the latest JDK while specifying the target version (as you can with javac)? Thanks

@yogi1967
Author

... in fact, looking at the raw file, it looks like the bytecode version is 50.0 (Java 6), so that's OK. Is this fixed in the compiler no matter what JDK you run? Thanks.
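The version check done by hand here can be scripted. A class file starts with the 4-byte magic number 0xCAFEBABE, followed by the 2-byte minor and 2-byte major version, all big-endian. The sketch below reads those fields; the function name class_file_version is hypothetical, and the sample bytes are synthetic rather than taken from a real $py.class file.

```python
import struct

# Major version -> Java release, for a few common values.
JAVA_RELEASES = {50: "Java 6", 51: "Java 7", 52: "Java 8", 55: "Java 11"}


def class_file_version(data):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


# Synthetic header for demonstration: magic, minor 0, major 50.
sample = struct.pack(">IHH", 0xCAFEBABE, 0, 50)
major, minor = class_file_version(sample)
print("%d.%d (%s)" % (major, minor, JAVA_RELEASES.get(major, "unknown")))
```

To check a real file, pass the bytes read from it in binary mode, for example open(path, "rb").read(8).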

@Stewori
Contributor

Stewori commented Sep 29, 2023

Class files are created using asm. One would have to look into how Jython calls asm and what configuration asm offers. I suspect we are just calling it with default settings, but I didn't look it up. As a side note, the asm dependency could maybe do with an update to support the latest bytecode versions.

@yogi1967
Author

This may be a rabbit hole... Can you point me to the class/method that makes the asm call, and I will look from there? I've tried to find it, but got lost. Sorry, and thanks.

@jeff5
Member

jeff5 commented Oct 28, 2023

(I'm a little lost in the code generator myself.) I accuse:

cw.visit(Opcodes.V1_6, Opcodes.ACC_PUBLIC + Opcodes.ACC_SUPER, this.name, null, this.superclass, interfaces);

I may have thought that this:


would set a minimum JVM of 7 (could be 8 now), but it doesn't.
