| Started by | kramer65 <kramerh@gmail.com> |
|---|---|
| First post | 2013-02-28 12:25 -0800 |
| Last post | 2013-03-05 01:35 +0000 |
| Articles | 20 on this page of 26 — 16 participants |
Why is it impossible to create a compiler than can compile Python to machinecode like C? kramer65 <kramerh@gmail.com> - 2013-02-28 12:25 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Matty Sarro <msarro@gmail.com> - 2013-02-28 15:50 -0500
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-01 02:55 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Stefan Behnel <stefan_ml@behnel.de> - 2013-02-28 22:03 +0100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-01 03:47 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? alex23 <wuwei23@gmail.com> - 2013-02-28 20:31 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Stefan Behnel <stefan_ml@behnel.de> - 2013-03-01 08:48 +0100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-02 01:49 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Chris Angelico <rosuav@gmail.com> - 2013-03-01 08:10 +1100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Stefan Behnel <stefan_ml@behnel.de> - 2013-02-28 22:17 +0100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Dave Angel <davea@davea.name> - 2013-02-28 16:18 -0500
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Modulok <modulok@gmail.com> - 2013-02-28 14:19 -0700
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Jonas Geiregat <jonas@geiregat.org> - 2013-02-28 22:33 +0100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Nobody <nobody@nowhere.com> - 2013-02-28 22:01 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Terry Reedy <tjreedy@udel.edu> - 2013-02-28 17:06 -0500
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-02-28 21:09 -0500
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-01 04:27 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? alex23 <wuwei23@gmail.com> - 2013-02-28 20:38 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? 88888 Dihedral <dihedral88888@googlemail.com> - 2013-02-28 22:21 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Grant Edwards <invalid@invalid.invalid> - 2013-03-04 16:36 +0000
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? CM <cmpython@gmail.com> - 2013-03-04 14:55 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? 88888 Dihedral <dihedral88888@googlemail.com> - 2013-03-04 15:12 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Terry Reedy <tjreedy@udel.edu> - 2013-03-04 19:31 -0500
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Chris Angelico <rosuav@gmail.com> - 2013-03-05 11:33 +1100
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Benjamin Kaplan <benjamin.kaplan@case.edu> - 2013-03-04 16:27 -0800
Re: Why is it impossible to create a compiler than can compile Python to machinecode like C? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-05 01:35 +0000
| From | kramer65 <kramerh@gmail.com> |
|---|---|
| Date | 2013-02-28 12:25 -0800 |
| Subject | Why is it impossible to create a compiler than can compile Python to machinecode like C? |
| Message-ID | <b428fdef-a577-45c0-b37c-60bde74e3ae1@googlegroups.com> |
Hello,

I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?

My reasoning is as follows:
When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.

Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?

I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..

Any insights on this would be highly appreciated!
| From | Matty Sarro <msarro@gmail.com> |
|---|---|
| Date | 2013-02-28 15:50 -0500 |
| Message-ID | <mailman.2675.1362084610.2939.python-list@python.org> |
| In reply to | #40170 |
Python is an interpreted language, not a compiled language. This is actually a good thing! What it means is that there is a "scripting engine" (we just call it the interpreter) that actually executes everything for you. That means that any operating system that has an interpreter written for it is capable of running the exact same code (there are lots of exceptions to this, but in general it is true). It makes code much more portable. Also, it makes it easy to troubleshoot (compiled programs are a pain in the butt unless you add additional debugging elements to them).

A compiled program on the other hand must be specifically compiled for the destination architecture (so if you're trying to write an OSX executable on windows, you need a compiler capable of doing that). So doing any sort of cross platform development can take significantly longer. Plus then, as I said, debugging will require additional debug tracing elements to be added to the code you write. The benefit though is that compilers can optimize code for you when they compile, and the compiled code will tend to run faster since you're not dealing with an interpreter between you and the machine.

Now, there are places where this line is blurred. For instance perl is an interpreted language, but capable of running EXTREMELY fast. Python is a little slower, but significantly easier to read and write than perl. You also have some weird ones like JAVA which actually have a virtual machine, and "half compile" source code into java "bytecode." This is then executed by the virtual machine.

I guess the ultimate point is that they're all designed for different purposes, and to solve different problems. Python was intended to make fast-to-write, easily understandable, easily portable code which can be executed on any system which has the Python interpreter. It's not really intended for things which require lower level access to hardware. It's what we call a "high level" programming language.

C (your example) was intended for very low level programming, things like operating systems, device drivers, networking stacks, where the speed of a compiled executable and direct access to hardware was a necessity. That's what Dennis Ritchie wrote it for. We call it a "mid level" programming language, or a "low level" programming language depending on who you talk to. I'd have to say mid level because low level would be writing in assembly or playing with a hex editor :)

Different tools for different jobs.

HTH.
-Matty

On Thu, Feb 28, 2013 at 3:25 PM, kramer65 <kramerh@gmail.com> wrote:

> Hello,
>
> I'm using Python for a while now and I love it. There is just one thing I
> cannot understand. There are compilers for languages like C and C++. why is
> it impossible to create a compiler that can compile Python code to
> machinecode?
>
> My reasoning is as follows:
> When GCC compiles a program written in C++, it simply takes that code and
> decides what instructions that would mean for the computer's hardware. What
> does the CPU need to do, what does the memory need to remember, etc. etc.
> If you can create this machinecode from C++, then I would suspect that it
> should also be possible to do this (without a C-step in between) for
> programs written in Python.
>
> Where is my reasoning wrong here? Is that because Python is dynamically
> typed? Does machinecode always need to know whether a variable is an int or
> a float? And if so, can't you build a compiler which creates machinecode
> that can handle both ints and floats in case of doubt? Or is it actually
> possible to do, but so much work that nobody does it?
>
> I googled around, and I *think* it is because of the dynamic typing, but I
> really don't understand why this would be an issue..
>
> Any insights on this would be highly appreciated!
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-03-01 02:55 +0000 |
| Message-ID | <513018b8$0$30001$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #40175 |
On Thu, 28 Feb 2013 15:50:00 -0500, Matty Sarro wrote:

> Python is an interpreted language, not a compiled language.

Actually, *languages* are neither interpreted nor compiled. A language is an abstract description of behaviour and syntax. Whether something is interpreted or compiled or a mixture of both is a matter of the implementation. There are C interpreters and Python compilers.

[...]

> Now, there are places where this line is blurred. For instance perl is
> an interpreted language, but capable of running EXTREMELY fast. Python
> is a little slower, but significantly easier to read and write than
> perl. You also have some weird ones like JAVA which actually have a
> virtual machine, and "half compile" source code into java "bytecode."
> This is then executed by the virtual machine.

Welcome to the 20th century -- nearly all so-called "interpreted" languages do that, including Python. Why do you think Python has a function called "compile", and what do you think the "c" in .pyc files stands for?

The old model that you might have learned in school:

* interpreters read a line of source code, execute it, then read the next line, execute it, then read the next one, and so forth...

* compilers convert the entire source code to machine code, then execute the machine code.

hasn't been generally true since, well, probably forever, but certainly not since the 1980s.

These days, the best definition of "interpreted language" that I have read comes from Roberto Ierusalimschy, one of the creators of Lua:

"...the distinguishing feature of interpreted languages is not that they are not compiled, but that the compiler is part of the language runtime and that, therefore, it is possible (and easy) to execute code generated on the fly."

(Programming in Lua, 2nd edition, page 63.)

In that sense, being an interpreter is a feature, and pure compilers are deficient.

Oh, by the way, while it is true that the original version of Java used a pure virtual machine model, these days many Java compilers are capable of producing machine code.

Just to drive home the lesson that *languages* aren't compiled or interpreted, but *implementations* are, consider these Python implementations with radically different execution styles:

1) CPython, the one you are used to, compiles code to byte-code for a custom-made virtual machine;

2) Jython generates code to run on a Java virtual machine;

3) IronPython does the same for the .Net CLR;

4) PyPy has a JIT compiler that generates machine code at runtime;

5) Pynie compiles to byte-code for the Parrot virtual machine;

6) Nuitka includes a static compiler that compiles to machine code;

7) Berp generates Haskell code, which is then compiled and executed by a Haskell compiler, which may or may not generate machine code;

8) Pyjamas compiles Python to Javascript;

and others.

And even machine code is not actually machine code. Some CPUs have an even lower level of micro-instructions, and an interpreter to translate the so-called "machine code" into micro-instructions before executing them.

--
Steven
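The remark about the `compile` function can be tried directly. What follows is only a minimal sketch, assuming a CPython interpreter; the source string and the `<example>` file name are made up for illustration. It builds source text at run time, compiles it to a code object, and only then executes it, which is the "compiler is part of the language runtime" idea from the Lua quote.

```python
import dis

# Source code generated "on the fly" as a plain string (illustrative only).
source = "answer = 6 * 7"

# compile() turns the source into a code object holding byte-code;
# nothing has been executed at this point.
code = compile(source, "<example>", "exec")

# exec() hands the code object to the virtual machine.
namespace = {}
exec(code, namespace)

# dis prints the byte-code instructions. The exact listing differs
# between Python versions, so no expected output is shown here.
dis.dis(code)
```

After this runs, `namespace["answer"]` holds the computed value, and the disassembly shows the instructions the virtual machine actually executed.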
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-02-28 22:03 +0100 |
| Message-ID | <mailman.2677.1362085404.2939.python-list@python.org> |
| In reply to | #40170 |
kramer65, 28.02.2013 21:25:
> I'm using Python for a while now and I love it. There is just one thing
> I cannot understand. There are compilers for languages like C and C++.
> why is it impossible to create a compiler that can compile Python code
> to machinecode?

All projects that implement such compilers prove that it's quite possible.

The most widely used static Python compiler is Cython, but there are also a couple of experimental compilers that do similar things in more or less useful or usable ways. And there are also a couple of projects that do dynamic runtime compilation, most notably PyPy and Numba.

You may want to take a look at the Python implementations page, specifically the list of Python compilers:

http://wiki.python.org/moin/PythonImplementations#Compilers

> Does machinecode always need to know whether a variable is an int or a
> float?

Not at all. You're mixing different levels of abstraction here.

> And if so, can't you build a compiler which creates machinecode
> that can handle both ints and floats in case of doubt?

Sure. Cython does just that, for example, unless you tell it explicitly to restrict a variable to a specific type. Basically, you get Python semantics by default and C semantics if you want to.

Stefan
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-03-01 03:47 +0000 |
| Message-ID | <513024d5$0$30001$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #40178 |
On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:
> The most widely used static Python compiler is Cython
Cython is not a Python compiler. Cython code will not run in a vanilla
Python implementation. It has different keywords and syntax, e.g.:
    cdef inline int func(double num):
        ...
which gives SyntaxError in a Python compiler.
Cython is an excellent language and a great addition to the Python
ecosystem, but it is incorrect to call it "Python".
--
Steven
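The SyntaxError claim is easy to check with the built-in `compile` function. This is a small sketch, not a statement about how Cython itself treats the line; the helper name `is_valid_python` is invented for the illustration, and it only shows what a plain Python compiler does with the `cdef` declaration.

```python
# The Cython-style declaration quoted above, with a body added so that
# the only possible complaint is about the "cdef" syntax itself.
cython_source = "cdef inline int func(double num):\n    pass\n"
python_source = "def func(num):\n    pass\n"


def is_valid_python(source):
    """Return True if the built-in compiler accepts the source text."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(is_valid_python(cython_source))  # False
print(is_valid_python(python_source))  # True
```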
| From | alex23 <wuwei23@gmail.com> |
|---|---|
| Date | 2013-02-28 20:31 -0800 |
| Message-ID | <b01b97a3-6661-40ad-815e-7a57a32ed79a@o9g2000pbt.googlegroups.com> |
| In reply to | #40218 |
On Mar 1, 1:47 pm, Steven D'Aprano <steve +comp.lang.pyt...@pearwood.info> wrote:
> Cython is not a Python compiler. Cython code will not run in a vanilla
> Python implementation. It has different keywords and syntax, e.g.:
>
> cdef inline int func(double num):
>     ...
>
> which gives SyntaxError in a Python compiler.

Cython has had a "pure Python" mode for several years now that allows you to decorate Python code or augment it with additional files containing the C specific declarations:

http://docs.cython.org/src/tutorial/pure.html

Both of which will be ignored by the regular Python interpreter, allowing you to write Python that is also suitable for Cython without the errors you mention.
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-03-01 08:48 +0100 |
| Message-ID | <mailman.2705.1362124133.2939.python-list@python.org> |
| In reply to | #40218 |
Steven D'Aprano, 01.03.2013 04:47:
> On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:
>
>> The most widely used static Python compiler is Cython
>
> Cython is not a Python compiler. Cython code will not run in a vanilla
> Python implementation. It has different keywords and syntax, e.g.:
>
> cdef inline int func(double num):
>     ...
>
> which gives SyntaxError in a Python compiler.

Including Cython, if you're compiling a ".py" file. The above is only valid syntax in ".pyx" files. Two languages, one compiler. Or three languages, if you want, because Cython supports both Python 2 and Python 3 code in separate compilation modes.

The old model, which you might have learned at school:

* a Python implementation is something that runs Python code

* a Cython implementation is something that does not run Python code

hasn't been generally true since, well, probably forever. Even Cython's predecessor Pyrex was capable of compiling a notable subset of Python code, and Cython has gained support for pretty much all Python language features about two years ago. Quoting the project homepage: "the Cython language is a superset of the Python language".

http://cython.org/

If you don't believe that, just try it yourself. Try to compile some Python 3 code with it, if you find the time. Oh, and pass the "-3" option to the compiler in that case, so that it knows that it should switch to Python 3 syntax/semantics mode. It can't figure that out from the file extension (although you can supply the language level of the file in a header comment tag). And while you're at it, also pass the "-a" option to let it generate an HTML analysis of your code that highlights CPython interaction and thus potential areas for manual optimisation.

The "superset" bit doesn't mean I've stopped fixing bugs from time to time that CPython's regression test suite reveals. If you want to get an idea of Cython's compatibility level, take a look at the test results, there are still about 470 failing tests left out of 26000 in the test suites of Py2.7 and 3.4:

https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests-pyregr/

One reason for a couple of those failures (definitely not all of them) is that Cython rejects some code at compile time that CPython only rejects at runtime. That's because the tests were explicitly written for CPython and assume that the runtime cannot detect some errors before executing the code. So, in a way, being capable of doing static analysis actually prevents Cython from being fully CPython compatible. I do not consider that a bad thing.

And, BTW, we also compile most of Python's benchmark suite by now:

https://sage.math.washington.edu:8091/hudson/view/bench/

The results are definitely not C-ishly fast, usually only some 10-80% improvement or so, e.g. only some 35% in the Django benchmark, but some of the results are quite ok for plain Python code that is not manually optimised for compilation. Remember, there are lots of optimisations that we deliberately do not apply, and static analysis generally cannot detect a lot of dynamic code patterns, runtime determined types, etc. That's clearly PyPy's domain, with its own set of pros and cons.

The idea behind Cython is not that it will magically make your plain Python code incredibly fast. The idea is to make it really, really easy for users to bring their code up to C speed *themselves*, in the exact spots where the code really needs it. And yes, as was already mentioned in this thread, there is a pure Python mode for this that allows you to keep your code in plain Python syntax while optimising it for compilation. The "Cython optimised" benchmarks on the page above do exactly that.

I wrote a half-rant about static Python compilation in a recent blog post. It's in English, and you might actually want to read it. I would say that I can claim to know what I'm talking about.

http://blog.behnel.de/index.php?p=241

Stefan
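The difference between "rejected at compile time" and "rejected at runtime" can be seen from the CPython side with a short sketch. The undefined name used here is just an assumed example of such an error; which errors a particular static compiler reports early is tool-specific and is not something this snippet demonstrates.

```python
# A function body that refers to a name nobody ever defines.
source = "def broken():\n    return undefined_name + 1\n"

# CPython compiles this without complaint: the name is only looked up
# when the function body actually runs.
code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)  # defines broken(); still no error

error = None
try:
    namespace["broken"]()
except NameError as exc:
    error = exc
    print("rejected at run time:", exc)
```

A test written for CPython can therefore rely on the `NameError` appearing only when `broken()` is called, which is the kind of assumption the post describes.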
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-03-02 01:49 +0000 |
| Message-ID | <51315ab4$0$30001$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #40231 |
On Fri, 01 Mar 2013 08:48:34 +0100, Stefan Behnel wrote:

> Steven D'Aprano, 01.03.2013 04:47:
>> On Thu, 28 Feb 2013 22:03:09 +0100, Stefan Behnel wrote:
>>
>>> The most widely used static Python compiler is Cython
>>
>> Cython is not a Python compiler. Cython code will not run in a vanilla
>> Python implementation. It has different keywords and syntax, e.g.:
>>
>> cdef inline int func(double num):
>>     ...
>>
>> which gives SyntaxError in a Python compiler.
>
> Including Cython, if you're compiling a ".py" file. The above is only
> valid syntax in ".pyx" files. Two languages, one compiler. Or three
> languages, if you want, because Cython supports both Python 2 and Python
> 3 code in separate compilation modes.

Ah, that's very interesting, and thank you for the correction. I have reset my thinking about Cython.

--
Steven
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-03-01 08:10 +1100 |
| Message-ID | <mailman.2678.1362085840.2939.python-list@python.org> |
| In reply to | #40170 |
On Fri, Mar 1, 2013 at 7:50 AM, Matty Sarro <msarro@gmail.com> wrote:
> C (your example) was intended for very low level programming, things like
> operating systems, device drivers, networking stacks, where the speed of a
> compiled executable and direct access to hardware was a necessity. That's
> what Dennis Ritchie wrote it for. We call it a "mid level" programming
> language, or a "low level" programming language depending on who you talk
> to. I'd have to say mid level because low level would be writing in assembly
> or playing with a hex editor :)

Assembly is for people who write C compilers.
C is for people who write language interpreters/compilers.
Everyone else uses a high level language.

Not 100% accurate but a reasonable rule of thumb.

ChrisA
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2013-02-28 22:17 +0100 |
| Message-ID | <mailman.2680.1362086259.2939.python-list@python.org> |
| In reply to | #40170 |
Stefan Behnel, 28.02.2013 22:03:
> there are also a couple of projects that do
> dynamic runtime compilation, most notably PyPy and Numba.

Oh, and HotPy, I keep forgetting about that.

> You may want to take a look at the Python implementations page,
> specifically the list of Python compilers:
>
> http://wiki.python.org/moin/PythonImplementations#Compilers

Stefan
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2013-02-28 16:18 -0500 |
| Message-ID | <mailman.2681.1362086311.2939.python-list@python.org> |
| In reply to | #40170 |
On 02/28/2013 03:25 PM, kramer65 wrote:
> Hello,
>
> I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
>
> My reasoning is as follows:
> When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
>
> Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
>
> I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
>
> Any insights on this would be highly appreciated!
>

Sure, python could be compiled into machine code. But what machine? Do you refer to the hardware inside one of the Pentium chips? Sorry, but Intel doesn't expose those instructions to the public. Instead, they wrote a microcode interpreter, and embedded it inside their processor, and the "machine languages" that are documented as the Pentium Instruction sets are what that interpreter handles. Good thing too, as the microcode machine language has changed radically over time, and I'd guess there have been at least a dozen major variants, and a hundred different sets of details.

So if we agree to ignore that interpreter, and consider the externally exposed machine language, we can pick a subset of the various such instruction sets, and make that our target.

Can Python be compiled directly into that instruction set? Sure, it could. But would it be practical to write a compiler that went directly to it, or is it simpler to target C, and use gcc?

Let's look at gcc. When you run it, does it look like it compiles C directly to machine language? Nope. It has 3 phases (last I looked, which was admittedly over 20 years ago). The final phase translates an internal form of program description into a particular "machine language". Even the mighty gcc doesn't do it in one step. Guess what, that means other languages can use the same back end, and a given language can use different back ends for different target machine languages. (Incidentally, Microsoft C compiler does the exact same thing, and a few of my patents involve injecting code between front end and back end)

So now we have three choices. We could target the C language, and use all of gcc, or we could target the intermediate language, and use only the backend of gcc. Unfortunately, that intermediate language isn't portable between compilers, so you'd either have to write totally separate python compilers for each back end, or skip that approach, or abandon total portability.

Well, we could write a Python compiler that targets an "abstract intermediate language," which in turn gets translated into each of the supported compiler's intermediate language. But that gets remarkably close to just targeting C in the first place.

So how hard would it be just to directly target one machine language? Not too bad if you didn't try to do any optimizations, or adapt to the different quirks and missing features of the different implementations of that machine language. But I expect what you got would be neither smaller nor noticeably faster than the present system. Writing simple optimizations that improve some things is easy. Writing great optimizers that are also reliable and correct is incredibly hard. I'd expect that gcc has hundreds of man years of effort in it.

Now, no matter which of these approaches you would take, there are some issues. The tricky part is not being flexible between int and float (and long, which is not part of the Intel machine instruction set), but between an unlimited set of possible meanings for each operation. Just picking on a+b, each class type that a and b might be can provide their own __add__ and/or __radd__ methods. All those have to be searched for, one has to be picked, and the code has to branch there. And that decision, in general, has to be made at runtime, not by the compiler.

So by default, the code ends up being a twisted set of 4way indirections, calls to dict lookups, and finally calling a function that actually does an instruction or two of real work. Guess what, an interpreter can store those details much more succinctly (code size), and can run those choices nearly as quickly. So we're back to CPython.

Could it be improved? Sure, that's why there are multiple projects which try to improve performance of the reference implementation. But each project seems to get to the point where the early promise of dozen-fold improvement dwindles down to a few times as fast, and not for everything.

There are lots of things that can be improved with static analysis (so we're sure of the types of certain things), restricted language (so the developer gives us extra clues). But that work is nothing compared to what it would take to re-implement the equivalent of the back ends of gcc.

Java works roughly the same way as Python, compiling to byte code files, then interpreting them. The interpreter is given the fancy name "virtual machine" because it really is an instruction set, one that could have been interpreted by Intel in their internal microcode. But they have their own history to stay compatible with. Look at the Merced and how it's taken the world by storm (NOT).

But Java is much stricter about its byte code files, so each function is much closer to machine level. Nearly all those Python indirections are eliminated by the compiler (because it's not as dynamic a language), and they do JIT compiling. The latter is why they're quick.

--
DaveA
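The run-time search for `__add__` and `__radd__` described above can be approximated in a few lines of Python. This is a deliberately simplified sketch and not CPython's actual implementation: the function name `binary_add` and the `Money` class are hypothetical, and the rule that gives a right-hand subclass the first try is left out.

```python
def binary_add(a, b):
    """Simplified model of how `a + b` is resolved at run time.

    Omitted on purpose: the rule that a right operand whose type is a
    subclass of the left operand's type is tried first.
    """
    # Step 1: look for __add__ on the type of the left operand.
    add = getattr(type(a), "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result
    # Step 2: fall back to __radd__ on the type of the right operand.
    radd = getattr(type(b), "__radd__", None)
    if radd is not None and type(a) is not type(b):
        result = radd(b, a)
        if result is not NotImplemented:
            return result
    # Step 3: nobody knows how to add these two objects.
    raise TypeError(
        f"unsupported operand types: {type(a).__name__} and {type(b).__name__}"
    )


class Money:
    """Hypothetical user class that can be added to ints and to itself."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        if isinstance(other, Money):
            return Money(self.cents + other.cents)
        if isinstance(other, int):
            return Money(self.cents + other)
        return NotImplemented

    __radd__ = __add__


print(binary_add(1, 2))                 # int.__add__ handles it: 3
print(binary_add(1, 2.5))               # falls back to float.__radd__: 3.5
print(binary_add(5, Money(10)).cents)   # falls back to Money.__radd__: 15
```

None of the three branches can be chosen before the operand types are known, which is why a straightforward compiler would have to emit this whole lookup for every `+` in the program.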
| From | Modulok <modulok@gmail.com> |
|---|---|
| Date | 2013-02-28 14:19 -0700 |
| Message-ID | <mailman.2682.1362086369.2939.python-list@python.org> |
| In reply to | #40170 |
> I'm using Python for a while now and I love it. There is just one thing I > cannot understand. There are compilers for languages like C and C++. why is > it impossible to create a compiler that can compile Python code to > machinecode? Not exactly what you describe, but have you checked out PyPy? http://pypy.org/ -Modulok-
| From | Jonas Geiregat <jonas@geiregat.org> |
|---|---|
| Date | 2013-02-28 22:33 +0100 |
| Message-ID | <mailman.2687.1362087285.2939.python-list@python.org> |
| In reply to | #40170 |
On Thu, Feb 28, 2013 at 12:25:07pm -0800, kramer65 wrote:
> Hello,
>
> I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
>
> My reasoning is as follows:
> When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
>
> Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
>
> I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
>
> Any insights on this would be highly appreciated!
>
Guido actually encourages people to try to build different compilers for
Python. He thinks it might, one day, be possible to have a compiler for
Python.
But this would only be possible if there were some kind of global,
file-based annotation saying that you will not use some of the dynamic
parts of Python. Otherwise it won't be possible to create a compiler for
such a highly dynamic language as Python.
You can view the keynote where he talks about this here:
http://www.youtube.com/watch?v=EBRMq2Ioxsc
Jonas.
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2013-02-28 22:01 +0000 |
| Message-ID | <pan.2013.02.28.22.01.33.176000@nowhere.com> |
| In reply to | #40170 |
On Thu, 28 Feb 2013 12:25:07 -0800, kramer65 wrote:
> I'm using Python for a while now and I love it. There is just one thing
> I cannot understand. There are compilers for languages like C and C++.
> why is it impossible to create a compiler that can compile Python code
> to machinecode?
It's not impossible, it's just pointless.
Because Python is dynamically-typed and late-bound, practically nothing is
fixed at compile time. So a compiled Python program would just be a
sequence of calls to interpreter functions.
> Where is my reasoning wrong here? Is that because Python is dynamically
> typed? Does machinecode always need to know whether a variable is an int
> or a float?
Yes.
> And if so, can't you build a compiler which creates
> machinecode that can handle both ints and floats in case of doubt?
Yes. But it's not just ints and floats. E.g. Python's "+" operator works on
any pair of objects provided that either the left-hand operand has an
__add__ method or the right-hand operand has a __radd__ method.
> Or is it actually possible to do, but so much work that nobody does it?
It's not that it's "so much work" as much as the fact that the resulting
executable wouldn't be any faster than using the interpreter. IOW, it's so
much work for little or no gain.
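As a small illustration of the point about "+" above, here is a minimal sketch in which one and the same source line, `a + b`, ends up doing integer addition, string concatenation, or a call into a user-defined method, depending on what is passed in at run time. The `Money` class and the `total` function are hypothetical names made up for this example only.

```python
class Money:
    """Hypothetical user-defined type, used only for illustration."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Money + Money and Money + int are supported; anything else is not.
        if isinstance(other, Money):
            return Money(self.cents + other.cents)
        if isinstance(other, int):
            return Money(self.cents + other)
        return NotImplemented

    def __radd__(self, other):
        # Reached for "int + Money", after int.__add__ gives up.
        return self.__add__(other)


def total(a, b):
    # The compiler sees only "a + b"; which code runs is decided per call.
    return a + b


print(total(1, 2))                 # integer addition
print(total("py", "thon"))         # string concatenation
print(total(5, Money(10)).cents)   # falls through to Money.__radd__
```

A static compiler looking at `total` alone cannot tell which of these three behaviours to emit, which is why a naive compilation would mostly consist of calls back into the runtime.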
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-02-28 17:06 -0500 |
| Message-ID | <mailman.2691.1362089228.2939.python-list@python.org> |
| In reply to | #40170 |
The subject line is wrong. There are multiple compilers. Someone just
listed some of them today in another post.
On 2/28/2013 3:50 PM, Matty Sarro wrote:
> Python is an interpreted language, not a compiled language.
A language is just a language. Implementations are implementations*.
That aside, I pretty much agree with the rest of the response.
* For instance, C is usually compiled, but I once used a C interpreter on
unix.
--
Terry Jan Reedy
| From | Dennis Lee Bieber <wlfraed@ix.netcom.com> |
|---|---|
| Date | 2013-02-28 21:09 -0500 |
| Message-ID | <mailman.2698.1362103788.2939.python-list@python.org> |
| In reply to | #40170 |
On Thu, 28 Feb 2013 22:33:36 +0100, Jonas Geiregat <jonas@geiregat.org>
declaimed the following in gmane.comp.python.general:
> But this could only be possible if there was some kind of global
> file based annotation saying you will not use some of the dynamic parts
> of python. Else it won't be possible to create a compiler for such a
> highly dynamic language as python.
>
Oh, you could create a compiler -- but it would have to link to a
library that included a Python interpreter to handle the dynamic
operations <G>
--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-03-01 04:27 +0000 |
| Message-ID | <51302e2f$0$30001$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #40170 |
On Thu, 28 Feb 2013 12:25:07 -0800, kramer65 wrote:
> Hello,
>
> I'm using Python for a while now and I love it. There is just one thing
> I cannot understand. There are compilers for languages like C and C++.
> why is it impossible to create a compiler that can compile Python code
> to machinecode?
Your assumption is incorrect. You can compile Python to machine-code, at
least sometimes. It is quite tricky, for various reasons, but it can be
done, at various levels of efficiency.
One of the oldest such projects was Psyco, which was a Just-In-Time
compiler for Python. When Psyco was running, it would detect at run time
that you were doing calculations on (say) standard ints, compile on the
fly a machine-code function to perform those calculations, and execute
it. Psyco has more or less been made obsolete by PyPy, which does the
same thing only even more so.
http://en.wikipedia.org/wiki/Psyco
http://en.wikipedia.org/wiki/PyPy
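To give a rough feel for the "detect the types at run time, then take a fast path" idea, here is a toy sketch. It is only an analogy: a real JIT such as Psyco or PyPy emits actual machine code, whereas this merely checks the operand types (a "guard") and dispatches to a pre-selected function. The names `_fast_paths`, `stats` and `add` are invented for this example and do not come from either project.

```python
import operator

# Hypothetical table of specialised routines, keyed by the exact operand
# types observed at run time. A real JIT would generate machine code here.
_fast_paths = {
    (int, int): int.__add__,
    (float, float): float.__add__,
}

# Counters so we can see which path each call took.
stats = {"fast": 0, "generic": 0}


def add(a, b):
    """Add two values, using a specialised routine when the guard passes."""
    fast = _fast_paths.get((type(a), type(b)))
    if fast is not None:
        # Guard succeeded: both operands have exactly the expected types.
        stats["fast"] += 1
        return fast(a, b)
    # Guard failed: fall back to the fully dynamic operation.
    stats["generic"] += 1
    return operator.add(a, b)


print(add(2, 3), add(1.5, 2.5), add("a", "b"))
print(stats)
```

The fallback branch is the important part: the specialised code is only valid while the assumption about the types holds, so the generic path always has to remain available.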
> My reasoning is as follows:
> When GCC compiles a program written in C++, it simply takes that code
> and decides what instructions that would mean for the computer's
> hardware. What does the CPU need to do, what does the memory need to
> remember, etc. etc. If you can create this machinecode from C++, then I
> would suspect that it should also be possible to do this (without a
> C-step in between) for programs written in Python.
In principle, yes, but in practice it's quite hard, simply because Python
does so much more at runtime than C++ (in general).
Take an expression like:
x = a + b
In C++, the compiler knows what kind of data a and b are, what kind of
data x is supposed to be. They are often low-level machine types like
int32 or similar, which the CPU can add directly (or at least, the
compiler can fake it). Even if the variables are high-level objects, the
compiler can usually make many safe assumptions about what methods will
be called, and can compile instructions something like this pseudo-code:
    10 get the int64 at location 12348         # "a"
    20 get the int64 at location 13872         # "b"
    30 jump to the function at location 93788  # add two int64s
    40 store the result at location 59332      # "x"
which is fast and efficient because most of the hard work is done at
compile time. But it's also quite restrictive, because you can't change
code on the fly, create new types or functions, etc. (Or, where you can,
then you lose some of the advantages of C++ and end up with something
like Python but with worse syntax.)
In Python, you don't know what a and b are until runtime. They could be
ints, or lists, or strings, or anything. The + operator could call a
custom __add__ method, or a __radd__ method, from some arbitrary class.
Because nearly everything is dynamic, the Python compiler cannot safely
make many assumptions about the code at compile time. So you end up with
code like this:
    10 search for the name "a" and take note of it
    20 search for the name "b" and take note of it
    30 decide whether to call a.__add__ or b.__radd__
    40 call the appropriate method
    50 bind the result to the name "x"
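The "decide which method to call" step can be sketched in Python itself. This is a simplified approximation, not the interpreter's actual code: it ignores the rule that gives a subclass's reflected method priority, and it ignores the C-level slots that built-in types use. The function `binary_add` and the class `Metres` are hypothetical names for this example.

```python
def binary_add(a, b):
    """Approximate what happens for ``a + b`` (simplified sketch)."""
    ta, tb = type(a), type(b)

    # Special methods are looked up on the type, not on the instance.
    add = getattr(ta, "__add__", None)
    if add is not None:
        result = add(a, b)
        if result is not NotImplemented:
            return result

    # The reflected method is only tried when the operand types differ.
    if tb is not ta:
        radd = getattr(tb, "__radd__", None)
        if radd is not None:
            result = radd(b, a)
            if result is not NotImplemented:
                return result

    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{ta.__name__!r} and {tb.__name__!r}"
    )


class Metres:
    """Hypothetical type that only knows how to be the right-hand operand."""

    def __init__(self, n):
        self.n = n

    def __radd__(self, other):
        return Metres(other + self.n)


print(binary_add(1, 2.5))            # int.__add__ declines, float.__radd__ runs
print(binary_add([1], [2]))          # list.__add__
print(binary_add(1, Metres(2)).n)    # Metres.__radd__
```

Every one of these lookups happens each time the line executes, because either operand's type, or the methods on that type, may have changed since the previous call.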
You can get an idea of what Python actually does by disassembling the
byte code into pseudo-assembly language:
py> code = compile("x = a + b", '', 'single')
py> from dis import dis
py> dis(code)
  1           0 LOAD_NAME                0 (a)
              3 LOAD_NAME                1 (b)
              6 BINARY_ADD
              7 STORE_NAME               2 (x)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
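The same inspection can be done programmatically. The sketch below, using the standard `dis.get_instructions` function, collects the opcode names instead of printing a listing. Exact opcode names and offsets depend on the interpreter version (on newer CPython releases the addition shows up as BINARY_OP rather than BINARY_ADD), so no particular output is assumed here; the filename `"<example>"` is just a placeholder.

```python
import dis

# Compile the same one-line statement used in the listing above.
code = compile("x = a + b", "<example>", "single")

# Collect (opcode name, argument value) pairs for each bytecode instruction.
ops = [(ins.opname, ins.argval) for ins in dis.get_instructions(code)]

for opname, argval in ops:
    print(opname, argval)
```

Whatever the version, the shape is the same: two name lookups, one generic "add" instruction, one name binding. None of the instructions says anything about ints or floats.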
Nevertheless, PyPy can often speed up Python code significantly,
sometimes to the speed of C or even faster.
http://morepypy.blogspot.com.au/2011/02/pypy-faster-than-c-on-carefully-crafted.html
http://morepypy.blogspot.com.au/2011/08/pypy-is-faster-than-c-again-string.html
--
Steven
| From | alex23 <wuwei23@gmail.com> |
|---|---|
| Date | 2013-02-28 20:38 -0800 |
| Message-ID | <87263f1c-f35f-4a55-8259-b6d71acb64a5@l4g2000pbn.googlegroups.com> |
| In reply to | #40170 |
On Mar 1, 6:25 am, kramer65 <kram...@gmail.com> wrote:
> There are compilers for languages like C and C++. why
> is it impossible to create a compiler that can compile
> Python code to machinecode?
This is a nice site listing a lot of current approaches to that subject:
http://compilers.pydata.org/
| From | 88888 Dihedral <dihedral88888@googlemail.com> |
|---|---|
| Date | 2013-02-28 22:21 -0800 |
| Message-ID | <d825ad2b-c664-4147-b91f-36c40207d50c@googlegroups.com> |
| In reply to | #40170 |
On Friday, March 1, 2013 at 4:25:07 AM UTC+8, kramer65 wrote:
> Hello,
>
> I'm using Python for a while now and I love it. There is just one thing I cannot understand. There are compilers for languages like C and C++. why is it impossible to create a compiler that can compile Python code to machinecode?
>
> My reasoning is as follows:
> When GCC compiles a program written in C++, it simply takes that code and decides what instructions that would mean for the computer's hardware. What does the CPU need to do, what does the memory need to remember, etc. etc. If you can create this machinecode from C++, then I would suspect that it should also be possible to do this (without a C-step in between) for programs written in Python.
>
> Where is my reasoning wrong here? Is that because Python is dynamically typed? Does machinecode always need to know whether a variable is an int or a float? And if so, can't you build a compiler which creates machinecode that can handle both ints and floats in case of doubt? Or is it actually possible to do, but so much work that nobody does it?
>
> I googled around, and I *think* it is because of the dynamic typing, but I really don't understand why this would be an issue..
>
> Any insights on this would be highly appreciated!
I think a smart object can perform some experiments in its lifetime,
sensing and collecting data to improve its methods in the long run.
This will definitely require a dynamic language.
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2013-03-04 16:36 +0000 |
| Message-ID | <kh2iij$msn$1@reader1.panix.com> |
| In reply to | #40170 |
On 2013-02-28, kramer65 <kramerh@gmail.com> wrote:
> I'm using Python for a while now and I love it. There is just one
> thing I cannot understand. There are compilers for languages like C
> and C++. why is it impossible to create a compiler that can compile
> Python code to machinecode?
The main issue is that Python has dynamic typing. The type of object
that is referenced by a particular name can vary, and there's no way
(in general) to know at compile time what the type of object "foo" is.
That makes generating object code to manipulate "foo" very difficult.
> My reasoning is as follows: When GCC compiles a program written in
> C++, it simply takes that code and decides what instructions that
> would mean for the computer's hardware. What does the CPU need to do,
> what does the memory need to remember, etc. etc. If you can create
> this machinecode from C++, then I would suspect that it should also
> be possible to do this (without a C-step in between) for programs
> written in Python.
>
> Where is my reasoning wrong here? Is that because Python is
> dynamically typed?
Yes.
> Does machinecode always need to know whether a
> variable is an int or a float?
Yes. Not only might it be an int or a float, it might be a string, a
list, a dictionary, a network socket, a file, or some user-defined
object type that the compiler has no way of knowing about.
> And if so, can't you build a compiler which creates machinecode that
> can handle both ints and floats in case of doubt?
That's pretty much what you've got now. The Python compiler compiles
the source code as much as it can, and the VM is the "machinecode that
can handle both ints and floats".
> Or is it actually possible to do, but so much work that nobody does
> it?
>
> I googled around, and I *think* it is because of the dynamic typing,
> but I really don't understand why this would be an issue..
Can you explain how to generate machine code to handle any possible
object type that any Python user might ever create?
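To make that question concrete, here is a minimal sketch showing that a Python program can build a brand-new type while it is running, using the built-in three-argument `type()` call. The names `make_type`, `Gadget` and `add` are made up for this example; the point is only that the class does not exist anywhere in the source text that a compiler would see ahead of time.

```python
def make_type(name):
    """Build a new class at run time with its own __add__ behaviour."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Arbitrary, user-chosen meaning for "+".
        return f"{name}({self.value!r} + {other!r})"

    # type(name, bases, namespace) creates the class object on the fly.
    return type(name, (), {"__init__": __init__, "__add__": __add__})


def add(a, b):
    # Compiled long before anyone knew that "Gadget" would exist.
    return a + b


# The class name could just as well come from user input or a config file.
Gadget = make_type("Gadget")
print(add(Gadget(1), 2))
```

Any machine code generated for `add` therefore has to cope with operand types that were created after compilation finished.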
--
Grant Edwards               grant.b.edwards        Yow! for ARTIFICIAL
                                  at               FLAVORING!!
                              gmail.com