0

It is said that in Linux, unlike Windows, there is no clear border between executables and other files.

Well, in Windows, I write a C++ program, then it is preprocessed, compiled and then linked to become a distinct kind of file: an executable. The changes are so great that they are not reversible.

But in Linux a simple text file containing code is executed. So what do compilation and linking do? If code is executed directly, why is it compiled? What is gained by this process, and what is the main difference between the code and the final (so-called) executable file in Linux? Why is the portability of programs limited in Linux, requiring a lot of (version-specific) dependencies, if they are just code?

5
  • 5
    This is not really a question about *nix but about the difference between scripting and compiled languages. On Windows as on Linux, a Perl script is a text file and also an executable. On Windows as on Linux, a compiled C program is a binary executable file. You may be confusing executable files with binary files.
    – terdon
    Commented Jan 4, 2014 at 0:26
  • I'm new to Linux and trying to understand it.
    – user54651
    Commented Jan 4, 2014 at 0:27
  • 1
    I suggest going through faqs.org/docs/artu/ch01s06.html. It doesn't answer your question directly but goes to the root of what you seem to be curious about.
    – Ketan
    Commented Jan 4, 2014 at 0:32
  • 5
    This question appears to be off-topic because it is about the difference between scripting and compiled languages and is not really related to *nix.
    – terdon
    Commented Jan 4, 2014 at 0:46
  • This isn't true about Windows. Windows has .bat (batch) files, for example. These work pretty similarly to shell scripts. Windows figures out how to handle a file based on its extension; Unix based on its permissions (execute bits) & content.
    – derobert
    Commented Jan 4, 2014 at 7:39

5 Answers

4

tl;dr: the difference is the executable bit.

The answer lies in the UNIX permissions model. To be honest, I forget what the Windows permissions model is, but in UNIX (and hence GNU/Linux), there are three main permission bits that can be set on a file: read, write, and execute. These bits can be set on anything. There are two main types of files that you would want to set the executable bit on:

  1. Binaries
  2. Scripts

The first type works exactly as .exe files do in Windows. The only difference is that the ability of the file to be executed is determined by a permission bit in the filesystem, instead of the file extension. Binaries do still have a format, just like .exe files. On GNU/Linux, this format is called ELF. The Linux kernel has special logic that tells it how to read ELF binaries. When you execute a binary, it is this logic that loads the program into memory and starts it running.
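To make the "permission bit, not extension" point concrete, here is a minimal Python sketch. It assumes a Unix-like filesystem; the helper names and the file name notes.txt are made up for illustration. It creates a file with a deliberately misleading extension and shows that only the execute bit changes whether the system considers it runnable.

```python
import os
import stat
import tempfile


def is_executable(path):
    """Return True if any execute bit (user, group, other) is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def add_execute_bit(path):
    """Set the owner's execute bit, roughly what `chmod u+x path` does."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)


def demo():
    """Create a file with a misleading name and flip its execute bit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "notes.txt")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\necho hello\n")
        before = is_executable(path)  # new files get no execute bits
        add_execute_bit(path)
        after = is_executable(path)
    return before, after
```

Calling demo() should report that the file is not executable right after creation and is executable after the chmod, even though its name never changes.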

The part that is confusing you is the second type of executable: scripts. Scripts are regular text files that can be executed by an interpreter, like python or bash. Scripts start with something called a shebang, which looks like this: #!. When a script is "executed", the kernel recognizes the shebang and executes whatever binary is specified after it, with the path of the script you are executing as an argument.

For example, let's say I have a script with the executable bit set, at the path /home/alex/bin/test_script. This script has the following as the first line:

#!/bin/bash 

When you execute this script, the kernel will recognize the shebang at the beginning. It will then load /bin/bash and pass it /home/alex/bin/test_script as the first argument. This would be the equivalent of executing the following on the command line:

/bin/bash /home/alex/bin/test_script 

In this way, bash is loaded to interpret, or "execute", the script.
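As a rough illustration of what the kernel does with that first line, here is a small Python sketch. The function name shebang_argv is hypothetical, and this only imitates the Linux behaviour (other Unix systems split the line differently): it turns a shebang line plus the script path into the argument list that ends up being executed.

```python
def shebang_argv(first_line, script_path):
    """Return the argv Linux would build for a script, or None if no shebang.

    This is an imitation for illustration, not the kernel's actual code.
    """
    if not first_line.startswith("#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    parts = rest.split(None, 1)
    argv = [parts[0]]  # the interpreter, e.g. /bin/bash
    if len(parts) == 2:
        # Linux passes everything after the interpreter as ONE argument.
        argv.append(parts[1])
    argv.append(script_path)  # the script itself comes last
    return argv
```

For the example above, shebang_argv("#!/bin/bash ", "/home/alex/bin/test_script") gives the same two items as the command line /bin/bash /home/alex/bin/test_script.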

As a small aside, the change from source to binary is not so great that it cannot be reversed. Retrieving source code from a binary is called decompiling, although the result is usually far less readable than the original source.

1
  • 1
    The original MS-DOS filesystems only support a handful of "attributes" -- System, Hidden, Volume Label(!), Archive, and Read-only (if I remember correctly). I think there may have been some "reserved bits" which were never used. (Yes, they had a bit on every directory on the system to indicate that this entry was NOT the volume label; it really was that stupid). System was sort of like "hidden+read-only" but different. It was an ill-conceived mess. The "volume bit" was used later, in VFAT, to implement long name support.
    Commented Jan 4, 2014 at 4:30
3

But in Linux a simple text file containing code is executed. So what do compilation and linking do?

This is because Linux doesn't rely on extensions but on the executable bit. Normally, when the compiler and linker finish their job, they set the executable bit on the resulting binary. This is why you can normally run C++ code right after compiling it. Text files are different. You either run them as interpreter file, where the interpreter is already an executable, or as ./file if the file has the executable bit set and starts with a shebang (#!) that tells the system which interpreter to use.

What is gained by this process, and what is the main difference between the code and the final (so-called) executable file in Linux?

I think this was answered already, but the source code is by no means ready to be executed. It has to be compiled, linked against its libraries, etc. The system should not compile at execution time, so the code first has to be compiled and only then executed.

Why is the portability of programs limited in Linux, requiring a lot of (version-specific) dependencies, if they are just code?

This depends on the OS and the installed libraries. This is actually a good thing! It means that you don't get the same library again and again each time you install something. In Windows you could easily end up with 3 or 4 GTK+ versions installed in different paths. In Linux you install a library once, and every application is able to use it and benefits from all its updates. If you want to remove an application, you don't need to hunt down each of its libraries.

BTW, they are not code. They are already compiled libraries (most of them), stored in fixed locations, and can be called at will by the programs that need them. The dynamic linker locates and loads them when a program starts.

    2

    Executables in Linux work very much the same as in Windows.

    Given some C++ source code, you compile it into a binary, both in Linux and Windows.

    Other languages like Perl are not compiled. They need an interpreter to be executed. This is the same in Linux and Windows.

    On Windows you have batch scripts. They are pretty much the same as shell scripts in Linux: They are just text which is executed line by line.


    Therefore the difference is not between Linux and Windows, but the one between compiled languages and scripting languages.

    1
    • Actually there are some other, more subtle differences. In MS-DOS and MS Windows, batch files are limited to ending in the .BAT extension (and I think they added .CMD for NT and later). The 4DOS alternative to COMMAND.COM (and its ilk, including the Norton Utilities derivative) also accepted the .BTM extension. Other than that, one could create associations between extensions and executables (including script interpreters) ... but this still isn't much like simply having an executable bit in the file's mode.
      Commented Jan 4, 2014 at 5:24
    1

    Each Unix executable needs two things: an appropriate mode and a signature. The mode is set by chmod and determines whether the system CAN try to execute the file. The signature, the first few bytes of the file, is a hint to the system about HOW the file should be executed.

    Scripts often begin with a "shebang", #!, a special marker saying that some interpreter should be started. A first line of #!/bin/sh means that the system should start another executable, /bin/sh, and pass your script to it as an argument. So scripts are executed indirectly.

    The "real" executables begin with the byte 0x7F followed by "ELF", which means that the system can load the file straight into memory.
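Here is a minimal Python sketch of the signature check, assuming only the two signatures discussed in this answer: #! for scripts and the ELF magic (byte 0x7F followed by the letters ELF) for native binaries. The function names are made up, and the real kernel recognizes more formats than this.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file
SHEBANG = b"#!"         # first two bytes of a script with an interpreter line


def classify_header(head):
    """Guess how a file would be executed from its first bytes."""
    if head.startswith(ELF_MAGIC):
        return "elf"
    if head.startswith(SHEBANG):
        return "script"
    return "unknown"


def classify_file(path):
    """Read just enough of the file to check its signature."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(len(ELF_MAGIC)))
```

On a typical Linux system, classify_file("/bin/ls") would be expected to report an ELF binary, while a shell script with a shebang line would be reported as a script. The mode check is separate: a file can have a valid signature and still lack the execute bit.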

    1
    • UNIX scans the first block of the file for certain patterns ("magic numbers") which affect how the file is executed. This varies. #! is interpreted by the kernel as a magic number. Linux also supports the old COFF/a.out and ELF binary formats, for example. Failing all that (plain text with no #! "comment"), the shell may attempt to process the text as a script (as csh does, for example). Linux also supports registration of magic numbers for things like Java .class files with custom interpreters, though this is rather obscure (Google: binfmt_misc).
      Commented Jan 4, 2014 at 5:29
    1

    Well, let's see. First of all, it is certainly not true that any file that contains code can be executed in *nix. Basically, executables are the same thing for all operating systems. The only real difference (and what the post you linked to is actually saying) is that on Windows, file extensions are used by the system to determine what the file type of a particular file is while on *nix this is usually irrelevant. You can use arbitrary extensions and not affect the file in any way.

    I will now give a horrible simplification. There are basically two main kinds of programming languages, compiled languages and scripting languages. To write a program in a compiled language (like C or C++ or Fortran), you first write a text file containing your code and you then compile that text file into a binary. That compilation essentially transforms things like for(i=0;i<10;i++){do stuff} into something a computer can understand.

    Scripting languages (like Perl or Python or bash), on the other hand, work via an interpreter. This interpreter reads the source code and translates it into machine commands on the fly (kinda). There is no need to compile because the interpreter takes care of translating the commands you have written into things your computer can understand.

    So, programs written in compiled languages are run as binary files, while those written in scripting languages are run by the interpreter, but both the binary and the script are executable files. This is exactly the same in Windows and in *nix.
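The "(kinda)" above can be seen from inside Python itself. This is only a sketch, with a made-up one-line program and a hypothetical helper name: the interpreter first compiles the source text into a bytecode object and then executes that object, all at run time and without producing a native binary.

```python
SOURCE = "total = sum(range(4))"  # tiny example program, purely illustrative


def run_source(source):
    """Compile source text to a code object, then execute it."""
    code = compile(source, "<example>", "exec")  # text -> bytecode object
    namespace = {}
    exec(code, namespace)  # the interpreter runs the bytecode
    return code, namespace
```

After code, namespace = run_source(SOURCE), namespace["total"] holds the computed value and code.co_code holds the raw bytecode. Passing the code object to dis.dis from the standard library prints a readable listing of it.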

    2
    • In this context, I'd say compiled and interpreted languages. Also, Python technically compiles source to bytecode run by a simple virtual machine (the compilation step runs automatically and transparently when you run a recently modified program, and the compiled output is stored in .pyc files on multi-module projects).
      – Alexios
      Commented Jan 4, 2014 at 8:21
    • @Alexios that's why I said kinda.
      – terdon
      Commented Jan 4, 2014 at 12:28