Computer Science 101: The Art of Programming

(Dec 13th, 2013 at 12:51:32 AM)

At this point, we've discussed what programming languages are and, more or less, how they are used to make computers do things. I've also slightly (however poorly) motivated why some languages might be preferable in different situations. Since that's something I care very much about, I wanted to dedicate today to at least hitting on some of the big considerations.

Compilers vs Interpreters

So far I've really only talked about compiling programs (converting them to binary so the computer can execute them). However, programs can also be interpreted. This entails writing a program, called an interpreter, that looks at the source code of another program (i.e. the programming language code) and, rather than converting it into instructions for the computer, simply "does" whatever the code is supposed to do.
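To make that a little more concrete, here is a minimal sketch of an interpreter for a made-up mini language of arithmetic expressions. The nested-tuple program format and the names `evaluate` and `OPS` are my own assumptions for illustration; no real language works exactly like this.

```python
# A toy interpreter: it walks the program's structure and "does" each
# operation directly, without ever producing machine code.
# The nested-tuple program format is hypothetical, chosen for brevity.
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}

def evaluate(node, variables):
    """Evaluate an expression such as ("+", 1, ("*", "x", 3))."""
    if isinstance(node, (int, float)):
        return node                  # a literal number is its own value
    if isinstance(node, str):
        return variables[node]       # a name is looked up while running
    op, left, right = node           # otherwise: (operator, left, right)
    return OPS[op](evaluate(left, variables), evaluate(right, variables))

program = ("+", 1, ("*", "x", 3))
print(evaluate(program, {"x": 4}))   # computes 1 + (4 * 3)
```

Notice that the interpreter has to re-inspect the shape of the program every single time it runs, which is the overhead a compiler pays only once.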

There are pros and cons to both methods, but it really boils down to this: compilers make programs that run quickly, while interpreters are aware of what is happening in a program as it runs. Compiled programs run quickly because they are machine code; they run directly on the computer. An interpreter is not only executing the program but also figuring out what it's trying to do as it goes. A compiler takes care of that once, and then the program is good to go. However, a compiled program will do the same thing every time; an interpreter can identify certain situations where work can be avoided, because it knows all the actual values while the program is running.

In general, older programming languages like C, C++, Fortran, etc. were distributed with compilers while "newer" languages like Python or PHP have interpreters. Technically speaking, most languages could have a compiler or an interpreter or both. More often, though, there is only one of the two (or at least only one that gets mainstream attention). As a result, languages are often referred to as "interpreted" or "compiled," which really just means that's the most likely way they will be run. But before I go on any more about that, there's a big topic that I'm going to touch on unfortunately briefly: compile-time optimizations.

Optimizing compilers are those that attempt to make programs as efficient as possible (be it speed of execution or amount of memory used) while maintaining the same functionality. They essentially work by identifying bits of code that a programmer wrote that can be removed, re-ordered, or re-worked to require fewer instructions (or instructions that take less time) in the final machine code, and/or to limit how much memory is used and how often it is accessed.
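One of the simplest optimizations of this kind is usually called constant folding: if both sides of an operation are already known numbers, do the math ahead of time. Below is a rough sketch using Python's standard `ast` module (it needs Python 3.9 or later for `ast.unparse`). The names `ConstantFolder`, `FOLDABLE`, and `fold` are hypothetical, and real compilers do this far more thoroughly.

```python
import ast
import operator

# Only fold operators that are clearly safe to compute ahead of time.
FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}

class ConstantFolder(ast.NodeTransformer):
    """Replace 'number op number' with the already-computed result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the sub-expressions first
        fn = FOLDABLE.get(type(node.op))
        both_numbers = all(
            isinstance(side, ast.Constant)
            and isinstance(side.value, (int, float))
            for side in (node.left, node.right)
        )
        if fn is not None and both_numbers:
            folded = ast.Constant(value=fn(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node  # something is unknown until run time; leave it alone

def fold(source):
    """Parse source code, fold constants, and return the rewritten code."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

print(fold("seconds = 60 * 60 * 24 + offset"))
```

The multiplication gets done once, up front, while `offset` is left alone because its value isn't known until the program runs. This sketch is also easily defeated: `x + 1 + 2` is left untouched, since it is grouped as `(x + 1) + 2` and neither addition has two known numbers.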

There are dozens of optimizations a compiler can make if it can identify where they apply, which, for a large program, can mean loads of work saved for the programmer. The trouble is that with how expressive modern programming languages are, knowing for sure that a certain rule applies can be very difficult. In languages like C or C++, everything you do is (or at least should be) extremely explicit. Because you are doing everything so deliberately, the compiler is more able to identify where there are issues and where things can be improved. In languages like Python, you don't get as much help.

I've taken care not to mention Java yet, because it was designed with a fairly unique purpose. Java is compiled, but not into machine code. The Java compiler translates your code into something called "byte code." This byte code is similar to assembly, but is designed to be just general enough to work on any computer hardware. However, Java also has an interpreter of sorts: the byte code output by the compiler is then "run" by the Java Virtual Machine, or JVM. The JVM is a huge program with many features (usually written in C++ or something like it) that has to be developed for each kind of computer hardware as it shows up. Though this is a fairly large effort, once that work is done ANY Java program can be run on that machine, just by getting your hands on the byte code.
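The two-step idea (compile to byte code, then run the byte code on a virtual machine) can be sketched in miniature. Everything here is invented for illustration: the instruction names `PUSH`, `LOAD`, `ADD`, and `MUL`, the nested-tuple input format, and the functions `compile_expr` and `run`. The real JVM is also built around a stack, but its instruction set is far richer than this.

```python
# A toy "compile to byte code, then run on a virtual machine" pipeline.
# The instruction names are made up for this sketch; they are not the
# JVM's actual instructions.

def compile_expr(node):
    """Translate an expression like ("+", 1, ("*", "x", 3)) into a flat
    list of (opcode, argument) instructions."""
    if isinstance(node, (int, float)):
        return [("PUSH", node)]
    if isinstance(node, str):
        return [("LOAD", node)]
    op, left, right = node
    opcode = {"+": "ADD", "*": "MUL"}[op]
    # Operands first, operator last, so the VM finds both on its stack.
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]

def run(bytecode, variables):
    """A minimal stack-based virtual machine for the instructions above."""
    stack = []
    for opcode, arg in bytecode:
        if opcode == "PUSH":
            stack.append(arg)
        elif opcode == "LOAD":
            stack.append(variables[arg])
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

bytecode = compile_expr(("+", 1, ("*", "x", 3)))
print(bytecode)
print(run(bytecode, {"x": 4}))
```

The portability argument falls out of this split: the list of instructions never mentions any particular hardware, so only `run` would need to be rewritten for a new kind of machine.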

Other Considerations

From the above you can probably get a decent idea of why people use different languages and why it matters for programmers to know many, but before calling it a night I wanted to throw together a few other points that probably don't need much explanation:

Programs that need to be fast should be compiled
For simple programs it doesn't really matter what you use, just get it done and don't be stupid about it
If you REALLY need portability (programs that can run on different kinds of computers), Java may be your best bet. At the same time, most languages are fairly portable at this point, so the world is better in that sense
Readability matters: If other programmers can't figure out what you're doing, you've basically just created a program in its final version

One thing that helps this a lot is commenting. I think every language I've ever used has provided a way of putting "comments" into your code that don't affect the program at all but can serve as a note for other programmers (or future you) to understand what is going on in case it needs to be changed or added to later

C and C++ are much more focused on raw programming. They're about doing what you want to the data and only what you want. Most languages since Python have revolved more around built-in features and ease of use. This tends to make C and C++ feel more restricted or behind the times
Programming is about problem solving and logic. There is so much more to it than compilers and algorithms (which we admittedly haven't gotten into too much) and people do more and more clever things every year

And on that note, I'll see you tomorrow, when I'll tell you a little more about what is going on now and how I feel about it.
