Language speed is often a major factor when determining how useful a programming language can be. Some applications rely on languages that can execute a large amount of instructions in a short amount of time, and programmers may in some cases use languages which have many unpleasant features, just because they can execute an algorithm faster. It would be interesting to see how some languages would fare when it comes to how long it takes them to execute the same algorithm.

I decided to conduct a comparison of programming languages by using them to implement practically the same task and then timing how long each one would take to execute. For the purposes of this article, a faster program is one which can execute an implementation of an algorithm in less time. The algorithm used was quicksort, which is a well-known sorting algorithm invented by Tony Hoare in 1961.1 It can be implemented recursively, and it works by finding a pivot value in an array, moving all elements smaller than it to one side and the larger ones to the other, this process is repeated on the sub-sections around the pivot until the whole array is sorted.

Following up from my previous article, I thought of using the Mersenne Twister2 algorithm to generate random 32bit integers that I would then sort using quicksort in 7 different implementations, specifically: C3, C++4, Java5, Python 36, Python with PyPy6, Jython6, and PHP7. For the quicksort algorithm, I found a pseudo-code description of it on Wikipedia8. Using a Java implementation of Mersenne Twister9, I generated 7 files all containing pseudo-random positive 32-bit integers. The files contained 100 to 100 million integers, with each file containing 10 times as many integers as the previous one. Using the description from Wikipedia, I then implemented the algorithm in the languages used in the test, adding functions for reading the numbers from files and printing them for checking. In the actual test each program was to read the contents of the file, store them in an array, sort the array using quicksort, and finally print a message that the array was sorted. The time of execution would be measured by the Bash time function.10 This was mainly done so that the same function would be measuring all programs, as using timing from different languages can produce unreliable results. In order to obtain more precise results each instance of sorting was done 5 times, and the average time was taken. In an effort to automate the task more, I wrote a Bash script that would run each instance of the program 5 times, time the execution for each run, and then calculate the average of the times and write it to a CSV table.11

As for my initial expectations, they coincided with the usual belief about these languages. I expected the compiled languages(C, C++, Java) to be fast, and the interpreted languages to be slower. This basically follows common knowledge about these languages, but the point of the test is to see how they would actually compare, and how significant the differences would be. When it comes to the Python variants I thought that PyPy would be the fastest out of them as it was designed specifically to make python code run faster, and uses just-in-time(jit) compilation.

  • The table which contains the average times for all runs of each program, on the five largest files.
Seconds taken to sort array
Language 10^4 integers 10^5 integers 10^6 integers 10^7 integers 10^8 integers
C0.00450.02350.21152.231523.4800
C++0.00400.02450.21302.229523.4535
Java0.25500.37351.40509.922595.5925
Python 30.08100.48905.682073.1470989.4360
PyPy2.2.10.13200.22401.53409.164094.2570
PHP0.12100.66507.599096.59501198.0580
Jython2.34602.84509.588088.15601585.4810
  • A graph which shows the comparison of how long each language took to sort N number of ints, the three largest files are considered

Graph comparing time taken to sort 100 million integers for 5 languages

The results of this test agree with the general belief that interpreted languages tend to be far slower than compiled ones. C and C++ were almost the same, which could be partly because the same IO libraries were used for those languages. PyPy turned out to be practically 10 times faster than CPython, which was much more than I had expected. This could be because the jit compilation starts making a greater difference with a greater number of integers. Jython was the slowest out of all, which certainly not surprising since it is running Python code on top of the Java Virtual Machine.

While this test showed interesting outcomes, it leaves many things to be improved. One of these is the heavy use of libraries for file I/O, this means that the test was also measuring the implementation of the libraries, which could have affected the results. For example if a file I/O library in Java was better written than the other ones, it could give that language a skewed advantage. Another issue with this test is that it only considers languages which have their syntax influenced by C, and have many major similarities. It would have been interesting to conduct the test while including languages, which are less similar to one another, for example some of the functional languages.

If anything this little comparison shows how useful it is to have knowledge of the features of different programming languages. For example, Python is a language which is really pleasant to code in, and allows for elegant solutions, however it would not be a good idea to use it in performance-intensive applications. Knowing about differences like this between different programming languages will allow one to make sensible choices when having to implement a solution to some problem. After all the programming languages we have are just tools, which can be used to implement some solution that the programmer comes up with.

Footnotes

1: http://cs.stanford.edu/people/eroberts/courses/soco/projects/2008-09/tony-hoare/quicksort.html

2: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html

3: My C implementation of Quicksort (compiled with gcc): https://github.com/Filip-Ter/QSortTest/blob/master/cmplang/src/QsortC.c

4: My C++ Quicksort implementation (compiled with g++): https://github.com/Filip-Ter/QSortTest/blob/master/cmplang/src/Qsort.cc

5: My Java Quicksort implementation, complied and executed with jdk 7u51: https://github.com/Filip-Ter/QSortTest/blob/master/cmplang/src/Qsort.java

6: My Python Quicksort implementation, the same source file was used for Python 3.3.4, PyPy, and Jython: https://github.com/Filip-Ter/QSortTest/blob/master/cmplang/src/Qsort.py

7: My PHP Quicksort implementation: https://github.com/Filip-Ter/QSortTest/blob/master/cmplang/src/Qsort.php

8:The pseudocode description of the algorithm that is implemented in my programs: http://en.wikipedia.org/wiki/Quicksort#In-place_version

9: Java implementation of Mersenne Twister: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/JAVA/MTRandom.java

10: The bash built-in time function was used, so not /usr/bin/time (GNU time)

11: The script used to run and time the programs: https://github.com/Filip-Ter/QSortTest/blob/master/TestTime.sh