The Julia Language
High-Performance JIT Compiler
Julia’s LLVM-based just-in-time (JIT) compiler, combined with the language’s design, allows it to approach and often match the performance of C/C++. To get a sense of the relative performance of Julia compared to other languages that can or could be used for numerical and scientific computing, we’ve written a small set of micro-benchmarks in a variety of languages. The source code for the various implementations can be found here: C++, Julia, Python, Matlab/Octave, R, and JavaScript. We encourage you to skim the code to get a sense for how easy or difficult numerical programming in each language is. The following micro-benchmark results, given as times relative to C++ (smaller is better), are from a MacBook Pro with a 2.53GHz Intel Core 2 Duo CPU and 8GB of 1066MHz DDR3 RAM:

| Benchmark | Julia | Python | Matlab | Octave | R | JavaScript |
|---|---|---|---|---|---|---|
| (version) | 3f670da0 | 2.7.1 | R2011a | 3.4 | 2.14.2 | V8 3.6.6.11 |
| fib | 1.97 | 31.47 | 1336.37 | 2383.80 | 225.23 | 1.55 |
| parse_int | 1.44 | 16.50 | 815.19 | 6454.50 | 337.52 | 2.17 |
| quicksort | 1.49 | 55.84 | 132.71 | 3127.50 | 713.77 | 4.11 |
| mandel | 5.55 | 31.15 | 65.44 | 824.68 | 156.68 | 5.67 |
| pi_sum | 0.74 | 18.03 | 1.08 | 328.33 | 164.69 | 0.75 |
| rand_mat_stat | 3.37 | 39.34 | 11.64 | 54.54 | 22.07 | 8.12 |
| rand_mat_mul | 1.00 | 1.18 | 0.70 | 1.65 | 8.64 | 41.79 |

C++ compiled by GCC 4.2.1, taking best timing from all optimization levels (-O0 through -O3).
The Python implementations of rand_mat_stat and rand_mat_mul use NumPy (v1.5.1) functions; the rest are pure Python implementations.
These benchmarks, while not comprehensive, do test compiler performance on a range of common code patterns, such as function calls, string parsing, sorting, numerical loops, random number generation, and array operations. Julia is strong in an area where high-level languages have traditionally been weak: scalar arithmetic loops, such as the one found in the pi summation benchmark. Matlab’s JIT for floating-point arithmetic does well here too, as does the V8 JavaScript engine. V8 is impressive in that it can provide such a dynamic language with C-like performance in so many circumstances. JavaScript, however, is unable to utilize technical computing libraries such as LAPACK, resulting in poor performance on benchmarks like matrix multiplication. In contrast with both Matlab and JavaScript, Julia has a more comprehensive approach to eliminating overhead that allows it to consistently optimize all kinds of code for arbitrary user-defined data types, not just certain special cases.
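As a rough illustration of the kind of scalar arithmetic loop meant here, the following is a minimal sketch of a pi summation kernel. It assumes the benchmark sums the series 1/k^2, which converges to pi^2/6; the names `pisum` and `nterms` are chosen for this example and the listing is not necessarily the benchmark's actual source.

```julia
# Sketch of a scalar arithmetic loop: partial sum of the series 1/k^2.
# `pisum` and `nterms` are illustrative names, not taken from the benchmark suite.
function pisum(nterms::Integer = 10_000)
    s = 0.0
    for k = 1:nterms
        s += 1.0 / (k * k)   # plain floating-point arithmetic, no array allocation
    end
    return s
end

println(pisum())
```

A loop like this has no vectorized form to fall back on, so its speed depends almost entirely on how well the compiler handles scalar code.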
To give a quick taste of what Julia looks like, here is the code used in the Mandelbrot and random matrix statistics benchmarks:

    function mandel(z)
        c = z
        maxiter = 80
        for n = 1:maxiter
            if abs(z) > 2
                return n-1
            end
            z = z^2 + c
        end
        return maxiter
    end

    function randmatstat(t)
        n = 5
        v = zeros(t)
        w = zeros(t)
        for i = 1:t
            a = randn(n,n)
            b = randn(n,n)
            c = randn(n,n)
            d = randn(n,n)
            P = [a b c d]
            Q = [a b; c d]
            v[i] = trace((P.'*P)^4)
            w[i] = trace((Q.'*Q)^4)
        end
        std(v)/mean(v), std(w)/mean(w)
    end

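
The two functions above use the syntax of the Julia build that was benchmarked. As a sketch, assuming a recent Julia release (1.x) in which `trace` is named `tr`, the `.'` operator is written `transpose`, and `mean` and `std` live in the `Statistics` standard library, the same functions might be written and called as follows. The sample inputs at the end are illustrative only.

```julia
using LinearAlgebra   # provides tr
using Statistics      # provides mean and std

# Escape-time count for one point of the Mandelbrot iteration.
function mandel(z)
    c = z
    maxiter = 80
    for n = 1:maxiter
        if abs(z) > 2
            return n - 1
        end
        z = z^2 + c
    end
    return maxiter
end

# Relative spread of trace((M'M)^4) for two arrangements of random blocks.
function randmatstat(t)
    n = 5
    v = zeros(t)
    w = zeros(t)
    for i = 1:t
        a = randn(n, n)
        b = randn(n, n)
        c = randn(n, n)
        d = randn(n, n)
        P = [a b c d]      # blocks side by side: 5x20
        Q = [a b; c d]     # blocks in a 2x2 grid: 10x10
        v[i] = tr((transpose(P) * P)^4)
        w[i] = tr((transpose(Q) * Q)^4)
    end
    return std(v) / mean(v), std(w) / mean(w)
end

inside  = mandel(complex(0.0, 0.0))   # the origin never escapes
outside = mandel(complex(3.0, 0.0))   # already outside the radius-2 disk
s1, s2  = randmatstat(1000)
println((inside, outside, s1, s2))
```
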
As you can see, the code is quite clear, and should feel familiar to
anyone who has programmed in other mathematical languages.
Although C++ beats Julia in the random matrix statistics benchmark by a
significant factor, consider how much simpler this code is than the C++ implementation.
There are more compiler optimizations planned that we hope will close
this performance gap in the future.
By design, Julia lets you move between low-level loop and vector
code and a high-level programming style, trading some
performance for the ability to express complex algorithms
easily.
This continuous spectrum of programming levels is a hallmark of the
Julia approach to programming and is very much an intentional feature of
the language.

Designed for Parallelism & Cloud Computing
Julia does not impose any particular style of parallelism on the user. Instead, it provides a number of key building blocks for distributed computation, making it flexible enough to support a number of styles of parallelism, and allowing users to add more. The following simple example demonstrates how to count the number of heads in a large number of coin tosses in parallel:

    nheads = @parallel (+) for i=1:100000000
        randbit()
    end

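
`@parallel` and `randbit` are the names used by the Julia build described here. As a sketch, assuming a recent Julia release in which the `Distributed` standard library provides `@distributed` and a random bit can be drawn with `rand(Bool)`, the same reduction might look like the following. The function name `count_heads` and the toss count are chosen for this example. Worker processes have to be added separately (for instance with `addprocs` or by starting `julia -p N`); without them the loop simply runs on the local process.

```julia
using Distributed

# Count heads in `ntosses` fair coin tosses with a parallel (+) reduction.
# `count_heads` is an illustrative name, not part of any library.
function count_heads(ntosses::Integer)
    return @distributed (+) for i = 1:ntosses
        Int(rand(Bool))   # 1 for heads, 0 for tails
    end
end

ntosses = 1_000_000
nheads = count_heads(ntosses)
println(nheads / ntosses)   # fraction of heads
```
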
This computation is automatically distributed across all available compute nodes, and the result, reduced by summation (+), is returned at the calling node.

Although it is in the early stages, Julia already supports a fully remote cloud computing mode. Here is a screenshot of a web-based interactive Julia session, plotting an oscillating function and a Gaussian random walk:
You can try Julia in the web REPL yourself at julia.forio.com (EC2 instance and maintenance graciously provided by Forio). There will eventually be full support for cloud-based operation, including data management, code editing and sharing, execution, debugging, collaboration, analysis, data exploration, and visualization. The goal is to allow people who work with big data to stop worrying about administering machines and managing data and get straight to the real problem.