Yes, I know, Ruby is not that slow, use a better algorithm, blablablablablaaaaaa. Ruby does fit my needs, but I want to do a few things in less than 100ms (repeatedly), so I had to do a bit of performance tuning.
This page is a work in progress towards documenting performance-related tips for Ruby. In fact, it is not really about performance tuning, but more about reducing latency. It applies only to Ruby 1.8. I have not yet gotten my hands on Ruby 1.9.
Avoiding object allocation
Object allocation is slow, and the GC (which is also slow) takes longer to run when there are more objects (of course). As a corollary, do not load rubygems, and cache the result of enum_for if you use it a lot.
RubyGems
Do not load rubygems. It accounts for more than 10000 objects on my system, and makes the GC run more than 8 times longer. The simple program
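require 'pp'
require 'enumerator'
require 'radius'
require 'facet/time/elapse'
# Collect garbage first, so that only live objects are counted
GC.start
# Number of live objects
pp ObjectSpace.enum_for(:each_object).to_a.size
# Duration of a GC run that has nothing left to collect
pp Time.elapse { GC.start }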
measures the number of live objects, and the time the GC spends determining that there is nothing to delete. Without RubyGems loaded, 598 objects are alive and the GC runs for 3 ms. With RubyGems, 11300 objects remain and the GC runs for 17 ms. For those wondering why: a lot of these objects are strings that are actually program lines. I have not yet found out why.
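For those who want to repeat the measurement without the facets dependency, here is a minimal sketch using only the core library. The helper names live_object_count and gc_run_ms are made up for this example, and the numbers it prints will differ from the ones above depending on the Ruby version and on what is loaded.

```ruby
# Hypothetical helper: counts live objects.
# ObjectSpace.each_object returns the number of objects it visited.
def live_object_count
  ObjectSpace.each_object { |_obj| }
end

# Hypothetical helper: duration of one full GC run, in milliseconds.
def gc_run_ms
  started = Time.now
  GC.start
  (Time.now - started) * 1000.0
end

GC.start # collect garbage first, so that only live objects are counted
puts "live objects: #{live_object_count}"
puts "GC run: #{'%.2f' % gc_run_ms} ms"
```

Running it once as is, and once with require 'rubygems' added at the top, should give a comparison of the same kind as the one described above.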
Use of enumeration idioms
- Avoid using the &block form, unless you actually want to store the resulting proc. It allocates a Proc object, while using a plain block (with yield) is faster and does not allocate any.
- If you use enum_for in methods that are called often, do it once and reuse the same enumerator object. #each is thread-safe as long as the underlying enumeration method is, so the cached enumerator object can be called from different threads.
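The two tips above can be sketched as follows. All names here (twice_with_yield, twice_with_block, Registry, each_item, items_enum) are made up for the illustration, and the allocation behaviour described in the comments is the one claimed above for Ruby 1.8.

```ruby
# Enumerator support has to be required on Ruby 1.8; later versions have it built in.
require 'enumerator' unless defined?(Enumerator)

# Plain block form: the block is invoked through yield, no Proc object is needed.
def twice_with_yield
  yield
  yield
end

# &block form: the block is converted to a Proc object so it can be called or stored.
def twice_with_block(&block)
  block.call
  block.call
end

# Hypothetical class that caches its enumerator instead of rebuilding it on every call.
class Registry
  def initialize(items)
    @items = items
  end

  # The underlying enumeration method.
  def each_item
    @items.each { |item| yield item }
  end

  # enum_for is called only once; later calls return the same enumerator object.
  def items_enum
    @items_enum ||= enum_for(:each_item)
  end
end

registry = Registry.new([1, 2, 3])
registry.items_enum.each { |item| puts item }
puts registry.items_enum.equal?(registry.items_enum)
```

Note that the ||= memoization is itself not synchronized: if several threads may make the first call at the same time, build the enumerator in initialize instead.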