Thinking about it, I realized that even the implementation described here isn't really O(log n). Here's why: it's rather easy to prove that the nth term of the Fibonacci sequence has O(n) digits. So, even if you are performing only O(log n) multiplications, each multiplication is actually O(n), so the algorithm is actually O(n log n).
I made a benchmark that shows that in practice:
docs.google.com/spreadsheets/d/1mDIuAUrI6lzZxDf2HEqHbaYjddfK0hijd_r9qBOOEoQ
*cap obvious flies away*