I’ve been using ActiveRecord for a while now. I’m writing a financial application with it — which I suppose is the canonical example of a database application that SQL was designed for. Specifically:
- It has many 1:N relations.
- It makes extensive use of grouping and aggregation for everyday data.
- It routinely works with many thousands of records, even in a small system.
All of which makes ActiveRecord break into a cold sweat.
ActiveRecord, at the moment, handles 1:N relationships with N+1 queries — one for the main record, and one for each sub-record, indexed by ID. That gets really slow when you have a thousand records to display.
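To make the cost concrete, here is a minimal sketch in plain Ruby. It is not ActiveRecord: FakeDB and its methods are hypothetical stand-ins that only count round trips, and the figure of a thousand accounts with one transaction each is an assumption chosen to match the scenario above.

```ruby
# Hypothetical in-memory stand-in for a database; it only counts round trips.
class FakeDB
  attr_reader :query_count

  def initialize(accounts, transactions)
    @accounts = accounts
    @transactions = transactions
    @query_count = 0
  end

  # One query returning every account row.
  def all_accounts
    @query_count += 1
    @accounts
  end

  # One query per call: the transactions of a single account.
  def transactions_for(account_id)
    @query_count += 1
    @transactions.select { |t| t[:account_id] == account_id }
  end

  # One query returning every transaction, as a join or IN (...) would.
  def all_transactions
    @query_count += 1
    @transactions
  end
end

accounts = (1..1000).map { |id| { id: id } }
transactions = accounts.map { |a| { account_id: a[:id], amount: 10 } }

# N+1: one query for the accounts, then one more per account.
naive = FakeDB.new(accounts, transactions)
naive.all_accounts.each { |a| naive.transactions_for(a[:id]) }

# Batched: two queries in total, grouped in memory afterwards.
batched = FakeDB.new(accounts, transactions)
batched.all_accounts
by_account = batched.all_transactions.group_by { |t| t[:account_id] }

puts naive.query_count
puts batched.query_count
```

The naive loop issues 1001 queries where the batched version issues 2, which is the gap that hurts once each query carries real network and parsing overhead.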
ActiveRecord, since it reads the column names from the database system itself, can’t automatically infer the presence of a virtual column such as an account balance, which is calculated as the sum of a field from a joined table. In fact, there’s no way to do that at all without dropping back to SQL, and there’s nowhere to put a join in the standard methods. That means you have to jump under the hood, into the greasy parts of the engine.
It’s really tempting to write
def balance
  transactions.inject(0.0) { |acc, e| acc + e.amount }
end
but it’s too slow.
My other problem is how it doesn’t integrate with Ruby that well. It doesn’t feel like the standard library.
ActiveRecord takes over inheritance: ActiveRecord classes must inherit from ActiveRecord::Base, and worse than that, it treats indirect subclasses differently. If I have an inheritance chain of Payment < Transaction < ActiveRecord::Base, then it assumes there’s a column named type, which will contain either NULL or the string Payment, indicating whether to instantiate the parent or the child type, instead of using a table payments by default and letting me change that. This integrates really poorly with PostgreSQL’s way of doing object orientation, and in this case, a Payment is a kind of Transaction, but the difference is semantic, not coded into the database. I’d like to be able to inherit the behaviors from the parent, not have them change on me because I’m inheriting.
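The type-column behavior described here can be sketched in plain Ruby. This is only an illustration of the idea, not ActiveRecord’s code: the instantiate method and the row hashes are hypothetical.

```ruby
# Plain-Ruby sketch of a type-column lookup; no ActiveRecord involved.
class Transaction
  attr_reader :amount

  def initialize(amount)
    @amount = amount
  end

  # Hypothetical loader: pick the class named in the row's "type" column,
  # falling back to the class being queried when the column is NULL (nil).
  def self.instantiate(row)
    klass = row["type"] ? Object.const_get(row["type"]) : self
    klass.new(row["amount"])
  end
end

class Payment < Transaction; end

# Both rows live in the same table; only the type column tells them apart.
rows = [
  { "type" => nil,       "amount" => 100 },
  { "type" => "Payment", "amount" => -40 }
]
objects = rows.map { |row| Transaction.instantiate(row) }
puts objects.map { |o| o.class.name }.inspect
```

Under this scheme both classes share one table, which is exactly what gets in the way when the difference between parent and child is purely semantic.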
ActiveRecord is also very hard to debug without intimately knowing its structure, due to the massive amount of eval magic involving strings calculated elsewhere in the code.
ActiveRecord does a decent job of type guessing, but only for basic types and classes-as-tables, not for anything resembling a derivative of a basic type.
So, what I’d do to ActiveRecord, and may do when I get this app a bit further along:
- Make ActiveRecord::Base a module designed for mixin. We have Enumerable, which adds pretty massive functionality to a class, so why not an ActiveRecord mixin to make your class automagically database-enabled?
- Eliminate the inheritance magic.
- Use define_method whenever possible.
- Split the type-guessing code into a separate module and design it to be extended.
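A rough sketch of what the mixin and define_method ideas could look like together follows. Everything in it is hypothetical: Persistable, column, and attributes are names invented for illustration, and nothing here talks to a database.

```ruby
# Hypothetical mixin; the names Persistable and column are illustrative only.
module Persistable
  # When included, give the host class the class-level macros too.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def columns
      @columns ||= []
    end

    # Declare a column and generate its accessors with define_method,
    # so the generated code is ordinary Ruby rather than an eval'd string.
    def column(name)
      columns << name
      define_method(name) { attributes[name] }
      define_method("#{name}=") { |value| attributes[name] = value }
    end
  end

  def attributes
    @attributes ||= {}
  end
end

# The class keeps its own superclass; persistence arrives through the mixin.
class Payment
  include Persistable
  column :amount
  column :payee
end

payment = Payment.new
payment.amount = 250
payment.payee = "Landlord"
puts payment.attributes.inspect
```

Because the accessors are built from blocks, a backtrace through them points at real source lines, which is the debugging benefit the define_method item is after.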
Take this with a grain of salt. I’m working with a pre-existing application here, porting to ActiveRecord. I’m working with data that has One True Way to be represented, according to six hundred years of accounting theory. I also have a habit of, and a fondness for, taking systems and bending them past the point where they’re still flexible. It’s why I use Ruby.
A response from David Heinemeier Hansson
Ahh, great to get a set of concerns like that. Allow me to elaborate my position on each of the charges.
1:N relationships with N+1 queries
This is exactly why Active Record allows for an intimate relationship with SQL. And why it uses SQL for conditions snippets and more. Active Record really doesn’t want you to forget about SQL because SQL is exactly the right answer to a lot of questions. This is one of them. Active Record takes the drudgery out of doing manual SQL for a ton of queries (in Basecamp it’s around 85-95% of the queries that are automated).
So when you encounter a case such as this, you just drop down to SQL. If AR gives the impression that using SQL is fooling around with the engine, that’s where the mistake is being made. It’s more like choosing manual gears on a car that runs on automatic rather than popping the hood. So I think this is mostly a case of managing expectations. Active Record shouldn’t give the expectation that you’ll never have to touch SQL again. That’s an illusion, and many ORMs have been built and have crumbled on that illusion.
Taking Active Record to a manual gear on this entails using all the wonderful flexibility put into the ORM through the power that is Ruby. The solution is called “data piggybacks” and works like this:
class Post < ActiveRecord::Base
  def self.find_all_with_authors
    find_by_sql(
      "SELECT posts.*, people.name AS author_name " +
      "FROM posts, people " +
      "WHERE posts.author_id = people.id"
    )
  end
end
Now you can do:
for post in Post.find_all_with_authors
  puts "#{post.title} was written by #{post.author_name}"
end
This bends the regular restriction that post objects will only have attributes equal to those in the table definition for that class. Hence the piggyback term.
This is of course a simple example, which you can make more elaborate as your requirements dictate. But it does show how you can turn n*2 (or n*3 or n*5) type queries back into n. This is mostly important for overview screens that need to present this data together.
I could imagine some formal support in Active Record for the piggyback approach, but I’ve found that outside toy examples, like the one above, I really do like to drop to SQL for queries like this. The only reason you’re going to SQL is to tweak performance, so you’re often more interested in control than convenience in these cases. But again, I’m certainly open to formal piggyback schemes.
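The mechanism behind a piggyback can be sketched without a database. The Record class below is a hypothetical stand-in, not Active Record’s implementation; it only shows how an object built from a result row can answer for every column in that row, including ones borrowed from a joined table.

```ruby
# Plain-Ruby sketch of piggybacked attributes; Record is a hypothetical
# stand-in, not Active Record's implementation.
class Record
  def initialize(row)
    @attributes = row
  end

  # Any column present in the result row answers as a reader method,
  # whether or not it belongs to the class's own table.
  def method_missing(name, *args)
    key = name.to_s
    return @attributes[key] if args.empty? && @attributes.key?(key)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.key?(name.to_s) || super
  end
end

# One row as a joined query might return it: posts columns plus author_name.
row = { "id" => 1, "title" => "Hello", "author_id" => 7, "author_name" => "David" }
post = Record.new(row)
puts "#{post.title} was written by #{post.author_name}"
```

The object never asks which table a column came from, so a single joined query is enough to fill in both the post and its author’s name.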
ActiveRecord takes over inheritance
It was an early decision that ActiveRecord objects wouldn’t be able to live in isolation from the ORM. This follows directly from the Active Record pattern and is one of the main differences between that and the Data Mapper approach. By tying your business objects to Active Record you lose flexibility in swapping in another ORM, but at the same time gain an incredible ease of use and convenience. Active Record objects accept this dependency by using the association/aggregation macros (has_many, etc), overwriting callbacks (before_save, etc), and implementing validation.
So in my opinion there’s little to gain from a mix-in, when you’re writing code that formulates a specific dependency anyway. Once you’ve accepted that, there’s no way to “mix-out”.
What I’d rather mix-in would be the shared functionality between payment and transaction if you don’t want it to use an explicit inheritance hierarchy. That was actually exactly what I did with todo lists and todo items on Basecamp. Both needed functionality to control list movement (move this item/list higher or lower or insert a new item/list). This behavior was kept in a List module that was then mixed into both Active Record classes. Worked very well.
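As a sketch of that arrangement in plain Ruby: the module name List comes from the description above, but the method names, the position attribute, and the two bare classes are assumptions made for illustration, with Active Record left out entirely.

```ruby
# Hypothetical sketch: the module name List comes from the text above,
# but its methods and the two classes are illustrative assumptions.
module List
  attr_accessor :position

  # Move one step toward the top; position 1 is already the top.
  def move_higher
    self.position -= 1 if position > 1
  end

  # Move one step toward the bottom.
  def move_lower
    self.position += 1
  end
end

class TodoList
  include List

  def initialize(position)
    @position = position
  end
end

class TodoItem
  include List

  def initialize(position)
    @position = position
  end
end

item = TodoItem.new(2)
item.move_higher
list = TodoList.new(1)
list.move_lower
puts item.position
puts list.position
```

Both classes get the same movement behavior without sharing a parent, so neither has to join an inheritance hierarchy just to reuse it.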
ActiveRecord is also very hard to debug.
I recognize that especially the association/aggregation macros are pretty hard to debug. But you’ve remedied that yourself with the module_eval debugger, where you get a perfect view into which methods are added by the different macros and what their internals look like.
The other parts of Active Record are more easily debugged by having a look at the logging that ships with the package. All SQL statements generated and executed by Active Record are logged and their run time is reported (poor man’s profiler).
The newer versions of Active Record also implement a considerably improved exception system, with more descriptive and relevant exceptions raised.
ActiveRecord does not [do a decent job of type guessing] for anything resembling a derivative of a basic type.
This just changed with the inclusion of the aggregation module, which brings much needed value object support to Active Record (allowing for a “fine-grained” label). You can now use Active Record objects as aggregations of value objects such as Temperature, Money, Address, and others that don’t need an identity (objects that do have an identity are often called entity objects, which is exactly what Active Record-based objects are).
Example (as it will be in the final version):
class Account < ActiveRecord::Base
  composed_of :amount, :class_name => "Money"
end
The amount attribute, which is just represented by an integer in the database, will now be presented as a Money object and can be assigned as such:
account.amount = Money.new(20, "USD")
account.amount.exchange_to("DKK") # returns Money.new(150, "DKK")
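For readers wondering what such a value object might look like, here is a minimal sketch. It is an assumption, not the Money class the aggregation module expects: the RATES table is hypothetical, with a rate of 7.5 chosen only so that 20 USD comes out as 150 DKK as in the example above.

```ruby
# Minimal value-object sketch; not the Money class shipped with any library.
class Money
  # Hypothetical fixed rate, chosen so that 20 USD becomes 150 DKK.
  RATES = { ["USD", "DKK"] => 7.5 }.freeze

  attr_reader :amount, :currency

  def initialize(amount, currency)
    @amount = amount
    @currency = currency
    freeze # value objects are immutable: changing one means making a new one
  end

  # Return a new Money in the target currency; the receiver is left untouched.
  def exchange_to(other_currency)
    return self if other_currency == currency
    rate = RATES.fetch([currency, other_currency])
    Money.new((amount * rate).round, other_currency)
  end

  # Two Money objects are equal when their values are, since neither
  # carries an identity of its own.
  def ==(other)
    other.is_a?(Money) && amount == other.amount && currency == other.currency
  end
end

usd = Money.new(20, "USD")
dkk = usd.exchange_to("DKK")
puts dkk == Money.new(150, "DKK")
```

Equality by value rather than by identity is what separates an object like this from an entity such as an Account row.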
Take this with a grain of salt. I’m working with a pre-existing application here
Active Record is certainly better suited for new applications than for wrapping around existing structures. The naming, id, and class schemes work much better when you’re following the form that Active Record is most comfortable with. So again, trade a bit of flexibility, get ease of use.
It would however not be unreasonable to suggest improvements to Active Record that would allow it to be more easily used for legacy applications. It’s just that I’ve certainly picked sides and tweaked ease of use towards new applications, so suggestions that make this awkward probably won’t be accepted.
Thank you, Ari, for bringing these concerns out. This allows me to better formulate what Active Record is and isn’t and why.
(I’ve posted this response to Loud Thinking as well)
David Heinemeier Hansson,
http://www.instiki.org/ — A No-Step-Three Wiki in Ruby
http://www.basecamphq.com/ — Web-based Project Management
http://www.loudthinking.com/ — Broadcasting Brain
http://www.nextangle.com/ — Development & Consulting Services
A response
All very good points. When it comes down to it, it’s a difference in expectations, mostly, and in coding style.
Debugging is still something that keeps biting me, and since ActiveRecord isn’t a perfect fit for what I’m doing, I do have to get into the guts. I think of modifiability for other purposes as an important design feature, so that’s why it bothers me.
Having the name ActiveRecord makes me think it’s the One True Implementation of the pattern: the one-size-fits-all, ultra-flexible version. Just as I’d expect Enumerable to be the one implementation of Enumerable, I expect the same of ActiveRecord. You chose an ambitious title.
All told, ActiveRecord is solidly implemented. What it was designed to do, it does well.
I look forward to the _with_parts accessors and the composed_of macros. They’ll solve some of the problems neatly.
More coming.
Responses? Comments? Email me. I’ll post them here if you say I can and they’re reasonable.