Why is ORM considered good but “select *” considered bad


Doesn't an ORM usually involve doing something like a select *?

If I have a table, MyThing, with columns A, B, C, D, etc., then there would typically be an object, MyThing, with properties A, B, C, D.

It would be evil if that object were incompletely instantiated by a select statement that looked like this, only fetching the A, B, not the C, D:

select A, B from MyThing /* don't get C and D, because we don't need them */

but it would also be evil to always do this:

select A, B, C, D from MyThing /* get all the columns so that we can completely instantiate the MyThing object */

Does ORM make an assumption that database access is so fast now you don't have to worry about it and so you can always fetch all the columns?

Or, do you have different MyThing objects, one for each combo of columns that might happen to be in a select statement?

EDIT: Before you answer the question, please read Nicholas Piasecki's and Bill Karwin's answers. I guess I asked my question poorly because many misunderstood it, but Nicholas understood it 100%. Like him, I'm interested in other answers.

EDIT #2: Links that relate to this question:

Why do we need entity objects?

http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx, especially the section "The Partial-Object Problem and the Load-Time Paradox"




Best Solution

In my limited experience, things are as you describe--it's a messy situation and the usual cop-out "it depends" answer applies.

A good example would be the online store that I work for. It has a Brand object, and on the main page of the Web site, all of the brands that the store sells are listed on the left side. To display this menu of brands, all the site needs is the integer BrandId and the string BrandName. But the Brand object contains a whole boatload of other properties, most notably a Description property that can contain a substantially large amount of text about the Brand. No two ways about it, loading all of that extra information about the brand just to spit out its name in an unordered list is (1) measurably and significantly slow, usually because of the large text fields and (2) pretty inefficient when it comes to memory usage, building up large strings and not even looking at them before throwing them away.
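To put a rough number on that cost, here is a minimal sketch using Python's built-in sqlite3 module rather than the .NET stack I actually use. The table layout, the helper names, and the 50,000-character descriptions are all assumptions made up for illustration; only BrandId, BrandName, and Description come from the example above.

```python
import sqlite3


def make_brand_db(n_brands=3, description_chars=50_000):
    """Build an in-memory Brand table with a deliberately large Description column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Brand ("
        "BrandId INTEGER PRIMARY KEY, "
        "BrandName TEXT NOT NULL, "
        "Description TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO Brand (BrandId, BrandName, Description) VALUES (?, ?, ?)",
        [(i, f"Brand {i}", "x" * description_chars) for i in range(1, n_brands + 1)],
    )
    conn.commit()
    return conn


def fetch_full_rows(conn):
    """What a fully hydrated Brand object needs: every column, big text included."""
    return conn.execute(
        "SELECT BrandId, BrandName, Description FROM Brand ORDER BY BrandId"
    ).fetchall()


def fetch_menu_rows(conn):
    """What the brand menu needs: just the id and the name."""
    return conn.execute(
        "SELECT BrandId, BrandName FROM Brand ORDER BY BrandId"
    ).fetchall()


def text_chars(rows):
    """Total number of characters of text pulled back by a query."""
    return sum(len(value) for row in rows for value in row if isinstance(value, str))
```

Calling text_chars on both result sets shows how much text the menu page would fetch and then throw away; the gap grows with the number of brands and the size of each description.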

One option provided by many ORMs is to lazy load a property. So we could have a Brand object returned to us, but that time-consuming and memory-wasting Description field is not loaded until we try to invoke its get accessor. At that point, the proxy object will intercept our call and suck down the description from the database just in time. This is sometimes good enough but has burned me enough times that I personally don't recommend it:

  • It's easy to forget that the property is lazy-loaded, introducing a SELECT N+1 problem just by writing a foreach loop. Who knows what happens when LINQ gets involved.
  • What if the just-in-time database call fails because the transport got flummoxed or the network went out? I can almost guarantee that any code that is doing something as innocuous as string desc = brand.Description was not expecting that simple call to toss a DataAccessException. Now you've just crashed in a nasty and unexpected way. (Yes, I've watched my app go down hard because of just that. Learned the hard way!)
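Both problems are easy to reproduce with a hand-rolled lazy property. This is only a sketch in Python with sqlite3, not how NHibernate or any real ORM builds its proxies; CountingDb, LazyBrand, seed, and load_brands are hypothetical names invented for the illustration.

```python
import sqlite3


class CountingDb:
    """Thin wrapper that counts every query sent through it (hypothetical helper)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.query_count = 0

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


class LazyBrand:
    """Brand whose description is fetched on first access, like an ORM proxy would."""

    def __init__(self, db, brand_id, name):
        self._db = db
        self.brand_id = brand_id
        self.name = name
        self._description = None

    @property
    def description(self):
        # Looks like a plain attribute read, but it may hit the database
        # and may therefore raise a database error.
        if self._description is None:
            rows = self._db.query(
                "SELECT Description FROM Brand WHERE BrandId = ?", (self.brand_id,)
            )
            self._description = rows[0][0]
        return self._description


def seed(db, n_brands):
    """Create and fill the Brand table without touching the query counter."""
    db.conn.execute(
        "CREATE TABLE Brand (BrandId INTEGER PRIMARY KEY, BrandName TEXT, Description TEXT)"
    )
    db.conn.executemany(
        "INSERT INTO Brand VALUES (?, ?, ?)",
        [(i, f"Brand {i}", f"Long text about brand {i}") for i in range(1, n_brands + 1)],
    )
    db.conn.commit()


def load_brands(db):
    """One query for the list; descriptions are deferred."""
    rows = db.query("SELECT BrandId, BrandName FROM Brand ORDER BY BrandId")
    return [LazyBrand(db, brand_id, name) for brand_id, name in rows]
```

Loading N brands costs one query, and looping over them reading description costs N more, which is the SELECT N+1 pattern. If the connection has gone away in the meantime, the innocent-looking property read raises sqlite3.ProgrammingError, the analogue here of the DataAccessException surprise.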

So what I've ended up doing is that in scenarios that require performance or are prone to database deadlocks, I create a separate interface that the Web site or any other program can call to get access to specific chunks of data that have had their query plans carefully examined. The architecture ends up looking kind of like this (forgive the ASCII art):

Web Site:         Controller Classes
                     |                                 |
App Server:       IDocumentService               IOrderService, IInventoryService, etc
                  (Arrays, DataSets)             (Regular OO objects, like Brand)
                     |                                 |
                     |                                 |
                     |                                 |
Data Layer:       (Raw ADO.NET returning         ("Full cream" ORM like NHibernate)
                   arrays, DataSets,
                   simple classes)

I used to think that this was cheating, subverting the OO object model. But in a practical sense, as long as you use this shortcut only for displaying data, I think it's all right. The updates/inserts and what have you still go through the fully-hydrated, ORM-filled domain model, and that's something that happens far less frequently (in most of my cases) than displaying particular subsets of the data. ORMs like NHibernate will let you do projections, but by that point I just don't see the point of the ORM. This will probably be a stored procedure anyway, and writing the ADO.NET takes two seconds.
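The split between the two paths can be sketched roughly like this. Again it is Python with sqlite3 standing in for ADO.NET and NHibernate, and BrandDocumentService, BrandRepository, BrandMenuItem, and create_schema are hypothetical names for illustration, not the interfaces from my real code.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, NamedTuple


class BrandMenuItem(NamedTuple):
    """Simple read-only row for display; not a domain object."""
    brand_id: int
    brand_name: str


@dataclass
class Brand:
    """Fully hydrated domain object used for updates and inserts."""
    brand_id: int
    brand_name: str
    description: str


def create_schema(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Brand ("
        "BrandId INTEGER PRIMARY KEY, BrandName TEXT NOT NULL, Description TEXT NOT NULL)"
    )


class BrandDocumentService:
    """Read path: hand-written SQL that fetches only what the page displays."""

    def __init__(self, conn):
        self._conn = conn

    def list_menu(self) -> List[BrandMenuItem]:
        cursor = self._conn.execute(
            "SELECT BrandId, BrandName FROM Brand ORDER BY BrandName"
        )
        return [BrandMenuItem(*row) for row in cursor]


class BrandRepository:
    """Write path: stand-in for the ORM, always working with whole Brand objects."""

    def __init__(self, conn):
        self._conn = conn

    def load(self, brand_id: int) -> Brand:
        row = self._conn.execute(
            "SELECT BrandId, BrandName, Description FROM Brand WHERE BrandId = ?",
            (brand_id,),
        ).fetchone()
        if row is None:
            raise KeyError(brand_id)
        return Brand(*row)

    def save(self, brand: Brand) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO Brand (BrandId, BrandName, Description) VALUES (?, ?, ?)",
            (brand.brand_id, brand.brand_name, brand.description),
        )
        self._conn.commit()
```

The menu page would call list_menu and never see a Description, while an edit screen would go through load and save with the complete object, so there is never a half-populated Brand floating around.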

This is just my two cents. I look forward to reading some of the other responses.
