Gal Ratner
Gal Ratner is a techie who lives and works in Los Angeles.
To O/R Map or not to O/R Map - The Data Layer debate

I’ve recently been faced with the age-old question: should I O/R map, or use a traditional custom data layer along with stored procedures?  There are many opinions and articles on the subject, and I’m sure that if you ask two architects you will probably get three answers.  I guess the real question is, “What are the main use cases for the system you are trying to build?”  In this article I will try to break this subject down into its main points.  It is a complicated issue that is best discussed in academia, since in the business world everything is measured in dollars and cents, and stakeholders are not always interested in what’s under the hood of their business-driven app as long as it meets their business needs.  Let’s first talk a little about object-relational mapping and O/R mappers: what they are and what they do.

The main purpose of O/R mappers is to bring relational data into objects.  Relational databases store data in a format designed for fast access and modification.  They do not concern themselves with objects, and when we design a relational database we usually don’t pay much attention to use cases; instead we design our table structure for database optimization.  Our application, however, is built on objects.  Objects represent the logical units of data we derive from our use cases.  For example, an object can be a customer composed of a first name, a last name, and a list of addresses, where an address is another object.  In the database, however, this data can be spread across multiple tables, including look-up tables for states or territories, indexes, primary keys, and so on.  Something needs to shape the data retrieved from the database into the application’s objects.  O/R mappers fill this role: they turn relational data into objects and assist with CRUD operations.
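As a minimal sketch (all class and property names here are just illustrative), the customer example above might map to objects like these, which a mapper would populate by joining the customer, address, and state look-up tables:

```csharp
using System.Collections.Generic;

// Illustrative object shape an O/R mapper might produce. The mapper is
// responsible for composing these objects from several tables
// (Customers, Addresses, a States look-up table, and so on).
public class Customer
{
    public int CustomerId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Typically populated by joining the Addresses table.
    public List<Address> Addresses { get; set; }
}

public class Address
{
    public string Street { get; set; }
    public string City { get; set; }

    // Stored in the database as a key into a states/territories
    // look-up table; the mapper resolves it to a readable value.
    public string State { get; set; }
}
```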

Development Time

Developing with O/R mappers is usually a little faster than building a custom data layer.  The lack of stored procedures and the ability to reverse engineer the database into application classes can help cut down development time.  Of course, not all O/R mappers are created equal.  Some do not have a UI and require manually creating classes and mapping files.  This can actually prolong development, especially when the database does not exist yet.  Development time can be prolonged further by database schema changes, which will usually require you to regenerate the entire data layer even for minor revisions.  Creating a custom data layer will usually require a larger initial effort in coding our data objects, but it allows for minor revisions as our database schema changes.  Since the stored procedures are in charge of the output from the database, we can simply adjust them in order to preserve the shape of the application’s objects.
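As a rough sketch of what that hand-coded approach looks like (the procedure name usp_GetCustomer and the Customer class are placeholders), a custom data layer method typically wraps a stored procedure call:

```csharp
using System.Data;
using System.Data.SqlClient;

public static class CustomerData
{
    // Hand-coded data access. If the schema changes, the stored
    // procedure can be adjusted to keep returning the same columns,
    // so this method and its callers stay untouched.
    public static Customer GetCustomer(string connectionString, int customerId)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand("usp_GetCustomer", connection))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddWithValue("@CustomerId", customerId);

            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                if (!reader.Read())
                    return null;

                return new Customer
                {
                    CustomerId = (int)reader["CustomerId"],
                    FirstName = (string)reader["FirstName"],
                    LastName = (string)reader["LastName"]
                };
            }
        }
    }
}
```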

Maintainability

Since development never ends and new features are constantly added, regenerating the O/R mapper’s files becomes a recurring chore, and it can be pretty tedious.  When a whole development team is involved, it can lead to code-merging issues due to the large number of generated files constantly being modified and going out of sync.  In a traditional data layer, the only files being synced are the ones modified for specific use cases.

So What About the Architecture? 

A lot has been said about multi-tiered programming and the need for loose coupling.  Coupling is the degree of dependence between tiers, and loose coupling means that changes in one layer do not cascade into other layers.  A good rule of thumb for a properly designed, loosely coupled system is the two-dot rule.  If you need to dot your way around the application, it usually means that the parent object is “too aware” of what its child objects look like and what they do.  If you can pinpoint a sub-method inside a child of a child, it usually means that changes in that object will cascade up into the parent.  For example, myobject.mychildobject.mygrandchildobject.method indicates that if we ever change “method” we will have to change “myobject” as well (see the sketch below).  This can lead to a maintainability nightmare and will basically require full integration testing before releasing any change to production.

How does this relate to the data layer vs. O/R mapper debate?  Simply put, since we treat our objects as a given, we tend to assume their methods and properties are there all across the application tiers.  So if we make a schema change and regenerate our data layer, a removed field cascades across all the application tiers.  With a traditional data layer, we can better control changes in the data objects and fill any necessary voids at the database level using the stored procedures.  This is more significant than you might think.  The ability to be flexible is important in each layer of our application, and even though we as programmers don’t tend to think about it much, the database is certainly one of the tiers.  The ability to loosely couple the database is key because it gives us the flexibility to maintain an efficient database schema design without cascading changes to our data layer.
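Here is a minimal sketch of the two-dot rule (all class and member names are hypothetical, purely for illustration):

```csharp
// Hypothetical objects illustrating the two-dot rule.
public class StateInfo
{
    public string Abbreviation { get; set; }
}

public class CustomerAddress
{
    public StateInfo State { get; set; }
}

public class Customer
{
    public CustomerAddress PrimaryAddress { get; set; }

    // Loosely coupled alternative: the parent answers the question
    // itself and hides how its children are arranged internally.
    public string GetStateAbbreviation()
    {
        return PrimaryAddress.State.Abbreviation;
    }
}

public class ShippingLabel
{
    // Too many dots: this class now depends on the internal shape of
    // Customer, CustomerAddress, and StateInfo. Renaming Abbreviation
    // forces a change here as well.
    public string StateLineTightlyCoupled(Customer customer)
    {
        return customer.PrimaryAddress.State.Abbreviation;
    }

    // Only one dot into the child: changes to the address or state
    // objects stay contained inside Customer.
    public string StateLineLooselyCoupled(Customer customer)
    {
        return customer.GetStateAbbreviation();
    }
}
```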

Pluggable Database Types

Some O/R mappers are designed with templates that work against different database types.  This kind of pluggable architecture allows for switching between database types such as SQL Server and Oracle; however, the benefit is largely theoretical, since I have never seen a real system without a single stored procedure.  That diminishes the value of pluggable data layers, because we still have to manually convert T-SQL to PL/SQL and back.
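For reference, this is the provider-neutral pattern such pluggable templates generally build on, sketched here with ADO.NET’s own factory model. The provider name string is the only database-specific piece; the stored procedures themselves still need hand conversion:

```csharp
using System.Data.Common;

public static class ProviderNeutralData
{
    // Creates a connection for whichever ADO.NET provider is named,
    // e.g. "System.Data.SqlClient" for SQL Server or
    // "System.Data.OracleClient" for Oracle.
    public static DbConnection CreateConnection(string providerName, string connectionString)
    {
        DbProviderFactory factory = DbProviderFactories.GetFactory(providerName);
        DbConnection connection = factory.CreateConnection();
        connection.ConnectionString = connectionString;
        return connection;
    }
}
```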

Data Validation and Dependency Validation

Data validation is a part of any application.  There are several viewpoints regarding where in the application validation should take place.  I personally always take the approach of validating in the UI, in the data layer, and at the stored procedure level.  Some O/R mappers add validation in the data objects themselves.  I tend to think this fills the gap left by having no stored procedures; however, it can get pretty heavy at the data layer level and can make the generated code unreadable (assuming you actually read the generated classes).  In this area both approaches are pretty much the same.  There are really no major benefits to either side, and it’s really a matter of preference.
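As a small sketch of data-layer validation (a hypothetical class, mirroring the kind of checks some mappers generate into their objects):

```csharp
using System;

public class CustomerEntity
{
    private string firstName;

    // Data-layer validation: even if a UI check is bypassed, bad data
    // never reaches the database. Some O/R mappers generate checks
    // like this directly into their classes.
    public string FirstName
    {
        get { return firstName; }
        set
        {
            if (string.IsNullOrEmpty(value))
                throw new ArgumentException("First name is required.");
            if (value.Length > 50)
                throw new ArgumentException("First name cannot exceed 50 characters.");
            firstName = value;
        }
    }
}
```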

Passing Through Layers, Serialization, and WCF

If you write any type of enterprise system, you probably need to pass data along multiple layers, some of which may require object serialization or data contracts.  I often find my objects playing a double role: as contracts passing through WCF services and as a grid-binding data source.  This double role is pretty easy to set up on custom objects; all you need are a few attributes on each property.  Some O/R mappers allow you to write your own classes or to predefine templates where you can add data member attributes and specify exactly which objects will be serialized and under what names.  This type of performance tuning is critical, especially in the Web 2.0 world and in service-oriented architectures.  Defining these attributes in the mapper can be very time consuming and requires intimate knowledge of your selected mapper.  Failing to properly design serialization attributes can result in poor performance and degrade the user experience.
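A minimal sketch of such a double-role object using WCF’s standard data contract attributes (the class itself is hypothetical):

```csharp
using System.Runtime.Serialization;

// Serves both as a WCF data contract and as a grid-binding source.
[DataContract(Name = "Customer")]
public class CustomerDto
{
    [DataMember(Name = "firstName", Order = 1)]
    public string FirstName { get; set; }

    [DataMember(Name = "lastName", Order = 2)]
    public string LastName { get; set; }

    // No [DataMember] attribute: this property never crosses the
    // service boundary, keeping the serialized payload lean, but a
    // grid can still bind to it locally.
    public string InternalNotes { get; set; }
}
```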

How LINQ Changed Everything

You’re probably pretty familiar with LINQ, so there’s no need to go into what it does; if you are interested, I’m sure you can Google it.  Microsoft has two LINQ-supported relational-data-to-objects technologies.  The first and more mature one is LINQ to SQL.  It cannot fully be considered an O/R mapper, since there is no layer of abstraction: the database tables are the application objects.  You cannot map selected columns to properties; the entire context is pretty much mapped for you.  The second one is the Entity Framework.  There are mixed reviews of Entity Framework 1.0, and all I can say after working with it is that most people expect from it things it simply wasn’t designed to do.  The next version promises to be much more comprehensive and will be released as part of .NET 4.0 and Visual Studio 2010.  Other O/R mappers such as LLBLGEN fully support LINQ and do a very nice job of it.

Once you get used to working and thinking in LINQ, you will never switch back.  The amount of code it saves is simply unbelievable.  It’s a new way of thinking about data, and the ability to use the same code against in-memory data, relational data, XML, entities, and REST is nothing short of impressive.  The underlying implementation usually does a better job than any hand-written loop.  Unfortunately, LINQ’s biggest feature is also its Achilles’ heel.  Since it’s so easy to work with data, we tend to cram a lot of business logic into LINQ, and that logic tends to cascade down to the database in the form of highly complicated, not always optimized SQL.  When we do that, we are unknowingly shifting the logic of our app into the database, and even though we don’t tend to think about it, that can end up costing us in performance.  I subscribe to the methodology that databases are for CRUD operations and business layers are for data processing.  I’ve seen too many cases where complicated stored procedures caused table locks and application hangs.
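As an illustration of that trap, this fragment assumes a hypothetical LINQ to SQL context named db with an Orders table and a cutoff date variable; all the field names are made up:

```csharp
// Business logic crammed into the query: the grouping, sums, and
// classification below are all translated into one large SQL statement
// and executed inside the database.
var riskyCustomers =
    from order in db.Orders
    group order by order.CustomerId into g
    where g.Sum(o => o.Total) > 10000
       && g.Count(o => o.Status == "Returned") > g.Count() / 2
    select new { CustomerId = g.Key, Volume = g.Sum(o => o.Total) };

// Keeping the database to CRUD: fetch the raw rows with a simple
// query, then run the business rules in memory, in the business layer.
var recentOrders = db.Orders.Where(o => o.OrderDate > cutoff).ToList();
var riskyInMemory =
    recentOrders
        .GroupBy(o => o.CustomerId)
        .Where(g => g.Sum(o => o.Total) > 10000
                 && g.Count(o => o.Status == "Returned") > g.Count() / 2)
        .Select(g => new { CustomerId = g.Key, Volume = g.Sum(o => o.Total) });
```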

Performance

This is the biggest “if” of O/R mappers.  The best generated SQL is simply not as good as a well-tuned stored procedure.  Yes, I know, you might be thinking, “Hey!  I can fine-tune my code to generate a really good query,” but I would argue that in the time it takes you to coax the generator into creating an optimized query, I could have already written the same query as a stored procedure and coded an object for it.  Most programmers don’t bother to optimize their generated queries at all; just compare an average bank of generated queries to the stored procedures already compiled in your database and you will see the difference.  In addition, you have to be constantly on guard when coding any data operation, as any overly complex LINQ statement will push too much logic into the database, as mentioned in the paragraph above.


Dynamic Data Services

REST-driven services are under active development, and version 1 already shipped with .NET 3.5 SP1.  The beautiful thing about dynamic data services is that they plug into the Entity Framework almost seamlessly and have hooks into LINQ to SQL.  Yes, you can implement some interfaces for custom collections; however, if you are going to use REST, I highly suggest you build a small Entity Framework mini data layer containing only the objects you plan to expose.
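A minimal sketch of that setup from the .NET 3.5 SP1 era, using ADO.NET Data Services (the CatalogEntities context name is hypothetical; the idea is that it contains only the entity sets meant to be exposed):

```csharp
using System.Data.Services;

// Exposes a small Entity Framework context over REST.
public class CatalogService : DataService<CatalogEntities>
{
    public static void InitializeService(IDataServiceConfiguration config)
    {
        // Expose the Products set read-only; anything not explicitly
        // granted here stays invisible to REST clients.
        config.SetEntitySetAccessRule("Products", EntitySetRights.AllRead);
    }
}
```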

Some Thoughts

I once heard an interview with Rockford Lhotka about his take on objects, and as the inventor of CSLA.NET he knows a thing or two about OOP.  He is also responsible for many of my gray hairs, but that’s a different story.  He said something I did not agree with at first, but I now fully understand what he meant and can identify with it.  He said, “It’s not an object unless I coded it.”  At first I wondered, “Hey, wait a minute… an object is an object, right?”  But then I came to realize that I expect certain things from my objects, mainly that they will be there for me, properties and all.  If I rely on a generator to create them and assume the properties will always be there, I am taking a big leap of faith.  Generators can change from version to version, and database schemas can change too.  It is very easy, maybe too easy, to alter generated classes simply by dragging and dropping, knocking down multiple layers of hand-coded logic in the process.  In that case the programmer simply cannot rely on the foundation of the app to be there for him or her.

Conclusion

So what should you use, O/R mappers or traditional data layers?  Well, it’s up to you.  I personally like to use O/R mappers for small projects where performance doesn’t really matter, and traditional data layers for enterprise and high-end systems.  I sometimes even mix and match: a traditional data layer for the parts of the app that I know will need high performance, and the Entity Framework for other parts, for example a CMS or REST services.

What’s coming next in O/R mappers?

Entity Framework 4.0 promises more features and will finally be a real challenge to LLBLGEN.  It might be worth a second look before you choose your next mapper.

Terminology

O/R Mapper – Object-Relational Mapper
CRUD – Create, Read, Update, and Delete
REST – Representational State Transfer
CMS – Content Management System
WCF – Windows Communication Foundation


Posted 9 Jul 2009 6:04 AM by Gal Ratner
