ORM vs. Direct SQL

ORM, or Object-Relational Mapping, is a technique for interacting with a relational database without directly writing SQL statements. Is that a handy shorthand that makes coding easier? Or just another layer that gets in the way? Let's explore both possibilities.

ORM, or Object-Relational Mapping, is a technique for interacting with a relational database without directly writing SQL statements. Is that a handy shorthand that makes coding easier? Or just another layer that gets in the way? Let’s explore both possibilities.

How ORM Works

ORM’s provide classes with methods that perform database-access functions. To use the ORM, the application codes its own classes that extend the base ORM “Table” class. For instance, given a users table with ID, username, first_name, last_name, email, password, create_date and maintenance_date, a typical application would define a matching UserTable class with those properties. It would extend the ORM class, and inherit its pre-defined methods to retrieve, insert, update, or delete table rows using those properties. Then, instead of coding:

SELECT * from users;

a calling program would instantiate a UserTable object ($users = new UserTable) and call a method like:

$users->find('all');

The returned object would contain an array of user objects, one for each row that meets the selection criteria (in this case, ‘all’).

The UserTable class would probably define validation rules to be automatically evaluated before rows are added or modified. It may also define default behaviors like timestamps on updates, or calculated pseudo-columns on retrieval.

The UserTable class would also have the option of defining any foreign-key relationships. Once the class knows that users has child rows in articles, based on the user_id foreign key, future retrievals of a user row can be asked to include the related articles in the returned object. The find() method will know how to form the JOIN clause without each caller having to provide that information.

PROS

ORM allows database manipulation to behave like the rest of the object-oriented world.

The functionality is compartmentalized and DRY (“Don’t Repeat Yourself”). For instance, the actual SQL to:

INSERT into users
(username, first_name, last_name, email, password, create_date, maintenance_date)
VALUES
('johndoe', 'John', 'Doe', 'john@doe.com', 'mypassword','2022-07-20','2022-07-21');

is segregated into a method within the class. The calling program need only instantiate and populate an instance of the UserTable class as $user, then call the method:

$user->save();

The mechanics of the actual database INSERT (or UPDATE if the object’s primary key already exists) are transparent to the caller. This extends even to the point of being agnostic as to the DBMS: The class translates the object’s properties into the correct-syntax SQL statement for the DBMS currently in use. Changing from MySQL to Postgres to Microsoft SQL Server does not require modification of any of the calling programs; the class handles any changes needed.

ORM enforces standardization.

Equivalent SQL queries done in multiple places throughout the application are guaranteed to behave the same. Required functions are stored in the class and run transparently to the calling program .

  • An example on a database INSERT or UPDATE would be automatically generating timestamp columns so the stamps don’t have to be maintained by each caller.
  • An example on a database SELECT would be calculating a “full_name” property by concatenating the table’s “first_name" plus "last_name“. “full_name” would be automatically available to all callers with no additional code, as if it actually existed in the table.

ORM allows enforcement of database relationships and validation rules.

The ORM class defines required fields, field values, and foreign-key relationships. It ensures that all data is correct before issuing the SQL command to the database. If there are errors, it can return “pretty” feedback to the caller for correction or other exception handling. This relieves the caller of the tedious coding to individually validate each field.

ORM allows in-code documentation.

Since all of the foreign-key relationships, automatic functionality, computed properties, and validation rules are defined in the class, it becomes the one-stop-shop for documentation of everything that needs to be known about that table.

CONS

ORM adds a layer of performance vulnerability.

The ORM code 1) translates calls to its methods into SQL syntax, 2) calls the database to execute those commands and receive the results, then 3) translates the results into a return to the caller. It is acting as a middleman. Wouldn’t it be more efficient for the caller to simply execute the SQL command itself? What is the trade-off between performance cost vs. convenience and other benefits?

How much overhead the ORM is adding will vary depending on the exact functionality in question. It will also vary from one ORM to another: How much bloat is there, coding for options that are not used? What is the quality of coding technique in that particular ORM?

ORM adds a layer of learning curve.

The developer not only must know what SQL results they want; they must also know how to code the ORM call so that it generates that desired SQL. They need to be aware of what methods the ORM makes available, and how to use them. Even deciding whether or not to handle a function via the ORM requires knowing that a method is available for the feature in question. If you don’t know that the method exists, then you can’t use it! 🙂

ORM effectively restricts functionality outside of the application.

For instance, the handy computed full_name pseudo-column mentioned above is suddenly non-existent when the table is called in direct SQL. If there is reason for another application — even a SQL GUI like MySQL Workbench — to read that table, it is stuck with only the columns that are really there. To a developer who has grown used to always having access to the virtual columns, that could be disconcerting at minimum, possibly even crippling in some situations.

By the same token, table updates done via an outside source will not have any of the automatically-generated columns like timestamps. The resulting inconsistent data stored may cause “unpredictable results” later.

BTW, even within the application, there are limits to those pseudo-columns. For instance, that full_name can’t be used to sort the query, because it doesn’t exist to be formed into the actual SQL SELECT statement. Instead, the ORM query needs to be defined to sort on the detail columns, last_name and first_name. But… That now requires the calling program to know the detail columns, which it may otherwise be able to happily ignore.

ORM adds a layer of room for error.

Since the ORM is itself code that is subject to bugs or unexpected results, it becomes one more place to look when problems arise. It has its own set of included libraries bringing their own potential for upgrades, incompatibilities, and the other fun ramifications of our complex development environment.

These issues get progressively worse as the needed SQL becomes more complex.

At some point, a SQL requirement will go beyond what the ORM is capable of handling. A five-table JOIN with nested sub-queries and selection criteria that is a mix of AND and OR against three different child/grandchild tables is not easy to write even in direct SQL. It can become much worse to try telling ORM how to produce such a complex query.

I once gave up and reverted to direct SQL for a query that originally seemed fairly simple: Select if the record was active during the date range, based on start/end dates within the record. That needed to select:

WHERE start_date <= [criteria start date]
AND (stop_date is null OR
     stop_date >= [criteria end date])

I tried, but could NOT figure out how to tell the ORM to build that WHERE clause, with its mix of AND/OR and parentheses.

Note: I reverted to direct SQL for this query, while using ORM throughout the rest of the application. That can solve the ORM-or-not decision, at the expense of now having inconsistent coding techniques within the same application.

If the ORM does allow for complex situations like these, then it necessarily is itself more complex, therefore more difficult to learn and more vulnerable to bugs.

Conclusion

Consider ORM if your application:

  • Is straightforward SQL, with a minimum of “spider-web” relationships;
  • Has repetitive use of the same simple-to-medium SQL queries (not necessarily the exact table, but the same pattern: single record by key, set of JOINed parent/child records, etc.);
  • Has regular need for computed columns, automatically-populated columns, or complex validation;
  • Has little or no need for alternate-application access to the same database tables;
  • May be implemented on more than one DBMS.

Be cautious if your application:

  • Has multiple unique or complex queries;
  • Has very high-volume database access;
  • Will need regular access to the same database functionality outside of the application;
  • Will be written or maintained by developers with less experience using frameworks or object-oriented programming.

ORM can be a very useful tool for documenting and standardizing your database processes. But, as in so many things in life, the very features that are positives in most cases can become negatives when taken to extremes. The tipping point that defines “extreme” is an individual choice that will have to be determined for each application.

Leave a Reply

Your email address will not be published.

Join our Mailing list!

Get all latest news, exclusive deals and service updates.