Database migration tips in Rails - Part 1

Database migrations are an everyday task and as such are not generally considered as exciting as software architecture or any other hot topic in the industry. Yet, they are essential in any applications that use databases!

Ruby on Rails handles a lot of problems with migrations for us. However, there are still some details everyone must remember to avoid troubles. That’s why I would like to share with you some simple tips on how to improve your database migrations, which hopefully, you will find very useful and powerful!

Here are the tips you should follow:

1. Redefine ActiveRecord classes

Often, we need to refer to ActiveRecord Models inside migrations, for example, to update some fields. So, let's imagine we want to introduce a new :status column to the Users table and update Users with the "active" :status.

It might look like this...

    class AddStatusToUser < ActiveRecord::Migration[5.1] ❌
  def up
    add_column :users, :status, :string
    User.update_all(status: "active")
  end

  def down
    remove_column :users, :status, :string
  end
end

But there’s a better way to do this by stubbing out the model in the migrations. It gives us two main advantages. First of all, it guards against the case where a model is removed from the codebase but is still being called in a migration. Secondly, it prevents validations from being run, as well as eliminates the associations overhead.

The improved version would look like this:

2. Irreversible migrations

Always make sure your migration is reversible (you should be able to run rails db:rollback on it). Rails has a #change method that is primarily used in writing migrations. It works for most of the cases in which Active Record knows how to reverse a migration automatically.

However, if you modify the data, for example, update the :status column of User, Rails won’t know how to rollback that migration. Hence, never use the #change method in migration in such a scenario and always use #up/#down methods, where #down raises ActiveRecord::IrreversibleMigration or implements logic to reverse change introduced in #up.

    class AddStatusToUser < ActiveRecord::Migration[5.1] ❌
  def change
    User.update_all(status: "active")
  end
end
class AddStatusToUser < ActiveRecord::Migration [5.1] ✅
  class User < ActiveRecord::Base
  end

  def up
    User.update_all(status: "active")
  end

  def down
     raise ActiveRecord::IrreverisibleMigration
  end
end
class AddStatusToUser < ActiveRecord::Migration [5.1] ✅
  class User < ActiveRecord::Base
  end

  def up
    add_column : users, : status, :string
    User. update_all(status: "active")
  end

  def down 
    remove_column :users, :status, :string
  end
end

Alternatively, you can use the #reversible block/helper in the #change method as shown in the example below:

    class AddStatusToUser < ActiveRecord::Migration[5.1] ✅    
class User < ActiveRecord::Base
end
def change

    reversible 
do
|dir|

      dir.up 
do
 
        User.update_all/
status:
"active"
)
       
end

       dir.down 
do

         raise ActiveRecord::IrreverisibleMigration
       
end
end
end
end

Active Record has a class (ActiveRecord::Migration::CommandRecorder) that records commands done during migration and knows how to reverse them. Currently, there are around 27 commands and most of them are reversible like #create_column or #add_column. Examples of a non-reversible may be #execute, #execute_block, or #transaction. If you would like to find out more, please visit https://api.rubyonrails.org/classes/ActiveRecord/Migration/CommandRecorder.html.

3. Clean up old migrations

It might be a good idea to clean up old migrations in case of the circumstance when you can’t run all migrations (rails db:migrate) on the fresh database. Otherwise, it might not be necessary to do so as there aren’t many benefits besides fewer files in db/migrations.

In GAT, we only did it once because of old, poorly written migrations. Cleaning up legacy migrations is fairly simple; you just need to create one migration file that has the current state of the database. In other words, a copy of your schema.rb. Now you can forget about all the troubles when running migrations on a fresh database:

    class InitSchema < ActiveRecord::Migration[5.1]
def up

    create_table 
"user"
do
|t|

      t.string 
"first_name"
, 
null:
false

      t.string 
"last_name"
, 
null:
false

      t.datetime 
"created_at"
, 
null:
false

      t.datetime 
"updated_at"
, 
null:
false
end
# etc…
end
def down

    raise ActiveRecord::IrreversibleMigration, 
"The initial migration is not reversible"
end
end

4. Bulk migrations

Let's say that we need to add :first_name, :last_name, :address, :phone_number to the Users table.

Instead of adding each field separately we can do it in bulk mode and add them at once. This way the query will be executed once instead of four times, and in the database world, executing a single query is much faster than executing four individual queries one by one even though their results are the same.

Compare these two examples:

    Class AddFieldsToUser < ActiveRecord::Migration[5.1] ❌
  def change
    add_column :users, :first_name, :string
    add_column :users, :last_name, :string
    add_column :users, :address, :string
    add_column :users, :phone_number, :string
  end
end
Class AddFieldsToUser < ActiveRecord::Migration[5.1] ✅
  def change
    change_table :users, bulk: true do |t|
      t.string :first_name
      t.string :last_name
      t.string :address
      t.string :phone_number 
   end
end

In general, it’s 2-3 times faster, which is quite a time saving, especially with a large database, given how simple the improvement is.

5. Handle complex data migrations in temporary rake tasks

By definition, Rails migrations should only be used for schema changes and not for actual data changes in the database. It might be a good idea to handle complex data migrations in temporary rake tasks. Given that, we decouple our data changes from the deployment. Furthermore, data migrations don’t follow business logic so shouldn’t stay forever in the codebase.

That said, it’s not the worst idea to add a little data change after adding a new column, especially if you stick to tip #1 from this article or use the Database migration tips in Rails. Part 2 - Strong migrations. It shouldn’t cause you any trouble and there are also cool gems like after_party and maitenence_task that you might find helpful.

    class AddStatusTouser < ActiveRecord::Migration[5.1] ❌
  def change
    User.update_all(status: "active")
  end
end

namespace :data_migration do ✅
  desc "Sets the default user status"
  task set user status: :environment do
    puts "Setting up default users status" 

    User.find_each do |user|
      begin
        user.status = 'active'
        user.save!
      rescue
        puts "Error updating #{user.id}"
      end
    end

    puts "All Done! 🎉”
  end
end

I consider the code below fine as well, as long as we keep in mind tip #1 from this article, and we know that migration won’t take too much time:

    class AddStatusToUser < ActiveRecord::Migration[5.1]
class User < ActiveRecord::Base
end
def up

    add_column 
:users
, 
:status
, 
:string

    User.update_all(
status:
"active"
)
  
end
def down

    remove_column 
:users
, 
:status
, 
:string
end
end

6. Get familiar with Active Record

When dealing with large datasets, be aware of what SQL query Active Record generates under the hood. You can run #to_sql on the Active Record query to do so. It may turn out that your query is not performant enough, and even though it was fine on the dev environment, it takes too much time to be executed on production.

Here’s a short list of Active Record tips that might help you with trying to speed up your query:

Use #find_each over #each - > It reduces memory consumption as it’s more efficient than looping through a collection because it doesn’t instantiate all the objects at once. Instead, it allows us to work with individual records.
Use #update_all/#delete_all instead of #update/#delete -> It will update/delete all records with one SQL query which is far more performant than doing it one by one.
Use #insert_all/#upsert_all -> It inserts multiple records in a single SQL INSERT statement without instantiating models nor triggering ActiveRecord’s callbacks.
Watch out for #destroy/#destroy_all -> Remember that destroying might be an expensive operation as it instantiates, executes callbacks, and deletes each record.

Summary

Migrations are essential in database applications. Some frameworks, like Ruby on Rails, will do a lot for us under the hood, but we shouldn’t forget how important migrations are. As such, it’s worth still giving them plenty of attention.

Hopefully, these tips will prove helpful and easy to apply to your project, but we’re not done yet! In part two we’ll take a look at strong migrations and see how we can make our migrations even stronger!