Sundered Peak

through the mind of kyle tolle

Manually Update Postgres Timestamps In Rails

Let’s say I need to set a boolean field on 100,000 records at the same time. And I’m using Rails 4.x, just to clarify.

The simple way to do this is

unprocessed_records.each do |unprocessed_record|
  unprocessed_record.processed = true
  unprocessed_record.save!
end

But this results in 100,000 ActiveRecord objects instantiated and held in memory at the same time. Since all I’m doing is updating an attribute, this is not efficient in memory or time.

A more efficient way to update all these records at once is #update_all.

unprocessed_records.update_all(processed: true)

The benefit is that it doesn’t create an ActiveRecord instance per database record. It also skips validations and callbacks from the model, so the database update is faster.

One thing ActiveRecord does give us is the timestamps. Each time we #save a record, its updated_at is modified. Using #update_all, we lose that. In the case of a database migration, we might not want the timestamps updated, but, in this use case, we certainly do.

Additionally, #update_all inserts values into the database without passing them through ActiveRecord’s typecasting. So it’s important to use the correct value, since ActiveRecord won’t do anything magical for us.

So how can we use #update_all and still ensure the timestamps are kept current?

In searching, I came across this StackOverflow question from 2013. The accepted answer mentions that you can pass #update_all a value of updated_at equal to DateTime.now. But I wanted to be sure this solution would still work today.

In the Rails console, I found the value which would be put into the database.

> DateTime.now.to_s
=> "2015-04-27T18:20:28-06:00"

Next, I had to check whether this value could be used for our column type, since ActiveRecord won’t be typecasting it.

My database is Postgres, and for created_at and updated_at we use the column type timestamp [without time zone].

The output in the Rails console above clearly has the timezone information, so I needed to check whether that’s acceptable for Postgres or I’d have to munge the value.

Postgres’ documentation on date/time types has the following:

Note: The SQL standard requires that writing just timestamp be equivalent to timestamp without time zone, and PostgreSQL honors that behavior. timestamptz is accepted as an abbreviation for timestamp with time zone; this is a PostgreSQL extension.

This lead me to think I could use the value of DateTime.now.to_s and be fine. I tested it out from the command line, and verified it worked. The timestamp had the correct value.


So, when you need to manually update timestamps in Postgres using Rails, use DateTime.now.to_s.

Keep this in mind when you need to update many records along with their timestamps. We don’t get all of Rails’ magic, but it will save the server a lot of time and memory.


Update: My co-worker Zac McCormick asked a good question. If we need to store the time as UTC, does Postgres convert non-UTC times to UTC time before it strips the timezone?

The answer appears to be no. This SO answer tells us timestamp [without time zone ignores a timezone modifier if we should add one. The time zone is assumed to be that of the time zone setting.

If your server’s timezome is not set to UTC, and you want to store UTC timestamps, you’ll have to convert the time yourself. This other SO question has a good answer.

> DateTime.now.to_s
=> "2015-04-27T19:07:55-06:00"
> DateTime.now.new_offset(Rational(0,24)).to_s
=> "2015-04-28T01:07:56+00:00"