LightRecord

ActiveRecord extension that speeds up allocation of ActiveRecord objects and can fetch DB rows sequentially to reduce memory usage.

Supports: Rails 5, 6 and 7

Note: it overrides some internal methods of ActiveRecord, and in some rare cases it may not work correctly. Please test it well in your project before using it in production.

How it works

It loads ActiveRecord records with patched attribute-related methods. This makes the AR objects read-only, but results in up to 5 times fewer object allocations.

Each time you retrieve objects via .light_records, it creates an anonymous class to work with the given set of attributes.

  LightRecord Extension Class
              ↓
        Your AR Model
              ↓
      ActiveRecord::Base

Installation

gem 'light_record', github: 'paxa/light_record'

scope.light_records

records = User.limit(1_000_000).light_records
records # => array of records. Very fast and very memory efficient

The idea is to skip all the magic related to attributes and object initialization. It creates a new class inherited from your model, which means only one extra object is created when a new record is initialized.

Simplified, it becomes something like this:

class User_light_record < User
  def initialize(attributes)
    @attributes = attributes # hash of data "as is" from database library
  end

  def email
    @attributes[:email]
  end
end
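The pseudocode above can be tried out in plain Ruby without ActiveRecord. The following is a minimal sketch of the general technique (one generated subclass per set of columns, with readers that look straight into the row hash). The helper build_light_class and the constant LightUser are hypothetical names used only for illustration, and the gem's actual implementation may differ.

```ruby
# Plain-Ruby sketch; no ActiveRecord involved.
class User
  # Stand-in for a regular model class.
end

# Hypothetical helper: builds one subclass of `base` for a given column list.
def build_light_class(base, columns)
  Class.new(base) do
    # Keep the raw row hash as-is; no per-attribute objects are created.
    define_method(:initialize) do |attributes|
      @attributes = attributes
    end

    # One cheap reader per column.
    columns.each do |column|
      define_method(column) { @attributes[column] }
    end
  end
end

LightUser = build_light_class(User, [:id, :email])
record = LightUser.new({ id: 1, email: "a@example.com" })

record.email        # read directly from the row hash
record.is_a?(User)  # the light class still inherits from the model
```

Because the generated class inherits from the model, instance methods defined on the model remain callable, as long as they only rely on the attribute readers.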

scope.light_records_each

Another method is .light_records_each. It uses the stream: true feature of the mysql2 client, so objects are initialized one by one on every iteration:

User.limit(1_000_000).light_records_each do |user|
  user.do_something
end

This allows you to iterate over a large amount of data without using find_each or find_in_batches, because light_records_each uses very little memory. Alternatively, it allows you to use find_in_batches with a bigger batch size.
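To illustrate why streaming keeps memory low, here is a plain-Ruby sketch with no database involved. FakeStreamingResult, LightRow and light_each are hypothetical stand-ins, not part of the gem or of mysql2. The point is only that each row is wrapped and handed to the block as it arrives, so the full result set is never collected into an array.

```ruby
# Hypothetical stand-in for a streaming result: rows are produced lazily.
class FakeStreamingResult
  def initialize(row_count)
    @row_count = row_count
  end

  def each
    return enum_for(:each) unless block_given?

    @row_count.times do |i|
      yield({ id: i + 1, email: "user#{i + 1}@example.com" })
    end
  end
end

# Minimal record wrapper around one raw row hash.
LightRow = Struct.new(:attributes) do
  def id
    attributes[:id]
  end
end

# Wrap and yield one row at a time; nothing is accumulated.
def light_each(result)
  result.each { |row| yield LightRow.new(row) }
end

seen = 0
last_id = nil
light_each(FakeStreamingResult.new(1_000)) do |user|
  seen += 1
  last_id = user.id
end
```

Each LightRow becomes garbage as soon as the block returns, which is what lets the iteration run in roughly constant memory regardless of the row count.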

* Please note that time values will be Ruby Time objects instead of TimeWithZone. To convert one to the correct time zone, call:

record.created_at.in_time_zone(Time.zone)

Benchmarks

Still in progress, but I have tried it in some projects and it gave a 3-5 times improvement and 2-3 times less memory usage.


Sometimes this can break functionality, because it overrides attribute methods and disables some ActiveRecord features.

There is a mechanism to override the attribute methods created by LightRecord:

class User < ActiveRecord::Base
  # this module will be included in the extending class when light_records or light_records_each is used
  module LightRecord
    def something
    end

    def success
      attributes[:success] == 1
    end

    def success_time
      return attributes[:success_time] unless attributes[:success_time]
      @success_time ||= attributes[:success_time].in_time_zone(Time.zone)
    end

    def success_time=(val)
      @success_time = nil
      super(val)
    end
  end
end
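The override mechanism can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the gem's actual wiring: the generated readers are placed in an anonymous module, and the model's nested LightRecord module is included after it, so the override comes first in method lookup and can reach the raw value via super. Payment and build_light_class are hypothetical names.

```ruby
class Payment
  # Nested override module, in the same spirit as the example above.
  module LightRecord
    def success
      raw = super()   # raw value as returned by the generated reader
      raw == 1        # convert 1/0 to true/false by hand
    end
  end
end

# Hypothetical helper: generated readers live in their own module so that
# a later `include` can override them.
def build_light_class(base, columns)
  readers = Module.new do
    columns.each do |column|
      define_method(column) { @attributes[column] }
    end
  end

  Class.new(base) do
    define_method(:initialize) { |attributes| @attributes = attributes }
    include readers
    # Included last, so it is looked up before `readers`.
    include base::LightRecord if base.const_defined?(:LightRecord, false)
  end
end

light_payment = build_light_class(Payment, [:id, :success])
ok_row = light_payment.new({ id: 7, success: 1 })
failed_row = light_payment.new({ id: 8, success: 0 })

ok_row.success      # goes through the override
failed_row.success
```

Note that if the readers were defined directly on the generated class, an included module could not override them, since a class's own methods take precedence over those of included modules.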

Note: using LightRecord instances breaks type casting.

This gem supports MySQL and PostgreSQL
