Merge pull request #509 from Shopify/htatla-add-csv-batching
Enable CSV Batching
harmanT23 authored Oct 3, 2024
2 parents c7005ce + 3604054 commit 2b56a7a
Showing 3 changed files with 30 additions and 3 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -1,6 +1,6 @@
### Main (unreleased)

Nil
- Added CSV batching functionality to EnumeratorBuilder with `build_csv_enumerator_on_batches` method and `csv_on_batches` alias.

## v1.6.0 (Sep 24, 2024)

@@ -29,7 +29,7 @@ when generating position for cursor based on `:id` column (Rails 7.1 and above,
primary models are now supported). This ensures we grab the value of the id column, rather than a
potentially composite primary key value.
- [456](https://github.com/Shopify/job-iteration/pull/431) - Use Arel to generate SQL that's type compatible for the
cursor pagination conditionals in ActiveRecord cursor. Previously, the cursor would coerce numeric ids to a string value
cursor pagination conditionals in ActiveRecord cursor. Previously, the cursor would coerce numeric ids to a string value
(e.g.: `... AND id > '1'`)

## v1.4.1 (Sep 5, 2023)
5 changes: 5 additions & 0 deletions lib/job-iteration/enumerator_builder.rb
@@ -144,6 +144,10 @@ def build_csv_enumerator(enumerable, cursor:)
      CsvEnumerator.new(enumerable).rows(cursor: cursor)
    end

    def build_csv_enumerator_on_batches(enumerable, cursor:, batch_size: 100)
      CsvEnumerator.new(enumerable).batches(cursor: cursor, batch_size: batch_size)
    end

    # Builds Enumerator for nested iteration.
    #
    # @param enums [Array<Proc>] an Array of Procs, each should return an Enumerator.
@@ -186,6 +190,7 @@ def build_nested_enumerator(enums, cursor:)
    alias_method :active_record_on_batch_relations, :build_active_record_enumerator_on_batch_relations
    alias_method :throttle, :build_throttle_enumerator
    alias_method :csv, :build_csv_enumerator
    alias_method :csv_on_batches, :build_csv_enumerator_on_batches
    alias_method :nested, :build_nested_enumerator

    private
24 changes: 23 additions & 1 deletion test/unit/enumerator_builder_test.rb
@@ -60,6 +60,10 @@ class EnumeratorBuilderTest < ActiveSupport::TestCase
      enumerator_builder(wraps: 0).build_csv_enumerator(CSV.new("test"), cursor: nil)
    end

    test_builder_method(:build_csv_enumerator_on_batches) do
      enumerator_builder(wraps: 0).build_csv_enumerator_on_batches(CSV.new("test"), cursor: nil)
    end

    test_builder_method(:build_nested_enumerator) do
      enumerator_builder(wraps: 0).build_nested_enumerator(
        [
@@ -79,7 +83,7 @@ class EnumeratorBuilderTest < ActiveSupport::TestCase

    test "#build_csv_enumerator uses the CsvEnumerator class" do
      csv = CSV.open(
        ["test", "support", "sample_csv_with_headers.csv"].join("/"),
        sample_csv_with_headers,
        converters: :integer,
        headers: true,
      )
@@ -92,6 +96,24 @@ class EnumeratorBuilderTest < ActiveSupport::TestCase
      end
    end

    test "#build_csv_enumerator_on_batches uses the CsvEnumerator class with batches" do
      csv = CSV.open(
        sample_csv_with_headers,
        converters: :integer,
        headers: true,
      )
      builder = EnumeratorBuilder.new(mock, wrapper: mock)

      enum = builder.build_csv_enumerator_on_batches(csv, cursor: nil, batch_size: 2)
      csv_rows = open_csv.to_a
      enum.each_with_index do |batch_and_cursor, index|
        batch, cursor = batch_and_cursor
        expected_batch = csv_rows[index * 2, 2]
        assert_equal expected_batch, batch
        assert_equal index, cursor
      end
    end

    private

    def enumerator_builder(wraps: 1)
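The new test expects `build_csv_enumerator_on_batches` to yield `[batch, cursor]` pairs, where each batch holds up to `batch_size` rows and the cursor is the zero-based batch index. The sketch below reproduces that shape using only Ruby's standard `csv` library. It is not the gem's `CsvEnumerator`: the `csv_batches` helper and the `SAMPLE` data are hypothetical, and the resume rule (continue with the batch after `cursor`) is an assumption rather than something this diff shows.

```ruby
require "csv"

# Hypothetical stand-in for illustration only; NOT the gem's CsvEnumerator.
# Yields [batch, cursor] pairs, where cursor is the zero-based batch index.
def csv_batches(csv, cursor: nil, batch_size: 100)
  # Assumption: cursor is the index of the last batch already processed,
  # so iteration resumes with the batch that follows it.
  start = cursor.nil? ? 0 : cursor + 1

  Enumerator.new do |yielder|
    csv.each_slice(batch_size).with_index do |batch, index|
      next if index < start

      yielder.yield(batch, index)
    end
  end
end

# Made-up sample data: a header row plus five data rows.
SAMPLE = "id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n"

csv = CSV.new(SAMPLE, headers: true, converters: :integer)
batches = csv_batches(csv, cursor: nil, batch_size: 2).to_a

batches.each do |batch, cursor|
  puts "cursor=#{cursor} ids=#{batch.map { |row| row["id"] }.inspect}"
end
```

With five data rows and `batch_size: 2`, the last batch is smaller than the others, which is also why the test slices its expected rows with `csv_rows[index * 2, 2]`.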
