Can I use sidekiq-iteration in the job that read remote file? #4

Closed
remy727 opened this issue Jun 18, 2023 · 4 comments

Comments

@remy727

remy727 commented Jun 18, 2023

# frozen_string_literal: true

require "open-uri"

class BulkOperationDataRetrieveJob
  include Sidekiq::Job

  sidekiq_options queue: :bulk_operation_data_retrieve, retry: false

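  def perform(shop_domain, url)
    # Keep the shop in an instance variable so the private helpers can reach it.
    @shop = Shop.find_by(shopify_domain: shop_domain)

    if @shop.nil?
      logger.error("#{self.class} failed: cannot find shop with domain '#{shop_domain}'")
      return
    end

    read_file(url)
  end

  private
    attr_reader :shop

    def read_file(url)
      file_path = "tmp/customers.jsonl"
      # Download the remote file; the block form closes the remote handle afterwards.
      URI.open(url) { |remote| IO.copy_stream(remote, file_path) }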

      # Parse data file
      File.open(file_path) do |f|
        f.each do |line|
          process_line(JSON.parse(line))
        end
      end
    end

    def process_line(line)
      shopify_customer_id = line["id"].gsub("gid://shopify/Customer/", "").to_i
      shop.shopify_customers.find_or_create_by(shopify_id: shopify_customer_id) do |customer|
        customer.email = line["email"]
        customer.phone = line["phone"]
        customer.amount_spent = line["amountSpent"]["amount"].to_f
      end
    end
end

I have the above job and the remote file contains 100K customers.
Can I use sidekiq-iteration in this job?

@fatkodima
Owner

Sure, but you need to figure out how to write a custom cursor for this (https://github.com/fatkodima/sidekiq-iteration/blob/master/guides/custom-enumerator.md), which is the hard part. One (dumb?) solution is to download the file locally, parse it, push the ids into a Redis list, write a custom Redis enumerator that iterates over this list, and use that enumerator in the job.

There was a similar discussion (Shopify/job-iteration#50) in the parent gem before.
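
The Redis-list idea above could look roughly like the following. This is only a sketch under assumptions: `RedisListEnumerator` and `FakeRedisList` are hypothetical names that do not exist in sidekiq-iteration, the stand-in list exists only so the example runs without a Redis server, and the cursor is assumed to be the index of the last processed element (so a resumed job starts at `cursor + 1`). With a real redis-rb client you would pass the client itself, since it responds to `lrange(key, start, stop)` the same way.

```ruby
# Hypothetical helper, not part of sidekiq-iteration.
# Walks a Redis list in batches and yields [value, cursor] pairs,
# where the cursor is the list index of the yielded value.
class RedisListEnumerator
  def initialize(redis, key, cursor: nil, batch_size: 100)
    @redis = redis
    @key = key
    # Assumption: the cursor marks the last processed index, so resume after it.
    @start = cursor.nil? ? 0 : cursor + 1
    @batch_size = batch_size
  end

  def enum
    Enumerator.new do |yielder|
      index = @start
      loop do
        # LRANGE treats both bounds as inclusive.
        batch = @redis.lrange(@key, index, index + @batch_size - 1)
        break if batch.empty?

        batch.each do |value|
          yielder.yield(value, index)
          index += 1
        end
      end
    end
  end
end

# In-memory stand-in for a Redis client, for demonstration only.
class FakeRedisList
  def initialize(lists)
    @lists = lists
  end

  def lrange(key, start, stop)
    (@lists[key] || [])[start..stop] || []
  end
end

redis = FakeRedisList.new("customer_ids" => %w[11 12 13 14 15])
RedisListEnumerator.new(redis, "customer_ids", cursor: 2, batch_size: 2).enum.each do |id, cursor|
  puts "processing #{id} (cursor #{cursor})"
end
```

In a job, `build_enumerator` would return `RedisListEnumerator.new(...).enum`, with the `cursor:` keyword it receives passed straight through.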

@remy727
Author

remy727 commented Jul 21, 2023

Sorry for the late reply.

Got it. But my Sidekiq jobs run on Heroku, where the filesystem is ephemeral, so there's no way to guarantee that a downloaded file (for example tmp/files) would still exist when the job resumes.
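
One possible way around the missing-file problem, sketched here under assumptions and not necessarily what was eventually used in this thread, is to skip the local copy entirely: read the remote JSONL stream line by line and use the line index as the cursor. `jsonl_enumerator` is a hypothetical helper, not a sidekiq-iteration API, and the sample data is made up. It accepts any IO-like object, so the example uses a `StringIO`; in the job it would presumably receive the result of `URI.open(url)`. The trade-off is that every resume re-reads the file from the start and discards the lines already processed.

```ruby
require "json"
require "stringio"

# Hypothetical helper: yields [parsed_record, line_index] pairs from a JSONL stream.
# Assumption: the cursor is the index of the last processed line,
# so a resumed run skips every line up to and including it.
def jsonl_enumerator(io, cursor: nil)
  first = cursor.nil? ? 0 : cursor + 1

  Enumerator.new do |yielder|
    io.each_line.with_index do |line, index|
      next if index < first        # already handled before the interruption
      next if line.strip.empty?    # tolerate blank lines

      yielder.yield(JSON.parse(line), index)
    end
  end
end

# Made-up sample data in the same shape as the job above expects.
SAMPLE = <<~JSONL
  {"id": "gid://shopify/Customer/1", "email": "a@example.com"}
  {"id": "gid://shopify/Customer/2", "email": "b@example.com"}
  {"id": "gid://shopify/Customer/3", "email": "c@example.com"}
JSONL

# Resume after the first line (cursor 0) has already been processed.
jsonl_enumerator(StringIO.new(SAMPLE), cursor: 0).each do |record, cursor|
  puts "line #{cursor}: #{record["email"]}"
end
```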

@remy727
Author

remy727 commented Aug 2, 2023

I fixed this by building a custom iterator. Thank you!

@remy727 remy727 closed this as completed Aug 2, 2023
@fatkodima
Owner

@remy727 Can you please share the approach you finally decided to use, or the iterator's code? It would help future readers and might be incorporated into the gem in the future.
