Problem importing from Google Takeout Records.json #279

Open
edbr-xyz opened this issue Sep 24, 2024 · 9 comments

@edbr-xyz

When I follow the steps to import a Records.json from Google Takeout, I get the following output:

/var/app # bundle exec rake import:big_file['public/imports/Records.json','<my-email>']
[dotenv] Set DATABASE_PORT and PHOTON_API_HOST
[dotenv] Loaded .env.development
D, [2024-09-24T14:07:26.464291 #150] DEBUG -- :   User Load (1.6ms)  SELECT "users".* FROM "users" WHERE "users"."email" = $1 LIMIT $2  [["email", "<my-email>"], ["LIMIT", 1]]
D, [2024-09-24T14:07:26.465388 #150] DEBUG -- :   ↳ app/services/tasks/imports/google_records.rb:9:in `initialize'
D, [2024-09-24T14:07:26.549560 #150] DEBUG -- :   TRANSACTION (0.2ms)  BEGIN
D, [2024-09-24T14:07:26.550475 #150] DEBUG -- :   ↳ app/services/tasks/imports/google_records.rb:26:in `create_import'
D, [2024-09-24T14:07:26.560035 #150] DEBUG -- :   Import Create (10.6ms)  INSERT INTO "imports" ("name", "user_id", "source", "created_at", "updated_at", "raw_points", "doubles", "processed", "raw_data", "points_count") VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10) RETURNING "id"  [["name", "public/imports/Records.json"], ["user_id", 1], ["source", 2], ["created_at", "2024-09-24 14:07:26.548282"], ["updated_at", "2024-09-24 14:07:26.548282"], ["raw_points", 0], ["doubles", 0], ["processed", 0], ["raw_data", nil], ["points_count", 0]]
D, [2024-09-24T14:07:26.560962 #150] DEBUG -- :   ↳ app/services/tasks/imports/google_records.rb:26:in `create_import'
D, [2024-09-24T14:07:26.562775 #150] DEBUG -- :   TRANSACTION (1.4ms)  COMMIT
D, [2024-09-24T14:07:26.563269 #150] DEBUG -- :   ↳ app/services/tasks/imports/google_records.rb:26:in `create_import'
D, [2024-09-24T14:07:26.563671 #150] DEBUG -- : Importing public/imports/Records.json for <my-email>, file size is 742258184... This might take a while, have patience!
Killed
/var/app #

(my account email replaced with <my-email>)

The file that I am trying to import is quite large, as seen in the above output.

I have tried raising the CPU and memory limits in docker-compose.yml. If I raise them enough, or remove the limits entirely, the task runs but hangs after a while and locks up the whole server; interrupting it eventually produces the following:

^CE, [2024-09-24T14:04:12.363572 #156] ERROR -- : Failed enqueuing ImportGoogleTakeoutJob to Sidekiq(imports): Interrupt ()
I, [2024-09-24T14:04:12.363998 #156]  INFO -- : ↳ app/services/tasks/imports/google_records.rb:35:in `block in schedule_import_jobs'
rake aborted!
Interrupt: Interrupt
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/buffered_io.rb:213:in `wait_readable'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/buffered_io.rb:213:in `block in fill_buffer'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/buffered_io.rb:197:in `fill_buffer'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/buffered_io.rb:187:in `ensure_remaining'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/buffered_io.rb:152:in `getbyte'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/resp3.rb:113:in `parse'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection/resp3.rb:50:in `load'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/ruby_connection.rb:96:in `read'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/connection_mixin.rb:52:in `block in call_pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/connection_mixin.rb:50:in `call_pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client.rb:434:in `block (2 levels) in pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/middlewares.rb:16:in `call'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client.rb:433:in `block in pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client.rb:699:in `ensure_connected'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client.rb:431:in `pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/redis-client-0.22.2/lib/redis_client/decorator.rb:51:in `pipelined'
/var/app/vendor/bundle/ruby/3.3.0/gems/sidekiq-7.3.2/lib/sidekiq/client.rb:228:in `block in raw_push'
/var/app/vendor/bundle/ruby/3.3.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:110:in `block (2 levels) in with'
/var/app/vendor/bundle/ruby/3.3.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:109:in `handle_interrupt'
/var/app/vendor/bundle/ruby/3.3.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:109:in `block in with'
/var/app/vendor/bundle/ruby/3.3.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:106:in `handle_interrupt'
/var/app/vendor/bundle/ruby/3.3.0/gems/connection_pool-2.4.1/lib/connection_pool.rb:106:in `with'
/var/app/vendor/bundle/ruby/3.3.0/gems/sidekiq-7.3.2/lib/sidekiq/client.rb:225:in `raw_push'
/var/app/vendor/bundle/ruby/3.3.0/gems/sidekiq-7.3.2/lib/sidekiq/client.rb:93:in `push'
/var/app/vendor/bundle/ruby/3.3.0/gems/sidekiq-7.3.2/lib/sidekiq/job.rb:372:in `client_push'
/var/app/vendor/bundle/ruby/3.3.0/gems/sidekiq-7.3.2/lib/sidekiq/job.rb:209:in `perform_async'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/queue_adapters/sidekiq_adapter.rb:25:in `enqueue'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/enqueuing.rb:133:in `raw_enqueue'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/enqueue_after_transaction_commit.rb:24:in `raw_enqueue'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/enqueuing.rb:118:in `block in enqueue'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:121:in `block in run_callbacks'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/instrumentation.rb:40:in `block in instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/notifications.rb:210:in `block in instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/notifications/instrumenter.rb:58:in `instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/notifications.rb:210:in `instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/instrumentation.rb:39:in `instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.1/lib/active_record/railties/job_runtime.rb:18:in `instrument'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/instrumentation.rb:21:in `block (2 levels) in <module:Instrumentation>'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:130:in `instance_exec'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:130:in `block in run_callbacks'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/logging.rb:41:in `tag_logger'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/logging.rb:28:in `block (2 levels) in <module:Logging>'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:130:in `instance_exec'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:130:in `block in run_callbacks'
/var/app/vendor/bundle/ruby/3.3.0/gems/activesupport-7.2.1/lib/active_support/callbacks.rb:141:in `run_callbacks'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/enqueuing.rb:117:in `enqueue'
/var/app/vendor/bundle/ruby/3.3.0/gems/activejob-7.2.1/lib/active_job/enqueuing.rb:84:in `perform_later'
/var/app/app/services/tasks/imports/google_records.rb:35:in `block in schedule_import_jobs'
/var/app/app/services/tasks/imports/google_records.rb:34:in `each'
/var/app/app/services/tasks/imports/google_records.rb:34:in `schedule_import_jobs'
/var/app/app/services/tasks/imports/google_records.rb:19:in `call'
/var/app/lib/tasks/import.rake:9:in `block (2 levels) in <main>'
/var/app/vendor/bundle/ruby/3.3.0/gems/rake-13.2.1/exe/rake:27:in `<top (required)>'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli/exec.rb:58:in `load'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli/exec.rb:58:in `kernel_load'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli/exec.rb:23:in `run'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli.rb:451:in `exec'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/vendor/thor/lib/thor/command.rb:28:in `run'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/vendor/thor/lib/thor.rb:527:in `dispatch'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli.rb:34:in `dispatch'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/vendor/thor/lib/thor/base.rb:584:in `start'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/cli.rb:28:in `start'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/exe/bundle:28:in `block in <top (required)>'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/lib/bundler/friendly_errors.rb:117:in `with_friendly_errors'
/var/app/vendor/bundle/ruby/3.3.0/gems/bundler-2.5.9/exe/bundle:20:in `<top (required)>'
/usr/local/bundle/bin/bundle:25:in `load'
/usr/local/bundle/bin/bundle:25:in `<main>'
Tasks: TOP => import:big_file
(See full trace by running task with --trace)

I am running Dawarich via docker-compose on openmediavault 7, using the :latest Dawarich image, on an AMD Ryzen 5 2400G with 8 GB of RAM.
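
For reference, the kind of resource limits I experimented with in docker-compose.yml look roughly like this. This is only a sketch of my setup: the service names (dawarich_app, dawarich_sidekiq) and the values are examples, and the exact keys depend on your Compose version.

services:
  dawarich_app:
    mem_limit: 4g    # example value; raising or removing this lets the task start
    cpus: 2.0
  dawarich_sidekiq:
    mem_limit: 4g
    cpus: 2.0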

@Danielson89

I ended up exporting it directly from my phone using the Google Maps app. Google has really stuffed things up here. Once done from the phone, the import worked as expected.

@applesoff

I am having the same issue, but I cannot figure out how Danielson89 fixed it the way they did. Any tips?

@Freika
Owner

Freika commented Oct 5, 2024

I'd suggest splitting the Records.json file into smaller chunks: https://dawarich.app/docs/FAQ#why-my-attempt-to-import-recordsjson-fails

@Svagtlys

@applesoff In case you haven't gotten your data yet, these are the steps I used now that Google has switched timeline data from online storage to on-device storage by default:

https://support.google.com/maps/thread/280205453/how-do-i-download-my-timeline-history?hl=en

Instructions:
On an Android device, go to Settings > Location > Location services > Timeline > Export timeline data.

@mcfrojd

mcfrojd commented Nov 14, 2024

My Records.json is 1.56 GB and 70 million lines! (data from 2010 until today)
How big a chunk can Dawarich handle?
And any tips on the best way to split up the file?

@Freika
Owner

Freika commented Nov 15, 2024

@mcfrojd splitting it into files of up to 100-150 MB each should work
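
In case it helps, here is a minimal splitting sketch in Python. It assumes Records.json has a top-level "locations" array (as the Takeout export does) and loads the whole file once, so it needs enough free RAM for that; the chunk_size value is just a starting point to tune until each output file lands under roughly 100-150 MB.

import json

chunk_size = 200_000  # locations per output file; adjust until each file is ~100-150 MB

# Load the full Takeout export once.
with open('Records.json') as f:
    locations = json.load(f).get('locations', [])

# Write the locations back out as smaller, independently importable files.
for i in range(0, len(locations), chunk_size):
    chunk = {'locations': locations[i:i + chunk_size]}
    out_name = f'Records_part_{i // chunk_size + 1}.json'
    with open(out_name, 'w') as out:
        json.dump(chunk, out)
    print(f'wrote {out_name} with {len(chunk["locations"])} locations')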

@AngryJKirk

I come back to this project about once a month trying to import my data (around 900 MB) and it always fails.

I can't believe the technology is not there yet to read and put in a database less than a gigabyte of data. 😭

@AngryJKirk

AngryJKirk commented Nov 17, 2024

I ended up solving this with a Python script:

import json
from datetime import datetime


def generate_sql(file_path, output_path, import_id, user_id):
    # Timestamp used for the created_at/updated_at columns of every inserted row.
    now = datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3]

    # Load the whole Records.json and emit one INSERT statement per location.
    with open(file_path, 'r') as json_file, open(output_path, 'w') as sql_file:
        data = json.load(json_file)
        locations = data.get('locations', [])

        for location in locations:
            parsed = parse_json(location)
            sql = (
                f"INSERT INTO public.points (latitude, longitude, timestamp, raw_data, topic, tracker_id, import_id, user_id, created_at, updated_at) "
                f"VALUES ({parsed['latitude']}, {parsed['longitude']}, {parsed['timestamp']}, "
                f"'{parsed['raw_data']}', 'Google Maps Timeline Export', 'google-maps-timeline-export', "
                f"{import_id}, {user_id}, '{now}', '{now}');\n"
            )
            sql_file.write(sql)


def parse_json(entry):
    # Entries carry either an ISO 8601 'timestamp' or a millisecond 'timestampMs'.
    timestamp_str = entry.get('timestamp') or entry.get('timestampMs', '')

    if 'T' in timestamp_str:
        timestamp = int(datetime.fromisoformat(timestamp_str.replace('Z', '+00:00')).timestamp())
    else:
        timestamp = int(timestamp_str) // 1000 if timestamp_str else 0

    return {
        # Coordinates are stored as degrees * 10^7 in the Takeout export.
        "latitude": entry.get('latitudeE7', 0) / 10 ** 7,
        "longitude": entry.get('longitudeE7', 0) / 10 ** 7,
        "timestamp": timestamp,
        # altitude and velocity are parsed but not used in the INSERT above.
        "altitude": entry.get('altitude', 'NULL'),
        "velocity": entry.get('velocity', 'NULL'),
        # Escape single quotes so the raw JSON can sit inside a SQL string literal.
        "raw_data": json.dumps(entry).replace("'", "''")
    }


input_json_path = 'Records.json'
output_sql_path = 'output.sql'
import_id = 1
user_id = 1

generate_sql(input_json_path, output_sql_path, import_id, user_id)

Steps:

  1. create an import
  2. wait until it fails
  3. set the user id and import id in the script to match the database (they will be 1 and 1 if you run it on a fresh install)
  4. put your Records.json next to the script
  5. run the script
  6. modify docker-compose.yml to expose the database port, e.g.
    ports:
      - "127.0.0.1:5432:5432"
  7. take the output.sql and run it against the database (see the command sketch after this list)
  8. wait around 10-15 minutes (it took 11 minutes for 2 million rows)
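
For step 7, something like the following works; this is only a sketch, where <db-user> and <db-name> are placeholders for whatever is configured in your .env, and dawarich_db is just what the database container is called in my compose file:

# from the host, with the port exposed as in step 6
psql -h 127.0.0.1 -p 5432 -U <db-user> -d <db-name> -f output.sql

# or, without exposing the port, copy the file into the database container and run it there
docker cp output.sql dawarich_db:/tmp/output.sql
docker exec -it dawarich_db psql -U <db-user> -d <db-name> -f /tmp/output.sql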

After that, I can see all my points in the app. I checked the code and that seems to be the only thing that needs to be done; @Freika, please correct me if I am wrong, as I am seeing Ruby for the first time in my life.

@Nathagamelle

Nathagamelle commented Nov 20, 2024

            sql = (
                f"INSERT INTO public.points (latitude, longitude, timestamp, raw_data, topic, tracker_id, import_id, user_id, created_at, updated_at) "
                f"VALUES ({parsed['latitude']}, {parsed['longitude']}, {parsed['timestamp']}, "
                f"'{parsed['raw_data']}', 'Google Maps Timeline Export', 'google-maps-timeline-export', "
                f"{import_id}, {user_id}, '{now}', '{now}');\n"
            )

Tempted to do it, but is there any reason to keep the long topic and tracker_id values?
