New Feature: Avoid Sampling
The latest release of UA-archive includes a new feature to avoid sampling in the Google Analytics Reporting API. The API may return results based on a sample of sessions when the date range or the number of records in a query is very large. To address this, we have included a wrapper script, ua_backup.py, which takes the same list of arguments as analytics_reporter.py. This script splits the date range into smaller chunks based on the value of the --report_level argument. The available options are 'day', 'week', 'month', and 'year'.
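As an illustration of the splitting described above, here is a minimal sketch in Python. It is not the code in ua_backup.py: the function name split_date_range is hypothetical, and the sketch assumes that 'week' means consecutive 7-day windows counted from the start date, while 'month' and 'year' follow calendar boundaries. The actual script may cut the range differently.

```python
from datetime import date, timedelta


def split_date_range(start, end, level):
    """Yield (chunk_start, chunk_end) pairs covering start..end inclusive.

    Hypothetical helper for illustration only; the chunk boundaries used by
    ua_backup.py are an assumption here.
    """
    if level not in ("day", "week", "month", "year"):
        raise ValueError(f"unknown report level: {level!r}")

    current = start
    while current <= end:
        if level == "day":
            chunk_end = current
        elif level == "week":
            # Assumption: 7-day windows counted from the start date.
            chunk_end = current + timedelta(days=6)
        elif level == "month":
            # Last day of the current calendar month.
            if current.month == 12:
                next_first = date(current.year + 1, 1, 1)
            else:
                next_first = date(current.year, current.month + 1, 1)
            chunk_end = next_first - timedelta(days=1)
        else:  # "year"
            chunk_end = date(current.year, 12, 31)

        # Never run past the requested end date.
        chunk_end = min(chunk_end, end)
        yield current, chunk_end
        current = chunk_end + timedelta(days=1)


if __name__ == "__main__":
    for chunk_start, chunk_end in split_date_range(
        date(2020, 1, 1), date(2020, 3, 15), "month"
    ):
        print(chunk_start, chunk_end)
```

Smaller chunks mean more API requests but a lower chance that any single query is large enough to be sampled.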
Example:
python3 ua_backup.py --report_id 1 --start 2020-01-01 --end 2023-01-31 --report_level day
This runs the query once for each day and stores the results as separate CSV files in the output folder. The script merge_report.py can then be used to merge the individual CSV files into a single CSV file:
python3 merge_report.py output/123423_ua-property full_report
You will get the merged CSV report in the full_report folder.
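For readers who want to see what such a merge involves, the following is a minimal sketch using only the Python standard library. It is not the implementation of merge_report.py: the function name merge_csv_files is hypothetical, and the sketch assumes every per-chunk CSV file has the same header row and that files should be concatenated in sorted filename order.

```python
import csv
from pathlib import Path


def merge_csv_files(input_dir, output_file):
    """Concatenate all *.csv files in input_dir into output_file.

    Hypothetical helper for illustration; assumes identical headers.
    Returns the number of data rows written.
    """
    # Collect inputs before creating the output, so the output is never
    # read back in if it happens to live in the same folder.
    paths = sorted(Path(input_dir).glob("*.csv"))
    output_file = Path(output_file)
    output_file.parent.mkdir(parents=True, exist_ok=True)

    header = None
    rows_written = 0
    with output_file.open("w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with path.open(newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    # Write the header once, taken from the first file.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"header mismatch in {path.name}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Raising on a header mismatch is a deliberate choice in this sketch: silently merging files with different columns would produce a report that looks valid but is not.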
Note: The system uses ua-backup-execution.log to keep track of the last script executed, so that it can resume execution if an error occurs. It also uses quota_exceeded.log to track whether the quota was exceeded, and <view-id>_progress.log to track individual reports. If you want to run the script fresh, starting from the beginning, remove these log files first.
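Removing the log files can be done by hand; the sketch below shows one way to script it. It is only an illustration under assumptions not stated above: that all three kinds of log file sit in a single directory (the current directory by default) and that the per-view progress logs match the pattern *_progress.log. The function name reset_state is hypothetical and is not part of UA-archive.

```python
from pathlib import Path


def reset_state(log_dir="."):
    """Delete the resume/quota/progress logs found in log_dir.

    Hypothetical helper; the log location is an assumption.
    Returns the names of the files that were removed.
    """
    log_dir = Path(log_dir)
    candidates = [
        log_dir / "ua-backup-execution.log",
        log_dir / "quota_exceeded.log",
    ]
    # Assumption: one progress log per view, named <view-id>_progress.log.
    candidates += sorted(log_dir.glob("*_progress.log"))

    removed = []
    for path in candidates:
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling reset_state() before a new run would discard all resume information, so it should only be used when a full restart is intended.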