Releases: snowplow/snowplow
Snowplow v0.8.11
Extensive ETL improvements, including adding support for the recent changes to the CloudFront access log file format.
Hadoop ETL
- Bumped to 0.3.5
- Added Argonaut 6.0 as a dependency (#342)
- Added fromTimestamp to EventEnrichments (#340)
- Added makeTsvSafe to ConversionUtils (#338)
- Added JsonUtils (#323)
- Added support for 3 and 4 return values from MapTransformer (#324)
- Updated GetJsonPayload to use Argonaut and renamed to JsonPayload (#339)
- Added ability to mask IP addresses in ETL (#309)
- refr_ and page_ fields now stored raw (#374)
- Defensively fixed raw spaces in page and referer URLs (#346)
- Fixed regression, single-encoded %s logic didn't account for % itself (#347)
- Added unit tests for fixTabsNewlines (#332)
- Tests now report the failing CanonicalOutput field (#325)
- Now handling all fields double-encoded as per CloudFront post-14-September (#348)
- Added support for 21 Oct CloudFront access log format (#384)
- Added truncation to refr_term (#379)
- Added truncation to se_label (#394)
- Made all prior ME.identity fields TSV-safe (#395)
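As an illustration of the IP masking item above (#309), here is a minimal Python sketch of one way such masking could work. It is not taken from the Hadoop ETL source: the function name `mask_ip`, the `octets_to_mask` parameter and the use of "x" as the replacement character are all assumptions made for this example.

```python
def mask_ip(ip: str, octets_to_mask: int) -> str:
    """Replace the trailing octets of an IPv4 address with 'x'.

    Hypothetical helper: name, parameter and the 'x' placeholder are
    assumptions for illustration, not the ETL's actual implementation.
    """
    parts = ip.split(".")
    if len(parts) != 4:
        raise ValueError(f"not an IPv4 address: {ip!r}")
    if not 0 <= octets_to_mask <= 4:
        raise ValueError("octets_to_mask must be between 0 and 4")
    keep = 4 - octets_to_mask
    # Keep the leading octets and overwrite the rest.
    return ".".join(parts[:keep] + ["x"] * octets_to_mask)


if __name__ == "__main__":
    print(mask_ip("203.0.113.42", 1))
    print(mask_ip("203.0.113.42", 2))
```

Masking from the right keeps the network portion of the address, which is usually enough for coarse geographic lookups while discarding the host-specific part.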
EmrEtlRunner
- Bumped to 0.5.0
- Bumped Sluice to 0.1.5 (#96)
- Bumped Elasticity to 2.6 (#345)
- Enabled EMR Job Flow debugging for easier access to logs (#279)
- ETL job no longer fails if there's no data for last run period (#296)
- Empty processing dir check now works if dir contains 1 file (#326)
- Added ability to mask IP addresses in ETL (#309)
- Made the examples match what you get from git out of the box, thanks @shermozle (#331)
StorageLoader
- Bumped to 0.1.1
- Bumped Sluice to 0.1.5 (#96)
- Fixed "" in fields acts as an escape character for Postgres, thanks @kingo55 (#329)
- Added ability to --skip analyze (#335)
- Moved VACUUM SORT ONLY to a --include step (#321)
- Added COMPROWS to config and --include compupdate option (#344)
- Changed Postgres VACUUM FULL to VACUUM (#357)
- Added TRUNCATECOLUMNS for Redshift load (#360)
- Added FILLRECORD to our Redshift COPY command (#380)
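To show how the Redshift load options named above fit together, the following Python sketch assembles a COPY statement using TRUNCATECOLUMNS, FILLRECORD and COMPROWS, which are standard Redshift COPY parameters. The function `build_copy_sql`, the table name, the S3 path, the tab delimiter and the credentials placeholder are assumptions for illustration; this is not the SQL that StorageLoader itself generates.

```python
from typing import Optional


def build_copy_sql(table: str, s3_path: str, comprows: Optional[int] = None) -> str:
    """Assemble a Redshift COPY statement (hypothetical helper).

    The credentials string is a placeholder and must be supplied by the caller
    in real use.
    """
    options = [
        "DELIMITER '\\t'",   # assumed tab-separated input
        "TRUNCATECOLUMNS",   # truncate over-long values instead of failing
        "FILLRECORD",        # allow rows with missing trailing columns
    ]
    if comprows is not None:
        # Sample size for compression analysis; only relevant with COMPUPDATE ON.
        options.append(f"COMPUPDATE ON COMPROWS {comprows}")
    return (
        f"COPY {table} FROM '{s3_path}' "
        "CREDENTIALS '<credentials>' "
        + " ".join(options)
        + ";"
    )


if __name__ == "__main__":
    print(build_copy_sql("atomic.events", "s3://my-bucket/events/", comprows=200000))
```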
Postgres
- Fixed error in recipes_basic.technology_mobile recipe (#397)
Snowplow v0.8.10
Adding recipes and cubes as SQL views for both Redshift and PostgreSQL. A few miscellaneous tidy-ups as well, see below for details.
Redshift
- Bumped table-def to 0.2.2
- Moved events table to a new atomic schema in atomic-def.sql (#301)
- Added migration script for 0.2.1 to 0.2.2
- Added SQL DDL to define Redshift recipes (#297)
- Added SQL DDL to define Redshift cubes (#298)
Postgres
- Bumped table-def to 0.1.1
- Renamed table-def file to atomic-def.sql
- Added migration script for 0.1.0 to 0.1.1
- Moved NOT NULL constraint on event field to event_vendor field (#318)
- Added SQL DDL to define Postgres recipes (#303)
- Added SQL DDL to define Postgres cubes (#302)
Documentation
- Fixed wrong path to no-js-tracker subdirectory, thanks @gregakespret (#343)
- Improved "Find out more" table in README, thanks @dideler (#353)
Snowplow v0.8.9
A release to handle the unannounced change which Amazon made to the CloudFront access log file format on 17th August (since reversed).
Hadoop ETL
- Bumped to 0.3.4
- Updated to handle singly-encoded %s in CloudFront querystring field (#333)
Snowplow v0.8.8
Adding Postgres support, re-adding HiveQL support, and also adding support for multiple storage targets.
Plus plenty of small improvements, bug fixes and simplifications.
JavaScript Tracker
- Moved into own repo (#277)
Hadoop ETL
- Bumped to 0.3.3
- URL-decodes "%3D" to "=" to allow Hive-style directory names as arguments (#305)
- Bumped referer-parser to 0.1.1 to fix java.lang.NullPointerException (#314)
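The "%3D" item above (#305) can be pictured with a short Python sketch. It assumes that only the literal sequence "%3D" is rewritten, as the note states; the function name `decode_hive_dir_arg` and the example path are hypothetical and not taken from the Hadoop ETL.

```python
def decode_hive_dir_arg(arg: str) -> str:
    """Turn an encoded '=' back into a literal '=' in a path argument.

    Hypothetical helper: lets a Hive-style directory such as 'run=2013-10-22'
    be passed as 'run%3D2013-10-22' where a raw '=' is not accepted.
    """
    return arg.replace("%3D", "=")


if __name__ == "__main__":
    print(decode_hive_dir_arg("s3://my-bucket/enriched/run%3D2013-10-22"))
```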
EmrEtlRunner
- Bumped to 0.4.0
- Bumped Sluice to 0.0.7 (#299)
- Removed :snowplow: section from config.yml.sample (#289)
- Simplified EmrEtlRunner and its config (#287)
- Added run= to timestamped ETL folder names (#294)
- Updated "Jobflow started" stdout message to include jobflow ID (#315)
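As a rough illustration of the run= folder naming item above (#294), this Python sketch prefixes a timestamp with "run=" to produce a Hive-style partition folder name. The timestamp format and the function name `run_folder` are assumptions; the release notes do not specify the exact format EmrEtlRunner uses.

```python
from datetime import datetime


def run_folder(ts: datetime) -> str:
    """Build a 'run=<timestamp>' folder name (hypothetical helper).

    The YYYY-MM-DD-hh-mm-ss layout is assumed for this example.
    """
    return "run=" + ts.strftime("%Y-%m-%d-%H-%M-%S")


if __name__ == "__main__":
    print(run_folder(datetime(2013, 10, 22, 3, 5, 1)))
```

A key=value folder name of this kind is what lets Hive treat each ETL run as a partition.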
Hive ETL
- Removed folder 3-enrich/hive-etl as no longer supported (#286)
Hive storage
- Updated hive-storage scripts to work with current Redshift-format flatfile (#290)
Infobright storage
- Removed folder 4-storage/infobright as not currently supported (#285)
Postgres storage
- Added Postgres table definition in atomic schema (#160)
StorageLoader
- Bumped to 0.1.0
- Bumped Sluice to 0.0.7 (#300)
- Removed code to delete Hive ETL's empty event files (#306)
- Fixed bug where download path has to be set (even when using Redshift) (#280)
- Optimized ANALYZE and VACUUM commands (#283)
- Added MAXERROR as StorageLoader configuration value for Redshift (#273)
- Added support for loading Postgres (#161)
- Removed Infobright loading capability (#307)
- Added support for loading into multiple storage targets (#311)
Snowplow v0.8.7
Predominantly bug fixes and tweaks to the JavaScript Tracker. Note that this is the last release where the JavaScript Tracker is part of the main snowplow/snowplow repository - it will shortly be moved into its own repo.
JavaScript Tracker
- Bumped to 0.12.0
- Fixed document reference to use documentAlias (#247)
- Fixed bug with setCustomUrl (#267)
- Changed ev_ to se_ for structured events (#197)
- Fixed Firefox failure when "Always ask" set for cookies (#163)
- Fixed bug in page ping functionality detected in IE 8 (#260)
- Replaced forEach as not supported in IE 6-8 (#295)
EmrEtlRunner
- Fixed bug in config.yml.sample (#291)
Arduino Tracker
- Added git submodule link (#292)