Releases: snowplow/snowplow
Release 82 Tawny Eagle
Adds the ability to use HTTP endpoints in the Kinesis Elasticsearch Sink.
Common
- Common: publish each Kinesis app individually to Bintray (#2492)
Kinesis Elasticsearch Sink
Release 81 Kangaroo Island Emu
Reboots the Hadoop Event Recovery project (formerly Hadoop Bad Rows), which allows you to fix up Snowplow bad rows and make them ready for reprocessing
Documentation
- Fix broken link in Thrift Schemas' README.md (#2498)
Common
Android Tracker
- Bump git submodule to 0.5.4 (#2710)
JavaScript Tracker
- Bump git submodule to 2.6.1 (#2708)
Objective-C Tracker
- Bump git submodule to 0.6.1 (#2709)
Golang Tracker
- Add git submodule (#2619)
Scala Common Enrich
Stream Enrich
Hadoop Event Recovery
- Update README instructions (#2348)
- Add continuous deployment (#2692)
- Rename from Scala Hadoop Bad Rows (#2694)
- Allow source row to be transformed with JavaScript (#2223)
- Capitalize Snowplow correctly in copyright notices (#2641)
StorageLoader
- Write JSON path for com.clearbit/person (#2631)
- Write JSON path for com.clearbit/company (#2632)
- Write JSON path for com.amazon.aws.lambda/java_context (#2560)
Redshift
Release 80 Southern Cassowary
Real-time pipeline release which improves stability and brings the real-time pipeline up-to-date with our Hadoop pipeline
Common
- Add CI/CD for Kinesis apps (#2621)
- Add Bintray credentials to .travis.yml (#2618)
- Change Kinesis pipeline status from "Beta" to "Production-ready" in READMEs (#2629)
Config
- Update config/iglu_resolver.json version to 1-0-1 (#2479)
Scala Stream Collector
- Bump to 0.7.0 (#2595)
- Increase tolerance of timings in tests (#2614)
- Send nonempty response to POST requests (#2606)
- Crash when unable to find stream instead of hanging (#2583)
- Stop using deprecated Config.getMilliseconds method (#2570)
- Move example configuration file to examples folder (#2566)
- Upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2384)
- Crash rather than hanging when unable to bind to the supplied port (#2551)
- Bump Spray version to 1.3.3 (#2522)
- Bump Scala version to 2.10.5 (#2565)
- Fix omitted string interpolation (#2561)
Stream Enrich
- Bump to 0.8.0 (#2596)
- Bump Common Enrich to 0.23.0 (#2612)
- Bump Iglu Scala Client to 0.4.0 (#2688)
- Add configuration setting for MaxRecords (#2610)
- Use nonEmpty method to check whether lists are empty (#2608)
- Refactor functions to avoid return keyword (#2607)
- Upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2598)
- Crash when unable to find stream instead of hanging (#2584)
- Add standard copyright notice to AbstractSourceSpec.scala (#2580)
- Make logging more succinct in case of failure (#1723)
- Move example configuration file to examples folder (#2567)
- Remove src/main/resolver.json.sample (#1932)
- Use json4s to combine the enrichment configuration JSONs (#2259)
Kinesis Elasticsearch Sink
Release 79 Black Swan
Introduces our powerful new API Request Enrichment, plus a new HTTP Header Extractor Enrichment and several other improvements on the enrichments side
Documentation
- Removed closes from CHANGELOG tickets for R78 (#2534)
Common
- Changed Vagrantfile to use NFS and extra CPU cores by default (#2482)
Config
- Removed duplicated enabled property in ua_parser_config.json (#2424)
- Enabled switched to false in currency_conversion_config.json (#2327)
- Enabled switched to false in weather_enrichment_config.json (#2326)
EmrEtlRunner
- Bumped AMI version in example config to 4.5.0 (#2604)
- Updated hadoop_enrich version in config.yml.sample to 1.7.0 (#2661)
- Updated hadoop_shred version in config.yml.sample to 0.9.0 (#2662)
Scala Common Enrich
- Bumped user-agent-utils version to latest (#2516)
- Transaction item quantity type changed to JInteger (#2157)
- Bumped to 0.23.0 (#2486)
- Improved OWM error if user doesn't have historical weather (#2325)
- Added API Request Enrichment (#2051)
- Bumped Iglu Scala Client to 0.4.0 (#2333)
- Added HTTP Header Extractor Enrichment (#1373)
Scala Hadoop Enrich
- Bumped to 1.7.0 (#2446)
- Bumped Scala Common Enrich to 0.23.0 (#2485)
- Bumped Iglu Scala Client to 0.4.0 (#2478)
- Added test for API Request Enrichment (#2603)
Scala Hadoop Shred
Release 78 Great Hornbill
Brings our Kinesis pipeline functionally up-to-date with our Hadoop pipeline, and makes various further improvements to the Kinesis pipeline.
Common
- Removed openjdk7 from .travis.yml (#2533)
Scala Common Enrich
- Bumped to 0.22.0
- Added handling for bad rows which are too long to print in full (closes #2419)
Kinesis
- Updated publish-kinesis-release.bash (closes #2477)
Scala Stream Collector
- Bumped to 0.6.0
- Added Scala Common Enrich as a library dependency (closes #2153)
- Added click redirect mode (closes #549)
- Configured the ability to use IP address as partition key (closes #2331)
- Converted bad rows to new format (closes #2006)
- Shared a single thread pool for all writes to Kinesis (closes #2369)
- Specified UTF-8 encoding everywhere (closes #2147)
- Made cookie name customizable, thanks @kazjote! (closes #2474)
- Added boolean collector.cookie.enabled setting (closes #2488)
- Made backoffPolicy fields macros (closes #2518)
- Updated AWS credentials to support iam/env/default not cpf (closes #1518)
Scala Kinesis Enrich
- Bumped to 0.7.0
- Renamed to Stream Enrich (closes #2418)
- Bumped Kinesis Client Library to 1.6.1 (closes #1823)
- Bumped Scala Common Enrich to 0.21.0 (closes #2033)
- Bumped Iglu Scala Client to 0.3.1 (closes #2080)
- Configured the ability to use IP address as partition key (closes #2332)
- Started emitting KCL metrics to CloudWatch (closes #2357)
- Converted bad rows to new format (closes #1207)
- Removed outdated comment about ClasspathPropertiesFileCredentialsProvider from sample config file (closes #1519)
- Removed redundant documentation from README (closes #2032)
- Updated test suite with valid self-describing JSONs (closes #2151)
- Updated Scala Tracker to 0.2.0 and enabled EC2 context (closes #2109)
- Updated to use new EtlPipeline (closes #1933)
- Specified UTF-8 encoding everywhere (closes #2148)
Kinesis Elasticsearch Sink
- Bumped to 0.5.0
- Bumped Kinesis Client Library to 1.6.1 (closes #1824)
- Bumped Scala Common Enrich to 0.22.0 (closes #2152)
- Added mixed output mode (closes #2412)
- Added new canonical event fields (closes #2089)
- Moved the stream-type setting into the main sink configuration object (closes #2490)
- Made source and sink fields macros (closes #2519)
- Renamed Build object to match project (closes #2002)
- Converted bad rows to new format (closes #1208)
- Updated schema regular expression in line with Iglu Central (fixes #1998)
- Cached the mapping of field name to field type (closes #2090)
- Specified UTF-8 encoding everywhere (closes #2149)
- Stopped sending timestamp instead of failure count (fixes #1951)
- Made performance of conversion from TSV to JSON linear (closes #1847)
- Updated to latest version of EnrichedEvent (closes #2089)
Release 77 Great Auk
Updates Snowplow to run on the new 4.x series of Elastic MapReduce releases
Documentation
Common
- Made optionality of Lingual and HBase in config.yml clearer (#2206)
- Fixed OpenJDK build in Travis CI (#2447)
Scala Hadoop Enrich
- Bumped to 1.6.0
- Bumped Scala Common Enrich to 0.21.0 (#2442)
Scala Common Enrich
- Bumped to 0.21.0
- Fixed exception for invalid API key in currency conversion (#2441)
- Fixed exception on same currency conversion (#2437)
- Switched from javax.script to org.mozilla.javascript for JavaScriptEnrichment (#2453)
Scala Hadoop Shred
- Bumped to 0.8.0
- Bumped Iglu Scala Client to 0.3.2 (#2319)
EmrEtlRunner
- Bumped to 0.21.0
- Attached monitoring tags to jobflow (#425)
- Now throwing exception if processing thrift with --skip s3distcp or AMI 2.x.x (#1648)
- Added bootstrap action to prepare AMI >= 3.8.0 (#2320)
- Bumped Elasticity to 6.0.7 (#2400)
- Added support for Amazon EMR 4.x.x series (#1926)
- Prevented bad CLI options from throwing stack trace (#1930)
- Made error for nonempty processing bucket collector-agnostic (#1961)
- Bumped Ruby Tracker to 0.5.2 (#2143)
- Improved retry logic for EMR bootstrap timeouts (#2150)
- Excluded previously-built executables from the build (#2163)
- Added support for additional_info in EMR section of configuration (#2211)
- Added Elasticsearch stage to help message (#2323)
- Updated hadoop_enrich version in config.yml.sample to 1.6.0 (#2459)
- Updated hadoop_shred version in config.yml.sample to 0.8.0 (#2370)
- Removed snowplow-emr-etl-runner.sh (#2445)
StorageLoader
- Bumped to 0.7.0
- Added support for supplying config file as Base64-encoded string (#2227)
- Added ability to retrieve AWS credentials from EC2 role (#2226)
- Excluded previously-built executables from the build (#2164)
- Started printing stack trace for failures not caused by bad configuration (#2160)
- Bumped Ruby Tracker to 0.5.2 (#2144)
- Moved ANALYZE statements after VACUUM statements (#1361)
- Added resolver config option to snowplow-runner-and-loader.sh (#2170)
- Updated snowplow-runner-and-loader.sh to use JRuby binaries (#2233)
- Removed snowplow-storage-loader.sh (#2444)
- Wrote JSON Path file for com.optimizely/visitor_dimension event (#2436)
- Wrote JSON Path file for com.optimizely/visitor_audience event (#2435)
- Wrote JSON Path file for com.optimizely/visitor event (#2434)
- Wrote JSON Path file for com.optimizely/variation event (#2433)
- Wrote JSON Path file for com.optimizely/state event (#2432)
- Wrote JSON Path file for com.optimizely/experiment event (#2431)
- Wrote JSON Path file for io.augur.snowplow/identity_lite (#1958)
Redshift
- Wrote Redshift DDL for com.optimizely/visitor_dimension event (#2430)
- Wrote Redshift DDL for com.optimizely/visitor_audience event (#2429)
- Wrote Redshift DDL for com.optimizely/visitor event (#2428)
- Wrote Redshift DDL for com.optimizely/variation event (#2427)
- Wrote Redshift DDL for com.optimizely/state event (#2426)
- Wrote Redshift DDL for com.optimizely/experiment event (#2425)
- Added Redshift DDL for io.augur.snowplow/identity_lite (#1957)
Release 76 Changeable Hawk-Eagle
Introduces an event de-duplication process which runs on Hadoop, plus an important bug fix for our recent SendGrid webhook support
Scala Hadoop Enrich
- Bumped to 1.5.1
- Bumped Scala Common Enrich to 0.20.1 (#2338)
Scala Common Enrich
- Bumped to 0.20.0
- Now using only base MIME type in content-type check for SendGrid Adapter (#2328)
Scala Hadoop Shred
- Bumped to 0.7.0
- Fixed good tests' checks for empty paths (#2278)
- Now deduplicating event_id and event_fingerprint pairs (#2246)
- Fixed incorrect event in SchemaValidationFailed1Spec (#2355)
- Updated tests to check atomic-events output (#2264)
- Now only writes atomic-events if JSONs shred successfully (#2245)
- Removed empty SchemaValidationFailed2Spec (#2271)
- Fixed test suite issue with multiple input lines (#2270)
EmrEtlRunner
- Updated hadoop_enrich version in config.yml.sample to 1.5.1 (#2339)
- Changed in bucket example in config.yml.sample to s3://my-in-bucket (#2358)
- Updated archive bucket examples in config.yml (#2368)
- Updated hadoop_shred version in config.yml.sample to 0.7.0 (#2360)
StorageLoader
- Wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/action (#2136)
- Wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2135)
- Wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2134)
- Wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/productFieldObject (#2133)
- Wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2132)
Redshift
- Added Redshift DDL for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2131)
- Added Redshift DDL for com.google.analytics.enhanced-ecommerce/productFieldObject (#2130)
- Added Redshift DDL for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2129)
- Added Redshift DDL for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2128)
- Added Redshift DDL for com.google.analytics.enhanced-ecommerce/action (#2127)
Release 75 Long-Legged Buzzard
Adds support for Urban Airship and SendGrid webhooks
Scala Hadoop Enrich
- Bumped to 1.5.0
- Bumped Scala Common Enrich to 0.20.0 (#2200)
- Added test for loading Urban Airship Connect ndjson files (#2168)
- Added test for SendGrid Adapter (#2194)
Scala Common Enrich
- Bumped to 0.20.0
- Added JsonLoader for Urban Airship, Mixpanel et al (#2210)
- Added Adapter to pre-process Urban Airship events (#2167)
- Abstracted Mandrill reformatParameters function into Adapter (#2171)
- Added Adapter to pre-process SendGrid events (#1161)
EmrEtlRunner
- Bumped to 0.20.0
- Updated hadoop_enrich version in config.yml.sample to 1.5.0 (#2282)
- Added raw s3 -> hdfs step with group by (#2253)
- Added directory flattening code (#2232)
- Added support for ndjson loader format (#2251)
- Improved test coverage of runner.rb (#2250)
Redshift
- Added Redshift DDL for a com.sendgrid/processed event (#2172)
- Added Redshift DDL for a com.sendgrid/dropped event (#2173)
- Added Redshift DDL for a com.sendgrid/delivered event (#2174)
- Added Redshift DDL for a com.sendgrid/deferred event (#2175)
- Added Redshift DDL for a com.sendgrid/bounce event (#2176)
- Added Redshift DDL for a com.sendgrid/open event (#2177)
- Added Redshift DDL for a com.sendgrid/click event (#2178)
- Added Redshift DDL for a com.sendgrid/spamreport event (#2179)
- Added Redshift DDL for a com.sendgrid/unsubscribe event (#2180)
- Added Redshift DDL for a com.sendgrid/group_unsubscribe event (#2181)
- Added Redshift DDL for a com.sendgrid/group_resubscribe event (#2182)
- Added Redshift DDL for com.urbanairship.connect/UNINSTALL event (#2283)
- Added Redshift DDL for com.urbanairship.connect/TAG_CHANGE event (#2284)
- Added Redshift DDL for com.urbanairship.connect/SEND event (#2285)
- Added Redshift DDL for com.urbanairship.connect/RICH_READ event (#2286)
- Added Redshift DDL for com.urbanairship.connect/RICH_DELIVERY event (#2287)
- Added Redshift DDL for com.urbanairship.connect/RICH_DELETE event (#2288)
- Added Redshift DDL for com.urbanairship.connect/REGION event (#2289)
- Added Redshift DDL for com.urbanairship.connect/PUSH_BODY event (#2290)
- Added Redshift DDL for com.urbanairship.connect/OPEN event (#2291)
- Added Redshift DDL for com.urbanairship.connect/LOCATION event (#2292)
- Added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2293)
- Added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2294)
- Added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2295)
- Added Redshift DDL for com.urbanairship.connect/FIRST_OPEN event (#2296)
- Added Redshift DDL for com.urbanairship.connect/CUSTOM event (#2297)
- Added Redshift DDL for com.urbanairship.connect/CLOSE event (#2298)
StorageLoader
- Added JSON Path file for com.sendgrid/processed event (#2183)
- Added JSON Path file for com.sendgrid/dropped event (#2184)
- Added JSON Path file for com.sendgrid/delivered event (#2185)
- Added JSON Path file for com.sendgrid/deferred event (#2186)
- Added JSON Path file for com.sendgrid/bounce event (#2187)
- Added JSON Path file for com.sendgrid/open event (#2188)
- Added JSON Path file for com.sendgrid/click event (#2189)
- Added JSON Path file for com.sendgrid/spamreport event (#2190)
- Added JSON Path file for com.sendgrid/unsubscribe event (#2191)
- Added JSON Path file for com.sendgrid/group_unsubscribe event (#2192)
- Added JSON Path file for com.sendgrid/group_resubscribe event (#2193)
- Added JSON Path file for com.urbanairship.connect/UNINSTALL event (#2299)
- Added JSON Path file for com.urbanairship.connect/TAG_CHANGE event (#2300)
- Added JSON Path file for com.urbanairship.connect/SEND event (#2301)
- Added JSON Path file for com.urbanairship.connect/RICH_READ event (#2302)
- Added JSON Path file for com.urbanairship.connect/RICH_DELIVERY event (#2303)
- Added JSON Path file for com.urbanairship.connect/RICH_DELETE event (#2304)
- Added JSON Path file for com.urbanairship.connect/REGION event (#2305)
- Added JSON Path file for com.urbanairship.connect/PUSH_BODY event (#2306)
- Added JSON Path file for com.urbanairship.connect/OPEN event (#2307)
- Added JSON Path file for com.urbanairship.connect/LOCATION event (#2308)
- Added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2309)
- Added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2310)
- Added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2311)
- Added JSON Path file for com.urbanairship.connect/FIRST_OPEN event (#2312)
- Added JSON Path file for com.urbanairship.connect/CUSTOM event (#2313)
- Added JSON Path file for com.urbanairship.connect/CLOSE event (#2314)
Data modeling
Release 74 European Honey Buzzard
Adds a weather enrichment
Common
- Common: added encrypted OWM API key to .travis.yml (#2243)
Scala Hadoop Enrich
- Bumped to 1.4.0
- Bumped Scala Common Enrich to 0.19.0 (#2255)
Scala Common Enrich
- Bumped to 0.19.0
- Added weather enrichment (#456)
- Fixed issue with BC timestamp in ExtractEventTypeSpec (#2257)
- Fixed currency conversion enrichment's test for invalid API key (#2258)
StorageLoader
- Wrote JSON path file for org.openweathermap/weather (#2240)
Redshift
- Added Redshift DDL for org.openweathermap/weather (#2241)
Release 73 Cuban Macaw
Loads bad rows in batch pipeline into Elasticsearch, and formally separates the Snowplow enriched event format from the TSV format used to load Redshift.
EmrEtlRunner
- Bumped to 0.19.0
- Added hadoop_elasticsearch to config.yml.sample (#2124)
- Added support for Elasticsearch in targets section of config (#826)
- Bumped Elasticity to 6.0.5 (#2026)
- Stopped skipping the whole job just because enrich and shred are being skipped (#2049)
Scala Common Enrich
- Bumped Iglu Scala Client to 0.3.1 (#2079)
- Bumped version to 0.18.0
- Moved ScalazArgs into shared library (#2010)
- Removed executable bit from Scala source files (#2022)
- Removed JSON length checks (#2041)
- Removed truncation code (#2044)
- Stopped attempting to catch fatal errors (#2045)
Scala Hadoop Enrich
- Bumped to 1.3.0
- Bumped Scala Common Enrich to 0.18.0 (#2015)
- Added Iglu Scala Client as an explicit dependency (#2115)
- Added .forceToDisk to speed up run (#859)
- Started using Scala Common Enrich's version of ScalazArgs (#2013)
Scala Hadoop Shred
- Bumped to 0.6.0
- Added .forceToDisk to common to speed up run (#2039)
- Bumped Iglu Scala Client to 0.3.1 (#2081)
- Bumped Scala Common Enrich to 0.18.0 (#2016)
- Applied truncation logic to atomic-events TSV (#2042)
- Processed enriched events for atomic.events removing JSON fields (#1731)
- Started using Scala Common Enrich's version of ScalazArgs (#2014)
Storage
Hadoop Elasticsearch Sink
- Added. (#824)
StorageLoader
- Bumped to 0.6.0
- Added tcpKeepAlive=true to JDBC for long-running COPYs via NAT (#2145)
- Fixed setup guide link in README, thanks @diamondo25! (#2025)
- Loaded atomic.events from shredded folder (#1795)
Postgres
Redshift
- Added migration script for 0.4.0 to 0.8.0 (#2155)
- Added migration script for 0.5.0 to 0.8.0 (#2119)
- Added migration script for 0.6.0 to 0.8.0 (#2120)
- Added migration script for 0.7.0 to 0.8.0 (#2048)
- Removed JSON fields from atomic.events (#1849)