diff --git a/README.md b/README.md
index 9415e9fff12e7..a4ae4b3d96c1a 100644
--- a/README.md
+++ b/README.md
@@ -157,6 +157,21 @@ Example: When you change any file in `versioned_docs/version-0.7.0/`, it will on
 ## Configs
 Configs can be automatically updated by following these steps documented at ../hudi-utils/README.md
 
+## Talks
+
+When adding a talk, please follow these guidelines.
+
+1. Ensure the entry is of the format
+   "[Title](Hyperlink to video/resources)" - By , , . , .
+2. Please ensure the talks are in chronological order.
+3. Try to add links to videos and slide decks when possible. If they are not available in same page, feel free to add
+   [Slides](Slides link) towards the end like for example:
+
+:::note
+   ["Hoodie: An Open Source Incremental Processing Framework From Uber"](http://www.dataengconf.com/hoodie-an-open-source-incremental-processing-framework-from-uber) - By Vinoth Chandar.
+   Apr 2017, DataEngConf, San Francisco, CA [Slides](https://www.slideshare.net/vinothchandar/hoodie-dataengconf-2017) [Video](https://www.youtube.com/watch?v=7Wudjc-v7CA)
+:::
+
 ## Blogs
 
 When adding a new blog, please follow these guidelines.
diff --git a/website/src/pages/talks.md b/website/src/pages/talks.md
index bea3757170013..ddcfacb5fad41 100644
--- a/website/src/pages/talks.md
+++ b/website/src/pages/talks.md
@@ -49,59 +49,85 @@ last_modified_at: 2019-12-31T15:59:57-04:00
 
 18. ["Next Generation Data lakes using Apache Hudi"](https://docs.google.com/presentation/d/1y-ryRwCdTbqQHGr_bn3lxM_B8L1L5nsZOIXlJsDl_wU/edit?usp=sharing) - By Balaji Varadarajan and Sivabalan Narayanan, Sep 2020, ["ApacheCon"](https://www.apachecon.com/)
 
-19. ["Building Large-Scale, Transactional Data Lakes using Apache Hudi"](https://www.dbta.com/DataSummit/Fall2020/Agenda.aspx) - By Nishith Agarwal, Data Summit 2020
+19. ["Apache Hudi on Amazon EMR"](https://pages.awscloud.com/rs/112-TZM-766/images/EV_analytics-sprint-week-apache-hundi-amazon-emr_Sep-2020.pdf) - By the AWS team. September 2020
 
-20. ["Landing practice of Apache Hudi in T3go"](https://drive.google.com/file/d/1ULVPkjynaw-07wsutLcZm-4rVXf8E8N8/view?usp=sharing) - By VinoYang and XianghuWang, November 2020, Qcon.
+20. ["Building Large-Scale, Transactional Data Lakes using Apache Hudi"](https://www.dbta.com/DataSummit/Fall2020/Agenda.aspx) - By Nishith Agarwal, Data Summit 2020
 
-21. ["Meetup talk by Nishith Agarwal"](https://www.meetup.com/UberEvents/events/274924537/) - Uber Data Platforms Meetup, Dec 2020
+21. ["Landing practice of Apache Hudi in T3go"](https://drive.google.com/file/d/1ULVPkjynaw-07wsutLcZm-4rVXf8E8N8/view?usp=sharing) - By VinoYang and XianghuWang, November 2020, Qcon.
 
-22. ["Apache Hudi learning series: Understanding Hudi internals"](https://www.slideshare.net/NishithAgarwal3/hudi-architecture-fundamentals-and-capabilities) - By Abhishek Modi, Balajee Nagasubramaniam, Prashant Wason, Satish Kotha, Nishith Agarwal, Feb 2021, Uber Meetup
+22. ["Meetup talk by Nishith Agarwal"](https://www.meetup.com/UberEvents/events/274924537/) - Uber Data Platforms Meetup, Dec 2020
 
-23. ["Apache Hudi Meetup at Uber with talks from AWS, CityStorageSystems & Uber"](https://youtu.be/iXBInMLbjo0) - By Udit Mehrotra, Wenning Ding (AWS), Alexander Filipchik (CityStorageSystems), Prashant Wason, Satish Kotha (Uber), Feb 2021
+23. ["Apache Hudi learning series: Understanding Hudi internals"](https://www.slideshare.net/NishithAgarwal3/hudi-architecture-fundamentals-and-capabilities) - By Abhishek Modi, Balajee Nagasubramaniam, Prashant Wason, Satish Kotha, Nishith Agarwal, Feb 2021, Uber Meetup
 
-24. ["Apache Hudi: The Streaming Data Lake Platform"](https://docs.google.com/presentation/d/1lVpbYV7qytAZPdwx4X9DD9ii0qFh7n9WGKJ0XQ4VpIs/edit?usp=sharing) - By Nishith Agarwal, Sivabalan Narayanan,
+24. ["Apache Hudi Meetup at Uber with talks from AWS, CityStorageSystems & Uber"](https://youtu.be/iXBInMLbjo0) - By Udit Mehrotra, Wenning Ding (AWS), Alexander Filipchik (CityStorageSystems), Prashant Wason, Satish Kotha (Uber), Feb 2021
+
+25. ["Speeding up Presto Queries Using Apache Hudi Clustering"](https://www.youtube.com/watch?v=1WSg2aiCwDQ) - By Satish Kotha and Nishith Agarwal. Presto Con, March 2021
+
+26. ["Apache Hudi: The Streaming Data Lake Platform"](https://docs.google.com/presentation/d/1lVpbYV7qytAZPdwx4X9DD9ii0qFh7n9WGKJ0XQ4VpIs/edit?usp=sharing) - By Nishith Agarwal, Sivabalan Narayanan,
 Data Summit Connect, May, 2021
 
-25. ["Change Data Capture to Data lakes using Apache Pulsar/Hudi"](https://www.slideshare.net/streamnative/change-data-capture-to-data-lakes-using-apache-pulsar-and-apache-hudi-pulsar-summit-na-2021) - By Vinoth Chandar, Pulsar Summit North America, June 2021. ["Video link"](https://www.youtube.com/watch?v=MWpnVIgcAXw)
+27. ["Apache Hudi: Large Scale Data Systems with Vinoth Chandar"](https://softwareengineeringdaily.com/2021/05/13/apache-hudi-large-scale-data-systems-with-vinoth-chandar/) - By Vinoth Chandar. SE Daily Podcast. May, 2021
 
-26. ["Apache Hudi: Large Scale Data Systems with Vinoth Chandar"](https://softwareengineeringdaily.com/2021/05/13/apache-hudi-large-scale-data-systems-with-vinoth-chandar/) - By Vinoth Chandar. SE Daily Podcast. May, 2021
+28. ["Change Data Capture to Data lakes using Apache Pulsar/Hudi"](https://www.slideshare.net/streamnative/change-data-capture-to-data-lakes-using-apache-pulsar-and-apache-hudi-pulsar-summit-na-2021) - By Vinoth Chandar, Pulsar Summit North America, June 2021. ["Video link"](https://www.youtube.com/watch?v=MWpnVIgcAXw)
 
-27. ["Meet the creator of Apache hudi: Vinoth Chandar"](https://www.youtube.com/watch?v=XcaFaJR4IVk) - By Vinoth Chandar. Presto Con Day, 2021
+29. ["Meet the creator of Apache hudi: Vinoth Chandar"](https://www.youtube.com/watch?v=XcaFaJR4IVk) - By Vinoth Chandar. Presto Con Day, 2021
 
-28. ["Presto Eco system Panel Discussion"](https://www.youtube.com/watch?v=lsFSM2Z4kPs) - By Vinoth Chandar, Dipti Borkar, Nezih Yigitbasi, Maxime Beauchemin, Kishore. Presto Con, 2021
+30. ["Presto Eco system Panel Discussion"](https://www.youtube.com/watch?v=lsFSM2Z4kPs) - By Vinoth Chandar, Dipti Borkar, Nezih Yigitbasi, Maxime Beauchemin, Kishore. Presto Con, 2021
 
-29. ["Speeding up Presto Queries Using Apache Hudi Clustering"](https://www.youtube.com/watch?v=1WSg2aiCwDQ) - By Satish Kotha and Nishith Agarwal. Presto Con, March 2021
+31. ["Building a Large-scale Transactional Data Lake Using Apache Hudi"](https://www.youtube.com/watch?v=J6EcGiExx7M) - By Satish Kotha, AICamp
 
-30. ["Building a Large-scale Transactional Data Lake Using Apache Hudi"](https://www.youtube.com/watch?v=J6EcGiExx7M) - By Satish Kotha, AICamp
+32. ["Apache Hudi table format, Purpose-built for low latency data lake use-cases"](https://www.dremio.com/subsurface/introducing-the-apache-hudi-table-format-purpose-built-for-low-latency-data-lake-use-cases/) - By Nishith Agarwal and Sivabalan Narayanan. July, 2021
 
-31. ["Apache Hudi table format, Purpose-built for low latency data lake use-cases"](https://www.dremio.com/subsurface/introducing-the-apache-hudi-table-format-purpose-built-for-low-latency-data-lake-use-cases/) - By Nishith Agarwal and Sivabalan Narayanan. July, 2021
+33. ["Community round table: Open data lakes with Presto, Hudi and Aws - the next generation of analytics"](https://ahana.io/videos-presentations/roundtable-presto-hudi-aws/) - By Vinoth chandar, Roy Hasson, Dipti Borkar, Coordinated by Eric Kavanagh. July, 2021
 
-32. ["Community round table: Open data lakes with Presto, Hudi and Aws - the next generation of analytics"](https://ahana.io/videos-presentations/roundtable-presto-hudi-aws/) - By Vinoth chandar, Roy Hasson, Dipti Borkar, Coordinated by Eric Kavanagh. July, 2021
+34. ["DataEngineering Podcast: Charting A Path For Streaming Data To Fill Your Data Lake With Hudi"](https://www.dataengineeringpodcast.com/hudi-streaming-data-lake-episode-209/) - By Vinoth Chandar. Data Engineering Podcast, Aug, 2021
 
-33. ["DataEngineering Podcast: Charting A Path For Streaming Data To Fill Your Data Lake With Hudi"](https://www.dataengineeringpodcast.com/hudi-streaming-data-lake-episode-209/) - By Vinoth Chandar. Aug, 2021
-
-34. ["Streaming Data Lakes using Kafka Connect + Apache Hudi"](https://www.slideshare.net/HostedbyConfluent/streaming-data-lakes-using-kafka-connect-apache-hudi-vinoth-chandar-apache-software-foundation) - Balaji Varadarajan and Vinoth Chandar. Sep 27, 2021.
+35. ["Streaming Data Lakes using Kafka Connect + Apache Hudi"](https://www.confluent.io/events/kafka-summit-americas-2021/streaming-data-lakes-using-kafka-connect-apache-hudi/) - Balaji Varadarajan and Vinoth Chandar. Kafka Summit, Sep 27, 2021.
 
-35. ["Code/Design walk through"](https://www.youtube.com/watch?v=0ezDbR_4FqU) - By Vinoth Chandar. Oct 8, 2021
+36. ["Code/Design walk through"](https://www.youtube.com/watch?v=0ezDbR_4FqU) - By Vinoth Chandar. Oct 8, 2021
+
+37. ["Apache Hudi - The Data lake platform"](https://www.youtube.com/watch?v=nGcT6RPjez4) - By Vinoth Chandar. ApacheCon, Oct 11, 2021
+
+38. ["Building Open Data Lakes on AWS with Debezium and Apache Hudi"](https://programmaticponderings.com/2021/10/31/demonstration-building-open-data-lakes-on-aws-with-debezium-and-apache-hudi/) - By Gary A. Stafford. Oct 31, 2021
+
+39. ["Apache Hudi Meetup at Uber with talks from Disney, Walmart & Uber"](https://youtu.be/ZamXiT9aqs8) - By Vinay Patil (Disney+Hotstar), Samuel Guleff (Walmart), Surya Prasanna Yalla, Meenal Binwade (Uber), Jan 2022
+
+40. ["Apache Hudi Meetup at Uber with talks from Philips, Moveworks & Uber (including Hudi OSS roadmap 2022)"](https://youtu.be/8Q0kM-emMyo) - By Felix Kizhakkel Jose (Philips), Bhavani Sudha (Moveworks), Prashant Wason (Uber), March 2022
+
+41. ["Apache Hudi with Vinoth Chandar"](https://softwareengineeringdaily.com/2022/03/08/apache-hudi-with-vinoth-chandar/) By Software Engineering Daily. Mar 5, 2022
+
+42. ["Presto Tech Talk: Optimizing table layout for Presto using Apache Hudi"](https://www.youtube.com/watch?v=J1JuHVFdggs) - By Ethan Guo and Vinoth Chandar. Presto Meetup. Jun 23, 2022
+
+43. ["PrestoDB and Apache Hudi for the Lakehouse"](https://www.youtube.com/watch?v=3zQJR-IGH0Y&list=PLJVeO1NMmyqXHoLuUJtulMDU0yBgSL0GH&index=11) - By Sagar Sumit and Bhavani Sudha Saktheeswaran. PrestoCon Day. Jul 21, 2022
+
+44. ["Petabyte-scale lakehouses with dbt and Apache Hudi"](https://coalesce.getdbt.com/blog/petabyte-scale-lakehouses-with-dbt-and-apache-hudi) - By Vinoth Govindarajan and Vinoth Chandar. Coalesce, Oct 17, 2022
+
+45. ["Build on Open Source Episode 7 - aws on Twitch"](https://www.twitch.tv/videos/1656012018) - By Vinoth Chandar. Nov 18th, 2022
+
+46. ["Prestocon- Exploring New Frontiers: How Apache Flink, Apache Hudi and Presto Power New Insights at Scale"](https://www.onehouse.ai/blog/exploring-new-frontiers-how-apache-flink-apache-hudi-and-presto-power-new-insights-at-scale) - By Danny Chan and Sagar Sumit. PrestoCon, June 2023
+
+47. ["Lakehouses for Data Engineers: What You Need to Consider to Build Efficient ETL Pipelines"](https://www.databricks.com/dataaisummit/session/lakehouses-data-engineers-what-you-need-consider-build-efficient-etl-pipelines/) - By Nadine Farah. Data AI Summit, June 2023
+
+48. ["Building Lakehouse using Hudi | Apache Hudi | Data Lakehouse | Hudi | Apache"](https://www.youtube.com/watch?v=3N4XVil05sM) - By the DataCouch Team. July 2023
 
-36. ["Apache Hudi - The Data lake platform"](https://www.youtube.com/watch?v=nGcT6RPjez4) - By Vinoth Chandar. Oct 11, 2021
+49. ["Trino fest: Skip rocks and files: Turbocharge Trino queries with Hudi’s multi-modal indexing subsystem"](https://trino.io/blog/2023/07/07/trino-fest-2023-onehouse-recap.html) - By Nadine Farah and Sagar Sumit. Trino Fest, July 2023
 
-37. ["Building Open Data Lakes on AWS with Debezium and Apache Hudi"](https://programmaticponderings.com/2021/10/31/demonstration-building-open-data-lakes-on-aws-with-debezium-and-apache-hudi/) - By Gary A. Stafford. Oct 31, 2021
+50. ["A Glide, Skip or a Jump: Efficiently Stream Data into Your Medallion Architecture with Apache Hudi"](https://www.confluent.io/events/current/2023/a-glide-skip-or-a-jump-efficiently-stream-data-into-your-medallion/) - By Nadine Farah and Ethan Guo. Current, September 2023
 
-38. ["Apache Hudi Meetup at Uber with talks from Disney, Walmart & Uber"](https://youtu.be/ZamXiT9aqs8) - By Vinay Patil (Disney+Hotstar), Samuel Guleff (Walmart), Surya Prasanna Yalla, Meenal Binwade (Uber), Jan 2022
+51. ["Incremental Data Processing with Apache Hudi](https://qconsf.com/presentation/oct2023/incremental-data-processing-apache-hudi) - By Bhavani Sudha Saktheeswaran and Saketh Chintapalli. Qcon, October 2023
 
-39. ["Apache Hudi Meetup at Uber with talks from Philips, Moveworks & Uber (including Hudi OSS roadmap 2022)"](https://youtu.be/8Q0kM-emMyo) - By Felix Kizhakkel Jose (Philips), Bhavani Sudha (Moveworks), Prashant Wason (Uber), March 2022
+52. ["Keynote: The Future is Unified: The Convergence of Data Lakes and Data Warehouses into an Interoperable Data Lakehouse"](https://dewcon.ai/) - By Vinoth Chandar. DEWCON, October 2023
 
-40. ["Apache Hudi with Vinoth Chandar"](https://softwareengineeringdaily.com/2022/03/08/apache-hudi-with-vinoth-chandar/) By Software Engineering Daily. Mar 5, 2022
+53. ["Panel: Is the Modern Data Stack Dead?"](https://dewcon.ai/) - By Vinoth Chandar, Joe Reis, Divyansh Saini, Shuveb Hussain. DEWCON, October 2023
 
-41. ["Presto Tech Talk: Optimizing table layout for Presto using Apache Hudi"](https://www.youtube.com/watch?v=J1JuHVFdggs) - By Ethan Guo and Vinoth Chandar. Presto Meetup. Jun 23, 2022
+54. ["Apache Hudi 1.0 preview: A database experience on the data lake"](https://opensourcedatasummit.com/apache-hudi-1-preview/) - By Bhavani Sudha Saktheeswaran and Sagar Sumit. OSDS, November 2023
 
-42. ["PrestoDB and Apache Hudi for the Lakehouse"](https://www.youtube.com/watch?v=3zQJR-IGH0Y&list=PLJVeO1NMmyqXHoLuUJtulMDU0yBgSL0GH&index=11) - By Sagar Sumit and Bhavani Sudha Saktheeswaran. PrestoCon Day. Jul 21, 2022
+55. ["Consistently Hashing it Out: Embracing Fresher, Faster Data with the Hudi-Flink Support for the Bucket Index"](https://www.ververica.academy/app/courses/f352775b-6c43-475b-84e0-d6070c57b1a7) - By Nadine Farah. Flink Forward, November 2023
 
-43. ["Petabyte-scale lakehouses with dbt and Apache Hudi"](https://youtu.be/aTn5dkm6rqQ) - By Vinoth Govindarajan and Vinoth Chandar. Oct 17, 2022
+56. ["Session: Maximizing efficiency by templating Glue jobs and serverless architecture in Hudi data lakes"](https://opensourcedatasummit.com/maximizing-efficiency-at-job-target/) - By Soumil Shah. OSDS, November 2023
 
-44. ["Build on Open Source Episode 7 - aws on Twitch"](https://www.twitch.tv/videos/1656012018) - By Vinoth Chandar. Nov 18th, 2022
+57. ["Data Alchemy: Transforming Raw Data to Gold with Apache Hudi and DBT"](https://osacon.io/sessions/2023/data-alchemy-transforming-raw-data-to-gold-with-apache-hudi-and-dbt/) - By Nadine Farah. OSA CON, December 2023
 
-45. ["Apache Hudi on Amazon EMR"](https://pages.awscloud.com/rs/112-TZM-766/images/EV_analytics-sprint-week-apache-hundi-amazon-emr_Sep-2020.pdf) - By the AWS team. September 2020
+58. ["Panel Discussion on Growing a Healthy Open Source Community"](https://osacon.io/sessions/2023/panel-discussion-on-growing-a-healthy-open-source-community/) - By Nadine Farah, Ali LeClerc, Evan Rusackas. OSA CON, December 2023
 
-46. ["Building Lakehouse using Hudi | Apache Hudi | Data Lakehouse | Hudi | Apache"](https://www.youtube.com/watch?v=3N4XVil05sM) - By the DataCouch Team. July 2023
\ No newline at end of file
+59. ["Unveil the Magic Without Hoodini: Transform Your Machine Learning Pipelines with Apache Hudi"](https://www.youtube.com/watch?v=pUZHotLdkjU) - By Nadine Farah. AI.dev, December 2023