Home
mikevalenty edited this page Apr 6, 2014 · 94 revisions
Scalding is a Scala library that makes it easy to write MapReduce jobs in Hadoop. It's similar to other MapReduce platforms like Pig and Hive, but offers a higher level of abstraction by leveraging the full power of Scala and the JVM.
Scalding is built on top of Cascading, a Java library that abstracts away much of the complexity of Hadoop (such as the need to write raw `map` and `reduce` functions).
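To make the map and reduce steps concrete, here is a minimal word-count sketch written with plain Scala collections only. It is not the Scalding API: `WordCountSketch`, `tokenize`, and `wordCount` are hypothetical names chosen for illustration, and the code runs in memory rather than on Hadoop. It only shows the flatMap, group, and count shape that a MapReduce word count follows.

```scala
object WordCountSketch {
  // Split a line into lowercase words, dropping punctuation and empty tokens.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase
      .replaceAll("[^a-z0-9\\s]", "")
      .split("\\s+")
      .filter(_.nonEmpty)
      .toSeq

  // "Map" step: turn every line into words.
  // "Reduce" step: group equal words and count each group.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(tokenize)
      .groupBy(identity)
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("Hello Hadoop", "hello, Scalding!"))
    // Print one tab-separated (word, count) pair per line, sorted by word.
    counts.toSeq.sortBy(_._1).foreach { case (w, n) => println(w + "\t" + n) }
  }
}
```

A real Scalding job expresses the same pipeline against distributed sources and sinks; see the Getting Started page and the API references linked below for the actual syntax.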
- Cascading Google Group. We are using this Google Group for Scalding questions as well.
- @Scalding on Twitter
- Frequently Asked Questions
- Upgrading to 0.9.0 means fixing some compile issues. These sed rules may help.
- Scaladocs: Generated documentation for the current version of Scalding.
- Note: `sbt doc` will build scaladocs under the `target/2.9.2/api/` directory, which you can then open in your browser.
- Getting Started
- Fields-based API Reference. This is the original Cascading DSL API for Scalding, which uses a named-tuple model. It also contains many example code snippets illustrating each Scalding function. See Field Rules for more on Fields.
- Type-safe API Reference. This API is very close to the Scala collections API.
- Getting Started with the Matrix library
- Matrix-API-Reference
- Building Bigger Platforms With Scalding. Some approaches for modular design and composition with Scalding.
- Scalding Sources
- Scalding-Commons. The README of the former scalding-commons library.
- Rosetta Code. A collection of MapReduce tasks translated (from Pig, Hive, Cascalog, MapReduce Streaming, etc.) into Scalding.
- Oscar's Scalding Talk at the Hadoop Summit. Slides from Oscar's talk at the Hadoop Summit.
- Scalding-cassandra. Support for reading from and writing to Cassandra.
- [SpyGlass](https://github.com/ParallelAI/SpyGlass). An advanced-featured HBase wrapper for Cascading and Scalding.
- Scalding: Powerful & Concise MapReduce Programming
- Scalding lecture for UC Berkeley's Analyzing Big Data with Twitter class
- Scalding with CDH3U2 in a Maven project
- Running your Scalding jobs in Eclipse
- Run/Test jobs locally from IntelliJ IDEA
- Running your Scalding jobs in IntelliJ IDEA
- Running Scalding jobs on EMR
- Running Scalding with HBase support: Scalding HBase wiki
- Using the distributed cache
- Calling Scalding from inside your application
- Unit Testing Scalding Jobs
- Scalding for the impatient. A great set of tutorials on using Scalding, walking through simple to more complex examples (including TF-IDF).
- Movie Recommendations and more in MapReduce and Scalding
- Generating Recommendations with MapReduce and Scalding, a shorter version of the above post.
- Poker collusion detection with Mahout and Scalding
- Portfolio Management in Scalding
- Find the Fastest Growing County in US, 1969-2011, using Scalding
- Dean Wampler's Scalding Workshop. Presented by Dean at StrangeLoop 2012.
- Typesafe's Activator for Scalding. Also created by Dean Wampler.
- Hive, Pig, Scalding, Scoobi, Scrunch and Spark: A Comparison of Hadoop Frameworks
- Why Hadoop MapReduce needs Scala
- How Twitter is doing its part to democratize big data
- Meet the combo powering Hadoop at Etsy, Airbnb and Climate Corp.
- Scalding wins a Bossie award from InfoWorld
- Scalding: Hadoop Word Count in LESS than 70 lines of code
- Using Scalding with other versions of Scala
- Scala and sbt for Homebrew users
- Scala and sbt for MacPorts users
- Comparison to Scrunch and Scoobi
- Powered-By. See who is using Scalding in production.