Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
Updated Apr 1, 2022 - Python
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and highly computational problems.
RedisGears python client
Python 2.7 code and learning notes for Spark 2.1.1
Code for paper "Locally Distributed Deep Learning Inference on Edge Device Clusters"
Iterable Java8 style Streams for Python
🎓 Repository for master's labs on FCSN, BSUIR
A tool that converts long audio files into a thorough, summarized report. Leverages OpenAI and its API (ChatGPT backend), Langchain for text processing, and Pinecone for vector database facilitation.
Distributed encoding, second generation.
Source code of the numerical experiments presented in "Energy-Efficient Edge-Facilitated Wireless Collaborative Computing using Map-Reduce" by Antoine Paris, Hamed Mirghasemi, Ivan Stupia and Luc Vandendorpe (presented at SPAWC19).
A package for working with lists distributed over MPI
Implementation of the Girvan-Newman algorithm to detect communities in graphs, using the Yelp dataset
Scatter gather with AWS lambda
Parallel implementation of the Breadth-First Search algorithm in Java MapReduce and PySpark. This implementation finds degrees of separation between Twitter users
A case study on mining association rules between different factors related to deaths of people in the United States
Anagram Python Script In Hadoop
Learn Big Data tools and frameworks by working through examples, POCs, and projects.