# massive-datasets

Here are 24 public repositories matching this topic...

polardb / polardbx-sql

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

mysql distributed-transactions cloud-native high-availability relational-database high-concurrency massive-datasets htap horizontal-scaling enterprise-class

Updated Nov 11, 2024
Java

helmholtz-analytics / heat

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

python data-science machine-learning hpc gpu numpy mpi pytorch distributed parallelism data-analytics tensors data-processing multi-gpu mpi4py massive-datasets multi-node-cluster array-api

Updated Nov 18, 2024
Python

polardb / polardbx

PolarDB-X is a cloud-native distributed SQL database designed for high-concurrency, massive-storage, and complex-query scenarios.

mysql distributed-transactions cloud-native high-availability relational-databases high-concurrency massive-datasets htap horizontal-scaling enterprise-class

Updated May 14, 2024
Makefile

simkarwin / mimo_keras

TF-Package: Multiple-Input Multiple-Output Keras Data-Generator for massive and complex datasets

massive-datasets keras-datagenerator mimo-models

Updated Jan 2, 2023
Python

joshuaboud / gen-dataset

Command line tool to quickly generate a lot of files in a lot of directories

linux benchmarking evaluation multithreading dataset dataset-generation massive-datasets cli-tool dataset-generator

Updated Feb 18, 2022
C++

FedericoBruzzone / anti-money-laundering

The project is based on the analysis of the "IBM Transactions for Anti Money Laundering" dataset published on Kaggle. The task is to implement a model which predicts whether or not a transaction is illicit, using the attribute "Is Laundering" as a label to be predicted.

machine-learning machine-learning-algorithms pyspark massive-datasets

Updated Aug 12, 2024
Jupyter Notebook

rajeshidumalla / Bloom-Filter

Building a Bloom Filter on English dictionary words

python data-science machine-learning bloom-filter data-analysis nltk-library massive-datasets

Updated Oct 7, 2021
Jupyter Notebook

gmalik9 / floating_point_data_compressor

gipa -- compression/decompression tool to package, compress, and encode massive archive files with floating-point data

compression data-visualization autoencoder compressor data-compression representation representation-learning floating-point massive-datasets

Updated Sep 14, 2017
Python

FedericoBruzzone / algorithms-for-massive-datasets

This repository contains a LaTeX file that generates a PDF document comprising comprehensive notes for the course "Algorithms for Massive Datasets"

deep-learning algorithms recommender-system massive-datasets unimi linkanalysis

Updated Aug 12, 2024
TeX

rajeshidumalla / PageRank

Building the PageRank algorithm on the web graph around Stanford.edu using the NetworkX Python library

python data-science machine-learning spark numpy pagerank-algorithm pandas data-analysis massive-datasets networkx-library

Updated Oct 7, 2021
Jupyter Notebook

Alex4gtx / Massive-Data-Handler

Lets you open and manipulate massive text/data files that would otherwise be impossible to open on a computer, for example a text file of 20+ GB. It lets you work with the file by fetching only the lines you need, without freezing the computer for lack of memory.

big-data dictionaries python-script massive-datasets manipulacao-arquivos

Updated Feb 12, 2022
Python

diem-ai / google-bigquery

A series of SQL exercises on working with databases, using Google BigQuery to scale to massive datasets, taught by educators on Kaggle.com

python bigquery sql analytics kaggle massive-datasets

Updated Jul 9, 2019
Jupyter Notebook

rajeshidumalla / node2vec

Building the node2vec algorithm

python data-science machine-learning numpy pandas data-analysis matplotlib massive-datasets node2vec networkx-graph

Updated Oct 7, 2021
Jupyter Notebook

arhcoder / Netflix-Recommendation

📺 Content Recommendation System for the Netflix Prize Challenge with Collaborative Filtering.

python jupyter-notebook collaborative-filtering netflix recommendation-system recommendation-engine recommender-system massive-datasets netflix-prize massive-data

Updated Feb 17, 2024
Jupyter Notebook

manuparra / hadoop-statistics

Calculate statistical measures of one column in big data datasets with this simple Hadoop application

java hadoop bigdata max avg min standardeviation massive-datasets

Updated Feb 24, 2017
Java

rajeshidumalla / Wordcount-in-Spark

Word count in Spark

python spark python-library pandas wordcount massive-datasets

Updated Oct 6, 2021
Jupyter Notebook

KolwaBrad / massivedataset

Training on the MASSIVE dataset by Amazon (English-US, German-DE, and Swahili-KE)

python massive-datasets

Updated Oct 2, 2023
Python

dhruv3 / MRbasedFriendRecommender

MapReduce program to suggest new friends based on the count of mutual friends

java mapreduce datamining massive-datasets

Updated Mar 2, 2018
Java

miguel-kjh / Machine-Translation

language translation massive-datasets

Updated Dec 11, 2020
Jupyter Notebook

nelsonstos / bulk-load-api-multivende

This project facilitates the efficient mass registration of products using RabbitMQ, managing loads exceeding 50,000 products.

nodejs mysql docker queue rabbitmq expressjs pubsub api-rest massive-datasets prisma

Updated Jul 6, 2024
JavaScript
