Below is a list of public datasets from industry, each with a short description. Clicking a link will take you to that dataset's description page. The list and its descriptions are updated periodically.
- PHM Data Challenge 18: Etching tool fault detection (PdM).
- SECOM: Semiconductor manufacturing process data.
- WM-811K (LSWMD): Wafer fault detection & classification data.
- Gas Sensor Array Drift: This archive contains 13910 measurements from 16 chemical sensors exposed to 6 different gases at various concentration levels.
- Chemical Detection Platform: The dataset contains 18000 time-series recordings from a chemical detection platform at six different locations in a wind tunnel facility in response to ten high-priority chemical gaseous substances.
- Dynamic Gas Mixtures: The dataset contains the recordings of 16 chemical sensors exposed to two dynamic gas mixtures at varying concentrations. For each mixture, signals were acquired continuously for 12 hours.
- C-MAPSS: Engine degradation simulation.
  PHM08: Challenge on this dataset.
- CNC Mill Tool Wear: Machining data was collected from a CNC machine for variations of tool condition, feed rate, and clamping pressure.
- Naval Propulsion Plants: Data for naval propulsion plants of the COmbined Diesel eLectric And Gas (CODLAG) type.
- PHM Data Challenge 17: Predict faulty regimes of operation of a train car using the provided data and physics-based modeling methods.
- Appliance Energy: Experimental data used to create regression models of appliance energy use in a low-energy building.
- Combined Cycle Power Plant: Data collected from a combined cycle power plant over 6 years.
- GREEND: An energy dataset containing power measurements collected from multiple households in Austria and Italy. It provides detailed per-device energy profiles at a sampling rate of 1 Hz.
- ECO (Electricity Consumption & Occupancy): A comprehensive dataset for non-intrusive load monitoring and occupancy detection research.
- UK DALE dataset: This dataset records the power demand from five houses. In each house, both the whole-house mains power demand and the power demand of individual appliances are recorded every six seconds. In three of the five houses (houses 1, 2 and 5), the whole-house voltage and current are also recorded at 16 kHz.
- BLUED dataset: The dataset consists of voltage and current measurements for a single-family residence in the United States, sampled at 12 kHz for a whole week.
- REDD: A Public Data Set for Energy Disaggregation Research: A freely available dataset containing detailed power usage information from several homes, aimed at furthering research on energy disaggregation (the task of determining the component appliance contributions from an aggregated electricity signal).
- Experiments on Li-ion batteries: Charging and discharging at different temperatures. Impedance is recorded as the damage criterion.
- Data-driven prediction of battery cycle life before capacity degradation
- Hill-Valley: This is NOT a manufacturing dataset, but it looks useful for testing pattern detection methods.
- APS System Failures: The dataset's positive class consists of failures of a specific component of the APS system. The negative class consists of trucks with failures of components not related to the APS.
- Sehee Lee (sehee.lee@makinarocks.ai)
- Minkyu Jeon (jskstar12@makinarocks.ai)