OneCLick is a lightweight, open-source Streamlit application for enriching bibliometric datasets by retrieving missing keywords from the OpenAlex API.
It streamlines preprocessing for tools like VOSviewer, CiteSpace, and Bibliometrix, and outputs reproducible, analysis-ready files for downstream research.
- 📁 Upload Input:
  - Accepts `.xlsx` files with a column named `DI` (DOI).
- 🔄 DOI-Based Keyword Retrieval:
  - Sends DOI queries to OpenAlex and retrieves enriched metadata keywords.
- ⚠️ Error & Warning Handling:
  - Invalid DOIs, network timeouts, or rate-limit errors → `Failed_DOIs.csv`
  - Valid DOIs without keywords → flagged as `[WARNING:NoKeywords]` in `NoKeyword_DOIs.csv`
- 📊 Latency Logging:
  - Per-DOI request latency is tracked and summarized (median & IQR)
  - Full request log saved as `Latency_Log.csv`
- 📤 Outputs:
  - Enriched dataset → `openalex_keywords.csv` / `.xlsx`
  - Downloadable error & warning logs
  - Latency logs for reproducibility and benchmarking
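
The retrieval and triage rules listed above can be sketched roughly as follows. This is a minimal illustration, not OneCLick's actual code: the helper names (`normalize_doi`, `openalex_url`, `classify_response`) and the bucket labels are hypothetical. It assumes the OpenAlex works endpoint accepts a DOI URL as the work identifier and returns keywords as a `keywords` list of objects with a `display_name` field.

```python
import re

# Public OpenAlex works endpoint; a DOI URL is appended as the identifier.
OPENALEX_WORKS = "https://api.openalex.org/works/"

# Loose DOI shape check (assumption: "10.<registrant>/<suffix>"); not a full validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalize_doi(raw):
    """Strip whitespace and common prefixes; return None if the value is not DOI-shaped."""
    if not isinstance(raw, str):
        return None
    doi = raw.strip().lower()
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi)
    return doi if DOI_PATTERN.match(doi) else None


def openalex_url(doi):
    """Build the request URL for one normalized DOI."""
    return f"{OPENALEX_WORKS}https://doi.org/{doi}"


def classify_response(doi, status_code, payload):
    """Map one API result to a (bucket, keywords) pair.

    Hypothetical buckets: "failed" (would go to Failed_DOIs.csv),
    "no_keywords" (NoKeyword_DOIs.csv), and "ok" (the enriched dataset).
    """
    if doi is None or status_code != 200 or payload is None:
        return "failed", []
    keywords = [
        k["display_name"]
        for k in (payload.get("keywords") or [])
        if k.get("display_name")
    ]
    if not keywords:
        return "no_keywords", ["[WARNING:NoKeywords]"]
    return "ok", keywords
```

The HTTP call itself is left out so the sketch stays self-contained. In practice the URL from `openalex_url` would be fetched with a timeout, and the status code and parsed JSON passed to `classify_response`.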
# Clone the repository
git clone https://github.com/meMeta-a11y/oneclick.git
cd oneclick
# Install dependencies
pip install -r requirements.txt
Run the Streamlit application:
streamlit run streamlit_app.py
We provide three benchmark datasets for reproducibility and testing:
Dataset | DOIs | Purpose |
---|---|---|
Pilot | 37 | Quick exploratory testing |
Intermediate | 100 | Lightweight benchmarking |
Validation | 1,097 | Large-scale robustness testing |
All datasets are available in the `example_data/` directory and are permanently archived on Zenodo.
Validation experiments demonstrate robustness and scalability across dataset sizes:
Dataset | DOIs | Error DOIs | No-Keyword DOIs | Success Rate | Median Latency per DOI (seconds, IQR) |
---|---|---|---|---|---|
Pilot | 37 | 0 | 0 | 100.00% | 0.30 (0.29 – 0.31) |
Intermediate | 100 | 0 | 26 | 74.00% | 0.30 (0.29 – 0.32) |
Validation | 1,097 | 0 | 205 | 81.31% | 0.30 (0.29 – 0.32) |
🧪 Benchmarks were conducted on Streamlit Cloud (Aug 2025) using a Linux environment (4 GB RAM, 100 Mbps network). Performance may vary depending on network conditions and OpenAlex API rate limits.
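
As a rough check on the table above, the success rate is consistent with counting a DOI as successful only when it neither failed nor came back without keywords. The sketch below reproduces that arithmetic and shows one way to compute a median and quartiles from per-DOI latencies. The function names are hypothetical, and the quartile method (Python's "inclusive" interpolation) is an assumption, since the source does not state which one was used.

```python
import statistics


def success_rate(total, failed, no_keyword):
    """Percentage of DOIs that returned at least one keyword."""
    return 100.0 * (total - failed - no_keyword) / total


def latency_summary(latencies):
    """Return (median, q1, q3) for per-DOI latencies in seconds."""
    # Assumption: inclusive quartiles; other conventions give slightly different IQRs.
    q1, _, q3 = statistics.quantiles(latencies, n=4, method="inclusive")
    return statistics.median(latencies), q1, q3


print(f"{success_rate(100, 0, 26):.2f}%")    # 74.00%
print(f"{success_rate(1097, 0, 205):.2f}%")  # 81.31%
```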
Follow these steps to integrate OneCLick outputs with VOSviewer:
- Open VOSviewer → Create → Map based on network data
- Input: Choose your enriched `openalex_keywords.csv`
- Mapping: Set `OpenAlex_KW` as the keyword field
- Options: Select co-occurrence counting method (e.g., Full counting)
- Generate: Visualize the keyword network
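
To give a feel for what full counting does to the `OpenAlex_KW` field, the sketch below counts keyword pairs so that each document adds 1 to every pair it contains. It is only an illustration under stated assumptions: VOSviewer performs its own counting internally, the function name is hypothetical, and the `;` separator between keywords in a cell is assumed rather than taken from the tool's output specification.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_full_counting(keyword_cells, sep=";"):
    """Count keyword pairs; each document adds 1 to every pair it contains."""
    pairs = Counter()
    for cell in keyword_cells:
        # De-duplicate within a document and normalize case and whitespace.
        terms = sorted({t.strip().lower() for t in cell.split(sep) if t.strip()})
        pairs.update(combinations(terms, 2))
    return pairs


# Three hypothetical keyword cells, one per document.
rows = ["Stroke; Genetics", "Stroke; Genetics; Biomarker", "Biomarker"]
links = cooccurrence_full_counting(rows)
print(links[("genetics", "stroke")])  # 2
```

Pairs are stored as alphabetically sorted tuples, so `("genetics", "stroke")` and `("stroke", "genetics")` are never counted separately.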
📚 Additional workflows are described in the manuscript (Section 2.12).
oneclick/
├── streamlit_app.py # Main application
├── requirements.txt # Dependencies
├── example_data/ # Example DOI datasets
│ ├── pilot_37.xlsx
│ ├── intermediate_100.xlsx
│ └── validation_1097.xlsx
├── outputs/ # Example outputs & logs
│ ├── openalex_keywords.csv
│ ├── Failed_DOIs.csv
│ ├── NoKeyword_DOIs.csv
│ └── Latency_Log.csv
└── README.md # Documentation
If you use OneCLick in your research, please cite it using one of the following formats:
Wei, L. K. (2025). OneCLick: Streamlined metadata enrichment using machine-inferred keywords from OpenAlex. SoftwareX, 31, 102353.
Wei, Loo Keat. "OneCLick: Streamlined metadata enrichment using machine-inferred keywords from OpenAlex." SoftwareX 31 (2025): 102353.
Wei, L.K., 2025. OneCLick: Streamlined metadata enrichment using machine-inferred keywords from OpenAlex. SoftwareX, 31, p.102353.
Wei LK. OneCLick: Streamlined metadata enrichment using machine-inferred keywords from OpenAlex. SoftwareX. 2025 Sep 1;31:102353.
- 🔗 Live Demo: metakey.streamlit.app
- 📦 Source Code: GitHub Repository
This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
💡 OneCLick is developed to accelerate bibliometric research workflows by automating metadata enrichment and preparing datasets for network analysis and visualization. Contributions, issues, and pull requests are always welcome!