Customize / Harmonize publisher + other small fixes #213

Merged 5 commits on Jan 17, 2024
11 changes: 11 additions & 0 deletions ChangeLog
@@ -8,6 +8,17 @@ as of 2.0.0.

## [Unreleased]

+### Added
+- `Publisher` ZIM metadata can now be customized at CLI (#210)
+
+### Changed
+- `Publisher` ZIM metadata default value is changed to `openZIM` instead of `Kiwix` (#210)
+
+### Fixed
+- Do not fail if temporary directory already exists (#207)
+- Typo in `Scraper` ZIM metadata (#212)
+- Adapt to hatchling v1.19.0 which mandates packages setting (#211)
+
## [2.1.0] - 2023-08-18

### Changed
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -6,7 +6,7 @@ build-backend = "hatchling.build"
name = "gutenberg2zim"
authors = [{ name = "Kiwix", email = "dev@kiwix.org" }]
keywords = ["kiwix", "zim", "offline", "gutenberg"]
-requires-python = ">=3.11"
+requires-python = ">=3.11,<3.12"
description = "Make ZIM file from Gutenberg books"
readme = "pypi-readme.rst"
license = { text = "GPL-3.0-or-later" }
@@ -69,6 +69,9 @@ exclude = ["/.github"]
path = "hatch_build.py"
dependencies = ["zimscraperlib==3.1.1"]

+[tool.hatch.build.targets.wheel]
+packages = ["src/gutenberg2zim"]
+
[tool.hatch.envs.default]
features = ["dev"]

7 changes: 5 additions & 2 deletions src/gutenberg2zim/entrypoint.py
@@ -22,7 +22,7 @@
"""[--prepare] [--parse] [--download] [--export] [--dev] """
"""[--zim] [--complete] [-m ONE_LANG_ONE_ZIM_FOLDER] """
"""[--title-search] [--bookshelves] [--optimization-cache S3URL] """
-"""[--stats-filename STATS_FILENAME]"""
+"""[--stats-filename STATS_FILENAME] [--publisher ZIM_PUBLISHER]"""
"""

-h --help Display this help message
@@ -63,6 +63,7 @@
--use-any-optimized-version Try to use any optimized version found on """
"""optimization cache
--stats-filename=<filename> Path to store the progress JSON file to
+--publisher=<zim_publisher> Custom Publisher in ZIM Metadata (openZIM otherwise)

This script is used to produce a ZIM file (and any intermediate state)
of Gutenberg repository using a mirror."""
@@ -102,6 +103,7 @@ def main():
optimization_cache = arguments.get("--optimization-cache") or None
use_any_optimized_version = arguments.get("--use-any-optimized-version", False)
stats_filename = arguments.get("--stats-filename") or None
+publisher = arguments.get("--publisher") or "openZIM"

s3_storage = None
if optimization_cache:
@@ -111,7 +113,7 @@ def main():
logger.info("S3 Credentials OK. Continuing ... ")

# create tmp dir
-TMP_FOLDER_PATH.mkdir(parents=True)
+TMP_FOLDER_PATH.mkdir(parents=True, exist_ok=True)

languages = [
x.strip().lower()
@@ -224,4 +226,5 @@ def f(x):
title=zim_title,
description=zim_desc,
stats_filename=stats_filename,
+publisher=publisher,
)
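
The `exist_ok=True` change above addresses #207. A minimal sketch of the difference, using a throwaway directory in place of `TMP_FOLDER_PATH` (the `make_tmp_dir` helper and the path names are illustrative assumptions, not part of the PR):

```python
import tempfile
from pathlib import Path


def make_tmp_dir(path: Path) -> Path:
    """Create `path` and any missing parents; do nothing if it already exists."""
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    tmp_folder = Path(root) / "tmp" / "gutenberg"  # hypothetical location

    make_tmp_dir(tmp_folder)
    make_tmp_dir(tmp_folder)  # second call succeeds silently

    # Without exist_ok=True, the same call on an existing directory raises.
    try:
        tmp_folder.mkdir(parents=True)
        raised = False
    except FileExistsError:
        raised = True
```

Note that `exist_ok=True` only tolerates an existing directory; if the path exists as a regular file, `mkdir` still raises `FileExistsError`.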
6 changes: 3 additions & 3 deletions src/gutenberg2zim/shared.py
@@ -32,7 +32,7 @@ def inc_progress():
Global.progress += 1

@staticmethod
-def setup(filename, language, title, description, name):
+def setup(filename, language, title, description, name, publisher):
Global.creator = Creator(
filename=filename,
main_path="Home.html",
@@ -41,10 +41,10 @@ def setup(filename, language, title, description, name):
title=title,
description=description,
creator="gutenberg.org", # type: ignore
-publisher="Kiwix", # type: ignore
+publisher=publisher, # type: ignore
name=name,
tags="_category:gutenberg;gutenberg", # type: ignore
-scraper=f"gutengergtozim-{VERSION}", # type: ignore
+scraper=f"gutenberg2zim-{VERSION}", # type: ignore
date=date.today(), # type: ignore
).config_verbose(True)

2 changes: 2 additions & 0 deletions src/gutenberg2zim/zim.py
@@ -28,6 +28,7 @@ def build_zimfile(
title,
description,
stats_filename,
+publisher,
):
# actual list of languages with books sorted by most used
nb = fn.COUNT(Book.language).alias("nb")
@@ -76,6 +77,7 @@ def build_zimfile(
title=title,
description=description,
name=project_id,
+publisher=publisher,
)

Global.start()