Add an item pipeline to log auto field stats #124

Open
wants to merge 5 commits into
base: main
8 changes: 8 additions & 0 deletions .flake8
@@ -15,6 +15,7 @@ per-file-ignores =
    # Ignore: "imported but unused" errors in __init__ files, as those imports are there
    # to expose submodule functions so they can be imported directly from that module
    zyte_common_items/__init__.py:F401,
    zyte_common_items/pipelines.py:F401,

    # Ignore: * imports in these files
    zyte_common_items/__init__.py:F403,
@@ -23,3 +24,10 @@ per-file-ignores =
    # Ignore: may be undefined, or defined from star imports
    zyte_common_items/zyte_data_api.py:F405,
    tests/test_page_inputs.py:F405,

    # ”module level import not at the top of file“ caused by
    # pytest.importorskip
    tests/test_auto_field_stats.py:E402,

    # “docstring does contain unindexed parameters” due to the use of {}.
    zyte_common_items/_scrapy_poet.py:P102
25 changes: 25 additions & 0 deletions docs/_ext/__init__.py
@@ -0,0 +1,25 @@
def setup(app):
    # https://stackoverflow.com/a/13663325
    #
    # Scrapy’s
    # https://github.com/scrapy/scrapy/blob/dba37674e6eaa6c2030c8eb35ebf8127cd488062/docs/_ext/scrapydocs.py#L90C16-L110C6
    app.add_crossref_type(
        directivename="setting",
        rolename="setting",
        indextemplate="pair: %s; setting",
    )
    app.add_crossref_type(
        directivename="signal",
        rolename="signal",
        indextemplate="pair: %s; signal",
    )
    app.add_crossref_type(
        directivename="command",
        rolename="command",
        indextemplate="pair: %s; command",
    )
    app.add_crossref_type(
        directivename="reqmeta",
        rolename="reqmeta",
        indextemplate="pair: %s; reqmeta",
    )
4 changes: 4 additions & 0 deletions docs/conf.py
@@ -1,5 +1,7 @@
import pkgutil
import sys
from datetime import datetime
from pathlib import Path


def get_copyright(attribution, *, first_year):
@@ -26,7 +28,9 @@ def get_version_and_release():
copyright = get_copyright("Zyte Group Ltd", first_year=2022)
version, release = get_version_and_release()

sys.path.insert(0, str(Path(__file__).parent.absolute())) # _ext
extensions = [
    "_ext",
    "sphinx.ext.autodoc",
    "sphinx.ext.intersphinx",
]
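
For context on the `sys.path.insert` line: putting the directory that contains `conf.py` at the front of `sys.path` is what lets the `"_ext"` entry be imported as a package. Below is a small sketch of that mechanism under stated assumptions: it builds a throwaway directory with a package named `_ext_demo` (a hypothetical name chosen to avoid clashing with the real `_ext`), and `import_local_extension` is an illustrative helper, not something in this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_local_extension(docs_dir, name):
    # Same idea as docs/conf.py: make the docs directory importable first.
    sys.path.insert(0, str(docs_dir.absolute()))
    importlib.invalidate_caches()
    return importlib.import_module(name)


with tempfile.TemporaryDirectory() as tmp:
    docs_dir = Path(tmp)
    package_dir = docs_dir / "_ext_demo"  # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(
        "def setup(app):\n    return 'registered'\n"
    )
    module = import_local_extension(docs_dir, "_ext_demo")
    result = module.setup(app=None)
```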
1 change: 1 addition & 0 deletions docs/reference/pipelines.rst
@@ -3,4 +3,5 @@ Scrapy Pipelines
================

.. autoclass:: zyte_common_items.pipelines.AEPipeline
.. autoclass:: zyte_common_items.pipelines.AutoFieldStatsItemPipeline
.. autoclass:: zyte_common_items.pipelines.DropLowProbabilityItemPipeline
2 changes: 2 additions & 0 deletions docs/requirements.txt
@@ -1,2 +1,4 @@
Scrapy
scrapy-poet
Sphinx==8.1.3
sphinx-rtd-theme==3.0.1
47 changes: 43 additions & 4 deletions docs/setup.rst
@@ -11,12 +11,51 @@ Installation


.. _configuration:
.. _scrapy-config:

Configuration
=============
Scrapy configuration
====================

To allow itemadapter_ users, like Scrapy_, to interact with :ref:`items
<items>`, prepend :class:`~zyte_common_items.ZyteItemAdapter` or
If you use Scrapy, zyte-common-items provides some functionality that needs
configuring:

- If using Scrapy_ 2.10 or higher, enable the add-on:

  .. code-block:: python
     :caption: settings.py

     ADDONS = {
         "zyte_common_items.Addon": 400,
     }

  The add-on:

  - Appends :class:`~zyte_common_items.ZyteItemAdapter` to
    itemadapter.ItemAdapter.ADAPTER_CLASSES_ if neither
    :class:`~zyte_common_items.ZyteItemAdapter` nor
    :class:`~zyte_common_items.ZyteItemKeepEmptyAdapter` are already there.

  - Adds :class:`~zyte_common_items.pipelines.AutoFieldStatsItemPipeline`
    (if :doc:`scrapy-poet <scrapy-poet:index>` is installed) to
    :setting:`ITEM_PIPELINES <scrapy:ITEM_PIPELINES>`:

    .. code-block:: python

       ITEM_PIPELINES = {
           "zyte_common_items.pipelines.AutoFieldStatsItemPipeline": 200,
       }

- If using Scrapy_ 2.9 or lower, apply those configurations manually as
  needed.


.. _itemadapter-config:

itemadapter configuration
=========================

To allow itemadapter_ to interact with :ref:`items <items>`, prepend
:class:`~zyte_common_items.ZyteItemAdapter` or
:class:`~zyte_common_items.ZyteItemKeepEmptyAdapter` to
itemadapter.ItemAdapter.ADAPTER_CLASSES_ as early as possible in your code::

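
The add-on behavior described in docs/setup.rst can be read as two small, conditional steps. The sketch below models them with plain Python objects so that it runs without Scrapy, itemadapter, or zyte-common-items. The adapter classes are empty stand-ins, `configure_adapters` and `configure_pipelines` are hypothetical helper names, and the use of `setdefault` (leaving a user-defined priority untouched) is an assumption; the actual `Addon` implementation is not shown in this part of the diff.

```python
from collections import deque


# Empty stand-ins for the real adapter classes, for illustration only.
class ZyteItemAdapter: ...
class ZyteItemKeepEmptyAdapter: ...
class OtherAdapter: ...


PIPELINE = "zyte_common_items.pipelines.AutoFieldStatsItemPipeline"


def configure_adapters(adapter_classes):
    # Append ZyteItemAdapter only if neither Zyte adapter is already present.
    zyte_adapters = (ZyteItemAdapter, ZyteItemKeepEmptyAdapter)
    if not any(cls in zyte_adapters for cls in adapter_classes):
        adapter_classes.append(ZyteItemAdapter)
    return adapter_classes


def configure_pipelines(item_pipelines, scrapy_poet_installed):
    # Add the stats pipeline at priority 200 only when scrapy-poet is available.
    # Assumption: an existing entry for the pipeline is left as configured.
    pipelines = dict(item_pipelines)
    if scrapy_poet_installed:
        pipelines.setdefault(PIPELINE, 200)
    return pipelines


adapters = configure_adapters(deque([OtherAdapter]))
kept = configure_adapters(deque([ZyteItemKeepEmptyAdapter, OtherAdapter]))
pipelines = configure_pipelines({}, scrapy_poet_installed=True)
skipped = configure_pipelines({}, scrapy_poet_installed=False)
```

On Scrapy 2.9 or lower, the same two steps would be applied by hand against the real `ADAPTER_CLASSES` and `ITEM_PIPELINES`, as the documentation change notes.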