gh-67230: add quoting rules to csv module #29469

Merged
merged 20 commits into from
Apr 12, 2023
22 changes: 20 additions & 2 deletions Doc/library/csv.rst
@@ -327,7 +327,7 @@ The :mod:`csv` module defines the following constants:

Instructs :class:`writer` objects to quote all non-numeric fields.

-Instructs the reader to convert all non-quoted fields to type *float*.
+Instructs :class:`reader` objects to convert all non-quoted fields to type *float*.


.. data:: QUOTE_NONE
@@ -337,7 +337,25 @@ The :mod:`csv` module defines the following constants:
character. If *escapechar* is not set, the writer will raise :exc:`Error` if
any characters that require escaping are encountered.

-Instructs :class:`reader` to perform no special processing of quote characters.
+Instructs :class:`reader` objects to perform no special processing of quote characters.
+
+.. data:: QUOTE_NOTNULL
+
+Instructs :class:`writer` objects to quote all fields which are not
+``None``. This is similar to :data:`QUOTE_ALL`, except that if a
+field value is ``None`` an empty (unquoted) string is written.
+
+Instructs :class:`reader` objects to interpret an empty (unquoted) field as None and
+to otherwise behave as :data:`QUOTE_ALL`.
+
+.. data:: QUOTE_STRINGS
+
+Instructs :class:`writer` objects to always place quotes around fields
+which are strings. This is similar to :data:`QUOTE_NONNUMERIC`, except that if a
+field value is ``None`` an empty (unquoted) string is written.
+
+Instructs :class:`reader` objects to interpret an empty (unquoted) string as ``None`` and
+to otherwise behave as :data:`QUOTE_NONNUMERIC`.

The :mod:`csv` module defines the following exception:
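
The writer-side behavior described for `QUOTE_STRINGS` and `QUOTE_NOTNULL` in the documentation hunk above can be tried with a short script. This is a minimal sketch, not part of the PR: `render_row` is a hypothetical helper introduced only for illustration, and the two constants are assumed to exist only on interpreters that include this change (the script looks them up by name and reports when they are missing).

```python
import csv
import io


def render_row(row, quoting):
    """Write one row with the given quoting style; return it without the newline."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=quoting, lineterminator="\n")
    writer.writerow(row)
    return buffer.getvalue().rstrip("\n")


# Same sample row as the assertions this PR adds to test_write_quoting.
row = ["a", "", None, 1]

# The new constants are looked up by name so the script still runs
# (and reports the gap) on interpreters that predate them.
for name in ("QUOTE_STRINGS", "QUOTE_NOTNULL"):
    style = getattr(csv, name, None)
    if style is None:
        print(f"{name}: not available in this Python version")
    else:
        print(f"{name}: {render_row(row, style)}")

# On a Python that includes this change:
# QUOTE_STRINGS: "a","",,1
# QUOTE_NOTNULL: "a","",,"1"
```

In both styles `None` is written as an empty unquoted field, while the empty string `''` is written as `""`, so the two values stay distinguishable in the output. The styles differ only on the integer `1`, which `QUOTE_NOTNULL` quotes and `QUOTE_STRINGS` leaves bare.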

2 changes: 2 additions & 0 deletions Lib/csv.py
@@ -9,12 +9,14 @@
unregister_dialect, get_dialect, list_dialects, \
field_size_limit, \
QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE, \
+QUOTE_STRINGS, QUOTE_NOTNULL, \
__doc__
from _csv import Dialect as _Dialect

from io import StringIO

__all__ = ["QUOTE_MINIMAL", "QUOTE_ALL", "QUOTE_NONNUMERIC", "QUOTE_NONE",
+"QUOTE_STRINGS", "QUOTE_NOTNULL",
"Error", "Dialect", "__doc__", "excel", "excel_tab",
"field_size_limit", "reader", "writer",
"register_dialect", "get_dialect", "list_dialects", "Sniffer",
4 changes: 4 additions & 0 deletions Lib/test/test_csv.py
@@ -187,6 +187,10 @@ def test_write_quoting(self):
quoting = csv.QUOTE_ALL)
self._write_test(['a\nb',1], '"a\nb","1"',
quoting = csv.QUOTE_ALL)
+self._write_test(['a','',None,1], '"a","",,1',
+quoting = csv.QUOTE_STRINGS)
+self._write_test(['a','',None,1], '"a","",,"1"',
+quoting = csv.QUOTE_NOTNULL)

def test_write_escape(self):
self._write_test(['a',1,'p,q'], 'a,1,"p,q"',
@@ -0,0 +1,2 @@
+Add :data:`~csv.QUOTE_STRINGS` and :data:`~csv.QUOTE_NOTNULL` to the suite
+of :mod:`csv` module quoting styles.
16 changes: 15 additions & 1 deletion Modules/_csv.c
@@ -82,7 +82,8 @@ typedef enum {
} ParserState;

typedef enum {
-QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE
+QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONNUMERIC, QUOTE_NONE,
+QUOTE_STRINGS, QUOTE_NOTNULL
} QuoteStyle;

typedef struct {
@@ -95,6 +96,8 @@ static const StyleDesc quote_styles[] = {
{ QUOTE_ALL, "QUOTE_ALL" },
{ QUOTE_NONNUMERIC, "QUOTE_NONNUMERIC" },
{ QUOTE_NONE, "QUOTE_NONE" },
+{ QUOTE_STRINGS, "QUOTE_STRINGS" },
+{ QUOTE_NOTNULL, "QUOTE_NOTNULL" },
{ 0 }
};

@@ -1264,6 +1267,12 @@ csv_writerow(WriterObj *self, PyObject *seq)
case QUOTE_ALL:
quoted = 1;
break;
+case QUOTE_STRINGS:
+quoted = PyUnicode_Check(field);
+break;
+case QUOTE_NOTNULL:
+quoted = field != Py_None;
+break;
default:
quoted = 0;
break;
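
The per-field decision made by the switch in `csv_writerow` above can be mirrored in a few lines of Python. This is a rough sketch for reading the hunk, not the module's implementation: `QuoteStyle` and `initially_quoted` are hypothetical names, only the branches visible in the hunk are modeled (the `QUOTE_NONNUMERIC` case lies outside the displayed lines and is left out), and whatever the writer does with the flag afterwards is also outside this sketch.

```python
from enum import Enum, auto


class QuoteStyle(Enum):
    """Hypothetical stand-in for the styles visible in the hunk."""
    MINIMAL = auto()
    ALL = auto()
    STRINGS = auto()
    NOTNULL = auto()


def initially_quoted(field, style):
    """Return the initial 'quoted' flag the switch would set for one field."""
    if style is QuoteStyle.ALL:
        return True
    if style is QuoteStyle.STRINGS:
        # Mirrors PyUnicode_Check(field): true for str and its subclasses.
        return isinstance(field, str)
    if style is QuoteStyle.NOTNULL:
        # Mirrors field != Py_None: everything except None is quoted.
        return field is not None
    # Mirrors the default branch: quoting is not forced.
    return False


# Print the flag for each sample field under every modeled style.
for field in ["a", "", None, 1]:
    flags = {style.name: initially_quoted(field, style) for style in QuoteStyle}
    print(repr(field), flags)
```

The two new branches differ only for non-string, non-`None` values such as numbers: `STRINGS` leaves them unquoted, while `NOTNULL` quotes them.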
@@ -1659,6 +1668,11 @@ PyDoc_STRVAR(csv_module_doc,
" csv.QUOTE_NONNUMERIC means that quotes are always placed around\n"
" fields which do not parse as integers or floating point\n"
" numbers.\n"
+" csv.QUOTE_STRINGS means that quotes are always placed around\n"
+" fields which are strings. Note that the Python value None\n"
+" is not a string.\n"
+" csv.QUOTE_NOTNULL means that quotes are only placed around fields\n"
+" that are not the Python value None.\n"
" csv.QUOTE_NONE means that quotes are never placed around fields.\n"
" * escapechar - specifies a one-character string used to escape\n"
" the delimiter when quoting is set to QUOTE_NONE.\n"