python 2.7: query_devices() fails with non ASCII characters #30

raecke · 2016-07-30T07:30:08Z

On a German Windows 7 machine with sounddevice 0.3.3, I get a traceback (see below) when I call:

python -m sounddevice

The reason seems to be that a German Windows version has non-ASCII characters in the device names
(e.g. 'Primärer Soundaufnahmetreiber').

Possible Solution:

Line 1806:
Change
name=info['name'],
To:
name=repr(info['name']),

Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Python27\lib\site-packages\sounddevice.py", line 2536, in <module>
    print(query_devices())
  File "C:\Python27\lib\site-packages\sounddevice.py", line 1810, in __repr__
    for idx, info in enumerate(self))
  File "C:\Python27\lib\site-packages\sounddevice.py", line 1810, in <genexpr>
    for idx, info in enumerate(self))
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 18: ordinal not in range(128)

mgeier · 2016-07-30T09:27:52Z

Thanks for this bug report!

Your solution avoids the exception, but it adds quotation marks to the device names.
Also, I guess it turns your device into u'Prim\\xe4rer Soundaufnahmetreiber' instead of showing the umlaut correctly, right?

For now, I can only think of a solution that checks if Python 2 or 3 is used and uses encode(...) in the former case and does nothing in the latter. Similar to http://stackoverflow.com/a/13848698/500098.
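
A minimal sketch of such a version check, assuming a hypothetical DeviceNames class as a stand-in for the real device list. The class name, the formatting and the fallback to 'utf-8' are illustrative assumptions, not sounddevice code:

```python
import sys


class DeviceNames(object):
    """Hypothetical stand-in for a list-like container of device names."""

    def __init__(self, names):
        self.names = list(names)

    def __repr__(self):
        # Build the listing as text first, one numbered line per device.
        text = u"\n".join(
            u"{0:>3} {1}".format(idx, name)
            for idx, name in enumerate(self.names)
        )
        if sys.version_info[0] < 3:
            # Python 2: __repr__ has to return a byte string, so encode here.
            # Falling back to UTF-8 is an assumption made for this sketch.
            encoding = getattr(sys.stdout, "encoding", None) or "utf-8"
            return text.encode(encoding, "replace")
        # Python 3: __repr__ returns text, nothing else to do.
        return text


# u"Prim\xe4rer" is "Primärer", written with an escape to stay ASCII-only.
listing = repr(DeviceNames([u"Prim\xe4rer Soundaufnahmetreiber"]))
print(listing)
```

The branch on sys.version_info is what makes this approach unwieldy: every method that returns a listing would need the same check.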

However, this is quite unwieldy just for supporting Python 2. Any better ideas?

raecke · 2016-07-30T11:12:45Z

You are right. The Stack Overflow solution is not ideal. I adapted a decorator from django.utils.encoding. Applying this decorator to the class (not the function) and adding
from __future__ import unicode_literals makes it behave as it should.

import sys as _sys  # the decorator below refers to sys under the alias _sys

def python_2_unicode_compatible(klass):
    """
    A decorator that (re)defines __unicode__ and __str__ and __repr__ methods under Python 2.
    Under Python 3 it does nothing.

    To support Python 2 and 3 with a single code base, define a __str__ method
    returning text and apply this decorator to the class.
    """
    if _sys.version_info.major < 3:
        # Keep the text-returning methods under new names and let the
        # original names return UTF-8 encoded byte strings instead.
        if '__repr__' in klass.__dict__:
            klass.__urepr__ = klass.__repr__
            klass.__repr__ = lambda self: self.__urepr__().encode('utf-8')
        if '__str__' in klass.__dict__:
            klass.__unicode__ = klass.__str__
            klass.__str__ = lambda self: self.__unicode__().encode('utf-8')
    return klass

mgeier · 2016-07-31T12:20:57Z

Thanks for the modification, but I think this is still too complicated.

And I don't really like the unicode_literals thing, what about using "native strings" everywhere?

I implemented this in #31, can you please try if this works for your umlaut-afflicted device name?
Please try it in both Python 2 and 3.

raecke · 2016-08-01T09:37:02Z

Tested it with Python 2. This leads to a mix of different encodings ('mbcs' for DirectSound and 'utf8' for everything else). Everything that is not already UTF-8 needs to be re-encoded as UTF-8.

mgeier · 2016-08-01T10:46:39Z

Thanks for testing.

Out of curiosity, how do the different encodings look? Could you provide a screenshot?

raecke · 2016-08-01T12:38:23Z

Out of curiosity, how do the different encodings look? Could you provide a screenshot?

It depends on how I decode the string. If I try to decode mbcs (which is cp1252 on my machine) as utf8, I get an exception. If I try to decode utf8 as mbcs, I get:

print u"primärer kopfhörer".encode('utf8').decode('mbcs')
primÃ¤rer kopfhÃ¶rer
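
Both directions can be reproduced on any platform with a short Python 3 sketch. It names cp1252 explicitly instead of 'mbcs', because the 'mbcs' codec only exists on Windows and cp1252 is what it maps to on the machine described above:

```python
text = "primärer kopfhörer"

# UTF-8 bytes misread as cp1252: each umlaut becomes two characters.
garbled = text.encode("utf-8").decode("cp1252")
print(garbled)

# cp1252 bytes misread as UTF-8: the single byte 0xE4 is not valid UTF-8,
# so strict decoding raises an exception.
try:
    text.encode("cp1252").decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)
```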

mgeier · 2016-08-01T12:56:04Z

Thanks, but how does the output of python -m sounddevice look when using #31?

raecke · 2016-08-01T20:33:15Z

It depends on the codepage of the terminal:

C:\> chcp 850 && python -m sounddevice
Aktive Codepage: 850.
   0 Microsoft Soundmapper - Input, MME (2 in, 0 out)
>  1 Mikrofon (Realtek High Definiti, MME (2 in, 0 out)
   2 Microsoft Soundmapper - Output, MME (0 in, 2 out)
<  3 Lautsprecher/Kopfh÷rer (Realtek, MME (0 in, 2 out)
   4 Primõrer Soundaufnahmetreiber, Windows DirectSound (2 in, 0 out)
   5 Mikrofon (Realtek High Definition Audio), Windows DirectSound (2 in, 0 out)
   6 Primõrer Soundtreiber, Windows DirectSound (0 in, 2 out)
   7 Lautsprecher/Kopfh÷rer (Realtek High Definition Audio), Windows DirectSound
(0 in, 2 out)
   8 Lautsprecher/Kopfh├Ârer (Realtek High Definition Audio), Windows WASAPI (0
in, 2 out)
   9 Mikrofon (Realtek High Definition Audio), Windows WASAPI (2 in, 0 out)
  10 Speakers (Realtek HD Audio output), Windows WDM-KS (0 in, 2 out)
  11 Mikrofon (Realtek HD Audio Mic input), Windows WDM-KS (2 in, 0 out)

C:\>chcp 1252 && python -m sounddevice
Aktive Codepage: 1252.
   0 Microsoft Soundmapper - Input, MME (2 in, 0 out)
>  1 Mikrofon (Realtek High Definiti, MME (2 in, 0 out)
   2 Microsoft Soundmapper - Output, MME (0 in, 2 out)
<  3 Lautsprecher/Kopfhörer (Realtek, MME (0 in, 2 out)
   4 Primärer Soundaufnahmetreiber, Windows DirectSound (2 in, 0 out)
   5 Mikrofon (Realtek High Definition Audio), Windows DirectSound (2 in, 0 out)
   6 Primärer Soundtreiber, Windows DirectSound (0 in, 2 out)
   7 Lautsprecher/Kopfhörer (Realtek High Definition Audio), Windows DirectSound
(0 in, 2 out)
   8 Lautsprecher/KopfhÃ¶rer (Realtek High Definition Audio), Windows WASAPI (0
in, 2 out)
   9 Mikrofon (Realtek High Definition Audio), Windows WASAPI (2 in, 0 out)
  10 Speakers (Realtek HD Audio output), Windows WDM-KS (0 in, 2 out)
  11 Mikrofon (Realtek HD Audio Mic input), Windows WDM-KS (2 in, 0 out)

C:\>chcp 65001 && python -m sounddevice
Aktive Codepage: 65001.
   0 Microsoft Soundmapper - Input, MME (2 in, 0 out)
>  1 Mikrofon (Realtek High Definiti, MME (2 in, 0 out)
   2 Microsoft Soundmapper - Output, MME (0 in, 2 out)
<  3 Lautsprecher/Kopfh�rer (Realtek, MME (0 in, 2 out)
   4 Prim�rer Soundaufnahmetreiber, Windows DirectSound (2 in, 0 out)
   5 Mikrofon (Realtek High Definition Audio), Windows DirectSound (2 in, 0 out)
   6 Prim�rer Soundtreiber, Windows DirectSound (0 in, 2 out)
   7 Lautsprecher/Kopfh�rer (Realtek High Definition Audio), Windows DirectSound
(0 in, 2 out)
   8 Lautsprecher/Kopfhörer (Realtek High Definition Audio), Windows WASAPI (0
in, 2 out)
   9 Mikrofon (Realtek High Definition Audio), Windows WASAPI (2 in, 0 out)
  10 Speakers (Realtek HD Audio output), Windows WDM-KS (0 in, 2 out)
  11 Mikrofon (Realtek HD Audio Mic input), Windows WDM-KS (2 in, 0
out)Traceback (most recent call last):
  File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "C:\Python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "c:\noppi\misc\python-sounddevice\sounddevice.py", line 2544, in <module>
    print(query_devices())
IOError: [Errno 0] Error
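
The three listings above are consistent with the same two byte sequences being shown through different terminal code pages. The following Python 3 sketch reproduces the artifacts for one name; it assumes the DirectSound and MME names arrive as cp1252 bytes and the WASAPI names as UTF-8 bytes, and as_seen_in is a hypothetical helper, not part of sounddevice:

```python
name = "Kopfhörer"
ansi_bytes = name.encode("cp1252")  # assumed encoding of DirectSound/MME names
utf8_bytes = name.encode("utf-8")   # assumed encoding of WASAPI names


def as_seen_in(codepage, raw):
    """Decode raw bytes the way a terminal using this code page would."""
    return raw.decode(codepage, errors="replace")


# chcp 65001 corresponds to UTF-8.
for codepage in ("cp850", "cp1252", "utf-8"):
    print(codepage,
          as_seen_in(codepage, ansi_bytes),
          as_seen_in(codepage, utf8_bytes))
```

No single code page renders both byte sequences correctly, which is why the names have to be decoded to text before printing.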

raecke · 2016-08-01T21:04:35Z

Results with Python 3:

C:\>chcp 850 && python -m sounddevice
Aktive Codepage: 850.
Traceback (most recent call last):
  File "c:\Program Files (x86)\Python35-32\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\Program Files (x86)\Python35-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\noppi\misc\python-sounddevice\sounddevice.py", line 2544, in <module>
    print(query_devices())
  File "c:\Program Files (x86)\Python35-32\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 191: character maps to <undefined>

C:\>chcp 1252 && python -m sounddevice
Aktive Codepage: 1252.
Traceback (most recent call last):
  File "c:\Program Files (x86)\Python35-32\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\Program Files (x86)\Python35-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\noppi\misc\python-sounddevice\sounddevice.py", line 2544, in <module>
    print(query_devices())
  File "c:\Program Files (x86)\Python35-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 191: character maps to <undefined>

C:\>chcp 65001 && python -m sounddevice
Aktive Codepage: 65001.
   0 Microsoft Soundmapper - Input, MME (2 in, 0 out)
>  1 Mikrofon (Realtek High Definiti, MME (2 in, 0 out)
   2 Microsoft Soundmapper - Output, MME (0 in, 2 out)
<  3 Lautsprecher/Kopfhï¿½rer (Realtek, MME (0 in, 2 out)
   4 PrimÃ¤rer Soundaufnahmetreiber, Windows DirectSound (2 in, 0 out)
   5 Mikrofon (Realtek High Definition Audio), Windows DirectSound (2 in, 0 out)
   6 PrimÃ¤rer Soundtreiber, Windows DirectSound (0 in, 2 out)
   7 Lautsprecher/KopfhÃ¶rer (Realtek High Definition Audio), Windows DirectSound (0 in, 2 out)
   8 Lautsprecher/KopfhÃ¶rer (Realtek High Definition Audio), Windows WASAPI (0 in, 2 out)
   9 Mikrofon (Realtek High Definition Audio), Windows WASAPI (2 in, 0 out)
  10 Speakers (Realtek HD Audio output), Windows WDM-KS (0 in, 2 out)
  11 Mikrofon (Realtek HD Audio Mic input), Windows WDM-KS (2 in, 0 out)
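
The '\ufffd' in the two tracebacks is U+FFFD, the Unicode replacement character. The thread does not show how the names are decoded, so the following Python 3 sketch is only one plausible reconstruction: it assumes ANSI bytes are decoded as UTF-8 with errors='replace'. The helper can_encode is hypothetical.

```python
raw = "Primärer".encode("cp1252")  # assumed ANSI bytes from the host API
name = raw.decode("utf-8", errors="replace")
print(ascii(name))


def can_encode(text, codepage):
    """Return True if text can be written to a stream using this code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# U+FFFD exists in neither cp850 nor cp1252, but UTF-8 can encode it.
print(can_encode(name, "cp850"),
      can_encode(name, "cp1252"),
      can_encode(name, "utf-8"))

# The UTF-8 bytes of U+FFFD, read as cp1252, give the artifact seen in device 3.
print("\ufffd".encode("utf-8").decode("cp1252"))
```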

raecke · 2016-08-01T21:42:08Z

The encoding for MME devicenames seems also to be mbcs. This should be fixed in line 623.
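
A sketch of what decoding per host API could look like. The function decode_device_name, the set ANSI_HOST_APIS and the ansi_encoding parameter are hypothetical names for illustration, not the code in sounddevice.py. The default of cp1252 keeps the example runnable everywhere; on Windows one would presumably pass 'mbcs' instead.

```python
# Host APIs whose device names were reported as ANSI-encoded in this thread.
ANSI_HOST_APIS = frozenset(["MME", "Windows DirectSound"])


def decode_device_name(raw, hostapi_name, ansi_encoding="cp1252"):
    """Decode a raw device name using the encoding assumed for its host API."""
    if hostapi_name in ANSI_HOST_APIS:
        encoding = ansi_encoding
    else:
        encoding = "utf-8"
    # 'replace' avoids an exception if a name does not match the assumption.
    return raw.decode(encoding, "replace")


directsound = decode_device_name(
    "Primärer Soundtreiber".encode("cp1252"), "Windows DirectSound")
wasapi = decode_device_name(
    "Lautsprecher/Kopfhörer".encode("utf-8"), "Windows WASAPI")
print(directsound)
print(wasapi)
```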

mgeier · 2016-08-02T07:26:43Z

Thanks a lot for the detailed tests!

I've made a new attempt at solving this: #32. Can you please try if that works?

The encoding for MME devicenames seems also to be mbcs. This should be fixed in line 623.

OK, but in your last example, DirectSound and WASAPI show the same kind of artifact ...
Is that the same with #32?

raecke · 2016-08-02T09:58:31Z

#32 works, with the limitation that the encoding for MME is not correct. Should I open a new issue for this? It is also a problem for Python 3.

OK, but in your last example, DirectSound and WASAPI show the same kind of artifact ...
Is that the same with #32?

The result is the same, because the test is with python3, and python3 is not affected by the changes.

mgeier · 2016-08-02T10:44:57Z

Thanks, I've added another commit (e33bd89) to #32 which enables MBCS for MME.

Does it now work for all codepages on Python 2 and 3?

raecke · 2016-08-02T20:13:29Z

Yes, it works in the terminal with cp850, cp1252, and cp65001. The only thing that is not perfect is the behaviour in a Jupyter notebook. If I type sounddevice.query_devices(), I get the wrong characters, because UTF-8 is interpreted as MBCS.

mgeier · 2016-08-02T22:27:50Z

Thanks for testing! Any idea how we could make it work for Jupyter notebooks?
Does the problem occur for both Python 2 and 3?
Does sys.stdout.encoding have the wrong value?

raecke · 2016-08-03T07:25:08Z

In the Jupyter notebook sys.stdout.encoding is UTF-8.
The command print sounddevice.query_devices() gives the correct result.
The command sounddevice.query_devices() interprets the result as MBCS.

mgeier · 2016-08-03T10:44:08Z

Thanks for checking this out!

I guess this last issue has to be solved on the Jupyter side of things, right?

Shall I merge #32 as is or is there something else to do?

raecke · 2016-08-03T12:44:56Z

Shall I merge #32 as is or is there something else to do?

I think there is nothing else to do.

mgeier · 2016-08-03T17:02:02Z

I just merged #32, thanks for your help!

mgeier added the bug label Jul 30, 2016

mgeier added a commit that referenced this issue Aug 2, 2016

Return bytes in DeviceList.__repr__() for Python 2

eb79538

Closes #30.

mgeier mentioned this issue Aug 2, 2016

Return bytes in DeviceList.__repr__() for Python 2 #32

Merged

mgeier closed this as completed in 76fe59d Aug 3, 2016

mgeier mentioned this issue Jan 2, 2017

Encoding Problem #62

Closed

mgeier mentioned this issue Feb 2, 2017

query_devices() name decoding problem (MME, DirectSound) #72

Closed
