Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to debug/troubleshoot #538

Merged
merged 12 commits into from
Jul 14, 2020
2 changes: 1 addition & 1 deletion docs/artwork
Submodule artwork updated 2 files
+1 −0 LICENSE
+271 −0 src/debug.svg
2 changes: 2 additions & 0 deletions docs/basics/101-123-config2.rst
Original file line number Diff line number Diff line change
Expand Up @@ -386,6 +386,8 @@ Write a note about configurations in datasets into ``notes.txt``.

.. index:: ! environment variable

.. _envvars:

.. [#f4] **Some more on environment variables**:
Names of environment variables are often all-uppercase. While the ``$`` is not part of
the name of the environment variable, it is necessary to *refer* to the environment
Expand Down
158 changes: 158 additions & 0 deletions docs/basics/101-135-help.rst
Original file line number Diff line number Diff line change
Expand Up @@ -118,6 +118,164 @@ The "submit a question link" on `DataLad's GitHub page <https://github.com/datal
page prefills a neurostars form with a template for a question for a good
starting point if you want to have more guidance or encounter writer's block.

.. index:: Debugging

Debugging like a DataLad-developer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you have read a command's help from start to end, checked all software versions twice, even `asked colleagues to reproduce your problem (unsuccessfully) <https://xkcd.com/2083/>`_, and you still don't have any clue what is going on, then welcome to the debugging section!

.. figure:: ../artwork/src/debug.svg
:figwidth: 50%

It's not as bad as this

It is not always straightforward to see *why* a particular DataLad command has failed.
Given that operations with DataLad can be quite complicated, and could involve complexities such as different forms of authentication, different file systems, interactions with the environment, configurations, and other software, and *much* more, there are what may feel like an infinite amount of sources for the problem at hand.
The resulting error message, however, may not display the underlying cause correctly because the error message of whichever process failed is not propagated into the final result report.
Thus, you may end up with an uninformative ``Unable to access these remotes`` error in the result summary, when the underlying issue is a `certificate error <https://github.com/datalad/datalad/issues/4651#issuecomment-649180205>`_.

In situations where there is no obvious reason for a command to fail, it can be helpful -- either for yourself or for further information to paste into :term:`GitHub` issues -- to start `debugging <https://xkcd.com/1722/>`_, or *logging at a higher granularity* than is the default.
This allows you to gain more insights into the actions DataLad and its underlying tools are taking, where *exactly* they fail, and to even play around with the program at the state of the failure.

:term:`Debugging` and :term:`logging` are not as complex as these terms may sound if you have never consciously debugged.
Procedurally, it can be as easy as adding an additional flag to a command call, and cognitively, it can be as easy as engaging your visual system in a visual search task for the color red or the word "error", or reading more DataLad output that you're used to.
The paragraphs below start with the general concepts, and collect concrete debugging strategies for different problems. If you have advice to add, please `get in touch <https://github.com/datalad-handbook/book/issues>`__.

.. _logging:

Logging
"""""""

In order to gain more insights into the steps performed by a program and capture as many details as possible for troubleshooting an error, you can turn to :term:`logging`.
Logging simply refers to the fact that DataLad and its underlying tools tell you what they are doing:
This information can be coarse, such as a mere ``[INFO] Downloading <some_url> into <some_target>``, or fine-grained, such as ``[DEBUG] Resolved dataset for status reporting: <dataset>``.
The :term:`log level` in brackets at the beginning of the line indicates how many details DataLad shares with you.

Note that :term:`logging` is not a sealed book, and happens automatically during the execution of any DataLad command.
While you were reading the handbook, you have seen a lot of log messages already.
Anything printed to your terminal preceded by ``[INFO]``, for example, is a log message (in this case, on the ``info`` level).
When you are *consciously* logging, you simply set the log-level to the desired amount of information, or increase the amount of verbosity until the output gives you a hint of what went wrong.
Likewise, adjusting the log-level also works the other way around, and lets you *decrease* the amount of information you receive in your terminal.

.. findoutmore:: Log levels

Log levels provide the means to adjust how much information you want, and are described in human readable terms, ordered by the severity of the failures or problems reported.
The following log levels can be chosen from:

- **critical**: Only catastrophes are reported. Currently, there is nothing inside of DataLad that would log at this level, so setting the log level to *critical* will result in getting no details at all, not even about errors or failures.
- **error**: With this log level you will receive reports on any errors that occurred within the program during command execution.
- **warning**: At this log level, the command execution will report on usual situations and anything that *might* be a problem, in addition to report anything from the *error* log level. .
- **info**: This log level will include reports by the program that indicate normal behavior and serve to keep you up to date about the current state of things, in additions to warning and error logging messages.
- **debug**: This log level is very useful to troubleshoot a problem, and results in DataLad telling you *a lot* it possibly can.

Other than log *levels*, you can also adjust the amount of information provided with numerical granularity. Instead of specifying a log level, provide an integer between 1 and 9, with lower values denoting more debugging information.

Raising the log level (e.g, to ``error``, or ``9`) will decrease the amount of information and output you will receive, while lowering it (e.g., to ``debug`` or ``1``) will increase it.



Setting a log level can be done in the form of an :term:`environment variable`, a configuration, or with the ``-l``/``--log-level`` flag appended directly after the main :command:`datalad` command.
To get extensive information on what :command:`datalad status` does underneath the hood, your command could look like this:

.. runrecord:: _examples/DL-101-135-105
:language: console
:workdir: dl-101/DataLad-101

$ datalad --log-level debug status

.. findoutmore:: ... and how does it look when using environment variables or configurations?

The log level can also be set (for different scopes) using the ``datalad.log.level`` configuration variable, or the corresponding environment variable ``DATALAD_LOG_LEVEL``.

To set the log level for a single command, for example, set it in front of the command::

$ DATALAD_LOG_LEVEL=debug datalad status

And to set the log level for the rest of the shell session, export it::

$ export DATALAD_LOG_LEVEL=debug
$ datalad status
$ ...

You can find out a bit more on environment variable :ref:`in this footnote <envvars>`.

The configuration variable can be used to set the log level on a user (global) or system-wide level with the :command:`git config` command::

$ git config --global datalad.log.level debug

This output is extensive and detailed, but it precisely shows the sequence of commands and arguments that are run prior to a failure or crash, and all additional information that is reported with the log levels ``info`` or ``debug`` can be very helpful to find out what is wrong.
Even if the vast amount of detail in output generated with ``debug`` logging appears overwhelming, it can make sense to find out at which point an execution stalls, whether arguments, commands, or datasets reported in the debug output are what you expect them to be, and to forward all information into any potential GitHub issue you will be creating.

Finally, other than logging with a DataLad command, it sometimes can be useful to turn to :term:`git-annex` or :term:`Git` for logging.
For failing :command:`datalad get` calls, it may be useful to retry the retrieval using :command:`git annex get -d -v <file>`, where ``-d`` (debug) and ``-v`` (verbose) increase the amount of detail about the command execution and failure.
In rare cases where you suspect something might be wrong with Git, setting the environment variables ``GIT_TRACE`` and ``GIT_TRACE_SETUP`` to ``2`` prior to running a Git command will give you debugging output.

.. _debug:

Debugging
"""""""""

If the additional level of detail provided by logging messages is not enough, you can go further with actual :term:`debugging`.
For this, add the ``--dbg`` or ``--idbg`` flag to the main :command:`datalad` command, as in ``datalad --dbg status``.
Adding this flag will enter a `Python <https://docs.python.org/3/library/pdb.html>`_ or `IPython debugger <https://ipython.org/>`_ when something unexpectedly crashes.
This allows you to debug the program right when it fails, inspect available variables and their values, or step back and forth through the source code.
Note that debugging experience is not a prerequisite when using DataLad -- although it is `an exciting life skill <https://www.monkeyuser.com/2017/step-by-step-debugging/>`_.
`The official Python docs <https://docs.python.org/3/library/pdb.html#debugger-commands>`_ provide a good overview on the available debugger commands if you are interested in learning more about this.

Debugging examples
""""""""""""""""""

This section collects errors and their solutions from real GitHub issues.
They may not be applicable for the problem you are currently facing, but seeing other's troubleshooting strategies may be helpful nevertheless.
If you are interested in getting your error and solution described here, please `get in touch <https://github.com/datalad-handbook/book/issues>`_.

:command:`datalad get`:
It is common for :command:`datalad get` errors to originate in :term:`git-annex`, the software used by DataLad to transfer data. Here are a few suggestions to debug them:

- Take a deep breath, or preferably a walk in a nice park :)
- Check that you are using a recent version of git-annex
- ``git-annex version`` returns the version of git-annex on the first line of its input, and it is also reported in the output of :command:`datalad wtf`.
- The version number contains the release date of the version in use. For instance, git-annex version: ``8.20200330-g971791563`` was released on 30 March 2020.
- If the version that you are using is older than a few months, consider updating using the instructions `here <http://handbook.datalad.org/en/latest/intro/installation.html>`_.
- Try to download the file using ``git-annex get -v -d <file_name>``. If this doesn't succeed, the DataLad command may not succeed. Options ``-d/--debug`` and ``-v`` are here to provide as much verbosity in error messages as possible
- Read the output of :term:`git-annex`, identify the error, breathe again, and solve the issue!

.. list-table:: Examples of possible issues
:header-rows: 1

* - git-annex error
- A solution that worked once
* - .. code-block::

Last exception was:
Could not find a suitable TLS CA
certificate bundle, invalid path:
/etc/pki/tls/certs/ca-bundle.crt
[adapters.py:cert_verify:227]

- Unset environment variable ``CURL_CA_BUNDLE``
* - Permission denied when writing file
- Download to a writeable file system
* - File retrieval from an Amazon S3 bucket failed during a system call in a MATLAB session:

.. code-block:: bash

>> system('datalad -C mytest \
get 100206/T1w/T1w_acpc_dc.nii.gz')
[...]
git-annex: get: 1 failed

- MATLAB massively overrides the ``LD_LIBRARY_PATH`` setting.
This can lead to a number of issues, among them SSL certification errors.
Prefixing the :command:`datalad get` command with

.. code-block:: bash

!LD_LIBRARY_PATH= datalad get [....]

can solve this.

Common warnings and errors
^^^^^^^^^^^^^^^^^^^^^^^^^^

Expand Down
44 changes: 44 additions & 0 deletions docs/basics/_examples/DL-101-135-105
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
$ datalad --log-level debug status
[DEBUG] Command line args 1st pass. Parsed: Namespace() Unparsed: ['status']
[DEBUG] Discovering plugins
[DEBUG] Building doc for <class 'datalad.core.local.status.Status'>
[DEBUG] Parsing known args among ['/home/adina/env/handbook/bin/datalad', '--log-level', 'debug', 'status']
[DEBUG] Determined class of decorated function: <class 'datalad.core.local.status.Status'>
[DEBUG] Resolved dataset for status reporting: /home/me/dl-101/DataLad-101
[DEBUG] query AnnexRepo(/home/me/dl-101/DataLad-101).diffstatus() for paths: None
[DEBUG] AnnexRepo(/home/me/dl-101/DataLad-101).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done AnnexRepo(/home/me/dl-101/DataLad-101).get_content_info(...)
[DEBUG] AnnexRepo(/home/me/dl-101/DataLad-101).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Initiating a new process for BatchedCommand(cmd=['git', 'cat-file', '--batch'], output_proc=<function>, path='/home/me/dl-101/DataLad-101')
[DEBUG] Closing stdin of <subprocess.Popen object at 0x7ff4bcd847b8> and waiting process to finish
[DEBUG] Process <subprocess.Popen object at 0x7ff4bcd847b8> has finished
[DEBUG] Done AnnexRepo(/home/me/dl-101/DataLad-101).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/midterm_project).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/midterm_project).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/midterm_project).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/midterm_project).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/midterm_project/input).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/midterm_project/input).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/midterm_project/input).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/midterm_project/input).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/recordings/longnow).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done query repo: ['git', 'ls-files', '--stage', '-z', '-d', '-m', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/recordings/longnow).get_content_info(...)
[DEBUG] GitRepo(/home/me/dl-101/DataLad-101/recordings/longnow).get_content_info(...)
[DEBUG] Query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done query repo: ['git', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG] Done GitRepo(/home/me/dl-101/DataLad-101/recordings/longnow).get_content_info(...)
nothing to save, working tree clean
14 changes: 12 additions & 2 deletions docs/glossary.rst
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,10 @@ Glossary
Debian
A common Linux distribution. `More information here <https://www.debian.org/index.en.html>`__.

debugging
Finding and resolving problems within a computer program.
To learn about debugging a failed execution of a DataLad command, take a look at the section :ref:`debug`.

DOI
A digital object identifier (DOI) is a character string used to permanently identify
a resource and link to in on the web. A DOI will always refer to the one resource
Expand All @@ -97,8 +101,8 @@ Glossary
enables DataLad to extract and aggregate special types of metadata.

environment variable
A variable made up of a name/value pair. Programs using a given environment variable
will use its associated value for their execution.
A variable made up of a name/value pair. Programs using a given environment variable will use its associated value for their execution.
You can find out a bit more on environment variable :ref:`in this footnote <envvars>`.

ephemeral clone
TODO
Expand Down Expand Up @@ -160,6 +164,12 @@ Glossary
https
Hypertext Transfer Protocol Secure; A protocol for file transfer over a network.

logging
Automatic protocol creation of software processes, for example in order to gain insights into errors. To learn about logging to troubleshoot problems or remove or increase the amount of information printed to your terminal during the execution of a DataLad command, take a look at the section :ref:`logging`.

log level
Adjusts the amount of verbosity during :term:`logging`.

master
Git concept: The default :term:`branch` in a dataset.

Expand Down