Pedantic documentation tweaks #998

Merged · 5 commits · Feb 13, 2016
Changes from all commits
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -86,9 +86,9 @@ Before you submit a pull request from your forked repo, check that it
meets these guidelines:

1. The pull request should include tests, either as doctests, unit tests, or both.
1. If the pull request adds functionality, the docs should be updated as part of the same PR. Doc string are often sufficient, make sure to follow the sphinx compatible standards.
1. If the pull request adds functionality, the docs should be updated as part of the same PR. Doc string are often sufficient. Make sure to follow the sphinx compatible standards.
1. The pull request should work for Python 2.6, 2.7, and 3.3. If you need help writing code that works in both Python 2 and 3, see the documentation at the [Python-Future project](http://python-future.org) (the future package is an Airflow requirement and should be used where possible).
1. Code will be reviewed by re running the unittests, flake8 and syntax should be as rigorous as the core Python project.
1. Code will be reviewed by re running the unittests and flake8. Syntax should be as rigorous as the core Python project.
1. Please rebase and resolve all conflicts before submitting.

## Running unit tests
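The body of the "Running unit tests" section is truncated in this view, so the following is only a hedged sketch of how a contributor might exercise the testing and flake8 guidelines from the checklist above; the `devel` extra and the exact commands are assumptions, not quoted from the repository.

```bash
# Install Airflow together with its test dependencies (extra name assumed)
pip install -e ".[devel]"

# Run the test suite, which covers doctests and unit tests
nosetests tests

# Run the same flake8 pass a reviewer would re-run
flake8 airflow tests
```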
2 changes: 1 addition & 1 deletion README.md
@@ -28,7 +28,7 @@ one to the other (though tasks can exchange metadata!). Airflow is not
in the [Spark Streaming](http://spark.apache.org/streaming/)
or [Storm](https://storm.apache.org/) space, it is more comparable to
[Oozie](http://oozie.apache.org/) or
[Azkaban](http://data.linkedin.com/opensource/azkaban).
[Azkaban](https://azkaban.github.io/).

Workflows are expected to be mostly static or slowly changing. You can think
of the structure of the tasks in your workflow as slightly more dynamic
36 changes: 17 additions & 19 deletions docs/installation.rst
@@ -177,48 +177,46 @@ In addition, users can supply an S3 location for storing log backups. If logs ar
Scaling Out on Mesos (community contributed)
''''''''''''''''''''''''''''''''''''''''''''
MesosExecutor allows you to schedule airflow tasks on a Mesos cluster.
For this to work, you need a running mesos cluster and perform following
For this to work, you need a running mesos cluster and you must perform the following
steps -

1. Install airflow on a machine where webserver and scheduler will run,
let's refer this as Airflow server.
2. On Airflow server, install mesos python eggs from `mesos downloads <http://open.mesosphere.com/downloads/mesos/>`_.
3. On Airflow server, use a database which can be accessed from mesos
slave machines, for example mysql, and configure in ``airflow.cfg``.
let's refer to this as the "Airflow server".
2. On the Airflow server, install mesos python eggs from `mesos downloads <http://open.mesosphere.com/downloads/mesos/>`_.
3. On the Airflow server, use a database (such as mysql) which can be accessed from mesos
slave machines and add configuration in ``airflow.cfg``.
4. Change your ``airflow.cfg`` to point executor parameter to
MesosExecutor and provide related Mesos settings.
`MesosExecutor` and provide related Mesos settings.
5. On all mesos slaves, install airflow. Copy the ``airflow.cfg`` from
Airflow server (so that it uses same sql alchemy connection).
6. On all mesos slaves, run
6. On all mesos slaves, run the following for serving logs:

.. code-block:: bash

airflow serve_logs

for serving logs.

7. On Airflow server, run
7. On Airflow server, to start processing/scheduling DAGs on mesos, run:

.. code-block:: bash

airflow scheduler -p

to start processing DAGs and scheduling them on mesos. We need -p parameter to pickle the DAGs.
Note: We need -p parameter to pickle the DAGs.

You can now see the airflow framework and corresponding tasks in mesos UI.
The logs for airflow tasks can be seen in airflow UI as usual.

For more information about mesos, refer `mesos documentation <http://mesos.apache.org/documentation/latest/>`_.
For any queries/bugs on MesosExecutor, please contact `@kapil-malik <https://github.com/kapil-malik>`_.
For more information about mesos, refer to `mesos documentation <http://mesos.apache.org/documentation/latest/>`_.
For any queries/bugs on `MesosExecutor`, please contact `@kapil-malik <https://github.com/kapil-malik>`_.
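
As a rough, non-authoritative sketch of steps 3-5 above (the key names come from a stock ``airflow.cfg``, the hostnames and paths are placeholders, and none of this is part of the diff):

.. code-block:: bash

    # Steps 3-4: in $AIRFLOW_HOME/airflow.cfg, point the executor and the metadata
    # database at values the mesos slaves can also reach, e.g. under [core]:
    #
    #   executor = MesosExecutor
    #   sql_alchemy_conn = mysql://airflow:airflow@mysql-host:3306/airflow
    #
    # Step 5: ship the very same file to each mesos slave so they share one
    # sql alchemy connection
    scp $AIRFLOW_HOME/airflow.cfg mesos-slave-1:/home/airflow/airflow/airflow.cfg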

Integration with systemd
''''''''''''''''''''''''
Airflow can integrate with systemd based systems. This makes watching your daemons easy as systemd
can take care restarting a daemon on failure. In the ``scripts/systemd`` directory you can find unit files that
have been tested on Redhat based systems. You can copy those ``/usr/lib/systemd/system``. It is assumed that
can take care of restarting a daemon on failure. In the ``scripts/systemd`` directory you can find unit files that
have been tested on Redhat based systems. You can copy those to ``/usr/lib/systemd/system``. It is assumed that
Airflow will run under ``airflow:airflow``. If not (or if you are running on a non Redhat based system) you
probably need adjust the unit files.
probably need to adjust the unit files.

Environment configuration is picked up from ``/etc/sysconfig/airflow``. An example file is supplied
. Make sure to specify the ``SCHEDULER_RUNS`` variable in this file when you run the schduler. You
can also define here, for example, ``AIRFLOW_HOME`` or ``AIRFLOW_CONFIG``.
Environment configuration is picked up from ``/etc/sysconfig/airflow``. An example file is supplied.
Make sure to specify the ``SCHEDULER_RUNS`` variable in this file when you run the schduler. You
can also define here, for example, ``AIRFLOW_HOME`` or ``AIRFLOW_CONFIG``.
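
A hedged sketch of that setup, assuming the unit files shipped in ``scripts/systemd`` and a Redhat-style host; the unit names and variable values below are illustrative assumptions, not taken from this diff.

.. code-block:: bash

    # Copy the tested unit files into place
    sudo cp scripts/systemd/airflow-*.service /usr/lib/systemd/system/

    # Environment picked up by the units; SCHEDULER_RUNS must be set for the scheduler
    sudo tee /etc/sysconfig/airflow <<'EOF'
    SCHEDULER_RUNS=5
    AIRFLOW_HOME=/home/airflow/airflow
    AIRFLOW_CONFIG=/home/airflow/airflow/airflow.cfg
    EOF

    # Enable and start the daemons under the airflow user configured in the units
    sudo systemctl enable airflow-webserver airflow-scheduler
    sudo systemctl start airflow-webserver airflow-scheduler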
2 changes: 1 addition & 1 deletion docs/plugins.rst
@@ -79,7 +79,7 @@ looks like:
Example
-------

The code bellow defines a plugin that injects a set of dummy object
The code below defines a plugin that injects a set of dummy object
definitions in Airflow.

.. code:: python
8 changes: 4 additions & 4 deletions docs/tutorial.rst
@@ -77,12 +77,12 @@ at first) is that this Airflow Python script is really
just a configuration file specifying the DAG's structure as code.
The actual tasks defined here will run in a different context from
the context of this script. Different tasks run on different workers
at different point it time, which means this script cannot be directly
to cross communicate between tasks for instance. Note that for this
at different points in time, which means that this script cannot be used
to cross communicate between tasks. Note that for this
purpose we have a more advanced feature called ``XCom``.

People sometimes think of the DAG definition file as a place where they
can do some actual data processing, that is not the case at all!
can do some actual data processing - that is not the case at all!
The script's purpose is to define a DAG object. It needs to evaluate
quickly (seconds, not minutes) since the scheduler will execute it
periodically to reflect the changes if any.
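
One way to convince yourself of this is a quick check from the command line; the following is a hedged sketch using Airflow 1.x CLI commands, with the file path and date as placeholders.

.. code-block:: bash

    # Evaluating the definition file only builds the DAG object; it should
    # return within seconds and runs no tasks at all
    python ~/airflow/dags/tutorial.py

    # Inspect the structure that the file declared
    airflow list_tasks tutorial --tree

    # Actually running a task happens later, in a separate worker context
    airflow test tutorial print_date 2016-02-13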
@@ -420,7 +420,7 @@ running against it should get it to get triggered and run every day.

Here's a few things you might want to do next:

* Take an in-depth tour of the UI, click all the things!
* Take an in-depth tour of the UI - click all the things!
* Keep reading the docs! Especially the sections on:

* Command line interface