[AIRFLOW-5088][AIP-24] Persisting serialized DAG in DB for webserver scalability #5743

Merged
merged 64 commits into from
Oct 24, 2019

Commits on Oct 22, 2019

  1. 67515c5
  2. clean flake8 and pylint warnings

    coufon authored and kaxil committed Oct 22, 2019
    07a7481
  3. use create_session instead of provide_session

    coufon authored and kaxil committed Oct 22, 2019
    bf11bba
  4. make dagbag.get_dag/dag.get_bag called by webserver/scheduler controlled by dagcached_enabled and only_from_file

    coufon authored and kaxil committed Oct 22, 2019
    e58d100
  5. fix a degradation that inspect.getsource does not support functools.partial

    coufon authored and kaxil committed Oct 22, 2019
    1db7933
  6. add unit tests of SerializedDagModel

    coufon authored and kaxil committed Oct 22, 2019
    f104fc3
  7. 4559939
  8. bb65a22
  9. 2efd089
  10. remove hardcoded example DAGs in serialization unit tests

    coufon authored and kaxil committed Oct 22, 2019
    0eb6898
  11. Add logs

    kaxil committed Oct 22, 2019
    2124fe8
  12. move enum module to enums to avoid confusion

    coufon authored and kaxil committed Oct 22, 2019
    ab443c4
  13. Add config to airflow.cfg file

    kaxil committed Oct 22, 2019
    b8cbd55
  14. 4d46b8d
  15. fix operator displaying as SerializedBaseOperator on UI

    coufon authored and kaxil committed Oct 22, 2019
    480dccc
  16. add subdags into dagbag when loading from database

    coufon authored and kaxil committed Oct 22, 2019
    ffc1441
  17. clean mypy and pylint

    coufon authored and kaxil committed Oct 22, 2019
    55825ee
  18. 79c5f42
  19. Move the date check into SQL

    kaxil committed Oct 22, 2019
    cb77194
  20. Fix logging error

    kaxil committed Oct 22, 2019
    6dd110f
  21. 35f62b5
  22. Add test to validate if task.subdag is None if operator is not SubDagOperator

    Fix CI by filtering DAG list and removing SubDAGs

    Figured out that the issue was because it tries to remove a SubDAG from the DB, which does not exist as a separate row, and hence fails.
    kaxil committed Oct 22, 2019
    ef65c7b
  23. Use session wherever available

    kaxil committed Oct 22, 2019
    cb688d9
  24. 296c217
  25. 09f6f5c
  26. 2e00f9c
  27. Store JSON schema as static, package-data JSON file

    - Since it is now in a single document we can use internal schema
    references to reuse common chunks
    
    - Don't store dag_id against serialized tasks
    
      We never store serialized operators apart from their dags, so we don't
      need to include the dag_id in the structure again. This change means
      we need to re-associate the task and the dag after inflating the dag,
      but does also mean we don't need to pass `visited_dags` around all over
      the place
    
    - Always serialize template_fields as a list, never an array
    
      Although many operator classes set these as tuples instead of lists,
      that is a distinction that is not important to us here, and makes the
      schema more complex.
    ashb authored and kaxil committed Oct 22, 2019
    a1102a9
  28. 3423182
  29. bfb3f32
  30. d2436ad
  31. 927f610
  32. 825abe0
  33. 3228d04
  34. improve fileloc hashing in DAG persistence

    coufon authored and kaxil committed Oct 22, 2019
    c2c0ddb
  35. Add documentation

    kaxil committed Oct 22, 2019
    6e8da2c
  36. 8b84343
  37. 94c69c7
  38. Update timezone class

    kaxil committed Oct 22, 2019
    01e5f73
  39. 88ce053
  40. b5ee858
  41. 4932254
  42. Code Cleanup for JSON columns

    kaxil committed Oct 22, 2019
    ebe4ec7
  43. a2b27f0
  44. 89a03a6
  45. Add Debug info

    kaxil committed Oct 22, 2019
    0754f61
  46. Trial reducing size of SerializedDAGs

    Don't store defaults, and remove many (but not all) of the type
    annotations in the JSON. Combined, this reduces the size of our large
    test DAGs to 40% of what they were.
    
    (cherry picked from commit a590ed3253f1f41e46a30df966cf17d5dae989f3)
    ashb authored and kaxil committed Oct 22, 2019
    da0d59a
  47. Add specific test for schedule_interval serialization

    (cherry picked from commit 073464a17ab1b997b50994df7513b57e99f570c5)
    ashb authored and kaxil committed Oct 22, 2019
    ae03cf0
  48. Support dateutil.relativedelta in SerializedDAGs

    This was a valid type for schedule_interval already, so we should
    continue supporting it
    
    (cherry picked from commit ec9d705f1a90790bdcb099196269c77d3cc3d53c)
    ashb authored and kaxil committed Oct 22, 2019
    69b242c
  49. Cleanup

    kaxil committed Oct 22, 2019
    4bf9eb5
  50. Fix imports for iSort

    kaxil committed Oct 22, 2019
    a498a62
  51. Delete non-existent Dags

    kaxil committed Oct 22, 2019
    b127f37
  52. Remove comment

    kaxil committed Oct 22, 2019
    525cccd
  53. 21fafb6
  54. f4b3e7d
  55. Change maxDiff to None

    kaxil committed Oct 22, 2019
    facbef7
  56. Fix CI

    kaxil committed Oct 22, 2019
    83f2007
  57. Just-in-time loading of DagBag in webserver

    To save start-up time (and memory), this changes the DagBag so that it
    is not populated by the webserver on start-up; when a specific DAG is
    requested, it is loaded on demand from the SerializedDAG table.
    
    Co-Authored-By: Ash Berlin-Taylor <ash_github@firemirror.com>
    kaxil and ashb committed Oct 22, 2019
    69f7854
  58. 7c96ed8
  59. Add support for OperatorLinks

    ExtraOperatorLinks are supported if Plugins are registered for them
    kaxil committed Oct 22, 2019
    9cb6e28
  60. Cleanup

    kaxil committed Oct 22, 2019
    e840616
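
The "Trial reducing size of SerializedDAGs" commit above describes omitting default values from the stored JSON. The following is only a minimal sketch of that general idea, not the code from this PR: `FIELD_DEFAULTS`, `serialize_task`, and `deserialize_task` are hypothetical names, and the defaults table is invented for illustration.

```python
import json

# Hypothetical defaults table, invented for this sketch. A real serializer
# would have to derive defaults from the DAG/operator definitions.
FIELD_DEFAULTS = {
    "retries": 0,
    "queue": "default",
    "depends_on_past": False,
    "pool": "default_pool",
}


def serialize_task(fields):
    """Keep only the fields whose value differs from the known default."""
    return {
        key: value
        for key, value in fields.items()
        if key not in FIELD_DEFAULTS or FIELD_DEFAULTS[key] != value
    }


def deserialize_task(stored):
    """Restore omitted fields by layering stored values over the defaults."""
    restored = dict(FIELD_DEFAULTS)
    restored.update(stored)
    return restored


task = {
    "task_id": "extract",
    "retries": 0,
    "queue": "default",
    "depends_on_past": False,
    "pool": "etl",
}

compact = serialize_task(task)
full_size = len(json.dumps(task))
compact_size = len(json.dumps(compact))
```

Because the defaults are re-applied on load, the round trip is lossless as long as the writer and the reader agree on the same defaults table. The actual saving depends on how many fields are left at their defaults.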

Commits on Oct 23, 2019

  1. 5ccd878
  2. Fix docs building

    Kamil Breguła authored and kaxil committed Oct 23, 2019
    2b23a65
  3. be21a0f
  4. 162da54
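
The "Just-in-time loading of DagBag in webserver" commit from Oct 22 describes a DagBag that starts empty and loads a DAG only when it is asked for. A rough sketch of that on-demand pattern follows. `LazyDagBag` and the in-memory `store` dictionary are hypothetical stand-ins for the real DagBag and the SerializedDAG table, not Airflow's API.

```python
import json


class LazyDagBag:
    """Hypothetical stand-in for a DagBag that loads DAGs on first request."""

    def __init__(self, serialized_store):
        # serialized_store maps dag_id -> JSON string; it stands in for the
        # database table that holds serialized DAGs.
        self._store = serialized_store
        self.dags = {}   # starts empty: nothing is loaded at start-up
        self.loads = 0   # counts how many times the store was actually read

    def get_dag(self, dag_id):
        """Return the DAG, reading it from the store only on a cache miss."""
        if dag_id not in self.dags:
            raw = self._store.get(dag_id)
            if raw is None:
                return None
            self.dags[dag_id] = json.loads(raw)
            self.loads += 1
        return self.dags[dag_id]


store = {
    "example_dag": json.dumps({"dag_id": "example_dag", "tasks": ["a", "b"]}),
}
bag = LazyDagBag(store)
```

Calling `bag.get_dag("example_dag")` twice reads the store only once; the second call is served from `bag.dags`. This sketch never refreshes a cached entry, so a real implementation would also need some way to pick up DAGs that changed after they were first loaded.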