\documentclass[11pt,a4paper]{article}
\usepackage{graphicx}
\usepackage{authblk}
\usepackage{hyperref}
\begin{document}
\title{TaskFlow: robust, resumable, reliable, composable distributed graph
execution, made for clouds, made for humans.}
\date{\today}
\author[1]{Joshua Harlow}
\author[2]{Min Pae}
\affil[1]{Cloud Platform Group, Yahoo!}
\affil[2]{Cue Team, HP}
\maketitle
\begin{abstract}
Scalability and reliability in distributed systems are \emph{hard problems}.
\href{http://docs.openstack.org/developer/taskflow/}{TaskFlow} provides
a framework for implementing workflows with specific provisions for scalability
and reliability. Using TaskFlow, projects such as Cue, Cinder, Glance, Octavia,
Magnum, Mirantis Pumphouse, Rackspace Cloud Big Data, and others inside (and
outside) of \href{http://www.openstack.org/}{OpenStack} are tackling issues of
scalability and reliability in a consistent and unified way. By exploring
the concepts TaskFlow is based on, we show how it was designed, what its
components are, and how it helps make building cloud platforms (and cloud
applications) that much easier to \emph{just get right}.
\end{abstract}
\section{Problem statement}
Creating reliable, scalable, and highly available services out of unreliable
components (computers, disks, networks, \ldots) is fraught with
difficulties and challenges. This appears especially true when those services
are developed in an open-source manner, where the definitions of scalable,
reliable, and highly available vary depending on the target end use and the
service provider. Through an evolutionary process, such code bases
converge on something that can handle a median of all those definitions, with
varying degrees of success. We believe this situation can be resolved
by a framework that provides a solid foundation, so that the
evolutionary median can be reached and exceeded much more quickly than it
otherwise would be.
\section{Distributed systems}
\subsection{Robustness}
\subsection{Reliability}
\subsection{Recoverability}
\section{Programming models}
\subsection{Declarative programming}
\subsection{Dataflow programming}
\subsection{State machines}
\section{Workflows}
\subsection{Modeling}
\subsection{Lifecycle management}
\subsubsection{Execution}
\subsubsection{Consistency, consistency, consistency}
\subsubsection{Ownership and loss of ownership}
\section{OpenStack}
\section{TaskFlow}
\subsection{Patterns}
\subsection{Engines}
\subsection{Persistent storage}
\subsection{Jobs}
\subsection{Conductors}
\section{Real world usage}
\section{Comparison with similar systems}
\section{Future work}
\end{document}