-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathproposal.tex
112 lines (84 loc) · 10.8 KB
/
proposal.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
% preamble
\documentclass[a4paper]{article}
\usepackage{cite}
% Beginning of document
\begin{document}
\input{./proposal_title.tex}
\section{Introduction}
\label{sec:introduction}
The concept of functional programming is becoming far more widespread and important in recent times. Object orientated and imperative languages like C++ and Java are beginning to use functional concepts such as type inference, first-class functions and more. The C++11 specification defines a new `auto' type which causes the compiler to infer the variable type from its content\cite{web:autokeyword}. Furthermore, Java 1.8 now includes Lambda Expressions in conjunction with a range of modifications to the libraries that allow its use\cite{web:javalambda}. Companies such as Jane Street -- a quantitative trading firm -- use OCaml entirely for all their high performance trading software and algorithms\cite{web:janestreettech}, and contribute to the advancement of the OCaml language.
OCaml is a great example of a powerful and robust functional language. It includes everything you would expect from a functional language and more such as object orientated and imperative programming paradigms, and is derived from the highly expressive ML programming language. Its powerful type system provides great type safety, eliminating many of the associated runtime errors; automatic memory management through it's incremental garbage collector and strict evaluation branching from it's ML roots in theorem proving. Parallelism in OCaml is however limited due to a global runtime lock present in the system. The purpose of this lock is to prevent unsafe use of non re-entrant code within the core OCaml library, but more importantly is present due to the fact that the OCaml garbage collector is not multi-core friendly\cite{clerc2012}. The resulting issue is that only one piece of OCaml code may be executed at a time leading to very little capacity for parallelism.
Java on the other hand is a widely supported, widely available software platform and programming language used in devices like desktop/laptop PC's, mobile phones, ATM's and even credit/debit cards\cite{web:aboutjava}. Its huge community of developers, wide selection of libraries and ability to write easily portable and distributable software means it has continued to grow rapidly over the years. Java also has a thread safe concurrency library which works in conjunction with its parallel and concurrent G1 garbage collector.
Both languages have their pros and cons however the OCaml-Java project aims to bring the best of both languages together by compiling OCaml to Java bytecode, taking advantage of OCaml's powerful typing system and Java's feature rich library. In addition, OCaml code compiled with OCaml-Java uses Java's G1 garbage collector meaning that the software is able to run in parallel across multiple cores without restriction. OCaml-Java comes with a relatively low-level concurrency library which allows developers to utilise its multi-core capabilities, however these low-level modules can be cumbersome to use frequently, especially considering there are some very popular and powerful concurrency libraries for OCaml -- such as Ocsigen's Light Weight Threads (LWT).
\section{Substance and Structure of the Project}
\label{sec:substance}
The aim of this project is to port the LWT concurrency library to use OCaml-Java's low-level threading library therefore taking advantage of Java's multi-core processing capabilities and providing an easier, more robust means of writing parallelised OCaml code to run on the JVM. LWT is a widely used monadic concurrency library for OCaml with strong emphasis on the `light' aspect of their threads. Spawning a thread in LWT is a very cheap operation and as such, threading with LWT fits well with the short-lived-data aspect that is well acquainted with functional languages. Furthermore the results of the project will provide some interesting results into the increased scalability of OCaml code running in parallel across multiple cores.\\
Thinking modularly, the project can be divided into three distinct sections:
\begin{description}
\item[Porting the LWT Library:] Porting the library to use OCaml-Java will allow the execution of software written in LWT under OCaml-Java, running on a single thread. This will mainly consist of writing a back-end to act as a pipeline between LWT and OCaml-Java.
\item[Making LWT parallel:] Making the ported LWT library parallel will mean utilising many Java threads to execute the LWT threads. These LWT threads are as mentioned very lightweight and are frequently spawned, Java threads on the other hand are not lightweight and as such a one-to-one mapping of LWT to Java threads will not be possible. To resolve this issue, and thus connect the two sides of the equation, an appropriate scheduler must be written to distribute LWT threads between the fewer Java threads that are active on each core.
\item[Benchmarking:] The final aspect of the project is to evaluate how the scalability of software written in LWT scales when run under my ported version as opposed to when run under default OCaml. It will be necessary to convert existing parallel processing benchmarks to use the ported LWT library and also test the scalability on hardware of varying parallel processing power, such as on a dual-core, quad-core and 48-core machine.
\end{description}
\section{Success Criteria}
\label{sec:success}
For the project to be a success I have set the following requirements:
\begin{enumerate}
\item{I am able to show the scalability difference between two programs using the same piece of LWT code, one compiled using the original LWT libraries under the normal OCaml compiler and one compiled using my ported LWT libraries under OCaml-Java, running on the JVM.}
\item{Parallel processing benchmark can be used to determine the scalability using LWT under OCaml-Java and using LWT under the normal OCaml environment.}
\end{enumerate}
\section{Starting Point}
\label{sec:starting}
My experience in functional programming extends as far as the ML courses in Part IA, however I enjoyed completing Project Euler challenges in ML (of which I was normally one of the few people to do so!). Neither have I developed code of a substantial size in ML or OCaml, therefore the best place place to begin would be looking into the structure of open source OCaml code and get a feel for how things are laid out in OCaml. In addition the book `Real World OCaml'\cite{madhavapeddy2013} has a public beta available before publication (lucky for me) which will also provide a solid foundation in producing real world OCaml code.
This project relies on the OCaml-Java project, an OCaml to Java bytecode compiler, developed by Xavier Clerc at INRIA. This project provides the ability to use the numerous Java libraries as well as Java's already mentioned parallel processing capabilities. Furthermore the JVM can be found installed on many different types of hardware and systems meaning the target hardware for OCaml code will be greatly increased by using OCaml-Java. Xavier Clerc, having strong connections with the OCaml Labs here in Cambridge, is also keen to support this project. The version of OCaml-Java I will be using is an early preview release of version 2.0.
Finally getting to grips with the innards of LWT will be most easily achievable by experimenting with the software first hand and diving into the source code available on Github.
\section{Optional Extensions}
\label{sec:optional}
Some ideas for optional extensions include:
\begin{itemize}
\item{Experiment with different schedulers to see which perform better and investigate why.}
\item{Attempt to get higher compatibility rate than initially decided which would mean ironing out incompatibilities with current software written with LWT when run with my ported libraries.}
\end{itemize}
\section{Timetable and Milestones}
\label{sec:timetable}
\begin{description}
\item[17$^{th}$ Oct -- 6$^{th}$ Nov] Further research into LWT's source. Research OCaml coding techniques.\\
Milestone: Complete LWT coding tutorial; Read through `Real World OCaml'; Implement some parallel programs in OCaml-Java 2.0; ready to begin planning.
\item[7$^{th}$ Nov -- 27$^{th}$ Nov]
Continued research into LWT's source. Plan implementation of basic functionality of the ported library.\\
Milestone: Ready to begin development.
\item[28$^{th}$ Nov -- 18$^{th}$ Dec] Begin porting LWT to single threaded interfacing with OCaml-Java.\\
Milestone: Some basic functionality implemented.
\item[19$^{th}$ Dec -- 8$^{th}$ Jan] Continued porting of LWT to single threaded interfacing with OCaml-Java.\\
Milestone: Porting complete. Should be able to walk through the LWT tutorial without much/any hassle.
\item[9$^{th}$ Jan -- 29$^{th}$ Jan] Begin creating the scheduler to manage the distribution of LWT threads to OCaml-Java threads.\\
Milestone: Fixed number of Java threads are able to be scheduled to, regardless of number of cores on system.
\item[30$^{th}$ Jan -- 19$^{th}$ Feb] Continued scheduler development.\\
Milestone: Number of Java threads scales with the number of cores present on the system.
\item[20$^{th}$ Feb -- 12$^{th}$ Mar] Begin conversion of parallel processing benchmarks.\\
Milestone: Most of benchmarking software conversion complete, fixed relevant bugs which may arise from testing with the benchmark.
\item[13$^{th}$ Mar -- 2$^{nd}$ April] Testing with benchmarks, finishing dissertation evaluation and write-up.\\
Milestone: Relevant scalability graph data accumulated.
\item[3$^{rd}$ April -- 23$^{rd}$ April] Finalising dissertation write-up.\\
Milestone: Dissertation complete!
\end{description}
\section{Resources Required and Backup}
\label{sec:resources}
A list of required resources:
\begin{itemize}
\item{My own Machines:
\begin{itemize}
\item{Macbook Air 13" 2013 (8GB RAM, 512GB Storage, OSX/Windows 8)}
\item{Desktop Computer (Ubuntu 13.04/Windows 7, Intel Core2 Quad, 5TB Storage)}
\end{itemize}
}
\item{Machines in College}
\item{A many-core machine in the SRG (32 - 48 Cores; e.g. `Roo')}
\item{OCaml-Java and Sources\footnote{OCaml-Java: http://ocamljava.x9c.fr/}}
\item{OCaml\footnote{OCaml: http://ocaml.org}}
\item{LWT Sources\footnote{LWT: http://ocsigen.org/lwt/}}
\end{itemize}
Backups will be provided by Dropbox, Skydrive, my own personal USB stick and my personal storage on the PWF facilities. Cloud backup will be incredibly useful for keeping my data accessible from many locations and also avoiding local corruption which could occur at any time.
Version control (and also the main copy of my dissertation/project) will be provided by Github. I chose Github since I am already very familiar with git version control; Github provides cloud storage so I'm able to retrieve and work on my files from various locations; furthermore Github provides free private repositories for students.
\bibliographystyle{plain}
\bibliography{resources}
\end{document}