Add solutions for week 1

wmutschl · Oct 19, 2023 · 4f1de44
1 parent 043909f
Showing 10 changed files with 677 additions and 1 deletion.
diff --git a/exercises/gitkraken_quick_tour_solution.tex b/exercises/gitkraken_quick_tour_solution.tex
@@ -0,0 +1,197 @@
+\begin{enumerate}
+\item \emph{git} is a version control system, i.e. a way to track changes to code, text, documents, data etc.
+	It lets you go back and forth between many different versions of the same file, and see a list of the differences.
+	Collaboration becomes (technically) very easy and straightforward as people can work on different files or different versions of the same file simultaneously
+  	  and afterwards merge their changes.
+
+	\emph{git} is the most popular version control system. It was created in 2005 to track the development of one of the world's largest open-source projects: the Linux kernel.
+    It is a \textbf{command line tool}, and at some point you should learn the commands on the Command Line Interface (CLI).
+	However, there are many graphical user interface (GUI) programs that make getting started with Git much easier
+	  and integrate seamlessly with online collaboration platforms such as GitHub or GitLab.
+	Therefore, in this exercise I will focus on such a tool called GitKraken,
+	  which I use daily and highly recommend.\footnote{Other GUI programs work very similarly,
+	  see \url{https://en.wikipedia.org/wiki/Comparison_of_Git_GUIs} for a comparison of features.
+	  I also recommend the built-in git functionality of Visual Studio Code.}
+	GitKraken offers a free trial of their paid license,
+	  but also offers a free version for use on publicly-hosted repositories (which is fine for our purposes as we will mostly do stuff locally anyway).
+	Additionally, GitKraken also offers the Pro license FREE to students and teachers through the GitHub Student Developer Pack (\url{https://education.github.com/pack}).
+
+	While git and GitKraken are tools that you install on your computer,
+      GitHub is an online platform that provides a nice visual interface to help you manage your version-controlled projects remotely.
+    It is the largest git repository hosting service and has become by far the largest open-source collaboration site.	
+	Another important online platform is GitLab, as you can also host it yourself on your own computer or server.
+	For individuals, Gitea offers yet another way to self-host a stripped-down alternative to GitHub or GitLab.
+	I personally have accounts on GitHub (\url{https://github.com/wmutschl}) and GitLab (\url{https://gitlab.com/wmutschl}),
+	  but also use self-hosted versions of GitLab (\url{https://git.dynare.org/wmutschl}) and Gitea (\url{https://git.mutschler.eu}) to mirror my projects.
+\item A typical workflow looks like this:
+	\begin{itemize}
+	\item retrieve data and prepare it for estimation purposes
+	\item select a model framework, decide on certain hyperparameters and modeling choices and then run an estimation
+	\item prepare tables, graphs and reports
+	\end{itemize}
+	All of these tasks heavily rely on coding, i.e. putting text into some files that are then evaluated by software
+	  that actually performs the tasks.
+	Moreover, we will see that estimating macroeconomic models requires a lot of trial and error and accordingly those files constantly change and need to be adapted.
+	\emph{git} enables you to track these changes as it gives you an organized revision history.
+	So you can experiment with your code, make changes to a project and always keep the ability to go back and forth between versions.
+	So stop naming files like \emph{2022-10-17-master-thesis-v2-final-now-really-final.tex}
+	  and let \emph{git} do its magic for you by simply tracking the file \emph{thesis.tex} with all of its revision history.
+\item Follow the instructions provided in the links or get in touch if you are struggling with the installation.
+\item Follow the instructions provided in the links or get in touch if you are struggling with the installation.
+\item Follow the instructions provided in the links or get in touch if you are struggling with the installation.
+\item In GitKraken: Open a new tab, click \emph{Start a local repo}, then on the \emph{Init} tab select \emph{Local Only} and fill out the details.
+  Note that GitKraken automatically creates a first commit with a \texttt{README.md} file.
+  Inside every repository there is a hidden folder \texttt{.git}.
+  It contains everything recorded by \emph{git}, i.e. the complete history of all changes you will ever commit.
+  Never delete this folder!
+  Also, putting a repository into a cloud storage folder might damage this folder,
+    so best practice is to use a local folder on the disk.
+  We will cover how to push the repository to a so-called remote, which works basically like syncing,
+	but is much more robust and git-ier.
+\item Now the benefits of using a GUI like GitKraken become evident,
+  as our changes are displayed in the \emph{Unstaged Files} area
+  and by clicking on the file we get a really pretty side-by-side comparison of all the changes.
+  We can now decide which lines we want to \texttt{stage} and \texttt{commit}.
+\item The git model looks like the following diagram:
+\begin{center}
+
+\begin{tikzpicture}[mypostaction/.style 2 args={
+	decoration={
+		 text align={
+			   left indent=#1},
+			   text along path, 
+			   text={#2}
+			   },
+	  decorate
+   }
+]
+	\node[draw,		
+		fill=Rhodamine!50,
+		minimum width=3.5cm, 
+	 	minimum height=3cm,
+		text width=3cm,
+		text centered
+	] (unstaged) at (0,0){File changes in working directory (Unstaged Files)};
+
+	\node [draw,
+		fill=Goldenrod,
+		minimum width=3.5cm,
+		minimum height=3cm,
+		text width=3cm,
+		text centered,
+		right=1cm of unstaged
+	]  (staged) {Staged files};
+
+	\node [draw,
+		fill=SpringGreen, 
+	 	minimum width=3.5cm, 
+	 	minimum height=3cm,
+		text width=3cm,
+		text centered,
+	 	right=1cm of staged
+	] (local) {Local repository};
+
+	\node [draw,
+		fill=SeaGreen, 
+		minimum width=3.5cm, 
+		minimum height=3cm,
+		text width=3cm,
+		text centered,
+		right=1cm of local
+	]  (remote) {Remote repository (GitHub, GitLab) [optional]};
+
+	% Arrows with text label
+	\coordinate (unstageRoot) at (-0.5,2); \coordinate (stageRoot) at (4,2);
+	\draw[-latex, blue!20!white, line width=2ex]  (unstageRoot) to[in=135,out=90] (stageRoot);	
+	\path [postaction={mypostaction={1cm}{stage changes}},postaction={mypostaction={1.5cm}{git add},/pgf/decoration/raise=-3mm}](unstageRoot) to [in=180,out=3] (stageRoot);
+
+	\coordinate (stageRoot) at (4.5,2); \coordinate (localRoot) at (9,2);
+	\draw[-latex, blue!20!white, line width=2ex]  (stageRoot) to[in=135,out=90] (localRoot);	
+	\path [postaction={mypostaction={1cm}{commit changes}},postaction={mypostaction={1.5cm}{git commit},/pgf/decoration/raise=-3mm}](stageRoot) to [in=180,out=3] (localRoot);
+
+	\coordinate (localRoot) at (9.5,2); \coordinate (remoteRoot) at (14,2);
+	\draw[-latex, blue!20!white, line width=2ex]  (localRoot) to[in=135,out=90] (remoteRoot);	
+	\path [postaction={mypostaction={1cm}{push changes}},postaction={mypostaction={1.5cm}{git push},/pgf/decoration/raise=-3mm}](localRoot) to [in=180,out=3] (remoteRoot);
+	\end{tikzpicture}
+\end{center}
+	You do your work in your \textbf{working directory}.
+	On the \textbf{stage} you collect all the changes that you want to save.
+	This is very powerful because sometimes it is just individual lines of code or text that you want to keep track of and not the whole file.
+	Once you've staged all the changes that you want to combine, it is time to bundle these changes into a \textbf{commit}.
+	A commit is a permanent snapshot of the files that git tracks, stored in the \texttt{.git} directory. It is associated with a unique identifier (hash).
+	In other words, a commit is like a snapshot in time; you can always go back to it and see what changes were made compared to any other commit.
+	On your local repository (i.e. on your local machine) you now have a nice versioned history.
+	However, if you want to collaborate with others or sync your repository to a specialized cloud provider, you need to push these changes to a so-called remote repository,
+	  typically on GitHub or GitLab, but any folder that you can access remotely might serve as a remote repository.
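+
+	On the command line, one pass through this cycle might look roughly as follows.
+	This is only a sketch: the file name \texttt{thesis.tex}, the commit message and the branch name \texttt{main} are placeholders,
+	  and the last command assumes that a remote called \texttt{origin} has already been added.
+\begin{lstlisting}[language=bash]
+git status                         # list modified files in the working directory
+git add --patch thesis.tex         # interactively pick the hunks to stage
+git commit -m "Add introduction"   # store the staged changes as a commit
+git push -u origin main            # upload the commits to the remote
+\end{lstlisting}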
+\item Click on the file and select \texttt{Stage File} or stage individual lines by clicking on the plus or minus signs to the left of each line.
+Once you are happy with the file, click on the X to close the file-comparison window.
+There are now no unstaged files left, so we can write a commit message and then click on the big green button.
+\item A \emph{good commit} typically does one discrete task or change only.
+	For example, you added a variable to the regression specification in the code, in the output and in the report.
+	Or you changed the name of a variable and updated it consistently across multiple scripts.
+	This enables you to write meaningful commit messages like \emph{Add year dummies to regression specification}
+	and you thus end up with a well-organized repository.
+	This workflow needs some practice, and everyone handles it slightly differently.
+	Nevertheless, try to group changes into small, meaningful tasks and provide good commit messages.
+	In my experience, having ten tiny commits is always preferable to one large commit.
+	Your future self and collaborators will thank you!
+
+	The question of what to include in your commits
+	  is also a matter of choice and preference.
+	Definitely include your code scripts, LaTeX sources and text files.
+	Data is sometimes stored in CSV files, which are basically just text files.
+	Binary files (like Excel sheets, Word documents, PowerPoint slides) are a bit tricky to handle,
+	  as git cannot show you meaningful differences between versions.
+	Whether to commit these files as well depends on your specific needs (e.g. for Excel files with data this obviously makes sense),
+	  but I usually don't do this.
+	Note that GitHub blocks files larger than 100 MB and recommends keeping repositories small, ideally below 1 GB.
+	There is also a way to deal with large binary files called \texttt{Git Large File Storage (LFS)}, but we won't need this.
+\item Right click on the initial commit and select \emph{Reset main to this commit - Soft}.
+Click on the file in the staged files section and remove the last line from the stage.
+Re-commit your stage by providing a meaningful commit message and hitting the green button.
+Click on Stash to put the remaining changes into the stash.
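+
+A rough command-line equivalent is sketched below.
+It assumes that the commit to be redone sits directly on top of the initial commit and that the edited file is \texttt{README.md};
+  both are assumptions for illustration, and the commit message is a placeholder.
+\begin{lstlisting}[language=bash]
+git reset --soft HEAD~1          # move main back one commit, keep the changes staged
+git reset --patch README.md      # interactively unstage the hunks you do not want to commit
+git commit -m "Update README"    # commit what is still staged
+git stash                        # put the remaining uncommitted changes into the stash
+\end{lstlisting}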
+\item Simply click on Push and add the remote. On the left panel click on REMOTE to see the current remote (usually named origin).
+Note that you can add several remotes (say from different people) and compare the commits.
+Remotes are also a nice backup of your code.
+\item Branches are arguably the most powerful part of \emph{git}.
+By default you have a \textbf{main} branch,
+  but what if you want to do some experiments, re-write an estimation function from scratch, work on a new feature, etc.?
+You could copy the whole folder and start working there, or you could use git to create a branch and make the changes there.
+You can switch between branches, make commits to any branch, move them around, etc.
+If your experiment doesn't work out, simply delete the branch.
+If your experiments work out, commit them and merge them into the main branch.
+Sometimes there will be conflicts which one needs to sort out,
+  but using GUI tools like GitKraken makes this very easy
+  as you have a pretty side-by-side comparison of changes.
+Branches are especially valuable for our purposes
+  as research is a highly nonlinear process, and this way of doing version control is much more similar to how we actually work
+  than the very linear way in which cloud storage providers do version control.
+Branches are also extremely powerful for collaboration
+  as different people can work on the same thing at the same time.
+
+Select a so-called parent commit, where you want to create a new branch.
+Note that this doesn't have to be the latest commit.
+Click on the button \emph{Branch} and name it according to the exercise.
+On the left panel, click on LOCAL to see an overview of all your branches.
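+
+For reference, a minimal command-line sketch of this branch workflow could look as follows.
+The branch name \texttt{experiment} and the hash \texttt{abc1234} of the parent commit are placeholders, and \texttt{git switch} requires Git 2.23 or newer.
+\begin{lstlisting}[language=bash]
+git switch -c experiment abc1234   # create the branch at the parent commit and switch to it
+git branch                         # list all local branches
+git switch main                    # go back to the main branch
+git merge experiment               # the experiment worked out: merge it into main
+git branch -D experiment           # the experiment failed: delete the branch instead, including unmerged work
+\end{lstlisting}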
+\item Create, copy and paste the three files into your repository.
+Check for pasting errors and then \emph{Stage all changes} and commit them.
+\item Run the commands and solve any errors you might get from latex.
+\item Follow the instructions in the exercise.
+Note that there is a difference between \enquote{Ignore} and \enquote{Ignore and Stop Tracking}.
+\enquote{Ignore} simply adds the file(type) to the \texttt{.gitignore} file so that new files with that name/type/whatever are not tracked.
+To \enquote{Ignore and Stop Tracking} means to remove the file(s) from git version control:
+  they will no longer be in the repo (as of the commit that performs the \enquote{stop tracking}).
+Basically, use \enquote{Ignore and Stop Tracking} if the file(s) you are ignoring never should have been in the repo in the first place.
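+
+On the command line, the two options correspond roughly to the following sketch; the pattern \texttt{*.aux} and the file \texttt{exam.aux} are made-up examples.
+\begin{lstlisting}[language=bash]
+echo "*.aux" >> .gitignore      # "Ignore": untracked files matching the pattern are no longer offered for staging
+git rm --cached exam.aux        # "Stop Tracking": remove the file from version control, keep it on disk
+git add .gitignore              # stage the updated ignore rules
+git commit -m "Ignore LaTeX auxiliary files"
+\end{lstlisting}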
+\item Make sure you are on the correct branch \emph{latex-exam-template}
+  and push this branch to GitHub.
+Either right click on the commit or go to the left panel, click on PULL REQUESTS and on the green plus sign that appears.
+Select the \emph{latex-exam-template} as the FROM REPO branch and \emph{main} as the TO REPO branch.
+Enter a Title and Description and click on the green button.
+Have a look at the pull request on GitHub.
+As there are no conflicts, merge it and go back to GitKraken to see what happens in your repository.
+You might need to \enquote{fetch origin} by right clicking on the origin remote.
+\item Double click on your local main branch and then click on pull,
+  which fast forwards your repo to the merged changes.
+  Then click on Pop to get back the work-in-progress (WIP) changes which were stashed on main.
+  Right click on the README.md file in the \emph{Unstaged Files} area and select \emph{Discard changes}.
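+
+  A rough command-line counterpart, assuming the remote is called \texttt{origin} and the work in progress was stashed earlier, might be:
+\begin{lstlisting}[language=bash]
+git switch main          # check out the local main branch
+git pull                 # fetch from the remote and fast-forward main
+git stash pop            # re-apply the stashed changes and drop them from the stash
+git restore README.md    # discard the unstaged changes to README.md
+\end{lstlisting}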
+\end{enumerate}
diff --git a/exercises/matlab_quick_tour_solution.tex b/exercises/matlab_quick_tour_solution.tex
@@ -0,0 +1 @@
+\lstinputlisting[style=Matlab-editor,basicstyle=\mlttfamily]{progs/matlab/quickTourMatlab.m}
diff --git a/exercises/programming_languages_solution.tex b/exercises/programming_languages_solution.tex
@@ -0,0 +1,87 @@
+\begin{enumerate}
+\item General purpose: C/C++, Fortran, Python, Excel.
+	Domain-specific: MATLAB, Julia, R, Mathematica, EViews.
+\item Every program is a set of instructions, say to add two numbers. 
+	Compilers and interpreters take human-readable code and convert it to computer-readable machine code.
+	In a compiled language, a compiler translates the whole program into machine code ahead of time, and the target machine then executes this machine code directly.
+	In an interpreted language, the source code is not translated into machine code ahead of time.
+	Instead, a different program, aka the interpreter, reads and executes the code.
+	Some modern languages like Python can have both compiled and interpreted implementations,
+	  but for simplicity's sake it is useful to keep in mind the distinction.
+
+	Compiled languages like Fortran, C or C++ are usually faster, more efficient and more powerful,
+	  but they are harder to learn and harder to code in.
+	They also require a build step, i.e. they need to be compiled.
+	Interpreted languages like Python, R, Mathematica, MATLAB or Julia are slower,
+	  but easier to learn and faster to code in.
+	Interpreters run through a program line by line and execute each command.
+	Interpreted languages tend to be very similar in their syntax,
+	  but differ in best practices and concepts.
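+
+	As a rough illustration of the build step, compare how one would run the same small program in both worlds;
+	  the file names \texttt{add.c} and \texttt{add.py} are hypothetical, and a C compiler and a Python interpreter are assumed to be installed.
+\begin{lstlisting}[language=bash]
+cc -O2 add.c -o add   # compiled: translate the source into a machine-code executable first
+./add                 # ... then run the executable directly
+python3 add.py        # interpreted: the interpreter reads and executes the source, no build step
+\end{lstlisting}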
+
+	Interpreted languages were once significantly slower than compiled languages.
+	However, with the development of just-in-time (JIT) compilation, that gap is shrinking.
+	MATLAB and Julia are two very prominent examples that make use of JIT compilation,
+	  that is, they combine both worlds.
+
+	You can also make use of e.g. Fortran or C++ code in MATLAB, R, Python or Julia;
+	  that is, write very CPU-intensive tasks in a compiled language
+	  and use them in an interpreted language.
+
+\item Learning a programming language is a huge investment;
+	however, once one has knowledge of one, learning another one tends to be easier
+	  as they are based on similar principles.
+	Try to stick with popular choices, as learning resources and communities
+	  that help you learn the language are more widely available for them,
+	  e.g. googling for help is much easier for Python than for Fortran.
+	Often the project you are working on dictates which programming language you should use.
+	The general purpose languages can be used in many non-scientific applications,
+	  so your investment might pay off in very different fields in the end.
+
+	In scientific computing, particularly in Macroeconomics,
+	  we are often faced with CPU-intensive problems
+	  and need to prototype models and methods quickly.
+	An interpreted language like MATLAB or Julia that does just-in-time compilation
+	  is therefore best suited for such tasks.
+	Moreover, having some basic knowledge in C++ is advisable
+	  to write computationally intensive tasks in a compiled language
+	  and reuse them as e.g. so-called MEX files in MATLAB.
+	However, the main determining factor is legacy code: looking at the last 20-30 years of research done
+	  in quantitative and computational Macroeconomics,
+	  we see that most of it was and still is conducted in MATLAB,
+	  whereas highly intensive tasks were programmed in Fortran.
+	So keep in mind that you need to understand this legacy codebase.
+	In the last couple of years, researchers in Macroeconomics have really been pushing Julia.
+	New developments like Machine Learning require you to invest in Python.
+	For writing scientific reports and papers you should get familiar with LaTeX and Markdown.
+
+	Other issues to consider are the license, the cost and the support provided by the language maintainers.
+	Most programming languages are free and open-source,
+	  while others like MATLAB are proprietary and quite expensive
+	  (free and open-source clones like Octave tend to be very slow unfortunately).
+	Regardless of the license, having a good governance structure,
+	  i.e. a board, corporation or company driving the development of the language,
+	  is very important for the sustainability of the language
+	  and for your investment in a computer language.
+
+	Lastly, and very importantly, have a look at the toolset available for the languages.
+	Which Integrated Development Environment (IDE) do you like best?
+	Which code editor do you prefer?
+	How good are the debugging capabilities of your chosen environment?
+	Things like syntax highlighting, smart indentation, code linting, comparison tools,
+	  handling of workspace, etc. are very important.
+	Some languages like MATLAB bring their own IDE in one big package and it works very well.
+	Others like Julia, Python or C++ can be neatly integrated in a variety of environments;
+	  in fact Visual Studio Code has become the leading editor and environment for many languages,
+	  but of course there are many other great choices depending on your needs and preferences.
+
+	So which computer languages should you devote your time to,
+	  if you are interested in computational or quantitative macroeconomics?
+	\\
+	\textbf{Here is my opinionated advice:}
+	\begin{itemize}
+		\item Default languages (excellent knowledge): Julia and MATLAB
+		\item Data analysis and Machine Learning (advanced knowledge): R and Python
+		\item Heavy tasks (basic knowledge): C++ and Fortran
+		\item Scientific writing (advanced knowledge): LaTeX and Markdown
+	\end{itemize}
+\end{enumerate}