From 0adbe59a1f022e59c0e4815984851f90d9300a5e Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Sun, 13 Jan 2019 18:03:30 -0500 Subject: [PATCH 001/640] First draft of shell lecture --- shell.md | 209 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 208 insertions(+), 1 deletion(-) diff --git a/shell.md b/shell.md index 4742c541..dd111622 100644 --- a/shell.md +++ b/shell.md @@ -3,4 +3,211 @@ layout: page title: "Shell and Scripting" --- -Lecture notes will be available by the start of lecture. +The shell is an efficient, textual interface to your computer. + +The shell prompt: what greets you when you open a terminal. +Lets you run programs and commands; common ones are: + + - `cd` to change directory + - `ls` to list files and directories + - `mv` and `cp` to move and copy files + +But the shell lets you do _so_ much more; you can invoke any program on +your computer, and command-line tools exist for doing pretty much +anything you may want to do. And they're often more efficient than their +graphical counterparts. We'll go through a bunch of those in this class. + +The shell provides an interactive programming language ("scripting"). +There are many shells: + + - You've probably used `sh` or `bash`. + - Also shells that match languages: `csh`. + - Or "better" shells: `fish`, `zsh`, `ksh`. + +In this class we'll focus on the ubiquitous `sh` and `bash`, but feel +free to play around with others. I like `fish`. + +Shell programming is a *very* useful tool in your toolbox. +Can either write programs directly at the prompt, or into a file. +`#!/bin/sh` + `chmod +x` to make shell executable. 
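A minimal sketch of the `#!/bin/sh` + `chmod +x` step mentioned above, assuming a POSIX-like system where `mktemp -d` is available. The file name `hello.sh` and the variables `workdir` and `output` are made up for illustration and are not part of the lecture notes.

```shell
# Work in a throwaway directory so nothing in the current one is touched.
workdir=$(mktemp -d)

# Write a two-line script. The first line (the "shebang") names the
# interpreter that should run the rest of the file.
# `hello.sh` is a hypothetical file name chosen for this sketch.
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Mark the file as executable, then run it like any other program.
chmod +x "$workdir/hello.sh"
output=$("$workdir/hello.sh")
echo "$output"
```

Without the `chmod +x` step the file could still be run as `sh hello.sh`, but invoking it directly by path would normally be refused with a permission error.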
+ +## Working with the shell + +Run a command a bunch of times: +```shell +for i in $(seq 1 5); do echo hello; done +``` + +There's a lot to unpack: + - `for x in list; do BODY; done` + - `;` terminates a command -- equivalent to newline + - split `list`, assign each to `x`, and run body + - splitting is "whitespace splitting", which we'll get back to + - no curly braces in shell, so `do` + `done` + - `$(seq 1 5)` + - run the program `seq` with arguments `1` and `5` + - substitute entire `$()` with the output of that program + - equivalent to + ```shell + for i in 1 2 3 4 5 + ``` + - `echo hello` + - everything in a shell script is a command + - in this case, run the `echo` command, which prints its arguments, + with the argument `hello`. + - all commands are searched for in `$PATH` (colon-separated) + +We have variables: +```shell +for f in $(ls); do echo $f; done +``` + +Will print each file name in the current directory. +Can also set variables using `=` (no space!): +```shell +foo=bar +echo $foo +``` + +To only print directories +```shell +for f in $(ls); do if test -d $f; then echo dir $f; fi; done +``` + +More to unpack here: + - `if CONDITION; then BODY; fi` + - `CONDITION` is a command; if it returns with exit status 0 + (success), then `BODY` is run. + - can also hook in an `else` or `elif` + - again, no curly braces, so `then` + `fi` + - `test` is another program that provides various checks and + comparisons, and exits with 0 if they're true + - `man COMMAND` is your friend: `man test` + - can also be invoked with `[` + `]`: `[ -d $f ]` + - take a look at `man test` and `which "["` + +But wait! This is wrong! What if a file is called "My Documents"? + - `for f in $(ls)` expands to `for f in My Documents` + - first do the test on `My`, then on `Documents` + - not what we wanted! + - biggest source of bugs in shell scripts + +## Argument splitting + +Bash splits arguments by whitespace; not always what you want!
+ - need to use quoting to handle spaces in arguments + `for f in "My Documents"` would work correctly + - same problem somewhere else -- do you see where? + `test -d $f`: if `$f` contains whitespace, `test` will error! + - `echo` happens to be okay, because split + join by space + but what if a filename contains a newline?! turns into space! + - quote all uses of variables that you don't want split + - but how do we fix our script above? + what do you think `for f in "$(ls)"` does? + +Globbing is the answer! + - bash knows how to look for files using patterns: + - `*` any string of characters + - `?` any single character + - `{a,b,c}` any of these alternatives (brace expansion) + - `for f in *`: all (non-hidden) files in this directory + - when globbing, each matching file becomes its own argument + - still need to make sure to quote when _using_: `test -d "$f"` + - can make advanced patterns: + - `for f in a*`: all files starting with `a` in the current directory + - `for f in foo/*.txt`: all `.txt` files in `foo` + - `for f in foo/*/p??.txt` + all three-letter text files starting with p in subdirs of `foo` + +Whitespace issues don't stop there: + - `if [ $foo = "bar" ]; then` -- see the issue? + - what if `$foo` is empty? arguments to `[` are `=` and `bar`... + - _can_ work around this with quoting or `[ "x$foo" = "xbar" ]`, but bleh + - instead, use `[[`: bash built-in comparator that has special parsing + - also allows `&&` instead of `-a`, `||` over `-o`, etc. + +## Composability + +Shell is powerful in part because of composability. Can chain multiple +programs together rather than have one program that does everything. + +The key character is `|` (pipe).
+ - `a | b` means run both `a` and `b` + send all output of `a` as input to `b` + print the output of `b` + +All programs you launch ("processes") have three "streams": + - `STDIN`: when the program reads input, it comes from here + - `STDOUT`: when the program prints something, it goes here + - `STDERR`: a 2nd output the program can choose to use + - by default, `STDIN` is your keyboard, `STDOUT` and `STDERR` are both + your terminal. but you can change that! + - `a | b` makes `STDOUT` of `a` `STDIN` of `b`. + - also have: + - `a > foo` (`STDOUT` of `a` goes to the file `foo`) + - `a 2> foo` (`STDERR` of `a` goes to the file `foo`) + - `a < foo` (`STDIN` of `a` is read from the file `foo`) + - hint: `tail -f` will print a file as it's being written + - why is this useful? lets you manipulate output of a program! + - `ls | grep foo`: all files whose names contain `foo` + - `ps | grep foo`: all processes whose `ps` line contains `foo` + - `journalctl | grep -i intel | tail -n5`: + last 5 system log messages with the word intel (case insensitive) + - `who | sendmail -t me@example.com` + send the list of logged-in users to `me@example.com` + - forms the basis for much data-wrangling, as we'll cover later + +Bash also provides a number of other ways to compose programs. + +You can group commands with `(a; b) | tac`: run `a`, then `b`, and send +all their output to `tac`, which prints its input lines in reverse order. + +A lesser-known, but super useful one is _process substitution_. +`b <(a)` will run `a`, generate a temporary file-name for its output +stream, and pass that file-name to `b`. For example: +```shell +diff <(journalctl -b -1 | head -n20) <(journalctl -b -2 | head -n20) +``` +will show you the difference between the first 20 lines of the last boot +log and the one before that. + + + +## Job and process control + +What if you want to run longer-term things in the background?
+ - the `&` suffix runs a program "in the background" + - it will give you back your prompt immediately + - handy if you want to run two programs at the same time + like a server and client: `server & client` + - note that the running program still has your terminal as `STDOUT`! + try: `server > server.log & client` + - see all such processes with `jobs` + - notice that it shows "Running" + - bring it to the foreground with `fg %JOB` (no argument is latest) + - if you want to background the current program: `^Z` + `bg` + - `^Z` stops the current process and makes it a "job" + - `bg` runs the last job in the background (as if you did `&`) + - background jobs are still tied to your current session, and exit if + you log out. `disown` lets you sever that connection. or use `nohup`. + + + +What about other stuff running on your computer? + - `ps` is your friend: lists running processes + - `ps -A`: print processes from all users (also `ps ax`) + - `ps` has *many* arguments: see `man ps` + - `pgrep`: find processes by searching (like `ps -A | grep`) + - `pgrep -af`: search and display with arguments + - `kill`: send a _signal_ to a process by ID (`pkill` by search + `-f`) + - signals tell a process to "do something" + - most common: `SIGKILL` (`-9` or `-KILL`): tell it to exit *now* + (no keyboard shortcut; `^\` sends `SIGQUIT` instead) + - also `SIGTERM` (`-15` or `-TERM`): tell it to exit gracefully + (`^C` sends the related `SIGINT`) + +## Exercises and further reading + +QoL stuff: fasd/autojump, fzf, rg, fd, etc. + +TODO From 773fecb04dd3386b5904b35ca3ec9fdc229d91c2 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 14 Jan 2019 08:49:22 -0500 Subject: [PATCH 002/640] Fix parsing --- shell.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/shell.md b/shell.md index dd111622..b1ac1ee1 100644 --- a/shell.md +++ b/shell.md @@ -34,11 +34,13 @@ Can either write programs directly at the prompt, or into a file.
## Working with the shell Run a command a bunch of times: + ```shell for i in $(seq 1 5); do echo hello; done ``` There's a lot to unpack: + - `for x in list; do BODY; done` - `;` terminates a command -- equivalent to newline - split `list`, assign each to `x`, and run body @@ -64,17 +66,20 @@ for f in $(ls); do echo $f; done Will print each file name in the current directory. Can also set variables using `=` (no space!): + ```shell foo=bar echo $foo ``` To only print directories + ```shell for f in $(ls); do if test -d $f; then echo dir $f; fi; done ``` More to unpack here: + - `if CONDITION; then BODY; fi` - `CONDITION` is a command; if it returns with exit status 0 (success), then `BODY` is run. @@ -87,6 +92,7 @@ More to unpack here: - take a look at `man test` and `which "["` But wait! This is wrong! What if a file is called "My Documents"? + - `for f in $(ls)` expands to `for f in My Documents` - first do the test on `My`, then on `Documents` - not what we wanted! @@ -95,6 +101,7 @@ But wait! This is wrong! What if a file is called "My Documents"? ## Argument splitting Bash splits arguments by whitespace; not always what you want! + - need to use quoting to handle spaces in arguments `for f in "My Documents"` would work correctly - same problem somewhere else -- do you see where? @@ -106,6 +113,7 @@ Bash splits arguments by whitespace; not always what you want! what do you think `for f in "$(ls)"` does? Globbing is the answer! + - bash knows how to look for files using patterns: - `*` any string of characters - `?` any single character @@ -120,6 +128,7 @@ Globbing is the answer! all three-letter text files starting with p in subdirs of `foo` Whitespace issues don't stop there: + - `if [ $foo = "bar" ]; then` -- see the issue? - what if `$foo` is empty? arguments to `[` are `=` and `bar`... - _can_ work around this with quoting or `[ "x$foo" = "xbar" ]`, but bleh @@ -132,11 +141,13 @@ Shell is powerful in part because of composability.
Can chain multiple programs together rather than have one program that does everything. The key character is `|` (pipe). + - `a | b` means run both `a` and `b` send all output of `a` as input to `b` print the output of `b` All programs you launch ("processes") have three "streams": + - `STDIN`: when the program reads input, it comes from here - `STDOUT`: when the program prints something, it goes here - `STDERR`: a 2nd output the program can choose to use @@ -165,6 +176,7 @@ all their output to `tac`, which prints its input in reverse order. A lesser-known, but super useful one is _process substitution_. `b <(a)` will run `a`, generate a temporary file-name for its output stream, and pass that file-name to `b`. For example: + ```shell diff <(journalctl -b -1 | head -n20) <(journalctl -b -2 | head -n20) ``` @@ -176,6 +188,7 @@ log and the one before that. ## Job and process control What if you want to run longer-term things in the background? + - the `&` suffix runs a program "in the background" - it will give you back your prompt immediately - handy if you want to run two programs at the same time @@ -194,6 +207,7 @@ What if you want to run longer-term things in the background? What about other stuff running on your computer? + - `ps` is your friend: lists running processes - `ps -A`: print processes from all users (also `ps ax`) - `ps` has *many* arguments: see `man ps` From e8ab85911b8ec9b96a2b69d87fe70b562b3694dc Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 14 Jan 2019 09:25:02 -0500 Subject: [PATCH 003/640] Add dotfiles content --- dotfiles.md | 105 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 104 insertions(+), 1 deletion(-) diff --git a/dotfiles.md b/dotfiles.md index cec35dbf..716c5e3f 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -3,4 +3,107 @@ layout: page title: "Dotfiles" --- -Lecture notes will be available by the start of lecture. 
+Many programs are configured using plain-text files known as "dotfiles" +(because the file names begin with a `.`, e.g. `~/.gitconfig`, so that they are +hidden in the directory listing `ls` by default). + +A lot of the tools you use probably have a lot of settings that can be tuned +pretty finely. Often times, tools are customized with specialized languages, +e.g. Vimscript for Vim or the shell's own language for a shell. + +Customizing and adapting your tools to your preferred workflow will make you +more productive. We advise you to invest time in customizing your tool yourself +rather than cloning someone else's dotfiles from GitHub. + +You probably have some dotfiles set up already. Some places to look: + +- `~/.bashrc` +- `~/.emacs` +- `~/.vim` +- `~/.gitconfig` + +# Learning to customize tools + +You can learn about your tool's settings by reading online documentation or +[man pages](https://en.wikipedia.org/wiki/Man_page). Another great way is to +search the internet for blog posts about specific programs, where authors will +tell you about their preferred customizations. Yet another way to learn about +customizations is to look through other people's dotfiles: you can find tons of +dotfiles repositories on GitHub --- see the most popular one +[here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly +copy configurations though). + +# Organization + +How should you organize your dotfiles? They should be in their own folder, +under version control, and symlinked into place using a script. 
This has the +benefits of: + +- **Easy installation**: if you log in to a new machine, applying your +customizations will only take a minute +- **Portability**: your tools will work the same way everywhere +- **Synchronization**: you can update your dotfiles anywhere and keep them all +in sync +- **Change tracking**: you're probably going to be maintaining your dotfiles +for your entire programming career, and version history is nice to have for +long-lived projects + +# Advanced topics + +## Machine-specific customizations + +Most of the time, you'll want the same configuration across machines, but +sometimes, you'll want a small delta on a particular machine. Here are a couple +ways you can handle this situation: + +### Branch per machine + +Use version control to maintain a branch per machine. This approach is +logically straightforward but can be pretty heavyweight. + +### If statements + +If the configuration file supports it, use the equivalent of if-statements to +apply machine specific customizations. For example, your shell could have a line +like: + +``` +if [[ "$(uname)" == "Darwin" ]]; then {do something}; fi +``` + +### Includes + +If the configuration file supports it, make use of includes. For example, +a `~/.gitconfig` can have a setting: + +``` +[include] + path = ~/.gitconfig_local +``` + +And then on each machine, `~/.gitconfig_local` can contain machine-specific +settings. You could even track these in a separate repository for +machine-specific settings. + +# Resources + +- [GitHub does dotfiles](http://dotfiles.github.io/): dotfile frameworks, +utilities, examples, and tutorials + +# Exercises + +1. Create a folder for your dotfiles (and set up version control, or wait till + we [cover that](/version-control/) in lecture). + +1. Add a configuration for at least one program, e.g. your shell, with some + customization (to start off, it can be something as simple as customizing + your shell prompt by setting `$PS1`). + +1. 
Set up a method to install your dotfiles quickly (and without manual effort) + on a new machine. This can be as simple as a shell script that calls `ln -s` + for each file, or you could use a [specialized + utility](http://dotfiles.github.io/#general-purpose-dotfile-utilities). + +1. Test your installation script on a fresh virtual machine. + +1. Migrate all of your current tool configurations to your dotfiles repository. From 89bcac6b04653430ab0d842bc7ab8c15aee39986 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 14 Jan 2019 10:07:12 -0500 Subject: [PATCH 004/640] Mention some special vars --- shell.md | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/shell.md b/shell.md index b1ac1ee1..2b777b94 100644 --- a/shell.md +++ b/shell.md @@ -72,6 +72,13 @@ foo=bar echo $foo ``` +There are a bunch of "special" variables too: + + - `$1` to `$9`: arguments to the script + - `$0` name of the script itself + - `$#` number of arguments + - `$$` process ID of current shell + To only print directories ```shell @@ -86,7 +93,7 @@ More to unpack here: - can also hook in an `else` or `elif` - again, no curly braces, so `then` + `fi` - `test` is another program that provides various checks and - comparisons, and exits with 0 if they're true + comparisons, and exits with 0 if they're true (`$?`) - `man COMMAND` is your friend: `man test` - can also be invoked with `[` + `]`: `[ -d $f ]` - take a look at `man test` and `which "["` @@ -135,6 +142,8 @@ Whitespace issues don't stop there: - instead, use `[[`: bash built-in comparator that has special parsing - also allows `&&` instead of `-a`, `||` over `-o`, etc. + + ## Composability Shell is powerful in part because of composability. Can chain multiple @@ -203,6 +212,7 @@ What if you want to run longer-term things in the background? - `bg` runs the last job in the background (as if you did `&`) - background jobs are still tied to your current session, and exit if you log out. 
`disown` lets you sever that connection. or use `nohup`. + - `$!` is pid of last background process From 6b418f5355de7eeaee2f7d998785b20dac4be28d Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 14 Jan 2019 10:16:53 -0500 Subject: [PATCH 005/640] Add Piazza link --- course-overview.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/course-overview.md b/course-overview.md index 9bbd0152..97bb17b1 100644 --- a/course-overview.md +++ b/course-overview.md @@ -31,6 +31,8 @@ own. We'll inspire you to learn more about your tools, and we'll show you what's possible and cover some of the basics in detail, but we can't teach you everything in the time we have. +Please post questions on [Piazza](https://piazza.com/class/jqjpgaeaz77785). + # Exercises 1. Fill out the [registration form](https://goo.gl/forms/HSdsUQ204Ow8BgUs2) if From 792cd3af3e3fe1ac11d187a14379ffa0fa5a0985 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 14 Jan 2019 18:18:07 -0500 Subject: [PATCH 006/640] Update --- course-overview.md | 9 ++++++--- ides.md | 6 ------ package-management.md | 2 +- schedule.md | 10 +++++----- static/css/main.css | 23 ++++++----------------- terminal.md | 9 ++++++++- virtual-machines.md | 16 +++++++++++----- 7 files changed, 37 insertions(+), 38 deletions(-) delete mode 100644 ides.md diff --git a/course-overview.md b/course-overview.md index 97bb17b1..7c82d10b 100644 --- a/course-overview.md +++ b/course-overview.md @@ -23,15 +23,18 @@ form of demos) that may not be in the notes. Each class is split into two 50-minute lectures with a 10-minute break in between. Lectures are mostly live demonstrations followed by hands-on -exercises. We might have a short amount of time at the end of each lecture to -get started on the exercises in an office-hours-style setting. +exercises. We might have a short amount of time at the end of each class to get +started on the exercises in an office-hours-style setting. 
To make the most of the class, you should go through all the exercises on your own. We'll inspire you to learn more about your tools, and we'll show you what's possible and cover some of the basics in detail, but we can't teach you everything in the time we have. -Please post questions on [Piazza](https://piazza.com/class/jqjpgaeaz77785). +Please post questions on [Piazza](https://piazza.com/class/jqjpgaeaz77785). In +addition, we ask that you share your knowledge with your classmates through +Piazza --- for "homework" for each lecture, create a Piazza note about something +you've learned or something you'd like to share about the topic. # Exercises diff --git a/ides.md b/ides.md deleted file mode 100644 index 3529528d..00000000 --- a/ides.md +++ /dev/null @@ -1,6 +0,0 @@ ---- -layout: page -title: "IDEs" ---- - -Lecture notes will be available by the start of lecture. diff --git a/package-management.md b/package-management.md index 59e85ce8..bb29c8f2 100644 --- a/package-management.md +++ b/package-management.md @@ -1,6 +1,6 @@ --- layout: page -title: "Package Management" +title: "Package Management and Dependency Management" --- Lecture notes will be available by the start of lecture. diff --git a/schedule.md b/schedule.md index 36f29b3c..3b6c81cb 100644 --- a/schedule.md +++ b/schedule.md @@ -8,27 +8,27 @@ blocks, with a 10 minute break in between. 
# Tuesday, 1/15 -- [Course overview](/course-overview/), [virtual machines](/virtual-machines/), [dotfiles](/dotfiles/) +- [Course overview](/course-overview/), [virtual machines and containers](/virtual-machines/) - [Shell and scripting](/shell/) # Thursday, 1/17 -- [Terminal emulators and multiplexers](/terminal/) +- [Command-line environment](/terminal/) - [Data wrangling](/data-wrangling/) # Tuesday, 1/22 - [Editors](/editors/) -- [IDEs](/ides/) +- [Version control](/version-control/) # Thursday, 1/24 -- [Version control](/version-control/) and [backups](/backups/) +- [Dotfiles](/dotfiles/) and [backups](/backups/) - [Debuggers, logging, profilers, and monitoring](/debuggers-logging-profilers-monitoring/) # Tuesday, 1/29 -- [Package management](/package-management/) +- [Package management and dependency management](/package-management/) - [OS customization](/os-customization/) and [OS automation](/os-automation/) # Thursday, 1/31 diff --git a/static/css/main.css b/static/css/main.css index 84da2ec8..bdff4864 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -80,16 +80,11 @@ pre, code { font-family: "Source Code Pro", "Menlo", "DejaVu Sans Mono", "Lucida Console", monospace; } -code:before { - content: "`"; -} - -code:after { - content: "`"; -} - code { - color: #6c71c4; + background-color: rgba(27,31,35,.05); + border-radius: 3px; + padding: 0 0.2rem; + font-size: 0.9em; } pre { @@ -104,14 +99,8 @@ pre { pre code { color: inherit; -} - -pre code:before { - content: none; -} - -pre code:after { - content: none; + background: none; + font-size: 100%; } a { diff --git a/terminal.md b/terminal.md index a334525b..ed5596f4 100644 --- a/terminal.md +++ b/terminal.md @@ -1,6 +1,13 @@ --- layout: page -title: "Terminal Emulators and Multiplexers" +title: "Command-line environment" --- Lecture notes will be available by the start of lecture. 
+ +{% comment %} +- terminal emulators +- multiplexers +- remote: ssh and mosh +- autojump and fzf +{% endcomment %} diff --git a/virtual-machines.md b/virtual-machines.md index 23724a45..bf36d593 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -1,8 +1,10 @@ --- layout: page -title: "Virtual Machines" +title: "Virtual Machines and Containers" --- +# Virtual Machines + Virtual machines are simulated computers. You can configure a guest virtual machine with some operating system and configuration and use it without affecting your host environment. @@ -16,7 +18,7 @@ that only runs on a certain operating system (e.g. using a Windows VM on Linux to run Windows-specific software). They are often used for experimenting with potentially malicious software. -# Useful features +## Useful features - **Isolation**: hypervisors do a pretty good job of isolating the guest from the host, so you can use VMs to run buggy or untrusted software reasonably @@ -27,19 +29,19 @@ the entire machine state (disk, memory, etc.), make changes to your machine, and then restore to an earlier state. This is useful for testing out potentially destructive actions, among other things. -# Disadvantages +## Disadvantages Virtual machines are generally slower than running on bare metal, so they may be unsuitable for certain applications. -# Resources +## Resources - Hypervisors - [VirtualBox](https://www.virtualbox.org/) (open-source) - [VMWare](https://www.vmware.com/) (commercial, available from IS&T [for MIT students](https://ist.mit.edu/vmware-fusion)) -# Exercises +## Exercises 1. Download and install a hypervisor. @@ -49,3 +51,7 @@ be unsuitable for certain applications. 1. Experiment with snapshots. Try things that you've always wanted to try, like running `sudo rm -rf --no-preserve-root /`, and see if you can recover easily. + +# Containers + +Coming soon! 
From cf835d20106df9e393899ed0fdfca992734157af Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 14 Jan 2019 18:31:01 -0500 Subject: [PATCH 007/640] List presenters --- backups.md | 1 + terminal.md => command-line.md | 1 + course-overview.md | 1 + data-wrangling.md | 1 + dotfiles.md | 1 + editors.md | 1 + ...ing-profilers-monitoring.md => machine-introspection.md | 3 ++- os-automation.md | 1 + os-customization.md | 1 + package-management.md | 1 + program-introspection.md | 7 +++++++ schedule.md | 4 ++-- security.md | 1 + shell.md | 1 + version-control.md | 1 + virtual-machines.md | 1 + web.md | 1 + 17 files changed, 25 insertions(+), 3 deletions(-) rename terminal.md => command-line.md (93%) rename debuggers-logging-profilers-monitoring.md => machine-introspection.md (58%) create mode 100644 program-introspection.md diff --git a/backups.md b/backups.md index 398eab6e..3e13bc5e 100644 --- a/backups.md +++ b/backups.md @@ -1,6 +1,7 @@ --- layout: page title: "Backups" +presenter: Jose --- Lecture notes will be available by the start of lecture. diff --git a/terminal.md b/command-line.md similarity index 93% rename from terminal.md rename to command-line.md index ed5596f4..cf9de7cb 100644 --- a/terminal.md +++ b/command-line.md @@ -1,6 +1,7 @@ --- layout: page title: "Command-line environment" +presenter: Jose --- Lecture notes will be available by the start of lecture. diff --git a/course-overview.md b/course-overview.md index 7c82d10b..07b52530 100644 --- a/course-overview.md +++ b/course-overview.md @@ -1,6 +1,7 @@ --- layout: page title: "Course Overview" +presenter: Anish --- # Motivation diff --git a/data-wrangling.md b/data-wrangling.md index edaa7d10..55565960 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -1,6 +1,7 @@ --- layout: page title: "Data Wrangling" +presenter: Jon --- Lecture notes will be available by the start of lecture. 
diff --git a/dotfiles.md b/dotfiles.md index 716c5e3f..243272b0 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -1,6 +1,7 @@ --- layout: page title: "Dotfiles" +presenter: Anish --- Many programs are configured using plain-text files known as "dotfiles" diff --git a/editors.md b/editors.md index 31136c47..ad7c9b60 100644 --- a/editors.md +++ b/editors.md @@ -1,6 +1,7 @@ --- layout: page title: "Editors" +presenter: Anish --- Lecture notes will be available by the start of lecture. diff --git a/debuggers-logging-profilers-monitoring.md b/machine-introspection.md similarity index 58% rename from debuggers-logging-profilers-monitoring.md rename to machine-introspection.md index a3e9887f..dac62a03 100644 --- a/debuggers-logging-profilers-monitoring.md +++ b/machine-introspection.md @@ -1,6 +1,7 @@ --- layout: page -title: "Debuggers, Logging, Profilers, and Monitoring" +title: "Machine Introspection" +presenter: Jon --- Lecture notes will be available by the start of lecture. diff --git a/os-automation.md b/os-automation.md index 699df10f..2f4269f2 100644 --- a/os-automation.md +++ b/os-automation.md @@ -1,6 +1,7 @@ --- layout: page title: "OS Automation" +presenter: Jose --- Lecture notes will be available by the start of lecture. diff --git a/os-customization.md b/os-customization.md index a6c314d8..58b1b41b 100644 --- a/os-customization.md +++ b/os-customization.md @@ -1,6 +1,7 @@ --- layout: page title: "OS Customization" +presenter: Anish --- Lecture notes will be available by the start of lecture. diff --git a/package-management.md b/package-management.md index bb29c8f2..fe3e18d5 100644 --- a/package-management.md +++ b/package-management.md @@ -1,6 +1,7 @@ --- layout: page title: "Package Management and Dependency Management" +presenter: Anish --- Lecture notes will be available by the start of lecture. 
diff --git a/program-introspection.md b/program-introspection.md new file mode 100644 index 00000000..29fde930 --- /dev/null +++ b/program-introspection.md @@ -0,0 +1,7 @@ +--- +layout: page +title: "Program Introspection" +presenter: Anish +--- + +Lecture notes will be available by the start of lecture. diff --git a/schedule.md b/schedule.md index 3b6c81cb..38be52e4 100644 --- a/schedule.md +++ b/schedule.md @@ -13,7 +13,7 @@ blocks, with a 10 minute break in between. # Thursday, 1/17 -- [Command-line environment](/terminal/) +- [Command-line environment](/command-line/) - [Data wrangling](/data-wrangling/) # Tuesday, 1/22 @@ -24,7 +24,7 @@ blocks, with a 10 minute break in between. # Thursday, 1/24 - [Dotfiles](/dotfiles/) and [backups](/backups/) -- [Debuggers, logging, profilers, and monitoring](/debuggers-logging-profilers-monitoring/) +- [Machine introspection](/machine-introspection/) and [program introspection](/program-introspection/) # Tuesday, 1/29 diff --git a/security.md b/security.md index c293f496..040fd251 100644 --- a/security.md +++ b/security.md @@ -1,6 +1,7 @@ --- layout: page title: "Security and Privacy" +presenter: Jon --- Lecture notes will be available by the start of lecture. diff --git a/shell.md b/shell.md index 2b777b94..7b41e70f 100644 --- a/shell.md +++ b/shell.md @@ -1,6 +1,7 @@ --- layout: page title: "Shell and Scripting" +presenter: Jon --- The shell is an efficient, textual interface to your computer. diff --git a/version-control.md b/version-control.md index bd072b13..f9e65bea 100644 --- a/version-control.md +++ b/version-control.md @@ -1,6 +1,7 @@ --- layout: page title: "Version Control" +presenter: Jon --- Lecture notes will be available by the start of lecture. 
diff --git a/virtual-machines.md b/virtual-machines.md index bf36d593..a76f9ee8 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -1,6 +1,7 @@ --- layout: page title: "Virtual Machines and Containers" +presenter: Anish, Jon --- # Virtual Machines diff --git a/web.md b/web.md index d2a512e7..1ec371b2 100644 --- a/web.md +++ b/web.md @@ -1,6 +1,7 @@ --- layout: page title: "Web and Browsers" +presenter: Jose --- Lecture notes will be available by the start of lecture. From acfe1052cbde94b7fbe14fc57cdceeabe5be5475 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 15 Jan 2019 10:05:50 -0500 Subject: [PATCH 008/640] Text on containers --- virtual-machines.md | 38 +++++++++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/virtual-machines.md b/virtual-machines.md index a76f9ee8..fd18e882 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -55,4 +55,40 @@ be unsuitable for certain applications. # Containers -Coming soon! +Virtual Machines are relatively heavy-weight; what if you want to spin +up machines in an automated fashion? Enter containers! + + - Amazon Firecracker + - Docker + - rkt + - lxc + +Containers are _mostly_ just an assembly of various Linux security +features, like virtual file system, virtual network interfaces, chroots, +virtual memory tricks, and the like, that together give the appearance +of virtualization. + +Not quite as secure or isolated as a VM, but pretty close and getting +better. Usually higher performance, and much faster to start, but not +always. + +Containers are handy for when you want to run an automated task in a +standardized setup: + + - Build systems + - Development environments + - Pre-packaged servers + - Running untrusted programs + - Grading student submissions + - (Some) cloud computing + - Continuous integration + - Travis CI + - GitHub Actions + +Usually, you write a file that defines how to construct your container. 
+You start with some minimal _base image_ (like Alpine Linux), and then +a list of commands to run to set up the environment you want (install +packages, copy files, build stuff, write config files, etc.). Normally, +there's also a way to specify any external ports that should be +available, and an _entrypoint_ that dictates what command should be run +when the container is started (like a grading script). From e2c7e671456d1a3f63c423cabe35b696d5354929 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 12:29:23 -0500 Subject: [PATCH 009/640] Update --- course-overview.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/course-overview.md b/course-overview.md index 07b52530..2d35b563 100644 --- a/course-overview.md +++ b/course-overview.md @@ -20,7 +20,7 @@ to customize your tools, and how to extend your tools. We have 6 lectures covering a [variety of topics](/schedule/). We have lecture notes online, but there will be a lot of content covered in class (e.g. in the -form of demos) that may not be in the notes. +form of demos) that may not be in the notes. We will be recording lectures. Each class is split into two 50-minute lectures with a 10-minute break in between. Lectures are mostly live demonstrations followed by hands-on @@ -34,8 +34,8 @@ everything in the time we have. Please post questions on [Piazza](https://piazza.com/class/jqjpgaeaz77785). In addition, we ask that you share your knowledge with your classmates through -Piazza --- for "homework" for each lecture, create a Piazza note about something -you've learned or something you'd like to share about the topic. +Piazza --- **for "homework" for each lecture, create a Piazza note about +something you've learned or something you'd like to share about the topic**. 
# Exercises From 839b2d52c387acee3f3c8302f01646b9f6dadf30 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 12:42:04 -0500 Subject: [PATCH 010/640] Update --- virtual-machines.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/virtual-machines.md b/virtual-machines.md index fd18e882..db2adebb 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -35,6 +35,17 @@ potentially destructive actions, among other things. Virtual machines are generally slower than running on bare metal, so they may be unsuitable for certain applications. +## Setup + +- **Resources**: shared with host machine; be aware of this when allocating +physical resources. + +- **Networking**: many options, default NAT should work fine for most use +cases. + +- **Guest addons**: many hypervisors can install software in the guest to +enable nicer integration with host system. You should use this if you can. + ## Resources - Hypervisors @@ -53,6 +64,9 @@ be unsuitable for certain applications. running `sudo rm -rf --no-preserve-root /`, and see if you can recover easily. +1. Install guest addons and experiment with different windowing modes, file + sharing, and other features. + # Containers Virtual Machines are relatively heavy-weight; what if you want to spin From 5f968796cde24793da1fae72ec34b327054d76dd Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 15 Jan 2019 15:01:47 -0500 Subject: [PATCH 011/640] Add extra stuff for VM & containers --- virtual-machines.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/virtual-machines.md b/virtual-machines.md index db2adebb..3226b8e7 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -49,10 +49,14 @@ enable nicer integration with host system. You should use this if you can. 
## Resources - Hypervisors + - [VirtualBox](https://www.virtualbox.org/) (open-source) + - [Virt-manager](https://virt-manager.org/) (open-source, manages KVM virtual machines and LXC containers) - [VMWare](https://www.vmware.com/) (commercial, available from IS&T [for MIT students](https://ist.mit.edu/vmware-fusion)) + If you are already familiar with popular hypervisors/VMs you many want to learn more about how to do this from a command line friendly way. One option is the [libvirt](https://wiki.libvirt.org/page/UbuntuKVMWalkthrough) toolkit which allows you to manage multiple different virtualization providers/hypervisors. + ## Exercises 1. Download and install a hypervisor. @@ -64,6 +68,8 @@ enable nicer integration with host system. You should use this if you can. running `sudo rm -rf --no-preserve-root /`, and see if you can recover easily. +1. Read what a [fork-bomb](https://en.wikipedia.org/wiki/Fork_bomb) (`:(){ :|:& };:`) is and run it on the VM to see that the resource isolation (CPU, Memory, &c) works. + 1. Install guest addons and experiment with different windowing modes, file sharing, and other features. @@ -86,6 +92,11 @@ Not quite as secure or isolated as a VM, but pretty close and getting better. Usually higher performance, and much faster to start, but not always. +The performance boost comes from the fact that unlike VMs which run an entire copy of the operating system, containers share the linux kernel with the host. However note that if you are running linux containers on Windows/macOS a Linux VM will need to be active as a middle layer between the two. + +![Docker vs VM](https://i2.wp.com/blog.docker.com/wp-content/uploads/Blog.-Are-containers-..VM-Image-1.png?ssl=1) +_Comparison between Docker containers and Virtual Machines. 
Credit: blog.docker.com_ + Containers are handy for when you want to run an automated task in a standardized setup: @@ -99,6 +110,8 @@ standardized setup: - Travis CI - GitHub Actions +Moreover, container software like Docker has also been extensively used as a solution for [dependency hell](https://en.wikipedia.org/wiki/Dependency_hell). If a machine needs to be running many services with conflicting dependencies they can be isolated using containers. + Usually, you write a file that defines how to construct your container. You start with some minimal _base image_ (like Alpine Linux), and then a list of commands to run to set up the environment you want (install @@ -106,3 +119,11 @@ packages, copy files, build stuff, write config files, etc.). Normally, there's also a way to specify any external ports that should be available, and an _entrypoint_ that dictates what command should be run when the container is started (like a grading script). + +In a similar fashion to code repository websites (like [GitHub](https://github.com/)) there are some container repository websites (like [DockerHub](https://hub.docker.com/))where many software services have prebuilt images that one can easily deploy. + +## Exercises + +1. Choose a container software (Docker, LXC, …) and install a simple Linux image. Try SSHing into it. + +1. Search and download a prebuilt container image for a popular web server (nginx, apache, …) \ No newline at end of file From 823ae5613a3c01b9a2ac70c5d77afd43002c160c Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 15 Jan 2019 15:02:42 -0500 Subject: [PATCH 012/640] vim doesn't like shell as the code type --- shell.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/shell.md b/shell.md index 7b41e70f..dcd39a3c 100644 --- a/shell.md +++ b/shell.md @@ -36,7 +36,7 @@ Can either write programs directly at the prompt, or into a file. 
Run a command a bunch of times: -```shell +```bash for i in $(seq 1 5); do echo hello; done ``` @@ -51,7 +51,7 @@ There's a lot to unpack: - run the program `seq` with arguments `1` and `5` - substitute entire `$()` with the output of that program - equivalent to - ```shell + ```bash for i in 1 2 3 4 5 ``` - `echo hello` @@ -61,14 +61,14 @@ There's a lot to unpack: - all commands are searched for in `$PATH` (colon-separated) We have variables: -```shell +```bash for f in $(ls); do echo $f; done ``` Will print each file name in the current directory. Can also set variables using `=` (no space!): -```shell +```bash foo=bar echo $foo ``` @@ -82,7 +82,7 @@ There are a bunch of "special" variables too: To only print directories -```shell +```bash for f in $(ls); do if test -d $f; then echo dir $f; fi; done ``` @@ -187,7 +187,7 @@ A lesser-known, but super useful one is _process substitution_. `b <(a)` will run `a`, generate a temporary file-name for its output stream, and pass that file-name to `b`. 
For example: -```shell +```bash diff <(journalctl -b -1 | head -n20) <(journalctl -b -2 | head -n20) ``` will show you the difference between the first 20 lines of the last boot From d9776d79122aa9447bb7e10ccac693411b08990a Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 15:02:51 -0500 Subject: [PATCH 013/640] Update dependencies --- Gemfile.lock | 313 +++++++++++++++++++++++++++++---------------------- 1 file changed, 180 insertions(+), 133 deletions(-) diff --git a/Gemfile.lock b/Gemfile.lock index fcc39013..e6c51ec4 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,195 +1,242 @@ GEM remote: https://rubygems.org/ specs: - activesupport (4.2.7) + activesupport (4.2.10) i18n (~> 0.7) - json (~> 1.7, >= 1.7.7) minitest (~> 5.1) thread_safe (~> 0.3, >= 0.3.4) tzinfo (~> 1.1) - addressable (2.5.0) - public_suffix (~> 2.0, >= 2.0.2) + addressable (2.5.2) + public_suffix (>= 2.0.2, < 4.0) coffee-script (2.4.1) coffee-script-source execjs - coffee-script-source (1.12.2) + coffee-script-source (1.11.1) colorator (1.1.0) - ethon (0.10.1) + commonmarker (0.17.13) + ruby-enum (~> 0.5) + concurrent-ruby (1.1.4) + dnsruby (1.61.2) + addressable (~> 2.5) + em-websocket (0.5.1) + eventmachine (>= 0.12.9) + http_parser.rb (~> 0.6.0) + ethon (0.12.0) ffi (>= 1.3.0) + eventmachine (1.2.7) execjs (2.7.0) - faraday (0.11.0) + faraday (0.15.4) multipart-post (>= 1.2, < 3) - ffi (1.9.17) + ffi (1.10.0) forwardable-extended (2.6.0) - gemoji (2.1.0) - github-pages (115) - activesupport (= 4.2.7) - github-pages-health-check (= 1.3.0) - jekyll (= 3.3.1) - jekyll-avatar (= 0.4.2) - jekyll-coffeescript (= 1.0.1) + gemoji (3.0.0) + github-pages (193) + activesupport (= 4.2.10) + github-pages-health-check (= 1.8.1) + jekyll (= 3.7.4) + jekyll-avatar (= 0.6.0) + jekyll-coffeescript (= 1.1.1) + jekyll-commonmark-ghpages (= 0.1.5) jekyll-default-layout (= 0.1.4) - jekyll-feed (= 0.8.0) - jekyll-gist (= 1.4.0) - jekyll-github-metadata (= 2.3.0) - jekyll-mentions (= 1.2.0) - 
jekyll-optional-front-matter (= 0.1.2) + jekyll-feed (= 0.11.0) + jekyll-gist (= 1.5.0) + jekyll-github-metadata (= 2.9.4) + jekyll-mentions (= 1.4.1) + jekyll-optional-front-matter (= 0.3.0) jekyll-paginate (= 1.1.0) - jekyll-readme-index (= 0.0.3) - jekyll-redirect-from (= 0.11.0) - jekyll-relative-links (= 0.2.1) - jekyll-sass-converter (= 1.3.0) - jekyll-seo-tag (= 2.1.0) - jekyll-sitemap (= 0.12.0) + jekyll-readme-index (= 0.2.0) + jekyll-redirect-from (= 0.14.0) + jekyll-relative-links (= 0.5.3) + jekyll-remote-theme (= 0.3.1) + jekyll-sass-converter (= 1.5.2) + jekyll-seo-tag (= 2.5.0) + jekyll-sitemap (= 1.2.0) jekyll-swiss (= 0.4.0) - jekyll-theme-architect (= 0.0.3) - jekyll-theme-cayman (= 0.0.3) - jekyll-theme-dinky (= 0.0.3) - jekyll-theme-hacker (= 0.0.3) - jekyll-theme-leap-day (= 0.0.3) - jekyll-theme-merlot (= 0.0.3) - jekyll-theme-midnight (= 0.0.3) - jekyll-theme-minimal (= 0.0.3) - jekyll-theme-modernist (= 0.0.3) - jekyll-theme-primer (= 0.1.7) - jekyll-theme-slate (= 0.0.3) - jekyll-theme-tactile (= 0.0.3) - jekyll-theme-time-machine (= 0.0.3) - jekyll-titles-from-headings (= 0.1.4) - jemoji (= 0.7.0) - kramdown (= 1.11.1) - liquid (= 3.0.6) - listen (= 3.0.6) + jekyll-theme-architect (= 0.1.1) + jekyll-theme-cayman (= 0.1.1) + jekyll-theme-dinky (= 0.1.1) + jekyll-theme-hacker (= 0.1.1) + jekyll-theme-leap-day (= 0.1.1) + jekyll-theme-merlot (= 0.1.1) + jekyll-theme-midnight (= 0.1.1) + jekyll-theme-minimal (= 0.1.1) + jekyll-theme-modernist (= 0.1.1) + jekyll-theme-primer (= 0.5.3) + jekyll-theme-slate (= 0.1.1) + jekyll-theme-tactile (= 0.1.1) + jekyll-theme-time-machine (= 0.1.1) + jekyll-titles-from-headings (= 0.5.1) + jemoji (= 0.10.1) + kramdown (= 1.17.0) + liquid (= 4.0.0) + listen (= 3.1.5) mercenary (~> 0.3) - minima (= 2.0.0) - nokogiri (= 1.6.8.1) - rouge (= 1.11.1) + minima (= 2.5.0) + nokogiri (>= 1.8.2, < 2.0) + rouge (= 2.2.1) terminal-table (~> 1.4) - github-pages-health-check (1.3.0) + github-pages-health-check (1.8.1) 
addressable (~> 2.3) - net-dns (~> 0.8) + dnsruby (~> 1.60) octokit (~> 4.0) public_suffix (~> 2.0) - typhoeus (~> 0.7) - html-pipeline (2.5.0) + typhoeus (~> 1.3) + html-pipeline (2.10.0) activesupport (>= 2) nokogiri (>= 1.4) - i18n (0.7.0) - jekyll (3.3.1) + http_parser.rb (0.6.0) + i18n (0.9.5) + concurrent-ruby (~> 1.0) + jekyll (3.7.4) addressable (~> 2.4) colorator (~> 1.0) + em-websocket (~> 0.5) + i18n (~> 0.7) jekyll-sass-converter (~> 1.0) - jekyll-watch (~> 1.1) - kramdown (~> 1.3) - liquid (~> 3.0) + jekyll-watch (~> 2.0) + kramdown (~> 1.14) + liquid (~> 4.0) mercenary (~> 0.3.3) pathutil (~> 0.9) - rouge (~> 1.7) + rouge (>= 1.7, < 4) safe_yaml (~> 1.0) - jekyll-avatar (0.4.2) + jekyll-avatar (0.6.0) jekyll (~> 3.0) - jekyll-coffeescript (1.0.1) + jekyll-coffeescript (1.1.1) coffee-script (~> 2.2) + coffee-script-source (~> 1.11.1) + jekyll-commonmark (1.2.0) + commonmarker (~> 0.14) + jekyll (>= 3.0, < 4.0) + jekyll-commonmark-ghpages (0.1.5) + commonmarker (~> 0.17.6) + jekyll-commonmark (~> 1) + rouge (~> 2) jekyll-default-layout (0.1.4) jekyll (~> 3.0) - jekyll-feed (0.8.0) + jekyll-feed (0.11.0) jekyll (~> 3.3) - jekyll-gist (1.4.0) + jekyll-gist (1.5.0) octokit (~> 4.2) - jekyll-github-metadata (2.3.0) + jekyll-github-metadata (2.9.4) jekyll (~> 3.1) octokit (~> 4.0, != 4.4.0) - jekyll-mentions (1.2.0) - activesupport (~> 4.0) + jekyll-mentions (1.4.1) html-pipeline (~> 2.3) jekyll (~> 3.0) - jekyll-optional-front-matter (0.1.2) + jekyll-optional-front-matter (0.3.0) jekyll (~> 3.0) jekyll-paginate (1.1.0) - jekyll-readme-index (0.0.3) + jekyll-readme-index (0.2.0) jekyll (~> 3.0) - jekyll-redirect-from (0.11.0) - jekyll (>= 2.0) - jekyll-relative-links (0.2.1) - jekyll (~> 3.3) - jekyll-sass-converter (1.3.0) - sass (~> 3.2) - jekyll-seo-tag (2.1.0) - jekyll (~> 3.3) - jekyll-sitemap (0.12.0) - jekyll (~> 3.3) - jekyll-swiss (0.4.0) - jekyll-theme-architect (0.0.3) - jekyll (~> 3.3) - jekyll-theme-cayman (0.0.3) - jekyll (~> 3.3) - 
jekyll-theme-dinky (0.0.3) - jekyll (~> 3.3) - jekyll-theme-hacker (0.0.3) - jekyll (~> 3.3) - jekyll-theme-leap-day (0.0.3) + jekyll-redirect-from (0.14.0) jekyll (~> 3.3) - jekyll-theme-merlot (0.0.3) + jekyll-relative-links (0.5.3) jekyll (~> 3.3) - jekyll-theme-midnight (0.0.3) + jekyll-remote-theme (0.3.1) + jekyll (~> 3.5) + rubyzip (>= 1.2.1, < 3.0) + jekyll-sass-converter (1.5.2) + sass (~> 3.4) + jekyll-seo-tag (2.5.0) jekyll (~> 3.3) - jekyll-theme-minimal (0.0.3) + jekyll-sitemap (1.2.0) jekyll (~> 3.3) - jekyll-theme-modernist (0.0.3) - jekyll (~> 3.3) - jekyll-theme-primer (0.1.7) - jekyll (~> 3.3) - jekyll-theme-slate (0.0.3) - jekyll (~> 3.3) - jekyll-theme-tactile (0.0.3) - jekyll (~> 3.3) - jekyll-theme-time-machine (0.0.3) - jekyll (~> 3.3) - jekyll-titles-from-headings (0.1.4) - jekyll (~> 3.3) - jekyll-watch (1.5.0) - listen (~> 3.0, < 3.1) - jemoji (0.7.0) - activesupport (~> 4.0) - gemoji (~> 2.0) + jekyll-swiss (0.4.0) + jekyll-theme-architect (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-cayman (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-dinky (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-hacker (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-leap-day (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-merlot (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-midnight (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-minimal (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-modernist (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-primer (0.5.3) + jekyll (~> 3.5) + jekyll-github-metadata (~> 2.9) + jekyll-seo-tag (~> 2.0) + jekyll-theme-slate (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-tactile (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-theme-time-machine (0.1.1) + jekyll (~> 3.5) + jekyll-seo-tag (~> 2.0) + jekyll-titles-from-headings 
(0.5.1) + jekyll (~> 3.3) + jekyll-watch (2.1.2) + listen (~> 3.0) + jemoji (0.10.1) + gemoji (~> 3.0) html-pipeline (~> 2.2) - jekyll (>= 3.0) - json (1.8.6) - kramdown (1.11.1) - liquid (3.0.6) - listen (3.0.6) - rb-fsevent (>= 0.9.3) - rb-inotify (>= 0.9.7) + jekyll (~> 3.0) + kramdown (1.17.0) + liquid (4.0.0) + listen (3.1.5) + rb-fsevent (~> 0.9, >= 0.9.4) + rb-inotify (~> 0.9, >= 0.9.7) + ruby_dep (~> 1.2) mercenary (0.3.6) - mini_portile2 (2.1.0) - minima (2.0.0) - minitest (5.10.1) + mini_portile2 (2.4.0) + minima (2.5.0) + jekyll (~> 3.5) + jekyll-feed (~> 0.9) + jekyll-seo-tag (~> 2.1) + minitest (5.11.3) multipart-post (2.0.0) - net-dns (0.8.0) - nokogiri (1.6.8.1) - mini_portile2 (~> 2.1.0) - octokit (4.6.2) + nokogiri (1.10.1) + mini_portile2 (~> 2.4.0) + octokit (4.13.0) sawyer (~> 0.8.0, >= 0.5.3) - pathutil (0.14.0) + pathutil (0.16.2) forwardable-extended (~> 2.6) public_suffix (2.0.5) - rb-fsevent (0.9.8) - rb-inotify (0.9.7) - ffi (>= 0.5.0) - rouge (1.11.1) + rb-fsevent (0.10.3) + rb-inotify (0.10.0) + ffi (~> 1.0) + rouge (2.2.1) + ruby-enum (0.7.2) + i18n + ruby_dep (1.5.0) + rubyzip (1.2.2) safe_yaml (1.0.4) - sass (3.4.23) + sass (3.7.3) + sass-listen (~> 4.0.0) + sass-listen (4.0.0) + rb-fsevent (~> 0.9, >= 0.9.4) + rb-inotify (~> 0.9, >= 0.9.7) sawyer (0.8.1) addressable (>= 2.3.5, < 2.6) faraday (~> 0.8, < 1.0) - terminal-table (1.7.3) - unicode-display_width (~> 1.1.1) - thread_safe (0.3.5) - typhoeus (0.8.0) - ethon (>= 0.8.0) - tzinfo (1.2.2) + terminal-table (1.8.0) + unicode-display_width (~> 1.1, >= 1.1.1) + thread_safe (0.3.6) + typhoeus (1.3.1) + ethon (>= 0.9.0) + tzinfo (1.2.5) thread_safe (~> 0.1) - unicode-display_width (1.1.3) + unicode-display_width (1.4.1) PLATFORMS ruby From a94dfa3f0140e27693671989ce2dfdfd05d3ab60 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 15:05:20 -0500 Subject: [PATCH 014/640] Fix formatting --- static/css/main.css | 1 + virtual-machines.md | 5 ++--- 2 files changed, 3 
insertions(+), 3 deletions(-) diff --git a/static/css/main.css b/static/css/main.css index bdff4864..b592b6d4 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -64,6 +64,7 @@ ul { li > ul { padding-left: 2rem; + margin-bottom: 1rem; } ul li { diff --git a/virtual-machines.md b/virtual-machines.md index 3226b8e7..89905574 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -49,13 +49,12 @@ enable nicer integration with host system. You should use this if you can. ## Resources - Hypervisors - - [VirtualBox](https://www.virtualbox.org/) (open-source) - [Virt-manager](https://virt-manager.org/) (open-source, manages KVM virtual machines and LXC containers) - [VMWare](https://www.vmware.com/) (commercial, available from IS&T [for MIT students](https://ist.mit.edu/vmware-fusion)) - If you are already familiar with popular hypervisors/VMs you many want to learn more about how to do this from a command line friendly way. One option is the [libvirt](https://wiki.libvirt.org/page/UbuntuKVMWalkthrough) toolkit which allows you to manage multiple different virtualization providers/hypervisors. +If you are already familiar with popular hypervisors/VMs you many want to learn more about how to do this from a command line friendly way. One option is the [libvirt](https://wiki.libvirt.org/page/UbuntuKVMWalkthrough) toolkit which allows you to manage multiple different virtualization providers/hypervisors. ## Exercises @@ -126,4 +125,4 @@ In a similar fashion to code repository websites (like [GitHub](https://github.c 1. Choose a container software (Docker, LXC, …) and install a simple Linux image. Try SSHing into it. -1. Search and download a prebuilt container image for a popular web server (nginx, apache, …) \ No newline at end of file +1. 
Search and download a prebuilt container image for a popular web server (nginx, apache, …) From 8ff7b6ae4e58c2f562feb8f9bd21ba2177117e48 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 15:12:44 -0500 Subject: [PATCH 015/640] Fix --- static/css/main.css | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/static/css/main.css b/static/css/main.css index b592b6d4..f4aecd8c 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -64,13 +64,20 @@ ul { li > ul { padding-left: 2rem; - margin-bottom: 1rem; } ul li { list-style-type: none; } +ul, ol { + margin-bottom: 1rem; +} + +ul ul, ol ul, ul ol, ol ol { + margin-bottom: inherit; +} + ul li:before { content: "\2013 "; /* note: extra space needed because first is consumed by css parser */ position: absolute; From 4f984287a6e75a9d613e827379351635990dc239 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 15 Jan 2019 16:14:54 -0500 Subject: [PATCH 016/640] Add some exercises for shell --- shell.md | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/shell.md b/shell.md index 7b41e70f..b6e230b8 100644 --- a/shell.md +++ b/shell.md @@ -231,8 +231,49 @@ What about other stuff running on your computer? - also `SIGTERM` (`-15` or `-TERM`): tell it to exit gracefully equivalent to `^C` -## Exercises and further reading -QoL stuff: fasd/autojump, fzf, rg, fd, etc. + + + +## Flags + +Most command line utilities take parameters using **flags**. Flags usually come in short form (`-h`) and long form (`--help`). Usually running `CMD -h` or `man CMD` will give you +Short flags can usually be combined so running `rm -r -f` is equivalent to running `rm -rf` or `rm -fr`. +Some common flags are a de facto standard and you will seem them in many applications: + +* `-a` commonly refers to all files (i.e. 
also including those that start with a period) +* `-f` usually refers to forcing something, like `rm -f` +* `-h` displays the help for most commands +* `-v` usually enables a verbose output +* `-V` usually prints the version of the command + +Also, a double dash `--` is used in built-in commands and many other commands to signify the end of command options, after which only positional parameters are accepted. So if you have a file called `-v` (which you can) and want to grep it `grep pattern -- -v` will work whereas `grep pattern -v` won't. + +## Exercises + +1. **PATH, which, type** +We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further + +- Run `echo $PATH` (or `echo $PATH | tr -s ':' '\n'` for pretty printing) and examine its contents, what locations are listed? +- The command `which` locates a program in the user PATH. Try running `which` for common commands like `echo`, `ls` or . Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? +- Run `export PATH` and try running the previous commands again, some work and some don't, can you figure out why? + +2. **Special Variables** + + - What does the variable `~` expands as? + - What does the variable `$?` do? + - What does the variable `$_` do? + - What does the variable `!!` expand to? What about `!!*`? And `!l`? + - Look for documentation for these options and familiarize yourself with them + +3. **Keyboard shortcuts** + +As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. 
+ + - `Ctrl+A` + - `Ctrl+E` + - `Ctrl+R` + - `Ctrl+L` + - `Ctrl+C` + - `Ctrl+D` -TODO From 2b5622055eb047f7e361741cf4da450198c6e9ab Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 15 Jan 2019 17:38:19 -0500 Subject: [PATCH 017/640] Add extra exercises for shell --- shell.md | 45 ++++++++++++++++++++++++++++++--------------- 1 file changed, 30 insertions(+), 15 deletions(-) diff --git a/shell.md b/shell.md index 6b9d2993..0a7e219e 100644 --- a/shell.md +++ b/shell.md @@ -251,29 +251,44 @@ Also, a double dash `--` is used in built-in commands and many other commands to ## Exercises +1. If you are completely new to the shell you may want to read a more comprenhensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide) + 1. **PATH, which, type** We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further -- Run `echo $PATH` (or `echo $PATH | tr -s ':' '\n'` for pretty printing) and examine its contents, what locations are listed? -- The command `which` locates a program in the user PATH. Try running `which` for common commands like `echo`, `ls` or . Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? -- Run `export PATH` and try running the previous commands again, some work and some don't, can you figure out why? + - Run `echo $PATH` (or `echo $PATH | tr -s ':' '\n'` for pretty printing) and examine its contents, what locations are listed? + - The command `which` locates a program in the user PATH. Try running `which` for common commands like `echo`, `ls` or . Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? 
+ - Run `export PATH` and try running the previous commands again, some work and some don't, can you figure out why? + +1. **Special Variables** + + - What does the variable `~` expands as? What about `.`? And `..`? + - What does the variable `$?` do? + - What does the variable `$_` do? + - What does the variable `!!` expand to? What about `!!*`? And `!l`? + - Look for documentation for these options and familiarize yourself with them + +1. **xargs** -2. **Special Variables** +Sometimes piping doesn't quite work because the command being piped into does not expect the newline separated format. For example `file` command tells you properties of the file. - - What does the variable `~` expands as? - - What does the variable `$?` do? - - What does the variable `$_` do? - - What does the variable `!!` expand to? What about `!!*`? And `!l`? - - Look for documentation for these options and familiarize yourself with them +Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? Note that this works because `file` accepts arbitrarily many arguments. -3. **Keyboard shortcuts** +1. **Misc** + +- Try running `touch {a,b}{a,b}` then `ls` what did appear? +- Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` +- Try running `cat hello.txt > hello.txt ` what do you expect to happen? What does happen? +- Run `echo HELLO > hello.txt` and then run `echo WORLD >> hello.txt`. What are the contents of `hello.txt`? How is `>` different from `>>`? +- Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the ouput different? If you want to know more seach for ANSI color escape sequences. +- Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? + +1. **Keyboard shortcuts** As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. 
Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. - - `Ctrl+A` - - `Ctrl+E` + - `Ctrl+A`, `Ctrl+E` - `Ctrl+R` - `Ctrl+L` - - `Ctrl+C` - - `Ctrl+D` - + - `Ctrl+C`, `Ctrl+\` and `Ctrl+D` + - `Ctrl+U` and `Ctrl+Y` \ No newline at end of file From b95bf1107c9f5f2369915e40e47edb5a88869fe3 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 15 Jan 2019 18:04:06 -0500 Subject: [PATCH 018/640] Fix some typos in shell --- shell.md | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/shell.md b/shell.md index 0a7e219e..b5690851 100644 --- a/shell.md +++ b/shell.md @@ -208,7 +208,7 @@ What if you want to run longer-term things in the background? - see all such processes with `jobs` - notice that it shows "Running" - bring it to the foreground with `fg %JOB` (no argument is latest) - - if you want to background the current program: `^Z` + `bg` + - if you want to background the current program: `^Z` + `bg` (Here `^Z` means pressing `Ctrl+Z`) - `^Z` stops the current process and makes it a "job" - `bg` runs the last job in the background (as if you did `&`) - background jobs are still tied to your current session, and exit if @@ -232,13 +232,10 @@ What about other stuff running on your computer? equivalent to `^C` - - - ## Flags -Most command line utilities take parameters using **flags**. Flags usually come in short form (`-h`) and long form (`--help`). Usually running `CMD -h` or `man CMD` will give you -Short flags can usually be combined so running `rm -r -f` is equivalent to running `rm -rf` or `rm -fr`. +Most command line utilities take parameters using **flags**. Flags usually come in short form (`-h`) and long form (`--help`). Usually running `CMD -h` or `man CMD` will give you a list of the flags the program takes. 
+Short flags can usually be combined, running `rm -r -f` is equivalent to running `rm -rf` or `rm -fr`. Some common flags are a de facto standard and you will seem them in many applications: * `-a` commonly refers to all files (i.e. also including those that start with a period) @@ -247,11 +244,11 @@ Some common flags are a de facto standard and you will seem them in many applica * `-v` usually enables a verbose output * `-V` usually prints the version of the command -Also, a double dash `--` is used in built-in commands and many other commands to signify the end of command options, after which only positional parameters are accepted. So if you have a file called `-v` (which you can) and want to grep it `grep pattern -- -v` will work whereas `grep pattern -v` won't. +Also, a double dash `--` is used in built-in commands and many other commands to signify the end of command options, after which only positional parameters are accepted. So if you have a file called `-v` (which you can) and want to grep it `grep pattern -- -v` will work whereas `grep pattern -v` won't. In fact, one way to create such file is to do `touch -- -v`. ## Exercises -1. If you are completely new to the shell you may want to read a more comprenhensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide) +1. If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide) 1. **PATH, which, type** We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further @@ -280,15 +277,15 @@ Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? Note that - Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` - Try running `cat hello.txt > hello.txt ` what do you expect to happen? What does happen? 
- Run `echo HELLO > hello.txt` and then run `echo WORLD >> hello.txt`. What are the contents of `hello.txt`? How is `>` different from `>>`? -- Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the ouput different? If you want to know more seach for ANSI color escape sequences. +- Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more search for ANSI color escape sequences. - Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? 1. **Keyboard shortcuts** -As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. +As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. (remember that `^X` means pressing `Ctrl+X`) - - `Ctrl+A`, `Ctrl+E` - - `Ctrl+R` - - `Ctrl+L` - - `Ctrl+C`, `Ctrl+\` and `Ctrl+D` - - `Ctrl+U` and `Ctrl+Y` \ No newline at end of file + - `^A`, `^E` + - `^R` + - `^L` + - `^C`, `^\` and `^D` + - `^U` and `^Y` \ No newline at end of file From 0593083d07289a1a7fc437c53933e13023b579db Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 18:11:41 -0500 Subject: [PATCH 019/640] More Piazza --- course-overview.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/course-overview.md b/course-overview.md index 2d35b563..11ff78e6 100644 --- a/course-overview.md +++ b/course-overview.md @@ -32,12 +32,16 @@ own. 
We'll inspire you to learn more about your tools, and we'll show you what's possible and cover some of the basics in detail, but we can't teach you everything in the time we have. -Please post questions on [Piazza](https://piazza.com/class/jqjpgaeaz77785). In -addition, we ask that you share your knowledge with your classmates through -Piazza --- **for "homework" for each lecture, create a Piazza note about -something you've learned or something you'd like to share about the topic**. +Please post questions on [Piazza]. In addition, we ask that you share your +knowledge with your classmates through Piazza --- **for "homework" for each +lecture, create a Piazza note about something you've learned or something you'd +like to share about the topic**. # Exercises 1. Fill out the [registration form](https://goo.gl/forms/HSdsUQ204Ow8BgUs2) if you haven't already. + +1. Sign up for [Piazza]. + +[Piazza]: https://piazza.com/class/jqjpgaeaz77785 From e80124568ed553a2dbb747d79b414a61d8fa4d1c Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 15 Jan 2019 18:12:49 -0500 Subject: [PATCH 020/640] Fix shell explanation --- shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/shell.md b/shell.md index b5690851..f2eba9ec 100644 --- a/shell.md +++ b/shell.md @@ -269,7 +269,7 @@ We briefly discussed that the `PATH` environment variable is used to locate the Sometimes piping doesn't quite work because the command being piped into does not expect the newline separated format. For example `file` command tells you properties of the file. -Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? Note that this works because `file` accepts arbitrarily many arguments. +Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? 1. 
**Misc** From 6de1157c3337614b6855856741bf84ed2bbe0e83 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 15 Jan 2019 18:19:58 -0500 Subject: [PATCH 021/640] Fix numbering The styling is still a little ugly, but this commit fixes the numbering. --- shell.md | 32 +++++++++++++++----------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/shell.md b/shell.md index b5690851..c6a6e697 100644 --- a/shell.md +++ b/shell.md @@ -251,14 +251,13 @@ Also, a double dash `--` is used in built-in commands and many other commands to 1. If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide) 1. **PATH, which, type** -We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further + We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further - Run `echo $PATH` (or `echo $PATH | tr -s ':' '\n'` for pretty printing) and examine its contents, what locations are listed? - The command `which` locates a program in the user PATH. Try running `which` for common commands like `echo`, `ls` or . Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? - Run `export PATH` and try running the previous commands again, some work and some don't, can you figure out why? 1. **Special Variables** - - What does the variable `~` expands as? What about `.`? And `..`? - What does the variable `$?` do? - What does the variable `$_` do? @@ -267,25 +266,24 @@ We briefly discussed that the `PATH` environment variable is used to locate the 1. **xargs** -Sometimes piping doesn't quite work because the command being piped into does not expect the newline separated format. 
For example `file` command tells you properties of the file. + Sometimes piping doesn't quite work because the command being piped into does not expect the newline separated format. For example `file` command tells you properties of the file. -Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? Note that this works because `file` accepts arbitrarily many arguments. + Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? Note that this works because `file` accepts arbitrarily many arguments. 1. **Misc** - -- Try running `touch {a,b}{a,b}` then `ls` what did appear? -- Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` -- Try running `cat hello.txt > hello.txt ` what do you expect to happen? What does happen? -- Run `echo HELLO > hello.txt` and then run `echo WORLD >> hello.txt`. What are the contents of `hello.txt`? How is `>` different from `>>`? -- Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more search for ANSI color escape sequences. -- Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? + - Try running `touch {a,b}{a,b}` then `ls` what did appear? + - Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` + - Try running `cat hello.txt > hello.txt ` what do you expect to happen? What does happen? + - Run `echo HELLO > hello.txt` and then run `echo WORLD >> hello.txt`. What are the contents of `hello.txt`? How is `>` different from `>>`? + - Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more search for ANSI color escape sequences. + - Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? 1. **Keyboard shortcuts** -As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. 
Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. (remember that `^X` means pressing `Ctrl+X`) + As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do. (remember that `^X` means pressing `Ctrl+X`) - - `^A`, `^E` - - `^R` - - `^L` - - `^C`, `^\` and `^D` - - `^U` and `^Y` \ No newline at end of file + - `^A`, `^E` + - `^R` + - `^L` + - `^C`, `^\` and `^D` + - `^U` and `^Y` From 0346eba03a4635ca23b279e8753e7c718b6304d7 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 16 Jan 2019 00:46:54 -0500 Subject: [PATCH 022/640] Fix error exercise --- shell.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/shell.md b/shell.md index 666c1fad..7a2d30e9 100644 --- a/shell.md +++ b/shell.md @@ -254,8 +254,8 @@ Also, a double dash `--` is used in built-in commands and many other commands to We briefly discussed that the `PATH` environment variable is used to locate the programs that you run through the command line. Let's explore that a little further - Run `echo $PATH` (or `echo $PATH | tr -s ':' '\n'` for pretty printing) and examine its contents, what locations are listed? - - The command `which` locates a program in the user PATH. Try running `which` for common commands like `echo`, `ls` or . Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? - - Run `export PATH` and try running the previous commands again, some work and some don't, can you figure out why? + - The command `which` locates a program in the user PATH. 
Try running `which` for common commands like `echo`, `ls` or `mv`. Note that `which` is a bit limited since it does not understand shell aliases. Try running `type` and `command -v` for those same commands. How is the output different? + - Run `PATH=` and try running the previous commands again, some work and some don't, can you figure out why? 1. **Special Variables** - What does the variable `~` expands as? What about `.`? And `..`? From 4a4d7b91782862fd3b38a74efcf9008dae6adf7f Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 16 Jan 2019 16:06:33 -0500 Subject: [PATCH 023/640] Add link from Jose --- dotfiles.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/dotfiles.md b/dotfiles.md index 243272b0..857a4855 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -90,6 +90,9 @@ machine-specific settings. - [GitHub does dotfiles](http://dotfiles.github.io/): dotfile frameworks, utilities, examples, and tutorials +- [Shell startup + scripts](https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html): an + explanation of the different configuration files used for your shell # Exercises From 44e73b7891c3b1befd975bf705968ec0ecd6f98b Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 16 Jan 2019 16:06:53 -0500 Subject: [PATCH 024/640] Add note from Jon --- editors.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/editors.md b/editors.md index ad7c9b60..f423f11c 100644 --- a/editors.md +++ b/editors.md @@ -5,3 +5,7 @@ presenter: Anish --- Lecture notes will be available by the start of lecture. + +{% comment %} +https://vimways.org/2018/ +{% endcomment %} From 03cb3cb5f8e777e5eef1a287531b8feaf88ad698 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 16 Jan 2019 16:07:08 -0500 Subject: [PATCH 025/640] Add note --- editors.md | 1 + 1 file changed, 1 insertion(+) diff --git a/editors.md b/editors.md index f423f11c..22d3c675 100644 --- a/editors.md +++ b/editors.md @@ -8,4 +8,5 @@ Lecture notes will be available by the start of lecture. 
{% comment %} https://vimways.org/2018/ +https://vim-adventures.com/ {% endcomment %} From 927cd6cce1280b960ebade364c5f233e28bb12f2 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 16 Jan 2019 23:04:54 -0500 Subject: [PATCH 026/640] Initial draft of command-line.md --- command-line.md | 162 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 159 insertions(+), 3 deletions(-) diff --git a/command-line.md b/command-line.md index cf9de7cb..9d204472 100644 --- a/command-line.md +++ b/command-line.md @@ -4,11 +4,167 @@ title: "Command-line environment" presenter: Jose --- -Lecture notes will be available by the start of lecture. +# Command-line Environment + +## Aliases & Functions + +As you can imagine it can become tiresome typing long commands that involve many flags or verbose options. Fortunately, most shells support **aliasing**. For instance, an alias in bash has the following structure (note there is no space around the `=` sign): + +```bash +alias alias_name="command_to_alias" +``` + + + +Aliases have many convenient features: + +```bash +# Aliases can summarize good default flags +alias ll="ls -lh" + +# Save a lot of typing for common commands +alias gc="git commit" + +# Aliases can overwrite existing commands +alias mv="mv -i" +alias mkdir="mkdir -p" + +# Aliases can be composed +alias la="ls -A" +alias lla="la -l" + +# To ignore an alias run it prepended with \ +\ls +# Or can be disabled using unalias +unalias la + +``` + + + +However, in many scenarios aliases can be limiting, especially when you are trying to chain together commands that take the same arguments. An alternative is **functions**, which are a midpoint between aliases and custom shell scripts. + +Here is an example function that makes a directory and moves into it. The argument is quoted so that directory names containing spaces work. + +```bash +mcd () { + mkdir -p "$1" + cd "$1" +} +``` + +Aliases and functions do not persist across shell sessions by default. 
To make an alias persistent you need to include it in one of the shell startup files like `.bashrc` or `.zshrc`. My suggestion is to write them separately in a `.alias` file and `source` that file from your different shell config files. + + + +## Shells & Frameworks + +During shell and scripting we covered the `bash` shell since it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option. + +For example, the `zsh` shell is largely compatible with `bash` and provides many convenient features out of the box such as: + +- Smarter globbing, `**` +- Inline globbing/wildcard expansion +- Spelling correction +- Better tab completion/selection +- Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`) + +Moreover, many shells can be improved with **frameworks**. There are popular general frameworks like [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussell/oh-my-zsh), and smaller ones that focus on specific features, for example [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Other shells like [fish](https://fishshell.com/) include a lot of these user-friendly features by default. Some of these features include: + +- Right prompt +- Command syntax highlighting +- History substring search +- manpage based flag completions +- Smarter autocompletion +- Prompt themes + +One thing to note when using these frameworks is that if the code they run is not properly optimized or it is too much code, your shell can start slowing down. You can always profile it and disable the features that you rarely use or that are not worth the slowdown. + +## Terminal Emulators & Multiplexers + + + +## Command-line utilities + +The command line utilities that most UNIX based operating systems have by default are more than enough to do 99% of the stuff you usually need to do. 
+ + +In the next few subsections I will cover alternative tools for extremely common shell operations which are more convenient to use. Some of these tools add new improved functionality to the command whereas others just focus on providing a simpler, more intuitive interface with better defaults. + +### `fasd` vs `cd` + +[fasd](https://github.com/clvv/fasd) is probably one of the best . Even with improved path expansion and tab autocomplete, changing directories can become quite repetitive. Fasd (or [autojump](https://github.com/wting/autojump)) solves this issue by keeping track of recent and frequent folders you have been to and performing fuzzy matching. + +Thus if I have visited the path `/home/user/awesome_project/code` running `z code` will `cd` to it. If I have multiple folders called code I can disambiguate by running `z awe code` which will be closer match. Unlike autojump, fasd also provides commands that instead of performing `cd` just expand frequent and /or recent files,folders or both. + + +### `bat` vs `cat` + +Even though `cat` does it job perfectly, [bat](https://github.com/sharkdp/bat) improves it by providing syntax highlighting, paging, line numbers and git integration. + + +### `exa`/`ranger` vs `ls` + +`ls` is a great command but some of the defaults can be annoying such as displaying the size in raw bytes. [exa](https://github.com/ogham/exa) provides better defaults + +If you are in need of navigating many folders and/or previewing many files, [ranger](https://github.com/ranger/ranger) can be much more efficient than `cd` and `cat` due to its wonderful interface. It is quite customizable and with a correct setup you can even [preview images](https://github.com/ranger/ranger/wiki/Image-Previews) in your terminal + +### `fd` vs `find` + +[fd](https://github.com/sharkdp/fd) is a simple, fast and user-friendly alternative to `find`. 
`fd`'s defaults, such as matching on the file name without needing `find`'s `-name` flag (which is what you want to do 99% of the time), make it easier to use on an everyday basis. It is also `git` aware and will skip files in your `.gitignore` and `.git` folder by default. It also has nice color coding by default. + +### `rg/fzf` vs `grep` + +`grep` is a great tool but if you want to grep through many files at once, there are better tools for that purpose. [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) & [rg](https://github.com/BurntSushi/ripgrep) recursively search your current directory for a regex pattern, and `ag` and `rg` respect your gitignore rules by default. They all work pretty similarly but I favor `rg` due to how fast it can search my entire home directory. + +Similarly, it can be easy to find yourself doing `CMD | grep PATTERN` over and over again. [fzf](https://github.com/junegunn/fzf) is a command line fuzzy finder that enables you to interactively filter the output of pretty much any command. + +### `rsync` vs `cp/scp` + +Whereas `cp` and `scp` are perfect for most scenarios, when copying/moving around large amounts of files, large files or when some of the data is already on the destination `rsync` is a huge improvement. `rsync` will skip files that have already been transferred and with the `--partial` flag it can resume from a previously interrupted copy. + +### `trash` vs `rm` + +`rm` is a dangerous command in the sense that once you delete a file there is no turning back. However, modern operating systems do not behave like that when you delete something in the file explorer; they just move it to the Trash folder, which is cleared periodically. + +Since how the trash is managed varies from OS to OS there is not a single CLI utility. In macOS there is [trash](https://hasseg.org/trash/) and in Linux there is [trash-cli](https://github.com/andreafrancia/trash-cli/) among others. 
+ +### `mosh` vs `ssh` + +`ssh` is a very handy tool but if you have a slow connection, the lag can become annoying and if the connection is interrupted you have to reconnect. [mosh](https://mosh.org/) is a handy tool that allows roaming, supports intermittent connectivity, and provides intelligent local echo. + +### `tldr` vs `man` + +You can figure out what a command does and what options it has using `man` and the `-h`/`--help` flag most of the time. However, in some cases it can be a bit daunting navigating these if they are detailed. + +The [tldr](https://github.com/tldr-pages/tldr) command is a community driven documentation system that's available from the command line and gives a few simple illustrative examples of what the command does and the most common argument options. + + +### `aunpack` vs `tar/unzip/unrar` + +As [this xkcd](https://xkcd.com/1168/) references, it can be quite tricky to remember the options for `tar`, and sometimes you need a different tool altogether such as `unrar` for rar files. +The [atool](https://www.nongnu.org/atool/) package provides the `aunpack` command which will figure out the correct options and always put the extracted archives in a new folder. + + {% comment %} + + + +- alias +- functions +- bash_profile, zprofile, &c - terminal emulators - multiplexers -- remote: ssh and mosh -- autojump and fzf +- ssh and mosh +- autojump +- fzf {% endcomment %} + +## Exercises + +- Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get the top 10 most used commands and consider writing shorter aliases for them +- +- Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kinds of searches, investigate how to bind `fzf` to `^R`. You can find some [info](here) +- What does the `--bar` option do in `ack`? 
\ No newline at end of file From 86be195d9b0a097a9df0bb03ad9a578387fcf277 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 17 Jan 2019 11:45:40 -0500 Subject: [PATCH 027/640] Lecture notes from data wrangling --- data-wrangling.md | 329 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 328 insertions(+), 1 deletion(-) diff --git a/data-wrangling.md b/data-wrangling.md index 55565960..5d6992a4 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -4,4 +4,331 @@ title: "Data Wrangling" presenter: Jon --- -Lecture notes will be available by the start of lecture. +Have you ever had a bunch of text and wanted to do something with it? +Good. That's what data wrangling is all about! +Specifically, adapting data from one format to another, until you end up +with exactly what you wanted. + +We've already seen basic data wrangling: `journalctl | grep -i intel`. + - find all system log entries that mention Intel (case insensitive) + - really, most of data wrangling is about knowing what tools you have, + and how to combine them. + +Let's start from the beginning: we need a data source, and something to +do with it. Logs often make for a good use-case, because you often want +to investigate things about them, and reading the whole thing isn't +feasible. Let's figure out who's trying to log into my server by looking +at my server's log: + +```bash +ssh myserver journalctl +``` + +That's far too much stuff. Let's limit it to ssh stuff: + +```bash +ssh myserver journalctl | grep sshd +``` + +Notice that we're using a pipe to stream a _remote_ file through `grep` +on our local computer! `ssh` is magical. This is still way more stuff +than we wanted though. And pretty hard to read. Let's do better: + +```bash +ssh myserver journalctl | grep sshd | grep "Disconnected from" +``` + +There's still a lot of noise here. There are _a lot_ of ways to get rid +of that, but let's look at one of the most powerful tools in your +toolkit: `sed`. 
+ +`sed` is a "stream editor" that builds on top of the old `ed` editor. In +it, you basically give short commands for how to modify the file, rather +than manipulate its contents directly (although you can do that too). +There are tons of commands, but one of the most common ones is `s`: +substitution. For example, we can write: + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed 's/.*Disconnected from //' +``` + +What we just wrote was a simple _regular expression_; a powerful +construct that lets you match text against patterns. The `s` command is +written on the form: `s/REGEX/SUBSTITUTION/`, where `REGEX` is the +regular expression you want to search for, and `SUBSTITUTION` is the +text you want to substitute matching text with. + +## Regular expressions + +Regular expressions are common and useful enough that it's worthwhile to +take some time to understand how they work. Let's start by looking at +the one we used above: `/.*Disconnected from /`. Regular expressions are +usually (though not always) surrounded by `/`. Most ASCII characters +just carry their normal meaning, but some characters have "special" +matching behavior. Exactly which characters do what vary somewhat +between different implementations of regular expressions, which is a +source of great frustration. Very common patterns are: + + - `.` means "any single character" except newline + - `*` zero or more of the preceding match + - `+` one or more of the preceding match + - `[abc]` any one character of `a`, `b`, and `c` + - `(RX1|RX2)` either something that matches `RX1` or `RX2` + - `^` the start of the line + - `$` the end of the line + +`sed`'s regular expressions are somewhat weird, and will require you to +put a `\` before most of these to give them their special meaning. Or +you can pass `-E`. 
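To see the difference `-E` makes, here is a minimal sketch on made-up input strings (they are illustrations, not real log lines). It assumes a `sed` that accepts `-E`, which both GNU and BSD `sed` do.

```shell
# Without -E, `+` is an ordinary character, so the pattern "a+" finds
# nothing to replace in this input.
basic=$(echo 'aaa bbb' | sed 's/a+/X/')

# With -E, `+` means "one or more of the preceding match".
extended=$(echo 'aaa bbb' | sed -E 's/a+/X/')

# With -E, alternation and the start-of-line anchor also work unescaped.
anchored=$(echo 'cat dog' | sed -E 's/^(cat|dog)/pet/')

echo "$basic"     # the input comes back unchanged
echo "$extended"  # the run of a's is replaced by a single X
echo "$anchored"  # only the word at the start of the line is replaced
```

Running the same pattern with and without `-E` is a quick way to check which dialect a given `sed` expression is written in.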
+ +So, looking back at `/.*Disconnected from /`, we see that it matches +any text that starts with any number of characters, followed by the +literal string "Disconnected from ". Which is what we wanted. But +beware, regular expressions are trixy. What if someone tried to log in +with the username "Disconnected from"? We'd have: + +``` +Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user Disconnected from 46.97.239.16 port 55920 [preauth] +``` + +What would we end up with? Well, `*` and `+` are, by default, "greedy". +They will match as much text as they can. So, in the above, we'd end up +with just + +``` +46.97.239.16 port 55920 [preauth] +``` + +Which may not be what we wanted. In some regular expression +implementations, you can just suffix `*` or `+` with a `?` to make them +non-greedy, but sadly `sed` doesn't support that. We _could_ switch to +perl's command-line mode though, which _does_ support that construct: + +```bash +perl -pe 's/.*?Disconnected from //' +``` + +We'll stick to `sed` for the rest of this though, because it's by far +the more common tool for these kinds of jobs. `sed` can also do other +handy things like print lines following a given match, do multiple +substitutions per invocation, search for things, etc. But we won't cover +that too much here. `sed` is basically an entire topic in and of itself, +but there are often better tools. + +Okay, so we also have a suffix we'd like to get rid of. How might we do +that? It's a little tricky to match just the text that follows the +username, especially if the username can have spaces and such! What we +need to do is match the _whole_ line: + +```bash + | sed -E 's/.*Disconnected from (invalid |authenticating )?user .* [^ ]+ port [0-9]+( \[preauth\])?$//' +``` + +Let's look at what's going on with a [regex +debugger](https://regex101.com/r/qqbZqh/2). Okay, so the start is still +as before. 
Then, we're matching any of the "user" variants (there are +two prefixes in the logs). Then we're matching on any string of +characters where the username is. Then we're matching on any single word +(`[^ ]+`; any non-empty sequence of non-space characters). Then the word +"port" followed by a sequence of digits. Then possibly the suffix +` [preauth]`, and then the end of the line. + +Notice that with this technique, a username of "Disconnected from" +won't confuse us any more. Can you see why? + +There is one problem with this though, and that is that every matching +log line becomes empty. We want to _keep_ the username after all. For this, we +can use "capture groups". Any text matched by a regex surrounded by +parentheses is stored in a numbered capture group. These are available +in the substitution (and in some engines, even in the pattern itself!) +as `\1`, `\2`, `\3`, etc. So: + +```bash + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' +``` + +As you can probably imagine, you can come up with _really_ complicated +regular expressions. For example, here's an article on how you might +match an [e-mail +address](https://www.regular-expressions.info/email.html). It's [not +easy](https://emailregex.com/). And there's [lots of +discussion](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982). +And people have [written +tests](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php). +And [test matrixes](https://mathiasbynens.be/demo/url-regex). You can +even write a regex for determining if a given number [is a prime +number](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/). + +Regular expressions are notoriously hard to get right, but they are also +very handy to have in your toolbox! 
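To check the greedy-match problem and the capture-group fix without a server, both substitutions can be run against the sample log line shown earlier. This is only a sketch: the variable names are made up for illustration, and it assumes a `sed` that accepts `-E`.

```shell
# The tricky sample line from above: the username is "Disconnected from".
line='Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user Disconnected from 46.97.239.16 port 55920 [preauth]'

# Naive version: the greedy `.*` swallows everything up to the LAST
# "Disconnected from ", so the username is lost.
naive=$(printf '%s\n' "$line" | sed 's/.*Disconnected from //')

# Whole-line version: match the entire line and keep only capture group 2.
user=$(printf '%s\n' "$line" |
  sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/')

echo "naive: $naive"
echo "user:  $user"
```

The second expression copes with the odd username because "user " has to follow the optional prefix, which pins the leading `.*` to the first "Disconnected from".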
+ +## Back to data wrangling + +Okay, so we now have + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' +``` + +We could do it just with `sed`, but why would we? For fun is why. + +```bash +ssh myserver journalctl + | sed -E + -e '/Disconnected from/!d' + -e 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' +``` + +This shows off some of `sed`'s capabilities. `sed` can also inject text +(with the `i` command), explicitly print lines (with the `p` command), +select lines by index, and lots of other things. Check `man sed`! + +Anyway. What we have now gives us a list of all the usernames that have +attempted to log in. But this is pretty unhelpful. Let's look for common +ones: + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' + | sort | uniq -c +``` + +`sort` will, well, sort its input. `uniq -c` will collapse consecutive +lines that are the same into a single line, prefixed with a count of the +number of occurrences. We probably want to sort that too and only keep +the most common logins: + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' + | sort | uniq -c + | sort -nk1,1 | tail -n10 +``` + +`sort -n` will sort in numeric (instead of lexicographic) order. `-k1,1` +means "sort by only the first whitespace-separated column". The `,n` +part says "sort until the `n`th field", where the default is the end of +the line. In this _particular_ example, sorting by the whole line +wouldn't matter, but we're here to learn! + +If we wanted the _least_ common ones, we could use `head` instead of +`tail`. 
There's also `sort -r`, which sorts in reverse order. + +Okay, so that's pretty cool, but we'd sort of like to only give the +usernames, and maybe not one per line? + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' + | sort | uniq -c + | sort -nk1,1 | tail -n10 + | awk '{print $2}' | paste -sd, +``` + +Let's start with `paste`: it lets you combine lines (`-s`) by a given +single-character delimiter (`-d`). But what's this `awk` business? + +## awk -- another editor + +`awk` is a programming language that just happens to be really good at +processing text streams. There is _a lot_ to say about `awk` if you were +to learn it properly, but as with many other things here, we'll just go +through the basics. + +First, what does `{print $2}` do? Well, `awk` programs take the form of +an optional pattern plus a block saying what to do if the pattern +matches a given line. The default pattern (which we used above) matches +all lines. Inside the block, `$0` is set to the entire line's contents, +and `$1` through `$n` are set to the `n`th _field_ of that line, when +separated by the `awk` field separator (whitespace by default, change +with `-F`). In this case, we're saying that, for every line, print the +contents of the second field, which happens to be the username! + +Let's see if we can do something fancier. Let's compute the number of +single-use usernames that start with `c` and end with `e`: + +```bash + | awk '$1 == 1 && $2 ~ /^c[^ ]*e$/ { print $2 }' | wc -l +``` + +There's a lot to unpack here. First, notice that we now have a pattern +(the stuff that goes before `{...}`). The pattern says that the first +field of the line should be equal to 1 (that's the count from `uniq +-c`), and that the second field should match the given regular +expression. And the block just says to print the username. 
We then count +the number of lines in the output with `wc -l`. + +However, `awk` is a programming language, remember? + +```awk +BEGIN { rows = 0 } +$1 == 1 && $2 ~ /^c[^ ]*e$/ { rows += $1 } +END { print rows } +``` + +`BEGIN` is a pattern that matches the start of the input (and `END` +matches the end). Now, the per-line block just adds the count from the +first field (although it'll always be 1 in this case), and then we print +it out at the end. In fact, we _could_ get rid of `grep` and `sed` +entirely, because `awk` [can do it +all](https://backreference.org/2010/02/10/idiomatic-awk/), but we'll +leave that as an exercise to the reader. + +## Analyzing data + +You can do math! + +```bash + | paste -sd+ | bc -l +``` + +```bash +echo "2*($(data | paste -sd+))" | bc -l +``` + +You can get stats in a variety of ways. +[`st`](https://github.com/nferraz/st) is pretty neat, but if you already +have R: + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' + | sort | uniq -c + | awk '{print $1}' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)' +``` + +R is another (weird) programming language that's great at data analysis +and [plotting](https://ggplot2.tidyverse.org/). We won't go into too +much detail, but suffice to say that `summary` prints summary statistics +about a matrix, and we computed a matrix from the input stream of +numbers, so R gives us the statistics we wanted! 
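Going back to the `awk` patterns described earlier, here is a small self-contained sketch that runs them on made-up `uniq -c`-style input (a count followed by a username). The usernames and variable names are invented for illustration.

```shell
# Made-up data standing in for `sort | uniq -c` output.
data=$(printf '%s\n' '1 charlie' '7 root' '1 claire' '2 chase' '1 bob')

# Pattern before the block: count is 1 and the name starts with c and ends with e.
# `paste -sd, -` then joins the matching names with commas.
matches=$(echo "$data" | awk '$1 == 1 && $2 ~ /^c[^ ]*e$/ { print $2 }' | paste -sd, -)

# The same filter, counted inside awk with BEGIN and END blocks.
rows=$(echo "$data" | awk '
  BEGIN { rows = 0 }
  $1 == 1 && $2 ~ /^c[^ ]*e$/ { rows += $1 }
  END { print rows }')

echo "$matches"
echo "$rows"
```

`chase` is filtered out by the `$1 == 1` condition even though its name matches the regular expression, which shows the two halves of the pattern working together.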
+ +If you just want some simple plotting, `gnuplot` is your friend: + +```bash +ssh myserver journalctl + | grep sshd + | grep "Disconnected from" + | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' + | sort | uniq -c + | sort -nk1,1 | tail -n10 + | gnuplot -p -e 'set boxwidth 0.5; plot "-" using 1:xtic(2) with boxes' +``` + +# Exercises + +TODO From afd95e47271d93f76e773bd4a9ff3751f713bf2b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 17 Jan 2019 12:00:57 -0500 Subject: [PATCH 028/640] Add some data wrangling exercises --- data-wrangling.md | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/data-wrangling.md b/data-wrangling.md index 5d6992a4..d28e7231 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -51,7 +51,7 @@ substitution. For example, we can write: ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed 's/.*Disconnected from //' ``` @@ -172,7 +172,7 @@ Okay, so we now have ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' ``` @@ -196,7 +196,7 @@ ones: ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c ``` @@ -209,7 +209,7 @@ the most common logins: ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c | sort -nk1,1 | tail -n10 @@ -230,7 +230,7 @@ usernames, and maybe not one per line? 
```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c | sort -nk1,1 | tail -n10 @@ -295,7 +295,7 @@ You can do math! ``` ```bash -echo "2*($(data | paste -sd+))" | bc -l +echo "2*($(data | paste -sd+))" | bc -l ``` You can get stats in a variety of ways. @@ -305,7 +305,7 @@ have R: ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c | awk '{print $1}' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)' @@ -322,7 +322,7 @@ If you just want some simple plotting, `gnuplot` is your friend: ```bash ssh myserver journalctl | grep sshd - | grep "Disconnected from" + | grep "Disconnected from" | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' | sort | uniq -c | sort -nk1,1 | tail -n10 @@ -331,4 +331,8 @@ ssh myserver journalctl # Exercises -TODO +- If you are not familiar with Regular Expressions [here](https://regexone.com/) is a short interactive tutorial that covers most of the basics +- How is `sed s/REGEX/SUBSTITUTION/g` different from the regular sed? What about `sed s/REGEX/SUBSTITUTION/I`? +- To do inplace substitution it is quite tempting to do something like `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a bad idea, why? Is this particular to `sed`? 
+ + From 4b4da7e78104edea2d9471ebce20380c0410fe5e Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 17 Jan 2019 12:10:05 -0500 Subject: [PATCH 029/640] More data wrangling exercises --- data-wrangling.md | 41 ++++++++++++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/data-wrangling.md b/data-wrangling.md index d28e7231..18d2769a 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -331,8 +331,39 @@ ssh myserver journalctl # Exercises -- If you are not familiar with Regular Expressions [here](https://regexone.com/) is a short interactive tutorial that covers most of the basics -- How is `sed s/REGEX/SUBSTITUTION/g` different from the regular sed? What about `sed s/REGEX/SUBSTITUTION/I`? -- To do inplace substitution it is quite tempting to do something like `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a bad idea, why? Is this particular to `sed`? - - + - If you are not familiar with Regular Expressions + [here](https://regexone.com/) is a short interactive tutorial that + covers most of the basics + - How is `sed s/REGEX/SUBSTITUTION/g` different from the regular sed? + What about `/I` or `/m`? + - To do in-place substitution it is quite tempting to do something like + `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a + bad idea, why? Is this particular to `sed`? + - Look for boot messages that are _not_ shared between your past three + reboots (see `journalctl`'s `-b` flag). You may want to just mash all + the boot logs together in a single file, as that may make things + easier. + - Produce some statistics of your system boot time over the last ten + boots using the log timestamp of the messages + ``` + Logs begin at ... + ``` + and + ``` + systemd[577]: Startup finished in ... + ``` + - Find the number of words (in `/usr/share/dict/words`) that contain at + least three `a`s and don't have a `'s` ending. What are the three + most common last two letters of those words? 
`sed`'s `y` command, or + the `tr` program, may help you with case insensitivity. How many + of those two-letter combinations are there? And for a challenge: + which combinations do not occur? + - Find an online data set like [this + one](https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm) or [this + one](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-1). + Fetch it using `curl` and extract out just two columns of numerical + data. If you're fetching HTML data, + [`pup`](https://github.com/EricChiang/pup) might be helpful. For JSON + data, try [`jq`](https://stedolan.github.io/jq/). Find the min and + max of one column in a single command, and the sum of the difference + between the two columns in another. From 32f6d8661f18a262ca03d15e06089cf613d43e77 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 17 Jan 2019 12:10:49 -0500 Subject: [PATCH 030/640] More data sets --- data-wrangling.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/data-wrangling.md b/data-wrangling.md index 18d2769a..fcd090bc 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -361,6 +361,8 @@ ssh myserver journalctl - Find an online data set like [this one](https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm) or [this one](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-1). + Maybe another one [from + here](https://www.springboard.com/blog/free-public-data-sets-data-science-project/). Fetch it using `curl` and extract out just two columns of numerical data. If you're fetching HTML data, [`pup`](https://github.com/EricChiang/pup) might be helpful. 
For JSON From b8f3c92c3f6bb3a30b684018e6ddab3ffe9e9ba6 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 17 Jan 2019 12:59:50 -0500 Subject: [PATCH 031/640] Add terminal emulators/multiplex to command-line --- command-line.md | 56 +++++++++++++++++++++++++------------------------ 1 file changed, 29 insertions(+), 27 deletions(-) diff --git a/command-line.md b/command-line.md index 9d204472..28c948e0 100644 --- a/command-line.md +++ b/command-line.md @@ -4,7 +4,7 @@ title: "Command-line environment" presenter: Jose --- -# Comand-line Environment +# Command-line Environment ## Aliases & Functions @@ -54,15 +54,15 @@ mcd () { } ``` -Alias and functions will not persist shell sessions by default. To make an alias persistent tou need to include it a one the shell startup script files like `.bashrc` or `.zshrc`. My suggestion is to write them separately in a `.alias` and `source` that file from your different shell config files. +Aliases and functions do not persist across shell sessions by default. To make an alias persistent you need to include it in one of the shell startup script files like `.bashrc` or `.zshrc`. My suggestion is to write them separately in a `.alias` file and `source` that file from your different shell config files. ## Shells & Frameworks -During shell and scripting we covered the `bash` shell since it is by far the most ubiquituos shell and most systems have it as the default option. Nevertheless, it is not the only option. +During shell and scripting we covered the `bash` shell since it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option.
-For exampel the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as: +For example the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as: - Smarter globbing, `**` - Inline globbing/wildcard expansion @@ -70,10 +70,10 @@ For exampel the `zsh` shell is a superset of `bash` and provides many convenient - Better tab completion/selection - Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`) -Moreover many shells can be improved with **frameworks**, some popular general frameworks like [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussell/oh-my-zsh), and smaller ones that focus on specific features like for example [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Other shells like [fish](https://fishshell.com/) include a lot of these user-friendly features by default. Some of these feaures include: +Moreover many shells can be improved with **frameworks**, some popular general frameworks like [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussell/oh-my-zsh), and smaller ones that focus on specific features like for example [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Other shells like [fish](https://fishshell.com/) include a lot of these user-friendly features by default. 
Some of these features include: - Right prompt -- Command syntax higlighting +- Command syntax highlighting - History substring search - manpage based flag completions - Smarter autocompletion @@ -83,6 +83,18 @@ One thing to note when using these frameworks is that if the code they run is no ## Terminal Emulators & Multiplexers +Along with customizing your shell it is worth spending some time figuring out your choice of **terminal emulator** and its settings. There are many, many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). + +Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. Some of the aspects that you may want to modify in your terminal include: + +- Font choice +- Color Scheme +- Keyboard shortcuts +- Tab/Pane support +- Scrollback configuration +- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) offer GPU acceleration) + +It is also worth mentioning **terminal multiplexers** like [tmux](https://github.com/tmux/tmux). `tmux` lets you arrange multiple shell sessions in panes and tabs. It also supports attaching and detaching, which is a very common use-case when you are working on a remote server and want to keep your shell running without having to worry about disowning your current processes (by default when you log out your processes are terminated). This way, with `tmux` you can jump into and out of complex terminal layouts. Similar to terminal emulators, `tmux` supports heavy customization by editing the `~/.tmux.conf` file. ## Command-line utilities @@ -94,7 +106,7 @@ In the next few subsections I will cover alternative tools for extremely common ### `fasd` vs `cd` -[fasd](https://github.com/clvv/fasd) is probably one of the best . Even with improved path expansion and tab autocomplete, changing directories can become quite repetitive.
Fasd (or [autojump](https://github.com/wting/autojump)) solves this issue by keeping track of recent and frequent folders you have been to and performing fuzzy matching. +Even with improved path expansion and tab autocomplete, changing directories can become quite repetitive. [Fasd](https://github.com/clvv/fasd) (or [autojump](https://github.com/wting/autojump)) solves this issue by keeping track of recent and frequent folders you have been to and performing fuzzy matching. Thus if I have visited the path `/home/user/awesome_project/code` running `z code` will `cd` to it. If I have multiple folders called code I can disambiguate by running `z awe code` which will be closer match. Unlike autojump, fasd also provides commands that instead of performing `cd` just expand frequent and /or recent files,folders or both. @@ -118,11 +130,11 @@ If you are in need of navigating many folders and/or previewing many files, [ran `grep` is a great tool but if you want to grep through many files and once, there are better tools for that purpose. [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) & [rg](https://github.com/BurntSushi/ripgrep) recursively search your current directory for a regex pattern while respecting your gitignore rules. They all work pretty similar but I favor `rg` due to how fast it can search my entire home directory. -Similarly, it can be easy to find yourself doing `CMD | grep PATTERN` over an over again. [fzf](https://github.com/junegunn/fzf) is a command line fuzzy finder that enables you to interactively filter the ouput of pretty much any command. +Similarly, it can be easy to find yourself doing `CMD | grep PATTERN` over and over again. [fzf](https://github.com/junegunn/fzf) is a command line fuzzy finder that enables you to interactively filter the output of pretty much any command.
### `rsync` vs `cp/scp` -Whereas `mv` and `scp` are perfect for most scenarios, when copying/moving around large amounts of files, large files or when some of the data is alredy on the destination `rsync` is a huge improvement. `rsync` will skip files that have already been transfered and with the `--partial` flag it can resume from a previously interrupted copy. +Whereas `mv` and `scp` are perfect for most scenarios, when copying/moving around large amounts of files, large files or when some of the data is already on the destination `rsync` is a huge improvement. `rsync` will skip files that have already been transferred and with the `--partial` flag it can resume from a previously interrupted copy. ### `trash` vs `rm` @@ -132,7 +144,7 @@ Since how the trash is managed varies from OS to OS there is not a single CLI ut ### `mosh` vs `ssh` -`ssh ` is a very handy tool but if you have a slow connection, the lag can become annonying and if the connection interrupts you have to reconnect. [mosh](https://mosh.org/) is a handy tool that works allows roaming, supports intermittent connectivity, and provides intelligent local echo. +`ssh` is a very handy tool but if you have a slow connection, the lag can become annoying and if the connection is interrupted you have to reconnect. [mosh](https://mosh.org/) is a handy tool that allows roaming, supports intermittent connectivity, and provides intelligent local echo. ### `tldr` vs `man` @@ -147,24 +159,14 @@ As [this xkcd](https://xkcd.com/1168/) references it can be quite tricky to reme The [atool](https://www.nongnu.org/atool/) package provides the `aunpack` command which will figure out the correct options and always put the extracted archives in a new folder.
- -{% comment %} - - - -- alias -- functions -- bash_profile, zprofile, &c -- terminal emulators -- multiplexers -- ssh and mosh -- autojump -- fzf -{% endcomment %} - ## Exercises - Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get top 10 most used commands and consider writing sorter aliases for them -- -- Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kind of searches, investigate how to bind `fzf` to `^R`. You can find some [info](here) +- Choose a terminal emulator and figure out how to change the following properties: + - Font choice + - Color scheme. How many colors does a standard scheme have? why? + - Scrollback history size + +- Install `fasd` or some similar software and write a bash/zsh function called `v` that performs fuzzy matching on the passed arguments and opens up the top result in your editor of choice. Then, modify it so that if there are multiple matches you can select them with `fzf`. +- Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kind of searches, investigate how to bind `fzf` to `^R`. You can find some info [here](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings) - What does the `--bar` option do in `ack`? 
\ No newline at end of file From 0e8486225bfcc4f677e1f798dc9af1bb0c1b6467 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 17 Jan 2019 14:34:46 -0500 Subject: [PATCH 032/640] More data wrangling --- data-wrangling.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/data-wrangling.md b/data-wrangling.md index fcd090bc..5dd8981f 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -329,6 +329,16 @@ ssh myserver journalctl | gnuplot -p -e 'set boxwidth 0.5; plot "-" using 1:xtic(2) with boxes' ``` +## Data wrangling to make arguments + +Sometimes you want to do data wrangling to find things to install or +remove based on some longer list. The data wrangling we've talked about +so far + `xargs` can be a powerful combo: + +```bash +rustup toolchain list | grep nightly | grep -vE "nightly-x86|01-17" | sed 's/-x86.*//' | xargs rustup toolchain uninstall +``` + # Exercises - If you are not familiar with Regular Expressions From fc09343b197b109907b1d080d40f01d7de3a127d Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 18 Jan 2019 11:19:05 -0500 Subject: [PATCH 033/640] Fix highlight background bleeding outside border --- static/css/syntax.css | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/static/css/syntax.css b/static/css/syntax.css index a09a2010..65054607 100644 --- a/static/css/syntax.css +++ b/static/css/syntax.css @@ -1,4 +1,4 @@ -.highlight { background-color: #f9f9f9 } +pre.highlight { background-color: #f9f9f9; background-clip: border-box } .highlight .c { color: #999988; font-style: italic } /* Comment */ .highlight .err { color: #a61717; background-color: #e3d2d2 } /* Error */ .highlight .k { color: #000000; font-weight: bold } /* Keyword */ From 7a5440b72d30de2f0450c69b10bf07323b065f40 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 18 Jan 2019 11:21:38 -0500 Subject: [PATCH 034/640] Add link to website source code --- index.md | 5 ++++- static/css/main.css | 4 ++++ 2 files changed, 8 insertions(+), 1 
deletion(-) diff --git a/index.md b/index.md index e2d5e83a..61a1dd47 100644 --- a/index.md +++ b/index.md @@ -41,4 +41,7 @@ Have any questions? Send us an email at --- -

Taught as part of SIPB IAP 2019. Co-sponsored by SIPB and MIT EECS.

+
+

Taught as part of SIPB IAP 2019. Co-sponsored by SIPB and MIT EECS.

+

Source code.

+
diff --git a/static/css/main.css b/static/css/main.css index f4aecd8c..a20b35a5 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -162,6 +162,10 @@ hr { font-size: 0.75rem; } +.small p { + margin-bottom: 0; +} + .center { text-align: center; } From 04864bd6a24a8c2b8c06409bfda450c2d7a50209 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sun, 20 Jan 2019 13:25:14 -0500 Subject: [PATCH 035/640] Add note --- editors.md | 1 + 1 file changed, 1 insertion(+) diff --git a/editors.md b/editors.md index 22d3c675..333388b0 100644 --- a/editors.md +++ b/editors.md @@ -9,4 +9,5 @@ Lecture notes will be available by the start of lecture. {% comment %} https://vimways.org/2018/ https://vim-adventures.com/ +- sshfs {% endcomment %} From 624fb928c3bb850d059920c5db5fafda8918c17c Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 21 Jan 2019 22:55:07 -0500 Subject: [PATCH 036/640] a start for vcs --- version-control.md | 275 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 274 insertions(+), 1 deletion(-) diff --git a/version-control.md b/version-control.md index f9e65bea..ded58e2b 100644 --- a/version-control.md +++ b/version-control.md @@ -4,4 +4,277 @@ title: "Version Control" presenter: Jon --- -Lecture notes will be available by the start of lecture. +Whenever you are working on something that changes over time, it's +useful to be able to _track_ those changes. This can be for a number of +reasons: it gives you a record of what changed, how to undo it, who +changed it, and possibly even why. Version control systems (VCS) give +you that ability. They let you _commit_ changes to a set of files, along +with a message describing the change, as well as look at and undo +changes you've made in the past. + +Most VCS support sharing the commit history between multiple users. This +allows for convenient collaboration: you can see the changes I've made, +and I can see the changes you've made. 
And since the VCS tracks +_changes_, it can often (though not always) figure out how to combine +our changes as long as they touch relatively disjoint things. + +There are [_a +lot_](https://en.wikipedia.org/wiki/Comparison_of_version-control_software) +of VCSes out there that differ a lot in what they support, how they +function, and how you interact with them. Here, we'll focus on +[git](https://git-scm.com/), one of the more commonly used ones, but I +recommend you also take a look at +[Mercurial](https://www.mercurial-scm.org/). + +With that all said -- to the cliffnotes! + +## Is git dark magic? + +not quite.. you need to understand the data model. +we're going to skip over some of the details, but roughly speaking, +the _core_ "thing" in git is a commit. + + - every commit has a unique name, "revision hash" + a long hash like `998622294a6c520db718867354bf98348ae3c7e2` + often shortened to a short (unique-ish) prefix: `9986222` + - commit has author + commit message + - also has the hash of any _ancestor commits_ + usually just the hash of the previous commit + - commit also stores a _diff_, a representation of how you get from the + commit's ancestors to the commit (e.g., remove this line in this + file, add these lines to this file, rename that file, etc.) + +initially, the _repository_ (roughly: the folder that git manages) has +no content, and no commits. let's set that up: + +```console +$ git init hackers +$ cd hackers +$ git status +``` + +the output here actually gives us a good starting point. let's dig in +and make sure we understand it all. + +first, "On branch master". + + - don't want to use hashes all the time. + - branches are names that point to hashes. + - master is traditionally the name for the "latest" commit. + every time a new commit is made, the master name will be made to + point to the new commit's hash.
+ - special name `HEAD` refers to "current" name + - you can also make your own names with `git branch` (or `git tag`) + we'll get back to that + +let's skip over "No commits yet" because that's all there is to it. + +then, "nothing to commit". + + - every commit contains a diff with all the changes you made. + but how is that diff constructed in the first place? + - _could_ just always commit _all_ changes you've made since the last + commit + - sometimes you want to only commit some of them (e.g., not `TODO`s) + - sometimes you want to break up a change into multiple commits to + give a separate commit message for each one + - git lets you _stage_ changes to construct a commit + - add changes to a file or files to the staged changes with `git add` + - add only some changes in a file with `git add -p` + - without argument `git add` operates on "all known files" + - remove a file and stage its removal with `git rm` + - empty the set of staged changes `git reset` + - note that this does *not* change any of your files! + it *only* means that no changes will be included in a commit + - to remove only some staged changes: + `git reset FILE` or `git reset -p` + - check staged changes with `git diff --staged` + - see remaining changes with `git diff` + - when you're happy with the stage, make a commit with `git commit` + - if you just want to commit *all* changes: `git commit -a` + - `git help add` has a bunch more helpful info + +while you're playing with the above, try to run `git status` to see what +git thinks you're doing -- it's surprisingly helpful! + +## A commit you say... + +okay, we have a commit, now what? + + - we can look at recent changes: `git log` (or `git log --oneline`) + - we can look at the full changes: `git log -p` + - we can show a particular commit: `git show master` + - or with `-p` for full diff/patch + - we can go back to the state at a commit using `git checkout NAME` + - if `NAME` is a commit hash, git says we're "detached". 
this just + means there's no `NAME` that refers to this commit, so if we make + commits, no-one will know about them. + - we can revert a change with `git revert NAME` + - applies the diff in the commit at `NAME` in reverse. + - we can compare an older version to this one using `git diff NAME..` + - `a..b` is a commit _range_. if either is left out, it means `HEAD`. + - we can show all the commits between using `git log NAME..` + - `-p` works here too + - we can change `master` to point to a particular commit (effectively + undoing everything since) with `git reset NAME`: + - huh, why? wasn't `reset` to change staged changes? + reset has a "second" form (see `git help reset`) which sets `HEAD` + to the commit pointed to by the given name. + - notice that this didn't change any files -- `git diff` now + effectively shows `git diff NAME..`. + +## What's in a name? + +clearly, names are important in git. and they're the key to +understanding *a lot* of what goes on in git. so far, we've talked about +commit hashes, master, and `HEAD`. but there's more! + + - you can make your own branches (like master) with `git branch b` + - creates a new name, `b`, which points to the commit at `HEAD` + - you're still "on" master though, so if you make a new commit, + master will point to that new commit, `b` will not. + - switch to a branch with `git checkout b` + - any commits you make will now update the `b` name + - switch back to master with `git checkout master` + - all your changes in `b` are hidden away + - tags are other names that never change, and that have their own + message. often used to mark releases + changelogs. 
+ - `NAME^` means "the commit before `NAME`" + - can apply recursively: `NAME^^^` + - you _most likely_ mean `~` when you use `^` + - `~` is "temporal", whereas `^` goes by ancestors + - `~~` is the same as `^^` + - with `~` you can also write `X~3` for "3 commits older than `X`" + - you don't want `^3` + - `git diff HEAD^` + - `-` means "the previous name" + - most commands operate on `HEAD` unless you give another argument + +## Clean up your mess. + +your commit history will _very_ often end up as: + + - `add feature x` -- maybe even with a commit message about `x`! + - `forgot to add file` + - `fix bug` + - `typo` + - `typo2` + - `actually fix` + - `actually actually fix` + - `tests pass` + - `fix example code` + - `typo` + - `x` + - `x` + - `x` + - `x` + +that's _fine_ as far as git is concerned, but is not very helpful to +your future self, or to other people who are curious about what has +changed. git lets you clean up these things: + + - `git commit --amend`: fold staged changes into previous commit + - note that this _changes_ the previous commit, giving it a new hash! + - `git rebase -i HEAD~13` is _magical_. + for each of the past 13 commits, choose what to do: + - default is `pick`; do nothing + - `r`: change commit message + - `e`: change commit (add or remove files) + - `s`: combine commit with previous and edit commit message + - `f`: "fixup" -- combine commit with previous; discard commit msg + - at the end, `HEAD` is made to point to what is now the last commit + +## Playing with others. + +a common use-case for version control is to allow multiple people to +make changes to a set of files without stepping on each other's toes. +or rather, to make sure that _if_ they step on each other's toes, they +won't just silently overwrite each other's changes. + +git is a _distributed_ VCS: everyone has a local copy of the entire +repository (well, of everything others have chosen to publish).
some +VCSes are _centralized_ (e.g., subversion): a server has all the +commits, clients only have the files they have "checked out". basically, +they only have the _current_ files, and need to ask the server if they +want anything else. + +every copy of a git repository can be listed as a "remote". you can copy +an existing git repository using `git clone ADDRESS` (instead of `git +init`). this creates a remote called _origin_ that points to `ADDRESS`. +you can fetch names and the commits they point to from a remote with +`git fetch REMOTE`. all names at a remote are available to you as +`REMOTE/NAME`, and you can use them just like local names. + +if you have write access to a remote, you can change names at the remote +to point to commits you've made using `git push`. for example, let's +make the master name (branch) at the remote `origin` point to the commit +that our master branch currently points to: + + - `git push origin master:master` + - for convenience, you can set `origin/master` as the default target + for when you `git push` from the current branch with `-u` + - consider: what does this do? `git push origin master:HEAD^` + +often you'll use GitHub, GitLab, BitBucket, or something else as your +remote. there's nothing "special" about that as far as git is concerned. +it's all just names and commits. if someone makes a change to master and +updates `github/master` to point to their commit (we'll get back to +that in a second), then when you `git fetch github`, you'll be able to +see their changes with `git log github/master`. + +## Working with others. + +so far, branches seem pretty useless: you can create them, do work on +them, but then what? eventually, you'll just make master point to them +anyway, right? + + - what if you had to fix something while working on a big feature? + - what if someone else made a change to master in the meantime? 
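
the first scenario can be sketched in a throwaway repository. this is only an illustration: the branch name `big-feature`, the file names, the commit messages, and the `commit` helper are all made up, and the identity is passed inline so the sketch does not depend on your git configuration.

```bash
set -e
repo=$(mktemp -d)                         # throwaway directory, nothing real is touched
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master

# commit MESSAGE: hypothetical helper wrapping `git commit` with an inline identity
commit() { git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"; }

echo v1 > app.txt;      git add app.txt;     commit "initial"

git branch big-feature                    # new name pointing at the same commit
git checkout -q big-feature
echo wip > feature.txt; git add feature.txt; commit "start big feature"

git checkout -q master                    # feature.txt is hidden away with its branch
echo v2 > app.txt;      git add app.txt;     commit "urgent fix"

git log --oneline --all --graph           # master and big-feature have now diverged
```

after this, `master` and `big-feature` each have one commit the other lacks, which is exactly the situation where the two lines of work need to be combined.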
+ +inevitably, you will have to _merge_ changes in one branch with changes +in another, whether those changes are made by you or someone else. git +lets you do this with, unsurprisingly, `git merge NAME`. `merge` will: + + - look for the latest point where `HEAD` and `NAME` shared a commit + ancestor (i.e., where they diverged) + - (try to) apply all those changes to the current `HEAD` + - produce a commit that contains all those changes, and lists both + `HEAD` and `NAME` as its ancestors + - set `HEAD` to that commit's hash + +once your big feature has been finished, you can merge its branch into +master, and git will ensure that you don't lose any changes from either +branch! + +if you've used git in the past, you may recognize `merge` by a different +name: `pull`. when you do `git pull REMOTE BRANCH`, that is: + + - `git fetch REMOTE` + - `git merge REMOTE/BRANCH` + - where, like `push`, `REMOTE` and `BRANCH` are often omitted and use + the "tracking" remote branch (remember `-u`?) + +this usually works _great_. as long as the changes to the branches being +merged are disjoint. if they are not, you get a _merge conflict_. sounds +scary... + + - a merge conflict is just git telling you that it doesn't know what + the final diff should look like + - git pauses and asks you to finish staging the "merge commit" + - open the conflicted file in your editor and look for lots of angle + brackets (`<<<<<<<`). the stuff above `=======` is the change made in + the `HEAD` since the shared ancestor commit. the stuff below is the + change made in the `NAME` since the shared commit. + - `git mergetool` is pretty handy -- opens a diff editor + - once you've _resolved_ the conflict by figuring out what the file + should now look like, stage those changes with `git add`. + - when all the conflicts are resolved, finish with `git commit` + +you've just resolved your first git merge conflict! 
\o/ +now you can publish your finished changes with `git push` + +## fast-forward, forced pushes + +## rebase vs merge + +## git stash From 84e1e0795648c716d5df39f5c19a80c3593e011e Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 22 Jan 2019 10:30:39 -0500 Subject: [PATCH 037/640] Some more vcs content + start of exercise list --- version-control.md | 79 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 69 insertions(+), 10 deletions(-) diff --git a/version-control.md b/version-control.md index ded58e2b..1255aff5 100644 --- a/version-control.md +++ b/version-control.md @@ -40,9 +40,11 @@ the _core_ "thing" in git is a commit. - commit has author + commit message - also has the hash of any _ancestor commits_ usually just the hash of the previous commit - - commit also stores a _diff_, a representation of how you get from the - commit's ancestors to the commit (e.g., remove this line in this + - commit also represents a _diff_, a representation of how you get from + the commit's ancestors to the commit (e.g., remove this line in this file, add these lines to this file, rename that file, etc.) + - in reality, git stores the full before and after state + - probably don't want to store big files that change! initially, the _repository_ (roughly: the folder that git manages) has no content, and no commits. let's set that up: @@ -137,6 +139,7 @@ commit hashes, master, and `HEAD`. but there's more! - any commits you make will now update the `b` name - switch back to master with `git checkout master` - all your changes in `b` are hidden away + - a very handy way to be able to easily test out changes - tags are other names that never change, and that have their own message. often used to mark releases + changelogs. - `NAME^` means "the commit before `NAME` @@ -150,7 +153,7 @@ commit hashes, master, and `HEAD`. but there's more! - `-` means "the previous name" - most commands operate on `HEAD` unless you give another argument -## Clean up your mess. 
+## Clean up your mess your commit history will _very_ often end up as: @@ -183,8 +186,13 @@ changed. git lets you clean up these things: - `s`: combine commit with previous and edit commit message - `f`: "fixup" -- combine commit with previous; discard commit msg - at the end, `HEAD` is made to point to what is now the last commit + - often referred to as _squashing_ commits + - what it really does: rewind `HEAD` to rebase start point, then + re-apply the commits in order as directed. + - `git reset --hard NAME`: reset the state of all files to that of + `NAME` (or `HEAD` if no name is given). handy for undoing changes. -## Playing with others. +## Playing with others a common use-case for version control is to allow multiple people to make changes to a set of files without stepping on each other's toes. @@ -222,7 +230,7 @@ updates `github/master` to point to their commit (we'll get back to that in a second), then when you `git fetch github`, you'll be able to see their changes with `git log github/master`. -## Working with others. +## Working with others so far, branches seem pretty useless: you can create them, do work on them, but then what? eventually, you'll just make master point to them @@ -269,12 +277,63 @@ scary... - once you've _resolved_ the conflict by figuring out what the file should now look like, stage those changes with `git add`. - when all the conflicts are resolved, finish with `git commit` + - you can give up with `git merge --abort` you've just resolved your first git merge conflict! \o/ now you can publish your finished changes with `git push` -## fast-forward, forced pushes - -## rebase vs merge - -## git stash +## When worlds collide + +when you `push`, git checks that no-one else's work is lost if you +update the remote name you're pushing to. it does this by checking +that the current commit of the remote name is an ancestor of the commit +you are pushing. if it is, git can safely just update the name; this is +called _fast-forwarding_.
if it is not, git will refuse to update the +remote name, and tell you there have been changes. + +if your push is rejected, what do you do? + + - merge remote changes with `git pull` (i.e., `fetch` + `merge`) + - force the push with `--force`: this will lose other people's changes! + - there's also `--force-with-lease`, which will only force the change + if the remote name hasn't changed since the last time you fetched + from that remote. much safer! + - if you've rebased local commits that you've previously pushed + ("history rewriting"; probably don't do this), you'll have to force + push. think about why! + - try to re-apply your changes "on top of" the changes made remotely + - this is a `rebase`! + - rewind all local commits since shared ancestor + - fast-forward `HEAD` to commit at remote name + - apply local commits in-order + - may have conflicts you have to manually resolve + - `git rebase --continue` or `--abort` + - lots more [here](https://git-scm.com/book/en/v2/Git-Branching-Rebasing) + - `git pull --rebase` will start this process for you + - whether you should merge or rebase is a hot topic! 
some good reads: + - [this](https://www.atlassian.com/git/tutorials/merging-vs-rebasing) + - [this](https://derekgourlay.com/blog/git-when-to-merge-vs-when-to-rebase/) + - [this](https://stackoverflow.com/questions/804115/when-do-you-use-git-rebase-instead-of-git-merge) + +# Further reading + +[![XKCD on git](https://imgs.xkcd.com/comics/git.png)](https://xkcd.com/1597/) + + - [Learn git branching](https://learngitbranching.js.org/) + - [How to explain git in simple words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) + - [Git from the bottom up](https://jwiegley.github.io/git-from-the-bottom-up/) + - [The Pro Git book](https://git-scm.com/book/en/v2) + +# Exercises + + - forced push + `--force-with-lease` + - git merge/rebase --abort + - git stash + - git reflog + - git hooks + - .gitconfig + aliases + - git blame + - visualization + - `gitk --all` + - `git log --graph --all --decorate` + - exercise about why rebasing public commits is bad From 5aecb5b3bb6330f3b22f9c3c5b1dc952e41e7225 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 22 Jan 2019 14:34:48 -0500 Subject: [PATCH 038/640] Add lecture notes for editors --- editors.md | 299 ++++++- example-data.xml | 1000 ++++++++++++++++++++++ static/media/editor-learning-curves.jpg | Bin 0 -> 22279 bytes static/media/example-data.xml | 1002 +++++++++++++++++++++++ 4 files changed, 2295 insertions(+), 6 deletions(-) create mode 100644 example-data.xml create mode 100644 static/media/editor-learning-curves.jpg create mode 100644 static/media/example-data.xml diff --git a/editors.md b/editors.md index 333388b0..68a5b43b 100644 --- a/editors.md +++ b/editors.md @@ -4,10 +4,297 @@ title: "Editors" presenter: Anish --- -Lecture notes will be available by the start of lecture. +# Importance of Editors -{% comment %} -https://vimways.org/2018/ -https://vim-adventures.com/ -- sshfs -{% endcomment %} +As programmers, we spend most of our time editing plain-text files. 
It's worth +investing time learning an editor that fits your needs. + +How do you learn a new editor? You force yourself to use that editor for a +while, even if it temporarily hampers your productivity. It'll pay off soon +enough (two weeks is enough to learn the basics). + +We are going to teach you Vim, but we encourage you to experiment with other +editors. It's a very personal choice, and people have [strong +opinions](https://en.wikipedia.org/wiki/Editor_war). + +We can't teach you how to use a powerful editor in 50 minutes, so we're going +to focus on teaching you the basics, showing you some of the more advanced +functionality, and giving you the resources to master the tool. We'll teach you +lessons in the context of Vim, but most ideas will translate to any other +powerful editor you use (and if they don't, then you probably shouldn't use +that editor!). + +![Editor Learning Curves](/static/media/editor-learning-curves.jpg) + + + +The editor learning curves graph is a myth. Learning the basics of a powerful +editor is quite easy (even though it might take years to master). + +Which editors are popular today? See this [Stack Overflow +survey](https://insights.stackoverflow.com/survey/2018/#technology-most-popular-development-environments) +(there may be some bias because Stack Overflow users may not be representative +of programmers as a whole). + +## Command-line Editors + +Even if you eventually settle on using a GUI editor, it's worth learning a +command-line editor for easily editing files on remote machines. + +# Nano + +Nano is a simple command-line editor. + +- Move with arrow keys +- All other shortcuts (save, exit) shown at the bottom + +# Vim + +Vi/Vim is a powerful text editor. It's a command-line program that's usually +installed everywhere, which makes it convenient for editing files on a remote +machine. + +Vim also has graphical versions, such as GVim and +[MacVim](https://macvim-dev.github.io/macvim/). 
These provide additional +features such as 24-bit color, menus, and popups. + +## Philosophy of Vim + +- When programming, you spend most of your time reading/editing, not writing + - Vim is a **modal** editor: different modes for inserting text vs manipulating text +- Vim is programmable (with Vimscript and also other languages like Python) +- Vim's interface itself is like a programming language + - Keystrokes (with mnemonic names) are commands + - Commands are composable +- Don't use the mouse: too slow +- Editor should work at the speed you think + +## Introductory Vim + +### Modes + +Vim shows the current mode in the bottom left. + +- Normal mode: for moving around a file and making edits + - Spend most of your time here +- Insert mode: for inserting text +- Visual (visual, line, or block) mode: for selecting blocks of text + +Press `<ESC>` to switch from any mode back to normal +mode. From normal mode, enter insert mode with `i`, visual mode with `v`, +visual line mode with `V`, and visual block mode with `<C-v>`. + +You use the `<ESC>` key a lot when using Vim: consider remapping Caps Lock to +Escape. + +### Basics + +Vim Ex-commands are issued through `:{command}` in normal mode. + +- `:q` quit (close window) +- `:w` save +- `:wq` save and quit +- `:e {name of file}` open file for editing +- `:help {topic}` open help + +### Movement + +Vim is all about efficient movement. Navigate the file in Normal mode.
+ +- Disable arrow keys to avoid bad habits +```vim +nnoremap <Left> :echoe "Use h"<CR> +nnoremap <Right> :echoe "Use l"<CR> +nnoremap <Up> :echoe "Use k"<CR> +nnoremap <Down> :echoe "Use j"<CR> +``` +- Basic movement: `hjkl` (left, down, up, right) +- Words: `w` (next word), `b` (beginning of word), `e` (end of word) +- Lines: `0` (beginning of line), `^` (first non-blank character), `$` (end of line) +- Screen: `H` (top of screen), `M` (middle of screen), `L` (bottom of screen) +- File: `gg` (beginning of file), `G` (end of file) +- Line numbers: `:{number}` or `{number}G` (line {number}) +- Misc: `%` (corresponding item) +- Find: `f{character}`, `t{character}`, `F{character}`, `T{character}` + - find/to forward/backward {character} on the current line +- Repeating N times: `{number}{movement}`, e.g. `10j` moves down 10 lines +- Search: `/{regex}`, `n` / `N` for navigating matches + +### Selection + +Visual modes: + +- Visual +- Visual Line +- Visual Block + +Can use movement keys to make selection. + +### Manipulating text + +Everything that you used to do with the mouse, you now do with the keyboard (and +powerful, composable commands). + +- `i` enter insert mode + - but for manipulating/deleting text, want to use something more than + backspace +- `o` / `O` insert line below / above +- `d{motion}` delete {motion} + - e.g. `dw` is delete word, `d$` is delete to end of line, `d0` is delete + to beginning of line +- `c{motion}` change {motion} + - e.g. `cw` is change word + - like `d{motion}` followed by `i` +- `x` delete character (equal to `dl`) +- `s` substitute character (equal to `xi`) +- visual mode + manipulation + - select text, `d` to delete it or `c` to change it +- `u` to undo, `/{/g` + - `%s/\(.*\)<\/name>/"name": "\1",/g` + - ... 
+ - Vim commands / macros + - `Gdd`, `ggdd` delete last and first lines + - Macro to format a single element (register `e`) + - Go to line with `` + - `qe^r"f>s": "fq` + - Macro to format a person + - Go to line with `` + - `qpS{j@ej@eA,jS},q` + - Macro to format a person and go to the next person + - Go to line with `` + - `qq@pjq` + - Execute macro until end of file + - `999@q` + - Manually remove last `,` and add `[` and `]` delimiters + +## Extending Vim + +There are tons of plugins for extending Vim. + +First, get set up with a plugin manager like +[vim-plug](https://github.com/junegunn/vim-plug), +[Vundle](https://github.com/VundleVim/Vundle.vim), or +[pathogen.vim](https://github.com/tpope/vim-pathogen). + +Some plugins to consider: + +- [ctrlp.vim](https://github.com/kien/ctrlp.vim): fuzzy file finder +- [vim-fugitive](https://github.com/tpope/vim-fugitive): git integration +- [vim-surround](https://github.com/tpope/vim-surround): manipulating "surroundings" +- [gundo.vim](https://github.com/sjl/gundo.vim): navigate undo tree +- [nerdtree](https://github.com/scrooloose/nerdtree): file explorer +- [syntastic](https://github.com/vim-syntastic/syntastic): syntax checking +- [vim-easymotion](https://github.com/easymotion/vim-easymotion): magic motions +- [vim-over](https://github.com/osyo-manga/vim-over): substitute preview + +Lists of plugins: + +- [Vim Awesome](https://vimawesome.com/) + +## Vim-mode in Other Programs + +For many popular editors (e.g. Vim and Emacs), many other tools support editor +emulation.
+ +- `~/.inputrc` + - `set editing-mode vi` +- Shell + - `export EDITOR=vim` (environment variable used by programs like `git`) + - zsh: `bindkey -v` + +## Resources + +- [Vim Tips Wiki](http://vim.wikia.com/wiki/Vim_Tips_Wiki) +- [Vim Advent Calendar](https://vimways.org/2018/): various Vim tips + +# Remote Editing + +[sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server +locally, and then you can use a local editor. + +# Resources + +TODO resources for other editors? + +# Exercises + +1. Experiment with some editors. Try at least one command-line editor (e.g. + Vim) and at least one GUI editor (e.g. Atom). Learn through tutorials like + `vimtutor` (or the equivalents for other editors). To get a real feel for a + new editor, commit to using it exclusively for a couple days while going + about your work. + +1. Customize your editor. Look through tips and tricks online, and look through + other people's configurations (often, they are well-documented). + +1. Experiment with plugins for your editor. + +1. Commit to using a powerful editor for at least a couple weeks: you should + start seeing the benefits by then. At some point, you should be able to get + your editor to work as fast as you think. diff --git a/example-data.xml b/example-data.xml new file mode 100644 index 00000000..e973e1cf --- /dev/null +++ b/example-data.xml @@ -0,0 +1,1000 @@ + + Johnny Zhang Jr. + amyalvarez@cole.com + + + Edward Cook + dsparks@alvarez-dunn.com + + + Stephen Sweeney + dlewis@gmail.com + + + Krystal Riley + jflores@wright.biz + + + Ashley Robinson + robertsmichael@yahoo.com + + + Kimberly Brooks + sharoncunningham@larson.com + + + Brent Proctor + edward86@stewart.com + + + William Roberts + parkertodd@webb.com + + + Amanda Morales + lorizavala@hodges.com + + + Bryan Poole Jr. 
+ carolyn56@gray-campos.net + + + Dale Hall + martinjames@yahoo.com + + + Isabella Reynolds + wbowen@wallace.com + + + Ann Rodriguez + charles37@taylor-riley.biz + + + Bryan Davis + jessica60@hotmail.com + + + Dalton Powell + piercenatasha@yahoo.com + + + Scott Turner + harold68@yahoo.com + + + Nicholas Castillo + dawnstephens@robinson.info + + + Joseph Pierce + lukepatterson@hotmail.com + + + Robyn White + jenniferrobinson@hotmail.com + + + Justin Rice + brandi76@gmail.com + + + Jamie Graham + harrisdavid@yahoo.com + + + Phillip Schmidt + stephanie33@gmail.com + + + John Baker + todd86@hotmail.com + + + Sharon Austin + srivera@yahoo.com + + + Erica Avila + jenniferreed@bowers-wilson.com + + + Jeremy Bass + jdavis@collins.com + + + Joshua Parsons + stephaniecoleman@miller-barker.com + + + Emma Mccoy + taylorjohn@wagner.net + + + Megan Williams + ronnie54@gmail.com + + + Michael Sutton + connie58@mendoza.net + + + Nicholas York + kennedykevin@collins.com + + + Donald Robles + williamsbrandon@gmail.com + + + Melissa Allen + pproctor@ramos-patel.com + + + Shannon Jones + beckkathleen@johnson.com + + + David White + sandra73@thompson.com + + + Jonathan Thomas + johnsonjeremy@gmail.com + + + Rachael Floyd + amanda78@johnson.info + + + Tina Carter + josewells@jones.net + + + Eric Johnson + bowersaustin@hernandez-edwards.com + + + William Kramer + rhunt@johnson.com + + + Nathan Williams + cynthiayoung@hotmail.com + + + Patty Schwartz + salinasdavid@sheppard.biz + + + David Collins + pcalhoun@yahoo.com + + + James Thomas + brianfox@rogers-cruz.com + + + Mark Casey + jerry88@graham.com + + + Robert Galloway + cherylmcgee@hotmail.com + + + Caitlin Dunn + nicholemartin@yahoo.com + + + Nancy Allison + martha33@molina-bullock.com + + + Marvin Burns + wrocha@gmail.com + + + Kimberly Jones + anitamunoz@french-christian.com + + + Caitlin Wood + thomasrandall@bowers-sullivan.org + + + Sara Burton + riosangelica@gmail.com + + + Jessica Roberson + theresa11@hotmail.com + + + Nicole 
Macias + kevinhodge@martin.biz + + + Christina Williams + shawn35@rice-bailey.org + + + Cody Winters + nicholassmith@barron-wu.com + + + Patricia Miller DDS + pierceraymond@watkins.org + + + Jennifer Lyons + vrivera@gmail.com + + + Jerry Rojas + jacobalexander@yahoo.com + + + Matthew Perez + jrivas@hotmail.com + + + Patrick Hogan + moorelisa@yahoo.com + + + Lisa Howard + stephen90@smith.biz + + + Justin Sloan + edwardsmichael@hotmail.com + + + Suzanne Morrow + shane74@yahoo.com + + + Theresa Lara + maryrichardson@clark.com + + + Christopher Powers + yfowler@davis-lee.net + + + Teresa Howell + amy15@yahoo.com + + + Richard Shelton + ksmith@yahoo.com + + + Jeremy Cole + bleach@gmail.com + + + Melissa Clark + rosejeffrey@yahoo.com + + + Kimberly Mcdaniel + ularson@ross-david.com + + + Kelly Dixon + gatesstephen@hotmail.com + + + Devin Quinn + wjohnson@hotmail.com + + + Kevin Greene + lhanson@hotmail.com + + + Jeffery Wiggins + amy76@gmail.com + + + Latoya Allen + vking@yahoo.com + + + Zachary Walker + diazjames@hotmail.com + + + Alyssa Molina + elizabeth59@gmail.com + + + Heather Miranda + davidturner@cortez-martinez.biz + + + Lori Gardner + murphytaylor@yahoo.com + + + Jessica Simpson + jamesdean@rosales.com + + + Anna Dickerson + abigailmurphy@hotmail.com + + + Molly Oconnor + morrisrhonda@yahoo.com + + + Brandi Braun + ericksonmatthew@jenkins.org + + + Renee Flowers + brownantonio@yang-crosby.org + + + Cassandra Compton + progers@yahoo.com + + + David Gilbert + vickie78@gmail.com + + + Brenda Davis + cynthiajones@thornton.com + + + Nicholas Rivera + longalyssa@yahoo.com + + + Dustin Hodges + sgolden@lee.com + + + Chad Wong + williambernard@mccarty.net + + + Robin Craig + xbyrd@austin.com + + + Heather Parker + allenjoshua@rodriguez.com + + + Jennifer Roberts + manningtravis@gmail.com + + + James Andrews + ginaromero@hotmail.com + + + Dorothy Hines + dsmith@thomas.com + + + Stephen Garcia + hughesbrendan@hotmail.com + + + Alfred Ellis + elizabeth41@crawford.info + + 
+ Marilyn White + victoriaford@hotmail.com + + + Brian Graves + cpatel@gmail.com + + + Elizabeth Wagner + newtonwesley@cohen.com + + + Michelle Flores + shelbygross@duke-thomas.info + + + Larry Russell + richard99@meyer.com + + + Terrence Boyd + markmartin@flores.com + + + Jessica Carroll + eric30@yahoo.com + + + Erin Dean + toddmartin@guerra.biz + + + Craig Hernandez + joshualang@gonzalez.com + + + Amber Choi + doughertynancy@harmon.org + + + Renee Brown + terribeard@archer-gibson.info + + + Curtis Turner + pjohnson@hotmail.com + + + Benjamin Reed + marksmith@austin.net + + + Christina Fernandez + richardjoseph@esparza-peters.com + + + Jasmine Campbell + thomasmatthew@gmail.com + + + Catherine Bond + coreyroberts@gonzalez.com + + + Connie Jones + koneal@riley.com + + + Cody Taylor + kelsey99@hotmail.com + + + Kendra Gray + walkerrussell@hotmail.com + + + Alexander Murray + grossrobert@hotmail.com + + + Arthur Jackson + travis73@hotmail.com + + + Dr. William Vasquez DDS + gonzalezdaniel@hotmail.com + + + April Hampton + desireemorris@mcguire.info + + + Gerald Hunter + justin91@ross-scott.biz + + + Morgan Bolton + erika30@lloyd-smith.biz + + + Angela Barker + daniel17@carr.com + + + Angela Montgomery + jonathangoodwin@smith-perez.com + + + Yolanda Henry + shawnmcguire@gmail.com + + + Susan Hines + sarahbailey@wallace.com + + + Michelle Young + lewismichele@yahoo.com + + + Glen Hood + ljackson@vazquez.com + + + Christopher Wright + evansjulie@walton.com + + + Susan Guzman DDS + medinaelizabeth@gmail.com + + + Barbara Cortez + bchavez@cameron.com + + + Stacey Hammond + nancyturner@stewart.com + + + Amanda Stout + macdonaldlatoya@hotmail.com + + + Lisa Johnson + wnolan@gmail.com + + + Carlos Wyatt + iperez@cohen.com + + + Samantha Brewer + thomas47@hotmail.com + + + Brett Jackson + zpowell@cruz-rivera.com + + + Johnny Guzman + tmerritt@yahoo.com + + + Mary Davis + collinslisa@hotmail.com + + + Willie Mccoy + joshua20@terrell.biz + + + Kelsey Rivera + randy72@gmail.com 
+ + + Melissa Maddox + christopher13@gmail.com + + + Jason Rodriguez + kellypierce@harris.com + + + Donna Walsh + wardraymond@martinez.com + + + Monique Patel + cynthia75@james.net + + + Dr. Lindsay Farrell PhD + brownmaria@gmail.com + + + Ann Ruiz + jeremiah94@pennington.org + + + Mary Alexander + catherineharper@munoz.org + + + Brittany Russell + haileywinters@russell-coffey.net + + + Dominique Rosales + matthewpatterson@carr.com + + + Henry Waters + karen72@logan.com + + + Jared Weaver + karlafletcher@baldwin.org + + + Mr. Thomas Atkins + gboone@gmail.com + + + Carla Cohen + ibarron@gmail.com + + + Tricia Lewis + pperez@hotmail.com + + + Mario Gill + lisa43@brown.org + + + James Olsen + vickie82@hotmail.com + + + Michael Perry + rdavis@yahoo.com + + + Matthew Lucas + joshuagray@carpenter-stanley.com + + + Christine Torres + samanthayoung@smith-aguilar.biz + + + Lindsay Miller + randyevans@yahoo.com + + + Margaret Jones + kevincantu@alexander-carson.org + + + Cameron Mcdonald + deckerjerome@garcia.com + + + Brittany Sanders + dennis55@leonard-turner.com + + + Daniel Patterson + timothy36@novak.com + + + David Chaney + kristen02@hotmail.com + + + Sheri Silva + idawson@alvarez.com + + + Holly Ward + saraallen@dunn-smith.net + + + Bryan Solis + stacey30@lam.biz + + + Diane Carter + paulvargas@gmail.com + + + David Brown + james98@gmail.com + + + Bridget Fritz + beth24@hotmail.com + + + Paul Boyd + johngutierrez@hotmail.com + + + Ernest Baker + phillipwhite@hotmail.com + + + George Myers + frank52@hammond.com + + + Daniel Miller + joshua96@gmail.com + + + Jonathan Ayala + jerryharris@davis.net + + + Jill Stone + pwright@hotmail.com + + + Trevor Richard + mreed@thompson.org + + + Jason Thomas + josephflowers@hotmail.com + + + Arthur Thomas + lnelson@hicks.com + + + Austin Collins + ambermann@barnes.com + + + Jason Diaz + ericreyes@hotmail.com + + + Darryl Hall + faithdixon@barnes-burgess.org + + + Jason Thomas + brittany32@yahoo.com + + + John Sanders + 
waltontheresa@hotmail.com + + + Lisa Hayes + victor14@hotmail.com + + + Chelsea Wong + iwatkins@williams-solomon.com + + + Joseph Fitzgerald + mary86@hotmail.com + + + Crystal Schroeder + kbarron@wilson-flynn.org + + + Denise Bean + noah23@gmail.com + + + Jamie Atkins + cwebb@hotmail.com + + + Joshua Kim + esmith@ramirez.com + + + Deanna Mooney + jason13@turner.com + + + Jasmine Baker + torresjacob@braun.com + + + Victoria Williams + rwilliams@hotmail.com + + + Sandra Hall + williamsonrichard@gmail.com + + + Miranda Mcpherson + xrussell@barajas.biz + + + Samantha Walton + danielle73@gmail.com + + + Kyle Serrano + stonecassandra@mcfarland.info + + + Mr. Bruce Maldonado DDS + diazmatthew@yahoo.com + + + Amber Fisher + jonesdavid@rubio.info + + + Brett Berry + millerteresa@gmail.com + + + Cory Bradley + umatthews@summers.com + + + Ryan Peters + shepherdmonique@gmail.com + + + Laura Lee + lfleming@higgins.com + + + Christian Smith + johnnymartinez@castro-miller.com + + + Kelly Hanson + velazquezsandra@chavez-malone.info + + + Brian King + hwood@yahoo.com + + + Cynthia Owens + sbrown@hotmail.com + + + Lisa Clark + derek74@bell-martinez.com + + + Brenda Ford + kevin55@hotmail.com + + + Daniel Brady + wbennett@hotmail.com + + + Jake Wilson + lorraine60@solis.biz + + + April Cole + halltyler@yahoo.com + + + Melissa Callahan + cmckenzie@rodriguez.info + + + Taylor Brown + davisadam@gmail.com + + + Patrick Guerrero + hannah48@delgado.net + + + Brian Gonzalez + burchmalik@johnson.com + + + Robert Bailey + debbiemoore@hotmail.com + + + Jesus Maynard + gene45@gmail.com + + + Linda Greer + johnharris@reed-allen.net + + + Travis Thomas + bryantrachel@gmail.com + + + Vicki Mitchell + edaniels@hotmail.com + + + Paula Espinoza + donnameyer@dennis.org + + + James Hoffman + haustin@larson-wiggins.biz + + + Ashlee Perkins + stevenknapp@miller.com + + + Rebecca Leon + smitchell@simpson-johnson.com + + + Jorge Williams + shawn36@peters-meadows.com + + + Bob Flores + 
kellercourtney@yahoo.com + + + Lisa Miller + johnsoncrystal@gmail.com + + + Brandon Davis + bryanpetersen@hotmail.com + + + Joshua Daugherty + josehayes@carey.com + + + Justin Wise + pamelacosta@simmons-morrow.com + + + Kimberly Johnson + combssandra@deleon.com + + + Toni Stone + eestrada@charles.com + + + Julie Rivers + rwilliams@castillo-nelson.org + + + Kelly Scott + danielsmith@hotmail.com + + + Michael Carr + clarklisa@newman-barrett.com + + + Jonathan Vaughn + dennisrebecca@lawrence-harris.com + + + Erica Lowe + wilsonkelly@hotmail.com + + + Kimberly Clark + jose15@gmail.com + + + Lindsey Robertson + rdickerson@yahoo.com + + + Cindy Anderson + gmorton@daniels.com + + + Tami Barber + harveykaren@hotmail.com + + + Tiffany Wu + jessica90@gmail.com + + + Edward Bowers + hallkathy@gmail.com + + + Shawn Collier + rhondasmith@hotmail.com + + + Michael Cox + usimpson@graham-cunningham.net + diff --git a/static/media/editor-learning-curves.jpg b/static/media/editor-learning-curves.jpg new file mode 100644 index 0000000000000000000000000000000000000000..e84b43c2403023bb7f1d33cb9a298993f7e98c94 GIT binary patch literal 22279 zcmeFZ2T)Yo(l0&;0!k7@;t(W=N|4NuMKXwpzc!U&WRFo8CKp-_83j;L`6D<(< znCA%-D;p;#Clv#)Ac$Rng@cp*Hz6o@?%ct~!Y0ANAz^L%nr#QT`qP1r-$q^)?PVD*A0y)Y~^tbPE;j zHagQ*=<# zF8WjHjX=_yaY6lUY|F2HQ7nW-Q7w6s|!{kpuhpsl8 z>OzDv{+12*=Dg*R_yt$1lo2cx<_upypf?8vVFvm9`3b9+VsPf+)Hy^r+^K|u?D&9 zqT?E~8mv@EG?|)}MIil_zNoBb-wrc~4riAx4I{+gf2+aL`jz@KE|rs3yTb7I`=yPw z;mL7%nf5caXrD2t{ow!s7wqbLpcBj7aq&tlqQs1xXmdeD+9+fz#^a4obk@us*0U|X zHToewXm-|%&s#2^B?Aq+w_T$BU&~vS&(Q~)Ls3nWpQLU@*2haNeLRpBfT5+?21Jxz z14>twH?Zh8U+H9*j7~m&k4>6msVMfG7YYz;4d0x-!#}amDlKo7JcdhatFK{eytG?6?98FuJ}eZhOy> zacJgiIdv`lPoHZmpP%KUE)|8!adfOZtiQ^twQ zfhk1Nsm;5m2Kyuo=`P>CS23_BQK)M6h%#?w88#ND#=Q*d4T-iW!adM;Mvn#$>^y5t zZfISt+5&4Rt0X})Miqxu)ia}0hf6<%ujqS}3hY#@w1l?}@i2JU1F>y!1KPRdR#i+D2H319@TE(Y$gE@APaZn!8zJ(Og4ci^<=VWsWnL!3P^%rD 
zw=-&o(V1(XuCA;NN7-*xK`2Bl!W;dxPScyq_Z+N{6|KFJ*8o3uOva3&4QS?IksUrh z!9R!p&$EF3yv~nMd(4f!-aBjmBlIVkTR*imN0!ER;#BT^LenyO1Hs`GmRHvbUqtC{ zYENBeZH&@1KcWkGLum_JYT{Ota$|b8rtw-bh>yH#P4deb~XXyd>0Ai+KEUmPF&K zlD8=~3hP<+5o7_%JT}DTwzf{F8Yz@F1_f0s4_2W25E1k9F8*^r7VTG$a=8O!aCg@D zTR$`-f#Ilux8LtiGB-t}TlZBwQ!m~=;|~e*8ddS+x#QSTQWRU|8oPCn<1NHGh~UpQ zm^X%J(M~{*!?TAqD&8A17eH-4K;2J1%6^!c!NuvomGvbRJ}7QTFgiZxt1`G0L8aBpd zMLZJqVF7iK*ar#hFbI(QIB<&R3fM{wrpfNYi&jQA(2bR7w)%=$2VtIQSp1a#6 z@_2rj2WRp`@)954f%U&_cwn$!1Ke|l$4qy)UYX#R2QAruG4v9M$HlnGQLaLX>Af2} zsAAUBV9%9$tSVG#l!o-VHP%Af*@@9B0s>0l?$g!LE>u@cB28YE*9;{;$a|CnwNIL$ zCq*-MPG@jV->X>3HZ%>cUxDesx|#&R2WS1>!(YV2jdgFTW5@#J$|yw9Vh{M_xNH*$ z+V)nR%c>3h{o8P9;qybdvZC<`geCo0j{+pR2*Ysozu5ut?uh^wJQCcMnGc2)EQ@aP zai;lgzx|dKbW=U_T(~t5n17k7x%mX~c_=}(W;A}>`>j?Q8#Xq!?{Wm4pKEIe`mN7T z24V8eU4z$vuo{oNxF3cp#4rYs`?vrp>1^lmZq?)3^k^| znf4^s^K>nJFC_y?xEQ#eg^~4csxNd5t4pXinU*Z~#jkvhp>@q;MpmQ8r`<04&1{!V zH|^re-F6#WpllMS^c#*^-{!#u?_3m`8|Sl2)*_+@1H6Ge$rx}=7vsA*_%xj65kURy zH}x4};*2*wXIX0I>pp{O1#^D2b^`)#(y1;Cw!VkWylu7!{&(q%)AydkLmOpNYtwcu z3aUnV-0lYTP&5;8^h$`3VxUp6p@?;Sftj=qEhWK(`YB=CvWhsCbX-cibAB#UbXG0o zeKwsS0oUM570)v8MlkU-XI%4J$P8^d@h#LWmZbF}?TXHlc53p5@Ik8FP)21}A1z zF*^xU;(@I3*cP7)oauD<4DAe)IhH7OjE#@*#}!Zq>Rto7dwttbt^u~uTfhfA*$ZAZ zV-NT$tG(CO43-~1WUC5impCD?8<^j{siX_QC4ZmC;-T}wD&Ez}?NVd;JKv#gf7O7P z_(1co+lk!lh9jKiOZKxYoJll8u;|a@$gyA)UE3IcoXK@-0^07KE)`TD={(&a9mK^Xc)0l95cn z+3mWekf!iGhG_l3180;rD4PlhUi?xgBx!=5@vCs+6`Gie<3G_4f8NR4z%j;|GW7xR z`g2z_WlIxO&paQxofHFN)@r2TM=%caORn&J4+T(b!gUHt<>}T2*&-Ise+y3 z?qp9z=ZyT&l4%rLfi?sDNE;F4J;GN#EYHH{wDQa;>NG4<=fz}QSktI=&(yq*<*yL? 
z$}22V-ktd|cFy1?sCa(fm}C~O^+J0e zDIjHLf+?2IZJeTUj%b-_QPZVX(sg=imEjuDOai#w$1(hB%TYhQe`RAWXf4alZMbIz z>B&h*SlLZtlAmEgI8fynRrL|?NRHbspbL3>t> zlCssTNwpm0p@-bB8KMCrN^qD4!KS(N80i!w!pnLI#9=Sq9zY^74MY8o`AZ zJH{ea*JTT*H+syQN?7_y0?^T)A4o=v2rMk2qUg}Q`cerjiI39zT{-?|n8-78K~inu z!P8!q;2)%-w&k&Pm^{fv9r&lstEokG+4;rMQMlD?n+|Cl^I3?7=JD%)RU7|UXMfrK zLN#yvXL7s}W%_UG<0p@I0`Q;x~w>tFQ>!K!Ej#~fK)maeh6*GSll(>MG zsjlxw+{YOf{&-UJq<#U4`hhM@w=8*Pr%rQ*ZWwH))ngCRWDyOK_1HE=`nTD9_i#vg*}9W&F&`ZyjrRvA*3h5-tt1 z_HEaIhiN20Vg-B$F>!qRn9t2U1^BO;^za{~zWsGd_E&6Y7Ejsaj;W{^RYtsj`IZ$? zD{xPHmv`P-o_D5Yp)StYq;_bIne*f+`x4rAkMootCfz;BeKh>Ra^CB22OoIV(i5rPLQ^J1Z7_PUA=LipUZ3Q@@{Mg_c{s}}2s<9hOP1gV+ zxog0Fm2m9IdHJ$V+Z&hwQxYjT-5XBc2)#}l0TDF&_m_qIH!D~A_J7^btq}U;}GZt~Rtj1THzEI(mK>UlMmKK;;-s}$w3MTy3 zlKUfW`?s+bv|nt}e*v_K$u1#9=T-!pVfyM%tdLJ8R%EU`^s;)LrMjB?Q^h?$Lymog zRw88$N~`vjn;s%@z8e;!@}lH*nZj;iRlg5Is+83Mjcdt#K|7I@jM+V&UTL0i}CfbXRPc!B3YUPf6X!;t1-c>(`*VHDT{S0N!woW-ol0 z4-0Rn5?JHKBxfC36i1_Z_#H@MMopeG8poB7op(Brz~%v8(c4Pt@n;F2-G-yf7b5yE zy1t7O+Pvu03%fCBMAbU5xiV}3D80nTLA4}5*R5YEe=>e~s}w`=w4|ieif7fM4Kd?# zonJ_wX1m-jo}tBYx0ANI!#fYUiVs+?i)%@M8bU1`E3#~0FQ;OG$zY5H7YvLK-Y;b4VQ`zC=9_s|>aBy7@p#1QBvJXS|lu6G`Hi;*wpisB|; z5xnI!7`P$-gXbxli15N}#8EM9o5U-7Z&&hsT>m&!_|0!goB!YO>j*kQ2gdnG^9vp6 z{mfjGYfoovLeRXwS5tv#g9Mac&^)5imws!fRZuQa+MnYX7v4Cz%OdkMmf|7x{0=zY3ze<8wioR7WuEn{$n%fWB zK6M)6uYnC$-*}s{|NY$(ZxdIfg?^@5`)P>aVYFoXg)8HV27f(~->0a9;ScGo*` z&lRhA*Q>W(_q!&(%PMM_*MmEalcYS4c}V_CEEN>MUrwyMs9ksdlG_@$loHV`#cd1) z@B9{7a_BLQZlD5p&X`w0GbMDNG(2aakzU5o*z1@0w zYwNqdX)7n~Fjq)vn^ZW7t$s!<`~XG7_yPutlnMp8rbMgr^%3KMt1rJsI`Wj?e-ZYx L{osc2|Gx + + Johnny Zhang Jr. + amyalvarez@cole.com + + + Edward Cook + dsparks@alvarez-dunn.com + + + Stephen Sweeney + dlewis@gmail.com + + + Krystal Riley + jflores@wright.biz + + + Ashley Robinson + robertsmichael@yahoo.com + + + Kimberly Brooks + sharoncunningham@larson.com + + + Brent Proctor + edward86@stewart.com + + + William Roberts + parkertodd@webb.com + + + Amanda Morales + lorizavala@hodges.com + + + Bryan Poole Jr. 
+ carolyn56@gray-campos.net + + + Dale Hall + martinjames@yahoo.com + + + Isabella Reynolds + wbowen@wallace.com + + + Ann Rodriguez + charles37@taylor-riley.biz + + + Bryan Davis + jessica60@hotmail.com + + + Dalton Powell + piercenatasha@yahoo.com + + + Scott Turner + harold68@yahoo.com + + + Nicholas Castillo + dawnstephens@robinson.info + + + Joseph Pierce + lukepatterson@hotmail.com + + + Robyn White + jenniferrobinson@hotmail.com + + + Justin Rice + brandi76@gmail.com + + + Jamie Graham + harrisdavid@yahoo.com + + + Phillip Schmidt + stephanie33@gmail.com + + + John Baker + todd86@hotmail.com + + + Sharon Austin + srivera@yahoo.com + + + Erica Avila + jenniferreed@bowers-wilson.com + + + Jeremy Bass + jdavis@collins.com + + + Joshua Parsons + stephaniecoleman@miller-barker.com + + + Emma Mccoy + taylorjohn@wagner.net + + + Megan Williams + ronnie54@gmail.com + + + Michael Sutton + connie58@mendoza.net + + + Nicholas York + kennedykevin@collins.com + + + Donald Robles + williamsbrandon@gmail.com + + + Melissa Allen + pproctor@ramos-patel.com + + + Shannon Jones + beckkathleen@johnson.com + + + David White + sandra73@thompson.com + + + Jonathan Thomas + johnsonjeremy@gmail.com + + + Rachael Floyd + amanda78@johnson.info + + + Tina Carter + josewells@jones.net + + + Eric Johnson + bowersaustin@hernandez-edwards.com + + + William Kramer + rhunt@johnson.com + + + Nathan Williams + cynthiayoung@hotmail.com + + + Patty Schwartz + salinasdavid@sheppard.biz + + + David Collins + pcalhoun@yahoo.com + + + James Thomas + brianfox@rogers-cruz.com + + + Mark Casey + jerry88@graham.com + + + Robert Galloway + cherylmcgee@hotmail.com + + + Caitlin Dunn + nicholemartin@yahoo.com + + + Nancy Allison + martha33@molina-bullock.com + + + Marvin Burns + wrocha@gmail.com + + + Kimberly Jones + anitamunoz@french-christian.com + + + Caitlin Wood + thomasrandall@bowers-sullivan.org + + + Sara Burton + riosangelica@gmail.com + + + Jessica Roberson + theresa11@hotmail.com + + + Nicole 
Macias + kevinhodge@martin.biz + + + Christina Williams + shawn35@rice-bailey.org + + + Cody Winters + nicholassmith@barron-wu.com + + + Patricia Miller DDS + pierceraymond@watkins.org + + + Jennifer Lyons + vrivera@gmail.com + + + Jerry Rojas + jacobalexander@yahoo.com + + + Matthew Perez + jrivas@hotmail.com + + + Patrick Hogan + moorelisa@yahoo.com + + + Lisa Howard + stephen90@smith.biz + + + Justin Sloan + edwardsmichael@hotmail.com + + + Suzanne Morrow + shane74@yahoo.com + + + Theresa Lara + maryrichardson@clark.com + + + Christopher Powers + yfowler@davis-lee.net + + + Teresa Howell + amy15@yahoo.com + + + Richard Shelton + ksmith@yahoo.com + + + Jeremy Cole + bleach@gmail.com + + + Melissa Clark + rosejeffrey@yahoo.com + + + Kimberly Mcdaniel + ularson@ross-david.com + + + Kelly Dixon + gatesstephen@hotmail.com + + + Devin Quinn + wjohnson@hotmail.com + + + Kevin Greene + lhanson@hotmail.com + + + Jeffery Wiggins + amy76@gmail.com + + + Latoya Allen + vking@yahoo.com + + + Zachary Walker + diazjames@hotmail.com + + + Alyssa Molina + elizabeth59@gmail.com + + + Heather Miranda + davidturner@cortez-martinez.biz + + + Lori Gardner + murphytaylor@yahoo.com + + + Jessica Simpson + jamesdean@rosales.com + + + Anna Dickerson + abigailmurphy@hotmail.com + + + Molly Oconnor + morrisrhonda@yahoo.com + + + Brandi Braun + ericksonmatthew@jenkins.org + + + Renee Flowers + brownantonio@yang-crosby.org + + + Cassandra Compton + progers@yahoo.com + + + David Gilbert + vickie78@gmail.com + + + Brenda Davis + cynthiajones@thornton.com + + + Nicholas Rivera + longalyssa@yahoo.com + + + Dustin Hodges + sgolden@lee.com + + + Chad Wong + williambernard@mccarty.net + + + Robin Craig + xbyrd@austin.com + + + Heather Parker + allenjoshua@rodriguez.com + + + Jennifer Roberts + manningtravis@gmail.com + + + James Andrews + ginaromero@hotmail.com + + + Dorothy Hines + dsmith@thomas.com + + + Stephen Garcia + hughesbrendan@hotmail.com + + + Alfred Ellis + elizabeth41@crawford.info + + 
+ Marilyn White + victoriaford@hotmail.com + + + Brian Graves + cpatel@gmail.com + + + Elizabeth Wagner + newtonwesley@cohen.com + + + Michelle Flores + shelbygross@duke-thomas.info + + + Larry Russell + richard99@meyer.com + + + Terrence Boyd + markmartin@flores.com + + + Jessica Carroll + eric30@yahoo.com + + + Erin Dean + toddmartin@guerra.biz + + + Craig Hernandez + joshualang@gonzalez.com + + + Amber Choi + doughertynancy@harmon.org + + + Renee Brown + terribeard@archer-gibson.info + + + Curtis Turner + pjohnson@hotmail.com + + + Benjamin Reed + marksmith@austin.net + + + Christina Fernandez + richardjoseph@esparza-peters.com + + + Jasmine Campbell + thomasmatthew@gmail.com + + + Catherine Bond + coreyroberts@gonzalez.com + + + Connie Jones + koneal@riley.com + + + Cody Taylor + kelsey99@hotmail.com + + + Kendra Gray + walkerrussell@hotmail.com + + + Alexander Murray + grossrobert@hotmail.com + + + Arthur Jackson + travis73@hotmail.com + + + Dr. William Vasquez DDS + gonzalezdaniel@hotmail.com + + + April Hampton + desireemorris@mcguire.info + + + Gerald Hunter + justin91@ross-scott.biz + + + Morgan Bolton + erika30@lloyd-smith.biz + + + Angela Barker + daniel17@carr.com + + + Angela Montgomery + jonathangoodwin@smith-perez.com + + + Yolanda Henry + shawnmcguire@gmail.com + + + Susan Hines + sarahbailey@wallace.com + + + Michelle Young + lewismichele@yahoo.com + + + Glen Hood + ljackson@vazquez.com + + + Christopher Wright + evansjulie@walton.com + + + Susan Guzman DDS + medinaelizabeth@gmail.com + + + Barbara Cortez + bchavez@cameron.com + + + Stacey Hammond + nancyturner@stewart.com + + + Amanda Stout + macdonaldlatoya@hotmail.com + + + Lisa Johnson + wnolan@gmail.com + + + Carlos Wyatt + iperez@cohen.com + + + Samantha Brewer + thomas47@hotmail.com + + + Brett Jackson + zpowell@cruz-rivera.com + + + Johnny Guzman + tmerritt@yahoo.com + + + Mary Davis + collinslisa@hotmail.com + + + Willie Mccoy + joshua20@terrell.biz + + + Kelsey Rivera + randy72@gmail.com 
+ + + Melissa Maddox + christopher13@gmail.com + + + Jason Rodriguez + kellypierce@harris.com + + + Donna Walsh + wardraymond@martinez.com + + + Monique Patel + cynthia75@james.net + + + Dr. Lindsay Farrell PhD + brownmaria@gmail.com + + + Ann Ruiz + jeremiah94@pennington.org + + + Mary Alexander + catherineharper@munoz.org + + + Brittany Russell + haileywinters@russell-coffey.net + + + Dominique Rosales + matthewpatterson@carr.com + + + Henry Waters + karen72@logan.com + + + Jared Weaver + karlafletcher@baldwin.org + + + Mr. Thomas Atkins + gboone@gmail.com + + + Carla Cohen + ibarron@gmail.com + + + Tricia Lewis + pperez@hotmail.com + + + Mario Gill + lisa43@brown.org + + + James Olsen + vickie82@hotmail.com + + + Michael Perry + rdavis@yahoo.com + + + Matthew Lucas + joshuagray@carpenter-stanley.com + + + Christine Torres + samanthayoung@smith-aguilar.biz + + + Lindsay Miller + randyevans@yahoo.com + + + Margaret Jones + kevincantu@alexander-carson.org + + + Cameron Mcdonald + deckerjerome@garcia.com + + + Brittany Sanders + dennis55@leonard-turner.com + + + Daniel Patterson + timothy36@novak.com + + + David Chaney + kristen02@hotmail.com + + + Sheri Silva + idawson@alvarez.com + + + Holly Ward + saraallen@dunn-smith.net + + + Bryan Solis + stacey30@lam.biz + + + Diane Carter + paulvargas@gmail.com + + + David Brown + james98@gmail.com + + + Bridget Fritz + beth24@hotmail.com + + + Paul Boyd + johngutierrez@hotmail.com + + + Ernest Baker + phillipwhite@hotmail.com + + + George Myers + frank52@hammond.com + + + Daniel Miller + joshua96@gmail.com + + + Jonathan Ayala + jerryharris@davis.net + + + Jill Stone + pwright@hotmail.com + + + Trevor Richard + mreed@thompson.org + + + Jason Thomas + josephflowers@hotmail.com + + + Arthur Thomas + lnelson@hicks.com + + + Austin Collins + ambermann@barnes.com + + + Jason Diaz + ericreyes@hotmail.com + + + Darryl Hall + faithdixon@barnes-burgess.org + + + Jason Thomas + brittany32@yahoo.com + + + John Sanders + 
waltontheresa@hotmail.com + + + Lisa Hayes + victor14@hotmail.com + + + Chelsea Wong + iwatkins@williams-solomon.com + + + Joseph Fitzgerald + mary86@hotmail.com + + + Crystal Schroeder + kbarron@wilson-flynn.org + + + Denise Bean + noah23@gmail.com + + + Jamie Atkins + cwebb@hotmail.com + + + Joshua Kim + esmith@ramirez.com + + + Deanna Mooney + jason13@turner.com + + + Jasmine Baker + torresjacob@braun.com + + + Victoria Williams + rwilliams@hotmail.com + + + Sandra Hall + williamsonrichard@gmail.com + + + Miranda Mcpherson + xrussell@barajas.biz + + + Samantha Walton + danielle73@gmail.com + + + Kyle Serrano + stonecassandra@mcfarland.info + + + Mr. Bruce Maldonado DDS + diazmatthew@yahoo.com + + + Amber Fisher + jonesdavid@rubio.info + + + Brett Berry + millerteresa@gmail.com + + + Cory Bradley + umatthews@summers.com + + + Ryan Peters + shepherdmonique@gmail.com + + + Laura Lee + lfleming@higgins.com + + + Christian Smith + johnnymartinez@castro-miller.com + + + Kelly Hanson + velazquezsandra@chavez-malone.info + + + Brian King + hwood@yahoo.com + + + Cynthia Owens + sbrown@hotmail.com + + + Lisa Clark + derek74@bell-martinez.com + + + Brenda Ford + kevin55@hotmail.com + + + Daniel Brady + wbennett@hotmail.com + + + Jake Wilson + lorraine60@solis.biz + + + April Cole + halltyler@yahoo.com + + + Melissa Callahan + cmckenzie@rodriguez.info + + + Taylor Brown + davisadam@gmail.com + + + Patrick Guerrero + hannah48@delgado.net + + + Brian Gonzalez + burchmalik@johnson.com + + + Robert Bailey + debbiemoore@hotmail.com + + + Jesus Maynard + gene45@gmail.com + + + Linda Greer + johnharris@reed-allen.net + + + Travis Thomas + bryantrachel@gmail.com + + + Vicki Mitchell + edaniels@hotmail.com + + + Paula Espinoza + donnameyer@dennis.org + + + James Hoffman + haustin@larson-wiggins.biz + + + Ashlee Perkins + stevenknapp@miller.com + + + Rebecca Leon + smitchell@simpson-johnson.com + + + Jorge Williams + shawn36@peters-meadows.com + + + Bob Flores + 
kellercourtney@yahoo.com + + + Lisa Miller + johnsoncrystal@gmail.com + + + Brandon Davis + bryanpetersen@hotmail.com + + + Joshua Daugherty + josehayes@carey.com + + + Justin Wise + pamelacosta@simmons-morrow.com + + + Kimberly Johnson + combssandra@deleon.com + + + Toni Stone + eestrada@charles.com + + + Julie Rivers + rwilliams@castillo-nelson.org + + + Kelly Scott + danielsmith@hotmail.com + + + Michael Carr + clarklisa@newman-barrett.com + + + Jonathan Vaughn + dennisrebecca@lawrence-harris.com + + + Erica Lowe + wilsonkelly@hotmail.com + + + Kimberly Clark + jose15@gmail.com + + + Lindsey Robertson + rdickerson@yahoo.com + + + Cindy Anderson + gmorton@daniels.com + + + Tami Barber + harveykaren@hotmail.com + + + Tiffany Wu + jessica90@gmail.com + + + Edward Bowers + hallkathy@gmail.com + + + Shawn Collier + rhondasmith@hotmail.com + + + Michael Cox + usimpson@graham-cunningham.net + + From c74c87e63d17ae152096b29ba8766339c04837e4 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 22 Jan 2019 14:36:01 -0500 Subject: [PATCH 039/640] Comment out section that has no content --- editors.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/editors.md b/editors.md index 68a5b43b..91be3421 100644 --- a/editors.md +++ b/editors.md @@ -278,9 +278,11 @@ emulation. [sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server locally, and then you can use a local editor. +{% comment %} # Resources TODO resources for other editors? +{% endcomment %} # Exercises From e93a5176ba2943a4ee12e75c8572c65f42ff299d Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 22 Jan 2019 14:36:33 -0500 Subject: [PATCH 040/640] Add bash --- editors.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/editors.md b/editors.md index 91be3421..a52ddfb1 100644 --- a/editors.md +++ b/editors.md @@ -262,11 +262,12 @@ Lists of plugins: For many popular editors (e.g. vim and emacs), many other tools support editor emulation. 
-- `~/.inputrc` - - `set editing-mode vi` - Shell - - `export EDITOR=vim` (environment variable used by programs like `git`) + - bash: `set -o vi` - zsh: `bindkey -v` + - `export EDITOR=vim` (environment variable used by programs like `git`) +- `~/.inputrc` + - `set editing-mode vi` ## Resources From 435e67d1525fc07388f7eb6731d32f75b4fea0c5 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 22 Jan 2019 14:37:53 -0500 Subject: [PATCH 041/640] Update --- editors.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/editors.md b/editors.md index a52ddfb1..fdcf2643 100644 --- a/editors.md +++ b/editors.md @@ -89,13 +89,15 @@ Escape. ### Basics -Vim Ex-commands are issued through `:{command}` in normal mode. +Vim ex commands are issued through `:{command}` in normal mode. - `:q` quit (close window) - `:w` save - `:wq` save and quit - `:e {name of file}` open file for editing -- `:help topic` open help +- `:help {topic}` open help + - `:help :w` opens help for the `:w` ex command + - `:help w` opens help for the `w` movement ### Movement @@ -150,6 +152,7 @@ powerful, composable commands). - visual mode + manipulation - select text, `d` to delete it or `c` to change it - `u` to undo, ` Date: Tue, 22 Jan 2019 14:45:46 -0500 Subject: [PATCH 042/640] Fix typos --- editors.md | 4 +- example-data.xml | 1000 ---------------------------------------------- 2 files changed, 2 insertions(+), 1002 deletions(-) delete mode 100644 example-data.xml diff --git a/editors.md b/editors.md index fdcf2643..82f403bd 100644 --- a/editors.md +++ b/editors.md @@ -228,10 +228,10 @@ better way of doing this", there probably is: look it up online. 
- `qe^r"f>s": "fq` - Macro to format a person - Go to line with `` - - `qpS{j@ej@eA,jS},q` + - `qpS{j@eA,j@ejS},q` - Macro to format a person and go to the next person - Go to line with `` - - `qq@pjq`j@ej@eA,jj@ej@eA,jj@ej@eA,j + - `qq@pjq` - Execute macro until end of file - `999@q` - Manually remove last `,` and add `[` and `]` delimiters diff --git a/example-data.xml b/example-data.xml deleted file mode 100644 index e973e1cf..00000000 --- a/example-data.xml +++ /dev/null @@ -1,1000 +0,0 @@ - - Johnny Zhang Jr. - amyalvarez@cole.com - - - Edward Cook - dsparks@alvarez-dunn.com - - - Stephen Sweeney - dlewis@gmail.com - - - Krystal Riley - jflores@wright.biz - - - Ashley Robinson - robertsmichael@yahoo.com - - - Kimberly Brooks - sharoncunningham@larson.com - - - Brent Proctor - edward86@stewart.com - - - William Roberts - parkertodd@webb.com - - - Amanda Morales - lorizavala@hodges.com - - - Bryan Poole Jr. - carolyn56@gray-campos.net - - - Dale Hall - martinjames@yahoo.com - - - Isabella Reynolds - wbowen@wallace.com - - - Ann Rodriguez - charles37@taylor-riley.biz - - - Bryan Davis - jessica60@hotmail.com - - - Dalton Powell - piercenatasha@yahoo.com - - - Scott Turner - harold68@yahoo.com - - - Nicholas Castillo - dawnstephens@robinson.info - - - Joseph Pierce - lukepatterson@hotmail.com - - - Robyn White - jenniferrobinson@hotmail.com - - - Justin Rice - brandi76@gmail.com - - - Jamie Graham - harrisdavid@yahoo.com - - - Phillip Schmidt - stephanie33@gmail.com - - - John Baker - todd86@hotmail.com - - - Sharon Austin - srivera@yahoo.com - - - Erica Avila - jenniferreed@bowers-wilson.com - - - Jeremy Bass - jdavis@collins.com - - - Joshua Parsons - stephaniecoleman@miller-barker.com - - - Emma Mccoy - taylorjohn@wagner.net - - - Megan Williams - ronnie54@gmail.com - - - Michael Sutton - connie58@mendoza.net - - - Nicholas York - kennedykevin@collins.com - - - Donald Robles - williamsbrandon@gmail.com - - - Melissa Allen - pproctor@ramos-patel.com - - - 
Shannon Jones - beckkathleen@johnson.com - - - David White - sandra73@thompson.com - - - Jonathan Thomas - johnsonjeremy@gmail.com - - - Rachael Floyd - amanda78@johnson.info - - - Tina Carter - josewells@jones.net - - - Eric Johnson - bowersaustin@hernandez-edwards.com - - - William Kramer - rhunt@johnson.com - - - Nathan Williams - cynthiayoung@hotmail.com - - - Patty Schwartz - salinasdavid@sheppard.biz - - - David Collins - pcalhoun@yahoo.com - - - James Thomas - brianfox@rogers-cruz.com - - - Mark Casey - jerry88@graham.com - - - Robert Galloway - cherylmcgee@hotmail.com - - - Caitlin Dunn - nicholemartin@yahoo.com - - - Nancy Allison - martha33@molina-bullock.com - - - Marvin Burns - wrocha@gmail.com - - - Kimberly Jones - anitamunoz@french-christian.com - - - Caitlin Wood - thomasrandall@bowers-sullivan.org - - - Sara Burton - riosangelica@gmail.com - - - Jessica Roberson - theresa11@hotmail.com - - - Nicole Macias - kevinhodge@martin.biz - - - Christina Williams - shawn35@rice-bailey.org - - - Cody Winters - nicholassmith@barron-wu.com - - - Patricia Miller DDS - pierceraymond@watkins.org - - - Jennifer Lyons - vrivera@gmail.com - - - Jerry Rojas - jacobalexander@yahoo.com - - - Matthew Perez - jrivas@hotmail.com - - - Patrick Hogan - moorelisa@yahoo.com - - - Lisa Howard - stephen90@smith.biz - - - Justin Sloan - edwardsmichael@hotmail.com - - - Suzanne Morrow - shane74@yahoo.com - - - Theresa Lara - maryrichardson@clark.com - - - Christopher Powers - yfowler@davis-lee.net - - - Teresa Howell - amy15@yahoo.com - - - Richard Shelton - ksmith@yahoo.com - - - Jeremy Cole - bleach@gmail.com - - - Melissa Clark - rosejeffrey@yahoo.com - - - Kimberly Mcdaniel - ularson@ross-david.com - - - Kelly Dixon - gatesstephen@hotmail.com - - - Devin Quinn - wjohnson@hotmail.com - - - Kevin Greene - lhanson@hotmail.com - - - Jeffery Wiggins - amy76@gmail.com - - - Latoya Allen - vking@yahoo.com - - - Zachary Walker - diazjames@hotmail.com - - - Alyssa Molina - 
elizabeth59@gmail.com - - - Heather Miranda - davidturner@cortez-martinez.biz - - - Lori Gardner - murphytaylor@yahoo.com - - - Jessica Simpson - jamesdean@rosales.com - - - Anna Dickerson - abigailmurphy@hotmail.com - - - Molly Oconnor - morrisrhonda@yahoo.com - - - Brandi Braun - ericksonmatthew@jenkins.org - - - Renee Flowers - brownantonio@yang-crosby.org - - - Cassandra Compton - progers@yahoo.com - - - David Gilbert - vickie78@gmail.com - - - Brenda Davis - cynthiajones@thornton.com - - - Nicholas Rivera - longalyssa@yahoo.com - - - Dustin Hodges - sgolden@lee.com - - - Chad Wong - williambernard@mccarty.net - - - Robin Craig - xbyrd@austin.com - - - Heather Parker - allenjoshua@rodriguez.com - - - Jennifer Roberts - manningtravis@gmail.com - - - James Andrews - ginaromero@hotmail.com - - - Dorothy Hines - dsmith@thomas.com - - - Stephen Garcia - hughesbrendan@hotmail.com - - - Alfred Ellis - elizabeth41@crawford.info - - - Marilyn White - victoriaford@hotmail.com - - - Brian Graves - cpatel@gmail.com - - - Elizabeth Wagner - newtonwesley@cohen.com - - - Michelle Flores - shelbygross@duke-thomas.info - - - Larry Russell - richard99@meyer.com - - - Terrence Boyd - markmartin@flores.com - - - Jessica Carroll - eric30@yahoo.com - - - Erin Dean - toddmartin@guerra.biz - - - Craig Hernandez - joshualang@gonzalez.com - - - Amber Choi - doughertynancy@harmon.org - - - Renee Brown - terribeard@archer-gibson.info - - - Curtis Turner - pjohnson@hotmail.com - - - Benjamin Reed - marksmith@austin.net - - - Christina Fernandez - richardjoseph@esparza-peters.com - - - Jasmine Campbell - thomasmatthew@gmail.com - - - Catherine Bond - coreyroberts@gonzalez.com - - - Connie Jones - koneal@riley.com - - - Cody Taylor - kelsey99@hotmail.com - - - Kendra Gray - walkerrussell@hotmail.com - - - Alexander Murray - grossrobert@hotmail.com - - - Arthur Jackson - travis73@hotmail.com - - - Dr. 
William Vasquez DDS - gonzalezdaniel@hotmail.com - - - April Hampton - desireemorris@mcguire.info - - - Gerald Hunter - justin91@ross-scott.biz - - - Morgan Bolton - erika30@lloyd-smith.biz - - - Angela Barker - daniel17@carr.com - - - Angela Montgomery - jonathangoodwin@smith-perez.com - - - Yolanda Henry - shawnmcguire@gmail.com - - - Susan Hines - sarahbailey@wallace.com - - - Michelle Young - lewismichele@yahoo.com - - - Glen Hood - ljackson@vazquez.com - - - Christopher Wright - evansjulie@walton.com - - - Susan Guzman DDS - medinaelizabeth@gmail.com - - - Barbara Cortez - bchavez@cameron.com - - - Stacey Hammond - nancyturner@stewart.com - - - Amanda Stout - macdonaldlatoya@hotmail.com - - - Lisa Johnson - wnolan@gmail.com - - - Carlos Wyatt - iperez@cohen.com - - - Samantha Brewer - thomas47@hotmail.com - - - Brett Jackson - zpowell@cruz-rivera.com - - - Johnny Guzman - tmerritt@yahoo.com - - - Mary Davis - collinslisa@hotmail.com - - - Willie Mccoy - joshua20@terrell.biz - - - Kelsey Rivera - randy72@gmail.com - - - Melissa Maddox - christopher13@gmail.com - - - Jason Rodriguez - kellypierce@harris.com - - - Donna Walsh - wardraymond@martinez.com - - - Monique Patel - cynthia75@james.net - - - Dr. Lindsay Farrell PhD - brownmaria@gmail.com - - - Ann Ruiz - jeremiah94@pennington.org - - - Mary Alexander - catherineharper@munoz.org - - - Brittany Russell - haileywinters@russell-coffey.net - - - Dominique Rosales - matthewpatterson@carr.com - - - Henry Waters - karen72@logan.com - - - Jared Weaver - karlafletcher@baldwin.org - - - Mr. 
From c954bf9dd0284c2f5b5dc66c8293cecda6d63f44 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 22 Jan 2019 14:48:39 -0500 Subject: [PATCH 043/640] Add mention of buffers --- editors.md | 1 + 1 file changed, 1 insertion(+) diff --git a/editors.md b/editors.md index 82f403bd..9884ab8e 100644 --- a/editors.md +++ b/editors.md @@ -95,6 +95,7 @@ Vim ex commands are issued through `:{command}` in normal mode. - `:w` save - `:wq` save and quit - `:e {name of file}` open file for editing +- `:ls` show open buffers - `:help {topic}` open help - `:help :w` opens help for the `:w` ex command - `:help w` opens help for the `w` movement From 14a48f0d8fa08ccc8ab34ab7c2a6f451c644729d Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 22 Jan 2019 15:20:54 -0500 Subject: [PATCH 044/640] Some version control exercises --- version-control.md | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/version-control.md b/version-control.md index 1255aff5..5b36d26d 100644 --- a/version-control.md +++ b/version-control.md @@ -326,10 +326,19 @@ if your push is rejected, what do you do? # Exercises + - On a repo try modifying an existing file. What happens when you do `git stash`? What do you see when running `git log --all --oneline`? Run `git stash pop` to undo what you did with `git stash`. In what scenario might this be useful? + + - One common mistake when learning git is to commit large files that should not be managed by git or adding sensitive information. 
Try adding a file to a repository, making some commits and then deleting that file from history (you may want to look at [this](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)). Also if you do want git to manage large files for you, look into [Git-LFS](https://git-lfs.github.com/) + + - Git is really convenient for undoing changes but one has to be familiar even with the most unlikely changes + - If a file is mistakenly modified in some commit it can be reverted with `git revert`. However if a commit involves several changes `revert` might not be the best option. How can we use `git checkout` to recover a file version from a specific commit? + - Create a branch, make a commit in said branch and then delete it. Can you still recover said commit? Try looking into `git reflog`. (Note: Recover dangling things quickly, git will periodically automatically clean up commits that nothing points to.) + - If one is too trigger happy with `git reset --hard` instead of `git reset` changes can be easily lost. However since the changes were staged, we can recover them. (look into `git fsck --lost-found` and `.git/lost-found`) + + - In any git repo look under the folder `.git/hooks` you will find a bunch of scripts that end with `.sample`. If you rename them without the `.sample` they will run based on their name. For instance `pre-commit` will execute before doing a commit. 
Experiment with them + - forced push + `--force-with-lease` - git merge/rebase --abort - - git stash - - git reflog - git hooks - .gitconfig + aliases - git blame From ae7726c8c8eb42bee0d09df36e769db0912b611c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 23 Jan 2019 14:01:21 -0500 Subject: [PATCH 045/640] Add more content --- dotfiles.md | 38 ++++++++++++++++++++++++++++++++------ 1 file changed, 32 insertions(+), 6 deletions(-) diff --git a/dotfiles.md b/dotfiles.md index 857a4855..401f1b13 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -37,8 +37,8 @@ copy configurations though). # Organization How should you organize your dotfiles? They should be in their own folder, -under version control, and symlinked into place using a script. This has the -benefits of: +under version control, and **symlinked** into place using a script. This has +the benefits of: - **Easy installation**: if you log in to a new machine, applying your customizations will only take a minute @@ -49,6 +49,26 @@ in sync for your entire programming career, and version history is nice to have for long-lived projects +```shell +cd ~/src +mkdir dotfiles +cd dotfiles +git init +touch bashrc +# create a bashrc with some settings, e.g.: +# PS1='\w > ' +touch install +chmod +x install +# insert the following into the install script: +# #!/usr/bin/env bash +# BASEDIR=$(dirname $0) +# cd $BASEDIR +# +# ln -s ${PWD}/bashrc ~/.bashrc +git add bashrc install +git commit -m 'Initial commit' +``` + # Advanced topics ## Machine-specific customizations @@ -68,8 +88,8 @@ If the configuration file supports it, use the equivalent of if-statements to apply machine specific customizations. For example, your shell could have a line like: -``` -if [[ "$(uname)" == "Darwin" ]]; then {do something}; fi +```shell +if [[ "$(uname)" == "Darwin" ]]; then {do_something}; fi ``` ### Includes @@ -88,6 +108,10 @@ machine-specific settings. 
# Resources +- Your instructors' dotfiles: + [Anish](https://github.com/anishathalye/dotfiles), + [Jon](https://github.com/jonhoo/configs), + [Jose](https://github.com/jjgo/dotfiles) - [GitHub does dotfiles](http://dotfiles.github.io/): dotfile frameworks, utilities, examples, and tutorials - [Shell startup @@ -96,8 +120,8 @@ utilities, examples, and tutorials # Exercises -1. Create a folder for your dotfiles (and set up version control, or wait till - we [cover that](/version-control/) in lecture). +1. Create a folder for your dotfiles and set up [version + control](/version-control/). 1. Add a configuration for at least one program, e.g. your shell, with some customization (to start off, it can be something as simple as customizing @@ -111,3 +135,5 @@ utilities, examples, and tutorials 1. Test your installation script on a fresh virtual machine. 1. Migrate all of your current tool configurations to your dotfiles repository. + +1. Publish your dotfiles on GitHub. From f360cd11fd8b70556e4cc49b41f67788531b3cb0 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 23 Jan 2019 14:09:03 -0500 Subject: [PATCH 046/640] Add link to dotfiles repositories --- dotfiles.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/dotfiles.md b/dotfiles.md index 401f1b13..f64d3267 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -30,7 +30,9 @@ You can learn about your tool's settings by reading online documentation or search the internet for blog posts about specific programs, where authors will tell you about their preferred customizations. Yet another way to learn about customizations is to look through other people's dotfiles: you can find tons of -dotfiles repositories on GitHub --- see the most popular one +[dotfiles +repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories) +on GitHub --- see the most popular one [here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly copy configurations though). 
From 03cc4ae440476a1fca5aae897a262a0ac8785fea Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 23 Jan 2019 14:53:44 -0500 Subject: [PATCH 047/640] Add GDB --- program-introspection.md | 47 +++++++++++++++++++++++++++++++++++++++- static/media/example.c | 27 +++++++++++++++++++++++ 2 files changed, 73 insertions(+), 1 deletion(-) create mode 100644 static/media/example.c diff --git a/program-introspection.md b/program-introspection.md index 29fde930..7608059d 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -4,4 +4,49 @@ title: "Program Introspection" presenter: Anish --- -Lecture notes will be available by the start of lecture. +# Debugging + +When printf-debugging isn't good enough: use a debugger. + +Debuggers let you interact with the execution of a program, letting you do +things like: + +- halt execution of the program when it reaches a certain line +- single-step through the program +- inspect values of variables +- many more advanced features + +## GDB/LLDB + +[GDB](https://www.gnu.org/software/gdb/) and [LLDB](https://lldb.llvm.org/). +Supports many C-like languages. + +Let's look at [example.c](/static/media/example.c). Compile with debug flags: +`gcc -g -o example example.c`. + +Open GDB: + +`gdb example` + +Commands: + +- `run` +- `b {name of function}` - set a breakpoint +- `b {file}:{line}` - set a breakpoint +- `c` - continue +- `step` / `next` / `finish` - step in / step over / step out +- `p {variable}` - print value of variable +- `watch {expression}` - set a watchpoint that triggers when the value of the expression changes +- `rwatch {expression}` - set a watchpoint that triggers when the value is read + +## PDB + +## Web browser Developer Tools + +# Profiling + +Types of profiling: CPU, memory, etc. 
+ +## Go + +## Perf diff --git a/static/media/example.c b/static/media/example.c new file mode 100644 index 00000000..85505028 --- /dev/null +++ b/static/media/example.c @@ -0,0 +1,27 @@ +#include + +const char *numbers[] = { + "one", + "two", + "three", + "four", + "five", + "six", + "seven", + "eight", + "nine", + "ten" +}; + +void say(int i) +{ + const char *msg = numbers[i-1]; + printf("%s\n", msg); +} + +int main() +{ + for (int i = 1; i <= 10; i++) { + say(i); + } +} From 8be73c402e066b36c9b4f3a8a1f1a53d377805ed Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 23 Jan 2019 15:06:53 -0500 Subject: [PATCH 048/640] Add more on debuggers --- program-introspection.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/program-introspection.md b/program-introspection.md index 7608059d..46444d2a 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -41,8 +41,15 @@ Commands: ## PDB +[PDB](https://docs.python.org/3/library/pdb.html) is the Python debugger. + +Insert `import pdb; pdb.set_trace()` where you want to drop into PDB, basically +a hybrid of a debugger (like GDB) and a Python shell. + ## Web browser Developer Tools +Another example of a debugger, this time with a graphical interface. + # Profiling Types of profiling: CPU, memory, etc. From 728daa3b7cb2360213430b062014c8ed89e2cd8b Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 23 Jan 2019 15:24:40 -0500 Subject: [PATCH 049/640] Add more content --- program-introspection.md | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/program-introspection.md b/program-introspection.md index 46444d2a..d45bb352 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -50,10 +50,30 @@ a hybrid of a debugger (like GDB) and a Python shell. Another example of a debugger, this time with a graphical interface. +# strace + +Observe system calls a program makes: `strace {program}`. + # Profiling Types of profiling: CPU, memory, etc. 
+Simplest profiler: `time`. + ## Go +Run test code with CPU profiler: `go test -cpuprofile=cpu.out` + +Analyze profile: `go tool pprof -web cpu.out` + +Run test code with memory profiler: `go test -memprofile=mem.out` + +Analyze profile: `go tool pprof -web mem.out` + ## Perf + +Basic performance stats: `perf stat {command}` + +Run a program with the profiler: `perf record {command}` + +Analyze profile: `perf report` From 0b893bdcd3c46670a6ccdb2e82b8b827ea2dc56a Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 10:36:18 -0500 Subject: [PATCH 050/640] Write backup notes --- backups.md | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/backups.md b/backups.md index 3e13bc5e..0864ea25 100644 --- a/backups.md +++ b/backups.md @@ -4,4 +4,45 @@ title: "Backups" presenter: Jose --- -Lecture notes will be available by the start of lecture. +There are two types of people: + +- Those who do backups +- Those who will do backups + +Any data you own that you haven't backed up is data that could be gone at any moment, forever. Here we will cover some good backup basics and the pitfalls of some approaches. + +## 3-2-1 Rule + +The [3-2-1 rule](https://www.us-cert.gov/sites/default/files/publications/data_backup_options.pdf) is a general recommended strategy for backing up your data. It states that you should have: + +- at least **3 copies** of your data +- **2** copies in **different mediums** +- **1** of the copies being **offsite** + +The main idea behind this recommendation is not to put all your eggs in one basket. Having 2 different devices/disks ensures that a single hardware failure doesn't take away all your data. Similarly, if you store your only backup at home and the house burns down or gets robbed you lose everything, that's what the offsite copy is there for. Onsite backups give you availability and speed, offsite give you the resiliency should a disaster happen. 
+ +## Testing your backups + +A common pitfall when performing backups is blindly trusting whatever the system says it's doing and not verifying that the data can be properly recovered. Toy Story 2 was almost lost and their backups were not working, [luck](https://www.youtube.com/watch?v=8dhp_20j0Ys) ended up saving them. + +## Versioning + +You should understand that [RAID](https://en.wikipedia.org/wiki/RAID) is not a backup, and in general **mirroring is not a backup solution**. Simply syncing your files somewhere does not help in many scenarios such as: + +- Data corruption +- Malicious software +- Deleting files by mistake + +If the changes on your data propagate to the backup then you won't be able to recover in these scenarios. Note that this is the case for a lot of cloud storage solutions like Dropbox, Google Drive, One Drive, &c. Some of them do keep deleted data around for short amounts of time but usually the interface to recover is not something you want to be using to recover large amounts of files. + +A proper backup system should be versioned in order to prevent this failure mode. By providing different snapshots in time one can easily navigate them to restore whatever was lost. The most widely known software of this kind is macOS Time Machine. + +## Deduplication + +However, making several copies of your data might be extremely costly in terms of disk space. Nevertheless, from one version to the next, most data will be identical and need not be transferred again. This is where [data deduplication](https://en.wikipedia.org/wiki/Data_deduplication) comes into play, by keeping track of what has already been stored one can do **incremental backups** where only the changes from one version to the next need to be stored. This significantly reduces the amount of space needed for backups beyond the first copy. 
+ +## Encryption + +Since we might be backing up to untrusted third parties like cloud providers it is worth considering that if your data is copied *as is* then it could potentially be looked at by unwanted agents. Documents like your taxes are sensitive information that should not be backed up in plain format. To prevent this, many backup solutions offer **client side encryption** where data is encrypted before being sent to the server. That way the server cannot read the data it is storing but you can decrypt it with your secret key. + +As a side note, if your disk (or home partition) is not encrypted, then anyone that get ahold of your computer can manage to override the user access controls and read your data. Modern hardware supports fast and efficient read and writes of encrypted data so you might want to consider enabling **full disk encryption**. From 599c6bbd4d6dc65351e93c64b526a234240f2d44 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 10:47:29 -0500 Subject: [PATCH 051/640] Include extra bk considerations and web backup --- backups.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/backups.md b/backups.md index 0864ea25..a3a10250 100644 --- a/backups.md +++ b/backups.md @@ -46,3 +46,25 @@ However, making several copies of your data might be extremely costly in terms o Since we might be backing up to untrusted third parties like cloud providers it is worth considering that if your data is copied *as is* then it could potentially be looked at by unwanted agents. Documents like your taxes are sensitive information that should not be backed up in plain format. To prevent this, many backup solutions offer **client side encryption** where data is encrypted before being sent to the server. That way the server cannot read the data it is storing but you can decrypt it with your secret key. 
As a side note, if your disk (or home partition) is not encrypted, then anyone that get ahold of your computer can manage to override the user access controls and read your data. Modern hardware supports fast and efficient read and writes of encrypted data so you might want to consider enabling **full disk encryption**. + + + +## Additional considerations + +Some other things you may want to look into are: + +- **Periodic backups**: outdated backups can become pretty useless. Making backups regularly should be a consideration for your system +- **Bootable backups**: some programs allow you to clone your entire disk. That way you have an image that contains an entire copy of your system you can boot directly from. +- **Differential backup strategies**, you may not necessarily care the same about all your data. You can define different backup policies for different types of data. +- **Append only backups** an additional consideration is to enforce append only operations to your backup repositories in order to prevent malicious agents to delete them if they get ahold of your machine. + + +## Webservices + +Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playslists. Figuring out a backup solution in scenario is somewaht trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. + + +## Webpages + +Similarly, some high quality content can be found online in the form of webpages. If said content is static one can easily back it up by just saving the website and all of its attachments. Another alternative is the [Wayback Machine](https://archive.org/web/), a massive digital archive of the World Wide Web managed by the [Internet Archive](https://archive.org/), a non profit organization focused on the preservation of all sorts of media. 
The Wayback Machine allows you to capture and archive webpages being able to later retrieve all the snapshots that have been archived for that website. If you find it useful, consider [donating](https://archive.org/donate/) to the project. + From 7ca833f6b48541aa9ebb7599ac9806c92ba97849 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 10:49:34 -0500 Subject: [PATCH 052/640] Add backup resources and exercises --- backups.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/backups.md b/backups.md index a3a10250..296e3397 100644 --- a/backups.md +++ b/backups.md @@ -68,3 +68,19 @@ Not all the data that you use lives on your hard disk. If you use **webservices* Similarly, some high quality content can be found online in the form of webpages. If said content is static one can easily back it up by just saving the website and all of its attachments. Another alternative is the [Wayback Machine](https://archive.org/web/), a massive digital archive of the World Wide Web managed by the [Internet Archive](https://archive.org/), a non profit organization focused on the preservation of all sorts of media. The Wayback Machine allows you to capture and archive webpages being able to later retrieve all the snapshots that have been archived for that website. If you find it useful, consider [donating](https://archive.org/donate/) to the project. + +## Resources + +Some good backup programs and services we have used and can honestly recommend: + +- [Tarsnap](https://www.tarsnap.com/) - deduplicated, encrypted online backup service for the truly paranoid. +- [Borg Backup](https://borgbackup.readthedocs.io) - deduplicated backup program that supports compression and authenticated encryption. If you need a cloud provider [rsync.net](https://www.rsync.net/products/attic.html) has special offerings for borg/attic users. + + +## Exercises + +- Consider how you are (not) backing up your data and look into fixing/improving that. 
+ +- Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made solutions based on available APIs. + +- Think of a website you have visited repeatedly over the years and look it up in the [wayback machine](https://archive.org/web/), how many versions does it have? \ No newline at end of file From 6c6c98ce61d224d4210b095232fe33733741ce36 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 24 Jan 2019 10:51:01 -0500 Subject: [PATCH 053/640] machine introspection outline --- machine-introspection.md | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/machine-introspection.md b/machine-introspection.md index dac62a03..a2eb9aff 100644 --- a/machine-introspection.md +++ b/machine-introspection.md @@ -4,4 +4,34 @@ title: "Machine Introspection" presenter: Jon --- -Lecture notes will be available by the start of lecture. +`top` and `htop` + - `t` for tree view +`dstat` +`pstree -p` +`/var/log` + - `tail -f` +`dmesg` +`journalctl` + - `-u UNIT` + - `-f` follow + - `--full` + - `-b` +`df` and `du` +`/proc` +`sudo` +`ip` and `iw` + - `ping` + - `ip route` +`systemd` + - `systemctl enable/disable` + - `systemctl start/stop/restart` + - `systemctl status` + - `systemd-analyze` + - systemd unit files +`ss` +`locate` +`dmidecode`? +`tcpdump`? +`/boot`? +`iptables`? +`rsync`? From 98e0bcf19e7f874d05ee1b0cc82e5ab349615c93 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 10:52:07 -0500 Subject: [PATCH 054/640] Fix typos in backups --- backups.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/backups.md b/backups.md index 296e3397..2126e1df 100644 --- a/backups.md +++ b/backups.md @@ -61,7 +61,7 @@ Some other things you may want to look into are: ## Webservices -Not all the data that you use lives on your hard disk. 
If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playslists. Figuring out a backup solution in scenario is somewaht trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. +Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playlists. Figuring out a backup solution in scenario is somewhat trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. ## Webpages @@ -83,4 +83,4 @@ Some good backup programs and services we have used and can honestly recommend: - Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made solutions based on available APIs. -- Think of a website you have visited repeatedly over the years and look it up in the [wayback machine](https://archive.org/web/), how many versions does it have? \ No newline at end of file +- Think of a website you have visited repeatedly over the years and look it up in [archive.org](https://archive.org/web/), how many versions does it have? \ No newline at end of file From bd862f7e7101274781401a2d47e4fe170e40396b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 10:59:22 -0500 Subject: [PATCH 055/640] Add rsync/rclone --- backups.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backups.md b/backups.md index 2126e1df..da5bdb58 100644 --- a/backups.md +++ b/backups.md @@ -75,7 +75,8 @@ Some good backup programs and services we have used and can honestly recommend: - [Tarsnap](https://www.tarsnap.com/) - deduplicated, encrypted online backup service for the truly paranoid. 
- [Borg Backup](https://borgbackup.readthedocs.io) - deduplicated backup program that supports compression and authenticated encryption. If you need a cloud provider [rsync.net](https://www.rsync.net/products/attic.html) has special offerings for borg/attic users. - +- [rsync](https://rsync.samba.org/) is a utility that provides fast incremental file transfer. It is not a full backup solution. +- [rclone](https://rclone.org/) like rsync but for cloud storage providers such as Amazon S3, Dropbox, Google Drive, &c. Supports client side encryption of remote folders. ## Exercises From 202602aeeb87150c4487ac6ac24c4ea876dce9c5 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Thu, 24 Jan 2019 11:47:24 -0500 Subject: [PATCH 056/640] Finish lecture content for machine introspection --- machine-introspection.md | 141 +++++++++++++++++++++++++++++++-------- 1 file changed, 114 insertions(+), 27 deletions(-) diff --git a/machine-introspection.md b/machine-introspection.md index a2eb9aff..dcdd96b6 100644 --- a/machine-introspection.md +++ b/machine-introspection.md @@ -4,34 +4,121 @@ title: "Machine Introspection" presenter: Jon --- -`top` and `htop` - - `t` for tree view -`dstat` -`pstree -p` -`/var/log` - - `tail -f` -`dmesg` -`journalctl` - - `-u UNIT` - - `-f` follow - - `--full` - - `-b` -`df` and `du` -`/proc` -`sudo` -`ip` and `iw` - - `ping` - - `ip route` -`systemd` - - `systemctl enable/disable` - - `systemctl start/stop/restart` - - `systemctl status` - - `systemd-analyze` - - systemd unit files -`ss` -`locate` +Sometimes, computers misbehave. And very often, you want to know why. +Let's look at some tools that help you do that! + +But first, let's make sure you're able to do introspection. Often, +system introspection requires that you have certain privileges, like +being the member of a group (like `power` for shutdown). The `root` user +is the ultimate privilege; they can do pretty much anything. You can run +a command as `root` (but be careful!) using `sudo`. 
+ +## What happened? + +If something goes wrong, the first place to start is to look at what +happened around the time when things went wrong. For this, we need to +look at logs. + +Traditionally, logs were all stored in `/var/log`, and many still are. +Usually there's a file or folder per program. Use `grep` or `less` to +find your way through them. + +There's also a kernel log that you can see using the `dmesg` command. +This used to be available as a plain-text file, but nowadays you often +have to go through `dmesg` to get at it. + +Finally, there is the "system log", which is increasingly where all of +your log messages go. On _most_, though not all, Linux systems, that log +is managed by `systemd`, the "system daemon", which controls all the +services that run in the background (and much much more at this point). +That log is accessible through the somewhat inconvenient `journalctl` +tool if you are root, or part of the `admin` or `wheel` groups. + +For `journalctl`, you should be aware of these flags in particular: + + - `-u UNIT`: show only messages related to the given systemd service + - `--full`: don't truncate long lines (the stupidest feature) + - `-b`: only show messages from the latest boot (see also `-b -2`) + - `-n100`: only show last 100 entries + +## What is happening? + +If something _is_ wrong, or you just want to get a feel for what's going +on in your system, you have a number of tools at your disposal for +inspecting the currently running system: + +First, there's `top`, and the improved version `htop`, which show you +various statistics for the currently running processes on the system. +CPU use, memory use, process trees, etc. There are lots of shortcuts, +but `t` is particularly useful for enabling the tree view. You can also +see the process tree with `pstree` (+ `-p` to include PIDs). If you want +to know what those programs are doing, you'll often want to tail their +log files. 
`journalctl -f`, `dmesg -w`, and `tail -f` are your friends +here. + +Sometimes, you want to know more about the resources being used overall +on your system. [`dstat`](http://dag.wiee.rs/home-made/dstat/) is +excellent for that. It gives you real-time resource metrics for lots of +different subsystems like I/O, networking, CPU utilization, context +switches, and the like. `man dstat` is the place to start. + +If you're running out of disk space, there are two primary utilities +you'll want to know about: `df` and `du`. The former shows you the +status of all the partitions on your system (try it with `-h`), whereas +the latter measures the size of all the folders you give it, including +their contents (see also `-h` and `-s`). + +To figure out what network connections you have open, `ss` is the way to +go. `ss -t` will show all open TCP connections. `ss -tl` will show all +listening (i.e., server) ports on your system. `-p` will also include +which process is using that connection, and `-n` will give you the raw +port numbers. + + +## System configuration + +There are _many_ ways to configure your system, but we'll go through +two very common ones: networking and services. Most applications on your +system tell you how to configure them in their manpage, and usually it +will involve editing files in `/etc`, the system configuration +directory. + +If you want to configure your network, the `ip` command lets you do +that. Its arguments take on a slightly weird form, but `ip help command` +will get you pretty far. `ip addr` shows you information about your +network interfaces and how they're configured (IP addresses and such), +and `ip route` shows you how network traffic is routed to different +network hosts. Network problems can often be resolved purely through the +`ip` tool. There's also `iw` for managing wireless network interfaces. +`ping` is a handy tool for checking how deeply things are broken. 
Try +pinging a hostname (google.com), an external IP address (1.1.1.1), and +an internal IP address (192.168.1.1 or default gw). You may also want to +fiddle with `/etc/resolv.conf` to check your DNS settings (how hostnames +are resolved to IP addresses). + +To configure services, you pretty much have to interact with `systemd` +these days, for better or for worse. Most services on your system will +have a systemd service file that defines a systemd _unit_. These files +define what command to run when that service is started, how to stop +it, where to log things, etc. They're usually not too bad to read, and +you can find most of them in `/usr/lib/systemd/system/`. You can also +define your own in `/etc/systemd/system`. + +Once you have a systemd service in mind, you use the `systemctl` command +to interact with it. `systemctl enable UNIT` will set the service to +start on boot (`disable` removes it again), and `start`, `stop`, and +`restart` will do what you expect. If something goes wrong, systemd will +let you know, and you can use `journalctl -u UNIT` to see the +application's log. You can also use `systemctl status` to see how all +your system services are doing. If your boot feels slow, it's probably +due to a couple of slow services, and you can use `systemd-analyze` (try +it with `blame`) to figure out which ones. + +# Exercises + +`locate`? `dmidecode`? `tcpdump`? `/boot`? `iptables`? -`rsync`? +`/proc`? 
From 0de372cbde620dc4cecd9153ebc25ea813464bc5 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 12:43:07 -0500 Subject: [PATCH 057/640] Remove title from cli environment --- command-line.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/command-line.md b/command-line.md index 28c948e0..77780107 100644 --- a/command-line.md +++ b/command-line.md @@ -4,8 +4,6 @@ title: "Command-line environment" presenter: Jose --- -# Command-line Environment - ## Aliases & Functions As you can imagine it can become tiresome typing long commands that involve many flags or verbose options. Nevertheless, most shells support **aliasing**. For instance, an alias in bash has the following structure (note there is no space around the `=` sign): From ea57396b24345c5a6fa5788e24a034e298593b02 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 13:21:10 -0500 Subject: [PATCH 058/640] Rename automation lecture --- os-automation.md => automation.md | 0 schedule.md | 2 +- 2 files changed, 1 insertion(+), 1 deletion(-) rename os-automation.md => automation.md (100%) diff --git a/os-automation.md b/automation.md similarity index 100% rename from os-automation.md rename to automation.md diff --git a/schedule.md b/schedule.md index 38be52e4..2c9f70c7 100644 --- a/schedule.md +++ b/schedule.md @@ -29,7 +29,7 @@ blocks, with a 10 minute break in between. 
# Tuesday, 1/29 - [Package management and dependency management](/package-management/) -- [OS customization](/os-customization/) and [OS automation](/os-automation/) +- [OS customization](/os-customization/) and [automation](/automation/) # Thursday, 1/31 From a94a05cd8bc05c92a8d98b443f9b0e704ef68187 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 14:17:52 -0500 Subject: [PATCH 059/640] Move remotes lecture --- remotes.md | 7 +++++++ schedule.md | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) create mode 100644 remotes.md diff --git a/remotes.md b/remotes.md new file mode 100644 index 00000000..c2b99836 --- /dev/null +++ b/remotes.md @@ -0,0 +1,7 @@ +--- +layout: page +title: "Remote Machines" +presenter: Jose +--- + +Lecture notes will be available by the start of lecture. \ No newline at end of file diff --git a/schedule.md b/schedule.md index 2c9f70c7..d7f855ba 100644 --- a/schedule.md +++ b/schedule.md @@ -23,13 +23,13 @@ blocks, with a 10 minute break in between. 
# Thursday, 1/24 -- [Dotfiles](/dotfiles/) and [backups](/backups/) +- [Dotfiles](/dotfiles/), [backups](/backups/) and [automation](/automation/) - [Machine introspection](/machine-introspection/) and [program introspection](/program-introspection/) # Tuesday, 1/29 - [Package management and dependency management](/package-management/) -- [OS customization](/os-customization/) and [automation](/automation/) +- [OS customization](/os-customization/) and [Remote Machines](/remotes/) # Thursday, 1/31 From 5a1291691c09b2e6881cd388f3371d875d5b3e74 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 24 Jan 2019 14:18:05 -0500 Subject: [PATCH 060/640] Write notes for automation --- automation.md | 60 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 58 insertions(+), 2 deletions(-) diff --git a/automation.md b/automation.md index 2f4269f2..093ea25c 100644 --- a/automation.md +++ b/automation.md @@ -1,7 +1,63 @@ --- layout: page -title: "OS Automation" +title: "Automation" presenter: Jose --- -Lecture notes will be available by the start of lecture. +Sometims you write a script that does something but you want for it to run periodically, say a backup task. You can always write an *ad hoc* solution that runs in the background and comes online periodically. However, most UNIX systems come with the cron daemon which can run task with a frequency up to a minute based on simple rules. + +On most UNIX systems the cron daemon, `crond` will be running by default but you can always check using `ps aux | grep crond`. 
+
+## The crontab
+
+The configuration file for cron can be displayed by running `crontab -l` and edited by running `crontab -e`. The time format that cron uses is five space-separated fields, followed by the user (in system-wide crontabs only) and the command:
+
+- **minute** - The minute of the hour the command will run on,
+  a value between 0 and 59.
+- **hour** - The hour the command will run on, specified in
+  the 24 hour clock; values must be between 0 and 23 (0 is midnight).
+- **dom** - The Day of Month that you want the command run on, e.g. to
+  run a command on the 19th of each month, the dom would be 19.
+- **month** - The month the command will run on; it may be specified
+  numerically (1-12), or as the name of the month (e.g. May).
+- **dow** - The Day of Week that you want the command to be run on; it can
+  also be numeric (0-7, where both 0 and 7 mean Sunday) or the name of the day (e.g. sun).
+- **user** - The user who runs the command. This field appears only in
+  system-wide crontabs such as `/etc/crontab`; per-user crontabs edited with `crontab -e` omit it.
+- **command** - The command that you want run. This field may contain
+  multiple words or spaces.
+
+Note that an asterisk `*` matches every value, and an asterisk followed by a slash and a number matches every nth value, so `*/5` in the minute field means every five minutes. Some examples are
+
+```shell
+*/5 *    * *   *  # Every five minutes
+  0 *    * *   *  # Every hour, on the hour
+  0 9    * *   *  # Every day at 9:00 am
+  0 9-17 * *   *  # Every hour between 9:00am and 5:00pm
+  0 0    * *   5  # Every Friday at 12:00 am
+  0 0    1 */2 *  # Every other month, the first day, 12:00am
+```
+You can find many more examples of common crontab schedules in [crontab.guru](https://crontab.guru/examples.html)
+
+## Shell environment and logging
+
+A common pitfall when using cron is that it does not load the same environment scripts that common shells do, such as `.bashrc`, `.zshrc`, &c, and it does not log the output anywhere by default. Combined with the maximum frequency being one minute, it can become quite painful to debug cron scripts initially.
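One way to see this pitfall before a job ever runs is to approximate cron's sparse environment by hand. The following is a minimal sketch, assuming a POSIX `sh` and `env`; the helper name `cron_like_env` and the specific variables and `PATH` value are assumptions, since the exact environment differs between cron implementations.

```shell
#!/bin/sh
# Approximate the sparse environment a cron job starts with: clear everything
# with `env -i` and pass only a handful of basic variables.
# NOTE: the variable list and the PATH value are assumptions for illustration;
# real cron implementations differ in what they set.
cron_like_env() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="$(id -un)" \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        "$@"
}

# Print what a job would see. Nothing exported from your interactive shell
# survives, although the child shell may add a few variables of its own.
cron_like_env /bin/sh -c 'env | sort'
```

Running a script through this helper (for example `cron_like_env /path/to/cronscripts/every_minute.sh`) tends to surface missing `PATH` entries or unset variables much faster than waiting a minute for cron to fire.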
+
+To deal with the environment, make sure that you use absolute paths in all your scripts and explicitly set any environment variables the script needs, such as `PATH`. To simplify logging, a good recommendation is to write your crontab entries in a format like this (the `user` field applies only to system-wide crontabs such as `/etc/crontab`; omit it in a per-user crontab)
+
+
+```shell
+* * * * *   user  /path/to/cronscripts/every_minute.sh >> /tmp/cron_every_minute.log 2>&1
+```
+
+And write the script in a separate file. Remember that `>>` appends to the file and that `2>&1` redirects `stderr` to `stdout` (you might want to keep them separate, though).
+
+## Anacron
+
+One caveat of using cron is that if the computer is powered off or asleep when the cron script should run, then it is not executed. For frequent tasks this might be fine, but if a task runs less often, you may want to ensure that it is executed. [anacron](https://linux.die.net/man/8/anacron) works similarly to `cron`, except that the frequency is specified in days. Unlike cron, it does not assume that the machine is running continuously. Hence, it can be used on machines that aren't running 24 hours a day to run regular jobs such as daily, weekly, and monthly tasks.
+
+
+## Exercises
+
+- Make a script that looks every minute in your downloads folder for any file that is a picture (you can look into MIME types or use a regular expression to match common extensions) and moves them into your Pictures folder.
+
+- Write a cron script to weekly check for outdated packages in your system and prompts you to update them or updates them automatically.
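The advice in the logging section about absolute paths and an explicit `PATH` can be sketched as the body of the script that the crontab line invokes. This is only an illustration: the `PATH` value, the `log` helper, and the messages are assumptions, and the actual work is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical body for /path/to/cronscripts/every_minute.sh.
# Set PATH explicitly, because cron will not read ~/.bashrc or ~/.zshrc.
# (This particular value is an assumption; adjust it to your system.)
PATH=/usr/local/bin:/usr/bin:/bin
export PATH

# Prefix each message with a timestamp so the appended log stays readable.
log() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
}

log "job started"
# ... the actual work goes here, referring to files by absolute path ...
log "job finished"
```

Because the crontab entry already appends both stdout and stderr to the log file, the script only needs to print; it does not have to open the log itself.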
From da13634b7ff897fb770cdfc55aa5332b15a91dfa Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Thu, 24 Jan 2019 15:08:21 -0500 Subject: [PATCH 061/640] Add note --- program-introspection.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/program-introspection.md b/program-introspection.md index d45bb352..aae98c1f 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -28,7 +28,7 @@ Open GDB: `gdb example` -Commands: +Some commands: - `run` - `b {name of function}` - set a breakpoint @@ -38,6 +38,7 @@ Commands: - `p {variable}` - print value of variable - `watch {expression}` - set a watchpoint that triggers when the value of the expression changes - `rwatch {expression}` - set a watchpoint that triggers when the value is read +- `layout` ## PDB From 532a636590885e1d4a0aa2ee0911669ca92688d1 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Thu, 24 Jan 2019 17:15:01 -0500 Subject: [PATCH 062/640] Update schedule --- schedule.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/schedule.md b/schedule.md index d7f855ba..345d6ce1 100644 --- a/schedule.md +++ b/schedule.md @@ -24,11 +24,10 @@ blocks, with a 10 minute break in between. 
# Thursday, 1/24 - [Dotfiles](/dotfiles/), [backups](/backups/) and [automation](/automation/) -- [Machine introspection](/machine-introspection/) and [program introspection](/program-introspection/) +- [Machine introspection](/machine-introspection/) # Tuesday, 1/29 - -- [Package management and dependency management](/package-management/) +- [Program introspection](/program-introspection/) and [package/dependency management](/package-management/) - [OS customization](/os-customization/) and [Remote Machines](/remotes/) # Thursday, 1/31 From fcafb18b8432c82373f43c62e2b70369de5c3d15 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:20:09 -0500 Subject: [PATCH 063/640] Make exercises use number bullets --- automation.md | 6 +++--- backups.md | 8 +++++--- command-line.md | 12 ++++++------ shell.md | 2 +- 4 files changed, 15 insertions(+), 13 deletions(-) diff --git a/automation.md b/automation.md index 093ea25c..8ca371ce 100644 --- a/automation.md +++ b/automation.md @@ -4,7 +4,7 @@ title: "Automation" presenter: Jose --- -Sometims you write a script that does something but you want for it to run periodically, say a backup task. You can always write an *ad hoc* solution that runs in the background and comes online periodically. However, most UNIX systems come with the cron daemon which can run task with a frequency up to a minute based on simple rules. +Sometimes you write a script that does something but you want for it to run periodically, say a backup task. You can always write an *ad hoc* solution that runs in the background and comes online periodically. However, most UNIX systems come with the cron daemon which can run task with a frequency up to a minute based on simple rules. On most UNIX systems the cron daemon, `crond` will be running by default but you can always check using `ps aux | grep crond`. 
@@ -58,6 +58,6 @@ One caveat of using cron is that if the computer is powered off or asleep when t ## Exercises -- Make a script that looks every minute in your downloads folder for any file that is a picture (you can look into MIME types or use a regular expression to match common extensions) and moves them into your Pictures folder. +1. Make a script that looks every minute in your downloads folder for any file that is a picture (you can look into [MIME types](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types) or use a regular expression to match common extensions) and moves them into your Pictures folder. -- Write a cron script to weekly check for outdated packages in your system and prompts you to update them or updates them automatically. +1. Write a cron script to weekly check for outdated packages in your system and prompts you to update them or updates them automatically. diff --git a/backups.md b/backups.md index da5bdb58..22eec2ad 100644 --- a/backups.md +++ b/backups.md @@ -80,8 +80,10 @@ Some good backup programs and services we have used and can honestly recommend: ## Exercises -- Consider how you are (not) backing up your data and look into fixing/improving that. +1. Consider how you are (not) backing up your data and look into fixing/improving that. -- Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made solutions based on available APIs. +1. Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made solutions based on available APIs. -- Think of a website you have visited repeatedly over the years and look it up in [archive.org](https://archive.org/web/), how many versions does it have? \ No newline at end of file +1. 
Think of a website you have visited repeatedly over the years and look it up in [archive.org](https://archive.org/web/), how many versions does it have? + +1. One way to efficiently implement deduplication is to use hardlinks. Whereas symbolic link (also called soft link) is a file that points to another file or folder, a hardlink is a exact copy of the pointer (it uses the same inode and points to the same place in the disk). Thus if the original file is removed a symlink stops working whereas a hard link doesn't. However, hardlinks only work for files. Try using the command `ln` to create hard links and compare them to symlinks created with `ln -s`. (In macOS you will need to install the gnu coreutils or the hln package). \ No newline at end of file diff --git a/command-line.md b/command-line.md index 77780107..f5fa5256 100644 --- a/command-line.md +++ b/command-line.md @@ -153,18 +153,18 @@ The [tldr](https://github.com/tldr-pages/tldr) command is a community driven doc ### `aunpack` vs `tar/unzip/unrar` -As [this xkcd](https://xkcd.com/1168/) references it can be quite tricky to remember the options and sometimes you need a different tool altogether such as `unrar` for rar files. +As [this xkcd](https://xkcd.com/1168/) references, it can be quite tricky to remember the options for `tar` and sometimes you need a different tool altogether such as `unrar` for .rar files. The [atool](https://www.nongnu.org/atool/) package provides the `aunpack` command which will figure out the correct options and always put the extracted archives in a new folder. ## Exercises -- Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get top 10 most used commands and consider writing sorter aliases for them -- Choose a terminal emulator and figure out how to change the following properties: +1. 
Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get top 10 most used commands and consider writing sorter aliases for them +1. Choose a terminal emulator and figure out how to change the following properties: - Font choice - Color scheme. How many colors does a standard scheme have? why? - Scrollback history size -- Install `fasd` or some similar software and write a bash/zsh function called `v` that performs fuzzy matching on the passed arguments and opens up the top result in your editor of choice. Then, modify it so that if there are multiple matches you can select them with `fzf`. -- Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kind of searches, investigate how to bind `fzf` to `^R`. You can find some info [here](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings) -- What does the `--bar` option do in `ack`? \ No newline at end of file +1. Install `fasd` or some similar software and write a bash/zsh function called `v` that performs fuzzy matching on the passed arguments and opens up the top result in your editor of choice. Then, modify it so that if there are multiple matches you can select them with `fzf`. +1. Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kind of searches, investigate how to bind `fzf` to `^R`. You can find some info [here](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings) +1. What does the `--bar` option do in `ack`? \ No newline at end of file diff --git a/shell.md b/shell.md index 7a2d30e9..62105ebd 100644 --- a/shell.md +++ b/shell.md @@ -248,7 +248,7 @@ Also, a double dash `--` is used in built-in commands and many other commands to ## Exercises -1. 
If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide) +1. If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide). If you want a more indepth introduction [The Linux Command Line](http://linuxcommand.org/tlcl.php) is a good resource. 1. **PATH, which, type** From e3468bea5322fc9fa929b9ce65bf1d3510d1d457 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:20:31 -0500 Subject: [PATCH 064/640] Add rename exercise --- data-wrangling.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/data-wrangling.md b/data-wrangling.md index 5dd8981f..e40aec21 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -341,20 +341,22 @@ rustup toolchain list | grep nightly | grep -vE "nightly-x86|01-17" | sed 's/-x8 # Exercises - - If you are not familiar with Regular Expressions +1. If you are not familiar with Regular Expressions [here](https://regexone.com/) is a short interactive tutorial that covers most of the basics - - How is `sed s/REGEX/SUBSTITUTION/g` different from the regular sed? +1. How is `sed s/REGEX/SUBSTITUTION/g` different from the regular sed? What about `/I` or `/m`? - - To do in-place substitution it is quite tempting to do something like +1. To do in-place substitution it is quite tempting to do something like `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a bad idea, why? Is this particular to `sed`? - - Look for boot messages that are _not_ shared between your past three +1. Implement a simple grep equivalent tool in a language you are familiar with using regex. If you want the output to be color highlighted like grep is, search for ANSI color escape sequences. +1. Sometimes some operations like renaming files can be tricky with raw commands like `mv` . 
`rename` is a nifty tool to achieve this and has a sed-like syntax. Try creating a bunch of files with spaces in their names and use `rename` to replace them with underscores. +1. Look for boot messages that are _not_ shared between your past three reboots (see `journalctl`'s `-b` flag). You may want to just mash all the boot logs together in a single file, as that may make things easier. - - Produce some statistics of your system boot time over the last ten - boots using the log timestamp of the messages +1. Produce some statistics of your system boot time over the last ten + boots using the log timestamp of the messages ``` Logs begin at ... ``` @@ -362,13 +364,13 @@ rustup toolchain list | grep nightly | grep -vE "nightly-x86|01-17" | sed 's/-x8 ``` systemd[577]: Startup finished in ... ``` - - Find the number of words (in `/usr/share/dict/words`) that contain at +1. Find the number of words (in `/usr/share/dict/words`) that contain at least three `a`s and don't have a `'s` ending. What are the three most common last two letters of those words? `sed`'s `y` command, or the `tr` program, may help you with case insensitivity. How many of those two-letter combinations are there? And for a challenge: which combinations do not occur? - - Find an online data set like [this +1. Find an online data set like [this one](https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm) or [this one](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-1). Maybe another one [from From 8a8965c6f0ca192b8bd04b5a99ba10688b7cb19b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:20:51 -0500 Subject: [PATCH 065/640] Add minor comments to dotfiles --- dotfiles.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/dotfiles.md b/dotfiles.md index f64d3267..731ca93b 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -23,6 +23,10 @@ You probably have some dotfiles set up already. 
Some places to look: - `~/.vim` - `~/.gitconfig` +Some programs don't put the files under your home folder directly and instead they put them in a folder under `~/.config`. + +Dotfiles are not exclusive to command line applications, for instance the [MPV](https://mpv.io/) video player can be configured editing files under `~/.config/mpv` + # Learning to customize tools You can learn about your tool's settings by reading online documentation or @@ -87,11 +91,17 @@ logically straightforward but can be pretty heavyweight. ### If statements If the configuration file supports it, use the equivalent of if-statements to -apply machine specific customizations. For example, your shell could have a line +apply machine specific customizations. For example, your shell could have something like: ```shell +if [[ "$(uname)" == "Linux" ]]; then {do_something else}; fi + +# Darwin is the architecture name for macOS systems if [[ "$(uname)" == "Darwin" ]]; then {do_something}; fi + +# You can also make it machine specific +if [[ "$(hostname)" == "myServer" ]]; then {do_something}; fi ``` ### Includes @@ -108,6 +118,15 @@ And then on each machine, `~/.gitconfig_local` can contain machine-specific settings. You could even track these in a separate repository for machine-specific settings. +This idea is also useful if you want different programs to share some configurations. For instance if you want both `bash` and `zsh` to share the same set of aliases you can write them under `.aliases` and have the following block in both. 
+ +```bash +# Test if ~/.aliases exists and source it +if [ -f ~/.aliases ]; then + source ~/.aliases +fi +``` + # Resources - Your instructors' dotfiles: From 9be5396a01e0e0479960aec11ac4d9a8f50a5b0a Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:21:18 -0500 Subject: [PATCH 066/640] Minor comments about editors --- editors.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/editors.md b/editors.md index 9884ab8e..ce245290 100644 --- a/editors.md +++ b/editors.md @@ -273,15 +273,15 @@ emulation. - `~/.inputrc` - `set editing-mode vi` +There are even vim keybinding extensions for web [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers) some popular one are [Vimium](https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en) for Google Chrome and [Tridactyl](https://github.com/tridactyl/tridactyl) for Firefox. + + ## Resources - [Vim Tips Wiki](http://vim.wikia.com/wiki/Vim_Tips_Wiki) - [Vim Advent Calendar](https://vimways.org/2018/): various Vim tips - -# Remote Editing - -[sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server -locally, and then you can use a local editor. +- [Neovim](https://neovim.io/) is a modern vim reimplementation with more active development. +- [Vim Golf](http://www.vimgolf.com/): Various Vim challenges {% comment %} # Resources @@ -295,7 +295,7 @@ TODO resources for other editors? Vim) and at least one GUI editor (e.g. Atom). Learn through tutorials like `vimtutor` (or the equivalents for other editors). To get a real feel for a new editor, commit to using it exclusively for a couple days while going - about your work. + about your work. 1. Customize your editor. Look through tips and tricks online, and look through other people's configurations (often, they are well-documented). @@ -304,4 +304,4 @@ TODO resources for other editors? 1. 
Commit to using a powerful editor for at least a couple weeks: you should start seeing the benefits by then. At some point, you should be able to get - your editor to work as fast as you think. + your editor to work as fast as you think. \ No newline at end of file From f15e23b67ebfe556cbcdb20b23ad273f047f440d Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:21:40 -0500 Subject: [PATCH 067/640] Draft security topics and exercises --- security.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/security.md b/security.md index 040fd251..095a4bb4 100644 --- a/security.md +++ b/security.md @@ -5,3 +5,28 @@ presenter: Jon --- Lecture notes will be available by the start of lecture. + + +{% comment %} + +Topics: + +- Encrypting files, encrypted volumes, full disk encryption +- Password managers +- Two factor authentication, paper keys +- HTTPS (HTTPS Every extension) +- Cookies (Firefox Multiaccount containers) +- VPNs, wireguard + +Exercises + +1. Encrypt a file using PGP +1. Use veracrypt to create a simple encrypted volume +1. Enable 2FA for your most data sensitive accounts i.e. GMail, Dropbox, Github, &c + + +References + +- [PrivacyTools.io](https://privacytools.io) + +{% endcomment %} \ No newline at end of file From 70cd8ed74fd36980d026f7cf61d621dd177b3c52 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:22:12 -0500 Subject: [PATCH 068/640] Draft os-customization exercises and reference --- os-customization.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/os-customization.md b/os-customization.md index 58b1b41b..a126d229 100644 --- a/os-customization.md +++ b/os-customization.md @@ -5,3 +5,26 @@ presenter: Anish --- Lecture notes will be available by the start of lecture. 
+
+
+{% comment %}
+
+Topics:
+
+- Hammerspoon
+- Bitbar / Polybar
+- Clipboard Manager (stack/searchable history)
+- Keyboard remapping / shortcuts
+- Tiling Window Managers
+
+## Exercises
+
+- Figure out how to remap Caps Lock to some other key you use more often, like ESC, Ctrl or Backspace
+- Make a custom keyboard shortcut to open a new terminal window or new browser window
+
+References
+
+- reddit.com/r/unixporn
+
+
+{% endcomment %}
\ No newline at end of file
From 2118243eac9643f7dfb7300c30b4511837915992 Mon Sep 17 00:00:00 2001
From: Jose Javier
Date: Fri, 25 Jan 2019 18:26:14 -0500
Subject: [PATCH 069/640] Add notes for future automation topics
---
 automation.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/automation.md b/automation.md
index 8ca371ce..20b549cc 100644
--- a/automation.md
+++ b/automation.md
@@ -61,3 +61,12 @@ One caveat of using cron is that if the computer is powered off or asleep when t
 1. Make a script that looks every minute in your downloads folder for any file that is a picture (you can look into [MIME types](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types) or use a regular expression to match common extensions) and moves them into your Pictures folder.

 1. Write a cron script to weekly check for outdated packages in your system and prompts you to update them or updates them automatically.
+ + + +{% comment %} + +- [fswatch](https://github.com/emcrisostomo/fswatch) +- GUI automation (pyautogui) [Automating the boring stuff Chapter 18](https://automatetheboringstuff.com/chapter18/) + +{% endcomment %} \ No newline at end of file From 886fc020960b281a94d80304e885882622e8d7c7 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:27:30 -0500 Subject: [PATCH 070/640] Add more git exercises --- under-construction.md | 189 ------------------------------------------ version-control.md | 36 +++++--- 2 files changed, 24 insertions(+), 201 deletions(-) delete mode 100644 under-construction.md diff --git a/under-construction.md b/under-construction.md deleted file mode 100644 index 6aa4520f..00000000 --- a/under-construction.md +++ /dev/null @@ -1,189 +0,0 @@ ---- -layout: page -title: "Under Construction" ---- - - -Construction -============ - -Things to add to other sections - -# Course Overview - -# Virtual Machines - -- Add QEMU+LVM with virt-manager as Linux virtualization option - -Exercises - -- Install OpenSSH server to ssh in your VM - -# Dotfiles - -To Cover: - -- bashrc/zshrc -- SSH config -- Alias, Functions -- stow - -Exercises - -- Write a .ssh/config entry for Athena -- Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or the equivalent for zsh) to get 10 most used commands and consider writing sorter aliases for them - -# Shell and scripting - -To Cover: - -- stuff like mv `myfile{.txt, .md}`, ls `*.png` -- File redirect, <(), <{} -- PATH -- zsh, fish, -- QoL stuff: fasd/autojump, fzf, rg, mosh - -Exercises - -- Implement an interactive Ctrl+R with fzf and zsh keybindings - -Reference - -- [ExplainShell](https://explainshell.com/) -- [The Linux Command Line](http://linuxcommand.org/tlcl.php) - -# Terminal emulators and multiplexers - -# Data wrangling - -To Cover: - -- grep, find, tee, less, tail, head, -- Regular expressions -- terminal escape sequences - -Exercises: - -- Implement a rudimentary grep using regex & 
terminal color escape sequences - -References: - -- [More Shell, Less Egg](http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/) - -# Editors - -# IDEs - -To cover: - -- Syntax linting -- Style linting - -# Version Control - -To cover - -- Git Reflog - -Exercises - -- Explore git-extras like `git ignore` - -Reference - -- [Git for computer scientists](http://eagain.net/articles/git-for-computer-scientists/) -- [Git as a functional datastructure](https://blog.jayway.com/2013/03/03/git-is-a-purely-functional-data-structure/) -- [gitignore.io](https://www.gitignore.io/) - -# Backups - -To cover: - -- 3,2,1 Rule -- Github is not enough (i.e. your taxes can't go there) -- Difference between backups and mirrors (i.e. why gdrive/dropbox can be dangerous) -- Versioned, Deduplicated, compressed backup options: (borg, tarsnap, rsync.net) -- Backing up webservices (i.e. spotify playlists) - -Exercises - -- Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. 
Often people have already made solutions based on available APIs - -# Debuggers, logging, profilers and monitoring - -# Package management - -# OS customization - -To Cover - -- Clipboard Manager (stack/searchable/history) -- Keyboard remapping / shortcuts -- Tiling Window Managers - -Exercises - -- Look for a searchable clipboard manager for your OS -- Remap Caps Lock to some other key you use more ofter like ESC, Ctrl or Backspace -- Make a custom keyboard shorcut to open a new terminal window or new browser window - -References - -- reddit.com/r/unixporn - -# OS automation - -To cover: - -- cron/anacron would be quite useful too -- automator, applescript -- GUI automation (pyautogui) - -Exercises: - -- Implement "when a file is added to the Downloads folder that matches a regex move it to folder X" -- Use pyautogui to draw a square spiral in Paint/GIMP/&c - -References - -- [Automating the boring stuff Chapter 18](https://automatetheboringstuff.com/chapter18/) - -# Web and browsers - -To cover: - -- The web console: html, css, js, network tab, cookies, &c -- Adblockers: Having one to avoid not only ads but also most malicious websites -- Custom CSS: Stylish -- Notion of web API, IFTTT -- Archive.org - -Exercises - -- Write Custom CSS to disable StackOverflow sidebar -- Use web api to fetch current public IP (ipinfo.io) -- Look some webpage you have visited over the years on archive.org to see archived versions - -# Security and privacy - -To cover: - -- Encrypting files, disks -- What are SSH Keys -- Password managers -- Two factor authentication, paper keys -- HTTPS -- VPNs - -Exercises - -- Encrypt a file using PGP -- Use veracrypt to create a simple encrypted volume -- Use ssh-copy-id to access VM without a password -- Get 2FA for most important accoutns i.e. 
GMail, Dropbox, Github, &c - - -References - -- [PrivacyTools.io](https://privacytools.io) -- [Secure Secure Shell](https://stribika.github.io/2015/01/04/secure-secure-shell.html) \ No newline at end of file diff --git a/version-control.md b/version-control.md index 5b36d26d..21925395 100644 --- a/version-control.md +++ b/version-control.md @@ -322,27 +322,39 @@ if your push is rejected, what do you do? - [Learn git branching](https://learngitbranching.js.org/) - [How to explain git in simple words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) - [Git from the bottom up](https://jwiegley.github.io/git-from-the-bottom-up/) + - [Git for computer scientists](http://eagain.net/articles/git-for-computer-scientists/) + - [Oh shit, git!](https://ohshitgit.com/) - [The Pro Git book](https://git-scm.com/book/en/v2) # Exercises - - On a repo try modifying an existing file. What happens when you do `git stash`? What do you see when running `git log --all --oneline`? Run `git stash pop` to undo what you did with `git stash`. In what scenario might this be useful? +1. On a repo try modifying an existing file. What happens when you do `git stash`? What do you see when running `git log --all --oneline`? Run `git stash pop` to undo what you did with `git stash`. In what scenario might this be useful? - - One common mistake when learning git is to commit large files that should not be managed by git or adding sensitive information. Try adding a file to a repository, making some commits and then deleting that file from history (you may want to look at [this](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)). Also if you do want git to manage large files for you, look into [Git-LFS](https://git-lfs.github.com/) +1. One common mistake when learning git is to commit large files that should not be managed by git or adding sensitive information. 
Try adding a file to a repository, making some commits and then deleting that file from history (you may want to look at [this](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)). Also if you do want git to manage large files for you, look into [Git-LFS](https://git-lfs.github.com/) - - Git is really convenient for undoing changes but one has to be familiar even with the most unlikely changes - - If a file is mistakenly modified in some commit it can be reverted with `git revert`. However if a commit involves several changes `revert` might not be the best option. How can we use `git checkout` to recover a file version from a specific commit? - - Create a branch, make a commit in said branch and then delete it. Can you still recover said commit? Try looking into `git reflog`. (Note: Recover dangling things quickly, git will periodically automatically clean up commits that nothing points to.) - - If one is too trigger happy with `git reset --hard` instead of `git reset` changes can be easily lost. However since the changes were staged, we can recover them. (look into `git fsck --lost-found` and `.git/lost-found`) +1. Git is really convenient for undoing changes but one has to be familiar even with the most unlikely changes + 1. If a file is mistakenly modified in some commit it can be reverted with `git revert`. However if a commit involves several changes `revert` might not be the best option. How can we use `git checkout` to recover a file version from a specific commit? + 1. Create a branch, make a commit in said branch and then delete it. Can you still recover said commit? Try looking into `git reflog`. (Note: Recover dangling things quickly, git will periodically automatically clean up commits that nothing points to.) + 1. If one is too trigger happy with `git reset --hard` instead of `git reset` changes can be easily lost. However since the changes were staged, we can recover them. 
(look into `git fsck --lost-found` and `.git/lost-found`) - - In any git repo look under the folder `.git/hooks` you will find a bunch of scripts that end with `.sample`. If you rename them without the `.sample` they will run based on their name. For instance `pre-commit` will execute before doing a commit. Experiment with them +1. In any git repo look under the folder `.git/hooks`; you will find a bunch of scripts that end with `.sample`. If you rename them without the `.sample` they will run based on their name. For instance `pre-commit` will execute before doing a commit. Experiment with them. + +1. Like many command line tools `git` provides a configuration file (or dotfile) called `~/.gitconfig`. Create an alias using `~/.gitconfig` so that when you run `git graph` you get the output of `git log --oneline --decorate --all --graph` (this is a good command to quickly visualize the commit graph). + +1. Git also lets you define global ignore patterns in a file such as `~/.gitignore_global` (once `core.excludesfile` points at it); this is useful to prevent common errors like adding RSA keys. Create a `~/.gitignore_global` file and add the pattern `*rsa`, then test that it works in a repo. + +1. Once you start to get more familiar with `git`, you will find yourself running into common tasks, such as editing your `.gitignore`. [git extras](https://github.com/tj/git-extras/blob/master/Commands.md) provides a bunch of little utilities that integrate with `git`. For example `git ignore PATTERN` will add the specified pattern to the `.gitignore` file in your repo and `git ignore-io LANGUAGE` will fetch the common ignore patterns for that language from [gitignore.io](https://www.gitignore.io). Install `git extras` and try using some tools like `git alias` or `git ignore`. + +1. Git GUI programs can be a great resource sometimes. Try running [gitk](https://git-scm.com/docs/gitk) in a git repo and explore the different parts of the interface. Then run `gitk --all`; what are the differences? + +1.
Once you get used to command line applications GUI tools can feel cumbersome/bloated. A nice compromise between the two are ncurses based tools which can be navigated from the command line and still provide an interactive interface. Git has [tig](https://github.com/jonas/tig), try installing it and running it in a repo. You can find some usage examples [here](https://www.atlassian.com/blog/git/git-tig). + + +{% comment %} - forced push + `--force-with-lease` - git merge/rebase --abort - - git hooks - - .gitconfig + aliases - git blame - - visualization - - `gitk --all` - - `git log --graph --all --decorate` - exercise about why rebasing public commits is bad + +{% endcomment %} \ No newline at end of file From 3ea5fe15316d96a88fbb95523a863da4e03f25fb Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 25 Jan 2019 18:28:12 -0500 Subject: [PATCH 071/640] Draft remote machine contents --- remotes.md | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 52 insertions(+), 1 deletion(-) diff --git a/remotes.md b/remotes.md index c2b99836..d1c688da 100644 --- a/remotes.md +++ b/remotes.md @@ -4,4 +4,55 @@ title: "Remote Machines" presenter: Jose --- -Lecture notes will be available by the start of lecture. \ No newline at end of file +Lecture notes will be available by the start of lecture. + +{% comment %} + +Executing remote commands + +## SSH Keys + +If you have configured pushing to Giithub using SSH keys you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/). 
+ +- Key generation +- ssh-copy-id + +## Copying files over ssh + +- cat +- scp +- rsync + +## Terminal Multiplexers + +- tmux, screen +- mosh + +## Port Forwarding + +- Local port forwarding +- Remote "" + +## Graphics Forwarding + +## SSH Configuration + +- client side +- server side + +sshfs + + +## Exercises + +- Use ssh-copy-id to access VM without a password +- Write a .ssh/config entry for Athena +- [Secure Secure Shell](https://stribika.github.io/2015/01/04/secure-secure-shell.html) + + +# Remote Editing + +[sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server +locally, and then you can use a local editor. + +{% endcomment %} \ No newline at end of file From e760d821047da2a69e9ad9091dc48f7c1314219b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 28 Jan 2019 11:20:11 -0500 Subject: [PATCH 072/640] Add more exercises shell --- shell.md | 45 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 45 insertions(+) diff --git a/shell.md b/shell.md index 62105ebd..7016660c 100644 --- a/shell.md +++ b/shell.md @@ -270,6 +270,41 @@ Also, a double dash `--` is used in built-in commands and many other commands to Try running `ls | file` and `ls | xargs file`. What is `xargs` doing? + +1. **Shebang** + + When you write a script you can specify to your shell what interpreter should be used to interpret the script by using a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) line. Write a script called `hello` with the following contentsmake it executable with `chmod +x hello`. Then execute it with `./hello`. Then remove the first line and execute it again? How is the shell using that first line? + + + ```bash + #! /usr/bin/python + + print("Hello World!") + ``` + + You will often see programs that have a shebang that looks like `#! usr/bin/env bash`. 
This is a more portable solution with its own set of [advantages and disadvantages](https://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my). How is `env` different from `which`? What environment variable does `env` use to decide what program to run? + + +1. **Pipes, process substitution, subshell** + + Create a script called `slow_seq.sh` with the following contents and do `chmod +x slow_seq.sh` to make it executable. + + ```bash + #! /usr/bin/env bash + + for i in $(seq 1 10); do + echo $i; + sleep 1; + done + ``` + + There is a way in which pipes (and process substitution) differ from using command substitution, i.e. `$()`. Run the following commands and observe the differences: + + - `./slow_seq.sh | grep -P "[3-6]"` + - `grep -P "[3-6]" <(./slow_seq.sh)` + - `echo $(./slow_seq.sh) | grep -P "[3-6]"` + + 1. **Misc** - Try running `touch {a,b}{a,b}` then `ls` what did appear? - Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` @@ -278,6 +313,16 @@ Also, a double dash `--` is used in built-in commands and many other commands to - Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more search for ANSI color escape sequences. - Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? +{% comment %} + +TODO + +1. **parallel** +- set -e, set -x +- traps + +{% endcomment %} + 1. **Keyboard shortcuts** As with any application you use frequently is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient knowing about them. For some of them it might be easier searching online about what they do.
(remember that `^X` means pressing `Ctrl+X`) From dec2435a6ff9579f3311995b41693cda5a6f718a Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 28 Jan 2019 11:20:31 -0500 Subject: [PATCH 073/640] Add todo for automation --- automation.md | 4 ++++ editors.md | 6 ++++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/automation.md b/automation.md index 20b549cc..c1c2802a 100644 --- a/automation.md +++ b/automation.md @@ -68,5 +68,9 @@ One caveat of using cron is that if the computer is powered off or asleep when t - [fswatch](https://github.com/emcrisostomo/fswatch) - GUI automation (pyautogui) [Automating the boring stuff Chapter 18](https://automatetheboringstuff.com/chapter18/) +- Ansible/puppet/chef + +- https://xkcd.com/1205/ +- https://xkcd.com/1319/ {% endcomment %} \ No newline at end of file diff --git a/editors.md b/editors.md index ce245290..3ca643c0 100644 --- a/editors.md +++ b/editors.md @@ -273,7 +273,7 @@ emulation. - `~/.inputrc` - `set editing-mode vi` -There are even vim keybinding extensions for web [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers) some popular one are [Vimium](https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en) for Google Chrome and [Tridactyl](https://github.com/tridactyl/tridactyl) for Firefox. +There are even vim keybinding extensions for web [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), some popular ones are [Vimium](https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en) for Google Chrome and [Tridactyl](https://github.com/tridactyl/tridactyl) for Firefox. ## Resources @@ -304,4 +304,6 @@ TODO resources for other editors? 1. Commit to using a powerful editor for at least a couple weeks: you should start seeing the benefits by then. At some point, you should be able to get - your editor to work as fast as you think. \ No newline at end of file + your editor to work as fast as you think. + +1. 
Install a linter (e.g. pyflakes for python) link it to your editor and test it is working. \ No newline at end of file From 025a079e6ebb957410ccc00d7aec0a71581266e2 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 28 Jan 2019 13:38:34 -0500 Subject: [PATCH 074/640] Write Remote Machine Lecture --- remotes.md | 150 ++++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 126 insertions(+), 24 deletions(-) diff --git a/remotes.md b/remotes.md index d1c688da..8af01ff1 100644 --- a/remotes.md +++ b/remotes.md @@ -4,55 +4,157 @@ title: "Remote Machines" presenter: Jose --- -Lecture notes will be available by the start of lecture. +It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. -{% comment %} -Executing remote commands +## Executing commands + +An often overlooked feature of `ssh` is the ability to run commands directly. + +- `ssh foobar@server ls` will execute ls in the home folder of foobar +- It works with pipes, so `ssh foobar@server ls | grep PATTERN` will grep locally the remote output of `ls` and `ls | ssh foobar@server grep PATTERN` will grep remotely the local output of `ls`. ## SSH Keys -If you have configured pushing to Giithub using SSH keys you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/). +Key-based authentication exploits public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to reenter your password every time. Nevertheless the private key (e.g. `~/.ssh/id_rsa`) is effectively your password so treat it like so. + +- Key generation. 
To generate a pair you can simply run `ssh-keygen -t rsa -b 4096`. If you do not choose a passphrase, anyone who gets ahold of your private key will be able to access authorized servers, so it is recommended to choose one and use `ssh-agent` to manage shell sessions. + +If you have configured pushing to GitHub using SSH keys you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`. + +- Key based authentication. The SSH server will look into the target user's `.ssh/authorized_keys` to determine which clients it should let in. To copy a public key over we can use the following command: -- Key generation -- ssh-copy-id +```bash +cat .ssh/id_rsa.pub | ssh foobar@remote 'cat >> ~/.ssh/authorized_keys' +``` + +A simpler solution can be achieved with `ssh-copy-id` where available. + +```bash +ssh-copy-id -i .ssh/id_rsa.pub foobar@remote +``` ## Copying files over ssh -- cat -- scp -- rsync +There are many ways to copy files over ssh: + +- `ssh+tee`, the simplest is to use `ssh` command execution and stdin input by doing `cat localfile | ssh remote_server tee serverfile` +- `scp` when copying large numbers of files/directories, the secure copy `scp` command is more convenient since it can easily recurse over paths. The syntax is `scp path/to/local_file remote_host:path/to/remote_file` +- `rsync` improves upon `scp` by detecting identical files in local and remote and preventing copying them again. It also provides more fine grained control over symlinks and permissions, and has extra features like the `--partial` flag that can resume from a previously interrupted copy. `rsync` has a similar syntax to `scp`. + + +## Backgrounding processes + +By default when interrupting an ssh connection, child processes of the parent shell are killed along with it.
There are a couple of alternatives -## Terminal Multiplexers +- `nohup` - the `nohup` tool effectively allows for a process to live when the terminal gets killed. Although this can sometimes be achieved with `&` and `disown`, nohup is a better default. More details can be found [here](https://unix.stackexchange.com/questions/3886/difference-between-nohup-disown-and). -- tmux, screen -- mosh +- `tmux`, `screen` - whereas `nohup` effectively backgrounds the process it is not convenient for interactive shell sessions. In that case using a terminal multiplexer like `screen` or `tmux` is a convenient choice since one can easily detach and reattach the associated shells. + +Lastly, if you disown a program and want to reattach it to the current terminal, you can look into [reptyr](https://github.com/nelhage/reptyr). `reptyr PID` will grab the process with id PID and attach it to your current terminal. ## Port Forwarding -- Local port forwarding -- Remote "" +In many scenarios you will run into software that works by listening to ports in the machine. When this happens in your local machine you can simply do `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet?. This is called port forwarding and it +comes in two flavors: Local Port Forwarding and Remote Port Forwarding (see the pictures for more details, credit of the pictures from [this SO post](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)). 
+ + +**Local Port Forwarding** +![Local Port Forwarding](https://i.stack.imgur.com/a28N8.png "Local Port Forwarding") + +**Remote Port Forwarding** +![Remote Port Forwarding](https://i.stack.imgur.com/4iK3b.png "Remote Port Forwarding") + + +The most common scenario is local port forwarding, where a service in the remote machine listens on a port and you want to link a port in your local machine to forward to the remote port. For example, suppose we execute `jupyter notebook` in the remote server and it listens on port `8888`. To forward that to the local port `9999` we would do `ssh -L 9999:localhost:8888 foobar@remote_server` and then navigate to `localhost:9999` in our local machine. ## Graphics Forwarding +Sometimes forwarding ports is not enough since we want to run a GUI based program in the server. You can always resort to Remote Desktop Software that sends the entire Desktop Environment (i.e. options like RealVNC, Teamviewer, &c). However for a single GUI tool, SSH provides a good alternative: Graphics Forwarding. + +Using the `-X` flag tells SSH to forward X11 connections. + + For trusted X11 forwarding the `-Y` flag can be used. + +A final note: for this to work the `sshd_config` on the server must have the following options + +```bash +X11Forwarding yes +X11DisplayOffset 10 +``` + +## Roaming + +A common pain when connecting to a remote server is disconnections due to shutting down/sleeping your computer or changing networks. Moreover if one has a connection with significant lag, using ssh can become quite frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo. + +Mosh is present in all common distributions and package managers. Mosh requires an ssh server to be working in the server.
You do not need to be superuser to install mosh, but it does require UDP ports 60000 through 60010 to be open in the server (they usually are since they are not in the privileged range). + +A downside of `mosh` is that it does not support roaming port/graphics forwarding so if you use those often `mosh` won't be of much help. + ## SSH Configuration -- client side -- server side +#### Client -sshfs +We have covered many, many arguments that we can pass. A tempting alternative is to create shell aliases that look like `alias my_server="ssh -X -i ~/.ssh/id_rsa -L 9999:localhost:8888 foobar@remote_server"`; however there is a better alternative, using `~/.ssh/config`. +```bash +Host vm + User foobar + HostName 172.16.174.141 + Port 22 + IdentityFile ~/.ssh/id_rsa + LocalForward 9999 localhost:8888 -## Exercises +# Configs can also take wildcards +Host *.mit.edu + User foobaz +``` -- Use ssh-copy-id to access VM without a password -- Write a .ssh/config entry for Athena -- [Secure Secure Shell](https://stribika.github.io/2015/01/04/secure-secure-shell.html) +An additional advantage of using the `~/.ssh/config` file over aliases is that other programs like `scp`, `rsync`, `mosh`, &c are able to read it as well and convert the settings into the corresponding flags. + + +Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine for it to be included with the rest of your dotfiles. However if you make it public, think about the information that you are potentially providing strangers on the internet: the addresses of your servers, the users you are using, the open ports, &c. This may facilitate some types of attacks so be thoughtful about sharing your SSH configuration. + +Warning: Never include your RSA keys ( `~/.ssh/id_rsa*` ) in a public repository! -# Remote Editing +#### Server side -[sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server +Server side configuration is usually specified in `/etc/ssh/sshd_config`.
Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings on a per-user basis. + +## Remote Filesystem + +Sometimes it is convenient to mount a remote folder. [sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server locally, and then you can use a local editor. -{% endcomment %} \ No newline at end of file +## Exercises + +1. For SSH to work the host needs to be running an SSH server. Install an SSH server (such as OpenSSH) in a virtual machine so you can do the rest of the exercises. To figure out the IP of the machine run the command `ip addr` and look for the inet field (ignore the `127.0.0.1` entry, that corresponds to the loopback interface). + +1. Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -t rsa -b 4096`. It is recommended that you use a passphrase and use `ssh-agent`, more info [here](https://www.ssh.com/ssh/agents). + +1. Use `ssh-copy-id` to copy the key to your virtual machine. Test that you can ssh without a password. Then, edit your `sshd_config` in the server to disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. + +1. Edit the `sshd_config` in the server to change the ssh port and check that you can still ssh. If you ever have a public facing server, a non default port and key only login will throttle a significant amount of malicious attacks. + +1. Install mosh in your server/VM, establish a connection and then disconnect the network adapter of the server/VM. Can mosh properly recover from it? + +1. Another use of local port forwarding is to tunnel traffic for a certain host through the server.
If your network filters some website like for example `reddit.com` you can tunnel it through the server as follows: + + - Run `ssh remote_server -L 80:reddit.com:80` + - Set `reddit.com` and `www.reddit.com` to `127.0.0.1` in `/etc/hosts` + - Check that you are accessing that website through the server + - If it is not obvious use a website such as [ipinfo.io](https://ipinfo.io/) which will change depending on your host public ip. + + +1. Background port forwarding can easily be achieved with a couple of extra flags. Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command such as this `ssh -N -f -L 9999:localhost:8888 foobar@remote_server` does. + + +## References + +- [SSH Hacks](http://matt.might.net/articles/ssh-hacks/) +- [Secure Secure Shell](https://stribika.github.io/2015/01/04/secure-secure-shell.html) + +{% comment %} +Lecture notes will be available by the start of lecture. +{% endcomment %} From b1d8912f040fdf7b06daf3b585f5d984fb59c49b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 29 Jan 2019 00:32:55 -0500 Subject: [PATCH 075/640] Write web lecture notes --- web.md | 176 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 175 insertions(+), 1 deletion(-) diff --git a/web.md b/web.md index 1ec371b2..88e26b4c 100644 --- a/web.md +++ b/web.md @@ -4,4 +4,178 @@ title: "Web and Browsers" presenter: Jose --- -Lecture notes will be available by the start of lecture. +Apart from the terminal, the web browser is a tool you will find yourself spending significant amounts of time into. Thus it is worth learning how to use it efficiently and + +## Shortcuts + +Clicking around in your browser is often not the fastest option, getting familiar with common shortcuts can really pay off in the long run. 
+ +- `Middle Button Click` in a link opens it in a new tab +- `Ctrl+T` Opens a new tab +- `Ctrl+Shift+T` Reopens a recently closed tab +- `Ctrl+L` selects the contents of the search bar +- `Ctrl+F` to search within a webpage. If you do this often, you may benefit from an extension that supports regular expressions in searches. + + +## Search operators + +Web search engines like Google or DuckDuckGo provide search operators to enable more elaborate web searches: + +- `"bar foo"` enforces an exact match of bar foo +- `foo site:bar.com` searches for foo within bar.com +- `foo -bar ` excludes the terms containing bar from the search +- `foobar filetype:pdf` Searches for files of that extension +- `(foo|bar)` searches for matches that have foo OR bar + +More through lists are available for popular engines like [Google](https://ahrefs.com/blog/google-advanced-search-operators/) and [DuckDuckGo](https://duck.co/help/results/syntax) + + +## Searchbar + +The searchbar is a powerful tool too. Most browsers can infer search engines from websites and will store them. By editing the keyword argument + +- In Google Chrome they are in [chrome://settings/searchEngines](chrome://settings/searchEngines) +- In Firefox they are in [about:preferences#search](about:preferences#search) + +For example you can make so that `y SOME SEARCH TERMS` to directly search in youtube. + +Moreover, if you own a domain you can setup subdomain forwards using your registrar. For instance I have mapped [ht.josejg.com](https://ht.josejg.com) to this course website. That way I can just type `ht.` and the searchbar will autocomplete. Another good feature of this setup is that unlike bookmarks they will work in every browser. + +## Privacy extensions + +Nowadays surfing the web can get quite annoying due to ads and invasive due to trackers. Moreover a good adblocker not only blocks most ad content but it will also block sketchy and malicious websites since they will be included in the common blacklists. 
They will also reduce page load times sometimes by reducing the amount of requests performed. A couple of recommendations are: + +- **uBlock origin** ([Chrome](https://chrome.google.com/webstore/detail/ublock-origin/cjpalhdlnbpafiamejdnhcphjbkeiagm), [Firefox](https://addons.mozilla.org/en-US/firefox/addon/ublock-origin/)): block ads and trackers based on predefined rules. You should also consider taking a look at the enabled blacklists in settings since you can enable more based on your region or browsing habits. You can even install filters from [around the web](https://github.com/gorhill/uBlock/wiki/Filter-lists-from-around-the-web) + +- **[Privacy Badger](https://www.eff.org/privacybadger)**: detects and blocks trackers automatically. For example when you go from website to website ad companies track which sites you visit and build a profile of you + +- **[HTTPS everywhere](https://www.eff.org/https-everywhere)** is a wonderful extension that redirects to HTTPS version of a website automatically, if available. + +You can find about more addons of this kind [here](https://www.privacytools.io/#addons) + +## Style customization + +Web browsers are just another piece of software running in _your machine_ and thus you usually have the last say about what they should display or how they should behave. An example of this are custom styles. Browsers determine how to render the style of a webpage using Cascading Style Sheets often abbreviated as CSS. + +You can access the source code of a website by inspecting it and changing its contents and styles temporarily (this is also a reason why you should never trust webpage screenshots). + +If you want to permanently tell your browser to override the style settings for a webpage you will need to use an extension. 
Our recommendation is **[Stylus](https://github.com/openstyles/stylus)** ([Firefox](https://addons.mozilla.org/en-US/firefox/addon/styl-us/), [Chrome](https://chrome.google.com/webstore/detail/stylus/clngdbkpkpeebahjckkjfobafhncgmne?hl=en)). + + +For example, we can write the following style for the class website + + +```css + +body { + background-color: #2d2d2d; + color: #eee; + font-family: Fira Code; + font-size: 16pt; + max-width: +} + +a:link { + text-decoration: none; + color: #0a0; +} +``` + +Moreover, Stylus can find styles written by other users and published in [userstyles.org](https://userstyles.org). Most common websites have one or several dark theme stylesheets for instance. FYI, you should not use Stylish since it was shown to leak user data, more [here](https://arstechnica.com/information-technology/2018/07/stylish-extension-with-2m-downloads-banished-for-tracking-every-site-visit/) + + +## Functionality Customization + +In the same way that you can modify the style, you can also modify the behaviour of a website by writing custom javascript and then sourcing it using a web browser extension such as [Tampermonkey](https://tampermonkey.net/) + +For example the following script enables vim-like navigation using the J and K keys. + +```js +// ==UserScript== +// @name VIM HT +// @namespace http://tampermonkey.net/ +// @version 0.1 +// @description Vim JK for our website +// @author You +// @match https://hacker-tools.github.io/* +// @grant none +// ==/UserScript== + + +(function() { + 'use strict'; + + window.onkeyup = function(e) { + var key = e.keyCode ? e.keyCode : e.which; + + if (key == 74) { // J is key 74 + window.scrollBy(0,500); + }else if (key == 75) { // K is key 75 + window.scrollBy(0,-500); + } + } +})(); +``` + +There are also script repositories such as [OpenUserJS](https://openuserjs.org/) and [Greasy Fork](https://greasyfork.org/en).
However, be warned, installing user scripts from others can be very dangerous since they can pretty much do anything such as steal your credit card numbers. Never install a script unless you read the whole thing yourself, understand what it does, and are absolutely sure that you know it isn't doing anything suspicious. Never install a script that contains minified or obfuscated code that you can't read! + +## Web APIs + +It has become more and more common for webservices to offer an application interface aka web API so you can interact with the services by making web requests. +A more in depth introduction to the topic can be found [here](https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Introduction). Web APIs can be useful for many reasons: + +- **Retrieval**. Web APIs can quite easily provide you information such as maps, weather or what your public IP address is. For instance `curl ipinfo.io` will return a JSON object with some details about your public IP, region, location, &c. With proper parsing these tools can be integrated even with command line tools. The following bash function talks to Google's autocompletion API and returns the first ten matches. + +```bash +function c() { + url='https://www.google.com/complete/search?client=hp&hl=en&xhr=t' + # NB: user-agent must be specified to get back UTF-8 data! + curl -H 'user-agent: Mozilla/5.0' -sSG --data-urlencode "q=$*" "$url" | + jq -r ".[1][][0]" | + sed 's,</\?b>,,g' +} +``` + +- **Interaction**. Web API endpoints can also be used to trigger actions. These usually require some sort of authentication token that you can obtain through the service. For example performing the following +`curl -X POST -H 'Content-type: application/json' --data '{"text":"Hello, World!"}' "https://hooks.slack.com/services/$SLACK_TOKEN"` will send a `Hello, World!` message in a channel. + +- **Piping**.
Since some services with web APIs are rather popular, common web API "gluing" has already been implemented and is provided with server included. This is the case for services like [If This Then That](https://ifttt.com/) and [Zapier](https://zapier.com/) + + +## Web Automation + +Sometimes web APIs are not enough. If only reading is needed you can use a html parser like `pup` or use a library, for example python has BeautifulSoup. However if interactivity or javascript execution is required those solutions fall short. WebDriver + + +For example, the following script will save the specified url using the wayback machine simulating the interaction of typing the website. + +```python +from selenium.webdriver import Firefox +from selenium.webdriver.common.keys import Keys + + +def snapshot_wayback(driver, url): + + driver.get("https://web.archive.org/") + elem = driver.find_element_by_class_name('web-save-url-input') + elem.clear() + elem.send_keys(url) + elem.send_keys(Keys.RETURN) + driver.close() + + +driver = Firefox() +url = 'https://hacker-tools.github.io' +snapshot_wayback(driver, url) +``` + + +## Exercises + +1. Edit a keyword search engine that you use often in your web browser +1. Install the mentioned extensions. Look into how uBlock Origin/Privacy Badger can be disabled for a website. What differences do you see? Try doing it in a website with plenty of ads like YouTube. +1. Install Stylus and write a custom style for the class website using the CSS provided. Here are some commmon programming characters `= == === >= => ++ /= ~=`. What happens to them when changing the font to Fira Code? If you want to know more search for programming font ligatures. +1. Find a web api to get the weather in your city/area. +1. Use a WebDriver software like [Selenium](https://docs.seleniumhq.org/) to automate some repetitive manual task that you perform often with your browser. 
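As a follow-up to the **Retrieval** bullet in the Web APIs notes, here is a minimal Python sketch of the same idea as `curl ipinfo.io`. It assumes that the service answers at `https://ipinfo.io/json` and that the response carries `ip`, `city`, `region` and `country` fields; the function names `parse_ip_info` and `fetch_ip_info` are made up for illustration and are not part of the lecture.

```python
import json
from urllib.request import urlopen

# Fields we assume the service returns; any that are missing come back as None.
FIELDS = ("ip", "city", "region", "country")


def parse_ip_info(payload):
    """Pick a few fields out of a JSON document shaped like the ipinfo.io reply."""
    data = json.loads(payload)
    return {field: data.get(field) for field in FIELDS}


def fetch_ip_info(url="https://ipinfo.io/json", timeout=5):
    """Download the JSON document and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_ip_info(response.read().decode("utf-8"))
```

Calling `fetch_ip_info()` from a Python prompt should return a small dictionary that is easy to feed into other scripts. Keeping the parsing in its own function means it can be checked against a saved response without touching the network.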
+
+

From 7e832440d45843d81276158f79cfa3611c6609a5 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Tue, 29 Jan 2019 14:46:45 -0500
Subject: [PATCH 076/640] Add some text on package management

---
 package-management.md | 97 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 96 insertions(+), 1 deletion(-)

diff --git a/package-management.md b/package-management.md
index fe3e18d5..05c446bb 100644
--- a/package-management.md
+++ b/package-management.md
@@ -4,4 +4,99 @@ title: "Package Management and Dependency Management"
 presenter: Anish
 ---

-Lecture notes will be available by the start of lecture.
+Software usually builds on (a collection of) other software, which necessitates
+dependency management.
+
+Package/dependency management programs are language-specific, but many share
+common ideas.
+
+# Package repositories
+
+Packages are hosted in _package repositories_. There are different repositories
+for different languages (and sometimes multiple for a particular language),
+such as [PyPI](https://pypi.org/) for Python, [RubyGems](https://rubygems.org/)
+for Ruby, and [crates.io](https://crates.io/) for Rust.
+
+# Semantic versioning
+
+Software evolves over time, and we need a way to refer to software versions.
+One way would be to refer to software by commit hash, but we can do better in
+terms of communicating more information: using version numbers.
+
+There are many approaches; one popular one is [Semantic
+Versioning](https://semver.org/):
+
+```
+x.y.z
+^ ^ ^
+| | +- patch
+| +--- minor
++----- major
+```
+
+Increment **major** version when you make incompatible API changes.
+
+Increment **minor** version when you add functionality in a backward-compatible manner.
+
+Increment **patch** when you make backward-compatible bug fixes.
+
+For example, if you depend on a feature introduced in `v1.2.0` of some
+software, then you can install `v1.x.y` for any minor version `x >= 2` and any
+patch version `y`.
You need to install major version `1` (because `2` can
+introduce backward-incompatible changes), and you need to install a minor
+version `>= 2` (because you depend on a feature introduced in that minor
+version). You can use any newer minor version or patch version because
+they should not introduce any backward-incompatible changes.
+
+# Lock files
+
+In addition to specifying versions, it can be nice to enforce that the
+_contents_ of the dependency have not changed. Some tools use _lock files_ to
+specify cryptographic hashes of dependencies (along with versions) that are
+checked on package install.
+
+# Specifying versions
+
+Tools often let you specify versions in multiple ways, such as:
+
+- exact version, e.g. `2.3.12`
+- minimum major version, e.g. `>= 2`
+- specific major version and minimum minor version, e.g. `>= 2.3, <3.0`
+
+Specifying an exact version can be advantageous to avoid different behaviors
+based on installed dependencies (this shouldn't happen if all dependencies
+faithfully follow semver, but sometimes people make mistakes). Specifying a
+minimum requirement has the advantage of allowing bug fixes to be installed
+(e.g. patch upgrades).
+
+# Dependency resolution
+
+Package managers use various dependency resolution algorithms to satisfy
+dependency requirements. This often gets challenging with complex dependencies
+(e.g. a package can be indirectly depended on by multiple top-level
+dependencies, and different versions could be required). Different package
+managers have different levels of sophistication in their dependency
+resolution, but it's something to be aware of: you may need to understand this
+if you are debugging dependencies.
+
+# Virtual environments
+
+If you're developing multiple software projects, they may depend on different
+versions of a particular piece of software. Sometimes, your build tool will
+handle this naturally (e.g. by building a static binary).
+
+For other build tools and programming languages, you can handle this with
+virtual environments (e.g. with the
+[virtualenv](https://docs.python-guide.org/dev/virtualenvs/) tool for Python).
+Instead of installing dependencies system-wide, you can install dependencies
+per-project in a virtual environment, and _activate_ the virtual environment
+that you want to use when you're working on a specific project.
+
+# Vendoring
+
+Another very different approach to dependency management is _vendoring_.
+Instead of using a dependency manager or build tool to fetch software, you copy
+the entire source code for a dependency into your software's repository. This
+has the advantage that you're always building against the same version of the
+dependency and you don't need to rely on a package repository, but it is more
+effort to upgrade dependencies.

From 2ea9d5fda614773d406d20eb23004a4810896759 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Tue, 29 Jan 2019 15:00:12 -0500
Subject: [PATCH 077/640] Tweak text

---
 package-management.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/package-management.md b/package-management.md
index 05c446bb..dcdb9b92 100644
--- a/package-management.md
+++ b/package-management.md
@@ -15,13 +15,16 @@ common ideas.
 Packages are hosted in _package repositories_. There are different repositories
 for different languages (and sometimes multiple for a particular language),
 such as [PyPI](https://pypi.org/) for Python, [RubyGems](https://rubygems.org/)
-for Ruby, and [crates.io](https://crates.io/) for Rust.
+for Ruby, and [crates.io](https://crates.io/) for Rust. They generally store
+software (source code and sometimes pre-compiled binaries for specific
+platforms) for all versions of a package.

 # Semantic versioning

 Software evolves over time, and we need a way to refer to software versions.
-One way would be to refer to software by commit hash, but we can do better in
-terms of communicating more information: using version numbers.
+Some simple ways could be to refer to software by a sequence number or a commit
+hash, but we can do better in terms of communicating more information: using
+version numbers.

 There are many approaches; one popular one is [Semantic
 Versioning](https://semver.org/):
@@ -51,9 +54,9 @@ they should not introduce any backward-incompatible changes.
 # Lock files

 In addition to specifying versions, it can be nice to enforce that the
-_contents_ of the dependency have not changed. Some tools use _lock files_ to
-specify cryptographic hashes of dependencies (along with versions) that are
-checked on package install.
+_contents_ of the dependency have not changed to prevent tampering. Some tools
+use _lock files_ to specify cryptographic hashes of dependencies (along with
+versions) that are checked on package install.

 # Specifying versions

@@ -85,8 +88,8 @@ If you're developing multiple software projects, they may depend on different
 versions of a particular piece of software. Sometimes, your build tool will
 handle this naturally (e.g. by building a static binary).

-For other build tools and programming languages, you can handle this with
-virtual environments (e.g. with the
+For other build tools and programming languages, one approach is handling this
+with virtual environments (e.g. with the
 [virtualenv](https://docs.python-guide.org/dev/virtualenvs/) tool for Python).
 Instead of installing dependencies system-wide, you can install dependencies
 per-project in a virtual environment, and _activate_ the virtual environment

From 2bcc6eddd630d71b2e2268b18042c5fac3d79027 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Tue, 29 Jan 2019 15:13:24 -0500
Subject: [PATCH 078/640] Add text on OS customization

---
 os-customization.md | 84 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 70 insertions(+), 14 deletions(-)

diff --git a/os-customization.md b/os-customization.md
index a126d229..d5f9b2d1 100644
--- a/os-customization.md
+++ b/os-customization.md
@@ -4,27 +4,83 @@ title: "OS Customization"
 presenter: Anish
 ---

-Lecture notes will be available by the start of lecture.
+There is a lot you can do to customize your operating system beyond what is
+available in the settings menus.

+# Keyboard remapping

-{% comment %}
+Your keyboard probably has keys that you aren't using very much. Instead of
+having useless keys, you can remap them to do useful things.

-Topics:
+## Remapping to other keys

-- Hammerspoon
-- Bitbar / Polybar
-- Clipboard Manager (stack/searchable history)
-- Keyboard remapping / shortcuts
-- Tiling Window Managers
+The simplest thing is to remap keys to other keys. For example, if you don't
+use the caps lock key very much, then you can remap it to something more
+useful. If you are a Vim user, for example, you might want to remap caps lock
+to escape.
+
+## Remapping to arbitrary commands
+
+You don't just have to remap keys to other keys: there are tools that will let
+you remap keys (or combinations of keys) to arbitrary commands. For example,
+you could make command-shift-t open a new terminal window.
+
+# Customizing hidden OS settings
+
+## macOS
+
+macOS exposes a lot of useful settings through the `defaults` command.
For
+example, you can make Dock icons of hidden applications translucent:
+
+```shell
+defaults write com.apple.dock showhidden -bool true
+```
+
+There is no single list of all possible settings, but you can find lists of
+specific customizations online, such as Mathias Bynens'
+[.macos](https://github.com/mathiasbynens/dotfiles/blob/master/.macos).
+
+# Window management
+
+## Tiling window management

-## Exercises
+[Tiling window management](https://en.wikipedia.org/wiki/Tiling_window_manager)
+is one approach to window management, where you organize windows into
+non-overlapping frames. If you're using a Unix-based operating system, you can
+install a tiling window manager; if you're using something like Windows or
+macOS, you can install applications that let you approximate this behavior.

-- Figure out how to remap Caps Lock to some other key you use more ofter like ESC, Ctrl or Backspace
-- Make a custom keyboard shorcut to open a new terminal window or new browser window
+## Screen management

-References
+You can set up keyboard shortcuts to help you manipulate windows across
+screens.

-- reddit.com/r/unixporn
+## Layouts

+If there are specific ways you lay out windows on a screen, rather than
+"executing" that layout manually, you can script it, making instantiating a
+layout trivial.
+
+# Resources
+
+- [Hammerspoon](https://www.hammerspoon.org/) - macOS desktop automation
+- [Spectacle](https://www.spectacleapp.com/) - macOS window manager
+- [r/unixporn](https://www.reddit.com/r/unixporn/) - screenshots and
+documentation of people's fancy configurations
+
+# Exercises
+
+1. Figure out how to remap your Caps Lock key to something you use more often
+   (such as Escape or Ctrl or Backspace).
+
+1. Make a custom global keyboard shortcut to open a new terminal window or a
+   new browser window.
+
+{% comment %}
+
+TODO
+
+- Bitbar / Polybar
+- Clipboard Manager (stack/searchable history)

-{% endcomment %}
\ No newline at end of file
+{% endcomment %}

From 8ef9b077769429c83ed27100e6cc15b993c3cde1 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Tue, 29 Jan 2019 15:14:46 -0500
Subject: [PATCH 079/640] Add note on Karabiner

---
 os-customization.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/os-customization.md b/os-customization.md
index d5f9b2d1..71687b51 100644
--- a/os-customization.md
+++ b/os-customization.md
@@ -19,6 +19,9 @@ use the caps lock key very much, then you can remap it to something more
 useful. If you are a Vim user, for example, you might want to remap caps lock
 to escape.

+On macOS, you can do some remappings through Keyboard settings in System
+Preferences; for more complicated mappings, you need special software.
+
 ## Remapping to arbitrary commands

 You don't just have to remap keys to other keys: there are tools that will let
@@ -65,6 +68,7 @@ layout trivial.

 - [Hammerspoon](https://www.hammerspoon.org/) - macOS desktop automation
 - [Spectacle](https://www.spectacleapp.com/) - macOS window manager
+- [Karabiner](https://pqrs.org/osx/karabiner/) - sophisticated macOS keyboard remapping
 - [r/unixporn](https://www.reddit.com/r/unixporn/) - screenshots and
 documentation of people's fancy configurations


From 1f76dcf285479b5b067488c3bc333c89a86bcbf3 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Thu, 31 Jan 2019 12:25:30 -0500
Subject: [PATCH 080/640] Update schedule

---
 schedule.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/schedule.md b/schedule.md
index 345d6ce1..07752a4d 100644
--- a/schedule.md
+++ b/schedule.md
@@ -8,7 +8,7 @@ blocks, with a 10 minute break in between.
 # Tuesday, 1/15

-- [Course overview](/course-overview/), [virtual machines and containers](/virtual-machines/)
+- [Course overview](/course-overview/) and [virtual machines and containers](/virtual-machines/)
 - [Shell and scripting](/shell/)

 # Thursday, 1/17
@@ -23,8 +23,8 @@ blocks, with a 10 minute break in between.

 # Thursday, 1/24

-- [Dotfiles](/dotfiles/), [backups](/backups/) and [automation](/automation/)
-- [Machine introspection](/machine-introspection/)
+- [Dotfiles](/dotfiles/) and [backups](/backups/)
+- [Automation](/automation/) and [machine introspection](/machine-introspection/)

 # Tuesday, 1/29
 - [Program introspection](/program-introspection/) and [package/dependency management](/package-management/)

From 258d1987176e9a68561bfa832288278e27a0e435 Mon Sep 17 00:00:00 2001
From: Jon Gjengset
Date: Thu, 31 Jan 2019 14:46:44 -0500
Subject: [PATCH 081/640] Add security lecture notes

---
 security.md | 204 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 188 insertions(+), 16 deletions(-)

diff --git a/security.md b/security.md
index 095a4bb4..511e8523 100644
--- a/security.md
+++ b/security.md
@@ -4,29 +4,201 @@ title: "Security and Privacy"
 presenter: Jon
 ---

-Lecture notes will be available by the start of lecture.
+The world is a scary place, and everyone's out to get you.
+Okay, maybe not, but that doesn't mean you want to flaunt all your
+secrets. Security (and privacy) is generally all about raising the bar
+for attackers. Find out what your threat model is, and then design your
+security mechanisms around that! If the threat model is the NSA or
+Mossad, you're _probably_ going to have a bad time.


-{% comment %}
+There are _many_ ways to make your technical persona more secure. We'll
+touch on a lot of high-level things here, but this is a process, and
+educating yourself is one of the best things you can do.
So:

-Topics:
+## Follow the Right People

-- Encrypting files, encrypted volumes, full disk encryption
-- Password managers
-- Two factor authentication, paper keys
-- HTTPS (HTTPS Every extension)
-- Cookies (Firefox Multiaccount containers)
-- VPNs, wireguard
+One of the best ways to improve your security know-how is to follow
+other people who are vocal about security. Some suggestions:

-Exercises
+ - [@TroyHunt](https://twitter.com/TroyHunt)
+ - [@SwiftOnSecurity](https://twitter.com/SwiftOnSecurity)
+ - [@taviso](https://twitter.com/taviso)
+ - [@thegrugq](https://twitter.com/thegrugq)
+ - [@tqbf](https://twitter.com/tqbf)
+ - [@mattblaze](https://twitter.com/mattblaze)
+ - [@moxie](https://twitter.com/moxie)

-1. Encrypt a file using PGP
-1. Use veracrypt to create a simple encrypted volume
-1. Enable 2FA for your most data sensitive accounts i.e. GMail, Dropbox, Github, &c
+See also [this
+list](https://heimdalsecurity.com/blog/best-twitter-cybersec-accounts/)
+for more suggestions.
+
+## General Security Advice
+
+Tech Solidarity has a pretty great list of [do's and don'ts for
+journalists](https://techsolidarity.org/resources/basic_security.htm)
+that has a lot of sane advice, and is decently up-to-date. @thegrugq
+also has a good blog post on [travel security
+advice](https://medium.com/@thegrugq/stop-fabricating-travel-security-advice-35259bf0e869)
+that's worth reading. We'll repeat much of the advice from those sources
+here, plus some more.
+
+## Authentication
+
+The very first thing you should do, if you haven't already, is download
+a password manager (probably [1password](https://1password.com/) or
+[`pass`](https://www.passwordstore.org/)). Use it to generate passwords
+for all the web sites you care about right now.
Then, switch on
+two-factor authentication, ideally with a
+[FIDO/U2F](https://fidoalliance.org/) dongle (a
+[YubiKey](https://www.yubico.com/quiz/) for example, which has [20% off
+for students](https://www.yubico.com/why-yubico/for-education/)). TOTP
+(like Google Authenticator or Duo) will also work in a pinch, but
+[doesn't protect against
+phishing](https://twitter.com/taviso/status/1082015009348104192). SMS is
+pretty much useless unless your threat model only includes random
+strangers picking up your password in transit.
+
+Also, a note about paper keys. Often, services will give you a "backup
+key" that you can use as a second factor if you lose your real second
+factor (btw, always keep a backup dongle somewhere safe!). While you
+_can_ stick those in your password manager, that means that should
+someone get access to your password manager, you're totally hosed (but
+maybe you're okay with that threat model). If you are truly paranoid,
+print out these paper keys, never store them digitally, and place them
+in a safe in the real world.
+
+## Private Communication
+
+Use [Signal](https://www.signal.org/) ([setup
+instructions](https://medium.com/@mshelton/signal-for-beginners-c6b44f76a1f0).
+[Wire](https://wire.com/en/) is fine too; WhatsApp is okay; [don't use
+Telegram](https://twitter.com/bascule/status/897187286554628096)).
+Desktop messengers are pretty broken (partially due to usually relying
+on Electron, which is a huge trust stack).
+
+E-mail is particularly problematic, even if PGP signed. It's not
+generally forward-secure, and the key-distribution problem is pretty
+severe. [keybase.io](https://keybase.io/) helps, and is useful for a
+number of other reasons. Also, PGP keys are generally handled on desktop
+computers, which is one of the least secure computing environments.
+Relatedly, consider getting a Chromebook, or just work on a tablet with
+a keyboard.
+
+## File Security
+
+File security is hard, and operates on many levels.
What is it you're
+trying to secure against?
+
+[![$5 wrench](https://imgs.xkcd.com/comics/security.png)](https://xkcd.com/538/)
+ - Offline attacks (someone steals your laptop while it's off): turn on
+   full disk encryption. ([cryptsetup +
+   LUKS](https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system)
+   on Linux,
+   [BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows-10/)
+   on Windows, [FileVault](https://support.apple.com/en-us/HT204837) on
+   macOS.) Note that this won't help if the attacker _also_ has you and
+   really wants your secrets.
+ - Online attacks (someone has your laptop and it's on): use file
+   encryption. There are two primary mechanisms for doing so:
+   - Encrypted volumes: use something like [eCryptfs or
+     EncFS](https://wiki.archlinux.org/index.php/Disk_encryption#Stacked_filesystem_encryption)
+     to create a "filesystem in a file", which is then encrypted. You
+     can "mount" that file by providing the decryption key, and then
+     browse the files inside it freely. When you unmount it, those
+     files are all unavailable.
+   - Encrypted files: encrypt individual files with symmetric
+     encryption (see `gpg -c`) and a secret key. Or, like `pass`, also
+     encrypt the key with your public key so only you can read it back
+     later with your private key. Exact encryption settings matter a
+     lot!
+ - [Plausible
+   deniability](https://en.wikipedia.org/wiki/Plausible_deniability)
+   (what seems to be the problem, officer?): usually lower performance,
+   and easier to lose data. Hard to actually prove that it provides
+   [deniable
+   encryption](https://en.wikipedia.org/wiki/Deniable_encryption)! See
+   the [discussion
+   here](https://security.stackexchange.com/questions/135846/is-plausible-deniability-actually-feasible-for-encrypted-volumes-disks),
+   and then consider whether you may want to try
+   [VeraCrypt](https://www.veracrypt.fr/en/Home.html) (the maintained
+   fork of good ol' TrueCrypt).
+ - Encrypted backups: use [Tarsnap](https://www.tarsnap.com/).
+   - Think about whether an attacker can delete your backups if they
+     get a hold of your laptop!

-References
+## Internet Security & Privacy

-- [PrivacyTools.io](https://privacytools.io)
+The internet is a _very_ scary place. Open WiFi networks
+[are](https://www.troyhunt.com/the-beginners-guide-to-breaking-website/)
+[scary](https://www.troyhunt.com/talking-with-scott-hanselman-on/). Make
+sure you delete them afterwards, otherwise your phone will happily
+announce and re-connect to something with the same name later!


-{% endcomment %}
\ No newline at end of file
+If you're ever on a network you don't trust, a VPN _may_ be worthwhile,
+but keep in mind that you're trusting the VPN provider _a lot_. Do you
+really trust them more than your ISP? If you truly want a VPN, use a
+provider you're sure you trust, and you should probably pay for it. Or
+set up [WireGuard](https://www.wireguard.com/) for yourself -- it's
+[excellent](https://latacora.singles/2018/05/16/there-will-be.html)!
+
+There are also secure configuration settings for a lot of
+internet-enabled applications at [cipherli.st](https://cipherli.st/). If
+you're particularly privacy-oriented,
+[privacytools.io](https://privacytools.io) is also a good resource.
+
+Some of you may wonder about [Tor](https://www.torproject.org/). Keep in
+mind that Tor is _not_ particularly resistant to powerful global
+attackers, and is weak against traffic analysis attacks. It may be
+useful for hiding traffic on a small scale, but won't really buy you all
+that much in terms of privacy. You're better off using more secure
+services in the first place (Signal, TLS + certificate pinning, etc.).
+
+## Web Security
+
+So, you want to go on the Web too?
+Jeez, you're really pushing your luck here.
+
+Install [HTTPS Everywhere](https://www.eff.org/https-everywhere).
+SSL/TLS is
+[critical](https://www.troyhunt.com/ssl-is-not-about-encryption/), and
+it's _not_ just about encryption, but also about being able to verify
+that you're talking to the right service in the first place! If you run
+your own web server, [test it](https://ssldecoder.org/) and [test it
+again](https://www.ssllabs.com/ssltest/index.html). TLS configuration
+[can get hairy](https://wiki.mozilla.org/Security/Server_Side_TLS).
+HTTPS Everywhere will do its very best to never navigate you to HTTP
+sites when there's an alternative. That doesn't save you, but it helps.
+If you're truly paranoid, blacklist any SSL/TLS CAs that you don't
+absolutely need.
+
+Install [uBlock Origin](https://github.com/gorhill/uBlock). It is a
+[wide-spectrum
+blocker](https://github.com/gorhill/uBlock/wiki/Blocking-mode) that
+doesn't just stop ads, but all sorts of third-party communication a page
+may try to do. And inline scripts and such. If you're willing to spend
+some time on configuration to make things work, go to [medium
+mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)
+or even [hard
+mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode).
+Those _will_ make some sites not work until you've fiddled with the
+settings enough, but will also significantly improve your online
+security.
+
+If you're using Firefox, enable [Multi-Account
+Containers](https://support.mozilla.org/en-US/kb/containers). Create
+separate containers for social networks, banking, shopping, etc. Firefox
+will keep the cookies and other state for each of the containers totally
+separate, so sites you visit in one container can't snoop on sensitive
+data from the others. In Google Chrome, you can use [Chrome
+Profiles](https://support.google.com/chrome/answer/2364824) to achieve
+similar results.
+
+## Exercises
+
+TODO
+
+1. Encrypt a file using PGP
+1. Use VeraCrypt to create a simple encrypted volume
+1. Enable 2FA for your most data-sensitive accounts, e.g.
Gmail, Dropbox, GitHub, &c

From 0889ae3e8097ac227bc9e66de3c5277b5ecbe688 Mon Sep 17 00:00:00 2001
From: Jon Gjengset
Date: Thu, 31 Jan 2019 14:48:54 -0500
Subject: [PATCH 082/640] USB data blockers

---
 security.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/security.md b/security.md
index 511e8523..f4bb23cc 100644
--- a/security.md
+++ b/security.md
@@ -41,7 +41,9 @@ that has a lot of sane advice, and is decently up-to-date. @thegrugq
 also has a good blog post on [travel security
 advice](https://medium.com/@thegrugq/stop-fabricating-travel-security-advice-35259bf0e869)
 that's worth reading. We'll repeat much of the advice from those sources
-here, plus some more.
+here, plus some more. Also, get a [USB data
+blocker](https://www.amazon.com/dp/B00QRRZ2QM), because [USB is
+scary](https://www.bleepingcomputer.com/news/security/heres-a-list-of-29-different-types-of-usb-attacks/).

 ## Authentication


From 24224c3e4331a0f30ff4896aa9c55cb28f0ddee8 Mon Sep 17 00:00:00 2001
From: Jose Javier
Date: Thu, 31 Jan 2019 15:11:50 -0500
Subject: [PATCH 083/640] Fix typo

---
 web.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/web.md b/web.md
index 88e26b4c..e25df3e7 100644
--- a/web.md
+++ b/web.md
@@ -72,7 +72,6 @@ body {
   color: #eee;
   font-family: Fira Code;
   font-size: 16pt;
-  max-width:
 }

 a:link {

From f1652c5d4089586f6b024487ed9a3ed6df559b17 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Sat, 2 Feb 2019 16:49:34 -0500
Subject: [PATCH 084/640] Update README

---
 README.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c9f4580e..188af3d4 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,9 @@
-# hacker-tools.github.io
+# Hacker Tools
+
+Website for the [Hacker Tools](https://hacker-tools.github.io/) class!
+
+Contributions are most welcome! If you have edits or new content to add, please
+open an issue or submit a pull request.
 ## Development

From 3a6a1e62299ca7ee98ebdc4e1d7e9fbe45d166c5 Mon Sep 17 00:00:00 2001
From: Anish Athalye
Date: Sat, 2 Feb 2019 16:59:07 -0500
Subject: [PATCH 085/640] Add favicons

---
 _includes/head.html  |   3 ---
 apple-touch-icon.png | Bin 0 -> 7119 bytes
 favicon-16x16.png    | Bin 0 -> 3617 bytes
 favicon-32x32.png    | Bin 0 -> 3931 bytes
 favicon.ico          | Bin 0 -> 7406 bytes
 5 files changed, 3 deletions(-)
 create mode 100644 apple-touch-icon.png
 create mode 100644 favicon-16x16.png
 create mode 100644 favicon-32x32.png
 create mode 100644 favicon.ico

diff --git a/_includes/head.html b/_includes/head.html
index 5a969de8..8b1132f2 100644
--- a/_includes/head.html
+++ b/_includes/head.html
@@ -2,12 +2,9 @@
-
-
diff --git a/apple-touch-icon.png b/apple-touch-icon.png
new file mode 100644
index 0000000000000000000000000000000000000000..e876e935f75947755d79247784d3c840940b993a
GIT binary patch
literal 7119
[base85-encoded binary payload omitted]

diff --git a/favicon-16x16.png b/favicon-16x16.png
new file mode 100644
index 0000000000000000000000000000000000000000..2f4890d00f1436931d0447ec6c45a846696d8751
GIT binary patch
literal 3617
[base85-encoded binary payload omitted]

diff --git a/favicon-32x32.png b/favicon-32x32.png
new file mode 100644
index 0000000000000000000000000000000000000000..1a345101abde647348e9594d0a8858ba6b37416d
GIT binary patch
literal 3931
[base85-encoded binary payload omitted]
zQD;IwgMZQu?fZ`hqFQ$YGg@TlnC2n^vV%8*Yt<@M3qsk0eAV1lFP>wDZmlt_xl6fA z)lG@k(9KKESADb%qvwezgci~|GB1cMP%gNJpNIE{`-gX3{kjsnv?Z=3(juBA{$MjB zPFwX%o^u1#eEYzhUQmSZH6Ou!aII1^n&>0gyYmUD2@AW#ElCebBHlM?T)ZaQ#?X7Q zS1gaHsA#IFLm9l73Rvg4wUVq^e3X!KU&VagSC&P_MX(6lcgW+VjhPKPh&rSPQa{~N zSB1oNV|yu2p4c- zLNd|{KXi_-t(xXfc+>rFLKi1)L~=x|TDU1zX0rk;2(KjNV?WysUaTkx(@|33}7-INmfFvHQ~hm4C%{xFhaUThQeN4TR(VQRC04 zwcl4iuC?u>pi#|5p(WeBVH(KrTYskO`n@d9%5ZUWpSNL?uu_<#0$#!Kx__;_zOc7R zz;knYtJZ&CVNPN3ef2K)EFH5L>pIKtJ?q+%x>z%F>!a}X`ur(_^{l;F*o@%6f?`yX z?vje)3szU9(T+atU#se!n+TQznT_gWgJ@VHC$-7wgN`nCA!S!ek5Wr}65W$8ro z*3cF!Sux$a>!o8u&lHTY+3-}%=a_o!#dygH<;h9zN#&1?wAC4tG~6~1SALllmJFXf zJGtQ4pn>&uKsM{L+lBtj_^WON%~F&zod~DE_tar3meb_&YjU)Thc8@)eTMVPiA|as zo8RZ14{tTaxE_xn3M^GxPdGR2u38R%wyf*khrQKiO_w@;FX0~g5b|s3lc96AvLu;( zTS(9TKu-YX;;hQ+-l!eECE)zS_(>TDts*;aWV*lnNyh^cW3@V+wQ~GQhfM& ziwbw36>1^2nzF(yw7!kfX7@-wrHd8a}sN$sJjTyro{2GKwT{t8- zq|h+`Dg86GH&s?+a>p8if>I^wqchXwM9jznD3a*qF^YJU;YcupKJkB&A0zz7hcDi}5$)X@MC}Krg8b zv!x$Zc132yc>-qIVP#i&$qjlNEc90mMLENK2P6XPv0IF z>^l?l1UY$AFj&~!@`;B}x2JF416+rlQwbD?6TPj>!NEmvY6y>X_wGVJc(0&j$;&GW z<`t1sFf%rdLt<(?e7Zb*dpvx4+`T&;kYzAkyqLH;Cnq-{7Xd-(J1TZ)Y$L|K+2Y<) zC1o2vesK{|B@Hdtd$!MAJw7<1Ds>HlxVU)5B{X$(eQh0!v0iOn_lnxJA2qI%> zU>xS?Qi=6y*MwrZxcDR_p)fr^lxv-lNeo2Rkef$udDBtS;dih>(W5I{vR8@n!N-{rvV;I@;jTcNa>i!J~eOyCA zeGv(x8Pikyo)KhEw1~UZth$>mXVB)&rWG{pptD= z)s^H_mpYP>a?i&4B#;zC-75%lypxfKQ8@-f!2rU|JkfijsifFm+bpn?w#wX-np-f@ zc#^naq#$gm>9JzL0j(8f$N_nJ?(4iAApfvU?*`i0b)^HbY9ZramTZ0R6E1i4Sxh~O z-dEkf=J{D1#3t$eG_AI3`^D^5e&&wK(?T=fYtwM?J57lN1=qQb4h!?8RIn&&UYe{pDY0>p2m&q;HQOJR_o0Pli#3c{)PXrNf^e#UpdZZI&&Amft&H+T69gb5 z4Y?yBEko!C85w1HC1q(vF=>dhv~)|&IpMYZAA%>&1&ay({|WybL30TNk-rfLC8IJ5 z=N;^eb#?OtwEtUywYe*V5_-~TYr**CuvFbw~T!MUJ4{xhED<$(#tdm#vGKXd0m zu(vM`gC%6)7wnA&6JimH+n~JVBWiLVD-n?alffbVRdaxtpGp!%#l+8Km~o4k39J(- w*X2UY-+%xQPJjESd#IbUn`|&rugE|W*kWbyJA%q*5kdg^Ft|=N)G7LZ0HR6y8UO$Q literal 0 HcmV?d00001 diff --git a/favicon.ico b/favicon.ico new file mode 100644 index 
0000000000000000000000000000000000000000..aa591c1fca14eb3bd2514be523e31f288aa60265 GIT binary patch literal 7406 zcmeI12Uify8HdH5Xl z1!~D+g6ACFP+zQsKhI3StpNx81=}HAJrv=fAI?w1l%Wxx=sXK|hh15w7%fL5;Eq?oW@yU_%`gV5YVl;-kWg}xq;OIc_vErB0>%P`)gg9%irLIHnWn1!diJD|F- z0Iv3S!<&<%XtxOKRtwrlvHuGE3-uG~v9qlZK!sfWpv;wnrLqEAQ1?b$=w}Yv)m4zm zFd#-Ts4rJM@D*yM#SE*c=X>pN4JBi1s?+?23D4>+*UOR&!f|rPTHA0q6dNTJzEE9MTa$PYC+EDXp1V@NVZw)= zpFfixOgxhtfk9E}SRNaeMn?&Shi{Euk|eo{!9Z5)u_iOm$4q$sHx0Q-3|uabyFHQG zyDvU_>FUMAAw?brj_dH>@eD=Ri)VJ~?{Yc0VEJQD4GHtuO@9pP(tv7Vcm0Eb8mn;~ zCmmTP{)uGqI0J*Gvrit~SX6!`HDP}YJN^2#!*%hSCJon>+_5Hjk~mIxFXz_HaQr#q zd2?F(jC0%N=K@bZGjKj77BtH~8y@j2PGoQYj&fS})aBVw{I;@u=vCW`zC;g?^X>ut z#td0}Zcd30W(95}dY8GiuK1J8^3pt|`jsRPzsnVSfAQrCKi=P5ob=_wL8JZg<*k@I zDNpKJ2vO)C;`g5@aBC93QX8`gdY5_j%5jHoXV($`&pmFRD%|Cd8(A36jqm1nEgn&D z$Dg!#*S!z_yA^mOlCWQK5X7l+lo1u@)Kol)9Z^ob;^>0@x*GTfS&q2n)<7@(37Ig) zcW>io8S+ORaw~BoF{rP#CC)n!a^XLg+{lb>fmg30>&Ezq*oGL181PeMuxGI?L&kh% z*a=-4wZK1alR@AtVm4wRVpU>a;#gw1w`ax$CVLV2sTG;RIeutbKa@;!riV#G=Hy#LdLkcaal4W+T|O8iDbiLGC1mzkpm%j8B|R{6;)Z zTxhFOK@ho@IF5Llc$1iQw6OthVy%FU^K?bWQPuo$1`pi>S3$uQ&Mg>!anf z%r@g>ln(oRdcEH3>li5P$q;XGVy;rPpYUTBbh0?qrQGbKlaY)Ujg;*3d(nyiM4!`k za`P~qV!OK;#`J!JUhkW%$ei06s6;I4o=;^BfPsx&3n!HGhi!4kWWtHkq^ zXl^2m9sPV4a|Np1#%kWN{Df-hR3pBOdoIK$x_I9e9(*8@i#cfv~ zRA6nL_S6dcmu=rFt#pEVL)zpB{sOx>})*PIYa{NvPIM`N$OJIMXP~nW*+lHFWx40Oc1{%RXnb z3V9ag7*t!P9E0)(7w-2|JKsC@R^ZpN0)`zAQy!D`V5egxWv2~{F7}m9on2j?dFTYg zM8DamRT(vUsF5q8-(=+;I%+wPVNNMhM7ro0T|-@(=uF{}$oWaJNm-d=)IjvSSehb9 zPd`{yZ=&n_^DU{G+EWEegWyl6W{+1i$x6ksppG>*AJ)YN$Nnb8!|_JONc8tVz0bb@ DMSrtt literal 0 HcmV?d00001 From d4cfc2759a8d785896c8e467cdf750403a8f6f62 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sun, 3 Feb 2019 11:01:43 -0500 Subject: [PATCH 086/640] Add videos --- _layouts/lecture.html | 13 +++++++++++++ automation.md | 7 +++++-- backups.md | 7 +++++-- command-line.md | 7 +++++-- course-overview.md | 5 ++++- data-wrangling.md | 5 ++++- dotfiles.md | 5 ++++- editors.md | 7 +++++-- 
machine-introspection.md | 5 ++++- os-customization.md | 5 ++++- package-management.md | 5 ++++- program-introspection.md | 5 ++++- remotes.md => remote-machines.md | 5 ++++- schedule.md | 2 +- security.md | 5 ++++- shell.md | 5 ++++- static/css/main.css | 14 ++++++++++++++ version-control.md | 7 +++++-- virtual-machines.md | 5 ++++- web.md | 5 ++++- 20 files changed, 101 insertions(+), 23 deletions(-) create mode 100644 _layouts/lecture.html rename remotes.md => remote-machines.md (99%) diff --git a/_layouts/lecture.html b/_layouts/lecture.html new file mode 100644 index 00000000..f19b15f6 --- /dev/null +++ b/_layouts/lecture.html @@ -0,0 +1,13 @@ +--- +layout: default +--- + +

{{ page.title }}{% if page.subtitle %} {{ page.subtitle }}{% endif %}

+ +{% if page.video %} +
+ +
+{% endif %} + +{{ content }} diff --git a/automation.md b/automation.md index c1c2802a..f6a83819 100644 --- a/automation.md +++ b/automation.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Automation" presenter: Jose +video: + aspect: 56.25 + id: BaLlAaHz-1k --- Sometimes you write a script that does something but you want for it to run periodically, say a backup task. You can always write an *ad hoc* solution that runs in the background and comes online periodically. However, most UNIX systems come with the cron daemon which can run task with a frequency up to a minute based on simple rules. @@ -73,4 +76,4 @@ One caveat of using cron is that if the computer is powered off or asleep when t - https://xkcd.com/1205/ - https://xkcd.com/1319/ -{% endcomment %} \ No newline at end of file +{% endcomment %} diff --git a/backups.md b/backups.md index 22eec2ad..e9498b12 100644 --- a/backups.md +++ b/backups.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Backups" presenter: Jose +video: + aspect: 56.25 + id: lrpqYF8tcYQ --- There are two types of people: @@ -86,4 +89,4 @@ Some good backup programs and services we have used and can honestly recommend: 1. Think of a website you have visited repeatedly over the years and look it up in [archive.org](https://archive.org/web/), how many versions does it have? -1. One way to efficiently implement deduplication is to use hardlinks. Whereas symbolic link (also called soft link) is a file that points to another file or folder, a hardlink is a exact copy of the pointer (it uses the same inode and points to the same place in the disk). Thus if the original file is removed a symlink stops working whereas a hard link doesn't. However, hardlinks only work for files. Try using the command `ln` to create hard links and compare them to symlinks created with `ln -s`. (In macOS you will need to install the gnu coreutils or the hln package). \ No newline at end of file +1. 
One way to efficiently implement deduplication is to use hardlinks. Whereas symbolic link (also called soft link) is a file that points to another file or folder, a hardlink is a exact copy of the pointer (it uses the same inode and points to the same place in the disk). Thus if the original file is removed a symlink stops working whereas a hard link doesn't. However, hardlinks only work for files. Try using the command `ln` to create hard links and compare them to symlinks created with `ln -s`. (In macOS you will need to install the gnu coreutils or the hln package). diff --git a/command-line.md b/command-line.md index f5fa5256..99c8d01f 100644 --- a/command-line.md +++ b/command-line.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Command-line environment" presenter: Jose +video: + aspect: 56.25 + id: i0rf1gpKL1E --- ## Aliases & Functions @@ -167,4 +170,4 @@ The [atool](https://www.nongnu.org/atool/) package provides the `aunpack` comman 1. Install `fasd` or some similar software and write a bash/zsh function called `v` that performs fuzzy matching on the passed arguments and opens up the top result in your editor of choice. Then, modify it so that if there are multiple matches you can select them with `fzf`. 1. Since `fzf` is quite convenient for performing fuzzy searches and the shell history is quite prone to those kind of searches, investigate how to bind `fzf` to `^R`. You can find some info [here](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings) -1. What does the `--bar` option do in `ack`? \ No newline at end of file +1. What does the `--bar` option do in `ack`? 
diff --git a/course-overview.md b/course-overview.md index 11ff78e6..558139db 100644 --- a/course-overview.md +++ b/course-overview.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Course Overview" presenter: Anish +video: + aspect: 56.25 + id: qw2c6ffSVOM --- # Motivation diff --git a/data-wrangling.md b/data-wrangling.md index e40aec21..3dea3ff1 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Data Wrangling" presenter: Jon +video: + aspect: 56.25 + id: VW2jn9Okjhw --- Have you ever had a bunch of text and wanted to do something with it? diff --git a/dotfiles.md b/dotfiles.md index 731ca93b..0da12423 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Dotfiles" presenter: Anish +video: + aspect: 56.25 + id: YSZBWWJw3mI --- Many programs are configured using plain-text files known as "dotfiles" diff --git a/editors.md b/editors.md index 3ca643c0..007dfecd 100644 --- a/editors.md +++ b/editors.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Editors" presenter: Anish +video: + aspect: 56.25 + id: 1vLcusYSrI4 --- # Importance of Editors @@ -306,4 +309,4 @@ TODO resources for other editors? start seeing the benefits by then. At some point, you should be able to get your editor to work as fast as you think. -1. Install a linter (e.g. pyflakes for python) link it to your editor and test it is working. \ No newline at end of file +1. Install a linter (e.g. pyflakes for python) link it to your editor and test it is working. diff --git a/machine-introspection.md b/machine-introspection.md index dcdd96b6..a4698801 100644 --- a/machine-introspection.md +++ b/machine-introspection.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Machine Introspection" presenter: Jon +video: + aspect: 56.25 + id: eNYT2Oq3PF8 --- Sometimes, computers misbehave. And very often, you want to know why. 
diff --git a/os-customization.md b/os-customization.md index 71687b51..58a9d8c8 100644 --- a/os-customization.md +++ b/os-customization.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "OS Customization" presenter: Anish +video: + aspect: 56.25 + id: epSRVqQzeDo --- There is a lot you can do to customize your operating system beyond what is diff --git a/package-management.md b/package-management.md index dcdb9b92..35159c86 100644 --- a/package-management.md +++ b/package-management.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Package Management and Dependency Management" presenter: Anish +video: + aspect: 56.25 + id: tgvt473T8xA --- Software usually builds on (a collection of) other software, which necessitates diff --git a/program-introspection.md b/program-introspection.md index aae98c1f..dfde34c0 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Program Introspection" presenter: Anish +video: + aspect: 56.25 + id: 74MhV-7hYzg --- # Debugging diff --git a/remotes.md b/remote-machines.md similarity index 99% rename from remotes.md rename to remote-machines.md index 8af01ff1..07c119c1 100644 --- a/remotes.md +++ b/remote-machines.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Remote Machines" presenter: Jose +video: + aspect: 56.25 + id: X5c2Y8BCowM --- It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. diff --git a/schedule.md b/schedule.md index 07752a4d..4b3a86f6 100644 --- a/schedule.md +++ b/schedule.md @@ -28,7 +28,7 @@ blocks, with a 10 minute break in between. 
# Tuesday, 1/29 - [Program introspection](/program-introspection/) and [package/dependency management](/package-management/) -- [OS customization](/os-customization/) and [Remote Machines](/remotes/) +- [OS customization](/os-customization/) and [Remote Machines](/remote-machines/) # Thursday, 1/31 diff --git a/security.md b/security.md index f4bb23cc..b1ac7243 100644 --- a/security.md +++ b/security.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Security and Privacy" presenter: Jon +video: + aspect: 56.25 + id: OBx_c-i-M8s --- The world is a scary place, and everyone's out to get you. diff --git a/shell.md b/shell.md index 7016660c..3233afb2 100644 --- a/shell.md +++ b/shell.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Shell and Scripting" presenter: Jon +video: + aspect: 56.25 + id: Gn_zGUywz-Q --- The shell is an efficient, textual interface to your computer. diff --git a/static/css/main.css b/static/css/main.css index a20b35a5..45cf5524 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -170,6 +170,20 @@ hr { text-align: center; } +.youtube-wrapper { + position: relative; + height: 0; + margin-bottom: 1rem; +} + +.youtube-wrapper iframe { + position: absolute; + top: 0; + left: 0; + width: 100%; + height: 100%; +} + /* Elements */ #content { diff --git a/version-control.md b/version-control.md index 21925395..c93ed8f8 100644 --- a/version-control.md +++ b/version-control.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Version Control" presenter: Jon +video: + aspect: 56.25 + id: 3fig2Vz8QXs --- Whenever you are working on something that changes over time, it's @@ -357,4 +360,4 @@ if your push is rejected, what do you do? 
- git blame - exercise about why rebasing public commits is bad -{% endcomment %} \ No newline at end of file +{% endcomment %} diff --git a/virtual-machines.md b/virtual-machines.md index 89905574..157e0d3b 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Virtual Machines and Containers" presenter: Anish, Jon +video: + aspect: 56.25 + id: LJ9ki5zq6Ik --- # Virtual Machines diff --git a/web.md b/web.md index e25df3e7..efe1d1c2 100644 --- a/web.md +++ b/web.md @@ -1,7 +1,10 @@ --- -layout: page +layout: lecture title: "Web and Browsers" presenter: Jose +video: + aspect: 56.25 + id: XpZO3S8odec --- Apart from the terminal, the web browser is a tool you will find yourself spending significant amounts of time into. Thus it is worth learning how to use it efficiently and From e9b1ae45e1c3ccc567311bc548a0a0656640e012 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sun, 3 Feb 2019 11:04:39 -0500 Subject: [PATCH 087/640] Fix aspect ratios --- command-line.md | 2 +- dotfiles.md | 2 +- editors.md | 2 +- os-customization.md | 2 +- program-introspection.md | 2 +- remote-machines.md | 2 +- web.md | 2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/command-line.md b/command-line.md index 99c8d01f..7726ea56 100644 --- a/command-line.md +++ b/command-line.md @@ -3,7 +3,7 @@ layout: lecture title: "Command-line environment" presenter: Jose video: - aspect: 56.25 + aspect: 62.5 id: i0rf1gpKL1E --- diff --git a/dotfiles.md b/dotfiles.md index 0da12423..a4e9f128 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -3,7 +3,7 @@ layout: lecture title: "Dotfiles" presenter: Anish video: - aspect: 56.25 + aspect: 62.5 id: YSZBWWJw3mI --- diff --git a/editors.md b/editors.md index 007dfecd..d88c7546 100644 --- a/editors.md +++ b/editors.md @@ -3,7 +3,7 @@ layout: lecture title: "Editors" presenter: Anish video: - aspect: 56.25 + aspect: 62.5 id: 1vLcusYSrI4 --- diff --git a/os-customization.md 
b/os-customization.md index 58a9d8c8..59e5066a 100644 --- a/os-customization.md +++ b/os-customization.md @@ -3,7 +3,7 @@ layout: lecture title: "OS Customization" presenter: Anish video: - aspect: 56.25 + aspect: 62.5 id: epSRVqQzeDo --- diff --git a/program-introspection.md b/program-introspection.md index dfde34c0..d58b711a 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -3,7 +3,7 @@ layout: lecture title: "Program Introspection" presenter: Anish video: - aspect: 56.25 + aspect: 62.5 id: 74MhV-7hYzg --- diff --git a/remote-machines.md b/remote-machines.md index 07c119c1..e08be44b 100644 --- a/remote-machines.md +++ b/remote-machines.md @@ -3,7 +3,7 @@ layout: lecture title: "Remote Machines" presenter: Jose video: - aspect: 56.25 + aspect: 62.5 id: X5c2Y8BCowM --- diff --git a/web.md b/web.md index efe1d1c2..dc53282f 100644 --- a/web.md +++ b/web.md @@ -3,7 +3,7 @@ layout: lecture title: "Web and Browsers" presenter: Jose video: - aspect: 56.25 + aspect: 62.5 id: XpZO3S8odec --- From f1a4167dfa81a417b1b46826ec9ff4191a59f8f5 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 4 Feb 2019 09:57:39 -0500 Subject: [PATCH 088/640] Mention more password managers --- security.md | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/security.md b/security.md index b1ac7243..f3c6ce90 100644 --- a/security.md +++ b/security.md @@ -51,8 +51,16 @@ scary](https://www.bleepingcomputer.com/news/security/heres-a-list-of-29-differe ## Authentication The very first thing you should do, if you haven't already, is download -a password manager (probably [1password](https://1password.com/) or -[`pass`](https://www.passwordstore.org/)). Use it to generate passwords +a password manager. 
Some good ones are: + + - [1password](https://1password.com/) + - [KeePass](https://keepass.info/) + - [BitWarden](https://bitwarden.com/) + - [`pass`](https://www.passwordstore.org/) + +If you're particularly paranoid, use one that encrypts the passwords +locally on your computer, as opposed to storing them in plain-text at +the server. Use it to generate passwords for all the web sites you care about right now. Then, switch on two-factor authentication, ideally with a [FIDO/U2F](https://fidoalliance.org/) dongle (a From 6255f8f51df0163a8d93d9960e8b8183c6aaf102 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 4 Feb 2019 09:58:27 -0500 Subject: [PATCH 089/640] Link to https://www.securemessagingapps.com/ --- security.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/security.md b/security.md index f3c6ce90..0365766d 100644 --- a/security.md +++ b/security.md @@ -85,7 +85,8 @@ in a safe in the real world. Use [Signal](https://www.signal.org/) ([setup instructions](https://medium.com/@mshelton/signal-for-beginners-c6b44f76a1f0). -[Wire](https://wire.com/en/) is fine too; WhatsApp is okay; [don't use +[Wire](https://wire.com/en/) is [fine +too](https://www.securemessagingapps.com/); WhatsApp is okay; [don't use Telegram](https://twitter.com/bascule/status/897187286554628096)). Desktop messengers are pretty broken (partially due to usually relying on Electron, which is a huge trust stack). From 67adfc4dd88a40d58192094bec2d146ba3d9e0ea Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 4 Feb 2019 13:52:49 -0500 Subject: [PATCH 090/640] Link to shared locations --- index.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/index.md b/index.md index 61a1dd47..e213fd33 100644 --- a/index.md +++ b/index.md @@ -39,6 +39,19 @@ Have any questions? Send us an email at [hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on [Piazza](https://piazza.com/class/jqjpgaeaz77785). 
+# Beyond MIT + +We've also shared this class beyond MIT in the hopes that other may +benefit from these resources. You can find posts and discussion on + + - [Hacker News](https://news.ycombinator.com/item?id=19078281) + - [Lobsters](https://lobste.rs/s/h6157x/mit_hacker_tools_lecture_series_on) + - [`/r/learnprogramming`](https://www.reddit.com/r/learnprogramming/comments/an42uu/mit_hacker_tools_a_lecture_series_on_programmer/) + - [`/r/programming`](https://www.reddit.com/r/programming/comments/an3xki/mit_hacker_tools_a_lecture_series_on_programmer/) + - [Twitter](https://twitter.com/Jonhoo/status/1091896192332693504) + - [YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuiujH1lpn8cA9dsyulbYRv) + - [Facebook](https://www.facebook.com/jonhoo/posts/10161566630165387) + ---
From 5b0a5e56386062650037fcd8c570a7aafb129f51 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 4 Feb 2019 14:06:26 -0500 Subject: [PATCH 091/640] Make it more obvious that there are videos --- _includes/nav.html | 2 +- _layouts/redirect.html | 19 +++++++++++++++++++ index.md | 2 +- schedule.md => lectures.md | 5 ++--- schedule.html | 5 +++++ 5 files changed, 28 insertions(+), 5 deletions(-) create mode 100644 _layouts/redirect.html rename schedule.md => lectures.md (85%) create mode 100644 schedule.html diff --git a/_includes/nav.html b/_includes/nav.html index 7687507c..74c48d08 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -5,7 +5,7 @@ diff --git a/_layouts/redirect.html b/_layouts/redirect.html new file mode 100644 index 00000000..5f827964 --- /dev/null +++ b/_layouts/redirect.html @@ -0,0 +1,19 @@ +--- +layout: null +--- + + + + + + {{ site.title }} -- {{ page.title }} + + + + + +

Redirecting you to {{ page.redirect }}

+ + diff --git a/index.md b/index.md index e213fd33..b2ee3026 100644 --- a/index.md +++ b/index.md @@ -19,7 +19,7 @@ software, configure your desktop environment, and more. # Schedule Lectures are 3:30pm-5:30pm on Tuesdays and Thursdays, starting on January 15th. -See [here](/schedule/) for a full schedule. +See [here](/lectures/) for links to all lecture videos and lecture notes. # Registration diff --git a/schedule.md b/lectures.md similarity index 85% rename from schedule.md rename to lectures.md index 4b3a86f6..107d5052 100644 --- a/schedule.md +++ b/lectures.md @@ -1,10 +1,9 @@ --- layout: page -title: "Schedule" +title: "Lectures" --- -Lectures are 3:30pm-5:30pm in 32-124. Lectures are taught in two 50 minute -blocks, with a 10 minute break in between. +Click on specific topics below to see lecture videos and lecture notes. # Tuesday, 1/15 diff --git a/schedule.html b/schedule.html new file mode 100644 index 00000000..c5b83d2b --- /dev/null +++ b/schedule.html @@ -0,0 +1,5 @@ +--- +layout: redirect +redirect: /lectures/ +title: Lectures +--- From 47ba3dfeffa34ac7545dc17dd8d8f728c01cef4c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 4 Feb 2019 14:08:44 -0500 Subject: [PATCH 092/640] Make it more obvious that the class is over --- index.md | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/index.md b/index.md index b2ee3026..53b98a43 100644 --- a/index.md +++ b/index.md @@ -16,16 +16,11 @@ We’ll show you how to navigate the command line, use a powerful text editor, use version control efficiently, automate mundane tasks, manage packages and software, configure your desktop environment, and more. -# Schedule +# Lectures -Lectures are 3:30pm-5:30pm on Tuesdays and Thursdays, starting on January 15th. -See [here](/lectures/) for links to all lecture videos and lecture notes. - -# Registration +Hacker Tools has concluded for IAP 2019. 
-While registration is not _required_ to attend, if you intend to take the -class, we ask that you fill out this [short -form](https://goo.gl/forms/HSdsUQ204Ow8BgUs2). +See [here](/lectures/) for links to all lecture videos and lecture notes. # Staff From 119eb2e4c30df565da080680822cc2dbb0126aaf Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 4 Feb 2019 14:09:30 -0500 Subject: [PATCH 093/640] Remove subtitle --- index.md | 1 - 1 file changed, 1 deletion(-) diff --git a/index.md b/index.md index 53b98a43..e69b92f4 100644 --- a/index.md +++ b/index.md @@ -1,7 +1,6 @@ --- layout: page title: Hacker Tools -subtitle: IAP 2019 --- Learn to make the most of the tools that From 55363250e27e5f5eb074b3074dfae6a36294a4b7 Mon Sep 17 00:00:00 2001 From: Giannis Vrentzos Date: Mon, 4 Feb 2019 23:28:58 +0200 Subject: [PATCH 094/640] Remove quotes from variable `[ "$foo" = "bar" ]` is fine when `$foo` is quoted, but not without them. --- shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/shell.md b/shell.md index 3233afb2..077db51f 100644 --- a/shell.md +++ b/shell.md @@ -140,7 +140,7 @@ Globbing is the answer! Whitespace issues don't stop there: - - `if [ "$foo" = "bar" ]; then` -- see the issue? + - `if [ $foo = "bar" ]; then` -- see the issue? - what if `$foo` is empty? arguments to `[` are `=` and `bar`... - _can_ work around this with `[ "x$foo" = "xbar" ]`, but bleh - instead, use `[[`: bash built-in comparator that has special parsing From 0ce3e27945525228897e3bbd996786c0ed85e990 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 4 Feb 2019 16:36:39 -0500 Subject: [PATCH 095/640] Add link to edit current lecture page --- _layouts/lecture.html | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/_layouts/lecture.html b/_layouts/lecture.html index f19b15f6..d0cdc3b7 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -11,3 +11,9 @@

{{ page.title }}{% if page.subtitle %} {% endif %} {{ content }} + +
+ + From 801159e992c2d93bca3d9b03d9854200b9afc724 Mon Sep 17 00:00:00 2001 From: Giannis Vrentzos Date: Mon, 4 Feb 2019 23:41:57 +0200 Subject: [PATCH 096/640] Remove quotes --- shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/shell.md b/shell.md index 077db51f..b5b3dc65 100644 --- a/shell.md +++ b/shell.md @@ -142,7 +142,7 @@ Whitespace issues don't stop there: - `if [ $foo = "bar" ]; then` -- see the issue? - what if `$foo` is empty? arguments to `[` are `=` and `bar`... - - _can_ work around this with `[ "x$foo" = "xbar" ]`, but bleh + - _can_ work around this with `[ x$foo = "xbar" ]`, but bleh - instead, use `[[`: bash built-in comparator that has special parsing - also allows `&&` instead of `-a`, `||` over `-o`, etc. From ea9df7a0c8e1ae1237d634bbbb2674a5e9c9f6df Mon Sep 17 00:00:00 2001 From: Allan Peng Date: Mon, 4 Feb 2019 16:15:29 -0800 Subject: [PATCH 097/640] typo hehe --- command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/command-line.md b/command-line.md index 7726ea56..a0d124e9 100644 --- a/command-line.md +++ b/command-line.md @@ -162,7 +162,7 @@ The [atool](https://www.nongnu.org/atool/) package provides the `aunpack` comman ## Exercises -1. Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get top 10 most used commands and consider writing sorter aliases for them +1. Run `cat .bash_history | sort | uniq -c | sort -rn | head -n 10` (or `cat .zhistory | sort | uniq -c | sort -rn | head -n 10` for zsh) to get top 10 most used commands and consider writing shorter aliases for them 1. Choose a terminal emulator and figure out how to change the following properties: - Font choice - Color scheme. How many colors does a standard scheme have? why? 
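
Note on the exercise touched by the patch above: a rough sketch of what the `sort | uniq -c | sort -rn | head` pipeline computes, run on a small made-up history rather than a real `.bash_history`. The names `history_sample` and `top_commands` are hypothetical and used only for illustration; they do not appear in the course notes.

```shell
# Hypothetical sample history; a real one would be read from
# ~/.bash_history (or ~/.zhistory for zsh), as in the exercise.
history_sample='ls
git status
ls
cd src
git status
ls'

# top_commands N: print the N most frequent lines with their counts.
#   sort      groups identical lines next to each other
#   uniq -c   collapses each group and prefixes it with its count
#   sort -rn  orders by that count, largest first
#   head -n   keeps only the first N entries (default 10)
top_commands() {
  printf '%s\n' "$history_sample" | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

top_commands 2
```

The first `sort` is what makes the counts meaningful: `uniq -c` only collapses adjacent duplicate lines, so unsorted input would report the same command several times.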
From 917b765817ef2849c780a08126171961862e0824 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Markus=20M=C3=BCller?= Date: Tue, 5 Feb 2019 09:06:39 +0100 Subject: [PATCH 098/640] Typo in command-line.md --- command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/command-line.md b/command-line.md index a0d124e9..10f224a6 100644 --- a/command-line.md +++ b/command-line.md @@ -107,7 +107,7 @@ In the next few subsections I will cover alternative tools for extremely common ### `fasd` vs `cd` -Even with improved path expansion and tab autocomplete, changing directories can become quite repetitive. [Fasd]https://github.com/clvv/fasd() (or [autojump](https://github.com/wting/autojump)) solves this issue by keeping track of recent and frequent folders you have been to and performing fuzzy matching. +Even with improved path expansion and tab autocomplete, changing directories can become quite repetitive. [Fasd](https://github.com/clvv/fasd) (or [autojump](https://github.com/wting/autojump)) solves this issue by keeping track of recent and frequent folders you have been to and performing fuzzy matching. Thus if I have visited the path `/home/user/awesome_project/code` running `z code` will `cd` to it. If I have multiple folders called code I can disambiguate by running `z awe code` which will be closer match. Unlike autojump, fasd also provides commands that instead of performing `cd` just expand frequent and /or recent files,folders or both. 
From 75732a70c853198988f0f2443ec6ae5a0f14ea5b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Imm=C3=A1nuel!?= Date: Tue, 5 Feb 2019 13:01:47 +0100 Subject: [PATCH 099/640] Replace the dummy file to the actual script + typo --- shell.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/shell.md b/shell.md index b5b3dc65..1701adcb 100644 --- a/shell.md +++ b/shell.md @@ -304,8 +304,8 @@ Also, a double dash `--` is used in built-in commands and many other commands to There is a way in which pipes (and process substitution) differ from using subshell execution, i.e. `$()`. Run the following commands and observe the differences: - `./slow_seq.sh | grep -P "[3-6]"` - - `grep -P "[3-6]" <(./script.sh)` - - `echo $(./script.sh) | grep -P "[3-6]"` + - `grep -P "[3-6]" <(./slow_seq.sh)` + - `echo $(./slow_seq.sh) | grep -P "[3-6]"` 1. **Misc** @@ -313,7 +313,7 @@ Also, a double dash `--` is used in built-in commands and many other commands to - Sometimes you want to keep STDIN and still pipe it to a file. Try running `echo HELLO | tee hello.txt` - Try running `cat hello.txt > hello.txt ` what do you expect to happen? What does happen? - Run `echo HELLO > hello.txt` and then run `echo WORLD >> hello.txt`. What are the contents of `hello.txt`? How is `>` different from `>>`? - - Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more search for ANSI color escape sequences. + - Run `printf "\e[38;5;81mfoo\e[0m\n"`. How was the output different? If you want to know more, search for ANSI color escape sequences. - Run `touch a.txt` then run `^txt^log` what did bash do for you? In the same vein, run `fc`. What does it do? 
{% comment %} From db1299ab91b4b8e1e6af38a2d654768824f046af Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 5 Feb 2019 16:16:26 -0500 Subject: [PATCH 100/640] Add r/hackertools links and remove Piazza links --- automation.md | 2 ++ backups.md | 2 ++ command-line.md | 2 ++ course-overview.md | 22 ++++++++++++++++++++-- data-wrangling.md | 2 ++ dotfiles.md | 2 ++ editors.md | 2 ++ index.md | 5 ++++- machine-introspection.md | 2 ++ os-customization.md | 2 ++ package-management.md | 2 ++ program-introspection.md | 2 ++ remote-machines.md | 2 ++ security.md | 2 ++ shell.md | 2 ++ version-control.md | 2 ++ virtual-machines.md | 2 ++ web.md | 2 ++ 18 files changed, 56 insertions(+), 3 deletions(-) diff --git a/automation.md b/automation.md index f6a83819..8cd4d59c 100644 --- a/automation.md +++ b/automation.md @@ -7,6 +7,8 @@ video: id: BaLlAaHz-1k --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anidgj/automation_iap_2019/) + Sometimes you write a script that does something but you want for it to run periodically, say a backup task. You can always write an *ad hoc* solution that runs in the background and comes online periodically. However, most UNIX systems come with the cron daemon which can run task with a frequency up to a minute based on simple rules. On most UNIX systems the cron daemon, `crond` will be running by default but you can always check using `ps aux | grep crond`. 
diff --git a/backups.md b/backups.md index e9498b12..54bdc54a 100644 --- a/backups.md +++ b/backups.md @@ -7,6 +7,8 @@ video: id: lrpqYF8tcYQ --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anifrx/backups_iap_2019/) + There are two types of people: - Those who do backups diff --git a/command-line.md b/command-line.md index 10f224a6..085624eb 100644 --- a/command-line.md +++ b/command-line.md @@ -7,6 +7,8 @@ video: id: i0rf1gpKL1E --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anick3/commandline_environment_iap_2019/) + ## Aliases & Functions As you can imagine it can become tiresome typing long commands that involve many flags or verbose options. Nevertheless, most shells support **aliasing**. For instance, an alias in bash has the following structure (note there is no space around the `=` sign): diff --git a/course-overview.md b/course-overview.md index 558139db..73268e3f 100644 --- a/course-overview.md +++ b/course-overview.md @@ -7,6 +7,8 @@ video: id: qw2c6ffSVOM --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anic30/course_overview_iap_2019/) + # Motivation This class is about [hacker](https://en.wikipedia.org/wiki/Hacker_culture) @@ -35,16 +37,32 @@ own. We'll inspire you to learn more about your tools, and we'll show you what's possible and cover some of the basics in detail, but we can't teach you everything in the time we have. +Please post questions on [r/hackertools]. In addition, we ask that you share your +knowledge with your classmates through [r/hackertools] --- **for "homework" for each +lecture, create a post or comment about something you've learned or something you'd +like to share about the topic**. + +# Discussion + +You can find discussion about the class topics in the subreddit [r/hackertools] + +{% comment %} + +# Exercises + Please post questions on [Piazza]. 
In addition, we ask that you share your knowledge with your classmates through Piazza --- **for "homework" for each lecture, create a Piazza note about something you've learned or something you'd like to share about the topic**. -# Exercises - 1. Fill out the [registration form](https://goo.gl/forms/HSdsUQ204Ow8BgUs2) if you haven't already. 1. Sign up for [Piazza]. [Piazza]: https://piazza.com/class/jqjpgaeaz77785 + + +{% endcomment %} + +[r/hackertools]: https://www.reddit.com/r/hackertools \ No newline at end of file diff --git a/data-wrangling.md b/data-wrangling.md index 3dea3ff1..3a1496a1 100644 --- a/data-wrangling.md +++ b/data-wrangling.md @@ -7,6 +7,8 @@ video: id: VW2jn9Okjhw --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) + Have you ever had a bunch of text and wanted to do something with it? Good. That's what data wrangling is all about! Specifically, adapting data from one format to another, until you end up diff --git a/dotfiles.md b/dotfiles.md index a4e9f128..8e64f82e 100644 --- a/dotfiles.md +++ b/dotfiles.md @@ -7,6 +7,8 @@ video: id: YSZBWWJw3mI --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anidcd/dotfiles_iap_2019/) + Many programs are configured using plain-text files known as "dotfiles" (because the file names begin with a `.`, e.g. `~/.gitconfig`, so that they are hidden in the directory listing `ls` by default). diff --git a/editors.md b/editors.md index d88c7546..e55cd3c0 100644 --- a/editors.md +++ b/editors.md @@ -7,6 +7,8 @@ video: id: 1vLcusYSrI4 --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anid4e/editors_iap_2019/) + # Importance of Editors As programmers, we spend most of our time editing plain-text files. It's worth diff --git a/index.md b/index.md index e69b92f4..9482674e 100644 --- a/index.md +++ b/index.md @@ -30,8 +30,11 @@ Ortiz](http://josejg.com/). # Questions Have any questions? 
Send us an email at -[hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on +[hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on [r/hackertools](https://www.reddit.com/r/hackertools/) + +{% comment %} [Piazza](https://piazza.com/class/jqjpgaeaz77785). +{% endcomment %} # Beyond MIT diff --git a/machine-introspection.md b/machine-introspection.md index a4698801..a34bce7e 100644 --- a/machine-introspection.md +++ b/machine-introspection.md @@ -7,6 +7,8 @@ video: id: eNYT2Oq3PF8 --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anidns/machine_introspection_iap_2019/) + Sometimes, computers misbehave. And very often, you want to know why. Let's look at some tools that help you do that! diff --git a/os-customization.md b/os-customization.md index 59e5066a..27f5ad5d 100644 --- a/os-customization.md +++ b/os-customization.md @@ -7,6 +7,8 @@ video: id: epSRVqQzeDo --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anie4v/os_customization_iap_2019/) + There is a lot you can do to customize your operating system beyond what is available in the settings menus. diff --git a/package-management.md b/package-management.md index 35159c86..8999f768 100644 --- a/package-management.md +++ b/package-management.md @@ -7,6 +7,8 @@ video: id: tgvt473T8xA --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anidxk/package_management_and_dependency_management_iap/) + Software usually builds on (a collection of) other software, which necessitates dependency management. diff --git a/program-introspection.md b/program-introspection.md index d58b711a..df568bdf 100644 --- a/program-introspection.md +++ b/program-introspection.md @@ -7,6 +7,8 @@ video: id: 74MhV-7hYzg --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anidrp/program_introspection_iap_2019/) + # Debugging When printf-debugging isn't good enough: use a debugger. 
diff --git a/remote-machines.md b/remote-machines.md index e08be44b..dae55eba 100644 --- a/remote-machines.md +++ b/remote-machines.md @@ -7,6 +7,8 @@ video: id: X5c2Y8BCowM --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anie9u/remote_machines_iap_2019/) + It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. diff --git a/security.md b/security.md index 0365766d..0899b0f8 100644 --- a/security.md +++ b/security.md @@ -7,6 +7,8 @@ video: id: OBx_c-i-M8s --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/aniekk/security_and_privacy_iap_2019/) + The world is a scary place, and everyone's out to get you. Okay, maybe not, but that doesn't mean you want to flaunt all your diff --git a/shell.md b/shell.md index 1701adcb..7c997df3 100644 --- a/shell.md +++ b/shell.md @@ -7,6 +7,8 @@ video: id: Gn_zGUywz-Q --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anieve/shell_and_scripting_iap_2019/) + The shell is an efficient, textual interface to your computer. The shell prompt: what greets you when you open a terminal. diff --git a/version-control.md b/version-control.md index c93ed8f8..2e8018db 100644 --- a/version-control.md +++ b/version-control.md @@ -7,6 +7,8 @@ video: id: 3fig2Vz8QXs --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anid8p/version_control_iap_2019/) + Whenever you are working on something that changes over time, it's useful to be able to _track_ those changes. 
This can be for a number of reasons: it gives you a record of what changed, how to undo it, who diff --git a/virtual-machines.md b/virtual-machines.md index 157e0d3b..ad5e260c 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -7,6 +7,8 @@ video: id: LJ9ki5zq6Ik --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicey/virtual_machines_and_containers_iap_2019/) + # Virtual Machines Virtual machines are simulated computers. You can configure a guest virtual diff --git a/web.md b/web.md index dc53282f..9e5575cc 100644 --- a/web.md +++ b/web.md @@ -7,6 +7,8 @@ video: id: XpZO3S8odec --- +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anief6/web_and_browsers_iap_2019/) + Apart from the terminal, the web browser is a tool you will find yourself spending significant amounts of time into. Thus it is worth learning how to use it efficiently and ## Shortcuts From 24a84a4d0671216a8b6a860de259d5e48d1ddbb8 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 5 Feb 2019 17:08:32 -0500 Subject: [PATCH 101/640] Mention backup e-mail --- backups.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/backups.md b/backups.md index 54bdc54a..55e2adc9 100644 --- a/backups.md +++ b/backups.md @@ -66,7 +66,7 @@ Some other things you may want to look into are: ## Webservices -Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playlists. Figuring out a backup solution in scenario is somewhat trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. +Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playlists. 
An easy one to forget are email accounts with web access. Figuring out a backup solution in scenario is somewhat trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. ## Webpages @@ -87,7 +87,9 @@ Some good backup programs and services we have used and can honestly recommend: 1. Consider how you are (not) backing up your data and look into fixing/improving that. -1. Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made solutions based on available APIs. +1. Figure out how to backup your email accounts + +1. Choose a webservice you use often (Spotify, Google Music, &c) and figure out what options for backing up your data are. Often people have already made tools (such as [youtube-dl](https://github.com/rg3/youtube-dl)) solutions based on available APIs. 1. Think of a website you have visited repeatedly over the years and look it up in [archive.org](https://archive.org/web/), how many versions does it have? From 50d9a28b044bfed98a26ef8310e409223872f8a2 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 5 Feb 2019 17:08:55 -0500 Subject: [PATCH 102/640] Add Public APIs ref --- web.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/web.md b/web.md index 9e5575cc..e48c178d 100644 --- a/web.md +++ b/web.md @@ -126,7 +126,7 @@ There are also script repositories such as [OpenUserJS](https://openuserjs.org/) ## Web APIs It has become more and more common for webservices to offer an application interface aka web API so you can interact with the services making web requests. -A more in depth introduction to the topic can be found [here](https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Introduction). 
Web APIs can be useful for very many reasons: +A more in depth introduction to the topic can be found [here](https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Client-side_web_APIs/Introduction). There are [many public APIs](https://github.com/toddmotto/public-apis). Web APIs can be useful for very many reasons: - **Retrieval**. Web APIs can quite easily provide you information such as maps, weather or what your public ip address. For instance `curl ipinfo.io` will return a JSON object with some details about your public ip, region, location, &c. With proper parsing these tools can be integrated even with command line tools. The following bash functions talks to Googles autocompletion API and returns the first ten matches. From 7c6ae711f8a4a232c225a3d686e45db3da8e87f3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Klemens=20K=C3=BChle?= Date: Wed, 6 Feb 2019 00:31:54 +0100 Subject: [PATCH 103/640] Update virtual-machines.md Just a Typo --- virtual-machines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/virtual-machines.md b/virtual-machines.md index ad5e260c..4a21d0f3 100644 --- a/virtual-machines.md +++ b/virtual-machines.md @@ -59,7 +59,7 @@ enable nicer integration with host system. You should use this if you can. - [VMWare](https://www.vmware.com/) (commercial, available from IS&T [for MIT students](https://ist.mit.edu/vmware-fusion)) -If you are already familiar with popular hypervisors/VMs you many want to learn more about how to do this from a command line friendly way. One option is the [libvirt](https://wiki.libvirt.org/page/UbuntuKVMWalkthrough) toolkit which allows you to manage multiple different virtualization providers/hypervisors. +If you are already familiar with popular hypervisors/VMs you may want to learn more about how to do this from a command line friendly way. 
One option is the [libvirt](https://wiki.libvirt.org/page/UbuntuKVMWalkthrough) toolkit which allows you to manage multiple different virtualization providers/hypervisors. ## Exercises From 33d64208c6464b9565076e49be9a2c7de426fdd5 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 6 Feb 2019 10:25:12 -0500 Subject: [PATCH 104/640] Add ref to gmail backups --- backups.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backups.md b/backups.md index 55e2adc9..2bf18baf 100644 --- a/backups.md +++ b/backups.md @@ -66,7 +66,7 @@ Some other things you may want to look into are: ## Webservices -Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playlists. An easy one to forget are email accounts with web access. Figuring out a backup solution in scenario is somewhat trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. +Not all the data that you use lives on your hard disk. If you use **webservices** then it might be the case that some data you care about is stored there such as Google Docs presentations or Spotify playlists. An easy one to forget are email accounts with web access such as Gmail. However there are tools available to download the email files to your computer such as [gmvault](https://github.com/gaubert/gmvault). Figuring out a backup solution in scenario is somewhat trickier. Nevertheless, most of these services offer you the possibility to download that data, either directly or through a web API. 
## Webpages From 48a63b9d16424f2a5dbafca4d6ba2c5cdb829bdd Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Fri, 8 Feb 2019 10:39:45 -0500 Subject: [PATCH 105/640] Try to make front page easier to navigate --- index.md | 25 ++++++++----------------- 1 file changed, 8 insertions(+), 17 deletions(-) diff --git a/index.md b/index.md index 9482674e..3edb618c 100644 --- a/index.md +++ b/index.md @@ -15,34 +15,25 @@ We’ll show you how to navigate the command line, use a powerful text editor, use version control efficiently, automate mundane tasks, manage packages and software, configure your desktop environment, and more. -# Lectures +**See [here](/lectures/) for links to all lecture videos and lecture notes.** -Hacker Tools has concluded for IAP 2019. +**Lectures**: Hacker Tools has concluded for IAP 2019. -See [here](/lectures/) for links to all lecture videos and lecture notes. - -# Staff - -This class is co-taught by [Anish Athalye](https://www.anishathalye.com/), [Jon +**Staff**: This class is co-taught by [Anish Athalye](https://www.anishathalye.com/), [Jon Gjengset](https://thesquareplanet.com/), and [Jose Javier Gonzalez Ortiz](http://josejg.com/). -# Questions - -Have any questions? Send us an email at -[hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on [r/hackertools](https://www.reddit.com/r/hackertools/) - -{% comment %} -[Piazza](https://piazza.com/class/jqjpgaeaz77785). -{% endcomment %} +**Questions**: Send us an email at +[hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on +[r/hackertools](https://www.reddit.com/r/hackertools/) -# Beyond MIT +## Beyond MIT We've also shared this class beyond MIT in the hopes that other may benefit from these resources. 
You can find posts and discussion on - [Hacker News](https://news.ycombinator.com/item?id=19078281) - - [Lobsters](https://lobste.rs/s/h6157x/mit_hacker_tools_lecture_series_on) + - [Lobsters](https://lobste.rs/s/h6157x/mit_hacker_tools_lecture_series_on) — you'll need an [invite](https://lobste.rs/about#invitations) to comment - [`/r/learnprogramming`](https://www.reddit.com/r/learnprogramming/comments/an42uu/mit_hacker_tools_a_lecture_series_on_programmer/) - [`/r/programming`](https://www.reddit.com/r/programming/comments/an3xki/mit_hacker_tools_a_lecture_series_on_programmer/) - [Twitter](https://twitter.com/Jonhoo/status/1091896192332693504) From c01f424f50234b433877885333e1c9c67909d583 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Fri, 8 Feb 2019 10:40:41 -0500 Subject: [PATCH 106/640] Mention r/hackertools in the list too --- index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/index.md b/index.md index 3edb618c..afe23713 100644 --- a/index.md +++ b/index.md @@ -32,6 +32,7 @@ Ortiz](http://josejg.com/). We've also shared this class beyond MIT in the hopes that other may benefit from these resources. 
You can find posts and discussion on + - [`/r/hackertools`](https://www.reddit.com/r/hackertools) - [Hacker News](https://news.ycombinator.com/item?id=19078281) - [Lobsters](https://lobste.rs/s/h6157x/mit_hacker_tools_lecture_series_on) — you'll need an [invite](https://lobste.rs/about#invitations) to comment - [`/r/learnprogramming`](https://www.reddit.com/r/learnprogramming/comments/an42uu/mit_hacker_tools_a_lecture_series_on_programmer/) From 207896fb5fdab9435950d3c223f047b3ea4456d8 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Fri, 8 Feb 2019 10:42:52 -0500 Subject: [PATCH 107/640] Slightly nicer formatting --- index.md | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/index.md b/index.md index afe23713..989fc233 100644 --- a/index.md +++ b/index.md @@ -17,15 +17,11 @@ software, configure your desktop environment, and more. **See [here](/lectures/) for links to all lecture videos and lecture notes.** -**Lectures**: Hacker Tools has concluded for IAP 2019. +## About the class -**Staff**: This class is co-taught by [Anish Athalye](https://www.anishathalye.com/), [Jon -Gjengset](https://thesquareplanet.com/), and [Jose Javier Gonzalez -Ortiz](http://josejg.com/). - -**Questions**: Send us an email at -[hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on -[r/hackertools](https://www.reddit.com/r/hackertools/) +**Lectures**: Hacker Tools has concluded for IAP 2019.
+**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/).
+**Questions**: Email us at [hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on [r/hackertools](https://www.reddit.com/r/hackertools/) ## Beyond MIT From f2348b7a7e5f182164bdd76431e5326f01ee4397 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 8 Feb 2019 12:30:33 -0500 Subject: [PATCH 108/640] Make lectures page backed by data file --- _data/lectures.yml | 51 ++++++++++++++++++++++++++++++++++++++++++++++ lectures.html | 18 ++++++++++++++++ lectures.md | 35 ------------------------------- 3 files changed, 69 insertions(+), 35 deletions(-) create mode 100644 _data/lectures.yml create mode 100644 lectures.html delete mode 100644 lectures.md diff --git a/_data/lectures.yml b/_data/lectures.yml new file mode 100644 index 00000000..f47b4d03 --- /dev/null +++ b/_data/lectures.yml @@ -0,0 +1,51 @@ +- date: Tuesday, 1/15 + topics: + - title: course overview + url: /course-overview/ + - title: virtual machines and containers + url: /virtual-machines/ + - title: shell and scripting + url: /shell/ + +- date: Thursday, 1/17 + topics: + - title: command-line environment + url: /command-line/ + - title: data wrangling + url: /data-wrangling/ + +- date: Tuesday, 1/22 + topics: + - title: editors + url: /editors/ + - title: version control + url: /version-control/ + +- date: Thursday, 1/24 + topics: + - title: dotfiles + url: /dotfiles/ + - title: backups + url: /backups/ + - title: automation + url: /automation/ + - title: machine introspection + url: /machine-introspection/ + +- date: Tuesday, 1/29 + topics: + - title: program introspection + url: /program-introspection/ + - title: package/dependency management + url: /package-management/ + - title: OS customization + url: /os-customization/ + - title: remote machines + url: /remote-machines/ + +- date: Thursday, 1/31 + topics: + - title: web and browsers + url: /web/ + - title: security and privacy + url: /security/ diff --git a/lectures.html b/lectures.html new file mode 100644 index 00000000..44d8c40c --- 
/dev/null +++ b/lectures.html @@ -0,0 +1,18 @@ +--- +layout: page +title: "Lectures" +--- + +

Click on specific topics below to see lecture videos and lecture notes.

+ +{% for lecture in site.data.lectures %} + +

{{ lecture.date }}

+ + + +{% endfor %} diff --git a/lectures.md b/lectures.md deleted file mode 100644 index 107d5052..00000000 --- a/lectures.md +++ /dev/null @@ -1,35 +0,0 @@ ---- -layout: page -title: "Lectures" ---- - -Click on specific topics below to see lecture videos and lecture notes. - -# Tuesday, 1/15 - -- [Course overview](/course-overview/) and [virtual machines and containers](/virtual-machines/) -- [Shell and scripting](/shell/) - -# Thursday, 1/17 - -- [Command-line environment](/command-line/) -- [Data wrangling](/data-wrangling/) - -# Tuesday, 1/22 - -- [Editors](/editors/) -- [Version control](/version-control/) - -# Thursday, 1/24 - -- [Dotfiles](/dotfiles/) and [backups](/backups/) -- [Automation](/automation/) and [machine introspection](/machine-introspection/) - -# Tuesday, 1/29 -- [Program introspection](/program-introspection/) and [package/dependency management](/package-management/) -- [OS customization](/os-customization/) and [Remote Machines](/remote-machines/) - -# Thursday, 1/31 - -- [Web and browsers](/web/) -- [Security and privacy](/security/) From 0ad0e0afe942fbb54cb73de50a1c872b159f7d3a Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 8 Feb 2019 12:35:41 -0500 Subject: [PATCH 109/640] Add topic list to front page --- index.md | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/index.md b/index.md index 989fc233..7e14cbe3 100644 --- a/index.md +++ b/index.md @@ -15,12 +15,20 @@ We’ll show you how to navigate the command line, use a powerful text editor, use version control efficiently, automate mundane tasks, manage packages and software, configure your desktop environment, and more. -**See [here](/lectures/) for links to all lecture videos and lecture notes.** +## Topics + + ## About the class -**Lectures**: Hacker Tools has concluded for IAP 2019.
-**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/).
+**Lectures**: Hacker Tools has concluded for IAP 2019. +**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/). **Questions**: Email us at [hacker-tools@mit.edu](mailto:hacker-tools@mit.edu) or post on [r/hackertools](https://www.reddit.com/r/hackertools/) ## Beyond MIT From 3885dec38575d9aeb45229dfa15fe9ea4a47ae81 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 8 Feb 2019 12:36:40 -0500 Subject: [PATCH 110/640] Do capitalization manually The liquid capitalize filter didn't work nicely with abbreviations like "OS". --- _data/lectures.yml | 32 ++++++++++++++++---------------- index.md | 2 +- lectures.html | 2 +- 3 files changed, 18 insertions(+), 18 deletions(-) diff --git a/_data/lectures.yml b/_data/lectures.yml index f47b4d03..3676f46a 100644 --- a/_data/lectures.yml +++ b/_data/lectures.yml @@ -1,51 +1,51 @@ - date: Tuesday, 1/15 topics: - - title: course overview + - title: Course overview url: /course-overview/ - - title: virtual machines and containers + - title: Virtual machines and containers url: /virtual-machines/ - - title: shell and scripting + - title: Shell and scripting url: /shell/ - date: Thursday, 1/17 topics: - - title: command-line environment + - title: Command-line environment url: /command-line/ - - title: data wrangling + - title: Data wrangling url: /data-wrangling/ - date: Tuesday, 1/22 topics: - - title: editors + - title: Editors url: /editors/ - - title: version control + - title: Version control url: /version-control/ - date: Thursday, 1/24 topics: - - title: dotfiles + - title: Dotfiles url: /dotfiles/ - - title: backups + - title: Backups url: /backups/ - - title: automation + - title: Automation url: /automation/ - - title: machine introspection + - title: Machine introspection url: /machine-introspection/ - date: Tuesday, 1/29 topics: - - title: program introspection + - title: Program introspection url: /program-introspection/ - - 
title: package/dependency management + - title: Package/dependency management url: /package-management/ - title: OS customization url: /os-customization/ - - title: remote machines + - title: Remote machines url: /remote-machines/ - date: Thursday, 1/31 topics: - - title: web and browsers + - title: Web and browsers url: /web/ - - title: security and privacy + - title: Security and privacy url: /security/ diff --git a/index.md b/index.md index 7e14cbe3..275d0933 100644 --- a/index.md +++ b/index.md @@ -20,7 +20,7 @@ software, configure your desktop environment, and more. diff --git a/lectures.html b/lectures.html index 44d8c40c..11c4d819 100644 --- a/lectures.html +++ b/lectures.html @@ -11,7 +11,7 @@

{{ lecture.date }}

From f51c8f2f43ba6f8b815493573089da69b77aa0e1 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 8 Feb 2019 12:37:56 -0500 Subject: [PATCH 111/640] Restore note about videos/notes --- index.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/index.md b/index.md index 275d0933..d9a3e9c5 100644 --- a/index.md +++ b/index.md @@ -17,6 +17,8 @@ software, configure your desktop environment, and more. ## Topics +Click on specific topics below to see lecture videos and lecture notes. + -See [here](/lectures/) for more on this year's topics, including links to +See [here](/2020/) for more on this year's topics, including links to lecture notes and videos. If you want to get a sense of what the class was like last year, check out [last year's -lectures](https://hacker-tools.github.io/lectures/). +lectures](/2019/). # About the class From b88ca7dcbbd55db15093c1713789f6ceb270116f Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 17 Jan 2020 15:44:49 -0500 Subject: [PATCH 188/640] Add lecture videos --- _2020/course-shell.md | 6 +++--- _2020/data-wrangling.md | 6 +++--- _2020/editors.md | 6 +++--- _2020/shell-tools.md | 3 +++ 4 files changed, 12 insertions(+), 9 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 9acee11c..c55288e0 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -3,9 +3,9 @@ layout: lecture title: "Course overview + the shell" date: 2019-1-13 ready: true -# video: -# aspect: 56.25 -# id: qw2c6ffSVOM +video: + aspect: 56.25 + id: Yh-iV6Vn5W4 --- {% comment %} diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 89595663..b5da8587 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -3,9 +3,9 @@ layout: lecture title: "Data Wrangling" date: 2019-1-16 ready: true -# video: -# aspect: 56.25 -# id: VW2jn9Okjhw +video: + aspect: 56.25 + id: QQiUPFvIMt8 --- {% comment %} diff --git a/_2020/editors.md b/_2020/editors.md index 08a4d75f..2fd1bfaf 100644 --- a/_2020/editors.md +++ 
b/_2020/editors.md @@ -3,9 +3,9 @@ layout: lecture title: "Editors (Vim)" date: 2019-1-15 ready: true -#video: -# aspect: 62.5 -# id: 1vLcusYSrI4 +video: + aspect: 62.5 + id: BE-xaxvDEpo --- Writing English words and writing code are very different activities. When diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 051f7f9f..44b456b2 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -3,6 +3,9 @@ layout: lecture title: "Shell Tools and Scripting" date: 2019-1-14 ready: true +video: + aspect: 56.25 + id: 2APJRjhBiYc --- In this lecture we will present some of the basics of using bash as a scripting language along with a number of shell tools that cover several of the most common tasks that you will be constantly performing in the command line. From 415a54f1c2eb35d9d5a983e1835372fefc55330c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 17 Jan 2020 16:04:43 -0500 Subject: [PATCH 189/640] Fix --- static/css/main.css | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/static/css/main.css b/static/css/main.css index 0cdb713c..56326000 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -30,7 +30,7 @@ body { margin: 0; color: #000; background-color: #fff; - + overflow-y: scroll; } h1, h2, h3, h4, h5, h6 { From 123ebe06f9511c44b6bc9cc778e4da2b1d4be5d1 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Fri, 17 Jan 2020 16:09:07 -0500 Subject: [PATCH 190/640] Link to YouTube playlist --- _2020/index.html | 2 ++ index.md | 6 ++---- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/_2020/index.html b/_2020/index.html index 961789d6..2ceee8dd 100644 --- a/_2020/index.html +++ b/_2020/index.html @@ -28,6 +28,8 @@ {% endfor %} +Video recordings of the lectures are available on YouTube. +

Previous year's lectures

You can find lecture notes and videos from last year's version of this class.

diff --git a/index.md b/index.md index 464384e1..103850d7 100644 --- a/index.md +++ b/index.md @@ -43,10 +43,8 @@ Sign up for the IAP 2020 class by filling out this [registration form](https://f {% endfor %} -See [here](/2020/) for more on this year's topics, including links to -lecture notes and videos. If you want to get a sense of what the class was like -last year, check out [last year's -lectures](/2019/). +Video recordings of the lectures are available [on +YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J). # About the class From b049fecddd7bfd96e700a70c33604df160493383 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 17 Jan 2020 16:17:31 -0500 Subject: [PATCH 191/640] Fix aspect ratio --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 2fd1bfaf..e12eb134 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -4,7 +4,7 @@ title: "Editors (Vim)" date: 2019-1-15 ready: true video: - aspect: 62.5 + aspect: 56.25 id: BE-xaxvDEpo --- From 17a637b6bacb1bed5d87efdd71ad37e94a1af34c Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 17 Jan 2020 18:37:44 -0500 Subject: [PATCH 192/640] Rewrite exercise possible ambiguity --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 44b456b2..4bd09471 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -314,7 +314,7 @@ ls -lath --color=auto {% endcomment %} 1. Write bash functions `marco` and `polo` that do the following. -Whenever you execute `marco` the current path should be saved in some manner, then when you execute `polo`, no matter what directory you are in, `polo` should `cd` you back to the directory where you executed `marco`. 
+Whenever you execute `marco` the current working directory should be saved in some manner, then when you execute `polo`, no matter what directory you are in, `polo` should `cd` you back to the directory where you executed `marco`. For ease of debugging you can write the code in a file `marco.sh` and (re)load the definitions to your shell by executing `source marco.sh`. {% comment %} From cd0f5f4163c341936058922dd768e49aeb9bf7cd Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 20 Jan 2020 10:07:13 -0500 Subject: [PATCH 193/640] Remove link to Hacker Tools from README This was here for when the current class year had no content. Now that we have a couple lectures, this link isn't necessary anymore. --- README.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/README.md b/README.md index 709104f6..b30f24d5 100644 --- a/README.md +++ b/README.md @@ -2,8 +2,6 @@ Website for the [The Missing Semester of Your CS Education](https://missing.csail.mit.edu/) class! -(formerly known as [Hacker Tools](https://hacker-tools.github.io/)) - Contributions are most welcome! If you have edits or new content to add, please open an issue or submit a pull request. From e4291ed522dbec17bbf7b63798b5318ed58dd02e Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 20 Jan 2020 23:15:19 -0500 Subject: [PATCH 194/640] Draft of command line environment notes --- _2020/command-line.md | 457 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 457 insertions(+) diff --git a/_2020/command-line.md b/_2020/command-line.md index ea946620..fc4e8359 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -2,4 +2,461 @@ layout: lecture title: "Command-line Environment" date: 2019-1-21 +ready: true --- + + + +# Job Control + +In some cases you will need to interrupt a job while it is executing, for instance if a command is taking too long to complete (such as a `find` with a very large directory structure to search through). 
+Most of the time, you can do `Ctrl-C` and the command will stop. +But how does this actually work and why does it sometimes fail to stop the process? + +## Killing a process + +Your shell is using a UNIX communication mechanism called a _signal_ to communicate information to the process. When a process receives a signal it stops its execution, deals with the signal and potentially changes the flow of execution based on the information that the signal delivered. For this reason, signals are _software interrupts_. + +In our case, when typing `Ctrl-C` this prompts the shell to deliver a `SIGINT` signal to the process. + +Here's a minimal example of a Python program that captures `SIGINT` and ignores it, no longer stopping. To kill this program we can now use the `SIGQUIT` signal instead, by typing `Ctrl-\`. + +```python +#!/usr/bin/env python +import signal, time + +def handler(signum, frame): + print("\nI got a SIGINT, but I am not stopping") + +signal.signal(signal.SIGINT, handler) +i = 0 +while True: + time.sleep(.1) + print("\r{}".format(i), end="") + i += 1 +``` + +Here's what happens if we send `SIGINT` twice to this program, followed by `SIGQUIT`. Note that `^` is how `Ctrl` is displayed when typed in the terminal. + +``` +$ python sigint.py +24^C +I got a SIGINT, but I am not stopping +26^C +I got a SIGINT, but I am not stopping +30^\[1] 39913 quit python sigint.py +``` + +While `SIGINT` and `SIGQUIT` are both usually associated with terminal related requests, a more generic signal for asking a process to exit gracefully is the `SIGTERM` signal. +To send this signal we can use the [`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) command, with the syntax `kill -TERM <PID>`. + +## Pausing and backgrounding processes + +Signals can do other things beyond killing a process. For instance, `SIGSTOP` pauses a process. In the terminal, typing `Ctrl-Z` will prompt the shell to send a `SIGTSTP` signal, short for Terminal Stop (i.e. 
the terminal's version of `SIGSTOP`). + +We can then continue the paused job in the foreground or in the background using [`fg`](http://man7.org/linux/man-pages/man1/fg.1p.html) or [`bg`](http://man7.org/linux/man-pages/man1/bg.1p.html), respectively. + +The [`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) command lists the unfinished jobs associated with the current terminal session. +You can refer to those jobs using their pid (you can use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find that out). +More intuitively, you can also refer to a process using the percent symbol followed by its job number (displayed by `jobs`). To refer to the last backgrounded job you can use the `$!` environment variable. + +One more thing to know is that the `&` suffix in a command will run the command in the background, giving you the prompt back, although it will still use the shell's STDOUT which can be annoying (use shell redirections in that case). + +To background an already running program you can do `Ctrl-Z` followed by `bg`. +Note that backgrounded processes are still children processes of your terminal and will die if you close the terminal (this will send yet another signal, `SIGHUP`). +To prevent that from happening you can run the program with [`nohup`](http://man7.org/linux/man-pages/man1/nohup.1.html) (a wrapper to ignore `SIGHUP`), or use `disown` if the process has already been started. +Alternatively, you can use a terminal multiplexer as we will see in the next section. + +Below is a sample session to showcase some of these concepts. 
+ +``` +$ sleep 1000 +^Z +[1] + 18653 suspended sleep 1000 + +$ nohup sleep 2000 & +[2] 18745 +appending output to nohup.out + +$ jobs +[1] + suspended sleep 1000 +[2] - running nohup sleep 2000 + +$ bg %1 +[1] - 18653 continued sleep 1000 + +$ jobs +[1] - running sleep 1000 +[2] + running nohup sleep 2000 + +$ kill -STOP %1 +[1] + 18653 suspended (signal) sleep 1000 + +$ jobs +[1] + suspended (signal) sleep 1000 +[2] - running nohup sleep 2000 + +$ kill -SIGHUP %1 +[1] + 18653 hangup sleep 1000 + +$ jobs +[2] + running nohup sleep 2000 + +$ kill -SIGHUP %2 + +$ jobs +[2] + running nohup sleep 2000 + +$ kill %2 +[2] + 18745 terminated nohup sleep 2000 + +$ jobs + +``` + +A special signal is `SIGKILL` since it cannot be captured by the process and it will always terminate it immediately. However, it can have bad side effects such as leaving orphaned child processes. + +You can learn more about these and other signals [here](https://en.wikipedia.org/wiki/Signal_(IPC)) or by typing [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html) or `kill -l`. + + +# Terminal Multiplexers + +When using the command line interface you will often want to run more than one thing at once. +For instance, you might want to run your editor and your program side by side. +Although this can be achieved by opening new terminal windows, using a terminal multiplexer is a more versatile solution. + +Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) or its predecessor [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html) let you multiplex terminal windows using panes and tabs so you can interact with multiple shell sessions. +Moreover, terminal multiplexers let you detach a current terminal session and reattach at some point later in time. +This can make your workflow much better when working with remote machines since it avoids the need to use `nohup` and similar tricks. 
+ +The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable and using the associated keybindings you can create multiple tabs and panes and quickly navigate through them. +You might also want to familiarize yourself with [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed in most UNIX systems. + +![Example Tmux session](https://upload.wikimedia.org/wikipedia/commons/5/50/Tmux.png) + +For further reading, +[here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux` and [this](http://linuxcommand.org/lc3_adv_termmux.php) has a more detailed explanation that covers the original `screen` command. + +# Aliases + +It can become tiresome typing long commands that involve many flags or verbose options. +For this reason, most shells support _aliasing_. +A shell alias is a short form for another command that your shell will replace automatically for you. +For instance, an alias in bash has the following structure: + +```bash +alias alias_name="command_to_alias arg1 arg2" +``` + +Note that there is no space around the equal sign `=`, because [`alias`](http://man7.org/linux/man-pages/man1/alias.1p.html) is a shell command that takes a single argument. 
+ +Aliases have many convenient features: + +```bash +# Make shorthands for common flags +alias ll="ls -lh" + +# Save a lot of typing for common commands +alias gs="git status" +alias gc="git commit" +alias v="vim" + +# Save you from mistyping +alias sl=ls + +# Overwrite existing commands for better defaults +alias mv="mv -i" # -i prompts before overwrite +alias mkdir="mkdir -p" # -p make parent dirs as needed +alias df="df -h" # -h prints human readable format + +# Aliases can be composed +alias la="ls -A" +alias lla="la -l" + +# To ignore an alias run it prepended with \ +\ls +# Or disable an alias altogether with unalias +unalias la + +# To get an alias definition just call it with alias +alias ll +# Will print ll='ls -lh' +``` + +Note that aliases do not persist across shell sessions by default. +To make an alias persistent you need to include it in shell startup files, like `.bashrc` or `.zshrc`, which we are going to introduce in the next section. + + +# Dotfiles + +Many programs are configured using plain-text files known as _dotfiles_ +(because the file names begin with a `.`, e.g. `~/.vimrc`, so that they are +hidden in the directory listing `ls` by default). + +Shells are one example of programs configured with such files. On startup, your shell will read many files to load its configuration. +Depending on the shell, and on whether you are starting a login and/or interactive session, the entire process can be quite complex. +[Here](https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html) is an excellent resource on the topic. + +For `bash`, editing your `.bashrc` or `.bash_profile` will work in most systems. +Here you can include commands that you want to run on startup, like the aliases we just described or modifications to your `PATH` environment variable. +In fact, many programs will ask you to include a line like `export PATH="$PATH:/path/to/program/bin"` in your shell configuration file so their binaries can be found. 
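
Because startup files can be sourced more than once (for example by nested shells), a plain `export PATH="$PATH:..."` line may add the same directory repeatedly. The following is a minimal sketch of one way to guard against that; the helper name `path_append` is hypothetical and not part of the lecture or of any shell, and `/path/to/program/bin` is the same placeholder used above.

```bash
# Hypothetical helper (an assumption, not from the lecture):
# append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH untouched
        *) PATH="$PATH:$1" ;;     # otherwise append at the end
    esac
}

path_append "/path/to/program/bin"
export PATH
```

Appending rather than prepending keeps system binaries ahead of the new directory in the lookup order; prepend instead if the new program is meant to shadow an existing one.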
+ +Some other examples of tools that can be configured through dotfiles are: + +- `bash` - `~/.bashrc`, `~/.bash_profile` +- `git` - `~/.gitconfig` +- `vim` - `~/.vimrc` and the `~/.vim` folder +- `ssh` - `~/.ssh/config` +- `tmux` - `~/.tmux.conf` + +How should you organize your dotfiles? They should be in their own folder, +under version control, and **symlinked** into place using a script. This has +the benefits of: + +- **Easy installation**: if you log in to a new machine, applying your +customizations will only take a minute. +- **Portability**: your tools will work the same way everywhere. +- **Synchronization**: you can update your dotfiles anywhere and keep them all +in sync. +- **Change tracking**: you're probably going to be maintaining your dotfiles +for your entire programming career, and version history is nice to have for +long-lived projects. + +What should you put in your dotfiles? +You can learn about your tool's settings by reading online documentation or +[man pages](https://en.wikipedia.org/wiki/Man_page). Another great way is to +search the internet for blog posts about specific programs, where authors will +tell you about their preferred customizations. Yet another way to learn about +customizations is to look through other people's dotfiles: you can find tons of +[dotfiles +repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories) +on --- see the most popular one +[here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly +copy configurations though). +[Here](https://dotfiles.github.io/) is another good resource on the topic. + +All of the class instructors have their dotfiles publicly accessible on GitHub: [Anish](https://github.com/anishathalye/dotfiles), +[Jon](https://github.com/jonhoo/configs), +[Jose](https://github.com/jjgo/dotfiles). + + +## Portability + +A common pain with dotfiles is that the configurations might not work when working with several machines, e.g. 
if they have different operating systems or shells. Sometimes you also want some configuration to be applied only in a given machine. + +There are some tricks for making this easier. +If the configuration file supports it, use the equivalent of if-statements to +apply machine specific customizations. For example, your shell could have something +like: + +```bash +if [[ "$(uname)" == "Linux" ]]; then {do_something}; fi + +# Check before using shell-specific features +if [[ "$SHELL" == "zsh" ]]; then {do_something}; fi + +# You can also make it machine-specific +if [[ "$(hostname)" == "myServer" ]]; then {do_something}; fi +``` + +If the configuration file supports it, make use of includes. For example, +a `~/.gitconfig` can have a setting: + +``` +[include] + path = ~/.gitconfig_local +``` + +And then on each machine, `~/.gitconfig_local` can contain machine-specific +settings. You could even track these in a separate repository for +machine-specific settings. + +This idea is also useful if you want different programs to share some configurations. For instance, if you want both `bash` and `zsh` to share the same set of aliases you can write them under `.aliases` and have the following block in both: + +```bash +# Test if ~/.aliases exists and source it +if [ -f ~/.aliases ]; then + source ~/.aliases +fi +``` + +# Shells & Frameworks + +During shell tool and scripting we covered the `bash` shell because it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option. + +For example, the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as: + +- Smarter globbing, `**` +- Inline globbing/wildcard expansion +- Spelling correction +- Better tab completion/selection +- Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`) + +**Frameworks** can improve your shell as well. 
Some popular general frameworks are [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussell/oh-my-zsh), and there are smaller ones that focus on specific features such as [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Shells like [fish](https://fishshell.com/) include many of these user-friendly features by default. Some of these features include: + +- Right prompt +- Command syntax highlighting +- History substring search +- manpage based flag completions +- Smarter autocompletion +- Prompt themes + +One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or there is simply too much of it. You can always profile it and disable the features that you do not use often or value over speed. + +# Terminal Emulators + +Along with customizing your shell, it is worth spending some time figuring out your choice of **terminal emulator** and its settings. There are many many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). + +Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. Some of the aspects that you may want to modify in your terminal include: + +- Font choice +- Color Scheme +- Keyboard shortcuts +- Tab/Pane support +- Scrollback configuration +- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration). + +# Remote Machines + +It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). 
As with most tools covered, SSH is highly configurable so it is worth learning about it. + + +## Executing commands + +An often overlooked feature of `ssh` is the ability to run commands directly. +`ssh foobar@server ls` will execute `ls` in the home folder of foobar. +It works with pipes, so `ssh foobar@server ls | grep PATTERN` will grep locally the remote output of `ls` and `ls | ssh foobar@server grep PATTERN` will grep remotely the local output of `ls`. + + +## SSH Keys + +Key-based authentication exploits public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to reenter your password every time. Nevertheless, the private key (often `~/.ssh/id_rsa` and more recently `~/.ssh/id_ed25519`) is effectively your password, so treat it like so. + +### Key generation + +To generate a pair you can run [`ssh-keygen`](http://man7.org/linux/man-pages/man1/ssh-keygen.1.html). +```bash +ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/id_ed25519 +``` +You should choose a passphrase, to avoid someone who gets ahold of your private key to access authorized servers. Use [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) or [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) so you do not have to type your passphrase every time. + +If you have ever configured pushing to GitHub using SSH keys, then you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid key pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`. + +### Key based authentication + +`ssh` will look into `.ssh/authorized_keys` to determine which clients it should let in. 
To copy a public key over you can use: + +```bash +cat .ssh/id_ed25519.pub | ssh foobar@remote 'cat >> ~/.ssh/authorized_keys' +``` + +A simpler solution can be achieved with `ssh-copy-id` where available: + +```bash +ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote +``` + +## Copying files over SSH + +There are many ways to copy files over ssh: + +- `ssh+tee`, the simplest is to use `ssh` command execution and STDIN input by doing `cat localfile | ssh remote_server tee serverfile`. Recall that [`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) writes what it reads from STDIN into a file. +- [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) when copying large amounts of files/directories, the secure copy `scp` command is more convenient since it can easily recurse over paths. The syntax is `scp path/to/local_file remote_host:path/to/remote_file` +- [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) improves upon `scp` by detecting identical files in local and remote, and preventing copying them again. It also provides more fine grained control over symlinks, permissions and has extra features like the `--partial` flag that can resume from a previously interrupted copy. `rsync` has a similar syntax to `scp`. + +## Port Forwarding + +In many scenarios you will run into software that listens to specific ports in the machine. When this happens in your local machine you can type `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet? + +This is called _port forwarding_ and it +comes in two flavors: Local Port Forwarding and Remote Port Forwarding (see the pictures for more details, credit of the pictures from [this StackOverflow post](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)). 
+ +**Local Port Forwarding** +![Local Port Forwarding](https://i.stack.imgur.com/a28N8.png "Local Port Forwarding") + +**Remote Port Forwarding** +![Remote Port Forwarding](https://i.stack.imgur.com/4iK3b.png "Remote Port Forwarding") + +The most common scenario is local port forwarding, where a service in the remote machine listens on a port and you want to link a port in your local machine to forward to the remote port. For example, suppose we execute `jupyter notebook` in the remote server and it listens on port `8888`. To forward that to the local port `9999`, we would do `ssh -L 9999:localhost:8888 foobar@remote_server` and then navigate to `localhost:9999` in our local machine. + + +## SSH Configuration + +We have covered many many arguments that we can pass to `ssh`. A tempting alternative is to create shell aliases that look like +```bash +alias my_server="ssh -i ~/.ssh/id_ed25519 -p 2222 -L 9999:localhost:8888 foobar@remote_server" +``` + +However, there is a better alternative using `~/.ssh/config`. + +```bash +Host vm + User foobar + HostName 172.16.174.141 + Port 2222 + IdentityFile ~/.ssh/id_ed25519 + LocalForward 9999 localhost:8888 + +# Configs can also take wildcards +Host *.mit.edu + User foobaz +``` + +An additional advantage of using the `~/.ssh/config` file over aliases is that other programs like `scp`, `rsync`, `mosh`, &c are able to read it as well and convert the settings into the corresponding flags. + + +Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine for it to be included with the rest of your dotfiles. However, if you make it public, think about the information that you are potentially providing strangers on the internet: addresses of your servers, users, open ports, &c. This may facilitate some types of attacks so be thoughtful about sharing your SSH configuration. + +Server side configuration is usually specified in `/etc/ssh/sshd_config`. 
Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings in a per user basis. + +## Miscellaneous + +**Roaming** - A common pain when connecting to a remote server are disconnections due to shutting down/sleeping your computer or changing a network. Moreover if one has a connection with significant lag using ssh can become quite frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo. + +Sometimes it is convenient to mount a remote folder. [sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server +locally, and then you can use a local editor. + + +# Exercises + +1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`. Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags). + +1. Say you don't want to start a process until another completes, how you would go about it? In this exercise our limiting process will always be `sleep 60 &`. +One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/man1/wait.1p.html) command. Try launching the sleep command and having an `ls` wait until the background process finishes. + + However, this strategy will fail if we start in a different bash session, since `wait` only works for child processes. One feature we did not discuss in the notes is that the `kill` command's exit status will be zero on success and nonzero otherwise. `kill -0` does not send a signal but will give a nonzero exit status if the process does not exist. 
+ Write a bash function called `pidwait` that takes a pid and waits until said process completes. You should use `sleep` to avoid wasting CPU unnecessarily. + +1. Run `history 1 |awk '{$1="";print substr($0,2)}' |sort | uniq -c | sort -n | tail -n10` to get your top 10 most used commands and consider writing shorter aliases for them. + +1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/). + +1. Let's get you up to speed with dotfiles. + - Create a folder for your dotfiles and set up version + control. + - Add a configuration for at least one program, e.g. your shell, with some + customization (to start off, it can be something as simple as customizing your shell prompt by setting `$PS1`). + - Set up a method to install your dotfiles quickly (and without manual effort) on a new machine. This can be as simple as a shell script that calls `ln -s` for each file, or you could use a [specialized + utility](https://dotfiles.github.io/utilities/). + - Test your installation script on a fresh virtual machine. + - Migrate all of your current tool configurations to your dotfiles repository. + - Publish your dotfiles on GitHub. + + +1. Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. + + - Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you set a passphrase and use `ssh-agent`; more info [here](https://www.ssh.com/ssh/agents). 
+ - Edit `.ssh/config` to have an entry as follows + + ```bash + Host vm + User username_goes_here + HostName ip_goes_here + IdentityFile ~/.ssh/id_ed25519 + LocalForward 9999 localhost:8888 + ``` + - Use `ssh-copy-id vm` to copy your ssh key to the server. + - Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. + - Edit your `/etc/ssh/sshd_config` to disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. Restart the `ssh` service with `sudo service sshd restart`. Try sshing in again. + - (Challenge) Install [`mosh`](https://mosh.org/) in the VM and establish a connection. Then disconnect the network adapter of the server/VM. Can mosh properly recover from it? + - (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out a command to achieve background port forwarding. \ No newline at end of file From 3a629442a5f33e13111c0fd3ef322f2f3abc9428 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 20 Jan 2020 23:20:08 -0500 Subject: [PATCH 195/640] First draft of debugging-profiling notes --- _2020/debugging-profiling.md | 420 +++++++++++++++++++++++++++++++++++ 1 file changed, 420 insertions(+) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 0143fe46..cecf8a77 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -2,4 +2,424 @@ layout: lecture title: "Debugging and Profiling" date: 2019-1-23 +ready: true --- + +A golden rule in programming is that code will not do what you expect it to do but what you told it to do. +Bridging that gap can sometimes be quite a difficult feat. +In this lecture we will cover useful techniques for dealing with buggy and resource hungry code: debugging and profiling. 
+ +# Debugging + +## Printf debugging and Logging + +"The most effective debugging tool is still careful thought, coupled with judiciously placed print statements" --- Brian Kernighan, _Unix for Beginners_. +The first approach to debugging a problem is often to add print statements around where you have detected that something is wrong and keep iterating until you have extracted enough information to understand what is responsible for the issue. + +The next step is to use logging in your program instead of ad hoc print statements. Logging is better than just regular print statements for several reasons: + +- You can log to files, sockets, or even remote servers instead of standard output +- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c) so you can filter your output accordingly +- For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong + +One of my favorite tips for making logs more readable is to color code them. +By now you probably have realized that your terminal uses colors to make things more readable. +But how does it do it? Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code), which are special sequences of characters that tell your terminal to change the color of the output. For example executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` will print a red `This is red` message in your terminal. + +## Third party logs + +As you start building larger software systems you will often run into dependencies that will run as separate programs. +Web servers, databases or message brokers are common examples of this kind of dependency. +When interacting with these systems you will often need to read their logs since client side error messages might not suffice. + +Luckily, most programs will write their own logs somewhere in your system. +In UNIX systems, it is commonplace for programs to write their logs under `/var/log`. 
+For instance, the [NGINX](https://www.nginx.com/) webserver will place its logs under `/var/log/nginx`. +More recently, systems have started using a **system log**, which is increasingly where all of your log messages go. +Most (but not all) Linux systems use `systemd`, a system daemon that will control many things in your system such as which services are enabled and running. +`systemd` will place the logs under `/var/log/journal` in a specialized format and you can use [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) to display the messages. +Similarly, on macOS there is still `/var/log/system.log` but increasingly tools will log into the system log that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). +On most UNIX systems you can also use the [`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) command to access the kernel log. + +For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) tool. +Many programming languages will also have bindings for doing so. +Here's an example of using `logger` and how to check that the entry made it to the system logs. + +```bash +logger "Hello Logs" +# On macOS +log show --last 1m | grep Hello +# On Linux +journalctl --since "1m ago" | grep Hello +``` + +As we saw in the data wrangling lecture, logs can be quite verbose and they might require some level of processing and filtering to get the information you want. +If you find yourself heavily filtering through `journalctl` and `log show` you will probably want to familiarize yourself with their flags which can perform a first round of filtering of their output. +There are some tools like [`lnav`](http://lnav.org/) that provide an improved presentation and navigation for log files. + +## Debuggers + +When printf debugging is not enough you should be using a debugger. 
+Debuggers are programs that will let you interact with the execution of a program, letting you do things like: + +- Halt execution of the program when it reaches a certain line +- Step through the program one instruction at a time +- Inspect values of variables after the program crashed +- Conditionally halt the execution when a given condition is met. +- And many more advanced features + + +Many programming languages will come with some form of debugger. +In Python this is the Python Debugger [`pdb`](https://docs.python.org/3/library/pdb.html). + +Here is a brief description of some of the commands `pdb` supports. + +- **l**(ist) - Displays 11 lines around the current line or continue the previous listing. +- **s**(tep) - Execute the current line, stop at the first possible occasion. +- **n**(ext) - Continue execution until the next line in the current function is reached or it returns. +- **b**(reak) - Set a breakpoint (depending on the argument provided). +- **p**(rint) - Evaluate the expression in the current context and print its value. There's also **pp** to display using [`pprint`](https://docs.python.org/3/library/pprint.html) instead. +- **r**(eturn) - Continue execution until the current function returns. +- **q**(uit) - Quit from the debugger + + +Note that since Python is an interpreted language we can use the `pdb` shell to execute commands and to execute instructions. +[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, better introspection while retaining the same interface as the `pdb` module. + +For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). 
+They are optimized for C-like language debugging but will let you probe pretty much any process and get its current state: registers, stack, program counter, &c. + + +## Specialized Tools + +Even if what you are trying to debug is a black box binary there are tools that can help you with that. +Whenever programs need to perform actions that only the kernel can, they will use [System Calls](https://en.wikipedia.org/wiki/System_call). +There are commands that will let you trace the syscalls your program makes. On Linux there's [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) and macOS and BSD have [`dtrace`](http://dtrace.org/blogs/about/). Because `dtrace` uses its own `D` language it can be tricky to use, so there is a wrapper called [`dtruss`](https://www.manpagez.com/man/1/dtruss/) that provides an interface more similar to `strace` (more details [here](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html)). +Below are some examples of using `strace` or `dtruss` to show [`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) syscall traces for an execution of `ls`. For a deeper dive into `strace`, try reading [this](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2). + +```bash +# On Linux +sudo strace -e lstat ls -l > /dev/null + +# On macOS +sudo dtruss -t lstat64_extended ls -l > /dev/null +``` + +If your programs rely on network functionality, looking at the network packets might be necessary to figure out what is going wrong. +Tools like [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) and [Wireshark](https://www.wireshark.org/) are network packet analyzers that will let you read the contents of network packets and filter them based on many criteria. + +For web development, the Chrome/Firefox developer tools are quite handy. 
They feature a large number of tools: +- Source code - Inspect the HTML/CSS/JS source code of any website +- Live HTML, CSS, JS modification - Change the website content, styles and behavior to test. (This also means that website screenshots are not valid proof). +- JavaScript shell - Execute commands in the JS REPL +- Network - Analyze the timeline of requests +- Storage - Look into cookies and local application storage. + +## Static Analysis + +Not all issues need the code to be run to be discovered. +For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name, or that a variable has never been defined. +Here is where [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis) tools come into play. +Static analysis programs go through the source code without running it and report likely problems. + +In the example below there are several mistakes. First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line so the program will crash, but it will take a minute to do so because of the `sleep` call. + +```python +import time + +def foo(): + return 42 + +for foo in range(5): + print(foo) +bar = 1 +bar *= 0.2 +time.sleep(60) +print(baz) +``` + +Static analysis tools can catch both these issues. We run [`pyflakes`](https://pypi.org/project/pyflakes) on the code and get errors related to those issues. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. Here, `bar` is first an `int` and then becomes a `float`, so `mypy` will warn us about the error. +Note that all these issues were detected without actually having to run the code. +In the shell tools lecture we covered [`shellcheck`](https://www.shellcheck.net/) which is a similar tool for shell scripts. 
+ +```bash +$ pyflakes foobar.py +foobar.py:6: redefinition of unused 'foo' from line 3 +foobar.py:11: undefined name 'baz' + +$ mypy foobar.py +foobar.py:6: error: Incompatible types in assignment (expression has type "int", variable has type "Callable[[], Any]") +foobar.py:9: error: Incompatible types in assignment (expression has type "float", variable has type "int") +foobar.py:11: error: Name 'baz' is not defined +Found 3 errors in 1 file (checked 1 source file) +``` + +Most editors and IDEs will support displaying the output of these tools within the editor itself, highlighting the locations of warnings and errors. +This is often called **code linting** and it can also be used to display other types of issues such as stylistic violations or insecure constructs. + +In vim, the plugins [`ale`](https://vimawesome.com/plugin/ale) or [`syntastic`](https://vimawesome.com/plugin/syntastic) will let you do that. +For Python, [`pylint`](https://www.pylint.org) and [`pep8`](https://pypi.org/project/pep8/) are examples of stylistic linters and [`bandit`](https://pypi.org/project/bandit/) is a tool designed to find common security issues. +For other languages people have compiled comprehensive lists of useful static analysis tools such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). + +Code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go or `rustfmt` for Rust complement stylistic linters. +These tools autoformat your code so it's consistent with common stylistic patterns for the given programming language. +Although you might be reluctant to give up stylistic control over your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. 
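Returning to the logging advice at the start of this section, the following is a minimal sketch of severity levels combined with color-coded output, using Python's standard `logging` module. The logger name `demo`, the `ColorFormatter` class, the `make_logger` helper and the particular colors are all illustrative assumptions, not something these notes prescribe.

```python
import logging

# ANSI escape codes per severity level; the color choices are assumptions.
COLORS = {
    logging.DEBUG: "\033[36m",    # cyan
    logging.INFO: "\033[32m",     # green
    logging.WARNING: "\033[33m",  # yellow
    logging.ERROR: "\033[31m",    # red
}
RESET = "\033[0m"


class ColorFormatter(logging.Formatter):
    """Hypothetical formatter that wraps each formatted record in a color."""

    def format(self, record):
        color = COLORS.get(record.levelno, "")
        return f"{color}{super().format(record)}{RESET}"


def make_logger(name="demo", level=logging.DEBUG):
    """Return a logger that writes colored records to standard error."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(
            ColorFormatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_logger()
    log.debug("connecting to database")
    log.info("server started")
    log.warning("disk almost full")
    log.error("request failed")

    # Raising the level filters out the less severe messages.
    log.setLevel(logging.WARNING)
    log.info("this message is no longer shown")
```

Swapping `logging.StreamHandler()` for `logging.FileHandler(...)` sends the same records to a file instead, which is the kind of flexibility ad hoc print statements lack.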
+ +# Profiling + +Even if your code functionally behaves as you would expect, that might not be good enough if it takes all your CPU or memory in the process. +Algorithms classes will teach you big _O_ notation but they won't teach you how to find hot spots in your program. +Since [premature optimization is the root of all evil](http://wiki.c2.com/?PrematureOptimization) you should learn about profilers and monitoring tools. They will help you understand what parts of your program are taking most of the time and/or resources so you can focus on optimizing those parts. + +## Timing + +Similar to the debugging case, in many scenarios it can be enough to just print the time your code took between two points. +Here is an example in Python using the [`time`](https://docs.python.org/3/library/time.html) module. + +```python +import time, random +n = random.randint(1, 10) * 100 + +# Get current time +start = time.time() + +# Do some work +print("Sleeping for {} ms".format(n)) +time.sleep(n/1000) + +# Compute time between start and now +print(time.time() - start) + +# Output +# Sleeping for 500 ms +# 0.5713930130004883 +``` + +However, as you might have noticed if you ran the example above, wall clock time might not match your expected measurements. +Wall clock time can be misleading since your computer might be running other processes at the same time or might be waiting for events to happen. Often you will see tools make a distinction between _Real_, _User_ and _Sys_ time. In general _User_ + _Sys_ will tell you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)) + +- _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g. 
waiting for I/O or network) +- _User_ - Amount of time spent in the CPU running user code +- _Sys_ - Amount of time spent in the CPU running kernel code + +For example, try running a command that performs an HTTP request and prefixing it with [`time`](http://man7.org/linux/man-pages/man1/time.1.html). On a slow connection you might get an output like the one below. Here it took over 2 seconds for the request to complete but the process only took 15ms of CPU user time and 12ms of kernel CPU time. + +```bash +$ time curl https://missing.csail.mit.edu &> /dev/null +real 0m2.561s +user 0m0.015s +sys 0m0.012s +``` + +## Profilers + +### CPU + +Most of the time when people refer to profilers they actually mean CPU profilers since they are the most common. + +There are two main types of CPU profilers: tracing profilers and sampling profilers. + +Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack. +They then present aggregate statistics of what your program spent the most time doing. +[Here](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) is a good intro article if you want more detail on this topic. + +Most programming languages will have at least a command line profiler that you can use. +Often those integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. + +In Python +TODO cProfile + +TODO `perf` command +- Basic performance stats: `perf stat {command}` +- Run a program with the profiler: `perf record {command}` +- Analyze profile: `perf report` + + +A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they display time per function call. That can become unintuitive really fast, especially if you are using third party libraries in your code, since internal function calls will also be accounted for. 
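To make the per-function view concrete, here is a minimal sketch that drives `cProfile` from Python and prints the entries with the highest cumulative time. The functions `slow_square_sum`, `main` and `profile_report` are hypothetical names made up for this illustration; only the `cProfile`, `pstats` and `io` standard library modules are assumed.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately naive work so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Call the slow function several times."""
    return [slow_square_sum(50_000) for _ in range(20)]


def profile_report(func, limit=5):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and keep only the first `limit` entries.
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(main))
```

For a whole script, running `python -m cProfile -s cumulative script.py` gives a similar report without modifying the code. Either way, every row is a function, which is exactly the granularity the caveat above is about.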
+A more intuitive way of displaying profiling information is to include the time taken per line of code; this is what _line profilers_ do. + +For instance, the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page. + +```python +#!/usr/bin/env python +import requests +from bs4 import BeautifulSoup + +@profile +def get_urls(): + response = requests.get('https://missing.csail.mit.edu') + s = BeautifulSoup(response.content, 'lxml') + urls = [] + for url in s.find_all('a'): + urls.append(url['href']) + +if __name__ == '__main__': + get_urls() +``` + +If we run it through Python's `cProfile` profiler we get over 2500 lines of output and even with sorting it is hard to understand where the time is being spent. A quick run with [`line_profiler`](https://github.com/rkern/line_profiler) shows the time taken per line. + +```bash +$ kernprof -l -v a.py +Wrote profile results to a.py.lprof +Timer unit: 1e-06 s + +Total time: 0.636188 s +File: a.py +Function: get_urls at line 5 + +Line # Hits Time Per Hit % Time Line Contents +============================================================== + 5 @profile + 6 def get_urls(): + 7 1 613909.0 613909.0 96.5 response = requests.get('https://missing.csail.mit.edu') + 8 1 21559.0 21559.0 3.4 s = BeautifulSoup(response.content, 'lxml') + 9 1 2.0 2.0 0.0 urls = [] +10 25 685.0 27.4 0.1 for url in s.find_all('a'): +11 24 33.0 1.4 0.0 urls.append(url['href']) +``` + +### Memory + +In languages like C or C++ memory leaks can cause your program to never release memory that it doesn't need anymore. +To help in the process of memory debugging you can use tools like [Valgrind](https://valgrind.org/) that will help you identify memory leaks. + +In garbage collected languages like Python it is still useful to use a memory profiler since as long as you have pointers to objects in memory they won't be garbage collected. 
+Here's an example program and the associated output when running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line_profiler`). + +```python +@profile +def my_func(): + a = [1] * (10 ** 6) + b = [2] * (2 * 10 ** 7) + del b + return a + +if __name__ == '__main__': + my_func() +``` + +```bash +$ python -m memory_profiler example.py +Line # Mem usage Increment Line Contents +============================================== + 3 @profile + 4 5.97 MB 0.00 MB def my_func(): + 5 13.61 MB 7.64 MB a = [1] * (10 ** 6) + 6 166.20 MB 152.59 MB b = [2] * (2 * 10 ** 7) + 7 13.61 MB -152.59 MB del b + 8 13.61 MB 0.00 MB return a +``` + + +### Visualization + +Profiler output for real world programs will contain large amounts of information because of the inherent complexity of software projects. +Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. +Thus there are many tools for displaying profiler output in an easier-to-parse way. + +One common way to display CPU profiling information for sampling profilers is to use a [Flame Graph](http://www.brendangregg.com/flamegraphs.html), which displays a hierarchy of function calls along the Y axis, with each function's width along the X axis proportional to the time taken. They are also interactive, letting you zoom into specific parts of the program and get their stack traces (try clicking on the image below). + +[![FlameGraph](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg) + +Call graphs or control flow graphs display the relationships between subroutines within a program by including functions as nodes and function calls between them as directed edges. When coupled with profiling information such as number of calls and time taken, call graphs can be quite useful for interpreting the flow of a program. 
+In Python you can use the [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/) library to generate them. + +![Call Graph](https://upload.wikimedia.org/wikipedia/commons/2/2f/A_Call_Graph_generated_by_pycallgraph.png) + + +## Resource Monitoring + +Sometimes, the first step towards analyzing the performance of your program is to understand what its actual resource consumption is. +Often programs will run slowly when they are resource constrained, e.g. not having enough memory or having a slow network connection. +There is a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. + +- **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php) which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). +`htop` presents various statistics for the currently running processes on the system. +See also [`glances`](https://nicolargo.github.io/glances/) for a similar implementation with a well designed UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is a great tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. +- **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information, handy to check if a process is doing heavy I/O disk operations +- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) will display metrics per partition and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag is quite crucial to get **h**uman readable output. +A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which will let you navigate folders and delete files and folders as you navigate. 
+- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also often displayed in tools like `htop`. +- **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a given file. +- **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) will let you monitor incoming and outgoing network packet statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port on a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of `ss` and `ip` respectively. +- **Network Usage** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. + +If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. + + +### Specialized tools + +Sometimes, black box benchmarking is all you need to determine what software to use. +Tools like [`hyperfine`](https://github.com/sharkdp/hyperfine) will let you quickly benchmark command line programs. +For instance, in the shell tools and scripting lecture we recommended `fd` over `find`. We can use `hyperfine` to compare them in tasks we run often. +E.g. in the example below `fd` was about 20x faster than `find` on my machine. + +```bash +$ hyperfine --warmup 3 'fd -e jpg' 'find . -iname "*.jpg"' +Benchmark #1: fd -e jpg + Time (mean ± σ): 51.4 ms ± 2.9 ms [User: 121.0 ms, System: 160.5 ms] + Range (min … max): 44.2 ms … 60.1 ms 56 runs + +Benchmark #2: find . 
-iname "*.jpg" + Time (mean ± σ): 1.126 s ± 0.101 s [User: 141.1 ms, System: 956.1 ms] + Range (min … max): 0.975 s … 1.287 s 10 runs + +Summary + 'fd -e jpg' ran + 21.89 ± 2.33 times faster than 'find . -iname "*.jpg"' +``` + +As was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading, letting you figure out where time is being spent: loading, rendering, scripting, &c. +More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler) and [Chrome](https://developers.google.com/web/tools/chrome-devtools/rendering-toolss). + +# Exercises + +1. Do [this](https://github.com/spiside/pdb-tutorial) hands-on `pdb` tutorial to familiarize yourself with the commands. For a more in-depth tutorial read [this](https://realpython.com/python-debugging-pdb). + +1. Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day. +If there aren't any you can execute some harmless commands such as `sudo ls` and check again. + +1. Install `pyflakes` or `pylint` and run it on the following piece of Python code. What is wrong with the code? Try fixing it. + +```python +TODO +``` + +1. Run `cProfile`, `line_profiler` and `memory_profiler` on the following piece of Python code. What can you do to improve its performance? + +```python +TODO +``` + +1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. + +```python +#!/usr/bin/env python +def fib0(): return 0 + +def fib1(): return 1 + +s = """def fib{}(): return fib{}() + fib{}()""" + +if __name__ == '__main__': + + for n in range(2, 10): + exec(s.format(n, n-1, n-2)) + # from functools import lru_cache + # for n in range(10): + # exec("fib{} = lru_cache(1)(fib{})".format(n, n)) + print(eval("fib9()")) +``` + +Put the code into a file and make it executable. Install [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/). 
Run the code as is with `pycallgraph graphviz -- ./fib.py` and check the `pycallgraph.png` file. How many times is `fib0` called? We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each `fibN` function now? + +1. A common issue is that a port you want to listen on is already taken by another process. Let's learn how to discover that process's PID. First execute `python -m http.server 4444` to start a minimal web server listening on port `4444`. On a separate terminal run `lsof | grep LISTEN` to print all listening processes and ports. Find that process's PID and terminate it by running `kill `. + +1. Limiting a process's resources can be another handy tool in your toolbox. +Try running `stress -c 3` and visualize the CPU consumption with `htop`. Now, execute `taskset --cpu-list 0,2 stress -c 3` and visualize it. Is `stress` taking three CPUs? Why not? Read [`man taskset`](http://man7.org/linux/man-pages/man1/taskset.1.html). +Challenge: achieve the same using [`cgroups`](http://man7.org/linux/man-pages/man7/cgroups.7.html). Try limiting the memory consumption of `stress -m`. + +1. (Advanced) The command `curl ipinfo.io` performs an HTTP request and fetches information about your public IP. Open [Wireshark](https://www.wireshark.org/) and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets). + +1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). 
\ No newline at end of file From 36224f70022710459174bc322a9b9f6353d3a836 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 20 Jan 2020 23:21:31 -0500 Subject: [PATCH 196/640] Minor fix in shell-tools notes --- _2020/shell-tools.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 4bd09471..458a462a 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -179,6 +179,7 @@ Short for manual, [`man`](http://man7.org/linux/man-pages/man1/man.1.html) provi For example, `man rm` will output the behavior of the `rm` command along with the flags that it takes including the `-i` flag we showed earlier. In fact, what I have been linking so far for every command are the online version of Linux manpages for the commands. Even non native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process. +For interactive tools such as the ones based on ncurses, help for the comands can often be accessed within the program using the `:help` command or typing `?`. Sometimes manpages can be overly detailed descriptions of the commands and it can become hard to decipher what flags/syntax to use for common use cases. [TLDR pages](https://tldr.sh/) are a nifty complementary solution that focuses on giving example use cases of a command so you can quickly figure out which options to use. @@ -288,7 +289,7 @@ Finding frequent and/or recent files and directories can be done through tools l Fasd ranks files and directories by [_frecency_](https://developer.mozilla.org/en/The_Places_frecency_algorithm), that is, by both _frequency_ and _recency_. The most straightforward use is _autojump_ which adds a `z` command that you can use to quickly `cd` using a substring of a _frecent_ directory. E.g. if you often go to `/home/user/files/cool_project` you can simply `z cool` to jump there. 
-More complex tools exists to quickly get an overview of a directory structure [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) or even full fledged file managers like [nnn](https://github.com/jarun/nnn) or [ranger](https://github.com/ranger/ranger) +More complex tools exist to quickly get an overview of a directory structure [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) or even full fledged file managers like [`nnn`](https://github.com/jarun/nnn) or [`ranger`](https://github.com/ranger/ranger) # Exercises @@ -327,7 +328,7 @@ polo() { } {% endcomment %} -1. Say you have a command that fails rarely. In order to debug it you need to capture its output but it can be time consuming to get it. +1. Say you have a command that fails rarely. In order to debug it you need to capture its output but it can be time consuming to get a failure run. Write a bash script that runs the following script until it fails and captures its standard output and error streams to files and prints everything at the end. Bonus points if you can also report how many runs it took for the script to fail. @@ -345,7 +346,6 @@ fi echo "Everything went according to plan" ``` - {% comment %} #!/usr/bin/env bash @@ -368,9 +368,8 @@ To bridge this disconnect there's the [`xargs`](http://man7.org/linux/man-pages/ For example `ls | xargs rm` will delete the files in the current directory. Your task is to write a command that recursively finds all HTML files in the folder and makes a zip with them. Note that your command should work even if the files have spaces (hint: check `-d` flag for `xargs`) - -{% comment %} -find . -type f -name "*.html" | xargs -d '\n' tar -cvzf archive.tar.gz -{% endcomment %} + {% comment %} + find . -type f -name "*.html" | xargs -d '\n' tar -cvzf archive.tar.gz + {% endcomment %} 1. (Advanced) Write a command or script to recursively find the most recently modified file in a directory. 
More generally, can you list all files by recency? From 3780c873aa97adecfe1d16929b78a409cac4b8bc Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 20 Jan 2020 23:24:58 -0500 Subject: [PATCH 197/640] Unready debugging and profiling notes --- _2020/debugging-profiling.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index cecf8a77..fc6b55f9 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -2,7 +2,6 @@ layout: lecture title: "Debugging and Profiling" date: 2019-1-23 -ready: true --- A golden rule in programming is that code will not do what you expect it to do but what you told it to do. @@ -422,4 +421,4 @@ Challenge: achieve the same using [`cgroups`](http://man7.org/linux/man-pages/ma 1. (Advanced) The command `curl ipinfo.io` performs a HTTP request an fetches information about your public IP. Open [Wireshark](https://www.wireshark.org/) and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets). -1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). \ No newline at end of file +1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). 
From c2b9275135099ae2ccd2ea00cc731a98fcf5d234 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 21 Jan 2020 10:34:15 -0500 Subject: [PATCH 198/640] Add more tmux notes to command-line --- _2020/command-line.md | 103 +++++++++++++++++++++++++----------------- 1 file changed, 62 insertions(+), 41 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index fc4e8359..0ec354d3 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -125,17 +125,38 @@ When using the command line interface you will often want to run more than one t For instance, you might want to run your editor and your program side by side. Although this can be achieved opening new terminal windows, using a terminal multiplexer is a more versatile solution. -Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) or its predecessor [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html) allow to multiplex terminal windows using panes and tabs so you can interact multiple shell sessions. +Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow to multiplex terminal windows using panes and tabs so you can interact multiple shell sessions. Moreover, terminal multiplexers let you detach a current terminal session and reattach at some point later in time. This can make your workflow much better when working with remote machines since it voids the need to use `nohup` and similar tricks. The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable and using the associated keybindings you can create multiple tabs and panes and quickly navigate through them. -You might also want to familiarize yourself with [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed in most UNIX systems. 
-![Example Tmux session](https://upload.wikimedia.org/wikipedia/commons/5/50/Tmux.png) +`tmux` expects you to know its keybindings, and they all have the form ` x` where that means press `Ctrl+b`, release, and then press `x`. `tmux` has the following hierarchy of objects: +- **Sessions** - a session is an independent workspace with one or more windows + + `tmux` starts a new session. + + `tmux new -s NAME` starts it with that name. + + `tmux ls` lists the current sessions + + Within `tmux` typing ` d` detaches the current session + + `tmux a` attaches the last session. You can use the `-t` flag to specify which + +- **Windows** - Equivalent to tabs in editors or browsers, they are visually separate parts of the same session + + ` c` Creates a new window. To close it you can just terminate the shells doing `` + + ` N` Go to the _N_ th window. Note they are numbered + + ` p` Goes to the previous window + + ` n` Goes to the next window + + ` ,` Rename the current window + + ` w` List current windows + +- **Panes** - Like vim splits, panes let you have multiple shells in the same visual display. + + ` "` Split the current pane horizontally + + ` %` Split the current pane vertically + + ` ` Move to the pane in the specified _direction_. Direction here means arrow keys. + + ` z` Toggle zoom for the current pane + + ` [` Start scrollback. You can then press `` to start a selection and `` to copy that selection. + + ` ` Cycle through pane arrangements. For further reading, -[here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux` and [this](http://linuxcommand.org/lc3_adv_termmux.php) has a more detailed explanation that covers the original `screen` command. +[here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux` and [this](http://linuxcommand.org/lc3_adv_termmux.php) has a more detailed explanation that covers the original `screen` command. 
You might also want to familiarize yourself with [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed in most UNIX systems. # Aliases @@ -280,42 +301,6 @@ if [ -f ~/.aliases ]; then fi ``` -# Shells & Frameworks - -During shell tool and scripting we covered the `bash` shell because it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option. - -For example, the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as: - -- Smarter globbing, `**` -- Inline globbing/wildcard expansion -- Spelling correction -- Better tab completion/selection -- Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`) - -**Frameworks** can improve your shell as well. Some popular general frameworks are [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussll/oh-my-zsh), and smaller ones that focus on specific features such as [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Shells like [fish](https://fishshell.com/) include many of these user-friendly features by default. Some of these features include: - -- Right prompt -- Command syntax highlighting -- History substring search -- manpage based flag completions -- Smarter autocompletion -- Prompt themes - -One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or it is too much code. You can always profile it and disable the features that you do not use often or value over speed. - -# Terminal Emulators - -Along with customizing your shell, it is worth spending some time figuring out your choice of **terminal emulator** and its settings. 
There are many many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). - -Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. Some of the aspects that you may want to modify in your terminal include: - -- Font choice -- Color Scheme -- Keyboard shortcuts -- Tab/Pane support -- Scrollback configuration -- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration). - # Remote Machines It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. @@ -411,12 +396,48 @@ Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here y ## Miscellaneous -**Roaming** - A common pain when connecting to a remote server are disconnections due to shutting down/sleeping your computer or changing a network. Moreover if one has a connection with significant lag using ssh can become quite frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo. +A common pain when connecting to a remote server are disconnections due to shutting down/sleeping your computer or changing a network. Moreover if one has a connection with significant lag using ssh can become quite frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo. Sometimes it is convenient to mount a remote folder. 
[sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server locally, and then you can use a local editor. +# Shells & Frameworks + +During shell tool and scripting we covered the `bash` shell because it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option. + +For example, the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as: + +- Smarter globbing, `**` +- Inline globbing/wildcard expansion +- Spelling correction +- Better tab completion/selection +- Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`) + +**Frameworks** can improve your shell as well. Some popular general frameworks are [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussll/oh-my-zsh), and smaller ones that focus on specific features such as [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Shells like [fish](https://fishshell.com/) include many of these user-friendly features by default. Some of these features include: + +- Right prompt +- Command syntax highlighting +- History substring search +- manpage based flag completions +- Smarter autocompletion +- Prompt themes + +One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or it is too much code. You can always profile it and disable the features that you do not use often or value over speed. + +# Terminal Emulators + +Along with customizing your shell, it is worth spending some time figuring out your choice of **terminal emulator** and its settings. There are many many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). 
+ +Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. Some of the aspects that you may want to modify in your terminal include: + +- Font choice +- Color Scheme +- Keyboard shortcuts +- Tab/Pane support +- Scrollback configuration +- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration). + # Exercises 1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`. Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags). From 9c75db60c2fe55a6082dc0f08ab94720c227da38 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Tue, 21 Jan 2020 17:03:22 -0500 Subject: [PATCH 199/640] Update notes for command line environment --- _2020/command-line.md | 89 +++++++++++++++++++++++++++---------------- 1 file changed, 56 insertions(+), 33 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 0ec354d3..309cb021 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -5,6 +5,9 @@ date: 2019-1-21 ready: true --- +In this lecture we will go through several ways in which you can improve your workflow when using the shell. We have been working with the shell for a while now, but we have mainly focused on executing different commands. We will now see how to run several processes at the same time while keeping track of them, how to stop or pause a specific process and how to make a process run in the background. 
+ +We will also learn about different ways to improve your shell and other tools, by defining aliases and configuring them using dotfiles. Both of these can help you save time, e.g. by using the same configurations in all your machines without having to type long commands. We will look at how to work with remote machines using SSH. # Job Control @@ -305,6 +308,14 @@ fi It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. +To `ssh` into a server you execute a command as follows: + +```bash +ssh foo@bar.mit.edu +``` + +Here we are trying to ssh as user `foo` in server `bar.mit.edu`. +The server can be specified with a hostname (like `bar.mit.edu`) or an IP address (something like `foobar@192.168.1.42`). Later we will see that if we modify the ssh config file we can connect using something like `ssh bar`. ## Executing commands @@ -332,13 +343,13 @@ If you have ever configured pushing to GitHub using SSH keys, then you have prob `ssh` will look into `.ssh/authorized_keys` to determine which clients it should let in. To copy a public key over you can use: ```bash -cat .ssh/id_dsa.pub | ssh foobar@remote 'cat >> ~/.ssh/authorized_keys' +cat .ssh/id_ed25519.pub | ssh foobar@remote 'cat >> ~/.ssh/authorized_keys' ``` A simpler solution can be achieved with `ssh-copy-id` where available: ```bash -ssh-copy-id -i .ssh/id_dsa.pub foobar@remote +ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote ``` ## Copying files over SSH @@ -351,7 +362,7 @@ There are many ways to copy files over ssh: ## Port Forwarding -In many scenarios you will run into software that listens to soecific ports in the machine. 
When this happens in your local machine you can type `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet?. +In many scenarios you will run into software that listens to specific ports in the machine. When this happens in your local machine you can type `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet?. This is called _port forwarding_ and it comes in two flavors: Local Port Forwarding and Remote Port Forwarding (see the pictures for more details, credit of the pictures from [this StackOverflow post](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)). @@ -440,6 +451,8 @@ Since you might be spending hundreds to thousands of hours in your terminal it p # Exercises +## Job control + 1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`. Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags). 1. Say you don't want to start a process until another completes, how you would go about it? In this exercise our limiting process will always be `sleep 60 &`. @@ -448,36 +461,46 @@ One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/m However, this strategy will fail if we start in a different bash session, since `wait` only works for child processes. One feature we did not discuss in the notes is that the `kill` command's exit status will be zero on success and nonzero otherwise. 
`kill -0` does not send a signal but will give a nonzero exit status if the process does not exist. Write a bash function called `pidwait` that takes a pid and waits until said process completes. You should use `sleep` to avoid wasting CPU unnecessarily. -1. Run `history 1 |awk '{$1="";print substr($0,2)}' |sort | uniq -c | sort -n | tail -n10`) to get your top 10 most used commands and consider writing shorter aliases for them. +## Terminal multiplexer 1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/). -1. Let's get you up to speed with dotfiles. - - Create a folder for your dotfiles and set up version - control. - - Add a configuration for at least one program, e.g. your shell, with some - customization (to start off, it can be something as simple as customizing your shell prompt by setting `$PS1`). - - Set up a method to install your dotfiles quickly (and without manual effort) on a new machine. This can be as simple as a shell script that calls `ln -s` for each file, or you could use a [specialized - utility](https://dotfiles.github.io/utilities/). - - Test your installation script on a fresh virtual machine. - - Migrate all of your current tool configurations to your dotfiles repository. - - Publish your dotfiles on GitHub. - - -1. Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. - - - Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you use a password and use `ssh-agent` , more info [here](https://www.ssh.com/ssh/agents). 
- - Edit `.ssh/config` to have an entry as follows - - ```bash - Host vm - User username_goes_here - HostName ip_goes_here - IdentityFile ~/.ssh/id_ed25519 - RemoteForward 9999 localhost:8888 - ``` - - Use `ssh-copy-id vm` to copy your ssh key to the server. - - Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. - - Edit your /etc/ssh/sshd_config to disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. Restart the `ssh` service with `sudo service sshd restart`. Try sshing in again. - - (Challenge) Install [`mosh`](https://mosh.org/) in the VM and establish a connection. Then disconnect the network adapter of the server/VM. Can mosh properly recover from it? - - (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command to achieve background port forwarding. \ No newline at end of file +## Aliases + +1. Create an alias `dc` that resolves to `cd` for when you type it wrongly. + +1. Run `history 1 |awk '{$1="";print substr($0,2)}' |sort | uniq -c | sort -n | tail -n10`) to get your top 10 most used commands and consider writing shorter aliases for them. + + +## Dotfiles + +Let's get you up to speed with dotfiles. +1. Create a folder for your dotfiles and set up version + control. +1. Add a configuration for at least one program, e.g. your shell, with some + customization (to start off, it can be something as simple as customizing your shell prompt by setting `$PS1`). +1. Set up a method to install your dotfiles quickly (and without manual effort) on a new machine. This can be as simple as a shell script that calls `ln -s` for each file, or you could use a [specialized + utility](https://dotfiles.github.io/utilities/). +1. Test your installation script on a fresh virtual machine. +1. 
Migrate all of your current tool configurations to your dotfiles repository. +1. Publish your dotfiles on GitHub. + +## Remote Machines + +Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. + +1. Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you use a password and use `ssh-agent` , more info [here](https://www.ssh.com/ssh/agents). +1. Edit `.ssh/config` to have an entry as follows + +```bash +Host vm + User username_goes_here + HostName ip_goes_here + IdentityFile ~/.ssh/id_ed25519 + RemoteForward 9999 localhost:8888 +``` +1. Use `ssh-copy-id vm` to copy your ssh key to the server. +1. Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. +1. Edit your SSH server config by doing `sudo vim /etc/ssh/sshd_config` and disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. Restart the `ssh` service with `sudo service sshd restart`. Try sshing in again. +1. (Challenge) Install [`mosh`](https://mosh.org/) in the VM and establish a connection. Then disconnect the network adapter of the server/VM. Can mosh properly recover from it? +1. (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command to achieve background port forwarding. \ No newline at end of file From b066e2726423fa89ce7c11d85d1431010bc2fc14 Mon Sep 17 00:00:00 2001 From: Claire Nord Date: Tue, 21 Jan 2020 17:26:57 -0500 Subject: [PATCH 200/640] Clarify zsh history search vs. 
autosuggestions in shell tools lecture --- _2020/shell-tools.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 458a462a..525ac9c0 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -267,13 +267,14 @@ If we want to search there we can pipe that output to `grep` and search for patt In most shells you can make use of `Ctrl+R` to perform backwards search through your history. After pressing `Ctrl+R` you can type a substring you want to match for commands in your history. As you keep pressing it you will cycle through the matches in your history. +This can also be enabled with the UP/DOWN arrows in [zsh](https://github.com/zsh-users/zsh-history-substring-search). A nice addition on top of `Ctrl+R` comes with using [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) bindings. `fzf` is a general purpose fuzzy finder that can used with many commands. Here is used to fuzzily match through your history and present results in a convenient and visually pleasing manner. -Another cool history related trick I really enjoy is **history substring search**. +Another cool history-related trick I really enjoy is **history-based autosuggestions**. First introduced by the [fish](https://fishshell.com/) shell, this feature dynamically autocompletes your current shell command with the most recent command that you typed that shares a common prefix with it. -It can be enabled in [zsh](https://github.com/zsh-users/zsh-history-substring-search) and it is a great quality of life trick for your shell. +It can be enabled in [zsh](https://github.com/zsh-users/zsh-autosuggestions) and it is a great quality of life trick for your shell. Lastly, a thing to have in mind is that if you start a command with a leading space it won't be added to you shell history. This comes in handy when you are typing commands with passwords or other bits of sensitive information. 
From c541b17dab2303639cc1f96fe94b308eccd58d3c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 21 Jan 2020 23:25:21 -0500 Subject: [PATCH 201/640] Add notes for version control lecture --- _2020/version-control.md | 513 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 513 insertions(+) diff --git a/_2020/version-control.md b/_2020/version-control.md index 6fb72391..e0879426 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -2,4 +2,517 @@ layout: lecture title: "Version Control (Git)" date: 2019-1-22 +ready: true --- + +Version control systems (VCSs) are tools used to track changes to source code +(or other collections of files and folders). As the name implies, these tools +help maintain a history of changes; furthermore, they facilitate collaboration. +VCSs track changes to a folder and its contents in a series of snapshots, where +each snapshot encapsulates the entire state of files/folders within a top-level +directory. VCSs also maintain metadata like who created each snapshot, messages +associated with each snapshot, and so on. + +Why is version control useful? Beyond letting you look at old snapshots of the +project, modern VCSs let you easily (and automatically) answer questions like: + +- Who wrote this module? +- When was this particular line of this particular file edited? By whom? Why + was it edited? +- Over the last 1000 revisions, when was this regression introduced? + +While other VCSs exist, **Git** is the de facto standard for version control. +This [XKCD comic](https://xkcd.com/1597/) captures Git's reputation: + +![xkcd 1597](https://imgs.xkcd.com/comics/git.png) + +Because Git's interface is a leaky abstraction, learning Git top-down (starting +with its interface / command-line interface) can lead to a lot of confusion. +It's possible to memorize a handful of commands and think of them as magic +incantations, and follow the approach in the comic above whenever anything goes +wrong. 
+ +While Git admittedly has an ugly interface, its underlying design and ideas are +beautiful. While an ugly interface has to be _memorized_, a beautiful design +can be _understood_. For this reason, we give a bottom-up explanation of Git, +starting with its data model and later covering the command-line interface. + +# Git's data model + +## Snapshots + +Git models the history of a collection of files and folders within some +top-level directory as a series of snapshots. In Git terminology, a file is +called a "blob", and it's just a bunch of bytes. A directory is called a +"tree", and it maps names to blobs or trees (so directories can contain other +directories). A snapshot is the top-level tree that is being tracked. For +example, we might have a tree as follows: + +``` + (tree) +| ++- foo (tree) +| | +| + bar.txt (blob, contents = "hello world") +| ++- baz.txt (blob, contents = "git is wonderful") +``` + +The top-level tree contains two elements, a tree "foo" (that itself contains +one element, a blob "bar.txt"), and a blob "baz.txt". + +## Modeling history: relating snapshots + +How should a version control system relate snapshots? One simple model would be +to have a linear history. A history would be a list of snapshots in time-order. +For many reasons, Git doesn't use a simple model like this. + +In Git, a history is a directed acyclic graph (DAG) of snapshots. That may +sound like a fancy math word, but don't be intimidated. All this means is that +each snapshot in Git refers to a set of "parents", the snapshots that preceded +it. It's a set of parents rather than a single parent (as would be the case in +a linear history) because a snapshot might descend from multiple parents, for +example due to combining (merging) two parallel branches of development. + +Git calls these snapshots "commit"s. 
Visualizing a commit history might look +something like this: + +``` +o --> o --> o --> o + \ + \ + --> o --> o +``` + +In the ASCII art above, the `o`s correspond to individual commits (snapshots). +After the third commit, the history branches into two separate branches. This +might correspond to, for example, two separate features being developed in +parallel, independently from each other. In the future, these branches may be +merged to create a new snapshot that incorporates both of the features, +producing a new history that looks like this, with the newly created merge +commit shown in bold: + +
+o --> o --> o --> o -----> o
+             \             ^
+              \           /
+               --> o --> o
+
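
As a rough illustration of the merged history drawn above, here is a minimal Python sketch. It is not how Git stores anything; the `Commit` class, the `ancestors` helper, and the commit names are hypothetical and exist only for this example. Each commit keeps references to its parents, and the merge commit is simply the one node with two of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A snapshot label plus the commits it descends from (hypothetical sketch type)."""
    message: str
    parents: tuple = ()


def ancestors(commit):
    """Return the messages of every commit reachable from `commit`, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current.message in seen:
            continue  # already visited via another path through the DAG
        seen.add(current.message)
        stack.extend(current.parents)
    return seen


# Three commits in a line, after which history forks into two branches.
c1 = Commit("c1")
c2 = Commit("c2", (c1,))
c3 = Commit("c3", (c2,))
main_tip = Commit("c4", (c3,))
f1 = Commit("f1", (c3,))
feature_tip = Commit("f2", (f1,))

# The merge commit is the only node with more than one parent.
merge = Commit("merge", (main_tip, feature_tip))
```

Walking back from `merge` reaches commits from both branches, while walking back from `main_tip` never reaches `f1` or `f2`. That is the sense in which the two lines of development stay independent until they are merged.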
+ +## Data model, as pseudocode + +It may be instructive to see Git's data model written down in pseudocode: + +``` +// a file is a bunch of bytes +type blob = array<byte> + +// a directory contains named files and directories +type tree = map<string, tree | blob> + +// a commit has parents, metadata, and the top-level tree +type commit = struct { + parents: array<commit> + author: string + message: string + snapshot: tree +} +``` + +It's a clean, simple model of history. + +## Objects and content-addressing + +An "object" is a blob, tree, or commit: + +``` +type object = blob | tree | commit +``` + +In Git's data store, all objects are content-addressed by their [SHA-1 +hash](https://en.wikipedia.org/wiki/SHA-1). + +``` +objects = map<string, object> + +def store(object): + id = sha1(object) + objects[id] = object + +def load(id): + return objects[id] +``` + +Blobs, trees, and commits are unified in this way: they are all objects. When +they reference other objects, they don't actually _contain_ them in their +on-disk representation, but have a reference to them by their hash. + +For example, the tree for the example directory structure [above](#snapshots) +(visualized using `git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d`), +looks like this: + +``` +100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85 baz.txt +040000 tree c68d233a33c5c06e0340e4c224f0afca87c8ce87 foo +``` + +The tree itself contains pointers to its contents, `baz.txt` (a blob) and `foo` +(a tree). If we look at the contents addressed by the hash corresponding to +baz.txt with `git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85`, we get +the following: + +``` +git is wonderful +``` + +## References + +Now, all snapshots can be identified by their SHA-1 hash. That's inconvenient, +because humans aren't good at remembering strings of 40 hexadecimal characters. + +Git's solution to this problem is human-readable names for SHA-1 hashes, called +"references". References are pointers to commits. 
+ +``` +references = map<string, string> + +def create_reference(name, id): + references[name] = id + +def read_reference(name): + return references[name] + +def load(name_or_id): + if name_or_id in references: + return load(references[name_or_id]) + else: + return objects[name_or_id] +``` + +With this, Git can use human-readable names like "master" to refer to a +particular snapshot in the history, instead of a long hexadecimal string. + +One detail is that we often want a notion of "where we currently are" in the +history, so that when we take a new snapshot, we know what it is relative to +(how we set the `parents` field of the commit). In Git, that "where we +currently are" is a special reference called "HEAD". + +## Repositories + +Finally, we can define what (roughly) is a Git _repository_: it is the data +`objects` and `references`. + +On disk, all Git stores is objects and references: that's all there is to Git's +data model. All `git` commands map to some manipulation of the commit DAG by +adding objects and adding/updating references. + +Whenever you're typing in any command, think about what manipulation the +command is making to the underlying graph data structure. Conversely, if you're +trying to make a particular kind of change to the commit DAG, e.g. "discard +uncommitted changes and make the 'master' ref point to commit 5d83f9e", there's +probably a command to do it (e.g. in this case, `git checkout master; git reset +--hard 5d83f9e`). + +# Staging area + +This is another concept that's orthogonal to the data model, but it's a part of +the interface to create commits. + +One way you might imagine implementing snapshotting as described above is to have +a "create snapshot" command that creates a new snapshot based on the _current +state_ of the working directory. Some version control tools work like this, but +not Git. We want clean snapshots, and it might not always be ideal to make a +snapshot from the current state. 
For example, imagine a scenario where you've +implemented two separate features, and you want to create two separate commits, +where the first introduces the first feature, and the next introduces the +second feature. Or imagine a scenario where you have debugging print statements +added all over your code, along with a bugfix; you want to commit the bugfix +while discarding all the print statements. + +Git accommodates such scenarios by allowing you to specify which modifications +should be included in the next snapshot through a mechanism called the "staging +area". + +# Git command-line interface + +The `git init` command initializes a new Git repository, with repository +metadata being stored in the `.git` directory: + +## Basics + +{% comment %} + +```console +$ mkdir myproject +$ cd myproject +$ git init +Initialized empty Git repository in /home/missing-semester/myproject/.git/ +$ git status +On branch master + +No commits yet + +nothing to commit (create/copy files and use "git add" to track) +``` + +How do we interpret this output? "No commits yet" basically means our version +history is empty. Let's fix that. + +```console +$ echo "hello, git" > hello.txt +$ git add hello.txt +$ git status +On branch master + +No commits yet + +Changes to be committed: + (use "git rm --cached ..." to unstage) + + new file: hello.txt + +$ git commit -m 'Initial commit' +[master (root-commit) 4515d17] Initial commit + 1 file changed, 1 insertion(+) + create mode 100644 hello.txt +``` + +With this, we've `git add`ed a file to the staging area, and then `git +commit`ed that change, adding a simple commit message "Initial commit". If we +didn't specify a `-m` option, Git would open our text editor to allow us type a +commit message. + +Now that we have a non-empty version history, we can visualize the history. 
+Visualizing the history as a DAG can be especially helpful in understanding the +current status of the repo and connecting it with your understanding of the Git +data model. + +The `git log` command visualizes history. By default, it shows a flattened +version, which hides the graph structure. If you use a command like `git log +--all --graph --decorate`, it will show you the full version history of the +repository, visualized in graph form. + +```console +$ git log --all --graph --decorate +* commit 4515d17a167bdef0a91ee7d50d75b12c9c2652aa (HEAD -> master) + Author: Missing Semester + Date: Tue Jan 21 22:18:36 2020 -0500 + + Initial commit +``` + +This doesn't look all that graph-like, because it only contains a single node. +Let's make some more changes, author a new commit, and visualize the history +once more. + +```console +$ echo "another line" >> hello.txt +$ git status +On branch master +Changes not staged for commit: + (use "git add ..." to update what will be committed) + (use "git checkout -- ..." to discard changes in working directory) + + modified: hello.txt + +no changes added to commit (use "git add" and/or "git commit -a") +$ git add hello.txt +$ git status +On branch master +Changes to be committed: + (use "git reset HEAD ..." to unstage) + + modified: hello.txt + +$ git commit -m 'Add a line' +[master 35f60a8] Add a line + 1 file changed, 1 insertion(+) +``` + +Now, if we visualize the history again, we'll see some of the graph structure: + +``` +* commit 35f60a825be0106036dd2fbc7657598eb7b04c67 (HEAD -> master) +| Author: Missing Semester +| Date: Tue Jan 21 22:26:20 2020 -0500 +| +| Add a line +| +* commit 4515d17a167bdef0a91ee7d50d75b12c9c2652aa + Author: Anish Athalye + Date: Tue Jan 21 22:18:36 2020 -0500 + + Initial commit +``` + +Also, note that it shows the current HEAD, along with the current branch +(master). + +We can look at old versions using the `git checkout` command. 
+ +```console +$ git checkout 4515d17 # previous commit hash; yours will be different +Note: checking out '4515d17'. + +You are in 'detached HEAD' state. You can look around, make experimental +changes and commit them, and you can discard any commits you make in this +state without impacting any branches by performing another checkout. + +If you want to create a new branch to retain commits you create, you may +do so (now or later) by using -b with the checkout command again. Example: + + git checkout -b <new-branch-name> + +HEAD is now at 4515d17 Initial commit +$ cat hello.txt +hello, git +$ git checkout master +Previous HEAD position was 4515d17 Initial commit +Switched to branch 'master' +$ cat hello.txt +hello, git +another line +``` + +Git can show you how files have evolved (differences, or diffs) using the `git +diff` command: + +```console +$ git diff 4515d17 hello.txt +diff --git c/hello.txt w/hello.txt +index 94bab17..f0013b2 100644 +--- c/hello.txt ++++ w/hello.txt +@@ -1 +1,2 @@ + hello, git + +another line +``` + +{% endcomment %} + +Watch the lecture video or see the [Git documentation](https://git-scm.com/doc) +for detailed information on the commands below. + +- `git init`: creates a new git repo +- `git status`: tells you what's going on +- `git add <filename>`: stages files for commit +- `git commit`: creates a new commit + - Write [good commit messages](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! +- `git commit --amend`: edit a commit's contents/message +- `git log`: shows a flattened log of history +- `git log --all --graph --decorate`: visualizes history as a DAG +- `git diff <filename>`: shows differences in a file between snapshots +- `git checkout <revision>`: updates HEAD and current branch + +## Branching and merging + +{% comment %} + +Branching allows you to "fork" version history. It can be helpful for working +on independent features or bug fixes in parallel. 
The `git branch` command can +be used to create new branches; `git checkout -b ` creates and +branch and checks it out. + +Merging is the opposite of branching: it allows you to combine forked version +histories, e.g. merging a feature branch back into master. The `git merge` +command is used for merging. + +{% endcomment %} + +- `git branch`: shows branches +- `git branch `: creates a branch +- `git checkout -b `: creates a branch and switches to it + - same as `git branch ; git checkout ` +- `git merge `: merges into current branch +- `git rebase`: rebase set of patches onto a new base + +## Remotes + +- `git remote`: list remotes +- `git remote add `: add a remote +- `git push :`: send objects to remote, and update remote reference +- `git branch --set-upstream-to=/`: set up correspondence between local and remote branch +- `git fetch`: retrieve objects/references from a remote +- `git pull`: same as `git fetch`, except updates local branch +- `git clone`: download repository from remote + +# Advanced Git + +- `git config`: Git is [highly customizable](https://git-scm.com/docs/git-config) +- `git clone --shallow`: clone without entire version history +- `git add -p`: interactive staging +- `git rebase -i`: interactive rebasing +- `git blame`: show who last edited which line +- `git stash`: temporarily remove modifications to working directory +- `git bisect`: binary search history (e.g. for regressions) +- `.gitignore`: [specify](https://git-scm.com/docs/gitignore) intentionally untracked files to ignore + +# Miscellaneous + +- **GUIs**: There are many [GUI clients](https://git-scm.com/downloads/guis) +out there for Git. We personally don't use them and use the command-line +interface instead. +- **Shell integration**: It's super handy to have a Git status as part of your +shell prompt ([zsh](https://github.com/olivierverdier/zsh-git-prompt), +[bash](https://github.com/magicmonty/bash-git-prompt)). 
Often included in +frameworks like [Oh My Zsh](https://github.com/ohmyzsh/ohmyzsh). +- **Editor integration**: Similarly to the above, handy integrations with many +features. [fugitive.vim](https://github.com/tpope/vim-fugitive) is the standard +one for Vim. +- **Workflows**: we taught you the data model, plus some basic commands; we +didn't tell you what practices to follow when working on big projects (and +there are [many](https://nvie.com/posts/a-successful-git-branching-model/) +[different](https://www.endoflineblog.com/gitflow-considered-harmful) +[approaches](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)). +- **GitHub**: Git is not GitHub. GitHub has a specific way of contributing code +to other projects, called [pull +requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). + +# Resources + +- [Pro Git](https://git-scm.com/book/en/v2) is **highly recommended reading**. +Going through Chapters 1--5 should teach you most of what you need to use Git +proficiently, now that you understand the data model. The later chapters have +some interesting, advanced material. +- [Oh Shit, Git!?!](https://ohshitgit.com/) is a short guide on how to recover +from some common Git mistakes. +- [Git for Computer +Scientists](https://eagain.net/articles/git-for-computer-scientists/) is a +short explanation of Git's data model, with less pseudocode and more fancy +diagrams than these lecture notes. +- [Git from the Bottom Up](https://jwiegley.github.io/git-from-the-bottom-up/) +is a detailed explanation of Git's implementation details beyond just the data +model, for the curious. +- [How to explain git in simple +words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) +- [Learn Git Branching](https://learngitbranching.js.org/) is a browser-based +game that teaches you Git. + +# Exercises + +1. 
Clone the [repository for the + class](https://github.com/missing-semester/missing-semester). Explore the + version history by visualizing it as a graph. Who was the last person to + modify `README.md`? (Hint: use `git log` with an argument) What was the + commit message associated with the last modification to the `collections:` + line of `_config.yml`? (Hint: use `git blame` and `git show`) +1. One common mistake when learning Git is to commit large files that should + not be managed by Git or adding sensitive information. Try adding a file to + a repository, making some commits and then deleting that file from history + (you may want to look at + [this](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)). +1. Clone some repository from GitHub, and modify one of its existing files. + What happens when you do `git stash`? What do you see when running `git log + --all --oneline`? Run `git stash pop` to undo what you did with `git stash`. + In what scenario might this be useful? +1. Like many command line tools, Git provides a configuration file (or dotfile) + called `~/.gitconfig`. Create an alias in `~/.gitconfig` so that when you + run `git graph`, you get the output of `git log --all --graph --decorate + --oneline`. +1. You can define global ignore patterns in `~/.gitignore_global` after running + `git config --global core.excludesfile ~/.gitignore_global`. Do this, and + set up your global gitignore file to ignore OS-specific or editor-specific + temporary files, like `.DS_Store`. +1. Clone the [repository for the class + website](https://github.com/missing-semester/missing-semester), find a typo + or some other improvement you can make, and submit a pull request on GitHub. 
From 074c835443a23f5198e8d45e66c839cddb0686e6 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 21 Jan 2020 23:28:32 -0500 Subject: [PATCH 202/640] Reformat exercise --- _2020/version-control.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index e0879426..e7781b67 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -491,11 +491,13 @@ game that teaches you Git. # Exercises 1. Clone the [repository for the - class](https://github.com/missing-semester/missing-semester). Explore the - version history by visualizing it as a graph. Who was the last person to - modify `README.md`? (Hint: use `git log` with an argument) What was the - commit message associated with the last modification to the `collections:` - line of `_config.yml`? (Hint: use `git blame` and `git show`) +class](https://github.com/missing-semester/missing-semester). + 1. Explore the version history by visualizing it as a graph. + 1. Who was the last person to modify `README.md`? (Hint: use `git log` with + an argument) + 1. What was the commit message associated with the last modification to the + `collections:` line of `_config.yml`? (Hint: use `git blame` and `git + show`) 1. One common mistake when learning Git is to commit large files that should not be managed by Git or adding sensitive information. Try adding a file to a repository, making some commits and then deleting that file from history From 466f6dab0e5977c6bf66b15ce0a52f39af3e2ad6 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 21 Jan 2020 23:30:33 -0500 Subject: [PATCH 203/640] Fix typo --- _2020/version-control.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index e7781b67..c71e7a7f 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -491,7 +491,7 @@ game that teaches you Git. # Exercises 1. 
Clone the [repository for the -class](https://github.com/missing-semester/missing-semester). +class website](https://github.com/missing-semester/missing-semester). 1. Explore the version history by visualizing it as a graph. 1. Who was the last person to modify `README.md`? (Hint: use `git log` with an argument) From 5bbfe7c4eed9582db9e5720c0859c11808c587f4 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 22 Jan 2020 08:20:47 -0500 Subject: [PATCH 204/640] Flip direction of arrows for consistency --- _2020/version-control.md | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index c71e7a7f..3163b460 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -78,25 +78,26 @@ Git calls these snapshots "commit"s. Visualizing a commit history might look something like this: ``` -o --> o --> o --> o - \ - \ - --> o --> o +o <-- o <-- o <-- o + ^ + \ + --- o <-- o ``` In the ASCII art above, the `o`s correspond to individual commits (snapshots). -After the third commit, the history branches into two separate branches. This -might correspond to, for example, two separate features being developed in -parallel, independently from each other. In the future, these branches may be -merged to create a new snapshot that incorporates both of the features, -producing a new history that looks like this, with the newly created merge -commit shown in bold: +The arrows point to the parent of each commit (it's a "comes before" relation, +not "comes after"). After the third commit, the history branches into two +separate branches. This might correspond to, for example, two separate features +being developed in parallel, independently from each other. In the future, +these branches may be merged to create a new snapshot that incorporates both of +the features, producing a new history that looks like this, with the newly +created merge commit shown in bold:
-o --> o --> o --> o -----> o
-             \             ^
-              \           /
-               --> o --> o
+o <-- o <-- o <-- o <---- o
+            ^            /
+             \          v
+              --- o <-- o
 
## Data model, as pseudocode From 4beeeeb3e62e7be9d523260af65a8d0b5e25b488 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 22 Jan 2020 08:27:24 -0500 Subject: [PATCH 205/640] Add text on immutable objects and mutable refs --- _2020/version-control.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 3163b460..270b5ea8 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -172,12 +172,15 @@ Now, all snapshots can be identified by their SHA-1 hash. That's inconvenient, because humans aren't good at remembering strings of 40 hexadecimal characters. Git's solution to this problem is human-readable names for SHA-1 hashes, called -"references". References are pointers to commits. +"references". References are pointers to commits. Unlike objects, which are +immutable, references are mutable (can be updated to point to a new commit). +For example, the `master` reference usually points to the latest commit in the +main branch of development. ``` references = map -def create_reference(name, id): +def update_reference(name, id): references[name] = id def read_reference(name): From 7656bd59fced909d70ed7db309731e7964ce2419 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 22 Jan 2020 08:29:53 -0500 Subject: [PATCH 206/640] Fix typo --- _2020/version-control.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 270b5ea8..5baa0827 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -239,13 +239,13 @@ area". 
# Git command-line interface -The `git init` command initializes a new Git repository, with repository -metadata being stored in the `.git` directory: - ## Basics {% comment %} +The `git init` command initializes a new Git repository, with repository +metadata being stored in the `.git` directory: + ```console $ mkdir myproject $ cd myproject From f84b07d89193a362be74aefec063493891ae7969 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 22 Jan 2020 12:12:09 -0500 Subject: [PATCH 207/640] Process Jon and Jose's feedback --- _2020/version-control.md | 49 ++++++++++++++++++++++++++++++++-------- 1 file changed, 39 insertions(+), 10 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 5baa0827..33faa0ab 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -13,13 +13,20 @@ each snapshot encapsulates the entire state of files/folders within a top-level directory. VCSs also maintain metadata like who created each snapshot, messages associated with each snapshot, and so on. -Why is version control useful? Beyond letting you look at old snapshots of the -project, modern VCSs let you easily (and automatically) answer questions like: +Why is version control useful? Even when you're working by yourself, it can let +you look at old snapshots of a project, keep a log of why certain changes were +made, work on parallel branches of development, and much more. When working +with others, it's an invaluable tool for seeing what other people have changed, +as well as resolving conflicts in concurrent development. + +Modern VCSs also let you easily (and often automatically) answer questions +like: - Who wrote this module? - When was this particular line of this particular file edited? By whom? Why was it edited? -- Over the last 1000 revisions, when was this regression introduced? +- Over the last 1000 revisions, when/why did a particular unit test stop +working? 
While other VCSs exist, **Git** is the de facto standard for version control. This [XKCD comic](https://xkcd.com/1597/) captures Git's reputation: @@ -36,9 +43,15 @@ While Git admittedly has an ugly interface, its underlying design and ideas are beautiful. While an ugly interface has to be _memorized_, a beautiful design can be _understood_. For this reason, we give a bottom-up explanation of Git, starting with its data model and later covering the command-line interface. +Once the data model is understood, the commands can be better understood, in +terms of how they manipulate the underlying data model. # Git's data model +There are many ad-hoc approaches you could take to version control. Git has a +well thought-out model that enables all the nice features of version control, +like maintaining history, supporting branches, and enabling collaboration. + ## Snapshots Git models the history of a collection of files and folders within some @@ -100,6 +113,11 @@ o <-- o <-- o <-- o <---- o --- o <-- o +Commits in Git are immutable. This doesn't mean that mistakes can't be +corrected, however; it's just that "edits" to the commit history are actually +creating entirely new commits, and references (see below) are updated to point +to the new ones. + ## Data model, as pseudocode It may be instructive to see Git's data model written down in pseudocode: @@ -186,7 +204,7 @@ def update_reference(name, id): def read_reference(name): return references[name] -def load(name_or_id): +def load_reference(name_or_id): if name_or_id in references: return load(references[name_or_id]) else: @@ -213,7 +231,7 @@ adding objects and adding/updating references. Whenever you're typing in any command, think about what manipulation the command is making to the underlying graph data structure. Conversely, if you're trying to make a particular kind of change to the commit DAG, e.g. 
"discard -uncommitted changes and make the 'master' ref point to commit 5d83f9e", there's +uncommitted changes and make the 'master' ref point to commit `5d83f9e`", there's probably a command to do it (e.g. in this case, `git checkout master; git reset --hard 5d83f9e`). @@ -394,17 +412,20 @@ index 94bab17..f0013b2 100644 {% endcomment %} -Watch the lecture video or see the [Git documentation](https://git-scm.com/doc) -for detailed information on the commands below. +To avoid duplicating information, we're not going to explain the commands below +in detail. See the highly recommended [Pro Git](https://git-scm.com/book/en/v2) +for more information, or watch the lecture video. -- `git init`: creates a new git repo +- `git help `: get help for a git command +- `git init`: creates a new git repo, with data stored in the `.git` directory - `git status`: tells you what's going on -- `git add `: stages files for commit +- `git add `: adds files to staging area - `git commit`: creates a new commit - Write [good commit messages](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! - `git commit --amend`: edit a commit's contents/message - `git log`: shows a flattened log of history - `git log --all --graph --decorate`: visualizes history as a DAG +- `git diff `: show differences since the last commit - `git diff `: shows differences in a file between snapshots - `git checkout `: updates HEAD and current branch @@ -428,6 +449,7 @@ command is used for merging. - `git checkout -b `: creates a branch and switches to it - same as `git branch ; git checkout ` - `git merge `: merges into current branch +- `git mergetool`: use a fancy tool to help resolve merge conflicts - `git rebase`: rebase set of patches onto a new base ## Remotes @@ -437,7 +459,7 @@ command is used for merging. 
- `git push :`: send objects to remote, and update remote reference - `git branch --set-upstream-to=/`: set up correspondence between local and remote branch - `git fetch`: retrieve objects/references from a remote -- `git pull`: same as `git fetch`, except updates local branch +- `git pull`: same as `git fetch; git merge` - `git clone`: download repository from remote # Advanced Git @@ -471,6 +493,9 @@ there are [many](https://nvie.com/posts/a-successful-git-branching-model/) - **GitHub**: Git is not GitHub. GitHub has a specific way of contributing code to other projects, called [pull requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). +- **Other Git providers**: GitHub is not special: there are many Git repository +hosts, like [GitLab](https://about.gitlab.com/) and +[BitBucket](https://bitbucket.org/). # Resources @@ -494,6 +519,10 @@ game that teaches you Git. # Exercises +1. If you don't have any past experience with Git, either try reading the first + couple chapters of [Pro Git](https://git-scm.com/book/en/v2) or go through a + tutorial like [Learn Git Branching](https://learngitbranching.js.org/). As + you're working through it, relate Git commands to the data model. 1. Clone the [repository for the class website](https://github.com/missing-semester/missing-semester). 1. Explore the version history by visualizing it as a graph. From b53fc36d9cca945872413d549e200c84d984ed76 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 22 Jan 2020 12:27:28 -0500 Subject: [PATCH 208/640] Move --- _2020/version-control.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 33faa0ab..34905b07 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -257,6 +257,10 @@ area". # Git command-line interface +To avoid duplicating information, we're not going to explain the commands below +in detail. 
See the highly recommended [Pro Git](https://git-scm.com/book/en/v2) +for more information, or watch the lecture video. + ## Basics {% comment %} @@ -412,10 +416,6 @@ index 94bab17..f0013b2 100644 {% endcomment %} -To avoid duplicating information, we're not going to explain the commands below -in detail. See the highly recommended [Pro Git](https://git-scm.com/book/en/v2) -for more information, or watch the lecture video. - - `git help `: get help for a git command - `git init`: creates a new git repo, with data stored in the `.git` directory - `git status`: tells you what's going on From eed72f23263bc33b30cdc91bf4a549bc2595a274 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 22 Jan 2020 22:19:49 -0500 Subject: [PATCH 209/640] Add command-line lecture video --- _2020/command-line.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/_2020/command-line.md b/_2020/command-line.md index 309cb021..2f4d44fb 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -3,6 +3,9 @@ layout: lecture title: "Command-line Environment" date: 2019-1-21 ready: true +video: + aspect: 56.25 + id: MpJPHy4kUEs --- In this lecture we will go through several ways in which you can improve your workflow when using the shell. We have been working with the shell for a while now, but we have mainly focused on executing different commands. We will now see how to run several processes at the same time while keeping track of them, how to stop or pause a specific process and how to make a process run in the background. From cb1a94c8933406b4d1a7089ed9c5422403e2bc3f Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 22 Jan 2020 22:20:01 -0500 Subject: [PATCH 210/640] Fix missing EOF newline --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 2f4d44fb..80156afe 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -506,4 +506,4 @@ Host vm 1. 
Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. 1. Edit your SSH server config by doing `sudo vim /etc/ssh/sshd_config` and disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. Restart the `ssh` service with `sudo service sshd restart`. Try sshing in again. 1. (Challenge) Install [`mosh`](https://mosh.org/) in the VM and establish a connection. Then disconnect the network adapter of the server/VM. Can mosh properly recover from it? -1. (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command to achieve background port forwarding. \ No newline at end of file +1. (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command to achieve background port forwarding. From 3f102b9703c29de7f2bbf68e7ffaafb13321c670 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 22 Jan 2020 22:20:24 -0500 Subject: [PATCH 211/640] Fix insane date value scheme --- _2020/command-line.md | 2 +- _2020/course-shell.md | 2 +- _2020/data-wrangling.md | 2 +- _2020/debugging-profiling.md | 2 +- _2020/editors.md | 2 +- _2020/metaprogramming.md | 2 +- _2020/mlk-day.md | 2 +- _2020/potpourri.md | 2 +- _2020/qa.md | 2 +- _2020/security.md | 2 +- _2020/shell-tools.md | 2 +- _2020/version-control.md | 2 +- 12 files changed, 12 insertions(+), 12 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 80156afe..308a3255 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Command-line Environment" -date: 2019-1-21 +date: 2019-01-21 ready: true video: aspect: 56.25 diff --git a/_2020/course-shell.md b/_2020/course-shell.md index c55288e0..48ef4f78 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Course overview + 
the shell" -date: 2019-1-13 +date: 2019-01-13 ready: true video: aspect: 56.25 diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index b5da8587..71043ba9 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Data Wrangling" -date: 2019-1-16 +date: 2019-01-16 ready: true video: aspect: 56.25 diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index fc6b55f9..1c28f1e6 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Debugging and Profiling" -date: 2019-1-23 +date: 2019-01-23 --- A golden rule in programming is that code will not do what you expect it to do but what you told it to do. diff --git a/_2020/editors.md b/_2020/editors.md index e12eb134..cac6585d 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Editors (Vim)" -date: 2019-1-15 +date: 2019-01-15 ready: true video: aspect: 56.25 diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index ff8f1948..135c996e 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -2,5 +2,5 @@ layout: lecture title: "Metaprogramming" details: build systems, sermver, makefiles, CI -date: 2019-1-27 +date: 2019-01-27 --- diff --git a/_2020/mlk-day.md b/_2020/mlk-day.md index 8e5a3386..ff8ca39c 100644 --- a/_2020/mlk-day.md +++ b/_2020/mlk-day.md @@ -1,6 +1,6 @@ --- layout: null title: "MLK day" -date: 2019-1-20 +date: 2019-01-20 noclass: true --- diff --git a/_2020/potpourri.md b/_2020/potpourri.md index cb6c658d..c147bc41 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -1,5 +1,5 @@ --- layout: lecture title: "Potpourri" -date: 2019-1-29 +date: 2019-01-29 --- diff --git a/_2020/qa.md b/_2020/qa.md index 25ec27e5..c2baca15 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -1,5 +1,5 @@ --- layout: lecture title: "Q&A" -date: 2019-1-30 +date: 2019-01-30 --- diff --git a/_2020/security.md 
b/_2020/security.md index 183c1712..d8856911 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -1,5 +1,5 @@ --- layout: lecture title: "Security and Privacy" -date: 2019-1-28 +date: 2019-01-28 --- diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 525ac9c0..f82e7a85 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Shell Tools and Scripting" -date: 2019-1-14 +date: 2019-01-14 ready: true video: aspect: 56.25 diff --git a/_2020/version-control.md b/_2020/version-control.md index 34905b07..48f81404 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Version Control (Git)" -date: 2019-1-22 +date: 2019-01-22 ready: true --- From a4bac9256aaa0ea79e66a60ffb9ad245216e21bd Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 22 Jan 2020 22:36:19 -0500 Subject: [PATCH 212/640] Finish exercises for debug and profiling --- _2020/debugging-profiling.md | 131 ++++++++++++++++++++--------------- static/files/sorts.py | 54 +++++++++++++++ 2 files changed, 131 insertions(+), 54 deletions(-) create mode 100644 static/files/sorts.py diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 1c28f1e6..2a305fb7 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -6,38 +6,39 @@ date: 2019-01-23 A golden rule in programming is that code will not do what you expect it to do but what you told it to do. Bridging that gap can sometimes be a quite difficult feat. -In this lecture we will cover useful techniques for dealing with buggy and resource hungry code: debugging and profiling. +In this lecture we are going to cover useful techniques for dealing with buggy and resource hungry code: debugging and profiling. # Debugging ## Printf debugging and Logging "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements" — Brian Kernighan, _Unix for Beginners_. 
+ The first approach to debug a problem is often adding print statements around where you have detected that something is wrong and keep iterating until you have extracted enough information to understand what is responsible for the issue. The next step is to do use logging in your program instead of ad hoc print statements. Logging is better than just regular print statements for several reasons: -- You can log to files, sockets even remote servers instead of standard output -- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c) so you can filter your output accordingly -- For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong +- You can log to files, sockets even remote servers instead of standard output. +- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c) so you can filter your output accordingly. +- For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong. One of my favorite tips for making logs more readable is to color code them. By now you probably have realized that your terminal uses colors to make things more readable. -But how does it do it? Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code) which are special sequences of characters to indicate your shell to change the color of the output. For example executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` will print a red `This is red` message in your terminal. +But how does it do it? Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code) which are special sequences of characters to indicate your shell to change the color of the output. For example executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints a red `This is red` message in your terminal. 
## Third party logs -As you start building larger software systems you will often run into dependencies that will run as separate programs. +As you start building larger software systems you will most probably run into dependencies that will run as separate programs. Web servers, databases or message brokers are common examples of this kind of dependencies. -When interacting with these systems you will often need to read their logs since client side error message might not suffice. +When interacting with these systems you will often need to read their logs since client side error messages might not suffice. Luckily, most programs will write their own logs somewhere in your system. -In UNIX systems, it commonplace for that programs write their logs under `/var/log`. +In UNIX systems, it is commonplace for programs to write their logs under `/var/log`. For instance, the [NGINX](https://www.nginx.com/) webserver will place its logs under `/var/log/nginx`. More recently, systems have started using a **system log** ”, which is increasingly where all of your log messages go. -Most (but not all) Linux systems will `systemd`, a system daemon that will control many things in your system such as which services are enabled and running. -`systemd` will place the logs under `/var/log/journal` in a specialized format and you can use [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) to display the messages. -Similarly, on macOS there is still `/var/log/system.log` but increasingly tools will log into the system log that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/) on macOS or BSD. +Most (but not all) Linux systems will use `systemd`, a system daemon that will control many things in your system such as which services are enabled and running. 
+`systemd` will place the logs under `/var/log/journal` in a specialized format and you can use the [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) command to display the messages. +Similarly, on macOS there is still `/var/log/system.log` but increasingly tools will log into the system log that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). On most UNIX systems you can also use the [`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) command to access the kernel log. For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) tool. @@ -52,22 +53,21 @@ log show --last 1m | grep Hello journalctl --since "1m ago" | grep Hello ``` -As we saw in the data wrangling lecture, logs can be quite verbose and they might require some level of processing and filtering to get the information you want. -If you find yourself heavily filtering through `journalctl` and `log show` you will probably want to familiarize yourself with their flags which can perform a first round of filtering of their output. -There are some tools like [`lnav`](http://lnav.org/) that provide an improved presentation and navigation for log files. +As we saw in the data wrangling lecture, logs can be quite verbose and they require some level of processing and filtering to get the information you want. +If you find yourself heavily filtering through `journalctl` and `log show` you will probably want to familiarize yourself with their flags which can perform a first pass of filtering of their output. +There are also some tools like [`lnav`](http://lnav.org/) that provide an improved presentation and navigation for log files. ## Debuggers When printf debugging is not enough you should be using a debugger. 
Debuggers are programs that will let you interact with the execution of a program, letting you do things like: -- Halt execution of the program when it reaches a certain line -- Step through the program one instruction at a time -- Inspect values of variables after the program crashed +- Halt execution of the program when it reaches a certain line. +- Step through the program one instruction at a time. +- Inspect values of variables after the program crashed. - Conditionally halt the execution when a given condition is met. - And many more advanced features - Many programming languages will come with some form of debugger. In Python this is the Python Debugger [`pdb`](https://docs.python.org/3/library/pdb.html). @@ -81,12 +81,19 @@ Here is a brief description of some of the commands `pdb` supports. - **r**(eturn) - Continue execution until the current function returns. - **q**(uit) - Quit from the debugger +Let's go through an example of using `pdb` to fix the following buggy python code. + +```bash +TODO TODO +``` + + Note that since Python is an interpreted language we can use the `pdb` shell to execute commands and to execute instructions. -[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, better introspection while retaining the same interface as the `pdb` module. +[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, and better introspection while retaining the same interface as the `pdb` module. For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). 
-They are optimized for C-like language debugging but will let you probe pretty much any process and get its current state: registers, stack, program counter, &c. +They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, &c. ## Specialized Tools @@ -94,7 +101,8 @@ They are optimized for C-like language debugging but will let you probe pretty m Even if what you are trying to debug is a black box binary there are tools that can help you with that. Whenever programs need to perform actions that only the kernel can, they will use [System Calls](https://en.wikipedia.org/wiki/System_call). There are commands that will let you trace the syscalls your program makes. In Linux there's [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) and macOS and BSD have [`dtrace`](http://dtrace.org/blogs/about/). Since `dtrace` can be tricky to use since it uses its own `D` language there is a wrapper called [`dtruss`](https://www.manpagez.com/man/1/dtruss/) that will provide an interface more similar to `strace` (more details [here](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html)). -Below are some examples of using `strace` or `dtruss` to show [`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) syscall traces for an execution of `ls`. For a deeper dive into strace , try reading [this](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2). + +Below are some examples of using `strace` or `dtruss` to show [`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) syscall traces for an execution of `ls`. For a deeper dive into `strace`, [this](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2) is a good read. 
```bash # On Linux @@ -104,10 +112,10 @@ sudo strace -e lstat ls -l > /dev/null sudo dtruss -t lstat64_extended ls -l > /dev/null ``` -If your programs rely on network functionality, looking at the network packets might be necessary to figure out what is going wrong. +Under some circumstances, looking at the network packets might be necessary to figure out what is going wrong with your program. Tools like [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) and [Wireshark](https://www.wireshark.org/) are network packet analyzers that will let you read the contents of network packets and filter them based on many criteria. -For web development the Chrome/Firefox developer tools are a quite amazing tool. They feature a large number of tools: +For web development, the Chrome/Firefox developer tools are a quite amazing tool. They feature a large number of tools: - Source code - Inspect the HTML/CSS/JS source code of any website - Live HTML, CSS, JS modification - Change the website content, styles and behavior to test. (This also means that website screenshots are not valid proofs). - Javascript shell - Execute commands in the JS REPL @@ -117,11 +125,12 @@ For web development the Chrome/Firefox developer tools are a quite amazing tool. ## Static Analysis Not all issues need the code to be run to be discovered. -For example, just by carefully looking at a piece of code you could realize that your loop variable is overshadowing an already existing variable or function name; or that a variable has never been defined. +For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name; or that a variable has never been defined. Here is where [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis) tools come into play. 
-Static analysis programs will go through the source +Static analysis programs take source code as input and analyze it using coding rules to reason about its correctness. -In the example below there are several mistakes. First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line so the program will crash, but it will take a minute to do so because of the `sleep` call. +For instance, in the following Python snippet there are several mistakes. +First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line so the program will crash, but it will take a minute to do so because of the `sleep` call. ```python import time @@ -137,7 +146,7 @@ time.sleep(60) print(baz) ``` -Static analysis tools can catch both these issues. We run [`pyflakes`](https://pypi.org/project/pyflakes) on the code and get errors related to those issues. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. Here, `bar` is first an `int` and it's then casted to a `float` so `mypy` will warn is about the error. +Static analysis tools can catch both these issues. When we run [`pyflakes`](https://pypi.org/project/pyflakes) on the code we get errors related to those issues. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. Here, `bar` is first an `int` and it's then cast to a `float` so `mypy` will warn us about the error. Note that all these issues were detected without actually having to run the code. In the shell tools lecture we covered [`shellcheck`](https://www.shellcheck.net/) which is a similar tool for shell scripts.
@@ -161,8 +170,8 @@ For Python, [`pylint`](https://www.pylint.org) and [`pep8`](https://pypi.org/pro For other languages people have compiled comprehensive lists of useful static analysis tools such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). A complementary tool to stylistic linting are code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go or `rustfmt` for Rust. -These tools auto format your code so it's consistent with common stylistic patterns for the given programming language. -Although you might be reticent to give stylistic control about your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. +These tools autoformat your code so it's consistent with common stylistic patterns for the given programming language. +Although you might be unwilling to give stylistic control about your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. # Profiling @@ -194,10 +203,10 @@ print(time.time() - start) # 0.5713930130004883 ``` -However, as you might have noticed if you ran the example above wall clock time might not match your expected measurements. -Wall clock time can be misleading since your computer might be running other processes at the same time or might be waiting for events to happen. Often you will see tools make a distinction between _Real_, _User_ and _Sys_ time. 
In general _User_ + _Sys_ will tell you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)) +However, as you might have noticed if you ran the example above, the printed time might not match your expected measurements. +Wall clock time can be misleading since your computer might be running other processes at the same time or might be waiting for events to happen. It is common for tools to make a distinction between _Real_, _User_ and _Sys_ time. In general _User_ + _Sys_ tells you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)) -- _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processed and time taken while blocked (e.g. waiting for I/O or network) +- _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g. waiting for I/O or network) - _User_ - Amount of time spent in the CPU running user code - _Sys_ - Amount of time spent in the CPU running kernel code @@ -215,24 +224,17 @@ sys 0m0.012s ### CPU Most of the time when people refer to profilers they actually mean CPU profilers since they are the most common. - There are two main types of CPU profilers, tracing profilers and sampling profilers. - Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every milliseconds) and record the program's stack. They then present aggregate statistics of what your program spent the most time doing. [Here](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) is a good intro article if you want more detail on this topic.
-Most programming languages will have at least a command line debugger that you can use. +Most programming languages will have some form a command line profiler that you can use to analyze your code. Often those integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. In Python TODO cProfile -TODO `perf` command -- Basic performance stats: `perf stat {command}` -- Run a program with the profiler: `perf record {command}` -- Analyze profile: `perf report` - A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they will display time per function call. That can become intuitive really fast specially if you are using third party libraries in your code since internal function calls will also be accounted for. A more intuitive way of displaying profiling information is to include the time taken per line of code, this is what _line profilers_ do. @@ -244,6 +246,8 @@ For instance the following piece of Python code performs a request to the class import requests from bs4 import BeautifulSoup +# This is a decorator that tells line_profiler +# that we want to analyze this function @profile def get_urls(): response = requests.get('https://missing.csail.mit.edu') @@ -284,7 +288,7 @@ In languages like C or C++ memory leaks can cause your program to never release To help in the process of memory debugging you can use tools like [Valgrind](https://valgrind.org/) that will help you identify memory leaks. In garbage collected languages like Python it is still useful to use a memory profiler since as long as you have pointers to objects in memory they won't be garbage collected. 
-Here's an example program and the associated output then running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line-profiler`) +Here's an example program and the associated output when running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line-profiler`) ```python @profile @@ -310,6 +314,22 @@ Line # Mem usage Increment Line Contents 8 13.61 MB 0.00 MB return a ``` +### Event Profiling + +As was the case with `strace` for debugging, you might want to ignore the specifics of the code that you are running and treat it like a black box when profiling. +The [`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) command abstracts away CPU differences and does not report time or memory but instead reports system events related to your programs. +For example, `perf` can easily report poor cache locality, high amounts of page faults or livelocks. + +TODO `perf` command + +- `perf list` - List the events that can be traced with perf +- `perf stat COMMAND ARG1 ARG2` - Gets counts of different events related to a process or command +- `perf record` - +- `perf report` - +- Basic performance stats: `perf stat {command}` +- Run a program with the profiler: `perf record {command}` +- Analyze profile: `perf report` + ### Visualization @@ -335,16 +355,16 @@ There is a myriad of command line tools for probing and displaying different sys - **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php) which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). `htop` presents you various statistics for the currently running processes on the system. -See also [`glances`](https://nicolargo.github.io/glances/) for similar implementation with a well designed UI.
For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is a great tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. +See also [`glances`](https://nicolargo.github.io/glances/) for a similar implementation with a great UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is also a nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. - **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information, handy to check if a process is doing heavy I/O disk operations -- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) will display metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In this tools the `-h` flag is quite crucial to get **h**uman readable output. +- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) will display metrics per partition and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print in a **h**uman readable format. A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which will let you navigate folders and delete files and folders as you navigate. -- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also often display in tools like `htop`. +- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`.
- **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a given file. - **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) will let you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. - **Network Usage** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. -If you want to test this tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. +If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. ### Specialized tools @@ -379,18 +399,21 @@ More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Perform 1. Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day. If there aren't any you can execute some harmless commands such as `sudo ls` and check again. -1. Install `pyflakes` or `pylint` and run it in the following piece of Python code. What is wrong with the code? Try fixing it. - -```python -TODO -``` +1. Install [`shellcheck`](https://www.shellcheck.net/) and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically. -1. Run `cProfile`, `line_profiler` and `memory_profiler` in the following piece of Python code.
What can you do to improve its performance? -```python -TODO +```bash +#!/bin/sh +## Example: a typical script with several problems +for f in $(ls *.m3u) +do + grep -qi hq.*mp3 $f \ + && echo -e 'Playlist $f contains a HQ file in mp3 format' +done ``` +1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/2/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cache locality of each algorithm. + 1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. ```python @@ -416,7 +439,7 @@ Put the code into a file and make it executable. Install [`pycallgraph`](http:// 1. A common issue is that a port you want to listen on is already taken by another process. Let's learn how to discover that process pid. First execute `python -m http.server 4444` to start a minimal web server listening on port `4444`. On a separate terminal run `lsof | grep LISTEN` to print all listening processes and ports. Find that process pid and terminate it by running `kill `. 1. Limiting processes resources can be another handy tool in your toolbox. -Try running `stress -c 3` and visualize the CPU consumption with `htop`. Now, execute `taskset --cpu-list 0,2 stress -c 3` and visualize it. Is `stress` taking three CPUs, why not? Read [`man taskset`](http://man7.org/linux/man-pages/man1/taskset.1.html). +Try running `stress -c 3` and visualize the CPU consumption with `htop`. Now, execute `taskset --cpu-list 0,2 stress -c 3` and visualize it. Is `stress` taking three CPUs? Why not? Read [`man taskset`](http://man7.org/linux/man-pages/man1/taskset.1.html). 
Challenge: achieve the same using [`cgroups`](http://man7.org/linux/man-pages/man7/cgroups.7.html). Try limiting the memory consumption of `stress -m`. 1. (Advanced) The command `curl ipinfo.io` performs a HTTP request an fetches information about your public IP. Open [Wireshark](https://www.wireshark.org/) and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets). diff --git a/static/files/sorts.py b/static/files/sorts.py new file mode 100644 index 00000000..5f23ba2e --- /dev/null +++ b/static/files/sorts.py @@ -0,0 +1,54 @@ +import random + + +def test_sorted(fn, iters=1000): + for i in range(iters): + l = [random.randint(0, 100) for i in range(0, random.randint(0, 50))] + assert fn(l) == sorted(l) + # print(fn.__name__, fn(l)) + + +def insertionsort(array): + + for i in range(len(array)): + j = i-1 + v = array[i] + while j >= 0 and v < array[j]: + array[j+1] = array[j] + j -= 1 + array[j+1] = v + return array + + +def quicksort(array): + if len(array) <= 1: + return array + pivot = array[0] + left = [i for i in array[1:] if i < pivot] + right = [i for i in array[1:] if i >= pivot] + return quicksort(left) + [pivot] + quicksort(right) + + +def quicksort_inplace(array, low=0, high=None): + if len(array) <= 1: + return array + if high is None: + high = len(array)-1 + if low >= high: + return array + + pivot = array[high] + j = low-1 + for i in range(low, high): + if array[i] <= pivot: + j += 1 + array[i], array[j] = array[j], array[i] + array[high], array[j+1] = array[j+1], array[high] + quicksort_inplace(array, low, j) + quicksort_inplace(array, j+2, high) + return array + + +if __name__ == '__main__': + for fn in [quicksort, quicksort_inplace, insertionsort]: + test_sorted(fn) From d73cdf11e2ed3e1c5bf22c91d47a30ae0d63354b Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 22 Jan 2020 23:11:33 -0500 Subject: [PATCH 213/640] Finish cProfile example --- _2020/debugging-profiling.md | 48 
+++++++++++++++++++++++++++++++++--- 1 file changed, 45 insertions(+), 3 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 2a305fb7..879d731c 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -232,8 +232,50 @@ They then present aggregate statistics of what your program spent the most time Most programming languages will have some form a command line profiler that you can use to analyze your code. Often those integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. -In Python -TODO cProfile +In Python we can use the `cProfile` module to profile time per function call. Here is a simple example that implements a rudimentary grep in Python. + +```python +#!/usr/bin/env python + +import sys, re + +def grep(pattern, file): + with open(file, 'r') as f: + print(file) + for i, line in enumerate(f.readlines()): + pattern = re.compile(pattern) + match = pattern.search(line) + if match is not None: + print("{}: {}".format(i, line), end="") + +if __name__ == '__main__': + times = int(sys.argv[1]) + pattern = sys.argv[2] + for i in range(times): + for file in sys.argv[3:]: + grep(pattern, file) +``` + +We can profile this code using the following command. Analyzing the output we can see that IO is taking most of the time but compiling the regex also takes a fair amount of time. Since the regex needs to be compiled just once, we can factor it out of the `for` loop.
+ +``` +$ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py + +[omitted program output] + + ncalls tottime percall cumtime percall filename:lineno(function) + 8000 0.266 0.000 0.292 0.000 {built-in method io.open} + 8000 0.153 0.000 0.894 0.000 grep.py:5(grep) + 17000 0.101 0.000 0.101 0.000 {built-in method builtins.print} + 8000 0.100 0.000 0.129 0.000 {method 'readlines' of '_io._IOBase' objects} + 93000 0.097 0.000 0.111 0.000 re.py:286(_compile) + 93000 0.069 0.000 0.069 0.000 {method 'search' of '_sre.SRE_Pattern' objects} + 93000 0.030 0.000 0.141 0.000 re.py:231(compile) + 17000 0.019 0.000 0.029 0.000 codecs.py:318(decode) + 1 0.017 0.017 0.911 0.911 grep.py:3() + +[omitted lines] +``` A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they will display time per function call. That can become intuitive really fast specially if you are using third party libraries in your code since internal function calls will also be accounted for. @@ -412,7 +454,7 @@ do done ``` -1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/2/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cache locality of each algorithm. +1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/2/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. 
Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. 1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. From 6cc4cb14e669b196dc0a639d97606036dbf34009 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 23 Jan 2020 00:17:45 -0500 Subject: [PATCH 214/640] Some writing edits to debug-profile --- _2020/debugging-profiling.md | 162 +++++++++++++++++------------------ 1 file changed, 80 insertions(+), 82 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 879d731c..028a4e7e 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -4,7 +4,7 @@ title: "Debugging and Profiling" date: 2019-01-23 --- -A golden rule in programming is that code will not do what you expect it to do but what you told it to do. +A golden rule in programming is that code does not do what you expect it to do, but what you tell it to do. Bridging that gap can sometimes be a quite difficult feat. In this lecture we are going to cover useful techniques for dealing with buggy and resource hungry code: debugging and profiling. @@ -14,35 +14,35 @@ In this lecture we are going to cover useful techniques for dealing with buggy a "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements" — Brian Kernighan, _Unix for Beginners_. -The first approach to debug a problem is often adding print statements around where you have detected that something is wrong and keep iterating until you have extracted enough information to understand what is responsible for the issue. +A first approach to debugging a program is to add print statements around where you have detected the problem, and keep iterating until you have extracted enough information to understand what is responsible for the issue. -The next step is to do use logging in your program instead of ad hoc print statements.
Logging is better than just regular print statements for several reasons: +A second approach is to use logging in your program, instead of ad hoc print statements. Logging is better than regular print statements for several reasons: -- You can log to files, sockets even remote servers instead of standard output. -- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c) so you can filter your output accordingly. +- You can log to files, sockets or even remote servers instead of standard output. +- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c), that allow you to filter the output accordingly. - For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong. One of my favorite tips for making logs more readable is to color code them. -By now you probably have realized that your terminal uses colors to make things more readable. -But how does it do it? Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code) which are special sequences of characters to indicate your shell to change the color of the output. For example executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints a red `This is red` message in your terminal. +By now you probably have realized that your terminal uses colors to make things more readable. But how does it do it? +Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code), which are special sequences of characters to indicate your shell to change the color of the output. For example, executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints the message `This is red` in red on your terminal. ## Third party logs -As you start building larger software systems you will most probably run into dependencies that will run as separate programs. 
+As you start building larger software systems you will most probably run into dependencies that run as separate programs. Web servers, databases or message brokers are common examples of this kind of dependencies. -When interacting with these systems you will often need to read their logs since client side error messages might not suffice. +When interacting with these systems it is often necessary to read their logs, since client side error messages might not suffice. -Luckily, most programs will write their own logs somewhere in your system. +Luckily, most programs write their own logs somewhere in your system. In UNIX systems, it is commonplace for programs to write their logs under `/var/log`. -For instance, the [NGINX](https://www.nginx.com/) webserver will place its logs under `/var/log/nginx`. +For instance, the [NGINX](https://www.nginx.com/) webserver places its logs under `/var/log/nginx`. More recently, systems have started using a **system log** ”, which is increasingly where all of your log messages go. -Most (but not all) Linux systems will use `systemd`, a system daemon that will control many things in your system such as which services are enabled and running. -`systemd` will place the logs under `/var/log/journal` in a specialized format and you can use the [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) command to display the messages. -Similarly, on macOS there is still `/var/log/system.log` but increasingly tools will log into the system log that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). +Most (but not all) Linux systems use `systemd`, a system daemon that controls many things in your system such as which services are enabled and running. +`systemd` places the logs under `/var/log/journal` in a specialized format and you can use the [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) command to display the messages. 
+Similarly, on macOS there is still `/var/log/system.log` but an increasing number of tools use the system log, that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). On most UNIX systems you can also use the [`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) command to access the kernel log. For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) tool. -Many programming languages will also have bindings for doing so. +Many programming languages have bindings for doing so. Here's an example of using `logger` and how to check that the entry made it to the system logs. ```bash @@ -54,13 +54,13 @@ journalctl --since "1m ago" | grep Hello ``` As we saw in the data wrangling lecture, logs can be quite verbose and they require some level of processing and filtering to get the information you want. -If you find yourself heavily filtering through `journalctl` and `log show` you will probably want to familiarize yourself with their flags which can perform a first pass of filtering of their output. -There are also some tools like [`lnav`](http://lnav.org/) that provide an improved presentation and navigation for log files. +If you find yourself heavily filtering through `journalctl` and `log show` you can consider using their flags, which can perform a first pass of filtering of their output. +There are also some tools like [`lnav`](http://lnav.org/), that provide an improved presentation and navigation for log files. ## Debuggers -When printf debugging is not enough you should be using a debugger. -Debuggers are programs that will let you interact with the execution of a program, letting you do things like: +When printf debugging is not enough you should use a debugger. +Debuggers are programs that let you interact with the execution of a program, allowing the following: - Halt execution of the program when it reaches a certain line. 
- Step through the program one instruction at a time. @@ -68,10 +68,10 @@ Debuggers are programs that will let you interact with the execution of a progra - Conditionally halt the execution when a given condition is met. - And many more advanced features -Many programming languages will come with some form of debugger. +Many programming languages come with some form of debugger. In Python this is the Python Debugger [`pdb`](https://docs.python.org/3/library/pdb.html). -Here is a brief description of some of the commands `pdb` supports. +Here is a brief description of some of the commands `pdb` supports: - **l**(ist) - Displays 11 lines around the current line or continue the previous listing. - **s**(tep) - Execute the current line, stop at the first possible occasion. @@ -79,7 +79,7 @@ Here is a brief description of some of the commands `pdb` supports. - **b**(reak) - Set a breakpoint (depending on the argument provided). - **p**(rint) - Evaluate the expression in the current context and print its value. There's also **pp** to display using [`pprint`](https://docs.python.org/3/library/pprint.html) instead. - **r**(eturn) - Continue execution until the current function returns. -- **q**(uit) - Quit from the debugger +- **q**(uit) - Quit the debugger. Let's go through an example of using `pdb` to fix the following buggy python code. @@ -87,20 +87,18 @@ Let's go through an example of using `pdb` to fix the following buggy python cod TODO TODO ``` +Note that since Python is an interpreted language, we can use the `pdb` shell to execute commands and execute instructions. +[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL, thus enabling tab completion, syntax highlighting, better tracebacks and better introspection, while retaining the same interface as the `pdb` module. - -Note that since Python is an interpreted language we can use the `pdb` shell to execute commands and to execute instructions. 
-[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, and better introspection while retaining the same interface as the `pdb` module. - -For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). +For more low level programming you can look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, &c. ## Specialized Tools Even if what you are trying to debug is a black box binary there are tools that can help you with that. -Whenever programs need to perform actions that only the kernel can, they will use [System Calls](https://en.wikipedia.org/wiki/System_call). -There are commands that will let you trace the syscalls your program makes. In Linux there's [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) and macOS and BSD have [`dtrace`](http://dtrace.org/blogs/about/). Since `dtrace` can be tricky to use since it uses its own `D` language there is a wrapper called [`dtruss`](https://www.manpagez.com/man/1/dtruss/) that will provide an interface more similar to `strace` (more details [here](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html)). +Whenever programs need to perform actions that only the kernel can, they use [System Calls](https://en.wikipedia.org/wiki/System_call). +There are commands that let you trace the syscalls your program makes. 
In Linux there's [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) and macOS and BSD have [`dtrace`](http://dtrace.org/blogs/about/). `dtrace` can be tricky to use because it uses its own `D` language, but there is a wrapper called [`dtruss`](https://www.manpagez.com/man/1/dtruss/) that provides an interface more similar to `strace` (more details [here](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html)). Below are some examples of using `strace` or `dtruss` to show [`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) syscall traces for an execution of `ls`. For a deeper dive into `strace`, [this](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2) is a good read. @@ -112,25 +110,25 @@ sudo strace -e lstat ls -l > /dev/null sudo dtruss -t lstat64_extended ls -l > /dev/null ``` -Under some circumstances, looking at the network packets might be necessary to figure out what is going wrong with your program. -Tools like [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) and [Wireshark](https://www.wireshark.org/) are network packet analyzers that will let you read the contents of network packets and filter them based on many criteria. +Under some circumstances, you may need to look at the network packets to figure out the issue in your program. +Tools like [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) and [Wireshark](https://www.wireshark.org/) are network packet analyzers that let you read the contents of network packets and filter them based on different criteria. -For web development, the Chrome/Firefox developer tools are a quite amazing tool. They feature a large number of tools: -- Source code - Inspect the HTML/CSS/JS source code of any website -- Live HTML, CSS, JS modification - Change the website content, styles and behavior to test. (This also means that website screenshots are not valid proofs). 
-- Javascript shell - Execute commands in the JS REPL -- Network - Analyze the timeline of requests +For web development, the Chrome/Firefox developer tools are quite handy. They feature a large number of tools, including: +- Source code - Inspect the HTML/CSS/JS source code of any website. +- Live HTML, CSS, JS modification - Change the website content, styles and behavior to test (you can see for yourself that website screenshots are not valid proofs). +- Javascript shell - Execute commands in the JS REPL. +- Network - Analyze the requests timeline. - Storage - Look into the Cookies and local application storage. ## Static Analysis -Not all issues need the code to be run to be discovered. -For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name; or that a variable has never been defined. +For some issues you do not need to run any code. +For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name; or that a program reads a variable before defining it. Here is where [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis) tools come into play. Static analysis programs take source code as input and analyze it using coding rules to reason about its correctness. -For instance, in the following Python snippet there are several mistakes. -First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line so the program will crash, but it will take a minute to do so because of the `sleep` call. +In the following Python snippet there are several mistakes. +First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line, so the program will crash after completing the `sleep` call (which will take one minute). 
```python import time @@ -146,9 +144,10 @@ time.sleep(60) print(baz) ``` -Static analysis tools can catch both these issues. When we run [`pyflakes`](https://pypi.org/project/pyflakes) on the code and get errors related to those issues. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. Here, `bar` is first an `int` and it's then casted to a `float` so `mypy` will warn us about the error. -Note that all these issues were detected without actually having to run the code. -In the shell tools lecture we covered [`shellcheck`](https://www.shellcheck.net/) which is a similar tool for shell scripts. +Static analysis tools can identify this kind of issues. When we run [`pyflakes`](https://pypi.org/project/pyflakes) on the code we get the errors related to both bugs. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. Here, `mypy` will warn us that `bar` is initially an `int` and is then casted to a `float`. +Again, note that all these issues were detected without having to run the code. + +In the shell tools lecture we covered [`shellcheck`](https://www.shellcheck.net/), which is a similar tool for shell scripts. ```bash $ pyflakes foobar.py @@ -162,26 +161,26 @@ foobar.py:11: error: Name 'baz' is not defined Found 3 errors in 1 file (checked 1 source file) ``` -Most editors and IDEs will support displaying the output of these tools within the editor itself, highlighting the locations of warnings and errors. +Most editors and IDEs support displaying the output of these tools within the editor itself, highlighting the locations of warnings and errors. This is often called **code linting** and it can also be used to display other types of issues such as stylistic violations or insecure constructs. In vim, the plugins [`ale`](https://vimawesome.com/plugin/ale) or [`syntastic`](https://vimawesome.com/plugin/syntastic) will let you do that. 
For Python, [`pylint`](https://www.pylint.org) and [`pep8`](https://pypi.org/project/pep8/) are examples of stylistic linters and [`bandit`](https://pypi.org/project/bandit/) is a tool designed to find common security issues. -For other languages people have compiled comprehensive lists of useful static analysis tools such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). +For other languages people have compiled comprehensive lists of useful static analysis tools, such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). A complementary tool to stylistic linting are code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go or `rustfmt` for Rust. -These tools autoformat your code so it's consistent with common stylistic patterns for the given programming language. +These tools autoformat your code so that it's consistent with common stylistic patterns for the given programming language. Although you might be unwilling to give stylistic control about your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. # Profiling -Even if your code functionally behaves as you would expect that might not be good enough if it takes all your CPU or memory in the process. -Algorithms classes will teach you big _O_ notation but they won't teach how to find hot spots in your program. 
-Since [premature optimization is the root of all evil](http://wiki.c2.com/?PrematureOptimization) you should learn about profilers and monitoring tools, since they will help you understand what parts of your program are taking most of the time and/or resources so you can focus on optimizing those parts. +Even if your code functionally behaves as you would expect, that might not be good enough if it takes all your CPU or memory in the process. +Algorithms classes often teach big _O_ notation but not how to find hot spots in your programs. +Since [premature optimization is the root of all evil](http://wiki.c2.com/?PrematureOptimization), you should learn about profilers and monitoring tools. They will help you understand which parts of your program are taking most of the time and/or resources so you can focus on optimizing those parts. ## Timing -Similar to the debugging case, in many scenarios it can be enough to just print the time it took your code between two points. +Similarly to the debugging case, in many scenarios it can be enough to just print the time it took your code between two points. Here is an example in Python using the [`time`](https://docs.python.org/3/library/time.html) module. ```python @@ -203,8 +202,7 @@ print(time.time() - start) # 0.5713930130004883 ``` -However, as you might have noticed if you ran the printed time might not match your expected measurements. -Wall clock time can be misleading since your computer might be running other processes at the same time or might be waiting for events to happen. It is common for tools to make a distinction between _Real_, _User_ and _Sys_ time. 
In general _User_ + _Sys_ tells you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)) +However, wall clock time can be misleading since your computer might be running other processes at the same time or waiting for events to happen. It is common for tools to make a distinction between _Real_, _User_ and _Sys_ time. In general, _User_ + _Sys_ tells you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)). - _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g. waiting for I/O or network) - _User_ - Amount of time spent in the CPU running user code @@ -223,16 +221,16 @@ sys 0m0.012s ### CPU -Most of the time when people refer to profilers they actually mean CPU profilers since they are the most common. -There are two main types of CPU profilers, tracing profilers and sampling profilers. -Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every milliseconds) and record the program's stack. -They then present aggregate statistics of what your program spent the most time doing. +Most of the time when people refer to _profilers_ they actually mean _CPU profilers_, which are the most common. +There are two main types of CPU profilers: _tracing_ and _sampling_ profilers. +Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack. +They use these records to present aggregate statistics of what your program spent the most time doing. 
[Here](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) is a good intro article if you want more detail on this topic. -Most programming languages will have some form a command line profiler that you can use to analyze your code. -Often those integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. +Most programming languages have some sort of command line profiler that you can use to analyze your code. +They often integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. -In Python we can use the `cProfile` module to profile time per function call. Here is a simple example that implements a rudimentary grep in Python. +In Python we can use the `cProfile` module to profile time per function call. Here is a simple example that implements a rudimentary grep in Python: ```python #!/usr/bin/env python @@ -256,7 +254,7 @@ if __name__ == '__main__': grep(pattern, file) ``` -We can profile this code using the following command. Analyzing the output we can see that IO is taking most of the time but compiling the regex also takes a fair amount of time. Since the regex need to be compiled just once we can move factor it out of the for. +We can profile this code using the following command. Analyzing the output we can see that IO is taking most of the time and that compiling the regex takes a fair amount of time as well. Since the regex only needs to be compiled once, we can factor it out of the for. ``` $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py @@ -278,10 +276,10 @@ $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py ``` -A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they will display time per function call. 
That can become intuitive really fast specially if you are using third party libraries in your code since internal function calls will also be accounted for. -A more intuitive way of displaying profiling information is to include the time taken per line of code, this is what _line profilers_ do. +A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they display time per function call. That can become unintuitive really fast, especially if you are using third party libraries in your code since internal function calls are also accounted for. +A more intuitive way of displaying profiling information is to include the time taken per line of code, which is what _line profilers_ do. -For instance the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page. +For instance, the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page: ```python #!/usr/bin/env python @@ -302,7 +300,7 @@ if __name__ == '__main__': get_urls() ``` -If we ran it thorugh Python's `cProfile` profiler we get over 2500 lines of output and even with sorting it is hard to understand where the time is being spent. A quick run with [`line_profiler`](https://github.com/rkern/line_profiler) shows the time taken per line. +If we used Python's `cProfile` profiler we'd get over 2500 lines of output, and even with sorting it'd be hard to understand where the time is being spent. A quick run with [`line_profiler`](https://github.com/rkern/line_profiler) shows the time taken per line: ```bash $ kernprof -l -v a.py @@ -326,11 +324,11 @@ Line # Hits Time Per Hit % Time Line Contents ### Memory -In languages like C or C++ memory leaks can cause your program to never release memory that doesn't need anymore. +In languages like C or C++ memory leaks can cause your program to never release memory that it doesn't need anymore.
To help in the process of memory debugging you can use tools like [Valgrind](https://valgrind.org/) that will help you identify memory leaks. -In garbage collected languages like Python it is still useful to use a memory profiler since as long as you have pointers to objects in memory they won't be garbage collected. -Here's an example program and the associated output when running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line-profiler`) +In garbage collected languages like Python it is still useful to use a memory profiler because as long as you have pointers to objects in memory they won't be garbage collected. +Here's an example program and its associated output when running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line-profiler`). ```python @profile @@ -359,7 +357,7 @@ Line # Mem usage Increment Line Contents ### Event Profiling As it was the case for `strace` for debugging, you might want to ignore the specifics of the code that you are running and treat it like a black box when profiling. -The [`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) command abstracts away CPU differences and does not report time or memory but instead it reports system events related to your programs. +The [`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) command abstracts CPU differences away and does not report time or memory, but instead it reports system events related to your programs. For example, `perf` can easily report poor cache locality, high amounts of page faults or livelocks. TODO `perf` command @@ -379,7 +377,7 @@ Profiler output for real world programs will contain large amounts of informatio Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. Thus there are many tools for displaying profiler's output in a easier to parse way. 
-One common way to display CPU profiling information for sampling profilers is to use a [Flame Graph](http://www.brendangregg.com/flamegraphs.html) which will display a hierarchy of function calls across the Y axis and time taken proportional to the X axis. They are also interactive letting you zoom into specific parts of the program and get their stack traces (try clicking in the image below). +One common way to display CPU profiling information for sampling profilers is to use a [Flame Graph](http://www.brendangregg.com/flamegraphs.html), which will display a hierarchy of function calls across the Y axis and time taken proportional to the X axis. They are also interactive, letting you zoom into specific parts of the program and get their stack traces (try clicking in the image below). [![FlameGraph](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg) @@ -392,18 +390,18 @@ In Python you can use the [`pycallgraph`](http://pycallgraph.slowchop.com/en/mas ## Resource Monitoring Sometimes, the first step towards analyzing the performance of your program is to understand what its actual resource consumption is. -Often programs will run slow when they are resource constrained, e.g. not having enough memory or having a slow network connection. -There is a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. - -- **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php) which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). -`htop` presents you various statistics for the currently running processes on the system. -See also [`glances`](https://nicolargo.github.io/glances/) for similar implementation with a great UI. 
For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is also nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. -- **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information, handy to check if a process is doing heavy I/O disk operations -- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) will display metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. -A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which will let you navigate folders and delete files and folders as you navigate. +Programs often run slowly when they are resource constrained, e.g. without enough memory or on a slow network connection. +There are a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. + +- **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php), which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). +`htop` presents various statistics for the currently running processes on the system. +See also [`glances`](https://nicolargo.github.io/glances/) for a similar implementation with a great UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is another nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c.
+- **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations +- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. +A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which lets you navigate folders and delete files and folders as you navigate. - **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. -- **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a given file. -- **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) will let you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. +- **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file. +- **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) lets you monitor incoming and outgoing network packets statistics as well as interface statistics. 
A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. - **Network Usage** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. @@ -412,7 +410,7 @@ If you want to test these tools you can also artificially impose loads on the ma ### Specialized tools Sometimes, black box benchmarking is all you need to determine what software to use. -Tools like [`hyperfine`](https://github.com/sharkdp/hyperfine) will let you quickly benchmark command line programs. +Tools like [`hyperfine`](https://github.com/sharkdp/hyperfine) let you quickly benchmark command line programs. For instance, in the shell tools and scripting lecture we recommended `fd` over `find`. We can use `hyperfine` to compare them in tasks we run often. E.g. in the example below `fd` was 20x faster than `find` in my machine. @@ -431,7 +429,7 @@ Summary 21.89 ± 2.33 times faster than 'find . -iname "*.jpg"' ``` -As it was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading letting you figure out where time is being spent: loading, rendering, scripting, &c. +As it was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading, letting you figure out where time is being spent (loading, rendering, scripting, &c). 
More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler) and [Chrome](https://developers.google.com/web/tools/chrome-devtools/rendering-toolss). # Exercises From 1fe4a4b281baf6d863867bc3717854f82f866d69 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 23 Jan 2020 00:21:53 -0500 Subject: [PATCH 215/640] Finish debug-profile lecture notes --- _2020/debugging-profiling.md | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 028a4e7e..f2137d4e 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -81,16 +81,26 @@ Here is a brief description of some of the commands `pdb` supports: - **r**(eturn) - Continue execution until the current function returns. - **q**(uit) - Quit the debugger. -Let's go through an example of using `pdb` to fix the following buggy python code. +Let's go through an example of using `pdb` to fix the following buggy python code. (See the lecture video). -```bash -TODO TODO +```python +def bubble_sort(arr): + n = len(arr) + for i in range(n): + for j in range(n): + if arr[j] > arr[j+1]: + arr[j] = arr[j+1] + arr[j+1] = arr[j] + return arr + +print(bubble_sort([4, 2, 1, 8, 7, 6])) ``` -Note that since Python is an interpreted language, we can use the `pdb` shell to execute commands and execute instructions. -[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL, thus enabling tab completion, syntax highlighting, better tracebacks and better introspection, while retaining the same interface as the `pdb` module. -For more low level programming you can look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). 
+Note that since Python is an interpreted language we can use the `pdb` shell to execute commands and to execute instructions. +[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, and better introspection while retaining the same interface as the `pdb` module. + +For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, &c. @@ -364,11 +374,8 @@ TODO `perf` command - `perf list` - List the events that can be traced with perf - `perf stat COMMAND ARG1 ARG2` - Gets counts of different events related a process or command -- `perf record` - -- `perf report` - -- Basic performance stats: `perf stat {command}` -- Run a program with the profiler: `perf record {command}` -- Analyze profile: `perf report` +- `perf record COMMAND ARG1 ARG2` - Records the run of a command and saves the statistical data into a file called `perf.data` +- `perf report` - Formats and prints the data collected in `perf.data` ### Visualization From 908b44963e668b302edd030ed26e226ff4bec65f Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 23 Jan 2020 00:22:51 -0500 Subject: [PATCH 216/640] Remove hanging TODO --- _2020/debugging-profiling.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index f2137d4e..2ffb0580 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -368,9 +368,7 @@ Line # Mem usage Increment Line Contents As it was the case for `strace` for debugging, you might want to ignore the specifics of the code that you are 
running and treat it like a black box when profiling. The [`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) command abstracts CPU differences away and does not report time or memory, but instead it reports system events related to your programs. -For example, `perf` can easily report poor cache locality, high amounts of page faults or livelocks. - -TODO `perf` command +For example, `perf` can easily report poor cache locality, high amounts of page faults or livelocks. Here is an overview of the command: - `perf list` - List the events that can be traced with perf - `perf stat COMMAND ARG1 ARG2` - Gets counts of different events related a process or command From c0e56bfa347cf3be981fff34f33b51cfec171a6d Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Thu, 23 Jan 2020 00:37:30 -0500 Subject: [PATCH 217/640] Toggle lecture visibility --- _2020/debugging-profiling.md | 1 + 1 file changed, 1 insertion(+) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 2ffb0580..cd7d1a0b 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -2,6 +2,7 @@ layout: lecture title: "Debugging and Profiling" date: 2019-01-23 +ready: true --- A golden rule in programming is that code does not do what you expect it to do, but what you tell it to do. 
From b0557182f6d8c9700ef37ef0c86afd20857e0a0f Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Sun, 26 Jan 2020 11:42:21 -0500 Subject: [PATCH 218/640] Minor edits to debug/profiling --- _2020/debugging-profiling.md | 51 +++++++++++++++++++++------- static/files/logger.py | 66 ++++++++++++++++++++++++++++++++++++ 2 files changed, 105 insertions(+), 12 deletions(-) create mode 100644 static/files/logger.py diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index cd7d1a0b..f6145f7d 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -15,7 +15,7 @@ In this lecture we are going to cover useful techniques for dealing with buggy a "The most effective debugging tool is still careful thought, coupled with judiciously placed print statements" — Brian Kernighan, _Unix for Beginners_. -A fist approach to debug a program is to add print statements around where you have detected the problem, and keep iterating until you have extracted enough information to understand what is responsible for the issue. +A first approach to debug a program is to add print statements around where you have detected the problem, and keep iterating until you have extracted enough information to understand what is responsible for the issue. A second approach is to use logging in your program, instead of ad hoc print statements. Logging is better than regular print statements for several reasons: @@ -23,9 +23,33 @@ A second approach is to use logging in your program, instead of ad hoc print sta - Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c), that allow you to filter the output accordingly. - For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong. 
+[Here](/static/files/logger.py) is some example code that logs messages: + +```bash +$ python logger.py +# Raw output as with just prints +$ python logger.py log +# Log formatted output +$ python logger.py log ERROR +# Print only ERROR levels and above +$ python logger.py color +# Color formatted output +``` + One of my favorite tips for making logs more readable is to color code them. By now you probably have realized that your terminal uses colors to make things more readable. But how does it do it? -Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code), which are special sequences of characters to indicate your shell to change the color of the output. For example, executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints the message `This is red` in red on your terminal. +Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code), which are special sequences of characters that tell your terminal to change the color of the output. For example, executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints the message `This is red` in red on your terminal. The following script shows how to print many RGB colors into your terminal. + +```bash +#!/usr/bin/env bash +for R in $(seq 0 20 255); do + for G in $(seq 0 20 255); do + for B in $(seq 0 20 255); do + printf "\e[38;2;${R};${G};${B}m█\e[0m"; + done + done +done +``` ## Third party logs @@ -36,15 +60,15 @@ When interacting with these systems it is often necessary to read their logs, si Luckily, most programs write their own logs somewhere in your system. In UNIX systems, it is commonplace for programs to write their logs under `/var/log`. For instance, the [NGINX](https://www.nginx.com/) webserver places its logs under `/var/log/nginx`. -More recently, systems have started using a **system log** ”, which is increasingly where all of your log messages go.
+More recently, systems have started using a **system log**, which is increasingly where all of your log messages go. Most (but not all) Linux systems use `systemd`, a system daemon that controls many things in your system such as which services are enabled and running. `systemd` places the logs under `/var/log/journal` in a specialized format and you can use the [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) command to display the messages. Similarly, on macOS there is still `/var/log/system.log` but an increasing number of tools use the system log, that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). On most UNIX systems you can also use the [`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) command to access the kernel log. -For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) tool. -Many programming languages have bindings for doing so. +For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) shell program. Here's an example of using `logger` and how to check that the entry made it to the system logs. +Moreover, most programming languages have bindings for logging to the system log. ```bash logger "Hello Logs" @@ -116,7 +140,7 @@ Below are some examples of using `strace` or `dtruss` to show [`stat`](http://ma ```bash # On Linux sudo strace -e lstat ls -l > /dev/null - + # On macOS sudo dtruss -t lstat64_extended ls -l > /dev/null ``` @@ -315,7 +339,7 @@ If we used Python's `cProfile` profiler we'd get over 2500 lines of output, and ```bash $ kernprof -l -v a.py -Wrote profile results to a.py.lprof +Wrote profile results to urls.py.lprof Timer unit: 1e-06 s Total time: 0.636188 s @@ -400,7 +424,7 @@ Programs often run slowly when they are resource constrained, e.g.
without enoug There are a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. - **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php), which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). -`htop` presents various statistics for the currently running processes on the system. +`htop` presents various statistics for the currently running processes on the system. `htop` has a myriad of options and keybinds, some useful ones are: `F6` to sort processes, `t` to show tree hierarchy and `h` to toggle threads. See also [`glances`](https://nicolargo.github.io/glances/) for similar implementation with a great UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is another nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. - **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations - **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. @@ -440,12 +464,13 @@ More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Perform # Exercises -1. Do [this](https://github.com/spiside/pdb-tutorial) hands on `pdb` tutorial to familiarize yourself with the commands. For a more in depth tutorial read [this](https://realpython.com/python-debugging-pdb). - +## Debugging 1. Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day.
If there aren't any you can execute some harmless commands such as `sudo ls` and check again. -1. Install [`shellchek`](https://www.shellcheck.net/) and try checking following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically. +1. Do [this](https://github.com/spiside/pdb-tutorial) hands on `pdb` tutorial to familiarize yourself with the commands. For a more in depth tutorial read [this](https://realpython.com/python-debugging-pdb). + +1. Install [`shellcheck`](https://www.shellcheck.net/) and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically. ```bash @@ -458,6 +483,9 @@ do done ``` +1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). +## Profiling + 1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/2/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. 1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. @@ -490,4 +518,3 @@ Challenge: achieve the same using [`cgroups`](http://man7.org/linux/man-pages/ma 1. (Advanced) The command `curl ipinfo.io` performs a HTTP request an fetches information about your public IP. 
Open [Wireshark](https://www.wireshark.org/) and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets). -1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). diff --git a/static/files/logger.py b/static/files/logger.py new file mode 100644 index 00000000..44cb31f1 --- /dev/null +++ b/static/files/logger.py @@ -0,0 +1,66 @@ +import logging +import sys + +class CustomFormatter(logging.Formatter): + """Logging Formatter to add colors and count warning / errors""" + + grey = "\x1b[38;21m" + yellow = "\x1b[33;21m" + red = "\x1b[31;21m" + bold_red = "\x1b[31;1m" + reset = "\x1b[0m" + format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s (%(filename)s:%(lineno)d)" + + FORMATS = { + logging.DEBUG: grey + format + reset, + logging.INFO: grey + format + reset, + logging.WARNING: yellow + format + reset, + logging.ERROR: red + format + reset, + logging.CRITICAL: bold_red + format + reset + } + + def format(self, record): + log_fmt = self.FORMATS.get(record.levelno) + formatter = logging.Formatter(log_fmt) + return formatter.format(record) + +# create logger with 'spam_application' +logger = logging.getLogger("Sample") + +# create console handler with a higher log level +ch = logging.StreamHandler() +ch.setLevel(logging.DEBUG) + +if len(sys.argv)> 1: + if sys.argv[1] == 'log': + ch.setFormatter(logging.Formatter('%(asctime)s : %(levelname)s : %(name)s : %(message)s')) + elif sys.argv[1] == 'color': + ch.setFormatter(CustomFormatter()) + +if len(sys.argv) > 2: + logger.setLevel(logging.__getattribute__(sys.argv[2])) +else: + logger.setLevel(logging.DEBUG) + +logger.addHandler(ch) + +# logger.debug("debug message") +# logger.info("info message") +# logger.warning("warning message") +# 
logger.error("error message") +# logger.critical("critical message") + +import random +import time +for _ in range(100): + i = random.randint(0, 10) + if i <= 4: + logger.info("Value is {} - Everything is fine".format(i)) + elif i <= 6: + logger.warning("Value is {} - System is getting hot".format(i)) + elif i <= 8: + logger.error("Value is {} - Dangerous region".format(i)) + else: + logger.critical("Maximum value reached") + time.sleep(0.3) + From 9ef9b087aba50778ce5a34bbffbe31aeff6568be Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Sun, 26 Jan 2020 17:01:55 -0500 Subject: [PATCH 219/640] First draft of meta --- _2020/meta.md | 268 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 268 insertions(+) create mode 100644 _2020/meta.md diff --git a/_2020/meta.md b/_2020/meta.md new file mode 100644 index 00000000..b6696e06 --- /dev/null +++ b/_2020/meta.md @@ -0,0 +1,268 @@ +--- +layout: lecture +title: "Metaprogramming" +date: 2019-01-27 +# ready: true +# video: +# aspect: 56.25 +# id: QQiUPFvIMt8 +--- + +{% comment %} +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) +{% endcomment %} + +What do we mean by "metaprogramming"? Well, it was the best collective +term we could come up with for the set of things that are more about +_process_ than they are about writing code or working more efficiently. +In this lecture, we will look at systems for building and testing your +code, and for managing dependencies. These may seem like they are of +limited importance in your day-to-day as a student, but the moment you +interact with a larger code base through an internship or once you enter +the "real world", you will see this everywhere. + +# Build systems + +If you write a paper in LaTeX, what are the commands you need to run to +produce your paper? What about the ones used to run your benchmarks, +plot them, and then insert that plot into your paper? 
Or to compile the +code provided in the class you're taking and then run the tests? + +For most projects, whether they contain code or not, there is a "build +process". Some sequence of operations you need to do to go from your +inputs to your outputs. Often, that process might have many steps, and +many branches. Run this to generate this plot, that to generate those +results, and something else to produce the final paper. As with so many +of the things we have seen in this class, you are not the first to +encounter this annoyance, and luckily there exist many tools to help +you! + +These are usually called "build systems", and there are _many_ of them. +Which one you use depends on the task at hand, your language of +preference, and the size of the project. At their core, they are all +very similar though. You define a number of _dependencies_, a number of +_targets_, and _rules_ for going from one to the other. You tell the +build system that you want a particular target, and its job is to find +all the transitive dependencies of that target, and then apply the rules +to produce intermediate targets all the way until the final target has +been produced. Ideally, the build system does this without unnecessarily +executing rules for targets whose dependencies haven't changed and where +the result is available from a previous build. + +`make` is one of the most common build systems out there, and you will +usually find it installed on pretty much any UNIX-based computer. It has +its warts, but works quite well for simple-to-moderate projects. When +you run `make`, it consults a file called `Makefile` in the current +directory. All the targets, their dependencies, and the rules are +defined in that file.
Let's take a look at one: + +```make +paper.pdf: paper.tex plot-data.png + pdflatex paper.tex + +plot-%.png: %.dat plot.py + ./plot.py -i $*.dat -o $@ +``` + +Each directive in this file is a rule for how to produce the left-hand +side using the right-hand side. Or, phrased differently, the things +named on the right-hand side are dependencies, and the left-hand side is +the target. The indented block is a sequence of programs to produce the +target from those dependencies. In `make`, the first directive also +defines the default goal. If you run `make` with no arguments, this is +the target it will build. Alternatively, you can run something like +`make plot-data.png`, and it will build that target instead. + +The `%` in a rule is a "pattern", and will match the same string on the +left and on the right. For example, if the target `plot-foo.png` is +requested, `make` will look for the dependencies `foo.dat` and +`plot.py`. Now let's look at what happens if we run `make` with an empty +source directory. + +```console +$ make +make: *** No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop. +``` + +`make` is helpfully telling us that in order to build `paper.pdf`, it +needs `paper.tex`, and it has no rule telling it how to make that file. +Let's try making it! + +```console +$ touch paper.tex +$ make +make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'. Stop. +``` + +Hmm, interesting, there _is_ a rule to make `plot-data.png`, but it is a +pattern rule. Since the source files do not exist (`data.dat`), `make` +simply states that it cannot make that file.
Let's try creating all the +files: + +```console +$ cat paper.tex +\documentclass{article} +\usepackage{graphicx} +\begin{document} +\includegraphics[scale=0.65]{plot-data.png} +\end{document} +$ cat plot.py +#!/usr/bin/env python +import matplotlib +import matplotlib.pyplot as plt +import numpy as np +import argparse + +parser = argparse.ArgumentParser() +parser.add_argument('-i', type=argparse.FileType('r')) +parser.add_argument('-o') +args = parser.parse_args() + +data = np.loadtxt(args.i) +plt.plot(data[:, 0], data[:, 1]) +plt.savefig(args.o) +$ cat data.dat +1 1 +2 2 +3 3 +4 4 +5 8 +``` + +Now what happens if we run `make`? + +```console +$ make +./plot.py -i data.dat -o plot-data.png +pdflatex paper.tex +... lots of output ... +``` + +And look, it made a PDF for us! +What if we run `make` again? + +```console +$ make +make: 'paper.pdf' is up to date. +``` + +It didn't do anything! Why not? Well, because it didn't need to. It +checked that all of the previously-built targets were still up to date +with respect to their listed dependencies. We can test this by modifying +`paper.tex` and then re-running `make`: + +```console +$ vim paper.tex +$ make +pdflatex paper.tex +... +``` + +Notice that `make` did _not_ re-run `plot.py` because that was not +necessary; none of `plot-data.png`'s dependencies changed! + +# Dependency management + +At a more macro level, your software projects are likely to have +dependencies that are themselves projects. You might depend on installed +programs (like `python`), system packages (like `openssl`), or libraries +within your programming language (like `matplotlib`). These days, most +dependencies will be available through a _repository_ that hosts a +large number of such dependencies in a single place, and provides a +convenient mechanism for installing them. 
Some examples include the +Ubuntu package repositories for Ubuntu system packages, which you access +through the `apt` tool, RubyGems for Ruby libraries, PyPI for Python +libraries, or the Arch User Repository for Arch Linux user-contributed +packages. + +Since the exact mechanisms for interacting with these repositories vary +a lot from repository to repository and from tool to tool, we won't go +too much into the details of any specific one in this lecture. What we +_will_ cover is some of the common terminology they all use. The first +among these is _versioning_. Most projects that other projects depend on +issue a _version number_ with every release. Usually something like +8.1.3 or 64.1.20192004. They are often, but not always, numerical. +Version numbers serve many purposes, and one of the most important of +them is to ensure that software keeps working. Imagine, for example, +that I release a new version of my library where I have renamed a +particular function. If someone tried to build some software that +depends on my library after I release that update, the build might fail +because it calls a function that no longer exists! Versioning attempts +to solve this problem by letting a project say that it depends on a +particular version, or range of versions, of some other project. That +way, even if the underlying library changes, dependent software +continues building by using an older version of my library. + +That also isn't ideal though! What if I issue a security update which +does _not_ change the public interface of my library (its "API"), and +which any project that depended on the old version should immediately +start using? This is where the different groups of numbers in a version +come in. The exact meaning of each one varies between projects, but one +relatively common standard is _semantic versioning_. With semantic +versioning, every version number is of the form: major.minor.patch.
The +rules are: + + - If a new release does not change the API, increase the patch version. + - If you _add_ to your API in a backwards-compatible way, increase the + minor version. + - If you change the API in a non-backwards-compatible way, increase the + major version. + +This already provides some major advantages. Now, if my project depends +on your project, it _should_ be safe to use the latest release with the +same major version as the one I built against when I developed it, as +long as its minor version is at least what it was back then. In other +words, if I depend on your library at version `1.3.7`, then it _should_ +be fine to build it with `1.3.8`, `1.6.1`, or even `1.3.0`. Version +`2.2.4` would probably not be okay, because the major version was +increased. + +When working with dependency management systems, you may also come +across the notion of _lock files_. A lock file is simply a file that +lists the exact version of each dependency you are _currently_ +depending on. Usually, you need to explicitly run an update program to +upgrade to newer versions of your dependencies. There are many reasons +for this, such as avoiding unnecessary recompiles, having reproducible +builds, or not automatically updating to the latest version (which may +be broken). An extreme version of this kind of dependency locking is +_vendoring_, which is where you copy all the code of your dependencies +into your own project. That gives you total control over any changes to +it, and lets you introduce your own changes to it, but also means you +have to explicitly pull in any updates from the upstream maintainers +over time. + +# Continuous integration systems + +As you work on larger and larger projects, you'll find that there are +often additional tasks you have to do whenever you make a change to them.
+You might have to upload a new version of the documentation, upload a +compiled version somewhere, release the code to PyPI, run your test +suite, and all sorts of other things. Maybe every time someone sends you +a pull request on GitHub, you want their code to be style checked and +you want some benchmarks to run? When these kinds of needs arise, it's +time to take a look at continuous integration. + +Continuous integration, or CI, is an umbrella term for "stuff that runs +whenever your code changes", and there are many companies out there that +provide various types of CI, often for free for open-source projects. +Some of the big ones are Travis CI, Azure Pipelines, and GitHub Actions. +They all work in roughly the same way: you add a file to your repository +that describes what should happen when various things happen to that +repository. By far the most common one is a rule like "when someone +pushes code, run the test suite". When the event triggers, the CI +provider spins up a virtual machine (or more), runs the commands in +your "recipe", and then usually notes down the results somewhere. You +might set it up so that you are notified if the test suite stops +passing, or so that a little badge appears on your repository as long as +the tests pass. + +As an example of a CI system, the class website is set up using GitHub +Pages. Pages is a CI action that runs the Jekyll blog software on every +push to `master` and makes the built site available on a particular +GitHub domain. This makes it trivial for us to update the website! We +just make our changes locally, commit them with git, and then push. CI +takes care of the rest. + +# Exercises + + 1.
`make clean` From 3e1690515919edecde5fde7ca514d916c27011e6 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Sun, 26 Jan 2020 21:35:17 -0500 Subject: [PATCH 220/640] Rename lecture notes --- _2020/meta.md | 268 --------------------------------------- _2020/metaprogramming.md | 263 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 263 insertions(+), 268 deletions(-) delete mode 100644 _2020/meta.md diff --git a/_2020/meta.md b/_2020/meta.md deleted file mode 100644 index b6696e06..00000000 --- a/_2020/meta.md +++ /dev/null @@ -1,268 +0,0 @@ ---- -layout: lecture -title: "Metaprogramming" -date: 2019-01-27 -# ready: true -# video: -# aspect: 56.25 -# id: QQiUPFvIMt8 ---- - -{% comment %} -[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) -{% endcomment %} - -What do we mean by "metaprogramming"? Well, it was the best collective -term we could come up with for the set of things that are more about -_process_ than they are about writing code or working more efficiently. -In this lecture, we will look at systems for building and testing your -code, and for managing dependencies. These may seem like they are of -limited importance in your day-to-day as a student, but the moment you -interact with a larger code base through an internship or once you enter -the "real world", you will see this everywhere. - -# Build systems - -If you write a paper in LaTeX, what are the commands you need to run to -produce your paper? What about the ones used to run your benchmarks, -plot them, and then insert that plot into your paper? Or to compile the -code provided in the class you're taking and then running the tests? - -For most projects, whether they contain code or not, there is a "build -process". Some sequence of operations you need to do to go from your -inputs to your outputs. Often, that process might have many steps, and -many branches. 
Run this to generate this plot, that to generate those -results, and something else to produce the final paper. As with so many -of the things we have seen in this class, you are not the first to -encounter this annoyance, and luckily there exists many tools to help -you! - -These are usually called "build systems", and there are _many_ of them. -Which one you use depends on the task at hand, your language of -preference, and the size of the project. At their core, they are all -very similar though. You define a number of _dependencies_, a number of -_targets_, and _rules_ for going from one to the other. You tell the -build system that you want a particular target, and its job is to find -all the transitive dependencies of that target, and then apply the rules -to produce intermediate targets all the way until the final target has -been produced. Ideally, the build system does this without unnecessarily -executing rules for targets whose dependencies haven't changed and where -the result is available from a previous build. - -`make` is one of the most common build systems out there, and you will -usually find it installed on pretty much any UNIX-based computer. It has -its warts, but works quite well for simple-to-moderate projects. When -you run `make`, it consults a file called `Makefile` in the current -directory. All the targets, their dependencies, and the rules are -defined in that file. Let's take a look at one: - -```make -paper.pdf: paper.tex plot-data.png - pdflatex paper.tex - -plot-%.png: %.dat plot.py - ./plot.py -i $*.dat -o $@ -``` - -Each directive in this file is a rule for how to produce the left-hand -side using the right-hand side. Or, phrased differently, the things -named on the right-hand side are dependencies, and the left-hand side is -the target. The indented block is a sequence of programs to produce the -target from those dependencies. In `make`, the first directive also -defines the default goal. 
If you run `make` with no arguments, this is -the target it will build. Alternatively, you can run something like -`make plot-data.png`, and it will build that target instead. - -The `%` in a rule is a "pattern", and will match the same string on the -left and on the right. For example, if the target `plot-foo.png` is -requested, `make` will look for the dependencies `foo.dat` and -`plot.py`. Now let's look at what happens if we run `make` with an empty -source directory. - -```console -$ make -make: *** No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop. -``` - -`make` is helpfully telling us that in order to build `paper.pdf`, it -needs `paper.tex`, and it has no rule telling it how to make that file. -Let's try making it! - -```console -$ touch paper.tex -$ make -make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'. Stop. -``` - -Hmm, interesting, there _is_ a rule to make `plot-data.png`, but it is a -pattern rule. Since the source files do not exist (`foo.dat`), `make` -simply states that it cannot make that file. Let's try creating all the -files: - -```console -$ cat paper.tex -\documentclass{article} -\usepackage{graphicx} -\begin{document} -\includegraphics[scale=0.65]{plot-data.png} -\end{document} -$ cat plot.py -#!/usr/bin/env python -import matplotlib -import matplotlib.pyplot as plt -import numpy as np -import argparse - -parser = argparse.ArgumentParser() -parser.add_argument('-i', type=argparse.FileType('r')) -parser.add_argument('-o') -args = parser.parse_args() - -data = np.loadtxt(args.i) -plt.plot(data[:, 0], data[:, 1]) -plt.savefig(args.o) -$ cat data.dat -1 1 -2 2 -3 3 -4 4 -5 8 -``` - -Now what happens if we run `make`? - -```console -$ make -./plot.py -i data.dat -o plot-data.png -pdflatex paper.tex -... lots of output ... -``` - -And look, it made a PDF for us! -What if we run `make` again? - -```console -$ make -make: 'paper.pdf' is up to date. -``` - -It didn't do anything! Why not? 
Well, because it didn't need to. It -checked that all of the previously-built targets were still up to date -with respect to their listed dependencies. We can test this by modifying -`paper.tex` and then re-running `make`: - -```console -$ vim paper.tex -$ make -pdflatex paper.tex -... -``` - -Notice that `make` did _not_ re-run `plot.py` because that was not -necessary; none of `plot-data.png`'s dependencies changed! - -# Dependency management - -At a more macro level, your software projects are likely to have -dependencies that are themselves projects. You might depend on installed -programs (like `python`), system packages (like `openssl`), or libraries -within your programming language (like `matplotlib`). These days, most -dependencies will be available through a _repository_ that hosts a -large number of such dependencies in a single place, and provides a -convenient mechanism for installing them. Some examples include the -Ubuntu package repositories for Ubuntu system packages, which you access -through the `apt` tool, RubyGems for Ruby libraries, PyPi for Python -libraries, or the Arch User Repository for Arch Linux user-contributed -packages. - -Since the exact mechanisms for interacting with these repositories vary -a lot from repository to repository and from tool to tool, we won't go -too much into the details of any specific one in this lecture. What we -_will_ cover is some of the common terminology they all use. The first -among these is _versioning_. Most projects that other projects depend on -issue a _version number_ with every release. Usually something like -8.1.3 or 64.1.20192004. They are often, but not always, numerical. -Version numbers serve many purposes, and one of the most important of -them is to ensure that software keeps working. Imagine, for example, -that I release a new version of my library where I have renamed a -particular function. 
If someone tried to build some software that -depends on my library after I release that update, the build might fail -because it calls a function that no longer exists! Versioning attempts -to solve this problem by letting a project say that it depends on a -particular version, or range of versions, of some other project. That -way, even if the underlying library changes, dependent software -continues building by using an older version of my library. - -That also isn't ideal though! What if I issue a security update which -does _not_ change the public interface of my library (its "API"), and -which any project that depended on the old version should immediately -start using? This is where the different groups of numbers in a version -come in. The exact meaning of each one varies between projects, but one -relatively common standard is _semantic versioning_. With semantic -versioning, every version number is of the form: major.minor.patch. The -rules are: - - - If a new release does not change the API, increase the patch version. - - If you _add_ to your API in a backwards-compatible way, increase the - minor version. - - If you change the API in a non-backwards-compatible way, increase the - major version. - -This already provides some major advantages. Now, if my project depends -on your project, it _should_ be safe to use the latest release with the -same major version as the one I built against when I developed it, as -long as its minor version is at least what it was back then. In other -words, if I depend on your library at version `1.3.7`, then it _should_ -be fine to build it with `1.3.8`, `1.6.1`, or even `1.3.0`. Version -`2.2.4` would probably not be okay, because the major version was -increased. - -When working with dependency management systems, you may also come -across the notion of _lock files_. A lock file is simply a file that -lists the exact version you are _currently_ depending on of each -dependency. 
Usually, you need to explicitly run an update program to -upgrade to newer versions of your dependencies. There are many reasons -for this, such as avoiding unnecessary recompiles, having reproducible -builds, or not automatically updating to the latest version (which may -be broken). An extreme version of this kind of dependency locking is -_vendoring_, which is where you copy all the code of your dependencies -into your own project. That gives you total control over any changes to -it, and lets you introduce your own changes to it, but also means you -have to explicitly pull in any updates from the upstream maintainers -over time. - -# Continuous integration systems - -As you work on larger and larger projects, you'll find that there are -often additional tasks you have to do whenever you make a change. -You might have to upload a new version of the documentation, upload a -compiled version somewhere, release the code to pypi, run your test -suite, and all sorts of other things. Maybe every time someone sends you -a pull request on GitHub, you want their code to be style checked and -you want some benchmarks to run? When these kinds of needs arise, it's -time to take a look at continuous integration. - -Continuous integration, or CI, is an umbrella term for "stuff that runs -whenever your code changes", and there are many companies out there that -provide various types of CI, often for free for open-source projects. -Some of the big ones are Travis CI, Azure Pipelines, and GitHub Actions. -They all work in roughly the same way: you add a file to your repository -that describes what should happen when various things happen to that -repository. By far the most common one is a rule like "when someone -pushes code, run the test suite". When the event triggers, the CI -provider spins up a virtual machine (or more), runs the commands in -your "recipe", and then usually notes down the results somewhere. 
You -might set it up so that you are notified if the test suite stops -passing, or so that a little badge appears on your repository as long as -the tests pass. - -As an example of a CI system, the class website is set up using GitHub -Pages. Pages is a CI action that runs the Jekyll blog software on every -push to `master` and makes the built site available on a particular -GitHub domain. This makes it trivial for us to update the website! We -just make our changes locally, commit them with git, and then push. CI -takes care of the rest. - -# Exercises - - 1. `make clean` diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 135c996e..9b272ccf 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -3,4 +3,267 @@ layout: lecture title: "Metaprogramming" details: build systems, sermver, makefiles, CI date: 2019-01-27 +# ready: true +# video: +# aspect: 56.25 +# id: QQiUPFvIMt8 --- + +{% comment %} +[Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) +{% endcomment %} + +What do we mean by "metaprogramming"? Well, it was the best collective +term we could come up with for the set of things that are more about +_process_ than they are about writing code or working more efficiently. +In this lecture, we will look at systems for building and testing your +code, and for managing dependencies. These may seem like they are of +limited importance in your day-to-day as a student, but the moment you +interact with a larger code base through an internship or once you enter +the "real world", you will see this everywhere. + +# Build systems + +If you write a paper in LaTeX, what are the commands you need to run to +produce your paper? What about the ones used to run your benchmarks, +plot them, and then insert that plot into your paper? Or to compile the +code provided in the class you're taking and then running the tests? 
+ +For most projects, whether they contain code or not, there is a "build +process": some sequence of operations you need to do to go from your +inputs to your outputs. Often, that process might have many steps, and +many branches. Run this to generate this plot, that to generate those +results, and something else to produce the final paper. As with so many +of the things we have seen in this class, you are not the first to +encounter this annoyance, and luckily there exist many tools to help +you! + +These are usually called "build systems", and there are _many_ of them. +Which one you use depends on the task at hand, your language of +preference, and the size of the project. At their core, they are all +very similar though. You define a number of _dependencies_, a number of +_targets_, and _rules_ for going from one to the other. You tell the +build system that you want a particular target, and its job is to find +all the transitive dependencies of that target, and then apply the rules +to produce intermediate targets all the way until the final target has +been produced. Ideally, the build system does this without unnecessarily +executing rules for targets whose dependencies haven't changed and where +the result is available from a previous build. + +`make` is one of the most common build systems out there, and you will +usually find it installed on pretty much any UNIX-based computer. It has +its warts, but works quite well for simple-to-moderate projects. When +you run `make`, it consults a file called `Makefile` in the current +directory. All the targets, their dependencies, and the rules are +defined in that file. Let's take a look at one: + +```make +paper.pdf: paper.tex plot-data.png + pdflatex paper.tex + +plot-%.png: %.dat plot.py + ./plot.py -i $*.dat -o $@ +``` + +Each directive in this file is a rule for how to produce the left-hand +side using the right-hand side. 
Or, phrased differently, the things +named on the right-hand side are dependencies, and the left-hand side is +the target. The indented block is a sequence of programs to produce the +target from those dependencies. In `make`, the first directive also +defines the default goal. If you run `make` with no arguments, this is +the target it will build. Alternatively, you can run something like +`make plot-data.png`, and it will build that target instead. + +The `%` in a rule is a "pattern", and will match the same string on the +left and on the right. For example, if the target `plot-foo.png` is +requested, `make` will look for the dependencies `foo.dat` and +`plot.py`. Now let's look at what happens if we run `make` with an empty +source directory. + +```console +$ make +make: *** No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop. +``` + +`make` is helpfully telling us that in order to build `paper.pdf`, it +needs `paper.tex`, and it has no rule telling it how to make that file. +Let's try making it! + +```console +$ touch paper.tex +$ make +make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'. Stop. +``` + +Hmm, interesting, there _is_ a rule to make `plot-data.png`, but it is a +pattern rule. Since the source files do not exist (`data.dat` and `plot.py`), `make` +simply states that it cannot make that file. Let's try creating all the +files: + +```console +$ cat paper.tex +\documentclass{article} +\usepackage{graphicx} +\begin{document} +\includegraphics[scale=0.65]{plot-data.png} +\end{document} +$ cat plot.py +#!/usr/bin/env python +import matplotlib +import matplotlib.pyplot as plt +import numpy as np +import argparse + +parser = argparse.ArgumentParser() +parser.add_argument('-i', type=argparse.FileType('r')) +parser.add_argument('-o') +args = parser.parse_args() + +data = np.loadtxt(args.i) +plt.plot(data[:, 0], data[:, 1]) +plt.savefig(args.o) +$ cat data.dat +1 1 +2 2 +3 3 +4 4 +5 8 +``` + +Now what happens if we run `make`? 
+ +```console +$ make +./plot.py -i data.dat -o plot-data.png +pdflatex paper.tex +... lots of output ... +``` + +And look, it made a PDF for us! +What if we run `make` again? + +```console +$ make +make: 'paper.pdf' is up to date. +``` + +It didn't do anything! Why not? Well, because it didn't need to. It +checked that all of the previously-built targets were still up to date +with respect to their listed dependencies. We can test this by modifying +`paper.tex` and then re-running `make`: + +```console +$ vim paper.tex +$ make +pdflatex paper.tex +... +``` + +Notice that `make` did _not_ re-run `plot.py` because that was not +necessary; none of `plot-data.png`'s dependencies changed! + +# Dependency management + +At a more macro level, your software projects are likely to have +dependencies that are themselves projects. You might depend on installed +programs (like `python`), system packages (like `openssl`), or libraries +within your programming language (like `matplotlib`). These days, most +dependencies will be available through a _repository_ that hosts a +large number of such dependencies in a single place, and provides a +convenient mechanism for installing them. Some examples include the +Ubuntu package repositories for Ubuntu system packages, which you access +through the `apt` tool, RubyGems for Ruby libraries, PyPi for Python +libraries, or the Arch User Repository for Arch Linux user-contributed +packages. + +Since the exact mechanisms for interacting with these repositories vary +a lot from repository to repository and from tool to tool, we won't go +too much into the details of any specific one in this lecture. What we +_will_ cover is some of the common terminology they all use. The first +among these is _versioning_. Most projects that other projects depend on +issue a _version number_ with every release. Usually something like +8.1.3 or 64.1.20192004. They are often, but not always, numerical. 
+Version numbers serve many purposes, and one of the most important of +them is to ensure that software keeps working. Imagine, for example, +that I release a new version of my library where I have renamed a +particular function. If someone tried to build some software that +depends on my library after I release that update, the build might fail +because it calls a function that no longer exists! Versioning attempts +to solve this problem by letting a project say that it depends on a +particular version, or range of versions, of some other project. That +way, even if the underlying library changes, dependent software +continues building by using an older version of my library. + +That also isn't ideal though! What if I issue a security update which +does _not_ change the public interface of my library (its "API"), and +which any project that depended on the old version should immediately +start using? This is where the different groups of numbers in a version +come in. The exact meaning of each one varies between projects, but one +relatively common standard is _semantic versioning_. With semantic +versioning, every version number is of the form: major.minor.patch. The +rules are: + + - If a new release does not change the API, increase the patch version. + - If you _add_ to your API in a backwards-compatible way, increase the + minor version. + - If you change the API in a non-backwards-compatible way, increase the + major version. + +This already provides some major advantages. Now, if my project depends +on your project, it _should_ be safe to use the latest release with the +same major version as the one I built against when I developed it, as +long as its minor version is at least what it was back then. In other +words, if I depend on your library at version `1.3.7`, then it _should_ +be fine to build it with `1.3.8`, `1.6.1`, or even `1.3.0`. Version +`2.2.4` would probably not be okay, because the major version was +increased. 
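
The compatibility rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `parse_version` and `is_compatible` are hypothetical, version strings are assumed to be plain `major.minor.patch` with numeric parts, and real package managers apply more nuanced rules (pre-release tags, `0.x` versions, and so on).

```python
def parse_version(version):
    """Split a 'major.minor.patch' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(built_against, candidate):
    """Return True if `candidate` should be usable in place of `built_against`.

    Hypothetical helper: it encodes only the rule from the text (same major
    version, minor version at least as new), not any real tool's resolver.
    """
    base_major, base_minor, _ = parse_version(built_against)
    cand_major, cand_minor, _ = parse_version(candidate)
    # The patch number is ignored because patch releases do not change the API.
    return cand_major == base_major and cand_minor >= base_minor


# Check a few candidates against a dependency on 1.3.7.
for candidate in ["1.3.8", "1.6.1", "1.3.0", "2.2.4", "1.2.9"]:
    print(candidate, is_compatible("1.3.7", candidate))
```

Under these assumptions, `1.3.8`, `1.6.1`, and `1.3.0` are accepted, while `2.2.4` (different major version) and `1.2.9` (older minor version) are rejected.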
+ +When working with dependency management systems, you may also come +across the notion of _lock files_. A lock file is simply a file that +lists the exact version you are _currently_ depending on of each +dependency. Usually, you need to explicitly run an update program to +upgrade to newer versions of your dependencies. There are many reasons +for this, such as avoiding unnecessary recompiles, having reproducible +builds, or not automatically updating to the latest version (which may +be broken). An extreme version of this kind of dependency locking is +_vendoring_, which is where you copy all the code of your dependencies +into your own project. That gives you total control over any changes to +it, and lets you introduce your own changes to it, but also means you +have to explicitly pull in any updates from the upstream maintainers +over time. + +# Continuous integration systems + +As you work on larger and larger projects, you'll find that there are +often additional tasks you have to do whenever you make a change. +You might have to upload a new version of the documentation, upload a +compiled version somewhere, release the code to pypi, run your test +suite, and all sorts of other things. Maybe every time someone sends you +a pull request on GitHub, you want their code to be style checked and +you want some benchmarks to run? When these kinds of needs arise, it's +time to take a look at continuous integration. + +Continuous integration, or CI, is an umbrella term for "stuff that runs +whenever your code changes", and there are many companies out there that +provide various types of CI, often for free for open-source projects. +Some of the big ones are Travis CI, Azure Pipelines, and GitHub Actions. +They all work in roughly the same way: you add a file to your repository +that describes what should happen when various things happen to that +repository. By far the most common one is a rule like "when someone +pushes code, run the test suite". 
When the event triggers, the CI +provider spins up a virtual machine (or more), runs the commands in +your "recipe", and then usually notes down the results somewhere. You +might set it up so that you are notified if the test suite stops +passing, or so that a little badge appears on your repository as long as +the tests pass. + +As an example of a CI system, the class website is set up using GitHub +Pages. Pages is a CI action that runs the Jekyll blog software on every +push to `master` and makes the built site available on a particular +GitHub domain. This makes it trivial for us to update the website! We +just make our changes locally, commit them with git, and then push. CI +takes care of the rest. + +# Exercises + + 1. `make clean` From 883582e10032963fffcdb0c6a120a063edf72bf4 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 27 Jan 2020 09:11:37 -0500 Subject: [PATCH 221/640] Python 2/3, testing, and exercises --- _2020/metaprogramming.md | 67 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 62 insertions(+), 5 deletions(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 9b272ccf..a7a5a39f 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -199,9 +199,9 @@ does _not_ change the public interface of my library (its "API"), and which any project that depended on the old version should immediately start using? This is where the different groups of numbers in a version come in. The exact meaning of each one varies between projects, but one -relatively common standard is _semantic versioning_. With semantic -versioning, every version number is of the form: major.minor.patch. The -rules are: +relatively common standard is [_semantic +versioning_](https://semver.org/). With semantic versioning, every +version number is of the form: major.minor.patch. The rules are: - If a new release does not change the API, increase the patch version. 
- If you _add_ to your API in a backwards-compatible way, increase the @@ -216,7 +216,11 @@ long as its minor version is at least what it was back then. In other words, if I depend on your library at version `1.3.7`, then it _should_ be fine to build it with `1.3.8`, `1.6.1`, or even `1.3.0`. Version `2.2.4` would probably not be okay, because the major version was -increased. +increased. We can see an example of semantic versioning in Python's +version numbers. Many of you are probably aware that Python 2 and Python +3 code do not mix very well, which is why that was a _major_ version +bump. Similarly, code written for Python 3.5 might run fine on Python +3.7, but possibly not on 3.4. When working with dependency management systems, you may also come across the notion of _lock files_. A lock file is simply a file that @@ -264,6 +268,59 @@ GitHub domain. This makes it trivial for us to update the website! We just make our changes locally, commit them with git, and then push. CI takes care of the rest. +## A brief aside on testing + +Most large software projects come with a "test suite". You may already +be familiar with the general concept of testing, but we thought we'd +quickly mention some approaches to testing and testing terminology that +you may encounter in the wild: + + - Test suite: a collective term for all the tests + - Unit test: a "micro-test" that tests a specific feature in isolation + - Integration test: a "macro-test" that runs a larger part of the + system to check that different features or components work _together_. + - Regression test: a test that implements a particular pattern that + _previously_ caused a bug to ensure that the bug does not resurface. + - Mocking: to replace a function, module, or type with a fake + implementation to avoid testing unrelated functionality. For example, + you might "mock the network" or "mock the disk". + # Exercises - 1. `make clean` + 1. Most makefiles provide a target called `clean`. 
This isn't intended + to produce a file called `clean`, but instead to clean up any files + that can be re-built by make. Think of it as a way to "undo" all of + the build steps. Implement a `clean` target for the `paper.pdf` + `Makefile` above. You will have to make the target + [phony](https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html). + You may find the [`git + ls-files`](https://git-scm.com/docs/git-ls-files) subcommand useful. + A number of other very common make targets are listed + [here](https://www.gnu.org/software/make/manual/html_node/Standard-Targets.html#Standard-Targets). + 2. Take a look at the various ways to specify version requirements for + dependencies in [Rust's build + system](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html). + Most package repositories support similar syntax. For each one + (caret, tilde, wildcard, comparison, and multiple), try to come up + with a use-case in which that particular kind of requirement makes + sense. + 3. Git can act as a simple CI system all by itself. In `.git/hooks` + inside any git repository, you will find (currently inactive) files + that are run as scripts when a particular action happens. Write a + [`pre-commit`](https://git-scm.com/docs/githooks#_pre_commit) hook + that runs `make paper.pdf` and refuses the commit if the `make` + command fails. This should prevent any commit from having an + unbuildable version of the paper. + 4. Set up a simple auto-published page using [GitHub + Pages](https://help.github.com/en/actions/automating-your-workflow-with-github-actions). + Add a [GitHub Action](https://github.com/features/actions) to the + repository to run `shellcheck` on any shell files in that + repository (here is [one way to do + it](https://github.com/marketplace/actions/shellcheck)). Check that + it works! + 5. 
[Build your + own](https://help.github.com/en/actions/automating-your-workflow-with-github-actions/building-actions) + GitHub action to run [`proselint`](http://proselint.com/) or + [`write-good`](https://github.com/btford/write-good) on all the + `.md` files in the repository. Enable it in your repository, and + check that it works by filing a pull request with a typo in it. From 9477296cff13e27f8100bbbee9102a0188dd1045 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 27 Jan 2020 09:31:23 -0500 Subject: [PATCH 222/640] Release meta --- _2020/metaprogramming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index a7a5a39f..c4d810c5 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -3,7 +3,7 @@ layout: lecture title: "Metaprogramming" details: build systems, sermver, makefiles, CI date: 2019-01-27 -# ready: true +ready: true # video: # aspect: 56.25 # id: QQiUPFvIMt8 From 197b9a5fcc26fcf4e29e54e5cca7eb192cb93a17 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 27 Jan 2020 11:48:49 -0500 Subject: [PATCH 223/640] Sane print styles --- static/css/main.css | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/static/css/main.css b/static/css/main.css index 56326000..bb61e4b6 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -356,3 +356,15 @@ input[type=checkbox]:checked ~ .menu-label:after { color: #F92672; } } + +@media print { + #nav-bg, #logo, #top-nav { display: none; } + h1.title ~ p.center.gap.accent { display: none; } + .youtube-wrapper { display: none; } + html { font-size: 1em; font-family: sans-serif; } + body { background: none; } + #content { max-width: none; } + h1.title { text-align: center; } + #content hr:last-of-type { display: none; } + #content div.small:last-of-type { display: none; } +} From 2b1c9a161ed60c93dd34f1885d159f8623b8ffe9 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 28 Jan 2020 10:48:25 -0500 Subject: 
[PATCH 224/640] Release lec7 and lec8 videos --- _2020/debugging-profiling.md | 3 +++ _2020/metaprogramming.md | 6 +++--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index f6145f7d..37b4067c 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -3,6 +3,9 @@ layout: lecture title: "Debugging and Profiling" date: 2019-01-23 ready: true +video: + aspect: 56.25 + id: rrO9whcNMC8 --- A golden rule in programming is that code does not do what you expect it to do, but what you tell it to do. diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index c4d810c5..00794761 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -4,9 +4,9 @@ title: "Metaprogramming" details: build systems, sermver, makefiles, CI date: 2019-01-27 ready: true -# video: -# aspect: 56.25 -# id: QQiUPFvIMt8 +video: + aspect: 56.25 + id: kderh1XA30Q --- {% comment %} From 894b1a3231fcfd57c8b6419fa3e04cb0a4c65fa8 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 28 Jan 2020 11:34:08 -0500 Subject: [PATCH 225/640] Add notes for security lecture --- _2020/security.md | 292 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 291 insertions(+), 1 deletion(-) diff --git a/_2020/security.md b/_2020/security.md index d8856911..1546834f 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -1,5 +1,295 @@ --- layout: lecture -title: "Security and Privacy" +title: "Security and Cryptography" date: 2019-01-28 +ready: true --- + +Last year's [security and privacy lecture](/2019/security/) focused on how you +can be more secure as a computer _user_. This year, we will focus on security +and cryptography concepts that are relevant in understanding tools covered +earlier in this class, such as the use of hash functions in Git or key +derivation functions and symmetric/asymmetric cryptosystems in SSH. 
+ +This lecture is not a substitute for a more rigorous and complete course on +computer systems security ([6.858](https://css.csail.mit.edu/6.858/)) or +cryptography ([6.857](https://courses.csail.mit.edu/6.857/) and 6.875). Don't +do security work without formal training in security. Unless you're an expert, +don't [roll your own +crypto](https://www.schneier.com/blog/archives/2015/05/amateurs_produc.html). +The same principle applies to systems security. + +This lecture has a very informal (but we think practical) treatment of basic +cryptography concepts. This lecture won't be enough to teach you how to +_design_ secure systems or cryptographic protocols, but we hope it will be +enough to give you a general understanding of the programs and protocols you +already use. + +# Entropy + +[Entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) is a +measure of randomness. This is useful, for example, when determining the +strength of a password. + +![XKCD 936: Password Strength](https://imgs.xkcd.com/comics/password_strength.png) + +As the above [XKCD comic](https://xkcd.com/936/) illustrates, a password like +"correcthorsebatterystaple" is more secure than one like "Tr0ub4dor&3". But how +do you quantify something like this? + +Entropy is measured in _bits_, and when selecting uniformly at random from a +set of possible outcomes, the entropy is equal to `log_2(# of possibilities)`. +A fair coin flip gives 1 bit of entropy. A dice roll (of a 6-sided die) has +\~2.58 bits of entropy. + +You should consider that the attacker knows the _model_ of the password, but +not the randomness (e.g. from [dice +rolls](https://en.wikipedia.org/wiki/Diceware)) used to select a particular +password. + +How many bits of entropy is enough? It depends on your threat model. For online +guessing, as the XKCD comic points out, \~40 bits of entropy is pretty good. To +be resistant to offline guessing, a stronger password would be necessary (e.g. +80 bits, or more). 
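
As a rough illustration of the formula above, the following Python sketch computes the entropy of a few uniformly random choices. The helper name `entropy_bits` is hypothetical, and the passphrase example assumes four words drawn independently and uniformly from a 2048-word list (the model used in the XKCD comic); a password chosen by a human rather than by dice will generally have less entropy than this calculation suggests.

```python
import math


def entropy_bits(num_possibilities, num_choices=1):
    """Entropy in bits of `num_choices` independent, uniformly random picks
    from a set of `num_possibilities` equally likely outcomes."""
    return num_choices * math.log2(num_possibilities)


print(entropy_bits(2))            # fair coin flip: 1.0
print(round(entropy_bits(6), 2))  # 6-sided die roll: 2.58
# Assumption: four words chosen uniformly from a 2048-word list.
print(entropy_bits(2048, 4))      # 44.0
```

The passphrase lands at 44 bits, which is in line with the "\~40 bits is pretty good for online guessing" rule of thumb, but well short of the 80 bits suggested for resisting offline guessing.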
+ +# Hash functions + +A [cryptographic hash +function](https://en.wikipedia.org/wiki/Cryptographic_hash_function) maps data +of arbitrary size to a fixed size, and has some special properties. A rough +specification of a hash function is as follows: + +``` +hash(value: array<byte>) -> vector<byte, N> (for some fixed N) +``` + +An example of a hash function is [SHA1](https://en.wikipedia.org/wiki/SHA-1), +which is used in Git. It maps arbitrary-sized inputs to 160-bit outputs (which +can be represented as 40 hexadecimal characters). We can try out the SHA1 hash +on an input using the `sha1sum` command: + +```console +$ printf 'hello' | sha1sum +aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d +``` + +At a high level, a hash function can be thought of as a hard-to-invert random +function (and this is the [ideal model of a hash +function](https://en.wikipedia.org/wiki/Random_oracle)). A hash function has +the following properties: + +- Non-invertible: it is hard to find an input `m` such that `hash(m) = h` for +some desired output `h`. +- Target collision resistant: given an input `m_1`, it's hard to find a +different input `m_2` such that `hash(m_1) = hash(m_2)`. +- Collision resistant: it's hard to find two inputs `m_1` and `m_2` such that +`hash(m_1) = hash(m_2)` (note that this is a strictly stronger property than +target collision resistance). + +## Applications + +- Git, for content-addressed storage. The idea of a [hash +function](https://en.wikipedia.org/wiki/Hash_function) is a more general +concept (there are non-cryptographic hash functions). Why does Git use a +cryptographic hash function? +- A short summary of the contents of a file. Software can often be downloaded +from (potentially less trustworthy) mirrors, e.g. Linux ISOs, and it would be +nice to not have to trust them. The official sites usually post hashes +alongside the download links (that point to third-party mirrors), so that the +hash can be checked after downloading a file. 
+- [Commitment schemes](https://en.wikipedia.org/wiki/Commitment_scheme). +Suppose you want to commit to a particular value, but reveal the value itself +later. For example, I want to do a fair coin toss "in my head", without a +trusted shared coin that two parties can see. I could choose a value `r = +random()`, and then share `h = sha256(r)`. Then, you could call heads or tails +(we'll agree that even `r` means heads, and odd `r` means tails). After you +call, I can reveal my value `r`, and you can confirm that I haven't cheated by +checking `sha256(r)` matches the hash I shared earlier. + +# Key derivation functions + +A related concept to cryptographic hashes, [key derivation +functions](https://en.wikipedia.org/wiki/Key_derivation_function) (KDFs) are +used for a number of applications, including producing fixed-length output for +use as keys in other cryptographic algorithms. Usually, KDFs are deliberately +slow, in order to slow down offline brute-force attacks. + +## Applications + +- Producing keys from passphrases for use in other cryptographic algorithms +(e.g. symmetric cryptography, see below). +- Storing login credentials. Storing plaintext passwords is bad; the right +approach is to generate and store a random +[salt](https://en.wikipedia.org/wiki/Salt_(cryptography)) `salt = random()` for +each user, store `KDF(password + salt)`, and verify login attempts by +re-computing the KDF given the entered password and the stored salt. + +# Symmetric cryptography + +Hiding message contents is probably the first concept you think about when you +think about cryptography. 
Symmetric cryptography accomplishes this with the +following set of functionality: + +``` +keygen() -> key (this function is randomized) + +encrypt(plaintext: array<byte>, key) -> array<byte> (the ciphertext) +decrypt(ciphertext: array<byte>, key) -> array<byte> (the plaintext) +``` + +The encrypt function has the property that given the output (ciphertext), it's +hard to determine the input (plaintext) without the key. The decrypt function +has the obvious correctness property, that `decrypt(encrypt(m, k), k) = m`. + +An example of a symmetric cryptosystem in wide use today is +[AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). + +## Applications + +- Encrypting files for storage in an untrusted cloud service. This can be +combined with KDFs, so you can encrypt a file with a passphrase. Generate `key += KDF(passphrase)`, and then store `encrypt(file, key)`. + +# Asymmetric cryptography + +The term "asymmetric" refers to there being two keys, with two different roles. +A private key, as its name implies, is meant to be kept private, while the +public key can be publicly shared and it won't affect security (unlike sharing +the key in a symmetric cryptosystem). Asymmetric cryptosystems provide the +following set of functionality, to encrypt/decrypt and to sign/verify: + +``` +keygen() -> (public key, private key) (this function is randomized) + +encrypt(plaintext: array<byte>, public key) -> array<byte> (the ciphertext) +decrypt(ciphertext: array<byte>, private key) -> array<byte> (the plaintext) + +sign(message: array<byte>, private key) -> array<byte> (the signature) +verify(message: array<byte>, signature: array<byte>, public key) -> bool (whether or not the signature is valid) +``` + +The encrypt/decrypt functions have properties similar to their analogs from +symmetric cryptosystems. A message can be encrypted using the _public_ key. +Given the output (ciphertext), it's hard to determine the input (plaintext) +without the _private_ key. 
The decrypt function has the obvious correctness +property, that `decrypt(encrypt(m, public key), private key) = m`. + +Symmetric and asymmetric encryption can be compared to physical locks. A +symmetric cryptosystem is like a door lock: anyone with the key can lock and +unlock it. Asymmetric encryption is like a padlock with a key. You could give +the unlocked lock to someone (the public key), they could put a message in a +box and then put the lock on, and after that, only you could open the lock +because you kept the key (the private key). + +The sign/verify functions have the same properties that you would hope physical +signatures would have, in that it's hard to forge a signature. No matter the +message, without the _private_ key, it's hard to produce a signature such that +`verify(message, signature, public key)` returns true. And of course, the +verify function has the obvious correctness property that `verify(message, +sign(message, private key), public key) = true`. + +## Applications + +- [PGP email encryption](https://en.wikipedia.org/wiki/Pretty_Good_Privacy). +People can have their public keys posted online (e.g. in a PGP keyserver, or on +[Keybase](https://keybase.io/)). Anyone can send them encrypted email. +- Private messaging. Apps like [Signal](https://signal.org/) and +[Keybase](https://keybase.io/) use asymmetric keys to establish private +communication channels. +- Signing software. Git can have GPG-signed commits and tags. With a posted +public key, anyone can verify the authenticity of downloaded software. + +## Key distribution + +Asymmetric-key cryptography is wonderful, but it has a big challenge of +distributing public keys / mapping public keys to real-world identities. There +are many solutions to this problem. Signal has one simple solution: trust on +first use, and support out-of-band public key exchange (you verify your +friends' "safety numbers" in person). 
PGP has a different solution, which is +[web of trust](https://en.wikipedia.org/wiki/Web_of_trust). Keybase has yet +another solution of [social +proof](https://keybase.io/blog/chat-apps-softer-than-tofu) (along with other +neat ideas). Each model has its merits; we (the instructors) like Keybase's +model. + +# Case study: SSH + +We've covered the use of SSH and SSH keys in an [earlier +lecture](/2020/command-line/#remote-machines). Let's look at the cryptography +aspects of this. + +When you run `ssh-keygen`, it generates an asymmetric keypair, `public_key, +private_key`. This is generated randomly, using entropy provided by the +operating system (collected from hardware events, etc.). The public key is +stored as-is (it's public, so keeping it a secret is not important), but at +rest, the private key should be encrypted on disk. The `ssh-keygen` program +prompts the user for a passphrase, and this is fed through a key derivation +function to produce a key, which is then used to encrypt the private key with a +symmetric cipher. + +In use, once the server knows the client's public key (stored in the +`.ssh/authorized_keys` file), a connecting client can prove its identity using +asymmetric signatures. This is done through +[challenge-response](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication). +At a high level, the server picks a random number and sends it to the client. +The client then signs this message and sends the signature back to the server, +which checks the signature against the public key on record. This effectively +proves that the client is in possession of the private key corresponding to the +public key that's in the server's `.ssh/authorized_keys` file, so the server +can allow the client to log in. + +{% comment %} +extra topics, if there's time + +security concepts, tips +- password managers +- 2FA +- full disk encryption +- biometrics +- private messaging +- HTTPS +{% endcomment %} + +# Exercises + +1. **Entropy.** + 1. 
Suppose a password is chosen as a concatenation of five lower-case + dictionary words, where each word is selected uniformly at random from a + dictionary of size 100,000. An example of such a password is + `correcthorsebatterystaple`. How many bits of entropy does this have? + 1. Consider an alternative scheme where a password is chosen as a sequence + of 8 random alphanumeric characters (including both lower-case and + upper-case letters). An example is `rg8Ql34g`. How many bits of entropy + does this have? + 1. Which is the stronger password? + 1. Suppose an attacker can try guessing 10,000 passwords per second. On + average, how long will it take to break each of the passwords? +1. **Cryptographic hash functions.** Download a Debian image from a + [mirror](https://www.debian.org/CD/http-ftp/) (e.g. [this + file](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso) + from an Argentinean mirror). Cross-check the hash (e.g. using the + `sha256sum` command) with the hash retrieved from the official Debian site + (e.g. [this + file](https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/SHA256SUMS) + hosted at `debian.org`, if you've downloaded the linked file from the + Argentinean mirror). +1. **Symmetric cryptography.** Encrypt a file with AES encryption, using + [OpenSSL](https://www.openssl.org/): `openssl aes-256-cbc -salt -in {input + filename} -out {output filename}`. Look at the contents using `cat` or + `hexdump`. Decrypt it with `openssl aes-256-cbc -d -in {input filename} -out + {output filename}` and confirm that the contents match the original using + `cmp`. +1. **Asymmetric cryptography.** + 1. Set up [SSH + keys](https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2) + on a computer you have access to (not Athena, because Kerberos interacts + weirdly with SSH keys). 
Rather than using RSA keys as in the linked + tutorial, use more secure [Ed25519 + keys](https://wiki.archlinux.org/index.php/SSH_keys#Ed25519). Make sure + your private key is encrypted with a passphrase, so it is protected at + rest. + 1. [Set up GPG](https://www.digitalocean.com/community/tutorials/how-to-use-gpg-to-encrypt-and-sign-messages) + 1. Send Anish an encrypted email ([public key](https://keybase.io/anish)). + 1. Sign a Git commit with `git commit -S` or create a signed Git tag with + `git tag -s`. Verify the signature on the commit with `git show + --show-signature` or on the tag with `git tag -v`. From bb2560af44a34550b71a8ebbaf6abc06e1352a7f Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 28 Jan 2020 13:19:29 -0500 Subject: [PATCH 226/640] Add some more case studies --- _2020/security.md | 48 ++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 43 insertions(+), 5 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index 1546834f..df09a59d 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -212,7 +212,49 @@ proof](https://keybase.io/blog/chat-apps-softer-than-tofu) (along with other neat ideas). Each model has its merits; we (the instructors) like Keybase's model. -# Case study: SSH +# Case studies + +## Password managers + +This is an essential tool that everyone should try to use (e.g. +[KeePassXC](https://keepassxc.org/)). Password managers let you use unique, +randomly generated high-entropy passwords for all your websites, and they save +all your passwords in one place, encrypted with a symmetric cipher with a key +produced from a passphrase using a KDF. + +Using a password manager lets you avoid password reuse (so you're less impacted +when websites get compromised), use high-entropy passwords (so you're less likely to +get compromised), and only need to remember a single high-entropy password. 
+ +## Two-factor authentication + +[Two-factor +authentication](https://en.wikipedia.org/wiki/Multi-factor_authentication) +(2FA) requires you to use a passphrase ("something you know") along with a 2FA +authenticator (like a [YubiKey](https://www.yubico.com/), "something you have") +in order to protect against stolen passwords and +[phishing](https://en.wikipedia.org/wiki/Phishing) attacks. + +## Full disk encryption + +Keeping your laptop's entire disk encrypted is an easy way to protect your data +in the case that your laptop is stolen. You can use [cryptsetup + +LUKS](https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system) +on Linux, +[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows-10/) on +Windows, or [FileVault](https://support.apple.com/en-us/HT204837) on macOS. +This encrypts the entire disk with a symmetric cipher, with a key protected by +a passphrase. + +## Private messaging + +Use [Signal](https://signal.org/) or [Keybase](https://keybase.io/). End-to-end +security is bootstrapped from asymmetric-key encryption. Obtaining your +contacts' public keys is the critical step here. If you want good security, you +need to authenticate public keys out-of-band (with Signal or Keybase), or trust +social proofs (with Keybase). + +## SSH We've covered the use of SSH and SSH keys in an [earlier lecture](/2020/command-line/#remote-machines). Let's look at the cryptography @@ -242,11 +284,7 @@ can allow the client to log in. 
extra topics, if there's time security concepts, tips -- password managers -- 2FA -- full disk encryption - biometrics -- private messaging - HTTPS {% endcomment %} From 30a8b1aee59346e7aba265bb4afa1c32fc2d9894 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 28 Jan 2020 13:32:36 -0500 Subject: [PATCH 227/640] Clarify that hash is deterministic --- _2020/security.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index df09a59d..d167a928 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -73,10 +73,10 @@ $ printf 'hello' | sha1sum aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d ``` -At a high level, a hash function can be thought of as a hard-to-invert random -function (and this is the [ideal model of a hash -function](https://en.wikipedia.org/wiki/Random_oracle)). A hash function has -the following properties: +At a high level, a hash function can be thought of as a hard-to-invert +random-looking (but deterministic) function (and this is the [ideal model of a +hash function](https://en.wikipedia.org/wiki/Random_oracle)). A hash function +has the following properties: - Non-invertible: it is hard to find an input `m` such that `hash(m) = h` for some desired output `h`. From dbf730fe97a31702af04113f0c0ed204cba3ce0d Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 28 Jan 2020 13:39:01 -0500 Subject: [PATCH 228/640] Add resources --- _2020/security.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/_2020/security.md b/_2020/security.md index d167a928..c45bef48 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -288,6 +288,11 @@ security concepts, tips - HTTPS {% endcomment %} +# Resources + +- [Last year's notes](/2019/security/): from when this lecture was more focused on security and privacy as a computer user +- [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): answers "what crypto should I use for X?" for many common X. 
+ # Exercises 1. **Entropy.** From 77daf26fd1a4ce035b29b03a589b3f8a977eeaa4 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 28 Jan 2020 17:11:30 -0500 Subject: [PATCH 229/640] Add basic structure to potpourri --- _2020/potpourri.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index c147bc41..66ae8456 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -3,3 +3,26 @@ layout: lecture title: "Potpourri" date: 2019-01-29 --- + +## Backups (Jose) +## Systemd (Jose) +## FUSE (Jose) +## Keyboard remapping (Jose) + + + + +## APIs (Jon) +## Common command-line flags/patterns (Jon) +## Window managers (Jon) +## VPNs (Jon) +## Markdown (Jon) + + + + +## Hammerspoon (Anish) +## Booting + Live USBs (Anish) +## Docker, Vagrant, VMs, Cloud, OpenStack (Anish) +## Notebook programming (Anish) +## GitHub (Anish) From 68d4419ada624eaf411f59d21b6084a9316008c2 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 29 Jan 2020 09:44:29 -0500 Subject: [PATCH 230/640] jon potpourri --- _2020/potpourri.md | 145 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 145 insertions(+) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 66ae8456..09a64419 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -13,11 +13,156 @@ date: 2019-01-29 ## APIs (Jon) + +We've talked a lot in this class about using your computer more +efficiently to accomplish _local_ tasks, but you will find that many of +these lessons also extend to the wider internet. Most services online +will have "APIs" that let you programmatically access their data. For +example, the US government has an API that lets you get weather +forecasts, which you could use to easily get a weather forecast in your +shell. + +Most of these APIs have a similar format. They are structured URLs, +often rooted at `api.service.com`, where the path and query parameters +indicate what data you want to read or what action you want to perform. 
+For the US weather data, for example, to get the forecast for a +particular location, you issue a GET request (with `curl`, for example) to +https://api.weather.gov/points/42.3604,-71.094. The response itself +contains a bunch of other URLs that let you get specific forecasts for +that region. Usually, the responses are formatted as JSON, which you can +then pipe through a tool like [`jq`](https://stedolan.github.io/jq/) to +massage into what you care about. + +Some APIs require authentication, and this usually takes the form of +some sort of secret _token_ that you need to include with the request. +You should read the documentation for the API to see what the particular +service you are looking for uses, but "[OAuth](https://www.oauth.com/)" +is a protocol you will often see used. At its heart, OAuth is a way to +give you tokens that can "act as you" on a given service, and can only +be used for particular purposes. Keep in mind that these tokens are +_secret_, and anyone who gains access to your token can do whatever the +token allows under _your_ account! + +[IFTTT](https://ifttt.com/) is a website and service centered around the +idea of APIs — it provides integrations with tons of services, and lets +you chain events from them in nearly arbitrary ways. Give it a look! + ## Common command-line flags/patterns (Jon) + +Command-line tools vary a lot, and you will often want to check out +their `man` pages before using them. They often share some common +features though that can be good to be aware of: + + - Most tools support some kind of `--help` flag to display brief usage + instructions for the tool. + - Many tools that can cause irrevocable change support the notion of a + "dry run" in which they only print what they _would have done_, but + do not actually perform the change. Similarly, they often have an + "interactive" flag that will prompt you for each destructive action. 
+ - You can usually use `--version` or `-V` to have the program print its + own version (handy for reporting bugs!). + - Almost all tools have a `--verbose` or `-v` flag to produce more + verbose output. You can usually include the flag multiple times + (`-vvv`) to get _more_ verbose output, which can be handy for + debugging. Similarly, many tools have a `--quiet` flag for making it + only print something on error. + - In many tools, `-` in place of a file name means "standard input" or + "standard output", depending on the argument. + - Possibly destructive tools are generally not recursive by default, + but support a "recursive" flag (often `-r`) to make them recurse. + - If you want to run one program "through" another, like `ssh machine + foo`, it can sometimes be awkward to pass arguments to the + "inner" program (`foo`), as they will be interpreted as arguments to the + "outer" program (`ssh`). The argument `--` makes a program _stop_ + processing flags and options (things starting with `-`) in what + follows: `ssh machine --for-ssh -- foo --for-foo`. + ## Window managers (Jon) + +Most of you are used to using a "drag and drop" window manager, like +what comes with Windows, macOS, and Ubuntu by default. There are windows +that just sort of hang there on screen, and you can drag them around, +resize them, and have them overlap one another. But these are only one +_type_ of window manager, often referred to as a "floating" window +manager. There are many others, especially on Linux. A particularly +common alternative is a "tiling" window manager. In a tiling window +manager, windows never overlap, and are instead arranged as tiles on +your screen, sort of like panes in tmux. With a tiling window manager, +the screen is always filled by whatever windows are open, arranged +according to some _layout_. If you have just one window, it takes up the +full screen. 
If you then open another, the original window shrinks to +make room for it (often something like 2/3 and 1/3). If you open a +third, the other windows will again shrink to accommodate the new +window. Just like with tmux panes, you can navigate around these tiled +windows with your keyboard, and you can resize them and move them +around, all without touching the mouse. They are worth looking into! + ## VPNs (Jon) + +VPNs are all the rage these days, but it's not clear that's for [any +good reason](https://gist.github.com/joepie91/5a9909939e6ce7d09e29). You +should be aware of what a VPN does and does not get you. A VPN, in the +best case, is _really_ just a way for you to change your internet +service provider as far as the internet is concerned. All your traffic +will look like it's coming from the VPN provider instead of your "real" +location, and the network you are connected to will only see encrypted +traffic. + +While that may seem attractive, keep in mind that when you use a VPN, +all you are really doing is shifting your trust from your current ISP to +the VPN hosting company. Whatever your ISP _could_ see, the VPN provider +now sees _instead_. If you trust them _more_ than your ISP, that is a +win, but otherwise, it is not clear that you have gained much. If you +are sitting on some dodgy unencrypted public Wi-Fi at an airport, then +maybe you don't trust the connection much, but at home, the trade-off is +not quite as clear. + +You should also know that these days, much of your traffic, at least of +a sensitive nature, is _already_ encrypted through HTTPS or TLS more +generally. In that case, it usually matters little whether you are on +a "bad" network or not -- the network operator will only learn what +servers you talk to, but not anything about the data that is exchanged. + +Notice that I said "in the best case" above. 
It is not unheard of for +VPN providers to accidentally misconfigure their software such that the +encryption is either weak or entirely disabled. Some VPN providers are +malicious (or at the very least opportunist), and will log all your +traffic, and possibly sell information about it to third parties. +Choosing a bad VPN provider is often worse than not using one in the +first place. + ## Markdown (Jon) +There is a high chance that you will write some text over the course of +your career. And often, you will want to mark up that text in simple +ways. You want some text to be bold or italic, or you want to add +headers, links, and code fragments. Instead of pulling out a heavy tool +like Word or LaTeX, you may want to consider using the lightweight +markup language [Markdown](https://commonmark.org/help/). + +You have probably seen Markdown already, or at least some variant of it. +Subsets of it are used and supported almost everywhere, even if it's not +under the name Markdown. At its core, Markdown is an attempt to codify +the way that people already often mark up text when they are writing +plain text documents. Emphasis (*italics*) is added by surrounding a +word with `*`. Strong emphasis (**bold**) is added using `**`. Lines +starting with `#` are headings (and the number of `#`s is the subheading +level). Any line starting with `-` is a bullet list item, and any line +starting with a number + `.` is a numbered list item. Backtick is used +to show words in `code font`, and a code block can be entered by +indenting a line with four spaces or surrounding it with +triple-backticks: + + ``` + code goes here + ``` + +To add a link, place the _text_ for the link in square brackets, +and the URL immediately following that in parentheses: `[name](url)`. +Markdown is easy to get started with, and you can use it nearly +everywhere. 
In fact, the lecture notes for this lecture, and all the +others, are written in Markdown, and you can see the raw Markdown +[here](https://raw.githubusercontent.com/missing-semester/missing-semester/master/_2020/potpourri.md). From 496029ab5ecbaf2d5567a31e2a0244900a6d2c78 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 29 Jan 2020 12:15:07 -0500 Subject: [PATCH 231/640] Add my content --- _2020/potpourri.md | 116 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 115 insertions(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 09a64419..3468717f 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -166,8 +166,122 @@ others, are written in Markdown, and you can see the raw Markdown -## Hammerspoon (Anish) +## Hammerspoon (desktop automation on macOS) (Anish) + +[Hammerspoon](https://www.hammerspoon.org/) is a desktop automation framework +for macOS. It lets you write Lua scripts that hook into operating system +functionality, allowing you to interact with the keyboard/mouse, windows, +displays, filesystem, and much more. + +Some examples of things you can do with Hammerspoon: + +- Bind hotkeys to move windows to specific locations +- Create a menu bar button that automatically lays out windows in a specific layout +- Mute your speaker when you arrive in lab (by detecting the WiFi network) +- Show you a warning if you've accidentally taken your friend's power supply + +At a high level, Hammerspoon lets you run arbitrary Lua code, bound to menu +buttons, key presses, or events, and Hammerspoon provides an extensive library +for interacting with the system, so there's basically no limit to what you can +do with it. Many people have made their Hammerspoon configurations public, so +you can generally find what you need by searching the internet, but you can +always write your own code from scratch. 
+ +### Resources + +- [Getting Started with Hammerspoon](https://www.hammerspoon.org/go/) +- [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations) +- [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon) + ## Booting + Live USBs (Anish) + +When your machine boots up, before the operating system is loaded, the +[BIOS](https://en.wikipedia.org/wiki/BIOS)/[UEFI](https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface) +initializes the system. During this process, you can press a specific key +combination to configure this layer of software. For example, your computer may +say something like "Press F9 to configure BIOS. Press F12 to enter boot menu." +during the boot process. You can configure all sorts of hardware-related +settings in the BIOS menu. You can also enter the boot menu to boot from an +alternate device instead of your hard drive. + +[Live USBs](https://en.wikipedia.org/wiki/Live_USB) are USB flash drives +containing an operating system. You can create one of these by downloading an +operating system (e.g. a Linux distribution) and burning it to the flash drive. +This process is a little bit more complicated than simply copying a `.iso` file +to the disk. There are tools like [UNetbootin](https://unetbootin.github.io/) +to help you create live USBs. + +Live USBs are useful for all sorts of purposes. Among other things, if you +break your existing operating system installation so that it no longer boots, +you can use a live USB to recover data or fix the operating system. + ## Docker, Vagrant, VMs, Cloud, OpenStack (Anish) + +[Virtual machines](https://en.wikipedia.org/wiki/Virtual_machine) and similar +tools like containers let you emulate a whole computer system, including the +operating system. This can be useful for creating an isolated environment for +testing, development, or exploration (e.g. running potentially malicious code). 
+ +[Vagrant](https://www.vagrantup.com/) is a tool that lets you describe machine +configurations (operating system, services, packages, etc.) in code, and then +instantiate VMs with a simple `vagrant up`. [Docker](https://www.docker.com/) +is conceptually similar but it uses containers instead. + +You can rent virtual machines on the cloud, and it's a nice way to get instant +access to: + +- A cheap always-on machine that has a public IP address, used to host services +- A machine with a lot of CPU, disk, RAM, and/or GPU +- Many more machines than you physically have access to (billing is often by +the second, so if you want a lot of compute for a short amount of time, it's +feasible to rent 1000 computers for a couple minutes) + +Popular services include [Amazon AWS](https://aws.amazon.com/), [Google +Cloud](https://cloud.google.com/), and +[DigitalOcean](https://www.digitalocean.com/). + +If you're a member of MIT CSAIL, you can get free VMs for research purposes +through the [CSAIL OpenStack +instance](https://tig.csail.mit.edu/shared-computing/open-stack/). + ## Notebook programming (Anish) + +[Notebook programming +environments](https://en.wikipedia.org/wiki/Notebook_interface) can be really +handy for doing certain types of interactive or exploratory development. +Perhaps the most popular notebook programming environment today is +[Jupyter](https://jupyter.org/), for Python (and several other languages). +[Wolfram Mathematica](https://www.wolfram.com/mathematica/) is another notebook +programming environment that's great for doing math-oriented programming. + ## GitHub (Anish) + +[GitHub](https://github.com/) is one of the most popular platforms for +open-source software development. Many of the tools we've talked about in this +class, from [vim](https://github.com/vim/vim) to +[Hammerspoon](https://github.com/Hammerspoon/hammerspoon), are hosted on +GitHub. 
It's easy to get started contributing to open-source to help improve +the tools that you use every day. + +There are two primary ways in which people contribute to projects on GitHub: + +- Creating an +[issue](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue). +This can be used to report bugs or request a new feature. Neither of these +involves reading or writing code, so it can be pretty lightweight to do. +High-quality bug reports can be extremely valuable to developers. Commenting on +existing discussions can be helpful too. +- Contribute code through a [pull +request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). +This is generally more involved than creating an issue. You can +[fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) +a repository on GitHub, clone your fork, create a new branch, make some changes +(e.g. fix a bug or implement a feature), push the branch, and then [create a +pull +request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request). +After this, there will generally be some back-and-forth with the project +maintainers, who will give you feedback on your patch. Finally, if all goes +well, your patch will be merged into the upstream repository. Often times, +larger projects will have a contributing guide, tag beginner-friendly issues, +and some even have mentorship programs to help first-time contributors become +familiar with the project. 
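The pull request workflow described above can be sketched in shell. This is only a minimal sketch under stated assumptions: a local bare repository stands in for your fork on GitHub, and the branch name `fix-typo`, the file `README.md`, and the commit identity are hypothetical placeholders. With a real fork you would clone the fork's URL instead of creating the stand-in.

```shell
#!/bin/sh
set -e

# Scratch directory so the sketch does not touch any real repository.
work=$(mktemp -d)

# Stand-in for your fork on GitHub (assumption: with a real fork you
# would skip this step and clone the fork's URL instead).
git init --quiet --bare "$work/fork.git"

# Clone the fork and create a topic branch for the change.
git clone --quiet "$work/fork.git" "$work/clone"
cd "$work/clone"
git checkout --quiet -b fix-typo

# Make a change and commit it (placeholder file and identity).
echo "Fixed text" > README.md
git add README.md
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Fix typo in README"

# Push the branch to the fork; on GitHub you would then open a pull
# request from this branch against the upstream repository.
git push --quiet -u origin fix-typo
```

Working on a separate branch rather than on the default branch keeps the change isolated, so follow-up commits made in response to maintainer feedback can be pushed to the same branch.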
From 3a75ae10246264b3944a0207bd7faa9e656c2368 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 29 Jan 2020 12:17:09 -0500 Subject: [PATCH 232/640] Remove names from lecture notes --- _2020/potpourri.md | 28 ++++++++++++++-------------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 3468717f..76eda3dc 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -4,15 +4,15 @@ title: "Potpourri" date: 2019-01-29 --- -## Backups (Jose) -## Systemd (Jose) -## FUSE (Jose) -## Keyboard remapping (Jose) +## Backups +## Systemd +## FUSE +## Keyboard remapping -## APIs (Jon) +## APIs We've talked a lot in this class about using your computer more efficiently to accomplish _local_ tasks, but you will find that many of @@ -47,7 +47,7 @@ token allows under _your_ account! idea of APIs — it provides integrations with tons of services, and lets you chain events from them in nearly arbitrary ways. Give it a look! -## Common command-line flags/patterns (Jon) +## Common command-line flags/patterns Command-line tools vary a lot, and you will often want to check out their `man` pages before using them. They often share some common @@ -77,7 +77,7 @@ features though that can be good to be aware of: processing flags and options (things starting with `-`) in what follows: `ssh machine --for-ssh -- foo --for-foo`. -## Window managers (Jon) +## Window managers Most of you are used to using a "drag and drop" window manager, like what comes with Windows, macOS, and Ubuntu by default. There are windows @@ -97,7 +97,7 @@ window. Just like with tmux panes, you can navigate around these tiled windows with your keyboard, and you can resize them and move them around, all without touching the mouse. They are worth looking into! -## VPNs (Jon) +## VPNs VPNs are all the rage these days, but it's not clear that's for [any good reason](https://gist.github.com/joepie91/5a9909939e6ce7d09e29). 
You @@ -131,7 +131,7 @@ traffic, and possibly sell information about it to third parties. Choosing a bad VPN provider is often worse than not using one in the first place. -## Markdown (Jon) +## Markdown There is a high chance that you will write some text over the course of your career. And often, you will want to mark up that text in simple @@ -166,7 +166,7 @@ others, are written in Markdown, and you can see the raw Markdown -## Hammerspoon (desktop automation on macOS) (Anish) +## Hammerspoon (desktop automation on macOS) [Hammerspoon](https://www.hammerspoon.org/) is a desktop automation framework for macOS. It lets you write Lua scripts that hook into operating system @@ -193,7 +193,7 @@ always write your own code from scratch. - [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations) - [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon) -## Booting + Live USBs (Anish) +## Booting + Live USBs When your machine boots up, before the operating system is loaded, the [BIOS](https://en.wikipedia.org/wiki/BIOS)/[UEFI](https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface) @@ -215,7 +215,7 @@ Live USBs are useful for all sorts of purposes. Among other things, if you break your existing operating system installation so that it no longer boots, you can use a live USB to recover data or fix the operating system. -## Docker, Vagrant, VMs, Cloud, OpenStack (Anish) +## Docker, Vagrant, VMs, Cloud, OpenStack [Virtual machines](https://en.wikipedia.org/wiki/Virtual_machine) and similar tools like containers let you emulate a whole computer system, including the @@ -244,7 +244,7 @@ If you're a member of MIT CSAIL, you can get free VMs for research purposes through the [CSAIL OpenStack instance](https://tig.csail.mit.edu/shared-computing/open-stack/). 
-## Notebook programming (Anish) +## Notebook programming [Notebook programming environments](https://en.wikipedia.org/wiki/Notebook_interface) can be really @@ -254,7 +254,7 @@ Perhaps the most popular notebook programming environment today is [Wolfram Mathematica](https://www.wolfram.com/mathematica/) is another notebook programming environment that's great for doing math-oriented programming. -## GitHub (Anish) +## GitHub [GitHub](https://github.com/) is one of the most popular platforms for open-source software development. Many of the tools we've talked about in this From b00d9dc242af769a0a1974c80b5a5aa51b98afa2 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 29 Jan 2020 12:58:32 -0500 Subject: [PATCH 233/640] adjustments --- _2020/potpourri.md | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 76eda3dc..14c2da68 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -70,12 +70,14 @@ features though that can be good to be aware of: "standard output", depending on the argument. - Possibly destructive tools are generally not recursive by default, but support a "recursive" flag (often `-r`) to make them recurse. - - If you want to run one program "through" another, like `ssh machine - foo`, it can sometimes be awkward to pass arguments to the - "inner" program (`foo`), as they will be interpreted as arguments to the - "outer" program (`ssh`). The argument `--` makes a program _stop_ - processing flags and options (things starting with `-`) in what - follows: `ssh machine --for-ssh -- foo --for-foo`. + - Sometimes, you want to pass something that _looks_ like a flag as a + normal argument. For example, imagine you wanted to remove a file + called `-r`. Or you want to run one program "through" another, like + `ssh machine foo`, and you want to pass a flag to the "inner" program + (`foo`). 
The special argument `--` makes a program _stop_ processing + flags and options (things starting with `-`) in what follows, letting + you pass things that look like flags without them being interpreted + as such: `rm -- -r` or `ssh machine --for-ssh -- foo --for-foo`. ## Window managers @@ -97,6 +99,7 @@ window. Just like with tmux panes, you can navigate around these tiled windows with your keyboard, and you can resize them and move them around, all without touching the mouse. They are worth looking into! + ## VPNs VPNs are all the rage these days, but it's not clear that's for [any @@ -131,6 +134,10 @@ traffic, and possibly sell information about it to third parties. Choosing a bad VPN provider is often worse than not using one in the first place. +In a pinch, MIT [runs a VPN](https://ist.mit.edu/vpn) for its students, +so that may be worth taking a look at. Also, if you're going to roll +your own, give [WireGuard](https://www.wireguard.com/) a look. + ## Markdown There is a high chance that you will write some text over the course of From 5b2fc688eb7045c8e33e7a1ab74f994c13e7b464 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 29 Jan 2020 13:12:17 -0500 Subject: [PATCH 234/640] Jose's Potpourri --- _2020/potpourri.md | 105 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 102 insertions(+), 3 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 14c2da68..c5ef4d2e 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -2,14 +2,113 @@ layout: lecture title: "Potpourri" date: 2019-01-29 +ready: true --- -## Backups -## Systemd -## FUSE + ## Keyboard remapping +As a programmer, your keyboard is your main input method. As with pretty much anything in your computer, it is configurable (and worth configuring). + +The most basic change is to remap keys. 
+This usually involves some software that listens for key events and, whenever a certain key is pressed, intercepts that event and replaces it with another event corresponding to a different key. Some examples: +- Remap Caps Lock to Ctrl or Escape. We (the instructors) highly encourage this setting since Caps Lock has a very convenient location but is rarely used. +- Remap PrtSc to Play/Pause music. Most OSes have a play/pause key. +- Swap Ctrl and the Meta (Windows or Command) key. + +You can also map keys to arbitrary commands of your choosing. This is useful for common tasks that you perform. Here, some software listens for a specific key combination and executes some script whenever that event is detected. +- Open a new terminal or browser window. +- Insert some specific text, e.g. your long email address or your MIT ID number. +- Sleep the computer or the displays. + +There are even more complex modifications you can configure: +- Remapping sequences of keys, e.g. pressing shift five times toggles Caps Lock. +- Remapping on tap vs on hold, e.g. Caps Lock key is remapped to Esc if you quickly tap it, but is remapped to Ctrl if you hold it and use it as a modifier. +- Making remaps keyboard or software specific. + +Some software resources to get started on the topic: +- macOS - [karabiner-elements](https://pqrs.org/osx/karabiner/), [skhd](https://github.com/koekeishiya/skhd) or [BetterTouchTool](https://folivora.ai/) +- Linux - [xmodmap](https://wiki.archlinux.org/index.php/Xmodmap) or [Autokey](https://github.com/autokey/autokey) +- Windows - Built into Control Panel, [AutoHotkey](https://www.autohotkey.com/) or [SharpKeys](https://www.randyrants.com/category/sharpkeys/) +- QMK - If your keyboard supports custom firmware you can use [QMK](https://docs.qmk.fm/) to configure the hardware device itself so the remaps work on any machine you use the keyboard with.
+ +## Daemons + +You are probably already familiar with the notion of daemons, even if the word seems new. +Most computers have a series of processes that are always running in the background rather than waiting for a user to launch them and interact with them. +These processes are called daemons and the programs that run as daemons often end with a `d` to indicate so. +For example `sshd`, the SSH daemon, is the program responsible for listening to incoming SSH requests and checking that the remote user has the necessary credentials to log in. + +In Linux, `systemd` (the system daemon) is the most common solution for running and setting up daemon processes. +You can run `systemctl status` to list the current running daemons. Most of them might sound unfamiliar but are responsible for core parts of the system such as managing the network, solving DNS queries or displaying the graphical interface for the system. +Systemd can be interacted with the `systemctl` command in order to `enable`, `disable`, `start`, `stop`, `restart` or check the `status` of services (those are the `systemctl` commands). + +More interestingly, `systemd` has a fairly accessible interface for configuring and enabling new daemons (or services). +Below is an example of a daemon for running a simple Python app. +We won't go in the details but as you can see most of the fields are pretty self explanatory. + +```ini +# /etc/systemd/system/myapp.service +[Unit] +Description=My Custom App +After=network.target + +[Service] +User=foo +Group=foo +WorkingDirectory=/home/foo/projects/mydaemon +ExecStart=/usr/local/bin/python3.7 app.py +Restart=on-failure + +[Install] +WantedBy=multi-user.target +``` + +Also, if you just want to run some program with a given frequency there is no need to build a custom daemon, you can use [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html), a daemon you system already runs to perform scheduled tasks.
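
As a rough illustration of the `cron` route, the sketch below builds a single crontab entry and prints it for review. The script path `/home/foo/backup.sh` and the 03:00 schedule are hypothetical placeholders, not something from the lecture; the install step is left commented out so the snippet has no side effects.

```shell
# Hypothetical job: run /home/foo/backup.sh every day at 03:00.
# Crontab fields: minute hour day-of-month month day-of-week command
cron_line='0 3 * * * /home/foo/backup.sh'

# Print the entry so it can be reviewed before installing it.
printf '%s\n' "$cron_line"

# To install it, append the entry to the current user's crontab.
# Left commented out on purpose:
# (crontab -l 2>/dev/null; printf '%s\n' "$cron_line") | crontab -
```

Running `crontab -l` afterwards should list the installed entries; `crontab -e` opens them in your editor instead.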
+ +## FUSE + +Modern software systems are usually built from smaller building blocks that are composed together. +Your operating system supports using different filesystem backends because there is a common language of what operations a filesystem supports. +For instance, when you run `touch` to create a file, `touch` performs a system call to the kernel to create the file and the kernel performs the appropriate filesystem call to create the given file. +A caveat is that UNIX filesystems are traditionally implemented as kernel modules and only the kernel is allowed to perform filesystem calls. + +[FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace) (Filesystem in User Space) allows filesystems to be implemented by a user program. FUSE lets users run user space code for filesystem calls and then bridges the necessary calls to the kernel interfaces. +In practice, this means that users can implement arbitrary functionality for filesystem calls. + +For example, FUSE can be used so whenever you perform an operation in a virtual filesystem, that operation is forwarded through SSH to a remote machine, performed there, and the output is returned to you. +This way, local programs can see the file as if it were on your computer while in reality it's on a remote server. +This is effectively what `sshfs` does. + +Some interesting examples of FUSE filesystems are: +- [sshfs](https://github.com/libfuse/sshfs) - Open remote files/folders locally through an SSH connection. +- [rclone](https://rclone.org/commands/rclone_mount/) - Mount cloud storage services like Dropbox, GDrive, Amazon S3 or Google Cloud Storage and open data locally. +- [gocryptfs](https://nuetzlich.net/gocryptfs/) - Encrypted overlay system. Files are stored encrypted but once the FS is mounted they appear as plaintext in the mountpoint. +- [kbfs](https://keybase.io/docs/kbfs) - Distributed filesystem with end-to-end encryption. You can have private, shared and public folders.
+- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html) - Mount your deduplicated, compressed and encrypted backups for ease of browsing. + +## Backups + +Any data that you haven’t backed up is data that could be gone at any moment, forever. +It's easy to copy data around; it's hard to reliably back up data. +Here are some good backup basics and the pitfalls of some approaches. + +First, a copy of the data on the same disk is not a backup, because the disk is the single point of failure for all the data. Similarly, an external drive in your home is also a weak backup solution since it could be lost in a fire/robbery/&c. Instead, having an off-site backup is a recommended practice. + +Synchronization solutions are not backups. For instance, Dropbox/GDrive are convenient solutions, but when data is erased or corrupted they propagate the change. For the same reason, disk mirroring solutions like RAID are not backups. They don't help if data gets deleted, corrupted or encrypted by ransomware. + +Some core features of good backup solutions are versioning, deduplication and security. +Versioning ensures that you can access your history of changes and efficiently recover files. +Efficient backup solutions use data deduplication to only store incremental changes and reduce the storage overhead. +Regarding security, you should ask yourself what someone would need to know/have in order to read your data and, more importantly, to delete all your data and associated backups. +Lastly, blindly trusting backups is a terrible idea and you should verify regularly that you can use them to recover data. + +Backups go beyond local files on your computer. +Given the significant growth of web applications, large amounts of your data are only stored in the cloud. +For instance, your webmail, social media photos, music playlists in streaming services or online docs are gone if you lose access to the corresponding accounts.
+Having an offline copy of this information is the way to go, and you can find online tools that people have built to fetch the data and save it. +For a more detailed explanation, see 2019's lecture notes on [Backups](/2019/backups). ## APIs From 81230f168ebe00ea4eb70972671b4f44d2341167 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 29 Jan 2020 16:29:46 -0500 Subject: [PATCH 235/640] Add TOC for Potpourri --- _2020/potpourri.md | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index c5ef4d2e..f9cb6607 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -5,6 +5,22 @@ date: 2019-01-29 ready: true --- +## Table of Contents + +- [Keyboard remapping](#keyboard-remapping) +- [Daemons](#daemons) +- [FUSE](#fuse) +- [Backups](#backups) +- [APIs](#apis) +- [Common command-line flags/patterns](#common-command-line-flagspatterns) +- [Window managers](#window-managers) +- [VPNs](#vpns) +- [Markdown](#markdown) +- [Hammerspoon(desktop-automation-on-macOS)](#hammerspoon-desktop-automation-on-macos) +- [Booting + Live USBs](#booting--live-usbs) +- [Docker, Vagrant, VMs, Cloud, OpenStack](#docker-vagrant-vms-cloud-openstack) +- [Notebook programming](#notebook-programming) +- [GitHub](#github) ## Keyboard remapping @@ -42,7 +58,7 @@ For example `sshd`, the SSH daemon, is the program responsible for listening to In Linux, `systemd` (the system daemon) is the most common solution for running and setting up daemon processes. You can run `systemctl status` to list the current running daemons. Most of them might sound unfamiliar but are responsible for core parts of the system such as managing the network, solving DNS queries or displaying the graphical interface for the system. Systemd can be interacted with the `systemctl` command in order to `enable`, `disable`, `start`, `stop`, `restart` or check the `status` of services (those are the `systemctl` commands). 
- +1 More interestingly, `systemd` has a fairly accessible interface for configuring and enabling new daemons (or services). Below is an example of a daemon for running a simple Python app. We won't go in the details but as you can see most of the fields are pretty self explanatory. From 2cb28f2d0722cecfa1168783a556cef8702c9a91 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Wed, 29 Jan 2020 16:45:34 -0500 Subject: [PATCH 236/640] Nicer page break points --- static/css/main.css | 2 ++ 1 file changed, 2 insertions(+) diff --git a/static/css/main.css b/static/css/main.css index bb61e4b6..9dabb63e 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -365,6 +365,8 @@ input[type=checkbox]:checked ~ .menu-label:after { body { background: none; } #content { max-width: none; } h1.title { text-align: center; } + h1, h2, h3, h4, h5, h6 { break-after: avoid-page; page-break-after: avoid; } #content hr:last-of-type { display: none; } + #content pre { break-inside: avoid-page; page-break-inside: avoid; } #content div.small:last-of-type { display: none; } } From 43f0fbae10a9a703f56ed8f494ceb646d64619af Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Wed, 29 Jan 2020 19:48:38 -0500 Subject: [PATCH 237/640] Remove stray 1 --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index f9cb6607..db86cdb4 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -58,7 +58,7 @@ For example `sshd`, the SSH daemon, is the program responsible for listening to In Linux, `systemd` (the system daemon) is the most common solution for running and setting up daemon processes. You can run `systemctl status` to list the current running daemons. Most of them might sound unfamiliar but are responsible for core parts of the system such as managing the network, solving DNS queries or displaying the graphical interface for the system. 
Systemd can be interacted with the `systemctl` command in order to `enable`, `disable`, `start`, `stop`, `restart` or check the `status` of services (those are the `systemctl` commands). -1 + More interestingly, `systemd` has a fairly accessible interface for configuring and enabling new daemons (or services). Below is an example of a daemon for running a simple Python app. We won't go in the details but as you can see most of the fields are pretty self explanatory. From 2a922ae7cc930ed82afc11c5fbb26fc00a79b6be Mon Sep 17 00:00:00 2001 From: Martin Plattner Date: Sat, 1 Feb 2020 22:59:56 +0100 Subject: [PATCH 238/640] Update potpourri.md --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index db86cdb4..b48f6838 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -80,7 +80,7 @@ Restart=on-failure WantedBy=multi-user.target ``` -Also, if you just want to run some program with a given frequency there is no need to build a custom daemon, you can use [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html), a daemon you system already runs to perform scheduled tasks. +Also, if you just want to run some program with a given frequency there is no need to build a custom daemon, you can use [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html), a daemon your system already runs to perform scheduled tasks. 
## FUSE From 775dfb14e3b88c7824f8bdd74e91634b7ee57fde Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Sat, 1 Feb 2020 22:39:11 -0500 Subject: [PATCH 239/640] New videos for all lectures --- _2020/command-line.md | 2 +- _2020/course-shell.md | 2 +- _2020/data-wrangling.md | 2 +- _2020/debugging-profiling.md | 2 +- _2020/editors.md | 2 +- _2020/metaprogramming.md | 2 +- _2020/security.md | 3 +++ _2020/shell-tools.md | 2 +- _2020/version-control.md | 3 +++ 9 files changed, 13 insertions(+), 7 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 308a3255..5fff8e39 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -5,7 +5,7 @@ date: 2019-01-21 ready: true video: aspect: 56.25 - id: MpJPHy4kUEs + id: e8BO_dYxk5c --- In this lecture we will go through several ways in which you can improve your workflow when using the shell. We have been working with the shell for a while now, but we have mainly focused on executing different commands. We will now see how to run several processes at the same time while keeping track of them, how to stop or pause a specific process and how to make a process run in the background. 
diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 48ef4f78..c401afcb 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -5,7 +5,7 @@ date: 2019-01-13 ready: true video: aspect: 56.25 - id: Yh-iV6Vn5W4 + id: Z56Jmr9Z34Q --- {% comment %} diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 71043ba9..0ccbe0ae 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -5,7 +5,7 @@ date: 2019-01-16 ready: true video: aspect: 56.25 - id: QQiUPFvIMt8 + id: sz_dsktIjt4 --- {% comment %} diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 37b4067c..7f416c29 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -5,7 +5,7 @@ date: 2019-01-23 ready: true video: aspect: 56.25 - id: rrO9whcNMC8 + id: l812pUnKxME --- A golden rule in programming is that code does not do what you expect it to do, but what you tell it to do. diff --git a/_2020/editors.md b/_2020/editors.md index cac6585d..5f782a73 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -5,7 +5,7 @@ date: 2019-01-15 ready: true video: aspect: 56.25 - id: BE-xaxvDEpo + id: a6Q8Na575qc --- Writing English words and writing code are very different activities. 
When diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 00794761..5c265fcb 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -6,7 +6,7 @@ date: 2019-01-27 ready: true video: aspect: 56.25 - id: kderh1XA30Q + id: _Ms1Z4xfqv4 --- {% comment %} diff --git a/_2020/security.md b/_2020/security.md index c45bef48..2d46840d 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -3,6 +3,9 @@ layout: lecture title: "Security and Cryptography" date: 2019-01-28 ready: true +video: + aspect: 56.25 + id: tjwobAmnKTo --- Last year's [security and privacy lecture](/2019/security/) focused on how you diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index f82e7a85..392e36d2 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -5,7 +5,7 @@ date: 2019-01-14 ready: true video: aspect: 56.25 - id: 2APJRjhBiYc + id: kgII-YWo3Zw --- In this lecture we will present some of the basics of using bash as a scripting language along with a number of shell tools that cover several of the most common tasks that you will be constantly performing in the command line. 
diff --git a/_2020/version-control.md b/_2020/version-control.md index 48f81404..02c36928 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -3,6 +3,9 @@ layout: lecture title: "Version Control (Git)" date: 2019-01-22 ready: true +video: + aspect: 56.25 + id: 2sjqTHE0zok --- Version control systems (VCSs) are tools used to track changes to source code From dd84ee4d5de1cfdcc32b70dca29c279e220c740d Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Sun, 2 Feb 2020 14:19:25 -0500 Subject: [PATCH 240/640] Publish potpourri video --- _2020/potpourri.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index b48f6838..f45edbdf 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -3,6 +3,9 @@ layout: lecture title: "Potpourri" date: 2019-01-29 ready: true +video: + aspect: 56.25 + id: JZDt-PRq0uo --- ## Table of Contents From 789d2f89f6c99462e9a6b568686551f0b7ffa155 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Sun, 2 Feb 2020 16:23:00 -0500 Subject: [PATCH 241/640] Add Q&A video --- _2020/qa.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/_2020/qa.md b/_2020/qa.md index c2baca15..127d5a69 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -2,4 +2,7 @@ layout: lecture title: "Q&A" date: 2019-01-30 +video: + aspect: 56.25 + id: Wz50FvGG6xU --- From 752e88ae7ffa20e3244016d312fee4c8b346012c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sun, 2 Feb 2020 21:34:24 -0500 Subject: [PATCH 242/640] Fix "video coming soon" on /about/ page With this patch, if there's no mention of video, nothing is shown. If the video has an ID, then the video is shown. Otherwise, if there's something like `video: false`, then the notice about the pending video will be shown. --- _layouts/lecture.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_layouts/lecture.html b/_layouts/lecture.html index 5b591bcf..486b2c7c 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -4,11 +4,11 @@

{{ page.title }}{% if page.subtitle %} {{ page.subtitle }}{% endif %}

-{% if page.video %} +{% if page.video.id %}
-{% else %} +{% elsif page.video %}

Lecture video coming soon!

{% endif %} From 07a7f47cdda689d91ff345717af8313847fdc3e1 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 3 Feb 2020 02:05:35 -0500 Subject: [PATCH 243/640] Add notes for Q&A lecture --- _2020/qa.md | 175 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 175 insertions(+) diff --git a/_2020/qa.md b/_2020/qa.md index 127d5a69..27e636bf 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -2,7 +2,182 @@ layout: lecture title: "Q&A" date: 2019-01-30 +ready: true video: aspect: 56.25 id: Wz50FvGG6xU --- + +As the last part of this lecture series, this section focused on answering questions that students from this class submitted. +Here we include a summary to what we answered for each question. + +- [Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc ](#any-recommendations-on-learning-operating-systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc) +- [What are some of the tools you'd prioritize learning first?](#what-are-some-of-the-tools-youd-prioritize-learning-first) +- [When do I use Python versus a Bash scripts versus some other language?](#when-do-i-use-python-versus-a-bash-scripts-versus-some-other-language) +- [What is the difference between `source script.sh` and `./script.sh`](#what-is-the-difference-between-source-scriptsh-and-scriptsh) +- [What are the places where various packages and tools are stored and how does referencing them work? 
What even is `/bin` or `/lib`?](#what-are-the-places-where-various-packages-and-tools-are-stored-and-how-does-referencing-them-work-what-even-is-bin-or-lib) +- [Should I `apt-get install` a python-whatever, or `pip install` whatever package?](#should-i-apt-get-install-a-python-whatever-or-pip-install-whatever-package) +- [What's the easiest and best profiling tools to use to improve performance of my code?](#whats-the-easiest-and-best-profiling-tools-to-use-to-improve-performance-of-my-code) +- [What browser plugins do you use?](#what-browser-plugins-do-you-use) +- [What are other useful data wrangling tools?](#what-are-other-useful-data-wrangling-tools) +- [What is the difference between Docker and a Virtual Machine?](#what-is-the-difference-between-docker-and-a-virtual-machine) +- [What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)?](#what-are-the-advantages-and-disadvantages-of-each-os-and-how-can-we-choose-between-them-eg-choosing-the-best-linux-distribution-for-our-purposes) +- [Vim vs Emacs?](#vim-vs-emacs) +- [Any tips or tricks for Machine Learning applications?](#any-tips-or-tricks-for-machine-learning-applications) +- [Any more Vim tips?](#any-more-vim-tips) +- [What is 2FA and why should I use it?](#what-is-2fa-and-why-should-i-use-it) +- [Any comments on differences between web browsers?](#any-comments-on-differences-between-web-browsers) + +## Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc + +First, it is unclear whether you actually need to be very familiar with all of these topics since they are very low level. +They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will not be relevant, with the exception of processes and signals that were briefly covered in other lectures.
+ +Some good resources to learn about this topic: + +- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/2019/schedule.html) - Graduate level class on Operating System Engineering. Class materials are publicly accessible. +- Modern Operating Systems (4th ed) by Andrew S. Tanenbaum - A good overview of many of the mentioned concepts. +- The Design and Implementation of the FreeBSD Operating System - A good resource about the FreeBSD OS (note that this is not Linux). +- Other guides like [Writing an OS in Rust](https://os.phil-opp.com/) where people implement a kernel step by step in various languages, mostly for teaching purposes. + + +## What are some of the tools you'd prioritize learning first? + +Some topics worth prioritizing: + +- Learning how to use your keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, &c. +- Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well. +- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous. +- Learning about version control tools like Git and how to use them in conjunction with GitHub to collaborate in modern software projects. + +## When do I use Python versus a Bash scripts versus some other language? + +In general, bash scripts are useful for short and simple one-off scripts when you just want to run a specific series of commands. bash has a set of oddities that make it hard to work with for larger programs or scripts: + +- bash is easy to get right for a simple use case but it can be really hard to get right for all possible inputs. For example, spaces in script arguments have led to countless bugs in bash scripts. +- bash is not very amenable to code reuse so it can be hard to compose previous programs that you might have written. More generally, there is no concept of software libraries in bash.
+- bash relies on many magic strings like `$?` or `$@` to refer to specific values, whereas other languages refer to them explicitly, like `exitCode` or `sys.argv` respectively. + +Therefore, for larger and/or more complex scripts we recommend using more mature scripting languages like Python or Ruby. +You can find countless libraries online that people have already written to solve common problems in these languages. +If you find a library that implements the specific functionality you care about in some language, usually the best thing to do is to just use that language. + +## What is the difference between `source script.sh` and `./script.sh` + +In both cases `script.sh` will be read and executed in a bash session; the difference lies in which session is running the commands. +For `source` the commands are executed in your current bash session and thus any changes made to the current environment, like changing directories or defining functions, will persist in the current session once the `source` command finishes executing. +When running the script standalone like `./script.sh`, your current bash session starts a new instance of bash which will run the commands in `script.sh`. +Thus, if `script.sh` changes directories, the new bash instance will change directories but once it exits and returns control to the parent bash session, the parent session will remain in the same place. +Similarly, if `script.sh` defines a function that you want to access in your terminal, you need to `source` it for it to be defined in your current bash session. Otherwise, if you run it, the new bash process will be the one to process the function definition instead of your current shell. + +## What are the places where various packages and tools are stored and how does referencing them work? What even is `/bin` or `/lib`?
+ +Programs that you execute in your terminal are all found in the directories listed in your `PATH` environment variable, and you can use the `which` command (or the `type` command) to check where your shell is finding a specific program. +In general, there are some conventions about where programs live. Here are some of the ones we talked about; check the [Filesystem Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. + +- `/bin` - Essential command binaries. +- `/dev` - Device files, special files that often are interfaces to hardware devices +- `/etc` - Host-specific system-wide configuration files +- `/home` - Home directories for users in the system +- `/lib` - Common libraries for system programs +- `/opt` - Optional application software +- `/sys` - Covered in the first lecture, contains information and configuration for the system +- `/tmp` - Temporary files (also `/var/tmp`). Usually deleted between reboots. +- `/usr/` - Read only user data + + `/usr/bin` - Non-essential command binaries + + `/usr/sbin` - Non-essential system binaries, often only supposed to be run by root + + `/usr/local/bin` - Binaries for user compiled programs +- `/var` - Variable files like logs or caches + +## Should I `apt-get install` a python-whatever, or `pip install` whatever package? + +There's no universal answer to this question, but it revolves around the more general question of whether you should use your system's package manager to install things or a language specific package manager. A few things to take into account: + +- Common packages will be available through both, but less popular ones or more recent ones might not be available in your system package manager. In this case, using the language specific manager is the better choice. +- Similarly, language specific package managers usually have more up to date versions of packages than system package managers.
+- When using your system package manager, libraries will be installed system wide. This means that if you need different versions of a library for development purposes, the system package manager might not suffice. For this scenario, most programming languages provide some sort of isolated or virtual environment so you can install different versions of libraries without running into conflicts. For Python there's virtualenv, and for Ruby there's RVM. +- Depending on the operating system and the hardware architecture, some of these packages might come with binaries or might need to be compiled. For instance, on ARM computers like the Raspberry Pi, using the system package manager can be better than the language specific one if the former ships prebuilt binaries and the latter needs to compile the package. This is highly dependent on the specific setup so you should check. + +You should try to use one solution or the other and not both since that can lead to hard-to-debug conflicts. Our recommendation is to use the language specific package manager whenever possible, and to use isolated environments (like Python's virtualenv) to avoid polluting the global environment. + +## What's the easiest and best profiling tools to use to improve performance of my code? + +The easiest tool that is quite useful for profiling purposes is [print timing](/2020/debugging-profiling/#timing). +You just manually compute the time taken between different parts of your code. By repeatedly doing this, you can effectively do a binary search over your code and find the segment of code that took the longest. + +For more advanced tools, Valgrind's [Callgrind](http://valgrind.org/docs/manual/cl-manual.html) lets you run your program and measure how long everything takes and all the call stacks, namely which function called which other function. It then produces an annotated version of your program's source code with the time taken per line.
However, it slows down your program by an order of magnitude and does not support threads. For other cases, the [`perf`](http://www.brendangregg.com/perf.html) tool and other language specific sampling profilers can output useful data pretty quickly. [Flamegraphs](http://www.brendangregg.com/flamegraphs.html) are a good visualization tool for the output of said sampling profilers. You should also try to use specific tools for the programming language or task you are working with. E.g. for web development, the dev tools built into Chrome and Firefox have fantastic profilers. + +Sometimes your code is slow because your system is waiting for an event like a disk read or a network packet. In those cases, it is worth checking that back-of-the-envelope calculations of the theoretical speed, based on hardware capabilities, do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like [eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) that perform kernel tracing of user programs. In particular [`bpftrace`](https://github.com/iovisor/bpftrace) is worth checking out if you need to perform this sort of low level profiling. + + +## What browser plugins do you use? + +Some of our favorites, mostly related to security and usability: + +- [uBlock Origin](https://github.com/gorhill/uBlock) - It is a [wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode) blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also covers inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to [medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode) or even [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode).
Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. Otherwise, the [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode) is already a good default that blocks most ads and tracking. You can also define your own rules about what website objects to block. +- [Stylus](https://github.com/openstyles/stylus/) - a fork of Stylish (don't use Stylish, it was shown to steal users' browsing history), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published on [userstyles.org](https://userstyles.org/). Most common websites have one or several dark theme stylesheets, for instance. +- Full Page Screen Capture - Built into Firefox and available as a [Chrome](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en) extension. Lets you take a screenshot of a full website, often much better than printing for reference purposes. +- [Multi Account Containers](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - lets you separate cookies into "containers", allowing you to browse the web with different identities and/or ensuring that websites are unable to share information between them. +- Password Manager Integration - Most password managers have browser extensions that make inputting your credentials into websites not only more convenient but also more secure. Compared to simply copy-pasting your username and password, these tools will first check that the website domain matches the one listed for the entry, preventing phishing attacks that recreate popular websites to steal credentials. 
+ +## What are other useful data wrangling tools? + +Some of the data wrangling tools we did not have time to cover during the data wrangling lecture include `jq` and `pup`, which are specialized parsers for JSON and HTML data respectively. The Perl programming language is another good tool for more advanced data wrangling pipelines. Another trick is the `column -t` command, which can be used to convert whitespace-separated text (not necessarily aligned) into properly column-aligned text. + +More generally, a couple of more unconventional data wrangling tools are vim and Python. For some complex and multi-line transformations, vim macros can be an invaluable tool. You can just record a series of actions and repeat them as many times as you want; for instance, in the editors [lecture notes](https://missing.csail.mit.edu/2020/editors/#macros) there is an example of converting an XML-formatted file into JSON just using vim macros. + +For tabular data, often presented in CSVs, the [pandas](https://pandas.pydata.org/) Python library is a great tool. Not only does it make it quite easy to define complex operations like group by, join or filters, but it also makes it quite easy to plot different properties of your data. It also supports exporting to many table formats including XLS, HTML or LaTeX. Alternatively, the R programming language (an arguably [bad](http://arrgh.tim-smith.us/) programming language) has lots of functionality for computing statistics over data and can be quite useful as the last step of your pipeline. [ggplot2](https://ggplot2.tidyverse.org/) is a great plotting library in R. + +## What is the difference between Docker and a Virtual Machine? + +Docker uses a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. 
Unlike VMs, containers avoid running another instance of the kernel and just share the kernel with the host. In Linux this is achieved through a mechanism called LXC, and it makes use of a series of isolation mechanisms to spin up a program that thinks it's running on its own hardware but is actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. +On the flip side, containers have weaker isolation and only work if the host runs the same kernel. For instance, if you run Docker on macOS, Docker needs to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is a specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks: for example, by default Docker containers will not persist any form of storage between reboots. + +## What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)? + +Regarding Linux distros, even though there are many, many distros, most of them will behave fairly identically for most use cases. +Most Linux and UNIX features and inner workings can be learned in any distro. +A fundamental difference between distros is how they deal with package updates. +Some distros, like Arch Linux, use a rolling update policy where things are bleeding edge but might break every so often. On the other hand, some distros like Debian, CentOS or Ubuntu LTS releases are much more conservative with releasing updates in their repositories, so things are usually more stable at the expense of sacrificing newer features. +Our recommendation for an easy experience with both desktops and servers is to use Debian or Ubuntu. + +Mac OS is a good middle point between Windows and Linux that has a nicely polished interface. 
However, Mac OS is based on BSD rather than Linux, so some parts of the system and commands are different. +An alternative worth checking is FreeBSD. Even though some programs will not run on FreeBSD, the BSD ecosystem is much less fragmented and better documented than Linux is. +We discourage Windows for anything other than developing Windows applications, unless there is some deal-breaker feature that you need, like good driver support for gaming. + +For dual boot systems, we think that the implementation that works best is macOS' bootcamp and that any other combination can be problematic in the long run, especially if you combine it with other features like disk encryption. + +## Vim vs Emacs? + +The three of us use vim as our primary editor but Emacs is also a good alternative and it's worth trying both to see which works better for you. Emacs does not follow vim's modal editing, but this can be enabled through Emacs plugins like [Evil](https://github.com/emacs-evil/evil) or [Doom Emacs](https://github.com/hlissner/doom-emacs). +An advantage of using Emacs is that it is implemented in Lisp, a better scripting language than vimscript. Thus, Emacs plugins are sometimes excellent. + +## Any tips or tricks for Machine Learning applications? + +Ignoring ML specific advice, some of the lessons and takeaways from this class can directly be applied to ML applications. +As is the case with many scientific disciplines, in ML you often perform a series of experiments and want to check what things worked and what didn't. +One can use shell tools to easily and quickly search through these experiments and aggregate the results in a sensible way. This could mean subselecting all experiments in a given time frame or that use a specific dataset. By using a simple JSON file to log all relevant parameters of the experiments, this can be incredibly simple with the tools we covered in this class. 
+Lastly, if you do not work with some sort of cluster where you submit your GPU jobs, you should look into how to automate this process since it can be quite a time-consuming task that also eats away at your mental energy. + +## Any more Vim tips? + +A few more tips: + +- Plugins - Take your time and explore the plugin landscape. There are a lot of great plugins that address some of vim's shortcomings or add new functionality that composes well with existing vim workflows. For this, good resources are [VimAwesome](https://vimawesome.com/) and other programmers' dotfiles. +- Marks - In vim, you can set a mark doing `m<X>` for some letter `X`. You can then go back to that mark doing `'<X>`. This lets you quickly navigate to specific locations within a file or even across files. +- Navigation - `Ctrl+O` and `Ctrl+I` move you backward and forward respectively through your recently visited locations. +- Undo Tree - Vim has a quite fancy mechanism for keeping track of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins expose this tree in a graphical way. +- Undo with time - The `:earlier` and `:later` commands will let you navigate the files using time references instead of one change at a time. +- [Persistent undo](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - An amazing built-in feature of vim that is disabled by default is persisting undo history between vim invocations. By setting `undofile` and `undodir` in your `.vimrc`, vim will store a per-file history of changes. +- Leader Key - The leader key is a special key that is often left to the user to be configured for custom commands. The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. 
Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses `<Leader> U` to open the undo tree. +- Advanced Text Objects - Text objects like searches can also be composed with vim commands. E.g. `d/<pattern>` will delete to the next match of said pattern or `cgn` will change the next occurrence of the last searched string. + +## What is 2FA and why should I use it? + +Two Factor Authentication (2FA) adds an extra layer of protection to your accounts on top of passwords. In order to log in, you not only have to know some password, you also have to "prove" in some way that you have access to some hardware device. In the simplest case, this can be achieved by receiving an SMS on your phone, although there are known issues with SMS 2FA. A better alternative we endorse is to use a U2F solution like, for example, YubiKeys. + +## Any comments on differences between web browsers? + +The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (Blink). This means that Microsoft Edge, which is also based on Blink, and Safari, which is based on WebKit (the engine Blink was forked from), are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels for privacy reasons. +Another browser called [Flow](https://www.ekioh.com/flow-browser/) is not user ready yet, but it is implementing a new rendering engine that promises to be faster than the current ones. 
From 8720f5cad5fa1898d920510de75820866cea3605 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 3 Feb 2020 02:12:13 -0500 Subject: [PATCH 244/640] Add subtitles for lectures 5,7,11 --- static/files/subtitles/2020/command-line.sbv | 2846 ++++++++++++++++ .../subtitles/2020/debugging-profiling.sbv | 2573 +++++++++++++++ static/files/subtitles/2020/qa.sbv | 2874 +++++++++++++++++ 3 files changed, 8293 insertions(+) create mode 100644 static/files/subtitles/2020/command-line.sbv create mode 100644 static/files/subtitles/2020/debugging-profiling.sbv create mode 100644 static/files/subtitles/2020/qa.sbv diff --git a/static/files/subtitles/2020/command-line.sbv b/static/files/subtitles/2020/command-line.sbv new file mode 100644 index 00000000..bb3e0469 --- /dev/null +++ b/static/files/subtitles/2020/command-line.sbv @@ -0,0 +1,2846 @@ +0:00:00.480,0:00:02.480 +Okay, can everyone hear me okay? + +0:00:03.720,0:00:06.160 +Okay, so welcome back. + +0:00:06.160,0:00:10.320 +I'm gonna address a couple of items +in kind of the administratrivia. + +0:00:10.640,0:00:13.080 +With the end of the first week, + +0:00:13.179,0:00:16.349 +we sent an email, notifying you that + +0:00:16.600,0:00:20.219 +we have uploaded the videos for the first +week, so you can now find them online. + +0:00:20.470,0:00:26.670 +They have all the screen recordings for the things +that we were doing, so you can go back to them. + +0:00:26.830,0:00:31.439 +Look if you were confused about if +we did something quick and, again, + +0:00:31.440,0:00:37.560 +feel free to ask us any questions if anything in the +lecture notes is not clear. 
We also sent you a + +0:00:37.880,0:00:42.360 +survey so you can give us feedback +about what was not clear, + +0:00:42.360,0:00:46.280 +what items you would want a +more thorough explanation or + +0:00:47.110,0:00:51.749 +just any other item, if you're finding +the exercises too hard, too easy, + +0:00:52.239,0:00:55.288 +go into that URL and we'll really + +0:00:55.960,0:01:00.040 +appreciate getting that feedback, because +that will make the course better + +0:01:00.480,0:01:03.800 +for the remaining lectures and for +future iterations of the course. + +0:01:05.080,0:01:07.080 +With that out of the way + +0:01:07.080,0:01:10.840 +Oh, and we're gonna try to upload the +videos in a more timely manner. + +0:01:11.200,0:01:16.040 +We don't want to kind of wait until the end of +the week for that. So keep tuned for that. + +0:01:18.760,0:01:19.840 +That out of the way, + +0:01:19.920,0:01:20.800 +now I'm gonna + +0:01:21.120,0:01:24.960 +This lecture's called command-line +environment and we're + +0:01:25.160,0:01:28.440 +going to cover a few different topics. So the + +0:01:28.990,0:01:30.990 +main topics we're gonna + +0:01:32.040,0:01:34.520 +cover, so you can keep track, + +0:01:34.680,0:01:36.400 +it's probably better here, + +0:01:36.400,0:01:37.720 +keep track of what I'm talking. + +0:01:37.920,0:01:41.560 +The first is gonna be job control. + +0:01:42.040,0:01:44.280 +The second one is going to be + +0:01:44.600,0:01:46.600 +terminal multiplexers. + +0:01:51.720,0:01:57.360 +Then I'm going to explain what dotfiles +are and how to configure your shell. + +0:01:57.360,0:02:03.240 +And lastly, how to efficiently work with +remote machines. So if things are not + +0:02:05.110,0:02:07.589 +fully clear, kind of keep the structure. + +0:02:08.200,0:02:12.320 +They all kind of interact in some +way, of how you use your terminal, + +0:02:12.880,0:02:17.280 +but they are somewhat separate +topics, so keep that in mind. 
+ +0:02:17.600,0:02:23.800 +So let's go with job control. So far we have +been using the shell in a very, kind of + +0:02:24.800,0:02:27.720 +mono-command way. Like, you +execute a command and then + +0:02:27.840,0:02:31.800 +the command executes, then you get some output, +and that's all about what you can do. + +0:02:32.200,0:02:36.520 +And if you want to run several +things, it's not clear + +0:02:36.540,0:02:41.099 +how you will do it. Or if you want to stop +the execution of a program, it's again, + +0:02:41.099,0:02:43.768 +like how do I know how to stop a program? + +0:02:44.650,0:02:47.940 +Let's showcase this with a command called sleep. + +0:02:48.160,0:02:50.320 +Sleep is a command that takes an argument, + +0:02:50.320,0:02:54.360 +and that argument is going to be an +integer number, and it will sleep. + +0:02:54.360,0:02:58.440 +It will just kind of be there, on the +background, for that many seconds. + +0:02:58.440,0:03:03.539 +So if we do something like sleep 20, this process +is gonna be sleeping for 20 seconds. + +0:03:03.539,0:03:07.720 +But we don't want to wait 20 seconds +for the command to complete. + +0:03:08.040,0:03:10.800 +So what we can do is type "Ctrl+C". + +0:03:10.840,0:03:12.580 +By typing "Ctrl+C" + +0:03:12.580,0:03:17.840 +We can see that, here, the terminal let us know, + +0:03:18.880,0:03:22.840 +and it's part of the syntax that we covered +in the editors / Vim lecture, + +0:03:23.000,0:03:27.200 +that we typed "Ctrl+C" and it stopped +the execution of the process. + +0:03:27.640,0:03:29.640 +What is actually going on here + +0:03:29.880,0:03:34.840 +is that this is using a UNIX communication +mechanism called signals. + +0:03:35.120,0:03:37.360 +When we type "Ctrl+C", + +0:03:37.800,0:03:42.080 +what the terminal did for us, +or the shell did for us, + +0:03:42.160,0:03:45.960 +is send a signal called SIGINT, + +0:03:45.960,0:03:51.320 +that stands for SIGnal INTerrupt, that +tells the program to stop itself. 
+ +0:03:51.680,0:03:57.520 +And there are many, many, many signals +of this kind. If you do man signal, + +0:03:58.880,0:04:05.060 +and just go down a little bit, +here you have a list of them. + +0:04:05.060,0:04:07.040 +They all have number identifiers, + +0:04:07.520,0:04:10.640 +they have kind of a short name +and you can find a description. + +0:04:10.960,0:04:16.400 +So for example, the one I have just +described is here, number 2, SIGINT. + +0:04:16.520,0:04:22.200 +This is the signal that a terminal will send to a +program when it wants to interrupt its execution. + +0:04:22.520,0:04:25.840 +A few more to be familiar with + +0:04:26.460,0:04:28.530 +is SIGQUIT, this is + +0:04:29.229,0:04:34.409 +again, if you work from a terminal and you +want to quit the execution of a program. + +0:04:34.409,0:04:37.720 +For most programs it will do the same thing, + +0:04:37.720,0:04:41.120 +but we're gonna showcase now a program +which will be different, + +0:04:41.440,0:04:43.760 +and this is the signal that will be sent. + +0:04:44.680,0:04:49.229 +It can be confusing sometimes. Looking at +these signals, for example, the SIGTERM is + +0:04:50.080,0:04:54.100 +for most cases equivalent to SIGINT and SIGQUIT + +0:04:54.480,0:04:58.380 +but it's just when it's not +sent through a terminal. + +0:04:59.680,0:05:01.680 +A few more that we're gonna + +0:05:01.900,0:05:06.209 +cover is SIGHUP, it's when there's +like a hang-up in the terminal. + +0:05:06.210,0:05:10.199 +So for example, when you are in your +terminal, if you close your terminal + +0:05:10.199,0:05:13.348 +and there are still things +running in the terminal, + +0:05:13.480,0:05:17.000 +that's the signal that the program is gonna send + +0:05:17.000,0:05:19.960 +to all the processes to tell +that they should close, + +0:05:19.960,0:05:25.080 +like there was a hang-up in the +command line communication + +0:05:25.080,0:05:26.800 +and they should close now. 
+ +0:05:28.400,0:05:34.260 +Signals can do more things than just stopping, interrupting +programs and asking them to finish. + +0:05:34.260,0:05:36.840 +You can for example use the + +0:05:37.520,0:05:43.840 +SIGSTOP to pause the execution of the +program, and then you can use the + +0:05:44.480,0:05:50.160 +SIGCONT command for continuing, to continue the execution +of the program at a point later in time. + +0:05:51.160,0:05:55.440 +Since all of this might be slightly too +abstract, let's see a few examples. + +0:05:58.040,0:06:00.560 +First, let's showcase a + +0:06:01.960,0:06:06.240 +Python program. I'm going to very +quickly go through the program. + +0:06:06.440,0:06:08.360 +This is a Python program, + +0:06:08.720,0:06:10.760 +that like most python programs, + +0:06:11.520,0:06:13.960 +is importing this signal library and + +0:06:14.960,0:06:20.400 +is defining this handler here. +And this handler is writing, + +0:06:20.440,0:06:23.040 +"Oh, I got a SIGINT, but +I'm not gonna stop here". + +0:06:23.480,0:06:24.960 +And after that, + +0:06:24.960,0:06:30.720 +we tell Python that we want this program, +when it gets a SIGINT, to stop. + +0:06:31.120,0:06:34.880 +The rest of the program is a very silly program +that is just going to be printing numbers. + +0:06:35.060,0:06:37.540 +So let's see this in action. + +0:06:37.560,0:06:39.560 +We do Python SIGINT. + +0:06:39.880,0:06:44.970 +And it's counting. We try doing +"Ctrl+C", this sends a SIGINT, + +0:06:44.970,0:06:50.000 +but the program didn't actually stop. This +is because we have a way in the program of + +0:06:50.400,0:06:54.600 +dealing with this exception, +and we didn't want to exit. + +0:06:54.760,0:06:57.600 +If we send a SIGQUIT, which is done through + +0:06:57.800,0:07:03.680 +"Ctrl+\", here, we can see that since the program +doesn't have a way of dealing with SIGQUIT, + +0:07:03.730,0:07:06.269 +it does the default operation, which is + +0:07:06.820,0:07:08.800 +terminate the program. 
+ +0:07:09.080,0:07:11.460 +And you could use this, for example, + +0:07:11.880,0:07:15.880 +if someone Ctrl+C's your program, and your +program is supposed to do something, + +0:07:16.040,0:07:19.320 +like you maybe want to save the intermediate +state of your program + +0:07:19.320,0:07:21.520 +to a file, so you can recover it for later. + +0:07:21.600,0:07:25.640 +This is how you could write a handler like this. + +0:07:29.520,0:07:30.720 +Can you repeat the question? + +0:07:30.880,0:07:32.280 +What did you type right now, when it stopped? + +0:07:32.480,0:07:34.480 +So I... + +0:07:34.630,0:07:38.880 +So what I typed is, I type +"Ctrl+C" to try to stop it + +0:07:38.880,0:07:42.869 +but it didn't, because SIGINT is captured +by the program. Then I type + +0:07:43.120,0:07:48.040 +"Ctrl+\", which sends a SIGQUIT, +which is a different signal, + +0:07:49.000,0:07:51.720 +and this signal is not captured by the program. + +0:07:52.090,0:07:54.869 +It's also worth mentioning +that there is a couple of + +0:07:54.970,0:07:59.970 +signals that cannot be captured by software. +There is a couple of signals + +0:08:00.820,0:08:02.820 +like SIGKILL + +0:08:03.940,0:08:06.600 +that cannot be captured. Like that, it will + +0:08:06.660,0:08:09.300 +terminate the execution of the +process, no matter what. + +0:08:09.300,0:08:12.000 +And it can be sometimes harmful. +You do not want to be using it by + +0:08:12.000,0:08:16.460 +default, because this can leave for example an +orphan child, orphaned children processes. + +0:08:16.470,0:08:20.940 +Like if a process has other small children +processes that it started, and you + +0:08:21.400,0:08:25.470 +SIGKILL it, all of those will +keep running in there, + +0:08:25.760,0:08:30.800 +but they won't have a parent, and you can maybe +have a really weird behavior going on. + +0:08:32.040,0:08:35.680 +What signal is given to the +program if we log off? + +0:08:35.800,0:08:37.440 +If you log off? 
+ +0:08:37.920,0:08:41.920 +That would be... so for example, if you're in an +SSH connection and you close the connection, + +0:08:41.920,0:08:45.600 +that is the hang-up signal, + +0:08:45.600,0:08:51.200 +SIGHUP, which I'm gonna cover in an example. +So this is what would be sent up. + +0:08:51.560,0:08:56.360 +And you could write for example, if you want +the process to keep working even if you close + +0:08:56.960,0:09:02.560 +that, you can write a wrapper around +that to ignore that signal. + +0:09:04.720,0:09:09.760 +Let's display what we could do +with the stop and continue. + +0:09:09.980,0:09:16.389 +So, for example, we can start a really long process. +Let's sleep a thousand, we're gonna take forever. + +0:09:16.960,0:09:18.920 +We can control-c, + +0:09:18.920,0:09:20.360 +"Ctrl+Z", sorry, + +0:09:20.360,0:09:25.280 +and if we do "Ctrl+Z" we can see that the +terminal is saying "it's suspended". + +0:09:25.400,0:09:31.520 +What this actually meant is that this process +was sent a SIGSTOP signal and now is + +0:09:31.900,0:09:36.900 +still there, you could continue its execution, but right +now it's completely stopped and in the background + +0:09:38.580,0:09:41.720 +and we can launch a different program. + +0:09:41.720,0:09:43.680 +When we try to run this program, + +0:09:43.680,0:09:46.620 +please notice that I have included +an "&" at the end. + +0:09:46.820,0:09:52.380 +This tells bash that I want this program +to start running in the background. + +0:09:52.560,0:09:55.660 +This is kind of related to all these + +0:09:55.660,0:09:59.720 +concepts of running programs in +the shell, but backgrounded. + +0:10:00.350,0:10:04.359 +And what is gonna happen is +the program is gonna start + +0:10:04.720,0:10:07.580 +but it's not gonna take over my prompt. + +0:10:07.580,0:10:11.540 +If I just ran this command without +this, I could not do anything. 
+ +0:10:11.540,0:10:15.820 +I would have no access to the prompt +until the command either finished + +0:10:16.060,0:10:19.380 +or I ended it abruptly. But if I do this, + +0:10:19.520,0:10:23.080 +it's saying "there's a new +process which is this". + +0:10:23.080,0:10:25.180 +This is the process identifying number, + +0:10:25.180,0:10:26.940 +we can ignore this for now. + +0:10:27.800,0:10:32.919 +If I type the command "jobs", I get the +output that I have a suspended job + +0:10:32.920,0:10:35.800 +that is the "sleep 1000" job. + +0:10:36.040,0:10:38.100 +And then I have another running job, + +0:10:38.120,0:10:42.200 +which is this "NOHUP sleep 2000". + +0:10:42.640,0:10:45.660 +Say I want to continue the first job. + +0:10:45.660,0:10:48.520 +The first job is suspended, +it's not executing anymore. + +0:10:48.640,0:10:52.600 +I can continue that doing "BG %1" + +0:10:53.870,0:10:58.359 +That "%" is referring to the fact that +I want to refer to this specific + +0:11:00.280,0:11:04.280 +process. And now, if I do that +and I look at the jobs, + +0:11:04.300,0:11:06.460 +now this job is running again. Now + +0:11:06.460,0:11:08.940 +both of them are running. + +0:11:09.300,0:11:13.820 +If I wanted to stop these all, +I can use the kill command. + +0:11:14.040,0:11:16.060 +The kill command + +0:11:16.220,0:11:18.620 +is for killing jobs, + +0:11:19.180,0:11:22.080 +which is just stopping them, intuitively, + +0:11:22.120,0:11:23.760 +but actually it's really useful. + +0:11:23.860,0:11:28.200 +The kill command just allows you +to send any sort of Unix signal. + +0:11:28.360,0:11:32.220 +So here for example, instead +of killing it completely, + +0:11:32.220,0:11:34.640 +we just send a stop signal. + +0:11:34.640,0:11:39.160 +Here I'm gonna send a stop signal, which +is gonna pause the process again. 
+ +0:11:39.160,0:11:41.280 +I still have to include the identifier, + +0:11:41.600,0:11:46.480 +because without the identifier the shell wouldn't know +whether to stop the first one or the second one. + +0:11:47.480,0:11:52.480 +Now it's said this has been suspended, +because there was a signal sent. + +0:11:52.620,0:11:57.360 +If I do "jobs", again, we can see +that the second one is running + +0:11:57.460,0:12:00.740 +and the first one has been stopped. + +0:12:01.420,0:12:04.300 +Going back to one of the questions, + +0:12:04.300,0:12:06.980 +what happens when you close +the shell, for example, + +0:12:06.980,0:12:12.860 +and why sometimes people will say that +you should use this NOHUP command + +0:12:12.860,0:12:15.960 +before you run jobs in a remote session. + +0:12:16.220,0:12:23.120 +This is because if we try to send +a hang-up command to the first job + +0:12:23.560,0:12:27.820 +it's gonna, in a similar fashion +as the other signals, + +0:12:27.820,0:12:32.280 +it's gonna hang it up and that's +gonna terminate the job. + +0:12:32.800,0:12:35.960 +And the first job isn't there anymore + +0:12:36.320,0:12:39.140 +whereas we have still the second job running. + +0:12:39.400,0:12:42.920 +However, if we try to send the +signal to the second job + +0:12:42.920,0:12:46.060 +what will happen if we close +our terminal right now + +0:12:47.040,0:12:48.660 +is it's still running. + +0:12:48.660,0:12:52.480 +Like NOHUP, what it's doing +is kind of encapsulating + +0:12:52.480,0:12:54.480 +whatever command you're executing and + +0:12:54.740,0:12:58.720 +ignoring whenever you get a hang up signal, + +0:12:58.900,0:13:03.680 +and just ignoring that so it can keep running. + +0:13:05.060,0:13:08.500 +And if we send the "kill" +signal to the second job, + +0:13:08.500,0:13:12.820 +that one can't be ignored and that +will kill the job, no matter what. + +0:13:13.280,0:13:15.780 +And we don't have any jobs anymore. 
+ +0:13:17.000,0:13:22.540 +That kind of completes the +section on job control. + +0:13:22.740,0:13:27.100 +Any questions so far? Anything +that wasn't fully clear? + +0:13:29.040,0:13:30.400 +What does BG do? + +0:13:30.960,0:13:31.800 +So BG... + +0:13:31.800,0:13:36.860 +There are like two commands. Whenever you +have a command that has been backgrounded + +0:13:37.200,0:13:41.820 +and is stopped you can use +BG (short for background) + +0:13:41.820,0:13:44.180 +to continue that process running +on the background. + +0:13:44.440,0:13:47.400 +That's equivalent of just kind of sending it + +0:13:47.680,0:13:50.820 +a continue signal, so it keeps running. + +0:13:50.820,0:13:54.820 +And then there's another one which +is called FG, if you want to + +0:13:54.860,0:13:59.580 +recover it to the foreground and you want +to reattach your standard output. + +0:14:04.760,0:14:06.760 +Okay, good. + +0:14:07.120,0:14:11.420 +Jobs are useful and in general, I +think knowing about signals can be + +0:14:11.420,0:14:14.360 +really beneficial when dealing +with some part of Unix + +0:14:14.360,0:14:19.420 +but most of the time what you actually want +to do is something along the lines of + +0:14:19.670,0:14:24.099 +having your editor in one side and then +the program in another, and maybe + +0:14:24.720,0:14:28.280 +monitoring what the resource +consumption is in our tab. + +0:14:28.680,0:14:33.640 +We could achieve this using probably +what you have seen a lot of the time, + +0:14:33.640,0:14:35.200 +which is just opening more windows. + +0:14:35.200,0:14:37.200 +We can keep opening terminal windows. + +0:14:37.320,0:14:41.280 +But the fact is there are kind of more +convenient solutions to this and + +0:14:41.280,0:14:43.800 +this is what a terminal multiplexer does. 
+ +0:14:44.080,0:14:48.520 +A terminal multiplexer like tmux + +0:14:48.840,0:14:52.160 +will let you create different workspaces +that you can work in, + +0:14:52.640,0:14:54.280 +and quickly kind of, + +0:14:54.280,0:14:56.960 +this has a huge variety of functionality, + +0:14:57.320,0:15:02.760 +It will let you rearrange the environment and +it will let you have different sessions. + +0:15:03.400,0:15:05.400 +There's another more... + +0:15:05.600,0:15:07.640 +older command, which is called "screen", + +0:15:07.640,0:15:09.360 +that might be more readily available. + +0:15:09.360,0:15:12.200 +But I think the concept kind +of extrapolates to both. + +0:15:12.600,0:15:15.400 +We recommend tmux, that you go and learn it. + +0:15:15.400,0:15:17.480 +And in fact, we have exercises on it. + +0:15:17.480,0:15:20.240 +I'm gonna showcase a different +scenario right now. + +0:15:20.320,0:15:22.000 +So whenever I talked... + +0:15:22.320,0:15:24.880 +Oh, let me make a quick note. + +0:15:25.200,0:15:28.800 +There are kind of three core concepts +in tmux, that I'm gonna go through and + +0:15:30.110,0:15:33.130 +the main idea is that there are what is called + +0:15:35.180,0:15:37.180 +"sessions". + +0:15:37.760,0:15:40.510 +Sessions have "windows" and + +0:15:42.019,0:15:44.019 +windows have "panes". + +0:15:45.709,0:15:49.539 +It's gonna be kind of useful to +keep this hierarchy in mind. + +0:15:50.760,0:15:57.280 +You can pretty much equate "windows" to what +"tabs" are in other editors and others, + +0:15:57.280,0:16:00.720 +like for example your web browser. + +0:16:01.280,0:16:06.440 +I'm gonna go through the features, mainly +what you can do at the different levels. + +0:16:07.000,0:16:10.480 +So first, when we do tmux, that starts a session. + +0:16:11.360,0:16:14.960 +And here right now it seems like nothing changed + +0:16:14.960,0:16:20.360 +but what's happening right now is we're within a shell +that is different from the one we started before. 
+ +0:16:20.640,0:16:24.840 +So in our shell we started +a process, that is tmux + +0:16:24.840,0:16:28.840 +and that tmux started a different process, +which is the shell we're currently in. + +0:16:28.980,0:16:30.400 +And the nice thing about this is that + +0:16:30.580,0:16:34.740 +that tmux process is separate from +the original shell process. + +0:16:34.860,0:16:36.860 +So + +0:16:40.580,0:16:44.460 +here, we can do things. + +0:16:44.480,0:16:48.600 +We can do "ls -la", for example, to +tell us what is going on in here. + +0:16:48.920,0:16:53.960 +And then we can start running our program, +and it will start running in there + +0:16:54.160,0:16:57.880 +and we can do "Ctrl+A d", for example, to detach + +0:17:12.760,0:17:15.960 +to detach from the session. + +0:17:16.140,0:17:19.120 +And if we do "tmux a" + +0:17:19.160,0:17:21.560 +that's gonna reattach us to the session. + +0:17:21.560,0:17:22.300 +So the process, + +0:17:22.300,0:17:25.180 +we abandon the process counting numbers. + +0:17:25.180,0:17:28.300 +This really silly Python program +that was just counting numbers, + +0:17:28.340,0:17:30.160 +we left it running there. + +0:17:30.200,0:17:31.720 +And if we tmux... + +0:17:31.720,0:17:33.760 +Hey, the process is still running there. + +0:17:33.780,0:17:37.820 +And we could close this entire +terminal and open a new one and + +0:17:37.880,0:17:41.860 +we could still reattach because this +tmux session is still running. + +0:17:43.340,0:17:45.340 +Again, we can... + +0:17:46.640,0:17:48.640 +Before I go any further. + +0:17:48.920,0:17:53.740 +Pretty much... Unlike Vim, where +you have this notion of modes, + +0:17:53.960,0:17:58.180 +tmux will work in a more emacsy way, which is + +0:17:58.180,0:18:04.140 +every command, pretty much every command in tmux, + +0:18:04.220,0:18:06.020 +you could enter it through the... + +0:18:06.020,0:18:08.160 +it has a command line, that we could use. 
+ +0:18:08.240,0:18:11.320 +But I recommend you to get familiar +with the key bindings. + +0:18:11.880,0:18:15.080 +It can be somewhat unintuitive at first, + +0:18:15.300,0:18:17.880 +but once you get used to them... + +0:18:22.140,0:18:23.020 +"Ctrl+C", yeah + +0:18:24.440,0:18:30.760 +When you get familiar with them, you will be much faster +just using the key bindings than using the commands. + +0:18:31.280,0:18:35.980 +One note about the key bindings: all the +key bindings have a form that is like + +0:18:36.140,0:18:39.840 +you type a prefix and then some key. + +0:18:40.060,0:18:44.000 +So for example, to detach we +do "Ctrl+A" and then "D". + +0:18:44.160,0:18:50.140 +This means you press "Ctrl+A" first, you release +that, and then press "D" to detach. + +0:18:50.380,0:18:54.200 +On default tmux, the prefix is "Ctrl+B", + +0:18:54.200,0:18:58.780 +but you will find that most people +will have this remapped to "Ctrl+A" + +0:18:58.780,0:19:02.680 +because it's much more ergonomic +to type on the keyboard. + +0:19:02.700,0:19:06.420 +You can find more about how to do these +things in one of the exercises, + +0:19:06.960,0:19:12.780 +where we link you to the basics and how to do some +kind of quality of life modifications to tmux. + +0:19:13.380,0:19:16.720 +Going back to the concept of sessions, + +0:19:16.960,0:19:22.120 +we can create a new session just +doing something like tmux new + +0:19:22.320,0:19:24.540 +and we can give sessions names. + +0:19:24.760,0:19:27.220 +So we can do like "tmux new -s foobar" + +0:19:27.220,0:19:30.900 +and this is a completely different +session, that we have started. + +0:19:32.240,0:19:36.360 +We can work here, we can detach from it. + +0:19:36.360,0:19:40.000 +"tmux ls" will tell us that we +have two different sessions: + +0:19:40.000,0:19:43.460 +the first one is named "0", because +I didn't give it a name, + +0:19:43.500,0:19:45.820 +and the second one is called "foobar".
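The session commands described here can be tried without a terminal attached. The following is a minimal sketch, assuming tmux is installed; the socket name `lecture-demo` is a made-up label (passed with `-L`) so that any real sessions are left alone, and `-s` is the flag that names a new session.

```shell
if command -v tmux >/dev/null 2>&1; then
    # -L selects a private server socket; "lecture-demo" is a hypothetical name.
    # -d starts the session detached, -s gives it the name "foobar".
    tmux -L lecture-demo new-session -d -s foobar

    # List the sessions known to that server.
    session_list=$(tmux -L lecture-demo ls)
    echo "$session_list"

    # Interactively you would reattach with:
    #   tmux -L lecture-demo attach -t foobar
    # and detach again with the prefix followed by "d".

    # Clean up the demo session.
    tmux -L lecture-demo kill-session -t foobar
else
    session_list=""
    echo "tmux is not installed; skipping the demo"
fi
```

Without `-L`, the same commands talk to your default tmux server, which is what you normally want in day-to-day use.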
+ +0:19:46.580,0:19:51.020 +I can attach the foobar session + +0:19:51.020,0:19:53.700 +and I can end it. + +0:19:54.680,0:19:56.340 +And it's really nice because + +0:19:56.340,0:20:00.139 +having this you can kind of work +in completely different projects. + +0:20:00.140,0:20:04.340 +For example, having two different +tmux sessions and different + +0:20:04.480,0:20:08.440 +editor sessions, different processes running... + +0:20:10.160,0:20:15.100 +When you are within a session, we +start with the concept of windows. + +0:20:15.100,0:20:21.160 +Here we have a single window, but we +can use "Ctrl+A c" (for "create") + +0:20:21.160,0:20:23.720 +to open a new window. + +0:20:24.000,0:20:26.340 +And here nothing is executing. + +0:20:26.380,0:20:29.420 +What it's doing is, tmux has +opened a new shell for us + +0:20:30.360,0:20:34.840 +and we can start running another +one of these programs here. + +0:20:35.460,0:20:42.460 +And to quickly jump between the tabs, +we can do "Ctrl+A" and "previous", + +0:20:42.460,0:20:44.520 +"p" for "previous", + +0:20:45.220,0:20:48.020 +and that will go up to the previous window. + +0:20:48.020,0:20:50.920 +"Ctrl+A" "next", to go to the next window. + +0:20:51.260,0:20:56.060 +You can also use the numbers. So if we +start opening a lot of these tabs, + +0:20:56.200,0:21:00.160 +we could use "Ctrl+A 1", to +specifically jump to the + +0:21:00.240,0:21:04.400 +to the window that is number "1". + +0:21:04.780,0:21:08.620 +And, lastly, it's also pretty +useful to know sometimes + +0:21:08.660,0:21:10.400 +that you can rename them. + +0:21:10.400,0:21:13.380 +For example here I'm executing +this Python process, + +0:21:13.580,0:21:16.800 +but that might not be really +informative and I want... + +0:21:16.880,0:21:21.160 +I maybe want to have something like +execution or something like that and + +0:21:21.740,0:21:26.840 +that will rename the name of that window so +you can have this really neatly organized. 
+ +0:21:27.080,0:21:33.500 +This still doesn't solve the need when you want to +have two things at the same time in your terminal, + +0:21:33.680,0:21:35.740 +like in the same display. + +0:21:35.740,0:21:38.320 +This is what panes are for. Right now, here + +0:21:38.420,0:21:40.420 +we have a window with a single pane + +0:21:40.420,0:21:43.540 +(all the windows that we have opened +so far have a single pane). + +0:21:43.640,0:21:50.800 +But if we do 'Ctrl+A "' + +0:21:51.040,0:21:56.540 +this will split the current display +into two different panes. + +0:21:56.540,0:22:01.400 +So, you see, the one we open below is a different +shell from the one we have above, + +0:22:01.640,0:22:05.440 +and we can run any process that we want here. + +0:22:05.620,0:22:09.900 +We can keep splitting this, if we do "Ctrl+A %" + +0:22:10.080,0:22:15.000 +that will split vertically. And you can kind of + +0:22:15.000,0:22:18.220 +rearrange these tabs using a +lot of different commands. + +0:22:18.220,0:22:22.620 +One that I find very useful, when you are +starting and it's kind of frustrating, + +0:22:23.540,0:22:26.000 +rearranging them. + +0:22:26.160,0:22:30.160 +Before I explain that, to move +through these panes, which is + +0:22:30.300,0:22:32.280 +something you want to be doing all the time + +0:22:32.460,0:22:37.060 +You just do "Ctrl+A" and the arrow +keys, and that will let you quickly + +0:22:37.460,0:22:43.960 +navigate through the different +windows, and execute again... + +0:22:44.340,0:22:46.300 +I'm doing a lot of "ls -a" + +0:22:47.340,0:22:52.780 +I can do "HTOP", that we'll explain in +the debugging and profiling lecture. 
+ +0:22:53.540,0:22:55.920 +And we can just navigate through them, again + +0:22:55.920,0:22:59.040 +like to rearrange there's +another slew of commands, + +0:22:59.080,0:23:01.080 +you will go through some in the Exercises + +0:23:02.400,0:23:07.160 +"Ctrl+A" space is pretty neat, because it +will kind of equispace the current ones + +0:23:07.160,0:23:10.260 +and let you through different layouts. + +0:23:11.480,0:23:14.260 +Some of them are too small for my current + +0:23:14.840,0:23:19.220 +terminal config, but that covers, +I think, most of it. + +0:23:19.440,0:23:21.440 +Oh, there's also, + +0:23:22.660,0:23:29.200 +here, for example, this Vim execution +that we have started, + +0:23:29.200,0:23:33.380 +is too small for what the current tmux pane is. + +0:23:33.720,0:23:38.240 +So one of the things that really is +much more convenient to do in tmux, + +0:23:39.180,0:23:42.500 +in contrast to having multiple +terminal windows, is that + +0:23:42.560,0:23:48.400 +you can zoom into this, you can ask +by doing "Ctrl+A z", for "zoom". + +0:23:48.400,0:23:52.960 +It will expand the pane to +take over all the space, + +0:23:52.960,0:23:56.660 +and then "Ctrl+A z", again will go back to it. + +0:24:02.760,0:24:08.080 +Any questions for terminal multiplexers, +or like, tmux concretely? + +0:24:14.140,0:24:16.780 +Is it running all the same thing? + +0:24:18.680,0:24:22.700 +Like, is there any difference in execution +between running it in different windows? + +0:24:24.880,0:24:28.640 +Is it really just doing it all the +same, so that you can see it? + +0:24:28.800,0:24:34.900 +Yeah, it wouldn't be any different from having +two terminal windows open in your computer. + +0:24:34.920,0:24:39.220 +Like both of them are gonna be running. +Of course, when it gets to the CPU, + +0:24:39.220,0:24:41.400 +this is gonna be multiplexed again. + +0:24:41.460,0:24:44.400 +Like there's like a timesharing +mechanism going there + +0:24:44.480,0:24:45.920 +but there's no difference. 
+ +0:24:46.040,0:24:52.260 +tmux is just making this much more convenient +to use by giving you this visual layout + +0:24:52.560,0:24:55.020 +that you can quickly manipulate through. + +0:24:55.020,0:24:59.860 +And one of the main advantages will come +when we reach the remote machines + +0:24:59.860,0:25:05.300 +because you can leave one of these, we can +detach from one of these tmux systems, + +0:25:05.300,0:25:09.120 +close the connection and even +if we close the connection and + +0:25:09.120,0:25:11.640 +and the terminal is gonna send a hang-up signal, + +0:25:11.680,0:25:15.420 +that's not gonna close all the +tmux's that have been started. + +0:25:17.110,0:25:19.110 +Any other questions? + +0:25:23.620,0:25:27.980 +Let me disable the key-caster. + +0:25:33.580,0:25:38.040 +So now we're gonna move into the topic +of dotfiles and, in general, + +0:25:38.040,0:25:42.460 +how to kind of configure your shell +to do the things you want to do + +0:25:42.460,0:25:45.580 +and mainly how to do them quicker +and in a more convenient way. + +0:25:46.360,0:25:49.260 +I'm gonna motivate this using aliases first. + +0:25:49.380,0:25:51.060 +So what an alias is, + +0:25:51.060,0:25:54.260 +is that by now, you might be +starting to do something like + +0:25:54.920,0:26:01.680 +a lot of the time, I just want to LS a directory and +I want to display all the contents into a list format + +0:26:02.180,0:26:05.040 +and in a human readable thing. + +0:26:05.260,0:26:07.400 +And it's fine. Like it's not +that long of a command. + +0:26:07.400,0:26:10.300 +But as you start building longer +and longer commands, + +0:26:10.320,0:26:14.440 +it can become kind of bothersome having +to retype them again and again. + +0:26:14.440,0:26:17.540 +This is one of the reasons +why aliases are useful. 
+ +0:26:17.540,0:26:21.740 +Alias is a command that will +be a built-in in your shell, + +0:26:21.960,0:26:23.680 +and what it will do is + +0:26:23.680,0:26:27.540 +it will remap a short sequence of +characters to a longer sequence. + +0:26:27.780,0:26:31.500 +So if I do, for example, here + +0:26:31.500,0:26:36.840 +alias ll="ls -lah" + +0:26:37.440,0:26:42.520 +If I execute this command, this is gonna call +the "alias" command with this argument + +0:26:42.520,0:26:44.320 +and the alias is going to update + +0:26:44.540,0:26:49.040 +the environment in my shell +to be aware of this mapping. + +0:26:49.320,0:26:52.920 +So if I now do LL, + +0:26:52.920,0:26:57.520 +it's executing that command without me +having to type the entire command. + +0:26:57.720,0:27:01.180 +It can be really handy for many, many reasons. + +0:27:01.180,0:27:04.740 +One thing to note before I go any further is that + +0:27:05.000,0:27:09.960 +here, alias is not anything special +compared to other commands, + +0:27:09.960,0:27:11.400 +it's just taking a single argument. + +0:27:11.680,0:27:15.600 +And there is no space around +this equals and that's + +0:27:16.020,0:27:18.720 +because alias takes a single argument + +0:27:18.720,0:27:21.640 +and if you try doing + +0:27:21.960,0:27:25.120 +something like this, that's giving +it more than one argument + +0:27:25.120,0:27:28.360 +and that's not gonna work because +that's not the format it expects. + +0:27:29.520,0:27:33.680 +So other use cases that work for aliases, + +0:27:34.720,0:27:36.549 +as I was saying, + +0:27:36.549,0:27:39.920 +for some things it may be much more convenient, + +0:27:40.040,0:27:41.020 +like + +0:27:41.020,0:27:43.200 +one of my favorites is git status. + +0:27:43.200,0:27:47.500 +It's extremely long, and I don't like typing +that long of a command every so often, + +0:27:47.560,0:27:48.960 +because you end up taking a lot of time.
+ +0:27:49.120,0:27:53.000 +So GS will replace for doing the git status + +0:27:53.820,0:27:58.620 +You can also use them to alias +things that you mistype often, + +0:27:58.620,0:28:01.160 +so you can do "sl=ls", + +0:28:01.160,0:28:02.540 +that will work. + +0:28:05.800,0:28:10.620 +Other useful mappings are, + +0:28:10.680,0:28:15.460 +you might want to alias a command to itself + +0:28:15.740,0:28:17.520 +but with a default flag. + +0:28:17.520,0:28:21.100 +So here what is going on is I'm creating an alias + +0:28:21.100,0:28:23.100 +which is an alias for the move command, + +0:28:23.300,0:28:29.780 +which is MV and I'm aliasing it to the +same command but adding the "-i" flag. + +0:28:29.980,0:28:34.460 +And this "-i" flag, if you go through the man page +and look at it, it stands for "interactive". + +0:28:34.780,0:28:39.880 +And what it will do is it will prompt +me before I do an overwrite. + +0:28:39.880,0:28:44.420 +So once I have executed this, +I can do something like + +0:28:44.700,0:28:47.360 +I want to move "aliases" into "case". + +0:28:47.700,0:28:53.140 +By default "move" won't ask, and if "case" +already exists, it will be over. + +0:28:53.160,0:28:55.780 +That's fine, I'm going to overwrite +whatever that's there. + +0:28:56.020,0:28:58.580 +But here it's now expanded, + +0:28:58.580,0:29:01.660 +"move" has been expanded into this "move -i" + +0:29:01.660,0:29:03.540 +and it's using that to ask me + +0:29:03.540,0:29:07.400 +"Oh, are you sure you want to overwrite this?" + +0:29:07.700,0:29:11.780 +And I can say no, I don't want to lose that file. + +0:29:12.180,0:29:15.820 +Lastly, you can use "alias move" + +0:29:15.820,0:29:18.520 +to ask for what this alias stands for. + +0:29:19.100,0:29:22.060 +So it will tell you so you can quickly make sure + +0:29:22.080,0:29:25.400 +what the command that you +are executing actually is. 
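The aliases mentioned in this part of the lecture can be collected in one place. This is a small sketch rather than a recommended set; it only defines the aliases and then asks the shell to print two of them back.

```shell
# Aliases from the lecture; note there are no spaces around "=".
alias ll="ls -lah"      # long listing, all files, human-readable sizes
alias gs="git status"   # shorter name for a frequent command
alias sl=ls             # forgive a common typo
alias mv="mv -i"        # prompt before overwriting an existing file

# Given a name and no "=", alias prints the current definition.
alias ll
alias mv
```

One caveat when trying this from a script: bash only expands aliases in interactive shells unless `shopt -s expand_aliases` is set, so the definitions are mostly useful at the prompt or in a startup file.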
+ +0:29:27.040,0:29:31.400 +One inconvenient part about, for example, +having aliases is how will you go about + +0:29:31.760,0:29:35.340 +persisting them into your current environment? + +0:29:35.500,0:29:38.120 +Like, if I were to close this terminal now, + +0:29:38.280,0:29:40.160 +all these aliases will go away. + +0:29:40.160,0:29:43.020 +And you don't want to be kind +of retyping these commands + +0:29:43.020,0:29:46.760 +and more generally, if you start configuring +your shell more and more, + +0:29:46.860,0:29:50.880 +you want some way of bootstrapping +all this configuration. + +0:29:51.380,0:29:56.780 +You will find that most shell command programs + +0:29:56.880,0:30:01.440 +will use some sort of text +based configuration file. + +0:30:01.440,0:30:06.740 +And this is what we usually call "dotfiles", because +they start with a dot for historical reasons. + +0:30:07.060,0:30:13.160 +So for bash in our case, which is a shell, + +0:30:13.160,0:30:15.560 +we can look at the bashrc. + +0:30:16.180,0:30:19.840 +For demonstration purposes, +here I have been using ZSH, + +0:30:19.900,0:30:24.460 +which is a different shell, and I'm going +to be configuring bash, and starting bash. + +0:30:24.640,0:30:29.640 +So if I create an entry here and I say + +0:30:29.940,0:30:31.960 +SL maps to LS + +0:30:32.600,0:30:36.020 +And I have modified that, and now I start bash. + +0:30:36.540,0:30:40.660 +Bash is kind of completely unconfigured, +but now if I do SL... + +0:30:41.360,0:30:44.040 +Hm, that's unexpected. + +0:30:46.280,0:30:48.000 +Oh, good. Good getting that. + +0:30:48.300,0:30:52.200 +So it matters where you config file is, + +0:30:52.200,0:30:55.260 +your config file needs to be in your home folder. + +0:30:55.640,0:31:00.940 +So your configuration file for +bash will live in that "~", + +0:31:00.940,0:31:03.940 +which will expand to your home directory, + +0:31:03.940,0:31:05.560 +and then bashrc. 
+ +0:31:06.160,0:31:08.840 +And here we can create the alias + +0:31:12.040,0:31:15.840 +and now we start a bash session and we do SL. + +0:31:15.840,0:31:21.500 +Now it has been loaded, and this is +loaded at the beginning when this + +0:31:22.300,0:31:24.300 +bash program is started. + +0:31:24.700,0:31:31.200 +All this configuration is loaded and you can, not only use +aliases, they can have a lot of parts of configuration. + +0:31:31.390,0:31:35.729 +So for example here, I have a prompt +which is fairly useless. + +0:31:35.730,0:31:38.429 +It has just given me the name +of the shell, which is bash, + +0:31:38.640,0:31:43.820 +and the version, which is 5.0. I don't +want this to be displayed and + +0:31:44.360,0:31:48.540 +as with many things in your shell, this +is just an environment variable. + +0:31:48.600,0:31:53.120 +So the "PS1" is just the prompt string + +0:31:53.710,0:31:55.480 +for your prompt and + +0:31:55.480,0:32:02.520 +we can actually modify this +to just be a "> " symbol. + +0:32:02.520,0:32:08.280 +and now that has been modified, and we have +that. But if we exit and call bash again, + +0:32:08.620,0:32:15.059 +that was lost. However, if we add this +entry and say, oh we want "PS1" + +0:32:15.760,0:32:17.230 +to be + +0:32:17.230,0:32:19.179 +this and + +0:32:19.179,0:32:24.689 +we call bash again, this has been persisted. +And we can keep modifying this configuration. + +0:32:25.090,0:32:27.209 +So maybe we want to include + +0:32:27.880,0:32:29.880 +where the + +0:32:30.370,0:32:32.939 +working directory that we are is in, and + +0:32:34.140,0:32:37.380 +that's telling us the same information +that we had in the other shell. + +0:32:37.380,0:32:40.480 +And there are many, many options, + +0:32:40.780,0:32:45.060 +shells are highly, highly configurable, and + +0:32:45.700,0:32:49.920 +it's not only shells that are configured +through these files, + +0:32:50.590,0:32:55.740 +there are many other programs.
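To make the persistence step concrete, here is a hedged sketch that writes a tiny startup file and loads it by hand. It assumes a throwaway directory (the names `demo_home` and `rc_file` are made up for illustration) so that the real `~/.bashrc` is not modified; `\w` is the bash prompt escape for the current working directory.

```shell
# Use a throwaway directory instead of the real home folder.
demo_home=$(mktemp -d)
rc_file="$demo_home/.bashrc"

# The quoted 'EOF' keeps the shell from expanding anything in the file body.
cat > "$rc_file" <<'EOF'
alias sl=ls
PS1='\w > '
EOF

# An interactive (non-login) bash reads ~/.bashrc at startup; sourcing the
# file by hand applies the same settings to the current shell.
. "$rc_file"
printf 'PS1 is now: %s\n' "$PS1"
```

In real use the two lines inside the here-document would simply live in `~/.bashrc`, and every new bash session would pick them up.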
As we saw for +example in the editors lecture, Vim is also + +0:32:55.840,0:33:02.900 +configured this way. We gave you this vimrc +file and told you to put it under your + +0:33:03.460,0:33:06.380 +home/.vimrc + +0:33:06.380,0:33:11.800 +and this is the same concept, but just +for Vim. It's just giving it a set of + +0:33:12.160,0:33:18.340 +instructions that it should load when it's started, +so you can keep a configuration that you want. + +0:33:19.140,0:33:21.240 +And even non... + +0:33:21.580,0:33:27.140 +kind of a lot of programs will support this. For instance, +my terminal emulator, which is another concept, + +0:33:27.260,0:33:30.159 +which is the program that is + +0:33:30.159,0:33:35.459 +running the shell, in a way, and displaying +this into the screen in my computer. + +0:33:35.950,0:33:38.610 +It can also be configured this way, so + +0:33:39.940,0:33:43.620 +if I modify this I can + +0:33:46.510,0:33:53.279 +change the size of the font. Like right now, for +example, I have increased the font size a lot + +0:33:53.279,0:33:55.768 +for demonstration purposes, but + +0:33:56.440,0:34:00.360 +if I change this entry and make it for example + +0:34:01.320,0:34:06.820 +28 and write this value, you see that +the size of the font has changed, + +0:34:06.820,0:34:12.920 +because I edited this text file that specifies +how my terminal emulator should work. + +0:34:19.480,0:34:20.900 +Any questions so far? + +0:34:20.900,0:34:22.280 +With dotfiles. + +0:34:28.040,0:34:35.940 +Okay, it can be a bit daunting knowing that there +is like this endless wall of configurations, + +0:34:35.940,0:34:40.600 +and how do you go about learning +about what can be configured? + +0:34:42.020,0:34:44.300 +The good news is that + +0:34:44.640,0:34:48.900 +we have linked you to really good +resources in the lecture notes. 
+ +0:34:48.960,0:34:56.440 +But the main idea is that a lot of people really like +just configuring these tools and have uploaded + +0:34:56.640,0:35:01.140 +their configuration files to GitHub, another +different kind of repositories online. + +0:35:01.140,0:35:03.300 +So for example, here we are on GitHub, + +0:35:03.300,0:35:06.640 +we search for dotfiles, and +can see that there are like + +0:35:06.780,0:35:12.540 +thousands of repositories of people sharing +their configuration files. We have also... + +0:35:12.540,0:35:15.460 +Like, the class instructors +have linked our dotfiles. + +0:35:15.460,0:35:19.420 +So if you really want to know how +any part of our setup is working + +0:35:19.420,0:35:22.220 +you can go through it and try to figure it out. + +0:35:22.220,0:35:24.220 +You can also feel free to ask us. + +0:35:24.380,0:35:27.060 +If we go for example to this repository here + +0:35:27.210,0:35:30.649 +we can see that there's many, many +files that you can configure. + +0:35:30.650,0:35:37.520 +For example, there is one for bash, the first couple of ones +are for git, that will be probably be covered in the + +0:35:38.610,0:35:40.819 +version control lecture tomorrow. + +0:35:41.400,0:35:48.500 +If we go for example to the bash profile, which is +a different form of what we saw in the bashrc, + +0:35:49.400,0:35:52.900 +it can be really useful because +you can learn through + +0:35:53.940,0:35:58.320 +just looking at the manual page, but the +manual pages is, a lot of the time + +0:35:58.480,0:36:03.520 +just kind of like a descriptive explanation +of all the different options + +0:36:03.520,0:36:04.880 +and sometimes it's more helpful + +0:36:04.880,0:36:09.600 +going through examples of what people have done +and trying to understand why they did it + +0:36:09.600,0:36:12.200 +and how it's helping their workflow. + +0:36:12.960,0:36:17.300 +We can say here that this person has +done case-insensitive globbing. 
+ +0:36:17.320,0:36:21.220 +We covered globbing as this +kind of filename expansion + +0:36:22.100,0:36:25.760 +trick in the shell scripting and tools. + +0:36:25.900,0:36:28.800 +And here you say no, I don't want this to matter, + +0:36:28.800,0:36:30.760 +whether using uppercase and lowercase, + +0:36:30.760,0:36:32.760 +and just setting this option in the shell +for these things to work this way + +0:36:35.360,0:36:38.140 +Similarly, there is for example aliases. + +0:36:38.140,0:36:42.220 +Here you can see a lot of aliases that this +person is doing. For example, "d" for + +0:36:44.200,0:36:47.400 +"d" for "Dropbox", sorry, because +that's just much shorter. + +0:36:47.400,0:36:49.200 +"g" for "git"... + +0:36:49.740,0:36:54.560 +Say we go, for example, with vimrc. It +can be actually very, very informative, + +0:36:54.560,0:36:58.860 +going through this and trying +to extract useful information. + +0:36:59.000,0:37:06.420 +We do not recommend just kind of getting one huge blob +of this and copying this into your config files, + +0:37:07.110,0:37:12.439 +because maybe things are prettier, but you might +not really understand what is going on. + +0:37:15.150,0:37:19.579 +Lastly one thing I want to mention +about dotfiles is that + +0:37:20.460,0:37:23.390 +people not only try to push these + +0:37:24.660,0:37:28.849 +files into GitHub just so other +people can read it, that's + +0:37:29.400,0:37:33.319 +one reason. They also make really sure they can + +0:37:34.140,0:37:39.440 +reproduce their setup. And to do that +they use a slew of different tools. + +0:37:39.440,0:37:41.280 +Oops, went a little too far. 
+ +0:37:41.280,0:37:44.840 +So GNU Stow is, for example, one of them + +0:37:45.720,0:37:49.060 +and the trick that they are doing is + +0:37:50.280,0:37:54.520 +they are kind of putting all their +dotfiles in a folder and they are + +0:37:55.200,0:37:59.520 +faking to the system, using +a tool called symlinks, + +0:37:59.520,0:38:02.440 +that they are actually what +they're not. I'm gonna + +0:38:03.150,0:38:05.150 +draw really quick what I mean by that. + +0:38:05.790,0:38:10.939 +So a common folder structure might look +like you have your home folder and + +0:38:11.670,0:38:14.300 +in this home folder you might have your + +0:38:16.050,0:38:21.380 +bashrc, that contains your bash configuration, +you might have your vimrc and + +0:38:22.500,0:38:25.760 +it would be really great if you could +keep this under version control. + +0:38:26.580,0:38:29.300 +But the thing is, you might not +want to have a git repository, + +0:38:29.300,0:38:31.300 +which will be covered tomorrow, + +0:38:31.300,0:38:32.300 +in your home folder. + +0:38:32.300,0:38:37.360 +So what people usually do is they +create a dotfiles repository, + +0:38:38.280,0:38:42.160 +and then they have entries here for their + +0:38:43.050,0:38:47.239 +bashrc and their vimrc. And +this is where actually + +0:38:47.820,0:38:49.820 +the files are + +0:38:50.100,0:38:52.400 +and what they are doing is they're just + +0:38:53.460,0:38:56.510 +telling the OS to forward, whenever anyone + +0:38:56.760,0:39:01.849 +wants to read this file or write to this file, +just forward this to this other file. + +0:39:03.000,0:39:05.719 +This is a concept called symlinks + +0:39:06.690,0:39:08.630 +and it's useful in this scenario, + +0:39:08.630,0:39:12.600 +but it in general it's a really +useful tool in UNIX + +0:39:12.700,0:39:14.700 +that we haven't covered so far in the lectures + +0:39:14.960,0:39:16.740 +but you might be... + +0:39:16.740,0:39:18.740 +that you should be familiar with. 
+ +0:39:19.100,0:39:22.840 +And in general, the syntax will be "ln -s" + +0:39:22.840,0:39:29.980 +for specifying a symbolic link and then +you will put the path to the file + +0:39:30.570,0:39:33.049 +that you want to point to and then the + +0:39:33.780,0:39:35.780 +symlink that you want to create. + +0:39:39.390,0:39:41.390 +And + +0:39:41.880,0:39:45.619 +All these kind of fancy tools +that we're seeing here listed, + +0:39:45.810,0:39:52.159 +they all amount to doing some sort of this trick, so +that you can have all your dotfiles neat and tidy + +0:39:52.680,0:39:57.829 +into a folder, and then they can be +version-controlled, and they can be + +0:39:58.349,0:40:02.689 +symlinked so the rest of the programs can +find them in their default locations. + +0:40:06.720,0:40:09.020 +Any questions regarding dotfiles? + +0:40:13.200,0:40:20.200 +Do you need to have the dotfiles in your home folder, +and then also dotfiles in the version control folder? + +0:40:20.780,0:40:24.640 +So what you will have is, +pretty much every program, + +0:40:24.640,0:40:26.180 +for example bash, + +0:40:26.180,0:40:29.560 +will always look for "home/.bashrc". + +0:40:29.560,0:40:33.480 +That's where the program is going to look for. + +0:40:33.820,0:40:40.200 +What you do when you do a symlink +is, you place your "home/.bashrc" + +0:40:40.200,0:40:44.900 +it's just a file that is kind +of a special file in UNIX, + +0:40:45.150,0:40:49.609 +that says oh, whenever you want to read +this file go to this other file. + +0:40:51.500,0:40:53.440 +There's no content, like there is no... + +0:40:53.600,0:40:58.099 +your aliases are not part of this dotfile. That file +is just kind of like a pointer, saying now you should + +0:40:58.100,0:40:59.400 +go that other way. + +0:40:59.400,0:41:02.600 +And by doing that you can have your other file + +0:41:02.600,0:41:04.400 +in that other folder.
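The symlink arrangement can be sketched in a scratch directory. This is an illustration only: `demo_root` stands in for the home folder and `dotfiles` for the version-controlled repository, both made-up names. The argument order is `ln -s TARGET LINK_NAME`.

```shell
# Scratch directory standing in for the home folder.
demo_root=$(mktemp -d)

# The real file lives inside the dotfiles folder.
mkdir "$demo_root/dotfiles"
echo 'alias sl=ls' > "$demo_root/dotfiles/bashrc"

# ln -s TARGET LINK_NAME: the link sits where programs expect the file.
ln -s "$demo_root/dotfiles/bashrc" "$demo_root/.bashrc"

# The link itself only stores the path it points to...
link_target=$(readlink "$demo_root/.bashrc")
echo "link points to: $link_target"

# ...but reading through it yields the real file's content.
link_content=$(cat "$demo_root/.bashrc")
echo "content: $link_content"
```

Editing either path changes the same underlying file, which is why the copy inside the repository can be committed while programs keep reading the default location.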
+ +0:41:04.560,0:41:06.360 +If version controlling is not useful, think about + +0:41:06.360,0:41:10.740 +what if you want to have them in your Dropbox +folder, so they're synced to the cloud, + +0:41:10.759,0:41:15.019 +for example. That's kind of another use case +where like symlinks could be really useful + +0:41:16.240,0:41:21.040 +So you don't need the folder dotfiles +to be in the home directory, right? + +0:41:21.040,0:41:23.820 +Because you can just use the symlink, +that points somewhere else. + +0:41:23.960,0:41:29.760 +As long as you have a way for the default path +to resolve wherever you have it, yeah. + +0:41:35.100,0:41:38.000 +Last thing I want to cover in the lecture... + +0:41:38.000,0:41:40.380 +Oh, sorry, any other questions about dotfiles? + +0:41:49.200,0:41:52.580 +Last thing I want to cover in the lecture +is working with remote machines, + +0:41:52.580,0:41:55.549 +which is a thing that you will run into, + +0:41:55.559,0:41:56.900 +sooner or later. + +0:41:56.900,0:42:02.238 +And there are a few things that will make your life +much easier when dealing with remote machines + +0:42:03.180,0:42:05.180 +if you know about them. + +0:42:05.220,0:42:08.380 +Right now maybe because you are +using the Athena cluster, + +0:42:08.380,0:42:10.740 +but later on, during your programming career, + +0:42:10.740,0:42:11.960 +it's pretty sure that + +0:42:11.960,0:42:15.400 +there is a fairly ubiquitous +concept of having your + +0:42:15.400,0:42:20.380 +local working environment and then having some +production server that is actually running the + +0:42:20.970,0:42:23.239 +code, so it is really good to get familiar + +0:42:24.480,0:42:26.749 +about how to work in/with remote machines. + +0:42:27.420,0:42:35.180 +So the main command for working +with remote machines is SSH. 
+ +0:42:37.760,0:42:43.900 +SSH is just like a secure shell, it's +just gonna take the responsibility for + +0:42:43.900,0:42:46.540 +reaching wherever we want or tell it to go + +0:42:47.560,0:42:50.700 +and trying to open a session there. + +0:42:50.700,0:42:52.400 +So here the syntax is: + +0:42:53.130,0:42:56.660 +"JJGO" is the user that I want +to use in the remote machine, + +0:42:56.660,0:42:58.430 +and this is because the user is + +0:42:58.529,0:43:03.460 +different from the one I have my local machine, +which will be the case a lot of the time, + +0:43:03.460,0:43:07.400 +then the "@" is telling the +terminal that this separates + +0:43:07.400,0:43:12.540 +what the user is from what the address is. + +0:43:12.540,0:43:16.540 +And here I'm using an IP address because +what I'm actually doing is + +0:43:16.540,0:43:20.500 +I have a virtual machine in my computer, + +0:43:20.500,0:43:23.240 +that is the one that is remote right now. + +0:43:23.240,0:43:26.400 +And I'm gonna be SSH'ing into it. This is the + +0:43:26.580,0:43:27.880 +URL that I'm using, + +0:43:27.880,0:43:29.860 +sorry, the IP that I'm using, + +0:43:29.860,0:43:32.280 +but you might also see things like + +0:43:32.360,0:43:36.820 +oh I want to SSH as "JJGO" + +0:43:36.820,0:43:39.840 +at "foobar.mit.edu" + +0:43:39.840,0:43:42.960 +That's probably something more +common, if you are using some + +0:43:42.960,0:43:47.260 +remote server that has a DNS name. + +0:43:48.180,0:43:51.860 +So going back to a regular command, + +0:43:53.220,0:43:56.580 +we try to SSH, it asks us for a password, + +0:43:56.580,0:43:58.180 +really common thing. + +0:43:58.190,0:43:59.480 +And now we're there. We have... + +0:43:59.480,0:44:02.629 +we're still in our same terminal emulator + +0:44:02.630,0:44:09.529 +but right now SSH is kind of forwarding the +entire virtual display to display what the + +0:44:09.869,0:44:14.358 +remote shell is displaying. 
And +we can execute commands here and + +0:44:15.630,0:44:17.630 +we'll see the remote files + +0:44:18.390,0:44:22.819 +A couple of handy things to know about +SSH, that were briefly covered in the + +0:44:23.220,0:44:27.080 +data wrangling lecture, is that +SSH is not only good for just + +0:44:28.280,0:44:33.760 +opening connections. It will also let +you just execute commands remotely. + +0:44:33.770,0:44:36.979 +So for example, if I do that, it's gonna ask me + +0:44:37.710,0:44:39.020 +what is my password?, again. + +0:44:39.020,0:44:41.059 +And it's executing this command + +0:44:41.279,0:44:43.420 +then coming back to my terminal + +0:44:43.420,0:44:47.420 +and piping the output of what that +command was, in the remote machine, + +0:44:47.420,0:44:50.480 +through the standard output in my current shell. + +0:44:50.480,0:44:53.940 +And I could have this in... + +0:44:58.100,0:45:00.480 +I could have this in a pipe, and + +0:45:00.980,0:45:03.580 +this will work and we'll just + +0:45:03.600,0:45:06.100 +drop all this output and then have a local pipe + +0:45:06.100,0:45:07.879 +where I can keep working. + +0:45:08.640,0:45:12.140 +So far, it has been kind of inconvenient, +having to type our password. + +0:45:12.630,0:45:14.820 +There's one really good trick for this. + +0:45:14.820,0:45:16.880 +It's we can use something called "SSH keys". + +0:45:17.140,0:45:20.660 +SSH keys just use public key encryption + +0:45:20.660,0:45:24.980 +to create a pair of SSH keys, a public +key and a private key, and then + +0:45:25.170,0:45:29.320 +you can give the server the +public part of the key. + +0:45:29.320,0:45:32.810 +So you copy the public key and +then whenever you try to + +0:45:33.390,0:45:37.129 +authenticate instead of using your password, +it's gonna use the private key to + +0:45:37.820,0:45:40.800 +prove to the server that you are +actually who you say you are.
+ +0:45:43.860,0:45:48.020 +We can quickly showcase how you will go + +0:45:48.020,0:45:49.400 +about doing this. + +0:45:49.400,0:45:53.180 +Right now I don't have any SSH keys, +so I'm gonna create a couple of them. + +0:45:53.940,0:45:58.250 +First thing, it's just gonna ask +me where I want this key to live. + +0:45:58.980,0:46:00.640 +Unsurprisingly, it's doing this. + +0:46:00.640,0:46:04.820 +This is my home folder and then +it's using this ".ssh" path, + +0:46:05.460,0:46:08.750 +which refers back to the same concept +that we covered earlier about having + +0:46:08.850,0:46:12.439 +dotfiles. Like ".ssh" is a folder +that contains a lot of the + +0:46:13.320,0:46:16.540 +configuration files for how +you want SSH to behave. + +0:46:17.060,0:46:19.420 +So it will ask us a passphrase. + +0:46:19.680,0:46:23.120 +The passphrase is to encrypt +the private part of the key + +0:46:23.120,0:46:27.160 +because if someone gets your private key, +if you don't have a password protected + +0:46:27.920,0:46:29.580 +private key, if they get that key + +0:46:29.580,0:46:32.240 +they can use that key to impersonate +you in any server. + +0:46:32.310,0:46:34.360 +Whereas if you add a passphrase, + +0:46:34.360,0:46:37.640 +they will have to know what the passphrase +is to actually use the key. + +0:46:40.800,0:46:51.740 +It has created a key pair. We can check that +these two files are now under ssh. + +0:46:51.740,0:46:53.920 +And we can see... + +0:46:57.720,0:47:02.960 +We have these two files: we have +the 25519 and the public key. + +0:47:03.320,0:47:06.300 +And if we "cat" through the output, + +0:47:06.300,0:47:09.760 +that key is actually not like +any fancy binary file, it's + +0:47:15.430,0:47:20.760 +just a text file that has the contents +of the public key and some + +0:47:23.050,0:47:26.729 +alias name for it, so we can +know what this public key is.
+ +0:47:26.950,0:47:32.220 +The way we can tell the server that +we're authorized to SSH there + +0:47:32.260,0:47:38.400 +is by just actually copying this file, +like copying this string into a file, + +0:47:38.400,0:47:41.540 +that is ".ssh/authorized_keys". + +0:47:42.100,0:47:46.160 +So here what I'm doing is I'm + +0:47:46.960,0:47:49.770 +catting the output of this file + +0:47:49.800,0:47:53.920 +which is just this line of +text that we want to copy + +0:47:53.920,0:47:57.440 +and I'm piping that into SSH and then remotely + +0:47:57.960,0:48:02.080 +I'm asking "tee" to dump the contents +of the standard input + +0:48:02.080,0:48:05.220 +into ".ssh/authorized_keys". + +0:48:05.440,0:48:10.360 +And if we do that, obviously it's +gonna ask us for a password. + +0:48:14.800,0:48:18.740 +It was copied, and now we +can check that if we try + +0:48:19.690,0:48:21.690 +to SSH again, + +0:48:21.960,0:48:24.840 +It's going to first ask us for a passphrase + +0:48:24.840,0:48:29.100 +but you can arrange that so that +it's saved in the session + +0:48:29.460,0:48:34.840 +and we didn't actually have to +type the key for the server. + +0:48:34.840,0:48:36.840 +And I can kind of show that again. + +0:48:45.820,0:48:47.540 +More things that are useful. + +0:48:47.540,0:48:49.040 +Oh, we can do... + +0:48:49.220,0:48:51.880 +If that command seemed a little bit janky, + +0:48:51.980,0:48:55.000 +you can actually use this command +that is built for this, + +0:48:55.000,0:49:00.640 +so you don't have to kind of +craft this "ssh t" command. + +0:49:00.640,0:49:03.800 +That is just called "ssh-copy-id". + +0:49:05.000,0:49:08.080 +And we can do the same + +0:49:08.080,0:49:09.660 +and it's gonna copy the key. + +0:49:09.660,0:49:14.280 +And now, if we try to SSH, + +0:49:14.500,0:49:18.320 +we can SSH without actually +typing any key at all, + +0:49:18.860,0:49:20.320 +or any password. + +0:49:20.660,0:49:21.520 +More things. 
+ +0:49:21.520,0:49:23.520 +We will probably want to copy files. + +0:49:23.740,0:49:25.310 +You cannot use "CP" + +0:49:25.310,0:49:29.720 +but you can use "SCP", for "SSH copy". + +0:49:29.720,0:49:34.500 +And here we can specify that we want +to copy this local file called notes + +0:49:34.500,0:49:36.880 +and the syntax is kind of similar. + +0:49:36.880,0:49:39.760 +We want to copy to this remote and + +0:49:39.920,0:49:44.020 +then we have a colon to separate +what the path is going to be. + +0:49:44.020,0:49:45.040 +And then we have + +0:49:45.040,0:49:46.620 +oh, we want to copy this as notes + +0:49:46.620,0:49:51.000 +but we could also copy this as foobar. + +0:49:51.740,0:49:55.600 +And if we do that, it has been executed + +0:49:55.780,0:49:59.280 +and it's telling us that all the +contents have been copied there. + +0:49:59.540,0:50:02.200 +If you're gonna be copying a lot of files, + +0:50:02.200,0:50:05.100 +there is a better command +that you should be using + +0:50:05.100,0:50:07.740 +that is called "RSYNC". For example, here + +0:50:07.900,0:50:10.780 +just by specifying these three flags, + +0:50:10.820,0:50:15.960 +I'm telling RSYNC to kind of preserve +all the permissions whenever possible + +0:50:16.240,0:50:19.740 +to try to check if the file +has already been copied. + +0:50:19.740,0:50:24.100 +For example, SCP will try to copy +files that are already there. + +0:50:24.200,0:50:26.440 +This will happen for example +if you are trying to copy + +0:50:26.440,0:50:29.060 +and the connection interrupts +in the middle of it. + +0:50:29.120,0:50:32.060 +SCP will start from the very beginning, +trying to copy every file, + +0:50:32.080,0:50:36.600 +whereas RSYNC will continue +from where it stopped. + +0:50:37.240,0:50:38.440 +And here, + +0:50:39.060,0:50:42.760 +we ask it to copy the entire folder and + +0:50:43.780,0:50:46.560 +it's just really quickly +copied the entire folder.
+ +0:50:48.080,0:50:54.100 +One of the other things to know about SSH is that + +0:50:54.320,0:50:59.860 +the equivalent of the dot file +for SSH is the "SSH config". + +0:50:59.860,0:51:06.340 +So if we edit the SSH config to be + +0:51:13.120,0:51:17.940 +If I edit the SSH config to +look something like this, + +0:51:17.940,0:51:22.900 +instead of having to, every +time, type "ssh jjgo", + +0:51:23.040,0:51:27.760 +having this really long string so I can +like refer to this specific remote, + +0:51:27.760,0:51:30.140 +I want to refer, with the specific user name, + +0:51:30.140,0:51:32.760 +I can have something here that says + +0:51:33.160,0:51:35.680 +this is the username, this +is the host name, that this + +0:51:36.860,0:51:40.540 +host is referring to and you should +use this identity file. + +0:51:41.460,0:51:43.960 +And if I copy this, + +0:51:43.960,0:51:46.100 +this is right now in my local folder, + +0:51:46.100,0:51:49.000 +I can copy this into ssh. + +0:51:49.600,0:51:53.520 +Now, instead of having to do this really +long command, I can just say + +0:51:53.520,0:51:57.100 +I just want to SSH into the host called VM. + +0:51:58.260,0:52:03.220 +And by doing that, it's grabbing all that +configuration from the SSH config + +0:52:03.220,0:52:05.220 +and applying it here. + +0:52:05.240,0:52:10.060 +This solution is much better than something +like creating an alias for SSH, + +0:52:10.360,0:52:13.360 +because other programs like SCP and RSYNC + +0:52:13.360,0:52:19.440 +also know about the dotfiles for SSH and +will use them whenever they are there. + +0:52:22.820,0:52:30.400 +Last thing I want to cover about remote machines is +that here, for example, we'll have tmux and we can, + +0:52:31.760,0:52:35.780 +like I was saying before, we +can start editing some file + +0:52:39.160,0:52:44.500 +and we can start running some job. + +0:52:54.200,0:52:56.180 +For example, something like HTOP. 
+ +0:52:56.180,0:52:58.720 +And this is running here, we can + +0:52:59.320,0:53:01.320 +detach from it, + +0:53:01.430,0:53:03.430 +close the connection and + +0:53:03.740,0:53:07.780 +then SSH back. And then, if you do "tmux a", + +0:53:07.780,0:53:11.340 +everything is as you left it, like +nothing has really changed. + +0:53:11.340,0:53:15.220 +And if you have things executing there in +the background, they will keep executing. + +0:53:17.500,0:53:23.300 +I think that, pretty much, ends +all I have to say for this tool. + +0:53:23.300,0:53:26.420 +Any questions related to remote machines? + +0:53:32.860,0:53:36.780 +That's a really good question. +So what I do for that, + +0:53:38.700,0:53:39.460 +Oh, yes, sorry. + +0:53:39.460,0:53:44.880 +So the question is, how do you deal with +trying to use tmux in your local machine, + +0:53:44.880,0:53:47.640 +and also trying to use tmux +in the remote machine? + +0:53:48.400,0:53:50.760 +There are a couple of tricks +for dealing with that. + +0:53:50.760,0:53:53.220 +The first one is changing the prefix. + +0:53:53.360,0:53:55.340 +So what I do, for example, is + +0:53:55.340,0:54:00.020 +in my local machine the prefix I have +changed from "Ctrl+B" to "Ctrl+A" and + +0:54:00.220,0:54:02.580 +then in remote machines this is still "Ctrl+B". + +0:54:02.800,0:54:05.580 +So I can kind of swap between, + +0:54:05.580,0:54:09.840 +if I want to do things to the +local tmux I will do "Ctrl+A" + +0:54:09.840,0:54:13.460 +and if I want to do things to the +remote tmux I would do "Ctrl+B". + +0:54:15.080,0:54:19.900 +Another thing is that you +can have separate configs, + +0:54:20.080,0:54:24.100 +so I can do something like this, and then... + +0:54:27.260,0:54:31.040 +Ah, because I don't have my own ssh config, yeah. + +0:54:32.240,0:54:33.000 +But if you... + +0:54:33.000,0:54:34.420 +Um, I can SSH "VM".
+ +0:54:36.820,0:54:38.900 +Here, what you see, + +0:54:38.900,0:54:41.000 +the difference between these +two bars, for example, + +0:54:41.000,0:54:43.680 +is because the tmux config is different. + +0:54:44.380,0:54:48.500 +As you will see in the exercises, +the tmux configuration is in + +0:54:50.320,0:54:53.780 +the tmux.conf + +0:54:56.720,0:54:58.140 +And in tmux.conf, + +0:54:58.140,0:55:02.020 +here you can do a lot of things like changing +the color depending on the host you are + +0:55:02.210,0:55:06.879 +so you can get like quick visual +feedback about where you are, or + +0:55:06.880,0:55:10.240 +if you have a nested session. Also, tmux will, + +0:55:10.520,0:55:15.280 +if you're in the same host and you +try to tmux within a tmux session, + +0:55:15.290,0:55:18.759 +it will kind of prevent you from doing +it so you don't run into issues. + +0:55:21.700,0:55:25.400 +Any other questions related, to kind +of all the topics we have covered. + +0:55:29.100,0:55:32.720 +Another answer to that question is +also, if you type the prefix twice, + +0:55:32.880,0:55:35.760 +it sends it once to the underlying shell. + +0:55:35.920,0:55:40.100 +So the local binding is "Ctrl+A" and +the remote binding is "Ctrl+A", + +0:55:40.100,0:55:45.260 +You could type "Ctrl+A", "Ctrl+A" and then "D", for +example, detaches from the remote, basically. + +0:55:52.480,0:55:59.660 +I think that ends the class for today, there's a bunch +of exercises related to all these main topics and + +0:56:00.380,0:56:05.410 +we're gonna be holding office hours today, too. +So feel free to come and ask us any questions. + diff --git a/static/files/subtitles/2020/debugging-profiling.sbv b/static/files/subtitles/2020/debugging-profiling.sbv new file mode 100644 index 00000000..678e0709 --- /dev/null +++ b/static/files/subtitles/2020/debugging-profiling.sbv @@ -0,0 +1,2573 @@ +0:00:00.000,0:00:04.200 +So welcome back. Today we are gonna +cover debugging and profiling. 
+ +0:00:04.720,0:00:09.340 +Before I get into it we're gonna make another +reminder to fill in the survey. + +0:00:09.520,0:00:14.580 +Just one of the main things we want to get +from you is questions, because the last day + +0:00:14.820,0:00:18.080 +is gonna be questions from +you guys: about things that + +0:00:18.080,0:00:22.020 +we haven't covered, or like you want +us to kind of talk more in depth. + +0:00:23.350,0:00:26.969 +The more questions we get, the more interesting +we can make that section, + +0:00:26.970,0:00:28.900 +so please go on and fill in the survey. + +0:00:28.900,0:00:35.660 +So today's lecture is gonna be a lot of topics. +All the topics revolve around the concept of + +0:00:35.820,0:00:39.920 +what do you do when you have +a program that has some bugs. + +0:00:39.920,0:00:42.520 +Which is most of the time, like when you +are programming, you're kind of thinking + +0:00:42.720,0:00:47.400 +about how you implement something and there's +like a half life of fixing all the issues that + +0:00:47.620,0:00:52.140 +that program has. And even if your program behaves +like you want, it might be that it's + +0:00:52.390,0:00:55.680 +really slow, or it's taking a lot +of resources in the process. + +0:00:55.680,0:01:00.569 +So today we're gonna see a lot of different +approaches of dealing with these problems. + +0:01:01.300,0:01:05.099 +So first, the first section is on debugging. + +0:01:06.159,0:01:08.279 +Debugging can be done in many different ways, + +0:01:08.380,0:01:10.119 +there are all kinds of... + +0:01:10.120,0:01:13.640 +The most simple approach that, pretty much, all + +0:01:13.640,0:01:17.140 +CS students will go through, will be just: +you have some code, and it's not behaving + +0:01:17.160,0:01:20.280 +like you want, so you probe the code by adding + +0:01:20.280,0:01:23.420 +print statements. This is called +"printf debugging" and + +0:01:23.440,0:01:24.450 +it works pretty well. 
+ +0:01:24.450,0:01:26.680 +Like, I have to be honest, + +0:01:26.820,0:01:33.120 +I use it a lot of the time because of how simple +to set up and how quick the feedback can be. + +0:01:34.360,0:01:39.320 +One of the issues with printf debugging +is that you can get a lot of output + +0:01:39.320,0:01:40.740 +and maybe you don't want + +0:01:40.800,0:01:43.240 +to get as much output as you're getting. + +0:01:43.780,0:01:49.349 +There has... people have thought of slightly more +complex ways of doing printf debugging and + +0:01:53.920,0:01:58.320 +one of these ways is what is usually +referred to as "logging". + +0:01:58.420,0:02:04.530 +So the advantage of doing logging versus doing printf +debugging is that, when you're creating logs, + +0:02:05.080,0:02:09.780 +you're not necessarily creating the logs because +there's a specific issue you want to fix; + +0:02:09.780,0:02:12.460 +it's mostly because you have built a + +0:02:12.480,0:02:16.840 +more complex software system and you +want to log when some events happen. + +0:02:17.360,0:02:21.560 +One of the core advantages of using +a logging library is that + +0:02:22.180,0:02:27.040 +you can define severity levels, +and you can filter based on those. + +0:02:27.400,0:02:31.620 +Let's see an example of how we +can do something like that. + +0:02:32.320,0:02:35.840 +Yeah, everything fits here. This +is a really silly example: + +0:02:36.340,0:02:37.520 +We're just gonna + +0:02:37.520,0:02:40.980 +sample random numbers and, depending +on the value of the number, + +0:02:41.120,0:02:44.720 +that we can interpret as a kind +of "how wrong things are going". + +0:02:44.740,0:02:48.760 +We're going to log the value +of the number and then + +0:02:49.340,0:02:51.640 +we can see what is going on. + +0:02:52.580,0:02:59.280 +I need to disable these formatters...
+ +0:02:59.620,0:03:03.720 +And if we were just to execute the code as it is, + +0:03:04.160,0:03:07.420 +we just get the output and we just +keep getting more and more output. + +0:03:07.420,0:03:13.599 +But you have to kind of stare at it and make +sense of what is going on, and we don't know + +0:03:13.600,0:03:19.629 +what is the relative timing between printfs, we don't really +know whether this is just an information message + +0:03:19.630,0:03:22.960 +or a message of whether something went wrong. + +0:03:23.810,0:03:25.810 +If we just go in, + +0:03:27.320,0:03:29.780 +and undo, not that one... + +0:03:34.220,0:03:37.140 +That one, we can set that formatter. + +0:03:38.620,0:03:41.600 +Now the output looks something more like this + +0:03:41.620,0:03:44.840 +So for example, if you have several different +modules that you are programming with, + +0:03:44.840,0:03:46.940 +you can identify them with like different levels. + +0:03:46.940,0:03:49.800 +Here, we have, we have debug levels, + +0:03:50.330,0:03:51.890 +we have critical + +0:03:51.890,0:03:57.540 +info, different levels. And it might be handy because +here we might only care about the error messages. + +0:03:57.740,0:04:00.640 +Like those are like, the... We have been + +0:04:00.700,0:04:03.960 +working on our code, so far so good, +and suddenly we get some error. + +0:04:03.960,0:04:06.540 +We can log that to identify where it's happening. + +0:04:06.580,0:04:11.640 +But maybe there's a lot of information +messages, but we can deal with that + +0:04:12.709,0:04:16.809 +by just changing the level to error level. + +0:04:17.400,0:04:17.900 +And + +0:04:18.890,0:04:22.960 +now if we were to run this again, +we are only going to get those + +0:04:23.620,0:04:28.160 +errors in the output, and we can just look through +those to make sense of what is going on. 
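The severity-level filtering described above can be sketched in Python with the standard `logging` module. This is a minimal sketch, not the script shown on screen in the lecture: the logger name `sampler`, the thresholds in `severity_for`, and the format string are all assumptions made for illustration.

```python
import io
import logging
import random


def severity_for(value):
    """Map a sample in [0, 1) to a logging level (thresholds are arbitrary)."""
    if value < 0.5:
        return logging.INFO
    if value < 0.8:
        return logging.WARNING
    if value < 0.95:
        return logging.ERROR
    return logging.CRITICAL


def run(num_samples, level, seed=0):
    """Log num_samples random values; return the text that passed the level filter."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    # Timestamp, severity and logger name are what plain print() does not give us.
    handler.setFormatter(
        logging.Formatter("%(asctime)s : %(levelname)s : %(name)s : %(message)s")
    )
    logger = logging.getLogger("sampler")  # hypothetical logger name
    logger.handlers = [handler]            # replace any handlers from earlier calls
    logger.propagate = False
    logger.setLevel(level)                 # messages below this level are dropped

    rng = random.Random(seed)
    for _ in range(num_samples):
        value = rng.random()
        logger.log(severity_for(value), "value is %.3f", value)
    return stream.getvalue()


if __name__ == "__main__":
    # Only ERROR and CRITICAL messages survive the filter.
    print(run(20, logging.ERROR), end="")
```

Passing `logging.DEBUG` instead of `logging.ERROR` lets every message through, which is roughly the unfiltered output the lecture starts from.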
+ +0:04:28.920,0:04:33.320 +Another really useful tool when +you're dealing with logs is + +0:04:34.130,0:04:36.670 +As you kind of look at this, + +0:04:36.670,0:04:42.580 +it has become easier because now we have this critical +and error levels that we can quickly identify. + +0:04:43.310,0:04:46.750 +But since humans are fairly visual creatures, + +0:04:48.680,0:04:53.109 +one thing that you can do is use +colors from your terminal to + +0:04:53.630,0:04:57.369 +identify these things. So now, +changing the formatter, + +0:04:57.369,0:05:03.320 +what I've done is slightly change +how the output is formatted. + +0:05:03.580,0:05:09.340 +When I do that, now whenever I get a warning +message, it's color coded by yellow; + +0:05:09.340,0:05:10.880 +whenever I get like an error, + +0:05:10.960,0:05:16.140 +faded red; and when it's critical, I have a +bold red indicating something went wrong. + +0:05:16.280,0:05:22.620 +And here it's a really short output, but when you start +having thousands and thousands of lines of log, + +0:05:22.620,0:05:26.380 +which is not unrealistic and happens +every single day in a lot of apps, + +0:05:27.140,0:05:32.500 +quickly browsing through them and identifying +where the error or the red patches are + +0:05:32.600,0:05:35.320 +can be really useful. + +0:05:35.600,0:05:41.400 +A quick aside is, you might be curious about +how the terminal is displaying these colors. + +0:05:41.580,0:05:45.320 +At the end of the day, the terminal +is only outputting characters. + +0:05:47.160,0:05:49.480 +Like, how is this program or how +are other programs, like LS, + +0:05:50.060,0:05:56.050 +that has all these fancy colors. How are they telling the +terminal that it should use these different colors? + +0:05:56.360,0:05:58.779 +This is nothing extremely fancy, + +0:05:59.440,0:06:03.440 +what these tools are doing, is +something along these lines. + +0:06:03.740,0:06:04.540 +Here we have... 
+ +0:06:05.420,0:06:08.340 +I can clear the rest of the output, +so we can focus on this. + +0:06:08.660,0:06:14.000 +There's some special characters, +some escape characters here, + +0:06:14.260,0:06:19.740 +then we have some text and then we have some other +special characters. And if we execute this line + +0:06:19.940,0:06:22.360 +we get a red "This is red". + +0:06:22.480,0:06:26.640 +And you might have picked up on the +fact that we have a "255;0;0" here, + +0:06:26.720,0:06:31.400 +this is just telling the RGB values of +the color we want in the terminal. + +0:06:31.400,0:06:38.100 +And you pretty much can do this in any piece of code that +you have, and like that you can color code the output. + +0:06:38.100,0:06:42.540 +Your terminal is fairly fancy and supports +a lot of different colors in the output. + +0:06:42.550,0:06:45.400 +This is not even all of them, this +is like a sixteenth of them. + +0:06:46.100,0:06:49.119 +I think it can be fairly useful +to know about that. + +0:06:52.100,0:06:55.960 +Another thing is maybe you don't +enjoy or you don't think + +0:06:56.200,0:06:58.620 +logs are really fit for you. + +0:06:58.620,0:07:02.480 +The thing is a lot of other systems that +you might start using will use logs. + +0:07:02.840,0:07:05.360 +As you start building larger and larger systems, + +0:07:05.360,0:07:10.140 +you might rely on other dependencies. Common +dependencies might be web servers or + +0:07:10.220,0:07:12.320 +databases, it's a really common one. + +0:07:12.440,0:07:17.740 +And those will be logging their errors +or exceptions in their own logs. + +0:07:17.740,0:07:20.540 +Of course, you will get some client-side error, + +0:07:20.620,0:07:25.140 +but those sometimes are not informative enough +for you to figure out what is going on. + +0:07:25.900,0:07:33.940 +In most UNIX systems, the logs are usually +placed under a folder called "/var/log" + +0:07:33.940,0:07:37.980 +and if we list it, we can see there's +a bunch of logs in here. 
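Going back to the colored "This is red" output discussed above: one common way to produce it is a 24-bit ("truecolor") escape sequence, where `255;0;0` is the RGB triple. The sketch below assumes a terminal that supports truecolor; the helper name `truecolor` is hypothetical, and the exact command typed in the lecture is not visible in the transcript.

```python
ESC = "\x1b"            # the escape character that starts the control sequence
RESET = ESC + "[0m"     # resets all text attributes back to the default


def truecolor(text, red, green, blue):
    """Wrap text in a 24-bit foreground-color escape sequence."""
    return f"{ESC}[38;2;{red};{green};{blue}m{text}{RESET}"


if __name__ == "__main__":
    # On a truecolor-capable terminal this line is displayed in red.
    print(truecolor("This is red", 255, 0, 0))
```

The terminal never sees anything but characters; it is the terminal emulator that interprets the sequence between the escape character and `m` as a color change.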
+ +0:07:42.680,0:07:48.040 +So we have like the shutdown monitor +log, or some weekly logs. + +0:07:49.669,0:07:56.199 +Things related to the Wi-Fi, for +example. And if we output the + +0:07:57.560,0:08:00.840 +System log, which contains a lot +of information about the system, + +0:08:00.840,0:08:03.940 +we can get information about what's going on. + +0:08:04.120,0:08:06.780 +Similarly, there are tools that will let you + +0:08:07.460,0:08:13.090 +more sanely go through this output. +But here, looking at the system log, + +0:08:13.090,0:08:15.520 +I can look at this, and say: + +0:08:15.760,0:08:20.040 +oh there's some service that is +exiting with some abnormal code + +0:08:20.420,0:08:25.460 +and based on that information, I can go +and try to figure out what's going on, + +0:08:25.510,0:08:27.500 +like what's going wrong. + +0:08:29.020,0:08:32.000 +One thing to know when you're +working with logs is that + +0:08:32.000,0:08:35.900 +more traditionally, every software had their own + +0:08:35.920,0:08:42.540 +log, but it has been increasingly more popular to have +a unified system log where everything is placed. + +0:08:43.010,0:08:49.299 +Pretty much any application can log into the system +log, but instead of being in a plain text format, + +0:08:49.300,0:08:52.380 +it will be compressed in some special format. + +0:08:52.380,0:08:56.460 +An example of this, it was what we covered +in the data wrangling lecture. + +0:08:56.520,0:08:59.900 +In the data wrangling lecture we +were using the "journalctl", + +0:09:00.200,0:09:04.280 +which is accessing the log and +outputting all that output. + +0:09:04.340,0:09:07.380 +Here in Mac, now the command is "log show", + +0:09:07.380,0:09:10.020 +which will display a lot of information. + +0:09:10.100,0:09:15.760 +I'm gonna just display the last ten seconds, +because logs are really, really verbose and + +0:09:17.060,0:09:23.720 +just displaying the last 10 seconds is still +gonna output a fairly large amount of lines. 
+ +0:09:23.900,0:09:28.240 +So if we go back through what's going on, + +0:09:28.240,0:09:33.460 +we here see that a lot of Apple things +are going on, since this is a macbook. + +0:09:33.500,0:09:38.460 +Maybe we could find errors about +like some system issue here. + +0:09:39.280,0:09:46.920 +Again they're fairly verbose, so you might want +to practice your data wrangling techniques here, + +0:09:46.920,0:09:50.440 +like 10 seconds equal to like 500 +lines of logs, so you can kind of + +0:09:50.960,0:09:54.960 +get an idea of how many lines +per second you're getting. + +0:09:56.360,0:10:01.060 +They're not only useful for figuring +out some other programs' output, + +0:10:01.060,0:10:05.619 +they're also useful for you, if you want to +log there instead of into your own file. + +0:10:05.779,0:10:11.319 +So using the "logger" command, +in both linux and mac, + +0:10:11.839,0:10:13.480 +You can say okay + +0:10:13.480,0:10:18.880 +I'm gonna log this "Hello Logs" +into this system log. + +0:10:18.880,0:10:21.939 +We execute the command and then + +0:10:22.760,0:10:27.640 +we can check by going through +the last minute of logs, + +0:10:27.640,0:10:31.760 +since it's gonna be fairly recent, +and grepping for that "Hello" + +0:10:31.760,0:10:38.260 +we find our entry. Fairly recent entry, that +we just created that said "Hello Logs". + +0:10:39.220,0:10:46.840 +As you become more and more familiar with +these tools, you will find yourself using + +0:10:48.800,0:10:51.279 +the logs more and more often, since + +0:10:51.529,0:10:56.349 +even if you have some bug that you haven't detected, +and the program has been running for a while, + +0:10:56.349,0:11:02.240 +maybe the information is already in the log and can +tell you enough to figure out what is going on. + +0:11:02.800,0:11:08.260 +However, printf debugging is not everything. +So now I'm going to be covering debuggers. + +0:11:08.260,0:11:10.380 +But first any questions on logs so far? 
+ +0:11:11.720,0:11:15.040 +So what kind of things can you +figure out from the logs? + +0:11:15.040,0:11:18.800 +like this Hello Logs says that you did +something with Hello at that time? + +0:11:18.940,0:11:25.040 +Yeah, like say, for example, I can +write a bash script that detects... + +0:11:25.060,0:11:29.480 +Well, that checks every time what +Wi-Fi network I'm connected to. + +0:11:29.480,0:11:34.150 +And every time it detects that it has changed, +it makes an entry in the logs and says + +0:11:34.150,0:11:37.440 +Oh now it looks like we have +changed Wi-Fi networks. + +0:11:37.440,0:11:41.400 +and then you might go back and parse +through the logs and take like, okay + +0:11:41.510,0:11:47.559 +When did my computer change from one Wi-Fi network to +another. And this is just kind of a simple example + +0:11:47.560,0:11:50.260 +But there are many, many ways, + +0:11:50.660,0:11:54.020 +many types of information that +you could be logging here. + +0:11:54.020,0:11:59.040 +More commonly, you will probably want to +check if your computer, for example, is + +0:11:59.100,0:12:02.540 +entering sleep, for example, +for some unknown reason. + +0:12:02.680,0:12:04.660 +Like it's on hibernation mode. + +0:12:04.820,0:12:09.100 +There's probably some information in the +logs about who asked that to happen, + +0:12:09.100,0:12:10.240 +or why it's that happening. + +0:12:11.720,0:12:14.880 +Any other questions? Okay. + +0:12:14.880,0:12:17.380 +So when printf debugging is not enough, + +0:12:18.320,0:12:22.360 +the best alternative after that is using... + +0:12:23.360,0:12:25.360 +[Exit that] + +0:12:28.480,0:12:30.260 +So, it's using a debugger. + +0:12:30.580,0:12:37.620 +So a debugger is a tool that will wrap around +your code and will let you run your code, + +0:12:38.120,0:12:40.480 +but it will kind of keep control over it. + +0:12:40.480,0:12:42.500 +So it will let you step + +0:12:42.500,0:12:47.080 +through the code and execute +it and set breakpoints. 
+ +0:12:47.080,0:12:50.020 +You probably have seen debuggers +in some way, if you have + +0:12:50.020,0:12:55.800 +ever used something like an IDE, because IDEs have this +kind of fancy: set a breakpoint here, execute, ... + +0:12:56.080,0:12:59.040 +But at the end of the day what +these tools are using is just + +0:12:59.040,0:13:04.740 +these command line debuggers and they're just +presenting them in a really fancy format. + +0:13:04.850,0:13:09.969 +Here we have a completely broken bubble +sort, a simple sorting algorithm. + +0:13:10.000,0:13:11.560 +Don't worry about the details, + +0:13:11.560,0:13:14.980 +but we just want to sort this +array that we have here. + +0:13:17.360,0:13:19.460 +We can try doing that by just doing + +0:13:21.340,0:13:23.340 +Python bubble.py + +0:13:23.500,0:13:28.360 +And when we do that... Oh there's some +index error, list index out of range. + +0:13:28.480,0:13:31.200 +We could start adding prints + +0:13:31.200,0:13:33.740 +but if have a really long string, +we can get a lot of information. + +0:13:33.820,0:13:37.820 +So how about we go up to the +moment that we crashed? + +0:13:37.900,0:13:41.020 +We can go to that moment and examine what the + +0:13:41.020,0:13:43.360 +current state of the program was. + +0:13:43.520,0:13:49.080 +So for doing that I'm gonna run the +program using the Python debugger. + +0:13:49.080,0:13:53.820 +Here I'm using technically the ipython debugger, +just because it has nice coloring syntax + +0:13:54.060,0:13:59.140 +so it's probably easier for +both of us to understand + +0:13:59.300,0:14:01.300 +what's going on in the output. + +0:14:01.310,0:14:04.929 +But they're pretty much identical anyway. + +0:14:05.140,0:14:09.400 +So we execute this, and now we are given a prompt + +0:14:09.400,0:14:13.080 +where we're being told that we are here, +at the very first line of our program. + +0:14:13.100,0:14:15.440 +And we can... 
+ +0:14:15.980,0:14:20.380 +"L" stands for "List", so as +with many of these tools + +0:14:21.140,0:14:24.400 +there's kind of like a language +of operations that you can do, + +0:14:24.400,0:14:28.220 +and they are often mnemonic, as it +was the case with VIM or TMUX. + +0:14:28.860,0:14:32.940 +So here, "L" is for "Listing" the code, +and we can see the entire code. + +0:14:34.540,0:14:38.880 +"S" is for "Step" and will let us kind of one + +0:14:38.880,0:14:42.180 +line at a time, go through the execution. + +0:14:42.300,0:14:47.360 +The thing is we're only triggering +the error some time later. + +0:14:47.360,0:14:48.710 +So + +0:14:48.710,0:14:55.150 +we can restart the program and instead of +trying to step until we get to the issue, + +0:14:55.150,0:15:00.820 +we can just ask for the program to continue +which is the "C" command and + +0:15:01.480,0:15:04.160 +hey, we reached the issue. + +0:15:04.640,0:15:08.080 +We got to this line where everything crashed, + +0:15:08.080,0:15:11.020 +we're getting this list index out of range. + +0:15:11.020,0:15:13.560 +And now that we are here we can say, huh? + +0:15:14.120,0:15:17.520 +Okay, first, let's print the value of the array. + +0:15:18.080,0:15:21.520 +This is the value of the current array + +0:15:23.120,0:15:26.840 +So we have six items. Okay. What +is the value of "J" here? + +0:15:27.200,0:15:31.929 +So we look at the value of "J". "J" is 5 +here, which will be the last element, but + +0:15:32.480,0:15:37.119 +"J" plus 1 is going to be 6, so that's +triggering the out of bounds error. + +0:15:37.970,0:15:40.389 +So what we have to do is + +0:15:40.660,0:15:47.660 +this "N", instead of "N" has to be "N minus one". +We have identified that the error lies there. + +0:15:47.660,0:15:50.800 +So we can quit, which is "Q". + +0:15:52.010,0:15:54.729 +Again, because it's a post-mortem debugger. 
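The out-of-range error found above can be reproduced with a small sketch. The lecture's own `bubble.py` is not shown in the transcript, so the code below is an assumed reconstruction: the function names and the sample list are hypothetical, and only the index bug is modeled (the swap is written correctly as a single tuple assignment).

```python
def bubble_sort_buggy(arr):
    """Inner loop runs j up to n - 1, so arr[j + 1] reads past the end."""
    n = len(arr)
    for i in range(n):
        for j in range(n):  # bug: the last j is n - 1, and arr[n] does not exist
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


def bubble_sort(arr):
    """Same algorithm with the inner bound fixed to n - 1."""
    n = len(arr)
    for i in range(n):
        for j in range(n - 1):  # the last j is n - 2, so arr[j + 1] is valid
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


if __name__ == "__main__":
    try:
        bubble_sort_buggy([4, 2, 1, 8, 7, 6])
    except IndexError as exc:
        print("buggy version failed:", exc)
    print(bubble_sort([4, 2, 1, 8, 7, 6]))
```

With six items, `j` reaches 5 in the buggy version and `arr[6]` raises `IndexError`, which matches what inspecting `j` in the debugger showed.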
+ +0:15:56.090,0:16:00.219 +We go back to the code and say okay, + +0:16:02.860,0:16:06.180 +we need to append this "N minus one". + +0:16:06.760,0:16:11.140 +That will prevent the list index out of range and + +0:16:11.480,0:16:14.260 +if we run this again without the debugger, + +0:16:15.020,0:16:18.729 +okay, no errors now. But this +is not our sorted list. + +0:16:18.729,0:16:21.200 +This is sorted, but it's not our list. + +0:16:21.300,0:16:23.000 +We are missing entries from our list, + +0:16:23.160,0:16:27.420 +so there is some behavioral issue +that we're reaching here. + +0:16:27.920,0:16:32.409 +Again, we could start using printf +debugging but kind of a hunch now + +0:16:32.409,0:16:37.940 +is that probably the way we're swapping entries +in the bubble sort program is wrong. + +0:16:38.480,0:16:45.920 +We can use the debugger for this. We can go through +them to the moment we're doing a swap and + +0:16:46.120,0:16:48.320 +check how the swap is being performed. + +0:16:48.540,0:16:50.600 +So a quick overview, + +0:16:50.600,0:16:56.590 +we have two for loops and +in the most nested loop, + +0:16:56.720,0:17:03.220 +we are checking if the array is larger than the other array. +The thing is if we just try to execute until this line, + +0:17:03.589,0:17:06.609 +it's only going to trigger +whenever we make a swap. + +0:17:06.700,0:17:11.640 +So what we can do is we can set +a breakpoint in the sixth line. + +0:17:11.820,0:17:15.520 +We can create a breakpoint in this line and then + +0:17:15.580,0:17:20.820 +the program will execute and the moment we try to swap +variables is when the program is going to stop. + +0:17:21.080,0:17:22.940 +So we create a breakpoint there + +0:17:22.940,0:17:27.000 +and then we continue the execution +of the program. The program halts + +0:17:27.000,0:17:30.520 +and says hey, I have executed +and I have reached this line. 
+ +0:17:30.820,0:17:31.860 +Now + +0:17:31.920,0:17:39.120 +I can use "locals()", which is a Python function +that returns a dictionary with all the values + +0:17:39.120,0:17:41.220 +to quickly see the entire context. + +0:17:43.100,0:17:48.140 +The string, the array is fine and is +six, again, just the beginning and + +0:17:48.680,0:17:51.100 +I step, go to the next line. + +0:17:51.780,0:17:52.620 +Oh, + +0:17:52.620,0:17:57.000 +and I identify the issue: I'm swapping one +item at a time, instead of simultaneously, + +0:17:57.020,0:18:01.840 +so that's what's triggering the fact that +we're losing variables as we go through. + +0:18:03.200,0:18:06.729 +That's kind of a very simple example, but + +0:18:07.490,0:18:09.050 +debuggers are really powerful. + +0:18:09.050,0:18:13.320 +Most programming languages will +give you some sort of debugger, + +0:18:13.540,0:18:19.920 +and when you go to more low level debugging +you might run into tools like... + +0:18:19.920,0:18:21.920 +You might want to use something like + +0:18:25.340,0:18:27.340 +GDB. + +0:18:31.580,0:18:34.360 +And GDB has one nice property: + +0:18:34.460,0:18:37.740 +GDB works really well with C/C++ +and all these C-like languages. + +0:18:37.780,0:18:42.720 +But GDB actually lets you work with pretty +much any binary that you can execute. + +0:18:42.720,0:18:47.800 +So for example here we have sleep, which is just +a program that's going to sleep for 20 seconds. + +0:18:48.520,0:18:55.340 +It's loaded and then we can do run, and then we +can interrupt this sending an interrupt signal. + +0:18:55.340,0:19:02.020 +And GDB is displaying for us, here, very low-level +information about what's going on in the program. + +0:19:02.030,0:19:06.820 +So we're getting the stack trace, we're seeing +we are in this nanosleep function, + +0:19:07.060,0:19:11.660 +we can see the values of all the hardware +registers in your machine. 
So + +0:19:12.300,0:19:17.160 +you can get a lot of low-level +detail using these tools. + +0:19:18.560,0:19:22.520 +I think that's all I want to cover for debuggers. + +0:19:22.520,0:19:25.540 +Any questions related to that? + +0:19:33.520,0:19:39.040 +Another interesting tool when you're trying to +debug is that sometimes you want to debug as if + +0:19:39.480,0:19:42.220 +your program is a black box. + +0:19:42.220,0:19:46.059 +So you, maybe, don't know the internals +of the program but at the same time + +0:19:46.430,0:19:52.119 +your computer knows whenever your program +is trying to do some operations. + +0:19:52.280,0:19:54.729 +So this is in UNIX systems, + +0:19:54.760,0:19:58.060 +there's this notion of like user +level code and kernel level code. + +0:19:58.060,0:20:03.180 +And when you try to do some operations like reading +a file or like reading from a network connection + +0:20:03.340,0:20:06.020 +you will have to do something +called system calls. + +0:20:06.180,0:20:12.560 +You can get a program and go through +those operations and ask + +0:20:14.000,0:20:18.300 +what operations did this software do? + +0:20:18.300,0:20:20.920 +So for example, if you have +like a Python function + +0:20:20.980,0:20:26.660 +that is only supposed to do a mathematical operation +and you run it through this program, + +0:20:26.660,0:20:28.460 +and it's actually reading files, + +0:20:28.460,0:20:31.940 +Why is it reading files? It shouldn't +be reading files. So, let's see. + +0:20:34.520,0:20:37.200 +This is "strace". + +0:20:37.200,0:20:38.740 +So for example, we can do something like this. + +0:20:38.740,0:20:41.260 +So here we're gonna run "ls -l" + +0:20:42.220,0:20:47.900 +And then we're ignoring the output of LS, but +we are not ignoring the output of STRACE. + +0:20:47.900,0:20:49.740 +So if we execute that... + +0:20:52.300,0:20:54.720 +We're gonna get a lot of output. 
+ +0:20:54.920,0:20:58.740 +This is all the different system calls + +0:21:00.520,0:21:02.080 +That this + +0:21:02.090,0:21:07.510 +LS has executed. You will see a bunch +of OPEN, you will see FSTAT. + +0:21:08.150,0:21:14.170 +And for example, since it has to list all the properties +of the files that are in this folder, we can + +0:21:15.110,0:21:20.410 +check for the LSTAT call. So the LSTAT call will +check for the properties of the files and + +0:21:21.020,0:21:27.420 +we can see that, effectively, all the files +and folders that are in this directory + +0:21:27.700,0:21:31.540 +have been accessed through +a system call, through LS. + +0:21:34.120,0:21:43.400 +Interestingly, sometimes you actually +don't need to run your code to + +0:21:44.360,0:21:47.000 +figure out that there is something +wrong with your code. + +0:21:47.960,0:21:52.449 +So far we have seen enough ways of identifying +issues by running the code, + +0:21:52.450,0:21:54.410 +but what if you... + +0:21:54.410,0:21:58.980 +you can look at a piece of code like this, like +the one I have shown right now in this screen, + +0:21:58.980,0:22:00.560 +and identify an issue. + +0:22:00.560,0:22:02.030 +So for example here, + +0:22:02.030,0:22:06.670 +we have some really silly piece of code. It +defines a function, prints a few variables, + +0:22:07.720,0:22:11.780 +multiplies some variables, it sleeps for +a while and then we try to print BAZ. + +0:22:12.020,0:22:14.840 +And you could try to look at +this and say, hey, BAZ has + +0:22:15.500,0:22:20.650 +never been defined anywhere. This is a new +variable. You probably meant to say BAR + +0:22:20.650,0:22:22.540 +but you just mistyped it. + +0:22:22.540,0:22:26.480 +Thing is, if we try to run this program, + +0:22:28.820,0:22:36.820 +it's gonna take 60 seconds, because like we have to wait until +this time.sleep function finishes. 
Here, sleep is just for + +0:22:37.790,0:22:42.070 +motivating the example but in general you may +be loading a data set that takes really long + +0:22:42.140,0:22:44.740 +because you have to copy everything into memory. + +0:22:44.740,0:22:48.780 +And the thing is, there are programs +that will take source code as input, + +0:22:49.340,0:22:54.940 +will process it and will say, oh probably this is +wrong about this piece of code. So in Python, + +0:22:55.760,0:23:00.600 +or in general, these are called +static analysis tools. + +0:23:00.780,0:23:02.860 +In Python we have for example pyflakes. + +0:23:02.860,0:23:06.640 +If we get this piece of code +and run it through pyflakes, + +0:23:06.860,0:23:09.820 +pyflakes is gonna give us a couple of issues. + +0:23:10.040,0:23:15.700 +First one is the one.... The second one is the one +we identified: here's an undefined name called BAZ. + +0:23:15.700,0:23:17.760 +You probably should be doing +something about that. + +0:23:17.760,0:23:22.720 +And the other one is like +oh, you're redefining the + +0:23:23.060,0:23:27.240 +the FOO variable name in that line. + +0:23:27.540,0:23:31.400 +So here we have a FOO function +and then we are kind of + +0:23:31.400,0:23:34.620 +shadowing that function by +using a loop variable here. + +0:23:34.760,0:23:38.460 +So now that FOO function that we +defined is not accessible anymore + +0:23:38.470,0:23:41.650 +and then if we try to call it afterwards, +we will get into errors. + +0:23:43.520,0:23:45.520 +There are other types of + +0:23:46.250,0:23:53.170 +Static Analysis tools. MYPY is a different one. MYPY +is gonna report the same two errors, but it's also + +0:23:53.840,0:24:00.160 +going to complain about type checking. So it's gonna +say, oh here you're multiplying an int by a float and + +0:24:00.680,0:24:06.320 +if you care about the type checking of your +code, you should not be mixing those up. 
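To make "programs that take source code as input" concrete, here is a rough sketch built on Python's standard `ast` module. It is not how pyflakes or mypy work internally; the function `undefined_names` and the `SAMPLE` source are assumptions made up for illustration, loosely modelled on the snippet described above. The point is that the source is parsed but never executed, so the 60 second sleep costs nothing.

```python
import ast
import builtins

# Assumed stand-in for the code on screen; it is parsed, never run.
SAMPLE = """
import time

def foo():
    return 42

for foo in range(5):
    print(foo)
bar = 1
bar *= 0.2
time.sleep(60)
print(baz)
"""


def undefined_names(source: str) -> list[str]:
    """Return names that are read somewhere but never bound anywhere.

    Deliberately naive: it ignores scopes and ordering, so it is only a
    first approximation of what a real linter reports.
    """
    tree = ast.parse(source)
    defined = set(dir(builtins))
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, ast.arg):
            defined.add(node.arg)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    return sorted(used - defined)


print(undefined_names(SAMPLE))
```

A real tool also tracks scopes and the order of definitions, which is how it can additionally warn that the loop variable shadows the `foo` function.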
+ +0:24:07.490,0:24:12.219 +it can be kind of inconvenient, having to run +this, look at the line, going back to your + +0:24:12.800,0:24:17.409 +VIM or like your editor, and figuring +out what the error matches to. + +0:24:18.380,0:24:21.190 +There are already solutions for that. One + +0:24:22.340,0:24:27.069 +way is that you can integrate most +editors with these tools and here.. + +0:24:28.279,0:24:34.059 +You can see there is like some red highlighting on +the bash, and it will read the last line here. + +0:24:34.059,0:24:36.059 +So, undefined named 'baz'. + +0:24:36.160,0:24:39.080 +So as I'm editing this piece of Python code, + +0:24:39.080,0:24:43.360 +my editor is gonna give me feedback +about what's going wrong with this. + +0:24:43.560,0:24:48.480 +Or like here have another one saying +the redefinition of unused foo. + +0:24:49.849,0:24:51.849 +And + +0:24:53.080,0:24:56.060 +even, there are some stylistic complaints. + +0:24:56.060,0:24:58.060 +So, oh, I will expect two empty lines. + +0:24:58.120,0:25:03.660 +So like in Python, you should be having two +empty lines between a function definition. + +0:25:05.779,0:25:07.009 +There are... + +0:25:07.009,0:25:09.280 +there is a resource on the lecture notes + +0:25:09.280,0:25:13.160 +about pretty much static analyzers for a +lot of different programming languages. + +0:25:13.700,0:25:18.460 +There are even static analyzers for English. + +0:25:18.840,0:25:24.260 +So I have my notes + +0:25:24.580,0:25:30.280 +for the class here, and if I run it through this +static analyzer for English, that is "writegood". + +0:25:30.409,0:25:33.008 +It's going to complain about +some stylistic properties. + +0:25:33.009,0:25:33.489 +So like, oh, + +0:25:33.489,0:25:37.460 +I'm using "very", which is a weasel +word and I shouldn't be using it. 
+ +0:25:37.480,0:25:43.080 +Or "quickly" can weaken meaning, and you can have +this for spell checking, or for a lot of different + +0:25:43.600,0:25:48.000 +types of stylistic analysis. + +0:25:48.760,0:25:52.020 +Any questions so far? + +0:25:57.500,0:25:59.490 +Oh, + +0:25:59.490,0:26:01.490 +I forgot to mention... + +0:26:01.640,0:26:07.320 +Depending on the task that you're performing, +there will be different types of debuggers. + +0:26:07.320,0:26:09.740 +For example, if you're doing web development, + +0:26:09.860,0:26:13.520 +both Firefox and Chrome + +0:26:13.740,0:26:20.600 +have a really really good set of tools +for doing debugging for websites. + +0:26:20.600,0:26:23.880 +So here we go and say inspect element, + +0:26:23.880,0:26:25.880 +we can get the... do you know? +how to make this larger... + +0:26:27.660,0:26:29.220 +We're getting + +0:26:29.220,0:26:33.380 +the entire source code for +the web page for the class. + +0:26:35.549,0:26:37.549 +Oh, yeah, here we go. + +0:26:38.640,0:26:40.640 +Is that better? + +0:26:40.799,0:26:47.149 +And we can actually go and change properties about +the course. So we can say... we can edit the title. + +0:26:47.400,0:26:51.280 +Say, this is not a class on +debugging and profiling. + +0:26:51.620,0:26:53.940 +And now the code for the website has changed. + +0:26:54.120,0:26:56.000 +This is one of the reasons +why you should never trust + +0:26:56.200,0:27:00.560 +any screenshots of websites, because +they can be completely modified. + +0:27:01.320,0:27:05.030 +And you can also modify this style. +Like, here I have things + +0:27:06.120,0:27:07.559 +using the + +0:27:07.560,0:27:09.500 +the dark mode preference, + +0:27:09.680,0:27:11.900 +but we can alter that. + +0:27:11.900,0:27:16.560 +Because at the end of the day, the +browser is rendering this for us. + +0:27:17.840,0:27:21.780 +We can check the cookies, but there's +like a lot of different operations. 
+ +0:27:21.799,0:27:27.619 +There's also a built-in debugger for JavaScript, +so you can step through JavaScript code. + +0:27:27.620,0:27:34.020 +So kind of the takeaway is, depending on what you are +doing, you will probably want to search for what tools + +0:27:34.320,0:27:36.820 +programmers have built for them. + +0:27:44.880,0:27:47.630 +Now I'm gonna switch gears and + +0:27:48.200,0:27:51.800 +stop talking about debugging, which is kind +of finding issues with the code, right? + +0:27:51.800,0:27:54.200 +kind of more about the behavior, +and then start talking + +0:27:54.200,0:27:56.860 +about like how you can use profiling. + +0:27:56.860,0:27:59.240 +And profiling is how to optimize the code. + +0:28:01.100,0:28:05.940 +It might be because you want to optimize +the CPU, the memory, the network, ... + +0:28:06.330,0:28:09.889 +There are many different reasons that +you want to be optimizing it. + +0:28:10.440,0:28:14.000 +As it was the case with debugging, +the kind of first-order approach + +0:28:14.000,0:28:16.680 +that a lot of people have +experience with already is + +0:28:16.880,0:28:21.880 +oh, let's use just printf profiling, +so to say, like we can just take... + +0:28:22.770,0:28:25.610 +Let me make this larger. We can + +0:28:26.130,0:28:28.110 +take the current time here, + +0:28:28.110,0:28:34.610 +then we can check, we can do some execution +and then we can take the time again and + +0:28:35.060,0:28:37.320 +subtract it from the original time. + +0:28:37.320,0:28:39.320 +And by doing this you can kind of narrow down + +0:28:39.540,0:28:46.040 +and fence some different parts of your code and try to figure +out what is the time taken between those two parts. + +0:28:47.040,0:28:52.639 +And that's good. But sometimes it can be interesting, +the results. So here, we're sleeping for + +0:28:53.730,0:28:59.809 +0.5 seconds and the output is saying, +oh it's 0.5 plus some extra time, + +0:28:59.810,0:29:05.929 +which is kind of interesting. 
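The "take the time, run something, take the time again" approach can be sketched as follows. The helper name `wall_clock_seconds` is an assumption, and `time.perf_counter` is used here as one reasonable way to read a clock; the on-screen code may read the time differently.

```python
import time


def wall_clock_seconds(func, *args):
    """Run func(*args) and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


elapsed = wall_clock_seconds(time.sleep, 0.5)
# Typically prints a value slightly above 0.5; the excess varies run to run.
print(f"slept for {elapsed:.4f} s")
```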
And if we keep running it, +we see there's like some small error and the thing is + +0:29:06.240,0:29:11.680 +here, what we're actually measuring is what +is usually referred to as the "real time". + +0:29:12.060,0:29:14.340 +Real time is as if you get + +0:29:14.340,0:29:15.930 +like a + +0:29:15.930,0:29:19.249 +clock, and you start it when your program starts, +and you stop it when your program ends. + +0:29:19.500,0:29:23.060 +But the thing is, in your computer it is +not only your program that is running. + +0:29:23.060,0:29:27.460 +There are many other programs running +at the same time and those might + +0:29:27.760,0:29:34.640 +be the ones that are taking the CPU. +So, to try to make sense of that, + +0:29:35.790,0:29:39.259 +A lot of... you'll see a lot of programs + +0:29:40.620,0:29:43.250 +using the terminology that is + +0:29:44.100,0:29:46.760 +real time, user time and system time. + +0:29:46.760,0:29:51.460 +Real time is what I explained, which is kind of +the entire length of time from start to finish. + +0:29:51.840,0:29:59.780 +Then there is the user time, which is the amount of time +your program spent on the CPU doing user level cycles. + +0:29:59.780,0:30:06.100 +So as I was mentioning, in UNIX, you can be running +user level code or kernel level code. + +0:30:06.920,0:30:12.940 +System is kind of the opposite, it's the amount of CPU, like +the amount of time that your program spent on the CPU + +0:30:13.500,0:30:18.480 +executing kernel mode instructions. +So let's show this with an example. + +0:30:18.620,0:30:22.180 +Here I'm going to "time", which is a command, + +0:30:22.460,0:30:27.840 +a shell command that's gonna get these three metrics +for the following command, and then I'm just + +0:30:28.100,0:30:30.560 +grabbing a URL from + +0:30:31.160,0:30:36.760 +a website that is hosted in Spain. So that's gonna take +some extra time to go over there and then go back. + +0:30:37.410,0:30:39.499 +If we see, here, if we were to just... 
+ +0:30:39.780,0:30:43.670 +We have two prints, between the beginning +and the end of the program. + +0:30:43.670,0:30:49.039 +We could think that this program is taking like +600 milliseconds to execute, but actually + +0:30:49.500,0:30:56.930 +most of that time was spent just waiting for the +response on the other side of the network and + +0:30:57.330,0:31:04.880 +we actually only spent 16 milliseconds at the user level +and like 9 milliseconds at the system level, in total 25 milliseconds, actually + +0:31:05.280,0:31:08.149 +executing CURL code. Everything +else was just waiting. + +0:31:12.090,0:31:14.480 +Any questions related to timing? + +0:31:19.860,0:31:21.860 +Ok, so + +0:31:21.990,0:31:23.580 +timing can be + +0:31:23.580,0:31:29.480 +tricky, it's also kind of a black box solution. +Or if you start adding print statements, + +0:31:29.660,0:31:35.860 +it's kind of hard to add print statements, with time everywhere. +So programmers have figured out better tools. + +0:31:36.140,0:31:38.700 +These are usually referred to as "profilers". + +0:31:39.980,0:31:44.260 +One quick note that I'm gonna make, is that + +0:31:44.720,0:31:46.720 +profilers, like usually when people + +0:31:46.800,0:31:48.800 +refer to profilers they usually talk about + +0:31:49.050,0:31:55.190 +CPU profilers because they are the most common, at identifying +where like time is being spent on the CPU. + +0:31:56.790,0:31:59.180 +Profilers usually come in kind of two flavors: + +0:31:59.180,0:32:02.140 +there's tracing profilers and sampling profilers. + +0:32:02.140,0:32:06.380 +and it's kind of good to know the difference +because the output might be different. + +0:32:07.640,0:32:10.300 +Tracing profilers kind of instrument your code. + +0:32:10.680,0:32:15.799 +So they kind of execute with your code and every +time your code enters a function call, + +0:32:15.800,0:32:20.479 +they kind of take a note of it. 
It's like, oh we're entering +this function call at this moment in time and + +0:32:21.860,0:32:24.860 +they keep going and, once they +finish, they can report + +0:32:24.860,0:32:28.300 +oh, you spent this much time executing +in this function and + +0:32:28.580,0:32:33.760 +this much time in this other function. So on, so forth, +which is the example that we're gonna see now. + +0:32:34.590,0:32:38.329 +Another type of tools are tracing, +sorry, sampling profilers. + +0:32:38.430,0:32:44.840 +The issue with tracing profilers is they add a lot of overhead. +Like you might be running your code and having these kind of + +0:32:46.280,0:32:49.400 +profiling next to you making all these counts, + +0:32:49.400,0:32:54.340 +will hinder the performance of your program, so +you might get counts that are slightly off. + +0:32:55.380,0:32:59.450 +A sampling profiler, what it's gonna do +is gonna execute your program and every + +0:32:59.940,0:33:05.239 +100 milliseconds, 10 milliseconds, like some defined period, +it's gonna stop your program. It's gonna halt it, + +0:33:05.580,0:33:12.379 +it's gonna look at the stack trace and say, oh, you're +right now in this point in the hierarchy, and + +0:33:12.630,0:33:15.530 +identify which function is gonna +be executing at that point. + +0:33:16.260,0:33:19.760 +The idea is that as long as you +execute this for long enough, + +0:33:19.760,0:33:24.290 +you're gonna get enough statistics to know +where most of the time is being spent. + +0:33:25.800,0:33:28.800 +So, let's see an example of a tracing profiling. + +0:33:28.800,0:33:32.340 +So here we have a piece of +code that is just like a + +0:33:33.480,0:33:35.540 +really simple re-implementation of grep + +0:33:36.330,0:33:38.330 +done in Python. + +0:33:38.400,0:33:44.030 +What we want to check is what is the bottleneck of this +program? 
Like we're just opening a bunch of files, + +0:33:44.900,0:33:49.620 +trying to match this pattern, and then +printing whenever we find a match. + +0:33:49.620,0:33:52.340 +And maybe it's the regex, maybe it's the print... + +0:33:52.460,0:33:53.940 +We don't really know. + +0:33:53.940,0:33:59.040 +So to do this in Python, we have the "cProfile". + +0:33:59.040,0:34:00.080 +And + +0:34:00.990,0:34:06.620 +here I'm just calling this module and saying I want +to sort this by the total amount of time, that + +0:34:06.780,0:34:13.429 +we're gonna see briefly. I'm calling the +program we just saw in the editor. + +0:34:13.429,0:34:18.679 +I'm gonna execute this a thousand times +and then I want to match (the grep + +0:34:18.960,0:34:21.770 +Arguments here) is I want to match these regex + +0:34:22.919,0:34:27.469 +to all the Python files in here. +And this is gonna output some... + +0:34:30.780,0:34:34.369 +This is gonna produce some output, +then we're gonna look at it. First, + +0:34:34.369,0:34:38.539 +is all the output from the greps, +but at the very end, we're getting + +0:34:39.119,0:34:42.979 +output from the profiler itself. If we go up + +0:34:44.129,0:34:46.939 +we can see that, hey, + +0:34:47.730,0:34:55.250 +by sorting we can see that the total number of calls. So we +did 8000 calls, because we executed this 1000 times and + +0:34:57.360,0:35:03.440 +this is the total amount of time we spent in this function +(cumulative time). And here we can start to identify + +0:35:03.920,0:35:06.040 +where the bottleneck is. + +0:35:06.050,0:35:11.449 +So here, this built-in method IO open, is saying that +we're spending a lot of the time just waiting for + +0:35:12.080,0:35:14.340 +reading from the disk or... + +0:35:14.340,0:35:15.680 +There, we can check, hey, + +0:35:15.680,0:35:19.840 +a lot of time is also being spent +trying to match the regex. + +0:35:19.840,0:35:22.640 +Which is something that you will expect. 
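The same tracing profiler can also be driven from inside a program instead of from the command line. The sketch below is an illustration only: `count_matches` is a made-up stand-in for the grep-like script, and `profile_call` is a hypothetical helper. It uses the standard `cProfile` and `pstats` modules and sorts by `tottime`, the time spent in each function excluding its callees.

```python
import cProfile
import io
import pstats
import re


def count_matches(pattern, lines):
    """Tiny stand-in for the grep-like script: count lines matching a regex."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


def profile_call(func, *args, limit=10):
    """Run func under cProfile; return (result, text report sorted by tottime)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("tottime").print_stats(limit)
    return result, buffer.getvalue()


result, report = profile_call(count_matches, r"def ", ["def a():", "x = 1", "def b():"])
print(result)
print(report)
```

As in the command-line run, the report mixes the program's own functions with built-ins and library internals, so it takes some reading to find the rows that matter.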
+ +0:35:22.640,0:35:26.220 +One of the caveats of using this + +0:35:26.480,0:35:29.540 +tracing profiler is that, as you can see, here + +0:35:29.540,0:35:35.239 +we're seeing our function but we're also seeing +a lot of functions that correspond to built-ins. + +0:35:35.240,0:35:35.910 +So like, + +0:35:35.910,0:35:41.899 +functions that are third party functions from the libraries. +And as you start building more and more complex code, + +0:35:41.900,0:35:43.560 +This is gonna be much harder. + +0:35:44.200,0:35:44.760 +So + +0:35:46.080,0:35:49.720 +here is another piece of Python code that, + +0:35:51.540,0:35:53.779 +don't read through it, what it's doing is just + +0:35:54.420,0:35:57.589 +grabbing the course website and +then it's printing all the... + +0:35:58.440,0:36:01.960 +It's parsing it, and then it's printing +all the hyperlinks that it has found. + +0:36:01.960,0:36:03.520 +So there are like these two operations: + +0:36:03.520,0:36:07.800 +going there, grabbing a website, and +then parsing it, printing the links. + +0:36:07.800,0:36:09.740 +And we might want to get a sense of + +0:36:09.740,0:36:16.180 +how those two operations compare to each +other. If we just try to execute the + +0:36:16.680,0:36:18.680 +cProfiler here and + +0:36:19.260,0:36:24.949 +we're gonna do the same, this is not gonna print anything. +I'm using a tool we haven't seen so far, + +0:36:24.950,0:36:25.700 +but I think it's pretty nice. + +0:36:25.700,0:36:32.810 +It's "TAC", which is the opposite of "CAT", and it is going +to reverse the output so I don't have to go up and look. + +0:36:33.430,0:36:35.430 +So we do this and... + +0:36:36.250,0:36:39.179 +Hey, we get some interesting output. + +0:36:39.880,0:36:46.200 +we're spending a bunch of time in this built-in method +socket_getaddr_info and like in _imp_create_dynamic and + +0:36:46.510,0:36:48.540 +method_connect and posix_stat... 
+ +0:36:49.210,0:36:55.740 +nothing in my code is directly calling these functions so I +don't really know what is the split between the operation of + +0:36:56.349,0:37:03.929 +making a web request and parsing the output of +that web request. So, for that, we can use + +0:37:04.900,0:37:07.920 +a different type of profiler which is + +0:37:09.819,0:37:14.309 +a line profiler. And the line profiler is +just going to present the same results + +0:37:14.310,0:37:20.879 +but in a more human-readable way, which is just, for this +line of code, this is the amount of time things took. + +0:37:24.819,0:37:31.079 +So it knows it has to do that, we have to add a +decorator to the Python function, we do that. + +0:37:34.869,0:37:36.869 +And as we do that, + +0:37:37.119,0:37:39.749 +we now get slightly cropped output, + +0:37:39.750,0:37:46.169 +but the main idea, we can look at the percentage of time and +we can see that making this request, get operation, took + +0:37:46.450,0:37:52.829 +88% of the time, whereas parsing the +response took only 10.9% of the time. + +0:37:54.069,0:38:00.869 +This can be really informative and a lot of different programming +languages will support this type of a line profiling. + +0:38:04.569,0:38:07.439 +Sometimes, you might not care about CPU. + +0:38:07.440,0:38:15.000 +Maybe you care about the memory or like some other resource. +Similarly, there are memory profilers: in Python + +0:38:15.000,0:38:21.599 +there is "memory_profiler", for C you will have +"Valgrind". So here is a fairly simple example, + +0:38:21.760,0:38:28.530 +we just create this list with a million elements. That's +going to consume like megabytes of space and + +0:38:29.200,0:38:33.920 +we do the same, creating another +one with 20 million elements. + +0:38:34.860,0:38:38.180 +To check, what was the memory allocation? + +0:38:38.980,0:38:44.369 +How it's gonna happen, what's the consumption? 
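One way to ask that question without installing anything is the standard library's `tracemalloc` module. This is a swapped-in technique, not the `memory_profiler` tool named above, and the helper `traced_allocation` is an assumption for illustration. It reports the bytes Python has allocated while the object is still alive, plus the peak seen during the call.

```python
import tracemalloc


def traced_allocation(build):
    """Call build() under tracemalloc; return (current_bytes, peak_bytes)."""
    tracemalloc.start()
    try:
        data = build()  # keep a reference so the memory is still live
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del data
    return current, peak


current, peak = traced_allocation(lambda: [1] * (10 ** 6))
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```

A per-line memory profiler gives the same kind of numbers, broken down by source line instead of by call.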
+We can go through one memory profiler and + +0:38:44.950,0:38:46.619 +we execute it, + +0:38:46.620,0:38:51.380 +and it's telling us the total memory +usage and the increments. + +0:38:51.380,0:38:57.980 +And we can see that we have some overhead, because +this is an interpreted language and when we create + +0:38:58.450,0:39:00.599 +this million, + +0:39:03.520,0:39:07.340 +this list with a million entries, we're gonna +need this many megabytes of information. + +0:39:07.660,0:39:15.299 +Then we're getting another 150 megabytes. Then, we're freeing +this entry and that's decreasing the total amount. + +0:39:15.299,0:39:19.169 +We are not getting a negative increment because +of a bug, probably in the profiler. + +0:39:19.509,0:39:26.549 +But if you know that your program is taking a huge amount of +memory and you don't know why, maybe because you're copying + +0:39:26.920,0:39:30.269 +objects where you should be +doing things in place, then + +0:39:31.140,0:39:33.320 +using a memory profiler can be really useful. + +0:39:33.320,0:39:37.780 +And in fact there's an exercise that will +kind of walk you through that, comparing + +0:39:37.980,0:39:39.980 +an in-place version of quicksort with like a + +0:39:40.059,0:39:44.008 +non-in-place version, that keeps making new and new copies. +And if you use the memory profiler + +0:39:44.009,0:39:47.909 +you can get a really good comparison +between the two of them. + +0:39:51.069,0:39:53.459 +Any questions so far, with profiling? + +0:39:53.460,0:39:57.940 +Is the memory profiler running the +program in order to get that? + +0:39:58.140,0:40:03.180 +Yeah... you might be able to figure +out like just looking at the code. 
+ +0:40:03.180,0:40:05.759 +But as you get more and more complex +(for this code at least) + +0:40:06.009,0:40:10.738 +But you get more and more complex programs what +this is doing is running through the program + +0:40:10.739,0:40:16.739 +and for every line, at the very beginning, +it's looking at the heap and saying + +0:40:16.739,0:40:19.319 +"What are the objects that I have allocated now?" + +0:40:19.319,0:40:22.979 +"I have seven megabytes of objects", +and then goes to the next line, + +0:40:23.190,0:40:27.869 +looks again, "Oh now I have 50, +so I have now added 43 there". + +0:40:28.839,0:40:34.709 +Again, you could do this yourself by asking for those +operations in your code, every single line. + +0:40:34.920,0:40:39.899 +But that's not how you should be doing things since people +have already written these tools for you to use. + +0:40:43.089,0:40:46.078 +As it was the case with... + +0:40:51.480,0:40:58.220 +So as in the case with strace, you can +do something similar in profiling. + +0:40:58.340,0:41:03.380 +You might not care about the specific +lines of code that you have, + +0:41:03.440,0:41:08.200 +but maybe you want to check for outside events. +Like, you maybe want to check how many + +0:41:09.410,0:41:14.469 +CPU cycles your computer program is using, +or how many page faults it's creating. + +0:41:14.469,0:41:19.239 +Maybe you have like bad cache locality +and that's being manifested somehow. + +0:41:19.340,0:41:22.960 +So for that, there is the "perf" command. + +0:41:22.960,0:41:27.220 +The perf command is gonna do this, where it +is gonna run your program and it's gonna + +0:41:28.720,0:41:33.360 +keep track of all these statistics and report them back +to you. And this can be really helpful if you are + +0:41:33.680,0:41:36.060 +working at a lower level. So + +0:41:37.300,0:41:42.840 +we execute this command, I'm gonna +explain briefly what it's doing. 
+ +0:41:48.650,0:41:51.639 +And this stress program is just + +0:41:52.219,0:41:54.698 +running in the CPU, and it's +just a program to just + +0:41:54.829,0:41:59.528 +hog one CPU and like test that you can +hog the CPU. And now if we Ctrl-C, + +0:42:00.619,0:42:02.708 +we can go back and + +0:42:03.410,0:42:08.559 +we get some information about the number of +page faults that we have or the number of + +0:42:09.769,0:42:11.769 +CPU cycles that we utilize, and other + +0:42:12.469,0:42:14.329 +useful + +0:42:14.329,0:42:18.968 +metrics from our code. For some programs you can + +0:42:21.469,0:42:25.089 +look at what the functions +that were being used were. + +0:42:26.120,0:42:30.140 +So we can record what this program is doing, + +0:42:30.940,0:42:34.920 +which we don't know about because it's +a program someone else has written. + +0:42:35.240,0:42:37.240 +And + +0:42:38.180,0:42:42.279 +we can report what it was doing by looking +at the stack trace and we can say oh, + +0:42:42.279,0:42:44.279 +It's spending a bunch of time in this + +0:42:44.660,0:42:46.640 +__random_r + +0:42:46.640,0:42:53.229 +standard library function. And it's mainly because the way of hogging +a CPU is by just creating more and more pseudo-random numbers. + +0:42:53.779,0:42:55.779 +There are some other + +0:42:55.819,0:42:58.149 +functions that have not been mapped, because they + +0:42:58.369,0:43:01.448 +belong to the program, but if +you know about your program + +0:43:01.448,0:43:05.140 +you can display this information +using more flags, about perf. + +0:43:05.140,0:43:10.220 +There are really good tutorials online +about how to use this tool. 
+ +0:43:12.010,0:43:14.010 +Oh + +0:43:14.119,0:43:17.349 +One one more thing regarding +profilers is, so far, + +0:43:17.350,0:43:20.109 +we have seen that these profilers +are really good at + +0:43:20.510,0:43:25.419 +aggregating all this information and giving +you a lot of these numbers so you can + +0:43:25.790,0:43:29.739 +optimize your code or you can reason +about what is happening, but + +0:43:30.560,0:43:31.550 +the thing is + +0:43:31.550,0:43:35.949 +humans are not really good at making +sense of lots of numbers and since + +0:43:36.080,0:43:39.249 +humans are more visual creatures, it's much + +0:43:39.920,0:43:42.980 +easier to kind of have some +sort of visualization. + +0:43:42.980,0:43:48.700 +Again, programmers have already thought about +this and have come up with solutions. + +0:43:49.480,0:43:56.160 +A couple of popular ones, is a +FlameGraph. A FlameGraph is a + +0:43:56.780,0:44:00.160 +sampling profiler. So this is just running +your code and taking samples + +0:44:00.160,0:44:03.280 +And then on the y-axis here + +0:44:03.280,0:44:10.980 +we have the depth of the stack so we know that the bash function +called this other function, and this called this other function, + +0:44:11.260,0:44:14.480 +so on, so forth. And on the x-axis it's + +0:44:14.630,0:44:17.500 +not time, it's not the timestamps. + +0:44:17.500,0:44:23.290 +Like it's not this function run before, but it's just time +taken. Because, again, this is a sampling profiler: + +0:44:23.290,0:44:28.540 +we're just getting small glimpses of what was it going +on in the program. But we know that, for example, + +0:44:29.119,0:44:32.949 +this main program took the most time because the + +0:44:33.530,0:44:35.530 +x-axis is proportional to that. + +0:44:36.020,0:44:43.090 +They are interactive and they can be really useful +to identify the hot spots in your program. 
+ +0:44:44.720,0:44:50.540 +Another way of displaying information, and there is also +an exercise on how to do this, is using a call graph. + +0:44:50.720,0:44:58.320 +So a call graph is going to be displaying information, and it's gonna +create a graph of which function called which other function. + +0:44:58.620,0:45:00.940 +And then you get information about, like, + +0:45:00.940,0:45:05.770 +oh, we know that "__main__" called this +"Person" function ten times and + +0:45:06.050,0:45:08.919 +it took this much time. And as you have + +0:45:09.080,0:45:13.029 +larger and larger programs, looking at one of +these call graphs can be useful to identify + +0:45:14.270,0:45:19.689 +what piece of your code is calling this really +expensive IO operation, for example. + +0:45:24.560,0:45:30.360 +With that I'm gonna cover the last +part of the lecture, which is that + +0:45:30.360,0:45:36.600 +sometimes, you might not even know what exact +resource is constrained in your program. + +0:45:36.619,0:45:39.019 +Like how do I know how much CPU + +0:45:39.380,0:45:44.060 +my program is using, and I can quickly +look in there, or how much memory. + +0:45:44.060,0:45:46.680 +So there are a bunch of really + +0:45:46.700,0:45:49.760 +nifty tools for doing that one of them is + +0:45:50.400,0:45:53.270 +HTOP. so HTOP is an + +0:45:54.000,0:45:59.810 +interactive command-line tool and here it's +displaying all the CPUs this machine has, + +0:46:00.160,0:46:07.740 +which is 12. It's displaying the amount of memory, it says I'm +consuming almost a gigabyte of the 32 gigabytes my machine has. + +0:46:07.740,0:46:11.660 +And then I'm getting all the different processes. + +0:46:11.730,0:46:13.290 +So for example we have + +0:46:13.290,0:46:20.300 +zsh, mysql and other processes that are running in this +machine, and I can sort through the amount of CPU + +0:46:20.300,0:46:24.379 +they're consuming or through the +priority they're running at. 
+ +0:46:25.980,0:46:28.129 +We can check this, for example. Here + +0:46:28.130,0:46:30.230 +we have the stress command again + +0:46:30.230,0:46:31.470 +and we're going to + +0:46:31.470,0:46:37.040 +run it to take over four CPUs and check +that we can see that in HTOP. + +0:46:37.040,0:46:42.880 +So we did spot those four CPU +jobs, and now I have seen that + +0:46:43.710,0:46:46.429 +besides the ones we had before, +now I have this... + +0:46:50.310,0:46:56.119 +Like this "stress -c" command running +and taking a bunch of our CPU. + +0:46:56.849,0:47:03.169 +Even though you could use a profiler to get similar information to +this, the way HTOP displays this kind of in a live interactive + +0:47:03.329,0:47:07.099 +fashion can be much quicker +and much easier to parse. + +0:47:07.890,0:47:09.890 +In the notes, there's a + +0:47:10.160,0:47:15.180 +really long list of different tools for evaluating +different parts of your system. + +0:47:15.180,0:47:17.180 +So that might be tools for analyzing the + +0:47:17.180,0:47:19.720 +network performance, about looking the + +0:47:20.430,0:47:24.530 +number of IO operations, so you know +whether you're saturating the + +0:47:26.040,0:47:28.040 +the reads from your disks, + +0:47:28.829,0:47:31.429 +you can also look at what is the space usage. + +0:47:32.069,0:47:34.369 +Which, I think, here... + +0:47:38.690,0:47:44.829 +So NCDU... There's a tool called "du" +which stands for "disk usage" and + +0:47:45.440,0:47:49.480 +we have the "-h" flag for +"human readable output". + +0:47:51.740,0:47:58.959 +We can do videos and we can get output about +the size of all the files in this folder. + +0:48:08.059,0:48:10.059 +Yeah, there we go. + +0:48:10.400,0:48:15.040 +There are also interactive versions, +like HTOP was an interactive version. + +0:48:15.280,0:48:21.200 +So NCDU is an interactive version that will let me navigate +through the folders and I can see quickly that + +0:48:21.200,0:48:25.740 +oh, we have... 
This is one of the +folders for the video lectures, + +0:48:26.329,0:48:29.049 +and we can see there are these four files + +0:48:29.690,0:48:36.579 +that have like almost 9 GB each and I could +quickly delete them through this interface. + +0:48:37.760,0:48:43.839 +Another neat tool is "LSOF" which +stands for "LIST OPEN FILES". + +0:48:44.240,0:48:47.500 +Another pattern that you +may encounter is you know + +0:48:47.780,0:48:54.609 +some process is using a file, but you don't know exactly which process +is using that file. Or, similarly, some process is listening on + +0:48:55.400,0:48:59.020 +a port, but again, how do you +find out which one it is? + +0:48:59.020,0:49:00.820 +So to set an example. + +0:49:00.820,0:49:04.280 +We just run a Python HTTP server on port + +0:49:05.210,0:49:06.559 +444 + +0:49:06.559,0:49:10.899 +Running there. Maybe we don't know that +that's running, but then we can + +0:49:13.130,0:49:15.130 +use... + +0:49:17.089,0:49:19.089 +we can use LSOF. + +0:49:22.660,0:49:29.200 +Yeah, we can use LSOF, and the thing is LSOF +is gonna print a lot of information. + +0:49:30.440,0:49:32.740 +You need SUDO permissions because + +0:49:34.069,0:49:39.219 +this is gonna ask for who has all these items. + +0:49:39.829,0:49:43.929 +Since we only care about the one +who is listening on this 444 port + +0:49:44.630,0:49:46.369 +we can ask + +0:49:46.369,0:49:47.960 +grep for that. + +0:49:47.960,0:49:55.750 +And we can see, oh, there's like this Python process, with +this identifier, that is using the port and then we can + +0:49:56.660,0:49:58.009 +kill it, + +0:49:58.009,0:50:00.969 +and that terminates that process. + +0:50:02.299,0:50:06.669 +Again, there's a lot of different +tools. There's even tools for + +0:50:08.450,0:50:10.569 +doing what is called benchmarking.
+ +0:50:11.660,0:50:18.789 +So in the shell tools and scripting lecture, I said +like for some tasks "fd" is much faster than "find" + +0:50:18.950,0:50:21.519 +But like how will you check that? + +0:50:22.059,0:50:30.038 +I can test that with "hyperfine" and I have here +two commands: one with "fd" that is just + +0:50:30.500,0:50:34.029 +searching for JPEG files and +the same one with "find". + +0:50:34.579,0:50:41.079 +If I execute them, it's gonna benchmark these +scripts and give me some output about + +0:50:41.869,0:50:44.108 +how much faster "fd" is + +0:50:45.380,0:50:47.380 +compared to "find". + +0:50:47.660,0:50:52.269 +So I think that kind of concludes... +yeah, like 23 times for this task. + +0:50:52.940,0:50:55.990 +So that kind of concludes the whole overview. + +0:50:56.539,0:51:00.309 +I know that there's like a lot of different +topics and there's like a lot of + +0:51:00.650,0:51:04.539 +perspectives on doing these things, but +again I want to reinforce the idea + +0:51:04.539,0:51:08.499 +that you don't need to be a master +of all these topics but more... + +0:51:08.750,0:51:11.229 +To be aware that all these things exist. + +0:51:11.230,0:51:17.559 +So if you run into these issues you don't reinvent the wheel, +and you reuse all that other programmers have done. + +0:51:18.280,0:51:23.700 +Given that, I'm happy to take any questions related +to this last section or anything in the lecture. + +0:51:25.900,0:51:30.060 +Is there any way to sort of think about +how long a program should take? + +0:51:30.060,0:51:33.160 +You know, if it's taking a while to run + +0:51:33.160,0:51:42.840 +you know, should you be worried? Or depending on your process, let me wait +another ten minutes before I start looking at why it's taking so long. + +0:51:43.220,0:51:45.220 +Okay, so the... + +0:51:46.070,0:51:49.089 +The task of knowing how long a program + +0:51:49.090,0:51:53.920 +should run is pretty infeasible to figure out. 
+It will depend on the type of program. + +0:51:54.290,0:52:01.899 +It depends on whether you're making HTTP requests or you're +reading data... one thing that you can do is if you have + +0:52:02.390,0:52:02.980 +for example, + +0:52:02.980,0:52:10.689 +if you know you have to read two gigabytes from memory, +like from disk, and load that into memory, you can make + +0:52:11.510,0:52:16.719 +back-of-the-envelope calculation. So like that shouldn't +take longer than like X seconds because this is + +0:52:16.940,0:52:20.050 +how things are set up. Or if you are + +0:52:20.840,0:52:27.460 +reading some files from the network and you know kind of what the +network link is and they are taking say five times longer than + +0:52:27.460,0:52:29.460 +what you would expect then you could + +0:52:29.990,0:52:31.190 +try to do that. + +0:52:31.190,0:52:37.839 +Otherwise, if you don't really know. Say you're trying to do some +mathematical operation in your code and you're not really sure + +0:52:37.840,0:52:44.050 +about how long that will take you can use something +like logging and try to kind of print intermediate + +0:52:44.570,0:52:50.469 +stages to get a sense of like, oh I need +to do a thousand operations of this and + +0:52:51.800,0:52:53.600 +three iterations + +0:52:53.600,0:53:00.700 +took ten seconds. Then this is gonna take +much longer than I can handle in my case. + +0:53:00.920,0:53:04.599 +So I think there are there are ways, it +will again like depend on the task, + +0:53:04.600,0:53:08.800 +but definitely, given all the tools we've +seen really, we probably have like + +0:53:09.620,0:53:13.150 +a couple of really good ways +to start tackling that. + +0:53:14.750,0:53:16.750 +Any other questions? + +0:53:16.750,0:53:18.750 +You can also do things like + +0:53:18.750,0:53:21.060 +run HTOP and see if anything is running. + +0:53:22.380,0:53:25.500 +Like if your CPU is at 0%, something +is probably wrong. + +0:53:31.140,0:53:32.579 +Okay. 
+ +0:53:32.579,0:53:38.268 +There's a lot of exercises for all the topics +that we have covered in today's class, + +0:53:38.269,0:53:41.419 +so feel free to do the ones +that are more interesting. + +0:53:42.180,0:53:44.539 +We're gonna be holding office hours again today. + +0:53:45.059,0:53:48.979 +Just a reminder, office hours. You can come +and ask questions about any lecture. + +0:53:48.980,0:53:53.510 +Like we're not gonna expect you to kind of +do the exercises in a couple of minutes. + +0:53:53.510,0:53:57.979 +They take a really long while to get through +them, but we're gonna be there + +0:53:58.529,0:54:04.339 +to answer any questions from previous classes, or even not related +to exercises. Like if you want to know more about how you + +0:54:04.619,0:54:09.889 +would use TMUX in a way to kind of quickly switch +between panes, anything that comes to your mind. + diff --git a/static/files/subtitles/2020/qa.sbv b/static/files/subtitles/2020/qa.sbv new file mode 100644 index 00000000..8afa5289 --- /dev/null +++ b/static/files/subtitles/2020/qa.sbv @@ -0,0 +1,2874 @@ +0:00:00.000,0:00:06.540 +I guess we should do an intro to to this as well, + +0:00:06.540,0:00:09.580 +so this is a just sort of a + +0:00:09.581,0:00:14.740 +free-form Q&A lecture where you, as in +the two people sitting here, but also + +0:00:14.740,0:00:19.841 +everyone at home who did not come here +in person get to ask questions and we + +0:00:19.841,0:00:22.961 +have a bunch of questions people asked +in advance but you can also ask + +0:00:22.961,0:00:27.371 +additional questions during, for the two +of you who are here, you can do it either + +0:00:27.371,0:00:33.611 +by raising your hand or you can submit it on +the forum and be anonymous, it's up to you + +0:00:33.611,0:00:35.671 +regardless though, what we're gonna +do is just go through some of the + +0:00:35.681,0:00:40.241 +questions have been asked and try to +give as helpful answers as we can + +0:00:40.241,0:00:43.691 
+although they are unprepared on our side and + +0:00:43.791,0:00:45.611 +yeah that's the plan I guess we go + +0:00:45.611,0:00:48.911 +from popular to least popular + +0:00:48.911,0:00:49.991 +fire away + +0:00:49.991,0:00:52.091 +all right so for our first question any + +0:00:52.091,0:00:55.961 +recommendations on learning operating +system related topics like processes, + +0:00:55.961,0:00:59.861 +virtual memory, interrupts, +memory management, etc + +0:00:59.861,0:01:01.811 +so I think this is a + +0:01:01.811,0:01:07.181 +is an interesting question because these +are really low level concepts that often + +0:01:07.181,0:01:11.391 +do not matter, unless you have to +deal with this in some capacity, + +0:01:11.391,0:01:12.771 +right so + +0:01:12.891,0:01:17.671 +one instance where this matters is you're +writing really low level code like + +0:01:17.681,0:01:20.500 +you're implementing a kernel or something +like that, or you want to + +0:01:20.500,0:01:22.811 +just hack on the Linux kernel. 
+ +0:01:22.811,0:01:24.751 +It's rare otherwise that you need to work with + +0:01:24.751,0:01:27.711 +especially like virtual memory and +interrupts and stuff yourself + +0:01:27.851,0:01:32.071 +processes, I think are a more general concept +that we've talked a little bit about in + +0:01:32.071,0:01:36.611 +this class as well and tools like +htop, pgrep, kill, and signals and + +0:01:36.761,0:01:37.711 +that sort of stuff + +0:01:37.711,0:01:39.311 +in terms of learning it + +0:01:39.311,0:01:45.371 +maybe one of the best ways, is to try to +take either an introductory class on the + +0:01:45.371,0:01:51.401 +topic, so for example MIT has a class +called 6.828, which is where + +0:01:51.401,0:01:55.091 +you essentially build and develop your +own operating system based on some code + +0:01:55.091,0:01:58.631 +that you're given, and all of those labs +are publicly available and all the + +0:01:58.631,0:02:01.601 +resources for the class are publicly available, +and so that is a good way to + +0:02:01.601,0:02:04.001 +really learn them is by doing them yourself. + +0:02:04.001,0:02:05.201 +There are also various + +0:02:05.201,0:02:11.201 +tutorials online that basically guide +you through how do you write a kernel + +0:02:11.201,0:02:15.431 +from scratch. Not necessarily a very +elaborate one, not one you would want + +0:02:15.431,0:02:20.561 +to run any real software on, but just to +teach you the basics and so that would + +0:02:20.561,0:02:21.930 +be another thing to look up. + +0:02:21.930,0:02:24.131 +Like how do I write a kernel in and then your + +0:02:24.131,0:02:27.611 +language of choice. 
You will probably not +find one that lets you do it in Python + +0:02:27.611,0:02:33.612 +but in like C, C++, Rust, there +are a bunch of topics like this + +0:02:33.612,0:02:36.951 +one other note on operating systems + +0:02:36.951,0:02:39.931 +so like Jon mentioned MIT has a 6.828 class but + +0:02:39.941,0:02:43.391 +if you're looking for a more high-level +overview, not necessarily programming + +0:02:43.391,0:02:46.001 +an operating system, but just learning about +the concepts another good resource + +0:02:46.001,0:02:51.331 +is a book called "Modern Operating +Systems" by Andrew Tanenbaum + +0:02:51.331,0:02:58.371 +there's also actually a book called "The FreeBSD +Operating System" which is really good, + +0:02:58.371,0:03:03.031 +It doesn't go through Linux, but it goes +through FreeBSD and the BSD kernel is + +0:03:03.031,0:03:07.181 +arguably better organized than the Linux +one and better documented and so it + +0:03:07.181,0:03:11.591 +might be a gentler introduction to some of those +topics than trying to understand Linux + +0:03:11.591,0:03:14.951 +You want to check it as answered? + +0:03:14.951,0:03:16.511 +- Yes + Nice + +0:03:16.511,0:03:17.451 +Answered + +0:03:17.451,0:03:19.371 +For our next question + +0:03:19.371,0:03:23.951 +What are some of the tools you'd +prioritize learning first? + +0:03:23.951,0:03:29.551 +- Maybe we can all go through and +give our opinion on this? + Yeah + +0:03:29.551,0:03:31.713 +Tools to prioritize learning first? + +0:03:31.713,0:03:36.451 +I think learning your editor well, +just serves you in all capacities + +0:03:36.511,0:03:40.511 +like being efficient at editing files, +is just like a majority of + +0:03:40.511,0:03:45.041 +what you're going to spend your time doing. +And in general, just using your + +0:03:45.041,0:03:49.211 +keyboard more and your mouse less.
It means +that you get to spend more of your + +0:03:49.311,0:03:53.751 +time doing useful things and +less of your time moving + +0:03:53.751,0:03:56.251 +I think that would be my top priority, + +0:04:04.511,0:04:06.751 +so I would say that for what + +0:04:06.760,0:04:09.671 +tool to prioritize will depend +on what exactly you're doing + +0:04:09.671,0:04:16.150 +I think the core idea is you should try +to find the types of tasks that you are + +0:04:16.151,0:04:18.371 +doing repetitively and so + +0:04:18.371,0:04:23.791 +if you are doing some sort of like +machine learning workload and + +0:04:24.011,0:04:27.130 +you find yourself using jupyter notebooks, +like the one we presented + +0:04:27.130,0:04:32.560 +yesterday, a lot. Then again, using +a mouse for that might not be + +0:04:32.560,0:04:35.830 +the best idea and you want to familiarize +with the keyboard shortcuts + +0:04:35.830,0:04:40.750 +and pretty much with anything you will +end up figuring out that there are some + +0:04:40.751,0:04:45.611 +repetitive tasks, and you're running a +computer, and just trying to figure out + +0:04:45.611,0:04:48.311 +oh there's probably a better way to do this + +0:04:48.431,0:04:50.871 +be it a terminal, be it an editor + +0:04:51.111,0:04:55.891 +And it might be really interesting to +learn to use some of the topics that + +0:04:55.900,0:05:01.121 +we have covered, but if they're not +extremely useful in a everyday + +0:05:01.121,0:05:05.431 +basis then it might not worth prioritizing them + +0:05:06.591,0:05:07.451 +out of the topics + +0:05:07.531,0:05:11.611 +covered in this class in my opinion two +of the most useful things are version + +0:05:11.621,0:05:15.220 +control and text editors, and I think they're +a little bit different from each + +0:05:15.220,0:05:18.880 +other, in the sense that text editors I +think are really useful to learn well + +0:05:18.880,0:05:21.970 +but it was probably the case that before +we started using vim and all its fancy + 
+0:05:21.970,0:05:25.390 +keyboard shortcuts you had some other +text editor you were using before and + +0:05:25.390,0:05:29.890 +you could edit text just fine maybe a little +bit inefficiently whereas I think + +0:05:29.890,0:05:33.100 +version control is another really useful +skill and that's one where if you don't + +0:05:33.100,0:05:36.580 +really know the tool properly, it can actually +lead to some problems like loss + +0:05:36.580,0:05:39.490 +of data or just inability to collaborate +properly with people so I + +0:05:39.490,0:05:42.730 +think version control is one of the first +things that's worth learning well + +0:05:42.730,0:05:46.871 +yeah, I agree with that, I think +learning a tool like Git is just + +0:05:46.871,0:05:49.691 +gonna save you so much heartache down the line + +0:05:49.691,0:05:51.431 +it also, to add on to that + +0:05:51.571,0:05:57.310 +It really helps you collaborate with others +and Anish touched a little bit on GitHub + +0:05:57.310,0:06:01.300 +in the last lecture, and just learning +to use that tool well in order + +0:06:01.300,0:06:05.321 +to work on larger software projects +that other people are working on is + +0:06:05.321,0:06:06.431 +an invaluable skill + +0:06:10.071,0:06:11.391 +For our next question + +0:06:11.391,0:06:12.871 +when do I use Python versus a + +0:06:12.881,0:06:16.051 +bash script, versus some other language + +0:06:16.051,0:06:19.661 +This is tough, because I think this comes + +0:06:19.661,0:06:21.631 +down to what Jose was saying earlier too + +0:06:21.771,0:06:23.731 +that it really depends on +what you're trying to do + +0:06:23.731,0:06:27.155 +For me, I think for bash scripts in particular + +0:06:27.155,0:06:28.791 +bash scripts are for + +0:06:28.891,0:06:33.430 +automating running a bunch of commands, +you don't want to write any + +0:06:33.430,0:06:35.411 +other like business logic in bash + +0:06:35.411,0:06:39.011 +it is just for I want to run these + +0:06:39.011,0:06:44.110 +commands, in 
this order. Maybe with +arguments, but like even that + +0:06:44.110,0:06:47.581 +it's unclear do you want to bash script +once you start taking arguments + +0:06:47.581,0:06:52.691 +Similarly, once you start doing any +kind of like text processing or + +0:06:52.691,0:06:55.131 +configuration, all that + +0:06:55.131,0:06:59.111 +reach for a language that is a more serious + +0:06:59.111,0:07:01.031 +programming language than bash is + +0:07:01.091,0:07:03.451 +bash is really for sort of short one-off + +0:07:03.461,0:07:10.211 +scripts or ones that have a very well-defined +use case on the terminal in + +0:07:10.211,0:07:12.851 +the shell, probably + +0:07:12.851,0:07:15.941 +For a slightly more concrete guideline, +you might say write a + +0:07:15.941,0:07:19.211 +bash script if it's less than a hundred +lines of code or so, but once it gets + +0:07:19.211,0:07:21.611 +beyond that point bash is kind of +unwieldy and it's probably worth + +0:07:21.611,0:07:25.091 +switching to a more serious programming +language like Python + +0:07:25.091,0:07:26.511 +and to add to that + +0:07:26.511,0:07:32.211 +I would say that I found myself writing +sometimes scripts in Python because + +0:07:32.211,0:07:36.911 +If I have already solved some subproblem +that covers part of the problem in Python + +0:07:36.911,0:07:40.631 +I find it much easier to compose the +previous solution that I found out in + +0:07:40.631,0:07:45.731 +Python and just try to reuse bash code, +that I don't find as reusable as Python + +0:07:45.731,0:07:49.600 +And in the same way it's kind of nice that +a lot of people have written something + +0:07:49.600,0:07:52.631 +like Python libraries or like Ruby libraries +to do a lot of these things + +0:07:52.631,0:07:58.451 +whereas in bash is kind of hard +to have like code reuse + +0:07:58.451,0:08:01.720 +And in fact, + +0:08:01.720,0:08:07.631 +I think to add to that. 
Usually, if you +find a library in some language that + +0:08:07.631,0:08:12.091 +helps with the task you're trying to +do, use that language for the job + +0:08:12.091,0:08:15.671 +And in bash there are no libraries, there +are only the programs on your computer + +0:08:15.771,0:08:18.931 +So you probably don't want to use +it unless like there's a program + +0:08:18.941,0:08:23.741 +you can just invoke I do think another +thing worth remembering about bash + +0:08:23.741,0:08:26.451 +bash is really hard to get right. + +0:08:26.451,0:08:30.531 +It's very easy to get it right for the particular +use case you're trying to solve right now + +0:08:30.531,0:08:32.471 +but things like + +0:08:32.471,0:08:35.891 +What if one of the filenames has a space in it? + +0:08:35.891,0:08:38.891 +It has caused so many bugs and so + +0:08:38.891,0:08:43.151 +many problems in bash scripts and if you +use a real programming language then + +0:08:43.151,0:08:46.642 +those problems just go away + +0:08:46.651,0:08:50.491 +Checked it + +0:08:50.571,0:08:51.571 +For our next question + +0:08:51.571,0:08:56.211 +What is the difference between sourcing +a script and executing that script ? + +0:08:57.071,0:09:02.711 +So this actually, we got in office +hours a while back as well which is + +0:09:02.871,0:09:06.991 +Aren't they the same? like aren't they +both just running the bash script? 
+ +0:09:06.991,0:09:08.051 +and it is true + +0:09:08.051,0:09:12.191 +both of these will end up executing the +lines of code that are in the script + +0:09:12.191,0:09:16.571 +the ways in which they differ is that +sourcing a script is telling your + +0:09:16.571,0:09:22.991 +current bash script, your current bash +session to execute that program + +0:09:23.131,0:09:28.911 +whereas the other one is, start up a new instance +of bash and run the program there instead + +0:09:29.291,0:09:34.931 +And this matters for things like imagine that +"script.sh" tries to change directories + +0:09:34.931,0:09:37.841 +If you are running the script +as in the second invocation + +0:09:37.841,0:09:42.761 +"./script.sh", then the new +process is going to change + +0:09:42.761,0:09:46.891 +directories but by the time that script +exits and returns to your shell + +0:09:46.891,0:09:51.831 +your shell still remains in the same place. However, +if you do CD in a script and you source it + +0:09:51.831,0:09:55.241 +Your current instance of bash is the +one that ends up running it and + +0:09:55.241,0:09:57.951 +so it ends up CDing where you are + +0:09:57.951,0:10:01.171 +This is also why if you define functions + +0:10:01.171,0:10:04.751 +For example, that you may want to +execute in your shell session + +0:10:04.751,0:10:07.011 +You need to source the script, not run it + +0:10:07.011,0:10:10.261 +Because if you run it, that function +will be defined in the + +0:10:10.261,0:10:11.931 +instance of bash + +0:10:11.931,0:10:16.831 +In the bash process that gets launched but it +will not be defined in your current shell + +0:10:16.831,0:10:22.871 +I think those are two of the biggest +differences between the two + +0:10:29.211,0:10:29.711 +Next question, + +0:10:29.873,0:10:35.131 +What are the places where various packages and tools +are stored and how does referencing them work? + +0:10:35.131,0:10:39.171 +What even is /bin or /lib? 
+ +0:10:39.171,0:10:45.091 +So as we covered in the first lecture, +there is this PATH environment variable + +0:10:45.091,0:10:49.551 +which is a colon-separated +string of all the places + +0:10:49.551,0:10:55.111 +where your shell is gonna look for binaries +and if you just do something + +0:10:55.111,0:10:58.171 +like "echo $PATH", you're gonna get this list + +0:10:58.171,0:11:02.251 +and all these places are gonna +be consulted in order. + +0:11:02.251,0:11:03.601 +It's gonna go through all of them and in fact + +0:11:03.601,0:11:07.011 +- There is already... Did we cover which? + Yeah + +0:11:07.211,0:11:10.011 +So if you run "which" and a specific command + +0:11:10.021,0:11:14.071 +the shell is actually gonna tell +you where it's finding this + +0:11:14.071,0:11:15.391 +Beyond that, + +0:11:15.391,0:11:20.431 +there is like some conventions where a lot +of programs will install their binaries + +0:11:20.431,0:11:24.071 +and they're like /usr/bin (or at +least they will include symlinks) + +0:11:24.071,0:11:26.051 +in /usr/bin so you can find them + +0:11:26.191,0:11:28.211 +There's also a /usr/local/bin + +0:11:28.211,0:11:33.951 +There are special directories.
For example, +/usr/sbin it's only for sudo user and + +0:11:33.951,0:11:38.491 +some of these conventions are slightly +different between different distros so + +0:11:38.491,0:11:47.571 +I know like some distros for example install +the user libraries under /opt for example + +0:11:51.191,0:11:55.491 +Yeah I think one thing just +to talk a little bit of more + +0:11:55.651,0:12:00.631 +about /bin and then Anish maybe you can +do the other folders so when it comes to + +0:12:00.631,0:12:02.791 +/bin the convention + +0:12:02.791,0:12:10.051 +There are conventions, and the conventions are +usually /bin are for essential system utilities + +0:12:10.051,0:12:12.531 +/usr/bin are for user programs and + +0:12:12.531,0:12:17.431 +/usr/local/bin are for user +compiled programs, sort of + +0:12:17.431,0:12:21.691 +so things that you installed that you intend +the user to run, are in /usr/bin + +0:12:21.691,0:12:26.711 +things that a user has compiled themselves and stuck +on your system, probably goes in /usr/local/bin + +0:12:26.711,0:12:29.991 +but again, this varies a lot from machine +to machine, and distro to distro + +0:12:29.991,0:12:33.971 +On Arch Linux, for example, /bin +is a symlink to /usr/bin + +0:12:33.971,0:12:40.261 +They're the same and as Jose mentioned, there's +also /sbin which is for programs that are + +0:12:40.261,0:12:43.801 +intended to only be run as root, that +also varies from distro to distro + +0:12:43.801,0:12:47.251 +whether you even have that directory, and +on many systems like /usr/local/bin + +0:12:47.251,0:12:51.151 +might not even be in your PATH, or +might not even exist on your system + +0:12:51.151,0:12:55.831 +On BSD on the other hand /usr/local/bin +is often used a lot more heavily + +0:12:56.731,0:12:57.231 +yeah so + +0:12:57.231,0:13:01.111 +What we were talking about so far, these +are all ways that files and folders are + +0:13:01.111,0:13:05.071 +organized on Linux things or Linux or +BSD things vary a little bit between + 
+0:13:05.071,0:13:07.151 +that and macOS or other platforms + +0:13:07.151,0:13:09.301 +I think for the specific locations, + +0:13:09.301,0:13:11.471 +if you want to know exactly what it's +used for, you can look it up + +0:13:11.471,0:13:17.291 +But some general patterns to keep in mind are: anything +with /bin in it has binary executable programs in it, + +0:13:17.291,0:13:19.891 +anything with /lib in it, has +libraries in it so things that + +0:13:19.891,0:13:25.081 +programs can link against, and then some +other things that are useful to know are + +0:13:25.081,0:13:29.431 +there's a /etc on many systems, which +has configuration files in it and + +0:13:29.431,0:13:34.311 +then there's /home, which underneath that directory +contains each user's home directory + +0:13:34.311,0:13:38.521 +so like on a linux box my username +or if it's Anish will + +0:13:38.651,0:13:41.351 +correspond to a home directory /home/anish + +0:13:42.071,0:13:43.351 +Yeah I guess there are + +0:13:43.351,0:13:47.671 +a couple of others like /tmp is usually +a temporary directory that gets + +0:13:47.671,0:13:51.351 +erased when you reboot not always but sometimes, +you should check on your system + +0:13:51.731,0:13:59.211 +There's a /var which often holds like +files that change over time so + +0:13:59.211,0:14:06.151 +these are usually going to be things +like lock files for package managers + +0:14:06.151,0:14:12.431 +they're gonna be things like log files, +files to keep track of process IDs + +0:14:12.431,0:14:16.471 +then there's /dev which shows devices so + +0:14:16.471,0:14:20.551 +usually so these are special files that +correspond to devices on your system we + +0:14:20.551,0:14:27.391 +talked about /sys, Anish mentioned /etc + +0:14:29.051,0:14:36.031 +/opt is a common one for just like third-party +software that basically it's usually for + +0:14:36.031,0:14:40.951 +companies that ported their software to Linux +but they don't actually understand what + +0:14:40.951,0:14:45.391
+running software on Linux is like, and +so they just have a directory with all + +0:14:45.391,0:14:51.411 +their stuff in it and when those get installed +they usually get installed into /opt + +0:14:51.411,0:14:55.651 +I think those are the ones off the top of my head + +0:14:55.651,0:14:57.771 +yeah + +0:14:57.771,0:15:02.271 +And we will list these in our lecture notes +which will produce after this lecture + +0:15:02.271,0:15:04.431 +Next question + +0:15:04.431,0:15:07.080 +Should I apt-get install a Python whatever + +0:15:07.080,0:15:10.691 +package or pip install that package + +0:15:10.691,0:15:13.890 +so this is a good question that I think at + +0:15:13.890,0:15:17.310 +a higher level this question is asking +should I use my systems package manager + +0:15:17.310,0:15:20.850 +to install things or should I use some other +package manager. Like in this case + +0:15:20.850,0:15:25.021 +one that's more specific to a particular +language. And the answer here is also + +0:15:25.021,0:15:28.590 +kind of it depends, sometimes it's nice +to manage things using a system package + +0:15:28.590,0:15:31.950 +manager so everything can be installed +and upgraded in a single place but + +0:15:31.950,0:15:35.160 +I think oftentimes whatever is available +in the system repositories the things + +0:15:35.160,0:15:37.800 +you can get via a tool like +apt-get or something similar + +0:15:37.800,0:15:41.040 +might be slightly out of date compared to +the more language specific repository + +0:15:41.040,0:15:45.060 +so for example a lot of the Python packages +I use I really want the most + +0:15:45.060,0:15:47.771 +up-to-date version and so +I use pip to install them + +0:15:48.551,0:15:51.091 +Then, to extend on that is + +0:15:51.091,0:15:57.751 +sometimes the case the system packages +might require some other + +0:15:57.751,0:16:02.461 +dependencies that you might not have realized +about, and it's also might be + +0:16:02.461,0:16:07.201 +the case or like for some 
systems, +at least for like alpine Linux they + +0:16:07.201,0:16:11.221 +don't have wheels for like a lot of the +Python packages so it will just take + +0:16:11.221,0:16:15.331 +longer to compile them, it will take more +space because they have to compile them + +0:16:15.331,0:16:20.761 +from scratch. Whereas if you just go +to pip, pip has binaries for a lot of + +0:16:20.761,0:16:23.471 +different platforms and that will probably work + +0:16:23.471,0:16:29.191 +You also should be aware that pip might not do +the exact same thing in different computers + +0:16:29.191,0:16:33.601 +So, for example, if you are in a kind of laptop +or like a desktop that is running like + +0:16:33.601,0:16:38.971 +a x86 or x86_64 you probably have binaries, +but if you're running something + +0:16:38.971,0:16:43.471 +like Raspberry Pi or some other kind of +embedded device. These are running on a + +0:16:43.471,0:16:47.611 +different kind of hardware architecture +and you might not have binaries + +0:16:47.611,0:16:51.841 +I think that's also good to take into account, +in that case in might be worthwhile to + +0:16:51.841,0:16:58.551 +use the system packages just because they +will take much shorter to get them + +0:16:58.551,0:17:01.691 +than to just to compile from scratch +the entire Python installation + +0:17:01.691,0:17:06.741 +Apart from that, I don't think I can think of any exceptions +where I would actually use the system packages + +0:17:06.741,0:17:09.251 +instead of the Python provided ones + +0:17:19.011,0:17:20.851 +So, one other thing to keep in mind is that + +0:17:20.861,0:17:26.180 +sometimes you will have more than one +program on your computer and you might + +0:17:26.180,0:17:29.961 +be developing more than one program on +your computer and for some reason not + +0:17:29.961,0:17:33.861 +all programs are always built with the latest +version of things, sometimes they + +0:17:33.861,0:17:39.351 +are a little bit behind, and when you +install something 
system-wide you can + +0:17:39.351,0:17:44.691 +only... depends on your exact system, +but often you just have one version + +0:17:44.691,0:17:49.711 +what pip lets you do, especially combined +with something like python's virtualenv, + +0:17:49.711,0:17:54.531 +and similar concepts exist for other +languages, where you can sort of say + +0:17:54.531,0:17:59.660 +I want to (NPM does the same thing as well +with its node modules, for example) where + +0:17:59.660,0:18:05.991 +I'm gonna compile the dependencies of +this package in sort of a subdirectory + +0:18:05.991,0:18:10.431 +of its own, and all of the versions that it +requires are going to be built in there + +0:18:10.431,0:18:13.910 +and you can do this separately for separate +projects so there they have + +0:18:13.910,0:18:16.910 +different dependencies or the same dependencies +with different versions + +0:18:16.910,0:18:20.930 +they still sort of kept separate. And that +is one thing that's hard to achieve + +0:18:20.931,0:18:22.651 +with system packages + +0:18:27.131,0:18:27.851 +Next question + +0:18:27.911,0:18:32.771 +What's the easiest and best profiling tools +to use to improve performance of my code? + +0:18:34.351,0:18:39.231 +This is a topic we could talk +about for a very long time + +0:18:39.231,0:18:42.881 +The easiest and best is to print stuff using time + +0:18:42.881,0:18:48.431 +Like, I'm not joking, very often +the easiest thing is in your code + +0:18:48.971,0:18:53.751 +At the top you figure out what the current +time is, and then you do sort of + +0:18:53.751,0:18:57.920 +a binary search over your program of add +a print statement that prints how much + +0:18:57.920,0:19:02.511 +time has elapsed since the start of your +program and then you do that until you + +0:19:02.511,0:19:06.320 +find the segment of code that took the +longest. 
And then you go into that + +0:19:06.320,0:19:09.531 +function and then you do the same thing +again and you keep doing this until you + +0:19:09.531,0:19:14.031 +find roughly where the time was spent. It's +not foolproof, but it is really easy + +0:19:14.031,0:19:16.721 +and it gives you good information quickly + +0:19:16.721,0:19:25.361 +if you do need more advanced information +Valgrind has a tool called cache-grind? + +0:19:25.361,0:19:29.431 +call grind? Cache grind? One of the two. + +0:19:29.431,0:19:33.310 +and this tool lets you run your program and + +0:19:33.310,0:19:38.741 +measure how long everything takes and +all of the call stacks, like which + +0:19:38.741,0:19:42.521 +function called which function, and what +you end up with is a really neat + +0:19:42.521,0:19:47.081 +annotation of your entire program source +with the heat of every line basically + +0:19:47.081,0:19:51.761 +how much time was spent there. It does +slow down your program by like an order + +0:19:51.761,0:19:56.021 +of magnitude or more, and it doesn't really +support threads but it is really + +0:19:56.021,0:20:01.121 +useful if you can use it. If you can't, +then tools like perf or similar tools + +0:20:01.121,0:20:05.201 +for other languages that do usually some +kind of sampling profiling like we + +0:20:05.201,0:20:09.811 +talked about in the profiler lecture, can +give you pretty useful data quickly, + +0:20:09.811,0:20:15.160 +but it's a lot of data around +this, but they're a little bit + +0:20:15.160,0:20:18.971 +biased and what kind of things they usually +highlight as a problem and it + +0:20:18.971,0:20:22.961 +can sometimes be hard to extract meaningful +information about what should + +0:20:22.961,0:20:27.701 +I change in response to them. 
Whereas the +sort of print approach very quickly + +0:20:27.701,0:20:32.171 +gives you like this section +of code is bad or slow + +0:20:32.171,0:20:34.871 +I think would be my answer + +0:20:34.871,0:20:40.431 +Flamegraphs are great, they're a good way +to visualize some of this information + +0:20:41.491,0:20:45.550 +Yeah I just have one thing to add, +oftentimes programming languages + +0:20:45.550,0:20:48.910 +have language-specific tools for profiling +so figure out what's the + +0:20:48.910,0:20:52.191 +right tool to use for your language, like if +you're doing JavaScript in the web browser + +0:20:52.191,0:20:55.411 +the web browser has a really nice tool for +doing profiling, you should just use that + +0:20:55.411,0:21:00.471 +or if you are using Go, for example, Go has a built-in +profiler that is really good, you should just use that + +0:21:01.711,0:21:04.251 +A last thing to add to that + +0:21:04.251,0:21:09.951 +Sometimes you might find that doing this binary +search over time that you're kind of + +0:21:09.961,0:21:14.351 +finding where the time is going, but this +time is sometimes happening because + +0:21:14.351,0:21:18.461 +you're waiting on the network, or you're +waiting for some file, and in that case + +0:21:18.461,0:21:23.440 +you want to make sure that the time +that is, if I want to write + +0:21:23.440,0:21:27.310 +like a 1 gigabyte file or like read a 1 +gigabyte file and put it into memory + +0:21:27.310,0:21:32.260 +you want to check that the actual time +there is the minimum amount of time + +0:21:32.260,0:21:36.221 +you actually have to wait. If it's ten times +longer, you should try to use some + +0:21:36.221,0:21:39.371 +other tools that we covered in the debugging +and profiling section to see + +0:21:39.371,0:21:45.671 +why you're not utilizing all your +resources because that might...
+ +0:21:50.511,0:21:56.071 +Because that might be a lot of what's happening +thing, like for example, in my research + +0:21:56.081,0:21:59.410 +in machine learning workloads, a lot of +time is loading data and you have to + +0:21:59.410,0:22:02.981 +make sure well like the time it takes to +load data is actually the minimum amount + +0:22:02.981,0:22:07.500 +of time you want to have that happening + +0:22:08.040,0:22:13.481 +And to build on that, there are actually +specialized tools for doing things like + +0:22:13.481,0:22:17.351 +analyzing wait times. Very often when +you're waiting for something what's + +0:22:17.351,0:22:20.591 +really happening is you're issuing your +system call, and that system call takes + +0:22:20.591,0:22:24.191 +some amount of time to respond. Like you do +a really large write, or a really large read + +0:22:24.191,0:22:28.361 +or you do many of them, and one thing +that can be really handy here is + +0:22:28.361,0:22:31.841 +to try to get information out of the +kernel about where your program is + +0:22:31.841,0:22:37.000 +spending its time. And so there's (it's +not new), but there's a relatively + +0:22:37.000,0:22:42.820 +newly available thing called BPF or eBPF. +Which is essentially kernel tracing + +0:22:42.820,0:22:48.531 +and you can do some really cool things with +it, and that includes tracing user programs. + +0:22:48.531,0:22:51.760 +It can be a little bit awkward to +get started with, there's a tool + +0:22:51.760,0:22:56.201 +called BPF trace that i would recommend +you looking to, if you need to do like + +0:22:56.201,0:23:00.040 +this kind of low-level performance debugging. +But it is really good for this + +0:23:00.040,0:23:04.601 +kind of stuff. You can get things like +histograms over how much time was spent + +0:23:04.601,0:23:06.671 +in particular system calls + +0:23:06.671,0:23:09.721 +It's a great tool + +0:23:12.251,0:23:15.351 +What browser plugins do you use? 
+ +0:23:16.731,0:23:19.731 +I try to use as few as I can get away with using + +0:23:19.731,0:23:25.991 +because I don't like things being in +my browser, but there are a couple of + +0:23:25.991,0:23:30.311 +ones that are sort of staples. +The first one is uBlock Origin. + +0:23:30.311,0:23:36.611 +So uBlock Origin is one of many ad blockers but +it's a little bit more than an ad blocker. + +0:23:36.611,0:23:42.530 +It is (a what do they call it?) a +network filtering tool so it lets + +0:23:42.530,0:23:47.331 +you do more things than just block ads. +It also lets you like block connections + +0:23:47.331,0:23:51.351 +to certain domains, block connections +for certain types of resources + +0:23:51.351,0:23:56.031 +So I have mine set up in what they call +the Advanced Mode, where basically + +0:23:56.031,0:24:02.451 +you can disable basically all network requests. +But it's not just Network requests, + +0:24:02.451,0:24:07.430 +It's also like I have disabled all inline +scripts on every page and all + +0:24:07.430,0:24:11.540 +third-party images and resources, and then +you can sort of create a whitelist + +0:24:11.540,0:24:16.351 +for every page so it gives you really +low-level tools around how to + +0:24:16.351,0:24:20.331 +how to improve the security of your browsing. +But you can also set it in not the + +0:24:20.331,0:24:23.991 +advanced mode, and then it does much of +the same as a regular ad blocker would + +0:24:23.991,0:24:28.101 +do, although in a fairly efficient way +if you're looking at an ad blocker it's + +0:24:28.101,0:24:31.510 +probably the one to use and it +works on like every browser + +0:24:31.511,0:24:34.451 +That would be my top pick I think, + +0:24:39.111,0:24:44.391 +I think probably the one I +use like the most actively + +0:24:44.391,0:24:50.391 +is one called Stylus. It lets you modify +the CSS or like the stylesheets + +0:24:50.391,0:24:54.560 +that webpages have. 
And it's pretty +neat, because sometimes you're + +0:24:54.560,0:24:58.550 +looking at a website and you want +to hide some part of the website + +0:24:58.550,0:25:04.211 +you don't care about. Like maybe an ad, maybe +some sidebar you're not finding useful + +0:25:04.211,0:25:06.290 +The thing is, at the end of +the day these things are + +0:25:06.290,0:25:09.591 +displaying in your browser, and you +have control of what code is + +0:25:09.591,0:25:13.131 +executing and similar to what Jon was +saying, like you can customize this + +0:25:13.131,0:25:18.491 +to no end, and what I have for a lot of +web pages is like hide this part, or + +0:25:18.491,0:25:23.390 +also trying to make like dark modes for +them like you can change pretty much the + +0:25:23.390,0:25:26.810 +color for every single website. And what +is actually pretty neat is that there's + +0:25:26.810,0:25:31.461 +like a repository online of people that +have contributed their stylesheets + +0:25:31.461,0:25:35.031 +for the websites. So someone probably +has (done) one for GitHub + +0:25:35.031,0:25:38.780 +Like I want dark GitHub and someone has +already contributed one that makes + +0:25:38.780,0:25:44.631 +that much more pleasing to browse. Apart +from that, one that's not really + +0:25:44.631,0:25:49.491 +fancy, but I have found incredibly helpful +is one that just takes a screenshot of an + +0:25:49.491,0:25:53.121 +entire website. And it will +scroll for you and make a + +0:25:53.121,0:25:57.711 +compound image of the entire website and that's +really great for when you're trying to + +0:25:57.711,0:26:00.111 +print a website and it's just terrible.
+ +0:26:00.111,0:26:00.611 +(It's built into Firefox) + +0:26:00.611,0:26:02.671 +oh interesting + +0:26:02.671,0:26:05.751 +oh now that you mention builtin to Firefox, +another one that I really like about + +0:26:05.751,0:26:09.071 +Firefox is the multi account containers + +0:26:09.071,0:26:10.831 +(Oh yeah, it's fantastic) + +0:26:10.831,0:26:12.291 +Which kind of lets you + +0:26:12.291,0:26:16.670 +By default a lot of web browsers, like +for example Chrome, have this + +0:26:16.670,0:26:20.601 +notion of like there's session that you +have, where you have all your cookies + +0:26:20.601,0:26:24.560 +and they are kind of all shared from the +different websites in the sense of + +0:26:24.560,0:26:30.811 +you keep opening new tabs and unless you go into +incognito you kind of have the same profile + +0:26:30.811,0:26:34.190 +And that profile is the same for +all websites, there is this + +0:26:34.191,0:26:35.851 +Is it an extension or is it built in? + +0:26:35.851,0:26:40.571 +(it's a mix, it's complicated) + +0:26:41.091,0:26:46.211 +So I think you actually have to say you want +to install it or enable it and again + +0:26:46.221,0:26:49.881 +the name is Multi Account Containers and +these let you tell Firefox to have + +0:26:49.881,0:26:53.961 +separate isolated sessions. 
So +for example, you want to say + +0:26:53.961,0:26:58.851 +I have a separate sessions for whenever I +visit to Google or whenever I visit Amazon + +0:26:58.851,0:27:01.791 +and that can be pretty neat, because then you can + +0:27:01.791,0:27:08.171 +At a browser level it's ensuring that no information +sharing is happening between the two of them + +0:27:08.171,0:27:11.961 +And it's much more convenient than +having to open a incognito window + +0:27:11.961,0:27:14.471 +where it's gonna clean all the time the stuff + +0:27:14.471,0:27:17.311 +(One thing to mention is Stylus vs Stylish) + +0:27:17.531,0:27:19.651 +Oh yeah, I forgot about that + +0:27:19.651,0:27:24.931 +One important thing is the browser extension +for side loading CSS Stylesheets + +0:27:24.931,0:27:31.851 +it's called a Stylus and that's different +from the older one that was + +0:27:31.851,0:27:37.400 +called Stylish, because that one got +bought at some point by some shady + +0:27:37.400,0:27:40.711 +company, that started abusing it not only to have + +0:27:40.711,0:27:45.780 +that functionality, but also to read your +entire browser history and send that + +0:27:45.780,0:27:48.491 +back to their servers so they could data mine it. + +0:27:48.491,0:27:53.731 +So, then people just built this open-source alternative +that is called Stylus, and that's the one + +0:27:53.731,0:27:58.951 +we recommend. Said that, I think the repository +for styles is the same for the + +0:27:58.951,0:28:03.611 +two of them, but I would have +to double check that. + +0:28:03.611,0:28:05.951 +Do you have any browser plugins Anish? 
+ +0:28:06.071,0:28:09.311 +Yes, so I also have some recommendations +for browser plugins + +0:28:09.311,0:28:13.991 +I also use uBlock Origin and I also use Stylus, + +0:28:13.991,0:28:18.511 +but one other one that I'd recommend is +integration with a password manager + +0:28:18.511,0:28:21.631 +So this is a topic that we have in +the lecture notes for the security + +0:28:21.631,0:28:24.841 +lecture, but we didn't really get to talk +about in detail. But basically password + +0:28:24.841,0:28:27.810 +managers do a really good job of increasing +your security when working + +0:28:27.810,0:28:31.831 +with online accounts, and having browser +integration with your password manager + +0:28:31.831,0:28:34.410 +can save you a lot of time like you +can open up a website then it can + +0:28:34.410,0:28:37.381 +autofill your login information for you +sir and you go and copy and paste it + +0:28:37.381,0:28:40.320 +back and forth between a separate program +if it's not integrated with your + +0:28:40.320,0:28:43.410 +web browser, and it can also, this integration, +can save you from certain + +0:28:43.410,0:28:47.651 +attacks that would otherwise be possible if +you were doing this manual copy pasting. + +0:28:47.651,0:28:50.790 +For example, phishing attacks. So +you find a website that looks very + +0:28:50.790,0:28:54.211 +similar to Facebook and you go to log in +with your facebook login credentials and + +0:28:54.211,0:28:56.851 +you go to your password manager and copy +paste the correct credentials into this + +0:28:56.851,0:29:00.060 +funny web site and now all of a sudden +it has your password but if you have + +0:29:00.060,0:29:03.091 +browser integration then the extension +can automatically check + +0:29:03.091,0:29:06.951 +like. 
Am I on F A C E B O O K.com,or +is it some other domain + +0:29:06.951,0:29:10.671 +that maybe look similar and it will not enter +the login information if it's the wrong domain + +0:29:10.671,0:29:15.791 +so browser extension for +password managing is good + +0:29:15.791,0:29:17.930 +Yeah I agree + +0:29:19.491,0:29:20.711 +Next question + +0:29:20.711,0:29:23.991 +What are other useful data wrangling tools? + +0:29:23.991,0:29:32.421 +So in yesterday's lecture, I mentioned curl, so +curl is a fantastic tool for just making web + +0:29:32.421,0:29:35.811 +requests and dumping them to your terminal. +You can also use it for things + +0:29:35.811,0:29:41.191 +like uploading files which is really handy. + +0:29:41.191,0:29:48.431 +In the exercises of that lecture we also talked about +JQ and pup which are command line tools that let you + +0:29:48.431,0:29:52.991 +basically write queries over JSON +and HTML documents respectively + +0:29:52.991,0:30:00.391 +that can be really handy. Other +data wrangling tools? + +0:30:00.391,0:30:03.821 +Ah Perl, the Perl programming language is + +0:30:03.821,0:30:08.061 +often referred to as a write only +programming language because it's + +0:30:08.061,0:30:13.431 +impossible to read even if you wrote it. 
+But it is fantastic at doing just like + +0:30:13.431,0:30:21.561 +straight up text processing, like nothing +beats it there, so maybe worth learning + +0:30:21.561,0:30:24.331 +some very rudimentary Perl just +to write some of those scripts + +0:30:24.331,0:30:29.371 +It's easier often than writing some like hacked-up +combination of grep and awk and sed, + +0:30:29.371,0:30:36.311 +and it will be much faster to just hack something +up than writing it up in Python, for example + +0:30:36.311,0:30:44.031 +but apart from that, other data wrangling + +0:30:44.031,0:30:47.071 +No, not off the top of my head really + +0:30:47.071,0:30:53.661 +column -t, if you pipe any whitespace-separated + +0:30:53.661,0:30:58.821 +input into column -t it will align all +the white space of the columns so that + +0:30:58.821,0:31:05.771 +you get nicely aligned columns, and +head and tail but we talked about those + +0:31:09.011,0:31:13.791 +I think a couple of additions to that, +that I find myself using commonly + +0:31:13.791,0:31:19.881 +one is vim. Vim can be pretty useful +for like data wrangling on its own + +0:31:19.881,0:31:22.461 +Sometimes you might find that the operation +that you're trying to do is + +0:31:22.461,0:31:27.711 +hard to put down in terms of piping +different operators but if you + +0:31:27.711,0:31:32.531 +can just open the file and just record + +0:31:32.531,0:31:37.301 +a couple of quick vim macros to do what you +want it to do, it might be like much, + +0:31:37.301,0:31:42.311 +much easier.
That's one, and then the other +one, if you're dealing with tabular + +0:31:42.311,0:31:46.091 +data and you want to do more complex operations +like sorting by one column, + +0:31:46.091,0:31:51.161 +then grouping and then computing some sort +of statistic, I think for a lot of that + +0:31:51.161,0:31:55.951 +workload I end up just using Python +and pandas because it's built for that + +0:31:55.951,0:32:00.190 +And one of the pretty neat features that +I find myself also using is that it + +0:32:00.190,0:32:03.931 +will export to many different formats. +So this intermediate state + +0:32:03.931,0:32:09.221 +has its own kind of pandas dataframe +object but it can + +0:32:09.221,0:32:14.171 +export to HTML, LaTeX, a lot of different +like table formats so if your end + +0:32:14.171,0:32:19.531 +product is some sort of summary table, then pandas +I think is a fantastic choice for that + +0:32:21.111,0:32:24.791 +I would second the vim and also +Python I think those are + +0:32:24.791,0:32:29.051 +two of my most used data wrangling tools.
+For the vim one, last year we had a demo + +0:32:29.051,0:32:31.841 +in the series in the lecture notes, but +we didn't cover it in class we had a + +0:32:31.841,0:32:38.051 +demo of turning an XML file into a JSON version +of that same data using only vim macros + +0:32:38.051,0:32:40.331 +And I think that's actually the +way I would do it in practice + +0:32:40.331,0:32:43.241 +I don't want to go find a tool that does +this conversion it is actually simple + +0:32:43.241,0:32:45.431 +to encode as a vim macro, +then I just do it that way + +0:32:45.431,0:32:48.991 +And then also Python especially in an interactive +tool like a Jupyter notebook + +0:32:48.991,0:32:51.171 +is a really great way of doing data wrangling + +0:32:51.171,0:32:52.951 +A third tool I'd mention which +I don't remember if we + +0:32:52.961,0:32:55.361 +covered in the data wrangling +lecture or elsewhere + +0:32:55.361,0:32:58.751 +is a tool called pandoc which can do transformations +between different text + +0:32:58.751,0:33:02.981 +document formats so you can convert from +plaintext to HTML or HTML to markdown + +0:33:02.981,0:33:07.361 +or LaTeX to HTML or many other formats +it actually it supports a large + +0:33:07.361,0:33:10.471 +list of input formats and a +large list of output formats + +0:33:10.471,0:33:16.361 +I think there's one last one which I mentioned briefly +in the lecture on data wrangling which is + +0:33:16.361,0:33:20.441 +the R programming language, it's +an awful (I think it's an awful) + +0:33:20.441,0:33:25.120 +language to program in. 
And I would never +use it in the middle of a data wrangling + +0:33:25.120,0:33:30.951 +pipeline, but at the end, in order to like produce +pretty plots and statistics R is great + +0:33:30.951,0:33:35.581 +Because R is built for doing +statistics and plotting + +0:33:35.581,0:33:40.591 +there's a library for R called +ggplot which is just amazing + +0:33:40.591,0:33:46.551 +ggplot2 I guess technically. It's +great, it produces very + +0:33:46.551,0:33:51.431 +nice visualizations and it lets you do, +it does very easily do things like + +0:33:51.431,0:33:57.561 +If you have a data set that has like multiple +facets like it's not just X and Y + +0:33:57.561,0:34:03.111 +it's like X Y Z and some other variable, +and then you want to plot like the + +0:34:03.111,0:34:07.581 +throughput grouped by all of those parameters +at the same time and produce + +0:34:07.581,0:34:11.991 +a visualization. R very easily lets you +do this and I haven't seen anything else + +0:34:11.991,0:34:14.891 +that lets you do that as easily + +0:34:16.971,0:34:17.951 +Next question, + +0:34:17.951,0:34:20.511 +What's the difference between +Docker and a virtual machine + +0:34:23.271,0:34:27.731 +What's the easiest way to explain this? So Docker + +0:34:27.741,0:34:31.221 +starts something called containers and +Docker is not the only program that + +0:34:31.221,0:34:36.561 +starts containers.
There are many others +and usually they rely on some feature of + +0:34:36.561,0:34:40.401 +the underlying kernel in the case of +docker they use something called LXC + +0:34:40.401,0:34:47.571 +which are Linux containers and the basic +premise there is if you want to start + +0:34:47.571,0:34:53.181 +what looks like a virtual machine that +is running roughly the same operating + +0:34:53.181,0:34:57.411 +system as you are already running on your +computer then you don't really need + +0:34:57.411,0:35:04.701 +to run another instance of the kernel +really that other virtual machine can + +0:35:04.701,0:35:09.951 +share a kernel. And you can just use the +kernels built in isolation mechanisms to + +0:35:09.951,0:35:13.791 +spin up a program that thinks it's +running on its own hardware but in + +0:35:13.791,0:35:18.501 +reality it's sharing the kernel and so this +means that containers can often run + +0:35:18.501,0:35:22.611 +with much lower overhead than a full virtual +machine will do but you should + +0:35:22.611,0:35:26.391 +keep in mind that it also has somewhat weaker +isolation because you are sharing + +0:35:26.391,0:35:30.831 +a kernel between the two if you spin up +a virtual machine the only thing that's + +0:35:30.831,0:35:35.931 +shared is sort of the hardware and to +some extent the hypervisor, whereas + +0:35:35.931,0:35:40.791 +with a docker container you're sharing +the full kernel and the that is a + +0:35:40.791,0:35:44.921 +different threat model that you +might have to keep in mind + +0:35:47.341,0:35:52.361 +One another small note there as Jon pointed +out, to use containers something + +0:35:52.361,0:35:55.631 +like Docker you need the underlying operating +system to be roughly the same + +0:35:55.631,0:36:00.071 +as whatever the program that's running +on top of the container expects and so + +0:36:00.071,0:36:03.791 +if you're using macOS for example, the +way you use docker is you run Linux + +0:36:03.791,0:36:08.261 +inside a virtual 
machine and then you can +run Docker on top of Linux so maybe + +0:36:08.261,0:36:11.741 +if you're going for containers in order +to get better performance, you're trading + +0:36:11.741,0:36:15.131 +isolation for performance, if you're running +on macOS that may not work out + +0:36:15.131,0:36:17.451 +exactly as expected + +0:36:17.451,0:36:21.221 +And one last note is that there +is a slight difference, so + +0:36:21.221,0:36:25.721 +with Docker and containers, +one of the gotchas you have + +0:36:25.721,0:36:29.411 +to be familiar with is that containers +are more similar to virtual + +0:36:29.411,0:36:33.071 +machines in the sense that they will +persist all the storage that you + +0:36:33.071,0:36:35.971 +have where Docker by default won't have that. + +0:36:35.971,0:36:37.791 +Like Docker is supposed to be running + +0:36:37.791,0:36:41.771 +So the main idea is like I want +to run some software and + +0:36:41.771,0:36:45.671 +I get the image and it runs and if you +want to have any kind of persistent + +0:36:45.671,0:36:50.081 +storage that links to the host system +you have to kind of manually specify + +0:36:50.081,0:36:56.051 +that, whereas a virtual machine is using +some virtual disk that is being provided + +0:36:56.051,0:37:02.671 +Next question + +0:37:02.671,0:37:05.111 +What are the advantages of each operating system + +0:37:05.111,0:37:08.531 +and how can we choose between them?
+For example, choosing the best Linux + +0:37:08.531,0:37:10.551 +distribution for our purposes + +0:37:14.251,0:37:16.811 +I will say that for many, many tasks the + +0:37:16.811,0:37:20.171 +specific Linux distribution that you're +running is not that important + +0:37:20.171,0:37:23.731 +the thing is, it's just what kind of + +0:37:23.731,0:37:27.651 +knowing that there are different types +or like groups of distributions, + +0:37:27.651,0:37:32.251 +So for example, there are some distributions +that have really frequent updates + +0:37:32.251,0:37:38.971 +but they kind of break more easily. So for +example Arch Linux has a rolling update + +0:37:38.971,0:37:43.511 +way of pushing updates, where things might +break but they're fine with the things + +0:37:43.511,0:37:47.891 +being that way. Where maybe where you +have some really important web server + +0:37:47.891,0:37:51.401 +that is hosting all your business +analytics you want that thing + +0:37:51.401,0:37:55.961 +to have like a much more steady way of +updates. So that's for example why you + +0:37:55.961,0:37:58.121 +will see distributions like Debian being + +0:37:58.121,0:38:02.951 +much more conservative about what they push, or +even for example Ubuntu makes a difference + +0:38:02.951,0:38:07.001 +between the Long Term Releases +that they are only update every + +0:38:07.001,0:38:12.281 +two years and the more periodic +releases of one there is a + +0:38:12.281,0:38:16.661 +it's like two a year that they make. 
So, kind of knowing that there's the + +0:38:16.661,0:38:21.341 +difference, apart from that some distributions +have different ways + +0:38:21.341,0:38:27.191 +of providing the binaries +to you and the way they + +0:38:27.191,0:38:33.791 +have the repositories so I think a lot of Red +Hat Linux don't want non-free drivers in + +0:38:33.791,0:38:37.361 +their official repositories where I +think Ubuntu is fine with some of + +0:38:37.361,0:38:42.491 +them, apart from that I think like just +a lot of what is core to most Linux + +0:38:42.491,0:38:47.411 +distros is kind of shared between them +and there's a lot of learning in the + +0:38:47.411,0:38:51.431 +common ground. So you don't have +to worry about the specifics + +0:38:52.391,0:38:56.351 +Keeping with the theme of this class being somewhat +opinionated, I'm gonna go ahead and say + +0:38:56.351,0:39:00.041 +that if you're using Linux especially for +the first time choose something like + +0:39:00.041,0:39:03.851 +Ubuntu or Debian. So Ubuntu is a +Debian-based distribution but maybe is a + +0:39:03.851,0:39:07.421 +little bit more friendly, Debian is a little +bit more minimalist. I use Debian + +0:39:07.421,0:39:10.451 +on all my servers, for example. And I use +Debian desktop on my desktop computers + +0:39:10.451,0:39:15.431 +that run Linux, if you're going for maybe +trying to learn more things and you want + +0:39:15.431,0:39:19.391 +a distribution that trades stability for +having more up-to-date software maybe + +0:39:19.391,0:39:21.911 +at the expense of you having to fix a +broken distribution every once in a + +0:39:21.911,0:39:26.911 +while then maybe you can consider something +like Arch Linux or Gentoo + +0:39:26.911,0:39:32.681 +or Slackware. Oh man, I'd say that like +if you're installing Linux and just like + +0:39:32.681,0:39:34.891 +want to get work done Debian is a great choice + +0:39:35.911,0:39:38.271 +Yeah I think I agree with that.
+ +0:39:38.271,0:39:40.971 +The other observation is like +you could install BSD + +0:39:40.971,0:39:46.691 +BSD has gotten, has come a long way from +where it was. There's still a bunch of + +0:39:46.691,0:39:50.921 +software you can't really get for BSD but +it gives you a very well-documented + +0:39:50.921,0:39:55.841 +experience and one thing that's different +about BSD compared to Linux is + +0:39:55.841,0:40:02.531 +that in BSD when you install BSD you +get a full operating system, mostly + +0:40:02.651,0:40:07.531 +So many of the programs are maintained by +the same team that maintains the kernel + +0:40:07.541,0:40:11.351 +and everything is sort of upgraded together, +which is a little different + +0:40:11.351,0:40:13.271 +than how things work in the Linux world. It does + +0:40:13.271,0:40:16.751 +mean that things often move a little bit +slower. I would not use it for things + +0:40:16.751,0:40:21.791 +like gaming either, because driver support +is meh. But it is an interesting + +0:40:21.791,0:40:30.661 +environment to look at. And then for things +like Mac OS and Windows I think + +0:40:30.661,0:40:36.041 +If you are a programmer, I don't know why +you are using Windows unless you are + +0:40:36.041,0:40:42.401 +building things for Windows; or you want +to be able to do gaming and stuff + +0:40:42.401,0:40:46.891 +but in that case, maybe try dual booting, +even though that's a pain too + +0:40:46.891,0:40:52.031 +Mac OS is a good sort of middle point +between the two where you get a system + +0:40:52.031,0:40:57.851 +that is like relatively nicely polished +for you. But you still have access to + +0:40:57.851,0:41:01.191 +some of the lower-level bits +at least to a certain extent.
+ +0:41:01.191,0:41:07.451 +it's also really easy to dual boot Mac OS and Windows +it is not quite the case with like Mac OS and + +0:41:07.451,0:41:09.651 +Linux or Linux and Windows + +0:41:13.911,0:41:15.751 +Alright, for the rest of the +questions so these are + +0:41:15.761,0:41:18.761 +all 0 upvote questions so maybe we can go +through them quickly in the last five + +0:41:18.761,0:41:23.471 +or so minutes of class. So the next +one is Vim versus Emacs? Vim! + +0:41:23.471,0:41:30.911 +Easy answer, but a more serious answer is like I think +all three of us use vim as our primary editor + +0:41:30.911,0:41:34.931 +I use Emacs for some research specific +stuff which requires Emacs but + +0:41:34.931,0:41:38.681 +at a higher level both editors have interesting +ideas behind them and if you + +0:41:38.681,0:41:43.061 +have the time is worth exploring both +to see which fits you better and also + +0:41:43.061,0:41:46.811 +you can use Emacs and run it in a vim +emulation mode. I actually know a + +0:41:46.811,0:41:49.091 +good number of people who do that so +they get access to some of the cool + +0:41:49.091,0:41:52.631 +Emacs functionality and some of the cool +philosophy behind that like Emacs is + +0:41:52.631,0:41:55.391 +programmable through Lisp which is kind of cool. + +0:41:55.391,0:41:59.411 +Much better than vimscript, but people like +vim's modal editing, so there's an + +0:41:59.411,0:42:04.481 +emacs plugin called evil mode which gives +you vim modal editing within Emacs so + +0:42:04.481,0:42:08.081 +it's not necessarily a binary choice you +can kind of combine both tools if you + +0:42:08.081,0:42:11.151 +want to. And it's worth exploring +both if you have the time. + +0:42:11.151,0:42:12.731 +Next question + +0:42:12.731,0:42:15.671 +Any tips or tricks for machine +learning applications? 
+ +0:42:19.271,0:42:22.351 +I think, like knowing how + +0:42:22.361,0:42:24.791 +a lot of these tools, mainly the data wrangling + +0:42:24.791,0:42:30.041 +a lot of the shell tools, it's really +important because it seems a lot + +0:42:30.041,0:42:33.851 +of what you're doing as machine learning +researcher is trying different things + +0:42:33.851,0:42:39.491 +but I think one core aspect of doing that, +and like a lot of scientific work is being + +0:42:39.491,0:42:44.501 +able to have reproducible results +and logging them in a sensible way + +0:42:44.501,0:42:47.711 +So for example, instead of trying to come +up with really hacky solutions of how + +0:42:47.711,0:42:51.151 +you name your folders to make +sense of the experiments + +0:42:51.151,0:42:53.251 +Maybe it's just worth having for example + +0:42:53.251,0:42:55.931 +what I do is have like a JSON +file that describes the + +0:42:55.931,0:43:00.371 +entire experiment I know like all the parameters +that are within and then I can + +0:43:00.371,0:43:05.111 +really quickly, using the tools that +we have covered, query for all the + +0:43:05.111,0:43:09.701 +experiments that have some specific +purpose or use some data set + +0:43:09.701,0:43:15.071 +Things like that. 
Apart from that, the other +side of this is, if you are running + +0:43:15.071,0:43:19.871 +kind of things for training machine +learning applications and you + +0:43:19.871,0:43:23.981 +are not already using some sort of +cluster, like university or your + +0:43:23.981,0:43:28.301 +company is providing and you're just kind +of manually sshing, like a lot of + +0:43:28.301,0:43:31.231 +labs do, because that's kind of the easy way + +0:43:31.231,0:43:36.671 +It's worth automating a lot of that job +because it might not seem like it but + +0:43:36.671,0:43:40.601 +manually doing a lot of these operations +takes away a lot of your time and also + +0:43:40.601,0:43:45.031 +kind of your mental energy +for running these things + +0:43:48.551,0:43:51.691 +Anymore vim tips? + +0:43:51.691,0:43:56.771 +I have one. So in the vim lecture we tried +not to link you to too many different + +0:43:56.771,0:44:00.131 +vim plugins because we didn't want that +lecture to be overwhelming but I think + +0:44:00.131,0:44:02.921 +it's actually worth exploring vim plugins +because there are lots and lots + +0:44:02.921,0:44:07.091 +of really cool ones out there. +One resource you can use is the + +0:44:07.091,0:44:10.571 +different instructors dotfiles like a lot +of us, I think I use like two dozen + +0:44:10.571,0:44:14.321 +vim plugins and I find a lot of them quite +helpful and I use them every day + +0:44:14.321,0:44:18.311 +we all use slightly different subsets of +them. So go look at what we use or look + +0:44:18.311,0:44:22.131 +at some of the other resources we've linked +to and you might find some stuff useful + +0:44:22.791,0:44:26.951 +A thing to add to that is, I don't think +we went into a lot detail in the + +0:44:27.041,0:44:31.571 +lecture, correct me if I'm wrong. 
It's +getting familiar with the leader key + +0:44:31.571,0:44:35.021 +Which is kind of a special key +that a lot of programs will + +0:44:35.021,0:44:39.081 +especially plugins, that will link to +and for a lot of the common operations + +0:44:39.081,0:44:44.661 +vim has short ways of doing it, but you +can just figure out like quicker + +0:44:44.661,0:44:50.031 +versions for doing them. So for example, like +I know that you can do like semicolon WQ + +0:44:50.031,0:44:55.521 +to save and exit or that you +can do like capital ZZ but I + +0:44:55.521,0:44:59.241 +just actually just do leader (which for +me is the space) and then W. And I have + +0:44:59.241,0:45:04.131 +done that for a lot of a lot of kind of +common operations that I keep doing all + +0:45:04.131,0:45:08.091 +the time. Because just saving one keystroke +for an extremely common operation + +0:45:08.091,0:45:11.371 +is just saving thousands a month + +0:45:11.371,0:45:12.951 +Yeah just to expand a little bit + +0:45:12.951,0:45:17.031 +on what the leader key is so in vim you +can bind some keys I can do like ctrl J + +0:45:17.031,0:45:20.481 +does something like holding one key and +then pressing another I can bind that to + +0:45:20.481,0:45:23.781 +something or I can bind a single keystroke +to something. What the leader + +0:45:23.781,0:45:26.031 +key lets you do, is bind + +0:45:26.031,0:45:28.311 +So you can assign any key +to be the leader key and + +0:45:28.311,0:45:32.841 +then you can assign leader followed by +some other key to some action so for + +0:45:32.841,0:45:36.831 +example like Jose's leader key is space +and they can combine space and then + +0:45:36.831,0:45:41.601 +releasing space followed by some other +key to an arbitrary vim command so it + +0:45:41.601,0:45:45.631 +just gives you yet another way of binding +like a whole set of key combinations. 
+ +0:45:45.631,0:45:49.751 +Leader key plus kind of any key on +the keyboard to some functionality + +0:45:49.751,0:45:53.751 +I think I've I forget whether +we covered macros in the vim + +0:45:53.751,0:45:58.581 +uh sure but like vim macros are worth +learning they're not that complicated + +0:45:58.581,0:46:03.141 +but knowing that they're there and knowing +how to use them is going to save + +0:46:03.141,0:46:09.501 +you so much time. The other one is something +called marks. So in vim you can + +0:46:09.501,0:46:13.491 +press m and then any letter on your keyboard +to make a mark in that file and + +0:46:13.491,0:46:18.021 +then you can press apostrophe on the +same letter to jump back to the same + +0:46:18.021,0:46:21.801 +place. This is really useful if you're +like moving back and forth + +0:46:21.801,0:46:25.491 +between two different parts of your code +for example. You can mark one as A and + +0:46:25.491,0:46:29.611 +one as B and you can then jump between +them with tick A and tick B. + +0:46:29.611,0:46:34.851 +There's also Ctrl+O which jumps to the previous +place you were in the file no matter + +0:46:34.851,0:46:40.611 +what caused you to move. So for example +if I am in a some line and then I jump + +0:46:40.611,0:46:45.201 +to B and then I jump to A, Ctrl+O will +take me back to B and then back to the + +0:46:45.201,0:46:48.831 +place I originally was. This can also be +handy for things like if you're doing a + +0:46:48.831,0:46:52.671 +search then the place that you +started the search is a part of + +0:46:52.671,0:46:56.211 +that stack. 
So I can do a search I can +then like step through the results + +0:46:56.211,0:47:00.801 +and like change them and then Ctrl+O +all the way back up to the search + +0:47:00.801,0:47:06.201 +Ctrl+O also lets you move across files so +if I go from one file to somewhere else in + +0:47:06.201,0:47:09.681 +a different file and somewhere else in the +first file Ctrl+O will move me back + +0:47:09.681,0:47:15.261 +through that stack and then there's +Ctrl+I to move forward in that + +0:47:15.261,0:47:20.841 +stack and so it's not as though you +pop it and it goes away forever + +0:47:20.841,0:47:26.541 +The command colon earlier is really handy. +So, colon earlier gives you an earlier + +0:47:26.541,0:47:32.870 +version of the same file and it does +this based on time not based on actions + +0:47:32.870,0:47:36.651 +so for example if you press a bunch of like +undo and redo and make some changes + +0:47:36.651,0:47:42.561 +and stuff, earlier will take a literally +earlier as in time version of your file + +0:47:42.561,0:47:46.971 +and restore it to your buffer. This can +sometimes be good if you like undid and + +0:47:46.971,0:47:50.841 +then rewrote something and then realize +you actually wanted the version that was + +0:47:50.841,0:47:55.100 +there before you started undoing. Earlier +lets you do this. And there's a plug-in + +0:47:55.100,0:48:01.971 +called undo tree or something like +that. There are several of these, + +0:48:01.971,0:48:05.781 +that let you actually explore the full +tree of undo history that vim keeps + +0:48:05.781,0:48:09.201 +because it doesn't just keep a linear history +it actually keeps the full tree + +0:48:09.201,0:48:12.771 +and letting you explore that might in +some cases save you from having to + +0:48:12.771,0:48:16.461 +re-type stuff you typed in the past or +stuff you just forgot exactly what you + +0:48:16.461,0:48:21.081 +had there that used to work and no longer +works.
And this is one final one I + +0:48:21.081,0:48:26.751 +want to mention which is, we mentioned +how in vim you have verbs and nouns + +0:48:26.751,0:48:33.201 +right to your verbs like delete or yank +and then you have nouns like next of + +0:48:33.201,0:48:37.401 +this character or percent to swap brackets +and that sort of stuff the + +0:48:37.401,0:48:44.571 +search command is a noun so you can do +things like D slash and then a string + +0:48:44.571,0:48:50.261 +and it will delete up to the next match +of that pattern this is extremely useful + +0:48:50.261,0:48:54.251 +and I use it all the time + +0:48:58.500,0:49:03.520 +One another neat addition on the undo stuff +that I find incredibly valuable in + +0:49:03.520,0:49:08.201 +an everyday basis is that like one of +the built-in functionalities of vim + +0:49:08.201,0:49:13.510 +is that you can specify an undo directory +and if you have a specified an + +0:49:13.510,0:49:17.620 +undo directory by default vim, if you +don't have this enabled, whenever you + +0:49:17.620,0:49:23.091 +enter a file your undo history is +clean, there's nothing in there + +0:49:23.091,0:49:26.371 +and as you make changes and then +undo them you kind of create this + +0:49:26.380,0:49:32.800 +history but as soon as you exit the +file that's lost. Sorry, as soon + +0:49:32.800,0:49:37.181 +as you exit vim, that's lost. However +if you have an undodir, vim is + +0:49:37.181,0:49:41.651 +gonna persist all those changes into +this directory so no matter how many + +0:49:41.651,0:49:45.580 +times you enter and leave that history +is persisted and it's incredibly + +0:49:45.580,0:49:48.191 +helpful because even like + +0:49:48.191,0:49:50.290 +it can be very helpful for +some files that you modify + +0:49:50.290,0:49:54.760 +often because then you can kind of keep +the flow. 
But it's also sometimes really + +0:49:54.760,0:50:00.010 +helpful if you modify your bashrc see and +something broke like five days later and + +0:50:00.010,0:50:03.070 +then you've vim again. Like what actually +did I change ,if you don't + +0:50:03.070,0:50:06.760 +have say like version control, then +you can just check the undos and + +0:50:06.760,0:50:10.661 +that's actually what happened. And +the last one, it's also really + +0:50:10.661,0:50:14.891 +worth familiarizing yourself with registers +and what different special + +0:50:14.891,0:50:20.380 +registers vim uses. So for example if +you want to copy/paste really that's + +0:50:20.380,0:50:26.201 +gone into in a specific register and if you +want to for example use the a OS a copy + +0:50:26.201,0:50:30.040 +like the OS clipboard, you should +be copying or yanking + +0:50:30.040,0:50:36.250 +copying and pasting from a different register +and there's a lot of them and yeah + +0:50:36.251,0:50:41.310 +I think that you should explore, there's +a lot of things to know about registers + +0:50:42.271,0:50:45.070 +The next question is asking about two-factor +authentication and I'll just give + +0:50:45.070,0:50:48.490 +a very quick answer to this one in the interest +of time. So it's worth using two + +0:50:48.490,0:50:52.480 +factor auth for anything security sensitive +so I use it for my GitHub + +0:50:52.480,0:50:56.710 +account and for my email and stuff like +that. And there's a bunch of different + +0:50:56.710,0:51:01.360 +types of two-factor auth. 
From SMS based +two-factor auth where you get special + +0:51:01.360,0:51:04.630 +like a number texted to you when you try +to log in you have to type that number + +0:51:04.630,0:51:08.710 +and to other tools like universal two +factor this is like those Yubikeys + +0:51:08.710,0:51:11.350 +that you plug into your you have +to tap it every time you log in + +0:51:11.350,0:51:18.130 +so not all, (yeah Jon is holding a +Yubikey), not all two-factor auth is + +0:51:18.130,0:51:22.240 +created equal and you really want to be +using something like U2F rather than SMS + +0:51:22.240,0:51:25.300 +based two-factor auth. There's something +based on one-time pass codes that you + +0:51:25.300,0:51:28.810 +have to type in we don't have time to get +into the details of why some methods + +0:51:28.810,0:51:32.020 +are better than others but at a high +level use U2F and the Internet has + +0:51:32.020,0:51:37.560 +plenty of explanations for why other +methods are not a great idea + +0:51:37.711,0:51:41.851 +Last question, any comments on differences +between web browsers? + +0:51:48.171,0:51:50.171 +Yes + +0:51:54.711,0:52:00.451 +Differences between web browsers, there +are fewer and fewer differences between + +0:52:00.461,0:52:06.000 +web browsers these days. At this point +almost all web browsers are Chrome + +0:52:06.000,0:52:09.580 +Either because you're using Chrome or +because you're using a browser that's + +0:52:09.580,0:52:15.550 +using the same browser engine as Chrome. +It's a little bit sad, one might say, but + +0:52:15.550,0:52:20.511 +I think these days whether you choose + +0:52:20.511,0:52:24.451 +Chrome is a great browser for security reasons + +0:52:24.451,0:52:28.471 +if you want to have something +that's more customizable or + +0:52:28.471,0:52:39.490 +you don't want to be tied to Google then +use Firefox, don't use Safari it's a + +0:52:39.490,0:52:45.701 +worse version of Chrome.
The new Internet +Explorer edge is pretty decent and also + +0:52:45.701,0:52:50.820 +uses the same browser engine as +Chrome and that's probably fine + +0:52:50.820,0:52:54.641 +although avoid it if you can because it +has some like legacy modes you don't + +0:52:54.641,0:52:58.064 +want to deal with. I think that's + +0:52:58.064,0:53:03.091 +Oh, there's a cool new browser called flow + +0:53:03.091,0:53:05.500 +that you can't use for anything useful +yet but they're actually writing + +0:53:05.500,0:53:08.693 +their own browser engine and that's really neat + +0:53:08.693,0:53:14.951 +Firefox also has this project called servo which is +they're really implementing their browser engine + +0:53:14.951,0:53:19.570 +in Rust in order to write it to be like +super concurrent and what they've done + +0:53:19.570,0:53:24.961 +is they've started to take modules +from that version and port them + +0:53:24.961,0:53:29.041 +over to gecko or integrate them with gecko +which is the main browser engine + +0:53:29.041,0:53:32.221 +for Firefox just to get those +speed ups there as well + +0:53:32.221,0:53:37.031 +and that's a neat neat thing +you can be watching out for + +0:53:39.231,0:53:41.851 +That is all the questions, hey we did it. Nice + +0:53:41.851,0:53:50.751 +I guess thanks for taking the missing semester +class and let's do it again next year From 1f614a5920474931967cd44bfdd34a4d64ab7c72 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 3 Feb 2020 08:33:49 -0500 Subject: [PATCH 245/640] Do editing pass on Q&A lecture --- _2020/qa.md | 93 ++++++++++++++++++++++++++--------------------------- 1 file changed, 46 insertions(+), 47 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 27e636bf..9314cc5e 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -8,8 +8,7 @@ video: id: Wz50FvGG6xU --- -As the last part of this lecture series, this section focused on answering questions that students from this class submitted. 
-Here we include a summary to what we answered for each question. +For the last lecture, we answered questions that the students submitted: - [Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc ](#any-recommendations-on-learning-operating-systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc) - [What are some of the tools you'd prioritize learning first?](#what-are-some-of-the-tools-youd-prioritize-learning-first) @@ -27,16 +26,15 @@ Here we include a summary to what we answered for each question. - [Any more Vim tips?](#any-more-vim-tips) - [What is 2FA and why should I use it?](#what-is-2fa-and-why-should-i-use-it) - [Any comments on differences between web browsers?](#any-comments-on-differences-between-web-browsers) -] ## Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc -First, it is unclear whether you actually need to be very familiar with all of this topics since they are very low level topics. +First, it is unclear whether you actually need to be very familiar with all of these topics since they are very low level topics. They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will relevant, with the exception of processes and signals that were briefly covered in other lectures. Some good resources to learn about this topic: -- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/2019/schedule.html) - Graduate level class on Operating System Engineering. Class materials are publicly accessible +- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - Graduate level class on Operating System Engineering. Class materials are publicly available. - Modern Operating Systems (4th ed) - by Andrew S. Tanenbaum is a good overview of many of the mentioned concepts. 
- The Design and Implementation of the FreeBSD Operating System - A good resource about the FreeBSD OS (note that this is not Linux). - Other guides like [Writing an OS in Rust](https://os.phil-opp.com/) where people implement a kernel step by step in various languages, mostly for teaching purposes. @@ -46,17 +44,17 @@ Some good resources to learn about this topic: Some topics worth prioritizing: -- Learning how to use you keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, &c -- Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well -- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous. -- Learning about version control tools like Git and how to use it in conjunction with GitHub to collaborate in modern software projects. +- Learning how to use you keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, &c. +- Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well. +- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous.. +- Learning about version control tools like Git and how to use it in conjunction with GitHub to collaborate in modern software projects. ## When do I use Python versus a Bash scripts versus some other language? -In general, bash scripts should be useful for short and simple one-off scripts when you just one to run a specific series of commands. bash has a set of oddities that make it hard to work with for larger programs or scripts: +In general, bash scripts are useful for short and simple one-off scripts when you just want to run a specific series of commands. 
bash has a set of oddities that make it hard to work with for larger programs or scripts: -- bash is easy to get right for a simple use case but it can be really hard to get right for all possible inputs. For example, spaces in script arguments have led to countless bugs in bash scripts -- bash is not very akin to code reuse so it can be hard to compose previous programs that you might have written. More generally, there is no concept of software libraries in bash. +- bash is easy to get right for a simple use case but it can be really hard to get right for all possible inputs. For example, spaces in script arguments have led to countless bugs in bash scripts. +- bash is not amenable to code reuse so it can be hard to reuse components of previous programs you have written. More generally, there is no concept of software libraries in bash. - bash relies on many magic strings like `$?` or `$@` to refer to specific values, whereas other languages refer to them explicitly, like `exitCode` or `sys.args` respectively. Therefore, for larger and/or more complex scripts we recommend using more mature scripting languages like Python or Ruby. @@ -65,50 +63,51 @@ If you find a library that implements the specific functionality you care about ## What is the difference between `source script.sh` and `./script.sh` -In both cases the `script.sh` will be read and executed into a bash session, the difference lies in which session is running the commands. +In both cases the `script.sh` will be read and executed in a bash session, the difference lies in which session is running the commands. For `source` the commands are executed in your current bash session and thus any changes made to the current environment, like changing directories or defining functions will persist in the current session once the `source` command finishes executing. 
-When running the script standalone like `./script.sh`, your current bash session starts a new instance of bash which will run the commands in `script.sh`. +When running the script standalone like `./script.sh`, your current bash session starts a new instance of bash that will run the commands in `script.sh`. Thus, if `script.sh` changes directories, the new bash instance will change directories but once it exits and returns control to the parent bash session, the parent session will remain in the same place. Similarly, if `script.sh` defines a function that you want to access in your terminal, you need to `source` it for it to be defined in your current bash session. Otherwise, if you run it, the new bash process will be the one to process the function definition instead of your current shell. ## What are the places where various packages and tools are stored and how does referencing them work? What even is `/bin` or `/lib`? -Regarding programs that you execute in your terminal they are all found in the directories listed in your `PATH` environment variable and you can use the `which` command (or the `type` command) to check where your shell is finding an specific program. -In general, there are some conventions about where programs live. Here is some of the ones we talked about, check the [Filesystem, Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. +Regarding programs that you execute in your terminal, they are all found in the directories listed in your `PATH` environment variable and you can use the `which` command (or the `type` command) to check where your shell is finding an specific program. +In general, there are some conventions about where specific types of files live. Here is some of the ones we talked about, check the [Filesystem, Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. -- `/bin` - Essential command binaries. 
+- `/bin` - Essential command binaries +- `/sbin` - Essential system binaries, usually to be run by root - `/dev` - Device files, special files that often are interfaces to hardware devices - `/etc` - Host-specific system-wide configuration files - `/home` - Home directories for users in the system - `/lib` - Common libraries for system programs - `/opt` - Optional application software -- `/sys` - Covered in the first lecture, contains information and configuration for the system +- `/sys` - Contains information and configuration for the system (covered in the [first lecture](/2020/course-shell/)) - `/tmp` - Temporary files (also `/var/tmp`). Usually deleted between reboots. - `/usr/` - Read only user data + `/usr/bin` - Non-essential command binaries - + `/usr/sbin` - Non-essential system binaries, often only supposed to be run by root + + `/usr/sbin` - Non-essential system binaries, usually to be run by root + `/usr/local/bin` - Binaries for user compiled programs - `/var` - Variable files like logs or caches ## Should I `apt-get install` a python-whatever, or `pip install` whatever package? -There's no universal answer to this question, but in revolves around the more general question of whether you should use your systems package manager to install things or a language specific package manager. A few things to take into account: +There's no universal answer to this question. It's related to the more general question of whether you should use your system's package manager or a language-specific package manager to install software. A few things to take into account: -- Common packages will be available through both, but less popular ones or more recent ones might not be available in your system package. In this, case using the language specific manager is the better choice. -- Similarly, language specific package managers usually have more up to date versions of packages that system package managers. 
-- When using your system package manager, libraries will be installed system wide. This means that if you need different versions of a library for development purposes, the system package manager might not suffice. For this scenario, most programming languages provide some sort of isolated or virtual environment so you can install different versions of libraries without running into conflicts. For Python, there's virtualenv or for Ruby RVM. -- Depending on the operating system and the hardware architecture, some of these packages might come with binaries or might need to be compiled. For instance, in ARM computers like the Raspberry Pi using the system package manager can be better than the language specific one if the former comes in form of binaries and the later needs to be compiled. This is highly dependent on the specific setup so you should check. +- Common packages will be available through both, but less popular ones or more recent ones might not be available in your system package manager. In this, case using the language-specific tool is the better choice. +- Similarly, language-specific package managers usually have more up to date versions of packages than system package managers. +- When using your system package manager, libraries will be installed system wide. This means that if you need different versions of a library for development purposes, the system package manager might not suffice. For this scenario, most programming languages provide some sort of isolated or virtual environment so you can install different versions of libraries without running into conflicts. For Python, there's virtualenv, and for Ruby, there's RVM. +- Depending on the operating system and the hardware architecture, some of these packages might come with binaries or might need to be compiled. 
For instance, in ARM computers like the Raspberry Pi, using the system package manager can be better than the language specific one if the former comes in form of binaries and the later needs to be compiled. This is highly dependent on your specific setup. -You should try to use one solution or the other and not both since that can lead to hard to debug conflicts. Our recommendation is to use the language specific package manager whenever possible, and to use isolated environments (like Python's virtualenv) to avoid polluting the global environment. +You should try to use one solution or the other and not both since that can lead to conflicts that are hard to debug. Our recommendation is to use the language-specific package manager whenever possible, and to use isolated environments (like Python's virtualenv) to avoid polluting the global environment. ## What's the easiest and best profiling tools to use to improve performance of my code? The easiest tool that is quite useful for profiling purposes is [print timing](/2020/debugging-profiling/#timing). You just manually compute the time taken between different parts of your code. By repeatedly doing this, you can effectively do a binary search over your code and find the segment of code that took the longest. -For more advanced tools, Valgrind's [Callgrind](http://valgrind.org/docs/manual/cl-manual.html) lets you run your program and measure how long everything takes and all the call stacks, namely which function called which other function. It then produces an annotated version of your program's source code with the time taken per line. However, it slows down your program by an order of magnitude and does not support threads. For other cases, the [`perf`](http://www.brendangregg.com/perf.html) tool and other language specific sampling profilers can output useful data pretty quickly. [Flamegraphs](http://www.brendangregg.com/flamegraphs.html) are good visualization tool for the output of said sampling profilers. 
You should also try to use specific tools for the programming language or task you are working with. E.g. for web development, the dev tools built into Chrome and Firefox have fantastic profilers. +For more advanced tools, Valgrind's [Callgrind](http://valgrind.org/docs/manual/cl-manual.html) lets you run your program and measure how long everything takes and all the call stacks, namely which function called which other function. It then produces an annotated version of your program's source code with the time taken per line. However, it slows down your program by an order of magnitude and does not support threads. For other cases, the [`perf`](http://www.brendangregg.com/perf.html) tool and other language-specific sampling profilers can output useful data pretty quickly. [Flamegraphs](http://www.brendangregg.com/flamegraphs.html) are a good visualization tool for the output of said sampling profilers. You should also try to use specific tools for the programming language or task you are working with. For example, for web development, the dev tools built into Chrome and Firefox have fantastic profilers. -Sometimes the slow part of your code will be because your system is waiting for an event like a disk read or a network packet. In those cases, it is worth checking that back of the envelope calculations about the theoretical speed in terms of hardware capabilities do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like [eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) that perform kernel tracing of user programs. In particular [`bpftrace`](https://github.com/iovisor/bpftrace) is worth checking out if you need to perform this sort of low level profiling. +Sometimes the slow part of your code will be because your system is waiting for an event like a disk read or a network packet.
In those cases, it is worth checking that back-of-the-envelope calculations about the theoretical speed in terms of hardware capabilities do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like [eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) that perform kernel tracing of user programs. In particular [`bpftrace`](https://github.com/iovisor/bpftrace) is worth checking out if you need to perform this sort of low level profiling. ## What browser plugins do you use? @@ -116,34 +115,34 @@ Sometimes the slow part of your code will be because your system is waiting for Some of our favorites, mostly related to security and usability: - [uBlock Origin](https://github.com/gorhill/uBlock) - It is a [wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode) blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also cover inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to [medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode) or even [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode). Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. Otherwise, the [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode) is already a good default that blocks most ads and tracking. You can also define you own rules about what website objects to block. -- [Stylus](https://github.com/openstyles/stylus/) - a fork of Stylish (don't use Stylish, it was shown to steal users browsing history), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. 
This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published in [userstyles.org](https://userstyles.org/). Most common websites have one or several dark theme stylesheets for instance. -- Full Page Screen Capture - Built into Firefox and [Chrome](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en) extension. Let's you take a screenshot of a full website, often much better than printing for reference purposes. +- [Stylus](https://github.com/openstyles/stylus/) - a fork of Stylish (don't use Stylish, it was shown to [steal users' browsing history](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/)), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published in [userstyles.org](https://userstyles.org/). Most common websites have one or several dark theme stylesheets for instance. +- Full Page Screen Capture - Built into Firefox and available as a [Chrome extension](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en). Lets you take a screenshot of a full website, often much better than printing for reference purposes. - [Multi Account Containers](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - lets you separate cookies into "containers", allowing you to browse the web with different identities and/or ensuring that websites are unable to share information between them.
-- Password Manager Integration - Most password managers have browser extensions that make inputting your credentials into websites not only more convenient but also more secure. Compared to simply copy-pasting your user and password, these tools will first check that the website domain matches the one listed for the entry, preventing phishing attacks that recreate popular websites to steal credentials. +- Password Manager Integration - Most password managers have browser extensions that make inputting your credentials into websites not only more convenient but also more secure. Compared to simply copy-pasting your user and password, these tools will first check that the website domain matches the one listed for the entry, preventing phishing attacks that impersonate popular websites to steal credentials. ## What are other useful data wrangling tools? -Some of the data wrangling tools we did not have to cover during the data wrangling lecture include `jq` or `pup` which are specialized parsers for JSON and HTML data respectively. The Perl programming language is another good tool for more advanced data wrangling pipelines. Another trick is the `column -t` command that can be used to convert whitespace text (not necessarily aligned) into properly column aligned text. +Some of the data wrangling tools we did not have time to cover during the data wrangling lecture include `jq` or `pup` which are specialized parsers for JSON and HTML data respectively. The Perl programming language is another good tool for more advanced data wrangling pipelines. Another trick is the `column -t` command that can be used to convert whitespace text (not necessarily aligned) into properly column aligned text. -More generally a couple of more unconventional data wrangling tools are vim and Python. For some complex and multi-line transformations, vim macros can be a quite invaluable tools to use. 
You can just record a series of actions and repeat them as many times as you want, for instance in the editors [lecture notes](https://missing.csail.mit.edu/2020/editors/#macros) there is an example of converting a XML formatted file into JSON just using vim macros. +More generally a couple of more unconventional data wrangling tools are vim and Python. For some complex and multi-line transformations, vim macros can be an invaluable tool. You can just record a series of actions and repeat them as many times as you want; for instance, in the editors [lecture notes](/2020/editors/#macros) (and last year's [video](/2019/editors/)) there is an example of converting an XML-formatted file into JSON just using vim macros. -For tabular data, often presented in CSVs, the [pandas](https://pandas.pydata.org/) Python library is a great tool. Not only because it makes it quite easy to define complex operations like group by, join or filters; but also makes it quite easy to plot different properties of your data. It also supports exporting to many table formats including XLS, HTML or LaTeX. Alternatively the R programming language (an arguably [bad](http://arrgh.tim-smith.us/)) programming language, it has lots of functionality for computing statistics over data and can be quite useful as the last step of your pipeline. The [ggplot2](https://ggplot2.tidyverse.org/) is a great plotting library in R. +For tabular data, often presented in CSVs, the [pandas](https://pandas.pydata.org/) Python library is a great tool. Not only because it makes it quite easy to define complex operations like group by, join or filters; but also makes it quite easy to plot different properties of your data. It also supports exporting to many table formats including XLS, HTML or LaTeX.
Alternatively the R programming language (an arguably [bad](http://arrgh.tim-smith.us/) programming language) has lots of functionality for computing statistics over data and can be quite useful as the last step of your pipeline. [ggplot2](https://ggplot2.tidyverse.org/) is a great plotting library in R. ## What is the difference between Docker and a Virtual Machine? -Docker uses a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. Unlike VMs, containers avoid running another instance of the kernel and just share the kernel with the host. In Linux this is achieved through a mechanism called LXC and it makes use of a series of isolation mechanism to spin up a program that thinks it's running on its own hardware but it's actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. -On the flip side, containers have a weaker isolation and only work if the host runs the same kernel. For instance if you run Docker on macOS, Docker need to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is an specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks like by default Docker containers will not persist any form of storage between reboots. +Docker is based on a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. Unlike VMs, containers avoid running another instance of the kernel and instead share the kernel with the host. 
In Linux, this is achieved through a mechanism called LXC, and it makes use of a series of isolation mechanisms to spin up a program that thinks it's running on its own hardware but it's actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. +On the flip side, containers have weaker isolation and only work if the host runs the same kernel. For instance, if you run Docker on macOS, Docker needs to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is a specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks: for example, Docker containers will not persist any form of storage between reboots by default. ## What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)? -Regarding Linux distros, even though there are many, many distros, most of them will behave fairly identical for most use cases. +Regarding Linux distros, even though there are many, many distros, most of them will behave fairly identically for most use cases. Most of Linux and UNIX features and inner workings can be learned in any distro. A fundamental difference between distros is how they deal with package updates. -Some distros, like Arch Linux, use a rolling update policy where things are bleeding edge but things might break every so often. On the other hand, some distros like Debian, CentOS or Ubuntu LTS releases are much more conservative with releasing updates in their repositories so things are usually more stable at the expense of sacrificing newer features. -Our recommendation for an easy experience with both desktops and servers is to use Debian or Ubuntu. +Some distros, like Arch Linux, use a rolling update policy where things are bleeding-edge but things might break every so often.
On the other hand, some distros like Debian, CentOS or Ubuntu LTS releases are much more conservative with releasing updates in their repositories so things are usually more stable at the expense of sacrificing newer features. +Our recommendation for an easy and stable experience with both desktops and servers is to use Debian or Ubuntu. Mac OS is a good middle point between Windows and Linux that has a nicely polished interface. However, Mac OS is based on BSD rather than Linux, so some parts of the system and commands are different. -An alternative worth checking is FreeBSD. Even though some programs will not run on FreeBSD, the BSD ecosystem is much less fragmented and better documented than Linux is. +An alternative worth checking is FreeBSD. Even though some programs will not run on FreeBSD, the BSD ecosystem is much less fragmented and better documented than Linux. We discourage Windows for anything but for developing Windows applications or if there is some deal breaker feature that you need, like good driver support for gaming. For dual boot systems, we think that the most working implementation is macOS' bootcamp and that any other combination can be problematic on the long run, specially if you combine it with other features like disk encryption. @@ -151,13 +150,13 @@ For dual boot systems, we think that the most working implementation is macOS' b ## Vim vs Emacs? The three of us use vim as our primary editor but Emacs is also a good alternative and it's worth trying both to see which works better for you. Emacs does not follow vim's modal editing, but this can be enabled through Emacs plugins like [Evil](https://github.com/emacs-evil/evil) or [Doom Emacs](https://github.com/hlissner/doom-emacs). -An advantage of using Emacs is that is implemented in Lisp, a better scripting language than vimscript. Thus, Emacs plugins are sometimes excellent. 
+An advantage of using Emacs is that extensions can be implemented in Lisp, a better scripting language than vimscript, Vim's default scripting language. ## Any tips or tricks for Machine Learning applications? -Ignoring ML specific advice, some of the lessons and takeaways from this class can directly be applied to ML applications. +Some of the lessons and takeaways from this class can directly be applied to ML applications. As it is the case with many science disciplines, in ML you often perform a series of experiments and want to check what things worked and what didn't. -One can use shell tools to easily and quickly search through these experiments and aggregate the results in a sensible way. This could mean subselecting all experiments in a given time frame or that use a specific dataset. By using a simple JSON file to log all relevant parameters of the experiments, this can be incredibly simple with the tools we covered in this class. +You can use shell tools to easily and quickly search through these experiments and aggregate the results in a sensible way. This could mean subselecting all experiments in a given time frame or that use a specific dataset. By using a simple JSON file to log all relevant parameters of the experiments, this can be incredibly simple with the tools we covered in this class. Lastly, if you do not work with some sort of cluster where you submit your GPU jobs, you should look into how to automate this process since it can be a quite time consuming task that also eats away your mental energy. ## Any more Vim tips? @@ -167,17 +166,17 @@ A few more tips: - Plugins - Take your time and explore the plugin landscape. There are a lot of great plugins that address some of vim's shortcomings or add new functionality that composes well with existing vim workflows. For this, good resources are [VimAwesome](https://vimawesome.com/) and other programmers' dotfiles. - Marks - In vim, you can set a mark doing `m` for some letter `X`. 
You can then go back to that mark doing `'`. This let's you quickly navigate to specific locations within a file or even across files. - Navigation - `Ctrl+O` and `Ctrl+I` move you backward and forward respectively through your recently visited locations. -- Undo Tree - Vim has a quite fancy mechanism for keeping tack of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins expose this tree in a graphical way. -- Undo with time - The `earlier` and `later` commands will let you navigate the files using time references instead of one change at a time. -- [Persistent undo](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) An amazing built in feature of vim that is disabled by default is persisting undo history between vim invocations. By setting `undofile` and `undodir` in your `.vimrc`, vim will storage a per-file history of changes. +- Undo Tree - Vim has a quite fancy mechanism for keeping track of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins like [gundo.vim](https://github.com/sjl/gundo.vim) and [undotree](https://github.com/mbbill/undotree) expose this tree in a graphical way. +- Undo with time - The `:earlier` and `:later` commands will let you navigate the files using time references instead of one change at a time. +- [Persistent undo](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) is an amazing built-in feature of vim that is disabled by default. It persists undo history between vim invocations. By setting `undofile` and `undodir` in your `.vimrc`, vim will store a per-file history of changes. - Leader Key - The leader key is special key that is often left to the user to be configured for custom commands.
The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses ` U` to open the undo tree. - Advanced Text Objects - Text objects like searches can also be composed with vim commands. E.g. `d/` will delete to the next match of said pattern or `cgn` will change the next occurrence of the last searched string. ## What is 2FA and why should I use it? -Two Factor Authentication (2FA) adds an extra layer of protection to your accounts on top of passwords. In order to login, you not only have to know some password you also have to "prove" in some way you have access to some hardware device. In the most simple case, this can be achieved by receiving an SMS on your phone, although there are known issues with SMS 2FA. A better alternative we endorse is to use a U2F solution like for example YubiKeys. +Two Factor Authentication (2FA) adds an extra layer of protection to your accounts on top of passwords. In order to login, you not only have to know some password, but you also have to "prove" in some way you have access to some hardware device. In the most simple case, this can be achieved by receiving an SMS on your phone, although there are [known issues](https://www.kaspersky.com/blog/2fa-practical-guide/24219/) with SMS 2FA. A better alternative we endorse is to use a [U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor) solution like [YubiKey](https://www.yubico.com/). ## Any comments on differences between web browsers? -The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (WebKit). This means that Safari or the Microsoft Edge, both based on WebKit are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. 
It is comparable to Chrome in pretty much every regard and it excels for privacy reasons. +The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (WebKit). This means that Safari or the Microsoft Edge, both based on WebKit, are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels for privacy reasons. Another browser called [Flow](https://www.ekioh.com/flow-browser/) is not user ready yet, but it is implementing a new rendering engine that promises to be faster than the current ones. From 49228867b2d96c531df957e9108c442918dd0045 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 3 Feb 2020 10:10:00 -0500 Subject: [PATCH 246/640] Remove irrelevant info now that class is over --- _2020/mlk-day.md | 6 ------ index.md | 4 ++++ 2 files changed, 4 insertions(+), 6 deletions(-) delete mode 100644 _2020/mlk-day.md diff --git a/_2020/mlk-day.md b/_2020/mlk-day.md deleted file mode 100644 index ff8ca39c..00000000 --- a/_2020/mlk-day.md +++ /dev/null @@ -1,6 +0,0 @@ ---- -layout: null -title: "MLK day" -date: 2019-01-20 -noclass: true ---- diff --git a/index.md b/index.md index 103850d7..a6404771 100644 --- a/index.md +++ b/index.md @@ -18,14 +18,18 @@ impossibly complex. Read about the [motivation behind this class](/about/). +{% comment %} # Registration Sign up for the IAP 2020 class by filling out this [registration form](https://forms.gle/TD1KnwCSV52qexVt9). +{% endcomment %} # Schedule +{% comment %} **Lecture**: 35-225, 2pm--3pm
**Office hours**: 32-G9 lounge, 3pm--4pm (every day, right after lecture) +{% endcomment %}
    {% assign lectures = site['2020'] | sort: 'date' %} From 022060d679050a90db974262fd4335258a31a8fc Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 3 Feb 2020 10:11:12 -0500 Subject: [PATCH 247/640] Update description --- _2020/metaprogramming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 5c265fcb..a0859646 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -1,7 +1,7 @@ --- layout: lecture title: "Metaprogramming" -details: build systems, sermver, makefiles, CI +details: build systems, dependency management, testing, CI date: 2019-01-27 ready: true video: From 086de9c4123fea9cad86e896936acbd7dd51be6b Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 3 Feb 2020 12:37:23 -0500 Subject: [PATCH 248/640] Update links --- _2019/index.html | 16 ++++++++++++++++ index.md | 12 ++++++------ 2 files changed, 22 insertions(+), 6 deletions(-) diff --git a/_2019/index.html b/_2019/index.html index a90ce940..1b4890e5 100644 --- a/_2019/index.html +++ b/_2019/index.html @@ -46,3 +46,19 @@

    Thursday, 1/31

  • Web and browsers
  • Security and privacy
+ +
+ +

Discussion

+ +

We've also shared this class beyond MIT in the hopes that others may +benefit from these resources. You can find posts and discussion on

+ + diff --git a/index.md b/index.md index a6404771..51ea7aea 100644 --- a/index.md +++ b/index.md @@ -60,12 +60,12 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 We've also shared this class beyond MIT in the hopes that others may benefit from these resources. You can find posts and discussion on - - [Hacker News](https://news.ycombinator.com/item?id=19078281) - - [Lobsters](https://lobste.rs/s/h6157x/mit_hacker_tools_lecture_series_on) - - [/r/learnprogramming](https://www.reddit.com/r/learnprogramming/comments/an42uu/mit_hacker_tools_a_lecture_series_on_programmer/) - - [/r/programming](https://www.reddit.com/r/programming/comments/an3xki/mit_hacker_tools_a_lecture_series_on_programmer/) - - [Twitter](https://twitter.com/Jonhoo/status/1091896192332693504) - - [YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuiujH1lpn8cA9dsyulbYRv) + - [Hacker News](https://news.ycombinator.com/item?id=22226380) + - [Lobsters](https://lobste.rs/s/ti1k98/missing_semester_your_cs_education_mit) + - [/r/learnprogramming](https://www.reddit.com/r/learnprogramming/comments/eyagda/the_missing_semester_of_your_cs_education_mit/) + - [/r/programming](https://www.reddit.com/r/programming/comments/eyagcd/the_missing_semester_of_your_cs_education_mit/) + - [Twitter](https://twitter.com/jonhoo/status/1224383452591509507) + - [YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J) ## Acknowledgements From 5e3f62cef3a2c93ea64e41a8db0bedce2c53bcec Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 3 Feb 2020 12:44:32 -0500 Subject: [PATCH 249/640] Add redirect --- lectures.html | 5 +++++ 1 file changed, 5 insertions(+) create mode 100644 lectures.html diff --git a/lectures.html b/lectures.html new file mode 100644 index 00000000..2af92def --- /dev/null +++ b/lectures.html @@ -0,0 +1,5 @@ +--- +layout: redirect +redirect: /2020/ +title: Lectures +--- From f95e9dd510a845bc7f922666986daaf08f7db8ab Mon Sep 17 00:00:00 
2001 From: Anish Athalye Date: Mon, 3 Feb 2020 12:49:13 -0500 Subject: [PATCH 250/640] Fix link --- _2020/course-shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index c401afcb..6d646621 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -35,7 +35,7 @@ Science curriculum. # Class structure The class consists of 11 1-hour lectures, each one centering on a -[particular topic](/lectures/). The lectures are largely independent, +[particular topic](/2020/). The lectures are largely independent, though as the semester goes on we will presume that you are familiar with the content from the earlier lectures. We have lecture notes online, but there will be a lot of content covered in class (e.g. in the From 23adc069634bd49501f1ba059231aef4b141f97f Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Mon, 3 Feb 2020 12:54:06 -0500 Subject: [PATCH 251/640] Add captions for lecture 2 --- static/files/subtitles/2020/shell-tools.sbv | 2625 +++++++++++++++++++ 1 file changed, 2625 insertions(+) create mode 100644 static/files/subtitles/2020/shell-tools.sbv diff --git a/static/files/subtitles/2020/shell-tools.sbv b/static/files/subtitles/2020/shell-tools.sbv new file mode 100644 index 00000000..1eb976e0 --- /dev/null +++ b/static/files/subtitles/2020/shell-tools.sbv @@ -0,0 +1,2625 @@ +0:00:00.400,0:00:02.860 +Okay, welcome back. + +0:00:02.860,0:00:05.920 +Today we're gonna cover a couple separate + +0:00:05.920,0:00:07.620 +two main topics related to the shell. + +0:00:07.620,0:00:11.240 +First, we're gonna do some kind of shell +scripting, mainly related to bash, + +0:00:11.240,0:00:14.160 +which is the shell that most of you will start + +0:00:14.160,0:00:18.520 +in Mac, or like in most Linux systems, +that's the default shell. + +0:00:18.520,0:00:22.720 +And it's also kind of backward compatible through +other shells like zsh, it's pretty nice. 
+ +0:00:22.740,0:00:25.940 +And then we're gonna cover some other shell +tools that are really convenient, + +0:00:26.060,0:00:29.320 +so you avoid doing really repetitive tasks, + +0:00:29.320,0:00:31.580 +like looking for some piece of code + +0:00:31.580,0:00:33.420 +or for some elusive file. + +0:00:33.420,0:00:36.160 +And there are already really +nice built-in commands + +0:00:36.160,0:00:40.960 +that will really help you to do those things. + +0:00:40.960,0:00:43.260 +So yesterday we already kind of introduced + +0:00:43.260,0:00:46.160 +you to the shell and some of it's quirks, + +0:00:46.160,0:00:48.720 +and like how you start executing commands, + +0:00:48.720,0:00:50.600 +redirecting them. + +0:00:50.600,0:00:52.400 +Today, we're going to kind of cover more about + +0:00:52.460,0:00:56.120 +the syntax of the variables, the control flow, + +0:00:56.120,0:00:57.720 +functions of the shell. + +0:00:57.720,0:01:02.700 +So for example, once you drop +into a shell, say you want to + +0:01:02.760,0:01:06.360 +define a variable, which is +one of the first things you + +0:01:06.360,0:01:09.340 +learn to do in a programming language. + +0:01:09.340,0:01:12.740 +Here you could do something like foo equals bar. + +0:01:12.860,0:01:18.400 +And now we can access the value +of foo by doing "$foo". + +0:01:18.460,0:01:21.400 +And that's bar, perfect. + +0:01:21.400,0:01:24.480 +One quirk that you need to be aware of is that + +0:01:24.480,0:01:27.900 +spaces are really critical when +you're dealing with bash. + +0:01:27.900,0:01:33.380 +Mainly because spaces are reserved, and +that will be for separating arguments. + +0:01:33.380,0:01:36.700 +So, for example, something like foo equals bar + +0:01:36.700,0:01:42.000 +won't work, and the shell is gonna +tell you why it's not working. + +0:01:42.000,0:01:46.280 +It's because the foo command is not +working, like foo is non-existent. 
+ +0:01:46.280,0:01:47.780 +And here what is actually happening, +we're not assigning foo to bar, + +0:01:47.780,0:01:52.260 +what is happening is we're +calling the foo program + +0:01:52.260,0:01:57.520 +with the first argument "=" and +the second argument "bar". + +0:01:57.520,0:02:03.880 +And in general, whenever you are having +some issues, like some files with spaces + +0:02:03.880,0:02:06.160 +you will need to be careful about that. + +0:02:06.160,0:02:10.620 +You need to be careful about quoting strings. + +0:02:10.640,0:02:16.480 +So, going into that, how you do strings in bash. +There are two ways that you can define a string: + +0:02:16.540,0:02:24.720 +You can define strings using double quotes +and you can define strings using single, + +0:02:24.720,0:02:26.540 +sorry, + +0:02:26.540,0:02:28.880 +using single quotes. + +0:02:29.140,0:02:32.760 +However, for literal strings they are equivalent, + +0:02:32.760,0:02:35.460 +but for the rest they are not equivalent. + +0:02:35.460,0:02:42.980 +So, for example, if we do value is $foo, + +0:02:43.440,0:02:48.480 +the $foo has been expanded like +a string, substituted to the + +0:02:48.480,0:02:50.820 +value of the foo variable in the shell. + +0:02:50.960,0:02:58.940 +Whereas if we do this with a simple quote, +we are just getting the $foo as it is + +0:02:58.940,0:03:02.280 +and single quotes won't be replacing. Again, + +0:03:02.280,0:03:07.290 +it's really easy to write a script, assume that +this is kind of like Python, that you might be + +0:03:07.290,0:03:10.860 +more familiar with, and not realize all that. + +0:03:10.860,0:03:14.180 +And this is the way you will assign variables. + +0:03:14.180,0:03:17.849 +Then bash also has control flow +techniques that we'll see later, + +0:03:17.849,0:03:24.440 +like for loops, while loops, and one main +thing is you can define functions. + +0:03:24.440,0:03:27.820 +We can access a function I have defined here. 
+ +0:03:28.220,0:03:34.220 +Here we have the MCD function, that +has been defined, and the thing is + +0:03:34.220,0:03:38.400 +so far, we have just kind of seen how +to execute several commands by piping + +0:03:38.400,0:03:40.720 +into them, kind of saw that briefly yesterday. + +0:03:40.940,0:03:44.980 +But a lot of times you want to do first +one thing and then another thing. + +0:03:44.980,0:03:47.580 +And that's kind of like the + +0:03:47.740,0:03:50.880 +sequential execution that we get here. + +0:03:50.880,0:03:54.260 +Here, for example, we're +calling the MCD function. + +0:03:56.860,0:03:57.800 +We, first, + +0:03:57.800,0:04:02.960 +are calling the makedir command, +which is creating this directory. + +0:04:02.960,0:04:05.600 +Here, $1 is like a special variable. + +0:04:05.600,0:04:07.440 +This is the way that bash works, + +0:04:07.440,0:04:12.160 +whereas in other scripting languages +there will be like argv, + +0:04:12.160,0:04:16.620 +the first item of the array argv +will contain the argument. + +0:04:16.620,0:04:19.160 +In bash it's $1. And in general, a lot + +0:04:19.160,0:04:21.640 +of things in bash will be dollar something + +0:04:21.640,0:04:26.680 +and will be reserved, we will +be seeing more examples later. + +0:04:26.680,0:04:30.290 +And once we have created the folder, +we CD into that folder, + +0:04:30.290,0:04:34.687 +which is kind of a fairly common +pattern that you will see. + +0:04:34.687,0:04:39.060 +We will actually type this directly +into our shell, and it will work and + +0:04:39.120,0:04:45.260 +it will define this function. But sometimes +it's nicer to write things in a file. + +0:04:45.260,0:04:50.040 +What we can do is we can source +this. And that will + +0:04:50.080,0:04:53.960 +execute this script in our shell and load it. + +0:04:53.960,0:04:59.340 +So now it looks like nothing happened, +but now the MCD function has + +0:04:59.340,0:05:03.460 +been defined in our shell. 
So +we can now for example do + +0:05:03.463,0:05:09.150 +MCD test, and now we move from +the tools directory to the test + +0:05:09.160,0:05:14.200 +directory. We both created the +folder and we moved into it. + +0:05:15.760,0:05:18.820 +What else. So a result is... + +0:05:18.820,0:05:22.160 +We can access the first argument with $1. + +0:05:22.160,0:05:26.100 +There's a lot more reserved commands, + +0:05:26.100,0:05:30.020 +for example $0 will be the name of the script, + +0:05:30.020,0:05:35.260 +$2 through $9 will be the second +through the ninth arguments + +0:05:35.260,0:05:38.070 +that the bash script takes. +Some of these reserved + +0:05:38.070,0:05:43.080 +keywords can be directly used +in the shell, so for example + +0:05:43.420,0:05:50.300 +$? will get you the error code +from the previous command, + +0:05:50.300,0:05:53.580 +which I'll also explain briefly. + +0:05:53.580,0:05:58.320 +But for example, $_ will get +you the last argument of the + +0:05:58.320,0:06:03.460 +previous command. So another way +we could have done this is + +0:06:03.460,0:06:07.380 +we could have said like "mkdir test" + +0:06:07.380,0:06:12.020 +and instead of rewriting test, we +can access that last argument + +0:06:12.020,0:06:18.400 +as part of the (previous command), using $_ + +0:06:18.400,0:06:23.160 +like, that will be replaced with +test and now we go into test. + +0:06:25.040,0:06:27.480 +There are a lot of them, you +should familiarize with them. + +0:06:27.480,0:06:32.900 +Another one I often use is called "bang +bang" ("!!"), you will run into this + +0:06:32.910,0:06:37.300 +whenever you, for example, are trying +to create something and you don't have + +0:06:37.320,0:06:41.000 +enough permissions. Then, you can do "sudo !!" + +0:06:41.010,0:06:43.400 +and then that will replace the command in + +0:06:43.470,0:06:46.400 +there and now you can just try doing + +0:06:46.440,0:06:48.380 +that. 
And now it will prompt you for a password, + +0:06:48.380,0:06:50.080 +because you have sudo permissions. + +0:06:53.800,0:06:57.180 +Before, I mentioned the, kind +of the error command. + +0:06:57.180,0:06:59.400 +Yesterday we saw that, in general, there are + +0:06:59.400,0:07:02.400 +different ways a process can communicate + +0:07:02.400,0:07:05.091 +with other processes or commands. + +0:07:05.100,0:07:08.420 +We mentioned the standard +input, which also was like + +0:07:09.160,0:07:11.380 +getting stuff through the standard input, + +0:07:11.640,0:07:13.840 +putting stuff into the standard output. + +0:07:13.840,0:07:16.830 +There are a couple more interesting +things, there's also like a + +0:07:16.830,0:07:19.837 +standard error, a stream where you write errors + +0:07:19.837,0:07:23.900 +that happen with your program and you don't +want to pollute the standard output. + +0:07:23.900,0:07:27.420 +There's also the error code, +which is like a general + +0:07:27.420,0:07:29.520 +thing in a lot of programming languages, + +0:07:29.520,0:07:34.460 +some way of reporting how the +entire run of something went. + +0:07:34.460,0:07:36.060 +So if we do + +0:07:36.060,0:07:41.020 +something like echo hello and we + +0:07:41.580,0:07:43.920 +query for the value, it's zero. And it's zero + +0:07:43.920,0:07:45.840 +because everything went okay and there + +0:07:45.840,0:07:49.170 +weren't any issues. And a zero exit code is + +0:07:49.170,0:07:50.940 +the same as you will get in a language + +0:07:50.940,0:07:54.980 +like C, like 0 means everything +went fine, there were no errors. + +0:07:54.980,0:07:57.600 +However, sometimes things won't work. + +0:07:57.600,0:08:04.600 +Sometimes, like if we try to grep +for foobar in our MCD script, + +0:08:04.600,0:08:08.130 +and now we check for that +value, it's 1. And that's + +0:08:08.130,0:08:10.770 +because we tried to search for the foobar + +0:08:10.770,0:08:13.620 +string in the MCD script and it wasn't there. 
+ +0:08:13.620,0:08:17.190 +So grep doesn't print anything, but + +0:08:17.190,0:08:19.950 +let us know that things didn't work by + +0:08:19.950,0:08:22.260 +giving us a 1 error code. + +0:08:22.260,0:08:24.420 +There are some interesting commands like + +0:08:24.420,0:08:29.160 +"true", for example, will always have a zero + +0:08:29.160,0:08:35.060 +error code, and false will always +have a one error code. + +0:08:35.060,0:08:37.919 +Then there are like + +0:08:37.919,0:08:40.080 +these logical operators that you can use + +0:08:40.080,0:08:43.808 +to do some sort of conditionals. +For example, one way... + +0:08:43.808,0:08:47.160 +you also have IF's and ELSE's, that +we will see later, but you can do + +0:08:47.160,0:08:51.920 +something like "false", and echo "Oops fail". + +0:08:51.920,0:08:56.300 +So here we have two commands connected +by this OR operator. + +0:08:56.300,0:09:00.250 +What bash is gonna do here, it's +gonna execute the first one + +0:09:00.250,0:09:04.450 +and if the first one didn't work, then it's + +0:09:04.450,0:09:07.380 +gonna execute the second one. So here we get it, + +0:09:07.380,0:09:12.000 +because it's gonna try to do a logical +OR. If the first one didn't have + +0:09:12.000,0:09:15.960 +a zero error code, it's gonna try to +do the second one. Similarly, if we + +0:09:15.960,0:09:19.580 +instead of use "false", we +use something like "true", + +0:09:19.580,0:09:22.180 +since true will have a zero error code, then the + +0:09:22.180,0:09:24.700 +second one will be short-circuited and + +0:09:24.700,0:09:27.500 +it won't be printed. + +0:09:32.560,0:09:36.970 +Similarly, we have an AND +operator which will only + +0:09:36.970,0:09:39.430 +execute the second part if the first one + +0:09:39.430,0:09:41.440 +ran without errors. + +0:09:41.440,0:09:44.820 +And the same thing will happen. + +0:09:44.820,0:09:50.340 +If the first one fails, then the second +part of this thing won't be executed. 
+ +0:09:50.340,0:09:57.280 +Kind of not exactly related to that, but +another thing that you will see is + +0:10:00.020,0:10:04.120 +that no matter what you execute, +then you can concatenate + +0:10:04.120,0:10:07.120 +commands using a semicolon in the same line, + +0:10:07.120,0:10:10.300 +and that will always print. + +0:10:10.300,0:10:13.630 +Beyond that, what we haven't +seen, for example, is how + +0:10:13.630,0:10:19.460 +you go about getting the output +of a command into a variable. + +0:10:19.630,0:10:24.120 +And the way we can do that is +doing something like this. + +0:10:24.120,0:10:29.480 +What we're doing here is we're getting +the output of the PWD command, + +0:10:29.480,0:10:32.720 +which is just printing the +present working directory + +0:10:32.720,0:10:33.740 +where we are right now. + +0:10:33.740,0:10:37.220 +And then we're storing that +into the foo variable. + +0:10:37.220,0:10:42.279 +So we do that and then we ask +for foo, we view our string. + +0:10:42.280,0:10:48.460 +More generally, we can do this thing +called command substitution + +0:10:50.110,0:10:51.500 +by putting it into any string. + +0:10:51.500,0:10:55.162 +And since we're using double quotes +instead of single quotes + +0:10:55.162,0:10:57.440 +that thing will be expanded and + +0:10:57.440,0:11:02.740 +it will tell us that we are +in this working folder. + +0:11:02.740,0:11:09.240 +Another interesting thing is, right now, +what this is expanding to is a string + +0:11:09.400,0:11:10.300 +instead of + +0:11:11.920,0:11:13.320 +It's just expanding as a string. + +0:11:13.460,0:11:17.640 +Another nifty and lesser known tool +is called process substitution, + +0:11:17.640,0:11:20.540 +which is kind of similar. What it will do... 
+ +0:11:24.360,0:11:30.041 +it will, here for example, the "<(", +some command and another parenthesis, + +0:11:30.041,0:11:34.840 +what that will do is: that will execute, +that will get the output to + +0:11:34.840,0:11:39.120 +kind of like a temporary file and it will +give the file handle to the command. + +0:11:39.120,0:11:42.020 +So here what we're doing is we're getting... + +0:11:42.020,0:11:45.760 +we're LS'ing the directory, putting +it into a temporary file, + +0:11:45.760,0:11:48.040 +doing the same thing for the +parent folder and then + +0:11:48.040,0:11:51.310 +we're concatenating both files. And this + +0:11:51.310,0:11:55.520 +will, may be really handy, because +some commands instead of expecting + +0:11:55.520,0:11:59.500 +the input coming from the stdin, +they are expecting things to + +0:11:59.500,0:12:03.560 +come from some file that is giving +some of the arguments. + +0:12:04.700,0:12:07.620 +So we get both things concatenated. + +0:12:12.880,0:12:17.040 +I think so far there's been a lot of +information, let's see a simple, + +0:12:17.040,0:12:22.920 +an example script where we +see a few of these things. + +0:12:23.200,0:12:27.220 +So for example here we have a string and we + +0:12:27.220,0:12:30.327 +have this $date. So $date is a program. + +0:12:30.327,0:12:34.540 +Again there's a lot of programs +in UNIX you will kind of slowly + +0:12:34.540,0:12:36.120 +familiarize with a lot of them. + +0:12:36.120,0:12:42.820 +Date just prints what the current date is +and you can specify different formats. + +0:12:43.800,0:12:48.700 +Then, we have these $0 here. $0 is the name + +0:12:48.700,0:12:50.540 +of the script that we're running. + +0:12:50.550,0:12:56.590 +Then we have $#, that's the number +of arguments that we are giving + +0:12:56.590,0:13:01.920 +to the command, and then $$ is the process +ID of this command that is running. 
+ +0:13:01.920,0:13:06.160 +Again, there's a lot of these dollar +things, they're not intuitive + +0:13:06.160,0:13:07.690 +because they don't have like a mnemonic + +0:13:07.690,0:13:10.450 +way of remembering, maybe, $#. But + +0:13:10.450,0:13:12.880 +it can be... you will just be + +0:13:12.880,0:13:14.660 +seeing them and getting familiar with them. + +0:13:14.660,0:13:19.200 +Here we have this $@, and that will +expand to all the arguments. + +0:13:19.200,0:13:21.480 +So, instead of having to assume that, + +0:13:21.490,0:13:25.840 +maybe say, we have three arguments +and writing $1, $2, $3, + +0:13:25.840,0:13:29.760 +if we don't know how many arguments we +can put all those arguments there. + +0:13:29.760,0:13:33.670 +And that has been given to a +for loop. And the for loop + +0:13:33.670,0:13:39.020 +will, in time, get the file variable + +0:13:39.020,0:13:43.880 +and it will be giving each one of the arguments. + +0:13:43.880,0:13:47.529 +So what we're doing is, for every +one of the arguments we're giving. + +0:13:47.529,0:13:51.699 +Then, in the next line we're running the + +0:13:51.699,0:13:56.920 +grep command which is just search for +a substring in some file and we're + +0:13:56.920,0:14:01.380 +searching for the string foobar in the file. + +0:14:01.380,0:14:06.490 +Here, we have put the variable +that the file took, to expand. + +0:14:06.490,0:14:11.559 +And yesterday we saw that if we care +about the output of a program, we can + +0:14:11.560,0:14:15.680 +redirect it to somewhere, to save it +or to connect it to some other file. + +0:14:15.680,0:14:18.939 +But sometimes you want the opposite. + +0:14:18.939,0:14:21.260 +Sometimes, here for example, we care... + +0:14:21.260,0:14:25.119 +we're gonna care about the error code. About +this script, we're gonna care whether the + +0:14:25.120,0:14:28.440 +grep ran successfully or it didn't. + +0:14:28.440,0:14:33.220 +So we can actually discard +entirely what the output... 
+ +0:14:33.220,0:14:37.480 +like both the standard output and the +standard error of the grep command. + +0:14:37.480,0:14:39.970 +And what we're doing is we're + +0:14:39.970,0:14:43.029 +redirecting the output to /dev/null which + +0:14:43.029,0:14:46.540 +is kind of like a special device in UNIX + +0:14:46.540,0:14:49.119 +systems where you can like write and + +0:14:49.119,0:14:51.129 +it will be discarded. Like you can + +0:14:51.129,0:14:52.869 +write as much as you want, + +0:14:52.869,0:14:57.730 +there, and it will be discarded. +And here's the ">" symbol + +0:14:57.730,0:15:02.199 +that we saw yesterday for redirecting +output. Here you have a "2>" + +0:15:02.199,0:15:04.689 +and, as some of you might have + +0:15:04.689,0:15:06.519 +guessed by now, this is for redirecting the + +0:15:06.519,0:15:08.589 +standard error, because those two + +0:15:08.589,0:15:11.709 +streams are separate, and you kind of have to + +0:15:11.709,0:15:14.639 +tell bash what to do with each one of them. + +0:15:14.639,0:15:17.529 +So here, we run, we check if the file has + +0:15:17.529,0:15:20.649 +foobar, and if the file has foobar then it's + +0:15:20.649,0:15:22.959 +going to have a zero code. If it + +0:15:22.959,0:15:24.369 +doesn't have foobar, it's gonna have a + +0:15:24.369,0:15:26.980 +nonzero error code. So that's exactly what we + +0:15:26.980,0:15:31.120 +check. In this if part of the command we + +0:15:31.120,0:15:34.840 +say "get me the error code". Again, this $? + +0:15:34.840,0:15:37.240 +And then we have a comparison operator + +0:15:37.240,0:15:41.590 +which is "-ne", for "not equal". And some + +0:15:41.590,0:15:47.650 +other programming languages +will have "==", "!=", these + +0:15:47.650,0:15:51.070 +symbols.
In bash there's + +0:15:51.070,0:15:53.650 +like a reserved set of comparisons and + +0:15:53.650,0:15:54.970 +it's mainly because there's a lot of + +0:15:54.970,0:15:57.520 +things you might want to test for when + +0:15:57.520,0:15:59.080 +you're in the shell. Here for example + +0:15:59.080,0:16:03.970 +we're just checking for two values, two +integer values, being the same. Or for + +0:16:03.970,0:16:08.380 +example here, the "-f" check will let + +0:16:08.380,0:16:10.420 +us know if a file exists, which is + +0:16:10.420,0:16:12.220 +something that you will run into very, + +0:16:12.220,0:16:17.530 +very commonly. I'm going back to the + +0:16:17.530,0:16:23.020 +example. Then, what happens when we + +0:16:24.400,0:16:28.600 +if the file did not have +foobar, like there was a + +0:16:28.600,0:16:31.990 +nonzero error code, then we print + +0:16:31.990,0:16:33.400 +"this file doesn't have any foobar, + +0:16:33.400,0:16:36.400 +we're going to add one". And what we do is + +0:16:36.400,0:16:40.750 +we echo this "# foobar", hoping this + +0:16:40.750,0:16:43.200 +is a comment to the file and then we're + +0:16:43.200,0:16:47.620 +using the operator ">>" to append at the end of + +0:16:47.620,0:16:50.800 +the file. Here since the file has + +0:16:50.800,0:16:54.490 +been fed through the script, and we don't +know it beforehand, we have to substitute + +0:16:54.490,0:17:03.430 +the variable of the filename. We can +actually run this. We already have + +0:17:03.430,0:17:05.260 +correct permissions in this script and + +0:17:05.260,0:17:10.540 +we can give a few examples. We have a +few files in this folder, "mcd" is the + +0:17:10.540,0:17:12.760 +one we saw at the beginning for the MCD + +0:17:12.760,0:17:15.040 +function, some other "script" function and + +0:17:15.040,0:17:21.700 +we can even feed the script to itself +to check if it has foobar in it.
+ +0:17:21.700,0:17:26.680 +And we run it and first we can +see that there's different + +0:17:26.680,0:17:29.460 +variables that we saw, that have been + +0:17:29.460,0:17:33.400 +successfully expanded. We have the date, that has + +0:17:33.400,0:17:36.700 +been replaced to the current time, then + +0:17:36.700,0:17:39.100 +we're running this program, with three + +0:17:39.100,0:17:44.560 +arguments, this randomized PID, and then + +0:17:44.560,0:17:46.510 +it's telling us MCD doesn't have any + +0:17:46.510,0:17:48.169 +foobar, so we are adding a new one, + +0:17:48.169,0:17:50.450 +and this script file doesn't + +0:17:50.450,0:17:52.970 +have one. So now for example let's look at MCD + +0:17:52.970,0:17:55.820 +and it has the comment that we were looking for. + +0:17:59.000,0:18:05.619 +One other thing to know when you're +executing scripts is that + +0:18:05.619,0:18:07.759 +here we have like three completely + +0:18:07.759,0:18:10.279 +different arguments but very commonly + +0:18:10.279,0:18:12.889 +you will be giving arguments that + +0:18:12.889,0:18:16.100 +can be more succinctly given in some way. + +0:18:16.100,0:18:20.179 +So for example here if we wanted to + +0:18:20.179,0:18:25.429 +refer to all the ".sh" scripts we + +0:18:25.429,0:18:31.120 +could just do something like "ls *.sh" + +0:18:31.120,0:18:36.120 +and this is a way of filename expansion +that most shells have + +0:18:36.120,0:18:38.450 +that's called "globbing". Here, as you + +0:18:38.450,0:18:39.919 +might expect, this is gonna say + +0:18:39.919,0:18:42.559 +anything that has any kind of sort of + +0:18:42.559,0:18:45.940 +characters and ends up with "sh". + +0:18:45.940,0:18:52.159 +Unsurprisingly, we get "example.sh" +and "mcd.sh". We also have these + +0:18:52.159,0:18:54.769 +"project1" and "project2", and if there + +0:18:54.769,0:19:00.100 +were like a... 
we can do a +"project42", for example + +0:19:00.620,0:19:04.220 +And now if we just want to refer +to the projects that have + +0:19:04.220,0:19:07.279 +a single character, but not two characters + +0:19:07.279,0:19:08.720 +afterwards, like any other characters, + +0:19:08.720,0:19:13.879 +we can use the question mark. So "?" +will expand to only a single one. + +0:19:13.880,0:19:17.360 +And we get, LS'ing, first + +0:19:17.360,0:19:21.049 +"project1" and then "project2". + +0:19:21.049,0:19:27.580 +In general, globbing can be very powerful. +You can also combine it. + +0:19:31.880,0:19:35.480 +A common pattern is to use what +is called curly braces. + +0:19:35.480,0:19:39.320 +So let's say we have an image, +that we have in this folder + +0:19:39.320,0:19:43.620 +and we want to convert this image from PNG to JPG + +0:19:43.620,0:19:46.320 +or we could maybe copy it, or... + +0:19:46.320,0:19:49.609 +it's a really common pattern, to have +two or more arguments that are + +0:19:49.609,0:19:55.240 +fairly similar and you want to do something +with them as arguments to some command. + +0:19:55.240,0:20:01.290 +You could do it this way, or more +succinctly, you can just do + +0:20:01.290,0:20:08.880 +"image.{png,jpg}" + +0:20:09.410,0:20:13.590 +And here, I'm getting some color feedback, +but what this will do, is + +0:20:13.590,0:20:17.610 +it'll expand into the line above. + +0:20:17.610,0:20:23.990 +Actually, I can ask zsh to do that for +me. And that what's happening here. + +0:20:23.990,0:20:26.550 +This is really powerful. So for example + +0:20:26.550,0:20:29.220 +you can do something like... we could do... + +0:20:29.220,0:20:34.220 +"touch" on a bunch of foo's, and +all of this will be expanded. + +0:20:35.520,0:20:41.880 +You can also do it at several levels +and you will do the Cartesian... 
+ +0:20:41.880,0:20:49.980 +if we have something like this, +we have one group here, "{1,2}" + +0:20:49.980,0:20:53.310 +and then here there's "{1,2,3}", +and this is going to do + +0:20:53.310,0:20:54.990 +the Cartesian product of these + +0:20:54.990,0:20:59.920 +two expansions and it will expand +into all these things, + +0:20:59.960,0:21:03.540 +that we can quickly "touch". + +0:21:03.540,0:21:10.520 +You can also combine the asterisk +glob with the curly braces glob. + +0:21:10.520,0:21:16.840 +You can even use kind of ranges. +Like, we can do "mkdir" + +0:21:16.840,0:21:21.420 +and we create the "foo" and the +"bar" directories, and then we + +0:21:21.420,0:21:25.680 +can do something along these lines. This + +0:21:25.680,0:21:28.890 +is going to expand to "fooa", "foob"... + +0:21:28.890,0:21:31.430 +like all these combinations, through "j", and + +0:21:31.430,0:21:35.250 +then the same for "bar". I haven't + +0:21:35.250,0:21:38.610 +really tested it... but yeah, we're getting +all these combinations that we + +0:21:38.610,0:21:41.850 +can "touch". And now, if we touch something + +0:21:41.850,0:21:47.970 +that is different between these +two [directories], we + +0:21:47.970,0:21:55.890 +can again showcase the process +substitution that we saw + +0:21:55.890,0:21:59.610 +earlier. Say we want to check what +files are different between these + +0:21:59.610,0:22:03.400 +two folders. For us it's obvious, +we just saw it, it's X and Y, + +0:22:03.400,0:22:07.410 +but we can ask the shell to do +this "diff" for us between the + +0:22:07.410,0:22:10.200 +output of one LS and the other LS. + +0:22:10.200,0:22:12.810 +Unsurprisingly we're getting: X is + +0:22:12.810,0:22:14.700 +only in the first folder and Y is + +0:22:14.700,0:22:20.970 +only in the second folder. What is more + +0:22:20.970,0:22:26.519 +is, right now, we have only seen +bash scripts. 
If you like other + +0:22:26.520,0:22:30.260 +scripts, like for some tasks bash +is probably not the best, + +0:22:30.260,0:22:33.119 +it can be tricky. You can actually +write scripts that + +0:22:33.119,0:22:35.700 +interact with the shell implemented in a lot + +0:22:35.700,0:22:39.710 +of different languages. So for +example, let's see here a + +0:22:39.710,0:22:43.139 +Python script that has a magic line at the + +0:22:43.139,0:22:45.539 +beginning that I'm not explaining for now. + +0:22:45.540,0:22:48.330 +Then we have "import sys", + +0:22:48.330,0:22:53.629 +it's kind of like... Python is not, +by default, trying to interact + +0:22:53.629,0:22:56.999 +with the shell so you will have to import + +0:22:56.999,0:22:58.799 +some library. And then we're doing a + +0:22:58.799,0:23:01.529 +really silly thing of just iterating + +0:23:01.529,0:23:06.440 +over "sys.argv[1:]". + +0:23:06.440,0:23:12.809 +"sys.argv" is kind of similar to what +in bash we're getting as $0, $1, &c. + +0:23:12.809,0:23:16.649 +Like the vector of the arguments, we're +printing it in the reversed order. + +0:23:16.649,0:23:21.179 +And the magic line at the beginning is + +0:23:21.179,0:23:23.999 +called a shebang and is the way that the + +0:23:23.999,0:23:26.159 +shell will know how to run this program. + +0:23:26.159,0:23:30.509 +You can always do something like + +0:23:30.509,0:23:34.379 +"python script.py", and then "a b c" and that + +0:23:34.379,0:23:36.659 +will work, always, like that. But + +0:23:36.659,0:23:39.119 +what if we want to make this to be + +0:23:39.119,0:23:41.309 +executable from the shell? The way the + +0:23:41.309,0:23:44.190 +shell knows that it has to use python as the + +0:23:44.190,0:23:48.450 +interpreter to run this file is using + +0:23:48.450,0:23:52.440 +that first line. And that first line is + +0:23:52.440,0:23:56.620 +giving it the path to where that thing lives. + +0:23:58.500,0:23:59.600 +However, you might not know. 
+ +0:23:59.609,0:24:01.830 +Like, different machines will have probably + +0:24:01.830,0:24:04.049 +different places where they put python + +0:24:04.049,0:24:06.090 +and you might not want to assume where + +0:24:06.090,0:24:08.789 +python is installed, or any other interpreter. + +0:24:08.789,0:24:16.379 +So one thing that you can do is use the + +0:24:16.380,0:24:17.720 +"env" command. + +0:24:18.280,0:24:21.560 +You can also give arguments in the shebang, so + +0:24:21.570,0:24:23.940 +what we're doing here is specifying + +0:24:23.940,0:24:29.720 +run the "env" command, that is for pretty much every +system, there are some exceptions, but like for + +0:24:29.720,0:24:31.550 +pretty much every system it is in + +0:24:31.550,0:24:33.620 +"/usr/bin", where a lot of binaries live, + +0:24:33.620,0:24:36.200 +and then we're calling it with the + +0:24:36.200,0:24:38.570 +argument "python". And then that will make + +0:24:38.570,0:24:42.020 +use of the PATH environment variable + +0:24:42.020,0:24:43.580 +that we saw in the first lecture. It's + +0:24:43.580,0:24:45.680 +gonna search in that path for the Python + +0:24:45.680,0:24:48.620 +binary and then it's gonna use that to + +0:24:48.620,0:24:50.480 +interpret this file. And that will make + +0:24:50.480,0:24:52.490 +this more portable so it can be run in + +0:24:52.490,0:24:57.520 +my machine, and your machine +and some other machine. + +0:25:08.020,0:25:12.140 +Another thing is that bash is not + +0:25:12.140,0:25:14.300 +really like modern, it was + +0:25:14.300,0:25:16.340 +developed a while ago. And sometimes + +0:25:16.340,0:25:18.890 +it can be tricky to debug by + +0:25:18.890,0:25:21.980 +default. And the ways it will fail + +0:25:21.980,0:25:24.020 +sometimes are intuitive like the way we + +0:25:24.020,0:25:26.180 +saw before of like foo command not + +0:25:26.180,0:25:28.610 +existing, sometimes they're not.
So there's + +0:25:28.610,0:25:31.280 +like a really nifty tool that we have + +0:25:31.280,0:25:34.310 +linked in the lecture notes, which is called + +0:25:34.310,0:25:37.580 +"shellcheck", that will kind of give you + +0:25:37.580,0:25:40.010 +both warnings and syntactic errors + +0:25:40.010,0:25:43.250 +and other things that you might +not have quoted properly, + +0:25:43.250,0:25:46.040 +or you might have misplaced spaces in + +0:25:46.040,0:25:50.060 +your files. So for example for the +extremely simple "mcd.sh" + +0:25:50.060,0:25:51.980 +file we're getting a couple + +0:25:51.980,0:25:54.800 +of errors saying hey, surprisingly, + +0:25:54.800,0:25:56.090 +we're missing a shebang, like this + +0:25:56.090,0:25:59.060 +might not be interpreted correctly if you're + +0:25:59.060,0:26:02.000 +running it on a different system. Also, this + +0:26:02.000,0:26:05.620 +CD is taking a command and it might not + +0:26:05.620,0:26:08.960 +expand properly so instead of using CD + +0:26:08.960,0:26:11.300 +you might want to use something like CD + +0:26:11.300,0:26:14.540 +and then an OR and then an "exit". We go + +0:26:14.540,0:26:16.490 +back to what we explained earlier, what + +0:26:16.490,0:26:18.920 +this will do is like if the + +0:26:18.920,0:26:21.860 +CD doesn't end correctly, you cannot CD + +0:26:21.860,0:26:23.720 +into the folder because either you + +0:26:23.720,0:26:25.250 +don't have permissions, it doesn't exist... + +0:26:25.250,0:26:28.780 +That will give a nonzero error + +0:26:28.780,0:26:32.420 +code, so you will execute exit + +0:26:32.420,0:26:33.920 +and that will stop the script + +0:26:33.920,0:26:35.810 +instead of continuing to execute as if + +0:26:35.810,0:26:37.240 +you were in a place that you are + +0:26:37.240,0:26:42.900 +actually not in.
And actually +I haven't tested, but I + +0:26:42.920,0:26:47.179 +think we can check for "example.sh" + +0:26:47.179,0:26:50.809 +and here we're getting that we should be + +0:26:50.809,0:26:55.070 +checking the exit code in a +different way, because it's + +0:26:55.070,0:26:57.710 +probably not the best way, doing it this + +0:26:57.710,0:27:01.580 +way. One last remark I want to make + +0:27:01.580,0:27:05.090 +is that when you're writing bash scripts + +0:27:05.090,0:27:07.159 +or functions for that matter, + +0:27:07.159,0:27:09.080 +there's kind of a difference between + +0:27:09.080,0:27:12.590 +writing bash scripts in isolation like a + +0:27:12.590,0:27:14.149 +thing that you're gonna run, and a thing + +0:27:14.149,0:27:16.100 +that you're gonna load into your shell. + +0:27:16.100,0:27:19.850 +We will see some of this in the command + +0:27:19.850,0:27:23.090 +line environment lecture, where we will kind of + +0:27:23.090,0:27:29.059 +be tooling with the bashrc and the +sshrc. But in general, if you make + +0:27:29.059,0:27:31.370 +changes to for example where you are, + +0:27:31.370,0:27:34.009 +like if you CD inside a bash script and you + +0:27:34.009,0:27:36.919 +just execute that bash script, it won't CD + +0:27:36.919,0:27:39.980 +in the shell you are in right now. But if you + +0:27:39.980,0:27:42.980 +have loaded the code directly into + +0:27:42.980,0:27:45.559 +your shell, for example you load... + +0:27:45.559,0:27:48.440 +you source the function and then you execute + +0:27:48.440,0:27:50.269 +the function then you will get those + +0:27:50.269,0:27:52.000 +side effects. And the same goes for + +0:27:52.000,0:27:57.220 +defining variables into the shell. + +0:27:57.220,0:28:03.950 +Now I'm going to talk about some +tools that I think are nifty when + +0:28:03.950,0:28:07.580 +working with the shell. The first was + +0:28:07.580,0:28:09.799 +also briefly introduced yesterday.
+ +0:28:09.799,0:28:13.309 +How do you know what flags, or like + +0:28:13.309,0:28:15.320 +what exact commands are. Like how am I + +0:28:15.320,0:28:21.889 +supposed to know that "ls -l" will list +the files in a list format, or that + +0:28:21.889,0:28:25.789 +if I do "mv -i", it's gonna like prompt me + +0:28:25.789,0:28:28.639 +for stuff. For that what you have is the "man" + +0:28:28.639,0:28:30.730 +command. And the man command will kind of + +0:28:30.730,0:28:33.590 +have like a lot of information of how + +0:28:33.590,0:28:35.809 +will you go about... so for example here it + +0:28:35.809,0:28:40.340 +will explain for the "-i" flag, there are + +0:28:40.340,0:28:43.970 +all these options you can do. That's + +0:28:43.970,0:28:45.620 +actually pretty useful and it will work + +0:28:45.620,0:28:51.540 +not only for really simple commands +that come packaged with your OS + +0:28:51.540,0:28:55.809 +but will also work with some tools +that you install from the internet + +0:28:55.809,0:28:58.240 +for example, if the person that did the + +0:28:58.240,0:29:01.390 +installation made it so that the man + +0:29:01.390,0:29:03.399 +pages were also installed. So for example + +0:29:03.399,0:29:06.490 +a tool that we're gonna cover in a bit + +0:29:06.490,0:29:12.370 +which is called "ripgrep" and +is invoked as "rg", this didn't + +0:29:12.370,0:29:14.980 +come with my system but it has installed + +0:29:14.980,0:29:17.230 +its own man page and I have it here and + +0:29:17.230,0:29:21.700 +I can access it. For some commands the + +0:29:21.700,0:29:25.029 +man page is useful but sometimes it can be + +0:29:25.029,0:29:28.270 +tricky to decipher because it's more + +0:29:28.270,0:29:30.399 +kind of a documentation and a + +0:29:30.399,0:29:32.679 +description of all the things the tool + +0:29:32.679,0:29:35.860 +can do.
Sometimes it will have + +0:29:35.860,0:29:37.720 +examples but sometimes not, and sometimes + +0:29:37.720,0:29:41.620 +the tool can do a lot of things so a + +0:29:41.620,0:29:45.250 +couple of good tools that I use commonly + +0:29:45.250,0:29:50.289 +are "convert" or "ffmpeg", which deal +with images and video respectively and + +0:29:50.289,0:29:52.419 +the man pages are like enormous. So there's + +0:29:52.419,0:29:54.850 +one neat tool called "tldr" that + +0:29:54.850,0:29:58.240 +you can install and you will have like + +0:29:58.240,0:30:02.710 +some nice kind of explanatory examples + +0:30:02.710,0:30:05.470 +of how you want to use this command. And you + +0:30:05.470,0:30:07.840 +can always Google for this, but I find + +0:30:07.840,0:30:10.120 +myself saving going into the + +0:30:10.120,0:30:12.640 +browser, looking about some examples and + +0:30:12.640,0:30:14.919 +coming back, whereas "tldr" are + +0:30:14.919,0:30:16.870 +community contributed and + +0:30:16.870,0:30:19.210 +they're fairly useful. Then, + +0:30:19.210,0:30:23.020 +the one for "ffmpeg" has a lot of + +0:30:23.020,0:30:24.940 +useful examples that are more nicely + +0:30:24.940,0:30:26.799 +formatted (if you don't have a huge + +0:30:26.799,0:30:30.820 +font size for recording). Or even + +0:30:30.820,0:30:33.250 +simple commands like "tar", that have a lot + +0:30:33.250,0:30:35.470 +of options that you are combining. So for + +0:30:35.470,0:30:37.840 +example, here you can be combining 2, 3... + +0:30:37.840,0:30:41.710 +different flags and it can not be + +0:30:41.710,0:30:43.419 +obvious, when you want to combine + +0:30:43.419,0:30:48.429 +different ones. That's how you + +0:30:48.429,0:30:54.850 +would go about finding more about these tools. +On the topic of finding, let's try + +0:30:54.850,0:30:58.690 +learning how to find files. 
You can + +0:30:58.690,0:31:03.100 +always go "ls", and like you can go like + +0:31:03.100,0:31:05.950 +"ls project1", and + +0:31:05.950,0:31:08.559 +keep LS'ing all the way through. But + +0:31:08.559,0:31:11.740 +maybe, if we already know that we want + +0:31:11.740,0:31:15.450 +to look for all the folders called + +0:31:15.450,0:31:19.000 +"src", then there's probably a better command + +0:31:19.000,0:31:21.400 +for doing that. And that's "find". + +0:31:21.460,0:31:26.679 +Find is the tool that, pretty much comes +with every UNIX system. And find, + +0:31:26.679,0:31:35.230 +we're gonna give it... here we're +saying we want to call find in the + +0:31:35.230,0:31:37.510 +current folder, remember that "." stands + +0:31:37.510,0:31:40.149 +for the current folder, and we want the + +0:31:40.149,0:31:46.539 +name to be "src" and we want the type to +be a directory. And by typing that it's + +0:31:46.539,0:31:49.870 +gonna recursively go through the current + +0:31:49.870,0:31:52.330 +directory and look for all these files, + +0:31:52.330,0:31:58.659 +or folders in this case, that match this +pattern. Find has a lot of useful + +0:31:58.659,0:32:01.840 +flags. So for example, you can even test + +0:32:01.840,0:32:05.440 +for the path to be in a way. Here we're + +0:32:05.440,0:32:08.230 +saying we want some number of folders, + +0:32:08.230,0:32:09.909 +we don't really care how many folders, + +0:32:09.909,0:32:13.179 +and then we care about all the Python + +0:32:13.179,0:32:17.830 +scripts, all the things with the extension +".py", that are within a + +0:32:17.830,0:32:19.899 +test folder. And we're also checking, just in + +0:32:19.899,0:32:21.519 +cases really but we're checking just + +0:32:21.519,0:32:24.460 +that it's also a type F, which stands for + +0:32:24.460,0:32:28.710 +file. We're getting all these files. + +0:32:28.710,0:32:32.169 +You can also use different flags for things + +0:32:32.169,0:32:34.000 +that are not the path or the name. 
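The two `find` invocations described so far can be sketched as follows. The directory tree is invented for the example (the names `project1`, `project2` and `test_a.py` are not from the lecture), and the search starts from a temporary directory instead of `.` so the snippet does not depend on where it is run.

```shell
# Build a small throwaway tree to search (all names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/project1/src" "$demo/project2/src/test" "$demo/project2/docs"
touch "$demo/project2/src/test/test_a.py" "$demo/project2/docs/notes.py"

# Every directory named exactly "src", found recursively under $demo.
src_dirs=$(find "$demo" -name src -type d | sort)
echo "$src_dirs"

# Every regular file ending in .py that sits directly inside some "test" folder.
test_files=$(find "$demo" -path '*/test/*.py' -type f)
echo "$test_files"
```

Note that `-name` matches only the last path component, while `-path` matches against the whole path, which is why the second pattern can mention the parent folder.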
+ +0:32:34.000,0:32:38.160 +You could check things that have been + +0:32:38.160,0:32:42.060 +modified ("-mtime" is for the modification +time), things that have been + +0:32:42.070,0:32:44.540 +modified in the last day, which is gonna + +0:32:44.559,0:32:46.659 +be pretty much everything. So this is gonna print + +0:32:46.659,0:32:49.029 +a lot of the files we created and files + +0:32:49.029,0:32:51.850 +that were already there. You can even + +0:32:51.850,0:32:54.960 +use other things like size, the owner, + +0:32:54.960,0:32:59.080 +permissions, you name it. What is even more + +0:32:59.080,0:33:01.870 +powerful is, "find" can find stuff + +0:33:01.870,0:33:04.269 +but it also can do stuff when you + +0:33:04.269,0:33:10.690 +find those files. So we could look for all + +0:33:10.690,0:33:14.080 +the files that have a TMP + +0:33:14.080,0:33:18.160 +extension, which is a temporary extension, and + +0:33:18.160,0:33:22.720 +then, we can tell "find" that +for every one of those files, + +0:33:22.720,0:33:26.350 +just execute the "rm" command for them. And + +0:33:26.350,0:33:29.050 +that will just be calling "rm" with all + +0:33:29.050,0:33:32.350 +these files. So let's first execute it + +0:33:32.350,0:33:35.760 +without, and then we execute it with it. + +0:33:35.760,0:33:38.950 +Again, as with the command line + +0:33:38.950,0:33:41.470 +philosophy, it looks like nothing + +0:33:41.470,0:33:48.070 +happened. But since we have +a zero error code, something + +0:33:48.070,0:33:49.540 +happened - just that everything went + +0:33:49.540,0:33:51.490 +correct and everything is fine. And now, + +0:33:51.490,0:33:57.810 +if we look for these files, +they aren't there anymore. + +0:33:57.810,0:34:02.950 +Another nice thing about the shell +in general is that there are + +0:34:02.950,0:34:05.890 +these tools, but people will keep + +0:34:05.890,0:34:08.230 +finding new ways, so alternative + +0:34:08.230,0:34:12.220 +ways of writing these tools. 
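A small sketch of the modification-time filter and of the "list first, then delete" workflow described above. The file names are made up, and the whole thing runs in a temporary directory so nothing real gets removed.

```shell
# Throwaway directory with two temporary files and one file to keep.
demo=$(mktemp -d)
touch "$demo/a.tmp" "$demo/b.tmp" "$demo/keep.txt"

# -mtime -1 selects files modified less than one day ago (here: all three).
recent=$(find "$demo" -type f -mtime -1 | wc -l)

# Dry run: only print what matches, to check the pattern before deleting.
find "$demo" -name '*.tmp' -type f

# Same query, but run rm once for each match; {} stands for the found path.
find "$demo" -name '*.tmp' -type f -exec rm {} \;
status=$?

# find prints nothing on success, so inspect the exit status and the result.
remaining=$(find "$demo" -type f)
echo "exit status: $status, remaining: $remaining"
```

Running the query without `-exec` first is cheap insurance, since the deleting version gives no output to review.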
It's +nice to know about it. So, for + +0:34:12.220,0:34:20.020 +example find if you just want to match +the things that end in "tmp" + +0:34:20.020,0:34:24.190 +it can be sometimes weird to do this +thing, it has a long command. + +0:34:24.190,0:34:27.760 +There's things like "fd", + +0:34:27.760,0:34:32.320 +for example, that is a shorter command +that by default will use regex + +0:34:32.320,0:34:34.899 +and will ignore your gitfiles, so you + +0:34:34.899,0:34:38.020 +don't even search for them. It + +0:34:38.020,0:34:42.879 +will color-code, it will have better +Unicode support... It's nice to + +0:34:42.879,0:34:45.040 +know about some of these tools. But, again, + +0:34:45.040,0:34:52.149 +the main idea is that if you are aware +that these tools exist, you can + +0:34:52.149,0:34:53.740 +save yourself a lot of time from doing + +0:34:53.740,0:34:57.660 +kind of menial and repetitive tasks. + +0:34:57.660,0:35:00.010 +Another command to bear in mind is like + +0:35:00.010,0:35:01.990 +"find". Some of you may be + +0:35:01.990,0:35:04.300 +wondering, "find" is probably just + +0:35:04.300,0:35:06.520 +actually going through a directory + +0:35:06.520,0:35:09.580 +structure and looking for things but + +0:35:09.580,0:35:11.260 +what if I'm doing a lot of "finds" a day? + +0:35:11.260,0:35:12.850 +Wouldn't it be better, doing kind of + +0:35:12.850,0:35:18.790 +a database approach and build an index +first, and then use that index + +0:35:18.790,0:35:21.520 +and update it in some way. Well, actually + +0:35:21.520,0:35:23.380 +most Unix systems already do it and + +0:35:23.380,0:35:28.170 +this is through the "locate" command and + +0:35:28.170,0:35:31.690 +the way that the locate will + +0:35:31.690,0:35:35.470 +be used... it will just look for paths in + +0:35:35.470,0:35:38.680 +your file system that have the substring + +0:35:38.680,0:35:44.710 +that you want. I actually don't know if it +will work... Okay, it worked. 
Let me try to + +0:35:44.710,0:35:49.840 +do something like "missing-semester". + +0:35:51.840,0:35:53.950 +You're gonna take a while but + +0:35:53.950,0:35:56.109 +it found all these files that are somewhere + +0:35:56.109,0:35:57.730 +in my file system and since it has + +0:35:57.730,0:36:01.750 +built an index already on them, it's much + +0:36:01.750,0:36:05.680 +faster. And then, to keep it updated, + +0:36:05.680,0:36:11.980 +using the "updatedb" command +that is running through cron, + +0:36:13.840,0:36:18.490 +to update this database. Finding files, again, is + +0:36:18.490,0:36:23.230 +really useful. Sometimes you're actually concerned +about, not the files themselves, + +0:36:23.230,0:36:26.740 +but the content of the files. For that + +0:36:26.740,0:36:31.420 +you can use the grep command that we + +0:36:31.420,0:36:33.880 +have seen so far. So you could do + +0:36:33.880,0:36:37.740 +something like grep foobar in MCD, it's there. + +0:36:37.740,0:36:43.690 +What if you want to, again, recursively +search through the current + +0:36:43.690,0:36:45.760 +structure and look for more files, right? + +0:36:45.760,0:36:48.700 +We don't want to do this manually. + +0:36:48.700,0:36:51.220 +We could use "find", and the "-exec", but + +0:36:51.220,0:36:58.920 +actually "grep" has the "-R" flag +that will go through the entire + +0:36:58.920,0:37:03.609 +directory, here. And it's telling us + +0:37:03.609,0:37:06.579 +that oh we have the foobar line in example.sh + +0:37:06.579,0:37:09.279 +at these three places and in + +0:37:09.279,0:37:14.589 +this other two places in foobar. This can be + +0:37:14.589,0:37:16.900 +really convenient. Mainly, the + +0:37:16.900,0:37:18.940 +use case for this is you know you have + +0:37:18.940,0:37:21.910 +written some code in some programming + +0:37:21.910,0:37:23.859 +language, and you know it's somewhere in + +0:37:23.859,0:37:26.200 +your file system but you actually don't + +0:37:26.200,0:37:28.599 +know. 
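The recursive search mentioned above can be sketched like this. The files and their contents are invented for the example, and `-n` is added only so that each match shows its line number.

```shell
# Throwaway tree with a few lines containing "foobar" (contents are made up).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'echo foobar\necho other\n' > "$demo/example.sh"
printf 'foobar once\nnothing here\nfoobar twice\n' > "$demo/sub/notes.txt"

# -R descends into subdirectories; -n prefixes each match with its line number.
matches=$(cd "$demo" && grep -Rn foobar . | sort)
echo "$matches"
```

Each output line has the form `file:line:text`, which is what makes it easy to jump straight to the match in an editor.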
But you can actually quickly search. + +0:37:28.600,0:37:32.980 +So for example, I can quickly search + +0:37:35.660,0:37:40.320 +for all the Python files that I have in my + +0:37:40.329,0:37:45.460 +scratch folder where I used the request library. + +0:37:45.460,0:37:47.589 +And if I run this, it's giving me + +0:37:47.589,0:37:50.890 +through all these files, exactly in + +0:37:50.890,0:37:53.650 +what line it has been found. And here + +0:37:53.650,0:37:56.260 +instead of using grep, which is fine, + +0:37:56.260,0:37:58.930 +you could also do this, I'm using "ripgrep", + +0:37:58.930,0:38:05.260 +which is kind of the same idea but +again trying to bring some more + +0:38:05.260,0:38:09.730 +niceties like color coding or file + +0:38:09.730,0:38:16.480 +processing and other things. It think it has, +also, unicode support. It's also pretty + +0:38:16.480,0:38:22.829 +fast so you are not paying like a +trade-off on this being slower and + +0:38:22.829,0:38:25.420 +there's a lot of useful flags. You + +0:38:25.420,0:38:27.670 +can say, oh, I actually want to get some + +0:38:27.670,0:38:30.460 +context around those results. + +0:38:33.040,0:38:36.400 +So I want to get like five +lines of context around + +0:38:36.400,0:38:42.819 +that, so you can see where that import +lives and see code around it. + +0:38:42.819,0:38:44.170 +Here in the import it's not really useful + +0:38:44.170,0:38:45.819 +but like if you're looking for where you + +0:38:45.819,0:38:49.720 +use the function, for example, it will + +0:38:49.720,0:38:54.010 +be very handy. We can also do things like + +0:38:54.010,0:38:59.170 +we can search, for example here,. + +0:38:59.170,0:39:04.839 +A more advanced use, we can say, + +0:39:04.840,0:39:11.580 +"-u" is for don't ignore hidden files, sometimes + +0:39:12.520,0:39:16.359 +you want to be ignoring hidden +files, except if you want to + +0:39:16.359,0:39:23.500 +search config files, that are by default +hidden. 
Then, instead of printing + +0:39:23.500,0:39:28.400 +the matches, we're asking to do something +that would be kind of hard, I think, + +0:39:28.400,0:39:31.380 +to do with grep, out of my head, which is + +0:39:31.390,0:39:34.569 +"I want you to print all the files that + +0:39:34.569,0:39:37.750 +don't match the pattern I'm giving you", which + +0:39:37.750,0:39:40.030 +may be a weird thing to ask here but + +0:39:40.030,0:39:42.940 +then we keep going... And this pattern here + +0:39:42.940,0:39:45.790 +is a small regex which is saying + +0:39:45.790,0:39:48.099 +at the beginning of the line I have a + +0:39:48.099,0:39:51.190 +"#" and a "!", and that's a shebang. + +0:39:51.190,0:39:53.470 +Like that, we're searching here for all + +0:39:53.470,0:39:56.650 +the files that don't have a shebang + +0:39:56.650,0:39:59.369 +and then we're giving it, here, + +0:39:59.369,0:40:02.470 +a "-t sh" to only look for "sh" + +0:40:02.470,0:40:07.660 +files, because maybe all your +Python or text files are fine + +0:40:07.660,0:40:10.000 +without a shebang. And here it's telling us + +0:40:10.000,0:40:13.020 +"oh, MCD is obviously missing a shebang" + +0:40:14.760,0:40:16.660 +We can even... It has like some + +0:40:16.660,0:40:19.119 +nice flags, so for example if we + +0:40:19.120,0:40:21.360 +include the "stats" flag + +0:40:28.700,0:40:34.119 +it will get all these results but it will +also tell us information about all + +0:40:34.119,0:40:35.410 +the things that it searched. For example, + +0:40:35.410,0:40:40.390 +the number of matches that it found, +the lines, the file searched, + +0:40:40.390,0:40:44.040 +the bytes that it printed, &c. + +0:40:44.040,0:40:47.160 +Similar as with "fd", sometimes +it's not as useful + +0:40:48.400,0:40:50.619 +using one specific tool or another and + +0:40:50.620,0:40:55.780 +in fact, as ripgrep, there are several +other tools. 
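The "files without a shebang" query can also be approximated with plain `grep`, which is handy on a machine where ripgrep is not installed. This is a sketch under assumptions: the two files are invented, and `grep -L` is used as a stand-in for the ripgrep `--files-without-match` option used in the lecture, not as a claim about what was typed on screen.

```shell
# Two small scripts: one starts with a shebang, one does not (names are made up).
demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$demo/example.sh"
printf 'mcd () {\n  mkdir -p "$1" && cd "$1"\n}\n' > "$demo/mcd.sh"

# -L prints the names of files that contain NO matching line.
# '^#!' matches "#!" at the start of any line, not only the first one.
no_shebang=$(cd "$demo" && grep -L '^#!' ./*.sh)
echo "$no_shebang"
```

Restricting the glob to `*.sh` plays the role of the file-type filter: only shell scripts are checked, since other file types may be fine without a shebang.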
Like "ack", + +0:40:55.780,0:40:57.700 +is the original grep alternative that was + +0:40:57.700,0:41:00.670 +written. Then the silver searcher, + +0:41:00.670,0:41:04.089 +"ag", was another one... and they're all + +0:41:04.089,0:41:05.589 +pretty much interchangeable so + +0:41:05.589,0:41:07.630 +maybe you're at a system that has one and + +0:41:07.630,0:41:09.670 +not the other, just knowing that you can + +0:41:09.670,0:41:12.040 +use these things with these tools can be + +0:41:12.040,0:41:15.549 +fairly useful. Lastly, I want to cover + +0:41:15.549,0:41:19.780 +how you go about, not finding files +or code, but how you go about + +0:41:19.780,0:41:22.540 +finding commands that you already + +0:41:22.540,0:41:30.160 +some time figured out. The first, obvious +way is just using the up arrow, + +0:41:30.160,0:41:34.540 +and slowly going through all your history, +looking for these matches. + +0:41:34.540,0:41:36.490 +This is actually not very efficient, as + +0:41:36.490,0:41:42.579 +you probably guessed. So the bash +has ways to do this more easily. + +0:41:42.579,0:41:44.619 +There is the "history" command, that will + +0:41:44.619,0:41:49.180 +print your history. Here I'm in zsh and +it only prints some of my history, but + +0:41:49.180,0:41:54.069 +if I say, I want you to print everything +from the beginning of time, it will print + +0:41:54.069,0:41:58.220 +everything from the beginning +of whatever this history is. + +0:41:58.220,0:42:00.700 +And since this is a lot of results, + +0:42:00.700,0:42:02.589 +maybe we care about the ones where we + +0:42:02.589,0:42:08.490 +use the "convert" command to go from some +type of file to some other type of file. + +0:42:08.490,0:42:12.940 +Some image, sorry. Then, we're getting all + +0:42:12.940,0:42:15.849 +these results here, about all the ones + +0:42:15.849,0:42:18.120 +that match this substring. 
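At an interactive prompt the search above is just `history | grep convert`. A script cannot see your interactive history, so the sketch below assumes a small sample history file (its contents are invented) and filters it the same way.

```shell
# Stand-in for a shell history: one command per line (contents are made up).
demo=$(mktemp -d)
cat > "$demo/sample_history" <<'EOF'
ls -la
convert logo.png logo.jpg
cd projects
convert favicon.png favicon.ico
EOF

# Keep only the past commands that mention "convert".
past_converts=$(grep convert "$demo/sample_history")
echo "$past_converts"
```

The same filter works on the real thing: pipe `history` into `grep` with whatever substring you remember from the command.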
+ +0:42:21.280,0:42:24.609 +Even more, pretty much all shells by default will + +0:42:24.609,0:42:27.130 +bind "Ctrl+R", the keybinding, + +0:42:27.130,0:42:29.680 +to do backward search. Here we + +0:42:29.680,0:42:31.569 +have backward search, where we can + +0:42:31.569,0:42:34.750 +type "convert" and it's finding the + +0:42:34.750,0:42:36.609 +command that we just typed. And if we just + +0:42:36.609,0:42:38.619 +keep hitting "Ctrl+R", it will + +0:42:38.619,0:42:41.740 +kind of go through these matches and + +0:42:41.740,0:42:44.260 +it will let you re-execute it + +0:42:44.260,0:42:49.240 +in place. Another thing that you can do, + +0:42:49.240,0:42:51.069 +related to that, is you can use this + +0:42:51.069,0:42:53.829 +really nifty tool called "fzf", which is + +0:42:53.829,0:42:56.280 +like a fuzzy finder, like it will... + +0:42:57.100,0:42:58.480 +It will let you do kind of + +0:42:58.480,0:43:02.200 +like an interactive grep. We could do + +0:43:02.200,0:43:06.369 +for example this, where we can cat our + +0:43:06.369,0:43:10.030 +example.sh command, that will + +0:43:10.030,0:43:11.680 +print to the standard output, and then we + +0:43:11.680,0:43:14.290 +can pipe it through fzf. It's just getting + +0:43:14.290,0:43:18.490 +all the lines and then we can +interactively look for the + +0:43:18.490,0:43:21.849 +string that we care about. And the nice + +0:43:21.849,0:43:26.349 +thing about fzf is that, if you enable +the default bindings, it will bind to + +0:43:26.349,0:43:33.670 +your "Ctrl+R" shell execution and now + +0:43:33.670,0:43:36.490 +you can quickly and dynamically like + +0:43:36.490,0:43:41.700 +look for all the times you tried to +convert a favicon in your history. + +0:43:42.020,0:43:46.375 +And it's also like fuzzy matching, +whereas like by default in grep + +0:43:46.375,0:43:49.420 +or these things you have to write a regex or some + +0:43:49.420,0:43:52.360 +expression that will match within here.
+ +0:43:52.360,0:43:54.609 +Here I'm just typing "convert" and "favicon" and + +0:43:54.609,0:43:57.369 +it's just trying to do the best scan, + +0:43:57.369,0:44:01.349 +doing the match in the lines it has. + +0:44:01.349,0:44:06.190 +Lastly, a tool that probably you have +already seen, that I've been using + +0:44:06.190,0:44:08.410 +for not retyping these extremely long + +0:44:08.410,0:44:13.080 +commands is this "history +substring search", where + +0:44:13.940,0:44:15.660 +as I type in my shell, + +0:44:15.670,0:44:19.630 +(I failed to mention this, but fish, + +0:44:19.630,0:44:22.760 +which I think originally introduced +this concept, and then + +0:44:22.760,0:44:25.760 +zsh have really nice implementations) + +0:44:25.760,0:44:26.800 +what it'll let you do is + +0:44:26.800,0:44:31.300 +as you type the command, it will +dynamically search back in your + +0:44:31.300,0:44:34.420 +history to the same command +that has a common prefix, + +0:44:34.980,0:44:36.900 +and then, if you... + +0:44:39.100,0:44:42.100 +it will change as the match list stops + +0:44:42.100,0:44:44.110 +working and then as you do the + +0:44:44.120,0:44:49.760 +right arrow you can select that +command and then re-execute it. + +0:45:05.800,0:45:09.920 +We've seen a bunch of stuff... I think I have + +0:45:09.940,0:45:16.180 +a few minutes left so I'm going +to cover a couple of tools to do + +0:45:16.180,0:45:20.060 +really quick directory listing +and directory navigation. + +0:45:20.060,0:45:30.020 +So you can always use "ls -R" to recursively +list some directory structure, + +0:45:30.020,0:45:35.160 +but that can be suboptimal, I cannot +really make sense of this easily. + +0:45:36.340,0:45:44.460 +There's a tool called "tree" that will +be a much more friendly form of + +0:45:44.460,0:45:47.500 +printing all the stuff, it will +also color code based on...
+ +0:45:47.500,0:45:50.680 +here for example "foo" is blue +because it's a directory and + +0:45:50.680,0:45:55.100 +this is red because it has execute permissions. + +0:45:55.100,0:46:00.220 +But we can go even further than +that. There are really nice tools + +0:46:00.220,0:46:04.580 +like a recent one called "broot" that +will do the same thing but here + +0:46:04.580,0:46:07.300 +for example instead of doing +this thing of listing + +0:46:07.300,0:46:09.160 +every single file, for example in bar + +0:46:09.160,0:46:11.400 +we have these "a" through "j" files, + +0:46:11.400,0:46:14.260 +it will say "oh there are more, unlisted here". + +0:46:15.080,0:46:18.200 +I can actually start typing and it will, + +0:46:18.200,0:46:21.540 +again, fuzzily match to the files that are there + +0:46:21.540,0:46:24.800 +and I can quickly select them +and navigate through them. + +0:46:24.800,0:46:28.380 +So, again, it's good to know that + +0:46:28.380,0:46:33.340 +these things exist so you don't +lose a large amount of time + +0:46:34.240,0:46:36.180 +looking for these files. + +0:46:37.880,0:46:40.500 +There are also, I think I have it installed + +0:46:40.500,0:46:44.829 +also something more similar to what +you would expect your OS to have, + +0:46:44.829,0:46:49.960 +like Nautilus or one of the Mac +finders that have like an + +0:46:49.960,0:46:59.260 +interactive input where you can just use your +navigation arrows and quickly explore. + +0:46:59.260,0:47:03.849 +It might be overkill but you'll +be surprised how quickly you can + +0:47:03.849,0:47:07.839 +make sense of some directory structure +by just navigating through it. + +0:47:07.840,0:47:12.780 +And pretty much all of these tools +will let you edit, copy files... + +0:47:12.780,0:47:16.880 +if you just look for the options for them. + +0:47:17.600,0:47:20.100 +The last addendum is kind of going places.
+ +0:47:20.100,0:47:24.480 +We have "cd", and "cd" is nice, it will get you + +0:47:26.120,0:47:30.060 +to a lot of places. But it's pretty handy if + +0:47:30.069,0:47:33.190 +you can like quickly go places, + +0:47:33.190,0:47:36.730 +either you have been to recently or that + +0:47:36.730,0:47:40.599 +you go frequently. And you can do this in + +0:47:40.599,0:47:42.520 +many ways there's probably... you can start + +0:47:42.520,0:47:44.319 +thinking, oh I can make bookmarks, I can + +0:47:44.319,0:47:46.660 +make... I can make aliases in the shell, + +0:47:46.660,0:47:49.020 +that we will cover at some point, + +0:47:49.020,0:47:53.020 +symlinks... But at this point, + +0:47:53.020,0:47:54.910 +programmers have like built all these + +0:47:54.910,0:47:56.799 +tools, so programmers have already figured + +0:47:56.799,0:47:59.520 +out a really nice way of doing this. + +0:47:59.520,0:48:01.930 +One way of doing this is using what is + +0:48:01.930,0:48:05.760 +called "auto jump", which I +think is not loaded here... + +0:48:14.140,0:48:20.100 +Okay, don't worry. I will cover it +in the command line environment. + +0:48:21.960,0:48:25.579 +I think it's because I disabled +the "Ctrl+R" and that also + +0:48:25.579,0:48:31.309 +affected other parts of the script. +I think at this point if anyone has + +0:48:31.309,0:48:35.480 +any questions that are related to this, +I'll be more than happy to answer + +0:48:35.480,0:48:37.509 +them, if anything was left unclear. + +0:48:37.509,0:48:42.859 +Otherwise, a there's a bunch of +exercises that we wrote, kind of + +0:48:42.859,0:48:46.549 +touching on these topics and we +encourage you to try them and + +0:48:46.549,0:48:48.559 +come to office hours, where we can help + +0:48:48.559,0:48:54.569 +you figure out how to do them, or some +bash quirks that are not clear. 
+ From a7a982fd31426dc0236693423834e8f31a904f3f Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Mon, 3 Feb 2020 22:11:26 -0500 Subject: [PATCH 252/640] Fix incorrect use of WebKit in Q&A notes --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index 9314cc5e..c2c4e26d 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -178,5 +178,5 @@ Two Factor Authentication (2FA) adds an extra layer of protection to your accoun ## Any comments on differences between web browsers? -The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (WebKit). This means that Safari or the Microsoft Edge, both based on WebKit, are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels for privacy reasons. +The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (Blink). This means that Microsoft Edge which is also based on Blink, and Safari, which is based on WebKit, a similar engine to Blink, are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels for privacy reasons. Another browser called [Flow](https://www.ekioh.com/flow-browser/) is not user ready yet, but it is implementing a new rendering engine that promises to be faster than the current ones. 
From 2474d1409c199a4989d354d60ede5ec46a060cb2 Mon Sep 17 00:00:00 2001 From: eir Date: Tue, 4 Feb 2020 07:26:13 -0800 Subject: [PATCH 253/640] command-line.md/Aliases: Fix top-10 history script `history 1` returns only the current event, making it look like the only command in history is the script people just pasted into their terminal. Note that tail prints the last 10 lines by default. --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 5fff8e39..0cf8fe82 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -472,7 +472,7 @@ One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/m 1. Create an alias `dc` that resolves to `cd` for when you type it wrongly. -1. Run `history 1 |awk '{$1="";print substr($0,2)}' |sort | uniq -c | sort -n | tail -n10`) to get your top 10 most used commands and consider writing shorter aliases for them. +1. Run `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` to get your top 10 most used commands and consider writing shorter aliases for them. ## Dotfiles From f72637b13a399e9daedd4d09634ea8d9e6067291 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Tue, 4 Feb 2020 11:44:39 -0500 Subject: [PATCH 254/640] Add note about ZSH Thanks to Jon for catching this. See discussion in https://github.com/missing-semester/missing-semester/pull/8. --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 0cf8fe82..2cee8142 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -472,7 +472,7 @@ One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/m 1. Create an alias `dc` that resolves to `cd` for when you type it wrongly. -1. 
Run `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` to get your top 10 most used commands and consider writing shorter aliases for them. +1. Run `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` to get your top 10 most used commands and consider writing shorter aliases for them. Note: this works for Bash; if you're using ZSH, use `history 1` instead of just `history`. ## Dotfiles From f559da1423439537ec1630edc0cc84ba8d658c22 Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Tue, 4 Feb 2020 12:15:25 -0500 Subject: [PATCH 255/640] Note about 'real' metaprogramming meaning (#9) --- _2020/metaprogramming.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index a0859646..0545f1df 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -20,7 +20,11 @@ In this lecture, we will look at systems for building and testing your code, and for managing dependencies. These may seem like they are of limited importance in your day-to-day as a student, but the moment you interact with a larger code base through an internship or once you enter -the "real world", you will see this everywhere. +the "real world", you will see this everywhere. We should note that +"metaprogramming" can also mean "[programs that operate on +programs](https://en.wikipedia.org/wiki/Metaprogramming)", whereas that +is not quite the definition we are using for the purposes of this +lecture. # Build systems From 23ebf5d9ae7c4ee0cc7cbf801d33512d6808af71 Mon Sep 17 00:00:00 2001 From: Sean Pedersen <37712604+SeanPedersen@users.noreply.github.com> Date: Wed, 5 Feb 2020 01:43:00 +0100 Subject: [PATCH 256/640] Update security.md Would make no sense to encrypt the public key... 
--- _2020/security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/security.md b/_2020/security.md index 2d46840d..20f14883 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -269,7 +269,7 @@ operating system (collected from hardware events, etc.). The public key is stored as-is (it's public, so keeping it a secret is not important), but at rest, the private key should be encrypted on disk. The `ssh-keygen` program prompts the user for a passphrase, and this is fed through a key derivation -function to produce a key, which is then used to encrypt the public key with a +function to produce a key, which is then used to encrypt the private key with a symmetric cipher. In use, once the server knows the client's public key (stored in the From 30c54f9a821810a074664b37538689a6ead4cae3 Mon Sep 17 00:00:00 2001 From: rishav Date: Wed, 5 Feb 2020 01:23:00 +0530 Subject: [PATCH 257/640] Add undo commands --- _2020/version-control.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 02c36928..27ba73e4 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -425,7 +425,6 @@ index 94bab17..f0013b2 100644 - `git add `: adds files to staging area - `git commit`: creates a new commit - Write [good commit messages](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! -- `git commit --amend`: edit a commit's contents/message - `git log`: shows a flattened log of history - `git log --all --graph --decorate`: visualizes history as a DAG - `git diff `: show differences since the last commit @@ -465,6 +464,12 @@ command is used for merging. 
- `git pull`: same as `git fetch; git merge` - `git clone`: download repository from remote +## Undo + +- `git commit --amend`: edit a commit's contents/message +- `git reset HEAD `: unstage a file +- `git checkout -- `: discard changes + # Advanced Git - `git config`: Git is [highly customizable](https://git-scm.com/docs/git-config) From 85240b9efc571f1354a7414549a1e03a5d4508ca Mon Sep 17 00:00:00 2001 From: nixon Date: Wed, 5 Feb 2020 15:09:41 -0600 Subject: [PATCH 258/640] fix typo --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 0ccbe0ae..30cd9d4d 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -53,7 +53,7 @@ ssh myserver 'journalctl | grep sshd | grep "Disconnected from"' | less ``` Why the additional quoting? Well, our logs may be quite large, and it's -wasteful to do stream it all to our computer and then do the filtering. +wasteful to stream it all to our computer and then do the filtering. Instead, we can do the filtering on the remote server, and then massage the data locally. `less` gives us a "pager" that allows us to scroll up and down through the long output. To save some additional traffic while From 5f03c81b1f0f1e15d7b4f7806345e9231dfdd3bd Mon Sep 17 00:00:00 2001 From: Ben Keith <1754187+benlk@users.noreply.github.com> Date: Wed, 5 Feb 2020 19:54:16 -0500 Subject: [PATCH 259/640] Enlarge upon the sha1sum examples To demonstrate how slightly-differing inputs have widely different outputs, and how repeating the same input generates the same output. 
--- _2020/security.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/_2020/security.md b/_2020/security.md index 20f14883..94d02b32 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -74,6 +74,10 @@ on an input using the `sha1sum` command: ```console $ printf 'hello' | sha1sum aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d +$ printf 'hello' | sha1sum +aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d +$ printf 'Hello' | sha1sum +f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0 ``` At a high level, a hash function can be thought of as a hard-to-invert @@ -81,6 +85,7 @@ random-looking (but deterministic) function (and this is the [ideal model of a hash function](https://en.wikipedia.org/wiki/Random_oracle)). A hash function has the following properties: +- Deterministic: the same input always generates the same output. - Non-invertible: it is hard to find an input `m` such that `hash(m) = h` for some desired output `h`. - Target collision resistant: given an input `m_1`, it's hard to find a From 603a05686835b92775373aa74d32751fe55cffa9 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 5 Feb 2020 21:56:07 -0500 Subject: [PATCH 260/640] Add note on SHA-1 being broken Thanks to Ben Keith for suggesting this: https://github.com/missing-semester/missing-semester/pull/15#issuecomment-582686785. --- _2020/security.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/_2020/security.md b/_2020/security.md index 94d02b32..ed0adaf3 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -94,6 +94,14 @@ different input `m_2` such that `hash(m_1) = hash(m_2)`. `hash(m_1) = hash(m_2)` (note that this is a strictly stronger property than target collision resistance). +Note: while it may work for certain purposes, SHA-1 is [no +longer](https://shattered.io/) considered a strong cryptographic hash function. +You might find this table of [lifetimes of cryptographic hash +functions](https://valerieaurora.org/hash.html) interesting. 
However, note that +recommending specific hash functions is beyond the scope of this lecture. If you +are doing work where this matters, you need formal training in +security/cryptography. + ## Applications - Git, for content-addressed storage. The idea of a [hash From cb1b1344002a55726ad2252fdda8dfc3a82c2d0c Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Wed, 5 Feb 2020 22:18:35 -0500 Subject: [PATCH 261/640] Fix typos in example Thanks to Victor Engmark for pointing this out (https://github.com/missing-semester/missing-semester/issues/13). --- _2020/course-shell.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 6d646621..12d4e607 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -197,10 +197,10 @@ To see what lives in a given directory, we use the `ls` command: ```console missing:~$ ls missing:~$ cd .. -missing:~$ ls +missing:/home$ ls missing -missing:~$ cd .. -missing:~$ ls +missing:/home$ cd .. +missing:/$ ls bin boot dev From f71a590601e8fc2a46801bee99b67f8d82ec0057 Mon Sep 17 00:00:00 2001 From: Trevor Manz Date: Thu, 6 Feb 2020 14:07:05 -0500 Subject: [PATCH 262/640] tmux -t is illegal option --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 2cee8142..9354e160 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -140,7 +140,7 @@ The most popular terminal multiplexer these days is [`tmux`](http://man7.org/lin `tmux` expects you to know its keybindings, and they all have the form ` x` where that means press `Ctrl+b` release, and the press `x`. `tmux` has the following hierarchy of objects: - **Sessions** - a session is an independent workspace with one or more windows + `tmux` starts a new session. - + `tmux -t NAME` starts it with that name. + + `tmux new -s NAME` starts it with that name. 
+ `tmux ls` lists the current sessions + Within `tmux` typing ` d` dettaches the current session + `tmux a` attaches the last session. You can use `-t` flag to specify which From 3bddbfb70f1b75e51804a28465cad7313cee8bf4 Mon Sep 17 00:00:00 2001 From: Jose Javier Date: Fri, 7 Feb 2020 12:17:11 -0500 Subject: [PATCH 263/640] Add license and guidelines --- _layouts/lecture.html | 1 + index.md | 5 +++-- license.md | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 37 insertions(+), 2 deletions(-) create mode 100644 license.md diff --git a/_layouts/lecture.html b/_layouts/lecture.html index 486b2c7c..78b9e980 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -18,4 +18,5 @@

{{ page.title }}{% if page.subtitle %}

Edit this page.

+This content is licensed under CC BY-NC-SA.

diff --git a/index.md b/index.md index 51ea7aea..4fe35370 100644 --- a/index.md +++ b/index.md @@ -52,7 +52,7 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 # About the class -**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/). +**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/). **Questions**: Email us at [missing-semester@mit.edu](mailto:missing-semester@mit.edu). # Beyond MIT @@ -78,5 +78,6 @@ AeroAstro](https://aeroastro.mit.edu/) for A/V equipment; and Brandi Adams and ---
-Source code.
+This content is licensed under CC BY-NC-SA.
+See here for contribution & translation guidelines.

diff --git a/license.md b/license.md new file mode 100644 index 00000000..1e77f64a --- /dev/null +++ b/license.md @@ -0,0 +1,33 @@ +--- +layout: default +title: "License" +permalink: /license +--- + +# License + +All the content in this course, including lecture notes, exercises and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). + +This means that you are free to: +- **Share** — copy and redistribute the material in any medium or format +- **Adapt** — remix, transform, and build upon the material + +Under the following terms: + +- **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. +- **NonCommercial** — You may not use the material for commercial purposes. +- **ShareAlike** — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. + +This is a human-readable summary of (and not a substitute for) the [license](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode). + +## Contribution guidelines + +You can submit corrections and suggestions to the course material by submitting issues and pull requests on our GitHub [repo](https://github.com/missing-semester/missing-semester). This includes the captions for the video lectures which are also in the repo (see [here](https://github.com/missing-semester/missing-semester/tree/master/static/files/subtitles/2020)). + +## Translation guidelines + +You are free to translate the lecture notes and exercises as long as you follow the license terms. +If your translation mirrors the course structure, please contact us so we can link your translated version from our page. 
+ +For translating the video captions, please submit your translations as community contributions in YouTube. + From a1df34701a15a8d27fc472a9642fcdb28480a35a Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 7 Feb 2020 12:35:48 -0500 Subject: [PATCH 264/640] Tweak license wording - Add back source code link - Shorten "Licensed under..." wording - Remove link to CC license on homepage --- _layouts/lecture.html | 2 +- index.md | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/_layouts/lecture.html b/_layouts/lecture.html index 78b9e980..ce758efe 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -18,5 +18,5 @@

{{ page.title }}{% if page.subtitle %}

Edit this page.

-This content is licensed under CC BY-NC-SA.
+Licensed under CC BY-NC-SA.

diff --git a/index.md b/index.md index 4fe35370..7de28ac3 100644 --- a/index.md +++ b/index.md @@ -78,6 +78,7 @@ AeroAstro](https://aeroastro.mit.edu/) for A/V equipment; and Brandi Adams and ---
-This content is licensed under CC BY-NC-SA.
+Source code.
+Licensed under CC BY-NC-SA.
 See here for contribution & translation guidelines.

From 9f4e08cd82e3383b077da3f922bd48605b086d32 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 7 Feb 2020 14:40:42 -0500 Subject: [PATCH 265/640] Clarify that license covers source code --- README.md | 4 ++++ license.md | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index b30f24d5..90cac77c 100644 --- a/README.md +++ b/README.md @@ -12,3 +12,7 @@ To build and view the site locally, run: ```bash bundle exec jekyll serve -w ``` + +## License + +All the content in this course, including the website source code, lecture notes, exercises, and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). diff --git a/license.md b/license.md index 1e77f64a..cd5180fd 100644 --- a/license.md +++ b/license.md @@ -6,7 +6,7 @@ permalink: /license # License -All the content in this course, including lecture notes, exercises and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). +All the content in this course, including the website source code, lecture notes, exercises, and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). 
This means that you are free to: - **Share** — copy and redistribute the material in any medium or format From a5019d4a3594c29ca1f892e746a6a830e547c966 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sat, 8 Feb 2020 20:15:48 -0500 Subject: [PATCH 266/640] Add link to website license page --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 90cac77c..f4573f17 100644 --- a/README.md +++ b/README.md @@ -15,4 +15,4 @@ bundle exec jekyll serve -w ## License -All the content in this course, including the website source code, lecture notes, exercises, and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). +All the content in this course, including the website source code, lecture notes, exercises, and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). See [here](https://missing.csail.mit.edu/license) for more information on contributions or translations. 
From 5191c399d0ce438940a901bebc30c4af7a448659 Mon Sep 17 00:00:00 2001 From: Teresa Krohn Date: Sun, 9 Feb 2020 15:35:45 +0100 Subject: [PATCH 267/640] Fix wrong option in alias example --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 9354e160..f9053b9e 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -194,7 +194,7 @@ alias sl=ls # Overwrite existing commands for better defaults alias mv="mv -i" # -i prompts before overwrite alias mkdir="mkdir -p" # -p make parent dirs as needed -alias df="df -p" # -h prints human readable format +alias df="df -h" # -h prints human readable format # Alias can be composed alias la="ls -A" From a10be139a2c5b8bc4d8c11d5ecc630ac8d890582 Mon Sep 17 00:00:00 2001 From: Christopher Hogg Date: Mon, 10 Feb 2020 10:11:40 +0100 Subject: [PATCH 268/640] fix: typo basic change to the sentence structure to make it clearer. --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 392e36d2..971188d1 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -36,7 +36,7 @@ echo '$foo' # prints $foo ``` -As most programming languages bash supports control flow techniques including `if`, `case`, `while` and `for`. +As with most programming languages, bash supports control flow techniques including `if`, `case`, `while` and `for`. Similarly, `bash` has functions that take arguments and can operate with them. Here is an example of a function that creates a directory and `cd`s into it. 
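Reviewer note on the shell-tools.md hunk above: its context ends just before the function it mentions. The following is a minimal sketch of what such a function, plus a small `case` statement, could look like. It assumes a POSIX-style shell, and the names `mcd` and `describe` are illustrative rather than taken from this patch.

```shell
# Hypothetical helper: create a directory (including missing parents)
# and change into it. Quoting "$1" keeps names with spaces intact.
mcd() {
    mkdir -p -- "$1" && cd -- "$1"
}

# Hypothetical helper: classify a file name by its extension using `case`,
# one of the control flow constructs named in the hunk.
describe() {
    case "$1" in
        *.py) echo "python" ;;
        *.sh) echo "shell" ;;
        *)    echo "other" ;;
    esac
}
```

Because `cd` only affects the shell that runs it, `mcd` has to be defined as a function (for example in a sourced file) rather than as a standalone script.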
From b4afe4351a3ca3b8e3f7439fdc993dbd23ab3ac8 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Thu, 13 Feb 2020 17:29:17 -0500 Subject: [PATCH 269/640] Fix numbering --- _2020/shell-tools.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 971188d1..3d0ed9f8 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -333,20 +333,20 @@ polo() { Write a bash script that runs the following script until it fails and captures its standard output and error streams to files and prints everything at the end. Bonus points if you can also report how many runs it took for the script to fail. -```bash -#!/usr/bin/env bash + ```bash + #!/usr/bin/env bash -n=$(( RANDOM % 100 )) + n=$(( RANDOM % 100 )) -if [[ n -eq 42 ]]; then - echo "Something went wrong" - >&2 echo "The error was using magic numbers" - exit 1 -fi + if [[ n -eq 42 ]]; then + echo "Something went wrong" + >&2 echo "The error was using magic numbers" + exit 1 + fi -echo "Everything went according to plan" + echo "Everything went according to plan" + ``` -``` {% comment %} #!/usr/bin/env bash From 2f1493e190ac00afe7b00d81869b826fa3ad6656 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 24 Feb 2020 10:42:13 -0500 Subject: [PATCH 270/640] Clarify lack of sysfs on macOS/Windows --- _2020/course-shell.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 12d4e607..18d854ba 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -304,11 +304,13 @@ When you get permission denied errors, it is usually because you need to do something as root. Though make sure you first double-check that you really wanted to do it that way! -One thing you need to be root in order to do is writing to the `sysfs` -file system mounted under `/sys`. 
`sysfs` exposes a number of kernel -parameters as files, so that you can easily reconfigure the kernel on -the fly without specialized tools. For example, the brightness of your -laptop's screen is exposed through a file called `brightness` under +One thing you need to be root in order to do is writing to the `sysfs` file +system mounted under `/sys`. `sysfs` exposes a number of kernel parameters as +files, so that you can easily reconfigure the kernel on the fly without +specialized tools. **Note that sysfs does not exist on Windows or macOS.** + +For example, the brightness of your laptop's screen is exposed through a file +called `brightness` under ``` /sys/class/backlight From a0c130f34193ba67f6dc70e484e9735759b246e6 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 24 Feb 2020 10:42:29 -0500 Subject: [PATCH 271/640] Clarify `./semester` vs `sh semester` in exercises --- _2020/course-shell.md | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 18d854ba..cc038ef7 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -363,9 +363,9 @@ there. # Exercises 1. Create a new directory called `missing` under `/tmp`. - 2. Look up the `touch` program. The `man` program is your friend. - 3. Use `touch` to create a new file called `semester` in `missing`. - 4. Write the following into that file, one line at a time: + 1. Look up the `touch` program. The `man` program is your friend. + 1. Use `touch` to create a new file called `semester` in `missing`. + 1. Write the following into that file, one line at a time: ``` #!/bin/sh curl --head --silent https://missing.csail.mit.edu @@ -376,12 +376,19 @@ there. differently: they will do the trick in this case. See the Bash [quoting](https://www.gnu.org/software/bash/manual/html_node/Quoting.html) manual page for more information. - 5. Try to execute the file. Investigate why it doesn't work with `ls`. - 6. 
Look up the `chmod` program. - 7. Use `chmod` to make it possible to run the command `./semester`. - 8. Use `|` and `>` to write the "last modified" date output by + 1. Try to execute the file, i.e. type the path to the script (`./semester`) + into your shell and press enter. Understand why it doesn't work by + consulting the output of `ls` (hint: look at the permission bits of the + file). + 1. Run the command by explicitly starting the `sh` interpreter, and giving it + the file `semester` as the first argument, i.e. `sh semester`. Why does + this work, while `./semester` didn't? + 1. Look up the `chmod` program (e.g. use `man chmod`). + 1. Use `chmod` to make it possible to run the command `./semester` rather than + having to type `sh semester`. + 1. Use `|` and `>` to write the "last modified" date output by `semester` into a file called `last-modified.txt` in your home directory. - 9. Write a command that reads out your laptop battery's power level or your + 1. Write a command that reads out your laptop battery's power level or your desktop machine's CPU temperature from `/sys`. Note: if you're a macOS user, your OS doesn't have sysfs, so you can skip this exercise. From 9ec7aaed6e86a4a9adef1fc8654782a87956fe5f Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 24 Feb 2020 10:50:59 -0500 Subject: [PATCH 272/640] Add more info on shebang --- _2020/course-shell.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index cc038ef7..c37b118e 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -385,7 +385,10 @@ there. this work, while `./semester` didn't? 1. Look up the `chmod` program (e.g. use `man chmod`). 1. Use `chmod` to make it possible to run the command `./semester` rather than - having to type `sh semester`. + having to type `sh semester`. How does your shell know that the file is + supposed to be interpreted using `sh`? 
See this page on the + [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) line for more + information. 1. Use `|` and `>` to write the "last modified" date output by `semester` into a file called `last-modified.txt` in your home directory. From 2d708d2f30568677f646ff685d006c2c34a76ffe Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed, 26 Feb 2020 12:02:50 +0000 Subject: [PATCH 273/640] Bump nokogiri from 1.10.7 to 1.10.8 Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.10.7 to 1.10.8. - [Release notes](https://github.com/sparklemotion/nokogiri/releases) - [Changelog](https://github.com/sparklemotion/nokogiri/blob/master/CHANGELOG.md) - [Commits](https://github.com/sparklemotion/nokogiri/compare/v1.10.7...v1.10.8) Signed-off-by: dependabot[bot] --- Gemfile.lock | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Gemfile.lock b/Gemfile.lock index 80ec8845..3f6d2449 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -204,7 +204,7 @@ GEM jekyll-seo-tag (~> 2.1) minitest (5.13.0) multipart-post (2.1.1) - nokogiri (1.10.7) + nokogiri (1.10.8) mini_portile2 (~> 2.4.0) octokit (4.15.0) faraday (>= 0.9) From 16c5590052ae8096fd659cbe5b788695feca8a59 Mon Sep 17 00:00:00 2001 From: piaoliangkb <418508556@qq.com> Date: Fri, 28 Feb 2020 12:56:40 +0800 Subject: [PATCH 274/640] fix typo --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index f45edbdf..75e1b527 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -100,7 +100,7 @@ This way, local programs can see the file as if it was in your computer while in This is effectively what `sshfs` does. Some interesting examples of FUSE filesystems are: -- [sshfs](https://github.com/libfuse/sshfs) - Open locally remote files/folder thorugh an SSH connection. 
+- [sshfs](https://github.com/libfuse/sshfs) - Open locally remote files/folder through an SSH connection. - [rclone](https://rclone.org/commands/rclone_mount/) - Mount cloud storage services like Dropbox, GDrive, Amazon S3 or Google Cloud Storage and open data locally. - [gocryptfs](https://nuetzlich.net/gocryptfs/) - Encrypted overlay system. Files are stored encrypted but once the FS is mounted they appear as plaintext in the mountpoint. - [kbfs](https://keybase.io/docs/kbfs) - Distributed filesystem with end-to-end encryption. You can have private, shared and public folders. From 2202d3d8004b0a0ad926e2a6d64e6dcce1f558e4 Mon Sep 17 00:00:00 2001 From: Michael Date: Wed, 4 Mar 2020 16:23:59 -0800 Subject: [PATCH 275/640] incorrect link in exercises: terminal multiplexer Second link in Exercises: Terminal multiplexer is a duplicate of the first link. Currently, both point to https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/ Did you mean to direct the second link (about customizing your tmux conf) to the sequel blog post from the Ham Vocke at https://www.hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/ ? --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index f9053b9e..5ec01b00 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -466,7 +466,7 @@ One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/m ## Terminal multiplexer -1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/). +1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/). 
## Aliases From 69cf523c62feb224a26ea9f8d999f96b6478f00b Mon Sep 17 00:00:00 2001 From: Colin Menzies Date: Sun, 8 Mar 2020 16:04:53 +0000 Subject: [PATCH 276/640] Typo --- _2019/backups.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2019/backups.md b/_2019/backups.md index a35657fb..7262f4c7 100644 --- a/_2019/backups.md +++ b/_2019/backups.md @@ -28,7 +28,7 @@ The main idea behind this recommendation is not to put all your eggs in one bask ## Testing your backups -An common pitfall when performing backups is blindly trusting whatever the system says it's doing and not verifying that the data can be properly recovered. Toy Story 2 was almost lost and their backups were not working, [luck](https://www.youtube.com/watch?v=8dhp_20j0Ys) ended up saving them. +A common pitfall when performing backups is blindly trusting whatever the system says it's doing and not verifying that the data can be properly recovered. Toy Story 2 was almost lost and their backups were not working, [luck](https://www.youtube.com/watch?v=8dhp_20j0Ys) ended up saving them. ## Versioning From bed23660df86e03f7e6ad56e7b0bae10ee6dd506 Mon Sep 17 00:00:00 2001 From: Tony Liu Date: Fri, 13 Mar 2020 00:42:08 +0800 Subject: [PATCH 277/640] fix the typo fix the typo of "hash function" --- _2020/security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/security.md b/_2020/security.md index ed0adaf3..70098fad 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -106,7 +106,7 @@ security/cryptography. - Git, for content-addressed storage. The idea of a [hash function](https://en.wikipedia.org/wiki/Hash_function) is a more general -concept (there are non-cryptographic has functions). Why does Git use a +concept (there are non-cryptographic hash functions). Why does Git use a cryptographic hash function? - A short summary of the contents of a file. Software can often be downloaded from (potentially less trustworthy) mirrors, e.g. 
Linux ISOs, and it would be From 868e353df816f0f3d5e77c8a81616a5707724332 Mon Sep 17 00:00:00 2001 From: Jacinta Date: Mon, 16 Mar 2020 23:50:09 +0400 Subject: [PATCH 278/640] Fix typo in example --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 3d0ed9f8..16e579f2 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -133,7 +133,7 @@ mv *{.py,.sh} folder mkdir foo bar # This creates files foo/a, foo/b, ... foo/h, bar/a, bar/b, ... bar/h -touch {foo,bar}/{a..j} +touch {foo,bar}/{a..h} touch foo/x bar/y # Show differences between files in foo and bar diff <(ls foo) <(ls bar) From fb423a15382db22eb08c2ab63336b4793c875ca3 Mon Sep 17 00:00:00 2001 From: Rohan Bansal Date: Fri, 20 Mar 2020 20:36:26 -0700 Subject: [PATCH 279/640] Added prettier as example of a code formatter --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 7f416c29..125e7d29 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -206,7 +206,7 @@ In vim, the plugins [`ale`](https://vimawesome.com/plugin/ale) or [`syntastic`]( For Python, [`pylint`](https://www.pylint.org) and [`pep8`](https://pypi.org/project/pep8/) are examples of stylistic linters and [`bandit`](https://pypi.org/project/bandit/) is a tool designed to find common security issues. For other languages people have compiled comprehensive lists of useful static analysis tools, such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). -A complementary tool to stylistic linting are code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go or `rustfmt` for Rust. 
+A complementary tool to stylistic linting are code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go, `rustfmt` for Rust or [`prettier`](https://prettier.io/) for JavaScript, HTML and CSS. These tools autoformat your code so that it's consistent with common stylistic patterns for the given programming language. Although you might be unwilling to give stylistic control about your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. From 6eacdaeb5b899d36b33ea119f073cbda59db8104 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Sat, 21 Mar 2020 10:51:05 -0400 Subject: [PATCH 280/640] Fix typo --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 7f416c29..dd4a76f0 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -314,7 +314,7 @@ $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py ``` -A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they display time per function call. That can become intuitive really fast, specially if you are using third party libraries in your code since internal function calls are also accounted for. +A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they display time per function call. That can become unintuitive really fast, specially if you are using third party libraries in your code since internal function calls are also accounted for. A more intuitive way of displaying profiling information is to include the time taken per line of code, which is what _line profilers_ do. 
For instance, the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page: From ec9ff485609fd4786f619f5e24ae75dc3961d617 Mon Sep 17 00:00:00 2001 From: lepasq <53230128+lepasq@users.noreply.github.com> Date: Sun, 22 Mar 2020 14:08:08 +0100 Subject: [PATCH 281/640] Fix typo inside shell-tools.md Added missing period. --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 16e579f2..a22480a3 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -110,7 +110,7 @@ done ``` In the comparison we tested whether `$?` was not equal to 0. -Bash implements many comparsions of this sort, you can find a detailed list in the manpage for [`test`](http://man7.org/linux/man-pages/man1/test.1.html) +Bash implements many comparsions of this sort, you can find a detailed list in the manpage for [`test`](http://man7.org/linux/man-pages/man1/test.1.html). When performing comparisons in bash try to use double brackets `[[ ]]` in favor of simple brackets `[ ]`. Chances of making mistakes are lower although it won't be portable to `sh`. A more detailed explanation can be found [here](http://mywiki.wooledge.org/BashFAQ/031). When launching scripts, you will often want to provide arguments that are similar. Bash has ways of making this easier, expanding expressions by carrying out filename expansion. These techniques are often referred to as shell _globbing_. 
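Reviewer note on the shell-tools.md hunk above: the context lines mention `test`-style comparisons and globbing together. A small sketch combining the two follows, assuming a POSIX-style shell. The function name `count_matches` is hypothetical and does not appear in the lecture notes.

```shell
# Hypothetical helper: print how many entries in directory $1 match glob $2.
# Variables are global here because POSIX sh has no `local`.
count_matches() {
    dir=$1
    pattern=$2
    # $pattern is left unquoted on purpose so the shell expands the glob;
    # the results replace the function's positional parameters.
    set -- "$dir"/$pattern
    # When nothing matches, the pattern stays as a literal path that
    # does not exist, so check the first result with `test -e`.
    if [ ! -e "$1" ]; then
        echo 0
    else
        echo "$#"
    fi
}
```

The sketch uses single brackets to stay portable to `sh`. In bash-only scripts the double-bracket form recommended in the hunk works the same way here.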
From 282630eaa68ade906f43c8b46debea97f4fb56d9 Mon Sep 17 00:00:00 2001 From: Michael Date: Mon, 23 Mar 2020 15:10:12 -0700 Subject: [PATCH 282/640] link to info about ssh-agent is incorrect linked url: https://www.ssh.com/ssh/agents correct url: https://www.ssh.com/ssh/agent --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 5ec01b00..d8a401c1 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -492,7 +492,7 @@ Let's get you up to speed with dotfiles. Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. -1. Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you use a password and use `ssh-agent` , more info [here](https://www.ssh.com/ssh/agents). +1. Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you use a password and use `ssh-agent` , more info [here](https://www.ssh.com/ssh/agent). 1. Edit `.ssh/config` to have an entry as follows ```bash From 1d4a1c9d117a14077266974676695eb9bd6df525 Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Mon, 23 Mar 2020 22:17:54 -0400 Subject: [PATCH 283/640] Fix mixup between RemoteForward and LocalForward Thanks to @carpdiem for pointing this out in https://github.com/missing-semester/missing-semester/issues/31. 
--- _2020/command-line.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index d8a401c1..6425933f 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -383,7 +383,7 @@ The most common scenario is local port forwarding, where a service in the remote We have covered many many arguments that we can pass. A tempting alternative is to create shell aliases that look like ```bash -alias my_server="ssh -i ~/.id_ed25519 --port 2222 - L 9999:localhost:8888 foobar@remote_server +alias my_server="ssh -i ~/.id_ed25519 --port 2222 -L 9999:localhost:8888 foobar@remote_server ``` However, there is a better alternative using `~/.ssh/config`. @@ -394,7 +394,7 @@ Host vm HostName 172.16.174.141 Port 2222 IdentityFile ~/.ssh/id_ed25519 - RemoteForward 9999 localhost:8888 + LocalForward 9999 localhost:8888 # Configs can also take wildcards Host *.mit.edu @@ -500,7 +500,7 @@ Host vm User username_goes_here HostName ip_goes_here IdentityFile ~/.ssh/id_ed25519 - RemoteForward 9999 localhost:8888 + LocalForward 9999 localhost:8888 ``` 1. Use `ssh-copy-id vm` to copy your ssh key to the server. 1. Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. From 990cf13114c007b7a0ee88018ba5924e4790db1b Mon Sep 17 00:00:00 2001 From: Michael Date: Thu, 26 Mar 2020 17:21:00 -0700 Subject: [PATCH 284/640] Change link to cProfile from python2 -> python3 docs The current link to cProfile in the exercises links to the python2 documentation. Maybe this should be the python3 documentation instead? --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 27eb83e2..b3df91a8 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -489,7 +489,7 @@ done 1. 
(Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). ## Profiling -1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/2/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. +1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/3/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. 1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. From 2412043d6f658c96cf384076770bdeec19f76e7e Mon Sep 17 00:00:00 2001 From: Jon Gjengset Date: Fri, 27 Mar 2020 09:02:40 -0400 Subject: [PATCH 285/640] Typo; fixes #33 --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 6425933f..55bbbf07 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -318,7 +318,7 @@ ssh foo@bar.mit.edu ``` Here we are trying to ssh as user `foo` in server `bar.mit.edu`. 
-The server can be specified with a URL (like `bar.mit.edu`) or an IP (something like `foobar@192.168.1.42`). Later we will shee that if we modify ssh config file you can access just using something like `ssh bar`. +The server can be specified with a URL (like `bar.mit.edu`) or an IP (something like `foobar@192.168.1.42`). Later we will see that if we modify ssh config file you can access just using something like `ssh bar`. ## Executing commands From 9f0546166263c2266fbf03c2985c6866e1fcf96f Mon Sep 17 00:00:00 2001 From: Krishnakumar Gopalakrishnan Date: Fri, 3 Apr 2020 19:53:26 +0100 Subject: [PATCH 286/640] adds a number of typo fixes to markdown files in the project repo --- _2019/backups.md | 4 ++-- _2019/data-wrangling.md | 4 ++-- _2019/editors.md | 2 +- _2019/remote-machines.md | 2 +- _2019/shell.md | 4 ++-- _2019/version-control.md | 2 +- _2019/web.md | 2 +- _2020/command-line.md | 4 ++-- _2020/shell-tools.md | 4 ++-- 9 files changed, 14 insertions(+), 14 deletions(-) diff --git a/_2019/backups.md b/_2019/backups.md index 7262f4c7..21df3cf9 100644 --- a/_2019/backups.md +++ b/_2019/backups.md @@ -50,7 +50,7 @@ However, making several copies of your data might be extremely costly in terms o Since we might be backing up to untrusted third parties like cloud providers it is worth considering that if you backup your data is copied *as is* then it could potentially be looked by unwanted agents. Documents like your taxes are sensitive information that should not be backed up in plain format. To prevent this, many backup solutions offer **client side encryption** where data is encrypted before being sent to the server. That way the server cannot read the data it is storing but you can decrypt it with your secret key. -As a side note, if your disk (or home partition) is not encrypted, then anyone that get ahold of your computer can manage to override the user access controls and read your data. 
Modern hardware supports fast and efficient read and writes of encrypted data so you might want to consider enabling **full disk encryption**. +As a side note, if your disk (or home partition) is not encrypted, then anyone that get hold of your computer can manage to override the user access controls and read your data. Modern hardware supports fast and efficient read and writes of encrypted data so you might want to consider enabling **full disk encryption**. ## Append only @@ -67,7 +67,7 @@ Some other things you may want to look into are: - **Periodic backups**: outdated backups can become pretty useless. Making backups regularly should be a consideration for your system - **Bootable backups**: some programs allow you to clone your entire disk. That way you have an image that contains an entire copy of your system you can boot directly from. - **Differential backup strategies**, you may not necessarily care the same about all your data. You can define different backup policies for different types of data. -- **Append only backups** an additional consideration is to enforce append only operations to your backup repositories in order to prevent malicious agents to delete them if they get ahold of your machine. +- **Append only backups** an additional consideration is to enforce append only operations to your backup repositories in order to prevent malicious agents to delete them if they get hold of your machine. ## Webservices diff --git a/_2019/data-wrangling.md b/_2019/data-wrangling.md index 3a1496a1..aabc7587 100644 --- a/_2019/data-wrangling.md +++ b/_2019/data-wrangling.md @@ -163,7 +163,7 @@ easy](https://emailregex.com/). And there's [lots of discussion](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982). And people have [written tests](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php). -And [test matrixes](https://mathiasbynens.be/demo/url-regex). 
You can +And [test matrices](https://mathiasbynens.be/demo/url-regex). You can even write a regex for determining if a given number [is a prime number](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/). @@ -220,7 +220,7 @@ ssh myserver journalctl | sort -nk1,1 | tail -n10 ``` -`sort -n` will sort in numeric (instead of lexiographic) order. `-k1,1` +`sort -n` will sort in numeric (instead of lexicographic) order. `-k1,1` means "sort by only the first whitespace-separated column". The `,n` part says "sort until the `n`th field, where the default is the end of the line. In this _particular_ example, sorting by the whole line diff --git a/_2019/editors.md b/_2019/editors.md index 4e770211..acd0c072 100644 --- a/_2019/editors.md +++ b/_2019/editors.md @@ -69,7 +69,7 @@ features such as 24-bit color, menus, and popups. - Vim is a **modal** editor: different modes for inserting text vs manipulating text - Vim is programmable (with Vimscript and also other languages like Python) - Vim's interface itself is like a programming language - - Keystrokes (with mneumonic names) are commands + - Keystrokes (with mnemonic names) are commands - Commands are composable - Don't use the mouse: too slow - Editor should work at the speed you think diff --git a/_2019/remote-machines.md b/_2019/remote-machines.md index dae55eba..b2cf727a 100644 --- a/_2019/remote-machines.md +++ b/_2019/remote-machines.md @@ -23,7 +23,7 @@ An often overlooked feature of `ssh` is the ability to run commands directly. Key-based authentication exploits public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to reenter your password every time. Nevertheless the private key (e.g. `~/.ssh/id_rsa`) is effectively your password so treat it like so. -- Key generation. To generate a pair you can simply run `ssh-keygen -t rsa -b 4096`. 
If you do not choose a passphrase anyone that gets ahold of your private key will be able to access authorized servers so it is recommended to choose one and use `ssh-agent` to manage shell sessions. +- Key generation. To generate a pair you can simply run `ssh-keygen -t rsa -b 4096`. If you do not choose a passphrase anyone that gets hold of your private key will be able to access authorized servers so it is recommended to choose one and use `ssh-agent` to manage shell sessions. If you have configured pushing to Github using SSH keys you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`. diff --git a/_2019/shell.md b/_2019/shell.md index 388c571a..35e79407 100644 --- a/_2019/shell.md +++ b/_2019/shell.md @@ -253,7 +253,7 @@ Also, a double dash `--` is used in built-in commands and many other commands to ## Exercises -1. If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide). If you want a more indepth introduction [The Linux Command Line](http://linuxcommand.org/tlcl.php) is a good resource. +1. If you are completely new to the shell you may want to read a more comprehensive guide about it such as [BashGuide](http://mywiki.wooledge.org/BashGuide). If you want a more in-depth introduction [The Linux Command Line](http://linuxcommand.org/tlcl.php) is a good resource. 1. **PATH, which, type** @@ -287,7 +287,7 @@ Also, a double dash `--` is used in built-in commands and many other commands to print("Hello World!") ``` - You will often see programs that have a shebang that looks like `#! usr/bin/env bash`. 
This is a more portable solution with it own set of [advantages and disadvantages](https://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my). How is `env` different from `which`? What environment vairable does `env` use to decide what program to run? + You will often see programs that have a shebang that looks like `#! usr/bin/env bash`. This is a more portable solution with it own set of [advantages and disadvantages](https://unix.stackexchange.com/questions/29608/why-is-it-better-to-use-usr-bin-env-name-instead-of-path-to-name-as-my). How is `env` different from `which`? What environment variable does `env` use to decide what program to run? 1. **Pipes, process substitution, subshell** diff --git a/_2019/version-control.md b/_2019/version-control.md index 2e8018db..2576fc37 100644 --- a/_2019/version-control.md +++ b/_2019/version-control.md @@ -350,7 +350,7 @@ if your push is rejected, what do you do? 1. Once you start to get more familiar with `git`, you will find yourself running into common tasks, such as editing your `.gitignore`. [git extras](https://github.com/tj/git-extras/blob/master/Commands.md) provides a bunch of little utilities that integrate with `git`. For example `git ignore PATTERN` will add the specified pattern to the `.gitignore` file in your repo and `git ignore-io LANGUAGE` will fetch the common ignore patterns for that language from [gitignore.io](https://www.gitignore.io). Install `git extras` and try using some tools like `git alias` or `git ignore`. -1. Git GUI programs can be a great resource sometimes. Try running [gitk](https://git-scm.com/docs/gitk) in a git repo an explore the differents parts of the interface. Then run `gitk --all` what are the differences? +1. Git GUI programs can be a great resource sometimes. Try running [gitk](https://git-scm.com/docs/gitk) in a git repo an explore the different parts of the interface. Then run `gitk --all` what are the differences? 
1. Once you get used to command line applications GUI tools can feel cumbersome/bloated. A nice compromise between the two are ncurses based tools which can be navigated from the command line and still provide an interactive interface. Git has [tig](https://github.com/jonas/tig), try installing it and running it in a repo. You can find some usage examples [here](https://www.atlassian.com/blog/git/git-tig). diff --git a/_2019/web.md b/_2019/web.md index fcbfdd15..9040bf25 100644 --- a/_2019/web.md +++ b/_2019/web.md @@ -178,7 +178,7 @@ snapshot_wayback(driver, url) 1. Edit a keyword search engine that you use often in your web browser 1. Install the mentioned extensions. Look into how uBlock Origin/Privacy Badger can be disabled for a website. What differences do you see? Try doing it in a website with plenty of ads like YouTube. -1. Install Stylus and write a custom style for the class website using the CSS provided. Here are some commmon programming characters `= == === >= => ++ /= ~=`. What happens to them when changing the font to Fira Code? If you want to know more search for programming font ligatures. +1. Install Stylus and write a custom style for the class website using the CSS provided. Here are some common programming characters `= == === >= => ++ /= ~=`. What happens to them when changing the font to Fira Code? If you want to know more search for programming font ligatures. 1. Find a web api to get the weather in your city/area. 1. Use a WebDriver software like [Selenium](https://docs.seleniumhq.org/) to automate some repetitive manual task that you perform often with your browser. diff --git a/_2020/command-line.md b/_2020/command-line.md index 55bbbf07..d5579ef4 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -142,7 +142,7 @@ The most popular terminal multiplexer these days is [`tmux`](http://man7.org/lin + `tmux` starts a new session. + `tmux new -s NAME` starts it with that name. 
+ `tmux ls` lists the current sessions - + Within `tmux` typing `<C-b> d` dettaches the current session + + Within `tmux` typing `<C-b> d` detaches the current session + `tmux a` attaches the last session. You can use `-t` flag to specify which - **Windows** - Equivalent to tabs in editors or browsers, they are visually separate parts of the same session @@ -337,7 +337,7 @@ To generate a pair you can run [`ssh-keygen`](http://man7.org/linux/man-pages/ma ```bash ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/id_ed25519 ``` -You should choose a passphrase, to avoid someone who gets ahold of your private key to access authorized servers. Use [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) or [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) so you do not have to type your passphrase every time. +You should choose a passphrase, to avoid someone who gets hold of your private key to access authorized servers. Use [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) or [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) so you do not have to type your passphrase every time. If you have ever configured pushing to GitHub using SSH keys, then you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid key pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`. diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index a22480a3..5bde84d9 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -110,7 +110,7 @@ done ``` In the comparison we tested whether `$?` was not equal to 0. -Bash implements many comparsions of this sort, you can find a detailed list in the manpage for [`test`](http://man7.org/linux/man-pages/man1/test.1.html). +Bash implements many comparisons of this sort, you can find a detailed list in the manpage for [`test`](http://man7.org/linux/man-pages/man1/test.1.html). 
When performing comparisons in bash try to use double brackets `[[ ]]` in favor of simple brackets `[ ]`. Chances of making mistakes are lower although it won't be portable to `sh`. A more detailed explanation can be found [here](http://mywiki.wooledge.org/BashFAQ/031). When launching scripts, you will often want to provide arguments that are similar. Bash has ways of making this easier, expanding expressions by carrying out filename expansion. These techniques are often referred to as shell _globbing_. @@ -179,7 +179,7 @@ Short for manual, [`man`](http://man7.org/linux/man-pages/man1/man.1.html) provi For example, `man rm` will output the behavior of the `rm` command along with the flags that it takes including the `-i` flag we showed earlier. In fact, what I have been linking so far for every command are the online version of Linux manpages for the commands. Even non native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process. -For interactive tools such as the ones based on ncurses, help for the comands can often be accessed within the program using the `:help` command or typing `?`. +For interactive tools such as the ones based on ncurses, help for the commands can often be accessed within the program using the `:help` command or typing `?`. Sometimes manpages can be overly detailed descriptions of the commands and it can become hard to decipher what flags/syntax to use for common use cases. [TLDR pages](https://tldr.sh/) are a nifty complementary solution that focuses on giving example use cases of a command so you can quickly figure out which options to use. 
From a90e1a81baef98a3290c1138032ea671055e9433 Mon Sep 17 00:00:00 2001 From: cijad Date: Sun, 5 Apr 2020 20:18:51 +0200 Subject: [PATCH 287/640] Fix small typos --- _2020/command-line.md | 10 +++++----- _2020/data-wrangling.md | 2 +- _2020/debugging-profiling.md | 6 +++--- _2020/metaprogramming.md | 2 +- _2020/potpourri.md | 6 +++--- _2020/qa.md | 16 ++++++++-------- _2020/security.md | 2 +- _2020/shell-tools.md | 16 ++++++++-------- _2020/version-control.md | 4 ++-- 9 files changed, 32 insertions(+), 32 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index d5579ef4..9a11cbce 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -129,9 +129,9 @@ You can learn more about these and other signals [here](https://en.wikipedia.org When using the command line interface you will often want to run more than one thing at once. For instance, you might want to run your editor and your program side by side. -Although this can be achieved opening new terminal windows, using a terminal multiplexer is a more versatile solution. +Although this can be achieved by opening new terminal windows, using a terminal multiplexer is a more versatile solution. -Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow to multiplex terminal windows using panes and tabs so you can interact multiple shell sessions. +Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow to multiplex terminal windows using panes and tabs so you can interact with multiple shell sessions. Moreover, terminal multiplexers let you detach a current terminal session and reattach at some point later in time. This can make your workflow much better when working with remote machines since it voids the need to use `nohup` and similar tricks. 
@@ -153,7 +153,7 @@ The most popular terminal multiplexer these days is [`tmux`](http://man7.org/lin + `<C-b> ,` Rename the current window + `<C-b> w` List current windows -- **Panes** - Like vim splits, pane let you have multiple shells in the same visual display. +- **Panes** - Like vim splits, pane lets you have multiple shells in the same visual display. + `<C-b> "` Split the current pane horizontally + `<C-b> %` Split the current pane vertically + `<C-b> <direction>` Move to the pane in the specified _direction_. Direction here means arrow keys. @@ -221,7 +221,7 @@ Many programs are configured using plain-text files known as _dotfiles_ hidden in the directory listing `ls` by default). Shells are one example of programs configured with such files. On startup, your shell will read many files to load its configuration. -Depending of the shell, whether you are starting a login and/or interactive the entire process can be quite complex. +Depending on the shell, whether you are starting a login and/or interactive the entire process can be quite complex. [Here](https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html) is an excellent resource on the topic. For `bash`, editing your `.bashrc` or `.bash_profile` will work in most systems. @@ -406,7 +406,7 @@ An additional advantage of using the `~/.ssh/config` file over aliases is that Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine for it to be included with the rest of your dotfiles. However, if you make it public, think about the information that you are potentially providing strangers on the internet: addresses of your servers, users, open ports, &c. This may facilitate some types of attacks so be thoughtful about sharing your SSH configuration. -Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings in a per user basis. 
+Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings in a per user basis. ## Miscellaneous diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 30cd9d4d..da95b256 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -308,7 +308,7 @@ leave that as an exercise to the reader. ## Analyzing data -You can do math! For example, add the numbers on each line together: +You can do the math! For example, add the numbers on each line together: ```bash | paste -sd+ | bc -l diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index b3df91a8..49a72099 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -408,13 +408,13 @@ For example, `perf` can easily report poor cache locality, high amounts of page Profiler output for real world programs will contain large amounts of information because of the inherent complexity of software projects. Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. -Thus there are many tools for displaying profiler's output in a easier to parse way. +Thus there are many tools for displaying profiler's output in an easier to parse way. One common way to display CPU profiling information for sampling profilers is to use a [Flame Graph](http://www.brendangregg.com/flamegraphs.html), which will display a hierarchy of function calls across the Y axis and time taken proportional to the X axis. They are also interactive, letting you zoom into specific parts of the program and get their stack traces (try clicking in the image below). 
[![FlameGraph](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg) -Call graphs or control flow graphs display the relationships between subroutines within a program by including functions as nodes and functions calls between them as directed edges. When coupled with profiling information such as number of calls and time taken, call graphs can be quite useful for interpreting the flow of a program. +Call graphs or control flow graphs display the relationships between subroutines within a program by including functions as nodes and functions calls between them as directed edges. When coupled with profiling information such as the number of calls and time taken, call graphs can be quite useful for interpreting the flow of a program. In Python you can use the [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/) library to generate them. ![Call Graph](https://upload.wikimedia.org/wikipedia/commons/2/2f/A_Call_Graph_generated_by_pycallgraph.png) @@ -432,7 +432,7 @@ See also [`glances`](https://nicolargo.github.io/glances/) for similar implement - **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations - **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which lets you navigate folders and delete files and folders as you navigate. -- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. 
+- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. - **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file. - **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) lets you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. - **Network Usage** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 0545f1df..f6c49956 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -39,7 +39,7 @@ inputs to your outputs. Often, that process might have many steps, and many branches. Run this to generate this plot, that to generate those results, and something else to produce the final paper. As with so many of the things we have seen in this class, you are not the first to -encounter this annoyance, and luckily there exists many tools to help +encounter this annoyance, and luckily there exist many tools to help you! These are usually called "build systems", and there are _many_ of them. 
diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 75e1b527..2a0b6400 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -54,7 +54,7 @@ Some software resources to get started on the topic: ## Daemons You are probably already familiar with the notion of daemons, even if the word seems new. -Most computers have a series of processes that are always running in the background rather than waiting for an user to launch them and interact with them. +Most computers have a series of processes that are always running in the background rather than waiting for a user to launch them and interact with them. These processes are called daemons and the programs that run as daemons often end with a `d` to indicate so. For example `sshd`, the SSH daemon, is the program responsible for listening to incoming SSH requests and checking that the remote user has the necessary credentials to log in. @@ -358,8 +358,8 @@ access to: - A cheap always-on machine that has a public IP address, used to host services - A machine with a lot of CPU, disk, RAM, and/or GPU - Many more machines than you physically have access to (billing is often by -the second, so if you want a lot of compute for a short amount of time, it's -feasible to rent 1000 computers for a couple minutes) +the second, so if you want a lot of computing for a short amount of time, it's +feasible to rent 1000 computers for a couple of minutes) Popular services include [Amazon AWS](https://aws.amazon.com/), [Google Cloud](https://cloud.google.com/), and diff --git a/_2020/qa.md b/_2020/qa.md index c2c4e26d..d74659f1 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -44,9 +44,9 @@ Some good resources to learn about this topic: Some topics worth prioritizing: -- Learning how to use you keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, &c. +- Learning how to use your keyboard more and your mouse less. This can be through keyboard shortcuts, changing interfaces, &c. 
- Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well. -- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous.. +- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous... - Learning about version control tools like Git and how to use it in conjunction with GitHub to collaborate in modern software projects. ## When do I use Python versus a Bash scripts versus some other language? @@ -71,8 +71,8 @@ Similarly, if `script.sh` defines a function that you want to access in your ter ## What are the places where various packages and tools are stored and how does referencing them work? What even is `/bin` or `/lib`? -Regarding programs that you execute in your terminal, they are all found in the directories listed in your `PATH` environment variable and you can use the `which` command (or the `type` command) to check where your shell is finding an specific program. -In general, there are some conventions about where specific types of files live. Here is some of the ones we talked about, check the [Filesystem, Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. +Regarding programs that you execute in your terminal, they are all found in the directories listed in your `PATH` environment variable and you can use the `which` command (or the `type` command) to check where your shell is finding a specific program. +In general, there are some conventions about where specific types of files live. Here are some of the ones we talked about, check the [Filesystem, Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. 
- `/bin` - Essential command binaries - `/sbin` - Essential system binaries, usually to be run by root @@ -114,7 +114,7 @@ Sometimes the slow part of your code will be because your system is waiting for Some of our favorites, mostly related to security and usability: -- [uBlock Origin](https://github.com/gorhill/uBlock) - It is a [wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode) blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also cover inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to [medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode) or even [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode). Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. Otherwise, the [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode) is already a good default that blocks most ads and tracking. You can also define you own rules about what website objects to block. +- [uBlock Origin](https://github.com/gorhill/uBlock) - It is a [wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode) blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also cover inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to [medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode) or even [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode). Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. 
Otherwise, the [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode) is already a good default that blocks most ads and tracking. You can also define your own rules about what website objects to block. - [Stylus](https://github.com/openstyles/stylus/) - a fork of Stylish (don't use Stylish, it was shown to [steal users browsing history](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/)), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published in [userstyles.org](https://userstyles.org/). Most common websites have one or several dark theme stylesheets for instance. - Full Page Screen Capture - Built into Firefox and [Chrome extension](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en). Let's you take a screenshot of a full website, often much better than printing for reference purposes. - [Multi Account Containers](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - lets you separate cookies into "containers", allowing you to browse the web with different identities and/or ensuring that websites are unable to share information between them. @@ -131,7 +131,7 @@ For tabular data, often presented in CSVs, the [pandas](https://pandas.pydata.or ## What is the difference between Docker and a Virtual Machine? Docker is based on a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. 
Unlike VMs, containers avoid running another instance of the kernel and instead share the kernel with the host. In Linux, this is achieved through a mechanism called LXC, and it makes use of a series of isolation mechanism to spin up a program that thinks it's running on its own hardware but it's actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. -On the flip side, containers have a weaker isolation and only work if the host runs the same kernel. For instance if you run Docker on macOS, Docker need to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is an specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks: for example, Docker containers will not persist any form of storage between reboots by default. +On the flip side, containers have a weaker isolation and only work if the host runs the same kernel. For instance if you run Docker on macOS, Docker needs to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is a specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks: for example, Docker containers will not persist any form of storage between reboots by default. ## What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)? @@ -166,10 +166,10 @@ A few more tips: - Plugins - Take your time and explore the plugin landscape. There are a lot of great plugins that address some of vim's shortcomings or add new functionality that composes well with existing vim workflows. For this, good resources are [VimAwesome](https://vimawesome.com/) and other programmers' dotfiles. - Marks - In vim, you can set a mark doing `m<X>` for some letter `X`.
You can then go back to that mark doing `'<X>`. This let's you quickly navigate to specific locations within a file or even across files. - Navigation - `Ctrl+O` and `Ctrl+I` move you backward and forward respectively through your recently visited locations. -- Undo Tree - Vim has a quite fancy mechanism for keeping tack of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins like [gundo.vim](https://github.com/sjl/gundo.vim) and [undotree](https://github.com/mbbill/undotree) expose this tree in a graphical way. +- Undo Tree - Vim has a quite fancy mechanism for keeping track of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins like [gundo.vim](https://github.com/sjl/gundo.vim) and [undotree](https://github.com/mbbill/undotree) expose this tree in a graphical way. - Undo with time - The `:earlier` and `:later` commands will let you navigate the files using time references instead of one change at a time. - [Persistent undo](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) is an amazing built-in feature of vim that is disabled by default. It persists undo history between vim invocations. By setting `undofile` and `undodir` in your `.vimrc`, vim will storage a per-file history of changes. -- Leader Key - The leader key is special key that is often left to the user to be configured for custom commands. The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses `<Leader> U` to open the undo tree.
+- Leader Key - The leader key is a special key that is often left to the user to be configured for custom commands. The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses `<Leader> U` to open the undo tree. - Advanced Text Objects - Text objects like searches can also be composed with vim commands. E.g. `d/<pattern>` will delete to the next match of said pattern or `cgn` will change the next occurrence of the last searched string. ## What is 2FA and why should I use it? diff --git a/_2020/security.md b/_2020/security.md index 70098fad..39a223b0 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -289,7 +289,7 @@ In use, once the server knows the client's public key (stored in the `.ssh/authorized_keys` file), a connecting client can prove its identity using asymmetric signatures. This is done through [challenge-response](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication). -At a high level, the server picks a random number and send it to the client. +At a high level, the server picks a random number and sends it to the client. The client then signs this message and sends the signature back to the server, which checks the signature against the public key on record. This effectively proves that the client is in possession of the private key corresponding to the diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 5bde84d9..d71cb5c1 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -19,7 +19,7 @@ Shell scripts are the next step in complexity. Most shells have their own scripting language with variables, control flow and its own syntax. What makes shell scripting different from other scripting programming language is that is optimized for performing shell related tasks.
Thus, creating command pipelines, saving results into files or reading from standard input are primitives in shell scripting which makes it easier to use than general purpose scripting languages. -For this section we will focus in bash scripting since it is the most common. +For this section we will focus on bash scripting since it is the most common. To assign variables in bash use the syntax `foo=bar` and access the value of the variable with `$foo`. Note that `foo = bar` will not work since it is interpreted as calling the `foo` program with arguments `=` and `bar`. @@ -145,7 +145,7 @@ diff <(ls foo) <(ls bar) -Writing `bash` scripts can be tricky and unintutive. There are tools like [shellcheck](https://github.com/koalaman/shellcheck) that will help you find out errors in your sh/bash scripts. +Writing `bash` scripts can be tricky and unintuitive. There are tools like [shellcheck](https://github.com/koalaman/shellcheck) that will help you find out errors in your sh/bash scripts. Note that scripts need not necessarily be written in bash to be called from the terminal. For instance, here's a simple Python script that outputs its arguments in reversed order @@ -177,7 +177,7 @@ You could always start googling, but since UNIX predates StackOverflow there are As we saw in the shell lecture, the first order approach is to call said command with the `-h` or `--help` flags. A more detailed approach is to use the `man` command. Short for manual, [`man`](http://man7.org/linux/man-pages/man1/man.1.html) provides a manual page (called manpage) for a command you specify. For example, `man rm` will output the behavior of the `rm` command along with the flags that it takes including the `-i` flag we showed earlier. -In fact, what I have been linking so far for every command are the online version of Linux manpages for the commands. +In fact, what I have been linking so far for every command is the online version of Linux manpages for the commands. 
Even non native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process. For interactive tools such as the ones based on ncurses, help for the commands can often be accessed within the program using the `:help` command or typing `?`. @@ -230,8 +230,8 @@ A more in depth comparison can be found [here](https://unix.stackexchange.com/qu Finding files is useful but quite often you are after what is in the file. A common scenario is wanting to search for all files that contain some pattern, along with where in those files said pattern occurs. -To achieve this, most UNIX-like systems provide [`grep`](http://man7.org/linux/man-pages/man1/grep.1.html), a generic tool for matching patterns from input text. -It is an incredibly value shell tool and we will cover it more in detail during the data wrangling lecture. +To achieve this, most UNIX-like systems provide [`grep`](http://man7.org/linux/man-pages/man1/grep.1.html), a generic tool for matching patterns from the input text. +It is an incredibly valuable shell tool and we will cover it more in detail during the data wrangling lecture. `grep` has many flags that make it a very versatile tool. Some I frequently use are `-C` for getting **C**ontext around the matching line and `-v` for in**v**erting the match, i.e. print all lines that do **not** match the pattern. For example, `grep -C 5` will print 5 lines before and after the match. @@ -269,14 +269,14 @@ After pressing `Ctrl+R` you can type a substring you want to match for commands As you keep pressing it you will cycle through the matches in your history. This can also be enabled with the UP/DOWN arrows in [zsh](https://github.com/zsh-users/zsh-history-substring-search). A nice addition on top of `Ctrl+R` comes with using [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) bindings. -`fzf` is a general purpose fuzzy finder that can used with many commands. 
+`fzf` is a general purpose fuzzy finder that can be used with many commands. Here is used to fuzzily match through your history and present results in a convenient and visually pleasing manner. Another cool history-related trick I really enjoy is **history-based autosuggestions**. First introduced by the [fish](https://fishshell.com/) shell, this feature dynamically autocompletes your current shell command with the most recent command that you typed that shares a common prefix with it. It can be enabled in [zsh](https://github.com/zsh-users/zsh-autosuggestions) and it is a great quality of life trick for your shell. -Lastly, a thing to have in mind is that if you start a command with a leading space it won't be added to you shell history. +Lastly, a thing to have in mind is that if you start a command with a leading space it won't be added to your shell history. This comes in handy when you are typing commands with passwords or other bits of sensitive information. If you make the mistake of not adding the leading space you can always manually remove the entry by editing your `.bash_history` or `.zhistory`. @@ -361,7 +361,7 @@ echo "found error after $count runs" cat out.txt {% endcomment %} -1. As we covered in lecture `find`'s `-exec` can be very powerful for performing operations over the files we are searching for. +1. As we covered in the lecture `find`'s `-exec` can be very powerful for performing operations over the files we are searching for. However, what if we want to do something with **all** the files, like creating a zip file? As you have seen so far commands will take input from both arguments and STDIN. When piping commands, we are connecting STDOUT to STDIN, but some commands like `tar` take inputs from arguments. diff --git a/_2020/version-control.md b/_2020/version-control.md index 27ba73e4..b6083ad1 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -227,7 +227,7 @@ currently are" is a special reference called "HEAD". 
Finally, we can define what (roughly) is a Git _repository_: it is the data `objects` and `references`. -On disk, all Git stores is objects and references: that's all there is to Git's +On disk, all Git stores are objects and references: that's all there is to Git's data model. All `git` commands map to some manipulation of the commit DAG by adding objects and adding/updating references. @@ -243,7 +243,7 @@ probably a command to do it (e.g. in this case, `git checkout master; git reset This is another concept that's orthogonal to the data model, but it's a part of the interface to create commits. -One way you might imagine implementing snapshotting as described above is have +One way you might imagine implementing snapshotting as described above is to have a "create snapshot" command that creates a new snapshot based on the _current state_ of the working directory. Some version control tools work like this, but not Git. We want clean snapshots, and it might not always be ideal to make a From c25c128791b127108b3f4d072c26cddd4ba1ee2f Mon Sep 17 00:00:00 2001 From: cijad Date: Mon, 6 Apr 2020 09:20:44 +0200 Subject: [PATCH 288/640] Fix 'the math' -> 'math' --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index da95b256..30cd9d4d 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -308,7 +308,7 @@ leave that as an exercise to the reader. ## Analyzing data -You can do the math! For example, add the numbers on each line together: +You can do math! 
For example, add the numbers on each line together: ```bash | paste -sd+ | bc -l From 5328713f1311a8f33c2266af0d907eb3236ba33f Mon Sep 17 00:00:00 2001 From: nishkakotian Date: Thu, 9 Apr 2020 19:29:33 +0530 Subject: [PATCH 289/640] Removed a letter --- about.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/about.md b/about.md index 3761b44c..a67db7a0 100644 --- a/about.md +++ b/about.md @@ -27,7 +27,7 @@ ecosystem that could make students' lives significantly easier. To help remedy this, we are running a class that covers all the topics we consider crucial to be an effective computer scientist and programmer. The -class is pragmatic and practical, and it provides hands-on introductions to +class is pragmatic and practical, and it provides hands-on introduction to tools and techniques that you can immediately apply in a wide variety of situations you will encounter. The class is being run during MIT's "Independent Activities Period" in January 2020 — a one-month semester that features shorter From 8e0c28b087e0abdc3f43daca3a38a574125fb68e Mon Sep 17 00:00:00 2001 From: Connor Rose Date: Fri, 10 Apr 2020 17:58:26 -0700 Subject: [PATCH 290/640] Fix exercise autonumber and indents --- _2020/debugging-profiling.md | 47 ++++++++++++++++++------------------ 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 49a72099..8dfe5ef5 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -475,16 +475,15 @@ If there aren't any you can execute some harmless commands such as `sudo ls` and 1. Install [`shellcheck`](https://www.shellcheck.net/) and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically. 
- -```bash -#!/bin/sh -## Example: a typical script with several problems -for f in $(ls *.m3u) -do - grep -qi hq.*mp3 $f \ - && echo -e 'Playlist $f contains a HQ file in mp3 format' -done -``` + ```bash + #!/bin/sh + ## Example: a typical script with several problems + for f in $(ls *.m3u) + do + grep -qi hq.*mp3 $f \ + && echo -e 'Playlist $f contains a HQ file in mp3 format' + done + ``` 1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). ## Profiling @@ -493,25 +492,25 @@ done 1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. -```python -#!/usr/bin/env python -def fib0(): return 0 + ```python + #!/usr/bin/env python + def fib0(): return 0 -def fib1(): return 1 + def fib1(): return 1 -s = """def fib{}(): return fib{}() + fib{}()""" + s = """def fib{}(): return fib{}() + fib{}()""" -if __name__ == '__main__': + if __name__ == '__main__': - for n in range(2, 10): - exec(s.format(n, n-1, n-2)) - # from functools import lru_cache - # for n in range(10): - # exec("fib{} = lru_cache(1)(fib{})".format(n, n)) - print(eval("fib9()")) -``` + for n in range(2, 10): + exec(s.format(n, n-1, n-2)) + # from functools import lru_cache + # for n in range(10): + # exec("fib{} = lru_cache(1)(fib{})".format(n, n)) + print(eval("fib9()")) + ``` -Put the code into a file and make it executable. Install [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/). Run the code as is with `pycallgraph graphviz -- ./fib.py` and check the `pycallgraph.png` file. How many times is `fib0` called?. We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each `fibN` function now? + Put the code into a file and make it executable. 
Install [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/). Run the code as is with `pycallgraph graphviz -- ./fib.py` and check the `pycallgraph.png` file. How many times is `fib0` called?. We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each `fibN` function now? 1. A common issue is that a port you want to listen on is already taken by another process. Let's learn how to discover that process pid. First execute `python -m http.server 4444` to start a minimal web server listening on port `4444`. On a separate terminal run `lsof | grep LISTEN` to print all listening processes and ports. Find that process pid and terminate it by running `kill <PID>`. From 8bb2bfdafc4280bc6a5728ca822b99fa8d7da946 Mon Sep 17 00:00:00 2001 From: Steven Maude Date: Tue, 21 Apr 2020 22:34:34 +0100 Subject: [PATCH 291/640] Describe the shell parameter $! correctly `$!` is a special parameter, not an environment variable. "special parameter" is used in both the Open Group specification: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html and the bash documentation: https://www.gnu.org/software/bash/manual/html_node/Special-Parameters.html --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 9a11cbce..cadd2343 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -64,7 +64,7 @@ We can then continue the paused job in the foreground or in the background using The [`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) command lists the unfinished jobs associated with the current terminal session. You can refer to those jobs using their pid (you can use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find that out). -More intuitively, you can also refer to a process using the percent symbol followed by its job number (displayed by `jobs`).
To refer to the last backgrounded job you can use the `$!` environment variable. +More intuitively, you can also refer to a process using the percent symbol followed by its job number (displayed by `jobs`). To refer to the last backgrounded job you can use the `$!` special parameter. One more thing to know is that the `&` suffix in a command will run the command in the background, giving you the prompt back, although it will still use the shell's STDOUT which can be annoying (use shell redirections in that case). From 01d352e1693b2331915b2fc986d35776923c6566 Mon Sep 17 00:00:00 2001 From: Ethan Vieira Date: Sat, 25 Apr 2020 03:01:29 -0500 Subject: [PATCH 292/640] fix small typo --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index d74659f1..82ad7fd4 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -30,7 +30,7 @@ For the last lecture, we answered questions that the students submitted: ## Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc First, it is unclear whether you actually need to be very familiar with all of these topics since they are very low level topics. -They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will relevant, with the exception of processes and signals that were briefly covered in other lectures. +They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will not be relevant, with the exception of processes and signals that were briefly covered in other lectures. 
Some good resources to learn about this topic: From 8521630a6bf407aa9916f2a76769b072479bdcf7 Mon Sep 17 00:00:00 2001 From: Floris Date: Mon, 4 May 2020 22:34:38 +0200 Subject: [PATCH 293/640] fix typo MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The sentence missed an ‘it’ --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index d71cb5c1..8c1d937a 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -240,7 +240,7 @@ When it comes to quickly parsing through many files, you want to use `-R` since But `grep -R` can be improved in many ways, such as ignoring `.git` folders, using multi CPU support, &c. So there has been no shortage of alternatives developed, including [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) and [rg](https://github.com/BurntSushi/ripgrep). All of them are fantastic but pretty much cover the same need. -For now I am sticking with ripgrep (`rg`) given how fast and intuitive is. Some examples: +For now I am sticking with ripgrep (`rg`) given how fast and intuitive it is. Some examples: ```bash # Find all python files where I used the requests library rg -t py 'import requests' From 5020b532e150d38a74b4dab59065e22640ff10bd Mon Sep 17 00:00:00 2001 From: Anish Athalye Date: Fri, 8 May 2020 10:55:16 -0400 Subject: [PATCH 294/640] Remove use of "said" Thanks to @rtvkiz for bringing this up. --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index cadd2343..6cd4b1db 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -462,7 +462,7 @@ Since you might be spending hundreds to thousands of hours in your terminal it p One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/man1/wait.1p.html) command. 
Try launching the sleep command and having an `ls` wait until the background process finishes. However, this strategy will fail if we start in a different bash session, since `wait` only works for child processes. One feature we did not discuss in the notes is that the `kill` command's exit status will be zero on success and nonzero otherwise. `kill -0` does not send a signal but will give a nonzero exit status if the process does not exist. - Write a bash function called `pidwait` that takes a pid and waits until said process completes. You should use `sleep` to avoid wasting CPU unnecessarily. + Write a bash function called `pidwait` that takes a pid and waits until the given process completes. You should use `sleep` to avoid wasting CPU unnecessarily. ## Terminal multiplexer From aaafafcd7224e2441ccbd42500e262e789f4ad37 Mon Sep 17 00:00:00 2001 From: Corbin Albert Date: Mon, 11 May 2020 15:07:12 -0700 Subject: [PATCH 295/640] Small formatting and grammar tweaks Nothing too special, mostly just adding a few words to assist in sentence flow. I also elaborated a bit more on the tmux prefix to hopefully make it a bit easier to understand, though it was pretty self-explanatory to begin with. --- _2020/command-line.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 6cd4b1db..fd73bdc3 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -131,13 +131,13 @@ When using the command line interface you will often want to run more than one t For instance, you might want to run your editor and your program side by side. Although this can be achieved by opening new terminal windows, using a terminal multiplexer is a more versatile solution. -Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow to multiplex terminal windows using panes and tabs so you can interact with multiple shell sessions. 
+Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow you to multiplex terminal windows using panes and tabs so you can interact with multiple shell sessions. Moreover, terminal multiplexers let you detach a current terminal session and reattach at some point later in time. This can make your workflow much better when working with remote machines since it voids the need to use `nohup` and similar tricks. -The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable and using the associated keybindings you can create multiple tabs and panes and quickly navigate through them. +The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable and by using the associated keybindings you can create multiple tabs and panes and quickly navigate through them. -`tmux` expects you to know its keybindings, and they all have the form `<C-b> x` where that means press `Ctrl+b` release, and the press `x`. `tmux` has the following hierarchy of objects: +`tmux` expects you to know its keybindings, and they all have the form `<C-b> x` where that means (1) press `Ctrl+b`, (2) release `Ctrl+b`, and then (3) press `x`. `tmux` has the following hierarchy of objects: - **Sessions** - a session is an independent workspace with one or more windows + `tmux` starts a new session. + `tmux new -s NAME` starts it with that name. @@ -153,7 +153,7 @@ The most popular terminal multiplexer these days is [`tmux`](http://man7.org/lin + `<C-b> ,` Rename the current window + `<C-b> w` List current windows -- **Panes** - Like vim splits, pane lets you have multiple shells in the same visual display. +- **Panes** - Like vim splits, panes let you have multiple shells in the same visual display. + `<C-b> "` Split the current pane horizontally + `<C-b> %` Split the current pane vertically + `<C-b> <direction>` Move to the pane in the specified _direction_.
Direction here means arrow keys. @@ -257,7 +257,7 @@ tell you about their preferred customizations. Yet another way to learn about customizations is to look through other people's dotfiles: you can find tons of [dotfiles repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories) -on --- see the most popular one +on Github --- see the most popular one [here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly copy configurations though). [Here](https://dotfiles.github.io/) is another good resource on the topic. From aa854b6ee95725c268ef8994bfe8573372903e58 Mon Sep 17 00:00:00 2001 From: devanshrj Date: Thu, 14 May 2020 00:42:01 +0530 Subject: [PATCH 296/640] Typo in command-line.md --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index fd73bdc3..52ce3083 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -406,7 +406,7 @@ An additional advantage of using the `~/.ssh/config` file over aliases is that Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine for it to be included with the rest of your dotfiles. However, if you make it public, think about the information that you are potentially providing strangers on the internet: addresses of your servers, users, open ports, &c. This may facilitate some types of attacks so be thoughtful about sharing your SSH configuration. -Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings in a per user basis. +Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings on a per user basis. 
## Miscellaneous From cb11e1057852b4f5c562412ea2b0e03783f3e501 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 21:30:29 +0800 Subject: [PATCH 297/640] fix cname --- CNAME | 2 +- _config.yml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/CNAME b/CNAME index 1f543870..1c09051a 100644 --- a/CNAME +++ b/CNAME @@ -1 +1 @@ -missing.csail.mit.edu +missing-semester-cn.github.io diff --git a/_config.yml b/_config.yml index fce0b9f3..26647e94 100644 --- a/_config.yml +++ b/_config.yml @@ -1,6 +1,6 @@ # Setup title: 'the missing semester of your cs education' -url: https://missing.csail.mit.edu +url: https://missing-semester-cn.github.io # Settings markdown: kramdown From 6993c65a64d0ec1fe40696553ff8be4e83bb84ec Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 21:31:01 +0800 Subject: [PATCH 298/640] update trans --- _2020/course-shell.md | 67 +++++++++++++++++-------------------------- 1 file changed, 26 insertions(+), 41 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index c37b118e..6301be28 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Course overview + the shell" +title: "课程概览与shell" date: 2019-01-13 ready: true video: @@ -12,46 +12,31 @@ video: [Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anic30/course_overview_iap_2019/) {% endcomment %} -# Motivation - -As computer scientists, we know that computers are great at aiding in -repetitive tasks. However, far too often, we forget that this applies -just as much to our _use_ of the computer as it does to the computations -we want our programs to perform. We have a vast range of tools -available at our fingertips that enable us to be more productive and -solve more complex problems when working on any computer-related -problem. 
Yet many of us utilize only a small fraction of those tools; we -only know enough magical incantations by rote to get by, and blindly -copy-paste commands from the internet when we get stuck. - -This class is an attempt to address this. - -We want to teach you how to make the most of the tools you know, show -you new tools to add to your toolbox, and hopefully instill in you some -excitement for exploring (and perhaps building) more tools on your own. -This is what we believe to be the missing semester from most Computer -Science curriculum. - -# Class structure - -The class consists of 11 1-hour lectures, each one centering on a -[particular topic](/2020/). The lectures are largely independent, -though as the semester goes on we will presume that you are familiar -with the content from the earlier lectures. We have lecture notes -online, but there will be a lot of content covered in class (e.g. in the -form of demos) that may not be in the notes. We will be recording -lectures and posting the recordings online. - -We are trying to cover a lot of ground over the course of just 11 1-hour -lectures, so the lectures are fairly dense. To allow you some time to -get familiar with the content at your own pace, each lecture includes a -set of exercises that guide you through the lecture's key points. After -each lecture, we are hosting office hours where we will be present to -help answer any questions you might have. If you are attending the class -online, you can send us questions at -[missing-semester@mit.edu](mailto:missing-semester@mit.edu). 
- -Due to the limited time we have, we won't be able to cover all the tools +# 动机 + +作为计算机科学家,我们都知道计算机最擅长帮助我们完成重复性的工作。 +但是我们却常常忘记这一点也适用于我们使用计算机的方式,而不仅仅是利用计算机程序去帮我们求解问题。 +在从事与计算机相关的工作时,我们有很多触手可及的工具可以帮助我们更高效的解决问题。 +但是我们中的大多数人实际上只利用了这些工具中的很少一部分,我们常常只是死记硬背地掌握了一些对我们来说如咒语一般的命令, +或是当我们卡住的时候,盲目地从网上复制粘贴一些命令。 + +本课程意在帮你解决这一问题。 + +我们希望教会您如何挖掘现有工具的潜力,并向您介绍一些新的工具。也许我们还可以促使您想要去探索(甚至是去开发)更多的工具。 +我们认为这是大多数计算机科学相关课程中缺少的重要一环。 + +# 课程结构 + +本课程包含11个时常在一小时左右的讲座,每一个讲座都会关注一个 +[特定的主题](/missing-semester/2020/)。尽管这些讲座之间基本上是各自独立的,但随着课程的进行,我们会假定您已经掌握了之前的内容。 +每个讲座都有在线笔记供查阅,但是课上的很多内容并不会包含在笔记中。因此我们也会把课程录制下来发布到互联网上供大家观看学习。 + +我们希望能在这11个一小时讲座中涵盖大部分必须的内容,因此课程地节奏会比较紧凑。 +为了能帮助您以自己的节奏来掌握讲座内容,每次课程都包含来一组练习来帮助您掌握本节课的重点。 +s课后我们会安排答疑的时间来回答您的问题。如果您参加的是在线课程,可以发送邮件到 +[missing-semester@mit.edu](mailto:missing-semester@mit.edu)来联系我们。 + +由于时长的限制,我们不可能达到那些专门课程一样的细致程度,我们会适时地将您引向一些优秀地资源Due to the limited time we have, we won't be able to cover all the tools in the same level of detail a full-scale class might. Where possible, we will try to point you towards resources for digging further into a tool or topic, but if something particularly strikes your fancy, don't From eb851c1b3c569cdd71ca360bfdc2d66061d4b796 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 21:37:27 +0800 Subject: [PATCH 299/640] update trans --- _2020/course-shell.md | 24 +++++++++++------------- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 6301be28..f2f4cae9 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -36,15 +36,13 @@ video: s课后我们会安排答疑的时间来回答您的问题。如果您参加的是在线课程,可以发送邮件到 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)来联系我们。 -由于时长的限制,我们不可能达到那些专门课程一样的细致程度,我们会适时地将您引向一些优秀地资源Due to the limited time we have, we won't be able to cover all the tools -in the same level of detail a full-scale class might. 
Where possible, we -will try to point you towards resources for digging further into a tool -or topic, but if something particularly strikes your fancy, don't -hesitate to reach out to us and ask for pointers! +由于时长的限制,我们不可能达到那些专门课程一样的细致程度,我们会适时地将您介绍一些优秀的资源,帮助您深入的理解相关的工具或主题。 +但是如果您还有一些特别关注的话题,也请联系我们。 -# Topic 1: The Shell -## What is the shell? +# 主题 1: The Shell + +## shell 是什么? Computers these days have a variety of interfaces for giving them commands; fancyful graphical user interfaces, voice interfaces, and @@ -68,7 +66,7 @@ _prompt_ (where you can type commands), you first need a _terminal_. Your device probably shipped with one installed, or you can install one fairly easily. -## Using the shell +## 使用 shell When you launch your terminal, you will see a _prompt_ that often looks a little like this: @@ -137,7 +135,7 @@ find out which file is executed for a given program name using the `which` program. We can also bypass `$PATH` entirely by giving the _path_ to the file we want to execute. -## Navigating in the shell +## 在shell中导航 A path on the shell is a delimited list of directories; separated by `/` on Linux and macOS and `\` on Windows. On Linux and macOS, the path `/` @@ -238,7 +236,7 @@ page_. Press `q` to exit. missing:~$ man ls ``` -## Connecting programs +## 在程序间创建连接 In the shell, programs have two primary "streams" associated with them: their input stream and their output stream. When the program tries to @@ -276,7 +274,7 @@ missing:~$ curl --head --silent google.com | grep --ignore-case content-length | We will go into a lot more detail about how to take advantage of pipes in the lecture on data wrangling. -## A versatile and powerful tool +## 一个功能全面又强大的工具 On most Unix-like systems, one user is special: the "root" user. You may have seen it in the file listings above. 
The root user is above (almost) @@ -336,7 +334,7 @@ state of various system LEDs (your path might be different): $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness ``` -# Next steps +# 下一步 At this point you know your way around a shell enough to accomplish basic tasks. You should be able to navigate around to find files of @@ -345,7 +343,7 @@ lecture, we will talk about how to perform and automate more complex tasks using the shell and the many handy command-line programs out there. -# Exercises +# 课后练习 1. Create a new directory called `missing` under `/tmp`. 1. Look up the `touch` program. The `man` program is your friend. From d30f3091a8cee3ec302f50dcbdbfa98b268ba2bb Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 21:48:14 +0800 Subject: [PATCH 300/640] update trans --- _2020/course-shell.md | 29 ++++++++--------------------- 1 file changed, 8 insertions(+), 21 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index f2f4cae9..d48cf0ee 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -44,27 +44,14 @@ s课后我们会安排答疑的时间来回答您的问题。如果您参加的 ## shell 是什么? -Computers these days have a variety of interfaces for giving them -commands; fancyful graphical user interfaces, voice interfaces, and -even AR/VR are everywhere. These are great for 80% of use-cases, but -they are often fundamentally restricted in what they allow you to do — -you cannot press a button that isn't there or give a voice command that -hasn't been programmed. To take full advantage of the tools your -computer provides, we have to go old-school and drop down to a textual -interface: The Shell. - -Nearly all platforms you can get your hand on has a shell in one form or -another, and many of them have several shells for you to choose from. -While they may vary in the details, at their core they are all roughly -the same: they allow you to run programs, give them input, and inspect -their output in a semi-structured way. 
- -In this lecture, we will focus on the Bourne Again SHell, or "bash" for -short. This is one of the most widely used shells, and its syntax is -similar to what you will see in many other shells. To open a shell -_prompt_ (where you can type commands), you first need a _terminal_. -Your device probably shipped with one installed, or you can install one -fairly easily. +如今的计算机有着多种多样的交互接口让我们可以进行指令的的输入,从炫酷的图像用户界面(GUI),语音输入甚至是AR/VR都已经无处不在。 +这些交互接口可以覆盖80%的使用场景,但是它们也从根本上限制了您的操作方式——你不能点击一个不存在的按钮或者是用语音输入一个还没有被录入的指令。 +为了充分利用计算机的能力,我们不得不回到最根本的方式,使用文字接口:Shell + +几乎所有您能够接触到的平台都支持某种形式都shell,有些甚至还提供了多种shell供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样都:它允许你执行程序,输入并获取某种半结构化都输出。 + +本节课我们会使用Bourne Again SHell, 简称 "bash" 。 +这是被最广泛使用都一种shell,它都语法和其他都shell都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 ## 使用 shell From f4af9994356bdcd6bbc6de8132f52010aec7c193 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 21:53:34 +0800 Subject: [PATCH 301/640] update trans --- _2020/course-shell.md | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index d48cf0ee..55ae60d3 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -55,19 +55,13 @@ s课后我们会安排答疑的时间来回答您的问题。如果您参加的 ## 使用 shell -When you launch your terminal, you will see a _prompt_ that often looks -a little like this: +当您打开终端时,您会看到一个提示符,它看起来一般是这个样子的: ```console missing:~$ ``` -This is the main textual interface to the shell. It tells you that you -are on the machine `missing` and that your "current working directory", -or where you currently are, is `~` (short for "home"). The `$` tells you -that you are not the root user (more on that later). At this prompt you -can type a _command_, which will then be interpreted by the shell. 
The -most basic command is to execute a program: +这是shell最主要的文本接口。它告诉你,你的主机名是 `missing` 并且您当前的工作目录("current working directory")或者说您当前所在的位置是`~` (表示 "home")。 `$`符号表示您现在的身份不是root用户(稍后会介绍)。在找个提示符中,您可以输入 _命令_ ,命令最终会被shell解析。最简单的命令是执行一个程序: ```console missing:~$ date From a00ccaeb5755d907d2fca3336c1836a1d5622dc9 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 22:49:41 +0800 Subject: [PATCH 302/640] update trans --- _2020/course-shell.md | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 55ae60d3..8963c6de 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -69,9 +69,7 @@ Fri 10 Jan 2020 11:49:31 AM EST missing:~$ ``` -Here, we executed the `date` program, which (perhaps unsurprisingly) -prints the current date and time. The shell then asks us for another -command to execute. We can also execute a command with _arguments_: +这里,我们执行了 `date` 找个程序,不出意料地,它打印出了当前的日前和时间。然后,shell等待我们输入其他命令。我们可以在执行命令的同时向程序传递 _参数_ : ```console missing:~$ echo hello @@ -317,12 +315,8 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness # 下一步 -At this point you know your way around a shell enough to accomplish -basic tasks. You should be able to navigate around to find files of -interest and use the basic functionality of most programs. In the next -lecture, we will talk about how to perform and automate more complex -tasks using the shell and the many handy command-line programs out -there. 
+学到这里,您掌握对shell知识已经可以完成一些基础对任务了。您应该已经可以查找感兴趣对文件并使用大多数程序对基本功能了。 +在下一场讲座中,我们会探讨如何利用shell及其他工具执行并自动化更复杂的任务。 # 课后练习 From 140b35104ecef44c8c784957fe6f2e885899b994 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 23:02:03 +0800 Subject: [PATCH 303/640] update trans --- _2020/course-shell.md | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 8963c6de..1efec228 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -75,26 +75,10 @@ missing:~$ missing:~$ echo hello hello ``` +上例中,我们让shell执行 `echo` ,同时指定参数`hello`。`echo` 程序将该参数打印出来。 +shell基于空格分割命令并进行解析,然后执行第一个单词代表的程序,并将后续的单词作为程序可以访问的参数。如果您希望传递的参数中包含空格(例如一个名为 My Photos 的文件夹),您要么用使用单引号,双引号将其包裹起来,要么使用转义符号`\`进行处理(`My\ Photos`)。 -In this case, we told the shell to execute the program `echo` with the -argument `hello`. The `echo` program simply prints out its arguments. -The shell parses the command by splitting it by whitespace, and then -runs the program indicated by the first word, supplying each subsequent -word as an argument that the program can access. If you want to provide -an argument that contains spaces or other special characters (e.g., a -directory named "My Photos"), you can either quote the argument with `'` -or `"` (`"My Photos"`), or escape just the relevant characters with `\` -(`My\ Photos`). - -But how does the shell know how to find the `date` or `echo` programs? -Well, the shell is a programming environment, just like Python or Ruby, -and so it has variables, conditionals, loops, and functions (next -lecture!). When you run commands in your shell, you are really writing a -small bit of code that your shell interprets. 
If the shell is asked to -execute a command that doesn't match one of its programming keywords, it -consults an _environment variable_ called `$PATH` that lists which -directories the shell should search for programs when it is given a -command: +但是,shell是如何知道去哪里寻找 `date` 或 `echo` 的呢?其实,类似于Python or Ruby,shell是一个编程环境,所以它具备变量、条件、循环和函数(下一课进行讲解)。当你在shell中执行命令时,您实际上是在执行一段shell可以解释执行的简短代码。如果你要求shell执行某个指令,但是该指令并不是shell所了解的编程关键字,那么它会去咨询 _环境变量_ `$PATH`,它会列出当shell接到某条指令时,进行程序搜索的路径: ```console From 76c645e9339f4fd46fef5fee317aaffd34e65cd6 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 23:19:04 +0800 Subject: [PATCH 304/640] update trans --- _2020/course-shell.md | 29 ++++++----------------------- 1 file changed, 6 insertions(+), 23 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 1efec228..7ec780b7 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -100,16 +100,8 @@ _path_ to the file we want to execute. ## 在shell中导航 -A path on the shell is a delimited list of directories; separated by `/` -on Linux and macOS and `\` on Windows. On Linux and macOS, the path `/` -is the "root" of the file system, under which all directories and files -lie, whereas on Windows there is one root for each disk partition (e.g., -`C:\`). We will generally assume that you are using a Linux filesystem -in this class. A path that starts with `/` is called an _absolute_ path. -Any other path is a _relative_ path. Relative paths are relative to the -current working directory, which we can see with the `pwd` command and -change with the `cd` command. 
In a path, `.` refers to the current -directory, and `..` to its parent directory: +shell中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是`\`。路径 `/`代表的是系统的根目录,所有的文件夹都包括在找个路径之下,在Windows上每个盘都有一个根目录(例如: +`C:\`)。 我们假设您在学习本课程时使用的是Linux文件系统。如果某个路径以`/` 开头,那么它是一个 _绝对路径_,其他的都术语 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: ```console missing:~$ pwd @@ -130,15 +122,11 @@ missing:~$ ../../bin/echo hello hello ``` -Notice that our shell prompt kept us informed about what our current -working directory was. You can configure your prompt to show you all -sorts of useful information, which we will cover in a later lecture. +注意,shell会实时显示当前的路径信息。您可以通过配置shell提示符来显示各种有用的信息,这一内容我们会在后面的课程中进行讨论。 -In general, when we run a program, it will operate in the current -directory unless we tell it otherwise. For example, it will usually -search for files there, and create new files there if it needs to. +一般来说,当我们运行一个程序时,如果我们没有指定路径,则该程序会在当前目录下执行。例如,我们常常会搜索文件,并在需要时创建文件。 -To see what lives in a given directory, we use the `ls` command: +为了查看指定目录下包含哪些文件,我们使用`ls` 命令: ```console missing:~$ ls @@ -155,12 +143,7 @@ home ... ``` -Unless a directory is given as its first argument, `ls` will print the -contents of the current directory. Most commands accept flags and -options (flags with values) that start with `-` to modify their -behavior. Usually, running a program with the `-h` or `--help` flag -(`/?` on Windows) will print some help text that tells you what flags -and options are available. 
For example, `ls --help` tells us: +除非我们利用第一个参数指定目录,否则 `ls` 会打印当前目录下的文件。大多数的命令接受标记和选项(带有值的标记),它们以`-` 开头,并可以改变程序的行为。通常,在执行程序时使用`-h` 或 `--help` 标记可以打印帮助信息,以便了解有哪些可用的标记或选项。例如,`ls --help` 的输出如下: ``` -l use a long listing format From 3f4898bd1d61c86f5ea4968eaabb6dfe5f4254db Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 23:25:16 +0800 Subject: [PATCH 305/640] update trans --- _2020/course-shell.md | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 7ec780b7..7b9e2d90 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -90,13 +90,8 @@ missing:~$ /bin/echo $PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ``` -When we run the `echo` command, the shell sees that it should execute -the program `echo`, and then searches through the `:`-separated list of -directories in `$PATH` for a file by that name. When it finds it, it -runs it (assuming the file is _executable_; more on that later). We can -find out which file is executed for a given program name using the -`which` program. We can also bypass `$PATH` entirely by giving the -_path_ to the file we want to execute. 
+当我们执行 `echo` 命令时,shell了解到需要执行 `echo` 这个程序,随后它便会在`$PATH`中搜索由`:`所分割的一系列目录,基于名字搜索该程序。当找到该程序时便执行(假定该文件是 _可执行程序_,后续课程将详细讲解)。确定某个程序名代表的是哪个具体的程序,可以使用 +`which` 程序。我们也可以绕过 `$PATH` ,通过直接指定需要执行的程序的路径来执行该程序 ## 在shell中导航 From 8672fab9d374cfc37b8a4481e4e851009803d578 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 23:44:06 +0800 Subject: [PATCH 306/640] update trans --- _2020/course-shell.md | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 7b9e2d90..101c0a0f 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -149,20 +149,11 @@ missing:~$ ls -l /home drwxr-xr-x 1 missing users 4096 Jun 15 2019 missing ``` -This gives us a bunch more information about each file or directory -present. First, the `d` at the beginning of the line tells us that -`missing` is a directory. Then follow three groups of three characters -(`rwx`). These indicate what permissions the owner of the file -(`missing`), the owning group (`users`), and everyone else respectively -have on the relevant item. A `-` indicates that the given principal does -not have the given permission. Above, only the owner is allowed to -modify (`w`) the `missing` directory (i.e., add/remove files in it). To -enter a directory, a user must have "search" (represented by "execute": -`x`) permissions on that directory (and its parents). To list its -contents, a user must have read (`r`) permissions on that directory. For -files, the permissions are as you would expect. Notice that nearly all -the files in `/bin` have the `x` permission set for the last group, -"everyone else", so that anyone can execute those programs. +这个参数可以打印出更加详细地列出目录下文件或文件夹的信息。首先,本行第一个字符`d` 表示 +`missing` 是一个目录。然后接下来的九个字符,每三个字符构成一组。 +(`rwx`). 
它们分别代表了文件所有者(`missing`),用户组 (`users`) 以及其他所有人具有的权限。其中 `-`表示该用户不具备相应的权限。从上面的信息来看,只有文件所有者可以修改(`w`) , `missing` 文件夹 (例如,添加或删除文件夹中的文件)。为了进入某个文件夹,用户需要具备该文件夹以及其父文件夹的“搜索”权限(以“可执行”:`x`)权限表示。为了列出它的包含的内容,用户必须对该文件夹具备读权限(`r`)。对于文件来说,权限的意义也是类似的。注意,`/bin`目录下的程序在最后一组,即表示所有人的用户组中,均包含`x`权限,也就是说任何人都可以执行这些程序。 + + Some other handy programs to know about at this point are `mv` (to rename/move a file), `cp` (to copy a file), and `mkdir` (to make a new From 9a86a36e22a2a4541a6cca64e8ca7acdfb142631 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 16 May 2020 23:53:30 +0800 Subject: [PATCH 307/640] update the title --- _2020/command-line.md | 2 +- _2020/data-wrangling.md | 2 +- _2020/debugging-profiling.md | 2 +- _2020/editors.md | 2 +- _2020/metaprogramming.md | 2 +- _2020/potpourri.md | 6 ++++-- _2020/security.md | 2 +- _2020/shell-tools.md | 2 +- _2020/version-control.md | 2 +- 9 files changed, 12 insertions(+), 10 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 52ce3083..34d386f7 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Command-line Environment" +title: "命令行环境" date: 2019-01-21 ready: true video: diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 30cd9d4d..a723c497 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Data Wrangling" +title: "数据清理" date: 2019-01-16 ready: true video: diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 8dfe5ef5..85ddcf63 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Debugging and Profiling" +title: "调试及性能分析" date: 2019-01-23 ready: true video: diff --git a/_2020/editors.md b/_2020/editors.md index 5f782a73..38d0df4b 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Editors (Vim)" +title: "编辑器 (Vim)" date: 2019-01-15 ready: true 
video: diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index f6c49956..ff8e201f 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Metaprogramming" +title: "元编程" details: build systems, dependency management, testing, CI date: 2019-01-27 ready: true diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 2a0b6400..691a8ab4 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Potpourri" +title: "大杂烩" date: 2019-01-29 ready: true video: @@ -10,6 +10,7 @@ video: ## Table of Contents +- [Table of Contents](#table-of-contents) - [Keyboard remapping](#keyboard-remapping) - [Daemons](#daemons) - [FUSE](#fuse) @@ -19,7 +20,8 @@ video: - [Window managers](#window-managers) - [VPNs](#vpns) - [Markdown](#markdown) -- [Hammerspoon(desktop-automation-on-macOS)](#hammerspoon-desktop-automation-on-macos) +- [Hammerspoon (desktop automation on macOS)](#hammerspoon-desktop-automation-on-macos) + - [Resources](#resources) - [Booting + Live USBs](#booting--live-usbs) - [Docker, Vagrant, VMs, Cloud, OpenStack](#docker-vagrant-vms-cloud-openstack) - [Notebook programming](#notebook-programming) diff --git a/_2020/security.md b/_2020/security.md index 39a223b0..27a21c37 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Security and Cryptography" +title: "安全和密码学" date: 2019-01-28 ready: true video: diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 8c1d937a..dcecab34 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Shell Tools and Scripting" +title: "Shell 工具和脚本" date: 2019-01-14 ready: true video: diff --git a/_2020/version-control.md b/_2020/version-control.md index b6083ad1..4659d86d 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Version Control (Git)" +title: "版本控制(Git)" 
date: 2019-01-22 ready: true video: From 4808e131336c5643e465af368c5dfe83e005dd62 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 00:06:41 +0800 Subject: [PATCH 308/640] update the title --- index.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/index.md b/index.md index 7de28ac3..a1aaf4c3 100644 --- a/index.md +++ b/index.md @@ -16,7 +16,7 @@ only enables you to spend less time on figuring out how to bend your tools to your will, but it also lets you solve problems that would previously seem impossibly complex. -Read about the [motivation behind this class](/about/). +阅读[开设此课程的动机](/about/). {% comment %} # Registration @@ -24,7 +24,7 @@ Read about the [motivation behind this class](/about/). Sign up for the IAP 2020 class by filling out this [registration form](https://forms.gle/TD1KnwCSV52qexVt9). {% endcomment %} -# Schedule +# 日程 {% comment %} **Lecture**: 35-225, 2pm--3pm
@@ -50,15 +50,14 @@ Sign up for the IAP 2020 class by filling out this [registration form](https://f Video recordings of the lectures are available [on YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J). -# About the class +# 关于本课程 -**Staff**: This class is co-taught by [Anish](https://www.anishathalye.com/), [Jon](https://thesquareplanet.com/), and [Jose](http://josejg.com/). -**Questions**: Email us at [missing-semester@mit.edu](mailto:missing-semester@mit.edu). +**教员**: 本课程由 [Anish](https://www.anishathalye.com/)、 [Jon](https://thesquareplanet.com/) 和 [Jose](http://josejg.com/)教授。 +**问题**: 请通过 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)联系我们 -# Beyond MIT +# 在 MIT 之外 -We've also shared this class beyond MIT in the hopes that others may -benefit from these resources. You can find posts and discussion on +我们将本课程分享到了MIT之外,希望其他人也能受益于这些资源。你可以在下面这些地方找到相关文章和讨论。 - [Hacker News](https://news.ycombinator.com/item?id=22226380) - [Lobsters](https://lobste.rs/s/ti1k98/missing_semester_your_cs_education_mit) @@ -67,18 +66,21 @@ benefit from these resources. You can find posts and discussion on - [Twitter](https://twitter.com/jonhoo/status/1224383452591509507) - [YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J) -## Acknowledgements +## 致谢 + +感谢 Elaine Mello, Jim Cain, 以及 [MIT Open +Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 +感谢 Anthony Zolnik 和 [MIT +AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。感谢 Brandi Adams 和 +[MIT EECS](https://www.eecs.mit.edu/) 对本课程的支持。 + -We thank Elaine Mello, Jim Cain, and [MIT Open -Learning](https://openlearning.mit.edu/) for making it possible for us to -record lecture videos; Anthony Zolnik and [MIT -AeroAstro](https://aeroastro.mit.edu/) for A/V equipment; and Brandi Adams and -[MIT EECS](https://www.eecs.mit.edu/) for supporting this class. ---
-Source code.
+Source code.
 Licensed under CC BY-NC-SA.
+Translator: Lingfeng Ai (hanxiaomax@qq.com)
 See here for contribution & translation guidelines.
From d6fdcf206b37220410756632447bd572aec5c0a2 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 00:10:10 +0800 Subject: [PATCH 309/640] update edit link --- _2020/metaprogramming.md | 2 +- _layouts/lecture.html | 2 +- index.md | 3 ++- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index ff8e201f..d98a8aff 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -1,7 +1,7 @@ --- layout: lecture title: "元编程" -details: build systems, dependency management, testing, CI +details: 构建系统、依赖管理、测试、持续集成 date: 2019-01-27 ready: true video: diff --git a/_layouts/lecture.html b/_layouts/lecture.html index ce758efe..087491ba 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -17,6 +17,6 @@

{{ page.title }}{% if page.subtitle %}
diff --git a/index.md b/index.md index a1aaf4c3..0b40a242 100644 --- a/index.md +++ b/index.md @@ -71,7 +71,8 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 感谢 Elaine Mello, Jim Cain, 以及 [MIT Open Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 感谢 Anthony Zolnik 和 [MIT -AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。感谢 Brandi Adams 和 +AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。 +感谢 Brandi Adams 和 [MIT EECS](https://www.eecs.mit.edu/) 对本课程的支持。 From 238f6afc628c1a172148ac12ae07644d7f17cc64 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 00:31:44 +0800 Subject: [PATCH 310/640] translate index --- index.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/index.md b/index.md index 0b40a242..6565d234 100644 --- a/index.md +++ b/index.md @@ -1,6 +1,6 @@ --- layout: page -title: The Missing Semester of Your CS Education +title: 计算机科学教育中缺失的一课(The Missing Semester of Your CS Education) --- Classes teach you all about advanced topics within CS, from operating systems @@ -47,12 +47,13 @@ Sign up for the IAP 2020 class by filling out this [registration form](https://f {% endfor %} -Video recordings of the lectures are available [on -YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J). 
+讲座视频可以在 [ +YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J)上找到 # 关于本课程 **教员**: 本课程由 [Anish](https://www.anishathalye.com/)、 [Jon](https://thesquareplanet.com/) 和 [Jose](http://josejg.com/)教授。 + **问题**: 请通过 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)联系我们 # 在 MIT 之外 @@ -70,8 +71,10 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 感谢 Elaine Mello, Jim Cain, 以及 [MIT Open Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 + 感谢 Anthony Zolnik 和 [MIT AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。 + 感谢 Brandi Adams 和 [MIT EECS](https://www.eecs.mit.edu/) 对本课程的支持。 From 9fe279e2deb31b69b92e9f34a60c858097b077a1 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 00:38:24 +0800 Subject: [PATCH 311/640] trans lectures --- _2020/index.html | 7 ++++--- index.md | 6 +++--- 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/_2020/index.html b/_2020/index.html index 2ceee8dd..791c7e7d 100644 --- a/_2020/index.html +++ b/_2020/index.html @@ -28,8 +28,9 @@ {% endfor %} -Video recordings of the lectures are available on YouTube. +讲座视频可以在 YouTube上找到。 -
Previous year's lectures
-You can find lecture notes and videos from last year's version of this class.
+往期讲座
+
+您也可以访问去年的讲座笔记和视频
diff --git a/index.md b/index.md index 6565d234..c8bf41bb 100644 --- a/index.md +++ b/index.md @@ -1,6 +1,6 @@ --- layout: page -title: 计算机科学教育中缺失的一课(The Missing Semester of Your CS Education) +title: 计算机科学教育中缺失的一课 --- Classes teach you all about advanced topics within CS, from operating systems @@ -16,7 +16,7 @@ only enables you to spend less time on figuring out how to bend your tools to your will, but it also lets you solve problems that would previously seem impossibly complex. -阅读[开设此课程的动机](/about/). +关于[开设此课程的动机](/about/). {% comment %} # Registration @@ -69,7 +69,7 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 ## 致谢 -感谢 Elaine Mello, Jim Cain, 以及 [MIT Open +感谢 Elaine Mello, Jim Cain 以及 [MIT Open Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 感谢 Anthony Zolnik 和 [MIT From 412966629686516518900da651a209bcca6b6c7d Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 09:29:13 +0800 Subject: [PATCH 312/640] finish charpter 1 course-shell --- _2020/course-shell.md | 131 +++++++++++++++--------------------------- index.md | 20 +++---- 2 files changed, 53 insertions(+), 98 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 101c0a0f..0128a350 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -154,15 +154,9 @@ drwxr-xr-x 1 missing users 4096 Jun 15 2019 missing (`rwx`). 它们分别代表了文件所有者(`missing`),用户组 (`users`) 以及其他所有人具有的权限。其中 `-`表示该用户不具备相应的权限。从上面的信息来看,只有文件所有者可以修改(`w`) , `missing` 文件夹 (例如,添加或删除文件夹中的文件)。为了进入某个文件夹,用户需要具备该文件夹以及其父文件夹的“搜索”权限(以“可执行”:`x`)权限表示。为了列出它的包含的内容,用户必须对该文件夹具备读权限(`r`)。对于文件来说,权限的意义也是类似的。注意,`/bin`目录下的程序在最后一组,即表示所有人的用户组中,均包含`x`权限,也就是说任何人都可以执行这些程序。 +在这个阶段,还有几个趁手的命令是您需要掌握的,例如 `mv` (用于重命名或移动文件)、 `cp` (拷贝文件)以及 `mkdir` (新建文件夹)。 -Some other handy programs to know about at this point are `mv` (to -rename/move a file), `cp` (to copy a file), and `mkdir` (to make a new -directory). 
- -If you ever want _more_ information about a program's arguments, inputs, -outputs, or how it works in general, give the `man` program a try. It -takes as an argument the name of a program, and shows you its _manual -page_. Press `q` to exit. +如果您想要知道关于程序参数、输入输出的信息,亦或是想要了解它们的工作方式,请试试 `man` 这个程序。它会接受一个程序名作为参数,然后将它的文档(用户手册)展现给您。注意,使用`q` 可以退出该程序。 ```console missing:~$ man ls @@ -170,15 +164,12 @@ missing:~$ man ls ## 在程序间创建连接 -In the shell, programs have two primary "streams" associated with them: -their input stream and their output stream. When the program tries to -read input, it reads from the input stream, and when it prints -something, it prints to its output stream. Normally, a program's input -and output are both your terminal. That is, your keyboard as input and -your screen as output. However, we can also rewire those streams! +在shell中,程序有两个主要的“流”:他们的输入流和输出流。 +当程序尝试读取信息时,它们会从输入流中进行读取,当程序打印信息时,它们会将信息输出到输出流中。 +通常,一个程序的输入输出流都是您的终端。也就是,您的键盘作为输入,显示器作为输出。 +但是,我们也可以重定向这些流! -The simplest form of redirection is `< file` and `> file`. These let you -rewire the input and output streams of a program to a file respectively: +最简单的重定向是 `< file` 和 `> file`。这两个命令可以将程序的输入输出流分别重定向到文件: ```console missing:~$ echo hello > hello.txt @@ -191,10 +182,8 @@ missing:~$ cat hello2.txt hello ``` -You can also use `>>` to append to a file. Where this kind of -input/output redirection really shines is in the use of _pipes_. The `|` -operator lets you "chain" programs such that the output of one is the -input of another: +您还可以使用 `>>` 来向一个文件追加内容。使用管道( _pipes_),我们能够更好的利用文件重定向。 +`|`操作符允许我们将一个程序的输出和另外一个程序的输入连接起来: ```console missing:~$ ls -l / | tail -n1 @@ -203,36 +192,26 @@ missing:~$ curl --head --silent google.com | grep --ignore-case content-length | 219 ``` -We will go into a lot more detail about how to take advantage of pipes -in the lecture on data wrangling. +我们会在数据清理一章中更加详细的探讨如何更好的利用管道。 ## 一个功能全面又强大的工具 -On most Unix-like systems, one user is special: the "root" user. 
You may -have seen it in the file listings above. The root user is above (almost) -all access restrictions, and can create, read, update, and delete any -file in the system. You will not usually log into your system as the -root user though, since it's too easy to accidentally break something. -Instead, you will be using the `sudo` command. As its name implies, it -lets you "do" something "as su" (short for "super user", or "root"). -When you get permission denied errors, it is usually because you need to -do something as root. Though make sure you first double-check that you -really wanted to do it that way! - -One thing you need to be root in order to do is writing to the `sysfs` file -system mounted under `/sys`. `sysfs` exposes a number of kernel parameters as -files, so that you can easily reconfigure the kernel on the fly without -specialized tools. **Note that sysfs does not exist on Windows or macOS.** - -For example, the brightness of your laptop's screen is exposed through a file -called `brightness` under +对于大多数的类Unix系统,有一类用户是非常特殊的,那就是:根用户(root用户)。 +您应该已经注意到来,在上面的输出结果中,根用户几乎不受任何限制,他可以创建、读取、更新和删除系统中的任何文件。 +通常在我们并不会以根用户的身份直接登陆系统,因为这样可能会因为某些错误的操作而破坏系统。 +取而代之的是我们会在需要的时候使用 `sudo` 命令。顾名思义,它的作用是让您可以以su(super user 或 root的简写)的身份do一些事情。 +当您遇到拒绝访问(permission denied)的错误时,通常是因为此时您必须是根用户才能操作。此时也请再次确认您是真的要执行此操作。 + +有一件事情是您必须作为根用户才能做的,那就是向`sysfs` 文件写入内容。系统被挂在在`/sys`下, `sysfs` 文件则暴露了一些内核(kernel)参数。 +因此,您不需要借助任何专用的工具,就可以轻松地在运行期间配置系统内核。**注意 Windows or macOS没有这个文件** + +例如,您笔记本电脑的屏幕亮度写在 `brightness` 文件中,它位于 ``` /sys/class/backlight ``` +通过将数值写入该文件,我们可以改变屏幕的亮度。现在,蹦到您脑袋里的第一个想法可能是: -By writing a value into that file, we can change the screen brightness. 
-Your first instinct might be to do something like: ```console $ sudo find -L /sys/class/backlight -maxdepth 2 -name '*brightness*' @@ -242,67 +221,47 @@ $ sudo echo 3 > brightness An error occurred while redirecting file 'brightness' open: Permission denied ``` +出乎意料的是,我们还是得到了一个错误信息。毕竟,我们已经使用了 +`sudo` 命令!关于shell,有件事我们必须要知道。`|`、`>`、和 `<` 是通过shell执行的,而不是被各个程序单独执行。 +`echo` 等程序并不知道`|`的存在,它们只知道从自己的输入输出流中进行读写。 +对于上面这种情况, _shell_ (权限为您的当前用户) 在设置 `sudo echo` 前尝试打开 brightness 文件并写入,但是系统拒绝了shell的操作因为此时shell不是根用户。 -This error may come as a surprise. After all, we ran the command with -`sudo`! This is an important thing to know about the shell. Operations -like `|`, `>`, and `<` are done _by the shell_, not by the individual -program. `echo` and friends do not "know" about `|`. They just read from -their input and write to their output, whatever it may be. In the case -above, the _shell_ (which is authenticated just as your user) tries to -open the brightness file for writing, before setting that as `sudo -echo`'s output, but is prevented from doing so since the shell does not -run as root. Using this knowledge, we can work around this: +明白这一点后,我们可以这样操作: ```console $ echo 3 | sudo tee brightness ``` - -Since the `tee` program is the one to open the `/sys` file for writing, -and _it_ is running as `root`, the permissions all work out. You can -control all sorts of fun and useful things through `/sys`, such as the -state of various system LEDs (your path might be different): +因为打开`/sys` 文件的是`tee`这个程序,并且该程序以`root`权限在运行,因此操作可以进行。 +这样您就可以在`/sys`中愉快地玩耍了,例如修改系统中各种LED的状态(路径可能会有所不同): ```console $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness ``` -# 下一步 +# 接下来..... 学到这里,您掌握对shell知识已经可以完成一些基础对任务了。您应该已经可以查找感兴趣对文件并使用大多数程序对基本功能了。 在下一场讲座中,我们会探讨如何利用shell及其他工具执行并自动化更复杂的任务。 # 课后练习 - 1. Create a new directory called `missing` under `/tmp`. - 1. Look up the `touch` program. The `man` program is your friend. - 1. Use `touch` to create a new file called `semester` in `missing`. 
- 1. Write the following into that file, one line at a time: +1. 在 `/tmp` 下新建一个名为 `missing` 的文件夹。 +2. 用 `man` 查看程序 `touch` 的使用手册。 +3. 用 `touch` 在 `missing` 文件夹中新建一个叫 `semester` 的文件。 +4. 将以下内容一行一行地写入 `semester` 文件: ``` #!/bin/sh curl --head --silent https://missing.csail.mit.edu ``` - The first line might be tricky to get working. It's helpful to know that - `#` starts a comment in Bash, and `!` has a special meaning even within - double-quoted (`"`) strings. Bash treats single-quoted strings (`'`) - differently: they will do the trick in this case. See the Bash - [quoting](https://www.gnu.org/software/bash/manual/html_node/Quoting.html) - manual page for more information. - 1. Try to execute the file, i.e. type the path to the script (`./semester`) - into your shell and press enter. Understand why it doesn't work by - consulting the output of `ls` (hint: look at the permission bits of the - file). - 1. Run the command by explicitly starting the `sh` interpreter, and giving it - the file `semester` as the first argument, i.e. `sh semester`. Why does - this work, while `./semester` didn't? - 1. Look up the `chmod` program (e.g. use `man chmod`). - 1. Use `chmod` to make it possible to run the command `./semester` rather than - having to type `sh semester`. How does your shell know that the file is - supposed to be interpreted using `sh`? See this page on the - [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) line for more - information. - 1. Use `|` and `>` to write the "last modified" date output by - `semester` into a file called `last-modified.txt` in your home - directory. - 1. Write a command that reads out your laptop battery's power level or your - desktop machine's CPU temperature from `/sys`. Note: if you're a macOS - user, your OS doesn't have sysfs, so you can skip this exercise. 
+ 第一行可能有点棘手, `#` 在Bash中表示注释,而 `!` 即使被双引号(`"`)包裹也具有特殊的含义。 + 单引号(`'`)则不一样,此处利用这一点解决输入问题。更多信息请参考 [Bash quoting手册](https://www.gnu.org/software/bash/manual/html_node/Quoting.html) + +1. 尝试执行这个文件。例如,将该脚本的路径(`./semester`)输入到您的shell中并回车。如果程序无法执行,请使用 `ls`命令来获取信息并理解其不能执行的原因。 +2. 查看 `chmod` 的手册(例如,使用`man chmod`命令) + +3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用`sh semester`来执行该程序。您的shell是如何知晓,这个文件需要使用`sh`来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) + +4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入根目录下的 `last-modified.txt` 的文件中 + +5. 写一段命令来从 `/sys` 中获取笔记本的电量信息,或者台式机CPU的温度。注意:macOS并没有sysfs,所以mac用户可以跳过这一题。 + diff --git a/index.md b/index.md index c8bf41bb..b0326f0a 100644 --- a/index.md +++ b/index.md @@ -3,18 +3,14 @@ layout: page title: 计算机科学教育中缺失的一课 --- -Classes teach you all about advanced topics within CS, from operating systems -to machine learning, but there’s one critical subject that’s rarely covered, -and is instead left to students to figure out on their own: proficiency with -their tools. We’ll teach you how to master the command-line, use a powerful -text editor, use fancy features of version control systems, and much more! - -Students spend hundreds of hours using these tools over the course of their -education (and thousands over their career), so it makes sense to make the -experience as fluid and frictionless as possible. Mastering these tools not -only enables you to spend less time on figuring out how to bend your tools to -your will, but it also lets you solve problems that would previously seem -impossibly complex. +学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而 +有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 +这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行,使用强大对文本编辑器,使用版本控制系统提供的多种特性等等。 + +学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 +因此,花时间打磨使用这些工具的能力并能够最终熟练、流畅地使用它们是非常有必要的。 +精通这些工具不仅可以帮助您更快的使用工具完成任务,并且可以帮助您解决在之前看来似乎无比复杂的问题。 + 关于[开设此课程的动机](/about/). 
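Reviewer's aside on the redirection hunk in `shell.md` above: the claim that `>`, `<` and `|` are set up by the shell, not by the program being run, can be tried without root. What follows is only a minimal sketch under stated assumptions: it works in a throwaway directory from `mktemp -d`, the file called `brightness` there is a stand-in for the real `/sys` file, and every variable name is hypothetical.

```bash
# Scratch directory so nothing outside it is touched (assumed location).
scratch=$(mktemp -d)

# `>`: the shell opens hello.txt and hands it to echo as its output stream.
echo hello > "$scratch/hello.txt"

# `<` and `>` together: cat never sees either file name.
cat < "$scratch/hello.txt" > "$scratch/hello2.txt"

# `>>` appends instead of truncating.
echo world >> "$scratch/hello2.txt"

# `|` connects one program's output to the next program's input.
last_line=$(cat "$scratch/hello2.txt" | tail -n1)
echo "last line: $last_line"

# With tee, the program itself opens the target file. This is why
# `echo 3 | sudo tee brightness` works while `sudo echo 3 > brightness`
# does not: in the second form the unprivileged shell opens the file.
echo 3 | tee "$scratch/brightness" > /dev/null
level=$(cat "$scratch/brightness")
echo "stand-in brightness: $level"
```

Because `tee` opens the file itself, putting `sudo` in front of `tee` is enough to give the write root privileges on a real `/sys` path.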
From 10656e9e4d6bebdcc6ed11c365d3ded5f9407356 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 10:14:38 +0800 Subject: [PATCH 313/640] update trans --- _2020/shell-tools.md | 98 +++++++++++++++++++++----------------------- 1 file changed, 46 insertions(+), 52 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index dcecab34..0e9ea36c 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -8,9 +8,10 @@ video: id: kgII-YWo3Zw --- -In this lecture we will present some of the basics of using bash as a scripting language along with a number of shell tools that cover several of the most common tasks that you will be constantly performing in the command line. +在这节课中,我们将会展示bash作为脚本语言的一些基础操作,以及几种最常用的shell工具。 -# Shell Scripting + +# Shell 脚本 So far we have seen how to execute commands in the shell and pipe them together. However, in many scenarios you will want to perform a series of commands and make use of control flow expressions like conditionals or loops. @@ -166,9 +167,9 @@ Some differences between shell functions and scripts that you should keep in min - Functions are executed in the current shell environment whereas scripts execute in their own process. Thus, functions can modify environment variables, e.g. change your current directory, whereas scripts can't. Scripts will be passed by value environment variables that have been exported using [`export`](http://man7.org/linux/man-pages/man1/export.1p.html) - As with any programming language functions are a powerful construct to achieve modularity, code reuse and clarity of shell code. Often shell scripts will include their own function definitions. -# Shell Tools +# Shell 工具 -## Finding how to use commands +## 查看命令如何使用 At this point you might be wondering how to find the flags for the commands in the aliasing section such as `ls -l`, `mv -i` and `mkdir -p`. More generally, given a command how do you go about finding out what it does and its different options? 
@@ -186,7 +187,7 @@ Sometimes manpages can be overly detailed descriptions of the commands and it ca For instance, I find myself referring back to the tldr pages for [`tar`](https://tldr.ostera.io/tar) and [`ffmpeg`](https://tldr.ostera.io/ffmpeg) way more often than the manpages. -## Finding files +## 查找文件 One of the most common repetitive tasks that every programmer faces is finding files or directories. All UNIX-like systems come packaged with [`find`](http://man7.org/linux/man-pages/man1/find.1.html), a great shell tool to find files. `find` will recursively search for files matching some criteria. Some examples: @@ -226,7 +227,7 @@ Therefore one trade-off between the two is speed vs freshness. Moreover `find` and similar tools can also find files using attributes such as file size, modification time or file permissions while `locate` just uses the name. A more in depth comparison can be found [here](https://unix.stackexchange.com/questions/60205/locate-vs-find-usage-pros-and-cons-of-each-other). -## Finding code +## 查找代码 Finding files is useful but quite often you are after what is in the file. A common scenario is wanting to search for all files that contain some pattern, along with where in those files said pattern occurs. @@ -254,7 +255,7 @@ rg --stats PATTERN Note that as with `find`/`fd`, it is important that you know that these problems can be quickly solved using one of these tools, while the specific tools you use are not as important. -## Finding shell commands +## 查找 shell 命令 So far we have seen how to find files and code, but as you start spending more time in the shell you may want to find specific commands you typed at some point. The first thing to know is that the typing up arrow will give you back your last command and if you keep pressing it you will slowly go through your shell history. 
@@ -280,7 +281,7 @@ Lastly, a thing to have in mind is that if you start a command with a leading sp This comes in handy when you are typing commands with passwords or other bits of sensitive information. If you make the mistake of not adding the leading space you can always manually remove the entry by editing your `.bash_history` or `.zhistory`. -## Directory Navigation +## 文件夹导航 So far we have assumed that you already are where you need to be to perform these actions, but how do you go about quickly navigating directories? There are many simple ways that you could do this, such as writing shell aliases, creating symlinks with [ln -s](http://man7.org/linux/man-pages/man1/ln.1.html) but the truth is that developers have figured out quite clever and sophisticated solutions by now. @@ -292,16 +293,16 @@ The most straightforward use is _autojump_ which adds a `z` command that you can More complex tools exist to quickly get an overview of a directory structure [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) or even full fledged file managers like [`nnn`](https://github.com/jarun/nnn) or [`ranger`](https://github.com/ranger/ranger) -# Exercises +# 课后练习 -1. Read [`man ls`](http://man7.org/linux/man-pages/man1/ls.1.html) and write an `ls` command that lists files in the following manner +1. 阅读 [`man ls`](http://man7.org/linux/man-pages/man1/ls.1.html) ,然后使用`ls` 命令进行如下操作: - - Includes all files, including hidden files - - Sizes are listed in human readable format (e.g. 454M instead of 454279954) - - Files are ordered by recency - - Output is colorized + - 所有文件(包括隐藏文件) + - 文件打印以人类可以理解的格式输出 (例如,使用454M 而不是 454279954) + - 文件以最近访问顺序排序 + - 以彩色文本显示输出结果 - A sample output would look like this + 典型输出如下: ``` -rw-r--r-- 1 user group 1.1M Jan 14 09:53 baz @@ -310,28 +311,24 @@ More complex tools exist to quickly get an overview of a directory structure [`t -rw-r--r-- 1 user group 106M Jan 13 12:12 foo drwx------+ 47 user group 1.5K Jan 12 18:08 .. 
``` - {% comment %} ls -lath --color=auto {% endcomment %} +2. 编写两个bash函数 `marco` 和 `polo` 执行下面的操作。 + 每当你执行 `marco` 时,当前的工作目录应当以某种形式保存,当执行 `polo` 时,无论现在处在什么目录下,都应当 `cd` 回到当时执行 `marco` 的目录。 + 为了方便debug,你可以把代码写在单独的文件 `marco.sh` 中,并通过 `source marco.sh`命令,(重新)加载函数。 + {% comment %} + marco() { + export MARCO=$(pwd) + } + polo() { + cd "$MARCO" + } + {% endcomment %} -1. Write bash functions `marco` and `polo` that do the following. -Whenever you execute `marco` the current working directory should be saved in some manner, then when you execute `polo`, no matter what directory you are in, `polo` should `cd` you back to the directory where you executed `marco`. -For ease of debugging you can write the code in a file `marco.sh` and (re)load the definitions to your shell by executing `source marco.sh`. - -{% comment %} -marco() { - export MARCO=$(pwd) -} - -polo() { - cd "$MARCO" -} -{% endcomment %} - -1. Say you have a command that fails rarely. In order to debug it you need to capture its output but it can be time consuming to get a failure run. -Write a bash script that runs the following script until it fails and captures its standard output and error streams to files and prints everything at the end. -Bonus points if you can also report how many runs it took for the script to fail. +3. 假设您有一个命令,它很少出错。因此为了在出错时能够对其进行调试,需要花费大量的时间重现错误并捕获输出。 + 编写一段bash脚本,运行如下的脚本直到它出错,将它的标准输出和标准错误流记录到文件,并在最后输出所有内容。 + 加分项:报告脚本在失败前共运行了多少次。 ```bash #!/usr/bin/env bash @@ -347,30 +344,27 @@ Bonus points if you can also report how many runs it took for the script to fail echo "Everything went according to plan" ``` -{% comment %} -#!/usr/bin/env bash + {% comment %} + #!/usr/bin/env bash -count=0 -until [[ "$?" -ne 0 ]]; -do - count=$((count+1)) - ./random.sh &> out.txt -done + count=0 + until [[ "$?" 
-ne 0 ]]; + do + count=$((count+1)) + ./random.sh &> out.txt + done -echo "found error after $count runs" -cat out.txt -{% endcomment %} + echo "found error after $count runs" + cat out.txt + {% endcomment %} + +4. 本节课我们讲解了 `find` 命令的 `-exec` 参数非常强大,它可以对我们查找对文件进行操作。但是,如果我们要对所有文件进行操作呢?例如创建一个zip压缩文件?我们已经知道,命令行可以从参数或标准输入接受输入。在用管道连接命令时,我们将标准输出和标准输入连接起来,但是有些命令,例如`tar` 则需要从参数接受输入。这里我们可以使用[`xargs`](http://man7.org/linux/man-pages/man1/xargs.1.html) 命令,它可以使用标准输入中的内容作为参数。 + 例如 `ls | xargs rm` 会删除当前目录中的所有文件。 -1. As we covered in the lecture `find`'s `-exec` can be very powerful for performing operations over the files we are searching for. -However, what if we want to do something with **all** the files, like creating a zip file? -As you have seen so far commands will take input from both arguments and STDIN. -When piping commands, we are connecting STDOUT to STDIN, but some commands like `tar` take inputs from arguments. -To bridge this disconnect there's the [`xargs`](http://man7.org/linux/man-pages/man1/xargs.1.html) command which will execute a command using STDIN as arguments. -For example `ls | xargs rm` will delete the files in the current directory. + 您的任务是编写一个命令,它可以递归地查找文件夹中所有的HTML文件,并将它们压缩成zip文件。注意,即使文件名中包含空格,您的命令也应该能够正确执行(提示:查看 `xargs`的参数`-d`) - Your task is to write a command that recursively finds all HTML files in the folder and makes a zip with them. Note that your command should work even if the files have spaces (hint: check `-d` flag for `xargs`) {% comment %} find . -type f -name "*.html" | xargs -d '\n' tar -cvzf archive.tar.gz {% endcomment %} -1. (Advanced) Write a command or script to recursively find the most recently modified file in a directory. More generally, can you list all files by recency? +5. (进阶) 编写一个命令或脚本递归的查找文件夹中最近使用的文件。更通用的做法,你可以按照最近的使用时间列出文件吗? 
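Reviewer's aside on exercise 4 above: the commented-out answer relies on `xargs -d '\n'`, which is a GNU extension. One possible alternative, sketched here rather than offered as the course's answer, passes NUL-delimited names with `find -print0` and `xargs -0`. The sample tree, the `scratch` variable and the archive name are all made up for illustration.

```bash
# Build a small sample tree in a scratch directory (all names are made up).
scratch=$(mktemp -d)
mkdir -p "$scratch/site/sub dir"
touch "$scratch/site/index.html" \
      "$scratch/site/sub dir/my page.html" \
      "$scratch/site/notes.txt"

cd "$scratch" || exit 1

# -print0 ends each name with a NUL byte and xargs -0 splits on NUL,
# so spaces (and even newlines) inside file names stay intact.
find site -type f -name '*.html' -print0 | xargs -0 tar -czf archive.tar.gz

# List what ended up in the archive: only the two HTML files are expected.
archived=$(tar -tzf archive.tar.gz)
echo "$archived"
```

For very long file lists `xargs` may run `tar` more than once, and each run would overwrite the archive, so this sketch is meant for lists of modest size.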
From 705b45ed9bbcd2fdff430f71b65e9341014d1556 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 10:58:57 +0800 Subject: [PATCH 314/640] add author --- _layouts/lecture.html | 1 + index.md | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/_layouts/lecture.html b/_layouts/lecture.html index 087491ba..f6893c2d 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -18,5 +18,6 @@

{{ page.title }}{% if page.subtitle %}

Edit this page.

+

Translator: Lingfeng Ai

Licensed under CC BY-NC-SA.

diff --git a/index.md b/index.md index b0326f0a..edcc4309 100644 --- a/index.md +++ b/index.md @@ -81,6 +81,6 @@ AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。

Source code.

Licensed under CC BY-NC-SA.

-

Translator: Lingfeng Ai (hanxiaomax@qq.com)

+

Translator: Lingfeng Ai

See here for contribution & translation guidelines.

From f892286aefaf4e8fb23c2ec045922016d005cc04 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 10:59:12 +0800 Subject: [PATCH 315/640] update trans --- _2020/shell-tools.md | 35 +++++++++++++++-------------------- 1 file changed, 15 insertions(+), 20 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 0e9ea36c..c2ea7e00 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -13,8 +13,7 @@ video: # Shell 脚本 -So far we have seen how to execute commands in the shell and pipe them together. -However, in many scenarios you will want to perform a series of commands and make use of control flow expressions like conditionals or loops. +到目前为止,我们已经学习来如何在shell中执行命令,并使用管道将命令组合使用。但是,很多情况下我们需要执行一系列到操作并使用条件或循环这样的控制流。 Shell scripts are the next step in complexity. Most shells have their own scripting language with variables, control flow and its own syntax. @@ -48,24 +47,20 @@ mcd () { } ``` -Here `$1` is the first argument to the script/function. -Unlike other scripting languages, bash uses a variety of special variables to refer to arguments, error codes and other relevant variables. Below is a list of some of them. A more comprehensive list can be found [here](https://www.tldp.org/LDP/abs/html/special-chars.html). -- `$0` - Name of the script -- `$1` to `$9` - Arguments to the script. `$1` is the first argument and so on. -- `$@` - All the arguments -- `$#` - Number of arguments -- `$?` - Return code of the previous command -- `$$` - Process Identification number for the current script -- `!!` - Entire last command, including arguments. A common pattern is to execute a command only for it to fail due to missing permissions, then you can quickly execute it with sudo by doing `sudo !!` -- `$_` - Last argument from the last command. 
If you are in an interactive shell, you can also quickly get this value by typing `Esc` followed by `.` - -Commands will often return output using `STDOUT`, errors through `STDERR` and a Return Code to report errors in a more script friendly manner. -Return code or exit status is the way scripts/commands have to communicate how execution went. -A value of 0 usually means everything went OK, anything different from 0 means an error occurred. - -Exit codes can be used to conditionally execute commands using `&&` (and operator) and `||` (or operator). Commands can also be separated within the same line using a semicolon `;`. -The `true` program will always have a 0 return code and the `false` command will always have a 1 return code. -Let's see some examples +这里 `$1` 是脚本到第一个参数。与其他脚本语言不同到是,bash使用了很多特殊到变量来表示参数、错误代码和相关变量。下面是列举来其中一些变量,更完整到列表可以参考 [这里](https://www.tldp.org/LDP/abs/html/special-chars.html)。 +- `$0` - 脚本名 +- `$1` 到 `$9` - 脚本到参数。 `$1` 是第一个参数,依此类推。 +- `$@` - 所有参数 +- `$#` - 参数个数 +- `$?` - 前一个命令到返回值 +- `$$` - 当前脚本到进程识别码 +- `!!` - 完整到上一条命令,包括参数。常见应用:当你因为权限不足执行命令失败时,可以使用 `sudo !!`再尝试一次。 +- `$_` - 上一条命令的最后一个参数。如果你正在使用的是交互式shell,你可以通过按下 `Esc` 之后键入 . 
来获取这个值。 + +命令通常使用 `STDOUT`来返回输出值,使用`STDERR` 来返回错误及错误码,便于脚本以更加友好到方式报告错误。 +返回码或退出状态是脚本/命令之间交流执行状态到方式。返回值0表示正常执行,其他所有非0的返回值都表示有错误发生。 + +退出码可以搭配`&&` (与操作符) 和 `||` (或操作符)使用,用来进行条件判断,决定是否执行其他程序。同一行的多个命令可以用` ; `分隔。程序 `true` 的返回码永远是`0`,`false` 的返回码永远是`1`。让我们看几个例子 ```bash false || echo "Oops, fail" From 51d604b6f2b6ff2de4e10b0c0ae871c672f8a316 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 11:02:06 +0800 Subject: [PATCH 316/640] change title --- index.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/index.md b/index.md index edcc4309..29963678 100644 --- a/index.md +++ b/index.md @@ -1,6 +1,6 @@ --- layout: page -title: 计算机科学教育中缺失的一课 +title: The Missing Semester of Your CS Education --- 学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而 @@ -44,13 +44,13 @@ Sign up for the IAP 2020 class by filling out this [registration form](https://f 讲座视频可以在 [ -YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J)上找到 +YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J)上找到。 # 关于本课程 -**教员**: 本课程由 [Anish](https://www.anishathalye.com/)、 [Jon](https://thesquareplanet.com/) 和 [Jose](http://josejg.com/)教授。 +**教员**: 本课程由 [Anish](https://www.anishathalye.com/)、 [Jon](https://thesquareplanet.com/) 和 [Jose](http://josejg.com/) 讲授。 -**问题**: 请通过 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)联系我们 +**问题**: 请通过 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)联系我们。 # 在 MIT 之外 From 853946606c73828f14bddf579b3dcf513209d156 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 16:21:32 +0800 Subject: [PATCH 317/640] add trans status --- _2020/command-line.md | 2 +- _2020/data-wrangling.md | 2 +- _2020/debugging-profiling.md | 2 +- _2020/editors.md | 2 +- _2020/metaprogramming.md | 2 +- _2020/potpourri.md | 6 +++--- _2020/qa.md | 4 ++-- _2020/security.md | 2 +- _2020/shell-tools.md | 2 +- _2020/version-control.md | 2 +- index.md | 7 +++---- 11 files changed, 16 insertions(+), 
17 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 34d386f7..ca12eacf 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -2,7 +2,7 @@ layout: lecture title: "命令行环境" date: 2019-01-21 -ready: true +ready: false video: aspect: 56.25 id: e8BO_dYxk5c diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index a723c497..fe24629c 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -2,7 +2,7 @@ layout: lecture title: "数据清理" date: 2019-01-16 -ready: true +ready: false video: aspect: 56.25 id: sz_dsktIjt4 diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 85ddcf63..e39bbd1b 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -2,7 +2,7 @@ layout: lecture title: "调试及性能分析" date: 2019-01-23 -ready: true +ready: false video: aspect: 56.25 id: l812pUnKxME diff --git a/_2020/editors.md b/_2020/editors.md index 38d0df4b..e1f6efd8 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -2,7 +2,7 @@ layout: lecture title: "编辑器 (Vim)" date: 2019-01-15 -ready: true +ready: false video: aspect: 56.25 id: a6Q8Na575qc diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index d98a8aff..d850e839 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -3,7 +3,7 @@ layout: lecture title: "元编程" details: 构建系统、依赖管理、测试、持续集成 date: 2019-01-27 -ready: true +ready: false video: aspect: 56.25 id: _Ms1Z4xfqv4 diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 691a8ab4..5da53334 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -2,15 +2,15 @@ layout: lecture title: "大杂烩" date: 2019-01-29 -ready: true +ready: false video: aspect: 56.25 id: JZDt-PRq0uo --- -## Table of Contents +## 目录 -- [Table of Contents](#table-of-contents) +- [目录](#%e7%9b%ae%e5%bd%95) - [Keyboard remapping](#keyboard-remapping) - [Daemons](#daemons) - [FUSE](#fuse) diff --git a/_2020/qa.md b/_2020/qa.md index 82ad7fd4..eefcb5ba 100644 --- a/_2020/qa.md +++ 
b/_2020/qa.md @@ -2,7 +2,7 @@ layout: lecture title: "Q&A" date: 2019-01-30 -ready: true +ready: false video: aspect: 56.25 id: Wz50FvGG6xU @@ -10,7 +10,7 @@ video: For the last lecture, we answered questions that the students submitted: -- [Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc ](#any-recommendations-on-learning-operating-systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc) +- [Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc](#any-recommendations-on-learning-operating-systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc) - [What are some of the tools you'd prioritize learning first?](#what-are-some-of-the-tools-youd-prioritize-learning-first) - [When do I use Python versus a Bash scripts versus some other language?](#when-do-i-use-python-versus-a-bash-scripts-versus-some-other-language) - [What is the difference between `source script.sh` and `./script.sh`](#what-is-the-difference-between-source-scriptsh-and-scriptsh) diff --git a/_2020/security.md b/_2020/security.md index 27a21c37..d88dd467 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -2,7 +2,7 @@ layout: lecture title: "安全和密码学" date: 2019-01-28 -ready: true +ready: false video: aspect: 56.25 id: tjwobAmnKTo diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index c2ea7e00..91a18c9a 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -2,7 +2,7 @@ layout: lecture title: "Shell 工具和脚本" date: 2019-01-14 -ready: true +ready: false video: aspect: 56.25 id: kgII-YWo3Zw diff --git a/_2020/version-control.md b/_2020/version-control.md index 4659d86d..52f46639 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -2,7 +2,7 @@ layout: lecture title: "版本控制(Git)" date: 2019-01-22 -ready: true +ready: false video: aspect: 56.25 id: 
2sjqTHE0zok diff --git a/index.md b/index.md index 29963678..d4f97393 100644 --- a/index.md +++ b/index.md @@ -3,8 +3,7 @@ layout: page title: The Missing Semester of Your CS Education --- -学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而 -有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 +学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行,使用强大对文本编辑器,使用版本控制系统提供的多种特性等等。 学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 @@ -34,9 +33,9 @@ Sign up for the IAP 2020 class by filling out this [registration form](https://f
  • {{ lecture.date | date: '%-m/%d' }}: {% if lecture.ready %} - {{ lecture.title }} + {{ lecture.title }} {% else %} - {{ lecture.title }} {% if lecture.noclass %}[no class]{% endif %} + {{ lecture.title }} {% if lecture.noclass %}[no class]{% endif %} {% endif %}
  • {% endif %} From 833e596f5c1cbc0d116b302580b244cac3564ad8 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 17:09:00 +0800 Subject: [PATCH 318/640] add trans status --- _2020/shell-tools.md | 77 ++++++++++++++++++++++---------------------- 1 file changed, 38 insertions(+), 39 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 91a18c9a..9ddeb73f 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -13,31 +13,29 @@ video: # Shell 脚本 -到目前为止,我们已经学习来如何在shell中执行命令,并使用管道将命令组合使用。但是,很多情况下我们需要执行一系列到操作并使用条件或循环这样的控制流。 +到目前为止,我们已经学习来如何在shell中执行命令,并使用管道将命令组合使用。但是,很多情况下我们需要执行一系列的操作并使用条件或循环这样的控制流。 -Shell scripts are the next step in complexity. -Most shells have their own scripting language with variables, control flow and its own syntax. -What makes shell scripting different from other scripting programming language is that is optimized for performing shell related tasks. -Thus, creating command pipelines, saving results into files or reading from standard input are primitives in shell scripting which makes it easier to use than general purpose scripting languages. -For this section we will focus on bash scripting since it is the most common. +shell脚本是一种更加复杂度的工具。 -To assign variables in bash use the syntax `foo=bar` and access the value of the variable with `$foo`. -Note that `foo = bar` will not work since it is interpreted as calling the `foo` program with arguments `=` and `bar`. -In general, in shell scripts the space character will perform argument splitting and it can be confusing to use at first so always check for that. 
+大多数shell都有自己的一套脚本语言,包括变量、控制流和自己的语法。shell脚本与其他脚本语言不同之处在于,shell脚本针对shell所从事的相关工作进行了优化。因此,创建命令流程(pipelines)、将结果保存到文件、从标准输入中读取输入,这些都是shell脚本中的原生操作,这让它比通用的脚本语言更易用。本节中,我们会专注于bash脚本,因为它最流行,应用更为广泛。 + +在bash中为变量赋值的语法是`foo=bar`,访问变量中存储的数值,其语法为 `$foo`。 +需要注意的是,`foo = bar` (使用空格隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 +总的来说,在shell脚本中使用空格会起到分割参数的作用,有时候可能会造成混淆,请务必多加检查。 Strings in bash can be defined with `'` and `"` delimiters but they are not equivalent. -Strings delimited with `'` are literal strings and will not substitute variable values whereas `"` delimited strings will. +Bash中的字符串通过`'` 和 `"`分隔符来定义,但是它们的含义并不相同。以`'`定义的字符串为原义字符串,其中的变量不会被替换,而 `"`定义的字符串会将变量值进行替换。 ```bash foo=bar echo "$foo" -# prints bar +# 打印 bar echo '$foo' -# prints $foo +# 打印 $foo ``` -As with most programming languages, bash supports control flow techniques including `if`, `case`, `while` and `for`. -Similarly, `bash` has functions that take arguments and can operate with them. Here is an example of a function that creates a directory and `cd`s into it. +和其他大多数的编程语言一样,`bash`也支持`if`, `case`, `while` 和 `for` 这些控制流关键字。同样地, +`bash` 也支持函数,它可以接受参数并基于参数进行操作。下面这个函数是一个例子,它会创建一个文件夹并使用`cd`进入该文件夹。 ```bash @@ -79,25 +77,24 @@ false ; echo "This will always run" # This will always run ``` -Another common pattern is wanting to get the output of a command as a variable. This can be done with _command substitution_. -Whenever you place `$( CMD )` it will execute `CMD`, get the output of the command and substitute it in place. -For example, if you do `for file in $(ls)`, the shell will first call `ls` and then iterate over those values. -A lesser known similar feature is _process substitution_, `<( CMD )` will execute `CMD` and place the output in a temporary file and substitute the `<()` with that file's name. This is useful when commands expect values to be passed by file instead of by STDIN. For example, `diff <(ls foo) <(ls bar)` will show differences between files in dirs `foo` and `bar`. 
+另一个常见的模式是以变量的形式获取一个命令的输出,这可以通过 _命令替换_ (_command substitution_)实现。 + +当您通过 `$( CMD )` 这样的方式来执行`CMD` 这个命令时,然后它的输出结果会替换掉 `$( CMD )` 。例如,如果执行 `for file in $(ls)` ,shell首先将调用`ls` ,然后遍历得到的这些返回值。还有一个冷门的类似特性是 _进程替换_(_process substitution_), `<( CMD )` 会执行 `CMD` 并将结果输出到一个临时文件中,并将 `<( CMD )` 替换成临时文件名。这在我们希望返回值通过文件而不是STDIN传递时很有用。例如, `diff <(ls foo) <(ls bar)` 会显示文件夹 `foo` 和 `bar` 中文件的区别。 +说了很多,现在该看例子了,下面这个例子展示了一部分上面提到的特性。这段脚本会遍历我们提供的参数,使用`grep` 搜索字符串 `foobar`,如果没有找到,则将其作为注释追加到文件中。 -Since that was a huge information dump let's see an example that showcases some of these features. It will iterate through the arguments we provide, `grep` for the string `foobar` and append it to the file as a comment if it's not found. ```bash #!/bin/bash -echo "Starting program at $(date)" # Date will be substituted +echo "Starting program at $(date)" # date会被替换成日期和时间 echo "Running program $0 with $# arguments with pid $$" for file in $@; do grep foobar $file > /dev/null 2> /dev/null - # When pattern is not found, grep has exit status 1 - # We redirect STDOUT and STDERR to a null register since we do not care about them + # 如果模式没有找到,则grep退出状态为 1 + # 我们将标准输出流和标准错误流重定向到Null,因为我们并不关心这些信息 if [[ $? -ne 0 ]]; then echo "File $file does not have any foobar, adding one" echo "# foobar" >> "$file" @@ -105,35 +102,36 @@ for file in $@; do done ``` -In the comparison we tested whether `$?` was not equal to 0. -Bash implements many comparisons of this sort, you can find a detailed list in the manpage for [`test`](http://man7.org/linux/man-pages/man1/test.1.html). -When performing comparisons in bash try to use double brackets `[[ ]]` in favor of simple brackets `[ ]`. Chances of making mistakes are lower although it won't be portable to `sh`. A more detailed explanation can be found [here](http://mywiki.wooledge.org/BashFAQ/031). 
+在条件语句中,我们比较 `$?` 是否不等于0。 +Bash实现了许多类似的比较操作,您可以查看 [`test 手册`](http://man7.org/linux/man-pages/man1/test.1.html)。 +在bash中进行比较时,尽量使用双方括号 `[[ ]]` 而不是单方括号 `[ ]`,这样会降低犯错的几率,尽管这样并不能兼容 `sh`。 更详细的说明参见[这里](http://mywiki.wooledge.org/BashFAQ/031)。 + +当执行脚本时,我们经常需要提供形式类似的参数。bash使我们可以轻松的实现这一操作,它可以通过文件名展开来展开表达式。这一技术被称为shell的 _通配_( _globbing_) -When launching scripts, you will often want to provide arguments that are similar. Bash has ways of making this easier, expanding expressions by carrying out filename expansion. These techniques are often referred to as shell _globbing_. -- Wildcards - Whenever you want to perform some sort of wildcard matching you can use `?` and `*` to match one or any amount of characters respectively. For instance, given files `foo`, `foo1`, `foo2`, `foo10` and `bar`, the command `rm foo?` will delete `foo1` and `foo2` whereas `rm foo*` will delete all but `bar`. -- Curly braces `{}` - Whenever you have a common substring in a series of commands you can use curly braces for bash to expand this automatically. This comes in very handy when moving or converting files. +- 通配符 - 当你想要利用通配符进行匹配时,你可以分别使用 `?` 和 `*` 来匹配一个或任意个字符。例如,对于文件`foo`, `foo1`, `foo2`, `foo10` 和 `bar`, `rm foo?`这条命令会删除`foo1` 和 `foo2` ,而`rm foo*` 则会删除除了`bar`之外的所有文件。 +- 花括号`{}` - 当你有一系列的指令,其中包含一段公共子串时,可以用花括号来自动展开这些命令。这在批量移动或转换文件时非常方便。 ```bash convert image.{png,jpg} -# Will expand to +# 会展开为 convert image.png image.jpg cp /path/to/project/{foo,bar,baz}.sh /newpath -# Will expand to +# 会展开为 cp /path/to/project/foo.sh /path/to/project/bar.sh /path/to/project/baz.sh /newpath -# Globbing techniques can also be combined +# 也可以结合通配使用 mv *{.py,.sh} folder -# Will move all *.py and *.sh files - +# 会移动所有 *.py 和 *.sh 文件 mkdir foo bar -# This creates files foo/a, foo/b, ... foo/h, bar/a, bar/b, ... bar/h + +# 下面命令会创建foo/a, foo/b, ... foo/h, bar/a, bar/b, ... 
bar/h这些文件 touch {foo,bar}/{a..h} touch foo/x bar/y -# Show differences between files in foo and bar +# 显示foo和bar文件的不同 diff <(ls foo) <(ls bar) -# Outputs +# 输出 # < x # --- # > y @@ -141,9 +139,9 @@ diff <(ls foo) <(ls bar) -Writing `bash` scripts can be tricky and unintuitive. There are tools like [shellcheck](https://github.com/koalaman/shellcheck) that will help you find out errors in your sh/bash scripts. +编写 `bash` 脚本有时候会很别扭和反直觉。例如 [shellcheck](https://github.com/koalaman/shellcheck)这样的工具可以帮助你定位sh/bash脚本中的错误。 -Note that scripts need not necessarily be written in bash to be called from the terminal. For instance, here's a simple Python script that outputs its arguments in reversed order +注意,脚本并不一定只有用bash写才能在终端里调用。比如说,这是一段Python脚本,作用是将输入的参数倒序输出: ```python #!/usr/local/bin/python @@ -152,7 +150,8 @@ for arg in reversed(sys.argv[1:]): print(arg) ``` -The shell knows to execute this script with a python interpreter instead of a shell command because we included a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) line at the top of the script. +shell知道去用python解释器而不是shell命令来运行这段脚本,是因为脚本的开头第一行的[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))。 + It is good practice to write shebang lines using the [`env`](http://man7.org/linux/man-pages/man1/env.1.html) command that will resolve to wherever the command lives in the system, increasing the portability of your scripts. To resolve the location, `env` will make use of the `PATH` environment variable we introduced in the first lecture. For this example the shebang line would look like `#!/usr/bin/env python`. 
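A minimal sketch of the `env` shebang described in this patch, assuming `bash` is on `PATH` and `mktemp` is available. The script name `rev_args` and the temporary directory are hypothetical, chosen only for illustration; the script mirrors the Python argument-reversing example in bash.

```shell
# Create a throwaway directory for the demo script (hypothetical location).
tmpdir=$(mktemp -d)

# Write a script whose shebang asks env to locate bash via PATH,
# instead of hard-coding an interpreter path.
cat > "$tmpdir/rev_args" <<'EOF'
#!/usr/bin/env bash
# Print the arguments in reverse order, one per line.
for (( i = $#; i > 0; i-- )); do
    echo "${!i}"
done
EOF

# Mark it executable so the shebang line is honored when it is run directly.
chmod +x "$tmpdir/rev_args"

# Run it without naming an interpreter on the command line.
out=$("$tmpdir/rev_args" a b c)
echo "$out"
```

Because the interpreter is looked up through `PATH`, the same file should run unchanged on systems where `bash` is installed in different locations.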
From 0cce00a4b6444d5dfef64b738200867a89f4b0c7 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 17:50:44 +0800 Subject: [PATCH 319/640] add trans status --- _2020/shell-tools.md | 50 +++++++++++++++++++++++--------------------- 1 file changed, 26 insertions(+), 24 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 9ddeb73f..979fd5ba 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -152,39 +152,40 @@ for arg in reversed(sys.argv[1:]): shell知道去用python解释器而不是shell命令来运行这段脚本,是因为脚本的开头第一行的[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))。 -It is good practice to write shebang lines using the [`env`](http://man7.org/linux/man-pages/man1/env.1.html) command that will resolve to wherever the command lives in the system, increasing the portability of your scripts. To resolve the location, `env` will make use of the `PATH` environment variable we introduced in the first lecture. -For this example the shebang line would look like `#!/usr/bin/env python`. +在shebang行中使用 [`env`](http://man7.org/linux/man-pages/man1/env.1.html) 命令是一种好的实践,它会利用环境变量中的程序来解析该脚本,这样就提高来您的脚本的可移植性。`env` 会利用我们第一节讲座中介绍过的`PATH` 环境变量来进行定位。 +例如,使用了`env`的shebang看上去时这样的`#!/usr/bin/env python`。 + + +shell函数和脚本有如下一些不同点: + +- 函数只能用与shell使用相同的语言,脚本可以使用任意语言。因此在脚本中包含 `shebang` 是很重要的。 +- 函数仅在定义时被加载,脚本会在每次被执行时加载。这让函数的加载比脚本略快一些,但每次修改函数定义,都要重新加载一次。 +- 函数会在当前的shell环境中执行,脚本会在单独的进程中执行。因此,函数可以对环境变量进行更改,比如改变当前工作目录,脚本则不行。脚本需要使用 [`export`](http://man7.org/linux/man-pages/man1/export.1p.html) 将环境变量导出,并将值传递给环境变量。 +- 与其他程序语言一样,函数可以提高代码模块性、代码复用性并创建清晰性的结构。shell脚本中往往也会包含它们自己的函数定义。 + -Some differences between shell functions and scripts that you should keep in mind are: -- Functions have to be in the same language as the shell, while scripts can be written in any language. This is why including a shebang for scripts is important. -- Functions are loaded once when their definition is read. Scripts are loaded every time they are executed. 
This makes functions slightly faster to load but whenever you change them you will have to reload their definition. -- Functions are executed in the current shell environment whereas scripts execute in their own process. Thus, functions can modify environment variables, e.g. change your current directory, whereas scripts can't. Scripts will be passed by value environment variables that have been exported using [`export`](http://man7.org/linux/man-pages/man1/export.1p.html) -- As with any programming language functions are a powerful construct to achieve modularity, code reuse and clarity of shell code. Often shell scripts will include their own function definitions. # Shell 工具 ## 查看命令如何使用 -At this point you might be wondering how to find the flags for the commands in the aliasing section such as `ls -l`, `mv -i` and `mkdir -p`. -More generally, given a command how do you go about finding out what it does and its different options? -You could always start googling, but since UNIX predates StackOverflow there are builtin ways of getting this information. +看到这里,您可能会有疑问,我们应该如何为特定的命令找到合适的标记呢?例如 `ls -l`, `mv -i` 和 `mkdir -p`。更一般的庆幸是,给您一个命令行,您应该怎样了解如何使用这个命令行并找出它的不同的选项呢? +一般来说,您可能会先去网上搜索答案,但是,UNIX 可比 StackOverflow 出现的早,因此我们的系统里其实早就包含了可以获取相关信息的方法。 -As we saw in the shell lecture, the first order approach is to call said command with the `-h` or `--help` flags. A more detailed approach is to use the `man` command. -Short for manual, [`man`](http://man7.org/linux/man-pages/man1/man.1.html) provides a manual page (called manpage) for a command you specify. -For example, `man rm` will output the behavior of the `rm` command along with the flags that it takes including the `-i` flag we showed earlier. -In fact, what I have been linking so far for every command is the online version of Linux manpages for the commands. -Even non native commands that you install will have manpage entries if the developer wrote them and included them as part of the installation process. 
-For interactive tools such as the ones based on ncurses, help for the commands can often be accessed within the program using the `:help` command or typing `?`. +在上一节中我们介绍过,最常用的方法是为对应的命令行添加`-h` 或 `--help` 标记。另外一个更详细的方法则是使用`man` 命令。[`man`](http://man7.org/linux/man-pages/man1/man.1.html) 命令是手册(manual)的缩写,它提供了命令的用户手册。 -Sometimes manpages can be overly detailed descriptions of the commands and it can become hard to decipher what flags/syntax to use for common use cases. -[TLDR pages](https://tldr.sh/) are a nifty complementary solution that focuses on giving example use cases of a command so you can quickly figure out which options to use. -For instance, I find myself referring back to the tldr pages for [`tar`](https://tldr.ostera.io/tar) and [`ffmpeg`](https://tldr.ostera.io/ffmpeg) way more often than the manpages. +例如,`man rm` 会输出命令 `rm` 的说明,同时还有其标记列表,包括之前我们介绍过的`-i`。 +事实上,目前我们给出的所有命令的说明链接,都是网页版的Linux命令手册。即使是您安装的第三方命令,前提是开发者编写了手册并将其包含在了安装包中。在交互式的、基于字符处理的终端窗口中,一般也可以通过 `:help` 命令或键入 `?`来获取帮助。 + +有时候手册内容太过详实,让我们难以在其中查找哪些最常用的标记和语法。 +[TLDR pages](https://tldr.sh/) 是一个很不错的替代品,它提供了一些案例,可以帮助您快速找到正确的选项。 + +例如,自己就常常在tldr上搜索[`tar`](https://tldr.ostera.io/tar) 和 [`ffmpeg`](https://tldr.ostera.io/ffmpeg) 的用法。 ## 查找文件 -One of the most common repetitive tasks that every programmer faces is finding files or directories. -All UNIX-like systems come packaged with [`find`](http://man7.org/linux/man-pages/man1/find.1.html), a great shell tool to find files. `find` will recursively search for files matching some criteria. Some examples: +程序员们面对的最常见的重复任务就是查找文件或目录。所有的类UNIX系统都包含一个名为 [`find`](http://man7.org/linux/man-pages/man1/find.1.html)的工具,它是shell上用于查找文件的绝佳工具。`find`命令会递归地搜索符合条件的文件,例如: ```bash # Find all directories named src @@ -196,8 +197,9 @@ find . -mtime -1 # Find all zip files with size in range 500k to 10M find . -size +500k -size -10M -name '*.tar.gz' ``` -Beyond listing files, find can also perform actions over files that match your query. 
-This property can be incredibly helpful to simplify what could be fairly monotonous tasks. +除了列出所寻找的文件之外,find还能对所有查找到的文件进行操作。这能极大地简化一些单调的任务。 + + ```bash # Delete all files with .tmp extension find . -name '*.tmp' -exec rm {} \; @@ -205,7 +207,7 @@ find . -name '*.tmp' -exec rm {} \; find . -name '*.png' -exec convert {} {.}.jpg \; ``` -Despite `find`'s ubiquitousness, its syntax can sometimes be tricky to remember. +尽管 `find` 用途广泛,它的语法却比较难以记忆。例如, For instance, to simply find files that match some pattern `PATTERN` you have to execute `find -name '*PATTERN*'` (or `-iname` if you want the pattern matching to be case insensitive). You could start building aliases for those scenarios but as part of the shell philosophy is good to explore using alternatives. Remember, one of the best properties of the shell is that you are just calling programs so you can find (or even write yourself) replacements for some. From f3db01cbb729addb983a669cde91198ee8277e67 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 18:07:51 +0800 Subject: [PATCH 320/640] add trans status --- _2020/shell-tools.md | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 979fd5ba..84cbb259 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -207,21 +207,18 @@ find . -name '*.tmp' -exec rm {} \; find . -name '*.png' -exec convert {} {.}.jpg \; ``` -尽管 `find` 用途广泛,它的语法却比较难以记忆。例如, -For instance, to simply find files that match some pattern `PATTERN` you have to execute `find -name '*PATTERN*'` (or `-iname` if you want the pattern matching to be case insensitive). -You could start building aliases for those scenarios but as part of the shell philosophy is good to explore using alternatives. -Remember, one of the best properties of the shell is that you are just calling programs so you can find (or even write yourself) replacements for some. 
-For instance, [`fd`](https://github.com/sharkdp/fd) is a simple, fast and user-friendly alternative to `find`. -It offers some nice defaults like colorized output, default regex matching, Unicode support and it has in what my opinion is a more intuitive syntax. -The syntax to find a pattern `PATTERN` is `fd PATTERN`. - -Most would agree that `find` and `fd` are good but some of you might be wondering about the efficiency of looking for files every time versus compiling some sort of index or database for quickly searching. -That is what [`locate`](http://man7.org/linux/man-pages/man1/locate.1.html) is for. -`locate` uses a database that is updated using [`updatedb`](http://man7.org/linux/man-pages/man1/updatedb.1.html). -In most systems `updatedb` is updated daily via [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html). -Therefore one trade-off between the two is speed vs freshness. -Moreover `find` and similar tools can also find files using attributes such as file size, modification time or file permissions while `locate` just uses the name. -A more in depth comparison can be found [here](https://unix.stackexchange.com/questions/60205/locate-vs-find-usage-pros-and-cons-of-each-other). 
+尽管 `find` 用途广泛,它的语法却比较难以记忆。例如,为了查找满足模式 `PATTERN` 的文件,您需要执行 `find -name '*PATTERN*'` (如果您希望模式匹配时是区分大小写,可以使用`-iname`选项) + +您当然可以使用alias设置别名来简化上述操作,但shell的哲学之一便是寻找(更好用的)替代方案。 +记住,shell最好的特性就是您只是在调用程序,因此您只要找到合适的替代程序即可(甚至自己编写)。 + +例如, [`fd`](https://github.com/sharkdp/fd) 就是一个更简单、更快速、更友好的程序,它可以用来作为`find`的替代品。它有很多不错的默认设置,例如输出着色、默认支持正则匹配、支持unicode并且我认为它的语法更符合直觉。以模式`PATTERN` 搜索的语法是 `fd PATTERN`。 + +大多数人都认为 `find` 和 `fd` 已经很好用了,但是有的人可能向知道,我们是不可以可以有更高效的方法,例如不要每次都搜索文件而是通过编译索引或建立数据库的方式来实现更加快速地搜索。 + +这就要靠 [`locate`](http://man7.org/linux/man-pages/man1/locate.1.html) 了。 +`locate` 使用一个由 [`updatedb`](http://man7.org/linux/man-pages/man1/updatedb.1.html)负责更新的数据库,在大多数系统中 `updatedb` 都会通过 [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)每日更新。这便需要我们在速度和时效性之间作出权衡。而且,`find` 和类似的工具可以通过别的属性比如文件大小、修改时间或是权限来查找文件,`locate`则只能通过文件名。 [here](https://unix.stackexchange.com/questions/60205/locate-vs-find-usage-pros-and-cons-of-each-other)有一个更详细的对比。 + ## 查找代码 From 213e3a09c713b613d1a0c1078cef415efe4453f7 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 18:32:23 +0800 Subject: [PATCH 321/640] add trans status --- _2020/shell-tools.md | 41 +++++++++++++++++++---------------------- 1 file changed, 19 insertions(+), 22 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 84cbb259..85aeb9fe 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -222,19 +222,16 @@ find . -name '*.png' -exec convert {} {.}.jpg \; ## 查找代码 -Finding files is useful but quite often you are after what is in the file. -A common scenario is wanting to search for all files that contain some pattern, along with where in those files said pattern occurs. -To achieve this, most UNIX-like systems provide [`grep`](http://man7.org/linux/man-pages/man1/grep.1.html), a generic tool for matching patterns from the input text. -It is an incredibly valuable shell tool and we will cover it more in detail during the data wrangling lecture. 
- -`grep` has many flags that make it a very versatile tool. -Some I frequently use are `-C` for getting **C**ontext around the matching line and `-v` for in**v**erting the match, i.e. print all lines that do **not** match the pattern. For example, `grep -C 5` will print 5 lines before and after the match. -When it comes to quickly parsing through many files, you want to use `-R` since it will **R**ecursively go into directories and look for text files for the matching string. - -But `grep -R` can be improved in many ways, such as ignoring `.git` folders, using multi CPU support, &c. -So there has been no shortage of alternatives developed, including [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) and [rg](https://github.com/BurntSushi/ripgrep). -All of them are fantastic but pretty much cover the same need. -For now I am sticking with ripgrep (`rg`) given how fast and intuitive it is. Some examples: +查找文件是很有用的技能,但是很多时候您的目标其实是查看文件的内容。一个最常见的场景是您希望查找具有某种模式的全部文件,并找它们的位置。 + +为了实现这一点,很多类UNIX的系统都提供了[`grep`](http://man7.org/linux/man-pages/man1/grep.1.html)命令,它是用于对输入文本进行匹配的通用工具。它是一个非常重要的shell工具,我们会在后续的数据清理课程中深入的探讨它。 + +`grep` 有很多选项,这也使它成为一个非常全能的工具。其中我经常使用的有 `-C` :获取查找结果的上下文(Context);`-v` 将对结果进行反选(Invert),也就是输出不匹配的结果。举例来说, `grep -C 5` 会输出匹配结果前后五行。当需要搜索大量文件的时候,使用 `-R` 会递归地进入子目录并搜索所有的文本文件。 + +但是,我们有很多办法可以对 `grep -R` 进行改进,例如使其忽略`.git` 文件夹,使用多CPU等等。 + +因此也出现了很多它的替代品,包括 [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) 和 [rg](https://github.com/BurntSushi/ripgrep)。它们都特别好用,但是功能也都差不多,我比较常用的是 ripgrep (`rg`) ,因为它速度快,而且用法非常符合直觉。例子如下: + ```bash # Find all python files where I used the requests library rg -t py 'import requests' @@ -246,20 +243,20 @@ rg foo -A 5 rg --stats PATTERN ``` -Note that as with `find`/`fd`, it is important that you know that these problems can be quickly solved using one of these tools, while the specific tools you use are not as important. 
+与 `find`/`fd` 一样,重要的是你要知道有些问题使用合适的工具就会迎刃而解,而具体选择哪个工具则不是那么重要。 + ## 查找 shell 命令 -So far we have seen how to find files and code, but as you start spending more time in the shell you may want to find specific commands you typed at some point. -The first thing to know is that the typing up arrow will give you back your last command and if you keep pressing it you will slowly go through your shell history. +目前为止,我们已经学习了如何查找文件和代码,但随着你使用shell的时间越来越久,您可能想要找到之前输入过的某条命令。首先,按向上的方向键会显示你使用过的上一条命令,继续按上键则会遍历整个历史记录。 + + +`history` 命令允许您以程序员的方式来访问shell中输入的历史命令。这个命令会在标准输出中打印shell中的里面命令。如果我们要搜索历史记录,则可以利用管道将输出结果传递给 `grep` 进行模式搜索。 +`history | grep find` 会打印包含find子串的命令。 + +对于大多数的shell来说,您可以使用 `Ctrl+R` 对命令历史记录进行回溯搜索。敲 `Ctrl+R` 后您可以输入子串来进行匹配,查找历史命令行。 -The `history` command will let you access your shell history programmatically. -It will print your shell history to the standard output. -If we want to search there we can pipe that output to `grep` and search for patterns. -`history | grep find` will print commands with the substring "find". -In most shells you can make use of `Ctrl+R` to perform backwards search through your history. -After pressing `Ctrl+R` you can type a substring you want to match for commands in your history. As you keep pressing it you will cycle through the matches in your history. This can also be enabled with the UP/DOWN arrows in [zsh](https://github.com/zsh-users/zsh-history-substring-search). A nice addition on top of `Ctrl+R` comes with using [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) bindings. 
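The `grep` flags discussed in this patch (`-C`, `-v`, `-R`) can be sketched as follows, assuming a `grep` that supports `-C` and `-R` (GNU and BSD grep do). The directory layout and file names are made up for the demonstration.

```shell
# Build a small throwaway tree to search (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
printf 'one\ntwo\nfoobar\nfour\nfive\n' > "$demo/src/a.txt"
printf 'nothing here\n' > "$demo/src/sub/b.txt"

# -C 1: print one line of context before and after each match.
context=$(grep -C 1 foobar "$demo/src/a.txt")

# -v: invert the match, printing only lines that do not contain the pattern.
inverted=$(grep -v foobar "$demo/src/a.txt")

# -R: recurse into directories; -l: list only the names of matching files.
files=$(grep -R -l foobar "$demo")

echo "$context"
echo "$inverted"
echo "$files"
```

The same searches could be tried with `rg`, which recurses by default; the flags above are only the `grep` ones named in the text.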
From 2425a4836a0e709785a9daa2dc69eaa2f123e7d1 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 18:55:01 +0800 Subject: [PATCH 322/640] add trans status --- _2020/shell-tools.md | 36 +++++++++++++++++------------------- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 85aeb9fe..e7196b13 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -2,7 +2,7 @@ layout: lecture title: "Shell 工具和脚本" date: 2019-01-14 -ready: false +ready: true video: aspect: 56.25 id: kgII-YWo3Zw @@ -256,32 +256,30 @@ rg --stats PATTERN 对于大多数的shell来说,您可以使用 `Ctrl+R` 对命令历史记录进行回溯搜索。敲 `Ctrl+R` 后您可以输入子串来进行匹配,查找历史命令行。 +反复按下就会在所有搜索结果中循环。在 [zsh](https://github.com/zsh-users/zsh-history-substring-search)中,使用方向键上或下也可以完成这项工作。 -As you keep pressing it you will cycle through the matches in your history. -This can also be enabled with the UP/DOWN arrows in [zsh](https://github.com/zsh-users/zsh-history-substring-search). -A nice addition on top of `Ctrl+R` comes with using [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) bindings. -`fzf` is a general purpose fuzzy finder that can be used with many commands. -Here is used to fuzzily match through your history and present results in a convenient and visually pleasing manner. -Another cool history-related trick I really enjoy is **history-based autosuggestions**. -First introduced by the [fish](https://fishshell.com/) shell, this feature dynamically autocompletes your current shell command with the most recent command that you typed that shares a common prefix with it. -It can be enabled in [zsh](https://github.com/zsh-users/zsh-autosuggestions) and it is a great quality of life trick for your shell. 
+`Ctrl+R` 可以配合 [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) 使用。`fzf` 是一个通用对模糊查找工具,它可以和很多命令一起使用。这里我们可以对历史命令进行模糊查找并将结果以赏心悦目的格式输出。 + +另外一个和历史命令相关的技巧我喜欢称之为**基于历史的自动补全**。 +这一特性最初是由 [fish](https://fishshell.com/) shell 创建的,它可以根据您最近使用过的开头相同的命令,动态地对当前对shell命令进行补全。这一功能在 [zsh](https://github.com/zsh-users/zsh-autosuggestions) 中也可以使用,它可以极大对提高用户体验。 + +最后,有一点值得注意,输入命令时,如果您在命令的开头加上一个空格,它就不会被加进shell记录中。当你输入包含密码或是其他敏感信息的命令时会用到这一特性。如果你不小心忘了在前面加空格,可以通过编辑。`bash_history`或 `.zhistory` 来手动地从历史记录中移除那一项。 + -Lastly, a thing to have in mind is that if you start a command with a leading space it won't be added to your shell history. -This comes in handy when you are typing commands with passwords or other bits of sensitive information. -If you make the mistake of not adding the leading space you can always manually remove the entry by editing your `.bash_history` or `.zhistory`. ## 文件夹导航 -So far we have assumed that you already are where you need to be to perform these actions, but how do you go about quickly navigating directories? -There are many simple ways that you could do this, such as writing shell aliases, creating symlinks with [ln -s](http://man7.org/linux/man-pages/man1/ln.1.html) but the truth is that developers have figured out quite clever and sophisticated solutions by now. +之前对所有操作我们都默认一个前提,即您已经位于想要执行命令的目录下,但是如何才能高效地在目录 +间随意切换呢?有很多简便的方法可以做到,比如设置alias,使用 [ln -s](http://man7.org/linux/man-pages/man1/ln.1.html)创建符号连接等。而开发者们已经想到了很多更为精妙的解决方案。 + +对于本课程的主题来说,我们希望对常用的情况进行优化。使用[`fasd`](https://github.com/clvv/fasd)可以查找最常用和/或最近使用的文件和目录。 + +Fasd 基于 [_frecency_](https://developer.mozilla.org/en/The_Places_frecency_algorithm)对文件和文件排序,也就是说它会同时针对频率(_frequency_ )和时效( _recency_)进行排序。 -As with the theme of this course, you often want to optimize for the common case. 
-Finding frequent and/or recent files and directories can be done through tools like [`fasd`](https://github.com/clvv/fasd) -Fasd ranks files and directories by [_frecency_](https://developer.mozilla.org/en/The_Places_frecency_algorithm), that is, by both _frequency_ and _recency_. -The most straightforward use is _autojump_ which adds a `z` command that you can use to quickly `cd` using a substring of a _frecent_ directory. E.g. if you often go to `/home/user/files/cool_project` you can simply `z cool` to jump there. +最直接对用法是自动跳转 (_autojump_),对于经常访问的目录,在目录名子串前加入一个命令 `z` 就可以快速切换命令到该目录。例如, 如果您经常访问`/home/user/files/cool_project` 目录,那么可以直接使用 `z cool` 跳转到该目录。 -More complex tools exist to quickly get an overview of a directory structure [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) or even full fledged file managers like [`nnn`](https://github.com/jarun/nnn) or [`ranger`](https://github.com/ranger/ranger) +还有一些更复杂的工具可以用来概览目录结构,例如 [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) 或更加完整对文件管理器,例如 [`nnn`](https://github.com/jarun/nnn) 或 [`ranger`](https://github.com/ranger/ranger)。 # 课后练习 From c156ea61364f5f6336c015064a8525d67a0d40da Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 22:12:49 +0800 Subject: [PATCH 323/640] change title --- index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.md b/index.md index d4f97393..dcfe569b 100644 --- a/index.md +++ b/index.md @@ -1,6 +1,6 @@ --- layout: page -title: The Missing Semester of Your CS Education +title: The Missing Semester of Your CS Education 中文版 --- 学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 From 613a3c4e1ee7739cc4a3d705cc320797f1566b59 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 17 May 2020 23:03:42 +0800 Subject: [PATCH 324/640] add trans status --- _2020/editors.md | 111 +++++++++++++++++++---------------------------- 1 file changed, 45 insertions(+), 
66 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index e1f6efd8..31be03b1 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -8,43 +8,22 @@ video: id: a6Q8Na575qc --- -Writing English words and writing code are very different activities. When -programming, you spend more time switching files, reading, navigating, and -editing code compared to writing a long stream. It makes sense that there are -different types of programs for writing English words versus code (e.g. -Microsoft Word versus Visual Studio Code). - -As programmers, we spend most of our time editing code, so it's worth investing -time mastering an editor that fits your needs. Here's how you learn a new -editor: - -- Start with a tutorial (i.e. this lecture, plus resources that we point out) -- Stick with using the editor for all your text editing needs (even if it slows -you down initially) -- Look things up as you go: if it seems like there should be a better way to do -something, there probably is - -If you follow the above method, fully committing to using the new program for -all text editing purposes, the timeline for learning a sophisticated text -editor looks like this. In an hour or two, you'll learn basic editor functions -such as opening and editing files, save/quit, and navigating buffers. Once -you're 20 hours in, you should be as fast as you were with your old editor. -After that, the benefits start: you will have enough knowledge and muscle -memory that using the new editor saves you time. Modern text editors are fancy -and powerful tools, so the learning never stops: you'll get even faster as you -learn more. - -# Which editor to learn? - -Programmers have [strong opinions](https://en.wikipedia.org/wiki/Editor_war) -about their text editors. - -Which editors are popular today? 
See this [Stack Overflow -survey](https://insights.stackoverflow.com/survey/2019/#development-environments-and-tools) -(there may be some bias because Stack Overflow users may not be representative -of programmers as a whole). [Visual Studio -Code](https://code.visualstudio.com/) is the most popular editor. -[Vim](https://www.vim.org/) is the most popular command-line-based editor. +写作和写代码其实是两项非常不同的活动。当我们编程的时候,会经常在文件间进行切换、阅读、浏览和修改代码,而不是连续编写一大段的文字。因此代码编辑器和文本编辑器种是很不同的两种工具(例如 微软的 Word 与 Visual Studio Code) + +作为程序员,我们大部分时间都花在代码编辑上,所以花点时间掌握某个适合自己的编辑器是非常值得的。通常学习使用一个新的编辑器包含以下步骤: + +- 阅读教程(比如这节课以及我们为您提供的资源) +- 坚持使用它来完成你所有的编辑工作(即使一开始这会让你的工作效率降低) +- 随时查阅:如果某个操作看起来像是有更方便的实现方法,一般情况下真的会有。 + +如果您能够遵循上述步骤,并且坚持使用新的编辑器完成您所有的文本编辑任务,那么学习一个复杂的代码编辑器的过程一般是这样的:头两个小时,您会学习到编辑器的基本操作,例如打开和编辑文件、保存与退出、浏览缓冲区。当学习时间累计达到20个小时之后,您使用新编辑器时当效率应该已经和使用老编辑器一样来。再次之后,其益处开始显现:有了足够的知识和肌肉记忆后,使用新编辑器将大大节省你的时间。而现代文本编辑器都是些复杂且强大的工具,永远有新东西可学:学的越多,效率越高。 + +# 该学哪个编辑器? + +程序员们对自己正在使用的文本编辑器通常有着 [非常强的执念](https://en.wikipedia.org/wiki/Editor_war)。 + + +现在最流行的编辑器是什么? [Stack Overflow的调查](https://insights.stackoverflow.com/survey/2019/#development-environments-and-tools)(这个调查可能并不如我们想象的那样客观,因为Stack Overflow 的用户并不能代表所有程序员 )显示, [Visual Studio Code](https://code.visualstudio.com/)是目前最流行的代码编辑器。而[Vim](https://www.vim.org/) 则是最流行的基于命令行的编辑器。 ## Vim @@ -61,7 +40,7 @@ going to focus on explaining the philosophy of Vim, teaching you the basics, showing you some of the more advanced functionality, and giving you the resources to master the tool. -# Philosophy of Vim +# Vim的哲学 When programming, you spend most of your time reading/editing, not writing. For this reason, Vim is a _modal_ editor: it has different modes for inserting text @@ -73,17 +52,17 @@ avoids using the arrow keys because it requires too much movement. The end result is an editor that can match the speed at which you think. 
-# Modal editing +# 编辑模式 + + +Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改动为基础,因此它具有多种操作模式: -Vim's design is based on the idea that a lot of programmer time is spent -reading, navigating, and making small edits, as opposed to writing long streams -of text. For this reason, Vim has multiple operating modes. +- *正常模式*:在文件中四处移动光标进行修改 +- *插入模式*:插入文本 +- *替换模式*:替换文本 +- *可视(一般,行,块)模式*:选中文本块 +- *命令模式*:用于执行命令 -- **Normal**: for moving around a file and making edits -- **Insert**: for inserting text -- **Replace**: for replacing text -- **Visual** (plain, line, or block) mode: for selecting blocks of text -- **Command-line**: for running a command Keystrokes have different meanings in different operating modes. For example, the letter `x` in insert mode will just insert a literal character 'x', but in @@ -104,9 +83,9 @@ You use the `<ESC>` key a lot when using Vim: consider remapping Caps Lock to Escape ([macOS instructions](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS)). -# Basics +# 基本操作 -## Inserting text +## 插入文本 From normal mode, press `i` to enter insert mode. Now, Vim behaves like any other text editor, until you press `<ESC>` to return to normal mode. This, @@ -126,7 +105,7 @@ different parts of a file at the same time. By default, Vim opens with a single tab, which contains a single window. -## Command-line +## 命令行 Command mode can be entered by typing `:` in normal mode. Your cursor will jump to the command line at the bottom of the screen upon pressing `:`. This mode @@ -142,14 +121,14 @@ has many functionalities, including opening, saving, and closing files, and - `:help :w` opens help for the `:w` command - `:help w` opens help for the `w` movement -# Vim's interface is a programming language +# Vim 的接口其实是一种编程语言 The most important idea in Vim is that Vim's interface itself is a programming language. Keystrokes (with mnemonic names) are commands, and these commands _compose_. This enables efficient movement and edits, especially once the commands become muscle memory. 
-## Movement +## 移动 You should spend most of your time in normal mode, using movement commands to navigate the buffer. Movements in Vim are also called "nouns", because they @@ -168,7 +147,7 @@ refer to chunks of text. - `,` / `;` for navigating matches - Search: `/{regex}`, `n` / `N` for navigating matches -## Selection +## 选择 Visual modes: @@ -178,7 +157,7 @@ Visual modes: Can use movement keys to make selection. -## Edits +## 编辑 Everything that you used to do with the mouse, you now do with the keyboard using editing commands that compose with movement commands. Here's where Vim's @@ -204,7 +183,7 @@ are also called "verbs", because verbs act on nouns. - `p` to paste - Lots more to learn: e.g. `~` flips the case of a character -## Counts +## 计数 You can combine nouns and verbs with a count, which will perform a given action a number of times. @@ -213,7 +192,7 @@ a number of times. - `5j` move 5 lines down - `7dw` delete 7 words -## Modifiers +## 修饰语 You can use modifiers to change the meaning of a noun. Some modifiers are `i`, which means "inner" or "inside", and `a`, which means "around". @@ -279,7 +258,7 @@ made using Vim to how you might make the same edits using another program. Notice how very few keystrokes are required in Vim, allowing you to edit at the speed you think. -# Customizing Vim +# 自定义 Vim Vim is customized through a plain-text configuration file in `~/.vimrc` (containing Vimscript commands). There are probably lots of basic settings that @@ -299,7 +278,7 @@ inspiration, for example, your instructors' Vim configs lots of good blog posts on this topic too. Try not to copy-and-paste people's full configuration, but read it, understand it, and take what you need. -# Extending Vim +# 扩展 Vim There are tons of plugins for extending Vim. Contrary to outdated advice that you might find on the internet, you do _not_ need to use a plugin manager for @@ -323,7 +302,7 @@ Check out [Vim Awesome](https://vimawesome.com/) for more awesome Vim plugins. 
There are also tons of blog posts on this topic: just search for "best Vim plugins". -# Vim-mode in other programs +# 其他程序的 Vim 模式 Many tools support Vim emulation. The quality varies from good to great; depending on the tool, it may not support the fancier Vim features, but most @@ -350,7 +329,7 @@ set editing-mode vi With this setting, for example, the Python REPL will support Vim bindings. -## Others +## 其他 There are even vim keybinding extensions for web [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), some @@ -360,14 +339,14 @@ for Google Chrome and [Tridactyl](https://github.com/tridactyl/tridactyl) for Firefox. You can even get Vim bindings in [Jupyter notebooks](https://github.com/lambdalisue/jupyter-vim-binding). -# Advanced Vim +# Vim 进阶 Here are a few examples to show you the power of the editor. We can't teach you all of these kinds of things, but you'll learn them as you go. A good heuristic: whenever you're using your editor and you think "there must be a better way of doing this", there probably is: look it up online. -## Search and replace +## 搜索和替换 `:s` (substitute) command ([documentation](http://vim.wikia.com/wiki/Search_and_replace)). @@ -376,12 +355,12 @@ better way of doing this", there probably is: look it up online. - `%s/\[.*\](\(.*\))/\1/g` - replace named Markdown links with plain URLs -## Multiple windows +## 多窗口 - `:sp` / `:vsp` to split windows - Can have multiple views of the same buffer. -## Macros +## 宏 - `q{character}` to start recording a macro in register `{character}` - `q` to stop recording @@ -415,7 +394,7 @@ better way of doing this", there probably is: look it up online. - `999@q` - Manually remove last `,` and add `[` and `]` delimiters -# Resources +# 扩展资料 - `vimtutor` is a tutorial that comes installed with Vim - [Vim Adventures](https://vim-adventures.com/) is a game to learn Vim @@ -426,7 +405,7 @@ better way of doing this", there probably is: look it up online. 
- [Vim Screencasts](http://vimcasts.org/) - [Practical Vim](https://pragprog.com/book/dnvim2/practical-vim-second-edition) (book) -# Exercises +# 课后练习 1. Complete `vimtutor`. Note: it looks best in a [80x24](https://en.wikipedia.org/wiki/VT100) (80 columns by 24 lines) From c5373aae4c66d0a115ae00e0953ac3197ffc2629 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 19 May 2020 07:53:42 +0800 Subject: [PATCH 325/640] fix typo --- index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.md b/index.md index dcfe569b..f539903b 100644 --- a/index.md +++ b/index.md @@ -4,7 +4,7 @@ title: The Missing Semester of Your CS Education 中文版 --- 学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 -这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行,使用强大对文本编辑器,使用版本控制系统提供的多种特性等等。 +这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行、使用强大的文本编辑器、使用版本控制系统提供的多种特性等等。 学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 因此,花时间打磨使用这些工具的能力并能够最终熟练、流畅地使用它们是非常有必要的。 From 984ae62106169e78bcb60a624ed64c174097ff34 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 19 May 2020 09:07:50 +0800 Subject: [PATCH 326/640] Update index.md --- index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.md b/index.md index f539903b..b8af0777 100644 --- a/index.md +++ b/index.md @@ -3,7 +3,7 @@ layout: page title: The Missing Semester of Your CS Education 中文版 --- -学校里有很多向您介绍从操作系统到机器学习等计算机科学进阶主题的课程,然而有一个至关重要的主题却很少被包括在这些课程之内,反而留给学生们自己去探索。 +对于计算机教育来说,从操作系统到机器学习,这些高大上课程和主题已经非常多了。然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行、使用强大的文本编辑器、使用版本控制系统提供的多种特性等等。 学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 From 472dad0c3ca60cddd8c9397e463e7a75ff5db215 Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Tue, 19 May 2020 19:23:05 -0400 Subject: [PATCH 327/640] wip editor translate --- _2020/editors.md | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 31be03b1..e49ef53a 100644 --- a/_2020/editors.md +++ 
b/_2020/editors.md @@ -27,18 +27,13 @@ video: ## Vim -All the instructors of this class use Vim as their editor. Vim has a rich -history; it originated from the Vi editor (1976), and it's still being -developed today. Vim has some really neat ideas behind it, and for this reason, -lots of tools support a Vim emulation mode (for example, 1.4 million people -have installed [Vim emulation for VS code](https://github.com/VSCodeVim/Vim)). -Vim is probably worth learning even if you finally end up switching to some -other text editor. - -It's not possible to teach all of Vim's functionality in 50 minutes, so we're -going to focus on explaining the philosophy of Vim, teaching you the basics, -showing you some of the more advanced functionality, and giving you the -resources to master the tool. +这门课的所有教员都使用Vim作为编辑器。Vim有着悠久历史;它始于1976年的Vi编辑器,到现在还在 +不断开发中。Vim有很多聪明的设计思想,所以很多其他工具也支持Vim模式(比如,140万人安装了 +[Vim emulation for VS code](https://github.com/VSCodeVim/Vim))。即使你最后使用 +其他编辑器,Vim也值得学习。 + +由于不可能在50分钟内教授Vim的所有功能, 我们会专注于解释Vim的设计哲学,教你基础知识, +展示一部分高级功能,然后给你掌握这个工具所需要的资源。 # Vim的哲学 From 2e6bb84d61315e0a05e724f48245dc73ed38661e Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Wed, 20 May 2020 21:33:06 -0400 Subject: [PATCH 328/640] wip editor --- _2020/editors.md | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index e49ef53a..52ab872d 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -58,15 +58,11 @@ Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改 - *可视(一般,行,块)模式*:选中文本块 - *命令模式*:用于执行命令 +在不同的操作模式, 键盘敲击的含义也不同。比如,`x` 在插入模式会插入字母`x`,但是在正常模式 +会删除当前光标所在下的字母,在可视模式下则会删除选中文块。 -Keystrokes have different meanings in different operating modes. For example, -the letter `x` in insert mode will just insert a literal character 'x', but in -normal mode, it will delete the character under the cursor, and in visual mode, -it will delete the selection. - -In its default configuration, Vim shows the current mode in the bottom left. 
-The initial/default mode is normal mode. You'll generally spend most of your -time between normal mode and insert mode. +在默认设置下,Vim会在左下角显示当前的模式。 Vim启动时的默认模式是正常模式。通常你会把大部分 +时间花在正常模式和插入模式。 You change modes by pressing `<ESC>` (the escape key) to switch from any mode back to normal mode. From normal mode, enter insert mode with `i`, replace mode From d1b79b833786c94bc49a692b3220174fc63520d0 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 10:59:20 +0800 Subject: [PATCH 329/640] add Project Status --- README.md | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/README.md b/README.md index f4573f17..5c616e3a 100644 --- a/README.md +++ b/README.md @@ -16,3 +16,24 @@ bundle exec jekyll serve -w ## License All the content in this course, including the website source code, lecture notes, exercises, and lecture videos is licensed under Attribution-NonCommercial-ShareAlike 4.0 International [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). See [here](https://missing.csail.mit.edu/license) for more information on contributions or translations. + +----------------- + +## Project Status + +To contribute to this tanslation project, please book your topic in this table and create a issue accordingly.
+ +| lectures | translator | status | +| ---- | ---- |---- | +| 课程概览与shell | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | +| Shell 工具和脚本 | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | +| 编辑器 (Vim) | [@stechu](https://github.com/stechu) | In-progress | +| 数据清理 | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| 命令行环境 | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| 版本控制(Git) | | TO-DO | +| 调试及性能分析 | | TO-DO | +| 元编程 | | TO-DO | +| 安全和密码学 | | TO-DO | +| 大杂烩 | | TO-DO | +| Q&A | | TO-DO | +| About | [@Binlogo](https://github.com/Binlogo) | In-progress | From e79b60d1f43153a20cb001086b5a255a441cb23c Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 11:01:56 +0800 Subject: [PATCH 330/640] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 5c616e3a..0213895a 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,7 @@ All the content in this course, including the website source code, lecture notes ## Project Status -To contribute to this tanslation project, please book your topic in this table and create a issue accordingly. +To contribute to this tanslation project, please book your topic by creating an issue and I will update this table accordingly to avoid rework. 
| lectures | translator | status | | ---- | ---- |---- | From b27fa500b5deb5cf82fec10ec82e59fda2a0c75c Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 11:27:23 +0800 Subject: [PATCH 331/640] Update README.md --- README.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 0213895a..a951bcde 100644 --- a/README.md +++ b/README.md @@ -25,15 +25,15 @@ To contribute to this tanslation project, please book your topic by creating an | lectures | translator | status | | ---- | ---- |---- | -| 课程概览与shell | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| Shell 工具和脚本 | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| 编辑器 (Vim) | [@stechu](https://github.com/stechu) | In-progress | -| 数据清理 | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | -| 命令行环境 | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | -| 版本控制(Git) | | TO-DO | -| 调试及性能分析 | | TO-DO | -| 元编程 | | TO-DO | -| 安全和密码学 | | TO-DO | -| 大杂烩 | | TO-DO | -| Q&A | | TO-DO | -| About | [@Binlogo](https://github.com/Binlogo) | In-progress | +| [course-shell.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/course-shell.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | +| [shell-tools.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/shell-tools.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | +| [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress | +| [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng 
AI](https://github.com/hanxiaomax) | In-progress | +| [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | | TO-DO | +| [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO | +| [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO | +| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO | +| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | +| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | | TO-DO | +| [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress | From a1a3f2f6fba54a0da2a17a2c8731f6b236bdba5b Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 13:45:33 +0800 Subject: [PATCH 332/640] Update data-wrangling.md --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index fe24629c..07212ecc 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "数据清理" +title: "数据整理" date: 2019-01-16 ready: false video: From 746e6f0065c83d36600e6cb66e1ce74fcc733951 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 13:53:25 +0800 Subject: [PATCH 333/640] Update issue templates --- .github/ISSUE_TEMPLATE/translation.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE/translation.md diff --git a/.github/ISSUE_TEMPLATE/translation.md b/.github/ISSUE_TEMPLATE/translation.md new file mode 100644 index 
00000000..22b6027c --- /dev/null +++ b/.github/ISSUE_TEMPLATE/translation.md @@ -0,0 +1,13 @@ +--- +name: translation +about: choose the file you plan to translate +title: '' +labels: trans +assignees: '' + +--- + +Filename : +Estimated time of finish : + +Note: Please make sure you can finish it within two weeks. From 8e614c7dd3d317701f3ec7bfc7a750b1aac64315 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 14:30:11 +0800 Subject: [PATCH 334/640] Update data-wrangling.md --- _2020/data-wrangling.md | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 07212ecc..9df23c38 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -12,19 +12,13 @@ video: [Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) {% endcomment %} -Have you ever wanted to take data in one format and turn it into a -different format? Of course you have! That, in very general terms, is -what this lecture is all about. Specifically, massaging data, whether in -text or binary format, until you end up with exactly what you wanted. - -We've already seen some basic data wrangling in past lectures. Pretty -much any time you use the `|` operator, you are performing some kind of -data wrangling. Consider a command like `journalctl | grep -i intel`. It -finds all system log entries that mention Intel (case insensitive). You -may not think of it as wrangling data, but it is going from one format -(your entire system log) to a format that is more useful to you (just -the intel log entries). Most data wrangling is about knowing what tools -you have at your disposal, and how to combine them. + +您是否曾经有过这样的需求,将某种格式存储的数据转换成另外一种格式? 肯定有过,对吧! 
+一般来讲,这正是我们这节课所要讲授的主要内容。具体来讲,我们需要不断地对数据进行处理,直到得到我们想要的最终结果。 + +在之前的课程中,其实我们已经接触到了一些数据整理的基本技术。可以这么说,每当您使用管道运算符的时候,其实就是在进行某种形式的数据整理。 +例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 + Let's start from the beginning. To wrangle data, we need two things: data to wrangle, and something to do with it. Logs often make for a good From bfb7e73ee27dda65164dafef0da9b8da44f701d2 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Thu, 21 May 2020 16:16:24 +0800 Subject: [PATCH 335/640] Update data-wrangling.md --- _2020/data-wrangling.md | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 9df23c38..f232b3f9 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -20,27 +20,20 @@ video: 例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 -Let's start from the beginning. To wrangle data, we need two things: -data to wrangle, and something to do with it. Logs often make for a good -use-case, because you often want to investigate things about them, and -reading the whole thing isn't feasible. Let's figure out who's trying to -log into my server by looking at my server's log: +让我们从头讲起。既然需恶习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息。这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: ```bash ssh myserver journalctl ``` -That's far too much stuff. Let's limit it to ssh stuff: +内容太多了。现在让我们把涉及sshd的信息过滤出来: ```bash ssh myserver journalctl | grep sshd ``` -Notice that we're using a pipe to stream a _remote_ file through `grep` -on our local computer! `ssh` is magical, and we will talk more about it -in the next lecture on the command-line environment. This is still way -more stuff than we wanted though. 
And pretty hard to read. Let's do -better: +注意,这里我们使用管道将一个远程服务器上的文件传递给本机的 `grep` 程序! +`ssh` 太牛了,下一节课我们会讲授命令行环境,届时我们会详细讨论ssh的相关内容。此时我们打印出的内容,仍然比我们需要的要多得多,读起来也非常费劲。我们来改进一下: ```bash ssh myserver 'journalctl | grep sshd | grep "Disconnected from"' | less From a0edcb7c19b239301a12c6cc5fb0b0eb99b58ad3 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Thu, 21 May 2020 22:52:44 +0800 Subject: [PATCH 336/640] update trans --- _2020/data-wrangling.md | 89 +++++++++++++---------------------------- 1 file changed, 27 insertions(+), 62 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index f232b3f9..927283fb 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -17,8 +17,8 @@ video: 一般来讲,这正是我们这节课所要讲授的主要内容。具体来讲,我们需要不断地对数据进行处理,直到得到我们想要的最终结果。 在之前的课程中,其实我们已经接触到了一些数据整理的基本技术。可以这么说,每当您使用管道运算符的时候,其实就是在进行某种形式的数据整理。 -例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 +例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 让我们从头讲起。既然需恶习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息。这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: @@ -26,42 +26,30 @@ video: ssh myserver journalctl ``` -内容太多了。现在让我们把涉及sshd的信息过滤出来: +内容太多了。现在让我们把涉及 sshd 的信息过滤出来: ```bash ssh myserver journalctl | grep sshd ``` 注意,这里我们使用管道将一个远程服务器上的文件传递给本机的 `grep` 程序! -`ssh` 太牛了,下一节课我们会讲授命令行环境,届时我们会详细讨论ssh的相关内容。此时我们打印出的内容,仍然比我们需要的要多得多,读起来也非常费劲。我们来改进一下: +`ssh` 太牛了,下一节课我们会讲授命令行环境,届时我们会详细讨论 `ssh` 的相关内容。此时我们打印出的内容,仍然比我们需要的要多得多,读起来也非常费劲。我们来改进一下: ```bash ssh myserver 'journalctl | grep sshd | grep "Disconnected from"' | less ``` -Why the additional quoting? Well, our logs may be quite large, and it's -wasteful to stream it all to our computer and then do the filtering. 
-Instead, we can do the filtering on the remote server, and then massage -the data locally. `less` gives us a "pager" that allows us to scroll up -and down through the long output. To save some additional traffic while -we debug our command-line, we can even stick the current filtered logs -into a file so that we don't have to access the network while -developing: +多出来的引号是什么作用呢?这么说把,我们的日志是一个非常大的文件,把这么大的文件流直接传输到我们本地的电脑上再进行过滤是对流量的一种浪费。因此我们采取另外一种方式,我们先在远端机器上过滤文本内容,然后再将结果传输到本机。 `less` 为我们创建来一个文件分页器,使我们可以通过翻页的方式浏览较长的文本。为了进一步节省流量,我们甚至可以将当前过滤出的日志保存到文件中,这样后续就不需要再次通过网络访问该文件了: + ```console $ ssh myserver 'journalctl | grep sshd | grep "Disconnected from"' > ssh.log $ less ssh.log ``` -There's still a lot of noise here. There are _a lot_ of ways to get rid -of that, but let's look at one of the most powerful tools in your -toolkit: `sed`. +过滤结果中仍然包含不少没用的数据。我们有很多办法可以删除这些无用的数据,但是让我们先研究一下 `sed` 这个非常强大的工具。 -`sed` is a "stream editor" that builds on top of the old `ed` editor. In -it, you basically give short commands for how to modify the file, rather -than manipulate its contents directly (although you can do that too). -There are tons of commands, but one of the most common ones is `s`: -substitution. For example, we can write: +`sed` 是一个基于文本编辑器`ed`构建的"流编辑器" is a "stream editor" that builds on top of the old `ed` editor. 在 `sed` 中,您基本上是利用一些简短的命令来修改文件,而不是直接操作文件的内容(尽管您也可以选择这样做)。相关的命令行非常多,但是最常用的是 `s`,即*替换*命令,例如我们可以这样写: ```bash ssh myserver journalctl @@ -70,57 +58,34 @@ ssh myserver journalctl | sed 's/.*Disconnected from //' ``` -What we just wrote was a simple _regular expression_; a powerful -construct that lets you match text against patterns. The `s` command is -written on the form: `s/REGEX/SUBSTITUTION/`, where `REGEX` is the -regular expression you want to search for, and `SUBSTITUTION` is the -text you want to substitute matching text with. 
- -## Regular expressions - -Regular expressions are common and useful enough that it's worthwhile to -take some time to understand how they work. Let's start by looking at -the one we used above: `/.*Disconnected from /`. Regular expressions are -usually (though not always) surrounded by `/`. Most ASCII characters -just carry their normal meaning, but some characters have "special" -matching behavior. Exactly which characters do what vary somewhat -between different implementations of regular expressions, which is a -source of great frustration. Very common patterns are: - - - `.` means "any single character" except newline - - `*` zero or more of the preceding match - - `+` one or more of the preceding match - - `[abc]` any one character of `a`, `b`, and `c` - - `(RX1|RX2)` either something that matches `RX1` or `RX2` - - `^` the start of the line - - `$` the end of the line - -`sed`'s regular expressions are somewhat weird, and will require you to -put a `\` before most of these to give them their special meaning. Or -you can pass `-E`. - -So, looking back at `/.*Disconnected from /`, we see that it matches -any text that starts with any number of characters, followed by the -literal string "Disconnected from ". Which is what we wanted. But -beware, regular expressions are trixy. What if someone tried to log in -with the username "Disconnected from"? 
We'd have: +上面这段命令中,我们使用了一段简单的*正则表达式*。正则表达式是一种非常强大工具,可以让我们基于某种模式来对字符串进行匹配。`s` 命令的语法如下:`s/REGEX/SUBSTITUTION/`, 其中 `REGEX` 部分是我们需要使用的正则表达式,而 `SUBSTITUTION` 是用于替换匹配结果的文本。 + +## 正则表达式 + +正则表达式非常常见也非常有用,值得您花些时间去理解它。让我们从这一句正则表达式开始学习: `/.*Disconnected from /`。正则表达式通常以(尽管并不总是) `/`开始和结束。大多数的ASCII字符都表示它们本来的含义,但是有一些字符确实有表示匹配行为的“特殊”含义。不同字符所表示的含义,根据正则表达式的实现方式不同,也会有所变化,这一点确实令人沮丧。常见的模式有: + + - `.` 除空格之外的"任意单个字符" + - `*` 匹配前面字符零次或多次 + - `+` 匹配前面字符一次或多次 + - `[abc]` 匹配 `a`, `b` 和 `c` 中的任意一个 + - `(RX1|RX2)` 任何能够匹配`RX1` 或 `RX2`的结果 + - `^` 行首 + - `$` 行尾 + +`sed` 的正则表达式有些时候是比较奇怪的,它需要你在这些模式前添加`\`才能使其具有特殊含义。或者,您也可以添加`-E`选项来支持这些匹配。 + +因此,回过头我们再看`/.*Disconnected from /`,我们会发现这个正则表达式可以匹配任何以若干任意字符开头,并接着包含"Disconnected from "的字符串。这也正式我们所希望的。但是请注意,正则表达式并不容易写对。如果有人将 "Disconnected from" 作为自己的用户名会怎样呢? ``` Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user Disconnected from 46.97.239.16 port 55920 [preauth] ``` - -What would we end up with? Well, `*` and `+` are, by default, "greedy". -They will match as much text as they can. So, in the above, we'd end up -with just +我们的正则表达式匹配结果是怎样的呢?`*` 和 `+` 在默认情况下是贪婪模式,也就是说,它们会尽可能多的匹配文本。因此对上述字符串的匹配结果如下: ``` 46.97.239.16 port 55920 [preauth] ``` - -Which may not be what we wanted. In some regular expression -implementations, you can just suffix `*` or `+` with a `?` to make them -non-greedy, but sadly `sed` doesn't support that. 
We _could_ switch to -perl's command-line mode though, which _does_ support that construct: +这可不上我们想要的结果。对于某些正则表达式的实现来说,您可以给 `*` 或 `+` 增加一个`?`后缀使其变成非贪婪模式,但是很可惜`sed` 并不支持该后缀。不过,我们可以切换到 +perl 的命令行模式,该模式支持编写这样的正则表达式: ```bash perl -pe 's/.*?Disconnected from //' From 9efec6b01587c37c36aab1097132db52ffed838a Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 22 May 2020 12:52:09 +0800 Subject: [PATCH 337/640] update trans --- _2020/data-wrangling.md | 103 +++++++++++++--------------------------- 1 file changed, 33 insertions(+), 70 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 927283fb..535c4404 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -79,7 +79,7 @@ ssh myserver journalctl ``` Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user Disconnected from 46.97.239.16 port 55920 [preauth] ``` -我们的正则表达式匹配结果是怎样的呢?`*` 和 `+` 在默认情况下是贪婪模式,也就是说,它们会尽可能多的匹配文本。因此对上述字符串的匹配结果如下: +正则表达式会如何匹配?`*` 和 `+` 在默认情况下是贪婪模式,也就是说,它们会尽可能多的匹配文本。因此对上述字符串的匹配结果如下: ``` 46.97.239.16 port 55920 [preauth] @@ -91,63 +91,36 @@ perl 的命令行模式,该模式支持编写这样的正则表达式: perl -pe 's/.*?Disconnected from //' ``` -We'll stick to `sed` for the rest of this, because it's by far the more -common tool for these kinds of jobs. `sed` can also do other handy -things like print lines following a given match, do multiple -substitutions per invocation, search for things, etc. But we won't cover -that too much here. `sed` is basically an entire topic in and of itself, -but there are often better tools. +让我们回到 `sed` 命令并使用它完成后续的任务,毕竟对于这一类任务,`sed`是最常见的工具。`sed` 还可以非常方便的做一些事情,例如打印匹配后的内容,一次调用中进行多次替换搜索等。但是这些内容我们并不会在此进行介绍。`sed` 本身是一个非常全能的工具,但是在具体功能上往往能找到更好的工作作为替代品。 -Okay, so we also have a suffix we'd like to get rid of. How might we do -that? It's a little tricky to match just the text that follows the -username, especially if the username can have spaces and such! What we -need to do is match the _whole_ line: +好的,我们还需要去掉用户名后面的后缀,应该如何操作呢? 
+想要匹配用户名后面的文本,尤其是当这里的用户名可以包含空格时,这个问题变得非常棘手!这里我们需要做的是匹配*一整行*: ```bash | sed -E 's/.*Disconnected from (invalid |authenticating )?user .* [^ ]+ port [0-9]+( \[preauth\])?$//' ``` +让我们借助正则表达式在线调试工具[regex +debugger](https://regex101.com/r/qqbZqh/2)来理解这段表达式。OK,开始的部分和以前是一样的。随后,我们匹配两种类型的“user”(在日志中基于两种前缀区分)。再然后我们匹配属于用户名的所有字符。接着,再匹配任意一个单词(`[^ ]+` 会匹配任意非空切不包含空格的序列)。紧接着后面匹配单词“port”和它后面的遗传数字,以及可能存在的后缀 +`[preauth]`,最后再匹配行尾。 -Let's look at what's going on with a [regex -debugger](https://regex101.com/r/qqbZqh/2). Okay, so the start is still -as before. Then, we're matching any of the "user" variants (there are -two prefixes in the logs). Then we're matching on any string of -characters where the username is. Then we're matching on any single word -(`[^ ]+`; any non-empty sequence of non-space characters). Then the word -"port" followed by a sequence of digits. Then possibly the suffix -`[preauth]`, and then the end of the line. - -Notice that with this technique, as username of "Disconnected from" -won't confuse us any more. Can you see why? - -There is one problem with this though, and that is that the entire log -becomes empty. We want to _keep_ the username after all. For this, we -can use "capture groups". Any text matched by a regex surrounded by -parentheses is stored in a numbered capture group. These are available -in the substitution (and in some engines, even in the pattern itself!) -as `\1`, `\2`, `\3`, etc. So: + +注意,这样做的话,即使用户名是“Disconnected from”,对匹配结果也不会有任何影响,您知道这是为什么吗?。你能发现是为什么吗? + +问题还没有完全解决,日志的内容全部被替换成了空字符串,整个日志的内容因此都被删除了。我们实际上希望能够将用户名*保留*下来。对此,我们可以使用“捕获组(capture groups)”来完成。被圆括号内的正则表达式匹配到的文本,都会被存入一系列以编号区分的捕获组中。捕获组的内容可以在替换字符串时使用(有些正则表达式的引擎甚至支持替换表达式本身),例如`\1`、 `\2`、`\3`等等,因此可以使用如下命令: ```bash | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' ``` -As you can probably imagine, you can come up with _really_ complicated -regular expressions. 
For example, here's an article on how you might -match an [e-mail -address](https://www.regular-expressions.info/email.html). It's [not -easy](https://emailregex.com/). And there's [lots of -discussion](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982). -And people have [written -tests](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php). -And [test matrices](https://mathiasbynens.be/demo/url-regex). You can -even write a regex for determining if a given number [is a prime -number](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/). +想必您已经意识到了,为了完成某种匹配,我们最终可能会写出非常复杂的正则表达式。例如,这里有一篇关于如何匹配电子邮箱地址的文章[e-mail address](https://www.regular-expressions.info/email.html),匹配电子邮箱可一点[也不简单](https://emailregex.com/)。网络上还有很多关于如何匹配电子邮箱地址的[讨论](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982)。人们还为其编写了[测试用例](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php). +及 [测试矩阵](https://mathiasbynens.be/demo/url-regex)。您甚至可以编写一个用于判断一个数[是否为质数](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/)的正则表达式。 + -Regular expressions are notoriously hard to get right, but they are also -very handy to have in your toolbox! +正则表达式是出了名的难以写对,但是它仍然会是您强大的常备工具之一。 -## Back to data wrangling +## 回到数据整理 -Okay, so we now have +OK,现在我们有如下表达式: ```bash ssh myserver journalctl @@ -156,14 +129,11 @@ ssh myserver journalctl | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' ``` -`sed` can do all sorts of other interesting things, like injecting text -(with the `i` command), explicitly printing lines (with the `p` -command), selecting lines by index, and lots of other things. Check `man -sed`! +`sed` 还可以做很多各种各样有趣的事情,例如文本注入: +(使用 `i` 命令),打印特定的行 (使用 `p` +命令),基于索引选择特定行等等。详情请见`man sed`! -Anyway. 
What we have now gives us a list of all the usernames that have -attempted to log in. But this is pretty unhelpful. Let's look for common -ones: +现在,我们已经得到了一个包含用户名的列表,列表中的用户都曾经尝试过登陆我们的系统。但这还不够,让我们过滤出那些最常出现的用户: ```bash ssh myserver journalctl @@ -173,10 +143,7 @@ ssh myserver journalctl | sort | uniq -c ``` -`sort` will, well, sort its input. `uniq -c` will collapse consecutive -lines that are the same into a single line, prefixed with a count of the -number of occurrences. We probably want to sort that too and only keep -the most common logins: +`sort` 会对其输入数据进行排序。`uniq -c` 会把连续出现的行折叠为一行并使用出现次数作为前缀。我们希望按照出现次数排序,过滤出最常登陆的用户: ```bash ssh myserver journalctl @@ -187,18 +154,18 @@ ssh myserver journalctl | sort -nk1,1 | tail -n10 ``` -`sort -n` will sort in numeric (instead of lexicographic) order. `-k1,1` -means "sort by only the first whitespace-separated column". The `,n` -part says "sort until the `n`th field, where the default is the end of -the line. In this _particular_ example, sorting by the whole line -wouldn't matter, but we're here to learn! +`sort -n` 会按照数字顺序对输入进行排序(默认情况下是按照字典序排序 +`-k1,1` 则表示“仅基于以空格分割的第一列进行排序”。`,n` 部分表示“仅排序到第n个部分”,默认情况是到行尾。就本例来说,针对整个行进行排序也没有任何问题,我们这里主要是为了学习这一用法! -If we wanted the _least_ common ones, we could use `head` instead of -`tail`. There's also `sort -r`, which sorts in reverse order. +如果我们希望得到登陆次数最少的用户,我们可以使用 `head` 来代替 +`tail`。或者使用`sort -r`来进行倒序排序。 Okay, so that's pretty cool, but we'd sort of like to only give the usernames, and maybe not one per line? +好,相当不错,但我们有点想只取用户名,而且不要一行一个地显示。 + + ```bash ssh myserver journalctl | grep sshd @@ -209,17 +176,13 @@ ssh myserver journalctl | awk '{print $2}' | paste -sd, ``` -Let's start with `paste`: it lets you combine lines (`-s`) by a given -single-character delimiter (`-d`). But what's this `awk` business? +我们可以利用 `paste`命令来合并行(`-s`),并指定一个分隔符进行分割 (`-d`),那么`awk`的作用又是什么呢? 
-## awk -- another editor +## awk -- 另外一种编辑器 -`awk` is a programming language that just happens to be really good at -processing text streams. There is _a lot_ to say about `awk` if you were -to learn it properly, but as with many other things here, we'll just go -through the basics. +`awk` 其实是一种编程语言,只不过它碰巧非常善于处理文本。关于 `awk` 可以介绍的内容太多了,限于篇幅,这里我们仅介绍一些基础知识。 -First, what does `{print $2}` do? Well, `awk` programs take the form of +首先, `{print $2}` 的作用是什么? `awk` 程序 take the form of an optional pattern plus a block saying what to do if the pattern matches a given line. The default pattern (which we used above) matches all lines. Inside the block, `$0` is set to the entire line's contents, From d0215f3dbb445c76df2a5cad6c7292e33286eb19 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 22 May 2020 16:43:33 +0800 Subject: [PATCH 338/640] finish --- _2020/data-wrangling.md | 126 ++++++++++++---------------------------- 1 file changed, 36 insertions(+), 90 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 535c4404..4b94ad42 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -2,7 +2,7 @@ layout: lecture title: "数据整理" date: 2019-01-16 -ready: false +ready: true video: aspect: 56.25 id: sz_dsktIjt4 @@ -14,7 +14,7 @@ video: 您是否曾经有过这样的需求,将某种格式存储的数据转换成另外一种格式? 肯定有过,对吧! -一般来讲,这正是我们这节课所要讲授的主要内容。具体来讲,我们需要不断地对数据进行处理,直到得到我们想要的最终结果。 +这也正是我们这节课所要讲授的主要内容。具体来讲,我们需要不断地对数据进行处理,直到得到我们想要的最终结果。 在之前的课程中,其实我们已经接触到了一些数据整理的基本技术。可以这么说,每当您使用管道运算符的时候,其实就是在进行某种形式的数据整理。 @@ -182,30 +182,18 @@ ssh myserver journalctl `awk` 其实是一种编程语言,只不过它碰巧非常善于处理文本。关于 `awk` 可以介绍的内容太多了,限于篇幅,这里我们仅介绍一些基础知识。 -首先, `{print $2}` 的作用是什么? `awk` 程序 take the form of -an optional pattern plus a block saying what to do if the pattern -matches a given line. The default pattern (which we used above) matches -all lines. 
Inside the block, `$0` is set to the entire line's contents, -and `$1` through `$n` are set to the `n`th _field_ of that line, when -separated by the `awk` field separator (whitespace by default, change -with `-F`). In this case, we're saying that, for every line, print the -contents of the second field, which happens to be the username! +首先, `{print $2}` 的作用是什么? `awk` 程序接受一个模式串(可选),以及一个代码块,指定当模式匹配时应该做何种操作。默认当模式串即匹配所有行(上面命令中当用法)。 +在代码块中,`$0` 表示正行的内容,`$1` 到 `$n` 为一行中的n个区域,区域的分割基于 `awk` 的域分隔符(默认是空格,可以通过`-F`来修改)。在这个例子中,我们的代码意思是,对于每一行文本,打印其第二个部分,也就是用户名。 -Let's see if we can do something fancier. Let's compute the number of -single-use usernames that start with `c` and end with `e`: +让我们看看,还有什么炫酷的操作我们可以做。让我们统计一下所有以`c` 开头,以 `e` 结尾,并且仅尝试过一次登陆的用户。 ```bash | awk '$1 == 1 && $2 ~ /^c[^ ]*e$/ { print $2 }' | wc -l ``` -There's a lot to unpack here. First, notice that we now have a pattern -(the stuff that goes before `{...}`). The pattern says that the first -field of the line should be equal to 1 (that's the count from `uniq --c`), and that the second field should match the given regular -expression. And the block just says to print the username. We then count -the number of lines in the output with `wc -l`. +让我们好好分析一下。首先,注意这次我们为`awk`指定来一个匹配模式串(也就是`{...}`前面的那部分内容)。该匹配要求文本的第一部分需要等于1(这部分刚好是`uniq -c`得到的计数值),然后其第二部分必须满足给定的一个正则表达式。代码快中的内容则表示打印用户名。然后我们使用 `wc -l` 统计输出结果的行数。 -However, `awk` is a programming language, remember? +不过,既然 `awk` 是一种编程语言,那么则可以这样: ```awk BEGIN { rows = 0 } @@ -213,31 +201,26 @@ $1 == 1 && $2 ~ /^c[^ ]*e$/ { rows += $1 } END { print rows } ``` -`BEGIN` is a pattern that matches the start of the input (and `END` -matches the end). Now, the per-line block just adds the count from the -first field (although it'll always be 1 in this case), and then we print -it out at the end. 
In fact, we _could_ get rid of `grep` and `sed` -entirely, because `awk` [can do it -all](https://backreference.org/2010/02/10/idiomatic-awk/), but we'll -leave that as an exercise to the reader. -## Analyzing data +`BEGIN` 也是一种模式,它会匹配输入的开头( `END` 则匹配结尾)。然后,对每一行第一个部分进行累加,最后将结果输出。事实上,我们完全可以抛弃 `grep` 和 `sed` ,因为 `awk` 就可以[解决所有问题](https://backreference.org/2010/02/10/idiomatic-awk)。至于怎么做,就留给读者们做课后练习吧。 -You can do math! For example, add the numbers on each line together: + +## 分析数据 + +想做数学计算也是可以的!例如这样,您可以将每行的数字加起来: ```bash | paste -sd+ | bc -l ``` -Or produce more elaborate expressions: +下面这种更加复杂的表达式也可以: ```bash echo "2*($(data | paste -sd+))" | bc -l ``` -You can get stats in a variety of ways. -[`st`](https://github.com/nferraz/st) is pretty neat, but if you already -have R: +您可以通过多种方式获取统计数据。如果已经安装了R语言,[`st`](https://github.com/nferraz/st)是个不错的选择: + ```bash ssh myserver journalctl @@ -248,13 +231,9 @@ ssh myserver journalctl | awk '{print $1}' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)' ``` -R is another (weird) programming language that's great at data analysis -and [plotting](https://ggplot2.tidyverse.org/). We won't go into too -much detail, but suffice to say that `summary` prints summary statistics -about a matrix, and we computed a matrix from the input stream of -numbers, so R gives us the statistics we wanted! +R 也是一种编程语言,它非常适合被用来进行数据分析和[绘制图表](https://ggplot2.tidyverse.org/)。这里我们不会讲的特别详细, 您只需要知道`summary` 可以打印统计结果,我们通过输入的信息计算出一个矩阵,然后R语言就可以得到我们想要的统计数据。 -If you just want some simple plotting, `gnuplot` is your friend: +如果您希望绘制一些简单的图表, `gnuplot` 可以帮助到您: ```bash ssh myserver journalctl @@ -266,23 +245,18 @@ ssh myserver journalctl | gnuplot -p -e 'set boxwidth 0.5; plot "-" using 1:xtic(2) with boxes' ``` -## Data wrangling to make arguments +## 利用数据整理来确定参数 + +有时候您要利用数据整理技术从一长串列表里找出你所需要安装或移除的东西。我们之前讨论的相关技术配合 `xargs` 即可实现: -Sometimes you want to do data wrangling to find things to install or -remove based on some longer list. 
The data wrangling we've talked about -so far + `xargs` can be a powerful combo: ```bash rustup toolchain list | grep nightly | grep -vE "nightly-x86" | sed 's/-x86.*//' | xargs rustup toolchain uninstall ``` -## Wrangling binary data +## 整理二进制数据 -So far, we have mostly talked about wrangling textual data, but pipes -are just as useful for binary data. For example, we can use ffmpeg to -capture an image from our camera, convert it to grayscale, compress it, -send it to a remote machine over SSH, decompress it there, make a copy, -and then display it. +虽然到目前为止我们的讨论都是基于文本数据,但对于二进制文件其实同样有用。例如我们可以用 ffmpeg 从相机中捕获一张图片,将其转换成灰度图后通过SSH将压缩后的文件发送到远端服务器,并在那里解压、存档并显示。 ```bash ffmpeg -loglevel panic -i /dev/video0 -frames 1 -f image2 - @@ -291,57 +265,29 @@ ffmpeg -loglevel panic -i /dev/video0 -frames 1 -f image2 - | ssh mymachine 'gzip -d | tee copy.jpg | env DISPLAY=:0 feh -' ``` -# Exercises - -1. Take this [short interactive regex tutorial](https://regexone.com/). -2. Find the number of words (in `/usr/share/dict/words`) that contain at - least three `a`s and don't have a `'s` ending. What are the three - most common last two letters of those words? `sed`'s `y` command, or - the `tr` program, may help you with case insensitivity. How many - of those two-letter combinations are there? And for a challenge: - which combinations do not occur? -3. To do in-place substitution it is quite tempting to do something like - `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`. However this is a - bad idea, why? Is this particular to `sed`? Use `man sed` to find out - how to accomplish this. -4. Find your average, median, and max system boot time over the last ten - boots. Use `journalctl` on Linux and `log show` on macOS, and look - for log timestamps near the beginning and end of each boot. On Linux, - they may look something like: +# 课后练习 + +1. 学习一下这篇简短的 [交互式正则表达式教程](https://regexone.com/). +2. 统计words文件 (`/usr/share/dict/words`) 中包含至少三个`a` 且不以`'s` 结尾的单词个数。这些单词中,出现频率最高的末尾两个字母是什么? 
`sed`的 `y`命令,或者 `tr` 程序也许可以帮你解决大小写的问题。共存在多少种词尾两字母组合?还有一个很 有挑战性的问题:哪个组合从未出现过? +3. 进行原地替换听上去很有诱惑力,例如: + `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`。但是这并不是一个明知的做法,为什么呢?还是说只有 `sed`是这样的? 查看 `man sed` 来完成这个问题 + +4. 找出您最近十次开机的开机时间平均数、中位数和最长时间。在Linux上需要用到 `journalctl` ,而在 macOS 上使用 `log show`。找到每次起到开始和结束时的时间戳。在Linux上类似这样操作: ``` Logs begin at ... ``` - and + 和 ``` systemd[577]: Startup finished in ... ``` - On macOS, [look - for](https://eclecticlight.co/2018/03/21/macos-unified-log-3-finding-your-way/): + 在 macOS 上, [查找](https://eclecticlight.co/2018/03/21/macos-unified-log-3-finding-your-way/): + ``` === system boot: ``` - and + 和 ``` Previous shutdown cause: 5 ``` -5. Look for boot messages that are _not_ shared between your past three - reboots (see `journalctl`'s `-b` flag). Break this task down into - multiple steps. First, find a way to get just the logs from the past - three boots. There may be an applicable flag on the tool you use to - extract the boot logs, or you can use `sed '0,/STRING/d'` to remove - all lines previous to one that matches `STRING`. Next, remove any - parts of the line that _always_ varies (like the timestamp). Then, - de-duplicate the input lines and keep a count of each one (`uniq` is - your friend). And finally, eliminate any line whose count is 3 (since - it _was_ shared among all the boots). -6. Find an online data set like [this - one](https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm), [this - one](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-1). - or maybe one [from - here](https://www.springboard.com/blog/free-public-data-sets-data-science-project/). - Fetch it using `curl` and extract out just two columns of numerical - data. If you're fetching HTML data, - [`pup`](https://github.com/EricChiang/pup) might be helpful. For JSON - data, try [`jq`](https://stedolan.github.io/jq/). 
Find the min and - max of one column in a single command, and the sum of the difference - between the two columns in another. +5. 查看之前三次重启启动信息中不同的部分 (参见 `journalctl`的`-b` 选项)。将这一任务分为几个步骤,首先获取之前三次启动的启动日志,也许获取启动日志的命令就有合适的选项可以帮助您提取前三次启动的日志,亦或者您可以使用`sed '0,/STRING/d'` 来删除 `STRING`匹配到的字符串前面的全部内容。然后,过滤掉每次都不相同的部分,例如时间戳。下一步,重复记录输入行并对其计数(可以使用`uniq` )。最后,删除所有出现过3次的内容(因为这些内容上三次启动日志中的重复部分)。 +6. 在网上找一个类似 [这个](https://stats.wikimedia.org/EN/TablesWikipediaZZ.htm) 或者 [这个](https://ucr.fbi.gov/crime-in-the-u.s/2016/crime-in-the-u.s.-2016/topic-pages/tables/table-1)的数据集。或者从 [这里](https://www.springboard.com/blog/free-public-data-sets-data-science-project/)找一些。使用 `curl` 获取数据集并提取其中两列数据,如果您想要获取的是HTML数据,那么[`pup`](https://github.com/EricChiang/pup)可能会更有帮助。对于JSON类型的数据,可以试试[`jq`](https://stedolan.github.io/jq/)。请使用一条指令来找出其中一列的最大值和最小值,用另外一条指令计算两列之间差的总和。 From d17e20d67bacb194d482c76bfe4d35812fd68515 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 22 May 2020 16:55:07 +0800 Subject: [PATCH 339/640] fix typo and so on --- _2020/data-wrangling.md | 45 +++++++++++++++++------------------------ 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 4b94ad42..b49f9fd3 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -20,7 +20,7 @@ video: 例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 -让我们从头讲起。既然需恶习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息。这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: +让我们从头讲起。既然需学习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息,这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: ```bash ssh myserver journalctl @@ -39,7 +39,7 @@ ssh myserver journalctl | grep sshd ssh myserver 'journalctl | grep sshd | grep "Disconnected from"' | less ``` 
-多出来的引号是什么作用呢?这么说把,我们的日志是一个非常大的文件,把这么大的文件流直接传输到我们本地的电脑上再进行过滤是对流量的一种浪费。因此我们采取另外一种方式,我们先在远端机器上过滤文本内容,然后再将结果传输到本机。 `less` 为我们创建来一个文件分页器,使我们可以通过翻页的方式浏览较长的文本。为了进一步节省流量,我们甚至可以将当前过滤出的日志保存到文件中,这样后续就不需要再次通过网络访问该文件了: +多出来的引号是什么作用呢?这么说吧,我们的日志是一个非常大的文件,把这么大的文件流直接传输到我们本地的电脑上再进行过滤是对流量的一种浪费。因此我们采取另外一种方式,我们先在远端机器上过滤文本内容,然后再将结果传输到本机。 `less` 为我们创建来一个文件分页器,使我们可以通过翻页的方式浏览较长的文本。为了进一步节省流量,我们甚至可以将当前过滤出的日志保存到文件中,这样后续就不需要再次通过网络访问该文件了: ```console @@ -49,7 +49,7 @@ $ less ssh.log 过滤结果中仍然包含不少没用的数据。我们有很多办法可以删除这些无用的数据,但是让我们先研究一下 `sed` 这个非常强大的工具。 -`sed` 是一个基于文本编辑器`ed`构建的"流编辑器" is a "stream editor" that builds on top of the old `ed` editor. 在 `sed` 中,您基本上是利用一些简短的命令来修改文件,而不是直接操作文件的内容(尽管您也可以选择这样做)。相关的命令行非常多,但是最常用的是 `s`,即*替换*命令,例如我们可以这样写: +`sed` 是一个基于文本编辑器`ed`构建的"流编辑器" 。在 `sed` 中,您基本上是利用一些简短的命令来修改文件,而不是直接操作文件的内容(尽管您也可以选择这样做)。相关的命令行非常多,但是最常用的是 `s`,即*替换*命令,例如我们可以这样写: ```bash ssh myserver journalctl @@ -62,7 +62,7 @@ ssh myserver journalctl ## 正则表达式 -正则表达式非常常见也非常有用,值得您花些时间去理解它。让我们从这一句正则表达式开始学习: `/.*Disconnected from /`。正则表达式通常以(尽管并不总是) `/`开始和结束。大多数的ASCII字符都表示它们本来的含义,但是有一些字符确实有表示匹配行为的“特殊”含义。不同字符所表示的含义,根据正则表达式的实现方式不同,也会有所变化,这一点确实令人沮丧。常见的模式有: +正则表达式非常常见也非常有用,值得您花些时间去理解它。让我们从这一句正则表达式开始学习: `/.*Disconnected from /`。正则表达式通常以(尽管并不总是) `/`开始和结束。大多数的 ASCII 字符都表示它们本来的含义,但是有一些字符确实具有表示匹配行为的“特殊”含义。不同字符所表示的含义,根据正则表达式的实现方式不同,也会有所变化,这一点确实令人沮丧。常见的模式有: - `.` 除空格之外的"任意单个字符" - `*` 匹配前面字符零次或多次 @@ -74,7 +74,7 @@ ssh myserver journalctl `sed` 的正则表达式有些时候是比较奇怪的,它需要你在这些模式前添加`\`才能使其具有特殊含义。或者,您也可以添加`-E`选项来支持这些匹配。 -因此,回过头我们再看`/.*Disconnected from /`,我们会发现这个正则表达式可以匹配任何以若干任意字符开头,并接着包含"Disconnected from "的字符串。这也正式我们所希望的。但是请注意,正则表达式并不容易写对。如果有人将 "Disconnected from" 作为自己的用户名会怎样呢? +回过头我们再看`/.*Disconnected from /`,我们会发现这个正则表达式可以匹配任何以若干任意字符开头,并接着包含"Disconnected from "的字符串。这也正式我们所希望的。但是请注意,正则表达式并不容易写对。如果有人将 "Disconnected from" 作为自己的用户名会怎样呢? 
``` Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user Disconnected from 46.97.239.16 port 55920 [preauth] @@ -84,27 +84,26 @@ Jan 17 03:13:00 thesquareplanet.com sshd[2631]: Disconnected from invalid user D ``` 46.97.239.16 port 55920 [preauth] ``` -这可不上我们想要的结果。对于某些正则表达式的实现来说,您可以给 `*` 或 `+` 增加一个`?`后缀使其变成非贪婪模式,但是很可惜`sed` 并不支持该后缀。不过,我们可以切换到 +这可不上我们想要的结果。对于某些正则表达式的实现来说,您可以给 `*` 或 `+` 增加一个`?` 后缀使其变成非贪婪模式,但是很可惜 `sed` 并不支持该后缀。不过,我们可以切换到 perl 的命令行模式,该模式支持编写这样的正则表达式: ```bash perl -pe 's/.*?Disconnected from //' ``` -让我们回到 `sed` 命令并使用它完成后续的任务,毕竟对于这一类任务,`sed`是最常见的工具。`sed` 还可以非常方便的做一些事情,例如打印匹配后的内容,一次调用中进行多次替换搜索等。但是这些内容我们并不会在此进行介绍。`sed` 本身是一个非常全能的工具,但是在具体功能上往往能找到更好的工作作为替代品。 +让我们回到 `sed` 命令并使用它完成后续的任务,毕竟对于这一类任务,`sed`是最常见的工具。`sed` 还可以非常方便的做一些事情,例如打印匹配后的内容,一次调用中进行多次替换搜索等。但是这些内容我们并不会在此进行介绍。`sed` 本身是一个非常全能的工具,但是在具体功能上往往能找到更好的工具作为替代品。 好的,我们还需要去掉用户名后面的后缀,应该如何操作呢? + 想要匹配用户名后面的文本,尤其是当这里的用户名可以包含空格时,这个问题变得非常棘手!这里我们需要做的是匹配*一整行*: ```bash | sed -E 's/.*Disconnected from (invalid |authenticating )?user .* [^ ]+ port [0-9]+( \[preauth\])?$//' ``` -让我们借助正则表达式在线调试工具[regex -debugger](https://regex101.com/r/qqbZqh/2)来理解这段表达式。OK,开始的部分和以前是一样的。随后,我们匹配两种类型的“user”(在日志中基于两种前缀区分)。再然后我们匹配属于用户名的所有字符。接着,再匹配任意一个单词(`[^ ]+` 会匹配任意非空切不包含空格的序列)。紧接着后面匹配单词“port”和它后面的遗传数字,以及可能存在的后缀 -`[preauth]`,最后再匹配行尾。 +让我们借助正则表达式在线调试工具[regex debugger](https://regex101.com/r/qqbZqh/2) 来理解这段表达式。OK,开始的部分和以前是一样的,随后,我们匹配两种类型的“user”(在日志中基于两种前缀区分)。再然后我们匹配属于用户名的所有字符。接着,再匹配任意一个单词(`[^ ]+` 会匹配任意非空切不包含空格的序列)。紧接着后面匹配单“port”和它后面的一串数字,以及可能存在的后缀`[preauth]`,最后再匹配行尾。 -注意,这样做的话,即使用户名是“Disconnected from”,对匹配结果也不会有任何影响,您知道这是为什么吗?。你能发现是为什么吗? +注意,这样做的话,即使用户名是“Disconnected from”,对匹配结果也不会有任何影响,您知道这是为什么吗? 
问题还没有完全解决,日志的内容全部被替换成了空字符串,整个日志的内容因此都被删除了。我们实际上希望能够将用户名*保留*下来。对此,我们可以使用“捕获组(capture groups)”来完成。被圆括号内的正则表达式匹配到的文本,都会被存入一系列以编号区分的捕获组中。捕获组的内容可以在替换字符串时使用(有些正则表达式的引擎甚至支持替换表达式本身),例如`\1`、 `\2`、`\3`等等,因此可以使用如下命令: @@ -112,8 +111,7 @@ debugger](https://regex101.com/r/qqbZqh/2)来理解这段表达式。OK,开始 | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' ``` -想必您已经意识到了,为了完成某种匹配,我们最终可能会写出非常复杂的正则表达式。例如,这里有一篇关于如何匹配电子邮箱地址的文章[e-mail address](https://www.regular-expressions.info/email.html),匹配电子邮箱可一点[也不简单](https://emailregex.com/)。网络上还有很多关于如何匹配电子邮箱地址的[讨论](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982)。人们还为其编写了[测试用例](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php). -及 [测试矩阵](https://mathiasbynens.be/demo/url-regex)。您甚至可以编写一个用于判断一个数[是否为质数](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/)的正则表达式。 +想必您已经意识到了,为了完成某种匹配,我们最终可能会写出非常复杂的正则表达式。例如,这里有一篇关于如何匹配电子邮箱地址的文章[e-mail address](https://www.regular-expressions.info/email.html),匹配电子邮箱可一点[也不简单](https://emailregex.com/)。网络上还有很多关于如何匹配电子邮箱地址的[讨论](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/1917982)。人们还为其编写了[测试用例](https://fightingforalostcause.net/content/misc/2006/compare-email-regex.php)及 [测试矩阵](https://mathiasbynens.be/demo/url-regex)。您甚至可以编写一个用于判断一个数[是否为质数](https://www.noulakaz.net/2007/03/18/a-regular-expression-to-check-for-prime-numbers/)的正则表达式。 正则表达式是出了名的难以写对,但是它仍然会是您强大的常备工具之一。 @@ -129,9 +127,7 @@ ssh myserver journalctl | sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/' ``` -`sed` 还可以做很多各种各样有趣的事情,例如文本注入: -(使用 `i` 命令),打印特定的行 (使用 `p` -命令),基于索引选择特定行等等。详情请见`man sed`! +`sed` 还可以做很多各种各样有趣的事情,例如文本注入:(使用 `i` 命令),打印特定的行 (使用 `p`命令),基于索引选择特定行等等。详情请见`man sed`! 
现在,我们已经得到了一个包含用户名的列表,列表中的用户都曾经尝试过登陆我们的系统。但这还不够,让我们过滤出那些最常出现的用户: @@ -157,13 +153,10 @@ ssh myserver journalctl `sort -n` 会按照数字顺序对输入进行排序(默认情况下是按照字典序排序 `-k1,1` 则表示“仅基于以空格分割的第一列进行排序”。`,n` 部分表示“仅排序到第n个部分”,默认情况是到行尾。就本例来说,针对整个行进行排序也没有任何问题,我们这里主要是为了学习这一用法! -如果我们希望得到登陆次数最少的用户,我们可以使用 `head` 来代替 -`tail`。或者使用`sort -r`来进行倒序排序。 +如果我们希望得到登陆次数最少的用户,我们可以使用 `head` 来代替`tail`。或者使用`sort -r`来进行倒序排序。 -Okay, so that's pretty cool, but we'd sort of like to only give the -usernames, and maybe not one per line? -好,相当不错,但我们有点想只取用户名,而且不要一行一个地显示。 +相当不错。但我们只想获取用户名,而且不要一行一个地显示。 ```bash @@ -176,22 +169,22 @@ ssh myserver journalctl | awk '{print $2}' | paste -sd, ``` -我们可以利用 `paste`命令来合并行(`-s`),并指定一个分隔符进行分割 (`-d`),那么`awk`的作用又是什么呢? +我们可以利用 `paste`命令来合并行(`-s`),并指定一个分隔符进行分割 (`-d`),那`awk`的作用又是什么呢? ## awk -- 另外一种编辑器 `awk` 其实是一种编程语言,只不过它碰巧非常善于处理文本。关于 `awk` 可以介绍的内容太多了,限于篇幅,这里我们仅介绍一些基础知识。 首先, `{print $2}` 的作用是什么? `awk` 程序接受一个模式串(可选),以及一个代码块,指定当模式匹配时应该做何种操作。默认当模式串即匹配所有行(上面命令中当用法)。 -在代码块中,`$0` 表示正行的内容,`$1` 到 `$n` 为一行中的n个区域,区域的分割基于 `awk` 的域分隔符(默认是空格,可以通过`-F`来修改)。在这个例子中,我们的代码意思是,对于每一行文本,打印其第二个部分,也就是用户名。 +在代码块中,`$0` 表示正行的内容,`$1` 到 `$n` 为一行中的 n 个区域,区域的分割基于 `awk` 的域分隔符(默认是空格,可以通过`-F`来修改)。在这个例子中,我们的代码意思是:对于每一行文本,打印其第二个部分,也就是用户名。 -让我们看看,还有什么炫酷的操作我们可以做。让我们统计一下所有以`c` 开头,以 `e` 结尾,并且仅尝试过一次登陆的用户。 +让我们康康,还有什么炫酷的操作可以做。让我们统计一下所有以`c` 开头,以 `e` 结尾,并且仅尝试过一次登陆的用户。 ```bash | awk '$1 == 1 && $2 ~ /^c[^ ]*e$/ { print $2 }' | wc -l ``` -让我们好好分析一下。首先,注意这次我们为`awk`指定来一个匹配模式串(也就是`{...}`前面的那部分内容)。该匹配要求文本的第一部分需要等于1(这部分刚好是`uniq -c`得到的计数值),然后其第二部分必须满足给定的一个正则表达式。代码快中的内容则表示打印用户名。然后我们使用 `wc -l` 统计输出结果的行数。 +让我们好好分析一下。首先,注意这次我们为 `awk`指定了一个匹配模式串(也就是`{...}`前面的那部分内容)。该匹配要求文本的第一部分需要等于1(这部分刚好是`uniq -c`得到的计数值),然后其第二部分必须满足给定的一个正则表达式。代码快中的内容则表示打印用户名。然后我们使用 `wc -l` 统计输出结果的行数。 不过,既然 `awk` 是一种编程语言,那么则可以这样: @@ -231,7 +224,7 @@ ssh myserver journalctl | awk '{print $1}' | R --slave -e 'x <- scan(file="stdin", quiet=TRUE); summary(x)' ``` -R 也是一种编程语言,它非常适合被用来进行数据分析和[绘制图表](https://ggplot2.tidyverse.org/)。这里我们不会讲的特别详细, 您只需要知道`summary` 
可以打印统计结果,我们通过输入的信息计算出一个矩阵,然后R语言就可以得到我们想要的统计数据。 +R 也是一种编程语言,它非常适合被用来进行数据分析和[绘制图表](https://ggplot2.tidyverse.org/)。这里我们不会讲的特别详细, 您只需要知道`summary` 可以打印统计结果。我们通过输入的信息计算出一个矩阵,然后R语言就可以得到我们想要的统计数据。 如果您希望绘制一些简单的图表, `gnuplot` 可以帮助到您: From cfb080ea9179982c60a3722db93787f4569ef04b Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Fri, 22 May 2020 16:59:00 +0800 Subject: [PATCH 340/640] change status for data-wrangling.md marked as done --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a951bcde..d100c065 100644 --- a/README.md +++ b/README.md @@ -28,7 +28,7 @@ To contribute to this tanslation project, please book your topic by creating an | [course-shell.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/course-shell.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [shell-tools.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/shell-tools.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress | -| [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | | TO-DO | | 
[debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO |

From 5674757c95a42ae9557dcb681f7c8b2324793a32 Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Fri, 22 May 2020 17:07:58 +0800
Subject: [PATCH 341/640] book one more lectures

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d100c065..73a73c8e 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ To contribute to this tanslation project, please book your topic by creating an
 | [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress |
 | [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress |
-| [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | | TO-DO |
+| [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress |
 | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO |
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
 | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO |

From e5036deb7e47776007b426c8c48203dede8085bc Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Fri, 22 May
2020 22:46:02 +0800 Subject: [PATCH 342/640] update trans --- _2020/version-control.md | 78 +++++++++++++++------------------------- 1 file changed, 28 insertions(+), 50 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 52f46639..6659e031 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -8,54 +8,32 @@ video: id: 2sjqTHE0zok --- -Version control systems (VCSs) are tools used to track changes to source code -(or other collections of files and folders). As the name implies, these tools -help maintain a history of changes; furthermore, they facilitate collaboration. -VCSs track changes to a folder and its contents in a series of snapshots, where -each snapshot encapsulates the entire state of files/folders within a top-level -directory. VCSs also maintain metadata like who created each snapshot, messages -associated with each snapshot, and so on. - -Why is version control useful? Even when you're working by yourself, it can let -you look at old snapshots of a project, keep a log of why certain changes were -made, work on parallel branches of development, and much more. When working -with others, it's an invaluable tool for seeing what other people have changed, -as well as resolving conflicts in concurrent development. - -Modern VCSs also let you easily (and often automatically) answer questions -like: - -- Who wrote this module? -- When was this particular line of this particular file edited? By whom? Why - was it edited? -- Over the last 1000 revisions, when/why did a particular unit test stop -working? - -While other VCSs exist, **Git** is the de facto standard for version control. 
-This [XKCD comic](https://xkcd.com/1597/) captures Git's reputation: +版本控制系统 (VCSs) 是一类用于追踪源代码(或其他文件、文件夹)改动的工具。顾名思义,这些工具可以帮助我们管理修改历史;不仅如此,它还可以让协作编码更方便。VCS通过一系列的快照将某个文件夹及其内容保存了起来,每个快照都包含了文件或文件夹的完整状态。同时它还维护了快照创建者的信息以及每个快照的管相关信息等等。 + +为什么说版本控制系统非常有用?即使您只是一个人进行编程工作,它也可以帮您创建项目的快照,记录每个改动的目的、基于多分支并行开发等等。和别人协作开发时,它更是一个无价之宝,您可以看到别人对代码进行的修改,同时解决由于并行开发引起的冲突。 + + +现代的版本控制系统可以帮助您轻松地(甚至自动地)回答以下问题: + +- 当前模块是谁编写的? +- 这个文件的这一行是什么时候被编辑的?是谁作出的修改?修改原因是什么呢? +- 最近的1000个版本中,何时/为什么导致了单元测试失败? + +尽管版本控制系统有很多, 其事实上的标准则是 **Git** 。这篇 [XKCD 漫画](https://xkcd.com/1597/) 则反映出了人们对 Git 的评价: ![xkcd 1597](https://imgs.xkcd.com/comics/git.png) -Because Git's interface is a leaky abstraction, learning Git top-down (starting -with its interface / command-line interface) can lead to a lot of confusion. -It's possible to memorize a handful of commands and think of them as magic -incantations, and follow the approach in the comic above whenever anything goes -wrong. +因为 Git 接口的抽象有些问题,通过自顶向下的方式(从接口、命令行接口开始)学习 Git 可能会让人感到非常困惑。很多时候您只能死记硬背一些命令行,然后像使用魔法一样使用它们,一旦出现问题,就只能像上面那幅漫画里说的那样去处理了。 -While Git admittedly has an ugly interface, its underlying design and ideas are -beautiful. While an ugly interface has to be _memorized_, a beautiful design -can be _understood_. For this reason, we give a bottom-up explanation of Git, -starting with its data model and later covering the command-line interface. -Once the data model is understood, the commands can be better understood, in -terms of how they manipulate the underlying data model. +尽管 Git 的接口有些粗糙,但是它的底层设计和思想却是非常优雅的。丑陋的接口只能靠死记硬背,而优雅的底层设计则非常容易被人理解。因此,我们将通过一种自底向上的方式像您介绍 Git。我们会从数据模型开始,最后再学习它的接口。一旦您搞懂了 Git 的数据模型,再学习其接口并理解这些接口是如何操作数据模型的,就非常容易了。 -# Git's data model +# Git 的数据模型 There are many ad-hoc approaches you could take to version control. Git has a well thought-out model that enables all the nice features of version control, like maintaining history, supporting branches, and enabling collaboration. 
-## Snapshots +## 快照 Git models the history of a collection of files and folders within some top-level directory as a series of snapshots. In Git terminology, a file is @@ -77,7 +55,7 @@ example, we might have a tree as follows: The top-level tree contains two elements, a tree "foo" (that itself contains one element, a blob "bar.txt"), and a blob "baz.txt". -## Modeling history: relating snapshots +## 历史记录建模:关联快照 How should a version control system relate snapshots? One simple model would be to have a linear history. A history would be a list of snapshots in time-order. @@ -121,7 +99,7 @@ corrected, however; it's just that "edits" to the commit history are actually creating entirely new commits, and references (see below) are updated to point to the new ones. -## Data model, as pseudocode +## 数据模型及其伪代码表示 It may be instructive to see Git's data model written down in pseudocode: @@ -143,7 +121,7 @@ type commit = struct { It's a clean, simple model of history. -## Objects and content-addressing +## 对象和内存寻址 An "object" is a blob, tree, or commit: @@ -187,7 +165,7 @@ the following: git is wonderful ``` -## References +## 引用 Now, all snapshots can be identified by their SHA-1 hash. That's inconvenient, because humans aren't good at remembering strings of 40 hexadecimal characters. @@ -222,7 +200,7 @@ history, so that when we take a new snapshot, we know what it is relative to (how we set the `parents` field of the commit). In Git, that "where we currently are" is a special reference called "HEAD". -## Repositories +## 仓库 Finally, we can define what (roughly) is a Git _repository_: it is the data `objects` and `references`. @@ -238,7 +216,7 @@ uncommitted changes and make the 'master' ref point to commit `5d83f9e`", there' probably a command to do it (e.g. in this case, `git checkout master; git reset --hard 5d83f9e`). -# Staging area +# 暂存区 This is another concept that's orthogonal to the data model, but it's a part of the interface to create commits. 
@@ -258,13 +236,13 @@ Git accommodates such scenarios by allowing you to specify which modifications should be included in the next snapshot through a mechanism called the "staging area". -# Git command-line interface +# Git 的命令行接口 To avoid duplicating information, we're not going to explain the commands below in detail. See the highly recommended [Pro Git](https://git-scm.com/book/en/v2) for more information, or watch the lecture video. -## Basics +## 基础 {% comment %} @@ -431,7 +409,7 @@ index 94bab17..f0013b2 100644 - `git diff `: shows differences in a file between snapshots - `git checkout `: updates HEAD and current branch -## Branching and merging +## 分支和合并 {% comment %} @@ -481,7 +459,7 @@ command is used for merging. - `git bisect`: binary search history (e.g. for regressions) - `.gitignore`: [specify](https://git-scm.com/docs/gitignore) intentionally untracked files to ignore -# Miscellaneous +# 杂项 - **GUIs**: There are many [GUI clients](https://git-scm.com/downloads/guis) out there for Git. We personally don't use them and use the command-line @@ -505,7 +483,7 @@ requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-r hosts, like [GitLab](https://about.gitlab.com/) and [BitBucket](https://bitbucket.org/). -# Resources +# 资源 - [Pro Git](https://git-scm.com/book/en/v2) is **highly recommended reading**. Going through Chapters 1--5 should teach you most of what you need to use Git @@ -525,7 +503,7 @@ words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words - [Learn Git Branching](https://learngitbranching.js.org/) is a browser-based game that teaches you Git. -# Exercises +# 课后练习 1. 
If you don't have any past experience with Git, either try reading the first couple chapters of [Pro Git](https://git-scm.com/book/en/v2) or go through a

From 8054a232f6b86aac86cd1b71314891ffe6986f04 Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Fri, 22 May 2020 22:48:44 +0800
Subject: [PATCH 343/640] book qa.md for AA1HSHH

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 73a73c8e..c5a1a522 100644
--- a/README.md
+++ b/README.md
@@ -35,5 +35,5 @@ To contribute to this tanslation project, please book your topic by creating an
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
 | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO |
 | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO |
-| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | | TO-DO |
+| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | TO-DO |
 | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress |

From 25e3facc8e54b0e8dacdbe7fb778be0d0cdaa5cc Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Fri, 22 May 2020 22:49:16 +0800
Subject: [PATCH 344/640] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c5a1a522..f03c19ce 100644
--- a/README.md
+++ b/README.md
@@ -35,5 +35,5 @@ To contribute to this tanslation project, please book your topic by creating an
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
 | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO |
 | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO |
-| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | TO-DO |
+| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress |
 | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress |

From 352237e4fbb2f59c757ea29b8430c222bed880d0 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Fri, 22 May 2020 23:15:01 +0800
Subject: [PATCH 345/640] update trans

---
 _2020/version-control.md | 31 ++++++++-----------------------
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/_2020/version-control.md b/_2020/version-control.md
index 6659e031..c559abf7 100644
--- a/_2020/version-control.md
+++ b/_2020/version-control.md
@@ -29,18 +29,11 @@ video:

 # Git 的数据模型

-There are many ad-hoc approaches you could take to version control. Git has a
-well thought-out model that enables all the nice features of version control,
-like maintaining history, supporting branches, and enabling collaboration.
+进行版本控制的方法很多。Git 拥有一个经过精心设计的模型,这使其能够支持版本控制所需的所有特性,例如维护历史记录、支持分支和促进协作。

 ## 快照

-Git models the history of a collection of files and folders within some
-top-level directory as a series of snapshots. In Git terminology, a file is
-called a "blob", and it's just a bunch of bytes. A directory is called a
-"tree", and it maps names to blobs or trees (so directories can contain other
-directories). A snapshot is the top-level tree that is being tracked.
For -example, we might have a tree as follows: +Git 将顶级目录中的文件和文件夹作为集合,并通过一系列快照来管理其历史记录。在Git的术语里,文件被称作Blob对象(数据对象),也就是一组数据。目录则被称之为“树”,它将名字与Blob对象或树对象进行映射(使得目录中可以包含其他目录)。快照则是被追踪的最顶层的树。例如,一个树看起来可能是这样的: ``` (tree) @@ -52,24 +45,15 @@ example, we might have a tree as follows: +- baz.txt (blob, contents = "git is wonderful") ``` -The top-level tree contains two elements, a tree "foo" (that itself contains -one element, a blob "bar.txt"), and a blob "baz.txt". +这个顶层的树包含了两个元素,一个名为 "foo" 的树(它本身包含了一个blob对象 "bar.txt"),以及一个对blob对象 "baz.txt"。 ## 历史记录建模:关联快照 -How should a version control system relate snapshots? One simple model would be -to have a linear history. A history would be a list of snapshots in time-order. -For many reasons, Git doesn't use a simple model like this. +版本控制系统和快照有什么关系呢?线性历史记录是一种最简单的模型,它包含了一组按照时间顺序线性排列的快照。不过处于种种原因,Git并没有采用这样的模型。 -In Git, a history is a directed acyclic graph (DAG) of snapshots. That may -sound like a fancy math word, but don't be intimidated. All this means is that -each snapshot in Git refers to a set of "parents", the snapshots that preceded -it. It's a set of parents rather than a single parent (as would be the case in -a linear history) because a snapshot might descend from multiple parents, for -example due to combining (merging) two parallel branches of development. +在 Git 中,历史记录是一个由快照组成的有向无环图。有向无环图,听上去似乎是什么高大上的数学名词,不过不要怕。您只需要知道这代表 Git 中的每个快照都有一系列的“父辈”,也就是其之前的一系列快照。注意,快照具有多个“父辈”而非一个,因为某个快照可能由多个父辈而来。例如,经过合并后的两条分支。 -Git calls these snapshots "commit"s. Visualizing a commit history might look -something like this: +在 Git 中,这些快照被称为“提交”。通过可视化的方式来表示这些历史提交记录时,看起来差不多是这样的: ``` o <-- o <-- o <-- o @@ -78,7 +62,8 @@ o <-- o <-- o <-- o --- o <-- o ``` -In the ASCII art above, the `o`s correspond to individual commits (snapshots). +上面是一个 ASCII 码构成的简图,其中的 `o` 表示一次提交(快照)。 + The arrows point to the parent of each commit (it's a "comes before" relation, not "comes after"). After the third commit, the history branches into two separate branches. 
This might correspond to, for example, two separate features From e526ef14346b8842193ebdb7958e854229f2b2fc Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 22 May 2020 23:36:37 +0800 Subject: [PATCH 346/640] update trans --- _2020/version-control.md | 43 +++++++++++++--------------------------- 1 file changed, 14 insertions(+), 29 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index c559abf7..b889457f 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -64,13 +64,7 @@ o <-- o <-- o <-- o 上面是一个 ASCII 码构成的简图,其中的 `o` 表示一次提交(快照)。 -The arrows point to the parent of each commit (it's a "comes before" relation, -not "comes after"). After the third commit, the history branches into two -separate branches. This might correspond to, for example, two separate features -being developed in parallel, independently from each other. In the future, -these branches may be merged to create a new snapshot that incorporates both of -the features, producing a new history that looks like this, with the newly -created merge commit shown in bold: +箭头指向了当前提交的父辈(这是一种“在。。。之前”,而不是“在。。。之后”的关系)。在第三次提交之后,历史记录分岔成了两条独立的分支。这可能因为此时需要同时开发两个不同的特性,它们之间是相互独立的。开发完成后,这些分支可能会被合并并创建一个新的提交,这个新的提交会同时包含这些特性。新的提交会创建一个新的历史记录,看上去像这样(最新的合并提交用粗体标记):
     o <-- o <-- o <-- o <---- o
    @@ -79,23 +73,20 @@ o <-- o <-- o <-- o <---- o
                   --- o <-- o
     
    -Commits in Git are immutable. This doesn't mean that mistakes can't be -corrected, however; it's just that "edits" to the commit history are actually -creating entirely new commits, and references (see below) are updated to point -to the new ones. +Git 中的提交是不可改变的。但这并不代表错误不能被修改,只不过这种“修改”实际上是创建了一个全新的提交记录。而引用(参见下文)则被更新为指向这些新的提交。 ## 数据模型及其伪代码表示 -It may be instructive to see Git's data model written down in pseudocode: +以伪代码的形式来学习 Git 的数据模型,可能更加清晰: ``` -// a file is a bunch of bytes +// 文件就是一组数据 type blob = array -// a directory contains named files and directories +// 一个包含文件和目录的目录 type tree = map -// a commit has parents, metadata, and the top-level tree +// 每个提交都包含一个父辈,元数据和顶层树 type commit = struct { parent: array author: string @@ -104,18 +95,18 @@ type commit = struct { } ``` -It's a clean, simple model of history. +这是一种简洁的历史模型。 + ## 对象和内存寻址 -An "object" is a blob, tree, or commit: +Git 中的对象可以是 blob、树或提交: ``` type object = blob | tree | commit ``` -In Git data store, all objects are content-addressed by their [SHA-1 -hash](https://en.wikipedia.org/wiki/SHA-1). +Git 在储存数据时,所有的对象都会基于它们的[SHA-1 hash](https://en.wikipedia.org/wiki/SHA-1)进行寻址。 ``` objects = map @@ -128,23 +119,17 @@ def load(id): return objects[id] ``` -Blobs, trees, and commits are unified in this way: they are all objects. When -they reference other objects, they don't actually _contain_ them in their -on-disk representation, but have a reference to them by their hash. 
+Blobs、树和提交都一样,它们都是对象。当它们引用其他对象时,它们并没有真正的在硬盘上保存这些对象,而是仅仅保存了它们的哈希值作为引用。 -For example, the tree for the example directory structure [above](#snapshots) -(visualized using `git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d`), -looks like this: +例如,上面例子中的树,For example, the tree for the example directory structure [above](#snapshots)(可以通过`git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d` 来进行可视化),看上去是这样的: ``` 100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85 baz.txt 040000 tree c68d233a33c5c06e0340e4c224f0afca87c8ce87 foo ``` -The tree itself contains pointers to its contents, `baz.txt` (a blob) and `foo` -(a tree). If we look at the contents addressed by the hash corresponding to -baz.txt with `git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85`, we get -the following: +树本身会包含一些指向其他内容的指针,例如`baz.txt` (blob) 和 `foo` +(树)。如果我们用`git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85`,即通过哈希值查看 baz.txte 的内容,会得到以下信息: ``` git is wonderful From f1f177e3984819f2c7a9897935fac5a63c4e88bd Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 00:37:54 +0800 Subject: [PATCH 347/640] update trans --- _2020/version-control.md | 169 ++++++++++++++++----------------------- 1 file changed, 70 insertions(+), 99 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index b889457f..456309b7 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -137,14 +137,9 @@ git is wonderful ## 引用 -Now, all snapshots can be identified by their SHA-1 hash. That's inconvenient, -because humans aren't good at remembering strings of 40 hexadecimal characters. +现在,所有的快照都可以通过它们的SHA-1哈希值来标记了。但这也太不方便来,谁也记不住一串 40 位的十六进制字符。 -Git's solution to this problem is human-readable names for SHA-1 hashes, called -"references". References are pointers to commits. Unlike objects, which are -immutable, references are mutable (can be updated to point to a new commit). 
-For example, the `master` reference usually points to the latest commit in the -main branch of development. +针对这一问题,Git 的解决方法是给这些哈希值赋予人类可读的名字,也就是引用(references)。引用是指向提交的指针。与对象不同的是,它是可变的(引用可以被更新,指向新的提交)。例如,`master` 引用通常会指向主分支的最新一次提交。 ``` references = map @@ -162,18 +157,13 @@ def load_reference(name_or_id): return load(name_or_id) ``` -With this, Git can use human-readable names like "master" to refer to a -particular snapshot in the history, instead of a long hexadecimal string. +这样,Git 就可以使用诸如 "master" 这样人类刻度的名称来表示历史记录中某个特定的提交,而不需要在使用一长串十六进制字符了。 -One detail is that we often want a notion of "where we currently are" in the -history, so that when we take a new snapshot, we know what it is relative to -(how we set the `parents` field of the commit). In Git, that "where we -currently are" is a special reference called "HEAD". +有一个细节需要我们注意, 通常情况下,我们会想要知道“我们当前所在位置”,并将其标记下来。这样当我们创建新的快照的时候,我们就可以知道它的相对位置(如何设置它的“父辈”)。在 Git 中,我们当前的位置有一个特殊的索引,它就是"HEAD"。 ## 仓库 -Finally, we can define what (roughly) is a Git _repository_: it is the data -`objects` and `references`. +最后,我们可以粗略地给出 Git 仓库的定义了:`对象` 和 `引用`。 On disk, all Git stores are objects and references: that's all there is to Git's data model. All `git` commands map to some manipulation of the commit DAG by @@ -367,17 +357,17 @@ index 94bab17..f0013b2 100644 {% endcomment %} -- `git help `: get help for a git command -- `git init`: creates a new git repo, with data stored in the `.git` directory -- `git status`: tells you what's going on -- `git add `: adds files to staging area -- `git commit`: creates a new commit - - Write [good commit messages](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! 
-- `git log`: shows a flattened log of history -- `git log --all --graph --decorate`: visualizes history as a DAG -- `git diff `: show differences since the last commit -- `git diff `: shows differences in a file between snapshots -- `git checkout `: updates HEAD and current branch +- `git help `: 获取 git 命令的帮助信息 +- `git init`: 创建一个新的 git 仓库,其数据会存放在一个名为 `.git` 的目录下 +- `git status`: 显示当前的仓库状态 +- `git add `: 添加文件到暂存区 +- `git commit`: 创建一个新的提交 + - 如何编写 [良好的提交信息](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! +- `git log`: 显示历史日志 +- `git log --all --graph --decorate`: 可视化历史记录(有向无环图) +- `git diff `: 显示与上一次提交间的不同 +- `git diff `: 显示某个文件两个版本之间的不同 +- `git checkout `: 更新HEAD和目前的分支 ## 分支和合并 @@ -394,84 +384,66 @@ command is used for merging. {% endcomment %} -- `git branch`: shows branches -- `git branch `: creates a branch -- `git checkout -b `: creates a branch and switches to it - - same as `git branch ; git checkout ` -- `git merge `: merges into current branch -- `git mergetool`: use a fancy tool to help resolve merge conflicts -- `git rebase`: rebase set of patches onto a new base - -## Remotes - -- `git remote`: list remotes -- `git remote add `: add a remote -- `git push :`: send objects to remote, and update remote reference -- `git branch --set-upstream-to=/`: set up correspondence between local and remote branch -- `git fetch`: retrieve objects/references from a remote -- `git pull`: same as `git fetch; git merge` -- `git clone`: download repository from remote - -## Undo - -- `git commit --amend`: edit a commit's contents/message -- `git reset HEAD `: unstage a file -- `git checkout -- `: discard changes - -# Advanced Git - -- `git config`: Git is [highly customizable](https://git-scm.com/docs/git-config) -- `git clone --shallow`: clone without entire version history -- `git add -p`: interactive staging -- `git rebase -i`: interactive rebasing -- `git blame`: show who last edited which line -- `git stash`: temporarily remove modifications to 
working directory -- `git bisect`: binary search history (e.g. for regressions) -- `.gitignore`: [specify](https://git-scm.com/docs/gitignore) intentionally untracked files to ignore +- `git branch`: 显示分支 +- `git branch `: 创建分支 +- `git checkout -b `: 创建分支并切换到该分支 + - 相当于 `git branch ; git checkout ` +- `git merge `: 合并到当前分支 +- `git mergetool`: 使用工具来处理合并冲突 +- `git rebase`: 将一系列补丁变基(rebase)为新的基线 + +## 远端操作 + +- `git remote`: 列出远端 +- `git remote add `: 添加一个远端 +- `git push :`: 将对象传送至远端并更新远端引用 +- `git branch --set-upstream-to=/`: 创建本地和远端分支的关联关系 +- `git fetch`: 从远端获取对象/索引 +- `git pull`: 相当于 `git fetch; git merge` +- `git clone`: 从远端下载仓库 + +## 撤销 + +- `git commit --amend`: 编辑提交的内容或信息 +- `git reset HEAD `: 恢复暂存的文件 +- `git checkout -- `: 丢弃修改 + +# Git 高级操作 + +- `git config`: Git 是一个 [高度可定制的](https://git-scm.com/docs/git-config) 工具 +- `git clone --shallow`: 克隆仓库,但是不包括版本历史信息 +- `git add -p`: 交互式暂存 +- `git rebase -i`: 交互式变基 +- `git blame`: 查看最后修改某行的人 +- `git stash`: 暂时移除工作目录下的修改内容 +- `git bisect`: 通过二分查找搜索历史记录 +- `.gitignore`: [指定](https://git-scm.com/docs/gitignore) 故意不追踪的文件 # 杂项 -- **GUIs**: There are many [GUI clients](https://git-scm.com/downloads/guis) -out there for Git. We personally don't use them and use the command-line -interface instead. -- **Shell integration**: It's super handy to have a Git status as part of your -shell prompt ([zsh](https://github.com/olivierverdier/zsh-git-prompt), -[bash](https://github.com/magicmonty/bash-git-prompt)). Often included in -frameworks like [Oh My Zsh](https://github.com/ohmyzsh/ohmyzsh). -- **Editor integration**: Similarly to the above, handy integrations with many -features. [fugitive.vim](https://github.com/tpope/vim-fugitive) is the standard -one for Vim. 
-- **Workflows**: we taught you the data model, plus some basic commands; we -didn't tell you what practices to follow when working on big projects (and -there are [many](https://nvie.com/posts/a-successful-git-branching-model/) -[different](https://www.endoflineblog.com/gitflow-considered-harmful) -[approaches](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)). -- **GitHub**: Git is not GitHub. GitHub has a specific way of contributing code -to other projects, called [pull -requests](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). -- **Other Git providers**: GitHub is not special: there are many Git repository -hosts, like [GitLab](https://about.gitlab.com/) and -[BitBucket](https://bitbucket.org/). +- **图形用户界面**: Git 的 [图形用户界面客户端](https://git-scm.com/downloads/guis) 有很多,但是我们自己并不使用这些图形用户界面的客户端,我们选择使用命令行接口 +- **Shell 集成**: 将 Git 状态集成到您的shell中会非常方便。([zsh](https://github.com/olivierverdier/zsh-git-prompt),[bash](https://github.com/magicmonty/bash-git-prompt))。[Oh My Zsh](https://github.com/ohmyzsh/ohmyzsh)这样的框架中一般以及集成了这一功能 +- **编辑器集成**: 和上面一条类似,将 Git 集成到编辑器中好处多多。[fugitive.vim](https://github.com/tpope/vim-fugitive) 是 Vim 中集成 GIt 的常用插件 +- **工作流**:我们已经讲解了数据模型与一些基础命令,但还没讨论到进行大型项目时的一些惯例 ( +有[很多](https://nvie.com/posts/a-successful-git-branching-model/) +[不同的](https://www.endoflineblog.com/gitflow-considered-harmful) +[处理方法](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)) +- **GitHub**: Git 并不等同于 GitHub。 在 GitHub 中您需要使用一个被称作[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)的方法来像其他项目贡献代码 +- **Other Git 提供商**: GitHub 并不是唯一的。还有像[GitLab](https://about.gitlab.com/) 和 +[BitBucket](https://bitbucket.org/)这样的平台。 # 资源 -- [Pro Git](https://git-scm.com/book/en/v2) is **highly recommended reading**. 
-Going through Chapters 1--5 should teach you most of what you need to use Git -proficiently, now that you understand the data model. The later chapters have -some interesting, advanced material. -- [Oh Shit, Git!?!](https://ohshitgit.com/) is a short guide on how to recover -from some common Git mistakes. -- [Git for Computer -Scientists](https://eagain.net/articles/git-for-computer-scientists/) is a -short explanation of Git's data model, with less pseudocode and more fancy -diagrams than these lecture notes. +- [Pro Git](https://git-scm.com/book/en/v2) ,**强烈推荐**! +学习前五章的内容可以教会您流畅使用 Git 的绝大多数技巧,因为您已经理解了 Git 的数据模型。后面的章节提供了很多有趣的高级主题。([Pro Git 中文版](https://git-scm.com/book/zh/v2)) +- [Oh Shit, Git!?!](https://ohshitgit.com/) ,简短的介绍了如何从 Git 错误中恢复 +- [Git for Computer Scientists](https://eagain.net/articles/git-for-computer-scientists/) is a +简短的介绍了 Git 的数据模型,与本文相比包含少量的伪代码以及大量的精美图片。 - [Git from the Bottom Up](https://jwiegley.github.io/git-from-the-bottom-up/) -is a detailed explanation of Git's implementation details beyond just the data -model, for the curious. -- [How to explain git in simple -words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) -- [Learn Git Branching](https://learngitbranching.js.org/) is a browser-based -game that teaches you Git. +详细的介绍了 Git 的实现细节,而不仅仅局限于数据模型。好奇的同学可以看看。 +- [How to explain git in simple words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) +- [Learn Git Branching](https://learngitbranching.js.org/) 通过基于浏览器的游戏来学习 Git + # 课后练习 @@ -479,8 +451,7 @@ game that teaches you Git. couple chapters of [Pro Git](https://git-scm.com/book/en/v2) or go through a tutorial like [Learn Git Branching](https://learngitbranching.js.org/). As you're working through it, relate Git commands to the data model. -1. Clone the [repository for the -class website](https://github.com/missing-semester/missing-semester). +1. 
Clone the [repository for the class website](https://github.com/missing-semester/missing-semester). 1. Explore the version history by visualizing it as a graph. 1. Who was the last person to modify `README.md`? (Hint: use `git log` with an argument) From 10fb594fe75e02bc59d31ffe8a3ab2e79969da1c Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 08:16:02 +0800 Subject: [PATCH 348/640] finished. not including the part marked as comments --- _2020/version-control.md | 78 +++++++++++----------------------------- 1 file changed, 20 insertions(+), 58 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 456309b7..fa4299d3 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -165,42 +165,23 @@ def load_reference(name_or_id): 最后,我们可以粗略地给出 Git 仓库的定义了:`对象` 和 `引用`。 -On disk, all Git stores are objects and references: that's all there is to Git's -data model. All `git` commands map to some manipulation of the commit DAG by -adding objects and adding/updating references. +在硬盘上,Git 仅存储对象和引用:因为其数据模型仅包含这些东西。所有的 `git` 命令都对应着对提交树的操作,例如增加对象,增加或删除引用。 -Whenever you're typing in any command, think about what manipulation the -command is making to the underlying graph data structure. Conversely, if you're -trying to make a particular kind of change to the commit DAG, e.g. "discard -uncommitted changes and make the 'master' ref point to commit `5d83f9e`", there's -probably a command to do it (e.g. in this case, `git checkout master; git reset ---hard 5d83f9e`). +当您输入某个指令时,请思考一些这条命令是如何对底层的图数据结构进行操作的。另一方面,如果您希望修改提交树,例如“丢弃未提交的修改和将 ‘master’ 引用指向提交`5d83f9e`” 时,有什么命令可以完成该操作(针对这个具体问题,您可以使用`git checkout master; git reset --hard 5d83f9e`) # 暂存区 -This is another concept that's orthogonal to the data model, but it's a part of -the interface to create commits. 
+Git 中还包括一个和数据模型完全不相关的概念,但它确是创建提交的接口的一部分。 -One way you might imagine implementing snapshotting as described above is to have -a "create snapshot" command that creates a new snapshot based on the _current -state_ of the working directory. Some version control tools work like this, but -not Git. We want clean snapshots, and it might not always be ideal to make a -snapshot from the current state. For example, imagine a scenario where you've -implemented two separate features, and you want to create two separate commits, -where the first introduces the first feature, and the next introduces the -second feature. Or imagine a scenario where you have debugging print statements -added all over your code, along with a bugfix; you want to commit the bugfix -while discarding all the print statements. +就上面介绍的快照系统来说,您也许会期望它的实现里包括一个 “创建快照”的命令,该命令能够基于当前工作目录的当前状态,创建一个全新的快照。有些版本控制系统确实是这样工作的,但 Git 不是。我们希望简洁的快照,而且每次从当前状态创建快照可能效果并不理想。例如,考虑如下常见,您开发了两个独立的特性,然后您希望创建两个独立的提交,其中第一个提交仅包含第一个特性,而第二个提交仅包含第二个特性。或者,假设您在调试代码时添加了很多打印语句,然后您仅仅希望提交和修复 bug 相关的代码而丢弃所有的打印语句。 -Git accommodates such scenarios by allowing you to specify which modifications -should be included in the next snapshot through a mechanism called the "staging -area". + +Git 处理这些场景的方法是使用一种叫做 “暂存区(staging area)”的机制,它允许您指定下次快照中要包括那些改动。 # Git 的命令行接口 -To avoid duplicating information, we're not going to explain the commands below -in detail. See the highly recommended [Pro Git](https://git-scm.com/book/en/v2) -for more information, or watch the lecture video. +为了避免重复信息,我们将不会详细解释以下命令行。强烈推荐您阅读[Pro Git 中文版](https://git-scm.com/book/zh/v2)或可以观看本讲座的视频来学习。 + ## 基础 @@ -447,34 +428,15 @@ command is used for merging. # 课后练习 -1. If you don't have any past experience with Git, either try reading the first - couple chapters of [Pro Git](https://git-scm.com/book/en/v2) or go through a - tutorial like [Learn Git Branching](https://learngitbranching.js.org/). As - you're working through it, relate Git commands to the data model. -1. 
Clone the [repository for the class website](https://github.com/missing-semester/missing-semester). - 1. Explore the version history by visualizing it as a graph. - 1. Who was the last person to modify `README.md`? (Hint: use `git log` with - an argument) - 1. What was the commit message associated with the last modification to the - `collections:` line of `_config.yml`? (Hint: use `git blame` and `git - show`) -1. One common mistake when learning Git is to commit large files that should - not be managed by Git or adding sensitive information. Try adding a file to - a repository, making some commits and then deleting that file from history - (you may want to look at - [this](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)). -1. Clone some repository from GitHub, and modify one of its existing files. - What happens when you do `git stash`? What do you see when running `git log - --all --oneline`? Run `git stash pop` to undo what you did with `git stash`. - In what scenario might this be useful? -1. Like many command line tools, Git provides a configuration file (or dotfile) - called `~/.gitconfig`. Create an alias in `~/.gitconfig` so that when you - run `git graph`, you get the output of `git log --all --graph --decorate - --oneline`. -1. You can define global ignore patterns in `~/.gitignore_global` after running - `git config --global core.excludesfile ~/.gitignore_global`. Do this, and - set up your global gitignore file to ignore OS-specific or editor-specific - temporary files, like `.DS_Store`. -1. Clone the [repository for the class - website](https://github.com/missing-semester/missing-semester), find a typo - or some other improvement you can make, and submit a pull request on GitHub. +1. 如果您之前从来没有用过 Git,请阅读 [Pro Git](https://git-scm.com/book/en/v2) 的前几章,或者完成像[Learn Git Branching](https://learngitbranching.js.org/)这样的教程。尤其要注意学习 Git 的命令和数据模型相关内容 + +2. 克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester) + 1. 
将版本历史可视化并进行探索 + 2. 是谁最后修改来 `README.md`文件?(提示:使用 `git log` 命令并添加合适的参数) + 3. 最好一次修改What was the commit message associated with the last modification to the + `_config.yml` 文件中 `collections:` 行时的提交信息是什么?(提示:使用`git blame` 和 `git show`) +3. 使用 Git 时的一个常见错误时提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)) +4. 从 GitHub 上克隆某个仓库,修改一些文件。当您使用 `git stash` 会发生什么?当您执行 `git log --all --oneline` 时会显示什么?通过 `git stash pop` 命令来撤销 `git stash`操作,什么时候会用到这一技巧? +5. 与其他的命令行工具一样,Git 也提供了一个名为 `~/.gitconfig` 配置文件 (或 dotfile)。请在 `~/.gitconfig` 中创建一个别名,使您在运行 `git graph` 时,您可以得到 `git log --all --graph --decorate --oneline`的输出结果 +6. 您可以通过执行`git config --global core.excludesfile ~/.gitignore_global` 在 `~/.gitignore_global` 中创建全局忽略规则。配置您的全局 gitignore 文件来字典忽略系统或编辑器的临时文件,例如 `.DS_Store` +7. 克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester),找找有没有错别字或其他可以改进的地方,在 GitHub 上发起拉取请求( Pull Request) From 54dbb3ab380758fa979b4b304a55d0750211929b Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 23 May 2020 08:23:23 +0800 Subject: [PATCH 349/640] book security.md for catcarbon thanks in advance --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index f03c19ce..485165db 100644 --- a/README.md +++ b/README.md @@ -33,7 +33,7 @@ To contribute to this tanslation project, please book your topic by creating an | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO | -| 
[security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO |
+| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | TO-DO |
 | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO |
 | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress |
 | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress |
From fbe0440e7d58690f0414baec15b43e624dd5aa6f Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Sat, 23 May 2020 08:23:46 +0800
Subject: [PATCH 350/640] Update README.md
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 485165db..f2a61cfe 100644
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ To contribute to this tanslation project, please book your topic by creating an
 | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress |
 | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO |
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
-| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | TO-DO |
+| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) |
[@catcarbon](https://github.com/catcarbon) | In-progress | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress | | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress | From d90279ce64c83d9eb224af5ccff017f210bcd1ec Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 08:51:24 +0800 Subject: [PATCH 351/640] change ready to true --- _2020/version-control.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index fa4299d3..fe8e44a0 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -2,7 +2,7 @@ layout: lecture title: "版本控制(Git)" date: 2019-01-22 -ready: false +ready: true video: aspect: 56.25 id: 2sjqTHE0zok From 864ecb6584b2bcfd6eb107d6866b3d22e4151205 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 12:16:06 +0800 Subject: [PATCH 352/640] review and fix --- _2020/version-control.md | 62 +++++++++++++++++----------------------- 1 file changed, 27 insertions(+), 35 deletions(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index fe8e44a0..5169a834 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -8,24 +8,23 @@ video: id: 2sjqTHE0zok --- -版本控制系统 (VCSs) 是一类用于追踪源代码(或其他文件、文件夹)改动的工具。顾名思义,这些工具可以帮助我们管理修改历史;不仅如此,它还可以让协作编码更方便。VCS通过一系列的快照将某个文件夹及其内容保存了起来,每个快照都包含了文件或文件夹的完整状态。同时它还维护了快照创建者的信息以及每个快照的管相关信息等等。 +版本控制系统 (VCSs) 是一类用于追踪源代码(或其他文件、文件夹)改动的工具。顾名思义,这些工具可以帮助我们管理代码的修改历史;不仅如此,它还可以让协作编码变得更方便。VCS通过一系列的快照将某个文件夹及其内容保存了起来,每个快照都包含了文件或文件夹的完整状态。同时它还维护了快照创建者的信息以及每个快照的管相关信息等等。 
为什么说版本控制系统非常有用?即使您只是一个人进行编程工作,它也可以帮您创建项目的快照,记录每个改动的目的、基于多分支并行开发等等。和别人协作开发时,它更是一个无价之宝,您可以看到别人对代码进行的修改,同时解决由于并行开发引起的冲突。 - 现代的版本控制系统可以帮助您轻松地(甚至自动地)回答以下问题: - 当前模块是谁编写的? - 这个文件的这一行是什么时候被编辑的?是谁作出的修改?修改原因是什么呢? - 最近的1000个版本中,何时/为什么导致了单元测试失败? -尽管版本控制系统有很多, 其事实上的标准则是 **Git** 。这篇 [XKCD 漫画](https://xkcd.com/1597/) 则反映出了人们对 Git 的评价: +尽管版本控制系统有很多, 其事实上的标准则是 **Git** 。而这篇 [XKCD 漫画](https://xkcd.com/1597/) 则反映出了人们对 Git 的评价: ![xkcd 1597](https://imgs.xkcd.com/comics/git.png) -因为 Git 接口的抽象有些问题,通过自顶向下的方式(从接口、命令行接口开始)学习 Git 可能会让人感到非常困惑。很多时候您只能死记硬背一些命令行,然后像使用魔法一样使用它们,一旦出现问题,就只能像上面那幅漫画里说的那样去处理了。 +因为 Git 接口的抽象泄漏(leaky abstraction)问题,通过自顶向下的方式(从命令行接口开始)学习 Git 可能会让人感到非常困惑。很多时候您只能死记硬背一些命令行,然后像使用魔法一样使用它们,一旦出现问题,就只能像上面那幅漫画里说的那样去处理了。 -尽管 Git 的接口有些粗糙,但是它的底层设计和思想却是非常优雅的。丑陋的接口只能靠死记硬背,而优雅的底层设计则非常容易被人理解。因此,我们将通过一种自底向上的方式像您介绍 Git。我们会从数据模型开始,最后再学习它的接口。一旦您搞懂了 Git 的数据模型,再学习其接口并理解这些接口是如何操作数据模型的,就非常容易了。 +尽管 Git 的接口有些丑陋,但是它的底层设计和思想却是非常优雅的。丑陋的接口只能靠死记硬背,而优雅的底层设计则非常容易被人理解。因此,我们将通过一种自底向上的方式像您介绍 Git。我们会从数据模型开始,最后再学习它的接口。一旦您搞懂了 Git 的数据模型,再学习其接口并理解这些接口是如何操作数据模型的就非常容易了。 # Git 的数据模型 @@ -49,9 +48,9 @@ Git 将顶级目录中的文件和文件夹作为集合,并通过一系列快 ## 历史记录建模:关联快照 -版本控制系统和快照有什么关系呢?线性历史记录是一种最简单的模型,它包含了一组按照时间顺序线性排列的快照。不过处于种种原因,Git并没有采用这样的模型。 +版本控制系统和快照有什么关系呢?线性历史记录是一种最简单的模型,它包含了一组按照时间顺序线性排列的快照。不过处于种种原因,Git 并没有采用这样的模型。 -在 Git 中,历史记录是一个由快照组成的有向无环图。有向无环图,听上去似乎是什么高大上的数学名词,不过不要怕。您只需要知道这代表 Git 中的每个快照都有一系列的“父辈”,也就是其之前的一系列快照。注意,快照具有多个“父辈”而非一个,因为某个快照可能由多个父辈而来。例如,经过合并后的两条分支。 +在 Git 中,历史记录是一个由快照组成的有向无环图。有向无环图,听上去似乎是什么高大上的数学名词。不过不要怕,您只需要知道这代表 Git 中的每个快照都有一系列的“父辈”,也就是其之前的一系列快照。注意,快照具有多个“父辈”而非一个,因为某个快照可能由多个父辈而来。例如,经过合并后的两条分支。 在 Git 中,这些快照被称为“提交”。通过可视化的方式来表示这些历史提交记录时,看起来差不多是这样的: @@ -121,7 +120,7 @@ def load(id): Blobs、树和提交都一样,它们都是对象。当它们引用其他对象时,它们并没有真正的在硬盘上保存这些对象,而是仅仅保存了它们的哈希值作为引用。 -例如,上面例子中的树,For example, the tree for the example directory structure [above](#snapshots)(可以通过`git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d` 来进行可视化),看上去是这样的: +例如,[above](#snapshots)例子中的树(可以通过`git cat-file -p 698281bc680d1995c5f4caaf3359721a5a58d48d` 
来进行可视化),看上去是这样的: ``` 100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85 baz.txt @@ -137,7 +136,7 @@ git is wonderful ## 引用 -现在,所有的快照都可以通过它们的SHA-1哈希值来标记了。但这也太不方便来,谁也记不住一串 40 位的十六进制字符。 +现在,所有的快照都可以通过它们的 SHA-1 哈希值来标记了。但这也太不方便了,谁也记不住一串 40 位的十六进制字符。 针对这一问题,Git 的解决方法是给这些哈希值赋予人类可读的名字,也就是引用(references)。引用是指向提交的指针。与对象不同的是,它是可变的(引用可以被更新,指向新的提交)。例如,`master` 引用通常会指向主分支的最新一次提交。 @@ -167,14 +166,13 @@ def load_reference(name_or_id): 在硬盘上,Git 仅存储对象和引用:因为其数据模型仅包含这些东西。所有的 `git` 命令都对应着对提交树的操作,例如增加对象,增加或删除引用。 -当您输入某个指令时,请思考一些这条命令是如何对底层的图数据结构进行操作的。另一方面,如果您希望修改提交树,例如“丢弃未提交的修改和将 ‘master’ 引用指向提交`5d83f9e`” 时,有什么命令可以完成该操作(针对这个具体问题,您可以使用`git checkout master; git reset --hard 5d83f9e`) +当您输入某个指令时,请思考一些这条命令是如何对底层的图数据结构进行操作的。另一方面,如果您希望修改提交树,例如“丢弃未提交的修改和将 ‘master’ 引用指向提交`5d83f9e` 时,有什么命令可以完成该操作(针对这个具体问题,您可以使用`git checkout master; git reset --hard 5d83f9e`) # 暂存区 Git 中还包括一个和数据模型完全不相关的概念,但它确是创建提交的接口的一部分。 -就上面介绍的快照系统来说,您也许会期望它的实现里包括一个 “创建快照”的命令,该命令能够基于当前工作目录的当前状态,创建一个全新的快照。有些版本控制系统确实是这样工作的,但 Git 不是。我们希望简洁的快照,而且每次从当前状态创建快照可能效果并不理想。例如,考虑如下常见,您开发了两个独立的特性,然后您希望创建两个独立的提交,其中第一个提交仅包含第一个特性,而第二个提交仅包含第二个特性。或者,假设您在调试代码时添加了很多打印语句,然后您仅仅希望提交和修复 bug 相关的代码而丢弃所有的打印语句。 - +就上面介绍的快照系统来说,您也许会期望它的实现里包括一个 “创建快照” 的命令,该命令能够基于当前工作目录的当前状态创建一个全新的快照。有些版本控制系统确实是这样工作的,但 Git 不是。我们希望简洁的快照,而且每次从当前状态创建快照可能效果并不理想。例如,考虑如下场景,您开发了两个独立的特性,然后您希望创建两个独立的提交,其中第一个提交仅包含第一个特性,而第二个提交仅包含第二个特性。或者,假设您在调试代码时添加了很多打印语句,然后您仅仅希望提交和修复 bug 相关的代码而丢弃所有的打印语句。 Git 处理这些场景的方法是使用一种叫做 “暂存区(staging area)”的机制,它允许您指定下次快照中要包括那些改动。 @@ -346,9 +344,9 @@ index 94bab17..f0013b2 100644 - 如何编写 [良好的提交信息](https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)! - `git log`: 显示历史日志 - `git log --all --graph --decorate`: 可视化历史记录(有向无环图) -- `git diff `: 显示与上一次提交间的不同 -- `git diff `: 显示某个文件两个版本之间的不同 -- `git checkout `: 更新HEAD和目前的分支 +- `git diff `: 显示与上一次提交之间的差异 +- `git diff `: 显示某个文件两个版本之间的差异 +- `git checkout `: 更新 HEAD 和目前的分支 ## 分支和合并 @@ -403,40 +401,34 @@ command is used for merging. 
# 杂项 - **图形用户界面**: Git 的 [图形用户界面客户端](https://git-scm.com/downloads/guis) 有很多,但是我们自己并不使用这些图形用户界面的客户端,我们选择使用命令行接口 -- **Shell 集成**: 将 Git 状态集成到您的shell中会非常方便。([zsh](https://github.com/olivierverdier/zsh-git-prompt),[bash](https://github.com/magicmonty/bash-git-prompt))。[Oh My Zsh](https://github.com/ohmyzsh/ohmyzsh)这样的框架中一般以及集成了这一功能 +- **Shell 集成**: 将 Git 状态集成到您的 shell 中会非常方便。([zsh](https://github.com/olivierverdier/zsh-git-prompt),[bash](https://github.com/magicmonty/bash-git-prompt))。[Oh My Zsh](https://github.com/ohmyzsh/ohmyzsh)这样的框架中一般以及集成了这一功能 - **编辑器集成**: 和上面一条类似,将 Git 集成到编辑器中好处多多。[fugitive.vim](https://github.com/tpope/vim-fugitive) 是 Vim 中集成 GIt 的常用插件 - **工作流**:我们已经讲解了数据模型与一些基础命令,但还没讨论到进行大型项目时的一些惯例 ( 有[很多](https://nvie.com/posts/a-successful-git-branching-model/) [不同的](https://www.endoflineblog.com/gitflow-considered-harmful) [处理方法](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)) - **GitHub**: Git 并不等同于 GitHub。 在 GitHub 中您需要使用一个被称作[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)的方法来像其他项目贡献代码 -- **Other Git 提供商**: GitHub 并不是唯一的。还有像[GitLab](https://about.gitlab.com/) 和 -[BitBucket](https://bitbucket.org/)这样的平台。 +- **Other Git 提供商**: GitHub 并不是唯一的。还有像[GitLab](https://about.gitlab.com/) 和 [BitBucket](https://bitbucket.org/)这样的平台。 # 资源 -- [Pro Git](https://git-scm.com/book/en/v2) ,**强烈推荐**! 
-学习前五章的内容可以教会您流畅使用 Git 的绝大多数技巧,因为您已经理解了 Git 的数据模型。后面的章节提供了很多有趣的高级主题。([Pro Git 中文版](https://git-scm.com/book/zh/v2)) -- [Oh Shit, Git!?!](https://ohshitgit.com/) ,简短的介绍了如何从 Git 错误中恢复 -- [Git for Computer Scientists](https://eagain.net/articles/git-for-computer-scientists/) is a -简短的介绍了 Git 的数据模型,与本文相比包含少量的伪代码以及大量的精美图片。 -- [Git from the Bottom Up](https://jwiegley.github.io/git-from-the-bottom-up/) -详细的介绍了 Git 的实现细节,而不仅仅局限于数据模型。好奇的同学可以看看。 -- [How to explain git in simple words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words) -- [Learn Git Branching](https://learngitbranching.js.org/) 通过基于浏览器的游戏来学习 Git +- [Pro Git](https://git-scm.com/book/en/v2) ,**强烈推荐**!学习前五章的内容可以教会您流畅使用 Git 的绝大多数技巧,因为您已经理解了 Git 的数据模型。后面的章节提供了很多有趣的高级主题。([Pro Git 中文版](https://git-scm.com/book/zh/v2)); +- [Oh Shit, Git!?!](https://ohshitgit.com/) ,简短的介绍了如何从 Git 错误中恢复; +- [Git for Computer Scientists](https://eagain.net/articles/git-for-computer-scientists/) ,简短的介绍了 Git 的数据模型,与本文相比包含较少量的伪代码以及大量的精美图片; +- [Git from the Bottom Up](https://jwiegley.github.io/git-from-the-bottom-up/)详细的介绍了 Git 的实现细节,而不仅仅局限于数据模型。好奇的同学可以看看; +- [How to explain git in simple words](https://smusamashah.github.io/blog/2017/10/14/explain-git-in-simple-words); +- [Learn Git Branching](https://learngitbranching.js.org/) 通过基于浏览器的游戏来学习 Git ; # 课后练习 -1. 如果您之前从来没有用过 Git,请阅读 [Pro Git](https://git-scm.com/book/en/v2) 的前几章,或者完成像[Learn Git Branching](https://learngitbranching.js.org/)这样的教程。尤其要注意学习 Git 的命令和数据模型相关内容 - +1. 如果您之前从来没有用过 Git,推荐您阅读 [Pro Git](https://git-scm.com/book/en/v2) 的前几章,或者完成像[Learn Git Branching](https://learngitbranching.js.org/)这样的教程。重点关注 Git 命令和数据模型相关内容; 2. 克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester) 1. 将版本历史可视化并进行探索 2. 是谁最后修改来 `README.md`文件?(提示:使用 `git log` 命令并添加合适的参数) - 3. 最好一次修改What was the commit message associated with the last modification to the - `_config.yml` 文件中 `collections:` 行时的提交信息是什么?(提示:使用`git blame` 和 `git show`) -3. 
使用 Git 时的一个常见错误时提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)) + 3. 最后一次修改`_config.yml` 文件中 `collections:` 行时的提交信息是什么?(提示:使用`git blame` 和 `git show`) +3. 使用 Git 时的一个常见错误时提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)); 4. 从 GitHub 上克隆某个仓库,修改一些文件。当您使用 `git stash` 会发生什么?当您执行 `git log --all --oneline` 时会显示什么?通过 `git stash pop` 命令来撤销 `git stash`操作,什么时候会用到这一技巧? -5. 与其他的命令行工具一样,Git 也提供了一个名为 `~/.gitconfig` 配置文件 (或 dotfile)。请在 `~/.gitconfig` 中创建一个别名,使您在运行 `git graph` 时,您可以得到 `git log --all --graph --decorate --oneline`的输出结果 -6. 您可以通过执行`git config --global core.excludesfile ~/.gitignore_global` 在 `~/.gitignore_global` 中创建全局忽略规则。配置您的全局 gitignore 文件来字典忽略系统或编辑器的临时文件,例如 `.DS_Store` -7. 克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester),找找有没有错别字或其他可以改进的地方,在 GitHub 上发起拉取请求( Pull Request) +5. 与其他的命令行工具一样,Git 也提供了一个名为 `~/.gitconfig` 配置文件 (或 dotfile)。请在 `~/.gitconfig` 中创建一个别名,使您在运行 `git graph` 时,您可以得到 `git log --all --graph --decorate --oneline`的输出结果; +6. 您可以通过执行`git config --global core.excludesfile ~/.gitignore_global` 在 `~/.gitignore_global` 中创建全局忽略规则。配置您的全局 gitignore 文件来字典忽略系统或编辑器的临时文件,例如 `.DS_Store`; +7. 
克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester),找找有没有错别字或其他可以改进的地方,在 GitHub 上发起拉取请求( Pull Request); From 52eeb16f953517deffc249386cd50dcbe6691078 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 12:20:24 +0800 Subject: [PATCH 353/640] mark vcs as done --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index a951bcde..fec245f4 100644 --- a/README.md +++ b/README.md @@ -30,7 +30,7 @@ To contribute to this tanslation project, please book your topic by creating an | [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress | | [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | -| [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | | TO-DO | +| [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | | Done | | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO | | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | | TO-DO | From 3ed73a6f20f71e7d0e7a18b198b8d12fd3e072f9 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 12:28:28 +0800 Subject: [PATCH 354/640] small changes in index like remove 
the translator name --- index.md | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/index.md b/index.md index b8af0777..fb733117 100644 --- a/index.md +++ b/index.md @@ -4,10 +4,11 @@ title: The Missing Semester of Your CS Education 中文版 --- 对于计算机教育来说,从操作系统到机器学习,这些高大上课程和主题已经非常多了。然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 -这部分内容就是:精通工具。在这个系列课程中,我们会帮助您精通命令行、使用强大的文本编辑器、使用版本控制系统提供的多种特性等等。 +这部分内容就是:精通工具。在这个系列课程中,我们讲授命令行、强大的文本编辑器的使用、使用版本控制系统提供的多种特性等等。 学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 -因此,花时间打磨使用这些工具的能力并能够最终熟练、流畅地使用它们是非常有必要的。 +因此,花时间打磨使用这些工具的能力并能够最终熟练地、流畅地使用它们是非常有必要的。 + 精通这些工具不仅可以帮助您更快的使用工具完成任务,并且可以帮助您解决在之前看来似乎无比复杂的问题。 @@ -53,7 +54,7 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 # 在 MIT 之外 -我们将本课程分享到了MIT之外,希望其他人也能受益于这些资源。你可以在下面这些地方找到相关文章和讨论。 +我们也将本课程分享到了 MIT 之外,希望其他人也能受益于这些资源。您可以在下面这些地方找到相关文章和讨论。 - [Hacker News](https://news.ycombinator.com/item?id=22226380) - [Lobsters](https://lobste.rs/s/ti1k98/missing_semester_your_cs_education_mit) @@ -64,14 +65,11 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 ## 致谢 -感谢 Elaine Mello, Jim Cain 以及 [MIT Open -Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 +感谢 Elaine Mello, Jim Cain 以及 [MIT Open Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 -感谢 Anthony Zolnik 和 [MIT -AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。 +感谢 Anthony Zolnik 和 [MIT AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。 -感谢 Brandi Adams 和 -[MIT EECS](https://www.eecs.mit.edu/) 对本课程的支持。 +感谢 Brandi Adams 和 [MIT EECS](https://www.eecs.mit.edu/) 对本课程的支持。 @@ -80,6 +78,5 @@ AeroAstro](https://aeroastro.mit.edu/) 提供 A/V 设备。

    Source code.

    Licensed under CC BY-NC-SA.

    -

    Translator: Lingfeng Ai

    See here for contribution & translation guidelines.

    From 9b0ffc7ea2ad9b97b87776edf002e8a41d5a2aef Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Sat, 23 May 2020 00:44:48 -0400 Subject: [PATCH 355/640] translate until little before KDF --- .gitignore | 1 + _2020/security.md | 173 ++++++++++++++++++---------------------------- 2 files changed, 70 insertions(+), 104 deletions(-) diff --git a/.gitignore b/.gitignore index 1a2b01ef..c9215db2 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,4 @@ +.ruby-version .bundle/ _site/ .jekyll-metadata diff --git a/_2020/security.md b/_2020/security.md index d88dd467..89cc6fcf 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -8,68 +8,49 @@ video: id: tjwobAmnKTo --- -Last year's [security and privacy lecture](/2019/security/) focused on how you -can be more secure as a computer _user_. This year, we will focus on security -and cryptography concepts that are relevant in understanding tools covered -earlier in this class, such as the use of hash functions in Git or key -derivation functions and symmetric/asymmetric cryptosystems in SSH. - -This lecture is not a substitute for a more rigorous and complete course on -computer systems security ([6.858](https://css.csail.mit.edu/6.858/)) or -cryptography ([6.857](https://courses.csail.mit.edu/6.857/) and 6.875). Don't -do security work without formal training in security. Unless you're an expert, -don't [roll your own -crypto](https://www.schneier.com/blog/archives/2015/05/amateurs_produc.html). -The same principle applies to systems security. - -This lecture has a very informal (but we think practical) treatment of basic -cryptography concepts. This lecture won't be enough to teach you how to -_design_ secure systems or cryptographic protocols, but we hope it will be -enough to give you a general understanding of the programs and protocols you -already use. - -# Entropy - -[Entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) is a -measure of randomness. 
This is useful, for example, when determining the -strength of a password. +去年的[这节课](/2019/security/)我们从计算机 _用户_ 的角度探讨了增强隐私保护和安全的方法。 +今年我们将关注比如散列函数、密钥生成函数、对称/非对称密码体系这些安全和密码学的概念是如何应用于前几节课所学到的工具(Git和SSH)中的。 + +本课程不能作为计算机系统安全 ([6.858](https://css.csail.mit.edu/6.858/)) 或者 +密码学 ([6.857](https://courses.csail.mit.edu/6.857/)以及6.875)的替代。 +如果你不是密码学的专家,请不要[试图创造或者修改加密算法](https://www.schneier.com/blog/archives/2015/05/amateurs_produc.html)。从事和计算机系统安全相关的工作同理。 + +这节课将对一些基本的概念进行简单(但实用)的说明。 +虽然这些说明不足以让你学会如何 _设计_ 安全系统或者加密协议,但我们希望你可以对现在使用的程序和协议有一个大概了解。 + +# 熵 + +[熵](https://en.wikipedia.org/wiki/Entropy_(information_theory))(Entropy) 是对不确定性的量度。 +它的一个应用是决定密码的强度。 ![XKCD 936: Password Strength](https://imgs.xkcd.com/comics/password_strength.png) -As the above [XKCD comic](https://xkcd.com/936/) illustrates, a password like -"correcthorsebatterystaple" is more secure than one like "Tr0ub4dor&3". But how -do you quantify something like this? +正如上面的 [XKCD 漫画](https://xkcd.com/936/) 所描述的, +"correcthorsebatterystaple" 这个密码比 "Tr0ub4dor&3" 更安全——可是熵是如何量化安全性的呢? -Entropy is measured in _bits_, and when selecting uniformly at random from a -set of possible outcomes, the entropy is equal to `log_2(# of possibilities)`. -A fair coin flip gives 1 bit of entropy. A dice roll (of a 6-sided die) has -\~2.58 bits of entropy. +熵的单位是 _比特_。对于一个均匀分布的随机离散变量,熵等于`log_2(所有可能的个数,即n)`。 +扔一次硬币的熵是1比特。掷一次(六面)骰子的熵大约为2.58比特。 -You should consider that the attacker knows the _model_ of the password, but -not the randomness (e.g. from [dice -rolls](https://en.wikipedia.org/wiki/Diceware)) used to select a particular -password. +一般我们认为攻击者了解密码的模型(最小长度,最大长度,可能包含的字符种类等),但是不了解某个密码是如何随机选择的—— +比如[掷骰子](https://en.wikipedia.org/wiki/Diceware)。 -How many bits of entropy is enough? It depends on your threat model. For online -guessing, as the XKCD comic points out, \~40 bits of entropy is pretty good. To -be resistant to offline guessing, a stronger password would be necessary (e.g. -80 bits, or more). 
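A minimal sketch of the entropy formula quoted in the hunk above, `log_2(# of possibilities)`, for uniformly random choices. The helper name `entropy_bits` and the four-word passphrase drawn from a 7776-word Diceware-style list are assumptions made for illustration; they are not part of the patch.

```python
import math


def entropy_bits(num_outcomes: int, draws: int = 1) -> float:
    """Entropy, in bits, of `draws` independent uniform picks from `num_outcomes` options."""
    if num_outcomes < 1 or draws < 0:
        raise ValueError("num_outcomes must be >= 1 and draws must be >= 0")
    return draws * math.log2(num_outcomes)


coin = entropy_bits(2)   # fair coin flip
die = entropy_bits(6)    # six-sided die
# Assumption: a 7776-word Diceware-style list and a four-word passphrase.
passphrase = entropy_bits(7776, draws=4)

print(f"coin: {coin:.2f} bits")
print(f"die: {die:.2f} bits")
print(f"4-word passphrase: {passphrase:.1f} bits")
```

Under these assumptions the passphrase sits above the roughly 40 bits the notes call adequate for online guessing and below the 80 bits suggested for offline guessing.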
+使用多少比特的熵取决于应用的威胁模型。 +上面的XKCD漫画告诉我们,大约40比特的熵足以对抗在线穷举攻击(受限于网络速度和应用认证机制)。 +而对于离线穷举攻击(主要受限于计算速度), 一般需要更强的密码 (比如80比特或更多)。 -# Hash functions +# 散列函数 -A [cryptographic hash -function](https://en.wikipedia.org/wiki/Cryptographic_hash_function) maps data -of arbitrary size to a fixed size, and has some special properties. A rough -specification of a hash function is as follows: +[密码散列函数](https://en.wikipedia.org/wiki/Cryptographic_hash_function) +(Cryptographic hash function) 可以将任意大小的数据映射为一个固定大小的输出。除此之外,还有一些其他特性。 +一个散列函数的大概规范如下: ``` -hash(value: array) -> vector (for some fixed N) +hash(value: array) -> vector (N对于该函数固定) ``` -An example of a hash function is [SHA1](https://en.wikipedia.org/wiki/SHA-1), -which is used in Git. It maps arbitrary-sized inputs to 160-bit outputs (which -can be represented as 40 hexadecimal characters). We can try out the SHA1 hash -on an input using the `sha1sum` command: +[SHA-1](https://en.wikipedia.org/wiki/SHA-1)是Git中使用的一种散列函数, +它可以将任意大小的输入映射为一个160比特(可被40位十六进制数表示)的输出。 +下面我们用`sha1sum`命令来测试SHA1对几个字符串的输出: ```console $ printf 'hello' | sha1sum @@ -80,39 +61,27 @@ $ printf 'Hello' | sha1sum f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0 ``` -At a high level, a hash function can be thought of as a hard-to-invert -random-looking (but deterministic) function (and this is the [ideal model of a -hash function](https://en.wikipedia.org/wiki/Random_oracle)). A hash function -has the following properties: - -- Deterministic: the same input always generates the same output. -- Non-invertible: it is hard to find an input `m` such that `hash(m) = h` for -some desired output `h`. -- Target collision resistant: given an input `m_1`, it's hard to find a -different input `m_2` such that `hash(m_1) = hash(m_2)`. -- Collision resistant: it's hard to find two inputs `m_1` and `m_2` such that -`hash(m_1) = hash(m_2)` (note that this is a strictly stronger property than -target collision resistance). 
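The `sha1sum` session in the hunk above can be approximated with Python's standard `hashlib` module. This is an illustrative sketch only: the helper name `sha1_hex` is an assumption, and SHA-1 is used solely because the notes use it, not as a recommendation.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as 40 hexadecimal characters."""
    return hashlib.sha1(data).hexdigest()


lower = sha1_hex(b"hello")
upper = sha1_hex(b"Hello")

print(lower)
print(upper)
print(len(lower) * 4, "bits")       # 40 hex characters, 4 bits each
print(lower == sha1_hex(b"hello"))  # deterministic: same input, same output
```

Changing a single character of the input ("hello" versus "Hello") yields an unrelated-looking digest, which is the "random-looking but deterministic" behaviour the notes describe.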
- -Note: while it may work for certain purposes, SHA-1 is [no -longer](https://shattered.io/) considered a strong cryptographic hash function. -You might find this table of [lifetimes of cryptographic hash -functions](https://valerieaurora.org/hash.html) interesting. However, note that -recommending specific hash functions is beyond the scope of this lecture. If you -are doing work where this matters, you need formal training in -security/cryptography. - -## Applications - -- Git, for content-addressed storage. The idea of a [hash -function](https://en.wikipedia.org/wiki/Hash_function) is a more general -concept (there are non-cryptographic hash functions). Why does Git use a -cryptographic hash function? -- A short summary of the contents of a file. Software can often be downloaded -from (potentially less trustworthy) mirrors, e.g. Linux ISOs, and it would be -nice to not have to trust them. The official sites usually post hashes -alongside the download links (that point to third-party mirrors), so that the -hash can be checked after downloading a file. +抽象地讲,散列函数可以被认为是一个难以取反,且看上去随机(但具确定性)的函数 +(这就是[散列函数的理想模型](https://en.wikipedia.org/wiki/Random_oracle))。 +一个散列函数拥有以下特性: + +- 确定性:对于不变的输入永远有相同的输出。 +- 不可逆性:对于`hash(m) = h`,难以通过已知的输出`h`来计算出原始输入`m`。 +- 目标碰撞抵抗性/弱无碰撞:对于一个给定输入`m_1`,难以找到`m_2 != m_1`且`hash(m_1) = hash(m_2)`。 +- 碰撞抵抗性/强无碰撞:难以找到一组满足`hash(m_1) = hash(m_2)`的输入`m_1, m_2`(该性质严格强于目标碰撞抵抗性)。 + +注:虽然SHA-1还可以用于特定用途,它已经[不再被认为](https://shattered.io/)是一个强密码散列函数。 +你可参照[密码散列函数的生命周期](https://valerieaurora.org/hash.html)这个表格了解一些散列函数是何时被发现弱点及破解的。 +请注意,针对应用推荐特定的散列函数超出了本课程内容的范畴。 +如果选择散列函数对于你的工作非常重要,请先系统学习信息安全及密码学。 + + +## 密码散列函数的应用 + +- Git中的内容寻址存储(Content addressed storage):[散列函数](https://en.wikipedia.org/wiki/Hash_function) 是一个宽泛的概念(存在非密码学的散列函数),那么Git为什么要特意使用密码散列函数? 
+- 文件的信息摘要(Message digest):像Linux ISO这样的软件可以从非官方的(有时不太可信的)镜像站下载,所以需要设法确认下载的软件和官方一致。 +官方网站一般会在(指向镜像站的)下载链接旁边备注安装文件的哈希值。 +用户从镜像站下载安装文件后可以对照公布的哈希值来确定安装文件没有被篡改。 - [Commitment schemes](https://en.wikipedia.org/wiki/Commitment_scheme). Suppose you want to commit to a particular value, but reveal the value itself later. For example, I want to do a fair coin toss "in my head", without a @@ -122,15 +91,12 @@ random()`, and then share `h = sha256(r)`. Then, you could call heads or tails call, I can reveal my value `r`, and you can confirm that I haven't cheated by checking `sha256(r)` matches the hash I shared earlier. -# Key derivation functions +# 密钥生成函数 -A related concept to cryptographic hashes, [key derivation -functions](https://en.wikipedia.org/wiki/Key_derivation_function) (KDFs) are -used for a number of applications, including producing fixed-length output for -use as keys in other cryptographic algorithms. Usually, KDFs are deliberately -slow, in order to slow down offline brute-force attacks. +[密钥生成函数](https://en.wikipedia.org/wiki/Key_derivation_function) (Key Derivation Functions) 作为密码散列函数的相关概念,被应用于包括生成固定长度,可以使用在其他密码算法中的密钥等方面。 +为了对抗穷举法攻击,密钥生成函数通常较慢。 -## Applications +## 密钥生成函数的应用 - Producing keys from passphrases for use in other cryptographic algorithms (e.g. symmetric cryptography, see below). @@ -140,7 +106,7 @@ approach is to generate and store a random each user, store `KDF(password + salt)`, and verify login attempts by re-computing the KDF given the entered password and the stored salt. -# Symmetric cryptography +# 对称加密 Hiding message contents is probably the first concept you think about when you think about cryptography. Symmetric cryptography accomplishes this with the @@ -160,13 +126,13 @@ has the obvious correctness property, that `decrypt(encrypt(m, k), k) = m`. An example of a symmetric cryptosystem in wide use today is [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). 
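The salted-KDF login scheme described in the key derivation section above (store a random salt per user, store the KDF output, recompute on login) might look roughly like this. The notes do not name a KDF; PBKDF2-HMAC-SHA256 is picked here only because it ships with Python's standard library, and `ITERATIONS`, `new_record`, and `verify` are hypothetical names and values, not advice.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: illustrative work factor, not a recommendation


def new_record(password: str) -> tuple[bytes, bytes]:
    """Create the (salt, digest) pair a server would store instead of the password."""
    salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the KDF with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


salt, digest = new_record("correct horse battery staple")
print(verify("correct horse battery staple", salt, digest))
print(verify("Tr0ub4dor&3", salt, digest))
```

The deliberately slow derivation is what makes offline brute-force guessing expensive, as the notes point out.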
-## Applications +## 对称加密的应用 - Encrypting files for storage in an untrusted cloud service. This can be combined with KDFs, so you can encrypt a file with a passphrase. Generate `key = KDF(passphrase)`, and then store `encrypt(file, key)`. -# Asymmetric cryptography +# 非对称加密 The term "asymmetric" refers to there being two keys, with two different roles. A private key, as its name implies, is meant to be kept private, while the @@ -204,7 +170,7 @@ message, without the _private_ key, it's hard to produce a signature such that verify function has the obvious correctness property that `verify(message, sign(message, private key), public key) = true`. -## Applications +## 非对称加密的应用 - [PGP email encryption](https://en.wikipedia.org/wiki/Pretty_Good_Privacy). People can have their public keys posted online (e.g. in a PGP keyserver, or on @@ -215,7 +181,7 @@ communication channels. - Signing software. Git can have GPG-signed commits and tags. With a posted public key, anyone can verify the authenticity of downloaded software. -## Key distribution +## 密钥分发 Asymmetric-key cryptography is wonderful, but it has a big challenge of distributing public keys / mapping public keys to real-world identities. There @@ -228,9 +194,9 @@ proof](https://keybase.io/blog/chat-apps-softer-than-tofu) (along with other neat ideas). Each model has its merits; we (the instructors) like Keybase's model. -# Case studies +# 案例分析 -## Password managers +## 密码管理器 This is an essential tool that everyone should try to use (e.g. [KeePassXC](https://keepassxc.org/)). Password managers let you use unique, @@ -242,16 +208,15 @@ Using a password manager lets you avoid password reuse (so you're less impacted when websites get compromised), use high-entropy passwords (so you're less likely to get compromised), and only need to remember a single high-entropy password. 
-## Two-factor authentication +## 两步验证 -[Two-factor -authentication](https://en.wikipedia.org/wiki/Multi-factor_authentication) +[Two-factor authentication](https://en.wikipedia.org/wiki/Multi-factor_authentication) (2FA) requires you to use a passphrase ("something you know") along with a 2FA authenticator (like a [YubiKey](https://www.yubico.com/), "something you have") in order to protect against stolen passwords and [phishing](https://en.wikipedia.org/wiki/Phishing) attacks. -## Full disk encryption +## 全盘加密 Keeping your laptop's entire disk encrypted is an easy way to protect your data in the case that your laptop is stolen. You can use [cryptsetup + @@ -262,7 +227,7 @@ Windows, or [FileVault](https://support.apple.com/en-us/HT204837) on macOS. This encrypts the entire disk with a symmetric cipher, with a key protected by a passphrase. -## Private messaging +## 聊天加密 Use [Signal](https://signal.org/) or [Keybase](https://keybase.io/). End-to-end security is bootstrapped from asymmetric-key encryption. Obtaining your @@ -304,12 +269,12 @@ security concepts, tips - HTTPS {% endcomment %} -# Resources +# 资源 - [Last year's notes](/2019/security/): from when this lecture was more focused on security and privacy as a computer user - [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): answers "what crypto should I use for X?" for many common X. -# Exercises +# 练习 1. **Entropy.** 1. 
Suppose a password is chosen as a concatenation of five lower-case From 6510714e57a4175ccbe2b83b247fa7722b0ecf50 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 13:56:02 +0800 Subject: [PATCH 356/640] update trans --- _2020/command-line.md | 48 +++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index ca12eacf..3fcd55b2 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -13,13 +13,13 @@ In this lecture we will go through several ways in which you can improve your wo We will also learn about different ways to improve your shell and other tools, by defining aliases and configuring them using dotfiles. Both of these can help you save time, e.g. by using the same configurations in all your machines without having to type long commands. We will look at how to work with remote machines using SSH. -# Job Control +# 作业控制 In some cases you will need to interrupt a job while it is executing, for instance if a command is taking too long to complete (such as a `find` with a very large directory structure to search through). Most of the time, you can do `Ctrl-C` and the command will stop. But how does this actually work and why does it sometimes fail to stop the process? -## Killing a process +## 结束进程 Your shell is using a UNIX communication mechanism called a _signal_ to communicate information to the process. When a process receives a signal it stops its execution, deals with the signal and potentially changes the flow of execution based on the information that the signal delivered. For this reason, signals are _software interrupts_. @@ -56,7 +56,7 @@ I got a SIGINT, but I am not stopping While `SIGINT` and `SIGQUIT` are both usually associated with terminal related requests, a more generic signal for asking a process to exit gracefully is the `SIGTERM` signal. 
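As a companion to the `SIGTERM` paragraph above, here is a small sketch of a process that installs a handler for `SIGTERM` and then sends that signal to itself. It assumes a Unix-like system; the handler name `handle_sigterm` and the `received` list are illustrative, and a real program would do its cleanup and then exit inside the handler.

```python
import os
import signal

received = []


def handle_sigterm(signum, frame):
    # Record the request; a real program would flush files, close sockets, then exit.
    received.append(signal.Signals(signum).name)


signal.signal(signal.SIGTERM, handle_sigterm)

# Same effect as another shell sending SIGTERM to this process with the `kill` command.
os.kill(os.getpid(), signal.SIGTERM)

print(received)
```

Unlike `SIGKILL`, `SIGTERM` can be caught, which is what gives the process a chance to exit gracefully.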
To send this signal we can use the [`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) command, with the syntax `kill -TERM `. -## Pausing and backgrounding processes +## 暂停和后台执行进程 Signals can do other things beyond killing a process. For instance, `SIGSTOP` pauses a process. In the terminal, typing `Ctrl-Z` will prompt the shell to send a `SIGTSTP` signal, short for Terminal Stop (i.e. the terminal's version of `SIGSTOP`). @@ -125,7 +125,7 @@ A special signal is `SIGKILL` since it cannot be captured by the process and it You can learn more about these and other signals [here](https://en.wikipedia.org/wiki/Signal_(IPC)) or typing [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html) or `kill -t`. -# Terminal Multiplexers +# 终端多路复用 When using the command line interface you will often want to run more than one thing at once. For instance, you might want to run your editor and your program side by side. @@ -164,7 +164,7 @@ The most popular terminal multiplexer these days is [`tmux`](http://man7.org/lin For further reading, [here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux` and [this](http://linuxcommand.org/lc3_adv_termmux.php) has a more detailed explanation that covers the original `screen` command. You might also want to familiarize yourself with [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed in most UNIX systems. -# Aliases +# 别名 It can become tiresome typing long commands that involve many flags or verbose options. For this reason, most shells support _aliasing_. @@ -214,7 +214,7 @@ Note that aliases do not persist shell sessions by default. To make an alias persistent you need to include it in shell startup files, like `.bashrc` or `.zshrc`, which we are going to introduce in the next section. -# Dotfiles +# 配置文件(Dotfiles) Many programs are configured using plain-text files known as _dotfiles_ (because the file names begin with a `.`, e.g. 
`~/.vimrc`, so that they are @@ -267,7 +267,7 @@ All of the class instructors have their dotfiles publicly accessible on GitHub: [Jose](https://github.com/jjgo/dotfiles). -## Portability +## 可移植性 A common pain with dotfiles is that the configurations might not work when working with several machines, e.g. if they have different operating systems or shells. Sometimes you also want some configuration to be applied only in a given machine. @@ -307,7 +307,7 @@ if [ -f ~/.aliases ]; then fi ``` -# Remote Machines +# 远端设备 It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it. @@ -320,18 +320,18 @@ ssh foo@bar.mit.edu Here we are trying to ssh as user `foo` in server `bar.mit.edu`. The server can be specified with a URL (like `bar.mit.edu`) or an IP (something like `foobar@192.168.1.42`). Later we will see that if we modify ssh config file you can access just using something like `ssh bar`. -## Executing commands +## 执行命令 An often overlooked feature of `ssh` is the ability to run commands directly. `ssh foobar@server ls` will execute `ls` in the home folder of foobar. It works with pipes, so `ssh foobar@server ls | grep PATTERN` will grep locally the remote output of `ls` and `ls | ssh foobar@server grep PATTERN` will grep remotely the local output of `ls`. -## SSH Keys +## SSH 密钥 Key-based authentication exploits public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to reenter your password every time. Nevertheless, the private key (often `~/.ssh/id_rsa` and more recently `~/.ssh/id_ed25519`) is effectively your password, so treat it like so. 
-### Key generation +### 密钥生成 To generate a pair you can run [`ssh-keygen`](http://man7.org/linux/man-pages/man1/ssh-keygen.1.html). ```bash @@ -341,7 +341,7 @@ You should choose a passphrase, to avoid someone who gets hold of your private k If you have ever configured pushing to GitHub using SSH keys, then you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid key pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`. -### Key based authentication +### 基于密钥的认证机制 `ssh` will look into `.ssh/authorized_keys` to determine which clients it should let in. To copy a public key over you can use: @@ -355,7 +355,7 @@ A simpler solution can be achieved with `ssh-copy-id` where available: ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote ``` -## Copying files over SSH +## 通过 SSH 复制文件 There are many ways to copy files over ssh: @@ -363,7 +363,7 @@ There are many ways to copy files over ssh: - [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) when copying large amounts of files/directories, the secure copy `scp` command is more convenient since it can easily recurse over paths. The syntax is `scp path/to/local_file remote_host:path/to/remote_file` - [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) improves upon `scp` by detecting identical files in local and remote, and preventing copying them again. It also provides more fine grained control over symlinks, permissions and has extra features like the `--partial` flag that can resume from a previously interrupted copy. `rsync` has a similar syntax to `scp`. -## Port Forwarding +## 端口转发 In many scenarios you will run into software that listens to specific ports in the machine. When this happens in your local machine you can type `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet?. 
@@ -379,7 +379,7 @@ comes in two flavors: Local Port Forwarding and Remote Port Forwarding (see the The most common scenario is local port forwarding, where a service in the remote machine listens in a port and you want to link a port in your local machine to forward to the remote port. For example, if we execute `jupyter notebook` in the remote server that listens to the port `8888`. Thus, to forward that to the local port `9999`, we would do `ssh -L 9999:localhost:8888 foobar@remote_server` and then navigate to `locahost:9999` in our local machine. -## SSH Configuration +## SSH 配置 We have covered many many arguments that we can pass. A tempting alternative is to create shell aliases that look like ```bash @@ -408,7 +408,7 @@ Note that the `~/.ssh/config` file can be considered a dotfile, and in general i Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings on a per user basis. -## Miscellaneous +## 杂项 A common pain when connecting to a remote server are disconnections due to shutting down/sleeping your computer or changing a network. Moreover if one has a connection with significant lag using ssh can become quite frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo. @@ -416,7 +416,7 @@ Sometimes it is convenient to mount a remote folder. [sshfs](https://github.com/ locally, and then you can use a local editor. -# Shells & Frameworks +# Shell & 框架 During shell tool and scripting we covered the `bash` shell because it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option. 
@@ -439,7 +439,7 @@ For example, the `zsh` shell is a superset of `bash` and provides many convenien One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or it is too much code. You can always profile it and disable the features that you do not use often or value over speed. -# Terminal Emulators +# 终端模拟器 Along with customizing your shell, it is worth spending some time figuring out your choice of **terminal emulator** and its settings. There are many many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). @@ -452,9 +452,9 @@ Since you might be spending hundreds to thousands of hours in your terminal it p - Scrollback configuration - Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration). -# Exercises +# 课后练习 -## Job control +## 作业控制 1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`. Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags). @@ -464,18 +464,18 @@ One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/m However, this strategy will fail if we start in a different bash session, since `wait` only works for child processes. One feature we did not discuss in the notes is that the `kill` command's exit status will be zero on success and nonzero otherwise. `kill -0` does not send a signal but will give a nonzero exit status if the process does not exist. 
Write a bash function called `pidwait` that takes a pid and waits until the given process completes. You should use `sleep` to avoid wasting CPU unnecessarily. -## Terminal multiplexer +## 终端多路复用 1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/). -## Aliases +## 别名 1. Create an alias `dc` that resolves to `cd` for when you type it wrongly. 1. Run `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` to get your top 10 most used commands and consider writing shorter aliases for them. Note: this works for Bash; if you're using ZSH, use `history 1` instead of just `history`. -## Dotfiles +## 配置文件 Let's get you up to speed with dotfiles. 1. Create a folder for your dotfiles and set up version @@ -488,7 +488,7 @@ Let's get you up to speed with dotfiles. 1. Migrate all of your current tool configurations to your dotfiles repository. 1. Publish your dotfiles on GitHub. -## Remote Machines +## 远端设备 Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. From 85eafb83259bcf8b220a4a467ccbda52cfba7129 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 23 May 2020 17:22:25 +0800 Subject: [PATCH 357/640] update trans --- _2020/command-line.md | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 3fcd55b2..87d4599a 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -8,24 +8,23 @@ video: id: e8BO_dYxk5c --- -In this lecture we will go through several ways in which you can improve your workflow when using the shell. 
We have been working with the shell for a while now, but we have mainly focused on executing different commands. We will now see how to run several processes at the same time while keeping track of them, how to stop or pause a specific process and how to make a process run in the background. +当您使用 shell 进行工作时,可以使用一些方法改善您的工作流,本节课我们就来讨论这些方法。 +我们以及使用 shell 一段时间了,但是到目前为止我们的关注点集中在使用不同的命令行。现在,我们将会学习如何同时执行多个不同的进程并追踪它们的状态、停止或暂停某个进程以及如何使进程在后台运行。 -We will also learn about different ways to improve your shell and other tools, by defining aliases and configuring them using dotfiles. Both of these can help you save time, e.g. by using the same configurations in all your machines without having to type long commands. We will look at how to work with remote machines using SSH. +我们还将学习一些能够改善您的 shell 及其他工具的工作流的方法,主要途径是通过定义别名或基于配置文件对其进行配置。这些方法都可以帮您节省大量的时间。例如,仅需要执行一些简单的命令,我们就可以是在所有的主机上使用相同的配置。我们还会学习如何使用 SSH 操作远端机器。 -# 作业控制 +# 任务控制 -In some cases you will need to interrupt a job while it is executing, for instance if a command is taking too long to complete (such as a `find` with a very large directory structure to search through). -Most of the time, you can do `Ctrl-C` and the command will stop. -But how does this actually work and why does it sometimes fail to stop the process? +某些情况下我们需要在任务执行时将其中断,例如当一个命令需要执行很长时间才能完成时(比如使用 `find` 搜索一个非常大的目录结构时)。大多数情况下,我们可以使用 `Ctrl-C` 来停止命令的执行。但是它的工作原理是什么呢?为什么有的时候会无法结束进程? ## 结束进程 -Your shell is using a UNIX communication mechanism called a _signal_ to communicate information to the process. When a process receives a signal it stops its execution, deals with the signal and potentially changes the flow of execution based on the information that the signal delivered. For this reason, signals are _software interrupts_. +您的 shell 会使用 UNIX 提供的信号机制执行进程间通信。当一个进程接收到信号时,它会停止执行、处理该信号并基于信号传递的信息来改变其执行。就这一点而言,信号是一种*软件中断*。 -In our case, when typing `Ctrl-C` this prompts the shell to deliver a `SIGINT` signal to the process. 
+就上述例子而言,当我们输入 `Ctrl-C` 时,shell 会发送一个`SIGINT` 信号到进程。 -Here's a minimal example of a Python program that captures `SIGINT` and ignores it, no longer stopping. To kill this program we can now use the `SIGQUIT` signal instead, by typing `Ctrl-\`. +下面这个Python程序向您展示了捕获信号`SIGINT` 并忽略它的基本操作,它并不会让程序停止。为了停止这个程序,我们需要使用`SIGQUIT` 信号,通过输入`Ctrl-\`可以发送该信号。 ```python #!/usr/bin/env python @@ -42,7 +41,7 @@ while True: i += 1 ``` -Here's what happens if we send `SIGINT` twice to this program, followed by `SIGQUIT`. Note that `^` is how `Ctrl` is displayed when typed in the terminal. +如果我们向这个程序发送两次 `SIGINT` ,然后再发送一次 `SIGQUIT`,程序会有什么反应?注意 `^` 是我们在终端输入`Ctrl` 时的表示形式: ``` $ python sigint.py @@ -53,20 +52,18 @@ I got a SIGINT, but I am not stopping 30^\[1] 39913 quit python sigint.py ``` -While `SIGINT` and `SIGQUIT` are both usually associated with terminal related requests, a more generic signal for asking a process to exit gracefully is the `SIGTERM` signal. -To send this signal we can use the [`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) command, with the syntax `kill -TERM `. +尽管 `SIGINT` 和 `SIGQUIT` 都常常用来发出和终止程序相关都请求。`SIGTERM` 则是一个更加通用的,让程序优雅地退出的信号。为了发出这个信号我们需要使用[`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) 命令, 它的语法是: `kill -TERM `. ## 暂停和后台执行进程 -Signals can do other things beyond killing a process. For instance, `SIGSTOP` pauses a process. In the terminal, typing `Ctrl-Z` will prompt the shell to send a `SIGTSTP` signal, short for Terminal Stop (i.e. the terminal's version of `SIGSTOP`). +信号可以让进程做其他的事情,而不仅仅是终止它们。例如,`SIGSTOP` 会让进程暂停。在终端中,键入 `Ctrl-Z` 会让 shell 发送 `SIGTSTP` 信号。 -We can then continue the paused job in the foreground or in the background using [`fg`](http://man7.org/linux/man-pages/man1/fg.1p.html) or [`bg`](http://man7.org/linux/man-pages/man1/bg.1p.html), respectively. 
+We can then continue the paused job in the foreground or in the background using [`fg`](http://man7.org/linux/man-pages/man1/fg.1p.html) or [`bg`](http://man7.org/linux/man-pages/man1/bg.1p.html), respectively.
-The [`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) command lists the unfinished jobs associated with the current terminal session.
-You can refer to those jobs using their pid (you can use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find that out).
-More intuitively, you can also refer to a process using the percent symbol followed by its job number (displayed by `jobs`). To refer to the last backgrounded job you can use the `$!` special parameter.
-One more thing to know is that the `&` suffix in a command will run the command in the background, giving you the prompt back, although it will still use the shell's STDOUT which can be annoying (use shell redirections in that case).
+The [`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) command lists all the unfinished jobs in the current terminal session. You can refer to those jobs by their pid (you can use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find it). More intuitively, you can refer to a job with the percent symbol followed by its job number (printed by `jobs`). To refer to the last backgrounded job, use the `$!` special parameter.
+
+One more thing to know is that the `&` suffix runs a command directly in the background, giving you the prompt back so you can keep working in the shell. The command still uses the shell's STDOUT, though, which can be annoying (use shell redirections in that case).
 To background an already running program you can do `Ctrl-Z` followed by `bg`.
 Note that backgrounded processes are still children processes of your terminal and will die if you close the terminal (this will send yet another signal, `SIGHUP`).
@@ -454,7 +451,7 @@ Since you might be spending hundreds to thousands of hours in your terminal it p
 # Exercises
-## Job Control
+## Task Control
 1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`.
Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags).

From 5617c593a769871017cb6159a5cbcd3d999e33e9 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sat, 23 May 2020 17:32:24 +0800
Subject: [PATCH 358/640] update trrans

---
 _2020/command-line.md | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index 87d4599a..0a53f8b7 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -65,12 +65,10 @@ I got a SIGINT, but I am not stopping
 One more thing to know is that the `&` suffix runs a command directly in the background, giving you the prompt back so you can keep working in the shell. The command still uses the shell's STDOUT, though, which can be annoying (use shell redirections in that case).
-To background an already running program you can do `Ctrl-Z` followed by `bg`.
-Note that backgrounded processes are still children processes of your terminal and will die if you close the terminal (this will send yet another signal, `SIGHUP`).
-To prevent that from happening you can run the program with [`nohup`](http://man7.org/linux/man-pages/man1/nohup.1.html) (a wrapper to ignore `SIGHUP`), or use `disown` if the process has already been started.
-Alternatively, you can use a terminal multiplexer as we will see in the next section.
+To background an already running program, type `Ctrl-Z` followed by `bg`. Note that backgrounded processes are still children of your terminal and will die if you close the terminal (this sends yet another signal, `SIGHUP`). To prevent that, you can run the program with [`nohup`](http://man7.org/linux/man-pages/man1/nohup.1.html) (a wrapper that ignores `SIGHUP`), or use `disown` if the process has already been started. Alternatively, you can use a terminal multiplexer, which we cover in detail in the next section.
+
+The sample session below showcases some of these concepts.
-Below is a sample session to showcase some of these concepts.
 ```
 $ sleep 1000
@@ -117,9 +115,9 @@ $ jobs
 ```
-A special signal is `SIGKILL` since it cannot be captured by the process and it will always terminate it immediately. However, it can have bad side effects such as leaving orphaned children processes.
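
As a rough illustration of the signals discussed above, the following shell sketch starts two throwaway `sleep` jobs, sends one `SIGTERM` and the other `SIGKILL`, and reads the resulting exit status back with `wait`. The helper name `signal_status` is made up for this example, and the `128 + signal number` exit status is the convention followed by common shells such as bash and dash, so treat the exact numbers as an assumption about your system.

```bash
# Hypothetical helper: start a background job, send it the given signal,
# and store the exit status that `wait` reports in $status.
signal_status() {
    sleep 1000 &                     # throwaway background job
    pid=$!                           # $! is the pid of the last backgrounded job
    kill "-$1" "$pid"                # e.g. kill -TERM <pid>
    status=0
    wait "$pid" || status=$?         # a signalled child yields 128 + signal number
}

signal_status TERM; term_status=$status   # SIGTERM is usually signal 15
signal_status KILL; kill_status=$status   # SIGKILL is usually signal 9
echo "TERM -> $term_status, KILL -> $kill_status"
```

On a typical Linux system this should report 143 for `SIGTERM` and 137 for `SIGKILL`. Plain `sleep` does not handle either signal, so both jobs die; the difference only shows up for programs that install a handler, which can intercept `SIGTERM` but never `SIGKILL`.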
+`SIGKILL` is a special signal: it cannot be captured by the process and always terminates it immediately. However, it can have bad side effects, such as leaving orphaned child processes.
-You can learn more about these and other signals [here](https://en.wikipedia.org/wiki/Signal_(IPC)) or typing [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html) or `kill -t`.
+You can learn more about these and other signals [here](https://en.wikipedia.org/wiki/Signal_(IPC)), by typing [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html), or with `kill -l`.
 # Terminal Multiplexers

From fe0424bd6b5be7b314e547d7877708806ec09d8e Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 00:10:52 +0800
Subject: [PATCH 359/640] update trans

---
 _2020/command-line.md | 81 ++++++++++++++++++++++---------------------
 1 file changed, 41 insertions(+), 40 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index 0a53f8b7..b839901d 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -122,49 +122,50 @@ $ jobs
 # Terminal Multiplexers
-When using the command line interface you will often want to run more than one thing at once.
-For instance, you might want to run your editor and your program side by side.
-Although this can be achieved by opening new terminal windows, using a terminal multiplexer is a more versatile solution.
-
-Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) allow you to multiplex terminal windows using panes and tabs so you can interact with multiple shell sessions.
-Moreover, terminal multiplexers let you detach a current terminal session and reattach at some point later in time.
-This can make your workflow much better when working with remote machines since it voids the need to use `nohup` and similar tricks.
-
-The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable and by using the associated keybindings you can create multiple tabs and panes and quickly navigate through them.
-
-`tmux` expects you to know its keybindings, and they all have the form ` x` where that means (1) press `Ctrl+b`, (2) release `Ctrl+b`, and then (3) press `x`. `tmux` has the following hierarchy of objects:
-- **Sessions** - a session is an independent workspace with one or more windows
-    + `tmux` starts a new session.
-    + `tmux new -s NAME` starts it with that name.
-    + `tmux ls` lists the current sessions
-    + Within `tmux` typing ` d` detaches the current session
-    + `tmux a` attaches the last session. You can use `-t` flag to specify which
-
-- **Windows** - Equivalent to tabs in editors or browsers, they are visually separate parts of the same session
-    + ` c` Creates a new window. To close it you can just terminate the shells doing ``
-    + ` N` Go to the _N_ th window. Note they are numbered
-    + ` p` Goes to the previous window
-    + ` n` Goes to the next window
-    + ` ,` Rename the current window
-    + ` w` List current windows
-
-- **Panes** - Like vim splits, panes let you have multiple shells in the same visual display.
-    + ` "` Split the current pane horizontally
-    + ` %` Split the current pane vertically
-    + ` ` Move to the pane in the specified _direction_. Direction here means arrow keys.
-    + ` z` Toggle zoom for the current pane
-    + ` [` Start scrollback. You can then press `` to start a selection and `` to copy that selection.
-    + ` ` Cycle through pane arrangements.
-
-For further reading,
-[here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux` and [this](http://linuxcommand.org/lc3_adv_termmux.php) has a more detailed explanation that covers the original `screen` command. You might also want to familiarize yourself with [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed in most UNIX systems.
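+When using the command line interface you will often want to run more than one thing at once. For example, you might want to run your editor and your program side by side. Although opening a new terminal window also works, a terminal multiplexer is a better solution.
+
+Terminal multiplexers like [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) let you split terminal windows into panes and tabs so you can interact with multiple shell sessions at once.
+
+Moreover, terminal multiplexers let you detach the current terminal session and reattach to it later.
+
+This greatly improves your workflow when working with remote machines, since it avoids the need for `nohup` and similar tricks.
+
+
+The most popular terminal multiplexer these days is [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html). `tmux` is highly configurable, and with its keybindings you can create multiple tabs and quickly navigate between them.
+
+`tmux` expects you to know its keybindings. They all have the form ` x`, meaning: press `Ctrl+b`, release it, and then press `x`. `tmux` has the following hierarchy of objects:
+- **Sessions** - each session is an independent workspace with one or more windows
+    + `tmux` starts a new session
+    + `tmux new -s NAME` starts a new session with the given name
+    + `tmux ls` lists the current sessions
+    + Within `tmux`, typing ` d` detaches the current session
+    + `tmux a` reattaches the last session. You can use the `-t` flag to specify which one
+
+
+- **Windows** - equivalent to tabs in editors or browsers; they visually split a session into separate parts
+    + ` c` creates a new window; close it with ``
+    + ` N` jumps to the _N_th window; note that windows are numbered
+    + ` p` goes to the previous window
+    + ` n` goes to the next window
+    + ` ,` renames the current window
+    + ` w` lists the current windows
+
+- **Panes** - like vim splits, panes let you show multiple shells on the same screen
+    + ` "` splits the current pane horizontally
+    + ` %` splits the current pane vertically
+    + ` <direction>` moves to the pane in the given direction; *<direction>* means one of the arrow keys
+    + ` z` toggles zoom for the current pane
+    + ` [` starts scrollback. You can then press space to start a selection and enter to copy it
+    + ` <space>` cycles through pane arrangements
+
+Further reading:
+[here](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) is a quick tutorial on `tmux`, and [this article](http://linuxcommand.org/lc3_adv_termmux.php) goes into more detail and also covers the original `screen` command. You may also want to learn [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html), since it comes installed on most UNIX systems.
+
 # Aliases
-It can become tiresome typing long commands that involve many flags or verbose options.
-For this reason, most shells support _aliasing_.
-A shell alias is a short form for another command that your shell will replace automatically for you.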
-For instance, an alias in bash has the following structure:
+Typing long commands with many flags or options can be tiresome. For this reason, most shells support _aliasing_. A shell alias is a short form for a longer command, which your shell replaces automatically with the original command. For instance, an alias in bash has the following structure:
+
 ```bash
 alias alias_name="command_to_alias arg1 arg2"

From 051ed3621fdcbd09510ee8371d271262a6372784 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 00:30:38 +0800
Subject: [PATCH 360/640] update trans

---
 _2020/command-line.md | 49 ++++++++++++++-----------------------------
 1 file changed, 16 insertions(+), 33 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index b839901d..d0bd4277 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -153,7 +153,7 @@ $ jobs
 - **Panes** - like vim splits, panes let you show multiple shells on the same screen
     + ` "` splits the current pane horizontally
     + ` %` splits the current pane vertically
-    + ` <direction>` moves to the pane in the given direction; *<direction>* means one of the arrow keys
+    + ` <direction>` moves to the pane in the given direction; <direction> means one of the arrow keys
     + ` z` toggles zoom for the current pane
     + ` [` starts scrollback. You can then press space to start a selection and enter to copy it
     + ` <space>` cycles through pane arrangements
@@ -171,9 +171,9 @@ $ jobs
 alias alias_name="command_to_alias arg1 arg2"
 ```
-Note that there is no space around the equal sign `=`, because [`alias`](http://man7.org/linux/man-pages/man1/alias.1p.html) is a shell command that takes a single argument.
+Note that there are no spaces around the equal sign `=`, because [`alias`](http://man7.org/linux/man-pages/man1/alias.1p.html) is a shell command that takes a single argument.
-Aliases have many convenient features:
+Aliases have a number of convenient features:
 ```bash
 # Make shorthands for common flags
@@ -206,8 +206,7 @@ alias ll
 # Will print ll='ls -lh'
 ```
-Note that aliases do not persist shell sessions by default.
-To make an alias persistent you need to include it in shell startup files, like `.bashrc` or `.zshrc`, which we are going to introduce in the next section.
+Note that, by default, the shell does not persist aliases across sessions. To make an alias persistent you need to put it in a shell startup file, such as `.bashrc` or `.zshrc`, which we introduce in the next section.
 # Dotfiles
@@ -228,37 +227,21 @@ Some other examples of tools that can be configured through dotfiles are:
 - `bash` - `~/.bashrc`, `~/.bash_profile`
 - `git` - `~/.gitconfig`
-- `vim` - `~/.vimrc` and the `~/.vim` folder
+- `vim` - `~/.vimrc` and the `~/.vim` folder
 - `ssh` - `~/.ssh/config`
 - `tmux` - `~/.tmux.conf`
-How should you organize your dotfiles? They should be in their own folder,
-under version control, and **symlinked** into place using a script. This has
-the benefits of:
-
-- **Easy installation**: if you log in to a new machine, applying your
-customizations will only take a minute.
-- **Portability**: your tools will work the same way everywhere.
-- **Synchronization**: you can update your dotfiles anywhere and keep them all
-in sync.
-- **Change tracking**: you're probably going to be maintaining your dotfiles
-for your entire programming career, and version history is nice to have for
-long-lived projects.
-
-What should you put in your dotfiles?
-You can learn about your tool's settings by reading online documentation or
-[man pages](https://en.wikipedia.org/wiki/Man_page). Another great way is to
-search the internet for blog posts about specific programs, where authors will
-tell you about their preferred customizations. Yet another way to learn about
-customizations is to look through other people's dotfiles: you can find tons of
-[dotfiles
-repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories)
-on Github --- see the most popular one
-[here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly
-copy configurations though).
-[Here](https://dotfiles.github.io/) is another good resource on the topic.
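
To make the idea of dotfiles being symlinked into place by a script more concrete, here is a minimal sketch of such a script. The function name `link_dotfiles` and the repository layout (files stored without the leading dot) are assumptions made for illustration, not a prescribed structure, and the demo runs against temporary directories rather than a real home directory.

```bash
# Hypothetical helper: symlink every file in a dotfiles folder into a target
# directory, adding the leading dot (vimrc -> .vimrc).
link_dotfiles() {
    src_dir=$1
    dest_dir=$2
    for f in "$src_dir"/*; do
        [ -e "$f" ] || continue            # nothing to do for an empty folder
        name=$(basename "$f")
        ln -sfn "$f" "$dest_dir/.$name"    # -f replaces an existing entry
    done
}

# Demo in throwaway directories so nothing in $HOME is touched.
demo_repo=$(mktemp -d)
demo_home=$(mktemp -d)
echo 'set number' > "$demo_repo/vimrc"
echo 'set -g mouse on' > "$demo_repo/tmux.conf"
link_dotfiles "$demo_repo" "$demo_home"
ls -la "$demo_home"
```

Run for real, this would be something like `link_dotfiles ~/dotfiles "$HOME"`. Because `-f` replaces whatever is already at the destination, it is worth backing up existing configuration files first.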
-
-All of the class instructors have their dotfiles publicly accessible on GitHub: [Anish](https://github.com/anishathalye/dotfiles),
+How should you organize your dotfiles? They should live in their own folder, under version control, and be **symlinked** into place using a script. This has the following benefits:
+
+- **Easy installation**: if you log in to a new machine, applying your customizations takes only a few minutes.
+- **Portability**: your tools work with the same configuration everywhere.
+- **Synchronization**: you can update your dotfiles in one place and keep them in sync everywhere else.
+- **Change tracking**: you will probably maintain your dotfiles for your entire programming career, and version history is valuable for long-lived projects.
+
+What should you put in your dotfiles? You can learn about your tools' settings from online documentation or [man pages](https://en.wikipedia.org/wiki/Man_page). Another way is to search the internet for blog posts about specific programs, where authors share their configurations. Yet another way is to look through other people's dotfiles: you can find countless [dotfiles repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories); the most popular one is [here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly copy configurations though). [Here](https://dotfiles.github.io/) is another good resource on the topic.
+
+The instructors of this lecture all have their dotfiles publicly available on GitHub:
+[Anish](https://github.com/anishathalye/dotfiles),
 [Jon](https://github.com/jonhoo/configs),
 [Jose](https://github.com/jjgo/dotfiles).

From 010f3f61755595cb300ca8e065683205eff97cc5 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 00:55:07 +0800
Subject: [PATCH 361/640] update trans

---
 _2020/command-line.md | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index d0bd4277..2249b0dd 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -211,19 +211,16 @@ alias ll
 # Dotfiles
-Many programs are configured using plain-text files known as _dotfiles_
-(because the file names begin with a `.`, e.g. `~/.vimrc`, so that they are
-hidden in the directory listing `ls` by default).
+Many programs are configured using plain-text files known as *dotfiles* (so called because their file names begin with a `.`, e.g. `~/.vimrc`; for the same reason they are hidden by default and `ls` does not show them).
-Shells are one example of programs configured with such files. On startup, your shell will read many files to load its configuration.
-Depending on the shell, whether you are starting a login and/or interactive the entire process can be quite complex.
-[Here](https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html) is an excellent resource on the topic.
+Shells are one example of programs configured with such files. On startup, your shell reads many files to load its configuration. Depending on the shell, and on whether you are starting a login and/or interactive session, the whole process can be quite complex. [Here](https://blog.flowblok.id.au/2013-02/shell-startup-scripts.html) is an excellent resource on the topic.
-For `bash`, editing your `.bashrc` or `.bash_profile` will work in most systems.
-Here you can include commands that you want to run on startup, like the alias we just described or modifications to your `PATH` environment variable.
-In fact, many programs will ask you to include a line like `export PATH="$PATH:/path/to/program/bin"` in your shell configuration file so their binaries can be found.
+For `bash`, editing your `.bashrc` or `.bash_profile` works on most systems. There you can add commands that you want to run on startup, such as the aliases described above or changes to your environment variables.
+
+In fact, many programs ask you to include a line like `export PATH="$PATH:/path/to/program/bin"` in your shell configuration file so that the shell can find their binaries.
+
+
-Some other examples of tools that can be configured through dotfiles are:
+Some other tools that can be configured through *dotfiles* are:
 - `bash` - `~/.bashrc`, `~/.bash_profile`
 - `git` - `~/.gitconfig`
@@ -424,12 +421,14 @@ Along with customizing your shell, it is worth spending some time figuring out y
 Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. Some of the aspects that you may want to modify in your terminal include:
-- Font choice
-- Color Scheme
-- Keyboard shortcuts
-- Tab/Pane support
-- Scrollback configuration
-- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration).
+- Font choice
+- Color scheme
+- Keyboard shortcuts
+- Tab/pane support
+- Scrollback configuration
+- Performance (some newer terminals like [Alacritty](https://github.com/jwilm/alacritty) or [kitty](https://sw.kovidgoyal.net/kitty/) offer GPU acceleration).
+
+
 # Exercises

From 614a4a54221ce261892bc9918d93f5a3403001df Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 15:38:49 +0800
Subject: [PATCH 362/640] update trans

---
 _2020/command-line.md | 68 ++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 33 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index 2249b0dd..e72c2f4e 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -237,7 +237,7 @@ Shells are one example of programs configured with such files. On startup, your shell
 What should you put in your dotfiles? You can learn about your tools' settings from online documentation or [man pages](https://en.wikipedia.org/wiki/Man_page). Another way is to search the internet for blog posts about specific programs, where authors share their configurations. Yet another way is to look through other people's dotfiles: you can find countless [dotfiles repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories); the most popular one is [here](https://github.com/mathiasbynens/dotfiles) (we advise you not to blindly copy configurations though). [Here](https://dotfiles.github.io/) is another good resource on the topic.
-The instructors of this lecture all have their dotfiles publicly available on GitHub:
+The instructors of this course have also published their dotfiles on GitHub:
 [Anish](https://github.com/anishathalye/dotfiles),
 [Jon](https://github.com/jonhoo/configs),
 [Jose](https://github.com/jjgo/dotfiles).
@@ -245,37 +245,36 @@ Shells are one example of programs configured with such files. On startup, your shell
 ## Portability
-A common pain with dotfiles is that the configurations might not work when working with several machines, e.g. if they have different operating systems or shells. Sometimes you also want some configuration to be applied only in a given machine.
+A common pain with dotfiles is that a configuration might not work across several machines, for example if they run different operating systems or shells. Sometimes you also want some configuration to apply only on a given machine.
+
+There are some tricks for making this easier. If the configuration file supports if-statements, you can use them to apply machine-specific customizations. For example, your shell could have something like:
-There are some tricks for making this easier.
-If the configuration file supports it, use the equivalent of if-statements to
-apply machine specific customizations.
For example, your shell could have something
-like:
 ```bash
 if [[ "$(uname)" == "Linux" ]]; then {do_something}; fi
-# Check before using shell-specific features
+# Check the current shell before using shell-specific features
 if [[ "$SHELL" == "zsh" ]]; then {do_something}; fi
-# You can also make it machine-specific
+# You can also configure things per machine
 if [[ "$(hostname)" == "myServer" ]]; then {do_something}; fi
 ```
-If the configuration file supports it, make use of includes. For example,
-a `~/.gitconfig` can have a setting:
+If the configuration file supports includes, make use of them. For example, a `~/.gitconfig` can contain:
 ```
 [include]
     path = ~/.gitconfig_local
 ```
-And then on each machine, `~/.gitconfig_local` can contain machine-specific
-settings. You could even track these in a separate repository for
-machine-specific settings.
+Then, on each machine, `~/.gitconfig_local` can contain machine-specific settings. You could even track these in a separate repository for machine-specific settings.
 This idea is also useful if you want different programs to share some configurations. For instance, if you want both `bash` and `zsh` to share the same set of aliases you can write them under `.aliases` and have the following block in both:
+
+This idea is also useful if you want different programs to share some configuration. For instance, if you want both `bash` and `zsh` to share the same set of aliases, you can write them in `.aliases` and include the following block in both:
+
+
 ```bash
 # Test if ~/.aliases exists and source it
 if [ -f ~/.aliases ]; then
@@ -285,47 +284,49 @@ fi
 # Remote Machines
-It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it.
+It has become very common for programmers to use remote servers in their everyday work. If you need remote servers to deploy backend software, or you need a server with more computational power, you will end up using a Secure Shell (SSH). Like most of the tools covered, SSH is highly configurable, so it is worth taking the time to learn it.
-To `ssh` into a server you execute a command as follows
+To `ssh` into a server, run a command like the following:
 ```bash
 ssh foo@bar.mit.edu
 ```
-Here we are trying to ssh as user `foo` in server `bar.mit.edu`.
-The server can be specified with a URL (like `bar.mit.edu`) or an IP (something like `foobar@192.168.1.42`). Later we will see that if we modify ssh config file you can access just using something like `ssh bar`.
+Here we are trying to ssh as user `foo` into the server `bar.mit.edu`. The server can be specified with a URL (like `bar.mit.edu`) or an IP address (something like `foobar@192.168.1.42`). Later we will see how to modify the ssh config file so that you can log in with just something like `ssh bar`.
 ## Executing commands
-An often overlooked feature of `ssh` is the ability to run commands directly.
-`ssh foobar@server ls` will execute `ls` in the home folder of foobar.
-It works with pipes, so `ssh foobar@server ls | grep PATTERN` will grep locally the remote output of `ls` and `ls | ssh foobar@server grep PATTERN` will grep remotely the local output of `ls`.
+An often overlooked feature of `ssh` is the ability to run commands remotely.
+`ssh foobar@server ls` executes `ls` in the home folder of foobar.
+It also works with pipes: `ssh foobar@server ls | grep PATTERN` greps the remote output of `ls` locally, while `ls | ssh foobar@server grep PATTERN` greps the local output of `ls` remotely.
 ## SSH Keys
-Key-based authentication exploits public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to reenter your password every time. Nevertheless, the private key (often `~/.ssh/id_rsa` and more recently `~/.ssh/id_ed25519`) is effectively your password, so treat it like so.
+Key-based authentication uses public-key cryptography to prove to the server that the client owns the secret private key without revealing the key. This way you do not need to enter your password every time you log in. However, the private key (usually `~/.ssh/id_rsa` or `~/.ssh/id_ed25519`) is effectively your password, so keep it safe.
 ### Key generation
-To generate a pair you can run [`ssh-keygen`](http://man7.org/linux/man-pages/man1/ssh-keygen.1.html).
+To generate a key pair you can run [`ssh-keygen`](http://man7.org/linux/man-pages/man1/ssh-keygen.1.html):
+
 ```bash
 ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/id_ed25519
 ```
-You should choose a passphrase, to avoid someone who gets hold of your private key to access authorized servers.
Use [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) or [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) so you do not have to type your passphrase every time.
-If you have ever configured pushing to GitHub using SSH keys, then you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and have a valid key pair already. To check if you have a passphrase and validate it you can run `ssh-keygen -y -f /path/to/key`.
+You should set a passphrase on the key, so that someone who gets hold of your private key cannot use it to access your servers. Use [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) or [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) so you do not have to type the passphrase every time.
+
+
+If you have ever configured pushing to GitHub using SSH keys, you have probably done the steps outlined [here](https://help.github.com/articles/connecting-to-github-with-ssh/) and already have a valid key pair. To check whether your key has a passphrase and validate it, you can run `ssh-keygen -y -f /path/to/key`.
 ### Key based authentication
-`ssh` will look into `.ssh/authorized_keys` to determine which clients it should let in. To copy a public key over you can use:
+`ssh` looks into `.ssh/authorized_keys` to determine which clients it should let in. To copy a public key over, you can use:
 ```bash
 cat .ssh/id_ed25519.pub | ssh foobar@remote 'cat >> ~/.ssh/authorized_keys'
 ```
-A simpler solution can be achieved with `ssh-copy-id` where available:
+A simpler solution is to use `ssh-copy-id`, where available:
 ```bash
 ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote
@@ -333,7 +334,7 @@ ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote
 ## Copying files over SSH
-There are many ways to copy files over ssh:
+There are several ways to copy files over ssh:
 - `ssh+tee`, the simplest is to use `ssh` command execution and STDIN input by doing `cat localfile | ssh remote_server tee serverfile`. Recall that [`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) writes the output from STDIN into a file.
 - [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) when copying large amounts of files/directories, the secure copy `scp` command is more convenient since it can easily recurse over paths.
The syntax is `scp path/to/local_file remote_host:path/to/remote_file`
 - [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) improves upon `scp` by detecting identical files in local and remote, and preventing copying them again. It also provides more fine grained control over symlinks, permissions and has extra features like the `--partial` flag that can resume from a previously interrupted copy. `rsync` has a similar syntax to `scp`.
@@ -406,12 +407,13 @@ For example, the `zsh` shell is a superset of `bash` and provides many convenien
 **Frameworks** can improve your shell as well. Some popular general frameworks are [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussll/oh-my-zsh), and smaller ones that focus on specific features such as [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Shells like [fish](https://fishshell.com/) include many of these user-friendly features by default. Some of these features include:
-- Right prompt
-- Command syntax highlighting
-- History substring search
-- manpage based flag completions
-- Smarter autocompletion
-- Prompt themes
+- Right-aligned prompt
+- Command syntax highlighting
+- History substring search
+- Flag completions based on man pages
+- Smarter autocompletion
+- Prompt themes
+
 One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or it is too much code. You can always profile it and disable the features that you do not use often or value over speed.

From 28043cbb34c976072a0192f988ec590b34d343c2 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 16:37:05 +0800
Subject: [PATCH 363/640] update trans

---
 _2020/command-line.md | 47 ++++++++++++++++++++-----------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index e72c2f4e..3678ab8a 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -269,12 +269,8 @@ if [[ "$(hostname)" == "myServer" ]]; then {do_something}; fi
 Then, on each machine, `~/.gitconfig_local` can contain machine-specific settings. You could even track these in a separate repository for machine-specific settings.
-This idea is also useful if you want different programs to share some configurations.
For instance, if you want both `bash` and `zsh` to share the same set of aliases you can write them under `.aliases` and have the following block in both:
-
-
 This idea is also useful if you want different programs to share some configuration. For instance, if you want both `bash` and `zsh` to share the same set of aliases, you can write them in `.aliases` and include the following block in both:
-
 ```bash
 # Test if ~/.aliases exists and source it
 if [ -f ~/.aliases ]; then
@@ -336,34 +332,37 @@ ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote
 There are several ways to copy files over ssh:
-- `ssh+tee`, the simplest is to use `ssh` command execution and STDIN input by doing `cat localfile | ssh remote_server tee serverfile`. Recall that [`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) writes the output from STDIN into a file.
-- [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) when copying large amounts of files/directories, the secure copy `scp` command is more convenient since it can easily recurse over paths. The syntax is `scp path/to/local_file remote_host:path/to/remote_file`
-- [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) improves upon `scp` by detecting identical files in local and remote, and preventing copying them again. It also provides more fine grained control over symlinks, permissions and has extra features like the `--partial` flag that can resume from a previously interrupted copy. `rsync` has a similar syntax to `scp`.
+- `ssh+tee`: the simplest way is to use `ssh` command execution with STDIN as input, by running `cat localfile | ssh remote_server tee serverfile`. Recall that [`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) writes what it reads from STDIN into a file.
+- [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html): when copying large numbers of files or directories, the `scp` command is more convenient since it can easily recurse over paths. The syntax is `scp path/to/local_file remote_host:path/to/remote_file`
+- [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) improves upon `scp` by detecting identical files on the local and remote sides and not copying them again. It also offers finer-grained control over symlinks and permissions, and extra features such as the `--partial` flag, which can resume a previously interrupted copy. `rsync` has a similar syntax to `scp`.
 ## Port Forwarding
-In many scenarios you will run into software that listens to specific ports in the machine.
When this happens in your local machine you can type `localhost:PORT` or `127.0.0.1:PORT`, but what do you do with a remote server that does not have its ports directly available through the network/internet?.
+In many scenarios you will run into software that listens on specific ports of a machine. On your local machine you can reach it at `localhost:PORT` or `127.0.0.1:PORT`. But what do you do with a remote server whose ports are not directly reachable over the network?
+
+This is where *port forwarding* comes in. It comes in two flavors: local port forwarding and remote port forwarding (see the pictures below, taken from [this StackOverflow post](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)).
-This is called _port forwarding_ and it
-comes in two flavors: Local Port Forwarding and Remote Port Forwarding (see the pictures for more details, credit of the pictures from [this StackOverflow post](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)).
-**Local Port Forwarding**
-![Local Port Forwarding](https://i.stack.imgur.com/a28N8.png "Local Port Forwarding")
+**Local Port Forwarding**
+![Local Port Forwarding](https://i.stack.imgur.com/a28N8.png "Local Port Forwarding")
-**Remote Port Forwarding**
-![Remote Port Forwarding](https://i.stack.imgur.com/4iK3b.png "Remote Port Forwarding")
+**Remote Port Forwarding**
+![Remote Port Forwarding](https://i.stack.imgur.com/4iK3b.png "Remote Port Forwarding")
+
+
+The most common scenario is local port forwarding, where a service on the remote machine listens on a port and you want a port on your local machine to forward to that remote port. For example, suppose we run Jupyter notebook on the remote server, listening on port `8888`. To forward it to local port `9999`, we run `ssh -L 9999:localhost:8888 foobar@remote_server` and then open `localhost:9999` on our local machine.
-The most common scenario is local port forwarding, where a service in the remote machine listens in a port and you want to link a port in your local machine to forward to the remote port. For example, if we execute `jupyter notebook` in the remote server that listens to the port `8888`. Thus, to forward that to the local port `9999`, we would do `ssh -L 9999:localhost:8888 foobar@remote_server` and then navigate to `localhost:9999` in our local machine.
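
The `-L` argument has the shape `local_port:destination_host:destination_port`, where the destination is resolved from the remote server's point of view. The sketch below only assembles and prints the command so that the pieces are visible; `forward_cmd` is a hypothetical helper, `foobar@remote_server` is the placeholder used above, and the `-N` flag (do not run a remote command) is an addition that can be convenient when only the tunnel is wanted.

```bash
# Hypothetical helper: print the ssh command for a local port forward.
#   $1 = local port, $2 = port the service listens on remotely, $3 = user@host
forward_cmd() {
    echo "ssh -N -L $1:localhost:$2 $3"
}

# Local port 9999 forwards to port 8888 as seen from the remote server.
cmd=$(forward_cmd 9999 8888 foobar@remote_server)
echo "$cmd"
```

Here `localhost` in the middle refers to the remote server itself, which is why a service bound only to the remote loopback interface still becomes reachable locally.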
## SSH Configuration
-We have covered many many arguments that we can pass. A tempting alternative is to create shell aliases that look like
+We have covered many arguments that we can pass. A tempting alternative is to create shell aliases that look like:
+
 ```bash
 alias my_server="ssh -i ~/.id_ed25519 -p 2222 -L 9999:localhost:8888 foobar@remote_server"
 ```
-However, there is a better alternative using `~/.ssh/config`.
+However, a better approach is to use `~/.ssh/config`.
 ```bash
 Host vm
@@ -373,24 +372,22 @@ Host vm
     IdentityFile ~/.ssh/id_ed25519
     LocalForward 9999 localhost:8888
-# Configs can also take wildcards
+# Wildcards can also be used in the config file
 Host *.mit.edu
     User foobaz
 ```
-An additional advantage of using the `~/.ssh/config` file over aliases is that other programs like `scp`, `rsync`, `mosh`, &c are able to read it as well and convert the settings into the corresponding flags.
-
+An additional advantage of using the `~/.ssh/config` file over aliases is that other programs such as `scp`, `rsync` and `mosh` can read it as well and convert the settings into the corresponding flags.
-Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine for it to be included with the rest of your dotfiles. However, if you make it public, think about the information that you are potentially providing strangers on the internet: addresses of your servers, users, open ports, &c. This may facilitate some types of attacks so be thoughtful about sharing your SSH configuration.
+Note that the `~/.ssh/config` file can be considered a dotfile, and in general it is fine to keep it with the rest of your dotfiles. However, if you make it public, anyone on the internet can see the addresses of your servers, your user names, open ports and so on. This information may help people trying to attack your systems, so think carefully before sharing your SSH configuration.
-Server side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes like disabling password authentication, changing ssh ports, enabling X11 forwarding, &c. You can specify config settings on a per user basis.
+Server-side configuration is usually specified in `/etc/ssh/sshd_config`. Here you can make changes such as disabling password authentication, changing the ssh port, or enabling X11 forwarding. You can also specify settings on a per-user basis.
 ## Miscellaneous
-A common pain when connecting to a remote server are disconnections due to shutting down/sleeping your computer or changing a network. Moreover if one has a connection with significant lag using ssh can become quite frustrating.
[Mosh](https://mosh.org/), the mobile shell, improves upon ssh, allowing roaming connections, intermittent connectivity and providing intelligent local echo.
+A common pain when connecting to a remote server is disconnections caused by shutting down or sleeping your computer, or by changing networks. A connection with significant lag is also frustrating. [Mosh](https://mosh.org/), the mobile shell, improves upon ssh by allowing roaming connections and intermittent connectivity and by providing intelligent local echo.
-Sometimes it is convenient to mount a remote folder. [sshfs](https://github.com/libfuse/sshfs) can mount a folder on a remote server
-locally, and then you can use a local editor.
+Sometimes it is convenient to mount a remote folder locally. [sshfs](https://github.com/libfuse/sshfs) can mount a folder from a remote server on your local machine, so that you can use a local editor.
 # Shells & Frameworks

From aa5d51fbd63ddb72483042ba850a044190ae4fe0 Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Sun, 24 May 2020 16:53:00 +0800
Subject: [PATCH 364/640] update trans

---
 _2020/command-line.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/_2020/command-line.md b/_2020/command-line.md
index 3678ab8a..0dcc235f 100644
--- a/_2020/command-line.md
+++ b/_2020/command-line.md
@@ -392,17 +392,17 @@ Host *.mit.edu
 # Shells & Frameworks
-During shell tool and scripting we covered the `bash` shell because it is by far the most ubiquitous shell and most systems have it as the default option. Nevertheless, it is not the only option.
+In the shell tools and scripting lecture we covered the `bash` shell because it is by far the most ubiquitous shell, and most systems use it as the default. However, it is not the only option.
-For example, the `zsh` shell is a superset of `bash` and provides many convenient features out of the box such as:
+For example, the `zsh` shell is a superset of `bash` and provides some convenient features:
-- Smarter globbing, `**`
-- Inline globbing/wildcard expansion
-- Spelling correction
-- Better tab completion/selection
-- Path expansion (`cd /u/lo/b` will expand as `/usr/local/bin`)
+- Smarter globbing, `**`
+- Inline globbing/wildcard expansion
+- Spelling correction
+- Better tab completion and selection
+- Path expansion (`cd /u/lo/b` expands to `/usr/local/bin`)
-**Frameworks** can improve your shell as well.
Some popular general frameworks are [prezto](https://github.com/sorin-ionescu/prezto) or [oh-my-zsh](https://github.com/robbyrussll/oh-my-zsh), and smaller ones that focus on specific features such as [zsh-syntax-highlighting](https://github.com/zsh-users/zsh-syntax-highlighting) or [zsh-history-substring-search](https://github.com/zsh-users/zsh-history-substring-search). Shells like [fish](https://fishshell.com/) include many of these user-friendly features by default. Some of these features include: +**框架** 也可以改进您的 shell。比较流行的通用框架包括[prezto](https://github.com/sorin-ionescu/prezto) 或 [oh-my-zsh](https://github.com/robbyrussll/oh-my-zsh)。还有一些更精简的框架,它们往往专注于某一个特定功能,例如[zsh 语法高亮](https://github.com/zsh-users/zsh-syntax-highlighting) 或 [zsh 历史子串查询](https://github.com/zsh-users/zsh-history-substring-search)。 像 [fish](https://fishshell.com/) 这样的 shell 包含了很多用户友好的功能,其中一些特性包括: - 向右对齐 - 命令语法高亮 @@ -411,14 +411,14 @@ For example, the `zsh` shell is a superset of `bash` and provides many convenien - 更智能的自动补全 - 提示符主题 - -One thing to note when using these frameworks is that they may slow down your shell, especially if the code they run is not properly optimized or it is too much code. You can always profile it and disable the features that you do not use often or value over speed. +需要注意的是,使用这些框架可能会降低您 shell 的性能,尤其是如果这些框架的代码没有优化或者代码过多。您随时可以测试其性能或禁用某些不常用的功能来实现速度与功能的平衡。 # 终端模拟器 -Along with customizing your shell, it is worth spending some time figuring out your choice of **terminal emulator** and its settings. There are many many terminal emulators out there (here is a [comparison](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)). +和自定义 shell 一样,花点时间选择适合您的 **终端模拟器**并进行设置是很有必要的。有许多终端模拟器可供您选择(这里有一些关于它们之间[比较](https://anarc.at/blog/2018-04-12-terminal-emulators-1/)的信息) + -Since you might be spending hundreds to thousands of hours in your terminal it pays off to look into its settings. 
Some of the aspects that you may want to modify in your terminal include: +您会花上很多时间在使用终端上,因此研究一下终端的设置是很有必要的,您可以从下面这些方面来配置您的终端: - 字体选择 - 彩色主题 From 56a55d51cea316edd32b7be4cd02bb3100b78472 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 24 May 2020 17:26:02 +0800 Subject: [PATCH 365/640] udate trans --- _2020/command-line.md | 50 ++++++++++++++++++++----------------------- 1 file changed, 23 insertions(+), 27 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 0dcc235f..a9c06791 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -433,44 +433,40 @@ Host *.mit.edu ## 任务控制 -1. From what we have seen, we can use some `ps aux | grep` commands to get our jobs' pids and then kill them, but there are better ways to do it. Start a `sleep 10000` job in a terminal, background it with `Ctrl-Z` and continue its execution with `bg`. Now use [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to find its pid and [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to kill it without ever typing the pid itself. (Hint: use the `-af` flags). +1. 我们可以使用类似 `ps aux | grep` 这样的命令来获取任务的 pid ,然后您可以基于pid 来结束这些进程。但我们其实有更好的方法来做这件事。在终端中执行 `sleep 10000` 这个任务。然后用 `Ctrl-Z` 将其切换到后台并使用 `bg`来继续允许它。现在,使用 [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 来查找 pid 并使用 [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to结束进程而不需要手动输入pid。(提示:: 使用 `-af` 标记)。 -1. Say you don't want to start a process until another completes, how you would go about it? In this exercise our limiting process will always be `sleep 60 &`. -One way to achieve this is to use the [`wait`](http://man7.org/linux/man-pages/man1/wait.1p.html) command. Try launching the sleep command and having an `ls` wait until the background process finishes. +2. 
如果您希望某个进程结束后再开始另外一个进程, 应该如何实现呢?在这个练习中,我们使用 `sleep 60 &` 作为先执行的程序。一种方法是使用 [`wait`](http://man7.org/linux/man-pages/man1/wait.1p.html) 命令。尝试启动这个休眠命令,然后待其结束后再执行 `ls` 命令。 - However, this strategy will fail if we start in a different bash session, since `wait` only works for child processes. One feature we did not discuss in the notes is that the `kill` command's exit status will be zero on success and nonzero otherwise. `kill -0` does not send a signal but will give a nonzero exit status if the process does not exist. - Write a bash function called `pidwait` that takes a pid and waits until the given process completes. You should use `sleep` to avoid wasting CPU unnecessarily. + 但是,如果我们在不同的 bash 会话中进行操作,则上述方法就不起作用来。因为 `wait` 只能对子进程起作用。之前我们没有提过的一个特性是,`kill` 命令成功退出时其状态码为 0 ,其他状态则是非0。`kill -0` 则不会发送信号,但是会在进程不存在时返回一个不为0的状态码。请编写一个 bash 函数 `pidwait` ,它接受一个 pid 作为输入参数,然后一直等待直到该进程结束。您需要使用 `sleep` 来避免浪费 CPU 性能。 ## 终端多路复用 -1. Follow this `tmux` [tutorial](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) and then learn how to do some basic customizations following [these steps](https://www.hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/). +1. 请完成这个 `tmux` [教程](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) 参考[这些步骤](https://www.hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/)来学习如何自定义 `tmux`。 ## 别名 -1. Create an alias `dc` that resolves to `cd` for when you type it wrongly. - -1. Run `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` to get your top 10 most used commands and consider writing shorter aliases for them. Note: this works for Bash; if you're using ZSH, use `history 1` instead of just `history`. +1. 创建一个 `dc` 别名,它的功能是当我们错误的将 `cd` 输入为 `dc` 时也能正确执行。 +2. 执行 `history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10` 来获取您最常用的十条命令,尝试为它们创建别名。注意:这个命令只在 Bash 中生效,如果您使用 ZSH,使用`history 1` 替换 `history`。 ## 配置文件 +让我们帮助您进一步学习配置文件: + +1. 为您的配置文件新建一个文件夹,并设置好版本控制 +2. 
在其中添加至少一个配置文件,比如说你的 shell,在其中包含一些自定义设置(可以从设置 `$PS1` 开始)。 +3. 建立一种在新设备进行快速安装配置的方法(无需手动操作)。最简单的方法是写一个 shell 脚本对每个文件使用 `ln -s`,也可以使用[专用工具](https://dotfiles.github.io/utilities/) +4. 在新的虚拟机上测试该安装脚本。 +5. 将您现有的所有配置文件移动到项目仓库里。 +6. 将项目发布到GitHub。 + -Let's get you up to speed with dotfiles. -1. Create a folder for your dotfiles and set up version - control. -1. Add a configuration for at least one program, e.g. your shell, with some - customization (to start off, it can be something as simple as customizing your shell prompt by setting `$PS1`). -1. Set up a method to install your dotfiles quickly (and without manual effort) on a new machine. This can be as simple as a shell script that calls `ln -s` for each file, or you could use a [specialized - utility](https://dotfiles.github.io/utilities/). -1. Test your installation script on a fresh virtual machine. -1. Migrate all of your current tool configurations to your dotfiles repository. -1. Publish your dotfiles on GitHub. ## 远端设备 -Install a Linux virtual machine (or use an already existing one) for this exercise. If you are not familiar with virtual machines check out [this](https://hibbard.eu/install-ubuntu-virtual-box/) tutorial for installing one. +进行下面的练习需要您先安装一个 Linux 虚拟机(如果已经安装过则可以直接使用),如果您对虚拟机尚不熟悉,可以参考[这篇教程](https://hibbard.eu/install-ubuntu-virtual-box/) 来进行安装。 -1. Go to `~/.ssh/` and check if you have a pair of SSH keys there. If not, generate them with `ssh-keygen -o -a 100 -t ed25519`. It is recommended that you use a password and use `ssh-agent` , more info [here](https://www.ssh.com/ssh/agent). -1. Edit `.ssh/config` to have an entry as follows +1. 前往 `~/.ssh/` 并查看是否已经存在 SSH 密钥对。如果不存在,请使用`ssh-keygen -o -a 100 -t ed25519`来创建一个。建议为密钥设置密码然后使用`ssh-agent`,更多信息可以参考 [这里](https://www.ssh.com/ssh/agent); +2. 在`.ssh/config`加入下面内容: ```bash Host vm @@ -479,8 +475,8 @@ Host vm IdentityFile ~/.ssh/id_ed25519 LocalForward 9999 localhost:8888 ``` -1. Use `ssh-copy-id vm` to copy your ssh key to the server. -1. 
Start a webserver in your VM by executing `python -m http.server 8888`. Access the VM webserver by navigating to `http://localhost:9999` in your machine. -1. Edit your SSH server config by doing `sudo vim /etc/ssh/sshd_config` and disable password authentication by editing the value of `PasswordAuthentication`. Disable root login by editing the value of `PermitRootLogin`. Restart the `ssh` service with `sudo service sshd restart`. Try sshing in again. -1. (Challenge) Install [`mosh`](https://mosh.org/) in the VM and establish a connection. Then disconnect the network adapter of the server/VM. Can mosh properly recover from it? -1. (Challenge) Look into what the `-N` and `-f` flags do in `ssh` and figure out what a command to achieve background port forwarding. +1. 使用 `ssh-copy-id vm` 将您的 ssh 密钥拷贝到服务器。 +2. 使用`python -m http.server 8888` 在您的虚拟机中启动一个 Web 服务器并通过本机的`http://localhost:9999` 访问虚拟机上的 Web 服务器 +3. 使用`sudo vim /etc/ssh/sshd_config` 编辑 SSH 服务器配置,通过修改`PasswordAuthentication`的值来禁用密码验证。通过修改`PermitRootLogin`的值来禁用 root 登陆。然后使用`sudo service sshd restart`重启 `ssh` 服务器,然后重新尝试。 +4. (附加题) 在虚拟机中安装 [`mosh`](https://mosh.org/) 并启动连接。然后断开服务器/虚拟机的网络适配器。mosh可以恢复连接吗? +5. (附加题) 查看`ssh`的`-N` 和 `-f` 选项的作用,找出在后台进行端口转发的命令是什么? 
\ No newline at end of file From aedfcdd4c490391e4750fb31aef7b90d9fdd8632 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 24 May 2020 17:30:51 +0800 Subject: [PATCH 366/640] fix link --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index a9c06791..0ed0a790 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -340,7 +340,7 @@ ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote 很多情况下我们都会遇到软件需要监听特定设备的端口。如果是在您的本机,可以使用 `localhost:PORT` 或 `127.0.0.1:PORT`。但是如果需要监听远程服务器的端口该如何操作呢?这种情况下远端的端口并不会直接通过网络暴露给您。 -此时就需要进行 *端口转发*。端口转发有两种,一种是本地端口转发和远程端口转发(参见下图,该图片引用自这篇[StackOverflow 文章](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot)中的图片。 +此时就需要进行 *端口转发*。端口转发有两种,一种是本地端口转发和远程端口转发(参见下图,该图片引用自这篇[StackOverflow 文章](https://unix.stackexchange.com/questions/115897/whats-ssh-port-forwarding-and-whats-the-difference-between-ssh-local-and-remot))中的图片。 **本地端口转发** From bafe06932c3281716ef83a6743dd44bd862c05dd Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 24 May 2020 19:29:52 +0800 Subject: [PATCH 367/640] fix errors --- _2020/command-line.md | 86 +++++++++++++++++++++---------------------- 1 file changed, 42 insertions(+), 44 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 0ed0a790..b1f7bc5b 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -9,22 +9,23 @@ video: --- 当您使用 shell 进行工作时,可以使用一些方法改善您的工作流,本节课我们就来讨论这些方法。 -我们以及使用 shell 一段时间了,但是到目前为止我们的关注点集中在使用不同的命令行。现在,我们将会学习如何同时执行多个不同的进程并追踪它们的状态、停止或暂停某个进程以及如何使进程在后台运行。 -我们还将学习一些能够改善您的 shell 及其他工具的工作流的方法,主要途径是通过定义别名或基于配置文件对其进行配置。这些方法都可以帮您节省大量的时间。例如,仅需要执行一些简单的命令,我们就可以是在所有的主机上使用相同的配置。我们还会学习如何使用 SSH 操作远端机器。 +我们已经使用 shell 一段时间了,但是到目前为止我们的关注点主要集中在使用不同的命令上面。现在,我们将会学习如何同时执行多个不同的进程并追踪它们的状态、如何停止或暂停某个进程以及如何使进程在后台运行。 + +我们还将学习一些能够改善您的 shell 
及其他工具的工作流的方法,这主要是通过定义别名或基于配置文件对其进行配置来实现的。这些方法都可以帮您节省大量的时间。例如,仅需要执行一些简单的命令,我们就可以在所有的主机上使用相同的配置。我们还会学习如何使用 SSH 操作远端机器。 # 任务控制 -某些情况下我们需要在任务执行时将其中断,例如当一个命令需要执行很长时间才能完成时(比如使用 `find` 搜索一个非常大的目录结构时)。大多数情况下,我们可以使用 `Ctrl-C` 来停止命令的执行。但是它的工作原理是什么呢?为什么有的时候会无法结束进程? +某些情况下我们需要中断正在执行的任务,比如当一个命令需要执行很长时间才能完成时(假设我们在使用 `find` 搜索一个非常大的目录结构)。大多数情况下,我们可以使用 `Ctrl-C` 来停止命令的执行。但是它的工作原理是什么呢?为什么有的时候会无法结束进程? ## 结束进程 您的 shell 会使用 UNIX 提供的信号机制执行进程间通信。当一个进程接收到信号时,它会停止执行、处理该信号并基于信号传递的信息来改变其执行。就这一点而言,信号是一种*软件中断*。 -就上述例子而言,当我们输入 `Ctrl-C` 时,shell 会发送一个`SIGINT` 信号到进程。 +在上面的例子中,当我们输入 `Ctrl-C` 时,shell 会发送一个`SIGINT` 信号到进程。 -下面这个Python程序向您展示了捕获信号`SIGINT` 并忽略它的基本操作,它并不会让程序停止。为了停止这个程序,我们需要使用`SIGQUIT` 信号,通过输入`Ctrl-\`可以发送该信号。 +下面这个 Python 程序向您展示了捕获信号`SIGINT` 并忽略它的基本操作,它并不会让程序停止。为了停止这个程序,我们需要使用`SIGQUIT` 信号,通过输入`Ctrl-\`可以发送该信号。 ```python #!/usr/bin/env python @@ -52,7 +53,7 @@ I got a SIGINT, but I am not stopping 30^\[1] 39913 quit python sigint.py ``` -尽管 `SIGINT` 和 `SIGQUIT` 都常常用来发出和终止程序相关都请求。`SIGTERM` 则是一个更加通用的,让程序优雅地退出的信号。为了发出这个信号我们需要使用[`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) 命令, 它的语法是: `kill -TERM `. 
+尽管 `SIGINT` 和 `SIGQUIT` 都常常用来发出和终止程序相关的请求。`SIGTERM` 则是一个更加通用的、也更加优雅地退出信号。为了发出这个信号我们需要使用 [`kill`](http://man7.org/linux/man-pages/man1/kill.1.html) 命令, 它的语法是: `kill -TERM `。 ## 暂停和后台执行进程 @@ -61,14 +62,13 @@ I got a SIGINT, but I am not stopping 我们可以使用 [`fg`](http://man7.org/linux/man-pages/man1/fg.1p.html) 或 [`bg`](http://man7.org/linux/man-pages/man1/bg.1p.html) 命令恢复暂停的工作。它们分别表示在前台继续或在后台继续。 -[`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) 命令会列出当前终端会话中尚未完成的全部任务。您可以使用 pid 引用这些任务(可以用 [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 找出 pid)。更加符合直觉的操作是,您可以使用百分号 + 任务编号(`jobs` 会打印任务编号)来选取该任务。如果要选择最近的一个任务,可以使用 `$!` 这一特别参数。 - -还有一件事情需要掌握,那就是命令中的 `&` 后缀可以让命令在直接在后台运行,这使得您可以直接在 shell 中继续做其他操作,不过它此时还是会使用 shell 的标准输出,这一点有时候会比较恼人(这种情况可以使用 shell 重定向处理)。 +[`jobs`](http://man7.org/linux/man-pages/man1/jobs.1p.html) 命令会列出当前终端会话中尚未完成的全部任务。您可以使用 pid 引用这些任务(也可以用 [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 找出 pid)。更加符合直觉的操作是您可以使用百分号 + 任务编号(`jobs` 会打印任务编号)来选取该任务。如果要选择最近的一个任务,可以使用 `$!` 这一特殊参数。 -让已经在运行的进程转到后台运行,您可以键入`Ctrl-Z` ,然后紧接着再输入`bg`。注意,后台的进程仍然是您的终端进程的子进程,一旦您关闭来终端(会发送另外一个信号`SIGHUP`),这些后台的进程也会终止。为了防止这种情况发生,您可以使用 [`nohup`](http://man7.org/linux/man-pages/man1/nohup.1.html) (一个用来忽略 `SIGHUP` 的封装) 来运行程序。针对已经运行的程序,可以使用`disown` 。除此之外,您可以使用终端多路复用器来实现,下一章节我们会进行详细地探讨。 +还有一件事情需要掌握,那就是命令中的 `&` 后缀可以让命令在直接在后台运行,这使得您可以直接在 shell 中继续做其他操作,不过它此时还是会使用 shell 的标准输出,这一点有时会比较恼人(这种情况可以使用 shell 重定向处理)。 -我们在下面这个简单的会话中展示来这些概念的应用。 +让已经在运行的进程转到后台运行,您可以键入`Ctrl-Z` ,然后紧接着再输入`bg`。注意,后台的进程仍然是您的终端进程的子进程,一旦您关闭终端(会发送另外一个信号`SIGHUP`),这些后台的进程也会终止。为了防止这种情况发生,您可以使用 [`nohup`](http://man7.org/linux/man-pages/man1/nohup.1.html) (一个用来忽略 `SIGHUP` 的封装) 来运行程序。针对已经运行的程序,可以使用`disown` 。除此之外,您可以使用终端多路复用器来实现,下一章节我们会进行详细地探讨。 +下面这个简单的会话中展示来了些概念的应用。 ``` $ sleep 1000 @@ -117,7 +117,7 @@ $ jobs `SIGKILL` 是一个特殊的信号,它不能被进程捕获并且它会马上结束该进程。不过这样做会有一些副作用,例如留下孤儿进程。 -你可以在 [here](https://en.wikipedia.org/wiki/Signal_(IPC)) 或输入 [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html) 或使用 `kill -t` 
来获取更多关于信号的信息。 +您可以在 [这里](https://en.wikipedia.org/wiki/Signal_(IPC)) 或输入 [`man signal`](http://man7.org/linux/man-pages/man7/signal.7.html) 或使用 `kill -t` 来获取更多关于信号的信息。 # 终端多路复用 @@ -128,8 +128,7 @@ $ jobs 不仅如此,终端多路复用使我们可以分离当前终端会话并在将来重新连接。 -这让你操作远端设备时的工作流大大改善,避免了 `nohup` 和其他类似技巧的使用。 - +这让您操作远端设备时的工作流大大改善,避免了 `nohup` 和其他类似技巧的使用。 现在最流行的终端多路器是 [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html)。`tmux` 是一个高度可定制的工具,您可以使用相关快捷键创建多个标签页并在它们间导航。 @@ -150,23 +149,22 @@ $ jobs + ` ,` 重命名当前窗口 + ` w` 列出当前所有窗口 -- **面板** - 像vim中的分屏一样,面板使我们可以在一个屏幕里显示多个shell +- **面板** - 像 vim 中的分屏一样,面板使我们可以在一个屏幕里显示多个 shell + ` "` 水平分割 + ` %` 垂直分割 + ` <方向>` 切换到指定方向的面板,<方向> 指的是键盘上的方向键 + ` z` 切换当前面板的缩放 - + ` [` 开始往回卷动屏幕。你可以按下空格键来开始选择,回车键复制选中的部分 + + ` [` 开始往回卷动屏幕。您可以按下空格键来开始选择,回车键复制选中的部分 + ` <空格>` 在不同的面板排布间切换 扩展阅读: -[这里](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) 是一份快速入门 `tmux` 的教程, [而这一篇](http://linuxcommand.org/lc3_adv_termmux.php) 文章则更加详细。它包含来原本的 `screen` 命令。您也许想要掌握 [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html) 命令,因为在大多数 UNIX 系统中都默认安装有该程序。 +[这里](https://www.hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/) 是一份 `tmux` 快速入门教程, [而这一篇](http://linuxcommand.org/lc3_adv_termmux.php) 文章则更加详细,它包含了 `screen` 命令。您也许想要掌握 [`screen`](http://man7.org/linux/man-pages/man1/screen.1.html) 命令,因为在大多数 UNIX 系统中都默认安装有该程序。 # 别名 输入一长串包含许多选项的命令会非常麻烦。因此,大多数 shell 都支持设置别名。shell 的别名相当于一个长命令的缩写,shell 会自动将其替换成原本的命令。例如,bash 中的别名语法如下: - ```bash alias alias_name="command_to_alias arg1 arg2" ``` @@ -176,37 +174,37 @@ alias alias_name="command_to_alias arg1 arg2" 别名有许多很方便的特性: ```bash -# Make shorthands for common flags +# 创建常用命令的缩写 alias ll="ls -lh" -# Save a lot of typing for common commands +# 能够少输入很多 alias gs="git status" alias gc="git commit" alias v="vim" -# Save you from mistyping +# 手误打错命令也没关系 alias sl=ls -# Overwrite existing commands for better defaults +# 重新定义一些命令行的默认行为 alias mv="mv -i" # -i prompts before overwrite alias mkdir="mkdir -p" # -p make parent dirs as needed alias df="df 
-h" # -h prints human readable format -# Alias can be composed +# 别名可以组合使用 alias la="ls -A" alias lla="la -l" -# To ignore an alias run it prepended with \ +# 在忽略某个别名 \ls -# Or disable an alias altogether with unalias +# 或者禁用别名 unalias la -# To get an alias definition just call it with alias +# 获取别名的定义 alias ll -# Will print ll='ls -lh' +# 会打印 ll='ls -lh' ``` -值得注意的是,在默认情况下,shell 并不会保存别名。为了让别名持续生效,你需要将配置放进 shell 的启动文件里,像是`.bashrc` 或 `.zshrc`,下一节我们就会讲到。 +值得注意的是,在默认情况下 shell 并不会保存别名。为了让别名持续生效,您需要将配置放进 shell 的启动文件里,像是`.bashrc` 或 `.zshrc`,下一节我们就会讲到。 # 配置文件(Dotfiles) @@ -224,18 +222,18 @@ shell 的配置也是通过这类文件完成的。在启动时,您的 shell - `bash` - `~/.bashrc`, `~/.bash_profile` - `git` - `~/.gitconfig` -- `vim` - `~/.vimrc` 和 `~/.vim` folder +- `vim` - `~/.vimrc` 和 `~/.vim` 目录 - `ssh` - `~/.ssh/config` - `tmux` - `~/.tmux.conf` 我们应该如何管理这些配置文件呢,它们应该在它们的文件夹下,并使用版本控制系统进行管理,然后通过脚本将其 **符号链接** 到需要的地方。这么做有如下好处: - **安装简单**: 如果您登陆了一台新的设备,在这台设备上应用您的配置只需要几分钟的时间; -- **可以执行**: 你的工具在任何地方都以相同的配置工作 +- **可以执行**: 您的工具在任何地方都以相同的配置工作 - **同步**: 在一处更新配置文件,可以同步到其他所有地方 -- **变更追踪**: 你可能要在整个程序员生涯中持续维护这些配置文件,而对于长期项目而言,版本历史是非常重要的 +- **变更追踪**: 您可能要在整个程序员生涯中持续维护这些配置文件,而对于长期项目而言,版本历史是非常重要的 -配置文件中需要放些什么?你可以通过在线文档和[man pages](https://en.wikipedia.org/wiki/Man_page)了解所使用工具的设置项。另一个方法是在网上搜索有关特定程序的文章,作者们在文章中会分享他们的配置。还有一种方法就是直接浏览其他人的配置文件:您可以在这里找到无数的[dotfiles repositories](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories) —— 其中最受欢迎的那些可以在[这里](https://github.com/mathiasbynens/dotfiles)找到(我们建议你不要直接复制别人的配置)。[这里](https://dotfiles.github.io/) 也有一些非常有用的资源。 +配置文件中需要放些什么?您可以通过在线文档和[帮助手册](https://en.wikipedia.org/wiki/Man_page)了解所使用工具的设置项。另一个方法是在网上搜索有关特定程序的文章,作者们在文章中会分享他们的配置。还有一种方法就是直接浏览其他人的配置文件:您可以在这里找到无数的[dotfiles 仓库](https://github.com/search?o=desc&q=dotfiles&s=stars&type=Repositories) —— 其中最受欢迎的那些可以在[这里](https://github.com/mathiasbynens/dotfiles)找到(我们建议您不要直接复制别人的配置)。[这里](https://dotfiles.github.io/) 也有一些非常有用的资源。 本课程的老师们也在 GitHub 上开源了他们的配置文件: [Anish](https://github.com/anishathalye/dotfiles), @@ -269,7 
+267,7 @@ if [[ "$(hostname)" == "myServer" ]]; then {do_something}; fi 然后我们可以在每天设备上创建配置文件 `~/.gitconfig_local` 来包含与该设备相关的特定配置。您甚至应该创建一个单独的代码仓库来管理这些与设备相关的配置。 -如果您希望在不同的程序之间共享某些配置,该方法也适用。例如,如果你想要在 `bash` 和 `zsh` 中同时启用一些别名,你可以把它们写在 `.aliases` 里,然后在这两个 shell 里应用: +如果您希望在不同的程序之间共享某些配置,该方法也适用。例如,如果您想要在 `bash` 和 `zsh` 中同时启用一些别名,您可以把它们写在 `.aliases` 里,然后在这两个 shell 里应用: ```bash # Test if ~/.aliases exists and source it @@ -299,7 +297,7 @@ ssh foo@bar.mit.edu ## SSH 密钥 -基于密钥的验证机制利用了密码学中的公钥,我们只需要向服务器证明客户端持有对应的私钥,而不需要公开其私钥。这样您就可以避免每次登陆都输入密码的麻烦了秘密就可以登陆。不过,私钥(通常是 `~/.ssh/id_rsa` 或者 `~/.ssh/id_ed25519`) 等效于您的密码,所以一定要好好保存它。 +基于密钥的验证机制使用了密码学中的公钥,我们只需要向服务器证明客户端持有对应的私钥,而不需要公开其私钥。这样您就可以避免每次登陆都输入密码的麻烦了秘密就可以登陆。不过,私钥(通常是 `~/.ssh/id_rsa` 或者 `~/.ssh/id_ed25519`) 等效于您的密码,所以一定要好好保存它。 ### 密钥生成 @@ -312,7 +310,7 @@ ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/id_ed25519 您可以为密钥设置密码,防止有人持有您的私钥并使用它访问您的服务器。您可以使用 [`ssh-agent`](http://man7.org/linux/man-pages/man1/ssh-agent.1.html) 或 [`gpg-agent`](https://linux.die.net/man/1/gpg-agent) ,这样就不需要每次都输入该密码了。 -如果你曾经配置过使用 SSH 密钥推送到 GitHub,那么可能你已经完成了[这里](https://help.github.com/articles/connecting-to-github-with-ssh/) 介绍的这些步骤,并且已经有了一个可用的密钥对。要检查你是否持有密码并验证它,你可以运行 `ssh-keygen -y -f /path/to/key`. +如果您曾经配置过使用 SSH 密钥推送到 GitHub,那么可能您已经完成了[这里](https://help.github.com/articles/connecting-to-github-with-ssh/) 介绍的这些步骤,并且已经有了一个可用的密钥对。要检查您是否持有密码并验证它,您可以运行 `ssh-keygen -y -f /path/to/key`. 
### 基于密钥的认证机制 @@ -332,9 +330,9 @@ ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote 使用 ssh 复制文件有很多方法: -- `ssh+tee`, 最简单的方法是执行 `ssh` 命令,然后通过这样的方法利用标准输入实现 `cat localfile | ssh remote_server tee serverfile`。回忆一下,[`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) 命令会将标准输出写入到一个文件 -- [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) :当需要拷贝大量的文件或目录时,使用`scp` 命令则更加方便,因为它可以方便的遍历相关路径。语法如下:`scp path/to/local_file remote_host:path/to/remote_file` -- [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) 对 `scp` 进行来改进,它可以检测本地和远端的文件以防止重复拷贝。它还可以提供一些诸如符号连接、权限管理等精心打磨的功能。甚至还可以基于 `--partial`标记实现断点续传。`rsync` 的语法和`scp`类似。 +- `ssh+tee`, 最简单的方法是执行 `ssh` 命令,然后通过这样的方法利用标准输入实现 `cat localfile | ssh remote_server tee serverfile`。回忆一下,[`tee`](http://man7.org/linux/man-pages/man1/tee.1.html) 命令会将标准输出写入到一个文件; +- [`scp`](http://man7.org/linux/man-pages/man1/scp.1.html) :当需要拷贝大量的文件或目录时,使用`scp` 命令则更加方便,因为它可以方便的遍历相关路径。语法如下:`scp path/to/local_file remote_host:path/to/remote_file`; +- [`rsync`](http://man7.org/linux/man-pages/man1/rsync.1.html) 对 `scp` 进行来改进,它可以检测本地和远端的文件以防止重复拷贝。它还可以提供一些诸如符号连接、权限管理等精心打磨的功能。甚至还可以基于 `--partial`标记实现断点续传。`rsync` 的语法和`scp`类似; ## 端口转发 @@ -377,7 +375,7 @@ Host *.mit.edu User foobaz ``` -这么做的好处是,使用 `~/.ssh/config` 文件来创建别名,类似 `scp`, `rsync`, `mosh`的这些命令都可以读取这个配置并将设置转换为对于的命令行选项。 +这么做的好处是,使用 `~/.ssh/config` 文件来创建别名,类似 `scp`、`rsync`和`mosh`的这些命令都可以读取这个配置并将设置转换为对应的命令行选项。 注意,`~/.ssh/config` 文件也可以被当作配置文件,而且一般情况下也是可以被倒入其他配置文件的。不过,如果您将其公开到互联网上,那么其他人都将会看到您的服务器地址、用户名、开放端口等等。这些信息可能会帮助到那些企图攻击您系统的黑客,所以请务必三思。 @@ -385,9 +383,9 @@ Host *.mit.edu ## 杂项 -连接远程服务器的一个常见痛点是遇到由关机、休眠或网络环境变化导致的掉线。如果连接的延迟很高也很让人讨厌。[Mosh](https://mosh.org/),也就是mobile shell 对 ssh 进行了改进,它允许连接漫游、间歇性连接及智能本地回显。 +连接远程服务器的一个常见痛点是遇到由关机、休眠或网络环境变化导致的掉线。如果连接的延迟很高也很让人讨厌。[Mosh](https://mosh.org/)(即 mobile shell )对 ssh 进行了改进,它允许连接漫游、间歇连接及智能本地回显。 -有时将一个远端文件夹挂载到本地会比较方便, [sshfs](https://github.com/libfuse/sshfs) 可以将远端服务器上的一个文件夹挂载到本地,然后你就可以使用本地的编辑器了。 +有时将一个远端文件夹挂载到本地会比较方便, [sshfs](https://github.com/libfuse/sshfs) 
可以将远端服务器上的一个文件夹挂载到本地,然后您就可以使用本地的编辑器了。 # Shell & 框架 @@ -433,11 +431,11 @@ Host *.mit.edu ## 任务控制 -1. 我们可以使用类似 `ps aux | grep` 这样的命令来获取任务的 pid ,然后您可以基于pid 来结束这些进程。但我们其实有更好的方法来做这件事。在终端中执行 `sleep 10000` 这个任务。然后用 `Ctrl-Z` 将其切换到后台并使用 `bg`来继续允许它。现在,使用 [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 来查找 pid 并使用 [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) to结束进程而不需要手动输入pid。(提示:: 使用 `-af` 标记)。 +1. 我们可以使用类似 `ps aux | grep` 这样的命令来获取任务的 pid ,然后您可以基于pid 来结束这些进程。但我们其实有更好的方法来做这件事。在终端中执行 `sleep 10000` 这个任务。然后用 `Ctrl-Z` 将其切换到后台并使用 `bg`来继续允许它。现在,使用 [`pgrep`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 来查找 pid 并使用 [`pkill`](http://man7.org/linux/man-pages/man1/pgrep.1.html) 结束进程而不需要手动输入pid。(提示:: 使用 `-af` 标记)。 2. 如果您希望某个进程结束后再开始另外一个进程, 应该如何实现呢?在这个练习中,我们使用 `sleep 60 &` 作为先执行的程序。一种方法是使用 [`wait`](http://man7.org/linux/man-pages/man1/wait.1p.html) 命令。尝试启动这个休眠命令,然后待其结束后再执行 `ls` 命令。 - 但是,如果我们在不同的 bash 会话中进行操作,则上述方法就不起作用来。因为 `wait` 只能对子进程起作用。之前我们没有提过的一个特性是,`kill` 命令成功退出时其状态码为 0 ,其他状态则是非0。`kill -0` 则不会发送信号,但是会在进程不存在时返回一个不为0的状态码。请编写一个 bash 函数 `pidwait` ,它接受一个 pid 作为输入参数,然后一直等待直到该进程结束。您需要使用 `sleep` 来避免浪费 CPU 性能。 + 但是,如果我们在不同的 bash 会话中进行操作,则上述方法就不起作用了。因为 `wait` 只能对子进程起作用。之前我们没有提过的一个特性是,`kill` 命令成功退出时其状态码为 0 ,其他状态则是非0。`kill -0` 则不会发送信号,但是会在进程不存在时返回一个不为0的状态码。请编写一个 bash 函数 `pidwait` ,它接受一个 pid 作为输入参数,然后一直等待直到该进程结束。您需要使用 `sleep` 来避免浪费 CPU 性能。 ## 终端多路复用 @@ -453,7 +451,7 @@ Host *.mit.edu 让我们帮助您进一步学习配置文件: 1. 为您的配置文件新建一个文件夹,并设置好版本控制 -2. 在其中添加至少一个配置文件,比如说你的 shell,在其中包含一些自定义设置(可以从设置 `$PS1` 开始)。 +2. 在其中添加至少一个配置文件,比如说您的 shell,在其中包含一些自定义设置(可以从设置 `$PS1` 开始)。 3. 建立一种在新设备进行快速安装配置的方法(无需手动操作)。最简单的方法是写一个 shell 脚本对每个文件使用 `ln -s`,也可以使用[专用工具](https://dotfiles.github.io/utilities/) 4. 在新的虚拟机上测试该安装脚本。 5. 
将您现有的所有配置文件移动到项目仓库里。 From 3ddd75c78d4ab690ad4bca0c683645d24bdbf42c Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 24 May 2020 19:31:18 +0800 Subject: [PATCH 368/640] marked as ready --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index b1f7bc5b..438309a4 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -2,7 +2,7 @@ layout: lecture title: "命令行环境" date: 2019-01-21 -ready: false +ready: true video: aspect: 56.25 id: e8BO_dYxk5c From c40387c461119f8300409641f7e9c9fe6c8ba13a Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sun, 24 May 2020 19:35:33 +0800 Subject: [PATCH 369/640] command line finished --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 1f5e44cf..d96e9406 100644 --- a/README.md +++ b/README.md @@ -29,7 +29,7 @@ To contribute to this tanslation project, please book your topic by creating an | [shell-tools.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/shell-tools.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress | | [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | 
[version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO |
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
From a29f9d013708bd9488b35043e25d6f472514f803 Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Sun, 24 May 2020 19:37:22 +0800
Subject: [PATCH 370/640] Update README.md

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index d96e9406..5ccf8e5f 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,7 @@
 # The Missing Semester of Your CS Education
 Website for the [The Missing Semester of Your CS Education](https://missing.csail.mit.edu/) class!
+[中文站点](https://missing-semester-cn.github.io)
 Contributions are most welcome! If you have edits or new content to add, please open an issue or submit a pull request.
From 6b6000bb185d351fefa70c70486116e4db7c28b8 Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Sun, 24 May 2020 19:39:06 +0800
Subject: [PATCH 371/640] book metaprogramming

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5ccf8e5f..cbf06a32 100644
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ To contribute to this tanslation project, please book your topic by creating an
 | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO |
-| [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | | TO-DO |
+| [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress |
 | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress |
 | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO |
 | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress |
From 8cc391e893b13142e54bf55e4024a2d0285305fd Mon Sep 17 00:00:00 2001
From: hanxiaomax-mac
Date: Mon, 25 May 2020 08:52:02 +0800
Subject: [PATCH 372/640] update trans

---
 _2020/metaprogramming.md | 164
++++++++++----------------------------- 1 file changed, 39 insertions(+), 125 deletions(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index d850e839..1bf1e5e9 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -13,53 +13,18 @@ video: [Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) {% endcomment %} -What do we mean by "metaprogramming"? Well, it was the best collective -term we could come up with for the set of things that are more about -_process_ than they are about writing code or working more efficiently. -In this lecture, we will look at systems for building and testing your -code, and for managing dependencies. These may seem like they are of -limited importance in your day-to-day as a student, but the moment you -interact with a larger code base through an internship or once you enter -the "real world", you will see this everywhere. We should note that -"metaprogramming" can also mean "[programs that operate on -programs](https://en.wikipedia.org/wiki/Metaprogramming)", whereas that -is not quite the definition we are using for the purposes of this -lecture. - -# Build systems - -If you write a paper in LaTeX, what are the commands you need to run to -produce your paper? What about the ones used to run your benchmarks, -plot them, and then insert that plot into your paper? Or to compile the -code provided in the class you're taking and then running the tests? - -For most projects, whether they contain code or not, there is a "build -process". Some sequence of operations you need to do to go from your -inputs to your outputs. Often, that process might have many steps, and -many branches. Run this to generate this plot, that to generate those -results, and something else to produce the final paper. As with so many -of the things we have seen in this class, you are not the first to -encounter this annoyance, and luckily there exist many tools to help -you! 
- -These are usually called "build systems", and there are _many_ of them. -Which one you use depends on the task at hand, your language of -preference, and the size of the project. At their core, they are all -very similar though. You define a number of _dependencies_, a number of -_targets_, and _rules_ for going from one to the other. You tell the -build system that you want a particular target, and its job is to find -all the transitive dependencies of that target, and then apply the rules -to produce intermediate targets all the way until the final target has -been produced. Ideally, the build system does this without unnecessarily -executing rules for targets whose dependencies haven't changed and where -the result is available from a previous build. - -`make` is one of the most common build systems out there, and you will -usually find it installed on pretty much any UNIX-based computer. It has -its warts, but works quite well for simple-to-moderate projects. When -you run `make`, it consults a file called `Makefile` in the current -directory. All the targets, their dependencies, and the rules are -defined in that file. Let's take a look at one: +我们这里说的 “元编程(metaprogramming)” 是什么意思呢?好吧,对于本文要介绍的这些内容,这是我们能够想到的最能概括它们的词。因为我们今天要讲的东西,更多是关于 *流程* ,而不是写代码或更高效的工作。本节课我们会学习构建系统、代码测试以及依赖管理。在您还是学生的时候,这些东西看上去似乎对你来说没那么重要,不过当你开始实习或走进社会的时候,您将会接触到大型的代码库,本节课讲授的这些东西也会变得随处可见。必须要指出的是,“元编程” 也有[用于操作程序的程序](https://en.wikipedia.org/wiki/Metaprogramming)" 之含义,这和我们今天讲座所介绍的概念是完全不同的。 + +# 构建系统 + +如果您使用 LaTeX 来编写论文,您需要执行哪些命令才能编译出你想要的论文呢?执行基准测试、绘制图表然后将其插入论文的命令又有哪些?或者,如何编译本课程提供的代码并执行测试呢? 
+ +对于大多数系统来说,不论其是否包含代码,都会包含一个“构建过程”。有时,您需要执行一系列操作。通常,这一过程包含了很多步骤,很多分支。执行一些命令来生成图表,然后执行另外的一些命令生成结果,然后在执行其他的命令来生成最终的论文。有很多事情需要我们完成,您并不是第一个因此感到苦恼的人,幸运的是,有很多工具可以帮助我们完成这些操作。 + +这些工具通常被称为 "构建系统",而且这些工具还不少。如何选择工具完全取决于您当前手头上要完成的任务以及项目的规模。从本质上讲,这些工具都是非常类似的。您需要定义*依赖*、*目标*和*规则*。您必须告诉构建系统您具体的构建目标,系统的任务则是找到构建这些目标所需要的依赖,并根据规则构建所需的中间产物,直到最终目标被构建出来。理想的情况下,如果目标的依赖没有发生改动,并且我们可以从之前的构建中复用这些依赖,那么与其相关的构建规则并不会被执行。 + +`make` 是最常用的构建系统之一,您会发现它通常被安装到了几乎所有基于UNIX的系统中。 +`make`并不完美,但是对于中小型项目来说,它已经足够好了。当您执行 `make` 时,它会去参考当前目录下名为 `Makefile` 的文件。所有构建目标、相关依赖和规则都需要在该文件中定义,它看上去是这样的: ```make paper.pdf: paper.tex plot-data.png @@ -69,29 +34,16 @@ plot-%.png: %.dat plot.py ./plot.py -i $*.dat -o $@ ``` -Each directive in this file is a rule for how to produce the left-hand -side using the right-hand side. Or, phrased differently, the things -named on the right-hand side are dependencies, and the left-hand side is -the target. The indented block is a sequence of programs to produce the -target from those dependencies. In `make`, the first directive also -defines the default goal. If you run `make` with no arguments, this is -the target it will build. Alternatively, you can run something like -`make plot-data.png`, and it will build that target instead. - -The `%` in a rule is a "pattern", and will match the same string on the -left and on the right. For example, if the target `plot-foo.png` is -requested, `make` will look for the dependencies `foo.dat` and -`plot.py`. Now let's look at what happens if we run `make` with an empty -source directory. +这个文件中的指令是即如何使用右侧文件构建左侧文件的规则。或者,换句话说,引号左侧的是构建目标,引号右侧的是构建它所需的依赖。缩进的部分是从依赖构建目标时需要用到的一段程序。在 `make` 中,第一条指令还指明了构建的目的,如果您使用不带参数的 `make`,这便是我们最终的构建结果。或者,您可以使用这样的命令来构建其他目标:`make plot-data.png`。 + +规则中的 `%` 是一种模式,它会匹配其左右两侧相同的字符串。例如,如果目标是 `plot-foo.png`, `make` 会去寻找 `foo.dat` 和 `plot.py` 作为依赖。现在,让我们看看如果在一个空的源码目录中执行`make` 会发生什么? ```console $ make make: *** No rule to make target 'paper.tex', needed by 'paper.pdf'. Stop. 
``` -`make` is helpfully telling us that in order to build `paper.pdf`, it -needs `paper.tex`, and it has no rule telling it how to make that file. -Let's try making it! +`make` 会告诉我们,为了构建出`paper.pdf`,它需要 `paper.tex`,但是并没有一条规则能够告诉它如何构建该文件。让我们构建它吧! ```console $ touch paper.tex @@ -99,10 +51,7 @@ $ make make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'. Stop. ``` -Hmm, interesting, there _is_ a rule to make `plot-data.png`, but it is a -pattern rule. Since the source files do not exist (`foo.dat`), `make` -simply states that it cannot make that file. Let's try creating all the -files: +哟,有意思,我们是**有**构建 `plot-data.png` 的规则的,但是这是一条模式规则。因为源文件`foo.dat` 并不存在,因此 `make` 就会告诉你它不能构建 `plot-data.png`,让我们创建这些文件: ```console $ cat paper.tex @@ -134,7 +83,7 @@ $ cat data.dat 5 8 ``` -Now what happens if we run `make`? +当我们执行 `make` 时会发生什么? ```console $ make @@ -143,18 +92,15 @@ pdflatex paper.tex ... lots of output ... ``` -And look, it made a PDF for us! -What if we run `make` again? +看!PDF ! + +如果再次执行 `make` 会怎样? ```console $ make make: 'paper.pdf' is up to date. ``` - -It didn't do anything! Why not? Well, because it didn't need to. It -checked that all of the previously-built targets were still up to date -with respect to their listed dependencies. We can test this by modifying -`paper.tex` and then re-running `make`: +什么事情都没做!为什么?好吧,因为它什么都不需要做。make回去检查之前的构建是因其依赖改变而需要被更新。让我们试试修改 `paper.tex` 在重新执行 `make`: ```console $ vim paper.tex @@ -163,40 +109,14 @@ pdflatex paper.tex ... ``` -Notice that `make` did _not_ re-run `plot.py` because that was not -necessary; none of `plot-data.png`'s dependencies changed! - -# Dependency management - -At a more macro level, your software projects are likely to have -dependencies that are themselves projects. You might depend on installed -programs (like `python`), system packages (like `openssl`), or libraries -within your programming language (like `matplotlib`). 
These days, most -dependencies will be available through a _repository_ that hosts a -large number of such dependencies in a single place, and provides a -convenient mechanism for installing them. Some examples include the -Ubuntu package repositories for Ubuntu system packages, which you access -through the `apt` tool, RubyGems for Ruby libraries, PyPi for Python -libraries, or the Arch User Repository for Arch Linux user-contributed -packages. - -Since the exact mechanisms for interacting with these repositories vary -a lot from repository to repository and from tool to tool, we won't go -too much into the details of any specific one in this lecture. What we -_will_ cover is some of the common terminology they all use. The first -among these is _versioning_. Most projects that other projects depend on -issue a _version number_ with every release. Usually something like -8.1.3 or 64.1.20192004. They are often, but not always, numerical. -Version numbers serve many purposes, and one of the most important of -them is to ensure that software keeps working. Imagine, for example, -that I release a new version of my library where I have renamed a -particular function. If someone tried to build some software that -depends on my library after I release that update, the build might fail -because it calls a function that no longer exists! Versioning attempts -to solve this problem by letting a project say that it depends on a -particular version, or range of versions, of some other project. That -way, even if the underlying library changes, dependent software -continues building by using an older version of my library. 
+注意 `make` 并**没有**重新构建 `plot.py`,因为没必要;`plot-data.png` 的所有依赖都没有发生改变。 + + +# 依赖管理 + +就您的项目来说,它的依赖可能本身也是其他的项目。您也许会依赖某些程序(例如 `python`)、系统包 (例如 `openssl`)或相关编程语言的库(例如 `matplotlib`)。 现在,大多数的依赖可以通过某些**软件仓库**来获取,这些仓库会在一个地方托管大量的依赖,我们则可以通过一套非常简单的机制来安装依赖。例如 Ubuntu 系统下面有Ubuntu软件包仓库,您可以通过`apt` 这个工具来访问, RubyGems 则包含了 Ruby 的相关库,PyPi 包含了 Python 库, Arch Linux 用户贡献的库则可以在 Arch User Repository 中找到。 + +由于每个仓库、每种工具的运行机制都不太一样,因此我们并不会在本节课深入讲解具体的细节。我们会介绍一些通用的术语,例如*版本控制*。大多数被其他项目所依赖的项目都会在每次发布新版本时创建一个*版本号*。通常看上去像 8.1.3 或 64.1.20192004。版本号一般是数字构成的,但也并不绝对。版本号有很多用途,其中最重要的作用是保证软件能够运行。试想一下,加入我的库要发布一个新版本,在这个版本里面我重命名了某个函数。如果有人在我的库升级版本后,仍希望基于它构建新的软件,那么很可能构建会失败,因为它希望调用的函数已经不复存在了。有了版本控制就可以很好的解决这个问题,我们可以指定当前项目需要基于某个版本,甚至某个范围内的版本,或是某些项目来构建。这么做的话,即使某个被依赖的库发生了变化,依赖它的软件可以基于其之前的版本进行构建。 That also isn't ideal though! What if I issue a security update which does _not_ change the public interface of my library (its "API"), and @@ -272,24 +192,18 @@ GitHub domain. This makes it trivial for us to update the website! We just make our changes locally, commit them with git, and then push. CI takes care of the rest. -## A brief aside on testing +## 测试简介 + +多数的大型软件都有“测试套”。您可能已经对测试的相关概念有所了解,但是我们觉得有些测试方法和测试术语还是应该再次提醒一下: -Most large software projects come with a "test suite". You may already -be familiar with the general concept of testing, but we thought we'd -quickly mention some approaches to testing and testing terminology that -you may encounter in the wild: + - 测试套:所有测试的统称 + - 单元测试:一个“微型测试”,用于对某个封装的特性进行测试 + - 集成测试:: 一个“宏观测试”,针对系统的某一大部分进行,测试其不同的特性或组件是否能*协同*工作。 + - 回归测试:用于保证之前引起问题的 bug 不会再次出现 + - 模拟(Mocking): 使用一个假的实现来替换函数、模块或类型,屏蔽那些和测试不相关的内容。例如,您可能会“模拟网络连接” 或 “模拟硬盘” - - Test suite: a collective term for all the tests - - Unit test: a "micro-test" that tests a specific feature in isolation - - Integration test: a "macro-test" that runs a larger part of the - system to check that different feature or components work _together_. 
- - Regression test: a test that implements a particular pattern that - _previously_ caused a bug to ensure that the bug does not resurface. - - Mocking: the replace a function, module, or type with a fake - implementation to avoid testing unrelated functionality. For example, - you might "mock the network" or "mock the disk". -# Exercises +# 课后练习 1. Most makefiles provide a target called `clean`. This isn't intended to produce a file called `clean`, but instead to clean up any files From efaaf9dd8ccdc9f2126660c9d782ec7e8c67b7df Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Mon, 25 May 2020 10:15:32 +0800 Subject: [PATCH 373/640] update trans --- _2020/metaprogramming.md | 136 ++++++++------------------------------- 1 file changed, 26 insertions(+), 110 deletions(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 1bf1e5e9..0b1fac82 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -118,79 +118,24 @@ pdflatex paper.tex 由于每个仓库、每种工具的运行机制都不太一样,因此我们并不会在本节课深入讲解具体的细节。我们会介绍一些通用的术语,例如*版本控制*。大多数被其他项目所依赖的项目都会在每次发布新版本时创建一个*版本号*。通常看上去像 8.1.3 或 64.1.20192004。版本号一般是数字构成的,但也并不绝对。版本号有很多用途,其中最重要的作用是保证软件能够运行。试想一下,加入我的库要发布一个新版本,在这个版本里面我重命名了某个函数。如果有人在我的库升级版本后,仍希望基于它构建新的软件,那么很可能构建会失败,因为它希望调用的函数已经不复存在了。有了版本控制就可以很好的解决这个问题,我们可以指定当前项目需要基于某个版本,甚至某个范围内的版本,或是某些项目来构建。这么做的话,即使某个被依赖的库发生了变化,依赖它的软件可以基于其之前的版本进行构建。 -That also isn't ideal though! What if I issue a security update which -does _not_ change the public interface of my library (its "API"), and -which any project that depended on the old version should immediately -start using? This is where the different groups of numbers in a version -come in. The exact meaning of each one varies between projects, but one -relatively common standard is [_semantic -versioning_](https://semver.org/). With semantic versioning, every -version number is of the form: major.minor.patch. The rules are: - - - If a new release does not change the API, increase the patch version. 
- - If you _add_ to your API in a backwards-compatible way, increase the - minor version. - - If you change the API in a non-backwards-compatible way, increase the - major version. - -This already provides some major advantages. Now, if my project depends -on your project, it _should_ be safe to use the latest release with the -same major version as the one I built against when I developed it, as -long as its minor version is at least what it was back then. In other -words, if I depend on your library at version `1.3.7`, then it _should_ -be fine to build it with `1.3.8`, `1.6.1`, or even `1.3.0`. Version -`2.2.4` would probably not be okay, because the major version was -increased. We can see an example of semantic versioning in Python's -version numbers. Many of you are probably aware that Python 2 and Python -3 code do not mix very well, which is why that was a _major_ version -bump. Similarly, code written for Python 3.5 might run fine on Python -3.7, but possibly not on 3.4. - -When working with dependency management systems, you may also come -across the notion of _lock files_. A lock file is simply a file that -lists the exact version you are _currently_ depending on of each -dependency. Usually, you need to explicitly run an update program to -upgrade to newer versions of your dependencies. There are many reasons -for this, such as avoiding unnecessary recompiles, having reproducible -builds, or not automatically updating to the latest version (which may -be broken). And extreme version of this kind of dependency locking is -_vendoring_, which is where you copy all the code of your dependencies -into your own project. That gives you total control over any changes to -it, and lets you introduce your own changes to it, but also means you -have to explicitly pull in any updates from the upstream maintainers -over time. 
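The compatibility rule described above (same major version, minor version at least as new as the one built against) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `parse_version` and `is_compatible` are hypothetical, and the parsing accepts only plain `major.minor.patch` strings. Real package managers also handle pre-release tags, build metadata, and richer range syntax.

```python
def parse_version(version):
    """Split a 'major.minor.patch' string into a tuple of three ints.

    Assumption for this sketch: exactly three numeric components.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(built_against, candidate):
    """Return True if `candidate` should be usable in place of `built_against`.

    Implements the rule from the text: the major version must match and
    the minor version must be at least the one we built against. The
    patch number is ignored because patch releases do not change the API.
    """
    built_major, built_minor, _ = parse_version(built_against)
    cand_major, cand_minor, _ = parse_version(candidate)
    return cand_major == built_major and cand_minor >= built_minor


if __name__ == "__main__":
    # The examples from the text, plus one older minor version.
    for candidate in ["1.3.8", "1.6.1", "1.3.0", "2.2.4", "1.2.9"]:
        print(candidate, is_compatible("1.3.7", candidate))
```

Ignoring the patch number is what makes `1.3.0` acceptable for a project built against `1.3.7`, matching the example in the text. A stricter policy could also require the patch number to be no older.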
- -# Continuous integration systems - -As you work on larger and larger projects, you'll find that there are -often additional tasks you have to do whenever you make a change to it. -You might have to upload a new version of the documentation, upload a -compiled version somewhere, release the code to pypi, run your test -suite, and all sort of other things. Maybe every time someone sends you -a pull request on GitHub, you want their code to be style checked and -you want some benchmarks to run? When these kinds of needs arise, it's -time to take a look at continuous integration. - -Continuous integration, or CI, is an umbrella term for "stuff that runs -whenever your code changes", and there are many companies out there that -provide various types of CI, often for free for open-source projects. -Some of the big ones are Travis CI, Azure Pipelines, and GitHub Actions. -They all work in roughly the same way: you add a file to your repository -that describes what should happen when various things happen to that -repository. By far the most common one is a rule like "when someone -pushes code, run the test suite". When the event triggers, the CI -provider spins up a virtual machines (or more), runs the commands in -your "recipe", and then usually notes down the results somewhere. You -might set it up so that you are notified if the test suite stops -passing, or so that a little badge appears on your repository as long as -the tests pass. - -As an example of a CI system, the class website is set up using GitHub -Pages. Pages is a CI action that runs the Jekyll blog software on every -push to `master` and makes the built site available on a particular -GitHub domain. This makes it trivial for us to update the website! We -just make our changes locally, commit them with git, and then push. CI -takes care of the rest. 
+这样还并不理想!如果我们发布了一项和安全相关的升级,它*没有*影响到任何公开接口(API),但是处于安全的考虑,依赖它的项目都应该立即升级,那应该怎么做呢?这也是版本号包含多个部分的原因。不同项目所用的版本号其具体含义并不完全相同,但是一个相对比较常用的标准是[语义版本号](https://semver.org/),这种版本号具有不同的语义,它的格式是这样的:主版本号.次版本号.补丁号。相关规则有: + + - 如果新的版本没有改变 API,请将补丁号递增; + - 如果您添加了 API 并且该改动是向后兼容的,请将次版本号递增; + - 如果您修改了 API 但是它并不向后兼容,请将主版本号递增。 + +这么做有很多好处。现在如果我们的项目是基于您的项目构建的,那么只要最新版本的主版本号只要没变就是安全的 ,次版本号不低于之前我们使用的版本即可。换句话说,如果我依赖的版本是`1.3.7`,那么使用`1.3.8`、`1.6.1`,甚至是`1.3.0`都是可以的。如果版本号是 `2.2.4` 就不一定能用了,因为它的主版本号增加了。我们可以将 Python 的版本号作为语义版本号的一个实例。您应该知道,Python 2 和 Python 3 的代码是不兼容的,这也是为什么 Python 的主版本号改变的原因。类似的,使用 Python 3.5 编写的代码在 3.7 上可以运行,但是在 3.4 上可能会不行。 + +使用依赖管理系统的时候,您可能会遇到锁文件(_lock files_)这一概念。锁文件列出了您当前每个依赖所对应的具体版本号。通常,您需要执行升级程序才能更新依赖的版本。这么做的原因有很多,例如避免不必要的重新编译、创建可复现的软件版本或禁止自动升级到最新版本(可能会包含 bug)。还有一种极端的依赖锁定叫做 +_vendoring_,它会把您的依赖中的所有代码直接拷贝到您的项目中,这样您就能够完全掌控代码的任何修改,同时您也可以将自己的修改添加进去,不过这也意味着如何该依赖的维护者更新了某些代码,您也必须要自己去拉取这些更新。 + +# 持续集成系统 + +随着您接触到的项目规模越来越大,您会发现修改代码之后还有很多额外的工作要做。您可能需要上传一份新版本的文档、上传编译后的文件到某处、发布代码到 pypi,执行测试套等等。或许您希望每次有人提交代码到 GitHub 的时候,他们的代码风格被检查过并执行过某些基准测试?如果您有这方面的需求,那么请花些时间了解一下持续集成。 + +持续集成,或者叫做 CI 是一种雨伞术语(umbrella term),它指的是那些“当您的代码变动时,自动运行的那些东西”,市场上有很多提供各式各样 CI 工具的公司,这些工具大部分都是免费或开源的。比较大的有 Travis CI、Azure Pipelines 和 GitHub Actions。它们的工作原理都是类似的:您需要在代码仓库中添加一个文件,描述当前仓库发生任何修改时,应该如何应对。目前为止,最常见的规则是:如果有人提交代码,执行测试套。当这个事件被触发时,CI 提供方会启动一个(或多个)虚拟机,执行您制定的规则,并且通常会记录下相关的执行结果。您可以进行某些设置,这样当测试套失败时您能够收到通知或者当测试全部通过时,您的仓库主页会显示一个徽标。 + +本课程的网站基于 GitHub Pages 构建,这就是一个很好的例子。Pages 在每次`master`有代码更新时,会执行 Jekyll 博客软件,然后使您的站点可以通过某个 GitHub 域名来访问。对于我们来说这些事情太琐碎了,我现在我们只需要在本地进行修改,然后使用 git 提交代码,发布到远端。CI 会自动帮我们处理后续的事情。 ## 测试简介 @@ -205,40 +150,11 @@ takes care of the rest. # 课后练习 - 1. Most makefiles provide a target called `clean`. This isn't intended - to produce a file called `clean`, but instead to clean up any files - that can be re-built by make. Think of it as a way to "undo" all of - the build steps. Implement a `clean` target for the `paper.pdf` - `Makefile` above. 
You will have to make the target - [phony](https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html). - You may find the [`git - ls-files`](https://git-scm.com/docs/git-ls-files) subcommand useful. - A number of other very common make targets are listed - [here](https://www.gnu.org/software/make/manual/html_node/Standard-Targets.html#Standard-Targets). - 2. Take a look at the various ways to specify version requirements for - dependencies in [Rust's build - system](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html). - Most package repositories support similar syntax. For each one - (caret, tilde, wildcard, comparison, and multiple), try to come up - with a use-case in which that particular kind of requirement makes - sense. - 3. Git can act as a simple CI system all by itself. In `.git/hooks` - inside any git repository, you will find (currently inactive) files - that are run as scripts when a particular action happens. Write a - [`pre-commit`](https://git-scm.com/docs/githooks#_pre_commit) hook - that runs `make paper.pdf` and refuses the commit if the `make` - command fails. This should prevent any commit from having an - unbuildable version of the paper. - 4. Set up a simple auto-published page using [GitHub - Pages](https://help.github.com/en/actions/automating-your-workflow-with-github-actions). - Add a [GitHub Action](https://github.com/features/actions) to the - repository to run `shellcheck` on any shell files in that - repository (here is [one way to do - it](https://github.com/marketplace/actions/shellcheck)). Check that - it works! - 5. [Build your - own](https://help.github.com/en/actions/automating-your-workflow-with-github-actions/building-actions) - GitHub action to run [`proselint`](http://proselint.com/) or - [`write-good`](https://github.com/btford/write-good) on all the - `.md` files in the repository. Enable it in your repository, and - check that it works by filing a pull request with a typo in it. + 1. 
大多数的 makefiles 都提供了 一个名为 `clean` 的构建目标,这并不是说我们会生成一个名为`clean`的文件,而是我们可以使用它清理文件,让 make 重新构建。您可以理解为它的作用是“撤销”所有构建步骤。在上面的 makefile 中为`paper.pdf`实现一个`clean` 目标。您需要构建[phony](https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html)。您也许会发现 [`git ls-files`](https://git-scm.com/docs/git-ls-files) 子命令很有用。其他一些有用的 make 构建目标可以在[这里](https://www.gnu.org/software/make/manual/html_node/Standard-Targets.html#Standard-Targets)找到; + + 2. 指定版本要求的方法很多,让我们学习一下 [Rust的构建系统](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html)的依赖管理。大多数的包管理仓库都支持类似的语法。对于每种语法(尖号、波浪号、通配符、比较、乘积),构建一种场景使其具有实际意义; + + 3. Git 可以作为一个简单的 CI 系统来使用,在任何 git 仓库中的 `.git/hooks` 目录中,您可以找到一些文件(当前处于未激活状态),它们的作用和脚本一样,当某些事件发生时便可以自动执行。请编写一个 + [`pre-commit`](https://git-scm.com/docs/githooks#_pre_commit) 钩子,当执行`make`命令失败后,它会执行 `make paper.pdf` 并拒绝您的提交。这样做可以避免产生包含不可构建版本的提交信息; + 4. 基于 [GitHub Pages](https://help.github.com/en/actions/automating-your-workflow-with-github-actions) 创建任意一个可以自动发布的页面。添加一个[GitHub Action](https://github.com/features/actions) 到该仓库,对仓库中的所有 shell 文件执行 `shellcheck`([方法之一](https://github.com/marketplace/actions/shellcheck)); + 5. 
[构建属于您的](https://help.github.com/en/actions/automating-your-workflow-with-github-actions/building-actions) GitHub action,对仓库中所有的`.md`文件执行[`proselint`](http://proselint.com/) 或 [`write-good`](https://github.com/btford/write-good),在您的仓库中开启这一功能,提交一个包含错误的文件看看该功能是否生效。 From 6665d25a142fb84528f9b3e6b5f138ff76d880d7 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Mon, 25 May 2020 10:28:14 +0800 Subject: [PATCH 374/640] reveiw --- _2020/metaprogramming.md | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index 0b1fac82..d48701f6 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -3,7 +3,7 @@ layout: lecture title: "元编程" details: 构建系统、依赖管理、测试、持续集成 date: 2019-01-27 -ready: false +ready: true video: aspect: 56.25 id: _Ms1Z4xfqv4 @@ -13,18 +13,17 @@ video: [Reddit Discussion](https://www.reddit.com/r/hackertools/comments/anicor/data_wrangling_iap_2019/) {% endcomment %} -我们这里说的 “元编程(metaprogramming)” 是什么意思呢?好吧,对于本文要介绍的这些内容,这是我们能够想到的最能概括它们的词。因为我们今天要讲的东西,更多是关于 *流程* ,而不是写代码或更高效的工作。本节课我们会学习构建系统、代码测试以及依赖管理。在您还是学生的时候,这些东西看上去似乎对你来说没那么重要,不过当你开始实习或走进社会的时候,您将会接触到大型的代码库,本节课讲授的这些东西也会变得随处可见。必须要指出的是,“元编程” 也有[用于操作程序的程序](https://en.wikipedia.org/wiki/Metaprogramming)" 之含义,这和我们今天讲座所介绍的概念是完全不同的。 +我们这里说的 “元编程(metaprogramming)” 是什么意思呢?好吧,对于本文要介绍的这些内容,这是我们能够想到的最能概括它们的词。因为我们今天要讲的东西,更多是关于 *流程* ,而不是写代码或更高效的工作。本节课我们会学习构建系统、代码测试以及依赖管理。在您还是学生的时候,这些东西看上去似乎对您来说没那么重要,不过当您开始实习或走进社会的时候,您将会接触到大型的代码库,本节课讲授的这些东西也会变得随处可见。必须要指出的是,“元编程” 也有[用于操作程序的程序](https://en.wikipedia.org/wiki/Metaprogramming)" 之含义,这和我们今天讲座所介绍的概念是完全不同的。 # 构建系统 -如果您使用 LaTeX 来编写论文,您需要执行哪些命令才能编译出你想要的论文呢?执行基准测试、绘制图表然后将其插入论文的命令又有哪些?或者,如何编译本课程提供的代码并执行测试呢? +如果您使用 LaTeX 来编写论文,您需要执行哪些命令才能编译出您想要的论文呢?执行基准测试、绘制图表然后将其插入论文的命令又有哪些?或者,如何编译本课程提供的代码并执行测试呢? 
对于大多数系统来说,不论其是否包含代码,都会包含一个“构建过程”。有时,您需要执行一系列操作。通常,这一过程包含了很多步骤,很多分支。执行一些命令来生成图表,然后执行另外的一些命令生成结果,然后在执行其他的命令来生成最终的论文。有很多事情需要我们完成,您并不是第一个因此感到苦恼的人,幸运的是,有很多工具可以帮助我们完成这些操作。 这些工具通常被称为 "构建系统",而且这些工具还不少。如何选择工具完全取决于您当前手头上要完成的任务以及项目的规模。从本质上讲,这些工具都是非常类似的。您需要定义*依赖*、*目标*和*规则*。您必须告诉构建系统您具体的构建目标,系统的任务则是找到构建这些目标所需要的依赖,并根据规则构建所需的中间产物,直到最终目标被构建出来。理想的情况下,如果目标的依赖没有发生改动,并且我们可以从之前的构建中复用这些依赖,那么与其相关的构建规则并不会被执行。 -`make` 是最常用的构建系统之一,您会发现它通常被安装到了几乎所有基于UNIX的系统中。 -`make`并不完美,但是对于中小型项目来说,它已经足够好了。当您执行 `make` 时,它会去参考当前目录下名为 `Makefile` 的文件。所有构建目标、相关依赖和规则都需要在该文件中定义,它看上去是这样的: +`make` 是最常用的构建系统之一,您会发现它通常被安装到了几乎所有基于UNIX的系统中。`make`并不完美,但是对于中小型项目来说,它已经足够好了。当您执行 `make` 时,它会去参考当前目录下名为 `Makefile` 的文件。所有构建目标、相关依赖和规则都需要在该文件中定义,它看上去是这样的: ```make paper.pdf: paper.tex plot-data.png @@ -34,7 +33,7 @@ plot-%.png: %.dat plot.py ./plot.py -i $*.dat -o $@ ``` -这个文件中的指令是即如何使用右侧文件构建左侧文件的规则。或者,换句话说,引号左侧的是构建目标,引号右侧的是构建它所需的依赖。缩进的部分是从依赖构建目标时需要用到的一段程序。在 `make` 中,第一条指令还指明了构建的目的,如果您使用不带参数的 `make`,这便是我们最终的构建结果。或者,您可以使用这样的命令来构建其他目标:`make plot-data.png`。 +这个文件中的指令,即如何使用右侧文件构建左侧文件的规则。或者,换句话说,引号左侧的是构建目标,引号右侧的是构建它所需的依赖。缩进的部分是从依赖构建目标时需要用到的一段程序。在 `make` 中,第一条指令还指明了构建的目的,如果您使用不带参数的 `make`,这便是我们最终的构建结果。或者,您可以使用这样的命令来构建其他目标:`make plot-data.png`。 规则中的 `%` 是一种模式,它会匹配其左右两侧相同的字符串。例如,如果目标是 `plot-foo.png`, `make` 会去寻找 `foo.dat` 和 `plot.py` 作为依赖。现在,让我们看看如果在一个空的源码目录中执行`make` 会发生什么? @@ -51,7 +50,7 @@ $ make make: *** No rule to make target 'plot-data.png', needed by 'paper.pdf'. Stop. 
``` -哟,有意思,我们是**有**构建 `plot-data.png` 的规则的,但是这是一条模式规则。因为源文件`foo.dat` 并不存在,因此 `make` 就会告诉你它不能构建 `plot-data.png`,让我们创建这些文件: +哟,有意思,我们是**有**构建 `plot-data.png` 的规则的,但是这是一条模式规则。因为源文件`foo.dat` 并不存在,因此 `make` 就会告诉您它不能构建 `plot-data.png`,让我们创建这些文件: ```console $ cat paper.tex @@ -116,9 +115,9 @@ pdflatex paper.tex 就您的项目来说,它的依赖可能本身也是其他的项目。您也许会依赖某些程序(例如 `python`)、系统包 (例如 `openssl`)或相关编程语言的库(例如 `matplotlib`)。 现在,大多数的依赖可以通过某些**软件仓库**来获取,这些仓库会在一个地方托管大量的依赖,我们则可以通过一套非常简单的机制来安装依赖。例如 Ubuntu 系统下面有Ubuntu软件包仓库,您可以通过`apt` 这个工具来访问, RubyGems 则包含了 Ruby 的相关库,PyPi 包含了 Python 库, Arch Linux 用户贡献的库则可以在 Arch User Repository 中找到。 -由于每个仓库、每种工具的运行机制都不太一样,因此我们并不会在本节课深入讲解具体的细节。我们会介绍一些通用的术语,例如*版本控制*。大多数被其他项目所依赖的项目都会在每次发布新版本时创建一个*版本号*。通常看上去像 8.1.3 或 64.1.20192004。版本号一般是数字构成的,但也并不绝对。版本号有很多用途,其中最重要的作用是保证软件能够运行。试想一下,加入我的库要发布一个新版本,在这个版本里面我重命名了某个函数。如果有人在我的库升级版本后,仍希望基于它构建新的软件,那么很可能构建会失败,因为它希望调用的函数已经不复存在了。有了版本控制就可以很好的解决这个问题,我们可以指定当前项目需要基于某个版本,甚至某个范围内的版本,或是某些项目来构建。这么做的话,即使某个被依赖的库发生了变化,依赖它的软件可以基于其之前的版本进行构建。 +由于每个仓库、每种工具的运行机制都不太一样,因此我们并不会在本节课深入讲解具体的细节。我们会介绍一些通用的术语,例如*版本控制*。大多数被其他项目所依赖的项目都会在每次发布新版本时创建一个*版本号*。通常看上去像 8.1.3 或 64.1.20192004。版本号一般是数字构成的,但也并不绝对。版本号有很多用途,其中最重要的作用是保证软件能够运行。试想一下,假如我的库要发布一个新版本,在这个版本里面我重命名了某个函数。如果有人在我的库升级版本后,仍希望基于它构建新的软件,那么很可能构建会失败,因为它希望调用的函数已经不复存在了。有了版本控制就可以很好的解决这个问题,我们可以指定当前项目需要基于某个版本,甚至某个范围内的版本,或是某些项目来构建。这么做的话,即使某个被依赖的库发生了变化,依赖它的软件可以基于其之前的版本进行构建。 -这样还并不理想!如果我们发布了一项和安全相关的升级,它*没有*影响到任何公开接口(API),但是处于安全的考虑,依赖它的项目都应该立即升级,那应该怎么做呢?这也是版本号包含多个部分的原因。不同项目所用的版本号其具体含义并不完全相同,但是一个相对比较常用的标准是[语义版本号](https://semver.org/),这种版本号具有不同的语义,它的格式是这样的:主版本号.次版本号.补丁号。相关规则有: +这样还并不理想!如果我们发布了一项和安全相关的升级,它并*没有*影响到任何公开接口(API),但是处于安全的考虑,依赖它的项目都应该立即升级,那应该怎么做呢?这也是版本号包含多个部分的原因。不同项目所用的版本号其具体含义并不完全相同,但是一个相对比较常用的标准是[语义版本号](https://semver.org/),这种版本号具有不同的语义,它的格式是这样的:主版本号.次版本号.补丁号。相关规则有: - 如果新的版本没有改变 API,请将补丁号递增; - 如果您添加了 API 并且该改动是向后兼容的,请将次版本号递增; @@ -126,14 +125,13 @@ pdflatex paper.tex 这么做有很多好处。现在如果我们的项目是基于您的项目构建的,那么只要最新版本的主版本号只要没变就是安全的 ,次版本号不低于之前我们使用的版本即可。换句话说,如果我依赖的版本是`1.3.7`,那么使用`1.3.8`、`1.6.1`,甚至是`1.3.0`都是可以的。如果版本号是 `2.2.4` 
就不一定能用了,因为它的主版本号增加了。我们可以将 Python 的版本号作为语义版本号的一个实例。您应该知道,Python 2 和 Python 3 的代码是不兼容的,这也是为什么 Python 的主版本号改变的原因。类似的,使用 Python 3.5 编写的代码在 3.7 上可以运行,但是在 3.4 上可能会不行。 -使用依赖管理系统的时候,您可能会遇到锁文件(_lock files_)这一概念。锁文件列出了您当前每个依赖所对应的具体版本号。通常,您需要执行升级程序才能更新依赖的版本。这么做的原因有很多,例如避免不必要的重新编译、创建可复现的软件版本或禁止自动升级到最新版本(可能会包含 bug)。还有一种极端的依赖锁定叫做 -_vendoring_,它会把您的依赖中的所有代码直接拷贝到您的项目中,这样您就能够完全掌控代码的任何修改,同时您也可以将自己的修改添加进去,不过这也意味着如何该依赖的维护者更新了某些代码,您也必须要自己去拉取这些更新。 +使用依赖管理系统的时候,您可能会遇到锁文件(_lock files_)这一概念。锁文件列出了您当前每个依赖所对应的具体版本号。通常,您需要执行升级程序才能更新依赖的版本。这么做的原因有很多,例如避免不必要的重新编译、创建可复现的软件版本或禁止自动升级到最新版本(可能会包含 bug)。还有一种极端的依赖锁定叫做 _vendoring_,它会把您的依赖中的所有代码直接拷贝到您的项目中,这样您就能够完全掌控代码的任何修改,同时您也可以将自己的修改添加进去,不过这也意味着如何该依赖的维护者更新了某些代码,您也必须要自己去拉取这些更新。 # 持续集成系统 随着您接触到的项目规模越来越大,您会发现修改代码之后还有很多额外的工作要做。您可能需要上传一份新版本的文档、上传编译后的文件到某处、发布代码到 pypi,执行测试套等等。或许您希望每次有人提交代码到 GitHub 的时候,他们的代码风格被检查过并执行过某些基准测试?如果您有这方面的需求,那么请花些时间了解一下持续集成。 -持续集成,或者叫做 CI 是一种雨伞术语(umbrella term),它指的是那些“当您的代码变动时,自动运行的那些东西”,市场上有很多提供各式各样 CI 工具的公司,这些工具大部分都是免费或开源的。比较大的有 Travis CI、Azure Pipelines 和 GitHub Actions。它们的工作原理都是类似的:您需要在代码仓库中添加一个文件,描述当前仓库发生任何修改时,应该如何应对。目前为止,最常见的规则是:如果有人提交代码,执行测试套。当这个事件被触发时,CI 提供方会启动一个(或多个)虚拟机,执行您制定的规则,并且通常会记录下相关的执行结果。您可以进行某些设置,这样当测试套失败时您能够收到通知或者当测试全部通过时,您的仓库主页会显示一个徽标。 +持续集成,或者叫做 CI 是一种雨伞术语(umbrella term),它指的是那些“当您的代码变动时,自动运行的东西”,市场上有很多提供各式各样 CI 工具的公司,这些工具大部分都是免费或开源的。比较大的有 Travis CI、Azure Pipelines 和 GitHub Actions。它们的工作原理都是类似的:您需要在代码仓库中添加一个文件,描述当前仓库发生任何修改时,应该如何应对。目前为止,最常见的规则是:如果有人提交代码,执行测试套。当这个事件被触发时,CI 提供方会启动一个(或多个)虚拟机,执行您制定的规则,并且通常会记录下相关的执行结果。您可以进行某些设置,这样当测试套失败时您能够收到通知或者当测试全部通过时,您的仓库主页会显示一个徽标。 本课程的网站基于 GitHub Pages 构建,这就是一个很好的例子。Pages 在每次`master`有代码更新时,会执行 Jekyll 博客软件,然后使您的站点可以通过某个 GitHub 域名来访问。对于我们来说这些事情太琐碎了,我现在我们只需要在本地进行修改,然后使用 git 提交代码,发布到远端。CI 会自动帮我们处理后续的事情。 From bd33ae7be4cc4390f8faa8c26d02b3a02ea86b99 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Mon, 25 May 2020 10:32:01 +0800 Subject: [PATCH 375/640] mark metaprogramming as done --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md 
index cbf06a32..8c49bd9f 100644 --- a/README.md +++ b/README.md @@ -33,7 +33,7 @@ To contribute to this tanslation project, please book your topic by creating an | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO | -| [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress | From abf0822b59789c89846eabbbb4e23cdd99938bc5 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Mon, 25 May 2020 20:24:34 +0800 Subject: [PATCH 376/640] book debugging --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 8c49bd9f..08213fcb 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ To contribute to this tanslation project, please book your topic by creating an | 
[data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) | | TO-DO | +| [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | From fadada629a223c6793c54a035eec5ca190953ecb Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Tue, 26 May 2020 08:06:36 +0800 Subject: [PATCH 377/640] update trans --- _2020/debugging-profiling.md | 103 ++++++++++++++++++----------------- 1 file changed, 52 insertions(+), 51 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index e39bbd1b..2de642b1 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -8,25 +8,28 @@ video: id: l812pUnKxME --- -A golden rule in programming is that code does not do what you expect it to do, but 
what you tell it to do. -Bridging that gap can sometimes be a quite difficult feat. -In this lecture we are going to cover useful techniques for dealing with buggy and resource hungry code: debugging and profiling. -# Debugging +代码不能完全按照您的想法运行,它只能完全按照您的写法运行,这是编程界的一条金科玉律。 -## Printf debugging and Logging +让您的写法符合您的想法是非常困难的。在这节课中,我们会传授给您一些非常有用技术,帮您处理代码中的 bug 和程序性能问题。 -"The most effective debugging tool is still careful thought, coupled with judiciously placed print statements" — Brian Kernighan, _Unix for Beginners_. -A first approach to debug a program is to add print statements around where you have detected the problem, and keep iterating until you have extracted enough information to understand what is responsible for the issue. +# 调试代码 -A second approach is to use logging in your program, instead of ad hoc print statements. Logging is better than regular print statements for several reasons: +## 打印调试法与日志 -- You can log to files, sockets or even remote servers instead of standard output. -- Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR, &c), that allow you to filter the output accordingly. -- For new issues, there's a fair chance that your logs will contain enough information to detect what is going wrong. +"最有效的 debug 工具就是细致的分析配合位于恰当位置的打印语句" — Brian Kernighan, _Unix 新手入门_。 -[Here](/static/files/logger.py) is an example code that logs messages: +调试代码的第一种方法往往是在您发现问题的地方添加一些打印语句,然后不断重复此过程直到您获取了足够的信息并可以找到问题的根本原因。 + +另外一个方法是使用日志,而不是临时添加打印语句。日志较普通的打印语句有如下的一些优势: + +- 您可以将日志写入文件、socket 或者甚至是发送到远端服务器而不仅仅是标准输出; +- 日志可以支持严重等级(例如 INFO, DEBUG, WARN, ERROR等),这使您可以根据需要过滤日志; +- 对于新发现的问题,很可能您的日志中已经包含了可以帮助您定位问题的足够的信息。 + + +[这里](/static/files/logger.py) 是一个包含日志的例程序: ```bash $ python logger.py @@ -39,9 +42,9 @@ $ python logger.py color # Color formatted output ``` -One of my favorite tips for making logs more readable is to color code them. -By now you probably have realized that your terminal uses colors to make things more readable. But how does it do it? 
-Programs like `ls` or `grep` are using [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code), which are special sequences of characters to indicate your shell to change the color of the output. For example, executing `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` prints the message `This is red` in red on your terminal. The following script shows how to print many RGB colors into your terminal. +有很多技巧可以使日志的可读性变得更好,我最喜欢的一个是技巧是对其进行着色。到目前为止,您应该已经知道,以彩色文本显示终端信息时可读性更好。但是应该如何设置呢? + +`ls` 和 `grep` 这样的程序会使用 [ANSI escape codes](https://en.wikipedia.org/wiki/ANSI_escape_code),它是一系列的特殊字符,可以使您的 shell 改变输出结果的颜色。例如,执行 `echo -e "\e[38;2;255;0;0mThis is red\e[0m"` 会打印红色的字符串:`This is red` 。下面这个脚本向您展示了如何在终端中打印多种颜色。 ```bash #!/usr/bin/env bash @@ -54,24 +57,22 @@ for R in $(seq 0 20 255); do done ``` -## Third party logs +## 第三方日志系统 -As you start building larger software systems you will most probably run into dependencies that run as separate programs. -Web servers, databases or message brokers are common examples of this kind of dependencies. -When interacting with these systems it is often necessary to read their logs, since client side error messages might not suffice. +如果您正在构建大型软件系统,您很可能会使用到一些依赖,有些依赖会作为程序单独运行。如 Web 服务器、数据库或消息代理都是此类常见的第三方依赖。 -Luckily, most programs write their own logs somewhere in your system. -In UNIX systems, it is commonplace for programs to write their logs under `/var/log`. -For instance, the [NGINX](https://www.nginx.com/) webserver places its logs under `/var/log/nginx`. -More recently, systems have started using a **system log**, which is increasingly where all of your log messages go. -Most (but not all) Linux systems use `systemd`, a system daemon that controls many things in your system such as which services are enabled and running. -`systemd` places the logs under `/var/log/journal` in a specialized format and you can use the [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) command to display the messages. 
-Similarly, on macOS there is still `/var/log/system.log` but an increasing number of tools use the system log, that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). -On most UNIX systems you can also use the [`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) command to access the kernel log. +和这些系统交互的时候,阅读它们的日志是非常必要的,因为仅靠客户端侧的错误信息可能并不足以定位问题。 -For logging under the system logs you can use the [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) shell program. -Here's an example of using `logger` and how to check that the entry made it to the system logs. -Moreover, most programming languages have bindings logging to the system log. +幸运的是,大多数的程序都会将日志保存在您的系统中的某个地方。对于 UNIX 系统来说,程序的日志通常存放在 `/var/log`。 +例如, [NGINX](https://www.nginx.com/) web 服务器就将其日志存放于`/var/log/nginx`。 +最近,系统开始使用 **system log**,您所有的日志都会保存在这里。大多数的(但不是全部)Linux 系统都会使用 `systemd`,这是一个系统守护进程,它会控制您系统中的很多东西,例如哪些服务应该启动并运行。`systemd` 会将日志以某种特殊格式存放于`/var/log/journal`,您可以使用 [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) 命令显示这些消息。 +类似地,在 macOS there is still `/var/log/system.log` but an increasing number of tools use the system log, that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). + +对于大多数的 UNIX 系统,您也可以使用[`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) 命令来读取内核的日志。 + +如果您希望将日志加入到系统日志中,您可以使用 [`logger`](http://man7.org/linux/man-pages/man1/logger.1.html) 这个 shell 程序。下面这个例子显示了如何使用 `logger`并且如何找到能够将其存入系统日志的条目。 + +不仅如此,大多数的编程语言都支持向系统日志中写日志。 ```bash logger "Hello Logs" @@ -81,33 +82,33 @@ log show --last 1m | grep Hello journalctl --since "1m ago" | grep Hello ``` -As we saw in the data wrangling lecture, logs can be quite verbose and they require some level of processing and filtering to get the information you want. -If you find yourself heavily filtering through `journalctl` and `log show` you can consider using their flags, which can perform a first pass of filtering of their output. 
-There are also some tools like [`lnav`](http://lnav.org/), that provide an improved presentation and navigation for log files. +正如我们在数据整理那节课上看到的那样,日志的内容可以非常的多,我们需要对其进行处理和过滤才能得到我们想要的信息。 + +如果您发现您需要对 `journalctl` 和 `log show` 的结果进行大量的过滤,那么此时可以考虑使用它们自带的选项对其结果先过滤一遍再输出。还有一些像 [`lnav`](http://lnav.org/) 这样的工具,它为日志文件提供了更好的展现和浏览方式。 + +## 调试器 -## Debuggers +当通过打印已经不能满足您的调试需求时,您应该使用调试器。 -When printf debugging is not enough you should use a debugger. -Debuggers are programs that let you interact with the execution of a program, allowing the following: +调试器是一种可以允许我们和正在执行的程序进行交互的程序,它可以做到: -- Halt execution of the program when it reaches a certain line. -- Step through the program one instruction at a time. -- Inspect values of variables after the program crashed. -- Conditionally halt the execution when a given condition is met. -- And many more advanced features +- 当到达某一行时将程序暂停; +- 一次一条指令地逐步执行程序; +- 程序崩溃后查看变量的值; +- 满足特定条件是暂停程序; +- 其他高级功能。 -Many programming languages come with some form of debugger. -In Python this is the Python Debugger [`pdb`](https://docs.python.org/3/library/pdb.html). +很多编程语言都有自己的调试器。Python 的调试器是[`pdb`](https://docs.python.org/3/library/pdb.html). -Here is a brief description of some of the commands `pdb` supports: +下面对`pdb` 支持对命令进行简单对介绍: -- **l**(ist) - Displays 11 lines around the current line or continue the previous listing. -- **s**(tep) - Execute the current line, stop at the first possible occasion. -- **n**(ext) - Continue execution until the next line in the current function is reached or it returns. -- **b**(reak) - Set a breakpoint (depending on the argument provided). -- **p**(rint) - Evaluate the expression in the current context and print its value. There's also **pp** to display using [`pprint`](https://docs.python.org/3/library/pprint.html) instead. -- **r**(eturn) - Continue execution until the current function returns. -- **q**(uit) - Quit the debugger. 
+- **l**(ist) - 显示当前行附近的11行或继续执行之前的显示; +- **s**(tep) - 执行当前行,并在第一个可能的地方停止 +- **n**(ext) - 继续执行直到当前函数的下一条语句或者 return 语句; +- **b**(reak) - 设置断点(基于传入对参数); +- **p**(rint) - 在当前上下文对表达式求值并打印结果。还有一个命令是**pp** ,它使用 [`pprint`](https://docs.python.org/3/library/pprint.html) 打印; +- **r**(eturn) - 继续执行知道当前函数返回; +- **q**(uit) - 退出调试器。 Let's go through an example of using `pdb` to fix the following buggy python code. (See the lecture video). From d16a49280c8f6debf5c75fe74a10c34d772a78e8 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Tue, 26 May 2020 08:18:34 +0800 Subject: [PATCH 378/640] update trans --- _2020/debugging-profiling.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 2de642b1..45a1466f 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -110,7 +110,7 @@ journalctl --since "1m ago" | grep Hello - **r**(eturn) - 继续执行知道当前函数返回; - **q**(uit) - 退出调试器。 -Let's go through an example of using `pdb` to fix the following buggy python code. (See the lecture video). +让我们使用`pdb` 来修复下面的 Python 代码(参考讲座视频) ```python def bubble_sort(arr): @@ -126,8 +126,8 @@ print(bubble_sort([4, 2, 1, 8, 7, 6])) ``` -Note that since Python is an interpreted language we can use the `pdb` shell to execute commands and to execute instructions. -[`ipdb`](https://pypi.org/project/ipdb/) is an improved `pdb` that uses the [`IPython`](https://ipython.org) REPL enabling tab completion, syntax highlighting, better tracebacks, and better introspection while retaining the same interface as the `pdb` module. 
+注意,因为 Python 是一种解释型语言,所以我们可以通过 `pdb` shell 执行命令。 +[`ipdb`](https://pypi.org/project/ipdb/) 是一种增强型的 `pdb` ,它使用[`IPython`](https://ipython.org) 作为 REPL并开启了 tab 补全、语法高亮、更好的回溯和更好的内省,同时还保留了`pdb` 模块相同的接口。 For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, &c. From 8184d365c1ce546462a89747d9bd34970c2a7f9e Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Tue, 26 May 2020 08:37:36 +0800 Subject: [PATCH 379/640] update trans --- _2020/debugging-profiling.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 45a1466f..c8e43bd3 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -129,17 +129,15 @@ print(bubble_sort([4, 2, 1, 8, 7, 6])) 注意,因为 Python 是一种解释型语言,所以我们可以通过 `pdb` shell 执行命令。 [`ipdb`](https://pypi.org/project/ipdb/) 是一种增强型的 `pdb` ,它使用[`IPython`](https://ipython.org) 作为 REPL并开启了 tab 补全、语法高亮、更好的回溯和更好的内省,同时还保留了`pdb` 模块相同的接口。 -For more low level programming you will probably want to look into [`gdb`](https://www.gnu.org/software/gdb/) (and its quality of life modification [`pwndbg`](https://github.com/pwndbg/pwndbg)) and [`lldb`](https://lldb.llvm.org/). -They are optimized for C-like language debugging but will let you probe pretty much any process and get its current machine state: registers, stack, program counter, &c. 
+对于更底层的编程语言,您可能需要了解一下 [`gdb`](https://www.gnu.org/software/gdb/) ( 以及它的改进版 [`pwndbg`](https://github.com/pwndbg/pwndbg)) 和 [`lldb`](https://lldb.llvm.org/)。 +它们都对类 C 语言的调试进行了优化,它允许您探索任意进程及其机器状态:寄存器、堆栈、程序计数器等。 -## Specialized Tools +## 专门工具 -Even if what you are trying to debug is a black box binary there are tools that can help you with that. -Whenever programs need to perform actions that only the kernel can, they use [System Calls](https://en.wikipedia.org/wiki/System_call). -There are commands that let you trace the syscalls your program makes. In Linux there's [`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) and macOS and BSD have [`dtrace`](http://dtrace.org/blogs/about/). `dtrace` can be tricky to use because it uses its own `D` language, but there is a wrapper called [`dtruss`](https://www.manpagez.com/man/1/dtruss/) that provides an interface more similar to `strace` (more details [here](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html)). +即使您需要调试的程序是一个二进制的黑盒程序,仍然有一些工具可以帮助到您。当您的程序需要执行一些只有操作系统内核才能完成的操作时,它需要使用 [系统调用](https://en.wikipedia.org/wiki/System_call)。有一些命令可以帮助您追踪您的程序执行的系统调用。在 Linux 中可以使用[`strace`](http://man7.org/linux/man-pages/man1/strace.1.html) ,在 macOS 和 BSD 中可以使用 [`dtrace`](http://dtrace.org/blogs/about/)。`dtrace` 用起来可能有些别扭,因为它使用的是它自有的 `D` 语言,但是我们可以使用一个叫做 [`dtruss`](https://www.manpagez.com/man/1/dtruss/) 的封装使其具有和 `strace` (更多信息参考 [这里](https://8thlight.com/blog/colin-jones/2015/11/06/dtrace-even-better-than-strace-for-osx.html))类似的接口 -Below are some examples of using `strace` or `dtruss` to show [`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) syscall traces for an execution of `ls`. For a deeper dive into `strace`, [this](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2) is a good read. 
+下面的例子展现来如何使用 `strace` 或 `dtruss` 来显示`ls` 执行时,对[`stat`](http://man7.org/linux/man-pages/man2/stat.2.html) 系统调用进行追踪对结果。若需要深入了解 `strace`,[这篇文章](https://blogs.oracle.com/linux/strace-the-sysadmins-microscope-v2) 值得一读。 ```bash # On Linux From 117a49449b22abaf0fcd25ade30c14bf5cc817ad Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E7=8E=8B=E5=85=B4=E5=BD=AC=5FBinboy?= Date: Fri, 22 May 2020 00:16:26 +0800 Subject: [PATCH 380/640] Translate about --- about.md | 145 ++++++++++++++++++++++--------------------------------- 1 file changed, 57 insertions(+), 88 deletions(-) diff --git a/about.md b/about.md index a67db7a0..66bd283f 100644 --- a/about.md +++ b/about.md @@ -1,141 +1,110 @@ --- layout: lecture -title: "Why we are teaching this class" +title: "开设此课程的动机" --- -During a traditional Computer Science education, chances are you will take -plenty of classes that teach you advanced topics within CS, everything from -Operating Systems to Programming Languages to Machine Learning. But at many -institutions there is one essential topic that is rarely covered and is instead -left for students to pick up on their own: computing ecosystem literacy. - -Over the years, we have helped teach several classes at MIT, and over and over -we have seen that many students have limited knowledge of the tools available -to them. Computers were built to automate manual tasks, yet students often -perform repetitive tasks by hand or fail to take full advantage of powerful -tools such as version control and text editors. In the best case, this results -in inefficiencies and wasted time; in the worst case, it results in issues like -data loss or inability to complete certain tasks. - -These topics are not taught as part of the university curriculum: students are -never shown how to use these tools, or at least not how to use them -efficiently, and thus waste time and effort on tasks that _should_ be simple. 
-The standard CS curriculum is missing critical topics about the computing -ecosystem that could make students' lives significantly easier. +在传统的计算机科学课程中,从操作系统、编程语言到机器学习,这些高大上课程和主题已经非常多了。 +然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 这部分内容就是:精通工具。 + +这些年,我们在麻省理工学院参与了许多课程的助教活动,过程当中愈发意识到很多学生对于工具的了解知之甚少。 +计算机设计的初衷就是任务自动化,然而学生们却常常陷在大量的重复任务中,或者无法完全发挥出诸如 +版本控制、文本编辑器等强大作用。效率低下和浪费时间还是其次,更糟糕的是,这还可能导致数据丢失或 +无法完成某些特定任务。 + +这些主题不是大学课程的一部分:学生一直都不知道如何使用这些工具,或者说,至少是不知道如何高效 +地使用,因此浪费了时间和精力在本来可以更简单的任务上。标准的计算机科学课程缺少了这门能让计算 +变得更简捷的关键课程。 # The missing semester of your CS education -To help remedy this, we are running a class that covers all the topics we -consider crucial to be an effective computer scientist and programmer. The -class is pragmatic and practical, and it provides hands-on introduction to -tools and techniques that you can immediately apply in a wide variety of -situations you will encounter. The class is being run during MIT's "Independent -Activities Period" in January 2020 — a one-month semester that features shorter -student-run classes. While the lectures themselves are only available to MIT -students, we will provide all lecture materials along with video recordings of -lectures to the public. +为了解决这个问题,我们开启了一个课程,涵盖各项对成为高效率计算机科学家或程序员至关重要的 +主题。这个课程实用且具有很强的实践性,提供了各种能够立即广泛应用解决问题的趁手工具指导。 +该课在 2020 年 1 月”独立活动期“开设,为期一个月,是学生开办的短期课程。虽然该课程针对 +麻省理工学院,但我们公开提供了全部课程的录制视频与相关资料。 -If this sounds like it might be for you, here are some concrete -examples of what the class will teach: +如果该课程适合你,那么以下还有一些具体的课程示例: -## Command shell +## 命令行与 shell 工具 -How to automate common and repetitive tasks with aliases, scripts, -and build systems. No more copy-pasting commands from a text -document. No more "run these 15 commands one after the other". No -more "you forgot to run this thing" or "you forgot to pass this -argument". 
+如何使用别名、脚本和构建系统来自动化执行通用重复的任务。不再总是从文档中拷贝粘贴 +命令。不要再“逐个执行这 15 个命令”,不要再“你忘了执行这个命令”、“你忘了传那个 +参数”,类似的对话不要再有了。 -For example, searching through your history quickly can be a huge time saver. In the example below we show several tricks related to navigating your shell history for `convert` commands. +例如,快速搜索历史记录可以节省大量时间。在下面这个示例中,我们展示了如何通过`convert`命令 +在历史记录中跳转的一些技巧。 -## Version control +## 版本控制 -How to use version control _properly_, and take advantage of it to -save you from disaster, collaborate with others, and quickly find and -isolate problematic changes. No more `rm -rf; git clone`. No more -merge conflicts (well, fewer of them at least). No more huge blocks -of commented-out code. No more fretting over how to find what broke -your code. No more "oh no, did we delete the working code?!". We'll -even teach you how to contribute to other people's projects with pull -requests! +如何**正确地**使用版本控制,利用它避免尴尬的情况发生,与他人协作,并且能够快速定位 +有问题的提交改动。不再`rm -rf; git clone`。不再合并冲突(当然,至少是更少发生)。 +不再大量注释代码。不再为解决 bug 而找遍所有代码。不再“我去,刚才是删了有用的代码?!”。 +我们将教你如何通过请求合并来为他人的项目贡献代码。 + +下面这个示例中,我们使用`git bisect`来定位哪个提交破坏了单元测试,并且通过`git rever`来进行修复。 -In the example below we use `git bisect` to find which commit broke a unit test and then we fix it with `git revert`. -## Text editing +## 文本编辑 + +不论是本地还是远程,如何通过命令行高效地编辑文件,并且充分利用编辑器特性。不再来回复制 +文件。不再重复编辑文件。 -How to efficiently edit files from the command-line, both locally and -remotely, and take advantage of advanced editor features. No more -copying files back and forth. No more repetitive file editing. +Vim 的宏是它最好的特性之一,在下面这个示例中,我们使用嵌套的 Vim 宏快速地将 html 表格转换成了 csv 格式。 -Vim macros are one of its best features, in the example below we quickly convert an html table to csv format using a nested vim macro. -## Remote machines +## 远程服务器 -How to stay sane when working with remote machines using SSH keys and -terminal multiplexing. No more keeping many terminals open just to -run two commands at once. No more typing your password every time you -connect. 
No more losing everything just because your Internet -disconnected or you had to reboot your laptop. +使用 SSH 密钥在远程机器下工作如何保持清醒,并且终端能够复用。不再为了仅执行个别命令 +总是打开许多命令终端。不再每次连接都总输入密码。不再因为网络断开或必须重启笔记本时 +就丢失全部上下文。 -In the example below we use `tmux` to keep sessions alive in remote servers and `mosh` to support network roaming and disconnection. +以下示例,我们使用`tmux`来保持会话在远程服务器活跃,并使用`mosh`来支持网络漫游和断开连接。 -## Finding files +## 查找文件 -How to quickly find files that you are looking for. No -more clicking through files in your project until you find the one -that has the code you want. +如何快速查找你需要的文件。不再挨个点击项目中的文件,直到找到你所需的代码。 -In the example below we quickly look for files with `fd` and for code snippets with `rg`. We also quickly `cd` and `vim` recent/frequent files/folder using `fasd`. +以下示例,我们通过`fd`快速查找文件,通过`rg`找代码片段。我们也用到了`fasd`快速`cd`并`vim`最近/常用的文件/文件夹。 -## Data wrangling +## 数据处理 -How to quickly and easily modify, view, parse, plot, and compute over -data and files directly from the command-line. No more copy pasting -from log files. No more manually computing statistics over data. No -more spreadsheet plotting. +如何通过命令行直接轻松快速地修改、查看、解析、绘制和计算数据和文件。不再从日志文件拷贝 +粘贴。不再手动统计数据。不再用电子表格画图。 -## Virtual machines +## 虚拟机 -How to use virtual machines to try out new operating systems, isolate -unrelated projects, and keep your main machine clean and tidy. No -more accidentally corrupting your computer while doing a security -lab. No more millions of randomly installed packages with differing -versions. +如何使用虚拟机尝试新操作系统,隔离无关的项目,并且保持宿主机整洁。不再因为做安全实验而 +意外损坏你的计算机。不再有大量随机安装的不同版本软件包。 -## Security +## 安全 -How to be on the Internet without immediately revealing all of your -secrets to the world. No more coming up with passwords that match the -insane criteria yourself. No more unsecured, open WiFi networks. No -more unencrypted messaging. 
+如何在不泄露隐私的情况下畅游互联网。不再抓破脑袋想符合自己疯狂规则的密码。不再连接不安全 +的开放 WiFi 网络。不再未加密传输消息。 -# Conclusion +# 结论 -This, and more, will be covered across the 12 class lectures, each including an -exercise for you to get more familiar with the tools on your own. If you can't -wait for January, you can also take a look at the lectures from [Hacker -Tools](https://hacker-tools.github.io/lectures/), which we ran during IAP last -year. It is the precursor to this class, and covers many of the same topics. +以上包括更多的内容将涵盖在 12 节课程中,每堂课都包括能让你自己更熟悉这些工具的练手小测验。如果不能 +等到一月,你也可以看下[黑客工具](https://hacker-tools.github.io/lectures/),这是我们去年的 +试讲。它是本课程的前身,包含许多相同的主题。 -We hope to see you in January, whether virtually or in person! +无论面对面还是远程在线,欢迎你的参与。 Happy hacking,
    Anish, Jose, and Jon From c7311d37435202b1750f1187927a8c33c50db712 Mon Sep 17 00:00:00 2001 From: "dependabot[bot]" <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue, 26 May 2020 20:08:51 +0000 Subject: [PATCH 381/640] Bump activesupport from 6.0.2.1 to 6.0.3.1 Bumps [activesupport](https://github.com/rails/rails) from 6.0.2.1 to 6.0.3.1. - [Release notes](https://github.com/rails/rails/releases) - [Changelog](https://github.com/rails/rails/blob/v6.0.3.1/activesupport/CHANGELOG.md) - [Commits](https://github.com/rails/rails/compare/v6.0.2.1...v6.0.3.1) Signed-off-by: dependabot[bot] --- Gemfile.lock | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/Gemfile.lock b/Gemfile.lock index 3f6d2449..a5e36e7a 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -1,12 +1,12 @@ GEM remote: https://rubygems.org/ specs: - activesupport (6.0.2.1) + activesupport (6.0.3.1) concurrent-ruby (~> 1.0, >= 1.0.2) i18n (>= 0.7, < 2) minitest (~> 5.1) tzinfo (~> 1.1) - zeitwerk (~> 2.2) + zeitwerk (~> 2.2, >= 2.2.2) addressable (2.7.0) public_suffix (>= 2.0.2, < 5.0) coffee-script (2.4.1) @@ -16,7 +16,7 @@ GEM colorator (1.1.0) commonmarker (0.17.13) ruby-enum (~> 0.5) - concurrent-ruby (1.1.5) + concurrent-ruby (1.1.6) dnsruby (1.61.3) addressable (~> 2.5) em-websocket (0.5.1) @@ -202,7 +202,7 @@ GEM jekyll (>= 3.5, < 5.0) jekyll-feed (~> 0.9) jekyll-seo-tag (~> 2.1) - minitest (5.13.0) + minitest (5.14.1) multipart-post (2.1.1) nokogiri (1.10.8) mini_portile2 (~> 2.4.0) @@ -233,10 +233,10 @@ GEM thread_safe (0.3.6) typhoeus (1.3.1) ethon (>= 0.9.0) - tzinfo (1.2.6) + tzinfo (1.2.7) thread_safe (~> 0.1) unicode-display_width (1.6.0) - zeitwerk (2.2.2) + zeitwerk (2.3.0) PLATFORMS ruby From fce1f8dcd363d42caa897425226e174adac70da7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A3=92=E6=A3=92=E5=BD=AC=5FBinboy?= Date: Wed, 27 May 2020 20:15:53 +0800 Subject: [PATCH 382/640] Update about.md Co-authored-by: Lingfeng_Ai --- about.md | 2 +- 1 file 
changed, 1 insertion(+), 1 deletion(-) diff --git a/about.md b/about.md index 66bd283f..5ac70e95 100644 --- a/about.md +++ b/about.md @@ -8,7 +8,7 @@ title: "开设此课程的动机" 这些年,我们在麻省理工学院参与了许多课程的助教活动,过程当中愈发意识到很多学生对于工具的了解知之甚少。 计算机设计的初衷就是任务自动化,然而学生们却常常陷在大量的重复任务中,或者无法完全发挥出诸如 -版本控制、文本编辑器等强大作用。效率低下和浪费时间还是其次,更糟糕的是,这还可能导致数据丢失或 +版本控制、文本编辑器等工具的强大作用。效率低下和浪费时间还是其次,更糟糕的是,这还可能导致数据丢失或 无法完成某些特定任务。 这些主题不是大学课程的一部分:学生一直都不知道如何使用这些工具,或者说,至少是不知道如何高效 From b400670fc2eb78c0726ba9351c5d8411d4cb2c32 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A3=92=E6=A3=92=E5=BD=AC=5FBinboy?= Date: Wed, 27 May 2020 20:21:39 +0800 Subject: [PATCH 383/640] Apply suggestions from code review Co-authored-by: Lingfeng_Ai --- about.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/about.md b/about.md index 5ac70e95..3db2e92c 100644 --- a/about.md +++ b/about.md @@ -40,9 +40,10 @@ title: "开设此课程的动机" ## 版本控制 如何**正确地**使用版本控制,利用它避免尴尬的情况发生,与他人协作,并且能够快速定位 -有问题的提交改动。不再`rm -rf; git clone`。不再合并冲突(当然,至少是更少发生)。 +如何**正确地**使用版本控制,利用它避免尴尬的情况发生。与他人协作,并且能够快速定位 +有问题的提交 不再大量注释代码。不再为解决 bug 而找遍所有代码。不再“我去,刚才是删了有用的代码?!”。 -我们将教你如何通过请求合并来为他人的项目贡献代码。 +我们将教你如何通过拉取请求来为他人的项目贡献代码。 下面这个示例中,我们使用`git bisect`来定位哪个提交破坏了单元测试,并且通过`git rever`来进行修复。 @@ -96,11 +97,11 @@ Vim 的宏是它最好的特性之一,在下面这个示例中,我们使用 ## 安全 如何在不泄露隐私的情况下畅游互联网。不再抓破脑袋想符合自己疯狂规则的密码。不再连接不安全 -的开放 WiFi 网络。不再未加密传输消息。 +的开放 WiFi 网络。不再传输未加密的信息。 # 结论 -以上包括更多的内容将涵盖在 12 节课程中,每堂课都包括能让你自己更熟悉这些工具的练手小测验。如果不能 +这 12 节课将包括但不限于以上内容,同时每堂课都提供了能帮助你熟悉这些工具的练手小测验。如果不能 等到一月,你也可以看下[黑客工具](https://hacker-tools.github.io/lectures/),这是我们去年的 试讲。它是本课程的前身,包含许多相同的主题。 From dd7aee9a9dc4063cbbed93096cd1e62758a35a7f Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Wed, 27 May 2020 23:30:41 +0800 Subject: [PATCH 384/640] mark about.md as done --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 08213fcb..8df220ef 100644 --- a/README.md +++ b/README.md @@ -37,4 +37,4 @@ To contribute to this tanslation project, please book your topic by creating an | 
[security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress | -| [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | In-progress | +| [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | Done | From 6d88d0898aa7e417791e692326797f8a2b57f804 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Wed, 27 May 2020 23:33:30 +0800 Subject: [PATCH 385/640] Update nav.html --- _includes/nav.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_includes/nav.html b/_includes/nav.html index b953e891..1bbd9d2c 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -5,8 +5,8 @@ From a882dd21c3601c294be7e032b35d8cabc67705ca Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Thu, 28 May 2020 23:59:34 -0400 Subject: [PATCH 386/640] update kdf --- _2020/security.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index 89cc6fcf..dd51433c 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -98,9 +98,8 @@ checking `sha256(r)` matches the hash I shared earlier. ## 密钥生成函数的应用 -- Producing keys from passphrases for use in other cryptographic algorithms -(e.g. symmetric cryptography, see below). -- Storing login credentials. Storing plaintext passwords is bad; the right +- 从密码生成可以在其他加密算法中使用的密钥,比如对称加密算法(见下)。 +- 存储登录Storing login credentials. 
Storing plaintext passwords is bad; the right approach is to generate and store a random [salt](https://en.wikipedia.org/wiki/Salt_(cryptography)) `salt = random()` for each user, store `KDF(password + salt)`, and verify login attempts by From 1335b50c8dd1b741be1d729c9ec0aecf535bce6e Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Fri, 29 May 2020 02:57:44 -0400 Subject: [PATCH 387/640] commitment scheme, kdf, symmetric crypto --- _2020/security.md | 71 +++++++++++++++++++---------------------------- 1 file changed, 29 insertions(+), 42 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index dd51433c..f3ef4d72 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -78,18 +78,16 @@ f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0 ## 密码散列函数的应用 -- Git中的内容寻址存储(Content addressed storage):[散列函数](https://en.wikipedia.org/wiki/Hash_function) 是一个宽泛的概念(存在非密码学的散列函数),那么Git为什么要特意使用密码散列函数? +- Git中的内容寻址存储(Content addressed storage):[散列函数](https://en.wikipedia.org/wiki/Hash_function)是一个宽泛的概念(存在非密码学的散列函数),那么Git为什么要特意使用密码散列函数? - 文件的信息摘要(Message digest):像Linux ISO这样的软件可以从非官方的(有时不太可信的)镜像站下载,所以需要设法确认下载的软件和官方一致。 官方网站一般会在(指向镜像站的)下载链接旁边备注安装文件的哈希值。 用户从镜像站下载安装文件后可以对照公布的哈希值来确定安装文件没有被篡改。 -- [Commitment schemes](https://en.wikipedia.org/wiki/Commitment_scheme). -Suppose you want to commit to a particular value, but reveal the value itself -later. For example, I want to do a fair coin toss "in my head", without a -trusted shared coin that two parties can see. I could choose a value `r = -random()`, and then share `h = sha256(r)`. Then, you could call heads or tails -(we'll agree that even `r` means heads, and odd `r` means tails). After you -call, I can reveal my value `r`, and you can confirm that I haven't cheated by -checking `sha256(r)` matches the hash I shared earlier. 
+- [承诺机制](https://en.wikipedia.org/wiki/Commitment_scheme)(Commitment scheme): +假设我希望承诺一个值,但之后再透露它—— +比如在没有一个可信的、双方可见的硬币的情况下在我的脑海中公平的“扔一次硬币”。 +我可以选择一个值`r = random()`,并和你分享它的哈希值`h = sha256(r)`。 +这时你可以开始猜硬币的正反:我们一致同意偶数`r`代表正面,奇数`r`代表反面。 +你猜完了以后,我告诉你值`r`的内容,得出胜负。同时你可以使用`sha256(r)`来检查我分享的哈希值`h`以确认我没有作弊。 # 密钥生成函数 @@ -99,37 +97,31 @@ checking `sha256(r)` matches the hash I shared earlier. ## 密钥生成函数的应用 - 从密码生成可以在其他加密算法中使用的密钥,比如对称加密算法(见下)。 -- 存储登录Storing login credentials. Storing plaintext passwords is bad; the right -approach is to generate and store a random -[salt](https://en.wikipedia.org/wiki/Salt_(cryptography)) `salt = random()` for -each user, store `KDF(password + salt)`, and verify login attempts by -re-computing the KDF given the entered password and the stored salt. +- 存储登录凭证时不可直接存储明文密码。
+正确的方法是针对每个用户随机生成一个[盐](https://en.wikipedia.org/wiki/Salt_(cryptography)) `salt = random()`, +并存储盐,以及密钥生成函数对连接了盐的明文密码生成的哈希值`KDF(password + salt)`。
+在验证登录请求时,使用输入的密码连接存储的盐重新计算哈希值`KDF(input + salt)`,并与存储的哈希值对比。 # 对称加密 -Hiding message contents is probably the first concept you think about when you -think about cryptography. Symmetric cryptography accomplishes this with the -following set of functionality: +说到加密,可能你会首先想到隐藏明文信息。对称加密使用以下几个方法来实现这个功能: ``` -keygen() -> key (this function is randomized) +keygen() -> key (这是一个随机方法) -encrypt(plaintext: array, key) -> array (the ciphertext) -decrypt(ciphertext: array, key) -> array (the plaintext) +encrypt(plaintext: array, key) -> array (输出密文) +decrypt(ciphertext: array, key) -> array (输出明文) ``` -The encrypt function has the property that given the output (ciphertext), it's -hard to determine the input (plaintext) without the key. The decrypt function -has the obvious correctness property, that `decrypt(encrypt(m, k), k) = m`. +加密方法`encrypt()`输出的密文`ciphertext`很难在不知道`key`的情况下得出明文`plaintext`。
+解密方法`decrypt()`有明显的正确性。因为功能要求给定密文及其密钥,解密方法必须输出明文:`decrypt(encrypt(m, k), k) = m`。 -An example of a symmetric cryptosystem in wide use today is -[AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard). +[AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) 是现在常用的一种对称加密系统。 ## 对称加密的应用 -- Encrypting files for storage in an untrusted cloud service. This can be -combined with KDFs, so you can encrypt a file with a passphrase. Generate `key -= KDF(passphrase)`, and then store `encrypt(file, key)`. +- 对在不信任的云服务上存储的文件进行加密。
对称加密可以和密钥生成函数配合使用,这样可以使用密码加密文件: +将密码输入密钥生成函数生成密钥 `key = KDF(passphrase)`,然后存储`encrypt(file, key)`。 # 非对称加密 @@ -270,23 +262,18 @@ security concepts, tips # 资源 -- [Last year's notes](/2019/security/): from when this lecture was more focused on security and privacy as a computer user -- [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): answers "what crypto should I use for X?" for many common X. +- [去年的讲稿](/2019/security/): 更注重于计算机用户可以如何增强隐私保护和安全 +- [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): +解答了在一些应用环境下“应该使用什么加密?”的问题 # 练习 -1. **Entropy.** - 1. Suppose a password is chosen as a concatenation of five lower-case - dictionary words, where each word is selected uniformly at random from a - dictionary of size 100,000. An example of such a password is - `correcthorsebatterystaple`. How many bits of entropy does this have? - 1. Consider an alternative scheme where a password is chosen as a sequence - of 8 random alphanumeric characters (including both lower-case and - upper-case letters). An example is `rg8Ql34g`. How many bits of entropy - does this have? - 1. Which is the stronger password? - 1. Suppose an attacker can try guessing 10,000 passwords per second. On - average, how long will it take to break each of the passwords? +1. **熵** + 1. 假设一个密码是从五个小写的单词拼接组成,每个单词都是从一个含有10万单词的字典中随机选择,且每个单词选中的概率相同。 + 一个符合这样构造的例子是`correcthorsebatterystaple`。这个密码有多少比特的熵? + 1. 假设另一个密码是用八个随机的大小写字母或数字组成。一个符合这样构造的例子是`rg8Ql34g`。这个密码又有多少比特的熵? + 1. 哪一个密码更强? + 1. 假设一个攻击者每秒可以尝试1万个密码,这个攻击者需要多久可以分别破解上述两个密码? 1. **Cryptographic hash functions.** Download a Debian image from a [mirror](https://www.debian.org/CD/http-ftp/) (e.g. 
[this file](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso) From 184e2870c2bff180b372ee474a3d8a130a4e20f6 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 29 May 2020 23:01:41 +0800 Subject: [PATCH 388/640] update trans --- _2020/debugging-profiling.md | 51 ++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 28 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index c8e43bd3..b004f69e 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -147,25 +147,21 @@ sudo strace -e lstat ls -l > /dev/null sudo dtruss -t lstat64_extended ls -l > /dev/null ``` -Under some circumstances, you may need to look at the network packets to figure out the issue in your program. -Tools like [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) and [Wireshark](https://www.wireshark.org/) are network packet analyzers that let you read the contents of network packets and filter them based on different criteria. +有些情况下,我们需要查看网络数据包才能定位问题。像 [`tcpdump`](http://man7.org/linux/man-pages/man1/tcpdump.1.html) 和 [Wireshark](https://www.wireshark.org/) 这样的网络数据包分析工具可以帮助您获取网络数据包的内容并基于不同的条件进行过滤。 -For web development, the Chrome/Firefox developer tools are quite handy. They feature a large number of tools, including: -- Source code - Inspect the HTML/CSS/JS source code of any website. -- Live HTML, CSS, JS modification - Change the website content, styles and behavior to test (you can see for yourself that website screenshots are not valid proofs). -- Javascript shell - Execute commands in the JS REPL. -- Network - Analyze the requests timeline. -- Storage - Look into the Cookies and local application storage. 
+对于 web 开发, Chrome/Firefox 的开发者工具非常方便,功能也很强大: +- 源码 -查看任意站点的 HTML/CSS/JS 源码; +- 实时地修改 HTML, CSS, JS 代码 - 修改网站的内容、样式和行为用于测试(从这一点您也能看出来,网页截图是不可靠的); +- Javascript shell - 在 JS REPL中执行命令; +- 网络 - 分析请求的时间线; +- 存储 - 查看 Cookies 和本地应用存储。 -## Static Analysis +## 静态分析 -For some issues you do not need to run any code. -For example, just by carefully looking at a piece of code you could realize that your loop variable is shadowing an already existing variable or function name; or that a program reads a variable before defining it. -Here is where [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis) tools come into play. -Static analysis programs take source code as input and analyze it using coding rules to reason about its correctness. +有些问题是您不需要执行代码就能发现的。例如,仔细观察一段代码,您就能发现某个循环变量覆盖了某个已经存在的变量或函数名;或是有个变量在被读取之前并没有被定义。 +这种情况下 [静态分析](https://en.wikipedia.org/wiki/Static_program_analysis) 工具就可以帮我们找到问题。静态分析会将程序的源码作为输入然后基于编码规则对其进行分析并对代码的正确性进行推理。 -In the following Python snippet there are several mistakes. -First, our loop variable `foo` shadows the previous definition of the function `foo`. We also wrote `baz` instead of `bar` in the last line, so the program will crash after completing the `sleep` call (which will take one minute). +下面这段 Python 代码中存在几个问题。 首先,我们的循环变量`foo` 覆盖了之前定义的函数`foo`。最后一行,我们还把 `bar` 错写成了`baz`,因此当程序完成`sleep` (一分钟后)后,执行到这一行的时候便会崩溃。 ```python import time @@ -180,11 +176,9 @@ bar *= 0.2 time.sleep(60) print(baz) ``` +静态分析工具可以发现此类的问题。当我们使用[`pyflakes`](https://pypi.org/project/pyflakes) 分析代码的似乎,我们会得到与这两处 bug 相关的错误信息。[`mypy`](http://mypy-lang.org/) 则是另外一个工具,它可以对代码进行类型检查。这里,`mypy` 会经过我们`bar` 起初是一个 `int` ,然后变成了 `float`。这些问题都可以在不允许代码的情况下被发现。 -Static analysis tools can identify this kind of issues. When we run [`pyflakes`](https://pypi.org/project/pyflakes) on the code we get the errors related to both bugs. [`mypy`](http://mypy-lang.org/) is another tool that can detect type checking issues. 
Here, `mypy` will warn us that `bar` is initially an `int` and is then casted to a `float`. -Again, note that all these issues were detected without having to run the code. - -In the shell tools lecture we covered [`shellcheck`](https://www.shellcheck.net/), which is a similar tool for shell scripts. +在 shell 工具那一节课的时候,我们介绍了 [`shellcheck`](https://www.shellcheck.net/),这是一个类似的工具,但它是应用于 shell 脚本的。 ```bash $ pyflakes foobar.py @@ -198,18 +192,19 @@ foobar.py:11: error: Name 'baz' is not defined Found 3 errors in 1 file (checked 1 source file) ``` -Most editors and IDEs support displaying the output of these tools within the editor itself, highlighting the locations of warnings and errors. -This is often called **code linting** and it can also be used to display other types of issues such as stylistic violations or insecure constructs. +大多数的编辑器和 IDE 都支持在编辑界面显示这些工具的分析结果、高亮有警告和错误的位置。 +这个过程通常成为 **code linting** 。风格检查或安全检查的结果同样也可以进行相应的显示。 + +在 vim 中,有 [`ale`](https://vimawesome.com/plugin/ale) 或 [`syntastic`](https://vimawesome.com/plugin/syntastic) 可以帮助您做同样的事情。 +在 Python 中, [`pylint`](https://www.pylint.org) 和 [`pep8`](https://pypi.org/project/pep8/) 是两种用于进行风格检查的工具,而 [`bandit`](https://pypi.org/project/bandit/) 工具则用于检查安全相关的问题。 -In vim, the plugins [`ale`](https://vimawesome.com/plugin/ale) or [`syntastic`](https://vimawesome.com/plugin/syntastic) will let you do that. -For Python, [`pylint`](https://www.pylint.org) and [`pep8`](https://pypi.org/project/pep8/) are examples of stylistic linters and [`bandit`](https://pypi.org/project/bandit/) is a tool designed to find common security issues. -For other languages people have compiled comprehensive lists of useful static analysis tools, such as [Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (you may want to take a look at the _Writing_ section) and for linters there is [Awesome Linters](https://github.com/caramelomartins/awesome-linters). 
+对于其他语言的开发者来说,静态分析工具可以参考这个列表:[Awesome Static Analysis](https://github.com/mre/awesome-static-analysis) (您也许会对 _Writing_ 一节感兴趣) 。对于 linters 则可以参考这个列表: [Awesome Linters](https://github.com/caramelomartins/awesome-linters)。 -A complementary tool to stylistic linting are code formatters such as [`black`](https://github.com/psf/black) for Python, `gofmt` for Go, `rustfmt` for Rust or [`prettier`](https://prettier.io/) for JavaScript, HTML and CSS. -These tools autoformat your code so that it's consistent with common stylistic patterns for the given programming language. -Although you might be unwilling to give stylistic control about your code, standardizing code format will help other people read your code and will make you better at reading other people's (stylistically standardized) code. +对于风格检查和代码格式化,还有以下一些工具可以作为补充:用于 Python 的 [`black`](https://github.com/psf/black)、用于 Go 语言的 `gofmt`、用于 Rust 的 `rustfmt` 或是用于 JavaScript, HTML 和 CSS 的 [`prettier`](https://prettier.io/) 。这些工具可以自动格式化您的代码,这样代码风格就可以与常见的风格保持一致。 +尽管您可能并不想对代码进行风格控制,标准的代码风格有助于方便别人阅读您的代码,也可以方便您阅读它的代码。 -# Profiling +s +# 性能分析 Even if your code functionally behaves as you would expect, that might not be good enough if it takes all your CPU or memory in the process. Algorithms classes often teach big _O_ notation but not how to find hot spots in your programs. 
From e367e59f8eb1a2f45cb9fb20a48b203f43af9e39 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 29 May 2020 23:07:02 +0800 Subject: [PATCH 389/640] update trans --- _2020/debugging-profiling.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index b004f69e..08a185ab 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -203,14 +203,13 @@ Found 3 errors in 1 file (checked 1 source file) 对于风格检查和代码格式化,还有以下一些工具可以作为补充:用于 Python 的 [`black`](https://github.com/psf/black)、用于 Go 语言的 `gofmt`、用于 Rust 的 `rustfmt` 或是用于 JavaScript, HTML 和 CSS 的 [`prettier`](https://prettier.io/) 。这些工具可以自动格式化您的代码,这样代码风格就可以与常见的风格保持一致。 尽管您可能并不想对代码进行风格控制,标准的代码风格有助于方便别人阅读您的代码,也可以方便您阅读它的代码。 -s # 性能分析 Even if your code functionally behaves as you would expect, that might not be good enough if it takes all your CPU or memory in the process. Algorithms classes often teach big _O_ notation but not how to find hot spots in your programs. Since [premature optimization is the root of all evil](http://wiki.c2.com/?PrematureOptimization), you should learn about profilers and monitoring tools. They will help you understand which parts of your program are taking most of the time and/or resources so you can focus on optimizing those parts. -## Timing +## 计时 Similarly to the debugging case, in many scenarios it can be enough to just print the time it took your code between two points. Here is an example in Python using the [`time`](https://docs.python.org/3/library/time.html) module. @@ -249,7 +248,7 @@ user 0m0.015s sys 0m0.012s ``` -## Profilers +## 性能分析工具 ### CPU @@ -354,7 +353,7 @@ Line # Hits Time Per Hit % Time Line Contents 11 24 33.0 1.4 0.0 urls.append(url['href']) ``` -### Memory +### 内存 In languages like C or C++ memory leaks can cause your program to never release memory that it doesn't need anymore. 
To help in the process of memory debugging you can use tools like [Valgrind](https://valgrind.org/) that will help you identify memory leaks. @@ -398,7 +397,7 @@ For example, `perf` can easily report poor cache locality, high amounts of page - `perf report` - Formats and prints the data collected in `perf.data` -### Visualization +### 可视化 Profiler output for real world programs will contain large amounts of information because of the inherent complexity of software projects. Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. @@ -414,7 +413,7 @@ In Python you can use the [`pycallgraph`](http://pycallgraph.slowchop.com/en/mas ![Call Graph](https://upload.wikimedia.org/wikipedia/commons/2/2f/A_Call_Graph_generated_by_pycallgraph.png) -## Resource Monitoring +## 资源监控 Sometimes, the first step towards analyzing the performance of your program is to understand what its actual resource consumption is. Programs often run slowly when they are resource constrained, e.g. without enough memory or on a slow network connection. @@ -434,7 +433,7 @@ A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. -### Specialized tools +### 专用工具 Sometimes, black box benchmarking is all you need to determine what software to use. Tools like [`hyperfine`](https://github.com/sharkdp/hyperfine) let you quickly benchmark command line programs. @@ -459,9 +458,9 @@ Summary As it was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading, letting you figure out where time is being spent (loading, rendering, scripting, &c). 
More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler) and [Chrome](https://developers.google.com/web/tools/chrome-devtools/rendering-toolss). -# Exercises +# 课后练习 -## Debugging +## 调试 1. Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day. If there aren't any you can execute some harmless commands such as `sudo ls` and check again. @@ -480,7 +479,8 @@ If there aren't any you can execute some harmless commands such as `sudo ls` and ``` 1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). -## Profiling + +## 性能分析 1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/3/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. 
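A possible starting point for the sorting exercise above, sketched under some assumptions: the two sort functions below are illustrative stand-ins rather than the contents of `sorts.py`, and `profile_call` is a hypothetical helper name. It runs a function under the standard library's `cProfile` and returns the `pstats` report sorted by `tottime`, the same sort key that `python -m cProfile -s tottime` uses.

```python
import cProfile
import io
import pstats
import random


def insertion_sort(values):
    """Illustrative stand-in; not the implementation from sorts.py."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def quicksort(values):
    """Illustrative stand-in (not in place); it allocates new lists."""
    if len(values) <= 1:
        return list(values)
    pivot = values[0]
    left = [v for v in values[1:] if v < pivot]
    right = [v for v in values[1:] if v >= pivot]
    return quicksort(left) + [pivot] + quicksort(right)


def profile_call(func, data):
    """Run func(data) under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(data)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest time spent in the function itself.
    stats.sort_stats("tottime").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    data = [random.randint(0, 1000) for _ in range(500)]
    for func in (insertion_sort, quicksort):
        _, report = profile_call(func, data)
        print(func.__name__)
        print(report)
```

Comparing the two reports side by side should hint at where each algorithm spends its time. The exercise itself still needs `line_profiler` and `memory_profiler` for the per-line and memory views.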
From c424c9d44ff92e41d80f11bd3b7ec679e73c26c1 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Fri, 29 May 2020 23:24:49 +0800 Subject: [PATCH 390/640] update trans --- _2020/debugging-profiling.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 08a185ab..7973f1bd 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -205,27 +205,25 @@ Found 3 errors in 1 file (checked 1 source file) # 性能分析 -Even if your code functionally behaves as you would expect, that might not be good enough if it takes all your CPU or memory in the process. -Algorithms classes often teach big _O_ notation but not how to find hot spots in your programs. -Since [premature optimization is the root of all evil](http://wiki.c2.com/?PrematureOptimization), you should learn about profilers and monitoring tools. They will help you understand which parts of your program are taking most of the time and/or resources so you can focus on optimizing those parts. +即使您的代码能够向您期望的一样运行,但是如果它消耗了您全部的 CPU 和内存,那么它显然也不是个好程序。算法课上我们通常会介绍大O标记法,但却没交给我们如何找到程序中的热点。 +因为 [过早的优化是万恶之源](http://wiki.c2.com/?PrematureOptimization),您需要学习性能分析和监控工具。它们会帮助您找到程序中最耗时、最耗资源的部分,这样您就可以有针对性的进行性能优化。 ## 计时 -Similarly to the debugging case, in many scenarios it can be enough to just print the time it took your code between two points. -Here is an example in Python using the [`time`](https://docs.python.org/3/library/time.html) module. 
+和调试代码类似,大多数情况下我们只需要打印两处代码之间的时间即可发现问题。下面这个例子中,我们使用了 Python 的 [`time`](https://docs.python.org/3/library/time.html) 模块。

 ```python
 import time, random
 n = random.randint(1, 10) * 100

-# Get current time
+# 获取当前时间
 start = time.time()

-# Do some work
+# 执行一些操作
 print("Sleeping for {} ms".format(n))
 time.sleep(n/1000)

-# Compute time between start and now
+# 比较当前时间和起始时间
 print(time.time() - start)

 # Output
@@ -233,7 +231,7 @@ print(time.time() - start)
 # 0.5713930130004883
 ```

-However, wall clock time can be misleading since your computer might be running other processes at the same time or waiting for events to happen. It is common for tools to make a distinction between _Real_, _User_ and _Sys_ time. In general, _User_ + _Sys_ tells you how much time your process actually spent in the CPU (more detailed explanation [here](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1)).
+不过,执行时间(wall clock time)也可能会误导您,因为您的电脑可能也在同时运行其他进程,也可能在此期间发生了等待。 对于工具来说,需要区分真实时间、用户时间和系统时间。通常来说,用户时间+系统时间代表了您的进程所消耗的实际 CPU (更详细的解释可以参照[这篇文章](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1))。

 - _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g.
waiting for I/O or network) - _User_ - Amount of time spent in the CPU running user code From 3f397730ffcfaf9780d84cb56bbebbb15e314335 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 00:06:29 +0800 Subject: [PATCH 391/640] update trans --- _2020/debugging-profiling.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 7973f1bd..bc65dd62 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -233,11 +233,11 @@ print(time.time() - start) 不过,执行时间(wall clock time)也可能会误导您,因为您的电脑可能也在同时运行其他进程,也可能在此期间发生了等待。 对于工具来说,需要区分真实时间、用户时间和系统时间。通常来说,用户时间+系统时间代表了您的进程所消耗的实际 CPU (更详细的解释可以参照[这篇文章](https://stackoverflow.com/questions/556405/what-do-real-user-and-sys-mean-in-the-output-of-time1))。 -- _Real_ - Wall clock elapsed time from start to finish of the program, including the time taken by other processes and time taken while blocked (e.g. waiting for I/O or network) -- _User_ - Amount of time spent in the CPU running user code -- _Sys_ - Amount of time spent in the CPU running kernel code +- 真实时间 - 从程序开始到结束流失掉到真实时间,包括其他进程到执行时间以及阻塞消耗的时间(例如等待 I/O或网络); +- _User_ - CPU 执行用户代码所花费的时间; +- _Sys_ - CPU 执行系统内核代码所花费的时间。 -For example, try running a command that performs an HTTP request and prefixing it with [`time`](http://man7.org/linux/man-pages/man1/time.1.html). Under a slow connection you might get an output like the one below. Here it took over 2 seconds for the request to complete but the process only took 15ms of CPU user time and 12ms of kernel CPU time. 
+例如,试着执行一个用于发起 HTTP 请求的命令并在其前面添加 [`time`](http://man7.org/linux/man-pages/man1/time.1.html) 前缀。网络不好的情况下您可能会看到下面的输出结果。请求花费了 2s 才完成,但是进程仅花费了 15ms 的 CPU 用户时间和 12ms 的 CPU 内核时间。

 ```bash
 $ time curl https://missing.csail.mit.edu &> /dev/null`
@@ -246,13 +246,13 @@ user 0m0.015s
 sys 0m0.012s
 ```

-## 性能分析工具
+## 性能分析工具(profilers)

 ### CPU

-Most of the time when people refer to _profilers_ they actually mean _CPU profilers_, which are the most common.
-There are two main types of CPU profilers: _tracing_ and _sampling_ profilers.
-Tracing profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack.
+大多数情况下,当人们提及性能分析工具的时候,通常指的是 CPU 性能分析工具。
+CPU 性能分析工具有两种: 追溯分析器(_tracing_)及采样分析其(_sampling_)。
+追溯分析器 profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack.
 They use these records to present aggregate statistics of what your program spent the most time doing.
 [Here](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) is a good intro article if you want more detail on this topic.
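To make the tracing side of the distinction above concrete, here is a minimal sketch of a tracing profiler built on the standard library's `sys.setprofile` hook. The names `trace_calls` and `fib` are made up for illustration, and the sketch only counts Python-level `call` events; a real profiler such as `cProfile` also records timing. A sampling profiler would instead wake up on a timer and record the current stack, which costs less but can miss short-lived calls.

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func and count every Python-level function call it makes."""
    counts = Counter()

    def on_event(frame, event, arg):
        # "call" fires each time a Python function is entered.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(on_event)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return result, counts


def fib(n):
    """Deliberately naive recursion, so there are many calls to record."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    result, counts = trace_calls(fib, 10)
    print(result, counts["fib"])
```

Because the hook runs on every call, the traced program slows down noticeably. That overhead is the usual argument for sampling when profiling long-running code.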
From f13acf10bd97304dc95bc62634b312a9c377f5af Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 01:01:42 +0800 Subject: [PATCH 392/640] update trans --- _2020/debugging-profiling.md | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index bc65dd62..e31a2eb9 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -251,15 +251,13 @@ sys 0m0.012s ### CPU 大多数情况下,当人们提及性能分析工具的时候,通常指的是 CPU 性能分析工具。 -CPU 性能分析工具有两种: 追溯分析器(_tracing_)及采样分析其(_sampling_)。 -追溯分析器 profilers keep a record of every function call your program makes whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack. -They use these records to present aggregate statistics of what your program spent the most time doing. -[Here](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) is a good intro article if you want more detail on this topic. +CPU 性能分析工具有两种: 追溯分析器(_tracing_)及采样分析器(_sampling_)。 +追溯分析器 会记录程序的每一次函数调用,而采样分析器则只会周期性的监测(通常为每毫秒)您的程序并记录程序堆栈。它们使用这些记录来生成统计信息,显示程序在哪些事情上花费了最多的时间。如果您希望了解更多相关信息,可以参考[这篇](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) 介绍性的文章。 -Most programming languages have some sort of command line profiler that you can use to analyze your code. -They often integrate with full fledged IDEs but for this lecture we are going to focus on the command line tools themselves. -In Python we can use the `cProfile` module to profile time per function call. Here is a simple example that implements a rudimentary grep in Python: +大多数的编程语言都有一些基于命令行都分析器,我们可以使用它们来分析代码。它们通常可以集成在 IDE 中,但是本节课我们会专注于这些命令行工具本身。 + +在 Python 中,我们使用 `cProfile` 模块来分析每次函数调用所消耗都时间。 在下面的例子中,我们实现了一个基础的 grep 命令: ```python #!/usr/bin/env python @@ -283,7 +281,7 @@ if __name__ == '__main__': grep(pattern, file) ``` -We can profile this code using the following command. 
Analyzing the output we can see that IO is taking most of the time and that compiling the regex takes a fair amount of time as well. Since the regex only needs to be compiled once, we can factor it out of the for. +我们可以使用下面的命令来对这段代码进行分析。通过它的输出我们可以直到,IO 消耗来大量的时间,编译正则表达式也比较耗费时间。因为正则表达式只需要编译一次,我们可以将其移动到 for 循环外面来改进性能。 ``` $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py From 0882169e1d583c786e33e20cf16e1cd1d6972d63 Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Fri, 29 May 2020 18:15:53 -0400 Subject: [PATCH 393/640] some case study --- _2020/security.md | 129 +++++++++++++++++++--------------------------- 1 file changed, 53 insertions(+), 76 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index f3ef4d72..250fcb57 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -20,8 +20,7 @@ video: # 熵 -[熵](https://en.wikipedia.org/wiki/Entropy_(information_theory))(Entropy) 是对不确定性的量度。 -它的一个应用是决定密码的强度。 +[熵](https://en.wikipedia.org/wiki/Entropy_(information_theory))(Entropy) 度量了不确定性并可以用来决定密码的强度。 ![XKCD 936: Password Strength](https://imgs.xkcd.com/comics/password_strength.png) @@ -120,111 +119,89 @@ decrypt(ciphertext: array, key) -> array (输出明文) ## 对称加密的应用 -- 对在不信任的云服务上存储的文件进行加密。
    对称加密可以和密钥生成函数配合使用,这样可以使用密码加密文件:
+- 加密不信任的云服务上存储的文件。
    对称加密和密钥生成函数配合起来,就可以使用密码加密文件:
 将密码输入密钥生成函数生成密钥 `key = KDF(passphrase)`,然后存储`encrypt(file, key)`。

 # 非对称加密

-The term "asymmetric" refers to there being two keys, with two different roles.
-A private key, as its name implies, is meant to be kept private, while the
-public key can be publicly shared and it won't affect security (unlike sharing
-the key in a symmetric cryptosystem). Asymmetric cryptosystems provide the
-following set of functionality, to encrypt/decrypt and to sign/verify:
+非对称加密的“非对称”代表在其环境中,使用两个具有不同功能的密钥:
+一个是私钥(private key),不向外公布;另一个是公钥(public key),公布公钥不像公布对称加密的共享密钥那样可能影响加密体系的安全性。
+非对称加密使用以下几个方法来实现加密/解密(encrypt/decrypt),以及签名/验证(sign/verify):

 ```
-keygen() -> (public key, private key) (this function is randomized)
+keygen() -> (public key, private key) (这是一个随机方法)

-encrypt(plaintext: array, public key) -> array (the ciphertext)
-decrypt(ciphertext: array, private key) -> array (the plaintext)
+encrypt(plaintext: array, public key) -> array (输出密文)
+decrypt(ciphertext: array, private key) -> array (输出明文)

-sign(message: array, private key) -> array (the signature)
-verify(message: array, signature: array, public key) -> bool (whether or not the signature is valid)
+sign(message: array, private key) -> array (生成签名)
+verify(message: array, signature: array, public key) -> bool (验证签名是否是由和这个公钥相关的私钥生成的)
 ```

-The encrypt/decrypt functions have properties similar to their analogs from
-symmetric cryptosystems. A message can be encrypted using the _public_ key.
-Given the output (ciphertext), it's hard to determine the input (plaintext)
-without the _private_ key. The decrypt function has the obvious correctness
-property, that `decrypt(encrypt(m, public key), private key) = m`.
-
-Symmetric and asymmetric encryption can be compared to physical locks. A
-symmetric cryptosystem is like a door lock: anyone with the key can lock and
-unlock it. Asymmetric encryption is like a padlock with a key. You could give
-the unlocked lock to someone (the public key), they could put a message in a
-box and then put the lock on, and after that, only you could open the lock
-because you kept the key (the private key).
-
-The sign/verify functions have the same properties that you would hope physical
-signatures would have, in that it's hard to forge a signature. No matter the
-message, without the _private_ key, it's hard to produce a signature such that
-`verify(message, signature, public key)` returns true. And of course, the
-verify function has the obvious correctness property that `verify(message,
-sign(message, private key), public key) = true`.
+非对称的加密/解密方法和对称的加密/解密方法有类似的特征。
+信息在非对称加密中使用 _公钥_ 加密,
+且输出的密文很难在不知道 _私钥_ 的情况下得出明文。
+解密方法`decrypt()`有明显的正确性。
+给定密文及私钥,解密方法一定会输出明文:
+`decrypt(encrypt(m, public key), private key) = m`。
+
+对称加密和非对称加密可以类比为机械锁。
+对称加密就好比一个防盗门:只要是有钥匙的人都可以开门或者锁门。
+非对称加密好比一个可以拿下来的挂锁。你可以把打开状态的挂锁(公钥)给任何一个人并保留唯一的钥匙(私钥)。这样他们将给你的信息装进盒子里并用这个挂锁锁上以后,只有你可以用保留的钥匙开锁。
+
+签名/验证方法具有和书面签名类似的特征。
+在不知道 _私钥_ 的情况下,不管需要签名的信息为何,很难计算出一个可以使
+`verify(message, signature, public key)` 返回为真的签名。
    +对于使用私钥签名的信息,验证方法验证和私钥相对应的公钥时一定返回为真: `verify(message, +sign(message, private key), public key) = true`。 ## 非对称加密的应用 -- [PGP email encryption](https://en.wikipedia.org/wiki/Pretty_Good_Privacy). -People can have their public keys posted online (e.g. in a PGP keyserver, or on -[Keybase](https://keybase.io/)). Anyone can send them encrypted email. -- Private messaging. Apps like [Signal](https://signal.org/) and -[Keybase](https://keybase.io/) use asymmetric keys to establish private -communication channels. -- Signing software. Git can have GPG-signed commits and tags. With a posted -public key, anyone can verify the authenticity of downloaded software. +- [PGP电子邮件加密](https://en.wikipedia.org/wiki/Pretty_Good_Privacy):用户可以将所使用的公钥在线发布,比如:PGP密钥服务器或 +[Keybase](https://keybase.io/)。任何人都可以向他们发送加密的电子邮件。 +- 聊天加密:像 [Signal](https://signal.org/) 和 +[Keybase](https://keybase.io/) 使用非对称密钥来建立私密聊天。 +- 软件签名:Git 支持用户对提交(commit)和标签(tag)进行GPG签名。任何人都可以使用软件开发者公布的签名公钥验证下载的已签名软件。 ## 密钥分发 -Asymmetric-key cryptography is wonderful, but it has a big challenge of -distributing public keys / mapping public keys to real-world identities. There -are many solutions to this problem. Signal has one simple solution: trust on -first use, and support out-of-band public key exchange (you verify your -friends' "safety numbers" in person). PGP has a different solution, which is -[web of trust](https://en.wikipedia.org/wiki/Web_of_trust). Keybase has yet -another solution of [social -proof](https://keybase.io/blog/chat-apps-softer-than-tofu) (along with other -neat ideas). Each model has its merits; we (the instructors) like Keybase's -model. 
+非对称加密面对的主要挑战是,如何分发公钥并对应现实世界中存在的人或组织。 + +Signal的信任模型是,信任用户第一次使用时给出的身份(trust on first use),同时支持用户线下(out-of-band)、面对面交换公钥(Signal里的safety number)。 + +PGP使用的是[信任网络](https://en.wikipedia.org/wiki/Web_of_trust)。简单来说,如果我想加入一个信任网络,则必须让已经在信任网络中的成员对我进行线下验证,比如对比证件。验证无误后,信任网络的成员使用私钥对我的公钥进行签名。这样我就成为了信任网络的一部分。只要我使用签名过的公钥所对应的私钥就可以证明“我是我”。 + +Keybase主要使用[社交网络证明 (social proof)](https://keybase.io/blog/chat-apps-softer-than-tofu),和一些别的精巧设计。 + +每个信任模型有它们各自的优点:我们(讲师)更倾向于 Keybase 使用的模型。 # 案例分析 ## 密码管理器 -This is an essential tool that everyone should try to use (e.g. -[KeePassXC](https://keepassxc.org/)). Password managers let you use unique, -randomly generated high-entropy passwords for all your websites, and they save -all your passwords in one place, encrypted with a symmetric cipher with a key -produced from a passphrase using a KDF. +每个人都应该尝试使用密码管理器,比如[KeePassXC](https://keepassxc.org/)。 + +密码管理器会帮助你对每个网站生成随机且复杂(表现为高熵)的密码,并使用你指定的主密码配合密钥生成函数来对称加密它们。 -Using a password manager lets you avoid password reuse (so you're less impacted -when websites get compromised), use high-entropy passwords (so you're less likely to -get compromised), and only need to remember a single high-entropy password. +你只需要记住一个复杂的主密码,密码管理器就可以生成很多复杂度高且不会重复使用的密码。密码管理器通过这种方式降低密码被猜出的可能,并减少网站信息泄露后对其他网站密码的威胁。 ## 两步验证 -[Two-factor authentication](https://en.wikipedia.org/wiki/Multi-factor_authentication) -(2FA) requires you to use a passphrase ("something you know") along with a 2FA -authenticator (like a [YubiKey](https://www.yubico.com/), "something you have") -in order to protect against stolen passwords and -[phishing](https://en.wikipedia.org/wiki/Phishing) attacks. +[两步验证](https://en.wikipedia.org/wiki/Multi-factor_authentication)(2FA)要求用户同时使用密码(“你知道的信息”)和一个身份验证器(“你拥有的物品”,比如[YubiKey](https://www.yubico.com/))来消除密码泄露或者[钓鱼攻击](https://en.wikipedia.org/wiki/Phishing)的威胁。 + ## 全盘加密 -Keeping your laptop's entire disk encrypted is an easy way to protect your data -in the case that your laptop is stolen. 
You can use [cryptsetup + -LUKS](https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system) -on Linux, -[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows-10/) on -Windows, or [FileVault](https://support.apple.com/en-us/HT204837) on macOS. -This encrypts the entire disk with a symmetric cipher, with a key protected by -a passphrase. +对笔记本电脑的硬盘进行全盘加密是防止因设备丢失而信息泄露的简单且有效方法。 +Linux的[cryptsetup + +LUKS](https://wiki.archlinux.org/index.php/Dm-crypt/Encrypting_a_non-root_file_system), +Windows的[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows-10/),或者macOS的[FileVault](https://support.apple.com/en-us/HT204837)都使用一个由密码保护的对称密钥来加密盘上的所有信息。 ## 聊天加密 -Use [Signal](https://signal.org/) or [Keybase](https://keybase.io/). End-to-end -security is bootstrapped from asymmetric-key encryption. Obtaining your -contacts' public keys is the critical step here. If you want good security, you -need to authenticate public keys out-of-band (with Signal or Keybase), or trust -social proofs (with Keybase). +[Signal](https://signal.org/)和[Keybase](https://keybase.io/)使用非对称加密对用户提供端到端(End-to-end)安全性。 + +获取联系人的公钥非常关键。为了保证安全性,应使用线下方式验证Signal或者Keybase的用户公钥,或者信任Keybase用户提供的社交网络证明。 ## SSH From 11ea1d4f71e4661d34dd6702c2e77d3b55b2df6e Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 14:01:50 +0800 Subject: [PATCH 394/640] update trans --- _2020/debugging-profiling.md | 29 +++++++++++++++-------------- 1 file changed, 15 insertions(+), 14 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index e31a2eb9..2060dbc6 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -303,18 +303,19 @@ $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py ``` -A caveat of Python's `cProfile` profiler (and many profilers for that matter) is that they display time per function call. 
That can become unintuitive really fast, specially if you are using third party libraries in your code since internal function calls are also accounted for. -A more intuitive way of displaying profiling information is to include the time taken per line of code, which is what _line profilers_ do. +关于 Python 的 `cProfile` 分析器(以及其他一些类似的一些分析器),需要注意的是它显示的是每次函数调用的时间。看上去可能快到反直觉,尤其是如果您在代码里面使用了第三方的函数库,因为内部函数调用也会被看作函数调用。 + +更加符合直觉的显示分析信息的方式是包括每行代码的执行时间,这也是 +*行分析器* 的工作。例如,下面这段 Python 代码会向本课程的网站发起一个请求,然后解析响应返回的页面中的全部 URL: -For instance, the following piece of Python code performs a request to the class website and parses the response to get all URLs in the page: ```python #!/usr/bin/env python import requests from bs4 import BeautifulSoup -# This is a decorator that tells line_profiler -# that we want to analyze this function +# 这个装饰器会告诉行分析器 +# 我们想要分析这个函数 @profile def get_urls(): response = requests.get('https://missing.csail.mit.edu') @@ -327,7 +328,7 @@ if __name__ == '__main__': get_urls() ``` -If we used Python's `cProfile` profiler we'd get over 2500 lines of output, and even with sorting it'd be hard to understand where the time is being spent. A quick run with [`line_profiler`](https://github.com/rkern/line_profiler) shows the time taken per line: +如果我们使用 Python 的 `cProfile` 分析器,我们会得到超过2500行的输出结果,即使对其进行排序,我仍然搞不懂时间到底都花在哪了。如果我们使用 [`line_profiler`](https://github.com/rkern/line_profiler),它会基于行来显示时间: ```bash $ kernprof -l -v a.py @@ -351,11 +352,11 @@ Line # Hits Time Per Hit % Time Line Contents ### 内存 -In languages like C or C++ memory leaks can cause your program to never release memory that it doesn't need anymore. -To help in the process of memory debugging you can use tools like [Valgrind](https://valgrind.org/) that will help you identify memory leaks. 
+像 C 或者 C++ 这样的语言,内存泄漏会导致您的程序在使用完内存后不去释放它。为了应对内存类的 Bug,我们可以使用类似 [Valgrind](https://valgrind.org/) 这样的工具来检查内存泄漏问题。 + +对于 Python 这类具有垃圾回收机制的语言,内存分析器也是很有用的,因为对于某个对象来说,只要有指针还指向它,那它就不会被回收。 -In garbage collected languages like Python it is still useful to use a memory profiler because as long as you have pointers to objects in memory they won't be garbage collected. -Here's an example program and its associated output when running it with [memory-profiler](https://pypi.org/project/memory-profiler/) (note the decorator like in `line-profiler`). +下面这个例子及其输出,展示了 [memory-profiler](https://pypi.org/project/memory-profiler/) 是如何工作的(注意装饰器和 `line-profiler` 类似)。 ```python @profile @@ -381,11 +382,11 @@ Line # Mem usage Increment Line Contents 8 13.61 MB 0.00 MB return a ``` -### Event Profiling +### 事件分析 + +在我们使用`strace`调试代码的时候,您可能会希望忽略一些特殊的代码并希望在分析时将其当作黑盒处理。[`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) 命令将 CPU 的区别进行了抽象,它不会报告时间和内存的消耗,而是报告与您的程序相关的系统事件。 -As it was the case for `strace` for debugging, you might want to ignore the specifics of the code that you are running and treat it like a black box when profiling. -The [`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) command abstracts CPU differences away and does not report time or memory, but instead it reports system events related to your programs. -For example, `perf` can easily report poor cache locality, high amounts of page faults or livelocks. Here is an overview of the command: +例如,`perf` can easily report poor cache locality, high amounts of page faults or livelocks. 
Here is an overview of the command: - `perf list` - List the events that can be traced with perf - `perf stat COMMAND ARG1 ARG2` - Gets counts of different events related a process or command From 6b74caf33c75bd57299d0ff0fa6d06b57a23e8cf Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 14:55:07 +0800 Subject: [PATCH 395/640] update trans --- _2020/debugging-profiling.md | 39 ++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 20 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 2060dbc6..0aebc8b7 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -66,7 +66,7 @@ done 幸运的是,大多数的程序都会将日志保存在您的系统中的某个地方。对于 UNIX 系统来说,程序的日志通常存放在 `/var/log`。 例如, [NGINX](https://www.nginx.com/) web 服务器就将其日志存放于`/var/log/nginx`。 最近,系统开始使用 **system log**,您所有的日志都会保存在这里。大多数的(但不是全部)Linux 系统都会使用 `systemd`,这是一个系统守护进程,它会控制您系统中的很多东西,例如哪些服务应该启动并运行。`systemd` 会将日志以某种特殊格式存放于`/var/log/journal`,您可以使用 [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) 命令显示这些消息。 -类似地,在 macOS there is still `/var/log/system.log` but an increasing number of tools use the system log, that can be displayed with [`log show`](https://www.manpagez.com/man/1/log/). 
+类似地,在 macOS 系统中还是 `/var/log/system.log`,但是有更多的工具会使用系统日志,它的内容可以使用 [`log show`](https://www.manpagez.com/man/1/log/) 显示。 对于大多数的 UNIX 系统,您也可以使用[`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) 命令来读取内核的日志。 @@ -210,7 +210,7 @@ Found 3 errors in 1 file (checked 1 source file) ## 计时 -和调试代码类似,大多数情况下我们只需要打印两处代码之间的时间即可发现问题。下面这个例子中,我们使用了 Python 的 [`time`](https://docs.python.org/3/library/time.html) 模块。 +和调试代码类似,大多数情况下我们只需要打印两处代码之间的时间即可发现问题。下面这个例子中,我们使用了 Python 的 [`time`](https://docs.python.org/3/library/time.html)模块。 ```python import time, random @@ -251,8 +251,8 @@ sys 0m0.012s ### CPU 大多数情况下,当人们提及性能分析工具的时候,通常指的是 CPU 性能分析工具。 -CPU 性能分析工具有两种: 追溯分析器(_tracing_)及采样分析器(_sampling_)。 -追溯分析器 会记录程序的每一次函数调用,而采样分析器则只会周期性的监测(通常为每毫秒)您的程序并记录程序堆栈。它们使用这些记录来生成统计信息,显示程序在哪些事情上花费了最多的时间。如果您希望了解更多相关信息,可以参考[这篇](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) 介绍性的文章。 +CPU 性能分析工具有两种: 追踪分析器(_tracing_)及采样分析器(_sampling_)。 +追踪分析器 会记录程序的每一次函数调用,而采样分析器则只会周期性的监测(通常为每毫秒)您的程序并记录程序堆栈。它们使用这些记录来生成统计信息,显示程序在哪些事情上花费了最多的时间。如果您希望了解更多相关信息,可以参考[这篇](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) 介绍性的文章。 大多数的编程语言都有一些基于命令行都分析器,我们可以使用它们来分析代码。它们通常可以集成在 IDE 中,但是本节课我们会专注于这些命令行工具本身。 @@ -386,21 +386,20 @@ Line # Mem usage Increment Line Contents 在我们使用`strace`调试代码的时候,您可能会希望忽略一些特殊的代码并希望在分析时将其当作黑盒处理。[`perf`](http://man7.org/linux/man-pages/man1/perf.1.html) 命令将 CPU 的区别进行了抽象,它不会报告时间和内存的消耗,而是报告与您的程序相关的系统事件。 -例如,`perf` can easily report poor cache locality, high amounts of page faults or livelocks. 
Here is an overview of the command: +例如,`perf` 可以报告不佳的缓存局部性(poor cache locality)、大量的页错误(page faults)或活锁(livelocks)。下面是关于常见命令的简介: -- `perf list` - List the events that can be traced with perf -- `perf stat COMMAND ARG1 ARG2` - Gets counts of different events related a process or command -- `perf record COMMAND ARG1 ARG2` - Records the run of a command and saves the statistical data into a file called `perf.data` -- `perf report` - Formats and prints the data collected in `perf.data` +- `perf list` - 列出可以被 pref 追踪的事件; +- `perf stat COMMAND ARG1 ARG2` - 收集与某个进程或指令相关的不同事件; +- `perf record COMMAND ARG1 ARG2` - 记录命令执行的采样信息并将统计数据储存在`perf.data`中; +- `perf report` - 格式化并打印 `perf.data` 中的数据。 ### 可视化 -Profiler output for real world programs will contain large amounts of information because of the inherent complexity of software projects. -Humans are visual creatures and are quite terrible at reading large amounts of numbers and making sense of them. -Thus there are many tools for displaying profiler's output in an easier to parse way. +使用分析器来分析真实的程序时,由于软件的复杂性,其输出结果中包含了大量的信息。人类是一种视觉动物,非常不善于阅读大量的文字。因此很多工具都提供了可视化分析器输出结果的功能。 + +对于采样分析器来说,常见的显示 CPU 分析数据的形式是 [火焰图](http://www.brendangregg.com/flamegraphs.html),火焰图会在 Y 轴显示函数调用关系,并在 X 轴显示其耗时的比例。火焰图同时还是可交互的,您可以深入程序的某一具体部分,并查看其栈追踪(您可以尝试点击下面的图片)。 -One common way to display CPU profiling information for sampling profilers is to use a [Flame Graph](http://www.brendangregg.com/flamegraphs.html), which will display a hierarchy of function calls across the Y axis and time taken proportional to the X axis. They are also interactive, letting you zoom into specific parts of the program and get their stack traces (try clicking in the image below). 
[![FlameGraph](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg) @@ -416,16 +415,16 @@ Sometimes, the first step towards analyzing the performance of your program is t Programs often run slowly when they are resource constrained, e.g. without enough memory or on a slow network connection. There are a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. -- **General Monitoring** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php), which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). +- **监控** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php), which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). `htop` presents various statistics for the currently running processes on the system. `htop` has a myriad of options and keybinds, some useful ones are: `` to sort processes, `t` to show tree hierarchy and `h` to toggle threads. See also [`glances`](https://nicolargo.github.io/glances/) for similar implementation with a great UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is another nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. -- **I/O operations** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations -- **Disk Usage** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. 
+- **I/O 操作** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations +- **磁盘使用** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which lets you navigate folders and delete files and folders as you navigate. -- **Memory Usage** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. -- **Open Files** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file. -- **Network Connections and Config** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) lets you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. -- **Network Usage** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. +- **内存使用** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. 
+- **打开文件** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file. +- **网络连接和配置** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) lets you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. +- **网络使用** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. 
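A note on the `ss` / `lsof` bullets above: the "is this port already taken?" question can be approximated in a few lines of Python. What follows is a minimal sketch under stated assumptions, not a replacement for either tool. The helper name `port_in_use` and the default host `127.0.0.1` are hypothetical choices for illustration, and the probe only reports *whether* a port is taken, not *which* process holds it.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding to (host, port) fails.

    Assumption: a failed bind is a good-enough local probe. Any OSError
    is treated as "in use", which also covers permission errors on
    privileged ports, so this is only a rough check.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return True
        return False


if __name__ == "__main__":
    # 4444 is the port used in the exercises of this lecture.
    print(port_in_use(4444))
```

With `python -m http.server 4444` running in another terminal, the probe should report the port as taken; finding the owning PID still needs `ss` or `lsof`.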
From 9f6fafa54a22da93926f52c20486fa9d41d4f635 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 15:58:56 +0800 Subject: [PATCH 396/640] update trans --- _2020/debugging-profiling.md | 41 ++++++++++++++++-------------------- 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 0aebc8b7..9242a79d 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -400,41 +400,37 @@ Line # Mem usage Increment Line Contents 对于采样分析器来说,常见的显示 CPU 分析数据的形式是 [火焰图](http://www.brendangregg.com/flamegraphs.html),火焰图会在 Y 轴显示函数调用关系,并在 X 轴显示其耗时的比例。火焰图同时还是可交互的,您可以深入程序的某一具体部分,并查看其栈追踪(您可以尝试点击下面的图片)。 - [![FlameGraph](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg)](http://www.brendangregg.com/FlameGraphs/cpu-bash-flamegraph.svg) -Call graphs or control flow graphs display the relationships between subroutines within a program by including functions as nodes and functions calls between them as directed edges. When coupled with profiling information such as the number of calls and time taken, call graphs can be quite useful for interpreting the flow of a program. -In Python you can use the [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/) library to generate them. +调用图和控制流图可以显示子程序之间的关系,它将函数作为节点并把函数调用作为边。将它们和分析器的信息(例如调用次数、耗时等)放在一起使用时,调用图会变得非常有用,它可以帮助我们分析程序的流程。 +在 Python 中您可以使用 [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/) 来生成这些图片。 ![Call Graph](https://upload.wikimedia.org/wikipedia/commons/2/2f/A_Call_Graph_generated_by_pycallgraph.png) ## 资源监控 -Sometimes, the first step towards analyzing the performance of your program is to understand what its actual resource consumption is. -Programs often run slowly when they are resource constrained, e.g. without enough memory or on a slow network connection. 
-There are a myriad of command line tools for probing and displaying different system resources like CPU usage, memory usage, network, disk usage and so on. +有时候,分析程序性能的第一步是搞清楚它所消耗的资源。程序变慢通常是因为它所需要的资源不够了。例如,没有足够的内存或者网络连接变慢的时候。 + +有很多很多的工具可以被用来显示不同的系统资源,例如 CPU 占用、内存使用、网络、磁盘使用等。 -- **监控** - Probably the most popular is [`htop`](https://hisham.hm/htop/index.php), which is an improved version of [`top`](http://man7.org/linux/man-pages/man1/top.1.html). -`htop` presents various statistics for the currently running processes on the system. `htop` has a myriad of options and keybinds, some useful ones are: `` to sort processes, `t` to show tree hierarchy and `h` to toggle threads. -See also [`glances`](https://nicolargo.github.io/glances/) for similar implementation with a great UI. For getting aggregate measures across all processes, [`dstat`](http://dag.wiee.rs/home-made/dstat/) is another nifty tool that computes real-time resource metrics for lots of different subsystems like I/O, networking, CPU utilization, context switches, &c. -- **I/O 操作** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) displays live I/O usage information and is handy to check if a process is doing heavy I/O disk operations -- **磁盘使用** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) displays metrics per partitions and [`du`](http://man7.org/linux/man-pages/man1/du.1.html) displays **d**isk **u**sage per file for the current directory. In these tools the `-h` flag tells the program to print with **h**uman readable format. -A more interactive version of `du` is [`ncdu`](https://dev.yorhel.nl/ncdu) which lets you navigate folders and delete files and folders as you navigate. -- **内存使用** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) displays the total amount of free and used memory in the system. Memory is also displayed in tools like `htop`. 
-- **打开文件** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) lists file information about files opened by processes. It can be quite useful for checking which process has opened a specific file. -- **网络连接和配置** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) lets you monitor incoming and outgoing network packets statistics as well as interface statistics. A common use case of `ss` is figuring out what process is using a given port in a machine. For displaying routing, network devices and interfaces you can use [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html). Note that `netstat` and `ifconfig` have been deprecated in favor of the former tools respectively. -- **网络使用** - [`nethogs`](https://github.com/raboof/nethogs) and [`iftop`](http://www.ex-parrot.com/pdw/iftop/) are good interactive CLI tools for monitoring network usage. +- **通用监控** - 最流行的工具要数 [`htop`](https://hisham.hm/htop/index.php) 了,它是 [`top`](http://man7.org/linux/man-pages/man1/top.1.html) 的改进版。`htop` 可以显示当前运行进程的多种统计信息。`htop` 有很多选项和快捷键,常见的有:`` 进程排序、 `t` 显示树状结构和 `h` 打开或折叠线程。 还可以留意一下 [`glances`](https://nicolargo.github.io/glances/) ,它的实现类似但是用户界面更好。如果需要合并测量全部的进程, [`dstat`](http://dag.wiee.rs/home-made/dstat/) 也是一个非常好用的工具,它可以实时地计算不同子系统资源的度量数据,例如 I/O、网络、 CPU 利用率、上下文切换等等; +- **I/O 操作** - [`iotop`](http://man7.org/linux/man-pages/man8/iotop.8.html) 可以显示实时 I/O 占用信息而且可以非常方便地检查某个进程是否正在执行大量的磁盘读写操作; +- **磁盘使用** - [`df`](http://man7.org/linux/man-pages/man1/df.1.html) 可以显示每个分区的信息,而 [`du`](http://man7.org/linux/man-pages/man1/du.1.html) 则可以显示当前目录下每个文件的磁盘使用情况( **d**isk **u**sage)。`-h` 选项可以使命令使用对人类(**h**uman)更加友好的格式显示数据;[`ncdu`](https://dev.yorhel.nl/ncdu) 是一个交互性更好的 `du` ,它可以让您在不同目录下导航、删除文件和文件夹; +- **内存使用** - [`free`](http://man7.org/linux/man-pages/man1/free.1.html) 可以显示系统当前空闲及已用内存的总量。内存也可以使用 `htop` 这样的工具来显示; +- **打开文件** - [`lsof`](http://man7.org/linux/man-pages/man8/lsof.8.html) 可以列出被进程打开的文件信息。 当我们需要查看某个文件是被哪个进程打开的时候,这个命令非常有用; 
+- **网络连接和配置** - [`ss`](http://man7.org/linux/man-pages/man8/ss.8.html) 可以帮助我们监控网络包的收发情况以及网络接口的统计信息。`ss` 常见的一个使用场景是找出占用某个端口的进程。如果要显示路由、网络设备和接口信息,您可以使用 [`ip`](http://man7.org/linux/man-pages/man8/ip.8.html) 命令。注意,`netstat` 和 `ifconfig` 这两个命令已经被前面那些工具所代替了。 +- **网络使用** - [`nethogs`](https://github.com/raboof/nethogs) 和 [`iftop`](http://www.ex-parrot.com/pdw/iftop/) 是非常好的用于对网络占用进行监控的交互式命令行工具。 -If you want to test these tools you can also artificially impose loads on the machine using the [`stress`](https://linux.die.net/man/1/stress) command. +如果您希望测试一下这些工具,您可以使用 [`stress`](https://linux.die.net/man/1/stress) 命令来为系统人为地增加负载。 ### 专用工具 -Sometimes, black box benchmarking is all you need to determine what software to use. -Tools like [`hyperfine`](https://github.com/sharkdp/hyperfine) let you quickly benchmark command line programs. -For instance, in the shell tools and scripting lecture we recommended `fd` over `find`. We can use `hyperfine` to compare them in tasks we run often. -E.g. in the example below `fd` was 20x faster than `find` in my machine. +有时候,您只需要对黑盒程序进行基准测试,并依此对软件选择进行评估。 +类似 [`hyperfine`](https://github.com/sharkdp/hyperfine) 这样的命令行工具可以帮您快速进行基准测试。例如,我们在 shell 工具和脚本那一节课中推荐使用 `fd` 来代替 `find`。我们这里可以用 `hyperfine` 来比较一下它们。 + +例如,下面的例子中,我们可以看到 `fd` 比 `find` 要快 20 倍。 ```bash $ hyperfine --warmup 3 'fd -e jpg' 'find . -iname "*.jpg"' @@ -451,8 +447,7 @@ Summary 21.89 ± 2.33 times faster than 'find . -iname "*.jpg"' ``` -As it was the case for debugging, browsers also come with a fantastic set of tools for profiling webpage loading, letting you figure out where time is being spent (loading, rendering, scripting, &c). -More info for [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler) and [Chrome](https://developers.google.com/web/tools/chrome-devtools/rendering-toolss). 
+和 debug 一样,浏览器也包含了很多不错的性能分析工具,可以用来分析页面加载,让我们可以搞清楚时间都消耗在什么地方(加载、渲染、脚本等等)。 更多关于 [Firefox](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler) 和 [Chrome](https://developers.google.com/web/tools/chrome-devtools/rendering-toolss)的信息可以点击链接。 # 课后练习 From b0914c3102075d89957d6d65bf4ee261823cc657 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 16:38:39 +0800 Subject: [PATCH 397/640] update trans --- _2020/debugging-profiling.md | 29 +++++++++++------------------ 1 file changed, 11 insertions(+), 18 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 9242a79d..ff857f3a 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -2,7 +2,7 @@ layout: lecture title: "调试及性能分析" date: 2019-01-23 -ready: false +ready: true video: aspect: 56.25 id: l812pUnKxME @@ -452,13 +452,11 @@ Summary # 课后练习 ## 调试 -1. Use `journalctl` on Linux or `log show` on macOS to get the super user accesses and commands in the last day. -If there aren't any you can execute some harmless commands such as `sudo ls` and check again. +1. 使用 Linux 上的 `journalctl` 或 macOS 上的 `log show` 命令来获取最近一天中超级用户的登陆信息及其所执行的指令。如果找不到相关信息,您可以执行一些无害的命令,例如`sudo ls` 然后再次查看。 -1. Do [this](https://github.com/spiside/pdb-tutorial) hands on `pdb` tutorial to familiarize yourself with the commands. For a more in depth tutorial read [this](https://realpython.com/python-debugging-pdb). - -1. Install [`shellcheck`](https://www.shellcheck.net/) and try checking the following script. What is wrong with the code? Fix it. Install a linter plugin in your editor so you can get your warnings automatically. +2. 学习 [这份](https://github.com/spiside/pdb-tutorial) `pdb` 实践教程并熟悉相关的命令。更深入的信息您可以参考[这份](https://realpython.com/python-debugging-pdb)教程。 +3. 
安装 [`shellcheck`](https://www.shellcheck.net/) 并尝试对下面的脚本进行检查。这段代码有什么问题吗?请修复相关问题。在您的编辑器中安装一个 linter 插件,这样它就可以自动地显示相关警告信息。 ```bash #!/bin/sh ## Example: a typical script with several problems @@ -469,14 +467,13 @@ If there aren't any you can execute some harmless commands such as `sudo ls` and done ``` -1. (Advanced) Read about [reversible debugging](https://undo.io/resources/reverse-debugging-whitepaper/) and get a simple example working using [`rr`](https://rr-project.org/) or [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html). +4. (进阶题) 请阅读 [可逆调试](https://undo.io/resources/reverse-debugging-whitepaper/) 并尝试创建一个可以工作的例子(使用 [`rr`](https://rr-project.org/) 或 [`RevPDB`](https://morepypy.blogspot.com/2016/07/reverse-debugging-for-python.html))。 ## 性能分析 -1. [Here](/static/files/sorts.py) are some sorting algorithm implementations. Use [`cProfile`](https://docs.python.org/3/library/profile.html) and [`line_profiler`](https://github.com/rkern/line_profiler) to compare the runtime of insertion sort and quicksort. What is the bottleneck of each algorithm? Use then `memory_profiler` to check the memory consumption, why is insertion sort better? Check now the inplace version of quicksort. Challenge: Use `perf` to look at the cycle counts and cache hits and misses of each algorithm. - -1. Here's some (arguably convoluted) Python code for computing Fibonacci numbers using a function for each number. +1. [这里](/static/files/sorts.py) 有一些排序算法的实现。请使用 [`cProfile`](https://docs.python.org/3/library/profile.html) 和 [`line_profiler`](https://github.com/rkern/line_profiler) 来比较插入排序和快速排序的性能。两种算法的瓶颈分别在哪里?然后使用 `memory_profiler` 来检查内存消耗,为什么插入排序更好一些?然后再看看原地排序版本的快排。附加题:使用 `perf` 来查看不同算法的 CPU 周期数及缓存命中及丢失情况。 +2. 
这里有一些用于计算斐波那契数列的 Python 代码,它为计算每个数字都定义了一个函数: ```python #!/usr/bin/env python def fib0(): return 0 @@ -494,14 +491,10 @@ If there aren't any you can execute some harmless commands such as `sudo ls` and # exec("fib{} = lru_cache(1)(fib{})".format(n, n)) print(eval("fib9()")) ``` + 将代码拷贝到文件中使其变为一个可执行的程序。安装 [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/),并使用 `pycallgraph graphviz -- ./fib.py` 来执行代码并查看 `pycallgraph.png` 这个文件。`fib0` 被调用了多少次?我们可以通过记忆法来对其进行优化。将注释掉的部分放开,然后重新生成图片。这回每个 `fibN` 函数被调用了多少次? +3. 我们经常会遇到的情况是某个我们希望去监听的端口已经被其他进程占用了。让我们学习如何找到该进程的 PID。首先执行 `python -m http.server 4444` 启动一个最简单的 web 服务器来监听 `4444` 端口。在另外一个终端中,执行 `lsof | grep LISTEN` 打印出所有监听端口的进程及相应的端口。找到对应的 PID 然后使用 `kill ` 停止该进程。 - Put the code into a file and make it executable. Install [`pycallgraph`](http://pycallgraph.slowchop.com/en/master/). Run the code as is with `pycallgraph graphviz -- ./fib.py` and check the `pycallgraph.png` file. How many times is `fib0` called?. We can do better than that by memoizing the functions. Uncomment the commented lines and regenerate the images. How many times are we calling each `fibN` function now? - -1. A common issue is that a port you want to listen on is already taken by another process. Let's learn how to discover that process pid. First execute `python -m http.server 4444` to start a minimal web server listening on port `4444`. On a separate terminal run `lsof | grep LISTEN` to print all listening processes and ports. Find that process pid and terminate it by running `kill `. - -1. Limiting processes resources can be another handy tool in your toolbox. -Try running `stress -c 3` and visualize the CPU consumption with `htop`. Now, execute `taskset --cpu-list 0,2 stress -c 3` and visualize it. Is `stress` taking three CPUs? Why not? Read [`man taskset`](http://man7.org/linux/man-pages/man1/taskset.1.html). -Challenge: achieve the same using [`cgroups`](http://man7.org/linux/man-pages/man7/cgroups.7.html). 
Try limiting the memory consumption of `stress -m`. +4. 限制进程资源也是一个非常有用的技术。执行 `stress -c 3` 并使用`htop` 对 CPU 消耗进行可视化。现在,执行`taskset --cpu-list 0,2 stress -c 3` 并可视化。`stress` 占用了3个 CPU 吗?为什么没有?阅读[`man taskset`](http://man7.org/linux/man-pages/man1/taskset.1.html)来寻找答案。附加题:使用 [`cgroups`](http://man7.org/linux/man-pages/man7/cgroups.7.html)来实现相同的操作,尝试使用`stress -m`来限制内存使用 -1. (Advanced) The command `curl ipinfo.io` performs a HTTP request an fetches information about your public IP. Open [Wireshark](https://www.wireshark.org/) and try to sniff the request and reply packets that `curl` sent and received. (Hint: Use the `http` filter to just watch HTTP packets). +5. (进阶题) `curl ipinfo.io` 命令或执行 HTTP 请求并获取关于您 IP 的信息。打开 [Wireshark](https://www.wireshark.org/) 并抓取 `curl` 发起的请求和收到的回复报文。(提示:可以使用`http`进行过滤,只显示 HTTP 报文) From 48a4d5d1b352c4314f2f7739b2ba831736d6754e Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 16:53:32 +0800 Subject: [PATCH 398/640] fix typo --- _2020/debugging-profiling.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index ff857f3a..286fab72 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -18,9 +18,9 @@ video: ## 打印调试法与日志 -"最有效的 debug 工具就是细致的分析配合位于恰当位置的打印语句" — Brian Kernighan, _Unix 新手入门_。 +"最有效的 debug 工具就是细致的分析,配合恰当位置的打印语句" — Brian Kernighan, _Unix 新手入门_。 -调试代码的第一种方法往往是在您发现问题的地方添加一些打印语句,然后不断重复此过程直到您获取了足够的信息并可以找到问题的根本原因。 +调试代码的第一种方法往往是在您发现问题的地方添加一些打印语句,然后不断重复此过程直到您获取了足够的信息并找到问题的根本原因。 另外一个方法是使用日志,而不是临时添加打印语句。日志较普通的打印语句有如下的一些优势: @@ -63,10 +63,11 @@ done 和这些系统交互的时候,阅读它们的日志是非常必要的,因为仅靠客户端侧的错误信息可能并不足以定位问题。 -幸运的是,大多数的程序都会将日志保存在您的系统中的某个地方。对于 UNIX 系统来说,程序的日志通常存放在 `/var/log`。 -例如, [NGINX](https://www.nginx.com/) web 服务器就将其日志存放于`/var/log/nginx`。 -最近,系统开始使用 **system log**,您所有的日志都会保存在这里。大多数的(但不是全部)Linux 系统都会使用 `systemd`,这是一个系统守护进程,它会控制您系统中的很多东西,例如哪些服务应该启动并运行。`systemd` 会将日志以某种特殊格式存放于`/var/log/journal`,您可以使用 
[`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) 命令显示这些消息。 -类似地,在 macOS 系统中还是 `/var/log/system.log`,但是有更多的工具会使用系统日志,它的内容可以使用 [`log show`](https://www.manpagez.com/man/1/log/) 显示。 +幸运的是,大多数的程序都会将日志保存在您的系统中的某个地方。对于 UNIX 系统来说,程序的日志通常存放在 `/var/log`。例如, [NGINX](https://www.nginx.com/) web 服务器就将其日志存放于`/var/log/nginx`。 + +目前,系统开始使用 **system log**,您所有的日志都会保存在这里。大多数的(但不是全部)Linux 系统都会使用 `systemd`,这是一个系统守护进程,它会控制您系统中的很多东西,例如哪些服务应该启动并运行。`systemd` 会将日志以某种特殊格式存放于`/var/log/journal`,您可以使用 [`journalctl`](http://man7.org/linux/man-pages/man1/journalctl.1.html) 命令显示这些消息。 + +类似地,在 macOS 系统中是 `/var/log/system.log`,但是有更多的工具会使用系统日志,它的内容可以使用 [`log show`](https://www.manpagez.com/man/1/log/) 显示。 对于大多数的 UNIX 系统,您也可以使用[`dmesg`](http://man7.org/linux/man-pages/man1/dmesg.1.html) 命令来读取内核的日志。 @@ -107,7 +108,7 @@ journalctl --since "1m ago" | grep Hello - **n**(ext) - 继续执行直到当前函数的下一条语句或者 return 语句; - **b**(reak) - 设置断点(基于传入对参数); - **p**(rint) - 在当前上下文对表达式求值并打印结果。还有一个命令是**pp** ,它使用 [`pprint`](https://docs.python.org/3/library/pprint.html) 打印; -- **r**(eturn) - 继续执行知道当前函数返回; +- **r**(eturn) - 继续执行直到当前函数返回; - **q**(uit) - 退出调试器。 让我们使用`pdb` 来修复下面的 Python 代码(参考讲座视频) @@ -206,7 +207,7 @@ Found 3 errors in 1 file (checked 1 source file) # 性能分析 即使您的代码能够向您期望的一样运行,但是如果它消耗了您全部的 CPU 和内存,那么它显然也不是个好程序。算法课上我们通常会介绍大O标记法,但却没交给我们如何找到程序中的热点。 -因为 [过早的优化是万恶之源](http://wiki.c2.com/?PrematureOptimization),您需要学习性能分析和监控工具。它们会帮助您找到程序中最耗时、最耗资源的部分,这样您就可以有针对性的进行性能优化。 +鉴于 [过早的优化是万恶之源](http://wiki.c2.com/?PrematureOptimization),您需要学习性能分析和监控工具,它们会帮助您找到程序中最耗时、最耗资源的部分,这样您就可以有针对性的进行性能优化。 ## 计时 @@ -281,7 +282,7 @@ if __name__ == '__main__': grep(pattern, file) ``` -我们可以使用下面的命令来对这段代码进行分析。通过它的输出我们可以直到,IO 消耗来大量的时间,编译正则表达式也比较耗费时间。因为正则表达式只需要编译一次,我们可以将其移动到 for 循环外面来改进性能。 +我们可以使用下面的命令来对这段代码进行分析。通过它的输出我们可以直到,IO 消耗了大量的时间,编译正则表达式也比较耗费时间。因为正则表达式只需要编译一次,我们可以将其移动到 for 循环外面来改进性能。 ``` $ python -m cProfile -s tottime grep.py 1000 '^(import|\s*def)[^,]*$' *.py @@ -305,8 +306,7 @@ $ python -m cProfile -s tottime grep.py 1000 
'^(import|\s*def)[^,]*$' *.py 关于 Python 的 `cProfile` 分析器(以及其他一些类似的一些分析器),需要注意的是它显示的是每次函数调用的时间。看上去可能快到反直觉,尤其是如果您在代码里面使用了第三方的函数库,因为内部函数调用也会被看作函数调用。 -更加符合直觉的显示分析信息的方式是包括每行代码的执行时间,这也是 -*行分析器* 的工作。例如,下面这段 Python 代码会向本课程的网站发起一个请求,然后解析响应返回的页面中的全部 URL: +更加符合直觉的显示分析信息的方式是包括每行代码的执行时间,这也是*行分析器*的工作。例如,下面这段 Python 代码会向本课程的网站发起一个请求,然后解析响应返回的页面中的全部 URL: ```python @@ -389,14 +389,14 @@ Line # Mem usage Increment Line Contents 例如,`perf` 可以报告不佳的缓存局部性(poor cache locality)、大量的页错误(page faults)或活锁(livelocks)。下面是关于常见命令的简介: - `perf list` - 列出可以被 pref 追踪的事件; -- `perf stat COMMAND ARG1 ARG2` - 收集与某个进程或指令相关的不同事件; +- `perf stat COMMAND ARG1 ARG2` - 收集与某个进程或指令相关的事件; - `perf record COMMAND ARG1 ARG2` - 记录命令执行的采样信息并将统计数据储存在`perf.data`中; - `perf report` - 格式化并打印 `perf.data` 中的数据。 ### 可视化 -使用分析器来分析真实的程序时,由于软件的复杂性,其输出结果中包含了大量的信息。人类是一种视觉动物,非常不善于阅读大量的文字。因此很多工具都提供了可视化分析器输出结果的功能。 +使用分析器来分析真实的程序时,由于软件的复杂性,其输出结果中将包含大量的信息。人类是一种视觉动物,非常不善于阅读大量的文字。因此很多工具都提供了可视化分析器输出结果的功能。 对于采样分析器来说,常见的显示 CPU 分析数据的形式是 [火焰图](http://www.brendangregg.com/flamegraphs.html),火焰图会在 Y 轴显示函数调用关系,并在 X 轴显示其耗时的比例。火焰图同时还是可交互的,您可以深入程序的某一具体部分,并查看其栈追踪(您可以尝试点击下面的图片)。 From 3718eb00683ae6d9dd24955cdf6e1cbfb124e0e5 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 30 May 2020 16:55:40 +0800 Subject: [PATCH 399/640] mark debugging as done --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 8df220ef..8d247f03 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ To contribute to this tanslation project, please book your topic by creating an | [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | 
[version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | +| [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | From 63285abe0b28b74c89865b1eb60151a8530b7318 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 18:11:09 +0800 Subject: [PATCH 400/640] add badge and ribbon --- _includes/nav.html | 9 +++++++++ index.md | 3 +++ static/css/main.css | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 45 insertions(+) diff --git a/_includes/nav.html b/_includes/nav.html index 1bbd9d2c..06115000 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -7,7 +7,16 @@

    + \ No newline at end of file diff --git a/index.md b/index.md index fb733117..08cad99d 100644 --- a/index.md +++ b/index.md @@ -3,6 +3,9 @@ layout: page title: The Missing Semester of Your CS Education 中文版 --- + + + 对于计算机教育来说,从操作系统到机器学习,这些高大上课程和主题已经非常多了。然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 这部分内容就是:精通工具。在这个系列课程中,我们讲授命令行、强大的文本编辑器的使用、使用版本控制系统提供的多种特性等等。 diff --git a/static/css/main.css b/static/css/main.css index 9dabb63e..65423ee5 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -370,3 +370,36 @@ input[type=checkbox]:checked ~ .menu-label:after { #content pre { break-inside: avoid-page; page-break-inside: avoid; } #content div.small:last-of-type { display: none; } } + +.ribbon { + background-color: #8cbcea; + overflow: hidden; + white-space: nowrap; + /* top left corner */ + position: absolute; + right: -50px; + top: 40px; + /* 45 deg ccw rotation */ + -webkit-transform: rotate(45deg); + -moz-transform: rotate(45deg); + -ms-transform: rotate(45deg); + -o-transform: rotate(45deg); + transform: rotate(45deg); + /* shadow */ + -webkit-box-shadow: 0 0 10px #888; + -moz-box-shadow: 0 0 10px #888; + /* box-shadow: 0 0 10px #888; */ + box-shadow: 0px -1px 20px 0px #562c8c6b; +} +.ribbon a { + border: 1px solid #000; + color: #000; + display: block; + font: bold 81.25% "Helvetica Neue", Helvetica, Arial, sans-serif; + margin: 1px 0; + padding: 10px 50px; + text-align: center; + text-decoration: none; + /* shadow */ + /* text-shadow: 0 0 5px #444; */ +} From b2c3d5634bf98f469805a40e92d6f0f21d1c053b Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 30 May 2020 18:23:15 +0800 Subject: [PATCH 401/640] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 8d247f03..e4b6c092 100644 --- a/README.md +++ b/README.md @@ -35,6 +35,6 @@ To contribute to this tanslation project, please book your topic by creating an | 
[debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | -| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | | TO-DO | +| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress | | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress | | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | Done | From e0e7c9fdbedc7d7b08f30f9f200ba30ab0bfd655 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 30 May 2020 22:37:11 +0800 Subject: [PATCH 402/640] add english link and social badge --- _includes/nav.html | 1 + static/css/main.css | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/_includes/nav.html b/_includes/nav.html index 06115000..662391bb 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -7,6 +7,7 @@
    讲座列表 关于本课程 + 英文版 GitHub stars diff --git a/static/css/main.css b/static/css/main.css index 65423ee5..bfde6704 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -233,7 +233,8 @@ hr { } #top-nav { - max-width: 40rem; + max-width: 75rem; + padding-left:8rem; margin: auto; text-align: center; } From 5c86f4582003b59ff9722ca68bf3a2b5aaaba10c Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 09:31:37 +0800 Subject: [PATCH 403/640] Update qa.md --- _2020/qa.md | 282 ++++++++++++++++++++++++++++++---------------------- 1 file changed, 161 insertions(+), 121 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index eefcb5ba..354d03c1 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "Q&A" +title: "提问&回答" date: 2019-01-30 ready: false video: @@ -8,175 +8,215 @@ video: id: Wz50FvGG6xU --- -For the last lecture, we answered questions that the students submitted: +最后一节课,我们回答学生提出的问题: -- [Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc](#any-recommendations-on-learning-operating-systems-related-topics-like-processes-virtual-memory-interrupts-memory-management-etc) -- [What are some of the tools you'd prioritize learning first?](#what-are-some-of-the-tools-youd-prioritize-learning-first) -- [When do I use Python versus a Bash scripts versus some other language?](#when-do-i-use-python-versus-a-bash-scripts-versus-some-other-language) -- [What is the difference between `source script.sh` and `./script.sh`](#what-is-the-difference-between-source-scriptsh-and-scriptsh) -- [What are the places where various packages and tools are stored and how does referencing them work? 
What even is `/bin` or `/lib`?](#what-are-the-places-where-various-packages-and-tools-are-stored-and-how-does-referencing-them-work-what-even-is-bin-or-lib) -- [Should I `apt-get install` a python-whatever, or `pip install` whatever package?](#should-i-apt-get-install-a-python-whatever-or-pip-install-whatever-package) -- [What's the easiest and best profiling tools to use to improve performance of my code?](#whats-the-easiest-and-best-profiling-tools-to-use-to-improve-performance-of-my-code) -- [What browser plugins do you use?](#what-browser-plugins-do-you-use) -- [What are other useful data wrangling tools?](#what-are-other-useful-data-wrangling-tools) -- [What is the difference between Docker and a Virtual Machine?](#what-is-the-difference-between-docker-and-a-virtual-machine) -- [What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)?](#what-are-the-advantages-and-disadvantages-of-each-os-and-how-can-we-choose-between-them-eg-choosing-the-best-linux-distribution-for-our-purposes) -- [Vim vs Emacs?](#vim-vs-emacs) -- [Any tips or tricks for Machine Learning applications?](#any-tips-or-tricks-for-machine-learning-applications) -- [Any more Vim tips?](#any-more-vim-tips) -- [What is 2FA and why should I use it?](#what-is-2fa-and-why-should-i-use-it) -- [Any comments on differences between web browsers?](#any-comments-on-differences-between-web-browsers) +- [学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等) +- [你会优先学习的工具有那些?](#你会优先学习的工具有那些?) +- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言?) +- [`source script.sh` 和`./script.sh`有什么区别](#`source script.sh` 和`./script.sh`有什么区别) +- [各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么?) 
+- [我应该用`apt-get install`还是`pip install` 去下载包呢?](#我应该用`apt-get install`还是`pip install` 去下载包呢) +- [提高代码性能的最简单和最好的性能分析工具是什么?](#提高代码性能的最简单和最好的性能分析工具是什么) +- [你使用那些浏览器插件?](#你使用那些浏览器插件) +- [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) +- [Docker 和 虚拟机 有什么区别?](#Docker 和 虚拟机 有什么区别) +- [每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版)?](#每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版)) +- [Vim 编辑器vs Emacs编辑器?](#Vim 编辑器vs Emacs编辑器?) +- [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧?) +- [还有更多的Vim提示吗?](#还有更多的Vim提示吗?) +- [2FA是什么,为什么我需要使用它?](#2FA是什么,为什么我需要使用它?) +- [对于不同的Web浏览器有什么评价??](#对于不同的Web浏览器有什么评价?) -## Any recommendations on learning Operating Systems related topics like processes, virtual memory, interrupts, memory management, etc -First, it is unclear whether you actually need to be very familiar with all of these topics since they are very low level topics. -They will matter as you start writing more low level code like implementing or modifying a kernel. Otherwise, most topics will not be relevant, with the exception of processes and signals that were briefly covered in other lectures. +## 学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等 -Some good resources to learn about this topic: -- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - Graduate level class on Operating System Engineering. Class materials are publicly available. -- Modern Operating Systems (4th ed) - by Andrew S. Tanenbaum is a good overview of many of the mentioned concepts. -- The Design and Implementation of the FreeBSD Operating System - A good resource about the FreeBSD OS (note that this is not Linux). -- Other guides like [Writing an OS in Rust](https://os.phil-opp.com/) where people implement a kernel step by step in various languages, mostly for teaching purposes. +首先,不清楚你是不是真的需要熟悉这些 更底层的话题。 +当你开始编写更加底层的代码比如 实现或修改 内核 的时候,这些很重要。 +除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 -## What are some of the tools you'd prioritize learning first? -Some topics worth prioritizing: +学习这些话题的资源: -- Learning how to use your keyboard more and your mouse less. 
This can be through keyboard shortcuts, changing interfaces, &c. -- Learning your editor well. As a programmer most of your time is spent editing files so it really pays off to learn this skill well. -- Learning how to automate and/or simplify repetitive tasks in your workflow because the time savings will be enormous... -- Learning about version control tools like Git and how to use it in conjunction with GitHub to collaborate in modern software projects. +- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - Graduate level class on Operating System Engineering. Class materials are publicly available. 研究生阶段的操作系统课程(课程资料是公开的) +- Modern Operating Systems (4th ed) - Andrew S. Tanenbaum is a good overview of many of the mentioned concepts. **对很多上述概念都有很好的描述** +- The Design and Implementation of the FreeBSD Operating System - 关于FreeBSD OS 的好资源(注意,FreeBSD OS不是Linux) +- 其他的指南例如 [Writing an OS in Rust](https://os.phil-opp.com/) where people implement a kernel step by step in various languages, mostly for teaching purposes.这里用不同的语言 逐步实现了内核,主要用于教学的目的。 -## When do I use Python versus a Bash scripts versus some other language? -In general, bash scripts are useful for short and simple one-off scripts when you just want to run a specific series of commands. bash has a set of oddities that make it hard to work with for larger programs or scripts: +## 你会优先学习的工具有那些? -- bash is easy to get right for a simple use case but it can be really hard to get right for all possible inputs. For example, spaces in script arguments have led to countless bugs in bash scripts. -- bash is not amenable to code reuse so it can be hard to reuse components of previous programs you have written. More generally, there is no concept of software libraries in bash. -- bash relies on many magic strings like `$?` or `$@` to refer to specific values, whereas other languages refer to them explicitly, like `exitCode` or `sys.args` respectively. 
-Therefore, for larger and/or more complex scripts we recommend using more mature scripting languages like Python or Ruby. -You can find online countless libraries that people have already written to solve common problems in these languages. -If you find a library that implements the specific functionality you care about in some language, usually the best thing to do is to just use that language. -## What is the difference between `source script.sh` and `./script.sh` +值得优先学习的话题: -In both cases the `script.sh` will be read and executed in a bash session, the difference lies in which session is running the commands. -For `source` the commands are executed in your current bash session and thus any changes made to the current environment, like changing directories or defining functions will persist in the current session once the `source` command finishes executing. -When running the script standalone like `./script.sh`, your current bash session starts a new instance of bash that will run the commands in `script.sh`. -Thus, if `script.sh` changes directories, the new bash instance will change directories but once it exits and returns control to the parent bash session, the parent session will remain in the same place. -Similarly, if `script.sh` defines a function that you want to access in your terminal, you need to `source` it for it to be defined in your current bash session. Otherwise, if you run it, the new bash process will be the one to process the function definition instead of your current shell. +- 学着更多去使用键盘而不是鼠标。这可以通过快捷键,更换接口等等。 -## What are the places where various packages and tools are stored and how does referencing them work? What even is `/bin` or `/lib`? +- 学好编辑器。作为程序员你的大部分时间都是在编辑文件,因此学好这些技能会给你带来回报。 -Regarding programs that you execute in your terminal, they are all found in the directories listed in your `PATH` environment variable and you can use the `which` command (or the `type` command) to check where your shell is finding a specific program. 
-In general, there are some conventions about where specific types of files live. Here are some of the ones we talked about, check the [Filesystem, Hierarchy Standard](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard) for a more comprehensive list. +- 学习怎样去自动化或简化工作流程中的重复任务。因为这会节省大量的时间。 -- `/bin` - Essential command binaries -- `/sbin` - Essential system binaries, usually to be run by root -- `/dev` - Device files, special files that often are interfaces to hardware devices -- `/etc` - Host-specific system-wide configuration files -- `/home` - Home directories for users in the system -- `/lib` - Common libraries for system programs -- `/opt` - Optional application software -- `/sys` - Contains information and configuration for the system (covered in the [first lecture](/2020/course-shell/)) -- `/tmp` - Temporary files (also `/var/tmp`). Usually deleted between reboots. -- `/usr/` - Read only user data - + `/usr/bin` - Non-essential command binaries - + `/usr/sbin` - Non-essential system binaries, usually to be run by root - + `/usr/local/bin` - Binaries for user compiled programs -- `/var` - Variable files like logs or caches +- 学习版本控制工具如Git 并且知道如何使用它与GitHub结合,在现代的软件项目中协同工作。 -## Should I `apt-get install` a python-whatever, or `pip install` whatever package? +- Learning how to use your keyboard more and your mouse less. This can be through **keyboard shortcuts, changing interfaces, &c**. -There's no universal answer to this question. It's related to the more general question of whether you should use your system's package manager or a language-specific package manager to install software. A few things to take into account: -- Common packages will be available through both, but less popular ones or more recent ones might not be available in your system package manager. In this, case using the language-specific tool is the better choice. 
-- Similarly, language-specific package managers usually have more up to date versions of packages than system package managers. -- When using your system package manager, libraries will be installed system wide. This means that if you need different versions of a library for development purposes, the system package manager might not suffice. For this scenario, most programming languages provide some sort of isolated or virtual environment so you can install different versions of libraries without running into conflicts. For Python, there's virtualenv, and for Ruby, there's RVM. -- Depending on the operating system and the hardware architecture, some of these packages might come with binaries or might need to be compiled. For instance, in ARM computers like the Raspberry Pi, using the system package manager can be better than the language specific one if the former comes in form of binaries and the later needs to be compiled. This is highly dependent on your specific setup. +## 使用Python VS Bash脚本 VS 其他语言? -You should try to use one solution or the other and not both since that can lead to conflicts that are hard to debug. Our recommendation is to use the language-specific package manager whenever possible, and to use isolated environments (like Python's virtualenv) to avoid polluting the global environment. -## What's the easiest and best profiling tools to use to improve performance of my code? -The easiest tool that is quite useful for profiling purposes is [print timing](/2020/debugging-profiling/#timing). -You just manually compute the time taken between different parts of your code. By repeatedly doing this, you can effectively do a binary search over your code and find the segment of code that took the longest. 
+通常来说,当你仅想要运行一系列的命令的时候,Bash 脚本对于简短的一次性脚本有用。Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: + +- Bash 可以获取简单的用例,但是很难获得全部可能的输入。例如,脚本参数中的空格会导致Bash 脚本出错。 +- Bash 对于 代码重用并不友好。因此,重用你先前已经写好代码部分很困难。Bash 中没有软件库的概念。 +- Bash依赖于一些 像`$?` 或 `$@`的特殊字符指代特殊的值。其他的语言会显式地引用,比如`exitCode` 或`sys.args`。 + +因此,对于大型或者更加复杂地脚本我们推荐使用更加成熟的脚本语言例如 Python 和 Ruby。 +你可以找到数不胜数的在线库,有人已经写好了去解决常见的问题用这种语言。 +你可以在网上找到很多用这些语言写的解决常见问题的库。 +如果你发现某种语言实现了你需要的特定功能的库,最好的方式就是直接去用那种语言 + + +## `source script.sh` 和`./script.sh`有什么区别 +两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话在执行这个命令。 +对于`source`命令来说,命令是在当前的bash会话种执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是自定义函数)都会保存在当前会话中。 +当像`./script.sh`这样独立运行脚本时,当前的bash会话将启动新的bash实例,并在该实例中运行命令`script.sh`。因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 +同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在使得在当前bash会话中定义这个函数。否则,如果您运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 + + + +## 各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么? +根据你在命令行中运行的程序,这些包和工具会在`PATH`环境变量所列出的目录中找到 并且 你可以使用 `which`命令(或是`type`命令)来检查你的shell在哪里发现了特定的程序。 +一般来说,特定种类的文件存储有一定的规范,[文件系统,层次结构标准(Filesystem, Hierarchy Standard)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)可以查到我们讨论内容的详细列表 + +- `/bin` - 基本命令二进制文件 +- `/sbin` - 基本的系统二进制文件,通常是root运行的 +- `/dev` - 设备文件,通常是硬件设备接口文件 +- `/etc` - 主机特定的系统配置文件 +- `/home` - 系统用户的家目录 +- `/lib` - 系统软件通用库 +- `/opt` - 可选的应用软件 +- `/sys` - 包含系统的信息和配置( [第一堂课](/2020/course-shell/)介绍的) +- `/tmp` - 临时文件 (`/var/tmp`) 通常在重启之间删除 +- `/usr/` - 只读的用户数据 + + `/usr/bin` - 非必须的命令二进制文件 + + `/usr/sbin` - 非必须的系统二进制文件,通常是由root运行的 + + `/usr/local/bin` - 用户编译程序的二进制文件 +- `/var` -变量文件 像日志或缓存 + +## 我应该用`apt-get install`还是`pip install` 去下载包呢? 
+ +这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题有关。需要考虑的几件事: + +- 通用软件包可以通过这两种方法获得,但是不太流行的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的工具的情况是更好的选择。 +- 同样,特定语言的程序包管理器相比系统程序包管理器更多的最新版本的程序包。 +- 当使用系统软件包管理器时,将在系统范围内安装库。这意味着,如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 +- 根据操作系统和硬件体系结构,其中一些软件包可能会附带二进制文件或可能需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于您的特定设置。 +你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以调试的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如Python的virtualenv)以避免影响全局环境。 + + + +## 提高代码性能的最简单和最好的性能分析工具是什么? +性能分析方面最有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 + +对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算一切的时间花费以及所有调用堆栈,即哪个函数调用了另一个函数。然后,它会生成程序源代码的带注释的版本,其中包含每行花费的时间。但是,它会使程序速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出很好的可视化工具。你还应该使用针对编程语言或任务的特定的工具。例如,对于Web开发,Chrome和Firefox内置的开发工具具有出色的性能分析器。 + +有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,值得检查的是,关于硬件功能的理论速度的后验计算与实际读数没有偏差。 +**粗略计算理论速度,根据实际读数与硬件容量偏差不大** 也有专门的工具来分析系统调用中的等待时间。这些工具包括执行用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 之类的工具。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 -For more advanced tools, Valgrind's [Callgrind](http://valgrind.org/docs/manual/cl-manual.html) lets you run your program and measure how long everything takes and all the call stacks, namely which function called which other function. It then produces an annotated version of your program's source code with the time taken per line. However, it slows down your program by an order of magnitude and does not support threads. 
For other cases, the [`perf`](http://www.brendangregg.com/perf.html) tool and other language specific sampling profilers can output useful data pretty quickly. [Flamegraphs](http://www.brendangregg.com/flamegraphs.html) are good visualization tool for the output of said sampling profilers. You should also try to use specific tools for the programming language or task you are working with. For example, for web development, the dev tools built into Chrome and Firefox have fantastic profilers. Sometimes the slow part of your code will be because your system is waiting for an event like a disk read or a network packet. In those cases, it is worth checking that back-of-the-envelope calculations about the theoretical speed in terms of hardware capabilities do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like [eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) that perform kernel tracing of user programs. In particular [`bpftrace`](https://github.com/iovisor/bpftrace) is worth checking out if you need to perform this sort of low level profiling. -## What browser plugins do you use? +## 你使用那些浏览器插件? 
+ +我们钟爱的插件主要与安全性与可用性有关: +- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode)不仅可以拦截广告,还可以拦截第三方的页面。这也会覆盖内部的脚本和其他种类的加载资源。如果你打算花更多的时间去配置,前往[medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。这些会使得一些网站停止工作直到你调整好了这些设置,这会显著提高你的网络安全。另外, [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)已经很好了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 + +- [Stylus](https://github.com/openstyles/stylus/) -Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式表侧面加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边栏,更改背景颜色,甚至更改文字大小或字体样式。这可以使你得经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。例如,大多数常见的网站都有一个或几个深色主题样式。 + +- 全页屏幕捕获-内置于Firefox和 [Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 + +- [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) -该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页 并且/或 确保网站无法在它们之间共享信息。 +- 密码集成管理器-大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充著名网站的网络钓鱼攻击窃取登录凭据。 + + +## 有哪些有用的数据整理工具? 
+在数据整理课程中,我们没有时间讨论的一些数据整理工具包括`jq`或`pup`分别是用于JSON和HTML数据的专用解析器。Perl语言是用于更高级的数据整理管道的另一个很好的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为正确的列对齐文本。 + +一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常宝贵的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的 [讲义](/2020/editors/#macros) (去年 [视频](/2019/editors/))中,有一个示例就是使用vim宏将XML格式的文件转换为JSON。 + + +对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它使定义复杂的操作(如分组依据,联接或过滤器)变得非常容易;而且 而且还很容易绘制数据的不同属性。它还支持导出为多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种理论上[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 + + + + +## Docker 和 虚拟机 有什么区别? + +Docker 基于更加普遍的概念,称为容器。关于容器和虚拟机之间最大的不同是 虚拟机会执行整个的 **OS 栈,包括内核(即使这个内核和主机内核相同)**。与虚拟机不同,容器避免运行其他内核实例 反而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机好像是使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 + +另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如如果你在macOS上运行Docker,Docker需要启动Linux虚拟机去获取初始的Linux内核,这样开销仍然很大。最后,Docker是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker容器在重启之间不会维持以任何形式的存储。 + + + +## 每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版) + +关于Linux发行版,尽管有很多版本,但大部分发行版在大多数使用情况下的表现是相同的。 +可以在任何发行版中学习Linux和UNIX的特性和其内部工作原理。 +发行版之间的根本区别是发行版如何处理软件包更新。 +某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的技术(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新要保守得多,因此更新会更稳定,但不能使用一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 + +Mac OS是介于Windows和Linux之间的一个很好的中间OS,它有很漂亮的界面。但是,Mac OS是基于BSD而不是Linux,因此系统的某些部分和命令是不同的。 +另一种值得体验的是FreeBSD。虽然某些程序不能在FreeBSD上运行,但与Linux相比,BSD生态系统的碎片化程度要低得多,并且说明文档更加友好。 +除了开发Windows应用程序或需要某些在Windows上更好功能(例如对游戏的良好驱动程序支持)外,我们不建议使用Windows。 + +对于双启动系统,我们认为最有效的实现是macOS的bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 + + + -Some of our favorites, mostly related to security and usability: +## Vim 编辑器vs Emacs编辑器? 
+我们三个都使用vim作为我们的主要编辑器。但是Emacs也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs不遵循vim的模式编辑,但是这些功能可以通过Emacs插件 [Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好)。 -- [uBlock Origin](https://github.com/gorhill/uBlock) - It is a [wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode) blocker that doesn’t just stop ads, but all sorts of third-party communication a page may try to do. This also cover inline scripts and other types of resource loading. If you’re willing to spend some time on configuration to make things work, go to [medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode) or even [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode). Those will make some sites not work until you’ve fiddled with the settings enough, but will also significantly improve your online security. Otherwise, the [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode) is already a good default that blocks most ads and tracking. You can also define your own rules about what website objects to block. -- [Stylus](https://github.com/openstyles/stylus/) - a fork of Stylish (don't use Stylish, it was shown to [steal users browsing history](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/)), allows you to sideload custom CSS stylesheets to websites. With Stylus you can easily customize and modify the appearance of websites. This can be removing a sidebar, changing the background color or even the text size or font choice. This is fantastic for making websites that you visit frequently more readable. Moreover, Stylus can find styles written by other users and published in [userstyles.org](https://userstyles.org/). Most common websites have one or several dark theme stylesheets for instance. 
-- Full Page Screen Capture - Built into Firefox and [Chrome extension](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en). Let's you take a screenshot of a full website, often much better than printing for reference purposes. -- [Multi Account Containers](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - lets you separate cookies into "containers", allowing you to browse the web with different identities and/or ensuring that websites are unable to share information between them. -- Password Manager Integration - Most password managers have browser extensions that make inputting your credentials into websites not only more convenient but also more secure. Compared to simply copy-pasting your user and password, these tools will first check that the website domain matches the one listed for the entry, preventing phishing attacks that impersonate popular websites to steal credentials. +## 机器学习应用的提示或技巧? -## What are other useful data wrangling tools? +课程的一些经验可以直接用于机器学习程序。 -Some of the data wrangling tools we did not have time to cover during the data wrangling lecture include `jq` or `pup` which are specialized parsers for JSON and HTML data respectively. The Perl programming language is another good tool for more advanced data wrangling pipelines. Another trick is the `column -t` command that can be used to convert whitespace text (not necessarily aligned) into properly column aligned text. +就像许多科学学科一样,在机器学习中,你经常要进行一系列实验,并检查哪些数据有效,哪些无效。 -More generally a couple of more unconventional data wrangling tools are vim and Python. For some complex and multi-line transformations, vim macros can be a quite invaluable tools to use. 
You can just record a series of actions and repeat them as many times as you want, for instance in the editors [lecture notes](/2020/editors/#macros) (and last year's [video](/2019/editors/)) there is an example of converting a XML-formatted file into JSON just using vim macros. +你可以使用Shell轻松快速地搜索这些实验结果,并且以明智的方式汇总。这意味着在需要在给定的时间范围或在使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得非常简单。 -For tabular data, often presented in CSVs, the [pandas](https://pandas.pydata.org/) Python library is a great tool. Not only because it makes it quite easy to define complex operations like group by, join or filters; but also makes it quite easy to plot different properties of your data. It also supports exporting to many table formats including XLS, HTML or LaTeX. Alternatively the R programming language (an arguably [bad](http://arrgh.tim-smith.us/) programming language) has lots of functionality for computing statistics over data and can be quite useful as the last step of your pipeline. [ggplot2](https://ggplot2.tidyverse.org/) is a great plotting library in R. +最后,如果你不使用集群提交你的GPU作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 -## What is the difference between Docker and a Virtual Machine? +## 还有更多的Vim提示吗? -Docker is based on a more general concept called containers. The main difference between containers and virtual machines is that virtual machines will execute an entire OS stack, including the kernel, even if the kernel is the same as the host machine. Unlike VMs, containers avoid running another instance of the kernel and instead share the kernel with the host. In Linux, this is achieved through a mechanism called LXC, and it makes use of a series of isolation mechanism to spin up a program that thinks it's running on its own hardware but it's actually sharing the hardware and kernel with the host. Thus, containers have a lower overhead than a full VM. -On the flip side, containers have a weaker isolation and only work if the host runs the same kernel. 
For instance if you run Docker on macOS, Docker needs to spin up a Linux virtual machine to get an initial Linux kernel and thus the overhead is still significant. Lastly, Docker is a specific implementation of containers and it is tailored for software deployment. Because of this, it has some quirks: for example, Docker containers will not persist any form of storage between reboots by default. +更多的提示: -## What are the advantages and disadvantages of each OS and how can we choose between them (e.g. choosing the best Linux distribution for our purposes)? +- 插件 - 花时间去探索插件。有很多不错的插件解决vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles -Regarding Linux distros, even though there are many, many distros, most of them will behave fairly identically for most use cases. -Most of Linux and UNIX features and inner workings can be learned in any distro. -A fundamental difference between distros is how they deal with package updates. -Some distros, like Arch Linux, use a rolling update policy where things are bleeding-edge but things might break every so often. On the other hand, some distros like Debian, CentOS or Ubuntu LTS releases are much more conservative with releasing updates in their repositories so things are usually more stable at the expense of sacrificing newer features. -Our recommendation for an easy and stable experience with both desktops and servers is to use Debian or Ubuntu. +- 标记 - 在vim里你可以使用 `m` 为字母 `X`做标记,之后你可以通过 `'`回到标记位置。这可以让你快速导航到文件内或跨文件间的特定位置。 -Mac OS is a good middle point between Windows and Linux that has a nicely polished interface. However, Mac OS is based on BSD rather than Linux, so some parts of the system and commands are different. -An alternative worth checking is FreeBSD. Even though some programs will not run on FreeBSD, the BSD ecosystem is much less fragmented and better documented than Linux. 
-We discourage Windows for anything but for developing Windows applications or if there is some deal breaker feature that you need, like good driver support for gaming. +- 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置向后向前移动。 -For dual boot systems, we think that the most working implementation is macOS' bootcamp and that any other combination can be problematic on the long run, specially if you combine it with other features like disk encryption. +- 撤销树 - vim 有不错的机制跟踪(文件)更改,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树 -## Vim vs Emacs? +- 时间的撤销 - `:earlier` 和 `:later`命令使得你可以用时间参考而不是某一时刻的更改。 -The three of us use vim as our primary editor but Emacs is also a good alternative and it's worth trying both to see which works better for you. Emacs does not follow vim's modal editing, but this can be enabled through Emacs plugins like [Evil](https://github.com/emacs-evil/evil) or [Doom Emacs](https://github.com/hlissner/doom-emacs). -An advantage of using Emacs is that extensions can be implemented in Lisp, a better scripting language than vimscript, Vim's default scripting language. +- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo)是一个默认未开启的vim的内置功能,在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 -## Any tips or tricks for Machine Learning applications? +- 领导按键 - 领导按键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)和其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 +- 高级文本对象 - 文本对象像搜索也可以用vim命令构成,例如`d/`会删除下一处匹配pattern的位置 ,**`cgn`会改变最后搜索到的字符串的下一个存在。** -Some of the lessons and takeaways from this class can directly be applied to ML applications. -As it is the case with many science disciplines, in ML you often perform a series of experiments and want to check what things worked and what didn't. 
-You can use shell tools to easily and quickly search through these experiments and aggregate the results in a sensible way. This could mean subselecting all experiments in a given time frame or that use a specific dataset. By using a simple JSON file to log all relevant parameters of the experiments, this can be incredibly simple with the tools we covered in this class. -Lastly, if you do not work with some sort of cluster where you submit your GPU jobs, you should look into how to automate this process since it can be a quite time consuming task that also eats away your mental energy. -## Any more Vim tips? -A few more tips: -- Plugins - Take your time and explore the plugin landscape. There are a lot of great plugins that address some of vim's shortcomings or add new functionality that composes well with existing vim workflows. For this, good resources are [VimAwesome](https://vimawesome.com/) and other programmers' dotfiles. -- Marks - In vim, you can set a mark doing `m` for some letter `X`. You can then go back to that mark doing `'`. This let's you quickly navigate to specific locations within a file or even across files. -- Navigation - `Ctrl+O` and `Ctrl+I` move you backward and forward respectively through your recently visited locations. -- Undo Tree - Vim has a quite fancy mechanism for keeping track of changes. Unlike other editors, vim stores a tree of changes so even if you undo and then make a different change you can still go back to the original state by navigating the undo tree. Some plugins like [gundo.vim](https://github.com/sjl/gundo.vim) and [undotree](https://github.com/mbbill/undotree) expose this tree in a graphical way. -- Undo with time - The `:earlier` and `:later` commands will let you navigate the files using time references instead of one change at a time. -- [Persistent undo](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) is an amazing built-in feature of vim that is disabled by default. 
It persists undo history between vim invocations. By setting `undofile` and `undodir` in your `.vimrc`, vim will storage a per-file history of changes. -- Leader Key - The leader key is a special key that is often left to the user to be configured for custom commands. The pattern is usually to press and release this key (often the space key) and then some other key to execute a certain command. Often, plugins will use this key to add their own functionality, for instance the UndoTree plugin uses ` U` to open the undo tree. -- Advanced Text Objects - Text objects like searches can also be composed with vim commands. E.g. `d/` will delete to the next match of said pattern or `cgn` will change the next occurrence of the last searched string. +## 2FA是什么,为什么我需要使用它? +双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道一些密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形,可以通过在接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 -## What is 2FA and why should I use it? -Two Factor Authentication (2FA) adds an extra layer of protection to your accounts on top of passwords. In order to login, you not only have to know some password, but you also have to "prove" in some way you have access to some hardware device. In the most simple case, this can be achieved by receiving an SMS on your phone, although there are [known issues](https://www.kaspersky.com/blog/2fa-practical-guide/24219/) with SMS 2FA. A better alternative we endorse is to use a [U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor) solution like [YubiKey](https://www.yubico.com/). -## Any comments on differences between web browsers? +## 对于不同的Web浏览器有什么评价? -The current landscape of browsers as of 2020 is that most of them are like Chrome because they use the same engine (Blink). 
This means that Microsoft Edge which is also based on Blink, and Safari, which is based on WebKit, a similar engine to Blink, are just worse versions of Chrome. Chrome is a reasonably good browser both in terms of performance and usability. Should you want an alternative, Firefox is our recommendation. It is comparable to Chrome in pretty much every regard and it excels for privacy reasons.
-Another browser called [Flow](https://www.ekioh.com/flow-browser/) is not user ready yet, but it is implementing a new rendering engine that promises to be faster than the current ones.
+2020的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chorme版本。不管是在性能还是可用性上,Chorme都是一款好的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chorme的在各方面不相上下,并且在隐私方面更加出色。有一款目前还没有完成Flow的浏览器**浏览器目前还没有完成**,它实现了全新的渲染引擎(rendering engine),可以保证比现有引擎快。

From 4c6f0037614ee4da4f42758fc51c45080b6e98b1 Mon Sep 17 00:00:00 2001
From: Lingfeng_Ai
Date: Sun, 31 May 2020 10:45:43 +0800
Subject: [PATCH 404/640] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e4b6c092..20f12418 100644
--- a/README.md
+++ b/README.md
@@ -35,6 +35,6 @@ To contribute to this tanslation project, please book your topic by creating an
 | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done |
 | [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress |
-| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | In-progress |
+| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@catcarbon](https://github.com/catcarbon) | In-progress |
 | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress |
 | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | Done |

From 405b367f82d1e46deee8fe6a69f1477aaef36e75 Mon Sep 17 00:00:00 2001
From: Yi Zhang
Date: Sun, 31 May 2020 01:05:43 -0400
Subject: [PATCH 405/640] last part done, ready for review

---
 _2020/security.md | 74 ++++++++++++++---------------------------------
 1 file changed, 22 insertions(+), 52 deletions(-)

diff --git a/_2020/security.md b/_2020/security.md
index 250fcb57..fb91e3d6 100644
--- a/_2020/security.md
+++ b/_2020/security.md
@@ -205,29 +205,17 @@ Windows的[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows-

 ## SSH

-We've covered the use of SSH and SSH keys in an [earlier
-lecture](/2020/command-line/#remote-machines). Let's look at the cryptography
-aspects of this.
-
-When you run `ssh-keygen`, it generates an asymmetric keypair, `public_key,
-private_key`. This is generated randomly, using entropy provided by the
-operating system (collected from hardware events, etc.). The public key is
-stored as-is (it's public, so keeping it a secret is not important), but at
-rest, the private key should be encrypted on disk. The `ssh-keygen` program
-prompts the user for a passphrase, and this is fed through a key derivation
-function to produce a key, which is then used to encrypt the private key with a
-symmetric cipher.
-
-In use, once the server knows the client's public key (stored in the
-`.ssh/authorized_keys` file), a connecting client can prove its identity using
-asymmetric signatures. This is done through
-[challenge-response](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication).
-At a high level, the server picks a random number and sends it to the client.
-The client then signs this message and sends the signature back to the server,
-which checks the signature against the public key on record. This effectively
-proves that the client is in possession of the private key corresponding to the
-public key that's in the server's `.ssh/authorized_keys` file, so the server
-can allow the client to log in.
+我们在[之前的一堂课](/2020/command-line/#remote-machines)讨论了SSH和SSH密钥的使用。那么我们今天从密码学的角度来分析一下它们。
+
+当你运行`ssh-keygen`命令,它会生成一个非对称密钥对:公钥和私钥`(public_key, private_key)`。
+生成过程中使用的随机数由系统提供的熵决定。这些熵可以来源于硬件事件(hardware events)等。
+公钥最终会被分发,它可以直接明文存储。
+但是为了防止泄露,私钥必须加密存储。`ssh-keygen`命令会提示用户输入一个密码,并将它输入密钥生成函数
+产生一个密钥。最终,`ssh-keygen`使用对称加密算法和这个密钥加密私钥。
+
+在实际运用中,当服务器已知用户的公钥(存储在`.ssh/authorized_keys`文件中,一般在用户HOME目录下),尝试连接的客户端可以使用非对称签名来证明用户的身份——这便是[挑战应答方式](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication)。
+简单来说,服务器选择一个随机数字发送给客户端。客户端使用用户私钥对这个数字信息签名后返回服务器。
+服务器随后使用`.ssh/authorized_keys`文件中存储的用户公钥来验证返回的信息是否由所对应的私钥所签名。这种验证方式可以有效证明试图登录的用户持有所需的私钥。

 {% comment %}
 extra topics, if there's time
@@ -251,32 +239,14 @@ security concepts, tips
 1. 假设另一个密码是用八个随机的大小写字母或数字组成。一个符合这样构造的例子是`rg8Ql34g`。这个密码又有多少比特的熵?
 1. 哪一个密码更强?
 1. 假设一个攻击者每秒可以尝试1万个密码,这个攻击者需要多久可以分别破解上述两个密码?
-1. **Cryptographic hash functions.** Download a Debian image from a
- [mirror](https://www.debian.org/CD/http-ftp/) (e.g. [this
- file](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso)
- from an Argentinean mirror). Cross-check the hash (e.g. using the
- `sha256sum` command) with the hash retrieved from the official Debian site
- (e.g. [this
- file](https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/SHA256SUMS)
- hosted at `debian.org`, if you've downloaded the linked file from the
- Argentinean mirror).
-1. **Symmetric cryptography.** Encrypt a file with AES encryption, using
- [OpenSSL](https://www.openssl.org/): `openssl aes-256-cbc -salt -in {input
- filename} -out {output filename}`. Look at the contents using `cat` or
- `hexdump`. Decrypt it with `openssl aes-256-cbc -d -in {input filename} -out
- {output filename}` and confirm that the contents match the original using
- `cmp`.
-1. **Asymmetric cryptography.**
- 1. Set up [SSH
- keys](https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2)
- on a computer you have access to (not Athena, because Kerberos interacts
- weirdly with SSH keys). Rather than using RSA keys as in the linked
- tutorial, use more secure [ED25519
- keys](https://wiki.archlinux.org/index.php/SSH_keys#Ed25519). Make sure
- your private key is encrypted with a passphrase, so it is protected at
- rest.
- 1. [Set up GPG](https://www.digitalocean.com/community/tutorials/how-to-use-gpg-to-encrypt-and-sign-messages)
- 1. Send Anish an encrypted email ([public key](https://keybase.io/anish)).
- 1. Sign a Git commit with `git commit -C` or create a signed Git tag with
- `git tag -s`. Verify the signature on the commit with `git show
- --show-signature` or on the tag with `git tag -v`.
+1. **密码散列函数** 从[Debian镜像站](https://www.debian.org/CD/http-ftp/)下载一个光盘映像(比如这个来自阿根廷镜像站的[映像](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso))。使用`sha256sum`命令对比下载映像的哈希值和官方Debian站公布的哈希值。如果你下载了上面的映像,官方公布的哈希值可以参考[这个文件](https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/SHA256SUMS)。
+1. **对称加密** 使用
+ [OpenSSL](https://www.openssl.org/)的AES模式加密一个文件: `openssl aes-256-cbc -salt -in {源文件名} -out {加密文件名}`。
+ 使用`cat`或者`hexdump`对比源文件和加密的文件,再用 `openssl aes-256-cbc -d -in {加密文件名} -out
+ {解密文件名}` 命令解密刚刚加密的文件。最后使用`cmp`命令确认源文件和解密后的文件内容相同。
+1. **非对称加密**
+ 1.
在你自己的电脑上使用更安全的[ED25519算法](https://wiki.archlinux.org/index.php/SSH_keys#Ed25519)生成一组[SSH + 密钥对](https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2)。为了确保私钥不使用时的安全,一定使用密码加密你的私钥。 + 1. [配置GPG](https://www.digitalocean.com/community/tutorials/how-to-use-gpg-to-encrypt-and-sign-messages)。 + 1. 给Anish发送一封加密的电子邮件([Anish的公钥](https://keybase.io/anish))。 + 1. 使用`git commit -S`命令签名一个Git提交,并使用`git show --show-signature`命令验证这个提交的签名。或者,使用`git tag -s`命令签名一个Git标签,并使用`git tag -v`命令验证标签的签名。 From d7bbdeac90456a10062ef9e9b8f88ce40d816e34 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:18:34 +0800 Subject: [PATCH 406/640] Update qa.md --- _2020/qa.md | 194 +++++++++++++++++++++------------------------------- 1 file changed, 77 insertions(+), 117 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 354d03c1..8b650ebc 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -11,21 +11,21 @@ video: 最后一节课,我们回答学生提出的问题: - [学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等) -- [你会优先学习的工具有那些?](#你会优先学习的工具有那些?) -- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言?) -- [`source script.sh` 和`./script.sh`有什么区别](#`source script.sh` 和`./script.sh`有什么区别) -- [各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么?) -- [我应该用`apt-get install`还是`pip install` 去下载包呢?](#我应该用`apt-get install`还是`pip install` 去下载包呢) -- [提高代码性能的最简单和最好的性能分析工具是什么?](#提高代码性能的最简单和最好的性能分析工具是什么) +- [你会优先学习的工具有那些?](#你会优先学习的工具有那些) +- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言) +- [`source script.sh` 和`./script.sh`有什么区别?](#`source script.sh` 和`./script.sh`有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或 `/lib` 是什么) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用`apt-get install`还是`pip install`去下载包呢) +- [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能,简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) -- [Docker 和 虚拟机 有什么区别?](#Docker 和 虚拟机 有什么区别) -- [每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版)?](#每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版)) -- [Vim 编辑器vs Emacs编辑器?](#Vim 编辑器vs Emacs编辑器?) -- [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧?) -- [还有更多的Vim提示吗?](#还有更多的Vim提示吗?) -- [2FA是什么,为什么我需要使用它?](#2FA是什么,为什么我需要使用它?) -- [对于不同的Web浏览器有什么评价??](#对于不同的Web浏览器有什么评价?) +- [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) +- [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)) +- [Vim 编辑器vs Emacs编辑器?](#Vim编辑器vsEmacs编辑器) +- [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) +- [还有更多的Vim提示吗?](#还有更多的Vim提示吗) +- [2FA是什么,为什么我需要使用它?](#2FA是什么,为什么我需要使用它) +- [对于不同的Web浏览器有什么评价?](#对于不同的Web浏览器有什么评价) ## 学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等 @@ -33,62 +33,49 @@ video: 首先,不清楚你是不是真的需要熟悉这些 更底层的话题。 -当你开始编写更加底层的代码比如 实现或修改 内核 的时候,这些很重要。 -除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 +当你开始编写更加底层的代码比如 实现或修改 内核 的时候,这些很重要。除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 +学习这些的相关资源: -学习这些话题的资源: - -- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - Graduate level class on Operating System Engineering. Class materials are publicly available. 研究生阶段的操作系统课程(课程资料是公开的) -- Modern Operating Systems (4th ed) - Andrew S. Tanenbaum is a good overview of many of the mentioned concepts. **对很多上述概念都有很好的描述** -- The Design and Implementation of the FreeBSD Operating System - 关于FreeBSD OS 的好资源(注意,FreeBSD OS不是Linux) -- 其他的指南例如 [Writing an OS in Rust](https://os.phil-opp.com/) where people implement a kernel step by step in various languages, mostly for teaching purposes.这里用不同的语言 逐步实现了内核,主要用于教学的目的。 +- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - 研究生阶段的操作系统课程(课程资料是公开的)。 +- 现代操作系统 第四版(Modern Operating Systems 4th ed) - 作者是Andrew S. 
Tanenbaum 这本书对上述很多概念都有很好的描述。 +- FreeBSD的设计与实现(The Design and Implementation of the FreeBSD Operating System) - 关于FreeBSD OS 的好资源(注意,FreeBSD OS不是Linux)。 +- 其他的指南例如 [用 Rust写操作系统](https://os.phil-opp.com/) 这里用不同的语言 逐步实现了内核,主要用于教学的目的。 ## 你会优先学习的工具有那些? +值得优先学习的内容: - -值得优先学习的话题: - -- 学着更多去使用键盘而不是鼠标。这可以通过快捷键,更换接口等等。 - -- 学好编辑器。作为程序员你的大部分时间都是在编辑文件,因此学好这些技能会给你带来回报。 - +- 学着更多去使用键盘,更少使用鼠标。这可以通过快捷键,更换界面等等。 +- 学好编辑器。作为程序员你的大部分时间都是在编辑文件,因此值得学好这些技能。 - 学习怎样去自动化或简化工作流程中的重复任务。因为这会节省大量的时间。 +- 学习像Git之类的 版本控制工具并且知道如何使用它与GitHub结合,在现代的软件项目中协同工作。 -- 学习版本控制工具如Git 并且知道如何使用它与GitHub结合,在现代的软件项目中协同工作。 - -- Learning how to use your keyboard more and your mouse less. This can be through **keyboard shortcuts, changing interfaces, &c**. +## 使用Python VS Bash脚本 VS 其他语言? - -## 使用Python VS Bash脚本 VS 其他语言? - - - -通常来说,当你仅想要运行一系列的命令的时候,Bash 脚本对于简短的一次性脚本有用。Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: +通常来说,Bash 脚本对于简短的一次性脚本有用,比如当你想要运行一系列的命令的时候。Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: - Bash 可以获取简单的用例,但是很难获得全部可能的输入。例如,脚本参数中的空格会导致Bash 脚本出错。 -- Bash 对于 代码重用并不友好。因此,重用你先前已经写好代码部分很困难。Bash 中没有软件库的概念。 -- Bash依赖于一些 像`$?` 或 `$@`的特殊字符指代特殊的值。其他的语言会显式地引用,比如`exitCode` 或`sys.args`。 +- Bash 对于 代码重用并不友好。因此,重用你先前已经写好代码部分很困难。通常Bash 中没有软件库的概念。 +- Bash依赖于一些 像`$?` 或 `$@`的特殊字符指代特殊的值。其他的语言却会显式地引用,比如`exitCode` 或`sys.args`。 因此,对于大型或者更加复杂地脚本我们推荐使用更加成熟的脚本语言例如 Python 和 Ruby。 -你可以找到数不胜数的在线库,有人已经写好了去解决常见的问题用这种语言。 -你可以在网上找到很多用这些语言写的解决常见问题的库。 -如果你发现某种语言实现了你需要的特定功能的库,最好的方式就是直接去用那种语言 - +你可以找到数不胜数的用这些语言写的,用来解决常见的问题在线库。 +如果你发现某种语言实现了你需要的特定功能的库,最好的方式就是直接去使用那种语言 ## `source script.sh` 和`./script.sh`有什么区别 -两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话在执行这个命令。 -对于`source`命令来说,命令是在当前的bash会话种执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是自定义函数)都会保存在当前会话中。 -当像`./script.sh`这样独立运行脚本时,当前的bash会话将启动新的bash实例,并在该实例中运行命令`script.sh`。因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 -同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在使得在当前bash会话中定义这个函数。否则,如果您运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 - +两种情况下 `script.sh` 
都会在bash会话中被读取和执行,不同点在于哪个会话在执行这个命令。 +对于`source`命令来说,命令是在当前的bash会话中执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是自定义函数)都会留存在当前会话中。 +像`./script.sh`这样独立运行脚本时,当前的bash会话将启动新的bash实例,并在该实例中运行命令`script.sh`。 +因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 +同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在当前bash会话中定义这个函数。否则,如果你运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 ## 各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么? -根据你在命令行中运行的程序,这些包和工具会在`PATH`环境变量所列出的目录中找到 并且 你可以使用 `which`命令(或是`type`命令)来检查你的shell在哪里发现了特定的程序。 -一般来说,特定种类的文件存储有一定的规范,[文件系统,层次结构标准(Filesystem, Hierarchy Standard)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)可以查到我们讨论内容的详细列表 + +根据你在命令行中运行的程序,这些包和工具会全部在`PATH`环境变量所列出的目录中查找到, 你可以使用 `which`命令(或是`type`命令)来检查你的shell在哪里发现了特定的程序。 +一般来说,特定种类的文件存储有一定的规范,[文件系统,层次结构标准(Filesystem, Hierarchy Standard)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)可以查到我们讨论内容的详细列表。 - `/bin` - 基本命令二进制文件 - `/sbin` - 基本的系统二进制文件,通常是root运行的 @@ -107,116 +94,89 @@ video: ## 我应该用`apt-get install`还是`pip install` 去下载包呢? 
-这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题有关。需要考虑的几件事: +这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题相关。需要考虑的几件事: -- 通用软件包可以通过这两种方法获得,但是不太流行的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的工具的情况是更好的选择。 -- 同样,特定语言的程序包管理器相比系统程序包管理器更多的最新版本的程序包。 -- 当使用系统软件包管理器时,将在系统范围内安装库。这意味着,如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 -- 根据操作系统和硬件体系结构,其中一些软件包可能会附带二进制文件或可能需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于您的特定设置。 +- 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的工具的情况是更好的选择。 +- 同样,特定语言的程序包管理器相比系统程序包管理器有更多的最新版本的程序包。 +- 当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 +- 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或可能需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于您的特定设置。 你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以调试的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如Python的virtualenv)以避免影响全局环境。 +## 用于提高代码性能,简单好用的性能分析工具有哪些? +性能分析方面相当有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 -## 提高代码性能的最简单和最好的性能分析工具是什么? 
-性能分析方面最有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 - -对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算一切的时间花费以及所有调用堆栈,即哪个函数调用了另一个函数。然后,它会生成程序源代码的带注释的版本,其中包含每行花费的时间。但是,它会使程序速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出很好的可视化工具。你还应该使用针对编程语言或任务的特定的工具。例如,对于Web开发,Chrome和Firefox内置的开发工具具有出色的性能分析器。 - -有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,值得检查的是,关于硬件功能的理论速度的后验计算与实际读数没有偏差。 -**粗略计算理论速度,根据实际读数与硬件容量偏差不大** 也有专门的工具来分析系统调用中的等待时间。这些工具包括执行用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 之类的工具。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 +对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算一切的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成源代码的带注释的版本,其中包含每行花费的时间。但是,它会使程序速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出的可视化工具。你还应该使用针对编程语言或任务的特定的工具。例如,对于Web开发,Chrome和Firefox内置的开发工具具有出色的性能分析器。 - -Sometimes the slow part of your code will be because your system is waiting for an event like a disk read or a network packet. In those cases, it is worth checking that back-of-the-envelope calculations about the theoretical speed in terms of hardware capabilities do not deviate from the actual readings. There are also specialized tools to analyze the wait times in system calls. These include tools like [eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) that perform kernel tracing of user programs. In particular [`bpftrace`](https://github.com/iovisor/bpftrace) is worth checking out if you need to perform this sort of low level profiling. 
+有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度不偏离实际数值 , 也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 之类的工具。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 ## 你使用那些浏览器插件? 我们钟爱的插件主要与安全性与可用性有关: -- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[wide-spectrum](https://github.com/gorhill/uBlock/wiki/Blocking-mode)不仅可以拦截广告,还可以拦截第三方的页面。这也会覆盖内部的脚本和其他种类的加载资源。如果你打算花更多的时间去配置,前往[medium mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [hard mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。这些会使得一些网站停止工作直到你调整好了这些设置,这会显著提高你的网络安全。另外, [easy mode](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)已经很好了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 - -- [Stylus](https://github.com/openstyles/stylus/) -Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式表侧面加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边栏,更改背景颜色,甚至更改文字大小或字体样式。这可以使你得经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。例如,大多数常见的网站都有一个或几个深色主题样式。 - -- 全页屏幕捕获-内置于Firefox和 [Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 - -- [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) -该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页 并且/或 确保网站无法在它们之间共享信息。 -- 密码集成管理器-大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充著名网站的网络钓鱼攻击窃取登录凭据。 - +- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard 
mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。这些会使得一些网站停止工作直到你调整好了这些设置,但是这会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 +- [Stylus](https://github.com/openstyles/stylus/) - 是Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你得经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常见的网站都有一个或几个深色主题样式。 +- 全页屏幕捕获 - 内置于Firefox和 [Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 +- [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - 该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页 并且/或 确保网站无法在它们之间共享信息。 +- 密码集成管理器-大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充著名网站的网络钓鱼窃取登录凭据。 ## 有哪些有用的数据整理工具? 
-在数据整理课程中,我们没有时间讨论的一些数据整理工具包括`jq`或`pup`分别是用于JSON和HTML数据的专用解析器。Perl语言是用于更高级的数据整理管道的另一个很好的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为正确的列对齐文本。 -一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常宝贵的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的 [讲义](/2020/editors/#macros) (去年 [视频](/2019/editors/))中,有一个示例就是使用vim宏将XML格式的文件转换为JSON。 +在数据整理那一节课程中,我们没有时间讨论的一些数据整理工具包括分别用于JSON和HTML数据的专用解析器,`jq`和`pup`。Perl语言是另一个用于更高级的数据整理管道的很好的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为正确的对齐文本。 +一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的 [讲义](/2020/editors/#macros) (去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 -对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它使定义复杂的操作(如分组依据,联接或过滤器)变得非常容易;而且 而且还很容易绘制数据的不同属性。它还支持导出为多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种理论上[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 +对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它使得定义复杂的操作(如分组依据,联接或过滤器)变得非常容易, 而且还很容易绘制数据的不同属性。它还支持导出为多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种理论上[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 +## Docker和虚拟机有什么区别? +Docker 基于更加普遍的被称为容器的概念。关于容器和虚拟机之间最大的不同是 虚拟机会执行整个的 OS 栈,包括内核(即使这个内核和主机内核相同)。与虚拟机不同的是,容器避免运行其他内核实例 ,而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机像是在使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 +另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS上运行Docker,Docker需要启动Linux虚拟机去获取初始的Linux内核,这样的开销仍然很大。最后,Docker是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker容器在重启之间不会有以任何形式的存储。 -## Docker 和 虚拟机 有什么区别? 
+## 每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版) -Docker 基于更加普遍的概念,称为容器。关于容器和虚拟机之间最大的不同是 虚拟机会执行整个的 **OS 栈,包括内核(即使这个内核和主机内核相同)**。与虚拟机不同,容器避免运行其他内核实例 反而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机好像是使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 - -另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如如果你在macOS上运行Docker,Docker需要启动Linux虚拟机去获取初始的Linux内核,这样开销仍然很大。最后,Docker是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker容器在重启之间不会维持以任何形式的存储。 - - - -## 每种OS的优缺点是什么,我们如何选择(比如如何选择针对我们目的的最好Linux发行版) - -关于Linux发行版,尽管有很多版本,但大部分发行版在大多数使用情况下的表现是相同的。 -可以在任何发行版中学习Linux和UNIX的特性和其内部工作原理。 +关于Linux发行版,尽管有相当多的版本,但大部分发行版在大多数使用情况下的表现是相同的。 +可以在任何发行版中学习Linux与UNIX的特性和其内部工作原理。 发行版之间的根本区别是发行版如何处理软件包更新。 -某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的技术(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新要保守得多,因此更新会更稳定,但不能使用一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 +某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的技术(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 -Mac OS是介于Windows和Linux之间的一个很好的中间OS,它有很漂亮的界面。但是,Mac OS是基于BSD而不是Linux,因此系统的某些部分和命令是不同的。 +Mac OS是介于Windows和Linux之间的一个OS,它有很漂亮的界面。但是,Mac OS是基于BSD而不是Linux,因此系统的某些部分和命令是不同的。 另一种值得体验的是FreeBSD。虽然某些程序不能在FreeBSD上运行,但与Linux相比,BSD生态系统的碎片化程度要低得多,并且说明文档更加友好。 -除了开发Windows应用程序或需要某些在Windows上更好功能(例如对游戏的良好驱动程序支持)外,我们不建议使用Windows。 +除了开发Windows应用程序或需要使用某些Windows更好支持的功能(例如对游戏的驱动程序支持)外,我们不建议使用Windows。 对于双启动系统,我们认为最有效的实现是macOS的bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 - - - ## Vim 编辑器vs Emacs编辑器? -我们三个都使用vim作为我们的主要编辑器。但是Emacs也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs不遵循vim的模式编辑,但是这些功能可以通过Emacs插件 [Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好)。 + +我们三个都使用vim作为我们的主要编辑器。但是Emacs也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs不遵循vim的模式编辑,但是这些功能可以通过Emacs插件 像 [Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。 +Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好)。 ## 机器学习应用的提示或技巧? 
课程的一些经验可以直接用于机器学习程序。 - 就像许多科学学科一样,在机器学习中,你经常要进行一系列实验,并检查哪些数据有效,哪些无效。 - -你可以使用Shell轻松快速地搜索这些实验结果,并且以明智的方式汇总。这意味着在需要在给定的时间范围或在使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得非常简单。 - +你可以使用Shell轻松快速地搜索这些实验结果,并且以明智的方式汇总。这意味着在需要在给定的时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 最后,如果你不使用集群提交你的GPU作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 ## 还有更多的Vim提示吗? 更多的提示: -- 插件 - 花时间去探索插件。有很多不错的插件解决vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles - -- 标记 - 在vim里你可以使用 `m` 为字母 `X`做标记,之后你可以通过 `'`回到标记位置。这可以让你快速导航到文件内或跨文件间的特定位置。 - +- 插件 - 花时间去探索插件。有很多不错的插件解决vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 +- 标记 - 在vim里你可以使用 `m` 为字母 `X`做标记,之后你可以通过 `'`回到标记位置。这可以让你快速定位到文件内或文件间的特定位置。 - 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置向后向前移动。 - -- 撤销树 - vim 有不错的机制跟踪(文件)更改,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树 - -- 时间的撤销 - `:earlier` 和 `:later`命令使得你可以用时间参考而不是某一时刻的更改。 - -- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo)是一个默认未开启的vim的内置功能,在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 - -- 领导按键 - 领导按键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)和其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 -- 高级文本对象 - 文本对象像搜索也可以用vim命令构成,例如`d/`会删除下一处匹配pattern的位置 ,**`cgn`会改变最后搜索到的字符串的下一个存在。** - - - +- 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树 +- 时间撤销 - `:earlier` 和 `:later`命令使得你可以用时间参考而不是某一时刻的更改来定位文件。 +- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo)是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 +- 前缀键(Leader Key) - 
前缀键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 `<Leader> U` 去打开撤销树。 +- 高级文本对象 - 文本对象像搜索也可以用vim命令构成。例如,`d/<pattern>`会删除下一处匹配pattern的字符串 ,`cgn`可以用于更改上次搜索的关键字。 ## 2FA是什么,为什么我需要使用它? -双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道一些密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形,可以通过在接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 - +双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道一些密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 ## 对于不同的Web浏览器有什么评价? -2020的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chorme版本。不管是在性能还是可用性上,Chorme都是一款好的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chorme的在各方面不相上下,并且在隐私方面更加出色。有一款目前还没有完成Flow的浏览器**浏览器目前还没有完成**,它实现了全新的渲染引擎(rendering engine),可以保证比现有引擎快。 +2020年的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chrome版本。不管是在性能还是可用性上,Chrome都是一款好的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chrome在各方面不相上下,并且在隐私方面更加出色。 +有一款目前还没有完成的叫Flow的浏览器,它实现了全新的渲染引擎,有望比现有引擎速度更快。 From 35b4cb1f0938062a84926d8ce007688f747a3c22 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:33:07 +0800 Subject: [PATCH 407/640] Update qa.md --- _2020/qa.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 8b650ebc..acff0236 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -10,12 +10,12 @@ video: 最后一节课,我们回答学生提出的问题: -- [学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等) +- 
[学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等) - 
[你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言) +- [使用Python VS Bash脚本 VS 其他语言?](#使用Python-VS- Bash脚本-VS-其他语言) - [`source script.sh` 和`./script.sh`有什么区别?](#`source script.sh` 和`./script.sh`有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或 `/lib` 是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用`apt-get install`还是`pip install`去下载包呢) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get+install还是pip+install去下载包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能,简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) @@ -28,7 +28,7 @@ video: - [对于不同的Web浏览器有什么评价?](#对于不同的Web浏览器有什么评价) -## 学习操作系统相关话题的推荐,比如 进程,虚拟内存,中断,内存管理等 +## 学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等 @@ -52,7 +52,7 @@ video: - 学习怎样去自动化或简化工作流程中的重复任务。因为这会节省大量的时间。 - 学习像Git之类的 版本控制工具并且知道如何使用它与GitHub结合,在现代的软件项目中协同工作。 -## 使用Python VS Bash脚本 VS 其他语言? +## 使用Python VS Bash脚本 VS 其他语言? 通常来说,Bash 脚本对于简短的一次性脚本有用,比如当你想要运行一系列的命令的时候。Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: From eef4533c5a7c7ddf2c941773351c06572ad2b3d3 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:42:36 +0800 Subject: [PATCH 408/640] Update qa.md --- _2020/qa.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index acff0236..16d0197b 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -10,12 +10,13 @@ video: 最后一节课,我们回答学生提出的问题: -- [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等) + +- [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐 比如进程 虚拟内存 中断 内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用Python-VS- Bash脚本-VS-其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#`source script.sh` 和`./script.sh`有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或 `/lib` 是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get+install还是pip+install去下载包呢) +- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh 和./script.sh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?/bin 或 /lib 是什么) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get install还是pip install去下载包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能,简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) From bcf4507179980564bad93c8cc5084a793f0b6a53 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:47:54 +0800 Subject: [PATCH 409/640] Update qa.md --- _2020/qa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 16d0197b..bd625961 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -14,8 +14,8 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐 比如进程 虚拟内存 中断 内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用Python-VS- Bash脚本-VS-其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh 和./script.sh有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?/bin 或 /lib 是什么) +- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh 和script.sh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?-bin 或 lib 是什么) - [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get install还是pip install去下载包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能,简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) From 7759c510019e228285c283aea9b0b258d0efd1f3 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:51:10 +0800 Subject: [PATCH 410/640] Update qa.md --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index bd625961..dcd043e3 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -25,7 +25,7 @@ video: - [Vim 编辑器vs Emacs编辑器?](#Vim编辑器vsEmacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的Vim提示吗?](#还有更多的Vim提示吗) -- [2FA是什么,为什么我需要使用它?](#2FA是什么,为什么我需要使用它) +- [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) - [对于不同的Web浏览器有什么评价?](#对于不同的Web浏览器有什么评价) From a76cb84cc61ed2da35a133a15d6d4c29997950e9 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:53:27 +0800 Subject: [PATCH 411/640] Update qa.md --- _2020/qa.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index dcd043e3..e04ec7f6 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -11,17 +11,17 @@ video: 最后一节课,我们回答学生提出的问题: -- [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐 比如进程 虚拟内存 中断 内存管理等) +- [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用Python-VS- Bash脚本-VS-其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh 和script.sh有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里?引用过程是怎样的?-bin 或 lib 是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get install还是pip install去下载包呢) -- [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能,简单好用的性能分析工具有哪些) +- [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVS Bash脚本VS其他语言) +- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh和script.sh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin 或lib是什么) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-getinstall还是pipinstall去下载包呢) +- [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) -- [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)) +- [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) - [Vim 编辑器vs Emacs编辑器?](#Vim编辑器vsEmacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的Vim提示吗?](#还有更多的Vim提示吗) From 11a6a2787119116b1561733e79aa35f0a880facc Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:55:55 +0800 Subject: [PATCH 412/640] Update qa.md --- _2020/qa.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index e04ec7f6..01b308f0 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -14,15 +14,15 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVS Bash脚本VS其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#source script.sh和script.sh有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin 或lib是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-getinstall还是pipinstall去下载包呢) +- [`source script.sh` 和`./script.sh`有什么区别?](#sourcescript.sh和script.sh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用aptgetinstall还是pipinstall去下载包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) - [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器vs Emacs编辑器?](#Vim编辑器vsEmacs编辑器) +- [Vim 编辑器vs Emacs编辑器?](#vim编辑器vsemacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的Vim提示吗?](#还有更多的Vim提示吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) From 3238d7fcc356d63fdabd74d832ba6ef23c34e9f1 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 14:59:20 +0800 Subject: [PATCH 413/640] Update qa.md --- _2020/qa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 01b308f0..44728275 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -14,9 +14,9 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVS Bash脚本VS其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#sourcescript.sh和script.sh有什么区别) +- [`source script.sh` 和`./script.sh`有什么区别?](#source-script.sh和script.sh有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用aptgetinstall还是pipinstall去下载包呢) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用aptget-install还是pip-install去下载包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) From da3400c17014c8a0abd4c9642685d91e5f49e72f Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 15:08:01 +0800 Subject: [PATCH 414/640] Update qa.md --- _2020/qa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 44728275..07e56894 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -13,10 +13,10 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVS Bash脚本VS其他语言) +- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言) - [`source script.sh` 和`./script.sh`有什么区别?](#source-script.sh和script.sh有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用aptget-install还是pip-install去下载包呢) +- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install--去下载软件包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) From 0ee0b92c37f019a67c81458aec76ea4e3fcf76c0 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 15:09:05 +0800 Subject: [PATCH 415/640] Update qa.md --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index 07e56894..6a227adc 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -13,7 +13,7 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用Python VS Bash脚本 VS 其他语言) +- [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVSBash脚本VS其他语言) - [`source script.sh` 和`./script.sh`有什么区别?](#source-script.sh和script.sh有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) - [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install--去下载软件包呢) From ad98e89826585267e6a41e91d3beb600912b874c Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 15:13:01 +0800 Subject: [PATCH 416/640] Update qa.md --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index 6a227adc..6e4bf70f 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -22,7 +22,7 @@ video: - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) - [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器vs Emacs编辑器?](#vim编辑器vsemacs编辑器) +- [Vim 编辑器vs Emacs编辑器?](#vim-编辑器vs-emacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的Vim提示吗?](#还有更多的Vim提示吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) From 81a106577f08ae7742f7f8db377d31901918036b Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 15:16:53 +0800 Subject: [PATCH 417/640] Update qa.md --- _2020/qa.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 6e4bf70f..502fe058 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -13,7 +13,7 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用PythonVSBash脚本VS其他语言) +- [使用Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) - [`source script.sh` 和`./script.sh`有什么区别?](#source-script.sh和script.sh有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) - [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install--去下载软件包呢) @@ -22,7 +22,7 @@ video: - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) - [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器vs Emacs编辑器?](#vim-编辑器vs-emacs编辑器) +- [Vim 编辑器 VS Emacs编辑器?](#vim-编辑器-vs-emacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的Vim提示吗?](#还有更多的Vim提示吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) @@ -148,7 +148,7 @@ Mac OS是介于Windows和Linux之间的一个OS,它有很漂亮的界面。但 对于双启动系统,我们认为最有效的实现是macOS的bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 -## Vim 编辑器vs Emacs编辑器? +## Vim 编辑器 VS Emacs编辑器? 我们三个都使用vim作为我们的主要编辑器。但是Emacs也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs不遵循vim的模式编辑,但是这些功能可以通过Emacs插件 像 [Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。 Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好)。 From dcdfd4b0cc6285b357e8fd13016e8508ec907611 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 15:29:15 +0800 Subject: [PATCH 418/640] Update qa.md --- _2020/qa.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 502fe058..783f5ea8 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -14,9 +14,9 @@ video: - [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#source-script.sh和script.sh有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin或lib是什么) -- [我应该用`apt-get install`还是`pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install--去下载软件包呢) +- [`source script.sh` 和`./script.sh`有什么区别?](#source-scriptsh-和scriptsh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或`/lib`是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin-或lib是什么) +- [我应该用`apt-get install`还是`pip install`去下载软件包呢?](#我应该用apt-get-install还是pip-install去下载软件包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) @@ -73,7 +73,7 @@ video: 因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在当前bash会话中定义这个函数。否则,如果你运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 -## 各种软件包和工具存储在哪里? 引用过程是怎样的? `/bin` 或 `/lib` 是什么? +## 各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或`/lib`是什么? 根据你在命令行中运行的程序,这些包和工具会全部在`PATH`环境变量所列出的目录中查找到, 你可以使用 `which`命令(或是`type`命令)来检查你的shell在哪里发现了特定的程序。 一般来说,特定种类的文件存储有一定的规范,[文件系统,层次结构标准(Filesystem, Hierarchy Standard)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)可以查到我们讨论内容的详细列表。 @@ -93,7 +93,7 @@ video: + `/usr/local/bin` - 用户编译程序的二进制文件 - `/var` -变量文件 像日志或缓存 -## 我应该用`apt-get install`还是`pip install` 去下载包呢? +## 我应该用`apt-get install`还是`pip install`去下载软件包呢? 
这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题相关。需要考虑的几件事: From bf0cf5dadc72e66472ec6337de9cc70346fca569 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 16:40:54 +0800 Subject: [PATCH 419/640] Update qa.md --- _2020/qa.md | 74 ++++++++++++++++++++++++++--------------------------- 1 file changed, 37 insertions(+), 37 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 783f5ea8..29d6258f 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -11,7 +11,7 @@ video: 最后一节课,我们回答学生提出的问题: -- [学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关话题的推荐比如进程虚拟内存中断内存管理等) +- [学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关内容的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) - [`source script.sh` 和`./script.sh`有什么区别?](#source-scriptsh-和scriptsh有什么区别) @@ -29,14 +29,14 @@ video: - [对于不同的Web浏览器有什么评价?](#对于不同的Web浏览器有什么评价) -## 学习操作系统相关话题的推荐,比如进程,虚拟内存,中断,内存管理等 +## 学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等 首先,不清楚你是不是真的需要熟悉这些 更底层的话题。 -当你开始编写更加底层的代码比如 实现或修改 内核 的时候,这些很重要。除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 +当你开始编写更加底层的代码,比如实现或修改内核 的时候,这些很重要。除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 -学习这些的相关资源: +学习资源: - [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - 研究生阶段的操作系统课程(课程资料是公开的)。 - 现代操作系统 第四版(Modern Operating Systems 4th ed) - 作者是Andrew S. Tanenbaum 这本书对上述很多概念都有很好的描述。 @@ -48,28 +48,28 @@ video: 值得优先学习的内容: -- 学着更多去使用键盘,更少使用鼠标。这可以通过快捷键,更换界面等等。 +- 学着更多去使用键盘,更少使用鼠标。可以用快捷键,更换界面等等。 - 学好编辑器。作为程序员你的大部分时间都是在编辑文件,因此值得学好这些技能。 - 学习怎样去自动化或简化工作流程中的重复任务。因为这会节省大量的时间。 -- 学习像Git之类的 版本控制工具并且知道如何使用它与GitHub结合,在现代的软件项目中协同工作。 +- 学习像Git之类的 版本控制工具并且知道如何与GitHub结合,在现代的软件项目中协同工作。 ## 使用Python VS Bash脚本 VS 其他语言? 
-通常来说,Bash 脚本对于简短的一次性脚本有用,比如当你想要运行一系列的命令的时候。Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: +通常来说,Bash 脚本对于简短的一次性脚本有效,比如当你想要运行一系列的命令的时候。但是Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: - Bash 可以获取简单的用例,但是很难获得全部可能的输入。例如,脚本参数中的空格会导致Bash 脚本出错。 -- Bash 对于 代码重用并不友好。因此,重用你先前已经写好代码部分很困难。通常Bash 中没有软件库的概念。 +- Bash 对于代码重用并不友好。因此,重用你先前已经写好代码部分很困难。通常Bash 中没有软件库的概念。 - Bash依赖于一些 像`$?` 或 `$@`的特殊字符指代特殊的值。其他的语言却会显式地引用,比如`exitCode` 或`sys.args`。 -因此,对于大型或者更加复杂地脚本我们推荐使用更加成熟的脚本语言例如 Python 和 Ruby。 -你可以找到数不胜数的用这些语言写的,用来解决常见的问题在线库。 -如果你发现某种语言实现了你需要的特定功能的库,最好的方式就是直接去使用那种语言 +因此,对于大型或者更加复杂的脚本我们推荐使用更加成熟的脚本语言例如 Python 和 Ruby。 +你可以找到很多用这些语言编写的,用来解决常见问题的在线库。 +如果你发现某种语言实现了你需要的特定功能库,最好的方式就是直接去使用那种语言。 -## `source script.sh` 和`./script.sh`有什么区别 +## `source script.sh` 和`./script.sh`有什么区别? -两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话在执行这个命令。 -对于`source`命令来说,命令是在当前的bash会话种执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是自定义函数)都会留存在当前会话中。 -单独运行`./script.sh`这样独立运行脚本时,当前的bash会话将启动新的bash实例,并在该实例中运行命令`script.sh`。 +两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话执行这个命令。 +对于`source`命令来说,命令是在当前的bash会话种执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 +单独运行`./script.sh`时,当前的bash会话将启动新的bash实例,并在新实例中运行命令`script.sh`。 因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在当前bash会话中定义这个函数。否则,如果你运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 @@ -97,37 +97,37 @@ video: 这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题相关。需要考虑的几件事: -- 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的工具的情况是更好的选择。 +- 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的程序包管理器是更好的选择。 - 同样,特定语言的程序包管理器相比系统程序包管理器有更多的最新版本的程序包。 -- 当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 -- 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或可能需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于您的特定设置。 +- 
当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以用特定语言的程序包管理器安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 +- 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或者软件包需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和软件包需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于你的特定设置。 你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以调试的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如Python的virtualenv)以避免影响全局环境。 ## 用于提高代码性能,简单好用的性能分析工具有哪些? 性能分析方面相当有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 -对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算一切的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成源代码的带注释的版本,其中包含每行花费的时间。但是,它会使程序速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出的可视化工具。你还应该使用针对编程语言或任务的特定的工具。例如,对于Web开发,Chrome和Firefox内置的开发工具具有出色的性能分析器。 +对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算所有的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成带注释的代码版本,其中包含每行花费的时间。但是,它会使程序运行速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出的可视化工具。你还可以使用针对特定编程语言或任务的工具。例如,对于Web开发而言,Chrome和Firefox内置的开发工具具有出色的性能分析器。 -有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度不偏离实际数值 , 也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 之类的工具。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 +有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度是否不偏离实际数值 , 也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 ## 你使用那些浏览器插件? 
我们钟爱的插件主要与安全性与可用性有关: -- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。这些会使得一些网站停止工作直到你调整好了这些设置,但是这会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 -- [Stylus](https://github.com/openstyles/stylus/) - 是Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你得经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常见的网站都有一个或几个深色主题样式。 +- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。在你调整好设置之前一些网站会停止工作,但是这些配置会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 +- [Stylus](https://github.com/openstyles/stylus/) - 是Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常见的网站都有一个或几个深色主题样式。 - 全页屏幕捕获 - 内置于Firefox和 [Chrome 
扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 - [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - 该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页 并且/或 确保网站无法在它们之间共享信息。 - 密码集成管理器-大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充著名网站的网络钓鱼窃取登录凭据。 ## 有哪些有用的数据整理工具? -在数据整理那一节课程中,我们没有时间讨论的一些数据整理工具包括分别用于JSON和HTML数据的专用解析器,`jq`和`pup`。Perl语言是另一个用于更高级的数据整理管道的很好的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为正确的对齐文本。 +在数据整理那一节课程中,我们没有时间讨论一些数据整理工具,包括分别用于JSON和HTML数据的专用解析器,`jq`和`pup`。Perl语言是另一个更高级的用于数据整理管道的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为对齐的文本。 一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的 [讲义](/2020/editors/#macros) (去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 -对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它使得定义复杂的操作(如分组依据,联接或过滤器)变得非常容易, 而且还很容易绘制数据的不同属性。它还支持导出为多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种理论上[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 +对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它让复杂操作的定义(如分组依据,联接或过滤器)变得非常容易, 而且还便于根据不同属性绘制数据。它还支持导出多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种有争议的[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 ## Docker和虚拟机有什么区别? @@ -135,12 +135,12 @@ Docker 基于更加普遍的被称为容器的概念。关于容器和虚拟机 另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS上运行Docker,Docker需要启动Linux虚拟机去获取初始的Linux内核,这样的开销仍然很大。最后,Docker是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker容器在重启之间不会有以任何形式的存储。 -## 每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版) +## 每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)? 
关于Linux发行版,尽管有相当多的版本,但大部分发行版在大多数使用情况下的表现是相同的。 -可以在任何发行版中学习Linux与UNIX的特性和其内部工作原理。 +可以在任何发行版中学习Linux与UNIX的特性和内部工作原理。 发行版之间的根本区别是发行版如何处理软件包更新。 -某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的技术(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 +某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的软件包(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 Mac OS是介于Windows和Linux之间的一个OS,它有很漂亮的界面。但是,Mac OS是基于BSD而不是Linux,因此系统的某些部分和命令是不同的。 另一种值得体验的是FreeBSD。虽然某些程序不能在FreeBSD上运行,但与Linux相比,BSD生态系统的碎片化程度要低得多,并且说明文档更加友好。 @@ -157,27 +157,27 @@ Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语 课程的一些经验可以直接用于机器学习程序。 就像许多科学学科一样,在机器学习中,你经常要进行一系列实验,并检查哪些数据有效,哪些无效。 -你可以使用Shell轻松快速地搜索这些实验结果,并且以明智的方式汇总。这意味着在需要在给定的时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 +你可以使用Shell轻松快速地搜索这些实验结果,并且以合理的方式汇总。这意味着需要在限定时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 最后,如果你不使用集群提交你的GPU作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 ## 还有更多的Vim提示吗? 
更多的提示: -- 插件 - 花时间去探索插件。有很多不错的插件解决vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 +- 插件 - 花时间去探索插件。有很多不错的插件解决了vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 - 标记 - 在vim里你可以使用 `m` 为字母 `X`做标记,之后你可以通过 `'`回到标记位置。这可以让你快速定位到文件内或文件间的特定位置。 -- 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置向后向前移动。 +- 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置前后移动。 - 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树 -- 时间撤销 - `:earlier` 和 `:later`命令使得你可以用时间参考而不是某一时刻的更改来定位文件。 -- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo)是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 -- 前缀键(Leader Key) - 前缀键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 -- 高级文本对象 - 文本对象像搜索也可以用vim命令构成。例如,`d/`会删除下一处匹配pattern的字符串 ,`cgn`可以用于更改上次搜索的关键字。 +- 时间撤销 - `:earlier` 和 `:later`命令使得你可以用时间而非某一时刻的更改来定位文件。 +- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - 是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 +- 热键(Leader Key) - 热键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 +- 高级文本对象 - 文本对象比如搜索也可以用vim命令构成。例如,`d/`会删除下一处匹配pattern的字符串 ,`cgn`可以用于更改上次搜索的关键字。 ## 2FA是什么,为什么我需要使用它? 
-双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道一些密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 +双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 ## 对于不同的Web浏览器有什么评价? -2020的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chorme版本。不管是在性能还是可用性上,Chorme都是一款好的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chorme的在各方面不相上下,并且在隐私方面更加出色。 -有一款目前还没有完成的叫Flow的浏览器,它实现了全新的渲染引擎,可以保证比现有引擎速度更快。 +2020的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chorme版本。不管是在性能还是可用性上,Chorme都是一款很不错的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chorme的在各方面不相上下,并且在隐私方面更加出色。 +有一款目前还没有完成的叫Flow的浏览器,它实现了全新的渲染引擎,有望比现有引擎速度更快。 From 0b7786854375182ab2a3ba4d66727f88b5df7501 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Sun, 31 May 2020 16:50:36 +0800 Subject: [PATCH 420/640] Update qa.md --- _2020/qa.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 29d6258f..33349386 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -2,7 +2,7 @@ layout: lecture title: "提问&回答" date: 2019-01-30 -ready: false +ready: true video: aspect: 56.25 id: Wz50FvGG6xU @@ -99,7 +99,7 @@ video: - 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的程序包管理器是更好的选择。 - 同样,特定语言的程序包管理器相比系统程序包管理器有更多的最新版本的程序包。 -- 当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此您可以用特定语言的程序包管理器安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 +- 
当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此你可以用特定语言的程序包管理器安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 - 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或者软件包需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和软件包需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于你的特定设置。 你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以调试的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如Python的virtualenv)以避免影响全局环境。 @@ -175,7 +175,7 @@ Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语 ## 2FA是什么,为什么我需要使用它? -双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,您不仅需要知道密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 +双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,你不仅需要知道密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 ## 对于不同的Web浏览器有什么评价? 
From 48dda553f055e7ad3349ddfb0a3d3b0640ba1613 Mon Sep 17 00:00:00 2001 From: xlfsummer <20750969+xlfsummer@users.noreply.github.com> Date: Tue, 2 Jun 2020 01:28:06 +0800 Subject: [PATCH 421/640] fix typos and styles --- _2020/course-shell.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 0128a350..1c93fbc8 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -61,7 +61,7 @@ s课后我们会安排答疑的时间来回答您的问题。如果您参加的 missing:~$ ``` -这是shell最主要的文本接口。它告诉你,你的主机名是 `missing` 并且您当前的工作目录("current working directory")或者说您当前所在的位置是`~` (表示 "home")。 `$`符号表示您现在的身份不是root用户(稍后会介绍)。在找个提示符中,您可以输入 _命令_ ,命令最终会被shell解析。最简单的命令是执行一个程序: +这是shell最主要的文本接口。它告诉你,你的主机名是 `missing` 并且您当前的工作目录("current working directory")或者说您当前所在的位置是`~` (表示 "home")。 `$`符号表示您现在的身份不是root用户(稍后会介绍)。在这个提示符中,您可以输入 _命令_ ,命令最终会被shell解析。最简单的命令是执行一个程序: ```console missing:~$ date @@ -69,7 +69,7 @@ Fri 10 Jan 2020 11:49:31 AM EST missing:~$ ``` -这里,我们执行了 `date` 找个程序,不出意料地,它打印出了当前的日前和时间。然后,shell等待我们输入其他命令。我们可以在执行命令的同时向程序传递 _参数_ : +这里,我们执行了 `date` 这个程序,不出意料地,它打印出了当前的日前和时间。然后,shell等待我们输入其他命令。我们可以在执行命令的同时向程序传递 _参数_ : ```console missing:~$ echo hello @@ -95,7 +95,7 @@ missing:~$ /bin/echo $PATH ## 在shell中导航 -shell中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是`\`。路径 `/`代表的是系统的根目录,所有的文件夹都包括在找个路径之下,在Windows上每个盘都有一个根目录(例如: +shell中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是`\`。路径 `/`代表的是系统的根目录,所有的文件夹都包括在这个路径之下,在Windows上每个盘都有一个根目录(例如: `C:\`)。 我们假设您在学习本课程时使用的是Linux文件系统。如果某个路径以`/` 开头,那么它是一个 _绝对路径_,其他的都术语 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: ```console @@ -182,7 +182,7 @@ missing:~$ cat hello2.txt hello ``` -您还可以使用 `>>` 来向一个文件追加内容。使用管道( _pipes_),我们能够更好的利用文件重定向。 +您还可以使用 `>>` 来向一个文件追加内容。使用管道( _pipes_ ),我们能够更好的利用文件重定向。 `|`操作符允许我们将一个程序的输出和另外一个程序的输入连接起来: ```console @@ -197,12 +197,12 @@ missing:~$ curl --head --silent google.com | grep --ignore-case content-length | ## 
一个功能全面又强大的工具 对于大多数的类Unix系统,有一类用户是非常特殊的,那就是:根用户(root用户)。 -您应该已经注意到来,在上面的输出结果中,根用户几乎不受任何限制,他可以创建、读取、更新和删除系统中的任何文件。 -通常在我们并不会以根用户的身份直接登陆系统,因为这样可能会因为某些错误的操作而破坏系统。 +您应该已经注意到了,在上面的输出结果中,根用户几乎不受任何限制,他可以创建、读取、更新和删除系统中的任何文件。 +通常在我们并不会以根用户的身份直接登录系统,因为这样可能会因为某些错误的操作而破坏系统。 取而代之的是我们会在需要的时候使用 `sudo` 命令。顾名思义,它的作用是让您可以以su(super user 或 root的简写)的身份do一些事情。 当您遇到拒绝访问(permission denied)的错误时,通常是因为此时您必须是根用户才能操作。此时也请再次确认您是真的要执行此操作。 -有一件事情是您必须作为根用户才能做的,那就是向`sysfs` 文件写入内容。系统被挂在在`/sys`下, `sysfs` 文件则暴露了一些内核(kernel)参数。 +有一件事情是您必须作为根用户才能做的,那就是向`sysfs` 文件写入内容。系统被挂载在`/sys`下, `sysfs` 文件则暴露了一些内核(kernel)参数。 因此,您不需要借助任何专用的工具,就可以轻松地在运行期间配置系统内核。**注意 Windows or macOS没有这个文件** 例如,您笔记本电脑的屏幕亮度写在 `brightness` 文件中,它位于 @@ -240,7 +240,7 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness # 接下来..... -学到这里,您掌握对shell知识已经可以完成一些基础对任务了。您应该已经可以查找感兴趣对文件并使用大多数程序对基本功能了。 +学到这里,您掌握的shell知识已经可以完成一些基础的任务了。您应该已经可以查找感兴趣的文件并使用大多数程序的基本功能了。 在下一场讲座中,我们会探讨如何利用shell及其他工具执行并自动化更复杂的任务。 # 课后练习 @@ -259,7 +259,7 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness 1. 尝试执行这个文件。例如,将该脚本的路径(`./semester`)输入到您的shell中并回车。如果程序无法执行,请使用 `ls`命令来获取信息并理解其不能执行的原因。 2. 查看 `chmod` 的手册(例如,使用`man chmod`命令) -3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用`sh semester`来执行该程序。您的shell是如何知晓,这个文件需要使用`sh`来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) +3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用`sh semester`来执行该程序。您的shell是如何知晓这个文件需要使用`sh`来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) 4. 
使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入根目录下的 `last-modified.txt` 的文件中 From 70226f7ea8bdec566745511d315f26bf576b1da6 Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Fri, 5 Jun 2020 11:11:51 +0800 Subject: [PATCH 422/640] Update qa.md --- _2020/qa.md | 158 ++++++++++++++++++++++++++-------------------------- 1 file changed, 80 insertions(+), 78 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 33349386..ff1c9b3f 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -8,176 +8,178 @@ video: id: Wz50FvGG6xU --- + + 最后一节课,我们回答学生提出的问题: -- [学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关内容的推荐比如进程虚拟内存中断内存管理等) +- [学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关内容的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) -- [`source script.sh` 和`./script.sh`有什么区别?](#source-scriptsh-和scriptsh有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或`/lib`是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin-或lib是什么) -- [我应该用`apt-get install`还是`pip install`去下载软件包呢?](#我应该用apt-get-install还是pip-install去下载软件包呢) +- [使用 Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) +- [ `source script.sh` 和 `./script.sh` 有什么区别?](#source-scriptsh-和scriptsh有什么区别) +- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin-或lib是什么) +- [我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install去下载软件包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) -- [每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#每种OS的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器 VS Emacs编辑器?](#vim-编辑器-vs-emacs编辑器) +- [不同操作系统的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#不同操作系统的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) +- [Vim 编辑器 VS Emacs 编辑器?](#vim-编辑器-vs-emacs编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) -- [还有更多的Vim提示吗?](#还有更多的Vim提示吗) +- [还有更多的 Vim 小窍门吗?](#还有更多的Vim小窍门吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) -- [对于不同的Web浏览器有什么评价?](#对于不同的Web浏览器有什么评价) +- [对于不同的 Web 浏览器有什么评价?](#对于不同的Web浏览器有什么评价) ## 学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等 -首先,不清楚你是不是真的需要熟悉这些 更底层的话题。 -当你开始编写更加底层的代码,比如实现或修改内核 的时候,这些很重要。除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 +首先,不清楚你是不是真的需要了解这些更底层的话题。 +当你开始编写更加底层的代码,比如实现或修改内核的时候,这些内容是很重要的。除了其他课程中简要介绍过的进程和信号量之外,大部分话题都不相关。 学习资源: -- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - 研究生阶段的操作系统课程(课程资料是公开的)。 -- 现代操作系统 第四版(Modern Operating Systems 4th ed) - 作者是Andrew S. Tanenbaum 这本书对上述很多概念都有很好的描述。 -- FreeBSD的设计与实现(The Design and Implementation of the FreeBSD Operating System) - 关于FreeBSD OS 的好资源(注意,FreeBSD OS不是Linux)。 -- 其他的指南例如 [用 Rust写操作系统](https://os.phil-opp.com/) 这里用不同的语言 逐步实现了内核,主要用于教学的目的。 +- [MIT's 6.828 class](https://pdos.csail.mit.edu/6.828/) - 研究生阶段的操作系统课程(课程资料是公开的)。 +- 现代操作系统 第四版(*Modern Operating Systems 4th ed*) - 作者是Andrew S. Tanenbaum 这本书对上述很多概念都有很好的描述。 +- FreeBSD的设计与实现(*The Design and Implementation of the FreeBSD Operating System*) - 关于FreeBSD OS 不错的资源(注意,FreeBSD OS 不是 Linux)。 +- 其他的指南例如 [用 Rust 写操作系统](https://os.phil-opp.com/) 这里用不同的语言逐步实现了内核,主要用于教学的目的。 ## 你会优先学习的工具有那些? 
值得优先学习的内容: -- 学着更多去使用键盘,更少使用鼠标。可以用快捷键,更换界面等等。 -- 学好编辑器。作为程序员你的大部分时间都是在编辑文件,因此值得学好这些技能。 +- 多去使用键盘,少使用鼠标。这一目标可以通过多加利用快捷键,更换界面等来实现。 +- 学好编辑器。作为程序员你大部分时间都是在编辑文件,因此值得学好这些技能。 - 学习怎样去自动化或简化工作流程中的重复任务。因为这会节省大量的时间。 -- 学习像Git之类的 版本控制工具并且知道如何与GitHub结合,在现代的软件项目中协同工作。 +- 学习像 Git 之类的版本控制工具并且知道如何与 GitHub 结合,以便在现代的软件项目中协同工作。 -## 使用Python VS Bash脚本 VS 其他语言? +## 使用 Python VS Bash脚本 VS 其他语言? -通常来说,Bash 脚本对于简短的一次性脚本有效,比如当你想要运行一系列的命令的时候。但是Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用Bash实现: +通常来说,Bash 脚本对于简短的一次性脚本有效,比如当你想要运行一系列的命令的时候。但是Bash 脚本有一些比较奇怪的地方,这使得大型程序或脚本难以用 Bash 实现: - Bash 可以获取简单的用例,但是很难获得全部可能的输入。例如,脚本参数中的空格会导致Bash 脚本出错。 -- Bash 对于代码重用并不友好。因此,重用你先前已经写好代码部分很困难。通常Bash 中没有软件库的概念。 -- Bash依赖于一些 像`$?` 或 `$@`的特殊字符指代特殊的值。其他的语言却会显式地引用,比如`exitCode` 或`sys.args`。 +- Bash 对于代码重用并不友好。因此,重用你先前已经写好的代码很困难。通常 Bash 中没有软件库的概念。 +- Bash 依赖于一些像 `$?` 或 `$@` 的特殊字符指代特殊的值。其他的语言却会显式地引用,比如 `exitCode` 或 `sys.args`。 因此,对于大型或者更加复杂的脚本我们推荐使用更加成熟的脚本语言例如 Python 和 Ruby。 你可以找到很多用这些语言编写的,用来解决常见问题的在线库。 -如果你发现某种语言实现了你需要的特定功能库,最好的方式就是直接去使用那种语言。 +如果你发现某种语言实现了你所需要的特定功能库,最好的方式就是直接去使用那种语言。 -## `source script.sh` 和`./script.sh`有什么区别? +## `source script.sh` 和 `./script.sh` 有什么区别? 两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话执行这个命令。 -对于`source`命令来说,命令是在当前的bash会话种执行的,因此当`source`执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 -单独运行`./script.sh`时,当前的bash会话将启动新的bash实例,并在新实例中运行命令`script.sh`。 -因此,如果`script.sh`更改目录,新的bash实例会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 -同样,如果`script.sh`定义了要在终端中访问的函数,需要用`source`命令在当前bash会话中定义这个函数。否则,如果你运行`./script.sh`,只有新的bash进程才能执行定义的函数,而当前的shell不能。 +对于 `source` 命令来说,命令是在当前的bash会话种执行的,因此当 `source` 执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 +单独运行 `./script.sh` 时,当前的bash会话将启动新的bash会话(实例),并在新实例中运行命令 `script.sh`。 +因此,如果 `script.sh` 更改目录,新的bash会话(实例)会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 +同样,如果 `script.sh` 定义了要在终端中访问的函数,需要用 `source` 命令在当前bash会话中定义这个函数。否则,如果你运行 `./script.sh`,只有新的bash会话(进程)才能执行定义的函数,而当前的shell不能。 -## 各种软件包和工具存储在哪里?引用过程是怎样的?`/bin` 或`/lib`是什么? +## 各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么? 
-根据你在命令行中运行的程序,这些包和工具会全部在`PATH`环境变量所列出的目录中查找到, 你可以使用 `which`命令(或是`type`命令)来检查你的shell在哪里发现了特定的程序。 +根据你在命令行中运行的程序,这些包和工具会全部在 `PATH` 环境变量所列出的目录中查找到, 你可以使用 `which` 命令(或是 `type` 命令)来检查你的shell在哪里发现了特定的程序。 一般来说,特定种类的文件存储有一定的规范,[文件系统,层次结构标准(Filesystem, Hierarchy Standard)](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard)可以查到我们讨论内容的详细列表。 - `/bin` - 基本命令二进制文件 - `/sbin` - 基本的系统二进制文件,通常是root运行的 -- `/dev` - 设备文件,通常是硬件设备接口文件 +- `/dev` - 设备文件,通常是硬件设备接口文件 - `/etc` - 主机特定的系统配置文件 - `/home` - 系统用户的家目录 - `/lib` - 系统软件通用库 - `/opt` - 可选的应用软件 -- `/sys` - 包含系统的信息和配置( [第一堂课](/2020/course-shell/)介绍的) -- `/tmp` - 临时文件 (`/var/tmp`) 通常在重启之间删除 -- `/usr/` - 只读的用户数据 - + `/usr/bin` - 非必须的命令二进制文件 - + `/usr/sbin` - 非必须的系统二进制文件,通常是由root运行的 - + `/usr/local/bin` - 用户编译程序的二进制文件 +- `/sys` - 包含系统的信息和配置([第一堂课](/2020/course-shell/)介绍的) +- `/tmp` - 临时文件( `/var/tmp` ) 通常在重启之间删除 +- `/usr/` - 只读的用户数据 + + `/usr/bin` - 非必须的命令二进制文件 + + `/usr/sbin` - 非必须的系统二进制文件,通常是由root运行的 + + `/usr/local/bin` - 用户编译程序的二进制文件 - `/var` -变量文件 像日志或缓存 -## 我应该用`apt-get install`还是`pip install`去下载软件包呢? +## 我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢? 
这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题相关。需要考虑的几件事: - 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的程序包管理器是更好的选择。 - 同样,特定语言的程序包管理器相比系统程序包管理器有更多的最新版本的程序包。 -- 当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此你可以用特定语言的程序包管理器安装不同版本的库而不会发生冲突。对于Python,有virtualenv,对于Ruby,有RVM。 -- 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或者软件包需要编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和软件包需要编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于你的特定设置。 -你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以调试的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如Python的virtualenv)以避免影响全局环境。 +- 当使用系统软件包管理器时,将在系统范围内安装库。如果出于开发目的需要不同版本的库,则系统软件包管理器可能不能满足你的需要。对于这种情况,大多数编程语言都提供了隔离或虚拟环境,因此你可以用特定语言的程序包管理器安装不同版本的库而不会发生冲突。对于 Python,可以使用 virtualenv,对于 Ruby,使用 RVM 。 +- 根据操作系统和硬件架构,其中一些软件包可能会附带二进制文件或者软件包需要被编译。例如,在树莓派(Raspberry Pi)之类的ARM架构计算机中,在软件附带二进制文件和软件包需要被编译的情况下,使用系统包管理器比特定语言包管理器更好。这在很大程度上取决于你的特定设置。 +你应该仅使用一种解决方案,而不同时使用两种方法,因为这可能会导致难以解决的冲突。我们的建议是尽可能使用特定语言的程序包管理器,并使用隔离的环境(例如 Python 的 virtualenv)以避免影响全局环境。 ## 用于提高代码性能,简单好用的性能分析工具有哪些? 
性能分析方面相当有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 -对于更高级的工具,Valgrind的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算所有的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成带注释的代码版本,其中包含每行花费的时间。但是,它会使程序运行速度降低一个数量级,并且不支持线程。对于其他情况,[`perf`](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对于采样分析器输出的可视化工具。你还可以使用针对特定编程语言或任务的工具。例如,对于Web开发而言,Chrome和Firefox内置的开发工具具有出色的性能分析器。 +对于更高级的工具, Valgrind 的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算所有的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成带注释的代码版本,其中包含每行花费的时间。但是,它会使程序运行速度降低一个数量级,并且不支持线程。其他的,[ `perf` ](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对采样分析器输出结果的可视化工具。你还可以使用针对特定编程语言或任务的工具。例如,对于 Web 开发而言,Chrome 和 Firefox 内置的开发工具具有出色的性能分析器。 -有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度是否不偏离实际数值 , 也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 。如果需要低级的性能分析,[`bpftrace`](https://github.com/iovisor/bpftrace) 值得一试。 +有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度是否不偏离实际数值,也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 。如果需要低级的性能分析,[ `bpftrace` ](https://github.com/iovisor/bpftrace) 值得一试。 ## 你使用那些浏览器插件? 
我们钟爱的插件主要与安全性与可用性有关: -- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。在你调整好设置之前一些网站会停止工作,但是这些配置会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 -- [Stylus](https://github.com/openstyles/stylus/) - 是Stylish的分支(不要使用Stylish,它会 [窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常见的网站都有一个或几个深色主题样式。 -- 全页屏幕捕获 - 内置于Firefox和 [Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 -- [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - 该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页 并且/或 确保网站无法在它们之间共享信息。 -- 密码集成管理器-大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充著名网站的网络钓鱼窃取登录凭据。 +- [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。在你调整好设置之前一些网站会停止工作,但是这些配置会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 +- [Stylus](https://github.com/openstyles/stylus/) - 
是Stylish的分支(不要使用Stylish,它会[窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常用的网站都有一个或几个深色主题样式。 +- 全页屏幕捕获 - 内置于 Firefox 和 [ Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 +- [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - 该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页并且/或确保网站无法在它们之间共享信息。 +- 密码集成管理器 - 大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充网站的网络钓鱼窃取登录凭据。 ## 有哪些有用的数据整理工具? -在数据整理那一节课程中,我们没有时间讨论一些数据整理工具,包括分别用于JSON和HTML数据的专用解析器,`jq`和`pup`。Perl语言是另一个更高级的用于数据整理管道的工具。另一个技巧是使用`column -t`命令, 可用于将空格文本(不一定对齐)转换为对齐的文本。 +在数据整理那一节课程中,我们没有时间讨论一些数据整理工具,包括分别用于JSON和HTML数据的专用解析器, `jq` 和 `pup`。Perl语言是另一个更高级的可以用于数据整理管道的工具。另一个技巧是使用 `column -t`命令,可以将空格文本(不一定对齐)转换为对齐的文本。 -一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的 [讲义](/2020/editors/#macros) (去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 +一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的[讲义](/2020/editors/#macros)(去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 -对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/) 库是一个很棒的工具。不仅因为它让复杂操作的定义(如分组依据,联接或过滤器)变得非常容易, 而且还便于根据不同属性绘制数据。它还支持导出多种表格格式,包括XLS,HTML或LaTeX。另外,R语言(一种有争议的[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 +对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/)库是一个很棒的工具。不仅因为它让复杂操作的定义(如分组依据,联接或过滤器)变得非常容易,而且还便于根据不同属性绘制数据。它还支持导出多种表格格式,包括 XLS,HTML 或 LaTeX。另外,R语言(一种有争议的[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 ## 
Docker和虚拟机有什么区别? -Docker 基于更加普遍的被称为容器的概念。关于容器和虚拟机之间最大的不同是 虚拟机会执行整个的 OS 栈,包括内核(即使这个内核和主机内核相同)。与虚拟机不同的是,容器避免运行其他内核实例 ,而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机像是在使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 +Docker 基于容器这个更为普遍的概念。关于容器和虚拟机之间最大的不同是,虚拟机会执行整个的 OS 栈,包括内核(即使这个内核和主机内核相同)。与虚拟机不同的是,容器避免运行其他内核实例,而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机像是在使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 -另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS上运行Docker,Docker需要启动Linux虚拟机去获取初始的Linux内核,这样的开销仍然很大。最后,Docker是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker容器在重启之间不会有以任何形式的存储。 +另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS 上运行 Docker,Docker 需要启动 Linux虚拟机去获取初始的 Linux内核,这样的开销仍然很大。最后,Docker 是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker 容器在重启之间不会有以任何形式的存储。 -## 每种OS的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)? +## 不同操作系统的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)? 关于Linux发行版,尽管有相当多的版本,但大部分发行版在大多数使用情况下的表现是相同的。 -可以在任何发行版中学习Linux与UNIX的特性和内部工作原理。 +可以在任何发行版中学习 Linux 与 UNIX 的特性和其内部工作原理。 发行版之间的根本区别是发行版如何处理软件包更新。 -某些版本,例如Arch Linux采用滚动更新策略,用了最前沿的软件包(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS或Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用Debian或Ubuntu来获得简单稳定的台式机和服务器体验。 +某些版本,例如 Arch Linux 采用滚动更新策略,用了最前沿的软件包(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS 或 Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用 Debian 或 Ubuntu 来获得简单稳定的台式机和服务器体验。 -Mac OS是介于Windows和Linux之间的一个OS,它有很漂亮的界面。但是,Mac OS是基于BSD而不是Linux,因此系统的某些部分和命令是不同的。 -另一种值得体验的是FreeBSD。虽然某些程序不能在FreeBSD上运行,但与Linux相比,BSD生态系统的碎片化程度要低得多,并且说明文档更加友好。 -除了开发Windows应用程序或需要使用某些Windows更好支持的功能(例如对游戏的驱动程序支持)外,我们不建议使用Windows。 +Mac OS 是介于 Windows 和 Linux 之间的一个操作系统,它有很漂亮的界面。但是,Mac OS 是基于BSD 而不是 Linux,因此系统的某些部分和命令是不同的。 +另一种值得体验的是 FreeBSD。虽然某些程序不能在 FreeBSD 上运行,但与 Linux 相比,BSD 生态系统的碎片化程度要低得多,并且说明文档更加友好。 +除了开发Windows应用程序或需要使用某些 Windows系统更好支持的功能(例如对游戏的驱动程序支持)外,我们不建议使用 Windows。 -对于双启动系统,我们认为最有效的实现是macOS的bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 +对于双启动系统,我们认为最有效的实现是 macOS 的 bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 -## Vim 编辑器 VS Emacs编辑器? 
+## Vim 编辑器 VS Emacs 编辑器? -我们三个都使用vim作为我们的主要编辑器。但是Emacs也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs不遵循vim的模式编辑,但是这些功能可以通过Emacs插件 像 [Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。 -Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好)。 +我们三个都使用 vim 作为我们的主要编辑器。但是 Emacs 也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs 不使用 vim 的模式编辑,但是这些功能可以通过 Emacs 插件像[Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。 +Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好用)。 ## 机器学习应用的提示或技巧? 课程的一些经验可以直接用于机器学习程序。 就像许多科学学科一样,在机器学习中,你经常要进行一系列实验,并检查哪些数据有效,哪些无效。 -你可以使用Shell轻松快速地搜索这些实验结果,并且以合理的方式汇总。这意味着需要在限定时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 -最后,如果你不使用集群提交你的GPU作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 +你可以使用 Shell 轻松快速地搜索这些实验结果,并且以合理的方式汇总。这意味着需要在限定时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 +最后,如果你不使用集群提交你的 GPU 作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 -## 还有更多的Vim提示吗? +## 还有更多的 Vim 小窍门吗? 
-更多的提示: +更多的窍门: -- 插件 - 花时间去探索插件。有很多不错的插件解决了vim的缺陷或者增加了与现有vim 工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 -- 标记 - 在vim里你可以使用 `m` 为字母 `X`做标记,之后你可以通过 `'`回到标记位置。这可以让你快速定位到文件内或文件间的特定位置。 +- 插件 - 花时间去探索插件。有很多不错的插件修复了vim的缺陷或者增加了与现有vim工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 +- 标记 - 在vim里你可以使用 `m` 为字母 `X` 做标记,之后你可以通过 `'` 回到标记位置。这可以让你快速定位到文件内或文件间的特定位置。 - 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置前后移动。 -- 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树 -- 时间撤销 - `:earlier` 和 `:later`命令使得你可以用时间而非某一时刻的更改来定位文件。 -- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - 是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,通过设置 在 `.vimrc`目录下的`undofile` 和 `undodir`, vim会保存每个文件的修改历史。 -- 热键(Leader Key) - 热键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加他们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 -- 高级文本对象 - 文本对象比如搜索也可以用vim命令构成。例如,`d/`会删除下一处匹配pattern的字符串 ,`cgn`可以用于更改上次搜索的关键字。 +- 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树。 +- 时间撤销 - `:earlier` 和 `:later` 命令使得你可以用时间而非某一时刻的更改来定位文件。 +- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - 是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,需要配置 在 `.vimrc` 目录下的`undofile` 和 `undodir`,vim会保存每个文件的修改历史。 +- 热键(Leader Key) - 热键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加它们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 +- 高级文本对象 - 文本对象比如搜索也可以用vim命令构成。例如,`d/` 会删除下一处匹配pattern的字符串,`cgn` 可以用于更改上次搜索的关键字。 ## 2FA是什么,为什么我需要使用它? 
-双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,你不仅需要知道密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的SMS来实现(尽管SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 +双因子验证(Two Factor Authentication 2FA)在密码之上为帐户增加了一层额外的保护。为了登录,你不仅需要知道密码,还必须以某种方式“证明”可以访问某些硬件设备。最简单的情形是可以通过接收手机的 SMS 来实现(尽管 SMS 2FA 存在 [已知问题](https://www.kaspersky.com/blog/2fa-practical-guide/24219/))。我们推荐使用[YubiKey](https://www.yubico.com/)之类的[U2F](https://en.wikipedia.org/wiki/Universal_2nd_Factor)方案。 -## 对于不同的Web浏览器有什么评价? +## 对于不同的 Web 浏览器有什么评价? -2020的浏览器现状是,大部分的浏览器都与Chrome 类似,因为他们都使用同样的引擎(Blink)。 Microsoft Edge同样基于 Blink,至于Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的Chorme版本。不管是在性能还是可用性上,Chorme都是一款很不错的浏览器。如果你想要替代品,我们推荐Firefox。Firefox与Chorme的在各方面不相上下,并且在隐私方面更加出色。 -有一款目前还没有完成的叫Flow的浏览器,它实现了全新的渲染引擎,有望比现有引擎速度更快。 +2020的浏览器现状是,大部分的浏览器都与 Chrome 类似,因为它们都使用同样的引擎(Blink)。 Microsoft Edge 同样基于 Blink,至于 Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的 Chorme 版本。不管是在性能还是可用性上,Chorme 都是一款很不错的浏览器。如果你想要替代品,我们推荐 Firefox。Firefox 与 Chorme 的在各方面不相上下,并且在隐私方面更加出色。 +有一款目前还没有完成的叫 Flow 的浏览器,它实现了全新的渲染引擎,有望比现有引擎速度更快。 From b83e5b3dd1c1bacf23ccd76533796eb592e373fe Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Fri, 5 Jun 2020 11:15:15 +0800 Subject: [PATCH 423/640] Update qa.md --- _2020/qa.md | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index ff1c9b3f..77fbe8a3 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -9,26 +9,25 @@ video: --- - 最后一节课,我们回答学生提出的问题: - [学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等](#学习操作系统相关内容的推荐比如进程虚拟内存中断内存管理等) - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) -- [使用 Python VS Bash脚本 VS 其他语言?](#使用python-vs-bash脚本-vs-其他语言) -- [ `source script.sh` 和 `./script.sh` 有什么区别?](#source-scriptsh-和scriptsh有什么区别) +- [使用 Python VS Bash脚本 VS 其他语言?](#使用-python-vs-bash脚本-vs-其他语言) +- [ `source script.sh` 和 `./script.sh` 
有什么区别?](#source-scriptsh-和-scriptsh-有什么区别) - [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin-或lib是什么) -- [我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢?](#我应该用apt-get-install还是pip-install去下载软件包呢) +- [我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢?](#我应该用-apt-get-install-还是-pip-install-去下载软件包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) - [不同操作系统的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#不同操作系统的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器 VS Emacs 编辑器?](#vim-编辑器-vs-emacs编辑器) +- [Vim 编辑器 VS Emacs 编辑器?](#vim-编辑器-vs-emacs-编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) -- [还有更多的 Vim 小窍门吗?](#还有更多的Vim小窍门吗) +- [还有更多的 Vim 小窍门吗?](#还有更多的-vim-小窍门吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) -- [对于不同的 Web 浏览器有什么评价?](#对于不同的Web浏览器有什么评价) +- [对于不同的 Web 浏览器有什么评价?](#对于不同的-Web-浏览器有什么评价) ## 学习操作系统相关内容的推荐,比如进程,虚拟内存,中断,内存管理等 From b6e1587fa1a37fdfae29982c592fa595b1ad6fed Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Fri, 5 Jun 2020 11:21:31 +0800 Subject: [PATCH 424/640] Update qa.md --- _2020/qa.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 77fbe8a3..109b5526 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -9,6 +9,7 @@ video: --- + 最后一节课,我们回答学生提出的问题: @@ -16,14 +17,14 @@ video: - [你会优先学习的工具有那些?](#你会优先学习的工具有那些) - [使用 Python VS Bash脚本 VS 其他语言?](#使用-python-vs-bash脚本-vs-其他语言) - [ `source script.sh` 和 `./script.sh` 有什么区别?](#source-scriptsh-和-scriptsh-有什么区别) -- [各种软件包和工具存储在哪里?引用过程是怎样的? `/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的bin-或lib是什么) +- [各种软件包和工具存储在哪里?引用过程是怎样的? 
`/bin` 或 `/lib` 是什么?](#各种软件包和工具存储在哪里引用过程是怎样的-bin-或-lib-是什么) - [我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢?](#我应该用-apt-get-install-还是-pip-install-去下载软件包呢) - [用于提高代码性能,简单好用的性能分析工具有哪些?](#用于提高代码性能简单好用的性能分析工具有哪些) - [你使用那些浏览器插件?](#你使用那些浏览器插件) - [有哪些有用的数据整理工具?](#有哪些有用的数据整理工具) - [Docker和虚拟机有什么区别?](#Docker和虚拟机有什么区别) - [不同操作系统的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)?](#不同操作系统的优缺点是什么我们如何选择比如选择最适用于我们需求的Linux发行版) -- [Vim 编辑器 VS Emacs 编辑器?](#vim-编辑器-vs-emacs-编辑器) +- [使用 Vim 编辑器 VS Emacs 编辑器?](#使用-vim-编辑器-vs-emacs-编辑器) - [机器学习应用的提示或技巧?](#机器学习应用的提示或技巧) - [还有更多的 Vim 小窍门吗?](#还有更多的-vim-小窍门吗) - [2FA是什么,为什么我需要使用它?](#2FA是什么为什么我需要使用它) @@ -149,7 +150,7 @@ Mac OS 是介于 Windows 和 Linux 之间的一个操作系统,它有很漂亮 对于双启动系统,我们认为最有效的实现是 macOS 的 bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 -## Vim 编辑器 VS Emacs 编辑器? +## 使用 Vim 编辑器 VS Emacs 编辑器? 我们三个都使用 vim 作为我们的主要编辑器。但是 Emacs 也是一个不错的选择,你可以两者都尝试,看看那个更适合你。Emacs 不使用 vim 的模式编辑,但是这些功能可以通过 Emacs 插件像[Evil](https://github.com/emacs-evil/evil) 或 [Doom Emacs](https://github.com/hlissner/doom-emacs)来实现。 Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语言vimscript要更好用)。 From 56d8defb95fccc0e5149b4803d25b4b5a7ddb83f Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Fri, 5 Jun 2020 16:38:35 +0800 Subject: [PATCH 425/640] Update qa.md --- _2020/qa.md | 38 ++++++++++++++++++-------------------- 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 109b5526..78769a0f 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -8,8 +8,6 @@ video: id: Wz50FvGG6xU --- - - 最后一节课,我们回答学生提出的问题: @@ -109,7 +107,7 @@ video: 性能分析方面相当有用和简单工具是[print timing](/2020/debugging-profiling/#timing)。你只需手动计算代码不同部分之间花费的时间。通过重复执行此操作,你可以有效地对代码进行二分法搜索,并找到花费时间最长的代码段。 -对于更高级的工具, Valgrind 的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算所有的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成带注释的代码版本,其中包含每行花费的时间。但是,它会使程序运行速度降低一个数量级,并且不支持线程。其他的,[ `perf` 
](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对采样分析器输出结果的可视化工具。你还可以使用针对特定编程语言或任务的工具。例如,对于 Web 开发而言,Chrome 和 Firefox 内置的开发工具具有出色的性能分析器。 +对于更高级的工具, Valgrind 的 [Callgrind](http://valgrind.org/docs/manual/cl-manual.html)可让你运行程序并计算所有的时间花费以及所有调用堆栈(即哪个函数调用了另一个函数)。然后,它会生成带注释的代码版本,其中包含每行花费的时间。但是,它会使程序运行速度降低一个数量级,并且不支持线程。其他的,[ `perf` ](http://www.brendangregg.com/perf.html)工具和其他特定语言的采样性能分析器可以非常快速地输出有用的数据。[Flamegraphs](http://www.brendangregg.com/flamegraphs.html) 是对采样分析器结果的可视化工具。你还可以使用针对特定编程语言或任务的工具。例如,对于 Web 开发而言,Chrome 和 Firefox 内置的开发工具具有出色的性能分析器。 有时,代码中最慢的部分是系统等待磁盘读取或网络数据包之类的事件。在这些情况下,需要检查根据硬件性能估算的理论速度是否不偏离实际数值,也有专门的工具来分析系统调用中的等待时间,包括用于用户程序内核跟踪的[eBPF](http://www.brendangregg.com/blog/2019-01-01/learn-ebpf-tracing.html) 。如果需要低级的性能分析,[ `bpftrace` ](https://github.com/iovisor/bpftrace) 值得一试。 @@ -118,37 +116,37 @@ video: 我们钟爱的插件主要与安全性与可用性有关: - [uBlock Origin](https://github.com/gorhill/uBlock) - 是一个[用途广泛(wide-spectrum)](https://github.com/gorhill/uBlock/wiki/Blocking-mode)的拦截器,它不仅可以拦截广告,还可以拦截第三方的页面,也可以拦截内部脚本和其他种类资源的加载。如果你打算花更多的时间去配置,前往[中等模式(medium mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-medium-mode)或者 [强力模式(hard mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-hard-mode)。在你调整好设置之前一些网站会停止工作,但是这些配置会显著提高你的网络安全水平。另外, [简易模式(easy mode)](https://github.com/gorhill/uBlock/wiki/Blocking-mode:-easy-mode)作为默认模式已经相当不错了,可以拦截大部分的广告和跟踪,你也可以自定义规则来拦截网站对象。 -- [Stylus](https://github.com/openstyles/stylus/) - 是Stylish的分支(不要使用Stylish,它会[窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,甚至更改文字大小或字体样式。这可以使你经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常用的网站都有一个或几个深色主题样式。 +- [Stylus](https://github.com/openstyles/stylus/) - 
是Stylish的分支(不要使用Stylish,它会[窃取浏览记录](https://www.theregister.co.uk/2018/07/05/browsers_pull_stylish_but_invasive_browser_extension/))),这个插件可让你将自定义CSS样式加载到网站。使用Stylus,你可以轻松地自定义和修改网站的外观。可以删除侧边框,更改背景颜色,更改文字大小或字体样式。这可以使你经常访问的网站更具可读性。此外,Stylus可以找到其他用户编写并发布在[userstyles.org](https://userstyles.org/)中的样式。大多数常用的网站都有一个或几个深色主题样式。 - 全页屏幕捕获 - 内置于 Firefox 和 [ Chrome 扩展程序](https://chrome.google.com/webstore/detail/full-page-screen-capture/fdpohaocaechififmbbbbbknoalclacl?hl=en)中。这些插件提供完整的网站截图,通常比打印要好用。 - [多账户容器](https://addons.mozilla.org/en-US/firefox/addon/multi-account-containers/) - 该插件使你可以将Cookie分为“容器”,从而允许你以不同的身份浏览web网页并且/或确保网站无法在它们之间共享信息。 - 密码集成管理器 - 大多数密码管理器都有浏览器插件,这些插件帮你将登录凭据输入网站的过程不仅方便,而且更加安全。与简单复制粘贴用户名和密码相比,这些插件将首先检查网站域是否与列出的条目相匹配,以防止冒充网站的网络钓鱼窃取登录凭据。 ## 有哪些有用的数据整理工具? -在数据整理那一节课程中,我们没有时间讨论一些数据整理工具,包括分别用于JSON和HTML数据的专用解析器, `jq` 和 `pup`。Perl语言是另一个更高级的可以用于数据整理管道的工具。另一个技巧是使用 `column -t`命令,可以将空格文本(不一定对齐)转换为对齐的文本。 +在数据整理那一节课程中,我们没有时间讨论一些数据整理工具,包括分别用于JSON和HTML数据的专用解析器, `jq` 和 `pup`。Perl语言是另一个更高级的可以用于数据整理管道的工具。另一个技巧是使用 `column -t` 命令,可以将空格文本(不一定对齐)转换为对齐的文本。 -一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏可以是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的[讲义](/2020/editors/#macros)(去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 +一般来说,vim和Python是两个不常规的数据整理工具。对于某些复杂的多行转换,vim宏是非常有用的工具。你可以记录一系列操作,并根据需要重复执行多次,例如,在编辑的[讲义](/2020/editors/#macros)(去年 [视频](/2019/editors/))中,有一个示例是使用vim宏将XML格式的文件转换为JSON。 -对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/)库是一个很棒的工具。不仅因为它让复杂操作的定义(如分组依据,联接或过滤器)变得非常容易,而且还便于根据不同属性绘制数据。它还支持导出多种表格格式,包括 XLS,HTML 或 LaTeX。另外,R语言(一种有争议的[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数据,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 +对于通常以CSV格式显示的表格数据, Python [pandas](https://pandas.pydata.org/)库是一个很棒的工具。不仅因为它能让复杂操作的定义(如分组依据,联接或过滤器)变得非常容易,而且还便于根据不同属性绘制数据。它还支持导出多种表格格式,包括 XLS,HTML 或 LaTeX。另外,R语言(一种有争议的[不好](http://arrgh.tim-smith.us/)的语言)具有很多功能,可以计算数据的统计数字,这在管道的最后一步中非常有用。 [ggplot2](https://ggplot2.tidyverse.org/)是R中很棒的绘图库。 ## 
Docker和虚拟机有什么区别? -Docker 基于容器这个更为普遍的概念。关于容器和虚拟机之间最大的不同是,虚拟机会执行整个的 OS 栈,包括内核(即使这个内核和主机内核相同)。与虚拟机不同的是,容器避免运行其他内核实例,而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机像是在使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 +Docker 基于容器这个更为概括的概念。关于容器和虚拟机之间最大的不同是,虚拟机会执行整个的 OS 栈,包括内核(即使这个内核和主机内核相同)。与虚拟机不同,容器避免运行其他内核实例,而是与主机分享内核。在Linux环境中,有LXC机制来实现,并且这能使一系列分离的主机像是在使用自己的硬件启动程序,而实际上是共享主机的硬件和内核。因此容器的开销小于完整的虚拟机。 -另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS 上运行 Docker,Docker 需要启动 Linux虚拟机去获取初始的 Linux内核,这样的开销仍然很大。最后,Docker 是容器的特定实现,它是为软件部署定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker 容器在重启之间不会有以任何形式的存储。 +另一方面,容器的隔离性较弱而且只有在主机运行相同的内核时才能正常工作。例如,如果你在macOS 上运行 Docker,Docker 需要启动 Linux虚拟机去获取初始的 Linux内核,这样的开销仍然很大。最后,Docker 是容器的特定实现,它是为软件部署而定制的。基于这些,它有一些奇怪之处:例如,默认情况下,Docker 容器在重启之间不会有以任何形式的存储。 ## 不同操作系统的优缺点是什么,我们如何选择(比如选择最适用于我们需求的Linux发行版)? 关于Linux发行版,尽管有相当多的版本,但大部分发行版在大多数使用情况下的表现是相同的。 -可以在任何发行版中学习 Linux 与 UNIX 的特性和其内部工作原理。 +可以使用任何发行版去学习 Linux 与 UNIX 的特性和其内部工作原理。 发行版之间的根本区别是发行版如何处理软件包更新。 -某些版本,例如 Arch Linux 采用滚动更新策略,用了最前沿的软件包(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS 或 Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但牺牲了一些新功能。我们建议你使用 Debian 或 Ubuntu 来获得简单稳定的台式机和服务器体验。 +某些版本,例如 Arch Linux 采用滚动更新策略,用了最前沿的软件包(bleeding-edge),但软件可能并不稳定。另外一些发行版(如Debian,CentOS 或 Ubuntu LTS)其更新策略要保守得多,因此更新的内容会更稳定,但会牺牲一些新功能。我们建议你使用 Debian 或 Ubuntu 来获得简单稳定的台式机和服务器体验。 Mac OS 是介于 Windows 和 Linux 之间的一个操作系统,它有很漂亮的界面。但是,Mac OS 是基于BSD 而不是 Linux,因此系统的某些部分和命令是不同的。 另一种值得体验的是 FreeBSD。虽然某些程序不能在 FreeBSD 上运行,但与 Linux 相比,BSD 生态系统的碎片化程度要低得多,并且说明文档更加友好。 -除了开发Windows应用程序或需要使用某些 Windows系统更好支持的功能(例如对游戏的驱动程序支持)外,我们不建议使用 Windows。 +除了开发Windows应用程序或需要使用某些Windows系统更好支持的功能(例如对游戏的驱动程序支持)外,我们不建议使用 Windows。 -对于双启动系统,我们认为最有效的实现是 macOS 的 bootcamp,从长远来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 +对于双系统,我们认为最有效的是 macOS 的 bootcamp,长期来看,任何其他组合都可能会出现问题,尤其是当你结合了其他功能比如磁盘加密。 ## 使用 Vim 编辑器 VS Emacs 编辑器? @@ -158,7 +156,7 @@ Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语 ## 机器学习应用的提示或技巧? 
课程的一些经验可以直接用于机器学习程序。 -就像许多科学学科一样,在机器学习中,你经常要进行一系列实验,并检查哪些数据有效,哪些无效。 +就像许多科学学科一样,在机器学习中,你需要进行一系列实验,并检查哪些数据有效,哪些无效。 你可以使用 Shell 轻松快速地搜索这些实验结果,并且以合理的方式汇总。这意味着需要在限定时间内或使用特定数据集的情况下,检查所有实验结果。通过使用JSON文件记录实验的所有相关参数,使用我们在本课程中介绍的工具,这件事情可以变得极其简单。 最后,如果你不使用集群提交你的 GPU 作业,那你应该研究如何使该过程自动化,因为这是一项非常耗时的任务,会消耗你的精力。 @@ -166,14 +164,14 @@ Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语 更多的窍门: -- 插件 - 花时间去探索插件。有很多不错的插件修复了vim的缺陷或者增加了与现有vim工作流很好结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 +- 插件 - 花时间去探索插件。有很多不错的插件修复了vim的缺陷或者增加了能够与现有vim工作流结合的新功能。关于这部分内容,资源是[VimAwesome](https://vimawesome.com/) 和其他程序员的dotfiles。 - 标记 - 在vim里你可以使用 `m` 为字母 `X` 做标记,之后你可以通过 `'` 回到标记位置。这可以让你快速定位到文件内或文件间的特定位置。 -- 导航 - `Ctrl+O` and `Ctrl+I` 使你在最近访问位置前后移动。 -- 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree)通过图形化来展示撤销树。 +- 导航 - `Ctrl+O` 和 `Ctrl+I` 命令可以使你在最近访问位置前后移动。 +- 撤销树 - vim 有不错的更改跟踪机制,不同于其他的编辑器,vim存储变更树,因此即使你撤销后做了一些修改,你仍然可以通过撤销树的导航回到初始状态。一些插件比如 [gundo.vim](https://github.com/sjl/gundo.vim) 和 [undotree](https://github.com/mbbill/undotree) 通过图形化来展示撤销树。 - 时间撤销 - `:earlier` 和 `:later` 命令使得你可以用时间而非某一时刻的更改来定位文件。 -- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - 是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,需要配置 在 `.vimrc` 目录下的`undofile` 和 `undodir`,vim会保存每个文件的修改历史。 -- 热键(Leader Key) - 热键是一个用于用户配置自定义命令的特殊的按键。这种模式通常是按下后释放这个按键(通常是空格键)与其他的按键去执行特殊的命令。插件会用这些按键增加它们的功能,例如 插件UndoTree使用 ` U` 去打开撤销树。 -- 高级文本对象 - 文本对象比如搜索也可以用vim命令构成。例如,`d/` 会删除下一处匹配pattern的字符串,`cgn` 可以用于更改上次搜索的关键字。 +- [持续撤销](https://vim.fandom.com/wiki/Using_undo_branches#Persistent_undo) - 是一个默认未被开启的vim的内置功能,它在vim启动之间保存撤销历史,需要配置在 `.vimrc` 目录下的`undofile` 和 `undodir`,vim会保存每个文件的修改历史。 +- 热键(Leader Key) - 热键是一个用于用户自定义配置命令的特殊按键。这种模式通常是按下后释放这个按键(通常是空格键)并与其他的按键组合去实现一个特殊的命令。插件也会用这些按键增加它们的功能,例如,插件UndoTree使用 ` U` 去打开撤销树。 +- 高级文本对象 - 文本对象比如搜索也可以用vim命令构成。例如,`d/` 会删除下一处匹配 pattern 的字符串,`cgn` 可以用于更改上次搜索的关键字。 ## 
2FA是什么,为什么我需要使用它? From e05b7a4a1000d49953ca4d5148ef32628f0f7437 Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Fri, 5 Jun 2020 04:46:44 -0400 Subject: [PATCH 426/640] applied requested changes --- _2020/security.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index fb91e3d6..d5773b4a 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -60,7 +60,7 @@ $ printf 'Hello' | sha1sum f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0 ``` -抽象地讲,散列函数可以被认为是一个难以取反,且看上去随机(但具确定性)的函数 +抽象地讲,散列函数可以被认为是一个不可逆,且看上去随机(但具确定性)的函数 (这就是[散列函数的理想模型](https://en.wikipedia.org/wiki/Random_oracle))。 一个散列函数拥有以下特性: @@ -69,7 +69,7 @@ f7ff9e8b7bb2e09b70935a5d785e0cc5d9d0abf0 - 目标碰撞抵抗性/弱无碰撞:对于一个给定输入`m_1`,难以找到`m_2 != m_1`且`hash(m_1) = hash(m_2)`。 - 碰撞抵抗性/强无碰撞:难以找到一组满足`hash(m_1) = hash(m_2)`的输入`m_1, m_2`(该性质严格强于目标碰撞抵抗性)。 -注:虽然SHA-1还可以用于特定用途,它已经[不再被认为](https://shattered.io/)是一个强密码散列函数。 +注:虽然SHA-1还可以用于特定用途,但它已经[不再被认为](https://shattered.io/)是一个强密码散列函数。 你可参照[密码散列函数的生命周期](https://valerieaurora.org/hash.html)这个表格了解一些散列函数是何时被发现弱点及破解的。 请注意,针对应用推荐特定的散列函数超出了本课程内容的范畴。 如果选择散列函数对于你的工作非常重要,请先系统学习信息安全及密码学。 @@ -119,7 +119,7 @@ decrypt(ciphertext: array, key) -> array (输出明文) ## 对称加密的应用 -- 加密不信任的云服务上存储的文件。
    对称加密和密钥生成函数配合起来,就可以使用密码加密文件: +- 加密不信任的云服务上存储的文件。对称加密和密钥生成函数配合起来,就可以使用密码加密文件: 将密码输入密钥生成函数生成密钥 `key = KDF(passphrase)`,然后存储`encrypt(file, key)`。 # 非对称加密 @@ -185,7 +185,7 @@ Keybase主要使用[社交网络证明 (social proof)](https://keybase.io/blog/c 你只需要记住一个复杂的主密码,密码管理器就可以生成很多复杂度高且不会重复使用的密码。密码管理器通过这种方式降低密码被猜出的可能,并减少网站信息泄露后对其他网站密码的威胁。 -## 两步验证 +## 两步验证(双因子验证) [两步验证](https://en.wikipedia.org/wiki/Multi-factor_authentication)(2FA)要求用户同时使用密码(“你知道的信息”)和一个身份验证器(“你拥有的物品”,比如[YubiKey](https://www.yubico.com/))来消除密码泄露或者[钓鱼攻击](https://en.wikipedia.org/wiki/Phishing)的威胁。 From 153adf9911b92f0e421298c6ff68e676c5fcd97e Mon Sep 17 00:00:00 2001 From: AA1HSHH <48511594+AA1HSHH@users.noreply.github.com> Date: Fri, 5 Jun 2020 17:40:05 +0800 Subject: [PATCH 427/640] Update qa.md --- _2020/qa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 78769a0f..0f211ba8 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -67,7 +67,7 @@ video: ## `source script.sh` 和 `./script.sh` 有什么区别? -两种情况下 `script.sh` 都会在bash会话种被读取和执行,不同点在于那个会话执行这个命令。 +这两种情况 `script.sh` 都会在bash会话中被读取和执行,不同点在于那个会话执行这个命令。 对于 `source` 命令来说,命令是在当前的bash会话种执行的,因此当 `source` 执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 单独运行 `./script.sh` 时,当前的bash会话将启动新的bash会话(实例),并在新实例中运行命令 `script.sh`。 因此,如果 `script.sh` 更改目录,新的bash会话(实例)会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 @@ -95,7 +95,7 @@ video: ## 我应该用 `apt-get install` 还是 `pip install` 去下载软件包呢? 
-这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更普遍的问题相关。需要考虑的几件事: +这个问题没有普遍的答案。这与使用系统程序包管理器还是特定语言的程序包管理器来安装软件这一更笼统的问题相关。需要考虑的几件事: - 常见的软件包都可以通过这两种方法获得,但是小众的软件包或较新的软件包可能不在系统程序包管理器中。在这种情况下,使用特定语言的程序包管理器是更好的选择。 - 同样,特定语言的程序包管理器相比系统程序包管理器有更多的最新版本的程序包。 From 08406d460a867dcd529231d46d2aa7bcf9075500 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 6 Jun 2020 16:15:35 +0800 Subject: [PATCH 428/640] Update security.md --- _2020/security.md | 40 ++-------------------------------------- 1 file changed, 2 insertions(+), 38 deletions(-) diff --git a/_2020/security.md b/_2020/security.md index d5773b4a..5b58753d 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -2,7 +2,7 @@ layout: lecture title: "安全和密码学" date: 2019-01-28 -ready: false +ready: true video: aspect: 56.25 id: tjwobAmnKTo @@ -213,40 +213,4 @@ Windows的[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows- 但是为了防止泄露,私钥必须加密存储。`ssh-keygen`命令会提示用户输入一个密码,并将它输入密钥生成函数 产生一个密钥。最终,`ssh-keygen`使用对称加密算法和这个密钥加密私钥。 -在实际运用中,当服务器已知用户的公钥(存储在`.ssh/authorized_keys`文件中,一般在用户HOME目录下),尝试连接的客户端可以使用非对称签名来证明用户的身份——这便是[挑战应答方式](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication)。 -简单来说,服务器选择一个随机数字发送给客户端。客户端使用用户私钥对这个数字信息签名后返回服务器。 -服务器随后使用`.ssh/authorized_keys`文件中存储的用户公钥来验证返回的信息是否由所对应的私钥所签名。这种验证方式可以有效证明试图登录的用户持有所需的私钥。 - -{% comment %} -extra topics, if there's time - -security concepts, tips -- biometrics -- HTTPS -{% endcomment %} - -# 资源 - -- [去年的讲稿](/2019/security/): 更注重于计算机用户可以如何增强隐私保护和安全 -- [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): -解答了在一些应用环境下“应该使用什么加密?”的问题 - -# 练习 - -1. **熵** - 1. 假设一个密码是从五个小写的单词拼接组成,每个单词都是从一个含有10万单词的字典中随机选择,且每个单词选中的概率相同。 - 一个符合这样构造的例子是`correcthorsebatterystaple`。这个密码有多少比特的熵? - 1. 假设另一个密码是用八个随机的大小写字母或数字组成。一个符合这样构造的例子是`rg8Ql34g`。这个密码又有多少比特的熵? - 1. 哪一个密码更强? - 1. 假设一个攻击者每秒可以尝试1万个密码,这个攻击者需要多久可以分别破解上述两个密码? -1. 
**密码散列函数** 从[Debian镜像站](https://www.debian.org/CD/http-ftp/)下载一个光盘映像(比如这个来自阿根廷镜像站的[映像](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso))。使用`sha256sum`命令对比下载映像的哈希值和官方Debian站公布的哈希值。如果你下载了上面的映像,官方公布的哈希值可以参考[这个文件](https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/SHA256SUMS)。 -1. **对称加密** 使用 - [OpenSSL](https://www.openssl.org/)的AES模式加密一个文件: `openssl aes-256-cbc -salt -in {源文件名} -out {加密文件名}`。 - 使用`cat`或者`hexdump`对比源文件和加密的文件,再用 `openssl aes-256-cbc -d -in {加密文件名} -out - {解密文件名}` 命令解密刚刚加密的文件。最后使用`cmp`命令确认源文件和解密后的文件内容相同。 -1. **非对称加密** - 1. 在你自己的电脑上使用更安全的[ED25519算法](https://wiki.archlinux.org/index.php/SSH_keys#Ed25519)生成一组[SSH - 密钥对](https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2)。为了确保私钥不使用时的安全,一定使用密码加密你的私钥。 - 1. [配置GPG](https://www.digitalocean.com/community/tutorials/how-to-use-gpg-to-encrypt-and-sign-messages)。 - 1. 给Anish发送一封加密的电子邮件([Anish的公钥](https://keybase.io/anish))。 - 1. 使用`git commit -C`命令签名一个Git提交,并使用`git show --show-signature`命令验证这个提交的签名。或者,使用`git tag -s`命令签名一个Git标签,并使用`git tag -v`命令验证标签的签名。 +在实际运用中,当服务器已知用户的公钥(存储在`.ssh/authorized_keys`文件中,一般在用户HOME目录下),尝试连接的客户端可以使用非对称签名 From 86f2249999f50fa564da62d06a2f3eacc2126b93 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 6 Jun 2020 20:34:44 +0800 Subject: [PATCH 429/640] fix mobile view --- _includes/nav.html | 19 ++++++++++--------- _layouts/lecture.html | 1 - static/css/main.css | 2 +- 3 files changed, 11 insertions(+), 11 deletions(-) diff --git a/_includes/nav.html b/_includes/nav.html index 662391bb..815cfc71 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -1,23 +1,24 @@ -
    +{% comment %} \ No newline at end of file +
    {% endcomment %} \ No newline at end of file diff --git a/_layouts/lecture.html b/_layouts/lecture.html index f6893c2d..087491ba 100644 --- a/_layouts/lecture.html +++ b/_layouts/lecture.html @@ -18,6 +18,5 @@

    {{ page.title }}{% if page.subtitle %}

    Edit this page.

    -

    Translator: Lingfeng Ai

    Licensed under CC BY-NC-SA.

    diff --git a/static/css/main.css b/static/css/main.css index bfde6704..111a670d 100644 --- a/static/css/main.css +++ b/static/css/main.css @@ -234,7 +234,7 @@ hr { #top-nav { max-width: 75rem; - padding-left:8rem; + /* padding-left:8rem; */ margin: auto; text-align: center; } From 2f34217385a8f65f36c9c838f0c41f83b1e5b95f Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sat, 6 Jun 2020 20:36:37 +0800 Subject: [PATCH 430/640] qa and security done --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 20f12418..b17dd5e0 100644 --- a/README.md +++ b/README.md @@ -34,7 +34,7 @@ To contribute to this tanslation project, please book your topic by creating an | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | +| [security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | Done | | [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | -| [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | In-progress | +| 
[qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | Done | | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | Done | From f754e58736176e5437f8aa2d4027d5d2ac2db429 Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Sat, 6 Jun 2020 18:08:20 -0400 Subject: [PATCH 431/640] wip editor --- _2020/editors.md | 79 ++++++++++++++++++++---------------------------- 1 file changed, 32 insertions(+), 47 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 52ab872d..9c081e8a 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -37,19 +37,16 @@ video: # Vim的哲学 -When programming, you spend most of your time reading/editing, not writing. For -this reason, Vim is a _modal_ editor: it has different modes for inserting text -vs manipulating text. Vim is programmable (with Vimscript and also other -languages like Python), and Vim's interface itself is a programming language: -keystrokes (with mnemonic names) are commands, and these commands are -composable. Vim avoids the use of the mouse, because it's too slow; Vim even -avoids using the arrow keys because it requires too much movement. +在编程的时候,你会把大量时间花在阅读/编辑而不是在写代码上。所以, Vim 是一个 _多模态_ 编辑 +器: 它对于插入文字和操纵文字有不同的模式。 Vim 既是可编程的 (可以使用 Vimscript 或者像 +Python 一样的其他程序语言), Vim 的接口本身也是一个程序语言: 键入操作 (以及其助记名) +是命令, 这些命令也是可组合的。 Vim 避免了使用鼠标,因为那样太慢了; Vim 甚至避免用 +上下左右键因为那样需要太多的手指移动。 -The end result is an editor that can match the speed at which you think. +这样的设计哲学的结果是一个能跟上你思维速度的编辑器。 # 编辑模式 - Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改动为基础,因此它具有多种操作模式: - *正常模式*:在文件中四处移动光标进行修改 @@ -64,60 +61,48 @@ Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改 在默认设置下,Vim会在左下角显示当前的模式。 Vim启动时的默认模式是正常模式。通常你会把大部分 时间花在正常模式和插入模式。 -You change modes by pressing `` (the escape key) to switch from any mode -back to normal mode. 
From normal mode, enter insert mode with `i`, replace mode -with `R`, visual mode with `v`, visual line mode with `V`, visual block mode -with `` (Ctrl-V, sometimes also written `^V`), and command-line mode with -`:`. +你可以按下 `` (逃脱键) 从任何其他模式返回正常模式。 在正常模式,键入 `i` 进入插入 +模式, `R` 进入替换模式, `v` 进入可视(一般)模式, `V` 进入可视(行)模式, `` +(Ctrl-V, 有时也写作 `^V`), `:` 进入命令模式。 -You use the `` key a lot when using Vim: consider remapping Caps Lock to -Escape ([macOS -instructions](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS)). +因为你会在使用 Vim 时大量使用 `` 键,考虑把大小写锁定键重定义成逃脱键 ([MacOS 教程](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS) )。 # 基本操作 ## 插入文本 -From normal mode, press `i` to enter insert mode. Now, Vim behaves like any -other text editor, until you press `` to return to normal mode. This, -along with the basics explained above, are all you need to start editing files -using Vim (though not particularly efficiently, if you're spending all your -time editing from insert mode). +在正常模式, 键入 `i` 进入插入模式。 现在 Vim 跟很多其他的编辑器一样, 直到你键入`` +返回正常模式。 你只需要掌握这一点和上面介绍的所有基知识就可以使用 Vim 来编辑文件了 +(虽然如果你一直停留在插入模式内不一定高效)。 -## Buffers, tabs, and windows +## 缓存, 标签页, 窗口 -Vim maintains a set of open files, called "buffers". A Vim session has a number -of tabs, each of which has a number of windows (split panes). Each window shows -a single buffer. Unlike other programs you are familiar with, like web -browsers, there is not a 1-to-1 correspondence between buffers and windows; -windows are merely views. A given buffer may be open in _multiple_ windows, -even within the same tab. This can be quite handy, for example, to view two -different parts of a file at the same time. +Vim 会维护一系列打开的文件,称为 “缓存”。 一个 Vim 会话包含一系列标签页,每个标签页包含 +一系列窗口 (分隔面板)。每个窗口显示一个缓存。 跟网页浏览器等其他你熟悉的程序不一样的是, +缓存和窗口不是一一对应的关系; 窗口只是视角。 一个缓存可以在 _多个_ 窗口打开,甚至在同一 +个标签页内的多个窗口打开。这个功能其实很好用, 比如在查看同一个文件的不同部分的时候。 -By default, Vim opens with a single tab, which contains a single window. 
+Vim 默认打开一个标签页,这个标签页包含一个窗口。 ## 命令行 -Command mode can be entered by typing `:` in normal mode. Your cursor will jump -to the command line at the bottom of the screen upon pressing `:`. This mode -has many functionalities, including opening, saving, and closing files, and -[quitting Vim](https://twitter.com/iamdevloper/status/435555976687923200). +在正常模式下键入 `:` 进入命令行模式。 在键入 `:` 后,你的光标会立即跳到屏幕下方的命令行。 +这个模式有很多功能, 包括打开, 保存, 关闭文件, 以及 +[退出 Vim](https://twitter.com/iamdevloper/status/435555976687923200)。 -- `:q` quit (close window) -- `:w` save ("write") -- `:wq` save and quit -- `:e {name of file}` open file for editing -- `:ls` show open buffers -- `:help {topic}` open help - - `:help :w` opens help for the `:w` command - - `:help w` opens help for the `w` movement +- `:q` 退出 (关闭窗口) +- `:w` 保存 (写) +- `:wq` 保存然后退出 +- `:e {文件名}` 打开要编辑的文件 +- `:ls` 显示打开的缓存 +- `:help {标题}` 打开帮助文档 + - `:help :w` 打开 `:w` 命令的帮助文档 + - `:help w` 打开 `w` 移动的帮助文档 # Vim 的接口其实是一种编程语言 -The most important idea in Vim is that Vim's interface itself is a programming -language. Keystrokes (with mnemonic names) are commands, and these commands -_compose_. This enables efficient movement and edits, especially once the -commands become muscle memory.
+Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 键入操作 (以及他们的助记名) +本身是命令, 这些命令可以组合使用。 这使得移动和编辑更加高效,特别是一旦形成肌肉记忆。 ## 移动 From e0878e09a837947ad03cb715e692a21e0bff6244 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 7 Jun 2020 10:57:01 +0800 Subject: [PATCH 432/640] fix security --- _2020/security.md | 38 +++++++++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/_2020/security.md b/_2020/security.md index 5b58753d..6a28c3dd 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -213,4 +213,40 @@ Windows的[BitLocker](https://fossbytes.com/enable-full-disk-encryption-windows- 但是为了防止泄露,私钥必须加密存储。`ssh-keygen`命令会提示用户输入一个密码,并将它输入密钥生成函数 产生一个密钥。最终,`ssh-keygen`使用对称加密算法和这个密钥加密私钥。 -在实际运用中,当服务器已知用户的公钥(存储在`.ssh/authorized_keys`文件中,一般在用户HOME目录下),尝试连接的客户端可以使用非对称签名 +在实际运用中,当服务器已知用户的公钥(存储在`.ssh/authorized_keys`文件中,一般在用户HOME目录下),尝试连接的客户端可以使用非对称签名来证明用户的身份——这便是[挑战应答方式](https://en.wikipedia.org/wiki/Challenge%E2%80%93response_authentication)。 +简单来说,服务器选择一个随机数字发送给客户端。客户端使用用户私钥对这个数字信息签名后返回服务器。 +服务器随后使用`.ssh/authorized_keys`文件中存储的用户公钥来验证返回的信息是否由所对应的私钥所签名。这种验证方式可以有效证明试图登录的用户持有所需的私钥。 + +{% comment %} +extra topics, if there's time + +security concepts, tips +- biometrics +- HTTPS +{% endcomment %} + +# 资源 + +- [去年的讲稿](/2019/security/): 更注重于计算机用户可以如何增强隐私保护和安全 +- [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): +解答了在一些应用环境下“应该使用什么加密?”的问题 + +# 练习 + +1. **熵** + 1. 假设一个密码是从五个小写的单词拼接组成,每个单词都是从一个含有10万单词的字典中随机选择,且每个单词选中的概率相同。 + 一个符合这样构造的例子是`correcthorsebatterystaple`。这个密码有多少比特的熵? + 1. 假设另一个密码是用八个随机的大小写字母或数字组成。一个符合这样构造的例子是`rg8Ql34g`。这个密码又有多少比特的熵? + 1. 哪一个密码更强? + 1. 假设一个攻击者每秒可以尝试1万个密码,这个攻击者需要多久可以分别破解上述两个密码? +1. 
**密码散列函数** 从[Debian镜像站](https://www.debian.org/CD/http-ftp/)下载一个光盘映像(比如这个来自阿根廷镜像站的[映像](http://debian.xfree.com.ar/debian-cd/10.2.0/amd64/iso-cd/debian-10.2.0-amd64-netinst.iso))。使用`sha256sum`命令对比下载映像的哈希值和官方Debian站公布的哈希值。如果你下载了上面的映像,官方公布的哈希值可以参考[这个文件](https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/SHA256SUMS)。 +1. **对称加密** 使用 + [OpenSSL](https://www.openssl.org/)的AES模式加密一个文件: `openssl aes-256-cbc -salt -in {源文件名} -out {加密文件名}`。 + 使用`cat`或者`hexdump`对比源文件和加密的文件,再用 `openssl aes-256-cbc -d -in {加密文件名} -out + {解密文件名}` 命令解密刚刚加密的文件。最后使用`cmp`命令确认源文件和解密后的文件内容相同。 +1. **非对称加密** + 1. 在你自己的电脑上使用更安全的[ED25519算法](https://wiki.archlinux.org/index.php/SSH_keys#Ed25519)生成一组[SSH + 密钥对](https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2)。为了确保私钥不使用时的安全,一定使用密码加密你的私钥。 + 1. [配置GPG](https://www.digitalocean.com/community/tutorials/how-to-use-gpg-to-encrypt-and-sign-messages)。 + 1. 给Anish发送一封加密的电子邮件([Anish的公钥](https://keybase.io/anish))。 + 1. 使用`git commit -C`命令签名一个Git提交,并使用`git show --show-signature`命令验证这个提交的签名。或者,使用`git tag -s`命令签名一个Git标签,并使用`git tag -v`命令验证标签的签名。 From 7df600f69a50c08bf0ddf561a106b3f70f88c130 Mon Sep 17 00:00:00 2001 From: hanxiaomax-mac Date: Sun, 7 Jun 2020 10:58:49 +0800 Subject: [PATCH 433/640] update --- _2020/security.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/security.md b/_2020/security.md index 6a28c3dd..8c2ea5eb 100644 --- a/_2020/security.md +++ b/_2020/security.md @@ -231,7 +231,7 @@ security concepts, tips - [Cryptographic Right Answers](https://latacora.micro.blog/2018/04/03/cryptographic-right-answers.html): 解答了在一些应用环境下“应该使用什么加密?”的问题 -# 练习 +# 课后练习 1. **熵** 1. 
假设一个密码是从五个小写的单词拼接组成,每个单词都是从一个含有10万单词的字典中随机选择,且每个单词选中的概率相同。 From 1d06eb9a51ffaedf79100adaab901c97fdbc4dec Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Fri, 12 Jun 2020 08:51:12 +0800 Subject: [PATCH 434/640] typo --- _2020/data-wrangling.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index b49f9fd3..8244768e 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -20,7 +20,7 @@ video: 例如这样一条命令 `journalctl | grep -i intel`,它会找到所有包含intel(区分大小写)的系统日志。您可能并不认为是数据整理,但是它确实将某种形式的数据(全部系统日志)转换成了另外一种形式的数据(仅包含intel的日志)。大多数情况下,数据整理需要您能够明确哪些工具可以被用来达成特定数据整理的目的,并且明白如何组合使用这些工具。 -让我们从头讲起。既然需学习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息,这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: +让我们从头讲起。既然是学习数据整理,那有两样东西自然是必不可少的:用来整理的数据以及相关的应用场景。日志处理通常是一个比较典型的使用场景,因为我们经常需要在日志中查找某些信息,这种情况下通读日志是不现实的。现在,让我们研究一下系统日志,看看哪些用户曾经尝试过登录我们的服务器: ```bash ssh myserver journalctl @@ -58,7 +58,7 @@ ssh myserver journalctl | sed 's/.*Disconnected from //' ``` -上面这段命令中,我们使用了一段简单的*正则表达式*。正则表达式是一种非常强大工具,可以让我们基于某种模式来对字符串进行匹配。`s` 命令的语法如下:`s/REGEX/SUBSTITUTION/`, 其中 `REGEX` 部分是我们需要使用的正则表达式,而 `SUBSTITUTION` 是用于替换匹配结果的文本。 +上面这段命令中,我们使用了一段简单的*正则表达式*。正则表达式是一种非常强大的工具,可以让我们基于某种模式来对字符串进行匹配。`s` 命令的语法如下:`s/REGEX/SUBSTITUTION/`, 其中 `REGEX` 部分是我们需要使用的正则表达式,而 `SUBSTITUTION` 是用于替换匹配结果的文本。 ## 正则表达式 From 8e29bde6cd7e96bbd4a5857b5085bad4b4cf613b Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Sat, 13 Jun 2020 00:26:13 -0400 Subject: [PATCH 435/640] progress editor.md --- _2020/editors.md | 247 ++++++++++++++++++++++++----------------------- 1 file changed, 125 insertions(+), 122 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 9c081e8a..150a7b4b 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -106,81 +106,76 @@ Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 键 ## 移动 -You should spend most of your time in normal mode, using movement commands to -navigate the buffer. 
Movements in Vim are also called "nouns", because they -refer to chunks of text. - -- Basic movement: `hjkl` (left, down, up, right) -- Words: `w` (next word), `b` (beginning of word), `e` (end of word) -- Lines: `0` (beginning of line), `^` (first non-blank character), `$` (end of line) -- Screen: `H` (top of screen), `M` (middle of screen), `L` (bottom of screen) -- Scroll: `Ctrl-u` (up), `Ctrl-d` (down) -- File: `gg` (beginning of file), `G` (end of file) -- Line numbers: `:{number}` or `{number}G` (line {number}) -- Misc: `%` (corresponding item) -- Find: `f{character}`, `t{character}`, `F{character}`, `T{character}` - - find/to forward/backward {character} on the current line - - `,` / `;` for navigating matches -- Search: `/{regex}`, `n` / `N` for navigating matches +你应该会大部分时间在正常模式下,使用移动命令在缓存中导航。在 Vim 里面移动也被成为 “名词”, +因为他们指向文字块。 + +- 基本移动: `hjkl` (左, 下, 上, 右) +- 词: `w` (下一个词), `b` (词初), `e` (词尾) +- 行: `0` (行初), `^` (第一个非空格字符), `$` (行尾) +- 屏幕: `H` (屏幕首行), `M` (屏幕中间), `L` (屏幕底部) +- 翻页: `Ctrl-u` (上翻), `Ctrl-d` (下翻) +- 文件: `gg` (文件头), `G` (文件尾) +- 行数: `:{行数}` 或者 `{行数}G` ({行数}为行数) +- 杂项: `%` (找到配对,比如括号或者 /* */ 之类的注释对) +- 查找: `f{字符}`, `t{字符}`, `F{字符}`, `T{字符}` + - 查找/到 向前/向后 在本行的{字符} + - `,` / `;` 用于导航匹配 +- 搜索: `/{正则表达式}`, `n` / `N` 用于导航匹配 ## 选择 -Visual modes: +可视化模式: -- Visual -- Visual Line -- Visual Block +- 可视化 +- 可视化行 +- 可视化块 -Can use movement keys to make selection. +可以用移动命令来选中。 ## 编辑 -Everything that you used to do with the mouse, you now do with the keyboard -using editing commands that compose with movement commands. Here's where Vim's -interface starts to look like a programming language. Vim's editing commands -are also called "verbs", because verbs act on nouns. - -- `i` enter insert mode - - but for manipulating/deleting text, want to use something more than - backspace -- `o` / `O` insert line below / above -- `d{motion}` delete {motion} - - e.g. 
`dw` is delete word, `d$` is delete to end of line, `d0` is delete - to beginning of line -- `c{motion}` change {motion} - - e.g. `cw` is change word - - like `d{motion}` followed by `i` -- `x` delete character (equal do `dl`) -- `s` substitute character (equal to `xi`) -- visual mode + manipulation - - select text, `d` to delete it or `c` to change it -- `u` to undo, `` to redo -- `y` to copy / "yank" (some other commands like `d` also copy) -- `p` to paste -- Lots more to learn: e.g. `~` flips the case of a character +所有你需要用鼠标做的事, 你现在都可以用键盘:采用编辑命令和移动命令的组合来完成。 +这就是 Vim 的界面开始看起来像一个程序语言的时候。Vim 的编辑命令也被称为 “动词”, +因为动词可以施动于名词。 + +- `i` 进入插入模式 + - 但是对于操纵/编辑文本,不单想用退格键完成 +- `o` / `O` 在之上/之下插入行 +- `d{移动命令}` 删除 {移动命令} + - 例如, `dw` 删除词, `d$` 删除到行尾, `d0` 删除到行头。 +- `c{移动命令}` 改变 {移动命令} + - 例如, `cw` 改变词 + - 比如 `d{移动命令}` 再 `i` +- `x` 删除字符 (等同于 `dl`) +- `s` 替换字符 (等同于 `xi`) +- 可视化模式 + 操作 + - 选中文字, `d` 删除 或者 `c` 改变 +- `u` 撤销, `` 重做 +- `y` 复制 / "yank" (其他一些命令比如 `d` 也会复制) +- `p` 粘贴 +- 更多值得学习的: 比如 `~` 改变字符的大小写 ## 计数 -You can combine nouns and verbs with a count, which will perform a given action -a number of times. +你可以用一个计数来结合“名词” 和 “动词”, 这会执行指定操作若干次。 -- `3w` move 3 words forward -- `5j` move 5 lines down -- `7dw` delete 7 words +- `3w` 向前移动三个词 +- `5j` 向下移动5行 +- `7dw` 删除7个词 ## 修饰语 -You can use modifiers to change the meaning of a noun. Some modifiers are `i`, -which means "inner" or "inside", and `a`, which means "around". 
+你可以用修饰语改变 “名词” 的意义。修饰语有 `i`, 表示 “内部” 或者 “在内”, 和 `a`, +表示 “周围”。 -- `ci(` change the contents inside the current pair of parentheses -- `ci[` change the contents inside the current pair of square brackets -- `da'` delete a single-quoted string, including the surrounding single quotes +- `ci(` 改变当前括号内的内容 +- `ci[` 改变当前方括号内的内容 +- `da'` 删除一个单引号字符串, 包括周围的单引号 -# Demo +# 演示 -Here is a broken [fizz buzz](https://en.wikipedia.org/wiki/Fizz_buzz) -implementation: +这里是一个有问题的 [fizz buzz](https://en.wikipedia.org/wiki/Fizz_buzz) +实现: ```python def fizz_buzz(limit): @@ -196,87 +191,86 @@ def main(): fizz_buzz(10) ``` -We will fix the following issues: - -- Main is never called -- Starts at 0 instead of 1 -- Prints "fizz" and "buzz" on separate lines for multiples of 15 -- Prints "fizz" for multiples of 5 -- Uses a hard-coded argument of 10 instead of taking a command-line argument - -{% comment %} -- main is never called - - `G` end of file - - `o` open new line below - - type in "if __name__ ..." thing -- starts at 0 instead of 1 - - search for `/range` - - `ww` to move forward 2 words - - `i` to insert text, "1, " - - `ea` to insert after limit, "+1" -- newline for "fizzbuzz" - - `jj$i` to insert text at end of line - - add ", end=''" - - `jj.` to repeat for second print - - `jjo` to open line below if - - add "else: print()" +我们会修复以下问题: + +- 主函数没有被调用 +- 从 0 而不是 1 开始 +- 在 15 的整数倍的时候在不同行打印 "fizz" 和 "buzz" +- 在 5 的整数倍的时候打印 "fizz" +- 采用硬编码的参数 10 而不是从命令控制行读取参数 + +{% 注释 %} +- 主函数没有被调用 + - `G` 文件尾 + - `o` 向下打开一个新行 + - 输入 "if __name__ ..."
+- 从 0 而不是 1 开始 + - 搜索 `/range` + - `ww` 向前移动两个词 + - `i` 插入文字, "1, " + - `ea` 在 limit 后插入, "+1" +- 在新的一行 "fizzbuzz" + - `jj$i` 插入文字到行尾 + - 加入 ", end=''" + - `jj.` 重复第二个打印 + - `jjo` 在 if 打开一行 + - 加入 "else: print()" - fizz fizz - - `ci'` to change fizz -- command-line argument - - `ggO` to open above + - `ci'` 变到 fizz +- 命令控制行参数 + - `ggO` 向上打开 - "import sys" - `/10` - `ci(` to "int(sys.argv[1])" -{% endcomment %} +{% 注释 %} -See the lecture video for the demonstration. Compare how the above changes are -made using Vim to how you might make the same edits using another program. -Notice how very few keystrokes are required in Vim, allowing you to edit at the -speed you think. +展示详情请观看课程视频。 比较上面用 Vim 的操作和你可能使用其他程序的操作。 +值得一提的是 Vim 需要很少的键盘操作,允许你编辑的速度跟上你思维的速度。 # 自定义 Vim -Vim is customized through a plain-text configuration file in `~/.vimrc` -(containing Vimscript commands). There are probably lots of basic settings that -you want to turn on. +Vim 由一个位于 `~/.vimrc` 的文本配置文件 (包含 Vim 脚本命令)。 你可能会启用很多基本 +设置。 We are providing a well-documented basic config that you can use as a starting point. We recommend using this because it fixes some of Vim's quirky default -behavior. **Download our config [here](/2020/files/vimrc) and save it to +behavior. + +我们提供一个文档详细的基本设置, 你可以用它当作你的初始设置。 我们推荐使用这个设置因为 +它修复了一些 Vim 默认设置奇怪行为。 +**在 [这儿](/2020/files/vimrc) 下载我们的设置, 然后将它保存成 `~/.vimrc`.** -Vim is heavily customizable, and it's worth spending time exploring -customization options. You can look at people's dotfiles on GitHub for -inspiration, for example, your instructors' Vim configs +Vim 能够被重度自定义, 花时间探索自定义选项是值得的。 你可以参考其他人的在 GitHub +上共享的设置文件, 比如, 你的授课人的 Vim 设置 ([Anish](https://github.com/anishathalye/dotfiles/blob/master/vimrc), [Jon](https://github.com/jonhoo/configs/blob/master/editor/.config/nvim/init.vim) (uses [neovim](https://neovim.io/)), -[Jose](https://github.com/JJGO/dotfiles/blob/master/vim/.vimrc)). There are -lots of good blog posts on this topic too. 
Try not to copy-and-paste people's -full configuration, but read it, understand it, and take what you need. +[Jose](https://github.com/JJGO/dotfiles/blob/master/vim/.vimrc))。 +有很多好的博客文章也聊到了这个话题。 尽量不要复制粘贴别人的整个设置文件, +而是阅读和理解它, 然后采用对你有用的部分。 # 扩展 Vim -There are tons of plugins for extending Vim. Contrary to outdated advice that -you might find on the internet, you do _not_ need to use a plugin manager for -Vim (since Vim 8.0). Instead, you can use the built-in package management -system. Simply create the directory `~/.vim/pack/vendor/start/`, and put -plugins in there (e.g. via `git clone`). +Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 你 _不_ 需要在 Vim 使用一个插件 +管理器(从 Vim 8.0 开始)。 你可以使用内置的插件管理系统。 只需要创建一个 +`~/.vim/pack/vendor/start/` 的文件家, 然后把插件放到这里 (比如通过 `git clone`)。 -Here are some of our favorite plugins: +以下是一些我们最爱的插件: -- [ctrlp.vim](https://github.com/ctrlpvim/ctrlp.vim): fuzzy file finder -- [ack.vim](https://github.com/mileszs/ack.vim): code search -- [nerdtree](https://github.com/scrooloose/nerdtree): file explorer -- [vim-easymotion](https://github.com/easymotion/vim-easymotion): magic motions +- [ctrlp.vim](https://github.com/ctrlpvim/ctrlp.vim): 模糊文件查找 +- [ack.vim](https://github.com/mileszs/ack.vim): 代码搜索 +- [nerdtree](https://github.com/scrooloose/nerdtree): 文件浏览器 +- [vim-easymotion](https://github.com/easymotion/vim-easymotion): 魔术操作 We're trying to avoid giving an overwhelmingly long list of plugins here. You can check out the instructors' dotfiles +我们尽量避免在这里提供一长串插件。 你可以查看授课人们的点文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), [Jose](https://github.com/JJGO/dotfiles)) to see what other plugins we use. -Check out [Vim Awesome](https://vimawesome.com/) for more awesome Vim plugins. -There are also tons of blog posts on this topic: just search for "best Vim -plugins". +Check out [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. 
+这个话题也有很多博客文章: 搜索 "best Vim +plugins"。 # 其他程序的 Vim 模式 @@ -284,6 +278,9 @@ Many tools support Vim emulation. The quality varies from good to great; depending on the tool, it may not support the fancier Vim features, but most cover the basics pretty well. +很多工具提供了 Vim 模式。 这些 Vim 模式的质量参差不齐; 取决于具体工具, 有的提供了 +很多酷炫的 Vim 功能, 但是大多数对基本功能支持的很好。 + ## Shell If you're a Bash user, use `set -o vi`. If you use Zsh, `bindkey -v`. For Fish, @@ -292,28 +289,34 @@ If you're a Bash user, use `set -o vi`. If you use Zsh, `bindkey -v`. For Fish, editor is launched when a program wants to start an editor. For example, `git` will use this editor for commit messages. +如果你是一个 Bash 用户, 用 `set -o vi`。 如果你用 Zsh: `bindkey -v`。 Fish 用 +`fish_vi_key_bindings`。 另外, 不管利用什么 shell, 你可以 +`export EDITOR=vim`。 这是一个用来决定当一个程序需要启动编辑时启动哪个的环境变量。 +例如, `git` 会使用这个编辑器来编辑 commit 信息。 + ## Readline -Many programs use the [GNU -Readline](https://tiswww.case.edu/php/chet/readline/rltop.html) library for -their command-line interface. Readline supports (basic) Vim emulation too, -which can be enabled by adding the following line to the `~/.inputrc` file: +很多程序使用 [GNU +Readline](https://tiswww.case.edu/php/chet/readline/rltop.html) 库来作为 +它们的命令控制行界面。 Readline 也支持基本的 Vim 模式, +可以通过在 `~/.inputrc` 添加如下行开启: ``` set editing-mode vi ``` -With this setting, for example, the Python REPL will support Vim bindings. +在这个设置下, 比如, Python REPL 会支持 Vim 快捷键。 ## 其他 There are even vim keybinding extensions for web -[browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), some -popular ones are +甚至有 Vim 的网页浏览键盘绑定扩展 +[browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), 受欢迎的有 +用于 Google Chrome 的 [Vimium](https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en) -for Google Chrome and [Tridactyl](https://github.com/tridactyl/tridactyl) for -Firefox. You can even get Vim bindings in [Jupyter -notebooks](https://github.com/lambdalisue/jupyter-vim-binding). 
+和用于 Firefox 的 [Tridactyl](https://github.com/tridactyl/tridactyl)。 +你甚至可以在 [Jupyter +notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定。 # Vim 进阶 From 9bacb0b85c848d18980d81a47b4d250b519879bd Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Sat, 13 Jun 2020 01:08:58 -0400 Subject: [PATCH 436/640] editor done --- _2020/editors.md | 120 ++++++++++++++++++++++------------------------- 1 file changed, 57 insertions(+), 63 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 150a7b4b..17d19375 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -320,99 +320,93 @@ notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定 # Vim 进阶 -Here are a few examples to show you the power of the editor. We can't teach you -all of these kinds of things, but you'll learn them as you go. A good -heuristic: whenever you're using your editor and you think "there must be a -better way of doing this", there probably is: look it up online. +这里我们提供了一些展示这个编辑器能力的例子。我们无法把所有的这样的事情都教给你, 但是你 +可以在使用中学习。 一个好的对策是: 当你在使用你的编辑器的时候感觉 “一定有更好的方法来做这个”, +那么很可能真的有: 上网搜寻一下。 ## 搜索和替换 -`:s` (substitute) command ([documentation](http://vim.wikia.com/wiki/Search_and_replace)). +`:s` (替换) 命令 ([文档](http://vim.wikia.com/wiki/Search_and_replace))。 - `%s/foo/bar/g` - - replace foo with bar globally in file + - 在整个文件中将 foo 全局替换成 bar - `%s/\[.*\](\(.*\))/\1/g` - - replace named Markdown links with plain URLs + - 将有命名的 Markdown 链接替换成简单 URLs ## 多窗口 -- `:sp` / `:vsp` to split windows -- Can have multiple views of the same buffer. 
+- 用 `:sp` / `:vsp` 来分割窗口 +- 能有一个缓存的多个视角。 ## 宏 -- `q{character}` to start recording a macro in register `{character}` -- `q` to stop recording -- `@{character}` replays the macro -- Macro execution stops on error -- `{number}@{character}` executes a macro {number} times -- Macros can be recursive - - first clear the macro with `q{character}q` - - record the macro, with `@{character}` to invoke the macro recursively - (will be a no-op until recording is complete) -- Example: convert xml to json ([file](/2020/files/example-data.xml)) - - Array of objects with keys "name" / "email" - - Use a Python program? - - Use sed / regexes +- `q{字符}` 来开始在寄存器 `{字符}` 中录制宏 +- `q` 停止录制 +- `@{字符}` 重放宏 +- 宏的执行遇错误会停止 +- `{计数}@{字符}` 执行一个宏 {计数} 次 +- 宏可以递归 + - 首先用 `q{字符}q` 清除宏 + - 录制该宏, 用 `@{字符}` 来递归调用该宏 + (在录制完成之前不会有任何操作) +- 例子: 将 xml 转成 json ([file](/2020/files/example-data.xml)) + - 一个有 "name" / "email" 键对象的数组 + - 用一个 Python 程序? + - 用 sed / 正则表达式 - `g/people/d` - `%s/<person>/{/g` - `%s/<name>\(.*\)<\/name>/"name": "\1",/g` - ...
- - Vim commands / macros - - `Gdd`, `ggdd` delete first and last lines - - Macro to format a single element (register `e`) - - Go to line with `` + - Vim 命令 / 宏 + - `Gdd`, `ggdd` 删除第一行和最后一行 + - 格式化最后一个元素的宏 (寄存器 `e`) + - 到有 `` 的行 - `qe^r"f>s": "fq` - - Macro to format a person - - Go to line with `` + - 格式化一个人的宏 + - 到有 `` 的行 - `qpS{j@eA,j@ejS},q` - - Macro to format a person and go to the next person - - Go to line with `` + - 格式化一个人然后转到另外一个人的宏 + - 到有 `` 的行 - `qq@pjq` - - Execute macro until end of file + - 执行宏到文件尾 - `999@q` - - Manually remove last `,` and add `[` and `]` delimiters + - 手动移除最后的 `,` 然后加上 `[` 和 `]` 分隔符 # 扩展资料 -- `vimtutor` is a tutorial that comes installed with Vim -- [Vim Adventures](https://vim-adventures.com/) is a game to learn Vim +- `vimtutor` 是一个 Vim 安装时自带的教程 +- [Vim Adventures](https://vim-adventures.com/) 是一个学习使用 Vim 的游戏 - [Vim Tips Wiki](http://vim.wikia.com/wiki/Vim_Tips_Wiki) -- [Vim Advent Calendar](https://vimways.org/2019/) has various Vim tips -- [Vim Golf](http://www.vimgolf.com/) is [code golf](https://en.wikipedia.org/wiki/Code_golf), but where the programming language is Vim's UI +- [Vim Advent Calendar](https://vimways.org/2019/) 有很多 Vim 小技巧 +- [Vim Golf](http://www.vimgolf.com/) 是用 Vim 的用户界面作为程序语言的 [code golf](https://en.wikipedia.org/wiki/Code_golf) - [Vi/Vim Stack Exchange](https://vi.stackexchange.com/) - [Vim Screencasts](http://vimcasts.org/) -- [Practical Vim](https://pragprog.com/book/dnvim2/practical-vim-second-edition) (book) +- [Practical Vim](https://pragprog.com/book/dnvim2/practical-vim-second-edition) (书) # 课后练习 -1. Complete `vimtutor`. Note: it looks best in a - [80x24](https://en.wikipedia.org/wiki/VT100) (80 columns by 24 lines) - terminal window. -1. Download our [basic vimrc](/2020/files/vimrc) and save it to `~/.vimrc`. Read - through the well-commented file (using Vim!), and observe how Vim looks and - behaves slightly differently with the new config. -1. Install and configure a plugin: +1. 
完成 `vimtutor`。 备注: 它在一个 + [80x24](https://en.wikipedia.org/wiki/VT100) (80 列, 24 行) + 终端窗口看起来最好。 +1. 下载我们的 [基本 vimrc](/2020/files/vimrc), 然后把它保存到 `~/.vimrc`。 通读这个注释详细的文件 + (用 Vim!), 然后观察 Vim 在这个新的设置下看起来和使用起来有哪些细微的区别。 +1. 安装和配置一个插件: [ctrlp.vim](https://github.com/ctrlpvim/ctrlp.vim). - 1. Create the plugins directory with `mkdir -p ~/.vim/pack/vendor/start` - 1. Download the plugin: `cd ~/.vim/pack/vendor/start; git clone + 1. 用 `mkdir -p ~/.vim/pack/vendor/start` 创建插件文件夹 + 1. 下载这个插件: `cd ~/.vim/pack/vendor/start; git clone https://github.com/ctrlpvim/ctrlp.vim` - 1. Read the - [documentation](https://github.com/ctrlpvim/ctrlp.vim/blob/master/readme.md) - for the plugin. Try using CtrlP to locate a file by navigating to a - project directory, opening Vim, and using the Vim command-line to start + 1. 读这个插件的 + [文档](https://github.com/ctrlpvim/ctrlp.vim/blob/master/readme.md)。 + 尝试用 CtrlP 来在一个工程文件夹里定位一个文件, 打开 Vim, 然后用 Vim 命令控制行开始 `:CtrlP`. - 1. Customize CtrlP by adding + 1. 自定义 CtrlP: 添加 [configuration](https://github.com/ctrlpvim/ctrlp.vim/blob/master/readme.md#basic-options) - to your `~/.vimrc` to open CtrlP by pressing Ctrl-P. -1. To practice using Vim, re-do the [Demo](#demo) from lecture on your own - machine. -1. Use Vim for _all_ your text editing for the next month. Whenever something - seems inefficient, or when you think "there must be a better way", try - Googling it, there probably is. If you get stuck, come to office hours or - send us an email. -1. Configure your other tools to use Vim bindings (see instructions above). -1. Further customize your `~/.vimrc` and install more plugins. -1. (Advanced) Convert XML to JSON ([example file](/2020/files/example-data.xml)) - using Vim macros. Try to do this on your own, but you can look at the - [macros](#macros) section above if you get stuck. + 到你的 `~/.vimrc` 来用按 Ctrl-P 打开 CtrlP +1. 练习使用 Vim, 在你自己的机器上重做 [演示](#demo)。 +1. 
下个月用 Vim 做你 _所有_ 文件编辑。 每当不够高效的时候, 或者你感觉 “一定有一个更好的方式”, + 尝试求助搜索引擎, 很有可能有一个更好的方式。 如果你遇到难题, 来我们的答疑时间或者给我们发邮件。 +1. 在你的其他工具中设置 Vim 绑定 (见上面的操作指南)。 +1. 进一步自定义你的 `~/.vimrc` 和安装更多插件。 +1. (高阶) 用 Vim 宏将 XML 转换到 JSON ([例子文件](/2020/files/example-data.xml))。 + 尝试着先完全自己做, 但是在你卡住的时候可以查看上面 + [宏](#macros) 章节。 From 0f76ad5e74bf99acaef4ae6a086c5dea73779417 Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Sun, 14 Jun 2020 20:03:51 -0400 Subject: [PATCH 437/640] update porpourri --- _2020/potpourri.md | 233 +++++++++++++++------------------------------ 1 file changed, 77 insertions(+), 156 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 5da53334..1699ad96 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -11,14 +11,14 @@ video: ## 目录 - [目录](#%e7%9b%ae%e5%bd%95) -- [Keyboard remapping](#keyboard-remapping) -- [Daemons](#daemons) +- [修改键位映射](#%E4%BF%AE%E6%94%B9%E9%94%AE%E4%BD%8D%E6%98%A0%E5%B0%84) +- [守护进程](#%E5%AE%88%E6%8A%A4%E8%BF%9B%E7%A8%8B) - [FUSE](#fuse) -- [Backups](#backups) +- [备份](#%E5%A4%87%E4%BB%BD) - [APIs](#apis) - [Common command-line flags/patterns](#common-command-line-flagspatterns) - [Window managers](#window-managers) -- [VPNs](#vpns) +- [VPN](#vpn) - [Markdown](#markdown) - [Hammerspoon (desktop automation on macOS)](#hammerspoon-desktop-automation-on-macos) - [Resources](#resources) @@ -27,33 +27,31 @@ video: - [Notebook programming](#notebook-programming) - [GitHub](#github) -## Keyboard remapping +## 修改键位映射 +作为一名程序员,键盘是你的主要输入工具。它像电脑里的其他部件一样是可配置的,而且值得你在这上面花时间。 -As a programmer, your keyboard is your main input method. As with pretty much anything in your computer, it is configurable (and worth configuring). +一个很常见的配置是修改键位映射。通常这个功能由在电脑上运行的软件实现。当某一个按键被按下,软件截获键盘发出的按键事件(keypress event)并使用另外一个事件取代。比如: +- 将Caps Lock映射为Ctrl或者Escape。Caps Lock使用了键盘上一个非常方便的位置而它的功能却很少被用到,所以我们(讲师)非常推荐这个修改。 +- 将PrtSc映射为播放/暂停。大部分操作系统支持播放/暂停键。 +- 交换Ctrl和Meta键(Windows的徽标键或者Mac的Command键)。 -The most basic change is to remap keys. 
-This usually involves some software that is listening and, whenever a certain key is pressed, it intercepts that event and replaces it with another event corresponding to a different key. Some examples: -- Remap Caps Lock to Ctrl or Escape. We (the instructors) highly encourage this setting since Caps Lock has a very convenient location but is rarely used. -- Remapping PrtSc to Play/Pause music. Most OSes have a play/pause key. -- Swapping Ctrl and the Meta (Windows or Command) key. +你也可以将键位映射为任意常用的指令。软件监听到特定的按键组合后会运行设定的脚本。 +- 打开一个新的终端或者浏览器窗口。 +- 输出特定的字符串,比如:一个超长邮件地址或者MIT ID。 +- 使电脑或者显示器进入睡眠模式。 -You can also map keys to arbitrary commands of your choosing. This is useful for common tasks that you perform. Here, some software listens for a specific key combination and executes some script whenever that event is detected. -- Open a new terminal or browser window. -- Inserting some specific text, e.g. your long email address or your MIT ID number. -- Sleeping the computer or the displays. +甚至更复杂的修改也可以通过软件实现: +- 映射按键顺序,比如:按Shift键五下切换大小写锁定。 +- 区别映射单点和长按,比如:单点Caps Lock映射为Escape,而长按Caps Lock映射为Ctrl。 +- 对不同的键盘或软件保存专用的映射配置。 -There are even more complex modifications you can configure: -- Remapping sequences of keys, e.g. pressing shift five times toggles Caps Lock. -- Remapping on tap vs on hold, e.g. Caps Lock key is remapped to Esc if you quickly tap it, but is remapped to Ctrl if you hold it and use it as a modifier. -- Having remaps being keyboard or software specific. 
+下面是一些修改键位映射的软件: +- macOS - [karabiner-elements](https://pqrs.org/osx/karabiner/), [skhd](https://github.com/koekeishiya/skhd) 或者 [BetterTouchTool](https://folivora.ai/) +- Linux - [xmodmap](https://wiki.archlinux.org/index.php/Xmodmap) 或者 [Autokey](https://github.com/autokey/autokey) +- Windows - 控制面板,[AutoHotkey](https://www.autohotkey.com/) 或者 [SharpKeys](https://www.randyrants.com/category/sharpkeys/) +- QMK - 如果你的键盘支持定制固件,[QMK](https://docs.qmk.fm/) 可以直接在键盘的硬件上修改键位映射。保留在键盘里的映射免除了在别的机器上的重复配置。 -Some software resources to get started on the topic: -- macOS - [karabiner-elements](https://pqrs.org/osx/karabiner/), [skhd](https://github.com/koekeishiya/skhd) or [BetterTouchTool](https://folivora.ai/) -- Linux - [xmodmap](https://wiki.archlinux.org/index.php/Xmodmap) or [Autokey](https://github.com/autokey/autokey) -- Windows - Builtin in Control Panel, [AutoHotkey](https://www.autohotkey.com/) or [SharpKeys](https://www.randyrants.com/category/sharpkeys/) -- QMK - If your keyboard supports custom firmware you can use [QMK](https://docs.qmk.fm/) to configure the hardware device itself so the remaps works for any machine you use the keyboard with. - -## Daemons +## 守护进程 You are probably already familiar with the notion of daemons, even if the word seems new. Most computers have a series of processes that are always running in the background rather than waiting for a user to launch them and interact with them. @@ -85,7 +83,8 @@ Restart=on-failure WantedBy=multi-user.target ``` -Also, if you just want to run some program with a given frequency there is no need to build a custom daemon, you can use [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html), a daemon your system already runs to perform scheduled tasks. 
+如果你只是想定期运行一些程序,可以直接使用[`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)。它是一个系统内置的,用来执行定期任务的守护进程。 + ## FUSE @@ -108,28 +107,19 @@ Some interesting examples of FUSE filesystems are: - [kbfs](https://keybase.io/docs/kbfs) - Distributed filesystem with end-to-end encryption. You can have private, shared and public folders. - [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html) - Mount your deduplicated, compressed and encrypted backups for ease of browsing. -## Backups +## 备份 -Any data that you haven’t backed up is data that could be gone at any moment, forever. -It's easy to copy data around, it's hard to reliable backup data. -Here are some good backup basics and the pitfalls of some approaches. +任何没有备份的数据都可能在一个瞬间永远消失。复制数据很简单,但是可靠的备份数据很难。下面列举了一些关于备份的基础知识,以及一些备份方法容易掉进的陷阱。 -First, a copy of the data in the same disk is not a backup, because the disk is the single point of failure for all the data. Similarly, an external drive in your home is also a weak backup solution since it could be lost in a fire/robbery/&c. Instead, having an off-site backup is a recommended practice. +首先,复制存储在同一个磁盘上的数据不是备份,因为这个磁盘是一个单点故障(single point of failure)。这个磁盘一旦出现问题,所有的数据都可能丢失。放在家里的外置磁盘因为火灾、抢劫等原因可能会和源数据一起丢失,所以是一个弱备份。推荐的做法是将数据备份到不同的地点存储。 -Synchronization solutions are not backups. For instance, Dropbox/GDrive are convenient solutions, but when data is erased or corrupted they propagate the change. For the same reason, disk mirroring solutions like RAID are not backups. They don't help if data gets deleted, corrupted or encrypted by ransomware. +同步方案也不是备份。即使方便如Dropbox或者Google Drive,当数据在本地被抹除或者损坏,同步方案可能会把这些“更改”同步到云端。同理,像RAID这样的磁盘镜像方案也不是备份。它不能防止文件被意外删除、损坏、或者被勒索软件加密。 -Some core features of good backups solutions are versioning, deduplication and security. -Versioning backups ensure that you can access your history of changes and efficiently recover files. 
-Efficient backup solutions use data deduplication to only store incremental changes and reduce the storage overhead. -Regarding security, you should ask yourself what someone would need to know/have in order to read your data and, more importantly, to delete all your data and associated backups. -Lastly, blindly trusting backups is a terrible idea and you should verify regularly that you can use them to recover data. +有效备份方案的几个核心特性是:版本控制,删除重复数据,以及安全性。对备份的数据实施版本控制保证了用户可以从任何记录过的历史版本中恢复数据。在备份中检测并删除重复数据,使其仅备份增量变化可以减少存储开销。在安全性方面,作为用户,你应该考虑别人需要有什么信息或者工具才可以访问或者完全删除你的数据及备份。最后一点,不要盲目信任备份方案。用户应该经常检查备份是否可以用来恢复数据。 -Backups go beyond local files in your computer. -Given the significant growth of web applications, large amounts of your data are only stored in the cloud. -For instance, your webmail, social media photos, music playlists in streaming services or online docs are gone if you lose access to the corresponding accounts. -Having an offline copy of this information is the way to go, and you can find online tools that people have built to fetch the data and save it. +备份不限制于备份在本地计算机上的文件。云端应用的重大发展使得我们很多的数据只存储在云端。当我们无法登录这些应用,在云端存储的网络邮件,社交网络上的照片,流媒体音乐播放列表,以及在线文档等等都会随之丢失。用户应该有这些数据的离线备份,而且已经有项目可以帮助下载并存储它们。 -For a more detailed explanation, see 2019's lecture notes on [Backups](/2019/backups). +如果想要了解更多具体内容,请参考本课程2019年关于备份的[课堂笔记](/2019/backups)。 ## APIs @@ -220,76 +210,36 @@ windows with your keyboard, and you can resize them and move them around, all without touching the mouse. They are worth looking into! -## VPNs - -VPNs are all the rage these days, but it's not clear that's for [any -good reason](https://gist.github.com/joepie91/5a9909939e6ce7d09e29). You -should be aware of what a VPN does and does not get you. A VPN, in the -best case, is _really_ just a way for you to change your internet -service provider as far as the internet is concerned. 
All your traffic -will look like it's coming from the VPN provider instead of your "real" -location, and the network you are connected to will only see encrypted -traffic. - -While that may seem attractive, keep in mind that when you use a VPN, -all you are really doing is shifting your trust from your current ISP to -the VPN hosting company. Whatever your ISP _could_ see, the VPN provider -now sees _instead_. If you trust them _more_ than your ISP, that is a -win, but otherwise, it is not clear that you have gained much. If you -are sitting on some dodgy unencrypted public Wi-Fi at an airport, then -maybe you don't trust the connection much, but at home, the trade-off is -not quite as clear. - -You should also know that these days, much of your traffic, at least of -a sensitive nature, is _already_ encrypted through HTTPS or TLS more -generally. In that case, it usually matters little whether you are on -a "bad" network or not -- the network operator will only learn what -servers you talk to, but not anything about the data that is exchanged. - -Notice that I said "in the best case" above. It is not unheard of for -VPN providers to accidentally misconfigure their software such that the -encryption is either weak or entirely disabled. Some VPN providers are -malicious (or at the very least opportunist), and will log all your -traffic, and possibly sell information about it to third parties. -Choosing a bad VPN provider is often worse than not using one in the -first place. - -In a pinch, MIT [runs a VPN](https://ist.mit.edu/vpn) for its students, -so that may be worth taking a look at. Also, if you're going to roll -your own, give [WireGuard](https://www.wireguard.com/) a look.
+## VPN + +VPN现在非常火,但我们不清楚这是不是因为[一些好的理由](https://gist.github.com/joepie91/5a9909939e6ce7d09e29)。你应该了解VPN能提供的功能和它的限制。使用了VPN的你对于互联网而言,**最好的情况**下也就是换了一个网络供应商(ISP)。所有你发出的流量看上去来源于VPN供应商的网络而不是你的“真实”地址,而你实际接入的网络只能看到加密的流量。 + +虽然这听上去非常诱人,但是你应该知道使用VPN只是把原本对网络供应商的信任放在了VPN供应商那里——网络供应商 _能看到的_,VPN供应商 _也都能看到_。如果相比网络供应商你更信任VPN供应商,那当然很好。反之,则连接VPN的价值不明确。机场的不加密公共热点确实不可以信任,但是在家庭网络环境里,这个差异就没有那么明显。 + +你也应该了解现在大部分包含用户敏感信息的流量已经被HTTPS或者TLS加密。这种情况下你所处的网络环境是否“安全”不太重要:供应商只能看到你和哪些服务器在交谈,却不能看到你们交谈的内容。 + +这一切的大前提都是“最好的情况”。曾经发生过VPN提供商错误使用弱加密或者直接禁用加密的先例。另外,有些恶意的或者带有投机心态的供应商会记录和你有关的所有流量,并很可能会将这些信息卖给第三方。找错一家VPN经常比一开始就不用VPN更危险。 + +MIT向有访问校内资源需求的成员开放自己运营的[VPN](https://ist.mit.edu/vpn)。如果你也想自己配置一个VPN,可以了解一下 [WireGuard](https://www.wireguard.com/) 以及 [Algo](https://github.com/trailofbits/algo)。 ## Markdown -There is a high chance that you will write some text over the course of -your career. And often, you will want to mark up that text in simple -ways. You want some text to be bold or italic, or you want to add -headers, links, and code fragments. Instead of pulling out a heavy tool -like Word or LaTeX, you may want to consider using the lightweight -markup language [Markdown](https://commonmark.org/help/). - -You have probably seen Markdown already, or at least some variant of it. -Subsets of it are used and supported almost everywhere, even if it's not -under the name Markdown. At its core, Markdown is an attempt to codify -the way that people already often mark up text when they are writing -plain text documents. Emphasis (*italics*) is added by surrounding a -word with `*`. Strong emphasis (**bold**) is added using `**`. Lines -starting with `#` are headings (and the number of `#`s is the subheading -level). Any line starting with `-` is a bullet list item, and any line -starting with a number + `.` is a numbered list item. 
Backtick is used -to show words in `code font`, and a code block can be entered by -indenting a line with four spaces or surrounding it with -triple-backticks: +你在职业生涯中大概率会编写各种各样的文档。在很多情况下这些文档需要使用标记来增加可读性,比如:插入粗体或者斜体内容,增加标题、超链接、以及代码片段。 + +在不使用Word或者LaTeX等复杂工具的情况下,你可以考虑使用 [Markdown](https://commonmark.org/help/) 这个轻量化的标记语言(markup language)。你可能已经见过Markdown或者它的一个变种。很多环境都支持并使用Markdown的一些子功能。 + +Markdown致力于将人们编写纯文本时的一些习惯标准化。比如: +- 用`*`包围的文字表示强调(*斜体*),或者用`**`表示特别强调(**粗体**)。 +- 以`#`开头的行是标题,`#`的数量表示标题的级别,比如:`## 二级标题`。 +- 以`-`开头代表一个无序列表的元素。一个数字加`.`(比如`1.`)代表一个有序列表元素。 +- 反引号`` ` ``(backtick)包围的文字会以`代码字体`显示。如果要显示一段代码,可以在每一行前加四个空格缩进,或者使用三个反引号包围整个代码片段。 ``` - code goes here + 就像这样 ``` +- 如果要添加超链接,将 _需要显示_ 的文字用方括号包围,并在后面紧接着用圆括号包围链接:`[显示文字](指向的链接)`。 -To add a link, place the _text_ for the link in square brackets, -and the URL immediately following that in parentheses: `[name](url)`. -Markdown is easy to get started with, and you can use it nearly -everywhere. In fact, the lecture notes for this lecture, and all the -others, are written in Markdown, and you can see the raw Markdown -[here](https://raw.githubusercontent.com/missing-semester/missing-semester/master/_2020/potpourri.md). +Markdown不仅容易上手,而且应用非常广泛。实际上本课程的课堂笔记和其他资料都是使用Markdown编写的。点击[这个链接](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md)可以看到本页面的原始Markdown内容。 @@ -344,32 +294,24 @@ you can use a live USB to recover data or fix the operating system. ## Docker, Vagrant, VMs, Cloud, OpenStack -[Virtual machines](https://en.wikipedia.org/wiki/Virtual_machine) and similar -tools like containers let you emulate a whole computer system, including the -operating system. This can be useful for creating an isolated environment for -testing, development, or exploration (e.g. running potentially malicious code).
+[虚拟机](https://en.wikipedia.org/wiki/Virtual_machine)(Virtual Machine)以及如容器化(containerization)等工具可以帮助你模拟一个包括操作系统的完整计算机系统。虚拟机可以用于创建独立的测试或者开发环境,以及用作安全测试的沙盒。 -[Vagrant](https://www.vagrantup.com/) is a tool that lets you describe machine -configurations (operating system, services, packages, etc.) in code, and then -instantiate VMs with a simple `vagrant up`. [Docker](https://www.docker.com/) -is conceptually similar but it uses containers instead. +[Vagrant](https://www.vagrantup.com/) 是一个构建和配置虚拟开发环境的工具。它支持用户在配置文件中写入比如操作系统、系统服务、需要安装的软件包等描述,然后使用`vagrant up`命令在各种环境(VirtualBox,KVM,Hyper-V等)中启动一个虚拟机。[Docker](https://www.docker.com/) 是一个使用容器化概念的类似工具。 -You can rent virtual machines on the cloud, and it's a nice way to get instant -access to: +租用云端虚拟机可以享受以下资源的即时访问: -- A cheap always-on machine that has a public IP address, used to host services -- A machine with a lot of CPU, disk, RAM, and/or GPU -- Many more machines than you physically have access to (billing is often by -the second, so if you want a lot of computing for a short amount of time, it's -feasible to rent 1000 computers for a couple of minutes) +- 便宜、常开、且有公共IP地址的虚拟机用来托管网站等服务 +- 有大量CPU、磁盘、内存、以及GPU资源的虚拟机 +- 超出用户可以使用的物理主机数量的虚拟机 + - 相比物理主机的固定开支,虚拟机的开支一般按运行的时间计算。所以如果用户只需要在短时间内使用大量算力,租用1000台虚拟机运行几分钟明显更加划算。 -Popular services include [Amazon AWS](https://aws.amazon.com/), [Google -Cloud](https://cloud.google.com/), and -[DigitalOcean](https://www.digitalocean.com/). +受欢迎的VPS服务商有 [Amazon AWS](https://aws.amazon.com/),[Google +Cloud](https://cloud.google.com/),以及 +[DigitalOcean](https://www.digitalocean.com/)。 -If you're a member of MIT CSAIL, you can get free VMs for research purposes -through the [CSAIL OpenStack -instance](https://tig.csail.mit.edu/shared-computing/open-stack/). +MIT CSAIL的成员可以使用 [CSAIL OpenStack +instance](https://tig.csail.mit.edu/shared-computing/open-stack/) +申请免费的虚拟机用于研究。 ## Notebook programming @@ -383,32 +325,11 @@ programming environment that's great for doing math-oriented programming. 
## GitHub -[GitHub](https://github.com/) is one of the most popular platforms for -open-source software development. Many of the tools we've talked about in this -class, from [vim](https://github.com/vim/vim) to -[Hammerspoon](https://github.com/Hammerspoon/hammerspoon), are hosted on -GitHub. It's easy to get started contributing to open-source to help improve -the tools that you use every day. - -There are two primary ways in which people contribute to projects on GitHub: - -- Creating an -[issue](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue). -This can be used to report bugs or request a new feature. Neither of these -involves reading or writing code, so it can be pretty lightweight to do. -High-quality bug reports can be extremely valuable to developers. Commenting on -existing discussions can be helpful too. -- Contribute code through a [pull -request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests). -This is generally more involved than creating an issue. You can -[fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) -a repository on GitHub, clone your fork, create a new branch, make some changes -(e.g. fix a bug or implement a feature), push the branch, and then [create a -pull -request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request). -After this, there will generally be some back-and-forth with the project -maintainers, who will give you feedback on your patch. Finally, if all goes -well, your patch will be merged into the upstream repository. Often times, -larger projects will have a contributing guide, tag beginner-friendly issues, -and some even have mentorship programs to help first-time contributors become -familiar with the project. 
+[GitHub](https://github.com/) 是最受欢迎的开源软件开发平台之一。我们课程中提到的很多工具,从[vim](https://github.com/vim/vim) 到 +[Hammerspoon](https://github.com/Hammerspoon/hammerspoon),都托管在Github上。向你每天使用的开源工具作出贡献其实很简单,下面是两种贡献者们经常使用的方法: + +- 创建一个[议题(issue)](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue)。 +议题可以用来反映软件运行的问题或者请求新的功能。创建议题并不需要创建者阅读或者编写代码,所以它是一个轻量化的贡献方式。高质量的问题报告对于开发者十分重要。在现有的议题发表评论也可以对项目的开发作出贡献。 +- 使用[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)提交代码更改。由于涉及到阅读和编写代码,提交拉取请求总的来说比创建议题更加深入。拉取请求是请求别人把你自己的代码拉取(且合并)到他们的仓库里。很多开源项目仅允许认证的管理者管理项目代码,所以一般需要[复刻(fork)](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)这些项目的上游仓库(upstream repository),在你的Github账号下创建一个内容完全相同但是由你控制的复刻仓库。这样你就可以在这个复刻仓库自由创建新的分支并推送修复问题或者实现新功能的代码。完成修改以后再回到开源项目的Github页面[创建一个拉取请求](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request)。 + +提交请求后,项目管理者会和你交流拉取请求里的代码并给出反馈。如果没有问题,你的代码会和上游仓库中的代码合并。很多大的开源项目会提供贡献指南,容易上手的议题,甚至专门的指导项目来帮助参与者熟悉这些项目。 From 99b50dc70d7f6b06c78f09c2bdb7c94a8263c741 Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Mon, 15 Jun 2020 01:53:06 -0400 Subject: [PATCH 438/640] update potpourri: FUSE --- _2020/potpourri.md | 30 ++++++++++++------------------ 1 file changed, 12 insertions(+), 18 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 1699ad96..7ac0cdbd 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -20,8 +20,8 @@ video: - [Window managers](#window-managers) - [VPN](#vpn) - [Markdown](#markdown) -- [Hammerspoon (desktop automation on macOS)](#hammerspoon-desktop-automation-on-macos) - - [Resources](#resources) +- [Hammerspoon (macOS桌面自动化)](#Hammerspoon%20(macOS%E6%A1%8C%E9%9D%A2%E8%87%AA%E5%8A%A8%E5%8C%96)) + - [资源](#%E8%B5%84%E6%BA%90) - [Booting + Live USBs](#booting--live-usbs) - [Docker, Vagrant, VMs, Cloud, OpenStack](#docker-vagrant-vms-cloud-openstack) - [Notebook 
programming](#notebook-programming) @@ -88,24 +88,18 @@ WantedBy=multi-user.target ## FUSE -Modern software systems are usually composed of smaller building blocks that are composed together. -Your operating system supports using different filesystem backends because there is a common language of what operations a filesystem supports. -For instance, when you run `touch` to create a file, `touch` performs a system call to the kernel to create the file and the kernel performs the appropriate filesystem call to create the given file. -A caveat is that UNIX filesystems are traditionally implemented as kernel modules and only the kernel is allowed to perform filesystem calls. +现在的软件系统一般由很多模块化的组件构建而成。你使用的操作系统可以通过一系列共同的方式使用不同的文件系统上的相似功能。比如当你使用`touch`命令创建文件的时候,`touch`使用系统调用(system call)向内核发出请求。内核再根据文件系统,调用特有的方法来创建文件。这里的问题是,UNIX文件系统在传统上是以内核模块的形式实现,导致只有内核可以进行文件系统相关的调用。 -[FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace) (Filesystem in User Space) allows filesystems to be implemented by a user program. FUSE lets users run user space code for filesystem calls and then bridges the necessary calls to the kernel interfaces. -In practice, this means that users can implement arbitrary functionality for filesystem calls. +[FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace)(Filesystem in User Space)允许运行在用户空间上的程序实现文件系统调用,并将这些调用与内核接口联系起来。在实践中,这意味着用户可以在文件系统调用中实现任意功能。 -For example, FUSE can be used so whenever you perform an operation in a virtual filesystem, that operation is forwarded through SSH to a remote machine, performed there, and the output is returned back to you. -This way, local programs can see the file as if it was in your computer while in reality it's in a remote server. -This is effectively what `sshfs` does. 
+FUSE可以用于实现如:一个将所有文件系统操作都使用SSH转发到远程主机,由远程主机处理后返回结果到本地计算机的虚拟文件系统。这个文件系统里的文件虽然存储在远程主机,对于本地计算机上的软件而言和存储在本地别无二致。`sshfs`就是一个实现了这种功能的FUSE文件系统。 -Some interesting examples of FUSE filesystems are: -- [sshfs](https://github.com/libfuse/sshfs) - Open locally remote files/folder through an SSH connection. -- [rclone](https://rclone.org/commands/rclone_mount/) - Mount cloud storage services like Dropbox, GDrive, Amazon S3 or Google Cloud Storage and open data locally. -- [gocryptfs](https://nuetzlich.net/gocryptfs/) - Encrypted overlay system. Files are stored encrypted but once the FS is mounted they appear as plaintext in the mountpoint. -- [kbfs](https://keybase.io/docs/kbfs) - Distributed filesystem with end-to-end encryption. You can have private, shared and public folders. -- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html) - Mount your deduplicated, compressed and encrypted backups for ease of browsing. +一些有趣的FUSE文件系统包括: +- [sshfs](https://github.com/libfuse/sshfs):使用SSH连接在本地打开远程主机上的文件。 +- [rclone](https://rclone.org/commands/rclone_mount/):将Dropbox、Google Drive、Amazon S3、或者Google Cloud Storage一类的云存储服务挂载到本地系统上。 +- [gocryptfs](https://nuetzlich.net/gocryptfs/):覆盖在加密文件上的文件系统。文件以加密形式保存在磁盘里,但该文件系统挂载后用户可以直接从挂载点访问文件的明文。 +- [kbfs](https://keybase.io/docs/kbfs):分布式端到端加密文件系统。在这个文件系统里有私密(private),共享(shared),以及公开(public)三种类型的文件夹。 +- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html):方便用户浏览删除重复数据后压缩过的加密备份。 ## 备份 @@ -243,7 +237,7 @@ Markdown不仅容易上手,而且应用非常广泛。实际上本课程的课 -## Hammerspoon (desktop automation on macOS) +## Hammerspoon (macOS桌面自动化) [Hammerspoon](https://www.hammerspoon.org/) is a desktop automation framework for macOS. 
It lets you write Lua scripts that hook into operating system From 9784dbb84e1691b4bc41da80914be5f9fb0ee003 Mon Sep 17 00:00:00 2001 From: Shumo Chu Date: Mon, 15 Jun 2020 16:29:37 -0400 Subject: [PATCH 439/640] address comments --- _2020/editors.md | 37 ++++++++++++++++--------------------- 1 file changed, 16 insertions(+), 21 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 17d19375..ad0a4a97 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -52,10 +52,10 @@ Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改 - *正常模式*:在文件中四处移动光标进行修改 - *插入模式*:插入文本 - *替换模式*:替换文本 -- *可视(一般,行,块)模式*:选中文本块 +- *可视化(一般,行,块)模式*:选中文本块 - *命令模式*:用于执行命令 -在不同的操作模式, 键盘敲击的含义也不同。比如,`x` 在插入模式会插入字母`x`,但是在正常模式 +在不同的操作模式下, 键盘敲击的含义也不同。比如,`x` 在插入模式会插入字母`x`,但是在正常模式 会删除当前光标所在下的字母,在可视模式下则会删除选中文块。 在默认设置下,Vim会在左下角显示当前的模式。 Vim启动时的默认模式是正常模式。通常你会把大部分 @@ -65,7 +65,7 @@ Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改 模式, `R` 进入替换模式, `v` 进入可视(一般)模式, `V` 进入可视(行)模式, `<C-v>` (Ctrl-V, 有时也写作 `^V`), `:` 进入命令模式。 -因为你会在使用 Vim 时大量使用 `<ESC>` 键,考虑把大小写锁定键重定义成逃脱键 ([MacOS 教程](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS) )。 +因为你会在使用 Vim 时大量使用 `<ESC>` 键,可以考虑把大小写锁定键重定义成逃脱键 ([MacOS 教程](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS) )。 # 基本操作 @@ -106,8 +106,8 @@ Vim 最重要的设计思想是 Vim 的界面本身是一个程序语言。 键 ## 移动 -你应该会大部分时间在正常模式下,使用移动命令在缓存中导航。在 Vim 里面移动也被成为 “名词”, -因为他们指向文字块。 +多数时候你会在正常模式下,使用移动命令在缓存中导航。在 Vim 里面移动也被称为 “名词”, +因为它们指向文字块。 - 基本移动: `hjkl` (左, 下, 上, 右) - 词: `w` (下一个词), `b` (词初), `e` (词尾) @@ -232,10 +232,6 @@ def main(): Vim 由一个位于 `~/.vimrc` 的文本配置文件 (包含 Vim 脚本命令)。 你可能会启用很多基本 设置。 -We are providing a well-documented basic config that you can use as a starting -point. We recommend using this because it fixes some of Vim's quirky default -behavior.
- 我们提供一个文档详细的基本设置, 你可以用它当作你的初始设置。 我们推荐使用这个设置因为 它修复了一些 Vim 默认设置奇怪行为。 **在 [这儿](/2020/files/vimrc) 下载我们的设置, 然后将它保存成 @@ -262,9 +258,8 @@ Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, - [nerdtree](https://github.com/scrooloose/nerdtree): 文件浏览器 - [vim-easymotion](https://github.com/easymotion/vim-easymotion): 魔术操作 -We're trying to avoid giving an overwhelmingly long list of plugins here. You -can check out the instructors' dotfiles -我们尽量避免在这里提供一长串插件。 你可以查看授课人们的点文件 + +我们尽量避免在这里提供一份冗长的插件列表。 你可以查看讲师们的开源的配置文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), [Jose](https://github.com/JJGO/dotfiles)) to see what other plugins we use. @@ -305,18 +300,18 @@ Readline](https://tiswww.case.edu/php/chet/readline/rltop.html) 库来作为 set editing-mode vi ``` -在这个设置下, 比如, Python REPL 会支持 Vim 快捷键。 +比如, 在这个设置下, Python REPL 会支持 Vim 快捷键。 ## 其他 There are even vim keybinding extensions for web -甚至有 Vim 的网页浏览键盘绑定扩展 +甚至有 Vim 的网页浏览快捷键 [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), 受欢迎的有 用于 Google Chrome 的 [Vimium](https://chrome.google.com/webstore/detail/vimium/dbepggeogbaibhgnhhndojpepiihcmeb?hl=en) 和用于 Firefox 的 [Tridactyl](https://github.com/tridactyl/tridactyl)。 你甚至可以在 [Jupyter -notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定。 +notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 快捷键。 # Vim 进阶 @@ -336,11 +331,11 @@ notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定 ## 多窗口 - 用 `:sp` / `:vsp` 来分割窗口 -- 能有一个缓存的多个视角。 +- 同一个缓存可以在多个窗口中显示。 ## 宏 -- `q{字符}` 来开始在寄存器 `{字符` 中录制宏 +- `q{字符}` 来开始在寄存器 `{字符}` 中录制宏 - `q` 停止录制 - `@{字符}` 重放宏 - 宏的执行遇错误会停止 @@ -360,13 +355,13 @@ notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定 - Vim 命令 / 宏 - `Gdd`, `ggdd` 删除第一行和最后一行 - 格式化最后一个元素的宏 (寄存器 `e`) - - 到有 `<name>` 的行 + - 跳转到有 `<name>` 的行 - `qe^r"f>s": "<ESC>f<C"<ESC>q` - 格式化一个人的宏 - - 到有 `<person>` 的行 + - 跳转到有 `<person>` 的行 - `qpS{<ESC>j@eA,<ESC>j@ejS},<ESC>q` - 格式化一个人然后转到另外一个人的宏 - - 到有 `<person>` 的行 + - 跳转到有 `<person>` 的行 - `qq@pjq` - 执行宏到文件尾 - `999@q` @@
-405,7 +400,7 @@ notebooks](https://github.com/lambdalisue/jupyter-vim-binding) 中用 Vim 绑定 1. 练习使用 Vim, 在你自己的机器上重做 [演示](#demo)。 1. 下个月用 Vim 做你 _所有_ 文件编辑。 每当不够高效的时候, 或者你感觉 “一定有一个更好的方式”, 尝试求助搜索引擎, 很有可能有一个更好的方式。 如果你遇到难题, 来我们的答疑时间或者给我们发邮件。 -1. 在你的其他工具中设置 Vim 绑定 (见上面的操作指南)。 +1. 在你的其他工具中设置 Vim 快捷键 (见上面的操作指南)。 1. 进一步自定义你的 `~/.vimrc` 和安装更多插件。 1. (高阶) 用 Vim 宏将 XML 转换到 JSON ([例子文件](/2020/files/example-data.xml))。 尝试着先完全自己做, 但是在你卡住的时候可以查看上面 From 40d3b5955de31773e7e847089867ffdce59e2c4f Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Mon, 15 Jun 2020 17:26:56 -0400 Subject: [PATCH 440/640] potpourri update: daemon, window manager, hammerspoon --- _2020/potpourri.md | 87 +++++++++++++++++++--------------------------- 1 file changed, 35 insertions(+), 52 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 7ac0cdbd..21ac8935 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -17,7 +17,7 @@ video: - [备份](#%E5%A4%87%E4%BB%BD) - [APIs](#apis) - [Common command-line flags/patterns](#common-command-line-flagspatterns) -- [Window managers](#window-managers) +- [窗口管理器](#%E7%AA%97%E5%8F%A3%E7%AE%A1%E7%90%86%E5%99%A8) - [VPN](#vpn) - [Markdown](#markdown) - [Hammerspoon (macOS桌面自动化)](#Hammerspoon%20(macOS%E6%A1%8C%E9%9D%A2%E8%87%AA%E5%8A%A8%E5%8C%96)) @@ -53,34 +53,38 @@ video: ## 守护进程 -You are probably already familiar with the notion of daemons, even if the word seems new. -Most computers have a series of processes that are always running in the background rather than waiting for a user to launch them and interact with them. -These processes are called daemons and the programs that run as daemons often end with a `d` to indicate so. -For example `sshd`, the SSH daemon, is the program responsible for listening to incoming SSH requests and checking that the remote user has the necessary credentials to log in. 
+即便守护进程(daemon)这个词看上去有些陌生,你应该已经大约明白它的概念。大部分计算机都有一系列在后台保持运行,不需要用户手动运行或者交互的进程。这些进程就是守护进程。以守护进程运行的程序名一般以`d`结尾,比如SSH服务端`sshd`,用来监听传入的SSH连接请求并对用户进行鉴权。 -In Linux, `systemd` (the system daemon) is the most common solution for running and setting up daemon processes. -You can run `systemctl status` to list the current running daemons. Most of them might sound unfamiliar but are responsible for core parts of the system such as managing the network, solving DNS queries or displaying the graphical interface for the system. -Systemd can be interacted with the `systemctl` command in order to `enable`, `disable`, `start`, `stop`, `restart` or check the `status` of services (those are the `systemctl` commands). +Linux中的`systemd`(the system daemon)是最常用的配置和运行守护进程的方法。运行`systemctl status`命令可以看到正在运行的所有守护进程。这里面有很多可能你没有见过,但是掌管了系统的核心部分的进程:管理网络、DNS解析、显示系统的图形界面等等。用户使用`systemctl`命令和`systemd`交互来`enable`(启用)、`disable`(禁用)、`start`(启动)、`stop`(停止)、`restart`(重启)、或者`status`(检查)配置好的守护进程及系统服务。 -More interestingly, `systemd` has a fairly accessible interface for configuring and enabling new daemons (or services). -Below is an example of a daemon for running a simple Python app. -We won't go in the details but as you can see most of the fields are pretty self explanatory. 
+`systemd`提供了一个很方便的界面用于配置和启用新的守护进程或系统服务。下面的配置文件使用了守护进程来运行一个简单的Python程序。文件的内容非常直接所以我们不对它详细阐述。`systemd`配置文件的详细指南可参见[freedesktop.org](https://www.freedesktop.org/software/systemd/man/systemd.service.html)。 ```ini # /etc/systemd/system/myapp.service [Unit] +# 配置文件描述 Description=My Custom App +# 在网络服务启动后启动该进程 After=network.target [Service] +# 运行该进程的用户 User=foo +# 运行该进程的用户组 Group=foo +# 运行该进程的根目录 WorkingDirectory=/home/foo/projects/mydaemon +# 开始该进程的命令 ExecStart=/usr/bin/local/python3.7 app.py +# 在出现错误时重启该进程 Restart=on-failure [Install] +# 相当于Windows的开机启动。即使GUI没有启动,该进程也会加载并运行 WantedBy=multi-user.target +# 如果该进程仅需要在GUI活动时运行,这里应写作: +# WantedBy=graphical.target +# graphical.target在multi-user.target的基础上运行和GUI相关的服务 ``` 如果你只是想定期运行一些程序,可以直接使用[`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)。它是一个系统内置的,用来执行定期任务的守护进程。 @@ -96,14 +100,14 @@ FUSE可以用于实现如:一个将所有文件系统操作都使用SSH转发 一些有趣的FUSE文件系统包括: - [sshfs](https://github.com/libfuse/sshfs):使用SSH连接在本地打开远程主机上的文件。 -- [rclone](https://rclone.org/commands/rclone_mount/):将Dropbox、Google Drive、Amazon S3、或者Google Cloud Storage一类的云存储服务挂载到本地系统上。 +- [rclone](https://rclone.org/commands/rclone_mount/):将Dropbox、Google Drive、Amazon S3、或者Google Cloud Storage一类的云存储服务挂载为本地文件系统。 - [gocryptfs](https://nuetzlich.net/gocryptfs/):覆盖在加密文件上的文件系统。文件以加密形式保存在磁盘里,但该文件系统挂载后用户可以直接从挂载点访问文件的明文。 - [kbfs](https://keybase.io/docs/kbfs):分布式端到端加密文件系统。在这个文件系统里有私密(private),共享(shared),以及公开(public)三种类型的文件夹。 -- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html):方便用户浏览删除重复数据后压缩过的加密备份。 +- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html):方便用户浏览删除重复数据后的压缩加密备份。 ## 备份 -任何没有备份的数据都可能在一个瞬间永远消失。复制数据很简单,但是可靠的备份数据很难。下面列举了一些关于备份的基础知识,以及一些备份方法容易掉进的陷阱。 +任何没有备份的数据都可能在一个瞬间永远消失。复制数据很简单,但是可靠地备份数据很难。下面列举了一些关于备份的基础知识,以及一些常见做法容易掉进的陷阱。 首先,复制存储在同一个磁盘上的数据不是备份,因为这个磁盘是一个单点故障(single point of failure)。这个磁盘一旦出现问题,所有的数据都可能丢失。放在家里的外置磁盘因为火灾、抢劫等原因可能会和源数据一起丢失,所以是一个弱备份。推荐的做法是将数据备份到不同的地点存储。 @@ -183,26 +187,13 @@ features though that can be good to be aware of: 
you pass things that look like flags without them being interpreted as such: `rm -- -r` or `ssh machine --for-ssh -- foo --for-foo`. -## Window managers - -Most of you are used to using a "drag and drop" window manager, like -what comes with Windows, macOS, and Ubuntu by default. There are windows -that just sort of hang there on screen, and you can drag them around, -resize them, and have them overlap one another. But these are only one -_type_ of window manager, often referred to as a "floating" window -manager. There are many others, especially on Linux. A particularly -common alternative is a "tiling" window manager. In a tiling window -manager, windows never overlap, and are instead arranged as tiles on -your screen, sort of like panes in tmux. With a tiling window manager, -the screen is always filled by whatever windows are open, arranged -according to some _layout_. If you have just one window, it takes up the -full screen. If you then open another, the original window shrinks to -make room for it (often something like 2/3 and 1/3). If you open a -third, the other windows will again shrink to accommodate the new -window. Just like with tmux panes, you can navigate around these tiled -windows with your keyboard, and you can resize them and move them -around, all without touching the mouse. They are worth looking into! +## 窗口管理器 +大部分人适应了Windows、macOS、以及Ubuntu默认的“拖拽”式窗口管理器。这些窗口管理器的窗口一般就堆在屏幕上,你可以拖拽改变窗口的位置、缩放窗口、以及让窗口堆叠在一起。这种堆叠式(floating/stacking)管理器只是窗口管理器中的一种。特别在Linux中,有很多种其他的管理器。 + +平铺式(tiling)管理器就是一个常见的替代。顾名思义,平铺式管理器会把不同的窗口像贴瓷砖一样平铺在一起而不和其他窗口重叠。这和 [tmux](https://github.com/tmux/tmux) 管理终端窗口的方式类似。平铺式管理器按照写好的布局显示打开的窗口。如果只打开一个窗口,它会填满整个屏幕。新开一个窗口的时候,原来的窗口会缩小到比如三分之二或者三分之一的大小来腾出空间。打开更多的窗口会让已有的窗口进一步调整。 + +就像tmux那样,平铺式管理器可以让你在完全不使用鼠标的情况下使用键盘切换、缩放、以及移动窗口。它们值得一试! ## VPN @@ -239,30 +230,22 @@ Markdown不仅容易上手,而且应用非常广泛。实际上本课程的课 ## Hammerspoon (macOS桌面自动化) -[Hammerspoon](https://www.hammerspoon.org/) is a desktop automation framework -for macOS. 
It lets you write Lua scripts that hook into operating system -functionality, allowing you to interact with the keyboard/mouse, windows, -displays, filesystem, and much more. +[Hammerspoon](https://www.hammerspoon.org/)是面向macOS的一个桌面自动化框架。它允许用户编写和操作系统功能挂钩的Lua脚本,从而与键盘、鼠标、窗口、文件系统等交互。 -Some examples of things you can do with Hammerspoon: +下面是Hammerspoon的一些示例应用: -- Bind hotkeys to move windows to specific locations -- Create a menu bar button that automatically lays out windows in a specific layout -- Mute your speaker when you arrive in lab (by detecting the WiFi network) -- Show you a warning if you've accidentally taken your friend's power supply +- 绑定将窗口移动到特定位置的快捷键 +- 创建可以自动将窗口整理成特定布局的菜单栏按钮 +- 在你到实验室以后,通过检测所连接的WiFi网络自动静音扬声器 +- 在你不小心拿了朋友的充电器时弹出警告 -At a high level, Hammerspoon lets you run arbitrary Lua code, bound to menu -buttons, key presses, or events, and Hammerspoon provides an extensive library -for interacting with the system, so there's basically no limit to what you can -do with it. Many people have made their Hammerspoon configurations public, so -you can generally find what you need by searching the internet, but you can -always write your own code from scratch.
+从用户的角度,Hammerspoon可以运行任意Lua代码,绑定菜单栏按钮、按键、或者事件。Hammerspoon提供了一个全面的用于和系统交互的库,因此它能没有限制地实现任何功能。你可以从头编写自己的Hammerspoon配置,也可以结合别人公布的配置来满足自己的需求。 -### Resources +### 资源 -- [Getting Started with Hammerspoon](https://www.hammerspoon.org/go/) -- [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations) -- [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon) +- [Getting Started with Hammerspoon](https://www.hammerspoon.org/go/):Hammerspoon官方教程 +- [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations):Hammerspoon官方示例配置 +- [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon):Anish的Hammerspoon配置 ## Booting + Live USBs From f700b83e99ac66dcb13f0dba770bfe65d3bc8ab8 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 16 Jun 2020 12:31:31 +0800 Subject: [PATCH 441/640] publish editor.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index ad0a4a97..42cf1a0e 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -2,7 +2,7 @@ layout: lecture title: "编辑器 (Vim)" date: 2019-01-15 -ready: false +ready: true video: aspect: 56.25 id: a6Q8Na575qc From c043840a066d99d4b0a29523a33e14e23eba1e8e Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 16 Jun 2020 12:36:12 +0800 Subject: [PATCH 442/640] Update editors.md --- _2020/editors.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 42cf1a0e..97f27714 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -199,7 +199,8 @@ def main(): - 在 5 的整数倍的时候打印 "fizz" - 采用硬编码的参数 10 而不是从命令控制行读取参数 -{% 注释 %} +{% comment %} + - 主函数没有被调用 - `G` 文件尾 - `o` 向下打开一个新行 @@ -222,7 +223,9 @@ def main(): - "import sys" - `/10` - `ci(` to "int(sys.argv[1])" -{% 注释 %} + +{% comment %} + 展示详情请观看课程视频。 比较上面用 Vim 的操作和你可能使用其他程序的操作。 值得一提的是 Vim 需要很少的键盘操作,允许你编辑的速度跟上你思维的速度。 
From 1d414669d2a5b03a7811a680eac9a887c190879b Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 16 Jun 2020 20:48:03 +0800 Subject: [PATCH 443/640] Update editors.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 97f27714..91b3fecb 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -224,7 +224,7 @@ def main(): - `/10` - `ci(` to "int(sys.argv[1])" -{% comment %} +{% endcomment %} 展示详情请观看课程视频。 比较上面用 Vim 的操作和你可能使用其他程序的操作。 From 90eafa4b40314d7710e1a009956c5c57b0e2f7e8 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Tue, 16 Jun 2020 20:50:44 +0800 Subject: [PATCH 444/640] Update editors.md --- _2020/editors.md | 9 --------- 1 file changed, 9 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 91b3fecb..d1c5e3ed 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -272,20 +272,12 @@ plugins"。 # 其他程序的 Vim 模式 -Many tools support Vim emulation. The quality varies from good to great; -depending on the tool, it may not support the fancier Vim features, but most -cover the basics pretty well. 很多工具提供了 Vim 模式。 这些 Vim 模式的质量参差不齐; 取决于具体工具, 有的提供了 很多酷炫的 Vim 功能, 但是大多数对基本功能支持的很好。 ## Shell -If you're a Bash user, use `set -o vi`. If you use Zsh, `bindkey -v`. For Fish, -`fish_vi_key_bindings`. Additionally, no matter what shell you use, you can -`export EDITOR=vim`. This is the environment variable used to decide which -editor is launched when a program wants to start an editor. For example, `git` -will use this editor for commit messages. 
如果你是一个 Bash 用户, 用 `set -o vi`。 如果你用 Zsh: `bindkey -v`。 Fish 用 `fish_vi_key_bindings`。 另外, 不管利用什么 shell, 你可以 @@ -307,7 +299,6 @@ set editing-mode vi ## 其他 -There are even vim keybinding extensions for web 甚至有 Vim 的网页浏览快捷键 [browsers](http://vim.wikia.com/wiki/Vim_key_bindings_for_web_browsers), 受欢迎的有 用于 Google Chrome 的 From 73cb0965ba9104eace777cc4e8deaeadb4bdab89 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Wed, 17 Jun 2020 20:54:27 +0800 Subject: [PATCH 445/640] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index b17dd5e0..3db8f719 100644 --- a/README.md +++ b/README.md @@ -28,7 +28,7 @@ To contribute to this tanslation project, please book your topic by creating an | ---- | ---- |---- | | [course-shell.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/course-shell.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [shell-tools.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/shell-tools.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | -| [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | In-progress | +| [editors.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/editors.md) | [@stechu](https://github.com/stechu) | Done | | [data-wrangling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/data-wrangling.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [command-line.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/command-line.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [version-control.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/version-control.md) | [@Lingfeng 
AI](https://github.com/hanxiaomax) | Done | From 2f9b6d416fe4fdbfb5a970cf31e0c5468a2b28d3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Thu, 18 Jun 2020 14:36:57 +0800 Subject: [PATCH 446/640] Update shell-tools.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 修正翻译内容 --- _2020/shell-tools.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index e7196b13..17952b02 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -122,7 +122,7 @@ cp /path/to/project/foo.sh /path/to/project/bar.sh /path/to/project/baz.sh /newp # 也可以结合通配使用 mv *{.py,.sh} folder -# 会删除所有 *.py 和 *.sh 文件 +# 会移动所有 *.py 和 *.sh 文件 mkdir foo bar @@ -188,13 +188,13 @@ shell函数和脚本有如下一些不同点: 程序员们面对的最常见的重复任务就是查找文件或目录。所有的类UNIX系统都包含一个名为 [`find`](http://man7.org/linux/man-pages/man1/find.1.html)的工具,它是shell上用于查找文件的绝佳工具。`find`命令会递归地搜索符合条件的文件,例如: ```bash -# Find all directories named src +# 查找所有名称为src的文件夹 find . -name src -type d -# Find all python files that have a folder named test in their path +# 查找所有文件夹路径中包含test的python文件 find . -path '**/test/**/*.py' -type f -# Find all files modified in the last day +# 查找前一天修改的所有文件 find . -mtime -1 -# Find all zip files with size in range 500k to 10M +# 查找所有大小在500k至10M的tar.gz文件 find . -size +500k -size -10M -name '*.tar.gz' ``` 除了列出所寻找的文件之外,find还能对所有查找到的文件进行操作。这能极大地简化一些单调的任务。 @@ -233,13 +233,13 @@ find . -name '*.png' -exec convert {} {.}.jpg \; 因此也出现了很多它的替代品,包括 [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) 和 [rg](https://github.com/BurntSushi/ripgrep)。它们都特别好用,但是功能也都差不多,我比较常用的是 ripgrep (`rg`) ,因为它速度快,而且用法非常符合直觉。例子如下: ```bash -# Find all python files where I used the requests library +# 查找所有使用了requests库的文件 rg -t py 'import requests' -# Find all files (including hidden files) without a shebang line +# 查找所有没有写shebang的文件(包含隐藏文件) rg -u --files-without-match "^#!" 
-# Find all matches of foo and print the following 5 lines +# 查找所有包含了foo,并打印其之后的5行 rg foo -A 5 -# Print statistics of matches (# of matched lines and files ) +# 打印匹配的统计信息(匹配的行和文件的数量) rg --stats PATTERN ``` From d9014a25539061c2a1f30dcce1274d6c7a08d249 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Sat, 20 Jun 2020 14:50:55 +0800 Subject: [PATCH 447/640] Update _2020/shell-tools.md Co-authored-by: Lingfeng_Ai --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 17952b02..e2e269f0 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -233,7 +233,7 @@ find . -name '*.png' -exec convert {} {.}.jpg \; 因此也出现了很多它的替代品,包括 [ack](https://beyondgrep.com/), [ag](https://github.com/ggreer/the_silver_searcher) 和 [rg](https://github.com/BurntSushi/ripgrep)。它们都特别好用,但是功能也都差不多,我比较常用的是 ripgrep (`rg`) ,因为它速度快,而且用法非常符合直觉。例子如下: ```bash -# 查找所有使用了requests库的文件 +# 查找所有使用了 requests 库的文件 rg -t py 'import requests' # 查找所有没有写shebang的文件(包含隐藏文件) rg -u --files-without-match "^#!" From 78801e3b2bd4f9497f45f23bcb6a44d26b9862bb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Sat, 20 Jun 2020 14:51:02 +0800 Subject: [PATCH 448/640] Update _2020/shell-tools.md Co-authored-by: Lingfeng_Ai --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index e2e269f0..04d64483 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -235,7 +235,7 @@ find . -name '*.png' -exec convert {} {.}.jpg \; ```bash # 查找所有使用了 requests 库的文件 rg -t py 'import requests' -# 查找所有没有写shebang的文件(包含隐藏文件) +# 查找所有没有写 shebang 的文件(包含隐藏文件) rg -u --files-without-match "^#!" 
# 查找所有包含了foo,并打印其之后的5行 rg foo -A 5 From c091240b7314529af112dd0cab0a4ad6ca25b8dc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Sat, 20 Jun 2020 14:51:09 +0800 Subject: [PATCH 449/640] Update _2020/shell-tools.md Co-authored-by: Lingfeng_Ai --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 04d64483..fee2c47e 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -237,7 +237,7 @@ find . -name '*.png' -exec convert {} {.}.jpg \; rg -t py 'import requests' # 查找所有没有写 shebang 的文件(包含隐藏文件) rg -u --files-without-match "^#!" -# 查找所有包含了foo,并打印其之后的5行 +# 查找所有的foo字符串,并打印其之后的5行 rg foo -A 5 # 打印匹配的统计信息(匹配的行和文件的数量) rg --stats PATTERN From 1d2fe502fb6391ab14db841bda836add47abc813 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 20 Jun 2020 19:27:32 +0800 Subject: [PATCH 450/640] Update index.md --- index.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/index.md b/index.md index 08cad99d..06e839b1 100644 --- a/index.md +++ b/index.md @@ -1,10 +1,9 @@ --- layout: page -title: The Missing Semester of Your CS Education 中文版 +title: 计算机教育中缺失的一课 --- - - +# The Missing Semester of Your CS Education 中文版 对于计算机教育来说,从操作系统到机器学习,这些高大上课程和主题已经非常多了。然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 这部分内容就是:精通工具。在这个系列课程中,我们讲授命令行、强大的文本编辑器的使用、使用版本控制系统提供的多种特性等等。 From 5c8daa96b46ec54ed26ef20f598c2b7b8936835e Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 20 Jun 2020 19:52:20 +0800 Subject: [PATCH 451/640] Update index.md --- index.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/index.md b/index.md index 06e839b1..21220a6a 100644 --- a/index.md +++ b/index.md @@ -5,10 +5,7 @@ title: 计算机教育中缺失的一课 # The Missing Semester of Your CS Education 中文版 -对于计算机教育来说,从操作系统到机器学习,这些高大上课程和主题已经非常多了。然而有一个至关重要的主题却很少被专门讲授,而是留给学生们自己去探索。 -这部分内容就是:精通工具。在这个系列课程中,我们讲授命令行、强大的文本编辑器的使用、使用版本控制系统提供的多种特性等等。 - -学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 
+大学里的计算机课程通常专注于讲授从操作系统到机器学习这些学院派的课程或主题,而对于如何精通工具这一主题则往往会留给学生自行探索。在这个系列课程中,我们讲授命令行、强大的文本编辑器的使用、使用版本控制系统提供的多种特性等等。学生在他们受教育阶段就会和这些工具朝夕相处(在他们的职业生涯中更是这样)。 因此,花时间打磨使用这些工具的能力并能够最终熟练地、流畅地使用它们是非常有必要的。 精通这些工具不仅可以帮助您更快的使用工具完成任务,并且可以帮助您解决在之前看来似乎无比复杂的问题。 From a219781e4c8060fd887145b6e70661409f87703e Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Sat, 20 Jun 2020 19:58:21 +0800 Subject: [PATCH 452/640] Update nav.html --- _includes/nav.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_includes/nav.html b/_includes/nav.html index 815cfc71..16d8dcfe 100644 --- a/_includes/nav.html +++ b/_includes/nav.html @@ -8,7 +8,7 @@ {% comment %} {% endcomment %} \ No newline at end of file +

    {% endcomment %} From 17779540f64d10a0f897ba8e6b99036a28bbe8f4 Mon Sep 17 00:00:00 2001 From: Yi Zhang Date: Sun, 21 Jun 2020 21:57:53 -0400 Subject: [PATCH 453/640] potpourri update: completed translation, some punctuation and formatting changes --- _2020/potpourri.md | 235 ++++++++++++++++++--------------------------- 1 file changed, 91 insertions(+), 144 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 21ac8935..c44138a6 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -1,7 +1,7 @@ --- layout: lecture title: "大杂烩" -date: 2019-01-29 +date: 2020-06-21 ready: false video: aspect: 56.25 @@ -15,34 +15,34 @@ video: - [守护进程](#%E5%AE%88%E6%8A%A4%E8%BF%9B%E7%A8%8B) - [FUSE](#fuse) - [备份](#%E5%A4%87%E4%BB%BD) -- [APIs](#apis) -- [Common command-line flags/patterns](#common-command-line-flagspatterns) +- [API(应用程序接口)](#API%EF%BC%88%E5%BA%94%E7%94%A8%E7%A8%8B%E5%BA%8F%E6%8E%A5%E5%8F%A3%EF%BC%89) +- [常见命令行标志参数及模式](#%E5%B8%B8%E8%A7%81%E5%91%BD%E4%BB%A4%E8%A1%8C%E6%A0%87%E5%BF%97%E5%8F%82%E6%95%B0%E5%8F%8A%E6%A8%A1%E5%BC%8F) - [窗口管理器](#%E7%AA%97%E5%8F%A3%E7%AE%A1%E7%90%86%E5%99%A8) - [VPN](#vpn) - [Markdown](#markdown) - [Hammerspoon (macOS桌面自动化)](#Hammerspoon%20(macOS%E6%A1%8C%E9%9D%A2%E8%87%AA%E5%8A%A8%E5%8C%96)) - [资源](#%E8%B5%84%E6%BA%90) -- [Booting + Live USBs](#booting--live-usbs) +- [开机引导以及 Live USB](#%E5%BC%80%E6%9C%BA%E5%BC%95%E5%AF%BC%E4%BB%A5%E5%8F%8A%20Live%20USB) - [Docker, Vagrant, VMs, Cloud, OpenStack](#docker-vagrant-vms-cloud-openstack) -- [Notebook programming](#notebook-programming) +- [交互式记事本编程](#%E4%BA%A4%E4%BA%92%E5%BC%8F%E8%AE%B0%E4%BA%8B%E6%9C%AC%E7%BC%96%E7%A8%8B) - [GitHub](#github) ## 修改键位映射 -作为一名程序员,键盘是你的主要输入工具。它像电脑里的其他部件一样是可配置的,而且值得你在这上面花时间。 +作为一名程序员,键盘是你的主要输入工具。它像计算机里的其他部件一样是可配置的,而且值得你在这上面花时间。 -一个很常见的配置是修改键位映射。通常这个功能由在电脑上运行的软件实现。当某一个按键被按下,软件截获键盘发出的按键事件(keypress event)并使用另外一个事件取代。比如: -- 将Caps Lock映射为Ctrl或者Escape。Caps Lock使用了键盘上一个非常方便的位置而它的功能却很少被用到,所以我们(讲师)非常推荐这个修改。 -- 将PrtSc映射为播放/暂停。大部分操作系统支持播放/暂停键。 -- 
交换Ctrl和Meta键(Windows的徽标键或者Mac的Command键)。 +一个很常见的配置是修改键位映射。通常这个功能由在计算机上运行的软件实现。当某一个按键被按下,软件截获键盘发出的按键事件(keypress event)并使用另外一个事件取代。比如: +- 将 Caps Lock 映射为 Ctrl 或者 Escape:Caps Lock 使用了键盘上一个非常方便的位置而它的功能却很少被用到,所以我们(讲师)非常推荐这个修改; +- 将 PrtSc 映射为播放/暂停:大部分操作系统支持播放/暂停键; +- 交换 Ctrl 和 Meta 键(Windows 的徽标键或者 Mac 的 Command 键)。 你也可以将键位映射为任意常用的指令。软件监听到特定的按键组合后会运行设定的脚本。 -- 打开一个新的终端或者浏览器窗口。 -- 输出特定的字符串,比如:一个超长邮件地址或者MIT ID。 -- 使电脑或者显示器进入睡眠模式。 +- 打开一个新的终端或者浏览器窗口; +- 输出特定的字符串,比如:一个超长邮件地址或者 MIT ID; +- 使计算机或者显示器进入睡眠模式。 甚至更复杂的修改也可以通过软件实现: -- 映射按键顺序,比如:按Shift键五下切换大小写锁定。 -- 区别映射单点和长按,比如:单点Caps Lock映射为Escape,而长按Caps Lock映射为Ctrl。 +- 映射按键顺序,比如:按 Shift 键五下切换大小写锁定; +- 区别映射单点和长按,比如:单点 Caps Lock 映射为 Escape,而长按 Caps Lock 映射为 Ctrl; - 对不同的键盘或软件保存专用的映射配置。 下面是一些修改键位映射的软件: @@ -53,11 +53,11 @@ video: ## 守护进程 -即便守护进程(daemon)这个词看上去有些陌生,你应该已经大约明白它的概念。大部分计算机都有一系列在后台保持运行,不需要用户手动运行或者交互的进程。这些进程就是守护进程。以守护进程运行的程序名一般以`d`结尾,比如SSH服务端`sshd`,用来监听传入的SSH连接请求并对用户进行鉴权。 +即便守护进程(daemon)这个词看上去有些陌生,你应该已经大约明白它的概念。大部分计算机都有一系列在后台保持运行,不需要用户手动运行或者交互的进程。这些进程就是守护进程。以守护进程运行的程序名一般以 `d` 结尾,比如 SSH 服务端 `sshd`,用来监听传入的 SSH 连接请求并对用户进行鉴权。 -Linux中的`systemd`(the system daemon)是最常用的配置和运行守护进程的方法。运行`systemctl status`命令可以看到正在运行的所有守护进程。这里面有很多可能你没有见过,但是掌管了系统的核心部分的进程:管理网络、DNS解析、显示系统的图形界面等等。用户使用`systemctl`命令和`systemd`交互来`enable`(启用)、`disable`(禁用)、`start`(启动)、`stop`(停止)、`restart`(重启)、或者`status`(检查)配置好的守护进程及系统服务。 +Linux 中的 `systemd`(the system daemon)是最常用的配置和运行守护进程的方法。运行 `systemctl status` 命令可以看到正在运行的所有守护进程。这里面有很多可能你没有见过,但是掌管了系统的核心部分的进程:管理网络、DNS解析、显示系统的图形界面等等。用户使用 `systemctl` 命令和 `systemd` 交互来`enable`(启用)、`disable`(禁用)、`start`(启动)、`stop`(停止)、`restart`(重启)、或者`status`(检查)配置好的守护进程及系统服务。 -`systemd`提供了一个很方便的界面用于配置和启用新的守护进程或系统服务。下面的配置文件使用了守护进程来运行一个简单的Python程序。文件的内容非常直接所以我们不对它详细阐述。`systemd`配置文件的详细指南可参见[freedesktop.org](https://www.freedesktop.org/software/systemd/man/systemd.service.html)。 +`systemd` 提供了一个很方便的界面用于配置和启用新的守护进程或系统服务。下面的配置文件使用了守护进程来运行一个简单的 Python 程序。文件的内容非常直接所以我们不对它详细阐述。`systemd` 配置文件的详细指南可参见 
[freedesktop.org](https://www.freedesktop.org/software/systemd/man/systemd.service.html)。 ```ini # /etc/systemd/system/myapp.service @@ -87,23 +87,23 @@ WantedBy=multi-user.target # graphical.target在multi-user.target的基础上运行和GUI相关的服务 ``` -如果你只是想定期运行一些程序,可以直接使用[`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)。它是一个系统内置的,用来执行定期任务的守护进程。 +如果你只是想定期运行一些程序,可以直接使用 [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)。它是一个系统内置的,用来执行定期任务的守护进程。 ## FUSE -现在的软件系统一般由很多模块化的组件构建而成。你使用的操作系统可以通过一系列共同的方式使用不同的文件系统上的相似功能。比如当你使用`touch`命令创建文件的时候,`touch`使用系统调用(system call)向内核发出请求。内核再根据文件系统,调用特有的方法来创建文件。这里的问题是,UNIX文件系统在传统上是以内核模块的形式实现,导致只有内核可以进行文件系统相关的调用。 +现在的软件系统一般由很多模块化的组件构建而成。你使用的操作系统可以通过一系列共同的方式使用不同的文件系统上的相似功能。比如当你使用 `touch` 命令创建文件的时候,`touch` 使用系统调用(system call)向内核发出请求。内核再根据文件系统,调用特有的方法来创建文件。这里的问题是,UNIX 文件系统在传统上是以内核模块的形式实现,导致只有内核可以进行文件系统相关的调用。 [FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace)(Filesystem in User Space)允许运行在用户空间上的程序实现文件系统调用,并将这些调用与内核接口联系起来。在实践中,这意味着用户可以在文件系统调用中实现任意功能。 -FUSE可以用于实现如:一个将所有文件系统操作都使用SSH转发到远程主机,由远程主机处理后返回结果到本地计算机的虚拟文件系统。这个文件系统里的文件虽然存储在远程主机,对于本地计算机上的软件而言和存储在本地别无二致。`sshfs`就是一个实现了这种功能的FUSE文件系统。 +FUSE 可以用于实现如:一个将所有文件系统操作都使用 SSH 转发到远程主机,由远程主机处理后返回结果到本地计算机的虚拟文件系统。这个文件系统里的文件虽然存储在远程主机,对于本地计算机上的软件而言和存储在本地别无二致。`sshfs`就是一个实现了这种功能的 FUSE 文件系统。 -一些有趣的FUSE文件系统包括: -- [sshfs](https://github.com/libfuse/sshfs):使用SSH连接在本地打开远程主机上的文件。 -- [rclone](https://rclone.org/commands/rclone_mount/):将Dropbox、Google Drive、Amazon S3、或者Google Cloud Storage一类的云存储服务挂载为本地文件系统。 -- [gocryptfs](https://nuetzlich.net/gocryptfs/):覆盖在加密文件上的文件系统。文件以加密形式保存在磁盘里,但该文件系统挂载后用户可以直接从挂载点访问文件的明文。 -- [kbfs](https://keybase.io/docs/kbfs):分布式端到端加密文件系统。在这个文件系统里有私密(private),共享(shared),以及公开(public)三种类型的文件夹。 -- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html):方便用户浏览删除重复数据后的压缩加密备份。 +一些有趣的 FUSE 文件系统包括: +- [sshfs](https://github.com/libfuse/sshfs):使用SSH连接在本地打开远程主机上的文件 +- [rclone](https://rclone.org/commands/rclone_mount/):将 Dropbox、Google Drive、Amazon S3、或者 Google Cloud 
Storage 一类的云存储服务挂载为本地文件系统 +- [gocryptfs](https://nuetzlich.net/gocryptfs/):覆盖在加密文件上的文件系统。文件以加密形式保存在磁盘里,但该文件系统挂载后用户可以直接从挂载点访问文件的明文 +- [kbfs](https://keybase.io/docs/kbfs):分布式端到端加密文件系统。在这个文件系统里有私密(private),共享(shared),以及公开(public)三种类型的文件夹 +- [borgbackup](https://borgbackup.readthedocs.io/en/stable/usage/mount.html):方便用户浏览删除重复数据后的压缩加密备份 ## 备份 @@ -111,7 +111,7 @@ FUSE可以用于实现如:一个将所有文件系统操作都使用SSH转发 首先,复制存储在同一个磁盘上的数据不是备份,因为这个磁盘是一个单点故障(single point of failure)。这个磁盘一旦出现问题,所有的数据都可能丢失。放在家里的外置磁盘因为火灾、抢劫等原因可能会和源数据一起丢失,所以是一个弱备份。推荐的做法是将数据备份到不同的地点存储。 -同步方案也不是备份。即使方便如Dropbox或者Google Drive,当数据在本地被抹除或者损坏,同步方案可能会把这些“更改”同步到云端。同理,像RAID这样的磁盘镜像方案也不是备份。它不能防止文件被意外删除、损坏、或者被勒索软件加密。 +同步方案也不是备份。即使方便如 Dropbox 或者 Google Drive,当数据在本地被抹除或者损坏,同步方案可能会把这些“更改”同步到云端。同理,像 RAID 这样的磁盘镜像方案也不是备份。它不能防止文件被意外删除、损坏、或者被勒索软件加密。 有效备份方案的几个核心特性是:版本控制,删除重复数据,以及安全性。对备份的数据实施版本控制保证了用户可以从任何记录过的历史版本中恢复数据。在备份中检测并删除重复数据,使其仅备份增量变化可以减少存储开销。在安全性方面,作为用户,你应该考虑别人需要有什么信息或者工具才可以访问或者完全删除你的数据及备份。最后一点,不要盲目信任备份方案。用户应该经常检查备份是否可以用来恢复数据。 @@ -120,126 +120,89 @@ FUSE可以用于实现如:一个将所有文件系统操作都使用SSH转发 如果想要了解更多具体内容,请参考本课程2019年关于备份的[课堂笔记](/2019/backups)。 -## APIs - -We've talked a lot in this class about using your computer more -efficiently to accomplish _local_ tasks, but you will find that many of -these lessons also extend to the wider internet. Most services online -will have "APIs" that let you programmatically access their data. For -example, the US government has an API that lets you get weather -forecasts, which you could use to easily get a weather forecast in your -shell. - -Most of these APIs have a similar format. They are structured URLs, -often rooted at `api.service.com`, where the path and query parameters -indicate what data you want to read or what action you want to perform. -For the US weather data for example, to get the forecast for a -particular location, you issue GET request (with `curl` for example) to -https://api.weather.gov/points/42.3604,-71.094. 
The response itself -contains a bunch of other URLs that let you get specific forecasts for -that region. Usually, the responses are formatted as JSON, which you can -then pipe through a tool like [`jq`](https://stedolan.github.io/jq/) to -massage into what you care about. - -Some APIs require authentication, and this usually takes the form of -some sort of secret _token_ that you need to include with the request. -You should read the documentation for the API to see what the particular -service you are looking for uses, but "[OAuth](https://www.oauth.com/)" -is a protocol you will often see used. At its heart, OAuth is a way to -give you tokens that can "act as you" on a given service, and can only -be used for particular purposes. Keep in mind that these tokens are -_secret_, and anyone who gains access to your token can do whatever the -token allows under _your_ account! - -[IFTTT](https://ifttt.com/) is a website and service centered around the -idea of APIs — it provides integrations with tons of services, and lets -you chain events from them in nearly arbitrary ways. Give it a look! - -## Common command-line flags/patterns - -Command-line tools vary a lot, and you will often want to check out -their `man` pages before using them. They often share some common -features though that can be good to be aware of: - - - Most tools support some kind of `--help` flag to display brief usage - instructions for the tool. - - Many tools that can cause irrevocable change support the notion of a - "dry run" in which they only print what they _would have done_, but - do not actually perform the change. Similarly, they often have an - "interactive" flag that will prompt you for each destructive action. - - You can usually use `--version` or `-V` to have the program print its - own version (handy for reporting bugs!). - - Almost all tools have a `--verbose` or `-v` flag to produce more - verbose output. 
You can usually include the flag multiple times - (`-vvv`) to get _more_ verbose output, which can be handy for - debugging. Similarly, many tools have a `--quiet` flag for making it - only print something on error. - - In many tools, `-` in place of a file name means "standard input" or - "standard output", depending on the argument. - - Possibly destructive tools are generally not recursive by default, - but support a "recursive" flag (often `-r`) to make them recurse. - - Sometimes, you want to pass something that _looks_ like a flag as a - normal argument. For example, imagine you wanted to remove a file - called `-r`. Or you want to run one program "through" another, like - `ssh machine foo`, and you want to pass a flag to the "inner" program - (`foo`). The special argument `--` makes a program _stop_ processing - flags and options (things starting with `-`) in what follows, letting - you pass things that look like flags without them being interpreted - as such: `rm -- -r` or `ssh machine --for-ssh -- foo --for-foo`. 
+## API(应用程序接口) + +关于如何使用计算机有效率地完成 _本地_ 任务,我们这堂课已经介绍了很多方法。这些方法在互联网上其实也适用。大多数线上服务提供的 API(应用程序接口)让你可以通过编程方式来访问这些服务的数据。比如,美国国家气象局就提供了一个可以从 shell 中获取天气预报的 API。 + +这些 API 大多具有类似的格式。它们的结构化 URL 通常使用 `api.service.com` 作为根路径,用户可以访问不同的子路径来访问需要调用的操作,以及添加查询参数使 API 返回符合查询参数条件的结果。 + +以美国天气数据为例,为了获得某个地点的天气数据,你可以发送一个 GET 请求(比如使用`curl`)到[`https://api.weather.gov/points/42.3604,-71.094`](https://api.weather.gov/points/42.3604,-71.094)。返回中会包括一系列用于获取特定信息(比如小时预报、气象观察站信息等)的 URL。通常这些返回都是`JSON`格式,你可以使用[`jq`](https://stedolan.github.io/jq/)等工具来选取需要的部分。 + +有些需要认证的 API 通常要求用户在请求中加入某种私密令牌(secret token)来完成认证。请阅读你想访问的 API 所提供的文档来确定它请求的认证方式,不过 [OAuth](https://www.oauth.com/) 是你会经常见到的一种协议。OAuth 通过向用户提供一系列仅可用于该 API 特定功能的私密令牌进行校验。因为使用了有效 OAuth 令牌的请求在 API 看来就是用户本人发出的请求,所以请一定保管好这些私密令牌。否则其他人就可以冒用你的身份进行任何你可以在这个 API 上进行的操作。 + +[IFTTT](https://ifttt.com/) 这个网站可以将很多 API 整合在一起,让某 API 发生的特定事件触发在其他 API 上执行的任务。IFTTT 的全称 If This Then That 足以说明它的用法,比如在检测到用户的新推文后,自动发布在其他平台。但是你可以对它支持的 API 进行任意整合,所以试着来设置一下任何你需要的功能吧! + +## 常见命令行标志参数及模式 + +命令行工具的用法千差万别,阅读 `man` 页面可以帮助你理解每种工具的用法。即便如此,下面我们将介绍一下命令行工具一些常见的共同功能。 + + - 大部分工具支持 `--help` 或者类似的标志参数(flag)来显示它们的简略用法。 + - 会造成不可撤回操作的工具一般会提供“空运行”(dry run)标志参数,这样用户可以确认工具真实运行时会进行的操作。这些工具通常也会有“交互式”(interactive)标志参数,在执行每个不可撤回的操作前提示用户确认。 + - `--version` 或者 `-V` 标志参数可以让工具显示它的版本信息(对于提交软件问题报告非常重要)。 + - 基本所有的工具支持使用 `--verbose` 或者 `-v` 标志参数来输出详细的运行信息。多次使用这个标志参数,比如 `-vvv`,可以让工具输出更详细的信息(经常用于调试)。同样,很多工具支持 `--quiet` 标志参数来抑制除错误提示之外的其他输出。 + - 大多数工具中,使用 `-` 代替输入或者输出文件名意味着工具将从标准输入(standard input)获取所需内容,或者向标准输出(standard output)输出结果。 + - 会造成破坏性结果的工具一般默认进行非递归的操作,但是支持使用“递归”(recursive)标志参数(通常是 `-r`)。 + - 有的时候你可能需要向工具传入一个 _看上去_ 像标志参数的普通参数,比如: + - 使用 `rm` 删除一个叫 `-r` 的文件; + - 在通过一个程序运行另一个程序的时候(`ssh machine foo`),向内层的程序(`foo`)传递一个标志参数。 + + 这时候你可以使用特殊参数 `--` 让某个程序 _停止处理_ `--` 后面出现的标志参数以及选项(以 `-` 开头的内容): + - `rm -- -r` 会让 `rm` 将 `-r` 当作文件名; + - `ssh machine --for-ssh -- foo --for-foo` 的 `--` 会让 `ssh` 知道 `--for-foo` 不是 `ssh` 的标志参数。 ## 窗口管理器
-大部分人适应了Windows、macOS、以及Ubuntu默认的“拖拽”式窗口管理器。这些窗口管理器的窗口一般就堆在屏幕上,你可以拖拽改变窗口的位置、缩放窗口、以及让窗口堆叠在一起。这种堆叠式(floating/stacking)管理器只是窗口管理器中的一种。特别在Linux中,有很多种其他的管理器。 +大部分人适应了 Windows、macOS、以及 Ubuntu 默认的“拖拽”式窗口管理器。这些窗口管理器的窗口一般就堆在屏幕上,你可以拖拽改变窗口的位置、缩放窗口、以及让窗口堆叠在一起。这种堆叠式(floating/stacking)管理器只是窗口管理器中的一种。特别在 Linux 中,有很多种其他的管理器。 平铺式(tiling)管理器就是一个常见的替代。顾名思义,平铺式管理器会把不同的窗口像贴瓷砖一样平铺在一起而不和其他窗口重叠。这和 [tmux](https://github.com/tmux/tmux) 管理终端窗口的方式类似。平铺式管理器按照写好的布局显示打开的窗口。如果只打开一个窗口,它会填满整个屏幕。新开一个窗口的时候,原来的窗口会缩小到比如三分之二或者三分之一的大小来腾出空间。打开更多的窗口会让已有的窗口进一步调整。 -就像tmux那样,平铺式管理器可以让你在完全不使用鼠标的情况下使用键盘切换、缩放、以及移动窗口。它们值得一试! +就像 tmux 那样,平铺式管理器可以让你在完全不使用鼠标的情况下使用键盘切换、缩放、以及移动窗口。它们值得一试! ## VPN -VPN现在非常火,但我们不清楚这是不是因为[一些好的理由](https://gist.github.com/joepie91/5a9909939e6ce7d09e29)。你应该了解VPN能提供的功能和它的限制。使用了VPN的你对于互联网而言,**最好的情况**下也就是换了一个网络供应商(ISP)。所有你发出的流量看上去来源于VPN供应商的网络而不是你的“真实”地址,而你实际接入的网络只能看到加密的流量。 +VPN 现在非常火,但我们不清楚这是不是因为[一些好的理由](https://gist.github.com/joepie91/5a9909939e6ce7d09e29)。你应该了解 VPN 能提供的功能和它的限制。使用了 VPN 的你对于互联网而言,**最好的情况**下也就是换了一个网络供应商(ISP)。所有你发出的流量看上去来源于 VPN 供应商的网络而不是你的“真实”地址,而你实际接入的网络只能看到加密的流量。 -虽然这听上去非常诱人,但是你应该知道使用VPN只是把原本对网络供应商的信任放在了VPN供应商那里——网络供应商 _能看到的_,VPN供应商 _也都能看到_。如果相比网络供应商你更信任VPN供应商,那当然很好。反之,则连接VPN的价值不明确。机场的不加密公共热点确实不可以信任,但是在家庭网络环境里,这个差异就没有那么明显。 +虽然这听上去非常诱人,但是你应该知道使用 VPN 只是把原本对网络供应商的信任放在了 VPN 供应商那里——网络供应商 _能看到的_,VPN 供应商 _也都能看到_。如果相比网络供应商你更信任 VPN 供应商,那当然很好。反之,则连接VPN的价值不明确。机场的不加密公共热点确实不可以信任,但是在家庭网络环境里,这个差异就没有那么明显。 -你也应该了解现在大部分包含用户敏感信息的流量已经被HTTPS或者TLS加密。这种情况下你所处的网络环境是否“安全”不太重要:供应商只能看到你和哪些服务器在交谈,却不能看到你们交谈的内容。 +你也应该了解现在大部分包含用户敏感信息的流量已经被 HTTPS 或者 TLS 加密。这种情况下你所处的网络环境是否“安全”不太重要:供应商只能看到你和哪些服务器在交谈,却不能看到你们交谈的内容。 -这一切的大前提都是“最好的情况”。曾经发生过VPN提供商错误使用弱加密或者直接禁用加密的先例。另外,有些恶意的或者带有投机心态的供应商会记录和你有关的所有流量,并很可能会将这些信息卖给第三方。找错一家VPN经常比一开始就不用VPN更危险。 +这一切的大前提都是“最好的情况”。曾经发生过 VPN 提供商错误使用弱加密或者直接禁用加密的先例。另外,有些恶意的或者带有投机心态的供应商会记录和你有关的所有流量,并很可能会将这些信息卖给第三方。找错一家 VPN 经常比一开始就不用 VPN 更危险。 -MIT向有访问校内资源需求的成员开放自己运营的[VPN](https://ist.mit.edu/vpn)。如果你也想自己配置一个VPN,可以了解一下 [WireGuard](https://www.wireguard.com/) 以及 
[Algo](https://github.com/trailofbits/algo)。 +MIT 向有访问校内资源需求的成员开放自己运营的 [VPN](https://ist.mit.edu/vpn)。如果你也想自己配置一个 VPN,可以了解一下 [WireGuard](https://www.wireguard.com/) 以及 [Algo](https://github.com/trailofbits/algo)。 ## Markdown 你在职业生涯中大概率会编写各种各样的文档。在很多情况下这些文档需要使用标记来增加可读性,比如:插入粗体或者斜体内容,增加页眉、超链接、以及代码片段。 -在不使用Word或者LaTeX等复杂工具的情况下,你可以考虑使用 [Markdown](https://commonmark.org/help/) 这个轻量化的标记语言(markup language)。你可能已经见过Markdown或者它的一个变种。很多环境都支持并使用Markdown的一些子功能。 +在不使用 Word 或者 LaTeX 等复杂工具的情况下,你可以考虑使用 [Markdown](https://commonmark.org/help/) 这个轻量化的标记语言(markup language)。你可能已经见过Markdown或者它的一个变种。很多环境都支持并使用 Markdown 的一些子功能。 -Markdown致力于将人们编写纯文本时的一些习惯标准化。比如: -- 用`*`包围的文字表示强调(*斜体*),或者用`**`表示特别强调(**粗体**)。 -- 以`#`开头的行是标题,`#`的数量表示标题的级别,比如:`##二级标题`。 -- 以`-`开头代表一个无序列表的元素。一个数字加`.`(比如`1.`)代表一个有序列表元素。 -- 反引号`` ` ``(backtick)包围的文字会以`代码字体`显示。如果要显示一段代码,可以在每一行前加四个空格缩进,或者使用三个反引号包围整个代码片段。 +Markdown 致力于将人们编写纯文本时的一些习惯标准化。比如: +- 用`*`包围的文字表示强调(*斜体*),或者用`**`表示特别强调(**粗体**); +- 以`#`开头的行是标题,`#`的数量表示标题的级别,比如:`##二级标题`; +- 以`-`开头代表一个无序列表的元素。一个数字加`.`(比如`1.`)代表一个有序列表元素; +- 反引号 `` ` ``(backtick)包围的文字会以`代码字体`显示。如果要显示一段代码,可以在每一行前加四个空格缩进,或者使用三个反引号包围整个代码片段: ``` 就像这样 ``` - 如果要添加超链接,将 _需要显示_ 的文字用方括号包围,并在后面紧接着用圆括号包围链接:`[显示文字](指向的链接)`。 -Markdown不仅容易上手,而且应用非常广泛。实际上本课程的课堂笔记和其他资料都是使用Markdown编写的。点击[这个链接](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md)可以看到本页面的原始Markdown内容。 +Markdown 不仅容易上手,而且应用非常广泛。实际上本课程的课堂笔记和其他资料都是使用 Markdown 编写的。点击[这个链接](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md)可以看到本页面的原始 Markdown 内容。 ## Hammerspoon (macOS桌面自动化) -[Hammerspoon](https://www.hammerspoon.org/)是面向macOS的一个桌面自动化框架。它允许用户编写和操作系统功能挂钩的Lua脚本,从而与键盘、鼠标、窗口、文件系统等交互。 +[Hammerspoon](https://www.hammerspoon.org/) 是面向 macOS 的一个桌面自动化框架。它允许用户编写和操作系统功能挂钩的Lua脚本,从而与键盘、鼠标、窗口、文件系统等交互。 -下面是Hammerspoon的一些示例应用: +下面是 Hammerspoon 的一些示例应用: - 绑定移动窗口到的特定位置的快捷键 - 创建可以自动将窗口整理成特定布局的菜单栏按钮 -- 在你到实验室以后,通过检测所连接的WiFi网络自动静音扬声器 +- 在你到实验室以后,通过检测所连接的 WiFi 网络自动静音扬声器 - 
在你不小心拿了朋友的充电器时弹出警告 -从用户的角度,Hammerspoon可以运行任意Lua代码,绑定菜单栏按钮、按键、或者事件。Hammerspoon提供了一个全面的用于和系统交互的库,因此它能没有限制地实现任何功能。你可以从头编写自己的Hammerspoon配置,也可以结合别人公布的配置来满足自己的需求。 +从用户的角度,Hammerspoon 可以运行任意 Lua 代码,绑定菜单栏按钮、按键、或者事件。Hammerspoon 提供了一个全面的用于和系统交互的库,因此它能没有限制地实现任何功能。你可以从头编写自己的 Hammerspoon 配置,也可以结合别人公布的配置来满足自己的需求。 ### 资源 @@ -247,66 +210,50 @@ Markdown不仅容易上手,而且应用非常广泛。实际上本课程的课 - [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations):Hammerspoon官方示例配置 - [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon):Anish的Hammerspoon配置 -## Booting + Live USBs +## 开机引导以及 Live USB -When your machine boots up, before the operating system is loaded, the -[BIOS](https://en.wikipedia.org/wiki/BIOS)/[UEFI](https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface) -initializes the system. During this process, you can press a specific key -combination to configure this layer of software. For example, your computer may -say something like "Press F9 to configure BIOS. Press F12 to enter boot menu." -during the boot process. You can configure all sorts of hardware-related -settings in the BIOS menu. You can also enter the boot menu to boot from an -alternate device instead of your hard drive. +在你的计算机启动时,[BIOS](https://en.wikipedia.org/wiki/BIOS) 或者 [UEFI](https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface) 会在加载操作系统之前对硬件系统进行初始化,这被称为引导(booting)。你可以通过按下计算机提示的键位组合来配置引导,比如 `Press F9 to configure BIOS. Press F12 to enter boot menu`。在 BIOS 菜单中你可以对硬件相关的设置进行更改,也可以在引导菜单中选择从硬盘以外的其他设备加载操作系统——比如 Live USB。 -[Live USBs](https://en.wikipedia.org/wiki/Live_USB) are USB flash drives -containing an operating system. You can create one of these by downloading an -operating system (e.g. a Linux distribution) and burning it to the flash drive. -This process is a little bit more complicated than simply copying a `.iso` file -to the disk. 
There are tools like [UNetbootin](https://unetbootin.github.io/) -to help you create live USBs. +[Live USB](https://en.wikipedia.org/wiki/Live_USB) 是包含了完整操作系统的闪存盘。Live USB 的用途非常广泛,包括: + - 作为安装操作系统的启动盘; + - 在不将操作系统安装到硬盘的情况下,直接运行 Live USB 上的操作系统; + - 对硬盘上的相同操作系统进行修复; + - 恢复硬盘上的数据。 -Live USBs are useful for all sorts of purposes. Among other things, if you -break your existing operating system installation so that it no longer boots, -you can use a live USB to recover data or fix the operating system. +Live USB 通过在闪存盘上 _写入_ 操作系统的镜像制作,而写入不是单纯地往闪存盘上复制 `.iso` 文件。你可以使用 [UNetbootin](https://unetbootin.github.io/) 、[Rufus](https://github.com/pbatard/rufus) 等 Live USB 写入工具制作。 ## Docker, Vagrant, VMs, Cloud, OpenStack [虚拟机](https://en.wikipedia.org/wiki/Virtual_machine)(Virtual Machine)以及如容器化(containerization)等工具可以帮助你模拟一个包括操作系统的完整计算机系统。虚拟机可以用于创建独立的测试或者开发环境,以及用作安全测试的沙盒。 -[Vagrant](https://www.vagrantup.com/) 是一个构建和配置虚拟开发环境的工具。它支持用户在配置文件中写入比如操作系统、系统服务、需要安装的软件包等描述,然后使用`vagrant up`命令在各种环境(VirtualBox,KVM,Hyper-V等)中启动一个虚拟机。[Docker](https://www.docker.com/) 是一个使用容器化概念的类似工具。 +[Vagrant](https://www.vagrantup.com/) 是一个构建和配置虚拟开发环境的工具。它支持用户在配置文件中写入比如操作系统、系统服务、需要安装的软件包等描述,然后使用 `vagrant up` 命令在各种环境(VirtualBox,KVM,Hyper-V等)中启动一个虚拟机。[Docker](https://www.docker.com/) 是一个使用容器化概念的类似工具。 租用云端虚拟机可以享受以下资源的即时访问: - 便宜、常开、且有公共IP地址的虚拟机用来托管网站等服务 -- 有大量CPU、磁盘、内存、以及GPU资源的虚拟机 +- 有大量 CPU、磁盘、内存、以及 GPU 资源的虚拟机 - 超出用户可以使用的物理主机数量的虚拟机 - 相比物理主机的固定开支,虚拟机的开支一般按运行的时间计算。所以如果用户只需要在短时间内使用大量算力,租用1000台虚拟机运行几分钟明显更加划算。 -受欢迎的VPS服务商有 [Amazon AWS](https://aws.amazon.com/),[Google +受欢迎的 VPS 服务商有 [Amazon AWS](https://aws.amazon.com/),[Google Cloud](https://cloud.google.com/),以及 [DigitalOcean](https://www.digitalocean.com/)。 -MIT CSAIL的成员可以使用 [CSAIL OpenStack +MIT CSAIL 的成员可以使用 [CSAIL OpenStack instance](https://tig.csail.mit.edu/shared-computing/open-stack/) 申请免费的虚拟机用于研究。 -## Notebook programming +## 交互式记事本编程 -[Notebook programming -environments](https://en.wikipedia.org/wiki/Notebook_interface) can be really -handy for doing certain
types of interactive or exploratory development. -Perhaps the most popular notebook programming environment today is -[Jupyter](https://jupyter.org/), for Python (and several other languages). -[Wolfram Mathematica](https://www.wolfram.com/mathematica/) is another notebook -programming environment that's great for doing math-oriented programming. +[交互式记事本](https://en.wikipedia.org/wiki/Notebook_interface)可以帮助开发者进行与运行结果交互等探索性的编程。现在最受欢迎的交互式记事本环境大概是 [Jupyter](https://jupyter.org/)。它的名字来源于所支持的三种核心语言:Julia、Python、R。[Wolfram Mathematica](https://www.wolfram.com/mathematica/) 是另外一个常用于科学计算的优秀环境。 ## GitHub -[GitHub](https://github.com/) 是最受欢迎的开源软件开发平台之一。我们课程中提到的很多工具,从[vim](https://github.com/vim/vim) 到 +[GitHub](https://github.com/) 是最受欢迎的开源软件开发平台之一。我们课程中提到的很多工具,从 [vim](https://github.com/vim/vim) 到 [Hammerspoon](https://github.com/Hammerspoon/hammerspoon),都托管在Github上。向你每天使用的开源工具作出贡献其实很简单,下面是两种贡献者们经常使用的方法: - 创建一个[议题(issue)](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue)。 议题可以用来反映软件运行的问题或者请求新的功能。创建议题并不需要创建者阅读或者编写代码,所以它是一个轻量化的贡献方式。高质量的问题报告对于开发者十分重要。在现有的议题发表评论也可以对项目的开发作出贡献。 -- 使用[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)提交代码更改。由于涉及到阅读和编写代码,提交拉取请求总的来说比创建议题更加深入。拉取请求是请求别人把你自己的代码拉取(且合并)到他们的仓库里。很多开源项目仅允许认证的管理者管理项目代码,所以一般需要[复刻(fork)](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)这些项目的上游仓库(upstream repository),在你的Github账号下创建一个内容完全相同但是由你控制的复刻仓库。这样你就可以在这个复刻仓库自由创建新的分支并推送修复问题或者实现新功能的代码。完成修改以后再回到开源项目的Github页面[创建一个拉取请求](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request)。 +- 使用[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)提交代码更改。由于涉及到阅读和编写代码,提交拉取请求总的来说比创建议题更加深入。拉取请求是请求别人把你自己的代码拉取(且合并)到他们的仓库里。很多开源项目仅允许认证的管理者管理项目代码,所以一般需要[复刻(fork)](https://help.github.com/en/github/getting-started-with-github/fork-a-repo)这些项目的上游仓库(upstream 
repository),在你的 Github 账号下创建一个内容完全相同但是由你控制的复刻仓库。这样你就可以在这个复刻仓库自由创建新的分支并推送修复问题或者实现新功能的代码。完成修改以后再回到开源项目的 Github 页面[创建一个拉取请求](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request)。 提交请求后,项目管理者会和你交流拉取请求里的代码并给出反馈。如果没有问题,你的代码会和上游仓库中的代码合并。很多大的开源项目会提供贡献指南,容易上手的议题,甚至专门的指导项目来帮助参与者熟悉这些项目。 From e2bd8abcdbcfc2cb87e0c357a828e504b61b0cc3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Mon, 22 Jun 2020 16:30:39 +0800 Subject: [PATCH 454/640] Update editors.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index d1c5e3ed..64d5b4bf 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -63,7 +63,7 @@ Vim的设计以大多数时间都花在阅读、浏览和进行少量编辑改 你可以按下 `` (逃脱键) 从任何其他模式返回正常模式。 在正常模式,键入 `i` 进入插入 模式, `R` 进入替换模式, `v` 进入可视(一般)模式, `V` 进入可视(行)模式, `` -(Ctrl-V, 有时也写作 `^V`), `:` 进入命令模式。 +(Ctrl-V, 有时也写作 `^V`)进入可视(块)模式, `:` 进入命令模式。 因为你会在使用 Vim 时大量使用 `` 键,可以考虑把大小写锁定键重定义成逃脱键 ([MacOS 教程](https://vim.fandom.com/wiki/Map_caps_lock_to_escape_in_macOS) )。 From 63a64516022f688dc937a6aed75673094c692212 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Mon, 22 Jun 2020 17:40:42 +0800 Subject: [PATCH 455/640] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 3db8f719..60a186ac 100644 --- a/README.md +++ b/README.md @@ -35,6 +35,6 @@ To contribute to this tanslation project, please book your topic by creating an | [debugging-profiling.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/debugging-profiling.md) |[@Lingfeng AI](https://github.com/hanxiaomax) | Done | | [metaprogramming.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/metaprogramming.md) | [@Lingfeng AI](https://github.com/hanxiaomax) | Done | | 
[security.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/security.md) | [@catcarbon](https://github.com/catcarbon) | Done | -| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@catcarbon](https://github.com/catcarbon) | In-progress | +| [potpourri.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/potpourri.md) | [@catcarbon](https://github.com/catcarbon) | Done | | [qa.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/_2020/qa.md) | [@AA1HSHH](https://github.com/AA1HSHH) | Done | | [about.md](https://github.com/missing-semester-cn/missing-semester-cn.github.io/blob/master/about.md) | [@Binlogo](https://github.com/Binlogo) | Done | From 1bb8caa5b991f5d6644d66929f92c29c10d2a3dd Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Mon, 22 Jun 2020 17:44:01 +0800 Subject: [PATCH 456/640] Update potpourri.md --- _2020/potpourri.md | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index c44138a6..525977d4 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -2,7 +2,7 @@ layout: lecture title: "大杂烩" date: 2020-06-21 -ready: false +ready: true video: aspect: 56.25 id: JZDt-PRq0uo @@ -94,12 +94,12 @@ WantedBy=multi-user.target 现在的软件系统一般由很多模块化的组件构建而成。你使用的操作系统可以通过一系列共同的方式使用不同的文件系统上的相似功能。比如当你使用 `touch` 命令创建文件的时候,`touch` 使用系统调用(system call)向内核发出请求。内核再根据文件系统,调用特有的方法来创建文件。这里的问题是,UNIX 文件系统在传统上是以内核模块的形式实现,导致只有内核可以进行文件系统相关的调用。 -[FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace)(Filesystem in User Space)允许运行在用户空间上的程序实现文件系统调用,并将这些调用与内核接口联系起来。在实践中,这意味着用户可以在文件系统调用中实现任意功能。 +[FUSE](https://en.wikipedia.org/wiki/Filesystem_in_Userspace)(用户空间文件系统)允许运行在用户空间上的程序实现文件系统调用,并将这些调用与内核接口联系起来。在实践中,这意味着用户可以在文件系统调用中实现任意功能。 FUSE 可以用于实现如:一个将所有文件系统操作都使用 SSH 
转发到远程主机,由远程主机处理后返回结果到本地计算机的虚拟文件系统。这个文件系统里的文件虽然存储在远程主机,对于本地计算机上的软件而言和存储在本地别无二致。`sshfs`就是一个实现了这种功能的 FUSE 文件系统。 一些有趣的 FUSE 文件系统包括: -- [sshfs](https://github.com/libfuse/sshfs):使用SSH连接在本地打开远程主机上的文件 +- [sshfs](https://github.com/libfuse/sshfs):使用 SSH 连接在本地打开远程主机上的文件 - [rclone](https://rclone.org/commands/rclone_mount/):将 Dropbox、Google Drive、Amazon S3、或者 Google Cloud Storage 一类的云存储服务挂载为本地文件系统 - [gocryptfs](https://nuetzlich.net/gocryptfs/):覆盖在加密文件上的文件系统。文件以加密形式保存在磁盘里,但该文件系统挂载后用户可以直接从挂载点访问文件的明文 - [kbfs](https://keybase.io/docs/kbfs):分布式端到端加密文件系统。在这个文件系统里有私密(private),共享(shared),以及公开(public)三种类型的文件夹 @@ -130,7 +130,7 @@ FUSE 可以用于实现如:一个将所有文件系统操作都使用 SSH 转 有些需要认证的 API 通常要求用户在请求中加入某种私密令牌(secret token)来完成认证。请阅读你想访问的 API 所提供的文档来确定它请求的认证方式,但是其实大多数 API 都会使用 [OAuth](https://www.oauth.com/)。OAuth 通过向用户提供一系列仅可用于该 API 特定功能的私密令牌进行校验。因为使用了有效 OAuth 令牌的请求在 API 看来就是用户本人发出的请求,所以请一定保管好这些私密令牌。否则其他人就可以冒用你的身份进行任何你可以在这个 API 上进行的操作。 -[IFTTT](https://ifttt.com/) 这个网站可以将很多 API 整合在一起,让某 API 发生的特定事件触发在其他 API 上执行的任务。IFTTT 的全称 If This Then That 足以说明它的用法,比如在检测到用户的新推文后,自动发布在其他平台。但是你可以对它支持的 API 进行任意整合,所以试着来设置一下任何你需要的功能吧! +[IFTTT](https://ifttt.com/) 这个网站可以将很多 API 整合在一起,让某 API 发生的特定事件触发在其他 API 上执行的任务。IFTTT 的全称If This Then That 足以说明它的用法,比如在检测到用户的新推文后,自动发布在其他平台。但是你可以对它支持的 API 进行任意整合,所以试着来设置一下任何你需要的功能吧! 
## 常见命令行标志参数及模式 @@ -174,7 +174,7 @@ MIT 向有访问校内资源需求的成员开放自己运营的 [VPN](https://i 你在职业生涯中大概率会编写各种各样的文档。在很多情况下这些文档需要使用标记来增加可读性,比如:插入粗体或者斜体内容,增加页眉、超链接、以及代码片段。 -在不使用 Word 或者 LaTeX 等复杂工具的情况下,你可以考虑使用 [Markdown](https://commonmark.org/help/) 这个轻量化的标记语言(markup language)。你可能已经见过Markdown或者它的一个变种。很多环境都支持并使用 Markdown 的一些子功能。 +在不使用 Word 或者 LaTeX 等复杂工具的情况下,你可以考虑使用 [Markdown](https://commonmark.org/help/) 这个轻量化的标记语言(markup language)。你可能已经见过 Markdown 或者它的一个变种。很多环境都支持并使用 Markdown 的一些子功能。 Markdown 致力于将人们编写纯文本时的一些习惯标准化。比如: - 用`*`包围的文字表示强调(*斜体*),或者用`**`表示特别强调(**粗体**); @@ -191,9 +191,9 @@ Markdown 不仅容易上手,而且应用非常广泛。实际上本课程的 -## Hammerspoon (macOS桌面自动化) +## Hammerspoon (macOS 桌面自动化) -[Hammerspoon](https://www.hammerspoon.org/) 是面向 macOS 的一个桌面自动化框架。它允许用户编写和操作系统功能挂钩的Lua脚本,从而与键盘、鼠标、窗口、文件系统等交互。 +[Hammerspoon](https://www.hammerspoon.org/) 是面向 macOS 的一个桌面自动化框架。它允许用户编写和操作系统功能挂钩的 Lua 脚本,从而与键盘、鼠标、窗口、文件系统等交互。 下面是 Hammerspoon 的一些示例应用: @@ -206,9 +206,9 @@ Markdown 不仅容易上手,而且应用非常广泛。实际上本课程的 ### 资源 -- [Getting Started with Hammerspoon](https://www.hammerspoon.org/go/):Hammerspoon官方教程 -- [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations):Hammerspoon官方示例配置 -- [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon):Anish的Hammerspoon配置 +- [Getting Started with Hammerspoon](https://www.hammerspoon.org/go/):Hammerspoon 官方教程 +- [Sample configurations](https://github.com/Hammerspoon/hammerspoon/wiki/Sample-Configurations):Hammerspoon 官方示例配置 +- [Anish's Hammerspoon config](https://github.com/anishathalye/dotfiles-local/tree/mac/hammerspoon):Anish 的 Hammerspoon 配置 ## 开机引导以及 Live USB @@ -235,12 +235,10 @@ Live USB 通过在闪存盘上 _写入_ 操作系统的镜像制作,而写入 - 超出用户可以使用的物理主机数量的虚拟机 - 相比物理主机的固定开支,虚拟机的开支一般按运行的时间计算。所以如果用户只需要在短时间内使用大量算力,租用1000台虚拟机运行几分钟明显更加划算。 -受欢迎的 VPS 服务商有 [Amazon AWS](https://aws.amazon.com/),[Google -Cloud](https://cloud.google.com/),以及 +受欢迎的 VPS 服务商有 [Amazon AWS](https://aws.amazon.com/),[Google Cloud](https://cloud.google.com/),以及 
[DigitalOcean](https://www.digitalocean.com/)。 -MIT CSAIL 的成员可以使用 [CSAIL OpenStack -instance](https://tig.csail.mit.edu/shared-computing/open-stack/) +MIT CSAIL 的成员可以使用 [CSAIL OpenStack instance](https://tig.csail.mit.edu/shared-computing/open-stack/) 申请免费的虚拟机用于研究。 ## 交互式记事本编程 @@ -250,7 +248,7 @@ instance](https://tig.csail.mit.edu/shared-computing/open-stack/) ## GitHub [GitHub](https://github.com/) 是最受欢迎的开源软件开发平台之一。我们课程中提到的很多工具,从 [vim](https://github.com/vim/vim) 到 -[Hammerspoon](https://github.com/Hammerspoon/hammerspoon),都托管在Github上。向你每天使用的开源工具作出贡献其实很简单,下面是两种贡献者们经常使用的方法: +[Hammerspoon](https://github.com/Hammerspoon/hammerspoon),都托管在 Github 上。向你每天使用的开源工具作出贡献其实很简单,下面是两种贡献者们经常使用的方法: - 创建一个[议题(issue)](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue)。 议题可以用来反映软件运行的问题或者请求新的功能。创建议题并不需要创建者阅读或者编写代码,所以它是一个轻量化的贡献方式。高质量的问题报告对于开发者十分重要。在现有的议题发表评论也可以对项目的开发作出贡献。 From 08ce463865fb3ec76861931ce315b6b575d46343 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Tue, 23 Jun 2020 09:30:05 +0800 Subject: [PATCH 457/640] Update editors.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 64d5b4bf..fde90fc1 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -165,7 +165,7 @@ Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 键 ## 修饰语 -你可以用修饰语改变 “名词” 的意义。修饰语有 `i`, 表示 “内部” 或者 “在内“, 和 `i`, +你可以用修饰语改变 “名词” 的意义。修饰语有 `i`, 表示 “内部” 或者 “在内“, 和 `a`, 表示 ”周围“。 - `ci(` 改变当前括号内的内容 From e4c93b05b711a1749fee97e7fe472a38a205dc0a Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Wed, 24 Jun 2020 13:22:24 +0800 Subject: [PATCH 458/640] add translation link --- index.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/index.md b/index.md index 21220a6a..e06d03c7 100644 --- a/index.md +++ b/index.md @@ -62,6 +62,17 @@ YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57 - [Twitter](https://twitter.com/jonhoo/status/1224383452591509507) 
- [YouTube](https://www.youtube.com/playlist?list=PLyzOVJj3bHQuloKGG59rS43e29ro7I57J) + +# 译文 + +- [繁体中文](https://missing-semester-zh-hant.github.io/) +- [Korean](https://missing-semester-kr.github.io/) +- [Spanish](https://missing-semester-esp.github.io/) +- [Turkish](https://missing-semester-tr.github.io/) + +注意: 上述连接为社区翻译,我们并未验证其内容 + + ## 致谢 感谢 Elaine Mello, Jim Cain 以及 [MIT Open Learning](https://openlearning.mit.edu/) 帮助我们录制讲座视频。 From 421fbb4230af8a17032fd30657148c29ec0ae430 Mon Sep 17 00:00:00 2001 From: qilingzhao <67507976+qilingzhao@users.noreply.github.com> Date: Fri, 3 Jul 2020 20:01:19 +0800 Subject: [PATCH 459/640] Update shell-tools.md --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index fee2c47e..2c8dcae3 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -207,7 +207,7 @@ find . -name '*.tmp' -exec rm {} \; find . -name '*.png' -exec convert {} {.}.jpg \; ``` -尽管 `find` 用途广泛,它的语法却比较难以记忆。例如,为了查找满足模式 `PATTERN` 的文件,您需要执行 `find -name '*PATTERN*'` (如果您希望模式匹配时是区分大小写,可以使用`-iname`选项) +尽管 `find` 用途广泛,它的语法却比较难以记忆。例如,为了查找满足模式 `PATTERN` 的文件,您需要执行 `find -name '*PATTERN*'` (如果您希望模式匹配时是不区分大小写,可以使用`-iname`选项) 您当然可以使用alias设置别名来简化上述操作,但shell的哲学之一便是寻找(更好用的)替代方案。 记住,shell最好的特性就是您只是在调用程序,因此您只要找到合适的替代程序即可(甚至自己编写)。 From 54d2689b55e6f1f0a3c38506b10f41832da5c0b3 Mon Sep 17 00:00:00 2001 From: hwj1995 Date: Wed, 8 Jul 2020 23:24:35 +0800 Subject: [PATCH 460/640] Update editors.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 修改为O / o 在之上/之下插入行 --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 64d5b4bf..e69840dd 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -140,7 +140,7 @@ Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 键 - `i` 进入插入模式 - 但是对于操纵/编辑文本,不单想用退格键完成 -- `o` / `O` 在之上/之下插入行 +- `O` / `o` 在之上/之下插入行 - `d{移动命令}` 删除 {移动命令} - 例如, `dw` 删除词, `d$` 删除到行尾, `d0` 
删除到行头。 - `c{移动命令}` 改变 {移动命令} From 5e3a787d5a2554909775ee93a5b717426958d350 Mon Sep 17 00:00:00 2001 From: hwj1995 Date: Tue, 14 Jul 2020 11:58:49 +0800 Subject: [PATCH 461/640] Update data-wrangling.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `[^ ]+` 会匹配任意非空且不包含空格的序列 --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 8244768e..bed269e1 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -100,7 +100,7 @@ perl -pe 's/.*?Disconnected from //' ```bash | sed -E 's/.*Disconnected from (invalid |authenticating )?user .* [^ ]+ port [0-9]+( \[preauth\])?$//' ``` -让我们借助正则表达式在线调试工具[regex debugger](https://regex101.com/r/qqbZqh/2) 来理解这段表达式。OK,开始的部分和以前是一样的,随后,我们匹配两种类型的“user”(在日志中基于两种前缀区分)。再然后我们匹配属于用户名的所有字符。接着,再匹配任意一个单词(`[^ ]+` 会匹配任意非空切不包含空格的序列)。紧接着后面匹配单“port”和它后面的一串数字,以及可能存在的后缀`[preauth]`,最后再匹配行尾。 +让我们借助正则表达式在线调试工具[regex debugger](https://regex101.com/r/qqbZqh/2) 来理解这段表达式。OK,开始的部分和以前是一样的,随后,我们匹配两种类型的“user”(在日志中基于两种前缀区分)。再然后我们匹配属于用户名的所有字符。接着,再匹配任意一个单词(`[^ ]+` 会匹配任意非空且不包含空格的序列)。紧接着后面匹配单“port”和它后面的一串数字,以及可能存在的后缀`[preauth]`,最后再匹配行尾。 注意,这样做的话,即使用户名是“Disconnected from”,对匹配结果也不会有任何影响,您知道这是为什么吗? From aac27f263a0c8961f50313b13276320957441658 Mon Sep 17 00:00:00 2001 From: hwj1995 Date: Thu, 16 Jul 2020 20:24:50 +0800 Subject: [PATCH 462/640] Update version-control.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 常见错误是 --- _2020/version-control.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/version-control.md b/_2020/version-control.md index 5169a834..b0dcffa0 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -427,7 +427,7 @@ command is used for merging. 1. 将版本历史可视化并进行探索 2. 是谁最后修改来 `README.md`文件?(提示:使用 `git log` 命令并添加合适的参数) 3. 最后一次修改`_config.yml` 文件中 `collections:` 行时的提交信息是什么?(提示:使用`git blame` 和 `git show`) -3. 
使用 Git 时的一个常见错误时提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)); +3. 使用 Git 时的一个常见错误是提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)); 4. 从 GitHub 上克隆某个仓库,修改一些文件。当您使用 `git stash` 会发生什么?当您执行 `git log --all --oneline` 时会显示什么?通过 `git stash pop` 命令来撤销 `git stash`操作,什么时候会用到这一技巧? 5. 与其他的命令行工具一样,Git 也提供了一个名为 `~/.gitconfig` 配置文件 (或 dotfile)。请在 `~/.gitconfig` 中创建一个别名,使您在运行 `git graph` 时,您可以得到 `git log --all --graph --decorate --oneline`的输出结果; 6. 您可以通过执行`git config --global core.excludesfile ~/.gitignore_global` 在 `~/.gitignore_global` 中创建全局忽略规则。配置您的全局 gitignore 文件来字典忽略系统或编辑器的临时文件,例如 `.DS_Store`; From 55b85fff122ad4cba7d412a102187d9158e88593 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=9D=9E=E6=B3=95=E6=93=8D=E4=BD=9C?= Date: Fri, 17 Jul 2020 09:32:17 +0800 Subject: [PATCH 463/640] Update potpourri.md --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 525977d4..2f4bfe12 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -1,7 +1,7 @@ --- layout: lecture title: "大杂烩" -date: 2020-06-21 +date: 2020-01-29 ready: true video: aspect: 56.25 From 0afb7d5b7ad602f8784d6f9ada1a4a036457a703 Mon Sep 17 00:00:00 2001 From: hwj1995 Date: Fri, 17 Jul 2020 17:48:22 +0800 Subject: [PATCH 464/640] Update debugging-profiling.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 修改翻译错误“差距” ==》 “插件” --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 286fab72..942fab53 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -456,7 +456,7 @@ Summary 2. 
学习 [这份](https://github.com/spiside/pdb-tutorial) `pdb` 实践教程并熟悉相关的命令。更深入的信息您可以参考[这份](https://realpython.com/python-debugging-pdb)教程。 -3. 安装 [`shellcheck`](https://www.shellcheck.net/) 并尝试对下面的脚本进行检查。这段代码有什么问题吗?请修复相关问题。在您的编辑器中安装一个linter差距,这样它就可以自动地显示相关警告信息。 +3. 安装 [`shellcheck`](https://www.shellcheck.net/) 并尝试对下面的脚本进行检查。这段代码有什么问题吗?请修复相关问题。在您的编辑器中安装一个linter插件,这样它就可以自动地显示相关警告信息。 ```bash #!/bin/sh ## Example: a typical script with several problems From 924aff593ffdd2c34c91829e0d67775a392d4cc5 Mon Sep 17 00:00:00 2001 From: hwj1995 Date: Mon, 20 Jul 2020 16:44:47 +0800 Subject: [PATCH 465/640] Update potpourri.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 课程时间错误 --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 2f4bfe12..1f786d0a 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -1,7 +1,7 @@ --- layout: lecture title: "大杂烩" -date: 2020-01-29 +date: 2019-01-29 ready: true video: aspect: 56.25 From af6214469686292ee963f30ad5cd3d402114cbba Mon Sep 17 00:00:00 2001 From: hwj Date: Tue, 21 Jul 2020 18:36:48 +0800 Subject: [PATCH 466/640] Update potpourri.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 重复 --- _2020/potpourri.md | 1 - 1 file changed, 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 2f4bfe12..c0574692 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -10,7 +10,6 @@ video: ## 目录 -- [目录](#%e7%9b%ae%e5%bd%95) - [修改键位映射](#%E4%BF%AE%E6%94%B9%E9%94%AE%E4%BD%8D%E6%98%A0%E5%B0%84) - [守护进程](#%E5%AE%88%E6%8A%A4%E8%BF%9B%E7%A8%8B) - [FUSE](#fuse) From 634feb71dd7a7ecc423e5fca5e646b1653de2123 Mon Sep 17 00:00:00 2001 From: SodaCris <18463922396@163.com> Date: Fri, 24 Jul 2020 11:02:03 +0800 Subject: [PATCH 467/640] =?UTF-8?q?=E7=BF=BB=E8=AF=91?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/editors.md | 
4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 3babddfb..edf343c0 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -265,8 +265,8 @@ Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 我们尽量避免在这里提供一份冗长的插件列表。 你可以查看讲师们的开源的配置文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), -[Jose](https://github.com/JJGO/dotfiles)) to see what other plugins we use. -Check out [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. +[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用的是哪些插件. +浏览 [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. 这个话题也有很多博客文章: 搜索 "best Vim plugins"。 From f8c7f38abc6f790a770662b7247d4dc2c55e7720 Mon Sep 17 00:00:00 2001 From: SodaCris <18463922396@163.com> Date: Fri, 24 Jul 2020 11:06:56 +0800 Subject: [PATCH 468/640] Update editors.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index edf343c0..bbdb1871 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -265,7 +265,7 @@ Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 我们尽量避免在这里提供一份冗长的插件列表。 你可以查看讲师们的开源的配置文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), -[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用的是哪些插件. +[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用其他插件. 浏览 [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. 
这个话题也有很多博客文章: 搜索 "best Vim plugins"。 From 96b3cf876995edcb58ede79c15015d3648c8db50 Mon Sep 17 00:00:00 2001 From: SodaCris <18463922396@163.com> Date: Fri, 24 Jul 2020 11:07:55 +0800 Subject: [PATCH 469/640] Update editors.md --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index bbdb1871..56c22242 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -265,7 +265,7 @@ Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 我们尽量避免在这里提供一份冗长的插件列表。 你可以查看讲师们的开源的配置文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), -[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用其他插件. +[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用的其他插件. 浏览 [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. 这个话题也有很多博客文章: 搜索 "best Vim plugins"。 From 3179a0e875a031349530a17ccdb6d2e21f10eefd Mon Sep 17 00:00:00 2001 From: SodaCris <18463922396@163.com> Date: Fri, 24 Jul 2020 22:13:15 +0800 Subject: [PATCH 470/640] Update editors.md --- _2020/editors.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/editors.md b/_2020/editors.md index 56c22242..81db8936 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -265,8 +265,8 @@ Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 我们尽量避免在这里提供一份冗长的插件列表。 你可以查看讲师们的开源的配置文件 ([Anish](https://github.com/anishathalye/dotfiles), [Jon](https://github.com/jonhoo/configs), -[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用的其他插件. -浏览 [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件. 
+[Jose](https://github.com/JJGO/dotfiles)) 来看看我们使用的其他插件。 +浏览 [Vim Awesome](https://vimawesome.com/) 来了解一些很棒的插件。 这个话题也有很多博客文章: 搜索 "best Vim plugins"。 From 1c2dfd4f37b52cde3c52b5f849ef8b2d486f31af Mon Sep 17 00:00:00 2001 From: Hanzhi Liu <55851864+MisakaCenter@users.noreply.github.com> Date: Mon, 27 Jul 2020 14:40:36 +0800 Subject: [PATCH 471/640] [ fix ] typos in course-shell --- _2020/course-shell.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 1c93fbc8..47fc43f3 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -27,13 +27,13 @@ video: # 课程结构 -本课程包含11个时常在一小时左右的讲座,每一个讲座都会关注一个 +本课程包含11个时长在一小时左右的讲座,每一个讲座都会关注一个 [特定的主题](/missing-semester/2020/)。尽管这些讲座之间基本上是各自独立的,但随着课程的进行,我们会假定您已经掌握了之前的内容。 每个讲座都有在线笔记供查阅,但是课上的很多内容并不会包含在笔记中。因此我们也会把课程录制下来发布到互联网上供大家观看学习。 我们希望能在这11个一小时讲座中涵盖大部分必须的内容,因此课程地节奏会比较紧凑。 为了能帮助您以自己的节奏来掌握讲座内容,每次课程都包含来一组练习来帮助您掌握本节课的重点。 -s课后我们会安排答疑的时间来回答您的问题。如果您参加的是在线课程,可以发送邮件到 +课后我们会安排答疑的时间来回答您的问题。如果您参加的是在线课程,可以发送邮件到 [missing-semester@mit.edu](mailto:missing-semester@mit.edu)来联系我们。 由于时长的限制,我们不可能达到那些专门课程一样的细致程度,我们会适时地将您介绍一些优秀的资源,帮助您深入的理解相关的工具或主题。 @@ -48,10 +48,10 @@ s课后我们会安排答疑的时间来回答您的问题。如果您参加的 这些交互接口可以覆盖80%的使用场景,但是它们也从根本上限制了您的操作方式——你不能点击一个不存在的按钮或者是用语音输入一个还没有被录入的指令。 为了充分利用计算机的能力,我们不得不回到最根本的方式,使用文字接口:Shell -几乎所有您能够接触到的平台都支持某种形式都shell,有些甚至还提供了多种shell供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样都:它允许你执行程序,输入并获取某种半结构化都输出。 +几乎所有您能够接触到的平台都支持某种形式的shell,有些甚至还提供了多种shell供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样的:它允许你执行程序,输入并获取某种半结构化的输出。 本节课我们会使用Bourne Again SHell, 简称 "bash" 。 -这是被最广泛使用都一种shell,它都语法和其他都shell都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 +这是被最广泛使用的一种shell,它的语法和其他的shell都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 ## 使用 shell @@ -96,7 +96,7 @@ missing:~$ /bin/echo $PATH ## 在shell中导航 shell中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是`\`。路径 `/`代表的是系统的根目录,所有的文件夹都包括在这个路径之下,在Windows上每个盘都有一个根目录(例如: -`C:\`)。 
我们假设您在学习本课程时使用的是Linux文件系统。如果某个路径以`/` 开头,那么它是一个 _绝对路径_,其他的都术语 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: +`C:\`)。 我们假设您在学习本课程时使用的是Linux文件系统。如果某个路径以`/` 开头,那么它是一个 _绝对路径_,其他的都是 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: ```console missing:~$ pwd @@ -164,7 +164,7 @@ missing:~$ man ls ## 在程序间创建连接 -在shell中,程序有两个主要的“流”:他们的输入流和输出流。 +在shell中,程序有两个主要的“流”:它们的输入流和输出流。 当程序尝试读取信息时,它们会从输入流中进行读取,当程序打印信息时,它们会将信息输出到输出流中。 通常,一个程序的输入输出流都是您的终端。也就是,您的键盘作为输入,显示器作为输出。 但是,我们也可以重定向这些流! From ec2588cbf1e722e7f357afed1d589e339313f845 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=B2=99=E9=B9=8F?= Date: Thu, 30 Jul 2020 00:59:30 +0800 Subject: [PATCH 472/640] Update course-shell.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 原文的 “home directory” 翻译成“家目录”更好一些吧 --- _2020/course-shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 1c93fbc8..d30bb30c 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -261,7 +261,7 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness 3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用`sh semester`来执行该程序。您的shell是如何知晓这个文件需要使用`sh`来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) -4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入根目录下的 `last-modified.txt` 的文件中 +4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入家目录下的 `last-modified.txt` 的文件中 5. 
写一段命令来从 `/sys` 中获取笔记本的电量信息,或者台式机CPU的温度。注意:macOS并没有sysfs,所以mac用户可以跳过这一题。 From 1e172498a8a4c06d61850188aeddf86e88c5438e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=B2=99=E9=B9=8F?= Date: Sat, 1 Aug 2020 00:12:11 +0800 Subject: [PATCH 473/640] Update data-wrangling.md --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 8244768e..d7218e6e 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -261,7 +261,7 @@ ffmpeg -loglevel panic -i /dev/video0 -frames 1 -f image2 - # 课后练习 1. 学习一下这篇简短的 [交互式正则表达式教程](https://regexone.com/). -2. 统计words文件 (`/usr/share/dict/words`) 中包含至少三个`a` 且不以`'s` 结尾的单词个数。这些单词中,出现频率最高的末尾两个字母是什么? `sed`的 `y`命令,或者 `tr` 程序也许可以帮你解决大小写的问题。共存在多少种词尾两字母组合?还有一个很 有挑战性的问题:哪个组合从未出现过? +2. 统计words文件 (`/usr/share/dict/words`) 中包含至少三个`a` 且不以`'s` 结尾的单词个数。这些单词中,出现频率前三的末尾两个字母是什么? `sed`的 `y`命令,或者 `tr` 程序也许可以帮你解决大小写的问题。共存在多少种词尾两字母组合?还有一个很 有挑战性的问题:哪个组合从未出现过? 3. 进行原地替换听上去很有诱惑力,例如: `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`。但是这并不是一个明知的做法,为什么呢?还是说只有 `sed`是这样的? 
查看 `man sed` 来完成这个问题 From d8eb5d31b8e8e17b6f78c7b3f6b355c9ff610021 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E6=A3=92=E6=A3=92=E5=BD=AC=5FBinboy?= Date: Sat, 15 Aug 2020 17:30:16 +0800 Subject: [PATCH 474/640] Fix typo --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 81db8936..3ffb7c2c 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -252,7 +252,7 @@ Vim 能够被重度自定义, 花时间探索自定义选项是值得的。 Vim 有很多扩展插件。 跟很多互联网上已经过时的建议相反, 你 _不_ 需要在 Vim 使用一个插件 管理器(从 Vim 8.0 开始)。 你可以使用内置的插件管理系统。 只需要创建一个 -`~/.vim/pack/vendor/start/` 的文件家, 然后把插件放到这里 (比如通过 `git clone`)。 +`~/.vim/pack/vendor/start/` 的文件夹, 然后把插件放到这里 (比如通过 `git clone`)。 以下是一些我们最爱的插件: From 983791ed673ad85789777765a4bd08355cf24a85 Mon Sep 17 00:00:00 2001 From: weijiew <49638002+weijiew@users.noreply.github.com> Date: Mon, 24 Aug 2020 21:26:53 +0800 Subject: [PATCH 475/640] Update editors.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 修改为 : Vim 最重要的设计思想是 Vim 的界面本身是一个程序语言。 --- _2020/editors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/editors.md b/_2020/editors.md index 3ffb7c2c..8e4e9f16 100644 --- a/_2020/editors.md +++ b/_2020/editors.md @@ -101,7 +101,7 @@ Vim 默认打开一个标签页,这个标签也包含一个窗口。 # Vim 的接口其实是一种编程语言 -Vim 最重要的设计思想是 Vim 的界面本省是一个程序语言。 键入操作 (以及他们的助记名) +Vim 最重要的设计思想是 Vim 的界面本身是一个程序语言。 键入操作 (以及他们的助记名) 本身是命令, 这些命令可以组合使用。 这使得移动和编辑更加高效,特别是一旦形成肌肉记忆。 ## 移动 From 0b223ee4e224865dff2b4ce7662a249d4994915b Mon Sep 17 00:00:00 2001 From: sunset wan Date: Mon, 31 Aug 2020 23:34:06 +0800 Subject: [PATCH 476/640] =?UTF-8?q?=E4=B8=AD=E8=8B=B1=E6=96=87=E4=B9=8B?= =?UTF-8?q?=E9=97=B4=E9=80=82=E5=BD=93=E6=B7=BB=E5=8A=A0=E7=A9=BA=E6=A0=BC?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/course-shell.md | 91 +++++++++++++++++++++---------------------- 1 file changed, 45 insertions(+), 46 deletions(-) diff --git 
a/_2020/course-shell.md b/_2020/course-shell.md index 47fc43f3..bc531fcd 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -1,6 +1,6 @@ --- layout: lecture -title: "课程概览与shell" +title: "课程概览与 shell" date: 2019-01-13 ready: true video: @@ -17,7 +17,7 @@ video: 作为计算机科学家,我们都知道计算机最擅长帮助我们完成重复性的工作。 但是我们却常常忘记这一点也适用于我们使用计算机的方式,而不仅仅是利用计算机程序去帮我们求解问题。 在从事与计算机相关的工作时,我们有很多触手可及的工具可以帮助我们更高效的解决问题。 -但是我们中的大多数人实际上只利用了这些工具中的很少一部分,我们常常只是死记硬背地掌握了一些对我们来说如咒语一般的命令, +但是我们中的大多数人实际上只利用了这些工具中的很少一部分,我们常常只是死记硬背一些如咒语般的命令, 或是当我们卡住的时候,盲目地从网上复制粘贴一些命令。 本课程意在帮你解决这一问题。 @@ -27,14 +27,13 @@ video: # 课程结构 -本课程包含11个时长在一小时左右的讲座,每一个讲座都会关注一个 +本课程包含 11 个时长在一小时左右的讲座,每一个讲座都会关注一个 [特定的主题](/missing-semester/2020/)。尽管这些讲座之间基本上是各自独立的,但随着课程的进行,我们会假定您已经掌握了之前的内容。 每个讲座都有在线笔记供查阅,但是课上的很多内容并不会包含在笔记中。因此我们也会把课程录制下来发布到互联网上供大家观看学习。 -我们希望能在这11个一小时讲座中涵盖大部分必须的内容,因此课程地节奏会比较紧凑。 -为了能帮助您以自己的节奏来掌握讲座内容,每次课程都包含来一组练习来帮助您掌握本节课的重点。 +我们希望能在这 11 个一小时讲座中涵盖大部分必须的内容,因此课程的信息密度是相当大的。为了能帮助您以自己的节奏来掌握讲座内容,每次课程都包含一组练习来帮助您掌握本节课的重点。 课后我们会安排答疑的时间来回答您的问题。如果您参加的是在线课程,可以发送邮件到 -[missing-semester@mit.edu](mailto:missing-semester@mit.edu)来联系我们。 + [missing-semester@mit.edu](mailto:missing-semester@mit.edu) 来联系我们。 由于时长的限制,我们不可能达到那些专门课程一样的细致程度,我们会适时地将您介绍一些优秀的资源,帮助您深入的理解相关的工具或主题。 但是如果您还有一些特别关注的话题,也请联系我们。 @@ -44,14 +43,14 @@ video: ## shell 是什么? 
-如今的计算机有着多种多样的交互接口让我们可以进行指令的的输入,从炫酷的图像用户界面(GUI),语音输入甚至是AR/VR都已经无处不在。 -这些交互接口可以覆盖80%的使用场景,但是它们也从根本上限制了您的操作方式——你不能点击一个不存在的按钮或者是用语音输入一个还没有被录入的指令。 +如今的计算机有着多种多样的交互接口让我们可以进行指令的的输入,从炫酷的图像用户界面(GUI),语音输入甚至是 AR/VR 都已经无处不在。 +这些交互接口可以覆盖 80% 的使用场景,但是它们也从根本上限制了您的操作方式——你不能点击一个不存在的按钮或者是用语音输入一个还没有被录入的指令。 为了充分利用计算机的能力,我们不得不回到最根本的方式,使用文字接口:Shell -几乎所有您能够接触到的平台都支持某种形式的shell,有些甚至还提供了多种shell供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样的:它允许你执行程序,输入并获取某种半结构化的输出。 +几乎所有您能够接触到的平台都支持某种形式的 shell,有些甚至还提供了多种 shell 供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样的:它允许你执行程序,输入并获取某种半结构化的输出。 -本节课我们会使用Bourne Again SHell, 简称 "bash" 。 -这是被最广泛使用的一种shell,它的语法和其他的shell都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 +本节课我们会使用 Bourne Again SHell, 简称 "bash" 。 +这是被最广泛使用的一种 shell,它的语法和其他的 shell 都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 ## 使用 shell @@ -61,7 +60,7 @@ video: missing:~$ ``` -这是shell最主要的文本接口。它告诉你,你的主机名是 `missing` 并且您当前的工作目录("current working directory")或者说您当前所在的位置是`~` (表示 "home")。 `$`符号表示您现在的身份不是root用户(稍后会介绍)。在这个提示符中,您可以输入 _命令_ ,命令最终会被shell解析。最简单的命令是执行一个程序: +这是 shell 最主要的文本接口。它告诉你,你的主机名是 `missing` 并且您当前的工作目录("current working directory")或者说您当前所在的位置是 `~` (表示 "home")。 `$` 符号表示您现在的身份不是 root 用户(稍后会介绍)。在这个提示符中,您可以输入 _命令_ ,命令最终会被 shell 解析。最简单的命令是执行一个程序: ```console missing:~$ date @@ -69,16 +68,16 @@ Fri 10 Jan 2020 11:49:31 AM EST missing:~$ ``` -这里,我们执行了 `date` 这个程序,不出意料地,它打印出了当前的日前和时间。然后,shell等待我们输入其他命令。我们可以在执行命令的同时向程序传递 _参数_ : +这里,我们执行了 `date` 这个程序,不出意料地,它打印出了当前的日前和时间。然后,shell 等待我们输入其他命令。我们可以在执行命令的同时向程序传递 _参数_ : ```console missing:~$ echo hello hello ``` -上例中,我们让shell执行 `echo` ,同时指定参数`hello`。`echo` 程序将该参数打印出来。 -shell基于空格分割命令并进行解析,然后执行第一个单词代表的程序,并将后续的单词作为程序可以访问的参数。如果您希望传递的参数中包含空格(例如一个名为 My Photos 的文件夹),您要么用使用单引号,双引号将其包裹起来,要么使用转义符号`\`进行处理(`My\ Photos`)。 +上例中,我们让 shell 执行 `echo` ,同时指定参数 `hello`。`echo` 程序将该参数打印出来。 +shell 基于空格分割命令并进行解析,然后执行第一个单词代表的程序,并将后续的单词作为程序可以访问的参数。如果您希望传递的参数中包含空格(例如一个名为 My Photos 的文件夹),您要么用使用单引号,双引号将其包裹起来,要么使用转义符号 `\` 进行处理(`My\ Photos`)。 
-但是,shell是如何知道去哪里寻找 `date` 或 `echo` 的呢?其实,类似于Python or Ruby,shell是一个编程环境,所以它具备变量、条件、循环和函数(下一课进行讲解)。当你在shell中执行命令时,您实际上是在执行一段shell可以解释执行的简短代码。如果你要求shell执行某个指令,但是该指令并不是shell所了解的编程关键字,那么它会去咨询 _环境变量_ `$PATH`,它会列出当shell接到某条指令时,进行程序搜索的路径: +但是,shell 是如何知道去哪里寻找 `date` 或 `echo` 的呢?其实,类似于 Python 或 Ruby,shell 是一个编程环境,所以它具备变量、条件、循环和函数(下一课进行讲解)。当你在 shell 中执行命令时,您实际上是在执行一段 shell 可以解释执行的简短代码。如果你要求 shell 执行某个指令,但是该指令并不是 shell 所了解的编程关键字,那么它会去咨询 _环境变量_ `$PATH`,它会列出当 shell 接到某条指令时,进行程序搜索的路径: ```console @@ -90,13 +89,13 @@ missing:~$ /bin/echo $PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin ``` -当我们执行 `echo` 命令时,shell了解到需要执行 `echo` 这个程序,随后它便会在`$PATH`中搜索由`:`所分割的一系列目录,基于名字搜索该程序。当找到该程序时便执行(假定该文件是 _可执行程序_,后续课程将详细讲解)。确定某个程序名代表的是哪个具体的程序,可以使用 -`which` 程序。我们也可以绕过 `$PATH` ,通过直接指定需要执行的程序的路径来执行该程序 +当我们执行 `echo` 命令时,shell 了解到需要执行 `echo` 这个程序,随后它便会在 `$PATH` 中搜索由 `:` 所分割的一系列目录,基于名字搜索该程序。当找到该程序时便执行(假定该文件是 _可执行程序_,后续课程将详细讲解)。确定某个程序名代表的是哪个具体的程序,可以使用 +`which` 程序。我们也可以绕过 `$PATH`,通过直接指定需要执行的程序的路径来执行该程序 ## 在shell中导航 -shell中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是`\`。路径 `/`代表的是系统的根目录,所有的文件夹都包括在这个路径之下,在Windows上每个盘都有一个根目录(例如: -`C:\`)。 我们假设您在学习本课程时使用的是Linux文件系统。如果某个路径以`/` 开头,那么它是一个 _绝对路径_,其他的都是 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: +shell 中的路径是一组被分割的目录,在 Linux 和 macOS 上使用 `/` 分割,而在Windows上是 `\`。路径 `/` 代表的是系统的根目录,所有的文件夹都包括在这个路径之下,在Windows上每个盘都有一个根目录(例如: +`C:\`)。 我们假设您在学习本课程时使用的是 Linux 文件系统。如果某个路径以 `/` 开头,那么它是一个 _绝对路径_,其他的都是 _相对路径_ 。相对路径是指相对于当前工作目录的路径,当前工作目录可以使用 `pwd` 命令来获取。此外,切换目录需要使用 `cd` 命令。在路径中,`.` 表示的是当前目录,而 `..` 表示上级目录: ```console missing:~$ pwd @@ -117,11 +116,11 @@ missing:~$ ../../bin/echo hello hello ``` -注意,shell会实时显示当前的路径信息。您可以通过配置shell提示符来显示各种有用的信息,这一内容我们会在后面的课程中进行讨论。 +注意,shell 会实时显示当前的路径信息。您可以通过配置 shell 提示符来显示各种有用的信息,这一内容我们会在后面的课程中进行讨论。 一般来说,当我们运行一个程序时,如果我们没有指定路径,则该程序会在当前目录下执行。例如,我们常常会搜索文件,并在需要时创建文件。 -为了查看指定目录下包含哪些文件,我们使用`ls` 命令: +为了查看指定目录下包含哪些文件,我们使用 `ls` 命令: ```console missing:~$ ls @@ -138,7 +137,7 @@ 
home ... ``` -除非我们利用第一个参数指定目录,否则 `ls` 会打印当前目录下的文件。大多数的命令接受标记和选项(带有值的标记),它们以`-` 开头,并可以改变程序的行为。通常,在执行程序时使用`-h` 或 `--help` 标记可以打印帮助信息,以便了解有哪些可用的标记或选项。例如,`ls --help` 的输出如下: +除非我们利用第一个参数指定目录,否则 `ls` 会打印当前目录下的文件。大多数的命令接受标记和选项(带有值的标记),它们以 `-` 开头,并可以改变程序的行为。通常,在执行程序时使用 `-h` 或 `--help` 标记可以打印帮助信息,以便了解有哪些可用的标记或选项。例如,`ls --help` 的输出如下: ``` -l use a long listing format @@ -149,14 +148,14 @@ missing:~$ ls -l /home drwxr-xr-x 1 missing users 4096 Jun 15 2019 missing ``` -这个参数可以打印出更加详细地列出目录下文件或文件夹的信息。首先,本行第一个字符`d` 表示 +这个参数可以打印出更加详细地列出目录下文件或文件夹的信息。首先,本行第一个字符 `d` 表示 `missing` 是一个目录。然后接下来的九个字符,每三个字符构成一组。 -(`rwx`). 它们分别代表了文件所有者(`missing`),用户组 (`users`) 以及其他所有人具有的权限。其中 `-`表示该用户不具备相应的权限。从上面的信息来看,只有文件所有者可以修改(`w`) , `missing` 文件夹 (例如,添加或删除文件夹中的文件)。为了进入某个文件夹,用户需要具备该文件夹以及其父文件夹的“搜索”权限(以“可执行”:`x`)权限表示。为了列出它的包含的内容,用户必须对该文件夹具备读权限(`r`)。对于文件来说,权限的意义也是类似的。注意,`/bin`目录下的程序在最后一组,即表示所有人的用户组中,均包含`x`权限,也就是说任何人都可以执行这些程序。 +(`rwx`). 它们分别代表了文件所有者(`missing`),用户组(`users`) 以及其他所有人具有的权限。其中 `-` 表示该用户不具备相应的权限。从上面的信息来看,只有文件所有者可以修改(`w`),`missing` 文件夹 (例如,添加或删除文件夹中的文件)。为了进入某个文件夹,用户需要具备该文件夹以及其父文件夹的“搜索”权限(以“可执行”:`x`)权限表示。为了列出它的包含的内容,用户必须对该文件夹具备读权限(`r`)。对于文件来说,权限的意义也是类似的。注意,`/bin` 目录下的程序在最后一组,即表示所有人的用户组中,均包含 `x` 权限,也就是说任何人都可以执行这些程序。 -在这个阶段,还有几个趁手的命令是您需要掌握的,例如 `mv` (用于重命名或移动文件)、 `cp` (拷贝文件)以及 `mkdir` (新建文件夹)。 +在这个阶段,还有几个趁手的命令是您需要掌握的,例如 `mv`(用于重命名或移动文件)、 `cp`(拷贝文件)以及 `mkdir`(新建文件夹)。 -如果您想要知道关于程序参数、输入输出的信息,亦或是想要了解它们的工作方式,请试试 `man` 这个程序。它会接受一个程序名作为参数,然后将它的文档(用户手册)展现给您。注意,使用`q` 可以退出该程序。 +如果您想要知道关于程序参数、输入输出的信息,亦或是想要了解它们的工作方式,请试试 `man` 这个程序。它会接受一个程序名作为参数,然后将它的文档(用户手册)展现给您。注意,使用 `q` 可以退出该程序。 ```console missing:~$ man ls @@ -164,7 +163,7 @@ missing:~$ man ls ## 在程序间创建连接 -在shell中,程序有两个主要的“流”:它们的输入流和输出流。 +在 shell 中,程序有两个主要的“流”:它们的输入流和输出流。 当程序尝试读取信息时,它们会从输入流中进行读取,当程序打印信息时,它们会将信息输出到输出流中。 通常,一个程序的输入输出流都是您的终端。也就是,您的键盘作为输入,显示器作为输出。 但是,我们也可以重定向这些流! 
@@ -183,7 +182,7 @@ hello ``` 您还可以使用 `>>` 来向一个文件追加内容。使用管道( _pipes_ ),我们能够更好的利用文件重定向。 -`|`操作符允许我们将一个程序的输出和另外一个程序的输入连接起来: +`|` 操作符允许我们将一个程序的输出和另外一个程序的输入连接起来: ```console missing:~$ ls -l / | tail -n1 @@ -196,14 +195,14 @@ missing:~$ curl --head --silent google.com | grep --ignore-case content-length | ## 一个功能全面又强大的工具 -对于大多数的类Unix系统,有一类用户是非常特殊的,那就是:根用户(root用户)。 +对于大多数的类 Unix 系统,有一类用户是非常特殊的,那就是:根用户(root user)。 您应该已经注意到了,在上面的输出结果中,根用户几乎不受任何限制,他可以创建、读取、更新和删除系统中的任何文件。 通常在我们并不会以根用户的身份直接登录系统,因为这样可能会因为某些错误的操作而破坏系统。 -取而代之的是我们会在需要的时候使用 `sudo` 命令。顾名思义,它的作用是让您可以以su(super user 或 root的简写)的身份do一些事情。 -当您遇到拒绝访问(permission denied)的错误时,通常是因为此时您必须是根用户才能操作。此时也请再次确认您是真的要执行此操作。 +取而代之的是我们会在需要的时候使用 `sudo` 命令。顾名思义,它的作用是让您可以以 su(super user 或 root 的简写)的身份执行一些操作。 +当您遇到拒绝访问(permission denied)的错误时,通常是因为此时您必须是根用户才能操作。然而,请再次确认您是真的要执行此操作。 -有一件事情是您必须作为根用户才能做的,那就是向`sysfs` 文件写入内容。系统被挂载在`/sys`下, `sysfs` 文件则暴露了一些内核(kernel)参数。 -因此,您不需要借助任何专用的工具,就可以轻松地在运行期间配置系统内核。**注意 Windows or macOS没有这个文件** +有一件事情是您必须作为根用户才能做的,那就是向 `sysfs` 文件写入内容。系统被挂载在 `/sys` 下,`sysfs` 文件则暴露了一些内核(kernel)参数。 +因此,您不需要借助任何专用的工具,就可以轻松地在运行期间配置系统内核。**注意 Windows 和 macOS 没有这个文件** 例如,您笔记本电脑的屏幕亮度写在 `brightness` 文件中,它位于 @@ -222,17 +221,17 @@ An error occurred while redirecting file 'brightness' open: Permission denied ``` 出乎意料的是,我们还是得到了一个错误信息。毕竟,我们已经使用了 -`sudo` 命令!关于shell,有件事我们必须要知道。`|`、`>`、和 `<` 是通过shell执行的,而不是被各个程序单独执行。 -`echo` 等程序并不知道`|`的存在,它们只知道从自己的输入输出流中进行读写。 -对于上面这种情况, _shell_ (权限为您的当前用户) 在设置 `sudo echo` 前尝试打开 brightness 文件并写入,但是系统拒绝了shell的操作因为此时shell不是根用户。 +`sudo` 命令!关于 shell,有件事我们必须要知道。`|`、`>`、和 `<` 是通过 shell 执行的,而不是被各个程序单独执行。 +`echo` 等程序并不知道 `|` 的存在,它们只知道从自己的输入输出流中进行读写。 +对于上面这种情况, _shell_ (权限为您的当前用户) 在设置 `sudo echo` 前尝试打开 brightness 文件并写入,但是系统拒绝了 shell 的操作因为此时 shell 不是根用户。 明白这一点后,我们可以这样操作: ```console $ echo 3 | sudo tee brightness ``` -因为打开`/sys` 文件的是`tee`这个程序,并且该程序以`root`权限在运行,因此操作可以进行。 -这样您就可以在`/sys`中愉快地玩耍了,例如修改系统中各种LED的状态(路径可能会有所不同): +因为打开 `/sys` 文件的是 `tee` 这个程序,并且该程序以 `root` 权限在运行,因此操作可以进行。 +这样您就可以在 `/sys` 
中愉快地玩耍了,例如修改系统中各种LED的状态(路径可能会有所不同): ```console $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness @@ -240,8 +239,8 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness # 接下来..... -学到这里,您掌握的shell知识已经可以完成一些基础的任务了。您应该已经可以查找感兴趣的文件并使用大多数程序的基本功能了。 -在下一场讲座中,我们会探讨如何利用shell及其他工具执行并自动化更复杂的任务。 +学到这里,您掌握的 shell 知识已经可以完成一些基础的任务了。您应该已经可以查找感兴趣的文件并使用大多数程序的基本功能了。 +在下一场讲座中,我们会探讨如何利用 shell 及其他工具执行并自动化更复杂的任务。 # 课后练习 @@ -254,14 +253,14 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness curl --head --silent https://missing.csail.mit.edu ``` 第一行可能有点棘手, `#` 在Bash中表示注释,而 `!` 即使被双引号(`"`)包裹也具有特殊的含义。 - 单引号(`'`)则不一样,此处利用这一点解决输入问题。更多信息请参考 [Bash quoting手册](https://www.gnu.org/software/bash/manual/html_node/Quoting.html) + 单引号(`'`)则不一样,此处利用这一点解决输入问题。更多信息请参考 [Bash quoting 手册](https://www.gnu.org/software/bash/manual/html_node/Quoting.html) -1. 尝试执行这个文件。例如,将该脚本的路径(`./semester`)输入到您的shell中并回车。如果程序无法执行,请使用 `ls`命令来获取信息并理解其不能执行的原因。 -2. 查看 `chmod` 的手册(例如,使用`man chmod`命令) +1. 尝试执行这个文件。例如,将该脚本的路径(`./semester`)输入到您的shell中并回车。如果程序无法执行,请使用 `ls` 命令来获取信息并理解其不能执行的原因。 +2. 查看 `chmod` 的手册(例如,使用 `man chmod` 命令) -3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用`sh semester`来执行该程序。您的shell是如何知晓这个文件需要使用`sh`来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) +3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用 `sh semester` 来执行该程序。您的 shell 是如何知晓这个文件需要使用 `sh` 来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) 4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入根目录下的 `last-modified.txt` 的文件中 -5. 写一段命令来从 `/sys` 中获取笔记本的电量信息,或者台式机CPU的温度。注意:macOS并没有sysfs,所以mac用户可以跳过这一题。 +5. 写一段命令来从 `/sys` 中获取笔记本的电量信息,或者台式机 CPU 的温度。注意:macOS 并没有 sysfs,所以 Mac 用户可以跳过这一题。 From 06457676cbab990489806a7271fabc68c8929f57 Mon Sep 17 00:00:00 2001 From: walter <32014404+hzcheney@users.noreply.github.com> Date: Fri, 2 Oct 2020 19:08:07 +0800 Subject: [PATCH 477/640] Update shell-tools.md fixed:redundancy line deleted. 
--- _2020/shell-tools.md | 1 - 1 file changed, 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 2c8dcae3..e8c96c59 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -23,7 +23,6 @@ shell脚本是一种更加复杂度的工具。 需要注意的是,`foo = bar` (使用括号隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 总的来说,在shell脚本中使用空格会起到分割参数的作用,有时候可能会造成混淆,请务必多加检查。 -Strings in bash can be defined with `'` and `"` delimiters but they are not equivalent. Bash中的字符串通过`'` 和 `"`分隔符来定义,但是它们的含义并不相同。以`'`定义的字符串为原义字符串,其中的变量不会被转义,而 `"`定义的字符串会将变量值进行替换。 ```bash From b64a1ea8e37c2dce25f82eb21c5beba23cdac381 Mon Sep 17 00:00:00 2001 From: uniquebby <15367145356@163.com> Date: Tue, 6 Oct 2020 14:36:37 +0800 Subject: [PATCH 478/640] Fix-some-typo in shell_tools.md and version-control.md --- _2020/shell-tools.md | 3 ++- _2020/version-control.md | 7 ++++--- 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index e8c96c59..2b0e1895 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -282,6 +282,7 @@ Fasd 基于 [_frecency_](https://developer.mozilla.org/en/The_Places_frecency_al # 课后练习 + 1. 阅读 [`man ls`](http://man7.org/linux/man-pages/man1/ls.1.html) ,然后使用`ls` 命令进行如下操作: - 所有文件(包括隐藏文件) @@ -345,7 +346,7 @@ ls -lath --color=auto cat out.txt {% endcomment %} -4. 本节课我们讲解了 `find` 命令的 `-exec` 参数非常强大,它可以对我们查找对文件进行操作。但是,如果我们要对所有文件进行操作呢?例如创建一个zip压缩文件?我们已经知道,命令行可以从参数或标准输入接受输入。在用管道连接命令时,我们将标准输出和标准输入连接起来,但是有些命令,例如`tar` 则需要从参数接受输入。这里我们可以使用[`xargs`](http://man7.org/linux/man-pages/man1/xargs.1.html) 命令,它可以使用标准输入中的内容作为参数。 +4. 
本节课我们讲解了 `find` 命令的 `-exec` 参数非常强大,它可以对我们查找的文件进行操作。但是,如果我们要对所有文件进行操作呢?例如创建一个zip压缩文件?我们已经知道,命令行可以从参数或标准输入接受输入。在用管道连接命令时,我们将标准输出和标准输入连接起来,但是有些命令,例如`tar` 则需要从参数接受输入。这里我们可以使用[`xargs`](http://man7.org/linux/man-pages/man1/xargs.1.html) 命令,它可以使用标准输入中的内容作为参数。 例如 `ls | xargs rm` 会删除当前目录中的所有文件。 您的任务是编写一个命令,它可以递归地查找文件夹中所有的HTML文件,并将它们压缩成zip文件。注意,即使文件名中包含空格,您的命令也应该能够正确执行(提示:查看 `xargs`的参数`-d`) diff --git a/_2020/version-control.md b/_2020/version-control.md index b0dcffa0..f128afc4 100644 --- a/_2020/version-control.md +++ b/_2020/version-control.md @@ -24,7 +24,7 @@ video: 因为 Git 接口的抽象泄漏(leaky abstraction)问题,通过自顶向下的方式(从命令行接口开始)学习 Git 可能会让人感到非常困惑。很多时候您只能死记硬背一些命令行,然后像使用魔法一样使用它们,一旦出现问题,就只能像上面那幅漫画里说的那样去处理了。 -尽管 Git 的接口有些丑陋,但是它的底层设计和思想却是非常优雅的。丑陋的接口只能靠死记硬背,而优雅的底层设计则非常容易被人理解。因此,我们将通过一种自底向上的方式像您介绍 Git。我们会从数据模型开始,最后再学习它的接口。一旦您搞懂了 Git 的数据模型,再学习其接口并理解这些接口是如何操作数据模型的就非常容易了。 +尽管 Git 的接口有些丑陋,但是它的底层设计和思想却是非常优雅的。丑陋的接口只能靠死记硬背,而优雅的底层设计则非常容易被人理解。因此,我们将通过一种自底向上的方式向您介绍 Git。我们会从数据模型开始,最后再学习它的接口。一旦您搞懂了 Git 的数据模型,再学习其接口并理解这些接口是如何操作数据模型的就非常容易了。 # Git 的数据模型 @@ -407,7 +407,7 @@ command is used for merging. 有[很多](https://nvie.com/posts/a-successful-git-branching-model/) [不同的](https://www.endoflineblog.com/gitflow-considered-harmful) [处理方法](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)) -- **GitHub**: Git 并不等同于 GitHub。 在 GitHub 中您需要使用一个被称作[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)的方法来像其他项目贡献代码 +- **GitHub**: Git 并不等同于 GitHub。 在 GitHub 中您需要使用一个被称作[拉取请求(pull request)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests)的方法来向其他项目贡献代码 - **Other Git 提供商**: GitHub 并不是唯一的。还有像[GitLab](https://about.gitlab.com/) 和 [BitBucket](https://bitbucket.org/)这样的平台。 # 资源 @@ -422,12 +422,13 @@ command is used for merging. # 课后练习 + 1. 
如果您之前从来没有用过 Git,推荐您阅读 [Pro Git](https://git-scm.com/book/en/v2) 的前几章,或者完成像[Learn Git Branching](https://learngitbranching.js.org/)这样的教程。重点关注 Git 命令和数据模型相关内容; 2. 克隆 [本课程网站的仓库](https://github.com/missing-semester/missing-semester) 1. 将版本历史可视化并进行探索 2. 是谁最后修改来 `README.md`文件?(提示:使用 `git log` 命令并添加合适的参数) 3. 最后一次修改`_config.yml` 文件中 `collections:` 行时的提交信息是什么?(提示:使用`git blame` 和 `git show`) -3. 使用 Git 时的一个常见错误是提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试像仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)); +3. 使用 Git 时的一个常见错误是提交本不应该由 Git 管理的大文件,或是将含有敏感信息的文件提交给 Git 。尝试向仓库中添加一个文件并添加提交信息,然后将其从历史中删除 ( [这篇文章也许会有帮助](https://help.github.com/articles/removing-sensitive-data-from-a-repository/)); 4. 从 GitHub 上克隆某个仓库,修改一些文件。当您使用 `git stash` 会发生什么?当您执行 `git log --all --oneline` 时会显示什么?通过 `git stash pop` 命令来撤销 `git stash`操作,什么时候会用到这一技巧? 5. 与其他的命令行工具一样,Git 也提供了一个名为 `~/.gitconfig` 配置文件 (或 dotfile)。请在 `~/.gitconfig` 中创建一个别名,使您在运行 `git graph` 时,您可以得到 `git log --all --graph --decorate --oneline`的输出结果; 6. 您可以通过执行`git config --global core.excludesfile ~/.gitignore_global` 在 `~/.gitignore_global` 中创建全局忽略规则。配置您的全局 gitignore 文件来字典忽略系统或编辑器的临时文件,例如 `.DS_Store`; From ac61fa5545ca9a5a6b09b633ba92c5a6571a3f76 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E9=99=88=E4=BA=8E=E8=A1=A1?= <65726360+skyfact@users.noreply.github.com> Date: Thu, 19 Nov 2020 17:03:11 +0800 Subject: [PATCH 479/640] Update shell-tools.md fix a code error to be consistent with the original course --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index 2b0e1895..b2e588e9 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -203,7 +203,7 @@ find . -size +500k -size -10M -name '*.tar.gz' # Delete all files with .tmp extension find . -name '*.tmp' -exec rm {} \; # Find all PNG files and convert them to JPG -find . -name '*.png' -exec convert {} {.}.jpg \; +find . 
-name '*.png' -exec convert {} {}.jpg \; ``` 尽管 `find` 用途广泛,它的语法却比较难以记忆。例如,为了查找满足模式 `PATTERN` 的文件,您需要执行 `find -name '*PATTERN*'` (如果您希望模式匹配时是不区分大小写,可以使用`-iname`选项) From 2907e904a073980c85dd9f99012c527a7f6435f6 Mon Sep 17 00:00:00 2001 From: NaLan ZeYu Date: Fri, 18 Dec 2020 06:53:54 +0800 Subject: [PATCH 480/640] Fix broken link --- _2020/potpourri.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/potpourri.md b/_2020/potpourri.md index 6b40904a..bbfe924d 100644 --- a/_2020/potpourri.md +++ b/_2020/potpourri.md @@ -219,7 +219,7 @@ Markdown 不仅容易上手,而且应用非常广泛。实际上本课程的 - 对硬盘上的相同操作系统进行修复; - 恢复硬盘上的数据。 -Live USB 通过在闪存盘上 _写入_ 操作系统的镜像制作,而写入不是单纯的往闪存盘上复制 `.iso` 文件。你可以使用 [UNetbootin](https://unetbootin.github.io/) 、[Rufus](github.com/pbatard/rufus) 等 Live USB 写入工具制作。 +Live USB 通过在闪存盘上 _写入_ 操作系统的镜像制作,而写入不是单纯的往闪存盘上复制 `.iso` 文件。你可以使用 [UNetbootin](https://unetbootin.github.io/) 、[Rufus](https://github.com/pbatard/rufus) 等 Live USB 写入工具制作。 ## Docker, Vagrant, VMs, Cloud, OpenStack From d8add5eaa5cfb74e0a9223a5d3a4dde7868e66c0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E4=BB=BB=E6=96=87=E9=BE=99?= Date: Fri, 18 Dec 2020 11:59:24 +0800 Subject: [PATCH 481/640] fix typo --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 942fab53..e22bdb15 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -256,7 +256,7 @@ CPU 性能分析工具有两种: 追踪分析器(_tracing_)及采样分析 追踪分析器 会记录程序的每一次函数调用,而采样分析器则只会周期性的监测(通常为每毫秒)您的程序并记录程序堆栈。它们使用这些记录来生成统计信息,显示程序在哪些事情上花费了最多的时间。如果您希望了解更多相关信息,可以参考[这篇](https://jvns.ca/blog/2017/12/17/how-do-ruby---python-profilers-work-) 介绍性的文章。 -大多数的编程语言都有一些基于命令行都分析器,我们可以使用它们来分析代码。它们通常可以集成在 IDE 中,但是本节课我们会专注于这些命令行工具本身。 +大多数的编程语言都有一些基于命令行的分析器,我们可以使用它们来分析代码。它们通常可以集成在 IDE 中,但是本节课我们会专注于这些命令行工具本身。 在 Python 中,我们使用 `cProfile` 模块来分析每次函数调用所消耗都时间。 在下面的例子中,我们实现了一个基础的 grep 命令: From c5e749b83dd6163f9560c41bd3ffda0a17125006 Mon Sep 17 00:00:00 2001 
From: LGiki Date: Sat, 19 Dec 2020 08:57:50 +0800 Subject: [PATCH 482/640] Fix typo in qa.md --- _2020/qa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/qa.md b/_2020/qa.md index 0f211ba8..cbeaded3 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -67,8 +67,8 @@ video: ## `source script.sh` 和 `./script.sh` 有什么区别? -这两种情况 `script.sh` 都会在bash会话中被读取和执行,不同点在于那个会话执行这个命令。 -对于 `source` 命令来说,命令是在当前的bash会话种执行的,因此当 `source` 执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 +这两种情况 `script.sh` 都会在bash会话中被读取和执行,不同点在于哪个会话执行这个命令。 +对于 `source` 命令来说,命令是在当前的bash会话中执行的,因此当 `source` 执行完毕,对当前环境的任何更改(例如更改目录或是定义函数)都会留存在当前会话中。 单独运行 `./script.sh` 时,当前的bash会话将启动新的bash会话(实例),并在新实例中运行命令 `script.sh`。 因此,如果 `script.sh` 更改目录,新的bash会话(实例)会更改目录,但是一旦退出并将控制权返回给父bash会话,父会话仍然留在先前的位置(不会有目录的更改)。 同样,如果 `script.sh` 定义了要在终端中访问的函数,需要用 `source` 命令在当前bash会话中定义这个函数。否则,如果你运行 `./script.sh`,只有新的bash会话(进程)才能执行定义的函数,而当前的shell不能。 From 144f586fbbb7adf2cf785caf5f19c61720dfa8a8 Mon Sep 17 00:00:00 2001 From: Chunlin Zhang Date: Wed, 23 Dec 2020 14:00:40 +0800 Subject: [PATCH 483/640] =?UTF-8?q?=E6=89=80=E6=B6=88=E8=80=97=E9=83=BD?= =?UTF-8?q?=E6=97=B6=E9=97=B4=20->=20=E6=89=80=E6=B6=88=E8=80=97=E7=9A=84?= =?UTF-8?q?=E6=97=B6=E9=97=B4?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index e22bdb15..4585d491 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -258,7 +258,7 @@ CPU 性能分析工具有两种: 追踪分析器(_tracing_)及采样分析 大多数的编程语言都有一些基于命令行的分析器,我们可以使用它们来分析代码。它们通常可以集成在 IDE 中,但是本节课我们会专注于这些命令行工具本身。 -在 Python 中,我们使用 `cProfile` 模块来分析每次函数调用所消耗都时间。 在下面的例子中,我们实现了一个基础的 grep 命令: +在 Python 中,我们使用 `cProfile` 模块来分析每次函数调用所消耗的时间。 在下面的例子中,我们实现了一个基础的 grep 命令: ```python #!/usr/bin/env python From 2fd928e439fc1c250cb9ce03a05f6887ab0801a7 Mon Sep 17 00:00:00 2001 From: akiirokaede Date: Wed, 23 Dec 2020 
18:05:46 +0800 Subject: [PATCH 484/640] =?UTF-8?q?=E4=BF=AE=E6=94=B9=E8=AF=AF=E8=BE=93?= =?UTF-8?q?=E5=85=A5=E7=9A=84=E6=96=87=E5=AD=97?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/course-shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index bc531fcd..e98c189a 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -47,7 +47,7 @@ video: 这些交互接口可以覆盖 80% 的使用场景,但是它们也从根本上限制了您的操作方式——你不能点击一个不存在的按钮或者是用语音输入一个还没有被录入的指令。 为了充分利用计算机的能力,我们不得不回到最根本的方式,使用文字接口:Shell -几乎所有您能够接触到的平台都支持某种形式的 shell,有些甚至还提供了多种 shell 供您选择。虽然它们之间有些细节上都差异,但是其核心功能都是一样的:它允许你执行程序,输入并获取某种半结构化的输出。 +几乎所有您能够接触到的平台都支持某种形式的 shell,有些甚至还提供了多种 shell 供您选择。虽然它们之间有些细节上的差异,但是其核心功能都是一样的:它允许你执行程序,输入并获取某种半结构化的输出。 本节课我们会使用 Bourne Again SHell, 简称 "bash" 。 这是被最广泛使用的一种 shell,它的语法和其他的 shell 都是类似的。打开shell _提示符_(您输入指令的地方),您首先需要打开 _终端_ 。您的设备通常都已经内置了终端,或者您也可以安装一个,非常简单。 From 8587cde1e74c9e5f055df0d9005d158f6a59c2cc Mon Sep 17 00:00:00 2001 From: Jay Xu Date: Wed, 30 Dec 2020 17:51:15 +0800 Subject: [PATCH 485/640] Update metaprogramming.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 引号 -> 冒号 --- _2020/metaprogramming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/metaprogramming.md b/_2020/metaprogramming.md index d48701f6..14531304 100644 --- a/_2020/metaprogramming.md +++ b/_2020/metaprogramming.md @@ -33,7 +33,7 @@ plot-%.png: %.dat plot.py ./plot.py -i $*.dat -o $@ ``` -这个文件中的指令,即如何使用右侧文件构建左侧文件的规则。或者,换句话说,引号左侧的是构建目标,引号右侧的是构建它所需的依赖。缩进的部分是从依赖构建目标时需要用到的一段程序。在 `make` 中,第一条指令还指明了构建的目的,如果您使用不带参数的 `make`,这便是我们最终的构建结果。或者,您可以使用这样的命令来构建其他目标:`make plot-data.png`。 +这个文件中的指令,即如何使用右侧文件构建左侧文件的规则。或者,换句话说,冒号左侧的是构建目标,冒号右侧的是构建它所需的依赖。缩进的部分是从依赖构建目标时需要用到的一段程序。在 `make` 中,第一条指令还指明了构建的目的,如果您使用不带参数的 `make`,这便是我们最终的构建结果。或者,您可以使用这样的命令来构建其他目标:`make plot-data.png`。 规则中的 `%` 是一种模式,它会匹配其左右两侧相同的字符串。例如,如果目标是 `plot-foo.png`, `make` 会去寻找 `foo.dat` 和 
`plot.py` 作为依赖。现在,让我们看看如果在一个空的源码目录中执行`make` 会发生什么? From 60cd1a3f72835dd3133fda99946e704b3f042681 Mon Sep 17 00:00:00 2001 From: Sean Li Date: Fri, 1 Jan 2021 21:27:32 +0800 Subject: [PATCH 486/640] Fix words typo MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 明知➡明智 --- _2020/data-wrangling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/data-wrangling.md b/_2020/data-wrangling.md index 381f0044..fcfccfb2 100644 --- a/_2020/data-wrangling.md +++ b/_2020/data-wrangling.md @@ -263,7 +263,7 @@ ffmpeg -loglevel panic -i /dev/video0 -frames 1 -f image2 - 1. 学习一下这篇简短的 [交互式正则表达式教程](https://regexone.com/). 2. 统计words文件 (`/usr/share/dict/words`) 中包含至少三个`a` 且不以`'s` 结尾的单词个数。这些单词中,出现频率前三的末尾两个字母是什么? `sed`的 `y`命令,或者 `tr` 程序也许可以帮你解决大小写的问题。共存在多少种词尾两字母组合?还有一个很 有挑战性的问题:哪个组合从未出现过? 3. 进行原地替换听上去很有诱惑力,例如: - `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`。但是这并不是一个明知的做法,为什么呢?还是说只有 `sed`是这样的? 查看 `man sed` 来完成这个问题 + `sed s/REGEX/SUBSTITUTION/ input.txt > input.txt`。但是这并不是一个明智的做法,为什么呢?还是说只有 `sed`是这样的? 查看 `man sed` 来完成这个问题 4. 
找出您最近十次开机的开机时间平均数、中位数和最长时间。在Linux上需要用到 `journalctl` ,而在 macOS 上使用 `log show`。找到每次起到开始和结束时的时间戳。在Linux上类似这样操作: ``` From 4d370b411b38a7861619a649b6578d224d857545 Mon Sep 17 00:00:00 2001 From: Sean Li Date: Sun, 3 Jan 2021 11:50:53 +0800 Subject: [PATCH 487/640] fix typo --- _2020/command-line.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 438309a4..99507c7c 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -348,7 +348,7 @@ ssh-copy-id -i .ssh/id_ed25519.pub foobar@remote ![Remote Port Forwarding](https://i.stack.imgur.com/4iK3b.png  "远程端口转发") -常见的情景是使用本地端口转发,即远端设备上的服务监听一个端口,而您希望在本地设备上的一个端口建立连接并转发到远程端口上。例如,我们在远端服务器上运行 Jupyter notebook 并监听 `8888` 端口。 染后,建立从本地端口 `9999` 的转发,使用 `ssh -L 9999:localhost:8888 foobar@remote_server` 。这样只需要访问本地的 `localhost:9999` 即可。 +常见的情景是使用本地端口转发,即远端设备上的服务监听一个端口,而您希望在本地设备上的一个端口建立连接并转发到远程端口上。例如,我们在远端服务器上运行 Jupyter notebook 并监听 `8888` 端口。 然后,建立从本地端口 `9999` 的转发,使用 `ssh -L 9999:localhost:8888 foobar@remote_server` 。这样只需要访问本地的 `localhost:9999` 即可。 @@ -477,4 +477,4 @@ Host vm 2. 使用`python -m http.server 8888` 在您的虚拟机中启动一个 Web 服务器并通过本机的`http://localhost:9999` 访问虚拟机上的 Web 服务器 3. 使用`sudo vim /etc/ssh/sshd_config` 编辑 SSH 服务器配置,通过修改`PasswordAuthentication`的值来禁用密码验证。通过修改`PermitRootLogin`的值来禁用 root 登陆。然后使用`sudo service sshd restart`重启 `ssh` 服务器,然后重新尝试。 4. (附加题) 在虚拟机中安装 [`mosh`](https://mosh.org/) 并启动连接。然后断开服务器/虚拟机的网络适配器。mosh可以恢复连接吗? -5. (附加题) 查看`ssh`的`-N` 和 `-f` 选项的作用,找出在后台进行端口转发的命令是什么? \ No newline at end of file +5. (附加题) 查看`ssh`的`-N` 和 `-f` 选项的作用,找出在后台进行端口转发的命令是什么? 
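Reviewer note on the `command-line.md` patch above: the corrected paragraph describes local port forwarding with `ssh -L 9999:localhost:8888 foobar@remote_server`. The following is a minimal dry-run sketch of how that command is assembled from its three parts. It only prints the command and does not open a connection. The helper name `local_forward_cmd` is hypothetical and does not appear in the course notes; the ports and the host are the ones used in the patched paragraph.

```shell
#!/bin/sh
# Dry-run sketch: build the local-forwarding command described in the patched
# paragraph and print it instead of executing it.
# local_forward_cmd is a hypothetical helper name, not part of the course notes.
local_forward_cmd() {
    local_port=$1     # port opened on the local machine, e.g. 9999
    remote_port=$2    # port the remote service listens on, e.g. 8888
    target=$3         # user@host of the remote machine
    # ssh -L LOCAL:HOST:REMOTE forwards connections made to LOCAL on this
    # machine to HOST:REMOTE, resolved from the remote side.
    printf 'ssh -L %s:localhost:%s %s\n' "$local_port" "$remote_port" "$target"
}

# Prints the command from the Jupyter notebook example in the notes.
local_forward_cmd 9999 8888 foobar@remote_server
```

Printing the command rather than running it keeps the sketch usable without a reachable server. To actually forward the port, run the printed command in a terminal.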
From 0159d5977334333942893bd4b7a45f59960a9e50 Mon Sep 17 00:00:00 2001 From: roseauhan Date: Sun, 3 Jan 2021 14:49:33 +0800 Subject: [PATCH 488/640] Update shell-tools.md --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index b2e588e9..0adb51cd 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -20,7 +20,7 @@ shell脚本是一种更加复杂度的工具。 大多数shell都有自己的一套脚本语言,包括变量、控制流和自己的语法。shell脚本与其他脚本语言不同之处在于,shell脚本针对shell所从事的相关工作进行来优化。因此,创建命令流程(pipelines)、将结果保存到文件、从标准输入中读取输入,这些都是shell脚本中的原生操作,这让它比通用的脚本语言更易用。本节中,我们会专注于bash脚本,因为它最流行,应用更为广泛。 在bash中为变量赋值的语法是`foo=bar`,访问变量中存储的数值,其语法为 `$foo`。 -需要注意的是,`foo = bar` (使用括号隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 +需要注意的是,`foo = bar` (使用空格隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 总的来说,在shell脚本中使用空格会起到分割参数的作用,有时候可能会造成混淆,请务必多加检查。 Bash中的字符串通过`'` 和 `"`分隔符来定义,但是它们的含义并不相同。以`'`定义的字符串为原义字符串,其中的变量不会被转义,而 `"`定义的字符串会将变量值进行替换。 From e41ba04b63b19eb61f32266d07b193ee3564bb1b Mon Sep 17 00:00:00 2001 From: lucifer Date: Mon, 4 Jan 2021 14:52:04 +0800 Subject: [PATCH 489/640] fix: typo --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index b2e588e9..0adb51cd 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -20,7 +20,7 @@ shell脚本是一种更加复杂度的工具。 大多数shell都有自己的一套脚本语言,包括变量、控制流和自己的语法。shell脚本与其他脚本语言不同之处在于,shell脚本针对shell所从事的相关工作进行来优化。因此,创建命令流程(pipelines)、将结果保存到文件、从标准输入中读取输入,这些都是shell脚本中的原生操作,这让它比通用的脚本语言更易用。本节中,我们会专注于bash脚本,因为它最流行,应用更为广泛。 在bash中为变量赋值的语法是`foo=bar`,访问变量中存储的数值,其语法为 `$foo`。 -需要注意的是,`foo = bar` (使用括号隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 +需要注意的是,`foo = bar` (使用空格隔开)是不能正确工作的,因为解释器会调用程序`foo` 并将 `=` 和 `bar`作为参数。 总的来说,在shell脚本中使用空格会起到分割参数的作用,有时候可能会造成混淆,请务必多加检查。 Bash中的字符串通过`'` 和 `"`分隔符来定义,但是它们的含义并不相同。以`'`定义的字符串为原义字符串,其中的变量不会被转义,而 `"`定义的字符串会将变量值进行替换。 From aabe85d4cc1f2c56a22c00c960aaa1a0e3d87daf Mon Sep 17 00:00:00 2001 From: lucifer Date: Mon, 4 Jan 2021 
15:01:30 +0800 Subject: [PATCH 490/640] Update shell-tools.md --- _2020/shell-tools.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index b2e588e9..d8d9a9f7 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -44,7 +44,7 @@ mcd () { } ``` -这里 `$1` 是脚本到第一个参数。与其他脚本语言不同到是,bash使用了很多特殊到变量来表示参数、错误代码和相关变量。下面是列举来其中一些变量,更完整到列表可以参考 [这里](https://www.tldp.org/LDP/abs/html/special-chars.html)。 +这里 `$1` 是脚本的第一个参数。与其他脚本语言不同的是,bash使用了很多特殊的变量来表示参数、错误代码和相关变量。下面是列举来其中一些变量,更完整的列表可以参考 [这里](https://www.tldp.org/LDP/abs/html/special-chars.html)。 - `$0` - 脚本名 - `$1` 到 `$9` - 脚本到参数。 `$1` 是第一个参数,依此类推。 - `$@` - 所有参数 From aa2246da39d108fc07bfc1c7e2bb77f087e453a1 Mon Sep 17 00:00:00 2001 From: hongbo-zhu-cn Date: Tue, 5 Jan 2021 23:23:05 +0100 Subject: [PATCH 491/640] Chorme -> Chrome fix typos: Chorme -> Chrome --- _2020/qa.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/qa.md b/_2020/qa.md index cbeaded3..d03efebd 100644 --- a/_2020/qa.md +++ b/_2020/qa.md @@ -179,5 +179,5 @@ Emacs的优点是可以用Lisp语言进行扩展(Lisp比vim默认的脚本语 ## 对于不同的 Web 浏览器有什么评价? 
-2020的浏览器现状是,大部分的浏览器都与 Chrome 类似,因为它们都使用同样的引擎(Blink)。 Microsoft Edge 同样基于 Blink,至于 Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的 Chorme 版本。不管是在性能还是可用性上,Chorme 都是一款很不错的浏览器。如果你想要替代品,我们推荐 Firefox。Firefox 与 Chorme 的在各方面不相上下,并且在隐私方面更加出色。 +2020的浏览器现状是,大部分的浏览器都与 Chrome 类似,因为它们都使用同样的引擎(Blink)。 Microsoft Edge 同样基于 Blink,至于 Safari 基于 WebKit(与Blink类似的引擎),这些浏览器仅仅是更糟糕的 Chrome 版本。不管是在性能还是可用性上,Chrome 都是一款很不错的浏览器。如果你想要替代品,我们推荐 Firefox。Firefox 与 Chrome 的在各方面不相上下,并且在隐私方面更加出色。 有一款目前还没有完成的叫 Flow 的浏览器,它实现了全新的渲染引擎,有望比现有引擎速度更快。 From feb9fba58d3c92214eb8fdff28b7df68b1a429d8 Mon Sep 17 00:00:00 2001 From: iron Date: Wed, 6 Jan 2021 12:21:48 +0800 Subject: [PATCH 492/640] Update course-shell.md No need for translation --- _2020/course-shell.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/course-shell.md b/_2020/course-shell.md index 5734d584..4f3366a1 100644 --- a/_2020/course-shell.md +++ b/_2020/course-shell.md @@ -260,7 +260,7 @@ $ echo 1 | sudo tee /sys/class/leds/input6::scrolllock/brightness 3. 使用 `chmod` 命令改变权限,使 `./semester` 能够成功执行,不要使用 `sh semester` 来执行该程序。您的 shell 是如何知晓这个文件需要使用 `sh` 来解析呢?更多信息请参考:[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) -4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入家目录下的 `last-modified.txt` 的文件中 +4. 使用 `|` 和 `>` ,将 `semester` 文件输出的最后更改日期信息,写入/home目录下的 `last-modified.txt` 的文件中 5. 
写一段命令来从 `/sys` 中获取笔记本的电量信息,或者台式机 CPU 的温度。注意:macOS 并没有 sysfs,所以 Mac 用户可以跳过这一题。 From dbde4bbb0d83e4ada0128465760198315c77b3b4 Mon Sep 17 00:00:00 2001 From: Gaffey <253896514@qq.com> Date: Wed, 6 Jan 2021 15:20:07 +0800 Subject: [PATCH 493/640] =?UTF-8?q?=E6=96=87=E5=AD=97=E5=8B=98=E8=AF=AF?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/shell-tools.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/_2020/shell-tools.md b/_2020/shell-tools.md index e12b8162..feb7e5dd 100644 --- a/_2020/shell-tools.md +++ b/_2020/shell-tools.md @@ -46,12 +46,12 @@ mcd () { 这里 `$1` 是脚本的第一个参数。与其他脚本语言不同的是,bash使用了很多特殊的变量来表示参数、错误代码和相关变量。下面是列举来其中一些变量,更完整的列表可以参考 [这里](https://www.tldp.org/LDP/abs/html/special-chars.html)。 - `$0` - 脚本名 -- `$1` 到 `$9` - 脚本到参数。 `$1` 是第一个参数,依此类推。 +- `$1` 到 `$9` - 脚本的参数。 `$1` 是第一个参数,依此类推。 - `$@` - 所有参数 - `$#` - 参数个数 -- `$?` - 前一个命令到返回值 -- `$$` - 当前脚本到进程识别码 -- `!!` - 完整到上一条命令,包括参数。常见应用:当你因为权限不足执行命令失败时,可以使用 `sudo !!`再尝试一次。 +- `$?` - 前一个命令的返回值 +- `$$` - 当前脚本的进程识别码 +- `!!` - 完整的上一条命令,包括参数。常见应用:当你因为权限不足执行命令失败时,可以使用 `sudo !!`再尝试一次。 - `$_` - 上一条命令的最后一个参数。如果你正在使用的是交互式shell,你可以通过按下 `Esc` 之后键入 . 来获取这个值。 命令通常使用 `STDOUT`来返回输出值,使用`STDERR` 来返回错误及错误码,便于脚本以更加友好到方式报告错误。 @@ -151,7 +151,7 @@ for arg in reversed(sys.argv[1:]): shell知道去用python解释器而不是shell命令来运行这段脚本,是因为脚本的开头第一行的[shebang](https://en.wikipedia.org/wiki/Shebang_(Unix))。 -在shebang行中使用 [`env`](http://man7.org/linux/man-pages/man1/env.1.html) 命令是一种好的实践,它会利用环境变量中的程序来解析该脚本,这样就提高来您的脚本的可移植性。`env` 会利用我们第一节讲座中介绍过的`PATH` 环境变量来进行定位。 +在 `shebang` 行中使用 [`env`](http://man7.org/linux/man-pages/man1/env.1.html) 命令是一种好的实践,它会利用环境变量中的程序来解析该脚本,这样就提高来您的脚本的可移植性。`env` 会利用我们第一节讲座中介绍过的`PATH` 环境变量来进行定位。 例如,使用了`env`的shebang看上去时这样的`#!/usr/bin/env python`。 @@ -168,7 +168,7 @@ shell函数和脚本有如下一些不同点: ## 查看命令如何使用 -看到这里,您可能会有疑问,我们应该如何为特定的命令找到合适的标记呢?例如 `ls -l`, `mv -i` 和 `mkdir -p`。更一般的庆幸是,给您一个命令行,您应该怎样了解如何使用这个命令行并找出它的不同的选项呢? 
+看到这里,您可能会有疑问,我们应该如何为特定的命令找到合适的标记呢?例如 `ls -l`, `mv -i` 和 `mkdir -p`。更普遍的是,给您一个命令行,您应该怎样了解如何使用这个命令行并找出它的不同的选项呢? 一般来说,您可能会先去网上搜索答案,但是,UNIX 可比 StackOverflow 出现的早,因此我们的系统里其实早就包含了可以获取相关信息的方法。 在上一节中我们介绍过,最常用的方法是为对应的命令行添加`-h` 或 `--help` 标记。另外一个更详细的方法则是使用`man` 命令。[`man`](http://man7.org/linux/man-pages/man1/man.1.html) 命令是手册(manual)的缩写,它提供了命令的用户手册。 @@ -213,7 +213,7 @@ find . -name '*.png' -exec convert {} {}.jpg \; 例如, [`fd`](https://github.com/sharkdp/fd) 就是一个更简单、更快速、更友好的程序,它可以用来作为`find`的替代品。它有很多不错的默认设置,例如输出着色、默认支持正则匹配、支持unicode并且我认为它的语法更符合直觉。以模式`PATTERN` 搜索的语法是 `fd PATTERN`。 -大多数人都认为 `find` 和 `fd` 已经很好用了,但是有的人可能向知道,我们是不可以可以有更高效的方法,例如不要每次都搜索文件而是通过编译索引或建立数据库的方式来实现更加快速地搜索。 +大多数人都认为 `find` 和 `fd` 已经很好用了,但是有的人可能想知道,我们是不是可以有更高效的方法,例如不要每次都搜索文件而是通过编译索引或建立数据库的方式来实现更加快速地搜索。 这就要靠 [`locate`](http://man7.org/linux/man-pages/man1/locate.1.html) 了。 `locate` 使用一个由 [`updatedb`](http://man7.org/linux/man-pages/man1/updatedb.1.html)负责更新的数据库,在大多数系统中 `updatedb` 都会通过 [`cron`](http://man7.org/linux/man-pages/man8/cron.8.html)每日更新。这便需要我们在速度和时效性之间作出权衡。而且,`find` 和类似的工具可以通过别的属性比如文件大小、修改时间或是权限来查找文件,`locate`则只能通过文件名。 [here](https://unix.stackexchange.com/questions/60205/locate-vs-find-usage-pros-and-cons-of-each-other)有一个更详细的对比。 @@ -261,7 +261,7 @@ rg --stats PATTERN `Ctrl+R` 可以配合 [fzf](https://github.com/junegunn/fzf/wiki/Configuring-shell-key-bindings#ctrl-r) 使用。`fzf` 是一个通用对模糊查找工具,它可以和很多命令一起使用。这里我们可以对历史命令进行模糊查找并将结果以赏心悦目的格式输出。 另外一个和历史命令相关的技巧我喜欢称之为**基于历史的自动补全**。 -这一特性最初是由 [fish](https://fishshell.com/) shell 创建的,它可以根据您最近使用过的开头相同的命令,动态地对当前对shell命令进行补全。这一功能在 [zsh](https://github.com/zsh-users/zsh-autosuggestions) 中也可以使用,它可以极大对提高用户体验。 +这一特性最初是由 [fish](https://fishshell.com/) shell 创建的,它可以根据您最近使用过的开头相同的命令,动态地对当前对shell命令进行补全。这一功能在 [zsh](https://github.com/zsh-users/zsh-autosuggestions) 中也可以使用,它可以极大的提高用户体验。 最后,有一点值得注意,输入命令时,如果您在命令的开头加上一个空格,它就不会被加进shell记录中。当你输入包含密码或是其他敏感信息的命令时会用到这一特性。如果你不小心忘了在前面加空格,可以通过编辑。`bash_history`或 `.zhistory` 来手动地从历史记录中移除那一项。 @@ -278,7 +278,7 @@ Fasd 基于 
[_frecency_](https://developer.mozilla.org/en/The_Places_frecency_al 最直接对用法是自动跳转 (_autojump_),对于经常访问的目录,在目录名子串前加入一个命令 `z` 就可以快速切换命令到该目录。例如, 如果您经常访问`/home/user/files/cool_project` 目录,那么可以直接使用 `z cool` 跳转到该目录。 -还有一些更复杂的工具可以用来概览目录结构,例如 [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) 或更加完整对文件管理器,例如 [`nnn`](https://github.com/jarun/nnn) 或 [`ranger`](https://github.com/ranger/ranger)。 +还有一些更复杂的工具可以用来概览目录结构,例如 [`tree`](https://linux.die.net/man/1/tree), [`broot`](https://github.com/Canop/broot) 或更加完整的文件管理器,例如 [`nnn`](https://github.com/jarun/nnn) 或 [`ranger`](https://github.com/ranger/ranger)。 # 课后练习 From 4513938c6c0cece5a5ddfc858c9eda4e2a2c80a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kai=20Tang=20=28=E5=94=90=E6=81=BA=EF=BC=89?= Date: Thu, 7 Jan 2021 10:29:03 -0500 Subject: [PATCH 494/640] =?UTF-8?q?=E4=BF=AE=E6=94=B9=E8=B7=9D=E7=A6=BB?= =?UTF-8?q?=E4=B8=BA=E4=B8=BE=E4=BE=8B?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/command-line.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/command-line.md b/_2020/command-line.md index 99507c7c..f4d82c57 100644 --- a/_2020/command-line.md +++ b/_2020/command-line.md @@ -122,7 +122,7 @@ $ jobs # 终端多路复用 -当您在使用命令行接口时,您通常会希望同时执行多个任务。距离来说,您可以想要同时运行您的编辑器,并在终端的另外一侧执行程序。尽管再打开一个新的终端窗口也能达到目的,使用终端多路复用器则是一种更好的办法。 +当您在使用命令行接口时,您通常会希望同时执行多个任务。举例来说,您可以想要同时运行您的编辑器,并在终端的另外一侧执行程序。尽管再打开一个新的终端窗口也能达到目的,使用终端多路复用器则是一种更好的办法。 像 [`tmux`](http://man7.org/linux/man-pages/man1/tmux.1.html) 这类的终端多路复用器可以允许我们基于面板和标签分割出多个终端窗口,这样您便可以同时与多个 shell 会话进行交互。 From b4c75242b3fc06193250aa196bb8027799803d4c Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Fri, 8 Jan 2021 13:58:03 +0800 Subject: [PATCH 495/640] Update README.md --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 60a186ac..ff8942fb 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,9 @@ -# The Missing Semester of Your CS Education +# 
The Missing Semester of Your CS Education [![gitlocalized ](https://gitlocalize.com/repo/5643/zh/badge.svg)](https://gitlocalize.com/repo/5643/zh?utm_source=badge) Website for the [The Missing Semester of Your CS Education](https://missing.csail.mit.edu/) class! [中文站点](https://missing-semester-cn.github.io) + Contributions are most welcome! If you have edits or new content to add, please open an issue or submit a pull request. From 597c30141083f51ea8f3cea6b1a544ef28a7de36 Mon Sep 17 00:00:00 2001 From: Lingfeng_Ai Date: Fri, 8 Jan 2021 14:00:11 +0800 Subject: [PATCH 496/640] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index ff8942fb..fd7e8e05 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# The Missing Semester of Your CS Education [![gitlocalized ](https://gitlocalize.com/repo/5643/zh/badge.svg)](https://gitlocalize.com/repo/5643/zh?utm_source=badge) +# The Missing Semester of Your CS Education Website for the [The Missing Semester of Your CS Education](https://missing.csail.mit.edu/) class! 
[中文站点](https://missing-semester-cn.github.io) From 8e3b812b37deb49407c2e9efad6e35b8d56849f2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kai=20Tang=20=28=E5=94=90=E6=81=BA=EF=BC=89?= Date: Fri, 8 Jan 2021 10:24:26 -0500 Subject: [PATCH 497/640] =?UTF-8?q?fix=20typo:=20=E4=BC=BC=E4=B9=8E=3D>?= =?UTF-8?q?=E6=97=B6=E5=80=99?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index 4585d491..e02216c6 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -177,7 +177,7 @@ bar *= 0.2 time.sleep(60) print(baz) ``` -静态分析工具可以发现此类的问题。当我们使用[`pyflakes`](https://pypi.org/project/pyflakes) 分析代码的似乎,我们会得到与这两处 bug 相关的错误信息。[`mypy`](http://mypy-lang.org/) 则是另外一个工具,它可以对代码进行类型检查。这里,`mypy` 会经过我们`bar` 起初是一个 `int` ,然后变成了 `float`。这些问题都可以在不允许代码的情况下被发现。 +静态分析工具可以发现此类的问题。当我们使用[`pyflakes`](https://pypi.org/project/pyflakes) 分析代码的时候,我们会得到与这两处 bug 相关的错误信息。[`mypy`](http://mypy-lang.org/) 则是另外一个工具,它可以对代码进行类型检查。这里,`mypy` 会经过我们`bar` 起初是一个 `int` ,然后变成了 `float`。这些问题都可以在不允许代码的情况下被发现。 在 shell 工具那一节课的时候,我们介绍了 [`shellcheck`](https://www.shellcheck.net/),这是一个类似的工具,但它是应用于 shell 脚本的。 From c0016267eae8d8ace83f3a214efc175b6d2938cc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Kai=20Tang=20=28=E5=94=90=E6=81=BA=EF=BC=89?= Date: Fri, 8 Jan 2021 10:30:32 -0500 Subject: [PATCH 498/640] Update debugging-profiling.md --- _2020/debugging-profiling.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_2020/debugging-profiling.md b/_2020/debugging-profiling.md index e02216c6..7965a3a3 100644 --- a/_2020/debugging-profiling.md +++ b/_2020/debugging-profiling.md @@ -177,7 +177,7 @@ bar *= 0.2 time.sleep(60) print(baz) ``` -静态分析工具可以发现此类的问题。当我们使用[`pyflakes`](https://pypi.org/project/pyflakes) 分析代码的时候,我们会得到与这两处 bug 相关的错误信息。[`mypy`](http://mypy-lang.org/) 则是另外一个工具,它可以对代码进行类型检查。这里,`mypy` 会经过我们`bar` 
起初是一个 `int` ,然后变成了 `float`。这些问题都可以在不允许代码的情况下被发现。 +静态分析工具可以发现此类的问题。当我们使用[`pyflakes`](https://pypi.org/project/pyflakes) 分析代码的时候,我们会得到与这两处 bug 相关的错误信息。[`mypy`](http://mypy-lang.org/) 则是另外一个工具,它可以对代码进行类型检查。这里,`mypy` 会经过我们`bar` 起初是一个 `int` ,然后变成了 `float`。这些问题都可以在不运行代码的情况下被发现。 在 shell 工具那一节课的时候,我们介绍了 [`shellcheck`](https://www.shellcheck.net/),这是一个类似的工具,但它是应用于 shell 脚本的。 From 49767ba9312e6ca38a555118283e3b8196ddf255 Mon Sep 17 00:00:00 2001 From: Vifly <48406926+vifly@users.noreply.github.com> Date: Mon, 11 Jan 2021 22:37:35 +0800 Subject: [PATCH 499/640] typo fix --- about.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/about.md b/about.md index 3db2e92c..736f5406 100644 --- a/about.md +++ b/about.md @@ -17,9 +17,9 @@ title: "开设此课程的动机" # The missing semester of your CS education -为了解决这个问题,我们开启了一个课程,涵盖各项对成为高效率计算机科学家或程序员至关重要的 +为了解决这个问题,我们开设了一个课程,涵盖各项对成为高效率计算机科学家或程序员至关重要的 主题。这个课程实用且具有很强的实践性,提供了各种能够立即广泛应用解决问题的趁手工具指导。 -该课在 2020 年 1 月”独立活动期“开设,为期一个月,是学生开办的短期课程。虽然该课程针对 +该课在 2020 年 1 月“独立活动期”开设,为期一个月,是学生开办的短期课程。虽然该课程针对 麻省理工学院,但我们公开提供了全部课程的录制视频与相关资料。 如果该课程适合你,那么以下还有一些具体的课程示例: @@ -39,7 +39,6 @@ title: "开设此课程的动机" ## 版本控制 -如何**正确地**使用版本控制,利用它避免尴尬的情况发生,与他人协作,并且能够快速定位 如何**正确地**使用版本控制,利用它避免尴尬的情况发生。与他人协作,并且能够快速定位 有问题的提交 不再大量注释代码。不再为解决 bug 而找遍所有代码。不再“我去,刚才是删了有用的代码?!”。 @@ -64,11 +63,11 @@ Vim 的宏是它最好的特性之一,在下面这个示例中,我们使用 ## 远程服务器 -使用 SSH 密钥在远程机器下工作如何保持清醒,并且终端能够复用。不再为了仅执行个别命令 -总是打开许多命令终端。不再每次连接都总输入密码。不再因为网络断开或必须重启笔记本时 +使用 SSH 密钥连接远程机器进行工作时如何保持连接,并且让终端能够复用。不再为了仅执行个别命令 +总是打开许多命令行终端。不再每次连接都总输入密码。不再因为网络断开或必须重启笔记本时 就丢失全部上下文。 -以下示例,我们使用`tmux`来保持会话在远程服务器活跃,并使用`mosh`来支持网络漫游和断开连接。 +以下示例,我们使用`tmux`来保持远程服务器的会话存在,并使用`mosh`来支持网络漫游和断开连接。