New Python beginner lessons #132

gvwilson · 2013-11-06T16:11:50Z

Lay out directory structure for new novice/intermediate lessons (see #121, Construct novice shell lesson, through #130, Construct intermediate SQL lesson).
Add notebooks for novice introduction to Python, plus utilities, data files, and one image.
Modify Makefile to delete .pyc files when doing make clean.
Add a NEW_MATERIAL.md file that will eventually become the new README.md.

ahmadia · 2013-11-06T16:19:22Z

+1 on using a new development branch here (master) so that we can land our current pull requests into gh-pages while allowing straightforward work here.

wking · 2013-11-06T16:26:40Z

On Wed, Nov 06, 2013 at 08:11:51AM -0800, Greg Wilson wrote:

-- File Changes --

…
A python/novice/inflammation-01.csv (60)
A python/novice/inflammation-02.csv (60)
…
A python/novice/util/gen-inflammation.py (19)

You add gen-inflammation.py and inflammation-01.csv in 282934c
(Starting beginner's lessons on Python, 2013-11-03). More
inflammation-*.csv files enter in 459c91a (Second beginner's lesson on
Python, 2013-11-03). I'd prefer if the auto-generated
inflammation-*.csv files were not included in the development branch
at all, and were instead generated along with other pre-Jekyll
transitions (#92) like RMarkdown → Markdown (#119) and IPyNb splitting
(#119 (comment))
during the “build a per-boot branch for downstream consumption” step.

wking · 2013-11-06T16:30:00Z

On Wed, Nov 06, 2013 at 08:19:23AM -0800, Aron Ahmadia wrote:

+1 on using a new development branch here (master)

Also +1. And +1 for @gvwilson using a feature branch in his own
repository :).

On Wed, Nov 06, 2013 at 08:26:21AM -0800, W. Trevor King wrote:

transitions (#92) like RMarkdown → Markdown (#119) and IPyNb splitting

Oops, I flipped my references. Should be #119 and then #92.

wking · 2013-11-06T16:34:21Z

On Wed, Nov 06, 2013 at 08:29:52AM -0800, W. Trevor King wrote:

On Wed, Nov 06, 2013 at 08:19:23AM -0800, Aron Ahmadia wrote:

+1 on using a new development branch here (master)

Also +1. And +1 for @gvwilson using a feature branch in his own
repository :).

Although the branching-off point for bc/master from bc/gh-pages seems
somewhat arbitrary. I'd suggest either an orphan branch or the current
bc/gh-pages tip for these stand-alone new-format lessons.

gvwilson · 2013-11-06T16:35:03Z

I've checked in the files because they're cat'd in notebooks, and I want
to ensure consistency at the point of checkout. I also tried to sync
addition of the .csv with commit of the corresponding notebook.

ahmadia · 2013-11-06T16:37:59Z

Although the branching-off point for bc/master from bc/gh-pages seems
somewhat arbitrary.

@wking I think the plan is to back-tag that commit as a pre-release, land the current PRs, then move development over to this branch. It happens to be the place where @gvwilson started working on the new reorganization, from what was then the tip of bc/gh-pages.

ahmadia · 2013-11-06T16:50:02Z

I don't have strong opinions on the generated CSV files, since they are so tiny. I think they do fall under the "we should eventually generate these instead of committing them" category, but we don't have that flow properly set up, so I'm +1 on leaving them as-is for now.

wking · 2013-11-06T16:51:35Z

On Wed, Nov 06, 2013 at 08:35:05AM -0800, Greg Wilson wrote:

I've checked in the files because they're cat'd in notebooks, and I
want to ensure consistency at the point of checkout.

Consistency as in “identical ‘random’ data” should be possible by
setting the seed explicitly in a Makefile rule building the files.

Consistency as in “ready for Jekyll and per-boot-camp branches” is not
possible as pointed out by #119 and #92.

For previewing the content in this PR, I understand that you want the
auto-generated CSV files around, but I don't think they belong in
the final development branch.

I also tried to sync addition of the .csv with commit of the
corresponding notebook.

If you're generating them with Makefile rules, you can put the new
rules with the commit of the corresponding notebook. For example,
459c91a (Second beginner's lesson on Python, 2013-11-03) could add
something like (untested):

PYTHON = python2.7
-IMMUNIZATION_DATA_INDEXES = 01
+IMMUNIZATION_DATA_INDEXES = $(shell seq -w 12)
IMMUNIZATION_DATA = $(patsubst %,python/novice/inflammation-%.csv,$(IMMUNIZATION_DATA_INDEXES))

python/novice/inflammation-%.csv: python/novice/util/gen-inflammation.py
	$(PYTHON) "$<" > "$@"
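
The point above about getting identical "random" data from an explicit seed can be illustrated with a minimal sketch. The real python/novice/util/gen-inflammation.py is not shown in this thread, so everything below is hypothetical: the function names, the table shape, the value range, and the seed. It only demonstrates that a fixed seed makes generated data reproducible.

```python
import random


def generate_inflammation(seed, patients=3, days=5):
    """Return a patients x days table of pseudo-random integer readings.

    Hypothetical stand-in for gen-inflammation.py, which this thread
    does not show.
    """
    rng = random.Random(seed)  # private generator, seeded explicitly
    return [[rng.randint(0, 20) for _ in range(days)]
            for _ in range(patients)]


def to_csv(table):
    """Format the table as comma-separated lines, one row per patient."""
    return "\n".join(",".join(str(value) for value in row) for row in table)


first = to_csv(generate_inflammation(seed=1))
second = to_csv(generate_inflammation(seed=1))
print(first == second)  # same seed, same data
```

A Makefile rule could then pass a seed on the command line (for example, one seed per file index), so that a fresh checkout regenerates the same files instead of storing them.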

wking · 2013-11-06T16:57:12Z

On Wed, Nov 06, 2013 at 08:50:04AM -0800, Aron Ahmadia wrote:

I don't have strong opinions on the generated CSV files, since they
are so tiny. I think they do fall under the "we should eventually
generate these instead of committing them" category, but we don't
have that flow properly set up, so I'm +1 on leaving them as-is for
now.

Ok, I'm just trying to:

1. set a good precedent for future auto-generated content commits, and
2. get us to address this before adopting the new restructuring, to
   avoid another restructuring after we do decide to tackle #119 (Find
   and/or build tools to help manage lesson material).

ahmadia · 2013-11-06T17:02:47Z

@wking - I think another restructuring after our current round of restructuring appears to be inevitable :)

Your points are absolutely valid, and I really appreciate your close eye on what's entering the repository, because even a 50 KB generated file would be a bad idea in this context.

I'd love to have a flow in place that includes a content generation stage, but I don't think we'll be able to discuss that seriously until January. Until then, I propose we keep any large generated content out of the repositories, and work with the R and IPython Notebook files in an effort to get those ready for generating as well.

I agree that it's a compromise, but as @gvwilson says, let's focus on getting the content in first, and we can defer cleaning up while we're still sorting out our development strategy.

gvwilson · 2013-11-06T18:03:08Z

Comments on python/novice/01-numpy.ipynb sent by @jdblischak by email before this PR landed:

NumPy is automatically loaded by Canopy
Like the explicit explanation of dot notation
typos: numpy.loaded --> numpy.loadtxt
I like that you always use print. At our past boot camp, the students were really confused by the fact that the last line of a cell would print automatically (the question was asked multiple times).
hyperlinks for functions do not work
You don't explain that ":" includes everything when slicing. You should introduce that concept before using it to extract data. Show them that data[:4, 10:] is the same as data[0:4, 10:40]
Should explicitly state that axis=0 corresponds to columns and axis=1 corresponds to rows. And maybe give some intuition for this. I would have thought 0 would be rows and 1 columns since the shape of an array is always listed rows and then columns. I miss R already...
%matplotlib inline does not work on Windows: "ERROR: Line magic function %matplotlib not found". But the figures still appear inline.
"Why do all of our plots stop just short of the upper end of our graph?" Don't know. Are you expecting students to search the internet or am I missing something obvious?
If you refer to a line number in a code cell, you should tell them how to show line numbers (Ctrl-m l)
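
The slicing and axis comments above can be checked with a short sketch. The array here is a made-up stand-in for the lesson's data; the 60 by 40 shape is an assumption inferred from the slice bounds quoted above, not something this thread confirms.

```python
import numpy

# Hypothetical stand-in for the lesson's data: 60 rows by 40 columns.
data = numpy.arange(60 * 40).reshape(60, 40)

# An omitted bound means "from the start" or "to the end".
print(numpy.array_equal(data[:4, 10:], data[0:4, 10:40]))

# axis names the dimension that is collapsed:
# axis=0 collapses the rows, leaving one value per column;
# axis=1 collapses the columns, leaving one value per row.
print(data.mean(axis=0).shape)
print(data.mean(axis=1).shape)
```

Reading axis as "the dimension that disappears" may give the intuition asked for: the shape is still listed as rows, then columns, and the axis argument picks which of those two entries is removed from the result.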

gvwilson · 2013-11-06T18:04:49Z

Comments on python/novice/02-func.ipynb by @jdblischak sent by email before this PR landed:

I did not like the normalize function example for multiple reasons:

To test the function requires that the students remember the numpy.arange function or at minimum remember that they had learned it before and look it up in the last lesson. In my frustrating experience, I have had to help many students that just stare at the screen even though the code they need to get started could easily be copy-pasted from the current or previous lesson. I'd suggest reminding them how to create a series of integers so that they can focus on the new task.
Completing this task not only requires utilizing the new Python syntax that they just learned, but also some mathematical reasoning. Students that show up to a beginner's programming workshop as graduate students or postdocs are unlikely to be confident in their math skills. I see this exercise getting bogged down more by the math than the programming. It is similar to using the modulo to find out if a number is even or odd. Even when we tried explicitly explaining that the modulo returns a remainder and thus an even number will have remainder zero, there was still a significant number of students that could not get the right answer.
While the first exercise with the normalize function did not take me long to complete, I can't say the same for the second challenge. It took me a few minutes to figure out the math, so I can only imagine how long this would take for our students to complete. I'd prefer that we not use precious boot camp time testing the students' math comprehension skills.
I don't like the name normalize because that term is overloaded. In traditional statistics, it refers to transforming data to a normal distribution. In genomics and other fields, it can be used to refer to transformation of data to any other distribution. It can also refer to rescaling, which is what your example is. Since these lessons will be used for various audiences, how about using the name rescale instead?
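
As a sketch of what a function named rescale might look like: the notebook's actual normalize exercise is not shown in this thread, so the min-max formula below is an assumption about what "rescaling" means here. The built-in range stands in for numpy.arange so that the example needs nothing from an earlier lesson.

```python
def rescale(values):
    """Linearly map values so the smallest becomes 0.0 and the largest 1.0."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot rescale values that are all equal")
    return [(value - low) / (high - low) for value in values]


print(rescale(range(5)))
```

Handing students the line that builds the test input, as suggested above, would leave only the function body as the new task.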

gvwilson · 2013-11-06T18:06:09Z

Comment on python/novice/03-loop.ipynb sent by @jdblischak before this PR landed:

How did you imagine the students solving the function to reverse a string? I came up with two solutions, but I think both are somewhat advanced. This is the first for loop they are going to have ever written. My first requires them to remember how to index from the back of a list, initialize a string and an integer variable, and to update both of those variables during each iteration of the loop. The second one I doubt the students would ever come up with since the lesson is about loops and you only briefly introduced specifying a step in a slice in the first lesson. Perhaps you could have an exercise before this one that is super simple. One where they can struggle to remember to put a colon and indent the body of the loop. Then once they have gained some confidence and familiarity with the for loop, they could move on to this exercise.

def rev(s):
    x = -1
    new_s = ''
    for character in s:
        new_s = new_s + s[x]
        x = x - 1
    print new_s

def rev(s):
    print s[-1::-1]

And I am stumped on the second one. To solve this problem I would either use the range function in conjunction with a for loop or use a while loop. Since you have not introduced the range function or while loops, how did you envision them solving this? My solutions are below:

def expo(x, n):
    answer = 1
    for i in range(n):
        answer = answer * x
    return answer

def expo(x, n):
    answer = 1
    counter = 0
    while counter < n:
        answer = answer * x
        counter = counter + 1
    return answer
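
On the string-reversal question earlier in this comment, one possible answer is sketched here. It is offered as a guess, not as the solution the lesson intends: build the result by putting each character in front of what has been collected so far. This needs only a for loop and string concatenation, with no indexing and no counter. The name rev follows the solutions above; print is written as a function call so the sketch also runs on Python 3.

```python
def rev(s):
    new_s = ''
    for character in s:
        # Prepending each character reverses the order as the loop advances.
        new_s = character + new_s
    return new_s


print(rev('Newton'))
```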

gvwilson · 2013-11-06T18:07:15Z

Comment sent by @wking before this PR landed:

genfromtxt is much nicer than loadtxt. My favorite genfromtxt feature is its ability to read column names from a header line, which means you can avoid problems due to column ordering inconsistencies between the generator and consumer. When you don't need its fanciness, genfromtxt is basically a drop-in loadtxt replacement, so I'd recommend it for
starting students off.
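
A small sketch of the header-reading feature described above. The column names and values are invented for illustration; the lesson's inflammation files may not have a header line at all.

```python
import io

import numpy

# Hypothetical data with a header line; the column names are made up.
csv_text = "patient,day1,day2\n1,0.0,1.0\n2,0.0,2.0\n"

# names=True tells genfromtxt to read field names from the first line.
table = numpy.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

print(table.dtype.names)
# Columns are selected by name, so their order in the file does not matter.
print(table["day2"].max())
```

The result is a structured array, not a plain 2-D array, so the slicing syntax taught in the first notebook would not carry over unchanged. That trade-off may matter for novices.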

gvwilson · 2013-11-06T18:07:59Z

Comment by @wking sent before this PR landed:

There's some indentation trouble around your second challenge, where you also introduce tuple assignment:

first, second = 'Grace', 'Hopper'

without having covered it in the text. Maybe the goal of the challenge is to have them try that for themselves, and you step in and explain it afterward?
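
If the lesson text were to cover tuple assignment before the challenge, a short illustration along these lines might be enough. The swap is an added example, not something taken from the notebook.

```python
# The right-hand side is packed into a tuple, then unpacked name by name.
first, second = 'Grace', 'Hopper'
print(first, second)

# The same mechanism swaps two values without a temporary variable.
first, second = second, first
print(first, second)
```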

gvwilson · 2013-11-06T18:08:26Z

Comment sent by @jiffyclub before this PR landed:

The first challenge set in 01-numpy.ipynb seems utterly unrelated to the preceding material.

wking · 2013-11-06T18:34:31Z

On Wed, Nov 06, 2013 at 08:11:51AM -0800, Greg Wilson wrote:

Lay out directory structure for new novice/intermediate lessons (see #121, Construct novice shell lesson, through #130, Construct intermediate SQL lesson).

Add notebooks for novice introduction to Python, plus utilities, data files, and one image.

Are we floating this as an example to decide how the new restructured
content will work (#118, #119, #120), or are we assuming that the
existing Python content (6e7b321, #24, #27, #28, #30, #43, #57, #60,
#62, #77, #85, #86, #104, +swcarpentry/boot-camps and
swcarpentry/website PRs) is not cut out for the new beginner lessons
(#123) and that we want to start over from scratch? I think detailed
comments about the content of this branch distract from the former
goal, but maybe we've already put the nail in the coffin of our
existing IPyNb content for the novice-Python lessons?

gvwilson · 2013-11-06T18:51:58Z

I hope most of the existing content under lessons can be recycled for
intermediates (though that's @ethanwhite's call). Our existing material
is clearly not suitable for complete beginners (cc @jdblischak and
others); this stuff has been field-tested, and seems to work much better
for people who've never programmed.

wking · 2013-11-06T19:19:13Z

On Wed, Nov 06, 2013 at 10:52:00AM -0800, Greg Wilson wrote:

I hope most of the existing content under lessons can be recycled for
intermediates (though that's @ethanwhite's call). Our existing material
is clearly not suitable for complete beginners (cc @jdblischak and
others);

Agreed, just making sure we were all on the same page.

this stuff has been field-tested, and seems to work much better for
people who've never programmed.

This stuff as in “PR #132”? And “field-tested” in which boot camps?
I don't see “inflammation” in any pre-#118 commits for the boot camps
I have tagged [1] (pointers to missing repositories welcome). I
certainly think #132 reads better for novice programmers than our
existing stuff. I'd just like to have a better feeling for where this
stuff came from and what the earlier trials looked like. Feedback
from previous field-testing would help resolve questions like
@jdblischak's confusion over student solutions to the
string-manipulation exercises [2].

gvwilson · 2013-11-06T19:47:01Z

And “field-tested” in which boot camps?

Most recently Greenwich (worked very well); before that, here in Toronto.

wking · 2013-11-06T20:39:23Z

On Wed, Nov 06, 2013 at 11:47:03AM -0800, Greg Wilson wrote:

And “field-tested” in which boot camps?
Most recently Greenwich (worked very well); before that, here in Toronto.

Thanks. I've tagged 2013-10-greenwich
(https://github.com/swcarpentry/2013-10-24-greenwich) but I'm having
trouble finding the Toronto repository. It looks like Greenwich has
the sample inflammation data, but you live-coded the notebooks without
instructor notes?

gvwilson · 2013-11-11T10:48:39Z

On 2013-11-10 9:19 PM, Aron Ahmadia wrote:

@gvwilson https://github.com/gvwilson - Does tomorrow still count as
weekend? I don't think I'm going to be able to get to this one tonight :(

No worries - I distracted you with my branching mistakes.

DamienIrving · 2013-11-13T03:51:05Z

Just a couple of comments on the testing content in 05-qa.ipynb.

In the "limits to testing" section, is_all_bases seems like an odd choice for a function that is supposed to check whether a character string contains only the letters A, C, G, and T. Would check_ACGT be a better choice?
At the beginning of the unit testing section, you explain what makes a good unit testing tool (must be easy to add or change tests, understand the previous tests, etc). Since the audience are unlikely to ever have to design their own unit testing library like unittest or nose, I'm wondering whether this discussion is relevant? It might be better to simply remove it and begin the unit testing section with the following paragraph ("The simplest kind of test...")
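
For readers without the notebook at hand, here is a sketch of what a function like is_all_bases might look like, along with the kind of simple test the comment refers to. The notebook's actual implementation is not shown in this thread, so the body below is an assumption; only the function name comes from the comment above.

```python
def is_all_bases(bases):
    """Return True if every character is one of A, C, G, or T."""
    return all(base in 'ACGT' for base in bases)


# A handful of hand-picked cases; passing them does not prove correctness.
assert is_all_bases('ACGT')
assert is_all_bases('GGGG')
assert not is_all_bases('ACGX')

# An edge case that a small test suite can miss: the empty string
# passes vacuously, which may or may not be the intended behaviour.
print(is_all_bases(''))
```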

jdblischak · 2013-11-14T16:36:00Z

python/novice/05-qa.ipynb

+      "and most importantly,\n",
+      "functions.\n",
+      "What they haven't done is show us how to tell if a program is getting the right answer.\n",
+      "If each line we right has a 99% chance of being right,\n",


Is 99% an empirically derived estimate, or is this simply a thought experiment to justify testing?
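
Read as a thought experiment, the arithmetic behind a figure like this is easy to sketch. The quoted line is cut off, so how the notebook continues is not known here; the independence assumption and the line counts below are illustrative choices.

```python
def chance_all_lines_right(lines, per_line=0.99):
    """Probability that every line is right, assuming independent errors."""
    return per_line ** lines


for lines in (10, 100, 500):
    print(lines, round(chance_all_lines_right(lines), 3))
```

Under these assumptions a 100-line program comes out fully correct only a little over a third of the time, which is the kind of result that motivates testing whatever the true per-line figure is.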

jdblischak · 2013-11-14T20:39:09Z

I really like the last lesson on taking command line arguments! I wish I had come across a similar lesson when I was first learning Python. I think novices will really benefit from this material.

ethanwhite · 2013-11-14T20:52:12Z

I really like the last lesson on taking command line arguments! I wish I had come across a similar lesson when I was first learning Python. I think novices will really benefit from this material.

I really like the command line lesson as well, but it's actually material that I think of as being more intermediate. @gvwilson - you've gotten all the way through the command line material with complete beginners?

gvwilson · 2013-11-14T22:08:05Z

On 2013-11-14 3:52 PM, Ethan White wrote:

I really like the command line lesson as well, but it's actually
material that I think of as being more intermediate. @gvwilson
https://github.com/gvwilson - you've gotten all the way through the
command line material with complete beginners?

About one time in three, and only after they had seen the shell. I
would drop something else from Python in order to include this if
necessary: many people have said it's really important to show them that
Python isn't just a notebook thing.
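
To make "Python isn't just a notebook thing" concrete, a minimal command-line script might look like the sketch below. The lesson's own example is not shown in this thread; the file name mean.py and the function mean_of are hypothetical.

```python
import sys


def mean_of(args):
    """Return the mean of the numbers given as command-line strings."""
    numbers = [float(arg) for arg in args]
    return sum(numbers) / len(numbers)


if __name__ == "__main__":
    # sys.argv[0] is the script name; the rest are the user's arguments.
    try:
        print(mean_of(sys.argv[1:]))
    except (ValueError, ZeroDivisionError):
        # Non-numeric arguments, or none at all.
        print("usage: python mean.py NUMBER [NUMBER ...]")
```

Saved as mean.py, this would be run from the shell as `python mean.py 1 2 3`, which ties the Python material back to the shell lesson.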

ahmadia · 2013-11-14T22:40:59Z

many people have said it's really important to show them that Python isn't just a notebook thing.

+1


ahmadia · 2013-11-21T15:45:48Z

@gvwilson - This PR has gotten too big for me to casually review. I'd suggest you delete behind you the Python lesson material you've used, and merge this when you're ready.

New Python beginner lessons

gvwilson added 8 commits November 2, 2013 07:42

Laying out new material

5819712

Starting beginner's lessons on Python

282934c

Second beginner's lesson on Python

459c91a

Third beginner's lesson on Python

1f1f276

Fourth beginner's lesson on Python

c9ad4c9

Renaming file to be consistent with hyphenation convention

c4cf8d9

Final two lessons on Python (still need challenges)

f6af1ee

Removing .pyc files when cleaning

afc1cf8

wking mentioned this pull request Nov 7, 2013

What structure and metadata should lessons have? #120

Closed

ethanwhite mentioned this pull request Nov 13, 2013

Construct intermediate Python lesson. #128

Closed

jdblischak reviewed Nov 14, 2013

wking mentioned this pull request Nov 15, 2013

Construct novice R lesson. #124

Closed

gvwilson added 5 commits November 17, 2013 13:56

Moving assertions and unit testing to intermediate; not sure what to …

258b0d0

…teach about defensive programming here

Material on QA from novice for recycling

9b607b7

Re-introducing unit testing

0fa32dd

Rewriting material on testing

eca0566

Defensive programming

beacc6a

gvwilson pushed a commit that referenced this pull request Nov 22, 2013

Merge pull request #132 from gvwilson/new-python-beginner-lessons

42b9b82

New Python beginner lessons

gvwilson merged commit 42b9b82 into swcarpentry:master Nov 22, 2013

ethanwhite mentioned this pull request Nov 22, 2013

What content should be covered in the structured programming sections of intermediate bootcamps? #152

Closed

gvwilson deleted the new-python-beginner-lessons branch November 26, 2013 16:24

dmj111 mentioned this pull request Dec 10, 2013

Ignore notes dmj111/bc#2

Closed
