Commit 397ffd1

Add alt text to brandon cluster
walshbr committed Sep 7, 2019
1 parent cc250ab commit 397ffd1
Showing 9 changed files with 79 additions and 73 deletions.
7 changes: 4 additions & 3 deletions en/lessons/data-mining-the-internet-archive.md
@@ -16,6 +16,7 @@ activity: acquiring
topics: [web-scraping]
abstract: "The collections of the Internet Archive include many digitized historical sources. Many contain rich bibliographic data in a format called MARC. In this lesson, you'll learn how to use Python to automate the downloading of large numbers of MARC files from the Internet Archive and the parsing of MARC records for specific information such as authors, places of publication, and dates. The lesson can be applied more generally to other Internet Archive files and to MARC records found elsewhere."
redirect_from: /lessons/data-mining-the-internet-archive
avatar_alt: Group of men working in a mine
---

{% include toc.html %}
@@ -336,7 +337,7 @@ for item in search:
    print item['identifier']
```

You should get the same results.
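As a hedged aside (not part of this commit), here is what that loop can look like with a current release of the `internetarchive` package, where the search call is named `search_items`; the `*_marc.xml` pattern below is a guess at how the collection names its MARC files.

```
import internetarchive

# iterate over every item in the Anti-Slavery Collection
search = internetarchive.search_items('collection:bplscas')

for result in search:
    item_id = result['identifier']
    print(item_id)
    # fetch the item and download only files matching the assumed MARC XML pattern
    item = internetarchive.get_item(item_id)
    item.download(glob_pattern='*_marc.xml')
```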

The second thing to note about the *for loop* is that the indented block
could have contained other commands. In this case, we printed each
@@ -547,10 +548,10 @@ def map_xml(function, *files):
"""
map a function onto the file, so that for each record that is
parsed the function will get called with the extracted record
def do_it(r):
print r
map_xml(do_it, 'marc.xml')
"""
```
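Following the usage sketched in this docstring, a minimal example of parsing downloaded records might look like the sketch below. The filename `allbibs.xml` is an assumption, and the choice of MARC field 260, subfield a (place of publication) simply echoes the kind of information the lesson's abstract mentions.

```
import pymarc

def get_place_of_pub(record):
    # MARC field 260, subfield a holds the place of publication, when present
    try:
        print(record['260']['a'])
    except (TypeError, KeyError, AttributeError):
        pass  # skip records without a 260$a

pymarc.map_xml(get_place_of_pub, 'allbibs.xml')
```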
1 change: 1 addition & 0 deletions en/lessons/data_wrangling_and_management_in_R.md
@@ -16,6 +16,7 @@ abstract: "This tutorial explores how scholars can organize 'tidy' data, underst
layout: lesson
review-ticket: https://github.com/programminghistorian/ph-submissions/issues/60
redirect_from: /lessons/data-wrangling-and-management-in-R
avatar_alt: Bar of soap
---

{% include toc.html %}

Large diffs are not rendered by default.

17 changes: 9 additions & 8 deletions en/lessons/downloading-multiple-records-using-query-strings.md
@@ -18,6 +18,7 @@ abstract: "Downloading a single record from a website is easy, but downloading m
previous: output-keywords-in-context-in-html-file
python_warning: true
redirect_from: /lessons/downloading-multiple-records-using-query-strings
avatar_alt: Figures working in a mine, pushing carts
---

{% include toc.html %}
@@ -130,7 +131,7 @@ Take a look at the URL produced with the last search results page. It
should look like this:

``` xml
https://www.oldbaileyonline.org/search.jsp?gen=1&form=searchHomePage&_divs_fulltext=mulatto*+negro*&kwparse=advanced&_divs_div0Type_div1Type=sessionsPaper_trialAccount&fromYear=1700&fromMonth=00&toYear=1750&toMonth=99&start=0&count=0
```

We had a look at URLs in [Viewing HTML Files][], but this looks a lot
@@ -150,7 +151,7 @@ https://www.oldbaileyonline.org/search.jsp
&toYear=1750
&toMonth=99
&start=0
&count=0
```

In this view, we see more clearly our 12 important pieces of information
@@ -161,7 +162,7 @@ it does not do anything.) and a series of 10 *name/value pairs* put
together with & characters. Together these 10 name/value pairs comprise
the query string, which tells the search engine what variables to use in
specific stages of the search. Notice that each name/value pair contains
both a variable name (toYear) and the value assigned to that variable (1750).
This works in exactly the same way as *Function Arguments* by
passing certain information to specific variables. In this case, the
most important variable is `_divs_fulltext=` which has been given the
@@ -243,7 +244,7 @@ page. We have already got the first one by using the
website:

``` xml
https://www.oldbaileyonline.org/search.jsp?gen=1&form=searchHomePage&_divs_fulltext=mulatto*+negro*&kwparse=advanced&_divs_div0Type_div1Type=sessionsPaper_trialAccount&fromYear=1700&fromMonth=00&toYear=1750&toMonth=99&start=0&count=0
```

We could type this URL out twice and alter the ‘*start*’ variable to get
@@ -463,7 +464,7 @@ def getSearchResults(query, kwparse, fromYear, fromMonth, toYear, toMonth, entri
    url += '&toMonth=' + toMonth
    url += '&start=' + str(startValue)
    url += '&count=0'

    #download the page and save the result.
    response = urllib2.urlopen(url)
    webContent = response.read()
@@ -567,7 +568,7 @@ def getSearchResults(query, kwparse, fromYear, fromMonth, toYear, toMonth, entri
    url += '&toMonth=' + toMonth
    url += '&start=' + str(startValue)
    url += '&count=0'

    #download the page and save the result.
    response = urllib2.urlopen(url)
    webContent = response.read()
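As a hedged aside (not part of the lesson's code), the hand-concatenated query string above can also be produced with `urllib.urlencode` in Python 2, which assembles the name/value pairs and handles escaping. Note that it percent-encodes characters such as `*`, so the result is equivalent rather than byte-identical to the URL built by concatenation; the values below are illustrative.

```
import urllib

params = {
    'gen': 1,
    'form': 'searchHomePage',
    '_divs_fulltext': 'mulatto* negro*',  # spaces become '+', '*' becomes '%2A'
    'kwparse': 'advanced',
    '_divs_div0Type_div1Type': 'sessionsPaper_trialAccount',
    'fromYear': 1700, 'fromMonth': '00',
    'toYear': 1750, 'toMonth': '99',
    'start': 0, 'count': 0,
}
url = 'https://www.oldbaileyonline.org/search.jsp?' + urllib.urlencode(params)
```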
@@ -711,7 +712,7 @@ the trials. The first entry starts with “Anne Smith” so you can use the
Notice Anne’s name is part of a link:

``` xml
browse.jsp?id=t17160113-18&div=t17160113-18&terms=mulatto*_negro*#highlight
```
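(A hedged aside, not part of the lesson's code: a query string like this can also be unpacked with the standard library rather than by eye. `urlparse` is the Python 2 module name; in Python 3 the same functions live in `urllib.parse`.)

```
from urlparse import urlparse, parse_qs

link = 'browse.jsp?id=t17160113-18&div=t17160113-18&terms=mulatto*_negro*#highlight'
query = urlparse(link).query        # everything between '?' and '#'
trial_id = parse_qs(query)['id'][0]
print(trial_id)                     # t17160113-18
```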

Perfect, the link contains the trial ID! Scroll through the remaining
@@ -1063,7 +1064,7 @@ the command output so we know which files failed to download. This
should be added as the last line in the function.

```
print "failed to download: " + str(failedAttempts)
print "failed to download: " + str(failedAttempts)
```
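For context, `failedAttempts` has to be accumulated earlier in the download loop. Below is a minimal sketch under the lesson's `urllib2` approach; the `fileNames` and `baseURL` names are illustrative, not the lesson's actual variables.

```
import urllib2

failedAttempts = []
for fileName in fileNames:
    try:
        webContent = urllib2.urlopen(baseURL + fileName).read()
    except IOError:
        # remember the file so it can be reported at the end
        failedAttempts.append(fileName)
```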

Now when you run the program, should there be a problem downloading a
1 change: 1 addition & 0 deletions en/lessons/editing-audio-with-audacity.md
@@ -15,6 +15,7 @@ topics: [data-manipulation]
abstract: "In this lesson you will learn how to use Audacity to load, record, edit, mix, and export audio files."
review-ticket: https://github.com/programminghistorian/ph-submissions/issues/15
redirect_from: /lessons/editing-audio-with-audacity
avatar_alt: Two gramophones facing each other
---

{% include toc.html %}
en/lessons/exploring-and-analyzing-network-data-with-python.md
@@ -23,6 +23,7 @@ topics: [network-analysis]
date: 2017-08-23
abstract: "This lesson introduces network metrics and how to draw conclusions from them when working with humanities data. You will learn how to use the NetworkX Python package to produce and work with these network statistics."
redirect_from: /lessons/exploring-and-analyzing-network-data-with-python
avatar_alt: Train tracks intersecting
---

{% include toc.html %}
1 change: 1 addition & 0 deletions en/lessons/extracting-illustrated-pages.md
@@ -16,6 +16,7 @@ difficulty: 2
activity: acquiring
topics: [api]
abstract: Machine learning and API extensions by HathiTrust and Internet Archive are making it easier to extract page regions of visual interest from digitized volumes. This lesson shows how to efficiently extract those regions and, in doing so, prompt new, visual research questions.
avatar_alt: Scientific measuring device
---

{% include toc.html %}
Expand Down
31 changes: 16 additions & 15 deletions en/lessons/extracting-keywords.md
@@ -17,6 +17,7 @@ topics: [data-manipulation]
abstract: "This lesson will teach you how to use Python to extract a set of keywords very quickly and systematically from a set of texts."
python_warning: true
redirect_from: /lessons/extracting-keywords
avatar_alt: Woman churning butter or milk
---

{% include toc.html %}
@@ -59,7 +60,7 @@ The first step of this process is to take a look at the data that we will be usi

{% include figure.html filename="extracting-keywords-1.png" caption="Screenshot of the first forty entries in the dataset" %}

Download the dataset and spend a couple of minutes looking at the types of information available. You should notice three columns of information. The first, 'Name', contains the name of the graduate. The second, 'Details', contains the biographical information known about that person. The final column, 'Matriculation Year', contains the year in which the person matriculated (began their studies). This final column was extracted from the details column in the pre-processing stage of this tutorial. The first two columns are as you would find them on the British History Online version of the *Alumni Oxonienses*. If you notice more than three columns then your spreadsheet programme has incorrectly set the [delimiter](https://en.wikipedia.org/wiki/Delimiter) between columns. It should be set to "," (double quotes, comma). How you do this depends on your spreadsheet programme, but you should be able to find the solution online.

Most (but not all) of these bibliographic entries contain enough information to tell us what county the graduate came from. Notice that a large number of entries contain placenames that correspond to either major cities ('of London', in the first entry) or English counties ('of Middlesex' in entry 5 or 'of Wilts' - short for Wiltshire in entry 6). If you are not British you may not be familiar with these county names. You can find a list of [historic counties of England](http://en.wikipedia.org/wiki/Historic_counties_of_England) on Wikipedia.

@@ -161,7 +162,7 @@ The fourth line closes the open text file. The fifth line prints out the results

Save this file as `extractKeywords.py`, again to the same folder as the other files, and then run it with Python. To do this from the command line, first you need to launch your command line terminal.

On Windows it is called `Command Prompt`. Windows users may find it easier to launch Python by opening the folder containing your `extractKeywords.py` file, pressing `shift` + `right-click`, and selecting 'open command window here'. Assuming you have Python installed, you should be able to run your programme using the command beginning with 'python' below.

On Mac OS X, this is found in the `Applications` folder and is called `Terminal`. Once the Terminal window is open, you need to point your Terminal at the directory that contains all of the files you have just created. I have called my directory 'ExtractingKeywordSets' and I have it on my computer's Desktop. To change the Terminal to this directory, I use the following command:

@@ -268,7 +269,7 @@ This code will automatically check each word in a text, keeping track of matches

If it looks like it worked, delete the 'print matches' line and move to the next step.

### Step 5: Output results

If you have got to this stage, then your Python program is already finding the matching keywords from your gazetteer. All we need to do now is print them out to the command output pane in a format that's easy to work with.

@@ -282,7 +283,7 @@ Add the following lines to your program, minding the indentation as always:
        matchString = ''
        for matches in storedMatches:
            matchString = matchString + matches + "\t"

        print matchString

```
@@ -293,7 +294,7 @@ If there IS a match, then the program creates a new variable called 'matchString

When all of the matching keywords have been added to 'matchString', the program prints it out to the command output before moving on to the next text.
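(An aside, not part of the lesson's code: the loop-and-concatenate step can be written more compactly with `join`, which also avoids the trailing tab.)

```
matchString = "\t".join(storedMatches)
print matchString
```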

If you save your work and run the program, you should now have code that achieves all of the steps from the algorithm and outputs the results to your command output.

The finished code should look like this:

@@ -314,7 +315,7 @@ f.close()
for entry in allTexts:
    matches = 0
    storedMatches = []

    #for each entry:
    allWords = entry.split(' ')
    for words in allWords:
@@ -332,15 +333,15 @@ for entry in allTexts:
            else:
                storedMatches.append(words)
                matches += 1

    #if there is a stored result, print it out
    if matches == 0:
        print ' '
    else:
        matchString = ''
        for matches in storedMatches:
            matchString = matchString + matches + "\t"

        print matchString
```

@@ -416,7 +417,7 @@ with open('The_Dataset_-_Alumni_Oxonienses-Jas1.csv') as csvfile:
    for row in reader:
        #the full row for each entry, which will be used to recreate the improved CSV file in a moment
        fullRow.append((row['Name'], row['Details'], row['Matriculation Year']))

        #the column we want to parse for our keywords
        row = row['Details'].lower()
        allTexts.append(row)
@@ -484,7 +485,7 @@ with open('The_Dataset_-_Alumni_Oxonienses-Jas1.csv') as csvfile:
    for row in reader:
        #the full row for each entry, which will be used to recreate the improved CSV file in a moment
        fullRow.append((row['Name'], row['Details'], row['Matriculation Year']))

        #the column we want to parse for our keywords
        row = row['Details'].lower()
        allTexts.append(row)
@@ -503,7 +504,7 @@ with open(filename, 'a') as csvfile:
    fieldnames = ['Name', 'Details', 'Matriculation Year', 'Placename']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()

    #NEW! define the output for each row and then print to the output csv file
    writer = csv.writer(csvfile)

@@ -512,24 +513,24 @@ with open(filename, 'a') as csvfile:

        matches = 0
        storedMatches = []

        #for each entry:
        allWords = entry.split(' ')
        for words in allWords:

            #remove punctuation that will interfere with matching
            words = words.replace(',', '')
            words = words.replace('.', '')
            words = words.replace(';', '')

            #if a keyword match is found, store the result.
            if words in allKeywords:
                if words in storedMatches:
                    continue
                else:
                    storedMatches.append(words)
                    matches += 1

        #CHANGED! send any matches to a new row of the csv file.
        if matches == 0:
            newRow = fullRow[counter]
3 changes: 1 addition & 2 deletions en/lessons/fetch-and-parse-data-with-openrefine.md
@@ -15,6 +15,7 @@ activity: acquiring
topics: [data-manipulation, web-scraping, api]
abstract: "OpenRefine is a powerful tool for exploring, cleaning, and transforming data. In this lesson you will learn how to use Refine to fetch URLs and parse web content."
redirect_from: /lessons/fetch-and-parse-data-with-openrefine
avatar_alt: Machine for water filtration
---

{% include toc.html %}
@@ -640,5 +641,3 @@ OpenRefine is a flexible, pragmatic tool that simplifies routine tasks and, when
[^use]: As of July 2017, see [API Documentation](http://text-processing.com/docs/index.html).
[^1]: Jacob Perkins, "Sentiment Analysis with Python NLTK Text Classification", [http://text-processing.com/demo/sentiment/](http://text-processing.com/demo/sentiment/).
[^2]: Vivek Narayanan, Ishan Arora, and Arjun Bhatia, "Fast and accurate sentiment classification using an enhanced Naive Bayes model", 2013, [arXiv:1305.6143](https://arxiv.org/abs/1305.6143).

