202 changes: 202 additions & 0 deletions pretext/AdditionalTopics/BigO.ptx
@@ -0,0 +1,202 @@
<?xml version="1.0"?>
<section xml:id="extratopics-bigo">
<title>Big O Analysis</title>

<p>A common question that comes up when programming is: "How long will my program take to run?" Even if a program produces the correct output, it is unacceptable if it takes
too long to finish. There is a problem, though: it is impossible to reliably say exactly how long a program will take to run.
It depends on too many things. The capabilities of the computer running the code, what else is running on that computer, and the size of the input are just
some of the factors that would need to be considered.
</p>

<p>To simplify this issue, we'll give up on estimating exactly how long a program will run and instead look at the biggest factor that affects
its run time: the size of the input. If we wrote a program that ran for 60 seconds on 100 megabytes of input data, how should we expect the program to
react to 200 megabytes of input data? Maybe it would run in 120 seconds (twice the data for twice the run time)? Maybe it would still run in 60 seconds,
assuming the extra data isn't used. Or maybe the program would run for far longer. The issue is that we don't know the relationship between the size
of the input data and the behavior of the program.</p>

<p>This is where <term>Big O Analysis</term> comes in. Big O is a notation computer scientists use to describe the relationship between the size
of the input data and the behavior of the program. Big O expressions are written like mathematical functions of a variable n, where n represents
the size of the input data provided to the program. The Big O function tells us how n affects the time the program will take to complete.</p>

<p>Consider the example from before: a program takes 60 seconds to run on 100 megabytes of input data, and we'd like to know (roughly)
how long it might take on 200 megabytes of input data. If we know the run time of the program is the function f(n) = n^2, with n being
the size of the data, we now have enough information to make a guess. If n is doubled, then the time the program runs for will quadruple! (2*n)^2 = 4 * n^2.</p>
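<p>The scaling argument above can be sketched in a few lines of Python. This is an illustration only; the function name and the baseline figures (60 seconds at 100 megabytes) come from the example, and the program itself is hypothetical:</p>

```python
def quadratic_time_estimate(n, baseline_n=100, baseline_seconds=60):
    """Estimate the run time of a hypothetical O(n^2) program,
    given that it takes 60 seconds on 100 megabytes of input."""
    scale = (n / baseline_n) ** 2
    return baseline_seconds * scale

# Doubling the input quadruples the run time: (2*n)^2 = 4 * n^2
print(quadratic_time_estimate(100))  # 60.0 seconds
print(quadratic_time_estimate(200))  # 240.0 seconds
```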

<p>The formal mathematical notation for Big O is denoted with a capital O (a <em>big o</em>!) followed by parentheses.
Inside of the <c>O()</c> is most commonly some term of n. In our previous example, we would say the program has O(n^2) behavior.</p>

<p>Different functions of n have different <em>magnitudes</em>, which helps us to quantify how quick or slow an algorithm is relative to the input size <c>n</c>.
From left to right, left being the quickest time and right being the slowest time, we typically see these complexities:</p>

<p><c>O(1)</c>, <c>O(log n)</c>, <c>O(n)</c>, <c>O(n log n)</c>, <c>O(n^2)</c>, <c>O(n^3)</c>, <c>O(2^n)</c>, <c>O(n!)</c>.</p>

<p>Big O is like a limit in that only the most significant terms matter as <c>n</c> gets bigger and bigger. We typically expect n to be very, VERY large because
small inputs aren't as strongly affected by time limits. If a program takes 0.001 seconds to run with most normal data, is it really a big deal if it takes 0.004
seconds on occasion? What if we were dealing with a program that had to run for a month though? Now that factor of four starts to hurt a lot more.</p>

<p>There is another important aspect that we have ignored up to this point: programs can often have wildly different behavior depending on their input.
Consider a contrived example:</p>
<program language="python">
<input>
var = input()
if 'a' in var:
    while True:
        print("run forever!")
else:
    print("done")
</input>
</program>

<p>In this program, the size of the input doesn't matter as much as whether the input string contains a letter "a" or not. If it does, the program runs forever.
If it doesn't, the program ends almost immediately. How do we reconcile this with our Big O notation? The answer is to be a pessimist. We adopt the assumption that
everything that can happen to slow down our program will happen. In the code above, we assume that the input ALWAYS will contain an "a". This assumption is broadly
known as the "worst case". Big O notation uses this assumption in every instance you will see it (at least in this class).</p>
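<p>The worst-case idea is easy to see in a simple linear search. This is a sketch of our own; the function name and the example data are ours:</p>

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if absent.
    Best case: target is the first item (one step).
    Worst case: target is absent, and we examine all n items.
    Big O assumes the worst case, so this is O(n)."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

data = [4, 8, 15, 16, 23, 42]
print(linear_search(data, 4))    # 0  (best case: found immediately)
print(linear_search(data, 99))   # -1 (worst case: scanned everything)
```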

<p>Let's look at some more examples:</p>
<program language="python">
<input>
sum = 1 + 1
print(sum)
</input>
</program>
<p>This code has a Big O of <c>O(1)</c>, also referred to as <term>constant time</term>. This is because the program does nothing with its input. In fact, it doesn't
even take input! Constant time operations are typically things in code which <em>do not</em> loop. A constant time program suggests it will always finish in a
consistent amount of time, no matter what happens.</p>

<p>Now, let's check out an example with a loop:</p>
<program language="python">
<input>
def example_func(n):
    for i in range(n):
        print(i)
</input>
</program>
<p>As you can see, this function simply prints the numbers from <c>0</c> to <c>n - 1</c>. Each print takes a little time, so a larger n means a longer program run time.
We denote the complexity of <c>example_func</c> as <c>O(n)</c>,
because whether <c>n = 100</c> or <c>n = 10000000</c>, the run time grows in direct proportion to <c>n</c>.</p>

<p>In the last code example, <c>O(n)</c> was the complexity for <em>all</em> cases, because the loop <em>always</em> goes to <c>n</c>.</p>

<image source="ExtraTopics/Figures/complexity.png" width="75%" alt="Big O Complexity Graph"/>

<p>This figure shows complexities as a graph and which ones are considered "desirable" or at least "acceptable". Context mostly determines if these are "good" terms or not,
but do strive to never write something worse than <c>O(n^3)</c>!</p>

<p>It may be difficult to appreciate the implications of these terms when first seeing them. Let's say we have algorithms with the following complexities, and they
all run in the same time (1 millisecond) for n = 10. This table shows what happens as we increase the size of the input:</p>

<table>
<tabular>
<row>
<cell>
n
</cell>
<cell>
<c>O(log(n))</c>
</cell>
<cell>
<c>O(n)</c>
</cell>
<cell>
<c>O(n^3)</c>
</cell>
<cell>
<c>O(2^n)</c>
</cell>
</row>
<row>
<cell>
10
</cell>
<cell>
1 ms
</cell>
<cell>
1 ms
</cell>
<cell>
1 ms
</cell>
<cell>
1 ms
</cell>
</row>
<row>
<cell>
11
</cell>
<cell>
1 ms
</cell>
<cell>
1.1 ms
</cell>
<cell>
~1.3 ms
</cell>
<cell>
2 ms
</cell>
</row>
<row>
<cell>
20
</cell>
<cell>
1.3 ms
</cell>
<cell>
2 ms
</cell>
<cell>
8 ms
</cell>
<cell>
1 s
</cell>
</row>
<row>
<cell>
100
</cell>
<cell>
2 ms
</cell>
<cell>
10 ms
</cell>
<cell>
1 s
</cell>
<cell>
10^16 years
</cell>
</row>
<row>
<cell>
100000
</cell>
<cell>
5 ms
</cell>
<cell>
10 s
</cell>
<cell>
31 years
</cell>
<cell>
:)
</cell>
</row>
</tabular>
</table>
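<p>The entries in this table can be reproduced by scaling each complexity from its n = 10 baseline of 1 millisecond. A sketch (the helper name is ours, and the table rounds for presentation):</p>

```python
import math

def scaled_ms(f, n, baseline_n=10, baseline_ms=1.0):
    """Project the run time at size n for a growth function f,
    assuming the program takes 1 ms at n = 10."""
    return baseline_ms * f(n) / f(baseline_n)

growth = {
    "O(log n)": lambda n: math.log(n),
    "O(n)":     lambda n: n,
    "O(n^3)":   lambda n: n ** 3,
    "O(2^n)":   lambda n: 2.0 ** n,
}

# Reproduce the n = 20 row: roughly 1.3 ms, 2 ms, 8 ms, and ~1 s.
for name, f in growth.items():
    print(name, round(scaled_ms(f, 20), 1), "ms at n = 20")
```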

<p>As you can see, what started off as a negligible difference exploded into a totally unacceptable time for larger input sizes applied to larger Big O terms. Examples like these are precisely why
computer scientists are so fixated on Big O. 100000 data points is not a lot of data. Large tech companies are often running code on billions or
trillions of data points, and anything less than the most efficient code won't be able to run at scale.</p>

<p>We will end this section with a disclaimer. We have only covered the bare basics of Big O here. If you continue to study computer science,
you'll have more opportunities to explore it in much more detail, including seeing the formal definition of Big O and learning how to determine the Big O of your own code.
For this class, we only ask you to be familiar with Big O notation and to have a basic intuition for what it communicates.</p>
</section>
53 changes: 53 additions & 0 deletions pretext/AdditionalTopics/BinaryRepresentations.ptx
@@ -0,0 +1,53 @@
<?xml version="1.0"?>
<section xml:id="extratopics-binaryrepresentations">
<title>Binary Representations</title>

<p>Have you ever seen all of the "hacker" 01010110101010s in the movies? As you might
know, this is called binary. While it's not actually how hacking works, binary is still the basis of all computing.
Every word that you are reading right now was transmitted to your computer as a series of 1s and 0s. Although you won't
be typing 0s and 1s at a keyboard all day, binary is still useful to know.</p>

<p>Quick background: binary is a numbering system, just like decimal (the numbering system we normally use).
Decimal uses the digits 0-9, but binary only uses the digits 0 and 1, which are called bits.
In other words, binary is just a different way of counting.</p>

<p>Believe it or not, this is indirectly how you've been counting your entire life. For instance, in decimal numbering (base 10):</p>

<p><c>1023</c> (base 10) = (1 * 10^3) + (0 * 10^2) + (2 * 10^1) + (3 * 10^0)</p>

<p>There are even more numbering systems, like hexadecimal and octal, but you only need to understand binary for this course.</p>

<p>Binary deals with powers of two (hence the name). Reading from right to left and starting at position 0,
each bit controls one power of two: if the bit is 0, it is "off" and contributes nothing; if the bit is 1, it is "on"
and contributes 2 raised to the power of its position. Binary numbering is also called <term>base 2</term>
because of this. For instance:</p>

<p><c>1000</c> (base 2) = (1 * 2^3) + (0 * 2^2) + (0 * 2^1) + (0 * 2^0) = 8</p>
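<p>Python can check this expansion for us; the built-in <c>int(s, 2)</c> parses a string of bits as a base 2 number:</p>

```python
# 1000 in base 2: only the 2^3 bit is "on"
assert int("1000", 2) == (1 * 2**3) + (0 * 2**2) + (0 * 2**1) + (0 * 2**0)
print(int("1000", 2))  # 8
```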

<p><term>Converting decimal to binary</term>: A quick way to convert decimal to binary is to find the largest
power of 2 that fits into the number, writing a 1 for each power that fits and a 0 for each that doesn't. Subtract
each power that fits from the number and repeat with the next smaller power until we hit 0. For instance:</p>

<p>Example: Convert 78 to binary</p>
<p>1. If we think about all of our powers of 2, 2^7 = 128 is too large (128 &gt; 78), so we know 2^6 is where we'll start our number, and we need a 1 in that position. We now have: <c>1xxxxxx</c>.</p>
<p>2. 78 - 64 = 14, which is our remainder from the last digit. 2^5 = 32 &gt; 14, so we know 2^5 is a 0. We now have: <c>10xxxxx</c>.</p>
<p>3. 78 - 64 = 14, which is our remainder from the last digit. 2^4 = 16 &gt; 14, so we know 2^4 is a 0. We now have: <c>100xxxx</c>.</p>
<p>4. 78 - 64 = 14, which is our remainder from the last digit. 2^3 = 8 &lt; 14, so we know 2^3 is a 1 because it fits in! We now have: <c>1001xxx</c>.</p>
<p>5. 78 - 64 - 8 = 6, which is our remainder from the last digit. 2^2 = 4 &lt; 6, so we know 2^2 is a 1 because it fits in! We now have: <c>10011xx</c>.</p>
<p>6. 78 - 64 - 8 - 4 = 2, which is our remainder from the last digit. 2^1 = 2, which fits into 2 exactly, so we know 2^1 is a 1! We now have: <c>100111x</c>.</p>
<p>7. 78 - 64 - 8 - 4 - 2 = 0, so we are done and can fill any remainders with a 0 bit.</p>
<p>Our final answer is: <c>1001110</c> (base 2)</p>
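<p>The steps above can be sketched as a short function. This is our own illustration of the greedy method, not a standard library routine; it walks the powers of 2 from largest to smallest and emits a 1 whenever a power fits:</p>

```python
def to_binary(n):
    """Convert a positive integer to a binary string using the
    greedy method: subtract the largest power of 2 that fits."""
    # Find the largest power of 2 that fits into n.
    power = 1
    while power * 2 <= n:
        power *= 2
    bits = ""
    while power >= 1:
        if power <= n:      # this power fits: emit a 1 and subtract it
            bits += "1"
            n -= power
        else:               # too large: emit a 0
            bits += "0"
        power //= 2
    return bits

print(to_binary(78))  # 1001110
```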

<note>
Typically when we write binary, we'll see our bits in groups of 4, because our binary sequences are normally
some multiple of 4, like 8, 16, or 32. Because of this, we would add a leading zero and
write our previous answer as: <c>0100</c> <c>1110</c> (base 2).
</note>

<p><term>Converting binary to decimal</term>: As mentioned above, you can simply look at each bit,
and add 2 to the power of its position if the bit is 1.</p>
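<p>That rule can be written as a short loop (again a sketch of ours, not a built-in):</p>

```python
def to_decimal(bits):
    """Sum 2**position for every bit that is 1, counting
    positions from the right starting at 0."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total

print(to_decimal("1001110"))  # 78
```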

<p>As usual in math, there are a few different ways to arrive at the same conclusion. These are not
the only ways to do conversions. If these explanations don't make sense to you, ask your instructor
or search online for another explanation.</p>
</section>
53 changes: 53 additions & 0 deletions pretext/AdditionalTopics/DataScience.ptx
@@ -0,0 +1,53 @@
<?xml version="1.0"?>
<section xml:id="extratopics-datascience">
<title>Data Science</title>

<p>Data science is a multidisciplinary field which combines computer science,
math, and other domains to answer questions using data.</p>

<p>As the world moves more and more towards storing and analyzing large amounts of data,
data science is a vital skill for you to be familiar with, whether you're a computer science major or not. It is also
a very common and useful application of programming, which is why we're discussing it in this class.</p>

<p>Data science is perhaps best defined by describing what data science looks like. <term>The data science process</term> consists of four steps:</p>

<ol>
<li>Obtaining data</li>
<li>Cleaning the data</li>
<li>Exploring the data</li>
<li>Predicting unknowns</li>
</ol>

<p><term>Obtaining the data</term>: We live in a time where data is more abundant than ever before. Getting hold of data can involve gathering it yourself,
purchasing it, or taking advantage of the many, many sites online which have a plethora of data
available for free (and sometimes paid) use. If you are getting your data from some third party, it will likely come as a .csv file, a .json file, or a SQL database.</p>

<p><term>Cleaning the data</term>: This can vary, but ultimately you need to prepare your data
in a way that makes it easily usable in the next steps. Often data starts out "noisy" or contains errors. In this step you may
fix things in the data, change missing data, or correct wrong data.</p>

<p>Cleaning is regularly considered the longest step in this process! Data can come in all sorts of different
formats, with anomalies, with blanks, and so much more. What "fixing" the data even means often depends on
context and your own goals.</p>
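<p>As a small, hypothetical illustration of cleaning: the data below is made up, and filling blanks with the average is just one of many reasonable choices.</p>

```python
# Hypothetical sensor readings; None marks missing values.
readings = [20.5, None, 21.0, 19.5, None, 20.0]

# One common cleaning choice: fill blanks with the average
# of the values we do have.
known = [r for r in readings if r is not None]
average = sum(known) / len(known)
cleaned = [average if r is None else r for r in readings]

print(cleaned)  # [20.5, 20.25, 21.0, 19.5, 20.25, 20.0]
```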

<p><term>Exploring the data</term>: Now that the data is prepared, we can do some analysis on it! As the term suggests, exploring the data is about coming to better
understand it. You often don't know what is interesting or useful about data when you first encounter it. You may need to do some sort of statistical
analysis to uncover the interesting aspects, or you may want to graph values and look for relationships and trends visually.</p>
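<p>Exploration often starts with simple summary statistics. A sketch using Python's standard <c>statistics</c> module (the visitor counts here are invented):</p>

```python
import statistics

# Made-up daily visitor counts for a small website.
visitors = [120, 135, 128, 410, 131, 125, 129]

print("mean:  ", statistics.mean(visitors))
print("median:", statistics.median(visitors))
print("max:   ", max(visitors))
# The mean is pulled well above the median by the single
# spike (410) -- an anomaly worth exploring further.
```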

<p><term>Predicting unknowns</term>: Having come to understand the data better, you can now use it to create new knowledge. These days, this step typically involves
using machine learning models. These techniques can generally be split into three groups:</p>

<ol>
<li><term>Supervised Learning</term>: With supervised learning, we try to construct a model that describes the relationship between inputs and outputs (regularly
referred to as "labels"). Knowing what labels we want in advance is what makes a method "supervised". For example, we could create a model to guess whether an email
is spam or not based on its contents; the label here is "spam" or "not spam". Or we could try to guess what the stock price will be for our favorite company based
on how it has performed in the last few weeks. The label here would be the predicted stock price.</li>
<li><term>Unsupervised Learning</term>: Contrasting with supervised learning, with unsupervised learning we don't know the labels in advance. An example here could be
using social media data to automatically identify friend groups. We don't know in advance how many groups we'll find or what their nature will be. Because of this, it
can be harder to guess what kind of results unsupervised learning will produce.</li>
<li><term>Semi-Supervised Learning:</term> Semi-supervised learning is an attempt to capture the best aspects of both supervised and unsupervised learning. With these
approaches we start with some data that has labels and also some data that doesn't. To use a previous example, we could take a collection of emails, only some of
which have been labeled as spam or not, and still try to construct a reliable method for identifying new emails as spam. If it goes well, then we've saved ourselves
a lot of time that would have otherwise been spent labeling emails.</li>
</ol>
</section>
19 changes: 18 additions & 1 deletion pretext/AdditionalTopics/Glossary.ptx
@@ -2,11 +2,24 @@
<section xml:id="dictionaries_glossary">
<title>Glossary</title>
<glossary sorted="False">
<gi>
<title>Binary search</title>
<p>A searching algorithm that repeatedly halves the portion of a sorted list that could contain the target; an improvement upon linear search.</p>
</gi>
<gi>
<title>Big O Notation</title>
<p>A notation computer scientists use to describe the relationship between the size
of the input data and the behavior of the program, denoted with O() and some function of n inside the parentheses.</p>
</gi>
<gi>
<title>constant time</title>
<p>A Big O time complexity of <c>O(1)</c>.</p>
</gi>
<gi>
<title>comprehension</title>
<p>A specific Python construction that allows collection types to be created and filled with a single line of code.</p>
</gi>
<gi>
<title>dictionary</title>
<p>A collection of key-value pairs that maps from keys to values. The keys
can be any immutable type, and the values can be any type.</p>
@@ -21,6 +34,10 @@
<p>One of the pairs of items in a dictionary. Values are looked up in a
dictionary by key.</p>
</gi>
<gi>
<title>Linear search</title>
<p>A searching algorithm where you look through a list in linear order (directly from start to end).</p>
</gi>
<gi>
<title>mapping type</title>
<p>A mapping type is a data type comprised of a collection of keys and