Minor changes to text and formatting
abarciauskas-bgse committed Aug 13, 2024
1 parent 135c053 commit cf35de9
Showing 4 changed files with 15 additions and 14 deletions.
6 changes: 3 additions & 3 deletions book/tutorials/cloud-computing/01-cloud-computing.ipynb
@@ -26,12 +26,12 @@
":open:\n",
"If you have your laptop available, open the terminal app and use the appropriate commands to determine CPU and memory.\n",
"\n",
"<div style=\"float:left; padding: 30px;\">\n",
"<div style=\"width:60%; padding: 30px;\">\n",
"\n",
"| Operating System (OS) | CPU command | Memory Command |\n",
"|-----------------------|-----------------------------------------------------------------------------------|----------------------------|\n",
"| MacOS | <code>sysctl -a \\| grep hw.ncpu</code> | <code>top -l 1 \\| grep PhysMem</code> |\n",
"| Linux (cryocloud) | <code>lscpu \\| grep \"^CPU\\(s\\):\"</code> | <code>free -h</code> | \n",
"| MacOS | `sysctl -a \\| grep hw.ncpu` | `top -l 1 \\| grep PhysMem` |\n",
"| Linux (cryocloud) | `lscpu \\| grep \"^CPU\\(s\\):\"` | `free -h` | \n",
"| Windows | https://www.top-password.com/blog/find-number-of-cores-in-your-cpu-on-windows-10/ | |\n",
"</div>\n",
"\n",
6 changes: 3 additions & 3 deletions book/tutorials/cloud-computing/02-cloud-data-access.ipynb
@@ -50,7 +50,7 @@
"Navigate [https://search.earthdata.nasa.gov](https://search.earthdata.nasa.gov), search for ICESat-2 and answer the following questions:\n",
"\n",
"* Which DAAC hosts ICESat-2 datasets?\n",
"* Which ICESat-2 datasets are hosted on the AWS Cloud and how can you tell?\n",
"* How many ICESat-2 datasets are hosted on the AWS Cloud and how can you tell?\n",
":::\n",
"\n",
"\n",
@@ -59,8 +59,8 @@
"Here are a few likely ones:\n",
"1. Download data from a DAAC to your local machine.\n",
"2. Download data from cloud storage to your local machine.\n",
"3. Download data from a DAAC to a virtual machine in the cloud (when would you do this?).\n",
"4. Login to a virtual machine in the cloud, like cryointhecloud, and access data directly.\n",
"3. Login to a virtual machine in the cloud and download data from a DAAC (when would you do this?).\n",
"4. Login to a virtual machine in the cloud, like CryoCloud, and access data directly.\n",
"\n",
"```{image} ./images/different-modes-of-access.png\n",
":width: 1000px\n",
@@ -7,7 +7,8 @@
"# Cloud-Optimized Data Access\n",
"\n",
"<br />\n",
"Recall from the introduction that cloud object storage is accessed over the network. Local file storage access will always be faster but there are limitations. This is why the design of file formats in the cloud requires more consideration than local file storage.\n",
"\n",
"Recall from the [Cloud Data Access Notebook](./02-cloud-data-access.ipynb) that cloud object storage is accessed over the network. Local file storage access will always be faster but there are limitations. This is why the design of file formats in the cloud requires more consideration than local file storage.\n",
"\n",
"## 🏋️ Exercise\n",
"\n",
@@ -48,19 +49,19 @@
"\n",
"### An analogy - Moving away from home\n",
"\n",
"Imagine when you lived at home with your parents. Everything was right there when you needed it (like local file storage). Let's say you're about to move away to college (the cloud), and you are not allowed to bring anything with you. You put everything in your parent's (infinitely large) garage (cloud object storage). Given you would need to have things shipped to you, would it be better to leave everything unpacked? To put everything all in one box? A few different boxes? And what would be the most efficient way for your parents to know where things were when you asked for them?\n",
"Imagine when you lived at home with your parents. Everything was right there when you needed it (like local file storage). Let's say you're about to move away to college (the cloud), but you have decided to backpack there and so you can't bring any of your belongings with you. You put everything in your parent's (infinitely large) garage (cloud object storage). Given you would need to have things shipped to you, would it be better to leave everything unpacked? To put everything all in one box? A few different boxes? And what would be the most efficient way for your parents to know where things were when you asked for them?\n",
"\n",
"```{image} ./images/dalle-college.png\n",
":width: 400px\n",
":align: center\n",
"```\n",
"<p style=\"font-size:10px\">image generated with ChatGPT 4</p>\n",
"\n",
"You are probably familiar with the following file formats: HDF5, NetCDF, GeoTIFF. You can actually make any of these formats \"cloud-optimized\" by:\n",
"You can actually make any common geospatial data formats (HDF5/NetCDF, GeoTIFF, LAS (LIDAR Aerial Survey)) \"cloud-optimized\" by:\n",
"\n",
"1. Separate metadata from data and store it contiguously data so it can be read with one request.\n",
"2. Store data in chunks, so the whole file doesn't have to be read to access a portion of the data.\n",
"3. Make sure chunks of data are not too small, so more data can be fetched with each request.\n",
"1. Separate metadata from data and store it contiguously so it can be read with one request.\n",
"2. Store data in chunks, so the whole file doesn't have to be read to access a portion of the data, and it can be compressed.\n",
"3. Make sure the chunks of data are not too small, so more data is fetched with each request.\n",
"4. Make sure the chunks are not too large, which means more data has to be transferred and decompression takes longer.\n",
"5. Compress these chunks so there is less data to transfer over the network.\n",
"\n",
@@ -17,7 +17,7 @@
"Recall from [03-cloud-optimized-data-access.ipynb](./03-cloud-optimized-data-access.ipynb) that we can make any HDF5 file cloud-optimized by restructuring the file so that all the metadata is in one place and chunks are \"not too big\" and \"not too small\". However, as users of the data, not archivers, we don't control how the file is generated and distributed, so if we're restructuring the data we might want to go with something even better - a **\"cloud-native\"** format.\n",
"\n",
":::{important} Cloud-Native Formats\n",
"Cloud-native formats are formats that were designed specifically to be used in a cloud environment. This usually means that metadata and indexes for data is separated from metadata in a way that allows for logical dataset access across multiple files. In other words, it is fast to open a large dataset and access just the parts of it that you need.\n",
"Cloud-native formats are formats that were designed specifically to be used in a cloud environment. This usually means that metadata and indexes for data is separated from the data itself in a way that allows for logical dataset access across multiple files. In other words, it is fast to open a large dataset and access just the parts of it that you need.\n",
":::\n",
"\n",
":::{warning}\n",
@@ -73,7 +73,7 @@
"\n",
"\n",
"gdf = gpd.GeoDataFrame(df, geometry='geometry')\n",
"null_value = gdf['h_canopy'].max() \n",
"null_value = gdf['h_canopy'].max() # can we change this to a no data value?\n",
"gdf_filtered = gdf.loc[gdf['h_canopy'] != null_value]\n",
"gdf_filtered"
]
