From c34f027f3fdb84654aaea1d2bcb30bc55480a4d3 Mon Sep 17 00:00:00 2001
From: John Yaist
Date: Mon, 30 Sep 2024 10:42:45 -0700
Subject: [PATCH] republish table data used in guide

---
 .../part2_data_io_reading_data.ipynb | 5249 ++++++++++++++++-
 1 file changed, 5248 insertions(+), 1 deletion(-)

diff --git a/guide/05-working-with-the-spatially-enabled-dataframe/part2_data_io_reading_data.ipynb b/guide/05-working-with-the-spatially-enabled-dataframe/part2_data_io_reading_data.ipynb
index b86f1b743..6458fcc95 100644
--- a/guide/05-working-with-the-spatially-enabled-dataframe/part2_data_io_reading_data.ipynb
+++ b/guide/05-working-with-the-spatially-enabled-dataframe/part2_data_io_reading_data.ipynb
@@ -1 +1,5248 @@
-{"cells":[{"cell_type":"markdown","metadata":{},"source":["# Part-2 Data IO with SeDF - Accessing Data\n"]},{"cell_type":"markdown","metadata":{"toc":true},"source":["

Table of Contents

\n","
\n"]},{"cell_type":"markdown","metadata":{},"source":["## Introduction\n"]},{"cell_type":"markdown","metadata":{},"source":["In _part-1_ of this guide series, we started with an introduction to the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) (SeDF), the `spatial` and `geom` namespaces, and looked at a quick example of SeDF in action. In this part of the guide series, we will look at how GIS data can be accessed from various data formats using SeDF.\n","\n","GIS users work with different vector-based spatial data formats, like published layers on remote servers (web layers) and local data. The [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) allows users to read, write, and manipulate spatial data by bringing the data into memory.\n","\n","The _SeDF_ integrates with Esri's [**ArcPy** site-package](http://pro.arcgis.com/en/pro-app/arcpy/get-started/what-is-arcpy-.htm), as well as the open-source [pyshp](https://github.com/GeospatialPython/pyshp/), [shapely](https://github.com/Toblerity/Shapely) and [fiona](https://github.com/Toblerity/Fiona) packages. This means that the _SeDF_ can use either [shapely](https://pypi.org/project/Shapely/) or [arcpy](https://www.esri.com/en-us/arcgis/products/arcgis-python-libraries/libraries/arcpy) geometry engines to provide you with options for easily working with geospatial data, regardless of your platform. The _SeDF_ transforms the data into the formats you desire, allowing you to use Python functionality to analyze and visualize geographic information.\n","\n","Data can be read and scripted to automate workflows and be visualized on maps in [Jupyter notebooks](../using-the-jupyter-notebook-environment/). 
Let's explore the options available for accessing GIS data with the versatile _Spatially enabled DataFrame_.\n"]},{"cell_type":"markdown","metadata":{},"source":["The data used in this guide is available as an [item](https://www.arcgis.com/home/item.html?id=c7140ae3d7ae4fd0817181461019aa75). We will start by importing some libraries and downloading and extracting the data needed for the analysis in this guide.\n"]},{"cell_type":"code","execution_count":1,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:37.257478Z","start_time":"2021-11-22T19:51:12.679381Z"}},"outputs":[],"source":["# Import Libraries\n","import pandas as pd\n","from arcgis.features import GeoAccessor, GeoSeriesAccessor\n","from arcgis.gis import GIS\n","from IPython.display import display\n","import zipfile\n","import os\n","import shutil"]},{"cell_type":"code","execution_count":2,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:38.872324Z","start_time":"2021-11-22T19:51:37.261479Z"}},"outputs":[],"source":["# Create a GIS connection\n","gis = GIS()\n","agol_gis = GIS(\"https://www.arcgis.com\", \"arcgis_python\", \"amazing_arcgis_123\")"]},{"cell_type":"code","execution_count":3,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:38.980325Z","start_time":"2021-11-22T19:51:38.876325Z"}},"outputs":[{"data":{"text/html":["
\n","
\n"," \n"," \n"," \n","
\n","\n","
\n"," sedf_guide_data\n"," \n","
Data for Spatially enabled DataFrame Guides. Shapefile by api_data_owner\n","
Last Modified: November 11, 2021\n","
0 comments, 4 views\n","
\n","
\n"," "],"text/plain":[""]},"execution_count":3,"metadata":{},"output_type":"execute_result"}],"source":["# Get the data item\n","data_item = gis.content.get('c7140ae3d7ae4fd0817181461019aa75')\n","data_item"]},{"cell_type":"markdown","metadata":{},"source":["The cell below downloads and extracts the data from the data item to your machine.\n"]},{"cell_type":"code","execution_count":4,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:45.305934Z","start_time":"2021-11-22T19:51:42.206937Z"}},"outputs":[{"name":"stdout","output_type":"stream","text":["Removed existing data directory\n","Dataset unzipped at: sedf_data\\cities\n"]}],"source":["# Download and extract the data\n","def unzip_data():\n","    \"\"\"\n","    This function:\n","    - creates a directory `sedf_data` to download the data from the item\n","    - downloads the item as `sedf_guide_data.zip` file in the sedf_data directory\n","    - unzips and extracts the data to './sedf_data/cities'.\n","    \"\"\"\n","    try:\n","\n","        # path to downloaded data folder\n","        data_dir = os.path.join(os.getcwd(), 'sedf_data')\n","\n","        # remove the existing data directory if it exists\n","        if os.path.isdir(data_dir):\n","            shutil.rmtree(data_dir)\n","            print('Removed existing data directory')\n","        # (re)create an empty data directory for the download\n","        os.makedirs(data_dir)\n","\n","        data_item.download(data_dir)  # download the data item\n","        # path to zipped file inside data folder\n","        zipped_file_path = os.path.join(data_dir, 'sedf_guide_data.zip')\n","\n","        # unzip the data\n","        zip_ref = zipfile.ZipFile(zipped_file_path, 'r')\n","        zip_ref.extractall(data_dir)\n","        zip_ref.close()\n","\n","        # path to new cities directory\n","        cities_dir = os.path.join(data_dir, 'cities')\n","        print(f'Dataset unzipped at: {os.path.relpath(cities_dir)}')\n","\n","    except Exception as e:\n","        print(f'Error unzipping file: {e}')\n","\n","\n","# Extract data\n","unzip_data()"]},{"cell_type":"markdown","metadata":{},"source":["## Accessing GIS 
Data\n"]},{"cell_type":"markdown","metadata":{},"source":["The [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) reads from many **sources**, including [Feature layers](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm), [Feature classes](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm), [Shapefiles](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm), Pandas [DataFrames](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe) and more. Let's dive into the details of accessing GIS data from various sources.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in Web Feature Layers\n","\n","[Feature layers](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm) hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/) can be easily read into a Spatially enabled DataFrame using the [`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method.\n","\n","The example below shows how the [`get()`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html?highlight=gis%20content%20get#arcgis.gis.ContentManager.get) method can be used to retrieve an ArcGIS Online [`item`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html?highlight=gis%20content%20get#item) and how the [`layers`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html#layer) property of an `item` can be used to access the data.\n"]},{"cell_type":"code","execution_count":5,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:52.373464Z","start_time":"2021-11-22T19:51:51.851896Z"},"scrolled":true},"outputs":[{"data":{"text/html":["
\n","
\n"," \n"," \n"," \n","
\n","\n","
\n"," USA Major Cities\n"," \n","
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital. Feature Layer Collection by esri_dm\n","
Last Modified: May 19, 2020\n","
1 comments, 33,841,105 views\n","
\n","
\n"," "],"text/plain":[""]},"execution_count":5,"metadata":{},"output_type":"execute_result"}],"source":["gis = GIS()\n","item = gis.content.search(\n"," \"USA Major Cities\", item_type=\"Feature layer\", outside_org=True)[0]\n","item"]},{"cell_type":"code","execution_count":6,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:56.288612Z","start_time":"2021-11-22T19:51:52.376465Z"}},"outputs":[{"data":{"text/plain":["(3886, 50)"]},"execution_count":6,"metadata":{},"output_type":"execute_result"}],"source":["# Obtain the first feature layer from the item\n","flayer = item.layers[0]\n","\n","# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame\n","sdf = pd.DataFrame.spatial.from_layer(flayer)\n","\n","# Check shape\n","sdf.shape"]},{"cell_type":"code","execution_count":7,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:56.317613Z","start_time":"2021-11-22T19:51:56.291617Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
|   | AGE_10_14 | AGE_15_19 | AGE_20_24 | AGE_25_34 | AGE_35_44 | AGE_45_54 | AGE_55_64 | AGE_5_9 | AGE_65_74 | AGE_75_84 | ... | PLACEFIPS | POP2010 | POPULATION | POP_CLASS | RENTER_OCC | SHAPE | ST | STFIPS | VACANT | WHITE |
| 0 | 1313 | 1058 | 734 | 2031 | 1767 | 1446 | 1136 | 1503 | 665 | 486 | ... | 1601990 | 13816 | 15181 | 6 | 1271 | {\"x\": -12462673.723706165, \"y\": 5384674.994080... | ID | 16 | 271 | 13002 |
| 1 | 890 | 817 | 818 | 1799 | 1235 | 1330 | 1143 | 1099 | 721 | 579 | ... | 1607840 | 11899 | 11946 | 6 | 1441 | {\"x\": -12506251.313993266, \"y\": 5341537.793529... | ID | 16 | 318 | 9893 |
| 2 | 12750 | 13959 | 16966 | 32135 | 27048 | 29595 | 24177 | 12933 | 12176 | 7087 | ... | 1608830 | 205671 | 225405 | 8 | 33359 | {\"x\": -12938676.6836459, \"y\": 5403597.04949123... | ID | 16 | 6996 | 182991 |
| 3 | 790 | 768 | 699 | 1445 | 1136 | 1134 | 935 | 959 | 679 | 464 | ... | 1611260 | 10345 | 10727 | 6 | 1461 | {\"x\": -12667411.402393516, \"y\": 5241722.820606... | ID | 16 | 241 | 7984 |
| 4 | 3803 | 3779 | 3687 | 7571 | 5559 | 4744 | 3624 | 4397 | 2296 | 1222 | ... | 1612250 | 46237 | 53942 | 7 | 5196 | {\"x\": -12989383.674504515, \"y\": 5413226.487333... | ID | 16 | 1428 | 35856 |
\n","

5 rows × 50 columns

\n","
"],"text/plain":[" AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 \\\n","0 1313 1058 734 2031 1767 1446 \n","1 890 817 818 1799 1235 1330 \n","2 12750 13959 16966 32135 27048 29595 \n","3 790 768 699 1445 1136 1134 \n","4 3803 3779 3687 7571 5559 4744 \n","\n"," AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 \\\n","0 1136 1503 665 486 ... 1601990 13816 \n","1 1143 1099 721 579 ... 1607840 11899 \n","2 24177 12933 12176 7087 ... 1608830 205671 \n","3 935 959 679 464 ... 1611260 10345 \n","4 3624 4397 2296 1222 ... 1612250 46237 \n","\n"," POPULATION POP_CLASS RENTER_OCC \\\n","0 15181 6 1271 \n","1 11946 6 1441 \n","2 225405 8 33359 \n","3 10727 6 1461 \n","4 53942 7 5196 \n","\n"," SHAPE ST STFIPS VACANT WHITE \n","0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... ID 16 271 13002 \n","1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... ID 16 318 9893 \n","2 {\"x\": -12938676.6836459, \"y\": 5403597.04949123... ID 16 6996 182991 \n","3 {\"x\": -12667411.402393516, \"y\": 5241722.820606... ID 16 241 7984 \n","4 {\"x\": -12989383.674504515, \"y\": 5413226.487333... 
ID 16 1428 35856 \n","\n","[5 rows x 50 columns]"]},"execution_count":7,"metadata":{},"output_type":"execute_result"}],"source":["# Check first few records\n","sdf.head()"]},{"cell_type":"code","execution_count":8,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:56.324128Z","start_time":"2021-11-22T19:51:56.320614Z"}},"outputs":[{"data":{"text/plain":["pandas.core.frame.DataFrame"]},"execution_count":8,"metadata":{},"output_type":"execute_result"}],"source":["# Check type of sdf\n","type(sdf)"]},{"cell_type":"code","execution_count":9,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:56.343129Z","start_time":"2021-11-22T19:51:56.326129Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":9,"metadata":{},"output_type":"execute_result"}],"source":["# Access spatial namespace\n","sdf.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> We can see that the dataset has 3886 records and 50 columns. Inspecting the `type` of `sdf` object and accessing the `spatial` namespace shows us that a _Spatially enabled DataFrame_ has been created from all the data in the layer.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Memory usage and the `query()` operation\n"]},{"cell_type":"markdown","metadata":{},"source":["The [**`from_layer()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method will attempt to read all the data from the layer into the memory. This approach works when you are dealing with small datasets. 
However, when it comes to large datasets, it becomes imperative to use memory efficiently and query for only what is necessary.\n","\n","Let's take a look at the memory usage of the existing _SeDF_ using the [`memory_usage()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.memory_usage.html) method from Pandas.\n"]},{"cell_type":"code","execution_count":10,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:56.731373Z","start_time":"2021-11-22T19:51:56.722376Z"}},"outputs":[{"name":"stdout","output_type":"stream","text":["Shape of data: (3886, 50)\n","Memory used: 1.48 MB\n"]}],"source":["# Check memory usage of current sdf\n","mem_used = sdf.memory_usage().sum() / (1024**2) # converting to megabytes\n","print(f'Shape of data: {sdf.shape}')\n","print(f'Memory used: {round(mem_used, 2)} MB')"]},{"cell_type":"markdown","metadata":{},"source":["> We can see that a `SeDF` created using the `from_layer()` method reads all the data into memory. So, the `sdf` object has 3886 records and 50 columns, and uses 1.48 MB of memory.\n","\n","But what if we only needed a small amount of data for our analysis and did not need to bring everything from the layer into memory? Good question... let's see how we can achieve that.\n"]},{"cell_type":"markdown","metadata":{},"source":["The [**`query()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method is a powerful operation that allows you to use [SQL](https://en.wikipedia.org/wiki/SQL)-like queries to return only a subset of records. 
**Since the processing is performed on the server, this operation is not restricted by the capacity of your computer.**\n","\n","The method returns a [`FeatureSet`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object; however, the return type can be changed to a _Spatially enabled DataFrame_ object by specifying the parameter `as_df=True`.\n","\n","Let's subset the data using `query()`, create a new _SeDF_, and check the memory usage. We'll use the `AGE_45_54` column to query the layer and get a subset of records.\n"]},{"cell_type":"code","execution_count":11,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:58.344586Z","start_time":"2021-11-22T19:51:57.814005Z"}},"outputs":[{"data":{"text/plain":["(316, 50)"]},"execution_count":11,"metadata":{},"output_type":"execute_result"}],"source":["# Filter feature layer records with a query.\n","sub_sdf = flayer.query(where=\"AGE_45_54 < 1500\", as_df=True)\n","sub_sdf.shape"]},{"cell_type":"code","execution_count":12,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:58.354587Z","start_time":"2021-11-22T19:51:58.346589Z"}},"outputs":[{"name":"stdout","output_type":"stream","text":["Memory used is: 0.12 MB\n"]}],"source":["# Check memory usage of current sdf\n","mem_used = sub_sdf.memory_usage().sum() / (1024**2) # converting to megabytes\n","print(f'Memory used is: {round(mem_used, 2)} MB')"]},{"cell_type":"markdown","metadata":{},"source":["> Now that we are only querying for records where `AGE_45_54 < 1500`, the result is a smaller DataFrame with 316 records and 50 columns. 
Since the processing is performed on the server side, only a subset of the data is loaded into memory, reducing usage from **1.48 MB to 0.12 MB**.\n"]},{"cell_type":"markdown","metadata":{},"source":["The [`query()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method allows you to specify a number of optional parameters that may further refine and transform the results. One such key parameter is `out_fields`. With `out_fields`, you can subset your data by specifying a list of field names to return.\n"]},{"cell_type":"code","execution_count":13,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:59.855435Z","start_time":"2021-11-22T19:51:59.601357Z"}},"outputs":[{"data":{"text/plain":["(316, 6)"]},"execution_count":13,"metadata":{},"output_type":"execute_result"}],"source":["# Filter feature layer with where and out_fields\n","out_fields = ['NAME', 'ST', 'POP_CLASS', 'AGE_45_54']\n","sub_sdf2 = flayer.query(where=\"AGE_45_54 < 1500\",\n","                        out_fields=out_fields,\n","                        as_df=True)\n","sub_sdf2.shape"]},{"cell_type":"code","execution_count":14,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:51:59.868435Z","start_time":"2021-11-22T19:51:59.858435Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
|   | FID | NAME | ST | POP_CLASS | AGE_45_54 | SHAPE |
| 0 | 1 | Ammon | ID | 6 | 1446 | {\"x\": -12462673.723706165, \"y\": 5384674.994080... |
| 1 | 2 | Blackfoot | ID | 6 | 1330 | {\"x\": -12506251.313993266, \"y\": 5341537.793529... |
| 2 | 4 | Burley | ID | 6 | 1134 | {\"x\": -12667411.402393516, \"y\": 5241722.820606... |
| 3 | 6 | Chubbuck | ID | 6 | 1494 | {\"x\": -12520053.904151963, \"y\": 5300220.333409... |
| 4 | 12 | Jerome | ID | 6 | 1155 | {\"x\": -12747828.64784961, \"y\": 5269214.8197742... |
\n","
"],"text/plain":[" FID NAME ST POP_CLASS AGE_45_54 \\\n","0 1 Ammon ID 6 1446 \n","1 2 Blackfoot ID 6 1330 \n","2 4 Burley ID 6 1134 \n","3 6 Chubbuck ID 6 1494 \n","4 12 Jerome ID 6 1155 \n","\n"," SHAPE \n","0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... \n","1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... \n","2 {\"x\": -12667411.402393516, \"y\": 5241722.820606... \n","3 {\"x\": -12520053.904151963, \"y\": 5300220.333409... \n","4 {\"x\": -12747828.64784961, \"y\": 5269214.8197742... "]},"execution_count":14,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","sub_sdf2.head()"]},{"cell_type":"code","execution_count":15,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:01.302923Z","start_time":"2021-11-22T19:52:01.295930Z"}},"outputs":[{"name":"stdout","output_type":"stream","text":["Memory used is: 0.01 MB\n"]}],"source":["# Check memory usage of current sdf\n","mem_used = sub_sdf2.memory_usage().sum() / (1024**2) # converting to megabytes\n","print(f'Memory used is: {round(mem_used, 2)} MB')"]},{"cell_type":"markdown","metadata":{},"source":["> Using `out_fields`, we have further reduced memory usage by subsetting the data and bringing only necessary information into the memory.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Create SeDF from FeatureSet\n"]},{"cell_type":"markdown","metadata":{},"source":["As mentioned earlier, the [**`query()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method returns a [`FeatureSet`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object. 
The `FeatureSet` object contains useful information about the data that can be accessed through its various properties.\n","\n","Let's use the `AGE_45_54` column to query the layer to get the result as a `FeatureSet` and check some of its properties.\n"]},{"cell_type":"code","execution_count":16,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:03.250472Z","start_time":"2021-11-22T19:52:02.945753Z"}},"outputs":[],"source":["# Filter feature layer to return a feature set.\n","fset = flayer.query(where=\"AGE_45_54 < 1500\")"]},{"cell_type":"code","execution_count":17,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:03.259475Z","start_time":"2021-11-22T19:52:03.253475Z"}},"outputs":[{"data":{"text/plain":["arcgis.features.feature.FeatureSet"]},"execution_count":17,"metadata":{},"output_type":"execute_result"}],"source":["# Check type\n","type(fset)"]},{"cell_type":"code","execution_count":18,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:03.411404Z","start_time":"2021-11-22T19:52:03.406402Z"}},"outputs":[{"data":{"text/plain":["316"]},"execution_count":18,"metadata":{},"output_type":"execute_result"}],"source":["# Check length\n","len(fset.features)"]},{"cell_type":"code","execution_count":19,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:03.633193Z","start_time":"2021-11-22T19:52:03.627193Z"}},"outputs":[{"data":{"text/plain":["{'x': -12462673.723706165,\n"," 'y': 5384674.994080178,\n"," 'spatialReference': {'wkid': 102100, 'latestWkid': 3857}}"]},"execution_count":19,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry of a feature in the featureset\n","fset.features[0].geometry"]},{"cell_type":"markdown","metadata":{},"source":["The `fields` property of a `FeatureSet` returns a list of dictionaries, each describing one column. 
Let's use the `fields` property to access information about the first column.\n"]},{"cell_type":"code","execution_count":20,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:04.088356Z","start_time":"2021-11-22T19:52:04.083359Z"}},"outputs":[{"data":{"text/plain":["{'name': 'FID',\n"," 'type': 'esriFieldTypeOID',\n"," 'alias': 'FID',\n"," 'sqlType': 'sqlTypeInteger',\n"," 'domain': None,\n"," 'defaultValue': None}"]},"execution_count":20,"metadata":{},"output_type":"execute_result"}],"source":["# Check details of a column in the feature set\n","fset.fields[0]"]},{"cell_type":"markdown","metadata":{},"source":["Let's get the names of the columns in the data.\n"]},{"cell_type":"code","execution_count":21,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:04.548802Z","start_time":"2021-11-22T19:52:04.542798Z"}},"outputs":[{"data":{"text/plain":["['FID', 'NAME', 'CLASS', 'ST', 'STFIPS']"]},"execution_count":21,"metadata":{},"output_type":"execute_result"}],"source":["# Get column names\n","f_names = [f['name'] for f in fset.fields]\n","f_names[:5]"]},{"cell_type":"markdown","metadata":{},"source":["Now, let's create a _Spatially enabled DataFrame_ from a `FeatureSet` using the [`.sdf`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=sdf#arcgis.features.FeatureSet.sdf) property.\n"]},{"cell_type":"code","execution_count":22,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:05.026469Z","start_time":"2021-11-22T19:52:05.006466Z"}},"outputs":[{"data":{"text/plain":["(316, 50)"]},"execution_count":22,"metadata":{},"output_type":"execute_result"}],"source":["# Create SeDF from FeatureSet\n","fset_df = fset.sdf\n","fset_df.shape"]},{"cell_type":"code","execution_count":23,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:05.276147Z","start_time":"2021-11-22T19:52:05.258149Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
|   | FID | NAME | CLASS | ST | STFIPS | PLACEFIPS | CAPITAL | POP_CLASS | POPULATION | POP2010 | ... | MARHH_NO_C | MHH_CHILD | FHH_CHILD | FAMILIES | AVE_FAM_SZ | HSE_UNITS | VACANT | OWNER_OCC | RENTER_OCC | SHAPE |
| 0 | 1 | Ammon | city | ID | 16 | 1601990 |  | 6 | 15181 | 13816 | ... | 1131 | 106 | 335 | 3352 | 3.61 | 4747 | 271 | 3205 | 1271 | {\"x\": -12462673.723706165, \"y\": 5384674.994080... |
| 1 | 2 | Blackfoot | city | ID | 16 | 1607840 |  | 6 | 11946 | 11899 | ... | 1081 | 174 | 381 | 2958 | 3.31 | 4547 | 318 | 2788 | 1441 | {\"x\": -12506251.313993266, \"y\": 5341537.793529... |
\n","

2 rows × 50 columns

\n","
"],"text/plain":[" FID NAME CLASS ST STFIPS PLACEFIPS CAPITAL POP_CLASS POPULATION \\\n","0 1 Ammon city ID 16 1601990 6 15181 \n","1 2 Blackfoot city ID 16 1607840 6 11946 \n","\n"," POP2010 ... MARHH_NO_C MHH_CHILD FHH_CHILD FAMILIES AVE_FAM_SZ \\\n","0 13816 ... 1131 106 335 3352 3.61 \n","1 11899 ... 1081 174 381 2958 3.31 \n","\n"," HSE_UNITS VACANT OWNER_OCC RENTER_OCC \\\n","0 4747 271 3205 1271 \n","1 4547 318 2788 1441 \n","\n"," SHAPE \n","0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... \n","1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... \n","\n","[2 rows x 50 columns]"]},"execution_count":23,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fset_df.head(2)"]},{"cell_type":"code","execution_count":24,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:05.502653Z","start_time":"2021-11-22T19:52:05.496653Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":24,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","fset_df.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a `FeatureSet`.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Create SeDF from FeatureCollection\n"]},{"cell_type":"markdown","metadata":{},"source":["Tools within the ArcGIS API for Python often return a [FeatureCollection](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#featurecollection) object as a result of some analysis. A `FeatureCollection` is an in-memory collection of [Feature](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.Feature) objects with rendering information. Similar to feature layers, feature collections can also be used to store features. With a feature collection, a service is not created to serve out the feature data.\n","\n","Let's create a `SeDF` from a FeatureCollection. 
Here, we:\n","\n","- Import the [Major Ports](https://www.arcgis.com/home/item.html?id=405963eaea24428c9db236ec289760eb) feature layer.\n","- Create 5-mile buffers using the [`create_buffers()`](https://developers.arcgis.com/python/api-reference/arcgis.features.use_proximity.html#create-buffers) tool, which returns a FeatureCollection.\n","- Call the [query()](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method on the buffered FeatureCollection, which returns a [FeatureSet](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object, and create a `SeDF` from it using the [`.sdf`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=sdf#arcgis.features.FeatureSet.sdf) property of the FeatureSet.\n"]},{"cell_type":"code","execution_count":25,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:10.071650Z","start_time":"2021-11-22T19:52:09.984656Z"}},"outputs":[{"data":{"text/html":["
\n","
\n"," \n"," \n"," \n","
\n","\n","
\n"," Major Ports\n"," \n","
This feature layer, utilizing data from the U.S. Department of Transportation, depicts Major Ports in the United States by total tonnage. Feature Layer Collection by Federal_User_Community\n","
Last Modified: October 27, 2021\n","
0 comments, 157,223 views\n","
\n","
\n"," "],"text/plain":[""]},"execution_count":25,"metadata":{},"output_type":"execute_result"}],"source":["# Get the ports item\n","ports_item = gis.content.get(\"405963eaea24428c9db236ec289760eb\")\n","ports_item"]},{"cell_type":"code","execution_count":26,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:10.203674Z","start_time":"2021-11-22T19:52:10.197678Z"}},"outputs":[{"data":{"text/plain":[""]},"execution_count":26,"metadata":{},"output_type":"execute_result"}],"source":["# Get the ports layer\n","ports_lyr = ports_item.layers[0]\n","ports_lyr"]},{"cell_type":"code","execution_count":27,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:40.150295Z","start_time":"2021-11-22T19:52:10.430626Z"}},"outputs":[],"source":["# Create buffers\n","from arcgis.features.use_proximity import create_buffers\n","ports_buffer50 = create_buffers(\n"," ports_lyr, distances=[5], units='Miles', gis=agol_gis)"]},{"cell_type":"code","execution_count":28,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:40.159300Z","start_time":"2021-11-22T19:52:40.154296Z"}},"outputs":[{"data":{"text/plain":["arcgis.features.feature.FeatureCollection"]},"execution_count":28,"metadata":{},"output_type":"execute_result"}],"source":["# Check type of result from the analysis\n","type(ports_buffer50)"]},{"cell_type":"markdown","metadata":{},"source":["> The `create_buffers()` tool resulted in a `FeatureCollection`.\n","\n","Now, we will create a `SeDF` from the `FeatureCollection` object.\n"]},{"cell_type":"code","execution_count":29,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:40.197297Z","start_time":"2021-11-22T19:52:40.162296Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
|   | OBJECTID_1 | OBJECTID | ID | PORT | PORT_NAME | GRAND_TOTA | FOREIGN_TO | IMPORTS | EXPORTS | DOMESTIC | BUFF_DIST | ORIG_FID | AnalysisArea | SHAPE |
| 0 | 1 | 1 | 124 | C4947 | Unalaska Island, AK | 1652281 | 1236829 | 426251 | 810578 | 415452 | 5 | 1 | 78.528402 | {\"rings\": [[[-18806114.3995, 7138385.537799999... |
| 1 | 2 | 2 | 85 | C4410 | Kahului, Maui, HI | 3615449 | 20391 | 20391 | 0 | 3595058 | 5 | 2 | 78.528402 | {\"rings\": [[[-17418472.419, 2388455.4312999994... |
\n","
"],"text/plain":[" OBJECTID_1 OBJECTID ID PORT PORT_NAME GRAND_TOTA \\\n","0 1 1 124 C4947 Unalaska Island, AK 1652281 \n","1 2 2 85 C4410 Kahului, Maui, HI 3615449 \n","\n"," FOREIGN_TO IMPORTS EXPORTS DOMESTIC BUFF_DIST ORIG_FID AnalysisArea \\\n","0 1236829 426251 810578 415452 5 1 78.528402 \n","1 20391 20391 0 3595058 5 2 78.528402 \n","\n"," SHAPE \n","0 {\"rings\": [[[-18806114.3995, 7138385.537799999... \n","1 {\"rings\": [[[-17418472.419, 2388455.4312999994... "]},"execution_count":29,"metadata":{},"output_type":"execute_result"}],"source":["# Create SeDF\n","sedf_fc = ports_buffer50.query().sdf\n","sedf_fc.head(2)"]},{"cell_type":"code","execution_count":30,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:40.205296Z","start_time":"2021-11-22T19:52:40.199296Z"}},"outputs":[{"data":{"text/plain":["['polygon']"]},"execution_count":30,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","sedf_fc.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a `FeatureCollection`.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in local GIS data\n"]},{"cell_type":"markdown","metadata":{},"source":["Local geospatial data, such as [`Feature classes`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm) and [`shapefiles`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm) can be easily accessed using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor). The [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method can be used to access local data. 
Let's look at some examples.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Reading a Shapefile\n","\n","A locally stored [`shapefile`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm) can be accessed by passing the location of the file in the [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: In the absence of arcpy, the PyShp package must be present in your current conda environment in order to read shapefiles. To check if PyShp is present, you can run the following in a cell:\n"," \n"," !conda list pyshp\n"," \n","To install PyShp, you can run the following in a cell:\n"," \n"," !conda install pyshp\n","
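Before reading local data, it can help to confirm which reader libraries the current environment provides. The sketch below is an illustration rather than part of the ArcGIS API: the helper name `available_readers` is made up, and it relies on PyShp being imported under the module name `shapefile`.

```python
import importlib.util

# Import names can differ from package names: PyShp installs the module 'shapefile'.
READERS = {'arcpy': 'arcpy', 'PyShp': 'shapefile', 'Fiona': 'fiona'}


def available_readers():
    '''Return a dict mapping each reader package to True/False (importable or not).'''
    return {name: importlib.util.find_spec(module) is not None
            for name, module in READERS.items()}


print(available_readers())
```

If `arcpy` is reported as missing, `PyShp` is the package that matters for shapefiles and `Fiona` for feature classes in a File Geodatabase.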
\n"]},{"cell_type":"code","execution_count":31,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:41.031721Z","start_time":"2021-11-22T19:52:40.207294Z"}},"outputs":[{"data":{"text/plain":["(3886, 51)"]},"execution_count":31,"metadata":{},"output_type":"execute_result"}],"source":["# Reading from shape file\n","shp_df = pd.DataFrame.spatial.from_featureclass(\n"," location=\"./sedf_data/cities/cities.shp\")\n","shp_df.shape"]},{"cell_type":"code","execution_count":32,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:41.058722Z","start_time":"2021-11-22T19:52:41.034722Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":32,"metadata":{},"output_type":"execute_result"}],"source":["shp_df.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from the `shapefile` stored locally.\n"]},{"cell_type":"markdown","metadata":{},"source":["##### Shapefile from a URL\n","\n","The url of a zipped `shapefile` can be used to create a `SeDF` by passing the url as location in the `from_featureclass()` method. The image below shows how the operation can be performed.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: This operation requires PyShp to be available in the environment.\n","\n","
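As a minimal sketch of the URL case, the call has the same shape as for a local file. The wrapper name `read_zipped_shapefile` and the commented example URL are hypothetical; the imports sit inside the function so that defining it does not require `arcgis` to be installed.

```python
def read_zipped_shapefile(url):
    '''Build a SeDF from a zipped shapefile served over HTTP(S).'''
    import pandas as pd
    from arcgis.features import GeoAccessor  # registers the .spatial accessor

    return pd.DataFrame.spatial.from_featureclass(location=url)


# Hypothetical URL, shown only to illustrate the call:
# url_df = read_zipped_shapefile('https://example.com/data/cities.zip')
# url_df.spatial.geometry_type
```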
\n"]},{"attachments":{"image.png":{"image/png":""}},"cell_type":"markdown","metadata":{},"source":["![image.png](attachment:image.png)\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Reading a Featureclass\n","\n","A [featureclass](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm) can be accessed from a File Geodatabase by passing its location in the [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: In the absence of arcpy, the Fiona package must be present in your current conda environment in order to read a featureclass.\n"," To check if Fiona is present, you can run the following in a cell:\n"," \n"," !conda list fiona\n"," \n","To install Fiona, you can run the following in a cell:\n"," \n"," !conda install fiona\n","
\n"]},{"cell_type":"code","execution_count":33,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:41.480724Z","start_time":"2021-11-22T19:52:41.062722Z"}},"outputs":[{"data":{"text/plain":["(3886, 51)"]},"execution_count":33,"metadata":{},"output_type":"execute_result"}],"source":["# Reading from FGDB\n","fcls_df = pd.DataFrame.spatial.from_featureclass(\n"," location=\"./sedf_data/cities/cities.gdb/cities\")\n","fcls_df.shape"]},{"cell_type":"code","execution_count":34,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:41.506721Z","start_time":"2021-11-22T19:52:41.485728Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
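<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>OBJECTID</th><th>age_10_14</th><th>age_15_19</th><th>age_20_24</th><th>age_25_34</th><th>age_35_44</th><th>age_45_54</th><th>age_55_64</th><th>age_5_9</th><th>age_65_74</th><th>...</th><th>placefips</th><th>pop2010</th><th>population</th><th>pop_class</th><th>renter_occ</th><th>st</th><th>stfips</th><th>vacant</th><th>white</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1</td><td>1313</td><td>1058</td><td>734</td><td>2031</td><td>1767</td><td>1446</td><td>1136</td><td>1503</td><td>665</td><td>...</td><td>1601990</td><td>13816</td><td>15181</td><td>6</td><td>1271</td><td>ID</td><td>16</td><td>271</td><td>13002</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>2</td><td>890</td><td>817</td><td>818</td><td>1799</td><td>1235</td><td>1330</td><td>1143</td><td>1099</td><td>721</td><td>...</td><td>1607840</td><td>11899</td><td>11946</td><td>6</td><td>1441</td><td>ID</td><td>16</td><td>318</td><td>9893</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>2 rows × 51 columns</p>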
"],"text/plain":[" OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n","0 1 1313 1058 734 2031 1767 1446 \n","1 2 890 817 818 1799 1235 1330 \n","\n"," age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n","0 1136 1503 665 ... 1601990 13816 15181 \n","1 1143 1099 721 ... 1607840 11899 11946 \n","\n"," pop_class renter_occ st stfips vacant white \\\n","0 6 1271 ID 16 271 13002 \n","1 6 1441 ID 16 318 9893 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","\n","[2 rows x 51 columns]"]},"execution_count":34,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_df.head(2)"]},{"cell_type":"code","execution_count":35,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:52:41.525725Z","start_time":"2021-11-22T19:52:41.510728Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":35,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","fcls_df.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from the `featureclass` stored locally.\n"]},{"cell_type":"markdown","metadata":{},"source":["**Specify optional parameters**\n"]},{"cell_type":"markdown","metadata":{},"source":["The **`from_featureclass()`** method allows users to specify optional parameters when the `ArcPy` library is available in the current environment. These parameters are:\n","\n","- `sql_clause`: a pair of SQL prefix and postfix clauses, `sql_clause=(prefix,postfix)`, organized in a list or a tuple can be passed to query specific data. The parameter allows only a small set of operations to be performed. Learn more about the allowed operations [here](https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/searchcursor-class.htm).\n","- `where_clause`: where statement to subset the data. 
Learn more about it [here](https://pro.arcgis.com/en/pro-app/latest/help/mapping/navigation/sql-reference-for-elements-used-in-query-expressions.htm).\n","- `fields`: to subset the data for specific fields.\n","- `spatial_filter`: a geometry object to filter the results.\n"]},{"cell_type":"markdown","metadata":{},"source":["
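Where arcpy is not installed, a comparable subset can usually be taken with plain pandas after the full dataset has been loaded. The sketch below assumes a DataFrame with the `st`, `pop_class`, and `SHAPE` columns used in this guide; the helper name `subset_rows_and_fields` is hypothetical and is not part of the API. Unlike `where_clause`, the filtering here happens in memory after the read.

```python
def subset_rows_and_fields(df, state, pop_class, fields):
    '''Keep rows matching a state and population class, plus selected fields.'''
    # Boolean mask that plays the role of where_clause
    mask = (df['st'] == state) & (df['pop_class'] == pop_class)
    # Always carry the geometry column along with the requested fields
    keep = list(fields) + ['SHAPE']
    return df.loc[mask, keep]


# Example call on a DataFrame loaded with from_featureclass():
# subset_rows_and_fields(fcls_df, 'ID', 6, ['st', 'pop_class'])
```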
\n"," Note: The operations below can only be performed in an environment that contains arcpy.\n","\n","
\n"]},{"cell_type":"markdown","metadata":{},"source":["##### Subset data for specific fields\n"]},{"cell_type":"code","execution_count":36,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:03.504998Z","start_time":"2021-11-22T19:55:03.352999Z"}},"outputs":[{"data":{"text/plain":["(3886, 3)"]},"execution_count":36,"metadata":{},"output_type":"execute_result"}],"source":["# Subset for fields\n","fcls_flds = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," fields=['st', 'pop_class'])\n","fcls_flds.shape"]},{"cell_type":"code","execution_count":37,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:04.378124Z","start_time":"2021-11-22T19:55:04.367127Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
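<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>st</th><th>pop_class</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>ID</td><td>6</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>ID</td><td>6</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"</tbody>\n",
"</table>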
"],"text/plain":[" st pop_class SHAPE\n","0 ID 6 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ...\n","1 ID 6 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"..."]},"execution_count":37,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_flds.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["##### Subset using `where_clause`\n","\n","Learn more about how to use `where_clause` [here](https://pro.arcgis.com/en/pro-app/latest/help/mapping/navigation/sql-reference-for-elements-used-in-query-expressions.htm).\n"]},{"cell_type":"code","execution_count":38,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:06.957737Z","start_time":"2021-11-22T19:55:06.876084Z"}},"outputs":[{"data":{"text/plain":["(15, 51)"]},"execution_count":38,"metadata":{},"output_type":"execute_result"}],"source":["# Subset using where_clause\n","fcls_whr = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," where_clause=\"st='ID' and pop_class=6\")\n","fcls_whr.shape"]},{"cell_type":"code","execution_count":39,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:07.111124Z","start_time":"2021-11-22T19:55:07.093125Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
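<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>OBJECTID</th><th>age_10_14</th><th>age_15_19</th><th>age_20_24</th><th>age_25_34</th><th>age_35_44</th><th>age_45_54</th><th>age_55_64</th><th>age_5_9</th><th>age_65_74</th><th>...</th><th>placefips</th><th>pop2010</th><th>population</th><th>pop_class</th><th>renter_occ</th><th>st</th><th>stfips</th><th>vacant</th><th>white</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>1</td><td>1313</td><td>1058</td><td>734</td><td>2031</td><td>1767</td><td>1446</td><td>1136</td><td>1503</td><td>665</td><td>...</td><td>1601990</td><td>13816</td><td>15181</td><td>6</td><td>1271</td><td>ID</td><td>16</td><td>271</td><td>13002</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>2</td><td>890</td><td>817</td><td>818</td><td>1799</td><td>1235</td><td>1330</td><td>1143</td><td>1099</td><td>721</td><td>...</td><td>1607840</td><td>11899</td><td>11946</td><td>6</td><td>1441</td><td>ID</td><td>16</td><td>318</td><td>9893</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>2 rows × 51 columns</p>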
"],"text/plain":[" OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n","0 1 1313 1058 734 2031 1767 1446 \n","1 2 890 817 818 1799 1235 1330 \n","\n"," age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n","0 1136 1503 665 ... 1601990 13816 15181 \n","1 1143 1099 721 ... 1607840 11899 11946 \n","\n"," pop_class renter_occ st stfips vacant white \\\n","0 6 1271 ID 16 271 13002 \n","1 6 1441 ID 16 318 9893 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","\n","[2 rows x 51 columns]"]},"execution_count":39,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_whr.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["##### Subset using `fields` and `where_clause`\n"]},{"cell_type":"code","execution_count":40,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:08.852159Z","start_time":"2021-11-22T19:55:08.788159Z"}},"outputs":[{"data":{"text/plain":["(15, 5)"]},"execution_count":40,"metadata":{},"output_type":"execute_result"}],"source":["# Subset using where_clause\n","flds_whr = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," fields=[\n"," 'st', 'pop_class', 'age_10_14', 'age_15_19'],\n"," where_clause=\"st='ID' and pop_class=6\")\n","flds_whr.shape"]},{"cell_type":"code","execution_count":41,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:09.717600Z","start_time":"2021-11-22T19:55:09.706606Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
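<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>st</th><th>pop_class</th><th>age_10_14</th><th>age_15_19</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>ID</td><td>6</td><td>1313</td><td>1058</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>ID</td><td>6</td><td>890</td><td>817</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"</tbody>\n",
"</table>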
"],"text/plain":[" st pop_class age_10_14 age_15_19 \\\n","0 ID 6 1313 1058 \n","1 ID 6 890 817 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... "]},"execution_count":41,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","flds_whr.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["##### Subset using `sql_clause`\n","\n","`sql_clause` can be combined with `fields` and `where_clause` to further subset the data. You can learn more about the allowed operations [here](https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/searchcursor-class.htm). Now let's look at some examples.\n"]},{"cell_type":"markdown","metadata":{},"source":["###### Prefix `sql_clause` - DISTINCT operation\n"]},{"cell_type":"code","execution_count":42,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:12.045052Z","start_time":"2021-11-22T19:55:11.704053Z"}},"outputs":[{"data":{"text/plain":["(3886, 51)"]},"execution_count":42,"metadata":{},"output_type":"execute_result"}],"source":["# Prefix Sql clause - DISTINCT operation\n","fcls_sql1 = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," sql_clause=(\"DISTINCT pop_class\", None))\n","\n","# Check shape\n","fcls_sql1.shape"]},{"cell_type":"code","execution_count":43,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:12.590891Z","start_time":"2021-11-22T19:55:12.570891Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
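<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>OBJECTID</th><th>age_10_14</th><th>age_15_19</th><th>age_20_24</th><th>age_25_34</th><th>age_35_44</th><th>age_45_54</th><th>age_55_64</th><th>age_5_9</th><th>age_65_74</th><th>...</th><th>placefips</th><th>pop2010</th><th>population</th><th>pop_class</th><th>renter_occ</th><th>st</th><th>stfips</th><th>vacant</th><th>white</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>941</td><td>1247</td><td>1213</td><td>1043</td><td>2022</td><td>1692</td><td>2116</td><td>1827</td><td>1187</td><td>1037</td><td>...</td><td>0507330</td><td>15620</td><td>14771</td><td>6</td><td>3006</td><td>AR</td><td>05</td><td>1303</td><td>6216</td><td>{\"x\": -10006810.091, \"y\": 4290154.581699997, \"...</td></tr>\n",
"<tr><th>1</th><td>1405</td><td>796</td><td>748</td><td>754</td><td>1999</td><td>1717</td><td>2062</td><td>1450</td><td>760</td><td>851</td><td>...</td><td>2466850</td><td>12677</td><td>13188</td><td>6</td><td>814</td><td>MD</td><td>24</td><td>281</td><td>11613</td><td>{\"x\": -8517714.7855, \"y\": 4744316.880199999, \"...</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>2 rows × 51 columns</p>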
"],"text/plain":[" OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n","0 941 1247 1213 1043 2022 1692 2116 \n","1 1405 796 748 754 1999 1717 2062 \n","\n"," age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n","0 1827 1187 1037 ... 0507330 15620 14771 \n","1 1450 760 851 ... 2466850 12677 13188 \n","\n"," pop_class renter_occ st stfips vacant white \\\n","0 6 3006 AR 05 1303 6216 \n","1 6 814 MD 24 281 11613 \n","\n"," SHAPE \n","0 {\"x\": -10006810.091, \"y\": 4290154.581699997, \"... \n","1 {\"x\": -8517714.7855, \"y\": 4744316.880199999, \"... \n","\n","[2 rows x 51 columns]"]},"execution_count":43,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_sql1.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["###### Postfix `sql_clause` with specific fields\n","\n","Here, we will subset the data for the state and population class fields and apply a postfix clause.\n"]},{"cell_type":"code","execution_count":44,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:13.456845Z","start_time":"2021-11-22T19:55:13.280450Z"}},"outputs":[{"data":{"text/plain":["(3886, 3)"]},"execution_count":44,"metadata":{},"output_type":"execute_result"}],"source":["# Postfix Sql clause with specific fields\n","fcls_sql2 = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," fields=['st', 'pop_class'],\n"," sql_clause=(None, \"ORDER BY st, pop_class\"))\n","# Check shape\n","fcls_sql2.shape"]},{"cell_type":"code","execution_count":45,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:14.111586Z","start_time":"2021-11-22T19:55:14.100594Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
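<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>st</th><th>pop_class</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>AK</td><td>6</td><td>{\"x\": -16417572.1606, \"y\": 9562359.403800003, ...</td></tr>\n",
"<tr><th>1</th><td>AK</td><td>6</td><td>{\"x\": -16455422.2224, \"y\": 9574022.0224, \"spat...</td></tr>\n",
"<tr><th>2</th><td>AK</td><td>6</td><td>{\"x\": -16444303.0276, \"y\": 9568008.9705, \"spat...</td></tr>\n",
"<tr><th>3</th><td>AK</td><td>6</td><td>{\"x\": -14962313.3618, \"y\": 8031014.926600002, ...</td></tr>\n",
"<tr><th>4</th><td>AK</td><td>6</td><td>{\"x\": -16657118.680399999, \"y\": 8746757.662600...</td></tr>\n",
"</tbody>\n",
"</table>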
"],"text/plain":[" st pop_class SHAPE\n","0 AK 6 {\"x\": -16417572.1606, \"y\": 9562359.403800003, ...\n","1 AK 6 {\"x\": -16455422.2224, \"y\": 9574022.0224, \"spat...\n","2 AK 6 {\"x\": -16444303.0276, \"y\": 9568008.9705, \"spat...\n","3 AK 6 {\"x\": -14962313.3618, \"y\": 8031014.926600002, ...\n","4 AK 6 {\"x\": -16657118.680399999, \"y\": 8746757.662600..."]},"execution_count":45,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_sql2.head()"]},{"cell_type":"markdown","metadata":{},"source":["###### Prefix and Postfix `sql_clause` with specific fields and `where_clause`\n","\n","Here, we will subset the data using `where_clause`, keep specific fields, and then apply both prefix and postfix clause.\n"]},{"cell_type":"code","execution_count":48,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:51.001847Z","start_time":"2021-11-22T19:55:50.922841Z"}},"outputs":[{"data":{"text/plain":["(22, 5)"]},"execution_count":48,"metadata":{},"output_type":"execute_result"}],"source":["# Prefix and Postfix sql_clause\n","fcls_sql3_df = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," fields=[\n"," 'st', 'name', 'pop_class', 'age_10_14'],\n"," where_clause=\"st='ID'\",\n"," sql_clause=(\"DISTINCT pop_class\", \"ORDER BY name\"))\n","\n","# Check Shape\n","fcls_sql3_df.shape"]},{"cell_type":"code","execution_count":49,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T19:55:51.761628Z","start_time":"2021-11-22T19:55:51.749637Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
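<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>st</th><th>name</th><th>pop_class</th><th>age_10_14</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>ID</td><td>Ammon</td><td>6</td><td>1313</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>ID</td><td>Blackfoot</td><td>6</td><td>890</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"<tr><th>2</th><td>ID</td><td>Boise City</td><td>8</td><td>12750</td><td>{\"x\": -12938676.683600001, \"y\": 5403597.049500...</td></tr>\n",
"<tr><th>3</th><td>ID</td><td>Burley</td><td>6</td><td>790</td><td>{\"x\": -12667411.4024, \"y\": 5241722.820600003, ...</td></tr>\n",
"<tr><th>4</th><td>ID</td><td>Caldwell</td><td>7</td><td>3803</td><td>{\"x\": -12989383.6745, \"y\": 5413226.487300001, ...</td></tr>\n",
"</tbody>\n",
"</table>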
"],"text/plain":[" st name pop_class age_10_14 \\\n","0 ID Ammon 6 1313 \n","1 ID Blackfoot 6 890 \n","2 ID Boise City 8 12750 \n","3 ID Burley 6 790 \n","4 ID Caldwell 7 3803 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","2 {\"x\": -12938676.683600001, \"y\": 5403597.049500... \n","3 {\"x\": -12667411.4024, \"y\": 5241722.820600003, ... \n","4 {\"x\": -12989383.6745, \"y\": 5413226.487300001, ... "]},"execution_count":49,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_sql3_df.head()"]},{"cell_type":"markdown","metadata":{},"source":["##### Using `spatial_filter`\n"]},{"cell_type":"markdown","metadata":{},"source":["`spatial_filter` can be used to query the results by using a spatial relationship with another geometry. The spatial filtering is even more powerful when integrated with [Geoenrichment](https://developers.arcgis.com/python/guide/part1-introduction-to-geoenrichment/). Let's use this approach to filter our results for the state of Idaho. In this example, we will:\n","\n","- use `arcgis.geoenrichment.Country` to derive the geometries for the state of Idaho.\n","- use `arcgis.geometry.filters.intersects(geometry, sr=None)` to create a geometry filter object that filters results whose geometry intersects with the specified geometry (i.e. filter data points within the boundary of Idaho).\n","- pass the geometry filter object to `spatial_filter` to get desired results.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: To perform enrichment operations, GeoEnrichment must be configured in your GIS organization. GeoEnrichment consumes credits, and you can learn more about credit consumption here. \n","
\n"]},{"cell_type":"code","execution_count":51,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:03:11.171059Z","start_time":"2021-11-22T20:03:11.154058Z"}},"outputs":[],"source":["# Basic Imports\n","from arcgis.geometry import Geometry\n","from arcgis.geometry.filters import intersects\n","from arcgis.geoenrichment import Country"]},{"cell_type":"code","execution_count":59,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:08:43.643602Z","start_time":"2021-11-22T20:08:43.139513Z"}},"outputs":[{"data":{"text/plain":["arcgis.geoenrichment.enrichment.Country"]},"execution_count":59,"metadata":{},"output_type":"execute_result"}],"source":["# Create country object\n","usa = Country.get('US', gis=agol_gis)\n","type(usa)"]},{"cell_type":"code","execution_count":62,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:08:49.034325Z","start_time":"2021-11-22T20:08:47.854467Z"}},"outputs":[{"data":{"text/plain":[""]},"metadata":{},"output_type":"display_data"},{"data":{"image/svg+xml":[""],"text/plain":[""]},"execution_count":62,"metadata":{},"output_type":"execute_result"}],"source":["# Get boundaries for Idaho\n","named_area_ID = usa.search(query='Idaho', layers=['US.States'])\n","display(named_area_ID[0])\n","named_area_ID[0].geometry.as_arcpy"]},{"cell_type":"code","execution_count":64,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:10:38.529463Z","start_time":"2021-11-22T20:10:38.524455Z"}},"outputs":[{"data":{"text/plain":["{'wkid': 4326, 'latestWkid': 4326}"]},"execution_count":64,"metadata":{},"output_type":"execute_result"}],"source":["# Create spatial reference\n","sr_id = named_area_ID[0].geometry[\"spatialReference\"]\n","sr_id"]},{"cell_type":"code","execution_count":66,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:12:24.265943Z","start_time":"2021-11-22T20:12:24.259940Z"}},"outputs":[{"data":{"text/plain":["dict"]},"execution_count":66,"metadata":{},"output_type":"execute_result"}],"source":["# Construct a geometry filter using the filter 
geometry\n","id_state_filter = intersects(named_area_ID[0].geometry,\n"," sr=sr_id)\n","type(id_state_filter)"]},{"cell_type":"code","execution_count":71,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:19:12.104168Z","start_time":"2021-11-22T20:19:10.973170Z"},"code_folding":[]},"outputs":[{"data":{"text/plain":["(22, 5)"]},"execution_count":71,"metadata":{},"output_type":"execute_result"}],"source":["# Pass geometry filter object as a spatial_filter\n","fcls_spfl_df = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n"," fields=[\n"," 'st', 'name', 'pop_class', 'age_10_14'],\n"," spatial_filter=id_state_filter)\n","# Check shape\n","fcls_spfl_df.shape"]},{"cell_type":"code","execution_count":73,"metadata":{"ExecuteTime":{"end_time":"2021-11-22T20:26:39.851895Z","start_time":"2021-11-22T20:26:39.840893Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
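<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>st</th><th>name</th><th>pop_class</th><th>age_10_14</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>ID</td><td>Ammon</td><td>6</td><td>1313</td><td>{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...</td></tr>\n",
"<tr><th>1</th><td>ID</td><td>Blackfoot</td><td>6</td><td>890</td><td>{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...</td></tr>\n",
"<tr><th>2</th><td>ID</td><td>Boise City</td><td>8</td><td>12750</td><td>{\"x\": -12938676.683600001, \"y\": 5403597.049500...</td></tr>\n",
"<tr><th>3</th><td>ID</td><td>Burley</td><td>6</td><td>790</td><td>{\"x\": -12667411.4024, \"y\": 5241722.820600003, ...</td></tr>\n",
"<tr><th>4</th><td>ID</td><td>Caldwell</td><td>7</td><td>3803</td><td>{\"x\": -12989383.6745, \"y\": 5413226.487300001, ...</td></tr>\n",
"</tbody>\n",
"</table>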
"],"text/plain":[" st name pop_class age_10_14 \\\n","0 ID Ammon 6 1313 \n","1 ID Blackfoot 6 890 \n","2 ID Boise City 8 12750 \n","3 ID Burley 6 790 \n","4 ID Caldwell 7 3803 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","2 {\"x\": -12938676.683600001, \"y\": 5403597.049500... \n","3 {\"x\": -12667411.4024, \"y\": 5241722.820600003, ... \n","4 {\"x\": -12989383.6745, \"y\": 5413226.487300001, ... "]},"execution_count":73,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","fcls_spfl_df.head()"]},{"cell_type":"markdown","metadata":{},"source":["> The result shows the data points filtered for Idaho as defined by the spatial filter.\n","\n","You can learn more about applying spatial filters in our [Working with geometries](https://developers.arcgis.com/python/guide/part4-spatial-filters/#arcgis.geometry.filters-module) guide series.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in DataFrame with Addresses\n"]},{"cell_type":"markdown","metadata":{},"source":["A `SeDF` can be easily created from a DataFrame with address information using the [`from_df()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_featureclass#arcgis.features.GeoAccessor.from_df) method. This method geocodes the addresses using the first configured geocoder in your GIS. The locations generated after geocoding are used as the geometry of the SeDF.\n","\n","You can learn more about geocoding in our [Finding Places with geocoding](https://developers.arcgis.com/python/guide/part1-what-is-geocoding/) guide series.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: The from_df() method performs a batch geocoding operation which consumes credits. If a geocoder is not specified, then the first configured geocoder in your GIS organization will be used. Learn more about credit consumption here.\n","\n","To avoid credit consumption, you may specify your own `geocoder`.\n","\n","
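A sketch of passing a specific geocoder, assuming `from_df()` accepts it through a `geocoder` argument and that `arcgis.geocoding.Geocoder` can be built from a locator URL. The helper name `geocode_orders` and its `geocoder_url` argument are hypothetical; whether credits are consumed depends on the geocoder you point to.

```python
def geocode_orders(orders_df, gis, geocoder_url):
    '''Create a SeDF from an address column using an explicitly chosen geocoder.'''
    import pandas as pd
    from arcgis.features import GeoAccessor  # registers the .spatial accessor
    from arcgis.geocoding import Geocoder

    # Assumption: geocoder_url points to a locator service you are allowed to use
    my_geocoder = Geocoder(geocoder_url, gis=gis)
    return pd.DataFrame.spatial.from_df(
        df=orders_df, address_column='Address', geocoder=my_geocoder)
```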
\n"]},{"cell_type":"markdown","metadata":{},"source":["Let's look at an example of using `from_df()`. We will read addresses into a DataFrame using the [`pd.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) method. Next, we will create a SeDF by passing the DataFrame and address column as parameters to the `from_df()` method.\n"]},{"cell_type":"code","execution_count":48,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:34:12.278791Z","start_time":"2021-11-11T22:34:12.267792Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
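<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Address</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>602 Murray Cir, Sausalito, CA 94965</td></tr>\n",
"<tr><th>1</th><td>340 Stockton St, San Francisco, CA 94108</td></tr>\n",
"<tr><th>2</th><td>3619 Balboa St, San Francisco, CA 94121</td></tr>\n",
"<tr><th>3</th><td>1274 El Camino Real, San Bruno, CA 94066</td></tr>\n",
"<tr><th>4</th><td>625 Monterey Blvd, San Francisco, CA 94127</td></tr>\n",
"</tbody>\n",
"</table>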
"],"text/plain":[" Address\n","0 602 Murray Cir, Sausalito, CA 94965\n","1 340 Stockton St, San Francisco, CA 94108\n","2 3619 Balboa St, San Francisco, CA 94121\n","3 1274 El Camino Real, San Bruno, CA 94066\n","4 625 Monterey Blvd, San Francisco, CA 94127"]},"execution_count":48,"metadata":{},"output_type":"execute_result"}],"source":["# Read the csv file with address into a DataFrame\n","orders_df = pd.read_csv(\"./sedf_data/cities/orders.csv\")\n","\n","# Check head\n","orders_df.head()"]},{"cell_type":"markdown","metadata":{},"source":["> The DataFrame shows a column with address information.\n"]},{"cell_type":"code","execution_count":53,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:35:55.437412Z","start_time":"2021-11-11T22:35:53.956939Z"}},"outputs":[{"data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
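<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>Address</th><th>SHAPE</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>602 Murray Cir, Sausalito, CA 94965</td><td>{\"x\": -122.47885242199999, \"y\": 37.83735920100...</td></tr>\n",
"<tr><th>1</th><td>340 Stockton St, San Francisco, CA 94108</td><td>{\"x\": -122.44955096499996, \"y\": 37.73152250200...</td></tr>\n",
"<tr><th>2</th><td>3619 Balboa St, San Francisco, CA 94121</td><td>{\"x\": -122.49772620499999, \"y\": 37.77567413500...</td></tr>\n",
"<tr><th>3</th><td>1274 El Camino Real, San Bruno, CA 94066</td><td>{\"x\": -122.40685153899994, \"y\": 37.78910429100...</td></tr>\n",
"<tr><th>4</th><td>625 Monterey Blvd, San Francisco, CA 94127</td><td>{\"x\": -122.42218381299995, \"y\": 37.63856151200...</td></tr>\n",
"</tbody>\n",
"</table>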
"],"text/plain":[" Address \\\n","0 602 Murray Cir, Sausalito, CA 94965 \n","1 340 Stockton St, San Francisco, CA 94108 \n","2 3619 Balboa St, San Francisco, CA 94121 \n","3 1274 El Camino Real, San Bruno, CA 94066 \n","4 625 Monterey Blvd, San Francisco, CA 94127 \n","\n"," SHAPE \n","0 {\"x\": -122.47885242199999, \"y\": 37.83735920100... \n","1 {\"x\": -122.44955096499996, \"y\": 37.73152250200... \n","2 {\"x\": -122.49772620499999, \"y\": 37.77567413500... \n","3 {\"x\": -122.40685153899994, \"y\": 37.78910429100... \n","4 {\"x\": -122.42218381299995, \"y\": 37.63856151200... "]},"execution_count":53,"metadata":{},"output_type":"execute_result"}],"source":["# Use from_df to create SeDF\n","orders_sdf = pd.DataFrame.spatial.from_df(\n"," df=orders_df, address_column=\"Address\")\n","orders_sdf.head()"]},{"cell_type":"code","execution_count":54,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:35:57.704801Z","start_time":"2021-11-11T22:35:57.697804Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":54,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","orders_sdf.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a Pandas DataFrame with address information.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in DataFrame with Lat/Long Information\n"]},{"cell_type":"markdown","metadata":{},"source":["As we saw in part-1 of this guide series, a SeDF can be created from any Pandas DataFrame with location information (Latitude and Longitude) using the [`from_xy()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_xy) method.\n","\n","Let's look at an example. 
We will read the data with latitude and longitude information into a DataFrame using the [`pd.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) method. Then, we will create a SeDF by passing the DataFrame, latitude, and longitude as parameters to the `from_xy()` method.\n"]},{"cell_type":"code","execution_count":55,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.669481Z","start_time":"2021-11-11T22:36:07.650485Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","2 PARKWAY MANOR MARION IL \n","3 AVANTARA LONG GROVE LONG GROVE IL \n","4 HARMONY NURSING & REHAB CENTER CHICAGO IL \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","2 0 0 \n","3 6 141 \n","4 19 75 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","2 0 131 \n","3 0 195 \n","4 16 180 \n","\n"," Total Number of Occupied Beds LONGITUDE LATITUDE \n","0 61 -87.792973 42.012012 \n","1 43 -85.197651 40.392722 \n","2 84 -88.982944 37.750143 \n","3 131 -87.986442 42.160843 \n","4 116 -87.726353 41.975505 "]},"execution_count":55,"metadata":{},"output_type":"execute_result"}],"source":["# Read the data\n","cms_df = pd.read_csv('./sedf_data/cities/sample_cms_data.csv')\n","\n","# Return the first 5 records\n","cms_df.head()"]},{"cell_type":"code","execution_count":56,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.692482Z","start_time":"2021-11-11T22:36:07.672485Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","2 PARKWAY MANOR MARION IL \n","3 AVANTARA LONG GROVE LONG GROVE IL \n","4 HARMONY NURSING & REHAB CENTER CHICAGO IL \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","2 0 0 \n","3 6 141 \n","4 19 75 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","2 0 131 \n","3 0 195 \n","4 16 180 \n","\n"," Total Number of Occupied Beds LONGITUDE LATITUDE \\\n","0 61 -87.792973 42.012012 \n","1 43 -85.197651 40.392722 \n","2 84 -88.982944 37.750143 \n","3 131 -87.986442 42.160843 \n","4 116 -87.726353 41.975505 \n","\n"," SHAPE \n","0 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n","1 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -85.... \n","2 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -88.... \n","3 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n","4 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... "]},"execution_count":56,"metadata":{},"output_type":"execute_result"}],"source":["# Create a SeDF\n","cms_sedf = pd.DataFrame.spatial.from_xy(\n"," df=cms_df, x_column='LONGITUDE', y_column='LATITUDE', sr=4326)\n","\n","# Check head\n","cms_sedf.head()"]},{"cell_type":"markdown","metadata":{},"source":["> The `SHAPE` feature shows that a _Spatially enabled DataFrame_ has been created from a Pandas DataFrame with latitude and longitude information.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in GeoPandas DataFrame\n"]},{"cell_type":"markdown","metadata":{},"source":["A `SeDF` can be easily created from a [GeoPandas's](https://geopandas.org/index.html) [GeoDataFrame](https://geopandas.org/docs/reference/geodataframe.html) using the [`from_geodataframe()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_geodataframe) method. 
We will:\n","\n","- Import GeoPandas and create a GeoDataFrame.\n","- Create a [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) from a GeoDataFrame.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Create a GeoDataFrame\n","\n","Here, we will create a `GeoDataFrame` from the Pandas DataFrame, `cms_df`, defined above.\n"]},{"cell_type":"code","execution_count":57,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.724482Z","start_time":"2021-11-11T22:36:07.694483Z"}},"outputs":[],"source":["# Import libraries\n","from geopandas import GeoDataFrame\n","from shapely.geometry import Point"]},{"cell_type":"code","execution_count":58,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.743482Z","start_time":"2021-11-11T22:36:07.726483Z"}},"outputs":[{"data":{"text/plain":["(124, 9)"]},"execution_count":58,"metadata":{},"output_type":"execute_result"}],"source":["# Read the data\n","cms_df = pd.read_csv('./sedf_data/cities/sample_cms_data.csv')\n","\n","# Create a GeoDataFrame with point geometries in WGS84 (EPSG:4326)\n","gdf = GeoDataFrame(cms_df.drop(['LONGITUDE', 'LATITUDE'], axis=1),\n"," crs='EPSG:4326',\n"," geometry=[Point(xy) for xy in zip(cms_df.LONGITUDE, cms_df.LATITUDE)])\n","gdf.shape"]},{"cell_type":"code","execution_count":59,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.758480Z","start_time":"2021-11-11T22:36:07.745481Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","\n"," Total Number of Occupied Beds geometry \n","0 61 POINT (-87.79297 42.01201) \n","1 43 POINT (-85.19765 40.39272) "]},"execution_count":59,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","gdf.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["> A GeoDataFrame has been created with a `geometry` column that stores the geometry of the dataset.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Create a SeDF from GeoDataFrame\n","\n","Here, we will create a `SeDF` from the `gdf` GeoDataFrame created above using the [`from_geodataframe()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_geodataframe) method.\n"]},{"cell_type":"code","execution_count":60,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.777481Z","start_time":"2021-11-11T22:36:07.760480Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","\n"," Total Number of Occupied Beds \\\n","0 61 \n","1 43 \n","\n"," SHAPE \n","0 {\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe... \n","1 {\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe... "]},"execution_count":60,"metadata":{},"output_type":"execute_result"}],"source":["# Create a SeDF\n","sedf_gpd = pd.DataFrame.spatial.from_geodataframe(gdf)\n","sedf_gpd.head(2)"]},{"cell_type":"code","execution_count":61,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.784482Z","start_time":"2021-11-11T22:36:07.779484Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":61,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","sedf_gpd.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a GeoDataFrame.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in feather format data\n"]},{"cell_type":"markdown","metadata":{},"source":["A `SeDF` can be easily created from the data in [feather](https://arrow.apache.org/docs/python/feather.html) format using the [`from_feather()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_feather) method. The method's defaults _SHAPE_ is the `spatial_column` for geo-spatial information, but any other column with spatial information can be specified.\n"]},{"cell_type":"code","execution_count":62,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.798483Z","start_time":"2021-11-11T22:36:07.786483Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","\n"," Total Number of Occupied Beds LONGITUDE LATITUDE \\\n","0 61 -87.792973 42.012012 \n","1 43 -85.197651 40.392722 \n","\n"," SHAPE \n","0 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n","1 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -85.... "]},"execution_count":62,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","cms_sedf.head(2)"]},{"cell_type":"code","execution_count":63,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.854484Z","start_time":"2021-11-11T22:36:07.802481Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","\n"," Total Number of Occupied Beds LONGITUDE LATITUDE \\\n","0 61 -87.792973 42.012012 \n","1 43 -85.197651 40.392722 \n","\n"," SHAPE \n","0 {\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe... \n","1 {\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe... "]},"execution_count":63,"metadata":{},"output_type":"execute_result"}],"source":["# Create SeDf by reading from feather\n","sedf_fthr = pd.DataFrame.spatial.from_feather(\n"," './sedf_data/cities/sample_cms_data.feather')\n","sedf_fthr.head(2)"]},{"cell_type":"code","execution_count":64,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:07.861486Z","start_time":"2021-11-11T22:36:07.856482Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":64,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","sedf_fthr.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from _feather_ format data.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in Non-spatial Table data\n","\n","Non-spatial table data can be hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/), or it can be stored locally in a File Geodatabase. 
Such non-spatial table data can be read into a DataFrame using the following methods:\n","\n","- [`from_table()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_table) - for local data\n","- [`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) - for data hosted on ArcGIS Online or Enterprise\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Using the `from_table()` method\n","\n","Local non-spatial data can be read using the [`from_table()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_table) method. The method can read a CSV file (in any environment) or a table stored in a File Geodatabase (with ArcPy only). Because the data has no geometry, the result is a regular Pandas DataFrame.\n"]},{"cell_type":"markdown","metadata":{},"source":["##### Reading a CSV file\n"]},{"cell_type":"code","execution_count":65,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:08.546484Z","start_time":"2021-11-11T22:36:07.863481Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" Provider Name Provider City Provider State \\\n","0 GROSSE POINTE MANOR NILES IL \n","1 MILLER'S MERRY MANOR DUNKIRK IN \n","\n"," Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n","0 5 56 \n","1 0 0 \n","\n"," Residents Total COVID-19 Deaths Number of All Beds \\\n","0 12 99 \n","1 0 46 \n","\n"," Total Number of Occupied Beds LONGITUDE LATITUDE \n","0 61 -87.792973 42.012012 \n","1 43 -85.197651 40.392722 "]},"execution_count":65,"metadata":{},"output_type":"execute_result"}],"source":["# Create SeDF\n","tbl_df = pd.DataFrame.spatial.from_table(\n"," filename='./sedf_data/cities/sample_cms_data.csv')\n","tbl_df.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["> A Pandas DataFrame without any spatial information is returned.\n"]},{"cell_type":"markdown","metadata":{},"source":["##### Reading table from a File Geodatabase\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: The operation below can only be performed in an environment that contains arcpy.\n","\n","
\n"]},{"cell_type":"code","execution_count":66,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:08.615484Z","start_time":"2021-11-11T22:36:08.548486Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" OBJECTID NAME OTHER OWNER_OCC PLACEFIPS POP2010 POPULATION \\\n","0 1 Ammon 307 3205 1601990 13816 15181 \n","1 2 Blackfoot 1077 2788 1607840 11899 11946 \n","\n"," POP_CLASS RENTER_OCC ST STFIPS VACANT WHITE \n","0 6 1271 ID 16 271 13002 \n","1 6 1441 ID 16 318 9893 "]},"execution_count":66,"metadata":{},"output_type":"execute_result"}],"source":["# Create SeDF\n","tbl_df2 = pd.DataFrame.spatial.from_table(\n"," filename=\"./sedf_data/cities/cities.gdb/cities_table_export\")\n","tbl_df2.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["> A Pandas DataFrame without any spatial information is returned.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Using the `from_layer()` method\n","\n","A `SeDF` can be created from hosted non-spatial data using the[`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method.\n"]},{"cell_type":"code","execution_count":67,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:08.892350Z","start_time":"2021-11-11T22:36:08.618492Z"}},"outputs":[{"data":{"text/html":["
\n","
\n"," \n"," \n"," \n","
\n","\n","
\n"," sedf_major_cities_table\n"," \n","
Table Layer by api_data_owner\n","
Last Modified: November 11, 2021\n","
0 comments, 4 views\n","
\n","
\n"," "],"text/plain":[""]},"execution_count":67,"metadata":{},"output_type":"execute_result"}],"source":["# Get table item\n","tbl_item = agol_gis.content.get(\"b022d30f881f478f8155153b9205ce12\")\n","tbl_item"]},{"cell_type":"code","execution_count":68,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:09.836988Z","start_time":"2021-11-11T22:36:08.897350Z"}},"outputs":[{"data":{"text/plain":[""]},"execution_count":68,"metadata":{},"output_type":"execute_result"}],"source":["# Get table url\n","tbl = tbl_item.tables[0]\n","tbl"]},{"cell_type":"code","execution_count":69,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:11.687513Z","start_time":"2021-11-11T22:36:09.841640Z"}},"outputs":[{"data":{"text/html":["
\n",""],"text/plain":[" CLASS OBJECTID PLACEFIPS POP2010 POPULATION POP_CLASS STFIPS\n","0 city 1 1601990 13816 15181 6 16\n","1 city 2 1607840 11899 11946 6 16"]},"execution_count":69,"metadata":{},"output_type":"execute_result"}],"source":["tbl_df2 = pd.DataFrame.spatial.from_layer(tbl)\n","tbl_df2.head(2)"]},{"cell_type":"markdown","metadata":{},"source":["> A Pandas DataFrame without any spatial information is returned.\n"]},{"cell_type":"markdown","metadata":{},"source":["### Read in data from '_lite and portable_' databases\n"]},{"cell_type":"markdown","metadata":{},"source":["Geospatial data stored in a [mobile geodatabase](https://pro.arcgis.com/en/pro-app/latest/help/data/geodatabases/manage-mobile-gdb/mobile-geodatabases.htm) (.geodatabase) or a [SQLite Database](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-sqlite-database.htm) can be easily accessed using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor).\n","\n","- A mobile geodatabase (.geodatabase) is a collection of various types of GIS datasets contained in a single file on disk that can store, query, and manage spatial and nonspatial data. Mobile geodatabases are stored in an SQLite database.\n","\n","- SQLite is a full-featured relational database with the advantage of being portable and interoperable making it ubiquitous in mobile app development.\n","\n","The [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method can be used to create a `SeDF` by reading in data from these databases. Let's look at some examples.\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: The operations below can only be performed in an environment that contains arcpy.\n","\n","
\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Read from a mobile geodatabase\n"]},{"cell_type":"code","execution_count":70,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.166989Z","start_time":"2021-11-11T22:36:11.692511Z"}},"outputs":[{"data":{"text/plain":["(3886, 51)"]},"execution_count":70,"metadata":{},"output_type":"execute_result"}],"source":["# Reading from mobile geodatabase\n","mobile_gdb_df = pd.DataFrame.spatial.from_featureclass(\n"," location=\"./sedf_data/cities/cities_mobile.geodatabase/main.cities\")\n","mobile_gdb_df.shape"]},{"cell_type":"code","execution_count":71,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.212993Z","start_time":"2021-11-11T22:36:12.171999Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n","0 1 1313 1058 734 2031 1767 1446 \n","1 2 890 817 818 1799 1235 1330 \n","\n"," age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n","0 1136 1503 665 ... 1601990 13816 15181 \n","1 1143 1099 721 ... 1607840 11899 11946 \n","\n"," pop_class renter_occ st stfips vacant white \\\n","0 6 1271 ID 16 271 13002 \n","1 6 1441 ID 16 318 9893 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","\n","[2 rows x 51 columns]"]},"execution_count":71,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","mobile_gdb_df.head(2)"]},{"cell_type":"code","execution_count":72,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.230989Z","start_time":"2021-11-11T22:36:12.216993Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":72,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","mobile_gdb_df.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created.\n"]},{"cell_type":"markdown","metadata":{},"source":["#### Read from a SQLite database\n"]},{"cell_type":"code","execution_count":73,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.660988Z","start_time":"2021-11-11T22:36:12.232993Z"}},"outputs":[{"data":{"text/plain":["(3886, 51)"]},"execution_count":73,"metadata":{},"output_type":"execute_result"}],"source":["# Reading from sqlite database\n","sqlite_df = pd.DataFrame.spatial.from_featureclass(\n"," location=\"./sedf_data/cities/cities.sqlite/main.cities\")\n","sqlite_df.shape"]},{"cell_type":"code","execution_count":74,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.681991Z","start_time":"2021-11-11T22:36:12.664992Z"}},"outputs":[{"data":{"text/html":["
"],"text/plain":[" OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n","0 1 1313 1058 734 2031 1767 1446 \n","1 2 890 817 818 1799 1235 1330 \n","\n"," age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n","0 1136 1503 665 ... 1601990 13816 15181 \n","1 1143 1099 721 ... 1607840 11899 11946 \n","\n"," pop_class renter_occ st stfips vacant white \\\n","0 6 1271 ID 16 271 13002 \n","1 6 1441 ID 16 318 9893 \n","\n"," SHAPE \n","0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n","1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n","\n","[2 rows x 51 columns]"]},"execution_count":74,"metadata":{},"output_type":"execute_result"}],"source":["# Check head\n","sqlite_df.head(2)"]},{"cell_type":"code","execution_count":75,"metadata":{"ExecuteTime":{"end_time":"2021-11-11T22:36:12.697990Z","start_time":"2021-11-11T22:36:12.685991Z"}},"outputs":[{"data":{"text/plain":["['point']"]},"execution_count":75,"metadata":{},"output_type":"execute_result"}],"source":["# Check geometry type\n","sqlite_df.spatial.geometry_type"]},{"cell_type":"markdown","metadata":{},"source":["> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created.\n"]},{"cell_type":"markdown","metadata":{},"source":["## Conclusion\n"]},{"cell_type":"markdown","metadata":{},"source":["In this guide, we explored how [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) (SeDF) can be used to read spatial data from various formats. We started by reading data from web feature layers and using the `query()` operation to optimize performance and results. We explored reading data from various local data sources such as file geodatabase and shapefile. Next, we explained how data with address or coordinate information, in a geopandas dataframe, or in feather format can be used to create a SeDF. We also discussed creating SeDF from non-spatial table data. 
Finally, we showed how a SeDF can be created from lightweight, portable databases such as mobile geodatabases and SQLite databases.\n","\n","In the next part of the guide series, you will learn about exporting data using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor).\n"]},{"cell_type":"markdown","metadata":{},"source":["
\n"," Note: Given the importance and popularity of Spatially enabled DataFrame, we are revisiting our documentation for this topic. Our goal is to enhance the existing documentation to showcase the various capabilities of Spatially enabled DataFrame in detail with even more examples this time.\n","\n","Creating quality documentation is time-consuming and exhaustive, but we are committed to providing you with the best experience possible. With that in mind, we will be rolling out the revamped guides on this topic as different parts of a guide series (like the Data Engineering or Geometry guide series). This is \"part-2\" of the guide series for Spatially Enabled DataFrame. You will continue to see the existing documentation as we revamp it to add new parts. Stay tuned for more on this topic.\n","\n","
\n"]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.10"},"toc":{"base_numbering":1,"nav_menu":{},"number_sections":true,"sideBar":true,"skip_h1_title":true,"title_cell":"Table of Contents","title_sidebar":"Contents","toc_cell":true,"toc_position":{"height":"calc(100% - 180px)","left":"10px","top":"150px","width":"360.188px"},"toc_section_display":true,"toc_window_display":true}},"nbformat":4,"nbformat_minor":4} +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Part-2 Data IO with SeDF - Accessing Data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "toc": true + }, + "source": [ + "

Table of Contents

\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Introduction\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In _part-1_ of this guide series, we started with an introduction to the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) (SeDF), the `spatial` and `geom` namespaces, and looked at a quick example of SeDF in action. In this part of the guide series, we will look at how GIS data can be accessed from various data formats using SeDF.\n", + "\n", + "GIS users work with different vector-based spatial data formats, like published layers on remote servers (web layers) and local data. The [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) allows users to read, write, and manipulate spatial data by bringing the data into memory.\n", + "\n", + "The _SeDF_ integrates with Esri's [**ArcPy** site-package](http://pro.arcgis.com/en/pro-app/arcpy/get-started/what-is-arcpy-.htm), as well as the open source [pyshp](https://github.com/GeospatialPython/pyshp/), [shapely](https://github.com/Toblerity/Shapely) and [fiona](https://github.com/Toblerity/Fiona) packages. This means that the _SeDF_ can use either the [shapely](https://pypi.org/project/Shapely/) or the [arcpy](https://www.esri.com/en-us/arcgis/products/arcgis-python-libraries/libraries/arcpy) geometry engine, giving you options for working with geospatial data regardless of your platform. The _SeDF_ transforms the data into the formats you desire, allowing you to use Python functionality to analyze and visualize geographic information.\n", + "\n", + "Reading data can be scripted to automate workflows, and the data can be visualized on maps in [Jupyter notebooks](../using-the-jupyter-notebook-environment/). 
Let's explore the options available for accessing GIS data with the versatile _Spatially enabled DataFrame_.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The data used in this guide is available as an [item](https://www.arcgis.com/home/item.html?id=c7140ae3d7ae4fd0817181461019aa75). We will start by importing some libraries and downloading and extracting the data needed for the analysis in this guide.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:37.257478Z", + "start_time": "2021-11-22T19:51:12.679381Z" + } + }, + "outputs": [], + "source": [ + "# Import Libraries\n", + "import pandas as pd\n", + "from arcgis.features import GeoAccessor, GeoSeriesAccessor\n", + "from arcgis.gis import GIS\n", + "from IPython.display import display\n", + "import zipfile\n", + "import os\n", + "import shutil" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:38.872324Z", + "start_time": "2021-11-22T19:51:37.261479Z" + } + }, + "outputs": [], + "source": [ + "# Create a GIS connection\n", + "gis = GIS()\n", + "agol_gis = GIS(\"https://www.arcgis.com\", \"arcgis_python\", \"amazing_arcgis_123\")" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:38.980325Z", + "start_time": "2021-11-22T19:51:38.876325Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " sedf_guide_data\n", + " \n", + "
Data for Spatially enabled DataFrame Guides. Shapefile by api_data_owner\n", + "
Last Modified: November 11, 2021\n", + "
0 comments, 4 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get the data item\n", + "data_item = gis.content.get('c7140ae3d7ae4fd0817181461019aa75')\n", + "data_item" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The cell below downloads and extracts the data from the data item to your machine.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:45.305934Z", + "start_time": "2021-11-22T19:51:42.206937Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Removed existing data directory\n", + "Dataset unzipped at: sedf_data\\cities\n" + ] + } + ], + "source": [ + "# Download and extract the data\n", + "def unzip_data():\n", + " \"\"\"\n", + " This function:\n", + " - creates a directory `sedf_data` to download the data from the item\n", + " - downloads the item as `sedf_guide_data.zip` file in the sedf_data directory\n", + " - unzips and extracts the data to '.\\sedf_data\\cities'.\n", + " \"\"\"\n", + " try:\n", + "\n", + " # path to downloaded data folder\n", + " data_dir = os.path.join(os.getcwd(), 'sedf_data')\n", + "\n", + " # remove existing cities directory if exists\n", + " if os.path.isdir(data_dir):\n", + " shutil.rmtree(data_dir)\n", + " print(f'Removed existing data directory')\n", + " else:\n", + " os.makedirs(data_dir)\n", + "\n", + " data_item.download(data_dir) # download the data item\n", + " # path to zipped file inside data folder\n", + " zipped_file_path = os.path.join(data_dir, 'sedf_guide_data.zip')\n", + "\n", + " # unzip the data\n", + " zip_ref = zipfile.ZipFile(zipped_file_path, 'r')\n", + " zip_ref.extractall(data_dir)\n", + " zip_ref.close()\n", + "\n", + " # path to new cities directory\n", + " cities_dir = os.path.join(data_dir, 'cities')\n", + " print(f'Dataset unzipped at: 
{os.path.relpath(cities_dir)}')\n", + "\n", + " except Exception as e:\n", + " print(f'Error unzipping file: {e}')\n", + "\n", + "\n", + "# Extract data\n", + "unzip_data()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Accessing GIS Data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) reads from many **sources**, including [Feature layers](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm), [Feature classes](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm), [Shapefiles](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm), Pandas [DataFrames](http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe) and more. Let's dive into the details of accessing GIS data from various sources.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in Web Feature Layers\n", + "\n", + "[Feature layers](https://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm) hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/) can be easily read into a Spatially enabled DataFrame using the [`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method.\n", + "\n", + "The example below shows how the [`get()`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html?highlight=gis%20content%20get#arcgis.gis.ContentManager.get) method can be used to retrieve an ArcGIS Online [`item`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html?highlight=gis%20content%20get#item) and how the 
[`layers`](https://developers.arcgis.com/python/api-reference/arcgis.gis.toc.html#layer) property of an `item` can be used to access the data.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:52.373464Z", + "start_time": "2021-11-22T19:51:51.851896Z" + }, + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " USA Major Cities\n", + " \n", + "
This layer presents the locations of cities within the United States with populations of approximately 10,000 or greater, all state capitals, and the national capital. Feature Layer Collection by esri_dm\n", + "
Last Modified: May 19, 2020\n", + "
1 comments, 33,841,105 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "gis = GIS()\n", + "item = gis.content.search(\n", + " \"USA Major Cities\", item_type=\"Feature layer\", outside_org=True)[0]\n", + "item" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:56.288612Z", + "start_time": "2021-11-22T19:51:52.376465Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 50)" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Obtain the first feature layer from the item\n", + "flayer = item.layers[0]\n", + "\n", + "# Use the `from_layer` static method in the 'spatial' namespace on the Pandas' DataFrame\n", + "sdf = pd.DataFrame.spatial.from_layer(flayer)\n", + "\n", + "# Check shape\n", + "sdf.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:56.317613Z", + "start_time": "2021-11-22T19:51:56.291617Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
| | AGE_10_14 | AGE_15_19 | AGE_20_24 | AGE_25_34 | AGE_35_44 | AGE_45_54 | AGE_55_64 | AGE_5_9 | AGE_65_74 | AGE_75_84 | ... | PLACEFIPS | POP2010 | POPULATION | POP_CLASS | RENTER_OCC | SHAPE | ST | STFIPS | VACANT | WHITE |
| 0 | 1313 | 1058 | 734 | 2031 | 1767 | 1446 | 1136 | 1503 | 665 | 486 | ... | 1601990 | 13816 | 15181 | 6 | 1271 | {"x": -12462673.723706165, "y": 5384674.994080... | ID | 16 | 271 | 13002 |
| 1 | 890 | 817 | 818 | 1799 | 1235 | 1330 | 1143 | 1099 | 721 | 579 | ... | 1607840 | 11899 | 11946 | 6 | 1441 | {"x": -12506251.313993266, "y": 5341537.793529... | ID | 16 | 318 | 9893 |
| 2 | 12750 | 13959 | 16966 | 32135 | 27048 | 29595 | 24177 | 12933 | 12176 | 7087 | ... | 1608830 | 205671 | 225405 | 8 | 33359 | {"x": -12938676.6836459, "y": 5403597.04949123... | ID | 16 | 6996 | 182991 |
| 3 | 790 | 768 | 699 | 1445 | 1136 | 1134 | 935 | 959 | 679 | 464 | ... | 1611260 | 10345 | 10727 | 6 | 1461 | {"x": -12667411.402393516, "y": 5241722.820606... | ID | 16 | 241 | 7984 |
| 4 | 3803 | 3779 | 3687 | 7571 | 5559 | 4744 | 3624 | 4397 | 2296 | 1222 | ... | 1612250 | 46237 | 53942 | 7 | 5196 | {"x": -12989383.674504515, "y": 5413226.487333... | ID | 16 | 1428 | 35856 |
\n", + "

5 rows × 50 columns

\n", + "
" + ], + "text/plain": [ + " AGE_10_14 AGE_15_19 AGE_20_24 AGE_25_34 AGE_35_44 AGE_45_54 \\\n", + "0 1313 1058 734 2031 1767 1446 \n", + "1 890 817 818 1799 1235 1330 \n", + "2 12750 13959 16966 32135 27048 29595 \n", + "3 790 768 699 1445 1136 1134 \n", + "4 3803 3779 3687 7571 5559 4744 \n", + "\n", + " AGE_55_64 AGE_5_9 AGE_65_74 AGE_75_84 ... PLACEFIPS POP2010 \\\n", + "0 1136 1503 665 486 ... 1601990 13816 \n", + "1 1143 1099 721 579 ... 1607840 11899 \n", + "2 24177 12933 12176 7087 ... 1608830 205671 \n", + "3 935 959 679 464 ... 1611260 10345 \n", + "4 3624 4397 2296 1222 ... 1612250 46237 \n", + "\n", + " POPULATION POP_CLASS RENTER_OCC \\\n", + "0 15181 6 1271 \n", + "1 11946 6 1441 \n", + "2 225405 8 33359 \n", + "3 10727 6 1461 \n", + "4 53942 7 5196 \n", + "\n", + " SHAPE ST STFIPS VACANT WHITE \n", + "0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... ID 16 271 13002 \n", + "1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... ID 16 318 9893 \n", + "2 {\"x\": -12938676.6836459, \"y\": 5403597.04949123... ID 16 6996 182991 \n", + "3 {\"x\": -12667411.402393516, \"y\": 5241722.820606... ID 16 241 7984 \n", + "4 {\"x\": -12989383.674504515, \"y\": 5413226.487333... 
ID 16 1428 35856 \n", + "\n", + "[5 rows x 50 columns]" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check first few records\n", + "sdf.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:56.324128Z", + "start_time": "2021-11-22T19:51:56.320614Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "pandas.core.frame.DataFrame" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check type of sdf\n", + "type(sdf)" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:56.343129Z", + "start_time": "2021-11-22T19:51:56.326129Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Access spatial namespace\n", + "sdf.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> We can see that the dataset has 3886 records and 50 columns. Inspecting the `type` of `sdf` object and accessing the `spatial` namespace shows us that a _Spatially enabled DataFrame_ has been created from all the data in the layer.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Memory usage and the `query()` operation\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [**`from_layer()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method will attempt to read all the data from the layer into the memory. This approach works when you are dealing with small datasets. 
However, when it comes to large datasets, it becomes imperative to use the memory efficiently and query for only what is necessary.\n", + "\n", + "Let's take a look at the memory usage of the existing _SeDF_ using the [`memory_usage()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.memory_usage.html) method from Pandas.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:56.731373Z", + "start_time": "2021-11-22T19:51:56.722376Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Shape of data: (3886, 50)\n", + "Memory used: 1.48 MB\n" + ] + } + ], + "source": [ + "# Check memory usage of current sdf\n", + "mem_used = sdf.memory_usage().sum() / (1024**2) # converting to megabytes\n", + "print(f'Shape of data: {sdf.shape}')\n", + "print(f'Memory used: {round(mem_used, 2)} MB')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> We can see that a `SeDF` created using the `from_layer()` method reads all the data into the memory. So, the `sdf` object has 3886 records and 50 columns, and uses 1.48MB memory.\n", + "\n", + "But what if we only needed a small amount of data for our analysis and did not need to bring everything from the layer into the memory? Good question... let's see how we can achieve that.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [**`query()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method is a powerful operation that allows you to use [SQL](https://en.wikipedia.org/wiki/SQL) like queries to return only a subset of records. 
**Since the processing is performed on the server, this operation is not restricted by the capacity of your computer.**\n", + "\n", + "The method returns a [`FeatureSet`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object; however, the return type can be changed to a _Spatially enabled DataFrame_ object by specifying the parameter `as_df=True`.\n", + "\n", + "Let's subset the data using `query()`, create a new _SeDF_, and check the memory usage. We'll use the `AGE_45_54` column to query the layer and get a subset of records.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:58.344586Z", + "start_time": "2021-11-22T19:51:57.814005Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(316, 50)" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Filter feature layer records with a query.\n", + "sub_sdf = flayer.query(where=\"AGE_45_54 < 1500\", as_df=True)\n", + "sub_sdf.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:58.354587Z", + "start_time": "2021-11-22T19:51:58.346589Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Memory used is: 0.12 MB\n" + ] + } + ], + "source": [ + "# Check memory usage of current sdf\n", + "mem_used = sub_sdf.memory_usage().sum() / (1024**2) # converting to megabytes\n", + "print(f'Memory used is: {round(mem_used, 2)} MB')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> Now that we are only querying for records where `AGE_45_54 < 1500`, the result is a smaller DataFrame with 316 records and 50 columns. 
Because the filtering is performed on the server side, only the matching subset of records is loaded into memory, reducing usage from **1.48 MB to 0.12 MB**.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The [`query()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method allows you to specify a number of optional parameters that may further refine and transform the results. One such key parameter is `out_fields`. With `out_fields`, you can subset your data by specifying a list of field names to return. Note that the object ID (`FID`) and `SHAPE` columns are returned along with the requested fields, so requesting four fields yields six columns.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:59.855435Z", + "start_time": "2021-11-22T19:51:59.601357Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(316, 6)" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Filter feature layer with where and out_fields\n", + "out_fields = ['NAME', 'ST', 'POP_CLASS', 'AGE_45_54']\n", + "sub_sdf2 = flayer.query(where=\"AGE_45_54 < 1500\",\n", + " out_fields=out_fields,\n", + " as_df=True)\n", + "sub_sdf2.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:51:59.868435Z", + "start_time": "2021-11-22T19:51:59.858435Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
| | FID | NAME | ST | POP_CLASS | AGE_45_54 | SHAPE |
| 0 | 1 | Ammon | ID | 6 | 1446 | {"x": -12462673.723706165, "y": 5384674.994080... |
| 1 | 2 | Blackfoot | ID | 6 | 1330 | {"x": -12506251.313993266, "y": 5341537.793529... |
| 2 | 4 | Burley | ID | 6 | 1134 | {"x": -12667411.402393516, "y": 5241722.820606... |
| 3 | 6 | Chubbuck | ID | 6 | 1494 | {"x": -12520053.904151963, "y": 5300220.333409... |
| 4 | 12 | Jerome | ID | 6 | 1155 | {"x": -12747828.64784961, "y": 5269214.8197742... |
\n", + "
" + ], + "text/plain": [ + " FID NAME ST POP_CLASS AGE_45_54 \\\n", + "0 1 Ammon ID 6 1446 \n", + "1 2 Blackfoot ID 6 1330 \n", + "2 4 Burley ID 6 1134 \n", + "3 6 Chubbuck ID 6 1494 \n", + "4 12 Jerome ID 6 1155 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... \n", + "1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... \n", + "2 {\"x\": -12667411.402393516, \"y\": 5241722.820606... \n", + "3 {\"x\": -12520053.904151963, \"y\": 5300220.333409... \n", + "4 {\"x\": -12747828.64784961, \"y\": 5269214.8197742... " + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "sub_sdf2.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:01.302923Z", + "start_time": "2021-11-22T19:52:01.295930Z" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Memory used is: 0.01 MB\n" + ] + } + ], + "source": [ + "# Check memory usage of current sdf\n", + "mem_used = sub_sdf2.memory_usage().sum() / (1024**2) # converting to megabytes\n", + "print(f'Memory used is: {round(mem_used, 2)} MB')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> Using `out_fields`, we have further reduced memory usage by subsetting the data and bringing only necessary information into the memory.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Create SeDF from FeatureSet\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As mentioned earlier, the [**`query()`**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method returns a [`FeatureSet`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object. 
The `FeatureSet` object contains useful information about the data that can be accessed through its various properties.\n", + "\n", + "Let's use the `AGE_45_54` column to query the layer, get the result as a `FeatureSet`, and check some of its properties.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:03.250472Z", + "start_time": "2021-11-22T19:52:02.945753Z" + } + }, + "outputs": [], + "source": [ + "# Filter feature layer to return a feature set.\n", + "fset = flayer.query(where=\"AGE_45_54 < 1500\")" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:03.259475Z", + "start_time": "2021-11-22T19:52:03.253475Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "arcgis.features.feature.FeatureSet" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check type\n", + "type(fset)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:03.411404Z", + "start_time": "2021-11-22T19:52:03.406402Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "316" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check length\n", + "len(fset.features)" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:03.633193Z", + "start_time": "2021-11-22T19:52:03.627193Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'x': -12462673.723706165,\n", + " 'y': 5384674.994080178,\n", + " 'spatialReference': {'wkid': 102100, 'latestWkid': 3857}}" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry of a feature in the featureset\n", + "fset.features[0].geometry" + ] + 
}, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `fields` property of a `FeatureSet` returns a list containing information about each column recorded as a dictionary. Let's use the `fields` property to access information about the first column.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:04.088356Z", + "start_time": "2021-11-22T19:52:04.083359Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'name': 'FID',\n", + " 'type': 'esriFieldTypeOID',\n", + " 'alias': 'FID',\n", + " 'sqlType': 'sqlTypeInteger',\n", + " 'domain': None,\n", + " 'defaultValue': None}" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check details of a column in the feature set\n", + "fset.fields[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's get the names of the columns in the data.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:04.548802Z", + "start_time": "2021-11-22T19:52:04.542798Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['FID', 'NAME', 'CLASS', 'ST', 'STFIPS']" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get column names\n", + "f_names = [f['name'] for f in fset.fields]\n", + "f_names[:5]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's create a _Spatially enabled DataFrame_ from a `FeatureSet` using the [`.sdf`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=sdf#arcgis.features.FeatureSet.sdf) property.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:05.026469Z", + "start_time": "2021-11-22T19:52:05.006466Z" + } + }, + "outputs": [ + 
{ + "data": { + "text/plain": [ + "(316, 50)" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create SeDF from FeatureSet\n", + "fset_df = fset.sdf\n", + "fset_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:05.276147Z", + "start_time": "2021-11-22T19:52:05.258149Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
| | FID | NAME | CLASS | ST | STFIPS | PLACEFIPS | CAPITAL | POP_CLASS | POPULATION | POP2010 | ... | MARHH_NO_C | MHH_CHILD | FHH_CHILD | FAMILIES | AVE_FAM_SZ | HSE_UNITS | VACANT | OWNER_OCC | RENTER_OCC | SHAPE |
| 0 | 1 | Ammon | city | ID | 16 | 1601990 | | 6 | 15181 | 13816 | ... | 1131 | 106 | 335 | 3352 | 3.61 | 4747 | 271 | 3205 | 1271 | {"x": -12462673.723706165, "y": 5384674.994080... |
| 1 | 2 | Blackfoot | city | ID | 16 | 1607840 | | 6 | 11946 | 11899 | ... | 1081 | 174 | 381 | 2958 | 3.31 | 4547 | 318 | 2788 | 1441 | {"x": -12506251.313993266, "y": 5341537.793529... |
\n", + "

2 rows × 50 columns

\n", + "
" + ], + "text/plain": [ + " FID NAME CLASS ST STFIPS PLACEFIPS CAPITAL POP_CLASS POPULATION \\\n", + "0 1 Ammon city ID 16 1601990 6 15181 \n", + "1 2 Blackfoot city ID 16 1607840 6 11946 \n", + "\n", + " POP2010 ... MARHH_NO_C MHH_CHILD FHH_CHILD FAMILIES AVE_FAM_SZ \\\n", + "0 13816 ... 1131 106 335 3352 3.61 \n", + "1 11899 ... 1081 174 381 2958 3.31 \n", + "\n", + " HSE_UNITS VACANT OWNER_OCC RENTER_OCC \\\n", + "0 4747 271 3205 1271 \n", + "1 4547 318 2788 1441 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.723706165, \"y\": 5384674.994080... \n", + "1 {\"x\": -12506251.313993266, \"y\": 5341537.793529... \n", + "\n", + "[2 rows x 50 columns]" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fset_df.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:05.502653Z", + "start_time": "2021-11-22T19:52:05.496653Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "fset_df.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a `FeatureSet`.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Create SeDF from FeatureCollection\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Tools within the ArcGIS API for Python often return a [FeatureCollection](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#featurecollection) object as a result of some analysis. 
A `FeatureCollection` is an in-memory collection of [Feature](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.Feature) objects with rendering information. Similar to feature layers, feature collections can also be used to store features. With a feature collection, a service is not created to serve out the feature data.\n", + "\n", + "Let's create a `SeDF` from a FeatureCollection. Here, we:\n", + "\n", + "- Import the [Major Ports](https://www.arcgis.com/home/item.html?id=405963eaea24428c9db236ec289760eb) feature layer.\n", + "- Create 5-mile buffers using the [`create_buffers()`](https://developers.arcgis.com/python/api-reference/arcgis.features.use_proximity.html#create-buffers) tool, which returns a FeatureCollection.\n", + "- Call the [query()](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.FeatureLayer.query) method on the buffered FeatureCollection to get a [FeatureSet](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=query#arcgis.features.FeatureSet) object, then create a `SeDF` from it using the [`.sdf`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=sdf#arcgis.features.FeatureSet.sdf) property of the FeatureSet.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:10.071650Z", + "start_time": "2021-11-22T19:52:09.984656Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " Major Ports\n", + " \n", + "
This feature layer, utilizing data from the U.S. Department of Transportation, depicts Major Ports in the United States by total tonnage. Feature Layer Collection by Federal_User_Community\n", + "
Last Modified: October 27, 2021\n", + "
0 comments, 157,223 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get the ports item\n", + "ports_item = gis.content.get(\"405963eaea24428c9db236ec289760eb\")\n", + "ports_item" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:10.203674Z", + "start_time": "2021-11-22T19:52:10.197678Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get the ports layer\n", + "ports_lyr = ports_item.layers[0]\n", + "ports_lyr" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:40.150295Z", + "start_time": "2021-11-22T19:52:10.430626Z" + } + }, + "outputs": [], + "source": [ + "# Create buffers\n", + "from arcgis.features.use_proximity import create_buffers\n", + "ports_buffer50 = create_buffers(\n", + " ports_lyr, distances=[5], units='Miles', gis=agol_gis)" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:40.159300Z", + "start_time": "2021-11-22T19:52:40.154296Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "arcgis.features.feature.FeatureCollection" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check type of result from the analysis\n", + "type(ports_buffer50)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `create_buffers()` tool resulted in a `FeatureCollection`.\n", + "\n", + "Now, we will create a `SeDF` from the `FeatureCollection` object.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:40.197297Z", + "start_time": 
"2021-11-22T19:52:40.162296Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTID_1OBJECTIDIDPORTPORT_NAMEGRAND_TOTAFOREIGN_TOIMPORTSEXPORTSDOMESTICBUFF_DISTORIG_FIDAnalysisAreaSHAPE
011124C4947Unalaska Island, AK165228112368294262518105784154525178.528402{\"rings\": [[[-18806114.3995, 7138385.537799999...
12285C4410Kahului, Maui, HI36154492039120391035950585278.528402{\"rings\": [[[-17418472.419, 2388455.4312999994...
\n", + "
" + ], + "text/plain": [ + " OBJECTID_1 OBJECTID ID PORT PORT_NAME GRAND_TOTA \\\n", + "0 1 1 124 C4947 Unalaska Island, AK 1652281 \n", + "1 2 2 85 C4410 Kahului, Maui, HI 3615449 \n", + "\n", + " FOREIGN_TO IMPORTS EXPORTS DOMESTIC BUFF_DIST ORIG_FID AnalysisArea \\\n", + "0 1236829 426251 810578 415452 5 1 78.528402 \n", + "1 20391 20391 0 3595058 5 2 78.528402 \n", + "\n", + " SHAPE \n", + "0 {\"rings\": [[[-18806114.3995, 7138385.537799999... \n", + "1 {\"rings\": [[[-17418472.419, 2388455.4312999994... " + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create SeDF\n", + "sedf_fc = ports_buffer50.query().sdf\n", + "sedf_fc.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:40.205296Z", + "start_time": "2021-11-22T19:52:40.199296Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['polygon']" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "sedf_fc.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a `FeatureCollection`.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in local GIS data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Local geospatial data, such as [`Feature classes`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm) and [`shapefiles`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm) can be easily accessed using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor). 
The [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method can be used to access local data. Let's look at some examples.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Reading a Shapefile\n", + "\n", + "A locally stored [`shapefile`](http://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/what-is-a-shapefile.htm) can be accessed by passing the file's location to the [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-info\">
\n", + " <b>Note:</b> In the absence of <code>arcpy</code>, the <a href=\"https://github.com/GeospatialPython/pyshp\">PyShp</a> package must be present in your current conda environment in order to read shapefiles. To check if PyShp is present, you can run the following in a cell:\n", + " <code>\n", + " !conda list pyshp\n", + " </code>\n", + "To install PyShp, you can run the following in a cell:\n", + " <code>\n", + " !conda install pyshp\n", + "</code></div>
 \n" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:41.031721Z", + "start_time": "2021-11-22T19:52:40.207294Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 51)" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read from a shapefile\n", + "shp_df = pd.DataFrame.spatial.from_featureclass(\n", + " location=\"./sedf_data/cities/cities.shp\")\n", + "shp_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:41.058722Z", + "start_time": "2021-11-22T19:52:41.034722Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "shp_df.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from the `shapefile` stored locally.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Shapefile from a URL\n", + "\n", + "The URL of a zipped `shapefile` can be used to create a `SeDF` by passing the URL as the `location` argument of the `from_featureclass()` method. The image below shows how the operation can be performed.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-info\">
\n", + " Note: This operation requires PyShp to be available in the environment.\n", + "\n", + "
\n" + ] + }, + { + "attachments": { + "image.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![image.png](attachment:image.png)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Reading a Featureclass\n", + "\n", + "A [featureclass](http://desktop.arcgis.com/en/arcmap/latest/manage-data/feature-classes/a-quick-tour-of-feature-classes.htm) can be accessed from a File Geodatabase by passing its location in the [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " <b>Note:</b> In the absence of <code>arcpy</code>, the <a href=\"https://github.com/Toblerity/Fiona\">Fiona</a> package must be present in your current conda environment in order to read a featureclass.\n", + " To check if Fiona is present, you can run the following in a cell:\n", + " <code>\n", + " !conda list fiona\n", + " </code>\n", + "To install Fiona, you can run the following in a cell:\n", + " <code>\n", + " !conda install fiona\n", + "</code></div>
\n" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:41.480724Z", + "start_time": "2021-11-22T19:52:41.062722Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 51)" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Reading from FGDB\n", + "fcls_df = pd.DataFrame.spatial.from_featureclass(\n", + " location=\"./sedf_data/cities/cities.gdb/cities\")\n", + "fcls_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:41.506721Z", + "start_time": "2021-11-22T19:52:41.485728Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDage_10_14age_15_19age_20_24age_25_34age_35_44age_45_54age_55_64age_5_9age_65_74...placefipspop2010populationpop_classrenter_occststfipsvacantwhiteSHAPE
011313105873420311767144611361503665...1601990138161518161271ID1627113002{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1289081781817991235133011431099721...1607840118991194661441ID163189893{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "

2 rows × 51 columns

\n", + "
" + ], + "text/plain": [ + " OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n", + "0 1 1313 1058 734 2031 1767 1446 \n", + "1 2 890 817 818 1799 1235 1330 \n", + "\n", + " age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n", + "0 1136 1503 665 ... 1601990 13816 15181 \n", + "1 1143 1099 721 ... 1607840 11899 11946 \n", + "\n", + " pop_class renter_occ st stfips vacant white \\\n", + "0 6 1271 ID 16 271 13002 \n", + "1 6 1441 ID 16 318 9893 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "\n", + "[2 rows x 51 columns]" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_df.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:52:41.525725Z", + "start_time": "2021-11-22T19:52:41.510728Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "fcls_df.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from the `featureclass` stored locally.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Specify optional parameters**\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The **`from_featureclass()`** method allows users to specify optional parameters when the `ArcPy` library is available in the current environment. These parameters are:\n", + "\n", + "- `sql_clause`: a pair of SQL prefix and postfix clauses, `sql_clause=(prefix,postfix)`, organized in a list or a tuple can be passed to query specific data. 
The parameter supports only a small set of operations. Learn more about the allowed operations [here](https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/searchcursor-class.htm).\n", + "- `where_clause`: a SQL where statement used to subset the data. Learn more about it [here](https://pro.arcgis.com/en/pro-app/latest/help/mapping/navigation/sql-reference-for-elements-used-in-query-expressions.htm).\n", + "- `fields`: a list of field names used to subset the data to specific fields.\n", + "- `spatial_filter`: a geometry filter object used to filter the results spatially.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-info\">
\n", + " Note: The operations below can only be performed in an environment that contains arcpy.\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Subset data for specific fields\n" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:03.504998Z", + "start_time": "2021-11-22T19:55:03.352999Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 3)" + ] + }, + "execution_count": 36, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Subset for fields\n", + "fcls_flds = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " fields=['st', 'pop_class'])\n", + "fcls_flds.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:04.378124Z", + "start_time": "2021-11-22T19:55:04.367127Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stpop_classSHAPE
0ID6{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1ID6{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "
" + ], + "text/plain": [ + " st pop_class SHAPE\n", + "0 ID 6 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ...\n", + "1 ID 6 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"..." + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_flds.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Subset using `where_clause`\n", + "\n", + "Learn more about how to use `where_clause` [here](https://pro.arcgis.com/en/pro-app/latest/help/mapping/navigation/sql-reference-for-elements-used-in-query-expressions.htm).\n" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:06.957737Z", + "start_time": "2021-11-22T19:55:06.876084Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(15, 51)" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Subset using where_clause\n", + "fcls_whr = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " where_clause=\"st='ID' and pop_class=6\")\n", + "fcls_whr.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:07.111124Z", + "start_time": "2021-11-22T19:55:07.093125Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDage_10_14age_15_19age_20_24age_25_34age_35_44age_45_54age_55_64age_5_9age_65_74...placefipspop2010populationpop_classrenter_occststfipsvacantwhiteSHAPE
011313105873420311767144611361503665...1601990138161518161271ID1627113002{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1289081781817991235133011431099721...1607840118991194661441ID163189893{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "

2 rows × 51 columns

\n", + "
" + ], + "text/plain": [ + " OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n", + "0 1 1313 1058 734 2031 1767 1446 \n", + "1 2 890 817 818 1799 1235 1330 \n", + "\n", + " age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n", + "0 1136 1503 665 ... 1601990 13816 15181 \n", + "1 1143 1099 721 ... 1607840 11899 11946 \n", + "\n", + " pop_class renter_occ st stfips vacant white \\\n", + "0 6 1271 ID 16 271 13002 \n", + "1 6 1441 ID 16 318 9893 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "\n", + "[2 rows x 51 columns]" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_whr.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Subset using `fields` and `where_clause`\n" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:08.852159Z", + "start_time": "2021-11-22T19:55:08.788159Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(15, 5)" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Subset using where_clause\n", + "flds_whr = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " fields=[\n", + " 'st', 'pop_class', 'age_10_14', 'age_15_19'],\n", + " where_clause=\"st='ID' and pop_class=6\")\n", + "flds_whr.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:09.717600Z", + "start_time": "2021-11-22T19:55:09.706606Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stpop_classage_10_14age_15_19SHAPE
0ID613131058{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1ID6890817{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "
" + ], + "text/plain": [ + " st pop_class age_10_14 age_15_19 \\\n", + "0 ID 6 1313 1058 \n", + "1 ID 6 890 817 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... " + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "flds_whr.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Subset using `sql_clause`\n", + "\n", + "`sql_clause` can be combined with `fields` and `where_clause` to further subset the data. You can learn more about the allowed operations [here](https://pro.arcgis.com/en/pro-app/latest/arcpy/data-access/searchcursor-class.htm). Now let's look at some examples.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "###### Prefix `sql_clause` - DISTINCT operation\n" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:12.045052Z", + "start_time": "2021-11-22T19:55:11.704053Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 51)" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Prefix Sql clause - DISTINCT operation\n", + "fcls_sql1 = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " sql_clause=(\"DISTINCT pop_class\", None))\n", + "\n", + "# Check shape\n", + "fcls_sql1.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:12.590891Z", + "start_time": "2021-11-22T19:55:12.570891Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDage_10_14age_15_19age_20_24age_25_34age_35_44age_45_54age_55_64age_5_9age_65_74...placefipspop2010populationpop_classrenter_occststfipsvacantwhiteSHAPE
0941124712131043202216922116182711871037...0507330156201477163006AR0513036216{\"x\": -10006810.091, \"y\": 4290154.581699997, \"...
114057967487541999171720621450760851...246685012677131886814MD2428111613{\"x\": -8517714.7855, \"y\": 4744316.880199999, \"...
\n", + "

2 rows × 51 columns

\n", + "
" + ], + "text/plain": [ + " OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n", + "0 941 1247 1213 1043 2022 1692 2116 \n", + "1 1405 796 748 754 1999 1717 2062 \n", + "\n", + " age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n", + "0 1827 1187 1037 ... 0507330 15620 14771 \n", + "1 1450 760 851 ... 2466850 12677 13188 \n", + "\n", + " pop_class renter_occ st stfips vacant white \\\n", + "0 6 3006 AR 05 1303 6216 \n", + "1 6 814 MD 24 281 11613 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -10006810.091, \"y\": 4290154.581699997, \"... \n", + "1 {\"x\": -8517714.7855, \"y\": 4744316.880199999, \"... \n", + "\n", + "[2 rows x 51 columns]" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_sql1.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "###### Postfix `sql_clause` with specific fields\n", + "\n", + "Here, we will subset the data for the state and population class fields and apply a postfix clause.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:13.456845Z", + "start_time": "2021-11-22T19:55:13.280450Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 3)" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Postfix Sql clause with specific fields\n", + "fcls_sql2 = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " fields=['st', 'pop_class'],\n", + " sql_clause=(None, \"ORDER BY st, pop_class\"))\n", + "# Check shape\n", + "fcls_sql2.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:14.111586Z", + "start_time": "2021-11-22T19:55:14.100594Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stpop_classSHAPE
0AK6{\"x\": -16417572.1606, \"y\": 9562359.403800003, ...
1AK6{\"x\": -16455422.2224, \"y\": 9574022.0224, \"spat...
2AK6{\"x\": -16444303.0276, \"y\": 9568008.9705, \"spat...
3AK6{\"x\": -14962313.3618, \"y\": 8031014.926600002, ...
4AK6{\"x\": -16657118.680399999, \"y\": 8746757.662600...
\n", + "
" + ], + "text/plain": [ + " st pop_class SHAPE\n", + "0 AK 6 {\"x\": -16417572.1606, \"y\": 9562359.403800003, ...\n", + "1 AK 6 {\"x\": -16455422.2224, \"y\": 9574022.0224, \"spat...\n", + "2 AK 6 {\"x\": -16444303.0276, \"y\": 9568008.9705, \"spat...\n", + "3 AK 6 {\"x\": -14962313.3618, \"y\": 8031014.926600002, ...\n", + "4 AK 6 {\"x\": -16657118.680399999, \"y\": 8746757.662600..." + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_sql2.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "###### Prefix and Postfix `sql_clause` with specific fields and `where_clause`\n", + "\n", + "Here, we will subset the data using `where_clause`, keep specific fields, and then apply both prefix and postfix clause.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:51.001847Z", + "start_time": "2021-11-22T19:55:50.922841Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(22, 5)" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Prefix and Postfix sql_clause\n", + "fcls_sql3_df = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " fields=[\n", + " 'st', 'name', 'pop_class', 'age_10_14'],\n", + " where_clause=\"st='ID'\",\n", + " sql_clause=(\"DISTINCT pop_class\", \"ORDER BY name\"))\n", + "\n", + "# Check Shape\n", + "fcls_sql3_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T19:55:51.761628Z", + "start_time": "2021-11-22T19:55:51.749637Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stnamepop_classage_10_14SHAPE
0IDAmmon61313{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1IDBlackfoot6890{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
2IDBoise City812750{\"x\": -12938676.683600001, \"y\": 5403597.049500...
3IDBurley6790{\"x\": -12667411.4024, \"y\": 5241722.820600003, ...
4IDCaldwell73803{\"x\": -12989383.6745, \"y\": 5413226.487300001, ...
\n", + "
" + ], + "text/plain": [ + " st name pop_class age_10_14 \\\n", + "0 ID Ammon 6 1313 \n", + "1 ID Blackfoot 6 890 \n", + "2 ID Boise City 8 12750 \n", + "3 ID Burley 6 790 \n", + "4 ID Caldwell 7 3803 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "2 {\"x\": -12938676.683600001, \"y\": 5403597.049500... \n", + "3 {\"x\": -12667411.4024, \"y\": 5241722.820600003, ... \n", + "4 {\"x\": -12989383.6745, \"y\": 5413226.487300001, ... " + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_sql3_df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Using `spatial_filter`\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "`spatial_filter` can be used to query the results by using a spatial relationship with another geometry. The spatial filtering is even more powerful when integrated with [Geoenrichment](https://developers.arcgis.com/python/guide/part1-introduction-to-geoenrichment/). Let's use this approach to filter our results for the state of Idaho. In this example, we will:\n", + "\n", + "- use `arcgis.geoenrichment.Country` to derive the geometries for the state of Idaho.\n", + "- use `arcgis.geometry.filters.intersects(geometry, sr=None)` to create a geometry filter object that filters results whose geometry intersects with the specified geometry (i.e. filter data points within the boundary of Idaho).\n", + "- pass the geometry filter object to `spatial_filter` to get desired results.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Note: To perform enrichment operations, GeoEnrichment must be configured in your GIS organization. GeoEnrichment consumes credits, and you can learn more about credit consumption here. \n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:03:11.171059Z", + "start_time": "2021-11-22T20:03:11.154058Z" + } + }, + "outputs": [], + "source": [ + "# Basic Imports\n", + "from arcgis.geometry import Geometry\n", + "from arcgis.geometry.filters import intersects\n", + "from arcgis.geoenrichment import Country" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:08:43.643602Z", + "start_time": "2021-11-22T20:08:43.139513Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "arcgis.geoenrichment.enrichment.Country" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create country object\n", + "usa = Country.get('US', gis=agol_gis)\n", + "type(usa)" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:08:49.034325Z", + "start_time": "2021-11-22T20:08:47.854467Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "image/svg+xml": [ + "" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get boundaries for Idaho\n", + "named_area_ID = usa.search(query='Idaho', layers=['US.States'])\n", + "display(named_area_ID[0])\n", + "named_area_ID[0].geometry.as_arcpy" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:10:38.529463Z", + "start_time": "2021-11-22T20:10:38.524455Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'wkid': 4326, 'latestWkid': 4326}" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create spatial reference\n", + "sr_id = 
named_area_ID[0].geometry[\"spatialReference\"]\n", + "sr_id" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:12:24.265943Z", + "start_time": "2021-11-22T20:12:24.259940Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "dict" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Construct a geometry filter using the filter geometry\n", + "id_state_filter = intersects(named_area_ID[0].geometry,\n", + " sr=sr_id)\n", + "type(id_state_filter)" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:19:12.104168Z", + "start_time": "2021-11-22T20:19:10.973170Z" + }, + "code_folding": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(22, 5)" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Pass geometry filter object as a spatial_filter\n", + "fcls_spfl_df = pd.DataFrame.spatial.from_featureclass(location=\"./sedf_data/cities/cities.gdb/cities\",\n", + " fields=[\n", + " 'st', 'name', 'pop_class', 'age_10_14'],\n", + " spatial_filter=id_state_filter)\n", + "# Check shape\n", + "fcls_spfl_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-22T20:26:39.851895Z", + "start_time": "2021-11-22T20:26:39.840893Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
stnamepop_classage_10_14SHAPE
0IDAmmon61313{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1IDBlackfoot6890{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
2IDBoise City812750{\"x\": -12938676.683600001, \"y\": 5403597.049500...
3IDBurley6790{\"x\": -12667411.4024, \"y\": 5241722.820600003, ...
4IDCaldwell73803{\"x\": -12989383.6745, \"y\": 5413226.487300001, ...
\n", + "
" + ], + "text/plain": [ + " st name pop_class age_10_14 \\\n", + "0 ID Ammon 6 1313 \n", + "1 ID Blackfoot 6 890 \n", + "2 ID Boise City 8 12750 \n", + "3 ID Burley 6 790 \n", + "4 ID Caldwell 7 3803 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "2 {\"x\": -12938676.683600001, \"y\": 5403597.049500... \n", + "3 {\"x\": -12667411.4024, \"y\": 5241722.820600003, ... \n", + "4 {\"x\": -12989383.6745, \"y\": 5413226.487300001, ... " + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "fcls_spfl_df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The result shows the data points filtered for Idaho as defined by the spatial filter.\n", + "\n", + "You can learn more about applying spatial filters in our [Working with geometries](https://developers.arcgis.com/python/guide/part4-spatial-filters/#arcgis.geometry.filters-module) guide series.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in DataFrame with Addresses\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A `SeDF` can be easily created from a DataFrame with address information using the [`from_df()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_featureclass#arcgis.features.GeoAccessor.from_df) method. This method geocodes the addresses using the first configured geocoder in your GIS. The locations generated after geocoding are used as the geometry of the SeDF.\n", + "\n", + "You can learn more about geocoding in our [Finding Places with geocoding](https://developers.arcgis.com/python/guide/part1-what-is-geocoding/) guide series.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Note: The from_df() method performs a batch geocoding operation which consumes credits. If a geocoder is not specified, then the first configured geocoder in your GIS organization will be used. Learn more about credit consumption here.\n", + "\n", + "To avoid credit consumption, you may specify your own `geocoder`.\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's look at an example of using `from_df()`. We will read addresses into a DataFrame using the [`pd.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) method. Next, we will create a SeDF by passing the DataFrame and address column as parameters to the `from_df()` method.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:34:12.278791Z", + "start_time": "2021-11-11T22:34:12.267792Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Address
0602 Murray Cir, Sausalito, CA 94965
1340 Stockton St, San Francisco, CA 94108
23619 Balboa St, San Francisco, CA 94121
31274 El Camino Real, San Bruno, CA 94066
4625 Monterey Blvd, San Francisco, CA 94127
\n", + "
" + ], + "text/plain": [ + " Address\n", + "0 602 Murray Cir, Sausalito, CA 94965\n", + "1 340 Stockton St, San Francisco, CA 94108\n", + "2 3619 Balboa St, San Francisco, CA 94121\n", + "3 1274 El Camino Real, San Bruno, CA 94066\n", + "4 625 Monterey Blvd, San Francisco, CA 94127" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read the csv file with address into a DataFrame\n", + "orders_df = pd.read_csv(\"./sedf_data/cities/orders.csv\")\n", + "\n", + "# Check head\n", + "orders_df.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The DataFrame shows a column with address information.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:35:55.437412Z", + "start_time": "2021-11-11T22:35:53.956939Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
AddressSHAPE
0602 Murray Cir, Sausalito, CA 94965{\"x\": -122.47885242199999, \"y\": 37.83735920100...
1340 Stockton St, San Francisco, CA 94108{\"x\": -122.44955096499996, \"y\": 37.73152250200...
23619 Balboa St, San Francisco, CA 94121{\"x\": -122.49772620499999, \"y\": 37.77567413500...
31274 El Camino Real, San Bruno, CA 94066{\"x\": -122.40685153899994, \"y\": 37.78910429100...
4625 Monterey Blvd, San Francisco, CA 94127{\"x\": -122.42218381299995, \"y\": 37.63856151200...
\n", + "
" + ], + "text/plain": [ + " Address \\\n", + "0 602 Murray Cir, Sausalito, CA 94965 \n", + "1 340 Stockton St, San Francisco, CA 94108 \n", + "2 3619 Balboa St, San Francisco, CA 94121 \n", + "3 1274 El Camino Real, San Bruno, CA 94066 \n", + "4 625 Monterey Blvd, San Francisco, CA 94127 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -122.47885242199999, \"y\": 37.83735920100... \n", + "1 {\"x\": -122.44955096499996, \"y\": 37.73152250200... \n", + "2 {\"x\": -122.49772620499999, \"y\": 37.77567413500... \n", + "3 {\"x\": -122.40685153899994, \"y\": 37.78910429100... \n", + "4 {\"x\": -122.42218381299995, \"y\": 37.63856151200... " + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Use from_df to create SeDF\n", + "orders_sdf = pd.DataFrame.spatial.from_df(\n", + " df=orders_df, address_column=\"Address\")\n", + "orders_sdf.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:35:57.704801Z", + "start_time": "2021-11-11T22:35:57.697804Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "orders_sdf.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a Pandas DataFrame with address information.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in DataFrame with Lat/Long Information\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As we saw in part-1 of this guide series, a SeDF can be created from any Pandas DataFrame with location information (Latitude and Longitude) using the 
[`from_xy()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_xy) method.\n", + "\n", + "Let's look at an example. We will read the data with latitude and longitude information into a DataFrame using the [`pd.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) function. Then, we will create a SeDF by passing the DataFrame, the names of its longitude (`x_column`) and latitude (`y_column`) columns, and the spatial reference (`sr`) to the `from_xy()` method.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.669481Z", + "start_time": "2021-11-11T22:36:07.650485Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsLONGITUDELATITUDE
0GROSSE POINTE MANORNILESIL556129961-87.79297342.012012
1MILLER'S MERRY MANORDUNKIRKIN0004643-85.19765140.392722
2PARKWAY MANORMARIONIL00013184-88.98294437.750143
3AVANTARA LONG GROVELONG GROVEIL61410195131-87.98644242.160843
4HARMONY NURSING & REHAB CENTERCHICAGOIL197516180116-87.72635341.975505
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "2 PARKWAY MANOR MARION IL \n", + "3 AVANTARA LONG GROVE LONG GROVE IL \n", + "4 HARMONY NURSING & REHAB CENTER CHICAGO IL \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "2 0 0 \n", + "3 6 141 \n", + "4 19 75 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "2 0 131 \n", + "3 0 195 \n", + "4 16 180 \n", + "\n", + " Total Number of Occupied Beds LONGITUDE LATITUDE \n", + "0 61 -87.792973 42.012012 \n", + "1 43 -85.197651 40.392722 \n", + "2 84 -88.982944 37.750143 \n", + "3 131 -87.986442 42.160843 \n", + "4 116 -87.726353 41.975505 " + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read the data\n", + "cms_df = pd.read_csv('./sedf_data/cities/sample_cms_data.csv')\n", + "\n", + "# Return the first 5 records\n", + "cms_df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.692482Z", + "start_time": "2021-11-11T22:36:07.672485Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsLONGITUDELATITUDESHAPE
0GROSSE POINTE MANORNILESIL556129961-87.79297342.012012{\"spatialReference\": {\"wkid\": 4326}, \"x\": -87....
1MILLER'S MERRY MANORDUNKIRKIN0004643-85.19765140.392722{\"spatialReference\": {\"wkid\": 4326}, \"x\": -85....
2PARKWAY MANORMARIONIL00013184-88.98294437.750143{\"spatialReference\": {\"wkid\": 4326}, \"x\": -88....
3AVANTARA LONG GROVELONG GROVEIL61410195131-87.98644242.160843{\"spatialReference\": {\"wkid\": 4326}, \"x\": -87....
4HARMONY NURSING & REHAB CENTERCHICAGOIL197516180116-87.72635341.975505{\"spatialReference\": {\"wkid\": 4326}, \"x\": -87....
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "2 PARKWAY MANOR MARION IL \n", + "3 AVANTARA LONG GROVE LONG GROVE IL \n", + "4 HARMONY NURSING & REHAB CENTER CHICAGO IL \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "2 0 0 \n", + "3 6 141 \n", + "4 19 75 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "2 0 131 \n", + "3 0 195 \n", + "4 16 180 \n", + "\n", + " Total Number of Occupied Beds LONGITUDE LATITUDE \\\n", + "0 61 -87.792973 42.012012 \n", + "1 43 -85.197651 40.392722 \n", + "2 84 -88.982944 37.750143 \n", + "3 131 -87.986442 42.160843 \n", + "4 116 -87.726353 41.975505 \n", + "\n", + " SHAPE \n", + "0 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n", + "1 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -85.... \n", + "2 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -88.... \n", + "3 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n", + "4 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... 
" + ] + }, + "execution_count": 56, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a SeDF\n", + "cms_sedf = pd.DataFrame.spatial.from_xy(\n", + " df=cms_df, x_column='LONGITUDE', y_column='LATITUDE', sr=4326)\n", + "\n", + "# Check head\n", + "cms_sedf.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `SHAPE` column shows that a _Spatially enabled DataFrame_ has been created from a Pandas DataFrame with latitude and longitude information.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in GeoPandas DataFrame\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A `SeDF` can be easily created from a [GeoPandas](https://geopandas.org/index.html) [GeoDataFrame](https://geopandas.org/docs/reference/geodataframe.html) using the [`from_geodataframe()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_geodataframe) method. 
We will:\n", + "\n", + "- Import GeoPandas and create a GeoDataFrame.\n", + "- Create a [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) from a GeoDataFrame.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Create a GeoDataFrame\n", + "\n", + "Here, we will create a `GeoDataFrame` from a Pandas DataFrame, `cms_df`, defined above.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.724482Z", + "start_time": "2021-11-11T22:36:07.694483Z" + } + }, + "outputs": [], + "source": [ + "# Import libraries\n", + "from geopandas import GeoDataFrame\n", + "from shapely.geometry import Point" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.743482Z", + "start_time": "2021-11-11T22:36:07.726483Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(124, 9)" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Read the data\n", + "cms_df = pd.read_csv('./sedf_data/cities/sample_cms_data.csv')\n", + "\n", + "# Create GeoPandas GeoDataFrame\n", + "gdf = GeoDataFrame(cms_df.drop(['LONGITUDE', 'LATITUDE'], axis=1),\n", + " crs='EPSG:4326',\n", + " geometry=[Point(xy) for xy in zip(cms_df.LONGITUDE, cms_df.LATITUDE)])\n", + "gdf.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.758480Z", + "start_time": "2021-11-11T22:36:07.745481Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied Bedsgeometry
0GROSSE POINTE MANORNILESIL556129961POINT (-87.79297 42.01201)
1MILLER'S MERRY MANORDUNKIRKIN0004643POINT (-85.19765 40.39272)
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "\n", + " Total Number of Occupied Beds geometry \n", + "0 61 POINT (-87.79297 42.01201) \n", + "1 43 POINT (-85.19765 40.39272) " + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "gdf.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> A GeoDataFrame has been created with a `geometry` column that stores the geometry of the dataset.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Create a SeDF from GeoDataFrame\n", + "\n", + "Here, we will create a `SeDF` from the `gdf` GeoDataFrame created above using the [`from_geodataframe()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.from_geodataframe) method.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.777481Z", + "start_time": "2021-11-11T22:36:07.760480Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsSHAPE
0GROSSE POINTE MANORNILESIL556129961{\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe...
1MILLER'S MERRY MANORDUNKIRKIN0004643{\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe...
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "\n", + " Total Number of Occupied Beds \\\n", + "0 61 \n", + "1 43 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe... \n", + "1 {\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe... " + ] + }, + "execution_count": 60, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create a SeDF\n", + "sedf_gpd = pd.DataFrame.spatial.from_geodataframe(gdf)\n", + "sedf_gpd.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.784482Z", + "start_time": "2021-11-11T22:36:07.779484Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "sedf_gpd.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from a GeoDataFrame.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in feather format data\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A `SeDF` can be easily created from the data in [feather](https://arrow.apache.org/docs/python/feather.html) format using the [`from_feather()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_feather) method. 
By default, the method uses _SHAPE_ as the `spatial_column` that holds the geospatial information, but any other column with spatial information can be specified.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.798483Z", + "start_time": "2021-11-11T22:36:07.786483Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsLONGITUDELATITUDESHAPE
0GROSSE POINTE MANORNILESIL556129961-87.79297342.012012{\"spatialReference\": {\"wkid\": 4326}, \"x\": -87....
1MILLER'S MERRY MANORDUNKIRKIN0004643-85.19765140.392722{\"spatialReference\": {\"wkid\": 4326}, \"x\": -85....
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "\n", + " Total Number of Occupied Beds LONGITUDE LATITUDE \\\n", + "0 61 -87.792973 42.012012 \n", + "1 43 -85.197651 40.392722 \n", + "\n", + " SHAPE \n", + "0 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -87.... \n", + "1 {\"spatialReference\": {\"wkid\": 4326}, \"x\": -85.... " + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "cms_sedf.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.854484Z", + "start_time": "2021-11-11T22:36:07.802481Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsLONGITUDELATITUDESHAPE
0GROSSE POINTE MANORNILESIL556129961-87.79297342.012012{\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe...
1MILLER'S MERRY MANORDUNKIRKIN0004643-85.19765140.392722{\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe...
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "\n", + " Total Number of Occupied Beds LONGITUDE LATITUDE \\\n", + "0 61 -87.792973 42.012012 \n", + "1 43 -85.197651 40.392722 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -87.792973, \"y\": 42.012012, \"spatialRefe... \n", + "1 {\"x\": -85.197651, \"y\": 40.392722, \"spatialRefe... " + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create SeDF by reading from feather\n", + "sedf_fthr = pd.DataFrame.spatial.from_feather(\n", + " './sedf_data/cities/sample_cms_data.feather')\n", + "sedf_fthr.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:07.861486Z", + "start_time": "2021-11-11T22:36:07.856482Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "sedf_fthr.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created from _feather_ format data.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in Non-spatial Table data\n", + "\n", + "Non-spatial table data can be hosted on [**ArcGIS Online**](https://www.arcgis.com) or [**ArcGIS Enterprise**](http://enterprise.arcgis.com/en/), or it can be stored locally in a File Geodatabase. 
Such non-spatial table data can be read into a DataFrame using the following methods:\n", + "\n", + "- [`from_table()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_table) - for local data\n", + "- [`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) - for data hosted on ArcGIS Online or Enterprise\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Using the `from_table()` method\n", + "\n", + "Local non-spatial data can be read using the [`from_table()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_feather#arcgis.features.GeoAccessor.from_table) method. Because the table has no geometry, the result is a regular Pandas DataFrame with no geometry column. The method can read a CSV file (in any environment) or a table stored in a File Geodatabase (with ArcPy only).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Reading a csv file\n" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:08.546484Z", + "start_time": "2021-11-11T22:36:07.863481Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Provider NameProvider CityProvider StateResidents Total Admissions COVID-19Residents Total COVID-19 CasesResidents Total COVID-19 DeathsNumber of All BedsTotal Number of Occupied BedsLONGITUDELATITUDE
0GROSSE POINTE MANORNILESIL556129961-87.79297342.012012
1MILLER'S MERRY MANORDUNKIRKIN0004643-85.19765140.392722
\n", + "
" + ], + "text/plain": [ + " Provider Name Provider City Provider State \\\n", + "0 GROSSE POINTE MANOR NILES IL \n", + "1 MILLER'S MERRY MANOR DUNKIRK IN \n", + "\n", + " Residents Total Admissions COVID-19 Residents Total COVID-19 Cases \\\n", + "0 5 56 \n", + "1 0 0 \n", + "\n", + " Residents Total COVID-19 Deaths Number of All Beds \\\n", + "0 12 99 \n", + "1 0 46 \n", + "\n", + " Total Number of Occupied Beds LONGITUDE LATITUDE \n", + "0 61 -87.792973 42.012012 \n", + "1 43 -85.197651 40.392722 " + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create SeDF\n", + "tbl_df = pd.DataFrame.spatial.from_table(\n", + " filename='./sedf_data/cities/sample_cms_data.csv')\n", + "tbl_df.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> A Pandas DataFrame without any spatial information is returned.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Reading table from a File Geodatabase\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Note: The operation below can only be performed in an environment that contains arcpy.\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:08.615484Z", + "start_time": "2021-11-11T22:36:08.548486Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDNAMEOTHEROWNER_OCCPLACEFIPSPOP2010POPULATIONPOP_CLASSRENTER_OCCSTSTFIPSVACANTWHITE
01Ammon30732051601990138161518161271ID1627113002
12Blackfoot107727881607840118991194661441ID163189893
\n", + "
" + ], + "text/plain": [ + " OBJECTID NAME OTHER OWNER_OCC PLACEFIPS POP2010 POPULATION \\\n", + "0 1 Ammon 307 3205 1601990 13816 15181 \n", + "1 2 Blackfoot 1077 2788 1607840 11899 11946 \n", + "\n", + " POP_CLASS RENTER_OCC ST STFIPS VACANT WHITE \n", + "0 6 1271 ID 16 271 13002 \n", + "1 6 1441 ID 16 318 9893 " + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Create SeDF\n", + "tbl_df2 = pd.DataFrame.spatial.from_table(\n", + " filename=\"./sedf_data/cities/cities.gdb/cities_table_export\")\n", + "tbl_df2.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> A Pandas DataFrame without any spatial information is returned.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Using the `from_layer()` method\n", + "\n", + "Hosted non-spatial data can be read into a DataFrame using the [`from_layer()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?highlight=from_layer#arcgis.features.GeoAccessor.from_layer) method.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " sedf_major_cities_table\n", + " \n", + "

Table Layer by api_data_owner\n", + "
Last Modified: September 30, 2024\n", + "
0 comments, 3 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "tbl_item = agol_gis.content.get(\"019215fdda4b4b3eb5b4712f3b06f544\")\n", + "tbl_item" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Get the first table from the item\n", + "tbl = tbl_item.tables[0]\n", + "tbl" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDPLACEFIPSPOP2010POPULATIONPOP_CLASSSTFIPSCLASSObjectId2
0016019901381615181616city1
1116078401189911946616city2
\n", + "" + ], + "text/plain": [ + " OBJECTID PLACEFIPS POP2010 POPULATION POP_CLASS STFIPS CLASS \\\n", + "0 0 1601990 13816 15181 6 16 city \n", + "1 1 1607840 11899 11946 6 16 city \n", + "\n", + " ObjectId2 \n", + "0 1 \n", + "1 2 " + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "import pandas as pd\n", + "tbl_df2 = pd.DataFrame.spatial.from_layer(tbl)\n", + "tbl_df2.head(2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> A Pandas DataFrame without any spatial information is returned.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Read in data from '_lite and portable_' databases\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Geospatial data stored in a [mobile geodatabase](https://pro.arcgis.com/en/pro-app/latest/help/data/geodatabases/manage-mobile-gdb/mobile-geodatabases.htm) (.geodatabase) or a [SQLite Database](https://pro.arcgis.com/en/pro-app/latest/tool-reference/data-management/create-sqlite-database.htm) can be easily accessed using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor).\n", + "\n", + "- A mobile geodatabase (.geodatabase) is a collection of various types of GIS datasets contained in a single file on disk that can store, query, and manage spatial and nonspatial data. Mobile geodatabases are stored in an SQLite database.\n", + "\n", + "- SQLite is a full-featured relational database with the advantage of being portable and interoperable making it ubiquitous in mobile app development.\n", + "\n", + "The [`from_featureclass()`](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html?arcgis.features.GeoAccessor.from_featureclass#arcgis.features.GeoAccessor.from_featureclass) method can be used to create a `SeDF` by reading in data from these databases. 
Let's look at some examples.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Note: The operations below can only be performed in an environment that contains arcpy.\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Read from a mobile geodatabase\n" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.166989Z", + "start_time": "2021-11-11T22:36:11.692511Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 51)" + ] + }, + "execution_count": 70, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Reading from mobile geodatabase\n", + "mobile_gdb_df = pd.DataFrame.spatial.from_featureclass(\n", + " location=\"./sedf_data/cities/cities_mobile.geodatabase/main.cities\")\n", + "mobile_gdb_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.212993Z", + "start_time": "2021-11-11T22:36:12.171999Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDage_10_14age_15_19age_20_24age_25_34age_35_44age_45_54age_55_64age_5_9age_65_74...placefipspop2010populationpop_classrenter_occststfipsvacantwhiteSHAPE
011313105873420311767144611361503665...1601990138161518161271ID1627113002{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1289081781817991235133011431099721...1607840118991194661441ID163189893{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "

2 rows × 51 columns

\n", + "
" + ], + "text/plain": [ + " OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n", + "0 1 1313 1058 734 2031 1767 1446 \n", + "1 2 890 817 818 1799 1235 1330 \n", + "\n", + " age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n", + "0 1136 1503 665 ... 1601990 13816 15181 \n", + "1 1143 1099 721 ... 1607840 11899 11946 \n", + "\n", + " pop_class renter_occ st stfips vacant white \\\n", + "0 6 1271 ID 16 271 13002 \n", + "1 6 1441 ID 16 318 9893 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "\n", + "[2 rows x 51 columns]" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "mobile_gdb_df.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.230989Z", + "start_time": "2021-11-11T22:36:12.216993Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "mobile_gdb_df.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Read from a SQLite database\n" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.660988Z", + "start_time": "2021-11-11T22:36:12.232993Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3886, 51)" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Reading from sqlite database\n", + "sqlite_df = 
pd.DataFrame.spatial.from_featureclass(\n", + " location=\"./sedf_data/cities/cities.sqlite/main.cities\")\n", + "sqlite_df.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.681991Z", + "start_time": "2021-11-11T22:36:12.664992Z" + } + }, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
OBJECTIDage_10_14age_15_19age_20_24age_25_34age_35_44age_45_54age_55_64age_5_9age_65_74...placefipspop2010populationpop_classrenter_occststfipsvacantwhiteSHAPE
011313105873420311767144611361503665...1601990138161518161271ID1627113002{\"x\": -12462673.7237, \"y\": 5384674.994099997, ...
1289081781817991235133011431099721...1607840118991194661441ID163189893{\"x\": -12506251.314, \"y\": 5341537.793499999, \"...
\n", + "

2 rows × 51 columns

\n", + "
" + ], + "text/plain": [ + " OBJECTID age_10_14 age_15_19 age_20_24 age_25_34 age_35_44 age_45_54 \\\n", + "0 1 1313 1058 734 2031 1767 1446 \n", + "1 2 890 817 818 1799 1235 1330 \n", + "\n", + " age_55_64 age_5_9 age_65_74 ... placefips pop2010 population \\\n", + "0 1136 1503 665 ... 1601990 13816 15181 \n", + "1 1143 1099 721 ... 1607840 11899 11946 \n", + "\n", + " pop_class renter_occ st stfips vacant white \\\n", + "0 6 1271 ID 16 271 13002 \n", + "1 6 1441 ID 16 318 9893 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -12462673.7237, \"y\": 5384674.994099997, ... \n", + "1 {\"x\": -12506251.314, \"y\": 5341537.793499999, \"... \n", + "\n", + "[2 rows x 51 columns]" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check head\n", + "sqlite_df.head(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "ExecuteTime": { + "end_time": "2021-11-11T22:36:12.697990Z", + "start_time": "2021-11-11T22:36:12.685991Z" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['point']" + ] + }, + "execution_count": 75, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Check geometry type\n", + "sqlite_df.spatial.geometry_type" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> The `spatial` namespace shows that a _Spatially enabled DataFrame_ has been created.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusion\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this guide, we explored how [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor) (SeDF) can be used to read spatial data from various formats. We started by reading data from web feature layers and using the `query()` operation to optimize performance and results. 
We then read data from local sources such as file geodatabases and shapefiles. Next, we explained how data with address or coordinate information, data in a GeoPandas dataframe, or data in Feather format can be used to create a SeDF. We also covered creating a DataFrame from non-spatial table data, both local and hosted. Finally, we showed how a SeDF can be created from lite and portable databases, namely mobile geodatabases and SQLite databases.\n", + "\n", + "In the next part of the guide series, you will learn about exporting data using the [**Spatially enabled DataFrame**](https://developers.arcgis.com/python/api-reference/arcgis.features.toc.html#geoaccessor).\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
\n", + " Note: Given the importance and popularity of the Spatially enabled DataFrame, we are revisiting our documentation for this topic. Our goal is to enhance the existing documentation to showcase the various capabilities of the Spatially enabled DataFrame in detail, with even more examples this time.\n", + "\n", + "Creating quality documentation is time-consuming and painstaking, but we are committed to providing you with the best experience possible. With that in mind, we will be rolling out the revamped guides on this topic as different parts of a guide series (like the Data Engineering or Geometry guide series). This is \"part-2\" of the guide series for the Spatially enabled DataFrame. You will continue to see the existing documentation as we revamp it to add new parts. Stay tuned for more on this topic.\n", + "\n", + "
\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.0" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": true, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": true, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "360.188px" + }, + "toc_section_display": true, + "toc_window_display": true + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}