Merge pull request #55 from yeesian/streaming-xml-parser
[WIP] support streaming osm files through libexpat
garborg committed Jan 5, 2015
2 parents 2293c61 + de29440 commit 367af58
Showing 17 changed files with 242 additions and 409 deletions.
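
For readers unfamiliar with the distinction this commit relies on, here is a minimal sketch of callback-based (streaming) XML parsing: handlers fire as the parser advances, and no document tree is kept in memory. It is written in Python against the standard-library `xml.parsers.expat` binding to libexpat, not against LibExpat.jl, whose API is not shown on this page. The function name `count_osm_elements` and the `SAMPLE` document are assumptions made purely for illustration. Skipping `visible="false"` elements mirrors the check in the code this commit removes.

```python
import xml.parsers.expat


def count_osm_elements(xml_bytes):
    """Count visible <node> and <way> elements without building a tree."""
    counts = {"node": 0, "way": 0}

    def start_element(name, attrs):
        # Called once per opening tag as the parser streams through the input.
        # Elements marked visible="false" are historic data and are skipped.
        if name in counts and attrs.get("visible") != "false":
            counts[name] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return counts


# Hypothetical OSM-style input, not taken from the repository.
SAMPLE = b"""<osm>
  <node id="1" lat="42.0" lon="-71.0"/>
  <node id="2" lat="42.1" lon="-71.1" visible="false"/>
  <way id="10"><nd ref="1"/><tag k="building" v="yes"/></way>
</osm>"""

print(count_osm_elements(SAMPLE))
```

For large files the same parser can be fed in pieces through repeated `Parse(chunk, False)` calls, so the parser never needs the whole document in memory at once.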
1 change: 1 addition & 0 deletions README.md
@@ -39,6 +39,7 @@ Pkg.add("OpenStreetMap")
##### Dependencies
The following packages should automatically be added as "additional packages", if you do not already have them:
* LightXML.jl: parsing OpenStreetMap datafiles
+* LibExpat.jl: streaming OpenStreetMap datafiles
* Winston.jl: map plotting
* Graphs.jl: map routing

1 change: 1 addition & 0 deletions REQUIRE
@@ -1,5 +1,6 @@
julia 0.3-
Compat
LightXML
+LibExpat
Winston
Graphs
13 changes: 2 additions & 11 deletions docs/data.rst
@@ -4,18 +4,12 @@ Reading OSM Data

OpenStreetMap data is available in a variety of formats. However, the easiest and most common to work with is the OSM XML format. OpenStreetMap.jl makes reading data from these files easy and straightforward:

-.. py:function:: getOSMData(filename::String [, nodes=false, highways=false, buildings=false, features=false])
+.. py:function:: getOSMData(filename::String)
Inputs:
* Required:

* ``filename`` [``String``]: Filename of OSM datafile.
-* Optional:
-
-* ``nodes`` [``Bool``]: ``true`` to read node data
-* ``highways`` [``Bool``]: ``true`` to read highway data
-* ``buildings`` [``Bool``]: ``true`` to read building data
-* ``features`` [``Bool``]: ``true`` to read feature data

Outputs:
* ``nodes`` [``false`` or ``Dict{Int,LLA}``]: Dictionary of node locations
@@ -29,10 +23,7 @@ These four outputs store all data from the file. ``highways``, ``buildings``, an

.. code-block:: python
-nodes, hwys, builds, feats = getOSMData(MAP_FILENAME, nodes=true, highways=true, buildings=true, features=true)``
-**Usage Notes:**
-Reading data is generally very fast unless your system runs out of memory. This is because LightXML.jl loads the entire xml file into memory as a tree rather than streaming it. A 150 MB OSM file seems to take up about 2-3 GB of RAM on my machine, so load large files with caution.
+nodes, hwys, builds, feats = getOSMData(MAP_FILENAME)
Extracting Intersections
------------------------
2 changes: 1 addition & 1 deletion docs/examples.rst
@@ -5,7 +5,7 @@ Read data from an OSM XML file:

.. code-block:: python
-nodes, hwys, builds, feats = getOSMData(MAP_FILENAME, nodes=true, highways=true, buildings=true, features=true)
+nodes, hwys, builds, feats = getOSMData(MAP_FILENAME)
println("Number of nodes: $(length(nodes))")
println("Number of highways: $(length(hwys))")
28 changes: 14 additions & 14 deletions docs/types.rst
@@ -92,10 +92,10 @@ Region boundaries include the minimum and maximum latitude and longitude of a re
.. code-block:: python
type Bounds
-    min_y # min_lat or min_north
-    max_y # max_lat or max_north
-    min_x # min_lon or min_east
-    max_x # max_lon or max_east
+    min_y::Float64 # min_lat or min_north
+    max_y::Float64 # max_lat or max_north
+    min_x::Float64 # min_lon or min_east
+    max_x::Float64 # max_lon or max_east
end
Point Types
@@ -110,13 +110,13 @@ Used to store node data in OpenStreetMap XML files.
.. code-block:: python
type LLA
-    lat
-    lon
-    alt
+    lat::Float64
+    lon::Float64
+    alt::Float64
end
Because OpenStreetMap typically does not store altitude data, the following alias is available for convenience:
-``LLA(lat, lon) = LLA(lat, lon, 0)``
+``LLA(lat, lon) = LLA(lat, lon, 0.0)``

Earth-Centered-Earth-Fixed (ECEF) Coordinates
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -126,9 +126,9 @@ Global cartesian coordinate system rotating with the Earth.
.. code-block:: python
type ECEF
-    x
-    y
-    z
+    x::Float64
+    y::Float64
+    z::Float64
end
East-North-Up (ENU) Coordinates
@@ -139,9 +139,9 @@ Local cartesian coordinate system, centered on a reference point.
.. code-block:: python
type ENU
-    east
-    north
-    up
+    east::Float64
+    north::Float64
+    up::Float64
end
Additional Types
1 change: 1 addition & 0 deletions src/OpenStreetMap.jl
@@ -7,6 +7,7 @@
module OpenStreetMap

import LightXML
+import LibExpat
import Winston
import Graphs
import Compat
82 changes: 0 additions & 82 deletions src/buildings.jl
@@ -2,88 +2,6 @@
### MIT License ###
### Copyright 2014 ###

-### Create list of all buildings in OSM file ###
-function getBuildings(street_map::LightXML.XMLDocument)
-
-    xroot = LightXML.root(street_map)
-    ways = LightXML.get_elements_by_tagname(xroot, "way")
-
-    buildings = Dict{Int,Building}()
-
-    for way in ways
-
-        if LightXML.has_attribute(way, "visible")
-            if LightXML.attribute(way, "visible") == "false"
-                # Visible=false indicates historic data, which we will ignore
-                continue
-            end
-        end
-
-        # Search for tag with k="building"
-        for tag in LightXML.child_elements(way)
-            if LightXML.name(tag) == "tag"
-                if LightXML.has_attribute(tag, "k")
-                    k = LightXML.attribute(tag, "k")
-                    if k == "building"
-                        class = ""
-                        if LightXML.has_attribute(tag, "v")
-                            class = LightXML.attribute(tag, "v")
-                        end
-
-                        id = int(LightXML.attribute(way, "id"))
-                        buildings[id] = getBuildingData(way, class)
-                        break
-                    end
-                end
-            end
-        end
-    end
-
-    return buildings
-end
-
-### Gather highway data from OSM element ###
-function getBuildingData(building::LightXML.XMLElement, class::String="")
-    nodes = Int[]
-    class = ""
-    building_name = ""
-
-    # Get way ID
-    # id = int64(LightXML.attribute(building, "id"))
-
-    # Iterate over all "label" fields
-    for label in LightXML.child_elements(building)
-
-        if LightXML.name(label) == "tag" && LightXML.has_attribute(label, "k")
-            k = LightXML.attribute(label, "k")
-
-            # If not yet set, find the class type
-            if isempty(class) && k == "building"
-                if LightXML.has_attribute(label, "v")
-                    class = LightXML.attribute(label, "v")
-                    continue
-                end
-            end
-
-            # Check if building has a name
-            if isempty(building_name) && k == "name"
-                if LightXML.has_attribute(label, "v")
-                    building_name = LightXML.attribute(label, "v")
-                    continue
-                end
-            end
-        end
-
-        # Collect associated nodes
-        if LightXML.name(label) == "nd" && LightXML.has_attribute(label, "ref")
-            push!(nodes, int64(LightXML.attribute(label, "ref")))
-            continue
-        end
-    end
-
-    return Building(class, building_name, nodes)
-end
-
### Classify buildings ###
function classify(buildings::Dict{Int,Building})
bdgs = Dict{Int,Int}()
70 changes: 0 additions & 70 deletions src/features.jl
@@ -2,76 +2,6 @@
### MIT License ###
### Copyright 2014 ###

-### Create list of all features in OSM file ###
-function getFeatures(street_map::LightXML.XMLDocument)
-
-    xroot = LightXML.root(street_map)
-    nodes = LightXML.get_elements_by_tagname(xroot, "node")
-    features = Dict{Int,Feature}()
-
-    for node in nodes
-
-        if LightXML.has_attribute(node, "visible") &&
-           LightXML.attribute(node, "visible") == "false"
-            # Visible=false indicates historic data, which we will ignore
-            continue
-        end
-
-        # Search for tag giving feature information
-        for tag in LightXML.child_elements(node)
-            if LightXML.name(tag) == "tag" && LightXML.has_attribute(tag, "k")
-                k = LightXML.attribute(tag, "k")
-                if haskey(FEATURE_CLASSES, k)
-                    id = int(LightXML.attribute(node, "id"))
-                    features[id] = getFeatureData(node)
-                    break
-                end
-            end
-        end
-    end
-
-    return features
-end
-
-### Gather feature data from OSM element ###
-function getFeatureData(node::LightXML.XMLElement)
-
-    class = ""
-    detail = ""
-    feature_name = ""
-
-    # Get node ID
-    # id = int64(LightXML.attribute(node, "id"))
-
-    # Iterate over all "label" fields
-    for label in LightXML.child_elements(node)
-
-        if LightXML.name(label) == "tag" && LightXML.has_attribute(label, "k")
-            k = LightXML.attribute(label, "k")
-
-            # If empty, find the class type
-            if isempty(class) && LightXML.has_attribute(label, "v")
-                v = LightXML.attribute(label, "v")
-                if haskey(FEATURE_CLASSES, k)
-                    class = k
-                    detail = v
-                    continue
-                end
-            end
-
-            # Check if feature has a name
-            if isempty(feature_name) && k == "name"
-                if LightXML.has_attribute(label, "v")
-                    feature_name = LightXML.attribute(label, "v")
-                    continue
-                end
-            end
-        end
-    end
-
-    return Feature(class, detail, feature_name)
-end
-
### Classify features ###
function classify(features::Dict{Int,Feature})
feats = Dict{Int,Int}()