CSV to Shapefile conversion with Pandas, Fiona and Shapely
The Problem:
I’m sick of having to open Windows, then an ArcPy script (#firstworldproblems), just to convert a CSV to a Shapefile. And I also want to program it so when I get updated files I can just run a Python program rather than go through some point and clicks in QGIS. Though, I do think QGIS is awesome, I just don’t enjoy programming with it. I really enjoy working with Python’s R-like cousin Pandas.
Sean Gilles of MapBox - Fiona and Rasterio: Data Access for Python Programmers and Future Python Programmers talk at FOSS4G
This morning approaching lunch I started listening/watching to @SeanGilles’ talk from FOSS4G 2014 Fiona and Rasterio: Data Access for Python Programmers and Future Python Programmers — Sean Gillies, Mapbox. I didn’t even watch the whole thing before I started to see which Python modules I had.
I realized the only thing I was missing to write some scripts to do this task without having to use ArcPy was Fiona. Luckily, I found Installing Open Source Geo Software: Mac Edition which has a quick MaxOSX install command.
Installing Fiona
From Installing Open Source Geo Software: Mac Edition:
- Open your Terminal in mac (Ctrl + Spacebar then type Terminal to search)
-
Copy the code below and paste.
sudo ARCHFLAGS='-arch x86_64' pip install Fiona
- Type your password for installing software on the Mac - now you have Fiona installed!
Back to the task.
So now I have all the required modules; Fiona, Shapely and Pandas.
Dharhas Pothina’s Slides - where I got my code
For the code I went straight to Dharhas Pothina’s Slides from his talk: Python and GIS at TNRIS GeoRodeo - May 30th, 2014 and here’s his GitHub
Code to import a CSV into a Shapefile
I only slight modified his code, but here is a pretty straight-forward script for bringing a CSV into a DataFrame and then turning it into a Shapefile.
import pandas as pd
from shapely.geometry import Point, mapping
from fiona import collection
inCSV = '/Users/....../sites_xy_data.csv'
shpOut = '/Users/....../sites.shp'
lng = 'Longitude'
lat = 'Latitude'
schema = { 'geometry': 'Point', 'properties': { 'SITEID': 'str' } }
df = pd.read_csv(inCSV)
data = df
with collection(shpOut, "w", "ESRI Shapefile", schema) as output:
for index, row in data.iterrows():
point = Point(row[lng], row[lat])
output.write({
'properties': {'SITEID': row['SITEID']},
'geometry': mapping(point)
})
This was a fairly simple file with just a Uniquie ID (SITEID) and some Latitude, Longitude data. More complex files may come in later posts.
Next Steps
I’m really looking forward to using some of the Geoprocessing functions - checkout Tom MacWright Blog post - GIS with Python, Shapely, and Fiona - and comparing the run-time and output to ArcGIS.