A brief introduction to Python for geocomputation

1 Overview

This is an introduction to the world of geospatial analysis, or geocomputation, using Python. Since you are in this class, you already have a good working knowledge of using Python for data analytics work. I’m sure you all have a basic familiarity with maps from using things like Google Maps, actual paper maps, and a globe. We’ll learn the basic geospatial concepts you need, such as coordinate reference systems and the main types of data used in this realm (raster and vector).

1.1 Acknowledgements

I have adapted the materials from the Introduction to Geospatial Raster and Vector Data with Python course, which is part of the Carpentries Incubator. The Carpentries Incubator is a hub for people creating open access course materials using the templates and other resources provided by The Carpentries. The goal of The Carpentries is:

We teach foundational coding and data science skills to researchers worldwide.

I have used materials from Software Carpentry in my Practical Computing for Data Analytics course.

All of the materials I’m using are licensed under CC-BY 4.0. Similarly, all of my materials in this module use the same CC-BY 4.0 license.

My original plan was to use the templates from The Carpentries for this module, as they are based on R Markdown. However, The Carpentries recently lost funding, which led to layoffs, and it doesn’t look like there are resources for continued development of the Carpentries Workbench. With the advent of Quarto, I was hoping that the Carpentries templates would evolve to support it, but that doesn’t seem to be in the cards. So, I’m using some of their content and ideas, but am developing this module using Quarto. Quarto is a computational authoring tool that supports multiple programming languages. While Quarto is the next generation of R Markdown, it also supports Python code chunks. You can find the GitHub repo for this module at https://github.com/misken/geonewb_python.

This module will also use the extremely good online texts:

1.2 Module structure

The module is organized as a series of sections, or chapters, that cover a range of topics. Each section includes a Jupyter notebook that you will work through to get hands-on practice with that topic. The notebooks are all based on an overarching case study involving analysis of land use on the Oakland University (OU) campus.

Here’s an overview of the topics we’ll cover in this module.

  • Introduction to Raster Data
    • what is it and what is it used to represent?
    • main Python libraries for working with raster data
    • basic plotting of raster data
    • Case Study: reading, exploring, plotting a raster file representing land use on the OU campus
  • Introduction to Vector Data
    • what is it and what is it used to represent?
    • main Python libraries for working with vector data
    • basic plotting of vector data
    • Case Study: reading, exploring, plotting a vector file representing various features on the OU campus such as buildings and rivers
  • Coordinate Reference Systems
    • this is a big, important, and complex topic
    • how do we represent something like location on our planet, that is inherently 3D, on a 2D surface such as a computer screen or a piece of paper? We use something called a projected coordinate reference system (CRS).
    • we will learn enough to be able to work effectively with data from different coordinate reference systems.
    • Case Study: check CRS of raster and vector data, reproject data from one CRS to another
  • The GIS Landscape
    • it’s a big landscape
    • software options
  • Accessing Satellite Imagery
    • use MS Planetary Computer to find and download satellite imagery
    • Case Study: Download and view Sentinel-2 images of OU campus
  • Working with Raster and Vector Data in Python
    • using rioxarray and xarray packages to read, explore, and visualize raster data
    • more use of GeoPandas for vector data manipulation
    • common raster operations such as cropping, dealing with no data values, reprojecting and resampling
    • raster computations such as NDVI
    • Case Study: analyze land use patterns
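
As a small preview of the raster computations in the last topic above, here is a minimal sketch of NDVI (the normalized difference vegetation index), which is computed cell by cell as (NIR - Red) / (NIR + Red). This is only an illustration in plain Python: the function name `ndvi`, the nested-list grids, and the reflectance values are all made up for this example. In the module itself we will do this kind of computation on raster arrays with rioxarray and xarray rather than on lists.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for two equally shaped 2D grids.

    Cells where NIR + Red is zero are returned as None (treated as no data).
    """
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            total = n + r
            # Avoid division by zero by marking the cell as no data
            row.append((n - r) / total if total != 0 else None)
        result.append(row)
    return result


# Hypothetical reflectance values, invented purely for illustration
nir_band = [[0.5, 0.4], [0.3, 0.0]]
red_band = [[0.1, 0.2], [0.3, 0.0]]

ndvi_grid = ndvi(nir_band, red_band)
for row in ndvi_grid:
    print(row)
```

NDVI values fall between -1 and 1, and higher values generally indicate denser, healthier vegetation. Notice that the cell with no signal in either band has to be handled explicitly, which is one reason dealing with no data values appears in the topic list.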

1.2.1 Software setup for the case study

You should already have the Anaconda distribution of Python installed and working. Due to the large number of libraries needed for geospatial analysis, we are also going to create and use a new Conda virtual environment for this module.

Step 1 - Download the geonewb.yml conda environment YAML file.

Step 2 - Open an Anaconda Prompt and change directories to the place where you downloaded geonewb.yml. Create the new environment with:

conda env create -f geonewb.yml

Warning: This can take several minutes. Be patient.

Later, when you are working in Jupyter notebooks for this module, you’ll select the geonewb virtual environment to use for your work.
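
Once the geonewb environment is selected in a notebook, a quick sanity check like the following can confirm that the notebook is really running in that environment and that the main libraries are importable. This is a sketch under assumptions: the package list below is my guess based on the libraries named in the topic overview (GeoPandas, rioxarray, xarray), not a copy of what geonewb.yml contains, and `find_missing` is a helper name invented for this example.

```python
import importlib.util
import sys

# Assumed package list; adjust it to match what geonewb.yml actually installs
EXPECTED_PACKAGES = ["geopandas", "rioxarray", "xarray"]


def find_missing(packages):
    """Return the names in `packages` that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# The executable path should point inside the geonewb environment folder
print("Python executable:", sys.executable)

missing = find_missing(EXPECTED_PACKAGES)
print("Missing packages:", missing if missing else "none")
```

If the executable path does not mention geonewb, or if packages are reported as missing, the notebook is probably using a different kernel or environment than the one created above.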