Project Description

"This blog is updated by the JISC-funded G3 Project (#jisc3g) team. We are building a framework for teaching and communicating relevant geographic concepts and data to learners from outside the world of geography and GIS. We think this blog will be of particular interest to those working or teaching in HE and FE and those interested in teaching and learning and e-learning."


Sunday, 31 July 2011

Reflective Teaching Practice (2) - Do I need to Know There is Such a Thing as an R-Tree Index?

I've recently been working on a paper about teaching database and spatial database concepts to GIS graduate students, using a self-paced tutorial, Free and Open Source Software (FOSS) and Open Data (PostgreSQL/PostGIS, QGIS and OpenStreetMap). One of the aims when setting up the tutorial in 2010 was to open the material to other students - i.e. to make it available to non-GIS specialists. At the time, I didn't realise that this would be exactly what this JISC G3 project is aiming to do too, but now I find myself faced with a similar question to the one raised previously on this blog:

Once we get to including material about Spatial Databases (in an advanced version of G3), what content is relevant to non-GIS specialists?

Here are my first thoughts:

What We Should Include:
I think that any material relating to databases should certainly explain their advantages - concurrent access, central storage, security, single source of truth.  We should also include information about data types - text, number, date, and in particular the spatial data type - how to store points, lines and polygons in the database. Connecting the GIS to the database and viewing and editing the data is also fundamental.  
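To make the data types point concrete, here is a minimal sketch of a table that mixes text, date and number columns with point, line and polygon geometries. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL/PostGIS, stores each geometry as Well-Known Text (WKT) in a plain text column rather than in a true spatial data type, and the table name, column names and coordinates are all hypothetical.

```python
import sqlite3

# In-memory database: a stand-in for a central PostgreSQL/PostGIS server.
conn = sqlite3.connect(":memory:")

# Hypothetical table mixing ordinary data types with a geometry column.
# Assumption: geometry is kept as WKT text; PostGIS would use a real
# geometry type here instead.
conn.execute(
    """
    CREATE TABLE features (
        id        INTEGER PRIMARY KEY,
        name      TEXT,
        surveyed  DATE,
        height_m  REAL,
        geom_wkt  TEXT
    )
    """
)

# Illustrative rows: one point, one line and one (closed) polygon.
rows = [
    ("Bus stop", "2011-07-01", 0.0, "POINT(-0.1340 51.5246)"),
    ("Footpath", "2011-07-02", 0.0,
     "LINESTRING(-0.1340 51.5246, -0.1330 51.5250)"),
    ("Building", "2011-07-03", 12.5,
     "POLYGON((-0.1340 51.5246, -0.1330 51.5246, -0.1330 51.5250, "
     "-0.1340 51.5250, -0.1340 51.5246))"),
]
conn.executemany(
    "INSERT INTO features (name, surveyed, height_m, geom_wkt) "
    "VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()


def geometry_type(wkt: str) -> str:
    """Return the WKT keyword (POINT, LINESTRING, POLYGON) of a geometry."""
    return wkt.split("(", 1)[0].strip().upper()


# Map each feature name to the kind of geometry stored for it.
summary = {
    name: geometry_type(wkt)
    for name, wkt in conn.execute(
        "SELECT name, geom_wkt FROM features ORDER BY id"
    )
}
print(summary)
```

For a non-specialist, the useful point is probably just that the geometry sits in a column alongside the ordinary attributes, so one row describes both what a feature is and where it is.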

[Figure: R-Tree Index]
As a more advanced topic, I would also include at least some SQL querying in the material - with links to further information, and some information on indexing and spatial indexing.  These days, students are sometimes dealing with quite large datasets and I think it will be useful for them to know how to improve the performance of their system. So, for me, the R-Tree should be there (yes, it is complicated so perhaps we can leave out the detail?)
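If we do leave out the detail, the core idea could still be shown in a few lines. The sketch below is a deliberately simplified, one-level version of the bounding-box grouping behind an R-Tree: nearby features are grouped, each group gets a minimum bounding rectangle, and a query skips every group whose rectangle it does not touch. It is not how PostGIS implements its spatial index; a real R-Tree nests rectangles over several levels and handles inserts and node splits. All names and coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A box is (min_x, min_y, max_x, max_y).
Box = Tuple[float, float, float, float]
Feature = Tuple[str, Box]


def intersects(a: Box, b: Box) -> bool:
    """True if the two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(boxes: List[Box]) -> Box:
    """Smallest box that encloses all the given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


@dataclass
class Leaf:
    mbr: Box                 # minimum bounding rectangle of the group
    entries: List[Feature]   # the features stored in this group


def build_leaves(features: List[Feature], capacity: int = 2) -> List[Leaf]:
    """Group features that are close in x into leaves of fixed capacity."""
    ordered = sorted(features, key=lambda f: f[1][0])
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        leaves.append(Leaf(union([box for _, box in group]), group))
    return leaves


def search(leaves: List[Leaf], query: Box) -> Tuple[List[str], int]:
    """Return matching feature names and how many features were examined."""
    hits, checked = [], 0
    for leaf in leaves:
        if not intersects(leaf.mbr, query):
            continue  # the whole group is skipped without looking inside
        for name, box in leaf.entries:
            checked += 1
            if intersects(box, query):
                hits.append(name)
    return hits, checked


# Six hypothetical features in three spatial clusters.
features = [
    ("A", (0, 0, 1, 1)), ("B", (1, 2, 2, 3)),
    ("C", (10, 10, 11, 11)), ("D", (12, 10, 13, 12)),
    ("E", (20, 0, 21, 1)), ("F", (22, 1, 23, 2)),
]
leaves = build_leaves(features)
hits, checked = search(leaves, (9, 9, 12.5, 11))
print(hits, checked)
```

The query finds its matches after examining only the features in one group rather than all six, which is the performance argument students with large datasets need to hear, even if they never see the tree itself.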

What Should Be Left Out:
[Figure: E-R Diagram]
I think that for most students there is no need to include concepts of Entity-Relationship diagrams.   Similarly, conceptual, logical and physical design and concepts related to normalisation should be excluded.  Most of the JISC G3 users will be using data that has already been modelled, and very few of them will be collecting data from scratch and therefore require tutoring on how to model the real world in a database.

As you can see, overall I found myself erring on the side of 'include', most likely as I am an "expert" in the subject and for me all of it is important.  I suspect that this dilemma will face us again and again as this project grows and more material is added.  This also highlights the importance of the use-cases and talking to end-users when developing this type of material.  

(With thanks to Kate Jones for the re-use of the title of one of her blog posts.) 


  1. On R-Tree: For graduate students in a spatial database course (not GIS 101), you definitely should discuss the concept. I wouldn't bother testing over it but I think it's important to teach grad students what makes spatial databases unique relative to other databases. Understanding R-Trees - or just that there are index types designed specifically for spatial data - is one of these things unique to a spatial database.

    As for normalization, I'd have to disagree, slightly. It should be covered for grad students but within the context of non-relational databases (NoSQL), perhaps on the last day of class. They will not be using non-relational databases now (especially with Esri products) but there's a good chance they will encounter it in coming years.

    Two other things that may be worth covering:

    1. Models of geospatial data other than the traditional points, lines and polygons. Look at either the database behind OpenStreetMap or SmallWorld.

    2. Semantic Web/LinkedData. This could also be a 1-class topic. It's pretty raw for geospatial data - for instance the OGC standard for GeoSPARQL is only in its initial review. But there's an interesting circularity to the discussion. Data in the Semantic Web is built on a graph network. Oracle actually built their RDF store on top of the graph network in Oracle Spatial. So not only can you discuss spatial on the Semantic Web but you can also discuss how spatial database concepts support the Semantic Web.

  2. Hi ebwolf

    Thanks for the comments - and in terms of teaching grad. students in GIS and related disciplines I agree with you 100% (although in reality due to time pressure normalisation usually gets relegated to 'optional extra reading' in my case and RDF is covered on another course entirely).

    What I perhaps haven't made clear enough in the blog post is that the question I was asking is what you would teach to non-GIS people, for example archaeology students, anthropologists, architects, civil engineers - whose specialisation is not GIS but who need to know how to use it appropriately to get their own work done. This is one of the open questions for this JISC G3 project.

    Your thoughts would be most welcomed...

  3. I think one important aspect that you only touched on in the article is what the users are going to do with the spatial database. This will have a massive influence on how much detail they need to know about the ins and outs of databases. For example, if people are just using existing data, or adding data through a well-designed entry system, they don't need to know as much about the database as someone designing one.

    Also, when teaching I would be aware of whether you are teaching a GIS course or a computer course - obviously the two overlap in areas such as this, but if the students are not expecting a computer course and then get one (be it databases, programming, etc.) they are likely to be less keen!

  4. Apparently a lot of traditional GIS people struggle with the OpenStreetMap data structure and how there isn't a schema, just freeform key/value tags to describe any geographic object in multiple ways. As a computer scientist this made sense to me though (it was designed more from a CompSci view).
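The freeform tagging described in the last comment can be sketched in a few lines. This is only an illustration, not the actual OpenStreetMap storage format: each object simply carries a dictionary of key/value tags, with no fixed set of columns. The ids and names below are hypothetical; the tag keys and values (amenity=pub, highway=residential, building=yes) follow common OpenStreetMap usage.

```python
# Each object carries whatever tags its mapper chose; there is no schema
# saying which keys must exist. Ids and names here are made up.
features = [
    {"id": 1, "type": "node",
     "tags": {"amenity": "pub", "name": "The Example Arms"}},
    {"id": 2, "type": "way",
     "tags": {"highway": "residential", "name": "Example Street"}},
    {"id": 3, "type": "way",
     "tags": {"building": "yes", "amenity": "pub"}},
    {"id": 4, "type": "node", "tags": {}},  # an untagged node is valid too
]


def with_tag(items, key, value=None):
    """Return ids of items that have the key (and the value, if given)."""
    return [
        item["id"]
        for item in items
        if key in item["tags"]
        and (value is None or item["tags"][key] == value)
    ]


pubs = with_tag(features, "amenity", "pub")   # same thing, tagged two ways
named = with_tag(features, "name")            # any object that has a name
print(pubs, named)
```

The two pubs show why this can be awkward for people used to fixed tables: one is a point and one is a building outline, yet both answer the same query because they share a tag, not a layer or a column.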