4. Current research

Geometric clustering of spatial data

This is a clustering technique for finite data in a metric space.

The basic idea is to form a geometric graph with the data points as vertices, joining two distinct vertices by an edge whenever they are at most a specified distance d apart.

We start with d large enough that all distinct points are connected (a complete graph) and decrease d in steps until the graph becomes disconnected.

The connected components at the first value of d that gives a disconnected graph are the initial clusters.

Then we rinse and repeat on the clusters.

The method can be sped up considerably by first arranging the points in a tree structure – for example in a quad-tree if the points lie in the plane.
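
The procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this research: it assumes Euclidean distance, a fixed decrement (the hypothetical parameter step), and a brute-force distance matrix rather than the tree structure mentioned above. The function names components and first_split are invented for the example.

```python
import numpy as np

def components(dist, d):
    """Label the connected components of the graph that has an edge
    between i and j whenever dist[i, j] <= d."""
    n = len(dist)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search from an unlabeled vertex
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i, j] <= d:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

def first_split(points, step):
    """Decrease d from the largest pairwise distance in fixed steps
    (an assumed schedule) until the graph first becomes disconnected."""
    pts = np.asarray(points, dtype=float)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    d = dist.max()  # at this d the graph is complete
    while d > 0:
        labels = components(dist, d)
        if max(labels) > 0:  # more than one component
            return d, labels
        d -= step
    return 0.0, components(dist, 0.0)

d, labels = first_split([(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)], step=0.5)
```

Applying first_split again to the points of each returned cluster gives the repeated refinement. The result depends on the chosen step size, since a coarse step can jump past the distance at which the graph first disconnects.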

Spatial analysis of diabetes data

The Centers for Disease Control and Prevention keep detailed data on the percentage of adults diagnosed with diabetes in each U.S. county, currently for the years 2004-2015.

We view this data as a set of points in 12-dimensional space – each point corresponding to a county, with the coordinates being the percentage of adults diagnosed with diabetes in each of the years 2004-2015.
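
As a rough illustration of this representation, the sketch below arranges per-county yearly percentages into an array with one row per county and one column per year. The input layout, the function name to_points, and the county names are assumptions made for the example, and the numbers are invented placeholders, not CDC figures.

```python
import numpy as np

YEARS = list(range(2004, 2016))  # the 12 years 2004 through 2015

def to_points(records):
    """Turn {county: {year: percent}} into sorted county names and an
    (n_counties, 12) array whose rows are points in 12-dimensional space."""
    names = sorted(records)
    pts = np.array(
        [[records[name][year] for year in YEARS] for name in names],
        dtype=float,
    )
    return names, pts

# Placeholder values only: invented for illustration, not CDC data.
toy = {
    "county_a": {year: 8.0 + 0.1 * (year - 2004) for year in YEARS},
    "county_b": {year: 11.0 for year in YEARS},
}
names, pts = to_points(toy)
```

Each row of pts is then one county's 12-year trajectory, so distances between rows compare counties across the whole period at once.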

We apply techniques of spatial data analysis, including various forms of clustering, to try to tease out relevant temporal and geographic aspects of this data.

The flexible stable core hypothesis

This is a developing theory of why some students are more capable of learning mathematics than others. It is joint work with Mercedes McGowen and builds on our years of work on flexible and inflexible learners of college-level mathematics.

An important recent impetus for this theory comes from the empirical findings of Marie Amalric and Stanislas Dehaene on the brain areas of expert mathematicians that are activated by mathematical language and signs.

Spectral cluster analysis of concept maps

Student concept maps in beginning college mathematics are analyzed as finite graphs, with the nodes being student-defined concepts and the edges being student-defined connections between concepts. We use spectral cluster analysis via the eigenvalues and eigenvectors of the (unnormalized) Laplacian of these graphs, and examine the clusters for evidence of stability over time. This is joint work with Mercedes McGowen.
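
To make the Laplacian step concrete, here is a minimal sketch that assumes an undirected, unweighted graph given as a symmetric adjacency matrix. It forms the unnormalized Laplacian L = D - A and splits the nodes into two groups by the sign of the eigenvector for the second-smallest eigenvalue. That sign-based split is one simple variant chosen for illustration, not necessarily the procedure used in this work, and the function name spectral_bipartition is hypothetical.

```python
import numpy as np

def spectral_bipartition(adj):
    """Return the Laplacian eigenvalues (ascending) and a 0/1 label per node
    from the sign of the second eigenvector of L = D - A."""
    a = np.asarray(adj, dtype=float)
    lap = np.diag(a.sum(axis=1)) - a   # unnormalized Laplacian
    vals, vecs = np.linalg.eigh(lap)   # eigh: symmetric input, sorted eigenvalues
    second = vecs[:, 1]                # eigenvector of the second-smallest eigenvalue
    return vals, (second >= 0).astype(int)

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0
vals, labels = spectral_bipartition(adj)
```

On this toy graph the two triangles receive different labels. Which triangle gets 0 and which gets 1 is arbitrary, because an eigenvector is only determined up to sign. The number of eigenvalues equal to zero (up to rounding) equals the number of connected components of the graph.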

Statistics and data science education

Currently this is joint work with Donghui Yan. We have described an activity theory focus in the construction of a first course in data science for undergraduates.