Monday, November 16, 2020

Texture analysis by DiRDiP

Imagine you have a set of points evolving in time. It can be the center of colloidal particles in a suspension, of cell nuclei in a tissue, of bubbles in a foam, of people in a crowd. You also have a way of defining which point is bonded to which one. Two people are bonded if they hold hands. Two nuclei are bonded if their respective cell membranes touch. Two bubbles are bonded if they share a soap film. Two colloidal particles are bonded if they are within a certain distance from each other.

And now you make this dynamical, with people moving and handshaking around, bubbles deforming to flow through a funnel, colloidal suspension beeing sheared, tissue trying to fix the scar of a scalpel cut. Your points are moving and your bonds can stretch, rotate, break or be born.

A set of 9 positions with bonds between them, evolving in time. The positions can move, the bonds can appear (green) or disappear (red).

How would you characterize this mess? How to make statistically significant observations of such an ensemble of discreet events? How to deduce anything about the response properties of your system? 

One way is to translate the original discreet description of the evolution of the system (discreet points moving and discreet bonds deforming, appearing or disappearing) into a continuous description in terms of strain field. Even better, in terms of strain fields: the reversible part of the strain that correspond to the bonds that deform without breaking on one hand, and on the other hand the irreversible part of the strain that corresponds to the bonds that appear or disappear.

I've just released a piece of code that does just that.

It's based on a great method published by François Graner and Benjamin Dollet in 2008 that allow to go from the set of point to the mechanical properties of the system. This paper is great, pedestrian and extremely clear on every detail. I really recommend the read if you know a bit about strain tensors and continuous mechanics.

Unfortunately, the original implementation of this analysis in Delphi was lost and later implementations in Matlab were never brought to releasable form. That is why I reimplemented everything in Python - I hope in a clean enough way - so that others will be tempted to use this method. 

This new implementation relies only on Numpy and Numba for optimized calculations, and on Matplotlib to display the results. 
 
I've cooked up examples to show how to use Trackpy results stored in Pandas dataframes as positions and how scipy.spatial can be used to define bonds.
 
Following the title of the original paper, the code is named DiRDiP, for Discrete Rearranging Disordered Patterns.
 
The code can be cited using the DOI provided by Zenodo: 
Mathieu Leocmach, DiRDiP, https://doi.org/10.5281/zenodo.4276047 (2020)


What the code takes as input

  • Two arrays of coordinates at two different times. The dimensionality is arbitray, but for most applications 2D and 3D will be enough.
  • Two sets of bonds, each linking positions at one of the two times.
  • A grid on which to compute the continuous description. I've implemented the rectilinear grid in any dimension, the regular grid in any dimension, and the polar grid in 2D. Implementing other grids (spherical, cylindrical, etc) would be rather easy.

What the code outputs

Arrays of matrices, one matrix per grid element. Exemple of matrices (as defined in the paper):
  • statistical velocity gradient
  • statistical rotation rate
  • statistical topological rearrangement rate

There is a submodule dedicated to displaying such arrays of matrices in Matplotlib.

What the code does not do

  • Localize interesting points and link their positions in time. Use Trackpy for that, or what fits to your problem.
  • Decide which points are linked together. This dependes so much on the physics or biology of your system. Some possible geometrical criteria are implemented in scipy.spatial, like Voronoi neighbourhood, k-nearest neighbours, distance criterion, etc.
  • Decide how to average the results to obtain statistical significance. But you know, averaging numpy arrays is pretty easy. For example, if you have a seady state, you may want to average in time the arrays resulting from two successive time steps. If you have a spatial symmetry or invariance in your system, you may want to exploit it.

What I am proud of

  • Unit tests
  • Documentation of each function
  • Some examples

What it still lack to be perfect

  • A tutorial
  • A generated documentation 
  • More users