In my PhD research, I do a lot of analysis of 2D and 3D grid data output by simulations I run. If, for example, I had a toy system that was 3x2x2 grid points, the raw data would be structured sort of like this:

```
x y z value
0 0 0 0.9
1 0 0 1.1
2 0 0 0.8
0 1 0 1.1
1 1 0 1.0
2 1 0 0.9
0 0 1 0.6
1 0 1 1.2
2 0 1 0.8
0 1 1 0.9
1 1 1 1.2
2 1 1 1.3
```

In my analyses, it's very helpful to restructure these data into a format where, in this case, `x = [0, 1, 2]`

, `y = [0, 1]`

, `z = [0, 1]`

, and `value`

is a 3D array such that `value[i, j, k]`

returns the value corresponding to position `(x[i], y[j], z[k])`

.

It's easy to do that in just a few lines. Say the above raw data is stored in `data.dat`

.

>>> import numpy as np >>> x, y, z, value = np.loadtxt('data.dat', skiprows=1).T >>> x, y, z = np.unique(x), np.unique(y), np.unique(z) >>> nx, ny, nz = len(x), len(y), len(z) >>> value = value.reshape((nz, ny, nx)).T

Note that if the raw data had `x`

varying the slowest and `z`

varying the fastest, the final line would look like `value = value.reshape((nx, ny, nz))`

.

Finally, if you want to go the other way, where you have your `x`

, `y`

, and `z`

arrays and 3D `values`

array, you can make use of the `sklearn.utils.extmath.cartesian`

function (first introduced on this StackOverflow post. If you want `z`

to be the fastest changing variable, it would look something like this:

```
>>> from sklearn.utils.extmath import cartesian
>>> import numpy as np
>>> x, y, z = [0, 1, 2], [0, 1], [0, 1]
>>> nx, ny, nz = len(x), len(y), len(z)
>>> value = np.arange(nx*ny*nz).reshape((nx,ny,nz)) # define 3D value array
>>> xyz = cartesian((z, y, x))
>>> value = value.flatten()
>>> np.hstack((xyz, value[:,None]))
array([[ 0, 0, 0, 0],
[ 0, 0, 1, 1],
[ 0, 1, 0, 2],
[ 0, 1, 1, 3],
[ 1, 0, 0, 4],
[ 1, 0, 1, 5],
[ 1, 1, 0, 6],
[ 1, 1, 1, 7],
[ 2, 0, 0, 8],
[ 2, 0, 1, 9],
[ 2, 1, 0, 10],
[ 2, 1, 1, 11]])
```

The `value[:,None]`

thing on the last line adds an extra dimension to the 1D value array so the elements of the tuple passed to `np.hstack`

are both 2D numpy arrays.

```
>>> value.shape
(12,)
>>> value[:,None].shape
(12, 1)
>>> value[None,:].shape
(1, 12)
```

## Comments

comments powered by Disqus