A hands-on guide to LASSO-Python D3plots


Want to create automatic reports, plot how nodes move around, or perform machine learning analysis with data from within a d3plot? Then you are in the right spot. This guide will get you started quickly and teach you the basic principles of lasso-python.

Why LASSO-Python?

In the past we provided access to LS-Dyna d3plots through our very first repository: qd-python. After gaining a lot of expertise, I decided a few years ago to move on and rewrite the entire d3plot reader in pure Python to squeeze out the maximum amount of performance possible. The result was the dyna module in lasso-python. In contrast to qd it is crazy fast, requires far less memory, and provides access to the underlying data in a modern, machine-learning-friendly fashion.

How to get started?

The same way as with every Python package:

python -m pip install lasso-python

Also don’t forget to take a look at the Documentation!

How to read a file now?

Just pass the filepath of the first d3plot to the D3plot constructor. It will find the associated state files automatically.

from lasso.dyna import D3plot, ArrayType, FilterType
d3plot = D3plot("path/to/d3plot")

Yes, it is as easy as it looks.

How do I get to my data?

All your data is available as numpy arrays in a dictionary accessible under d3plot.arrays. This is part of the recipe that makes the library fast. You can get e.g. the node displacement of all nodes as follows:

>>> # Check whats inside!
>>> d3plot.arrays.keys()
dict_keys(['node_coordinates', ...])
>>> # take node displacement
>>> node_displ = d3plot.arrays["node_displacement"]
>>> # or do it cleaner with autocomplete
>>> node_displ = d3plot.arrays[ArrayType.node_displacement]
>>> # shape is (n_timesteps, n_nodes, x_y_z)
>>> # for general shape infos see ArrayType Docs
>>> print(node_displ.shape)
(1, 4915, 3)

In contrast to qd-python, no objects are created. Object orientation is comfortable since we can use an existing data structure for e.g. neighborhood search, but it comes at a high price: memory explodes easily with model size due to cross-references, and we always pay for something we don't use most of the time. Providing solely arrays seems uncomfortable at first, but one can accomplish the very same things as with an object structure; one just has to learn "the new way" to do so.
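To give a taste of this array-style thinking: the resultant displacement magnitude of every node at every timestep falls out of a single vectorized call, no loops required. A minimal sketch, with a synthetic random array standing in for d3plot.arrays[ArrayType.node_displacement]:

```python
import numpy as np

# synthetic stand-in for d3plot.arrays[ArrayType.node_displacement]
# shape: (n_timesteps, n_nodes, xyz)
node_displ = np.random.rand(5, 100, 3)

# resultant displacement magnitude for every node at every timestep,
# computed in one vectorized call instead of nested loops
displ_magnitude = np.linalg.norm(node_displ, axis=2)

print(displ_magnitude.shape)  # (5, 100)
```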

How to get a specific thing such as a node?

Internally, nodes are identified by indexes (counting from 0 to the number of nodes minus one) instead of ids. We can convert the id we usually use in postprocessors to an index and use it to find the node. This is a very common pattern for almost anything, including elements:

>>> # get node id array
>>> node_ids_array = d3plot.arrays[ArrayType.node_ids]
>>> # convert to list and search node
>>> node_id = 157814
>>> node_index = node_ids_array.tolist().index(node_id)
>>> # select the node's displacement over all timesteps
>>> node_displ[:, node_index]
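A side note: .tolist().index() performs a linear scan per lookup, which gets slow when you need many ids. Two common alternatives, sketched with a tiny synthetic id array (the ids shown are made up for illustration):

```python
import numpy as np

# synthetic stand-in for d3plot.arrays[ArrayType.node_ids]
node_ids_array = np.array([101, 157814, 204, 9])

# a dict gives O(1) lookups when many single ids are needed
index_of_id = {nid: i for i, nid in enumerate(node_ids_array)}
print(index_of_id[157814])  # 1

# or fully vectorized for a whole batch of ids at once:
# search in the sorted id array, then map back to original indexes
order = np.argsort(node_ids_array)
wanted_ids = np.array([9, 204])
indexes = order[np.searchsorted(node_ids_array[order], wanted_ids)]
print(indexes)  # [3 2]
```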

How do I filter a part?

To filter all nodes of a part we have to select the respective nodes from the global node array. One would therefore have to:

  1. Convert part ids to part indexes
  2. Find all elements belonging to the part using ArrayType.element_shell_part_indexes
  3. Take all the unique nodes of these elements using ArrayType.element_shell_node_indexes
  4. Use these node indexes to filter node_disp
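For the curious, these manual steps could look roughly as follows; the arrays here are tiny synthetic stand-ins, and anything beyond the array names mentioned above is an assumption for illustration:

```python
import numpy as np

# synthetic stand-ins for the real d3plot arrays
part_ids = np.array([10, 20, 21])                    # one id per part
element_shell_part_indexes = np.array([0, 1, 1, 2])  # one part index per shell
element_shell_node_indexes = np.array([              # 4 node indexes per shell
    [0, 1, 2, 3],
    [2, 3, 4, 5],
    [4, 5, 6, 7],
    [6, 7, 8, 9],
])

# 1. convert part ids to part indexes
wanted_part_ids = [20, 21]
part_indexes = [part_ids.tolist().index(pid) for pid in wanted_part_ids]

# 2. find all shells belonging to these parts
element_mask = np.isin(element_shell_part_indexes, part_indexes)

# 3. take the unique nodes of these shells
node_indexes = np.unique(element_shell_node_indexes[element_mask])
print(node_indexes)  # [2 3 4 5 6 7 8 9]

# 4. node_indexes would now be used to filter the displacement array
```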

Sounds annoying and complex? That’s why we implemented a function for it.

>>> # get a filter mask for nodes of a specific part
>>> node_filter = d3plot.get_part_filter(FilterType.NODE, part_ids=[20, 21])
>>> # filter the global nodes
>>> part_nodes_displ = node_displ[:, node_filter]
>>> print(part_nodes_displ.shape)
(1, 321, 3)

What are use-cases?

They are countless but here are some inspirations:

  • Extract data into excel sheets for reports
  • Create a visualization for a report
  • Extract data for optimizations
  • Write a script monitoring a simulation and terminating it
  • Train a Neural Network written in TensorFlow or Pytorch
  • Convert a d3plot into hdf5 since it is easier to use
  • Create visualizations for specific parts
  • Create your own tool with a UI in e.g. PyQt for domain-specific postprocessing
  • Compare simulation data with point cloud data from a test

lasso-python makes all of this surprisingly easy.

How to create a visualization?

Simply take e.g. the node data and plot it with e.g. matplotlib.

import matplotlib.pyplot as plt

# time axis of the simulation
timesteps = d3plot.arrays[ArrayType.global_timesteps]

# choose a node, here the last node
node_xyz = d3plot.arrays["node_displacement"][:, -1]
plt.plot(timesteps, node_xyz[:, 0], '-x', label="x [mm]")
plt.plot(timesteps, node_xyz[:, 1], '-x', label="y [mm]")
plt.plot(timesteps, node_xyz[:, 2], '-x', label="z [mm]")
plt.legend()
plt.show()
Figure: Displacement of a node in x, y, z over time.

Here we have it: the displacement of a node over time. This can be done with any result.

How to compute the Principal Stress Tensor?

This is an example of advanced usage. First, let's get the shell element stresses from a d3plot:

>>> stress = d3plot.arrays[ArrayType.element_shell_stress]
>>> stress.shape
(34, 52772, 16, 6)
>>> # (n_timesteps, n_shells, n_layers, xx_yy_zz_xy_yz_xz)

Computing the principal stress seems simple: loop over all timesteps, elements and layers and compute the principal stress for each tensor. But this is horribly slow. The golden rule of arrays in Python: don't loop. Use array syntax and array operations so the underlying vectorized math functions can take care of it.

Fortunately, numpy already offers an eigensolver through np.linalg.eig. Most importantly, according to the function description it computes the eigendecomposition over the last two dimensions and consequently loops automatically over all previous ones. Very handy, but there is still a problem: to compute the eigendecomposition in numpy we need a square matrix of shape (M, M). Thus we have to bring the stress tensor into shape first.

>>> import numpy as np
>>>
>>> # concat tensor components
>>> # (9 components, 34 timesteps, 52772 elems, 16 layers)
>>> stress = np.array([stress[:, :, :, 0], stress[:, :, :, 3], stress[:, :, :, 5],
...                    stress[:, :, :, 3], stress[:, :, :, 1], stress[:, :, :, 4],
...                    stress[:, :, :, 5], stress[:, :, :, 4], stress[:, :, :, 2]])
>>>
>>> # reshape 9 components into a matrix
>>> stress = stress.reshape((3, 3, *stress.shape[1:]))
>>> stress.shape
(3, 3, 34, 52772, 16)
>>>
>>> # we need to get 3, 3 into the rear,
>>> # thus reorder the axes with transpose
>>> stress = stress.transpose((2, 3, 4, 0, 1))
>>> stress.shape
(34, 52772, 16, 3, 3)
>>>
>>> # (optional) in case you want to mean
>>> # over the integration layers
>>> # you can do it here or after the computation
>>> stress = stress.mean(axis=2)
>>> stress.shape
(34, 52772, 3, 3)
>>>
>>> # finally compute the eigenvalues
>>> evals, evec = np.linalg.eig(stress)
>>> evals.shape
(34, 52772, 3)
>>> evec.shape
(34, 52772, 3, 3)

Most operations are cheap, except for the first one (a copy) and of course the eigendecomposition itself. Be aware that the values may differ between your postprocessor and here, since in my experience postprocessors treat layers differently (sometimes max or mean before the computation, sometimes mean after the computation, …).
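One more caveat when post-processing the result above: np.linalg.eig returns eigenvalues in no particular order, so extracting the major/minor principal stress requires sorting. For symmetric tensors such as stress, np.linalg.eigvalsh is the better fit anyway, since it guarantees real eigenvalues and returns them sorted ascending. A minimal sketch, with a synthetic tensor array standing in for the real stress data:

```python
import numpy as np

# synthetic stand-in: 2 timesteps, 3 elements, 3x3 stress tensors
rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 3, 3))
stress = (a + a.transpose(0, 1, 3, 2)) / 2  # symmetrize the last two axes

# eigvalsh is tailored to symmetric matrices: real eigenvalues,
# returned sorted in ascending order
evals = np.linalg.eigvalsh(stress)

# reverse the last axis so index 0 is the major principal stress
principal = evals[..., ::-1]
print(principal.shape)  # (2, 3, 3)
```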

My memory is limited, what to do?

You have several options in the D3plot constructor:

  1. n_files_to_load_at_once=1 (before v1.4.0)
  2. buffered_reading=True (v1.4.0+)
  3. state_array_filter = [...]
  4. state_filter=[...] (v1.4.0+)

Be aware that enabling any of these options causes an additional copy operation internally, degrading speed slightly. Option 2 basically replaces option 1; both limit the number of states loaded into memory at once. Option 3 allows you to extract only specific state arrays, lowering memory usage drastically. Note that geometry data is always read, since it might be required later on to read state data. Option 4 makes it possible to load only specific states. This avoids pulling data from disk into RAM and thus also saves loading time. Feel free to play around.
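Putting the v1.4.0+ options together, a memory-friendly load might look like the following sketch; treat the exact keyword signature as something to double-check against the documentation of your installed version:

```python
from lasso.dyna import D3plot, ArrayType

d3plot = D3plot(
    "path/to/d3plot",
    buffered_reading=True,                             # keep one state file in memory at a time
    state_array_filter=[ArrayType.node_displacement],  # read only this state array
    state_filter={0, 10, 20},                          # read only these states
)
```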

How fast is lasso-python?

Usually 4-10 times faster than postprocessors as well as qd-python, since lasso-python doesn't build any internal data structures but just references memory smartly. The required RAM is exactly the size of the d3plots on disk if you load everything; if you selectively load data, it will of course be much less.

How to achieve this amazing Performance with pure Python?

The core of qd-python was written in C++. It is surprising for many to hear that lasso-python relies on pure Python and is still significantly faster than qd-python. This is because lasso-python reduces memory copies to an absolute minimum thanks to Python's amazing memory slicing. Writing such code in C or C++ is not as easy as in Python. Be aware that lasso-python not only beats qd-python but also postprocessors, since postprocessors build up their own data structures and copy memory around, whereas lasso-python just references memory it loaded into RAM a single time.

Why is Performance important?

Desktop machines performing postprocessing are usually limited in network bandwidth, RAM, and CPU. Network is usually the biggest issue in bigger companies, where storage is not around the corner. So in order to significantly improve loading time, efficient I/O is crucial. Keeping RAM low at the same time allows us to process tons of d3plots in parallel on a single machine, which makes the whole technology very cloud-friendly and a best friend of machine learning.

How can I learn to code like this?

Besides my job, I'm a coach for people interested in coding. Usually I assess people's skills and advise them with tasks or projects suitable to enhance their skills, as well as offer coaching through code reviews and advice. In case you are interested, you can contact me through qd.eng.contact[at]gmail.com.
