Browsing waveforms with the WaveformBrowser

This tutorial demonstrates several ways to use the WaveformBrowser to examine waveform data. It consists of several examples of increasing complexity and uses LEGEND test data from pylegendtestdata. The WaveformBrowser is a dspeed utility for interactively accessing waveforms from raw files, letting you access, draw, or even process them. Use cases include investigating a population of waveforms and debugging waveform processors.

Why do we need a waveform browser when we can access data via Pandas dataframes? Pandas dataframes work extremely well for reading tables of simple values from multiple HDF5 files. However, they are a poor fit for waveforms, because they require holding all waveforms in memory at once. If we want to look at waveforms spread across multiple files, this can take up gigabytes of memory and cause problems. To get around this, we want to load only small parts of the files into memory at a time and pull out only what we need. Since doing this by hand is inconvenient, the WaveformBrowser does it for you while hiding the details as much as possible.
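The chunked-reading idea can be sketched without any LEGEND-specific code. The example below is only an illustration and is not how the WaveformBrowser is implemented: `read_chunk` is a hypothetical stand-in for a file read that returns a small block of synthetic waveforms, and the sizes are arbitrary.

```python
import numpy as np

N_ENTRIES = 1000  # total waveforms "on disk" (synthetic)
WF_LEN = 256      # samples per waveform
CHUNK = 100       # waveforms held in memory at a time


def read_chunk(start, stop):
    """Hypothetical stand-in for reading rows [start, stop) from a file."""
    rng = np.random.default_rng(start)  # deterministic content per chunk
    return rng.normal(size=(stop - start, WF_LEN))


def select_waveforms(wanted):
    """Collect only the wanted entries, reading one chunk at a time."""
    wanted = np.sort(np.asarray(wanted))
    kept = []
    for start in range(0, N_ENTRIES, CHUNK):
        stop = min(start + CHUNK, N_ENTRIES)
        in_chunk = wanted[(wanted >= start) & (wanted < stop)]
        if in_chunk.size == 0:
            continue  # nothing needed from this chunk, so skip the read
        chunk = read_chunk(start, stop)
        kept.append(chunk[in_chunk - start])  # keep only the requested rows
    return np.concatenate(kept)


picked = select_waveforms([5, 250, 251, 999])
print(picked.shape)  # (4, 256)
```

Only three of the ten chunks are ever read here, and at most one chunk is in memory at any time.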

Let’s start by importing necessary modules and test data:

[1]:
%matplotlib inline

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import os, json
import pint

u = pint.get_application_registry()

from lgdo import lh5
from dspeed.vis.waveform_browser import WaveformBrowser
from legendtestdata import LegendTestData

ldata = LegendTestData()
raw_file = ldata.get_path("lh5/LDQTA_r117_20200110T105115Z_cal_geds_raw.lh5")

plt.rcParams["figure.figsize"] = (14, 4)
plt.rcParams["figure.facecolor"] = "white"
plt.rcParams["font.size"] = 14

Basic browsing

First, a minimal example that simply draws waveforms from the raw file. Let’s create a minimal browser and draw the waveform at entry 50:

[2]:
browser = WaveformBrowser(raw_file, "geds/raw")
browser.draw_entry(50)
../_images/notebooks_WaveformBrowser_3_0.png

To draw multiple waveforms in a single figure, provide a list of indices:

[3]:
browser.draw_entry([64, 82, 94])
../_images/notebooks_WaveformBrowser_5_0.png

Now draw the next waveform in the file. You can run this cell multiple times to scroll through many waveforms:

[4]:
browser.draw_next();
../_images/notebooks_WaveformBrowser_7_0.png

Filtering waveforms

Ok, that was nice, but how often do we just want to scroll through all of our waveforms?

For our next example, we will select a population of waveforms from within the files, and draw multiple at once. Selecting a population of events to draw uses the same syntax as NumPy and Pandas, and can be done either with a list of entries or a boolean NumPy array. This selection can be made using data from a DSP file (or higher tiers).
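As a small illustration of the two selection styles, the sketch below builds a boolean mask and the equivalent list of entry indices with plain NumPy. The energy values are made up for the example.

```python
import numpy as np

# Made-up energies, one per entry
energy = np.array([1200.0, 15000.0, 800.0, 22000.0, 31000.0, 10500.0])

# Boolean mask: one True/False per entry
mask = (energy > 10000) & (energy < 30000)

# Equivalent list of entry indices
entries = np.flatnonzero(mask)

print(entries.tolist())  # [1, 3, 5]

# Both forms select the same entries
assert np.array_equal(energy[mask], energy[entries])
```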

We will also learn how to set a few other properties of the figure.

Let’s quickly produce a DSP file (see tutorial on running DSP):

[5]:
from dspeed import build_dsp

dsp_file = "LDQTA_r117_20200110T105115Z_cal_geds_dsp.lh5"
build_dsp(raw_file, dsp_file, "metadata/dsp-config.json", write_mode="r")
/home/docs/checkouts/readthedocs.org/user_builds/dspeed/envs/stable/lib/python3.10/site-packages/dspeed/processing_chain.py:1453: RuntimeWarning: invalid value encountered in divide
  self.processor(*self.args, **self.kwargs)

Now, we load a Pandas DataFrame from the file that we can use to make our selection:

[6]:
# load_dfs() is deprecated; read_as(..., "pd") returns the requested columns as a DataFrame
df = lh5.read_as("geds/dsp", dsp_file, "pd", field_mask=["trapEmax", "AoE"])

We use Pandas’ boolean comparison syntax to create a selection mask around high-energy events:

[7]:
energy = df["trapEmax"]
energy_selection = (energy > 10000) & (energy < 30000)

Let’s have a look at them:

[8]:
energy.hist(bins=200, range=(0, 30000))
energy[energy_selection].hist(bins=200, range=(0, 30000))
plt.xlabel("energy [a.u.]");
../_images/notebooks_WaveformBrowser_15_0.png

We then construct a WaveformBrowser with this cut:

[9]:
browser = WaveformBrowser(
    raw_file,
    "geds/raw",
    aux_values=df,
    legend="E = {trapEmax}",  # values to put in the legend
    x_lim=(40 * u.us, 50 * u.us),  # range for time-axis
    entry_mask=energy_selection,  # apply cut
    n_drawn=5,  # number to draw for draw_next
)

And finally draw a few batches of 5 waveforms each (as set by n_drawn):

[10]:
for entries, i in zip(browser, range(2)):
    browser.new_figure()
../_images/notebooks_WaveformBrowser_19_0.png
../_images/notebooks_WaveformBrowser_19_1.png
../_images/notebooks_WaveformBrowser_19_2.png

If you can use interactive plots, you can replace browser.new_figure() with e.g. plt.pause(1) to draw a slideshow!
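One detail of the loop above is worth knowing: `zip` pulls from its first argument before it checks the second, so `zip(browser, range(2))` advances the browser a third time before `range(2)` runs out. Assuming the browser draws a batch each time it is advanced as an iterator, that would be consistent with the three figures shown. The sketch below demonstrates the behaviour with a plain generator (`batches` is a hypothetical stand-in, not part of dspeed) and shows `itertools.islice` as one way to advance exactly twice.

```python
from itertools import islice


def batches(log):
    """Stand-in for a browser: each advance 'draws' a batch and records it."""
    n = 0
    while True:
        n += 1
        log.append(n)  # side effect, analogous to drawing a figure
        yield n


zip_log = []
for batch, i in zip(batches(zip_log), range(2)):
    pass
print(len(zip_log))  # 3: the generator is advanced once more before range(2) runs out

islice_log = []
for batch in islice(batches(islice_log), 2):
    pass
print(len(islice_log))  # 2
```

Swapping the arguments, as in `zip(range(2), browser)`, has the same effect as `islice` here, because `zip` then stops as soon as `range(2)` is exhausted.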

Visualizing waveform transforms

Now we’ll shift from drawing populations of waveforms to drawing waveform transforms. We can draw any waveform defined in a DSP JSON configuration file, which is useful for debugging and for developing processors. We will draw the baseline-subtracted waveform, the pole-zero corrected waveform, and the trapezoidal filter waveform. We will also draw horizontal and vertical lines for trapEmax (the maximum of the trapezoid) and tp_0 (our estimate of the start of the waveform’s rise). The browser determines whether each of these lines should be horizontal or vertical based on its unit.
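For readers unfamiliar with these three transforms, here is a minimal NumPy sketch of what each one does to an idealized pulse. It is not the dspeed implementation: the variable names only mirror the config names used in this tutorial, the waveform is synthetic and noise-free, and the filter parameters are arbitrary.

```python
import numpy as np

n_samples, t0 = 2000, 500                  # waveform length and step position (samples)
baseline, amp, tau = 100.0, 1000.0, 400.0  # offset, step height, decay constant (samples)

# Synthetic waveform: flat baseline, then a step that decays exponentially
n = np.arange(n_samples)
wf = np.full(n_samples, baseline)
wf[t0:] += amp * np.exp(-(n[t0:] - t0) / tau)

# 1) Baseline subtraction: remove the mean of the pre-step samples
wf_blsub = wf - wf[: t0 // 2].mean()

# 2) Pole-zero correction: undo the exponential decay so the step stays flat
#    y[n] = y[n-1] + x[n] - a * x[n-1], with a = exp(-1 / tau)
a = np.exp(-1.0 / tau)
previous = np.concatenate(([0.0], wf_blsub[:-1]))
wf_pz = np.cumsum(wf_blsub - a * previous)

# 3) Trapezoidal filter: average over `rise` samples minus the average
#    over an earlier window of the same length, separated by `flat` samples
rise, flat = 100, 50
kernel = np.concatenate((np.ones(rise), np.zeros(flat), -np.ones(rise))) / rise
wf_trap = np.convolve(wf_pz, kernel)[:n_samples]

print(round(wf_trap.max(), 3))  # 1000.0, the step height
```

With this normalization the maximum of the trapezoid recovers the step height, which is why a quantity like trapEmax can serve as an energy estimate. The actual scale of trapEmax in the tutorial depends on the DSP configuration.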

[11]:
browser = WaveformBrowser(
    raw_file,
    "geds/raw",
    dsp_config="metadata/dsp-config.json",  # Need to include a dsp config file!
    database={"pz_const": "180*us"},
    lines=[
        "wf_blsub",
        "wf_pz",
        "wf_trap",
        "trapEmax",
        "tp_0",
    ],  # names of waveforms from dsp config file
    styles=[
        {"ls": ["-"], "c": ["orange"]},
        {"ls": [":"], "c": ["green"]},
        {"ls": ["--"], "c": ["blue"]},
        {"lw": [0.5], "c": ["black"]},
        {"lw": [0.5], "c": ["red"]},
    ],
    legend=[
        "Waveform",
        "PZ Corrected",
        "Trap Filter",
        "Trap Max = {trapEmax}",
        "t0 = {tp_0}",
    ],
    legend_opts={"loc": "upper left"},
    x_lim=("35*us", "75*us"),  # x axis range
)

[12]:
browser.draw_next();
../_images/notebooks_WaveformBrowser_23_0.png

Comparing waveforms

Here’s a more advanced example that combines the previous two. We will draw waveforms from multiple populations for the sake of comparison. This will require creating two separate browsers and drawing them onto the same axes. We’ll also normalize and baseline-subtract the waveforms from parameters in a DSP file. Finally, we’ll add some formatting options to the lines and legend.
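To see why normalization helps when comparing populations, consider this toy sketch. It uses synthetic pulses and divides by the waveform maximum as a stand-in for an energy estimate such as trapEmax. The helper names `pulse` and `normalize` are made up for the illustration.

```python
import numpy as np

t = np.arange(400)


def pulse(amplitude, rise_time, baseline=50.0, t0=100):
    """Synthetic pulse: flat baseline, linear rise, then a flat top."""
    shape = np.clip((t - t0) / rise_time, 0.0, 1.0)
    return baseline + amplitude * shape


def normalize(wf, n_baseline=50):
    """Subtract the mean of the first samples, then scale to unit amplitude."""
    blsub = wf - wf[:n_baseline].mean()
    return blsub / blsub.max()  # max is a stand-in for an energy estimate


big = normalize(pulse(amplitude=2000.0, rise_time=10))
small = normalize(pulse(amplitude=500.0, rise_time=40))
print(big[-1], small[-1])  # 1.0 1.0
```

After normalization both pulses start at 0 and end at 1, so any remaining difference between them is a difference in shape rather than in size.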

We start by selecting two sub-populations from the high-energy events from above by cutting on A/E:

[13]:
aoe = df["AoE"]
aoe_cut = (aoe < 0.045) & energy_selection
aoe_accept = (aoe > 0.045) & energy_selection

aoe[aoe_accept].hist(bins=200, range=(0, 0.1))
aoe[aoe_cut].hist(bins=200, range=(0, 0.1))
plt.xlabel("A/E [a.u.]");
../_images/notebooks_WaveformBrowser_25_0.png
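As a rough intuition for what the A/E parameter measures, the toy sketch below takes A as the maximum sample-to-sample difference of the waveform (a crude current estimate) and E as the pulse amplitude. This is an assumption made for illustration only. The AoE values in the DSP file come from the processors defined in the DSP configuration, not from this code.

```python
import numpy as np

t = np.arange(400)


def pulse(rise_time, amplitude=1000.0, t0=100):
    """Synthetic baseline-subtracted pulse with a linear rise."""
    shape = np.clip((t - t0) / rise_time, 0.0, 1.0)
    return amplitude * shape


def a_over_e(wf):
    """Toy A/E: maximum current (sample difference) over amplitude."""
    current = np.diff(wf)
    return current.max() / wf.max()


fast = a_over_e(pulse(rise_time=10))
slow = a_over_e(pulse(rise_time=40))
print(round(fast, 3), round(slow, 3))  # 0.1 0.025
```

Two pulses with the same amplitude but different rise times end up with different A/E values in this toy model, so a cut on A/E separates waveforms by shape.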

Now, we create two distinct browsers:

[14]:
browser1 = WaveformBrowser(
    raw_file,
    "geds/raw",
    dsp_config="metadata/dsp-config.json",
    lines="wf_blsub",  # draw baseline subtracted waveform instead of the original
    norm="trapEmax",  # normalize waveforms based on amplitude
    styles={"color": ["red", "orange"]},  # set a color cycle for this
    legend="E = {trapEmax} ADC, A/E = {AoE:~.3f}",  # formatted values to put in the legend
    x_lim=(40 * u.us, 50 * u.us),
    entry_mask=aoe_cut,  # apply the first cut
    n_drawn=2,
)
browser2 = WaveformBrowser(
    raw_file,
    "geds/raw",
    dsp_config="metadata/dsp-config.json",
    lines="wf_blsub",
    norm="trapEmax",
    styles={"color": ["blue", "cyan"]},
    legend="E = {trapEmax} ADC, A/E = {AoE:~.3f}",
    legend_opts={
        "loc": "lower right",
        "bbox_to_anchor": (1, 0),
    },  # set options for drawing the legend
    x_lim=(40 * u.us, 50 * u.us),
    entry_mask=aoe_accept,  # apply the other cut
    n_drawn=2,
)
/home/docs/checkouts/readthedocs.org/user_builds/dspeed/envs/stable/lib/python3.10/site-packages/dspeed/processing_chain.py:1453: RuntimeWarning: invalid value encountered in divide
  self.processor(*self.args, **self.kwargs)

And draw!

[15]:
browser1.draw_next()
browser2.set_figure(browser1)  # use the same figure/axis as the other browser
browser2.draw_next(clear=False);  # Set clear to false to draw on the same axis!
../_images/notebooks_WaveformBrowser_29_0.png

Direct access of waveforms and other quantities

The waveforms, lines, and legend entries are all stored inside the waveform browser. Sometimes you want to access these directly; maybe you want to access the raw data, or to control the lines in a way not enabled by the WaveformBrowser interface. It is possible to access them quickly and easily.

When accessing waveforms in this way, you can also do the same things previously shown, such as applying a data cut and grabbing processed waveforms. For this example, we are going to get waveforms, trap-waveforms and trap energies, after applying an A/E cut.

Let’s start by defining a browser object:

[16]:
browser = WaveformBrowser(
    raw_file,
    "geds/raw",
    dsp_config="metadata/dsp-config.json",
    database={"pz_const": "180*us"},
    lines=["waveform", "wf_trap"],
    legend=["E = {trapEmax}"],
    entry_mask=aoe_accept,
    n_drawn=2,
)

Waveforms and legend values are stored in dictionaries that map each parameter name to a list of stored values:

  • The waveforms are stored as a list of matplotlib Line2D artists

  • Horizontal and vertical lines are also stored as Line2D artists

  • Legend entries are stored as pint Quantities

Now, let’s simply print them:

[17]:
browser.find_next()
waveforms = browser.lines["waveform"]
traps = browser.lines["wf_trap"]
energies = browser.legend_vals["trapEmax"]
for wf, trap, en in zip(waveforms, traps, energies):
    print("Raw waveform:", wf.get_ydata())
    print("Trap-filtered waveform:", trap.get_ydata())
    print("TrapEmax:", en)
    print()
Raw waveform: [11538. 11538. 11608. ... 23720. 23653. 23629.]
Trap-filtered waveform: [-0.14880781 -0.29762885 -0.33446312 ... 29.213812   29.149967
 29.014126  ]
TrapEmax: 15478.66015625 dimensionless

Raw waveform: [16610. 16610. 16578. ... 28694. 28775. 28751.]
Trap-filtered waveform: [ 0.19340937  0.38683593  0.5290797  ... 51.552177   51.308987
 51.091385  ]
TrapEmax: 15687.3740234375 dimensionless


This page has been automatically generated by nbsphinx and can be run as a Jupyter notebook available in the dspeed repository.