Researchers use cluster analysis to identify explosions

Sandia researchers Chris Young (6116) and Dorthe Carr (5736) have applied a concept long used in biology for classifying organisms, cluster analysis, to a new discipline — seismology.

The pair successfully used cluster analysis, a technique for grouping similar entities in such a way that their interrelationships are revealed, to identify explosions from mines in New Mexico and Wyoming. It was part of their research to develop analytic tools needed to monitor the Comprehensive Nuclear Test Ban Treaty (CTBT).

“This is a whole different way to look at seismic events,” Dorthe says. “It’s a new tool for determining where an explosion comes from and if it is nuclear in nature.”

The CTBT, which was signed by President Clinton in 1996 and has been approved by many countries, calls for the monitoring of all small earthquakes, mining and industrial explosions, and other natural and man-made sources of seismic waves to identify potential underground nuclear blasts. This means that the many thousands of naturally occurring and man-made seismic events recorded each year must be detected and identified as being of non-nuclear origin by their location, depth, or other characteristics.

Currently there is no easy way to do this, largely because of the huge volume of explosions and naturally occurring seismic events. Cluster analysis offers a way to narrow the field of events that must be monitored by eliminating explosions from known mines.

Chris came up with the idea of using cluster analysis for analyzing seismic events after helping his wife, a biology doctoral student at the University of New Mexico, develop a cluster analysis computer program for her research.

“I thought, ‘If it’ll work for biology, why can’t we try it for seismic events?’ It just seemed a natural,” Chris says.

One way to determine if an event is from a particular mine is for an analyst to compare known waveforms from explosions at that mine to an unknown waveform. (Waveforms are recordings of ground motion.) This can be done by eye, and for a researcher who knows a region well, events from certain mines can be easily identified visually.

“However, with the large volume of data that can be expected when monitoring the CTBT, we wanted to find a way to automate comparing unknown waveforms to archived events associated with specific mines,” Chris says.

By clustering waveforms that have similar characteristics, it becomes possible to identify explosions from specific mines.
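
As a rough illustration of what an automated waveform comparison might look like, the sketch below scores two traces by their peak normalized cross-correlation. It is not the Sandia researchers' program: the article does not say which similarity measure they used, and the function names, the damped-sinusoid test traces, and the lag window are all assumptions made for this example.

```python
import math

def decaying_wave(freq, n=200, delay=0):
    """Hypothetical stand-in for a seismogram: a damped sinusoid starting at `delay`."""
    trace = []
    for i in range(n):
        t = i - delay
        if t < 0:
            trace.append(0.0)
        else:
            trace.append(math.exp(-0.02 * t) * math.sin(2 * math.pi * freq * t / n))
    return trace

def peak_correlation(a, b, max_lag=5):
    """Largest normalized cross-correlation of two equal-length traces
    over shifts of up to +/- max_lag samples (1.0 means identical shape)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    a0 = [x - mean_a for x in a]
    b0 = [x - mean_b for x in b]
    norm = math.sqrt(sum(x * x for x in a0) * sum(x * x for x in b0))
    if norm == 0.0:
        return 0.0
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over samples where the shifted trace still overlaps.
        s = sum(a0[i] * b0[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, s / norm)
    return best

archived = decaying_wave(5)             # assumed archived blast from a known mine
same_mine = decaying_wave(5, delay=3)   # same shape, arriving three samples later
other_mine = decaying_wave(12)          # different shape, standing in for another mine

print(round(peak_correlation(archived, same_mine), 2))
print(round(peak_correlation(archived, other_mine), 2))
```

In this toy setup the delayed copy scores close to 1 and the differently shaped trace scores much lower, which is the kind of contrast an analyst otherwise judges by eye.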

Chris and Dorthe used waveform data from three mines in Wyoming and four in New Mexico to run the cluster analyses.

The Wyoming data, representing 175 different explosions from three mines, were originally collected in the early 1990s with a seismograph installed at the Pinedale Seismic Research Facility near Boulder, Wyo. The recordings were compared to the mining companies’ records of the same explosions. Two of the mines were located 320 kilometers from the monitoring station, and one was only 175 kilometers from it.

New Mexico Tech researchers collected the New Mexico data from four mines over a four-month period in 1997. This was an era when mining activity was high in the western part of the state and in eastern Arizona. Two of the mines were near Silver City, one was in eastern Arizona, and the fourth was at Mt. Taylor. All were within 260 kilometers of the seismic monitoring station.

While there are many cluster analysis methods, the Sandia researchers selected a hierarchical cluster method because it is well documented, easy to implement, and computationally cheap enough to run multiple times for a given set, and because it produces results that can be readily interpreted for seismic events. Hierarchical clustering methods form “dendrograms,” tree-like structures showing the relationships between the events.

Before performing the official cluster analyses, the Sandia researchers ran test sets, preprocessing the data to bring out specific characteristics in the waveforms and trying different hierarchical cluster analysis methods to determine which was best for their purposes.
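
To make the idea of a hierarchical method concrete, here is a minimal sketch of agglomerative clustering with average linkage, written in plain Python. The article does not say which linkage rule the researchers settled on, so average linkage is an assumption, as are the function names and the six-event distance matrix (imagined here as 1 minus a waveform correlation). The list of merges it returns is the information a dendrogram draws as a tree. In practice one would more likely call a library routine such as SciPy's `scipy.cluster.hierarchy.linkage`.

```python
def average_linkage(dist):
    """Repeatedly merge the two closest clusters.
    Returns the merge history as (members_a, members_b, distance) tuples."""
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        keys = sorted(clusters)
        best = None
        for x in range(len(keys)):
            for y in range(x + 1, len(keys)):
                a, b = clusters[keys[x]], clusters[keys[y]]
                # Average distance over every cross-cluster pair of events.
                d = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, keys[x], keys[y])
        d, ka, kb = best
        merges.append((sorted(clusters[ka]), sorted(clusters[kb]), d))
        clusters[ka] = clusters[ka] + clusters.pop(kb)
    return merges

def cut_tree(merges, n_events, n_clusters):
    """Replay the merges until only n_clusters groups remain."""
    groups = [[i] for i in range(n_events)]
    for a, b, _ in merges:
        if len(groups) <= n_clusters:
            break
        groups = [g for g in groups if g != a and g != b] + [sorted(a + b)]
    return sorted(groups)

# Hypothetical distances between six events: 0-2 resemble each other, as do 3-5.
dist = [
    [0.00, 0.10, 0.15, 0.80, 0.85, 0.90],
    [0.10, 0.00, 0.12, 0.82, 0.88, 0.86],
    [0.15, 0.12, 0.00, 0.84, 0.83, 0.87],
    [0.80, 0.82, 0.84, 0.00, 0.20, 0.25],
    [0.85, 0.88, 0.83, 0.20, 0.00, 0.18],
    [0.90, 0.86, 0.87, 0.25, 0.18, 0.00],
]

merges = average_linkage(dist)
groups = cut_tree(merges, n_events=len(dist), n_clusters=2)
print(groups)  # [[0, 1, 2], [3, 4, 5]]
```

Cutting the tree at two groups recovers the two made-up "mines" in this example; cutting at a different level would give a coarser or finer grouping from the same merge history.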

“The New Mexico data did exactly what we expected,” Dorthe says. “The cluster analysis separated data from all four mines, clearly showing which explosions occurred at which mine.”

The information from the Wyoming mines wasn’t so clear. It was difficult to distinguish the data from the two mines that were located the same distance, 320 kilometers, from the monitoring station. Also, the noise level (background noise that disrupts signals) was considerably higher in Wyoming than in New Mexico. When the researchers added background noise to the New Mexico data, the results looked similar to the Wyoming results. In other words, Chris says, “clustering deteriorates as the noise level increases.”
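
The effect of noise can be illustrated with a small synthetic experiment. The sketch below is not a reconstruction of the researchers' test: the two template waveforms, the noise levels, and the "separation" score (average same-mine correlation minus average different-mine correlation) are all assumptions chosen for the example. When the score shrinks toward zero, events from different mines look about as alike as events from the same mine, and any clustering built on those correlations has little to work with.

```python
import math
import random

def pearson(a, b):
    """Zero-lag correlation coefficient of two equal-length traces."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def template(freq, n=200):
    """Hypothetical noise-free waveform for one mine: a damped sinusoid."""
    return [math.exp(-0.02 * i) * math.sin(2 * math.pi * freq * i / n) for i in range(n)]

def separation(noise_sigma, events_per_mine=5, seed=0):
    """Mean same-mine correlation minus mean different-mine correlation
    for two simulated mines with Gaussian noise added to every event."""
    rng = random.Random(seed)
    events = []
    for mine, freq in enumerate((5, 12)):
        base = template(freq)
        for _ in range(events_per_mine):
            events.append((mine, [x + rng.gauss(0.0, noise_sigma) for x in base]))
    within, between = [], []
    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            c = pearson(events[i][1], events[j][1])
            (within if events[i][0] == events[j][0] else between).append(c)
    return sum(within) / len(within) - sum(between) / len(between)

quiet = separation(noise_sigma=0.05)  # clean recordings
noisy = separation(noise_sigma=2.0)   # noise much larger than the signal
print(round(quiet, 2), round(noisy, 2))
```

With these made-up settings the quiet case keeps the two mines well apart while the noisy case does not, consistent with the observation that good signals are needed.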

Dorthe says that overall their research indicates that “cluster analysis can be used to identify events from specific mines, but good signals are needed.”

Now that they have concluded this, the researchers are ready to move on to their next step — developing a process of identifying explosions from known mines in real time.

“Once we do this we will be well on the way to eliminating mining explosions from all the events that are required to be monitored by the Comprehensive Nuclear Test Ban Treaty,” Dorthe says. “That will make the task of figuring out which events are nuclear much easier.”

Sandia Lab News