Sound analysis corner

boonman&schnitzler   An explanation of slope parameters used in the batecho table

This page will hopefully become an international platform for exchanging and discussing software for analysing echolocation calls. To kick off, I have made all of my Matlab echolocation analysis scripts freely available (see below). I have recently tested most of the parameters they calculate, but it is still possible that I did something wrong, so if you are unsure, tell me about your doubts or problems.

The scripts let you load a series of calls and step through them by tapping the spacebar. For each call you see the frequency-time course being fitted with an exponential function, and at the end the many parameters measured from each call in the sequence are stored in a file that can be opened in Excel. You are forced to click your way through and look at every fit, because the fitting doesn't always work that well. This way you know where things went wrong and can remove those calls from your dataset afterwards.

Some of the parameters the programme extracts, such as "range-Doppler coupling" or "minimum resolution limit", may have virtually no discriminatory power. I simply never had the time to check the discriminative power of all parameters. In any case, you may want to use the parameters for something other than identifying the species. So, to cut this short, here are the scripts:

Arjan's Matlab scripts   Arjan's old Matlab scripts

If you don't have Matlab, you could try running the scripts in GNU Octave (free software).

I am making some changes to the scripts and will post the new ones soon. What I propose to users is that you can always ask me to go on Skype when you run the scripts and I can explain what it all means.
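For readers who want to see what "fitting the frequency-time course with an exponential function" can look like, here is a minimal Python sketch. It is not taken from the Matlab scripts: the model f(t) = a + b*exp(-t/tau), the function name `fit_exponential_sweep`, the grid of candidate tau values and the synthetic call are all assumptions made for illustration.

```python
import math

def fit_exponential_sweep(times, freqs, taus):
    """Fit f(t) = a + b * exp(-t / tau) by scanning candidate tau values.

    For a fixed tau the model is linear in a and b, so those two are
    solved in closed form by ordinary least squares; the tau with the
    smallest squared error wins. Returns (a, b, tau, rms_error).
    """
    n = len(times)
    best = None
    for tau in taus:
        x = [math.exp(-t / tau) for t in times]
        sx = sum(x)
        sxx = sum(v * v for v in x)
        sy = sum(freqs)
        sxy = sum(v * f for v, f in zip(x, freqs))
        det = n * sxx - sx * sx
        if abs(det) < 1e-12:
            continue  # degenerate design for this tau, skip it
        b = (n * sxy - sx * sy) / det
        a = (sy - b * sx) / n
        sse = sum((a + b * v - f) ** 2 for v, f in zip(x, freqs))
        if best is None or sse < best[3]:
            best = (a, b, tau, sse)
    a, b, tau, sse = best
    return a, b, tau, math.sqrt(sse / n)

# Synthetic downward FM sweep (invented data): starts at 80 kHz and
# decays towards 30 kHz, sampled every 0.25 ms over 5 ms.
times_ms = [i * 0.25 for i in range(21)]
freqs_khz = [30.0 + 50.0 * math.exp(-t / 1.5) for t in times_ms]
candidate_taus = [0.1 * k for k in range(1, 101)]  # 0.1 ms to 10 ms

a, b, tau, rms = fit_exponential_sweep(times_ms, freqs_khz, candidate_taus)
print(f"asymptote={a:.1f} kHz, span={b:.1f} kHz, tau={tau:.1f} ms, rms={rms:.3f} kHz")
```

The returned RMS error is one possible way to flag calls where the fit went wrong, in the same spirit as looking at every fit by eye.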

-Oh now I get what they mean! This article explains all the basics of sound analysis using many clear graphs, life will be simpler after reading this! click here.

-Do you want to read an explanation that is more batsound specific? Here is a very good manual for analysing bat sounds, available in French and in Dutch/Flemish, written by Sven Verkem and adapted by Ben van der Wijden and Pierrette Nyssen. The French version is far more up to date than the Dutch one though. Sven will try to find the time to get it done. An English translation of the French version would be very welcome indeed, so if you are willing to do it, please make yourself known to one of us.

-Now that you've become a professional, here is some hot stuff about analysis techniques of bat calls. Finally a paper where the authors made a serious and very successful effort to explain each equation. Click here.
Free Matlab script used in the above paper can be downloaded here.

-Here are the professionals: A paper on European species ID with many sound parameters of European bats, click here.
Classification of echolocation calls from 14 species of bat by support vector machines and ensembles of neural networks

The following persons have commented on this manuscript:

Yossi Yovel of the Weizmann Institute in Rehovot, Israel
Arjan Boonman, the webmaster of this site
Ulrich Marckmann, Ecoobs, one of the inventors of the automatic ID used in the Batcorder software

Please read below what they thought about this paper:

Yossi Yovel:

Generally speaking, I like this work. It shows that automatic segmentation and classification of bat calls is possible and helps to standardize measurements of echolocation call parameters. Here are a few comments I had:

Loosening the SVM classification criterion:
I think that the authors used a very strict criterion for correct classification by the SVMs. According to their statement: "A support vector machine was trained for every target case in the dataset (each genus, or species) to classify an instance as either belonging to that case, or not. All classifiers were then combined and categorised each echolocation call as either belonging to a specific class (genus or species), or not. A call was classified correctly only if a single support vector machine classified it, and that classification was correct".

If I understood them correctly, this means that if, for example, a nattereri call was classified as nattereri but also as another species, they would regard this as a wrong decision, even though none of the other 10 classifiers assigned it to any other species. This could explain the lower SVM performance. They could loosen this criterion by using a tree architecture of classifiers in which one SVM first classifies the call into a subgroup, for instance "FM calls", and then only the classifiers that are experts on FM species are applied. This would enable each SVM to learn a more specific (easier) classification rule, because in the current setting it must learn that the call belongs to class A and not B when B contains a huge variety of call designs.
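The difference between the strict rule quoted above and a tree-style rule can be illustrated with a small Python sketch. Everything in it is hypothetical: the decision scores, the grouping of species into call types, and the assumption that a first-stage gate has already assigned the call to the "FM" group. It is not the authors' implementation.

```python
def strict_single_vote(scores, true_label):
    """Strict rule: correct only if exactly one one-vs-rest classifier
    fires (score > 0) and that classifier is the right one."""
    fired = [label for label, score in scores.items() if score > 0]
    return fired == [true_label]

def tree_decision(scores, group_of, gate_group):
    """Tree rule: a first-stage gate has picked a call-type group, so only
    the classifiers for species in that group compete; highest score wins."""
    candidates = [label for label in scores if group_of[label] == gate_group]
    return max(candidates, key=lambda label: scores[label])

# Hypothetical one-vs-rest decision scores for one nattereri call.
scores = {
    "M. nattereri": 0.9,
    "M. daubentonii": 0.2,   # a second classifier also fires
    "P. pipistrellus": -0.5,
    "R. ferrumequinum": -1.2,
}
# Illustrative grouping by call design (an assumption for this example).
group_of = {
    "M. nattereri": "FM",
    "M. daubentonii": "FM",
    "P. pipistrellus": "FM-QCF",
    "R. ferrumequinum": "CF",
}

strict_ok = strict_single_vote(scores, "M. nattereri")
tree_label = tree_decision(scores, group_of, "FM")
print("strict rule counts the call as correct:", strict_ok)
print("tree rule assigns the call to:", tree_label)
```

Here two classifiers fire, so the strict rule counts the call as an error even though the nattereri classifier is by far the most confident one.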

Using the raw data (Feature selection):
Another possible way to improve results would be to use the raw data for classification rather than extracting specific parameters. Strong algorithms such as SVMs can handle very high-dimensional data sets. Therefore the pre-processing step of calculating 12 measurements on the calls actually reduces the available information and narrows the range of cues the algorithms can find.
Using a spectrogram and a spectrum representation for each call would provide all information used by the authors and more, i.e., instead of extracting the quartile energy they would have the entire energy distribution. Combining these cues is technically easy: simply concatenate all data into one vector. The only parameter that would not be represented by this vector (the call type) could also be concatenated.
I guess that the main reason why the authors chose to reduce the dimensionality of the data and settle for 12 features is the small sample size they had, especially for certain species. Such a small data set probably does not represent the real variability in the world. From my experience, I would expect frequency-dependent attenuation to introduce a lot of noise that would reduce classification performance (the authors used signals of bats recorded from the hand).
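The concatenation described above could look roughly like this in Python. The function name `build_feature_vector`, the tiny 3 x 4 spectrogram and the list of call types are invented for illustration; in practice every call would first have to be brought onto the same time-frequency grid so that all vectors have the same length.

```python
def build_feature_vector(spectrogram, spectrum, call_type, call_types):
    """Concatenate a spectrogram (list of rows), a spectrum and a
    one-hot encoded call type into a single flat feature vector."""
    flat_spectrogram = [value for row in spectrogram for value in row]
    one_hot = [1.0 if call_type == c else 0.0 for c in call_types]
    return flat_spectrogram + list(spectrum) + one_hot

# Toy example (invented numbers): 3 frequency bins x 4 time frames.
spectrogram = [
    [0.1, 0.4, 0.2, 0.0],
    [0.3, 0.9, 0.7, 0.1],
    [0.0, 0.2, 0.5, 0.3],
]
spectrum = [0.7, 2.0, 1.0]             # energy per frequency bin
call_types = ["FM", "FM-QCF", "CF"]    # hypothetical call-type labels

vector = build_feature_vector(spectrogram, spectrum, "FM", call_types)
print(len(vector), vector[-3:])
```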

Comparison of different algorithms:
It is hard to say anything about the difference between the algorithms used by the authors, since they did not do any analysis of the importance of the parameters (except for the DFA). This is hard to do because, except for the DFA, the algorithms are non-linear, but it could have been done indirectly, for instance by omitting certain parameters and testing the effect on classification. Generally speaking, there is no reason to assume that SVMs will perform better than DFA, except for the fact that the SVMs used by the authors were non-linear and such algorithms tend to be stronger than linear ones. Another problem of the small data set (mentioned by the authors) is over-fitting, and since there was no analysis of the relevance of the different parameters it is hard to assess whether this occurred.
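The indirect test suggested above (omit a parameter, retrain, and see how much classification suffers) can be sketched in a few lines of Python. A simple nearest-centroid classifier stands in for the SVMs and neural networks of the paper, and the feature names and all numbers are invented; this only illustrates the procedure.

```python
def nearest_centroid_accuracy(train, test, keep):
    """Accuracy of a nearest-centroid classifier that only sees the
    feature indices listed in `keep`."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(keep))
        for j, idx in enumerate(keep):
            acc[j] += features[idx]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(features):
        return min(centroids, key=lambda lab: sum(
            (features[idx] - c) ** 2 for idx, c in zip(keep, centroids[lab])))

    hits = sum(predict(features) == label for features, label in test)
    return hits / len(test)

def ablation(train, test, names):
    """Drop each parameter in turn and report the loss in accuracy."""
    all_idx = list(range(len(names)))
    baseline = nearest_centroid_accuracy(train, test, all_idx)
    drops = {}
    for i, name in enumerate(names):
        keep = [j for j in all_idx if j != i]
        drops[name] = baseline - nearest_centroid_accuracy(train, test, keep)
    return baseline, drops

# Invented toy data: (end frequency in kHz, duration in ms, noise value).
names = ["end_freq_khz", "duration_ms", "noise"]
train = [
    ((25.0, 5.0, 0.3), "A"), ((27.0, 6.0, 0.7), "A"), ((26.0, 5.5, 0.5), "A"),
    ((45.0, 5.2, 0.6), "B"), ((47.0, 5.8, 0.4), "B"), ((46.0, 5.8, 0.2), "B"),
]
test = [
    ((26.5, 5.1, 0.2), "A"), ((28.0, 5.9, 0.7), "A"),
    ((44.0, 5.4, 0.9), "B"), ((46.5, 5.6, 0.1), "B"),
]

baseline, drops = ablation(train, test, names)
print("baseline accuracy:", baseline)
for name, drop in drops.items():
    print(f"without {name}: accuracy drops by {drop:.2f}")
```

A parameter whose removal barely changes the accuracy carries little discriminative information for this classifier and data set; with a small database the ranking itself can of course be unstable.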

I think that a good way to assess how well these classifiers generalise would be to put them online and enable people to test their own recordings with them.

Some more comments on the parameters used in the paper by Arjan Boonman:
Center frequency is an interesting measure that is strongly correlated with the 'curvature' of calls. This curvature, however, could also be assessed more precisely with some kind of fitting procedure. The authors use a -1.94 dB bandwidth (80%). Usually a -20 dB bandwidth corresponds well to what users typically measure in programmes such as Batsound. -1.94 dB is obviously much stricter, so differences between species are not always reflected so strongly in this measure. However, the measure is independent of whether a bat attained its bandwidth by using harmonics or with a single sweep, so it isn't always related to Fend-Fstart. I don't like the maximum energy parameter (= peak frequency), which is discussed in great detail elsewhere on this website. I also think that giving just the general sweep rate of the entire call is a bit crude and may throw away a lot of essential information. Then, to my surprise, the authors described the amplitude pattern of the calls in great detail: four quartiles! Amplitude modulation is strongly affected by the recording situation, so is it really that important? Secondly, isn't there a more efficient way (one parameter) to describe the AM pattern?
To my taste, what is still missing are more detailed measures of the frequency-time structure of calls, points (in frequency) where modulation rate is lowest or highest and so on. I am afraid that the Anabat-community is ahead of us in this respect and we should really pay some closer attention to the methods they use to measure calls.
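To make the two bandwidth thresholds mentioned above concrete: read as an amplitude ratio, 80% corresponds to 20*log10(0.8), which is about -1.94 dB, whereas -20 dB corresponds to an amplitude ratio of 0.1. The Python sketch below measures both bandwidths on an invented amplitude spectrum. The function names and the spectrum values are assumptions for illustration, and the measurement is deliberately crude (it only has the resolution of the frequency bins).

```python
import math

def db_from_amplitude_ratio(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)

def bandwidth_below_peak(freqs_khz, amplitudes, drop_db):
    """Span between the lowest and highest frequency bins whose amplitude
    is within `drop_db` decibels of the spectral peak."""
    peak = max(amplitudes)
    threshold = peak * 10 ** (-drop_db / 20.0)
    above = [f for f, a in zip(freqs_khz, amplitudes) if a >= threshold]
    return max(above) - min(above)

# Invented amplitude spectrum of a single call, 5 kHz bins.
freqs_khz = [20, 25, 30, 35, 40, 45, 50, 55, 60]
amplitudes = [0.05, 0.2, 0.6, 0.85, 1.0, 0.9, 0.5, 0.15, 0.02]

ratio_db = db_from_amplitude_ratio(0.8)
narrow = bandwidth_below_peak(freqs_khz, amplitudes, 1.94)
wide = bandwidth_below_peak(freqs_khz, amplitudes, 20.0)
print(f"80% amplitude = {ratio_db:.2f} dB")
print(f"-1.94 dB bandwidth: {narrow} kHz, -20 dB bandwidth: {wide} kHz")
```

On this toy spectrum the stricter threshold keeps only the bins around the peak, which is why it tends to compress differences between calls.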

Here are more comments on the paper by Ulrich Marckmann:

Good paper. It shows the value of acoustic species identification as a tool for bat surveys. Acoustic species ID is no longer like reading tea leaves.

Some comments:

confusion rates:
I'm not sure whether the confusion rates in table 5 refer to the training or the test data. Nevertheless, the classification rates are very good, maybe too high for some species. I haven't seen the call data, but for example 100% correct classification of Myotis bechsteinii overrates the separability (at least for central Europe). There is always the danger of overfitting. Therefore the confusion rates for the training and test data should be plotted together. A decline in the classification rate from training to test data indicates overfitting. The susceptibility to overfitting depends on the technique, the number of parameters and the number of calls in the training data. This leads to the next point.
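The comparison of training and test rates suggested above is easy to automate. The Python sketch below computes the per-species classification rate from two confusion matrices and reports the decline. All counts are invented and have nothing to do with the paper's table 5; the function names are hypothetical.

```python
def classification_rates(confusion):
    """Per-species fraction of correctly classified calls, given a nested
    dict confusion[true_species][predicted_species] = number of calls."""
    return {sp: row.get(sp, 0) / sum(row.values()) for sp, row in confusion.items()}

def overfitting_gaps(train_confusion, test_confusion):
    """Training rate minus test rate per species; a large positive gap
    suggests that the classifier has been overfitted."""
    train = classification_rates(train_confusion)
    test = classification_rates(test_confusion)
    return {sp: train[sp] - test[sp] for sp in train}

# Invented counts for two species.
train_confusion = {
    "M. bechsteinii": {"M. bechsteinii": 20, "M. nattereri": 0},
    "M. nattereri": {"M. nattereri": 19, "M. bechsteinii": 1},
}
test_confusion = {
    "M. bechsteinii": {"M. bechsteinii": 6, "M. nattereri": 4},
    "M. nattereri": {"M. nattereri": 9, "M. bechsteinii": 1},
}

gaps = overfitting_gaps(train_confusion, test_confusion)
for species, gap in gaps.items():
    print(f"{species}: rate declines by {gap:.2f} from training to test data")
```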

Call database:
As Yossi Yovel points out, the database of 714 calls for testing and training seems too small. There is a big chance that stochastic differences lead to pseudo-separability. In addition, the number is too low to cover the variability in calls within each species. I think there is a need to standardize the selection of training calls for classifier tests. In my experience, for each of the more difficult species (like Myotis) you need at least 500 training calls to get stable results in practice. If one considers geographical variability, the minimum number of calls might be much higher still.
Little is said about the selection of training calls. The training and test data should not only cover the variability of the species but should also do so in a uniformly weighted manner. Merely weighting call types differently can change the confusion rate between some pairs of species with overlapping call repertoires by circa 50%.
The small database may influence the ranking of parameter importance too. Unreliable parameters in particular can receive a high ranking: because they are erratic, they may be used to detect differences that are not specific but stochastic.
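One possible reading of "uniformly weighted" is that every call type within a species should contribute equally to training, however many recordings of it happen to be in the database. The Python sketch below computes per-call weights under that assumption. The function name and the call-type labels are hypothetical, and other weighting schemes are equally conceivable.

```python
from collections import Counter

def uniform_call_type_weights(call_types):
    """Weight each call so that every call type carries the same total
    weight and all weights together sum to 1."""
    counts = Counter(call_types)
    n_types = len(counts)
    return [1.0 / (n_types * counts[t]) for t in call_types]

# Invented sample: one call type is heavily over-represented.
call_types = ["steep FM"] * 8 + ["shallow FM"] * 2
weights = uniform_call_type_weights(call_types)
print(weights[0], weights[-1], sum(weights))
```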

Parameters and parameter extraction:
The authors point out that deterministic parameter extraction algorithms are crucial for the classification purpose. I cannot emphasise enough just how important this is. Of course parameters extracted with different algorithms will show big differences.
The authors chose not to use interpulse interval as a parameter. I highly appreciate this. Especially if a recording contains several individuals, it is often not possible (either subjectively or automatically) to extract this parameter correctly.