An Ecoacoustics Generalized Recognition and Event Tester.
Egret is a general-purpose audio recognition benchmarking tool. Its main job is to compare the performance of acoustic event recognizers that detect faunal vocalizations in environmental audio files. It can:
- test hundreds of test audio files in parallel
- source test files from your computer, 🚧the internet🚧, 🚧an acoustic workbench🚧, or 🚧other sources🚧
- test each file against an array of analysis tools
- process acoustic event output from tools
- 🚧produce reports and graphs on recognition performance🚧
- compare performance between tools
- 🚧show recognizer performance over time🚧
- 🚧import existing test and training sets from your own CSV files so you don't have to rewrite your datasets!🚧
- 🚧also supports the AviaNZ result format🚧
- more formats coming...
Egret doesn't analyze audio itself - that is the job of a tool.
Egret is thus designed to be used with different tools. It comes with out-of-the-box support for some tools, like AnalysisPrograms.exe, but it can easily be adapted to run your own tool or recognizer.
Egret is a very early prototype - an alpha. Expect many incomplete features, bugs, and frequent breaking changes in features and scope. No warranty, express or implied, is given.
Where possible, features that are not yet implemented are marked with a construction sign emoji (🚧).
🚧TODO🚧
Egret is a command line tool. You interact with it through a terminal like:
- Windows Terminal (required) on Windows
- the Terminal app on macOS
- the Terminal app on Linux
Once you've opened your terminal you can use one of the following Egret commands.
Egret has two main commands:
- test: runs all of the tests once and reports its findings
- 🚧 watch: watches all your test files, configs, and tools and continually reports which tests pass or fail 🚧
The watch command is useful for interactive training or testing of a new recognizer.
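For example, both commands are run against a configuration file. This is a minimal sketch; the argument form for watch is our assumption, since that command is still under construction:

```shell
# run every test once and report the results
> egret test my-configuration-file.yml

# 🚧 keep watching files, configs, and tools, re-running tests on change
# (assumed to take the same configuration-file argument as test)
> egret watch my-configuration-file.yml
```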
You can see detailed information by using the --help option. For example:
- Use egret --help to see all the commands available
- Use egret test --help to see the different options you can use to run the test command
Although Egret is a command line tool, most of its configuration is done in its config files.
90% of the commands you will run will look like this:
> egret test my-configuration-file.yml
There are several example config files in this repository and we'll explain the various options in the rest of this document. The configuration files are written in a configuration language called YAML.
If you need an introduction to YAML please see this article: https://sweetohm.net/article/introduction-yaml.en.html.
We highly recommend using Visual Studio Code to edit your YAML config files. It is free and comes with built-in syntax highlighting for YAML files.
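If YAML is new to you, the essentials are: nesting is expressed by indentation, key: value pairs form mappings, lines starting with - form lists, and # starts a comment. A small generic example (not an Egret config, purely to show the syntax):

```yaml
# a comment
name: example          # a key/value pair (a "mapping" entry)
settings:              # nesting is created by indentation
  parallel: true
files:                 # a list: each item starts with "-"
  - first.wav
  - second.wav
```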
The basic purpose of Egret is to run tests. A group of tests is called a suite (traditionally, a set of programs with a uniform design and the ability to share data; in our case, a set of related tests).
An empty test suite named boobook would look like this in a config file:
test_suite:
  boobook:
    # TODO: add tests
That configuration file isn't very useful - it just defines a name without any tests! We'll add some tests in the next section.
You can have multiple suites in a config file:
test_suite:
  boobook:
    # TODO: add tests
  koala:
    # TODO: add tests
For information on other test suite options, see Test Suites.
A test suite contains a set of tests. Let's look at an example:
test_suite:
  boobook:
    tests:
      # TODO: finish tests
      - file: windy.wav
      - file: boobook.wav
      - file: boobook1.wav
      - file: i_do_not_exist_because_someone_gave_me_this_silly_filename.wav
  koala:
    tests:
      # TODO: finish tests
      - file: helicopter.wav
      - file: motorboat.wav
      - file: koala.wav
The above configuration can be read as:
There are two test suites, boobook and koala. For the boobook suite, there are four tests, each using a different file. In the koala suite there are three tests, on three files.
It still isn't complete; we'll finish it in the next section. None of the tests actually test anything yet, but Egret will still run a tool on each test file and ensure it does not crash.
For example, running Egret with this configuration will produce the following output:
C:\Temp > egret test boobook-koala-config.yml
Starting test command
Using configuration: C:\Temp\boobook-koala-config.yml
Found 7 cases, running tests:
Results
✅boobook.0: {4.43 s} for ap with windy.wav
✅boobook.1: {3.56 s} for ap with boobook.wav
✅boobook.2: {3.70 s} for ap with boobook1.wav
❌boobook.3: {0.00 s} for ap with i_do_not_exist_because_someone_gave_me_this_silly_filename.wav
- Could not find source file: C:\Temp\i_do_not_exist_because_someone_gave_me_this_silly_filename.wav
✅koala.0: {6.12 s} for ap with helicopter.wav
✅koala.1: {5.33 s} for ap with motorboat.wav
✅koala.2: {9.28 s} for ap with koala.wav
Finished. Final results:
Successes: 6
Failures: 1
Result: 85.71%
This works as expected. The file named i_do_not_exist_because_someone_gave_me_this_silly_filename.wav
does not actually exist and Egret reported this as an error.
Each test can have many expectations. That is, we expect that using a tool will produce some results and we expect those results to have certain properties.
This is the really important bit.
We would expect a Koala recognizer to produce (true positive) Koala recognition events.
With expectations we tell Egret what to expect for each test. Let's finally finish our config:
test_suite:
  boobook:
    tests:
      - file: windy.wav
        expect:
          - segment_with: no_events
      - file: boobook.wav
        expect:
          - label: boobook
            bounds: [ 1, 500, 2, 600 ]
          - label: boobook
            bounds: [ 33, 500, 37, 600 ]
      - file: boobook1.wav
        expect:
          - segment_with: event_count
            count: 3
            label: boobook
      # NOTE: removed test for the i_do_not_exist_because_someone_gave_me_this_silly_filename.wav file
      # because it clearly doesn't exist... it was right there in the name!
  koala:
    tests:
      - file: helicopter.wav
        expect:
          - segment_with: no_events
      - file: motorboat.wav
        expect:
          - segment_with: no_events
      - file: koala.wav
        expect:
          - label: koala
            bounds: [ 4.75, 300, 19.5, 1200 ]
          - segment_with: no_extra_events
The above config can be read as:
- for the boobook suite, run three tests
  - in the file windy.wav expect no acoustic events to be detected
  - in the file boobook.wav expect two acoustic events
    - the first should have the label boobook, start at 1 second, be 1 second long, and have a bandwidth of 100 hertz
    - the second should have the label boobook, start at 33 seconds, be 4 seconds long, and have a bandwidth of 100 hertz
  - in the file boobook1.wav we expect three events, all labelled with boobook, but we don't care where they are
- for the koala suite, run three tests
  - in the file helicopter.wav expect no acoustic events to be detected
  - in the file motorboat.wav expect no acoustic events to be detected
  - in the file koala.wav we expect one event
    - it should have the label koala, start at 4.75 seconds, be 14.75 seconds long, and have a bandwidth of 900 hertz
    - and: we check there are no other events found
There are several interesting features demonstrated in this example, like:
- checking for true positives
- ensuring exhaustive checks are done to help assess false positives and false negatives
- checking the duration and bandwidth of the detected events to ensure a correct match is identified
If we run Egret with this configuration we get output that looks like:
Starting test command
Using configuration: C:\Work\GitHub\egret\src\Egret.Cli\config.yml
Found 6 cases, running tests:
Results
✅ boobook.0: {5.32 s} for ap with windy.wav
Segment tests:
- ✅ 0: no events {match: true}
✅ boobook.1: {4.98 s} for ap with boobook.wav
Events:
- ✅ 0: {label: boobook, bounds: [ 1, 500, 2, 600 ], match: true}
- ✅ 1: {label: boobook, bounds: [ 33, 500, 37, 600 ], match: true}
✅ boobook.1: {6.34 s} for ap with boobook1.wav
Segment tests:
- ✅ 0: Segment has 3 events {count: 3, match: true, label: boobook}
❌ koala.1: {4.27 s} for ap with helicopter.wav
Segment tests:
- ❌ 0: no events {match: true}
- ❌ Event count: Expected 0 results but 3 were found
❌ koala.2: {6.67 s} for ap with motorboat.wav
Segment tests:
- ❌ 0: no events {match: true}
- ❌ Event count: Expected 0 results but 1 were found
❌ koala.3: {5.03 s} for ap with koala.wav
Events:
- ✅ 1: {label: KOALA, bounds: [ 4.75, 300, 19.5, 1200 ], match: true}
Segment tests:
- ❌ 0: no extra events {match: true}
- ❌ Event count: Expected 0 extra results but 1 other were found
Finished. Final results:
Successes: 3
Failures: 3
Result: 50.00%
Clearly our Koala recognizer needs some work!
Crucially though, after we (attempt to) improve our Koala recognizer we can
check how well it performs by simply running the egret test
command again.
🚧We can even compare a new result to old results!🚧
Please see our docs
Please see our samples
Please see:
Unprefixed SI base units.
This means:
- durations will always be expressed/stored/read in seconds
  - never milli, micro, mega, decimal hours, or decimal days (looking at you Excel!)
- frequency will always be expressed in hertz
  - never kilohertz, or octaves
- dates and times will always be expressed as ISO8601 encoded strings
  - examples: 2020-11-12T15:43:38+00:00, 2020-11-12, 15:43:38
  - never: 12-hour time, never any other date format (looking at you Americans)
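For illustration, a result record that follows these conventions might look like the sketch below; the field names are hypothetical and are not Egret's actual schema:

```yaml
# hypothetical field names, shown only to illustrate the unit conventions
start_seconds: 4.75                         # seconds, never milliseconds
duration_seconds: 14.75                     # seconds
low_hertz: 300                              # hertz, never kilohertz
high_hertz: 1200
recorded_date: 2020-11-12T15:43:38+00:00    # ISO8601
```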
In the rare case a unit or value must be reported or used that does not adhere to the above rules, the column/property name using that value will have the full unit encoded in the name.
Example:
# good
lowKilohertz: 1.0
# bad
low: 1.0
See expectations in our docs. Currently you can:
- check coordinates based on bounding box, centroid, or temporal span
- check event duration, bandwidth, or label
- check meta features like event count, no events, or no extra events
(yeah I realize I am asking this for you)
Great question! The real world is messy. Generally, everywhere you see a number in an Egret config file you can replace it with an Egret expression. Expressions allow result values to roughly match each other within some given range or tolerance.
Expressions can represent strict equality, inequality, relations, tolerances, thresholds, approximations, and intervals.
See values for a lot more detail.
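As a purely illustrative sketch (the tolerance operator and its spelling are our assumptions; see the values documentation for the real expression syntax), an expectation that tolerates small errors in the detected bounds might look something like:

```yaml
# hypothetical expression syntax: accept bounds within ±0.5 s and ±50 Hz
- label: koala
  bounds: [ "4.75 ± 0.5", "300 ± 50", "19.5 ± 0.5", "1200 ± 50" ]
```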
Please see results.
Hopefully! We go to great lengths to try and read your data.
Please see munging for more detail.
It depends. Please see imports.
🚧 We want to add support for importing (reusing) labelled datasets from:
- other Egret config files (useful for reusing noise/anti-matching datasets)
- CSV files
- AviaNZ label files
- Audacity annotations
- Raven labels
But all these features take work. Which would you like? 🚧
If your analysis tool of choice can:
- analyze a single segment of audio
- do so via a shell command
- return results in a file
then probably yes!
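In other words, if your tool can be driven roughly like the hypothetical command below (the tool name and flags are invented for illustration), it is probably a good candidate:

```shell
# hypothetical recognizer: analyze one audio segment, write detected events to a file
> my-recognizer --input koala.wav --output results.csv
```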
If it doesn't work then we're keen to address that.
Please see tools for guidance in setting up a new tool.
It can. There are two steps:
- ensure you can run your analysis via the command line (see tools for guidance)
- ensure you return results that approximately adhere to our standards (see results for guidance)
- (this is now an index off by one joke)
Yes? No? Egret doesn't care.
Egret works with tools that produce results. Egret does not care how you produce your result, only that it was produced and passes the tests defined in the configuration file.
You're free to use whatever method of analysis that you want.
Because I (@atruskie) am fast and productive with it, especially when I have to produce long-lived, real-world products.
Apache v2.
Egret logo image credit: David Clode