
Releases: desh2608/spyder

UEM and collar

18 Mar 15:13
827de9e

New Features

UEM files

import spyder

# reference (ground truth)
ref = [("A", 0.0, 2.0), # (speaker, start, end)
       ("B", 1.5, 3.5),
       ("A", 4.0, 5.1)]

# hypothesis (diarization result from your algorithm)
hyp = [("1", 0.0, 0.8),
       ("2", 0.6, 2.3),
       ("3", 2.1, 3.9),
       ("1", 3.8, 5.2)]

uem = [(0.5, 5.0)]

# compute DER on full recording
print(spyder.DER(ref, hyp))
# DERMetrics(duration=5.10,miss=9.80%,falarm=21.57%,conf=25.49%,der=56.86%)

# compute DER using UEM segments
print(spyder.DER(ref, hyp, uem=uem))
# DERMetrics(duration=4.50,miss=11.11%,falarm=22.22%,conf=26.67%,der=60.00%)

From the CLI, UEM files can be passed using the -u or --uem option.
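
To make the effect of a UEM concrete, the sketch below clips the reference turns from the example to the UEM region and sums the remaining speaker time. It is plain Python and is not spyder's implementation; `clip_to_uem` and `speaker_time` are hypothetical helper names used only for illustration. The result agrees with the `duration=4.50` reported above.

```python
def clip_to_uem(segments, uem):
    """Keep only the parts of each (speaker, start, end) turn inside a UEM region."""
    clipped = []
    for speaker, start, end in segments:
        for uem_start, uem_end in uem:
            lo, hi = max(start, uem_start), min(end, uem_end)
            if lo < hi:  # the turn overlaps this UEM region
                clipped.append((speaker, lo, hi))
    return clipped


def speaker_time(segments):
    """Total speaker time, counting overlapped speech once per speaker."""
    return sum(end - start for _, start, end in segments)


ref = [("A", 0.0, 2.0), ("B", 1.5, 3.5), ("A", 4.0, 5.1)]
uem = [(0.5, 5.0)]

scored_ref = clip_to_uem(ref, uem)
print(scored_ref)
print(f"{speaker_time(scored_ref):.2f}")  # 4.50
```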

Collar

# compute DER using collar
print(spyder.DER(ref, hyp, collar=0.2))
# DERMetrics(duration=3.10,miss=3.23%,falarm=12.90%,conf=19.35%,der=35.48%)

From the CLI, use -c or --collar to score with a collar.
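
As a rough illustration of what a collar does, the sketch below assumes the md-eval-style convention in which a no-score window of ±collar is placed around every reference turn boundary. That convention is an assumption, and `scored_time` is a hypothetical helper, not part of spyder. Under this assumption the remaining reference speaker time matches the `duration=3.10` shown above.

```python
def scored_time(ref, collar):
    """Reference speaker time left after removing +/- collar around each boundary."""
    # One no-score zone of width 2 * collar centred on every turn start and end.
    zones = sorted((t - collar, t + collar) for _, s, e in ref for t in (s, e))
    total = 0.0
    for _, start, end in ref:
        cursor = start
        for z_start, z_end in zones:
            if z_end <= cursor or z_start >= end:
                continue  # zone does not touch the remaining part of this turn
            if z_start > cursor:
                total += z_start - cursor  # scored stretch before the zone
            cursor = max(cursor, z_end)
        if cursor < end:
            total += end - cursor  # scored stretch after the last zone
    return total


ref = [("A", 0.0, 2.0), ("B", 1.5, 3.5), ("A", 4.0, 5.1)]
print(f"{scored_time(ref, 0.2):.2f}")  # 3.10
```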

Speaker mapping

The returned DER now also includes reference and hypothesis speaker maps.

# get speaker mapping between reference and hypothesis
metrics = spyder.DER(ref, hyp)
print(f"Reference speaker map: {metrics.ref_map}")
print(f"Hypothesis speaker map: {metrics.hyp_map}")
# Reference speaker map: {'A': '0', 'B': '1'}
# Hypothesis speaker map: {'1': '0', '2': '2', '3': '1'}
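
The two maps can be joined to see which reference speaker each hypothesis speaker was paired with. The sketch below works on the dictionaries printed above and assumes that a reference and a hypothesis speaker sharing the same ID were matched by the scorer; `id_to_ref` and `hyp_to_ref` are illustrative names, not spyder attributes.

```python
# Maps copied from the example output above.
ref_map = {"A": "0", "B": "1"}
hyp_map = {"1": "0", "2": "2", "3": "1"}

# Invert the reference map so a shared ID leads back to a reference label.
id_to_ref = {common_id: ref_spk for ref_spk, common_id in ref_map.items()}

# Hypothesis speakers whose ID has no reference counterpart map to None.
hyp_to_ref = {hyp_spk: id_to_ref.get(common_id)
              for hyp_spk, common_id in hyp_map.items()}
print(hyp_to_ref)  # {'1': 'A', '2': None, '3': 'B'}
```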

Unit Tests

We have added basic unit tests using pytest. See the tests/ directory for examples. The tests are based on the dscore tool.

Full Changelog: v0.2.0...v0.4.0

Small changes

05 Jan 06:14

This release adds tabulate for pretty-printing the output and fixes a bug in the handling of overlapping turns from the same speaker.

First release

06 Mar 15:29
bc971f9

This is the first release of Spyder.

Features

  • Fast DER computation from Python code. Example usage:
import spyder

# reference (ground truth)
ref = [("A", 0.0, 2.0), # (speaker, start, end)
       ("B", 1.5, 3.5),
       ("A", 4.0, 5.1)]

# hypothesis (diarization result from your algorithm)
hyp = [("1", 0.0, 0.8),
       ("2", 0.6, 2.3),
       ("3", 2.1, 3.9),
       ("1", 3.8, 5.2)]

metrics = spyder.DER(ref, hyp)
print(metrics)
# DERMetrics(miss=0.098,falarm=0.216,conf=0.255,der=0.569)

print(f"{metrics.miss:.3f}, {metrics.falarm:.3f}, {metrics.conf:.3f}, {metrics.der:.3f}")
# 0.098, 0.216, 0.255, 0.569
  • Command-line interface to compute DER between RTTM files. Example:
> spyder ref_rttm hyp_rttm
Average error rates:
----------------------------------------------------
Missed speaker time = 11.48
False alarm speaker time = 2.27
Speaker error time = 9.81
Diarization error rate (DER) = 23.56
  • Support for computing per-file DER from CLI using the --per-file flag.
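
In both the Python and CLI outputs above, the reported DER is the sum of the three error components (missed speech, false alarm, and speaker confusion), each expressed relative to the total scored speaker time. A quick arithmetic check against the CLI summary, with the values copied by hand:

```python
# Values copied from the CLI summary above (percent of scored speaker time).
missed, false_alarm, speaker_error = 11.48, 2.27, 9.81

der = missed + false_alarm + speaker_error
print(f"{der:.2f}")  # 23.56
```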

Speed benchmark

We have done some basic speed comparisons in this blog post using the AMI development data as an example. Spyder is:

  • 3-5x faster than md-eval.pl (when invoked from Python code);
  • 10x faster than pyannote.metrics.

[Figure: processing time vs. number of turns in the recording]