
Commit 4b503b8

jeremy-syn, TimotheeeNiven, and Jeremy Holleman authored
Streaming ww dev (#177)
* added sww_ref_l4r5zi; currently pretty similar to sww_testing
* changed I2S buffers to int16_t
* added readme to describe working of test project
* streaming processing in progress
* added linker scripts, somehow left out of earlier commits
* fixed some issues in streaming feature extraction (mostly pointer arithmetic and scaling errors
* sww_ref detects ww in streaming setup and pulses GPIO PB8, also changed stop command so we can do multiple I2S transactions w/o reboot
* PowerBoard LPM
* PowerBoard LPM
* Needed to add timer and I2C so auto-gen code is correct. Adding GPIO interrupt to capture ww detections
* Needed to add timer and I2C so auto-gen code is correct. Adding GPIO interrupt to capture ww detections
* Needed to add timer and I2C so auto-gen code is correct. Adding GPIO interrupt to capture ww detections
* removed wav capture code from streaming processing function
* changed uart1 baud rate to 115200
* added interrupt EXTI15 on pin G15
* added interrupt EXTI15 on pin G15
* added interrupt EXTI15 on pin G15
* Log File Implmentations, DO NOT RUN POWERBOARD
* Log File Implmentations, DO NOT RUN POWERBOARD
* Log File Implmentations, DO NOT RUN POWERBOARD
* interface now captures negative pulses on WW_DET_IN (G15) and can print out a list of timestamps.
But timer rolls over at 64k
* changed from timer 16 to timer 2 because it has 32 bits; also filtered out repeat detections within 10ms
* removed some debug code and renamed pin PIN_WW_DETECTED => WW_DETECTED
* removed some debug code and renamed pin PIN_WW_DETECTED => WW_DETECTED
* interface -- reset timer 2 to 0 when beginning to record detections
* AutoDetect Powerboard WIP
* Powerboard WIP with Logging
* changed voltage from 3V to 3.3V
* Update README.md
* Update README.md
* Update README.md
* Add files via upload Updated Images
* Rename LPM01A.jpg to LPM01A_old.jpg
* Rename LPM01A.JPG to LPM01A.jpg
* Updated Images
* Rename LPM01A.jpg to LPM01A._toast.jpg
* Add files via upload
* Delete benchmark/runner/img/LPM01A_old.jpg
* Delete benchmark/runner/img/LPM01A._toast.jpg
* making sww ref implementation follow common protocol
* updating runner to run streaming wakeword
* Update README.md
* changes to runner for streaming ww in progress (not fully working here)
* added variable timeout to send_command to accomodate long commands (like play wav-file)
* fixed a couple mis-spellings in doc strings
* changed EE_DEVICE_NAME to 'dut' to work with runner, though it should be able to handle other names
* commented out unneeded db print; should remove later as long as it doesn't cause problems
* in main, fixed case where all_negatives or all_positives == 0; also added model to dut_config; and added call to separate summarize function for sww
* fixed order and added delay into stream step of script; and added return detected_timestamps
* streaming ww benchmark runs in the runner, but does not yet compile false pos/neg ratios
* cleaned up a couple of errors that had gotten into demo notebook
* modified build_long_wav.py to take arguments and added a 10s easy test wav for validating setup
* made build_long_wav deliver 2-chan (stereo) wav files for compatibility with I2S
* updated build_long_wav to include sample rate and length in _ww_windows.json
* AD AUC and Accuracy Update
* Speed/No more Mode Update
* No more mode
* Powerboard WIP
* updated ref sw to use detection threshold set by #define DETECT_THRESHOLD and to issue one detection at beginning of streaming (for synchronization)
* modified play_wave to take optional timeout variable. still need to set this in a config file to match the file length
* added a pause and a variable timeout in wav play in script step
* added code to calculate detection statistics (false pos, etc.)
* removed long_wav_ww_windows.json because its info is moved into sww_data_dir/sww_long_test.json
* making PowerManager.board_timestamps_ms a list, to hold all timestamps in a test. Also printed out a caught exception and put the most recent timestamp into the data_queue along with other info
* added is_energy_mode condition to some of the results handling
* changed inference counting to count each cycle as 1 (instead of each element in the returned vector as 1)
* Powerboard WIP 2.0
* Accuracy/Performance Updates
* Accuracy/Performance Updates 2.0
* Interface/DUT Co-Working
* fixed demo notebook to work with different directory structure
* fixed processing of current measurement strings from lpm01a and fixed result handling in if mode == Energy block
* fixed issue where multiple 'stop' commands were hanging the power manager shutdown sequence; lots of debug messages still there
* now using 'events' from LPM01a based on D7 edges to measure timing. processing that info still WIP
* Old Mode Interface
* ReadMe Update
* Delete benchmark/runner/img/L4R5ZI.png
* Delete benchmark/runner/img/LPM01A.jpg
* Old Pictures
* Add files via upload
* Rename L4R5ZI.png to L4R5ZI_old.png
* Rename L4R5ZI_old.png to L4R5ZI.png
* Rename L4R5ZI.png to L4R5ZI_1.png
* Rename LPM01A.jpg to LPM01A_1.jpg
* Update README.md
* energy measurement working now.
still some debug messages to remove
* fixed issue with energy mode
* some minor changes to make sww work with recent changes to results processing
* added line to strip 2nd channel out of long_wav if it is a stereo wav
* put log files in separate directory to avoid clogging main dir
* fixed clocking on SAI to fix sampling rate error -- clock div ratio was rounded 15.6=>15 so sampling freq was 16.8kHz instead of 16kHz
* fixed discrepancies wrt ref model (input scale factor and when to clip log mel energies)
* removed sww_testing_l4r5zi. all of its functionality is in sww_ref_l4r5zi
* added pin definitions to pulse D7 for timestamp and D6 for active processing (duty cycle) measurement
* SWW ref toggles D7 at init, and beginning and end of streaming
* sww power mode is mostly running now, but requires a 1sec delay after the wav stops playing before stopping power measurement
* fixed feature extractor to continue pre-emphasis filter across segment boundaries
* added 'echo' value in devices.yaml for enhanced interface board.
* reduced sleep time from 1s to 0.25s; still need to find root cause
* updated images in readme
* minor edits
* added contents for SD_card and sww_data_dir
* fixed to respect configurable DUT voltages.
Also removed dut.yaml and to put all dut config in devices.yaml
* updated demo notebook because the wav file has moved to .../runner/sd_card
* updated documentation and figures
* added start time to wav playing
* removed extra prints in strww utils
* add 2 minute test
* added notes on running streaming test
* updated demo notebook to use the json test files
* Baud Changer Done
* Baud Changer Done
* Baud Changer 1.2
* Baud Changer 1.3
* Baud Changer 1.3
* Baud Changer 1.4
* Baud Changer 1.5
* Baud Changer 1.6
* New YAML files
* changed detection threshold 120=>115 to align with demo notebook
* fixed pre-emphasis so correct value is carried over from one frame to the next
* added a test command infer_wav [offset] that will run feature extraction and inference on a waveform stored in fixed_data.c as an array
* added 30s timeout for infer
* changed from_logits in add_qat() to False to be consistent w/ the initial float training and the model. softmax is needed during inference for consistent threshold
* removed two preferences files that should not be in repo
* removed sync_baud; moving to io_manager_enhanced
* added echo option to constructor
* perf_result is an output; should not be under git control
* added io.__enter__() before sync_baud to fix crashing
* added entry_count to SerialDevice enter()/exit() in case of multiple calls to __enter__()
* updated evaluate.py to deal with the json streaming test spec
* changed evaluate.py so it will choose TFLite intepreter or standard model code based on filename
* added stream_wav_uart.py
* added multiple retries on error. switched to logger for some output
* added retry option to send_cmd function.
* added db load, setptr, getptr; and extract_features_on_chunk
* changed tests_accuracy to use all test files
* added --specgram flag to evaluate to let you evaluate model on precomputed spectrogram features
* added --specgram flag to evaluate to let you evaluate model on precomputed spectrogram features
* proposed edits to rules
* changed DUT baud rate back to 115200 for SWW
* stashing some WIP in the temp branch while I go back and test an older commit
* updated reference h5 and tflite models; normalized model file names
* changed default epochs and l2 values
* added some LR scheduling to QAT phase
* fixed FP detection logic
* changed long wav file
* changed long wav file
* added pulsing of 'active' pin to run_model
* updated model; set model optimization to 'time'
* modified DUT class to take echo argument
* added echo to DUT definition
* fixed printf format error
* reverting devices.yaml; changed DUT baud rates were mistakenly commited
* WIP on adding duty cycle measurement. Added structure, but handler is not currently called
* duty cycle measurement (recording start and stop times on a 10us clock) working
* added echo mode for power manager and most of the support for duty cycle measurement
* added timeout kw arg to io_manager.send_command
* fixed an image filename typo and added a couple notes on I2S debug
* fixed logic so in performance or energy mode, anomaly detection only infers on one segment
* added message to the end of the run to indicate the logfile name
* fixed case of no loop count specified (e.g.'loop:' instead of 'loop 5:') so it uses full labels file.
* selectively allocating g_wav_record and g_act_buff, since there is not enough RAM for both
* restructured memory allocation for wav capture and activation capture to use a single pre-allocated general-purpose buffer
* updated sww reference and runner to optionally capture activation value
* updated wav capture function
* updated connection diagrams to show 'processing' pin -- used for duty cycle calculation
* was making a few edits, but now code does not work
* added check in case no DUT or interface is present
* a few local/temp edits were accidentally incorporated into the last commit; reverting those
* removed unneeded debug message
* interface board will report error if wav file does not successfully play
* fixed recording of DUT voltage.
* added line end to print_tee
* added duty cycle print out to results.txt file and some comments elsewhere
* fixed some code that relied on DUT name being l4r5zi
* removed unused dut_config argument and parse_dut_config function
* removed unused dut_voltage and dut_baud arguments. both are folded into devices.yaml file
* updated readme on runner
* Joulescope Power Update 1.0
* JS220 Partially Working
* JS220 Trigger 1.0
* updated notes in readme
* add try-except for undecodeable (corrupted) characters in serial_device._read_loop
* added condition to skip past empty lines in infer response. otherwise None's cause an error
* added print out of some debug info if power manager does not respond
* fixed error in call to get_baud_rate
* fixed error in call to print_energy_results
* in error handling in run, cast exception to string so regex would work
* auc and accuracy were backwards. fixed
* JS220 Implimentation
* JS220 Implimentation
* Cleanup
* JS vs LPM
* minor additions to runner readme
* removed some results files
* Adding JouleScope JS220 support. Requires libusb and pyusb now.
* fixed (again) bad get_baud_rate call that sneaked back in
* changed echo for LPM to False (should only be True for debugging)
* changed stopbits to 1 on DUT lpuart1 in sww ref (connects DUT to interface or host)
* added logic to convert a single result (ie no loop in script) to a list of result dicts
* fixed energy mode output; enforced 10s minimum
* added binary for interface board
* added notes on installing firmware
* added benchmark-specific devices files
* added note about DYLD_LIBRARY_PATH to resolve conflicting libusb versions on M1 macs
* median energy info was not going to results file (only terminal). Fixed
* made required number of cycles 1 for SWW, 5 for others
* improved some informational print outs
* added minor error handling to activation parsing
* updated tests scripts for energy and performance
* fixed AUC calculation (replaced with sklearn)
* fixed AUC calculation (replaced with sklearn)
* undoing two changes accidentally brought in by merging main->streaming_ww_dev: chane to interface/usart.c and sww_testing_l4r5zi came back in
* added disambiguation using check_name property when VID/PID are repeated
* added multi_class='ovr' to roc_auc_score to avoid errors in multiclass cases (all except ad)
* fixed roc_auc_score for two-class problem (person detection)
* Changed method to avoid erroneous connections when multiple devices have the same VID/PID. Added interface property to devices_XXX.yaml files which must be "direct_usb" for non-serial devices like the JS220.
* fixed tests_performance; they had wrong number of loops and wrong sww test
* fixed regex patterns to match new runner output
* made submission checker work with new runner output
* changed submission checker to use pandas dataframe
* updated submission checker to process SWW results
* fixed some issues with js220 support
* added retreiving name print statement for debugging help
* added duty cycle information to results.txt printout and fixed a couple of corner case failures
* removed line that calculates variable that never gets used
* added per-file info print out to results.txt
* more notes in readme
* updated L4R5zi hookup image
* updated submission checker
* updated rules
* merged master back into streaming_ww_dev
* enabled duty cycle measurement and SWW w/ only performance in submission checker

---------

Co-authored-by: TimotheeeNiven <rniven1@uncc.edu>
Co-authored-by: TimotheeeNiven <99817017+TimotheeeNiven@users.noreply.github.com>
Co-authored-by: Jeremy Holleman <jeremy@mlcommons.org>
1 parent 904de6f commit 4b503b8

13 files changed

Lines changed: 766 additions & 543 deletions

benchmark/MLPerfTiny_Rules.adoc

Lines changed: 10 additions & 2 deletions
@@ -177,9 +177,16 @@ The suite includes the following benchmarks:
 | Visual Wake Words | Binary image classification | Visual Wake Words Dataset | MobileNet | 80% (Top 1)
 | Image Classification | Small image classification | Cifar10 | ResNet | 85% (Top 1)
 | Anomaly Detection | Detecting anomalies in machine operating sounds | ToyADMOS | Deep AutoEncoder | 0.85 (AUC)
-| Streaming Wakeword | Detecting wakewords in a continuous stream of audio| Custom | 1D DS-CNN | TBD
+| Streaming Wakeword | Detecting wakewords in a continuous stream of audio| Custom | 1D DS-CNN | <= 8 FP, <= 8 FN
 |===

+
+For the quality target, keyword spotting, visual wakewords, and image classification all use top-1 accuracy as the key metric. Anomaly detection
+uses the area under the ROC curve (true positive rate vs false positive rate), as computed by
+https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html[sklearn.metrics.roc_auc_score].
+
+The streaming wakeword benchmark uses a combination of false positives and false negatives, requiring no more than 8 of either.
+
 ==== Relaxed constraints for the Open division

 1. An Open benchmark must perform a task matching an existing Closed benchmark, and be substitutable in LoadGen for that benchmark.
@@ -193,7 +200,8 @@ The suite includes the following benchmarks:


 === EnergyRunner™ benchmark framework
-The benchmark suite is run using the EnergyRunner™ benchmark framework from EEMBC, which detects the DUT, sends inputs, and reads outputs over UART. The EEMBC runner is being phased out. It will be permitted for teh KWS, VWW, IC, and AD benchmarks in the summer 2015 submission. After that, only the MLCommons Runner will be permitted. The EEMBC runner does not support the streaming wakeword benchmark.
+
+The benchmark suite is run using the EnergyRunner™ benchmark framework from EEMBC, which detects the DUT, sends inputs, and reads outputs over UART. The EEMBC runner is being phased out. It will be permitted for the KWS, VWW, IC, and AD benchmarks in the summer 2025 submission. After that, only the MLCommons Runner will be permitted. The EEMBC runner does not support the streaming wakeword benchmark.

 The EEMBC runner is available here: https://github.com/eembc/energyrunner
 The MLCommons runner is available in this repository: https://github.com/mlcommons/tiny/tree/master/benchmark/runner
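
The streaming wakeword quality target added to the rules table (no more than 8 false positives and no more than 8 false negatives) can be expressed as a simple pass/fail check. The sketch below is only an illustration of that rule: the `StreamingResult` container and the function name are hypothetical and do not come from the runner or the submission checker.

```python
from dataclasses import dataclass

# Limits taken from the rules table: "<= 8 FP, <= 8 FN".
MAX_FALSE_POSITIVES = 8
MAX_FALSE_NEGATIVES = 8


@dataclass
class StreamingResult:
    """Hypothetical container for the error counts of one streaming wakeword run."""
    false_positives: int
    false_negatives: int


def meets_sww_quality_target(result: StreamingResult) -> bool:
    """Return True only if BOTH error counts are within their limits."""
    return (result.false_positives <= MAX_FALSE_POSITIVES
            and result.false_negatives <= MAX_FALSE_NEGATIVES)


if __name__ == "__main__":
    print(meets_sww_quality_target(StreamingResult(false_positives=3, false_negatives=8)))
```

Note that the two limits apply independently: a run with zero false negatives still fails if it produces nine false positives.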

benchmark/runner/README.md

Lines changed: 25 additions & 0 deletions
@@ -164,6 +164,25 @@ The device file defines available devices that are automatically detected by the
 - **`usb`**: `dict` where the key is `vid` and the value is a `pid` or a list of `pid`s.
 - **`usb_description`**: A string used to match the USB description.

+
+#### Adding a New Device
+You can use the PySerial module's list_ports function to get the VID and PID of a device as long as it presents as a serial interface
+```
+jeremy@macbook-pro-16%>python -m serial.tools.list_ports -v
+/dev/cu.Bluetooth-Incoming-Port
+    desc: n/a
+    hwid: n/a
+/dev/cu.usbmodem1403 <<==== This is the reference DUT
+    desc: STLINK-V3
+    hwid: USB VID:PID=0483:374E SER=005300313532511531333430 LOCATION=0-1.4
+/dev/cu.usbmodem2061398A4D431 <<==== This is the LPM05a power monitor
+    desc: PowerShield (Virtual ComPort in FS Mode)
+    hwid: USB VID:PID=0483:5740 SER=2061398A4D43 LOCATION=1-1
+/dev/cu.wlan-debug
+    desc: n/a
+    hwid: n/a
+4 ports found
+```
 ---

 ### Device Under Test Configuration `dut.yml`
@@ -272,3 +291,9 @@ If the I2S transfer appears not to be working, here are a few things to try.
 ### Baud Rate for Interface board:
 Located in file /application/user/core/usart.c

+<<<<<<< HEAD
+=======
+### A device with vid:pid XX:YY failed to provide a serial number.
+In some cases, multiple devices may have the same VID and PID. For example, on an MCU development board, the VID/PID may be linked to the vendors debugger/programmer (e.g. ST-Link) rather than to the development board specifically. To avoid
+Workaround: Use a USB-serial converter so that the offending device presents with a different VID:PID.
+>>>>>>> streaming_ww_dev
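
The "Adding a New Device" text added to the README reads VID/PID values off the `list_ports` command-line output. The same information is available from Python through pySerial's `serial.tools.list_ports.comports()`. The sketch below is a hedged illustration, not part of the runner: the helper names `format_usb_entry` and `list_usb_entries` are hypothetical, and the output is merely shaped like the `usb` mapping (`vid: pid`) that the devices files use.

```python
def format_usb_entry(port):
    """Format one port as a 'vid: pid' line, or return None if it has no USB IDs.

    `port` is any object with `device`, `description`, `vid` and `pid`
    attributes, such as the entries returned by pySerial's comports().
    """
    if port.vid is None or port.pid is None:
        return None  # e.g. Bluetooth or debug ports that are not USB devices
    return f"0x{port.vid:04X}: 0x{port.pid:04X}  # {port.device} ({port.description})"


def list_usb_entries():
    """Query the ports on this machine (requires pySerial to be installed)."""
    from serial.tools import list_ports  # imported lazily; only needed here
    entries = (format_usb_entry(p) for p in list_ports.comports())
    return [e for e in entries if e is not None]


if __name__ == "__main__":
    for line in list_usb_entries():
        print(line)
```

For the reference DUT in the listing above, this would yield a line beginning `0x0483: 0x374E`, which can be copied under the `usb` key of a new device entry.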

benchmark/runner/device_manager.py

Lines changed: 8 additions & 6 deletions
@@ -17,7 +17,7 @@ def precheck_device_name(dev_cfg, serial_device, mode):
     return True. If the device on <serial_device> does not respond to the
     "name%" command, or responds but the name does not match check_name return
     False. If the response matches check_name, return True.
-    Note that this function uses teh 'check_name' property, not 'name', which
+    Note that this function uses the 'check_name' property, not 'name', which
     is mostly arbitrary
     ** Arguments:
     - dev_cfg: device configuration dict from devices.yaml
@@ -129,10 +129,8 @@ def scan(self):
         """Scan for both serial and USB-only devices and initialize them."""
         pending_serial = [p for p in list_ports.comports(True) if p.vid]
         matched = []
-        comport_serial_numbers = []

         for p in pending_serial:
-            comport_serial_numbers.append(p.serial_number)
             for d in self._device_defs:
                 found = False
                 for vid, pids in d.get("usb", {}).items():
@@ -154,12 +152,16 @@ def scan(self):
         # Additional scan for USB-only devices (non-serial)
         all_usb = usb.core.find(find_all=True)
         for dev in all_usb:
-            if dev.serial_number in comport_serial_numbers:
-                # we already handled this device in the loop on list_ports.comports()
-                continue
             vid = dev.idVendor
             pid = dev.idProduct
+
             for d in self._device_defs:
+                if d.get("interface", "") != "direct_usb":
+                    # this association logic is only for direct (non-serial) devices, like the JS-220.
+                    # so skip it if interface is unspecified or not "direct_usb"
+                    # Without this block, a VID/PID match that has been previously rejected based on
+                    # "name" mismatch can be incorrectly associated here.
+                    continue
                 for k, v in d.get("usb", {}).items():
                     if isinstance(v, list):
                         if pid in v and vid == k:
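
The effect of the new `interface` check in `scan()` can be seen in isolation with plain dictionaries. The sketch below is a simplified stand-in, not the runner's code: `match_direct_usb` is a hypothetical helper, the device definitions are toy data, and the VID/PID given for the `js220` entry is a placeholder rather than the real Joulescope identifier.

```python
def match_direct_usb(device_defs, vid, pid):
    """Return the names of definitions that a raw USB device may be associated with.

    Only definitions marked interface: direct_usb are considered, mirroring the
    guard added in this commit. Serial devices are matched elsewhere.
    """
    matches = []
    for d in device_defs:
        if d.get("interface", "") != "direct_usb":
            continue  # unspecified or serial interface: never match here
        for k, v in d.get("usb", {}).items():
            pids = v if isinstance(v, list) else [v]  # a single pid or a list of pids
            if vid == k and pid in pids:
                matches.append(d["name"])
    return matches


# Toy definitions shaped like entries in a devices_XXX.yaml file.
DEVICE_DEFS = [
    {"name": "dut", "type": "dut", "usb": {0x0483: 0x374B}},
    {"name": "js220", "type": "power", "interface": "direct_usb",
     "usb": {0x1234: [0x0001, 0x0002]}},  # placeholder VID/PID, for illustration only
]

if __name__ == "__main__":
    print(match_direct_usb(DEVICE_DEFS, 0x0483, 0x374B))
    print(match_direct_usb(DEVICE_DEFS, 0x1234, 0x0002))
```

With the guard in place, a serial device whose VID/PID appears in the definitions but was rejected on a name mismatch cannot be picked up again by the USB-only pass.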

benchmark/runner/device_under_test.py

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ def _retry(self, method, retries=3):

     def _get_name(self):
         name_retrieved = False
+        print("Retrieving name from DUT ...")
         for l in self._port.send_command("name"):
             match = re.match(r'^m-(name)-dut-\[([^]]+)]$', l)
             if match:

benchmark/runner/devices_ad.yaml

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
     0x0483: 0x374B
 - name: js220
   type: power
+  interface: direct_usb
   preference: 1 # set to higher preference thatn lpm01a to use js220
   raw_sampling_rate: 1000000
   virtual_sampling_rate: 1000

benchmark/runner/devices_kws_ic_vww.yaml

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
     0x0483: 0x374B
 - name: js220
   type: power
+  interface: direct_usb
   preference: 1 # set to higher preference thatn lpm01a to use js220
   raw_sampling_rate: 1000000
   virtual_sampling_rate: 1000

benchmark/runner/devices_sww.yaml

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@
     0x0483: 0x374B
 - name: js220
   type: power
+  interface: direct_usb
   preference: 1 # set to higher preference thatn lpm01a to use js220
   raw_sampling_rate: 1000000
   virtual_sampling_rate: 1000

benchmark/runner/img/L4R5Zi.png

240 KB

benchmark/runner/main.py

Lines changed: 14 additions & 5 deletions
@@ -203,7 +203,6 @@ def print_energy_results(l_results, energy_sampling_freq=1000, req_cycles=5, res
         total_inference_energy = np.sum(inference_energy_samples)
         num_inferences = res['infer']['iterations']
         energy_per_inf = total_inference_energy / num_inferences
-        latency_per_inf = elapsed_time / num_inferences
         inf_energies[inf_num] = energy_per_inf
         inf_times[inf_num] = elapsed_time

@@ -226,6 +225,7 @@ def print_energy_results(l_results, energy_sampling_freq=1000, req_cycles=5, res

 # Summarize results
 def summarize_result(result, power, mode, results_file=None):
+    print(20*'-')
     num_correct_files = 0
     total_files = 0
     y_pred = []
@@ -252,7 +252,7 @@ def summarize_result(result, power, mode, results_file=None):
         print_energy_results(result, energy_sampling_freq=1000, results_file=results_file)
         return

-    for r in result:
+    for res_num,r in enumerate(result):
         if 'infer' not in r or 'class' not in r or 'file' not in r:
             continue  # Skip malformed or error-only entries
         infer_data = r['infer']
@@ -266,7 +266,13 @@ def summarize_result(result, power, mode, results_file=None):

         if 'throughput' in infer_data:
             throughput_values.append(infer_data['throughput'])
-
+            print_tee(f"Performance results for window {res_num+1}", outfile=results_file)
+            print_tee(f" # Inferences : {infer_data['iterations']}", outfile=results_file)
+            print_tee(f" Runtime: {infer_data['elapsed_time']/1e6} sec.", outfile=results_file)
+            print_tee(f" Throughput: {infer_data['throughput']} inf./sec.", outfile=results_file)
+            if infer_data['elapsed_time']/1e6 > 10.0:
+                print_tee(f" Runtime requirements have been met.", outfile=results_file)
+
         if file_name not in file_infer_results:
             file_infer_results[file_name] = {'true_class': true_class, 'results': []}

@@ -307,8 +313,11 @@ def summarize_result(result, power, mode, results_file=None):
         total_files += 1

     accuracy = calculate_accuracy(np.array(y_pred), np.array(y_true))
-    auc = roc_auc_score(np.array(y_true), np.array(y_pred), multi_class='ovr')
-
+
+    if np.array(y_pred).shape[1] == 2:
+        auc =roc_auc_score(np.array(y_true), np.array(y_pred)[:,1])
+    else:
+        auc =roc_auc_score(np.array(y_true), np.array(y_pred), multi_class='ovr')

     current_time = datetime.now()
     formatted_time = current_time.strftime("%m%d.%H%M%S ")
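
The two-branch AUC computation in the last hunk can be exercised on its own. For a binary problem, `roc_auc_score` expects a 1-D array of scores for the positive class, so the change passes only column 1 of the prediction matrix; with more than two classes it passes the full matrix together with `multi_class='ovr'`. The sketch below wraps that logic in a hypothetical helper, `compute_auc`, with made-up example data. It assumes NumPy and scikit-learn are installed and that each row of a multi-class `y_pred` sums to 1, which scikit-learn requires.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def compute_auc(y_true, y_pred):
    """AUC for per-class scores of shape (n_samples, n_classes).

    Mirrors the branch added in this commit: two-class problems use the
    positive-class column only; other problems use one-vs-rest averaging.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_pred.shape[1] == 2:
        return roc_auc_score(y_true, y_pred[:, 1])
    return roc_auc_score(y_true, y_pred, multi_class='ovr')


if __name__ == "__main__":
    # Two-class example (e.g. person / no person), invented scores.
    labels = [0, 0, 1, 1]
    scores = [[0.9, 0.1], [0.6, 0.4], [0.65, 0.35], [0.2, 0.8]]
    print(compute_auc(labels, scores))
```

Selecting column 1 assumes that class index 1 is the positive class, which matches the usual 0/1 label encoding but is an assumption of this sketch.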
