MRG: Use `mffpy` for EGI MFF event reading by PragnyaKhandelwal · Pull Request #13932 · mne-tools/mne-python

PragnyaKhandelwal · 2026-05-29T10:03:41Z

Reference issue (if any)

None.

What does this implement/fix?

This replaces the internal EGI MFF event-reading path with mffpy and keeps the existing RawMff event contract intact. It also adds a small compatibility shim in mne/fixes.py for mffpy timestamp parsing so real-world MFF files with nanosecond fractional seconds can still be read correctly.

Additional information

I tested this change against the EGI MFF test module with the current testing dataset, and the full mne/io/egi/tests/test_egi.py suite passes locally.

This branch intentionally keeps the scope narrow. The separate changelog fragment should use the final PR number once the PR is opened on GitHub.

AI disclosure:
I used GitHub Copilot to help draft the PR text and review the scope.

…fixes

for more information, see https://pre-commit.ci

PragnyaKhandelwal · 2026-05-29T13:47:35Z

Hi @scott-huberty! Here is the first micro-PR for the event reader refactor, exactly as we discussed yesterday.
Quick status update on what this PR includes:
Refactored the EGI MFF event-reading path to use mffpy while keeping the existing RawMff event contract intact.
Removed legacy unused event helper code to address the Vulture dead-code warnings (commit b10ea86).
Added a temporary compatibility shim in mne/fixes.py for 9-digit fractional-second beginTime values (using defusedxml).
Kept the scope intentionally narrow to event-reading internals only.

Upstream Update:
I raised the issue here BEL-Public/mffpy#138 regarding the timestamp bug. The maintainer got back to me immediately and confirmed their pending PR #133 fixes this case! Once we review this PR and figure out the test failures, I will update the TODO comment in our shim to explicitly track their PR

Failing Checks:
The style checks are green now, but we are hitting some functional test failures in the CI. I am going to look into the logs to track down the mismatch, but if you have a minute to look at the failing CI checks and see if it's something obvious that I'm missing, I'd really appreciate the help!

pmolfese · 2026-05-29T14:58:46Z

+    # mffpy.Reader for locating the Events.xml files inside the MFF.
+    _soft_import("mffpy", "reading EGI MFF data")
+    _soft_import("defusedxml", "reading EGI MFF data")
+    import defusedxml.ElementTree as DET


Is there a reason you're using difusedxml instead of mffpy.XML.from_file( )?

Thanks so much for taking a look! There are two reasons for this:

MNE-Python has a strict internal policy requiring defusedxml for all XML parsing to protect against XML vulnerability attacks.

Because of the 9-digit fractional timestamp bug we discussed in issue BEL-Public/mffpy#138 calling mffpy.XML directly crashes the CI right now. I'm using defusedxml to route the parsing through a temporary shim in mne/fixes.py until your PR gets merged and released!

The struggle is real as we say. This might be something more for @scott-huberty and @drammock to ponder beyond this GSOC.

The MFF reader in mffpy currently used for reading epochs or evoked files is via Reader() is already using the mffpy XML.from_file() at least for getting header information and events (mne/io/egi/egimff.py). So there's already some exposure to another XML reader in the current codebase.

The pros of keeping the mffpy way of things is just how easy some things are like returning typed objects (.sensors, .epochs, .events), recover=True for malformed XML, and the fact that Reader() is already using it.

for evfile in sorted(glob(op.join(input_fname, "Events_*.xml"))): track = XML.from_file(evfile) for event in track.events: code = event.get("code") if code is None: continue ... ...

I suppose we could have a conversation with BEL and other mffpy maintainers about switching the current lxml backend to defusedxml.

My thesis is currently: "if the point of this project is to use mffpy to parse MFF files, then you should use mffpy's way of doing things and if those need changing, then we should probably change those upstream". Perhaps the compromise could be using the XML.* functions for this with the goal of changing things upstream in mffpy.

Hmm - okay faster than expected I have a mostly working defusedxml backend for mffpy. Might be able to crank this out in a week or so.

@drammock / @larsoner : 1) I assume it's ok if one of our dependencies parses XML files using lxml?

and 2) If mffpy devs cut a release after the aforementioned timestamp bug is fixed upstream... Is there any chance that MNE is comfortable with putting a lower pin on mffpy's latest release, so that we do not have to keep our shim around? 😁

Pins are fine for newly introduced dependencies. It's only an issue if we move from no pin to pin, or bump the pin.

I assume it's ok if one of our dependencies parses XML files using lxml?

we use defusedxml in our code because it closes some vulnerabilities present in the standard library. We cannot force dependencies to do likewise, but can suggest/encourage it upstream. IIRC, lxml has similar safeguards to defusedxml, but is a much more heavyweight package (does more stuff). I'd need to look into it to be sure though.

If a required dependency used lxml then I'd actually consider switching us to use it too --- why have 2 libs installed if one will suffice? But mffpy will be an optional dep (right?) so that logic doesn't necessarily apply, at least not as strongly, esp if there are salient differences in security or install size or API convenience.

Edit: sorry for the in Felicity of my last message, I just saw that @pmolfese is already looking into what it would take to switch to defusedxml upstream. Thanks! LMK if I can be of help there.

pmolfese · 2026-05-29T15:07:00Z

+        files_list = []
+    tracks = []
+    for xml_name in files_list:
+        if not xml_name.lower().endswith(".xml"):


This will have you parsing all XML files instead of just event ones. Probably unnecessary file IO.

Great catch here! You are completely right—parsing info.xml and the others here is totally unnecessary I/O. I will update this loop to explicitly filter for the event files (e.g., startswith("Events_") or similar) in my next commit. Thank you!

pmolfese · 2026-05-29T15:10:40Z

+            if event_start is None:
+                continue
+            start_sec = (event_start - start_time).total_seconds()
+            code_str = ev.get("code", "")


There are probably a couple checks you could add:

codes are supposed to be 4 characters (e.g. "STIM") and if not, then you're probably not reading a real MFF file

You probably also want to read in the labels for use if/when you add annotations

These are really great suggestions. Because this is the very first step of my GSoC project, my mentors requested that I keep this initial micro-PR strictly confined to 1:1 functional parity with MNE's legacy reader. The legacy reader didn't enforce the 4-character limit or map the labels, so I want to avoid expanding the scope just yet.
However, adding proper annotations is on my roadmap for Phase 2 here #13926, so I will definitely be referencing those labels when we get there!

scott-huberty · 2026-05-29T17:39:55Z

Thx for taking a look at this @pmolfese !!

PragnyaKhandelwal and others added 2 commits May 29, 2026 15:24

MRG: use mffpy for EGI MFF events; add timestamp parsing shim in mne.…

e465273

…fixes

[pre-commit.ci] auto fixes from pre-commit.com hooks

5a2248e

for more information, see https://pre-commit.ci

PragnyaKhandelwal mentioned this pull request May 29, 2026

GSoC 2026: Use mffpy for EGI Reader #13926

Open

11 tasks

PragnyaKhandelwal and others added 3 commits May 29, 2026 15:40

Add changelog entry

e8a58ad

MAINT: remove unused helper functions from EGI event reader

b10ea86

[pre-commit.ci] auto fixes from pre-commit.com hooks

31d8824

for more information, see https://pre-commit.ci

PragnyaKhandelwal marked this pull request as ready for review May 29, 2026 13:55

PragnyaKhandelwal requested review from agramfort, drammock and larsoner as code owners May 29, 2026 13:55

pmolfese reviewed May 29, 2026

View reviewed changes

PragnyaKhandelwal added 2 commits May 29, 2026 23:51

MAINT: narrow EGI MFF XML parsing to event files

b815651

TEST: align EGI bad XML coverage with event files

f4119cf

Uh oh!

Conversation

PragnyaKhandelwal commented May 29, 2026

Reference issue (if any)

What does this implement/fix?

Additional information

Uh oh!

PragnyaKhandelwal commented May 29, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pmolfese May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scott-huberty May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

scott-huberty commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pmolfese May 29, 2026 •

edited

Loading

scott-huberty May 30, 2026 •

edited

Loading