Update overview.md #4

Open
AntonioMDS wants to merge 1 commit into mguthriem:main from AntonioMDS:patch-3
@AntonioMDS
Just more comments

Owner

@mguthriem mguthriem left a comment

Very nice, thanks! Just a few suggestions to tweak then we can merge this.

* Source frequency

SNAPRed can automatically identify state, on the basis of a provided run number, via accessing the relevant process variables stored in the logs of the corresponding raw data file. SNAPRed can also create and instantiate a new state automatically, although this functionality is limited to users who have write privilege to the `/SNS/SNAP/shared` directory (i.e. Beamline Staff). A "human-readable" name can be given to any state but, internally, SNAPRed manages these via the allocation of 16 character hexadecimal string ID. This approach allows for future modification of the way that States are defined.
SNAPRed can automatically identify the state from its run number, via accessing the relevant metadata stored along with the diffraction data. Should a specific state not exist in the database, i.e. the instrument is being used in a new configuration, SNAPRed can also create and instantiate a new state automatically, although this functionality is limited to users who have write privilege to the `/SNS/SNAP/shared` directory (i.e. Beamline Staff). A "human-readable" name can be given to any state but, internally, SNAPRed manages these via the allocation of a 16-character hexadecimal string ID. This approach allows for future modification of the way that States are defined.
Owner

I like this. I would just tweak "identify the state from its run number", as the word "its" is ambiguous (it sounds like it's the state's run number, but a state doesn't have a specific run number). I would replace it with "an input run number".
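
For illustration only: the text above says a state is managed internally through a 16-character hexadecimal string ID, but it does not say how that ID is produced. A minimal sketch, assuming the ID is derived by hashing the parameters that define a state; the function `state_id`, the parameter names and the use of SHA-256 are all hypothetical and not taken from SNAPRed.

```python
import hashlib
import json


def state_id(parameters):
    """Derive a 16-character hexadecimal ID from a dict of state parameters.

    Assumption: hashing a canonical JSON form of the parameters. This is an
    illustrative scheme, not SNAPRed's actual one.
    """
    # Sort keys so the ID does not depend on the order the parameters are given in.
    canonical = json.dumps(parameters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical parameter names and values describing one instrument configuration.
params = {"detector_angle_1": -65.5, "detector_angle_2": 105.0,
          "wavelength": 2.1, "frequency": 60, "guide": 1}
sid = state_id(params)
```

Because the ID is only a label computed from whatever defines a state, the definition itself could change later without changing how states are referred to, which is the flexibility the paragraph describes.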

At the SNS, neutron data are recorded in event mode, which should be considered the native data type. Event mode data are essentially arrays that record a set of properties for each detected neutron. These properties are the neutron's time-of-flight, the ID of the pixel in which it was detected and the ID of the pulse from which it was generated (this latter can be converted to a "wall clock" time).

An alternate mode of storing data is to create histograms of events: define a series of discrete bins for a property of interest (say TOF) and then count how many neutron events fall within each bin.
An alternate mode of storing data is via histograms of observation: whereby a series of discrete bins for a property of interest (say TOF) is predefined and every neutron event is assigned to a bin.
Owner

This is a bit of semantics, but I'd say it's the events that are histogrammed. Sure, events are observations too, but what's observed is an event.
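
To make the event-versus-histogram distinction concrete, here is a small sketch using NumPy. The event list is randomly generated and every number in it (event count, TOF range, number of pixels, pulses and bins) is an assumption chosen for illustration, not a SNAP value.

```python
import numpy as np

# Hypothetical event list: one entry per detected neutron.
rng = np.random.default_rng(seed=0)
n_events = 10_000
tof_us = rng.uniform(1_000.0, 16_000.0, size=n_events)  # time-of-flight (microseconds)
pixel_id = rng.integers(0, 1024, size=n_events)         # pixel that detected the neutron
pulse_id = rng.integers(0, 600, size=n_events)          # pulse that generated the neutron

# Histogram mode: predefine the TOF bins, then count the events that fall in each.
bin_edges = np.linspace(1_000.0, 16_000.0, 151)         # 150 equal-width bins
counts, _ = np.histogram(tof_us, bins=bin_edges)

# Event storage grows with the number of neutrons; histogram storage depends
# only on the chosen number of bins.
event_bytes = tof_us.nbytes + pixel_id.nbytes + pulse_id.nbytes
histogram_bytes = counts.nbytes
```

Doubling `n_events` doubles `event_bytes` but leaves `histogram_bytes` unchanged; the histogram, however, no longer records which pixel or pulse each neutron came from.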

An important consideration due to the nature of the event mode data type is that the size of the data is proportional to the number of neutrons detected. In contrast, the size of histogrammed data is constant (dictated by the chosen number of bins). This has the consequence that on instrument where the total number of events per dataset is small, events can be a highly efficient way to store the data. However, on instruments with a high flux and large detector coverage event data can become extremely large volume. SNAP sits on the borderline of these two scenarios so SNAPRed takes careful measures to ensure that the various datasets it orchestrates are efficiently stored.
An important consideration that follows from the nature of the event mode data type is that the size of the data is proportional to the number of neutrons detected. In contrast, the size of histogrammed data is constant (dictated by the chosen number of bins). Consequently, on low flux instruments where the number of events per dataset is small, events can be a highly efficient way to store the data. However, on instruments with a high flux and large detector coverage, event data can become extremely large in volume. SNAP sits on the borderline of these two scenarios so SNAPRed takes careful measures to ensure that the reduction methodology takes into account the various datasets it orchestrates. [NOTE: SNAPRed does not "store"; that is why I preferred the wording of the reduction aspect]
Owner

This is meant to refer to data held in memory during the reduction operation. I think you're right that "store" isn't the correct word. What I mean to refer to is effective memory management.

A design requirement of SNAPRed is to efficiently manage events. _This is not yet available in Phase 2._
A design requirement of SNAPRed is to efficiently manage events upon loading data for reduction or calibration. _This is not yet available in Phase 2._
Owner

This isn't just relevant for loading. The goal is to effectively manage RAM usage through loading and all subsequent operations: generically, "efficiently managing events" throughout the workflow.

## Lite mode

During prototyping of SNAPRed, in part addressing the efficient management of events described above, the concept of `Lite mode` was developed. This is a process where the entire input event list is relabelled such that events within a fixed 8x8 grid of native pixels on the detector phase are given the same "super pixel" ID. The output of this process is a "Lite workspace" that contains the same number of events as the original, but has 64 times fewer pixels (18 modules of 32x32=18432 _versus_ 18 modules of 256x256=1179648), with almost imperceptible loss of diffraction resolution.
During prototyping of SNAPRed, in part addressing the efficient management of events described above, the concept of `Lite mode` was developed. This is a process where the entire input event list is relabelled such that events within a fixed 8x8 grid of native pixels on the detector phase are assigned the same "super pixel" ID. The output of this process is a "Lite workspace" that contains the same number of events as the original, but has 64 times fewer pixels (18 modules of 32x32=18432 _versus_ 18 modules of 256x256=1179648), with almost imperceptible loss of diffraction resolution.
Owner

As an aside: I'm not sure what I said is always true. When the guide is in, it is true, but it could be that flight tube resolution is significantly degraded in Lite mode. I'll plan to quantify this better and modify the text.
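
The relabelling step in the Lite mode paragraph can be sketched as below. The pixel counts (18 modules, 256x256 native pixels, 8x8 blocks) come from the text; the pixel numbering is an assumption, namely that native IDs run module by module and row-major within each module. The real SNAP ID layout may differ, and `to_super_pixel` is a hypothetical helper.

```python
import numpy as np

N_MODULES = 18
NATIVE = 256             # native pixels along one edge of a module
BLOCK = 8                # super pixel edge length, in native pixels
LITE = NATIVE // BLOCK   # 32 super pixels along one edge of a module


def to_super_pixel(native_id):
    """Map native pixel IDs to super pixel IDs (assumed row-major layout)."""
    native_id = np.asarray(native_id)
    module, within = np.divmod(native_id, NATIVE * NATIVE)
    row, col = np.divmod(within, NATIVE)
    # Every native pixel in the same 8x8 block gets the same super pixel ID.
    return module * LITE * LITE + (row // BLOCK) * LITE + (col // BLOCK)


native_ids = np.arange(N_MODULES * NATIVE * NATIVE)
lite_ids = to_super_pixel(native_ids)
n_native = native_ids.size          # 18 * 256 * 256
n_lite = np.unique(lite_ids).size   # 18 * 32 * 32
```

Applied to the pixel ID column of an event list, this keeps every event but reduces the number of distinct pixel labels by a factor of 64.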

SNAP's diffraction detector system consists of two movable detector banks, each comprising a 3x3 array of Anger cameras (area resolved TOF-capable neutron detectors). This creates a natural flexibility in how detector pixels can be grouped together. As a general rule, larger pixel groups improve counting statistics but, eventually, at the expense of diffraction resolution. A consequence of this is that a user should have the ability to easily switch between different pixel grouping schemes [schema?] (PGS) that describe how pixels are to be combined. Each PGS will consist of a number of subgroups each with their own ID.

Thus, SNAPRed is intended to be able to manage multiple different PGS and to reduce data from a specified list of these (allowing users multiple views on their data after reduction). The "standard" PGS based on combinations of detector components (`All`,`Bank`,`Column`,`2-4`) exist as defaults, however, it is also possible to specify unique PGS for any given instrument state. An example of this might be a scheme based on scattering angle of pixels. Such schemes differ from detector-component based schemes as the corresponding pixel ID's will change if the detector moves.
Thus, SNAPRed is intended to be able to manage multiple different PGS and to reduce data from a specified list of these (allowing users multiple views on their data after reduction). The "standard" PGS based on combinations of detector components (`All`,`Bank`,`Column`,`2-4`) exist as defaults, however, it is also possible to specify unique PGS for any given instrument state. An example of this might be a scheme based on pixels sharing a scattering angle range. Such schemes differ from detector-component based schemes as the corresponding pixel ID's will change if the detector moves. [NOTE: The pixel ID changes if the detector moves? Or did you mean the Grouping ID? If I am getting this all wrong, maybe consider rewording]
Owner

I meant: suppose I define a PGS on the basis of scattering angle, say labelling pixels from 90±10° as being in the same group. If I then move the detectors, those same pixel ID's will no longer correspond to 90±10°.
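
The point about angle-based schemes can be shown with a toy example. This is a sketch under stated assumptions: five hypothetical pixels in the horizontal plane, the beam along +z with the sample at the origin, and a detector move modelled as a rotation about the vertical axis. None of the geometry or function names come from SNAPRed.

```python
import numpy as np


def two_theta_deg(positions):
    """Scattering angle of each pixel (beam along +z, sample at the origin)."""
    r = np.linalg.norm(positions, axis=1)
    return np.degrees(np.arccos(positions[:, 2] / r))


def angle_group(positions, centre=90.0, half_width=10.0):
    """Return 1 for pixels within centre +/- half_width degrees, else 0."""
    return (np.abs(two_theta_deg(positions) - centre) <= half_width).astype(int)


def rotate_about_y(positions, angle_deg):
    """Rotate pixel positions about the vertical (y) axis."""
    a = np.radians(angle_deg)
    rot = np.array([[np.cos(a), 0.0, np.sin(a)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(a), 0.0, np.cos(a)]])
    return positions @ rot.T


# Hypothetical bank: pixel IDs 0..4 at scattering angles 70, 82, 90, 98, 110 degrees.
angles = np.radians([70.0, 82.0, 90.0, 98.0, 110.0])
bank = np.column_stack([np.sin(angles), np.zeros(5), np.cos(angles)])

groups_before = angle_group(bank)                       # pixels 1, 2, 3 are in the group
groups_after = angle_group(rotate_about_y(bank, 15.0))  # now pixels 0, 1 are in the group
```

The pixel IDs themselves never change; what changes is which IDs satisfy the 90±10° criterion, so an angle-based PGS has to be regenerated for each instrument state, unlike a scheme defined by detector components.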

Appropriate management of large event TOF diffraction datasets requires optimized histogramming and/or compression of the raw event data. In general, this operation is lossy and so the binning parameters used must be chosen correctly. In SNAPRed, the binning parameters are properties of _each subgroup_ for a given state and are calculated for each of these. This calculation is done on the basis of a set of defined instrument parameters, namely, the incoming wavelength band and the instrument diffraction resolution, the latter is parameterised by the TOF-resolution equation - {cite:p}`WORLTON1976`- and a specification of the relevant value for SNAP.

User interaction is limited to the specification of the number of bins `NBin` within the FWHM of a measured Bragg peak. This approach exploits logarithmic binning combined with the approximately linear dependence of resolution on wavelength for a TOF diffractometer and means that the chosen `NBin` number of points will be constant for any peak in any subgroup in any state of the instrument. The value used should reflect the final approach to fitting peak shapes and be sufficient to support the number of parameters needed. Typically, this would include a Rietveld fit using a back-to-back exponential convolved with a pseudo-Voigt peak model, which has at least 5 parameters for each peak (position, width, Gauss-Lorentz mixing, and the leading and trailing edge exponents) in addition to background model.
The User can then specify the number of bins `NBin` within the FWHM of a measured Bragg peak and SNAPRed establishes the optimal binning parameters. This approach exploits logarithmic binning combined with the approximately linear dependence of resolution on wavelength for a TOF diffractometer and means that the chosen `NBin` number of points will be constant for any peak in any subgroup in any state of the instrument. The value used should reflect the final approach to fitting peak shapes and be sufficient to support the number of parameters needed. Typically, this would include a Rietveld fit using a back-to-back exponential convolved with a pseudo-Voigt peak model, which has at least 5 parameters for each peak (position, width, Gauss-Lorentz mixing, and the leading and trailing edge exponents) in addition to a background model.
Owner

I think that this text should be changed. The NBin parameter is exposed during testing, but I think that we should choose a final value and then fix it. There is a later story that allows reduced data to be "down-sampled" to a lower resolution. But the default should match the instrument resolution (determined e.g. by a line width standard like the NIST LaB6).

By default, the FWHM used in combination with `NBin` to determine binning is taken from measurements of an ideal sample - taken to be NIST Silicon (640d). All data will be binned during reduction to match this native diffraction resolution of the instrument state. After reduction, it is possible to increase the size of binning, downsampling the diffraction resolution, as this may be appropriate in the presence of sample-dependent peak broadening. This downsampling is controlled via specification of a multiplier of `NBin` to ensure that every peak in the output data is sampled with the same binning.
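
Why logarithmic binning gives the same number of points across every peak can be checked numerically. A minimal sketch, assuming the peak FWHM is exactly proportional to x (a constant fractional resolution); the source only says the dependence is approximately linear, and the resolution value, x range and function name below are illustrative rather than SNAP parameters.

```python
import numpy as np


def log_bin_edges(x_min, x_max, resolution, n_bin):
    """Bin edges whose fractional width dx/x equals resolution / n_bin."""
    step = resolution / n_bin
    # Number of bins needed so that the last edge reaches at least x_max.
    n = int(np.ceil(np.log(x_max / x_min) / np.log1p(step)))
    return x_min * (1.0 + step) ** np.arange(n + 1)


resolution = 0.008   # assumed FWHM / x, illustrative value only
n_bin = 10           # bins requested within the FWHM of a peak
edges = log_bin_edges(0.5, 4.0, resolution, n_bin)

widths = np.diff(edges)
# Bins spanned by the FWHM of a peak located at the left edge of each bin.
bins_per_fwhm = (resolution * edges[:-1]) / widths
```

Here `bins_per_fwhm` is the same at every x because both the bin width and the assumed FWHM scale with x; with linear (equal-width) bins the count would instead vary across the pattern.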

Note that, when defined as arrays, the length of an array of x-values will exceed that of the corresponding y-values by one (as the x-array contains two bin edges for each y-value)
Note that, when handling the data as hystogrammed, the length of an array of x-values will have one extra value, when compared to the corresponding y-values, as the x-array contains the pair of bin edges for each y-value, as opposer to the conventional _x-y_ point data.
Owner

This is good, other than two typos: "hystogrammed" and "opposer".
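
The array-length note above is easy to see with `numpy.histogram`, which returns counts together with bin edges. The TOF values and bins here are made up for illustration.

```python
import numpy as np

tof = np.array([1.2, 1.7, 2.4, 2.6, 3.9])           # five hypothetical events
counts, edges = np.histogram(tof, bins=[1.0, 2.0, 3.0, 4.0])

# Histogram data: 3 y-values (counts) but 4 x-values (bin edges).
# Point data would instead use one x per y, for example the bin centres.
centres = 0.5 * (edges[:-1] + edges[1:])
```

Converting between the two conventions therefore changes the length of the x-array by one, which matters whenever x and y arrays are passed around separately.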

Throughout this document, an attempt has been made to use a consistent set of symbols.

### A note on indexing
## A note on indexing [I found this \confusing, is there a graphical way to describe this?]
Owner

I'll try to come up with something...


Below is the set of symbols used throughout this document.

### The symbols [NOTE: do we ever mention wall clock or regular time "small t" in seconds]
Owner

I'm not sure. It will likely come up when we add event compression (coming soon...)
