Brief summary of issue
While debugging the S-bits in the new firmware with the run_scans.py sbitMapNRate / checkSbitMappingAndRate.py commands, the scans failed systematically.
According to the CTP7 log file, the failure occurs inside calibration_routines.checkSbitRateWithCalPulse.
Types of issue
Expected Behavior
The RPC method should complete without register access errors.
Current Behavior
The RPC method fails with the following errors in the CTP7 log:
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Unmasking channel 14 on vfat 0 of OH 0
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Enabling calpulse for channel 14 on vfat 0 of OH 0
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Reseting trigger counters on OH & CTP7
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Configuring TTC Generator to use OH 0 with pulse delay 40 and L1Ainterval 0
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Entering ttcGenConfLocal
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: System release major is 3, v3 electronics behavior
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: ttcGenConfLocal: V3 behavior
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: ttcGenConfLocal: call ttcGenToggleLocal
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: System release major is 3, v3 electronics behavior
Jul 11 09:45:32 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Starting TTC Generator
Jul 11 09:45:33 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Reading trigger counters
Jul 11 09:45:33 eagle63 local0.err rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: read memsvc error: Bus error accessing 0x650080c8
Jul 11 09:45:33 eagle63 local0.err rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: read memsvc error: Bus error accessing 0x6500805c
Jul 11 09:45:33 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Stopping TTC Generator
Jul 11 09:45:33 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Disabling calpulse for channel 14 on vfat 0 of OH 0
Jul 11 09:45:33 eagle63 local0.err rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Reading reg 65400038 failed 1 times.
Jul 11 09:45:33 eagle63 local0.info rpcsvc[16714]: calibration_routines.checkSbitRateWithCalPulse: Masking channel 14 on vfat 0 of OH 0
The registers 0x650080c8 and 0x6500805c correspond respectively to GEM_AMC.OH.OH0.FPGA.TRIG.CNT.CLUSTER_COUNT and GEM_AMC.OH.OH0.FPGA.TRIG.CNT.
Steps to Reproduce (for bugs)
- Launch an sbitMapNRate scan, e.g. run_scans.py sbitMapNRate 1 4 0x1 -r 1e3 -n 10
- The scan fails
- While the scan is running, all OH FPGA register accesses fail (in gem_reg.py)
- As soon as the scan is over, the accesses succeed
Possible Solution (for bugs)
The sbitMapNRate scan is always first launched with a pulse rate of 0 Hz:
https://github.com/cms-gem-daq-project/vfatqc-python-scripts/blob/dae6fb9a1d65f0d7081dc040832faf1c5f77123e/checkSbitMappingAndRate.py#L153-L166
In ctp7_modules, the L1A interval is then set to 0 (ctp7_modules/src/calibration_routines.cpp, lines 976 to 983 in 92eeadc):
    //Setup TTC Generator
    uint32_t L1Ainterval;
    if (pulseRate > 0) {
        L1Ainterval = int(40079000 / pulseRate);
    }
    else{
        L1Ainterval = 0;
    }
The counters are then read while the TTC Generator is still running (ctp7_modules/src/calibration_routines.cpp, lines 1047 to 1061 in 92eeadc):
    LOGGER->log_message(LogManager::INFO, "Starting TTC Generator");
    writeRawAddress(addrTtcStart, 0x1, la->response);

    //Sleep for waitTime of milliseconds
    std::this_thread::sleep_for(std::chrono::milliseconds(waitTime));

    //Read All Trigger Registers
    LOGGER->log_message(LogManager::INFO, "Reading trigger counters");
    outDataCTP7Rate[chan]=readRawAddress(ohTrigRateAddr[oh::VFATS_PER_OH + 1], la->response);
    outDataFPGAClusterCntRate[chan]=readRawAddress(ohTrigRateAddr[oh::VFATS_PER_OH], la->response)*waitTime/1000.;
    outDataVFATSBits[chan]=readRawAddress(ohTrigRateAddr[vfatN], la->response)*waitTime/1000.;

    //Reset the TTC Generator
    LOGGER->log_message(LogManager::INFO, "Stopping TTC Generator");
    writeRawAddress(addrTtcReset, 0x1, la->response);
With the new firmware releases and the new 6b8b OH FPGA communication protocol, the bandwidth is shared between TTC commands and slow control. Since TTC commands have higher priority, slow control communication becomes impossible when L1As are sent on every clock cycle, as appears to happen with an L1A interval of 0.
I suggest adding a lower limit on the L1A interval (value to be defined) in ttcGenConfLocal so that slow control communication is always possible. At the same time, I would change the pulse rate of 0 in checkSbitMappingAndRate.py to 1.
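As a rough illustration of the suggested lower limit, a sketch could look like the following. The helper names (clampL1AInterval, l1aIntervalFromPulseRate) and the value of kMinL1AInterval are assumptions made for illustration only; they are not taken from ctp7_modules, and the real limit still has to be determined from the firmware.

```cpp
#include <algorithm>
#include <cstdint>

// Clock frequency in Hz, same constant as in the snippet quoted above.
const uint32_t kClockFreqHz = 40079000;

// HYPOTHETICAL placeholder for the lower limit on the L1A interval,
// in clock cycles. The actual value is still to be defined.
const uint32_t kMinL1AInterval = 16;

// Hypothetical helper: never return an interval below the lower limit,
// so that some bandwidth is always left for slow control.
inline uint32_t clampL1AInterval(uint32_t requestedInterval)
{
    return std::max(requestedInterval, kMinL1AInterval);
}

// Hypothetical helper: same conversion as in the quoted snippet
// (integer division, 0 Hz maps to 0), followed by the clamp.
// Note that with the clamp, 0 Hz yields the minimum interval, i.e. the
// highest allowed L1A rate, not "no pulses".
inline uint32_t l1aIntervalFromPulseRate(uint32_t pulseRate)
{
    uint32_t interval = 0;
    if (pulseRate > 0) {
        interval = kClockFreqHz / pulseRate;
    }
    return clampL1AInterval(interval);
}
```

With this placeholder limit, a pulse rate of 1e3 Hz still gives an interval of 40079 clock cycles, while a pulse rate of 0 Hz gives the minimum interval instead of 0. Because the clamp turns 0 Hz into the fastest allowed rate, changing the initial pulse rate in checkSbitMappingAndRate.py from 0 to 1 would still be needed on the Python side.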
Your Environment