Hello, thank you for sharing this work. I have a question from reading the code. As far as I can tell, each layer's weights are quantized to the corresponding bit-width. For activation quantization, however, the input tensor of the first layer appears to remain 32-bit float rather than being quantized to 8-bit: "ActFn" is only applied to the output tensor of the first layer, i.e., the input of the second layer. Is that correct?
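To make sure I am reading it right, here is a minimal sketch of the pattern I mean. The `ActFn` below is a placeholder PACT-style quantizer I wrote for illustration; the names `alpha`, `bits`, and the layer shapes are my own assumptions, not the repo's exact code.

```python
import torch
import torch.nn as nn


class ActFn(torch.autograd.Function):
    """Illustrative PACT-style activation quantizer (straight-through estimator)."""

    @staticmethod
    def forward(ctx, x, alpha, bits):
        ctx.save_for_backward(x, alpha)
        y = torch.clamp(x, min=0.0, max=alpha.item())   # clip activations to [0, alpha]
        scale = (2 ** bits - 1) / alpha
        return torch.round(y * scale) / scale            # uniform k-bit quantization

    @staticmethod
    def backward(ctx, grad_output):
        x, alpha = ctx.saved_tensors
        # straight-through gradient inside the clipping range
        grad_x = grad_output * ((x >= 0) & (x <= alpha)).float()
        # gradient w.r.t. the learnable clipping level alpha
        grad_alpha = (grad_output * (x > alpha).float()).sum().view_as(alpha)
        return grad_x, grad_alpha, None


class TwoLayerNet(nn.Module):
    def __init__(self, bits=8):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 16, 3, padding=1)
        self.alpha = nn.Parameter(torch.tensor(6.0))
        self.bits = bits

    def forward(self, x):
        # x (the first layer's input) is still float32 here -- no ActFn applied
        x = self.conv1(x)
        # ActFn quantizes conv1's output, i.e., the input of the second layer
        x = ActFn.apply(x, self.alpha, self.bits)
        x = self.conv2(x)
        return x
```

So in this reading, only the activations between layers are quantized, while the network input itself is fed to the first layer in full precision. Is that the intended behavior, or should the input also pass through an 8-bit quantizer?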