Question about Hessian Matrix Calculation #11

@ShunLu91

Hello, really appreciate your nice work.

I hope this message finds you well. I have two questions regarding the calculation of the Hessian matrix in your code. Specifically, I'm looking at the function where you calculate the second-order derivatives for each parameter with respect to all parameters:

row = self.gradient(grad[j], inputs[i:], retain_graph=True)[j:]

(1) Why is only the [j:] part of the result taken? Is it assumed that the derivative has no effect on the preceding parameters?

(2) Additionally, why are the values assigned as follows? Could you please explain the reasoning behind these specific assignments?

out.data[ai, ai:].add_(row.clone().type_as(out).data)  # ai's row
if ai + 1 < n:
    out.data[ai + 1:, ai].add_(row.clone().type_as(out).data[1:])  # ai's column
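
One possible reading of both snippets, offered as an assumption rather than the maintainer's confirmed reasoning: the Hessian of a twice continuously differentiable function is symmetric, so only the entries from the diagonal onward need to be computed for each row, and the rest can be mirrored into the matching column. The sketch below illustrates that fill pattern in plain Python. The function `f`, the helper names `hessian_row` and `fill_symmetric`, and the analytic rows that stand in for the autograd call are all hypothetical and not taken from the project.

```python
# Hypothetical example: f(x) = x0^2 * x1 + x1 * x2^3.
# Its gradient is (2*x0*x1, x0^2 + x2^3, 3*x1*x2^2).

def hessian_row(x, k):
    """Analytic k-th row of the Hessian of f (stands in for an autograd call)."""
    x0, x1, x2 = x
    rows = [
        [2 * x1, 2 * x0, 0.0],
        [2 * x0, 0.0, 3 * x2 ** 2],
        [0.0, 3 * x2 ** 2, 6 * x1 * x2],
    ]
    return rows[k]


def fill_symmetric(x):
    """Build the full Hessian from upper-triangular row tails only."""
    n = len(x)
    out = [[0.0] * n for _ in range(n)]
    for ai in range(n):
        # Keep only the entries from the diagonal onward, like the [j:] slice.
        row = hessian_row(x, ai)[ai:]
        # ai's row: corresponds to out[ai, ai:] += row
        for offset, value in enumerate(row):
            out[ai][ai + offset] += value
        # ai's column: corresponds to out[ai + 1:, ai] += row[1:]
        # (the diagonal entry row[0] is skipped so it is not added twice)
        for offset, value in enumerate(row[1:], start=1):
            out[ai + offset][ai] += value
    return out


x = (1.0, 2.0, 3.0)
H = fill_symmetric(x)
full = [hessian_row(x, k) for k in range(len(x))]
```

Under this reading, the entries before the diagonal are not assumed to be zero. They were already written by the column assignment of an earlier row, which is why `row[1:]` skips the diagonal element.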

Thank you very much for your time and effort in maintaining this project. Your help is greatly appreciated.

Best regards,
Shun Lu
