
Understanding meta-learning #2

@embarrassing-noob-questions

Description

Hi, thanks for this repo, and please excuse the noob question: I am trying to understand the meta-learning side better.

If it's OK with you, I'll focus first on Titans/ATLAS, since you seem to understand it well:

Suppose I want to train a model to do well on unseen long-context problems. In other words, I want to teach the model to use its memory effectively on long contexts (i.e., contiguous sequences longer than its context window).

So I would think that at training time I'd feed it long texts, many times longer than its context window, and after each such text I'd somehow re-initialize the memory, the same way I would at inference.
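To make the question concrete, here is a minimal sketch of the training loop I have in mind. Everything here is hypothetical (the class, the running-sum "memory", and the chunking) and is not taken from the repo; it just illustrates "process each document in context-window-sized chunks, carry memory across chunks, re-init memory between documents":

```python
CONTEXT_WINDOW = 4  # toy context-window length, purely illustrative

class TinyMemoryModel:
    """Stand-in for a memory-augmented model; memory is just a running sum."""
    def __init__(self):
        self.memory = 0.0

    def reset_memory(self):
        # What I imagine happening between documents (and at inference).
        self.memory = 0.0

    def train_step(self, chunk):
        # Stand-in for the inner memory update: memory absorbs the chunk.
        self.memory += sum(chunk)
        return self.memory

def train_on_corpus(model, documents):
    states = []
    for doc in documents:
        model.reset_memory()  # re-init memory per document
        # Walk the document in context-window-sized chunks.
        for i in range(0, len(doc), CONTEXT_WINDOW):
            chunk = doc[i:i + CONTEXT_WINDOW]
            states.append(model.train_step(chunk))
    return states

docs = [[1, 1, 1, 1, 2, 2], [3, 3]]
states = train_on_corpus(TinyMemoryModel(), docs)
# Memory accumulates within doc 1 across its two chunks, then resets for doc 2.
```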

I don't see any of this in the code, which means I must be missing something.
