Add support for causal LM fine-tuning #80
Conversation
Signed-off-by: gkumbhat <kumbhat.gaurav@gmail.com>
alex-jw-brooks left a comment
Thanks @gkumbhat, looks good! Just a few things.
Also, a bit of a side note: do you think we should remove support (for now) for HFAutoSequenceClassifier? It seems effectively unusable between the trainer changes and the tokenizer builder stuff, and it's confusing to have it there when we don't enable it for anything.
```python
lr: float = 2e-5,
# Directory where model predictions and checkpoints will be written
checkpoint_dir: str = "/tmp",
**training_arguments,
```
This is better! Can you link the trainer args in the docstring, though?
```python
**training_arguments,
):
    """
# FIXME: Below is currently configured for Seq2Seq only
```
This should be removed, right?
Yep, good catch! Will remove this.
```python
"<NLP39984681E>",
NotImplementedError(
    f"Generation on {type(self.model)} not supported \
    currently! Please try saving and running this model in TGIS."
```
Oof. Does exporting via the trainer save API and then reloading give you a transformers model back? I wonder if it would be better to have the first inference call export and reload with a warning, until we find something better or implement a causal LM trainer that does something similar. Slow feels better than completely broken here, IMO.
Or, is there any way we can cast to the seq2seq trainer and leverage its generate API? I guess that probably doesn't handle shifting etc. the same way...
Yeah, I think converting to the seq2seq trainer could lead to weird mismatch issues.
Saving and reloading is certainly an option; it would simplify this block of the run function entirely. But it could be less efficient: the model is already on the appropriate devices at this point, so by loading it again we would lose the distribution, which is mainly what I was trying to preserve here.
But certainly, not having a solution for causal LM would not be great either.
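The export-and-reload fallback floated above could be sketched roughly like this. The wrapper class and export path are hypothetical names for illustration; `trainer.save_model` and `AutoModelForCausalLM.from_pretrained` are the standard transformers APIs:

```python
import logging

log = logging.getLogger(__name__)


class LazyReloadGenerator:
    """Hypothetical fallback: on the first generate() call, export the
    trained model via the trainer's save API and reload it as a plain
    transformers causal LM, warning about the one-time cost."""

    def __init__(self, trainer, export_dir):
        self._trainer = trainer
        self._export_dir = export_dir
        self._model = None  # reloaded lazily on first use

    def generate(self, input_ids, **kwargs):
        if self._model is None:
            log.warning(
                "First inference call: exporting and reloading the model; "
                "slow, but better than a broken generation path."
            )
            # Deferred import so the wrapper can be defined without transformers
            from transformers import AutoModelForCausalLM

            self._trainer.save_model(self._export_dir)
            self._model = AutoModelForCausalLM.from_pretrained(self._export_dir)
        return self._model.generate(input_ids, **kwargs)
```

The trade-off discussed above still applies: the reloaded model loses the existing device distribution, so this is only a stopgap until a proper causal LM generation path exists.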
```python
device = PeftPromptTuning._get_device(device)
inputs = {k: v.to(device) for k, v in tok_tensors.items()}
```
```python
inputs = {k: v.to(self.model.device) for k, v in tok_tensors.items()}
```
FYI @rawkintrevo is actually making this change in a separate PR (it's issue #3). Can we put it back as part of this PR and use his change when it's ready instead? Since this PR is primarily targeting fine-tuning anyway.
Ah, true. I had to make this change to make some tests pass 😄 but yes, I can change it back.
```python
    "device_placement": True,
}
```
```python
accelerator = Accelerator(**accelerator_args)
```
Why build a separate dict here?
I was playing with an optional parameter regarding cpu=True, but that didn't work well, so I removed it; this dict is kind of left over from that. I will switch it back to direct arguments instead of a separate dict.
```python
@pytest.fixture()
def set_cpu_device(request):
```
Nice - thanks for adding this
```python
2. compute_metrics
3. callbacks
4. preprocess_logits_for_metrics
"""
```
Same question about documenting the kwargs here in the docstring (at least the non-expanded ones). I assume the other one probably needs it too.
alex-jw-brooks left a comment
Looks awesome! Some small typos and stuff, but LGTM
```python
# eval_steps=1,
# load_best_model_at_end
**training_arguments,
**dtype_based_params,
```
It might be a nice good-first-issue in the future to cleanly make sure there aren't collisions in these expanded dicts, but for now we can leave it.
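A possible shape for that future collision check, as a hypothetical helper (not part of this PR):

```python
def merge_without_collisions(*dicts):
    """Merge keyword-argument dicts, raising on duplicate keys instead
    of letting a later dict silently overwrite an earlier one."""
    merged = {}
    for d in dicts:
        collisions = merged.keys() & d.keys()
        if collisions:
            raise ValueError(
                f"Conflicting keyword arguments: {sorted(collisions)}"
            )
        merged.update(d)
    return merged
```

The training-arguments call above could then expand `**merge_without_collisions(training_arguments, dtype_based_params)` and fail loudly on overlap rather than silently preferring the last dict.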
Co-authored-by: Alex Brooks <alex.brooks@ibm.com>
Add support for causal LM fine-tuning
Closes #77
Changes
Adds a `set_cpu_device` fixture which changes the CUDA environment variable and patches the `is_available` function in `torch.cuda`.
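The fixture described above might look roughly like this. This is a sketch using pytest's built-in `monkeypatch` fixture; the actual implementation in the PR (which takes `request`) may differ:

```python
import pytest


@pytest.fixture()
def set_cpu_device(monkeypatch):
    """Force tests onto CPU: hide GPUs from CUDA via the environment
    variable and patch torch.cuda.is_available to report False."""
    monkeypatch.setenv("CUDA_VISIBLE_DEVICES", "")
    # Deferred import so collecting this file does not itself require torch
    import torch

    monkeypatch.setattr(torch.cuda, "is_available", lambda: False)
    yield
```

Using `monkeypatch` means both the environment variable and the patched function are automatically restored after each test, so GPU-enabled tests in the same session are unaffected.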