[WIP] Implement 2nd pass training using 1-best decoding results from the 1st pass network#198
csukuangfj wants to merge 15 commits into k2-fsa:master
Conversation
snowfall/models/second_pass_model.py
# now x2 is (B, T, F)
x_concat = torch.cat((padded_acoustics, x2), dim=-1)
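For reference, concatenating along the last dimension stacks the feature dims while leaving batch and time untouched. A minimal sketch (tensor names follow the diff; the sizes are illustrative, not the PR's actual dimensions):

```python
import torch

B, T, F1, F2 = 2, 5, 80, 128  # illustrative sizes only

padded_acoustics = torch.zeros(B, T, F1)  # acoustic features, (B, T, F1)
x2 = torch.zeros(B, T, F2)                # first-pass network output, (B, T, F2)

# Concatenate along the feature (last) dimension.
x_concat = torch.cat((padded_acoustics, x2), dim=-1)
assert x_concat.shape == (B, T, F1 + F2)
```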
TODO(fangjun): Use cross attention here
- query: x2
- key and value: padded_acoustics
and masked self-attention
- key, query, and value: x2
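The TODO above could be sketched with PyTorch's `nn.MultiheadAttention`. This is a hedged sketch of the idea, not the PR's implementation; the embedding size, head count, and variable names are assumptions:

```python
import torch
import torch.nn as nn

B, T, E = 2, 5, 16  # batch, time, embedding dim (illustrative)

x2 = torch.randn(B, T, E)                # second-pass features (query)
padded_acoustics = torch.randn(B, T, E)  # first-pass acoustics (key/value)

# Cross attention: query from x2, key and value from padded_acoustics.
cross_attn = nn.MultiheadAttention(embed_dim=E, num_heads=4, batch_first=True)
out, _ = cross_attn(query=x2, key=padded_acoustics, value=padded_acoustics)

# Masked self-attention over x2: a causal mask (True = blocked) prevents
# position i from attending to positions > i.
self_attn = nn.MultiheadAttention(embed_dim=E, num_heads=4, batch_first=True)
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
out2, _ = self_attn(x2, x2, x2, attn_mask=causal_mask)

assert out.shape == (B, T, E) and out2.shape == (B, T, E)
```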
@@ -0,0 +1,484 @@
#!/usr/bin/env python3
common2.py is the same as common.py, except that it adds some code supporting the second-pass model. To avoid conflicts with master, a new file is used. The same goes for the other xxx2.py files, e.g., lm_rescore2.py and mmi2.py.
import k2

class Nbest(object):
This file implements the Nbest class proposed in
#232 (comment)
Please review whether it matches the proposal.
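The proposal itself is not quoted in this thread. In k2, such a class would presumably pair an FsaVec of n-best paths with a ragged shape mapping paths back to utterances; below is a dependency-free Python sketch of that idea, where plain lists stand in for the k2 structures. The names and layout are assumptions, not the PR's actual code:

```python
class Nbest(object):
    """Holds n-best hypotheses for a batch of utterances.

    Sketch only: the real class would wrap a k2.Fsa (one FSA per path)
    plus a k2.RaggedShape; here a flat list of paths and a row_splits
    list stand in for both.
    """

    def __init__(self, paths, row_splits):
        # paths[i] is the i-th hypothesis (e.g., a list of word IDs).
        # paths[row_splits[u] : row_splits[u+1]] belong to utterance u.
        assert row_splits[0] == 0 and row_splits[-1] == len(paths)
        self.paths = paths
        self.row_splits = row_splits

    @property
    def num_utterances(self):
        return len(self.row_splits) - 1

    def paths_of(self, utt):
        """Return the hypotheses belonging to utterance `utt`."""
        start, end = self.row_splits[utt], self.row_splits[utt + 1]
        return self.paths[start:end]
```

For example, `Nbest([[1, 2], [1, 3], [4]], [0, 2, 3])` describes two utterances: the first has two hypotheses, the second has one.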
That's great! Yes, it looks like what I had in mind.
I assume you would separate it from this PR, though? Or maybe even submit it to k2, since there's a lot going on here.
Will move it to k2.
It implements #106 (comment)
The training objective is decreasing and seems to be converging. I will post the decoding results later.