Skip to content

the-other-thanos-A4#54

Open
the-other-thanos wants to merge 3 commits into
mainfrom
the-other-thanos-A4
Open

the-other-thanos-A4#54
the-other-thanos wants to merge 3 commits into
mainfrom
the-other-thanos-A4

Conversation

@the-other-thanos

Copy link
Copy Markdown
Collaborator

ready for review

@juwei95

juwei95 commented May 7, 2026

Copy link
Copy Markdown
Collaborator

✅ Your code works and it produced the output quickly for all files I tested it with.

⚠️ The only major issue I could find with your solution is that your program refuses to process standard input.
With your implementation, if no filename is given argparse raises an error.
According to the task specification:

If no input file is given, the program should read the training text from standard input.

💡 You can make the filename optional by using the nargs=? parameter.

🤩 I liked that you took special care to handle UTF-8 encoding. This is something I missed in my solution and it caused problems for the reviewer on Windows.

🤔 FYI: I noticed that the contexts variable in the find_cont function seems to be unused.

⚙️ It's a good idea to ensure that there always exists an end state (***) in the cont_dict.
However, this does not guarantee termaination of the program.
If the user has to kill the program, no output will be produced, because gen_text never returns.
A possible approach to improving this is to make use of the generator pattern and print as you go.

👍 Overall I really appreciate the simplicity of your solution, especially the find_cont function is very straight forward and easy to follow.

Good job keep it up!

@the-other-thanos

Copy link
Copy Markdown
Collaborator Author

Thanks, for the review. I totally missed the stdin part and updated the function now. I did not do it very carefully but I hope it works most of the times xD. The contexts variable was actually a relic of me wanting to see what I am building as a list, I commented it out now.
About the termination, I am not sure I understand how the program could not terminate. I mean it could take a while for a very cyclic text where the end state is just a possibility out of many but eventually it should be reached by chance no? Maybe I misunderstand something still, would be happy to hear more. I wanted to do a different fail-safe then adding a token limit for the output but if it gets stuck in loops that would be bad 🗡️
Thanks again, for the thorough review :)

@juwei95

juwei95 commented May 8, 2026

Copy link
Copy Markdown
Collaborator

You're absolutely right, it will probably terminate in practice, and it reliably did during my testing.
If I understand the code corretly, in theory it's possible that there exists a combination of training text and RNG seed for which the program will run forever, because the *** token might never get picked.

It's just an "edge case" to consider :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants