Skip to content

Comments

tsv delete option#8

Open
stratusjerry wants to merge 3 commits intojojje:masterfrom
stratusjerry:tsv_list
Open

tsv delete option#8
stratusjerry wants to merge 3 commits intojojje:masterfrom
stratusjerry:tsv_list

Conversation

@stratusjerry
Copy link

This PR:

  • Fixes some typos and whitespace
  • Add option to delete tsv files after import
  • Adds more temp files .gitignore

@Wojciech-Wawrzyszko
Copy link

Any plans of merge that pull request ;) ?

@jojje
Copy link
Owner

jojje commented Feb 4, 2026

Hi, well not the gitignore and new option. Don't want this to become a frankenstein. If people want specific features for whatever reason, then they can fork the project to scratch a niche itch.

However the typo fixes are welcome. I think I fixed some of those when I addressed a different issue, but see there are still some left. Like eposide and DeNiro :)

So if you can reduce the PR to only the remaining such issues, I'd be happy to merge.

@jojje
Copy link
Owner

jojje commented Feb 4, 2026

To shortcircuit all this since I was already having this project open for the time being, I've fixed the typos you noted. Thanks for that!

I think this PR can be closed as the other aspects I've already commented on.

Next time, please open an Issue first to describe why a PR is warranted and what use project it's trying to solve.

@stratusjerry
Copy link
Author

@jojje thanks for the comments,

  1. the .gitignore additions were added because if the program fails to run (or even if it runs to completion) some of the files included are added to the local git dir and tracked, I used a paired down list of https://github.com/github/gitignore/blob/main/Python.gitignore
    See screenshot of files added during a regular run Screenshot 2026-02-04 153339
  2. Triple quoted strings for argparse descriptions naturally handles whitespace
  3. adding option to remove tsv after ingest helped when running on environments like containers with limited storage but I understand if you don't want to add that feature

I was thinking of adding PRs to add support for python f'strings and uv package manager, would those be accepted?

@jojje
Copy link
Owner

jojje commented Feb 4, 2026

I see, thanks for the problem definitions.

As to 1, I see the point. Sure, add those. I'll probably add "py-build" stuff (egg info etc) as well, as those tend to clutter up the workspace. The justification would be good DX ergonomics

Re #2, I see. Didn't know argparse was clever about this. Now I understand why you changed the string delimiters. Ok, sounds good; justification being that it avoids similar unintentional whitespace collapses in the future when output text is edited. One Ask though, please used double-quotes in the triple form. I.e. """ instead of single quotes '''.

Please create two separate PRs for these, since they tackle two orthogonal concerns.

Now, finally as to your "docker" concern. There are other ways to solve that. Most common is to store any data one wants persisted (such as the database itself) on a volume, and let scratch space be automatically garbage collected whenever a new container is restarted from the same image. E.g. Doing import from /tmp, and mounting a volume to "/mnt", then designating the CLI to write to that volume "imdb-sqlite --db /mnt/imdb.db".

There are of course other ways someone might want to solve that problem; that's what I meant that this sort of feature is likely to be very user-specific as per how they want to cleanup, archive, version etc the data source files from imdb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants