Skip to content

Idea for 0 and negative values.#10

Open
vwwv wants to merge 5 commits into
twanvl:masterfrom
vwwv:master
Open

Idea for 0 and negative values.#10
vwwv wants to merge 5 commits into
twanvl:masterfrom
vwwv:master

Conversation

@vwwv
Copy link
Copy Markdown

@vwwv vwwv commented Jan 18, 2016

I case you think it is ok handling 0 or negative occurrences, here it is pull request with a possible implementation. I had some doubts though:

  • shall size return the amount of positive occurrences? or any occurrences different than 0?
  • Shall member return True if the occurrence is negative or only when it is positive?
  • Is there a way to implement it such intersection is intuitive?

@rcook
Copy link
Copy Markdown
Contributor

rcook commented Jan 18, 2016

I'm not sure what the meaning of multiset with 0 or negative occurrences is. This change would make a break with the mathematical concept of multiset (https://en.wikipedia.org/wiki/Multiset) where the multiplicity is constrained to 1 or more.

If I wanted to track zero or negative values associated with a key, I would probably use Data.Map (https://hackage.haskell.org/package/containers-0.5.7.1/docs/Data-Map.html) to map my values to Int.

@vwwv
Copy link
Copy Markdown
Author

vwwv commented Jan 18, 2016

I see, well, that's true, it would not be a multiset in the mathematical sense (I think it could be modeled as a free abelian group https://en.wikipedia.org/wiki/Free_abelian_group).

About 0 occurrences, it already handles them consistently everywhere but on fromOccurList like functions. When an element does not occur, it is considered to occur 0 times (occur x (fromList []) == 0). But fromOccurList [] /= fromOccurList [(x,0)] . If those functions were changes to handle 0, just filtering 0 occurrences elements after the Map is built, I think it could eliminate some bugs.

About the negatives values, the thing is, there are people using Multiset to count, so having negative number is quiet useful, but annoying to re-implement Map every time for every project. Actually it could be nice generalize it to anything with an Ord and Monoid intances, so it could also work to count over Double or Integer.

But its true it change the behavior, maybe could it be an Idea to move it to another module within the package and give it another name instead of MultiSet?

@rcook
Copy link
Copy Markdown
Contributor

rcook commented Jan 19, 2016

If it were to have a name other than MultiSet (FreeAbelianGroup, maybe?), we would probably want to maximize the amount of code shared by the implementations. And write some tests! I added basic doctests for my contribution (see commit d4ea799).

@rcook
Copy link
Copy Markdown
Contributor

rcook commented Jan 19, 2016

Looks like something like this already exists: https://hackage.haskell.org/package/signed-multiset. They're using the name "signed multiset".

@vwwv
Copy link
Copy Markdown
Author

vwwv commented Jan 19, 2016

I was not aware of that package, which even has a similar interface! Now I'm not longer sure if it would be a good idea to duplicate that module; What do you think? On one side duplication is not nice, but on the other they have some functions that are similar but not the same, what can lead to errors, for instance, what there's union, here is unionMax, and what there is unionAdditive here is just union.

If you think it is a good idea to have SignedMultiSet (probably the best name) within this package, I can write it refactoring the current impl to share common code, and some tests :) .

@vwwv
Copy link
Copy Markdown
Author

vwwv commented Jan 19, 2016

Apart from that, there's still the question what to do with zero and negative values when they appeared on fromOccurList functions. I think the current implementation just silently add them destroying the only positive invariant; maybe it would be better throwing an error? or just truncating them to zero (deleting them from the set) ?

@twanvl
Copy link
Copy Markdown
Owner

twanvl commented Jan 19, 2016

I agree that fromOccurList should be fixed. I think the most sensible thing would be to silently remove 0s and give an error for negative values.

For compatibility with the signed-multiset package, perhaps it makes sense to add unionAdditive as an alias for the current union. That allows you to be explicit about what kind of union you want at least.

@vwwv
Copy link
Copy Markdown
Author

vwwv commented Jan 21, 2016

I set it back to behave as it used, just adding a Map.filter (>0) to the fromOccurList functions (the ones being already O(n) or O(n*log n)). I was not sure what would be better, to raise and error or silently truncating negative occurrences, so at the end I did not add the error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants