[cascading3] Migrate core, commons and related#1521

Merged: rubanm merged 12 commits into cascading3 from rubanm/cascading3/core on Apr 13, 2016

@rubanm

@rubanm rubanm commented Feb 19, 2016

Contributor

part of #1465
based on Cyrille's work in #1446

Most of the interesting changes are in:

  • Operations.scala -- handles both the old and the new Cascading aggregate-by thresholds
  • PlatformTest.scala -- some updated tests; hash-joining and then merging the result with one side of the same join is no longer supported in cascading3

Cascading fabric selection changes will be sent in a separate PR.

@johnynek

Contributor

@cchepelov take a look?

Comment thread build.sbt Outdated
"com.twitter" %% "algebird-core" % algebirdVersion,
"com.twitter" %% "chill" % chillVersion,
"com.twitter.elephantbird" % "elephant-bird-cascading2" % elephantbirdVersion,
"com.twitter.elephantbird" % "elephant-bird-cascading3" % elephantbirdVersion,

Contributor

Can we move this to where the versions are (e.g. an elephant-bird-artifact setting), so we can keep all these switches in one place?
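
A minimal sketch of what this suggestion could look like in build.sbt, assuming the artifact name is held in a value next to the version. The name `elephantbirdArtifact` is hypothetical and the version string is a placeholder, not the value used in this PR:

```scala
// build.sbt fragment (sketch only)
// Placeholder version: substitute the real Elephant Bird version.
val elephantbirdVersion = "x.y.z"
// Hypothetical switch: change this single value to move between
// "elephant-bird-cascading2" and "elephant-bird-cascading3".
val elephantbirdArtifact = "elephant-bird-cascading3"

libraryDependencies ++= Seq(
  "com.twitter.elephantbird" % elephantbirdArtifact % elephantbirdVersion
)
```

Keeping the artifact name beside the version means a Cascading switch touches one place rather than every dependency list.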

@johnynek

Contributor

Does this pass e2e tests or CI at Twitter?

@cwensel

cwensel commented Feb 19, 2016


If we can get an isolated Cascading3 test case we can take a stab at promoting this from 'no longer supported' to 'bug' and then to 'resolved'.

@cchepelov

Contributor

Hi @posco @rubanm
Great to see a lot of progress! Will have to come back to this next week (away from keyboard this week).

Re. the spurious ".forceToDisk": indeed, the code should do the right thing without it. The transform facility @cwensel wrote about looks like the correct place to insert the necessary Boundaries.

  -- Cyrille


rubanm and others added 4 commits March 2, 2016 08:28
Hadoop's -libjars doesn't support wildcards; with large classpaths it's easy to exhaust the max arg length on Linux/OS X when running commands. This acts as a filter above our interaction with the generic options parser to expand wildcards.
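
The idea in the commit message above can be illustrated with a small, standard-library-only sketch. This is not the code from the commit: the object name `LibJarsExpander`, its methods, and the assumption that a wildcard entry has the classpath-style form `dir/*` are all hypothetical.

```scala
import java.io.File

// Hypothetical helper, not the implementation from this PR.
object LibJarsExpander {
  // Assumption: a wildcard entry looks like "some/dir/*" and should become
  // every .jar file directly inside that directory. Other entries pass through.
  def expandEntry(entry: String): Seq[String] =
    if (entry.endsWith("/*")) {
      val dir = new File(entry.dropRight(2))
      // listFiles returns null if dir does not exist or is not a directory
      Option(dir.listFiles).map(_.toSeq).getOrElse(Seq.empty)
        .filter(f => f.isFile && f.getName.endsWith(".jar"))
        .map(_.getPath)
        .sorted
    } else Seq(entry)

  // Rewrite only the comma-separated value that follows "-libjars";
  // every other argument is left untouched and in order.
  def expand(args: Seq[String]): List[String] = args.toList match {
    case "-libjars" :: jars :: rest =>
      val expanded = jars.split(",").toList.flatMap(expandEntry).mkString(",")
      "-libjars" :: expanded :: expand(rest)
    case head :: rest => head :: expand(rest)
    case Nil => Nil
  }
}
```

Under these assumptions, the caller can pass a short `-libjars lib/*` on the command line, and the expanded argument list is what gets handed to the generic options parser.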
@johnynek

Contributor

@cwensel about the repro: It should be as easy as a cascading HashJoin followed by Merge followed by GroupBy. Sorry kind of swamped...

@rubanm

rubanm commented Apr 11, 2016

Contributor Author

@johnynek This branch now passes e2e tests at Twitter (with a related EB change twitter/elephant-bird#465). I'm working on piloting some user jobs.

@sriramkrishnan

Contributor

@rubanm this is pretty amazing work!

@johnynek

Contributor

Amazing!

@johnynek

Contributor

looks good to me to merge into cascading3 branch.

does this have all the changes from current develop branch?

@rubanm

rubanm commented Apr 12, 2016

Contributor Author

@johnynek Thanks for the review! RC6 is currently being released to twitter source. I plan to merge develop once that release is done so it's in tandem, with the joinWithTiny fix to follow.

@johnynek

Contributor

@rubanm sounds good. Way to push through on this!
