Some changes to address blocking after a node is lost by sveesible · Pull Request #2 · couchbase/spymemcached

sveesible · 2014-04-22T02:39:07Z

BinaryOperationFactory improperly set a null get callback in place of the gets callback when cloning get operations, which caused bad behavior under load when bulk gets were redistributed.
Redistribution of write operations just popped the operation back onto the original node and then continued through the clone logic.
Closing channel.socket to be consistent with the other close logic.
Using an explicit iterator in handleIO for Selector.selectedKeys instead of a foreach loop, because the set is modified while it is being looped over.
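
As an illustration of the last point, here is a minimal sketch of draining a key set with an explicit iterator. It is not the spymemcached code: `SelectedKeysSketch` and `drain` are hypothetical names, a plain `HashSet<String>` stands in for the selector's key set, and it assumes each key is removed once it has been handled. Removing through `Iterator.remove()` is the supported way to do that; calling `remove` on the set itself from inside a foreach loop typically fails with a `ConcurrentModificationException`.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for a handleIO-style loop; not the library's code.
final class SelectedKeysSketch {

  // Handles every key and removes it through the iterator, so the set can
  // safely shrink while it is being walked.
  static int drain(Set<String> selectedKeys) {
    int handled = 0;
    Iterator<String> it = selectedKeys.iterator();
    while (it.hasNext()) {
      String key = it.next();
      handled++;   // placeholder for the real per-key I/O handling
      it.remove(); // safe removal of the key just returned by next()
    }
    return handled;
  }

  public static void main(String[] args) {
    Set<String> keys = new HashSet<>();
    keys.add("node-a");
    keys.add("node-b");
    keys.add("node-c");
    int handled = drain(keys);
    System.out.println(handled + " handled, " + keys.size() + " left");
  }
}
```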

sveesible · 2014-04-22T02:45:33Z

src/main/java/net/spy/memcached/protocol/binary/BinaryOperationFactory.java

In my testing, this seems to be the big one: it clogs up the mechanism when a server node is suddenly lost.

Please see this change set where the bug was introduced last July
00f2e78

yes, that looks like a typo/bug.

http://review.couchbase.org/#/c/36221/

daschl · 2014-04-23T06:53:03Z

Hi, first, thanks very much for your effort in fixing bugs here!

Looking at your PR, I identified two things that do indeed look wrong:

no return statement for the redistribute
the wrong get/gets when the op gets cloned.
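
To make the first item concrete, here is a rough sketch of how a missing `return` lets execution fall through into the clone path. Everything in it is hypothetical (`RedistributeSketch`, the `requeueDirectly` flag, and the two lists are illustrative stand-ins, not the spymemcached implementation); it only shows the control-flow pattern under discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a redistribute step; names are illustrative only.
final class RedistributeSketch {
  final List<String> requeued = new ArrayList<>(); // ops put back as-is
  final List<String> cloned = new ArrayList<>();   // ops cloned for other nodes

  // requeueDirectly is an assumed flag meaning "re-add this op without cloning".
  void redistribute(String op, boolean requeueDirectly) {
    if (requeueDirectly) {
      requeued.add(op);
      return; // without this return, the op would also be cloned below
    }
    cloned.add(op + "-clone");
  }

  public static void main(String[] args) {
    RedistributeSketch sketch = new RedistributeSketch();
    sketch.redistribute("set k1", true);
    sketch.redistribute("get k2", false);
    System.out.println("requeued=" + sketch.requeued + " cloned=" + sketch.cloned);
  }
}
```

With the `return` in place, each operation ends up in exactly one of the two lists.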

For your other code, I added some additional comments that need a bit of clarification.

Note that we do the actual code review through Gerrit; GitHub is just a mirror here. You can either set up a Gerrit account and work through the process, or I'll do it for you and attribute you in the commit message, whichever you prefer.

Cheers,
Michael

sveesible · 2014-04-23T14:22:54Z

I don't see where to get set up with Gerrit, so I suppose I'd defer to you.

daschl · 2014-04-23T14:29:25Z

@sveesible it is described here, but it seems reasonable that I'll do it; it's quicker, I guess.

http://www.couchbase.com/wiki/display/couchbase/Contributing+Changes

This should not default to 1, because the value must be 0 for the isActive() call to return true and thus allow redistributed ops to be put into this node's operation queue.
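
The effect of the default can be sketched as follows. This is a simplified model under stated assumptions, not the actual TCPMemcachedNodeImpl: `NodeSketch`, its `reconnectAttempt` counter, and the `connected` flag are hypothetical, and the only behavior taken from the comment above is that `isActive()` needs the counter to be 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified node; only models the counter check described above.
final class NodeSketch {
  private final AtomicInteger reconnectAttempt;
  private final boolean connected;

  NodeSketch(int initialReconnectAttempt, boolean connected) {
    this.reconnectAttempt = new AtomicInteger(initialReconnectAttempt);
    this.connected = connected;
  }

  // Active only when no reconnect is pending and the channel is connected.
  boolean isActive() {
    return reconnectAttempt.get() == 0 && connected;
  }

  public static void main(String[] args) {
    System.out.println("default 0: " + new NodeSketch(0, true).isActive());
    System.out.println("default 1: " + new NodeSketch(1, true).isActive());
  }
}
```

In this model a connected node whose counter starts at 1 reports itself as inactive, so it would never be offered redistributed operations.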

changes to address blocking on lost server with operations bulkget op…

4e067eb

…erations. closing channel.socket so as not to orphan the socket. Using an iterator in handleIO instead of a foreach loop which isn't thread safe.


Update TCPMemcachedNodeImpl.java

3c71f81

sveesible wants to merge 2 commits into couchbase:master from sveesible:master