# Pastebin qbfIAtHj

myheadhurts> I had a chance to look at the build request code, and from what I can gather, the problem of sequential build slaves comes down to the following line: https://github.com/buildbot/buildbot/blob/master/master/buildbot/process/builder.py#L357
20:47 If we wrap everything there and below into another method and defer that, would that solve our parallel slaves issue?
20:47 or is there more to it
21:07 I think indeed, this could be a good method
21:07 sorry, don't know much about deferred so this whole thing is a maze to me
21:07 just think about it as threads
21:08 except it is cooperative threads, that can schedule at every yield
21:08 I think the whole startBuildFor could actually be not waited
21:09 I would recommend you start experimenting
21:09 but you should experiment with an integration test
21:09 and create a fake latentBuildslave using the local protocol
21:10 So not to take too much of your time, but if I wanted this to be not waited: https://github.com/buildbot/buildbot/blob/master/master/buildbot/process/builder.py#L513
21:10 do I just `return self._startBuildFor(workerforbuilder, breqs)`
21:11 no.
21:11 you need to change the caller
21:12 because, here the caller expects that it returns when the build is actually started
21:17 https://github.com/buildbot/buildbot/pull/2067
21:18 I'm away from my work env right now
21:18 so I just hacked away the change via github
21:18 sure, that's very helpful
03:33 ↔ myheadhurts nipped out #buildbot
Thursday, March 24th, 2016
00:53 ↔ myheadhurts nipped out #buildbot
14:46 So I made some minor progress. EC2 instances are now launching in parallel, however, it tries to substantiate the same worker multiple times.. and ends up with one of them failing and requeuing its jobs.
14:46 I think I need to modify https://github.com/buildbot/buildbot/blob/master/master/buildbot/process/workerforbuilder.py#L253
14:47 with something like `if self.state == SUBSTANTIATING: lock.waitUntilMaybeAvailable...`
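Editorial sketch, not part of the log: the "same worker gets substantiated several times" problem described at 14:46 can be illustrated by letting every caller share a single in-flight launch. This uses stdlib `asyncio` as a stand-in for Twisted Deferreds, and every name here (`LatentWorkerSketch`, `_start_instance`, `start_build`) is hypothetical rather than Buildbot's API.

```python
import asyncio


class LatentWorkerSketch:
    """Hypothetical stand-in for a latent worker; not Buildbot's class."""

    def __init__(self):
        self.launch_count = 0        # how many times the instance was really started
        self._substantiation = None  # in-flight (or finished) launch shared by callers

    async def _start_instance(self):
        # Placeholder for the slow cloud call (for example an EC2 launch).
        self.launch_count += 1
        await asyncio.sleep(0.01)
        return True

    def substantiate(self):
        # Start the launch only once; later callers get the same task back
        # and therefore wait for, and receive, the first launch's result.
        if self._substantiation is None:
            self._substantiation = asyncio.create_task(self._start_instance())
        return self._substantiation


async def start_build(worker, name):
    # Each build request waits on the shared launch before it would start.
    ok = await worker.substantiate()
    return (name, ok)


async def demo():
    worker = LatentWorkerSketch()
    builds = [start_build(worker, f"build-{i}") for i in range(5)]
    results = await asyncio.gather(*builds)
    return worker.launch_count, results


launch_count, results = asyncio.run(demo())
print(launch_count, results)
```

Five requests arrive concurrently but only one launch happens. A real implementation would presumably also need to reset the shared state when the worker is shut down or the launch fails; that part is left out here.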
14:47 1) Does this sound correct to you? 2) which kind of lock should I be using, I'm a little confused on that
15:04 hmm, nvm this needs to be in worker/base
15:39 Basically, I need help in doing a locking/waiting version of this: https://github.com/buildbot/buildbot/compare/master...aelsabbahy:parallel_latent?expand=1
16:18 can you send your WIP as a PER?
16:18 PR
16:20 Need to check with client, they have a weird approval process for opensourcing changes.
16:20 I'm already pushing it by having the code on github :\ but don't see any other way to get feedback from BB devs
16:21 ah :-/
16:21 The boolean toggle is a hack, an ugly one that causes race conditions
16:21 but it was more me trying to understand the issue better
16:21 I'm willing to help, but I would be very sad if the result is not open source
16:22 I'm 99% sure it will be.. just needs to get approved
16:22 There's no interest in maintaining a local fork
16:22 Soon there's going to be some ec2.py changes that you guys get from the client (other people's work, not mine)
16:22 ok
16:23 so, any thoughts on approach or guidance on this?
16:23 I think you should remove `self.botmaster.maybeStartBuildsForBuilder(self.name)`
16:23 did you try without?
16:23 I think the unclaim event should do the same
16:24 I did, but had the same issue
16:24 I'll make the change now though to be sure
16:24 I can see there is already an insubstantiating boolean
16:24 what is the diff
16:25 insubstantiate() doesn't seem to be called on start
16:25 but rather on completion
16:25 unless I misread it
16:26 gonna be honest with you, this is a bit out of my python reach, so I've been very much trial and erroring my way through it
16:27 yeah, I was testing with `self.botmaster.maybeStartBuildsForBuilder(self.name)` commented out, doesn't work properly
16:28 insubstantiating is a boolean, not a function
16:28 if you look at the code
16:28 `if self.insubstantiating or self.substantiating:`
16:29 but anyway in the idea I think you are doing things right
16:29 https://github.com/buildbot/buildbot/blob/master/master/buildbot/worker/base.py#L803-L804
16:29 this is the only location I see it being set in
16:30 _soft_disconnect, _substantiation_failed, buildFinished seem to be the callers of the method
16:30 That's why I assumed it's more about cleanup
16:31 ok insubstantiate is about shutting down the VM
16:31 not starting it
16:31 so looks good
16:32 What kind of lock, or pattern do I need to use, to basically have _substantiate() check for a lock, and wait for the lock if it's not available, then return the result of the previous substantiate call
16:32 I'm guessing that's what I need to do, but again, I'm kind of guessing :\
16:33 Most objects I looked at seem to be getting their locks passed in, so not sure what to use
16:34 So, I'm getting an informal thumbs up about creating a PR
16:35 I think there is no need to wait
16:35 to have lock
16:35 the boolean should be enough
16:35 we just need to make sure that a slave that is substantiating is considered like a slave that is building
16:37 should canStartBuild return false if it's substantiating (that seems wrong?)
16:37 I think it should return False
16:37 So with the current PR.. the issue is greatly diminished, but it still happens
16:37 as it cannotStartBuild *right now*
16:37 so instead of 5 builds getting rescheduled, I'm getting 1 or 2 builds
16:38 then when it is substantiated, you can `self.botmaster.maybeStartBuildsForWorker(self.workername)`
16:38 so that the master checks if there could be other builds that the slave could do in parallel
16:39 I really think you should work with unit tests, so that we can really make sure it works in all the cases
16:40 is there a guide for how to run the BB unit tests?
16:40 there is a big developer guide
16:40 and a very large set of examples
16:45 https://docs.buildbot.net/latest/developer/tests.html <- I see that
16:46 but http://trac.buildbot.net/wiki/RunningBuildbotWithVirtualEnv <- hmm, I can try that
17:22 Hmm, so it seems the changes aren't properly running the unclaimBuildRequests()
17:22 https://github.com/buildbot/buildbot/blob/master/master/buildbot/test/unit/test_process_buildrequestdistributor.py#L678
17:22 The actual is [10, 11] rather than just [11]
17:23 I put a print statement in isItStarted() in my PR and it doesn't ever print, is the callback code incorrect?
17:55 K, so this seems to be working better, except for weird test errors: https://github.com/buildbot/buildbot/compare/master...aelsabbahy:parallel_latent?expand=1
17:56 I'm getting a bunch of:
17:56 self.quiet_deferred.addCallback(check)
17:56 exceptions.AttributeError: 'NoneType' object has no attribute 'addCallback'
17:57 in the test suite, for buildrequestdistributor.py
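
Editorial sketch, not part of the log: the suggestion from 16:35 to 16:38 (treat a substantiating worker like a building one, then re-run scheduling once it is up) might look roughly like the following. It is a plain synchronous model under stated assumptions; `WorkerSketch`, `DistributorSketch`, and their methods are hypothetical names, not Buildbot's classes or API.

```python
class WorkerSketch:
    """Hypothetical model of a latent worker's state flags."""

    def __init__(self, name):
        self.name = name
        self.substantiating = False  # launch in progress
        self.substantiated = False   # instance is up
        self.building = False
        self.claimed = None          # request held while launching or building
        self.launches = 0

    def can_start_build(self):
        # A worker that is still launching counts as busy, like one that is building.
        return not (self.substantiating or self.building)


class DistributorSketch:
    """Hypothetical stand-in for the code that hands requests to workers."""

    def __init__(self, workers, requests):
        self.workers = workers
        self.pending = list(requests)
        self.started = []  # (request, worker name) pairs, in start order

    def maybe_start_builds(self):
        for worker in self.workers:
            if not self.pending:
                break
            if not worker.can_start_build():
                continue
            worker.claimed = self.pending.pop(0)
            if worker.substantiated:
                self._start(worker)
            else:
                worker.substantiating = True
                worker.launches += 1

    def _start(self, worker):
        worker.building = True
        self.started.append((worker.claimed, worker.name))

    def substantiation_finished(self, worker):
        worker.substantiating = False
        worker.substantiated = True
        self._start(worker)
        self.maybe_start_builds()  # re-check the queue once the worker is up

    def build_finished(self, worker):
        worker.building = False
        self.maybe_start_builds()


w1, w2 = WorkerSketch("w1"), WorkerSketch("w2")
dist = DistributorSketch([w1, w2], [10, 11, 12])
dist.maybe_start_builds()
dist.maybe_start_builds()  # a second pass while launching must not relaunch
dist.substantiation_finished(w1)
dist.substantiation_finished(w2)
dist.build_finished(w1)
print(dist.started, w1.launches, w2.launches)
```

In this model both workers launch in parallel, the repeated scheduling pass does not trigger a second launch, and the third request is only handed out once a worker is free again.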