Bryan Stansell bryan@conserver.com
Fri, 31 Jul 2015 19:16:59 GMT
I'm still at a bit of a loss as to what's up. I'm not even sure what info would help yet. The extreme measure would be to run conserver in debug mode so that you get copious logs and reproduce the issue. But, even with that, I'm not sure the existing logging would help. And with several thousand connections, that's a lot of stuff.

I can only look at this stuff in my free time, but I'm still investigating. We could/should probably email directly until we have some sort of result that can be shared with the entire list.

Bryan

> On Jul 31, 2015, at 10:41 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>
> Hi Bryan,
> I was wondering if you had any ideas about this issue? Or if you
> needed any more info from me to investigate further?
>
> thanks
>
> On Tue, Jul 28, 2015 at 7:02 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>> On Sun, Jul 26, 2015 at 2:43 AM, Bryan Stansell <bryan@conserver.com> wrote:
>>> I keep looking at code and thinking about this, and the only thing that makes sense is a bug somewhere. When a SIGUSR1 or '-z bringup' happens, it just walks the consoles and performs a ConsInit() on them. The exact same thing happens when you connect to a console and "open" it (with some extra stuff for feedback to the client). So, my only explanation is that it's a bug somewhere.
>>>
>>> And just to clarify, if you run 'console -z bringup' multiple times, they continue to get "connect timeout: forcing down" messages? But as soon as you connect to one, it'll come up on the first try? I just want to make sure the situation is correct so I can, hopefully, think about how a bug might produce the situation and try and find a fix. Right now, though, I'm just scratching my head.
>>
>> Yes, that's exactly the behavior I've seen.
>>
>>>
>>> Bryan
>>>
>>>> On Jul 24, 2015, at 10:36 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>>>>
>>>> Hi Bryan,
>>>> When I run "console -v -z bringup", I see a lot of "console
>>>> initializing" for every session that is currently down. Then 10
>>>> seconds later, I see:
>>>> connect timeout: forcing down
>>>>
>>>> for every console that was previously listed as initializing.
>>>>
>>>> For a console which was down (c042.ytr001.ix), and where I manually
>>>> connected and brought it up immediately, I see:
>>>>
>>>> [Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] automatic reinitialization
>>>> [Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] console initializing
>>>> [Thu Jul 23 14:05:12 2015] conserver (13867): ERROR: [c042.ytr001.ix] connect timeout: forcing down
>>>> [Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] login ncconserverprod@localhost
>>>> [Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] console initializing
>>>> [Thu Jul 23 14:05:26 2015] conserver (13867): [c042.ytr001.ix] console up
>>>>
>>>> Unfortunately, I don't currently have any consoles in the weird state
>>>> of failing to re-initialize automatically, yet coming up immediately
>>>> with a manual console session, so I can only look at what was logged
>>>> yesterday.
>>>>
>>>> Let me know if you need any other info.
>>>>
>>>> thanks
>>>>
>>>> On Fri, Jul 24, 2015 at 12:23 AM, Bryan Stansell <bryan@conserver.com> wrote:
>>>>> What you’re doing sounds all correct, as are your expectations (it should attempt to bring up any downed consoles). My simple test setup shows that it works for me, but with lots of consoles, there could be a bug or some side-effect that happens with more. Or possibly some config settings that aren’t playing well together. Do you have any “interesting” messages in the conserver log file that appear when you run the command?
>>>>>
>>>>> Bryan
>>>>>
>>>>>> On Jul 23, 2015, at 10:59 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>>>>>>
>>>>>> Greetings,
>>>>>> I'm running conserver-8.2.1 on an Ubuntu-14.04.2 server, with several
>>>>>> thousand clients, connected over IPMI. Most of the time it works
>>>>>> fine, however occasionally we lose a VPN concentrator that maintains a
>>>>>> VPN tunnel between remote sites and the console server, and a large
>>>>>> number of console sessions go into the 'down' state. Usually when the
>>>>>> tunnel comes back up, the console sessions come back up on their own,
>>>>>> however there are times when they do not come back up for hours, or
>>>>>> not at all for no obvious reason. In nearly 100% of those cases, if
>>>>>> someone manually runs 'console $consoleName' (where $consoleName is
>>>>>> the name of the console session that is listed as 'down'), it will
>>>>>> immediately come back up.
>>>>>>
>>>>>> According to the 'console' man page (
>>>>>> http://www.conserver.com/docs/console.man.html ), if I invoke
>>>>>> 'console' with:
>>>>>> -z bringup
>>>>>>
>>>>>> it should "Try to connect all consoles marked as down (this is
>>>>>> equivalent to sending the server a SIGUSR1)". I've tried that:
>>>>>> ####
>>>>>> $ console -v -z bringup
>>>>>> console: interface address 127.0.0.1 (lo)
>>>>>> console: interface address 10.200.53.130 (eth0)
>>>>>> 127.0.0.1: ok -- bringing up consoles
>>>>>> ####
>>>>>>
>>>>>> However it doesn't seem to do anything at all. None of the down
>>>>>> consoles come up ever. Yet I can still force them up manually if I
>>>>>> connect to them one at a time.
>>>>>>
>>>>>> I'm unclear whether I'm misunderstanding how the 'bringup' command is
>>>>>> intended to work, or if there's a bug somewhere.
>>>>>>
>>>>>> Can someone comment?
>>>>>>
>>>>>> thanks!
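
With several thousand consoles, one way to see which ones are stuck is to summarize the conserver log rather than read it by hand. The following is only a rough sketch, not part of conserver: it assumes each log entry is a single line shaped like the c042.ytr001.ix excerpt quoted in this thread (the mail client wrapped those lines), and the names LINE_RE, summarize and still_down are made up for illustration. It counts "connect timeout: forcing down" errors per console and reports the consoles whose last recorded state is down.

```python
import re
from collections import defaultdict

# Assumed log line shape, taken from the excerpt quoted in this thread:
# [Thu Jul 23 14:05:12 2015] conserver (13867): ERROR: [c042.ytr001.ix] connect timeout: forcing down
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] conserver \(\d+\): (?:ERROR: )?"
    r"\[(?P<name>[^\]]+)\] (?P<msg>.*)$"
)

# The six log lines from the thread, rejoined into one line per entry.
SAMPLE = [
    "[Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] automatic reinitialization",
    "[Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] console initializing",
    "[Thu Jul 23 14:05:12 2015] conserver (13867): ERROR: [c042.ytr001.ix] connect timeout: forcing down",
    "[Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] login ncconserverprod@localhost",
    "[Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] console initializing",
    "[Thu Jul 23 14:05:26 2015] conserver (13867): [c042.ytr001.ix] console up",
]


def summarize(lines):
    """Return {console: {"timeouts": count, "state": last state seen}}."""
    stats = defaultdict(lambda: {"timeouts": 0, "state": "unknown"})
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that does not look like a console entry
        name, msg = m.group("name"), m.group("msg")
        if msg.startswith("connect timeout: forcing down"):
            stats[name]["timeouts"] += 1
            stats[name]["state"] = "down"
        elif msg == "console up":
            stats[name]["state"] = "up"
    return dict(stats)


def still_down(lines):
    """Names of consoles whose most recent recorded state is 'down'."""
    return sorted(n for n, s in summarize(lines).items() if s["state"] == "down")
```

Calling still_down() on the lines of the real log file (opened with whatever path the local conserver.cf uses) would give the list of consoles to compare before and after a 'console -z bringup' run.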