[en] Why does a ClientHeartbeat not extend the session Timestamp? (solved)

Topics: Technical Support
Jan 12, 2012 at 3:34 PM

Currently a client session's (ServerSession instance) Timestamp is only updated when the client effectively calls into the server (e.g. in the MiniChat example the ServerSession.Timestamp value is updated only when 'SendMessage' is called by a client on the server side).

However, when 'PollingEventTracing' is enabled and the client sends a heartbeat signal to the server, the ServerSession.Timestamp is NOT updated.

Why is this?

 

The current effect is, that a connected client which doesn't invoke any remote call for quite some time (longer than the defined 'SessionAgeLimit') will automatically expire and as such the session will automatically be removed from the server.

But if the (still connected client) now tries to invoke a remote call to the server, the server will throw an exception which leads to a crash on the client side!

This seems to be unlogic - as the client is/was still connected all the time and even sends its heartbeat signal - so it shouldn't expire automatically?!

Coordinator
Jan 13, 2012 at 9:23 PM

It looks like a plain bug to me. Heartbeat event was designed to keep low-activity clients alive.

Coordinator
Jan 13, 2012 at 9:25 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Coordinator
Jan 13, 2012 at 10:41 PM
Edited Jan 13, 2012 at 10:42 PM

Hi Bernd,

both issues you reported are fixed now (hopefully).

Please download the latest source snapshot and rebuild both Zyan and MiniChat sample.
Let me know if the fix works for you so I can close work items.

P.S. As for ActiveClients list, let's wait for Rainbird's opinion.
If he agrees to add a new event for tracking swept sessions, this functionality will probably be added to the upcoming release 2.3.

Jan 15, 2012 at 2:16 PM
Edited Jan 15, 2012 at 6:04 PM

Hi yallie,

thanks for the update, which solved a couple of things.
In the meantime I also changed the lib locally and came to the following changes:

1) AutoSessionExtension:
I did pretty much the same as you did, except, that I added a new property called "PollingAutoSessionExtension" to the "ZyanComponentHost" - to ensure backward compatibility.
So my code for the 'OnClientHeartbeatReceived' method looks like this:

protected virtual void OnClientHeartbeatReceived(ClientHeartbeatEventArgs e)
{
    if (_pollingAutoSessionExtension)
    {
        var session = SessionManager.GetSessionBySessionID(e.SessionID);
        if (session != null)
            session.Timestamp = DateTime.Now;
    }

    if (ClientHeartbeatReceived != null)
        ClientHeartbeatReceived(this, e);
}

 

 2) Automatic Unsubscribe Client Events:
I guess this is today handled in your "DynamicEventWireBase" class - see the "InvokeClientDelegate" method!
Here you catch any exception, and in the 'catch' handler you call 'ServerEventInfo.RemoveEventHandler' to effectively unsubscribe the event handler from the 'dead' client. However, right after, you (re)throw the exception. I think this is rather problematic, since this means, that the event is NOT further processed.
Any other active and valid clients (which also subscribed to that event) will NOT receive that event anymore!
Therefore I replaced the catch-handler with the following code (not re-throwing the exception):

protected override object InvokeClientDelegate(params object[] args)
{
	try
	{
		if (ValidateSession != null && !ValidateSession())
			throw new InvalidSessionException();

		if (EventFilter != null && !EventFilter.AllowInvocation(args))
			return null;

		return Interceptor.InvokeClientDelegate(args);
	}
	catch
	{
		// unsubscribe
		ServerEventInfo.RemoveEventHandler(Component, InDelegate);
        if (RemoveSession != null)
            RemoveSession();
	}
    return null;
}

As you  can see, I added a new "RemoveSession" delegate to it as well, like this:

public Func<bool> RemoveSession { getset; }

This allows e.g. the "ZyanDispatcher" class to inject some code, which will be executed, if a 'dead' client is detected to be 'removed', e.g. from the session manager automatically.
I actually used that in the "ZyanDispatcher.CreateClientServerWires" method to call a new server manager method:

dynamicEventWire.RemoveSession = () => sessionManager.PurgeSession(sessionId);

This new method more or less does the same as the "RemoveSession" method, except, that it fires a new event called "OnClientExpired". So here is what I added to the "SessionManagerBase" (plus an interface in "ISessionManager"):

public bool PurgeSession(Guid sessionID)
{
    bool purged = false;

    IIdentity identity = null;
    DateTime timestamp = DateTime.MinValue;
    string clientIP = string.Empty;

    var session = GetSessionBySessionID(sessionID);
    if (session != null)
    {
        identity = session.Identity;
        clientIP = session.ClientAddress;
        timestamp = session.Timestamp;

        RemoveSession(sessionID);
        purged = true;
    }

    try
    {
        if (identity != null)
            OnClientExpired(new LoginEventArgs(LoginEventType.Expired, identity, clientIP, timestamp));
    }
    catch { }

    return purged;
}

Then I changed the "SweepExpiredSessions" method (again in "SessionManagerBase") like this:

protected virtual void SweepExpiredSessions()
{
    if (SessionAgeLimit > 0)
    {
        foreach (var id in ExpiredSessions)
        {
            IIdentity identity = null;
            DateTime timestamp = DateTime.MinValue;
            string clientIP = string.Empty;

            var session = GetSessionBySessionID(id);
            if (session != null)
            {
                identity = session.Identity;
                clientIP = session.ClientAddress;
                timestamp = session.Timestamp;
            }

            RemoveSession(id);

            try
            {
                if (identity != null)
                    OnClientExpired(new LoginEventArgs(LoginEventType.Expired, identity, clientIP, timestamp));
            }
            catch { }
        }
    }
}

And added a respective "OnClientExpired" event handler:

public event EventHandler<LoginEventArgs> ClientExpired;

The same event (by the way) which I added to the "ZyanComponentHost"!
As such, the "ZyanComponentHost" now contains a ClientLoggedOn, ClientLoggedOff and ClientExpired event!
As such it is easy track a real list of active clients at any time.

What do you think...?

Many Greets

Bernd

 

 

Coordinator
Jan 16, 2012 at 8:00 PM

Hi Bernd,

if executing a client callback delegate fails, the client´s session should not be removed automaticly. Removing the session on the first little exception doesn´t allow the client to reconnect smoothly. Remember there can be many different reasons for the callback failure.

Let´s think about a big enterprise application which consists of many modules. If a single client callback failure of one module results in instant puring the client´s session, all other modules on this client will be cut-off too.

The MiniChat example indeed, isn´t a real world application. It has some bugs and lacks detailed error handling. But if I plug all the stuff in, that I would in a real world scenario, the MiniChat example won´t suite well as a example anymore. Anyway bugfixing the examples is always a good idea.

ZyanComponentHost will be extended with a new event for notification when a expired session is removed. Many thanks for your great feedback.

Distributed events in Zyan are currently (V 2.2) are only recommended for small solutions with not many clients. If you have many clients and/or have other special requirements, I suggest to use remote delegates and a custom publish/subscribe implementation. This works fine and gives you full control how distributed notifications should be done.

The following features (planned for V 2.3) will make distributed events suitable for more enhanced scenarios:

http://zyan.codeplex.com/workitem/1348
http://zyan.codeplex.com/workitem/1349

Many of the problems with distributed events that exists today with Zyan 2.2, will be solved then.

Coordinator
Jan 16, 2012 at 9:50 PM
Edited Jan 16, 2012 at 9:55 PM

@Bernd:

Great job!

You've figured out how to fix things yourself pretty quickly.
It makes me feel that Zyan architecture evolves in the right direction :)

>1. AutoSessionExtension

I don't really see why session heartbeat event should have this extra configuration property.
As you've already mentioned, expiring client session when the client still sends heartbeats is very counter-intuitive.
The heartbeat should always prolong session lifetime, as it's the main reason for the heartbeat to exist at all.

>2. Automatic Unsubscribe Client Events

Unfortunately, it would be a breaking change in both session management and event subsystems.
I feel that it would hurt backward compatibility pretty badly.

I'd suggest that you detect abnormally terminated clients by either relying on session sweeping (once the new
event for that is introduced) or implementing your own regular checks whether the certain client is still alive.
The former technique (heartbeats + session sweeping) would scale much better, to my opinion.

@Rainbird:

I've forgotten to close #1349 work item :)
Session-bound events and event filters are already implemented and should be ready to use.
I'm already using them for about a month in my project.

Jan 17, 2012 at 9:05 AM

Hi All,

thanks for your feedback!

(Rainbird): when you say:
"...if executing a client callback delegate fails, the client´s session should not be removed automaticly. Removing the session on the first little exception doesn´t allow the client to reconnect smoothly. Remember there can be many different reasons for the callback failure..."

I am not sure, if that is really correct? (of course I am just looking into your lib for a few hours, so I am sure I didn't get it all); but I am also looking for an enterprise solution and the current implementation has one huge drawback: If a client connection 'dies' (e.g. due to a client being terminated abnormally or whatsoever), the event notification isn't guaranteed today anymore - due to the fact, that you (re)throw the exception as mentioned.
E.g. in your MiniChat example:
If you connect 3 or more clients (and all of course subscribe to the same event handler) and one (e.g. the first one which connected) client 'dies' then the current lib wouldn't deliver the event anymore to ALL other client.
This means the current implementation inside your lib doesn't guarantee, that all connected and alive clients really reliably receive any event they subscribed to!
In my opinion this shouldn't happen. Instead a 'dead' client should NOT have an impact to the other still connected clients.
And this behaviour has nothing to do with the implementation of the MiniChat example - as the issue is inside the lib itself.
Just make a little test:
- start the MiniChat server
- connect 3 clients to it (name them: test1, test2, test3...)
- send some messages (you can see, that ALL clients receive the event)
- now kill the first connected client (e.g.: test1)
- now let 'test2' send a message
- you'll notice that the 'test3' client would NOT receive that message
This is a bug to my mind?!

That's why I believe, that when you remove a client event delegate via "ServerEventInfo.RemoveEventHandler", that the related session should also be removed. What sense would it make to remove a client event handler but not the client session itself?
Removing the client event handler only would leave the related session in an 'unpredictable' state?!

As said, I just looking at your lib, so bare with me, if I am overlooking things here.

Coordinator
Jan 17, 2012 at 8:04 PM
Edited Jan 17, 2012 at 8:05 PM

Hi Bernd,

>the event notification isn't guaranteed today anymore - due to the fact, that you (re)throw the exception as mentioned.

I think you're right. Rethrowing an exception coming from a client's event handler is a bad idea.
It prevents pending event handlers from the same delegate chain to execute, which is plain wrong.
I just didn't notice that you removed "throw" statement.

>you'll notice that the 'test3' client would NOT receive that message
>This is a bug to my mind?!

Yes, I absolutely agree that it's a bug in the library and it should be fixed.
We shouldn't rethrow client's exception here.
But I still don't agree that:

>That's why I believe, that when you remove a client event delegate via "ServerEventInfo.RemoveEventHandler",
>that the related session should also be removed.

If client's event handler throws an exception, it doesn't necessary mean that the connection is broken.
It may be due to some bug in a client's software, for example.

Consider some 3-tier enterprise application: WinForms client -> Application Server -> Database.
WinForms client creates and displays OrderForm which subscribes to a server-side event: OrderStatusChanged.
OrderForm is buggy, and its event handler may throw an exception which will go to the server.

Once AppServer sees that client's event handler is bad (it throws exceptions), it cancels the subscription.
We just don't want AppServer to waste any resources trying to call invalid client event handlers.
OrderForm can't handle the event anymore, but WinForms client application still can function.
For example, the user can click Save button and save changes to the order he was editing.
He can also close OrderForm and reopen it, and the event subscription will be restored, and so on.

>Removing the client event handler only would leave the related session in an 'unpredictable' state?!

I understand what you mean, but I don't really feel that invalidating the whole session is the right thing to do.
I believe that it's preferably to try our best to recover from the error, not to kick the misbehaving client out.
And of course, catch block should also log the exception in some way to help developers to spot the problem.

Coordinator
Jan 17, 2012 at 8:20 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Coordinator
Jan 18, 2012 at 5:17 AM

Hi Bernd,

I agree that´s bad, when further clients from the event´s invocation list are not notified, if one clients dies. In distributed applications this leads to bugs and results in poor reliability of the event system.

But local events have the same behavior. If an exception occures inside a event procedure, all further event procedures in the local event´s invocation list will not be called. So we can say, remote events with Zyan behave exactly like thier local ancestors.

Anyway it should be fixed in Zyan.

Jan 18, 2012 at 9:46 AM
Edited Jan 18, 2012 at 9:54 AM

Hi,

thanks for your feedback. What you are saying makes of course sense.

So maybe there is a way to differentiate the actual 'exception reason/cause' and handle it differently. E.g.:

a) if the exception is raised because of a 'lost client connection' it would be save to 'revome' the related session (or at least offer an option to do so; or offer an optional callback, so that the server implementation might react on such 'lost client connection').
As far as I can see, in case of a 'lost client connection' the original exception is thrown e.g. in the "Zyan.Communication.Protocols.Tcp.DuplexChannel.Manager.GetHostByName" method when calling "Dns.GetHostEntry(name).AddressList"

b) if the exception is raised because it occures inside a event procedure itself, then you might simply remove the event delegate from the list (like today).

But in any case you might NOT (re)throw such exception - so that other subscribed delegates might still get it.

 

P.S.:

I see the same issue inside the "Zyan.Communication.Protocols.Tcp.DuplexChannel.Connection" class when accessing the "Stream" property. This might also throw a 'SocketException' which might be (re)thrown later.
I guess another user already reported a simmilar issue.

As both the 'GetHostEntry' as well as the 'Connection.Stream' might throw a 'SocketException' it might maybe possible to use that exception to differentiate the above 2 cases.
It might not be 100% save, as the event handler implementation might also throw such exception itself - but it seems to me rather unlikely and this is at least better than nothing ;-)

Coordinator
Jan 19, 2012 at 2:32 PM
Edited Jan 19, 2012 at 2:35 PM

Hi Bernd!

>As both the 'GetHostEntry' as well as the 'Connection.Stream' might throw a 'SocketException'
>it might maybe possible to use that exception to differentiate the above 2 cases.

Unfortunately, it's not an option.

Zyan uses lots of different protocols — TcpEx, Tcp, Named Pipes, Http (and this list will grow as we add the new ones).
It's only Tcp family of protocols which is throwing SocketExceptions. Therefore, we can't rely on it at all.

>at least offer an option to do so; or offer an optional callback, so that the
>server implementation might react on such 'lost client connection').

I think, it's no problem. We can add some exception tracking mechanism for dynamic wires.
This way you'll be able to remove client sessions if you decide to do so (it does really make sense in a chat client).

I was already considering this a few months ago (I have future plans to implement a few more clever strategies —
like ordering event handlers based on their execution time, moving bad/slow handlers to the end of the list,
removing the handler if it throws exceptions several times in a row, etc.)

>But in any case you might NOT (re)throw such exception - so that other subscribed delegates might still get it.

Yep, that's for sure. We'll fix it ASAP.

>I see the same issue inside the "Zyan.Communication.Protocols.Tcp.DuplexChannel.Connection" class when accessing
>the "Stream" property. This might also throw a 'SocketException' which might be (re)thrown later.

>I guess another user already reported a simmilar issue

Do you mean this work item? TcpEx: unhandled SocketException may crash server

Jan 19, 2012 at 4:38 PM

Yes, I mean that work item.

The 'SocketException' was an example of course. Other transport protocols might offer other exceptions which might be handled similar - it was just an idea to more quickly detect 'dead' clients.