Archives for: March 2005
31/03/05
Copernic Desktop Search 1.5 has been released
Copernic Desktop Search 1.5 has been released. I've had a look at it to see whether some of the suggestions from my article What I'd like to see in Copernic Desktop Search have made it into the final version. Unfortunately, there doesn't seem to be any information on changes made since the beta version, so it's easy to miss something. Dear Copernic guys, maybe you could publish some more information on the changes you make during development cycles?
Good news
CDS can now index offline mail and news folders for Thunderbird. The not so good news is that it's incredibly slow... like one article a second, or even less than that (and these are newsgroup postings, very short on average and 99% without attachments). As the Index status window shows, the status jumps back to "email indexing complete" between every two articles; I guess that's probably not correct anyway. I thought I was going to index some of my mail folders with this feature, but those are usually around 10000 messages each, so maybe that'll have to wait for now.
An issue I found is that for each of my IMAP accounts, two separate INBOX hierarchies are shown. Obviously there's only one such hierarchy in reality, no idea where the other comes from. There doesn't seem to be any difference in the folders underneath the two. Here's what this looks like:

Bad news
A much worse problem is that the offline folder email indexing doesn't seem to work at all. I tested this twice, once with an offline newsgroup folder, then with an offline IMAP folder. (BTW, if larger configuration changes are made, it's always a good idea to restart CDS. More than once it happened to me that after I had removed a folder from the current email indexing configuration, CDS still continued indexing it, until I restarted it.) In both cases, CDS wasn't able to show me the correct content of indexed entries! I just tried selecting individual entries in the "All Emails" view to see their content in the preview. This didn't really work at all... I'm sounding cautious because the behaviour was so funny: CDS would always show me content of one email in the list, but it would show that same content regardless of my current selection. I didn't even get the impression that this content really belonged to one of the emails I actually clicked on. Every once in a while, when I repeatedly changed selection, the content would change, too, only to stay the same again for the next 30 (or so) clicks.
More bad news
None of my other suggestions have been taken up by the developers. Not so nice... especially since I can't imagine some of them being that far out, or that difficult to implement, like the suggestion to enable indexing of browser-related information for more than one browser at the same time.
Even more bad news
As Marc Orchant and James Kendrick have been reporting in recent blog articles (this is Marc's and this is James's), there are serious issues running CDS on TabletPCs. I have one of those myself, so I'm really interested in this, and Copernic doesn't seem to show any interest in solving these issues. Maybe it would be useful if more TabletPC users contacted Copernic and told them that this is important to us.
Conclusion
Well, I don't like it too much. Copernic have managed to introduce a few serious bugs in CDS since the beta and they haven't taken up much of what was suggested at that time. My C# custom extractor for CDS still runs, but they haven't done anything to address the shortcomings of the extension API, for example with previewers or the ability to influence the scanning process itself. Obviously, CDS is free software, but it wouldn't have to be if you ask me. Actually, I have three licensed programs by Copernic and none of them is being as actively developed as CDS is. Maybe Copernic would do well to listen to their users and to better coordinate their development and test cycles for a product that may still be leading in a very competitive market... but they don't have too much time to make the right decision here.
29/03/05
Don't use exceptions for flow control
I just read a blog entry titled A word from the "Wise": Don't Use Exceptions on Alex Papadimoulis' weblog. He describes at great length how a so-called MVP had told him that using exceptions was a bad idea, and goes on to prove in ten points why he doesn't believe it.
Well, not. The only thing Alex proves is that he didn't remotely understand what that MVP guy was (probably, I'm only guessing here, of course) talking about. The most important hint is one that Alex misses completely: The MVP said not to use exceptions, but instead use return codes... now, what might that have been about?
The answer is simple, and it doesn't take an MVP to tell anyone... a simple Google search says enough about this. Exceptions are not a tool for flow control!
What does this mean? A simple example: if you are going to divide something by something else, it's a much better idea to check the divisor for zero than to catch a DivideByZeroException. That's what the MVP dude meant when he was talking about exceptions killing performance: checking a variable for a specific value is a simple processor instruction, while throwing and catching an exception takes thousands of additional cycles, and the result is the same.
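To make the difference concrete, here is a minimal C# sketch of both variants. The class and method names (SafeMath, TryDivide, DivideCatching) are made up for this illustration; only DivideByZeroException comes from the framework.

```csharp
using System;

public static class SafeMath
{
    // Preferred: a zero divisor is an expected case, so test for it up front.
    public static bool TryDivide(int dividend, int divisor, out int result)
    {
        if (divisor == 0)
        {
            result = 0;
            return false;
        }
        result = dividend / divisor;
        return true;
    }

    // Same outcome for the caller, but the expected case is routed
    // through the throw/catch machinery.
    public static bool DivideCatching(int dividend, int divisor, out int result)
    {
        try
        {
            result = dividend / divisor;
            return true;
        }
        catch (DivideByZeroException)
        {
            result = 0;
            return false;
        }
    }
}
```

Both methods report failure the same way, so callers can't tell them apart by result; the difference is only in what happens internally when the divisor is zero.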
An exception is meant as an instrument to signal that your application's flow has been broken. That's the one most important feature of the exception handling subsystem: it works across all boundaries of application flow. It's neither designed nor very useful for any situation where the state of the application is known.
Rule of thumb: if you expect that an error may occur in any specific line of code, and you are going to implement alternate handling for that particular case, don't use an exception. That's not an unexpected application state and shouldn't be handled with an exception.
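The same rule of thumb applied to input parsing, as a small sketch: malformed user input is an expected case, so the code below uses the framework's int.TryParse and keeps exceptions out of the picture. InputHelper, ParseQuantity and the fallback parameter are assumptions made for the example.

```csharp
using System;

public static class InputHelper
{
    // Returns the parsed quantity, or the fallback when the text
    // is null or not a valid integer. No exception is involved.
    public static int ParseQuantity(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }
}
```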
And then, if you're going to measure execution speed, at least do it properly. Run a profiler and look at the number of method calls that take place in that small test program of yours. And then imagine, how fast would the .NET framework be if some guys who work at MS didn't know when (not) to use exceptions? Who says that your method is only ever going to be called ten times in a row? What's your excuse why you didn't write code that's as fast as it can be? You laughed when someone tried to explain it to you?
Who switches off his/her computer at night?
Well, I don't. I never switch off my computer and I never quit the utilities that are running on it all the time. There may be better reasons for this than I have, but these are mine: I use it 14 hours a day anyway, and to boot my 3.4 GHz Athlon 64 system from a cold state into Windows XP, with all the tools running, it takes 19 minutes, that's no exaggeration.
In this context, there's something I absolutely hate: most applications, even small ones, have routines these days that run "regularly". Many do automatic update checking, some do automatic backups, whatever. And guess what those brainy developers do? They run those processes at startup, somehow assuming that a satisfying regularity will come automatically that way. Great. Crap, of course.
Guys, don't do that! It's useless! Many people don't restart your application all the time, so it just doesn't work! And I'm sure those apps take their share of the 19 minutes boot time, because once they are actually restarted it's something that happens only every few weeks, so all the update checks/backup/whatever take place at the same time. Who comes up with such an idiotic idea?
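What I'd rather see, as a rough sketch: remember when the job last ran and let a timer ask whether it's due, so the schedule doesn't depend on application restarts. RecurringTask and its members are hypothetical names, and how the timestamp is persisted between runs is left open here.

```csharp
using System;

// Hypothetical helper: decides from a stored timestamp whether a recurring job is due.
public sealed class RecurringTask
{
    private readonly TimeSpan interval;
    private DateTime lastRunUtc;

    public RecurringTask(TimeSpan interval, DateTime lastRunUtc)
    {
        this.interval = interval;
        this.lastRunUtc = lastRunUtc;
    }

    // Persist this value (settings file, registry, ...) so a restart keeps the schedule.
    public DateTime LastRunUtc
    {
        get { return lastRunUtc; }
    }

    public bool IsDue(DateTime nowUtc)
    {
        return nowUtc - lastRunUtc >= interval;
    }

    // Call this from a timer tick; the work only runs when the interval has elapsed.
    public bool RunIfDue(DateTime nowUtc, Action work)
    {
        if (!IsDue(nowUtc))
        {
            return false;
        }
        work();
        lastRunUtc = nowUtc;
        return true;
    }
}
```

With something like this, an application that runs for weeks still checks for updates once a day, and one that is restarted five times a day doesn't check five times.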
26/03/05
Defining data consistency in an object world
Many things change with the decision to work with purely object-oriented data in a specific situation. The outlook seems good: business processes and rules will be much easier to implement, completely typed data will be no problem at all and there'll be no more structural problems trying to accommodate clumsy handling of records and rows in an otherwise OO application structure. There'll be an object/relational mapping tool that takes care of all the persistence issues. There's one thing though that will pose problems much greater than originally anticipated, and it's easy to overlook large parts of that in the original decision: the wide topic of data integrity in the object world (OW).
I'm going to present some general questions and theories about data integrity in conjunction with OO data objects in this article and I'm planning to write some further articles on the same topic later. Occasionally I may reference the technology I'm personally using at the moment, which is .NET 2, the C# language and XPO.
Data integrity, what about it?
First of all, data integrity comes in two flavours:
- The technical side of things is where we are talking about referential integrity on a database level, mapping of field types between databases and programming languages, things like that.
- The logical part is part of what's often referred to as Business Logic. There are definitions for value ranges, maximum counts of assignments between objects, valid states of object fields and much more. Obviously, business logic in its whole may encompass lots of other things, the part that's relevant here is only where it relates to object data in the objects, as opposed to object data in flow between objects and states.
Many aspects of both data integrity parts are very different in the OW, compared to a "simple" relational database model.
Why is it different?
People have been thinking about these issues for the relational database scenario for a long time. Concepts like referential integrity and unique indexes are very important in this domain. Normalisation provides for database configurations where automated referential integrity can be fully exploited. Databases have features that let the designer restrict values. Together with modern database access layers like ADO.NET, these mechanisms cover the complete technical side of data integrity and possibly some of the logical part.
In the OW, this is where the problems start. Obviously, a good o/r mapping tool should be able to exploit the features of the database layer, but this isn't sufficient. As soon as a single object is mapped to more than one table (as it should be, when inheritance is used), many of these mechanisms break. For example, it's impossible to define a multi-value index, unique or not, for values that are not in the same table. Depending on implementation details of the o/r mapper, even unique indexes over fields that are in the same table may present problems, for example if the mapper doesn't make sure that all necessary fields of an object are filled (correctly!) before the object is first saved to the database.
When working directly with relational databases, the easy way to implement business logic is on the server side. Using triggers on the database level, consistency checks can be implemented, other processes executed just in time and so on. Unfortunately, this has a lot of drawbacks; one of the worst is that there's no easy way to get useful user feedback in case a check fails. In real-world applications, more often than not business logic implementations will be split, performing some kinds of actions on the database level while leaving other stuff to the client application. For the latter part, it's difficult to find the right "place" to implement it; in .NET, a typed dataset can provide part of a useful answer.
As long as there's any server-side consistency checking implemented, there's always the problem that data which is already loaded on the client, and has been changed there, may not adhere to the same restrictions the server would enforce if the data was to be saved. The programmer has to keep an eye on the exact state of things and see to it that data is saved to the server in all the right places.
There are two aspects to this issue: First, an o/r mapping tool should be allowed to define its own database structure with as much freedom as possible. (I know that a lot of people think that this should work the other way round, letting them define a structure and leave the tool to deal with that. Apart from situations where one needs to work with legacy data structures, this is nonsense to me and contorts the purpose of such tools.) With a tool-generated layout, though, I'd have to be very careful when writing database layer code that successfully uses the information in that layout, and I'd risk breakage every time I update the tool.
Second, from the OW point of view, it seems intolerable to have a number of objects in memory at any given time that may not be in a consistent state. With relational data, this is often a situation that's simply left to the developer of each distinct algorithm. But when objects are global to the application (or parts of it, at least) and there are intelligent caching and lifecycle management mechanisms in place, as implemented by a useful o/r mapper, one can't live with the possibility of inconsistent states in in-memory data.
Consequences
So, these are (some of) the specific issues we have to deal with in the OW:
- The consistency of in-memory data has much greater importance in the OW. Also, in-memory data is probably a more common thing to have to deal with.
- Standard mechanisms on the database layer may be partly useless and there's no ready-made replacement. Client-side counterparts like the index support in .NET datasets are not immediately available.
- While obviously consistency checking systems like those implemented in triggers on a database server can be implemented client-side in an object framework, there's no fixed point where this can be implemented. Everybody knows what a "before insert" trigger in a database is good for, but where do I put the same code in my OW?
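Just to illustrate the last point, here is one conceivable place for such a check, as a minimal sketch: a base class that collects rule violations before an object is handed to the persistence layer. ValidatedObject, TrySave and OrderLine are hypothetical names and not part of XPO or any other o/r mapper; the persist delegate merely stands in for whatever the mapper does when saving.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class; not part of XPO or any other o/r mapper.
public abstract class ValidatedObject
{
    // Subclasses report rule violations as messages instead of throwing.
    protected abstract void CollectViolations(IList<string> violations);

    public IList<string> Validate()
    {
        List<string> violations = new List<string>();
        CollectViolations(violations);
        return violations;
    }

    // Client-side stand-in for a "before insert" trigger:
    // the object is only persisted when no rule is violated.
    public bool TrySave(Action<ValidatedObject> persist, out IList<string> violations)
    {
        violations = Validate();
        if (violations.Count > 0)
        {
            return false;
        }
        persist(this);
        return true;
    }
}

public sealed class OrderLine : ValidatedObject
{
    public int Quantity;

    protected override void CollectViolations(IList<string> violations)
    {
        if (Quantity < 1)
        {
            violations.Add("Quantity must be at least 1.");
        }
    }
}
```

Unlike a failing trigger, the list of violations can be shown to the user directly. Whether a base class is really the right place for this is exactly the open question.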
These issues and their solutions will be subjects of future posts. Thanks for reading so far!
23/03/05
WinForms controls and the red X
Most people working with WinForms have probably encountered that red X that is drawn over a control at some point and just doesn't go away as long as the application is running. I originally had a look at the source of this some months ago, and now that I've seen a related question again, I thought I might document my findings here.

Note that I did that research with .NET 1 and I haven't checked for .NET 2 yet, so in the latter case YMMV.
So where does the red X come from? Simple: The System.Windows.Forms.Control has an internal state flag for this that gets set when an exception is thrown in the control's drawing code. So if you've never seen the red X but you want to, just throw a panel on a form and create a Paint event handler like this:
private void panel1_Paint(object sender, System.Windows.Forms.PaintEventArgs e) {
    // An exception escaping the paint handler sets the control's internal exception flag.
    throw new Exception("Boom");
}
Now, the really interesting thing about the red X is that you can't easily get rid of it once it's popped up. The only "official" way is to restart the application. Luckily, though, .NET has powerful reflection... that makes it possible to use the following method to reset the state:
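// Needs: using System.Reflection; using System.Windows.Forms;
void ResetExceptionState(Control control) {
    // SetState is a non-public method of Control; 0x400000 is the flag
    // value I found for the exception state (checked with .NET 1 only).
    typeof(Control).InvokeMember("SetState", BindingFlags.NonPublic |
        BindingFlags.InvokeMethod | BindingFlags.Instance, null,
        control, new object[] { 0x400000, false });
}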
So you can get that panel in the example above to have another go at drawing itself by going
ResetExceptionState(panel1);
panel1.Invalidate(); // invoke redraw
Of course, if the same exception is still thrown from the paint handler, there won't be much to see as the state is immediately set back to show the X again.
Generally, of course, you should have a very close look at the reason why there's an exception thrown in the paint handling code at all. But there are situations where you might want to control the Control's behaviour in detail and in these cases it's nice to be able to handle that internal state yourself.
Two additional notes:
- You do need certain permissions to get that reflection code to run. If you want to configure your application for exact permission sets, you should use
[ReflectionPermission(SecurityAction.Demand, MemberAccess=true)]
in front of the ResetExceptionState method.
- As I got a request for this at the time, I have a translation of the method in VB.NET, too. I don't usually use VB, so there may be more elegant ways to do this, but here goes:
Private Sub ResetExceptionState(ByVal control As Control)
    Dim args() As [Object] = {&H400000, False}
    GetType(System.Windows.Forms.Control).InvokeMember("SetState", _
        BindingFlags.NonPublic Or BindingFlags.InvokeMethod Or _
        BindingFlags.Instance, _
        Nothing, control, args)
End Sub


