The Principles of a System of Identification
November 3, 2009. Posted by ficial in brain dump.
The most common tool for identification is the dichotomous key. A person starts out with an unknown thing and the key presents a question about it; one answer leads to one later question, the other answer to another, and so on until the thing has been uniquely identified. This sort of key has a lot going for it – when it works it’s accurate, precise, and reliable; it’s easy and intuitive to use and to understand; it teaches a user what to observe; and it’s easily printed on paper. However, it has many problems as well, and I think that with computers we can do much better. Here are the points that I think especially need addressing:
- An ID system should work with whatever traits the user can observe – binary keys specify what traits must be examined, and if the desired determination cannot be made then further progress is difficult to impossible.
- Corollary – An ID system should let the user specify what kinds of traits to care about – e.g. a binary key may require a hand lens or a chemical test when all the user has is their naked eye.
- Corollary – An ID system should take advantage of whatever observations a user is able to provide – e.g. even though growth substrate may be a good indicator for a match, a binary key cannot rely on a user having that information and so can’t include it in its tree (or else does include it, which makes that question a dead end if the user doesn’t have that information).
- An ID system should indicate partial matches and likelihoods – if one only gets 2/3 of the way through a binary key it is very hard to tell how far the pool of potential matches has been narrowed and what it now contains, and the key does not indicate which of those matches is most likely based on what is known so far.
- An ID system should handle inconsistent trait values – if a specimen is often yellow, sometimes orange, and rarely red, a binary key has a hard time making use of color as a determining factor.
- An ID system should make it easy to differentiate between two potential matches, or else clearly indicate that no further differentiation is possible – if a binary key gets you to two plausible matches it does not tell you the best way to determine which is the correct one.
- An ID system should be easily extensible – it is extremely difficult to merge two binary keys, or even just to add a few items to a given key; adding to a binary key often entails re-doing large sections of the key. Among other things, this makes it very hard for more than a small group to contribute directly to a given key.
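To make a few of these points concrete, here is a minimal sketch in Python of one way such a system might score candidates. Everything in it is an assumption for illustration: the taxa, the trait names, the frequencies, the `FLOOR` constant, and the choice to multiply trait frequencies together are all made up, not a worked-out design. It ranks every candidate using only the traits the user happened to observe, treats variable traits as frequency tables, and suggests which trait best separates two remaining candidates.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: taxon -> trait -> {value: how often that value is seen}.
Profiles = Dict[str, Dict[str, Dict[str, float]]]

TAXA: Profiles = {
    "species_a": {
        "cap_color": {"yellow": 0.7, "orange": 0.25, "red": 0.05},
        "substrate": {"wood": 1.0},
    },
    "species_b": {
        "cap_color": {"yellow": 0.1, "red": 0.9},
        "substrate": {"soil": 1.0},
    },
    "species_c": {
        "cap_color": {"orange": 0.5, "brown": 0.5},
        "substrate": {"wood": 0.8, "soil": 0.2},
    },
}

# Assumed small weight for a value never recorded for a taxon, so that a
# single odd observation lowers a candidate's score without eliminating it.
FLOOR = 0.01


def rank(observations: Dict[str, str],
         taxa: Profiles = TAXA) -> List[Tuple[str, float]]:
    """Return (taxon, relative likelihood) pairs, most likely first.

    Only traits the user actually observed are used; a trait that is not
    recorded for a taxon is treated as neutral and skipped.
    """
    scores = {}
    for name, traits in taxa.items():
        score = 1.0
        for trait, value in observations.items():
            if trait in traits:
                score *= traits[trait].get(value, FLOOR)
        scores[name] = score
    total = sum(scores.values())
    ranked = [(name, score / total) for name, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def best_discriminator(a: str, b: str,
                       taxa: Profiles = TAXA) -> Optional[str]:
    """Return the shared trait whose value frequencies differ most between
    taxa a and b, or None if no recorded trait separates them."""
    best, best_gap = None, 0.0
    for trait in sorted(set(taxa[a]) & set(taxa[b])):
        pa, pb = taxa[a][trait], taxa[b][trait]
        # Half the summed absolute difference: 0 = identical, 1 = disjoint.
        gap = 0.5 * sum(abs(pa.get(v, 0.0) - pb.get(v, 0.0))
                        for v in set(pa) | set(pb))
        if gap > best_gap:
            best, best_gap = trait, gap
    return best


if __name__ == "__main__":
    for name, likelihood in rank({"cap_color": "yellow"}):
        print(f"{name}: {likelihood:.2f}")
    print(best_discriminator("species_a", "species_b"))
```

In this sketch, adding a new candidate is just adding another entry to the table, and merging two collections is merging two tables, which is the kind of extensibility a binary key lacks. A real system would need more careful statistics than a simple product of frequencies.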
It’s worth noting that any new system must also be at least as good as binary keys in these areas:
- Accurate – when a positive identification is made it should be correct 100% of the time
- Precise – the system should identify a single positive match, or as small a set as possible
- Reliable – the same specimen should give the same match results over multiple iterations by multiple users
- Usable – the system should lend itself to a straightforward, simple, and sensible interface
- Understandable – the system should not be so complicated that users treat it as a complete black box; even if users don’t fully understand all the details of the system they should grasp the general concept
- Instructive – when a user makes an identification the process should teach them general skills of observation, something about that particular specimen, and something about other specimens in the same general category.
I am explicitly abandoning the need to have the system translate well to paper – the whole point of this exercise is to consider how things might be different in the context of the computer.