During a SIGIMAGE Executive Meeting in New Mexico a few years ago, between meals of wonderful Mexican food, a topic arose that was referred to as the "Image Enhancement Unification Enhancement" or the "Limited Data Dictionary Enhancement". That subject has since been renamed the "ENhancement for caCHIng Limited Authorized DAta", or just "ENCHILADA" for short. Since I was tasked with developing the preliminary requirements document to get the salsa flowing on this one, I wrote the first draft at that time. Since then, a number of other proposals have been submitted and a lot of discussion has taken place. Each of the proposal writers has been asked to resubmit an updated proposal, so that the SIGIMAGE membership at large can review them all. To that end, I respectfully submit my idea of the whole enchilada:

As you read the following, please focus on the concepts, not the implementation. I mention a few implementation details as a means to better understanding the concepts. There will be time later to focus on the implementation, if the proposal gets that far.

We propose that IMAGE/SQL be enhanced to include one (somewhat transparent) manual master dataset and its corresponding B-Tree for all newly created databases. This dataset would be used to store myriad different bits of information about the database, similar to what might be found in a data dictionary. IMAGE itself would not use the data in the dataset. Rather, IMAGE would be used to maintain the data (put, delete, update, get). Other pieces of software would benefit from the existence of the data, should that software choose to retrieve it.

What we are looking for here is not a comprehensive data dictionary, but instead a simple, relatively easy-to-implement enhancement that can be used to solve a large number of current problems, while still providing backward compatibility.

(We will also want this dataset to be added to existing databases.
Perhaps this could be done automatically when the database is first opened by the new version of IMAGE, or the first time someone does a DBPUT to the dataset. Or, perhaps we will want to provide a program that will add the set, or perhaps we will want to leave it to the third-party database tools. In any case, it would be highly desirable to have this dataset ubiquitous as soon as possible.)

For purposes of discussion, let's assume that the dataset has a structured key with a type of X18, referred to as a "TAG", of the form:

eemmmmnnnnttttssss

The eemmmmnnnn part refers to the entity:

- "B " would be used for tags that apply to the entire database.
- "S ssss " would be used for tags that apply to dataset number ssss.
- "I iiii " would be used for tags that apply to item number iiii.
- "F ssssiiii" would be used for tags that apply to the field corresponding to item number iiii in dataset number ssss.

(Note that "ssss" and "iiii" are the right-justified, zero-filled ASCII representation of the dataset and item numbers.)

The tttt part refers to the type: Several types are envisioned and are described in the discussion below.

The final ssss part of the key refers to the sequence number: If we limit this field to lowercase alphanumerics (a/z,0/9), then we can have 36^4 or 1,679,616 different tags for an entity of the same type. (If someone needs more, they could treat the ssss as a 32-bit binary number and get 4 billion tags at the expense of ease-of-use and printability.) Due to B-tree access, these numbers do not need to be sequential. As you will see below, the actual use of this portion of the key can be defined by the creator of each type of tag.

The dataset would also contain one or two more fields: The next field (if we decide to include it) would be a date/time stamp in some standard format. The last field would be the data field.
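
The key layout above can be sketched in a few lines of Python. This is only an illustration of the concept, not a proposed implementation: the function names (encode_seq, make_tag) are hypothetical, and the sketch assumes that the entity letter is followed by one blank, that the dataset number occupies the mmmm positions and the item number the nnnn positions (as in the "F ssssiiii" form), that unused positions are blank, and that the type is blank-padded to four characters.

```python
import string
from typing import Optional

# Alphabet for the 4-character sequence field: 0-9 then a-z (36 symbols).
SEQ_ALPHABET = string.digits + string.ascii_lowercase


def encode_seq(n: int) -> str:
    """Encode 0 <= n < 36**4 as a 4-character, zero-filled base-36 string."""
    if not 0 <= n < 36 ** 4:
        raise ValueError("sequence number out of range")
    chars = []
    for _ in range(4):
        n, rem = divmod(n, 36)
        chars.append(SEQ_ALPHABET[rem])
    return "".join(reversed(chars))


def _num4(value: Optional[int]) -> str:
    """Right-justified, zero-filled number, or four blanks when unused."""
    if value is None:
        return " " * 4
    if not 0 <= value <= 9999:
        raise ValueError("number does not fit in four digits")
    return f"{value:04d}"


def make_tag(entity: str, tag_type: str, seq: int,
             dataset: Optional[int] = None, item: Optional[int] = None) -> str:
    """Build an 18-character tag of the form eemmmmnnnnttttssss.

    ASSUMPTION: blank padding and field positions as described in the
    lead-in; the proposal itself does not fix these details.
    """
    # Which numbers each entity kind requires: (dataset, item).
    required = {"B": (False, False), "S": (True, False),
                "I": (False, True), "F": (True, True)}
    if entity not in required:
        raise ValueError("entity must be one of B, S, I, F")
    need_set, need_item = required[entity]
    if (dataset is not None) != need_set or (item is not None) != need_item:
        raise ValueError("wrong dataset/item arguments for this entity")
    if not 1 <= len(tag_type) <= 4:
        raise ValueError("type must be 1 to 4 characters")
    tag = f"{entity:<2}{_num4(dataset)}{_num4(item)}{tag_type:<4}{encode_seq(seq)}"
    assert len(tag) == 18
    return tag
```

For example, make_tag("F", "FORM", 0, dataset=3, item=12) yields "F 00030012FORM0000" under these assumptions.
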
We want to choose a data field size that won't waste too much space for the average tag, but also will not force too many tags into dealing with continuation problems if the data won't fit in one record. If we assume a 4-byte date/time stamp, then I would recommend that the data field have a picture of X96, to make the media record 128 bytes long (10+18+4+96).

Access to this dataset would be through standard IMAGE intrinsics. Perhaps the dataset name could be "$EXTRA". Since dataset names cannot begin with a "$", this will not conflict with any current uses.

I propose that the (internal) dataset number be one higher than the last dataset in the database. For databases that already contain the current maximum of 199 datasets, this would be dataset number 200, which I do not believe will cause any problems. When the maximum number of datasets per database limit is increased soon, perhaps the new maximum could be set at 254 instead of 255 to leave room for this one. If a database restructuring tool is used to add or delete datasets, the tool will be responsible for shifting this dataset up or down accordingly. (Of course, database restructuring tools will need to be aware of the entire mechanism and will need serious changes to support this.)

The new items would have the following names: $TAG, $TIME, and $DATA. They would be "transparent" just as the "$EXTRA" dataset name is.

Calls to existing DBINFO modes will continue to return the same values that they do today. In other words, our new dataset and new data items will not be counted in the set and item counts. No existing program should be affected by this enhancement. Of course, all new programs can access this information without problem.

It seems reasonable to hope that IMAGE/SQL would be able to access this dataset as a System Table.

As far as security is concerned, I hope we can come up with a simple solution here. I think that READ access to anyone and WRITE access to the CREATOR is reasonable.
WRITE access to anyone with PRIV mode is another possibility.

We will probably want a simple program that will "unload" this information from a database into an ASCII file and another that will "reload" this information into another database. A GUI-based program to manipulate these fields would also be nice. However, we do not need HP to implement any of these. If HP provides us with the basic functionality, we can use QUERY until such time as someone (from the community or a third-party vendor) provides us with fancier interfaces.

How might such a facility be used? Here are just a few of the entries from the SIGIMAGE enhancement ballot that would benefit from such a facility:

1. DATE/TIME Data types. Rather than add new datatypes for the various date formats, an "F" and/or "I" tag could be added of type "FORM" (for format) that indicates which HPDATE date format is stored in that field. The data portion would contain something like "D23". Programs that wish to intelligently display some or all of the various date formats would request this tag and deal with it appropriately.

2. Z/Z+/P/P+ IMAGE/SQL problem. Whether or not a "P" or "Z" type field should be signed can also be stored in a "FORM" tag. Or, if we wish to keep things separate, we could define a "SIGN" tag. IMAGE/SQL could be made to use this data field.

3. IMAGE/SQL split information. IMAGE/SQL splits are lost when a database is detached and reattached to a DBE. Split information could be stored in "F" and/or "I" tags of type "SPLT". IMAGE/SQL could then automatically use this information at database attach time.

4. NULL Items/Default values. A tag of type "DFLT" could be used for handling items with NULL values. Either IMAGE itself could use the default value for fields not included in a DBPUT field list (I don't like that myself), or perhaps an application program (or IMAGE/SQL) could use the value if a NULL ITEM is encountered.

5. Tool "memory area". "B" level tags could be used with types of "ADAG" for Adager, "BRAD" for Bradmark, etc., where vendors wish to leave notes in a database, such as history, audit trails, work in progress, and checkpoint information. "S", "I", and "F" level tags would also be possible, if so desired.

6. Tracking database files. "B" level tags of type "FILE" could be used to associate other files with a database, such as third-party indexing files, schemas, data dictionaries, or any other file that should be backed up and restored along with the database itself. The data field would contain the POSIX name of the associated file. This tag would be useful to backup products.

7. BLOBs. It was suggested that an "F" or "I" tag, possibly of the "FORM" type, could indicate that the field is a BLOB pointer, and could even indicate which type (MPE-file name, POSIX-file name, URL, etc.).

8. COBOL picture. COBOL programmers have long wanted a place to associate a COBOL picture with a data item, to determine (among other things) where the implied decimal point belongs. Such information could be stored in a "PIC " type tag.

9. Other IMAGE/SQL data. The entire process of attaching databases to IMAGE/SQL is a much-talked-about topic. Perhaps if additional information is stored as tags, then the process of detaching and reattaching can be made less painful.

10. Broken chain information. When IMAGE encounters a serious error and is about to (maybe) create an I-file and (maybe) mark the root file as bad, perhaps it can add entries of type "BRKN" to help point database repair tools at which dataset and record are not well, and what appears to be wrong. This could significantly reduce repair time on huge databases.

11. A type of "LOCL" should also be defined so that database administrators could add their own information to a database.

These are just the first potential uses that come to mind. I am sure that everyone can come up with several more. Perhaps the best use will come somewhere down the road.
The nice thing about this feature is its extensibility.

We will need some entity to maintain a list of known tags to avoid conflicts among tools. This could be the job of SIGIMAGE, SIGSOFTVEND, Stan, Tien-You, or someone else. Some tags, such as the Adager or Bradmark tags, would have their designs left entirely to their "owners". However, other tags, such as the format tag, "FORM", will have their use shared amongst applications, such as QUERY, IMAGE/SQL, and several third-party products. The design of these tags should be coordinated to address everyone's needs.

Other proposals have found other places to stash these data, and other mechanisms with which to retrieve them. But think: We have a classic case of data storage and retrieval here. What facility do we use in such a case? Why, IMAGE of course! It provides security, logging, Query access, SQL access, and all of the other things that have brought us together in the first place. What could be easier and more straightforward for our data storage and retrieval purposes?

Please remember: I can think of all kinds of ways to complicate this and perhaps add to its potential uses. But, the more complicated we make it, the less likely that it will ever be implemented. The key here is to design a simple, extensible, backward compatible facility that can be used in the solution of several problems, not a panacea for the world's problems.

------------