


                      A Proposal for TI-DOS Archiving


     In recent months, the TI community has begun to look at various means
of storing programs and data in single files known as archives. This has
created a number of problems, not the least of which is the fact that for
each programmer who attempts to develop such an archiving method, there
are likely to be several competing methods already released.

     I have prepared this short file to spur discussion of the questions
that the subject of archiving raises. Several ideas are presented herein,
but none of them is cast in stone. I hope they will prompt the inventive
in the TI community to consider the questions and to develop solutions
*now* instead of later, when it will be much more difficult.

     The most well-known of the TI archive formats is the format developed
by Barry Traver. It uses a simple 18 byte file header that simply
summarizes the essential data for that file from the original TI-style
file header. (The following information was extracted from the ARCDOC file
prepared by Al Beard.) Its format is as follows:



           BYTE #         Description

            0-9         10 Character (MAX) file name.  Unused characters
                        are space characters.

            10          File Status Flags:
                                          Bit No   ON=1         OFF=0
                           >00 Dis/Fix      0   Program File   Data File
                           >01 Program      1   Internal       Display
                           >02 Int/Fix      2   Reserved
                           >80 Dis/Var      3   Write Prot     No Write Prot
                           >82 Int/Var      4   Reserved
                                           *5   Squeezed       Unsqueezed
                         *Non-standard      6   Reserved
                            meaning         7   Variable Len   Fixed Len

            11          Maximum Number of Records/Sector or AU

            12-13       Total Number of Sectors Used (unsqueezed archive) or
                        Total Number of 128 byte Records Used (squeezed archive)


            14          End of File Offset

            15          Logical Record Length

            16-17       Number of Fixed Length Records or Number of Sectors
                        Used by Variable Length Records

        The last 128 byte header record contains the characters "END!" in
        the last four bytes.  Unused header record slots (to fill out a
        128 byte record) contain zeroes.

            These 18 byte mini file headers are packed fourteen to two
            128 byte records (for a total of 252 bytes).  The remaining
            unused four bytes per sector contain either zeroes (meaning
            this is NOT the last header sector) or the characters END!
            (meaning this is the last header sector).

         d. Following the header section of the archive is the data
            section.  Each 256 byte sector of the file is packed as
            two 128 byte records.
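
     As a rough illustration of the layout just described, the Python
sketch below decodes one 256 byte header sector into its 18 byte mini
headers. The names MINI_HEADER, MiniHeader, describe_flags and
parse_header_sector are made up for this example. The byte order of the
two-byte fields is not stated above, so most-significant-byte-first is
only an assumption here; check ARCDOC before relying on it.

```python
import struct
from collections import namedtuple

# Assumed layout of one 18-byte mini header. Big-endian 16-bit fields
# are a guess; the description above does not give the byte order.
MINI_HEADER = struct.Struct(">10sBBHBBH")

MiniHeader = namedtuple(
    "MiniHeader",
    "name flags recs_per_sector total_sectors eof_offset rec_length rec_count",
)

ENTRIES_PER_SECTOR = 14   # 14 * 18 = 252 bytes, leaving 4 for the marker
END_MARKER = b"END!"


def describe_flags(flags):
    """Turn the status byte into a TI-style type string such as 'DIS/VAR'."""
    if flags & 0x01:                      # bit 0: program file
        kind = "PROGRAM"
    else:
        kind = ("INT" if flags & 0x02 else "DIS")          # bit 1
        kind += "/" + ("VAR" if flags & 0x80 else "FIX")   # bit 7
    if flags & 0x08:                      # bit 3: write protected
        kind += " protected"
    if flags & 0x20:                      # bit 5: squeezed (non-standard)
        kind += " squeezed"
    return kind


def parse_header_sector(sector):
    """Return (entries, is_last) for one 256-byte header sector."""
    if len(sector) != 256:
        raise ValueError("header sector must be 256 bytes")
    entries = []
    for i in range(ENTRIES_PER_SECTOR):
        raw = sector[i * 18:(i + 1) * 18]
        if raw == bytes(18):              # unused slot, filled with zeroes
            continue
        fields = MINI_HEADER.unpack(raw)
        name = fields[0].decode("ascii", errors="replace").rstrip(" ")
        entries.append(MiniHeader(name, *fields[1:]))
    return entries, sector[252:256] == END_MARKER
```

     A reader would call parse_header_sector on successive header
sectors until the second value comes back true, and only then begin
reading the data section.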


     There are two fundamental problems with the Traver format when used
for archiving. The minor one is that these mini-headers are not sized as
a power of 2, which complicates (only slightly) the archive directory
maintenance functions. More serious is the fact that the format locks out
any future expansion unless some contorted methodology is used to find
"expansion headers" and then to decode them.
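
     The minor point can be made concrete with a little arithmetic. The
Python sketch below compares how the byte offset of directory entry n
is found in each case. Both function names are invented for this
example, and the power-of-two size is left as a parameter.

```python
def traver_entry_offset(n):
    """Offset of entry n when 18-byte entries are packed 14 to a
    256-byte sector, with 4 bytes left over at the end of each sector."""
    sector, slot = divmod(n, 14)      # needs a divide and a remainder
    return sector * 256 + slot * 18   # plus a multiply by 18


def pow2_entry_offset(n, size_log2=5):
    """Offset of entry n when every entry is 2**size_log2 bytes long.
    Entries then tile a 256-byte sector exactly, so a shift is enough."""
    return n << size_log2
```

     With 18 byte entries the program must divide by 14 and skip the
four spare bytes in every sector. With a power-of-two size the offset
is a single shift and no entry ever straddles a sector boundary.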

     To alleviate this, I suggest that a different format be used for each
directory entry.


           BYTE #         Description

            0-9         10 Character (MAX) file name.  Unused characters
                        are space characters.

            10          File Status Flags:
                                          Bit No   ON=1         OFF=0
                           >00 Dis/Fix      0   Program File   Data File
                           >01 Program      1   Internal       Display
                           >02 Int/Fix      2   Reserved
                           >80 Dis/Var      3   Write Prot     No Write Prot
                           >82 Int/Var      4   Reserved
                                           *5   Squeezed       Unsqueezed
                         *Non-standard      6   Reserved
                            meaning         7   Variable Len   Fixed Len

            11          Maximum Number of Records/Sector or AU

            12-13       Total Number of Sectors Used


            14          End of File Offset

            15          Logical Record Length

            16-17       Number of Fixed Length Records or Number of Sectors
                        Used by Variable Length Records

            20-30       RESERVED FOR FUTURE EXPANSION
                        (Preserve time/date stamps, etc.)

            31          Header Addendum Flag


     This directory structure will allow us to (a) meet future
requirements, (b) add additional info to the archived files if needed, and
(c) greatly simplify archive directory management algorithms, since each
directory entry is exactly 32 bytes long.
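
     To show how little code a fixed 32 byte entry needs, here is a
minimal Python sketch that packs and unpacks one. ENTRY, pack_entry and
unpack_entry are names invented for this example. Two assumptions are
made: two-byte fields are stored most significant byte first, and bytes
18-19, which the table above does not assign, are treated as part of
the reserved area.

```python
import struct

# name, flags, recs/sector, total sectors, EOF offset, record length,
# record count, 13 reserved bytes (18-30, an assumption), addendum flag
ENTRY = struct.Struct(">10sBBHBBH13sB")
assert ENTRY.size == 32


def pack_entry(name, flags, recs_per_sector, total_sectors,
               eof_offset, rec_length, rec_count, has_addendum=False):
    """Build one 32-byte directory entry."""
    padded = name.encode("ascii").ljust(10, b" ")
    if len(padded) > 10:
        raise ValueError("file name longer than 10 characters")
    return ENTRY.pack(padded, flags, recs_per_sector, total_sectors,
                      eof_offset, rec_length, rec_count,
                      bytes(13), 1 if has_addendum else 0)


def unpack_entry(directory, index):
    """Read entry number `index` straight out of the directory bytes."""
    raw = directory[index * 32:(index + 1) * 32]
    (name, flags, rps, sectors, eof,
     rec_length, rec_count, _reserved, addendum) = ENTRY.unpack(raw)
    return {
        "name": name.decode("ascii").rstrip(" "),
        "flags": flags,
        "recs_per_sector": rps,
        "total_sectors": sectors,
        "eof_offset": eof,
        "rec_length": rec_length,
        "rec_count": rec_count,
        "has_addendum": addendum != 0,
    }
```

     Because every entry is the same size, entry number `index` is
found by plain multiplication, and eight entries fill a 256 byte sector
exactly.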

     I also realize that the suggestion for the use of bytes 12-13 does
not conform with Al Beard's current methods for squeezing files. However,
it seems to me that there is a good reason to attempt to simplify the use
of that field. If, however, Al can convince us otherwise, it might be
possible to allow that field to have a dual nature.

     The RESERVED bytes provide area for immediate expansion if there is
need. The header addendum flag is a simple zero/non-zero flag that
indicates whether additional header info is to be found with the actual
archive file member. Since the flag itself needs only a single bit, as is
the case with the file status flags, the remaining bits of that byte could
be designed to carry other information. If the header addendum flag is
set, then additional information will be found preceding the actual file
in the archive. This additional information can be arranged in any number
of formats, but the essential item is the very first word, which indicates
how many additional bytes of data are associated with the header addendum.
Beyond that, I have not attempted to define the header addendum area,
since I am not sure how it might be used or whether it will ever truly be
used.
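
     A reader for such an addendum could be as small as the Python
sketch below. The function name split_member is invented for this
example. It assumes that the "word" is 16 bits stored most significant
byte first, and that the count does not include the word itself;
neither point is settled by this proposal.

```python
import struct


def split_member(blob, has_addendum):
    """Split one archive member into (addendum_bytes, file_bytes).

    `blob` is everything stored for the member; `has_addendum` is the
    header addendum flag taken from its directory entry.
    """
    if not has_addendum:
        return b"", blob
    if len(blob) < 2:
        raise ValueError("member too short to hold an addendum length")
    (length,) = struct.unpack_from(">H", blob, 0)   # the very first word
    end = 2 + length
    if end > len(blob):
        raise ValueError("addendum length runs past the end of the member")
    return blob[2:end], blob[end:]
```

     A program that does not understand the addendum can still use the
length word to skip over it and reach the file itself.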

     I realize that this proposal calls for a serious overhaul of the
current archive format but I believe that the overhaul would be
worthwhile. To perpetuate a less than optimal archive format will
complicate our programming efforts and perhaps lock us into a system that
is not as flexible or responsive to our changing needs.

     On the other hand, an optimal archive format will allow us to do a
variety of things, such as writing utilities that run infrequently used
programs directly from inside an archive, or that extract a particular
file (compressed or not) for transmission over data lines or simply for
typing to the screen.

     On yet another note, we might ask ourselves if we wish to design in
the capability to expand an existing archive. There is precedent for this.
In the CP/M world, the library utilities allow the user to specify the
number of library directory entries in the library at the time it is
formed. This assumes that the creator has a reasonable idea of how many
entries will be required since the archive directory fully precedes the
files embedded in the archive. (NOTE: There is a way to avoid this. It
involves using a header followed by the embedded file, followed by the
next header and file, and so on. This requires that the header contain a
pointer to the next header in the archive. The disadvantage of this
technique is that the random access nature of the archive is destroyed.
This is the sort of tradeoff that needs to be fully considered by the TI
community before implementation.)
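
     The cost of the chained layout can be seen in a short Python
sketch. The 12 byte link header used here (a 10 character name plus a
two-byte offset of the next header, zero meaning "no more") is purely
hypothetical, as are the names LINK, build_chain and find_member.

```python
import struct

# Hypothetical link header: file name, then byte offset of next header.
LINK = struct.Struct(">10sH")


def build_chain(members):
    """Build archive bytes from a list of (name, data) pairs."""
    out = bytearray()
    for i, (name, data) in enumerate(members):
        start = len(out)
        is_last = i == len(members) - 1
        next_off = 0 if is_last else start + LINK.size + len(data)
        out += LINK.pack(name.encode("ascii").ljust(10), next_off)
        out += data
    return bytes(out)


def find_member(archive, wanted):
    """Walk the chain; return (data, number_of_headers_visited)."""
    if not archive:
        raise KeyError(wanted)
    offset, visited = 0, 0
    while True:
        name, next_off = LINK.unpack_from(archive, offset)
        visited += 1
        data_start = offset + LINK.size
        data_end = next_off if next_off else len(archive)
        if name.decode("ascii").rstrip() == wanted:
            return archive[data_start:data_end], visited
        if next_off == 0:
            raise KeyError(wanted)
        offset = next_off
```

     New members can be appended without reserving directory space in
advance, but finding the last member means reading every header before
it, each of which sits in a different part of the file.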

     Finally, there is one last issue that I wish to raise for
consideration. I personally believe that we should separate the archiving
process from the compression process totally. That is to say, a file in an
archive can be extracted in a compressed form or in an uncompressed form.
Likewise, a file can be compressed outside an archive and later added to
the archive. I believe that by keeping the compression and archiving
processes separate, we can not only enjoy both but also upgrade either
one without directly impacting the other.
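
     One way to picture this separation is the Python sketch below, in
which the archive layer stores whatever bytes it is given and merely
records the squeezed bit. The run-length coder is only a stand-in
chosen for brevity; it is not Al Beard's squeeze method, and the names
Archive, rle_squeeze and rle_unsqueeze are invented for this example.

```python
SQUEEZED = 0x20   # bit 5 of the file status flags


def rle_squeeze(data):
    """Stand-in compressor: emit (run length, byte) pairs."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_unsqueeze(data):
    """Inverse of rle_squeeze."""
    out = bytearray()
    for i in range(0, len(data), 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


class Archive:
    """Stores payloads as opaque bytes; it never compresses anything."""

    def __init__(self):
        self.members = {}              # name -> (flags, payload)

    def add(self, name, payload, flags=0):
        self.members[name] = (flags, payload)

    def extract(self, name):
        return self.members[name]      # returned exactly as stored
```

     A file squeezed before it is added comes back out still squeezed,
ready for transmission, and can be unsqueezed later by a separate
program. Replacing the compressor changes nothing in the Archive class.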

     The purpose of this short file is to get the TI community thinking
about the archive problem. I hope that it helps accomplish that. Please
note that all of the ideas presented herein are subject to change and
improvement. They are presented simply as alternatives to the current way
of doing things.
