Open for Random in Visual Basic®

by Rick Meyer

Table of Contents

  1. What are random access files?
  2. The advantages of random access files.
  3. The basics of opening for random.
  4. Visualizing the bits on the disk.
  5. Files are created if they do not exist.
  6. The three file operations.
  7. About file buffering.
  8. The user defined type.
  9. An example with a UDT.
10. Put and Get examples.
11. Not possible to shorten files.
12. The Mini-Database demo project.

What are random access files? When I first heard the term random I was perplexed. I thought, "Do they really mean that we have to somehow deal with random data in our programs - or has this file system been developed only for games of chance where just certain programs need to get special data randomly?"

No, that turned out not to be the case. What was actually developed was a way to place and retrieve chunks of data, called records, rapidly at any place in a file. Visual Basic does considerable work behind the scenes, positioning within the file and buffering the data stored on the physical disk, so that all of this happens with just a few simple commands. Still, it can be confusing for those who have become familiar with the commands for sequential file I/O (Input and Output).

Random access files have several advantages over sequential access files in Visual Basic.

  • Whereas sequential reads and writes allow for only ASCII representations of data in a file (text), random access allows both binary and ASCII. The benefit you obtain with binary data is primarily with numbers which can be stored in fixed lengths that require surprisingly small amounts of file space. For instance the number 2,147,483,647 requires only 4 bytes when stored as binary data as opposed to the 10 bytes (1 for each digit) it would take to store it as ASCII data.

  • There are no delimiters required in random access comparable to the commas, spaces, and newline characters that are required between each item of data in a sequential file.

  • Random access will allow you to put and get from any position in the file the chunk of data that you want. Generally* sequential access files must read through all previous data in a file before the desired location is reached. Writes must be appends to the end of a sequential file.

    * An indexing system of file positions for data in a sequential file could be devised, and then using the Seek statement, one could access data in a rapid manner sequentially. However, such a system is relegated to read only, except for appending to the end as stated above.
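
    To make the storage comparison in the first advantage concrete, here is a minimal sketch. The procedure name CompareStorage is my own invention for illustration. It relies on the Len function returning the number of bytes needed to store a non-string variable.

```vb
' Hypothetical demo procedure; run it and read the Immediate window.
Private Sub CompareStorage()
    Dim n As Long
    n = 2147483647                   ' largest value a Long can hold

    Debug.Print Len(n)               ' prints 4: bytes needed for the binary Long
    Debug.Print Len(CStr(n))         ' prints 10: characters needed for the text form
End Sub
```
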
    The basics of Open For Random include:

  • First - provide the type of variable for file I/O.
  • Second - provide a file handle variable.
  • Third - provide a variable for the number of records.
  • Fourth - get a file handle.
  • Fifth - determine the length of the record for the Open statement.
  • Sixth - issue the Open statement.
  • Seventh - determine the number of records existing in the file.
    1. Dim recData&
    2. Dim fHnd%
    3. Dim numRecs&
    
    Private Sub Form_Load()
    4. fHnd = Freefile
    5. numRecs = Len(recData)
    6. Open "c:\myDir\myFile.ext" For Random As fHnd Len = numRecs
    7. numRecs = Lof(fHnd) \ numRecs
    End Sub
    
    In this simplest of examples just a Long intrinsic variable type is used as the record. It is only four bytes long. In fact, had we specified the Byte Type, each record would be only one byte long.

    A fairly standard programming practice is to use a file handle variable, as in steps 2 and 4, although in the Open statement you could simply use "As #1". It is wise to adopt this practice for your projects since you will frequently have several random files open at once, and the best way to keep track of them is with file handle variables.

    Probably the one thing in the above code that is new to you is the Len = clause in the Open statement. Visual Basic needs it to know exactly how to partition data in the file buffer, so that adjacent records are not overwritten with garbage as you put single records to various record locations. Since we know that a Long is 4 bytes, you could use Len = 4 here. But it is advisable to get used to the Len() function, as in step 5, to get the length of your variable because, as you will soon see, the I/O variable can become very complex.

    Note how my practice is to allow numRecs to do double duty initially as the record length for the Open statement, since I rarely use that value again. You may prefer to use a separate variable.

    In this example I am showing the execution code in the Form_Load event, but of course, you could have it in any procedure. Also, I will almost always declare the three variables at the Form or Module level since they usually need to be visible in several Functions and Subs.

    If it were possible to see the first three records of data bits on disk of the random access file opened with the above code, they might look something like this. I have stacked the records one on top of another which is how I usually think of them when I write programs. But actually they are one continuous long trail of bits that the disk controller forms into bytes and then the operating system and Visual Basic group them into records as commanded with the open statement.

    
                          File Contents
    Rec  Byte  Value     Byte  Value     Byte  Value     Byte  Value
      1     1  01011101     2  11010011     3  11001101     4  11001000
      2     5  01101000     6  10111001     7  00011110     8  01111001
      3     9  00001101    10  01111010    11  11010101    12  10111011
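
    The byte numbers in the table follow a simple rule: record n starts at byte (n - 1) * record length + 1. The helper below is only a sketch of that arithmetic. The name RecordStartByte is hypothetical, and Visual Basic does this positioning for you when you supply a record number.

```vb
' Hypothetical helper: the 1-based position of the first byte of a record.
Private Function RecordStartByte(ByVal recNum As Long, ByVal recLen As Long) As Long
    RecordStartByte = (recNum - 1) * recLen + 1
End Function

Private Sub ShowRecordStart()
    ' Record 3 of a file of 4-byte Longs, as in the table.
    Debug.Print RecordStartByte(3, 4)    ' prints 9
End Sub
```
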
    

    Note that when you use the Open For Random statement, if the file that you have named does not exist, it will be created automatically. In contrast, if you were to attempt to Open For Input a sequential file that does not exist, a trappable error would be generated.

    This is one of the reasons why it is advisable to determine the number of records in the file immediately after you open it. Where you do not wish to create a file, it may also be advisable to check for its existence first. The Dir$ function is one way of doing so.
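
    A minimal sketch of such a check with Dir$ follows. The names FileExists and OpenOnlyIfPresent are hypothetical. Note that Dir$ also matches wildcards, and that calling it resets any Dir$ directory scan already in progress.

```vb
' Hypothetical helper: True if a file matching filePath exists.
Private Function FileExists(ByVal filePath As String) As Boolean
    On Error Resume Next                     ' e.g. an unavailable drive leaves the result False
    If Len(filePath) = 0 Then Exit Function  ' treat an empty path as "no file"
    FileExists = (Len(Dir$(filePath)) > 0)
End Function

Private Sub OpenOnlyIfPresent()
    If FileExists("c:\myDir\myFile.ext") Then
        ' Safe to Open For Random without creating a new, empty file.
    Else
        MsgBox "The data file was not found."
    End If
End Sub
```
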

    Now that you have the file open, there are only three file operations to consider:
    • Put • Get • Close

    In each instance the first parameter is the same, fHnd, the integer file handle that was used for the open statement. Put and Get each have two additional parameters. The first is a Long Type number indicating which record you want. The second is the name of the data variable to accept the Get data (for a Put operation the second is the data you want to place in the file). Visual Basic knows how long that second variable is, so it knows exactly how many bytes to place in or retrieve from the file.

    So the resultant full commands look like:

  • Put fHnd, RecordPosition&, recData
  • Get fHnd, RecordPosition&, recData
  • Close fHnd
    Note that RecordPosition is a Long variable. You could use an Integer, but a Long is preferred because an Integer cannot hold a record number above 32,767.

    It is important to know that as you perform your Put operations, Visual Basic keeps the data in a file buffer in memory. When the actual physical writes to the disk take place is unpredictable.

    As an experiment, open a new random access file and Put records in it. Then, without ending the program or closing the file, switch to Windows Explorer and locate that file. What you will probably see is that the file is there but the file size is zero. This is because the records are still in memory in the file buffer. But the instant you close the file (ending the program will automatically close the file), the file size will change in Windows Explorer, indicating that the buffer has been flushed to the disk.

    Closing the file is the one predictable flush you have at your disposal. If record security is crucial (e.g. if you worry about a power outage) or if you have multiple users accessing this file, then it might be a good idea to close the file after each Put, and then immediately reopen it.
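
    The close-and-reopen idea could be sketched as below for the simple Long record used so far. PutAndFlush and its parameter names are hypothetical. The file handle is passed ByRef because FreeFile may hand back a different number when the file is reopened.

```vb
' Hypothetical helper: writes one record, then closes and reopens
' the file so the record reaches the disk.
Private Sub PutAndFlush(ByRef fileNum As Integer, ByVal fileName As String, _
                        ByVal recNum As Long, ByRef rec As Long)
    Dim recLen As Long
    recLen = Len(rec)

    Put fileNum, recNum, rec
    Close fileNum                    ' closing flushes the buffer to disk

    fileNum = FreeFile               ' get a handle again for the reopen
    Open fileName For Random As fileNum Len = recLen
End Sub
```

    With the form-level variables from the seven steps, a call might look like: PutAndFlush fHnd, "c:\myDir\myFile.ext", numRecs, recData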

    Now for the slightly complicated business of the UDT, the user defined type. In the seven-step example above only a simple Long variable has been used. In fact, this is used more often than you might expect as an index file of another random access file (see the reference at the bottom of the page).

    However, for any data file to be of use to us it must be able to store considerably larger records, and these records need to contain lots of information of different types. This is accomplished by first defining your own user defined type, and secondly declaring a record variable as that type. A simple example is:

    
    Private Type recType
        DeleteCode As Byte
        SortFlag As Byte
        CustCode As Long
        CustFirstName As String * 16
        CustLastName As String * 18
    End Type
    
    Dim recData As recType
    
    If you have seen types before then maybe the only thing new to you is the use of the * 16 after the String. It is how to define a fixed length string in Visual Basic. The above type has one fixed length string 16 bytes long and another 18 bytes long. One byte is required to represent each ASCII character. In Open For Random each record must be the same length, and to ensure this, it is necessary to define the length of strings. It is only done for strings because VB knows the length of all other data types (Integers, Longs, etc).

    One thing that you always have to carefully consider at the design stage is how long you want these lengths to be. They must be long enough to hold the longest value (within reason), but then you have to balance that with how much space is wasted on the disk. In the above example if in one record the CustFirstName is "Rick", then you can see you will have 12 wasted bytes of disk space on your hard drive.

    Also, if records are very large, it will affect how fast VB is able to Get and Put them, because the file buffer is a fixed size and more actual physical reads and writes are required as your record size increases.

    UDTs are interesting to deal with after you have declared them. Since they are made up of several different variables, there must be a way to specify which variable in your UDT you want. This is done with the period ".". Suppose you just got (with Get) recData from the random access file; then the customer's first name is:

    recData.CustFirstName
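
    One thing to keep in mind is that a fixed length string always holds its full length. When you assign a shorter value, Visual Basic pads it on the right with spaces, and RTrim$ strips them again. A small sketch, using a local variable of the same length as CustFirstName (the procedure and variable names are hypothetical):

```vb
Private Sub ShowPadding()
    Dim fixedName As String * 16     ' same length as CustFirstName
    Dim trimmed As String

    fixedName = "Rick"               ' the remaining 12 characters become spaces
    trimmed = RTrim$(fixedName)      ' remove the trailing spaces

    Debug.Print Len(fixedName)       ' prints 16
    Debug.Print Len(trimmed)         ' prints 4
End Sub
```
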

    Note: When you declare them, fixed length strings are initialized to the null character, Chr$(0). If you want them to be spaces (so that you do not have null chars in your file) then you would have to do it yourself. An example would be:

    recData.CustLastName = Space$(18)

    Tip: When you declare your recData, declare another called recBlank and in your initialization code go through the above process of making all the fixed length strings spaces. Then whenever you need to blank out your recData I/O variable you need only use the line:
    recData = recBlank
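
    A sketch of this tip, assuming the recType shown earlier. The procedure names InitBlankRecord and ClearRecord are hypothetical. Using Len() on each field avoids repeating the 16 and 18 from the type definition.

```vb
Private Type recType
    DeleteCode As Byte
    SortFlag As Byte
    CustCode As Long
    CustFirstName As String * 16
    CustLastName As String * 18
End Type

Dim recData As recType
Dim recBlank As recType

' Call once from your initialization code, e.g. Form_Load.
Private Sub InitBlankRecord()
    ' The numeric fields of recBlank already start at zero.
    recBlank.CustFirstName = Space$(Len(recBlank.CustFirstName))
    recBlank.CustLastName = Space$(Len(recBlank.CustLastName))
End Sub

' Call whenever the I/O variable needs to be blanked out.
Private Sub ClearRecord()
    recData = recBlank
End Sub
```
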
    Now to combine what has been covered so far using my seven steps:
    
    1. Private Type recType
           DeleteCode As Byte
           SortFlag As Byte
           CustCode As Long
           CustFirstName As String * 16
           CustLastName As String * 18
       End Type
    
       Dim recData As recType
    2. Dim fHnd%
    3. Dim numRecs&
    
    Private Sub Form_Load()
    4. fHnd = Freefile
    5. numRecs = Len(recData)
    6. Open "c:\myDir\myFile.ext" For Random As fHnd Len = numRecs
    7. numRecs = Lof(fHnd) \ numRecs
    End Sub
    
    As you see the only difference is that recData is declared as recType instead of as a Long. In step 5, Visual Basic knows the length of the declared type so Len(recData) will return 40, and then in 6 the Open statement will set all internal buffering up to expect records 40 bytes long.
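
    Step 7 uses integer division, so a file of 120 bytes holds 120 \ 40 = 3 records. If the file was written with a different record length, the division may leave a remainder that goes unnoticed. The check below is an optional sketch, not part of the seven steps, and the name CountRecords is hypothetical.

```vb
' Hypothetical helper: returns the record count, or -1 if the file
' size is not a whole multiple of the record length.
Private Function CountRecords(ByVal fileNum As Integer, ByVal recLen As Long) As Long
    If LOF(fileNum) Mod recLen <> 0 Then
        CountRecords = -1            ' size does not match this record type
    Else
        CountRecords = LOF(fileNum) \ recLen
    End If
End Function
```

    After the Open statement you might call it as: numRecs = CountRecords(fHnd, Len(recData))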

    Note that only a single byte is required for each flag. Here DeleteCode indicates that this record is set for deletion, and SortFlag indicates that a future sort function should sort on the first name. These are just my examples of flags. You might consider others when you plan a record type. (See how the flags are set with Checkboxes below.)

    Now the only thing left is to show examples of a Put and Get:
    
    numRecs = numRecs + 1
    
    recData.DeleteCode = Check1.Value
    recData.SortFlag = Check2.Value
    recData.CustCode = 10000 + numRecs
    recData.CustFirstName = Text1.Text
    recData.CustLastName = Text2.Text
    
    Put fHnd, numRecs, recData
    
    This shows how to add a new last record to a random access file. Note that we are just generating the Customer Codes based on the record number. Our first customer will be customer number 10001.

    Assumed are two checkboxes and two textboxes on a Form where the data could be input. Normally there is validation code before you arrive at the Put statement to ensure that the record you are about to put has good data in it.

    One interesting thing is that you may actually Put a record in position 203 even if there are only 15 records in a file. Visual Basic will expand the file size such that there are 202 records before number 203. I have never found a reason to do this.

    Get is the reverse process:
    
    If x < 1 Or x > numRecs Then Exit Sub
    
    Get fHnd, x, recData
    
    Check1.Value = recData.DeleteCode
    Check2.Value = recData.SortFlag
    Label1.Caption = recData.CustCode
    Text1.Text = recData.CustFirstName
    Text2.Text = recData.CustLastName
    
    You must supply the x (a Long), of course, as the record number you want. Another reason to have numRecs declared at the form or module level is to check that x is in the range between 1 and numRecs before you attempt a Get.
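
    The same Get can be used in a loop over every record. The sketch below assumes the form-level fHnd, numRecs and recData (declared As recType) from the seven steps, plus a hypothetical ListBox named List1. It skips records whose DeleteCode flag is set and uses RTrim$ to drop the spaces that pad the fixed length strings.

```vb
' Hypothetical procedure: lists every customer not flagged for deletion.
Private Sub ListActiveCustomers()
    Dim j As Long

    List1.Clear
    For j = 1 To numRecs
        Get fHnd, j, recData
        If recData.DeleteCode = 0 Then       ' 0 means the delete checkbox was not checked
            List1.AddItem RTrim$(recData.CustLastName) & ", " & _
                          RTrim$(recData.CustFirstName)
        End If
    Next j
End Sub
```
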
    One peculiarity is that you cannot shorten random access files. If you try to delete a record by moving all the following records one position lower (with a Get of the one above and then a Put), and then blanking out the data in the last record, you will be annoyed to find that there is no way to tell VB or the system to change the file size so that the last record does not show up anywhere.

    The answer to the problem is to Open For Random a temp file of the same record length, Get the good records from your good file consecutively, Put them in the temp file consecutively, Close both files, Kill the good file, and finally Name tempfile as goodfile. (A new file has to be created.) Here is an example:

    
    Dim recData&
    Dim fHnd%
    Dim numRecs&
    
    Private Sub SomeSub()
        Dim GoodFile$, TempFile$, fHndTmp%
        
        GoodFile = "c:\myDir\myFile.ext"
        TempFile = "c:\myDir\tempf"
        
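        numRecs = Len(recData)
        
        'Open each file right after its FreeFile call. Two FreeFile
        'calls in a row would return the same file number.
        fHndTmp = FreeFile
        Open TempFile For Random As fHndTmp Len = numRecs
        fHnd = FreeFile
        Open GoodFile For Random As fHnd Len = numRecs
        numRecs = LOF(fHnd) \ numRecs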
        
        Dim cnt&, j&
        For j = 1 To numRecs
            Get fHnd, j, recData
            'A possible way to eliminate one record
            'If j <> 33 Then
            'Another possible way
            'Note: single records or many records could
            '      be eliminated depending on this If
            If recData <> 12 Then
                cnt = cnt + 1
                Put fHndTmp, cnt, recData
            End If
        Next
        
        Close fHnd, fHndTmp
        Kill GoodFile
        Name TempFile As GoodFile
    End Sub
    
    The Mini-Database demo project could be described as a "poor man's database" since it simply writes and retrieves your input data as records to and from a random access file and then establishes other random access files to index the records. If you wish, you may download the Mini-Database Project (12K) to see the complete commented source code and random access in operation.

    Another tutorial is: Indexing Random Access Files in Visual Basic.