newRPL: [UPDATED April 27-2017] Firmware for testing available for download
07-27-2016, 08:56 PM (This post was last modified: 09-10-2016 02:10 PM by matthiaspaul.)
Post: #350
RE: newRPL: [UPDATED July-25-16] Firmware for testing available for download
(07-26-2016 09:27 PM)Claudio L. Wrote:  
Quote:Do you support the two entries in the FS info sector already? Do you maintain the media and volume mount flags during mount/unmount/startup/sleep/shutdown?
Not implemented for a few reasons:
a) It's not exactly reliable: if the card comes for example from another 50g with stock firmware (which doesn't update these fields), you get bad info. I figured exchanging cards between calcs would be quite common.
That's right, these values cannot be relied upon; they can only be used with sanity checks in place. However, it's still possible to take advantage of them. The implementation first needs to check these conditions:

- 1. A valid FAT32 BPB is present (to be detailed)
- 2. The 16-bit "logical sector size" at BPB offset +0x00 is larger than or equal to 512 bytes. (In general, FAT32 logical sector sizes can be as small as 128 bytes, however, if an FS info sector is present, the logical sector size must be at least 512 bytes.)
- 3. The 16-bit "FS info sector number" (a logical sector number, not a cluster) at FAT32 BPB offset +0x25 (boot sector offset 0x30) contains a value smaller than 0xFFFF and larger than 0x0000. (These two values indicate that no FS info sector is present.)
- 4. The FS info sector has valid signatures: sector offsets +0x00..+0x03 contain values 0x52 0x52 0x61 0x41, sector offsets +0x1E4..+0x1E7 contain values 0x72 0x72 0x41 0x61, and sector offsets +0x1FC..+0x1FF contain values 0x00 0x00 0x55 0xAA.
- 5. The 32-bit value at offset +0x1EC in the FS info sector is either equal to 0xFFFFFFFF or it is larger than 0x00000001 and smaller than the volume's highest cluster number.

If all these conditions are met, the "last allocated cluster pointer" can be set to the 32-bit value at offset +0x1EC. If the value is valid, using it avoids unnecessary fragmentation on future allocations, and there's no need to scan over a possibly large number of already allocated clusters. If, however, the value is outdated (which is still possible at this stage), the system will find out when it attempts to allocate the next cluster, as that cluster won't be free. The system will then simply continue searching for the next free cluster, so this potential error condition is resolved gracefully. On a volume that is not too fragmented, it will typically still find the next free cluster much earlier than by searching from the start of the FAT, as the outdated pointer is most probably still in the vicinity of the last actual allocation. So even an outdated pointer will not cause any harm, whereas a valid pointer will dramatically speed up the first allocation.
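To illustrate the search described above, here is a minimal C sketch. It assumes the FAT is available as an in-memory array of 32-bit entries indexed by cluster number, which a real driver on a calculator would more likely read sector by sector. The names find_free_cluster, HINT_UNKNOWN and max_cluster are made up for this example and are not taken from newRPL.

```c
#include <assert.h>
#include <stdint.h>

#define FAT32_MASK   0x0FFFFFFFu  /* upper 4 bits of a FAT32 entry are reserved */
#define HINT_UNKNOWN 0xFFFFFFFFu  /* "no usable pointer", search from the start */

/* Hypothetical helper: returns the first free cluster found after 'hint',
 * wrapping around once at the end of the FAT. Returns 0 if no cluster is free.
 * 'fat' must hold entries 0..max_cluster; data clusters are 2..max_cluster. */
uint32_t find_free_cluster(const uint32_t *fat, uint32_t max_cluster,
                           uint32_t hint)
{
    uint32_t total, c, i;

    assert(fat && max_cluster >= 2);
    total = max_cluster - 1;            /* number of data clusters */

    /* An unknown or out-of-range hint degrades to a scan from cluster 2. */
    if (hint == HINT_UNKNOWN || hint < 2 || hint >= max_cluster)
        c = 2;
    else
        c = hint + 1;

    for (i = 0; i < total; i++) {
        if ((fat[c] & FAT32_MASK) == 0) /* 0 marks a free cluster */
            return c;
        /* A stale hint only costs a few extra probes before wrapping. */
        c = (c == max_cluster) ? 2 : c + 1;
    }
    return 0;                           /* volume is full */
}
```

With a valid hint the loop returns on the first probe. With an outdated hint it walks forward over the clusters that were allocated in the meantime, which is still cheaper than starting at cluster 2 on a mostly filled volume.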

If one of these five conditions is not met, the "cluster allocation pointer" will have to be set to 0xFFFFFFFF (for "unknown"), therefore forcing the system to search from the start on the next allocation.
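The checks and the fallback could be sketched in C roughly as follows. This assumes the caller has already validated the BPB (condition 1), has read the logical sector size and the FS info sector number from it, and has loaded the FS info sector into a buffer of at least 512 bytes. The function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FSI_UNKNOWN 0xFFFFFFFFu

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Condition 4: the three signatures of the FS info sector. */
static int fsinfo_signatures_ok(const uint8_t *s)
{
    return le32(s + 0x000) == 0x41615252u   /* bytes 52 52 61 41 */
        && le32(s + 0x1E4) == 0x61417272u   /* bytes 72 72 41 61 */
        && le32(s + 0x1FC) == 0xAA550000u;  /* bytes 00 00 55 AA */
}

/* Hypothetical helper: returns the allocation pointer to start with,
 * or FSI_UNKNOWN if any of conditions 2..5 fails. */
uint32_t fsinfo_alloc_hint(uint16_t sector_size, uint16_t fsinfo_sector,
                           const uint8_t *fsinfo, uint32_t max_cluster)
{
    uint32_t hint;

    assert(max_cluster >= 2);
    if (sector_size < 512)                                  /* condition 2 */
        return FSI_UNKNOWN;
    if (fsinfo_sector == 0x0000 || fsinfo_sector == 0xFFFF) /* condition 3 */
        return FSI_UNKNOWN;
    if (fsinfo == NULL || !fsinfo_signatures_ok(fsinfo))    /* condition 4 */
        return FSI_UNKNOWN;

    hint = le32(fsinfo + 0x1EC);                            /* condition 5 */
    if (hint > 1 && hint < max_cluster)
        return hint;
    return FSI_UNKNOWN;     /* stored 0xFFFFFFFF or an out-of-range value */
}
```

Returning the same "unknown" marker for every failure keeps the caller simple: it never needs to know why the pointer was rejected, only whether it has one.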

For the free cluster counter, conditions 1..4 must be met as well. Additionally, the 32-bit value at offset +0x1E8 in the FS info sector must either be equal to 0xFFFFFFFF (for "unknown") or smaller than the volume's highest cluster number. Even if these conditions are met, the implementation must not rely on this value to be correct unless it is known to be correct by other means (see below).

The actual free space can be calculated alongside other operations on the volume: as soon as the cluster pointer has wrapped around for the first time, the actual free space is known, and it remains known until the volume is unmounted, the medium removed or the system shut down.

Once the free space has been determined this way, the system can immediately fail further allocations if the value is 0. But for as long as the free space is only based on the FS info sector value, the system would still have to scan the FAT for free clusters even if the value already indicated 0. Once it has finished that scan, the actual value is known, so this time-consuming operation will happen only once until the next unmount.
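One way to keep "value from the FS info sector" and "value confirmed by a scan" apart is a small state record, sketched here in C. It again assumes an in-memory FAT array for brevity; the type free_info_t and the function names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FSI_UNKNOWN 0xFFFFFFFFu
#define FAT32_MASK  0x0FFFFFFFu

typedef struct {
    uint32_t free_clusters; /* FSI_UNKNOWN if there is not even an estimate */
    int      verified;      /* nonzero once counted by a full FAT scan */
} free_info_t;

/* At mount: accept the FS info value (offset +0x1E8) as an estimate only. */
void free_info_init(free_info_t *fi, uint32_t fsinfo_value,
                    uint32_t max_cluster)
{
    assert(fi);
    fi->verified = 0;
    if (fsinfo_value != FSI_UNKNOWN && fsinfo_value < max_cluster)
        fi->free_clusters = fsinfo_value;
    else
        fi->free_clusters = FSI_UNKNOWN;
}

/* Full pass over the FAT; afterwards the count can be trusted. */
void free_info_scan(free_info_t *fi, const uint32_t *fat,
                    uint32_t max_cluster)
{
    uint32_t c, n = 0;

    assert(fi && fat);
    for (c = 2; c <= max_cluster; c++)
        if ((fat[c] & FAT32_MASK) == 0)
            n++;
    fi->free_clusters = n;
    fi->verified = 1;
}

/* Allocations may fail immediately only on a verified count of zero. */
int volume_known_full(const free_info_t *fi)
{
    assert(fi);
    return fi->verified && fi->free_clusters == 0;
}
```

An unverified count of 0 deliberately does not make volume_known_full() return true, so the first allocation still triggers the scan. After that, the driver would have to adjust free_clusters on every allocation and deallocation to keep the verified value current.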

When mounting a volume, it is not normally necessary to know the free space, so it is also not necessary to perform the scan immediately. This can be delayed until someone actually wants to know the exact value (SDFREE). In a multi-threading system, the free space scanning could be carried out as a non-blocking background process.

Quote:b) Implementing it requires proper mount/unmount, which means if the user just pulls the card out then putting on a PC will indicate a "dirty" file system and will suggest (or force) disk checking procedures.
Yes, but this is exactly what should happen in this scenario, as the integrity of the file system cannot be trusted any more until after running a disk check utility.
Quote:c) To minimize fragmentation, the file system driver uses that "scan" to locate the largest block of free clusters in the disk.
Yes. Not necessarily the largest block, but on a still unfragmented volume, the last allocation was most probably at the end of the allocated area.
Quote:But now that I added SDHC support, with potentially much slower scan times I may have to rethink that.
At least this is what I would propose, as it is easy to implement (almost no memory and code overhead), it can speed things up considerably if the values are (almost) valid, and it does not cause actual problems if they are not.

Of course, there are other methods to speed up certain access patterns and there are various strategies how to possibly reduce fragmentation on FAT file systems (the above method is part of what is used by DOS and Windows). Unfortunately, they require considerably more complex implementations, more memory for various types of buffers and to hold dynamically built in-memory data structures, background processes - way too complicated for an embedded system, IMHO.

One feature may be worth considering, though: A vast amount of fragmentation is caused by frequent allocations and deallocations of files, as the system would try to maintain the integrity even of void data (to allow later undeletion), and it would thereby effectively cause more fragmentation in this scenario. However, a good amount of such interim file operations could be carried out on temporary files. Therefore, some operating systems (including DOS) have special API functions for temporary files. They do not only ensure the creation of unique file names (so the user does not have to be bothered with them), but using these functions will also cause the file system to use different allocation / deallocation strategies. The file system would no longer attempt to maintain deleted directory entries and use "fresh" clusters for new allocations, but it would try to reuse previously freed entries.

Something like this could be implemented in newRPL as well. On the command line there could be a number of "reserved file names" which the system would recognize as temporary files. The on-disk file names could use a special pattern so that the system can recognize them as temporary files (even if they are left-overs from previous sessions). The file system could thereby automatically remove orphaned temporary files.

Quote:Semicolon is not an allowed character in 8.3 names, so this is only for long names. SFN names do use the numeric tail in the classic way. But since potentially conflicting names are allowed, the SFN should not be relied upon when the LFN exists.
One example:
Let's say I create "BigFile", which has a SFN "BIGFILE".
Now I want to create "BIGFILE", which has a conflict with the previous SFN. newRPL will create LFN="BIGFILE;" and the SFN="BIGFIL~1" will be used. Now somebody reading only SFN names might think BIGFILE is "BIGFILE", when in reality it is "BigFile". Not sure if it makes sense, but that's why SFN names shouldn't be relied upon.
It does make sense. It's perfect this way.
Quote:Actually, modern windows sometimes assigns a completely random SFN, not sure under which conditions, but this is perfectly valid.
Yes, I know, there are also a number of other special cases:

- If a filename fits into the 8.3 format with all characters uppercase, Windows can be configured to only create a SFN and skip creating the unnecessary LFN (thereby avoiding unnecessary clutter in the filesystem).

- If a filename fits into the 8.3 scheme and either contains only lowercase letters or combines a lowercase filename and an uppercase extension or vice versa, the creation of an LFN can be suppressed as well. In this case only an SFN is created and the case information is stored in bits 4 and 3 at offset 0x0C in directory entries, so that the LFN can be recreated from the SFN later on.

- Further, Windows can be configured to not start using numeric tails until actually necessary. It would simply truncate the name to fit into the 8.3 scheme, so the SFN for a file named "helloworld.txt" would be "HELLOWOR.TXT", not "HELLOW~1.TXT". Useful to keep as much of the original name available as SFN.
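As an illustration of the case flags mentioned in the second point, the following C sketch rebuilds a display name from a 32-byte SFN directory entry. Bit 3 (0x08) at offset 0x0C marks an all-lowercase base name and bit 4 (0x10) an all-lowercase extension. The function name is hypothetical, and the sketch ignores special cases such as a first byte of 0x05 or names containing embedded spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>

#define CASE_LOWER_BASE 0x08  /* bit 3: base name is all lowercase */
#define CASE_LOWER_EXT  0x10  /* bit 4: extension is all lowercase */

/* Hypothetical helper: 'entry' is a 32-byte short directory entry,
 * 'out' receives at most 8 + 1 + 3 characters plus the terminating NUL. */
void sfn_display_name(const uint8_t entry[32], char out[13])
{
    uint8_t flags;
    int i, n = 0;

    assert(entry && out);
    flags = entry[0x0C];

    /* Base name: bytes 0..7, padded with spaces. */
    for (i = 0; i < 8 && entry[i] != ' '; i++)
        out[n++] = (flags & CASE_LOWER_BASE) ? (char)tolower(entry[i])
                                             : (char)entry[i];

    /* Extension: bytes 8..10, omitted entirely if blank. */
    if (entry[8] != ' ') {
        out[n++] = '.';
        for (i = 8; i < 11 && entry[i] != ' '; i++)
            out[n++] = (flags & CASE_LOWER_EXT) ? (char)tolower(entry[i])
                                                : (char)entry[i];
    }
    out[n] = '\0';
}
```

For an entry holding "README  TXT" with only bit 3 set, this produces "readme.TXT", so the original mixed-case name is recovered without any LFN entries on disk.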

Quote:None of the above. I just can't picture a multi-user calculator... how do you place more than 2 fingers on the same keyboard? :-)
That's why I wrote "single-user permissions".

Quote:Not done yet, but I'm planning to simply have an IRQ on the card detection pin, so if the user pulls the card when there's data to be written the system will throw an exception, asking the user to reinsert the card immediately.
This sounds like a good idea! (Comparing the BPB serial number can be used to ensure that the same medium was reinserted.)
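A serial number comparison could look roughly like the C sketch below. It assumes a FAT32 volume, where the 32-bit serial number sits at boot sector offset 0x43 and is only meaningful if the extended boot signature at offset 0x42 is 0x29 (or 0x28). The function names are invented, and since serial numbers are not guaranteed to be unique, a match makes it likely, not certain, that the same card was reinserted.

```c
#include <assert.h>
#include <stdint.h>

#define FAT32_OFF_BOOTSIG 0x42  /* extended boot signature */
#define FAT32_OFF_SERIAL  0x43  /* 32-bit volume serial number */

/* Hypothetical helper: reads the serial number from a FAT32 boot sector.
 * Returns 1 on success, 0 if the boot sector carries no serial number. */
int fat32_read_serial(const uint8_t *bs, uint32_t *serial)
{
    const uint8_t *p;

    assert(bs && serial);
    if (bs[FAT32_OFF_BOOTSIG] != 0x29 && bs[FAT32_OFF_BOOTSIG] != 0x28)
        return 0;
    p = bs + FAT32_OFF_SERIAL;
    *serial = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
              ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return 1;
}

/* Compare the serial remembered at mount time with the card present now. */
int same_medium(uint32_t serial_at_mount, const uint8_t *boot_sector_now)
{
    uint32_t now;

    return fat32_read_serial(boot_sector_now, &now) && now == serial_at_mount;
}
```

After the card detection interrupt fires and a card is present again, the driver would re-read the boot sector and flush pending writes only if same_medium() succeeds.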

Greetings,

Matthias


--
"Programs are poems for computers."