diff options
Diffstat (limited to 'cpukit/libfs/src/nfsclient/README')
-rw-r--r-- | cpukit/libfs/src/nfsclient/README | 548 |
1 files changed, 0 insertions, 548 deletions
diff --git a/cpukit/libfs/src/nfsclient/README b/cpukit/libfs/src/nfsclient/README deleted file mode 100644 index 944b830e2e..0000000000 --- a/cpukit/libfs/src/nfsclient/README +++ /dev/null @@ -1,548 +0,0 @@ -RTEMS-NFS -========= - -A NFS-V2 client implementation for the RTEMS real-time -executive. - -Author: Till Straumann <strauman@slac.stanford.edu>, 2002 - -Copyright 2002, Stanford University and - Till Straumann <strauman@slac.stanford.edu> - -Stanford Notice -*************** - -Acknowledgement of sponsorship -* * * * * * * * * * * * * * * * -This software was produced by the Stanford Linear Accelerator Center, -Stanford University, under Contract DE-AC03-76SFO0515 with the Department -of Energy. - - -Contents --------- -I Overview - 1) Performance - 2) Reference Platform / Test Environment - -II Usage - 1) Initialization - 2) Mounting Remote Server Filesystems - 3) Unmounting - 4) Unloading - 5) Dumping Information / Statistics - -III Implementation Details - 1) RPCIOD - 2) NFS - 3) RTEMS Resources Used By NFS/RPCIOD - 4) Caveats & Bugs - -IV Licensing & Disclaimers - -I Overview ------------ - -This package implements a simple non-caching NFS -client for RTEMS. Most of the system calls are -supported with the exception of 'mount', i.e. it -is not possible to mount another FS on top of NFS -(mostly because of the difficulty that arises when -mount points are deleted on the server). It -shouldn't be hard to do, though. - -Note: this client supports NFS vers. 2 / MOUNT vers. 1; - NFS Version 3 or higher are NOT supported. - -The package consists of two modules: RPCIOD and NFS -itself. - - - RPCIOD is a UDP/RPC multiplexor daemon. It takes - RPC requests from multiple local client threads, - funnels them through a single socket to multiple - servers and dispatches the replies back to the - (blocked) requestor threads. - RPCIOD does packet retransmission and handles - timeouts etc. - Note however, that it does NOT do any XDR - marshalling - it is up to the requestor threads - to do the XDR encoding/decoding. RPCIOD _is_ RPC - specific, though, because its message dispatching - is based on the RPC transaction ID. - - - The NFS package maps RTEMS filesystem calls - to proper RPCs, it does the XDR work and - hands marshalled RPC requests to RPCIOD. - All of the calls are synchronous, i.e. they - block until they get a reply. - -1) Performance -- - - - - - - - -Performance sucks (due to the lack of -readahead/delayed write and caching). On a fast -(100Mb/s) ethernet, it takes about 20s to copy a -10MB file from NFS to NFS. I found, however, that -vxWorks' NFS client doesn't seem to be any -faster... - -Since there is no buffer cache with read-ahead -implemented, all NFS reads are synchronous RPC -calls. Every read operation involves sending a -request and waiting for the reply. As long as the -overhead (sending request + processing it on the -server) is significant compared to the time it -takes to transferring the actual data, increasing -the amount of data per request results in better -throughput. The UDP packet size limit imposes a -limit of 8k per RPC call, hence reading from NFS -in chunks of 8k is better than chunks of 1k [but -chunks >8k are not possible, i.e., simply not -honoured: read(a_nfs_fd, buf, 20000) returns -8192]. This is similar to the old linux days -(mount with rsize=8k). You can let stdio take -care of the buffering or use 8k buffers with -explicit read(2) operations. Note that stdio -honours the file-system's st_blksize field -if newlib is compiled with HAVE_BLKSIZE defined. -In this case, stdio uses 8k buffers for files -on NFS transparently. The blocksize NFS -reports can be tuned with a global variable -setting (see nfs.c for details). - -Further increase of throughput can be achieved -with read-ahead (issuing RPC calls in parallel -[send out request for block n+1 while you are -waiting for data of block n to arrive]). Since -this is not handled by the file system itself, you -would have to code this yourself e.g., using -parallel threads to read from a single file from -interleaved offsets. - -Another obvious improvement can be achieved if -processing the data takes a significant amount of -time. Then, having a pipeline of threads for -reading data and processing them makes sense -[thread b processes chunk n while thread a blocks -in read(chunk n+1)]. - -Some performance figures: -Software: src/nfsTest.c:nfsReadTest() [data not - processed in any way]. -Hardware: MVME6100 -Network: 100baseT-FD -Server: Linux-2.6/RHEL4-smp [dell precision 420] -File: 10MB - -Results: -Single threaded ('normal') NFS read, 1k buffers: 3.46s (2.89MB/s) -Single threaded ('normal') NFS read, 8k buffers: 1.31s (7.63MB/s) -Multi threaded; 2 readers, 8k buffers/xfers: 1.12s (8.9 MB/s) -Multi threaded; 3 readers, 8k buffers/xfers: 1.04s (9.6 MB/s) - -2) Reference Platform -- - - - - - - - - - - -RTEMS-NFS was developed and tested on - - o RTEMS-ss20020301 (local patches applied) - o PowerPC G3, G4 on Synergy SVGM series board - (custom 'SVGM' BSP, to be released soon) - o PowerPC 604 on MVME23xx - (powerpc/shared/motorola-powerpc BSP) - o Test Environment: - - RTEMS executable running CEXP - - rpciod/nfs dynamically loaded from TFTPfs - - EPICS application dynamically loaded from NFS; - the executing IOC accesses all of its files - on NFS. - -II Usage ---------- - -After linking into the system and proper initialization -(rtems-NFS supports 'magic' module initialization when -loaded into a running system with the CEXP loader), -you are ready for mounting NFSes from a server -(I avoid the term NFS filesystem because NFS already -stands for 'Network File System'). - -You should also read the - - - "RTEMS Resources Used By NFS/RPCIOD" - - "CAVEATS & BUGS" - -below. - -1) Initialization -- - - - - - - - - -NFS consists of two modules who must be initialized: - - a) the RPCIO daemon package; by calling - - rpcUdpInit(); - - note that this step must be performed prior to - initializing NFS: - - b) NFS is initialized by calling - - nfsInit( smallPoolDepth, bigPoolDepth ); - - if you supply 0 (zero) values for the pool - depths, the compile-time default configuration - is used which should work fine. - -NOTE: when using CEXP to load these modules into a -running system, initialization will be performed -automagically. - -2) Mounting Remote Server Filesystems -- - - - - - - - - - - - - - - - - - - - -There are two interfaces for mounting an NFS: - - - The (non-POSIX) RTEMS 'mount()' call: - - mount( &mount_table_entry_pointer, - &filesystem_operations_table_pointer, - options, - device, - mount_point ) - - Note that you must specify a 'mount_table_entry_pointer' - (use a dummy) - RTEMS' mount() doesn't grok a NULL for - the first argument. - - o for the 'filesystem_operations_table_pointer', supply - - &nfs_fs_ops - - o options are constants (see RTEMS headers) for specifying - read-only / read-write mounts. - - o the 'device' string specifies the remote filesystem - who is to be mounted. NFS expects a string conforming - to the following format (EBNF syntax): - - [ <uid> '.' <gid> '@' ] <hostip> ':' <path> - - The first optional part of the string allows you - to specify the credentials to be used for all - subsequent transactions with this server. If the - string is omitted, the EUID/EGID of the executing - thread (i.e. the thread performing the 'mount' - - NFS will still 'remember' these values and use them - for all future communication with this server). - - The <hostip> part denotes the server IP address - in standard 'dot' notation. It is followed by - a colon and the (absolute) path on the server. - Note that no extra characters or whitespace must - be present in the string. Example 'device' strings - are: - - "300.99@192.168.44.3:/remote/rtems/root" - - "192.168.44.3:/remote/rtems/root" - - o the 'mount_point' string identifies the local - directory (most probably on IMFS) where the NFS - is to be mounted. Note that the mount point must - already exist with proper permissions. - - - Alternate 'mount' interface. NFS offers a more - convenient wrapper taking three string arguments: - - nfsMount(uidgid_at_host, server_path, mount_point) - - This interface does DNS lookup (see reentrancy note - below) and creates the mount point if necessary. - - o the first argument specifies the server and - optionally the uid/gid to be used for authentication. - The semantics are exactly as described above: - - [ <uid> '.' <gid> '@' ] <host> - - The <host> part may be either a host _name_ or - an IP address in 'dot' notation. In the former - case, nfsMount() uses 'gethostbyname()' to do - a DNS lookup. - - IMPORTANT NOTE: gethostbyname() is NOT reentrant/ - thread-safe and 'nfsMount()' (if not provided with an - IP/dot address string) is hence subject to race conditions. - - o the 'server_path' and 'mount_point' arguments - are described above. - NOTE: If the mount point does not exist yet, - nfsMount() tries to create it. - - o if nfsMount() is called with a NULL 'uidgid_at_host' - argument, it lists all currently mounted NFS - -3) Unmounting -- - - - - - - -An NFS can be unmounted using RTEMS 'unmount()' -call (yep, it is unmount() - not umount()): - - unmount(mount_point) - -Note that you _must_ supply the mount point (string -argument). It is _not_ possible to specify the -'mountee' when unmounting. NFS implements no -convenience wrapper for this (yet), essentially because -(although this sounds unbelievable) it is non-trivial -to lookup the path leading to an RTEMS filesystem -directory node. - -4) Unloading -- - - - - - - -After unmounting all NFS from the system, the NFS -and RPCIOD modules may be stopped and unloaded. -Just call 'nfsCleanup()' and 'rpcUdpCleanup()' -in this order. You should evaluate the return value -of these routines which is non-zero if either -of them refuses to yield (e.g. because there are -still mounted filesystems). -Again, when unloading is done by CEXP this is -transparently handled. - -5) Dumping Information / Statistics -- - - - - - - - - - - - - - - - - - - -Rudimentary RPCIOD statistics are printed -to a file (stdout when NULL) by - - int rpcUdpStats(FILE *f) - -A list of all currently mounted NFS can be -printed to a file (stdout if NULL) using - - int nfsMountsShow(FILE *f) - -For convenience, this routine is also called -by nfsMount() when supplying NULL arguments. - -III Implementation Details --------------------------- - -1) RPCIOD -- - - - - - -RPCIOD was created to - -a) avoid non-reentrant librpc calls. -b) support 'asynchronous' operation over a single - socket. - -RPCIOD is a daemon thread handling 'transaction objects' -(XACTs) through an UDP socket. XACTs are marshalled RPC -calls/replies associated with RPC servers and requestor -threads. - -requestor thread: network: - - XACT packet - | | - V V - | message queue | ( socket ) - | | ^ - ----------> <----- | | - RPCIOD | - / -------------- - timeout/ (re) transmission - - -A requestor thread drops a transaction into -the message queue and goes to sleep. The XACT is -picked up by rpciod who is listening for events from -three sources: - - o the request queue - o packet arrival at the socket - o timeouts - -RPCIOD sends the XACT to its destination server and -enqueues the pending XACT into an ordered list of -outstanding transactions. - -When a packet arrives, RPCIOD (based on the RPC transaction -ID) looks up the matching XACT and wakes up the requestor -who can then XDR-decode the RPC results found in the XACT -object's buffer. - -When a timeout expires, RPCIOD examines the outstanding -XACT that is responsible for the timeout. If its lifetime -has not expired yet, RPCIOD resends the request. Otherwise, -the XACT's error status is set and the requestor is woken up. - -RPCIOD dynamically adjusts the retransmission intervals -based on the average round-trip time measured (on a per-server -basis). - -Having the requestors event driven (rather than blocking -e.g. on a semaphore) is geared to having many different -requestors (one synchronization object per requestor would -be needed otherwise). - -Requestors who want to do asynchronous IO need a different -interface which will be added in the future. - -1.a) Reentrancy -- - - - - - - - -RPCIOD does no non-reentrant librpc calls. - -1.b) Efficiency -- - - - - - - - -We shouldn't bother about efficiency until pipelining (read-ahead/ -delayed write) and caching are implemented. The round-trip delay -associated with every single RPC transaction clearly is a big -performance killer. - -Nevertheless, I could not withstand the temptation to eliminate -the extra copy step involved with socket IO: - -A user data object has to be XDR encoded into a buffer. The -buffer given to the socket where it is copied into MBUFs. -(The network chip driver might even do more copying). - -Likewise, on reception 'recvfrom' copies MBUFS into a user -buffer which is XDR decoded into the final user data object. - -Eliminating the copying into (possibly multiple) MBUFS by -'sendto()' is actually a piece of cake. RPCIOD uses the -'sosend()' routine [properly wrapped] supplying a single -MBUF header who directly points to the marshalled buffer -:-) - -Getting rid of the extra copy on reception was (only a little) -harder: I derived a 'XDR-mbuf' stream from SUN's xdr_mem which -allows for XDR-decoding out of a MBUF chain who is obtained by -soreceive(). - -2) NFS -- - - - -The actual NFS implementation is straightforward and essentially -'passive' (no threads created). Any RTEMS task executing a -filesystem call dispatched to NFS (such as 'opendir()', 'lseek()' -or 'unlink()') ends up XDR encoding arguments, dropping a -XACT into RPCIOD's message queue and going to sleep. -When woken up by RPCIOD, the XACT is decoded (using the XDR-mbuf -stream mentioned above) and the properly cooked-up results are -returned. - -3) RTEMS Resources Used By NFS/RPCIOD -- - - - - - - - - - - - - - - - - - - - -The RPCIOD/NFS package uses the following resources. Some -parameters are compile-time configurable - consult the -source files for details. - -RPCIOD: - o 1 task - o 1 message queue - o 1 socket/filedescriptor - o 2 semaphores (a third one is temporarily created during - rpcUdpCleanup()). - o 1 RTEMS EVENT (by default RTEMS_EVENT_30). - IMPORTANT: this event is used by _every_ thread executing - NFS system calls and hence is RESERVED. - o 3 events only used by RPCIOD itself, i.e. these must not - be sent to RPCIOD by no other thread (except for the intended - use, of course). The events involved are 1,2,3. - o preemption disabled sections: NONE - o sections with interrupts disabled: NONE - o NO 'timers' are used (timer code would run in IRQ context) - o memory usage: n.a - -NFS: - o 2 message queues - o 2 semaphores - o 1 semaphore per mounted NFS - o 1 slot in driver entry table (for major number) - o preemption disabled sections: NONE - o sections with interrupts disabled: NONE - o 1 task + 1 semaphore temporarily created when - listing mounted filesystems (rtems_filesystem_resolve_location()) - -4) CAVEATS & BUGS -- - - - - - - - - -Unfortunately, some bugs crawl around in the filesystem generics. -(Some of them might already be fixed in versions later than -rtems-ss-20020301). -I recommend to use the patch distributed with RTEMS-NFS. - - o RTEMS uses/used (Joel said it has been fixed already) a 'short' - ino_t which is not enough for NFS. - The driver detects this problem and enables a workaround. In rare - situations (mainly involving 'getcwd()' improper inode comparison - may result (due to the restricted size, stat() returns st_ino modulo - 2^16). In most cases, however, st_dev is compared along with st_ino - which will give correct results (different files may yield identical - st_ino but they will have different st_dev). However, there is - code (in getcwd(), for example) who assumes that files residing - in one directory must be hosted by the same device and hence omits - the st_dev comparison. In such a case, the workaround will fail. - - NOTE: changing the size (sys/types.h) of ino_t from 'short' to 'long' - is strongly recommended. It is NOT included in the patch, however - as this is a major change requiring ALL of your sources to - be recompiled. - - THE ino_t SIZE IS FIXED IN GCC-3.2/NEWLIB-1.10.0-2 DISTRIBUTED BY - OAR. - - o You may work around most filesystem bugs by observing the following - rules: - - * never use chroot() (fixed by the patch) - * never use getpwent(), getgrent() & friends - they are NOT THREAD - safe (fixed by the patch) - * NEVER use rtems_libio_share_private_env() - not even with the - patch applied. Just DONT - it is broken by design. - * All threads who have their own userenv (who have called - rtems_libio_set_private_env()) SHOULD 'chdir("/")' before - terminating. Otherwise, (i.e. if their cwd is on NFS), it will - be impossible to unmount the NFS involved. - - o The patch slightly changes the semantics of 'getpwent()' and - 'getgrent()' & friends (to what is IMHO correct anyways - the patch is - also needed to fix another problem, however): with the patch applied, - the passwd and group files are always accessed from the 'current' user - environment, i.e. a thread who has changed its 'root' or 'uid' might - not be able to access these files anymore. - - o NOTE: RTEMS 'mount()' / 'unmount()' are NOT THREAD SAFE. - - o The NFS protocol has no 'append' or 'seek_end' primitive. The client - must query the current file size (this client uses cached info) and - change the local file pointer accordingly (in 'O_APPEND' mode). - Obviously, this involves a race condition and hence multiple clients - writing the same file may lead to corruption. - -IV Licensing & Disclaimers --------------------------- - -NFS is distributed under the SLAC License - consult the -separate 'LICENSE' file. - -Government disclaimer of liability -- - - - - - - - - - - - - - - - - -Neither the United States nor the United States Department of Energy, -nor any of their employees, makes any warranty, express or implied, -or assumes any legal liability or responsibility for the accuracy, -completeness, or usefulness of any data, apparatus, product, or process -disclosed, or represents that its use would not infringe privately -owned rights. - -Stanford disclaimer of liability -- - - - - - - - - - - - - - - - - -Stanford University makes no representations or warranties, express or -implied, nor assumes any liability for the use of this software. - -Maintenance of notice -- - - - - - - - - - - -In the interest of clarity regarding the origin and status of this -software, Stanford University requests that any recipient of it maintain -this notice affixed to any distribution by the recipient that contains a -copy or derivative of this software. |