This lecture and the next few
will discuss how to implement sockets and RPC.
You should be able to
understand how the system calls listed below work.
The best option is to run 'man
<system-call>' at your Unix command prompt, where
<system-call> is one of the following:
socket
bind
listen
accept
connect
read/write/send/recv
select
dup
dup2
bind, listen and accept are system calls used by servers; connect
is a system call used by clients; and read/write/send/recv are system
calls used by both clients and servers. These are the important system calls
for network programming on a Unix machine using C. In a language
like Java, all these low-level system calls are made for you by the language itself.
For the project you will need to understand the above system calls and how
they're used in the project code. The tutorial “Jim Frost's BSD Sockets:
A Quick And Dirty Primer” should give you a
simple and easy introduction to sockets. The notion of a client initiating
a connection while the server is waiting for a connection request is important.
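The client/server roles described above can be sketched in C as follows. This is a minimal illustration, not the project code: the helper names make_listener, connect_to and demo_exchange are hypothetical, the server binds to the loopback address only, and port 0 is used so that the kernel picks a free port. Error handling is kept to the bare minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Server side: socket -> bind -> listen. Port 0 asks the kernel to pick a
 * free port; the chosen port is returned in *port (host byte order). */
int make_listener(unsigned short *port)
{
    assert(port != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Client side: socket -> connect. The client initiates the connection. */
int connect_to(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One exchange: a forked child acts as the client and writes msg; the
 * parent acts as the server, accepts the connection and reads the message.
 * Returns the number of bytes the server read, or -1 on error. */
ssize_t demo_exchange(const char *msg, char *buf, size_t buflen)
{
    assert(buflen > 0);
    unsigned short port;
    int lfd = make_listener(&port);
    if (lfd < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(lfd); return -1; }
    if (pid == 0) {                         /* child: the client */
        close(lfd);
        int cfd = connect_to(port);
        if (cfd < 0) _exit(1);
        ssize_t w = write(cfd, msg, strlen(msg));
        close(cfd);
        _exit(w < 0);
    }

    ssize_t n = -1;
    int conn = accept(lfd, NULL, NULL);     /* parent: waits for a client */
    if (conn >= 0) {
        n = read(conn, buf, buflen - 1);
        if (n >= 0) buf[n] = '\0';
        close(conn);
    }
    close(lfd);
    waitpid(pid, NULL, 0);
    return n;
}
```

Calling demo_exchange("hello", buf, sizeof buf) from a main function runs one complete connection. Note that the listening socket exists before the client is started, so the connect call always finds a server waiting.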
Note:
The socket abstraction was designed and implemented for Unix as part of
the BSD Unix project to incorporate TCP/IP into Unix. Solaris uses a ``streams''
mechanism and provides the socket abstraction via standard libraries (most
TCP/IP code is written to use sockets since the socket abstraction is available
on all Unix -- as well as Windows -- variants). In most Unices, socket manipulation
is done using system calls, and thus the interface is described in section
2 of the manual pages. In Solaris, some of the interfaces are in section
3c (C library routines) or 3n (networking library). It is unfortunate that
the manual page locations are not uniform across all Unices, but at least
you now know the reason.
For example, for select use 'man -s 3C select', where the -s option specifies the section number; for read/write use section 2; and for socket, bind, listen, accept and connect use section 3XN.
Ports are used to name
the ends of logical connections, which carry long term conversations. For
the purpose of providing services to unknown callers, a service contact port
is defined.
The
Internet Assigned Numbers Authority (IANA) maintains the Assigned Numbers
Request for Comments (RFC).
The Well Known
Ports are controlled and assigned by the IANA and on most systems
can only be used by system (or root) processes or by programs executed by
privileged users.
The assigned ports use a small portion of the possible
port numbers. The assigned ports are in range 0-1023.
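The Registered
Ports are not controlled by the IANA and on most systems can be used
by ordinary user processes or programs executed by ordinary users. The Assigned
Numbers RFC gives their range as 1024-65535; the current IANA registry narrows
the Registered Ports to 1024-49151 and treats 49152-65535 as dynamic (private) ports.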
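The distinction between the Well Known range and everything above it can be expressed as a small check. This is a simplified sketch: the function names are hypothetical, and the rule "only root may bind below 1024" is the traditional Unix behaviour, which some modern systems let the administrator relax.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Returns 1 if the port lies in the Well Known range 0-1023, else 0. */
int is_well_known_port(int port)
{
    assert(port >= 0 && port <= 65535);
    return port <= 1023;
}

/* Returns 1 if this process would traditionally be allowed to bind the
 * port: any port above 1023, or a Well Known port when the effective
 * user ID is 0 (root). This is an approximation, not a kernel guarantee. */
int may_bind_port(int port)
{
    return !is_well_known_port(port) || geteuid() == 0;
}
```

For example, is_well_known_port(80) is 1 and is_well_known_port(8080) is 0, which is why test servers run by ordinary users usually listen on ports such as 8080 rather than 80.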
A daemon on a UNIX system
is a process that runs in the background to provide services. Daemons are
not special: apart from having no controlling terminal (that is, no terminal
responsible for job control such as ^C or ^Z processing), they're not too
different from programs that you might write
and run yourself.
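One common way for a process to end up without a controlling terminal is to fork and have the child call setsid. The sketch below shows only that step, under the assumption that nothing else about the daemon matters here; the names spawn_detached, report_session and demo_detached are hypothetical, and a production daemon would normally also change directory, redirect its standard descriptors and so on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that starts its own session with setsid(), which leaves it
 * with no controlling terminal, then runs job(out_fd) and exits with the
 * job's return value. The parent gets the child's pid (or -1). */
pid_t spawn_detached(int (*job)(int), int out_fd)
{
    assert(job != NULL);
    pid_t pid = fork();
    if (pid != 0)
        return pid;             /* parent (pid > 0) or fork failure (-1) */
    if (setsid() < 0)           /* child: become a session leader */
        _exit(127);
    _exit(job(out_fd));
}

/* Example job: writes 'Y' if this process leads its own session. */
int report_session(int out_fd)
{
    char c = (getsid(0) == getpid()) ? 'Y' : 'N';
    return write(out_fd, &c, 1) == 1 ? 0 : 1;
}

/* Runs report_session in a detached child and returns the byte it wrote,
 * or '?' if something went wrong. */
char demo_detached(void)
{
    int p[2];
    char c = '?';
    if (pipe(p) < 0) return c;
    pid_t pid = spawn_detached(report_session, p[1]);
    close(p[1]);
    if (pid > 0) {
        if (read(p[0], &c, 1) != 1) c = '?';
        waitpid(pid, NULL, 0);
    }
    close(p[0]);
    return c;
}
```

Apart from the setsid call, the child is an ordinary process, which is the point made above: a daemon is not a special kind of program.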
Many daemons
can run on a UNIX system, for example FTP, telnet and HTTP daemons. Each of these is a process
that listens for a particular type of connection to the computer. When the
corresponding request is made, the daemon for that request handles it.
Afterwards, the daemon continues to listen for more requests. For example,
when you try to FTP to ftp://mongoose.aul.fiu.edu, then the FTP daemon handles
the request. When you try to access index.html on http://mongoose.aul.fiu.edu,
then the HTTP daemon handles the request.
If a daemon is doing
its own listening, then it is a standalone mode daemon.
Even when idle, a standalone
server would still consume resources like virtual memory space, entries in
the process table etc. Standalone mode daemons can be subdivided into
single-threaded or multi-threaded servers.
Inetd is a meta-daemon: other daemons register with it, so that
inetd listens instead of the actual daemon. The actual program
that provides the service, e.g., fingerd, would not have to consume
any resources unless a request actually arrives: inetd knows --
because of an entry in /etc/inetd.conf
-- that when a connection arrives then it should fork a subprocess
which then dups the socket representing the new connection to the
standard input and output descriptors (0 and 1) and then execs to replace
its image with that of the actual inetd-style daemon. Look for
fingerd's entry in
/etc/inetd.conf. What happens when you run the corresponding daemon
program interactively?
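The fork, dup and exec sequence described above can be sketched as follows. This is an illustration of the idea rather than inetd's actual source: launch_service and demo_inetd_style are hypothetical names, a socketpair stands in for an accepted network connection, and the cat program stands in for the real service (it is assumed to be on the PATH).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* What an inetd-like parent does once a connection exists: fork, make the
 * connection the child's standard input and output, then exec the real
 * service program. Returns the child's pid, or -1 if fork failed. */
pid_t launch_service(int conn_fd, char *const argv[])
{
    assert(conn_fd >= 0 && argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid != 0)
        return pid;                     /* parent, or fork failure */
    if (dup2(conn_fd, 0) < 0 || dup2(conn_fd, 1) < 0)
        _exit(126);
    if (conn_fd > 1)
        close(conn_fd);                 /* 0 and 1 now refer to the socket */
    execvp(argv[0], argv);
    _exit(127);                         /* reached only if exec failed */
}

/* Sends msg to a "service" (cat) started the inetd way and reads back what
 * it echoes. Returns the number of bytes read, or -1 on error. */
ssize_t demo_inetd_style(const char *msg, char *buf, size_t buflen)
{
    assert(buflen > 0);
    int sv[2];
    char *const argv[] = { "cat", NULL };
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;

    pid_t pid = launch_service(sv[1], argv);
    close(sv[1]);                       /* the child owns that end now */
    if (pid < 0) { close(sv[0]); return -1; }

    ssize_t n = -1;
    if (write(sv[0], msg, strlen(msg)) >= 0) {
        shutdown(sv[0], SHUT_WR);       /* end of input, so cat will exit */
        n = read(sv[0], buf, buflen - 1);
        if (n >= 0) buf[n] = '\0';
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

The service program itself contains no socket code at all: it only reads descriptor 0 and writes descriptor 1, which is why such a program can also be started from a shell.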
If a daemon is designed
to run with inetd, then the daemon is in inetd mode.
Inetd is started at boot time and reads its configuration file,
/etc/inetd.conf.
With inetd, there is a little overhead on each request, since the real server is started only when a connection comes in.
The Apache daemon normally runs in standalone mode so that it can respond to requests as quickly as possible. When Apache starts, there is considerable overhead in reading the configuration files. Contrast this with an FTP request, which has very little overhead. If Apache ran under inetd, this startup overhead would be incurred on every request made to the server, and the user would notice the server responding slowly.
Running in standalone mode is one way to speed up the request cycle. In addition, Apache runs with several copies of itself ready to handle requests. The main Apache server just listens for HTTP requests. When it detects a request, it clones itself, and the clone, not the parent process, handles the request. Just as there is overhead in starting the main server, there is overhead in starting a clone of it. To reduce this overhead, the Apache server starts with several clones of itself already initialized. These are known as the spare servers. It is possible to set a minimum and a maximum number of spare servers to keep around. Note that this does not limit the number of requests that can be made to the server. If there are not enough spare servers to handle all the requests, the main server simply makes additional clones of itself, regardless of the maximum setting. For example, if the maximum number of spare servers is 5 but 8 requests come in, the main server creates 3 additional clones. Once all the requests have been handled, the spare servers above the maximum are terminated. Keeping spare clones around is intended to minimize the startup cost of forking clones, since a fork system call involves, at a minimum, creating a new address space and marking the memory regions copy-on-write.
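The idea of forking workers ahead of time can be sketched as below. This is a toy model and not Apache's implementation: prefork_workers and demo_prefork are hypothetical names, each worker answers exactly one connection with a single byte and then exits, and there is no logic for maintaining a minimum or maximum number of spares.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

/* Forks n workers before any request arrives. Each worker inherits
 * listen_fd, blocks in accept(), answers one connection with the byte 'K'
 * and exits (a real server would loop). Returns the number started. */
int prefork_workers(int listen_fd, int n, pid_t pids[])
{
    assert(n >= 0 && n <= MAX_WORKERS);
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {                          /* worker */
            int conn = accept(listen_fd, NULL, NULL);
            if (conn < 0) _exit(1);
            ssize_t w = write(conn, "K", 1);
            close(conn);
            _exit(w == 1 ? 0 : 1);
        }
        pids[started++] = pid;
    }
    return started;
}

/* Listens on a kernel-chosen loopback port, pre-forks n workers, makes one
 * client connection per worker and counts how many were answered. */
int demo_prefork(int n)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    pid_t pids[MAX_WORKERS];
    int served = 0;

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                    /* kernel picks the port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, MAX_WORKERS) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    int started = prefork_workers(lfd, n, pids);
    for (int i = 0; i < started; i++) {          /* act as the clients */
        char c = 0;
        int cfd = socket(AF_INET, SOCK_STREAM, 0);
        if (cfd >= 0 &&
            connect(cfd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
            read(cfd, &c, 1) == 1 && c == 'K')
            served++;
        if (cfd >= 0) close(cfd);
    }
    for (int i = 0; i < started; i++) {
        kill(pids[i], SIGTERM);   /* has no effect on workers already done */
        waitpid(pids[i], NULL, 0);
    }
    close(lfd);
    return served;
}
```

Because every worker was forked before the first client connected, no request in this sketch pays the cost of a fork; that cost was paid up front, which is the trade-off the spare servers make.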