Q.1: Andrew File System
Andrew File System (AFS) is a distributed file system that was once part of a
larger project known as Andrew, a Carnegie Mellon University (CMU) effort of
the mid-1980s to develop a distributed computing environment for the Carnegie
Mellon campus. AFS was originally developed for networked computers running
BSD Unix and Mach.
The design objective of AFS was to create a system for large networks. To
achieve this scale, the system reduces client-server communication by means of
a client-side caching scheme. Rather than transferring current data for each
request, clients store entire files in their cache.
In 1989, the original AFS design team left CMU and founded a company called
Transarc Corporation, which marketed AFS. During the 1990s, several ports of
AFS to other systems were carried out by Transarc's customers, for example by
a group at MIT. IBM acquired Transarc in the 1990s and continued to produce
commercial implementations of AFS. In 2000, IBM decided to release a branch of
the source code under the IBM Public License. Development of AFS is now
carried out in a project named OpenAFS. Today, OpenAFS supports several
platforms, including Linux, Apple macOS, Sun Solaris, and Microsoft Windows NT.
AFS communication is implemented on top of IP (default server ports: UDP
7000-7009). Rx, an RPC protocol developed for AFS, is used for communication
between machines. The communication protocol is optimized for WANs, and there
are no restrictions on where servers may be located.
The basic organizational unit of AFS is the cell. An AFS cell is an
independently administered collection of servers and clients. Usually, a cell
corresponds to an Internet domain and is named after it. Since AFS is most
commonly used in academia and research, it is not unusual for cells to also
participate in the global AFS filetree. However, this is not mandatory.
Servers and clients can belong to only one cell at a given moment, but a user
can have accounts in several cells. The cell the user is connected to is
called the home cell; other cells are called foreign cells. From the user's
point of view, the AFS hierarchy is accessible under /afs through normal file
system operations. Since AFS is location transparent, the path to a file in
the tree is the same regardless of the client's location. Whether a file is
accessible depends on the user and on the file's access permissions, as set by
access control lists.
The client-side component of AFS is the Cache Manager. Depending on the
version of AFS, it runs either as a user process (with a few changes in the
kernel) or as a set of kernel modifications (OpenAFS). The responsibilities of
the Cache Manager are: retrieving files from servers, maintaining a local file
cache, translating file requests into remote procedure calls, and storing
callbacks.
Several key observations about common file access patterns inspired the design
of AFS:
• commonly accessed files are usually small
• reads are more common than writes
• sequential access is more common than random access
• even if a file is shared, writes are usually made by a single user
• files are referenced in bursts
The strategy chosen for AFS was to cache files locally on the clients. When a
file is opened, the Cache Manager first checks whether there is a valid copy
of the file in the cache. If there is not, the file is retrieved from a file
server. A file that has been changed and then closed on the client is
transferred back to the server.
Over time, the client builds up a “working set” of frequently consulted files
in the cache. An optimal cache size is around 100 MB, but it depends on what
the client is used for. If the cached files are not changed by other users,
they do not have to be retrieved from the server on subsequent accesses. This
reduces the load on the network significantly. Caching can be done on disk or
in memory.
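The open-and-close flow just described can be sketched in a few lines of Python. This is a toy illustration only; the class and method names (CacheManager, FakeServer, fetch, store) are invented for this sketch and are not the real AFS client interfaces:

```python
# Toy sketch of the AFS client-side open() path: check the local cache
# for a valid copy, otherwise fetch the whole file from the server.
class CacheManager:
    def __init__(self, server):
        self.server = server          # stand-in for the file server RPC stub
        self.cache = {}               # path -> (data, valid_flag)

    def open(self, path):
        entry = self.cache.get(path)
        if entry and entry[1]:        # valid cached copy: no network traffic
            return entry[0]
        data = self.server.fetch(path)  # retrieve whole file from the server
        self.cache[path] = (data, True)
        return data

    def close(self, path, data):
        # a file changed and closed on the client is written back to the server
        self.cache[path] = (data, True)
        self.server.store(path, data)

class FakeServer:
    """In-memory stand-in for an AFS file server."""
    def __init__(self):
        self.files = {"/afs/example/readme": b"hello"}
        self.fetches = 0

    def fetch(self, path):
        self.fetches += 1
        return self.files[path]

    def store(self, path, data):
        self.files[path] = data

server = FakeServer()
cm = CacheManager(server)
cm.open("/afs/example/readme")
cm.open("/afs/example/readme")   # second open is served from the cache
print(server.fetches)            # 1
```

The second open generates no server traffic, which is exactly the network-load reduction the working set is meant to provide.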
The theoretical strategy is whole-file serving and caching. However, since
version 3 of AFS, data is served and cached in 64-kilobyte chunks.
To ensure that cached copies are kept up to date, a callback mechanism is
used. When a file is opened for the first time, a so-called callback promise
is sent to the client along with the newly opened file. The callback promise
is stored on the client and marked as valid by the Cache Manager. If the file
is updated by another client, the server sends a callback to the Cache
Manager, which sets the state of the callback promise to cancelled. The next
time an application accesses the file, a current copy must be retrieved from
the server. If a file is flushed from the cache, the server is informed so
that the callback can be deleted.
If the client is restarted, all callback promises must be checked to see
whether their status has changed. To do this, the Cache Manager sends a cache
validation request to the server.
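The callback promise can be viewed as a tiny state machine on the client. The sketch below illustrates that idea only; the names (Client, break_callback, must_refetch) are invented here, and the real mechanism is of course an RPC exchange between server and Cache Manager:

```python
# Toy model of AFS callback promises: valid means the cached copy may be
# used; cancelled (or absent) forces a fresh fetch on the next access.
VALID, CANCELLED = "valid", "cancelled"

class Client:
    def __init__(self):
        self.promises = {}            # path -> promise state

    def receive_file(self, path):
        # on first open, the server ships the file with a callback promise
        self.promises[path] = VALID

    def break_callback(self, path):
        # the server notifies us that another client updated the file
        self.promises[path] = CANCELLED

    def must_refetch(self, path):
        # anything other than a valid promise requires a current copy
        return self.promises.get(path) != VALID

c = Client()
c.receive_file("/afs/example/doc")
print(c.must_refetch("/afs/example/doc"))   # False: cached copy is usable
c.break_callback("/afs/example/doc")        # another client wrote the file
print(c.must_refetch("/afs/example/doc"))   # True: fetch a current copy
```

A client restart corresponds to losing confidence in every entry of `promises`, which is why a validation request for all of them is needed.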
If several clients open, edit, and close the same file concurrently, the
update resulting from the last close overwrites the previous changes. That is,
AFS does not manage concurrent updates, instead delegating this responsibility
to the applications.
AFS uses whole-file locks; byte ranges cannot be locked, only entire files.
For this reason, AFS is not well suited to situations where many users need to
write to a file simultaneously, for example in database applications.
The server machines can have several roles, which can be combined:
• acts as a simple file server
• acts as a database server for the replicated administrative databases
• acts as a binary distribution machine, for distributing updated versions of
the server software
• acts as the system control machine that distributes common configuration
files and serves as the main time synchronization site
When there is only a single server in a cell, it takes on all of the
above-mentioned roles. If a cell has more than one server, it is preferable to
run multiple database servers, to obtain the benefits of database replication.
Since the administrative databases are frequently consulted, this helps
distribute the load across the servers. However, more than three database
server machines are usually not necessary.
Simple file servers may be added as needed as the demand for storage space
increases. Only one system control machine is necessary in each cell, and
there should be as many binary distribution machines as there are different
types of server system.
The system can be administered from any
client workstation in the cell.
Q.2: Illustrate the operation of the ASF
The Apache Software Foundation (ASF) is a 501(c)(3) non-profit public charity
registered in the United States of America. It was created in 1999 primarily
to:
• provide a foundation for open, collaborative software development projects
by supplying hardware, communication, and business infrastructure
• create an independent legal entity to which companies and individuals can
donate resources, assured that those resources will be used for the public
benefit
• provide a means for individual volunteers to be sheltered from legal suits
directed at the Foundation's projects
• protect the “Apache” brand, as applied to its software products, from being
abused by other organizations
Unlike other software development efforts carried out under an open-source
license, the Apache Web server was not started by a single developer (as, for
example, the Linux kernel or Perl/Python were), but began as a diverse group
of people who shared common interests and got to know one another through the
exchange of information, fixes, and suggestions.
As the group began to develop its own version of the software, moving away
from the NCSA version, more people were attracted and started to help, at
first by sending small patches or suggestions, or by replying to emails on the
mailing list, and later with more important contributions.
When the group believed that a person had “earned” the merit to be part of the
development community, it granted that person direct access to the code
repository, thus growing the group and increasing its capacity to develop,
maintain, and evolve the program more effectively.
We call this basic principle “meritocracy”: literally, government by merit.
What is interesting to note is that this process scaled very well without
creating friction, because unlike other situations where power is a scarce and
conserved resource, in the Apache Group newcomers were seen as volunteers who
wanted to help, rather than as people who wanted to steal a position.
With no scarce resources at stake (money, energy, time), the group was happy
to have new people come and help, and it filtered only for the people it
believed to be sufficiently committed to the task and to have the attitudes
needed to work well with others, particularly in disagreement.
The Foundation structure:
As the Apache Web Server started to grow in popularity and market share,
thanks to the synergy between its technical qualities and the openness of the
community behind the project, people began to create satellite projects.
Influenced by the spirit of the community they were used to, they adopted the
same community-management traditions.
Thus, by the time the ASF was created, there were several distinct
communities, each focusing on a different side of the “web service” problem,
but all united by a common set of goals and a set of cultural traditions
respected in both process and etiquette. These separate communities were
called “projects” and, although similar, each of them showed small differences
that made it special.
To reduce friction and allow diversity to emerge, rather than forcing a
monoculture from the top, the projects were designated the central
decision-making entities of the Apache world. Each project is delegated
authority over the development of its software and is given a great deal of
latitude in designing its own technical charter and its own governing rules.
At the same time, the cultural influence of the original Apache Group was
strong, and the similarities between the different communities are obvious, as
we will see later.
The Foundation is governed by the following entities: the Board of Directors
(Board) governs the Foundation and is composed of members. Project Management
Committees (PMCs) govern the projects and are composed of committers. (Note
that every member is also a committer.) Officers of the corporation, appointed
by the Board, set Foundation-wide policies in specific areas (legal, brand,
etc.).
Q.3: Routing overlays as middleware in peer-to-peer systems:
Peer-to-peer Internet applications have recently been popularized by
file-sharing applications such as Napster, Gnutella, and Freenet. Although
much of the attention has focused on the copyright issues raised by these
specific applications, peer-to-peer systems have many technically interesting
aspects, such as decentralized control, self-organization, adaptation, and
scalability. Peer-to-peer systems can be characterized as distributed systems
in which all nodes have identical capabilities and responsibilities and all
communication is symmetric.
There are currently many projects aimed at building peer-to-peer applications
and at understanding the issues and requirements of such applications and
systems. One of the main problems in large-scale peer-to-peer applications is
providing efficient algorithms for object location and routing within the
network.
This article presents Pastry, a generic peer-to-peer object location and
routing scheme, based on a self-organizing overlay network of nodes connected
to the Internet. Pastry is completely decentralized, fault-resilient,
scalable, and reliable. In addition, Pastry has good route locality
properties.
Pastry is designed as a general substrate for building a variety of
peer-to-peer Internet applications, such as file sharing, file storage, naming
systems, and group communication. Several applications have been built on top
of Pastry to date, including a persistent storage utility called PAST and a
publish/subscribe system called SCRIBE. Other applications are under
development.
Pastry offers the following capabilities. Each node in the Pastry network has
a unique numeric identifier (nodeId). When presented with a message and a
numeric key, a Pastry node efficiently routes the message to the node with the
nodeId numerically closest to the key, among all currently live Pastry nodes.
The expected number of routing steps is O(log N), where N is the number of
nodes in the network. At each Pastry node along the route of a message, the
application is notified and may perform application-specific computations
related to the message.
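The delivery rule, “the live node whose nodeId is numerically closest to the key,” can be illustrated with a toy sketch. This deliberately ignores Pastry's prefix-based routing tables and its 128-bit id space; the small ids and node set below are invented for the illustration:

```python
# Toy illustration of Pastry's delivery rule: among all live nodes,
# pick the one whose nodeId is numerically closest to the key.
def closest_node(key, live_nodes):
    return min(live_nodes, key=lambda n: abs(n - key))

# An invented set of live nodeIds in a small id space.
nodes = [100, 5000, 20000, 40000, 65000]

print(closest_node(4096, nodes))   # 5000
print(closest_node(99, nodes))     # 100
```

In the real system no node knows the full membership list; each hop instead moves the message to a node whose id shares a longer prefix with the key, which is what yields the O(log N) step count.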
Pastry takes network locality into account; it seeks to minimize the distance
messages travel, according to a scalar proximity metric such as the number of
IP routing hops. Each Pastry node keeps track of its immediate neighbours in
the nodeId space, and notifies applications of new node arrivals, node
failures, and node recoveries. Because nodeIds are randomly distributed, with
high probability the set of nodes with adjacent nodeIds is diverse in
geography, ownership, jurisdiction, and so on. Applications can take advantage
of this, because Pastry can route to one of the k nodes that are numerically
closest to the key. A heuristic ensures that among a set of nodes with the k
closest nodeIds to the key, the message is likely to first reach a node
“near” the node from which the message originated, in terms of the proximity
metric.
Applications use these capabilities in different ways. PAST, for example, uses
a fileId, computed as the hash of the file's name and its owner, as a Pastry
key for a file. Replicas of the file are stored on the k Pastry nodes with
nodeIds numerically closest to the fileId.
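PAST's key derivation and replica placement can be sketched as follows. The hash function and the field layout are assumptions of this sketch (SHA-1 over "name:owner"), not the actual PAST wire format, and the node ids are invented:

```python
import hashlib

def file_id(file_name, owner):
    # fileId = secure hash of the file's name and its owner,
    # interpreted as an integer in the Pastry key space
    digest = hashlib.sha1(f"{file_name}:{owner}".encode()).hexdigest()
    return int(digest, 16)

def replica_nodes(key, live_nodes, k=3):
    # replicas live on the k nodes with nodeIds numerically closest to the key
    return sorted(live_nodes, key=lambda n: abs(n - key))[:k]

fid = file_id("report.txt", "alice")
# invented nodeIds clustered around the fileId for illustration
nodes = [fid - 40, fid - 3, fid + 1, fid + 7, fid + 900]
print(replica_nodes(fid, nodes))   # the three ids nearest fid
```

Because the fileId is deterministic, any client can recompute it from the name and owner and route a lookup to the same k nodes, with no directory service involved.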
A file can be looked up by sending a message via Pastry, using the fileId as
the key. The lookup is guaranteed to reach a node that stores the file as long
as one of the k nodes is reachable. Moreover, the message is likely to first
reach a node near the client among those k nodes; that node provides the file
and consumes the message. Pastry's notification mechanisms allow PAST to
maintain replicas of a file on the k nodes closest to the key, despite node
failures and node arrivals, using only local coordination among nodes with
adjacent nodeIds. More details on PAST's use of Pastry can be found in the
references below.
As another example application, in SCRIBE, a publish/subscribe system, the
list of subscribers for a topic is stored on the node with the nodeId
numerically closest to the topicId of the topic, where the topicId is a hash
of the topic name. This node acts as a rendezvous point for publishers and
subscribers. Subscribers send a subscription message via Pastry using the
topicId as the key; the subscription is recorded at each node along the path.
A publisher sends data to the rendezvous point via Pastry, using the topicId
as the key. The rendezvous point forwards the data along the multicast tree
formed by the paths from the subscribers to the rendezvous point. Full details
of SCRIBE's use of Pastry can be found in the references below.
These and other applications under development were all built with little
effort on top of the basic capabilities provided by Pastry. The rest of this
paper is organized as follows. Section 2 presents the design of Pastry.
F. Dabek, M. F. Kaashoek, D.
Karger, R. Morris, and I. Stoica. Wide-area cooperative storage with CFS.
In Proc. ACM SOSP’01, Banff, Canada, Oct. 2001
R. Dingledine, M. J. Freedman,
and D. Molnar. The Free Haven project: Distributed anonymous storage service.
In Proc. Workshop on Design Issues in Anonymity and Unobservability,
Berkeley, CA, July 2000.
P. Druschel and A. Rowstron.
PAST: A large-scale, persistent peer-to-peer storage utility. In Proc.
HotOS VIII, Schloss Elmau, Germany, May 2001.
J. Jannotti, D. K. Gifford, K.
L. Johnson, M. F. Kaashoek, and J. W. O’Toole. Overcast: Reliable multicasting
with an overlay network. In Proc. OSDI 2000, San Diego, CA, 2000.
J. Kangasharju, J. W. Roberts,
and K. W. Ross. Performance evaluation of redirection schemes in content
distribution networks. In Proc. 4th Web Caching Workshop, San
Diego, CA, Mar. 1999.
J. Kangasharju and K. W. Ross.
A replicated architecture for the domain name system. In Proc. IEEE
Infocom 2000, Tel Aviv, Israel, Mar. 2000.
A. Rowstron and P. Druschel.
Storage management and caching in PAST, a large-scale, persistent peer-to-peer
storage utility. In Proc. ACM SOSP’01, Banff, Canada, Oct. 2001
A. Rowstron, A.-M. Kermarrec,
P. Druschel, and M. Castro. Scribe: The design of a large-scale event
notification infrastructure. Submitted for publication. June 2001. http://www.research.microsoft.com/antr/SCRIBE/.
M. A. Sheldon, A. Duda, R.
Weiss, and D. K. Gifford. Discover: A resource discovery system based on
content routing. In Proc. 3rd International World Wide Web Conference,
Darmstadt, Germany, 1995
I. Stoica, R. Morris, D.
Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A scalable peer-to-peer
lookup service for Internet applications. In Proc. ACM SIGCOMM’01,
San Diego, CA, Aug. 2001.