Content text Unit 1-5
UNIT-I
Introduction to Internet
The Internet is a worldwide system of interconnected computer networks. The computers and computer networks exchange information using TCP/IP (Transmission Control Protocol/Internet Protocol). They are connected via telecommunications networks, and the Internet can be used for e-mail, transferring files and accessing information on the World Wide Web. The World Wide Web is a system of Internet servers that use HTTP (Hypertext Transfer Protocol) to transfer documents formatted in HTML (Hypertext Markup Language). These documents are viewed in web browsers such as Netscape, Safari, Google Chrome and Internet Explorer. Hypertext enables a document to be connected to other documents on the web through hyperlinks, so a reader can move from one document to another by following hyperlinked text within web pages.

History of Internet
The history of the Internet begins with the development of electronic computers in the 1950s. Initial concepts of wide area networking originated in several computer science laboratories in the United States, United Kingdom, and France. The US Department of Defense awarded contracts as early as the 1960s, including for the development of the ARPANET project, directed by Robert Taylor and managed by Lawrence Roberts. The first message was sent over the ARPANET in 1969 from computer science Professor Leonard Kleinrock's laboratory at the University of California, Los Angeles (UCLA) to the second network node at Stanford Research Institute (SRI). Packet switching networks such as the NPL network, ARPANET, Tymnet, Merit Network, CYCLADES, and Telenet were developed in the late 1960s and early 1970s using a variety of communications protocols. Donald Davies first demonstrated packet switching in 1967 at the National Physical Laboratory (NPL) in the UK, which became a test-bed for UK research for almost two decades.
The ARPANET project led to the development of protocols for internetworking, in which multiple separate networks could be joined into a network of networks. The Internet protocol suite (TCP/IP) was developed by Robert E. Kahn and Vint Cerf in the 1970s and became the standard networking protocol on the ARPANET. The ARPANET was decommissioned in 1990. In the 1980s, research at CERN in Switzerland by British computer scientist Tim Berners-Lee resulted in the World Wide Web, which links hypertext documents into an information system accessible from any node on the network. Since the mid-1990s, the Internet has had a revolutionary impact on culture, commerce, and technology, including the rise of near-instant communication by electronic mail, instant messaging, voice over Internet Protocol (VoIP) telephone calls, two-way interactive video calls, and the World Wide Web with its discussion forums, blogs, social networking, and online shopping sites.

Syllabus: A Brief Introduction to Internet, The World Wide Web, Web Browsers, Web Servers, Uniform Resource Locators, MIME, HTTP. HTML5: Evolution of HTML and XHTML, Basic Syntax, Document Structure, Links, Images, Multimedia, Lists, Tables, Creating Forms, Cascading Style Sheets.
The World Wide Web:
The World Wide Web (abbreviated WWW or the Web) is an information space where documents and other web resources are identified by Uniform Resource Locators (URLs), interlinked by hypertext links, and accessed via the Internet. The World Wide Web has been central to the development of the Information Age and is the primary tool billions of people use to interact on the Internet. Web pages are primarily text documents formatted and annotated with Hypertext Markup Language (HTML). In addition to formatted text, web pages may contain images, video, audio, and software components that are rendered in the user's web browser as coherent pages of multimedia content.

Difference between the Internet and the World Wide Web:
The terms Internet and World Wide Web, although often used synonymously, are different. The term Internet is a nominalised abbreviation of Internetworking, and came into use in 1982. The Internet is not a single network, but a vast network of networks. These networks communicate with each other via the existing telecommunications networks. The Internet offers several different services, including email and File Transfer Protocol (FTP). The World Wide Web, commonly known as "the Web," is the largest and fastest growing area of the Internet (Worsley, 2000). The Web uses the network of the Internet to access and link web sites. The Internet essentially provides the infrastructure over which the Web is able to operate (Figure 1). The Web is a way of organizing information so that any workstation or computer around the world can access it through the Internet via any means of connectivity.

WEB BROWSER
A web browser is application software that allows us to view and explore information on the web. A user can request any web page simply by entering its URL into the address bar. A web browser can show text, audio, video, animation and more.
It is the responsibility of a web browser to interpret the text and commands contained in the web page. Earlier web browsers were text-based, while nowadays graphical and voice-based web browsers are also available. On a network, a web browser can retrieve a web page from a remote web server. The web server may restrict access to a private network such as a corporate intranet. The web browser uses the Hypertext Transfer Protocol (HTTP) to make such requests. The browser does not display the HTML tags but uses them to determine how to display the document. To present the web page, web browsers coordinate the various resources that belong to it, such as style sheets, scripts, and images.
The following are the most common web browsers available today:

Browser                              Vendor
Internet Explorer/Microsoft Edge     Microsoft
Google Chrome                        Google
Mozilla Firefox                      Mozilla
Netscape Navigator                   Netscape Communications Corp.
Opera                                Opera Software
Safari                               Apple
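To make the HTTP request mentioned above more concrete, the following Python sketch shows roughly what a browser sends when a URL is entered. It is a simplified illustration, not what any particular browser does: the function name build_get_request and the URL http://www.example.com/index.html are assumptions chosen for the example, and real browsers send many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for the given URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # An empty path means the root document of the site.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",    # request line: method, resource, protocol version
        f"Host: {parts.netloc}",   # the server the request is addressed to
        "Connection: close",       # ask the server to close the connection afterwards
        "",                        # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


# Assumed example URL; any http URL has the same shape.
request_text = build_get_request("http://www.example.com/index.html")
print(request_text)
```

The request line names the resource (/index.html) and the Host header names the server, which mirrors the two parts of the URL that the user typed.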
Architecture
Many web browsers are available in the market. All of them interpret and display information on the screen; however, their capabilities and structure vary depending on the implementation. The basic components that every web browser must have are listed below:
o Controller/Dispatcher
o Interpreter
o Client Programs
The controller works like the control unit of a CPU. It takes input from the keyboard or mouse, interprets it, and directs other services to act on the input it receives. The interpreter receives information from the controller and executes the instructions line by line. Some interpreters are mandatory while others are optional. For example, the HTML interpreter is mandatory and the Java interpreter is optional. A client program implements the specific protocol that will be used to access a service. The commonly used client programs are: HTTP, SMTP, FTP, NNTP, POP.

WEB SERVERS
Web servers are computers that deliver (serve up) web pages. Every web server has an IP address and possibly a domain name. For example, if you enter the URL http://www.webopedia.com/index.html in your browser, this sends a request to the web server whose domain name is webopedia.com. The server then fetches the page named index.html and sends it to your browser. Any computer can be turned into a web server by installing server software and connecting the machine to the Internet. There are many web server software applications, including public domain software and commercial packages.

Web Server Working
A web server responds to a client request in either of the following two ways:
o Sending the file associated with the requested URL to the client.
o Generating a response by invoking a script and communicating with a database.
Fig: Architecture
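The two ways of responding listed above can be sketched in Python using the standard http.server module. This is a minimal sketch under stated assumptions, not how any production web server is implemented: STATIC_FILES, run_script, build_response, SketchHandler, serve and the /cgi-bin/ path prefix are all hypothetical names chosen for illustration, and the "script" here only produces a timestamp instead of querying a real database.

```python
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

# Hypothetical in-memory "document root": URL path -> file contents.
STATIC_FILES = {"/index.html": b"<html><body>Home page</body></html>"}


def run_script(path: str) -> bytes:
    """Stand-in for a server-side script that would normally query a database."""
    now = datetime.now(timezone.utc).isoformat()
    return f"<html><body>Generated for {path} at {now}</body></html>".encode()


def build_response(path: str) -> Tuple[int, bytes]:
    """Return (status code, body) for a requested URL path."""
    if path in STATIC_FILES:
        # Way 1: send the file associated with the requested URL.
        return 200, STATIC_FILES[path]
    if path.startswith("/cgi-bin/"):
        # Way 2: generate the response by invoking a script.
        return 200, run_script(path)
    return 404, b"<html><body>Not Found</body></html>"


class SketchHandler(BaseHTTPRequestHandler):
    """Connects build_response to real HTTP GET requests."""

    def do_GET(self):
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Run the sketch server on localhost until interrupted (port is an assumption)."""
    HTTPServer(("127.0.0.1", port), SketchHandler).serve_forever()
```

Calling serve() and opening http://127.0.0.1:8000/index.html in a browser would return the stored page, while a path such as /cgi-bin/report would return freshly generated content.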
Architecture
Web server architecture follows one of the following two approaches:
o Concurrent approach
o Single-process event-driven approach
The concurrent approach allows the web server to handle multiple client requests at the same time. It can be achieved by the following methods:
o Multi-process
o Multi-threaded
o Hybrid
Multi-process: A single process (the parent process) starts several single-threaded child processes and distributes incoming requests to them. Each child process is responsible for handling a single request. It is the responsibility of the parent process to monitor the load and decide whether processes should be killed or forked.
Multi-threaded: Unlike the multi-process method, a single process creates multiple threads, and each thread handles one request.
Hybrid: A combination of the above two methods. Multiple processes are created and each process starts multiple threads. Each thread handles one connection. Using multiple threads in a single process places less load on system resources than using a separate process for every connection.
In the single-process event-driven approach, one process handles many connections from a single event loop instead of dedicating a process or thread to each request.

Examples
The following list describes the leading web servers available today:
1. Apache HTTP Server
This is the most popular web server in the world, developed by the Apache Software Foundation. Apache is open source software and can be installed on almost all operating systems, including Linux, UNIX, Windows, FreeBSD, Mac OS X and more. About 60% of web server machines run the Apache web server.
2. Internet Information Services (IIS)
The Internet Information Server (IIS) is a high-performance web server from Microsoft. It runs on Windows NT/2000 and 2003 platforms (and possibly on newer Windows versions as well). IIS comes bundled with Windows NT/2000 and 2003. Because IIS is tightly integrated with the operating system, it is relatively easy to administer.
3. Lighttpd
Lighttpd, pronounced "lighty", is also a free web server and is distributed with the FreeBSD operating system.
This open source web server is fast, secure and consumes much less CPU power. Lighttpd can also run on Windows, Mac OS X, Linux and Solaris operating systems.
4. Sun Java System Web Server
This web server from Sun Microsystems is suited for medium and large web sites. Although the server is free, it is not open source. It runs on Windows, Linux and UNIX platforms. The Sun Java System web server supports various languages, scripts and technologies required for Web 2.0, such as JSP, Java Servlets, PHP, Perl, Python, Ruby on Rails, ASP and ColdFusion.
5. Jigsaw Server
Jigsaw (W3C's server) comes from the World Wide Web Consortium. It is open source and free, and can run on various platforms such as Linux, UNIX, Windows, Mac OS X and FreeBSD. Jigsaw is written in Java and can run CGI scripts and PHP programs.
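The multi-threaded method described under Architecture can be sketched in Python with a thread pool. This is only an illustration under stated assumptions: handle_request, serve_concurrently, the worker count and the simulated delay are hypothetical stand-ins, and none of the servers listed above is claimed to work exactly this way.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def handle_request(request_id: int) -> str:
    """Pretend to serve one client request (the sleep simulates disk or network I/O)."""
    time.sleep(0.05)
    worker = threading.current_thread().name
    return f"request {request_id} handled by {worker}"


def serve_concurrently(request_ids: Iterable[int], workers: int = 4) -> List[str]:
    """Handle requests with several threads inside a single process.

    Each worker thread serves one request at a time; results come back
    in the same order as the incoming request ids.
    """
    with ThreadPoolExecutor(max_workers=workers, thread_name_prefix="worker") as pool:
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    for line in serve_concurrently(range(8)):
        print(line)
```

Because the threads wait on their simulated I/O at the same time, eight requests finish in roughly the time of two sequential ones when four workers are used. A multi-process variant would follow the same shape with separate processes in place of threads, at the cost of more system resources per worker.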