Tuesday, December 18, 2012

Key Internet Terms

APACHE:

  • Apache is the world's most popular web server (HTTP server).
  • The Apache web server (Apache HTTP Server) is often referred to simply as Apache.
  • The first version of Apache was based on the NCSA httpd Web server and was developed in 1995. 
  • The Apache server is developed by an open source community under the Apache Software Foundation.
  • The original version of Apache was written for UNIX, but there are now versions that run under OS/2, Windows and other platforms. 
  • Apache has various modules for supporting CGI, SSI, user authentication, URL redirection, anonymous user access, automatic directory listings, support for HTTP header metafiles, support for loading modules, content negotiation, caching proxy abilities, server status display, user home directories, etc.

APPLICATION SERVER:

  • Provides the business logic (methods) for client applications.
  • An application server can support other protocols apart from HTTP.
  • An application server is based on a 3-tier architecture (separation of the business logic from the presentation logic and the database logic).
  • Application servers usually take care of most of the technical issues and allow developers to concentrate on the business need.

WEB APPLICATION SERVER:

  • An application server that combines or works with a web server is called a web application server.
  • When is an application server not needed?
    • For small, non-critical applications, there is no need to invest in an application server.
    • A web browser that supports an HTML-based front end communicates with a web server. The web server provides several different ways to service a request and to send back a modified or new web page to the user. These approaches include the Common Gateway Interface (CGI), Microsoft's Active Server Pages (ASP), and JavaServer Pages (JSP).

COOKIE: 

  • Also known as browser cookies or tracking cookies.
  • Originally designed to help support session state, though custom cookies can be used for other things as well.
  • A cookie stores a single piece of data under a unique name.
  • Cookie names have to be unique within a given website (just like variable names).
  • Cookies are stored with respect to the site (domain and path) that created them.
  • A cookie comes with an expiration date; when that date arrives, the cookie is destroyed. A cookie created without an expiration date is erased when the browser is closed.
  • Cookie size is limited to about 4 KB.
  • Browsers have traditionally been required to support at least 20 cookies per web server and at least 300 cookies in total; actual limits vary by browser.
  • A cookie is a piece of information that is issued by a server in an HTTP response and stored for future use by the HTTP client (browser).
  • The browser sends it back to the server with each subsequent request to that site.
  • The user doesn’t get involved in the cookie exchange; it is managed by the server and the cookie-enabled client.
  • Format of the cookie -
    • cookieName=cookieValue;expires=expirationDateGMT;path=URLpath;domain=siteDomain 
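
As a rough illustration of the format above, here is a minimal Python sketch that builds a Set-Cookie header value and parses a Cookie request header with the standard http.cookies module. The cookie name sessionId, its value, and the domain example.com are made-up values used only for illustration.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, path="/", domain=None, max_age=None):
    """Return the value of a Set-Cookie header for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = path
    if domain is not None:
        cookie[name]["domain"] = domain
    if max_age is not None:
        # Lifetime in seconds; without it the cookie lasts for the browser session.
        cookie[name]["max-age"] = max_age
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Turn a Cookie request header ('a=1; b=2') into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


if __name__ == "__main__":
    print(build_set_cookie("sessionId", "abc123", domain="example.com", max_age=3600))
    print(parse_cookie_header("sessionId=abc123; theme=dark"))
```

The order of the attributes in the generated header is chosen by the library and may differ from the order shown in the format line.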

DEPLOYMENT DESCRIPTOR:

  • Describes how a component, module or application (such as a web application or enterprise application) should be deployed.
  • Configuration file for an artifact that is deployed to some container/engine. 
  • Java web applications use a deployment descriptor file to determine how URLs map to servlets, which URLs require authentication, and other information.
  • Describes the classes, resources and configuration of the application and how the web server uses them to serve web requests.
  • XML based.
  • One DD per web application.
  • JSP compilation and URL mappings are taken care of automatically, but when more control is required, use the <jsp-file> element instead of <servlet-class>.
  • For web applications, the deployment descriptor must be called web.xml and must reside in the WEB-INF directory in the web application root.
  • For Java EE enterprise applications, the deployment descriptor must be named application.xml and must be placed directly in the META-INF directory at the top level of the application .ear file.
  • You can use the DD to customize other aspects of your web application, including security roles, error pages, tag libraries, and initial configuration information; on a full J2EE server, you can even declare that you’ll be accessing specific Enterprise JavaBeans.
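
To make the URL-to-servlet mapping a bit more concrete, the following Python sketch reads a tiny, hypothetical web.xml fragment and resolves each URL pattern to its servlet class. The servlet name welcome and the class com.example.WelcomeServlet are invented for this example, and the fragment omits the XML namespace declarations that real descriptors usually carry, so treat it as a simplified sketch rather than how a container actually processes the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free descriptor fragment used only for illustration.
WEB_XML = """<web-app>
  <servlet>
    <servlet-name>welcome</servlet-name>
    <servlet-class>com.example.WelcomeServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>welcome</servlet-name>
    <url-pattern>/welcome</url-pattern>
  </servlet-mapping>
</web-app>"""


def url_to_servlet_class(descriptor_xml):
    """Map every <url-pattern> to the class of the servlet it names."""
    root = ET.fromstring(descriptor_xml)
    # First pass: servlet name -> servlet class.
    classes = {
        servlet.findtext("servlet-name"): servlet.findtext("servlet-class")
        for servlet in root.findall("servlet")
    }
    # Second pass: URL pattern -> class, joined through the servlet name.
    return {
        mapping.findtext("url-pattern"): classes.get(mapping.findtext("servlet-name"))
        for mapping in root.findall("servlet-mapping")
    }


if __name__ == "__main__":
    print(url_to_servlet_class(WEB_XML))
```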

FORM (HTML FORM):

  • HTML forms are used to pass data to a server.
  • Part of web page that includes areas where readers can enter information to be sent back to web server.
  • A form can contain input elements like text fields, checkboxes, radio-buttons, submit buttons and more.
  • When the user submits a form, the browser issues a request that sends the form's data to the server: a GET request by default, or a POST request when the form specifies method="post".

    HTTP:

    • It is the agreed-upon method (protocol) for transferring hypertext documents around the web.
    • Protocol used by the web browser (client) to interact with the application (server).
    • HTTP is called a stateless protocol because each request is processed independently, without any knowledge of the requests that came before it.
    • HTTP doesn’t have any mechanism for the server to know that the client is gone.
    • HTTP is the network protocol of the Web used to deliver virtually all files and other resources.
    • HTTP communication usually takes place over TCP/IP sockets.
    • It is both simple and powerful. Knowing HTTP enables you to write Web browsers, Web servers, automatic page downloaders, link-checkers, and other useful tools.
    • The default port on which HTTP servers listen is 80.

    HTTP 0.9

    • HTTP was initially a very simple protocol used to request pages from a server. 
    • There were no request headers.
    • Only one method - GET.
    • Response had to be a HTML document.
    • The browser would connect to the server and send a GET command (example below), and the server would respond with the contents of the requested file:
      • GET /welcome.html 

    HTTP 1.0

    • HTTP/1.0 evolved from the original '0.9' version of HTTP. 
    • By any reasonable standard, the HTTP/1.0 protocol has been stunningly successful in spite of numerous flaws.
    • The process leading to HTTP/1.0 involved significant debate and experimentation, but never produced a formal specification. 
    • The HTTP Working Group (HTTP-WG) of the Internet Engineering Task Force (IETF) produced a document (RFC1945) that described the 'common usage' of HTTP/1.0, but did not attempt to create a formal standard out of the many variant implementations.
    • HTTP 1.0 defines 16 headers, none of them is mandatory.

    HTTP 1.1

    • The basic operation of HTTP/1.1 remains the same as for HTTP/1.0, and the protocol ensures that browsers and servers of different versions can all interoperate correctly.
    • If the browser supports version 1.1, it puts HTTP/1.1 on the request line instead of HTTP/1.0.
    • When the server sees this, it knows it can make use of the new 1.1 features; otherwise, the server responds according to the older standard.
    • The HTTP/1.1 specification states the various requirements for clients, proxies, and servers.
    • HTTP 1.1 defines 46 headers, and one ("Host") is required in all requests.
    • Major Changes:
      • Hostname Identification - Every request sent using HTTP/1.1 must identify the hostname (from the URL) of the request in the Host header. Before HTTP 1.1, each host name required a unique IP address.
      • Content Negotiation - Ability to have a number of different versions of a single resource (difference in language, file format etc). Negotiation is done between browser and server.
      • Persistent connection - (Most Important feature)
        • In HTTP 1.0, a connection between the client and server is closed after a single request/response cycle.
        • In HTTP 1.1, a connection between the client and server is kept alive and reused for multiple requests.
        • HTTP 1.1 is faster because of persistent connections: the client doesn't need to re-establish the TCP connection after each request.
        • Persistent connections are the default behavior unless the browser explicitly tells the server not to use them, i.e., the server assumes multiple requests on each connection.
        • Persistent connections are controlled by the Connection header. Unless a Connection: close header is given, the connection will remain open. The connection is also closed after a server-configurable timeout (typically 15 seconds).
        • Pages today include inlined documents, images and other multimedia content. Without persistent connections, these pages can be slow to download because each item needs to be requested separately from the server, each on a separate connection. With persistent connections, downloads are faster.
      • Byte Ranges
        • Byte ranges allow browsers to request parts of documents.
        • This can be used to continue an interrupted transfer, or to obtain just part of a long document (say, a single page).
        • Byte ranges are implemented by the Range header. For example, to request just the second 500 bytes of a document, the request would include - Range: bytes=500-999
        • A single request can also ask for more than one range at once (for example, first 500 bytes and the last 500 bytes of a file). When the server replies, it will send back each part in a single response, using MIME multipart encoding to distinguish the parts.
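
The byte-range idea can be sketched in a few lines of Python. This is only a simplified illustration under stated assumptions: the helper name parse_byte_range is made up, and it handles a single range only (including the "last N bytes" suffix form), not the multi-range case described above.

```python
def parse_byte_range(header_value, resource_length):
    """Return inclusive (start, end) offsets for a header like 'bytes=500-999'."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form, e.g. 'bytes=-500' means the last 500 bytes.
        start = max(resource_length - int(last), 0)
        end = resource_length - 1
    else:
        start = int(first)
        # An open-ended range ('bytes=1500-') runs to the end of the resource.
        end = int(last) if last else resource_length - 1
        end = min(end, resource_length - 1)
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


if __name__ == "__main__":
    document = bytes(range(256)) * 8  # 2048 bytes of sample data
    start, end = parse_byte_range("bytes=500-999", len(document))
    print(start, end, len(document[start:end + 1]))
```

Because both offsets are inclusive, the slice uses end + 1, which is why bytes=500-999 yields exactly 500 bytes.
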
      • Chunked Transfer Encoding
        • Normally, when sending back a response, the server has to know everything about the response it is about to send. For instance, servers should set the Content-Length header on each response to the length of the response itself.
        • This can be difficult for the server to do if the content is dynamically created. So in practice servers (including Apache) often do not send a Content-Length with dynamic documents.
        • In HTTP/1.1, a server that wants to start sending a response before knowing its total length can use chunked transfer encoding instead of sending a Content-Length.
        • With chunked transfer encoding, the entire document is broken into smaller chunks that are sent in series. Each chunk is preceded by its size in hexadecimal, and a zero-length chunk marks the end.
        • The "Transfer-Encoding: chunked" header identifies chunked transfer encoding.
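
Here is a small Python sketch of how a body can be framed with chunked transfer encoding and read back again. It assumes the simplest form of the encoding (no chunk extensions and no trailer headers), and the function names are made up for this example.

```python
def encode_chunked(body, chunk_size=8):
    """Frame a bytes body as a series of chunks followed by a zero-length chunk."""
    out = bytearray()
    for offset in range(0, len(body), chunk_size):
        chunk = body[offset:offset + chunk_size]
        out += b"%x\r\n" % len(chunk)  # chunk size in hexadecimal
        out += chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating chunk, no trailers
    return bytes(out)


def decode_chunked(data):
    """Reassemble the original body from chunked data (no extensions/trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


if __name__ == "__main__":
    framed = encode_chunked(b"Hello, world", chunk_size=5)
    print(framed)
    print(decode_chunked(framed))
```

The sender never needs the total length up front; it only needs the length of the chunk it is about to write.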

    Why HTTP is Stateless despite proxies, cookies and persistent connections?

    • Even though multiple requests can be sent over the same HTTP connection, the server does not attach any special meaning to their arriving over the same socket. That is solely a performance thing, intended to minimize the time/bandwidth that'd otherwise be spent reestablishing a connection for each request.
    • As far as HTTP is concerned, they are all still separate requests and must contain enough information on their own to fulfill the request. That is the essence of "statelessness". Requests will not be associated with each other absent some shared info the server knows about, which in most cases is a session ID in a cookie.
    • HTTP is considered stateless because the browser sends all the information the server works with (cookies, referrer, etc.) in the HTTP request headers.
    • While there might be a database involved which does store state, HTTP itself is stateless, because it doesn't store anything. And even if the socket is kept open, as long as the protocol doesn't store anything it is still considered stateless.
    • HTTP persistent connections relate to TCP connection being left open. HTTP operates on top of TCP - so TCP can be connected and/or stateful whereas HTTP would not. TCP is just the transport for HTTP.
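
The "session ID in a cookie" idea can be sketched as follows. This is a toy illustration, not how any particular server implements sessions: the cookie name sessionId, the in-memory SESSIONS dictionary, and the handle_request function are all assumptions made for the example. The point is that each request is handled on its own, and the only link between requests is the ID the client sends back.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side state, keyed by session ID. HTTP itself stores none of this.
SESSIONS = {}


def handle_request(cookie_header):
    """Handle one request; return (visit_count, set_cookie_value_or_None)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("sessionId")
    session_id = morsel.value if morsel else None

    set_cookie = None
    if session_id not in SESSIONS:
        # Unknown or missing ID: start a new session and ask the client to keep it.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"visits": 0}
        set_cookie = "sessionId=%s; Path=/" % session_id

    SESSIONS[session_id]["visits"] += 1
    return SESSIONS[session_id]["visits"], set_cookie


if __name__ == "__main__":
    visits, set_cookie = handle_request(None)
    print(visits, set_cookie)
    # The client echoes the name=value pair back on its next request.
    print(handle_request(set_cookie.split(";")[0]))
```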

    HTTPS

    • HTTP + SSL (HTTP sent over an SSL-secured connection).
    • Default port is 443

    HTTP REQUEST:

    • Request that browser sends to the web server.
    • Request could be to get some file from the web server.
    • Request could be to submit the data to be processed.

    HTTP RESPONSE:

    • Reply from a web server to the web browser's request; the requested web page, image, script or other resource is transmitted in the HTTP response.
    • Normally consists of a status line, one or more header lines, a blank line, and optionally a message body (the content).
    • An HTTP response can contain HTML. HTTP adds header information to the top of whatever content is in the response (in other words, the thing coming back from the server).
        HTTP/1.1 200 OK
        Date: Tue, 14 Sep 1982 01:40:00 GMT
        Content-Type: text/html
        Content-Length: 52

        <html>
        <body>
            <h1>Hello</h1>
        </body>
        </html>
    • Initial line contains HTTP version, response code and English reason phrase.
    • HTTP response codes:
      • 1XX - Informational messages.
      • 2XX - success messages.
      • 3XX - redirects the client to another URL.
      • 4XX - error on the client's part. 
      • 5XX - Server is aware that it has erred or is incapable of performing the request. 
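
To show how the pieces of a response fit together, here is a minimal Python sketch that splits a raw response into status line, headers and body, and classifies the status code by its first digit. The function names and class labels are assumptions for this example, and the parser deliberately ignores details such as repeated or folded headers.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_response(raw):
    """Split a raw HTTP response string into its main parts."""
    # The first blank line separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(code), reason, headers, body


def status_class(code):
    """Return the response class for a status code such as 200 or 404."""
    return STATUS_CLASSES.get(code // 100, "unknown")


if __name__ == "__main__":
    raw = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Content-Length: 14\r\n"
        "\r\n"
        "<h1>Hello</h1>"
    )
    version, code, reason, headers, body = parse_response(raw)
    print(version, code, reason, status_class(code), headers, body)
```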

    HTTP GET:

    • Get is the default and most common HTTP method.
    • "GET" is basically for just getting (retrieving) data.
    • Its main job is to ask the server for a resource. If the resource is available, it is returned to the user's browser. That resource may be an HTML page, a sound file, a picture file (JPEG), etc.
    • You can send parameters to the server with GET, but the total number of characters in a GET request URL is limited in practice, and the limit depends on the browser and server.
        GET /servlet/WelcomePG?userName=prince HTTP/1.1
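
As a quick sketch of how the parameters travel in a GET request, the following Python snippet pulls the method, path and query parameters out of a request line like the one above, using the standard urllib.parse module. The function name parse_request_line is made up for this example.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request_line(line):
    """Split 'METHOD target VERSION' and decode the query string of the target."""
    method, target, version = line.split()
    parts = urlsplit(target)
    # parse_qs returns a list per name because a parameter may repeat.
    return method, parts.path, parse_qs(parts.query), version


if __name__ == "__main__":
    print(parse_request_line("GET /servlet/WelcomePG?userName=prince HTTP/1.1"))
```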

    HTTP POST:

    • Used for Form Submissions.
    • With POST we can request something and also send data to the server. We use the POST method when we have to send a large amount of data to the server, such as a long enquiry form.
    • Data from the POST method is sent by the client as a part of the request body.
        POST /servlet/WelcomePG?userName=prince HTTP/1.1
        User-Agent: Mozilla/1.0
        Content-Type: application/x-www-form-urlencoded
        Content-Length: 13

        userName=john
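
A form submission like this can be assembled by hand, which makes it clear where the data goes. The Python sketch below URL-encodes the form fields into the request body and computes Content-Length from that body. The host example.com and the function name build_post_request are placeholders for illustration; a real client library would normally do this for you.

```python
from urllib.parse import urlencode


def build_post_request(host, path, form_fields):
    """Build the raw bytes of a form POST; the fields travel in the body."""
    body = urlencode(form_fields).encode("ascii")
    head = (
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        "Content-Length: %d\r\n"
        "\r\n" % (path, host, len(body))
    )
    return head.encode("ascii") + body


if __name__ == "__main__":
    request = build_post_request("example.com", "/servlet/WelcomePG", {"userName": "john"})
    print(request.decode("ascii"))
```

Content-Length counts only the bytes of the body after the blank line, not the headers.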

      HTTP HEAD:

      • Used to retrieve meta-information about a resource, for example to check when the resource was last modified on the server.
      • The structure of a HEAD request is the same as GET.
      • Gives you information about the requested resource without actually getting the resource.

      HTTP PUT:

      • "PUT" is used to add a resource to the server (or replace the one already stored) at the requested URL.

      HTTP TRACE:

      • Asks for a loop-back of the request message, so that the client can see what’s being received on the other end, for testing or troubleshooting.
      • TRACE allows the client to see what is being received at the other end of the request chain and use that data for testing or diagnostic information.
      • It is not used in production, only used for troubleshooting.

      HTTP OPTIONS:

      • This method allows the client to determine the options and/or requirements associated with a resource, or the capabilities of a server, without implying a resource action or initiating a resource retrieval.
      • Asks for a list of the HTTP methods to which the thing at the requested URL can respond.
      • Introduced in HTTP/1.1.
      • Almost no one uses it.

      HTTP DELETE:

      • "DELETE" is used to delete a resource on the server.

      HTTP CONNECT:

      • The HTTP specification reserves the method name CONNECT for use with a proxy that can dynamically switch to being a tunnel (for example, SSL tunneling through a firewall).
      • Note on idempotency: GET, HEAD, PUT and DELETE are idempotent; POST is not.

      HTTP PROXY:

      • A program that acts as an intermediary between a client and a server.
      • It receives requests from client, and forwards those requests to the intended servers.
      • A proxy has functions of both a client and server.
      • Proxies are commonly used in firewalls, for LAN-wide caches etc.

      HYPERLINK: 

      • Hyperlink (or link) is a reference to data that the reader can directly follow, or that is followed automatically.
      • A hyperlink points to a whole document or to a specific element within a document.

      HYPERTEXT:

      • Hypertext is text which is not constrained to be linear. 
      • Hypertext is text displayed on a computer or other electronic device with references (hyperlinks) to other text that the reader can immediately access.
      • Hypertext is text with hyperlinks.
      • Apart from running text, hypertext may contain tables, images and other presentational devices.

      HYPERMEDIA:

      • HyperMedia is a term used for hypertext which is not constrained to be text: it can include graphics, video and sound.

      INTERNET:

      • The Internet is a massive network of networks, a networking infrastructure.
      • It connects millions of computers together globally, forming a network in which any computer can communicate with any other computer as long as they are both connected to the Internet. Information that travels over the Internet does so via a variety of languages known as protocols. 

      JAVA BEAN:

      • A JavaBean (or "bean" for short) is simply a reusable component that implements specific design patterns to make it easy for programmers and development tools to discover the object's properties and behavior.

      JSP:

      • File with some HTML and Java code that executes top to bottom. At runtime, it is compiled into a Java class, which is actually a servlet.
      • A JSP page looks just like an HTML page, except you can put Java and Java-related things inside the page.
      • Your JSP eventually becomes a full-fledged servlet running in your web app.
      • Major limitation with JSP - The UI is only as good as the programmer's ability to generate the proper HTML.

      JSP FORWARD:

      • Passes the request object within the server to either a servlet or a JSP page.
      • The URL shown in the browser stays unchanged when you do forward.
      • It allows the next Java servlet/JSP to access the same request object. In Oracle Application Framework (OAF), for example, the request parameters of a page are accessible in processFormRequest (PFR) of its controller (CO) on form submit, and also in the next page when we navigate using setForwardURL or forwardImmediately.
      • A client browser will never realize whether its request is forwarded since the forwarding happens in the server side.
      • Forwarding is useful when it is necessary to continue processing the current request with different Java Servlet/JSP.

      JSP REDIRECT:

      • Creates a new request object, which doesn't carry any data from the old request.
      • The first request handler JSP page tells the browser to make a new request to the target servlet or JSP page.
      • The URL shown in the browser therefore changes to the URL of the new page when you redirect.
      • A client browser with redirection enabled will update its address bar to the address of the next Java servlet/JSP, which means that a redirection address must be an address the client can access.

       MIME:

      • Multipurpose Internet Mail Extensions
      • Internet standard that extends the format of email to support:
        • Text in character sets other than ASCII
        • Non-text attachments
        • Message bodies with multiple parts
        • Header information in non-ASCII character sets
      • MIME types form a standard way of classifying file types on the Internet. 
      • Internet programs such as Web servers and browsers all have a list of MIME types, so that they can transfer files of the same type in the same way, no matter what operating system they are working in.  

      MIME TYPE:

      • Sometimes referred to as Content-types  
      • Instead of term "MIME type", "Internet media type" should be preferred today. 
      • A media type is composed of two or more parts: A type, a subtype, and zero or more optional parameters. 
      • e.g. text/html; charset=UTF-8
        • text is type
        • html is subtype 
        • charset is parameter.
      • MIME types are also used in HTTP response to specify content-type.
      • Typical content types are “text/html”, “application/pdf”, and “image/jpeg”.
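
The type/subtype/parameter breakdown can be checked with a few lines of Python. This sketch leans on the standard email.message module, which already knows how to split a Content-Type value; the function name parse_content_type is an assumption for the example.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type value into (type, subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns the media type first, then (name, value) pairs.
    params = dict(msg.get_params()[1:])
    return msg.get_content_maintype(), msg.get_content_subtype(), params


if __name__ == "__main__":
    print(parse_content_type("text/html; charset=UTF-8"))
    print(parse_content_type("image/jpeg"))
```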

        SERVLETS:

        • Java-based web application server extension program that implements a standard API.
        • Servlets are modules of Java code that run in a server application (hence the name "Servlets", similar to "Applets" on the client side) to answer client requests. Servlets are not tied to a specific client-server protocol but they are most commonly used with HTTP and the word "Servlet" is often used in the meaning of "HTTP Servlet".

        SERVLET CONTAINER:

        • Also known as WEB CONTAINER, J2EE WEB CONTAINER, SERVLET ENGINE (older name).
        • It is the component of a web server that interacts with Java servlets.
        • A servlet container is nothing but a compiled, executable program which takes the responsibility of instantiating, initializing and invoking the components.
        • Container calls the service() method and handles the Request and Response parameters.
        • In order to be managed by the container, components must follow a certain contract.
        • The J2EE specification defines the contract between components and container, and specifies the deployment model for components.
        • When developing applications with J2EE, we develop components that follow the contract defined in the specification.
        • The web container implements the Servlet and JSP APIs and provides the infrastructure for deploying and managing web components.
        • As per the specification, all web containers must support the HTTP protocol; however, a container may support additional protocols like HTTPS.
        • There are three types of Servlet Containers:
          • Standalone - Java-based web servers where the two modules, the main web server and the servlet container, are integral parts of a single program. Ex. Tomcat running all by itself.
          • In-process - The main web server and the servlet container are different programs, but the container runs within the address space of the main server as a plug-in. Ex. Tomcat running inside Apache Web Server. Apache loads a JVM that runs Tomcat. The web server handles the static content by itself and Tomcat handles the servlets and JSPs.
          • Out-of-process - The web server and the container run as separate programs. The web server uses a plug-in to communicate with the servlet container. Usually the plug-in is provided by the servlet container vendor. Ex. Tomcat, Apache and mod_jk (plug-in).

        SERVLET INVOKER:

        • As defined in the Apache Tomcat documentation, the purpose of the invoker servlet is to allow a web application to dynamically register new servlet definitions that correspond with a <servlet> element in the /WEB-INF/web.xml deployment descriptor. With the servlet invoker enabled, servlet mappings need not be specified for servlets; the invoker dispatches servlets by class name.
        • Enabling the servlet invoker can create a security hole in a web application, because any servlet on the classpath, even one inside a .jar, can be invoked directly.
        • The application also becomes non-portable.
        • In Tomcat 3.x, the servlet invoker is enabled by default.
        • In Tomcat 4.x, the servlet invoker is disabled by default. Its <servlet-mapping> tag is commented out in the default web application descriptor (web.xml), located under $CATALINA_HOME/conf.
        • Deprecated in Tomcat 6 and removed entirely in Tomcat 7.

          SERVLET SESSION:

          • Mechanism for maintaining state between HTTP requests during a period of continuous interaction between a browser and a web application. A session may be initiated at any time by the application and terminated by the application, by the user closing the browser, or by a period of user inactivity. A session usually corresponds to an application login/logout cycle.

          SSL

          • Secure Sockets Layer.
          • Security protocol used to secure data between two machines using encryption.
          • Allows private and secure access over the Internet by providing authentication, encryption and integrity checks.
          • If SSL detects that the connection is not secure, it drops the connection, and the client and server have to establish a new secure connection.
          • SSL sits above a communication protocol like TCP/IP and below application protocols like HTTP and SMTP.
          • Different types of SSL certificates - DomainSSL, OrganizationalSSL, and ExtendedSSL (highest trust level).
          • An SSL certificate helps assure the user that they are on a legitimate website.

           URI:

          • Uniform Resource Identifier.
          • Specific character string that identifies a name or a resource.
          • URIs can be classified as locators (URLs), as names (URNs), or as both.
          • A resource is some chunk of information, commonly a file, but it could also be a dynamically generated query result, the output of a program or script, or something else.

          URL:

          • Uniform Resource Locator.
          • Specific character string that constitutes a reference to an Internet resource.

          URN:

          • Uniform Resource Name.
          • URN is a URI that identifies a resource by name. It doesn't provide its location or how to access it.
          • Ex. ISBN 0-486-27557-4 uniquely identifies a book but doesn't tell anything of its location.
          • A URN functions like a person's name, while a URL resembles that person's street address. In other words: the URN defines an item's identity, while the URL provides a method for finding it.
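
The difference between a locator and a name shows up when the strings are taken apart. The Python sketch below uses the standard urllib.parse.urlsplit function on one URL and one URN. The address www.example.com:8080 is a placeholder, the URN form urn:isbn:... is one common way of writing the ISBN above as a URI, and the describe_uri helper with its simple "URN if the scheme is urn" rule is an assumption for this example.

```python
from urllib.parse import urlsplit


def describe_uri(uri):
    """Break a URI into parts and label it as a URL or a URN."""
    parts = urlsplit(uri)
    kind = "URN" if parts.scheme.lower() == "urn" else "URL"
    return {
        "kind": kind,
        "scheme": parts.scheme,
        "host": parts.hostname,  # None for a URN: it names, it does not locate
        "port": parts.port,
        "path": parts.path,
    }


if __name__ == "__main__":
    print(describe_uri("http://www.example.com:8080/docs/index.html?lang=en"))
    print(describe_uri("urn:isbn:0-486-27557-4"))
```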

          WEB:

          • The World Wide Web (abbreviated as WWW or W3, commonly known as the Web), is a system of interlinked hypertext documents accessed via the Internet.
          • The World Wide Web, or simply Web, is a way of accessing information over the medium of the Internet. It is an information-sharing model that is built on top of the Internet.
          • The Web uses the HTTP protocol, only one of the languages spoken over the Internet, to transmit data. 
          • The terms Web (World Wide Web) and Internet are often used as if they were synonyms, but they are different things; see the INTERNET definition above for details. The Web is a system we use to access information over the Internet. It isn't the only such system, but it is the most popular and widely used.

          WEB 1.0:

          • Web 1.0 provided one-way communication from website to visitors. It was prevalent primarily in the early 1990s.

          WEB 2.0:

          • It's a marketing term referring to a different way of building applications. Ajax is considered to be part of Web 2.0.
          • Web 2.0 provides two-way communication and allows user-generated content. It came in the late 1990s.
          • Web 2.0 has made users the creators of content.
          • Associated with Ajax, social networking, mash-ups, media sharing, online commerce and wikis.
          • Web 2.0 encourages participation, collaboration, and information sharing. Examples of Web 2.0 applications are Youtube, Orkut, Wiki, Flickr, Facebook, and so on.

          WEB 3.0:

          • Computers can interpret information like humans and intelligently generate and distribute useful content tailored to the needs of users.
          • Intelligent Web - Machine facilitated understanding of information.
          • Ex. recommender systems.
          • Associated technologies: JavaScript, JSON, XML, AJAX, Web Services, SaaS, Cloud Computing, Mobile (ubiquitous) Computing.

          SOCIAL MEDIA:

          • Social media refers to the next step after Web 2.0. In addition to two-way communication, it allows visitors to communicate with each other. Thus it provides three-way communication.

          WEB APPLICATION:

          • An application available over the Web, composed of web components.

          WEB BROWSER:

          • The browser is the piece of software (like Netscape or Mozilla) that knows how to communicate with the server. 
          • The browser’s other big job is interpreting the HTML code and rendering the web page for the user.
          • The browser works over HTTP and is responsible for creating requests and interpreting responses.

          WEB COMPONENTS:

          • Software component that services an incoming HTTP request and provides some kind of (hopefully valid) response.
          • Server-side object used by a Web-based client (browsers) to interact with J2EE applications.
          • Servlets and JSP pages are collectively called web components.

          WEB SERVER:

          • Web servers tirelessly wait for web requests; on getting a request for a resource, server finds the resource and sends it back.
          • Web servers are combinations of hardware and software that deliver (serve up) web pages.
          • They use the client-server model and the HTTP protocol.
          • A web server delivers web pages to the client over HTTP; the client interacts with the web server through a web browser.
          • The web server passes requests for web components to the web container.
          • A Web application runs within a Web container of a Web server. The servlet container together with the web server (or application server) provides the HTTP interface to the world.