Web Application Technologies
The HTTP Protocol
HTTP Requests
- Referer header: Indicates the URL that the request originated from.
- User-Agent header: Provides information about the browser or client software.
- Host header: Hostname appearing in the full URL being accessed.
- Cookie header: Holds additional parameters that the server issued to the client.
HTTP Responses
- Server header: The web server software being used; it may also include information about installed modules and the server operating system.
- Set-Cookie header: Issues a cookie that the browser submits in later requests.
- Pragma header: Tells the browser not to store the response in its cache.
- Expires header: Indicates that the response content expired in the past and should not be cached.
- Content-Type header: Indicates what type of content is included in the body.
- Content-Length header: Indicates the length of the message body in bytes.
HTTP Methods
GET and POST are the two HTTP methods most commonly used when attacking web applications. GET is used to request resources, while POST is used to perform actions.
- HEAD: Same as a GET request, but the server should not send the message body in its response.
- TRACE: Used for diagnostics; the server should return the exact contents of the request message it received.
- OPTIONS: The response should list the HTTP methods available for a particular resource.
- PUT: Attempts to upload the specified resource to the server, using the content contained in the body of the request.
URLs
- Otherwise known as a uniform resource locator, it is a unique identifier for a web resource. The typical format is as follows:
protocol://hostname[:port]/[path/]file[?param=value]
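As a rough illustration of the format above, the following Python sketch splits a URL into its parts with the standard library. The URL itself is hypothetical and was chosen only so that every optional part of the format is present.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL that exercises every part of the format above.
url = "https://wahh-app.com:8443/search/cars.jsp?make=ford&model=pinto"

parts = urlsplit(url)
scheme = parts.scheme      # protocol
hostname = parts.hostname  # hostname
port = parts.port          # port (None when the URL omits it)
path = parts.path          # [path/]file
params = parse_qs(parts.query)  # param=value pairs, each value as a list

print(scheme, hostname, port, path, params)
```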
REST
- Representational state transfer (REST) is a style of architecture for distributed systems. In a RESTful system, the URL typically carries its parameters in the file path rather than in the query string.
http://wahh-app.com/search/ford/pinto
HTTP Headers
General Headers
- Connection: Tells the other end of the communication whether the TCP connection should be closed after the HTTP transmission completes.
- Content-Encoding: Specifies the type of encoding used in the message body. For instance, gzip is used for faster transmission.
- Content-Length: Indicates the length of the message body in bytes.
- Content-Type: Indicates what type of content is included in the body.
- Transfer-Encoding: Specifies any encoding performed on the message body to assist in transferring it over HTTP. Usually used to specify chunked encoding.
Request Headers
- Accept: Tells the server what content types the client will accept.
- Accept-Encoding: Tells the server what content encodings the client will accept.
- Authorization: Submits the credentials for one of the built-in HTTP authentication types.
- Cookie: Submits the previously issued cookie(s) to the server.
- Host: The hostname that appears in the URL.
- If-Modified-Since: Tells the server when the browser last received the requested resource. If it hasn't changed since then, the server may respond with a 304 response telling the client to use the cached copy of the resource.
- If-None-Match: Specifies an entity tag (an identifier denoting the contents of the message body). Can also be used to find out if the browser should just use a cached copy of a resource.
- Origin: Used in cross-domain Ajax requests to indicate the domain that a request originated from.
- Referer: Specifies the URL from which the request originated.
- User-Agent: Provides information about the client browser and software generating the request.
Response Headers
- Access-Control-Allow-Origin: Indicates whether the resource can be retrieved via cross-domain Ajax requests.
- Cache-Control: Passes directions back to the browser about caching (for instance, no-cache).
- ETag: Specifies an entity tag that can be submitted later in an If-None-Match header to tell the server what version of the resource is cached by the browser.
- Expires: Tells the browser how long the contents of the message body stay valid.
- Location: Used in redirection responses to specify the target of a redirect.
- Pragma: Like Cache-Control, this also passes directions back to the browser about caching.
- Server: Provides information about the server software.
- Set-Cookie: Issues cookies to the browser for subsequent requests.
- WWW-Authenticate: Used in 401 responses to provide information about what types of authentication the server supports.
- X-Frame-Options: Specifies if and how the response can be loaded within a browser frame.
Cookies
- Typically consist of a name/value pair, but may be any string without a space.
- Optional Attributes
  - expires: Sets the date until which the cookie is valid.
  - domain: Specifies the domain for which the cookie is valid.
  - path: Specifies the URL path for which the cookie is valid.
  - secure: If set, the cookie is submitted only in HTTPS requests.
  - HttpOnly: If set, the cookie cannot be accessed via client-side JavaScript.
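To make the attributes above concrete, here is a minimal sketch that parses a Set-Cookie value with Python's standard `http.cookies` module. The cookie name `sessid`, its value, and the domain are made up for illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value; name, value, and domain are illustrative.
raw = "sessid=abc123; Domain=wahh-app.com; Path=/; Secure; HttpOnly"

jar = SimpleCookie()
jar.load(raw)

morsel = jar["sessid"]
cookie_value = morsel.value          # the value half of the name/value pair
cookie_domain = morsel["domain"]     # domain the cookie is valid for
cookie_path = morsel["path"]         # URL path the cookie is valid for
is_secure = bool(morsel["secure"])   # only sent over HTTPS when set
is_http_only = bool(morsel["httponly"])  # hidden from client-side JavaScript when set

print(cookie_value, cookie_domain, cookie_path, is_secure, is_http_only)
```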
Status Codes
1xx: Informational
- 100 Continue: Request headers were received, so continue sending the body.
2xx: Successful
- 200 OK: Request was successful and the response body contains the request result.
- 201 Created: Response to a PUT request stating the request was successful.
3xx: Redirect
- 301 Moved Permanently: Redirects the browser to a different URL specified in the Location header. The client should use the new URL in the future rather than the original one.
- 302 Found: Redirects temporarily to a different URL. The client should revert to the original URL in subsequent requests.
- 304 Not Modified: Tells the browser to use the cached copy of the requested resource. The server uses the If-Modified-Since and If-None-Match request headers to determine whether to respond with a 304 response.
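The 304 decision can be sketched roughly as follows. This is a simplified illustration, not how any particular server implements it: the function name `conditional_status` is hypothetical, and it compares a single entity tag only, ignoring lists of tags, weak validators, and If-Modified-Since.

```python
from http import HTTPStatus
from typing import Optional


def conditional_status(current_etag: str, if_none_match: Optional[str]) -> HTTPStatus:
    """Return 304 when the client's cached entity tag still matches, else 200."""
    if if_none_match is not None and if_none_match == current_etag:
        # The browser's cached copy is still current, so no body is needed.
        return HTTPStatus.NOT_MODIFIED
    # No cached copy, or the resource changed: send the full response.
    return HTTPStatus.OK


print(conditional_status('"v2"', '"v2"'))
print(conditional_status('"v2"', '"v1"'))
```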
4xx: Request Error
- 400 Bad Request: The client submitted an invalid HTTP request.
- 401 Unauthorized: The server requires HTTP authentication before the request can be granted. In the response, the WWW-Authenticate header will specify which authentication types are supported.
- 403 Forbidden: No one is allowed to access the requested resource, regardless of authentication.
- 404 Not Found: The resource doesn't exist.
- 405 Method Not Allowed: The method used in the request is not supported for that URL.
- 413 Request Entity Too Large: The body of the request is too large for the server to handle.
- 414 Request URI Too Long: The URL is too large for the server to handle.
5xx: Server Error
- 500 Internal Server Error: The server ran into an error while processing the request.
- 503 Service Unavailable: The application accessed is not responding.
HTTPS
HTTPS still uses TCP for transporting information, but it augments TCP by using Secure Sockets Layer (SSL) to assure the privacy and integrity of data.
HTTP Proxies
- When an HTTP request is unencrypted and sent to a proxy server, the full URL is put into the request. The proxy server then extracts the hostname and port and uses these to direct the request to the correct web server.
- Whenever HTTPS is used, the browser can't perform the SSL handshake with the proxy server, since that would break the secure tunnel and leave the communication vulnerable to interception. Thus, in HTTPS the proxy is just used as a TCP-level relay, passing through all traffic.
- To do this the browser makes an HTTP request to the proxy server using a CONNECT method, specifying a destination hostname and port. If the proxy server responds with a 200 status, the TCP connection is kept open and subsequent traffic is relayed to the destination web server.
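The CONNECT exchange described above might look roughly like this. It is only a sketch that builds the request bytes and inspects a status line; the helper names are hypothetical and no network connection is made.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the request a client sends to a proxy to open a TCP tunnel."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(status_line: str) -> bool:
    """Return True when the proxy answered the CONNECT request with a 200 status."""
    parts = status_line.split(" ", 2)
    return len(parts) >= 2 and parts[1] == "200"


request = build_connect_request("wahh-app.com", 443)
print(request.decode("ascii"))
print(tunnel_established("HTTP/1.1 200 Connection established"))
```

Once the proxy answers with a 200 status, the client would start the SSL handshake over the same TCP connection, and the proxy would relay the bytes without being able to read them.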
HTTP Authentication
- Basic: simple authentication sending credentials as a Base64-encoded string in the request header.
- NTLM: A challenge-response mechanism using a version of the Windows NTLM protocol.
- Digest: Challenge-response mechanism using MD5 checksums of a nonce with the user's credentials.
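As a small illustration of the Basic scheme, the sketch below builds the Authorization header value from a username and password. The credentials are placeholders and the helper name is hypothetical. Note that Base64 is an encoding, not encryption, so the credentials are trivially recoverable.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


header_value = basic_auth_header("user", "pass")  # placeholder credentials
recovered = base64.b64decode(header_value.split(" ", 1)[1]).decode("utf-8")

print(header_value)
print(recovered)
```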
Web Functionality
Server-Side Functionality
Four main ways parameters are sent to the application:
- URL Query String
- The file path of REST-style URLs
- HTTP cookies
- The body of requests using the POST method
The Java Platform
- Originally owned by Sun Microsystems, this is now owned by Oracle.
- Relevant Terms:
  - Enterprise Java Bean (EJB): Encapsulates the logic of a specific business function within an application.
  - Plain Old Java Object (POJO): An ordinary Java object, as compared to an EJB. Used to denote objects that are user-defined, simpler, and more lightweight than EJBs.
  - Java Servlet: An object residing on an application server that receives HTTP requests from clients and returns HTTP responses.
  - Java web container: A platform/engine providing a runtime environment for Java-based web applications. Some examples include Apache Tomcat, BEA WebLogic, and JBoss.
- Common key application functions and their open-source or third-party components:
  - Authentication -- JAAS, ACEGI
  - Presentation layer -- SiteMesh, Tapestry
  - Database object relational mapping -- Hibernate
  - Logging -- Log4J
ASP.NET
- Uses the Microsoft .NET framework, providing a virtual machine (the Common Language Runtime) and .NET APIs. Because of this, any .NET language can be used, such as C# and VB.NET.
- Event-driven rather than script-based.
PHP
- Commonly used in the LAMP stack
  - Linux - Operating System
  - Apache - Web Server
  - MySQL - Databases
  - PHP - Programming language
- Commonly used open source applications put into custom-built applications:
  - Bulletin Boards -- PHPBB, PHP-Nuke
  - Administrative front ends -- PHPMyAdmin
  - Web mail -- SquirrelMail, IlohaMail
  - Photo galleries -- Gallery
  - Shopping Carts -- osCommerce, ECW-Shop
  - Wikis -- MediaWiki, WakkaWikki
Ruby on Rails
- Rails 1.0 was introduced in 2005, with an MVC architecture emphasis.
- Development is relatively fast because Rails can easily auto-generate:
  - a model for database content
  - controller actions for modifying it
  - default views for the user
SQL
- Structured Query Language (SQL) is used to access data in relational databases.
  - For instance, Oracle, MS-SQL, and MySQL.
- Relational databases store data in tables, with rows and columns. Columns represent data fields, and rows represent items with values for some or all of these fields.
- SQL uses queries for actions like reading, adding, updating, and deleting data.
  - e.g. select mail from users where name = 'lol'
- Vulnerabilities occur whenever an application passes user-supplied input into SQL queries executed by the back-end database.
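The vulnerability described in the last point can be sketched with an in-memory SQLite database. The table layout mirrors the query in these notes, while the rows and the malicious input are made up for illustration. The first query builds SQL by string concatenation; the second uses a bound parameter, which is one common mitigation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, mail TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("lol", "lol@example.com"), ("admin", "admin@example.com")],  # sample rows
)

# Hypothetical attacker-controlled input that closes the quote and adds a condition.
user_input = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the query text and changes its logic.
unsafe_query = "SELECT mail FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safer: the input is bound as data and is never parsed as SQL.
safe_rows = conn.execute(
    "SELECT mail FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

With the concatenated query, the injected condition is always true, so every row in the table is returned; the parameterized query finds no user with that literal name.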
XML
- Extensible Markup Language (XML)
- A specification for encoding data in a machine-readable form.
- Encapsulates content or child elements between tags.
- Tags can include attributes in a name/value pair.
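A tiny example may help here. The document below is invented for illustration; it shows a tag with an attribute, a child element, and how a parser exposes each of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: one element with an attribute and a child element.
document = '<pet type="cat"><name>ginger</name></pet>'

root = ET.fromstring(document)
tag_name = root.tag                  # the enclosing tag
attributes = root.attrib             # name/value pairs on the tag
child_text = root.find("name").text  # content encapsulated by the child element

print(tag_name, attributes, child_text)
```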
Web Services
- Many web services use the same protocols as web applications/sites, so they may be affected by the same vulnerabilities.
- Simple Object Access Protocol (SOAP) is used to exchange data in web services.
- SOAP is mainly used between server-side applications/services.
Client-Side Functionality
HTML
- Hypertext Markup Language (HTML) is a tag-based language like XML.
- XHTML is a different type of HTML based on XML and has stricter specifications.
Hyperlinks
- These are usually found as <a> tags, or anchor tags, in HTML. When clicking on these, the browser makes a request to the value of href.
Forms
- Forms typically use the POST method, which submits form data to the server in the request body.
- A form may contain a hidden parameter, such as redir, and a submit parameter, such as submit.
  - Both of these are submitted to the server, which may use them in its own logic.
- The target URL in the form submission may contain a preset parameter, such as app, which can be used to control server-side processing.
- The request may contain a cookie parameter, such as SESS, issued to the browser in an earlier response from the server. This can also be used to control server-side processing.
- The request includes a header specifying the content type, such as:
  - application/x-www-form-urlencoded
- Another type that might be used is:
  - multipart/form-data
  - An application may request multipart encoding in an enctype attribute in the form tag.
  - The Content-Type header in the request will then also specify a random string to use as a separator for parameters in the request body.
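For the URL-encoded content type, the request body can be sketched as below. The redir and submit names come from these notes, while the username and password fields and all of the values are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; only the redir and submit names come from the notes.
form_fields = {
    "username": "daf",
    "password": "foo bar",
    "redir": "/secure/home.php",
    "submit": "log in",
}

# The browser sends this string as the POST body with a URL-encoded content type.
body = urlencode(form_fields)

# The server side reverses the encoding to recover the parameters.
decoded = parse_qs(body)

print(body)
print(decoded)
```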
CSS
- Cascading Style Sheets (CSS) is used for presentation of a document written in a markup language.
JavaScript
- The client may perform some processing in order to:
  - Improve application performance
  - Enhance usability
- Some example cases where this may be done:
  - Validating user-entered data
  - Dynamically generating content and interfaces
  - Querying and updating the document object model (DOM) to control the browser's behavior.
VBScript
- This is an alternative to JavaScript supported only by Internet Explorer.
- Modeled on Visual Basic and interacts with the browser DOM.
Document Object Model
- An abstract representation of an HTML document. This is queried and manipulated through its API.
- Allows client-side scripts to access certain HTML elements using their id.
Ajax
- Originally an acronym for 'Asynchronous JavaScript and XML', but Ajax requests don't need to be asynchronous or use XML anymore.
- A key part of Ajax is XMLHttpRequest, a native JavaScript object that client-side scripts can use to make "background" requests.
JSON
- JavaScript Object Notation (JSON) is a simple data transfer format used to serialize arbitrary data.
- This is commonly used as an alternative to XML in Ajax applications.
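A short round trip shows what serialization means in practice. The record below is invented for illustration.

```python
import json

# Hypothetical data structure to send between client and server.
record = {"user": "lol", "roles": ["admin", "editor"], "active": True}

serialized = json.dumps(record)    # Python object -> JSON text
restored = json.loads(serialized)  # JSON text -> Python object

print(serialized)
print(restored == record)
```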
Same-Origin Policy
- This is used to keep content from different sources from interfering with each other. Content from a site can only modify content from the same site, not from other sites.
- Key features:
- A page on a domain can cause an arbitrary request to another domain, but it cannot process the data returned from that request.
- A page on a domain can load a script from another domain and execute it within its own context.
- Scripts are assumed to contain code, not data.
- A page on one domain cannot read or modify cookies or other DOM data of another domain.
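The notes above speak of "domains"; as an assumption beyond what they state, browsers generally treat two URLs as the same origin when scheme, hostname, and port all match. The sketch below implements that comparison with hypothetical helper names and a deliberately small table of default ports.

```python
from urllib.parse import urlsplit

# Assumed default ports for the two schemes this sketch handles.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Reduce a URL to its (scheme, hostname, port) origin triple."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(first: str, second: str) -> bool:
    """Return True when both URLs share scheme, hostname, and port."""
    return origin(first) == origin(second)


print(same_origin("http://wahh-app.com/a", "http://wahh-app.com:80/b"))
print(same_origin("http://wahh-app.com/", "https://wahh-app.com/"))
```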
HTML5
- Introduces new tags, attributes, and APIs.
- Modifies Ajax XMLHttpRequest to enable two-way cross-domain interaction in some situations.
  - Can lead to cross-domain attacks.
- Introduces some ways of client-side data storage, which can lead to privacy issues.
- New attacks: client-side SQL injection
"Web 2.0"
- A buzzword for the new trends in web apps:
- Heavy Ajax use in performing asynchronous and hidden requests
- Cross-domain integration
- Using XML, JSON, and Flex
- Supporting user-generated content, information sharing, and interaction.
Browser Extension Technologies
- Some web apps use browser extension technology to use custom code and extend the browser's built-in functionalities.
- Java applets
- ActiveX controls
- Flash objects
- Silverlight objects
State and Sessions
- ASP.NET uses a hidden form field called ViewState to store state information.
  - By default, this includes a keyed hash for integrity reasons.
- Sessions are commonly tracked using a token in the form of a request parameter or HTTP cookies.
Encoding Schemes
URL Encoding
- URLs can only contain printable characters in the US-ASCII character set.
  - ASCII code range of 0x20 to 0x7e.
- Commonly URL-encoded characters:
  - %3d =
  - %25 %
  - %20 Space
  - %0a New line
  - %00 Null byte
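The encodings listed above can be checked with Python's standard library. The sample string passed to `quote` is arbitrary.

```python
from urllib.parse import quote, unquote

# Decode each of the commonly URL-encoded sequences listed above.
decoded = {code: unquote(code) for code in ("%3d", "%25", "%20", "%0a", "%00")}

# Encode an arbitrary sample string; safe="" forces every reserved character to be escaped.
encoded = quote("a=b c%", safe="")

print(decoded)
print(encoded)
```

Note that `quote` emits uppercase hex digits, while decoding accepts either case.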
Unicode Encoding
- Uses a 16-bit Unicode-encoded format of characters.
- Uses a preceding %u to denote that a string is a Unicode representation of a character.
  - e.g. %u2215 is ∕ (the Unicode division slash, which resembles /)
- UTF-8 is a variable-length encoding standard using one or more bytes to express each character.
  - e.g. %c2%a9 is the copyright symbol.
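Both forms can be reproduced in a few lines. The UTF-8 part uses only standard library calls; the %u form is non-standard and, as far as this sketch assumes, not decoded by `urllib`, so the helper `decode_percent_u` is a hypothetical hand-rolled decoder.

```python
import re
from urllib.parse import quote, unquote

copyright_sign = "\u00a9"

utf8_bytes = copyright_sign.encode("utf-8")  # two bytes for this one character
percent_encoded = quote(copyright_sign)      # percent-encoded UTF-8 bytes
round_trip = unquote("%c2%a9")               # back to the original character


def decode_percent_u(text: str) -> str:
    """Replace each %uXXXX sequence with the character at that code point."""
    return re.sub(
        r"%u([0-9a-fA-F]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


slash_like = decode_percent_u("%u2215")

print(utf8_bytes, percent_encoded, round_trip, slash_like)
```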
HTML Encoding
- HTML encodes characters in at least three different ways.
  - Method one is using shortnames:
    - &quot; is "
    - &apos; is '
    - &amp; is &
    - &lt; is <
    - &gt; is >
  - Method two is using ASCII code in decimal form:
    - &#34; is "
    - &#39; is '
  - Method three is using ASCII code in hexadecimal form:
    - &#x22; is "
    - &#x27; is '
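The three methods can be compared with Python's `html` module. The sample markup is made up for illustration.

```python
import html

# Encode a hypothetical snippet so a browser treats it as text, not markup.
escaped = html.escape('<a href="x">Tom & Jerry</a>')

# All three encodings of the double quote decode to the same character.
from_shortname = html.unescape("&quot;")
from_decimal = html.unescape("&#34;")
from_hex = html.unescape("&#x22;")

print(escaped)
print(from_shortname, from_decimal, from_hex)
```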
Base64 Encoding
- Allows any binary data to be safely represented using printable ASCII characters.
- Commonly, this is used to encode e-mail attachments for sending them over SMTP.
- Also used for encoding user credentials in basic HTTP authentication.
- Base64 processes input in blocks of three bytes, each producing four output characters. If the final input block has fewer than three bytes, the output is padded with one or two = characters.
- Character set: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/
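The padding behavior is easy to see by encoding inputs of one, two, and three bytes. The inputs are arbitrary.

```python
import base64

one_byte = base64.b64encode(b"a").decode("ascii")       # final block has 1 byte: two = signs
two_bytes = base64.b64encode(b"ab").decode("ascii")     # final block has 2 bytes: one = sign
three_bytes = base64.b64encode(b"abc").decode("ascii")  # full block: no padding

print(one_byte, two_bytes, three_bytes)
print(base64.b64decode(one_byte))
```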
Hex Encoding
- Represents each byte of binary data as two ASCII characters, the hexadecimal digits of that byte.
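As a quick illustration, hex-encoding a short byte string and decoding it again looks like this. The sample value is arbitrary.

```python
# Hypothetical value, for example a username stored hex-encoded in a cookie.
original = b"daf"

hex_encoded = original.hex()              # two hex digits per input byte
hex_decoded = bytes.fromhex(hex_encoded)  # reverse the encoding

print(hex_encoded, hex_decoded)
```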
Remoting and Serialization Frameworks
- Some examples:
- Flex and AMF
- Silverlight and WCF
- Java serialized objects