As you probably know, Squid is a web proxy cache. It has a variety of uses,
from speeding up access to web servers by caching repeated requests
to blocking pages with commercial or pornographic content. Squid is very
robust and is developed under the GPL.
23.6.2004 08:00 | Petr Houštěk
Caching means storing requested Internet objects on a server closer to the client. A local Squid cache can reduce both access time and bandwidth consumption. Squid also provides a degree of security and anonymity.
Thanks to its licence, Squid runs on almost every Unix-like operating system. Let's suppose you run it on Linux. Installation is quite simple (see the Squid documentation for details). You can also use a precompiled package from your favourite distribution.
Squid is now installed, so we just have to configure it. The Squid configuration files are kept in the directory /usr/local/squid/etc by default, but your distribution may move them to another location (for example, Debian keeps squid.conf in /etc). Squid relies on many default settings, so it can run even with a zero-length configuration file. That is not very useful, though, because by default Squid denies access to all browsers.
Let's create a basic configuration to get the server working. The first option in the squid.conf file is http_port, which sets the port that Squid will listen on. There can be more than one port (for example 3128 and 8080). The cache_dir option sets where the cache is stored: it takes the cache directory (preceded in recent Squid versions by a storage type such as ufs), then the cache size in megabytes and the number of subdirectories in the first and second tier (it is recommended to leave these at their defaults). Another important option is cache_mem, which tells Squid how much memory can be used for in-transit objects, hot objects and negative objects.
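A minimal sketch of these three options might look like the following. The port numbers come from the text above; the ufs storage type, the cache path and the sizes (100 MB on disk, 16 and 256 subdirectories, 8 MB of memory) are assumed values rather than recommendations, so compare them with the squid.conf shipped with your version.

```
# Listen on two ports, one http_port line per port
http_port 3128
http_port 8080

# Storage type, cache directory, size in MB, first-tier and second-tier subdirectory counts
cache_dir ufs /usr/local/squid/var/cache 100 16 256

# Memory for in-transit, hot and negative objects
cache_mem 8 MB
```

Comments are kept on their own lines here, since text after a directive's arguments may be read as further arguments.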
Now we have to allow users to use the proxy. We can use this temporary solution to allow all users (for details see the section about acl).
acl all src 0.0.0.0/0.0.0.0
http_access allow all
Now the very basic configuration is done and we can start Squid for the first time. Then you can configure your web browser to use it.
ACLs (access control lists) are a very important part of Squid configuration. Basic authentication should always be used. The primary use of access control is to stop unauthorised users from using your cache. There are two elements to access control: classes and operators. A class refers to a set of users; the set can also be defined by IP address, HTTP request, filename extension, etc. Classes are then passed to operators, for example to allow HTTP access or to redirect it somewhere else.
The syntax of acl is
acl name type string1 string2 string3 ...
The types are source or destination IP address, source or destination domain, regular expression match of the requested domain, words in the requested URL, words in the source or destination domain, current time, destination port, protocol (HTTP, FTP), method, browser type, name, autonomous system number, username/password pair, and SNMP community. The strings after the type are decision strings, used to check whether the acl matches a given connection. Squid first checks the type field and decides from it how to interpret the decision strings, which can be an IP address, a network, a regular expression, and so on. Now let's take a closer look at some types.
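As a rough orientation, here is how several of these types might appear in squid.conf. The type keywords (src, dst, dstdomain, url_regex, time, port, proto, method, browser) are Squid's own; every acl name and every value below is a hypothetical placeholder chosen for illustration.

```
# Source IP address
acl localClients src 192.168.0.0/255.255.0.0
# Destination IP address
acl oneServer dst 10.11.12.13
# Destination domain
acl someSites dstdomain .example.com
# Regular expression matched against the requested URL
acl someWords url_regex -i sex
# Current time (Saturday and Sunday)
acl weekend time SA
# Destination port
acl webPorts port 80 443
# Protocol
acl ftpRequests proto FTP
# Method
acl tunnels method CONNECT
# Browser type (regular expression matched against the User-Agent header)
acl mozillaBrowsers browser -i mozilla
```

Defining an acl has no effect by itself; it only takes effect once an operator such as http_access refers to it.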
The most common example looks like this:
acl myNet src 192.168.0.0/255.255.0.0
http_access allow myNet
This acl matches all addresses from 192.168.0.0 to 192.168.255.255 and allows them to use your cache. All other connections will be denied, because Squid appends an invisible final rule that is the opposite of the last http_access line: http_access deny all if the last line allows, or http_access allow all if the last line denies. For example, if you have this set of rules
acl myIP src 192.168.5.13
acl badNet src 192.168.5.0/255.255.255.0
http_access allow myIP
http_access deny badNet
Squid will deny all connections from the network 192.168.5.0/255.255.255.0 except 192.168.5.13, which is allowed by the earlier rule. If you connect from another network (for example from 192.168.1.13), Squid will accept the connection, because the last rule is a deny and the implicit final rule is therefore an allow.
To decide according to the destination IP address, Squid uses the type dst, which works much like src.
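A short sketch of dst in use, assuming 10.11.12.13 is a placeholder address for a server you want to make unreachable through the cache, and blockedServer is a hypothetical acl name:

```
# Placeholder address of the server to block
acl blockedServer dst 10.11.12.13
acl myNet src 192.168.0.0/255.255.0.0
# Rules are checked in order, so the deny must come before the allow
http_access deny blockedServer
http_access allow myNet
```

Because the last line is an allow, the implicit final rule denies everyone outside myNet.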
The domain types match requests by their source or destination domain. The types are srcdomain and dstdomain. The srcdomain type is not recommended, because an attacker who controls the reverse DNS entries for the attacking host can manipulate them to bypass the srcdomain acl. The dstdomain type matches the destination domain. It can be used, for example, to block some well-known porn sites. You should also block the site's IP address, because otherwise someone can reach the site by typing the IP address into their browser. Here is an example: you want to block the site www.example.com, whose IP address is 10.11.12.13. The entry is
# The leading dot makes the acl match example.com and all of its subdomains, such as www.example.com
acl badDomain dstdomain .example.com
acl badIp dst 10.11.12.13
acl myNet src 192.168.0.0/255.255.0.0
acl all src 0.0.0.0/0.0.0.0
http_access deny badDomain
http_access deny badIp
http_access allow myNet
http_access deny all
With the types described so far you can only filter sites by destination domain. Matching based on regular expressions allows much more precise filtering. Regular expressions in Squid are case-sensitive by default; to make them case-insensitive, use the -i option. For example, suppose you want to deny all requests whose URL contains the word sex (or SEX, SEx, etc.). Then the proper acl is
acl badUrl url_regex -i sex
To block AVI video files you can make an acl like
acl badUrl url_regex -i \.avi
You can also combine these two rules:
acl badUrl url_regex -i sex.*\.avi
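Unanchored patterns match anywhere in the URL, so sex would also match a host such as www.essex.example (a made-up name), and \.avi matches any URL that merely contains .avi somewhere. A slightly tighter sketch, assuming the urlpath_regex type (which tests only the path part of the URL) is available in your Squid version, anchors the pattern to the end of the path. The acl name aviFiles is hypothetical.

```
# Match only paths that end in .avi, ignoring case
acl aviFiles urlpath_regex -i \.avi$
http_access deny aviFiles
```

A URL with anything after the extension, such as a query string, would not match this anchored pattern, so treat it as a starting point rather than a complete filter.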
Regular expressions can also be used for checking the source and destination domains. The types are srcdom_regex and dstdom_regex.
The time type matches requests according to the current time. A common wish is to filter unwanted sites during working hours. This can be done by combining the time and dstdomain (or dstdom_regex) ACLs. The syntax of the time acl is
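As an illustration, a dstdom_regex acl could target host names that follow a common advertising pattern. The acl name adServers and the pattern itself are assumptions made for this sketch, not rules taken from a real blocklist.

```
# Match destination host names that begin with "ad." or "ads.", ignoring case
acl adServers dstdom_regex -i ^ads?\.
http_access deny adServers
```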
acl name time [day-list] [start_hour:minute-end_hour:minute]
The day-list is a list of single characters indicating the days that the acl applies to. Here is the list: S – Sunday, M – Monday, T – Tuesday, W – Wednesday, H – Thursday, F – Friday, A – Saturday. For weekends you can use
acl weekends time SA
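Combining a time acl with a dstdomain acl, as described above, might look like the sketch below. The working hours, the placeholder domain and both acl names are assumptions; it relies on the fact that when several ACLs appear on one http_access line, all of them must match for the rule to apply.

```
# Monday to Friday, 08:00 to 17:00 (assumed working hours)
acl workHours time MTWHF 08:00-17:00
# Placeholder domain standing in for an unwanted site
acl unwanted dstdomain .example.com
# Deny only when the request goes to the unwanted site AND it is working time
http_access deny unwanted workHours
```

Outside the given hours the deny rule does not match, so the request falls through to whatever http_access rules follow.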
The port type matches the destination port of a request. The most used port is 80 (the port that web servers almost always listen on). Some servers also listen on other ports, such as 8080. SSL connections use port 443. By default, squid.conf defines a list called Safe_ports with this line:
acl Safe_ports port 80 21 443 563 70 210 1025-65535
which means that destination ports 80, 21, 443, 563, 70, 210, 1025, 1026, ... 65534, 65535 are matched. To deny all other ports you can use
http_access deny !Safe_ports
There are several methods, among them GET, POST and CONNECT. The GET method is used for downloading, the POST method for uploading and the CONNECT method for SSL connections. A typical measure is to block CONNECT requests to non-SSL ports. The CONNECT method allows traffic in any direction at any time, so with an improperly configured proxy someone can connect from the cache server to a telnet server on a distant machine and thereby bypass the packet filters. So we can use this example
acl connect_method method CONNECT
acl SSL_PORTS port 443 563
http_access deny connect_method !SSL_PORTS
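Putting the pieces from this article together, one possible ordering of the rules is sketched below. It reuses the acl names and values from the examples above; the ordering is an assumption about a sensible layout, with the general port and method restrictions first, then the allow rule for the local network, then an explicit final deny.

```
acl all src 0.0.0.0/0.0.0.0
acl myNet src 192.168.0.0/255.255.0.0
acl Safe_ports port 80 21 443 563 70 210 1025-65535
acl SSL_PORTS port 443 563
acl connect_method method CONNECT

# Refuse requests to ports outside the safe list
http_access deny !Safe_ports
# Refuse CONNECT tunnels to anything but the SSL ports
http_access deny connect_method !SSL_PORTS
# Let the local network use the cache
http_access allow myNet
# Refuse everyone else explicitly instead of relying on the implicit rule
http_access deny all
```

Since http_access rules are evaluated top to bottom and the first match wins, placing the deny rules before the allow keeps local users from bypassing the port restrictions.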
The acl operators will be described in the next part.