This is a tutorial on how to set up an Apache reverse proxy for caching content from JFrog Artifactory. I had to learn how to do this for work, to reduce the request load on the origin Artifactory server and improve download performance overall.

The Apache server runs on Alma Linux, but the steps are similar for other Linux distributions. I did this in AWS, but it will work regardless of where your clients and Artifactory are hosted.

Install Apache Packages

Update your installed packages with your distribution's package manager. On Alma/CentOS/RHEL, this is yum. Then install the Apache packages.

Note: If you’re using an Artifactory server with HTTPS, mod_ssl is required!

yum update
yum install -y httpd httpd-tools mod_ssl

Enable the required modules in Apache (though they should be enabled by default). They are:

cache
cache_disk
headers
expires
proxy
proxy_http
ssl # if using Artifactory with HTTPS

In Alma/CentOS/RHEL, this is done by going through /etc/httpd/conf.modules.d and finding the conf files containing the modules we’d like to enable. Once found, make sure that they’re not commented out.

For example, the ssl module (mod_ssl.so) is loaded in /etc/httpd/conf.modules.d/00-ssl.conf. If I want that module loaded, this is what that conf file should look like:

LoadModule ssl_module modules/mod_ssl.so
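
Rather than opening each conf file, you can compare the output of `httpd -M` against the module list above. The following is a minimal sketch, not part of the original setup: the names `REQUIRED_MODULES` and `missing_modules` are illustrative, and it assumes `httpd -M` prints one entry such as `cache_module (shared)` per line.

```shell
#!/bin/sh
# Modules this tutorial relies on, named the way `httpd -M` prints them.
REQUIRED_MODULES="cache_module cache_disk_module headers_module expires_module proxy_module proxy_http_module ssl_module"

# Read `httpd -M` output on stdin and print every required module that is absent.
# The body runs in a subshell, so its variables stay local.
missing_modules() (
  loaded="$(cat)"
  for mod in $REQUIRED_MODULES; do
    # -w matches whole words, so "cache_module" does not match "file_cache_module".
    if ! printf '%s\n' "$loaded" | grep -qw -- "$mod"; then
      printf '%s\n' "$mod"
    fi
  done
)

# Only query Apache when it is actually installed on this machine.
if command -v httpd >/dev/null 2>&1; then
  httpd -M 2>/dev/null | missing_modules
fi
```

If nothing is printed, every module in the list is loaded. Drop `ssl_module` from the list if your Artifactory server does not use HTTPS.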

Once each module is loaded, start and enable the Apache service:

systemctl start httpd
systemctl enable httpd

Create Apache Configuration File

Add a new configuration to Apache, where we’ll define the caching behaviour. This file will be at /etc/httpd/conf.d/proxy_cache.conf.

touch /etc/httpd/conf.d/proxy_cache.conf

Set the file contents as below, replacing <artifactory-domain> with the hostname of your Artifactory server. Each directive is described in Apache’s documentation for mod_cache, mod_cache_disk, and mod_proxy for those curious. Otherwise, these values should work fine for the majority of cases.

# I. Cache Behaviour
CacheEnable disk /
CacheRoot /var/cache/httpd/mod_cache_disk/routing
# Don't cache files larger than 1 GB
CacheMaxFileSize 1000000000
# 1 Day Cache
CacheDefaultExpire 86400
CacheQuickHandler off
CacheLock on
CacheLockPath /tmp/mod_cache-lock
CacheLockMaxAge 5

# II. Cache Control Headers
CacheHeader On
ExpiresActive On

# ignore upstream caching headers
Header unset Expires
Header unset Cache-Control
Header unset Pragma
# serve from the cache even when a client request asks for no-cache
CacheIgnoreCacheControl On

# III. Reverse Proxy Settings
SetEnv proxy-initial-not-pooled 1
SetEnv force-proxy-request-1.0 1
SetEnv proxy-nokeepalive 1

# IV. Virtual Host Config
<VirtualHost *:80>
  ServerName localhost
  # Set to Off if Artifactory doesn't use HTTPS
  SSLProxyEngine On

  AllowEncodedSlashes On
  RewriteEngine On

  # ProxyRequests enables forward proxying, so keep it Off for a reverse proxy
  ProxyRequests Off
  ProxyPassReverseCookiePath / /

  # The trailing slash on the path and on the target URL must match
  ProxyPass "/artifactory/" https://<artifactory-domain>/artifactory/ connectionTimeout=5 timeout=2400
  ProxyPassReverse "/artifactory/" https://<artifactory-domain>/artifactory/

  ProxyPass "/" https://<artifactory-domain>/ nocanon connectionTimeout=5 timeout=2400
  ProxyPassReverse "/" https://<artifactory-domain>/
</VirtualHost>
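
Two things are easy to miss at this point: mod_cache_disk needs a CacheRoot directory it can write to, and Apache was started before this file existed, so the new settings only apply after a restart. Here is a minimal sketch of both steps. The function names are illustrative, and it assumes the service user is `apache` (as on Alma/RHEL) and that you run it as root.

```shell
#!/bin/sh
# Create the cache directory and, when possible, hand it to the Apache user.
# $1 = cache root, $2 = owning user (assumed to be "apache", as on Alma/RHEL).
prepare_cache_root() (
  dir="$1"
  owner="${2:-apache}"
  mkdir -p "$dir" || exit 1
  # chown needs root and an existing user; skip it otherwise.
  if [ "$(id -u)" -eq 0 ] && id "$owner" >/dev/null 2>&1; then
    chown -R "$owner:$owner" "$dir"
  fi
)

# Check the configuration syntax, and restart Apache only if the check passes.
restart_apache() {
  apachectl configtest && systemctl restart httpd
}
```

For this tutorial's paths, that would be `prepare_cache_root /var/cache/httpd/mod_cache_disk/routing && restart_apache`. Running `apachectl configtest` first means a typo in proxy_cache.conf is reported before the running server is touched.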

Configure the Firewall

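These commands let Apache make outbound network connections under SELinux (required for proxying) and open the firewall. Note that they flush every existing iptables rule and set the default policies to ACCEPT, so the host accepts all traffic, not only HTTP. Restrict access elsewhere (for example, with an AWS security group) if that matters to you. The commands may be slightly different depending on your Linux distribution: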

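# Allow Apache to make outbound network connections under SELinux
sudo /usr/sbin/setsebool -P httpd_can_network_connect 1
# Accept all traffic by default
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT

# Flush existing rules and delete user-defined chains
iptables -t mangle -F
iptables -F
iptables -X

# Print the resulting rule set
iptables-save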

Verify that the firewall is open by running iptables -nvL. The output should look similar to the following (ignoring the packet counts):

Chain INPUT (policy ACCEPT 777K packets, 2230M bytes)
  pkts bytes target     prot opt in     out     source      destination

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
  pkts bytes target     prot opt in     out     source      destination

Chain OUTPUT (policy ACCEPT 744K packets, 2953M bytes)
  pkts bytes target     prot opt in     out     source      destination
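
If you would rather check this from a script than by eye, the policy of each chain can be pulled out of the same output. This is only a sketch: the function names are illustrative, and it assumes the `Chain NAME (policy ...)` line format shown above.

```shell
#!/bin/sh
# Read `iptables -nvL` output on stdin and print "<chain> <policy>" per built-in chain.
chain_policies() {
  awk '$1 == "Chain" && $3 == "(policy" { print $2, $4 }'
}

# Succeed only if at least one chain was seen and every policy is ACCEPT.
all_policies_accept() {
  chain_policies |
    awk '{ seen++ } $2 != "ACCEPT" { bad++ } END { exit !(seen > 0 && bad == 0) }'
}
```

A typical use would be `iptables -nvL | all_policies_accept && echo "firewall open"`.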

Test it Out!

At this point, everything should be in place. Restart Apache (systemctl restart httpd) so that it picks up the new configuration file. Then, if you download a file through this caching server, two things should happen:

  1. The file will download from the origin the first time.
  2. The file will be cached under /var/cache/httpd/mod_cache_disk/routing on the reverse proxy. This location is set by the CacheRoot directive in the configuration file.

Attempt this on another machine that can access the reverse proxy. We’ll try to wget a file through the host. It’s a pretty big file (347 MB), so we can see the difference caching can make.

$ wget http://54.248.12.6/artifactory/dev-test/7/x86_64/mycustom.rpm
--2022-08-17 13:50:03-- http://54.248.12.6/artifactory/dev-test/7/x86_64/mycustom.rpm
Connecting to 54.248.12.6... connected
HTTP request sent, awaiting response... 200 OK
Length: 364220416 (347M)
Saving to mycustom.rpm

100%[==========================================================================>] 364,220,416 27.2 MB/s in 14s

2022-08-17 13:50:17 (25.7 MB/s) - 'mycustom.rpm' saved [364220416/364220416]

So the first download took 14 seconds. Now, if you look at the CacheRoot directory on the reverse proxy, you should see a new folder. In my case, it was named tP:

$ ls /var/cache/httpd/mod_cache_disk/routing/
tP

Dig deeper into that folder, and you’ll find a .data file with the same number of bytes as the original file (364220416 bytes, or 347M):

$ ls -l /var/cache/httpd/mod_cache_disk/routing/tP/ux
total 355688
-rw-------. 1 apache apache 364220416 Aug 17 13:50 FcmSKZFSflmYcCYhQQ.data
-rw-------. 1 apache apache       605 Aug 17 13:50 FcmSKZFSflmYcCYhQQ.header
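
Because the cache spreads entries over hashed subfolders, browsing it with ls gets tedious once more than a few files are cached. A small helper can total up the .data files instead. This is a sketch with an illustrative function name, and it assumes every cached body is stored as a `*.data` file, as in the listing above.

```shell
#!/bin/sh
# Print "<number of cached bodies> <total bytes>" for a mod_cache_disk CacheRoot.
# $1 = cache root directory.
cache_summary() {
  # `wc -c FILE` prints "<bytes> <name>"; awk counts the files and sums the sizes.
  find "$1" -type f -name '*.data' -exec wc -c {} \; |
    awk '{ n++; total += $1 } END { print n + 0, total + 0 }'
}
```

Call it with the CacheRoot from the configuration, for example `cache_summary /var/cache/httpd/mod_cache_disk/routing`. The .header files are deliberately excluded so the total reflects only the cached content.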

Now, back on the client machine, download the file again.

$ wget http://54.248.12.6/artifactory/dev-test/7/x86_64/mycustom.rpm
--2022-08-17 13:50:36-- http://54.248.12.6/artifactory/dev-test/7/x86_64/mycustom.rpm
Connecting to 54.248.12.6... connected
HTTP request sent, awaiting response... 200 OK
Length: 364220416 (347M)
Saving to mycustom.rpm

100%[==========================================================================>] 364,220,416 482 MB/s in 0.7s

2022-08-17 13:50:37 (482 MB/s) - 'mycustom.rpm' saved [364220416/364220416]

It took less than a second (0.7s), roughly 20 times faster than the 14s of the first download!
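
Download time is only indirect evidence of a cache hit. Since the configuration sets CacheHeader On, Apache should also add an X-Cache response header (for example `X-Cache: HIT from localhost`), which can be read directly. The helper below is a sketch with an illustrative function name; it only parses headers passed on stdin.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print the X-Cache status
# (such as HIT or MISS), or UNKNOWN when the header is absent.
cache_status() {
  # HTTP header lines end in CRLF, and header names are case-insensitive.
  tr -d '\r' |
    awk 'tolower($1) == "x-cache:" && !found { print $2; found = 1 }
         END { if (!found) print "UNKNOWN" }'
}
```

To use it against the proxy, dump the response headers with curl and pipe them in: `curl -s -o /dev/null -D - http://54.248.12.6/artifactory/dev-test/7/x86_64/mycustom.rpm | cache_status`. I would expect MISS on the first request for a file and HIT on later ones.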

Final Notes

Using this method, you can save a lot of time and bandwidth when delivering content to your clients. It isn’t limited to Artifactory as the source either: I’ve used this reverse proxy setup to cache content from other Apache servers, for example.

Big thanks to Taylor Callsen’s original article for getting me started. I’ve linked it in the references below [1].

  1. Taylor Callsen: Creating a Caching Proxy Server with Apache. https://taylor.callsen.me/creating-a-caching-proxy-server-with-apache