The difficulty with learning about domains and Apache configuration is that, ideally, it has to be done in the real world. It’s not possible to properly emulate the World Wide Web on an internal machine – but we can make a hybrid!
In times past, we’ve had great ideas which we’ve acted on quickly to secure the right domain – it’s often the first port of call for many a branding ideation session. Of course, not all of those ideas come to fruition, and this can leave many domains sitting parked and wasting their potential.
Hosting is a necessary expense for a website accessible to the web but need not be an expense when it comes to training. An internal VM can be used instead. As long as the VM has a known IP and you don’t need to see your site outside of its network, you’ll be able to use it.
Setting up the VM
My colleague Mike set up the VMs here at LiquidLight. If you want to follow along with how he did it, he made an excellent write-up of it here. After this setup we had a bare VM with minimal software.
Provisioning the VM to act as a server can be approached in myriad ways. Server setup is beyond the scope of this article and can vary considerably depending on your requirements and distribution, but if you liked Mike’s write-up of creating the VM on Xen, you can find details on setting up a server in his next article, which I’ll link to here.
How does DNS work?
DNS (Domain Name System) can be thought of as a phone book of all the world’s websites. Rather than remembering the IP address of each and every site you want to visit, you remember the name, and the DNS stores that name alongside its corresponding IP address.
To look up the IP address from the name, your browser checks up to four caches in turn:
- The browser cache itself – all the sites you’ve visited previously are stored within the cache for a defined period.
- If unsuccessful, the browser checks the OS cache/hosts file – the OS has its own cached list of domains.
- If not successful with the above, the browser communicates with the router, which maintains its own cache of DNS records.
- Finally, if all these local lookups fail, the browser moves on to the ISP. ISPs retain their own cache of DNS records, but if they don’t have the relevant domain, they query name servers further and further up the line until they find the correct record (at which point, all the other devices in the chain update their caches with the correct IP for that domain).
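The hosts-file step in the list above is the easiest one to poke at by hand. The following is a minimal sketch of that single step, not of how the operating system’s resolver is really implemented: `lookup_in_hosts` is a hypothetical helper name, and it only reads a hosts-format file and reports the first address listed for a name.

```shell
# lookup_in_hosts NAME [FILE]
# Print the first IP address that FILE (default: /etc/hosts) lists for NAME.
# Prints nothing and returns 1 when NAME is absent, which is the point at
# which a real resolver would move on to the next step in the chain.
lookup_in_hosts() {
    awk -v name="$1" '
        { sub(/#.*/, "") }              # strip trailing comments
        {
            for (i = 2; i <= NF; i++)   # field 1 is the IP, the rest are names
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Example: lookup_in_hosts localhost
```

If the name isn’t in the file, nothing is printed, which mirrors the “if unsuccessful, try the next cache” behaviour described above.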
About IP addresses
Usually, a website will have an IP address that looks something like this: e.g. google.com’s IP is 216.58.210.238 – and if I put this IP into my browser, I get the Google home/search page. This should be true whether I’m on my home network, in an internet cafe, or on my network at work.
Your home network may be configured for an IP address like 192.168.1.1 or 192.168.0.1.
10.0.0.1 is more commonly seen in business computer networks than in home networks – like 192.168.0.1, it’s an internal IP and can only be reached from inside your network.
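Whether an address is internal can be read straight off its prefix: the private IPv4 ranges are 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16 (RFC 1918). As a rough sketch, with `classify_ipv4` being a hypothetical helper name and no validation of the input, the check could look like this:

```shell
# classify_ipv4 ADDRESS
# Print "private" for the RFC 1918 ranges and "other" for anything else
# (public addresses, but also loopback, link-local and so on).
# ADDRESS is assumed to be a well-formed dotted quad; nothing is validated.
classify_ipv4() {
    case $1 in
        10.*)                                   echo private ;;  # 10.0.0.0/8
        172.1[6-9].*|172.2[0-9].*|172.3[01].*)  echo private ;;  # 172.16.0.0/12
        192.168.*)                              echo private ;;  # 192.168.0.0/16
        *)                                      echo other ;;
    esac
}

classify_ipv4 10.0.0.233       # prints: private
classify_ipv4 216.58.210.238   # prints: other
```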
If you were creating a site you wanted to be reachable on the web, you would point a domain at the hosting server’s IP; the IP lookup would then go through the above process of checking caches and return the external, public IP of the web host.
In this exercise, there was no need for the site to be seen outside of the ‘home network’ and so, no need to rent any hosting server space from a web host.
A VM was created and assigned an IP of 10.0.0.233, and a spare domain had an ‘A’ record added and pointed to 10.0.0.233.
(The “Hello World” message on this site won’t be viewable by anyone outside this internal network and the IP may not even exist to the computer this article is being read on, but it’s reachable from this office and hence can be used to practise setting up domains on this local network without incurring charges from a web host – thrifty!)
What were the pain points?
Symptom: can’t SSH into the server
I had done something wrong in setting it up and now had to troubleshoot the SSH connection before I could diagnose any other issue at all.
The VM I was trying to reach was called ‘jeremyvm’ and it was installed on our internal server called ‘Phantom’. I could reach the VM from Phantom with the following command: sudo xl console jeremyvm. Once inside the VM, I could use sudo su - to become root and start troubleshooting what was and wasn’t working.
It was helpful at this point to know that using the command ssh -v increased the verbosity of the error reporting and ssh -vv increased it another step. I could have put another v on the end to get a third level of verbosity, but I’d found the cause by then.
Was it listening on port 22? ~ Yes but I still couldn’t connect to the VM via SSH.
Was SSH being blocked by a firewall? ~ Yes! Here are the rules that singled me out:
## SSH ##
for i in 82.68.37.54; do
iptables -A INPUT -p tcp --dport 22 --src $i -j ACCEPT
done
iptables -A INPUT -p tcp --dport 22 -j DROP
The above is part of the iptables settings that block SSH requests from computers outside of our IP address (the office). The line for i in 82.68.37.54; do means “for the stated IP address, do the following”.
You can see, on the next line, the word ACCEPT, so if I’m coming from the right IP address, I should be accepted onto the server as an SSH connection. Anything that doesn’t match falls through to the final rule and is dropped.
The problem was that machines inside the network see each other’s local IP addresses instead of the external IP address, hence it looked to the internal iptables like I was trying to connect not from 82.68.37.54 but from 10.0.0.10.
Solution: Unblock the relevant IPs
iptables rules are added to every server we configure to block unwanted traffic from doing sneaky, hacky things, but we’re already protected by a firewall here, so the above problem was unnecessary as well as unusual. I commented out the lines that were blocking my IP with a # and left it at that.
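An alternative to commenting the rules out, which is not what I did here, would be to keep them and also accept SSH from the internal range. The sketch below only prints the commands so they can be reviewed before anything is applied. `print_ssh_rules` is a hypothetical helper, and 10.0.0.0/24 is an assumption based on the 10.0.0.x addresses in this article rather than a known subnet.

```shell
# print_ssh_rules SOURCE...
# Print (without applying) iptables commands that accept SSH from each
# SOURCE and drop it from everywhere else.
print_ssh_rules() {
    for src in "$@"; do
        printf 'iptables -A INPUT -p tcp --dport 22 --src %s -j ACCEPT\n' "$src"
    done
    printf 'iptables -A INPUT -p tcp --dport 22 -j DROP\n'
}

# 82.68.37.54 is the office address from the rules above;
# 10.0.0.0/24 is an assumed internal range, so adjust it to your network.
print_ssh_rules 82.68.37.54 10.0.0.0/24
```

Order matters: iptables works through the chain from the top, so the ACCEPT lines have to come before the final DROP.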
Symptom: the browser refused the connection
Solution: It turned out Apache wasn’t running – start Apache.
Although I’d set Apache running the previous day, I hadn’t made any provision on the machine to make sure Apache was started at boot. Our VMs back up, update and reboot overnight, so the next day, when I came back to configure the rest of the setup, Apache wasn’t running. This question, answered on Stack Overflow, gives more detail on how to set Apache to run at boot. Although there are differences based on distribution, essentially, every solution involves making a symbolic link from /etc/init.d/ to the appropriate run-level folder in /etc/.
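In practice those links are usually created by a helper command rather than by hand, and on systemd machines the equivalent command manages links to unit files instead of /etc/init.d/ scripts. As a hedged summary, the sketch below prints the command I’d expect for each case. The labels systemd, sysv-debian and sysv-redhat are my own hypothetical names, and the service is assumed to be called apache2 (it is often httpd on Red Hat style systems).

```shell
# boot_enable_command INIT [SERVICE]
# Print the command that enables SERVICE (default: apache2) at boot for the
# given init flavour. Nothing is executed; run the printed command as root.
boot_enable_command() {
    init=$1
    service=${2:-apache2}
    case $init in
        systemd)     echo "systemctl enable $service" ;;
        sysv-debian) echo "update-rc.d $service defaults" ;;
        sysv-redhat) echo "chkconfig $service on" ;;
        *)           echo "unknown init system: $init" >&2; return 1 ;;
    esac
}

boot_enable_command systemd             # prints: systemctl enable apache2
boot_enable_command sysv-redhat httpd   # prints: chkconfig httpd on
```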
Symptom: URL showed http://www.domain.com/domain.com/domain.com/domain.com...etc ~ A redirect loop.
It seems that by this stage Apache was running and, when the request hit the right place, it tried to fulfil its objective and forward the request on. However, there’s a line in the Apache config file that takes care of redirecting and I’d missed something there.
My simple Apache config file looked a bit like this:
### Primary domain name ###
ServerName www.yourdomainname.com
ServerAlias yourdomainname.com
### Document root ###
VirtualDocumentRoot /var/www/yourdomainname.com/html
### Logs ###
CustomLog /var/log/apache2/yourdomainname.com/access.log combined
ErrorLog /var/log/apache2/yourdomainname.com/error.log
LogLevel error
### Administrator ###
ServerAdmin admin@yourdomainname.com
<Directory /var/www/yourdomainname.com/html>
### Rewrite rules ###
RewriteEngine On
RewriteBase /
### Restrict domain access ###
RewriteCond %{HTTP_HOST} !^www\.yourdomainname.com$ [NC]
RewriteRule ^(.*)$ www.yourdomainname.com/$1 [L,R=301]
</Directory>
The part I had misconfigured was here:
RewriteRule ^(.*)$ www.yourdomainname.com/$1 [L,R=301]
Solution: Correct the RewriteRule to read:
RewriteRule ^(.*)$ http://www.yourdomainname.com/$1 [L,R=301]
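The addition of http:// on the beginning of the domain stopped the infinite redirect because it clearly marks where the domain ends and the directory structure begins. Without a scheme, mod_rewrite treats www.yourdomainname.com/$1 as just another path on the current host, so each redirect tacked the domain onto the URL’s path and sent the browser back to the same host, which still failed the RewriteCond and triggered the rule again. With http:// in place, the substitution is an absolute URL: the browser lands on www.yourdomainname.com, the condition no longer matches and the redirecting stops.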
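A browser follows redirects silently, which makes a loop like this hard to see. One way to inspect a single hop is to ask for the response headers only and read the Location value. The helper below is a small sketch (`location_header` is a hypothetical name) that pulls that value out of whatever headers it is given on standard input.

```shell
# location_header
# Read HTTP response headers on stdin and print the value of the first
# Location header, if there is one.
location_header() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Usage against the site (requires curl and network access to the VM):
#   curl -sI http://yourdomainname.com/ | location_header
```

With the corrected rule, I’d expect the printed value to be an absolute URL beginning http://www.yourdomainname.com/; with the broken one, the domain shows up as part of the path instead.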
On a previous exercise, I had failed to enable the site I was trying to set up (i.e. failed to create a symlink in /etc/apache2/sites-enabled/ pointing at /etc/apache2/sites-available/yourdomain.com.conf), so I made sure I had this site enabled with the command a2ensite yourdomain.com.conf (a2ensite takes the name of the site rather than the full path to the file).
Once this command was complete, I had a domain name with an ‘A record’ pointed at my internal IP, 10.0.0.233; I had an Apache config file (/etc/apache2/sites-available/yourdomain.com.conf) enabled on the VM at that address; and I had an index.html file with “Hello World” content at the place I was sending the request:
### Document root ###
VirtualDocumentRoot /var/www/yourdomainname.com/html
Now, when I go to the domain name yourdomain.com in my browser, I see “Hello World” in all its basic glory.
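If the page doesn’t appear, one quick thing to rule out is the mistake I made on the earlier exercise, i.e. the site not being enabled. A minimal check, assuming the Debian-style /etc/apache2 layout used in this article (`site_status` is a hypothetical helper name):

```shell
# site_status SITE [APACHE_DIR]
# Print "enabled" when APACHE_DIR/sites-enabled/SITE.conf is a symlink to an
# existing file, otherwise "disabled". APACHE_DIR defaults to /etc/apache2.
site_status() {
    link="${2:-/etc/apache2}/sites-enabled/$1.conf"
    if [ -L "$link" ] && [ -e "$link" ]; then
        echo enabled
    else
        echo disabled
    fi
}

site_status yourdomain.com
```

A dangling link, where the file in sites-available has been renamed or removed, is reported as disabled as well.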
Conclusions
“Practice makes better” – You may be familiar with the phrase ‘practice makes perfect’, but you’ll get better at whatever you practise, whether it’s good or bad technique. Practise the wrong thing and you’ll get better at implementing things that way. So it’s important to synthesise an environment as close as possible to real production when training or practising, especially when it comes to domains and Apache config.
Rather than investing in new domain names and hosting packages to practise on, you can use any domain, perhaps a spare, unused, speculative one, and an internal VM set up to be a server on your local network.
Next steps: configure some subdomains with more Apache config files for this VM, to test your prowess with a digital curve ball…