What is the /etc/hosts File?

The /etc/hosts file is a plain text file that maps IP addresses to hostnames on a local computer, providing customized name resolution without relying on an external DNS server.

In the early days before DNS, the hosts file was the primary way systems resolved hostnames on networks. Today, even with widespread DNS usage, hosts files still serve several key functions.

Brief History of the Hosts File

The concept of a hosts file originated with ARPANET in the 1970s, at the very beginning of networking. Host tables mapping names to addresses for systems on this early network were manually maintained and distributed – a challenging task at scale.

The hosts file remained a critical part of local area networking right through the proliferation of the internet in the 80s and 90s. Every UNIX and Linux machine relied on its /etc/hosts file to resolve everything from local servers to internet destinations, a practice that began before modern DNS existed.

It was not until DNS became widely adopted by ISPs and enterprise networks in the mid-1990s that this distributed hierarchy eliminated much of the manual overhead hosts files required. And this remains the case today – DNS handles the bulk of hostname lookups globally.

But that is not to say hosts files have become obsolete. The fundamentals of how the hosts file works and what role it serves have stayed surprisingly consistent over decades. And it remains a versatile tool for specialized uses.

Key Uses of the /etc/hosts File

Here are some common use cases and examples for editing your Linux hosts file in the modern era:

Local Development and Testing

Map project domains to your local servers instead of public IPs. For example:

192.168.1.5 my-dev-site.test www.my-dev-site.test   

This allows working on sites and apps offline before domains are live.

Traffic Blocking and Security Policies

Mapping domains to 127.0.0.1 or 0.0.0.0 acts as a local firewall, blocking access:

127.0.0.1 block-site.com
0.0.0.0 malware.net

Commonly used to block ads, trackers, phishing pages, adult content sites, and more.

Speed Up Common Requests

Resolve frequently accessed sites and resources to internal proxy caches and load balancers for reduced latency:

10.1.10.20 wikipedia.org
172.67.19.10 ads.google.com images.stock.com apis.facebook.com 

Simplified Machine Name Resolution

Instead of hard-to-remember IPs, map instances/servers/containers to easy hostnames:

192.168.100.42 postgres-prod-db
10.2.32.5 redis-cache-02

This simplifies SSH, SCP, and other connections to them.
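
For instance, once the entries above are in place, connections can use the friendly names directly (the username and file path below are placeholders):

ssh admin@postgres-prod-db
scp backup.sql admin@redis-cache-02:/tmp/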

Other Clever Uses:

  • Map intranet sites to localhost for offline development
  • Redirect dev, test, staging URLs to your local environments
  • Blue/green deployment testing
  • Load balancing endpoints
  • Handle dynamic containers/IP changes

And much more. The hosts file enables localized overrides before DNS resolution.

Under the Hood: How Hosts Files Work

To understand how best to leverage your hosts file, it helps to know what is happening underneath on a technical level when a hostname lookup occurs:

  1. An application initiates a request to resolve a hostname (e.g., connecting to a web server)

  2. The operating system checks the /etc/hosts file for any entries matching the requested domain or IP

  3. If a match is found, the specified IP address from /etc/hosts is returned immediately

  4. No external DNS query happens for hostnames matched in /etc/hosts

  5. If no match, the request continues on to DNS servers for resolution

So essentially the OS-level hosts file acts as the first layer that hostname resolution checks on each lookup before delegating out to DNS. This allows it to override DNS for any entries it defines locally.
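
You can observe this order yourself with getent, which queries the same name service stack applications use, and compare it against a direct DNS query (dig talks to the DNS server and ignores /etc/hosts). A quick check, assuming both tools are installed and using the development hostname mapped earlier:

# Resolves via the NSS stack, so /etc/hosts entries take precedence
getent hosts my-dev-site.test

# Queries DNS directly, bypassing /etc/hosts
dig +short my-dev-site.test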

The Resolution Process In Depth

Under the hood, several key networking components and libraries interact when resolving lookups using the hosts file:

getaddrinfo() function – The getaddrinfo() function of the C library retrieves info like IP addresses and port numbers for hostnames. It checks the hosts file prior to DNS.

nsswitch.conf – This OS file configures the order of name service lookups, ensuring hosts file is checked before other methods.
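
On most Linux systems the relevant line looks something like this (the exact list of sources varies by distribution):

# "files" means /etc/hosts is consulted before DNS
hosts: files dns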

libc resolver – The resolver library in libc handles getaddrinfo() requests. Caching services such as nscd or systemd-resolved may cache the results of prior queries, including hosts file lookups, for better performance.

systemd-resolved service (Linux) – A daemon that manages hostname resolution with caching for reduced DNS traffic. Interacts closely with hosts files.

So in summary, apps make requests that get run through multiple OS libraries, daemons, functions, and config files that all check the hosts file first before consulting DNS servers externally. This grants the hosts file its overriding power.

Format of the /etc/hosts File

The hosts file consists of lines with an IP address followed by one or more hostnames separated by whitespace:

192.168.1.100 mywebsite.local www.mywebsite.local
127.0.0.1 localhost

Comments are also allowed using the # symbol:

# This is a comment

Both IPv4 and IPv6 addresses can be used.
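
For example, the standard IPv6 loopback entry shipped on many Linux distributions looks like:

::1 localhost ip6-localhost ip6-loopback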

Note that whitespace (spaces or tabs) must separate the IP, hostname(s), and any comment:

192.168.1.100<whitespace>mywebsite.local<whitespace>www.mywebsite.local<whitespace>#comment

The format has remained identical across essentially every operating system since the earliest days of hosts files. This allows for great flexibility in editing, scripting, parsing, and more across platforms.
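
Because the format is so simple, it is easy to script against. For instance, a quick way to list only the active (non-comment, non-blank) entries:

grep -Ev '^[[:space:]]*(#|$)' /etc/hosts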

Using Hosts Files to Block Websites and Ads

A very common use case for admins and end users alike is blocking access to certain websites and domains with unwanted content.

This can be used to restrict staff productivity drains, prevent access to inappropriate material in schools and libraries, block ads and trackers for privacy reasons, stop users falling victim to malicious sites, and more.

For example, to block facebook.com:

sudo nano /etc/hosts

Add the following line:

127.0.0.1 facebook.com

Save the file and Facebook will now fail to load in browsers for all users, replaced with a connection error or timeout.
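
Note that the change may not take effect instantly if a local resolver cache is in use. On systems running systemd-resolved, for example, the cache can be flushed like so (the equivalent step varies by distribution and browser):

sudo resolvectl flush-caches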

For blocking ads and trackers across many sites at once:

0.0.0.0 ads.example.com trackers.example.com malicious.example.org
127.0.0.1 ads.sites.com trackers.hosts.net malicious.domains.com

This technique works by mapping unwanted domains to loopback or unroutable addresses, so requests never leave the local networking stack and connections cannot complete. It is more efficient than a proxy and, unlike browser extensions, invisible to the sites themselves.

However, do keep in mind the limitations:

  • Impacts single machine only – not network wide
  • Not all content/resources trapped – modern sites load assets from many domains
  • Only whole hostnames can be blocked – specific URLs or pages cannot, and browsers using DNS over HTTPS may bypass the hosts file entirely

For these reasons, as powerful as hosts files remain for blocking, additional protections via firewall policies and network proxy filtering often complete the equation for organizations.

Technical Details

There are two approaches to actually blocking with the hosts file:

  1. Map to 127.0.0.1 – This points blocked domains to localhost. Unless a local web server is listening, connection attempts are refused immediately; even if one is running, it won't have the remote content to serve, so requests still fail with errors.

  2. Map to 0.0.0.0 – This resolves to the unspecified, non-routable address, triggering "site unreachable" errors immediately on request and preventing connections from leaving the local network interface at all.

0.0.0.0 is ideal for performance as it prevents unnecessary outbound connection attempts entirely. However some browsers handle 127.0.0.1 more seamlessly with better error messages. Generally 0.0.0.0 is preferred unless browser cosmetics are a concern.
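
A quick way to confirm a block is working is to check what the name now resolves to and attempt a request (block-site.com here is the example entry from above; the request should fail locally rather than reach the real site):

# Should print 127.0.0.1 (or 0.0.0.0) rather than the real address
getent hosts block-site.com

# The request should fail quickly instead of fetching remote content
curl --max-time 5 -I http://block-site.com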

Advanced Use Cases

In addition to basic blocking and overrides, there are some incredibly clever tricks possible with /etc/hosts files:

Local / Development Site Mapping

For web developers: easily assign memorable pseudo-domains to local projects that are not yet served on real registered domains. For example:

192.168.123.123 mynewsite.local
172.20.20.3 apiv1.test www.apiv1.test
10.0.1.17 webapp.localhost 

This allows properly testing sites as if they were accessible publicly, great for rapid prototyping and client reviews well before launch.

Can be set up locally or on a company network.

Access Intranets While Remote

Software consultants often work remotely, without access to a company's internal network or intranet properties. By mapping these domains to 127.0.0.1, the resources can be mocked out locally:

127.0.0.1 intranet.mycompany.com dashboard.mycompany.com

Now when these URLs are loaded remotely, requests resolve to the local machine rather than timing out against an unreachable server, allowing the intranet resources to be mocked and worked on offline.

Blue/Green Testing and Load Balancing

Blue/green deployments involve essentially two production environments, one active while the other remains passive for testing updates. Quickly reroute traffic between these using hosts files.

Hostnames can also be strategically mapped across load-balanced clusters and geographically dispersed mirror servers for performance gains based on client locale. For example:

192.168.1.100 api.myapp.com  # Production US East
192.168.1.101 api.myapp.com  # Production US West

192.168.1.102 api-test.myapp.com # Blue/Green Beta Cluster

Then switch which backend receives traffic by commenting out the entry you do not want active; duplicate uncommented hostnames can otherwise produce ambiguous results.
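
As a rough sketch of a cutover (assuming the entries above and GNU sed), the inactive entry can be commented out so the other mapping takes effect:

# Comment out the US East entry so the US West mapping is used
sudo sed -i 's/^192\.168\.1\.100 api\.myapp\.com/# &/' /etc/hosts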

Handle Dynamic IP Changes

Create stable, persistent hostnames for frequently changing infrastructure like containers and cloud deployments, where IPs change across rebuilds:

172.31.0.6 my-container-v1
10.9.4.23 my-container-v2

This solves the issue of hard-coding unstable IPs in app configs and scripts.
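
A minimal sketch of automating this, assuming a Docker container named my-container-v1 and an existing entry for it in /etc/hosts:

#!/bin/bash
# Look up the container's current IP address (Docker-specific; adjust for your runtime)
IP=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' my-container-v1)

# Rewrite the existing my-container-v1 entry in place with the fresh IP
sudo sed -i "s/^.* my-container-v1$/${IP} my-container-v1/" /etc/hosts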

Further, custom hostnames can make SSH and SCP connections more intuitive:

198.32.0.2 database-server-prod-1

Easier than remembering oft-changing numeric IPs.

Security Considerations

While hosts files provide flexibility, be aware misconfigurations and unmanaged changes here can introduce security issues. Monitor your hosts file regularly for policy adherence.

Malware will also commonly inject unwanted redirects here that can enable phishing attempts, compromise privacy, or trigger intrusive ads on sites thought to be blocked:

127.0.0.1 facebook.com #Legitimate block
192.168.50.23 facebook.com #Malicious redirect pointing to an attacker-controlled server

Further, large and complex hosts files can noticeably slow down network requests across the operating system, because the file is scanned line by line on every lookup rather than indexed like a DNS cache. The more entries there are to parse, the bigger the impact.

Test resolution time differences by running the same command before and after adding the large hosts file:

time ping -c 1 test.com # Baseline lookup time
time ping -c 1 test.com # Run again with the large hosts file in place

Compare the results.

For shared environments, restrict hosts file permissions so only root can modify the file:

sudo chown root:root /etc/hosts 
sudo chmod 644 /etc/hosts

Sysadmin teams should also consider monitoring software that tracks unauthorized hosts file changes across managed nodes.
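
As a lightweight example of such monitoring (assuming the Linux audit daemon, auditd, is installed), a watch rule can log every write or attribute change to the file:

# Log writes and attribute changes to /etc/hosts under the key "hosts_change"
sudo auditctl -w /etc/hosts -p wa -k hosts_change

# Review any recorded changes later
sudo ausearch -k hosts_change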

Additional Tips

To close, here are some final best practices for getting the most out of your hosts file:

  • Reference the hosts file from scripts to share common configs across machines
  • Make controlled changes through peer-reviewed Git repositories under version control
  • Maintain tidy formatting for readability at a glance
  • Validate entries with tools like getent hosts to confirm they resolve as expected
  • Avoid bloating the hosts file unnecessarily or disabling DNS reliance outright
  • Ensure proper integration alongside internal DNS zones if running own nameservers

The hosts file empowers localized overrides, but risks multiply if it is overused. Find the right balance for your specific environment.

Conclusion

The /etc/hosts file remains a versatile way to customize name resolution on Linux/UNIX systems with improved security, performance, and convenience well into the modern internet era.

Hopefully this guide has given you a detailed look at its more advanced applications, along with the management practices and risks to keep in mind when harnessing this aged yet robust facility.
