Linux tool to find out the size of a whole website

Unless a website is a bunch of straight-up PHP or HTML files where everything is publicly accessible, you can't do this.
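For the simple case where every file is publicly reachable, a rough sketch of what "measuring" would even mean: add up the Content-Length each URL reports. This assumes you already have the list of URLs and that the server sends Content-Length on a HEAD request, which many don't. The function names here are made up for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.request import Request, urlopen


def head_content_length(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the Content-Length reported for a HEAD request, or None if absent."""
    req = Request(url, method="HEAD")
    with urlopen(req, timeout=timeout) as resp:
        value = resp.headers.get("Content-Length")
    return int(value) if value is not None and value.isdigit() else None


def total_public_size(
    urls: Iterable[str],
    size_of: Callable[[str], Optional[int]] = head_content_length,
) -> Tuple[int, List[str]]:
    """Sum the sizes that can be determined; collect the URLs that can't."""
    total = 0
    unknown: List[str] = []
    for url in urls:
        try:
            size = size_of(url)
        except (OSError, ValueError):
            # Network errors, HTTP errors, and malformed URLs all end up here.
            size = None
        if size is None:
            unknown.append(url)
        else:
            total += size
    return total, unknown
```

Anything that lands in the unknown list, plus everything that isn't publicly linked at all, is exactly the part you can't measure from the outside.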

Web infrastructure can get complex.

Let's just start simple...

You can have a site that simply resolves like so:

Database (not publicly accessible) <--> Application servers (multiple, possibly in flux as to how many) <--> Load balancer <--> Domain name

This doesn't take into consideration that many modern sites load tons of JavaScript from third-party sites.

It doesn't take into consideration the number of sites that use CDNs to cache a tremendous amount of their files.

Or the site may load images or files from a service like S3.
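To see how spread out a single page already is, here is a minimal sketch that counts which hosts a page's scripts, images, and stylesheets come from. It only looks at script, img, and link tags in HTML you already have in hand, it won't see anything loaded dynamically by JavaScript, and the names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collect the src/href URLs of script, img, and link tags."""

    # Which attribute carries the resource URL for each tag we care about.
    URL_ATTR = {"script": "src", "img": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTR.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def resource_hosts(page_url, html):
    """Return a Counter mapping each host to the number of resources it serves."""
    collector = ResourceCollector()
    collector.feed(html)
    hosts = Counter()
    for raw in collector.urls:
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlparse(urljoin(page_url, raw)).hostname
        if host:
            hosts[host] += 1
    return hosts
```

Every host in the result other than the site's own is content that lives on somebody else's infrastructure, so whether it counts toward the "size of the website" is already a judgment call.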

It also doesn't take into consideration that really large sites may have database replication in several locations across the globe, with app servers and load balancers in front of each one, and that you get redirected to one of them based on geolocation.
