Introduction
“The cloud” is a confusing concept to many, something that’s
not surprising given that the term is used in many different contexts and often
not in a consistent manner. This short essay is designed to make the concept
clear enough that users can understand its application in any context.
Where Computers Keep Things
A standalone computer keeps information, meaning program code as well
as data, in three places:
1) right on the CPU chip; 2) in memory, often called RAM; and 3) in
storage, which is usually a disk on a laptop or desktop but is most often chip-based
“flash” memory in a tablet, phone, or ultralight laptop.
In its simplest
meaning, “the cloud” just refers to a fourth place where things are kept—disk
storage that is not local but on a server computer somewhere on the Internet.
The term “cloud” comes from the fact that, in the early days
of networks, engineers made careful drawings to show which devices were in
which locations and how they were connected. These network maps were helpful in
planning and troubleshooting.
Detailed network drawings were possible even for corporate networks
that connected computers in multiple cities.
The company leased just a few lines to connect the cities and it was
easy to identify them.
But when the Internet came along and businesses started
using it instead of private networks, drawing became difficult because
information flowing through this new kind of network doesn’t always take the
same path. Difficulty in drawing quickly became an impossibility as the number
of available paths between any two points on the Internet became so large
that no one was even sure what all of them were.
As a result of the Internet’s complexity, designers of
network maps compromised by showing instead a connection going into a cloud
symbol and coming out somewhere on the other side. The cloud is simply a symbol
representing the Internet.
Things You Need to Know #1
1. When someone talks about “cloud storage” they’re
talking about a disk drive somewhere on a server that’s connected to the
Internet.
2. It follows that you can’t access cloud storage
unless you’re connected to the Internet.
3. You have to read on to find out how some
services offset that issue.
An Example of Cloud Storage: Smarter
than Just a Big Disk Drive
Dropbox was one of the first “cloud storage” providers and,
with some 300 million users, is still a major player, though the swift entry
of Google, Microsoft, Apple, Amazon, and others into the market has prevented
it from exploding into a gigabillion-dollar company as other Internet
businesses have done in the past.
The Dropbox Folder
The thing that gave Dropbox a lead in the cloud storage
market was its ability to modify your local operating system (for example,
Windows or OS X) in order to create a folder on your computer. The most important
thing about the Dropbox folder is that it operates like all your other folders.
You can move files in and out in the same way, no matter the computer.
Curiously, it took others a long time to be able to do the same thing: Dropbox had a smoothly functioning folder in
Windows before Microsoft was able to do the same for its competing OneDrive
service.
When You Save a File
Let’s say you’re working on your office desktop and you create
a new file, give it a name, then save it. (Always, always do this.)
You choose the Dropbox folder as the place to save the file.
What happens next?
First, the local computer system saves the file to a Dropbox
folder that is located on your local computer—a hard drive or flash memory (for
an exception, see Things You Need to Know #2-3, below).
Once the file is saved locally, the Dropbox software running
on your local computer notices that the file is new (doesn’t exist in the
cloud) and uploads it to Dropbox’s cloud server, where it’s placed in the
folder you own there. (FYI: Dropbox uses Amazon Web Services rather than owning
its own set of servers.)
Now, there are two copies of your file. One is on your
office desktop’s hard drive and one is on Dropbox’s cloud server.
When this is done, you can go home, have a drink and dinner,
open up your laptop and find the file is there—in your local Dropbox folder.
This is possible because, as soon as you open the laptop at
home, the Dropbox service running on the laptop looks out across the Internet
and compares the local Dropbox folder with the one in the cloud. If there’s a
difference, which is the case here because there’s a new file in the cloud, the
service downloads the file to the local hard drive (your home laptop in this
case) and you can access it.
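
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not Dropbox's actual code: the function name and the folder listings are made up, and a folder is reduced to a plain list of file names.

```python
def files_to_download(local_names, cloud_names):
    """Return names present in the cloud folder but missing locally, sorted."""
    return sorted(set(cloud_names) - set(local_names))


# Hypothetical folder listings for the home laptop and the cloud.
laptop_folder = ["budget.xlsx", "notes.txt"]
cloud_folder = ["budget.xlsx", "notes.txt", "report.docx"]

print(files_to_download(laptop_folder, cloud_folder))  # ['report.docx']
```

Here report.docx plays the part of the file you saved at the office: it is in the cloud but not yet on the laptop, so it is the one file the service needs to fetch.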
Normally, this synchronization happens so fast that the user
doesn’t know that the file wasn’t always on the local machine.
So now what happens if you edit (change) the file on your
laptop?
As soon as you save it, the Dropbox service on the laptop
notices there’s a discrepancy between the file on the laptop and the one in the
cloud—the local one is newer. Since newer is better, the service uploads the
file from the laptop and has it replace the version in the cloud.
One thing to know here: the synchronization works continuously. Thus, the
Dropbox service on the desktop machine in your office will be watching the
cloud and, when it notices that a file there has changed, will compare it with
the local one (the original that you created there). Since the version in the
cloud is newer, it will be downloaded and replace the local one.
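
The "newer is better" rule can be written down as a small decision function. This is a sketch under assumptions: it compares last-modified times given as plain numbers, the name sync_action is invented for illustration, and a real service may well rely on something more robust than clock times.

```python
def sync_action(local_mtime, cloud_mtime):
    """Decide what to do with one file, given last-modified times in seconds.

    None means the file does not exist on that side.
    """
    if local_mtime is None and cloud_mtime is None:
        return "nothing"
    if cloud_mtime is None or (local_mtime is not None and local_mtime > cloud_mtime):
        return "upload"      # the local copy is newer, or the cloud has no copy
    if local_mtime is None or cloud_mtime > local_mtime:
        return "download"    # the cloud copy is newer, or there is no local copy
    return "nothing"         # both copies carry the same timestamp


# The laptop holds the freshly edited file; the office desktop holds the old one.
print(sync_action(local_mtime=200, cloud_mtime=100))  # upload
print(sync_action(local_mtime=100, cloud_mtime=200))  # download
```

The same function runs on every machine, so each machine needs no knowledge of the others. Each one only compares itself with the cloud.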
In quick summary, Dropbox’s software is constantly working
to make sure that the newest version of a file is the one you see when you open
any machine.
One cool thing about Dropbox, something that sets it apart
from some of its competitors, is that it knows you might make a mistake and
keeps backup copies of older files. So, when you change the file on your laptop
at home and that file is updated to the cloud, Dropbox will always replace the
old file in the folder and also sync the newer file to your desktop (and any
other computer you have connected) as soon as it can.
But, Dropbox will put a copy of the original file—the one
before the laptop version was uploaded—into a special backup folder. This means
that, if you didn’t want the changed file on the laptop to replace the original,
you can just go to the cloud and find the untouched original file and access it
(you’ll give it a new name after you open it, of course).
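
One way to picture this safety net is a folder that never throws a replaced file away. The class below is a toy model with invented names, not a description of how Dropbox stores versions internally.

```python
class CloudFolder:
    """Toy cloud folder that keeps every replaced version of a file."""

    def __init__(self):
        self.current = {}   # file name -> latest contents
        self.history = {}   # file name -> older contents, oldest first

    def upload(self, name, contents):
        """Store new contents, moving any existing version into the history."""
        if name in self.current:
            self.history.setdefault(name, []).append(self.current[name])
        self.current[name] = contents

    def previous_versions(self, name):
        """Return the older versions of a file, oldest first."""
        return list(self.history.get(name, []))


folder = CloudFolder()
folder.upload("plan.txt", "original from the office")
folder.upload("plan.txt", "edited on the laptop")

print(folder.current["plan.txt"])            # edited on the laptop
print(folder.previous_versions("plan.txt"))  # ['original from the office']
```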
A quick warning here. If you edit a file on your home laptop then quickly close the lid without closing the file and waiting a few seconds, it's likely the Dropbox (or other) software won't have time to update the version in the cloud. This means that when you get back to the office the version of the file you edited just before closing your home laptop won't be available. The takeaway: when you're ready to shut down a laptop (or other device), first close any cloud-connected files and then wait a minute or so for the system to update. My experience is that Dropbox does this much faster than its competitors.
A computer doesn’t have to be connected to the Internet for
Dropbox to work. You use your local drive as you normally would. Then, when the
disconnected computer (for example, a laptop being used on an airplane) reconnects to the Internet, the synchronization process starts.
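
Offline use can be modeled as a to-do list that is worked off once the connection returns. The sketch below is a simplification with hypothetical names. It only tracks which files still need uploading, not their contents.

```python
class SyncClient:
    """Toy sync client that remembers offline saves and uploads them later."""

    def __init__(self):
        self.online = False
        self.pending = []    # files saved while offline, waiting to be uploaded
        self.uploaded = []   # files already sent to the cloud, in order

    def save(self, name):
        """Save a file locally; upload right away only if connected."""
        if self.online:
            self.uploaded.append(name)
        elif name not in self.pending:
            self.pending.append(name)

    def reconnect(self):
        """Go back online and upload everything that was waiting."""
        self.online = True
        self.uploaded.extend(self.pending)
        self.pending.clear()


laptop = SyncClient()            # on the airplane, no connection
laptop.save("trip-notes.txt")
laptop.save("expenses.xlsx")
print(laptop.pending)            # ['trip-notes.txt', 'expenses.xlsx']

laptop.reconnect()               # back on the ground
print(laptop.uploaded)           # ['trip-notes.txt', 'expenses.xlsx']
```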
So the marvelous thing about Dropbox is that it isn’t just a
dumb disk drive somewhere out there on the Internet; it’s a smart drive that
keeps files synchronized across multiple computers and also provides backup in
case you make mistakes.
Speaking of backups, one nice thing about cloud services
from major companies like Dropbox, not to mention Google, Apple, Microsoft, and
Amazon, is that they don’t keep your data on just one physical drive in their data centers. Rather,
the data are mirrored to another cloud drive so there’s a backup if the
original drive goes down. Typically, there are also backup drives at different physical
locations. Thus, if the Google server farm in Virginia is torched by Luddites,
the data are still available somewhere in Pennsylvania (for example).
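
Mirroring can be sketched as "write everywhere, read from whichever copy is still reachable." The class and its method names are invented for illustration, and the two locations are simply the ones from the example above. Real providers' replication schemes are more elaborate and are not described here.

```python
class MirroredStore:
    """Toy store that writes every file to all locations and reads from any live one."""

    def __init__(self, locations):
        self.drives = {location: {} for location in locations}
        self.down = set()    # locations that are currently unavailable

    def write(self, name, contents):
        """Copy the file to every location."""
        for drive in self.drives.values():
            drive[name] = contents

    def fail(self, location):
        """Mark a location as unavailable."""
        self.down.add(location)

    def read(self, name):
        """Return the file from the first location that is up and has it."""
        for location, drive in self.drives.items():
            if location not in self.down and name in drive:
                return drive[name]
        raise LookupError(f"no available copy of {name}")


store = MirroredStore(["Virginia", "Pennsylvania"])
store.write("photo.jpg", "holiday picture")
store.fail("Virginia")
print(store.read("photo.jpg"))   # holiday picture
```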
Things You Need to Know #2
1. Cloud storage is typically a smart service that
keeps files synchronized across many machines. Logically, this isn’t
complicated because all the computers share a common central point—the versions
of the files in the cloud server. Practically, the process works because
software on each computer is constantly looking for newer versions of a file
and making sure that the newest version is on the server for connected
computers to download and use.
2. Cloud storage can provide two kinds of
backup: 1) the cloud server has a copy
of your file in case your laptop goes down (the cloud server copy is itself
backed up in multiple locations so you don’t have to worry about a server or
even a server location going down); and 2) if you accidentally overwrite an old
file with a new one, the cloud service can usually recover the old version.
3. With Dropbox, there’s an exception to keeping
all files locally. The Dropbox service running on devices with just a small
amount of storage and limited Internet bandwidth, for example tablets and
phones, will not keep a file locally unless you mark it to be kept; you have
to do this one file at a time. See
https://www.dropbox.com/help/82
for information. If you don’t remember to do this, you can be somewhere without
an Internet connection and find that you can’t get a file from Dropbox to your
tablet or phone.
4. If your computer crashes, you don’t lose the
data you have in Dropbox. Once your new or rebuilt machine is operating, you go
to the Dropbox site on the web, download the company’s software and it will
start. The software will ask for your username and password at login. Once
you’ve done that, the software will find the cloud copy of your files and begin
to download them. This will continue until the two Dropbox folders are again
the same.
Cloud Storage as a Smart Service
Dropbox’s business model is to give you a certain amount of
storage free (currently 2 GB) and then ask you to pay if you need more (you
will if you keep a lot of photos in the cloud).
Dropbox’s big competitors like Google, Microsoft, Apple, and
Amazon use the same model, but provide a lot more free storage to start: usually 5 GB (if you look carefully you can
get even more). How can they afford to do this?
Google, Apple, et al. see the cloud as a way to lock you into
their brand system. Let’s take Apple as an example.
Apple sells you services and entertainment in addition to
devices: apps, music, movies, books. You
can download these to your computer and use them as needed, but you can also
keep them in Apple’s iCloud.
Why keep apps and stuff in the cloud?
In the case of apps, they have to be on the local machine to
work, but iCloud is a very nice place to keep backup copies. If you need more
local storage in your iPhone or iPad, for example to load a new version of the
iOS operating system, you can delete the local copy then reinstall it from the
cloud later. This process also makes adding a new device relatively easy, for example a new
iPad Air to replace an old one. Apple originally did a
terrible job of making this process understandable, but it’s quite good now.
Music, movies, and books also have to be local to be usable
but some of them, especially movies, take a lot of local storage. So, with a cloud service you can
download and watch, then delete and still be able to download and watch any
time you want (assuming you have an Internet connection).
Led by Amazon, providers of cloud services are also making
it possible to use movies without ever downloading—you can “stream” them from
the cloud to your device at will (again, assuming you have an Internet
connection).
It seems really generous of Apple to keep all these big
files for you, doesn’t it?
Well, it would be if that’s what they did. If you were to go
to the place where your app, movie, music, and book files are kept on an Apple
cloud server, you would see just one file. In that file is a list of what you
own. When you ask for something, the server goes to its master database of
apps, movies, etc. and pulls that one out. Just one copy needs to be stored for
tens of millions of users (though there are of course backups of the master lists and
usually there are copies distributed across the Internet to make access
faster).
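
The "one list per user, one master copy for everyone" arrangement looks roughly like this in code. The catalog entries, user names, and the fetch function are all hypothetical. The point is only that two users who own the same movie are handed the very same stored item.

```python
# One master copy of each item, shared by every customer (made-up entries).
MASTER_CATALOG = {
    "movie-001": "<movie data>",
    "song-042": "<song data>",
}

# Per-user storage is just a list of what each person owns.
purchases = {
    "alice": {"movie-001"},
    "bob": {"movie-001", "song-042"},
}


def fetch(user, item_id):
    """Return the shared master copy if the user owns the item."""
    if item_id not in purchases.get(user, set()):
        raise PermissionError(f"{user} does not own {item_id}")
    return MASTER_CATALOG[item_id]


# Both users receive the same stored object, not separate copies.
print(fetch("alice", "movie-001") is fetch("bob", "movie-001"))  # True
```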
In summary, the idea of making cloud storage relatively free
is that the particular service you use will become the place where you also buy
your music, movies, and books.
Cloud Computing
When large businesses started moving their local computing
power to a shared server cluster connected to the Internet, they found they
could save a lot of money, both by avoiding redundant hardware and in
consolidating administration of the software. So, a company which previously
had a set of servers and staff at each of its seven locations now could
consolidate at just one.
The next step past this is for a company not to own servers
at all, but to outsource that function to a cloud computing provider like IBM (or Amazon, or others).
IBM leases the hardware capacity and basic system software maintenance at a
price only very large businesses could match. All you do is load your own
software and you’re ready to go (and IBM can help with this). The speed of
access is about the same as it would be if the computer were next door rather
than in the cloud.
There are two big problems with this strategy. The first is the
reliability of Internet access. Fortunately, this has been mostly solved: businesses purchase two Internet connections
from two different providers and the odds of both going down at once are tiny.
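
A little arithmetic shows why the odds are tiny. The figures below are invented for illustration, and the calculation assumes the two connections fail independently of each other, which would not hold if, say, both lines ran through the same cable duct.

```python
def both_down_probability(p_first, p_second):
    """Chance that two independent connections are down at the same moment."""
    return p_first * p_second


HOURS_PER_YEAR = 24 * 365

# Assumed figure: each provider is down 1% of the time.
single = 0.01
both = both_down_probability(single, single)

print(f"One connection: about {single * HOURS_PER_YEAR:.0f} hours down per year")
print(f"Two connections: about {both * HOURS_PER_YEAR:.1f} hours down per year")
```

Under these assumptions, the second connection cuts expected downtime from several days a year to under an hour.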
The second problem is security of information on the
Internet, and it most definitely hasn’t been solved. That being said, the
problem is really no worse with cloud computing than it is for any
Internet-connected system.
Thanks to the wizardry of companies like IBM, Amazon,
Microsoft and others, you can actually send large-scale mathematical or data problems to the cloud and have
their servers (rather than just the ones you lease) do the computation for you.
In addition to the server farms that store data, these companies have fast and
vast clusters that can quickly bring a thousand, ten thousand, or more CPUs to
focus on your problem and just yours. Amazon is particularly good at this,
providing an easy-to-use service with simple pricing that can also include as
much disk storage as you want. If your hobby is gene sequencing at home, Amazon’s
got you covered.
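
The basic trick behind this kind of computing is to cut a big problem into pieces, work on the pieces at the same time, and combine the answers. The sketch below does that on a single computer, with a handful of local worker threads standing in for cloud machines. It uses no real cloud service, and the tiny DNA string and function names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def count_gc(chunk):
    """Count the G and C letters in one piece of a DNA sequence."""
    return sum(1 for base in chunk if base in "GC")


def split(sequence, parts):
    """Cut the sequence into at most `parts` pieces of roughly equal size."""
    size = max(1, -(-len(sequence) // parts))   # ceiling division
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]


def parallel_gc_count(sequence, workers=4):
    """Count G and C letters by handing each piece to a separate worker."""
    chunks = split(sequence, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_gc, chunks))


print(parallel_gc_count("GATTACAGGC"))  # 5
```

With a cloud provider the pieces would go to separate machines rather than threads, but the split, compute, and combine pattern is the same.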
Summary
The term “the cloud” is just another way of saying
“somewhere on the Internet.”
It’s always been possible to send files from place to place
on the Internet for purposes of backup; cloud storage services like Dropbox
simply make that process transparent to the user. They also provide
synchronization across multiple computers.
Cloud storage is invulnerable to local system crashes.
Anything you have in a system like Dropbox can be retrieved when your new or
rebuilt system is up and running.
Cloud services now also provide storage and management of
purchased content such as music, videos, and books as well as backups of apps.
Cloud computing makes vast numbers of fast CPUs available on
demand to any business or individual who knows how to use them (and can afford
it).