Recently I’ve been using several of Amazon’s Web Services for computing,
storage, and turning text into voice and the
experience has been great! With AWS you can easily rent a super-computer or
archive some of the huge files you have laying around. Many of these
services are accessible through a series of R packages written by
Thomas Leeper and his colleagues as part of
the Cloudyr project. Their packages are well
designed and easy to use, however the initial setup of configuring AWS
credentials can be confusing for folks who have never used AWS before.
So let’s get started setting you up with Amazon Web Services. First you’ll need
to log into your Amazon account by signing into
https://aws.amazon.com/.
If you already have an Amazon account that you use to buy
dailynecessities then you can log in
with that account. After providing some basic information you should see the AWS
Console. This page has a search bar that looks like this:
Type “IAM” into the search bar and click on the first result, Amazon’s
Identity and Access Management system. On the left side of the IAM dashboard
you should see a menu item called “Users”
Click on “Users” and then you should see yet another dashboard which will list the
users you create. If you’re new to AWS then you shouldn’t see any user names
listed! Click on the blue “Add user” button to create a new user:
In the user name field just enter a name that you can easily remember, and make
sure to check the check box to give this user programmatic access:
After you’ve checked the box, you can click Next: Permissions, then you
should see the screen below.
On this page you first need to select how you’re going to assign AWS permissions
to this user. Click Attach existing policies directly so that you can
specify which services you’re going to use. In this case, we’re going to take
the aws.s3 package for a spin, so type “s3” into the search bar in the middle
of the page:
Amazon S3 is just a big hard drive that you can access over the internet (like
a programmable Dropbox). Check the box next to AmazonS3FullAccess so that
you can create and destroy files in Amazon S3. If you wanted to use a different
web service, you could search for the name of that service here to provide even
more access to this user. When you’re finished, click
Next: Review. On the following screen click Create user in order to get
your credentials. You should now see this screen:
In order to use AWS in R you need two strings: the access key ID and the
secret access key. Click show under the secret access key heading in order
to see the secret access key. Copy and paste each key into a file for safe
keeping. Never share these keys since
anyone with them can use your AWS account (the keys in this screenshot
are fake). After you leave this page you
will never be able to retrieve the secret access key, so make sure you’ve
recorded it. Don’t worry if you lose your secret access key though, you can just
create a new user to get a new key. After you have both keys you can click the
Close button. You’re now ready to start using AWS in R!
Now let’s test out our big cloud hard drive with the aws.s3 package which you
can install with install.packages("aws.s3"). To use the package you’ll need to
set up three environmental variables like I have below by specifying the
access key ID, and the secret access key which we just set up, plus the AWS
region which specifies the Amazon server farm geographically closest to you. You
can get a list of all regions
for all services here,
and you click here
to see which S3 regions are available. Load up the library and set up your keys
and region like so:
Let’s create a fresh CSV file that we can store on Amazon’s servers:
Now let’s upload this file to S3. Amazon organizes S3 into buckets, which
are like named cloud hard drives. You have to come up with a hard drive name
that nobody else has used before, so we’re going to introduce some randomness
to come up with a unique bucket name. The put_bucket() function creates the
bucket.
Now we can put files in our bucket using put_object() by specifying which
bucket we want to use:
We can then pull files out of the bucket given the name of the file with the
save_object() function. Let’s make sure the file we pulled out of the bucket
is the same as the file we put in!
## [1] TRUE
Looks like it works! Now let’s delete the hard drive we created since we won’t
need to use it anymore.
Amazon Web Services has a pretty generous free tier, so as long as you always
clean up after yourself during these experiments like these you shouldn’t incur
any charges. Have fun playing in the clouds!
Recently I’ve been using several of Amazon’s Web Services for computing,
storage, and turning text into voice and the
experience has been great! With AWS you can easily rent a super-computer or
archive some of the huge files you have laying around. Many of these
services are accessible through a series of R packages written by
Thomas Leeper and his colleagues as part of
the Cloudyr project. Their packages are well
designed and easy to use, however the initial setup of configuring AWS
credentials can be confusing for folks who have never used AWS before.
So let’s get started setting you up with Amazon Web Services. First you’ll need
to log into your Amazon account by signing into
https://aws.amazon.com/.
If you already have an Amazon account that you use to buy
dailynecessities then you can log in
with that account. After providing some basic information you should see the AWS
Console. This page has a search bar that looks like this:
Type “IAM” into the search bar and click on the first result, Amazon’s
Identity and Access Management system. On the left side of the IAM dashboard
you should see a menu item called “Users”
Click on “Users” and then you should see yet another dashboard which will list the
users you create. If you’re new to AWS then you shouldn’t see any user names
listed! Click on the blue “Add user” button to create a new user:
In the user name field just enter a name that you can easily remember, and make
sure to check the check box to give this user programmatic access:
After you’ve checked the box, you can click Next: Permissions, then you
should see the screen below.
On this page you first need to select how you’re going to assign AWS permissions
to this user. Click Attach existing policies directly so that you can
specify which services you’re going to use. In this case, we’re going to take
the aws.s3 package for a spin, so type “s3” into the search bar in the middle
of the page:
Amazon S3 is just a big hard drive that you can access over the internet (like
a programmable Dropbox). Check the box next to AmazonS3FullAccess so that
you can create and destroy files in Amazon S3. If you wanted to use a different
web service, you could search for the name of that service here to provide even
more access to this user. When you’re finished, click
Next: Review. On the following screen click Create user in order to get
your credentials. You should now see this screen:
In order to use AWS in R you need two strings: the access key ID and the
secret access key. Click show under the secret access key heading in order
to see the secret access key. Copy and paste each key into a file for safe
keeping. Never share these keys since
anyone with them can use your AWS account (the keys in this screenshot
are fake). After you leave this page you
will never be able to retrieve the secret access key, so make sure you’ve
recorded it. Don’t worry if you lose your secret access key though, you can just
create a new user to get a new key. After you have both keys you can click the
Close button. You’re now ready to start using AWS in R!
Now let’s test out our big cloud hard drive with the aws.s3 package which you
can install with install.packages("aws.s3"). To use the package you’ll need to
set up three environmental variables like I have below by specifying the
access key ID, and the secret access key which we just set up, plus the AWS
region which specifies the Amazon server farm geographically closest to you. You
can get a list of all regions
for all services here,
and you click here
to see which S3 regions are available. Load up the library and set up your keys
and region like so:
Let’s create a fresh CSV file that we can store on Amazon’s servers:
Now let’s upload this file to S3. Amazon organizes S3 into buckets, which
are like named cloud hard drives. You have to come up with a hard drive name
that nobody else has used before, so we’re going to introduce some randomness
to come up with a unique bucket name. The put_bucket() function creates the
bucket.
Now we can put files in our bucket using put_object() by specifying which
bucket we want to use:
We can then pull files out of the bucket given the name of the file with the
save_object() function. Let’s make sure the file we pulled out of the bucket
is the same as the file we put in!
## [1] TRUE
Looks like it works! Now let’s delete the hard drive we created since we won’t
need to use it anymore.
Amazon Web Services has a pretty generous free tier, so as long as you always
clean up after yourself during these experiments like these you shouldn’t incur
any charges. Have fun playing in the clouds!