## Simple Boto3 / S3 Example

An example of using the boto3 library to list the objects in an S3 bucket and download one of them.

**Note: there are no credentials in this notebook**

Authentication with S3 should be set up by running ```aws configure``` on the machine where Jupyter is running.

Alternatively, credentials can be supplied through environment variables.
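
As a minimal sketch of the environment-variable route: boto3 reads the standard variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` (plus `AWS_SESSION_TOKEN` for temporary credentials). The helper below is hypothetical, not part of boto3. It only reports which of the two required variables are absent and never prints their values.

```python
import os

# Environment variables that boto3 looks up for static credentials.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(env=None):
    """Return the names of required credential variables that are unset or empty.

    `missing_aws_env_vars` is an illustrative helper, not a boto3 function.
    """
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_aws_env_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Credentials found in the environment")
```

A non-empty result does not necessarily mean authentication will fail, since boto3 also falls back to the files written by ```aws configure```.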

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html

In [1]:
import boto3

We can create a ```resource``` or ```client``` object to call functions that operate on AWS.

First we'll use ```resource```

In [2]:
s3_resource = boto3.resource('s3')

In [3]:
bucket_name = "data-eng-21"

In [4]:
bucket = s3_resource.Bucket(bucket_name)

In [5]:
# objects.all() iterates over every object in the bucket
for obj in bucket.objects.all():
    print(obj)

s3.ObjectSummary(bucket_name='data-eng-21', key='car-sales/')
s3.ObjectSummary(bucket_name='data-eng-21', key='car-sales/CarSales-1.txt')
s3.ObjectSummary(bucket_name='data-eng-21', key='car-sales/CarSales-2.txt')
s3.ObjectSummary(bucket_name='data-eng-21', key='datasets/')
s3.ObjectSummary(bucket_name='data-eng-21', key='datasets/CarSales.csv')
s3.ObjectSummary(bucket_name='data-eng-21', key='datasets/owid-covid-data-250122.csv')
s3.ObjectSummary(bucket_name='data-eng-21', key='owid-covid/')
s3.ObjectSummary(bucket_name='data-eng-21', key='owid-covid/owid-covid-271021.csv')
s3.ObjectSummary(bucket_name='data-eng-21', key='owid-covid/owid-covid-data-1.csv')
s3.ObjectSummary(bucket_name='data-eng-21', key='owid-covid/owid-covid-data-2.csv')
s3.ObjectSummary(bucket_name='data-eng-21', key='pbi-mongo/')
s3.ObjectSummary(bucket_name='data-eng-21', key='pbi-mongo/mongodb-bi-win32-x86_64-v2.14.4.msi')
s3.ObjectSummary(bucket_name='data-eng-21', key='pbi-mongo/mongodb-connector-odbc-1.4.2-win
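
The listing mixes real files with keys ending in `/`, which are the placeholder objects S3 uses to represent "folders". A small sketch of narrowing the listing to one prefix and dropping those placeholders is shown here. The function name `keys_under_prefix` is made up for illustration, while `objects.filter(Prefix=...)` is the resource collection's own filter.

```python
def keys_under_prefix(bucket, prefix):
    """Return the keys in `bucket` that start with `prefix`.

    `bucket` is expected to be a boto3 Bucket resource (or anything exposing
    the same `objects.filter(Prefix=...)` interface). Keys ending in '/' are
    skipped because they are folder placeholders, not files.
    """
    return [
        obj.key
        for obj in bucket.objects.filter(Prefix=prefix)
        if not obj.key.endswith("/")
    ]


# Example call, assuming `bucket` is the Bucket resource created earlier:
# keys_under_prefix(bucket, "datasets/")
```

Taking the bucket as a parameter keeps the helper independent of any particular bucket name or credentials.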

In [6]:
bucket.download_file('datasets/CarSales.csv', 'CarSales.csv')

A number of functions are also available through a ```client``` object.

In [7]:
s3_client = boto3.client('s3')

s3_client.list_buckets()

{'ResponseMetadata': {'RequestId': 'HF03SJQZX4G3GAZD',
  'HostId': '5h15HBZiPGdwrD6XXDV+XXFTseOQpKroTzBRErKy4PyaBsna8qQURIlMZ1KhBAQpK3pK3GeVSno=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': '5h15HBZiPGdwrD6XXDV+XXFTseOQpKroTzBRErKy4PyaBsna8qQURIlMZ1KhBAQpK3pK3GeVSno=',
   'x-amz-request-id': 'HF03SJQZX4G3GAZD',
   'date': 'Wed, 26 Jan 2022 07:41:33 GMT',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'Buckets': [{'Name': '5a-hackathon',
   'CreationDate': datetime.datetime(2021, 9, 1, 9, 12, 51, tzinfo=tzutc())},
  {'Name': 'aiml-hackathon-data',
   'CreationDate': datetime.datetime(2021, 12, 14, 21, 2, 48, tzinfo=tzutc())},
  {'Name': 'aiml-hackathon-ui',
   'CreationDate': datetime.datetime(2021, 12, 14, 20, 59, 8, tzinfo=tzutc())},
  {'Name': 'allstate-data-eng',
   'CreationDate': datetime.datetime(2022, 1, 18, 11, 9, 34, tzinfo=tzutc())},
  {'Name': 'angular-shippers',
   'CreationDate': date

In [8]:
# use a separate loop variable so the Bucket resource named `bucket` is not overwritten
for bucket_info in s3_client.list_buckets()['Buckets']:
    print('Bucket Name:', bucket_info['Name'])

Bucket Name: 5a-hackathon
Bucket Name: aiml-hackathon-data
Bucket Name: aiml-hackathon-ui
Bucket Name: allstate-data-eng
Bucket Name: angular-shippers
Bucket Name: angular-trading
Bucket Name: athena-data-owid
Bucket Name: conorm-test-1235
Bucket Name: data-analysis-with-python
Bucket Name: data-eng-21
Bucket Name: davidl-test
Bucket Name: dealguru-pricedata
Bucket Name: demo-site-123
Bucket Name: frank-hackathon
Bucket Name: group-8-images
Bucket Name: hackathon-files
Bucket Name: lambda-price-data
Bucket Name: lex-web-ui-codebuilddepl-lexwebuicloudfrontdistri-1jf2ef60tgcip
Bucket Name: lex-web-ui-codebuilddepl-lexwebuicloudfrontdistri-1qty1xh2th7wv
Bucket Name: lex-web-ui-codebuilddeploy-1fa-s3serveraccesslogs-97xvf4d8xmeu
Bucket Name: lex-web-ui-codebuilddeploy-1fawnk2fr-webappbucket-3u3tsf3j3i1x
Bucket Name: lex-web-ui-codebuilddeploy-uoc-s3serveraccesslogs-75sotqwy965b
Bucket Name: midas-trading-js
Bucket Name: mongo-data-lake-data
Bucket Name: reportbucketconorm123
Bucket Name: s
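
Each entry in `Buckets` also carries a `CreationDate`, so the same response can be sorted by age. The sketch below runs on a hand-written sample with the same shape as the ```list_buckets()``` response (names and dates copied from the output shown earlier). `newest_buckets` is an illustrative helper, not a boto3 call.

```python
from datetime import datetime, timezone


def newest_buckets(response, limit=3):
    """Return (name, creation_date) pairs for the most recently created buckets."""
    buckets = sorted(
        response["Buckets"], key=lambda b: b["CreationDate"], reverse=True
    )
    return [(b["Name"], b["CreationDate"]) for b in buckets[:limit]]


# Sample data only: same structure as the 'Buckets' part of the real response.
sample = {
    "Buckets": [
        {"Name": "5a-hackathon",
         "CreationDate": datetime(2021, 9, 1, 9, 12, 51, tzinfo=timezone.utc)},
        {"Name": "aiml-hackathon-data",
         "CreationDate": datetime(2021, 12, 14, 21, 2, 48, tzinfo=timezone.utc)},
        {"Name": "allstate-data-eng",
         "CreationDate": datetime(2022, 1, 18, 11, 9, 34, tzinfo=timezone.utc)},
    ]
}

for name, created in newest_buckets(sample, limit=2):
    print(name, created.date())
```

With a live client, the same function can be called as `newest_buckets(s3_client.list_buckets())`.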

In [9]:
items = s3_client.list_objects_v2(Bucket=bucket_name)

# 'Contents' is missing from the response when the bucket is empty
for item in items.get('Contents', []):
    print('Item:', item['Key'])

Item: car-sales/
Item: car-sales/CarSales-1.txt
Item: car-sales/CarSales-2.txt
Item: datasets/
Item: datasets/CarSales.csv
Item: datasets/owid-covid-data-250122.csv
Item: owid-covid/
Item: owid-covid/owid-covid-271021.csv
Item: owid-covid/owid-covid-data-1.csv
Item: owid-covid/owid-covid-data-2.csv
Item: pbi-mongo/
Item: pbi-mongo/mongodb-bi-win32-x86_64-v2.14.4.msi
Item: pbi-mongo/mongodb-connector-odbc-1.4.2-win-64-bit.msi
Item: pbi-mongo/mongosqld.zip
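
A single ```list_objects_v2``` call returns at most 1,000 keys, which is enough for this bucket but not for larger ones. One way to handle that is the client's paginator, sketched below. The generator name `iter_keys` and its `prefix` parameter are our own, while `get_paginator("list_objects_v2")` and `paginate(...)` are the boto3 client calls.

```python
def iter_keys(s3_client, bucket_name, prefix=""):
    """Yield every object key in the bucket, following pagination.

    `s3_client` is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    kwargs = {"Bucket": bucket_name}
    if prefix:
        kwargs["Prefix"] = prefix  # only restrict the listing when asked to

    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**kwargs):
        # 'Contents' is absent from a page that holds no objects.
        for item in page.get("Contents", []):
            yield item["Key"]


# Example call, assuming `s3_client` is the client created earlier:
# for key in iter_keys(s3_client, "data-eng-21"):
#     print("Item:", key)
```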


In [3]:
import boto3

s3_client = boto3.client("s3")

# buckets_dict is a Python dictionary containing information about my buckets
buckets_dict = s3_client.list_buckets()

# loop through the list of buckets and print the name of each one
for bucket in buckets_dict['Buckets']:
    print('Bucket Name:', bucket['Name'])

Bucket Name: 5a-hackathon
Bucket Name: aiml-hackathon-data
Bucket Name: aiml-hackathon-ui
Bucket Name: allstate-data-eng
Bucket Name: angular-shippers
Bucket Name: angular-trading
Bucket Name: athena-data-owid
Bucket Name: conorm-test-1235
Bucket Name: data-analysis-with-python
Bucket Name: data-eng-21
Bucket Name: davidl-test
Bucket Name: dealguru-pricedata
Bucket Name: demo-site-123
Bucket Name: frank-hackathon
Bucket Name: group-8-images
Bucket Name: hackathon-files
Bucket Name: lambda-price-data
Bucket Name: lex-web-ui-codebuilddepl-lexwebuicloudfrontdistri-1jf2ef60tgcip
Bucket Name: lex-web-ui-codebuilddepl-lexwebuicloudfrontdistri-1qty1xh2th7wv
Bucket Name: lex-web-ui-codebuilddeploy-1fa-s3serveraccesslogs-97xvf4d8xmeu
Bucket Name: lex-web-ui-codebuilddeploy-1fawnk2fr-webappbucket-3u3tsf3j3i1x
Bucket Name: lex-web-ui-codebuilddeploy-uoc-s3serveraccesslogs-75sotqwy965b
Bucket Name: midas-trading-js
Bucket Name: mongo-data-lake-data
Bucket Name: reportbucketconorm123
Bucket Name: s