S3 Integration
All S3 interaction lives in photos.py — creating the client, validating
uploads, uploading files, and generating pre-signed URLs.
Creating the S3 Client
Install boto3, the AWS SDK for Python (if you already ran pip install -r requirements.txt, you can skip this step):
pip install boto3
The client is created once at module load time using credentials from config.py:
backend/photos.py#L1-L32 on GitHub
# photos.py
from __future__ import annotations

import uuid
from typing import TYPE_CHECKING

import boto3
from botocore.exceptions import ClientError

from config import (
    AWS_ACCESS_KEY,
    AWS_SECRET_KEY,
    S3_ENDPOINT_URL,
    BUCKET_NAME,
)

if TYPE_CHECKING:
    from fastapi import UploadFile

s3 = boto3.client(
    "s3",
    endpoint_url=S3_ENDPOINT_URL,
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_KEY,
    region_name="us-east-1",
)
The endpoint_url parameter is what makes this work with MinIO — point it at
http://localhost:9000 instead of AWS, and boto3 talks to MinIO using the same
S3 protocol.
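As a rough illustration of how the same code could target either MinIO or AWS, the sketch below builds the keyword arguments for boto3.client from environment variables. config.py is not shown in this section, so the helper name s3_client_kwargs and the environment variable names (including AWS_REGION) are assumptions made for this example, not the project's actual configuration.

```python
from __future__ import annotations

import os
from typing import Mapping


def s3_client_kwargs(env: Mapping[str, str] = os.environ) -> dict[str, str | None]:
    """Build keyword arguments for boto3.client("s3", ...).

    Hypothetical helper: the variable names below are assumed for
    illustration and may not match the project's config.py.
    """
    return {
        # An unset or empty endpoint becomes None, in which case boto3
        # uses the default AWS endpoint for the region.
        "endpoint_url": env.get("S3_ENDPOINT_URL") or None,
        "aws_access_key_id": env.get("AWS_ACCESS_KEY"),
        "aws_secret_access_key": env.get("AWS_SECRET_KEY"),
        "region_name": env.get("AWS_REGION", "us-east-1"),
    }


# Local development against MinIO (placeholder credentials):
local = s3_client_kwargs(
    {
        "S3_ENDPOINT_URL": "http://localhost:9000",
        "AWS_ACCESS_KEY": "local-access-key",
        "AWS_SECRET_KEY": "local-secret-key",
    }
)
print(local["endpoint_url"])
```

With boto3 installed, the result could be passed as boto3.client("s3", **s3_client_kwargs()), so switching between MinIO and AWS would only require changing the environment.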
Validating Uploads
Before uploading anything to S3, the file is validated for type and size. This is a critical security step — never trust files from clients without checking them first.
backend/photos.py#L35-L77 on GitHub
MAX_IMAGE_SIZE_BYTES = 5 * 1024 * 1024  # 5 MB
ALLOWED_IMAGE_TYPES = [
    "image/jpeg",
    "image/png",
    "image/gif",
    "image/webp",
]


def validate_image(image: UploadFile) -> bool:
    """Validate image type and size before uploading."""
    # Check MIME type
    if image.content_type not in ALLOWED_IMAGE_TYPES:
        print(f"Rejected: unsupported type {image.content_type}")
        return False

    # Measure file size without reading it into memory
    current_position = image.file.tell()
    image.file.seek(0, 2)  # Seek to end
    size = image.file.tell()
    image.file.seek(current_position)  # Return to original position

    if size > MAX_IMAGE_SIZE_BYTES:
        max_mb = MAX_IMAGE_SIZE_BYTES / (1024 * 1024)
        print(f"Rejected: {size} bytes exceeds {max_mb} MB limit")
        return False

    return True
Two checks are performed:
- Content type — rejects anything that isn’t an accepted image format
- File size — prevents denial-of-service via huge uploads (5 MB limit)
The size check uses file.seek() to find the end of the file without reading
the content into memory, which keeps memory usage low even for large rejected
files.
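The seek-and-tell technique can be tried in isolation on any binary file-like object. The sketch below uses an in-memory io.BytesIO as a stand-in for image.file; the helper name stream_size is hypothetical and is not part of photos.py.

```python
import io
from typing import BinaryIO


def stream_size(stream: BinaryIO) -> int:
    """Return the total size of a seekable stream, preserving its position."""
    original = stream.tell()     # remember where the caller was
    stream.seek(0, io.SEEK_END)  # same as seek(0, 2): jump to the end
    size = stream.tell()         # offset of the end == size in bytes
    stream.seek(original)        # restore the position for later reads
    return size


buffer = io.BytesIO(b"x" * 1024)
buffer.read(10)             # simulate a partially consumed stream
print(stream_size(buffer))  # 1024
print(buffer.tell())        # 10
```

Restoring the position matters because the same file object is later handed to the upload call, which reads from wherever the stream currently points.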
Uploading to S3
The upload_photo function validates the image, builds a unique object key, and streams the file to the S3 bucket.
backend/photos.py#L95-L118 on GitHub
def upload_photo(image: UploadFile) -> str | None:
    """Upload an image to S3 and return the object key, or None on failure."""
    if not validate_image(image):
        return None

    # Prefix with UUID to prevent filename collisions between users
    photo_name = f"{uuid.uuid4()}_{image.filename}"

    try:
        s3.upload_fileobj(
            image.file,
            BUCKET_NAME,
            photo_name,
            ExtraArgs={"ContentType": image.content_type},
        )
    except ClientError as e:
        print(e)
        return None

    return photo_name
A few things worth noting:
- UUID prefix: uuid.uuid4() generates a random unique ID that is prepended to the original filename. This prevents two users uploading photo.jpg from overwriting each other, and it reduces the risk from crafted filenames because the key never begins with client-supplied text. The original filename is still embedded in the key, so it is worth sanitizing separately.
- upload_fileobj: Streams the file directly to S3 without loading it into memory first.
- Returning None: Failures return None rather than raising an exception, keeping error handling at the API layer where it can return a proper HTTP response.
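One way to harden the key-building step is to sanitize the client-supplied filename before adding the UUID prefix. The sketch below is an illustration only: make_object_key is a hypothetical helper that does not appear in photos.py, and the allowed character set is an assumption chosen to be conservative.

```python
from __future__ import annotations

import re
import uuid
from pathlib import PurePosixPath


def make_object_key(filename: str | None) -> str:
    """Build an S3 object key from an untrusted filename (hypothetical helper)."""
    # Keep only the final path component; treat backslashes as separators too.
    name = PurePosixPath((filename or "").replace("\\", "/")).name
    # Replace anything outside a conservative character set (an assumption).
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
    # Fall back to a fixed name if nothing usable is left.
    return f"{uuid.uuid4()}_{safe or 'upload'}"


print(make_object_key("../../etc/passwd"))  # ends with "_passwd"
print(make_object_key("my photo.jpg"))      # ends with "_my_photo.jpg"
```

The UUID prefix still guarantees uniqueness; the sanitizing step only ensures the readable suffix contains no path separators or unexpected characters.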
Pre-Signed URLs
Files in a private S3 bucket aren’t publicly accessible by URL. Pre-signed URLs are temporary, expiring links that grant access to a specific object:
backend/photos.py#L82-L92 on GitHub
def get_url(photo_name: str) -> str | None:
    """Generate a temporary pre-signed URL for accessing a photo."""
    try:
        return s3.generate_presigned_url(
            "get_object",
            ExpiresIn=604800,  # 7 days in seconds
            Params={"Bucket": BUCKET_NAME, "Key": photo_name},
        )
    except ClientError as e:
        print(e)
        return None
Pre-signed URLs are generated fresh each time a photo is requested — the database stores the S3 object key, not the URL, since URLs expire. After 7 days (604,800 seconds) the URL stops working and the client must request a new one.
This approach is preferable to making your bucket publicly readable because:
- Access can be revoked by deleting the object
- URLs can be scoped to specific operations (get_object, put_object, etc.)
- No credentials are exposed to the client
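Note that this is a simplified approach. You could also cache the URLs in memory on the server for slightly less than seven days, which avoids re-signing a URL on every request and lets clients reuse a stable URL. boto3 computes the signature locally, so generating a URL does not itself make a request to S3.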
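The caching idea could be sketched as a small in-memory cache with a time-to-live. Everything here is hypothetical: the PresignedUrlCache class, the one-hour safety margin, and the injectable sign and clock parameters are assumptions for illustration. The signing function is passed in, so the sketch does not depend on boto3.

```python
from __future__ import annotations

import time
from typing import Callable

URL_LIFETIME_SECONDS = 604800  # matches ExpiresIn in get_url
# Assumed safety margin: stop serving a cached URL an hour before it expires.
CACHE_TTL_SECONDS = URL_LIFETIME_SECONDS - 3600


class PresignedUrlCache:
    """Cache pre-signed URLs per object key (illustrative sketch)."""

    def __init__(
        self,
        sign: Callable[[str], str | None],
        ttl: float = CACHE_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sign = sign
        self._ttl = ttl
        self._clock = clock
        # Maps object key -> (time the cache entry expires, URL).
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, photo_name: str) -> str | None:
        now = self._clock()
        entry = self._entries.get(photo_name)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: reuse the cached URL
        url = self._sign(photo_name)
        if url is not None:  # do not cache signing failures
            self._entries[photo_name] = (now + self._ttl, url)
        return url
```

In the application this might be wired up as PresignedUrlCache(get_url). The cache lives in one process and grows without bound, so a multi-worker or long-running deployment would need eviction or a shared store.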