S3 Performance Optimization

  • S3 is designed to support very high request rates
  • However, if your S3 buckets routinely receive > PUT / LIST / DELETE or > 300 GET requests per second, there are best-practice guidelines that help optimize S3 performance
  • The guidance is based on the type of workload you are running:
    • GET-Intensive Workloads
      • Use the CloudFront content delivery service to get the best performance
      • CloudFront will cache your most frequently accessed objects and will reduce latency for your GET requests
    • Mixed request type workloads (mix of GET, PUT, DELETE, GET bucket)
      • Key names you use for your objects can impact performance for intensive workloads
      • S3 uses the key name to determine which partition an object will be stored in
      • Sequential key names, e.g. names prefixed with a timestamp or an alphabetical sequence, increase the likelihood of multiple objects being stored on the same partition
      • For heavy workloads this can cause I/O issues and contention
      • By adding a random prefix to key names, you can force S3 to distribute your keys across multiple partitions, spreading the I/O workload
      • This reduces the likelihood of I/O contention
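The prefixing idea above can be sketched in a few lines of Python. This is a minimal illustration, not an AWS-provided API: the function name `hashed_key`, the 4-character prefix length, the `-` separator, and the sample key names are all assumptions made for the example. It derives the prefix from a hash of the key itself (rather than a truly random value) so that the same prefix can be recomputed later when reading the object back.

```python
import hashlib


def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Return `key` with a short hex prefix derived from a hash of the key.

    Hypothetical helper for illustration only. The prefix is deterministic,
    so a reader can rebuild the full key name from the original key.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}-{key}"


# Hypothetical timestamp-prefixed key names (sequential, so they sort together).
sequential_keys = [
    "2024-01-01-00-00-01/photo.jpg",
    "2024-01-01-00-00-02/photo.jpg",
    "2024-01-01-00-00-03/photo.jpg",
]

for key in sequential_keys:
    # Each output starts with a hex prefix instead of the shared "2024-01-01".
    print(hashed_key(key))
```

One trade-off of this approach is that keys no longer sort by timestamp, so listing objects in time order requires keeping the original names in a separate index or recomputing the prefixed names from known keys.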

Exam Tips

  • Remember the 2 main approaches to Performance Optimization for S3
    • GET-Intensive workloads
      • Use CloudFront
    • Mixed workloads
      • Avoid sequential key names for your S3 objects
      • Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition
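To see why sequential names cluster while hashed names spread out, the rough sketch below counts how many distinct leading characters appear in a set of keys before and after adding a hex hash prefix. The leading character is only a stand-in for "which partition a key lands on"; how S3 actually maps key names to partitions is internal to the service and is not modeled here. The helper names and the generated log key names are hypothetical.

```python
import hashlib
from collections import Counter


def with_hex_prefix(key: str, n: int = 2) -> str:
    """Prepend the first `n` hex characters of the key's SHA-256 digest (illustrative helper)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:n] + "/" + key


def leading_char_counts(keys) -> Counter:
    """Count keys by their first character, a crude proxy for key clustering."""
    return Counter(k[0] for k in keys)


# 600 hypothetical timestamp-prefixed key names, all starting with "2024-...".
timestamp_keys = [
    f"2024-01-01-00-{minute:02d}-{second:02d}.log"
    for minute in range(10)
    for second in range(60)
]

plain = leading_char_counts(timestamp_keys)
hashed = leading_char_counts(with_hex_prefix(k) for k in timestamp_keys)

print("distinct leading characters, sequential keys:", len(plain))
print("distinct leading characters, hashed keys:   ", len(hashed))
```

All of the sequential keys share the same leading character, whereas the hashed keys are spread across the hex characters 0-9 and a-f.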