A Query operation finds items in a table based on the Primary Key attribute (the partition key) and a distinct value to search for
e.g. selecting the item where the user ID is equal to 212 returns all the attributes for that item (first name, last name, email, etc.)
Use an optional Sort Key name and value to refine the results
e.g. if your Sort Key is a timestamp, you can refine the query to only select items with a timestamp within the last 7 days
By default, a query returns all the attributes for the items. Use the ProjectionExpression parameter if you want the query to return only specific attributes
e.g. if you only want to see the email address rather than all attributes
Results are always sorted by the Sort Key
numeric order - by default in ascending order (1, 2, 3, 4)
strings - by ASCII character code values (so uppercase letters sort before lowercase)
You can reverse the order by setting the ScanIndexForward parameter to false
despite its name, this parameter applies only to queries and NOT to scans
By default, Queries are Eventually Consistent
You need to explicitly set the query to be strongly consistent
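A minimal sketch of the Query options above, written as the request parameters for the low-level boto3 client. The table name Users, the key names user_id and created_at (assumed to hold epoch seconds), and the helper build_query_params are assumptions for illustration only.

```python
import time


def build_query_params(user_id, since_epoch, attributes,
                       newest_first=False, strongly_consistent=False):
    """Build keyword arguments for a DynamoDB Query (hypothetical helper)."""
    # "#a0", "#a1", ... stand in for attribute names so that reserved
    # words cannot clash with the expression syntax.
    names = {f"#a{i}": attr for i, attr in enumerate(attributes)}
    return {
        "TableName": "Users",  # assumed table name
        # Partition key needs an exact match; the sort key may use a range.
        "KeyConditionExpression": "user_id = :uid AND created_at >= :since",
        "ExpressionAttributeValues": {
            ":uid": {"N": str(user_id)},
            ":since": {"N": str(since_epoch)},
        },
        # Return only the listed attributes instead of every attribute.
        "ProjectionExpression": ", ".join(names),
        "ExpressionAttributeNames": names,
        # True (the default) sorts ascending by sort key; False reverses it.
        "ScanIndexForward": not newest_first,
        # False (the default) is eventually consistent.
        "ConsistentRead": strongly_consistent,
    }


seven_days_ago = int(time.time()) - 7 * 24 * 60 * 60
params = build_query_params(212, seven_days_ago, ["email"],
                            newest_first=True, strongly_consistent=True)
# With boto3 installed and credentials configured, the request would be:
#   boto3.client("dynamodb").query(**params)
```

Building the parameters as a plain dictionary keeps the sketch runnable without AWS access; the same keys are passed unchanged to the client's query call.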
What is a Scan?
A Scan operation examines every item in the table
by default returns all data attributes
Use the ProjectionExpression parameter to refine the scan to only return the attributes you want
Query or Scan?
Query is more efficient than a scan
Scan dumps the entire table, then filters out the values to provide the desired result - removing the unwanted data
This adds an extra step of removing the data you don't want
As the table grows, the scan operation takes longer
A Scan operation on a large table can use up the table's provisioned throughput in just a single operation
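The difference in work can be illustrated with a toy in-memory model. This is not DynamoDB code: the dictionary keyed by user ID and the functions toy_query and toy_scan are hypothetical stand-ins that only count how many items each approach has to read.

```python
# Toy model: 1000 users, 5 items each, grouped by partition key.
table = {
    user_id: [{"user_id": user_id, "order": n} for n in range(5)]
    for user_id in range(1, 1001)
}


def toy_query(table, user_id):
    """Go straight to one partition key; only its items are read."""
    items = table.get(user_id, [])
    return items, len(items)  # (results, items read)


def toy_scan(table, user_id):
    """Read every item, then discard the ones that do not match."""
    read = [item for items in table.values() for item in items]
    matches = [item for item in read if item["user_id"] == user_id]
    return matches, len(read)  # (results, items read)


query_items, query_read = toy_query(table, 212)
scan_items, scan_read = toy_scan(table, 212)
print(query_read, scan_read)  # 5 5000
```

Both calls return the same five items, but the scan reads the whole table to find them, and that cost grows with the table.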
How To Improve Performance
You can reduce the impact of a query or scan by setting a smaller page size, which uses fewer read operations per request
e.g. set the page size to return 40 items
a larger number of smaller operations allows other requests to succeed without throttling
Avoid using scan operations if you can: design tables in a way that you can use the Query, Get, or BatchGetItem APIs
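A sketch of reading in small pages, assuming the low-level boto3 client interface, where Limit caps the items read per request and LastEvaluatedKey is passed back as ExclusiveStartKey to fetch the next page. The helper scan_in_pages and the in-memory FakeClient are hypothetical and exist only so the example runs without AWS access.

```python
def scan_in_pages(client, table_name, page_size=40):
    """Yield every item in the table, at most page_size items per request."""
    kwargs = {"TableName": table_name, "Limit": page_size}
    while True:
        page = client.scan(**kwargs)
        yield from page.get("Items", [])
        # LastEvaluatedKey is absent once the final page has been read.
        if "LastEvaluatedKey" not in page:
            return
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


class FakeClient:
    """Hypothetical in-memory stand-in for boto3.client("dynamodb")."""

    def __init__(self, count):
        self.items = [{"id": {"N": str(i)}} for i in range(1, count + 1)]
        self.calls = 0

    def scan(self, TableName, Limit, ExclusiveStartKey=None):
        self.calls += 1
        start = int(ExclusiveStartKey["id"]["N"]) if ExclusiveStartKey else 0
        end = start + Limit
        page = {"Items": self.items[start:end]}
        if end < len(self.items):
            page["LastEvaluatedKey"] = {"id": self.items[end - 1]["id"]}
        return page


client = FakeClient(100)
items = list(scan_in_pages(client, "Users"))
print(len(items), client.calls)  # 100 3
```

Because the function is a generator, the caller can pause between pages, which leaves room for other requests against the same table.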
How to Improve Scan Performance
By default, a scan operation processes data sequentially, returning results in 1 MB increments before moving on to retrieve the next 1 MB of data. It can only scan one partition at a time.
You can configure DynamoDB to use Parallel scans instead by logically dividing a table or index into segments and scanning each segment in parallel
Best to avoid parallel scans if your table or index is already incurring heavy read/write activity from other applications
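A sketch of a parallel scan, assuming the Segment and TotalSegments parameters of the Scan API and one worker thread per segment. The helpers scan_segment and parallel_scan and the FakeSegmentedClient are hypothetical; the fake assigns items to segments with a simple modulo rule, which is not how DynamoDB divides a table.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(client, table_name, segment, total_segments):
    """Scan one logical segment to completion and return its items."""
    kwargs = {
        "TableName": table_name,
        "Segment": segment,
        "TotalSegments": total_segments,
    }
    items = []
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            return items
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def parallel_scan(client, table_name, total_segments=4):
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, client, table_name, s, total_segments)
            for s in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]


class FakeSegmentedClient:
    """Hypothetical in-memory stand-in for boto3.client("dynamodb")."""

    def __init__(self, count):
        self.items = [{"id": {"N": str(i)}} for i in range(count)]

    def scan(self, TableName, Segment, TotalSegments, ExclusiveStartKey=None):
        # Toy rule: item i belongs to segment i % TotalSegments.
        return {"Items": self.items[Segment::TotalSegments]}


items = parallel_scan(FakeSegmentedClient(10), "Users", total_segments=4)
ids = sorted(int(item["id"]["N"]) for item in items)
print(ids)  # every id from 0 to 9, each exactly once
```

Each worker consumes read capacity at the same time, which is why the notes advise against this on a table that is already busy.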
Scan vs Query Exam Tips
A query operation finds items in a table using only the Primary Key attribute
You provide the primary key name and a distinct value to search for
A scan operation examines every item in the table
By default returns all data attributes
Use the ProjectionExpression parameter to refine the results
Query results are always sorted by the Sort Key if there is one
Sorted in ascending order
Set ScanIndexForward parameter to false to reverse the order - queries only
Query operation is generally more efficient than a Scan
Reduce the impact of a query or scan by setting a smaller page size which uses fewer read operations
Isolate scan operations to specific tables and segregate them from your mission-critical traffic
Try Parallel scans, rather than the default sequential scan
Avoid using scan operations if you can: design tables in a way that you can use the Query, Get, or BatchGetItem APIs
Extras
ProjectionExpression is used for GetItem, Query, or Scan - not just scans - to get attributes of an item