In this post, I'll walk you through the scenario and summarize the key lessons we learned while launching a microsite.
I've had a pleasant experience working with DynamoDB. Amazon has done fantastic work over the past few years improving its NoSQL offering; it's now truly powerful and versatile.
Thanks to our client, ClearCare, for enabling me to work with DynamoDB and share my lessons. Together, we created an open source library called cc_dynamodb3 to help others make the most of their Python integration.
As part of ClearCare's broader move to SOA (and the now more popular term, microservices), we had more freedom to choose the right tools for the job.
The need to launch a new feature created an opportunity to experiment with a separate, simpler datastore. Since they were already using so much of the AWS stack, DynamoDB was a natural choice.
We set up the new feature as a separate service, with its own servers, separate domain and database, interacting with the main product via APIs.
Thanks largely to DynamoDB's scalability, plus a solid API integration (mostly via JSON), we were able to launch the feature on a very tight deadline and have it scale immediately.
Throughout 6+ months of enhancements, we rarely had fires to put out and experienced almost no downtime. Remarkable, considering usage grew roughly 100x.
We learned a lot along the way, and I'm happy to share some of those lessons with you right now!
The following tips may help you work with Amazon Web Services' DynamoDB and python (including Django or Flask).
As of June 2015, Amazon recommends boto3 moving forward. boto2 is being deprecated, and boto3 offers official support and a much cleaner, more Pythonic API for working with AWS!
Upgrading from boto2 to boto3 is fairly easy, although I strongly suggest you write tests for the affected code. Unit tests are essential, and you should have at least one for each low-level piece of code that used boto2 directly.
I also suggest adding one or two higher-level tests, and manually trying out a code path that's core to the business (to avoid breaking key user flows and waking up the Ops department ;).
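As an illustration only, here is a minimal sketch of such a low-level test; get_book and the books table are hypothetical stand-ins rather than ClearCare code:

from unittest import mock

import boto3


def get_book(isbn):
    # Hypothetical low-level helper that talks to boto3 directly.
    table = boto3.resource('dynamodb').Table('books')
    return table.get_item(Key={'isbn': isbn}).get('Item')


def test_get_book_reads_from_dynamodb():
    fake_response = {'Item': {'isbn': '1234567890', 'publisher': 'HarperBusiness'}}
    # Stub out boto3 so the test never touches AWS.
    with mock.patch('boto3.resource') as resource:
        resource.return_value.Table.return_value.get_item.return_value = fake_response
        assert get_book('1234567890')['publisher'] == 'HarperBusiness'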
Check out how we implemented the boto3 connection.
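For reference, a bare-bones boto3 connection looks roughly like this; the region and table name are placeholders, and this is a sketch rather than the cc_dynamodb3 implementation:

import boto3


def get_table(table_name, region_name='us-west-2'):
    # One resource object can be reused for multiple tables.
    dynamodb = boto3.resource('dynamodb', region_name=region_name)
    return dynamodb.Table(table_name)


books = get_table('books')
print(books.item_count)  # quick sanity check that the connection works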
DynamoDB natively supports several data types that your language will probably handle a bit differently.
In our case, it's nice to convert the data to native Python objects where appropriate:
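To illustrate the idea (field names are made up, and this is not the cc_dynamodb3 code):

from datetime import datetime
from decimal import Decimal


def from_dynamodb(item):
    converted = dict(item)
    # boto3 hands numbers back as decimal.Decimal, so coerce where it matters.
    if isinstance(converted.get('page_count'), Decimal):
        converted['page_count'] = int(converted['page_count'])
    # DynamoDB has no datetime type; we store ISO 8601 strings and parse them back.
    if converted.get('created'):
        converted['created'] = datetime.strptime(
            converted['created'], '%Y-%m-%dT%H:%M:%S.%f')
    return converted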
Fortunately, we have that work all done and tested as part of cc_dynamodb3. Check out models.py here and here especially.
We used schematics to create a light ORM.
A table's primary key cannot be changed once created. There are lots of resources out there to help you design your tables, and I suggest you research your use cases ahead of time.
In our experience, the safest choice is a string UUID HashKey. It gives you 100% flexibility over how you uniquely identify your data, and you can always add, modify, or remove GSIs (Global Secondary Indexes) later to improve performance for specific query operations.
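As an illustration (again, plain boto3 rather than the cc_dynamodb3 API), here is roughly what a string UUID HashKey plus one GSI looks like; the table and attribute names are made up:

import uuid

import boto3

dynamodb = boto3.resource('dynamodb', region_name='us-west-2')

# Illustrative only: a string UUID HashKey, plus a GSI so we can also query by email.
members = dynamodb.create_table(
    TableName='members',
    KeySchema=[{'AttributeName': 'id', 'KeyType': 'HASH'}],
    AttributeDefinitions=[
        {'AttributeName': 'id', 'AttributeType': 'S'},
        {'AttributeName': 'email', 'AttributeType': 'S'},
    ],
    GlobalSecondaryIndexes=[{
        'IndexName': 'email-index',
        'KeySchema': [{'AttributeName': 'email', 'KeyType': 'HASH'}],
        'Projection': {'ProjectionType': 'ALL'},
        'ProvisionedThroughput': {'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1},
    }],
    ProvisionedThroughput={'ReadCapacityUnits': 1, 'WriteCapacityUnits': 1},
)
members.wait_until_exists()

members.put_item(Item={'id': str(uuid.uuid4()), 'email': 'reader@example.com'})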
Here are some easily avoidable mistakes:
You can use a begins_with filter on a RangeKey, but you cannot use it on a HashKey.
For example, say you have a book library. If you first and foremost care about grouping your books by publisher, that could be your HashKey. Then you may sort the books by year of publication and uniquely identify them by ISBN, so your RangeKey could combine the two, e.g. "1995:<ISBN>". You can then find all HarperBusiness books published in 1995 in a single query using begins_with on the RangeKey. What great performance!
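Here is a rough sketch of that query using boto3's condition helpers; the table and attribute names are made up:

import boto3
from boto3.dynamodb.conditions import Key

books = boto3.resource('dynamodb', region_name='us-west-2').Table('books')

# HashKey: publisher; RangeKey: "<year>:<isbn>". begins_with only works on
# the RangeKey, so the query is anchored on an exact publisher value.
response = books.query(
    KeyConditionExpression=(
        Key('publisher').eq('HarperBusiness') & Key('year_isbn').begins_with('1995')
    ),
)
for book in response['Items']:
    print(book['year_isbn'])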
Here is an example of how we lazily fetch every row of a table in the light ORM from cc_dynamodb3:
class DynamoDBModel(Model):
    # moar stuff here...

    @classmethod
    def all(cls):
        response = cls.table().scan()
        # DynamoDB scan only returns up to 1MB of data, so we need to keep scanning.
        while True:
            metadata = response.get('ResponseMetadata', {})
            for row in response['Items']:
                yield cls.from_row(row, metadata)
            if response.get('LastEvaluatedKey'):
                response = cls.table().scan(
                    ExclusiveStartKey=response['LastEvaluatedKey'],
                )
            else:
                break
Caveat: Item.all() may issue multiple requests to DynamoDB, and thus has a hidden cost in both money and time. Note the use of yield to lazily evaluate the results. This lets you retrieve only as much data as you need, instead of scanning the whole table in one go.
In practice, you don't want to perform table scans or large queries on the main server thread anyway.
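For example, here is a quick sketch of grabbing just the first few rows; Item stands for any model built on the DynamoDBModel shown above:

from itertools import islice

# Because all() is a generator, islice stops pulling rows after the first 10,
# so we avoid paging through the rest of the table.
first_ten = list(islice(Item.all(), 10))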
created and updated columns in each table

These being the days of big data and analytics, I suggest always having a created and an updated column.
In our light ORM, we can do this with a BaseModel:
from schematics import types as fields

from cc_dynamodb3.models import DynamoDBModel


class BaseModel(DynamoDBModel):
    created = fields.DateTimeType(default=DynamoDBModel.utcnow)
    updated = fields.DateTimeType(default=DynamoDBModel.utcnow)

    def save(self, overwrite=False):
        # Refresh the updated timestamp on every save.
        self.updated = DynamoDBModel.utcnow()
        return super(BaseModel, self).save(overwrite=overwrite)
Then, in your code:

from schematics import types as fields

from myproject.db import BaseModel


class Book(BaseModel):
    publisher = fields.StringType(required=True)
This makes sure all your models inheriting from BaseModel will have the two columns automatically populated. Ta-da!
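A quick usage sketch, assuming the schematics-style constructor that takes a dict of raw data:

# Hypothetical usage; exact constructor and save semantics depend on your setup.
book = Book({'publisher': 'HarperBusiness'})
book.save()
print(book.created, book.updated)  # updated is set by save(); created comes from the field default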
If you'd enjoy working on these types of projects while having a chance to help make aging better, ClearCare is hiring full-time engineers! Check out their careers section.