Technology Topics by Brains

ブレインズテクノロジーの研究開発機関「未来工場」で働くエンジニアが、先端オープン技術、機械学習×データ分析(異常検知、予兆検知)に関する取組みをご紹介します。

re:Invent 2019, Day 1: Building ML practices & Advanced DynamoDB patterns

Building ML practices to address the top four use cases

I woke up very early in the morning (as early as 2pm) so I came very early to the venue too. The room of the session is shared with other sessions, so there’s no loud speaker, just some headphones set up for people who couldn’t hear the speaker directly. This session is kind of an introductory to the ML stack offerings from AWS, though it didn’t provide much detail about how to implement things, the speakers did give you a brief idea of steps you need to take to address your machine learning problems (4 of which mentioned in this session: Recommendation engines, Forecasting, Computer vision, and Natural language processing).

f:id:brains-tech:20200114115851p:plain
AWS ML Stack
What I learned from this session is: as an AWS user, you should first take advantage of their very convenient offerings like Personalize, Forecast, Rekonition, Textract, etc. to address your problem. These options are very convenient because you don’t really need the data initially to create your first models. It’s better to have a mediocre model than nothing, so if your system is just at an initial state, those are very helpful. And then when you have your own data, you can either input them to the “easy buttons” solutions, or you can customize the process further using SageMaker, there will be lower level options for you to choose from: several algorithms which you can tweak to your liking. And there is a marketplace for pre built models for different use cases that you can browse and buy them to use if it satisfies you. And if every pre built solutions cannot satisfy your needs, you can go deep down to the programming levels, which you can use frameworks like Tensorflow, Pytorch, and interfaces like Keras, or Amazon’s Gluon to program your ML pipeline yourself.
Full event can be found here.
(Update, at the Keynote one day later, AWS just announced some services that can make the process much more convenient including a cloud based IDE for SageMaker, with multiple SageMaker related goodies)

Advanced Design Patterns for DynamoDB

This is a repeated session from 2018, but there were still a lot of them being held repeatedly this year, and with a lot of people attending those sessions too. I can understand the reason behind that. Everybody is using NoSQL for their workloads, but not very many of them (myself included) truly understand what it is and how to utilize the technology.
Speaker Rick Houlihan started the presentation by addressing the reason why NoSQL is taking over RDBMS: data gets bigger, storage gets cheaper and CPUs are still expensive, it is more appropriate to store unnormalized that is ready for fast querying than to store normalized data that needs heavy computing for queries.
NoSQL is a very different beast than traditional relational databases. Using NoSQL the same way as relational databases is wrong, and if you use the same data model it’s wrong. In this session, Mr. Houlihan introduced briefly about the data model of DynamoDB, and then some advanced design patterns such as:

  • Choosing partition key to optimize partitioning performance
  • Range queries using sort key
  • Heterogeneous collections of items
  • Indexing and access patterns

f:id:brains-tech:20200114115403p:plain
Advanced design patterns for DynamoDB

Full event can be found here

-- Cuong Nguyen, Software Developer @ Brains Technology, Inc.