What is the AWS Data Exchange service?
A short description of the new AWS service named Data Exchange: how companies can make money by sharing their data, and how companies can get easy access to data sets in the cloud.
Amazon Web Services recently launched a new service named Data Exchange. In Data Exchange, subscribers can subscribe to file-based data sets published by data providers.
A unit of data published to Data Exchange is called a product. Every product in Data Exchange consists of:
Details: name, description, etc.
Offer: price, duration (e.g. a 12-month subscription), refund policy.
Data sets: dynamic sets of files with data. Data sets have revisions, and every update to a data set creates a new revision of it.
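To make the structure of a product a bit more tangible, here is a small Python sketch of it. This is only my mental model, not the Data Exchange API: all class and field names (Product, Offer, DataSet, Revision, publish_update) are made up for illustration, and real revisions are identified by IDs rather than by a simple counter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    number: int          # hypothetical counter; the real service uses revision IDs
    files: List[str]     # names of the files included in this revision


@dataclass
class DataSet:
    name: str
    revisions: List[Revision] = field(default_factory=list)

    def publish_update(self, files: List[str]) -> Revision:
        """Every update to the data set creates a new revision."""
        revision = Revision(number=len(self.revisions) + 1, files=list(files))
        self.revisions.append(revision)
        return revision


@dataclass
class Offer:
    price_usd: float
    duration_months: int
    refund_policy: str


@dataclass
class Product:
    name: str            # details
    description: str     # details
    offer: Offer
    data_sets: List[DataSet] = field(default_factory=list)


# Usage with made-up values: two updates produce two revisions.
prices = DataSet(name="daily-prices")
prices.publish_update(["2020-01-01.csv"])
prices.publish_update(["2020-01-01.csv", "2020-01-02.csv"])
product = Product(
    name="Example prices",
    description="Hypothetical demo product",
    offer=Offer(price_usd=100.0, duration_months=12, refund_policy="No refunds"),
    data_sets=[prices],
)
```

The point of the sketch is that a subscriber never sees a data set as one fixed file dump: it is a growing list of revisions, and the latest one is what you usually want.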
To become a provider, you need to be registered as an AWS Marketplace seller, and your Data Exchange provider registration must be qualified by AWS. The publishing company has to be registered in either the US or the EU. The registration procedure can be explored here. Publishers must ensure that the data they publish meets the Terms and Conditions for AWS Marketplace Sellers.
As of today, Data Exchange contains more than 1,000 data products from many different domains, such as financial services, geospatial data, healthcare and more. The Data Exchange publisher list contains qualified companies like Reuters, Crux Informatics and Epsilon.
Subscribing is straightforward. You just go to the Data Exchange console in your AWS account and search for the data products you need. Data products may be free or paid. The majority of the free data products are demos or samples of paid data products. They may come in handy when building POCs. Once you find the desired data product, read the description and check details like the price or the subscription duration. When you are all set, you can subscribe to the data product with a few simple button clicks. The subscription fee will be added to your AWS bill. Payment is made in full, upfront.
Optionally, a publisher may change the subscription method to request-based. In this case you’ll first have to send a subscription request. The publisher then reviews the request and accepts or declines it.
Once subscribed, you’ll be able to access the data using the Data Exchange console or API. It is possible to download the data to your local machine or export it to your S3 bucket. Whenever the publisher updates the data product, that is, creates a new revision, you will be notified by a CloudWatch event. You can use this event to automate consumption of updated data.
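Here is a minimal Python sketch of how that automation could look, for example inside a Lambda function triggered by the event. Treat it as a sketch under assumptions: the event fields used here (source "aws.dataexchange", detail-type "Revision Published To Data Set", the data set ID in "resources" and the revision IDs in detail.RevisionIds) reflect the event shape as I understand it from the AWS documentation, so verify them against the current docs. The function names are my own, and `client` is expected to be a boto3 Data Exchange client that you create yourself with boto3.client("dataexchange").

```python
def revision_published_pattern(data_set_id):
    """Event pattern matching new revisions of one data set (assumed event shape)."""
    return {
        "source": ["aws.dataexchange"],
        "detail-type": ["Revision Published To Data Set"],
        "resources": [data_set_id],
    }


def export_request(event, bucket):
    """Build the arguments for an EXPORT_REVISIONS_TO_S3 job from the event."""
    data_set_id = event["resources"][0]
    destinations = [
        {"RevisionId": revision_id, "Bucket": bucket}
        for revision_id in event["detail"]["RevisionIds"]
    ]
    return {
        "Type": "EXPORT_REVISIONS_TO_S3",
        "Details": {
            "ExportRevisionsToS3": {
                "DataSetId": data_set_id,
                "RevisionDestinations": destinations,
            }
        },
    }


def export_new_revisions(client, event, bucket):
    """Create and start the export job; returns the job ID.

    `client` is a boto3 "dataexchange" client supplied by the caller.
    """
    job = client.create_job(**export_request(event, bucket))
    client.start_job(JobId=job["Id"])
    return job["Id"]
```

The request building is kept separate from the API calls on purpose: you can check the generated job request against a sample event locally, without touching your AWS account. The export job runs asynchronously, so the files may land in the bucket a moment after the function returns.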
AWS states that they scan the data in Data Exchange for potential malware, but they do not guarantee that the data is malware-free. Responsibility for malware protection is shared between AWS and the customer. When using data from Data Exchange, you should scan it with a third-party malware scanner such as ClamAV.
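As a rough illustration, the scan can be scripted in Python by calling the clamscan command-line tool on the directory where you downloaded the data. This sketch assumes ClamAV is installed and clamscan is on your PATH; the function names are my own. It relies on the exit codes documented by clamscan: 0 means no virus was found, 1 means a virus was found, and 2 means an error occurred.

```python
import subprocess


def interpret_clamscan(returncode):
    """Map a clamscan exit code to a simple verdict."""
    if returncode == 0:
        return "clean"
    if returncode == 1:
        return "infected"
    return "error"  # exit code 2 (or anything unexpected)


def scan_directory(path):
    """Recursively scan `path` with clamscan; returns (verdict, output).

    --infected prints only infected files, --no-summary hides the final report.
    """
    result = subprocess.run(
        ["clamscan", "--recursive", "--infected", "--no-summary", path],
        capture_output=True,
        text=True,
    )
    return interpret_clamscan(result.returncode), result.stdout
```

A sensible rule is to load the data into your pipeline only when the verdict is "clean", and to treat "error" the same as "infected" until you know why the scan failed.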
Subscription durations can vary from 1 to 36 months. By default, public data product offers are set to auto-renew. Subscription terms can be changed by the data publisher at any time; however, the changes will not take effect for existing subscriptions. With auto-renew enabled, the new subscription will be bought using the most recent terms (including the new price!), so be careful with it.
AWS created Data Exchange Heartbeat, a test data product that you can use to familiarize yourself with Data Exchange concepts and usage. The Heartbeat data product contains a single data set, also named Heartbeat. A new revision is created every 15 minutes. You can subscribe and play with it for free.
You can read more on the official Data Exchange product page.
Any questions? Feel free to contact me at contact@kscloud.pl. If you want to receive my posts by email, consider subscribing to my Substack at https://cloudtalks.substack.com/.
https://kscloud.pl