Wednesday Oct 20, 2021

Data Versioning for Data Science

Today we talk about data versioning: why you should do it, what to do about humans in the loop, and how to minimize mistakes.

Tools mentioned:

DVC - https://dvc.org/

Quilt Data Versioning - https://quiltdata.com/

Apache Airflow - https://airflow.apache.org/

Apache Superset - https://superset.apache.org/

OpenProject - https://www.openproject.org/

----------------------------------------

Follow the podcast on Twitter: @dsdeployed

https://twitter.com/dsdeployed

----------------------------------------

Donny Winston

I help researchers do data-intensive science together.

Twitter: https://twitter.com/donnywinston @donnywinston

Email: donny@polyneme.xyz

Website: https://polyneme.xyz/

LinkedIn: https://www.linkedin.com/in/donnywinston/

Ben Cook

I help data science teams deploy their algorithms because a machine learning model is only as good as the system that delivers it.

Twitter: @jbencook https://twitter.com/jbencook

LinkedIn: https://www.linkedin.com/in/jbencook/

Email: ben@sparrow.dev

Website: https://sparrow.dev/

Jillian Rowe

I help biotech startups deploy scalable high performance compute infrastructure on AWS.

Email: jillian@dabbleofdevops.com

Website: https://www.dabbleofdevops.com

Twitter: https://www.twitter.com/jillianerowe

LinkedIn: https://www.linkedin.com/in/jillian-rowe-9410437a/
