Data is not always useful, no matter how much of it you have.
There’s no mathematical tool to tell you if your hypothesis is true; you can only see whether it is consistent with the data, and if the data is sparse or unclear, your conclusions are uncertain.
Virgilio - Mentor for Data Science E-Learning.
Awesome Data Science with Python - Curated list of Python resources for data science.
nteract - Interactive computing suite for you.
Pandas - Powerful Python data analysis toolkit.
Weld - High-performance runtime for data analytics applications.
Vaex - Out-of-Core DataFrames for Python; visualize and explore big tabular data at a billion rows per second.
Ibis - Python data analysis framework for Hadoop and SQL engines.
Kyso - Data analytics knowledge hub.
Feather - Fast, interoperable binary data frame storage for Python, R, and more, powered by Apache Arrow.
ROOT system - Provides a set of object-oriented frameworks with all the functionality needed to handle and analyze large amounts of data efficiently.
Prefect - New workflow management system, designed for modern infrastructure and powered by the open-source Prefect Core workflow engine.
Monument - High-productivity toolkit for predictions. AutoML for time series on any desktop, laptop or server.
dbt - Data build tool. Analytics engineering workflow.
Apache Zeppelin - Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Apache Nifi - Easy to use, powerful, and reliable system to process and distribute data.
Koalas - Pandas API on Apache Spark.
Prophet - Tool for producing high-quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
CuPy - NumPy-like API accelerated with CUDA.
SaturnCloud - Manage Data Science applications so Data Scientists don't have to do DevOps.
Falcon - Interactive Visual Analysis for Big Data.
Google Cloud DataLab - Interactive tools and developer experiences for Big Data on Google Cloud Platform.
Apache Kudu - Completes Hadoop's storage layer to enable fast analytics on fast data.
Jigsaw Labs - Learn Data Science part-time.
Data Science Ontology - Knowledge base about data science.
Data Engineering Project - Implementation of a data pipeline that consumes the latest news from RSS feeds and makes it available to users via a handy API.
Hex Technologies - Turn your notebooks into collaborative, sharable data apps and stories. No more loose CSVs, chart screenshots, or stale decks.
Amundsen by Lyft - Open source data discovery and metadata engine.
PandasGUI - GUI for analyzing Pandas DataFrames.
Holistics - Data Modeling & Self-Service BI Platform.