Principles, Statistical and Computational Tools for Reproducible Data Science