As you may remember from Model Deployment with Flask, Part 1, we covered the definition of a machine learning pipeline and model deployment, and we created the files needed to deploy the model.
In this story, we will take the remaining steps to get our model running on the web. First, we will check that the model works properly on our local machine; then we will upload the necessary files to GitHub; and finally we will create an app on Heroku.
Step 1: Testing our model locally
Before uploading the files to GitHub and deploying to Heroku, it is good practice to confirm that the model works locally. We can do this by running the `python app.py` command and sending a test request to the development server.
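The exact `app.py` from Part 1 is not shown here, so the sketch below is a minimal stand-in: the route name, the JSON payload shape, and the stub model are illustrative assumptions, not the files from Part 1. In the real app, the trained pipeline would be unpickled from disk instead of the stub.

```python
# app.py — minimal sketch of a Flask prediction service for local testing.
# The route name, payload shape, and stub model are illustrative assumptions.
from flask import Flask, request, jsonify

app = Flask(__name__)

# In the real app the trained pipeline is loaded from disk, e.g.:
#   import pickle
#   model = pickle.load(open("model.pkl", "rb"))
# A stub stands in here so the sketch runs without the pickle file.
class _StubModel:
    def predict(self, rows):
        # Pretend prediction: sum of the feature values for each row.
        return [sum(row) for row in rows]

model = _StubModel()

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [1.0, 2.0]}
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    # `python app.py` starts the development server on http://127.0.0.1:5000
    app.run(debug=True)
```

With the server running, the endpoint can be exercised from another terminal, for example with `curl -X POST -H "Content-Type: application/json" -d '{"features": [1.0, 2.0]}' http://127.0.0.1:5000/predict`. If the response comes back with a prediction, the model is ready to be pushed to GitHub and Heroku.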