Want to be a good Data Scientist? Start with a good setup!

During the last few years as a PhD student in AI and Deep Learning, I have met a lot of smart people and good data scientists, both in academia and in industry. Still, I’ve always been surprised by how many of these folks still work with old tools and bad coding habits. Even though in the end it is a matter of personal choice and taste, I still believe that some tools are better than others when it comes to productivity and safety.

Moreover, especially if you are a young practitioner, it’s very easy to just follow your enthusiasm and start solving interesting problems with the wrong tools or attitude. As with playing a musical instrument, you risk getting used to a poorly designed set of tools and the bad habits that come with it, which are much harder to change later on than starting off on the right foot.

In this post I would like to share with you my preferred setup for Data Science and Deep Learning, so that you can jump right into this beautiful field without running into unnecessary headaches caused by a poor tool setup. Of course, this will not be the ideal arrangement for everyone, since it also depends on the specifics of each project, but I have found it general and useful enough for almost anything I do.

In a nutshell we’ll need:

Let’s analyze each of them in detail.

A good IDE


Plain text files are amazing and I’m a huge VIM fan, but some IDE features are just as valuable and pretty easy to customize. Personally, I love PyCharm, which comes with a really interesting tool suite and a smooth interface. Of all the features offered by an IDE, the first things you should look for are a very robust code editor (with syntax highlighting, error correction and smart suggestions) and a powerful debugging system!

To be continued… :-)

Authored by Vincenzo Lomonaco
