Tuesday, February 27, 2018

Stemgraphic v.0.5.x: stem-and-leaf EDA and visualization for numbers, categoricals and text


 Stemgraphic open source


In 2016 at PyDataCarolinas, I open-sourced my stem-and-leaf toolkit for exploratory data analysis and visualization. In October 2016, I posted the link to the video.



Stemgraphic.alpha


With the 0.5.x releases, I've introduced categorical and text support. Over the next few weeks, I'll be introducing some of the features, particularly those found in the new stemgraphic.alpha module of the stemgraphic package, such as back-to-back plots and stem-and-leaf heatmaps.




But if you want to get started, check out stemgraphic.org and the GitHub repo (especially the notebooks).

GitHub Repo

https://github.com/fdion/stemgraphic
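
For readers new to the technique, here is a minimal pure-Python sketch of a classic text stem-and-leaf display. It deliberately does not use stemgraphic's API; the function name `stem_and_leaf`, its `leaf_unit` parameter, and the sample data are made up for illustration, and the sketch assumes non-negative integers only.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the lines of a text stem-and-leaf display.

    Illustrative helper (not part of stemgraphic). Assumes non-negative
    integers; each value is truncated to a multiple of leaf_unit, its last
    digit becomes the leaf and the remaining digits become the stem.
    """
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value // leaf_unit, 10)
        rows[stem].append(leaf)

    if not rows:
        return []

    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return lines


if __name__ == "__main__":
    for line in stem_and_leaf([12, 15, 21, 34, 38, 39]):
        print(line)
```

Each row keeps the actual leaf digits, so the display reads like a histogram turned on its side while still showing the underlying values. The notebooks in the repo are the place to look for what stemgraphic itself produces.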


Francois Dion
@f_dion

Monday, February 26, 2018

Readings in Communication

"Ex-Libris" part VI: Communication

Part 6 of my "ex-libris" of a Data Scientist is now available. This one is about communication.
When I started this series, I introduced this diagram.

I also quoted John Tukey:
"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise."

This is quite important since a data science project has to start with a question and come up with an answer.

But how do we communicate at the time of formulating the question? How about at the time of providing an answer? By any means necessary.

Do check out the full article with the list of books:

"ex-libris" part vi


See also

Part I was on "data and databases": "ex-libris" of a Data Scientist - Part i
Part II was on "models": "ex-libris" of a Data Scientist - Part II
Part III was on "technology": "ex-libris" of a Data Scientist - Part III
Part IV was on "code": "ex-libris" of a Data Scientist - Part IV
Part V was on "visualization". A bonus part will follow, on management / leadership.
Francois Dion
@f_dion