Lynn Cherny
Consulting (formerly EM-Lyon Business School)
arnicas@gmail.com, @arnicas
Presented at EMAEE19, Univ of Sussex, June 2019
I work a lot with data sets (qualitative and quantitative). They have gotten bigger, and they still always need visualizing.
From R for Data Science, Grolemund & Wickham
How do we reduce the distance between these steps?
One of the few "non-programming" options for data transformation ("wrangling").
"Our philosophy has been that AI and machine learning are very important, but there are times when human context has a part to play. That’s why we made the interface very easy to go back and forth between the two," explained [Joe] Hellerstein.
Trifacta Wrangler's Missing Data Detection and Suggestions ("tidy"/"clean")
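To make "missing data detection" concrete, here is a minimal sketch of the underlying idea in plain Python: count empty or placeholder cells per column. This is not how Trifacta Wrangler works internally; the sample rows, the `MISSING_TOKENS` list, and the `missing_counts` helper are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """title,year,rating
Movie A,2017,7.1
Movie B,,6.4
Movie C,2016,N/A
Movie D,2015,
"""

# Tokens treated as "missing". This list is an assumption, not Trifacta's rule set.
MISSING_TOKENS = {"", "na", "n/a", "null", "none", "nan"}


def missing_counts(csv_text):
    """Return {column name: number of missing cells} for a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row.get(name)
            # A short row yields None; otherwise compare the normalized text.
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[name] += 1
    return counts


print(missing_counts(SAMPLE_CSV))
```

A per-column count like this is roughly what a data-quality bar summarizes visually; a real tool would also infer column types and flag mismatched values, which this sketch does not attempt.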
Example: 80K Rows in Movie Ratings data set
https://fivethirtyeight.datasettes.com/fivethirtyeight/inconvenient-sequel%2Fratings
"Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size and publish that as an interactive, explorable website and accompanying API.
Datasette is aimed at data journalists, museum curators, archivists, local governments and anyone else who has data that they wish to share with the world."
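The "accompanying API" can be sketched as follows: Datasette serves a JSON version of a table page when `.json` is appended to its URL, and accepts query parameters such as `_shape=array` and `_size`. The helper names `json_url` and `fetch_rows` are hypothetical, and whether the demo host above is still online is an assumption, so the network call is defined but not executed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Table URL taken from the example slide above.
TABLE_URL = (
    "https://fivethirtyeight.datasettes.com/"
    "fivethirtyeight/inconvenient-sequel%2Fratings"
)


def json_url(table_url, size=5):
    """Build the JSON API URL for a Datasette table page."""
    # _shape=array asks for a plain list of row objects;
    # _size limits how many rows come back.
    params = urlencode({"_shape": "array", "_size": size})
    return f"{table_url}.json?{params}"


def fetch_rows(table_url, size=5):
    """Fetch rows as a list of dicts (requires the host to be reachable)."""
    with urlopen(json_url(table_url, size)) as response:
        return json.load(response)


print(json_url(TABLE_URL))
```

Calling `fetch_rows(TABLE_URL)` would return the first few ratings rows as dictionaries if the site is available, ready to hand to a charting or analysis step.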
E.g., article tutorial here:
from Quantopian: https://github.com/quantopian/qgrid
"We have repeatedly seen the pain involved in turning some analysis code into insights that can be easily shared with decision makers within an organization or the general public. Because the technologies involved often required distinct skill sets, different teams may be involved in prototyping, developing and deploying an app to be used by non-technical people."
"Panel" examples... A few lines of Python code.
Build Pipelines with Panel
"A classifier pipeline which allows 1) capturing images from a webcam and applying object detection to the images, 2) selecting and modifying the bounding boxes and 3) classifying the contents of the selected region using Google’s Vision API (To try it yourself here)."
A Recent Cute t-SNE Layout of Clustered Text, with Associated Images...
By Fathom, https://fathom.info/bobross/
A social media problem from me & colleagues...
From a squarified view: see code at https://ml4a.github.io/ by Gene Kogan, and demo
This was labelled "goldfish" but also contains a cat.
https://planspace.org/20170911-problems_with_imagenet_and_its_solutions/
Video segment, maybe demo
Thanks!