I'm Chris Clark, the CTO and co-founder of Grove Collaborative. Over the years I've written a number of articles about software, technology, and teams.

Posts

 

07/11/2013
Changing Our Development Process

This is a post about change management at a start-up.

Kaggle, at just under 20 people, is an interesting size: we are right on the cusp between being able to wing it on process and communication and needing more formal processes to make sure changes and plans …

 

01/23/2013
Getting Started with Pandas - Predicting SAT Scores for New York City Schools

An Introduction to Pandas

This tutorial will get you started with Pandas - a data analysis library for Python that is great for data preparation, joining, and ultimately generating well-formed, tabular data that's easy to use in a variety of visualization tools or (as we will see here) machine learning applications …

 

01/17/2013
Entering Kaggle Competitions with Google Predict

BigML had a great series of posts over the summer pitting some prediction-as-a-service products against each other. One of those was the Google Predict API. I thought it might be fun to enter a Kaggle competition using the API and see how it did against some of the world's top …

 

11/01/2012
Extensible, Single-Line Fizzbuzz in a Tweet

Yep, it's a fizzbuzz blog post! I can hardly believe I've gone this long without ever doing one.

I was thinking about hiring and what I would do if someone asked me fizzbuzz, and I think I would have used it as an opportunity to show off some engineering practices …

 

10/17/2012
Rapidly Saving .jpgs in Photoshop

Cliffs notes: Now, whenever I hit F2 in Photoshop, I get a high quality jpg of my file saved to the same directory. No more "File->Save As" nonsense every time I want a static version of the image.

What/how/why: I was working on some product mock-ups this …

 

10/04/2012
Engineering Practices in Data Science

Josh Wills wrote this excellent, pithy definition of a data scientist:

Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician.

It's certainly true that software engineering and data science are two different disciplines, and for good reason - they …

 

07/20/2012
Getting Started With Python for Data Science

As the product manager at Kaggle I'm fortunate enough to hang around with some of the world's top machine learning experts. And working at a place that specializes in running predictive modelling competitions means that I inevitably got the itch to learn some of this stuff myself.

I don't have …

 

07/01/2012
Tale of the Tape (The Indentation Apocalypse)

BRUCE "Emacs" BUFFER: Now entering the Kill Ring in the blue trunks, with a reach of approximately three-quarters of an inch, hailing from the 9th spot in the ASCII table, it's TAAAABBBBBB!

BRUCE "Emacs" BUFFER: In the opposite corner, wearing white, you know him well, don't-call-him-32-he-prefers-hexadecimal-number-20...it's SPAAACCEEEEEE!

COMMENTATOR 1 …

 

06/29/2012
Four Fun Facts From Big Data

  1. The credit card industry's term of art for false positives in fraud detection is "insult rate". If you've ever swiped your card at a shop and had it declined, you understand why.

  2. A large auto insurance company has discovered that if a household owns two identical cars (same make, same …

 

04/01/2012
A Very Painful Bug

I'll lead off this post by listing all of the various things I thought might have caused this bug, and related phrases that I Googled in the hopes that it will lead some poor soul chasing the same issue to this post, thus shaving 2-3 points off that person's long-term …

 

03/09/2012
Juggernaut in Windows

Note: This is an old post, and no longer relevant as Juggernaut is no longer a thing. Just use socket.io directly.

I've fallen in love with Juggernaut.

Juggernaut gives you a realtime connection between your servers and client browsers. You can literally push data to clients using your web …

 

03/01/2012
Communicating Performance Thresholds

I think visually, so here's a way I like to communicate performance thresholds. There are a bunch of dimensions that might make an application behave differently. Test against those dimensions and plot the thresholds of each one. Here are some that I made up, for a theoretical application.

The goal …

 

02/21/2012
Building Software and Building Bridges

We have a problem. People can't get from one area of town to a neighboring area because there is a river in between and no road. So let's build a bridge.

Step 1: Requirements

We'll get detailed requirements from civil engineers and government officials, including environmental constraints, safety guidelines, traffic …

 

01/29/2012
No Bugs != Quality

A low bug count is not a good indicator of quality software. Lack of typos and grammatical errors is not indicative of a quality book.

If your software has nagging "quality problems", driving the open bug count to zero probably won't be much help (not least because it only addresses …

 

01/03/2012
Localization (or: Localisation) Tip

I just learned something from our documentation team: Try to avoid using gerunds in UI text because they are difficult to localize & translate. While they certainly aren't a challenge for great translators, I'm told they can be misinterpreted by mediocre translators - like the ones that outsourced translating services might use …