Clothing Items Morphing into Data

Collecting and Populating Wardrobe Data

Our ultimate goal of deriving wardrobe insights and taking data-driven action on fits and clothing items is only achievable with accurate and complete records. Here we tackle the important task of data collection and population.

Most of the time, we data folk inherit data from external sources. Often this puts a smile on our faces, as we can get right to work with analysis, machine learning, or visualization. However, there is a dark side.

Data is often messy, incomplete, or imbalanced, which limits how well analysis, machine learning, or visualization can unearth relevant insights. In fact, most of our time is spent remedying these issues to the best of our ability (see the chart below, from the Data Scientist Report).


Pie Chart of Data Task Duration

Often, in the midst of wrangling data, it occurs to us that "if only the experiment designer had considered X" or "if only they had collected data on Y", our job would be so much easier. Such is my motivation for the wardrobe database.

Beginning with the end in mind (the questions I want answered, the insights I'm hoping for, and the analysis techniques I'm interested in employing) pushed me to design a wardrobe data source ideal for these activities.

With the database now ready, we can begin the process of data collection and population.


Collecting Wardrobe Data

Data collection for this project, as you might expect, was tedious, to say the least.

Think about an individual item of clothing and what must be checked and examined prior to inputting the relevant values into the database:


That's a lot to think about for each item in one's wardrobe. I can say with a good degree of certainty that logging all the items in my modestly sized wardrobe will take a handful of hours at best.


Man Looking at Clothes

Adding individual fit data is much simpler, though logging outfits on a daily basis requires its own level of detail. Think about what that takes:


In summary, the data collection task required patience and process. Aligning the collection requirements stated above with the tools for populating the database was the approach for optimum efficiency.


Populating Wardrobe Data

Just as the sequence of adding tables to a database is important, so is the sequence for adding source data to those same tables.

Because the "wItem" table stores foreign keys for "wItemBrand", "wItemType", etc., we cannot add a clothing item if those brand and type records are not already in the database with keys to reference.
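To make the ordering constraint concrete, here is a minimal sketch using Python's built-in sqlite3 module. Only the table names "wItem", "wItemBrand", and "wItemType" come from this project; the column names, the sample values, and the choice of SQLite are assumptions for illustration, and the real schema will have more columns.

```python
import sqlite3


def build_schema(conn):
    """Create a pared-down, hypothetical version of the three tables."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE wItemBrand (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE wItemType  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE wItem (
            id          INTEGER PRIMARY KEY,
            description TEXT NOT NULL,
            brand_id    INTEGER NOT NULL REFERENCES wItemBrand(id),
            type_id     INTEGER NOT NULL REFERENCES wItemType(id)
        );
        """
    )


def try_insert_item(conn, description, brand_id, type_id):
    """Return True if the item row was stored, False if a key was missing."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO wItem (description, brand_id, type_id) VALUES (?, ?, ?)",
                (description, brand_id, type_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when brand_id or type_id points at a row that does not exist.
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_schema(conn)

    # Item first: rejected, because brand 1 and type 1 are not there yet.
    print(try_insert_item(conn, "navy oxford shirt", 1, 1))

    # Brand and type first, then the item: accepted.
    with conn:
        conn.execute("INSERT INTO wItemBrand (id, name) VALUES (1, 'ExampleBrand')")
        conn.execute("INSERT INTO wItemType (id, name) VALUES (1, 'shirt')")
    print(try_insert_item(conn, "navy oxford shirt", 1, 1))
```

The database itself refuses the out-of-order insert, which is why the population sequence has to mirror the table dependencies.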

Below is a basic flowchart depicting the order and logic I run through every time I populate clothing item data:


Process for Adding Item Data

At most stages, it's imperative to check whether the value related to the current clothing item (e.g. brand, type, color) is new. Should we be working with a brand not yet populated into the database, we have an additional step before proceeding to add the item.
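The "is this value new?" check can be sketched as a small get-or-create helper. This is one possible way to do it, not necessarily the tooling used for this project: the column names (`id`, `name`, `description`, `brand_id`, `type_id`), the helper names, and the use of SQLite are all assumptions, and only brand and type are covered to keep it short.

```python
import sqlite3

# Lookup tables this sketch may touch. Table names cannot be passed as SQL
# parameters, so they are checked against this set before being interpolated.
LOOKUP_TABLES = {"wItemBrand", "wItemType"}


def make_demo_db():
    """Build an in-memory database with a hypothetical, pared-down schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE wItemBrand (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE wItemType  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE wItem (
            id          INTEGER PRIMARY KEY,
            description TEXT NOT NULL,
            brand_id    INTEGER NOT NULL REFERENCES wItemBrand(id),
            type_id     INTEGER NOT NULL REFERENCES wItemType(id)
        );
        """
    )
    return conn


def get_or_create(conn, table, name):
    """Return the key for `name`, inserting a new lookup row only if needed."""
    if table not in LOOKUP_TABLES:
        raise ValueError(f"unexpected lookup table: {table}")
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0]  # value already known: reuse its key
    cur = conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))
    return cur.lastrowid  # value was new: this is the extra step


def add_item(conn, description, brand, item_type):
    """Resolve the lookup keys first, then insert the item, in one transaction."""
    with conn:
        brand_id = get_or_create(conn, "wItemBrand", brand)
        type_id = get_or_create(conn, "wItemType", item_type)
        cur = conn.execute(
            "INSERT INTO wItem (description, brand_id, type_id) VALUES (?, ?, ?)",
            (description, brand_id, type_id),
        )
        return cur.lastrowid


if __name__ == "__main__":
    conn = make_demo_db()
    add_item(conn, "navy oxford shirt", "ExampleBrand", "shirt")
    add_item(conn, "white tee", "ExampleBrand", "t-shirt")
    # The second item reuses the existing brand row instead of duplicating it.
    print(conn.execute("SELECT COUNT(*) FROM wItemBrand").fetchone()[0])
```

Wrapping the lookups and the item insert in one transaction means a failure partway through does not leave a brand or type row behind without its item.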

In a similar fashion and with similar considerations, we can add outfit data (please see the far simpler flowchart below):


Process for Adding Fit Data

Collecting and populating wardrobe data is never really complete. We all buy new clothing items throughout the year, get rid of some stuff, notice when an item gets damaged, and at the very least wear clothes daily. This all requires strict cataloging of what's going on with the wardrobe.

I hope to spend the next several weeks getting to the point where the current status of my wardrobe is represented in data and habits for consistent tracking are in place.