From 06653d89519d1f0fa8f91e2e0c1dfcff57635dda Mon Sep 17 00:00:00 2001 From: Ivan Ivanov Date: Sat, 7 Nov 2015 17:45:40 -0500 Subject: [PATCH] readme --- README.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/README.md b/README.md index 2636cab..2cd20ae 100644 --- a/README.md +++ b/README.md @@ -21,3 +21,15 @@ download_puf(short_puf) ``` All PUF files, regardless of what dataset they come from, can be downloaded through this command. + +At this stage the `enrich_dataset.py` script can be used to add categorical labels and convert to better variable names. + +``` +# Usage: +enrich_dataset.py --input-file h94e.csv --column-dictionary FYCCodebook_2013.csv + +``` +The script will extract information about categorical variables in the input file using import.io API to parse codebook tables from the MEPS site and add columns with labels, rather than numeric IDs. + +The column dictionary is one time construction from the codebooks on the MEPS website, mapping 8-character variable names +to more descriptive ones.