diff --git a/blog/2016-01-01-complaining-about-the-weather/index.mdx b/blog/2016-01-01-complaining-about-the-weather/index.mdx
index 0f1f322..4223c68 100644
--- a/blog/2016-01-01-complaining-about-the-weather/index.mdx
+++ b/blog/2016-01-01-complaining-about-the-weather/index.mdx
@@ -29,7 +29,7 @@ I'm originally from North Carolina, and I've been hearing a lot of people talkin
 
 So I got a bit curious: Has North Carolina over the past few months actually had more cloudy and rainy days recently than in previous years? This shouldn't be a particularly challenging task, but I'm interested to know if people's perceptions actually reflect reality.
 
-The data we'll use comes from [forecast.io](https://forecast.io), since they can give us a cloud cover percentage. I've gone ahead and retrieved the data to a pickle file, and included the [code that was used to generate it](#Generating-the-Forecast-file). First up: What was the average cloud cover in North Carolina during August - November, and how many days were cloudy? We're going to assume that a "cloudy" day is defined as any day in which the cloud cover is above 50%.
+The data we'll use comes from [forecast.io](https://forecast.io), since they can give us a cloud cover percentage. I've gone ahead and retrieved the data to a pickle file, and included the code that was used to generate it below. First up: What was the average cloud cover in North Carolina during August - November, and how many days were cloudy? We're going to assume that a "cloudy" day is defined as any day in which the cloud cover is above 50%.
 
 ```python
 city_forecasts = pickle.load(open('city_forecasts.p', 'rb'))
diff --git a/blog/2016-04-06-tick-tock/index.mdx b/blog/2016-04-06-tick-tock/index.mdx
index 4311879..433d6b2 100644
--- a/blog/2016-04-06-tick-tock/index.mdx
+++ b/blog/2016-04-06-tick-tock/index.mdx
@@ -36,9 +36,9 @@ So while the ending number is **not useful in any medical context**, it is still
 
 ## Getting the data
 
-[Fitbit](https://www.fitbit.com/) has an [API available](https://dev.fitbit.com/) for people to pull their personal data off the system. It requires registering an application, authentication with OAuth, and some other complicated things. **If you're not interested in how I fetch the data, skip [here](#Wild-Extrapolations-from-Small-Data)**.
+[Fitbit](https://www.fitbit.com/) has an [API available](https://dev.fitbit.com/) for people to pull their personal data off the system. It requires registering an application, authentication with OAuth, and some other complicated things.
 
-## Registering an application
+### Registering an application
 
 I've already [registered a personal application](https://dev.fitbit.com/apps/new) with Fitbit, so I can go ahead and retrieve things like the client secret from a file.
 
diff --git a/blog/2016-05-15-the-unfair-casino/index.mdx b/blog/2016-05-15-the-unfair-casino/index.mdx
index a9cf07e..cc4ca3c 100644
--- a/blog/2016-05-15-the-unfair-casino/index.mdx
+++ b/blog/2016-05-15-the-unfair-casino/index.mdx
@@ -155,7 +155,7 @@ We can thus finally state: **just by looking at the distribution of results from
 
 ## Simulated Annealing
 
-What we really would like to do though, is see if there is any way to determine how exactly the dice are loaded. This is significantly more complicated, but we can borrow some algorithms from Machine Learning to figure out exactly how to perform this process. I'm using the Simulated Annealing algorithm, and I discuss why this works and why I chose it over some of the alternatives in the [justification](#Justification-of-Simulated-Annealing). If you don't care about how I set up the model and just want to see the code, check out [the actual code](#The-actual-code).
+What we really would like to do though, is see if there is any way to determine how exactly the dice are loaded. This is significantly more complicated, but we can borrow some algorithms from Machine Learning to figure out exactly how to perform this process. I'm using the Simulated Annealing algorithm, and I discuss why this works and why I chose it over some of the alternatives later. If you don't care about how I set up the model and just want to see the code, check it out below.
 
 [Simulated Annealing][3] is a variation of the [Metropolis-Hastings Algorithm][4], but the important thing for us is: Simulated Annealing allows us to quickly optimize high-dimensional problems. But what exactly are we trying to optimize? Ideally, we want a function that can tell us whether one distribution for the dice better explains the results than another distribution. This is known as the **likelihood** function.
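
Note on the last context paragraph: the idea of a likelihood function comparing two candidate dice distributions can be sketched as below. This is a minimal illustration and not the post's actual code. The function name `log_likelihood`, the single-die setup, and the roll counts are all hypothetical, chosen only to show how one distribution can "better explain" the results than another.

```python
import math


def log_likelihood(counts, probs):
    """Multinomial log-likelihood of face counts under probs.

    The constant multinomial coefficient is dropped, since it is the
    same for every candidate distribution being compared.
    """
    total = 0.0
    for count, prob in zip(counts, probs):
        if count == 0:
            continue  # faces never observed contribute nothing
        if prob == 0.0:
            return float("-inf")  # an observed face with probability zero
        total += count * math.log(prob)
    return total


# Hypothetical data: how often each face (1 through 6) came up in 100 rolls.
observed = [10, 10, 10, 10, 10, 50]

# Two candidate distributions for the die (both are assumptions).
fair = [1 / 6] * 6
loaded = [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]

ll_fair = log_likelihood(observed, fair)
ll_loaded = log_likelihood(observed, loaded)

# The candidate with the higher log-likelihood explains the rolls better.
better = "loaded" if ll_loaded > ll_fair else "fair"
print(f"fair: {ll_fair:.2f}, loaded: {ll_loaded:.2f}, better: {better}")
```

An optimizer such as simulated annealing would repeatedly propose new candidate distributions and use a score like this one to decide which proposals to keep.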