Statistics | On Things

Using iframes From This Site to Embed Datasets on Webpages

Picostat was a web application I created with Drupal 7/8/9. It supported embedding datasets in webpages once they had been imported. I no longer maintain that project, but I moved the iframe functionality to pmagunia.com. You can see the code on a sample dataset page such as the kmenta dataset. On that page you will find the iframe HTML code like this:

Poverty and Education in Pennsylvania

It would seem that poverty and education are connected, but what does the data say? To understand this relationship, I took my survey data from Open Intro and focused on the following three variables for counties in Pennsylvania, using only the D3 JavaScript library for data visualization. Please note that data for some counties were missing for 2019.

Percent of the county population in poverty - poverty_2019
Percent of the county with a bachelor's degree - bachelors_2019
Population of the county - pop_2019

The data seem to corroborate the negative relationship that most people would likely imagine. Take a closer look using...

Exploring Education and Poverty in Tableau

Below is a screenshot of a Tableau dashboard I created while taking CS 416 at the University of Illinois at Urbana-Champaign. It shows the relationship between poverty and education in the counties of Pennsylvania. You can see the general trend is negative. In other words, counties with a larger share of educated residents tend to have lower poverty levels. You can visit and interact with the dashboard on the Tableau page on my website. The following are some questions and answers about my Tableau app.

Binomial Distribution

In the last section, we took a first look at some of the conditions necessary for a binomial experiment. Let's examine the binomial distribution more closely by looking at the method for calculating the probability of a particular outcome in a series of Bernoulli trials.
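As a rough illustration of that calculation, the sketch below computes the probability of exactly k successes in n independent Bernoulli trials using the standard formula $ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} $. The function name `binomial_pmf` and the example values (10 flips of a fair coin) are assumptions chosen for illustration, not taken from the article.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent Bernoulli trials."""
    if not 0 <= k <= n:
        return 0.0  # outcomes outside 0..n are impossible
    # Number of ways to place k successes, times the probability of each way.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative values only: 3 heads in 10 flips of a fair coin.
print(binomial_pmf(3, 10, 0.5))  # 120/1024 = 0.1171875
```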

Binomial Experiment

The binomial distribution is a discrete distribution with a finite sample space. In other words, a binomial random variable for $ n $ trials can take on only the values $ 0, 1, \ldots, n $. The mean and variance are given by $ \mu = np $ and $ \sigma^2 = npq,$ respectively, where $ p $ is the probability of success on a single trial and $ q = 1 - p $.
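One way to check the mean and variance formulas is to sum directly over the finite set of possible values. The sketch below does that; the helper name `binomial_moments` and the values n = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb


def binomial_moments(n: int, p: float) -> tuple[float, float]:
    """Mean and variance computed by summing over the values 0..n."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, variance


n, p = 10, 0.3  # illustrative values, not from the article
mean, variance = binomial_moments(n, p)
print(mean, n * p)                # both should be close to np
print(variance, n * p * (1 - p))  # both should be close to npq
```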

Binomial Testing II

Suppose we conducted our experiment from the last section and got three heads. That is two fewer heads than would be expected by chance alone if the coin is indeed fair. Now we need to calculate the p-value of the experiment. The p-value is the probability of obtaining a test statistic at least as extreme as the observed value, assuming the null hypothesis is true. In this case, our observed value is three. More extreme cases would be getting 0, 1, or 2 heads. However, this is a two-sided test so we can get extreme cases on...
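A minimal sketch of a two-sided p-value calculation follows. It assumes 10 flips, which is inferred from three heads being two fewer than expected for a fair coin and is not stated in this excerpt. It treats an outcome as "at least as extreme" when it lies at least as far from the expected count as the observed value, which is a reasonable convention for the symmetric fair-coin case. The function name `two_sided_p_value` is hypothetical.

```python
from math import comb


def two_sided_p_value(observed: int, n: int, p: float = 0.5) -> float:
    """Sum the probabilities of all outcomes at least as far from the
    expected count n*p as the observed count (symmetric case)."""
    expected = n * p
    distance = abs(observed - expected)
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k - expected) >= distance
    )


# Assumed setup: 3 heads observed in 10 flips, null hypothesis p = 0.5.
# Extreme outcomes are 0, 1, 2, 3 and 7, 8, 9, 10 heads.
print(two_sided_p_value(3, 10))  # 352/1024 = 0.34375
```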

Binomial Testing

In the last experiment, we assumed we had a fair coin. Suppose we wanted to conduct an experiment to see whether it was indeed a fair coin. First we need to state the null and alternative hypotheses, which are:

Moments

The mean is another name for the average. You can think of the mean as a typical representative value of all the observations in a set. The standard deviation is a measure of the “spread” of the data. For example, if the data in a set of observations were not concentrated around the mean, then that dataset would have a high standard deviation. If all the data were concentrated around some single value, then that dataset would have a low standard deviation. The standard deviation squared is the variance. The mean and the variance are examples of moments: the mean is the first moment, and the variance is the second central moment.
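To make the idea of spread concrete, the sketch below compares two made-up datasets with the same mean, using Python's standard `statistics` module. The data are invented for illustration, and the population versions of the variance and standard deviation (`pvariance`, `pstdev`) are used here as an assumption; the sample versions divide by n - 1 instead of n.

```python
from statistics import fmean, pstdev, pvariance

# Two invented datasets with the same mean but different spread.
tight = [9, 10, 10, 10, 11]   # concentrated around 10
spread = [1, 5, 10, 15, 19]   # scattered far from 10

for name, data in (("tight", tight), ("spread", spread)):
    # The standard deviation is the square root of the variance.
    print(name, fmean(data), pvariance(data), pstdev(data))
```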

Random Variables

A random variable, $ X $, is a function that assigns exactly one numerical value to each element in the sample space. The sample space is the domain of $X$. The probability distribution of $ X $ then gives the probability that $ X $ takes on each possible value $ x $. The sum of these probabilities over all possible values must equal 1, and the probability that the random variable takes on a value $ x $ must be between 0 and 1, inclusive.
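A small sketch can make these pieces concrete. It assumes an experiment of two fair coin flips with four equally likely outcomes, and defines $ X $ as the number of heads; this setup is an illustrative assumption, not an example from the article.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two coin flips: HH, HT, TH, TT (assumed equally likely).
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]


def X(outcome: str) -> int:
    """Random variable: maps each outcome to its number of heads."""
    return outcome.count("H")


# Probability of each value of X = share of outcomes mapped to that value.
counts = Counter(X(outcome) for outcome in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in counts.items()}

print(pmf)                # probabilities for X = 0, 1, 2
print(sum(pmf.values()))  # the probabilities sum to 1
```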

Statistical Populations

In statistics, we are often interested in the population, which is some entire set of entities that share a common trait. We often draw a sample, which is a subset of the population, to make inferences about the population. The set of measurements taken from the sample forms the sample data.
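The relationship between a population and a sample can be sketched with synthetic data. Everything below is invented for illustration: the population is 1,000 simulated measurements, the sample size of 50 is arbitrary, and a fixed seed is used only so the run is repeatable.

```python
import random
from statistics import fmean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic population: 1,000 simulated measurements (not real data).
population = [rng.gauss(170, 10) for _ in range(1000)]

# A simple random sample drawn without replacement from the population.
sample = rng.sample(population, 50)

# The sample mean serves as an estimate of the population mean.
print(fmean(population), fmean(sample))
```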