Showing posts with label statistics. Show all posts

Sunday, March 06, 2022

Trustworthy Stats?

I'm basically skeptical of the statistics being reported from Ukraine. There is too much confusion in the early days of a war: everyone gets excited and succumbs to the temptation of believing what they want to believe. We saw that in WWII, particularly with aerial combat in all theaters.

Tuesday, December 14, 2021

The Vietnam Morass

I tried to check Jill Lepore's claim that hundreds of thousands marched on April 15, 1967 in New York City to protest the war--it seemed high to me. That got me into deep waters.  The NY Times seems to say that police estimated 100,000, or possibly 125,000, although they were told to prepare for 200,000 to 400,000. Elsewhere, including Wikipedia, the "hundreds of thousands" phrase seems to be established wisdom.  I'm not sure anyone has tried to estimate it as carefully as we used to do with crowds at the various inaugurals.

Elsewhere there's the question of the number of draft dodgers--Wikipedia offers different vague estimates in different places, but this site has:
For its part, the U.S. government continued to prosecute draft evaders after the Vietnam War ended. A total of 209,517 men were formally accused of violating draft laws, while government officials estimate another 360,000 were never formally accused.

That sounds so specific it must be based on some official document; unfortunately they don't provide any sources. 

It's a reminder to me of how fragile is the base of "facts" for our received version of history.

Monday, July 19, 2021

Continuums

"most things that we think of as categorical are really continuous"

That's a line in this post, What Is a Woman? at Statistical Modeling.  A lot of what they post is over my head, but enough isn't to make for rewarding reading.  The phrase captures a belief I've had. It goes along with believing that most generalizations could be rephrased statistically, as in "Americans believe..." There's a statistical phrasing for "Americans"--is it "the average American", "the young American", "white Americans", "living Americans" etc. etc.  And what they believe can also be rephrased.


Monday, July 05, 2021

Collecting Statistics: Problems and Progress



GovExec had an article on the problems with data collection during the pandemic from the Covid Tracking Project.
Above and beyond any individual reporting practice, we believe that it was the lack of explanations from state governments and, most crucially, the CDC that led to misuse of data and wounded public trust. We tried our best to provide explanations where possible, and we saw transformation when we were able to get the message across to the public. Data users who were frustrated or even doubtful came to trust the numbers. Journalists reported more accurately. Hospitals could better anticipate surges.

If we could make just one change to the way state and federal COVID-19 data were reported, it would be to make an open acknowledgment of the limitations of public-health-data infrastructure whenever the data is presented. And if we could make one plea for what comes next, it’s that these systems receive the investment they deserve.

[Updated: Technology Review describes a consortium to collect and standardize covid data into one database for research purposes. It sounds a bit kludgey but that's the penalty for prioritizing privacy and silos over a rationalized centralized system. The question is whether we'll wake up and fund continuing efforts of this sort.]

Saturday, March 27, 2021

Pandemic Data Problems

 I've posted a time or two on the need for the federal government to improve its statistical/data collection processes.  

Here's a long discussion of the problems with the covid-19 data collection.

Sunday, February 07, 2021

On Improving Statistical Infrastructure

The Covid Tracking project announces its end. 

But the work itself—compiling, cleaning, standardizing, and making sense of COVID-19 data from 56 individual states and territories—is properly the work of federal public health agencies. Not only because these efforts are a governmental responsibility—which they are—but because federal teams have access to far more comprehensive data than we do, and can mandate compliance with at least some standards and requirements.

I wholeheartedly agree with this, and hope the Biden/Harris administration devotes money and attention to improving our statistical infrastructure, given the deficiencies revealed by the pandemic.

Saturday, November 30, 2019

Produce Waste

The PBS Newshour is showing a piece on food waste, featuring an effort in California. It is part of a weeklong effort.  This particular one is laudable, featuring coordinators and software packages. 

But my contrarian side is present whenever I hear an estimate of "pounds of food wasted".  Looking at the produce shown, the pounds wasted include peach pits, watermelon rinds, etc.  I know measuring the "waste" is hard, and maybe there is a benefit to using fuzzy statistics: they stir up activism.  My instinct, however, is that better stats, more solid stats, are the way you build the base for a social movement, for changing norms. 

Wednesday, October 17, 2018

MFP and Farmers.gov

Got a tweet announcing the latest figures on MFP applications and payments.  I now can't find the tweet, not sure what's the matter. 

Two things I'd like farmers.gov to do:

  1. provide online access to FSA data, like the applications and payments.  It seems to me that FSA administrators at each level should be watching the data.  (That was true when I worked for them, but we never did. But with the centralization of the payment process it should be easy to do, and there are no privacy concerns that I can see.)
  2. provide a user-friendly interface to the USDA data silos.  Does anyone outside USDA understand which data ERS has and which data NASS has?  Damned few, is my guess.  It shouldn't be too hard to present the data without regard to the organizational parents.

Friday, August 04, 2017

USDA Statistics Suck

You'd think having spent my career in USDA I'd have a good grasp of how to navigate the USDA statistics.

You'd be wrong.  Perhaps the problem is increasing senility.  I prefer to believe the problem is that USDA's statistical apparatus is stuck in the middle of the last century, pre-computer.

What's most recently teed me off is dairy (see my previous post).  I'm looking for a relatively simple set of figures: the historical number of dairy farms, 19xx to present; the number of cows; and total production for the same period.  Then I could match trends to the New Zealand figures.

USDA has two main statistical agencies: NASS (National Agricultural Statistics Service) and ERS (Economic Research Service).  In addition, if you're looking for figures on foreign ag, FAS (Foreign Agricultural Service) might come into play.  If you're looking for some figures on farm programs, FSA comes into play.

Problem is I've yet to figure out how to get these figures.  The NASS data seems tied to censuses. The best I've done is this ERS document.

I think the basic problem is the statistical series have developed in close conjunction with users in the colleges and industry, so satisfying the needs of John Doe Public was a low priority.  Back in the days of paper, before the internet, people wouldn't be coming to the agencies just to satisfy their curiosity.

Sunday, February 19, 2017

A Rape Is a Rape Is a Rape?

Not so.  This piece on the Swedish "rape crisis" explains why: it's in the definition.

[Updated: Kevin Drum isn't a fan of the article's stats.]

Friday, September 11, 2015

Cops and the 80/20 Rule

Looks like the 80/20 rule operates with respect to the police.  In other words, most cops do their jobs without major conflict with the citizens; some don't.  That's based on Moskos's "Cop in the Hood" blog post on NYC statistics.

Wednesday, December 03, 2014

Hans Rosling Is a Bureaucrat

Via Tyler Cowen at Marginal Revolution, I got to this profile of Hans Rosling, which raised my respect for him considerably.  Rosling is famous for his presentations on world health, economic, and wellbeing statistics.  He comes off very well, and upsets many of my preconceptions.  So I already respected him.

What's new from the article?  He's volunteered to go to Liberia and help on Ebola statistics.  My knee-jerk reaction (I'm a liberal so my knee jerks) is that someone so good at the big picture is likely to be inept at the nitty-gritty which bureaucrats worry about.  Not in the case of Rosling.  For example, there's a difference between showing "blank" for a county's Ebola cases and "0", a big difference. 
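To make the "blank" versus "0" point concrete, here is a minimal sketch. The county names and counts are made up for illustration, and `summarize` is a hypothetical helper, not anything from Rosling's work. `None` stands for "no report received"; `0` stands for "reported, and found no cases".

```python
# Hypothetical weekly reports: None = blank (no report), 0 = reported zero cases.
reports = {"County A": 12, "County B": 0, "County C": None}

def summarize(reports):
    """Return (total cases, number of counties reporting, counties with no report)."""
    reported = {name: n for name, n in reports.items() if n is not None}
    missing = sorted(name for name, n in reports.items() if n is None)
    return sum(reported.values()), len(reported), missing

total, n_reported, missing = summarize(reports)
print(f"{total} cases from {n_reported} reporting counties; no data from {missing}")
```

Treating the blank as a zero would give the same total but would claim three counties were checked when only two were.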

Sunday, June 01, 2014

The VA's Problems: a Failure of Imagination?

Much in the press about the problems with the VA.  I wonder though whether the problem wasn't at bottom a failure of imagination.  What do I mean?

Create a simplistic model of the VA--call it a bathtub.

Flowing into the VA are two flows: one is the flow of old veterans turning to the VA for support.  Now we know, I assume, pretty well the demographics of this group: how many WWII vets, how many Cold War vets, how many Iraq I vets, etc.  and the rates at which each group contacts the VA and the rate at which they are approved for care.  Once approved, I assume we also know averages of how often a vet in a specific age group needs treatment/to see a doctor.  Overall, as this group ages they're probably contacting the VA more and needing treatment more, so the potential workload is increasing.  They're also dying more, so that decreases the workload.

The second flow is of course the post-9/11 vets who need care immediately as they transition from service to civilian status.  I assume that's a bit more unpredictable, and the burden on the VA for treatment is greater, because the treatment of a 22-year-old with PTSD is more difficult than that of a 72-year-old suffering from aging.


So you have two flows of demands.  How big is the bathtub receiving the flows and how big the drain?  I assume we know how many medical professionals are employed and how many vets they can give various types of treatment to. 

Now if the flows in are bigger than the flows out, the bathtub is going to fill up and at some point it's going to overflow.  If they're exactly equal then the delay in appointments is going to represent the lag time in getting the resources to respond to the flows.  If the drain is bigger than the inflow, the appointment delay is going to represent just local conditions.  (Change the bathtub to a supermarket checkout line system--sometimes lines will back up briefly just because.)
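The bathtub can be sketched in a few lines of code. This is a toy model under my own assumptions: one combined inflow of appointment requests per week, one fixed weekly capacity, and invented numbers. `simulate_backlog` is a hypothetical name, not anything the VA uses.

```python
def simulate_backlog(inflow, capacity, weeks, backlog=0):
    """Track the backlog (water in the tub) at the end of each week."""
    history = []
    for _ in range(weeks):
        backlog += inflow                 # new requests pour in
        served = min(backlog, capacity)   # the drain can only take so much
        backlog -= served
        history.append(backlog)
    return history

print(simulate_backlog(inflow=110, capacity=100, weeks=5))              # inflow > drain: tub fills
print(simulate_backlog(inflow=100, capacity=100, weeks=5, backlog=30))  # equal: backlog holds steady
print(simulate_backlog(inflow=90, capacity=100, weeks=5, backlog=30))   # drain > inflow: tub empties
```

The three calls correspond to the three cases above: a growing backlog, a constant lag, and a queue that clears.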

Now if you have metrics covering these items you should be able to validate your appointment time statistics by looking at the rates at which people are contacting the VA (e.g., whether the rate of 72-year-old vets contacting the VA drops from 2000 to 2010).  If the rates drop, that means people are giving up on the VA and going private or not getting care at all.  If the rates are pretty constant, then your stats on waiting for appointments should be good.

  I suspect what happened with the VA is they were measuring people coming in the door, without the imagination to consider the whole universe of potential VA patients. That's my take, anyway, probably wrong.

[Updated--a Vox primer on VA care.]

Thursday, September 12, 2013

Harry Potter Kills

According to Alex Tabarrok at Marginal Revolution, a scientific sampling of people who died in the last year would show that reading Harry Potter novels is strongly correlated with dying young.  If you don't read Harry Potter you're much more likely to live to a ripe old age.  Wish I had known that before I read the series.

[more]


Tuesday, August 06, 2013

Statistics and the "Midpoint": the Case of Dairy

Long long ago I used to be good in math.  No more, but I'm still intrigued by statistics.  A recent ERS study on the consolidation of farms introduced me to a new measure.

We all know the "mean", and some of us know the "mode" and the "median".  The ERS people are using the "midpoint", specifically for cropland.  It's defined (my words) as the number of acres of cropland on a farm such that half the cropland in the country is in farms larger than that, and half is on farms smaller than that.  Because the distribution of acreage among farms is so skewed, with many farms being very small, and a few farms being very large, they argue it gives a better picture of what's happened over the last 25 years.
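Here is a rough sketch of how I read the "midpoint" definition: an acre-weighted median. The farm sizes are invented, and `midpoint` is my own hypothetical function, not ERS's code or their exact method.

```python
from statistics import mean, median

def midpoint(farm_acres):
    """Farm size at which the running total of acres first reaches half of all acres."""
    sizes = sorted(farm_acres)
    half = sum(sizes) / 2
    running = 0
    for size in sizes:
        running += size
        if running >= half:
            return size

# Hypothetical skewed distribution: 90 small farms, 8 mid-sized, 2 very large.
farms = [10] * 90 + [100] * 8 + [1000] * 2

print("mean:", mean(farms))
print("median:", median(farms))
print("midpoint:", midpoint(farms))
```

With these made-up numbers the median farm has 10 acres and the mean is 37, but the midpoint is 1000, because more than half the cropland sits on the two big farms.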

Using the same concept for livestock, they say:
"In 1987, the midpoint dairy herd size was 80 cows; by 2007, it was 570 cows. The change in hogs was even more striking, from 1,200 hogs removed in a year to 30,000. But consolidation was widespread: midpoint head sold for fed cattle doubled between 1987 and 2007, while those for broilers and cow-calf operations (cattle, less than 500 pounds) more than doubled."
80 to 570 cows is jaw-dropping.


Tuesday, November 06, 2012

My Own Prediction

Nate Silver's book will hit the NYTimes best seller list.  (I'm about a third through and it's very good.)

Thursday, February 02, 2012

SSA a Model for Online Operations at FSA

Social Security Administration wanted to have 44 percent of retirees sign up for SS payments online.  That was their announced goal:
The actual percentage who filed online was 41 percent, states the SSA Performance & Accountability Report for Fiscal Year 2011.
I applaud SSA for setting the goal and announcing it, even more for the followup report.  If and when USDA aims to have farmers do more stuff online with FSA, I'd like to see USDA specify what the goals are and what the results are.

Thursday, January 12, 2012

Immigration and False Facts in NYTimes

The NY Times has an interesting op-ed article by a professor Dowell Myers, of SoCal, arguing that the immigration problem is over, because birth rates have fallen drastically, so our policies need to change.   I'd like to believe him.  Unfortunately, his facts are wrong, at least one of them:
Indeed, with millions of people retiring every week [emphasis added], America’s immigrants and their children are crucial to future economic growth: economists forecast labor-force growth to drop below 1 percent later this decade because of retiring baby boomers.
If we have 2 million people retiring each week for 50 weeks, that's 100 million a year, or about 1/3 of the nation retiring in a year. How easy it is to destroy one's credibility.
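The back-of-the-envelope check above can be written out. The 2 million figure is the smallest reading of "millions"; the population number is my own rough assumption for 2012, not something from the op-ed.

```python
RETIREES_PER_WEEK = 2_000_000   # smallest number that counts as "millions"
WEEKS_PER_YEAR = 50             # rounded down, as in the text
US_POPULATION = 313_000_000     # assumption: rough 2012 U.S. population

retirees_per_year = RETIREES_PER_WEEK * WEEKS_PER_YEAR
share = retirees_per_year / US_POPULATION

print(f"{retirees_per_year:,} retirees a year, {share:.0%} of the population")
```

Any larger reading of "millions" only makes the claim more impossible.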

Thursday, November 03, 2011

Florence Nightingale a Mathematician?

Yes, and inventor of a class of graphs.  That's from this interesting site, which says:
Though known as a nurse who changed the standard of health care, she was actually a brilliant mathematician, and the inventor of a class of chart called the polar area diagram.

Saturday, February 26, 2011

How Do You Know a Blogger Is Far Gone

When he writes something like this:
"I CAN’T STOP MYSELF:  I subscribe to all the NASS California Crop Reports.  I love these, mostly because they read like poetry."
His real reason is the eminently logical one: statistics gives him a basis in reality, unlike the ephemera of the media.  And who is he: a very good blogger on water issues in California, water which grows much of our fruits and vegetables.