On NYC’s Open Data Portal and Parking Tickets

The author of the I Quant NY blog profiles an excellent use of NYC’s Open Data portal in a post detailing how the city has been systematically ticketing legally parked cars:

As of late 2008, in NYC you can park in front of a sidewalk pedestrian ramp, as long as it’s not connected to a crosswalk.  It’s all written up in the NYC Traffic Rules, and for more detail, take a look at this article.

Is it a problem that drivers don’t realize that there are some extra parking spots they are now allowed to park in?  Not so much.  But, I’ve got a pedestrian ramp leading to nowhere particular in the middle of my block in Brooklyn, and on occasion I have parked there.  Despite the fact that it is legal, I’ve been ticketed for parking there.  Though I get the tickets dismissed, it’s a waste of everybody’s time. And that got me wondering- How common is it for the police to give tickets to cars legally parked in front of pedestrian ramps?  It couldn’t be just me…

In the past, there was not much you could do to stop something like this. Complaining to your local precinct would at best only solve the problem locally.  But thanks to NYC’s Open Data portal, I was able to look at the most common parking spots in the City where cars were ticketed for blocking pedestrian ramps.   It’s worth taking a moment upfront here to praise the NYPD for offering this dataset to begin with.  Though we are behind on police crime data in the city, we are ahead in other ways and the parking ticket dataset is definitely one of them.  
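
To give a feel for the kind of query involved, here is a minimal sketch of my own (not the author's actual analysis) that counts which locations collect the most pedestrian-ramp tickets. The sample rows, the column names (`violation`, `house_number`, `street`) and the `PEDESTRIAN RAMP` label are all hypothetical stand-ins. The real dataset's schema may well differ.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; the column names and values
# are assumptions and do not reflect the real dataset's schema.
SAMPLE_CSV = """violation,house_number,street
PEDESTRIAN RAMP,10,EXAMPLE ST
PEDESTRIAN RAMP,10,EXAMPLE ST
FIRE HYDRANT,5,EXAMPLE ST
PEDESTRIAN RAMP,22,SAMPLE AVE
PEDESTRIAN RAMP,10,EXAMPLE ST
"""


def top_ramp_locations(csv_text, n=3):
    """Return the n most-ticketed (house_number, street) pairs for ramp tickets."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(
        (row["house_number"], row["street"])
        for row in reader
        if row["violation"] == "PEDESTRIAN RAMP"
    )
    return counts.most_common(n)


top = top_ramp_locations(SAMPLE_CSV)
for (house_number, street), tickets in top:
    print(house_number, street, tickets)
```

An address that shows up again and again for the same violation is a hint worth checking by hand, which is roughly the path the quoted post describes.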

The response the author received from the NYPD speaks volumes (an admission of the mistake and a promise to fix it with proper training):

Mr. Wellington’s analysis identified errors the department made in issuing parking summonses. It appears to be a misunderstanding by officers on patrol of a recent, abstruse change in the parking rules.  We appreciate Mr. Wellington bringing this anomaly to our attention.

The department’s internal analysis found that patrol officers who are unfamiliar with the change have observed vehicles parked in front of pedestrian ramps and issued a summons in error. When the rule changed in 2009 to allow for certain pedestrian ramps to be blocked by parked vehicles, the department focused training on traffic agents, who write the majority of summonses.

Yet, the majority of summonses written for this code violation were written by police officers. As a result, the department sent a training message to all officers clarifying the rule change and has communicated to commanders of precincts with the highest number of summonses, informing them of the issues within their command.

Thanks to this analysis and the availability of this open data, the department is also taking steps to digitally monitor these types of summonses to ensure that they are being issued correctly.

Worth reading in its entirety here.

Modeling 3,000 Years of Human History

It’s rare to find an interesting paper on history in the Proceedings of the National Academy of Sciences, so I was glad to stumble upon “War, Space, and the Evolution of Old World Complex Societies” by Peter Turchin et al., who developed a model that uses cultural evolution mechanisms to predict where and when the largest-scale complex societies should have arisen in human history.

From their abstract:

How did human societies evolve from small groups, integrated by face-to-face cooperation, to huge anonymous societies of today, typically organized as states? Why is there so much variation in the ability of different human populations to construct viable states? Existing theories are usually formulated as verbal models and, as a result, do not yield sharply defined, quantitative predictions that could be unambiguously tested with data. Here we develop a cultural evolutionary model that predicts where and when the largest-scale complex societies arose in human history. The central premise of the model, which we test, is that costly institutions that enabled large human groups to function without splitting up evolved as a result of intense competition between societies—primarily warfare. Warfare intensity, in turn, depended on the spread of historically attested military technologies (e.g., chariots and cavalry) and on geographic factors (e.g., rugged landscape). The model was simulated within a realistic landscape of the Afroeurasian landmass and its predictions were tested against a large dataset documenting the spatiotemporal distribution of historical large-scale societies in Afroeurasia between 1,500 BCE and 1,500 CE. The model-predicted pattern of spread of large-scale societies was very similar to the observed one. Overall, the model explained 65% of variance in the data. An alternative model, omitting the effect of diffusing military technologies, explained only 16% of variance. Our results support theories that emphasize the role of institutions in state-building and suggest a possible explanation why a long history of statehood is positively correlated with political stability, institutional quality, and income per capita.

The simulation runs from 1500 B.C.E. to 1500 C.E., so it covers the growth of societies such as Mesopotamia and ancient Egypt, and the model explains 65 percent of the variance in the historical data.

Smithsonian Magazine summarizes:

Turchin began thinking about applying math to history in general about 15 years ago. “I always enjoyed history, but I realized then that it was the last major discipline which was not mathematized,” he explains. “But mathematical approaches—modeling, statistics, etc.—are an inherent part of any real science.”

In bringing these sorts of tools into the arena of world history and developing a mathematical model, his team was inspired by a theory called cultural multilevel selection, which predicts that competition between different groups is the main driver of the evolution of large-scale, complex societies. To build that into the model, they divided all of Africa and Eurasia into gridded squares which were each categorized by a few environmental variables (the type of habitat, elevation, and whether it had agriculture in 1500 B.C.E.). They then “seeded” military technology in squares adjacent to the grasslands of central Asia, because the domestication of horses—the dominant military technology of the age—likely arose there initially.

Over time, the model allowed for domesticated horses to spread between adjacent squares. It also simulated conflict between various entities, allowing squares to take over nearby squares, determining victory based on the area each entity controlled, and thus growing the sizes of empires. After plugging in these variables, they let the model simulate 3,000 years of human history, then compared its results to actual data, gleaned from a variety of historical atlases.
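
To make the mechanics concrete, here is a toy sketch of my own showing the general idea described above: a grid of squares, a technology that diffuses to adjacent squares, and conflicts whose outcome depends on the area each side controls. It is not the authors' model. The grid size, the single seed square, the one-attack-per-step loop and the win probability `area[a] / (area[a] + area[d])` are all simplifying assumptions I made up for illustration.

```python
import random
from collections import Counter


def neighbors(r, c, rows, cols):
    """Yield the up/down/left/right neighbors that lie inside the grid."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def spread_tech(tech, rows, cols):
    """One diffusion step: every square adjacent to a tech square gains it."""
    new = set(tech)
    for r, c in tech:
        new.update(neighbors(r, c, rows, cols))
    return new


def conflict_step(owner, tech, rows, cols, rng):
    """A random tech-equipped square attacks one random neighbor.

    Toy rule (an assumption): the attacker wins with probability
    proportional to its polity's share of the two polities' combined area.
    """
    area = Counter(owner.values())
    attacker = rng.choice(sorted(tech))
    target = rng.choice(list(neighbors(*attacker, rows, cols)))
    a, d = owner[attacker], owner[target]
    if a != d and rng.random() < area[a] / (area[a] + area[d]):
        owner[target] = a  # the winning polity absorbs the square


def simulate(rows=5, cols=5, steps=200, seed=0):
    """Run the toy model; return (square -> polity id, set of tech squares)."""
    rng = random.Random(seed)
    # Every square starts as its own one-square polity.
    owner = {(r, c): r * cols + c for r in range(rows) for c in range(cols)}
    tech = {(0, 0)}  # hypothetical seed square for the military technology
    for _ in range(steps):
        tech = spread_tech(tech, rows, cols)
        conflict_step(owner, tech, rows, cols, rng)
    return owner, tech


owner, tech = simulate()
print("polities remaining:", len(set(owner.values())))
```

Even a toy like this shows why the seeding location matters: squares near the seed get the technology first, so they get to attack first and have a head start on accumulating area.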

Click here to see a movie of the model in action.

Of particular interest to me was the discussion of the model’s limitations (100-year sampling and the exclusion of the Greek city-states):

Due to the nature of the question addressed in our study, there are inevitably several sources of error in historical and geographical data we have used. Our decision to collect historical data only at 100-year time-slices means that the model ‘misses’ peaks of some substantial polities such as the Empire of Alexander the Great, or Attila’s Hunnic Empire. This could be seen as a limitation for traditional historical analyses because we have not included a few polities known to be historically influential. However, for the purposes of our analyses this is actually strength. Using a regular sampling strategy allows us to collect data in a systematic way independent of the hypothesis being tested rather than cherry-picking examples that support our ideas.

We have also only focused on the largest polities, i.e those that were approximately greater than 100,000 km². This means that some complex societies, such as the Ancient Greek city states, are not included in our database. The focus on territorial extent is also a result of our attempt to be systematic and minimize bias, and this large threshold was chosen for practical considerations. Historical information about the world varies partly in the degree to which modern societies can invest in uncovering it. Our information about the history of western civilization, thus, is disproportionately good compared to some other parts of the world. Employing a relatively large cut-off minimizes the risk of “missing” polities with large populations in less well-documented regions and time-frames, because the larger the polity the more likely it is to have left some trace in the historical record. At a smaller threshold there are simply too many polities about which we have very little information, including their territories, and the effects of a bias in our access to the historical record is increased.

Overall, I think the supporting information for the model is a much more interesting read than the paper itself.

ZestFinance and the Nuances of Modeling Credit Risk

Pando Daily has a post about Peter Thiel leading a $20 million funding round for a four-year-old company called ZestFinance. Its goal is to better predict consumer behavior. The company models more than 10,000 data points and arrives at more than 70,000 potential signals of consumer behavior. This was the most interesting bit in the article:

Not all signals are obvious, Merrill explains, noting for example that the way a consumer types their name in the credit application – using all lowercase, all uppercase, or correct case – can be a predictor of credit risk. Other seemingly trivial data points include whether an applicant has read a letter on the company’s website and whether the applicant has a pre-paid or post-paid cell phone.
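
As a rough illustration of how a signal like name capitalization might be turned into a model input, here is a small sketch of my own. It is not ZestFinance's method. The function name `name_case_style` and the four category labels are hypothetical, and whether any category actually predicts risk is something only real data could show.

```python
def name_case_style(name):
    """Classify how a name was typed: all_lower, all_upper, proper, or other."""
    letters = [ch for ch in name if ch.isalpha()]
    if not letters:
        return "other"
    if all(ch.islower() for ch in letters):
        return "all_lower"
    if all(ch.isupper() for ch in letters):
        return "all_upper"
    # Treat a name as "proper" when every word starts with a capital letter.
    if all(word[0].isupper() for word in name.split()):
        return "proper"
    return "other"


for sample in ("john smith", "JOHN SMITH", "John Smith", "jOhn smith"):
    print(sample, "->", name_case_style(sample))
```

A categorical value like this would then sit alongside thousands of other inputs, and the model, not the analyst, decides how much weight it deserves.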

ZestFinance has evolved its business model into that of an underwriting service provider to third-party subprime lenders, “exiting the lending business to avoid the appearance of competition with its new partners.” It will be interesting to see whether its methodology gains acceptance in the wider banking sector in the years to come.

Read the entire post here.

Liu Qianping: Fashionable Chinese Grandpa

The Wall Street Journal profiles Liu Qianping, a 72-year-old grandfather who has taken the Internet by storm by modeling clothes:

He owes his star turn to his granddaughter, Lu Ting, a clothier who struggled for months to find a model who could boost her online store without breaking the bank. “He’s just so slender,” Ms. Lu says of her 110-pound grandfather. She notes that he looks great in crimson dresses and credits him for more than quadrupling her sales in recent weeks.

Mr. Liu’s ascent in the modeling realm speaks volumes about shifting cultural mores in a fast-aging society. The waif of a man, who goes about in a three-piece suit and a bow-tie when he isn’t clad in pink satin, is among a cadre of Chinese seniors who are all too familiar with cultural upheaval. Their lives have been marked by unimaginable change—from surviving famine to the advent of fast food. Along the way, many have adopted a devil-may-care approach that flies in the face of stereotypes about conservative Asian elders.

Thank you, Internet, for helping break all kinds of stereotypes. Read the entire story here.