Ngrams: The New Tool for Social Research

Description
The interactive “Google Books Ngram Viewer” is a useful tool for historical and contemporary economic analysis. This essay addresses the nature and availability of ngram data. It also demonstrates a method for combining time-series ngram and economic data in order to discover correlations between published ideas and opinions on one hand, and the trend of economic prosperity on the other.
James A. Montanye

Abstract: The interactive “Google Books Ngram Viewer” has sparked interest in the relative frequency of words and phrases contained in books published during the last few centuries. The online system complements existing methods for literary and related analyses. Less obvious, perhaps, is its usefulness as a tool for historical and contemporary economic analysis. This essay addresses the nature and availability of ngram data. It also demonstrates a method for combining time-series ngram and economic data in order to discover correlations between published ideas and opinions on one hand, and the trend of economic prosperity on the other.

The interactive “Google Books Ngram Viewer” has sparked both curiosity and serious interest in the relative frequency of words and phrases contained in books published during the last few centuries. The online Viewer, which was introduced in 2009 and improved in 2012, complements existing methods for literary and related analyses. Less obvious, perhaps, is its usefulness as a tool for economic analysis. Changes in the relative frequency of words and phrases reveal not only variations in language usage and the relative significance of ideas and opinions, but also the changing relationship between scholarly and intellectual thinking on one hand, and the trend of economic prosperity on the other hand.

Ngrams are relative, quantitative measures of word frequencies. The word itself is a term of art in the argot of computational linguistics. (Note that “ngrams” are unrelated to “engrams,” the latter being hypothetical changes in brain states that explain the process of memory.) Google’s interactive Ngram Viewer accesses a database of word frequencies calculated from the millions of volumes that Google has scanned and archived. The data are stratified by year, language, and in some cases by region.
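
To make the notion of a relative frequency concrete, the following minimal sketch counts how often a phrase occurs in each year and divides by the number of same-length word sequences in that year. The function name, the toy corpus, and the choice of denominator are illustrative assumptions; Google’s own normalization may differ in detail.

```python
def relative_frequencies(tokens_by_year, ngram):
    """Return {year: share of same-length word sequences equal to `ngram`}.

    Assumption: frequency = matches / total n-word sequences in that year.
    """
    target = tuple(ngram.lower().split())
    n = len(target)
    result = {}
    for year, tokens in sorted(tokens_by_year.items()):
        words = [t.lower() for t in tokens]
        # Every run of n consecutive words is one candidate ngram.
        grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
        result[year] = grams.count(target) / len(grams) if grams else 0.0
    return result


# Hypothetical toy corpus, invented purely for illustration.
toy_corpus = {
    1800: "the rights of property and the duties of man".split(),
    1900: "property rights and civil rights and human rights".split(),
}

freq = relative_frequencies(toy_corpus, "property rights")
for year, share in freq.items():
    print(f"{year}: {share:.4%}")
```

Because each year is normalized by its own total, years with few scanned books can be compared with years that have many, although small denominators make the early values noisier.
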
The corpus comprising English-language books published between 1800 and 2000 is searched by default. English-language volumes dating from 1500 are represented within a separate corpus headed “English One Million (2009).” Ngram patterns are more volatile in earlier years due to the smaller number of books represented. The Viewer, which embodies many search and computational options, can be accessed freely online at http://books.google.com/ngrams using Google’s Chrome browser. For any given word, or phrase containing five words or fewer, the Viewer creates a line graph showing the relative frequency of usage by year. (The underlying data can be extracted from the web page’s source code via the browser’s “view source” tool.) A display depicting two or more ngrams reveals how words, concepts, and ideas have shifted relative to each other over time.

Figure 1 demonstrates the Viewer’s operation. A wild-card search (“* rights”) of the default “English” ngram corpus reveals that the discussion of “rights” in books published during the years 1800 through 2000 mostly addressed “human rights,” “civil rights,” and “property rights” (which gradually supplanted the more cumbersome phrase, “rights of property”). Further searches on “property rights” in the separate “American” and “British” corpora reveal that the discussion began in each country at least a century before Ronald Coase’s (1960) seminal paper on social cost. The discussion led prosperity growth in America after 1900, but did not bloom in British texts (presumably both pro and con) until well into the conservative Thatcher era. The “prosperity” trend line appearing in this and all other Figures was derived outside of Google’s system, in the manner described in a following section.
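
The wild-card search described above can be imitated offline on any table of two-word counts. The sketch below is only an illustration of the idea, not a description of how the Viewer is implemented; the function name and the counts are hypothetical.

```python
def wildcard_matches(bigram_counts, pattern, top=3):
    """Rank two-word phrases that match a pattern such as '* rights'."""
    first, second = pattern.lower().split()
    hits = {}
    for phrase, count in bigram_counts.items():
        words = phrase.lower().split()
        if len(words) != 2:
            continue  # this sketch handles two-word phrases only
        # '*' stands for any single word in that position.
        if first in ("*", words[0]) and second in ("*", words[1]):
            hits[phrase] = count
    return sorted(hits.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical counts, invented purely for illustration.
toy_counts = {
    "human rights": 120,
    "civil rights": 95,
    "property rights": 60,
    "water rights": 8,
    "property values": 40,
}

top_hits = wildcard_matches(toy_counts, "* rights")
print(top_hits)
```
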
Figure 1
“Property Rights” in Britain and America
[Line graph, 1800 to 2000. Left axis: ngram frequency (0.0000% to 0.0008%). Right axis: prosperity in $ (0 to 7,000). Series: America, Britain, Prosperity.]
Data sources: Google (2013) and DeLong (1998).

The ngram patterns seen here might be significantly different if Google’s database included journals as well as books. This lacuna, among other intrinsic limitations, implies that the data must be interpreted cautiously. Even so, the results provide useful clues in the search for economic relationships and related causalities.

Quantifying Intellectual and Scholarly Opinion

Ngram data can be used to establish relationships between rising economic prosperity on one hand, and intellectual and scholarly ideas and opinions concerning the creation and distribution of that prosperity on the other hand. Words and ideas matter in this realm. The venerable economist and historian Deirdre McCloskey (see, for example, 2010) has long stressed the significance of innovative ideas, virtuous beliefs, and rhetoric as drivers of social and economic change. Joseph Schumpeter (1942) famously argued that rising prosperity fuels an increase in the supply of intellectual ideas and opinions that carry the seeds of creative social and economic destruction (see also Greenspan 2013, 189–213). Consider, for example, Robert F. Kennedy’s familiar campaign quip from the 1960s to the effect that most people see conditions as they are and ask, “Why?,” while Kennedy imagined how conditions might be improved and asked, “Why not?” Woodrow Wilson (1891, qtd. in Pestritto 2012, 333) wrote “that a nation is an organic thing; and that it dwells with those who do the practical thinking and organize the best concert of action; those who hit upon opinions fit to be made prevalent, and have the capacity to make them so.”

The economist Thomas Sowell (2009, 5, 282) emphasizes the importance of studying the process by which intellectual and scholarly ideas and opinions develop and spread: “Because of the enormous impact that intellectuals can have, both when they are well known and when they are unknown, it is critical to try to understand the patterns of their behavior and the incentives and constraints affecting those patterns. ... [Most consequential] is their creating a general set of presumptions, beliefs and imperatives – a vision – that serves as a general framework for the way particular issues and events that come along are perceived.”

Former Federal Reserve Chair Alan Greenspan (2013, 258) makes the point more narrowly: “The roots of the issue of economic fairness, rarely discussed outside the halls of academia, date back to the long-simmering debate about who among the multitude of economic participants in the interconnected capitalist production process has valid claims on shares of its output. To this day, it remains an issue in dispute.” The historical course of this dispute can be tracked through time using Google’s Ngram Viewer.

Quantifying Prosperity

Economist J. Bradford DeLong (1998) has assembled economic growth estimates spanning the years 100,000 BCE to the present. His “preferred” point estimates of real purchasing power provide a handy proxy for tracking changes in economic prosperity. His data series contains estimates of real per-capita gross world product expressed in hypothetical 1990 “international” (Geary-Khamis) dollars. The shape and timing of this series are comparable to other published estimates (see, for example, Clark 2007, 2).
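
Pairing an ngram series with sparse prosperity benchmarks requires putting both on a common set of years. The sketch below shows one simple way this might be done; it is not necessarily the author’s procedure. The prosperity index is built from the cumulative increases over antiquity that this section quotes from DeLong (18, 41, 392, and 4,638 percent for 1700, 1800, 1900, and 2000), with antiquity set to 1. The ngram values are hypothetical placeholders, and both the linear interpolation and the Pearson correlation are assumptions made for illustration.

```python
import math

# Cumulative percentage increase over antiquity, as quoted in the essay.
INCREASE_PCT = {1700: 18, 1800: 41, 1900: 392, 2000: 4638}
# Index with antiquity = 1.0 (for example, +392 percent becomes 4.92).
PROSPERITY = {year: 1 + pct / 100 for year, pct in INCREASE_PCT.items()}


def interpolate(benchmarks, year):
    """Linearly interpolate a value for `year` between neighboring benchmarks."""
    years = sorted(benchmarks)
    if not years[0] <= year <= years[-1]:
        raise ValueError(f"{year} lies outside the benchmark range")
    for lo, hi in zip(years, years[1:]):
        if lo <= year <= hi:
            weight = (year - lo) / (hi - lo)
            return benchmarks[lo] + weight * (benchmarks[hi] - benchmarks[lo])
    return benchmarks[years[0]]  # only reached when there is one benchmark


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ngram frequencies (percent), invented purely for illustration.
ngram_series = {1800: 0.00001, 1850: 0.00002, 1900: 0.00005,
                1950: 0.00020, 2000: 0.00060}

years = sorted(ngram_series)
prosperity = [interpolate(PROSPERITY, y) for y in years]
frequency = [ngram_series[y] for y in years]
r = pearson(frequency, prosperity)
print(f"correlation over {years[0]}-{years[-1]}: {r:.3f}")
```

Linear interpolation is the simplest choice; because prosperity grows roughly exponentially after 1900, interpolating the logarithm of the index may be a better fit. Two series that both trend upward will correlate strongly almost by construction, so a high coefficient here is a clue for further analysis, not evidence of causation.
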
DeLong’s estimates show prosperity increasing by an estimated total of two percent between the Athenian Golden Age and the beginning of the seventeenth century. In other words, the average individual living in the year 1600 CE enjoyed roughly the same standard of living as antiquity’s average philosopher. By 1700, the eve of the Industrial Revolution (which conventionally spans the years 1750 to 1850), purchasing power had increased by a total of 18 percent over that of ancient times. The increase between 1600 and 1700 implies not only a somewhat different set of social mores, norms, customs, and practices (see McCloskey 2010; King 2013), but also revised ideas and beliefs about how this economic “windfall” might be distributed. By 1800, purchasing power had increased by 41 percent over that of antiquity. By 1900 it had increased by 392 percent. The growth rate inflected upward following World War Two, and by 2000 the cumulative increase reached 4,638 percent. This pattern richly implies that modern social ideas and opinions differ not only from those of ancient Athens, but also from those of colonial America.

An Illustrative Study in Ngrams

Comparing changing economic prosperity with ngram patterns often reveals startling correlations that suggest profitable avenues for further analysis. Two related issues are explored in the sections and Figures below: (i) the evolution of the “rights” concept, and (ii) the redistribution of rights through rent seeking.

Evolution of the Rights Concept

In antiquity, only God (or “the gods”) had rights. Individuals had only a duty to obey the “natural law.” Mainstream philosophy and law at that time entailed no concept of individual rights (Maine 1861, 269–70). This arrangement prevailed until John Locke reasoned, in his Second Treatise of Government (1689), that the duties imposed by God’s natural law necessarily entailed a scheme of reflexive natural human rights in life, liberty, and property.
Individual rights eventually came to be emphasized independently of the correlative duties on which Locke predicated them. Ngram data drawn from the “English One Million (2009)” corpus show “rights” and “duties” tracking each other from 1700 until the late nineteenth century, at which point the relevance of correlative “duties” declined. This pattern is visualized in Figure 2. The shift’s timing coincided not only with rising prosperity, but also with the “progressive” economic and social ideals espoused by Thomas Paine and Karl Marx, among others.

Figure 2
Rights, Duties, and Prosperity
[Line graph, 1700 to 2000. Left axis: ngram frequency (0.0000% to 0.0180%). Right axis: prosperity in $ (0 to 7,000). Series: Rights, Duties, Prosperity.]
Data sources: Google (2013) and DeLong (1998).

Figure 2 spans the Industrial Revolution period. No explanation for the Revolution’s timing, and the spurt of prosperity growth that foreshadowed it, entirely satisfies economic historians despite a thick and diverse literature (see McCloskey 2010 for a critique). Many scholars attribute these developments to the Protestant work ethic. Economists, by comparison, often interpret them teleologically; that is, as the inevitable