
Data over Space and Time, Lecture 7: Linear Prediction for Time Series


"Maximum Mean Discrepancy for Training Generative Adversarial Networks" (TODAY at the statistics seminar)

Attention conservation notice: Last-minute notice of a technical talk in a city you don't live in. Only of interest if you (1) care about actor/critic or co-training methods for fitting generative models, and (2) have free time in Pittsburgh this afternoon.

I have been remiss in blogging the statistics department's seminars for the new academic year. So let me try to rectify that:

Arthur Gretton, "The Maximum Mean Discrepancy for Training Generative Adversarial Networks"
Abstract: Generative adversarial networks (GANs) use neural networks as generative models, creating realistic samples that mimic real-life reference samples (for instance, images of faces, bedrooms, and more). These networks require an adaptive critic function while training, to teach the networks how to improve their samples to better match the reference data. I will describe a kernel divergence measure, the maximum mean discrepancy, which represents one such critic function. With gradient regularisation, the MMD is used to obtain current state-of-the-art performance on challenging image generation tasks, including 160 × 160 CelebA and 64 × 64 ImageNet. In addition to adversarial network training, I'll discuss issues of gradient bias for GANs based on integral probability metrics, and mechanisms for benchmarking GAN performance.
Time and place: 4:00--5:00 pm on Monday, 24 September 2018, in the Mellon Auditorium (room A35), Posner Hall, Carnegie Mellon University

As always, talks are free and open to the public.

Enigmas of Chance

Data over Space and Time, Lecture 8: Linear Prediction for Spatial and Spatio-Temporal Random Fields

Data over Space and Time, Lectures 9--13: Filtering, Fourier Analysis, African Population and Slavery, Linear Generative Models


I have fallen behind on posting announcements for the lectures, and I don't feel like writing five of these at once (*). So I'll just list them:

  1. Separating Signal and Noise with Linear Methods (a.k.a. the Wiener filter and seasonal adjustment; .Rmd)
  2. Fourier Methods I (a.k.a. a child's primer of spectral analysis; .Rmd)
  3. Midterm review
  4. Guest lecture by Prof. Patrick Manning: "African Population and Migration: Statistical Estimates, 1650--1900" [PDF handout]
  5. Linear Generative Models for Time Series (a.k.a. the eigendecomposition of the evolution operator is the source of all knowledge; .Rmd)
  6. Linear Generative Models for Spatial and Spatio-Temporal Data (a.k.a. conditional and simultaneous autoregressions; .Rmd)

*: Yes, this is a sign that I need to change my workflow. Several readers have recommended Blogdown, which looks good, but which I haven't had a chance to try out yet.

Corrupting the Young; Enigmas of Chance

Revised and Extended Remarks at "The Rise of Intelligent Economies and the Work of the IMF"

Data over Space and Time, Lectures 14 and 15: Inference for Dependent Data

Data over Space and Time, Lecture 17: Simulation

In Memoriam Joyce Fienberg


I met Joyce through her late husband Stephen, my admired and much-missed colleague. I won't pretend that she was a close friend, but she was a friend, and you could hardly hope to meet a kinder or more decent person. A massacre by a deluded bigot would be awful enough even if his victims had been prickly and unpleasant individuals. But that he murdered someone like Joyce --- five blocks from where I live --- makes it especially hard to take. I am too sad to have anything constructive to say, and too angry at living in a running morbid joke to remember her the way she deserves.


Data over Space and Time, Lectures 18 and 19: Simulation for Inference

Course Announcement: Advanced Data Analysis (36-402/36-608), Spring 2019

Attention conservation notice: Announcement of an advanced undergraduate course at a school you don't attend in a subject you don't care about.

I will be teaching 36-402/36-608, Advanced Data Analysis, in the spring.

This will be the seventh time I'll have taught it, since I took it over and re-vamped it in 2011. The biggest change from previous iterations will be in how I'll be handling classroom time, by introducing in-class small-group exercises. I've been doing this in this semester's class, and it seems to at least not be hurting their understanding, so we'll see how well it scales to a class with four or five times as many students.

(The other change is that by the time the class begins in January, the textbook will, inshallah, be in the hands of the publisher. I've finished adding everything I'm going to add, and now it's a matter of cutting stuff, and fixing mistakes.)

Advanced Data Analysis from an Elementary Point of View

Data over Space and Time, Lecture 20: Markov Chains

Books to Read While the Algae Grow in Your Fur, October 2018


Attention conservation notice: I have no taste. I also have no qualifications to discuss corporate fraud.

John Carreyrou, Bad Blood: Secrets and Lies in a Silicon Valley Startup
This is a deservedly-famous story, told meticulously. It says some very bad things about the culture around Silicon Valley which made this fraud (and waste) possible. (To be scrupulously fair, investment companies with experience in medical devices and the like don't seem to have bought in.) It also says some very bad things about our elites more broadly, since lots of influential people who were in no position to know anything useful about whether Theranos could fulfill its promises endorsed them, apparently on the basis of will-to-believe and their own arrogance. (I hereby include by reference Khurana's book on the charisma of corporate CEOs, and Xavier Marquez's great post on charisma.)
The real heroes here are, of course, the people who quietly kept following through on established procedures and regulations, and refused to bend to considerable pressure.
Luca D'Andrea, Beneath the Mountain
Mind candy: in which a stranger investigates the secrets of a small, isolated community's past, for multiple values of "past".
Walter Jon Williams, Quillifer
Misadventures of a rogue in a fantasy world whose technology level seems to be about the 1500s in our world. Quillifer has some genuinely horrible things happen to him, and brings others on himself, but keeps bouncing back, and keeps his eye on various main chances (befitting the only law clerk I can think of in fantasy literature who isn't just cannon-fodder). I didn't like him, exactly, but I was definitely entertained.

Books to Read While the Algae Grow in Your Fur; Pleasures of Detection, Portraits of Crime; Scientifiction and Fantastica

Books to Read While the Algae Grow in Your Fur, November 2018


Attention conservation notice: I have no taste. I also have no qualifications to discuss the history of photography, or of black Pittsburgh.

Cheryl Finley, Laurence Glasco and Joe W. Trotter, with an introduction by Deborah Willis, Teenie Harris, Photographer: Image, Memory, History
A terrific collection of Harris's photos of (primarily) Pittsburgh's black community from the 1930s to the 1970s, with good biographical and historical-contextual essays.
Disclaimer: Prof. Trotter is also on the faculty at CMU, but I don't believe we've ever actually met.
Ben Aaronovitch, Lies Sleeping
Mind candy: the latest installment in the long-running supernatural-procedural mystery series, where the Folly gets tangled up with the Matter of Britain.
Charles Stross, The Labyrinth Index
Mind candy: the latest installment in Stross's long-running Lovecraftian spy-fiction series. I imagine a novel about the US Presidency being taken over by a malevolent occult force seemed a lot more amusing before 2016, when this must have been mostly written. It's a good installment, but only suitable for those already immersed in the story.
Anna Lee Huber, The Anatomist's Wife and A Brush with Shadows
Mind-candy, historical mystery flavor. These are the first and sixth books in the series, because I couldn't lay hands on 2--5, but I will. (Update: More.)

Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Pleasures of Detection, Portraits of Crime; Tales of Our Ancestors; Cthulhiana; Heard About Pittsburgh, PA

Books to Read While the Algae Grow in Your Fur, December 2018


Attention conservation notice: I have no taste. I also have no qualifications to discuss poetry or leftist political theory. I do know something about spatiotemporal data analysis, but you don't care about that.

Gidon Eshel, Spatiotemporal Data Analysis
I assigned this as a textbook in my fall class on data over space and time, because I needed something which covered spatiotemporal data analysis, especially principal components analysis, for students who could be taking linear regression at the same time, and which was cheap. This met all my requirements.
The book is divided into two parts. Part I is a review or crash course in linear algebra, building up to decomposing square matrices in terms of their eigenvalues and eigenvectors, and then the singular value decomposition of arbitrary matrices. (Some prior acquaintance with linear algebra will help, but not very much is needed.) Part II is about data analysis, covering some basic notions of time series and autocorrelation, linear regression models estimated by least squares, and "empirical orthogonal functions", i.e., principal components analysis, i.e., eigendecomposition of covariance or correlation matrices. As for "cheap", while the list price is (currently) an outrageous \$105, it's on JSTOR, so The Kids had free access to the PDF through the university library.
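Since "empirical orthogonal functions" are just principal components under another name, here is a minimal numpy sketch of the idea (my own illustration, not code from the book): the EOFs of a space-time data matrix are the right singular vectors of the centered data, equivalently the eigenvectors of the sample covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy space-time data: 200 times, 10 spatial locations, driven by
# one shared spatial pattern plus observational noise.
pattern = np.sin(np.linspace(0, np.pi, 10))    # planted spatial mode
amplitude = rng.normal(size=200)               # its time series
X = np.outer(amplitude, pattern) + 0.1 * rng.normal(size=(200, 10))

# EOFs are the right singular vectors of the centered data matrix,
# equivalently eigenvectors of the sample covariance matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
eofs = Vt                                      # rows are the EOFs
variance_share = s**2 / np.sum(s**2)

# The leading EOF should recover the planted pattern (up to sign)
# and account for nearly all of the variance in this toy example.
similarity = abs(eofs[0] @ pattern) / np.linalg.norm(pattern)
```

With one planted mode and weak noise, the leading EOF carries well over 90% of the variance and lines up (up to sign) with the planted spatial pattern.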
In retrospect, there were strengths to the book, and some serious weaknesses --- some absolute, some just for my needs.
The most important strength is that Eshel writes like a human being, and not a bloodless textbook. His authorial persona is not (thankfully) much like mine, but it's a likeable and enthusiastic one. This is related to his trying really, really hard to explain everything as simply as possible, and with multitudes of very detailed worked examples. I will probably be assigning Part I of the book, on linear algebra, as refresher material to my undergrads for years.
He is also very good at constantly returning to physical insight to motivate data-analytic procedures. (The highlight of this, for me, was section 9.7 [pp. 185ff] on when and why an autonomous, linear, discrete-time AR(1) or VAR(1) model will arise from a forced, nonlinear, continuous-time dynamical system.) If this had existed when I was a physics undergrad, or starting grad school, I'd have loved it.
Turning to the weaknesses, some of them are, as I said, merely ways in which he didn't write the book to meet my needs. His implied reader is very familiar with physics, and not just the formal, mathematical parts but also the culture (e.g., the delight in complicated compound units of measurement, saying "ensemble" when other disciplines say "distribution" or "population"). In fact, the implied reader is familiar with, or at least learning, climatology. But that reader has basically no experience with statistics, and only a little probability (so that, e.g., they're not familiar with rules for algebra with expectations and covariances*). Since my audience was undergraduate and masters-level statistics students, most of whom had only the haziest memories of high school physics, this was a mis-match.
Others weaknesses are, to my mind, a bit more serious, because they reflect more on the intrinsic content.
  • A trivial but real one: the book is printed in black and white, but many figures are (judging by the text) intended to be in color, and are scarcely comprehensible without it. (The first place this really struck me was p. 141 and Figure 9.4, but there were lots of others.) The electronic version is no better.
  • The climax of the book (chapter 11) is principal components analysis. This is really, truly important, so it deserves a lot of treatment. But it's not a very satisfying stopping place: what do you do with the principal components once you have them? What about the difference between principal components / empirical orthogonal functions and factor models? (In the book's terms, the former does a low-rank approximation to the sample covariance matrix $\mathbf{v} \approx \mathbf{w}^T \mathbf{w}$, while the latter treats it as low-rank-plus-diagonal-noise $\mathbf{v} \approx \mathbf{w}^T\mathbf{w} + \mathbf{d}$, an importantly different thing.) What about nonlinear methods of dimensionality reduction? My issue isn't so much that the book didn't do everything, as that it didn't give readers even hints of where to look.
  • There are places where the book's exposition is not very internally coherent. Chapter 8, on autocorrelation, introduces the topic with an example where $x(t) = s(t) + \epsilon(t)$, for a deterministic signal function $s(t)$ and white noise $\epsilon(t)$. Fair enough; this is a trend-plus-noise representation. But it then switches to modeling the autocorrelations as arising from processes where $x(t) = \int_{-\infty}^{t}{w(t-u) x(u) du} + \xi(t)$, where again $\xi(t)$ is white noise. (Linear autoregressions are the discrete-time analogs.) These are distinct classes of processes. (Readers will find it character-building to try to craft a memory kernel $w(u)$ which matches the book's running signal-plus-noise example, where $s(t) = e^{-t/120}\cos{\frac{2\pi t}{49}}$.)
  • I am all in favor of physicists' heuristic mathematical sloppiness, especially in introductory works, but there are times when it turns into mere confusion. The book persistently conflates time or sample averages with expectation values. The latter are ensemble-level quantities, deterministic functionals of the probability distribution. The former are random variables. Under various laws of large numbers or ergodic theorems, the former converge on the latter, but they are not the same. Eshel knows they are not the same, and sometimes talks about how they are not the same, but the book's notation persistently writes them both as $\langle x \rangle$, and the text sometimes flat-out identifies them. (For one especially painful example among many, p. 185.) Relatedly, the book conflates parameters (again, ensemble-level quantities, functions of the data-generating process) and estimators of those parameters (random variables).
  • The treatment of multiple regression is unfortunate. $R^2$ does not measure goodness of fit. (It's not even a measure of how well the regression predicts or explains.) At some level, Eshel knows this, since his recommendation for how to pick regressors is not "maximize $R^2$". On the other hand, his prescription for picking regressors (sec. 9.6.4, pp. 180ff) is rather painful to read, and completely at odds with his stated rationale of using regression coefficients to compare alternative explanations (itself a bad, though common, idea). Very strikingly, the terms "cross-validation" and "bootstrap" do not appear in his index**. Now, to be clear, Eshel isn't worse in his treatment of regression than most non-statisticians, and he certainly understands the algebra backwards and forwards. But his advice on the craft of regression is, to be polite, weak and old-fashioned.
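The point about $R^2$ is easy to demonstrate for oneself. A quick simulation (my sketch, not anything from the book) shows that $R^2$ never decreases as pure-noise regressors are appended, which is why it cannot measure how well a model fits:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)          # the true model uses only x

def r_squared(X, y):
    """R^2 of an ordinary-least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - np.sum(resid**2) / np.sum((y - y.mean())**2)

# Append pure-noise regressors one at a time: R^2 never goes down,
# even though the extra variables explain nothing.
junk = rng.normal(size=(n, 10))
r2 = [r_squared(np.column_stack([x[:, None], junk[:, :k]]), y)
      for k in range(11)]
assert all(b >= a - 1e-12 for a, b in zip(r2, r2[1:]))
```

Adding a column can only shrink the residual sum of squares, so $R^2$ is monotone in the number of regressors, junk or not; hence the need for out-of-sample checks like cross-validation.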
Summing up, the linear-algebra refresher/crash-course of Part I is great, and I even like the principal components chapters in Part II, as far as they go. But it's not ideal for my needs, and there are a bunch of ways I think it could be improved for anyone's needs. What to assign instead, I have no idea.
*: This is, I think, why he doesn't explain the calculation of the correlation time and effective sample size in sec. 8.2 (pp. 123--124), just giving a flat statement of the result, though it's really easy to prove with those tools. I do appreciate finally learning the origin of this beautiful and practical result --- G. I. Taylor, "Diffusion by Continuous Movements", Proceedings of the London Mathematical Society, series 2, volume 20 (1922), pp. 196--212 (though the book's citing it with the wrong year, confusing series number with an issue number, and no page numbers was annoying). ^
**: The absence of "ridge regression" and "Tikhonov regularization" from the index is all the more striking because they appear in section 9.3.3 as "a more general, weighted, dual minimization formalism", which, compared to ordinary least squares, is described as "sprinkling added power ... on the diagonal of an otherwise singular problem". This is, of course, a place where it would be really helpful to have a notion of cross-validation, to decide how much to sprinkle.^
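For readers who want the correlation-time result of sec. 8.2 made concrete: for a stationary series with autocorrelations $\rho_k$, the sample mean behaves as though one had $n_{\mathrm{eff}} = n / (1 + 2\sum_{k \geq 1} \rho_k)$ independent observations. A small Monte Carlo check (my own sketch, using an AR(1), where the sum has a closed form):

```python
import numpy as np

rng = np.random.default_rng(7)
phi, n, reps = 0.8, 1000, 2000

# For an AR(1) the autocorrelations are rho_k = phi^k, so the
# correction factor 1 + 2*sum_k rho_k equals (1 + phi)/(1 - phi),
# giving an effective sample size of
n_eff = n * (1 - phi) / (1 + phi)      # here about 111, not 1000

# Simulate many AR(1) paths, started from the stationary
# distribution, and look at the variance of their sample means.
eps = rng.normal(size=(reps, n))
x = np.empty((reps, n))
x[:, 0] = rng.normal(size=reps) / np.sqrt(1 - phi**2)
for t in range(1, n):
    x[:, t] = phi * x[:, t - 1] + eps[:, t]
means = x.mean(axis=1)

# The variance of the sample mean matches that of an average of
# n_eff iid draws with the same marginal variance.
marginal_var = 1 / (1 - phi**2)
predicted = marginal_var / n_eff
assert abs(means.var() / predicted - 1) < 0.15
```

With $\phi = 0.8$, a thousand correlated observations are worth only about a hundred independent ones, which is the practical bite of Taylor's result.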
Nick Srnicek and Alex Williams, Inventing the Future: Postcapitalism and a World Without Work
It's --- OK, I guess? They have some good points against what they call "folk politics", namely, that it has conspicuously failed to accomplish anything, so doubling down on more of it seems like a bad way to change the world. And they really want to change the world: the old twin goals of increasing human power over the world, and eliminating human power of other humans, are very much still there, though they might not quite adopt that formula. To get there, their basic idea is to push for a "post-work world", one where people don't have to work to survive, because they're entitled to a more-than-subsistence basic income as a matter of right. They realize that making that work will require lots of politics and pushes for certain kinds of technological progress rather than others. This is the future they want --- to finally enter (in Marx's words) "the kingdom of freedom", where we will be able to get on with all the other problems, and possibilities, confronting us.
As for getting there: like a long, long line of leftist intellectuals from the 1960s onwards, Srnicek and Williams are very taken with the idea, going back to Gramsci, that the key to achieving socialism is to first achieve ideological "hegemony". To put it crudely, this means trying to make your idea such broadly-diffused, widely-accepted, scarcely-noticed common notions that when madmen in authority channel voices from the air, they channel you. (In passing: Occupy may have done nothing to reduce economic inequality, but Gramsci's success as a strategist may be measured by the fact that he wrote in a Fascist prison.) Part of this drive for hegemony is pushing for new ideas in economics --- desirable in itself, but they are sure in advance of what inquiry should find *. Beyond this, and saying that many tactics will need to be tried out by a whole "ecology" of organizations and groups, they're pretty vague. There's some wisdom here --- who could propound a detailed plan to get to post-work post-capitalism? --- but also more ambiguity than they acknowledge. Even if a drive for a generous basic income (and all that would go with it) succeeds, the end result might not be anything like the sort of post-capitalism Srnicek and Williams envisage, if only because what we learn and experience along the way might change what seems feasible and desirable. (This is a Popperian point against Utopian plans, but it can be put in other language quite easily**.) I think Srnicek and Williams might be OK with the idea that their desired future won't be realized, so long as some better future is, and that the important point is to get people on the left not to prefigure better worlds in occasional carnivals of defiance, but to try to make them happen. Saying that doing this will require organization, concrete demands, and leadership is pretty sensible, though they do disclaim trying to revive the idea of a vanguard party.
Large portions of the book are, unfortunately, given over to insinuating, without ever quite saying, that post-work is not just desirable and possible, but a historical necessity to which we are impelled by the inexorable development of capitalism, as foreseen by the Prophet. (They also talk about how Marx's actual scenario for how capitalism would develop, and end, not only has not come to pass yet, but is pretty much certain to never come to pass.) Large portions of the book are given over to wide-ranging discussions of lots of important issues, all of which, apparently, they grasp through the medium of books and articles published by small, left-wing presses strongly influenced by post-structuralism --- as it were, the world viewed through the Verso Books catalog. (Perry Anderson had the important advantage, as a writer and thinker, of being formed outside the rather hermetic subculture/genre he helped create; these two are not so lucky.) Now, I recognize that good ideas usually emerge within a community that articulates its own distinctive tradition, so some insularity can be all to the good. In this case, I am not all that far from the authors' tradition, and sympathetic to it. But still, the effect of these two (overlapping) writerly defects is that once the book announced a topic, I often felt I could have written the subsequent passage myself; I was never surprised by what they had to say. Finishing this was a slog.
I came into the book a mere Left Popperian and market socialist, friendly to the idea of a basic income, and came out the same way. My mind was not blown, or even really changed, about anything. But it might encourage some leftist intellectuals to think constructively about the future, which would be good.
Shorter: Read Peter Frase's Four Futures instead.
*: They are quite confident that modern computing lets us have an efficient planned economy, a conclusion they support not by any technical knowledge of the issue but by citations to essays in literary magazines and collections of humanistic scholarship. As I have said before, I wish that were the case, if only because it would be insanely helpful for my own work, but I think that's just wrong. In any case, this is an important point for socialists, since it's very consequential for the kind of socialism we should pursue. It should be treated much more seriously, i.e., rigorously and knowledgeably, than they do. Fortunately, a basic income is entirely compatible with market socialism, as are other measures to ensure that people don't have to sell their labor power in order to live.
**: My own two-minute stab at making chapter 9 of The Open Society and Its Enemies sound suitable for New Left Review: "The aims of the progressive forces, always multifarious, develop dialectically in the course of the struggle to attain them. Those aims can never be limited by the horizon of any abstract, pre-conceived telos, even one designated 'socialism', but will always change and grow through praxis." (I admit "praxis" may be a bit dated.) ^
A. E. Stallings, Like: Poems
Beautiful stuff from one of my favorite contemporary poets. "Swallows" and "Epic Simile" give a fair impression of what you'll find. This also includes a lot of the poems discussed in Cynthia Haven's "Crossing Borders" essay.

Books to Read While the Algae Grow in Your Fur; Enigmas of Chance; Data over Space and Time; The Progressive Forces; The Commonwealth of Letters

Books to Read While the Algae Grow in Your Fur, January 2019


Attention conservation notice: I have no taste. I also have no qualifications to discuss the history of millenarianism, or really even statistical graphics.

Bärbel Finkenstädt, Leonhard Held and Valerie Isham (eds.), Statistical Methods for Spatio-Temporal Systems
This is an edited volume arising from a conference, with all the virtues and vices that implies. (Several chapters have references to the papers which first published the work expounded in other chapters.) I will, accordingly, review the chapters in order.
Chapter 1: "Spatio-Temporal Point Processes: Methods and Applications" (Diggle). Mostly a precis of case studies from Diggle's (deservedly standard) books on the subject, which I will get around to finishing one of these years.
Chapter 2: "Spatio-Temporal Modelling --- with a View to Biological Growth" (Vedel Jensen, Jónsdóttir, Schmiegel, and Barndorff-Nielsen). This chapter divides into two parts. One is about "ambit stochastics". In a random field $Z(s,t)$, the "ambit" of the space-time point-instant $(s,t)$ is the set of point-instants $(q,u)$, with $u \leq t$, whose values can influence the value at $(s,t)$ (roughly, the "past cone" of $(s,t)$). Having a regular geometry for the ambit imposes some tractable restrictions on random fields, which are explored here for models of growth-without-decay. The second part of this chapter will only make sense to hardened habitués of Lévy processes, and perhaps not even to all of them.
Chapter 3: "Using Transforms to Analyze Space-Time Processes" (Fuentes, Guttorp, and Sampson): A very nice survey of Fourier transform, wavelet transform, and PCA approaches to decomposing spatio-temporal data. There's a good account of some tests for non-stationarity, based on the idea that (essentially) we should get the nearly same transforms for different parts of the data if things really are stationary. (I should think carefully about the assumptions and the implied asymptotic regime here, since the argument makes sense, but it also makes sense that sufficiently slow mean-reversion is indistinguishable from non-stationarity.)
Chapter 4: "Geostatistical Space-Time Models, Stationarity, Separability, and Full Symmetry" (Gneiting, Genton, and Guttorp): "Geostatistics" here refers to "kriging", or using linear prediction on correlated data. As every schoolchild knows, this boils down to finding the covariance function, $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)]$. This chapter considers three kinds of symmetry restrictions on the covariance functions: "separability", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = C_S(s_1, s_2) C_T(t_1, t_2)$; the weaker notion of "full symmetry", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = \mathrm{Cov}[Z(s_1, t_2), Z(s_2, t_1)]$; and "stationarity", where $\mathrm{Cov}[Z(s_1, t_1), Z(s_2, t_2)] = \mathrm{Cov}[Z(s_1+q, t_1+h), Z(s_2+q, t_2+h)]$. As the authors explain, while separable covariance functions are often used because of their mathematical tractability, they look really weird; "full symmetry" can do a lot of the same work, at less cost in implausibility.
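To see why separability implies full symmetry, and why physically natural covariances can break both, here is a toy numerical check (mine, not the chapter's; the "transport" covariance is a standard frozen-field example where correlation follows a feature advected at velocity $v$):

```python
import numpy as np

def cov_separable(s1, t1, s2, t2):
    """Separable: a product of a spatial and a temporal covariance."""
    return np.exp(-abs(s1 - s2)) * np.exp(-abs(t1 - t2))

def cov_transport(s1, t1, s2, t2, v=1.0):
    """Non-separable 'frozen field' covariance: correlation tracks a
    feature moving at velocity v, so space and time interact."""
    return np.exp(-abs((s1 - s2) - v * (t1 - t2)))

s1, t1, s2, t2 = 0.0, 0.0, 1.0, 2.0

# Separability implies full symmetry: swapping the time labels
# between the two point-instants leaves the covariance unchanged.
assert np.isclose(cov_separable(s1, t1, s2, t2),
                  cov_separable(s1, t2, s2, t1))

# The transport covariance violates full symmetry: a spatial lag
# aligned with the flow is not equivalent to one against it.
a = cov_transport(s1, t1, s2, t2)   # lag (-1, -2): exp(-|-1 + 2|)
b = cov_transport(s1, t2, s2, t1)   # lag (-1, +2): exp(-|-1 - 2|)
assert not np.isclose(a, b)
```

Since full symmetry fails here, separability fails too; testing full symmetry is thus a cheap way to rule out separable models for data with transport or advection in them.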
Chapter 5: "Space-Time Modelling of Rainfall for Continuous Simulations" (Chandler, Isham, Belline, Yang and Northrop): A detailed exposition of two models for rainfall, at different spatio-temporal scales, and how they are both motivated by and connected to data. I appreciate their frankness about things that didn't work, and the difficulties of connecting the different models.
Chapter 6, "A Primer on Space-Time Modeling from a Bayesian Perspective" (Higdon): Here "space-time modeling" means "Gaussian Markov random fields". Does what it says on the label.
All the chapters combine theory with examples --- chapter 2 is perhaps the most mathematically sophisticated one, and also the one where the examples do the least work. The most useful, from my point of view, were Chapters 3 and 4, but that's because I was teaching a class where I did a lot of kriging and PCA, and (with some regret) no point processes. If you have a professional interest in spatio-temporal statistics, and a fair degree of prior acquaintance, I can recommend this as a useful collection of examples, case studies, and expositions of some detailed topics.
Errata, of a sort: There are supposed to be color plates between pages 142 and 143. Unfortunately, in my copy these are printed in grey, not in color.
Disclaimer: The publisher sent me a copy of this book, but that was part of my fee for reviewing a (different) book proposal for them.
Kieran Healy, Data Visualization: A Practical Introduction
Anyone who has looked at my professional writings will have noticed that my data visualizations are neither fancy nor even attractive, and they never go beyond basic R graphics. This is because I have never learned any other system for statistical visualization. And I've not done that because I'm lazy, and have little visual sense anyway. This book is the best guide I've seen to (1) learning the widely-used, and generally handsome, ggplot library in R, (2) learning the "grammar of graphics" principles on which it is based, and (3) learning the underlying psychological principles which make some graphics better or worse visualizations than others. (This is not to be confused with learning the maxims or even the tacit taste of a particular designer, even one of genius.) The writing is great, the examples are interesting, well-chosen and complete, and the presumptions about how much R, or statistics, you know coming in are minimal. I wish something like this had existed long ago, and I'm tempted, after reading it, to totally re-do the figures in my book. (Aside to my editor: I am not going to totally re-do the figures in my book.) I strongly recommend it, and will be urging it on my graduate students for the foreseeable future.
ObLinkage: The book is online, pretty much.
ObDisclaimer: Kieran and I have been saying good things about each other's blogs since the High Bronze Age of the Internet. But I paid good cash money for my copy, and have no stake in the success of this book.
Anna Lee Huber, Mortal Arts
More historical-mystery mind candy, this time flavored by the (dismal) history of early 19th century psychiatry. (Huber is pretty good, though not perfect, at avoiding anachronistic language, so nobody says "psychiatry" in the novel.)
Norman Cohn, The Pursuit of the Millennium: Revolutionary Millenarians and Mystical Anarchists of the Middle Ages
I vividly remember finding a used copy of this in the UW-Madison student bookstore when I began graduate school, in the fall of 1993, and having my mind blown by reading it that fall*. Coming back to it now, I find it still fascinating and convincing; it does an excellent job of tracing millenarian movements among the poor in Latinate Europe from the fall of Rome through the Reformation. (There are a few bits where he gets a bit psychoanalytic, but the first edition was published in 1957.) If I no longer find it mind-blowing, that's in large part because reading it sparked an enduring interest in millenarianism, and so I've long since absorbed what then (you should forgive the expression) came as a revelation.
The most controversial part of the book, I think, is the conclusion, where Cohn makes it very clear that he thinks there is a great deal of similarity, if not actual continuity, between his "revolutionary millenarians and mystical anarchists" and 20th century political extremism, both of the Fascist and the Communist variety. He hesitates --- wisely, I think --- over whether this is just a similarity, or there is an actual thread of historical continuity; but I think his case for the similarity is sound.
*: I was supposed to be having my mind blown by Sakurai. In retrospect, this incident sums up both why I was not a very good graduate student, and why I will never be a great scientist.

Books to Read While the Algae Grow in Your Fur; Enigmas of Chance; Data over Space and Time; Pleasures of Detection, Portraits of Crime; Tales of Our Ancestors; Psychoceramica; Writing for Antiquity; Commit a Social Science


Data over Space and Time, Lectures 21--24

Data over Space and Time: Self-Evaluation and Lessons Learned

Attention conservation notice: Academic navel-gazing, about a class you didn't take, in a subject you don't care about, at a university you don't attend.

Well, that went better than it could have, especially since it was the first time I've taught a new undergraduate course since 2011.

Some things that worked well:

  1. The over-all choice of methods topics --- combining descriptive/exploratory techniques and generative models and their inference. Avoiding the ARIMA alphabet soup as much as possible both played to my prejudices and avoided interference with a spring course.
  2. The over-all kind and range of examples (mostly environmental and social-historical) and the avoidance of finance. I could have done some more economics, and some more neuroscience.
  3. The recurrence of linear algebra and eigen-analysis (in smoothing, principal components, linear dynamics, and Markov processes) seems to have helped some students, and at least not hurt the others.
  4. The in-class exercises did wonders for attendance. Whether doing the exercises, or that attendance, improved learning is hard to say. Some students specifically praised them in their anonymous feedback, and nobody complained.

Some things did not work so well:

  1. I was too often late in posting assignments, and too many of them had typos when first posted. (This was a real issue with the final. To any of the students reading this: my apologies once again.) I also had a lot of trouble calibrating how hard the assignments would be, so the opening problem sets were a lot more work than the later ones.
    (In my partial defense about late assignments, there were multiple problem sets which I never posted, after putting a lot of time into them, because my initial idea either proved much too complicated for this course when fully executed, or because I was, despite much effort, simply unable to reproduce published papers*. Maybe next time, if there is a next time, these efforts can see the light of day.)
  2. I let the grading get really, really behind the assignments. (Again, my apologies.)
  3. I gave less emphasis to spatial and spatio-temporal models in the second, generative half of the course than they really deserve. E.g., Markov random fields and cellular automata (and kin) probably deserve at least a lecture each, perhaps more.
  4. I didn't build in enough time for review in my initial schedule, so I ended up making some painful cuts. (In particular, nonlinear autoregressive models.)
  5. My attempt to teach Fourier analysis was a disaster. It needs much more time and preparation than I gave it.
  6. We didn't get very much at all into how to think your way through building a new model, as opposed to estimating, simulating, predicting, checking, etc., a given model.
  7. I have yet to figure out how to get the students to do the readings before class.

If I got to teach this again, I'd keep the same over-all structure, but re-work all the assignments, and re-think, very carefully, how much time I spent on which topics. Some of these issues would of course go away if there were a second semester to the course, but that's not going to happen.

*: I now somewhat suspect that one of the papers I tried to base an assignment on is just wrong, or at least could not have done the analysis the way it says it did. This is not the first time I've encountered something like this through teaching... ^

Data over Space and Time

On Godzilla and the Nature and Conditions of Cultural Success; or, Shedding the Skin


Godzilla is an outstanding example of large-scale cultural success, and of how successful cultural items become detached from their original meanings.


"Causal inference in social networks: A new hope?" (Friday at the Ann Arbor Statistics Seminar)

Attention conservation notice: Self-promoting notice of a very academic talk, at a university far from you, on a very recondite topic, solving a problem that doesn't concern you under a set of assumptions you don't understand, and wouldn't believe if I explained to you.

I seem to be giving talks again:

"Causal inference in social networks: A new hope?"
Abstract: Latent homophily generally makes it impossible to identify contagion or influence effects from observations on social networks. Sometimes, however, homophily also makes it possible to accurately infer nodes' latent attributes from their position in the larger network. I will lay out some assumptions on the network-growth process under which such inferences are good enough that they enable consistent and asymptotically unbiased estimates of the strength of social influence. Time permitting, I will also discuss the prospects for tracing out the "identification possibility frontier" for social contagion.
Joint work with Edward McFowland III
Time and place: 11:30 am -- 12:30 pm on 8 February 2019, in 411 West Hall, Statistics Department, University of Michigan

--- The underlying paper grows out of an idea that was in my paper with Andrew Thomas on social contagion: latent homophily is the problem with causal inference in social networks, but latent homophily also leads to large-scale structure in networks, and allows us to infer latent attributes from the graph; we call this "community discovery". Some years later, my student Hannah Worrall, in her senior thesis, did an extensive series of simulations showing that controlling for estimated community membership lets us infer the strength of social influence, in regimes where community-discovery is consistent. Some years after that, Ed asked me what I was wanting to work on, but wasn't, so I explained about what seemed to me the difficulties in doing some proper theory about this. As I did so, the difficulties dissolved under Ed's questioning, and the paper followed very naturally. We're now revising in reply to referees (Ed, if you're reading this --- I really am working on it!), which is as pleasant as always. But I am very pleased to have finally made a positive contribution to a problem which has occupied me for many years.
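The idea can be sketched in a toy simulation --- a minimal illustration, not the paper's actual setup: all parameter values are invented, and spectral clustering on the adjacency matrix stands in for whatever community-discovery method one actually prefers. A latent community label drives both who befriends whom (homophily) and individual behavior, so naively regressing current behavior on neighbors' past behavior overstates influence; adding the *estimated* community as a covariate roughly recovers the true influence coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
z = np.repeat([0, 1], n // 2)              # latent community labels

# Homophilous graph: within-community ties much more likely
p_in, p_out = 0.10, 0.01
P = np.where(z[:, None] == z[None, :], p_in, p_out)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                # symmetric, no self-loops

# The latent label drives both ties (above) and behavior (below)
b_true, c = 0.5, 2.0
y0 = 2.0 * z + rng.normal(0, 1.0, n)       # past behavior
deg = A.sum(axis=1)
nbr_mean = np.where(deg > 0, A @ y0 / np.maximum(deg, 1), 0.0)
y1 = b_true * nbr_mean + c * z + rng.normal(0, 0.3, n)

# Naive estimate: regress y1 on neighbors' past behavior only
X_naive = np.column_stack([np.ones(n), nbr_mean])
b_naive = np.linalg.lstsq(X_naive, y1, rcond=None)[0][1]

# Community discovery: sign of the second-leading eigenvector of A
vals, vecs = np.linalg.eigh(A)             # eigenvalues ascending
z_hat = (vecs[:, -2] > 0).astype(float)

# Controlled estimate: include the estimated community as a covariate
X_ctrl = np.column_stack([np.ones(n), nbr_mean, z_hat])
b_ctrl = np.linalg.lstsq(X_ctrl, y1, rcond=None)[0][1]

print(f"true influence {b_true}, naive {b_naive:.2f}, controlled {b_ctrl:.2f}")
```

With these (made-up) settings the naive coefficient is badly inflated by the omitted community label, while the controlled one lands near the truth --- which is exactly the regime the theory is about: community recovery is consistent here because the blocks are well separated.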

Constant Conjunction Necessary Connexion; Enigmas of Chance; Networks; Self-Centered

Data Over Space and Time


Collecting posts related to this course (36-467/36-667).




