A Look at Gender Representation over Time


I use the New York Times’ database tool to examine the prevalence of the phrases “he said” and “she said,” using them as proxies for the representation of men and women in news articles over time. Dramatic world events have tended to suppress women’s representation until recently. Women’s representation in the news is only about one-third that of men at present, and this level was only achieved in the late 90s. There is also a general trend of increasing usage of both proxy phrases over time, especially in the post-WWII era.


The New York Times (NYT) recently added a data visualization tool to its website, allowing people to search for the prevalence of key phrases throughout the newspaper’s history. The Guardian ran with a story that included a chart of the use of “he said” versus “she said” over time, and this has been circulating online—mostly as an indication of how lopsided the representation of men vs. women is in the news.

The following assumes that the NYT usage corresponds to broad societal attitudes and that it can be used as a proxy for understanding the relative power and prominence of women vs. men in our society. At the very least, if one assumes that the NYT caters to the wealthiest top 20% or so of the population, it says something about the standing of men and women within that elite stratum. It also assumes that “he said” and “she said” are reasonable proxies for the representation of men and women in the news.

Analysis and Discussion

[Figure: Fractions of NYT articles using “he said” and “she said”]

First, the correlation between “he said” and “she said” is 0.89, meaning that the two measures tend to rise and fall together. This is likely due to the particular writing styles of those employed by the Times, or due to the preferences of the editors. They, in turn, are representative of shifting societal expectations and linguistic customs (again, either broadly or within the upper strata of US society, depending on whom the NYT writes for). Both culture and world events tend to affect the use of these phrases.
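For anyone who wants to check a correlation like this against their own export of the yearly percentages, here is a minimal sketch of the Pearson calculation. The function name `pearson` and the sample values are my own placeholders, not the NYT figures, and I am assuming the 0.89 figure is a plain Pearson coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder yearly percentages (hypothetical, NOT the real NYT data).
he_said = [4.0, 9.0, 15.0, 22.0, 23.0]
she_said = [0.5, 1.0, 2.0, 6.0, 8.0]
r = pearson(he_said, she_said)
print(f"correlation: {r:.2f}")
```

A value near 1 means the two series move together; it says nothing about their relative sizes, which is why the ratio is examined separately.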

There has been a large increase in the use of both proxies over time. The period from 1851 to 1890 averaged 4% of articles using the phrase “he said,” while 1891 to 1950 more than doubled this to 9.2%. Then there is a gradual takeoff from 1950 to the mid-70s, where “he said” becomes much more common and tops out between 20% and 25% of all articles, holding roughly steady to the present. “She said” also grows during this time, moving from 1% in 1891 – 1950 to roughly 8% in the present.

Possible explanations for the dramatic difference in pre- and post-WWII:

  1. In the 1800s and early 1900s, it was considered proper to reference individuals by their surnames or full names rather than via pronouns.
  2. In the 1800s and early 1900s, there were fewer direct quotations of sources. Possible sub-explanations:
    1. Space limitations. Easier to summarize someone’s statement than to quote them at length. Perhaps quote a few choice words or phrases from them, summarize the rest in between for brevity.
    2. The inability to write down every word or record precisely what someone said, again leading to a preference for summarizing. There might be a correlation here with the prevalence of recording devices, especially portable ones. However, I would think that shorthand writing should have been able to get around this and was actually developed for this exact purpose.
    3. A shift in style toward narratives and conversations. Summarizing requires the ability to take a whole statement into account and distill it down to its essence, which is often less conversational. Quoting individuals directly personalizes an article, making it slightly more conversational and story-like than a summation. Remember, this could also reflect a gradual evolution of standards among journalists, and not in the broader culture. Recall that the NYT tends to serve the wealthy and professionals, so this change could be confined to those segments of society: a “vulgarization” of their discourse, while keeping the content roughly the same.
    4. A shift in focus/content toward personality and “what people say” and away from analysis and discussion. Reasons for this could include the desire to appear balanced and impartial or the need to provide cover from accusations of distortion: a direct quote looks more legitimate and true than a summary, even if the quote is used out of context. This could also relate to sub-item 3 above.

These are just some guesses—no idea if I’m in the right ballpark.

[Figure: Ratio of “she said” to “he said” in the NYT]

The ratio of “she said” to “he said” varies from 4% to 24% prior to the Great Depression. Note a solid upward trend from 1890 to approximately 1914. This correlates closely with the movement for women’s suffrage, with the National American Woman Suffrage Association forming in 1890 and the Nineteenth Amendment being ratified in 1920. There is a dip from 1914 to 1919, as World War I drew a lot of attention to the political and military realm that was dominated entirely by men at the time. The level picked back up after the war, then fell again around 1928 as the Great Depression took hold. The ratio did not take off again until the mid-1960s as the second-wave feminist movement in the US developed; The Feminine Mystique was published in 1963.

If this ratio can be taken as a rough indication of the representation of women in national news and if that is an estimation of their relative standing in society, then the period from 1929 to 1965 was the worst such stretch since NYT records began in 1851. Though there was a great deal of variation prior to 1929, the 1929 – 1965 period featured consistently low numbers, no higher than 11% and hovering around 6% for most of the 1950s.
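To make the ratio concrete, here is a rough sketch of how it and a period average could be computed from yearly percentages. The helper names `ratio_percent` and `period_mean` are hypothetical, and the yearly values are placeholders I made up to land near 6%; they are not the NYT data.

```python
def ratio_percent(she_pct, he_pct):
    """'She said' share expressed as a percentage of the 'he said' share."""
    if he_pct <= 0:
        raise ValueError("'he said' share must be positive")
    return 100.0 * she_pct / he_pct


def period_mean(series, start, end):
    """Mean of the values in a {year: value} mapping for start <= year <= end."""
    values = [v for year, v in series.items() if start <= year <= end]
    if not values:
        raise ValueError("no data in the requested period")
    return sum(values) / len(values)


# Placeholder yearly shares in % of articles (hypothetical, NOT the real data).
he = {1950: 10.0, 1951: 12.0, 1952: 11.0}
she = {1950: 0.6, 1951: 0.6, 1952: 0.77}
ratios = {year: ratio_percent(she[year], he[year]) for year in he}
print(round(period_mean(ratios, 1950, 1952), 1))

# The 1891-1950 averages quoted earlier (1% and 9.2%) give roughly 10.9%.
print(round(ratio_percent(1.0, 9.2), 1))
```

Note that a ratio of period averages is not the same thing as an average of yearly ratios, so the last line is only a sanity check on the order of magnitude.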

Somewhat disconcerting for those of us born recently is the fact that the current levels of (still low) representation of women in the news were only achieved in the late 1990s. Even more disconcerting is that the last two years were the lowest in two decades, though it remains to be seen whether this is an aberration or the start of a new trend.

Note also that large, systemic crises that last for multiple years see a dip in the ratio, probably because politics, wars, and markets have historically been male dominated. As I mentioned above, World War I was a departure from trend, as were the Great Depression and World War II—and cultural changes from these latter two locked in a strongly subordinate status for women for an additional two decades after their resolution. The Civil War can be picked out of the data, with the ratio plummeting from a range of 10% to 20% down to 4% or 5% from 1861 to 1864, then picking back up to around 10% after the war.

However, it’s difficult to pick out more recent crises. The Vietnam War does not show up in the data as a decrease in the ratio, likely because second-wave feminism was growing at the same time and the movement was a key part of the protests against the war. Women’s voices continued to be heard during wartime, and in fact increased. There is a slight stagnation in the late 70s and early 80s, possibly due to the end of the war (and its loss as a platform for oppositional organization) and stagflation, though I’m mostly hand-waving at this point and should stop speculating.

Even more recently, the effects of the dotcom bubble, the September 11th attacks, and the invasions of Iraq and Afghanistan show up as fluctuations of maybe 1 or 2 percentage points difference. In other words, not much at all. The 2008 Great Recession is completely hidden in the ratio but shows up in the two base percentage data sets—although it’s possible that the low ratio values in 2012 and 2013 are indicative of lasting, deeper effects of the Recession.

Several ideas come to mind:

  1. For the majority of the US population, the intensity of recent wars and crises has diminished greatly due to use of a volunteer army, limited US casualties, and the existence of a social safety net in the post-New Deal era.
  2. Women now participate in politics, wars, and market activities to enough of a degree that when crises occur they are now part of the focus and not relegated to the background.
  3. The wealthy and much of the upper middle class are increasingly insulated from crises that affect the rest of the nation and world, and the NYT tends to serve these audiences.

There may be some merit to (1), as the more recent wars did not feature a draft, casualties were limited compared to 20th century wars that involved the US, and the existence of unemployment insurance, Social Security, and welfare can mitigate the worst impacts of economic crises. This would explain why recent dips in base proxy percentages, though present, are small. However, NYT reporters, editorial writers, and editors are not drawn from the general population and instead represent the wealthier strata of society. Their readers also tend to come from these strata. So the influence of the bottom 80%’s experience on their writing is limited, thus limiting (1)’s usefulness as an explanation. It could still be true, but it’s doubtful that it shows up in this particular set of data.

Idea (2) looks more likely. Although women are underrepresented in positions of power, they may have surpassed some minimum threshold of participation where they are now sufficient in number to be noticed and quoted in situations where men used to be the sole focus. Additionally, movements associated with third-wave feminism have been active in the US since the early 1990s and could be having an effect on coverage. Note that this explanation also means that, given the chance, women are roughly as likely as men to participate in wars, economic downturns, and political crises, which sounds right to me. This would explain why the ratio is pretty level over the last 15 years while the individual proxy values may rise and fall a bit more dramatically.

Finally, (3) is also possible, especially with rising wealth inequality in the US since the 1980s. Given the existence of Wall Street and so much high finance right in the NYT’s backyard, as well as my stated reservations about (1) above, idea (3) seems like an explanation for proxy percentage changes: events that affect the wealthy show up more dramatically than those that don’t. The 2008 Great Recession shows up; the wars in Iraq and Afghanistan do not. However, the dotcom bust doesn’t show up either, weakening the case for this explanation.


The proportional representation of women in NYT reporting since the late 1990s has increased by a factor of three compared to the 1800s and a factor of six compared to the period 1929 to 1965. However, it’s still at the very low proportion of 30% to 35%, and even these levels were only achieved in the last 20 years. The last couple of years have also seen a downward push in representation, but whether this is a new trend or an aberration has yet to be seen. If these ratios are taken as a proxy for women’s status in society, we have a long way to go. However, it’s also possible that the NYT is somewhat behind the times.

Historically, women’s representation has plummeted during wars and economic crises and risen during feminist movements and political agitation for women’s rights. The last twenty years or so may have broken this dynamic as women have increasingly entered positions of power, though it is difficult to tell. There has also been a general trend toward greater use of “he said” and “she said” in the NYT, indicating a shift in the style of reporting. The various explanations for this above cannot be evaluated with the information on hand or from my own knowledge.


Gentrification Dynamics


Gentrification is an engine for super-wealthy investors to extract income from the higher tiers of the professional upper middle class by displacing and exploiting the poor and working class. I conduct a thought experiment on the broad decisions faced by a business owner who finds herself in a suddenly more prosperous neighborhood and reflect on how these decisions, under certain assumptions in each scenario, impact workers and the neighborhood. Remember, this is a thought experiment: I’m not at all sure that I’m right, or that I’m not missing something huge.


To start, assume we have a locally owned service sector business (restaurant, barbershop, retailer, etc.) that exists in a particular urban neighborhood. Its sale prices are set at roughly what the neighborhood can afford. Some other line of work in the city takes off—something like the tech boom happens and suddenly many (but by no means all) people in the neighborhood have more money.

What happens to the business, its employees, and the neighborhood?

Scenario 1

Assume the owner pays wages that are reasonable enough for her employees to live in the neighborhood. As the incomes in the area increase, property values increase relatively little, with no speculative bubbles permitted to grow due to regulation or other institutional arrangements (perhaps huge piles of wealth are not permitted much liquidity but are instead restrained in some way). Income increases don’t go to property values (other than general home improvements), and there is only a small inflow of new, higher income residents. Prosperity is not tied heavily to land/housing value increases.

If the business owner maintains original prices, she may face some hardship due to slight increases in rent and taxes. If she raises prices she may alleviate this minor hardship. Since her customers are generally wealthier, she likely has some latitude to do so—but note that she is not compelled to do this here. However, if other businesses raise their prices as well, she is wise to follow the general trend to increase her profits, especially in anticipation of what comes next.

If she wishes to retain her local employees, she may choose to raise wages so they can afford to keep patronizing her business and other similar establishments. The local businesses that raise wages will be more likely to attract new and better employees, so her particular business has an incentive to do so. If she does not, the employees may go elsewhere for work, though again note that they still have a good chance of finding this new work in the neighborhood since other businesses are following this model. If there is not a terribly depressed neighborhood nearby, employees have leverage now to demand pay increases. If there is, owners may hire from the depressed area. Assume for now that there is NOT a depressed area to hire from.

To bring in additional income, the business may experiment and start offering modified or new services that are related to its old services. To keep costs down, it may also try changing its suppliers. If either of these strategies works then the business increases its capacity to raise employee wages, thus increasing its competitiveness on that front.

In this scenario, modest increases in income from an outside growth industry are translated first into higher prices (since that is what the traffic will bear) and then into higher wages (since it is to the advantage of each business to remain competitive and attract new talent, plus have their employees remain customers). Overall income has increased, as have costs, while services and neighborhood residents remain roughly the same—possibly with some expanded opportunities for new work or entirely new businesses, since people can afford to pay more overall, and possibly with some shuffling of suppliers to increase efficiency.

This is wage-push inflation, where a wage increase in one sector leads to higher prices, which leads to more wage increases, sometimes in the sector that hosted the original increase (in which case it is directly iterative), other times in horizontally connected sectors. In the case of well-structured and competitive markets, wage increases can happen relatively naturally through the incentives of business owners alone. When markets are not competitive, unions and other such organizations seem to be necessary to enforce the wage increase part of the cycle. When the labor market is slack (i.e., businesses don’t compete as much in it), wage increases are less likely to occur and this mechanism is more likely to break down.

If this mechanism works, the neighborhood is mostly preserved but undergoes some measured and stable evolution. There are increases in income, wages, and the diversity of economic activity, and there may be some decreases in non-labor business costs. Some new residents enter, some old residents leave, but the flow is manageable. Living, healthy cities contain many neighborhoods that repeatedly undergo this process and which keep the labor market from getting slack. If only one neighborhood did this in an otherwise depressed city, the slack labor market would very likely eliminate wage increases and you’re more likely to have a result like that of Scenario 2.

Scenario 2

Wages start out reasonable enough, but as incomes increase, property values enter a speculative bubble. People not in the golden industry are priced out of homes and rental units, or see that they have a lot to gain by selling and moving elsewhere. Lots of new people move into the neighborhood, lots of established residents leave.

If our business owner maintains original prices, the increased rent and taxes will probably drive her out. If she raises prices, she will be more likely to pay the rent but is now contributing to pricing others out of the neighborhood. Her own employees can no longer afford to patronize her business, and her customer base is narrower, though more affluent. Assuming there is some competition, she has an incentive to raise prices as little as possible to survive, but because she must make rent in a speculative market, she must strongly maximize the difference between income and costs. This is accomplished by not raising the wages of her employees, thus minimizing total price increases (which keeps the customer base wider and keeps the business competitive) and keeping costs down.

As in Scenario 1, there may be some possibility of changing suppliers to keep costs down. It also seems that there would be more risk and therefore less incentive to experiment with new services, and that the risk increases (and experimentation decreases) as the rental bubble’s growth rate increases.

Wages are now too low for employees to live in the neighborhood, so they must come from elsewhere in the city (somewhere more depressed), supplemented by the teenage children of the now-wealthier residents who will work part time for minimum wage and no benefits. If her business survives, it now caters to people who didn’t live in the neighborhood before the affluence hit and employs people who can’t afford to live in the neighborhood now. The introduction of bubble rent pushes this arrangement and makes it the most viable option. The bubble rent is an expanding cost that must be paid, stretching her business model to its limit.

I would guess that cities that host slack labor markets are more likely to fall into this trap, since the lower wage level surrounding the affluent neighborhood provides an extra shot in the arm for the investors/extractors. If the business doesn’t do this, it will perish. Its last-ditch option is to move to a more depressed part of the city where it can still operate. Its place will likely be taken by either a boutique that caters to the professional upper middle class and lower level wealthy or by some other business with low fixed costs and a greater ability to keep wages depressed—a franchise.

This is gentrification, and this is what we have come to believe development must be like. But it is a failure mode of development that ransacks neighborhoods by buying land cheap, inflating value, extracting as much of that inflated value as possible, and then moving on to the next juicy target. The neighborhood is left to fend for itself without the money that propped it up to such glorious heights.

It works especially well in already stagnant areas, since it rides off an income gradient—high prices in the target area, cheap labor within commute distance. I suspect you can even remove the original income increase and just advertise a certain neighborhood as trendy in order to attract existing wealth. Get the professional classes to move their wealth into a new fashionable spot for extraction rather than create genuinely new income streams. Keep the wealth moving, like a financial heat engine that extracts money instead of doing useful work.

Note especially that when compared to Scenario 1, the stagnation in wages in Scenario 2 is effectively theft from the workers’ wages to pay the landlords. The more benevolent the business owner, the more likely that she’ll try to raise workers’ wages, which means that she must raise prices higher or cut costs elsewhere (quality), which means she is more likely to go out of business. Massive rent increases further sharpen the divisions between business owners and workers.

Scenario 3

The business in Scenario 2, which includes a bubble, is now taken to be a franchise that may determine the wages it pays but which has additional fixed costs and is contractually obligated to use franchise suppliers. The only major change here is that the franchise model locks in supplier and certain administrative costs, while prohibiting most experimentation. The only two “outs” afforded in Scenario 2 are eliminated, which leaves wage stagnation and price increases as the only levers to maintain profitability. Franchises are an example of increased efficiency at the cost of lower flexibility. They can do quite well in gentrified areas, but only because they keep wages down and use their national brand names to charge higher prices. (In depressed areas, they likely do the former while not pushing the latter as much).

Closing Thoughts

Scenario 1 seems more likely when the distribution of wealth around a city or region is more equal and there are multiple loci of growth. There are more investment opportunities proportionate to the amount of wealth that investors have, and huge piles of hot money are not allowed to be used as weapons of extraction. There isn’t a strong preference to latch onto or create a very small number of “good” neighborhoods and ruin them.

Scenario 2 seems more likely when the distribution of wealth is very uneven and there are no regulatory mechanisms that prevent speculation. There are fewer investors and fewer investment opportunities, and those that do exist tend to attract tons of hot money and inevitably lead to speculation. This drives people out of their homes and drives out local businesses. By trying (and being allowed) to dump tons of money into a sure thing, that thing is destroyed and remade into something else—an engine of profit extraction that will only last as long as the investors decide to make it last, until there is another even better opportunity elsewhere. And then the money will leave, and the neighborhood will collapse. The people who once made it function are no longer there, the businesses that operated without all that speculative investment are gone. The props are removed and the neighborhood comes crashing down.

Scenario 3 is the same as 2, but explores the more restricted solution space of franchises. They are often more efficient than traditional businesses, which enables them to get a foothold in most places. But their lack of flexibility requires them to raise prices and depress wages in order to survive changes, further contributing to gentrification. They are also stagnant when it comes to generating new types of work. If I’m right about this, how much do franchises contribute to the collapse of gentrifying areas? If only so much price/wage disparity can be tolerated in a given area, franchises (especially upscale ones) may drive the area to this limit faster. Speculators would encourage this in order to increase their gains, then pull out even sooner. In this sense, the franchise owners are participating in the same behavior as the real estate speculators.

If I’m not too far off in my thinking above, many of today’s gentrified areas are likely to collapse into misery and stagnation in the future. We must develop tools to handle their inevitable breakdown and set them back on the path of slower, measured growth and vibrancy. We must halt the gentrification engine. I suspect that it’s only one symptom of what Stirling Newberry calls “The Red Queen’s Race,” so a solution to gentrification will likely be a sub-component of a much larger plan. I also very, very strongly recommend that you go and read that link several times until you absorb it all.

Replacing Email

Email sucks and should be replaced with a combined task queue and instant messaging system.  But these systems also lend themselves to management predation by encouraging more metrics and more tracking.  Whatever replaces email must solve its usability problem while protecting workers from tracking.  What follows are some thoughts on how a task queuing system could work, plus a few ideas for where the system needs work to protect workers.

Users may compose and send task requests.  Each request consists of:

  • Type.  Read, edit, research, calculate, etc.  Precise types of tasks can be specified by an organization.
  • Content.  The task being requested, written in plain text by the sender.  The possibility to auto-generate routine content for repeated tasks should be available.  May want to see about including some form of hyperlinks/tags to common documents, projects, etc–maintain a dictionary of common references used within the organization that can be auto-detected and tagged in the content boxes.
  • Deadline.  Date and time by which the task should be complete.
  • Priority.  An urgency level set by the sender.
  • Project.  A project this request is associated with, if any.
  • Time estimate.  Amount of time likely needed to complete the task.  This is open to abuse, as it removes quite a bit of worker agency and is susceptible to management gaming through reducing time estimates to get employees fired for performance issues.
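As a rough illustration of the request fields listed above, here is a minimal sketch of a request record. The class name `TaskRequest`, the field names, the integer priority, and the `find_references` helper for the dictionary of common references are all my own assumptions, not a settled design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TaskRequest:
    """One task request; field names are hypothetical."""
    task_type: str                              # e.g. "read", "edit", "research"
    content: str                                # plain-text description from the sender
    deadline: datetime                          # when the task should be complete
    priority: int                               # urgency level set by the sender
    project: Optional[str] = None               # associated project, if any
    time_estimate: Optional[timedelta] = None   # optional, and open to abuse

    def __post_init__(self):
        if not self.content.strip():
            raise ValueError("content must not be empty")


def find_references(content, known_references):
    """Return the known reference names that appear in the content (case-insensitive)."""
    lowered = content.lower()
    return [name for name in known_references if name.lower() in lowered]


request = TaskRequest(
    task_type="edit",
    content="Please edit the Q3 budget draft",
    deadline=datetime(2024, 1, 5, 17, 0),
    priority=2,
)
print(find_references(request.content, ["Q3 budget", "Onboarding guide"]))
```

Keeping the time estimate optional is deliberate here: a request is still valid without it, which leaves room for an organization to switch that field off entirely.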

Priority may be:

  • Hierarchical.  The more levels above a given worker the sender is, the higher the priority the sender may attach to the request.  Request priority from subordinates may be altered by their bosses.
  • Egalitarian.  Everyone has the same priority options available to them, and everyone is allocated a certain number of priority modifications per unit of time.
  • Deadline-based, possibly combined with the optional time estimate information.

Note that the above priority discussion could be extended into other aspects of the system, such as who gets to accept/decline task requests.  Hierarchical is most open to abuse, and deadline-based may be as well if management is willing to cook the numbers.
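The three priority schemes could be sketched roughly as follows. Everything here is an assumption on my part: the names `hierarchical_cap`, `PriorityBudget`, and `slack`, the use of integers for priority levels, and the idea that smaller slack means more urgent.

```python
from datetime import datetime, timedelta


def hierarchical_cap(levels_above, base_cap=1):
    """Hierarchical: the highest priority a sender may attach grows with rank."""
    return base_cap + max(levels_above, 0)


class PriorityBudget:
    """Egalitarian: everyone gets the same number of priority changes per period."""

    def __init__(self, allowance):
        self.allowance = allowance
        self.used = {}

    def spend(self, user):
        """Use one priority modification; return False if the user has none left."""
        count = self.used.get(user, 0)
        if count >= self.allowance:
            return False
        self.used[user] = count + 1
        return True

    def reset(self):
        """Start a new period: everyone's allowance is restored."""
        self.used.clear()


def slack(deadline, now, time_estimate=timedelta(0)):
    """Deadline-based: time left after reserving the estimated work."""
    return deadline - now - time_estimate


budget = PriorityBudget(allowance=1)
print(budget.spend("ana"), budget.spend("ana"))
print(slack(datetime(2024, 1, 2, 12), datetime(2024, 1, 2, 9), timedelta(hours=1)))
```

The deadline-based version shows where the cooking of numbers would happen: shrinking `time_estimate` inflates the slack and makes an overloaded queue look comfortable.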

When a task is sent, it goes into the queue of the acceptor(s).  The acceptor(s) may choose to take on the task, may request a modification, or may reject it–all based on the rules established within the organization.  Once a task is accepted, it is added to the task hierarchy of the given project, with the acceptor listed as the responsible party.  An organization may choose to keep a complete history of all tasks associated with a given project, or they may disappear into the ether after completion.  Possibly both, depending on the nature of the tasks/project.

Some tasks are purely self-completed, with the task owner checking them off as complete.  Others may require management check-off.  Task statuses include “awaiting acceptance,” “accepted,” “amendment proposed,” “delegated,” “rejected,” and “enqueued.”
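One way to keep these statuses honest is a small table of allowed moves between them. The sketch below uses the six statuses named above, but the transition table itself is purely my guess; each organization would presumably define its own rules.

```python
from enum import Enum


class Status(Enum):
    AWAITING_ACCEPTANCE = "awaiting acceptance"
    ACCEPTED = "accepted"
    AMENDMENT_PROPOSED = "amendment proposed"
    DELEGATED = "delegated"
    REJECTED = "rejected"
    ENQUEUED = "enqueued"


# Hypothetical rules: which statuses each status may move to.
ALLOWED = {
    Status.AWAITING_ACCEPTANCE: {Status.ACCEPTED, Status.AMENDMENT_PROPOSED, Status.REJECTED},
    Status.AMENDMENT_PROPOSED: {Status.AWAITING_ACCEPTANCE, Status.REJECTED},
    Status.ACCEPTED: {Status.ENQUEUED, Status.DELEGATED},
    Status.ENQUEUED: {Status.DELEGATED},
    Status.DELEGATED: set(),
    Status.REJECTED: set(),
}


def transition(current, new):
    """Return the new status if the move is allowed, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {new.value!r}")
    return new


status = transition(Status.AWAITING_ACCEPTANCE, Status.ACCEPTED)
print(status.value)
```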

Tasks may be delegated, either intact or broken into pieces.  If broken into pieces, the onus is on the delegator to break the original task up into appropriate pieces and assign them to others correctly.  Some interpretation may be necessary in the case of broader directives from upper management.  When a “parent” task is broken, the “child” tasks that it is translated into have links back to the parent, establishing a clear tree of accountability and interpretation.

One can imagine a whole project descending from a single task at the top, with various branches and twigs of sub-components trailing off.
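The parent/child links could be represented with a very small tree structure. This is a minimal sketch under my own assumptions: the `Task` class, its `delegate` and `lineage` methods, and the example names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Task:
    """A task node; children link back to the parent they were split from."""
    content: str
    owner: str
    parent: Optional["Task"] = field(default=None, repr=False)
    children: List["Task"] = field(default_factory=list)

    def delegate(self, pieces):
        """Break this task into child tasks; pieces is a list of (content, owner)."""
        created = []
        for content, owner in pieces:
            child = Task(content, owner, parent=self)
            self.children.append(child)
            created.append(child)
        return created

    def lineage(self):
        """Contents from this task up to the root: the chain of accountability."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.content)
            node = node.parent
        return chain


root = Task("Launch the newsletter", "director")
draft, design = root.delegate([("Write first issue", "ana"), ("Design template", "ben")])
(outline,) = draft.delegate([("Outline the lead story", "cam")])
print(outline.lineage())
```

Walking `lineage()` from any twig recovers how a broad directive was interpreted at each level, which is the accountability trail described above.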

For such a system to succeed, it needs to provide an interface with common email systems.  Exactly how to translate external emails into tasks in this system without enormous amounts of labor is an unsolved problem.

This system must also include an instant messaging system, since more detailed discussions obviously must take place in the course of completing tasks.  The instant messaging software should not provide information on worker idle time, etc.  Video chat and in-person meetings would also be necessary, but do not have to form a part of this particular system–there are other options out there.

The option for detailed time monitoring of task completion should be made available to teams that desire it.  Note that I said “teams,” not organizations or managers.  Every single instance of the application will have the ability to turn on/off time tracking as a user-only (non-admin) privilege.  This does not get around the threat to fire an employee for turning off her tracking, though perhaps a single installation could spoof the time reporting by averaging that of other employees on similar/identical tasks and submitting that.
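The averaging idea at the end of that paragraph is simple enough to sketch. The function name `reported_minutes` and its parameters are hypothetical, and I am assuming peer times for similar tasks are available as a plain list of minutes.

```python
from statistics import mean


def reported_minutes(own_minutes, tracking_enabled, peer_minutes):
    """Time to report for a task.

    With tracking on, report the user's own time. With tracking off, report
    the average of peers on similar tasks, or None if there are no peers.
    """
    if tracking_enabled:
        return own_minutes
    if not peer_minutes:
        return None
    return mean(peer_minutes)


print(reported_minutes(50, True, [30, 40]))
print(reported_minutes(50, False, [30, 40]))
```

The obvious weakness is the empty-peer case: if nobody else has done a similar task, there is nothing to hide behind, and the installation has to report nothing at all.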

In-built worker protection from management predation and metrics obsession is another unsolved problem.