Why You Shouldn’t Draw Trend Lines on Graphs
Guest essay by Kip Hansen
What we call a graph is more properly referred to as “a graphical representation of data.” One very common form of graphical representation is “a diagram showing the relation between variable quantities, typically of two variables, each measured along one of a pair of axes at right angles.”
Here at WUWT we see a great many graphs: all sorts of graphs of a great many different data sets. Here is a commonly shown graph offered by NOAA, taken from a piece at Climate.gov called “Did global warming stop in 1998?” by Rebecca Lindsey, published on September 4, 2018.


I am not interested in the details of this graphic representation; the whole thing qualifies as “silliness”. The vertical scale is in degrees Fahrenheit, and the entire range of change over the 140 years shown is on the scale of 2.5 °F, or about a degree and a half C. The interesting thing about the graph is the effort of drawing “trend lines” on top of the data to convey to the reader something about the data that the author of the graphic representation wishes to communicate. This “something” is an opinion (it is always an opinion); it is not part of the data.
The data is the data. Turning the data into a graphical representation (all right, I’ll just use “graph” from here on...), making the data into a graph, has already injected opinion and personal judgement into the data through the choice of start and end dates, vertical and horizontal scales and, in this case, the shading of a 15-year period at one end. Sometimes the decisions as to vertical and horizontal scale are made by software, not rational humans, causing even further confusion and sometimes gross misrepresentation.
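To make the point about scale concrete, here is a minimal sketch of how much of a plot's height the same data occupies under two different vertical axis choices. The helper `visual_fraction`, the anomaly values and both sets of axis limits are assumptions invented for illustration; only the 2.5 °F total change comes from the discussion above.

```python
def visual_fraction(data_min, data_max, axis_min, axis_max):
    """Fraction of the plot's height that the data occupies."""
    return (data_max - data_min) / (axis_max - axis_min)


# The total change discussed above: about 2.5 degrees F over 140 years.
change_f = 2.5
# A temperature *difference* converts with the 5/9 factor only (no 32 offset).
change_c = change_f * 5.0 / 9.0

# Hypothetical values and axis limits, chosen only for illustration.
data_min, data_max = -0.5, 2.0     # spans the 2.5 F change
tight_axis = (-0.75, 2.25)         # a tight range hugging the data
wide_axis = (-10.0, 10.0)          # a deliberately wide range

print("change in C:", round(change_c, 2))
print("tight axis :", round(visual_fraction(data_min, data_max, *tight_axis), 3))
print("wide axis  :", round(visual_fraction(data_min, data_max, *wide_axis), 3))
```

The data are identical in both cases; only the share of the picture they fill changes, which is the kind of judgement (human or software) that goes into any graph before a single trend line is drawn.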
Anyone who can’t see the data clearly in the top graph without the aid of the red trend line should find another field of study (or see their optometrist). The bottom graph has been turned into a propaganda statement by the addition of five opinions in the form of mini-trend lines.
Trend lines do not change the data; they can only change the perception of the data. Trends can be useful at times [ add a big maybe here, please ] but they do nothing for the graphs above from NOAA other than attempt to denigrate the IPCC-sanctioned idea of “The Pause”, reinforcing the desired opinion of the author and her editors at Climate.gov (who, you will notice from the date of publication, are still hard at it hammer-and-tongs, promoting climate alarm). To give Rebecca Lindsey a tiniest bit of credit, she does write “How much slower [ the rise was ] depends on the fine print: which global temperature dataset you look at”.... She certainly has that right. Here is Spencer’s UAH global average lower tropospheric temperature:


One doesn’t need any trend lines to be able to see The Pause that runs from the aftermath of the 1998 Super El Niño to the advent of the 2015-2016 El Niño. This illustrates two points. Drawing trend lines on graphs adds information that is not part of the data set. And it really is important to know that for any scientific concept there is more than one set of data (more than one measurement), and it is critically important to know “What Are They Really Counting?”, the central point of which is:
So, for all measurements offered to us as information, especially if accompanied by a claimed significance (when we are told that this measurement/number means this-or-that), we have the same essential question: What exactly are they really counting?
Naturally, there is a corollary question: Is the thing they counted really a measure of the thing being reported?
I recently came across an example in another field of just how intellectually dangerous the cognitive dependence (almost an addiction) on trend lines can be for scientific research. Remember, trend lines on modern graphs are often calculated and drawn by statistical software packages, and the output of those packages is far too often taken to be some sort of revealed truth.
I have no desire to get into any controversy about the actual subject matter of the paper that produced the following graphs. I have abbreviated the identified condition on the graphs to gently disguise it. Try to stay with me and focus not on the medical issue but on the way in which trend lines have affected the conclusions of the researchers.
Here’s the big data graph set from the supplemental information for the paper:
Note that these are graphs of Incidence Rates, which can be considered “how many cases of this disease are reported per 100,000 population?”, here grouped by 10-year Age Groups. They have added colored trend lines where they think (opinion) significant changes have occurred in incidence rates.


[ Some important details, discussed further on, can be seen on the FULL-SIZED image, which opens in a new tab or window. ]
IMPORTANT NOTE: The condition being studied in this paper is not something that is seasonal or annual, like flu epidemics. It is a condition that develops, in most cases, for years before being discovered and reported, sometimes only being discovered when it becomes debilitating. It can also be discovered and reported through regular medical screening, which normally is done only in older people. So “annual incidence” may not be a proper description of what has been measured. It is actually a measure of “annual cases discovered and reported”, not actually incidence, which is quite a different thing.
The published paper uses a condensed version of the graphs:


The older men and women are shown in the top panels, happily with incidence rates declining from the 1980s to the present. However, as thoughtfully reinforced by the addition of colored trend lines, the incidence rates in men and women younger than 50 years are rising rather steeply. Based on this (and a lot of other things), the researchers draw this conclusion:


Again, I have no particular opinion on the medical issues involved... they may be right for reasons not apparent. But here’s the point I hope to communicate:


I annotate the two panels regarding incidence rates in Men older than 50 and Men younger than 50. Over the 45 years of data, the rate in men older than 50 runs in a range of 170 to 220 cases reported per year, varying over a 50 cases/year band. For Men < 50, incidence rates have been very steady, from 8.5 to 11 cases per year per 100,000 population, for 40 years, and only recently, in the last four data points, have risen to 12 and 13 cases per 100,000 per year: an increase of one or two cases per 100,000 population per year. It may be the trend line alone that creates a sense of significance. For Men > 50, between 1970 and the early 1980s, there was a rise of 60 cases per 100,000 population. Yet, for Men < 50, the increased discovery and reporting of an additional one or two cases per 100,000 is concluded to be a matter of “highest priority”. In reality, it may or may not actually be significant in a public health sense, and it may be within the normal variance in discovery and reporting of this type of disease.
The range of incidence among Men < 50 remained the same from the late 1970s to the early 2010s; that’s pretty stable. Then there are four slightly higher outliers in a row, with increases of 1 or 2 cases per 100,000. That’s the data.
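How a fitted line reacts to a pattern like this can be sketched in a few lines. The numbers below are invented to match the description above (40 points bouncing between 8.5 and 11, then four points at 12 to 13); they are NOT read from the paper, and the start year and the helper `ols_slope` are likewise assumptions made for illustration.

```python
# Illustrative values only, invented to match the description in the text.
# They are NOT the paper's data.
stable = [9.0, 10.5, 8.5, 11.0, 9.5, 10.0, 8.5, 10.5, 9.0, 11.0] * 4  # 40 points
recent = [12.0, 12.0, 13.0, 13.0]                                     # last four points
rates = stable + recent
years = list(range(1975, 1975 + len(rates)))  # hypothetical start year


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


stable_range = (min(stable), max(stable))
slope_stable = ols_slope(years[:len(stable)], stable)   # first 40 points only
slope_all = ols_slope(years, rates)                     # all 44 points
slope_last_10 = ols_slope(years[-10:], rates[-10:])     # a short recent window

print("range during stable period:", stable_range)
print("slope, stable period only :", round(slope_stable, 3))
print("slope, whole series       :", round(slope_all, 3))
print("slope, last 10 points     :", round(slope_last_10, 3))
```

With these made-up numbers, the slope depends heavily on which window the line is fitted to, and the short recent window gives by far the steepest line. The data points themselves are the same in all three cases.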
If it were my data and my topic (say, the number of Monarch butterflies visiting my garden annually by month or something), I would notice from the panel of seven graphs further above that the trend lines confuse the issues. Here it is again:

[ full-sized image in new tab/window ]
If we try to ignore the trend lines, we can see in the first panel, 20-29y, that incidence rates are the same in the current decade as they were in the 1970s; there is no change. The range represented in this panel, from lowest to highest data point, is less than 1.5 cases/year.
Skipping one panel, to 40-49y, we see the range has maybe dropped a bit, but the entire magnitude range is less than 5 cases/100,000/year. In this age group, there is a trend line drawn which shows an increase over the last 12-13 years, but the range is currently lower than in the 1970s.
In the remaining four panels, we see “hump shaped” data, which over the 50 years stays in the same range within each age group.
It is important to remember that this is not an illness or disease for which a cause is known or for which there is a method of prevention, though there is a cure if the condition is discovered early enough. It is a class of cancers, and incidence is not controlled by public health actions to prevent the disease. Public health actions are not causing the change in incidence. It is known to be age-related and occurs increasingly often in men and women as they age.
It is the one panel, 30-39y, that shows an increase in incidence of just over 2 cases/100,000/year, that is the controlling factor that pushes the Men < 50 graph to show this increase. (It may be the 40-49y panel having the same effect.) (Again, repeating the image to save readers scrolling up the page):


Recall that the Conclusion and Relevance section of the paper put it this way: “This increase in incidence among a low-risk population calls for additional research on possible risk factors that may be affecting these younger cohorts. It appears that primary prevention should be the highest priority to reduce the number of younger adults developing CRC in the future.”
This essay is not about the incidence of this class of cancer among various age groups. It is about how having statistical software packages draw trend lines on top of your data can lead to confusion and possibly misunderstandings of the data itself. I will admit that it is also possible to draw trend lines on top of one’s data for rhetorical reasons [ “expressed in terms intended to persuade or impress” ], as in our Climate.gov example (and millions of other examples in all fields of science).
In this medical case, there are additional findings and reasoning behind the researchers’ conclusions, none of which change the basic point of this essay about statistical packages finding and drawing trend lines over the top of data on graphs.
Bottom Lines:
Trend lines are NOT part of the data. The data is the data.
Trend lines are always opinions and interpretations added to the data and depend on the definition (model, statistical formula, software package, whatever) one is using for “trend”. These opinions and interpretations can be valid, invalid, or nonsensical (and everything in between).
Trend lines are NOT evidence. The data can be evidence, but not necessarily evidence of what it is claimed to be evidence for.
Trends are not causes, they are effects. Past trends did not cause the present data. Present data trends will not cause future data.
If your data needs to be run through a statistical software package to determine a “trend”, then I would suggest that you need to do more or different research on your topic, or that your data is so noisy or random that trend may be irrelevant.
Assigning “significance” to calculated trends based on P-value is statistically invalid.
Don’t draw trend lines on graphs of your data. If your data is valid, to the best of your knowledge, it does not need trend lines to “explain” it to others.
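The point in the list above that a “trend” depends on the definition one uses can be sketched as follows. Three common definitions are applied to the same ten numbers: an ordinary least-squares slope, a Theil-Sen style median of pairwise slopes, and a simple first-to-last difference. The values, the years and all three helper names are assumptions invented for illustration; none of this comes from the paper or from NOAA.

```python
from statistics import median

# Hypothetical yearly values, invented for illustration only.
values = [10.0, 12.0, 9.0, 11.0, 10.0, 12.0, 9.0, 11.0, 10.0, 16.0]
years = list(range(2010, 2010 + len(values)))


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def median_pairwise_slope(xs, ys):
    """Theil-Sen style estimate: median of the slopes between all pairs of points."""
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(len(xs))
              for j in range(i + 1, len(xs))]
    return median(slopes)


def endpoint_slope(xs, ys):
    """Change per year between the first and last points only."""
    return (ys[-1] - ys[0]) / (xs[-1] - xs[0])


ols = ols_slope(years, values)
theil_sen = median_pairwise_slope(years, values)
endpoint = endpoint_slope(years, values)

print("least squares   :", round(ols, 3))
print("median pairwise :", round(theil_sen, 3))
print("first to last   :", round(endpoint, 3))
```

On these made-up numbers the three definitions give three different answers for “the trend”, which differ by a factor of more than four between the largest and the smallest. Which one gets drawn on the graph is a choice made by the analyst or by the software's defaults.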
# # # # #
Author’s Comment Policy:
I always enjoy your comments and am happy to reply, answer questions or offer further explanations. Begin your comment with “Kip…” so I know you are speaking to me.
As a usage note, it is always better to indicate who you are speaking to, as comment threads can get complicated and comments do not always appear in the order one thinks they will. So, if you are replying to Joe, start your comment with “Joe”. Some online periodicals and blogs (such as the NY Times) are now using an automaticity to pre-add “@Joe” to the comment field if you hit the reply button below a comment from Joe.
Apologies in advance to the [unfortunately] statistically over-educated who may have entirely different definitions of common English words used in this essay and thus arrive at contrary conclusions.
Trends and trend lines are a topic not always agreed upon: some people think trends have special meaning, are significant, or can even be causes. Let’s hear from you.
# # # # #