SMOG, Fog and Similar Things
[Caption: The STS-92 Space Shuttle astronauts photographed upstate New York at sunset on October 21, 2000. Water bodies (Lake Ontario, Lake Erie, the Finger Lakes, the St. Lawrence and Niagara Rivers) are highlighted by sunlight (sun reflecting off the water surface), making for a dramatic and unusual regional view. The photograph was taken looking toward the southwest from southern Canada, and captures a regional smog layer extending across central New York, western Lake Erie and Ohio, and further west. The layer of atmospheric pollution is capped by an atmospheric inversion, which is marked by the layer of clouds at the top of the photograph. The astronauts were able to document this smog event from a variety of vantage points as they orbited over the northeastern U.S. and southern Canada. Source and License]
Reading a page of the Oxford Handbook of Medicine that mentions the Flesch Index and its applications to doctor-patient communication is what first got me interested in readability indices. I’ve since meandered through some excellent articles (here and here) on the subject and have decided to include scores below each of my future posts. Why? Well, ‘coz it’s fun! I’m generating these scores using the GNU ‘diction’ and ‘style’ utilities (featured in this great article and available for download here; in Ubuntu Linux, just install the diction package from the repositories). For those of you who’d like to add these stats without having to download anything, this website uses the same backends. The really fun thing about that website is that you can see your scores change dynamically as you type!
For a very brief overview of what these scores mean, here’s an excerpt from the ‘man’ page for ‘style’ (the man page is licensed under the GNU GPL):
The Kincaid Formula was developed for U.S. Navy training manuals; it ranges in difficulty from 5.5 to 16.3. It is probably best applied to technical documents, because it is based on adult training manuals rather than school book text. Dialogs (often found in fictional texts) are usually a series of short sentences, which lowers the score. On the other hand, scientific texts with many long scientific terms are rated higher, although they are not necessarily harder to read for people who are familiar with those terms.
Kincaid = 11.8*syllables/wds+0.39*wds/sentences-15.59
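To make the arithmetic concrete, here is a minimal Python sketch of the Kincaid formula as quoted. The function name and the sample counts are my own and purely illustrative; the ‘style’ utility does its own counting of syllables, words and sentences, which this sketch does not attempt.

```python
def kincaid_grade(syllables: int, words: int, sentences: int) -> float:
    """Kincaid grade from raw counts, using the formula quoted above."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 11.8 * syllables / words + 0.39 * words / sentences - 15.59


# Hypothetical sample: 100 words, 150 syllables, 5 sentences.
print(round(kincaid_grade(150, 100, 5), 2))  # 9.91
```

With those made-up counts, the syllable term contributes 17.7 and the sentence-length term 7.8, so longer words matter far more than longer sentences here.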
Automated Readability Index
The Automated Readability Index is typically higher than Kincaid and Coleman-Liau, but lower than Flesch.
ARI = 4.71*chars/wds+0.5*wds/sentences-21.43
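Since ARI needs only characters, words and sentences, it can be computed straight from text without any syllable guessing. The sketch below does that with deliberately simple counting rules. The regular expressions and the helper names are assumptions of mine, not how ‘style’ tokenizes, so expect its numbers to differ.

```python
import re


def count_text(text: str) -> tuple[int, int, int]:
    """Return (characters, words, sentences) using deliberately simple rules."""
    # A "word" is any run of letters, digits or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    chars = sum(len(w) for w in words)
    # Each run of ., ! or ? counts as one sentence end; assume at least one sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    return chars, len(words), sentences


def ari(text: str) -> float:
    """Automated Readability Index, using the formula quoted above."""
    chars, words, sentences = count_text(text)
    if words == 0:
        raise ValueError("text contains no words")
    return 4.71 * chars / words + 0.5 * words / sentences - 21.43


sample = "The cat sat on the mat. It was happy."
print(count_text(sample))  # (27, 9, 2)
```

Very short, simple samples like this one give a negative ARI, which is a reminder that the formula was not designed for a couple of toy sentences.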
The Coleman-Liau Formula usually gives a lower grade than Kincaid, ARI and Flesch when applied to technical documents.
Coleman-Liau = 5.89*chars/wds-0.3*sentences/(100*wds)-15.8
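A quick numerical check of the Coleman-Liau line is worthwhile, because the sentence term as printed divides by 100*wds and therefore contributes almost nothing. The sketch below evaluates the formula exactly as printed, next to the form more commonly quoted elsewhere (0.0588*L - 0.296*S - 15.8, with L letters and S sentences per 100 words). Treating the printed version as a possible slip is my assumption, not something the man page says, and the counts are invented.

```python
def coleman_liau_as_printed(chars: int, words: int, sentences: int) -> float:
    """The formula exactly as it appears in the excerpt above."""
    return 5.89 * chars / words - 0.3 * sentences / (100 * words) - 15.8


def coleman_liau_common(chars: int, words: int, sentences: int) -> float:
    """The commonly quoted form: 0.0588*L - 0.296*S - 15.8 (per 100 words)."""
    letters_per_100 = 100 * chars / words
    sentences_per_100 = 100 * sentences / words
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


# Hypothetical sample: 450 characters, 100 words, 5 sentences.
print(round(coleman_liau_as_printed(450, 100, 5), 2))
print(round(coleman_liau_common(450, 100, 5), 2))
```

On these counts the two versions differ by about a grade and a half, almost entirely because of the sentence term.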
Flesch Reading Ease formula
Developed by Rudolf Flesch in 1948, the Flesch Reading Ease formula is based on school texts covering grades 3 to 12. It is widespread, especially in the USA, because it is computed easily and produces good results. The index ranges from 0 (hard) to 100 (easy). Standard English documents average around 60 to 70. Applying it to German documents gives bad results because of the different language structure.
Flesch Index = 206.835-84.6*syll/wds-1.015*wds/sent
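The tricky input to the Flesch formula is the syllable count. A rough sketch, assuming a crude "one syllable per run of vowels" heuristic (my own shortcut, not the method ‘style’ uses), looks like this:

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per run of vowels (y included), minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease, using the formula quoted above."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        raise ValueError("text contains no words")
    # Each run of ., ! or ? counts as one sentence end; assume at least one sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 84.6 * syllables / len(words) - 1.015 * len(words) / sentences


print(count_syllables("readability"))  # 5
print(flesch_reading_ease("The cat sat on the mat."))
```

The heuristic overcounts words with a silent final e (it gives "made" two syllables), so scores from this sketch will tend to come out a little lower than a careful count would. Very simple text can also land above 100, as the toy sentence here does.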
The Fog index was developed by Robert Gunning. Its value is a school grade. The “ideal” Fog Index level is 7 or 8. A level above 12 indicates the writing sample is too hard for most people to read. Texts less than 100 words will not produce meaningful results. Note that a correct implementation would not count words of three or more syllables that are proper names, combinations of easy words, or made three syllables by suffixes such as –ed, –es, or –ing.
Fog Index = 0.4*(wds/sent+100*((wds >= 3 syll)/wds))
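Here is a small sketch of the Fog formula working from counts, with a helper for the 100-word caveat mentioned in the excerpt. The names are hypothetical, and the caller is assumed to have already counted "complex" words; the exclusions for proper names, compound words and suffixes are not implemented.

```python
MIN_WORDS = 100  # the excerpt says shorter texts give no meaningful result


def fog_index(words: int, sentences: int, complex_words: int) -> float:
    """Fog Index from counts; complex_words = words of three or more syllables."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 0.4 * (words / sentences + 100 * complex_words / words)


def is_long_enough(words: int) -> bool:
    """True when the sample meets the 100-word minimum."""
    return words >= MIN_WORDS


# Hypothetical sample: 100 words, 5 sentences, 10 complex words.
if is_long_enough(100):
    print(round(fog_index(100, 5, 10), 1))  # 12.0
```

That sample sits right at the level of 12 beyond which the excerpt calls a text too hard for most readers: 20 words per sentence plus 10 percent complex words.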
The Lix formula developed by Björnsson from Sweden is very simple and employs a mapping table as well:
Lix = wds/sent+100*(wds >= 6 char)/wds
Lix index | School year
34 to 38 | 5
38 to 41 | 6
41 to 44 | 7
44 to 48 | 8
48 to 51 | 9
51 to 54 | 10
54 to 57 | 11
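The mapping table lends itself to a simple lookup. Below is a minimal sketch using Python's bisect module; how a value that falls exactly on a boundary is treated, and returning None outside the table, are my assumptions, since the excerpt does not say.

```python
from bisect import bisect_right
from typing import Optional

# Index boundaries from the table; school years 5 to 11 sit between them.
LIX_BOUNDARIES = [34, 38, 41, 44, 48, 51, 54, 57]
FIRST_SCHOOL_YEAR = 5


def lix(words: int, sentences: int, long_words: int) -> float:
    """Lix from counts; long_words = words of six or more characters."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return words / sentences + 100 * long_words / words


def school_year(lix_value: float) -> Optional[int]:
    """Map a Lix value to a school year, or None if it falls outside the table."""
    slot = bisect_right(LIX_BOUNDARIES, lix_value)
    if slot == 0 or slot == len(LIX_BOUNDARIES):
        return None
    return FIRST_SCHOOL_YEAR + slot - 1


print(school_year(41.9))  # 7
```

A Lix of 41.9 falls between 41 and 44, which the table assigns to school year 7.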
The SMOG Grading for English texts was developed by McLaughlin in 1969. Its result is a school grade.
Grading = square root of (((wds >= 3 syll)/sent)*30) + 3
It was adapted to German by Bamberger and Vanecek in 1984, who changed the constant +3 to -2.
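Both variants of the SMOG grading fit in one small function. This is only a sketch under the reading that the constant is added outside the square root; the function name and the `german` flag are hypothetical, and the counts are invented.

```python
import math


def smog_grade(complex_words: int, sentences: int, german: bool = False) -> float:
    """SMOG grade from counts; complex_words = words of three or more syllables."""
    if sentences <= 0:
        raise ValueError("need at least one sentence")
    # +3 for English (McLaughlin), -2 for the German adaptation.
    constant = -2 if german else 3
    return math.sqrt(30 * complex_words / sentences) + constant


# Hypothetical sample: 12 complex words in 10 sentences.
print(smog_grade(12, 10))               # 9.0
print(smog_grade(12, 10, german=True))  # 4.0
```

The factor of 30 scales the count to a 30-sentence sample, so 12 complex words in 10 sentences is treated like 36 in 30, whose square root is 6.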
Having just learned a lot of computer-related stuff lately (LAMP server basics among them) and reflecting on this piece of exciting news, I guess there isn’t anything medical on my mind right now! So with that I end this post.
Do send in your comments!
Readability scores for this post:
Flesch Index: 68.1/100 (plain English)
Fog Index: 11.7
Lix: 41.9 = school year 7
Copyright © 2006 – 2008 Firas MR. All rights reserved.