Monday, August 31, 2020

60 Years of Science - Part 20

 This post is the next in a series that dates back several years.  In fact, it's been going on for long enough that several posts ago I decided to upgrade from "50 Years of Science" to "60 Years of Science".  And, if we group them together, this is the twentieth main entry in the series.  You can go to https://sigma5.blogspot.com/2017/04/50-years-of-science-links.html for a post that contains links to all of the entries in the series.  I will update that post to include a link to this entry as soon as I have posted it.

I take Isaac Asimov's book "The Intelligent Man's Guide to the Physical Sciences" as my baseline for the state of science when he wrote the book (1959-60).  In this post I will review two sections, "Internal-combustion Engines" and "Radio", both from the chapter titled "The Machine".

In "Internal-combustion Engines" Asimov starts out with an observation.  Electricity displaced petroleum when it came to illumination.  He doesn't mention it but the modern Oil industry started out by producing kerosene to fuel lamps.  Lamps could produce much more ight than candles could.  Before kerosene, however, "lamp oil" consisted of several different flammable liquids.

A common one was rendered from Whale blubber.  Other sources were similarly expensive and difficult to procure in large volumes.  Kerosene revolutionized all this.  Once the Oil drilling, refining, and transportation infrastructure got built up, kerosene lamps quickly displaced candles.  They also displaced other forms of lamp oil and made it possible for the first time for people of modest means to illuminate their homes after the sun went down.  The famous Rockefeller "Standard Oil Monopoly" was a kerosene monopoly, not a gasoline monopoly.

Electricity eventually displaced the kerosene lamp.  It also displaced synthetic natural gas, which was becoming available in medium to large cities at about the same time Edison invented a practical electric light bulb.  Seattle, the city I live in, has a long-since-shut-down "gas works" facility that produced synthetic natural gas (and large amounts of very toxic pollution) for decades.

Returning to Asimov, he notes that with transportation the change went the other way.  There, petroleum displaced the alternatives rather than the other way around.  Internal-combustion engines are the ideal match for petroleum.  The earliest internal-combustion engines, however, predate petroleum's wide availability.  Designs dating back to the beginning of the nineteenth century used turpentine vapors (a plant based product) or Hydrogen for fuel.

Otto built the first "four-cycle" engine in 1876.  Then, as now, the first or "intake" cycle consists of pulling fuel and oxygen into the "cylinder", a cylindrical enclosed space, by means of a movable wall called the "piston".  During the "compression" cycle the piston compresses the fuel-air mixture.  The fuel then ignites and pushes the piston back down in the "power" cycle.  The final cycle "exhausts" the burned fuel-air mixture by pushing it out of the cylinder.  The engine is now set up to do it all over again.

The up-and-down motion of the piston can be turned into rotary motion by the addition of the proper linkage and a "flywheel" to even things out.  This is the basic design of the "one cylinder" internal-combustion engine.  It can be adapted to the "diesel" design (the fuel ignites on its own) or to the design commonly used in gasoline engines, where a "spark plug" is added to ignite the fuel at the right time.

Clerk almost immediately figured out that multiple cylinders, ganged together and organized so that each cylinder fired at a convenient time, would smooth things out.  Remember that only one of the four cycles produces any power.  Engines where the number of cylinders is a multiple of four are the easiest ones to optimize.  Six-cylinder engines quickly became common, however, as engineers were soon able to come up with designs that worked surprisingly well.  In fact, any number of cylinders can be made to work if you are clever enough.

It took a while to figure out how to force the explosion at the right time.  But by 1823 electricity-based designs became common.  A "storage battery" (see earlier posts for the details) could provide sufficient power.  The now common "lead-acid" design, still used for modern car batteries, was invented by Planté in 1859.  An "induction coil", actually a specialized type of transformer (again, see earlier posts for details), boosted the battery's modest voltage to the high voltage needed to get the job done.

It is easy to make a spark when you have even a small amount of high voltage electricity available.  Spark plugs are sophisticated devices because they have to handle high voltages while simultaneously standing up to the harsh conditions produced when the fuel ignites inside the cylinder.

Once such an engine is running it will continue to run as long as fuel and air continue to be supplied to it.  But it doesn't "start", make the transition from not running to running, without help.  The first method employed to provide the necessary initial impetus was to utilize the muscles of the operator.

Initially, a simple crank was provided to "turn the engine over".  If turned over vigorously enough the engine would take over and keep running on its own.  At the time the book came out, lawnmowers, outboard motors, and many other devices that incorporated small internal-combustion engines, relied on this "hand cranking" technique.  Now, they rarely do.

"Self starters", specialized electric motors powered by the same storage battery that was used to power the spark plugs, were soon invented but Asimov does not give a date.  Automobiles that came with a self starter as standard equipment became ubiquitous by the '30s.  The incorporation of a self starter has now spread to pretty much everywhere internal-combustion engines can be found.  The devices I listed above now usually come with a self starter as standard equipment.  The only exception I am familiar with is the chain saw.

In the case of cars, the whole "start your engine" process has evolved to the point where it is now completely computer controlled.  You can't "hand crank" a car.  You can't even start one by manually operating the self starter.  With any make or model of new car, the driver pushes the Start/Stop button and the computer does the rest.  With hybrid cars the computer starts and stops the internal-combustion engine whenever it decides that action is appropriate.  It doesn't even consult the driver, let alone cede any control over any part of the process or its timing.

Widening our focus from the engine to the whole vehicle, Asimov notes that Daimler built the first practical "horseless carriage" in 1885.  But, to catch on, they had to be cheap.  And that required developing the capability to do "mass production", produce a large number of essentially identical devices inexpensively.

Whitney was the first person to figure out how to do this.  He is known primarily for his invention of the "cotton gin".  In this case "gin" was short for "engine".  And at the time "engine" meant any complex mechanical device.  Whitney's "gin" was capable of inexpensively removing foreign matter, like the non-cotton parts of the cotton plant, from freshly picked cotton so that it could then be turned into thread.  The hand process the "gin" replaced was slow and, therefore, expensive.

Asimov correctly observes that, significant though this invention was, working out what was necessary to move to mass production was far more important.  The device Whitney cut his mass production teeth on was the musket.

Up to this time muskets were hand made one at a time by gunsmiths.  That meant that, if a part needed to be replaced, then the replacement part had to be hand made to fit that specific musket.  And muskets of the time frequently needed parts replaced.

Whitney hit on the idea of "interchangeable parts".  Make all the muskets in such a way that all the parts of each musket were essentially identical.  That would mean that the replacement for a part that had broken could be gotten by cannibalizing that part from another musket, one on which a different part had broken.

Alternatively, the quartermaster could stock a modest number of spare parts.  As long as he had a few of each part the musket used on hand, any broken musket could be quickly repaired.  Before interchangeable parts, he would have had to keep a duplicate of each musket in the unit in stock in order to guarantee that he had all the replacement parts that might be needed.

It turned out that parts needed to be manufactured to a very high precision.  Whitney figured out how to do that.  He also solved a myriad of other problems.  The U. S. Army was the first army to be supplied with muskets made from interchangeable parts.  The idea quickly caught on with other armies and later spread elsewhere.

Asimov skips over Benz, the first to produce cars commercially.  But a Benz car of the period was quite expensive.  Ford was the first to successfully mass produce cars.  This required figuring out how to make a car out of interchangeable parts.  It also required figuring out how to do other things.

Ford and his crew eventually figured it all out but not the first time he tried.  He built his first car in 1892.  But it took him several tries to successfully produce a high quality, low cost car that could be mass produced using interchangeable parts.  He finally succeeded in 1909 with the "Model T".

He kept introducing successive innovations, all designed to drive down costs and drive up production rates.  He introduced the idea of specialization: a worker would perform only one small step in the overall process.  He introduced the "moving assembly line" so that a worker did not have to waste time by moving around.  He introduced "jigs", customized tools that made it quicker and easier to perform a specific task.

 By the '20s he was able to sell a Model T for less than $300 and still make a profit.  At that price point a car became cheaper to own and operate than a horse.  Others, in the auto industry and elsewhere, quickly followed suit.

Meanwhile, in 1892 Diesel introduced changes to the internal-combustion engine that made it simpler, and therefore cheaper to make, and that allowed it to run on a less expensive fuel, what we now call Diesel fuel.

There are disadvantages to the Diesel design.  But the durability and economy of the Diesel engine in heavy duty applications has made it the preferred option for commercial vehicles.  That was true when Asimov wrote the book.  It is still true today.

The internal-combustion engine made heavier-than-air aircraft practical.  The first step was to understand how to fly.  The "glider", an unpowered craft, came first.  The early pioneer in this area was Lilienthal.  But he died in a crash in 1896.  The next person to take up the mantle of the scientific investigation of flight was Langley, for a long time the head of the Smithsonian Institution.

Langley was the first to try to extend what he learned from his glider work to the development of a powered airplane.  He failed in spite of receiving substantial funding and support.  We all know that the Wright brothers succeeded where Langley failed.  Asimov opines that Langley might have succeeded had he gotten additional funding.  But the Wrights found that some of his data was wrong.

Problems trying to use Langley's data eventually led the Wrights to redo much of his work.  Along the way they built their own wind tunnel and used it in various ingenious ways to advance the state of the art in aeronautical engineering.  One of the things this research turned up was the inadequacy of the internal-combustion engines then available.  Their power-to-weight ratio was too low.  So the Wrights designed and constructed their own engine with a better power-to-weight ratio.

They also investigated how an airplane could be controlled.  The "Wright Flyer", the airplane that flew on December 17, 1903, was very difficult to fly.  Over subsequent years they developed many improvements that made later designs much easier to fly and maneuver.

All this was substantially in advance of what Langley had been able to achieve.  It turned out that he had been much farther from success than most people, then and now, assumed.  Frankly, the Wrights were much better scientists than Langley was.  And they were able to make their many scientific advances on a shoestring budget.

Others, but not Langley, were able to build on the work of the Wright brothers.  Curtiss, a man with a motorcycle background, was able to build much better engines than the Wrights, for instance.  The Wrights were at the forefront of the design of control systems for airplanes until about 1910, but after that others surpassed their best efforts.

Part of the reason for this was that the Wrights held what they knew closely and were unwilling to cooperate with other people.  Their paranoia was justified to a considerable extent.  Curtiss, for instance, was one of many who built on the work of the Wright brothers but did not want to pay them for the privilege.  But, in the end, this paranoia was one reason others were eventually able to surpass them.

The biggest impetus to the improvement of aircraft design, however, was World War I.  Governments quickly figured out that better airplanes were critical to their War efforts and threw previously unimaginable sums of money around. As a result, the airplane of 1919 was quite different from the airplane of five years earlier, at least in Europe.

America's late entry into the War sidelined American airplane companies, including the Wrights' company, during this critical period.  That left them far behind their European counterparts.  In spite of its wartime success, the airplane was generally seen as the stuff of daydreams, or a military tool, or a toy for the idle rich.  That all changed when Lindbergh flew from New York to France in 1927.

The Atlantic had already been successfully crossed when he made his flight.  But Lindbergh's flight was what caused ordinary people to think differently about flying.  This change in attitude enabled the introduction of airlines and scheduled commercial flights.  But flying was still seen as the plaything of the rich and famous.

The innovation that eventually changed this was another wartime innovation.  During World War II the Germans were the first to develop a practical "jet" engine.  (The first "jet" engine was created in 1913 by Lorin, but it was not practical to use it for anything.)  The jet powered ME-262 fighter could literally fly rings around anything else in the sky.  But the Germans were unable to produce enough of them to affect the outcome of the War.

Experimentation leading to innovation resulted in rapid improvements.  At the time of Asimov's book this had not yet translated into any significant penetration of the commercial airliner market by jets.  The "queen of the skies" at the time of the book was an airliner that featured four propellers driven by piston engines.  Planes of this type were being manufactured by several different companies.

The first serious effort at a commercial jet powered airliner was the de Havilland Comet.  The jet engines worked fine, but the plane had a design flaw that caused it to literally crack open and fall from the sky.

In a bit of bad luck for de Havilland, the first few crashes happened over deep water and the aircraft could not be recovered and examined.  So people knew the planes were crashing but not why.  Eventually a crashed plane was recovered and examined, and the flaw was quickly identified.  But by then it was too late for de Havilland and the Comet.

The first company to successfully crack the jet airliner puzzle was Boeing, with its 707.  In an early public demonstration Boeing's chief test pilot put the plane through a complete barrel roll in front of 50,000 witnesses.

The maneuver was NOT authorized by senior management but quickly became the stuff of legend.  It convinced the public and airline executives that the plane was safe and sturdy.  In the wake of the Comet fiasco winning over both groups was critically important to the success of the plane.  And it was very successful.

Jet engines are simple.  That makes them reliable and easy to maintain.  Both keep costs down.  A jet engine is also much more fuel efficient than a piston engine driving a propeller.  The jet allowed the airline industry to introduce the era of cheap air fares.  And, like the Model T, that put flying, either for business or pleasure, within the price range of average people and cost conscious companies.

Since the introduction of the 707 much has changed and much has stayed the same.  Current generation jet powered airliners fly at about the same speed the 707 flew.  (The Concorde flew much faster, at supersonic speed, but was never a commercial success.)

Jet engine design has evolved to substantially improve their efficiency.  Raising the operating temperature (see previous posts for details) increases efficiency.  "High bypass" designs have also improved efficiency in ways I admit to not really understanding.

A modern jet also looks very similar to a 707, a design that is now 60 years old.  The most obvious change is to add "winglets" to the tips of the wings.  These change the way air flows over the wings in ways that I again don't understand.  But again, they work.

The other change is in materials.  The main material used to make the 707 was aluminum.  "Carbon fiber" is slowly displacing aluminum as the material of choice.  It has a better strength-to-weight ratio, and that keeps the overall weight of the plane down.  A lighter plane is a cheaper plane to operate.

Carbon fiber, however, is the material of the future, not the present.  Eventually it will be the main material used to construct commercial airplanes.  But right now most commercial airplanes currently in production use far more aluminum than carbon fiber.  Only a few reverse the ratio.  On to "Radio".

Asimov starts the chapter off with Hertz.  In 1888 he proved the existence of radio waves by transmitting a signal from one place to another and successfully detecting it.  The distance involved was modest and the equipment was extremely primitive.  But it proved that the equations Maxwell had published twenty years earlier described a real phenomenon.

Hertz was able to go beyond simple detection.  He was able to determine some of the characteristics of what we now call "radio waves".  (For a time they were called "Hertzian waves" for obvious reasons.)  He was able to determine that whatever it was had peaks and troughs like waves.  He could also measure the wavelength.  It turned out to be far larger than the wavelength of light.

In 1890 Branly used improved apparatus to transmit information over a distance of 150 yards.  Lodge was able to make more improvements.  He succeeded in transmitting Morse code using radio waves.

Marconi made further improvements and transmitted Morse code over ever greater distances.  First, he managed 9 miles in 1896.  He successfully sent a signal across the English Channel two years later.  Finally, he was able to transmit a message, which I believe consisted of a single letter, across the Atlantic in 1901.  All this explains why the British call it "wireless telegraphy".  They also call a radio receiver a "wireless".

Marconi was the first to work out how to confine a radio signal to a narrowly constrained "frequency".  (He got a Nobel in 1909 for this work.)  If you know the frequency, you can calculate the wavelength, and vice versa.  For historical reasons some parts of the world use "frequency" and other parts of the world use "wavelength" to denote the same thing, namely the place in the spectrum a particular signal resides.  Wavelength, the term generally preferred by scientists, is very slowly winning out over frequency.
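
For the record, the conversion between the two is simple: wavelength is just the speed of light divided by frequency, and vice versa.  Here is a little illustrative snippet (mine, not Asimov's) showing the arithmetic:

# Frequency and wavelength are two ways of naming the same spot in the
# spectrum: wavelength = speed of light / frequency, and vice versa.
SPEED_OF_LIGHT = 299_792_458  # meters per second

def to_wavelength(freq_hz):
    """Convert a frequency in hertz to a wavelength in meters."""
    return SPEED_OF_LIGHT / freq_hz

def to_frequency(wavelength_m):
    """Convert a wavelength in meters to a frequency in hertz."""
    return SPEED_OF_LIGHT / wavelength_m

print(to_wavelength(1_000_000))  # a 1000 kHz AM station -> roughly 300 meters
print(to_frequency(300))         # a 300 meter wavelength -> roughly 1 MHz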

Fessenden developed the equipment necessary to generate what we now call an AM (Amplitude Modulation) signal.  The signal is transmitted on a single fixed frequency and its amplitude, loudness, is modulated.  He used this equipment to broadcast both words and music on Christmas Eve in 1906.  This marked the first recognizably modern radio broadcast.

Asimov then goes on to describe the internal workings of "vacuum tubes".  Vacuum tubes were ubiquitous at the time of the book's publication.  They are now restricted to a few specialty uses.  They have generally been supplanted by various "solid state" technologies.  I am going to skip all this, other than to note that these developments were critical to making radio practical in those early days.

One critical capability vacuum tubes enabled was "amplification", making a weak signal louder without changing its other attributes.  Amplification is obviously important if you want to create a strong signal suitable for broadcast.  It is also critical for raising the extremely low level of the signal an antenna placed at a distance from the transmitter picks up.  It takes a substantial amount of amplification to turn this signal into something useful.

In a similar (but harder to explain) manner, vacuum tubes could be used to perform all of the other steps necessary to take the output of a phonograph, for instance, prepare it for broadcast by an AM radio station, and then broadcast it.

On the receiving end, the signal originating from a single radio station needs to be selected while the signals from all the other radio stations are rejected.  The "radio frequency" part of the signal must then be stripped off and the result amplified enough to be able to drive a speaker at sufficient intensity that we can clearly and easily hear and enjoy whatever is on the record the radio station is playing.

In the early days, pretty much all radios used AM.  It makes the fewest demands in terms of the complexity and sophistication of the circuitry necessary to handle it.  That kept equipment simple and costs manageable.  The first widespread use of radio was by the various navies of the world during World War I.  For the first time in naval history it was possible to communicate with a ship at sea in real time.

At the time the equipment was expensive to make and difficult to use.  Only naval vessels could afford it.  But technology marches on.  Easy to use and relatively inexpensive equipment became available in the late '20s.  By the mid-'30s costs had dropped enough that even Depression-ravaged households could afford a radio.

The very popularity of radio soon exposed a weakness of AM radio.  It was not that good at providing a solid, noise free signal in many circumstances.  This led to the development of FM (Frequency Modulation).  With FM, signal strength stays constant.  The frequency the signal is broadcast on is varied instead.

This provided a much more robust method but it also required substantially more complex equipment.  That made the equipment more expensive and put it out of range for the average consumer.  It was not very popular at the time Asimov wrote the book.

That changed in the following decade.  FM equipment got better and the price dropped.  FM was able to provide a much clearer and realistic sound.  It was also quickly adapted so that shows could be broadcast in stereo.

At first it was only used by symphonies and other "high brow" entertainment where the better signal quality was important.  But aficionados soon found that rock and roll and other "low brow" music also sounded much better on FM than it did on AM.

Soon, all forms of popular entertainment migrated to FM and AM became a wasteland.  This allowed "talk radio", where good sound quality was not very important but low operating costs were, to take over the AM band.

The advent of the Internet drove still another revolution.  Now many people never listen to the radio.  Everything that was available there is now available on the Internet and/or your smartphone.  Radio, both AM and FM, is still out there but nobody pays much attention to it.

After briefly mentioning FM Asimov moves on to television.  He notes that the first step was the "wire photo".  This was a technology that was used to transmit a picture from London to Paris in 1907.

A still picture was scanned line by line.  The variation in light and darkness along each line was transmitted over a telephone line or via radio.  On the other end a matching device would draw a line of varying intensity on a piece of photographic film.  When the result was developed and printed it yielded something resembling the original picture.

The process was slow and expensive.  It might take a minute or two to transmit a single picture.  Initially it was used only by newspapers and law enforcement.  It was too inconvenient for anybody else to bother with.  But the process contained all the ideas necessary to do television.  It just needed to be speeded up by orders of magnitude.  And, of course, the cost needed to drop by orders of magnitude too.

The sound part of the process was essentially radio, so it presented no new challenges.  It was the picture part that represented the challenge.  The toughest "picture" component that needed to be developed was the television camera.  Zworykin patented an "iconoscope" that got the job done in 1938.

Since it is a fancy vacuum tube I am going to skip over the details of how it worked.  But it was able to duplicate the "scan a picture a line at a time" process discussed above.  And it could scan a complete picture in a small fraction of a second.  It worked fast enough to be "moving pictures" compatible.

The television equivalent of the radio receiver already existed.  It was, you guessed it, another kind of vacuum tube.  As with the TV camera, it was capable of building up a complete picture from a series of lines in a small fraction of a second.

The "picture tube" had a relatively flat front face with a phosphorescent coating on the back of it.  Hit with the appropriate stuff, lines would glow at varying intensities resulting in the picture being reproduced.

Lots of other equipment was necessary to connect all of it together.  But RCA, then a giant electronics company, now an infrequently used brand name, was able to demonstrate a complete end to end system at the 1939 World's Fair in New York City.  World War II then diverted all high quality radio equipment to the War effort.  TV equipment definitely fell into that category.

But commercial TV broadcasting started up shortly after the War.  At the time TV was only capable of broadcasting in black and white.  Asimov makes reference to the advent of color TV in the mid '50s.  But what was then available was more of a proof of concept than actual reality.  Color TV actually took off in a noticeable way in the mid-'60s.  And all of this was in what we would now call lo-fi - low fidelity.

The TV pictures of the time, initially in black and white, later in color, were good enough to be usable.  You could see what was being shown.  But the picture was not very sharp and the colors were not very well defined.  That was the limit of the equipment of the time.

We now talk of pixels and lines of resolution.  A first generation "IBM PC" home computer used a custom "display" that supported a screen resolution of 640x480.  This meant that the picture it displayed consisted of 480 lines, each of which had 640 pixels (separate dots of picture information) in it.  Theoretically, TV did better.  It featured 525 lines.  But this was a cheat.  The scan lines were "interlaced".

In one scan only the even lines were processed.  In the next scan only the odd lines were processed.  So each individual TV picture only had about 260 lines in it.  The "aspect ratio" was 4:3.  The picture was four units wide and three units high.  If we apply this same 4:3 ratio we get about 350 pixels per line.  So think of a TV picture from this era as having a resolution of 350x260 and you get a more accurate idea of the true situation.

As a cross check, if we take 480 and multiply it by four then divide it by three we get 640.  So a 640x480 picture has the same 4:3 aspect ratio.  And this was by design.  PC makers wanted to reproduce what consumers were seeing when they looked at their TV.
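
Here is the same arithmetic as a little snippet, just to make the numbers above concrete (my own illustration, nothing more):

# The arithmetic behind the resolution comparison above.
def pixels_per_line(lines, aspect_w=4, aspect_h=3):
    """Given a line count and an aspect ratio, estimate pixels per line."""
    return lines * aspect_w // aspect_h

print(pixels_per_line(480))       # 480 lines at 4:3 -> 640 pixels per line
tv_lines = 525 // 2               # interlacing: each picture has ~260 lines
print(pixels_per_line(tv_lines))  # -> about 350 pixels per line, i.e. ~350x260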

Some early home computers did not require a custom made "display".  They could be hooked up to a standard, cheap, black and white TV set.  But the screens on these computers were not able to reproduce the 640x480 resolution the first generation IBM PC was capable of.

Of course, IBM made you buy a special display device that could handle a resolution of 640x480.  And I can tell you from personal experience that they cost significantly more than a black and white TV of similar size.  On the other hand, the display device connected to my IBM PC could easily show me 25 lines of text, each consisting of 80 characters.  The TV connected to a home computer, in contrast, could only display 40 characters on a line.  And the line count was significantly lower too.

A resolution of 640x480 represented the state of the art back then.  Things have gotten a lot better since.  Current PCs usually sport a screen resolution of 1920x1080.  If we do the appropriate math we find that the aspect ratio is 16:9.  That's the aspect ratio used by the HD (high definition) TV specification.  A side by side comparison of an actual TV from the '50s or early '60s with a modern screen would bear out the fact that we are comparing a resolution of 350x260 to a resolution of 1920x1080.  We've come a long way, baby!

Asimov then goes on to discuss videotape.  This is an extension of the audio tape that was discussed in earlier posts.  The idea is the same.  We just need to improve things enough so that video tape can handle the information load TV puts on it.  At the time the book came out professional video tape machines were available for use by networks and TV stations.  Over time this changed.  The video tape cassette and VCR (Video Cassette Recorder) were introduced.

In their time they represented a massive change.  When Asimov wrote his book you could go to a movie theater, buy a ticket, and watch what was showing on the schedule the theater operator chose.  Or you could turn your home TV on and watch what one of the three or four TV stations then operating in your area was showing.  And you had to watch it on the schedule the station manager chose.

The idea of saying "I want to watch movie X and I want to watch it now" was literally an impossibility.  The video cassette and the VCR changed all that.  Once you owned a VCR you could buy cassettes preloaded with a specific movie.  You could take it home and watch it any time you wanted to.  And you could watch it as many times as you wanted to without having to pay any additional money.  And, unlike TV, it was commercial free.

But wait!  There's more. You could also record a show off of TV then watch it at a later time.  This is what we now call "time shifting".  And you could re-watch it as many times as you wanted.  But wait!  There's still more.  Stores opened up that would rent you a video for a couple of bucks. Let's face it.  Most shows and movies are only worth a single viewing.  The rental was usually only good for a day or two.  But you could rent and watch any video the store had in stock.  And a typical store stocked many thousands of titles.

This put the consumer in charge for the first time.  They had much more freedom to watch what they wanted when they wanted.  Consumers were no longer chained to the schedule dictated by nearby movie theaters or TV stations, be they local or cable.  This change truly represented a revolution.

But wait!  There's porn.  Up to this period of time it was hard to make real money in the porn business.  All the access channels were successfully blocked by various religious and other organizations.  Most people's idea of "the worst of the worst" at the time was Playboy magazine.  Then things opened up a little and seedy theaters in medium to large cities started showing porno movies.  The money wasn't that good but it was much better than what had come before.  Video tape and the VCR changed all that.

The porn companies quickly shifted to releasing their product on video cassettes.  And the public snapped it up.  All of a sudden there was big money in porn.  Video tape rental stores quickly figured this out and added "porn" sections.  It was not long before they were making half their revenue from porn.  It turned out that there had been a giant untapped market for porn.

The big problem with VCRs and video cassettes was that their picture quality was terrible. In many cases it was poorer than broadcast TV.  That didn't matter in the early days.  The fact that people could now watch what they wanted when they wanted was such a powerful force that people forgave the poor picture quality.

Then DVDs came along.  I am not going to dive into how the technology works. For our purposes I am just going to note that DVDs delivered much better picture and sound quality.  The era of the high end TV system was born.  At first people thought that the fact that you couldn't record your own DVD (you can now but you couldn't then) would be a deal breaker.  But it turned out that most people found the process of recording things more trouble than it was worth.

It turned out that people were mostly interested in buying and renting and not much into recording stuff themselves.  So the "direct to consumer" movie (and porn and everything else - by now you could buy or rent boxed sets of old TV shows) business continued to grow.

Everybody quickly switched from cassettes to DVDs.  The rental stores started with small DVD sections tucked into a corner.  Over the course of a few years they shifted to having large DVD sections.  It was the video cassette section that ended up tucked into a corner.  But that did not change the business in a significant way.

What did drive a big change was the Internet.  At first it did not affect video. It was too slow to handle it.  But that changed.  The speed of my first computer connection to the outside world was measured in hundreds of bits per second.  It quickly changed to thousands of bits per second then tens of thousands of bits per second.

The speed of my first "high speed" internet connection was measured in millions of bits per second.  At that point decent quality streaming video becomes possible.  I have had a one gigabit (billion bit) per second connection for a little more than a year now.  I may be foolish but I believe that will be as fast as I will ever need.

The widespread availability of high speed Internet has again revolutionized things.  If it's instantly available online why bother to own it or even rent it from a store?  Netflix and its competitors have turned us from people who buy or rent DVDs to people who stream from Netflix.

Again, in a certain sense, we have gone backwards.  If I own a DVD I can watch it forever whenever I want.  But many of the movies and TV shows on Netflix come and go.  They may be available for a couple of years but after that, they are gone.  Netflix may license them again at a later date.  And then again, they may not.

But it turns out that if Netflix has something I want to watch right now, it doesn't matter much to me (speaking as a typical customer) that they don't have everything I might want.  Movie studios, and keep in mind they make most TV shows too, used to make good money from selling DVDs.  They even made good money providing rental stores with product.

But not any more.  There used to be tens of thousands of rental stores.  Now, only a few remain.  Some people still buy the odd movie or boxed set of a TV show.  But that market has also dropped away to almost nothing.

Streaming is booming.  In fact, it has gotten completely out of hand.  It used to be that you could count streaming services on the fingers of one hand.  Now there are so many of them you can hardly keep track.  And this has resulted in a fragmentation of content.  This service has a few shows you might be interested in but most shows are on another streaming service.

If you want to have access to it all you have to subscribe to perhaps twenty services.  And that will still leave gaps in what you have access to.  And even if you sign up for lots of streaming services, something that costs lots of money, it is a complete PITA to keep track of which show is on which streaming service.

My best guess is that a lot of these services will go under.  They are all counting on other services going under leaving the field open for them to survive and thrive.  In reality, nobody knows how it is going to all shake out.

Back to Asimov.  He then tackles solid state electronics.  He goes into some detail about how transistors work.  His explanation is based on Germanium.  At the time Germanium seemed like the best material to make transistors out of.  But Silicon eventually won out as the material of choice.  (There are technical differences between Silicon and Germanium transistors.  But they aren't worth getting into.)

The book was written during the "discrete transistor" era, an era that ended up not lasting very long.  Asimov discusses how this type of vacuum tube can be replaced by that type of transistor.  For a while that's basically what happened.

But at about the time the book was coming out the "integrated circuit" was being invented.  It's just a method for simultaneously creating multiple transistors on a single small piece of silicon.  Connections between the various transistors can be created as part of the same overall process.  And that means that you can skip the later "solder everything together" step.

A few years later, a single LSI (Large Scale Integration) chip might have contained the equivalent of a thousand discrete transistors, all hooked together.  Now, the equivalent of tens of billions of transistors, complete with connections, are manufactured at once.  This is possible because the manufacture of an IC (Integrated Circuit) is essentially a photographic process.

Various "process steps" are needed.  But the important steps involved projecting a "mask", essentially the negative of a picture, onto the surface of the IC.  The light hitting the surface causes a chemical change to take place wherever the light is bright.  Using as many as 40 masks, carefully coating the surface with various chemicals at the right time and in the right sequence, and by taking other steps, a complex pattern consisting of various sub-patterns of materials with various properties is "etched" into the surface or near-surface area of the IC.

There is a pattern for where the "wire" material needs to be.  There is a pattern for where the insulator material needs to be.  There is a pattern for where each of the various components that go together to make a transistor "gate" need to be.  In the end there are many transistors connected by wires and insulated from other transistors and wires laid out on the surface of the IC.

What's important to know is that the cost of manufacturing a specific IC is not dependent on how many "features" there are on the surface of the IC.  The cost is in the number and type of "process steps" necessary.  If the manufacturing process for two different ICs follows the same recipe then an IC containing a million gates costs the same to make as an IC containing a billion gates.

As the component count goes up and as the "feature size" goes down the cost of building that "fab", the manufacturing facility capable of fabricating a particular family of ICs, might go up.  But hopefully you will be able to produce so many ICs using that fab that the cost of the fab itself ends up being a small part of the cost of producing any individual IC.

This "it costs the same to make an IC with a lot of components as it costs to make an IC with many fewer components" characteristic of IC manufacturing is the IC "secret sauce".  It explains why the price of IC based electronics doesn't go through the roof.  This is in spite of the fact that typically the new gadget contains much more complex ICs.  They enable the new model to do things the old model of a few years ago only dreamed of being able to do.

And that's where this particular installment comes to an end.

Saturday, August 22, 2020

COVID test pooling

 To be perfectly honest, this post exists to justify the writing of a computer program.  I have written two computer programs recently and, being weird, have thoroughly enjoyed the process.  I want to do more, but to do more I need a reason.  The reason just needs to be good enough to allow me to talk myself into writing the program.  Here's my new reason.

Pooling COVID tests is now a thing.  (In this post, if I say "COVID" I mean COVID-19 and not the other kinds of COVID that exist.)  The FDA has recently authorized a commercial lab to pool up to four samples when testing for COVID.  So what does that mean and why bother?  Let's start with the "why bother" part.

The consensus among the experts is that we are not testing enough.  What's enough?  You want to test everyone entering a hospital who might be COVID-positive so that you can treat them appropriately and so that you will know if hospital personnel need to take extra precautions to stay safe.  But that's the absolute minimum.

You also want to contain the spread of the virus because it is deadly to some people.  To do that you need to test the people who have come in contact with known COVID-positive people.  You also want to test people who come in contact with high-risk people.

Beyond that, hospitalization and other medical costs are very high for those who get seriously sick.  If you can find COVID-positive people early while they are not very sick maybe you can keep some of them from getting seriously sick.  If nothing else, this would help to minimize medical costs.

Then there are various behavioral measures.  No one likes any of them but the impact varies tremendously between measures.  It would be nice to know which measures work and which are a waste of time.  That means you need to know who is getting infected and how they got infected.  That requires testing.

And ideally you need to test people you don't suspect of being positive.  That's the only way you discover that something unexpected is happening.  The experts have crunched the numbers.  They know how much testing is needed and who should be tested if we want to get a handle on this.

We aren't doing either.  We aren't testing enough and we aren't testing the right people.  As a result the U. S. is far and away the country that is doing the worst job of containing the virus.  The devastating impact this has on the economy is just the beginning of the damage this is causing.

Everybody wants to see the introduction of medical measures that would help.  A vaccine that would completely block infection is what most people are focusing on.  But if we had measures that reduced the rate at which people need to be hospitalized, reduced the severity of the care they need once they get there, or reduced the death rate, any or all of these would help a lot.

But we have made at best modest progress on any of these measures.  And it looks like we aren't going to have access to any new game changing measures soon.  It could easily take months, maybe even a year or more, for the situation to improve substantially.

Testing is the first component in an effective containment strategy that is based almost entirely on non-medical measures.  You need to test everyone who might have the virus.  Then you need to isolate them so they can't spread it to anybody else.  You also need to do contact tracing, figuring out who that person has been in contact with so you can test them.

Right now, the U. S. is employing a "none of the above" strategy.  We aren't testing enough people.  We are not effectively isolating those who test positive.  We aren't contact tracing effectively.  All three of these things are much easier to do if you only have a few cases to deal with.  But we have more than 50,000 confirmed new cases per day and who knows how many additional unconfirmed new cases.

For the remainder of this post I am going to ignore much of what's going wrong and concentrate on how we can increase our testing capacity.  Right now, about half of the roughly 700,000 tests we are doing per day are done by commercial labs.

Demand is so high that it is taking them, on average, the best part of a week to return results.  I am going to skip over all the reasons why such slow turnaround is bad and concentrate on what could be done to increase capacity.  Hopefully, more capacity would result in quicker turnaround.

The obvious strategy is "more".  Just increase the capability of doing what we are currently doing so that we can do more of it.  That has been in the works since at least March and, while these labs have been able to ramp up capacity, they haven't been able to ramp it up fast enough to keep up with rising demand.  So we want to do as much of this as we can but we also need to think about trying other things too.

Before I delve into test pooling, the subject that justified my latest foray into computer programming, I am going to dive into biology.  I am going to discuss the biology of COVID.  It's a virus.  That means it can't exist and reproduce on its own.  It needs the cells of a plant or animal.  So let's dive into a little cell biology.

Cells are incredibly complex things.  I avoided Biology like the plague while I was going to school because everything in Biology is complex.  I am going to spare you as much as I can by only talking about a few key things about cells.  Cells have an inside and an outside.  The two are separated by a cell membrane.  Like everything else in Biology, the cell membrane is very complex all by itself.  One of its big jobs is to act as a gatekeeper.

Cells need to take stuff in if they are to do their thing.  They also need to excrete other stuff.  I am going to ignore the latter and only talk about the former.  There is lots of stuff outside the cell, much of it bad or dangerous.  One of the jobs of the cell membrane is to keep that stuff out while letting in the stuff the cell needs.  This is done by "receptors", specialized parts of the cell membrane.

A typical cell membrane has dozens of different types of receptors.  Each receptor has a specialization.  And they all use a "lock and key" system.  If a chemical has the right "key" it inserts it into the "lock" part of the receptor.  This causes the receptor to behave like a door.  Once activated, the door opens, sucks the key and whatever is attached to it into the cell, then immediately slams back shut.

Viruses typically have a "spike protein" poking out of them.  It's the "key" part.  The spike protein on the COVID virus fits the lock part of a receptor with the scientific nickname of "ACE2".  (Trust me -- you don't want to know what the real name of this particular receptor is.)  Different viruses have different keys, different spike proteins.  These keys operate the locks of different receptors.  That is one of the things that makes one virus different from another virus.

Once a virus finds a cell with the right lock it uses its key to enter the cell.  It then takes over the cellular machinery and reprograms it to make copy after copy of itself.  When enough copies have been made the cell literally bursts apart and that frees the many viruses that the cell has manufactured to roam around looking for another cell with the right receptor on its surface.

Viruses make us sick in two different ways.  Most obviously, a cell that drops what it would normally do so that it can instead make copies of a virus is not doing what it is supposed to do.  That's bad.

But the body has various systems that are designed to notice this sort of thing and react accordingly.  If these reaction mechanisms work the way we want them to then the body contains the damage and things go back to normal.  Obviously, the body can underreact and fail to contain the infection.  But the body can also overreact.

For instance, part of the reaction may be to raise the body's temperature, giving the patient a fever.  This often helps the body to fend off an infection.  But, if your body's temperature gets raised too high for too long, this "fever" response can kill you.  Inflammation, another common bodily reaction to invasion by an infectious agent, can also get out of control and kill you.  The list goes on.

This is how viruses work and COVID is no exception.  One of the things that makes COVID particularly hard to deal with is that there are ACE2 receptors on many types of cells in the human body.  COVID doesn't care what kind of cell it encounters.  It only cares whether it has an ACE2 receptor or not.

The two most common places COVID finds cells that have ACE2 receptors are cells in the lining of the nose and cells in the lungs.  But it sometimes finds itself attacking cells that are found in the liver, the kidneys, the walls of blood vessels, the brain, and more.  To paraphrase the famous bank robber "that's where the ACE2 receptors are".

Now let's apply what we now know to COVID tests.  The current "Gold Standard" test is shorthanded as the "PCR test".  PCR is a lab technique the test uses.  The technique can be tuned to an extreme degree to look for a specific kind of genetic material.  One bottleneck associated with PCR COVID tests is that only materials that have been fine-tuned to respond to COVID and nothing else can be used.

A swab collects some goo from the walls of your nose.  Deeper is better for some reason so they swab deep into your nasal passages.  This swab is then subjected to the COVID specific version of the PCR process.

The result is that, if the swab contains COVID genetic material, then that material will be amplified to millions of times its original concentration.  At that point it is easy to detect.  And the PCR process is unbelievably specific.  It only amplifies the specific genetic material it has been customized to amplify and nothing else.
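
The way PCR gets there is by roughly doubling the amount of target material with each thermal cycle.  (That per-cycle doubling is the standard idealized description of the technique, not a detail from Asimov or from any particular test's documentation.)  A few dozen cycles is all it takes:

# Idealized PCR amplification: each thermal cycle roughly doubles the amount
# of the target genetic material.  Real runs are less than perfectly
# efficient, which is one reason labs typically run 30-40 cycles.
for cycles in (10, 20, 30):
    print(cycles, "cycles ->", f"{2 ** cycles:,}", "copies per starting copy")
# 10 cycles -> 1,024 copies per starting copy
# 20 cycles -> 1,048,576 copies per starting copy
# 30 cycles -> 1,073,741,824 copies per starting copy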

This process is complicated and takes time.  But it is a standard process.  Lots of labs use it routinely to study all kinds of things.  You need the right kind of machine.  You need the specific stuff that makes the process home in on COVID and nothing else.  But that's it.  Typically, the process takes a couple of hours.

So what's the problem?  It turns out that there are plenty of machines around.  But the supplies, often referred to as "reagents", are hard to come by.  And the machines can be used in lots of different ways.  If you are running COVID tests you are not running other kinds of tests, and vice versa.  But apparently, COVID testing bottlenecks are caused by reagent shortages and nothing else.

Commercial labs can't get the quantities of reagents they need so they ration.  And the way rationing manifests itself is in its effect on turnaround time.  They hold up processing swabs until they have the reagents they need.  I got a PCR based COVID test.  I was swabbed at 2:15 PM on a Tuesday and I viewed my result by entering a code into a web site at 9:30 AM the next day.

My test was processed locally and not by a commercial lab.  The lab that processed my test used the same equipment, reagents, and process that a commercial lab would, so the result was just as accurate and reliable.  But my test only had to travel from one place in Seattle to another place in Seattle.  There they had sufficient supplies on hand to process it immediately.

With commercial labs, samples typically need to be shipped across the country.  Fast turnaround in this environment is 24-48 hours.  That should be possible but currently is not.  And now that the commercial labs have a reputation for slow turnaround, fewer samples are being sent to them.  It's a vicious cycle at this point.

And if you have COVID in your nasal passages but nowhere else you are, at worst, only slightly sick.  The PCR test does not determine if you have COVID in your lungs, or anywhere else, for that matter.  There are other tests for that.  But before discussing them I want to talk about another "nasal" test.  It's the test the White House is using and it's called an "antigen" test.

The body's response to invasion by a virus or other "foreign invader" is complex.  The invader carries distinctive chemical markers, called "antigens", that the immune system learns to recognize.  The body's defensive chemicals, antibodies, are more or less customized to match those antigens.  You wouldn't want them attacking things that are supposed to be in your body.

Scientists know a lot about antigens and how the body's "antigen response" works.  And antigen-detecting chemicals can be manufactured in advance in the lab.  This means it is possible to build a chemical that "lights up" when it comes in contact with a particular type of antigen.  Done right, the antigens carried by the COVID virus itself are what should set this chemical off.  So they set about making such chemicals.

One thing they can do is to design the chemical so that it includes something that fluoresces when activated.  It is easy to detect light of a certain color even when there isn't much of it.  And the "color" doesn't even need to be visible to the human eye.  It can be, for instance, in the ultraviolet band.  As long as there is a gadget that can detect it at low levels we are good.

So an "antigen test" contains chemicals that have been customized to look for COVID antigens and have been designed to do something that is easily detected when the encounter happens.  The problem is that these tests are not 100% accurate.  They can be set off by antigens that are similar to COVID antigens but not only and exactly COVID antigens.

That apparently is what happened recently.  A governor tested positive according to the White House antigen test but tested negative according to a PCR test.  The advantages of antigen tests are that they are quick and cheap.  These kinds of problems are why they have not been widely adopted.  Should they be?  Maybe.  Maybe not.

Both the PCR and the White House test have other problems.  If a person has been infected but it's early days then both will give a negative result.  There isn't yet enough of what they are looking for to set them off.  They have another potential problem.  What if the disease has run its course?  Then there is no remaining virus and a PCR test will return a negative result.  I'm not sure whether the White House test will also come up negative in this situation or not.

These two tests also assume that everybody who has COVID has COVID viruses in their nasal passages.  That's mostly true.  But the virus can go straight to the lungs without transiting through the nasal passages.  This scenario is unlikely but not impossible.

In fact, if it was easy to contract COVID by touching surfaces there should be lots of paths to infection that do not involve the nose.  We are constantly bombarded with entreaties to continuously wash surfaces, hands, etc.  But very few cases have turned up where there is no COVID in the nasal passages.  That's solid evidence that the vast majority of COVID infections are caused by airborne transmission.

There is still another kind of test.  Like the others, it has its advantages and its disadvantages.  It is called an "antibody" test.  It is like the antigen based test in that it takes advantage of the complex response the body has to infection.  It looks for antibodies manufactured by the immune system to deal with infections.

Antigen, antibody, what's the difference?  And, while I am at it, let me throw another word at you, "enzyme".  For our purposes, these are all the same.  The specifics are quite different.  But they are similar in the ways we care about.  They are cheaper than a PCR based test.  They deliver an answer much more quickly.  Many of them use equipment and supplies that are widely available.  In other words, they scale up easily.  And lastly, they tend to be far less accurate than a PCR test.

I'll get back to them later but let me now return to the PCR test.  It is currently the most used test, in part because it produces an extremely reliable result.  And, because it is so accurate, only results from this test are included in the widely reported "number of COVID tests performed" number that news reports often feature.

And, if you have been following the news, the reported number of tests is trending in the wrong direction.  Experts agree that we are not performing nearly enough tests.  So we want this number to be increasing rapidly instead of decreasing.  What can we do?

That's where the computer program I wrote comes in.  One widely circulated idea is called "sample pooling".  Let's say you mix together some of the raw material from ten swabs into one blob, then you run a PCR test on it.  If the test comes back negative then you know immediately that all ten people are negative.

You have, in effect, multiplied the reach of performing a single PCR test by a factor of ten.  So why don't we set about pooling right away?  After all, it would immediately increase the effective number of PCR tests we can do by a factor of ten.

Well, things are only this simple if the test from the pooled sample comes back negative.  In the real world sometimes it will come back positive.  What happens then?  Is there still a benefit to sample pooling and, if so, how much?  The answer is "it depends".  And the dependency is complicated.  And that's why I wrote my computer program.

My program performs what is called a Monte Carlo simulation.  That's right.  The technique is named after the famous Casino at Monte Carlo, where the rich and famous go to gamble (and gambol) in Europe.  Why?  Because the technique is based on gambling.  In this case, instead of a roulette wheel we use a random number generator.

I use my computer program to answer "what if" questions.  In the real world we want to know who has COVID and who doesn't.  Going in we don't know which ones do and which ones don't.  We don't even know what percentage of a larger group are COVID-positive.  We have to test everybody to find out what the answers are.  But we currently can't test everybody.

Fortunately, I only wanted to find out if sample pooling is a good idea or not.  Additionally, it would be nice to find out which situations it is useful in and which situations it isn't.  I devised a computer program that shed some light on those questions.

In setting about writing my program I had an intuition that the usefulness of pooling would depend on what percentage of "patients" that we were "testing" were COVID-positive.  So I said "let's assume that X% are positive.  How well would pooling work in that situation?"  I wrote a computer program that would give me the answer to that question.

This may seem artificial but bear with me.  An area where I was able to make the program behave in a very realistic manner had to do with how I had it determine which "patients" were COVID-positive and which weren't.  I fed some parameters to it that allowed the program to calculate how many "patients" it would need.  Let's say that it calculated that it would need 1,000 "patients".

For each of these 1,000 "patients" I had the program individually compute a "random" number.  (I have a reference book that devotes 150 pages of dense prose to the business of having computers generate what are technically called pseudo-random numbers.  Let me leave it at this: the process I used was sufficient to the task.)  The "random" number was then manipulated in such a way that the likelihood of a "patient" coming up "positive" exactly matched the "Pct Pos" (percent positive) number that I had input.

So each "patient" was randomly assigned a status of either positive or negative.  But the percentage of "patients" in the pool as a whole that ended up being flagged as COVID-positive ended up close to the target percentage I had input. If the process is truly random then you rarely hit your target dead center.  So, for a pool of 1,000 patients, I would expect about 100 of them to come up positive, if my "Pct Pos" was 10 (10%).

This process mirrored the real world, where there is a certain amount of randomness associated with whether any specific person gets COVID.  And we have historical data that tells us what percentage of various large groups of people have ended up testing positive for COVID.

My program simulated this situation with a high degree of fidelity.  This method of applying a random number generator to individually model some attribute of each item in a group of items is what the Monte Carlo technique consists of.

I ran a bunch of scenarios through my program.  In each scenario I input specific values for the parameters (like "Pct Pos") that I was investigating.  That allowed me to try out a bunch of different scenarios to see what happened in each.

And I didn't run each scenario I investigated just once.  I wanted to see what happened "on average".  So, I ran each of my scenarios ten times and then averaged the results.  And I ran 21 different scenarios.  In all, I ran a total of 210 independent Monte Carlo simulations by the time I was done.

In each case, the program went through a setup stage where it created as many "patients" as it needed for a particular scenario.  It then modeled how things would turn out using sample pooling.  Specifically, it calculated how many tests would be required to determine the COVID status of all the "patients".  This number was then compared to how many tests would be required to just test everybody once.

There are a number of subtle effects that come into play, so the results were sometimes not what you would think they would be.  (This kind of unexpected behavior is why Monte Carlo simulation is a popular way of exploring any situation complex enough to be hard to analyze directly.)  But the simulations answered the basic question: "how much could we stretch our limited testing capacity by doing things this way or that way?"

For instance, in one scenario I assumed that the positivity rate was 10%.  Why?  Because it's a nice round number.  (It is also lower than the positivity rate we have seen in some real world cases.)  I used a "Pool size" of 20.  That means I was simulating mixing 20 samples together and then performing a single test on the resulting mixture.  Then I simulated 50 pools.  That meant I had 1,000 (20 x 50) simulated patients.

The program divided the "patients" up into 50 pools of 20 and then examined each pool individually to see if it contained at least one "patient" who had tested positive.  If so, then I assumed the real world equivalent would have caused that pool sample to come up positive.  Then the program counted up how many of the pools turned out to have at least one positive "patient" in it.

In the real world it would take fifty tests to check all the pools.  All the patients who were in a pool that came out negative could confidently be declared negative without any further testing.  So, at that point in our simulation we have done 50 tests.  But we are not done yet.

In the real world we would have had to go back and test every individual who was in one of the pools that came up positive.  The positive pool tests said that somewhere between one and twenty of the people in that specific pool were positive but it didn't tell us who was who.  (I had the program conveniently forget that it actually knew which "patients" were positive and which weren't because in the real world the test lab wouldn't have had this information.)  Thus, I had the computer program calculate that it would take another 20 tests for each "positive" pool.

Computers are good at math, so all this took the blink of an eye.  I had the program use this logic to tote up how many tests would be required for each of my simulated sets of patients.  I then compared this to the simple method of just testing all the patient samples individually in the first place.  The improvement was the amount by which the calculated number of tests fell short of the total number of patients.
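
For anyone who wants the bookkeeping spelled out, here is a hedged sketch of that single pool logic, again in Python for illustration rather than lifted from my actual program.  The retest rule is the one just described: every member of a positive pool gets an individual follow-up test.

    import random

    def simulate_single_pooling(pool_size, pool_count, pct_pos):
        # Build the simulated patients one pool at a time and count the tests.
        tests = 0
        for _ in range(pool_count):
            pool = [random.random() < pct_pos / 100.0 for _ in range(pool_size)]
            tests += 1                # one test on the pooled sample
            if any(pool):             # the pooled sample came back positive...
                tests += pool_size    # ...so retest every member individually
        return tests

    # The 50-pools-of-20, 10% positive scenario, averaged over ten runs.
    total_patients = 20 * 50
    runs = [simulate_single_pooling(20, 50, 10) for _ in range(10)]
    average_tests = sum(runs) / len(runs)
    print("about", round(average_tests), "tests instead of", total_patients)
    print("improvement: about", round(100 * (1 - average_tests / total_patients)), "percent")

The improvement figure it prints should land in the same general neighborhood as the results I describe next.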

When I actually ran the simulation I have just described ten times and then averaged the results, the outcome was disappointing.  Going through the elaborate pooling process only cut the number of tests we would need by 9%.  Hardly worth the trouble.

But with my handy computer program I could plug different numbers in and see what happened.  If we cut the percentage of our patients that were positive down to 5% then we got a 31% improvement.  Better, but not exactly life changing.  So let's look at some other scenarios.  BTW, this is what scientists and engineers call "exploring the parameter space".  What if we made this bigger?  What if we made that smaller?  As you are about to learn, it is often hard to guess whether a change will make things better or worse.

So, if we stick with a 10% positivity rate but change the pool size to 10 samples per pool then the improvement goes from 9% to 22%.  The FDA has approved pooling four samples together in the case of a particular test.  So going all the way down to a pool size of 4 but leaving the other parameters the same gives us a 42% improvement.  Tests go almost twice as far.
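
There is also a quick back-of-the-envelope check on these numbers that needs no simulation at all.  If a fraction p of the patients are positive and the pool size is k, each pool costs one test plus, with probability 1 - (1 - p)^k, another k individual retests.  The little function below is my own sanity check, not part of the original program.

    def expected_tests_per_patient(p, k):
        # One pooled test shared by k patients, plus k follow-up tests whenever
        # the pool contains at least one positive (probability 1 - (1 - p)**k).
        return 1.0 / k + (1.0 - (1.0 - p) ** k)

    for k in (20, 10, 4):
        saving = 1.0 - expected_tests_per_patient(0.10, k)
        print("pool size", k, "at 10% positivity: about",
              round(100 * saving), "percent fewer tests")

It comes out at roughly 7%, 25%, and 41% for pool sizes of 20, 10, and 4, which is the same ballpark as the simulated 9%, 22%, and 42%.  The small gaps are about what you would expect from averaging only ten random runs.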

If we return to assuming the positivity is 5% then moving from a pool size of 20 to a pool size of 10 increases our efficiency to 51%.  Tests now actually go twice as far.  Decreasing the pool size to 4 increases our efficiency modestly.  It goes to 55%.  See!  Things change in unpredictable ways.

What if our positivity rate is only 2%?  Then if we use a pool size of 20 we get a 61% improvement in efficiency.  This is the best result yet.  However, a pool size of 10 gives us a 73% improvement.  That's almost a 4 - 1 improvement.  So going down to a pool size of 4 must be a good idea, right?  Nope.  The efficiency drops to 67%.  See!  You don't know how well it's going to work until you do it (or simulate it, in our case).

I also explored what happened if the positivity rate was only 1%.  For a pool size of 20 I got a 73% improvement.  For a pool size of 10 I got a 79% improvement, almost 5 - 1.  Again, going down to a pool size of 4 was a disappointment.  The improvement was 71%.  Live, or in this case simulate, and learn.

While I was at it I wanted to explore another idea that was floating around.  What about pooling two different ways?  We would have two sets of pools, the "A" set of pools and the "B" set of pools.  A part of each sample would be added to one pool in each set.

Call all the patients in a specific "A" or "B" pool a "cohort".  There are various ways to arrange things so that no patient shares both its "A" pool and its "B" pool with any other patient.  Ideally, this would allow us to figure out which patients are positive without having to do any patient level testing.

Let's create an entirely artificial situation in order to demonstrate how this works.  Let's say we have 100 patients and exactly one of them is positive.  Let's say further that patient #1 is our positive patient and patients 2-100 are negative.  This forces us to use a pool size and pool count of 10.  Let's say patient #1 ends up in pool A-1, so that pool comes up positive when it is tested.  Pools A-2 through A-10 all come up negative, so we immediately know that patients 11-100 are negative, because we used a simple "group by tens" rule to decide which "A" pool a patient ends up in.

Now let's assume that patient #1 ends up in pool B-5.  (Here we used a more complicated rule.  It doesn't matter what it is as long as it works.)  We have carefully arranged things so that patients 2-10 all ended up in some pool other than B-5.  So when we untangle things we find that we have done 20 tests.  And those tests have allowed us to determine that patient #1 is positive and all the other 99 are negative.  Twenty tests have allowed us to tell the tale for all 100 patients.  We have improved how many patients we can test by a factor of 5.  That would be great.
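
Here is a tiny sketch of that 100 patient example, with the patients numbered 0-99 because that is more convenient in code.  The "group by tens" rule handles the "A" pools.  For the "B" pools I use a "group by last digit" rule, which is one rule (not necessarily the one my program used) that guarantees no two patients share both an "A" pool and a "B" pool.

    # 100 patients, numbered 0-99; patient 0 is the lone positive.
    positives = {0}

    def a_pool(patient): return patient // 10   # "A" pools: group by tens
    def b_pool(patient): return patient % 10    # "B" pools: group by last digit

    positive_a = {a_pool(p) for p in positives}   # "A" pools that test positive
    positive_b = {b_pool(p) for p in positives}   # "B" pools that test positive

    # A patient can only be positive if BOTH of its pools came back positive.
    flagged = [p for p in range(100)
               if a_pool(p) in positive_a and b_pool(p) in positive_b]

    print("pool tests run:", 10 + 10)              # ten "A" pools plus ten "B" pools
    print("patients flagged positive:", flagged)   # just patient 0

With a single positive patient, the lone positive "A" pool and the lone positive "B" pool intersect in exactly one spot, so twenty tests really do tell the tale for all one hundred patients.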

But, of course, the real world, and even my simulated world, is more complicated.  So, I am going to have pity on you.  I'll spare you the details and just cut to the chase.  (The "chase" is where I tell you how things came out.)  It turns out that things are a lot simpler if we use identical numbers for both the pool size and the pool count.

This is not as big a problem as it sounds.  We can just run a bunch of batches.  If we want to test 1,000 patients and we are using a pool size and count of 10, then each batch contains 10 x 10 = 100 patients.  We just divide the 1,000 patients into 10 batches of 100 each and proceed.  So, in the end, making things work when the pool size and count are different does not turn out to be worth the trouble, and I didn't do it.

I incorporated the more complex logic that handles two pool scenarios correctly into my computer program and ran some two pool scenarios through it to see what would happen.  As before, I ran each scenario ten times and averaged the results.  I started with a 25 x 25 scenario: 25 "A" pools and 25 "B" pools of 25 samples each, for a total of 625 patients.
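
Here is a simplified sketch of how one batch of the two pool bookkeeping might look.  It is an illustration, not an excerpt from my program: the follow-up rule it uses (individually retest anyone whose "A" pool and "B" pool both came back positive) skips the cleverer deductions, so don't expect it to reproduce the percentages below exactly.

    import random

    def simulate_one_batch(size, pct_pos):
        # One batch: a size x size grid of patients, with one "A" pool per row
        # and one "B" pool per column.
        grid = [[random.random() < pct_pos / 100.0 for _ in range(size)]
                for _ in range(size)]
        tests = 2 * size   # test every row ("A") pool and every column ("B") pool
        positive_rows = {r for r in range(size) if any(grid[r])}
        positive_cols = {c for c in range(size) if any(row[c] for row in grid)}
        # Simplified rule: individually retest anyone sitting at the intersection
        # of a positive row pool and a positive column pool.
        tests += len(positive_rows) * len(positive_cols)
        return tests

    print(simulate_one_batch(10, 5), "tests to sort out", 10 * 10, "patients")

Because the follow-up rule here is not the same as the one my program uses, the percentages it produces won't match the ones I report below; it is only meant to show the shape of the bookkeeping.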

If our positivity rate is 10% then we actually go backwards.  It takes 1% MORE tests to get a result for all 625 patients than it would to just test each of them individually in the first place.  Things get better when we drop the positivity rate down to 5%.  Then we get a 26% improvement.  Not much, but better than going backwards.  Dropping the positivity rate to 2% gives us a 54% improvement.  Dropping it to 1% gives us a 70% improvement.  Now we're getting somewhere.

If we use 10 x 10 pools, then at a 5% positivity rate we get a 43% improvement.  At a 2% positivity rate we get a 76% improvement.  But, if the positivity rate drops to 1% we only match the 76% improvement we got at 2%.

I then moved on to a 4 x 4 scenario.  It made no sense to try high positivity rates, since there are only 16 patients in each batch, so I only tried 2% and 1%.  A 2% positivity rate yields an improvement of 55%, and a 1% positivity rate yields an improvement of 73%.

So what does all this tell us?  It tells us that sample pooling is a big waste of time if the positivity rate is high.  Using the single pool scheme we got about a 2 for 1 increase with a positivity rate of 5% and pool sizes of 10 or 4.  A pool size of 20 was a waste of time.  At 2% and 1% positivity rates we could get about a 4 to 1 improvement using pool sizes of 10 or 4.

And the double pooling scheme looks like a waste of time unless the positivity rate is very low.  The lowest rate I tried was 1%.  My program couldn't handle fractions of a percent, so that's the lowest I could go.

But, if you are double pooling and there are zero positives in a block, then you only need to run tests on one set of pools to determine this.  So, if you are using a 10 x 10 batch, that means you only need to run 10 tests to determine that 100 patients are negative.  If the positivity rate is well below 1%, then most of your blocks are going to have no positives in them.

So, we now know just how far a pooling scheme can stretch a limited number of PCR tests.  The short answer is "not far enough".  So is all lost?  Maybe not.

There was a very interesting article in the August 7, 2020 issue of Science Magazine.  The title of the article is "Fast, cheap tests could enable safer reopening".  The "fast, cheap" tests they are talking about are the non-PCR tests I discussed above.

The authors acknowledge the accuracy problems that seem to be unavoidable with these tests.  Their analysis concludes, however, that if you test people over and over, and if you retest frequently enough, then the accuracy problems can be overcome.

Now, the rate of testing they recommend, multiple times per week, may seem extreme to some.  And it would be if each test cost $100, as a PCR test apparently does.  But Yale University has created a test that costs about $4 per test.  The Yale test just got emergency approval from the FDA.

Yale is planning on publishing DIY instructions so that anyone with the required expertise, and lots of labs have that level of expertise, can create their own version of the test without needing outside help, a license, or any of the usual folderol.  That means those labs may be able to figure out ways to drive the cost of this test down significantly.

Even more interestingly, the Israelis have apparently developed a quick, easy test that costs twenty-five cents a pop.  It's not FDA approved, and I don't know how accurate it is.  But if it, or something similar, becomes widely available, the idea of testing a large percentage of the population on something like a "twice a week" schedule becomes completely practical.

And the best thing to do is probably a hybrid solution.  If we test lots of people using these "fast, cheap" tests we will likely get lots of positives.  But, as the example of the Governor and the White House test above demonstrates, that may lead to a lot of confusion and unnecessary concern, because many of the "positives" may turn out to be false positives.  But what if we stop widely administering PCR tests and instead reserve them for people who test positive using a "fast, cheap" test?

We already have the capacity to handle that rate of administering PCR tests.  The current regime is generating about 50,000 positives per day.  If the "fast, cheap" tests produce one false alarm for each true positive that would mean we would need to PCR test about 100,000 people per day.  Even if we double or triple the number of people who turn up positive by testing widely and often, we can handle the 600,000 tests per day that would require.  
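
The arithmetic behind that estimate is simple enough to write down.  Here it is as a few lines of Python; the one false alarm per real positive ratio is an assumption, and it is the one doing all the work.

    # Rough capacity check for the hybrid scheme: PCR is only used to confirm
    # the positives that the "fast, cheap" tests flag.
    true_positives_per_day = 50_000        # roughly the current rate of positives
    false_alarms_per_true_positive = 1     # assumed: one false alarm per real positive

    pcr_tests_per_day = true_positives_per_day * (1 + false_alarms_per_true_positive)
    print(pcr_tests_per_day)               # 100,000 PCR confirmations per day

Crank up the number of real positives, or assume a worse false alarm ratio, and the total scales up in proportion, which is how you get to a figure like the 600,000 above.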

But 600,000 tests per day is a worst case scenario.  Likely we would need to turn around far fewer tests.  And that should mean that the PCR infrastructure would no longer be overloaded, and that it should be able to reliably produce results in 24, or at most 48, hours.  That puts us in a far better situation than the one we now occupy.