So, our guest today is Don Lauria. Don is an environmental engineer and a Professor Emeritus at the University of North Carolina, in the Department of Environmental Sciences and Engineering. Don spent his career working on environmental problems and, specifically, on water and sanitation problems in developing countries. Don's been my friend and colleague for over 30 years, and most of what I know about piped water and sanitation systems in developing countries I learned from Don. So, Don, thank you so much for coming today. It's a pleasure to have you here. >> What a gracious introduction. I wish it were true. My mother would love to hear that. [LAUGH] >> Let me start and ask you, as we're going to talk about estimating the cost of piped water and sanitation systems today: are there any principles or characteristics that our students should think about before they actually get into the cost estimation task? >> Indeed, Dale. There are a few things that come to mind. Certainly the nature of the system for which the cost model is going to be developed, and I can explain that. But beyond that, I think too many statisticians just think of the form of the mathematical model. You have to think about what the dependent variable is, and what the single independent or explanatory variable, or set of them, is. I use those terms interchangeably. And then the mathematical form. So I think those four issues are really important, okay? The nature of the system, the dependent variable, the set of explanatory variables, and the mathematical form of the [CROSSTALK]. >> So why don't you start with that list of four and tell us about them. >> Well, let me start with the nature of the system. The nature of the system, oftentimes, is something that the modeler may not have a lot of control over, because the cost modeler needs a set of data. But I generally think of systems falling into just two categories.
For water and sanitation systems, the first category is entire integrated systems. An example would be if we have total construction cost data that includes a water intake, a transmission main, a pumping station, a treatment plant, and elevated storage tanks, in addition to... >> So this would be like a greenfield site, right? >> So... >> Yeah, yeah. >> So that would require one kind of mathematical model, if we had data on total costs of these integrated systems. And the other, much more common situation is when we have data for individual components of systems. So we have data for intakes, or for pumping stations, or for treatment plants, or networks, or elevated tanks. And it's really the latter category, the separate components of systems, that is almost universally the subject of cost modelling in the literature. As a matter of fact, it's easier to get data for those components, and the models that are postulated tend to fit the data much better than when we try to model totally integrated systems. >> Well, should we go component by component and talk about how to model this? I guess a big component is the pipe networks, right? >> Sure, we can start with the networks. And, interestingly, they tend to be the most difficult. I'm going to look at my notes here. I pulled together some ideas and I want to make sure I stay on message with those. >> Sure. >> So, in pipe networks, there are two explanatory variables: the diameter of the pipe and the length of the pipe. It's pretty easy to find construction cost data for pipes. That's because most pipe networks are built by the public sector, the projects are put out for bid, and the contractors are asked to break down their bid proposal by types of pipe, that is, by the materials of construction. That could be ductile iron pipe, for example, vis-a-vis PVC pipe.
And then, for each of those different materials, the contractors will list so many lineal meters, that is, the pipe length, of, let's say, 2-inch diameter pipe. I'll use inches rather than metric. >> Yeah, that's fine. >> It's just easier for me. So many meters of 4-inch pipe, 6-inch pipe, 8-inch pipe. So we tend to have pretty good bid information from contractors: pipes of different materials, the diameters of the pipes, and the lengths of those pipes. And that kind of information makes it pretty easy to develop a cost equation for a pipe of a certain material. For example, let's think about PVC pipe. Lots of developing countries manufacture their own PVC pipe, whereas ductile iron is typically imported. So let's think about PVC pipe, and assume that in a bid proposal from a contractor we have information for PVC pipe of different diameters and different lengths. And the question I hear you asking is, how are we going to develop a cost equation for that, how are we going to develop a cost model for that? Okay. That model is in the first slide, figure one. The dependent variable, interestingly, is not total cost but rather cost per unit length, cost per meter of length. So think of, let's say, three columns of data. We have a column of cost data. The next column will be the different diameters. The next column will be the different lengths of pipe. If we divide total costs by length, we have costs per unit length. That's the dependent variable that works best for me, and that leaves on the right-hand side of the equation only a single explanatory variable, which is the diameter of the pipe. And the mathematical form of the equation that tends to fit the data best is a log-linear equation, what we typically call a power function. That is the equation that is shown in figure one.
It's going to be cost per unit length equal to some parameter, I call it alpha in figure one, times the diameter D raised to some exponent beta. In order to fit that kind of a model to these raw data, we have to take the log transform of cost per unit length and the log transform of diameter. What falls out from ordinary least squares are the best-fit values of alpha and beta, the two statistical parameters. Okay? >> And so is that the cost for purchasing the pipe, or installing the pipe, or both? >> Great question, Dale. Typically, in the data that are available from contractors in their bids, the contract documents call for the contractor to both furnish and install, okay? So it will cover furnishing and installing. That's a complete operating installation. If you think about what the install part is, there are lots of different components. After all, you have to open up a trench, you have to get people down into the trench to fit the pipe together, you have to backfill the trench, there's paving, you have to control traffic. So there are lots of different components to that. But typically the model for a pipe that goes into a network, covering both furnishing and installing, is this log-linear model, this power function that we talked about, and almost universally, Dale, the value of beta lies between one and two, okay? >> Is that the same for water systems and for sewer lines? >> Not at all. >> You're talking about water lines now. >> I'm talking about water lines, okay? If we get into sewers, sewers are complicated, that's for sure. Let's stay on the water bit just a little bit. So there are these two components, furnishing and installing, and we're expecting this log-linear model to reflect these two components of cost. Thinking only about water systems right now, take first the furnishing part of the cost equation.
The furnishing part of the cost calculus is pretty easy, because pipe is pretty easy to manufacture, and the furnishing part is mostly a matter of what it costs to make the pipe. It's manufacturing time. Not entirely, because there is also transportation, and sometimes transportation can really dominate the cost of furnishing the pipe. I could tell you a story about that, if we have time for it. >> I think you've told me that story, about Yemen, yeah? >> Yemen, Yemen. >> Why don't you tell our students? >> Well, Yemen was a fascinating situation. For years, I think it's fair to say for decades, I would go into developing countries, whether in Latin America or Southeast Asia or Africa or wherever, and since I was doing a lot of work on pipe networks, which are really hard to design and whose costs are hard to estimate, I would immediately do these cost equations. So I'm doing cost equations, and I find that the value of beta is between one and two. I go to Yemen and I find that the value of beta is less than one. That immediately gets my attention, and I say, what's going on here, with this value of beta less than one? All over the world, on different continents, I'm finding betas between one and two, and here it is less than one. So I immediately think, either I have a bunch of bad data, erroneous bids from contractors, or there's fraud going on. I wonder what in the world is going on here, so I start asking a bunch of questions. I was working in Sana'a, which is the capital city of Yemen, sometimes called North Yemen, only to learn that the pipe was all imported. It came into the port city of Hodeidah on the Red Sea, and then it had to be transported on flatbed trucks for hundreds of miles. And of course the cost of transportation almost swamped the cost of furnishing the pipe. And because the transport makes no distinction among pipes of different diameters, it made that exponent, beta, less than one.
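As a rough illustration of the fitting procedure Don describes, the sketch below regresses the log of cost per unit length on the log of diameter to recover alpha and beta, then repeats the fit after adding a fixed per-meter transport charge. The bid figures, the `fit_power_law` helper, and the transport charge of 100 per meter are all invented for illustration; none of them comes from the interview or its figures.

```python
import math

def fit_power_law(diameters, unit_costs):
    """Fit unit_cost = alpha * D**beta by ordinary least squares
    on ln(unit_cost) versus ln(D). Returns (alpha, beta)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(c) for c in unit_costs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx                           # slope of the log-log line
    alpha = math.exp(y_bar - beta * x_bar)     # intercept, back-transformed
    return alpha, beta

# Hypothetical bid items: diameter (inches), length (m), total cost.
# The costs are synthetic, generated from unit cost = 1.0 * D**1.5.
diameters = [2, 4, 6, 8, 10]
lengths = [1200.0, 800.0, 500.0, 300.0, 150.0]
total_costs = [1.0 * d ** 1.5 * length for d, length in zip(diameters, lengths)]

# Dependent variable: cost per unit length (total cost divided by length).
unit_costs = [c / length for c, length in zip(total_costs, lengths)]
alpha, beta = fit_power_law(diameters, unit_costs)

# Same items with a large, diameter-independent transport charge per meter
# (an assumed value, chosen only to mimic the situation in the Yemen story).
transport_per_meter = 100.0
unit_costs_t = [u + transport_per_meter for u in unit_costs]
alpha_t, beta_t = fit_power_law(diameters, unit_costs_t)

print(f"no transport charge:   alpha={alpha:.3f}, beta={beta:.3f}")
print(f"with transport charge: alpha={alpha_t:.3f}, beta={beta_t:.3f}")
```

With these made-up numbers the first fit returns the exponent used to generate the data, and the second returns an exponent below one, which is the direction of the effect Don reports for Yemen. Real bid data would be noisy, so the fitted values would come with standard errors that this sketch does not compute.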
Large, large economies of scale. The transport [UNKNOWN] cost was essentially a fixed setup cost, okay? >> Mm-hm. >> But why is beta usually between one and two? Because if it were only the furnishing, the cost of the pipe, the exponent beta would be two. It would be very easy to show that. It would depend on the materials, on either the weight or the volume of the material, the PVC, that went into the pipe. But once we get into the construction cost, the installation cost, opening up the trenches, putting the pipe in the trench, putting workers down in the trench, backfilling the trench, controlling the traffic, buying the land, almost all of those costs are independent of the diameter of the pipe. They, too, look like fixed costs, because you have to open up a trench of a certain width for people to be able to jump down into it and put the pipes together, whether the pipe is two inches in diameter or ten inches in diameter. If you get into really large diameter pipe, 36-, 48-, 60-inch diameter pipe, then the size of the trench is in fact dictated by the size of the pipe, but that's uncommon for most water distribution networks in cities; [INAUDIBLE] you don't get into pipes that size. So those costs, the installation costs, are really treated as a fixed charge, along with the transportation cost, and that's what tends to pull the exponent beta down from what it would be, two, if it were only furnishing, to somewhere between one and two for furnishing and installing. Okay, does that make sense? >> Yup, yup. >> Now, you had asked me about sewers? >> Uh-huh. >> Sewers are not easy. What's the difference between pipe networks for water systems and for sewer systems? Water pipes are under positive pressure. Technically, the way to say that is that the hydraulic gradient of the pipe lies above the crown of the pipe, the crown being the top of the pipe.
Sewers, on the other hand, run downhill. They run by gravity. Wastewater is open channel flow; it flows downhill. What this means is that for piped water networks, the pipes can actually follow the contour of the ground. So if the ground goes up, if it rises, the pipe is still buried maybe a couple of meters below the surface of the ground. So, if the ground is undulating, the pipe follows it at about the same trench depth, and that is why these models fit the data very well. That's not the case with sewers. Sewers are running downhill. So think about hilly terrain. I'm thinking right now of a city like Tegucigalpa, the capital city of Honduras, which essentially lies in a bowl, with large mountains shooting up all around it. The same is true of Quito, Ecuador, for example. Undulating ground. You can't always find ground that wants to run downhill in such hilly terrain. So sometimes, even on a street that is in fact running downhill, the sewer, sloping down, is going to get deeper and deeper in the ground. And then if the ground starts to rise, it gets very deep in the ground, right, and that gets to be very expensive construction. And if the depth of the trench gets to be something like five or six meters, then to construct a trench with straight sides, such that the walls are not going to collapse on the workmen when they get down in the trench and kill them, you have to install what's called sheeting. It can be wood sheeting or steel sheeting, but these are either planks of wood or sheets of steel that buttress the walls and keep them from collapsing. They have bracing. Very, very expensive construction. And for this reason, Dale, the cost of a sewer system is much more dependent on topography than the cost of a water system. So it's pretty easy to model the cost of a water pipe network.
Very, very difficult to model the cost of a sewer network. >> So can we go back then to the modeling of the water system for a minute? >> Yeah. >> So, you were talking about contractors putting in bidding documents, and I guess that's the data set you'd use to estimate the cost function. Can you tell us a little bit about what kind of data sets you need and the size of the data sets? I mean, say our students are trying to estimate a water cost function. >> Right. >> So, what should they be looking for in terms of data sets? >> Okay. Water networks are mostly constructed in the public sector, though not entirely. Let me digress for just a minute. Think about private estates. You and I were working in the Philippines some time ago, and there were lots of gated communities, private estates, where it was the developer who constructed both the water and the sewer system. Those data are typically hard to get, because it's a private enterprise that has developed those communities. But most water supply networks are constructed in the public sector. And the public sector requires that there's an advertisement for the project, it is publicly bid, and then there is a formal opening of the bids from all the contractors who submit them. Those documents are in the public sector, so the public sector will make them available to people who ask for them. And what students should be aware of is that you don't have to find only the lowest bid. The successful bidder may have had the overall lowest cost, but all of these contractors are going to have bid prices that are usually fairly close to each other. So don't turn up your noses at unsuccessful bidders who didn't get the job. Those cost data are also relevant. And what students might want to do is look for a single project, for example, where there were, let's say, four or five or six different bidders on that project.
And if there was a large component of a piped water supply system, the students might want to look at all of the different bids and their costs. That raises an interesting question, Dale. My wheels are turning here as you ask these questions. Let's assume that there were five different bidders for the same project, and the students found cost data for each of those separate contractors in their bid documents. How would they use the data for developing a cost equation? I've already suggested that the cost equation is cost per unit length as a function of diameter on the right-hand side, a single explanatory variable, and that you're actually going to use the log transforms of those data. But now we have five or six different sources of data. What would you do? What would I do? I would probably include dummy variables. I would pool all of the data, right? If there were five different contractors that had submitted proposals, I would probably pool all of the cost data for all five contractors, and that implies the need to include four dummy variables. There are five categories, and the number of dummy variables is always one less than the number of categories. Some one of those contractors would be the baseline category. I'm not sure if that's coming across. The students can get into that. >> Yeah. >> They'll find this kind of information, maybe not in elementary statistics courses, but always in econometrics courses. The econometricians do a good deal of this. >> So you've talked about the pipe network. What about the other components? Pumping stations, overhead storage tanks, you know, transmission lines. >> Yeah, I think I probably mentioned, when I was talking about these two different kinds of systems, integrated or separate components, that those kinds of data are easier to get hold of.
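The pooling idea Don mentions, five bidders on one project handled with four dummy variables, can be sketched as a single least-squares regression of ln(cost per unit length) on ln(diameter) plus one 0/1 column per non-baseline contractor. Everything specific here is an assumption made for illustration: the contractor labels A to E, their price multipliers, the helper names `solve_linear` and `fit_pooled`, and the choice of contractor A as the baseline. The normal equations are solved with a small hand-written Gaussian elimination so that the sketch needs only the standard library.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination
    with partial pivoting. a is a list of rows, b a list."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_pooled(bids, baseline):
    """bids: list of (contractor, diameter, cost_per_unit_length).
    Fits ln(cost) = ln(alpha) + beta*ln(D) + shift[contractor],
    where the baseline contractor has no dummy (shift of zero)."""
    others = sorted({c for c, _, _ in bids if c != baseline})
    rows, ys = [], []
    for contractor, d, cost in bids:
        dummies = [1.0 if contractor == o else 0.0 for o in others]
        rows.append([1.0, math.log(d)] + dummies)
        ys.append(math.log(cost))
    k = len(rows[0])
    # Normal equations (X'X) coef = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    return {"ln_alpha": coef[0], "beta": coef[1],
            "shifts": dict(zip(others, coef[2:]))}

# Synthetic bids: every contractor prices at multiplier * 2.0 * D**1.4.
multipliers = {"A": 1.00, "B": 1.05, "C": 0.95, "D": 1.10, "E": 1.02}
bids = [(name, d, mult * 2.0 * d ** 1.4)
        for name, mult in multipliers.items()
        for d in (2, 4, 6, 8)]

fit = fit_pooled(bids, baseline="A")
print(f"beta = {fit['beta']:.3f}, dummies = {len(fit['shifts'])}")
```

In this setup each dummy coefficient is the log of that contractor's price level relative to the baseline, while beta is shared across all bidders. Whether a shared slope is appropriate for real bids is a modeling judgment the interview does not settle.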
Let's first of all talk about why they are easier. As a matter of fact, even in developing countries it's not easy to find many integrated systems that are being constructed from scratch all at once and that include all the components. What most cities, or most communities, or even developments and neighborhoods are doing is making changes or improvements in individual areas. So it's pretty easy to find lots of information on the individual components. Without question, Dale, there is a most common mathematical form of the model that is fitted to the furnished-and-installed cost data for individual components. Let's think about treatment plants right now, water filtration plants. It's not hard to get those kinds of data, and the mathematical form that fits is a power function with a single explanatory variable on the right-hand side. And that would be the hydraulic capacity of the system. The dependent variable on the left-hand side would be the total construction cost, unlike the pipe networks, where it was cost per unit length. So it would be the total construction cost on the left-hand side, and the explanatory variable on the right-hand side would be the hydraulic capacity. And it's not the hydraulic capacity for the number of people who are going to be using that system a year or two after it's constructed. It's going to be the design capacity. If we're talking about treatment plants, they're typically designed for a design period of maybe 15 or 20 years into the future. Even in developing countries there are long design periods, so one would have to know what the design population was and, of course, the corresponding design flow. That then raises some questions about what flow. Is it the average flow? Is it the peak hourly flow? Is it the peak daily flow? So think about these kinds of components: pumping stations, treatment plants, or water intakes.
The variable on the right-hand side is a measure of flow capacity. It has units of something like cubic meters per day and is probably the average design flow for that system. That would be the best explanatory variable for it on the right-hand side. >> Okay, so Don, you're talking about cost functions for one component, but how do we get from that one component up to the total cost for a system? Our students are mostly not engineers, right? So I'm hoping that they can get a sense of, you know, ballpark costs for serving households in different places in developing countries, so they know when they're in the right range. >> Yeah. >> So how do they get up to the system level, and then also from there back to the household level? >> Okay. Great questions. Let me go back to the pipe networks, especially water pipe networks, because those systems are the most difficult to model. I have already suggested that if we have bid cost data from contractors on individual pieces of pipe, then the model is cost per unit length equal to alpha times diameter raised to the beta. Now, let's assume that we want to use that information for predicting the cost of an entire network. What are you going to do? Okay, let's transform that equation, and I'm pretty sure it's figure two. Let's take that basic equation and bring L, the length of pipe that was on the left-hand side of the equation, over to the right-hand side. So what figure two shows is that the total furnished-and-installed construction cost of a network, C, is equal to some parameter alpha, times the explanatory variable diameter, D, raised to a parameter beta, all times L. We want to use this equation for predicting the cost of an entire pipe network. How are we going to do it? For L, use the total length of pipe that's in the network, okay?
So, the length of all the pipe that goes into the network. If you do that, then what are you going to use for D? D is going to be the average diameter. Well, what does that mean? The average diameter is the diameters of all the different kinds of pipe that go into the pipe network, weighted by their respective lengths, okay? So, if I have some 2-inch, 4-inch, 6-inch, 8-inch, and 10-inch diameter pipe, I multiply each individual diameter by its respective length, sum over all the different diameters, and divide by the total length of pipe. That's the average diameter, and that shows up in figure three. Okay, so suppose we have an idea of what the average diameter of pipe in the network is, the average being weighted by the individual lengths. >> You mean for a whole city? >> For a whole city, for a whole network. And suppose we have a pretty good idea of the total length of pipe. How do you get that? Well, what's the length of streets? If you're going to have house connections, you need a pipe on each street, in front of each house. So if we have an idea of what the total length of the streets is, that's the easy part. What's the average diameter? It depends on the size of the city. In the United States, what drives the diameters of pipes is not so much the hydraulic capacity as the capacity that's needed for fighting fires. That's not the case in developing countries. In developing countries, even in pretty large cities, the pipe network is designed to carry the flow that people are going to use in their households and in their businesses, right? So those pipes tend to be smaller than what we find in the industrialized countries. >> So we should be cautious about transferring cost estimates from countries where fire is important, I mean from industrialized countries, to developing countries.
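The network-level shortcut described above, C = alpha times the average diameter raised to beta, times the total length, with the average diameter weighted by pipe length, can be written out in a few lines. The pipe inventory and the values of alpha and beta below are hypothetical, chosen only to make the arithmetic visible; the function names are likewise invented for this sketch.

```python
def average_diameter(pipes):
    """Length-weighted average diameter.
    pipes is a list of (diameter, length) pairs."""
    total_length = sum(length for _, length in pipes)
    return sum(d * length for d, length in pipes) / total_length

def network_cost(alpha, beta, pipes):
    """Whole-network estimate: C = alpha * D_avg**beta * L_total."""
    total_length = sum(length for _, length in pipes)
    return alpha * average_diameter(pipes) ** beta * total_length

def itemized_cost(alpha, beta, pipes):
    """Sum of the per-diameter costs alpha * D**beta * L, for comparison."""
    return sum(alpha * d ** beta * length for d, length in pipes)

# Assumed parameters and an invented neighborhood inventory:
# (diameter in inches, length in meters), dominated by small pipe.
ALPHA, BETA = 2.0, 1.4
pipes = [(2, 6000.0), (4, 2500.0), (6, 1000.0), (8, 500.0)]

d_avg = average_diameter(pipes)          # 32000 / 10000 = 3.2 inches
shortcut = network_cost(ALPHA, BETA, pipes)
detailed = itemized_cost(ALPHA, BETA, pipes)
print(f"average diameter: {d_avg:.2f} in")
print(f"shortcut estimate: {shortcut:,.0f}")
print(f"itemized estimate: {detailed:,.0f}")
```

One property worth knowing when using the shortcut: because the cost function is convex in diameter when beta is greater than one, plugging in the average diameter gives a figure no higher than summing the cost diameter by diameter. For ballpark purposes the gap may be acceptable, but that is a judgment for the analyst and not something the interview quantifies.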
>> [CROSSTALK] That would only apply to integrated systems. Think about these models we're developing. We have a pipe cost function: cost per unit length is equal to alpha times the diameter of pipe raised to beta. Okay. The point I think you're raising is, let's be careful that we don't take the average diameter of pipe in industrialized countries and assume that the average diameter of pipe... >> Right. >> ...in developing countries is going to be in the same ballpark. It's not. In developing countries there's going to be a predominance of small diameter pipe, a lot of 2-inch diameter pipe. In our town, where you and I live, the smallest pipe is 6-inch diameter. In New York City, the smallest used to be 8-inch diameter pipe; now it's about 10-inch. So on individual streets, 10-inch pipe is the smallest diameter pipe. In developing countries, even in large cities, in Metro Manila, with millions and millions of people, they've got tons of pipe that's two inches in diameter. So if one were going to be replacing pipe in, let's say, a neighborhood of Metro Manila, one would need to have an idea of what the smallest diameter pipe is and what the largest diameter pipe is, and you get that from the engineers. And then you make some guess, probably, or an engineering preliminary design of the pipe network, to come away with some idea of what the average diameter of pipe would be. Then you go back to your individual cost function and use that to predict what the cost of the entire network in that particular neighborhood would be, given that you have an idea of the total length of streets and maybe of the average diameter, depending on where that neighborhood is located. If it's located near the center of the city, you're going to have large diameter pipes; way out in the periphery, you're going to have small diameter pipes. >> Can we step back again... >> Sure.
>> ...and talk about this issue of economies of scale? >> Yeah. >> And what your equation means. Can you tell us a little bit more about what you think of as economies of scale in piped water networks? >> Oh yeah. Okay. I've already alluded to this, and maybe I'll try to sharpen it. I've talked about the pipe cost function. Cost per unit length is a function of diameter raised to an exponent between one and two. That applies almost universally. Yemen was a very unusual case. The exponent of diameter being a number greater than one implies that if we make a graph of that cost function, with cost per unit length on the vertical axis, the ordinate, and diameter on the horizontal axis, the abscissa, that's a convex function. It starts at the origin and it bends up. Right. Now, you and I, since we work with economies of scale, normally think that those kinds of convex functions do not reflect economies of scale. But we know that there are large economies of scale in water pipe networks, not only in the trenches, in the install part of the equation, but in the furnishing part of the equation as well. For example, the hydraulic carrying capacity of 6-inch diameter pipe is twice as large as the hydraulic capacity of 4-inch diameter pipe. All you have to do is increase the diameter of the pipe from 4 inches to 6 inches and it carries twice as much flow. That is an indicator that there are economies of scale. That is, if we had the cost of that pipe, the average cost of using 6-inch pipe in terms of its flow-carrying capacity is a lower number than the average cost of 4-inch diameter pipe in terms of its flow-carrying capacity. So this pipe cost function, Dale, is a little bit tricky. The diameter D does not reflect hydraulic capacity, and I don't recall if I said this at the outset when we were talking about it.
In general, whatever the explanatory variables on the right-hand side of these cost equations are, in every one of them flow-carrying capacity is key. It is ubiquitous. It turns up in mathematical models that have been developed for decades and decades. There are some components of systems for which you can't use flow-carrying capacity, like, for example, an elevated storage tank. Okay? But for wells you can. What's the capacity of the well? Well, that's going to dictate the size of the tube that's punched down into the ground, and that dictates much of its cost. So, we're talking about economies of scale. In fact, this pipe cost function that has D on the right-hand side would reflect economies of scale if the D were replaced by some indicator of flow-carrying capacity. And engineers know there are lots of empirical equations for doing that, the most popular being the so-called Hazen-Williams equation, which gives a relationship between flow-carrying capacity, for which Q is usually used, and the diameter of the pipe. So if you make a change of variables in that equation, you find that the exponent of Q, the flow-carrying capacity, is about 0.5 or 0.6. It's a number much less than one. This is big economies of scale. Right. >> Okay. >> Okay. And on the trench side, the installation side, there are big economies of scale too, because those costs of installing are not related to the flow-carrying capacity of the pipe.
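The change of variables Don refers to can be sketched numerically. In the Hazen-Williams equation, flow Q is proportional to diameter raised to the power 2.63 when the roughness coefficient and the hydraulic gradient are held fixed (holding them fixed is an assumption of this sketch). Solving for D and substituting into cost per unit length = alpha * D**beta gives a cost proportional to Q raised to beta / 2.63. The function names below are invented for illustration.

```python
# Exponent on diameter in the Hazen-Williams equation: Q is proportional
# to D**2.63 for a fixed roughness coefficient and hydraulic gradient.
HW_DIAMETER_EXPONENT = 2.63

def flow_exponent(beta):
    """Exponent on Q after substituting D ~ Q**(1/2.63) into alpha * D**beta."""
    return beta / HW_DIAMETER_EXPONENT

def capacity_ratio(d_large, d_small):
    """Ratio of flow capacities for two diameters, all else held equal."""
    return (d_large / d_small) ** HW_DIAMETER_EXPONENT

for beta in (1.0, 1.3, 1.5, 1.6, 2.0):
    print(f"beta = {beta:.1f}  ->  exponent on Q = {flow_exponent(beta):.2f}")

# Going from 4-inch to 6-inch pipe at the same hydraulic gradient.
print(f"capacity ratio 6-inch vs 4-inch: {capacity_ratio(6, 4):.2f}")
```

For beta anywhere between one and two the exponent on Q stays below one, which is the economies-of-scale point, and values of beta around 1.3 to 1.6 give the "about 0.5 or 0.6" that Don quotes. Under these same assumptions the 6-inch pipe carries closer to three times the flow of the 4-inch pipe, so the "twice as much" in the conversation reads as a conservative round figure.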