
Friday, August 6, 2021

Data for "Interfuel Substitution: A Meta-Analysis"

I've long thought that there was an error in the way I calculated the shadow elasticity of substitution (SES) in my 2012 paper on interfuel substitution in the Journal of Economic Surveys. This would have been a big problem as the paper carries out a meta-analysis of SESs. But no primary paper reported the results in terms of the SES. I computed all this data from the various ways results were presented in the original studies. I never got around to doing anything about it or even checking carefully whether there was a mistake. I suppose this is because I hate finding mistakes in my papers and as a result procrastination goes into superdrive.

Yesterday a student wrote to me and requested the data. I have now checked the derivation of the SES in my database and also computed it in an alternative way. There is in fact no mistake. This is great news!

I thought there was a mistake because of the confusing notation used for the Morishima Elasticity of Substitution (MES). Conventionally, the MES is written as MES_ij for the elasticity of substitution between inputs i and j when the price of i changes. By contrast, the cross-price elasticity is written eta_ij for the elasticity of demand for the quantity of input i with respect to the price of input j!*
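As a rough illustration of how the two index conventions interact, the sketch below computes both Morishima elasticities and the shadow elasticity from a matrix of price elasticities and a set of cost shares. The three fuels and all the numbers are made up for illustration, and the share weights in the SES are derived here from the constant-cost definition of the shadow elasticity rather than copied from the spreadsheet.

```python
# eta[a][b]: elasticity of demand for input a with respect to the price of b.
# Illustrative values only; each row sums to zero (homogeneity) and
# share[a] * eta[a][b] == share[b] * eta[b][a] (symmetry).
shares = {"coal": 0.5, "oil": 0.3, "gas": 0.2}
eta = {
    "coal": {"coal": -0.40, "oil": 0.30, "gas": 0.10},
    "oil": {"coal": 0.50, "oil": -0.90, "gas": 0.40},
    "gas": {"coal": 0.25, "oil": 0.60, "gas": -0.85},
}


def morishima(eta, i, j):
    """MES_ij in the convention above: the price of input i changes."""
    # Note the index reversal: the cross-price term is eta[j][i], not eta[i][j].
    return eta[j][i] - eta[i][i]


def shadow(eta, shares, i, j):
    """SES_ij as a cost-share-weighted average of MES_ij and MES_ji."""
    s_i, s_j = shares[i], shares[j]
    # Holding cost constant puts weight S_j on the MES for a change in p_i.
    return (s_j * morishima(eta, i, j) + s_i * morishima(eta, j, i)) / (s_i + s_j)


mes_coal_oil = morishima(eta, "coal", "oil")  # price of coal changes
mes_oil_coal = morishima(eta, "oil", "coal")  # price of oil changes
ses_coal_oil = shadow(eta, shares, "coal", "oil")
```

With these made-up numbers the two Morishima elasticities differ (about 0.9 and 1.2), while the shadow elasticity is symmetric and lies between them at roughly 1.09.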

I have now uploaded the database used for the meta-analysis to my data website. The following is a description of what is in the Excel spreadsheet:

Each line in the main "data" worksheet is for a specific sample/model in a specific paper. Each of these typically has multiple elasticity estimates.

Column A: Identification number for each paper.

Columns B to L: Characteristics of the authors, including their rank in the Coupe ranking that was popular at the time.

Column M: Year paper was published.

Columns N to V: Characteristics of the journals in which the papers were published. This includes in Column O the estimated impact factor in the year of publication. Others are impact factors in later years.

Column W: Number of citations the paper had received in the Web of Science at the time the database was compiled.

Column X: Number of citations the lead author has had in their career apart from for this paper.

Columns Y to AO: Characteristics of the sample used for the estimates on that line. So looking at the first line in the table, as an example, we have:

Data from Canada for 1959-1973. Annual observations. This is a panel for different industries. N=2, so there are two industries but a single estimate for both. T is the length of the time series dimension. Sample size is N*T*Number of equations - i.e., if there are 4 fuels, usually 3 equations are estimated. This could be different if the cost function itself is also estimated, but it looks like no papers did that. (There are also papers using time series for individual industries etc. and cross-sections at one point in time.)

Column AH: Whether fixed effects estimation was used or not (only makes sense for panel data).

Column AC: The standard deviation of change in the real oil price in that period.

Column AD: PPP GDP per capita of the country from the Penn World Table. Probably the mean for the sample period.

Column AE: Population of the country in millions. Looks like the mean for the sample period.

Columns AP to AZ are the specification of the model:

Column AP: Not4 - if there weren't 4 fuels in the analysis.

Column AQ: Partial elasticity - this is holding the level of total energy use constant.

Column AR: Total elasticity - this allows the level of total energy use to change.

Columns AS and AT: If this is a dynamic model these are estimates of the short-run or the long-run elasticity.

Column AU: The model is derived from a cost function, or something else.

Column AV: Functional form of the model.

Column AW: Form of the equations estimated - usually cost shares - log ratios means the log of the ratio of cost shares.

Columns AX to AZ: How technical change is modeled. Many papers don't model any technical change explicitly. Energy model means there is biased technical change for energy inputs. Aggregate model means that if other inputs are also modeled they also have biased technical change. Kalman means that the Kalman filter was used to estimate stochastic technical change.

Columns BA to the end have the actual estimates. Different papers provide different information. All the various estimates eventually are converted into Shadow Elasticities of Substitution.

Columns BA to BP: Own price and cross-price elasticities of demand. For example: Coal-Oil means the cross-price elasticity of demand for coal with respect to the price of oil.

Columns BQ to CF: Reported translog cost function parameters.

Columns CP to CS: Cost shares at the sample mean. These are used in various elasticity formulae. They were derived in a variety of ways from the information in papers. One of these methods is the quadratic solution in Columns CG to CO. It uses demand elasticities and translog parameters to reverse engineer the cost shares. Other estimates take the ratio of demand and Allen elasticities.

Columns CT to DE: Morishima elasticities of substitution. These are asymmetric - so we have oil-coal and coal-oil. Here the terminology is very confusing. The standard terminology is that MES_ij is for a change in the price of i. So coal-oil is for a change in the price of coal. This is the reverse of what is used for cross-price elasticities! It is super-confusing.

Columns DF to DK have the shadow elasticities I actually used in the meta-analysis.

Columns DL to EA have the Allen elasticities of substitution. Some of these are reported in the papers and some I computed from the cross-price elasticities.
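Two of the calculations mentioned in the column notes above can be sketched in a few lines of Python. The function names are hypothetical, the example inputs are illustrative rather than taken from the spreadsheet, and the Allen conversion assumes the standard identity eta_ij = S_j * sigma_ij, where S_j is the cost share of input j.

```python
def sample_size(n_units, t_periods, n_fuels):
    """N * T * number of share equations (one fewer than the number of fuels)."""
    return n_units * t_periods * (n_fuels - 1)


def allen_from_cross_price(eta_ij, share_j):
    """Allen elasticity implied by a cross-price elasticity: sigma_ij = eta_ij / S_j."""
    return eta_ij / share_j


def share_from_ratio(eta_ij, allen_ij):
    """Cost share of input j recovered as the ratio of eta_ij to sigma_ij."""
    return eta_ij / allen_ij


# Assumed inputs: 2 industries, 15 annual observations (1959-1973 inclusive),
# and 4 fuels, so 3 share equations.
n_obs = sample_size(n_units=2, t_periods=15, n_fuels=4)

# Illustrative coal-oil cross-price elasticity and oil cost share.
sigma_coal_oil = allen_from_cross_price(eta_ij=0.3, share_j=0.3)
oil_share = share_from_ratio(eta_ij=0.3, allen_ij=sigma_coal_oil)
```

Under those assumptions the first sample would have 90 observations. Whether T is really 15 and whether that study covers four fuels would need to be checked against the spreadsheet row itself.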

* You can learn more about all these elasticities in my 2011 Journal of Productivity Analysis paper on the topic.

Saturday, February 3, 2018

Data and Code for "Modeling the Emissions-Income Relationship Using Long-run Growth Rates"

I've posted on my website the data and code used in our paper "Modeling the Emissions-Income Relationship Using Long-run Growth Rates" that was recently published in Environment and Development Economics. The data is in .xls format and the econometrics code is in RATS. If you don't have RATS, I think it should be fairly easy to translate the commands into another package like Stata. If anything is unclear, please ask me. I managed to replicate all the regression results and standard errors in the paper but some of the diagnostic statistics are different. I think only once does that make a difference, and then it's in a positive way. I hope that providing this data and code will encourage people to use our approach to model the emissions-income relationship.

Friday, November 24, 2017

Data and Code for "Energy and Economic Growth: The Stylized Facts" and an Erratum

Following a request for our estimation code we have now completed a full replication package for our 2016 Energy Journal paper and uploaded it to Figshare.

While we were putting this together we noticed some minor errors in the tables in the published paper. The reported standard errors of the coefficients of lnY/P in Tables 2 and 3 for the results without outliers are incorrect. We accidentally pasted the standard errors from Table 5 into Tables 2 and 3. The correct versions of Tables 2 and 3 should look like this:

The standard errors for unconditional convergence in Tables 4 and 6 are also incorrect. The reported standard errors are not robust and one was completely wrong. The tables should look like:

None of these errors changes the significance level (1%, 5%, etc.) of any of the results.

Monday, July 25, 2016

Data and Code for Our 1997 Paper in Nature

I got a request for the data in our 1997 paper in Nature on climate change. I didn't think I'd be able to send the actual data we used as I used to follow the practice of continually updating the datasets that I most used rather than keeping an archival copy of the data actually used in a paper. But I found a version from February 1997, which was the month we submitted the final version of the paper. I got the RATS code to read the file and with a few tweaks it was producing the results that are in the paper. These are the results for observational data in the paper, not those using data from the Hadley climate model. I have now put up the files on my website. In the process I found this website - zamzar.com - that can convert .wks to .xls files. Apparently, recent versions of Excel can't read the .wks Lotus 1-2-3 files that were a standard format 20 or more years ago. For those that don't know, Lotus 1-2-3 was the most popular spreadsheet program before Microsoft introduced Excel. I used it in the late 80s and early 90s when I was in grad school.

Monday, November 17, 2014

Figshare

I just signed up for Figshare. In case you haven't heard about it yet, it is a site for sharing all kinds of research data. I uploaded the citation data I used in my 2013 article in the Journal of Economic Literature. This was a lot easier than adding data to my university's data repository so, in future, I think I will use Figshare exclusively for larger datasets I want to make public. The service is free and you get a DOI for your dataset.

Monday, September 1, 2014

Citation Data for Citation Prediction Paper Now Available

The data underlying my recent working paper: "High-Ranked Social Science Journal Articles Can Be Identified from Early Citation Information" are now available from the ANU Data Repository. The data set includes most journal articles in economics and political science published in 2006 and included in the Web of Science and the number of citations that they received each year through 2012. There is also a worksheet with all economics articles from 1999. That's not mentioned in the working paper but is mentioned in the revised version of the article I just resubmitted to the journal. The journal's (PLoS ONE) data policy requires all data to be made available with DOIs if possible. This is definitely the current trend and looks like it is becoming the norm. For example, Energy Economics requires both data and code to be submitted prior to publication. I have mixed feelings about this. Obviously I am in favor of replicability but putting together datasets costs time and/or money and so it seems a bit unfair to force authors to make their data freely available as the price of publication.

P.S. 9th September
My paper was accepted at PLoS ONE! :)

Friday, July 25, 2014

Henriques PhD Dissertation on the Energy History of Portugal Now Available on the Web

Sofia has posted her dissertation to Academia.edu. It includes the data. She studied in Lund with Astrid Kander and is now at the University of Southern Denmark.

Tuesday, July 22, 2014

Top Twenty Carbon Emitters, Coal Consumers, and Coal Producers

Some slides from my upcoming introductory lecture for my Energy Economics course:

This slide uses CDIAC data on the top twenty countries by emission of carbon dioxide globally in 2010. Carbon dioxide emissions here include only those from fossil fuel combustion and cement production. I also have summed up the emissions from the European Union and added it as if it was a single country (as the EU negotiates as a bloc) in addition to including all its member countries in the ranking. The three big emitters stand out clearly from all the rest. Emissions are measured by mass of carbon. To get carbon dioxide multiply by 3.66.
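The conversion factor is the ratio of the molecular mass of carbon dioxide to the atomic mass of carbon. A minimal sketch, using standard atomic masses and a hypothetical function name:

```python
ATOMIC_MASS_C = 12.011  # standard atomic mass of carbon
ATOMIC_MASS_O = 15.999  # standard atomic mass of oxygen

# Mass of CO2 per unit mass of carbon, roughly 3.66.
C_TO_CO2 = (ATOMIC_MASS_C + 2 * ATOMIC_MASS_O) / ATOMIC_MASS_C


def carbon_to_co2(mass_carbon):
    """Convert a mass of carbon to the equivalent mass of carbon dioxide."""
    return mass_carbon * C_TO_CO2
```

The same function works in any unit of mass, since the factor is a pure ratio.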

Of course, coal use is a big driver of CO2. This chart shows how China consumes so much more coal than any other country and after the US and India, the rest look pretty inconsequential.

On the whole, coal is consumed where it is produced with two important exceptions - Indonesia and Australia - the two biggest coal exporters. China produces the overwhelming majority of the coal it uses despite large imports. The majority of Australian exports are coal for iron smelting, so-called "metallurgical coal".

Thursday, June 26, 2014

Canberra is the Best Place to Live in the World According to the OECD

Australia is the best country and the ACT is the best region in Australia. I checked the OECD website, and giving equal weight to each of the criteria the OECD ranks, the ACT is the highest scoring region in the world. Of course, that is not how most Australians see it. I met an Australian woman at JFK airport last week while waiting to take the train and she asked me where I lived. When I answered: "Canberra", she said: "Why would you do that to yourself?"

Wednesday, June 18, 2014

World Energy Use Increased 2.3% in 2013

The annual BP Statistical Review was just released. It shows that world energy use increased by 2.3% in 2013. According to the IMF, the world economy grew 3% in 2013. World population is growing at about 1.1% p.a. Therefore, there was a 1.2% increase in per capita energy use for a 1.9% increase in GDP per capita - a ratio of 0.63 - which is a little below our stylized fact that energy use tends to increase by 0.7% for a 1% increase in GDP.
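The arithmetic behind that ratio can be laid out explicitly. This sketch uses only the three growth rates quoted above and the usual approximation that per capita growth is total growth minus population growth:

```python
energy_growth = 2.3  # % growth in world energy use, 2013
gdp_growth = 3.0     # % growth in world GDP, 2013
pop_growth = 1.1     # % growth in world population per year

# Approximate per capita growth rates by subtracting population growth.
energy_per_capita = energy_growth - pop_growth
gdp_per_capita = gdp_growth - pop_growth

# Percentage increase in energy use per 1% increase in GDP, both per capita.
ratio = energy_per_capita / gdp_per_capita

# Exact version using growth factors instead of the subtraction shortcut.
exact_ratio = ((1 + energy_growth / 100) / (1 + pop_growth / 100) - 1) / (
    (1 + gdp_growth / 100) / (1 + pop_growth / 100) - 1
)
```

Both versions round to 0.63, so the subtraction shortcut makes no practical difference at these growth rates.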

Monday, February 10, 2014

Great New Intuitive Way to Access Climate Data

It can be hard to understand how to interpret global climate data sets in the way they are usually presented. This new Google Earth based interface is really good if you just want averaged data for a few gridboxes or station level data. It is also good for visualizing the distribution of station data that supports the gridbox values. For more discussion see this blogpost on Real Climate.

Sunday, January 5, 2014

Harvard MIT Atlas of Economic Complexity

This is an interesting effort to assess the complexity of production and the level of local production knowledge across the countries of the world. The index of economic complexity is derived from the diversity and ubiquity of the goods which countries export. The rich data available on world trade is the strength of the indicator but also its weakness. Of course, it doesn't take into account any of the sophistication a country might have on the service side of the economy or in non-tradables. Australia ranks very badly. Based on the index the Zimbabwean and Australian economies have the same level of sophistication. Australia's complexity has also declined as minerals have increasingly dominated exports over time. With the upcoming demise of Holden and Ford, Australia is going to look even less sophisticated. Obviously, the Australian economy doesn't produce as wide a range of sophisticated products as the major industrial exporters. Still, it does seem that it has more sophisticated knowledge than the developing economies it ranks with in this analysis.

Thursday, December 19, 2013

Data from Stern and Kaufmann (2014) Climatic Change Posted on My Website

As I've already had a couple of requests for the data, I have posted the data from our forthcoming paper in Climatic Change on my website.

Saturday, September 28, 2013

Capital in the Penn World Table 8.0

This is another tricky issue with the new Penn World Table (PWT 8.0). In principle it is easy to compute a capital series if we know the level of investment each year and have estimates of the depreciation rate and the initial capital stock. The latter is the most difficult to obtain and cross-country datasets make essentially arbitrary decisions to estimate these starting stocks. The usual approach is to assume that the economy is in the steady state of the Solow model and compute the initial stock from the current level of investment, some growth rate of the economy or capital stock and the rate of depreciation. We are using that for the paper we are writing on the stylized facts of energy and growth. PWT 8.0 instead assumes that all countries had a capital/GDP ratio of 2.6 expressed in units of the local currency in the first year that data is available for that country, which could be anywhere from 1950 to 1990... There is some rationale for this. A regression analysis shows that there is no relation between the level of GDP and capital/GDP ratios in 2005 (because of depreciation, capital stocks in 2005 are not that sensitive to the initial values) and the average is about 2.6.
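The two ways of choosing a starting stock, and why the choice matters less as time passes, can be sketched as follows. This is a minimal illustration with made-up numbers and hypothetical function names. The steady-state formula K0 = I0 / (g + delta) is the textbook version of the approach described above, and the timing convention (investment added at the end of each period) is an assumption.

```python
def initial_capital_steady_state(investment0, growth, depreciation):
    """Steady-state starting stock: K0 = I0 / (g + delta)."""
    return investment0 / (growth + depreciation)


def initial_capital_fixed_ratio(gdp0, ratio=2.6):
    """Starting stock from an assumed capital/GDP ratio, as in PWT 8.0."""
    return ratio * gdp0


def perpetual_inventory(k0, investments, depreciation):
    """Capital path from K[t+1] = (1 - delta) * K[t] + I[t]."""
    stocks = [k0]
    for inv in investments:
        stocks.append((1 - depreciation) * stocks[-1] + inv)
    return stocks


# Illustrative economy: GDP of 100, investment of 20 every year, 5% depreciation.
delta = 0.05
investments = [20.0] * 50
k0_solow = initial_capital_steady_state(20.0, growth=0.03, depreciation=delta)
k0_ratio = initial_capital_fixed_ratio(100.0)

path_solow = perpetual_inventory(k0_solow, investments, delta)
path_ratio = perpetual_inventory(k0_ratio, investments, delta)

# The gap between the two paths shrinks by a factor (1 - delta) every year.
gap_start = path_ratio[0] - path_solow[0]
gap_end = path_ratio[-1] - path_solow[-1]
```

With these inputs the two starting stocks are 250 and 260. After 50 years the gap between the paths has shrunk to under a tenth of its initial size, which is the sense in which later capital stocks are not very sensitive to the starting assumption.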

The interesting thing is that they have separate price series for each country for (output side) GDP (pl_gdpo) and for capital stock (pl_k). These show that in developing countries capital is much more expensive relative to output than it is in the US and other developed countries. This means that a common ratio of 2.6 translates into a real capital/GDP ratio where capital and GDP are both aggregated using US prices that varies across countries and is lower in developing and higher in developed countries. You can compute this as CK/CGDPO. Also, this will mean that there is an extra term in a cross-country Solow growth model which is the capital/output price ratio:

where Y is GDP, K capital, delta is the depreciation rate, s is the saving rate, and pY/pK is the ratio of output to capital prices. In developing countries saving buys less new capital stock per dollar than it does in developed countries. This would be another reason in the Solow framework for why developing countries are poorer than developed countries. At least, that's what I'm understanding at the moment.
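One accumulation equation consistent with these definitions, written here as a sketch rather than as the exact expression in the original post, is:

```latex
\dot{K} = s \, \frac{p_Y}{p_K} \, Y - \delta K
```

Setting the left-hand side to zero (and ignoring growth for simplicity, which is an assumption of this sketch) gives a capital/output ratio of

```latex
\frac{K}{Y} = \frac{s}{\delta} \cdot \frac{p_Y}{p_K}
```

so a lower output-to-capital price ratio lowers the real capital/output ratio for a given saving rate.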

Here are the three different capital-output ratios for China:

The blue line is the ratio at international prices and the red line at constant national prices. These are equal by construction in 2005. The green line is the nominal ratio of dollar values of capital and GDP. This is equal to 2.6 in 1952. The blue line shows the strong capital deepening in China since the late 1980s. The other series do not indicate any capital deepening at all. The discrepancy between the blue and green lines is easy to explain. The price of capital/output relative to the US ratio has fallen from 2.77 in 1952 to 0.74 in 2011 (capital cost 1.39 times the US price in 1952 and 0.46 times in 2011 while output's price changed from 0.5 to 0.61 times the US level). By assumption capital and output have the same price in the US.

So what does the red "constant national prices" series mean? It will deviate from the blue line to the extent that the prices of different types of capital deviate in the country in question from the international price vector. It seems that the two lines tend to track each other much better in developed countries than developing, though India is a clear exception to that rule. For example, if structures are relatively undervalued in China (as would make sense as structures are non-traded) and the capital deepening in China is heavily driven by structures (as the data in this article by Wang and Szirmai support) then the red line will show a much slower increase in capital per unit of GDP than the blue line.

Thursday, September 26, 2013

Penn World Table 8.0

The new version of the Penn World Table - version 8.0 - has recently been made available and is now hosted at the University of Groningen in the Netherlands. An NBER working paper by Feenstra et al. describes what is new in PWT 8.0.

The new edition of the dataset introduces several new measures of GDP and the working paper is mostly devoted to discussing them as well as the relationship between PPP exchange rates (relative to market exchange rates) and the level of income known as the Penn or Balassa-Samuelson Effect.

GDP is now given both in terms of the output side and the expenditure side. The difference between these is that real output side GDP (RGDP(O)) deflates expenditure on final goods (the standard macro-economic C+I+G - consumption, investment, and government expenditure), exports (X), and imports (M) using separate deflators:

The expenditure side real GDP (RGDP(E)) uses only the final output deflator to deflate the GDP. Feenstra et al. argue that the former expresses better the real production level in each country and the latter the standard of living in each country. Previous versions of the Penn World Table used the expenditure side measure only. The difference between the two measures is due to the terms of trade. Countries with relatively expensive exports and relatively cheap imports will have living standards (RGDP(E)) that are higher than their real productive capacity (RGDP(O)).
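In symbols, the two measures described above can be sketched as follows. The deflator names P_D (final domestic expenditure), P_X (exports), and P_M (imports) are notation assumed here for illustration and are not taken from the working paper:

```latex
\mathrm{RGDP}^{O} = \frac{C + I + G}{P_D} + \frac{X}{P_X} - \frac{M}{P_M},
\qquad
\mathrm{RGDP}^{E} = \frac{C + I + G + X - M}{P_D}
```

The two coincide when P_X = P_M = P_D. When exports are relatively expensive and imports relatively cheap (P_X high, P_M low), the expenditure-side measure exceeds the output-side measure, which is the terms-of-trade effect described in the text.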

GDP is also given in "current" and "constant" prices. This terminology is confusing because usually current prices mean prices not adjusted for inflation and constant prices mean adjusted for inflation. Here constant prices mean the reference prices from a given benchmark year - in the current version 2005 - and current prices mean using the reference prices from each year though these are adjusted for US inflation. These differ because the reference prices change over time. The constant price series RGDP are better for comparisons across time while the current price series CGDP can be used to compare countries at a single point in time.

Finally, there is also an RGDP(NA) series that uses the growth rates in each country's own national accounts to extrapolate GDP in that country in years other than the benchmark year. National accounts growth rates were used exclusively in previous versions of the Penn World Table. This series can differ substantially from the RGDP(E) series as is shown by this graph for India:

According to RGDP(E) living standards in India fell from 1975 to 1985 while according to India's own national accounts they rose. Which is right? Well, it depends what you want to measure. The change in RGDP(E) measures the change in relative living standards across countries while that in RGDP(NA) measures the change in real expenditure weighted according to the budget shares in the country in question. They differ because budget shares differ across countries. RGDP(E) will also grow faster than RGDP(NA) in a country experiencing an improvement in the terms of trade as, for example, Australia did in the years up to 2009 due to the mining boom.

PWT 8.0 also includes capital stock, human capital, and total factor productivity series. Capital stock was included for some countries in some previous versions but not version 7. The human capital and total factor productivity series are both new.

So, all this sounds more complicated than using The Economist's Big Mac Index or previous versions of the PWT. The User Guide gives a less technical guide on how to use the data.

Monday, August 12, 2013

Growth in Oil Reserves 2011-12

I'm updating my lecture on fossil fuels for my Energy Economics class. BP released its latest Statistical Review of World Energy about a month ago and I'm taking my first look at it. The graph above shows the ten largest increases in oil reserves by country. I used the reserve data for 2012 in this year's report and the reserve data for 2011 in last year's report to calculate the difference. Some of the changes in reserves have been backdated in the current report. The US comes in third with 4.1 billion barrels. US reserves have increased by 6.6 billion barrels or around 15% since the fracking boom took off. Of course, this doesn't reflect the amount of oil discovered as there is ongoing production. But US reserves remain at around 2% of the world total. In terms of increases in proven reserves this isn't yet revolutionary or game changing, I think.
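The ranking procedure described above, taking each year's figure from that year's edition of the report and sorting by the difference, can be sketched as follows. The country names and reserve figures are placeholders, not BP data, and the function name is hypothetical.

```python
def top_increases(previous, current, n=10):
    """Rank countries by change in reserves between two report vintages."""
    changes = {
        country: current[country] - previous[country]
        for country in current
        if country in previous
    }
    ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Placeholder figures in billions of barrels (not real data).
reserves_2011_from_last_report = {"A": 100.0, "B": 50.0, "C": 30.0}
reserves_2012_from_this_report = {"A": 103.0, "B": 58.0, "C": 29.0}

ranking = top_increases(
    reserves_2011_from_last_report, reserves_2012_from_this_report, n=2
)
```

Because the earlier figure comes from the earlier edition, the difference includes any revisions that the current report backdates to 2011. It is also net of production during the year, so it is not a measure of discoveries.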

Sunday, July 28, 2013

U.S. Employment Trends

Some very interesting trends in US employment that I saw in a free newsletter I subscribe to. The data is from the Federal Reserve Bank of St Louis. Total US employment has partially recovered from the Great Recession:

But things look very different when you break the total down into age groups (the first graph is not seasonally adjusted and the others are...). In the main 25-54 age group there has been little recovery:

while among the over-55s there was hardly a recession:

This graph is truly stunning I think. The under-25s seem to follow the pattern of total employment:

It's perhaps understandable that employment in this cheaper to hire group has rebounded strongly (but employment of 16-19 year olds has not), but what explains the near absence of a decline in employment among the over-55s? Usually, you'd expect older workers to be retired early in a recession.

Sunday, June 16, 2013

Uncertainty in Global Greenhouse Gas Emissions

There is considerable uncertainty about levels of greenhouse gas emissions particularly for those associated with land use change as well as for fugitive emissions associated with oil and gas extraction and coal mining. Estimates of emissions from the combustion of fossil fuels have the least degree of uncertainty, but do vary depending on the data source. In 2007 estimates of emissions from fossil fuel combustion varied by only 2.7% across data sources (Macknick, 2011). Default uncertainty estimates (2 standard deviations) that have been used by the IPCC for emissions coefficients for fossil fuel combustion range from 7.2% for coal use in industry to 1.5% for diesel used in road transport (Olivier et al., 2010). In summary, the uncertainty for fossil fuel based carbon dioxide emissions is ±5% (UNEP, 2012). There is much greater variation in estimates of carbon dioxide emissions from cement production and gas flaring but these are a relatively small fraction of total emissions (Macknick, 2011). Emissions from agriculture and land-use change are much more uncertain (Tubiello et al., 2013). It is estimated that carbon dioxide emissions associated with land use change have an uncertainty of ±50%. However, this means that total anthropogenic CO2 emissions have an uncertainty of only ±10% (UNEP, 2012).
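To see why a ±50% uncertainty on land-use emissions translates into a much smaller figure for the total, the sketch below combines the two components weighted by their shares of total emissions. The 90/10 split between fossil fuel and land-use emissions is an illustrative assumption, not a number from the sources cited, and the sources do not say which combination rule they used.

```python
import math


def combined_uncertainty(shares, uncertainties, independent=True):
    """Percentage uncertainty of a total from its components.

    shares: each component's fraction of the total.
    uncertainties: each component's percentage uncertainty.
    independent=True adds errors in quadrature; False adds them linearly.
    """
    terms = [share * unc for share, unc in zip(shares, uncertainties)]
    if independent:
        return math.sqrt(sum(term * term for term in terms))
    return sum(terms)


shares = [0.9, 0.1]          # assumed fossil fuel and land-use shares of CO2
uncertainties = [5.0, 50.0]  # percentage uncertainties quoted above

in_quadrature = combined_uncertainty(shares, uncertainties, independent=True)
linear_sum = combined_uncertainty(shares, uncertainties, independent=False)
```

Under this assumed split the total comes out at roughly 7% if the errors are independent and 9.5% if they are simply added, both of the same order as the ±10% quoted.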

Fugitive emissions of methane in fossil fuel extraction and supply are very uncertain. Between 2% and 4% of natural gas may be lost globally in transport and US estimates of fugitive methane emissions have an uncertainty of ±40% (Hayhoe et al., 2002). Estimates of N2O emissions are inherently uncertain (Olivier et al., 2010). Estimated uncertainties for global emissions of methane, nitrous oxide, and fluorine based gases are ±25%, ±30%, and ±20% respectively (UNEP, 2012).

References

Hayhoe, K., H. S. Kheshgi, A. K. Jain, and D. J. Wuebbles (2002) Substitution of natural gas for coal: Climatic effects of utility sector emissions, Climatic Change 54: 107–139.

Macknick, J. (2011) Energy and CO2 emission data uncertainties, Carbon Management 2(2): 189–205.

Olivier, J. et al. (2010) Application of the IPCC uncertainty methods to EDGAR 4.1 global greenhouse gas inventories, 3rd International Workshop on Uncertainty in Greenhouse Gas Inventories, Lviv.

Tubiello, F. N. et al. (2013) The FAOSTAT database of greenhouse gas emissions from agriculture, Environ. Res. Lett. 8: 015009.

UNEP (2012) The Emissions Gap Report 2012: A UNEP Synthesis Report, United Nations Environment Programme.

Wednesday, March 13, 2013

Global Anthropogenic Sulfur Emissions Updated to 2011

A new article by Zbigniew Klimont, Steven Smith, and Janusz Cofala updates Smith et al.'s estimates of global sulfur emissions to 2011. The global downward trend that started around 1990 or earlier* continues. The small increase in the early part of the last decade was just a blip:

This chart also shows some previous estimates. In general the trend has been revised down over time. The trend in China is also now heading down:

Another paper by Smith and Bond declares "the end of the age of aerosols". Well not quite yet. We'll have to wait till 2100 for that :)

The downside for me of this new data is that I will now have to redo all the econometrics in a paper I have in preparation (with Robert Kaufmann) that was almost ready for submission :(

* As shown in my 2006 paper, studies prior to Smith et al. 2001 showed emissions continuing to grow strongly through 1990. Smith et al. (2001) showed a flattening of the trend in the 1980s. My paper showed a plateau from the mid-1970s to 1990 and Smith et al. (2011) showed a slow downward trend from 1973 to 1990 and then a steeper decline.

Friday, December 21, 2012

Results from Stern (2012) Energy Economics Now on Data Page

The results data from my recent paper in Energy Economics are now up on my data page.

Data is presented in terms of distances and log distances relative to a stochastic frontier estimated with the between estimator for the 1971-2007 period. A distance of one (or log of zero) implies that a country is just on this frontier. But the frontier moves over time as world best practice improves. Countries can have time series values below zero (or a distance of less than one) because they may reach levels of energy efficiency greater than the average best practice for the whole period. This is especially the case in the later years. The file also gives income per capita in PPP terms from the Penn World Table. Click here for more posts explaining this project.