GDP conundrum: Some areas of concern around growth overestimation in Indian manufacturing

Based on the new GDP series, manufacturing growth rates were revised sharply upward, from 1.1% to 6.2% in 2012-13 and from -0.7% to 5.29% in 2013-14. These revisions were not reflective of the actual performance of the sector during the period. In this article, Amey Sapre, doctoral student in Economics at IIT Kanpur, analyses some of the methodological issues in measuring growth in the manufacturing sector.

This is the third of a four-part series.

The new series of national accounts with base year 2011-12 was released in January 2015. Since the release, the accuracy and reliability of GDP (Gross Domestic Product) data have been a subject of intense discussion among stakeholders. Many are puzzled by the revised growth figures of macro aggregates, resulting from comprehensive methodological changes. Sharp upward movements in growth rates of several sub-sectors, and new formats of presenting the aggregates, brought more confusion than clarity. In particular, for the manufacturing sector, the general discontent was driven by the fact that large upward revisions in growth rates, from 1.1% to 6.2% in 2012-13 and from -0.7% to 5.29% in 2013-14, were not reflective of the actual performance of the sector. Such revisions led to questioning of the reliability of the estimates and also prompted a series of commentaries and papers on decoding the growth figures in the manufacturing sector (see, for instance, Central Statistical Office (CSO) (2015a, 2015b) and Nagaraj (2015a, 2015b)). Nevertheless, some key questions about computation and data sources remained unanswered.

In a recent paper, Pramod Sinha and I focus on understanding some of the methodological issues in measuring growth in the manufacturing sector (Sapre and Sinha 2016). We begin with estimation at the nominal level to highlight that once the entire process is recreated, several inconsistencies are revealed. In doing so, we come up with three key questions:

Are we correctly measuring output and intermediate consumption in the Gross Value Added (GVA) formula¹?
Should we continue with the existing Paid-up Capital (PUC)-based ‘blow-up’ method² to account for unavailable companies?
Are we correctly identifying manufacturing firms?

Questions on measuring output and intermediate consumption in Gross Value Added

We follow the Goldar Committee report in letter and spirit and use the production-side approach to recreate the GVA for a set of firms that file in the XBRL format in the MCA21. We map the data fields in the XBRL³ form to the data fields in the CMIE (Centre for Monitoring Indian Economy) Prowess dataset⁴ and estimate GVA⁵. Conceptually, the use of MCA21 involves a shift from the erstwhile ‘establishment’ approach to the new ‘enterprise’ approach of value addition. The former captured production-based data from factories registered under the Factories Act. The latter captures financial data of firms and goes beyond core manufacturing to capture value addition from post-manufacturing, ancillary or related activities such as marketing, and operations of branch/head offices. How does this change impact value addition? The answer has two parts:

Changes in measures of output

Under the establishment approach, ‘Sales’ were the measure of output. In the current enterprise approach formula, several disaggregated components of revenue are also part of output: revenues from products, services, operations, financial services, brokerage and commissions; rental income; and other non-operating incomes. The Goldar Committee report contains only limited discussion on the inclusion or exclusion of various revenue fields in GVA computation. However, it is evident from the composition of output that value addition is not solely accruing from manufacturing activities, but also from several related/ancillary activities. This leads to inflated GVA levels, as output is now similar to the total income of the company rather than to its industrial sales. In the paper, we show a comparison with the previous sales-based method and argue that changes in output composition alone can lead to increased levels of GVA. Consequently, the growth rates get pushed upwards.

What is missing in the new GVA formula is a clear rationale for including revenues from non-manufacturing activities. If part of trading and services is to be included, revenue and cost items need to be clearly segregated so that GVA from manufacturing and from services can be computed separately.

Changes in measures of intermediate consumption

Identifying components of intermediate consumption at the enterprise level is equally difficult. Conventionally, subtracting the cost items (related to production) from output provides a measure of value addition entirely from manufacturing activities. However, for large and diversified enterprises, identifying cost items from financial data fields can pose significant challenges. A close scrutiny of the XBRL fields shows omission of important cost components such as power and fuel expenses, advertisement and marketing-related expenses. These are sizeable components and their omission can underestimate costs, thereby overestimating GVA.

Thus, two possible reasons that account for changes in GVA are increase in output due to addition of revenue items besides sales, and omission of certain cost components.
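The two effects can be illustrated with a small numerical sketch. All figures and field names below are hypothetical; they are not taken from MCA21, Prowess, or the paper. The function simply applies the definition of GVA as output less intermediate consumption over whichever revenue and cost fields are chosen.

```python
# Hypothetical figures for a single firm (illustrative only).
revenues = {
    "sales": 1000.0,               # industrial sales
    "services": 80.0,
    "rental_income": 20.0,
    "other_non_operating": 50.0,
}
costs = {
    "raw_materials": 600.0,
    "power_and_fuel": 70.0,
    "advertising_marketing": 30.0,
}


def gva(output_fields, cost_fields):
    """GVA = output less intermediate consumption, over the chosen fields."""
    output = sum(revenues[f] for f in output_fields)
    intermediate = sum(costs[f] for f in cost_fields)
    return output - intermediate


# Sales-based output, all production-related cost items included.
gva_sales = gva(["sales"], list(costs))
# Broader output (every revenue field), same cost items.
gva_broad = gva(list(revenues), list(costs))
# Broader output, with power/fuel and advertising omitted from costs.
gva_broad_omitted = gva(list(revenues), ["raw_materials"])

print(gva_sales, gva_broad, gva_broad_omitted)
```

With these made-up numbers, widening the definition of output raises GVA, and dropping cost items raises it further, even though nothing about the firm's manufacturing activity has changed.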

Questions on the blow-up methodology

Blow-up of GVA is an imputation method to account for companies whose data are unavailable as of the cut-off date of extraction from the MCA21 portal. Presently a PUC-based blow-up method is used. The method relies on the assumption that PUC and GVA have a one-to-one, or a linear, relation. Therefore, in the absence of data, PUC of available companies can be used to infer the value addition of unavailable companies. Several variants of the method are possible, such as blow-up for each range of PUC, by industry group, or by ownership type of company, among others. However, details of the procedure have not been clearly documented in official publications.

We replicate the blow-up process by constructing an available and active set of companies based on random samples taken from the Prowess dataset. We find three areas of concern. First, the basic assumption of a linear relation does not hold as, across industries, one cannot draw sufficient inferences about a company’s manufacturing activities by looking at its PUC value. Second, the blow-up factor computationally increases as PUC coverage declines, and given year-on-year variations in filing by companies, the extent of blow-up remains unpredictable. Third, while the GVA contribution of a firm can be negative, the PUC will always be positive. Thus, in the absence of any information on unavailable firms, the PUC-based blow-up will always contribute positively, irrespective of the actual contribution of the unavailable firm. In principle, the blow-up can even lead to overestimation of value addition as it crucially depends on the extent of PUC coverage and not on the actual contribution of unavailable firms.

In a recent paper, Nagaraj and Srinivasan (2016) argue that the estimates based on the earlier RBI (Reserve Bank of India) sample of firms and the current MCA21 dataset suffer from self-selection bias of an unknown magnitude. The bias stems from the fact that while data is extracted on a cut-off date, firms decide whether or not to report their financials by this date. They argue that any blow-up factor that does not model the self-selected nature of reporting will produce biased and inconsistent estimates. The blow-up problem is also a legacy issue for the National Accounts Statistics: despite the availability of the new MCA21 dataset, data-related problems have remained unresolved.
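The mechanics of the PUC-based blow-up can be sketched as follows. This is a minimal illustration, assuming the factor is the PUC of active companies divided by the PUC of available companies and is applied to the aggregate GVA of available companies; the official procedure is not fully documented, and every number here is hypothetical.

```python
def puc_blow_up(gva_available, puc_available, puc_active):
    """Scale the GVA of available firms by PUC(active) / PUC(available)."""
    factor = puc_active / puc_available
    return factor, factor * gva_available


# Hypothetical case: available firms hold 80% of active PUC.
factor_80, est_80 = puc_blow_up(500.0, puc_available=800.0, puc_active=1000.0)
# Lower coverage (50%) produces a larger blow-up factor.
factor_50, est_50 = puc_blow_up(500.0, puc_available=500.0, puc_active=1000.0)

# Amount implicitly attributed to the unavailable firms at 80% coverage.
# It has the same sign as the available firms' GVA, so it is positive here.
imputed_80 = est_80 - 500.0

# Suppose (hypothetically) the unavailable firms were loss-making.
true_unavailable_gva = -40.0
overestimate_80 = est_80 - (500.0 + true_unavailable_gva)

print(factor_80, est_80, imputed_80, overestimate_80)
print(factor_50, est_50)
```

In this sketch the imputed amount depends only on PUC coverage: the unavailable firms are credited with a positive contribution even though their assumed true GVA is negative.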

We propose one possible solution: using industry-level growth rates of GVA to scale up the last available year’s GVA of unavailable companies. We use a sample to first classify each unavailable company into its industry and then, based on the computed growth rate of GVA for that industry, scale up the last available GVA of the unavailable company.

This has a few advantages over the PUC method: (i) it uses the previous year’s GVA of unavailable firms instead of PUC of available firms, (ii) it does not depend on coverage of PUC, and (iii) it captures the economic conditions faced by the firms in the industry. Computationally, on average, the method gives a lower margin of error, a better representation of firms’ conditions, and provides a close approximation to the actual GVA contribution of the firm.
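A minimal sketch of this growth-rate approach is given below. It assumes that the industry growth rate is computed from the aggregate GVA of available firms in that industry, which is one possible reading and not necessarily the exact computation used in the paper. Firm names, industries and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (firm, industry, last year's GVA, this year's GVA).
# None marks a firm whose current-year data is unavailable.
firms = [
    ("A", "textiles", 100.0, 110.0),
    ("B", "textiles", 200.0, 190.0),
    ("C", "textiles", -50.0, None),
    ("D", "chemicals", 300.0, 330.0),
    ("E", "chemicals", 80.0, None),
]


def industry_growth(records):
    """Growth of aggregate GVA per industry, using available firms only."""
    prev, curr = defaultdict(float), defaultdict(float)
    for _, industry, gva_prev, gva_curr in records:
        if gva_curr is not None:
            prev[industry] += gva_prev
            curr[industry] += gva_curr
    # Skip industries whose base-year aggregate is zero (growth undefined).
    return {i: curr[i] / prev[i] - 1.0 for i in prev if prev[i] != 0}


def impute_unavailable(records):
    """Scale each unavailable firm's last GVA by its industry's growth."""
    growth = industry_growth(records)
    return {
        firm: gva_prev * (1.0 + growth[industry])
        for firm, industry, gva_prev, gva_curr in records
        if gva_curr is None and industry in growth
    }


imputed = impute_unavailable(firms)
print(imputed)
```

In this made-up example the loss-making firm C retains a negative imputed GVA, whereas a PUC-based factor would add a positive amount whenever the available firms' aggregate GVA is positive.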

Are manufacturing companies being correctly identified?

The CSO primarily relies on ITC-HS codes for identifying companies. The ITC-HS is an eight-digit coding system that identifies a commodity for the purpose of import/export and domestic movement of goods. In the MCA21 forms, a company is required to furnish product codes of its top three revenue-generating products. However, compliance has been a major issue. The Goldar Committee report highlighted that in 2011-12 only 59% of the XBRL companies reported their product codes. This complicated the process of identification and prompted an alternative strategy of using the NIC digits contained in the Company Identification Number (CIN). The problems in using both these strategies are well known. What is less clear is the extent of distortion they can cause in GVA estimates.

ITC-HS codes identify a product and not the activity. The distinction between manufacturing and trading is essential as the GVA computation formula differs. Manufacturing as an activity has different measures of output, costs and tax items as compared to trade in commodities. Thus, identifying a product does not ensure that value addition is being captured correctly. Identification of the business activity is a prerequisite.

The problem also does not get resolved by identifying the business activity based on the company’s top revenue-generating products. The first issue is that if the top revenue item is from manufacturing, followed by trading incomes, the company gets classified as a manufacturing company; and if the top revenue item is from trading, followed by manufacturing incomes, the company gets classified as a service company. In the former case, other sources of revenue of a manufacturing company are counted in manufacturing GVA. In the latter, revenues from manufacturing do not get counted at all, as the company is not identified as a manufacturing company. The problem is compounded when we see identification as a year-on-year issue. The top revenue-generating products of a company can vary across years. This will require the statistical authority to identify and reclassify companies on a yearly basis.
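The classification problem described above can be illustrated with a toy rule that labels a firm by its single largest revenue item. The rule, the activity labels and the revenue figures are all hypothetical simplifications and do not reproduce the CSO's actual procedure.

```python
def classify(revenue_by_activity):
    """Label a firm by its largest revenue item (simplified, hypothetical)."""
    top = max(revenue_by_activity, key=revenue_by_activity.get)
    return "manufacturing" if top == "manufacturing" else "services"


# One hypothetical firm whose revenue mix shifts slightly between years.
firm_by_year = {
    "2011-12": {"manufacturing": 520.0, "trading": 480.0},
    "2012-13": {"manufacturing": 470.0, "trading": 530.0},
}

labels = {year: classify(rev) for year, rev in firm_by_year.items()}

# Revenue that ends up under the manufacturing label in each year.
counted_in_manufacturing = {
    year: sum(rev.values()) if labels[year] == "manufacturing" else 0.0
    for year, rev in firm_by_year.items()
}

print(labels)
print(counted_in_manufacturing)
```

A small change in the revenue mix flips the label: in the first year the firm's trading income is carried into manufacturing, and in the second year its manufacturing income drops out of the sector entirely.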

The second case is where the ITC-HS codes are unavailable. Presently, a company’s CIN and the details on its website are used to identify its business activity. The possibility of misclassification should be apparent from this method. The 21-digit CIN contains the NIC digits that are assigned to the company based on its economic activity at the time of incorporation. Over time, a company may change its business activity or may diversify into other sectors. Such changes are not reflected in the CIN code. We highlight and estimate the distortion in GVA due to this problem by studying two groups: (i) firms that operate as non-manufacturing entities but have their NIC codes registered in a manufacturing activity, and (ii) firms that are into manufacturing but have their NIC code registered in any other economic activity. As the GVA formulas for manufacturing and services differ, in both cases, wrongly classified companies will show an incorrect GVA contribution. In aggregate, both the manufacturing and services sectors will show a distorted picture.
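The two groups above can be sketched in code. The example rests on several assumptions that are not in the article: that the industry code occupies characters 2 to 6 of the CIN, that its first two digits give the NIC division, and that manufacturing corresponds to divisions 10 to 33 (NIC-2008 Section C). Older companies may carry codes from an earlier NIC version, so the division range is left as a parameter. The CINs shown are made up.

```python
def nic_division_from_cin(cin):
    """Return the first two digits of the industry code embedded in a CIN.

    Assumed layout: 1 listing-status letter, then a 5-digit industry code,
    followed by state, year, ownership type and registration number.
    """
    if len(cin) != 21:
        raise ValueError("expected a 21-character CIN")
    code = cin[1:6]
    if not code.isdigit():
        raise ValueError("industry code digits not found")
    return int(code[:2])


# Assumption: NIC-2008 Section C (manufacturing) spans divisions 10-33.
MANUFACTURING_DIVISIONS = range(10, 34)


def mismatch_group(cin, current_activity, divisions=MANUFACTURING_DIVISIONS):
    """Compare the registered NIC division with the firm's current activity."""
    registered = nic_division_from_cin(cin) in divisions
    actual = current_activity == "manufacturing"
    if registered and not actual:
        return "group (i)"   # registered in manufacturing, operating elsewhere
    if actual and not registered:
        return "group (ii)"  # manufacturing now, registered elsewhere
    return "consistent"


# Made-up CINs and activities, for illustration only.
print(mismatch_group("U17111MH1995PLC000001", "trading"))
print(mismatch_group("U51909DL2001PTC000002", "manufacturing"))
print(mismatch_group("U24100GJ2005PLC000003", "manufacturing"))
```

The current activity would have to come from some other source, such as reported revenue fields, which is exactly the information the CIN cannot provide.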

Conclusion

The 2011-12 series has thrown up several conceptual and methodological questions. While new sources and methods have improved the coverage and quality of the national accounts, they have also changed our view about estimating and understanding value addition, particularly in the manufacturing sector. Detailed investigation into the computation process shows several areas of concern about overestimation due to blow-up of GVA in case of unavailable data, and identification of manufacturing companies. The reliability of the GVA estimates is crucially dependent on the robustness of the estimation procedure and availability of accurate data⁶. Understanding and solving the problems require a constructive approach and much deeper insights into the national accounts.

This is a revised version of an earlier article that appeared on Ajay Shah’s blog. The author is grateful to Prof. T.N. Srinivasan for sending his paper and sharing his views on the subject; and to Ajay Shah and Rajeswari Sengupta for helpful discussions and suggestions.

Notes:

1. Gross value added (GVA) is broadly defined as the value of output less the value of intermediate consumption. It measures the contribution to an economy of an individual producer, industry, sector or region.
2. The PUC method computes the blow-up factor as the inverse of the ratio of the Paid-up Capital of available companies to that of active companies. The GVA of available companies is then multiplied by the blow-up factor.
3. XBRL stands for Extensible Business Reporting Language. It is an electronic reporting format in which companies are required to file their annual financial statements with the Ministry of Corporate Affairs (MCA). The e-filing portal of MCA is popularly known as MCA21.
4. Prowess is a dataset on the performance and financial indicators of Indian companies. It is produced by the Centre for Monitoring Indian Economy (CMIE) Pvt. Ltd.
5. A detailed mapping of the data fields can be found here.
6. A host of issues in estimation have already been raised here, here, and here.